Jeff February 2016

In Haskell, how do you restrict functions to only one constructor of a data type?

I'm not sure how to word this question. Say I'm trying to pass the paths of tmpfiles around, and I want to capture the idea that there are different formats of tmpfile, and each function only works on one of them. This works:

data FileFormat
  = Spreadsheet
  | Picture
  | Video
  deriving Show

data TmpFile = TmpFile FileFormat FilePath
  deriving Show

videoPath :: TmpFile -> FilePath
videoPath (TmpFile Video p) = p
videoPath _ = error "only works on videos!"

But there must be a better way to write it, without runtime errors, right? I thought of two alternatives. This:

type TmpSpreadsheet = TmpFile Spreadsheet
type TmpPicture     = TmpFile Picture
type TmpVideo       = TmpFile Video

videoPath :: TmpVideo -> FilePath

Or this:

data TmpFile a = TmpFile a FilePath
  deriving Show

videoPath :: TmpFile Video -> FilePath

But obviously they don't compile. What's the proper way to do it? Some other ideas, none particularly appealing:

  • Wrap TmpFile in the format instead of the other way around, so the values are Video (TmpFile "test.avi") etc.
  • Make lots of separate data types VideoTmpFile, PictureTmpFile etc.
  • Make a TmpFile typeclass
  • Use partial functions everywhere, but add guard functions to abstract the pattern matching

I also considered learning the -XDataKinds extension, but suspect I'm missing something much simpler that can be done without it.

EDIT: I'm learning a lot today! I tried both the approaches outlined below (DataKinds and phantom types, which have dummy value constructors that can be removed with another extension), and they both work! Then I tried to go a little further. They both let you make a nested type TmpFile (ListOf a) in addition to regular TmpFile a, which is

Answers


leftaroundabout February 2016

You're right: with -XDataKinds, the TmpFile Video -> FilePath approach would work. And indeed I think this may be a good application for that extension.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}  -- needed for the (a :: FileFormat) annotation

data TmpFile (a :: FileFormat) = TmpFile FilePath
  deriving Show

videoPath :: TmpFile Video -> FilePath

The reason you need this extension to write TmpFile Video is that the constructors of FileFormat are ab initio value-level (thus only exist at runtime), while TmpFile is type-level / compile-time.
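To make that concrete, here is a minimal, self-contained sketch of the DataKinds version with a function body filled in. The helper tmpVideo and the example paths are assumptions added for illustration, not part of the question. The tick in 'Video just marks the promoted constructor explicitly; plain Video also works here because no ordinary type of that name is in scope.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

module Main where

-- An ordinary data type. With DataKinds its constructors are also
-- promoted, so they can be used as type-level tags.
data FileFormat
  = Spreadsheet
  | Picture
  | Video
  deriving Show

-- The format tag appears only in the type, never in the stored value.
data TmpFile (a :: FileFormat) = TmpFile FilePath
  deriving Show

-- Total: no fallback case is needed, because a non-video TmpFile
-- cannot be passed in.
videoPath :: TmpFile 'Video -> FilePath
videoPath (TmpFile p) = p

-- Hypothetical smart constructor (assumption, for illustration only).
tmpVideo :: FilePath -> TmpFile 'Video
tmpVideo = TmpFile

main :: IO ()
main = do
  putStrLn (videoPath (tmpVideo "/tmp/test.avi"))
  -- The next line is rejected by the type checker if uncommented:
  -- putStrLn (videoPath (TmpFile "/tmp/cat.png" :: TmpFile 'Picture))
```

Because the parameter has kind FileFormat, something like TmpFile Int is also rejected, which is the main thing this buys you over plain tag types.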

Of course there's another way to generate type-level entities: define types!

data Spreadsheet = Spreadsheet
data Picture = Picture
data Video = Video

data TmpFile a = TmpFile a FilePath
  deriving Show

videoPath :: TmpFile Video -> FilePath

Such types are called phantom types. But really, they're a bit of a hack to work around the former lack of proper type-level values, which DataKinds has now given us. So, unless you need compatibility with old compilers, do use DataKinds!
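For comparison, a rough sketch of the tag-type version in which the parameter is a true phantom, i.e. it does not appear on the right-hand side at all. The tag types are declared without constructors (allowed in Haskell 2010, so no extension should be needed), and the example path is made up.

```haskell
module Main where

-- Tag types with no constructors: they have no values and are only
-- ever used as type arguments.
data Spreadsheet
data Picture
data Video

-- 'a' is not used in the constructor's fields, so it is a phantom
-- parameter that exists only for the type checker.
data TmpFile a = TmpFile FilePath
  deriving Show

videoPath :: TmpFile Video -> FilePath
videoPath (TmpFile p) = p

main :: IO ()
main = do
  let v = TmpFile "/tmp/test.avi" :: TmpFile Video
  putStrLn (videoPath v)
  -- Rejected by the type checker if uncommented:
  -- putStrLn (videoPath (TmpFile "/tmp/cat.png" :: TmpFile Picture))
```

One trade-off to keep in mind: nothing here stops someone from writing TmpFile Int or TmpFile Bool, since the parameter ranges over all types rather than over a closed set of formats.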

An alternative would be to not enforce the file type at compile time, but simply make it explicit that the functions are partial.

data TmpFile = TmpFile FileFormat FilePath
  deriving Show

videoPath :: TmpFile -> Maybe FilePath
videoPath (TmpFile Video p) = Just p
videoPath _ = Nothing

In fact, that approach might well be the more rational one, depending on what you're planning to do.
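As a small sketch of how the Maybe version is used in practice: callers have to handle the non-video case explicitly, and helpers from Data.Maybe make that cheap. The function videoPaths and the file names are assumptions for illustration.

```haskell
module Main where

import Data.Maybe (mapMaybe)

data FileFormat
  = Spreadsheet
  | Picture
  | Video
  deriving Show

data TmpFile = TmpFile FileFormat FilePath
  deriving Show

-- Partiality is visible in the type: non-videos yield Nothing
-- instead of a runtime error.
videoPath :: TmpFile -> Maybe FilePath
videoPath (TmpFile Video p) = Just p
videoPath _                 = Nothing

-- Hypothetical helper: keep only the paths of the videos in a mixed list.
videoPaths :: [TmpFile] -> [FilePath]
videoPaths = mapMaybe videoPath

main :: IO ()
main =
  print (videoPaths
    [ TmpFile Video "a.avi"
    , TmpFile Picture "b.png"
    , TmpFile Video "c.mkv"
    ])
  -- prints ["a.avi","c.mkv"]
```

This keeps all formats in one type, which is convenient when you genuinely have mixed collections of tmpfiles; the cost is that the check happens at runtime.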


Nikita Volkov February 2016

First of all, I would advise against using such exotic extensions as "DataKinds" unless you absolutely need them. The reason is quite practical and general: the more language concepts you use to solve your problem, the harder it is to reason about your code.

Besides, "DataKinds" isn't an easy concept to wrap your head around. It is a transitional concept crossing two universes simultaneously: the values and the types. Personally I find it quite controversial and would only apply it when I have no other option.

In your case you've already found two simpler ways of approaching your problem, without "DataKinds":

  • Wrap TmpFile in the format instead of the other way around, so the values are Video (TmpFile "test.avi") etc.

  • Make lots of separate data types VideoTmpFile, PictureTmpFile etc.

I particularly like the idea of the wrapping types, because it is flexible and composable. Here's how I'd build on it:

{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

newtype Video a =
  Video a
  deriving (Functor, Foldable, Traversable)

newtype Picture a =
  Picture a
  deriving (Functor, Foldable, Traversable)

videoPath :: Video FilePath -> FilePath

You can notice two things:

  1. Video and Picture are general concepts, which are not bound to just your temporary files, and they already implement some standard interfaces. This means that they can be reused for other purposes.

  2. There is an obvious pattern in the definitions of Video and Picture.
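A minimal sketch of how the wrapper types might be used, assuming the deriving extensions are enabled explicitly. The helper withExtension and the example path are made up for illustration; Show is added to the deriving list only so the values can be printed.

```haskell
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}

module Main where

newtype Video a =
  Video a
  deriving (Show, Functor, Foldable, Traversable)

newtype Picture a =
  Picture a
  deriving (Show, Functor, Foldable, Traversable)

-- Total: only a Video-wrapped path is accepted.
videoPath :: Video FilePath -> FilePath
videoPath (Video p) = p

-- Hypothetical helper: the Functor instance lets us change the payload
-- while the Video tag is preserved.
withExtension :: String -> Video FilePath -> Video FilePath
withExtension ext = fmap (++ ext)

main :: IO ()
main = do
  let v = Video "/tmp/test"
  putStrLn (videoPath (withExtension ".avi" v))  -- prints /tmp/test.avi
  -- Rejected by the type checker if uncommented:
  -- putStrLn (videoPath (Picture "/tmp/cat.png"))
```

Since the wrappers are newtypes, they add no runtime overhead, and the same Video wrapper could tag things other than file paths.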


The pattern that you see in Video and Picture can be called "refinement types" and is abstracted over by the "refined" package, among others. So you might be interested in that.


As for your other options:

  • Mak

Post Status

Asked in February 2016
Viewed 3,265 times
Voted 14
Answered 2 times
