Deriving Enum for a sum type of records in Haskell

I have a sum type of records that represents all my in-memory tables, which I will send across the network. My binary protocol needs to pass the ordinal value (fromEnum) in the header first, to identify which table the data belongs to. The problem is that the sum type needs to derive Enum, but GHC refuses to derive it.
data Table = MarketData {bid::[Float], ask::[Float]}
           | Trade {price::[Float], qty::[Float]}
           deriving Enum
main :: IO ()
main = do
  print $ fromEnum Trade
This is the compilation error
Can't make a derived instance of `Enum Table':
`Table' must be an enumeration type
(an enumeration consists of one or more nullary, non-GADT constructors)
In the data declaration for `Table'
Any ideas how I can do this without having to write boilerplate like this?
ordinalVal :: Table -> Int
ordinalVal tbl = case tbl of
  MarketData{bid=_, ask=_} -> 0
  Trade{price=_, qty=_} -> 1

If you only want to enumerate the constructors, you can use the Data.Data module, like so:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data
data T a b = C1 a b | C2 deriving (Typeable, Data)
main = print $ constrIndex $ toConstr x
  where
    x :: T Int Int
    x = C1 1 1 -- will print 1
    -- x = C2 -- will print 2
If you don't want to go down the road of using the Typeable and Data type classes, you could also simply write a function Table -> Int, like you proposed.
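Applied to the Table type from the question, the Data.Data approach could look roughly like the sketch below. It assumes the record fields from the question and reuses the name ordinalVal from there. Note that constrIndex counts from 1, so the sketch subtracts 1 to get the 0-based numbering that fromEnum would have produced.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, constrIndex, toConstr)

-- The sum type of records from the question, now deriving Data instead of Enum.
data Table
  = MarketData { bid :: [Float], ask :: [Float] }
  | Trade { price :: [Float], qty :: [Float] }
  deriving (Show, Data)

-- 0-based position of the constructor in the declaration.
-- constrIndex is 1-based, hence the subtraction.
ordinalVal :: Table -> Int
ordinalVal = subtract 1 . constrIndex . toConstr
```

The number depends only on the order of the constructors in the data declaration, so reordering them changes the value sent in the header.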

Related

Clarifying Data Constructor in Haskell

In the following:
data DataType a = Data a | Datum
I understand that Data Constructor are value level function. What we do above is defining their type. They can be function of multiple arity or const. That's fine. I'm ok with saying Datum construct Datum. What is not that explicit and clear to me here is somehow the difference between the constructor function and what it produce. Please let me know if i am getting it well:
1 - a) Basically writing Data a, is defining both a Data Structure and its Constructor function (as in scala or java usually the class and the constructor have the same name) ?
2 - b) So if i unpack and make an analogy. With Data a We are both defining a Structure(don't want to use class cause class imply a type already i think, but maybe we could) of object (Data Structure), the constructor function (Data Constructor/Value constructor), and the later return an object of that object Structure. Finally The type of that Structure of object is given by the Type constructor. An Object Structure in a sense is just a Tag surrounding a bunch value of some type. Is my understanding correct ?
3 - c) Can I formally Say:
Data Constructor that are Nullary represent constant values -> Return the the constant value itself of which the type is given by the Type Constructor at the definition site.
Data Constructor that takes an argument represent class of values, where class is a Tag ? -> Return an infinite number of object of that class, of which the type is given by the Type constructor at the definition site.

Another way of writing this:
data DataType a = Data a | Datum
Is with generalised algebraic data type (GADT) syntax, using the GADTSyntax extension, which lets us specify the types of the constructors explicitly:
{-# LANGUAGE GADTSyntax #-}
data DataType a where
  Data :: a -> DataType a
  Datum :: DataType a
(The GADTs extension would work too; it would also allow us to specify constructors with different type arguments in the result, like DataType Int vs. DataType Bool, but that’s a more advanced topic, and we don’t need that functionality here.)
These are exactly the types you would see in GHCi if you asked for the types of the constructor functions with :type / :t:
> :{
| data DataType a where
|   Data :: a -> DataType a
|   Datum :: DataType a
| :}
> :type Data
Data :: a -> DataType a
> :t Datum
Datum :: DataType a
With ExplicitForAll we can also specify the scope of the type variables explicitly, and make it clearer that the a in the data definition is a separate variable from the a in the constructor definitions by also giving them different names:
data DataType a where
  Data :: forall b. b -> DataType b
  Datum :: forall c. DataType c
Some more examples of this notation with standard prelude types:
data Either a b where
  Left :: forall a b. a -> Either a b
  Right :: forall a b. b -> Either a b
data Maybe a where
  Nothing :: Maybe a
  Just :: a -> Maybe a
data Bool where
  False :: Bool
  True :: Bool
data Ordering where
  LT, EQ, GT :: Ordering -- Shorthand for repeated ‘:: Ordering’
I understand that Data Constructor are value level function. What we do above is defining their type. They can be function of multiple arity or const. That's fine. I'm ok with saying Datum construct Datum. What is not that explicit and clear to me here is somehow the difference between the constructor function and what it produce.
Datum and Data are both “constructors” of DataType a values; neither Datum nor Data is a type! These are just “tags” that select between the possible varieties of a DataType a value.
What is produced is always a value of type DataType a for a given a; the constructor selects which “shape” it takes.
A rough analogue of this is a union in languages like C or C++, plus an enumeration for the “tag”. In pseudocode:
enum Tag {
DataTag,
DatumTag,
}
// A single anonymous field.
struct DataFields<A> {
A field1;
}
// No fields.
struct DatumFields<A> {};
// A union of the possible field types.
union Fields<A> {
DataFields<A> data;
DatumFields<A> datum;
}
// A pair of a tag with the fields for that tag.
struct DataType<A> {
Tag tag;
Fields<A> fields;
}
The constructors are then just functions returning a value with the appropriate tag and fields. Pseudocode:
<A> DataType<A> newData(A x) {
DataType<A> result;
result.tag = DataTag;
result.fields.data.field1 = x;
return result;
}
<A> DataType<A> newDatum() {
DataType<A> result;
result.tag = DatumTag;
// No fields.
return result;
}
Unions are unsafe, since the tag and fields can get out of sync, but sum types are safe because they couple these together.
A pattern-match like this in Haskell:
case someDT of
  Datum -> f
  Data x -> g x
Is a combination of testing the tag and extracting the fields. Again, in pseudocode:
if (someDT.tag == DatumTag) {
f();
} else if (someDT.tag == DataTag) {
var x = someDT.fields.data.field1;
g(x);
}
Again this is coupled in Haskell to ensure that you can only ever access the fields if you have checked the tag by pattern-matching.
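As a small runnable illustration of that coupling, here is a sketch using the DataType from the question. The function name describe and its output strings are made up for this example. The field x is only in scope in the branch where the Data tag has already been matched.

```haskell
data DataType a = Data a | Datum

-- Hypothetical helper: the pattern match tests the tag and, in the Data
-- branch only, binds the field to x. There is no way to reach the field
-- of a Datum value, because it has none.
describe :: Show a => DataType a -> String
describe someDT = case someDT of
  Datum  -> "Datum (no fields)"
  Data x -> "Data with field " ++ show x
```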
So, in answer to your questions:
1 - a) Basically writing Data a, is defining both a Data Structure and its Constructor function (as in scala or java usually the class and the constructor have the same name) ?
Data a in your original code is not defining a data structure, in that Data is not a separate type from DataType a, it’s just one of the possible tags that a DataType a value may have. Internally, a value of type DataType Int is one of the following:
The tag for Data (in GHC, a pointer to an “info table” for the constructor), and a reference to a value of type Int.
x = Data (1 :: Int) :: DataType Int
+----------+----------------+ +---------+----------------+
x ---->| Data tag | pointer to Int |---->| Int tag | unboxed Int# 1 |
+----------+----------------+ +---------+----------------+
The tag for Datum, and no other fields.
y = Datum :: DataType Int
+-----------+
y ----> | Datum tag |
+-----------+
In a language with unions, the size of a union is the maximum of all its alternatives, since the type must support representing any of the alternatives with mutation. In Haskell, since values are immutable, they don’t require any extra “padding” since they can’t be changed.
It’s a similar situation for standard data types, e.g., a product or sum type:
(x :: X, y :: Y) :: (X, Y)
+---------+--------------+--------------+
| (,) tag | pointer to X | pointer to Y |
+---------+--------------+--------------+
Left (m :: M) :: Either M N
+-----------+--------------+
| Left tag | pointer to M |
+-----------+--------------+
Right (n :: N) :: Either M N
+-----------+--------------+
| Right tag | pointer to N |
+-----------+--------------+
2 - b) So if i unpack and make an analogy. With Data a We are both defining a Structure(don't want to use class cause class imply a type already i think, but maybe we could) of object (Data Structure), the constructor function (Data Constructor/Value constructor), and the later return an object of that object Structure. Finally The type of that Structure of object is given by the Type constructor. An Object Structure in a sense is just a Tag surrounding a bunch value of some type. Is my understanding correct ?
This is sort of correct, but again, the constructors Data and Datum aren’t “data structures” by themselves. They’re just the names used to introduce (construct) and eliminate (match) values of type DataType a, for some type a that is chosen by the caller of the constructors to fill in the forall.
data DataType a = Data a | Datum says:
If some term e has type T, then the term Data e has type DataType T
Inversely, if some value of type DataType T matches the pattern Data x, then x has type T in the scope of the match (case branch or function equation)
The term Datum has type DataType T for any type T
3 - c) Can I formally Say:
Data Constructor that are Nullary represent constant values -> Return the the constant value itself of which the type is given by the Type Constructor at the definition site.
Data Constructor that takes an argument represent class of values, where class is a Tag ? -> Return an infinite number of object of that class, of which the type is given by the Type constructor at the definition site.
Not exactly. A type constructor like DataType :: Type -> Type, Maybe :: Type -> Type, or Either :: Type -> Type -> Type, or [] :: Type -> Type (list), or a polymorphic data type, represents an “infinite” family of concrete types (Maybe Int, Maybe Char, Maybe (String -> String), …) but only in the same way that id :: forall a. a -> a represents an “infinite” family of functions (id :: Int -> Int, id :: Char -> Char, id :: String -> String, …).
That is, the type a here is a parameter filled in with an argument value given by the caller. Usually this is implicit, through type inference, but you can specify it explicitly with the TypeApplications extension:
-- Akin to: \ (a :: Type) -> \ (x :: a) -> x
id :: forall a. a -> a
id x = x
id @Int :: Int -> Int
id @Int 1 :: Int
Data :: forall a. a -> DataType a
Data @Char :: Char -> DataType Char
Data @Char 'x' :: DataType Char
The data constructors of each instantiation don’t really have anything to do with each other. There’s nothing in common between the instantiations Data :: Int -> DataType Int and Data :: Char -> DataType Char, apart from the fact that they share the same tag name.
Another way of thinking about this in Java terms is with the visitor pattern. DataType would be represented as a function that accepts a “DataType visitor”, and then the constructors don’t correspond to separate data types, they’re just the methods of the visitor which accept the fields and return some result. Writing the equivalent code in Java is a worthwhile exercise, but here it is in Haskell:
{-# LANGUAGE RankNTypes #-}
-- (Allows passing polymorphic functions as arguments.)
type DataType a
  = forall r. -- A visitor with a generic result type
     r -- With one “method” for the ‘Datum’ case (no fields)
  -> (a -> r) -- And one for the ‘Data’ case (one field)
  -> r -- Returning the result
newData :: a -> DataType a
newData field = \ _visitDatum visitData -> visitData field
newDatum :: DataType a
newDatum = \ visitDatum _visitData -> visitDatum
Pattern-matching is simply running the visitor:
matchDT :: DataType a -> b -> (a -> b) -> b
matchDT dt visitDatum visitData = dt visitDatum visitData
-- Or: matchDT dt = dt
-- Or: matchDT = id
-- case someDT of { Datum -> f; Data x -> g x }
-- f :: r
-- g :: a -> r
-- someDT :: DataType a
-- :: forall r. r -> (a -> r) -> r
someDT f (\ x -> g x)
Similarly, in Haskell, data constructors are just the ways of introducing and eliminating values of a user-defined type.

What is not that explicit and clear to me here is somehow the difference between the constructor function and what it produce
I'm having trouble following your question, but I think you are complicating things. I would suggest not thinking too deeply about the "constructor" terminology.
But hopefully the following helps:
Starting simple:
data DataType = Data Int | Datum
The above reads "Declare a new type named DataType, which has the possible values Datum or Data <some_number> (e.g. Data 42)"
So e.g. Datum is a value of type DataType.
Going back to your example with a type parameter, I want to point out what the syntax is doing:
data DataType a = Data a | Datum
     ^        ^        ^            These things appear in type signatures (type level)
                  ^        ^        These things appear in code (value level stuff)
There's a bit of punning happening here. so in the data declaration you might see "Data Int" and this is mixing type-level and value-level stuff in a way that you wouldn't see in code. In code you'd see e.g. Data 42 or Data someVal.
I hope that helps a little...

Polymorphic return types and "rigid type variable" error in Haskell

There's a simple record Column v a which holds a Vector from the Data.Vector family (so that v can be Vector.Unboxed, just Vector, etc.), its name, and its type (a simple enum-like ADT, SupportedTypes). I would like to be able to serialize it using the binary package. To do that, I try to define a Binary instance below.
Now put works fine. However, when I try to define deserialization in the get function and give the returned rawVector a specific type based on colType (U.Vector Int64 when it's PInt, U.Vector Double when it's PDouble, etc.), I get this error message:
Couldn't match type v with U.Vector
v is a rigid type variable bound by the instance declaration at src/Quark/Base/Column.hs:75:10
Expected type: v a
Actual type: U.Vector Int64
Is there a better way to achieve my goal - deserialize Vectors of different types based on the colType value or am I stuck with defining Binary instance for all possible Vector / primitive type combinations? Shouldn't be the case...
Somewhat new to Haskell and appreciate any help! Thanks!
{-# LANGUAGE OverloadedStrings, TransformListComp, RankNTypes,
TypeSynonymInstances, FlexibleInstances, OverloadedLists, DeriveGeneric #-}
{-# LANGUAGE MultiParamTypeClasses, FlexibleContexts,
TypeFamilies, ScopedTypeVariables, InstanceSigs #-}
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as U
data Column v a = Column {rawVector :: G.Vector v a => v a, colName :: Text, colType :: SupportedTypes }
instance (G.Vector v a, Binary (v a)) => Binary (Column v a) where
  put Column {rawVector = vec, colName = cn, colType = ct} = do put (fromEnum ct) >> put cn >> put vec
  get = do t <- get :: Get Int
           nm <- get :: Get Text
           let pt = toEnum t :: SupportedTypes
           case pt of
             PInt -> do vec <- get :: Get (U.Vector Int64)
                        return Column {rawVector = vec, colName = nm, colType = pt}
             PDouble -> do vec <- get :: Get (U.Vector Double)
                           return Column {rawVector = vec, colName = nm, colType = pt}
UPDATED Thank you for all the answers below, some pretty good ideas! It's quite clear that what I want to do is impossible to achieve head-on - so that is my answer. But the other suggested solutions are a good reading in itself, thanks a bunch!

The type you are really trying to represent is
data Column v = Column (Either (v Int) (v Double))
but this representation may be unsatisfactory to you. So how do you write this type with the vector itself at the 'top level' of the constructor?
First, start with a representation of your sum (Either Int Double) at the type level, as opposed to the value level:
data IsSupportedType a where
  TInt :: IsSupportedType Int
  TDouble :: IsSupportedType Double
From here Column is actually quite simple:
data Column v a = Column (IsSupportedType a) (v a)
But you'll probably want a to be existentially quantified in order to use it the way you want:
data Column v = forall a . Column (IsSupportedType a) (v a)
The binary instance is as follows:
instance (Binary (v Int), Binary (v Double)) => Binary (Column v) where
  put (Column t v) = do
    case t of
      TInt -> put (0 :: Int) >> put v
      TDouble -> put (1 :: Int) >> put v
  get = do
    t :: Int <- get -- pattern signature; needs ScopedTypeVariables
    case t of
      0 -> Column TInt <$> get
      1 -> Column TDouble <$> get
      _ -> fail "unknown column type tag" -- reject tags that put never writes
Note that there is no inherent reliance on Vector here - v could really be anything.
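To show what a consumer of such a Column can do, here is a self-contained sketch that uses only base. It is an illustration under assumptions, not the code from the answer: it picks plain lists for v (which the note above permits), writes the existential in GADT syntax, and the function columnTotal is hypothetical. Matching on the tag is what tells the type checker whether the payload holds Int or Double.

```haskell
{-# LANGUAGE GADTs #-}

-- Type-level witness of the supported element types, as in the answer.
data IsSupportedType a where
  TInt    :: IsSupportedType Int
  TDouble :: IsSupportedType Double

-- Existentially quantified column: the element type is hidden,
-- but the tag stored next to the payload lets us recover it.
data Column v where
  Column :: IsSupportedType a -> v a -> Column v

-- Hypothetical consumer, with v chosen to be the list type for this sketch.
-- In each equation the matched tag fixes the element type of xs.
columnTotal :: Column [] -> Double
columnTotal (Column TInt xs)    = fromIntegral (sum xs)
columnTotal (Column TDouble xs) = sum xs
```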

The problem you're actually running into (or if you're not yet, that you will) is that you're trying to decide a resulting type from an input value. You cannot do that. At all. You could cleverly lock the result type in a box and throw away the key so the type appears to be normal from the outside, but then you cannot do anything much with it because you locked the type in a box and threw away the key. You can store extra information about it using GADTs and boxing it up with a type class instance, but even still this is not a great idea.
You could make your life far easier here if you simply had two constructors for Column to reflect whether there was a vector of Ints or Doubles.
But really, don't do any of that. Just let the automatically derivable Binary instance deserialize any deserializable value into your vector for you.
data Column a = ... deriving (Binary)
This uses the DeriveAnyClass extension, which lets you derive any class whose methods have Generic-based defaults (as Binary's do); the type also needs a Generic instance, for example via DeriveGeneric. Then just deserialize a Column Double or a Column Int when you need it.

As the comment says, you can simply not case on the type, and always call
vec <- get
return Column {rawVector = vec, colName = nm, colType = pt}
This fulfills your type signature properly. But note that colType is not useful to you here -- you have no way to enforce that it corresponds to the type within your vector, since it only exists at the value level. But that may be ok, and you may simply want to remove colType from your data structure altogether, since you can always derive it directly from the concrete type of a chosen in Column v a.
In fact, the constraint in the Column type isn't doing much good either, and I think it would be better to render it just as
data Column v a = Column {rawVector :: v a, colName :: Text}
Now you can just enforce the G.Vector constraint at call sites where necessary...
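A minimal sketch of that idea, under two loudly stated substitutions so that it runs with base alone: String stands in for Text, and Foldable stands in for the G.Vector constraint. The function columnLength is hypothetical. The record itself carries no constraint; the function that needs one asks for it.

```haskell
-- The record stores the container as-is, with no constraint on v or a.
-- (String replaces Text here only to avoid a dependency in this sketch.)
data Column v a = Column { rawVector :: v a, colName :: String }

-- The constraint appears at the use site instead. Foldable is a stand-in
-- for G.Vector v a in this sketch.
columnLength :: Foldable v => Column v a -> Int
columnLength = length . rawVector
```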

Confusion about "type" and "data" in haskell

data MoneyAmount = Amount Float Currency
  deriving (Show, Eq)
data Currency = EUR | GBP | USD | CHF
  deriving (Show, Eq)
type Account = (Integer, MoneyAmount)
putAmount :: MoneyAmount -> Account -> Account
putAmount mon acc = undefined
I need to write a function that adds money to an account (display error if money added is wrong currency in account).
I know how to create an Amount
let moni = Amount 6.6 EUR
but I have no idea what to write to create an Account (I hope that sentence makes sense). I don't know how to manipulate the input to do the whole add-to-account thing.
I've tried things like
let acc = Account 1 moni
My question is more how to manipulate the Account so I can write the function.

type creates a type synonym; an Account is exactly the same as an (Integer, MoneyAmount), and you write it the same way:
let acc = (1, moni)

A type is just an alias. It doesn't define a new type but instead a new name for an existing type. So you could do
type Money = Float
And you can use Money wherever you can use a Float and vice versa. If you had
foo :: Float -> Float
foo x = 2 * x
Then
> foo (1 :: Float)
2.0
> foo (1 :: Money)
2.0
Both work fine. In your case, Account is just an alias for (Integer, MoneyAmount), so you would construct one just as you would any other tuple.
A data defines an entirely new type, and this requires new constructors. For example:
data Bool = False | True
defines the Bool type with the constructors False and True. A more complicated example would be
data Maybe a = Nothing | Just a
which defines the Maybe a polymorphic type with constructors Nothing :: Maybe a and Just :: a -> Maybe a. I've included the types of these constructors to highlight that they exist as normal values and functions. The difference between a function and a constructor is that you can do anything you want in a function, but a constructor is only allowed to take existing values and make a value of another type without performing any transformations to it. Constructors are just wrappers around values.
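Putting the two ideas together for the question's putAmount: because Account is only a synonym for a tuple, it is taken apart with an ordinary tuple pattern, while MoneyAmount is taken apart with its Amount constructor. The following is one possible sketch, not the only reasonable design. It keeps the signature from the question and, as an assumption about what "display error" should mean, calls error on a currency mismatch.

```haskell
data Currency = EUR | GBP | USD | CHF
  deriving (Show, Eq)

data MoneyAmount = Amount Float Currency
  deriving (Show, Eq)

type Account = (Integer, MoneyAmount)

-- The tuple pattern matches the synonym; the Amount patterns match the data type.
-- Assumption: a currency mismatch is reported by calling error.
putAmount :: MoneyAmount -> Account -> Account
putAmount (Amount added cur) (accId, Amount balance accCur)
  | cur == accCur = (accId, Amount (balance + added) accCur)
  | otherwise =
      error ("putAmount: cannot add " ++ show cur
             ++ " to an account held in " ++ show accCur)
```

If crashing on a mismatch is not acceptable, changing the result type to Either String Account would make the failure visible in the type.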

Retrieving number of fields in Haskell record

I am representing a table to store data as a Haskell record and I was wondering if there is a function to get the number of fields given a record?
I ask as I have a type class to represent a table and one of class functions is noOfCols; which correspond to the number of fields in a record representing a table.
data Price = Price {bid :: [Float], ask :: [Float]}
class Table a where
  noOfCols :: a -> Int
  ...
instance Table Price where
  noOfCols t = 2
  ...
So the problem is that I will be constantly adding new fields so it's possible to forget to update the instance implementation of noOfCols when I add new columns (fields) to Price; i.e. leaving it to 2 when I now have 3 or more fields.
Is there a function that can provide the number of fields for a given record so I don't have to make a manual edit every time I change the record?

This is something that can be solved via various generic programming libraries. For example:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data
data Price = Price {bid :: [Float], ask :: [Float]}
  deriving (Typeable, Data)
noOfCols :: Data a => a -> Int
noOfCols = gmapQl (+) 0 (const 1)
Then:
GHCi> noOfCols (Price [1,2] [3,4,5])
2
GHCi> noOfCols (0,0,0,0)
4
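One way this could be wired into the Table class from the question is to make Data a superclass and give noOfCols a default implementation, so each instance picks up the field count automatically. This is a sketch under that assumption; the third field, mid, is hypothetical and only there to show the count following the declaration.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQl)

-- Data is a superclass so the default method can count the immediate
-- fields of whichever constructor the value was built with.
class Data a => Table a where
  noOfCols :: a -> Int
  noOfCols = gmapQl (+) 0 (const 1)

-- 'mid' is a made-up extra column for this example.
data Price = Price { bid :: [Float], ask :: [Float], mid :: [Float] }
  deriving (Data)

-- No method body needed: the default is used.
instance Table Price
```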

Your noOfCols is the arity of the constructor function Price, i.e.
noOfCols = arity Price
See Haskell: Function to determine the arity of functions? for ways to implement arity (see the accepted answer). (Note: @kosmikus's answer is IMHO a better solution for you.)
Side note: Maybe [[Float]] is a better model for you, i.e.
type Price = [[Float]]
bid :: Price -> [Float]
bid = (!! 0)
ask :: Price -> [Float]
ask = (!! 1)
noOfCols :: Price -> Int
noOfCols = length

General conversion type class

I'd like to see if it is feasible to have a type class for converting one thing into another and back again from a mapping of [(a,b)].
This example should illustrate what I'd like to do:
data XX = One | Two | Three deriving (Show, Eq)
data YY = Eno | Owt | Eerht deriving (Show, Eq)
instance Convert XX YY where
  mapping = [(One, Eno), (Two, Owt), (Three, Eerht)]
-- // How can I make this work?:
main = do print $ (convert One :: YY) -- Want to output: Eno
          print $ (convert Owt :: XX) -- Want to output: Two
Here's my stab at making this work:
{-# LANGUAGE MultiParamTypeClasses #-}
import Data.Maybe(fromJust)
lk = flip lookup
flipPair = uncurry $ flip (,)
class (Eq a, Eq b) => Convert a b where
  mapping :: [(a, b)]
  mapping = error "No mapping defined"
  convert :: a -> b
  convert = fromJust . lk mapping
-- // This won't work:
instance (Convert a b) => Convert b a where
  convert = fromJust . lk (map flipPair mapping)
It is easy to do this with defining two instances for the conversion going either way but I'd like to only have to declare one as in the first example. Any idea how I might do this?
Edit: By feasible I mean, can this be done without overlapping instances or any other nasty extensions?

I, er... I almost hate to suggest this, because doing this is kinda horrible, but... doesn't your code work as is?
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE OverlappingInstances #-}
import Data.Maybe(fromJust)
lk x = flip lookup x
flipPair = uncurry $ flip (,)
class (Eq a, Eq b) => Convert a b where
  mapping :: [(a, b)]
  mapping = error "No mapping defined"
  convert :: a -> b
  convert = fromJust . lk mapping
instance (Convert a b) => Convert b a where
  convert = fromJust . lk (map flipPair mapping)
data XX = One | Two | Three deriving (Show, Eq)
data YY = Eno | Owt | Eerht deriving (Show, Eq)
instance Convert XX YY where
  mapping = [(One, Eno), (Two, Owt), (Three, Eerht)]
main = do print $ (convert One :: YY)
          print $ (convert Owt :: XX)
And:
[1 of 1] Compiling Main ( GeneralConversion.hs, interpreted )
Ok, modules loaded: Main.
*Main> main
Eno
Two
*Main>
I'm not sure how useful such a type class is, and all the standard disclaimers about dubious extensions apply, but that much seems to work. Now, if you want to do anything fancier... like Convert a a or (Convert a b, Convert b c) => Convert a c... things might get awkward.
I suppose I might as well leave a few thoughts about why I doubt the utility of this:
In order to use the conversion, both types must be unambiguously known; likewise, the existence of a conversion depends on both types. This limits how useful the class can be for writing very generic code, compared to things such as fromIntegral.
The use of error to handle missing conversions, combined with the above, means that any allegedly generic function using convert will be a seething pit of runtime errors just waiting to happen.
To top it all off, the generic instance being used for the reversed mapping is in fact a universal instance, only being hidden by overlapped, more specific instances. That (Convert a b) in the context? That lets the reversed mapping work, but doesn't restrict it to only reversing instances that are specifically defined.
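For comparison, here is a sketch of a different design that sidesteps the concerns above. It is not the code from the question: it gives up the single convert name in favour of two ordinary functions, forward and backward (both names hypothetical), which share one mapping per instance. It needs only MultiParamTypeClasses, has no universal instance, and returns Maybe instead of calling error.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
import Data.Tuple (swap)

-- One instance per pair of types; only the mapping is declared.
class (Eq a, Eq b) => Convert a b where
  mapping :: [(a, b)]

-- Look up left-to-right in the declared mapping.
forward :: Convert a b => a -> Maybe b
forward x = lookup x mapping

-- Look up right-to-left by swapping each pair of the same mapping.
backward :: Convert a b => b -> Maybe a
backward y = lookup y (map swap mapping)

data XX = One | Two | Three deriving (Show, Eq)
data YY = Eno | Owt | Eerht deriving (Show, Eq)

instance Convert XX YY where
  mapping = [(One, Eno), (Two, Owt), (Three, Eerht)]
```

As with the original class, both types must be known at the use site, for example through an annotation such as forward One :: Maybe YY.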
