Haskell generic data structure - haskell

I want to create a type to store some generic information. In my case this type is
Molecule, where I store a chemical graph and molecular properties.
data Molecule = Molecule {
    name :: Maybe String,
    graph :: Gr Atom Bond,
    property :: Maybe [Property] -- that's a question
  } deriving(Show)
I want to represent properties as tuples
type Property a = (String,a)
because a property may have any type: Float, Int, String, etc.
The question is how to define the Molecule data structure so that I can collect any number of properties of any type in a Molecule. If I do
data Molecule a = Molecule {
    name :: Maybe String,
    graph :: Gr Atom Bond,
    property :: Maybe [Property a]
  } deriving(Show)
I have to directly assign one type when I create a molecule.

If you know in advance the set of properties a molecule might have, you could define a sum type:
data Property = Mass Float | CatalogNum Int | Comment String
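A minimal sketch of how the closed sum type could be used. The graph field is left out so that no graph library is needed, and the `properties` field name, the `mass` helper and the `water` value are assumptions made for illustration only.

```haskell
-- Closed set of properties, as in the sum type above.
data Property = Mass Float | CatalogNum Int | Comment String
  deriving (Show, Eq)

-- Simplified molecule (assumption: the chemical graph is omitted here).
data Molecule = Molecule
  { name       :: Maybe String
  , properties :: [Property]
  } deriving (Show)

-- Return the first Mass property, if the molecule has one.
mass :: Molecule -> Maybe Float
mass m = case [x | Mass x <- properties m] of
  (x:_) -> Just x
  []    -> Nothing

-- Hypothetical example value.
water :: Molecule
water = Molecule (Just "water") [Mass 18.015, Comment "solvent"]
```

Each property carries its own type, so one list can mix Float, Int and String payloads, but adding a new kind of property means editing the Property declaration.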
If you want this type to be extensible, you could use Data.Dynamic as another answer suggests. For instance:
data Molecule = Molecule { name :: Maybe String,
                           graph :: Gr Atom Bond,
                           property :: [(String,Dynamic)]
                         } deriving (Show)
mass :: Molecule -> Maybe Float
mass m = case lookup "mass" (property m) of
  Nothing -> Nothing
  Just i -> fromDynamic i
You could also get rid of the "stringly-typed" (String,a) pairs, say:
-- in Molecule:
-- property :: [Dynamic]
data Mass = Mass Float
mass :: Molecule -> Maybe Mass
mass m = ...
Neither of these attempts gives much type safety over just parsing out of (String,String) pairs since there is no way to enforce the invariant that the user creates well-formed properties (short of wrapping properties in a new type and hiding the constructors in another module, which again breaks extensibility).
What you might want are OCaml-style polymorphic variants. You could look at Vinyl, which provides type-safe extensible records.
As an aside, you might want to get rid of the Maybe wrapper around the list of properties, since the empty list already encodes the case of no properties.
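For reference, here is a self-contained sketch of the Data.Dynamic approach. It assumes a simplified Molecule without the graph field; `lookupProperty`, `comment` and `water` are hypothetical names introduced for this example.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (Typeable)

-- Simplified molecule (assumption: the chemical graph is omitted here).
data Molecule = Molecule
  { name     :: Maybe String
  , property :: [(String, Dynamic)]
  } deriving (Show)

-- Look up a property by key and try to cast it to the requested type.
-- The result is Nothing if the key is missing or the stored type differs.
lookupProperty :: Typeable a => String -> Molecule -> Maybe a
lookupProperty key m = lookup key (property m) >>= fromDynamic

mass :: Molecule -> Maybe Float
mass = lookupProperty "mass"

comment :: Molecule -> Maybe String
comment = lookupProperty "comment"

-- Hypothetical example value mixing a Float and a String property.
water :: Molecule
water = Molecule (Just "water")
  [ ("mass", toDyn (18.015 :: Float))
  , ("comment", toDyn "solvent")
  ]
```

Note that a type mismatch only shows up at run time as Nothing, which is the weakness in type safety discussed above.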

You might want to look at Data.Dynamic for a pseudo-dynamic typing solution.

Related

List of a Type Class instance

I've been playing around with Haskell type classes and I am facing a problem I hope someone can help me solve. Bear in mind that I come from a Swift background and am "trying" to port some protocol-oriented knowledge to Haskell code.
Initially I declared a bunch of JSON parsers which had the same structure, just a different implementation:
data Candle = Candle {
    mts :: Integer,
    open :: Double,
    close :: Double
  }
data Bar = Bar {
    mts :: Integer,
    min :: Double,
    max :: Double
  }
Then I decided to create a "Class" that would define their basic operations:
class GenericData a where
  dataName :: a -> String
  dataIdentifier :: a -> Double
  dataParsing :: a -> String -> Maybe a
  dataEmptyInstance :: a
instance GenericData Candle where
  dataName _ = "Candle"
  dataIdentifier = fromInteger . mts
  dataParsing _ = candleParsing
  dataEmptyInstance = emptyCandle
instance GenericData Bar where
  dataName _ = "Bar"
  dataIdentifier = fromInteger . mts
  dataParsing _ = barParsing
  dataEmptyInstance = emptyBar
My first code smell was the need to include "a" when it was not needed (dataName or dataParsing), but I proceeded anyway.
analyzeArguments :: GenericData a => String -> [String] -> Maybe (a, [String])
analyzeArguments [] _ = Nothing
analyzeArguments _ [] = Nothing
analyzeArguments name rest
  | name == "Candles" = Just (head possibleCandidates, rest)
  | name == "Bar" = Just (last possibleCandidates, rest)
  | otherwise = Nothing
possibleCandidates :: GenericData a => [a]
possibleCandidates = [emptyCandle, emptyBar]
Now, when I want to select if either instance should be selected to perform parsing, I always get the following error
• Couldn't match expected type ‘a’ with actual type ‘Candle’
‘a’ is a rigid type variable bound by
the type signature for:
possibleCandidates :: forall a. GenericData a => [a]
at src/GenericRecords.hs:42:29
My objective was to create a list of instances of GenericData because other functions depend on that being selected to execute the correct dataParser. I understand this has something to do with the type class checker, the * -> Constraint, but still not finding a way to solve this conflict. I have used several GHC language extensions but none has solved the problem.
You have a type signature:
possibleCandidates :: GenericData a => [a]
Which you might think implies that you can put anything in that list as long as it is an instance of GenericData. But that is not the way Haskell's type system actually works. The value possibleCandidates can be a list of any type which has a GenericData instance, but every element of the list must be of the same type.
What the GHC error message is telling you (in its own special way) is that the first element of the list is a Candle so it thinks that the rest of the list should also be of type Candle but the second element is actually a Bar.
Now there are ways to make heterogeneous lists (and other collections) in Haskell, but it is almost never the right thing to do.
One typical solution to this problem is to just merge everything down into one sum data type:
data GenericData = GenericCandle Candle | GenericBar Bar
You could even forgo the step of indirection and just put the Candle and Bar data directly into the data structure.
Now instead of a class you just have a datatype and your class functions become normal functions:
dataName :: GenericData -> String
dataIdentifier :: GenericData -> Double
dataParsing :: GenericData -> String -> Maybe GenericData
dataEmptyInstance :: String -> GenericData
There are some other more complex ways to make this work, but if a sum data type fits the bill, use it. It is very common for parsers in Haskell to have a large sum data type (usually also recursive) as their result. Take a look at the Value type in Aeson, the standard JSON library, for an example.
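As a rough sketch of the sum-type approach, the functions could be written by pattern matching on the constructor. The field names `candleMts`, `barMts`, `barMin` and `barMax` are renamed here as an assumption, to avoid duplicate record fields and clashes with the Prelude's `min` and `max`. Returning Maybe from `dataEmptyInstance` is also an assumption, made to handle unknown names.

```haskell
data Candle = Candle { candleMts :: Integer, open :: Double, close :: Double }
  deriving (Show, Eq)

data Bar = Bar { barMts :: Integer, barMin :: Double, barMax :: Double }
  deriving (Show, Eq)

-- One closed type covering every kind of record.
data GenericData = GenericCandle Candle | GenericBar Bar
  deriving (Show, Eq)

dataName :: GenericData -> String
dataName (GenericCandle _) = "Candle"
dataName (GenericBar _)    = "Bar"

dataIdentifier :: GenericData -> Double
dataIdentifier (GenericCandle c) = fromInteger (candleMts c)
dataIdentifier (GenericBar b)    = fromInteger (barMts b)

-- Select an empty value by name; Nothing for unknown names.
dataEmptyInstance :: String -> Maybe GenericData
dataEmptyInstance "Candle" = Just (GenericCandle (Candle 0 0 0))
dataEmptyInstance "Bar"    = Just (GenericBar (Bar 0 0 0))
dataEmptyInstance _        = Nothing

-- The list from the question now type checks: every element is a GenericData.
possibleCandidates :: [GenericData]
possibleCandidates = [GenericCandle (Candle 0 0 0), GenericBar (Bar 0 0 0)]
```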

Retrieve hidden type of a phantom type

I declared a phantom type like this in Haskell.
newtype Length (a::UnitLength) b = Length b deriving (Eq,Show)
data UnitLength = Meter
| KiloMeter
| Miles
deriving (Eq,Show)
Now, I would like to write some functions that use this type, but I haven't managed to see or use the hidden type.
Is it possible to retrieve the hidden type a of the phantom type Length to perform tests, pattern matching, ...?
If you want a runtime representation of the phantom type you have used, you have to use what we call a singleton. It has precisely one constructor for each of the constructors in UnitLength, and their types say precisely which constructor we are considering:
data SUnitLength (a :: UnitLength) where
  SMeter :: SUnitLength Meter
  SKiloMeter :: SUnitLength KiloMeter
  SMiles :: SUnitLength Miles
Now that you have this you can for instance write a display function picking the right unit abbreviation depending on the phantom parameter:
display :: Show b => SUnitLength a -> Length a b -> String
display sa l = show (payload l) ++
  case sa of
    SKiloMeter -> "km"
    _ -> "m"
Now, that does not really match your demand: the parameter a is available in the type Length a b but we somehow still have to manufacture the witness by hand. That's annoying. One way to avoid this issue is to define a type class doing that work for us. CUnitLength a tells us that provided a value of type Length a b, we can get a witness SUnitLength a of the shape a has.
class CUnitLength (a :: UnitLength) where
  getUnit :: Length a b -> SUnitLength a
It is easy for us to write instances of CUnitLength for the various UnitLength constructors: getUnit can even ignore its argument!
instance CUnitLength Meter where
  getUnit _ = SMeter
instance CUnitLength KiloMeter where
  getUnit _ = SKiloMeter
instance CUnitLength Miles where
  getUnit _ = SMiles
So why bother with getUnit's argument? Well, if we remove it, getUnit needs to somehow magically guess which a it is supposed to describe. Sometimes it's possible to infer that a based on the expected type at the call site, but sometimes it's not. Having the Length a b argument guarantees that all calls will be unambiguous. We can always recover a simpler getUnit' anyway:
getUnit' :: CUnitLength a => SUnitLength a
getUnit' = getUnit (undefined :: Length a ())
Which leads us to the last definition display' which has the same role as display but does not require the extra argument:
display' :: (CUnitLength a, Show b) => Length a b -> String
display' = display getUnit'
I have put everything (including the LANGUAGE extensions, and the definition of payload to extract a b from Length a b) in a self-contained gist in case you want to play with the code.
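For readers without access to the gist, the following is a self-contained reconstruction, not the gist itself. The extension list and the definition of `payload` are assumptions. `display'` is written here as `display (getUnit l) l`, which sidesteps the `undefined` trick, and the "mi" suffix for miles is an addition for illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data UnitLength = Meter | KiloMeter | Miles
  deriving (Eq, Show)

-- The phantom parameter a records the unit at the type level only.
newtype Length (a :: UnitLength) b = Length b
  deriving (Eq, Show)

-- Assumed helper: extract the wrapped value.
payload :: Length a b -> b
payload (Length b) = b

-- Singleton: one constructor per UnitLength constructor.
data SUnitLength (a :: UnitLength) where
  SMeter     :: SUnitLength 'Meter
  SKiloMeter :: SUnitLength 'KiloMeter
  SMiles     :: SUnitLength 'Miles

-- Class that manufactures the run-time witness from the type.
class CUnitLength (a :: UnitLength) where
  getUnit :: Length a b -> SUnitLength a

instance CUnitLength 'Meter where
  getUnit _ = SMeter
instance CUnitLength 'KiloMeter where
  getUnit _ = SKiloMeter
instance CUnitLength 'Miles where
  getUnit _ = SMiles

display :: Show b => SUnitLength a -> Length a b -> String
display sa l = show (payload l) ++
  case sa of
    SMeter     -> "m"
    SKiloMeter -> "km"
    SMiles     -> "mi"

-- No explicit witness needed: the class finds it from the type of l.
display' :: (CUnitLength a, Show b) => Length a b -> String
display' l = display (getUnit l) l
```

With these definitions, `display' (Length 5 :: Length 'KiloMeter Int)` evaluates to "5km".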

Haskell polymorphic functions with records and class types

This post is a follow-up to this one.
I'm building a simple battle system as a toy project, the typical system you can find in games like Final Fantasy and similar. I've solved the notorious "Namespace Pollution" problem with a type class + custom instances. For example:
type HitPoints = Integer
type ManaPoints = Integer
data Status = Sleep | Poison | .. --Omitted
data Element = Fire | ... --Omitted
class Targetable a where
  name :: a -> String
  level :: a -> Int
  hp :: a -> HitPoints
  mp :: a -> ManaPoints
  status :: a -> Maybe [Status]
data Monster = Monster{monsterName :: String,
                       monsterLevel :: Int,
                       monsterHp :: HitPoints,
                       monsterMp :: ManaPoints,
                       monsterElemType :: Maybe Element,
                       monsterStatus :: Maybe [Status]} deriving (Eq, Read)
instance Targetable Monster where
  name = monsterName
  level = monsterLevel
  hp = monsterHp
  mp = monsterMp
  status = monsterStatus
data Player = Player{playerName :: String,
                     playerLevel :: Int,
                     playerHp :: HitPoints,
                     playerMp :: ManaPoints,
                     playerStatus :: Maybe [Status]} deriving (Show, Read)
instance Targetable Player where
  name = playerName
  level = playerLevel
  hp = playerHp
  mp = playerMp
  status = playerStatus
Now the problem: I have a spell type, and a spell can deal damage or inflict a status (like Poison, Sleep, Confusion, etc):
--Essentially the result of a spell cast
data SpellEffect = Damage HitPoints ManaPoints
                 | Inflict [Status] deriving (Show)
--Essentially a magic
data Spell = Spell{spellName :: String,
                   spellCost :: Integer,
                   spellElem :: Maybe Element,
                   spellEffect :: SpellEffect} deriving (Show)
--For example
fire = Spell "Fire" 20 (Just Fire) (Damage 100 0)
frogSong = Spell "Frog Song" 30 Nothing (Inflict [Frog, Sleep])
As suggested in the linked topic, I've created a generic "cast" function like this:
--cast function
cast :: (Targetable t) => Spell -> t -> t
cast s t =
  case spellEffect s of
    Damage hp mana -> t
    Inflict statList -> t
As you can see, the return type is t, shown here just for consistency. I want to be able to return a new targetable (i.e. a Monster or a Player) with some field values altered (for example a new Monster with less hp, or with a new status). The problem is that I can't just do the following:
--cast function
cast :: (Targetable t) => Spell -> t -> t
cast s t =
  case spellEffect s of
    Damage hp' mana' -> t {hp = hp', mana = mana'}
    Inflict statList -> t {status = statList}
because hp, mana and status "are not valid record selector". The problem is that I don't know a priori whether t will be a monster or a player, and I don't want to specify "monsterHp" or "playerHp"; I want to write a fairly generic function.
I know that Haskell records are clumsy and not very extensible...
Any idea?
Bye and happy coding,
Alfredo
Personally, I think hammar is on the right track with pointing out the similarities between Player and Monster. I agree you don't want to make them the same, but consider this: Take the type class you have here...
class Targetable a where
  name :: a -> String
  level :: a -> Int
  hp :: a -> HitPoints
  mp :: a -> ManaPoints
  status :: a -> Maybe [Status]
...and replace it with a data type:
data Targetable = Targetable { name :: String
                             , level :: Int
                             , hp :: HitPoints
                             , mp :: ManaPoints
                             , status :: Maybe [Status]
                             } deriving (Eq, Read, Show)
Then factor out the common fields from Player and Monster:
data Monster = Monster { monsterTarget :: Targetable
                       , monsterElemType :: Maybe Element
                       } deriving (Eq, Read, Show)
data Player = Player { playerTarget :: Targetable } deriving (Eq, Read, Show)
Depending on what you do with these, it might make more sense to turn it inside-out instead:
data Targetable a = Targetable { target :: a
                               , name :: String
                               -- &c...
                               }
...and then have Targetable Player and Targetable Monster. The advantage here is that any functions that work with either can take things of type Targetable a--just like functions that would have taken any instance of the Targetable class.
Not only is this approach nearly identical to what you have already, it's also a lot less code, and keeps the types simpler (by not having class constraints everywhere). In fact, the Targetable type above is roughly what GHC creates behind the scenes for the type class.
The biggest downside to this approach is that it makes accessing fields clumsier--either way, some things end up being two layers deep, and extending this approach to more complicated types can nest them deeper still. A lot of what makes this awkward is the fact that field accessors aren't "first class" in the language--you can't pass them around like functions, abstract over them, or anything like that. The most popular solution is to use "lenses", which another answer mentioned already. I've typically used the fclabels package for this, so that's my recommendation.
The factored-out types I suggest, combined with strategic use of lenses, should give you something that's simpler to use than the type class approach, and doesn't pollute the namespace the way having lots of record types does.
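To make the factored-out design concrete, here is a small sketch using plain record update rather than lenses. The reduced field set, the list-valued `status`, the `applyEffect`, `castOnMonster` and `castOnPlayer` helpers, and the choice to subtract damage from the current hit points are all assumptions for illustration.

```haskell
type HitPoints = Integer
type ManaPoints = Integer

data Status = Sleep | Poison | Frog
  deriving (Eq, Show)

-- Common fields shared by players and monsters.
data Targetable = Targetable
  { name   :: String
  , hp     :: HitPoints
  , mp     :: ManaPoints
  , status :: [Status]
  } deriving (Eq, Show)

data Monster = Monster { monsterTarget :: Targetable } deriving (Eq, Show)
data Player  = Player  { playerTarget  :: Targetable } deriving (Eq, Show)

data SpellEffect = Damage HitPoints ManaPoints | Inflict [Status]
  deriving (Show)

-- All generic logic lives on the shared record, so ordinary
-- record update syntax works without any class constraint.
applyEffect :: SpellEffect -> Targetable -> Targetable
applyEffect (Damage h m) t = t { hp = hp t - h, mp = mp t - m }
applyEffect (Inflict ss) t = t { status = ss ++ status t }

-- Thin wrappers that reach one level down into each concrete type.
castOnMonster :: SpellEffect -> Monster -> Monster
castOnMonster e m = m { monsterTarget = applyEffect e (monsterTarget m) }

castOnPlayer :: SpellEffect -> Player -> Player
castOnPlayer e p = p { playerTarget = applyEffect e (playerTarget p) }
```

The two wrappers are exactly the kind of boilerplate that a lens into the nested Targetable field would remove.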
I can suggest three possible solutions.
1) Your types are very OO-like, but Haskell can also express "sum" types with parameters:
data Unit = UMon Monster | UPlay Player
cast :: Spell -> Unit -> Unit
cast s t =
  case spellEffect s of
    Damage hp' mana' -> case t of
      UMon m -> UMon (m { monsterHp = monsterHp m - hp', monsterMp = monsterMp m - mana' })
      UPlay p -> UPlay (p { playerHp = playerHp p - hp' })
    Inflict statList -> undefined
Things that are similar in OO design often become "sum" types with parameters in Haskell.
2) You can do what Carston suggests and add all your methods to type classes.
3) You can change your read-only methods in Targetable to be "lenses" that expose both getting and setting. See the stack overflow discussion. If your type class returned lenses then it would make your spell damage possible to apply.
Why don't you just include functions like
  inflictDamage :: a -> Int -> a
  addStatus :: a -> Status -> a
in your type class?
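A sketch of that last suggestion, assuming a reduced set of fields. The update methods are spelled `inflictDamage` and `addStatus` because Haskell function names must start with a lowercase letter. The simplified SpellEffect and the list-valued status are assumptions.

```haskell
type HitPoints = Integer

data Status = Sleep | Poison
  deriving (Eq, Show)

data SpellEffect = Damage HitPoints | Inflict [Status]

-- The class exposes updates as well as read-only accessors.
class Targetable a where
  hp            :: a -> HitPoints
  status        :: a -> [Status]
  inflictDamage :: a -> HitPoints -> a
  addStatus     :: a -> Status -> a

data Monster = Monster
  { monsterName :: String, monsterHp :: HitPoints, monsterStatus :: [Status] }
  deriving (Eq, Show)

data Player = Player
  { playerName :: String, playerHp :: HitPoints, playerStatus :: [Status] }
  deriving (Eq, Show)

instance Targetable Monster where
  hp = monsterHp
  status = monsterStatus
  inflictDamage m d = m { monsterHp = monsterHp m - d }
  addStatus m s = m { monsterStatus = s : monsterStatus m }

instance Targetable Player where
  hp = playerHp
  status = playerStatus
  inflictDamage p d = p { playerHp = playerHp p - d }
  addStatus p s = p { playerStatus = s : playerStatus p }

-- cast is now generic: it only uses class methods, never record syntax.
cast :: Targetable t => SpellEffect -> t -> t
cast (Damage d)   t = inflictDamage t d
cast (Inflict ss) t = foldl addStatus t ss
```

The cost is that every instance has to spell out each update by hand, which is the repetition the lens-based suggestion tries to avoid.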

What's the recommended way of handling complexly composed POD(plain-old-data in OO) in Haskell?

I'm a Haskell newbie.
In statically typed OO languages (for instance, Java), all complex data structures are represented as classes and instances. An object can have many attributes (fields), and another object can be the value of a field. Those fields are accessed by name and statically typed by the class. Together, those objects form a huge graph of objects linked to each other. Most programs use a data graph like this.
How can I achieve this functionality in Haskell?
If you really do have data without behavior, this maps nicely to a Haskell record:
data Person = Person { name :: String
                     , address :: String }
  deriving (Eq, Read, Show)
data Department = Management | Accounting | IT | Programming
  deriving (Eq, Read, Show)
data Employee = Employee { identity :: Person
                         , idNumber :: Int
                         , department :: Department }
              | Contractor { identity :: Person
                           , company :: String }
  deriving (Eq, Read, Show)
This says that a Person is a Person who has a name and address (both Strings); a Department is either Management, Accounting, IT, or Programming; and an Employee is either an Employee who has an identity (a Person), an idNumber (an Int), and a department (a Department), or is a Contractor who has an identity (a Person) and a company (a String). The deriving (Eq, Read, Show) lines enable you to compare these objects for equality, read them in, and convert them to strings.
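A short sketch of how such records might be built, read and "modified" follows. The values `alice` and `bob` and the helpers `employeeName` and `transfer` are hypothetical names added for illustration.

```haskell
data Person = Person { name :: String
                     , address :: String }
  deriving (Eq, Read, Show)

data Department = Management | Accounting | IT | Programming
  deriving (Eq, Read, Show)

data Employee = Employee { identity :: Person
                         , idNumber :: Int
                         , department :: Department }
              | Contractor { identity :: Person
                           , company :: String }
  deriving (Eq, Read, Show)

-- Construction with named fields.
alice :: Employee
alice = Employee { identity = Person "Alice" "1 Main St"
                 , idNumber = 7
                 , department = IT }

bob :: Employee
bob = Contractor { identity = Person "Bob" "2 Side St", company = "Acme" }

-- Field accessors are ordinary functions, so they compose.
employeeName :: Employee -> String
employeeName = name . identity

-- Record update returns a new value; the original is unchanged.
-- Contractors have no department, so they are returned as they are.
transfer :: Department -> Employee -> Employee
transfer d e@Employee{} = e { department = d }
transfer _ c            = c
```

Here `identity` works on both constructors, while `department` is only defined for Employee, which is why `transfer` pattern matches before updating.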
In general, a Haskell data type is a combination of unions (also called sums) and tuples (also called products); see note 1 below. The |s denote choice (a union): an Employee is either an Employee or a Contractor, a Department is one of four things, etc. In general, tuples are written something like the following:
data Process = Process String Int
This says that Process (in addition to being a type name) is a data constructor with type String -> Int -> Process. Thus, for instance, Process "init" 1, or Process "ls" 57300. A Process has to have both a String and an Int to exist. The record notation used above is just syntactic sugar for these products; I could also have written data Person = Person String String, and then defined
name :: Person -> String
name (Person n _) = n
address :: Person -> String
address (Person _ a) = a
Record notation, however, can be nice for complex data structures.
Also note that you can parametrize a Haskell type over other types; for instance, a three-dimensional point could be data Point3 a = Point3 a a a. This means that Point3 :: a -> a -> a -> Point3 a, so that one could write Point3 (3 :: Int) (4 :: Int) (5 :: Int) to get a Point3 Int, or Point3 (1.1 :: Double) (2.2 :: Double) (3.3 :: Double) to get a Point3 Double. (Or Point3 1 2 3 to get a Num a => Point3 a, if you've seen type classes and overloaded numeric literals.)
This is what you need to represent a data graph. However, take note: one problem for people transitioning from imperative languages to functional ones—or, really, between any two different paradigms (C to Python, Prolog to Ruby, Erlang to Java, whatever)—is to continue to try to solve problems the old way. The solution you're trying to model may not be constructed in a way amenable to easy functional programming techniques, even if the problem is. For instance, in Haskell, thinking about types is very important, in a way that's different from, say, Java. At the same time, implementing behaviors for those types is done very differently: higher-order functions capture some of the abstractions you've seen in Java, but also some which aren't easily expressible (map :: (a -> b) -> [a] -> [b], filter :: (a -> Bool) -> [a] -> [a], and foldr :: (a -> b -> b) -> b -> [a] -> b come to mind). So keep your options open, and consider addressing your problems in a functional way. Of course, maybe you are, in which case, full steam ahead. But do keep this in mind as you explore a new language. And have fun :-)
1: And recursion: you can represent a binary tree, for instance, with data Tree a = Leaf a | Branch a (Tree a) (Tree a).
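As an illustration of the recursive type in note 1, here is a small sketch with two functions over it. `toList`, `mapTree` and `example` are hypothetical names chosen for this example.

```haskell
-- A binary tree that stores a value at every leaf and every branch.
data Tree a = Leaf a | Branch a (Tree a) (Tree a)
  deriving (Eq, Show)

-- In-order traversal: left subtree, node value, right subtree.
toList :: Tree a -> [a]
toList (Leaf x)       = [x]
toList (Branch x l r) = toList l ++ [x] ++ toList r

-- Apply a function to every element, keeping the shape of the tree.
mapTree :: (a -> b) -> Tree a -> Tree b
mapTree f (Leaf x)       = Leaf (f x)
mapTree f (Branch x l r) = Branch (f x) (mapTree f l) (mapTree f r)

-- Hypothetical example value.
example :: Tree Int
example = Branch 2 (Leaf 1) (Leaf 3)
```

Both functions recurse in the same places where the type itself is recursive, which is the usual pattern for processing such structures.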
Haskell has algebraic data types, which can describe structures or unions of structures, so that a value of a given type can hold one of several different sets of fields. These fields can be set and accessed either positionally or by name with record syntax.
See here: http://learnyouahaskell.com/making-our-own-types-and-typeclasses

When should I use record syntax for data declarations in Haskell?

Record syntax seems extremely convenient compared to having to write your own accessor functions. I've never seen anyone give any guidelines as to when it's best to use record syntax over normal data declaration syntax, so I'll just ask here.
You should use record syntax in two situations:
1) The type has many fields
2) The type declaration gives no clue about its intended layout
For instance a Point type can be simply declared as:
data Point = Point Int Int deriving (Show)
It is obvious that the first Int denotes the x coordinate and the second stands for y. But the case with the following type declaration is different (taken from Learn You a Haskell for Great Good):
data Person = Person String String Int Float String String deriving (Show)
The intended type layout is: first name, last name, age, height, phone number, and favorite ice-cream flavor. But this is not evident in the above declaration. Record syntax comes in handy here:
data Person = Person { firstName :: String
                     , lastName :: String
                     , age :: Int
                     , height :: Float
                     , phoneNumber :: String
                     , flavor :: String
                     } deriving (Show)
The record syntax made the code more readable, and saved a great deal of typing by automatically defining all the accessor functions for us!
In addition to complex multi-fielded data, newtypes are often defined with record syntax. In either of these cases, there aren't really any downsides to using record syntax, but in the case of sum types, record accessors usually don't make sense. For example:
data Either a b = Left { getLeft :: a } | Right { getRight :: b }
is valid, but the accessor functions are partial – it is an error to write getLeft (Right "banana"). For that reason, such accessors are generally speaking discouraged; something like getLeft :: Either a b -> Maybe a would be more common, and that would have to be defined manually. However, note that accessors can share names:
data Item = Food { description :: String, tastiness :: Integer }
          | Wand { description :: String, magic :: Integer }
Now description is total, although tastiness and magic both still aren't.
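A minimal sketch of the manual, total alternative to the partial accessors. The names `tastinessMaybe`, `magicMaybe`, `apple` and `stick` are assumptions introduced for this example.

```haskell
data Item = Food { description :: String, tastiness :: Integer }
          | Wand { description :: String, magic :: Integer }
  deriving (Eq, Show)

-- Total versions of the partial accessors: the absence of the field
-- is reported as Nothing instead of a run-time error.
tastinessMaybe :: Item -> Maybe Integer
tastinessMaybe (Food _ t) = Just t
tastinessMaybe _          = Nothing

magicMaybe :: Item -> Maybe Integer
magicMaybe (Wand _ m) = Just m
magicMaybe _          = Nothing

-- Hypothetical example values.
apple, stick :: Item
apple = Food "apple" 5
stick = Wand "stick" 2
```

The shared accessor `description` is safe on both values, whereas calling `tastiness stick` would fail at run time.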
