Algebraic Data Types in Haskell


Hello I have only worked with imperative programming so far and I am learning Haskell :)
I have the following algebraic data types:
data Day = I | ... | XXXI deriving(Ord,Eq) (Days in roman numerals)
data Month = Jan | ... | Dec deriving(Ord,Eq) (Months abbreviated to 3 letters)
data Year = 0 | ... | 2021 (actually Ints)
and I need to do some calculations with those. My first thought was mapping the Days and Months to Ints and doing the calculations from there. For example:
dayConversionMap = [(I,1), (II,2), (III,3), (IV,4), ... , (XXXI,31)]
monthConversionMap = [(Jan,1), (Feb,2), (Mar,3), ... , (Dec,12)]
My question is:
Is this a good solution for my problem?
How could I convert the Days/Months into Ints in a function, given I have those maps?
Thanks in advance! :)

Don't, and I repeat don't do date and time calculations yourself. Ship out to a good library. See also Falsehoods programmers believe about time for an incomplete list of reasons to do this.

Answering your questions:
There are many ways of doing what you are looking for, and a practical way would be to use a library, as already pointed out. Even better, find a library and read the source code. Hoogle is your friend.
But for the purpose of learning Haskell:
Instead of mapping them manually, you could try providing a function. And because this is a behaviour you want several types to share, you could create a type class. One naive (maybe didactic, less so practical, maybe fun to play with) way would be:
Define a class that provides a way to convert to and from Int; let's call it ToInt. Now you have a common interface for getting that behaviour out of all your types.
class ToInt a where
  toInt :: a -> Int
  fromInt :: Int -> Maybe a
Now you can implement toInt and fromInt for your types and use them to convert.
instance ToInt Day where
  toInt x =
    case x of
      I -> 1
      II -> 2
      -- ...
  fromInt x =
    case x of
      1 -> Just I
      -- ...
      _ -> Nothing -- any Int that is not a valid day
instance ToInt Month where
  -- ...
calculation :: Day -> Month -> Year -> Int
calculation d m y = doSomething (toInt d)
  where
    doSomething :: Int -> Int
    doSomething = ...
Note that this is a simple but bad example.
You can see the tedious nature of both implementations, and the limited help you get from the type checker. But once you start implementing something, you get a feel for what works and what doesn't. Then you can look at how others deal with these issues in the actual libraries, for example.
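As a less tedious variation on the above, here is a minimal sketch that assumes you are free to add Enum and Bounded to the deriving clause of your types. It shows both a lookup in the question's association list and the derived fromEnum/toEnum route. The names monthToInt, monthToInt' and intToMonth are made up for illustration, and the same idea should carry over to Day.

```haskell
-- The question's Month type, written out in full, with Enum and Bounded
-- added to the deriving clause (an assumption; the question only derives
-- Ord and Eq).
data Month = Jan | Feb | Mar | Apr | May | Jun
           | Jul | Aug | Sep | Oct | Nov | Dec
  deriving (Eq, Ord, Show, Enum, Bounded)

-- The association list from the question, generated instead of typed out.
monthConversionMap :: [(Month, Int)]
monthConversionMap = zip [minBound .. maxBound] [1 ..]

-- Using the map: lookup returns Nothing if the key is missing.
monthToInt :: Month -> Maybe Int
monthToInt m = lookup m monthConversionMap

-- Without the map: fromEnum counts constructors from 0, so add 1.
monthToInt' :: Month -> Int
monthToInt' m = fromEnum m + 1

-- The inverse direction, guarded so that toEnum is never called on an
-- out-of-range number.
intToMonth :: Int -> Maybe Month
intToMonth n
  | n >= 1 && n <= 12 = Just (toEnum (n - 1))
  | otherwise         = Nothing
```

This only covers the conversion itself; the advice to use a library for real date arithmetic still stands.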

Related

Avoiding boilerplate code due to equivalent constructors

I have an ADT as follows:
Prelude> data Bond = FixedRateBond Float Float | FloatingRateBond Float Float
I want to do an operation on every value constructors of this ADT as follows:
Prelude> let foo :: Bond -> Float
Prelude| foo (FixedRateBond a b) = a + b
Prelude| foo (FloatingRateBond a b) = a + b
As you can see I have code duplication here; for every value I have a + b. I will have more value constructors so this is going to be repeated even more. To me this is code smell, but I don't know how I would refactor it to eliminate the duplicated code. Is there a functional way to avoid this repeated code? This is a trivial example as I have stripped down the real problem to bare essentials to explain the problem.

You're correct. This is a code smell, and it's actually a very common modelling mistake. All you need to do is just factor the rate-type out. E.g.,
data RateType = Fixed | Floating
data Bond = Bond RateType Float Float
Then you'll have
foo :: Bond -> Float
foo (Bond _ a b) = a + b
on top of other benefits, like RateType now actually being a type, for which you can have Enum and Bounded instances.
Basically, the rule of thumb here is: if you have multiple constructors implementing the same thing, there must be an enum asking to be factored out.
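To make the Enum and Bounded remark concrete, here is a small sketch of the refactored type as a complete file. The deriving clause and the allRateTypes helper are additions for illustration, not part of the original answer.

```haskell
-- The rate type factored out into its own enumeration.
data RateType = Fixed | Floating
  deriving (Show, Eq, Enum, Bounded)

-- A single constructor carries the rate type as ordinary data.
data Bond = Bond RateType Float Float

-- One equation covers every kind of bond.
foo :: Bond -> Float
foo (Bond _ a b) = a + b

-- Hypothetical helper: with Enum and Bounded derived, all rate types can
-- be listed without naming them one by one.
allRateTypes :: [RateType]
allRateTypes = [minBound .. maxBound]
```

Adding a third rate type later then means touching only RateType; foo stays as it is.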

Extending algebraic data type

Note: if this question is somehow odd, this is because I was only recently exposed to Haskell and am still adapting to the functional mindset.
Considering a data type like Maybe:
data MyOwnMaybe a = MyOwnNothing | MyOwnJust a
everyone using my data type will write functions like
maybeToList :: MyOwnMaybe a -> [a]
maybeToList MyOwnNothing = []
maybeToList (MyOwnJust x) = [x]
Now, suppose that, at a later time, I wish to extend this data type
data MyOwnMaybe a = MyOwnNothing | MyOwnJust a | SuperpositionOfNothingAndJust a
how do I make sure that everyone's functions will break at compile-time?
Of course, there is the chance that somehow I'm not "getting" algebraic data types and maybe I shouldn't be doing this at all, but considering a data type Action
data Action = Reset | Send | Remove
it would seem that adding an extra Action like Add would not be so uncommon (and I wouldn't want to risk having all these functions around that possibly cannot handle my new Action)

Well, bad news first: sometimes you just can't do it. Period.
But that is language-agnostic; in any language you sometimes have to break interface. There is no way around it.
Now, good news: you can actually go to great lengths before you have to do that.
You just have to carefully consider what you export from your module. If, instead of exporting its internal workings, you export high-level functions, then there is a good chance you can rewrite those functions using the new data type, and everything will go smoothly.
In particular, be very careful when exporting data constructors. In this case, you don't just export functions that create your data; you are also exporting the possibility of pattern-matching, and that is something that ties you down pretty tightly.
So, in your example, if you write functions like
myOwnNothing :: MyOwnMaybe a
myOwnJust :: a -> MyOwnMaybe a
and
fromMyOwnMaybe :: MyOwnMaybe a -> b -> (a -> b) -> b
fromMyOwnMaybe MyOwnNothing b _ = b
fromMyOwnMaybe (MyOwnJust a) _ f = f a
then it's reasonable to assume that you would be able to reimplement it for the updated MyOwnMaybe data type; so, just export those functions and the data type itself, but don't export constructors.
The only situation in which you would benefit from exporting constructors is when you are absolutely sure that your data type won't ever change. For example, Bool would always have only two (fully defined) values: True and False, it won't be extended by some FileNotFound or anything (although Edward Kmett might disagree). Ditto Maybe or [].
But the idea is more general: stay as high-level as you can.
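Here is a rough sketch of what that could look like after the type has been extended. The export list is shown only as a comment so the snippet stays a single runnable file. How the new constructor should behave in fromMyOwnMaybe is purely an assumption made for illustration (it is treated like MyOwnJust here).

```haskell
-- In a library this file would start with an export list that leaves out
-- the constructors, for example:
--   module MyOwnMaybe (MyOwnMaybe, myOwnNothing, myOwnJust, fromMyOwnMaybe) where

-- The extended type. Client code never sees these constructors.
data MyOwnMaybe a
  = MyOwnNothing
  | MyOwnJust a
  | SuperpositionOfNothingAndJust a

myOwnNothing :: MyOwnMaybe a
myOwnNothing = MyOwnNothing

myOwnJust :: a -> MyOwnMaybe a
myOwnJust = MyOwnJust

-- The eliminator is updated inside the module. Assumption: the new case
-- is handled like MyOwnJust; a real design might choose differently.
fromMyOwnMaybe :: MyOwnMaybe a -> b -> (a -> b) -> b
fromMyOwnMaybe MyOwnNothing b _ = b
fromMyOwnMaybe (MyOwnJust a) _ f = f a
fromMyOwnMaybe (SuperpositionOfNothingAndJust a) _ f = f a

-- Client code written only against the exported functions. It keeps
-- compiling and working after the extension.
maybeToList :: MyOwnMaybe a -> [a]
maybeToList m = fromMyOwnMaybe m [] (\x -> [x])
```

The trade-off is that clients can no longer pattern-match directly, so every use has to go through fromMyOwnMaybe or similar exported functions.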

You seem to know that GHC can warn about non-exhaustive pattern matches in functions via the -W flag or explicitly with -fwarn-incomplete-patterns (spelled -Wincomplete-patterns in newer GHC versions).
There is a good discussion about why these warnings are not automatically compile-time errors at this SO question:
In Haskell, why non-exhaustive patterns are not compile-time errors?
Also, consider this case where you have an ADT with a large number of constructors:
data Alphabet = A | B | C | ... | X | Y | Z
isVowel :: Alphabet -> Bool
isVowel A = True
isVowel E = True
isVowel I = True
isVowel O = True
isVowel U = True
isVowel _ = False
A default case is used as a convenience to avoid having to write out the other 21 cases.
Now if you add an additional constructor to Alphabet, should isVowel be flagged as "incomplete"?
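The difference between the two styles can be seen in a small sketch using the Action type from the question, already extended with Add. The function names describe and isDestructive are invented for this example.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- The Action type from the question, already extended with Add.
data Action = Reset | Send | Remove | Add

-- No wildcard: every constructor is listed. If the Add equation were
-- missing, GHC would warn about a non-exhaustive match in this function.
describe :: Action -> String
describe Reset  = "reset"
describe Send   = "send"
describe Remove = "remove"
describe Add    = "add"

-- Wildcard: the new Add constructor silently falls into the default case,
-- so the compiler has nothing to warn about.
isDestructive :: Action -> Bool
isDestructive Reset  = True
isDestructive Remove = True
isDestructive _      = False
```

So if you want the compiler to point at every function that needs attention after an extension, listing the constructors explicitly is the style that makes that possible, at the cost of more typing.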

One thing that a lot of modules do is not to export their constructors. Instead, they export functions that can be used (“smart constructors”). If you change your ADT later, you have to fix your functions in the module, but no one else's code gets broken.

Outputting the contents of a list of a custom data type

I have a custom data type Movie = String Int [(String,Int)] (Movie Name Year [(Fan,Rating)]) and want to do a couple of things:
First I want to make a function that averages the Ints from the list of tuples and just outputs that number. So far I have this incomplete function:
avgRating :: [DataType] -> Int
avgRating [(Movie a b [(fan,rating)])] = sumRatings / (length [<mylist>])
Here I need a function sumRatings to recurse through the list and sum all the ratings, but I'm not sure where to start.
The other issue I have here is that I'm not sure what to put where <mylist> is, as I would normally give the list a variable name and then use it there, but since I have split the list up to define other variables I can't name it.
I hope that makes sense, thanks.

I'm guessing you have a data structure defined as
data Movie = Movie String Int [(String, Int)]
While this works, it can be a bit cumbersome to work with when you have that many fields. Instead, you can leverage type aliases and record syntax as
type Name = String
type Year = Int
type Rating = Int
data Movie = Movie
  { mName :: Name
  , mYear :: Year
  , mRatings :: [(Name, Rating)]
  } deriving (Eq, Show)
Now things are a bit more explicit and easier to work with. The mName, mYear, and mRatings functions will take a Movie and return the corresponding field from it. Your Movie constructor still works in the same way too, so it won't break existing code.
To calculate the average of the ratings, you really want a function that extracts all the ratings for a movie and aggregates them into a list:
ratings :: Movie -> [Rating]
ratings mov = map snd $ mRatings mov
Then you just need an average function. This will be a bit different because you can't calculate the average of Ints directly, you'll have to convert to a floating point type:
average :: [Rating] -> Float -- Double precision isn't really needed here
average rs = fromIntegral (sum rs) / fromIntegral (length rs)
The fromIntegral function converts an Int to a Float (the actual type signature is a bit more general). Since both the sum of Ints and the length of a list are Ints, you need to convert both.
Now you can just compose these into a single function:
movieAvgRating :: Movie -> Float
movieAvgRating = average . ratings
Now, if you need to calculate the average ratings for several movies, you can apply ratings to each of them, aggregate them into a single list of ratings, then call average on that. I would suggest looking at the concatMap function. You'll be wanting to make a function like
moviesAvgRating :: [Movie] -> Float
moviesAvgRating movs = average $ ???
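In case it helps to check your own attempt, here is one possible way to fill in the ??? following the concatMap hint, as a self-contained sketch. The sampleMovies data is made up. Note that average divides by the list length, so an empty list of ratings gives NaN rather than a meaningful number.

```haskell
type Name = String
type Year = Int
type Rating = Int

data Movie = Movie
  { mName :: Name
  , mYear :: Year
  , mRatings :: [(Name, Rating)]
  } deriving (Eq, Show)

-- All ratings of a single movie.
ratings :: Movie -> [Rating]
ratings mov = map snd (mRatings mov)

-- Average of a list of ratings; the result is NaN for an empty list.
average :: [Rating] -> Float
average rs = fromIntegral (sum rs) / fromIntegral (length rs)

-- One possible completion: gather every movie's ratings into one list,
-- then average that list.
moviesAvgRating :: [Movie] -> Float
moviesAvgRating movs = average (concatMap ratings movs)

-- Hypothetical sample data for trying it out in GHCi.
sampleMovies :: [Movie]
sampleMovies =
  [ Movie "First" 2000 [("ann", 4), ("bob", 2)]
  , Movie "Second" 2010 [("cy", 6)]
  ]
```

With this data, moviesAvgRating sampleMovies averages the three ratings 4, 2 and 6.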

To answer your second question first, you can bind to a variable and unpack it simultaneously using @ (an "as-pattern"):
avgRating [(Movie a b mylist@[(fan, rating)])] = …
Note also that if you’re not going to be using variables that you unpack, it’s Haskell convention to bind them to _:
avgRating [(Movie _ _ mylist@[(fan, rating)])] = …
This helps readers focus on what’s actually important.
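For a version you can load and run, here is a small sketch of an as-pattern (written with @ in Haskell) on a simplified Movie type. The function ratingCount is invented for illustration and is not the avgRating you are after.

```haskell
-- Simplified Movie type: name, year, list of (fan, rating) pairs.
data Movie = Movie String Int [(String, Int)]

-- The as-pattern rs@(_ : _) does two things at once: it checks that the
-- list has at least one element and binds the whole list to rs.
ratingCount :: Movie -> Int
ratingCount (Movie _ _ rs@(_ : _)) = length rs
ratingCount (Movie _ _ [])         = 0
```

Note that a pattern like [(fan, rating)] only matches a list with exactly one element, which is probably not what you want for a movie with several ratings.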
I don’t want to just give you the solution to your recursion problem, because learning to write recursive functions is an important and rewarding part of Haskell programming. (If you really want me to spoil it for you, let me know in a comment.) The basic idea, however, is that you need to think about two different cases: a base case (where the recursion stops) and a recursive case. As an example, consider the built-in sum function:
sum :: Num a => [a] -> a
sum [] = 0
sum (x:xs) = x + sum xs
Here, the base case is when sum gets an empty list – it simply evaluates to 0. In the recursive case, we assume that sum can already produce the sum of a smaller list, and we extend it to cover a larger list.
If you’re having trouble with recursion in general, Harold Abelson and Gerald Jay Sussman present a detailed discussion on the topic in Structure and Interpretation of Computer Programs, 2nd ed., The MIT Press (Cambridge), 1996, starting on p. 21 (§§1.1.7–1.2). It’s in Scheme, not Haskell, but the languages are sufficiently similar – at least at this conceptual level – that each can serve as a decent model for the other.

Repeating function recursive in Haskell

I am trying to make a function that outputs char*m n times, such that the expected output would be ["ccc","ccc"] for the input 2 3 c. Here is what I have so far:
rectangle :: Int -> Int -> Char -> [String]
rectangle n m c
  | m > 0 = [concat ([[c]] ++ (rectangle n (m-1) c))]
  | otherwise = []
I am able to carry out the first part, char*m, so it returns ["ccc"]. Thing is: I also would like to be able to repeat my string n times.
I have tried using replicate but it doesn't seem to work, yet it works if doing it in the console: replicate 2 (rectangle 2 3 c).

Try the replicate function this way:
replicate :: Int -> a -> [a]
rectangle n m c = replicate n (replicate m c)
Also, don't forget to mention if this is homework.
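Written out as a complete definition with its type signature, the suggestion above could look like this sketch. The inner replicate builds one row of m characters and the outer one repeats that row n times.

```haskell
-- n rows, each made of m copies of the character c.
rectangle :: Int -> Int -> Char -> [String]
rectangle n m c = replicate n (replicate m c)
```

In GHCi, the character has to be written with single quotes, as in rectangle 2 3 'c'.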

As an addendum to Refactor's answer, I think his approach is the correct one. He subdivides the problem until it can be solved trivially using built-in functions. If you want to roll your own solution for learning purposes, I suggest you keep this subdivision, and go from there, implementing your own replicate. Otherwise, you will end up with a single function which does too much.
So the remaining problem is that of implementing replicate. My first idea would be to look at the source code for replicate. I found it via Hoogle, which led me to Hackage, which has links to the source code. Excerpted from the source:
replicate :: Int -> a -> [a]
replicate n x = take n (repeat x)
which is nice and concise, again using the built-in functions. If you want to completely roll your own replicate, you can do:
myReplicate :: Int -> a -> [a]
myReplicate n x | n <= 0 = []
                | otherwise = x : myReplicate (n-1) x
----------EDIT----------------
As a side note, I think your problem requires two rather orthogonal skills. The first is trying not to tackle the whole problem at once, but making some small progress instead. Then you can try to solve that smaller problem, before returning to the larger. In your case, it would likely involve recognizing that you definitely need a way of transforming the character into a series of characters of length n. Experience with functions such as map, filter, foldr and so on will help you here, since they each represent a very distinct transformation, which you might recognize.
The second skill required for your solution - if you want to roll your own - is recognizing when a function can be expressed recursively. As you can see, your problem - and indeed many common problems - can be solved without explicit recursion, but it is a nice skill to have, when the need arises. Recursive solutions do not always come easily to mind, so I think the best way to gain familiarity with them is to read and practice.
For further study, I'm sure you have already been pointed to the excellent Learn You a Haskell and Real World Haskell, but just in case you haven't, here they are.

Haskell record syntax

Haskell's record syntax is considered by many to be a wart on an otherwise elegant language, on account of its ugly syntax and namespace pollution. On the other hand it's often more useful than the position based alternative.
Instead of a declaration like this:
data Foo = Foo {
    fooID :: Int,
    fooName :: String
} deriving (Show)
It seems to me that something along these lines would be more attractive:
data Foo = Foo id :: Int
               name :: String
               deriving (Show)
I'm sure there must be a good reason I'm missing, but why was the C-like record syntax adopted over a cleaner layout-based approach?
Secondly, is there anything in the pipeline to solve the namespace problem, so we can write id foo instead of fooID foo in future versions of Haskell? (Apart from the longwinded type class based workarounds currently available.)

Well if no one else is going to try, then I'll take another (slightly more carefully researched) stab at answering these questions.
tl;dr
Question 1: That's just the way the dice rolled. It was a circumstantial choice and it stuck.
Question 2: Yes (sorta). Several different parties have certainly been thinking about the issue.
Read on for a very longwinded explanation for each answer, based around links and quotes that I found to be relevant and interesting.
Why was the C-like record syntax adopted over a cleaner layout-based approach?
Microsoft researchers wrote a History of Haskell paper. Section 5.6 talks about records. I'll quote the first tiny bit, which is insightful:
One of the most obvious omissions from early versions of Haskell
was the absence of records, offering named fields. Given that
records are extremely useful in practice, why were they omitted?
The Microsofties then answer their own question
The strongest reason seems to have been that there was no obvious “right” design.
You can read the paper yourself for the details, but they say Haskell eventually adopted record syntax due to "pressure for named fields in data structures".
By the time the Haskell 1.3 design was under way, in 1993, the user
pressure for named fields in data structures was strong, so the committee eventually adopted a minimalist design...
You ask why it is the way it is? Well, from what I understand, if the early Haskellers had their way, we might've never had record syntax in the first place. The idea was apparently pushed onto Haskell by people who were already used to C-like syntax, and were more interested in getting C-like things into Haskell rather than doing things "the Haskell way". (Yes, I realize this is an extremely subjective interpretation. I could be dead wrong, but in the absence of better answers, this is the best conclusion I can draw.)
Is there anything in the pipeline to solve the namespace problem?
First of all, not everyone feels it is a problem. A few weeks ago, a Racket enthusiast explained to me (and others) that having different functions with the same name was a bad idea, because it complicates analysis of "what does the function named ___ do?" It is not, in fact, one function, but many. The idea can be extra troublesome for Haskell, since it complicates type inference.
On a slight tangent, the Microsofties have interesting things to say about Haskell's typeclasses:
It was a happy coincidence
of timing that Wadler and Blott happened to produce this key idea
at just the moment when the language design was still in flux.
Don't forget that Haskell was young once. Some decisions were made simply because they were made.
Anyways, there are a few interesting ways that this "problem" could be dealt with:
Type Directed Name Resolution, a proposed modification to Haskell (mentioned in comments above). Just read that page to see that it touches a lot of areas of the language. All in all, it ain't a bad idea. A lot of thought has been put into it so that it won't clash with stuff. However, it will still require significantly more attention to get it into the now-(more-)mature Haskell language.
Another Microsoft paper, OO Haskell, specifically proposes an extension to the Haskell language to support "ad hoc overloading". It's rather complicated, so you'll just have to check out Section 4 for yourself. The gist of it is to automatically (?) infer "Has" types, and to add an additional step to type checking that they call "improvement", vaguely outlined in the selective quotes that follow:
Given the class constraint Has_m (Int -> C -> r) there is
only one instance for m that matches this constraint...Since there is exactly one choice, we should make it now, and that in turn
fixes r to be Int. Hence we get the expected type for f:
f :: C -> Int -> IO Int...[this] is simply a
design choice, and one based on the idea that the class Has_m is closed
Apologies for the incoherent quoting; if that helps you at all, then great, otherwise just go read the paper. It's a complicated (but convincing) idea.
Chris Done has used Template Haskell to provide duck typing in Haskell in a vaguely similar manner to the OO Haskell paper (using "Has" types). A few interactive session samples from his site:
λ> flap ^. donald
*Flap flap flap*
λ> flap ^. chris
I'm flapping my arms!
fly :: (Has Flap duck) => duck -> IO ()
fly duck = do go; go; go where go = flap ^. duck
λ> fly donald
*Flap flap flap*
*Flap flap flap*
*Flap flap flap*
This requires a little boilerplate/unusual syntax, and I personally would prefer to stick to typeclasses. But kudos to Chris Done for freely publishing his down-to-earth work in the area.

I just thought I'd add a link addressing the namespace issue. It seems that overloaded record fields for GHC are coming in GHC 7.10 (and are probably already in HEAD), using the OverloadedRecordFields extension.
This would allow for syntax such as
data Person = Person { id :: Int, name :: String }
data Company = Company { name :: String, employees :: [Person] }
companyNames :: Company -> [String]
companyNames c = name c : map name (employees c)

[edit] This answer is just some random thoughts of mine on the matter. I recommend my other answer over this one, because for that answer I took a lot more time to look up and reference other people's work.
Record syntax
Taking a few stabs in the dark: your "layout-based" proposed syntax looks a lot like non-record-syntax data declarations; that might cause confusion for parsing (?)
--record
data Foo = Foo {i :: Int, s :: String} deriving (Show)
--non-record
data Foo = Foo Int String deriving (Show)
--new-record
data Foo = Foo i :: Int, s :: String deriving (Show)
--record
data LotsaInts = LI {a,b,c,i,j,k :: Int}
--new-record
data LotsaInts = LI a,b,c,i,j,k :: Int
In the latter case, what exactly is :: Int applied to? The whole data declaration?
Declarations with the record syntax (currently) are similar to construction and update syntax. Layout-based syntax would not be clearer for these cases; how do you parse those extra = signs?
let f1 = Foo {s = "foo1", i = 1}
let f2 = f1 {s = "foo2"}
let f1 = Foo s = "foo1", i = 1
let f2 = f1 s = "foo2"
How do you know f1 s is a record update, as opposed to a function application?
Namespacing
What if you want to intermingle usage of your class-defined id with the Prelude's id? How do you specify which one you're using? Can you think of any better way than qualified imports and/or the hiding keyword?
import Prelude hiding (id)
data Foo = Foo {a,b,c,i,j,k :: Int, s :: String}
  deriving (Show)
id = i
ghci> :l data.hs
ghci> let foo = Foo 1 2 3 4 5 6 "foo"
ghci> id foo
4
ghci> Prelude.id foo
Foo {a = 1, b = 2, c = 3, i = 4, j = 5, k = 6, s = "foo"}
These aren't great answers, but they're the best I've got. I personally don't think record syntax is that ugly. I do feel there is room for improvement with the namespacing/modules stuff, but I have no idea how to make it better.

As of June 2021, it has been half-implemented by three opt-in language extensions and counting:
https://gitlab.haskell.org/ghc/ghc/-/wikis/records/overloaded-record-fields
Even with all three extensions enabled, basic stuff like
len2 :: Point -> Double
len2 p = (x p)^2 + (y p)^2 -- fails!
still won't work if, say, there's a Quaternion type with x and y fields as well. You would have to do this:
len2 :: Point -> Double
len2 p = (x (p :: Point))^2 + (y (p :: Point))^2
or this:
len2 :: Point -> Double
len2 (MkPoint {x = px, y = py}) = px^2 + py^2
Even if the first example did work, it would still be opt-in, so odds are that it will be another two decades before the extension is widely adopted by the libraries that any real application must rely on.
It's ironic when a deal breaker like this is not an issue in a language like C.
One point of interest, though: Idris 2 has actually fixed this. It isn't really ready yet either, though.
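For anyone who wants to try the pattern-matching workaround, here is a minimal sketch as a complete file. It assumes the DuplicateRecordFields extension (one of the opt-in extensions in question); the Quaternion type with its w and z fields and the MkQuaternion constructor are hypothetical, chosen only so that x and y are shared between two types.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

data Point = MkPoint { x :: Double, y :: Double }

-- Hypothetical second type that reuses the field names x and y.
data Quaternion = MkQuaternion
  { w :: Double, x :: Double, y :: Double, z :: Double }

-- Matching on the constructor tells the compiler which x and y are meant,
-- so no type annotations are needed on the fields.
len2 :: Point -> Double
len2 (MkPoint {x = px, y = py}) = px ^ 2 + py ^ 2
```

Using x and y as plain selector functions is where the ambiguity described above comes in, and how that is handled has varied between GHC releases.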
