Haskell data types usage good practices - haskell

Reading "Real world Haskell" i found some intresting question about data types:
This pattern matching and positional data access make it look like you have very tight coupling between data and code that operates on it (try adding something to Book, or worse change the type of an existing part).
This is usually a very bad thing in imperative (particularly OO) languages... is it not seen as a problem in Haskell?
source at RWH comments
And really, writing some Haskell programs I found that when I make a small change to a data type's structure it affects almost all functions that use that data type. Maybe there are some good practices for data type usage. How can I minimize code coupling?

What you are describing is commonly known as the expression problem -- http://en.wikipedia.org/wiki/Expression_Problem.
There is a definite trade-off to be made: Haskell code in general, and algebraic data types in particular, tend to fall on the side of being hard to change the type but easy to add functions over the type. This optimizes for (up-front) well-designed, complete data types.
All that said, there are a number of things that you can do to reduce the coupling.
Define good library functions. By defining a complete set of combinators and higher-order functions that are useful for interacting with your data type you will reduce coupling. It is often said that whenever you think of pattern matching there is a better solution using higher-order functions. If you look out for these situations you will be in a better spot.
Expose your data structures as more abstract types. This means implementing all appropriate type classes. This will assist with defining library functions, as you will get a bunch for free with any of the type classes you implement; for examples, look at operations over Functor or Monad.
Hide (as much as possible) your data constructors. Constructors expose implementation detail and encourage coupling. Hint: this ties in with defining a good API for interacting with your type; consumers of your type should rarely, if ever, have to use the constructors directly.
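For instance, a module might export a smart constructor and record accessors while keeping the data constructor private; a minimal, hypothetical sketch:
module Book (Book, mkBook, title, authors) where
data Book = Book { title :: String, authors :: [String] }
-- Consumers go through mkBook and the accessors, so the representation of
-- Book can change without breaking them; the Book constructor is not exported.
mkBook :: String -> [String] -> Maybe Book
mkBook t as
  | null t    = Nothing
  | otherwise = Just (Book t as)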
The Haskell community seems particularly good at this; if you look at many of the libraries on Hackage you will find really good examples of implementing type classes and exposing good library functions.

In addition to what's been said:
One interesting approach is the "scrap your boilerplate" style of defining functions over data types, which makes use of generic functions (as opposed to explicit pattern matching) to define functions over the constructors of a data type. Looking at the "scrap your boilerplate" papers, you will see examples of functions which can cope with changes to the structure of a data type.
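As a hedged illustration of the style (using the syb package; the Book and Author types here are made up):
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Char (toUpper)
import Data.Generics (Data, Typeable, everywhere, mkT)
data Author = Author { name :: String } deriving (Show, Data, Typeable)
data Book = Book { title :: String, authors :: [Author] } deriving (Show, Data, Typeable)
-- Uppercase every String anywhere inside a Book without pattern matching on
-- the constructors of Book or Author; adding fields later doesn't break this.
shout :: Book -> Book
shout = everywhere (mkT (map toUpper))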
A second method, as Hibberd pointed out, is to use folds, maps, unfolds, and other recursion combinators, to define your functions. When you write functions using higher order functions, oftentimes small changes to the underlying data type can be dealt with in the instance declarations for Functor, Foldable, and so on.
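For example (a sketch, with a made-up Tree type):
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Functor, Foldable)
-- These consumers never mention Tree's constructors, so reshaping Tree only
-- requires updating (or re-deriving) the Functor and Foldable instances.
total :: Num a => Tree a -> a
total = sum          -- from Foldable
double :: Num a => Tree a -> Tree a
double = fmap (* 2)  -- from Functor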

First, I'd like to mention that in my view, there are two kinds of coupling:
One that makes your code cease to compile when you change one and forget to change the other
One that makes your code buggy when you change one and forget to change the other
While both are problematic, the former is significantly less of a headache, and that seems to be the one you're talking about.
I think the main problem you're mentioning is due to over-using positional arguments. Haskell almost forces you to have positional arguments in your ordinary functions, but you can avoid them in your type products (records).
Just use records instead of multiple anonymous fields inside data constructors, and then you can pattern-match any field you want out of it, by name.
bad (Blah _ y) = ...
good (Blah{y = y}) = ...
Avoid over-using tuples, especially those beyond 2-tuples, and liberally create records/newtypes around things to avoid positional meaning.
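For example (a hypothetical record):
data Blah = Blah { x :: Int, y :: String }
-- Matching by field name keeps compiling even if more fields are added later.
describe :: Blah -> String
describe Blah{y = label} = "got " ++ label
-- With the NamedFieldPuns extension, Blah{y} is shorthand for Blah{y = y}.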

Related

Forcing a monadic computation to be done statically

In Matthew Pickering's blog post about inlining, he mentions a trick to force GHC to inline recursive definitions applied to a statically known argument:
However, there is a trick that can be used in order to tell GHC that an argument is truly static. We replace the value argument with a type argument. Then by defining suitable type class instances, we can recurse on the structure of the type as we would on a normal value. This time however, GHC will happily inline each “recursive” call as each call to sum is at a different type.
Then the blog post proceeds to show a very simplistic example using type-level Nats. My question is about forcing GHC to fully inline functions over actually interesting types.
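For reference, my own reconstruction of that kind of trick (not the blog's actual code; names are made up) looks roughly like this:
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables, TypeApplications, AllowAmbiguousTypes #-}
data Nat = Z | S Nat
-- Each "recursive" call is at a different type, so GHC is happy to inline all of them.
class Unroll (n :: Nat) where
  unroll :: Int -> Int
instance Unroll 'Z where
  {-# INLINE unroll #-}
  unroll acc = acc
instance Unroll n => Unroll ('S n) where
  {-# INLINE unroll #-}
  unroll acc = unroll @n (acc + 1)
-- Fully unrolled at compile time; no recursive definition survives in Core.
three :: Int
three = unroll @('S ('S ('S 'Z))) 0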
In my case, I have a computation that is written via a monadic DSL. Under the hood, the DSL is an RWS. Looking at the generated Core with lots of INLINE pragmas, I can see calls to the recursive functions that implement <> for the writer output. Since these functions are recursive, GHC doesn't inline them. But if it did, then, since the full RWS computation is static enough that the spine of the writer output could be fully unrolled during compilation, it would, after finitely many steps, arrive at Core with no recursive definitions left.
Taking the idea from Matt's blog post, we would need to have a type-level representation of the monadic value; I'd love to see a version of this (especially with good ergonomics!), but I am not too optimistic: unlike the Servant approach, we would need a way to bind variables to results (a la monadic >>=). Alternatively, are there any other mechanisms to tell GHC that its input size is statically known, so it shouldn't shy away from aggressively inlining recursive definitions? For example, is it possible to emulate Agda-style sized types by using a graded monad with some appropriate size measure in the index?
P.S.: My aim is not mere optimization, but working around a Clash bug, so this elimination of recursive definitions by aggressive inlining would need to be something dependable, not a nice-to-have performance improvement.

Use type classes to implement dependency inversion in a Haskell application?

One major architectural goal when designing large applications is to reduce coupling and dependencies. By dependencies, I mean source-code dependencies, when one function or data type uses another function or another type. A high-level architecture guideline seems to be the Ports & Adapters architecture, with slight variations also referred to as Onion Architecture, Hexagonal Architecture, or Clean Architecture: Types and functions that model the domain of the application are at the center, then come use cases that provide useful services on the basis of the domain, and in the outermost ring are technical aspects like persistence, networking and UI.
The dependency rule says that dependencies must point inwards only. E.g., persistence may depend on functions and types from use cases, and use cases may depend on functions and types from the domain. But the domain is not allowed to depend on the outer rings. How should I implement this kind of architecture in Haskell? To make it concrete: How can I implement a use case module that does not depend on (= import) functions and types from a persistence module, even though it needs to retrieve and store data?
Say I want to implement a use case order placement via a function U.placeOrder :: D.Customer -> [D.LineItem] -> IO U.OrderPlacementResult, which creates an order from line items and attempts to persist the order. Here, U indicates the use case module and D the domain module. The function returns an IO action because it somehow needs to persist the order. However, the persistence itself is in the outermost architectural ring - implemented in some module P; so, the above function must not depend on anything exported from P.
I can imagine two generic solutions:
Higher order functions: The function U.placeOrder takes an additional function argument, say U.OrderDto -> U.PersistenceResult. This function is implemented in the persistence (P) module, but it depends on types of the U module, whereas the U module does not need to declare a dependency on P.
Type classes: The U module defines a Persistence type class that declares the above function. The P module depends on this type class and provides an instance for it.
Variant 1 is quite explicit but not very general. Potentially it results in functions with many arguments. Variant 2 is less verbose (see, for example, here). However, Variant 2 results in many unprincipled type classes, something considered bad practice in most modern Haskell textbooks and tutorials.
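For concreteness, a minimal sketch of Variant 2 (all names made up; D.Customer and D.LineItem come from the domain module as above):
data OrderDto = OrderDto D.Customer [D.LineItem]
data OrderPlacementResult = OrderPlaced | OrderRejected
-- The port: declared in the U module, implemented elsewhere.
class Monad m => Persistence m where
  storeOrder :: OrderDto -> m OrderPlacementResult
-- The use case depends only on the class, never on the persistence module.
placeOrder :: Persistence m => D.Customer -> [D.LineItem] -> m OrderPlacementResult
placeOrder customer items = storeOrder (OrderDto customer items)
-- The P module then provides, e.g., an instance Persistence IO that talks to the database.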
So, I am left with two questions:
Am I missing other alternatives?
Which approach is the generally recommended one, if any?
There are, indeed, other alternatives (see below).
While you can use partial application as dependency injection, I don't consider it a proper functional architecture, because it makes everything impure.
With your current example, it doesn't seem to matter too much, because U.placeOrder is already impure, but in general, you'd want your Haskell code to consist of as much referentially transparent code as possible.
You sometimes see a suggestion involving the Reader monad, where the 'dependencies' are passed to the function as the reader context instead of as straight function arguments, but as far as I can tell, these are just (isomorphic?) variations of the same idea, with the same problems.
Better alternatives are functional core, imperative shell, and free monads. There may be other alternatives as well, but these are the ones I'm aware of.
Functional core, imperative shell
You can often factor your code so that your domain model is defined as a set of pure functions. This is often easier to do in languages like Haskell and F# because you can use sum types to communicate decisions. The U.placeOrder function might, for example, look like this:
U.placeOrder :: D.Customer -> [D.LineItem] -> U.OrderPlacementDecision
Notice that this is a pure function, where U.OrderPlacementDecision might be a sum type that enumerates all the possible outcomes of the use case.
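Such a decision type might, purely hypothetically, look like this (D.Order lives in the domain module):
data OrderPlacementDecision
  = PlaceOrder D.Order
  | RejectEmptyOrder
  | RejectBlockedCustomer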
That's your functional core. You'd then compose your imperative shell (e.g. your main function) in an impureim sandwich:
main :: IO ()
main = do
  customer  <- loadCustomer   -- hypothetical call into the persistence module, or some other place
  lineItems <- loadLineItems  -- ditto
  let decision = U.placeOrder customer lineItems
  _ <- persist decision
  return ()
(I've obviously not tried to type-check that code, but I hope it's sufficiently correct to get the point across.)
Free monads
The functional core, imperative shell is by far the simplest way to achieve the desired architectural outcome, and it's conspicuously often possible to get away with. Still, there are cases where that's not possible. In those cases, you can instead use free monads.
With free monads, you can define data structures that are roughly equivalent to object-oriented interfaces. Like in the functional core, imperative shell case, these data structures are sum types, which means that you can keep your functions pure. You can then run an impure interpreter over the generated expression tree.
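A hedged sketch of what that can look like, using the free package (all domain names are made up):
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free (Free, liftF, iterM)
data Order = Order deriving Show
data PlacementResult = Placed | Rejected deriving Show
-- The instruction set (the "interface"); it is just a sum type.
data PersistenceF next = SaveOrder Order (PlacementResult -> next)
  deriving Functor
type Persistence = Free PersistenceF
saveOrder :: Order -> Persistence PlacementResult
saveOrder o = liftF (SaveOrder o id)
-- The use case stays pure: it only builds an expression tree.
placeOrder :: Order -> Persistence PlacementResult
placeOrder = saveOrder
-- The outermost ring interprets the tree impurely, e.g. against a database.
runIO :: Persistence a -> IO a
runIO = iterM step
  where
    step (SaveOrder o k) = do
      putStrLn ("pretending to persist " ++ show o)  -- a real adapter would hit the DB
      k Placed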
I've written an article series about how to think about dependency injection in F# and Haskell. I've also recently published an article that (among other things) showcases this technique. Most of my articles are accompanied by GitHub repositories.

Is Monad just a functional way of Error handling?

I am reading "Programming in Haskell" book and trying to correlate ideas of haskell to my knowledge in C#. Please correct me if I am wrong.
I feel like monads force the programmer to write code which can handle exceptions. So we will explicitly mention the error handling in the type system, like
Optional(Int) functionName(Int a, Int b)
The return type is Optional(Int) rather than Int, so whoever uses the library that has this kind of return type will understand that error handling is happening, and the result will be either None (something went wrong) or Some (we got some result).
Any code can result in a happy path (where we get some result) or a sad path (where errors occur). Making these paths explicit in the type system is what a monad is. That's my understanding of it. Please correct me.
Monads are like a bridge between Pure Functional Programming and Impure code (that results in side effects).
Apart from this, I want to make sure of my understanding of exception handling vs. option types.
Exception handling tries to do the operations without taking a deeper look at the inputs. Exceptions are heavyweight, since the call stack has to unwind until it gets to the catch / rescue / handling code.
The functional way of dealing with things is to check the inputs before doing the operations and return None as the result if the inputs don't match the required criteria. Option types are a lightweight way of handling errors.
Monad is just an interface (in Haskell terms, typeclass) that a type can implement, along with a contract specifying some restrictions on how the interface should behave.
It's not that different from how a C# type T can implement, say, the IComparable<T> interface. However, the Monad interface is quite abstract and the functions can do surprisingly different things for different types (but always respecting the same laws, and the same "flavor" of composition).
Instead of seeing Monad as a functional way of error handling, it's better to go the other way: inventing a type like Optional that represents errors / absence of values, and start devising useful functions on that type. For example, a function that produces an "inhabited" Optional from an existing value, a function that composes two Optional-returning functions to minimize repetitive code, a function that changes the value inside the Optional if it exists, and so on. All of them functions that can be useful on their own.
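A hedged sketch of that progression, with a home-grown Optional (names are made up):
data Optional a = None | Some a deriving Show
-- Produce an "inhabited" Optional from an existing value.
inhabited :: a -> Optional a
inhabited = Some
-- Compose two Optional-returning steps, short-circuiting on None.
andThen :: Optional a -> (a -> Optional b) -> Optional b
andThen None     _ = None
andThen (Some x) f = f x
-- Change the value inside the Optional, if there is one.
mapOptional :: (a -> b) -> Optional a -> Optional b
mapOptional _ None     = None
mapOptional f (Some x) = Some (f x)
-- Only afterwards do we notice these fit the Functor/Applicative/Monad interfaces.
instance Functor Optional where
  fmap = mapOptional
instance Applicative Optional where
  pure = inhabited
  None   <*> _ = None
  Some f <*> x = mapOptional f x
instance Monad Optional where
  (>>=) = andThen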
After we have the type and a bevy of useful functions, we might ask ourselves:
Does the type itself fit the requirements for Monad? It must have a type parameter, for example.
Do some (not necessarily all) of the useful functions we have discovered for the type fit the Monad interface? Not only must they fit the signatures, they must also fit the contract.
In the affirmative case, good news! We can define a Monad instance for the type, and now we are able to use a whole lot of monad-generic functions for free!
But even if the Monad typeclass doesn't exist in our language, we can keep in mind that the type and some of the functions defined on it behave like a Monad. From the documentation of the thenCompose method of the CompletableFuture Java class:
This method is analogous to Optional.flatMap and Stream.flatMap.
This allows us to "transfer intuitions" between seemingly unrelated classes, even if we can't write monad-generic code because the shared interface doesn't exist.
Monads aren't 'just a functional way of error handling', so you are, indeed, wrong.
It would be pointless to turn this answer into a monad tutorial, so I'm not going to try. It takes some time to understand what a monad is, and the best advice I can give is to keep working with the concept until it clicks. Eventually it will.
The type described in the OP looks equivalent (isomorphic) to Haskell's more standard Maybe type, which is indeed a monad. In a pinch, it can be used for error handling, but more often you'd use another monad called Either (or types isomorphic to it), since it's better suited for that task.
A monad, though, can be many other things. Lists are monads, as are functions themselves (via the Reader monad). Trees are monads as well. These have nothing to do with error handling.
When it comes to exceptions in Haskell, I take the view that it's a legacy feature. I'd never design my Haskell code around exceptions, since exceptions aren't visible via the type system. When a function may fail to return a result, I'd let it return a Maybe, an Either, or another type isomorphic to that. This will, indeed, force the caller to handle not only the happy path, but also any failures that may occur.
Is Monad just a functional way of Error handling?
No, it is not. The most prominent use for monads is handling side-effecting computations in Haskell. They can also be used for error handling, but the reason they are called "monads" and not "error handlers" is that they provide a common abstraction around several seemingly different things. For instance, for the list monad in Haskell, join is just concat and =<< is an infix synonym for concatMap.
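For the list monad, for example (a quick illustration of that correspondence):
import Control.Monad (join)
joined :: [Int]
joined = join [[1, 2], [3]]             -- [1,2,3], i.e. concat
bound :: [Int]
bound = (\x -> [x, x * 10]) =<< [1, 2]  -- [1,10,2,20], i.e. concatMap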
Monads are like a bridge between Pure Functional Programming and Impure code (that results in side effects).
This is a subtle point. In a lazy language such as Haskell, side effects refer to computations which may indeed be pure, such as FFI calls that require memory to be freed upon completion. Monads provide a way to track (and perform) side effects. "Pure" simply means that a function returns the same result when called with the same arguments.

What are abstract patterns?

I am learning Haskell and trying to understand the Monoid typeclass.
At the moment, I am reading the haskellbook and it says the following about the pattern (monoid):
One of the finer points of the Haskell community has been its propensity for recognizing abstract patterns in code which have well-defined, lawful representations in mathematics.
What does the author mean by abstract patterns?
Abstract in this sense is the opposite of concrete. This is probably one of the key things to understand about Haskell.
What is a concrete thing? Well, most values in Haskell are concrete. For example 'a' :: Char. The letter 'a' is a Char value, and it's a concrete value. Char is a concrete type. But in 1 :: Num a => a, the number 1 is actually a value of any type, so long as that type has the set of functions that the Num typeclass sets out as mandatory. This is an abstract value! We can have abstract values, abstract types, and therefore abstract functions. When the program is compiled, the Haskell compiler will pick a particular concrete type that supports all of our requirements.
Haskell, at its core, has a very simple, small but incredibly flexible language. It's very similar to an expression of maths, actually. This makes it very powerful. Why? because most things that would be built in language constructs in other languages are not directly built into Haskell, but defined in terms of this simple core.
One of the core pieces is the function, which, it turns out, most of computation is expressible in terms of. Because so much of Haskell is just defined in terms of this small simple core, it means we can extend it to almost anywhere we can imagine.
Typeclasses are probably the best example of this. Monoid and Num are examples of typeclasses. These are constructs that allow programmers to use an abstraction like a function across a great many types while only having to define it once. Typeclasses let us use the same function names across a whole range of types if we can define those functions for those types. Why is that important or useful? Well, if we can recognise a pattern across, for example, all numbers, and we have a mechanism for talking about all numbers in the language itself, then we can write functions that work with all numbers at once. This is an abstract pattern. You'll notice some Haskellers are quite interested in a branch of mathematics called Category Theory. This branch is pretty much the mathematical study of abstract patterns. Contrast this ability to encode such things with other languages, where the patterns the community notices are often far less rigorous, have to be manually written out, and carry no respect for their mathematical nature. The beauty of following the mathematics is the extremely large body of stuff we get for free by aligning our language closer with mathematics.
This is a good explanation of these basics including typeclasses in a book that I helped author: http://www.happylearnhaskelltutorial.com/1/output_other_things.html
Because functions are written in a very general way (because Haskell puts hardly any limits on our ability to express things generally), we can write functions that use types which express such things as "any type, so long as it's a Monoid". These are called type constraints, as above.
Generally abstractions are very useful because we can, for example, write one single function to operate on an entire range of types, which means we can often find functions that do exactly what we want on our types if we just make them instances of specific typeclasses. The Ord typeclass is a great example of this. Making a type we define ourselves an instance of Ord gives us a whole bunch of sorting and comparing functions for free.
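For example (Priority is a made-up type):
import Data.List (sort)
data Priority = Low | Medium | High
  deriving (Show, Eq, Ord)
-- Deriving Ord gives us compare, (<), max, ... and therefore sorting, for free.
sorted :: [Priority]
sorted = sort [High, Low, Medium]  -- [Low,Medium,High]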
This is, in my opinion, one of the most exciting parts about Haskell, because while most other languages also allow you to be very general, they mostly take an extreme dip in how expressive you can be with that generality, and are therefore also less powerful. (This is because they are less precise in what they talk about, because their types are less well "defined".)
This is how we're able to reason about the "possible values" of a function, and it's not limited to Haskell. The more information we encode at the type level, the more toward the specificity end of the spectrum of expressivity we veer. For example, to take a classic case, the function const :: a -> b -> a. This function requires that a and b can be of absolutely any type at all, including the same type if we like. From that, because the second parameter can be a different type than the first, we can work out that it really only has one possible functionality. It can't return an Int, unless we give it an Int as its first value, because that's not any type, right? So therefore we know the only value it can return is the first value! The functionality is defined right there in the type! If that's not mindblowing, then I don't know what is. :)
As we move to dependent types (that is, a type system where types are first class, which means also that ordinary values can be encoded in the type system), we can get closer and closer to having the type system specify specifically what the constraints of possible functionality are. However, the kicker is, it doesn't necessarily speak about the implementation of the functionality unless we want it to, because we're in control of how abstract it is, but while maintaining expressivity and much precision. That's pretty fascinating, and amazingly powerful.
Much math can be expressed in the language that underpins Haskell, the lambda calculus.

Are there benefits of strong typing besides safety?

In the Haskell community, we are slowly adding features of dependent types. Dependent types are an advanced typing feature by which types can depend on values. Some languages like Agda and Idris already have them. It appears to be a very advanced feature requiring an advanced type system, until you realize that Python has had the dynamic-typing version of dependent types, which may or may not be actual dependent types, from the beginning.
For most any program in a functional programming language, there is a way to represent it as an untyped lambda calculus term, no matter how advanced the typing. That's because typing only eliminates programs, it doesn't enable new ones.
Strong typing wins us safety: whole classes of errors that used to happen at run time can no longer happen at run time. This safety is rather nice. Besides this safety though, what does strong typing give you?
Are there additional benefits of a strong type system besides safety?
(Note that I'm not saying that strong typing is worthless. Safety is a huge benefit in and of itself. I'm just wondering if there are additional benefits.)
First, we need to talk a bit about the history of the simply typed lambda calculus.
There are two historical developments of the simply typed lambda calculus.
When Alonzo Church described the lambda calculus the types were baked in as part of the meaning / operational behavior of the terms.
When Haskell Curry described the lambda calculus the types were annotations put on the terms.
So we have the lambda calculus a la Church and the lambda calculus a la Curry. See https://en.wikipedia.org/wiki/Simply_typed_lambda_calculus#Intrinsic_vs._extrinsic_interpretations for more.
Ironically, the language Haskell, which is named after Curry, is based on a lambda calculus a la Church!
What this means is the types aren't simply annotations that rule out bad programs for you. They can "do stuff" too. Such types don't erase without leaving residue.
This shows up in Haskell's notion of type classes, which are really why Haskell is a language a la Church.
In Haskell, when I make a function
sort :: Ord a => [a] -> [a]
We're passing an object or dictionary for Ord a as the first argument.
But you aren't forced to plumb that argument around yourself in the code; it is the job of the compiler to build it up and use it.
instance Ord Char
instance Ord Int
instance Ord a => Ord [a]
So if you go and use sort on a list of strings, which are themselves lists of chars, then this will build up the dictionary by passing the Ord Char instance through the instance for Ord a => Ord [a] to get Ord [Char], which is the same as Ord String, and then you can sort a list of strings.
If I were to compare the complexity of calling such a sort function in Haskell to calling it in C# or Java: calling sort above is a lot less verbose than manually building a LexicographicComparator<List<Char>> by passing an IComparator<Char> to its constructor and calling the function with an extra second argument.
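A rough, hypothetical picture of the dictionary the compiler assembles behind the scenes (GHC's real Core uses generated names, but the shape is the same):
newtype OrdDict a = OrdDict { cmp :: a -> a -> Ordering }
ordChar :: OrdDict Char
ordChar = OrdDict compare
-- The instance Ord a => Ord [a] becomes a function from dictionary to dictionary.
ordList :: OrdDict a -> OrdDict [a]
ordList d = OrdDict go
  where
    go []     []     = EQ
    go []     _      = LT
    go _      []     = GT
    go (x:xs) (y:ys) = case cmp d x y of
                         EQ    -> go xs ys
                         other -> other
-- Sorting strings amounts to sorting with cmp (ordList ordChar) passed in,
-- except that the compiler builds and plumbs this value for you.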
This shows us that programming with types can be significantly less verbose, because mechanisms like implicits and typeclasses can infer a large part of the code for your program during type checking.
On a simpler basis, even the sizes of arguments can depend on types, unless you want to pay fairly massive costs for boxing everything in your language up so that it has a homogeneous representation.
This shows us that programming with types can be significantly more efficient, because it can use dedicated representations, rather than paying for boxed structures everywhere in your code. An int can't just be a machine integer, because it has to somehow look like everything else in the system. If you're willing to give up an order of magnitude or more worth of performance at runtime, then this may not matter to you.
Finally, once we have types "doing stuff" for us, it is often beneficial to consider the refactoring benefits that mere safety provides.
If I refactor the smaller set of code that remains, the compiler will rewrite all that type-class plumbing for me. It'll figure out the new ways it can rewrite the code to unbox more arguments. I'm not stuck elaborating all of this stuff by hand; I can leave these mundane tasks to the type-checker.
But even when I do change the types, I can move arguments around fairly willy-nilly, comfortable that the compiler will very likely catch my errors. Types give you "free theorems" which are like unit tests for whole classes of such errors.
On the other hand, once I lock down an API in a language like Python I'm deathly afraid of changing it, because it'll silently break at runtime for all my downstream dependencies! This leads to baroque APIs that lean heavily on easily bit-rotted keyword-arguments, and the API of something that evolves over time rarely resembles what you'd build out of the box if you had it to do over again. Consequently, even the mere safety concern has long-term impact in API design once you ever want people to build on top of your work, rather than simply replace it when it gets too unwieldy.
That's because typing only eliminates programs, it doesn't enable new ones.
This is not a correct statement. Type-classes make it possible to generate parts of your program from type-level information.
Consider two expressions:
readMaybe "15" :: Maybe Integer
readMaybe "15" :: Maybe Bool
Here I'm using the readMaybe function from the Text.Read module. At term level those expressions are identical, only their type annotations are different. However, the results they produce at runtime differ (Just 15 in the first case, Nothing in the second case).
This is because the compiler generates code for you from the static type information you have. To be more precise, it selects a suitable type class instance and passes its dictionary to the polymorphic function (readMaybe in our case).
This example is simple, but there are way more complex use cases. Using the mtl library you can write computations that run in different computational contexts (aka Monads). The compiler will automatically insert a lot of code that manages the computational contexts. In a dynamically typed language, you would have no static information to make this possible.
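A small illustration of the mtl style (my own example, not from the answer): one definition that runs in any monad stack offering Int-list state and String errors, with the compiler inserting the plumbing for whichever concrete stack is chosen.
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State  (MonadState, get, put)
import Control.Monad.Except (MonadError, throwError)
pop :: (MonadState [Int] m, MonadError String m) => m Int
pop = do
  stack <- get
  case stack of
    []       -> throwError "empty stack"
    (x : xs) -> put xs >> pure x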
As you can see, static typing not only cuts off incorrect programs but also writes correct ones for you.
You need "safety" when you already know what and how you want to write. It's a very small part of what types are useful for. The most important thing about types is that they make your reasoning structured. When someone writes in Python a + b he doesn't see a and b as some abstract variables — he sees them as some numbers. Types are already there in the internal language of humans, Python just doesn't have a type system to talk about them. The actual question in the "typed vs untyped (unityped) programming" dispute is "do we want to reflect our internal structured concepts in a safe and explicit or unsafe and implicit way?". Types don't introduce new concepts — it's untyped reasoning forgets the existing ones.
When someone looks at a tree (I mean a real green one) he doesn't see every single leaf on it, but he doesn't treat it as an abstract nameless object either. "A tree" is an approximation that is good enough for most cases, and that's why we have Hindley-Milner type systems, but sometimes you want to talk about a specific tree and you do want to look at leaves. And that's what dependent types give you: the ability to zoom. "A tree without leaves", "a tree in the forest", "a tree of a particular form"... Dependently typed programming is just another step towards how humans think.
On a less abstract note, I have a type checker for a toy dependently typed language, where all typing rules are expressed as constructors of a data type. You don't need to dive into the type checking procedure to understand the rules of the system. That's the power of "zooming": you can introduce as complex invariants as you want, thus distinguishing essential parts from not important ones.
Another example of the power dependent types give you is various forms of reflection. Look e.g. at the Pierre-Évariste Dagand thesis, which proves that
generic programming is just programming
And of course types are hints: many functions and abstractions I defined I would have defined in a far more clumsy way in a weakly typed language, but types suggested better alternatives.
There is just no question "What to choose: simple types or dependent types?". Dependent types are always better and they of course subsume simple types. The question is "What to choose: no types or dependent types?", but that question doesn't stand for me.
Refactoring. By having a strong type system you can safely refactor code and have the compiler tell you whether what you are doing now even makes sense. The stronger the type system, the more refactoring errors are avoided. This of course means your code is a lot more maintainable.

Resources