In the question Seeking constructive criticism on monad implementation, abesto asked people to criticize his "Monad", which kept count of the number of bind operations. It turned out that this was not actually a monad because it did not satisfy the first two monad laws, but I found the example interesting. Is there any type class that would be suitable for such structures?
That's an interesting question, and has to do with the mathematical lineage of monads.
We could certainly create a typeclass called something like Monadish, which would look exactly like the Monad typeclass:
class Monadish m where
    returnish :: a -> m a
    bindish   :: m a -> (a -> m b) -> m b
The monad laws have nothing to do with the actual signature of the typeclass; they're extra information that an implementor has to enforce by themselves. So, in one sense, the answer is "of course": just make another typeclass and say it doesn't have to satisfy any laws.
But is such a typeclass interesting? For a mathematician, the answer would be no: the lack of any laws means that there is no interesting structure by which to reason with. When we define a mathematical structure, we usually define some objects (check), some operations (check) and then some properties of the operations (...nope). We need all three of these to prove theorems about this class of objects, and, to take one example, abstract algebra is all about taking the same operations and adding more or fewer laws.
For a software engineer, the answer is a little more complex. Reasoning is not required: you can always just use a typeclass to overload syntax for your own nefarious purposes. We can use a typeclass to group things together that "feel" the same, even though we don't have any formal reasons for believing so. There are some benefits to doing this, but I personally feel this throws out a lot of the benefits of having laws, and leads to architecture astronauts who invent abstract structures without a whole lot of thought of their applicability. Maths is a safer bet: the monad laws correspond to left identity, right identity, and associativity, reasonably fundamental assumptions that even a non-mathematical person would be familiar with.
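To make this concrete, here is a minimal sketch of a lawless Monadish instance in the spirit of the original question (the CountedM type is invented for illustration). It counts binds, so it violates left identity, yet it is a perfectly legal instance, because the class demands nothing beyond the signatures:

newtype CountedM a = CountedM (Int, a) deriving Show

instance Monadish CountedM where
    returnish a = CountedM (0, a)
    bindish (CountedM (n, a)) f =
        let CountedM (m, b) = f a
        in  CountedM (n + m + 1, b)

-- Left identity fails: bindish (returnish a) f increments the count,
-- so it is not equal to f a. No law says it must be.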
In category theory, the concept of a functor is as described below:
https://ncatlab.org/nlab/show/functor
In Haskell, the Functor type class can be expressed as:
fmap :: (a -> b) -> f a -> f b
https://hackage.haskell.org/package/base-4.14.0.0/docs/Data-Functor.html
and I can see that the two correspond well.
However, once we actually try to implement this functor concept in code, it seems impossible to define F or fmap as simply as the diagram shown above.
In fact, there is a famous article about Functor/Monad.
Functors, Applicatives, And Monads In Pictures
Here,
Simple enough. Lets extend this by saying that any value can be in a context. For now you can think of a context as a box that you can put a value in:
or
Here's what is happening behind the scenes when we write fmap (+3) (Just 2):
What I have always felt about Functor is that the concept of a functor in category theory and the concept of wrapping/unwrapping to/from a "box" do not match well.
Question Point 1.
fmap :: (a -> b) -> f a -> f b
https://hackage.haskell.org/package/base-4.14.0.0/docs/Data-Functor.html
Where is the actual implementation of wrapping to / unwrapping from the "box" in Haskell?
Question Point 2.
Why do the concept of a functor in category theory and the concept of wrapping/unwrapping to/from a "box" not match well?
EDIT:
Even for the IO functor, f is unwrapped during the composition process:
// f is unwrapped in composition process
const compose = g => f => x => g(f(x));
const fmap = compose;
const print = a => () => console.log(a);
// safely no side-effect
const todo = fmap(print("bar"))(print("foo"));
//side effect
todo(undefined); // foo bar
// with pipeline operator (ES.next)
//
// const todo = print("foo")
// |> fmap(print("bar"))
// |> fmap(print("baz"));
// todo(undefined); // foo bar baz
The ideas from category theory are so abstract that anyone who tries to provide an intuitive introduction runs the risk of simplifying concepts to the point where they may confuse people. As the author of an article series in this space, I can testify that it doesn't take much imprecise language before someone misunderstands the text.
I don't know the particular article, but I believe that it may exhibit the same trait. The wrap/unwrap metaphor fits a substantial subset of functors (e.g. Maybe, [], Either l, etc.), but not all.
Famously, you're not supposed to unwrap IO; that is by design, and it is exactly where the wrap/unwrap metaphor falls apart: it is no longer valid in the face of IO.
Indeed, the concepts don't match. I'd say that the wrap/unwrap metaphor may be useful as an introduction, but as always, there's a limit to how much you can stretch a metaphor.
How are the Functor instances implemented? Most introductions to Haskell will show you how to write fmap for Maybe, [], and a few other types. It can also be a good exercise to implement them yourself, if you get the chance.
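For instance, here is a sketch of two such instances, using a local Maybe' clone so that it doesn't clash with the Prelude's own instance:

data Maybe' a = Nothing' | Just' a deriving Show

instance Functor Maybe' where
    fmap _ Nothing'  = Nothing'
    fmap f (Just' x) = Just' (f x)

data Tree a = Leaf a | Node (Tree a) (Tree a) deriving Show

instance Functor Tree where
    fmap f (Leaf x)   = Leaf (f x)
    fmap f (Node l r) = Node (fmap f l) (fmap f r)

Notice that neither instance unwraps anything in a way the caller can observe; each just rebuilds the same shape with the function applied inside.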
GHC and its ecosystem are open source, so you can always look at the source code if you wonder how a particular instance is implemented.
Again, IO is a big exception to the rule. As far as I understand it, its Functor, Applicative, Monad, etc. instances aren't implemented in (Safe) Haskell, but rather in a small core of unsafe code (C or C++, I believe) that constitutes the core of the compiler and/or runtime environment. There's no (explicit, visible, safe) unwrapping going on with IO. I think it's more helpful to think of IO's Functor instance as the structure-preserving map that it is.
For more details about the correspondence between category theory and Haskell, I recommend Bartosz Milewski's article series on the topic.
Look at the arrows in the picture. There is no way to go from the functor level back to the non-functor level. You would need a function that goes from F(x) to x, but - as you can see - none is defined.
There are specific functors (like Maybe) that offer the "unwrapping" functionality, but such a feature is always an add-on; it's delivered on top of something being a functor. For example you might say: Maybe is a functor, and it also has an interesting property: there's a partial function that maps Maybe X to X, and which reverses pure.
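For concreteness, here is a sketch of that partial unwrapper for Maybe (essentially fromJust from Data.Maybe). It satisfies unwrap (pure x) = x, but it has no answer for Nothing, which is exactly why it is partial and not part of Functor:

unwrap :: Maybe a -> a
unwrap (Just x) = x
unwrap Nothing  = error "cannot unwrap Nothing"  -- partial: undefined here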
UPDATE (after the additional question appeared): The concepts of a box and a functor simply don't match. Also, as far as I know, no good metaphor for a functor (or a monad, or an applicative) has been found - and not for lack of trying. It's not even surprising: most abstractions lack good metaphors, precisely because abstractions and metaphors are polar opposites (in some way).
Abstraction strips a concept to its core, leaving only the barest essentials. Metaphor on the other hand, extends a concept, widens the semantic space, suggests more meaning. When I say: "your eyes have the color of chocolate", I am abstracting a notion of a "color". But I also metaphorically associate the eyes and chocolate: I suggest that they have more in common than just the color: silky texture, sweetness, pleasure - all those concepts are present, although none were named. If I said "your eyes have the color of excrement" the abstraction used would be exactly the same - but the metaphorical meaning: very different. I would not say it even to a logician, even though a logician would technically understand that the sentence is not offensive.
When on auto-pilot, most humans think in metaphors, not abstractions. Care must be taken when explaining the latter in terms of the former, because the meanings will spill over. When you hear "a box", the autopilot in your head tells you that you can put stuff in and get stuff out. Functor is not like that. So the metaphor is misleading.
Functor embodies an abstraction of a... box or a wrapper, but one that allows us to work on its contents WITHOUT ever unwrapping it. This lack of unwrapping is exactly the thing that makes functors interesting: otherwise fmap would just be syntactic sugar for unwrapping, applying a function, and wrapping the result up. Studying functors lets us understand how much is possible without unwrapping values - and, what's even better and more enlightening, it lets us understand what is impossible without unwrapping. The steps which lead up to applicatives, arrows and monads show us how to overcome some of the limitations by allowing additional operations, but still without ever allowing unwrapping, because if we allowed unwrapping, the steps would make no sense (i.e. become trivial).
To clarify my question, let me rephrase it in a more or less equivalent way:
Why is there a concept of superclass/class inheritance in Haskell?
What are the historical reasons that led to that design choice?
Why would it be so bad, for example, to have a base library with no class hierarchy, just typeclasses independent from each other?
Here I'll expose some random thoughts that made me want to ask this question. My current intuitions might be inaccurate as they are based on my current understanding of Haskell which is not perfect, but here they are...
It is not obvious to me why type class inheritance exists in Haskell. I find it a bit weird, as it creates asymmetry in concepts.
Often in mathematics, concepts can be defined from different viewpoints; I don't necessarily want to favor one order in which they ought to be defined. OK, there is some order in which one should prove things, but once theorems and structures are there, I'd rather see them as independently available tools.
Moreover one perhaps not so good thing I see with class inheritance is this: I think a class instance will silently pick a corresponding superclass instance, which was probably implemented to be the most natural one for that type. Let's consider a Monad viewed as a subclass of Functor. Maybe there could be more than one way to define a Functor on some type that also happens to be a Monad. But saying that a Monad is a Functor implicitly makes the choice of one particular Functor for that Monad. Someday, you might forget that actually you wanted some other Functor.
Perhaps this example is not the best fit, but I have the feeling this sort of situation might generalize and possibly be dangerous if your class is a child of many. Current Haskell inheritance sounds like it makes default choices about parents implicitly.
If instead you have a design without hierarchy, I feel you would always have to be explicit about all the properties required, which would perhaps mean a bit less risk, more clarity, and more symmetry. So far, what I'm seeing is that the cost of such a design would be: more constraints to write in instance definitions, and newtype wrappers for each meaningful conversion from one set of concepts to another. I am not sure, but perhaps that could have been acceptable. Unfortunately, I think Haskell's auto-deriving mechanism for newtypes doesn't work very well; I would appreciate it if the language were somehow smarter about newtype wrapping/unwrapping and required less verbosity.
I'm not sure, but now that I think about it, perhaps an alternative to newtype wrappers could be specific imports of modules containing specific variations of instances.
Another alternative I thought about while writing this is that maybe one could weaken the meaning of class (P x) => C x, where instead of it being a requirement that an instance of C selects an instance of P, we could just take it to loosely mean that, for example, the C class also contains P's methods, but no instance of P is automatically selected and no other relationship with P exists. So we could keep some sort of weaker hierarchy that might be more flexible.
Thanks if you have some clarifications over that topic, and/or correct my possible misunderstandings.
Maybe you're tired of hearing from me, but here goes...
I think superclasses were introduced as a relatively minor and unimportant feature of type classes. In Wadler and Blott, 1988, they are briefly discussed in Section 6 where the example class Eq a => Num a is given. There, the only rationale offered is that it's annoying to have to write (Eq a, Num a) => ... in a function type when it should be "obvious" that data types that can be added, multiplied, and negated ought to be testable for equality as well. The superclass relationship allows "a convenient abbreviation".
(The unimportance of this feature is underscored by the fact that this example is so terrible. Modern Haskell doesn't have class Eq a => Num a because the logical justification for all Nums also being Eqs is so weak. The example class Eq a => Ord a would have been a lot more convincing.)
So, the base library implemented without any superclasses would look more or less the same. There would just be more logically superfluous constraints on function type signatures in both library and user code, and instead of fielding this question, I'd be fielding a beginner question about why:
leq :: (Ord a) => a -> a -> Bool
leq x y = x < y || x == y
doesn't type check.
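In that hypothetical superclass-free world, the fix would be to spell out both constraints by hand, which is exactly the noise superclasses were meant to remove:

leq :: (Eq a, Ord a) => a -> a -> Bool
leq x y = x < y || x == y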
To your point about superclasses forcing a particular hierarchy, you're missing your target.
This kind of "forcing" is actually a fundamental feature of type classes. Type classes are "opinionated by design", and in a given Haskell program (where "program" includes all the libraries, including base, used by the program), there can be only one instance of a particular type class for a particular type. This property is referred to as coherence. (Even though there is a language extension IncoherentInstances, it is considered very dangerous and should only be used when all possible instances of a particular type class for a particular type are functionally equivalent.)
This design decision comes with certain costs, but it also comes with a number of benefits. Edward Kmett talks about this in detail in this video, starting at around 14:25. In particular, he compares Haskell's coherent-by-design type classes with Scala's incoherent-by-design implicits and contrasts the increased power that comes with the Scala approach with the reusability (and refactoring benefits) of "dumb data types" that comes with the Haskell approach.
So, there's enough room in the design space for both coherent type classes and incoherent implicits, and Haskell's approach isn't necessarily the right one.
BUT, since Haskell has chosen coherent type classes, there's no "cost" to having a specific hierarchy:
class Functor m => Monad m
because, for a particular type, like [] or MyNewMonadDataType, there can only be one Monad and one Functor instance anyway. The superclass relationship introduces a requirement that any type with a Monad instance must have a Functor instance, but it doesn't restrict the choice of Functor instance, because you never had a choice in the first place. Or rather, your choice was between having zero Functor [] instances and having exactly one.
Note that this is separate from the question of whether or not there's only one reasonable Functor instance for a Monad type. In principle, we could define a law-violating data type with incompatible Functor and Monad instances. We'd still be restricted to using that one Functor MyType instance and that one Monad MyType instance throughout our program, whether or not Functor was a superclass of Monad.
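Here is a sketch of such a law-violating type (the Twice type is invented for illustration): its fmap duplicates the contents, breaking fmap id = id and disagreeing with the map derived from the Monad instance. Coherence still guarantees these are the only instances the whole program will ever see:

newtype Twice a = Twice [a] deriving Show

instance Functor Twice where
    fmap f (Twice xs) = Twice (map f xs ++ map f xs)  -- duplicates!

instance Applicative Twice where
    pure x = Twice [x]
    Twice fs <*> Twice xs = Twice [f x | f <- fs, x <- xs]

instance Monad Twice where
    Twice xs >>= f = Twice (concatMap (\x -> case f x of Twice ys -> ys) xs)

-- fmap (+1) (Twice [1])        gives Twice [2,2]
-- Twice [1] >>= (pure . (+1))  gives Twice [2]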
I've been toying around in Haskell for the past year or so and I'm actually starting to 'get' it, up until Monads, Lenses, Type Families... the lot.
I'm about to leave this comfort zone a little as I am moving to an OCaml project as a day job. Going through the syntax a little, I was looking for similar higher-level concepts, like for example functors.
I read the code in OCaml and the structure of a functor but I cannot seem to get whether they are now similar concepts in Haskell and OCaml or not.
In a nutshell, a functor in Haskell is for me mainly a way to lift functions, and I use it (and like it) like that.
In OCaml it gives me the feeling that it's closer to programming to an interface (for example when making a set or a list, with that compare function) and I wouldn't really know how to for example lift functions over the functor or so.
Can somebody explain to me whether the two concepts are similar, and if so, what am I missing or not seeing? I googled around a bit and there doesn't seem to be a clear answer to be found.
Kasper
From a practical standpoint, you can think of "functors" in OCaml and Haskell as unrelated. As you said, in Haskell a functor is any type that allows you to map a function over it. In OCaml, a functor is a module parametrized by another module.
In Functional Programming, what is a functor? has a good description of what functors in the two languages are and how they differ.
However, as the name implies, there is actually a connection between the two seemingly disparate concepts! Both languages' functors are just realizations of a concept from category theory.
Category theory is the study of categories, which are just arbitrary collections of objects with "morphisms" between them. The idea of a category is very abstract, so "objects" and "morphisms" can really be anything with a few restrictions—there has to be an identity morphism for every object and morphisms have to compose.
The most obvious example of a category is the category of sets and functions: the sets are the objects and the functions between sets the morphisms. Clearly, every set has an identity function and functions can be composed. A very similar category can be formed by looking at a functional programming language like Haskell or OCaml: concrete types (i.e. types with kind *) are the objects and Haskell/OCaml functions are the morphisms between them.
In category theory, a functor is a transformation between categories. It is like a function between categories. When we're looking at the category of Haskell types, a functor is essentially a type-level function: it maps types to something else. The particular kind of functor we care about maps types to other types. A perfect example of this is Maybe: Maybe maps Int to Maybe Int, String to Maybe String and so on. It provides a mapping for every possible Haskell type.
Functors have one additional requirement—they have to map the category's morphisms as well as the objects. In particular, if we have a morphism A → B and our functor maps A to A' and B to B', it has to map the morphism A → B to some morphism A' → B'. As a concrete example, let's say we have the types Int and String. There are a whole bunch of Haskell functions Int → String. For Maybe to be a proper functor, it has to have a function Maybe Int → Maybe String for each of these.
Happily, this is exactly what the fmap function does—it maps functions. For Maybe, it has the type (a → b) → Maybe a → Maybe b; we can add some parentheses to get: (a → b) → (Maybe a → Maybe b). What this type signature tells us is that for any normal function we have, we also have a corresponding function over Maybes.
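Concretely, fmap sends the morphism show :: Int -> String to a morphism between the mapped objects:

mapped :: Maybe String
mapped = fmap show (Just 42 :: Maybe Int)  -- Just "42"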
So a functor is a mapping between types that also preserves the functions between them. The fmap function is essentially just a proof of this second restriction on functors. This makes it easy to see how the Haskell Functor class is just a particular version of the mathematical concept.
So what about OCaml? In OCaml, a functor is not a type—it's a module. In particular, it's a parametrized module: a module that takes another module as an argument. Already, we can see some parallels: in Haskell, a Functor is like a type-level function; in OCaml, a functor is like a module-level function. So really, it's the same mathematical idea; however, instead of being used on types—like in Haskell—it's used on modules.
There's far more detail about how OCaml functors relate to category theory functors on the CS site: What is the relation between functors in SML and Category theory?. The question talks about SML rather than OCaml per se, but my understanding is that the module system of OCaml is very closely related to that of SML.
In summary: functors in Haskell and in OCaml are two fundamentally different structures which both happen to be reifications of the same very abstract mathematical idea. I think it's pretty neat :).
All the typeclasses in Typeclassopedia have associated laws, such as associativity or commutativity for certain operators. The definition of a "law" seems to be a constraint that cannot be expressed in the type system. I certainly understand why you want to have, say, monad laws, but is there a fundamental reason why a typeclass that can be expressed fully within the type system is pointless?
You will notice that almost always the laws are algebraic laws. They could be encoded in the type system using some extensions, but the proofs would be cumbersome to write. So you have unchecked laws, and implementations might potentially break them. Why is this good?
The reason is that the design patterns used in Haskell are motivated (and in most cases mirrored) by mathematical structures, usually from abstract algebra. While most other languages have an intuitive notion of certain features like safety, performance and semantics, we Haskell programmers prefer to establish a formal notion. The advantage of doing this is: Once your types and functions obey the safety laws, they are safe in the sense of the underlying algebraic structure. They are provably safe.
Take functors as an example. A Haskell functor has the following two laws:
fmap f . fmap g = fmap (f . g)
fmap id = id
Firstly, this is very important: functions in Haskell are opaque. You cannot examine them, compare them, or inspect them in any other way. While this sounds like a bad thing, in Haskell it is actually a very good thing. The fmap function cannot examine the function you've passed it. In particular, it can't check whether you've passed the identity function or a composition. In short: it can't cheat! The only way for it to obey these two laws is actually not to introduce any effects of its own. That means that, in a proper functor, fmap will never do anything unexpected. In fact it cannot do anything other than map the given function. This is a very simple example, and I haven't explained all the subtleties of why fmap can't cheat, but it demonstrates the point.
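To sketch what "cheating" would look like (the Counted type here is invented for illustration), consider an fmap that sneaks in an effect of its own. It immediately breaks the identity law:

data Counted a = Counted Int a deriving Show

instance Functor Counted where
    fmap f (Counted n x) = Counted (n + 1) (f x)  -- sneaks in an increment

-- fmap id (Counted 0 'x') gives Counted 1 'x', not Counted 0 'x',
-- so fmap id /= id and the identity law is broken.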
Now extend this all over the language, the base libraries and most sensible third party libraries. This gives you a language that is as predictable as a language can get. When you write code, you know what it's going to do. That's one of the main reasons why Haskell code often works out of the box. I often write pages of Haskell code before compiling. Once my type errors are fixed, my program usually works.
The other reason why this is desirable is that it allows a more compositional style of programming. This is particularly useful when working as a team. First you map your application to algebraic structures and establish the necessary laws. For example: You express what it means for something to be a Valid Web Server. In particular you establish a formal notion of web server composition. If you compose two Valid Web Servers, the result is a Valid Web Server. Do you see where this is going? After establishing these laws the teammates go to work, and they work in isolation. Little to no communication is necessary to get their job done. When they meet again, everybody presents their Valid Web Servers and they just compose them to make the final product, a web site. Since the individual components were all Valid Web Servers, the final result must be a Valid Web Server. Provably.
Yes and no. For instance the Show class does not have any laws associated with it, and it is certainly useful.
However, typeclasses express interfaces. An interface needs to satisfy more than being just a bunch of functions - you want these functions to fulfill a specification. The specification is normally more complicated than what can be expressed in Haskell's type system. For example, take the Eq class. It only needs to provide us with a function, the type of which has to be a -> a -> Bool. That's the most that Haskell's type system will allow us to require from an instance of an Eq type. However, we would normally expect more from this function - you would probably want it to be an equivalence relation (reflexive, symmetric and transitive). So then you state these requirements as separate "laws".
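The type system happily accepts instances that break such expectations; even base ships one. Double's (==) follows IEEE 754 floating-point semantics, so it is not reflexive:

nan :: Double
nan = 0 / 0

reflexive :: Bool
reflexive = nan == nan  -- False: NaN is not equal to itself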
A typeclass doesn't need to have laws, but it often will be more useful if it has them. Many typeclasses are expected to function in a certain way, the laws codify user expectations. The laws let users make assumptions about the way that an instance of a typeclass will work. If you break the typeclass laws, you don't get arrested by the Haskell police, you just end up with confused users.
I read an article which said:
Providing instances for the many standard type-classes [Functors] will immediately give you a lot of functionality for practically free
My question is: what is this functionality that you get for free (for functors or other type-classes)? I know what the definition of a functor is, but what do I get for free by defining something as a functor/other type-class. Something other than a prettier syntax. Ideally this would be general and useful functions that operate on functors/other type-classes.
My imagination (could be wrong) of what free means is functions of this sort: TypeClass x => useful x y = ..
== Edit/Addition ==
I guess I'm mainly asking about the more abstract (and brain boggling) type-classes, like the ones in this image. For less abstract classes like Ord, my object oriented intuition understands.
Functors are simple and probably not the best example. Let's look at Monads instead:
liftM - if something is a Monad, it is also a Functor where liftM is fmap.
>=>, <=<: you can compose a -> m b functions for free where m is your monad.
foldM, mapM, filterM... you get a bunch of utility functions that generalize existing functions to use your monad.
when, guard* and unless -- you also get some control functions for free.
join -- this is actually fairly fundamental to the definition of a monad, but you don't need to define it in Haskell since you've defined >>=.
transformers -- ErrorT and stuff. You can bolt error handling onto your new type, for free (give or take)!
Basically, you get a wide variety of standard functions "lifted" to use your new type as soon as you make it a Monad instance. It also becomes trivial (but alas not automatic) to make it a Functor and Applicative as well.
However, these are all "symptoms" of a more general idea. You can write interesting, nontrivial code that applies to all monads. You might find some of the functions you wrote for your type--which are useful in your particular case, for whatever reason--can be generalized to all monads. Now you can suddenly take your function and use it on parsers, and lists, and maybes and...
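As a minimal sketch, here is a hypothetical writer-like Logged monad; the moment its Monad instance exists, library functions such as mapM work on it unchanged:

newtype Logged a = Logged ([String], a) deriving Show

instance Functor Logged where
    fmap f (Logged (w, a)) = Logged (w, f a)

instance Applicative Logged where
    pure a = Logged ([], a)
    Logged (w1, f) <*> Logged (w2, a) = Logged (w1 ++ w2, f a)

instance Monad Logged where
    Logged (w, a) >>= f = let Logged (w', b) = f a in Logged (w ++ w', b)

logMsg :: String -> Logged ()
logMsg s = Logged ([s], ())

demo :: Logged [()]
demo = mapM (\n -> logMsg ("step " ++ show n)) [1, 2, 3 :: Int]
-- Logged (["step 1","step 2","step 3"], [(),(),()])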
* As Daniel Fischer helpfully pointed out, guard requires MonadPlus rather than Monad.
Functors are not very interesting by themselves, but they are a necessary stepping stone to get into applicative functors and Traversables.
The main property which makes applicative functors useful is that you can use fmap with the applicative operator <*> to "lift" any function of any arity to work with applicative values. I.e. you can turn any a -> b -> c -> d into Applicative f => f a -> f b -> f c -> f d. You can also take a look at Data.Traversable and Data.Foldable which contain several general purpose functions that involve applicative functors.
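For example, here is an ordinary three-argument function lifted into Maybe; if any argument is Nothing, the whole result is Nothing:

addThree :: Int -> Int -> Int -> Int
addThree a b c = a + b + c

lifted :: Maybe Int
lifted = addThree <$> Just 1 <*> Just 2 <*> Just 3  -- Just 6

missing :: Maybe Int
missing = addThree <$> Just 1 <*> Nothing <*> Just 3  -- Nothing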
Alternative is a specialized applicative functor which supports choice between alternatives that can "fail" (the exact meaning of "empty" depends on the applicative instance). Applicative parsers are one practical example, where the definitions of some and many are very intuitive (e.g. match some pattern zero-or-more times or one-or-more times).
Monads are one of the most interesting and useful type-classes, but they are already well covered by the other answers.
Monoid is another type-class that is both simple and immediately useful. It basically defines a way to add two pieces of data together, which then gives you a generic concat as well as functionality in the aforementioned Foldable module and it also enables you to use the Writer monad with the data type.
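A couple of small examples of that generic machinery in action (Sum is one of the standard newtype wrappers from Data.Monoid):

import Data.Monoid (Sum(..))

total :: Sum Int
total = mconcat [Sum 1, Sum 2, Sum 3]  -- Sum {getSum = 6}

greeting :: String
greeting = mconcat ["Hello", ", ", "world"]  -- "Hello, world"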
Many of the standard functions in Haskell require that their arguments implement one or more type classes. Implementing those type classes for your own data allows other developers (or yourself) to use your data in ways they are already familiar with, without having to write additional functions.
As an example, implementing the Ord type class will allow you to use things like sort, min, max, etc. Otherwise, you would need sortBy and the like.
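A quick sketch of that in action, using a derived Ord on a hypothetical Size type:

import Data.List (sort)

data Size = Small | Medium | Large deriving (Show, Eq, Ord)

sorted :: [Size]
sorted = sort [Large, Small, Medium]  -- [Small, Medium, Large]

biggest :: Size
biggest = maximum [Small, Large, Medium]  -- Large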
Yes, it means that implementing the type class Foo gives you all the other functions that have a Foo constraint "for free".
The Functor type class isn't too interesting in that regard, as it doesn't give you a lot.
A better example is monads and the functions in the Control.Monad module. Once you've defined the two Monad functions (>>=) and return for your type, you get another thirty or so functions that can then be used on your type.
Some of the more useful ones include: mapM, sequence, forever, join, foldM, filterM, replicateM, when, unless and liftM. These show up all the time in Haskell code.
As others have said, Functor itself doesn't actually get you much for free. Basically, the more high-level or general a typeclass is (meaning the more things fit that description), the less "free" functionality you are going to get. So, for example, Functor and Monoid don't provide you with much, but Monad and Arrow provide you with a lot of useful functions for free.
In Haskell, it's still a good idea to write an instance for Functor and Monoid though (if your data type is indeed a functor or a monoid), because we almost always try to use the most general interface possible when writing functions. If you are writing a new function that can get away with only using fmap to operate on your data type, then there is no reason to artificially restrict that function to Monads or Applicatives, since it might be useful later for other things.
Your object-oriented intuition carries across, if you read "interface and implementation" for "typeclass and instance". If you make your new type C an instance of a standard typeclass B, then you get for free that your type will work with all existing code A that depends on B.
As others have said, when the typeclass is something like Monad, then the freebies are the many library functions like foldM and when.