a way to simulate Union types without type operators - haskell

I have done this before using type operators, but I want to avoid them here. I'd like a smaller hammer, because I actually want to do this in another language, and I'm not sure type operators do quite what I want.
The setup is two data types, Integer and...
> data Rational = Rational Integer Integer deriving Show
two type classes with sensible instances...
> class Divide2 a where
>   divide2 :: a -> a
> class Increment a where
>   inc :: a -> a
> instance Increment Main.Rational where
>   inc (Rational a b) = Rational (a + b) b
> instance Divide2 Main.Rational where
>   divide2 (Rational a b) = Rational a (2 * b)
> instance Increment Integer where
>   inc x = x + 1
I can define things that work on instances of one type class or the other
> div4 x = divide2 (divide2 x)
> add2 :: Increment c => c -> c
> add2 = inc . inc
and then I want to take the union of these two data types...so the obvious thing to do is use a discriminated union
> data Number = Rat Main.Rational | Int Integer
Now, in my scenario, the functions that act on this union live in one distinct module (a binary; I'm not familiar with Haskell's binaries),
but the data types themselves are defined in another.
So clearly I can define some functions that can, in principle, work on this union, e.g. functions that act on values of instances of Increment, and some that can't, e.g. ones that need Divide2.
So how do I write a function against this discriminated union that applies a function to the values in the union, and that compiles for functions on Increment but doesn't compile for functions on Divide2? I'll have a go here, and fall flat on my face.
> apply (Rat r) f = f r
> apply (Int i) f = f i
.
• Couldn't match expected type ‘Main.Rational’
with actual type ‘Integer’
• In the first argument of ‘f’, namely ‘i’
In the expression: f i
In an equation for ‘apply’: apply (Int i) f = f i
   |
86 | > apply (Int i) f = f i
   |                       ^
Failed, no modules loaded.
As expected, inference says it's got to be a Rational because of the first call, but it's an Integer...
but "clearly", if I could make Haskell suspend disbelief, like some sort of macro, then the function
> apply (Int 1) add2
does make sense, and moreover, makes sense for any value of Number I care to choose.
so the obvious thing to do is to make Number a member of every type class in the intersection of the sets of type classes each member is in...
> instance Increment Number where
>   inc (Rat r) = Rat (inc r)
>   inc (Int i) = Int (inc i)
and then ghc implements "apply" itself...I CAN as well map this solution back into other languages by some explicit dictionary passing...but I have hundreds, if not thousands of tiny typeclasses (I may even have to consider all their combinations as well).
So really I want to know: is there some type magic (existential? RankN?) that lets me write "apply" against Number without resorting to dependent type magic, or having to implement instances of type classes on the discriminated union?
P.S. I can do limited dependent type magic, but it's a last resort.
Edit...
The code that contains the functions defined on Number can of course match on the discriminated values, but if it does, then whenever the union is extended it will fail to compile (OK, it doesn't have to match each case individually, but unless it does, it can't extract the wrapped value to apply the function, because it won't know the type).
Hmmm...written down it looks like the expression problem, in fact it IS the expression problem...I know of many solutions then...I just don't usually like any of them...let me knock up the canonical Haskell solution to this using type classes.

You can accept only functions that make use of Increment methods (and do not make use of any non-Incremental functionality) like this:
{-# LANGUAGE RankNTypes #-}
apply :: (forall a. Increment a => a -> a) -> Number -> Number
apply f (Rat r) = Rat (f r)
apply f (Int i) = Int (f i)
You can now pass add2 to apply if you like:
> apply add2 (Rat (Rational 3 4))
Rat (Rational 11 4)
In this specific case, implementing apply amounts to exactly the same thing as supplying an Increment instance for Number itself:
instance Increment Number where
inc (Rat r) = Rat (inc r)
inc (Int i) = Int (inc i)
...and now you don't even need the mediating apply function to apply add2:
> add2 (Rat (Rational 3 4))
Rat (Rational 11 4)
But this is a pretty special case; it won't always be so easy to just implement the appropriate classes for Number, and you will need to resort to something like the higher-rank types we used in apply instead.
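As a rough end-to-end sketch of the higher-rank approach, the module below repeats the question's definitions next to apply so it can be loaded on its own. The `Prelude hiding (Rational)` import, the derived Eq instances, and the names ok1 and ok2 are additions for illustration only. The commented-out line at the end shows the kind of call that is expected to be rejected.

```haskell
{-# LANGUAGE RankNTypes #-}
import Prelude hiding (Rational)

data Rational = Rational Integer Integer deriving (Eq, Show)
data Number = Rat Rational | Int Integer deriving (Eq, Show)

class Increment a where
  inc :: a -> a

class Divide2 a where
  divide2 :: a -> a

instance Increment Rational where
  inc (Rational a b) = Rational (a + b) b

instance Increment Integer where
  inc x = x + 1

instance Divide2 Rational where
  divide2 (Rational a b) = Rational a (2 * b)

-- The argument must work for every Increment instance,
-- so it can be used at both Rational and Integer.
apply :: (forall a. Increment a => a -> a) -> Number -> Number
apply f (Rat r) = Rat (f r)
apply f (Int i) = Int (f i)

add2 :: Increment c => c -> c
add2 = inc . inc

-- Hypothetical example values, not from the original post.
ok1, ok2 :: Number
ok1 = apply add2 (Int 1)
ok2 = apply add2 (Rat (Rational 3 4))

-- Expected to be rejected: divide2 needs Divide2 a, which GHC
-- cannot deduce from the Increment a that apply promises.
-- bad = apply divide2 (Int 1)
```

Extending Number with a new constructor then only requires touching apply, not every function passed to it.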

So this IS the expression problem, and type classes solve this specific case.
You take the function you want to make general over some as yet undefined universe of types:
> class Add2 a where
>   add2' :: a -> a
> newtype Number' a = Number' a
> instance (Increment a) => Add2 (Number' a) where
>   add2' (Number' x) = Number' $ inc (inc x)
> three = add2 (Int 1)
and then make any type that meets the required preconditions, in terms of type classes, an instance of the type class for your generalised "function".
You then implement your new "Number" data types, and create instances for them where they make sense.

Related

Besides as-pattern, what else can @ mean in Haskell?

I am studying Haskell currently and trying to understand a project that uses Haskell to implement cryptographic algorithms. After reading Learn You a Haskell for Great Good online, I began to understand the code in that project. Then I got stuck on the following code with the "@" symbol:
-- | Generate an @n@-dimensional secret key over @rq@.
genKey :: forall rq rnd n . (MonadRandom rnd, Random rq, Reflects n Int)
       => rnd (PRFKey n rq)
genKey = fmap Key $ randomMtx 1 $ value @n
Here the randomMtx is defined as follows:
-- | A random matrix having a given number of rows and columns.
randomMtx :: (MonadRandom rnd, Random a) => Int -> Int -> rnd (Matrix a)
randomMtx r c = M.fromList r c <$> replicateM (r*c) getRandom
And PRFKey is defined below:
-- | A PRF secret key of dimension @n@ over ring @a@.
newtype PRFKey n a = Key { key :: Matrix a }
All the information sources I can find say that @ is the as-pattern, but this piece of code is apparently not that case. I have checked online tutorials, blogs and even the Haskell 2010 language report at https://www.haskell.org/definition/haskell2010.pdf. There is simply no answer to this question.
More code snippets in this project use @ in this way too:
-- | Generate public parameters (\( \mathbf{A}_0 \) and \(
-- \mathbf{A}_1 \)) for @n@-dimensional secret keys over a ring @rq@
-- for gadget indicated by @gad@.
genParams :: forall gad rq rnd n .
            (MonadRandom rnd, Random rq, Reflects n Int, Gadget gad rq)
          => rnd (PRFParams n gad rq)
genParams = let len = length $ gadget @gad @rq
                n   = value @n
            in Params <$> (randomMtx n (n*len)) <*> (randomMtx n (n*len))
I deeply appreciate any help on this.
That @n is an advanced feature of modern Haskell, which is usually not covered by tutorials like LYAH, nor can it be found in the Report.
It's called a type application and is a GHC language extension. To understand it, consider this simple polymorphic function
dup :: forall a . a -> (a, a)
dup x = (x, x)
Intuitively calling dup works as follows:
the caller chooses a type a
the caller chooses a value x of the previously chosen type a
dup then answers with a value of type (a,a)
In a sense, dup takes two arguments: the type a and the value x :: a. However, GHC is usually able to infer the type a (e.g. from x, or from the context where we are using dup), so we usually pass only one argument to dup, namely x. For instance, we have
dup True :: (Bool, Bool)
dup "hello" :: (String, String)
...
Now, what if we want to pass a explicitly? Well, in that case we can turn on the TypeApplications extension, and write
dup @Bool True :: (Bool, Bool)
dup @String "hello" :: (String, String)
...
Note the @... arguments carrying types (not values). Those are something that exists at compile time, only -- at runtime the argument does not exist.
Why do we want that? Well, sometimes there is no x around, and we want to prod the compiler to choose the right a. E.g.
dup @Bool :: Bool -> (Bool, Bool)
dup @String :: String -> (String, String)
...
Type applications are often useful in combination with some other extensions which make type inference unfeasible for GHC, like ambiguous types or type families. I won't discuss those, but you can simply understand that sometimes you really need to help the compiler, especially when using powerful type-level features.
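A minimal sketch of the above that can be loaded into GHCi, using the `@Type` syntax that TypeApplications enables. The names dupBool, shown and largestChar are made up for illustration; read and maxBound are the standard Prelude functions.

```haskell
{-# LANGUAGE ExplicitForAll, TypeApplications #-}

dup :: forall a. a -> (a, a)
dup x = (x, x)

-- Choose a = Bool before any value argument is supplied.
dupBool :: Bool -> (Bool, Bool)
dupBool = dup @Bool

-- Without @Double the type read should produce here would be ambiguous,
-- because show accepts any Show instance.
shown :: String
shown = show (read @Double "42")

-- maxBound takes no value argument at all, so only a type can select it.
largestChar :: Char
largestChar = maxBound @Char
```

In each case the `@...` argument only steers type checking; nothing extra is passed at run time.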
Now, about your specific case. I don't have all the details, I don't know the library, but it's very likely that your n represents a kind of natural-number value at the type level. Here we are diving in rather advanced extensions, like the above-mentioned ones plus DataKinds, maybe GADTs, and some typeclass machinery. While I can't explain everything, hopefully I can provide some basic insight. Intuitively,
foo :: forall n . some type using n
takes as argument @n, a kind-of compile-time natural, which is not passed at runtime. Instead,
foo :: forall n . C n => some type using n
takes @n (compile-time), together with a proof that n satisfies constraint C n. The latter is a run-time argument, which might expose the actual value of n. Indeed, in your case, I guess you have something vaguely resembling
value :: forall n . Reflects n Int => Int
which essentially allows the code to bring the type-level natural to the term level, accessing the "type" as a "value". (The above type is considered an "ambiguous" one, by the way -- you really need @n to disambiguate.)
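To make this concrete without the library in question, here is a sketch that swaps in GHC's built-in KnownNat class from GHC.TypeLits as a stand-in for Reflects. That substitution is an assumption made for illustration, not how the project defines value. The names five and twelve are hypothetical.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, ScopedTypeVariables, TypeApplications #-}
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, natVal)

-- n appears only in the constraint, so the type is ambiguous and
-- callers must supply n with a type application.
value :: forall n. KnownNat n => Int
value = fromIntegral (natVal (Proxy :: Proxy n))

five :: Int
five = value @5

twelve :: Int
twelve = value @5 + value @7
```

The KnownNat dictionary is the run-time argument mentioned above: it is what lets natVal recover the number from the type.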
Finally: why should one want to pass n at the type level if we then later on convert that to the term level? Wouldn't it be easier to simply write out functions like
foo :: Int -> ...
foo n ... = ... use n
instead of the more cumbersome
foo :: forall n . Reflects n Int => ...
foo ... = ... use (value @n)
The honest answer is: yes, it would be easier. However, having n at the type level allows the compiler to perform more static checks. For instance, you might want a type to represent "integers modulo n", and allow adding those. Having
data Mod = Mod Int -- Int modulo some n
foo :: Int -> Mod -> Mod -> Mod
foo n (Mod x) (Mod y) = Mod ((x+y) `mod` n)
works, but there is no check that x and y are of the same modulus. We might add apples and oranges, if we are not careful. We could instead write
data Mod n = Mod Int -- Int modulo n
foo :: Int -> Mod n -> Mod n -> Mod n
foo n (Mod x) (Mod y) = Mod ((x+y) `mod` n)
which is better, but still allows calling foo 5 x y even when n is not 5. Not good. Instead,
data Mod n = Mod Int -- Int modulo n
-- a lot of type machinery omitted here
foo :: forall n . SomeConstraint n => Mod n -> Mod n -> Mod n
foo (Mod x) (Mod y) = Mod ((x+y) `mod` (value @n))
prevents things from going wrong. The compiler statically checks everything. The code is harder to use, yes, but in a sense making it harder to use is the whole point: we want to make it impossible for the user to try adding something of the wrong modulus.
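One possible way to fill in the omitted type machinery, again assuming GHC.TypeLits' KnownNat in place of the unspecified SomeConstraint. The names addMod, three, four and sumMod5 are illustrative only.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- The modulus lives only in the type; at run time a Mod n is just an Int.
newtype Mod (n :: Nat) = Mod Int deriving (Eq, Show)

addMod :: forall n. KnownNat n => Mod n -> Mod n -> Mod n
addMod (Mod x) (Mod y) = Mod ((x + y) `mod` modulus)
  where
    -- Recover the modulus from the type-level natural n.
    modulus = fromIntegral (natVal (Proxy :: Proxy n))

three, four :: Mod 5
three = Mod 3
four = Mod 4

sumMod5 :: Mod 5
sumMod5 = addMod three four

-- Expected to be rejected, since Mod 5 and Mod 7 are different types:
-- bad = addMod three (Mod 6 :: Mod 7)
```

Because both arguments share the same n, mixing moduli becomes a compile-time type error rather than a silent bug.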
Concluding: these are very advanced extensions. If you're a beginner, you will need to slowly progress towards these techniques. Don't be discouraged if you can't grasp them after only a short study, it does take some time. Make a small step at a time, solve some exercises for each feature to understand the point of it. And you'll always have StackOverflow when you are stuck :-)

Why are ML/Haskell datatypes useful for defining "languages" like arithmetic expressions?

This is more of a soft question about static type systems in functional languages like those of the ML family. I understand why you need datatypes to describe data structures like lists and trees but defining "expressions" like those of propositional logic within datatypes seems to bring just some convenience and is not necessary. For example
datatype arithmetic_exp = Constant of int
| Neg of arithmetic_exp
| Add of (arithmetic_exp * arithmetic_exp)
| Mult of (arithmetic_exp * arithmetic_exp)
defines a set of values, on which you can write an eval function which would give you the result. You could just as well define 4 functions: const: int -> int, neg: int -> int, add: int * int -> int and mult: int * int -> int and then an expression of the sort add (mult (const 3, neg 2), neg 4) would give you the same thing without any loss of static security. The only complication is that you have to do four things instead of two. While learning SML and Haskell I've been trying to think about which features give you something necessary and which are just a convenience, so this is the reason why I'm asking. I guess this would matter if you want to decouple the process of evaluating a value from the value itself but I'm not sure where that would be useful.
Thanks a lot.
There is a duality between initial / first-order / datatype-based encodings (aka deep embeddings) and final / higher-order / evaluator-based encodings (aka shallow embeddings). You can indeed typically use a typeclass of combinators instead of a datatype (and convert back and forth between the two).
Here is a module showing the two approaches:
{-# LANGUAGE GADTs, Rank2Types #-}
module Expr where
data Expr where
  Val :: Int -> Expr
  Add :: Expr -> Expr -> Expr
class Expr' a where
  val :: Int -> a
  add :: a -> a -> a
You can see that the two definitions look eerily similar. Expr' a is basically describing an algebra on Expr which means that you can get an a out of an Expr if you have such an Expr' a. Similarly, because you can write an instance Expr' Expr, you're able to reify a term of type forall a. Expr' a => a into a syntactic value of type Expr:
expr :: Expr' a => Expr -> a
expr e = case e of
  Val n -> val n
  Add p q -> add (expr p) (expr q)
instance Expr' Expr where
  val = Val
  add = Add
expr' :: (forall a. Expr' a => a) -> Expr
expr' e = e
In the end, picking a representation over another really depends on what your main focus is: if you want to inspect the structure of the expression (e.g. if you want to optimise / compile it), it's easier if you have access to an AST. If, on the other hand, you're only interested in computing an invariant using a fold (e.g. the depth of the expression or its evaluation), a higher order encoding will do.
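As an illustration of "computing an invariant using a fold", the sketch below gives the Expr' class two interpretations. The Eval and Depth newtypes and the sample term are hypothetical additions. Expr is written as a plain data type here so that no extensions are needed.

```haskell
data Expr = Val Int | Add Expr Expr

class Expr' a where
  val :: Int -> a
  add :: a -> a -> a

-- Fold a syntax tree into any interpretation that has an Expr' instance.
expr :: Expr' a => Expr -> a
expr (Val n) = val n
expr (Add p q) = add (expr p) (expr q)

-- Interpretation 1: evaluate the expression.
newtype Eval = Eval { runEval :: Int }

instance Expr' Eval where
  val = Eval
  add (Eval x) (Eval y) = Eval (x + y)

-- Interpretation 2: measure the depth of the tree.
newtype Depth = Depth { runDepth :: Int }

instance Expr' Depth where
  val _ = Depth 1
  add (Depth p) (Depth q) = Depth (1 + max p q)

sample :: Expr
sample = Add (Val 1) (Add (Val 2) (Val 3))
```

The same term is interpreted twice just by choosing the result type, for example `runEval (expr sample)` versus `runDepth (expr sample)`.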
The ADT is in a form you can inspect and manipulate in ways other than simply evaluating it. Once you hide all the interesting data in a function call, there is no longer anything you can do with it but evaluate it. Consider this definition, similar to the one in your question, but with a Var term to represent variables and with the Mul and Neg terms removed to focus on addition.
data Expr a = Constant a
            | Add (Expr a) (Expr a)
            | Var String
            deriving Show
The obvious function to write is eval, of course. It requires a way to look up a variable's value, and is straightforward:
-- cheating a little bit by assuming all Vars are defined
eval :: Num a => Expr a -> (String -> a) -> a
eval (Constant x) _env = x
eval (Add x y) env = eval x env + eval y env
eval (Var x) env = env x
But suppose you don't have a variable mapping yet. You have a large expression that you will evaluate many times, for different choices of variable. And some silly recursive function built up an expression like:
Add (Constant 1)
    (Add (Constant 1)
         (Add (Constant 1)
              (Add (Constant 1)
                   (Add (Constant 1)
                        (Add (Constant 1)
                             (Var "x"))))))
It would be wasteful to recompute 1+1+1+1+1+1 every time you evaluate this: wouldn't it be nice if your evaluator could realize that this is just another way of writing Add (Constant 6) (Var "x")?
So, you write an expression optimizer, which runs before any variables are available and tries to simplify expressions. There are many simplification rules you could apply, of course; below I've implemented just two very easy ones to illustrate the point.
simplify :: Num a => Expr a -> Expr a
simplify (Add (Constant x) (Constant y)) = Constant $ x + y
simplify (Add (Constant x) (Add (Constant y) z)) = simplify $ Add (Constant $ x + y) z
simplify x = x
Now how does our silly expression look?
> simplify $ Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Add (Constant 1) (Var "x"))))))
Add (Constant 6) (Var "x")
All the unnecessary stuff has been removed, and you now have a nice clean expression to try for various values of x.
How do you do the same thing with a representation of this expression in functions? You can't, because there is no "intermediate form" between the initial specification of the expression and its final evaluation: you can only look at the expression as a single, opaque function call. Evaluating it at a particular value of x necessarily evaluates each of the subexpressions anew, and there is no way to disentangle them.
Here's an extension of the functional type you propose in your question, again enriched with variables:
type FExpr a = (String -> a) -> a
lit :: a -> FExpr a
lit x _env = x
add :: Num a => FExpr a -> FExpr a -> FExpr a
add x y env = x env + y env
var :: String -> FExpr a
var x env = env x
with the same silly expression to evaluate many times:
sample :: Num a => FExpr a
sample = add (lit 1)
             (add (lit 1)
                  (add (lit 1)
                       (add (lit 1)
                            (add (lit 1)
                                 (add (lit 1)
                                      (var "x"))))))
It works as expected:
> sample $ \_var -> 5
11
But it has to do a bunch of addition every time you try it for a different x, even though the addition and the variable are mostly unrelated. And there's no way you can simplify the expression tree. You can't simplify it while defining it: that is, you can't make add smarter, because it can't inspect its arguments at all: its arguments are functions which, as far as add is concerned, could do anything at all. And you can't simplify it after constructing it, either: at that point you just have an opaque function that takes in a variable lookup function and produces a value.
By modeling the important parts of your problem as data types in their own right, you make them values that your program can manipulate intelligently. If you leave them as functions, you get a shorter program that is less powerful, because you lock all the information inside lambdas that only GHC can manipulate.
And once you've written it with ADTs, it's not hard to collapse that representation back into the shorter function-based representation if you want. That is, it might be nice to have a function of type
convert :: Expr a -> FExpr a
But in fact, we've already done this! That's exactly the type that eval has. You just might not have noticed because of the FExpr type alias, which is not used in the definition of eval.
So in a way, the ADT representation is more general and more powerful, acting as a tree that you can fold up in many different ways. One of those ways is evaluating it, in exactly the way that the function-based representation does. But there are others:
Simplify the expression before evaluating it
Produce a list of all the variables that must be defined for this expression to be well formed
Count how deeply nested the deepest part of the tree is, to estimate how many stack frames an evaluator might need
Convert the expression to a String approximating a Haskell expression you could type to get the same result
So if possible you'd like to work with information-rich ADTs for as long as you can, and then eventually fold the tree up into a more compact form once you have something specific to do with it.
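A few of the folds listed above can be sketched directly against the Expr type. The function names vars, depth and pretty are made up for this example, and pretty only approximates Haskell syntax.

```haskell
data Expr a = Constant a
            | Add (Expr a) (Expr a)
            | Var String
            deriving Show

-- All variables that must be defined before evaluation.
vars :: Expr a -> [String]
vars (Constant _) = []
vars (Add x y) = vars x ++ vars y
vars (Var v) = [v]

-- How deeply nested the deepest part of the tree is.
depth :: Expr a -> Int
depth (Add x y) = 1 + max (depth x) (depth y)
depth _ = 1

-- A String approximating the expression you could type.
pretty :: Show a => Expr a -> String
pretty (Constant x) = show x
pretty (Add x y) = "(" ++ pretty x ++ " + " ++ pretty y ++ ")"
pretty (Var v) = v
```

None of these can be written against the function-based FExpr representation, since it exposes no structure to traverse.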

What can type families do that multi param type classes and functional dependencies cannot

I have played around with TypeFamilies, FunctionalDependencies, and MultiParamTypeClasses. And it seems to me as though TypeFamilies doesn't add any concrete functionality over the other two. (But not vice versa). But I know type families are pretty well liked so I feel like I am missing something:
"open" relation between types, such as a conversion function, which does not seem possible with TypeFamilies. Done with MultiParamTypeClasses:
class Convert a b where
  convert :: a -> b
instance Convert Foo Bar where
  convert = foo2Bar
instance Convert Foo Baz where
  convert = foo2Baz
instance Convert Bar Baz where
  convert = bar2Baz
Surjective relation between types, such as a sort of type safe pseudo-duck typing mechanism, that would normally be done with a standard type family. Done with MultiParamTypeClasses and FunctionalDependencies:
class HasLength a b | a -> b where
  getLength :: a -> b
instance HasLength [a] Int where
  getLength = length
instance HasLength (Set a) Int where
  getLength = S.size
instance HasLength Event DateDiff where
  getLength event = dateDiff (start event) (end event)
Bijective relation between types, such as for an unboxed container, which could be done through TypeFamilies with a data family, although then you have to declare a new data type for every contained type, such as with a newtype. Either that or with an injective type family, which I think is not available prior to GHC 8. Done with MultiParamTypeClasses and FunctionalDependencies:
class Unboxed a b | a -> b, b -> a where
  toList :: a -> [b]
  fromList :: [b] -> a
instance Unboxed FooVector Foo where
  toList = fooVector2List
  fromList = list2FooVector
instance Unboxed BarVector Bar where
  toList = barVector2List
  fromList = list2BarVector
And lastly a surjective relation between two types and a third type, such as a Python 2 or Java style division function, which can be done with TypeFamilies by also using MultiParamTypeClasses. Done with MultiParamTypeClasses and FunctionalDependencies:
class Divide a b c | a b -> c where
  divide :: a -> b -> c
instance Divide Int Int Int where
  divide = div
instance Divide Int Double Double where
  divide = (/) . fromIntegral
instance Divide Double Int Double where
  divide = (. fromIntegral) . (/)
instance Divide Double Double Double where
  divide = (/)
One other thing I should also add is that it seems like FunctionalDependencies and MultiParamTypeClasses are also quite a bit more concise (for the examples above anyway) as you only have to write the type once, and you don't have to come up with a dummy type name which you then have to type for every instance like you do with TypeFamilies:
instance FooBar LongTypeName LongerTypeName where
  type FooBarResult LongTypeName LongerTypeName = LongestTypeName
  fooBar = someFunction
vs:
instance FooBar LongTypeName LongerTypeName LongestTypeName where
  fooBar = someFunction
So unless I am convinced otherwise it really seems like I should just not bother with TypeFamilies and use solely FunctionalDependencies and MultiParamTypeClasses. Because as far as I can tell it will make my code more concise, more consistent (one less extension to care about), and will also give me more flexibility such as with open type relationships or bijective relations (potentially the latter is solved by GHC 8).
Here's an example of where TypeFamilies really shines compared to MultiParamTypeClasses with FunctionalDependencies. In fact, I challenge you to come up with an equivalent MultiParamTypeClasses solution, even one that uses FlexibleInstances, OverlappingInstances, etc.
Consider the problem of type level substitution (I ran across a specific variant of this in Quipper in QData.hs). Essentially what you want to do is recursively substitute one type for another. For example, I want to be able to
substitute Int for Bool in Either [Int] String and get Either [Bool] String,
substitute [Int] for Bool in Either [Int] String and get Either Bool String,
substitute [Int] for [Bool] in Either [Int] String and get Either [Bool] String.
All in all, I want the usual notion of type level substitution. With a closed type family, I can do this for any types (albeit I need an extra line for each higher-kinded type constructor - I stopped at * -> * -> * -> * -> *).
{-# LANGUAGE TypeFamilies #-}
-- Substitute type `x` for type `y` in type `a`
type family Substitute x y a where
  Substitute x y x = y
  Substitute x y (k a b c d) = k (Substitute x y a) (Substitute x y b) (Substitute x y c) (Substitute x y d)
  Substitute x y (k a b c) = k (Substitute x y a) (Substitute x y b) (Substitute x y c)
  Substitute x y (k a b) = k (Substitute x y a) (Substitute x y b)
  Substitute x y (k a) = k (Substitute x y a)
  Substitute x y a = a
And trying at ghci I get the desired output:
> :t undefined :: Substitute Int Bool (Either [Int] String)
undefined :: Either [Bool] [Char]
> :t undefined :: Substitute [Int] Bool (Either [Int] String)
undefined :: Either Bool [Char]
> :t undefined :: Substitute [Int] [Bool] (Either [Int] String)
undefined :: Either [Bool] [Char]
With that said, maybe you should be asking yourself why you are using MultiParamTypeClasses and not TypeFamilies. Of the examples you gave above, all except Convert translate to type families (albeit you will need an extra line per instance for the type declaration).
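For instance, the Divide example from the question might translate along these lines. This is only a sketch: the associated type name Quotient is invented here, and the point-free instance bodies have been written out with explicit arguments.

```haskell
{-# LANGUAGE MultiParamTypeClasses, TypeFamilies #-}

class Divide a b where
  -- The result type is a function of the two argument types.
  type Quotient a b
  divide :: a -> b -> Quotient a b

instance Divide Int Int where
  type Quotient Int Int = Int
  divide = div

instance Divide Int Double where
  type Quotient Int Double = Double
  divide x y = fromIntegral x / y

instance Divide Double Int where
  type Quotient Double Int = Double
  divide x y = x / fromIntegral y

instance Divide Double Double where
  type Quotient Double Double = Double
  divide = (/)
```

The extra `type` line per instance is the cost mentioned above; in exchange the result type reads as a function of the inputs rather than as a third class parameter.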
Then again, for Convert, I am not convinced it is a good idea to define such a thing. The natural extension to Convert would be instances such as
instance (Convert a b, Convert b c) => Convert a c where
convert = convert . convert
instance Convert a a where
convert = id
which are as unresolvable for GHC as they are elegant to write...
To be clear, I am not saying there are no uses of MultiParamTypeClasses, just that when possible you should be using TypeFamilies - they let you think about type-level functions instead of just relations.
This old HaskellWiki page does an OK job of comparing the two.
EDIT
Some more contrast and history I stumbled upon from augustss's blog:
Type families grew out of the need to have type classes with
associated types. The latter is not strictly necessary since it can be
emulated with multi-parameter type classes, but it gives a much nicer
notation in many cases. The same is true for type families; they can
also be emulated by multi-parameter type classes. But MPTC gives a
very logic programming style of doing type computation; whereas type
families (which are just type functions that can pattern match on the
arguments) is like functional programming.
Using closed type families
adds some extra strength that cannot be achieved by type classes. To
get the same power from type classes we would need to add closed type
classes. Which would be quite useful; this is what instance chains
gives you.
Functional dependencies only affect the process of constraint solving, while type families introduced the notion of non-syntactic type equality, represented in GHC's intermediate form by coercions. This means type families interact better with GADTs. See this question for the canonical example of how functional dependencies fail here.

What's the difference between the "data" and "type" keywords?

The data and type keywords always confuse me.
I want to know what is the difference between data and type and how to use them.
type declares a type synonym. A type synonym is a new name for an existing type. For example, this is how String is defined in the standard library:
type String = [Char]
String is another name for a list of Chars. GHC will replace all usages of String in your program with [Char] at compile-time.
To be clear, a String literally is a list of Chars. It's just an alias. You can use all the standard list functions on String values:
-- length :: [a] -> Int
ghci> length "haskell"
7
-- reverse :: [a] -> [a]
ghci> reverse "functional"
"lanoitcnuf"
data declares a new data type, which, unlike a type synonym, is different from any other type. Data types have a number of constructors defining the possible cases of your type. For example, this is how Bool is defined in the standard library:
data Bool = False | True
A Bool value can be either True or False. Data types support pattern matching, allowing you to perform a runtime case-analysis on a value of a data type.
yesno :: Bool -> String
yesno True = "yes"
yesno False = "no"
data types can have multiple constructors (as with Bool), can be parameterised by other types, can contain other types inside them, and can recursively refer to themselves. Here's a model of exceptions which demonstrates this; an Error a contains an error message of type a, and possibly the error which caused it.
data Error a = Error { value :: a, cause :: Maybe (Error a) }
type ErrorWithMessage = Error String
myError1, myError2 :: ErrorWithMessage
myError1 = Error "woops" Nothing
myError2 = Error "myError1 was thrown" (Just myError1)
It's important to realise that data declares a new type which is apart from any other type in the system. If String had been declared as a data type containing a list of Chars (rather than a type synonym), you wouldn't be able to use any list functions on it.
data String = MkString [Char]
myString = MkString ['h', 'e', 'l', 'l', 'o']
myReversedString = reverse myString -- type error
There's one more variety of type declaration: newtype. This works rather like a data declaration - it introduces a new data type separate from any other type, and can be pattern matched - except you are restricted to a single constructor with a single field. In other words, a newtype is a data type which wraps up an existing type.
The important difference is the cost of a newtype: the compiler promises that a newtype is represented in the same way as the type it wraps. There's no runtime cost to packing or unpacking a newtype. This makes newtypes useful for making administrative (rather than structural) distinctions between values.
newtypes interact well with type classes. For example, consider Monoid, the class of types with a way to combine elements (mappend) and a special 'empty' element (mempty). Int can be made into a Monoid in many ways, including addition with 0 and multiplication with 1. How can we choose which one to use for a possible Monoid instance of Int? It's better not to express a preference, and use newtypes to enable either usage with no runtime cost. Paraphrasing the standard library:
-- introduce a type Sum with a constructor Sum which wraps an Int, and an extractor getSum which gives you back the Int
newtype Sum = Sum { getSum :: Int }
instance Monoid Sum where
  (Sum x) `mappend` (Sum y) = Sum (x + y)
  mempty = Sum 0
newtype Product = Product { getProduct :: Int }
instance Monoid Product where
  (Product x) `mappend` (Product y) = Product (x * y)
  mempty = Product 1
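On GHC 8.4 and later, Semigroup is a superclass of Monoid, so instances written as above need a Semigroup instance as well. A sketch of the same two wrappers in that style follows; total and factorial4 are hypothetical names added to show usage.

```haskell
newtype Sum = Sum { getSum :: Int }

-- Combining goes in Semigroup; Monoid only adds the identity element.
instance Semigroup Sum where
  Sum x <> Sum y = Sum (x + y)

instance Monoid Sum where
  mempty = Sum 0

newtype Product = Product { getProduct :: Int }

instance Semigroup Product where
  Product x <> Product y = Product (x * y)

instance Monoid Product where
  mempty = Product 1

-- The wrapper chosen at the call site decides which monoid is used.
total :: Int
total = getSum (mconcat (map Sum [1, 2, 3, 4]))

factorial4 :: Int
factorial4 = getProduct (mconcat (map Product [1, 2, 3, 4]))
```

Both wrappers are erased at run time, so picking one costs nothing beyond naming it.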
With data you create a new datatype and declare a constructor for it:
data NewData = NewDataConstructor
With type you define just an alias:
type MyChar = Char
In the type case you can pass value of MyChar type to function expecting a Char and vice versa, but you can't do this for data MyChar = MyChar Char.
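A small sketch of that difference. The Wrapped type stands in for `data MyChar = MyChar Char` so that both versions can live in one module; shout, shoutWrapped and aliasOk are illustrative names.

```haskell
import Data.Char (toUpper)

type MyChar = Char                                -- alias: same type as Char
data Wrapped = Wrapped Char deriving (Eq, Show)   -- a distinct new type

-- toUpper :: Char -> Char is accepted as-is, because MyChar is Char.
shout :: MyChar -> MyChar
shout = toUpper

-- For the data version the Char has to be unwrapped and rewrapped by hand.
shoutWrapped :: Wrapped -> Wrapped
shoutWrapped (Wrapped c) = Wrapped (toUpper c)

aliasOk :: Char
aliasOk = shout 'a'

-- Expected to be rejected, because Wrapped is not Char:
-- bad = toUpper (Wrapped 'a')
```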
type works just like let: it allows you to give a re-usable name to something, but that something will always work just as if you had inlined the definition. So
type ℝ = Double
f :: ℝ -> ℝ -> ℝ
f x y = let x2 = x^2
        in x2 + y
behaves exactly the same way as
f' :: Double -> Double -> Double
f' x y = x^2 + y
as in: you can anywhere in your code replace f with f' and vice versa; nothing would change.
OTOH, both data and newtype create an opaque abstraction. They are more like a class constructor in OO: even though some value is implemented simply in terms of a single number, it doesn't necessarily behave like such a number. For instance,
newtype LogScaledℝ = LogScaledℝ { getLogscaled :: Double }
instance Num LogScaledℝ where
  LogScaledℝ a + LogScaledℝ b = LogScaledℝ $ a*b
  LogScaledℝ a - LogScaledℝ b = LogScaledℝ $ a/b
  LogScaledℝ a * LogScaledℝ b = LogScaledℝ $ a**b
Here, although LogScaledℝ is data-wise still just a Double number, it clearly behaves differently from Double.

Why context is not considered when selecting typeclass instance in Haskell?

I understand that when having
instance (Foo a) => Bar a
instance (Xyy a) => Bar a
GHC doesn't consider the contexts, and the instances are reported as duplicate.
What is counterintuitive is that (I guess) after selecting an instance, it still needs to check whether the context matches, and if not, discard the instance. So why not reverse the order: discard instances with non-matching contexts, and proceed with the remaining set?
Would this be intractable in some way? I see how it could cause more constraint resolution work upfront, but just as there is UndecidableInstances / IncoherentInstances, couldn't there be a ConsiderInstanceContexts when "I know what I am doing"?
This breaks the open-world assumption. Assume:
class B1 a
class B2 a
class T a
If we allow constraints to disambiguate instances, we may write
instance B1 a => T a
instance B2 a => T a
And may write
instance B1 Int
Now, if I have
f :: T a => a
Then f :: Int works. But the open-world assumption says that once something works, adding more instances cannot break it. Our new system doesn't obey this:
instance B2 Int
will make f :: Int ambiguous. Which implementation of T should be used?
Another way to state this is that you've broken coherence. For typeclasses to be coherent means that there is only one way to satisfy a given constraint. In normal Haskell, a constraint c has only one implementation. Even with overlapping instances, coherence generally holds true. The idea is that instance T a and instance {-# OVERLAPPING #-} T Int do not break coherence, because GHC can't be tricked into using the former instance in a place where the latter would do. (You can trick it with orphans, but you shouldn't.) Coherence, at least to me, seems somewhat desirable. Typeclass usage is "hidden", in some sense, and it makes sense to enforce that it be unambiguous. You can also break coherence with IncoherentInstances and/or unsafeCoerce, but, y'know.
In a category theoretic way, the category Constraint is thin: there is at most one instance/arrow from one Constraint to another. We first construct two arrows a : () => B1 Int and b : () => B2 Int, and then we break thinness by adding new arrows x_Int : B1 Int => T Int, y_Int : B2 Int => T Int such that x_Int . a and y_Int . b are both arrows () => T Int that are not identical. Diamond problem, anyone?
This does not answer your question as to why this is the case. Note, however, that you can always define a newtype wrapper to disambiguate between the two instances:
newtype FooWrapper a = FooWrapper a
newtype XyyWrapper a = XyyWrapper a
instance (Foo a) => Bar (FooWrapper a)
instance (Xyy a) => Bar (XyyWrapper a)
This has the added advantage that by passing around either a FooWrapper or an XyyWrapper you explicitly control which of the two instances you'd like to use if your a happens to satisfy both.
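To make the wrapper idea concrete, here is a small self-contained sketch. The class methods (foo, xyy, bar) and the Int instances are assumptions invented for the example; only the wrapper types and instance heads come from the answer above:

```haskell
-- Hypothetical single-method classes standing in for Foo, Xyy and Bar.
class Foo a where foo :: a -> String
class Xyy a where xyy :: a -> String
class Bar a where bar :: a -> String

newtype FooWrapper a = FooWrapper a
newtype XyyWrapper a = XyyWrapper a

-- The two instance heads are different types, so GHC accepts both.
instance Foo a => Bar (FooWrapper a) where
  bar (FooWrapper x) = "via Foo: " ++ foo x

instance Xyy a => Bar (XyyWrapper a) where
  bar (XyyWrapper x) = "via Xyy: " ++ xyy x

-- Int satisfies both contexts; the wrapper decides which instance is used.
instance Foo Int where foo n = "foo " ++ show n
instance Xyy Int where xyy n = "xyy " ++ show n

viaFoo, viaXyy :: String
viaFoo = bar (FooWrapper (1 :: Int))
viaXyy = bar (XyyWrapper (1 :: Int))
```

The cost is that callers have to wrap and unwrap values, but the choice of instance is visible in the types.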
Classes are a bit weird. The original idea (which still pretty much works) is a sort of syntactic sugar around what would otherwise be data statements. For example you can imagine:
data Num a = Num {plus :: a -> a -> a, ... , fromInt :: Integer -> a}
numInteger :: Num Integer
numInteger = Num (+) ... id
then you can write functions which have e.g. type:
test :: Num x -> x -> x -> x -> x
test lib a b c = a + b * (abs (c + b))
  where (+) = plus lib
        (*) = times lib
        abs = absoluteValue lib
So the idea is "we're going to automatically derive all of this library code." The question is, how do we find the library that we want? It's easy if we have a library of type Num Int, but how do we extend it to "constrained instances" based on functions of type:
fooLib :: Foo x -> Bar x
xyyLib :: Xyy x -> Bar x
The present solution in Haskell is to do a type-pattern-match on the output-types of those functions and propagate the inputs to the resulting declaration. But when there are two outputs of the same type, we would need a combinator which merges these into:
eitherLib :: Either (Foo x) (Xyy x) -> Bar x
and basically the problem is that there is no good constraint-combinator of this kind right now. That's your objection.
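The dictionary view described above can be written out as ordinary Haskell. This is only a sketch: NumLib is a made-up name (chosen to avoid clashing with the Prelude's Num), it lists just three fields instead of the full class, and ShowLib/listLib are hypothetical stand-ins for a "constrained instance" such as fooLib:

```haskell
-- A hand-written "class dictionary": a record of the class methods.
data NumLib a = NumLib
  { plus          :: a -> a -> a
  , times         :: a -> a -> a
  , absoluteValue :: a -> a
  }

-- The "instance" for Integer is just a value of that record type.
numInteger :: NumLib Integer
numInteger = NumLib (+) (*) abs

-- The dictionary is passed explicitly instead of being found by the compiler.
test :: NumLib x -> x -> x -> x -> x
test lib a b c = a `add` (b `mul` absV (c `add` b))
  where
    add  = plus lib
    mul  = times lib
    absV = absoluteValue lib

-- A "constrained instance" is a function from one dictionary to another,
-- analogous in shape to fooLib :: Foo x -> Bar x.
newtype ShowLib a = ShowLib { render :: a -> String }

listLib :: ShowLib a -> ShowLib [a]
listLib d = ShowLib (concatMap (render d))
```

For example, test numInteger 1 2 (-5) computes 1 + 2 * abs (-5 + 2). With explicit dictionaries the caller picks the library by hand; instance resolution is the part that the compiler automates, and it is that automatic search which needs a unique answer.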
Well, that's true, but there are ways to achieve something morally similar in practice. Suppose we define some functions with types:
data F
data X
foobar'lib :: Foo x -> Bar' x F
xyybar'lib :: Xyy x -> Bar' x X
bar'barlib :: Bar' x y -> Bar x
Clearly the y is a sort of "phantom type" threaded through all of this, but it remains powerful because given that we want a Bar x we will propagate the need for a Bar' x y and given the need for the Bar' x y we will generate either a Bar' x F or a Bar' x X. So with phantom types and multi-parameter type classes, we get the result we want.
More info: https://www.haskell.org/haskellwiki/GHC/AdvancedOverlap
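One way the phantom-tag idea might look as actual type classes is sketched below. It is a simplification, not the wiki's exact code: the tag for each type is chosen by hand through a hypothetical Choose class with a functional dependency, and the methods foo, xyy, bar and bar' are invented for the example. The listed extensions are assumed to be necessary for the catch-all Bar instance:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances, ScopedTypeVariables #-}
import Data.Proxy (Proxy (..))

class Foo a where foo :: a -> String
class Xyy a where xyy :: a -> String

-- Phantom tags: they have no values and only exist at the type level.
data F
data X

-- Bar' carries the tag, so its two instance heads no longer coincide.
class Bar' a tag where
  bar' :: Proxy tag -> a -> String

instance Foo a => Bar' a F where
  bar' _ x = "via Foo: " ++ foo x

instance Xyy a => Bar' a X where
  bar' _ x = "via Xyy: " ++ xyy x

-- Hypothetical selector: each type names its tag exactly once.
class Choose a tag | a -> tag

class Bar a where
  bar :: a -> String

-- The single Bar instance looks up the tag and delegates to Bar'.
instance (Choose a tag, Bar' a tag) => Bar a where
  bar = bar' (Proxy :: Proxy tag)

instance Foo Int where foo n = "foo " ++ show n
instance Choose Int F

instance Xyy Bool where xyy b = "xyy " ++ show b
instance Choose Bool X
```

Here bar (1 :: Int) goes through the Foo route and bar True through the Xyy route. A type that satisfies both Foo and Xyy still gets exactly one tag, so resolution stays unambiguous.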
Adding backtracking would make instance resolution require exponential time, in the worst case.
Essentially, instances become logical statements of the form
P(x) => R(f(x)) /\ Q(x) => R(f(x))
which is equivalent to
(P(x) \/ Q(x)) => R(f(x))
Computationally, the cost of this check is (in the worst case)
c_R(n) = c_P(n-1) + c_Q(n-1)
assuming P and Q have similar costs
c_R(n) = 2 * c_PQ(n-1)
which leads to exponential growth.
To avoid this issue, it is important to have fast ways to choose a branch, i.e. to have clauses of the form
((fastP(x) /\ P(x)) \/ (fastQ(x) /\ Q(x))) => R(f(x))
where fastP and fastQ are computable in constant time, and are incompatible so that at most one branch needs to be visited.
Haskell decided that this "fast check" is head compatibility (hence disregarding contexts). It could use other fast checks, of course -- it's a design decision.
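The two cost recurrences can be compared with a toy model. This is only an illustration under simplifying assumptions: every goal of depth n has exactly two candidate instances, each candidate needs one subgoal of depth n-1, and a base goal costs one step. The function names are made up:

```haskell
-- Backtracking on contexts: both candidates must be explored,
-- so c(n) = 2 * c(n-1), which grows as 2^n.
costBacktracking :: Int -> Integer
costBacktracking 0 = 1
costBacktracking n = 2 * costBacktracking (n - 1)

-- Committing on the instance head: one constant-time check picks the
-- branch, so c(n) = 1 + c(n-1), which grows linearly.
costHeadMatch :: Int -> Integer
costHeadMatch 0 = 1
costHeadMatch n = 1 + costHeadMatch (n - 1)
```

Under this model a goal of depth 10 takes 1024 steps with backtracking and 11 steps with head matching.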

Resources