Is Haskell's "variable" eagerly evaluated? [duplicate]

I am wondering why :sprint reports xs = _ in this case:
Prelude> xs = map (+1) [1..10]
Prelude> length xs
10
Prelude> :sprint xs
xs = _
but not in this case:
Prelude> xs = map (+1) [1..10] :: [Int]
Prelude> length xs
10
Prelude> :sprint xs
xs = [_,_,_,_,_,_,_,_,_,_]
Note: I am running ghci with -XNoMonomorphismRestriction. Does it have to do with the fact that the type of xs is polymorphic in the first case but not in the second? I'd like to know what's going on internally.

The gist is that the polymorphic xs has a type of the form
xs :: Num a => [a]
Type classes under the hood are really just functions: they take an extra argument, filled in automatically by GHC, that contains a record of the type class's functions. So you can think of xs as having the type
xs :: NumDict a -> [a]
So when you run
Prelude> length xs
GHC has to choose some type for a and find the corresponding NumDict value. IIRC it'll fill it with Integer, so you're actually calling a function with the Integer dictionary and checking the length of the resulting list.
When you then :sprint xs, you once again fill in that argument, this time with a fresh type variable. The point is that you're getting an entirely different list: you gave it a different NumDict, so it's not forced in any way by the earlier call to length.
This is very different from the explicitly monomorphic list, where there really is only one list. There's only one value to force, so when you call length, it forces it for all future uses of xs.
To make this a bit clearer, consider the code
import Data.List (foldl1')
data Smash a = Smash { smash :: a -> a -> a }
-- ^ Think of Monoids
intSmash :: Smash Int
intSmash = Smash (+)
listSmash :: Smash [a]
listSmash = Smash (++)
join :: Smash a -> [a] -> a
join (Smash s) xs = foldl1' s xs
This is really what type classes are like under the hood: GHC would automatically fill in that first Smash a argument for us. Your first example is like join, where we can't make any assumptions about what the output will be as we apply it to different types, but your second example is more like
join' :: [Int] -> Int
join' = join intSmash
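The NumDict reading of xs described above can be sketched in the same explicit style. This is only an illustration: NumDict, its fields, intDict, integerDict, xsPoly and xsMono are hypothetical names, and GHC's real Num dictionary has more methods than the two shown here.

```haskell
-- Hypothetical, cut-down stand-in for the dictionary GHC passes for Num.
data NumDict a = NumDict
  { plus    :: a -> a -> a
  , fromInt :: Integer -> a
  }

intDict :: NumDict Int
intDict = NumDict (+) fromInteger

integerDict :: NumDict Integer
integerDict = NumDict (+) id

-- The polymorphic xs with its dictionary made explicit: a function, not a list.
-- Every application builds a new list, so forcing one result says nothing
-- about the next one.
xsPoly :: NumDict a -> [a]
xsPoly d = map (\n -> plus d (fromInt d n) (fromInt d 1)) [1 .. 10]

-- The monomorphic xs: the dictionary is applied once, leaving a single
-- shared list value that stays evaluated after length has walked it.
xsMono :: [Int]
xsMono = xsPoly intDict
```

Under this reading, length xs in the first session corresponds to length (xsPoly integerDict), which evaluates a list that is then discarded, while the second session keeps working with the one list xsMono.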

Related

Length with foldl and foldr

I have two functions computing the length of a list of integers
lengthFoldl :: [Int] -> Int
lengthFoldl xs = (foldl (\_ y -> y+1) 0 xs)
and
lengthFold :: [a] -> Int
lengthFold xs = foldr (\_ y -> y+1) 0 xs
They are the same except that one uses foldr and the other foldl.
But when trying to compute the length of any list [1 .. n] I get a wrong result (one too big) from lengthFoldl.
To complement joelfischerr's answer, I'd like to point out that a hint is given by the types of your functions.
lengthFoldl :: [Int] -> Int
lengthFold :: [a] -> Int
Why are they different? I guess you might have had to change the first one to take an [Int] since with [a] it did not compile. This is however a big warning sign!
If it is indeed computing the length, why should lengthFoldl care about what is the type of the list elements? Why do we need the elements to be Ints? There is only one possible explanation for Int being needed: looking at the code
lengthFoldl xs = foldl (\_ y -> y+1) 0 xs
we can see that the only numeric variable here is y. If y is forced to be a number, and list elements are also forced to be numbers, it seems as if y is taken to be a list element!
And indeed that is the case: foldl passes to the function the accumulator first, the list element second, unlike foldr.
The general rule of thumb is: when type and code do not agree, one should think carefully about which one is right. I'd say that most Haskellers would think that, in most cases, it is easier to get the type right than the code right. So, one should not just adapt the type to the code to force it to compile: a type error can instead witness a bug in the code.
Looking at the type definitions of foldl and foldr it becomes clear what the issue is.
:t foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
and
:t foldl
foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b
One can see that foldr passes the list element to the function as its first argument and the accumulator as its second, whereas foldl passes the accumulator first and the list element second.
Changing lengthFoldl to this solves the problem
lengthFoldl :: [Int] -> Int
lengthFoldl xs = foldl (\y _ -> y+1) 0 xs
Edit: Using foldl instead of foldl' is a bad idea: https://wiki.haskell.org/Foldr_Foldl_Foldl'
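Putting the two answers together, a minimal sketch of both folds with the fully polymorphic type, using foldl' as the edit suggests. The names lengthFoldr and lengthBuggy are introduced here for illustration; lengthBuggy reproduces the lambda from the question so the difference can be observed directly.

```haskell
import Data.List (foldl')

-- foldl' hands the accumulator to the function first, the element second.
lengthFoldl :: [a] -> Int
lengthFoldl = foldl' (\acc _ -> acc + 1) 0

-- foldr hands the element first, the accumulator second.
lengthFoldr :: [a] -> Int
lengthFoldr = foldr (\_ acc -> acc + 1) 0

-- The lambda from the question: it ignores the accumulator and adds 1 to
-- the current element, so the result is (last element + 1), not the length.
-- It only type checks because the elements are forced to be Int.
lengthBuggy :: [Int] -> Int
lengthBuggy = foldl (\_ y -> y + 1) 0
```

For [1 .. n] the buggy version returns n + 1, which matches the "one too big" symptom; for a list such as [10, 20] it returns 21, which makes it clear that the result has nothing to do with the length.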

Why doesn't this function work if I use "[xs]" instead of "xs"?

split :: [a] -> Int -> ([a], [a])
split [xs] n =
(take n [xs], drop n [xs])
The same code works if I give the variable as xs instead of [xs]; the signatures are the same in both cases. Using [xs] gives an error that the pattern is non-exhaustive. I understand it's telling me that the input I gave is not covered by my code, but it's not clear what is happening under the hood.
Test input: [1,2,3] 2.
Somehow a lot of people think that [xs] as a pattern means that you unify a list with xs. But this is incorrect: the function signature (either inferred or stated explicitly) already prevents you from calling the function with a non-list argument.
A list has two constructors:
the empty list []; and
the "cons" (h : t) with h the head (first element), and t the tail (a list with the remaining elements).
Haskell however introduces some syntactical sugar as well. For example [1] is short for (1:[]), and [1, 4, 2] for (1:(4:(2:[]))).
So that means that if you write [xs], behind the curtains you defined a pattern (xs: []) which thus means you match all lists with exactly one element, and that single element (not the entire list) is then xs.
Anyway, the solution is to use:
split xs n = (take n xs, drop n xs)
Since both take :: Int -> [a] -> [a] and drop :: Int -> [a] -> [a] expect an Int as their first argument and a list as their second, Haskell will infer automatically that n is an Int and xs an [a].
Note that you can use splitAt :: Int -> [a] -> ([a], [a]) as well. We can make the signature equivalent to the one you target with:
split = flip splitAt
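To see which lists each pattern covers, here is a small sketch. The function describe is a hypothetical helper added for illustration; split is the corrected definition from the answer.

```haskell
-- Hypothetical helper: reports which list pattern an argument matches.
describe :: [a] -> String
describe []      = "empty"
describe [_]     = "exactly one element"   -- sugar for (_ : [])
describe (_ : _) = "two or more elements"  -- reached only for longer lists

-- A plain variable pattern matches every list, whatever its length.
split :: [a] -> Int -> ([a], [a])
split xs n = (take n xs, drop n xs)
```

The test input [1,2,3] falls into the last case of describe, which is why a definition whose only clause uses the pattern [xs] fails on it with a non-exhaustive pattern error.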

Using a different ordering on lists

In Haskell, the default ordering for [a], given an ordering on a, seems to be a lexicographic ordering (side question: where can I find out if this is really the case)? What I want is a graded lexicographic ordering (also called "length plus lexicographic" ordering).
How would I specify that I want comparisons to be done in a graded lexicographical manner? I want it for only one type, not for all [a]. I tried this:
instance Ord [Int] where
  compare xs ys = case compare (length xs) (length ys) of
    LT -> LT
    GT -> GT
    EQ -> lexicographic_compare xs ys
but got this error message:
> [1 of 1] Compiling Main ( test.hs, interpreted )
test.hs:1:10:
Illegal instance declaration for `Ord [Int]'
(All instance types must be of the form (T a1 ... an)
where a1 ... an are *distinct type variables*,
and each type variable appears at most once in the instance head.
Use -XFlexibleInstances if you want to disable this.)
In the instance declaration for `Ord [Int]'
Failed, modules loaded: none.
Thanks for any and all help!
This is a typical application for a newtype wrapper:
import Data.Monoid ((<>))
import Data.Ord (comparing)
newtype GradedLexOrd a = GradedLexOrd { runGradedLexOrd :: [a] }
  deriving Eq -- Ord needs an Eq instance as its superclass
instance (Ord a) => Ord (GradedLexOrd a) where
  compare (GradedLexOrd xs) (GradedLexOrd ys) = gradedLexOrd xs ys
gradedLexOrd :: Ord a => [a] -> [a] -> Ordering
gradedLexOrd = comparing length <> compare -- Nice Monoid-based implementation,
                                           -- due to Aaron Roth (see answer below)
Alternatively, you could openly use lists, but instead of the Ord constrained functions like sort use the more general alternatives which accept a custom comparison function, e.g. sortBy gradedLexOrd.
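Both routes can be sketched side by side. The helper names sortViaNewtype and sortViaSortBy are hypothetical, and the Eq and Show instances are derived here only so the snippet is self-contained.

```haskell
import Data.List (sort, sortBy)
import Data.Ord (comparing)

newtype GradedLexOrd a = GradedLexOrd { runGradedLexOrd :: [a] }
  deriving (Eq, Show)

instance Ord a => Ord (GradedLexOrd a) where
  compare (GradedLexOrd xs) (GradedLexOrd ys) = gradedLexOrd xs ys

-- Compare by length first, then lexicographically.
gradedLexOrd :: Ord a => [a] -> [a] -> Ordering
gradedLexOrd = comparing length <> compare

-- Route 1: wrap, use the ordinary Ord-based sort, unwrap.
sortViaNewtype :: Ord a => [[a]] -> [[a]]
sortViaNewtype = map runGradedLexOrd . sort . map GradedLexOrd

-- Route 2: keep plain lists and pass the comparison explicitly.
sortViaSortBy :: Ord a => [[a]] -> [[a]]
sortViaSortBy = sortBy gradedLexOrd
```

The newtype route pays off when the values live in an Ord-keyed container such as a Set or Map; for a one-off sort, sortBy is usually the lighter choice.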
There are two questions here:
What does Ord [a] look like?
Of course you can experiment within GHCi, but maybe you want something more reliable. This is surprisingly difficult, especially as the definition of Lists is (due to their special syntax) built into the compiler. Let’s ask GHCi:
Prelude> :info []
data [] a = [] | a : [a] -- Defined in `GHC.Types'
instance Eq a => Eq [a] -- Defined in `GHC.Classes'
instance Monad [] -- Defined in `GHC.Base'
instance Functor [] -- Defined in `GHC.Base'
instance Ord a => Ord [a] -- Defined in `GHC.Classes'
instance Read a => Read [a] -- Defined in `GHC.Read'
instance Show a => Show [a] -- Defined in `GHC.Show'
It says that the instance is defined in GHC.Classes, which we find in GHC’s git repo, and there it says:
instance (Ord a) => Ord [a] where
    {-# SPECIALISE instance Ord [Char] #-}
    compare []     []     = EQ
    compare []     (_:_)  = LT
    compare (_:_)  []     = GT
    compare (x:xs) (y:ys) = case compare x y of
                                EQ    -> compare xs ys
                                other -> other
So yes, it is indeed the lexicographic ordering.
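A quick way to check this behaviour without reading the source is to evaluate a few comparisons. The binding names below are made up for illustration.

```haskell
-- The first differing element decides, regardless of length.
shorterButLarger :: Ordering
shorterButLarger = compare [3] [1, 2 :: Int]   -- GT: 3 > 1

-- A proper prefix is smaller than the longer list.
prefixIsSmaller :: Ordering
prefixIsSmaller = compare "ab" "abc"           -- LT

-- The empty list is smaller than any non-empty list.
emptyIsSmallest :: Ordering
emptyIsSmallest = compare [] [0 :: Int]        -- LT
```

The first case is where the default instance and a graded lexicographic ordering disagree: a length-first comparison would put [3] before [1, 2].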
How to override the ordering?
Don’t. There is an instance for [a] and there can be only one. With FlexibleInstances and OverlappingInstances, you could make it use an alternative instance for, say, [Int], but it is bad style. As leftaroundabout writes, use a newtype wrapper for it, or use parametrized functions like sortBy.
Creating a whole new Ord instance for lists of Ints seems a bit heavyweight to my taste (not to mention that you may be sowing confusion: someone who comes along to your code later will probably expect the default, non-graded lexicographic comparison behavior).
If you're merely hoping not to have to copy your custom comparison code every time you use sortBy or the like, there's actually a fairly lightweight way of defining chained comparison functions like yours on the spot. Ordering, as it happens, is an instance of Monoid, which means you can compare two things according to a succession of criteria, then combine the resulting Orderings of those comparisons using the Monoid function, mappend (recently abbreviated to <>). This is all explained in some detail in the Learn You a Haskell chapter on Monoids, etc., which is where I picked up the trick. So:
import Data.Monoid ((<>))
import Data.Ord (comparing)
gradedLexicographicCompare :: (Ord a) => [a] -> [a] -> Ordering
gradedLexicographicCompare xs ys = comparing length xs ys <> comparing id xs ys
(Of course, comparing id is just compare, but for the sake of uniformity...) Then it becomes relatively unburdensome to write things like
f = ... sortBy s ...
where
...
s xs ys = comparing length xs ys <> compare xs ys
...
And this also has the virtue that your successor will see immediately that you're using a custom comparison function.
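For readers who have not met the Ordering monoid before, a short sketch of how <> combines results, followed by the on-the-spot comparison in use. The names combined and byLengthThenLex are hypothetical.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- The leftmost non-EQ result wins; EQ defers to the next criterion.
combined :: [Ordering]
combined = [LT <> GT, EQ <> GT, GT <> LT, EQ <> EQ]
-- evaluates to [LT, GT, GT, EQ]

-- Sort by length first, breaking ties with the default list ordering.
byLengthThenLex :: [String] -> [String]
byLengthThenLex = sortBy (\xs ys -> comparing length xs ys <> compare xs ys)
```

Because EQ is the identity element, further tie-breakers can be chained with more <> without changing the results of the earlier criteria.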
Update: leftaroundabout points out below that we can achieve even greater elegance -- this is Haskell after all, and in Haskell we can always achieve greater elegance -- by making use of the monoid instance, instance Monoid b => Monoid (a -> b). That is, a function whose result is a monoid can itself be considered a monoid. The instance is given by
instance Monoid b => Monoid (a -> b) where
    mempty _ = mempty
    mappend f g x = f x `mappend` g x        (1)
Now let's indulge in a little equational reasoning and see what comparing length <> compare expands to according to this instance. Applying (1) once, we have
comparing length <> compare
= mappend (comparing length) compare
= \xs -> mappend ((comparing length) xs) (compare xs) (2)
But ((comparing length) xs) :: [a] -> Ordering and (compare xs) :: (Ord a) => [a] -> Ordering are themselves functions whose results are monoids, namely Orderings, so we can apply (1) a second time to obtain
mappend ((comparing length) xs) (compare xs)
= \ys -> mappend (((comparing length) xs) ys) ((compare xs) ys) (3)
But now (((comparing length) xs) ys) and ((compare xs) ys) are fully applied functions. Specifically, they are Orderings, and from the original answer we know how to combine two Orderings using mappend from the Ordering instance of Monoid. (Note that we are not using mappend from (1).) Writing everything down in one big chain, we have
comparing length <> compare
= mappend (comparing length) compare [definition of <>]
= \xs -> mappend ((comparing length) xs) (compare xs) [by (1)]
= \xs -> (\ys -> mappend (((comparing length) xs) ys) ((compare xs) ys)) [substituting (3) in (2)]
= \xs -> \ys -> mappend (comparing length xs ys) (compare xs ys) [function application is left associative]
= \xs -> \ys -> comparing length xs ys <> compare xs ys [definition of <>]
And the last line of this expansion is just our original gradedLexicographicCompare! After a long, long digression, then, the punchline is that we can write the elegantly point-free
gradedLexicographicCompare = comparing length <> compare
Pretty.
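The equational argument can also be spot-checked by running both forms on some sample lists. This is only a sanity check, not a proof, and the names pointful, pointFree and agreeOn are introduced here for illustration.

```haskell
import Data.Ord (comparing)

-- The fully applied form from the original answer.
pointful :: Ord a => [a] -> [a] -> Ordering
pointful xs ys = comparing length xs ys <> compare xs ys

-- The point-free form, relying on the Monoid instance for functions.
pointFree :: Ord a => [a] -> [a] -> Ordering
pointFree = comparing length <> compare

-- True when both forms give the same answer on every pair of samples.
agreeOn :: Ord a => [[a]] -> Bool
agreeOn samples =
  and [pointful xs ys == pointFree xs ys | xs <- samples, ys <- samples]
```

For example, agreeOn [[], [1], [2], [1, 2], [0, 5, 1]] compares all 25 pairs of these lists under both definitions.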
