Using Haskell ranges: Why would mapping a floating point function across a range cause it to return an extra element? - haskell

I know that floats can lead to odd behavior in ranges due to their imprecise nature.
I would expect the possibility of imprecise values. For instance:
[0.1,0.3..1] might give [0.1,0.3,0.5,0.7,0.8999999999999999] instead of [0.1,0.3,0.5,0.7,0.9]
In addition to the precision loss, however, I get an extra element:
ghci> [0.1,0.3..1]
[0.1,0.3,0.5,0.7,0.8999999999999999,1.0999999999999999]
This is weird, but explained here. I could work around it like this, I suppose:
ghci> [0.1,0.3..0.99]
[0.1,0.3,0.5,0.7,0.8999999999999999]
But that's kind of gross. Maybe there's a cleaner way. For this simple example, of course, I could just use the range [0.1,0.3..0.9] and everything is fine.
But in a more complex example, I may not quickly know (or care to figure out, if I'm lazy) the exact upper bound I should use. So, I'll just make a range of integers and then divide by 10, right? Nope:
ghci> map (/10) [1,3..10]
[0.1,0.3,0.5,0.7,0.9,1.1]
Any floating point function seems to cause this behavior:
ghci> map (*1.0) [1,3..10]
[1.0,3.0,5.0,7.0,9.0,11.0]
Whereas a non-floating function doesn't:
ghci> map (*1) [1,3..10]
[1,3,5,7,9]
While it seems unlikely, I thought that maybe some lazy evaluation was at play, and tried to force evaluation of the range first:
ghci> let list = [1,3..10] in seq list (map (*1.0) list)
[1.0,3.0,5.0,7.0,9.0,11.0]
Obviously, using the literal list instead of the range works fine:
ghci> map (*1.0) [1,3,5,7,9]
[1.0,3.0,5.0,7.0,9.0]
ghci> let list = [1,3,5,7,9] in seq list (map (*1.0) list)
[1.0,3.0,5.0,7.0,9.0]
It isn't just mapping either:
ghci> last [1,3..10]
9
ghci> 1.0 * (last [1,3..10])
11.0
How can applying a function to the result of a range affect the actual evaluated result of that range?

I answered this for myself as I was writing it.
Haskell uses type inference, so when it sees a floating point function being mapped over a list (or used on an element of that list, as in my example using last), it is going to infer the type of that list to be floating point and therefore evaluate the range as if it were [1,3..10] :: [Double] (Double being the default fractional type) instead of what I was intending, which is [1,3..10] :: [Int]
At this point, it uses the floating point rules for enumerating, as described in the post that I linked to in the question.
The expected behavior can be forced like this:
ghci> map (\x -> (fromIntegral x) / 10) ([1,3..10]::[Int])
[0.1,0.3,0.5,0.7,0.9]
Relying on Haskell's type inference, we can drop the ::[Int] since fromIntegral causes our lambda expression to have the correct type:
ghci> :t (\x -> (fromIntegral x) / 10)
(\x -> (fromIntegral x) / 10)
:: (Fractional a, Integral a1) => a1 -> a
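A minimal sketch of the difference, written as a module rather than at the GHCi prompt. The names asInts and asDoubles are hypothetical, introduced only for illustration. The point is that the element type of the range, not the function mapped over it, decides which enumeration rule applies.

```haskell
-- The range is enumerated as Int and converted afterwards: five elements.
asInts :: [Double]
asInts = map (\n -> fromIntegral n / 10) ([1,3..10] :: [Int])

-- The range itself is enumerated as Double, so the fractional rule
-- (the upper limit may be overshot by up to half a step) applies: six elements.
asDoubles :: [Double]
asDoubles = map (/ 10) [1,3..10]
```

Here asInts stops at 0.9, while asDoubles ends with the extra 1.1 seen in the question.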

Related

Why the pointfree style does not cause a problem?

I read about The Monomorphism Restriction from the page https://www.haskell.org/tutorial/pitfalls.html and could not understand the last point:
A common violation of the restriction happens with functions defined
in a higher-order manner, as in this definition of sum from the
Standard Prelude:
sum = foldl (+) 0
As is, this would cause a static type error. We can fix the problem by
adding the type signature:
sum :: (Num a) => [a] -> a
Also note that this problem would not have arisen if we had written:
sum xs = foldl (+) 0 xs
because the restriction only applies to pattern bindings.
Why does the last point not cause any error?
because the restriction only applies to pattern bindings.
Essentially, the MR does not apply when we are defining a function using a function binding of the form
f arg1 ... argN = ...
with N > 0.
The intuition is as follows. The purpose of the MR is to avoid turning Haskell non-functions into lower-level functions accidentally. For instance,
x = 3 + 4
is not a function. However, its type is Num a => a, which is usually implemented as a function from a Num dictionary to the result of 3+4 where + is a function defined by the dictionary. This can lead to bad performance, since every time we use x the sum will need to be recomputed from scratch. This is unavoidable if we want to compute print (x :: Int) >> print (x :: Double), for instance. But actually using x at different types is rather uncommon.
So, the MR makes x monomorphic, preventing us from using it at more than a single type. In that way, recomputation can be avoided.
However, if x is already a function there is no harm in keeping that polymorphic, since we are "recomputing" function calls anyway. So, the MR does not apply to function bindings.
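The two forms that avoid the restriction can be sketched as follows. The names sumArgs and sumPointFree are hypothetical, chosen so that they do not clash with the Prelude's sum.

```haskell
-- Function binding (an argument appears on the left-hand side): the
-- restriction does not apply, so the inferred type stays polymorphic.
sumArgs xs = foldl (+) 0 xs

-- Pattern binding: the explicit signature keeps it polymorphic as well.
sumPointFree :: Num a => [a] -> a
sumPointFree = foldl (+) 0
```

Both definitions can then be used at Int and at Double within the same module.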

About value in context (applied in Monad)

I have a small question about value in context.
Take Just 'a', so the value in context of type Maybe in this case is 'a'
Take [3], so value in context of type [a] in this case is 3
And if you apply the monad for [3] like this: [3] >>= \x -> [x+3], it means you assign x with value 3. It's ok.
But now, take [3,2], so what is the value in the context of type [a]?. And it's so strange that if you apply monad for it like this:
[3,4] >>= \x -> x+3
It got the correct answer [6,7], but actually we don't understand what is x in this case. You can answer, ah x is 3 and then 4, and x feeds the function 2 times and concat as Monad does: concat (map f xs) like this:
[3,4] >>= concat (map f x)
So in this case, [3,4] will be assigned to the x. It means wrong, because [3,4] is not a value. Monad is wrong.
I think your problem is focusing too much on the values. A monad is a type constructor, and as such not concerned with how many and what kinds of values there are, but only the context.
A Maybe a can be an a, or nothing. Easy, and you correctly observed that.
An Either String a is either some a, or alternatively some information in form of a String (e.g. why the calculation of a failed).
Finally, [a] is an unknown number of as (or none at all), that may have resulted from an ambiguous computation, or one giving multiple results (like a quadratic equation).
Now, for the interpretation of (>>=), it is helpful to know that the essential property of a monad (how it is defined by category theorists) is
join :: m (m a) -> m a.
Together with fmap, (>>=) can be written in terms of join.
What join means is the following: A context, put in the same context again, still has the same resulting behavior (for this monad).
This is quite obvious for Maybe (Maybe a): Something can essentially be Just (Just x), or Nothing, or Just Nothing, which provides the same information as Nothing. So, instead of using Maybe (Maybe a), you could just have Maybe a and you wouldn't lose any information. That's what join does: it converts to the "easier" context.
[[a]] is somewhat more difficult, but not much. You essentially have multiple/ambiguous results out of multiple/ambiguous results. A good example is the fourth roots of a number, found by taking square roots twice. You first get two solutions, and out of each you can find two others, resulting in four roots.
But the point is, it doesn't matter if you speak of an ambiguous ambiguous result, or just an ambiguous result. You could just always use the context "ambiguous", and transform multiple levels with join.
And here comes what (>>=) does for lists: it applies ambiguous functions to ambiguous values:
squareRoots :: Complex -> [Complex]
fourthRoots num = squareRoots num >>= squareRoots
can be rewritten as
fourthRoots num = join $ squareRoots `fmap` (squareRoots num)
-- [1,-1,i,-i] <- [[1,-1],[i,-i]] <- [1,-1] <- 1
since all you have to do is to find all possible results for each possible value.
This is why join is concat for lists, and in fact
m >>= f == join (fmap f m)
must hold in any monad.
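For lists, the law can be checked on a small case. This is only a sketch: neighbours is a hypothetical "ambiguous" function standing in for squareRoots, chosen because it stays within Int.

```haskell
import Control.Monad (join)

-- Hypothetical ambiguous function: each input has two possible results.
neighbours :: Int -> [Int]
neighbours x = [x - 1, x + 1]

-- Three ways of applying an ambiguous function to an ambiguous value.
viaBind, viaJoin, viaConcat :: [Int] -> [Int]
viaBind xs = xs >>= neighbours
viaJoin xs = join (fmap neighbours xs)
viaConcat xs = concat (map neighbours xs)
```

All three give [2,4,3,5] for the input [3,4]: the function is applied to 3 and to 4 separately, and the two result lists are flattened into one.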
A similar interpretation can be given to IO. A computation with side-effects, which can also have side-effects (IO (IO a)), is in essence just something with side-effects.
You have to take the word "context" quite broadly.
A common way of interpreting a list of values is that it represents an indeterminate value, so [3,4] represents a value which is three or four, but we don't know which (perhaps we just know it's a solution of x^2 - 7x + 12 = 0).
If we then apply f to that, we know it's 6 or 7 but we still don't know which.
Another example of an indeterminate value that you're more used to is 3. It could mean 3::Int or 3::Integer or even sometimes 3.0::Double. It feels easier because there's only one symbol representing the indeterminate value, whereas in a list, all the possibilities are listed (!).
If you write
asum = do
  x <- [10,20]
  y <- [1,2]
  return (x+y)
You'll get a list with four possible answers: [11,12,21,22]
That's one for each of the possible ways you could add x and y.
It is not the values that are in the context, it's the types.
Just 'a' :: Maybe Char --- Char is in a Maybe context.
[3, 2] :: [Int] --- Int is in a [] context.
Whether there is one, none or many of the a in the m a is beside the point.
Edit: Consider the type of (>>=) :: Monad m => m a -> (a -> m b) -> m b.
You give the example Just 3 >>= (\x->Just(4+x)). But consider Nothing >>= (\x->Just(4+x)). There is no value in the context. But the type is in the context all the same.
It doesn't make sense to think of x as necessarily being a single value. x has a single type. If we are dealing with the Identity monad, then x will be a single value, yes. If we are in the Maybe monad, x may be a single value, or it may never be a value at all. If we are in the list monad, x may be a single value, or not be a value at all, or be various different values... but what it is not is the list of all those different values.
Your other example --- [2, 3] >>= (\x -> x + 3) --- [2, 3] is not passed to the function. [2, 3] + 3 would have a type error. 2 is passed to the function. And so is 3. The function is invoked twice, gives results for both those inputs, and the results are combined by the >>= operator. [2, 3] is not passed to the function.
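A well-typed version of these examples might look like the sketch below. The names shifted, missing and present are hypothetical; note that the list function has to return a list for the types to line up.

```haskell
-- The function is called once with 2 and once with 3;
-- the whole list [2, 3] is never passed to it.
shifted :: [Int]
shifted = [2, 3] >>= \x -> [x + 3]

-- No value is present, so the function is never called at all.
missing :: Maybe Int
missing = Nothing >>= \x -> Just (4 + x)

-- Exactly one value is present, so the function is called once.
present :: Maybe Int
present = Just 3 >>= \x -> Just (4 + x)
```

In each case x has the single type Int, however many values end up being bound to it.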
"context" is one of my favorite ways to think about monads. But you've got a slight misconception.
Take Just 'a', so the value in context of type Maybe in this case is 'a'
Not quite. You keep saying the value in context, but there is not always a value "inside" a context, or if there is, then it is not necessarily the only value. It all depends on which context we are talking about.
The Maybe context is the context of "nullability", or potential absence. There might be something there, or there might be Nothing. There is no value "inside" of Nothing. So the maybe context might have a value inside, or it might not. If I give you a Maybe Foo, then you cannot assume that there is a Foo. Rather, you must assume that it is a Foo inside the context where there might actually be Nothing instead. You might say that something of type Maybe Foo is a nullable Foo.
Take [3], so value in context of type [a] in this case is 3
Again, not quite right. A list represents a nondeterministic context. We're not quite sure what "the value" is supposed to be, or if there is one at all. In the case of a singleton list, such as [3], then yes, there is just one. But one way to think about the list [3,4] is as some unobservable value which we are not quite sure about, except that we are certain that it is 3 or that it is 4. You might say that something of type [Foo] is a nondeterministic Foo.
[3,4] >>= \x -> x+3
This is a type error; not quite sure what you meant by this.
So in this case, [3,4] will be assigned to the x. It means wrong, because [3,4] is not a value. Monad is wrong.
You totally lost me here. Each instance of Monad has its own implementation of >>= which defines the context that it represents. For lists, the definition is
(xs >>= f) = (concat (map f xs))
You may want to learn about Functor and Applicative operations, which are related to the idea of Monad, and might help clear some confusion.

Haskell types frustrating a simple 'average' function

I'm playing around with beginner Haskell, and I wanted to write an average function. It seemed like the simplest thing in the world, right?
Wrong.
It seems like Haskell's type system forbids average from working on a generic numeric type - I can get it to work on a list of Integrals, or a list of Fractionals, but not both.
I want:
average :: (Num a, Fractional b) => [a] -> b
average xs = ...
But I can only get:
averageInt :: (Integral a, Fractional b) => [a] -> b
averageInt xs = fromIntegral (sum xs) / fromIntegral (length xs)
or
averageFrac :: (Fractional a) => [a] -> a
averageFrac xs = sum xs / fromIntegral (length xs)
and the second one seems to work. Until I try to pass a variable.
*Main> averageFrac [1,2,3]
2.0
*Main> let x = [1,2,3]
*Main> :t x
x :: [Integer]
*Main> averageFrac x
<interactive>:1:0:
No instance for (Fractional Integer)
arising from a use of `averageFrac' at <interactive>:1:0-8
Possible fix: add an instance declaration for (Fractional Integer)
In the expression: average x
In the definition of `it': it = averageFrac x
Apparently, Haskell is really picky about its types. That makes sense. But not when they could both be [Num]
Am I missing an obvious application of RealFrac?
Is there a way to coerce Integrals into Fractionals that doesn't choke when it gets a Fractional input?
Is there some way to use Either and either to make some sort of polymorphic average function that would work on any sort of numeric array?
Does Haskell's type system outright forbid this function from ever existing?
Learning Haskell is like learning Calculus. It's really complicated and based on mountains of theory, and sometimes the problem is so mindbogglingly complex that I don't even know enough to phrase the question correctly, so any insight will be warmly accepted.
(Also, footnote: this is based off a homework problem. Everybody agrees that averageFrac, above, gets full points, but I have a sneaking suspicion that there is a way to make it work on both Integral AND Fractional arrays)
So fundamentally, you're constrained by the type of (/):
(/) :: (Fractional a) => a -> a -> a
BTW, you also want Data.List.genericLength
genericLength :: (Num i) => [b] -> i
So how about removing the fromIntegral for something more general:
import Data.List
average xs = realToFrac (sum xs) / genericLength xs
which has only a Real constraint (Int, Integer, Float, Double)...
average :: (Real a, Fractional b) => [a] -> b
So that'll take any Real into any Fractional.
And note all the posters getting caught by the polymorphic numeric literals in Haskell. 1 is not an integer, it is any number.
The Real class provides only one method: the ability to turn a value in class Num to a rational. Which is exactly what we need here.
And thus,
Prelude> average ([1 .. 10] :: [Double])
5.5
Prelude> average ([1 .. 10] :: [Int])
5.5
Prelude> average ([1 .. 10] :: [Float])
5.5
Prelude> average ([1 .. 10] :: [Data.Word.Word8])
5.5
The question has been very well answered by Dons, I thought I might add something.
When calculating the average this way :
average xs = realToFrac (sum xs) / genericLength xs
What your code will do is to traverse the list twice, once to calculate the sum of its elements, and once to get its length.
As far as I know, GHC isn't able yet to optimize this and compute both the sum and length in a single pass.
It doesn't hurt even as a beginner to think about it and about possible solutions, for example the average function might be written using a fold that computes both the sum and length; on ghci :
:set -XBangPatterns
import Data.List
let avg l=let (t,n) = foldl' (\(!b,!c) a -> (a+b,c+1)) (0,0) l in realToFrac(t)/realToFrac(n)
avg ([1,2,3,4]::[Int])
2.5
avg ([1,2,3,4]::[Double])
2.5
The function doesn't look as elegant, but the performance is better.
More information on Dons blog:
http://donsbot.wordpress.com/2008/06/04/haskell-as-fast-as-c-working-at-a-high-altitude-for-low-level-performance/
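The same single-pass idea can be written as an ordinary module definition. This is a sketch rather than a tuned implementation, and averageOnePass is a hypothetical name. It keeps the (Real a, Fractional b) type of the two-pass version and counts elements in an Int.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- One traversal: accumulate the running sum and the element count together.
averageOnePass :: (Real a, Fractional b) => [a] -> b
averageOnePass xs = realToFrac total / fromIntegral count
  where
    (total, count) = foldl' step (0, 0 :: Int) xs
    -- The bang patterns force both accumulators at every step,
    -- so no chain of unevaluated additions builds up.
    step (!s, !n) x = (s + x, n + 1)
```

For an empty list this divides zero by zero, so a caller that may pass an empty list should check for that first.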
Since dons has done such a good job at answering your question, I'll work on questioning your question....
For example, in your question, where you first run an average on a given list, getting a good answer. Then, you take what looks like the exact same list, assign it to a variable, then use the function on the variable...which then blows up.
What you've run into here is a set-up in the compiler, called the DMR: the Dreaded Monomorphic Restriction. When you passed the list straight into the function, the compiler made no assumption about which type the numbers were, it just inferred what types it could be based on usage, and then picked one once it couldn't narrow the field down any more. It's kind of like the direct opposite of duck-typing, there.
Anyway, when you assigned the list to a variable, the DMR kicked in. Since you've put the list in a variable, but given no hints on how you want to use it, the DMR made the compiler pick a type, in this case, it picked one that matched the form and seemed to fit: Integer. Since your function couldn't use an Integer in its / operation (it needs a type in the Fractional class), it makes that very complaint: there's no instance of Integer in the Fractional class. There are options you can set in GHC so that it doesn't force your values into a single form ("mono-morphic", get it?) until it needs to, but it makes any error messages slightly tougher to figure out.
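One way around this is to say up front which type the list should have, or to convert an integral list at the call site. A minimal sketch, where doubles, integers, fromDoubles and fromIntegers are hypothetical names:

```haskell
averageFrac :: Fractional a => [a] -> a
averageFrac xs = sum xs / fromIntegral (length xs)

-- The annotation pins the element type to a Fractional one,
-- so nothing is defaulted to Integer.
doubles :: [Double]
doubles = [1, 2, 3]

-- A list that really is integral gets converted by the caller.
integers :: [Integer]
integers = [1, 2, 3]

fromDoubles, fromIntegers :: Double
fromDoubles = averageFrac doubles
fromIntegers = averageFrac (map fromIntegral integers)
```

Both results are 2.0. The literals 1, 2, 3 are the same in each list; only the declared element type differs.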
Now, on another note, you had a reply to dons' answer that caught my eye:
I was misled by the chart on the last page of cs.ut.ee/~varmo/MFP2004/PreludeTour.pdf
that shows Floating NOT inheriting properties from Real, and I then assumed that
they would share no types in common.
Haskell does types differently from what you're used to. Real and Floating are type classes, which work more like interfaces than object classes. They tell you what you can do with a type that's in that class, but it doesn't mean that some type can't do other things, any more than having one interface means that a(n OO-style) class can't have any others.
Learning Haskell is like learning Calculus
I'd say learning Haskell is like learning Swedish - there are lots of little, simple things (letters, numbers) that look and work the same, but there are also words that look like they should mean one thing, when they actually mean something else. But once you get fluent in it, your regular friends will be amazed at how you can spout off this oddball stuff that makes gorgeous beauties do amazing tricks. Curiously, there are many folks involved in Haskell from the beginnings, who also know Swedish. Maybe that metaphor is more than just a metaphor...
:m Data.List
let list = [1..10]
let average = div (sum list) (genericLength list)
average
I'm amazed that after all of these years, no one has pointed out that Don Stewart's average doesn't work with complex numbers, while OP's averageFrac does work with complex numbers. Neither one is unambiguously superior to the other.
The fundamental reason why you can't write
average :: (Num a, Fractional b) => [a] -> b
is that it can be instantiated at a type like
average :: [Complex Double] -> Double
Haskell's numeric classes support conversions that are a little bit lossy, like Rational to Double, Double to Float, and Integer to Int, but don't support extremely lossy conversions like complex to real, or fractional to integral. You can't convert Complex Double to Double without explicitly taking (e.g.) the real part of it, which is not something that average should be doing. Therefore, you can't write average :: [Complex Double] -> Double. Therefore, you can't write average with any type that can be specialized to [Complex Double] -> Double.
The most Haskellish type for average is probably OP's averageFrac. Generally, functions that aren't dedicated to type conversion should be leaving the type conversion to the caller as much as possible. averageFrac will work with practically any numeric type, either directly or after coercion of the input list. The caller, being closer to the source of the data, is more likely to know whether it needs to be coerced or not (and if it doesn't know, it can leave the decision to its caller). In contrast, Don Stewart's average just doesn't support complex numbers, even with coercion. You'd either have to rewrite it from scratch or else call it twice with the real and imaginary projections of the list (and then write another wrapper for quaternions that calls it four times, etc.).
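As a small illustration of that point, averageFrac can be used at Complex Double directly, because Complex Double is a Fractional instance. The name complexMean is hypothetical.

```haskell
import Data.Complex (Complex (..))

averageFrac :: Fractional a => [a] -> a
averageFrac xs = sum xs / fromIntegral (length xs)

-- The mean of 1 + 2i and 3 + 4i, computed without any conversion.
complexMean :: Complex Double
complexMean = averageFrac [1 :+ 2, 3 :+ 4]
```

The result is 2.0 :+ 3.0. The Real-constrained average cannot be called on this list, since Complex Double has no Real instance.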
Yeah, Haskell's type system is very picky. The problem here is the type of fromIntegral:
Prelude> :t fromIntegral
fromIntegral :: (Integral a, Num b) => a -> b
fromIntegral will only accept an Integral as a, not any other kind of Num. (/), on the other hand only accepts fractional. How do you go about making the two work together?
Well, the sum function is a good start:
Prelude> :t sum
sum :: (Num a) => [a] -> a
Sum takes a list of any Num and returns a Num.
Your next problem is the length of the list. The length is an Int:
Prelude> :t length
length :: [a] -> Int
You need to convert that Int into a Num as well. That's what fromIntegral does.
So now you've got a function that returns a Num and another function that returns a Num. Haskell does no implicit numeric promotion, so both operands of (/) must end up at the same Fractional type; fromIntegral makes that possible, and at this point you're good to go:
Prelude> let average xs = (sum xs) / (fromIntegral (length xs))
Prelude> :t average
average :: (Fractional a) => [a] -> a
Let's give it a trial run:
Prelude> average [1,2,3,4,5]
3.0
Prelude> average [1.2,3.4,5.6,7.8,9.0]
5.4
Prelude> average [1.2,3,4.5,6,7.8,9]
5.25

Reliable cube root in Haskell

I am doing question 62 at project euler and came up with the following to test whether a number is cubic:
isInt x = x == fromInteger (round x)
isCube x= isInt $ x**(1/3)
But due to floating point error, it returns incorrect results:
*Main> isCube (384^3)
False
Is there a way to implement a more reliable cube test?
On a side-note, here is the rest of my solution, which doesn't work because of a type interface error on filter (isCube) (perms n):
cubes = [n^3|n<-[1..]]
perms n = map read $ permutations $ show n :: [Integer]
answer = head [n|n<-cubes,(length $ filter (isCube) (perms n)) == 5]
What do I need to do to fix the error?
No instances for (Floating Integer, RealFrac Integer)
arising from a use of `isCube' at prob62.hs:10:44-49
Any optimisations are also welcome ;-)
Try to avoid using floating point numbers as much as possible, especially when you have a problem which concerns integer values. Floating point numbers have problems with rounding, and certain values (like 1/3) cannot be represented exactly. So it's no surprise that you get mysterious answers.
First of all, in order to fix your type error you have to redefine isCube. If you check its type signature it looks like this:
isCube :: (RealFrac a, Floating a) => a -> Bool
Note that it expects something that is of class Floating as its first argument. Your problem is that you want to use this function on integer values and integers are not an instance of Floating. You can redefine isCube like this to make the function type check.
isCube x = isInt $ (fromIntegral x) ** (1/3)
However, that will not make your program correct.
One way to make your program more correct is to do what Henrik suggested. It would look like this:
isCube x = (round (fromIntegral x ** (1/3))) ^ 3 == x
Good luck!
Don't know much about Haskell, but I would take the cube root, round to the nearest integer, take the cube, and compare to the original value.
For another approach useful for Integer values have a look at the integerCubeRoot function in the arithmoi package.
Example:
ghci> import Math.NumberTheory.Powers.Cube
ghci> let x = 12345^3333
ghci> length $ show x
13637
ghci> isCube x
True
ghci> isCube (x+1)
False
ghci> length $ show $ integerCubeRoot x
4546
perms has the type [Integer]. isCube has the type (RealFrac a, Floating a) => a -> Bool (as you can check in GHCI). The RealFrac constraint comes from round x, the Floating constraint comes from x**(1/3). Since Integer is neither RealFrac nor Floating, isCube can't be used as Integer -> Bool. So filter isCube (perms n) doesn't make sense.
So you need to fix isCube to work properly on Integers:
isCube x = isInt $ (fromInteger x)**(1/3)
In fact, the reason isCube (384^3) even compiles is that it "really" means isCube ((fromInteger 384)^(fromInteger 3)).
Of course, this will still work badly due to floating point errors. Basically, checking floating numbers for equality, as you do in isInt, is almost always a bad idea. See the other answers for an explanation of how to make a better test.
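A test that avoids floating point altogether can be sketched with a binary search on Integer. This is not the arithmoi implementation, and the names integerCubeRootFloor and isPerfectCube are hypothetical; negative inputs are simply rejected here to keep the sketch short.

```haskell
-- Largest r with r^3 <= n, for n >= 0, found by binary search.
integerCubeRootFloor :: Integer -> Integer
integerCubeRootFloor n = go 0 (n + 1)
  where
    -- Invariant (for n >= 0): lo^3 <= n < hi^3.
    go lo hi
      | hi - lo <= 1 = lo
      | mid ^ (3 :: Int) <= n = go mid hi
      | otherwise = go lo mid
      where
        mid = (lo + hi) `div` 2

-- Exact cube test using only Integer arithmetic.
isPerfectCube :: Integer -> Bool
isPerfectCube n = n >= 0 && r ^ (3 :: Int) == n
  where
    r = integerCubeRootFloor n
```

With this definition isPerfectCube (384^3) is True, and since the argument is already an Integer it can be passed to filter over perms n without any conversion.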
