Floating point numbers, precision, and Parsec - haskell

Consider the following code:
import Text.Parsec
import Text.Parsec.Language
import Text.Parsec.String
import qualified Text.Parsec.Token as Token

float :: Parser Double
float = Token.float (Token.makeTokenParser emptyDef)

myTest :: String -> Either ParseError Double
myTest = parse float ""
Now, thanks to QuickCheck, I know a magic number (I have aligned the result for convenience):
λ> myTest "4.23808622486133"
Right 4.2380862248613305
Some floating point numbers cannot be exactly represented in memory, and some operations easily introduce «fluctuations» into floating point numbers. We all know that. However, the cause of this parsing problem seems to be different.
A few words about the tests that helped me discover this… feature. Put simply, these tests generate a floating point value, print it, and parse it back (with Parsec). For example, the number 9.2 is known to be impossible to represent exactly as a floating point value, yet it passes the tests (obviously because of the «smart» printing function). Why does 4.23808622486133 fail?
For those who believe that these numbers are the same and 4.23808622486133 is just the shortest unambiguous representation of 4.2380862248613305:
a1 :: Double
a1 = 9.2000000000000003
a2 :: Double
a2 = 9.200000000000001
b1 :: Double
b1 = 4.23808622486133
b2 :: Double
b2 = 4.2380862248613305
Now:
λ> a1 == a2
True
λ> b1 == b2
False

Parsec does the conversion to Double using what amounts to
foldr (\d acc -> read [d] + acc / 10) 0 "423808622486133" :: Double
and as you point out, this is not equal to
423808622486133 / 100000000000000 :: Double
I agree that this should be considered a bug in Parsec.
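If that characterization is right, evaluating the two expressions directly in GHCi should reproduce the mismatch, with results matching b2 and b1 from the question:
λ> foldr (\d acc -> read [d] + acc / 10) 0 "423808622486133" :: Double
4.2380862248613305
λ> 423808622486133 / 100000000000000 :: Double
4.23808622486133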

This is still not fixed in Parsec. If this exact problem breaks your day, take a look at Megaparsec, which is a fork of Parsec that fixes many bugs and conceptual flaws, improves the quality of error messages, and more.
As you can see, this problem is fixed there:
λ> parseTest float "4.23808622486133"
4.23808622486133
λ> parseTest float "4.2380862248613305"
4.2380862248613305
Disclosure: I'm one of the authors of Megaparsec.

Related

How do I distinguish negative zero with Aeson?

Haskell distinguishes negative zero:
ghci> (isNegativeZero (0 :: Float), isNegativeZero (-0 :: Float))
(False,True)
JSON also allows for distinguishing them, since both "0" and "-0" are valid, syntactically.
But Aeson throws away the sign bit:
ghci> isNegativeZero <$> eitherDecode "-0"
Right False
Why? How can I decode a JSON document while distinguishing non-negative and negative zero?
It looks like in Data.Aeson the floating point number is constructed using Data.Scientific.scientific:
scientific :: Integer -> Int -> Scientific
scientific c e constructs a scientific number which corresponds to the Fractional number fromInteger c * 10 ^^ e.
Since the mantissa is an Integer, where 0 == -0, it cannot construct a negative zero. Not the best API for constructing special floating point values, it seems.
Perhaps you should file a bug for aeson, asking for a workaround in the parser.
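A quick way to see why the sign can't survive: the Integer mantissa has no negative zero, so converting back to a Double (here via Data.Scientific.toRealFloat) yields positive zero:
ghci> import Data.Scientific (scientific, toRealFloat)
ghci> (0 :: Integer) == (-0 :: Integer)
True
ghci> isNegativeZero (toRealFloat (scientific (-0) 0) :: Double)
False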

How does Haskell pick a type for an ambiguous expression

If an expression can be typed in several ways, how does Haskell pick which one to use?
Motivating example
Take this example:
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
Prelude> import Data.Ratio (Ratio)
Prelude Data.Ratio> f z s = show $ truncate $ z + (read s)
Prelude Data.Ratio> :type f
f :: (RealFrac a, Read a) => a -> String -> String
Prelude Data.Ratio> s = take 30 (cycle "12345")
Prelude Data.Ratio> s
"123451234512345123451234512345"
Prelude Data.Ratio> f 0 s
"123451234512345121227855101952"
Prelude Data.Ratio> f (0::Double) s
"123451234512345121227855101952"
Prelude Data.Ratio> f (0::Float) s
"123451235679745417161721511936"
Prelude Data.Ratio> f (0::Ratio Integer) (s ++ "%1")
"123451234512345123451234512345"
Prelude Data.Ratio> show $ truncate $ read s
"123451234512345121227855101952"
When I used 0 without any type, I got the same result as for (0::Double). So it seems to me that when I just invoke f 0 s, it uses a version of read that produces a Double, and a version of truncate that turns that Double into some integral type. I introduced the variable z so I could have some easy control over the type here. With that I could show that other interpretations are possible, e.g. using Float or exact ratios. So why Double? The last line, which omits the addition, shows that the behavior is independent of that zero constant.
I guess something tells Haskell that Double is a more canonical type than others, either in general or when used as a RealFrac, so if it can interpret an expression using Double as an intermediate result, but also some other types, then it will prefer the Double interpretation.
Core questions
Is my interpretation of the observed behavior correct, i.e. is there an implicit type default here?
What is the name for this kind of preference?
Is there a way to disable such a choice of canonical type and enforce explicit type specifications for things that can be interpreted in multiple ways?
Own research
I've read https://en.wikibooks.org/wiki/Haskell/Type_basics_II#Polymorphic_guesswork, which says:
With no other restrictions, 5.12 will assume the default Fractional type of Double, so (-7) will become a Double as well.
That appears to confirm my assumption that Double is somehow blessed as the default type for some parent category of RealFrac. It still doesn't offer a name for that concept, nor a complete list of the rules around that.
Background
The situation I actually want to handle is more like this:
f :: Integer -> String -> Integer
f scale str = truncate $ (fromInteger scale) * (read str)
So I want a function that takes a string, reads it as a decimal fraction, multiplies it with a given number, then truncates the result back to an integer. I was very surprised to find that this compiles without me specifying the intermediate fractional type anywhere.
If there is an ambiguous type variable v with a Num v constraint, it gets defaulted to Integer or Double, tried in that order, whichever satisfies all other constraints on v.
Those defaulting rules are explained in the Haskell Report: https://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-620004
The GHC manual also explains additional defaulting rules in GHCi (this means trying things in GHCi will not give you an accurate picture of what is going on when you compile a program): https://downloads.haskell.org/ghc/latest/docs/html/users_guide/ghci.html#type-defaulting-in-ghci
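To answer the third core question: the Report's per-module default declaration controls exactly this list of candidate types, and an empty list disables defaulting altogether. A minimal sketch (the module layout and names are just for illustration):
module Main where

-- With an empty default list, defaulting is off for this module, so the
-- ambiguous intermediate type below must be annotated explicitly.
default ()

f :: Integer -> String -> Integer
f scale str = truncate (fromInteger scale * read str :: Double)
-- Deleting the ":: Double" annotation now produces an "ambiguous type
-- variable" error instead of silently choosing Double.

main :: IO ()
main = print (f 100 "4.238")  -- prints 423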

What is the right way to convert from Double to Fixed in Haskell?

I have some code that uses Centi for its currency type. It receives Doubles from an Excel spreadsheet using the xlsx package. I want to convert these Doubles into Centis correctly. I have tried using realToFrac, and for the most part it works, but occasionally it's out by a cent:
λ realToFrac (-1372.92 :: Double) :: Centi
-1372.93
I can guess that the problem is in the binary representation of that decimal number not being exact, and that I shouldn't be using Double to represent currency in the first place, but I don't have control of that part. So what is a fail-safe way of doing the conversion?
Maybe not the best way, but one approach is to round to the closest integer after adjusting for the desired precision:
import Data.Fixed (Centi, Fixed (MkFixed), HasResolution (resolution))

floatToCenti :: Double -> Centi
floatToCenti = floatToFixed

floatToFixed :: HasResolution a => Double -> Fixed a
floatToFixed x = y
  where
    -- reusing y as the dummy argument of resolution
    y = MkFixed (round (fromInteger (resolution y) * x))
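With those definitions in scope, the problematic value from the question now rounds to the intended cent:
λ floatToCenti (-1372.92 :: Double)
-1372.92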

Haskell datatype conversion problems

I am currently learning Haskell and have been writing a couple of very simple programs to practice as I go along. One of these programs is the one I have below:
import System.IO

main = do
  putStrLn "Give me year: "
  y <- getLine
  let res = show . calcPop $ read y
  putStrLn ("Population in " ++ y ++ " will be " ++ res)

pop :: Float
pop = 307357870.0

secInYear :: Float
secInYear = 365.0 * 24.0 * 60.0 * 60.0

bRate :: Float
bRate = secInYear / 7.0

dRate :: Float
dRate = secInYear / 13.0

iRate :: Float
iRate = secInYear / 35.0

calcPop :: Float -> Float
calcPop year = let years = year - 2010 in (years*bRate + years*iRate + pop - years*dRate)
What it does is take a year after 2010 and calculate the estimated population in that year. As it stands it works fine, except that, as you may have noticed, EVERY single number is cast as a Float. It is pretty ridiculous to do this, since there is no reason to have the current population, the number of seconds in a year, or the year itself as anything but ints. Unfortunately, when I had it that way I was getting a compiler error from the / function saying something about Fractional Int, and from the * function saying it inferred a Float as the first parameter. I had understood that Haskell, like other languages, when encountering operations involving ints and floats, would just change the int to act like a float, which apparently didn't happen here. Could someone explain why I was getting these errors, and how I can get ints and floats to cooperate? Evidently I still don't have a good enough grasp of the Haskell type system to do it myself.
Haskell types are strict; the language never automatically converts a value for you, except that integer literals are implicitly wrapped in fromInteger. Instead, you may want to use more type-appropriate operations such as `div` when you only need to deal with Int/Integer, and fromIntegral to promote to Float or Double when needed.
(Syntax note: `function` converts a prefix function into an infix operator.)
For integer division in Haskell you use div or quot.
From the Haskell 98 language report:
An integer literal represents the application of the function fromInteger to the appropriate value of type Integer. Similarly, a floating point literal stands for an application of fromRational to a value of type Rational (that is, Ratio Integer).
This means a "7" in the source code becomes "(fromInteger 7)" and "1.5" becomes "(fromRational (3 Ratio.% 2))". The numerical operations such as (+) and (/) have type signatures like "a->a->a" meaning they two arguments of perfectly identical types and return the same type as given. These standard operators can never do things like add a Float to an Int. You can write a (fromIntegral) to try and promote an Int-like-type to things like a Double or Float.

Odd values when enumerating a list

As part of a larger function definition, I needed to allow the domain (i, n) of a function to increment from i to n at varying rates. So I wrote:
f (i, n) k = [i, (i+k)..n]
into GHC. This returned odd results:
*Main> f (0.0, 1.0) 0.1
[0.0,0.1,0.2,0.30000000000000004,0.4000000000000001,0.5000000000000001,0.6000000000000001,0.7000000000000001,0.8,0.9,1.0]
Why does GHC return, e.g., 0.30000000000000004 instead of 0.3?
Because IEEE floating point arithmetic can't express decimal numbers precisely, in general. There is always a rounding error in the binary representation, which sometimes seeps to the surface when displaying the number.
Depending on how GHC converts floating-point numbers to a decimal representation, you might find that on Windows it displays the result as the expected 0.3. This is because Microsoft's runtime libraries are smarter than Linux and Mac about how they present floats.
EDIT: This may not be the case. The number 0.3 encodes as the bit pattern 3fd3333333333333 in an IEEE double, whereas 0.1 + 0.1 + 0.1 produces a number that encodes as 3fd3333333333334, and I don't know whether Microsoft's runtime libraries are tolerant enough to round the latter back to 0.3 when displaying it.
In any event, a good example of the different handling is to type 0.3 into a Python interactive shell. If it's Python 2.6, you'll get back 0.29999999999999999, and if it's 2.7, it will display 0.3.
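Either way, the underlying inequality is easy to check in GHCi:
*Main> 0.1 + 0.1 + 0.1 == (0.3 :: Double)
False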
If i, n, and k are rational, you could go the infinite-precision route:
f :: (Rational, Rational) -> Rational -> [Rational]
f (i, n) k = [i, (i+k) .. n]
The notation may require a bit of getting used to:
ghci> f (0%1, 1%1) (1%10)
[0 % 1,1 % 10,1 % 5,3 % 10,2 % 5,1 % 2,3 % 5,7 % 10,4 % 5,9 % 10,1 % 1]
Think of the % as a funny looking fraction bar.
You could view approximations with
import Control.Monad (mapM_)
import Data.Ratio (Rational, (%), denominator, numerator)
import Text.Printf (printf)

printApprox :: [Rational] -> IO ()
printApprox rs = do
    mapM_ putRationalToOnePlaceLn rs
  where
    putRationalToOnePlaceLn :: Rational -> IO ()
    putRationalToOnePlaceLn r = do
      let toOnePlace :: String
          toOnePlace = printf "%.1f" (numFrac / denomFrac)
          numFrac, denomFrac :: Double
          numFrac = fromIntegral $ numerator r
          denomFrac = fromIntegral $ denominator r
      putStrLn toOnePlace
The code above is written in an imperative style with full type annotations. Read its type as transforming a list of rational numbers into an I/O action. The mapM_ combinator from Control.Monad evaluates an action (putRationalToOnePlaceLn in this case) for each value in a list (the rationals we want to approximate). You can think of it as sort of a for loop, and there is even a forM_ combinator that's identical to mapM_ except that the order of the arguments is reversed. The underscore at the end is a Haskell convention indicating that the results of running the actions are discarded; note that there are also mapM and forM, which do collect those results.
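For instance, with Control.Monad in scope, these two lines do the same thing:
ghci> mapM_ print [1, 2, 3]
1
2
3
ghci> forM_ [1, 2, 3] print
1
2
3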
To arrange for the output of the approximations via putStrLn, we have to generate a string. If you were writing this in C, you'd have code along the lines of
int numerator = 1, denominator = 10;
printf("%.1f\n", (double) numerator / (double) denominator);
The Haskell code above is similar in structure. The type of Haskell's / operator is
(/) :: (Fractional a) => a -> a -> a
This says for some instance a of the typeclass Fractional, when given two values of the same type a, you'll get back another value of that type.
We can ask ghci to tell us about Fractional:
ghci> :info Fractional
class (Num a) => Fractional a where
  (/) :: a -> a -> a
  recip :: a -> a
  fromRational :: Rational -> a
        -- Defined in GHC.Real
instance Fractional Float -- Defined in GHC.Float
instance Fractional Double -- Defined in GHC.Float
Notice the instance lines at the bottom. This means we can
ghci> (22::Float) / (7::Float)
3.142857
or
ghci> (22::Double) / (7::Double)
3.142857142857143
but not
ghci> (22::Double) / (7::Float)
<interactive>:1:16:
    Couldn't match expected type `Double' against inferred type `Float'
    In the second argument of `(/)', namely `(7 :: Float)'
    In the expression: (22 :: Double) / (7 :: Float)
    In the definition of `it': it = (22 :: Double) / (7 :: Float)
and certainly not
ghci> (22::Integer) / (7::Integer)
<interactive>:1:0:
    No instance for (Fractional Integer)
      arising from a use of `/' at :1:0-27
    Possible fix: add an instance declaration for (Fractional Integer)
    In the expression: (22 :: Integer) / (7 :: Integer)
    In the definition of `it': it = (22 :: Integer) / (7 :: Integer)
Remember that Haskell's Rational type is defined as a ratio of Integers, so you can think of fromIntegral as sort of like a typecast in C.
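You can ask ghci to confirm this, too (the exact output varies by GHC version):
ghci> :info Rational
type Rational = Ratio Integer -- Defined in GHC.Real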
Even after reading A Gentle Introduction to Haskell: Numbers, you'll still likely find Haskell to be frustratingly picky about mixing numeric types. It's too easy for us, who perform infinite-precision arithmetic in our heads or on paper, to forget that computers have only finite precision and must deal in approximations. Type safety is a helpful reality check.
Sample output:
*Main> printApprox $ f (0%1, 1%1) (1%10)
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
The definition of printApprox probably seemed comforting with all the helpful signposts such as names of functions and parameters or type annotations. As you grow more experienced and comfortable with Haskell, such imperative-looking definitions will begin to look cluttered and messy.
Haskell is a functional language: its strength is specifying the what, not the how, by assembling simple functions into more complex ones. Someone once suggested that Haskell manipulates functions as powerfully as Perl manipulates strings.
In point-free style, the arguments disappear, leaving just the structure of the computation. Learning to read and write this style does take practice, but you'll find that it helps you write cleaner code.
With tweaks to the imports, we can define a point-free equivalent such as
import Control.Arrow ((***), (&&&))
import Control.Monad (join, mapM_)
import Data.Ratio (Rational, (%), denominator, numerator)
import Text.Printf (printf)

printApproxPointFree :: [Rational] -> IO ()
printApproxPointFree =
    mapM_ $
      putStrLn .
      toOnePlace .
      uncurry (/) .
      join (***) fromIntegral .
      (numerator &&& denominator)
  where
    toOnePlace = printf "%.1f" :: Double -> String
We see a few familiar bits: our new friend mapM_, putStrLn, printf, numerator, and denominator.
There's also some weird stuff. Haskell's $ operator is another way to write function application. Its definition is
f $ x = f x
It may not seem terribly useful until you try
Prelude> show 1.0 / 2.0
<interactive>:1:0:
    No instance for (Fractional String)
      arising from a use of `/' at :1:0-13
    Possible fix: add an instance declaration for (Fractional String)
    In the expression: show 1.0 / 2.0
    In the definition of `it': it = show 1.0 / 2.0
You could write that line as
show (1.0 / 2.0)
or
show $ 1.0 / 2.0
So you can think of $ as another way to write parentheses.
Then there's . that means function composition. Its definition is
(f . g) x = f (g x)
which we could also write as
(f . g) x = f $ g x
As you can see, we apply the right-hand function and then feed the result to the left-hand function. You may remember definitions from mathematics textbooks such as (f ∘ g)(x) = f(g(x)). The name . was chosen for its similarity in appearance to the raised dot ∘. So with a chain of function compositions, it's often easiest to read it back-to-front.
The (numerator &&& denominator) bit uses a fan-out combinator from Control.Arrow. For example:
ghci> (numerator &&& denominator) $ 1%3
(1,3)
So it applies two functions to the same value and gives you back a tuple with the results. Remember we need to apply fromIntegral to both the numerator and denominator, and that's what join (***) fromIntegral does. Note that *** also comes from the Control.Arrow module.
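With those imports in scope, you can try the pairwise promotion in isolation:
ghci> join (***) fromIntegral ((1, 3) :: (Integer, Integer)) :: (Double, Double)
(1.0,3.0)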
Finally, the / operator takes separate arguments, not a tuple. Thinking imperatively, you might want to write something like
(fst tuple) / (snd tuple)
  where
    fst (a,_) = a
    snd (_,b) = b
but think functionally! What if we could somehow transform / into a function that takes a tuple and uses its components as arguments for the division? That's exactly what uncurry (/) does!
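For example:
ghci> uncurry (/) (1.0, 10.0)
0.1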
You've taken a great first step with Haskell. Enjoy the journey!
A better way of doing this is more along the lines of
map (/10) [0 .. 10]
This enumerates whole numbers, which are exactly representable, and divides each one by 10, avoiding the accumulation of rounding error from repeated addition.
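Each element is then a single, correctly rounded division:
ghci> map (/10) [0 .. 10]
[0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]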
