How to get n digits of a Haskell Double? - haskell

I want to have a function digits :: Double -> Int -> Double, that gives n digits of a double (must round!) and returns a new double only with the digits wanted.
Example: digits 1.2345 3 -> 1.235
I could implement this using strings, but I think there is a better solution.
Any idea how to implement this function?
Thanks in advance!

One relatively simple way is to multiply by an appropriate power of ten and round, then divide away the power of ten. For example:
digits d n = fromInteger (round (d * 10^n)) / 10^n
In ghci:
> digits 1.2346 3
1.235
> digits 1.2344 3
1.234
> digits 1.2345 3
1.234
The last example shows that Haskell uses banker's rounding by default, so it doesn't quite meet your spec -- but perhaps it is close enough to be useful anyway. It is easy to implement other rounding variants (or find them on Hackage) if you really need them.
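One of the "other rounding variants" mentioned above could look like the following. This is only a sketch: `roundHalfAway` and `digitsHalfAway` are hypothetical names, not library functions, and the input is still a binary Double, so a literal that looks like an exact tie in decimal may already sit slightly above or below it.

```haskell
-- Hypothetical helper (not from the answer above): round half away from zero.
roundHalfAway :: Double -> Integer
roundHalfAway x = truncate (x + signum x * 0.5)

-- Same scaling trick as `digits`, with only the rounding step swapped out.
digitsHalfAway :: Double -> Int -> Double
digitsHalfAway d n = fromInteger (roundHalfAway (d * 10 ^ n)) / 10 ^ n
```

With this variant, digitsHalfAway 1.2345 3 gives 1.235, matching the example in the question.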

Related

Round a float to a specific number of decimal places

I'm trying to display a float but rounding it, so that
f = 5.545
Displays as : 5.55 while
f = 5.544
displays as : 5.54
I've seen methods that display only the first two decimals, but I want the value rounded.
Thank you !
This happens because of the way floating-point numbers are represented in the computer: they're actually in base-2, rather than base-10 (a bit of an oversimplification, but good enough). As a consequence, when you type in 5.545, the computer actually records something like 5.54499999999999993: very close to 5.545, but slightly smaller. And since it's smaller than 5.545, it's no wonder it gets rounded to 5.54.
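One way to check this kind of claim is to compare the exact rational value of the stored Double with the decimal you meant to write. The helper name `storedBelow` is made up for this sketch.

```haskell
import Data.Ratio ((%))

-- Hypothetical helper: is the stored Double strictly below the exact decimal?
storedBelow :: Double -> Rational -> Bool
storedBelow x exact = toRational x < exact

-- The value from the question, compared against exactly 5545/1000.
example :: Bool
example = storedBelow 5.545 (5545 % 1000)
```

Here `toRational` converts the Double without any further rounding, so the comparison is exact.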
If you really have to have exact base-10 numbers, you should use Decimal instead of Float or Double. That package specifically takes care to represent floating-point numbers in base-10 without loss.
x :: Decimal
x = 0.545
show x
> "0.545"
The caveat is that printf does not support Decimal, so you'd have to display it by rounding via roundTo and converting to a string via show. Another caveat is that roundTo does "banker's rounding": if the last digit is five, it rounds to the nearest even digit, so we need to counteract that as a special case (I couldn't find a ready-to-use function that rounds by arithmetic rules):
displayDecimal :: Decimal -> String
displayDecimal x = show (rounded + compensate)
  where
    rounded = roundTo 2 x
    compensate = if (x - rounded) == 0.005 then 0.01 else 0
displayDecimal 0.545
> "0.55"
displayDecimal 0.5450000000001
> "0.55"
displayDecimal 0.544
> "0.54"
displayDecimal 0.5449999999999
> "0.54"
However, if you just want this to work for numbers with three decimal places, you can get away with just adding a very small value before rounding, like 0.00001. This value is small enough that it won't mess up your actual numbers, but large enough to compensate for the base-2 vs. base-10 discrepancy:
import Text.Printf (printf)
displayRounded :: Double -> String
displayRounded x = printf "%.2f" (x + 0.00001)
displayRounded 0.544
> "0.54"
displayRounded 0.545
> "0.55"
So I didn't quite find a real solution to this, but I realised something:
Printf already does the job, but not exactly how I wanted.
Let's say I want to round 1.445, it will display 1.44. But if the number was 1.446, then it would have displayed 1.45.
Not exactly what I wanted, but close enough.

Round NaN in Haskell

To my great surprise, I found that rounding a NaN value in Haskell returns a gigantic negative number:
round (0/0)
-269653970229347386159395778618353710042696546841345985910145121736599013708251444699062715983611304031680170819807090036488184653221624933739271145959211186566651840137298227914453329401869141179179624428127508653257226023513694322210869665811240855745025766026879447359920868907719574457253034494436336205824
The same thing happens with floor and ceiling.
What is happening here? Is this behavior intended? Of course, I understand that anyone who doesn't want this behavior can always write another function that checks isNaN - but are there existing alternative standard library functions that handle NaN more sanely (for some definition of "more sanely")?
TL;DR: When decoded as if it were an ordinary number, a NaN has an arbitrary magnitude strictly between 2 ^ 1024 and 2 ^ 1025, and -1.5 * 2 ^ 1024 (one possible NaN) happens to be the one you hit.
Why any reasoning is off
What is happening here?
You're entering the region of undefined behaviour. Or at least that is what you would call it in some other languages. The report defines round as follows:
6.4.6 Coercions and Component Extraction
The ceiling, floor, truncate, and round functions each take a real fractional argument and return an integral result. … round x returns the nearest integer to x, the even integer if x is equidistant between two integers.
In our case x does not represent a number to begin with. According to 6.4.6, y = round x should fulfil that any other z from round's codomain has an equal or greater distance:
y = round x ⇒ ∀z : dist(z,x) >= dist(y,x)
However, the distance (aka the subtraction) of numbers is defined only for, well, numbers. If we used
dist n d = fromIntegral n - d
we get in trouble soon: any operation that includes NaN will return NaN again, and comparisons on NaN fail, so our property above does not hold for any z if x was a NaN to begin with. If we check for NaN, we can return any value, but then our property holds for all pairs:
dist n d = if isNaN d then constant else fromIntegral n - d
So we're completely arbitrary in what round x shall return if x was not a number.
Why do we get that large number regardless?
"OK", I hear you say, "that's all fine and dandy, but why do I get that number?" That's a good question.
Is this behavior intended?
Somewhat. It isn't really intended, but to be expected. First of all, we have to know how Double works.
IEEE 754 double precision floating point numbers
A Double in Haskell is usually an IEEE 754 compliant double precision floating point number, that is, a number that has 64 bits and is represented with
x = s * m * (b ^ e)
where s is the sign (a single bit), m is the mantissa (52 bits) and e is the exponent (11 bits, floatRange). b is the base, and it's usually 2 (you can check with floatRadix). Since the value of m is normalized, every well-formed Double has a unique representation.
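These parameters can be inspected from Haskell itself with the RealFloat methods in the Prelude. A small sketch (the names `doubleLayout` and `decoded` are only for illustration):

```haskell
-- Radix, significand width in bits (including the hidden bit), and exponent range.
doubleLayout :: (Integer, Int, (Int, Int))
doubleLayout = (floatRadix x, floatDigits x, floatRange x)
  where
    x = 1 :: Double

-- Integer significand m and exponent e, such that 1.5 == m * 2 ^^ e.
decoded :: (Integer, Int)
decoded = decodeFloat (1.5 :: Double)
```

On GHC, `doubleLayout` is (2, 53, (-1021, 1024)); floatDigits reports 53 rather than 52 because it counts the hidden bit. `decoded` is (6755399441055744, -52), that is, 1.5 * 2 ^ 52 scaled back down by 2 ^ 52.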
IEEE 754 NaN
Except NaN. NaN is represented with the exponent emax+1 and a non-zero mantissa. So if the bitfield
SEEEEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
represents a Double, what's a valid way to represent NaN?
?111111111111000000000000000000000000000000000000000000000000000
            ^
That is, a single M is set to 1; the others need not be set for the value to count as NaN. The sign is arbitrary. Why only a single bit? Because it's sufficient.
Interpret NaN as Double
Now, when we ignore the fact that this is a malformed Double (a NaN) and really, really, really want to interpret it as a number, what number would we get?
m = 1.5
e = 1024
x = 1.5 * 2 ^ 1024
= 3 * 2 ^ 1024 / 2
= 3 * 2 ^ 1023
And lo and behold, that's exactly the number you get for round (0/0):
ghci> round $ 0 / 0
-269653970229347386159395778618353710042696546841345985910145121736599013708251444699062715983611304031680170819807090036488184653221624933739271145959211186566651840137298227914453329401869141179179624428127508653257226023513694322210869665811240855745025766026879447359920868907719574457253034494436336205824
ghci> negate $ 3 * 2 ^ 1023
-269653970229347386159395778618353710042696546841345985910145121736599013708251444699062715983611304031680170819807090036488184653221624933739271145959211186566651840137298227914453329401869141179179624428127508653257226023513694322210869665811240855745025766026879447359920868907719574457253034494436336205824
Which brings our small adventure to a halt. We have a NaN, whose exponent contributes a factor of 2 ^ 1024, and we have some non-zero mantissa, which yields a result x whose absolute value satisfies 2 ^ 1024 < |x| < 2 ^ 1025.
Note that this isn't the only way NaN can get represented:
In IEEE 754, NaNs are often represented as floating-point numbers with the exponent emax + 1 and nonzero significands. Implementations are free to put system-dependent information into the significand. Thus there is not a unique NaN, but rather a whole family of NaNs.
For more information, see the classic paper on floating point numbers by Goldberg.
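The arithmetic above can be reproduced without depending on which NaN a particular machine produces for 0/0. The sketch below builds one specific quiet NaN from its bit pattern using castWord64ToDouble from GHC.Float. It assumes GHC's decodeFloat reads the exponent and mantissa fields without special-casing NaN, which is the behaviour this answer describes; the names `quietNaN` and `nanMagnitude` are made up.

```haskell
import GHC.Float (castWord64ToDouble)

-- One quiet NaN: exponent bits all ones, top mantissa bit set, sign bit clear.
quietNaN :: Double
quietNaN = castWord64ToDouble 0x7FF8000000000000

-- Magnitude obtained when the NaN's fields are read as an ordinary number.
-- Assumption: the decoded exponent is non-negative, as it is for this pattern.
nanMagnitude :: Integer
nanMagnitude = abs (m * 2 ^ e)
  where
    (m, e) = decodeFloat quietNaN
```

Under that assumption, `nanMagnitude` equals 3 * 2 ^ 1023, the absolute value of the number shown in the question.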
This has long been observed as a problem. Here are a few tickets filed against GHC on this very topic:
https://ghc.haskell.org/trac/ghc/ticket/3070
https://ghc.haskell.org/trac/ghc/ticket/11553
https://ghc.haskell.org/trac/ghc/ticket/3676
Unfortunately, this is a thorny issue with lots of ramifications. My personal belief is that this is a genuine bug and it should be fixed properly by throwing an error. But you can read the comments on these tickets to get an understanding of the tricky issues preventing GHC from implementing a proper solution. Essentially, it comes down to speed vs. correctness, and this is one point where (i) the Haskell report is woefully underspecified, and (ii) GHC compromises the latter for the former.
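Until that changes, the wrapper the question alludes to is short to write. A minimal sketch, where `safeRound` is a hypothetical name and returning Nothing for infinities as well is a design choice of this example, not something the Report prescribes:

```haskell
-- Hypothetical wrapper: refuse to round values that are not finite numbers.
safeRound :: Double -> Maybe Integer
safeRound x
  | isNaN x || isInfinite x = Nothing
  | otherwise               = Just (round x)
```

For finite inputs this keeps the Report's behaviour, including rounding ties to the even integer, so safeRound 2.5 is Just 2.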

format float with minimum and maximum number of decimals

I'd like to format a float with minimum 2 and maximum 8 decimals. I do it like this:
def format_btc(btc):
    s = locale.format("%.8f", btc).rstrip('0')
    if s.endswith(','):
        s += '00'
    return s
Is there a way to do it only with format() function?
edit:
examples: left is float, right is string
1 -> 1,00
1.1 -> 1,10 (I have now realised that my code returns 1,1 for this example; that's a bug)
1.12 -> 1,12
1.123 -> 1,123
1.12345678 -> 1,12345678
1.123456789 -> 1,12345678
1.1234567890 -> 1,12345678
No. I rechecked the specification language to make sure. Possible reasons:
(Theoretical) If 8 digits after decimal are significant, then deleting 0s deletes information.
(Practical) The complication of adding a third number argument, used only for floats and then very seldom, is not worth the rare gain.

Create list of strings from list of doubles, non Scientific notation

listOfLongDeci = [showFFloat Nothing (1/a) | a<-[2..1000], length (show (1/a)) > 7]
listOfLongDeci2 = [show (1/a) | a<-[2..1000], length (show (1/a)) > 7]
listOfLongDeci3 = [(1/a) | a<-[2..1000], length (show (1/a)) > 7]
the 1st gives a list of ShowS, how can I make a string from showS?
the 2nd gives a list of scientific notation
the 3rd only gives a list of doubles
How can I use any of these to create a list of strings with non scientific notation? (Euler 26)
As requested:
the 1st gives a list of ShowS, how can I make a String from ShowS?
Since ShowS is a type synonym for String -> String, you obtain a String by applying the function to a String. Since the showXFloat functions produce a function that prepends some String to the final String argument (basically a difference list; many show-related functions produce such - shows, showChar, showString, to name a few - for reasons of efficiency), the natural choice for the final argument is the empty String, so
listOfLongDeci = [showFFloat Nothing (1/a) "" | a<-[2..1000], length (show (1/a)) > 7]
produces a list of Strings, correctly rounded approximations to the decimal representation of the numbers 1/a in non scientific notation.
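To make the ShowS idea concrete, here is a small sketch using only showFFloat from Numeric and showString from the Prelude. The function names `plain` and `labelled` are invented for the example.

```haskell
import Numeric (showFFloat)

-- Apply the ShowS to the empty String to get a plain String.
plain :: Double -> String
plain x = showFFloat Nothing x ""

-- ShowS values compose with (.), and the final String is supplied once at the end.
labelled :: Double -> String
labelled x = (showString "1/a = " . showFFloat (Just 4) x) ""
```

For instance, plain 0.0001 is "0.0001" (where show would give "1.0e-4"), and labelled 0.125 is "1/a = 0.1250".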
how can I use any of these to create a list of strings with non scientific notation? (euler 26)
The first part has been answered, but these representations won't help you solve Problem 26 of Project Euler,
Find the value of d < 1000 for which 1/d contains the longest recurring cycle in its decimal fraction part.
A Double has 53 bits of precision (52 explicit bits for the significand plus one hidden bit for normalized numbers; subnormal numbers have no hidden bit and thus 52 or fewer bits of precision), and the number 1/d cannot be exactly represented as a Double unless d is a power of 2. The 53 bits of precision give you roughly
Prelude> 53 * log 2 / log 10
15.954589770191001
significant decimal digits of precision, so from the first nonzero digit on, you have 15 or 16 digits that you can expect to be correct for the exact [terminating or recurring] decimal expansion of the fraction 1/d; beyond that, the expansions differ.
For example, 1/71 has a recurring cycle 01408450704225352112676056338028169 of length 35 (by far not the longest in the range to be considered). The closest representable Double to 1/71 is
0.01408450704225352144438598855913369334302842617034912109375 = 8119165525400331 / (2^59)
of which the first 17 significant digits are correct (and 0.014084507042253521 is also what showFFloat Nothing (1/71) "" gives you).
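The exact value quoted above can be recovered with toRational, which converts a Double to a fraction without rounding. A short sketch (the name `storedOneOver71` is only illustrative):

```haskell
import Data.Ratio (denominator, numerator)

-- Numerator and denominator of the Double closest to 1/71, in lowest terms.
storedOneOver71 :: (Integer, Integer)
storedOneOver71 = (numerator r, denominator r)
  where
    r = toRational (1 / 71 :: Double)
```

This should give (8119165525400331, 2 ^ 59), the fraction stated above.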
To find the longest recurring cycle in the decimal expansion of 1/d, you can use an exact (or sufficiently accurate finite) string representation of the Rational number 1 % d, or, better, use pure integer arithmetic (compute the decimal expansion using long division) without involving a Rational.
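As a sketch of the long-division idea: the decimal expansion of 1/d starts repeating as soon as a remainder repeats, so it is enough to track remainders. The function name `cycleLength` is made up, and the list lookup is chosen for brevity rather than speed.

```haskell
import Data.List (elemIndex)

-- Length of the recurring cycle of 1/d (0 if the expansion terminates),
-- found by tracking the remainders of long division, most recent first.
cycleLength :: Int -> Int
cycleLength d = go [] 1
  where
    go seen r
      | r == 0 = 0
      | otherwise = case elemIndex r seen of
          Just i  -> i + 1          -- r was last seen i + 1 steps ago
          Nothing -> go (r : seen) ((r * 10) `mod` d)
```

For example, cycleLength 7 is 6 and cycleLength 71 is 35, in line with the cycle length quoted above, while cycleLength 8 is 0 because 1/8 terminates.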

how do you convert a double to a string?

I know that most programming languages have functions built in for doing that for you, but how do those functions work?
The javadoc about the Double toString() method is quite comprehensive:
Creates a string representation of the double argument. All characters mentioned below are ASCII characters.
If the argument is NaN, the result is the string "NaN".
Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative, the first character of the result is '-'; if the sign is positive, no sign character appears in the result. As for the magnitude m:
If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces the result "Infinity" and negative infinity produces the result "-Infinity".
If m is zero, it is represented by the characters "0.0"; thus, negative zero produces the result "-0.0" and positive zero produces the result "0.0".
If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no leading zeroes, followed by '.', followed by one or more decimal digits representing the fractional part of m.
If m is less than 10^-3 or not less than 10^7, then it is represented in so-called "computerized scientific notation." Let n be the unique integer such that 10^n<=m<10^(n+1); then let a be the mathematically exact quotient of m and 10^n so that 1<=a<10. The magnitude is then represented as the integer part of a, as a single decimal digit, followed by '.', followed by decimal digits representing the fractional part of a, followed by the letter 'E', followed by a representation of n as a decimal integer, as produced by the method Integer.toString(int).
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
Is that enough? Otherwise you might like to look up the implementation too...
A simple (but non-generic, naïve and slow way):
convert the number to an integer, then divide this value by 10 stepwise to find out its digits in reverse order. Concatenate them together and you have the integer representation.
subtract the integer from the original number, now multiply by 10 stepwise and find the digits after the decimal point. Concatenate the first string with a point and this second string.
This has a few problems, of course:
slow as hell;
doesn't work for negative numbers;
won't give you exponential notation for very small or large numbers.
All in all, it's an idea, but not a very good one; I suspect there are no programming languages that do this.
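For illustration only, the two steps of this naive approach might be sketched as follows. `naiveShow` is a hypothetical name, the number of fractional digits is passed in explicitly, and the limitations listed above (non-negative input only, no exponential notation) still apply.

```haskell
import Data.Char (intToDigit)

-- Naive conversion of a non-negative Double, keeping `places` fractional digits.
naiveShow :: Int -> Double -> String
naiveShow places x = wholeDigits ++ "." ++ fracDigits
  where
    whole = truncate x :: Integer
    -- Step 1: peel off integer digits by repeated division by 10 (reverse order).
    wholeDigits
      | whole == 0 = "0"
      | otherwise  = reverse (peel whole)
    peel 0 = ""
    peel n = intToDigit (fromInteger (n `mod` 10)) : peel (n `div` 10)
    -- Step 2: peel off fractional digits by repeated multiplication by 10.
    fracDigits = take places (grow (x - fromInteger whole))
    grow f =
      let scaled = f * 10
          d      = truncate scaled :: Int
      in intToDigit d : grow (scaled - fromIntegral d)
```

For a value that is exact in binary, such as 12.5, naiveShow 3 12.5 gives "12.500". For most other inputs the later digits expose the binary representation instead of the short decimal a user expects.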
This paper by Guy Steele provides details on how to do this correctly. It's much more subtle than you might think.
http://portal.acm.org/citation.cfm?id=93559
"Printing Floating-Point Numbers Quickly and Accurately" - Robert G. Burger
Scheme and C code for above.
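As one more data point, Haskell's base library exposes this kind of digit generation as floatToDigits in Numeric, whose documentation cites the Burger and Dybvig paper. A small sketch; the wrapper names `shortestDigits` and `roundTrips` are invented here:

```haskell
import Numeric (floatToDigits)

-- Shortest decimal digits ds and exponent e, with value == 0.ds * 10 ^^ e.
shortestDigits :: Double -> ([Int], Int)
shortestDigits = floatToDigits 10

-- Printing and re-reading is expected to give back the same Double.
roundTrips :: Double -> Bool
roundTrips x = read (show x) == x
```

For example, shortestDigits 0.1 is ([1], 0) and shortestDigits 123.5 is ([1,2,3,5], 3): just enough digits to identify the Double uniquely, which is the property the javadoc quoted earlier describes.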
As Oded mentioned in a comment, different languages will do this in different ways. As an example, here's how Ruby 1.9 does it (in C). Your best bet, just as a research exercise, will be to look into open-source languages and see how they do it.

Resources