How do you convert a double to a string?

I know that most programming languages have functions built in for doing that for you, but how do those functions work?

The javadoc for the Double.toString() method is quite comprehensive:
Creates a string representation of the double argument. All characters mentioned below are ASCII characters.
If the argument is NaN, the result is the string "NaN".
Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative, the first character of the result is '-'; if the sign is positive, no sign character appears in the result. As for the magnitude m:
If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces the result "Infinity" and negative infinity produces the result "-Infinity".
If m is zero, it is represented by the characters "0.0"; thus, negative zero produces the result "-0.0" and positive zero produces the result "0.0".
If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no leading zeroes, followed by '.', followed by one or more decimal digits representing the fractional part of m.
If m is less than 10^-3 or not less than 10^7, then it is represented in so-called "computerized scientific notation." Let n be the unique integer such that 10^n <= m < 10^(n+1); then let a be the mathematically exact quotient of m and 10^n, so that 1 <= a < 10. The magnitude is then represented as the integer part of a, as a single decimal digit, followed by '.', followed by decimal digits representing the fractional part of a, followed by the letter 'E', followed by a representation of n as a decimal integer, as produced by the method Integer.toString(int).
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
Is that enough? Otherwise you might like to look up the implementation too...
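Java isn't alone in this "just enough digits" contract: GHC's show for Double uses the same shortest-round-trip idea (the Burger-Dybvig algorithm from the paper cited further down), though its thresholds for switching to scientific notation differ from Java's. A quick GHCi illustration:
Prelude> show (0/0 :: Double)
"NaN"
Prelude> show (0.1 :: Double)
"0.1"
Prelude> read (show (0.1 :: Double)) == (0.1 :: Double)
True
The last line is the round-trip property: the printed digits are few enough to be readable but sufficient to recover exactly the same Double.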

A simple (but non-generic, naïve and slow) way:
convert the number to an integer, then divide this value by 10 stepwise to find its digits in reverse order. Concatenate them and you have the integer representation.
subtract the integer from the original number, then multiply by 10 stepwise to find the digits after the decimal point. Concatenate the first string, a point, and this second string.
This has a few problems, of course:
slow as hell;
doesn't work for negative numbers;
won't give you exponential notation for very small or large numbers.
All in all, it's an idea, but not a very good one; I suspect no programming language actually does it this way. A concrete sketch of the recipe follows below.
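Here is that naive recipe as a Haskell sketch, purely for illustration (naiveShow is a made-up name; it assumes a finite, non-negative input and truncates the fraction to a fixed number of places, so it inherits every problem listed above):
-- Integer part: divide by 10 stepwise, collecting digits in reverse.
-- Fractional part: multiply by 10 stepwise, collecting digits forward.
naiveShow :: Double -> Int -> String
naiveShow x places = intPart ++ "." ++ fracPart
  where
    n = truncate x :: Integer
    intPart = if n == 0 then "0" else go n ""
    go 0 acc = acc
    go k acc = go (k `div` 10) (show (k `mod` 10) ++ acc)
    fracPart = take places (digits (x - fromIntegral n))
    digits f = let d = truncate (f * 10) :: Int
               in show d ++ digits (f * 10 - fromIntegral d)
For example, naiveShow 3.25 5 gives "3.25000"; for a value like 1/3, the digits stop being trustworthy after roughly 16 places, since each multiply-by-10 step amplifies the rounding error. That is exactly why real implementations are cleverer.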

This paper by Guy Steele provides details on how to do this correctly. It's much more subtle than you might think.
http://portal.acm.org/citation.cfm?id=93559

"Printing Floating-Point Numbers Quickly and Accurately" - Robert G. Burger
Scheme and C code for the above.

As Oded mentioned in a comment, different languages will do this in different ways. As an example, here's how Ruby 1.9 does it (in C). Your best bet, just as a research exercise, will be to look into open-source languages and see how they do it.

Related

Representing a number in the octal system

I am not looking for help with my homework; I just need someone to point me in the right direction.
I know the answer theoretically; I'm just stuck on how to prove it mathematically.
Here is the question:
Representing a number in the octal system requires, on average, about 10 percent more characters than in the decimal system.
How can I prove this mathematically?
Suppose you wanted to represent a given number x in both systems. In the decimal system, this will take on the order of log10(x) digits. In the octal system, it will take on the order of log8(x) digits.
For any a and b, loga(b) can be written as logc(b)/logc(a) for any base c. In particular, let c = 10. Then log8(x) = log10(x)/log10(8) ≈ 1.1 · log10(x), which means log8(x) is about 1.1 times log10(x) for any given x. Note that this ratio is exact aside from rounding; what is approximate is using log10(x) and log8(x) for the digit counts.
The approximate number of decimal digits required to represent x is log10(x), and the number of octal digits is log8(x).
This means that the average ratio is log8(x)/log10(x).
Since log8(x) = ln(x)/ln(8) and log10(x) = ln(x)/ln(10),
the average ratio is ln(10)/ln(8) = 1.1073...
Of course this is not a fully rigorous demonstration; a real proof would define exactly the quantity we are after (such as the average number of digits for numbers between 0 and n as n goes to infinity) and would count the exact number of digits (which is an integer) rather than an approximation.
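If you want to sanity-check the ratio empirically, here is a small Haskell sketch (avgRatio is my own helper name) that counts actual digits using showOct from Numeric:
import Numeric (showOct)

-- Average, over 1..n, of (octal digit count / decimal digit count).
-- Should creep toward ln 10 / ln 8 = 1.1073... as n grows.
avgRatio :: Integer -> Double
avgRatio n = sum ratios / fromIntegral (length ratios)
  where
    ratios = [ fromIntegral (length (showOct k "")) /
               fromIntegral (length (show k))
             | k <- [1 .. n] ]
Evaluating avgRatio for increasing n gives values approaching 1.1, in line with the derivation above.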

How to extract dyadic fraction from float

Now, floating-point and double-precision numbers can approximate any sort of number (the same could be said of integers; floats are just more precise), but they are represented as binary fractions internally. For example, one tenth would be approximated as
0.00011001100110011... (the ... only goes to the computer's precision, not to infinity)
Now, any binary number with finitely many bits has what is called a dyadic fraction representation in mathematics (nothing to do with p-adic numbers). This means you represent it as a fraction whose denominator is a power of 2. For example, say our computer approximates one tenth as 0.00011. The dyadic fraction for that is 3/32, i.e. 3/(2^5), which is close to one tenth. Now for my technical question: what would be the simplest way to extract the dyadic fraction from a floating-point number?
Irrelevant note: if you are wondering why I want to do this, it is because I am writing a surreal number library in Haskell. Dyadic fractions translate easily into surreal numbers, which is why it is convenient that binary translates easily into dyadic. (I'll surely have trouble with the rational numbers, though.)
The decodeFloat function seems useful for this. Technically, you should also check that floatRadix is 2, but as far as I can see this is always the case in GHC.
Just be careful, since it does not simplify the mantissa and exponent. For example, if I evaluate decodeFloat (1.0 :: Double) I get an exponent of -52 and a mantissa of 2^52, which is not what I expected.
Also, toRational seems to generate a dyadic fraction. I am not sure this is always the case, though.
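Building on that, here is a small sketch (dyadic is my name, not a library function) that takes decodeFloat's (mantissa, exponent) pair and cancels the shared factors of 2, giving the dyadic fraction in lowest terms:
-- Reduce decodeFloat's (mantissa, exponent): halve the mantissa and
-- bump the exponent while possible, yielding m/2^(-e) in lowest terms.
dyadic :: Double -> (Integer, Int)
dyadic x = go (decodeFloat x)
  where
    go (m, e)
      | m /= 0 && even m && e < 0 = go (m `div` 2, e + 1)
      | otherwise                 = (m, e)
With this, dyadic 1.0 gives (1,0) and dyadic 0.375 gives (3,-3), i.e. 3/8. As for toRational: for a Double it does produce the exact value, and since the unreduced form is mantissa/2^(-exponent), the reduced denominator is always a power of 2, so it is indeed always a dyadic fraction.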
Hold your numbers in binary and convert to decimal only for display.
Binary numbers are all dyadic. The digits after the binary point determine the power of two in the denominator, and the digits read without the point give the numerator. That's binary numbers for you.
There is an ideal representation for surreal numbers in binary. I call them "sinary". It's this:
0s is Not a number
1s is zero
10s is neg one
11s is one
100s is neg two
101s is neg half
110s is half
111s is two
... etc...
so you see that the standard binary count matches the surreal birth order of numeric values when evaluated in sinary. The way to determine the numeric value of a sinary is that the 1's are rights and the 0's are lefts. We start with +/-1's and then 1/2, 1/4, 1/8, etc., with sign + for 1 and - for 0.
ex: evaluating sinary
1011011s
-> is the 91st surreal number (because 64+16+8+2+1 = 91)
-> with a value of −0.28125, because...
1011011
NLRRLRR
+-++-++
+ 0 − 1 + 1/2 + 1/4 − 1/8 + 1/16 + 1/32
= 0 − 32/32 + 16/32 + 8/32 − 4/32 + 2/32 + 1/32
= − 9/32
The surreal numbers form a binary tree, so there is an ideal binary format matching their location on the tree according to the Left/Right pattern taken to reach the number. Assign 1 to right and 0 to left. Then the birth order of a surreal number is equal to the binary count of this representation, i.e. the 15th surreal number value represented in sinary is the 15th number in the standard binary count. The value of a sinary is the surreal label value. Strip the leading bit from the representation, and start adding +1's or -1's depending on whether the number starts with 1 or 0 after the first one. Then once the bit flips, begin adding and subtracting halved values (1/2, 1/4, 1/8, etc.) using + or - according to the bit value 1/0.
I have tested this format and it seems to work well. And there are some other secrets... such as: the left and right of any sinary representation are the same binary string with the tail clipped to the last 0 and the last 1 respectively. Conversion to a decimal or dyadic value is NOT required in order to perform the recursive functions requested by Conway.
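To make the evaluation rule concrete, here is a Haskell sketch of it (sinaryValue is my name; pass the bit string without the "s" suffix used in the examples). It follows the recipe above: strip the leading 1, count the initial run as whole +/-1 steps, then switch to halves once the bit flips:
import Data.Ratio ((%))

sinaryValue :: String -> Maybe Rational
sinaryValue ('1':rest) = Just (whole + frac)
  where
    unit '1' = 1
    unit _   = -1
    run   = takeWhile (== head rest) rest      -- initial run: whole units
    rest' = drop (length run) rest             -- after the flip: halves
    whole = fromIntegral (length run) * (if null rest then 0 else unit (head rest))
    frac  = sum [ unit c * (1 % 2 ^ k) | (c, k) <- zip rest' [(1 :: Integer) ..] ]
sinaryValue _ = Nothing                        -- e.g. "0", which is not a number
For the example above, sinaryValue "1011011" evaluates to (-9) % 32, matching the hand calculation.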

Create list of strings from list of doubles, non-scientific notation

import Numeric (showFFloat)

listOfLongDeci = [showFFloat Nothing (1/a) | a<-[2..1000], length (show (1/a)) > 7]
listOfLongDeci2 = [show (1/a) | a<-[2..1000], length (show (1/a)) > 7]
listOfLongDeci3 = [(1/a) | a<-[2..1000], length (show (1/a)) > 7]
The 1st gives a list of ShowS; how can I make a String from ShowS?
The 2nd gives a list in scientific notation.
The 3rd only gives a list of doubles.
How can I use any of these to create a list of strings in non-scientific notation? (Euler 26)
As requested:
the 1st gives a list of ShowS, how can I make a String from ShowS?
Since ShowS is a type synonym for String -> String, you obtain a String by applying the function to a String. Since the showXFloat functions produce a function that prepends some String to the final String argument (basically a difference list; many show-related functions produce such - shows, showChar, showString, to name a few - for reasons of efficiency), the natural choice for the final argument is the empty String, so
listOfLongDeci = [showFFloat Nothing (1/a) "" | a<-[2..1000], length (show (1/a)) > 7]
produces a list of Strings, correctly rounded approximations to the decimal representation of the numbers 1/a, in non-scientific notation.
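For instance, in GHCi:
Prelude> import Numeric
Prelude Numeric> showFFloat Nothing (1/8 :: Double) ""
"0.125"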
How can I use any of these to create a list of strings in non-scientific notation? (Euler 26)
The first part has been answered, but these representations won't help you solve Problem 26 of Project Euler,
Find the value of d < 1000 for which 1/d contains the longest recurring cycle in its decimal fraction part.
A Double has 53 bits of precision (52 explicit significand bits plus one hidden bit for normalised numbers; subnormal numbers have no hidden bit, and thus 52 or fewer bits of precision), and the number 1/d cannot be exactly represented as a Double unless d is a power of 2. The 53 bits of precision give you roughly
Prelude> 53 * log 2 / log 10
15.954589770191001
significant decimal digits of precision. So, from the first nonzero digit on, you can expect 15 or 16 digits to agree with the exact (terminating or recurring) decimal expansion of the fraction 1/d; beyond that, the expansions differ.
For example, 1/71 has a recurring cycle 01408450704225352112676056338028169 of length 35 (far from the longest in the range to be considered). The closest representable Double to 1/71 is
0.01408450704225352144438598855913369334302842617034912109375 = 8119165525400331 / (2^59)
of which the first 17 significant digits are correct (and 0.014084507042253521 is also what showFFloat Nothing (1/71) "" gives you).
To find the longest recurring cycle in the decimal expansion of 1/d, you can use an exact (or sufficiently accurate finite) string representation of the Rational number 1 % d, or, better, use pure integer arithmetic (compute the decimal expansion using long division) without involving a Rational.
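If you go the integer-arithmetic route, one standard number-theoretic shortcut (not specific to this answer): the cycle length of 1/d is the multiplicative order of 10 modulo d after all factors of 2 and 5 have been removed from d. A sketch (cycleLength is my name):
-- Length of the recurring cycle in the decimal expansion of 1/d:
-- strip factors of 2 and 5, then find the least k >= 1 with
-- 10^k == 1 (mod d').
cycleLength :: Integer -> Int
cycleLength d
  | d' == 1   = 0                          -- expansion terminates
  | otherwise = 1 + length (takeWhile (/= 1) powers)
  where
    d' = strip 5 (strip 2 d)
    strip p n | n `mod` p == 0 = strip p (n `div` p)
              | otherwise      = n
    powers = iterate (\r -> r * 10 `mod` d') (10 `mod` d')
cycleLength 71 gives 35, matching the cycle quoted above, and the Project Euler answer is then the d in [2..999] maximising cycleLength d.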

Convert text to numbers while preserving ordering?

I've got a strange requirement which I can't seem to get my head around. I need to come up with a function that takes a text string and returns a number corresponding to that string, in such a way that, when sorted, these numbers go in the same order as the original strings. For example, if the function produces this mapping:
"abcd" -> x
"abdef" -> y
"xyz" -> z
then the numbers must be such that x < y < z. The strings can be of arbitrary length, but are always non-empty, and the string comparison should be case-insensitive (i.e. "ABC" and "abc" should result in the same numerical value).
My first thought was to map each letter to a number 1 through 26 and read off the resulting number, e.g. a = 1, b = 2, c = 3, ..., z = 26, so "abc" would become 1*26^2 + 2*26 + 3. However, then I realised that the string can contain any text in any language (i.e. full Unicode), so this isn't going to work. At this point I'm stuck. Any other ideas before I tell the client to sod off?
P.S. This strange requirement is due to a limitation in a proprietary system that can only do sorting by a numeric field. If the sorting is required by any other field type, it must be converted to some numerical representation - and then sorted. Don't ask.
You can make this work if you allow for arbitrary-precision real numbers, though that kinda feels like cheating. Unicode strings are sequences of characters drawn from 1,114,112 options. You can therefore think of them as fractions written in base 1,114,113: write '0.', then write out your Unicode string, and you have a real number in base 1,114,113 (shift each character's numeric value up by one so that a missing character has the value 0). Comparing two of these numbers compares the original strings lexicographically: scanning the digits from left to right, the first digit on which they disagree breaks the tie. This approach is completely infeasible unless you have an arbitrary-precision real number library.
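That said, if exact rationals count as "arbitrary precision", the idea is easy to demonstrate. A Haskell sketch (rank and base are my names; toLower is only a crude stand-in for proper Unicode case folding):
import Data.Char (ord, toLower)
import Data.Ratio ((%))

-- Read the string as a base-1114113 fraction, shifting each code
-- point up by 1 so that the empty suffix sorts below everything.
base :: Integer
base = 1114113

rank :: String -> Rational
rank s = sum [ (fromIntegral (ord (toLower c)) + 1) % base ^ i
             | (c, i) <- zip s [(1 :: Integer) ..] ]
With this, rank x < rank y exactly when x sorts before y (case-folded), e.g. rank "abcd" < rank "abdef" < rank "xyz".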
If you just have IEEE-754 doubles, this approach won't work. One way to see this is that there are at most 2^64 possible doubles (or 2^80 of them if you allow for long doubles) because there are only 64 (80) bits in a double, but there are infinitely many different strings. That eliminates the possibility simply because there are too many strings to go around.
Unfortunately, you can't make this work if you have arbitrary-precision integers. The natural ordering on strings has the fun property that you can find pairs of strings that have infinitely many strings lexicographically between them. For example, notice that
a < ab < abb < abbb < abbbb < ... < b
Now imagine that you have a function that maps each string to an integer that obeys the rules you'd like. That would mean that
f(a) < f(ab) < f(abb) < f(abbb) < f(abbbb) < ... < f(b)
But that's not possible in the integers - you can't have two integers f(a) and f(b) with infinitely many integers between them. (The number of integers between f(a) and f(b) is at most f(b) - f(a) - 1).
So it seems like the answer is "this is possible if you have arbitrary-precision real numbers, it's not possible with doubles, and it's not possible with arbitrary-precision integers." I'd basically label that "not going to happen in practice" even though it's theoretically possible. :-)

What does $ with a numeric value mean in Delphi

What does it mean, in Delphi, when I see a command like this:
char($23)
What does the dollar symbol mean in this context?
The dollar symbol indicates that what follows is a hexadecimal value.
ShowMessage(Char($23)); shows #.
The $ symbol is used to prefix a hexadecimal literal. The documentation says:
Numerals
Integer and real constants can be represented in decimal notation as sequences of digits without commas or spaces, and prefixed with the + or - operator to indicate sign. Values default to positive (so that, for example, 67258 is equivalent to +67258) and must be within the range of the largest predefined real or integer type.
Numerals with decimal points or exponents denote reals, while other numerals denote integers. When the character E or e occurs within a real, it means "times ten to the power of". For example, 7E2 means 7 * 10^2, and 12.25e+6 and 12.25e6 both mean 12.25 * 10^6.
The dollar-sign prefix indicates a hexadecimal numeral, for example, $8F. Hexadecimal numbers without a preceding - unary operator are taken to be positive values. During an assignment, if a hexadecimal value lies outside the range of the receiving type an error is raised, except in the case of the Integer (32-bit integer) where a warning is raised. In this case, values exceeding the positive range for Integer are taken to be negative numbers in a manner consistent with two's complement integer representation.
So, in your example, $23 is the number whose hexadecimal representation is 23. That number has decimal representation 35, so you can write:
Assert($23 = 35);
It represents a character: the Char(...) cast turns the number into a character. For example, Char(13) is a carriage return.
