Understanding the maths - python-3.x

I am trying to understand the maths in this code that converts binary to decimal. I was wondering if anyone could break it down so that I can see the working of a conversion. Sorry if this is too newb, but I've been searching for an explanation for hours and can't find one that explains it sufficiently.
I know the conversion is decimal*2 + int(digit), but I still can't break it down to understand exactly how it's converting to decimal.
binary = input('enter a number: ')
decimal = 0
for digit in binary:
    decimal = decimal*2 + int(digit)
print(decimal)

Here's an example with the small binary number 10 (which is 2 in decimal):
binary = '10'
decimal = 0
for digit in binary:
    decimal = decimal*2 + int(digit)
The for loop takes the 1 from the binary number first, since it is in the first place.
So digit = 1 for the 1st iteration.
It overwrites the value of decimal, which is initially 0:
decimal = 0*2 + 1 = 1
For the 2nd iteration, digit = 0.
It again calculates the value of decimal like below:
decimal = 1*2 + 0 = 2
So your decimal number is 2.
You can follow the same steps for any binary-to-decimal conversion.

The for loop and syntax are hiding a larger pattern. First, consider the same base-10 numbers we use in everyday life. One way of representing the number 237 is 200 + 30 + 7. Breaking it down further, we get 2*10^2 + 3*10^1 + 7*10^0 (note that ** is the exponent operator in Python, but ^ is used nearly everywhere else in the world).
There's this pattern of exponents and coefficients with respect to the base 10. The exponents are 2, 1, and 0 for our example, and we can represent fractions with negative exponents. The coefficients 2, 3, and 7 are the same as from the number 237 that we started with.
It winds up being the case that you can do this uniquely for any base. I.e., every real number has a unique representation in base 10, base 2, and any other base you want to work in. In base 2, the exact same pattern emerges, but all the 10s are replaced with 2s. E.g., in binary consider 101. This is the same as 1*2^2 + 0*2^1 + 1*2^0, or just 5 in base-10.
What the algorithm you have does is make that a little more efficient. It's pretty wasteful to compute 2^20, 2^19, 2^18, and so on when you're basically doing the same operations in each of those cases. With our same binary example of 101, they've re-written it as (1*2+0)*2+1. Notice that if you distribute the second 2 into the parentheses, you get the same representation we started with.
What if we had a larger binary number, say 11001? Well, the same trick still works: (((1*2+1)*2+0)*2+0)*2+1.
With that last example, what is your algorithm doing? It's first computing (1*2+1). On the next loop, it takes that number and multiplies it by 2 and adds the next digit to get ((1*2+1)*2+0), and so on. After just two more iterations your entire decimal number has been computed.
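Here's a minimal sketch of that nested (Horner-style) evaluation next to the straightforward powers-of-two sum, using 11001 as above (the helper names are mine, purely for illustration):
from __future__ import annotations

def by_powers(bits):
    # 1*2^4 + 1*2^3 + 0*2^2 + 0*2^1 + 1*2^0 for "11001"
    n = len(bits)
    return sum(int(d) * 2**(n - 1 - i) for i, d in enumerate(bits))

def by_horner(bits):
    # (((1*2+1)*2+0)*2+0)*2+1 -- one multiply and one add per digit
    decimal = 0
    for digit in bits:
        decimal = decimal*2 + int(digit)
    return decimal

print(by_powers("11001"), by_horner("11001"))   # 25 25
Both functions return the same value; the second one just never computes an explicit power of two.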

Effectively, what this is doing is taking each binary digit, multiplying it by 2^n where n is the place of that digit (counting from 0 at the right-hand end), and then summing them up. The confusion comes from this being done almost in reverse; let's step through an example:
binary = "11100"
So first it takes the digit '1' and adds it on to 0 * 2 = 0, so we have decimal = '1'.
Next it takes the second digit '1' and adds it to 1 * 2 = 2, so decimal = '1' + '1'*2.
Same again, giving decimal = '1' + '1'*2 + '1'*2^2.
Then the two zeros add nothing, but double the result twice, so finally decimal = '0' + '0'*2 + '1'*2^2 + '1'*2^3 + '1'*2^4 = 28.
(I've left quotes around the digits to show where they are.)
As you can see, the end result in this format is a pretty simple binary to decimal conversion.
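If it helps, here is the same 11100 walk-through as runnable code (the running-total print inside the loop is just for illustration):
binary = "11100"
decimal = 0
for digit in binary:
    decimal = decimal*2 + int(digit)
    print(digit, decimal)   # running total: 1, 3, 7, 14, 28
print(decimal)              # 28 = 1*2^4 + 1*2^3 + 1*2^2 + 0*2 + 0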
I hope this helped you understand a bit :)

I will try to explain the logic:
Consider the binary number 11001010. When looping in Python, the first digit 1 comes in first, and so on.
To convert it to decimal, we multiply the first digit by 2^7, the next by 2^6, and so on, down to the last digit (0) multiplied by 2^0.
Then we add (sum) them.
Here, each digit is added as it is taken, and the running total keeps getting multiplied by 2 until the end of the loop. For example, 1*(2^7) is performed here as decimal = 0 (decimal) + 1 (digit), with that 1 then multiplied by 2 seven times over the remaining iterations. When the next digit (1) comes in the second iteration, it is added as decimal = 1 (decimal)*2 + 1 (digit). During the third iteration of the loop, decimal = 3 (decimal)*2 + 0 (digit), and
3*2 = (2+1)*2 = (first digit) 1*2*2 + (second digit) 1*2.
It continues like this for all the digits.
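As a quick sanity check, here is the loop from the question on that number, compared with Python's built-in base-2 conversion (the int(binary, 2) comparison is my addition, not part of the original answer):
binary = "11001010"
decimal = 0
for digit in binary:
    decimal = decimal*2 + int(digit)
print(decimal)             # 202
print(int(binary, 2))      # 202 -- Python's own base-2 conversion agrees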

Related

Inversions in a binary string

How many inversions are there in a binary string of length n?
For example, n = 3
000->0
001->0
010->1
011->0
100->2
101->1
110->2
111->0
So total inversions are 6
The question looks like homework, so let me omit the details. You can:
Solve the problem as a recurrence (see Толя's answer)
Make up and solve the characteristic equation, and get the solution as a closed formula with some arbitrary constants (c1, c2, ..., cn); as a matter of fact you'll get just one unknown constant.
Put some known solutions (e.g. f(1) = 0, f(3) = 6) into the formula and find out all the unknown coefficients
The final answer you should get is
f(n) = n*(n-1)*2**(n-3)
where ** means raising to a power (2**(n-3) is 2 to the power n-3). In case you don't want to deal with the recurrence and the like, you can just prove the formula by induction.
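A minimal brute-force check of that closed formula (the helper functions are mine; inversions are counted as 1-before-0 pairs, which matches the table in the question):
from itertools import product

def inversions(s):
    # pairs i < j with s[i] = '1' and s[j] = '0'
    total = zeros_seen = 0
    for ch in reversed(s):
        if ch == '0':
            zeros_seen += 1
        else:
            total += zeros_seen
    return total

def total_inversions(n):
    # sum of inversions over all 2**n binary strings of length n
    return sum(inversions(''.join(bits)) for bits in product('01', repeat=n))

for n in range(3, 10):
    assert total_inversions(n) == n*(n - 1)*2**(n - 3)
print(total_inversions(3))   # 6, as in the example above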
It is an easy recurrence.
Assume that we know the answer for n-1.
Then to all previous sequences we add 0 or 1 as the first character.
If we add 0 as the first character, the count of inversions does not change, hence the contribution is the same as for n-1.
If we add 1 as the first character, the count of inversions is the same as before, plus extra inversions equal to the count of 0s across all previous sequences.
The total count of zeros and ones in all sequences of length n-1 is:
(n-1)*2^(n-1)
Half of them are zeros, which gives:
(n-1)*2^(n-2)
This means we have the following formula:
f(1) = 0
f(n) = 2*f(n-1) + (n-1)*2^(n-2)
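A quick sketch checking this recurrence against the closed formula from the other answer (the function f simply mirrors the recurrence above):
def f(n):
    # f(1) = 0, f(n) = 2*f(n-1) + (n-1)*2^(n-2)
    value = 0
    for m in range(2, n + 1):
        value = 2*value + (m - 1) * 2**(m - 2)
    return value

print(f(3))   # 6, matching the worked example
for n in range(3, 12):
    assert f(n) == n*(n - 1)*2**(n - 3)   # agrees with the closed formula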

Subsequences whose sum of digits is divisible by 6

Say I have a string whose characters are nothing but digits in the [0 - 9] range, e.g. "2486". Now I want to find out all the subsequences whose sum of digits is divisible by 6. E.g. in "2486", the subsequences are "6", "246" (2 + 4 + 6 = 12 is divisible by 6), "486" (4 + 8 + 6 = 18 is divisible by 6), etc. I know we can do this by generating all 2^n combinations, but that's very costly. What is the most efficient way to do this?
Edit:
I found the following solution somewhere in quora.
#include <stdio.h>
#include <string.h>

#define MAXLEN 1005   /* max number of digits */
#define MAXN   105    /* max value of the divisor n */

int len, n, ar[MAXLEN], dp[MAXLEN][MAXN];

/* memoised count over positions idx..len-1; m is the current remainder,
   and each digit is either skipped or taken */
int fun(int idx, int m)
{
    if (idx == len)
        return (m == 0);
    if (dp[idx][m] != -1)
        return dp[idx][m];
    int ans = fun(idx + 1, m);                   /* skip ar[idx] */
    ans += fun(idx + 1, (m*10 + ar[idx]) % n);   /* take ar[idx] */
    return dp[idx][m] = ans;
}

int main()
{
    /* input len, n, array */
    memset(dp, -1, sizeof(dp));
    printf("%d\n", fun(0, 0));
    return 0;
}
Can someone please explain what the logic behind the code '(m*10+ar[idx])%n' is? Why is m multiplied by 10 here?
Say you have a sequence of 16 digits. You could generate all 2^16 subsequences and test them, which is 65536 operations.
Or you could take the first 8 digits and generate the 2^8 possible subsequences, and sort them based on the result of their sum modulo 6, and do the same for the last 8 digits. This is only 512 operations.
Then you can generate all subsequences of the original 16-digit string that are divisible by 6 by taking each subsequence of the first list with a modulo value equal to 0 (including the empty subsequence) and concatenating it with each subsequence of the last list with a modulo value equal to 0.
Then take each subsequence of the first list with a modulo value equal to 1 and concatenate it with each subsequence of the last list with a modulo value equal to 5. Then 2 with 4, 3 with 3, 4 with 2 and 5 with 1.
So after an initial cost of 512 operations you can generate just those subsequences whose sum is divisible by 6. You can apply this algorithm recursively for larger sequences.
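A small Python sketch of that meet-in-the-middle idea (the brute-force enumeration of each half is acceptable because the halves are short; the function names are mine):
from itertools import combinations
from collections import defaultdict

def by_mod(digits, k=6):
    # group every subsequence of `digits` by its digit sum modulo k
    groups = defaultdict(list)
    for r in range(len(digits) + 1):
        for idx in combinations(range(len(digits)), r):
            s = sum(int(digits[i]) for i in idx) % k
            groups[s].append(''.join(digits[i] for i in idx))
    return groups

def divisible_subsequences(s, k=6):
    # split, classify each half by sum mod k, then pair residue r with (k - r) % k
    half = len(s) // 2
    left, right = by_mod(s[:half], k), by_mod(s[half:], k)
    out = []
    for r in range(k):
        for a in left.get(r, []):
            for b in right.get((k - r) % k, []):
                if a or b:                 # drop the empty + empty pairing
                    out.append(a + b)
    return out

print(divisible_subsequences("2486"))   # ['6', '24', '246', '48', '486']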
Create an array with a 6-bit bitmap for each position in the string. Work from right to left and set the bitmaps so that a bit is set when there is some subsequence starting just after that position whose sum (mod 6) equals that bit's position in the bitmap. You can do this from right to left using the bitmap just after the current position. If you see a 3 and the bitmap just after the current position is 010001 then sums 1 and 5 are already accessible by just skipping the 3. Using the 3, sums 4 and 2 are now available, so the new bitmap is 011011.
Now do a depth first search for subsequences from left to right, with the choice at each character being either to take that character or not. As you do this keep track of the mod 6 sum of the characters taken so far. Use the bitmaps to work out whether there is a subsequence to the right of that position that, added to the sum so far, yields zero. Carry on as long as you can see that the current sum leads to a subsequence of sum zero, otherwise stop and recurse.
The first stage has cost linear in the size of the input (for fixed values of 6). The second stage has cost linear in the number of subsequences produced. In fact, if you have to actually write out the subsequences visited (E.g. by maintaining an explicit stack and writing out the contents of the stack) THAT will be the most expensive part of the program.
The worst case is of course input 000000...0000 when all 2^n subsequences are valid.
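A rough Python sketch of this two-stage idea (sets of reachable remainders stand in for the 6-bit bitmaps, and the names are mine):
def divisible_by_six_subsequences(s, k=6):
    n = len(s)
    # Stage 1 (right to left): reachable[i] = remainders achievable by
    # some subsequence of s[i:], the empty one always contributing 0
    reachable = [None] * (n + 1)
    reachable[n] = {0}
    for i in range(n - 1, -1, -1):
        d = int(s[i])
        reachable[i] = reachable[i + 1] | {(d + r) % k for r in reachable[i + 1]}

    out = []
    # Stage 2: depth-first search left to right, pruning whenever no
    # completion to the right can bring the running sum to 0 mod k
    def dfs(i, total, picked):
        if (-total) % k not in reachable[i]:
            return
        if i == n:
            if picked:                  # skip the empty subsequence
                out.append(picked)
            return
        dfs(i + 1, total + int(s[i]), picked + s[i])   # take s[i]
        dfs(i + 1, total, picked)                      # skip s[i]
    dfs(0, 0, "")
    return out

print(divisible_by_six_subsequences("2486"))   # ['246', '24', '486', '48', '6']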
I'm pretty sure a user named amit recently answered a similar question for combinations rather than subsequences where the divisor is 4, although I can't find it right now. His answer was to create, in this case, six arrays (call them Array_i) in O(n), where each array contains the array elements with a modular relationship i with 6. With subsequences we also need a way to record element order. For example, in your case of 2486, our arrays could be:
Array_0 = [null,null,null,6]
Array_1 = []
Array_2 = [null,4,null,null]
Array_3 = []
Array_4 = [2,null,8,null]
Array_5 = []
Now just cross-combine the appropriate arrays, maintaining element order: Array_0, Array_2 & Array_4, Array_0 & any other combination of arrays:
6, 24, 48, 246, 486

How to extract dyadic fraction from float

Now, single- and double-precision floating-point numbers, although they can approximate any sort of number (the same could be said of integers; floats are just more precise), are represented as binary fractions internally. For example, one tenth would be approximated as
0.00011001100110011... (the ... only goes to the computer's precision, not infinity)
Now, any number in binary with finitely many bits has something called a dyadic fraction representation in mathematics (it has nothing to do with p-adics). This means you represent it as a fraction where the denominator is a power of 2. For example, let's say our computer approximates one tenth as 0.00011. The dyadic fraction for that is 3/32 or 3/(2^5), which is close to one tenth. Now for my technical question: what would be the simplest way to extract the dyadic fraction from a floating-point number?
Irrelevant Note: If you are wondering why I would want to do this, it is because I am creating a surreal number library in Haskell. Dyadic fractions are easily translated into surreal numbers, which is why it is convenient that binary is easily translated into dyadic. (I'll surely have trouble with the rational numbers, though.)
The decodeFloat function seems useful for this. Technically, you should also check that floatRadix is 2, but as far as I can see this is always the case in GHC.
Just be careful since it does not simplify mantissa and exponent. Here, if I evaluate decodeFloat (1.0 :: Double) I get an exponent of -52 and a mantissa of 2^52 which is not what I expected.
Also, toRational seems to generate a dyadic fraction. I am not sure this is always the case, though.
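For comparison, the same extraction is easy to sketch outside Haskell; here is a rough Python illustration of the idea (chosen only because it is compact, and not a substitute for the decodeFloat route above):
from fractions import Fraction

x = 0.1
num, den = x.as_integer_ratio()   # exact numerator and power-of-two denominator of the stored double
print(num, den)                   # 3602879701896397 36028797018963968 (the denominator is 2**55)
print(Fraction(x))                # the same dyadic fraction as an exact Fraction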
Hold your numbers in binary and convert to decimal for display.
Binary numbers are all dyadic. The digits after the binary point determine the power of two in the denominator, and the number read without the point is the numerator. That's binary numbers for you.
There is an ideal representation for surreal numbers in binary. I call them "sinary". It's this:
0s is Not a number
1s is zero
10s is neg one
11s is one
100s is neg two
101s is neg half
110s is half
111s is two
... etc...
So you see that the standard binary count matches the surreal birth order of numeric values when evaluated in sinary. The way to determine the numeric value of a sinary is that the 1's are rights and the 0's are lefts. We start with +/-1's and then 1/2, 1/4, 1/8, etc., with sign equal to + for 1 and - for 0.
ex: evaluating sinary
1011011s
-> is the 91st surreal number (because 64+16+8+2+1 = 91)
-> with a value of −0.28125, because...
1011011
NLRRLRR
+-++-++
+ 0 − 1 + 1/2 + 1/4 − 1/8 + 1/16 + 1/32
= 0 − 32/32 + 16/32 + 8/32 − 4/32 + 2/32 + 1/32
= − 9/32
The surreal numbers form a binary tree, so there is an ideal binary format matching their location on the tree according to the left/right pattern followed to reach the number. Assign 1 to right and 0 to left. Then the birth order of a surreal number is equal to the binary count of this representation, i.e. the 15th surreal number value represented in sinary is the 15th number in the standard binary count. The value of a sinary is the surreal label value. Strip the leading bit from the representation, and start adding +1's or -1's depending on whether the number starts with 1 or 0 after that first one. Then once the bit flips, begin adding and subtracting halved values (1/2, 1/4, 1/8, etc.) using + or - according to the bit value 1/0.
I have tested this format and it seems to work well. And there are some other secrets... such as the left and right of any sinary representation being the same binary format with the tail clipped to the last 0 and the last 1 respectively. Conversion into a decimal or a dyadic is NOT required in order to perform the recursive functions requested by Conway.
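Here is a small Python sketch of that evaluation rule (the function name, the optional trailing 's', and the use of Fraction are my own choices for illustration; the string is assumed to start with 1, i.e. to be a number):
from fractions import Fraction

def sinary_value(bits):
    # Strip an optional trailing 's' marker, then the leading 1.
    # A leading run of equal bits adds +/-1 each; after the first flip,
    # each bit adds +/-1/2, 1/4, 1/8, ... (1 means plus, 0 means minus).
    bits = bits.rstrip('s')
    rest = bits[1:]            # '1s' (zero) leaves nothing to evaluate
    value = Fraction(0)
    if not rest:
        return value
    step = Fraction(1)
    i = 0
    while i < len(rest) and rest[i] == rest[0]:
        value += step if rest[i] == '1' else -step
        i += 1
    while i < len(rest):
        step /= 2
        value += step if rest[i] == '1' else -step
        i += 1
    return value

print(sinary_value('1011011s'))   # -9/32, matching the worked example above
print(sinary_value('110s'))       # 1/2
print(sinary_value('101s'))       # -1/2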

Conversion of numeric to string in MATLAB

Suppose I want to convert the number 0.011124325465476454 to a string in MATLAB.
If I hit
mat2str(0.011124325465476454,100)
I get 0.011124325465476453 which differs in the last digit.
If I hit num2str(0.011124325465476454,'%5.25f')
I get 0.0111243254654764530000000
which is padded with undesirable zeros and differs in the last digit (3 should be 4).
I need a way to convert numerics with an arbitrary number of decimals to their EXACT string matches (no zeros padded, no final-digit modification).
Is there such a way?
EDIT: Since I didn't have in mind the info about precision that Amro and nrz provided, I am adding some additional info about the problem. The numbers I actually need to convert come from a C++ program that outputs them to a txt file and they are all of the C++ double type. [NOTE: The part that inputs the numbers from the txt file to MATLAB is not coded by me and I'm actually not allowed to modify it to keep the numbers as strings without converting them to numerics. I only have access to this code's "output", which is the numerics I'd like to convert]. So far I haven't gotten numbers with more than 17 decimals (NOTE: consequently the example provided above, with 18 decimals, is not very indicative).
Now, if the number has 15 digits eg 0.280783055069002
then num2str(0.280783055069002,'%5.17f') or mat2str(0.280783055069002,17) returns
0.28078305506900197
which is not the exact number (see last digits).
But if I hit mat2str(0.280783055069002,15) I get
0.280783055069002 which is correct!!!
There are probably a million ways to "code around" the problem (e.g. create a routine that does the conversion), but isn't there some way, using MATLAB's standard built-ins, to get the desired result when I input a number with an arbitrary number of decimals (but no more than 17)?
My HPF toolbox also allows you to work with an arbitrary precision of numbers in MATLAB.
In MATLAB, try this:
>> format long g
>> x = 0.280783054
x =
0.280783054
As you can see, MATLAB writes it out with the digits you have posed. But how does MATLAB really "feel" about that number? What does it store internally? See what sprintf says:
>> sprintf('%.60f',x)
ans =
0.280783053999999976380053112734458409249782562255859375000000
And this is what HPF sees, when it tries to extract that number from the double:
>> hpf(x,60)
ans =
0.280783053999999976380053112734458409249782562255859375000000
The fact is, almost all decimal numbers are NOT representable exactly in floating point arithmetic as a double. (0.5 or 0.375 are exceptions to that rule, for obvious reasons.)
However, when stored in a decimal form with 18 digits, we see that HPF did not need to store the number as a binary approximation to the decimal form.
x = hpf('0.280783054',[18 0])
x =
0.280783054
>> x.mantissa
ans =
2 8 0 7 8 3 0 5 4 0 0 0 0 0 0 0 0 0
What niels does not appreciate is that decimal numbers are not stored in decimal form as a double. For example what does 0.1 look like internally?
>> sprintf('%.60f',0.1)
ans =
0.100000000000000005551115123125782702118158340454101562500000
As you see, matlab does not store it as 0.1. In fact, matlab stores 0.1 as a binary number, here in effect...
1/16 + 1/32 + 1/256 + 1/512 + 1/4096 + 1/8192 + 1/65536 + ...
or if you prefer
2^-4 + 2^-5 + 2^-8 + 2^-9 + 2^-12 + 2^-13 + 2^-16 + ...
To represent 0.1 exactly, this would take infinitely many such terms since 0.1 is a repeating number in binary. MATLAB stops at 52 bits. Just like 2/3 = 0.6666666666... as a decimal, 0.1 is stored only as an approximation as a double.
This is why your problem really is completely about precision and the binary form that a double comprises.
As a final edit after chat...
The point is that MATLAB uses a double to represent a number. So it will take in a number with up to 15 decimal digits and be able to spew them out with the proper format setting.
>> format long g
>> eps
ans =
2.22044604925031e-16
So for example...
>> x = 1.23456789012345
x =
1.23456789012345
And we see that MATLAB has gotten it right. But now add one more digit to the end.
>> x = 1.234567890123456
x =
1.23456789012346
In its full glory, look at x, as MATLAB sees it:
>> sprintf('%.60f',x)
ans =
1.234567890123456024298320699017494916915893554687500000000000
So always beware the last digit of any floating point number. MATLAB will try to round things intelligently, but 15 digits is just on the edge of where you are safe.
Is it necessary to use a tool like HPF or MP to solve such a problem? No, as long as you recognize the limitations of a double. However tools that offer arbitrary precision give you the ability to be more flexible when you need it. For example, HPF offers the use and control of guard digits down in that basement area. If you need them, they are there to save the digits you need from corruption.
You can use the Multiple Precision Toolkit from the MATLAB File Exchange for arbitrary precision numbers. Floating point numbers do not usually have a precise base-10 representation.
That's because your number is beyond the precision of the double numeric type (it gives you between 15 and 17 significant decimal digits). In your case, it is rounded to the nearest representable number as soon as the literal is evaluated.
If you need more precision than what the double-precision floating-points provides, store the numbers in strings, or use arbitrary-precision libraries. For example use the Symbolic Toolbox:
sym('0.0111243254654764549999999')
You cannot get an EXACT string, since the number is stored in the double type (or even the long double type).
The number stored will be subtly more or less than the number you give.
A computer only knows the binary digits 0 and 1. You must know that a number in one radix may not be expressible the same way in another radix. For example, take the number 1/3: radix 10 yields 0.33333333... (the ellipsis indicates that there would still be more digits to come, here the digit 3), and it will be truncated to 0.333333; radix 3 yields 0.1000000, see, no more and no less, exactly the amount; radix 2 yields 0.01010101..., so it will likely be truncated to 0.01010101 in the computer, which is 85/256, less than 1/3 by rounding, and the next time you fetch the number it won't be the one you want.
So from the beginning you should store the number as a string instead of a float type, otherwise it will lose precision.
Considering the precision problem, MATLAB provides symbolic computation with arbitrary precision.
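The same storage effect can be checked outside MATLAB as well; for instance, a small Python illustration (used here only because it exposes the stored value conveniently, the principle is identical for any IEEE 754 double):
from decimal import Decimal

x = 0.011124325465476454
print(Decimal(x))   # the exact decimal expansion of the double actually stored
print(repr(x))      # the shortest decimal string that round-trips back to that double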

Float to String: What is an Exponent part?

I've written a small function in C which does almost the same work as the standard function `fcvt'. As you may know, this function takes a float/double and makes a string representing this number in ANSI characters. Everything works ;-)
For example, for the number 1.33334 my function gives me the string "133334" and sets up a special integer variable `decimal_part', which in this example will be 1, meaning the decimal part has only 1 symbol and everything else is the fraction.
Now I'm curious about what the standard C function `printf' does. It can take %a or %e as a format string. Let me cite the text for %e (link junked):
"double" argument is output in scientific notation
[-]m.nnnnnne+xx
... The exponent always contains two digits.
It says: "The exponent always contains two digits". But what is an exponent? This is the main question. And also, how do I get this 'exponent' from my function above or from `fcvt'?
The notation might be better explained if we expand the e:
[-]m.nnnnnn * (10^xx)
So you have one digit of m (from 0 to 9, but it will only ever be 0 if the entire value is 0), and several digits of n. I guess it might be best to show with examples:
1 = 1.0000 * 10^0 = 1e0
10 = 1.0000 * 10^1 = 1e1
10000 = 1.0000 * 10^4 = 1e4
0.1 = 1.0000 * 10^-1 = 1e-1
1,419 = 1.419 * 10^3 = 1.419e3
0.00000123 = 1.23 * 10^-5 = 1.23e-5
You can look up scientific notation on Google, but it is useful for expressing very large or small numbers; for example, 1232100000000000000 would be 1.2321e18.
In C, I think you can actually extract the exponent from the top 12 bits (the first being the sign, which you will have to ignore). See: IEEE 754-1985 Floating Point
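If it helps, here is a rough Python sketch of both ideas, the decimal exponent that %e reports and the raw IEEE 754 exponent field mentioned above (the helper names are mine):
import struct

def decimal_exponent(x):
    # the xx in the printf-style m.nnnnnne+xx form (base 10)
    return int(f"{x:e}".split("e")[1])

def ieee754_exponent(x):
    # the 11-bit exponent field that follows the sign bit of a double
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return ((bits >> 52) & 0x7FF) - 1023   # subtract the bias to get the base-2 exponent

print(decimal_exponent(1234.5678))   # 3, since 1234.5678 ~ 1.2345678e+03
print(ieee754_exponent(1234.5678))   # 10, since 1234.5678 ~ 1.2056... * 2**10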
The exponent is the power to which 10 is raised; the result is then multiplied by the significand (the m.nnnnnn part).
Scientific notation is explained on Wikipedia: http://en.wikipedia.org/wiki/Scientific_notation
m.nnnnnne+xx is logically equal to m.nnnnnn * 10 ^ +xx
In scientific notation, the exponent is the power that ten is raised to, so 1234.5678 can be represented as 1.2345678E03, where the normalized form is multiplied by 10^3 to get the "real" answer.
400 = 4 * 10 ^ 2
2 is the exponent.
If you write a number in scientific notation then the exponent is part of that notation.
You can see a full description here: http://en.wikipedia.org/wiki/Scientific_notation, but basically it's just another way to write a number, typically used for very large or very small numbers.
Say you have the number 300, that is equal to 3 * 100, or 3 * 10^2 in scientific notation.
If you use %e it will be printed as 3.0e+02
