The problem:
What is the decimal number -234 in 2's complement using 16 bits?
Do I just have to convert 234 to binary?
Not quite: converting 234 to binary is only the first step. You then invert the bits and add 1 (or, equivalently, compute 2^16 - 234). The answer is 0xFF16, i.e. 1111 1111 0001 0110.
Also, you can try using WolframAlpha to verify your calculations.
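For reference, here is a quick way to check the result in Python (a small sketch):
>>> n, bits = -234, 16
>>> tc = n % (1 << bits)   # two's complement: 2**16 + n for negative n
>>> hex(tc), format(tc, '016b')
('0xff16', '1111111100010110')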
I am trying to understand the behavior of large numbers when cast to the Float data type in Spark.
The last select statement in the picture above gives a very abrupt output.
Thanks in advance!
The output is not abrupt. It is simply a demonstration of the limitations of the truncated floating-point representation. FloatType in Spark is backed by Java's float, a 32-bit IEEE 754 floating-point number. Its significand is 24 bits, but the leading bit is always 1 and is not stored, so only 23 bits are kept explicitly.
123456789.6 is 1.8396495223045348... × 2^26. 1.8396495223045348 is 1.110101101111001101000101011001... in binary. Limiting it to only 24 bits results in 1.11010110111100110100011 (the last bit is rounded up), which is 1.8396495580673218 in decimal. Multiply it by 2^26 and you get 123456792.
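Outside Spark, you can reproduce the same rounding by forcing the value through a 32-bit float, for instance with Python's struct module (a sketch, not Spark code):
>>> import struct
>>> struct.unpack('f', struct.pack('f', 123456789.6))[0]   # round-trip through a 32-bit float
123456792.0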
I thought I knew Base64 encoding, but I often see text encoded like this: iVBORw0KGgoAAAANSUhEUgAAAXwAAAG4C ... YPQr/w8B0CBr+DAkGQAAAABJRU5ErkJggg==. I mean, it ends with a double =. Why does it append a second padding character, if 8 bits are enough to fill out the remaining bits of the encoded text?
I found it. The length of Base64-encoded data must be divisible by 4; anything short of that is filled with = characters. The reason is a mismatch of unit sizes: modern systems use 8-bit bytes, but Base64 uses 6-bit characters, so the lowest common multiple is 24 bits, i.e. 3 bytes or 4 Base64 characters. The shortfall is padded with =.
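A quick illustration of the padding rule with Python's base64 module (a small sketch):
>>> import base64
>>> base64.b64encode(b'abc')   # 3 bytes -> 4 chars, no padding
b'YWJj'
>>> base64.b64encode(b'ab')    # 2 bytes -> one '='
b'YWI='
>>> base64.b64encode(b'a')     # 1 byte -> two '='
b'YQ=='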
In Python 3 (I am using 3.6), they decided to have math.floor return integral values.
That created the following problem for me. Suppose that we input a large float
math.floor(4.444444444444445e+85)
The output in this case is being
44444444444444447395279681404626730521364975775215375673863470153230912354225773084672
In Python2.7 the output used to be 4.444444444444445e+85.
Question 1: Is the output in 3.6 reproducible? In other words, what is it? Computing it several times on different computers gave me the same result, so I guess it is a value depending only on the input 4.444444444444445e+85. My guess is that it is the floor of the binary representation of that float. The factorization of the output is
2^232 × 3 × 17 × 31 × 131 × 1217 × 1933 × 13217
where that factor 2^232 is close to the 10^70 that the scientific notation has, but I am not completely sure.
Question 2: I think I know how to take a float 4.444444444444445e+85, extract its significand and exponent, and produce the actual integral value 4444444444444445*10**70 or the float 4.444444444444445e+85, which in my opinion seems a more honest value for the floor of float(4.444444444444445e+85). Is there a neat way to recover this (allow me to call it) honest floor?
OK, I retract calling the floor of the decimal representation 'honest'. Since the computer stores the numbers in binary, it is fair to call the output computed from the binary representation honest. That is, if my guess for Question 1 is correct.
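For reference, the extraction described above can be sketched with math.frexp (assuming the usual 53-bit double significand):
>>> import math
>>> x = 4.444444444444445e+85
>>> m, e = math.frexp(x)          # x == m * 2**e exactly, with 0.5 <= m < 1
>>> sig = int(m * 2**53)          # the 53-bit integer significand
>>> sig * 2**(e - 53) == math.floor(x)
True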
Displaying the output in hex should be helpful:
>>> import math
>>> math.floor(4.444444444444445e+85)
44444444444444447395279681404626730521364975775215375673863470153230912354225773084672
>>> hex(_)
'0x16e0c6d18f4bfb0000000000000000000000000000000000000000000000000000000000'
Note all the trailing zeroes! On almost all platforms, Python floats are represented by the hardware with a significand containing 53 bits, and a power-of-2 exponent. And, indeed,
>>> (0x16e0c6d18f4bfb).bit_length() # the non-zero part does have 53 bits
53
>>> 0x16e0c6d18f4bfb * 2**232 # and 232 zero bits follow it
44444444444444447395279681404626730521364975775215375673863470153230912354225773084672
So the integer you got back is, mathematically, exactly equal to the float you started with. Another way to see that:
>>> (4.444444444444445e85).hex()
'0x1.6e0c6d18f4bfbp+284'
If you want to work with decimal representations instead, see the docs for the decimal module.
Edit: as discussed in comments, perhaps what you really want here is simply
float(math.floor(x))
That will reproduce the same result Python 2 gave for
math.floor(x)
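For example (a small sketch):
>>> import math
>>> x = 4.444444444444445e+85
>>> float(math.floor(x))   # same value Python 2's math.floor returned
4.444444444444445e+85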
Having a file test2.py with the following contents:
print(2.0000000000000003)
print(2.0000000000000002)
I get this output:
$ python3 test2.py
2.0000000000000004
2.0
I thought a lack of memory allocated for the float might be causing this, but 2.0000000000000003 and 2.0000000000000002 need the same amount of memory.
IEEE 754 64-bit binary floating point always uses 64 bits to store a number. It can exactly represent a finite subset of the binary fractions. Looking only at the normal numbers, if N is a power of two in its range, it can represent a number of the form, in binary, 1.s*N where s is a string of 52 zeros and ones.
All the 32 bit binary integers, including 2, are exactly representable.
The smallest exactly representable number greater than 2 is 2.000000000000000444089209850062616169452667236328125. It is twice the binary fraction 1.0000000000000000000000000000000000000000000000000001.
2.0000000000000003 is closer to 2.000000000000000444089209850062616169452667236328125 than to 2, so it rounds up and prints as 2.0000000000000004.
2.0000000000000002 is closer to 2.0, so it rounds down to 2.0.
To store numbers between 2.0 and 2.000000000000000444089209850062616169452667236328125 would require a different floating point format likely to take more than 64 bits for each number.
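One way to see the exact values involved is Python's decimal module; Decimal(float) shows the exact binary value each literal was rounded to (a quick sketch):
>>> from decimal import Decimal
>>> Decimal(2.0000000000000003)
Decimal('2.000000000000000444089209850062616169452667236328125')
>>> Decimal(2.0000000000000002)
Decimal('2')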
Floats are not stored the way integers are, with each bit signaling a yes/no term of value 1, 2, 4, 8, 16, 32, ... that you add up to get the complete number. They are stored as sign + mantissa + exponent in base 2. Several combinations have special meanings (NaN, ±inf, -0, ...). Positive and negative numbers are identical in mantissa and exponent; only the sign differs.
At any given time they have a specific bit-length they are "put into". They cannot overflow.
They do, however, have limited accuracy; if you try to fit numbers into them that need more precision, you get rounding errors - that's what you see in your example.
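If you want to see that sign + exponent + mantissa split for yourself, here is one possible sketch in Python (assuming the usual 64-bit IEEE 754 double):
>>> import struct
>>> bits = format(struct.unpack('>Q', struct.pack('>d', -2.5))[0], '064b')
>>> bits[0], bits[1:12], bits[12:]   # sign, 11-bit exponent, 52-bit mantissa
('1', '10000000000', '0100000000000000000000000000000000000000000000000000')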
More on floats and storage (with example):
http://stupidpythonideas.blogspot.de/2015/01/ieee-floats-and-python.html
(which links to a more technical https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)
More on accuracy of floats:
- Floating Point Arithmetic: Issues and Limitations
I watched a video on YouTube about bits. After watching it I am confused about what size a number, string, or any character actually takes. What I have understood from the video is:
1 = 00000001 // 1 bit
3 = 00000011 // 2 bits
511 = 111111111 // 9 bits
4294967295 = 11111111111111111111111111111111 // 32 bits
1.5 = ? // ?
I just want to know whether the statements given above (except the one with the decimal point) are correct, or whether every number, string, or character takes 8 bytes. I am using a 64-bit operating system.
And what is the binary code of the decimal value 1.5?
If I understand correctly, you're asking how many bits/bytes are used to represent a given number or character. I'll try to cover the common cases:
Integer (whole number) values
Since most systems use 8-bits per byte, integer numbers are usually represented as a multiple of 8 bits:
8 bits (1 byte) is typical for the C char datatype.
16 bits (2 bytes) is typical for short values (and for int on older 16-bit systems).
32 bits (4 bytes) is typical for int or long values.
Each successive bit is used to represent a value twice the size of the previous one, so the first bit represents one, the second bit represents two, the third represents four, and so on. If a bit is set to 1, the value it represents is added to the "total" value of the number as a whole, so the 4-bit value 1101 (8, 4, 2, 1) is 8+4+1 = 13.
Note that the zeroes are still counted as bits, even for numbers such as 3, because they're still necessary to represent the number. For example:
00000011 represents a decimal value of 3 as an 8-bit binary number.
00000111 represents a decimal value of 7 as an 8-bit binary number.
The zero in the first number is used to distinguish it from the second, even if it's not "set" as 1.
An "unsigned" 8-bit variable can represent 2^8 (256) values, in the range 0 to 255 inclusive. "Signed" values (i.e. numbers which can be negative) are often described as using a single bit to indicate whether the value is positive (0) or negative (1), which would give an effective range of 2^7 (-127 to +127) either way, but since there's not much point in having two different ways to represent zero (-0 and +0), two's complement is commonly used to allow a slightly greater storage range: -128 to +127.
Decimal (fractional) values
Numbers such as 1.5 are usually represented as IEEE floating point values. A 32-bit IEEE floating point value uses 32 bits like a typical int value, but will use those bits differently. I'd suggest reading the Wikipedia article if you're interested in the technical details of how it works - I hope that you like mathematics.
Alternatively, non-integer numbers may be represented using a fixed point format; this was a fairly common occurrence in the early days of DOS gaming, before FPUs became a standard feature of desktop machines, and fixed point arithmetic is still used today in some situations, such as embedded systems.
Text
Simple ASCII or Latin-1 text is usually represented as a series of 8-bit bytes - in other words it's a series of integers, with each numeric value representing a single character code. For example, an 8-bit value of 00100000 (32) represents the ASCII space (' ') character.
Alternative 8-bit encodings (such as JIS X 0201) map those 2^8 number values to different visible characters, whilst yet other encodings may use 16-bit or 32-bit values for each character instead.
Unicode character sets (such as the 8-bit UTF-8 or 16-bit UTF-16) are more complicated; a single UTF-16 character might be represented as a single 16-bit value or a pair of 16-bit values, whilst UTF-8 characters can be anywhere from one 8-bit byte to four 8-bit bytes!
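For a feel of those variable lengths, here is a quick Python sketch showing how many bytes a few characters take in UTF-8 and UTF-16 (little-endian, no BOM):
>>> for ch in ('A', 'é', '€', '😀'):
...     print(ch, len(ch.encode('utf-8')), len(ch.encode('utf-16-le')))
...
A 1 2
é 2 2
€ 3 2
😀 4 4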
Endian-ness
You should also be aware that values spanning more than a single 8-bit byte are typically byte-ordered in one of two ways: little endian, or big endian.
Little Endian: A 16-bit value of 511 (0x01FF) would be represented as 11111111 00000001 (i.e. the least-significant byte comes first).
Big Endian: A 16-bit value of 511 (0x01FF) would be represented as 00000001 11111111 (i.e. the most-significant byte comes first).
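You can see the byte order directly, for example with Python's int.to_bytes (a quick sketch):
>>> (511).to_bytes(2, 'little').hex()   # least-significant byte first
'ff01'
>>> (511).to_bytes(2, 'big').hex()      # most-significant byte first
'01ff'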
You may also hear of mixed-endian, middle-endian, or bi-endian representations - see the Wikipedia article for further information.
There is a difference between a bit and a byte.
1 bit is a single digit 0 or 1 in base 2.
1 byte = 8 bits.
Yes, the statements you gave are correct.
The binary representation of 1.5 is 1.1 (or, padded, 001.100). However, this is just how we interpret binary; the way a computer stores numbers is different and depends on the compiler and platform. For example, C uses the IEEE 754 format. Google it to learn more.
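If you're curious, Python's float.hex() shows the stored significand and power-of-two exponent (a quick sketch):
>>> (1.5).hex()    # 0x1.8 is binary 1.1, i.e. 1 + 1/2
'0x1.8000000000000000p+0'
>>> (0.1).hex()    # 0.1 has no exact binary representation
'0x1.999999999999ap-4'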
Your OS being 64-bit means your CPU architecture is 64-bit.
A bit is one binary digit, e.g. 0 or 1.
A byte is eight bits, or two hex digits.
A nibble is half a byte, or 1 hex digit.
Words get a bit more complex. Originally a word was the number of bytes required to cover the range of addresses available in memory, but with the many hybrid architectures and memory schemes around, the term has become loose: a word is usually two bytes, a double word four bytes, etc.
In computing, everything comes down to binary, a combination of 0s and 1s. Characters, decimal numbers, etc. are representations.
So the character '0' in 7-bit (or 8-bit) ASCII is 00110000 in binary, 30 in hex, and 48 as a decimal (base 10) number. It's only '0' if you choose to 'see' it as a single-byte character.
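A quick sketch showing that same character in each base (using Python here for illustration):
>>> ord('0'), hex(ord('0')), format(ord('0'), '08b')
(48, '0x30', '00110000')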
Representing numbers with decimal points is even more varied. There are many accepted ways of doing that, but they are conventions not rules.
Have a look at 1's and 2's complement, Gray code, BCD, floating-point representation and such to get more of an idea.