I'm trying to convert a 32-bit number to decimal in python. I'm quite new at python and so I'm not sure how to go about it. What I have so far is something like
file=open('filepath', 'rb')
num=file.read(4)
The value for num looks something like
b'\x05\x00\x00\x00'
How can I easily convert this to an integer value that can be stored? Eventually I will want to read in every value of this file and store them to be plotted later.
Thanks!
There is a module called struct which is helpful for unpacking bytes into integers or other types.
import struct
struct.unpack('<i', b'\x05\x00\x00\x00') # '<' means little-endian, 'i' means signed 32-bit integer
This returns the tuple (5,), which you can unpack into a variable or index directly, as needed.
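Since the question mentions reading every value in the file, here is a minimal sketch of that. It assumes the file holds nothing but little-endian signed 32-bit integers, which the question does not actually state. An in-memory io.BytesIO stands in for the real file so the snippet runs as-is, and read_int32_values is a hypothetical helper name.

```python
import io
import struct

def read_int32_values(stream):
    """Return every little-endian signed 32-bit integer in a binary stream."""
    data = stream.read()
    # Ignore any trailing bytes that do not form a complete 4-byte value.
    usable = len(data) - (len(data) % 4)
    # iter_unpack yields one tuple per 4-byte chunk; take its single field.
    return [value for (value,) in struct.iter_unpack('<i', data[:usable])]

# Stand-in for open('filepath', 'rb'): three values packed back to back.
fake_file = io.BytesIO(b'\x05\x00\x00\x00\x0a\x00\x00\x00\xff\xff\xff\xff')
values = read_int32_values(fake_file)
print(values)  # [5, 10, -1]

# For a single value, int.from_bytes does the same job without struct.
single = int.from_bytes(b'\x05\x00\x00\x00', byteorder='little', signed=True)
print(single)  # 5
```

The resulting list can be handed to a plotting library directly. If the file turns out to be big-endian or unsigned, the format would be '>i' or '<I' instead.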
I'm going to import the txt file which contains Numbers only, for some coding practice.
I noticed that I get the same result with either code_1 or code_2:
code_1 = np.array(pd.read_csv('e:/data.txt', sep='\t', header=None)).astype(np.float)
code_2 = np.array(pd.read_csv('e:/data.txt', sep='\t', header=None))
So I wonder if there is any difference between using or not using .astype(np.float)?
Please tell me if there is a similar question. Thanks a lot.
The DataFrame.astype() method casts a pandas object to a specified dtype. It can also convert any suitable existing column to a categorical type.
It comes in handy when you want to cast a particular column to another data type.
In your case, the file is loaded as a DataFrame, and pandas infers each column's dtype from its contents: whole numbers are loaded as integers, and values with a decimal point are loaded as floats. astype(np.float) converts the numbers to floats. (In your code it is called on the NumPy array returned by np.array(...), where it works the same way.) If the numbers are already floats then, as you saw, there is no difference between the two. Note that the np.float alias was removed in NumPy 1.24; on current versions, use astype(float) or astype(np.float64) instead.
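A small sketch of the difference, assuming a tab-separated file that contains only whole numbers. The in-memory text is a stand-in for 'e:/data.txt', and plain float is used in place of np.float so it runs on current NumPy versions.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for 'e:/data.txt': a tab-separated file of whole numbers.
text = "1\t2\n3\t4\n"

raw = np.array(pd.read_csv(io.StringIO(text), sep='\t', header=None))
converted = raw.astype(float)  # plain float means float64

print(raw.dtype)        # an integer dtype, because every value parsed as a whole number
print(converted.dtype)  # float64
print((raw == converted).all())  # True: same values, different dtype
```

So the two versions hold equal values, but the dtype differs whenever the file contains only integers, and that affects later operations such as integer division or storing NaN.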
I have a little-endian hex string (for example, 'E61000003C9BFAE53893') that I'm trying to convert to a double. I've tried the following:
struct.unpack('<d', binascii.unhexlify('E61000003C9BFAE53893'))
but I keep getting
struct.error: unpack requires a buffer of 8 bytes
I checked the output of binascii.unhexlify('E61000003C9BFAE53893'), and it looks correct:
>> print (binascii.unhexlify('E61000003C9BFAE53893'))
b'\xe6\x10\x00\x00<\x9b\xfa\xe58\x93'
so I'm not sure what the issue is.
For some context, I have a bunch of coordinate data encoded as WKB, but geopandas only supports WKT. I thought it would be easy to write a function to convert one to the other (or WKB to floats), but it's proving more challenging than I expected.
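0xE61000003C9BFAE53893 is too long to be a double. A double is 8 bytes, and those 20 hex digits decode to 10 bytes. The output looks shorter only because Python prints bytes that happen to be printable ASCII as characters: the '<' is the byte 0x3c and the '8' is the byte 0x38.
With the format '<d', struct.unpack accepts only a buffer of exactly 8 bytes, as the error message says.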
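A short sketch that confirms the length and shows what a correctly sized buffer looks like. The 8-byte hex string '000000000000F83F' is an illustrative value (the little-endian encoding of 1.5), not taken from the question's data. Which eight bytes of a WKB record hold a coordinate depends on the WKB layout, which this fragment alone does not show.

```python
import binascii
import struct

raw = binascii.unhexlify('E61000003C9BFAE53893')
print(len(raw))  # 10: twenty hex digits are ten bytes, two more than a double

try:
    struct.unpack('<d', raw)
    error_message = None
except struct.error as exc:
    error_message = str(exc)  # the size complaint from the question

# An 8-byte little-endian buffer unpacks cleanly; this one encodes 1.5.
(value,) = struct.unpack('<d', binascii.unhexlify('000000000000F83F'))
print(value)  # 1.5
```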
python 3.6.5
numpy 1.14.3
scipy 1.0.1
cerberus 1.2
I'm trying to convert a string '6.1e-7' to a float 0.00000061 so I can save it in a mongoDb field.
My problem here is that float('6.1e-7') doesn't work (it will work for float('6.1e-4'), but not float('6.1e-5') and more).
Python float
I can't seem to find any information about why this happens or about float limitations, and every example I found shows a conversion at e-3, never beyond that.
Numpy
I installed Numpy to try float96()/float128(). float96() doesn't exist, and float128() returns a float '6.09999999999999983e-07'.
Format
I tried format(6.1E-07, '.8f'), which works, as it returns the string '0.00000061', but when I convert the string to a float (so it can pass cerberus validation) it reverts back to '6.1E-7'.
Any help on this subject would be greatly appreciated.
Thanks
'6.1e-7' is a string:
>>> type('6.1e-7')
<class 'str'>
While 6.1e-7 is a float:
>>> type(6.1e-7)
<class 'float'>
0.00000061 is the same as 6.1e-7
>>> 0.00000061 == 6.1e-7
True
And, internally, this float is represented by 0's and 1's. That's just yet another representation of the same float.
However, when converted into a string, they're no longer compared as numbers, they are just characters:
>>> '0.00000061' == '6.1e-7'
False
And a string never compares equal to a number either:
>>> 0.00000061 == '6.1e-7'
False
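Putting the points above together, a minimal sketch showing that float() parses the string from the question directly, and that the exponent form is only how the value is displayed:

```python
text = '6.1e-7'
value = float(text)  # float() accepts scientific notation directly

print(value == 0.00000061)   # True: same number, different spelling
print(repr(value))           # 6.1e-07
print(format(value, '.8f'))  # 0.00000061

# Going through the fixed-point string and back changes nothing numerically.
round_tripped = float(format(value, '.8f'))
print(round_tripped == value)  # True
```

The float that reaches the database is the same either way; only the text shown by print or repr differs.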
Your problem description is too tangled to follow precisely, but I'll try to guess what you mean.
In their internal format, numbers keep no formatting information, neither integers nor floats. For the integer 123, you can't recover whether it was written as "123", " 123 " (with any number of spaces around it), 000000123 or +0123. For the floating-point number 0.1, the forms 0.1, +0.0001e00003, 1.000000e-1 and countless others can be used. Internally, they all result in the same number.
(There are some subtleties here if you use IEEE 754 decimal floating point, but I am sure that is not your case.)
When saving to a database, the in-memory representation stops mattering. Instead, the database's own rules come into play, and they can be quite different. For example, SQL offers column types like numeric(10,4), and each value is converted to the decimal format that matches the column type (how it is stored on disk depends on the database). In MongoDB, you can store a floating-point value either as a number (a BSON double, which is an IEEE 754 double) or as text. Each option has its own trade-offs, but if you choose text, it is your responsibility to format it properly every time you build that text. Do you want a fixed-point decimal number with 8 digits after the point? No problem: just format with %.8f every time you produce that representation.
The issues to consider when choosing a representation are:
Uniqueness: the same value should not have several different forms. Otherwise you could, for example, store the same contents under multiple keys and then mistake an older entry for the latest one.
Ordering awareness: the DB should be able to provide the natural order of values, for requests like "ceiling key-value pair".
If you always format values using %.8f, you get uniqueness but not ordering. The same goes for %g, %e and really any other text format, except special (not human-readable) ones that are constructed to preserve ordering. If you need ordering, just store numbers as numbers, and don't worry about how they look in text form.
(And your problem has nothing to do with numpy.)
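To illustrate the ordering point, here is a small sketch with made-up sample values. Text formatted with %.8f sorts character by character, which does not match numeric order.

```python
values = [9.5, 10.25, 100.0]
as_text = ['%.8f' % v for v in values]

print(sorted(values))   # [9.5, 10.25, 100.0]
print(sorted(as_text))  # ['10.25000000', '100.00000000', '9.50000000']
```

A range query on the text field would therefore return 9.5 after 100.0, which is why storing the value as a number is the safer choice when ordering matters.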
Okay. Here is my minimal working example. When I type this into python 3.6.2:
foo = '0.670'
str(foo)
I get
>>>'0.670'
but when I type
foo = 0.670
str(foo)
I get
>>>'0.67'
What gives? It is stripping off the zero, which I believe has to do with representing a float on a computer in general. But by using the str() method, why can it retain the extra 0 in the first case?
You are mixing up strings and floats. A string is a sequence of code points (one code point represents one character) representing some text, and the interpreter treats it as text. A string literal is always wrapped in single or double quotes (e.g. 'Hello'). A float is a number, and Python knows that, so it also knows that 1.0000 is the same as 1.0.
In the first case you stored a string in foo. Calling str() on a string just returns that string as is.
In the second case you stored 0.670 as a float (because it's not wrapped in quotes). When Python converts a float to a string, it produces the shortest string that converts back to the same float.
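A quick sketch of that behaviour on Python 3: the short string loses nothing numerically, and the trailing zero comes back as soon as you ask for a specific format.

```python
as_float = 0.670

print(str(as_float))                     # 0.67
print(float(str(as_float)) == as_float)  # True: the short form round-trips
print(format(as_float, '.3f'))           # 0.670
```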
Why does Python automatically drop the trailing zero?
When you store a real number in a computer's memory, you have to convert it to a binary representation. Usually (with some exceptions) it is stored in the format described by the IEEE 754 standard, and Python uses that format for floats too.
Let's look at an example:
from struct import pack
x = -1.53
y = -1.53000
print("X:", pack(">d", x).hex())
print("Y:", pack(">d", y).hex())
The pack() function takes a value and converts it to bytes according to the given format (>d). Here it takes a float and shows us how it is stored in memory. If you run the code, you will see that x and y are stored in exactly the same way. The memory holds no information about how the number was written.
Of course you could store that information as well, but:
It would take extra memory, and it's good practice to use only as much memory as you actually need.
What would the result of 0.10 + 0.1 be? Should it be 0.2 or 0.20?
For scientific purposes and significant figures shouldn't it leave the value as the user defined it?
It doesn't matter how you wrote the input number. What matters is the format you want to use for presenting it. As I said, str() always produces the shortest string that still identifies the float. That is fine for simple scripts or tests. For scientific purposes (or wherever a particular representation is required) you can convert your numbers to strings in whatever format you need.
For example:
x = -1655484.4584631
y = 42.0
# always print the number with a sign and exactly 5 digits after the decimal point
print("{:+.5f}".format(x)) # -1655484.45846
print("{:+.5f}".format(y)) # +42.00000
# always print the number in scientific format with 2 digits after the point (sign is shown only when the number is negative)
print("{:-.2e}".format(x)) # -1.66e+06
print("{:-.2e}".format(y)) # 4.20e+01
For more information about formatting numbers and other types, see the Python documentation.
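If the trailing zeros themselves need to be kept as part of the value (for significant figures, say), one option that goes beyond the formatting shown above is the standard library's decimal module. This is a different technique from float formatting, sketched here under the assumption that the inputs are available as strings:

```python
from decimal import Decimal

measured = Decimal('0.670')  # build from a string, not from a float
print(measured)              # 0.670

total = Decimal('0.10') + Decimal('0.1')
print(total)                 # 0.20
```

Decimal keeps the exponent of its input, so the written precision survives arithmetic, at the cost of slower operations than plain floats.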
I have a dictionary as follows
d={'apples':1349532000000, 'pears':1349532000000}
Doing either str(d) or repr(d) results in the following output
{'apples': 1349532000000L, 'pears': 1349532000000L}
How can I get str, repr, or print to display the dictionary without it adding an L to the numbers?
I am using Python 2.7
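You can't, because the L suffix marks a Python 2 long (an arbitrary-precision integer), and repr() of a long always includes it. A plain int is limited to the size of the platform's C long, which is 32 bits on some builds, and these numbers don't fit into 32 bits, so Python stores them as longs. Both str() and repr() of a dictionary call repr() on its values, which is why both show the L: repr() is meant to identify the object unambiguously, and the suffix is how Python 2 shows that the value is a long rather than an int.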
Well, this is a little embarrassing; I just found a solution after quite a few attempts :-P
One way to do this (which suits my purpose) is to use json.dumps() to convert the dictionary to a string.
d={'apples':1349532000000, 'pears':1349532000000}
import json
json.dumps(d)
Outputs
'{"apples": 1349532000000, "pears": 1349532000000}'