In Python 2.x I can write a list of bytes to a serial port like this:
numbers=[0x40,0x00,0x99,0x54,0x78,0x13]
for x in numbers:
    ser.write(x)
Now that I'm converting to Python 3.8.6, it doesn't work. From what I read, in Python 3 all serial writes must be strings or "byte literals". What is the best way to convert my list of numbers into "byte literals" that I can send out the serial port? I don't really understand what "byte literal" means...
I figured it out.
numbers=[0x40,0x00,0x99,0x54,0x78,0x13]
numbers=bytes(numbers)
ser.write(numbers)
What confused me was that calling bytes() on a single integer n creates a bytes object of n zero bytes. Calling bytes() on a list of integers, however, converts the list into a bytes object with one byte per element.
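The difference between the two calls can be checked in an interpreter. This is a small sketch using only the built-in bytes type; the variable names are illustrative.

```python
# bytes(n) with an integer n creates n zero bytes.
zeros = bytes(3)
print(zeros)          # b'\x00\x00\x00'

# bytes(iterable) creates one byte per integer (each must be in range 0-255).
numbers = [0x40, 0x00, 0x99, 0x54, 0x78, 0x13]
payload = bytes(numbers)
print(payload)        # b'@\x00\x99Tx\x13'

# A single value therefore has to be wrapped in a list.
single = bytes([0x40])
print(single)         # b'@'
```

The printed form mixes ASCII characters and \x escapes, but the object still holds exactly the six values from the list.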
Related
How do I get "\x90" to be read as the byte value corresponding to the x86 NOP instruction when supplied as a field within the standard argument list in Linux? I have a buffer being stuffed all the way to 10 and then being overwritten into the next 8 bytes with the new return address, at least so I would like. Because the byte sequence being supplied is not read as a byte sequence but rather as characters, I do not know how to fix this. What next?
I am using MongoDB's built-in _id fields to label products. For ease of use and typability, I would like to compress the _id field from a hexadecimal string like 5b69c35ac2cc78c8979a8a9b down to something shorter that uses all letters of the alphabet (both uppercase and lowercase) and digits. Preferably it would be no more than 10 or 12 characters. Are there any common methods of accomplishing this in Node.js/MongoDB?
You could convert them to base64, which would make them 16 characters long.
Example:
Buffer.from('5b69c35ac2cc78c8979a8a9b', 'hex').toString('base64') // W2nDWsLMeMiXmoqb
It's better if you can access the Buffer directly, since converting many ObjectIds from strings could be costly.
The id 5b69c35ac2cc78c8979a8a9b is 24 hexadecimal characters long, which means the absolute minimum needed to represent this value without losing information is 12 bytes, each ranging from 0 to 255, and that is not what we want.
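The question is about Node.js, but the length arithmetic is easy to check in any language. The following is a Python sketch (a swapped-in language, not the answerer's code) that converts the same hex string to base64 and back; the variable names are illustrative.

```python
import base64

hex_id = "5b69c35ac2cc78c8979a8a9b"   # the ObjectId string from the question
raw = bytes.fromhex(hex_id)            # 12 raw bytes

short_id = base64.b64encode(raw).decode("ascii")
print(len(raw), short_id)              # 12 W2nDWsLMeMiXmoqb

# URL-safe variant: uses '-' and '_' instead of '+' and '/'.
url_id = base64.urlsafe_b64encode(raw).decode("ascii")

# Decoding restores the original hex string, so no information is lost.
restored = base64.b64decode(short_id).hex()
print(restored == hex_id)              # True
```

Twelve bytes always map to exactly 16 base64 characters, with no padding, because 12 is a multiple of 3.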
If we take a look at the ObjectId we could (maybe) eliminate some bytes:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
Removing the machine identifier and process id (if all ids are generated by the same process) would leave us with 7 bytes (each 0-255), which is still not ideal: that is 56 bits, or 10 characters in base64 and 12 in base32.
So it would probably be better to just use a 32-bit unsigned integer for the product codes and display it as hex using 8 characters (the leading zeros could be removed).
Encoding those 4 bytes in base64 wouldn't help much (every 3 bytes become 4 characters), and personally I would prefer case-insensitive ids for use in URLs, which would leave us only with base32.
For better ease of use and typability than hexadecimal, those 4 bytes could be encoded in z-base-32 and would fit in 7 characters without padding (7 * 5 bits = 35 bits).
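To make the "7 characters" figure concrete, here is a minimal Python sketch of encoding a 32-bit integer with the z-base-32 alphabet. The helper names encode_u32 and decode_u32 are hypothetical, and padding the 32 bits with three trailing zero bits is one reasonable reading of the scheme, not a reference implementation.

```python
# z-base-32 alphabet (32 symbols, 5 bits each).
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"

def encode_u32(value: int) -> str:
    """Encode a 32-bit unsigned integer as 7 z-base-32 characters."""
    if not 0 <= value < 2**32:
        raise ValueError("value must fit in 32 bits")
    bits = value << 3                     # pad 32 bits to 35 (7 * 5) with zero bits
    return "".join(
        ZBASE32[(bits >> shift) & 0b11111]
        for shift in range(30, -1, -5)    # most significant group first
    )

def decode_u32(text: str) -> int:
    """Inverse of encode_u32 (hypothetical helper)."""
    bits = 0
    for ch in text:
        bits = (bits << 5) | ZBASE32.index(ch)
    return bits >> 3                      # drop the 3 padding bits

code = encode_u32(123456789)
print(code, len(code))
```

Every 32-bit value produces exactly 7 characters, all lowercase letters or digits, so the codes are case-insensitive by construction.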
I am new to Python 3. I am sending bytes across the wire.
When I send s.send(b'\x8f\x35\x4a\x5f') and look at the stack trace, I only see 5f4a358f.
However, if I create a variable:
test=(['\x8f\x35\x4a\x5f'])
print(str(''.join(test).encode()))
I receive b'\xc2\x8f5J_'
As you can see, there is an extra byte \xc2.
My question is two-fold:
1) Why does str.encode(), applied to a string that is already "encoded", add an extra byte \xc2, whereas the literal byte string b'\x8f\x35\x4a\x5f' has no extra encoding added?
2) If I am passing bytes into a variable that is used as a buffer to send data across a socket, how do I create and send a set of literal bytes (e.g. b'...') programmatically so that no \xc2 byte is added when they are sent across the wire?
Thank you all for your time! I really appreciate the help.
Because the string is not already encoded: it is text consisting of the code points U+008F U+0035 U+004A U+005F. When you encode it (as UTF-8, by default), U+008F becomes the two bytes C2 8F, which is where the extra byte comes from. Either use bytes in the first place or encode as Latin-1. Preferably use bytes.
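The three options can be compared side by side. This is a short sketch using the values from the question; the variable names are illustrative.

```python
text = '\x8f\x35\x4a\x5f'          # a str: four code points, not four bytes

utf8 = text.encode()                # UTF-8 is the default encoding
latin1 = text.encode('latin-1')     # Latin-1 maps U+0000..U+00FF to one byte each
literal = b'\x8f\x35\x4a\x5f'       # a bytes literal needs no encoding at all

print(utf8)                # b'\xc2\x8f5J_'
print(latin1)              # b'\x8f5J_'
print(latin1 == literal)   # True

# Building the same bytes programmatically from integers:
built = bytes([0x8f, 0x35, 0x4a, 0x5f])
print(built == literal)    # True
```

Building the buffer with bytes([...]) or bytes.fromhex() avoids text entirely, so no encoding step can add bytes.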
I'm working with Python 3 and cannot find an answer to my little problem.
My problem is sending a byte greater than 0x7F over the serial port of my Raspberry Pi.
example:
import serial
ser=serial.Serial("/dev/ttyAMA0")
a=0x7F
ser.write(bytes(chr(a), 'UTF-8'))
This works fine! The receiver gets 0x7F.
But if a equals 0x80:
a=0x80
ser.write(bytes(chr(a), 'UTF-8'))
the receiver gets two bytes: 0xC2 0x80.
If I change the encoding to UTF-16, the receiver reads
0xFF 0xFE 0x80 0x00
The receiver should get only 0x80!
What's wrong? Thanks for your answers.
The UTF-8 specification says that characters encoded as a single byte/octet start with a 0 bit. Because 0x80 is 10000000 in binary, it does not fit that pattern and is encoded as two bytes/octets, C2 80 (11000010 10000000). 0x7F is 01111111, so a reader knows it is only 1 byte/octet long.
UTF-16 represents text as 2-byte code units and can start with a Byte Order Mark, which tells the reader which octet is the most significant (the endianness). That is the FF FE in front of 80 00.
See the UTF-8 specification for full details, but essentially you are moving from the end of the 1-byte range to the start of the 2-byte range.
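The boundary between the 1-byte and 2-byte ranges can be observed directly. A minimal sketch, with illustrative variable names:

```python
one_byte = chr(0x7F).encode('utf-8')
two_bytes = chr(0x80).encode('utf-8')

# Show each encoded byte in binary to make the leading bits visible.
for encoded in (one_byte, two_bytes):
    print([format(b, '08b') for b in encoded])
# ['01111111']
# ['11000010', '10000000']

# UTF-16 with an explicit byte order emits no BOM; plain 'utf-16' prepends one.
utf16_le = chr(0x80).encode('utf-16-le')
print(utf16_le)   # b'\x80\x00'
```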
I don't understand why you want to send your own custom 1-byte words, but what you are really looking for is any SBCS (Single Byte Character Set) which has a character for those bytes you specify. UTF-8/UTF-16 are MBCS, which means when you encode a character, it may give you more than a single byte.
Before the UTF encodings came along, everything was SBCS, which meant that whatever code page you selected was coded using 8 bits. The problem arose when 256 characters were not enough, and separate code pages such as IBM273 (IBM EBCDIC Germany) and ISO-8859-1 (ANSI Latin 1; Western European) had to be created, each defining what a byte such as 0x2C meant. Both the sender and the receiver needed to use the same code page, or they wouldn't understand each other. There is further confusion because these SBCS code pages don't always use all 256 values, so 0x7F may not even have a meaning.
What you could do is encode it to something like codepage 737/IBM 00737, send the "Α" (Greek Alpha) character and it should encode it as 0x80.
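As a quick check of that suggestion, Python ships a cp737 codec, so the mapping can be tested without any serial hardware. A minimal sketch:

```python
alpha = '\u0391'                     # GREEK CAPITAL LETTER ALPHA
encoded = alpha.encode('cp737')      # single-byte code page: one char -> one byte
print(encoded)                       # b'\x80'
print(encoded.decode('cp737') == alpha)   # True
```

This only works for byte values to which the chosen code page assigns a character, which is why a code page covering all 256 values is the safer choice.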
If that doesn't work, note that pyserial's write() takes a bytes object rather than text, so the goal is simply to build the single byte yourself, for example by encoding with a single-byte code page:
a=0x80
ser.write(bytes(chr(a), 'ISO-8859-1'))
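The Latin-1 route and building the byte straight from the integer give the same result. The sketch below leaves the serial port out so it runs anywhere; the commented ser.write() call assumes a pyserial Serial object named ser, as in the question.

```python
a = 0x80

via_latin1 = bytes(chr(a), 'ISO-8859-1')   # text -> one byte per code point
direct = bytes([a])                        # integer -> byte, no text involved
print(via_latin1, direct, via_latin1 == direct)   # b'\x80' b'\x80' True

# With pyserial, either value could then be written, e.g.:
# ser.write(bytes([a]))
```

Skipping chr() altogether is the simpler choice when the data is numeric to begin with.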
I am receiving a packet through a serial port but when I receive the packet it is of class bytes and looks like this:
b'>0011581158NNNNYNNN +6\r'
How do I convert this to a normal string? When I try to extract information from it, the values appear to come out as decimal numbers.
You can call decode on the bytes object to convert it to a string, but that only works if the bytes object actually represents text:
>>> bs = b'>0011581158NNNNYNNN +6\r'
>>> bs.decode('utf-8')
'>0011581158NNNNYNNN +6\r'
To really parse the input, you need to know the format and what it actually means. To find out, identify the device connected to the serial port (a scanner? a robot? a receiver of some kind?) and look up its protocol. In your case it may be a text-based protocol, but you'll often find that the bytes stand for numbers, in which case you'll probably want to have a look at the struct module.
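Both cases can be sketched briefly. The first part uses the packet from the question; the binary layout in the second part (1-byte type, unsigned 16-bit id, signed 16-bit value, big-endian) is purely an assumption for illustration and not the format of any real device.

```python
import struct

bs = b'>0011581158NNNNYNNN +6\r'

# Indexing a bytes object yields integers, which is why values look "decimal".
print(bs[0])                        # 62, the code for '>'
text = bs.decode('ascii').strip()   # text protocol: decode, then work with str
print(text[1:5])                    # 0011

# Hypothetical binary packet, packed and unpacked with the same format string.
packet = struct.pack('>BHh', 1, 1158, -6)
msg_type, ident, value = struct.unpack('>BHh', packet)
print(msg_type, ident, value)       # 1 1158 -6
```

Slicing a bytes object (bs[0:1]) returns bytes, while indexing (bs[0]) returns an int, which is a common source of the "decimal" surprise.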