How to input audio as bytes in moviepy - python-3.x

I have audio as bytes in the form of:
b'ID3\x04\x00\x00\x00\x00\x00#TSSE\x00\x00\x00\x0f\x00\x00\x03Lavf57.71.100\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\...
I got it from Amazon Web Services (Polly):
import boto3
client = boto3.client('polly')
response = client.synthesize_speech(
    Engine='neural',
    LanguageCode='en-GB',
    OutputFormat='mp3',
    SampleRate='8000',
    Text='hey whats up this is a test',
    VoiceId='Brian'
)
I want to load it into moviepy as an audio clip using
AudioFileClip()
AudioFileClip takes filename or an array representing a sound. I know I can save the audio as a file and read it, but I would like to have AudioFileClip take the bytes output I showed above.
I tried:
AudioFileClip(response['AudioStream'].read())
But this gives the error:
TypeError: endswith first arg must be bytes or a tuple of bytes, not str
What can I do?

You need to convert the audio stream to a different type (that's why it's called a TypeError). The call is mixing str and bytes where only one of them is expected.
You can convert a str to bytes by using the bytearray function:
https://docs.python.org/3/library/functions.html#func-bytearray
You can also look at this question:
Best way to convert string to bytes in Python 3?
For more help, just comment on this answer and I'll try to help you as soon as possible.
Hope this can help you on your project,
PythonMasterLua
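Another reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the message suggests that endswith was called on a bytes object with a str argument, which is what would happen if AudioFileClip treated the bytes as a filename. Under that assumption, the file-based workaround mentioned in the question can at least be kept tidy with a temporary file. The helper name bytes_to_temp_audio_file and the sample bytes are hypothetical, and the moviepy call is left as a comment because it is not verified here.

```python
import os
import tempfile


def bytes_to_temp_audio_file(audio_bytes: bytes, suffix: str = ".mp3") -> str:
    """Write raw audio bytes to a temporary file and return its path (a str)."""
    # delete=False keeps the file on disk after the handle is closed,
    # so another library can open it by name afterwards.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(audio_bytes)
        return tmp.name


# Stand-in for response['AudioStream'].read(); real MP3 data would go here.
fake_mp3 = b"ID3\x04\x00\x00\x00\x00\x00#"
path = bytes_to_temp_audio_file(fake_mp3)

# Read the file back to confirm the bytes were written unchanged.
with open(path, "rb") as f:
    round_tripped = f.read()

# With moviepy installed, the path could then be passed on, for example:
#   clip = AudioFileClip(path)

# Remove the temporary file once it is no longer needed.
os.remove(path)
```

The point of the design is that the library receives a str path, so no str/bytes mix-up can occur, and the temporary file is removed as soon as the clip no longer needs it.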

Related

Converting b-string to png in Python 3.9.6

I have been trying to convert this b-string to a png image.
Here is the byte string for a barcode received from an API, the Cloudmersive 1D barcode generator API.
I have tried to use base64.b64decode() and then write binary to an image file but it does not work. I also tried using BytesIO but that does not work either.
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00h\x00\x00\x00d\x08\x02\x00\x00\x00\xe5\xbc\xe2\x8d\x00\x00\x00\x01sRGB\x00\xae\xce\x1c\xe9\x00\x00\x00\x04gAMA\x00\x00\xb1\x8f\x0b\xfca\x05\x00\x00\x00\tpHYs\x00\x00\x0e\xc3\x00\x00\x0e\xc3\x01\xc7o\xa8d\x00\x00\x0c\x8aIDATx^\xed\x95\xd1\x95d\xc5\x0e\x041\x0f\x830\x07_p\xe5y\xc2SU\xce\xc4\x84\xa4Z\x18\xf6{\xe3C'\x94\xca{\xbb\xa7\x81\xc3o\x7f\x8b\xdf~;+\xd3R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\
xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1
a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe
3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86
\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083M\xdbG\xd5R\x9c\x86\xf2\xb8'\xdc\xe3\xbb\t;\xc4}\x1a\x85\x11Z\xf0"\xbe\xc3\xac\x84\x968\x12\x083\xcd\xdc\x7f\xf1M~\xfdp?\xc9\xaf\x1f\xee'\xf9\xc6\x0f\xf7\xd7\x1f\xf7\xbf\xf7\xe2\xf7?\xff\xf7\x91}R7\x87\xff\xfb\xf3\xf7\x1fU{s\xbf\xf3+\xf9\xe4\x8f\xbf\xfa\xe1c?<?\x87b\xfb\xec\xef>\xfe\x95\x1dT\xfe\x01\xff\xfa\xc3\x9d\x0f\xcek\xee\xab\xfd\xc6|\xa7\xaf\xcf>{\xb6S\xf5\xd7\xef\xcds\xfd|\xcf\xd7\xeb\x85\x1e\xaf{L\xc5\x97\xea\xcb\xf9\xc3\xbf\xfb\xf8\xd5\x0f\xfb\x1e\xff\xf6\xc3\xe9\xd5w\xf9\xf8F\xf7{\xfe\xf6\xfb\x1f\x7f\xe8\x07\xfa\xba\x9aW\xd3\xec\xa7\x9e\x7fx\xe2\xfbM\xda\x13\x9f\xcb3\xfc\xfe\xe3\xe7\xfc\x95~\x87\xff\xf8o\x1c_\xe3\xaf\xbf>>n\x7f\xb3\xce\xa3)N<\x9e\xfa\xfa\xc0\xf1\xce\xf6\x17\x7f\xf2\x19\xb6\xe3\xe7+\xbe\xff\xf8\x95\xdf\xcf?\xe2\xf3Oy\xd5\x1e|\xe7\x7f\x0e\xe7{\xbc_\xe8\x9f\xe3~\x87\xbf\xce\x0fq\xe8?\xc6\xeb\x87\xbb\xbf\xd9zi\xef\xfd\xdb_~\xbeY\xb2\xfb\xb6\x8f\xea\xfd\xbe\xdf\xf9\xe1\xfa\xe3\x1fW\xbf\xe9\x1f\xf8\xd6\xbfq\x1f\xaf<\xda\xdf\xa8\xcf\xcb\xd7\xd5\x87\x7f\xe6a'\x97\xf9-G\xed\xac?\xfc\xcb\xef\xb3J\xee\x178\xfc\xf1\xe7\xc7c\xff\xe9\xf1/\xf6\xdf\xf9\xe0\xdf~\xb8\xf9\xd9\xfd\x8d\xe7\xfa\xf9\xd9\xed\xd8\x1e;\xb8\xd9\xe8\x7f\xcf\xf8\xeb\xe6\xa7\x7f\xdd\xce\xe5\xfd\xc2\xa2\x8e\xf7\xf4\xf3\x8f\xf7/\xff\xe2?\xfdp\xeb\x8d'\xf8\xfc\xf8\xef7\x1b\xed\xef\xa9\xa5?\xa6\xab^y\xf4\x87\x7f\xb6\x9b?\xf5x\xffJ?\xe2_\xffS=\x9f\xf2\xf1\x9a\xf5g\xf9X|}\xe0?7u\xed/8[\x7f\xec\xab\xfb\xf5P\x7f\xe6\x03\x85\xfe\xf0\xef>\xaeg\x8e\xae\xfb\xe6\xbf\xfc\xcf\xe1\x1f\xbf\xee\xe5\xec\xe1\xf3\x9b\x7f\xd2\x9b_\xbdV\xd4\xd7\xff\x82O\xd7\x1f\xd6\xc9\xe1\xeb\x9d\xedk\xfe\xf7\xc7\xf7Wx\xf1\x9d\x1f\xee\x17\x0f~\xfdp?\xc9\xaf\x1f\xee
\xa7\xf8\xfb\xef\xff\x03\x11\xda\xa3\xaefM\x89\xbf\x00\x00\x00\x00IEND\xaeB`\x82'
There's no need to use b64decode or any other operation on that byte string, it's ready to write to the file as is.
with open(r'c:\temp\temp.png', 'wb') as f:
    f.write(b_str)
It produces the barcode image.
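One way to convince yourself that the bytes are already a PNG is to check the 8-byte PNG signature before writing. The following is a minimal sketch: the helper looks_like_png is hypothetical, and b_str here holds only the first few bytes of the string from the question, so the file it writes is not a viewable image.

```python
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def looks_like_png(data: bytes) -> bool:
    """Return True if the data starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


# Only the leading bytes of the byte string shown in the question.
b_str = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"

# Base64 text of a PNG starts with ASCII letters instead, so it fails the check.
base64_text = b"iVBORw0KGgo="

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "temp.png")
    if looks_like_png(b_str):
        # Raw PNG bytes are written as-is, in binary mode, with no decoding.
        with open(out_path, "wb") as f:
            f.write(b_str)
    with open(out_path, "rb") as f:
        written = f.read()
```

If the check fails and the data looks like ASCII text, that is the case where base64.b64decode would be needed first.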

Converting a nodejs buffer to string and back to buffer gives a different result in some cases

I created a .docx file.
Now, I do this:
// read the file to a buffer
const data = await fs.promises.readFile('<pathToMy.docx>')
// Converts the buffer to a string using 'utf8' but we could use any encoding
const stringContent = data.toString()
// Converts the string back to a buffer using the same encoding
const newData = Buffer.from(stringContent)
// We expect the values to be equal...
console.log(data.equals(newData)) // -> false
I don't understand in what step of the process the bytes are being changed...
I already spent sooo much time trying to figure this out, without any result... If someone can help me understand what part I'm missing out, it would be really awesome!
A .docx file is not a UTF-8 string (it's a binary ZIP file), so when you read it into a Buffer object and then call .toString() on it, you're assuming it is already encoded as UTF-8 in the buffer and you now want to move it into a JavaScript string. That's not what you have. Your binary data will likely contain byte sequences that are invalid in UTF-8, and those will be discarded or coerced into valid UTF-8, causing an irreversible change.
What Buffer.toString() does is take a Buffer that is ALREADY encoded in UTF-8 and put it into a JavaScript string. See this comment in the docs:
If encoding is 'utf8' and a byte sequence in the input is not valid UTF-8, then each invalid byte is replaced with the replacement character U+FFFD.
So, the code you show in your question is wrongly assuming that Buffer.toString() takes binary data and reversibly encodes it as a UTF8 string. That is not what it does and that's why it doesn't do what you are expecting.
Your question doesn't describe what you're actually trying to accomplish. If you want to do something useful with the .docx file, you probably need to actually parse it from its binary ZIP file form into the actual components of the file in their appropriate format.
Now that you explain you're trying to store it in localStorage, you need to encode the binary data into a string format. One popular option is Base64; it isn't super efficient (size-wise), but it is better than many others. See "Binary Data in JSON String. Something better than Base64" for prior discussion on this topic. Ignore the notes about compression in that other answer because your data is already ZIP compressed.
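The same effect can be reproduced outside Node. The sketch below uses Python purely as an analogue (the Buffer API itself is not involved, and the sample bytes are made up): decoding arbitrary binary data as UTF-8 with replacement characters is lossy, while a Base64 round trip returns the original bytes.

```python
import base64

# Hypothetical sample: a ZIP local-file signature followed by bytes
# that are not valid UTF-8.
data = b"PK\x03\x04\xff\xfe\x80binary"

# Lossy path: every invalid byte becomes U+FFFD, similar to what the
# Node documentation describes for the 'utf8' encoding.
as_text = data.decode("utf-8", errors="replace")
lossy_round_trip = as_text.encode("utf-8")

# Reversible path: Base64 maps any bytes to ASCII text and back.
as_base64 = base64.b64encode(data).decode("ascii")
restored = base64.b64decode(as_base64)

print("UTF-8 round trip preserved the bytes:", lossy_round_trip == data)
print("Base64 round trip preserved the bytes:", restored == data)
```

The Base64 string is safe to keep in any text-only store, at the cost of roughly a third more characters than the original byte count.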

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte while accessing csv file

I am trying to read a CSV file from an AWS S3 bucket and I get the error 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte. My code is below; I am using Python 3.7.
from io import BytesIO, StringIO
import boto3
import pandas as pd
import gzip
s3 = boto3.client('s3', aws_access_key_id='######',
                  aws_secret_access_key='#######')
response = s3.get_object(Bucket='#####', Key='raw.csv')
# print(response)
s3_data = StringIO(response.get('Body').read().decode('utf-8'))
data = pd.read_csv(s3_data)
print(data.head())
Kindly help me out: how can I resolve this issue?
Using gzip worked for me:
import gzip
import boto3
import pandas as pd
client = boto3.client('s3', aws_access_key_id=aws_access_key_id,
                      aws_secret_access_key=aws_secret_access_key)
# '####' and '###' stand for the bucket name and the object key
csv_obj = client.get_object(Bucket='####', Key='###')
body = csv_obj['Body']
with gzip.open(body, 'rt') as gf:
    csv_file = pd.read_csv(gf)
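A likely reason gzip helps here: gzip streams start with the two magic bytes 0x1f 0x8b, and the error complains about 0x8b at position 1. The sketch below checks for that signature before decoding. The helper name maybe_gunzip and the sample CSV are hypothetical, and it uses only the standard library so it runs without S3 access.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_gunzip(raw: bytes) -> bytes:
    """Decompress the data if it carries the gzip signature, else return it unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


# Stand-in for response['Body'].read(): a small CSV, gzip-compressed.
compressed = gzip.compress(b"a,b\n1,2\n")

# Byte 1 of the compressed data is 0x8b, the byte named in the error message.
second_byte = compressed[1]

text = maybe_gunzip(compressed).decode("utf-8")
plain = maybe_gunzip(b"a,b\n1,2\n").decode("utf-8")
```

Checking the signature first means the same code path handles both compressed and uncompressed objects in the bucket.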
The error you're getting means the CSV file you're getting from this S3 bucket is not encoded using UTF-8.
Unfortunately the CSV file format is quite under-specified and doesn't really carry information about the character encoding used inside the file... So either you need to know the encoding, or you can guess it, or you can try to detect it.
If you'd like to guess, popular encodings are ISO-8859-1 (also known as Latin-1) and Windows-1252 (which is roughly a superset of Latin-1). ISO-8859-1 doesn't have a character defined for 0x8b (so that's not the right encoding), but Windows-1252 uses that code to represent a left single angle quote (‹).
So maybe try .decode('windows-1252')?
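Guessing can be made a little more systematic by trying candidate encodings in order. This is only a sketch: the helper decode_with_fallback and the sample bytes are made up for illustration, and a successful decode does not prove the guess is right, only that the bytes are valid in that encoding.

```python
def decode_with_fallback(raw, encodings=("utf-8", "windows-1252")):
    """Try each encoding in turn; return the decoded text and the encoding that worked."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# 0x8b is not valid on its own in UTF-8, but Windows-1252 maps it to '‹'.
sample = b"caf\x8b"
text, used = decode_with_fallback(sample)
print(repr(text), used)
```

UTF-8 goes first because it rejects most data that is not really UTF-8, whereas single-byte encodings accept almost anything.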
If you'd like to detect it, look into the chardet Python module which, given a file or BytesIO or similar, will try to detect the encoding of the file, giving you what it thinks the correct encoding is and the degree of confidence it has in its detection of the encoding.
Finally, I suggest that, instead of using an explicit decode() and using a StringIO object for the contents of the file, store the raw bytes in an io.BytesIO and have pd.read_csv() decode the CSV by passing it an encoding argument.
import io
s3_data = io.BytesIO(response.get('Body').read())
data = pd.read_csv(s3_data, encoding='windows-1252')
As a general practice, you want to delay decoding as much as you can. In this particular case, having access to the raw bytes can be quite useful, since you can write a copy of them to a local file (which you can then inspect with a text editor or in Excel).
Also, if you want to do detection of the encoding (using chardet, for example), you need to do so before you decode it, so again in that case you need the raw bytes, so that's yet another advantage to using the BytesIO here.

need suggestion to read specific frame

I want to read a specific frame from a byte array. I got the byte array by reading the video file in binary read mode. I know that I can read a specific frame with the cv2 module, but I want to read it from the byte array and save that frame as an image.
I have tried to read it like a text file, but failed to read the full frame.
f1 = open('video.mp4', 'rb')
f2 = open('image.jpg', 'wb')
frame_count = 0
# iterating over a binary file yields chunks split at b'\n', not video frames
for frame in f1:
    frame_count += 1
    if frame_count == 50:
        f2.write(frame)
        f2.close()
        break
f1.close()
I got a byte string using that method, but it did not work as expected.

TypeError: POST data should be bytes or an iterable of bytes. It cannot be str

I just updated from python 3.1 to python 3.2 (formatted HD) and one of my scripts stopped working. It gives me the error in the title.
I would fix it myself but I don't even know what an iterable of bytes is lol. I tried typecasting bytes(data) but that didn't work either. TypeError: string argument without an encoding
url = "http://example.com/index.php?app=core&module=global&section=login&do=process"
values = {"username" : USERNAME,
          "password" : PASSWORD}
data = urllib.parse.urlencode(values)
req = urllib.request.Request(url, data)
urllib.request.urlopen(req)
It crashes at the last line.
Works in 3.1, but not 3.2
You were basically right to try to convert the string into bytes, but you did it the wrong way. Python doesn't have typecasting (so what you did was not typecasting).
The way to do it is to encode the text data into bytes data, which you do with the encode function:
binary_data = data.encode('encoding')
What 'encoding' should be depends. You should probably use 'ascii' here. If you have characters that aren't ASCII, then you need to use another encoding, typically 'utf8', but then you also need to tell the receiving webserver that it is UTF-8. It might also not want UTF8, but then you have to ask it, and it's getting complicated. :-)
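Put together with the code from the question, the fix might look like the sketch below. The credentials are placeholders, and the actual urlopen call is left as a comment so the example runs without network access.

```python
import urllib.parse
import urllib.request

url = "http://example.com/index.php?app=core&module=global&section=login&do=process"
values = {"username": "alice", "password": "secret"}  # placeholder credentials

# urlencode returns a str such as 'username=alice&password=secret'.
text_data = urllib.parse.urlencode(values)

# Encode the str into bytes before handing it over as POST data.
binary_data = text_data.encode("ascii")

# Supplying data makes this a POST request.
req = urllib.request.Request(url, data=binary_data)

# urllib.request.urlopen(req) would send the request; it is omitted here.
print(req.get_method(), binary_data)
```

Percent-encoding has already reduced the payload to ASCII characters at this point, which is why 'ascii' is a reasonable choice for the final encode step.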
@Enders, I know this is an old question, but I'd like to explain a few more things for anybody fighting with this issue.
The issue is specifically with this line of code:
data = urllib.parse.urlencode(values)
Here you are encoding the data in values with urlencode.
If you refer to the urllib.parse documentation (https://docs.python.org/3/library/urllib.parse.html) and scroll to the bottom to find what urlencode does, you will see that it encodes your user/pass into a text string:
Convert a mapping object or a sequence of two-element tuples, which may contain str or bytes objects, to a percent-encoded ASCII text string. If the resultant string is to be used as a data for POST operation with the urlopen() function, then it should be encoded to bytes, otherwise it would result in a TypeError.
Perhaps what you are trying to do here is some kind of encryption of your user/password, but I don't really think this is the right way. If it is, then you probably need to make sure that the receiving end (the destination of your URL) knows that you're encoding your user/pass this way.
A more up-to-date approach is to use the powerful Requests library. It supports very common authentication protocols: http://docs.python-requests.org/en/master/user/authentication/
In this case, I'd do something like this:
requests.get(url, auth=('user', 'pass'))

Resources