How to fix: AttributeError: 'bytes' object has no attribute 'insert' - python-3.x

I am writing a Python file server and I want to be able to recv all the data in the buffer. So I made a send function that sends the data along with its file size, but when I send the data I get an AttributeError.
I have tried looping around the recv function but couldn't get it to work, so I decided to send the size of the data being sent and let the server/client check how much data they have received; if it's less than the amount they were told, they keep receiving.
def getsizeof(data, tl=0):
    for i in data:
        tl += sys.getsizeof(i)
    return tl

def send(conn, request):
    length = getsizeof(request)
    length_size = sys.getsizeof(length)
    byte_size = length_size + length
    ###########################################################
    request.insert(0, byte_size)  # THIS IS WHERE I GET 'AttributeError: 'bytes' object has no attribute 'insert''
    ###########################################################
    conn.sendall(pickle.dumps(request))  # Encoding & sending data
    print(byte_size)
I am getting the attribute error shown above; once that is fixed, I think it should recv all the data in the buffer and send correctly along with the byte size of the sent data.
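A common fix here is to prepend a fixed-width length header rather than insert() into the payload: bytes objects are immutable, so they have no insert() method, and sys.getsizeof() reports the Python object's memory footprint rather than the number of bytes on the wire (len() gives that). A minimal sketch, where send_msg, recv_msg, and recv_exact are illustrative names:

import struct

def send_msg(conn, payload):
    # bytes are immutable, so prepend a 4-byte big-endian length header
    # instead of trying to insert() into the payload
    conn.sendall(struct.pack('>I', len(payload)) + payload)

def recv_exact(conn, n):
    # keep calling recv() until exactly n bytes have arrived
    buf = b''
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('socket closed mid-message')
        buf += chunk
    return buf

def recv_msg(conn):
    (length,) = struct.unpack('>I', recv_exact(conn, 4))
    return recv_exact(conn, length)

With this framing, the receiver always knows exactly how many bytes to wait for, so no guessing loop is needed.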

Related

Google Cloud Speech2Text "long_running_recognize" response object un-iterable

When running a speech-to-text API request against Google Cloud services (the audio is over 60s, so I need to use the long_running_recognize function and retrieve the audio from a Cloud Storage bucket), I do get a text response, but I cannot iterate through the LongRunningResponse object that is returned, which renders the info inside semi-useless.
When using just the "client.recognize()" function, I get a similar response to the long-running one, except that when I check for the results in the short form, I can iterate through the object just fine, contrary to the long response.
I run nearly identical parameters through each recognize function (a 1m40s audio for the long-running call, and a 30s one for the short recognize, both from my cloud bucket).
short_response = client.recognize(config=config, audio=audio_uri)

subs_list = []
for result in short_response.results:
    for alternative in result.alternatives:
        for word in alternative.words:
            if not word.start_time:
                start = 0
            else:
                start = word.start_time.total_seconds()
            end = word.end_time.total_seconds()
            t = word.word
            subs_list.append(((float(start), float(end)), t))
print(subs_list)
The function above works fine; the ".results" attribute correctly returns objects that I can read further attributes from and iterate through. I use the for loops to create subtitles for a video.
I then try a similar thing on the long_running_recognize, and get this:
long_response = client.long_running_recognize(config=config, audio=audio_uri)
#1
print(long_response.results)
#2
print(long_response.result())
Output from #1 returns an error:
AttributeError: 'Operation' object has no attribute 'results'. Did you mean: 'result'?
Output from #2 returns the info I need, but when checking "type(long_response.result())" I get:
<class 'google.cloud.speech_v1.types.cloud_speech.LongRunningRecognizeResponse'>
which I suppose is not an iterable object, and I cannot figure out how to apply a similar process as I do with the recognize function to get subtitles the way I need.
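In case it helps: long_running_recognize returns an Operation handle, not the response itself. Calling .result() blocks until the operation finishes and returns the LongRunningRecognizeResponse, and that response object does have the same iterable .results attribute as the short form. A sketch, assuming the same config and audio_uri as above:

operation = client.long_running_recognize(config=config, audio=audio_uri)
long_response = operation.result(timeout=300)  # block until the operation finishes

subs_list = []
for result in long_response.results:  # .results is iterable here, like the short form
    for alternative in result.alternatives:
        for word in alternative.words:
            start = word.start_time.total_seconds() if word.start_time else 0
            end = word.end_time.total_seconds()
            subs_list.append(((float(start), float(end)), word.word))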

getting scheduling error when forwarding s3 object to flask response

I'm using the following to pull a data file from an s3-compliant server:
try:
    buf = io.BytesIO()
    client.download_fileobj(project_id, path, buf)
    body = buf.getvalue().decode("utf-8")
except botocore.exceptions.ClientError as e:
    if defaultValue is not None:
        return defaultValue
    else:
        raise S3Error(project_id, path, e) from e
else:
    return body
The code generates this error:
RuntimeError: cannot schedule new futures after interpreter shutdown
In general, I'm simply trying to read an s3-compliant file into the body of a response object. The caller of the above snippet is as follows:
data = read_file(project_id, f"{PATH}/data.csv")
response = Response(
    data,
    mimetype="text/csv",
    headers=[
        ("Content-Type", "application/octet-stream; charset=utf-8"),
        ("Content-Disposition", "attachment; filename=data.csv")
    ],
    direct_passthrough=True
)
Playing with the code, when I don't get the runtime error the request hangs instead, in that no response is ever returned.
Thank you to anyone with guidance.
I'm not sure how generic this answer will be; however, the combination of using boto to access the DigitalOcean version of the S3 implementation does not "strictly" permit using an object key that starts with /. Once I removed the offending leading character, the files downloaded as expected.
I base the boto specificity on the fact that I was able to read the same files using Haskell's amazonka.
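A minimal sketch of the workaround, assuming a read_file wrapper like the one implied above (the helper name and signature are illustrative):

import io

def read_file(client, project_id, path):
    # DigitalOcean's S3 endpoint rejects object keys with a leading "/",
    # so normalize the key before handing it to boto
    key = path.lstrip("/")
    buf = io.BytesIO()
    client.download_fileobj(project_id, key, buf)  # bucket, key, file object
    return buf.getvalue().decode("utf-8")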

how to handle when client sends more bytes than we can store and the buffer gets overwritten?

We have a socket in python3 that receives x bytes from a client. Our problem is that when the client sends more than x bytes, our buffer gets overwritten and we lose the previous bytes. We need a way to avoid losing the first bytes. We'll appreciate any help. Thanks!
class Connection(object):
    def __init__(self, socket, directory):
        self.sock = socket
        self.directory = directory

    def handle(self):
        while True:
            data = self.sock.recv(4096)
            if len(data) > 0:
                ...
we expect to stop the socket from receiving, or some other way to avoid losing the bytes that we already have in the buffer
You could do the following:
def receivallData(sock, buffer_size=4096):
    buf = sock.recv(buffer_size)
    while buf:
        yield buf
        if len(buf) < buffer_size:
            break
        buf = sock.recv(buffer_size)
You can read more on this here:
Python Socket Receive Large Amount of Data
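Since receivallData is a generator, the caller has to consume it, e.g. (a sketch, with conn assumed to be a connected socket):

data = b''.join(receivallData(conn))  # accumulate every yielded chunk into one bytes object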
you can follow this logic:
create a buffer in which you store all the received data
append the data you receive at each iteration of the loop to this buffer so you won't lose it
check whether you have received the full data in order to process it
The example below shows how to create the buffer and append the data to it (in the example I exit the loop when no more data is available on the socket or the socket is closed):
total_data = []
while True:
    data = self.sock.recv(4096)
    if not data:
        break
    total_data.append(data)

# TODO: add processing on total_data
print(b"".join(total_data))  # recv() returns bytes in Python 3, so join with b""

Formatting Urllib Requests to Get Data From a Server

I am trying to use Python to access, pull, and plot data from a server (ERDDAP, if you're interested).
Here is the code that I am trying to use:
url = 'https://scoos.org/erddap/tabledap/autoss.csv?query'
values = {'time>=' : '2016-07-10T00:00:00Z',
          'time<' : '2017-02-10T00:00:00Z',
          'temperature' : 'temperature',
          'time' : 'time',
          'temperature_flagPrimary'}
data = urllib.parse.urlencode(values)
data.encode('utf-8')
resp = urllib.request.urlopen(req)
The error message that I get is: "POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str." I think this either has something to do with me trying to request a csv file or improperly using urllib. Any help would be greatly appreciated!
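The error means urlopen() was handed a str where it needs bytes: .encode() returns a new bytes object rather than modifying data in place, and the encoded result above is thrown away (req is also never defined). A minimal sketch of the fix, using only the two time constraints since the original values dict mixes keys with a bare string; note that ERDDAP normally expects its query in the URL itself, so this only addresses the bytes-vs-str error:

import urllib.parse
import urllib.request

url = 'https://scoos.org/erddap/tabledap/autoss.csv?query'
values = {'time>=': '2016-07-10T00:00:00Z',
          'time<': '2017-02-10T00:00:00Z'}

data = urllib.parse.urlencode(values).encode('utf-8')  # keep the encoded bytes
req = urllib.request.Request(url, data)                # build the Request object
resp = urllib.request.urlopen(req)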

Python3 socket cannot decode content

I'm facing a strange issue: I cannot decode the data received through a socket connection, while the same code works in Python 2.7. I know that the data received is a str in Python 2 and bytes in Python 3, but I don't understand why I get an error when I try to decode.
I'm sending exactly the same data (copy/paste to be sure), except that in Python 3 I need to call .encode() on the message to avoid receiving "TypeError: a bytes-like object is required, not 'str'".
Python2:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(15)
s.connect((SERVERIP, SERVERPORT))
s.send(message)

data = ''
while True:
    new_data = s.recv(4096)
    if not new_data:
        break
    data += new_data
s.close()
Python 3
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(15)
s.connect((SERVERIP, SERVERPORT))
s.send(message)

data = ''
while True:
    new_data = s.recv(4096)
    if not new_data:
        break
    data += new_data.decode('utf-8')  # same result with new_data.decode()
s.close()
Python 2 new_data content:
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x04\x00\x05\xc1\xdd\x12B#\x18\x00\xd0\x07r\xa3\xb6\xfdv]t\xa1T&\xb5d\x91\xd1tA\x98]F\xfeB\x1a\x0f\xdf9yu\x10s\xa3\xa29:\xdbl\xae\xe9\xe8\xd9H\xc8v\xa8\xd0K\x8c\xde\xd7\xef\xf9\xc4uf\xca\xfd \xdd\xb7\x0c\x9a\x84\xe9\xec\xb7\xf1\xf3\x97o\\k\xd5E\xc3\r\x11(\x9d{\xf7!\xdc*\x8c\xd5\x1c\x0b\xadG\xa5\x1e(\x97dO\x9b\x8f\x14\xaa\xddf\xd7I\x1e\xbb\xd4\xe7a\xe4\xe6a\x88\x8b\xf5\xa0\x08\xab\x11\xda\xea\xb8S\xf0\x98\x94\x1c\x9d\xa24>9\xbai\xd3\x1f\xe6\xcc`^\x91\xca\x02j\x1aLy\xccj\x0fdVn\x17#\xb0\xc1#\x80hX#\xb0\x06\n\x0b\xc0\xf2x\xfe\x01?\x05\x1f\xc1\xc5\x00\x00\x00'
Python 3 new_data content:
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x04\x00\x05\xc1\xdb\x12B#\x00\x00\xd0\x0f\xf2\xc0\xda\xb5\xcbC\x0f"-\xb9gPM\x0f\x85&\x8b)\xb7\x1d\x1a\x1f\xdf9\xe3\xbc\xbe\xfd\x9e\xd9A\xe3:\x851,\xcf\xc4\xe5\x865|\xa5\xcb\xbb\xcbs\xa8\x8f\xcc\x1b\xf7\x06\xc5\x8f\xfa\xba\x84\xd8>\xea\xc0\xa5b\xe6\xceC\xea\xd0\x88\xebM\t\xd7\xf8\xc1*#hI\xd6F\x80\xb3B[\xa7\x99\x91\xbe\x16%Q\xf5\x1d(\xa0\x93\x87\n\x13\xbe\x92\x91\xcc\xbfT\x98b\xd3\x0b=\xc0\xd5\xb3\xdf}\xcc\xc9\xb1\xe4\'\xb1\xe25\xcc{tl\xe5\x92\xf34x\xd5\xa1\xf9K\xa4\xa8k\xa8 dU\xd7\x1e\xce\xb4\x02\xean\xc3\x10#\x05\x13L\x14\xa0(H\xd2d\xb8a\xbc\xdd\xee\x7f\x1b\xe5\xf1\xd2\xc5\x00\x00\x00'
And so in Python 3 I receive this error when I try to decode:
'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
The data received is not the same; the difference starts after 'x12B#'. Does someone have an explanation?
I'm not managing the server side so don't ask me to check this side!
Thanks,
Matthieu
For Python 3 you need to work with bytes. The data you have is not a text string: the leading \x1f\x8b is the gzip magic number, which is exactly why the utf-8 decode fails on byte 0x8b, so don't try to interpret it as text.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(15)
s.connect((SERVERIP, SERVERPORT))
s.send(message)

data = b''
while True:
    new_data = s.recv(4096)
    if not new_data:
        break
    data += new_data
s.close()
That should be all you need to receive the data: start with an empty bytes object created using b'' or just bytes(). Be aware, though, that you are working with bytes when you come to process the data, so that code will probably need changing as well.
Your next step in processing this is probably:
import gzip
text = gzip.decompress(data)
and at this point it may be appropriate to change that to:
text = gzip.decompress(data).decode('ascii')
using whatever encoding is appropriate here (the sample data you posted only contains ASCII once decompressed, so that might be all you need; you might instead want utf-8 or some other encoding, but you'll have to find out what was used to encode the data, as you shouldn't attempt to guess). However, it looks like the data contains some pipe-separated fields, so you might want to split the fields first and decode or otherwise process them individually:
fields = gzip.decompress(data).split(b'|')
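Putting the pieces together, a minimal end-to-end sketch (SERVERIP, SERVERPORT, and message are assumed to be defined as in the question, with message already a bytes object):

import gzip
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(15)
s.connect((SERVERIP, SERVERPORT))
s.sendall(message)

# accumulate the raw bytes without decoding them
data = b''
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    data += chunk
s.close()

# decompress first, then split the pipe-separated fields and decode each one
fields = gzip.decompress(data).split(b'|')
print([f.decode('ascii') for f in fields])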
