Python 3: Requests response.iter_content: ChunkedEncodingError

I am using requests in streaming mode to perform a 'GET' download of very large remote CSV files, then chunking the response using response.iter_content(). This has been working for multiple data providers.
However, for one remote data provider, when using response.iter_content(), occasionally I am getting a ChunkedEncodingError, specifically:
ChunkedEncodingError: (
ProtocolError(
'Connection broken: IncompleteRead(921 bytes read, 103 more expected)',
IncompleteRead(921 bytes read, 103 more expected)),
)
Here is the Python 3 code. I would like to know how to resolve or work around this chunking exception:
import os

tmp_csv_chunk_sum = 0
# Binary mode ('wb') does not accept an encoding argument; passing one raises ValueError.
with open(file=tmp_csv_file_path, mode='wb') as csv_file_wb:
    try:
        for chunk in response.iter_content(chunk_size=8192):
            if not chunk:
                break
            # Count the bytes actually received; a chunk may be shorter than 8192.
            tmp_csv_chunk_sum += len(chunk)
            csv_file_wb.write(chunk)
            csv_file_wb.flush()
            os.fsync(csv_file_wb.fileno())
    except Exception as ex:
        self.logger.error(
            "Request CSV Download: Exception",
            extra={
                'error_details': str(ex),
                'chunk_total_sum': tmp_csv_chunk_sum
            }
        )
        raise
I truly appreciate assistance, Thank you
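One possible workaround, sketched under assumptions: retry the download from the last byte that was written whenever the stream breaks. The helper name download_with_resume and the open_stream callable are hypothetical, not part of requests. The sketch assumes the remote server honours HTTP Range requests and does not compress the body, so that byte offsets in the file match byte offsets on the server.

```python
def download_with_resume(open_stream, dest_path, retry_on, max_attempts=5):
    """Write a byte stream to dest_path, reopening it at the current offset on failure.

    open_stream(offset) is a caller-supplied (hypothetical) callable that returns
    an iterable of byte chunks starting at byte `offset` of the remote file.
    retry_on is a tuple of exception classes that should trigger a retry.
    """
    written = 0
    with open(dest_path, "wb") as fh:
        for attempt in range(1, max_attempts + 1):
            try:
                for chunk in open_stream(written):
                    if chunk:
                        fh.write(chunk)
                        written += len(chunk)
                return written
            except retry_on:
                # Give up after the last attempt; otherwise reopen at `written`.
                if attempt == max_attempts:
                    raise
    return written


# Hypothetical usage with requests (assumes the server supports Range requests):
#
#   def open_stream(offset):
#       headers = {"Range": f"bytes={offset}-"} if offset else {}
#       resp = requests.get(url, headers=headers, stream=True, timeout=30)
#       resp.raise_for_status()
#       return resp.iter_content(chunk_size=8192)
#
#   download_with_resume(open_stream, tmp_csv_file_path,
#                        retry_on=(requests.exceptions.ChunkedEncodingError,))
```

If the server ignores the Range header it will answer with the whole file again, so a real implementation would probably also check for a 206 status before appending.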

Related

Is there a string size limit when feeding .fromstring() method as input?

I'm working on multiple well-formed xml files, whose sizes range from 100 MB to 4 GB. My goal is to read them as strings and then import them as ElementTree objects using .fromstring() method (from xml.etree.ElementTree module).
However, as the process goes through and the string size increases, two exceptions occurred, both apparently related to memory restrictions:
xml.etree.ElementTree.ParseError: out of memory: line 1, column 0
OverflowError: size does not fit in an int
It looks like .fromstring() method enforces a string size limit to the input, around 1GB... ?
To debug this, I wrote a short script using a for loop:
xmlFiles_list = [path1, path2, ...]
for fp in xmlFiles_list:
    xml_fo = open(fp, mode='r', encoding="utf-8")
    xml_asStr = xml_fo.read()
    xml_fo.close()
    print(len(xml_asStr.encode("utf-8")) / 10**9)  # display string size in GB
    try:
        etree = cElementTree.fromstring(xml_asStr)
        print(".fromstring() success!\n")
    except Exception as e:
        print(f"Error :{type(e)} {str(e)}\n")
        continue
The output is as follows:
0.895206753
.fromstring() success!
1.220224531
Error :<class 'xml.etree.ElementTree.ParseError'> out of memory: line 1, column 0
1.328233473
Error :<class 'xml.etree.ElementTree.ParseError'> out of memory: line 1, column 0
2.567867904
Error :<class 'OverflowError'> size does not fit in an int
4.080672538
Error :<class 'OverflowError'> size does not fit in an int
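As a rough check of the OverflowError message, the sizes printed above can be compared with the largest value a signed 32-bit C int can hold. This is only arithmetic on the reported numbers; that the parser passes the length through a 32-bit int is an assumption read off the error text, and the helper name exceeds_c_int is made up for the illustration.

```python
INT_MAX = 2**31 - 1  # 2147483647, largest signed 32-bit integer


def exceeds_c_int(n_bytes):
    """Return True when a byte count cannot be stored in a signed 32-bit int."""
    return n_bytes > INT_MAX


# Sizes in GB as printed by the script in the question.
sizes_gb = [0.895206753, 1.220224531, 1.328233473, 2.567867904, 4.080672538]
results = [exceeds_c_int(round(size * 10**9)) for size in sizes_gb]
print(results)  # [False, False, False, True, True]
```

Only the two inputs that raised OverflowError are above the 2**31 - 1 threshold, which is consistent with that exception coming from a length check rather than from running out of RAM. The "out of memory" errors at about 1.2 GB are not explained by this arithmetic.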
I found multiple workarounds to avoid this issue: the .parse() method, or the lxml module for better performance. I just hope someone could shed some light on this:
Is there a specific string size limit in xml.etree.ET module and .fromstring() method ?
Why do I end up with two different exceptions as the string size increases ? Are they related to the same memory-allocation restriction ?
Python version/system: 3.9 (64 bits)
RAM: 32 GB
Hope my topic is clear enough, I'm new on stackoverflow
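For files of this size, a streaming parse avoids building one huge string in the first place. The sketch below uses xml.etree.ElementTree.iterparse, which reads from a file path or file object incrementally; the function name count_elements and the tag "item" are placeholders chosen for the example, not names from the question.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(source, tag):
    """Count elements named `tag`, reading `source` incrementally.

    `source` may be a file path or a binary file object.
    """
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's children and text once it has been handled,
            # so memory use does not grow with the size of the file.
            elem.clear()
    return count


sample = io.BytesIO(b"<root><item>1</item><item>2</item><other/></root>")
print(count_elements(sample, "item"))  # 2
```

The same function can be called with a path to one of the large files instead of the in-memory sample.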

Azure FaceAPI limits iteration to 20 items

I have a list of image urls from which I use MS Azure faceAPI to extract some features from the photos. The problem is that whenever I iterate more than 20 urls, it seems not to work on any url after the 20th one. There is no error shown. However, when I manually changed the range to iterate the next 20 urls, it worked.
Side note: on the free version, MS Azure Face allows only 20 requests/minute; however, even when I sleep for 60 s after every 10 requests, the problem still persists.
FYI, I have 360,000 urls in total, and so far I have made only about 1000 requests.
Can anyone help tell me why this happens and how to solve this? Thank you so much!
# My code
import time
import requests

i = 0
for post in list_post[800:1000]:
    i += 1
    try:
        image_url = post['photo_url']
        headers = {'Ocp-Apim-Subscription-Key': KEY}
        params = {
            'returnFaceId': 'true',
            'returnFaceLandmarks': 'false',
            'returnFaceAttributes': 'age,gender,headPose,smile,facialHair,glasses,emotion,hair,makeup,occlusion,accessories,blur,exposure,noise',
        }
        response = requests.post(face_api_url, params=params, headers=headers, json={"url": image_url})
        post['face_feature'] = response.json()[0]
    except (KeyError, IndexError):
        continue
    if i % 10 == 0:
        time.sleep(60)
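One likely reason no error is shown: if the service answers a throttled or failed request with a JSON object instead of a list of faces (an assumption about the response shape, not confirmed in this thread), then response.json()[0] raises KeyError on the dict, and the except clause silently skips the post. A small sketch of a helper that reports the reason instead; the name extract_first_face is hypothetical.

```python
def extract_first_face(status_code, payload):
    """Return (face, problem) for one API response.

    `payload` is the decoded JSON body. `problem` is None on success,
    otherwise a short description of why no face was extracted.
    """
    if status_code != 200:
        # Assumed: quota and rate-limit failures arrive with a non-200 status.
        return None, f"HTTP {status_code}: {payload}"
    if not isinstance(payload, list):
        return None, f"unexpected response body: {payload}"
    if not payload:
        return None, "no face detected"
    return payload[0], None


# Example with made-up payloads:
print(extract_first_face(200, [{"faceId": "abc"}]))
print(extract_first_face(429, {"error": {"message": "rate limit exceeded"}}))
print(extract_first_face(200, []))
```

In the loop above this would be called as extract_first_face(response.status_code, response.json()), logging the second element whenever it is not None.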

The free version has a max of 30 000 requests per month; your 356 000 faces will therefore take about a year to run.
The standard version costs USD 1 per 1000 requests, giving a total cost of USD 360. This option supports 10 transactions per second.
https://azure.microsoft.com/en-au/pricing/details/cognitive-services/face-api/
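The figures in this answer can be checked with a few lines of arithmetic. The sketch uses the 360,000 URLs from the question together with the quota and price quoted above; the prices are the ones stated in this answer and may have changed since.

```python
import math

TOTAL_URLS = 360_000          # from the question
FREE_MONTHLY_QUOTA = 30_000   # free tier, requests per month (as quoted above)
USD_PER_1000_REQUESTS = 1.0   # standard tier price (as quoted above)
STANDARD_TPS = 10             # standard tier, transactions per second

months_on_free_tier = math.ceil(TOTAL_URLS / FREE_MONTHLY_QUOTA)
standard_cost_usd = TOTAL_URLS / 1000 * USD_PER_1000_REQUESTS
standard_hours = TOTAL_URLS / STANDARD_TPS / 3600

print(months_on_free_tier)  # 12
print(standard_cost_usd)    # 360.0
print(standard_hours)       # 10.0
```

So the free tier needs about twelve months, while the standard tier would cost about USD 360 and could in principle finish in roughly ten hours at the full rate.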

WebDriverException: Message: unknown error: bad inspector message error while printing HTML content using ChromeDriver Chrome through Selenium Python

I am scraping some HTML content..
for i, c in enumerate(cards[75:77]):
    print(i)
    a = c.find_element_by_class_name("influencer-stagename")
    print(a.get_attribute('innerHTML'))
Works fine for all records except the 76th one. Output before error...
0
b'<a class="influencer-analytics-link" href="/influencers/sophiewilling"><h5><span>SOPHIE WILLING</span></h5></a>'
1
b'<a class="influencer-analytics-link" href="/influencers/ferntaylorr"><h5><span>Fern Taylor.</span></h5></a>'
2
b'<a class="influencer-analytics-link" href="/influencers/officialshaniceslatter"><h5><span>Shanice Slatter</span></h5></a>'
3
Stacktrace...
> -------------------------------------------------------------------------
WebDriverException Traceback (most recent call last) <ipython-input-484-0a80d1af1568> in <module>
3 #print(c.find_element_by_class_name("influencer-stagename").text)
4 a = c.find_element_by_class_name("influencer-stagename")
----> 5 print(a.get_attribute('innerHTML').encode('ascii', 'ignore'))
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py in get_attribute(self, name)
141 self, name)
142 else:
--> 143 resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
144 attributeValue = resp.get('value')
145 if attributeValue is not None:
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py in _execute(self, command, params)
631 params = {}
632 params['id'] = self._id
--> 633 return self._parent.execute(command, params)
634
635 def find_element(self, by=By.ID, value=None):
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py in execute(self, driver_command, params)
319 response = self.command_executor.execute(driver_command, params)
320 if response:
--> 321 self.error_handler.check_response(response)
322 response['value'] = self._unwrap_value(
323 response.get('value', None))
~/anaconda3/envs/py3-env/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
240 alert_text = value['alert'].get('text')
241 raise exception_class(message, screen, stacktrace, alert_text)
--> 242 raise exception_class(message, screen, stacktrace)
243
244 def _value_or_default(self, obj, key, default):
WebDriverException: Message: unknown error: bad inspector message: {"id":110297,"result":{"result":{"type":"object","value":{"status":0,"value":"<a class=\"influencer-analytics-link\" href=\"/influencers/bookishemily\"><h5><span>Emily | 18 | GB | Student\uD83C...</span></h5></a>"}}}} (Session info: chrome=75.0.3770.100) (Driver info: chromedriver=2.40.565386 (45a059dc425e08165f9a10324bd1380cc13ca363),platform=Mac OS X 10.14.0 x86_64)
I suspect it is an invalid character in
value":"Emily | 18 | GB | Student\uD83C..."
Specifically I suspect "\uD83C"
Adding
.encode("utf-8") OR .encode('ascii', 'ignore')
to the second print statement changes nothing.
Any thoughts on how to solve this??
UPDATE: The problem is with emoji characters. I have found 3 examples so far and each has an emoji (pink flower 🌸, Russian flag 🇷🇺 and swirling leaves 🍃). If I edit them out with Chrome inspector my code runs fine, but this is not a solution that works at scale.

This error message...
WebDriverException: Message: unknown error: bad inspector message: {"id":110297,"result":{"result":{"type":"object","value":{"status":0,"value":"<a class=\"influencer-analytics-link\" href=\"/influencers/bookishemily\"><h5><span>Emily | 18 | GB | Student\uD83C...</span></h5></a>"}}}} (Session info: chrome=75.0.3770.100) (Driver info: chromedriver=2.40.565386 (45a059dc425e08165f9a10324bd1380cc13ca363),platform=Mac OS X 10.14.0 x86_64)
...implies that ChromeDriver was unable to parse some characters that are not valid UTF-8, due to a JSON encoding/decoding issue.
Deep Dive
As per the discussion in Issue 723592: 'Bad inspector message' errors when running URL web-platform-tests via webdriver, John Chen (Owner - WebDriver for Google Chrome) mentioned in his comment:
A JSON encoding/decoding issue caused the "Bad inspector message" error reported at https://travis-ci.org/w3c/web-platform-tests/jobs/232845351. Part of the error message from part 1 contains an invalid Unicode character \uFDD0 (from https://github.com/w3c/web-platform-tests/blob/34435a4/url/urltestdata.json#L3596). The JSON encoder inside Chrome didn't detect such error, and passed it through in the JSON blob sent to ChromeDriver. ChromeDriver uses base/json/json_parser.cc to parse the JSON string. This parser does a more thorough error detection, notices that \uFDD0 is an invalid character, and reports an error. I think our JSON encoder and decoder should have exactly the same amount of error checking. It's problematic that the encoder can create a blob that is rejected by the decoder.
Analysis
John Chen (Owner - WebDriver for Google Chrome) further added:
The JSON encoding happens in protocol layout of DevTools, just before the result is sent back to ChromeDriver. The relevant code is in https://cs.chromium.org/chromium/src/out/Debug/gen/v8/src/inspector/protocol/Protocol.cpp. In particular, escapeStringForJSON function is responsible for encoding strings. It's actually quite conservative. Anything above 126 is encoded in \uXXXX format. (Note that Protocol.cpp is a generated file. The real source is https://cs.chromium.org/chromium/src/v8/third_party/inspector_protocol/lib/Values_cpp.template.)
The error occurs in the JSON parser used by ChromeDriver. The decoding of \uXXXX sequence happens at https://cs.chromium.org/chromium/src/base/json/json_parser.cc?l=564 and https://cs.chromium.org/chromium/src/base/json/json_parser.cc?l=670. After decoding an escape sequence, the decoder rejects anything that's not a valid Unicode character.
I noticed that there was a recent change to prevent a JSON encoder from emitting invalid Unicode code point: https://crrev.com/478900. Unfortunately it's not the JSON encoder used by the code involved in this bug, so it doesn't help us directly, but it's an indication that we're not the only ones affected by this type of issue.
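To make the "invalid character" concrete: the \uD83C in the error message is the first half of a UTF-16 surrogate pair, which is how emoji outside the Basic Multilingual Plane are represented. The sketch below (helper names are made up for the illustration) shows that the pink flower emoji is stored as two UTF-16 code units, and that a string ending in only the first unit, as the truncated text in the error appears to, is not valid Unicode text.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of `text` as upper-case hex strings."""
    raw = text.encode("utf-16-be", "surrogatepass")
    return [raw[i:i + 2].hex().upper() for i in range(0, len(raw), 2)]


def is_well_formed(text):
    """Return True when `text` can be encoded as UTF-8 (no lone surrogates)."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


flower = "\U0001F338"  # pink flower emoji from the question
print(utf16_code_units(flower))          # ['D83C', 'DF38']
print(is_well_formed(flower))            # True
print(is_well_formed("Student\ud83c"))   # False
```

A strict JSON decoder rejecting the lone D83C unit matches the behaviour described in the quoted comments.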
Solution
This issue was addressed in ChromeDriver by replacing invalid UTF-16 escape sequences when decoding invalid UTF strings, because web platform tests may use ECMAScript strings that are not necessarily valid UTF-16, through this revision / commit.
So a quick solution would be to ensure the following and re-execute your tests:
Selenium is upgraded to current levels Version 3.141.59.
ChromeDriver is updated to current ChromeDriver v79.0.3945.36 level.
Chrome is updated to current Chrome Version 79.0 level. (as per ChromeDriver v79.0 release notes)
Alternative
As an alternative, you can use the GeckoDriver / Firefox combination. You can find a relevant discussion in: Chromedriver only supports characters in the BMP error while sending Emoji with ChromeDriver Chrome using Selenium Python to Tkinter's label() textbox

Using Python-pptx, what conditions could a PowerPoint have that give KeyError?

I have a PowerPoint that I would like to open, amend, and save as a different filename. However, I'm getting a KeyError.
I tried this code with a blank PowerPoint presentation and it works perfectly. However, when I use the code to open an existing PowerPoint presentation and try to run the same code, I get a KeyError.
KeyError: "There is no item named 'ppt/slides/NULL' in the archive"
# Replace source text
import re
from pptx import Presentation

#s = "string. With. Punctuation?"
#s = re.sub(r'[^\w\s]','',s)
search_str = '{{{FILTER}}}'
repl_str = re.sub(r'[^\w\s]','',(str(list(dashboard_filter2.values()))))
ppt = Presentation('HispPres1.pptx')
for slide in ppt.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            shape.text = shape.text.replace(search_str, repl_str)
ppt.save('HispPresSourceUpdate.pptx')
I expect to have the existing PowerPoint amended by finding all the instances of {{{FILTER}}} and replacing it with the value listed. However, it looks like there's a problem using my existing PowerPoint presentation. I don't have this issue with a blank presentation.
So, I'm wondering what would cause an existing PowerPoint presentation to raise an error? I plan on making several "templates" to start with and really need to know if there are any hard-and-fast rules to adhere to.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-42-41deffabe2f9> in <module>()
7 search_str = '{{{FILTER}}}'
8 repl_str = re.sub(r'[^\w\s]','',(str(list(dashboard_filter2.values()))))
----> 9 ppt = Presentation('HispPres1.pptx')
10
11 for slide in ppt.slides:
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\api.py in Presentation(pptx)
28 pptx = _default_pptx_path()
29
---> 30 presentation_part = Package.open(pptx).main_document_part
31
32 if not _is_pptx_package(presentation_part):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\package.py in open(cls, pkg_file)
120 *pkg_file*.
121 """
--> 122 pkg_reader = PackageReader.from_file(pkg_file)
123 package = cls()
124 Unmarshaller.unmarshal(pkg_reader, package, PartFactory)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in from_file(pkg_file)
34 pkg_srels = PackageReader._srels_for(phys_reader, PACKAGE_URI)
35 sparts = PackageReader._load_serialized_parts(
---> 36 phys_reader, pkg_srels, content_types
37 )
38 phys_reader.close()
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _load_serialized_parts(phys_reader, pkg_srels, content_types)
67 sparts = []
68 part_walker = PackageReader._walk_phys_parts(phys_reader, pkg_srels)
---> 69 for partname, blob, srels in part_walker:
70 content_type = content_types[partname]
71 spart = _SerializedPart(partname, content_type, blob, srels)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _walk_phys_parts(phys_reader, srels, visited_partnames)
102 yield (partname, blob, part_srels)
103 for partname, blob, srels in PackageReader._walk_phys_parts(
--> 104 phys_reader, part_srels, visited_partnames):
105 yield (partname, blob, srels)
106
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _walk_phys_parts(phys_reader, srels, visited_partnames)
102 yield (partname, blob, part_srels)
103 for partname, blob, srels in PackageReader._walk_phys_parts(
--> 104 phys_reader, part_srels, visited_partnames):
105 yield (partname, blob, srels)
106
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\pkgreader.py in _walk_phys_parts(phys_reader, srels, visited_partnames)
99 visited_partnames.append(partname)
100 part_srels = PackageReader._srels_for(phys_reader, partname)
--> 101 blob = phys_reader.blob_for(partname)
102 yield (partname, blob, part_srels)
103 for partname, blob, srels in PackageReader._walk_phys_parts(
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pptx\opc\phys_pkg.py in blob_for(self, pack_uri)
107 matching member is present in zip archive.
108 """
--> 109 return self._zipf.read(pack_uri.membername)
110
111 def close(self):
~\AppData\Local\Continuum\anaconda3\lib\zipfile.py in read(self, name, pwd)
1312 def read(self, name, pwd=None):
1313 """Return file bytes (as a string) for name."""
-> 1314 with self.open(name, "r", pwd) as fp:
1315 return fp.read()
1316
~\AppData\Local\Continuum\anaconda3\lib\zipfile.py in open(self, name, mode, pwd, force_zip64)
1350 else:
1351 # Get info object for name
-> 1352 zinfo = self.getinfo(name)
1353
1354 if mode == 'w':
~\AppData\Local\Continuum\anaconda3\lib\zipfile.py in getinfo(self, name)
1279 if info is None:
1280 raise KeyError(
-> 1281 'There is no item named %r in the archive' % name)
1282
1283 return info
KeyError: "There is no item named 'ppt/slides/NULL' in the archive"

Yeah, this is a bit of a thorny problem. The spec doesn't provide for a "broken" relationship (one that refers to a package-part that doesn't exist), but at least one library (Java-based if I recall correctly) does not clean up relationships properly in some cases, perhaps a slide delete operation in this case.
The gist of the explanation is this:
A PPTX file is an Open Packaging Convention (OPC) package. DOCX and XLSX files are other examples of OPC packages.
An OPC package is a Zip archive of multiple parts (official term, perhaps package-part more precisely). Each part is essentially a file, so something like slide1.xml, and they are arranged in a "directory structure".
One part can be related to other parts. For example, a presentation part (presentation.xml) is related to each of its slide parts. These relationships are stored in a file like presentation.xml.rels. The relationship is keyed with a string like "rId3" and identifies the related part by its path in the package.
One part refers to another using the key in its XML (e.g. <p:sldId r:id="rId3"/>). The target part is "looked-up" in the .rels file to find its path and get to it that way.
The KeyError you're getting means that the .rels file has a <Relationship> element referring to the part ppt/slides/NULL (instead of something like ppt/slides/slide3.xml). Since there is no such part in the package, the lookup fails.
If you open the "template" file in PowerPoint and save it, I think it will repair itself. You might need to rearrange a slide and move it back to jostle that part of the code.
If that doesn't work, you'll need to patch the package by hand, removing any broken references and relationships. opc-diag can be handy for that.
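To locate such broken relationships without opening PowerPoint, the package can be inspected with the standard library alone. The following is a minimal sketch, not part of python-pptx or opc-diag; the function name find_dangling_relationships is made up here. It relies on the OPC rule that a relative Target is resolved against the directory of the part that owns the .rels file, so slides/slide1.xml in ppt/_rels/presentation.xml.rels means ppt/slides/slide1.xml.

```python
import io
import posixpath
import xml.etree.ElementTree as ET
import zipfile

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def find_dangling_relationships(pptx_file):
    """List (rels_file, relationship_id, target) for targets missing from the zip."""
    dangling = []
    with zipfile.ZipFile(pptx_file) as zf:
        members = set(zf.namelist())
        for rels_name in sorted(n for n in members if n.endswith(".rels")):
            # "ppt/_rels/presentation.xml.rels" describes parts relative to "ppt/".
            base_dir = posixpath.dirname(posixpath.dirname(rels_name))
            root = ET.fromstring(zf.read(rels_name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    continue  # hyperlinks and similar point outside the package
                target = rel.get("Target", "")
                if target.startswith("/"):
                    resolved = target.lstrip("/")
                else:
                    resolved = posixpath.normpath(posixpath.join(base_dir, target))
                if resolved not in members:
                    dangling.append((rels_name, rel.get("Id"), target))
    return dangling


# Tiny in-memory package with one valid and one broken slide relationship.
_rels = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="t" Target="slides/slide1.xml"/>'
    '<Relationship Id="rId2" Type="t" Target="slides/NULL"/>'
    "</Relationships>"
)
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("ppt/presentation.xml", "<p/>")
    zf.writestr("ppt/_rels/presentation.xml.rels", _rels)
    zf.writestr("ppt/slides/slide1.xml", "<s/>")
demo.seek(0)
print(find_dangling_relationships(demo))
```

Running it against the real file (for example find_dangling_relationships('HispPres1.pptx')) should point at the .rels entry whose target is slides/NULL.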

You can clean the PPTX from the dangling relations through:
File -> Info -> Check for Issues -> Inspect Document.
Clean up, save, replay python script.

So, thanks Scanny for the help. You're exactly right. The lookup was looking for ppt/slides/slide#.xml and it wasn't finding a relationship for it. The reason is because the relationships are coded as just slides/slide#.xml (without ppt/). I did get into the opc-diag to see what I could do there, but I found an easy fix.
My previous code had the line for slide in ppt.slides:, and this was the error: KeyError: "There is no item named 'ppt/slides/NULL' in the archive". When I browsed the PresentationML using opc-diag, I found that the relationship was set up like this: <Relationship Id="x" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" Target="slides/slide1.xml"/>. The relationship target does not include ppt.
So, to get rid of that lookup and have it match the way PowerPoint stores the slide relationships, I changed these lines:
ppt = Presentation('HispPres1.pptx')
for slide in ppt.slides:
to this
ppt = Presentation('HispPres1.pptx')
slides = ppt.slides
for slide in slides:

MicroPython client not receive text file larger than 4kb (4096 bytes) from Python Server

I have a MicroPython client on an ESP32 board and a Python server on Linux. I am trying to send a 5.5 KB text file from the Python server to the MicroPython client. The send succeeds, but the MicroPython client does not receive all of the data. The code is as follows:
Python Server:
with open('downloads/%s' % (request_path), 'rb') as f:
    data = f.read()
    self.wfile.write(data)  # data is 5.5 KB
MicroPython Client
recvData = sock.read(4096).decode('utf-8').split("\r\n")
print("Response_Received:: %s" % recvData)
sock.close()
Response_Received:: ['HTTP/1.0 200 OK', 'Server: SimpleHTTP/0.6 Python/3.5.3', 'Date: Sat, 09 Jun 2018 09:29:41 GMT', '', '# Ity: asdasd\n# ksduygfkhsgdkjfksjdhfg\n kjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy98\n 47y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhs\n gdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349rio\n t34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r3\n 49riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkv\n nvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogijiksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweu\n oiruy9847y397r349riot34jt;o\n giji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbs\n djkvbjcxbvhweioufhoiweuoiruy9847y397\n r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufh\n oiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduyg\n fkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhwei\n oufhoiweuoiruy9847y397r349riot\n 34jt;ogiji4vuijo4vjlkvnvl;kksduyg\n fkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhw\n eioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjd\n 
hfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiwe\n uoiruy9847y397r349riot34jt;ogiji4vuij\n o4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhwe\n ioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfk\n hsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;o\n giji4vuijo4vjlkvnvl;kksduygfkhsgdk\n jfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvh\n weioufhoiweuoiruy9847y397r349riot34jt;ogiji\n 4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiru\n y9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;k4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduyg\n fkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcx\n bvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhs\n gdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdj\n nvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogijiksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9bfkjcbsdjkvbjcxbvhweioufhoi847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweu\nnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogijiksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweu\nnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogijiksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufhoiweuoiruy9847y397r349riot34jt;ogiji4vuijo4vjlkvnvl;kksduygfkhsgdkjfksjdhfgkjdhsbfkjdhsbfkjcbsdjkvbjcxbvhweioufnvl;k']
The client receives only 4140 bytes of the data because of the buffer size (4096), so the 4th element of recvData is incomplete. MicroPython does not accept a larger buffer size. How can I receive all of my data (5.5 KB) in the 4th element of the recvData array without any loss?
I have tried to collect the received data in fragments, but it was not successful:
fragments = []
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    fragments.append(chunk)

Since your goal is to write the file to the filesystem, the simplest solution is to stop trying to hold the entire file in memory. Instead of building up your fragments array, just write the received chunks to a file:
with open('datafile', 'wb') as fd:  # binary mode, because recv() returns bytes
    while True:
        chunk = s.recv(4096)
        if not chunk:
            break
        fd.write(chunk)
This requires a constant amount of memory and can be used to receive files of arbitrary size.
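Because the server in the question speaks HTTP, the stream starts with response headers that should not end up in the saved file. A possible extension of the loop above, sketched for CPython and assumed (not verified) to carry over to MicroPython sockets: read until the blank line that ends the headers, then stream the rest to disk. The function name save_http_body and the small FakeSocket used for the demonstration are hypothetical.

```python
def save_http_body(sock, path, bufsize=1024):
    """Read an HTTP response from `sock` and write only its body to `path`.

    Returns (header_bytes, body_length). `sock` only needs a recv() method.
    """
    pending = b""
    # Read until the blank line separating headers from the body has arrived.
    while pending.find(b"\r\n\r\n") < 0:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise OSError("connection closed before headers were complete")
        pending += chunk
    split_at = pending.find(b"\r\n\r\n")
    header = pending[:split_at]
    body_start = pending[split_at + 4:]

    total = len(body_start)
    with open(path, "wb") as fd:
        fd.write(body_start)
        # Stream the remainder in small pieces; memory use stays bounded.
        while True:
            chunk = sock.recv(bufsize)
            if not chunk:
                break
            fd.write(chunk)
            total += len(chunk)
    return header, total


class FakeSocket:
    """Stand-in for a socket, used only to demonstrate the function."""

    def __init__(self, data):
        self._data = data

    def recv(self, n):
        piece, self._data = self._data[:n], self._data[n:]
        return piece
```

With a real connection, header can then be split on b"\r\n" to inspect the status line, much like the recvData list in the question.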
