Flask: Get gzip filename sent from Postman - python-3.x

I am sending a gzip file from Postman to a Flask endpoint. I can take that binary payload with request.data and read it, save it, upload it, etc.
My problem is that I can't get its name. How can I do that?
My gzip file is called "test_file.json.gz" and the file inside it is called "test_file.json".
How can I get either of those names?
Edit:
I'm reading the stream data with io.BytesIO(), but that object doesn't have a name attribute or anything similar, although I can see the file name inside the raw bytes if I just:
>>>print(request.data)
>>>b'\x1f\x8b\x08\x08\xca\xb1\xd3]\x00\x03test_file.json\x00\xab\xe6RPP\xcaN\xad4T\xb2RP*K\xcc)M5T\xe2\xaa\x05\x00\xc2\x8b\xb6;\x16\x00\x00\x00'
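(The name shows up there because gzip stores the original filename in the header's FNAME field when the FLG.FNAME bit is set, per RFC 1952, so it can be parsed straight out of the raw bytes. A minimal sketch, independent of any Flask API:)

def gzip_filename(raw):
    # RFC 1952: 10-byte fixed header, then optional fields.
    if raw[:2] != b'\x1f\x8b':            # not a gzip stream
        return None
    flags = raw[3]
    if not flags & 0x08:                  # FNAME bit not set: no name stored
        return None
    pos = 10
    if flags & 0x04:                      # skip the FEXTRA field if present
        xlen = int.from_bytes(raw[pos:pos + 2], 'little')
        pos += 2 + xlen
    end = raw.index(b'\x00', pos)         # FNAME is zero-terminated Latin-1
    return raw[pos:end].decode('latin-1')

# gzip_filename(request.data) -> 'test_file.json'

Note that this recovers the embedded name ("test_file.json"), not the name of the .gz file itself; the latter only travels in HTTP metadata such as a multipart filename, which is what the answer below relies on.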

Further to the comment, I think the code which handles your upload is relevant here.
See this answer regarding request.data:
request.data Contains the incoming request data as string in case it came with a mimetype Flask does not handle.
The recommended way to handle file uploads in Flask is to use:
file = request.files['file']
file is then of type: werkzeug.datastructures.FileStorage.
file.stream is the stream, which can be read with file.stream.read() or simply file.read()
file.filename is the filename as specified on the client.
file.save(path) a method which saves the file to disk. path should be a string like '/some/location/file.ext'
source
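Putting that together, a minimal sketch of an endpoint that recovers the client-side filename (the '/upload' route name is hypothetical; the file is assumed to be sent as multipart form data under the field name 'file', e.g. via Postman's form-data body type):

from flask import Flask, request

app = Flask(__name__)

@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']      # werkzeug.datastructures.FileStorage
    filename = file.filename          # e.g. 'test_file.json.gz'
    file.save('/tmp/' + filename)     # consider secure_filename() in real code
    return filename

In Postman this means choosing form-data rather than binary for the request body; with a raw binary body no filename is transmitted at all (apart from the one buried in the gzip header).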

Related

Uploading a file from memory to S3 with Boto3

This question has been asked many times, but my case is ever so slightly different. I'm trying to create a lambda that makes an .html file and uploads it to S3. It works when the file is created on disk; then I can upload it like so:
boto3.client('s3').upload_file('index.html', bucket_name, 'folder/index.html')
So now I have to create the file in memory. For this I first tried StringIO(), but then .upload_file throws an error:
boto3.client('s3').upload_file(temp_file, bucket_name, 'folder/index.html')
ValueError: Filename must be a string
So I tried using .upload_fileobj(), but then I get the error TypeError: a bytes-like object is required, not 'str'.
So I tried using BytesIO(), which wants me to convert the str to bytes first, so I did:
temp_file = BytesIO()
temp_file.write(index_top.encode('utf-8'))
print(temp_file.getvalue())
boto3.client('s3').upload_file(temp_file, bucket_name, 'folder/index.html')
But now it just uploads an empty file, despite the .getvalue() clearly showing that it does have content in there.
What am I doing wrong?
If you wish to create an object in Amazon S3 from memory, use put_object():
import boto3
s3_client = boto3.client('s3')
html = "<h2>Hello World</h2>"
s3_client.put_object(Body=html, Bucket='my-bucket', Key='foo.html', ContentType='text/html')
But now it just uploads an empty file, despite the .getvalue() clearly showing that it does have content in there.
When you finish writing to a buffer, the position stays at the end, and an upload reads from wherever the position currently is; since you're at the end, it gets no data. To fix this, add a seek(0) to reset the buffer to the beginning after you finish writing to it. Note also that upload_file() expects a filename string, so for an in-memory buffer you want upload_fileobj() instead. Your code would look like this:
temp_file = BytesIO()
temp_file.write(index_top.encode('utf-8'))
temp_file.seek(0)  # rewind so the upload reads from the start
print(temp_file.getvalue())  # getvalue() returns the whole buffer regardless of position
boto3.client('s3').upload_fileobj(temp_file, bucket_name, 'folder/index.html')

How to receive .zip file in Flask from Angular UI?

I have an Angular application in which I used
<input type="file" (change)="fileChanged($event)">
to upload a .zip file.
I get the file like this:
fileChanged(e) {
  this.file = e.target.files[0];
}
Now, I want to send this .zip file to my server (Python-Flask) and want to store it as a .zip file on my server-side.
I send the request to server like this:
const fd = new FormData();
fd.append('files', this.file, this.file.name);
fd.append('fileType', 'zip');
this.http.post('/my_app_route1/', this.file,
  {headers: {'Content-Type': 'application/zip'}}).subscribe(f => {
    ....});
To receive and save this, I do the following on the Python side.
@app.route('/my_app_route1/', methods=['POST'])
def receive_zip():
    mydata = request.get_data()
    with open('hello.zip', 'wb') as f:
        f.write(mydata)
Now, this data is binary, and I do not know how to write it out as a valid .zip file from Python.
When I try to print mydata, I see glimpses of the original .zip file data I sent from the front end, mixed with other symbols.
Am I passing the .zip file correctly? If not, how should I pass it?
Also, how can I save the .zip on my server side using Python-Flask?
NOTE: The .zip file consists of multiple directories and sub-directories, and I want to replicate the same on my server-side.
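For reference, a minimal sketch of the multipart approach recommended in the first answer above, assuming the Angular code posts the FormData object (fd) instead of the raw file and drops the manual Content-Type header (the browser then sets the multipart boundary itself). extractall() is what recreates the archive's directory tree on the server:

import zipfile
from flask import Flask, request

app = Flask(__name__)

@app.route('/my_app_route1/', methods=['POST'])
def receive_zip():
    file = request.files['files']      # 'files' is the FormData field name used above
    file.save('hello.zip')             # store the archive as-is
    with zipfile.ZipFile('hello.zip') as zf:
        zf.extractall('unpacked/')     # replicates directories and sub-directories
    return 'ok'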

How do we send a file (accepted as part of a multipart request) to MinIO object storage in Python without saving the file to local storage?

I am trying to write an API in Python (Falcon) to accept a file from a multipart-form parameter and put it into MinIO object storage. The catch is that I want to send the file to MinIO without saving it to any temp location first.
The minio Python client has a function for exactly this:
put_object(bucket_name, object_name, data, length)
where data is the file data and length is the total length of the object.
For more explanation: https://docs.min.io/docs/python-client-api-reference.html#put_object
I am having trouble producing the values for the data and length arguments of put_object.
The file accepted by the API class is of type falcon_multipart.parser.Parser, which cannot be sent to MinIO directly.
I can make it work if I write the file to a temp location and then read it back from there and send it.
Can anyone help me find a solution?
I tried reading the file data from the Parser object and wrapping it in io.BytesIO, but it did not work:
def on_post(self, req, resp):
    file = req.get_param('file')
    file_data = file.file.read()
    file_data = io.BytesIO(file_data)
    bucket_name = req.get_param('bucket_name')
    self.upload_file_to_minio(bucket_name, file, file_data)

def upload_file_to_minio(self, bucket_name, file, file_data):
    minioClient = Minio("localhost:9000", access_key='minio',
                        secret_key='minio', secure=False)
    try:
        file_stat = sys.getsizeof(file_data)
        # file_stat = file_data.getbuffer().nbytes
        minioClient.put_object(bucket_name, "SampleFile", file, file_stat)
    except ResponseError as err:
        print(err)
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.6/site-packages/minio/helpers.py", line 382, in is_non_empty_string
    if not input_string.strip():
AttributeError: 'NoneType' object has no attribute 'strip'
A very late answer to your question. As of Falcon 3.0, this should be possible leveraging the framework's native multipart/form-data support.
There is an example of how to perform the same task against AWS S3: How can I save POSTed files (from a multipart form) directly to AWS S3?
However, as I understand it, MinIO requires either the total length, which is unknown here, or alternatively requires you to wrap the upload as a multipart upload. That should be doable by reading reasonably large chunks (e.g., 8 MiB or similar) into memory and sending them as multipart upload parts, without storing anything on disk.
IIRC Boto3's transfer manager does something like that under the hood for you too.
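For the code as posted, the immediate problems are that put_object is handed the Parser object (file) instead of the BytesIO buffer (file_data), and that sys.getsizeof measures the Python wrapper object, not the payload. A minimal sketch with those two things fixed (the commented-out getbuffer().nbytes line in the question is the right idea), keeping the question's "SampleFile" key and localhost credentials:

import io
from minio import Minio

def on_post(self, req, resp):
    file = req.get_param('file')               # falcon_multipart field
    file_data = io.BytesIO(file.file.read())   # buffer the upload in memory
    length = file_data.getbuffer().nbytes      # actual payload size in bytes
    bucket_name = req.get_param('bucket_name')

    client = Minio("localhost:9000", access_key='minio',
                   secret_key='minio', secure=False)
    # pass the buffer itself, not the Parser object
    client.put_object(bucket_name, "SampleFile", file_data, length)

The traceback above also hints that one of the string arguments was None; it is worth checking that req.get_param('bucket_name') actually finds the form field.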

Get a file meta data object in Node

Is there a way to get a file object in Node with file metadata like mimetype, originalname, path, etc.? I'd like to read a file from disk and get this information, rather than a Buffer.
Here is one Node.js package that could help you work with file metadata:
https://www.npmjs.com/package/fs-meta

Saving an image to bytes and uploading with boto3 returns a Content-MD5 mismatch

I'm trying to pull an image from s3, quantize it/manipulate it, and then store it back into s3 without saving anything to disk (entirely in-memory). I was able to do it once, but upon returning to the code and trying it again it did not work. The code is as follows:
import boto3
import io
from PIL import Image
client = boto3.client('s3', aws_access_key_id='',
                      aws_secret_access_key='')
cur_image = client.get_object(Bucket='mybucket',Key='2016-03-19 19.15.40.jpg')['Body'].read()
loaded_image = Image.open(io.BytesIO(cur_image))
quantized_image = loaded_image.quantize(colors=50)
saved_quantized_image = io.BytesIO()
quantized_image.save(saved_quantized_image,'PNG')
client.put_object(ACL='public-read',Body=saved_quantized_image,Key='testimage.png',Bucket='mybucket')
The error I received is:
botocore.exceptions.ClientError: An error occurred (BadDigest) when calling the PutObject operation: The Content-MD5 you specified did not match what we received.
It works fine if I just pull an image, and then put it right back without manipulating it. I'm not quite sure what's going on here.
I had this same problem, and the solution was to seek to the beginning of the saved in-memory file:
from io import BytesIO

out_img = BytesIO()
image.save(out_img, img_type)
out_img.seek(0)  # Without this line it fails
self.bucket.put_object(Bucket=self.bucket_name,
                       Key=key,
                       Body=out_img)
The file may need to be saved and reloaded before you send it off to S3. The file pointer also needs to be at position 0 (seek(0)) when you do.
My problem was sending a file after having read the first few bytes of it; reopening the file cleanly did the trick.
I found this question while getting the same error trying to upload files: two scripts were clashing, one creating a file while the other uploaded it. My answer was to create the file under a dotted name like ".filename", then rename it once writing is complete:
os.rename(filename, filename.replace(".filename", "filename"))
The upload script then needs to ignore dot-files; this ensures a file is only picked up once it is fully written.
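A minimal sketch of that create-then-rename pattern, with hypothetical names; the rename is atomic on POSIX filesystems within one filesystem, so the uploader can never see a half-written file:

import os

def write_atomically(path, data):
    # Writer works on a hidden dot-file first...
    tmp = os.path.join(os.path.dirname(path) or '.', '.' + os.path.basename(path))
    with open(tmp, 'wb') as f:
        f.write(data)
    # ...then swaps it into place in one atomic step.
    os.rename(tmp, path)

# The uploader side simply skips anything still being written:
# files = [f for f in os.listdir('outbox') if not f.startswith('.')]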
To anyone else facing similar errors: this usually happens when the content of the file is modified during the upload, possibly by another process or thread.
A classic example is two scripts modifying the same file at the same time, which throws BadDigest because the MD5 of the content changes mid-upload. In the example below, the data file is being uploaded to S3; if another process overwrites it while the upload is in flight, you end up with this exception:
random_uuid=$(uuidgen)
cat data
aws s3api put-object --acl bucket-owner-full-control --bucket $s3_bucket --key $random_uuid --body data
