500 Internal error while saving data into Firestore - python-3.x

I have the following process for loading documents into Firestore:
Upload a document (a JSON file) into a GCS bucket.
Trigger a Cloud Function when the document is uploaded into the bucket, and save the uploaded document into Firestore.
I am using the code below to save data into Firestore:
# Save document in Firestore
collection = db.collection(u'my_collection')
try:
    collection.document(u'' + file_name + '').set(data)
    print('Data saved successfully with document id {}'.format(file_name))
except Exception as e:
    print('Exception occurred while saving data into firestore.', e)
The problem arises when I upload a large number of files (1000-2000) into the bucket simultaneously. Some documents are saved successfully, but for others I get the error below.
Exception occurred while saving data into firestore. 500 An internal error occurred.
Edit 1: The above error occurs when calling the set() method.
How can I diagnose why this occurred? Is it a quota or limit issue, or something else?
Any suggestions would be of great help. Thank you.
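For what it's worth, one way to make the write more resilient while investigating is to retry transient server errors with exponential backoff. The sketch below reuses the db, file_name and data from the question; the retried exception types and backoff values are assumptions, not a confirmed fix.

import time
from google.api_core import exceptions as gapi_exceptions

def save_with_retry(db, file_name, data, max_attempts=5):
    # Retry transient server-side errors with exponential backoff (illustrative values).
    collection = db.collection(u'my_collection')
    for attempt in range(1, max_attempts + 1):
        try:
            collection.document(file_name).set(data)
            print('Data saved successfully with document id {}'.format(file_name))
            return
        except (gapi_exceptions.InternalServerError,
                gapi_exceptions.ServiceUnavailable,
                gapi_exceptions.DeadlineExceeded):
            if attempt == max_attempts:
                raise
            # Back off 1s, 2s, 4s, ... before the next attempt.
            time.sleep(2 ** (attempt - 1))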

Related

Check if a file exists in s3 using graphql code

Can someone tell me how to check whether a file exists in an S3 bucket using GraphQL code?
As of now I am using:
await s3.headObject(existParams).promise();
But the above is not working: I just keep waiting for the response, it does not return anything, and after about 1 minute it throws a timeout (504).

Why is the userIdentity property always empty in AWS’ Kinesis DataStream?

I have enabled Kinesis Data Streams on DynamoDB and have configured a Delivery Stream to store the stream as audit logs in an S3 bucket.
I then query the S3 bucket from Amazon Athena.
Everything seems to be working, but the userIdentity property is always empty (null), which makes the audit pointless if I cannot capture who performed the transaction. Is this property only populated when a record is deleted from DynamoDB and TTL is enabled?
Questions:
How do I capture the id/name of the user responsible for adding, updating, or deleting a record, whether via the application or directly via DynamoDB in the AWS console?
(Less important question) How do I format the stream before it hits the S3 bucket so I can include the id of the record being updated?
Also, please note that I have a Lambda function attached to the Delivery Stream that simply appends a newline to each record as a delimiter. If I wanted to do more processing/formatting of the stream, should I do it in that Lambda on the Delivery Stream, or in a trigger on the DynamoDB table itself before the data reaches the Delivery Stream?
DynamoDB does not include the user details in the data stream. This needs to be implemented by the application; you can then read the values from the NewImage, if the stream provides it.
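As a rough illustration of that approach, the application can stamp an attribute such as updatedBy on every write so it travels inside NewImage, and the transformation Lambda on the Delivery Stream can read it back. The table name, attribute names, and handler below are hypothetical, and the record layout assumes the Kinesis-for-DynamoDB change-record JSON.

import base64
import json
import boto3

# Hypothetical table; the application records the acting user on the item itself.
table = boto3.resource('dynamodb').Table('orders')

def save_order(order_id, payload, user_id):
    # The user id written here will appear in the change record's NewImage.
    table.put_item(Item={'pk': order_id, 'payload': payload, 'updatedBy': user_id})

def handler(event, context):
    # Firehose transformation Lambda: each record's data is a base64-encoded change record.
    output = []
    for rec in event['records']:
        change = json.loads(base64.b64decode(rec['data']))
        user = change.get('dynamodb', {}).get('NewImage', {}).get('updatedBy', {}).get('S')
        print('Record changed by:', user)
        output.append({
            'recordId': rec['recordId'],
            'result': 'Ok',
            # Re-emit the record with the newline delimiter mentioned in the question.
            'data': base64.b64encode((json.dumps(change) + '\n').encode()).decode(),
        })
    return {'records': output}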

Writing json to AWS S3 from AWS Lambda

I am trying to write a response to AWS S3 as a new file each time.
Below is the code I am using:
import json
import boto3

s3 = boto3.resource('s3', region_name=region_name)
s3_obj = s3.Object(s3_bucket, f'/{folder}/{file_name}.json')
resp_ = s3_obj.put(Body=json.dumps(response_json).encode('UTF-8'))
I can see that I get a 200 response and the file appears in the directory as well, but it also produces the exception below:
[DEBUG] 2020-10-13T08:29:10.828Z. Event needs-retry.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7f2cf2fdfe123>>
My code throws a 500 exception even though it works. I have other business logic in the Lambda, and everything works just fine since the write to S3 is the last operation. Any help would be appreciated.
The Key (filename) of an Amazon S3 object should not start with a slash (/).
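For reference, here is the question's snippet with the leading slash removed from the key (variable names unchanged); with the slash, S3 stores the object under a name that begins with '/', which is usually not what was intended.

s3 = boto3.resource('s3', region_name=region_name)
# Key without a leading slash.
s3_obj = s3.Object(s3_bucket, f'{folder}/{file_name}.json')
resp_ = s3_obj.put(Body=json.dumps(response_json).encode('UTF-8'))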

How to upload files larger than 10MB via Google Cloud HTTP functions? Any alternative options?

I have made a Google Cloud Function that uploads a file into a Google Cloud Storage bucket and returns a signed URL in the response.
Whenever large files (more than 10MB) are uploaded, it does not work.
It works fine for files smaller than 10MB.
I have searched the Cloud documentation: it says the maximum request size for HTTP functions is 10MB, and this limit cannot be increased.
resource: {…}
severity: "ERROR"
textPayload: "Function execution could not start, status: 'request too large'"
timestamp: "2019-06-25T06:26:41.731015173Z"
For a successful file upload, it gives the log below:
Function execution took 271 ms, finished with status code: 200
For large files, it gives the log below:
Function execution could not start, status: 'request too large'
Are there any alternative options for uploading a file into the bucket via an API? Any different service would be fine; I need to upload files up to 20MB. Thanks in advance.
You could upload directly to a Google Cloud Storage bucket using the Firebase SDK for web and mobile clients. Then, you can use a Storage trigger to deal with the file after it's finished uploading.
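A rough sketch of the second half of that suggestion: a background Cloud Function (Python runtime) that fires once the client-side upload finishes. The function name, bucket handling, and follow-up work below are illustrative placeholders, not the exact setup from the question.

from google.cloud import storage

def handle_upload(event, context):
    # Background Cloud Function triggered by google.storage.object.finalize.
    # 'event' carries the metadata of the object that finished uploading.
    bucket_name = event['bucket']
    object_name = event['name']
    print('Upload finished: gs://{}/{}'.format(bucket_name, object_name))

    # Example follow-up work (sketch only): fetch the uploaded object.
    blob = storage.Client().bucket(bucket_name).blob(object_name)
    data = blob.download_as_bytes()
    print('Size: {} bytes'.format(len(data)))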

Azure DataLake (ADLS) BulkDownload Bad Request

I am trying to download a file from ADLS using the BulkDownload method, but I am getting a Bad Request response as below:
Error in getting metadata for path cc-
adl://testaccount.azuredatalakestore.net//HelloWorld//test.txt
Operation: GETFILESTATUS failed with HttpStatus:BadRequest Error: Uexpected
error in JSON parsing.
Last encountered exception thrown after 1 tries. [Uexpected error in JSON
parsing]
[ServerRequestId:]
However, if I try to download the file through the Azure client shell, it works.
I am using BulkDownload as follows:
client.BulkDownload(srcPath, dstPath);
Is anyone else facing the same issue with the BulkDownload call?
I got this fixed: srcPath needs to be the relative path ("/HelloWorld/test.txt") within the Azure Data Lake store; previously I was using the absolute path ("adl://testaccount.azuredatalakestore.net//HelloWorld/test.txt").
