How to create custom retry logic for aiobotocore? - python-3.x

I'm trying to upload a lot of files to S3. This cannot be done with the standard AWS CLI because of the translation required between file names on disk and object names in S3. Indeed, many of the objects don't exist on disk at all.
I keep getting an error:
botocore.exceptions.ClientError: An error occurred (SlowDown) when calling the PutObject operation (reached max retries: 0): Please reduce your request rate
It doesn't seem to make a difference whether I use boto3 / botocore / aioboto3 / aiobotocore. I've tried various configurations of retry logic as described here. Nothing seems to fix the problem. That includes all three retry modes and retry counts ranging from 0 to 50.
I could add custom retry logic to every method that calls the client but that's going to be a lot of work and feels like the wrong approach.
Is it possible to customize the retry logic used by boto3 or aiobotocore?
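One option, beyond the built-in retry modes, is an application-level back-off wrapper around the individual calls. A minimal sketch, assuming aiobotocore's create_client interface (the upload_one helper and the set of retryable error codes are illustrative):

import asyncio
import random

from aiobotocore.session import get_session
from botocore.exceptions import ClientError

RETRYABLE_CODES = {"SlowDown", "Throttling", "RequestLimitExceeded"}

async def put_object_with_backoff(client, *, max_attempts=8, **kwargs):
    for attempt in range(max_attempts):
        try:
            return await client.put_object(**kwargs)
        except ClientError as exc:
            code = exc.response.get("Error", {}).get("Code")
            if code not in RETRYABLE_CODES or attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter: 1s, 2s, 4s, ... capped at 30s.
            await asyncio.sleep(min(2 ** attempt, 30) + random.random())

async def upload_one(bucket, key, body):
    session = get_session()
    async with session.create_client("s3") as client:
        await put_object_with_backoff(client, Bucket=bucket, Key=key, Body=body)

Combining a wrapper like this with a bounded number of concurrent uploads (e.g. an asyncio.Semaphore) is usually what actually makes SlowDown errors go away, since they signal that the request rate itself is too high.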

Related

Batch operation in AWS SSM putParameter. Is there an option to write parameters in bulk to ParameterStore in AWS?

Is there an option to write parameters in bulk to Parameter Store in AWS? I have tried the putParameter API, but from the documentation I can see that only one parameter can be updated at a time. This operation takes around 20 milliseconds (I may be wrong), so if I need to update some 20 parameters, it will exceed 400 ms. Typically, I have a requirement to accommodate up to 50 parameters. Is there a better way to handle updating of parameters in Parameter Store?
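Since PutParameter only writes one parameter per call, one hedged option is to issue the calls concurrently rather than sequentially. A minimal sketch with a thread pool (the put_many helper and the (name, value) input format are illustrative, and SSM's own throughput limits still apply):

from concurrent.futures import ThreadPoolExecutor

import boto3

ssm = boto3.client("ssm")

def put_one(name, value):
    # One PutParameter call per parameter; Overwrite updates existing names.
    return ssm.put_parameter(Name=name, Value=value, Type="String", Overwrite=True)

def put_many(params, max_workers=10):
    # params: iterable of (name, value) pairs; calls run in parallel threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(put_one, name, value) for name, value in params]
        return [f.result() for f in futures]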

python time out of stream method on gcp firestore

I am using GCP Firestore. For a particular task, I am querying all the documents present in a collection, using the Python API.
Code I am using
from google.cloud import firestore
from tqdm import tqdm

db = firestore.Client()
documents = db.collection(collection_name).stream()
for doc in tqdm(documents):
    ...  # some time-consuming operation (2-3 seconds per document)
Everything runs fine, but after 1 minute the for loop ends.
I thought maybe the connection was timing out. I found this on the documentation page:
The underlying stream of responses will time out after the max_rpc_timeout_millis value set in
the GAPIC client configuration for the RunQuery API. Snapshots not consumed from the iterator
before that point will be lost.
My question is: how can I modify this timeout value to suit my needs? Thank you.
In my case, the 503 The datastore operation timed out, or the data was temporarily unavailable. response from Firestore has also been causing AttributeError: '_UnaryStreamMultiCallable' object has no attribute '_retry'.
This looks like the retry policy is not set, even though Python's firebase_admin package is capable of retrying timeout errors too. So I just configured a basic Retry object explicitly, and this solved my issue:
from google.api_core.retry import Retry
documents = db.collection(collection_name).stream(retry=Retry())
A collection of 190K items is exported in 5 minutes in my case. Originally, the iteration had also been interrupted after 60 seconds.
Counterintuitively, as mentioned in the docs, .stream() has a cumulative timeout for consuming the entire collection, not for retrieving a single item or chunk.
So, if your collection has 1000 items and processing every item takes 0.5 seconds, the total consumption time sums up to 500 seconds, which is greater than the default (undocumented) timeout of 60 seconds.
Also counterintuitively, the timeout argument of the CollectionReference.stream method does not override the max_rpc_timeout_millis mentioned in the documentation. In fact, it behaves like a client-side timeout, and the operation is effectively timed out after min(max_rpc_timeout_millis / 1000, timeout) seconds.
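For longer exports, a minimal sketch of widening both knobs (assuming google-cloud-firestore's retry and timeout keyword arguments; the one-hour figures and the process/collection_name names are illustrative):

from google.api_core.retry import Retry
from google.cloud import firestore

db = firestore.Client()
documents = db.collection(collection_name).stream(
    retry=Retry(deadline=3600),  # keep retrying transient errors for up to an hour
    timeout=3600,                # raise the client-side cap as well
)
for doc in documents:
    process(doc)  # placeholder for the per-document work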

How to force exceptions and errors to print in AWS Lambda when they are causing the Lambda function to run twice?

I have an AWS Lambda function (written in Python 3.7) that is triggered when a specific JSON file is uploaded from a server to an S3 bucket. Currently I have a trigger set on AWS Lambda for a PUT request with the specific suffix of the file.
The issue is that the Lambda function runs twice every time the JSON file is uploaded once to the S3 bucket. I confirmed via CloudWatch that each additional run is roughly 10 seconds to 1 minute after the first, and each run has a unique request ID.
To troubleshoot, I confirmed that the JSON input is coming from one bucket and the outputs are being written to a completely separate bucket. I silenced all warnings coming from pandas, and I do not see any errors from the code pop up in CloudWatch. I also changed the retry attempts from 2 to 0.
The function also reports the following metrics when it runs, with a timeout set at 40 seconds and memory size set to 1920 MB, so there should be enough time and memory for the function to use:
Duration: 1216.03 ms Billed Duration: 1300 ms Memory Size: 1920 MB Max Memory Used: 164 MB
I am at a loss as to what I am doing wrong.
How can I force AWS Lambda to display the issues or errors it is encountering, whether in my Python code or wherever else they occur, that are forcing the Lambda function to run multiple times?
The issue was that my code was throwing an error, but for some reason CloudWatch was not showing the error.
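One way to surface such errors explicitly is to wrap the handler and log the traceback before re-raising. A minimal sketch (process_file stands in for the existing JSON-processing code and is hypothetical):

import logging
import traceback

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    try:
        return process_file(event)  # hypothetical: the existing pandas/JSON logic
    except Exception:
        # Write the full traceback to CloudWatch Logs, then re-raise so the
        # invocation is still reported as failed (and retried per the retry setting).
        logger.error("Unhandled exception:\n%s", traceback.format_exc())
        raise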

Overcoming Azure Vision Read API Transactions-Per-Second (TPS) limit

I am working on a system where we call the Vision Read API to extract the contents of raster PDFs. Files are of different sizes, ranging from one page to several hundred pages.
Files are stored in Azure Blob Storage, and a function will push the files to the Read API once all files are uploaded to the blob. There could be hundreds of files.
Therefore, when the process starts, a large number of documents are expected to be sent for text extraction per second. But the Vision API has a limit of 10 transactions per second, including Read.
I am wondering what the best approach would be. Some type of throttling or a queue?
Is there any integration available (say, with a queue) from which the Read API can pull documents, and is there any type of push notification available to signal completion of the Read operation? How can I prevent timeouts due to exceeding the 10 TPS limit?
Per my understanding, there are two key points you want to know:
How to overcome the 10 TPS limit when you have a lot of files to read.
The best approach to get the Read operation status and result.
Your question is a bit broad, so maybe I can provide you with some suggestions:
For Q1: generally, if you reach the TPS limit you will get an HTTP 429 response, and you must wait for some time before calling the API again, or the next call will be refused. Usually we retry the operation with something like an exponential back-off retry policy to handle the 429 error (a sketch in Python follows the steps below):
2.1) Check the HTTP response code in your code.
2.2) When the HTTP response code is 429, retry the operation after N seconds, which you can define yourself, such as 10 seconds…
For example, the following is a 429 response. You can set your wait time as (26 + n) seconds. (PS: you can define n yourself here, such as n = 5…)
{
  "error": {
    "statusCode": 429,
    "message": "Rate limit is exceeded. Try again in 26 seconds."
  }
}
2.3) If the retry in step 2.2 succeeds, continue to the next operation.
2.4) If the retry in step 2.2 fails with 429 again, retry the operation after N*N seconds (which you can also define yourself); this is the exponential back-off retry policy.
2.5) If the retry in step 2.4 fails with 429 again, retry the operation after N*N*N seconds…
2.6) Always wait for the current operation to succeed; the waiting time grows exponentially.
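A minimal sketch of these steps in Python (send_read_request is an assumed callable that performs one Read API call and returns a requests-style response):

import time

def call_with_backoff(send_read_request, n=10, max_attempts=4):
    for attempt in range(1, max_attempts + 1):
        response = send_read_request()
        if response.status_code != 429:
            return response
        # Wait N, then N*N, then N*N*N seconds, as described in steps 2.2-2.5;
        # honour the server's Retry-After hint if one is provided.
        wait = int(response.headers.get("Retry-After", n ** attempt))
        time.sleep(wait)
    raise RuntimeError(f"Still throttled after {max_attempts} attempts")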
For Q2: as we know, we can use this API to get the Read operation status/result.
If you want the completion notification/result, you should poll each of your operations at an interval, e.g. sending a check request every 10 seconds. You can use Azure Functions or an Azure Automation runbook to create asynchronous tasks that check the Read operation status and, once it's done, handle the result based on your requirements.
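A hedged sketch of such a polling loop, assuming the v3.x Read API's Operation-Location / status flow (the subscription-key handling and the 10-second interval are illustrative):

import time

import requests

def wait_for_read_result(operation_url, subscription_key, poll_seconds=10):
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    while True:
        result = requests.get(operation_url, headers=headers).json()
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)  # check again after the chosen interval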
Hope it helps. If you have any further concerns, please feel free to let me know.

How does Kinesis keep the offset and push the record again when an event fails in Lambda?

I am new to AWS Lambda and Kinesis. Please help with the following question.
I have a Kinesis stream as a source for Lambda, and the target is again Kinesis. I have the following queries.
The system must not lose any records.
If any of the records fails processing in Lambda, how is it pulled into Lambda again? How are the unprocessed records kept? How does Kinesis track the offset so it can process the next record?
Please advise.
From the AWS Lambda docs about using Lambda with Kinesis:
If your function returns an error, Lambda retries the batch until processing succeeds or the data expires. Until the issue is resolved, no data in the shard is processed. To avoid stalled shards and potential data loss, make sure to handle and record processing errors in your code.
In this context, also consider the Retention Period of Kinesis:
The retention period is the length of time that data records are accessible after they are added to the stream. A stream’s retention period is set to a default of 24 hours after creation. You can increase the retention period up to 168 hours (7 days)
As mentioned in the first quote, AWS will drop the event once the retention period has passed. This means for you:
a) Take care that your Lambda function handles errors correctly.
b) If it's important to keep all records, also store them in persistent storage, e.g. DynamoDB (a sketch of both points follows below).
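A minimal sketch of points a) and b) in a Python handler (the failed-records DynamoDB table and the process function are hypothetical):

import base64
import json

import boto3

failed_table = boto3.resource("dynamodb").Table("failed-records")  # hypothetical table

def lambda_handler(event, context):
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        try:
            process(json.loads(payload))  # hypothetical business logic
        except Exception as exc:
            # Record the failure instead of failing the whole batch, so the shard
            # does not stall while the bad record is retried until it expires.
            failed_table.put_item(Item={
                "sequenceNumber": record["kinesis"]["sequenceNumber"],
                "error": str(exc),
                "payload": payload.decode("utf-8", errors="replace"),
            })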
In addition to that, you should read about duplicate Lambda executions as well. There is a great blog post available explaining how you can achieve an idempotent implementation. And read here on another StackOverflow question & answer.
