I have created a web scraping Python script that runs fine on my local system; it takes about 30 minutes to complete.
But when I deployed the script to a GCP Cloud Function, it timed out after 60004 ms:
2022-03-16T11:41:01.420461007Z get_training_data budock8slftb Function execution took 60004 ms, finished with status: 'timeout'
To set this up, I used the following services:
Cloud Scheduler -> Pub/Sub -> Cloud Function
Could you please suggest which GCP service I should pick to run the Python script cost-effectively on a daily schedule?
Function execution time is limited by the timeout duration, which you can specify at function deployment time. By default, a function times out after 1 minute (60000 ms), which is what you're experiencing, but you can extend this period up to a maximum of 9 minutes.
You can extend it by editing your deployed function and setting the timeout to 540 seconds.
For more information, you may also refer to this documentation.
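For example, a redeploy sketch with gcloud, assuming a 1st gen, Pub/Sub-triggered Python function (the function name, topic, runtime and memory values are placeholders):

    gcloud functions deploy YOUR_FUNCTION_NAME \
      --runtime=python310 \
      --trigger-topic=YOUR_PUBSUB_TOPIC \
      --timeout=540s \
      --memory=512MB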
If your scraper takes 30 minutes on your local machine, you may need to optimize it first, make it stop logically before the 9-minute limit, and create another schedule using the Google Cloud Client Libraries (google-cloud-scheduler) so the next invocation can continue from where the previous one stopped, until the scraping finishes.
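A minimal sketch of that hand-off pattern, with the simplification that it re-publishes to the existing Pub/Sub topic instead of creating a new Cloud Scheduler job; load_checkpoint, save_checkpoint and scrape_next_batch are hypothetical helpers standing in for your scraper's own logic, and the project/topic IDs are placeholders:

    import json
    import time

    from google.cloud import pubsub_v1

    TIME_BUDGET_SECONDS = 8 * 60            # stop well before the 9-minute limit
    PROJECT_ID = "your-project-id"          # placeholder
    TOPIC_ID = "your-scraper-topic"         # placeholder: the topic Cloud Scheduler publishes to

    def handler(event, context):
        start = time.time()
        state = load_checkpoint()           # hypothetical helper: restore where the last run stopped

        while not state.done:
            if time.time() - start > TIME_BUDGET_SECONDS:
                save_checkpoint(state)      # hypothetical helper: persist progress (e.g. to GCS)
                # Re-publish to the same topic so a fresh instance picks up where this one stopped.
                publisher = pubsub_v1.PublisherClient()
                topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)
                publisher.publish(topic_path, json.dumps({"resume": True}).encode("utf-8")).result()
                return
            state = scrape_next_batch(state)  # hypothetical helper: one bounded chunk of scraping

        save_checkpoint(state)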
You can now use 2nd generation Cloud Functions.
If you create a 2nd gen Cloud Function and use an HTTP trigger, you can set the max timeout to 3600 seconds (1 hour) instead of the 9 minutes available to 1st gen functions.
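A deployment sketch with gcloud, assuming a Python runtime and placeholder names:

    gcloud functions deploy YOUR_FUNCTION_NAME \
      --gen2 \
      --runtime=python311 \
      --region=us-central1 \
      --trigger-http \
      --timeout=3600s \
      --source=.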
For info on 2nd generation Cloud Functions:
https://cloud.google.com/functions/docs/2nd-gen/overview
See also here on how to combine with Cloud Scheduler:
Do Google Cloud background functions have max timeout?
Because an HTTP-triggered Azure Function has a strict time limit of 230 seconds, I created an HTTP-triggered durable Azure Function. I find that when I trigger the durable function multiple times and the last run has not completed, the current run continues the last run until it is finished. This is a little confusing for me, because I only want each run to do the task of the current run, not replay the last unfinished run. So my questions are:
1. Is it by design that durable functions make sure each run is completed (succeeded or failed)?
2. Can a durable function focus only on the current run, just like a normal HTTP-triggered Azure Function?
3. If 2) is not possible, is there any way to mitigate the time limit issue for a normal HTTP-triggered Azure Function?
Thanks a lot!
The durable function runs until it gets a result, that is, a Successfully completed or Failed message.
According to the Microsoft documentation:
Azure Functions times out after 230 seconds regardless of the functionTimeout setting you've configured in the settings.
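On point 3: durable functions sidestep the 230-second HTTP limit via the async HTTP pattern; the HTTP starter only schedules the orchestration and immediately returns 202 with status-polling URLs. A minimal Python starter sketch, assuming an orchestrator function named "Orchestrator" and the usual durableClient binding in function.json:

    import azure.durable_functions as df
    import azure.functions as func

    async def main(req: func.HttpRequest, starter: str) -> func.HttpResponse:
        client = df.DurableOrchestrationClient(starter)
        # Start a new orchestration instance; this call returns right away.
        instance_id = await client.start_new("Orchestrator", None, None)
        # Reply 202 Accepted with URLs the caller can poll, so the HTTP request
        # itself completes well within the 230 s limit.
        return client.create_check_status_response(req, instance_id)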
Introduction
I am using the Azure Face API in my Google Cloud Function (I make around 3 or 4 HTTPS requests every time my function is called), but I am getting a really slow execution time, around 5 seconds:
Function execution took 5395 ms, finished with status: 'ok'
Function execution took 3957 ms, finished with status: 'ok'
Function execution took 2512 ms, finished with status: 'ok'
Basically what I am doing in my cloud function is:
1. Detect a face using Azure
2. Save the face in the Azure LargeFaceList
3. Find 20 similar faces using Azure
4. Train the updated Azure LargeFaceList (if it is not being trained already)
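For reference, a rough sketch of those four calls with the azure-cognitiveservices-vision-face Python SDK (endpoint, key, list id and image URL are placeholders, error handling is omitted, and the exact code may differ from what the function actually runs); note that these are four sequential HTTPS round trips, which already accounts for a chunk of the latency:

    from azure.cognitiveservices.vision.face import FaceClient
    from msrest.authentication import CognitiveServicesCredentials

    # Placeholders: endpoint, key, list id and image URL.
    face_client = FaceClient("https://<your-face-resource>.cognitiveservices.azure.com/",
                             CognitiveServicesCredentials("<your-key>"))
    LIST_ID = "my-large-face-list"
    image_url = "https://example.com/photo.jpg"

    # 1. Detect a face (assumes at least one face is found).
    detected = face_client.face.detect_with_url(url=image_url)

    # 2. Save the face in the LargeFaceList.
    face_client.large_face_list.add_face_from_url(large_face_list_id=LIST_ID, url=image_url)

    # 3. Find up to 20 similar faces.
    similar = face_client.face.find_similar(face_id=detected[0].face_id,
                                            large_face_list_id=LIST_ID,
                                            max_num_of_candidates_returned=20)

    # 4. Kick off training of the updated list (the real function also checks
    #    that a training run is not already in progress).
    face_client.large_face_list.train(LIST_ID)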
I have the Google Cloud Function located in us-central1 ('near' my Azure Face service, which is in north-central-us). I have assigned it 2 GB of memory and a timeout of 540 secs. I am in Europe.
Problem
As I said before, the function takes too long to complete its execution (from 3.5 to 5 seconds). I don't know if this is because of the "cold start" or because the algorithm itself takes that long to run.
PS: The LargeFaceList currently contains only 10 faces (for 1,000 faces the training duration is 1 second, and for 1 million faces, 30 minutes).
My Options
Run the code on:
1- Google Cloud Function (doing this now)
2- Google Cloud App Engine
I have been experimenting with Cloud Functions for the last 3 months, and I have never used the App Engine service.
My Question
Is it possible to use Firestore triggers on App Engine? And will I get a faster execution time if I move this code to App Engine?
With Cloud Functions, one instance of a function processes only one request at a time. If you have 2 concurrent requests, Cloud Functions creates 2 instances, and each request is processed on its own instance.
Thus, if you have 180 concurrent requests, you will have 180 function instances at the same time (up to 1000 instances, the default quota).
Cloud Run runs on the same underlying infrastructure as Cloud Functions, but runs containers. One instance of Cloud Run can handle up to 80 requests concurrently.
Therefore, for 180 concurrent requests, you need only 3 or 4 instances, not 180 as with Cloud Functions. And because you pay for processing time (CPU + memory), 180 Cloud Functions instances are more expensive than 3 Cloud Run instances.
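For illustration, concurrency is set at deploy time on Cloud Run (the service name, image and region are placeholders):

    gcloud run deploy my-service \
      --image=gcr.io/your-project/your-image \
      --concurrency=80 \
      --region=us-central1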
I wrote an article on this.
In summary, serverless architectures are highly scalable and process requests in parallel. Think about the processing time of a single request, not about the maximum number of concurrent requests (look at that only to get a cost perspective).
I need to develop a process (e.g. an Azure Function app) that will load a file from FTP once a week and perform ETL and push updates to another service, which takes a long time (about 100 minutes).
My question is whether a timer-triggered Azure Function app on the Consumption plan will work in this scenario, given that the maximum running time of an Azure Function app on that plan is 10 minutes.
Update
My theory for using a timer-triggered function on the Consumption plan is that the timer is set to wake up every 4 minutes within a certain period (e.g. 5am - 10am, Monday only), and within the function a status flag tells whether an existing run is in progress. If it is, the process continues its ongoing job; otherwise, the function exits.
Is this doable, or is there a flaw in it?
I'm not sure what your exact scenario is, but I would consider one of the following options:
Option 1
Use durable functions. (Here is a C# example)
It will allow you to start your process and while you wait for different tasks to complete, your function won't actually be running.
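If you prefer Python over the linked C# example, here is a minimal orchestrator sketch of the same idea (the activity names are hypothetical placeholders for your FTP/ETL steps):

    import azure.durable_functions as df

    def orchestrator_function(context: df.DurableOrchestrationContext):
        # Each step is a separate activity function; while an activity runs,
        # the orchestrator itself is not executing (and not billed for that time).
        file_path = yield context.call_activity("DownloadFromFtp", None)
        transformed = yield context.call_activity("TransformData", file_path)
        yield context.call_activity("PushToService", transformed)
        return "done"

    main = df.Orchestrator.create(orchestrator_function)

Keep in mind that each individual activity still has to finish within the plan's function timeout, so a 100-minute job would need to be split into smaller activities.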
Option 2
In case durable functions don't suit your needs, you can try a combination of a timer-triggered function and an ACI (Azure Container Instance) running your logic.
In a nutshell, your flow should look something like this:
Timer function is triggered
Call an API to create the ACI
End of timer function.
The service in the ACI starts its job
After the service is done, it calls an API to remove its own ACI (a rough sketch of steps 2 and 5 follows below).
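A rough sketch of steps 2 and 5 with the Azure CLI (resource group, container name and image are placeholders; in practice you would call the equivalent management API from the function and from inside the container):

    # Step 2: the timer function (or code it calls) creates the container instance.
    az container create \
      --resource-group my-rg \
      --name etl-job \
      --image myregistry.azurecr.io/etl-job:latest \
      --restart-policy Never

    # Step 5: the container removes its own instance once the work is done.
    az container delete --resource-group my-rg --name etl-job --yes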
But in any case, durable functions usually do the trick.
Let me know if something is unclear.
Good luck. :)
With the Consumption plan, an Azure Function can run for a maximum of 10 minutes; you still need to configure this in host.json.
You can go for an App Service plan, which has no time limit. Again, you need to configure the functionTimeout property in host.json.
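For example, a host.json sketch raising the timeout to the Consumption-plan maximum of 10 minutes (on an App Service plan you can raise this value further):

    {
      "version": "2.0",
      "functionTimeout": "00:10:00"
    }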
For more, see the following tutorial:
https://sps-cloud-architect.blogspot.com/2019/12/azure-data-load-etl-process-using-azure.html
I am very new to Azure so I am not sure if my question is stated correctly but I will do my best:
I have an App that sends data in the form (1.bin, 2.bin, 3.bin...), always in consecutive order, to a blob input container. When this happens it triggers an Azure Function via a QueueTrigger, and the output of the function (1output.bin, 2output.bin, 3output.bin...) is stored in a blob output container.
When Azure crashes, the program retries 5 times before giving up. When Azure succeeds, it runs just once and that's it.
I am not sure what happened last week, but since then, after each successful run, the function is idle for about 7 minutes and then starts the process again as if it were the first time. So, for example, the blob container receives 22.bin, the function processes 22.bin and generates 22output.bin; it is supposed to stop after that, but after seven minutes it is processing 22.bin again.
I don't think it is the app, because each time the app sends data, even if it is the same data, it names it with the next number (in my example 23.bin). But that is not what happens: the function just processes 22.bin again, as if the trigger queue was not cleared after the successful run, and it keeps doing this over and over until I have to stop the function and make it crash in order to stop it.
Any idea why this is happening, and what I can try to correct it, is greatly appreciated. I am just starting to learn about all this stuff.
One thing that could possibly be happening is that the function execution time is exceeding 5 minutes. Since this is a hard limit, the Functions runtime would terminate the current execution and restart the function host.
One way to test this would be to create a Function app using a Standard App Service plan instead of the Consumption plan. A Function app created with a Standard plan does not have an execution time limit. You can log the function start time and end time to see if it is taking longer than 5 minutes to finish processing a queue message.
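A minimal sketch of that kind of logging in a Python queue-triggered function (the binding name and process_blob helper are assumptions standing in for your existing code; the timestamps show up in the Function app logs / Application Insights):

    import logging
    import time

    import azure.functions as func

    def main(msg: func.QueueMessage) -> None:
        start = time.time()
        name = msg.get_body().decode()
        logging.info("Started processing %s at %s", name, time.ctime(start))

        process_blob(name)              # hypothetical helper holding the existing logic

        elapsed = time.time() - start
        logging.info("Finished %s in %.1f seconds", name, elapsed)
        if elapsed > 5 * 60:
            logging.warning("Run exceeded 5 minutes; the host may have restarted a previous execution")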
I set up an Azure Function that runs with a timer trigger every 10 minutes (the CRON schedule is 0 */10 * * * *). The function runs and everything works well, but suddenly, after 3 days, the function is no longer invoked. When I restart the service, it returns to normal and runs every 10 minutes. This problem has happened 3 times.
Is there any explanation for it?
Are you running on the Consumption plan or a dedicated web app?
I'm assuming this is a dedicated web app, as Consumption has a built-in timeout of 5 minutes to avoid runaway functions.
You can set a timeout value for functions in host.json, which will kill these runaway functions and restart them automatically. You can also add verbose logging settings in host.json to help determine why these functions aren't completing.
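For a v2+ Functions runtime, a host.json sketch combining both settings might look like this (the timeout value and log level are only examples):

    {
      "version": "2.0",
      "functionTimeout": "00:05:00",
      "logging": {
        "logLevel": {
          "default": "Trace"
        }
      }
    }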