I'm currently running two scripts on a weekly schedule on a Raspberry Pi with the following configuration:
Cron executes a Python script at a fixed time each week. This script waits between 0 and 50 hours, then runs Python script A. It then waits about 16 hours and runs script A three more times, 8 hours apart (the first run takes about 4x longer than the later ones). Eight hours after the fourth run, it runs script B.
I would like to move my scripts to a Google Cloud VM for improved reliability, but running the VM 24/7 just to run 30 hours' worth of computations over a 100-hour period is inefficient and expensive.
I know I can use Google Cloud Scheduler as my cron to start the VM weekly, but I still risk letting it run for up to 50 hours while it waits for script A to run. I understand cron supports adding a random sleep interval, as in the example here:
30 8-21/* * * * sleep ${RANDOM:0:2}m ; /path/to/script.php
However, from what I've discovered, Google Cloud Scheduler is limited to 60 minutes, and rightfully so. In this case, what are my options? Does Google Cloud Tasks support delayed triggering of a VM (up to 50 hours)? Is this something Pub/Sub would support instead?
My scripts use a Python library that I don't think is compatible with Google App Engine, so I would also need to figure out how to run a specific script on the VM when it is triggered.
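To make the timing concrete, here is a minimal sketch of the weekly timeline described in this question. The offsets (0, 16, 24, 32 and 40 hours after the first run of script A) are my reading of the description, and the function and constant names are made up for illustration. Something like this could be used to work out when the VM actually needs to be up.

```python
from datetime import datetime, timedelta
import random

# Offsets in hours after the first run of script A, as read from the
# description: A at +0, three more A runs starting ~16 h later and 8 h
# apart, then B 8 h after the fourth A run. These values are assumptions.
RUN_OFFSETS_HOURS = [("A", 0), ("A", 16), ("A", 24), ("A", 32), ("B", 40)]
MAX_INITIAL_DELAY_HOURS = 50


def build_timeline(cron_fire_time, initial_delay_hours):
    """Return (script, start_time) pairs for one weekly cycle."""
    first_run = cron_fire_time + timedelta(hours=initial_delay_hours)
    return [
        (name, first_run + timedelta(hours=offset))
        for name, offset in RUN_OFFSETS_HOURS
    ]


if __name__ == "__main__":
    fire = datetime(2024, 1, 1, 0, 0)  # hypothetical weekly cron time
    delay = random.uniform(0, MAX_INITIAL_DELAY_HOURS)
    for name, when in build_timeline(fire, delay):
        print(name, when.isoformat(timespec="minutes"))
```

The random delay is drawn once per cycle, so every later start time is known as soon as the cycle begins.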
You can use Cloud Scheduler and Pub/Sub to trigger a Cloud Function that will start your VM and execute your script. If you do not want your Compute Engine instances to be running 24/7, at the end of your script you can have your Cloud Function stop your VM.
You can find how to schedule Compute Engine instances with Cloud Scheduler, including how to use Cloud Functions to start and stop your Compute Engine instance, in [1].
Most importantly, here is the documentation on how to use Cloud Scheduler and Pub/Sub to trigger a Cloud Function [2].
[1] https://cloud.google.com/scheduler/docs/start-and-stop-compute-engine-instances-on-a-schedule
[2] https://cloud.google.com/scheduler/docs/tut-pub-sub
[3] Cloud Functions: https://cloud.google.com/functions/docs/concepts/overview
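As a rough sketch of the Cloud Function side of this answer: a Pub/Sub-triggered Python background function receives an event whose data field is base64-encoded, so the function first decodes it and then starts the VM. The payload field names (project, zone, instance) and the start_vm callable are assumptions for illustration. In a real function, start_vm would call the Compute Engine API.

```python
import base64
import json


def parse_pubsub_payload(event):
    """Decode the base64-encoded JSON body of a Pub/Sub-triggered event."""
    raw = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(raw)


def make_handler(start_vm):
    """Build a background-function style handler around a start_vm callable.

    start_vm(project, zone, instance) is a hypothetical helper; in a real
    function it would call the Compute Engine API to start the instance.
    """
    def handle(event, context=None):
        payload = parse_pubsub_payload(event)
        start_vm(payload["project"], payload["zone"], payload["instance"])
        return payload["instance"]

    return handle


if __name__ == "__main__":
    # Simulate the message Cloud Scheduler would publish (field names assumed).
    message = {"project": "my-project", "zone": "us-central1-a", "instance": "weekly-vm"}
    encoded = base64.b64encode(json.dumps(message).encode("utf-8")).decode("ascii")
    handler = make_handler(lambda p, z, i: print(f"would start {i} in {z} ({p})"))
    handler({"data": encoded})
```

Passing start_vm in as a parameter keeps the decoding logic testable locally without any Google Cloud credentials.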
I have 5 trading bots, along with a script that updates prices using scraping. All of these files use Node.js. I was able to deploy all 6 scripts on DigitalOcean, but because the 6 scripts run together as 6 separate processes, CPU usage hit 100% even on their most expensive plan. I then decided to switch to Google Cloud, but it turns out that with a GPU it is extremely expensive.
Essentially, what I want to do is run the 6 scripts at 3 distinct times a day for 10 minutes each. Outside those times the 6 scripts do nothing.
I have set up a file named concurrently.js that runs all of these scripts using the concurrently command.
Is it possible to run concurrently.js at 3 particular times of the day and then after 10 mins when the job is done, shut down the virtual machine?
Say the machine turns on at 12:00 pm, the 6 files work for 10 minutes, and the machine shuts off at 12:10 pm. Then it turns on again at, say, 3:05 pm, and so on.
If I can schedule the VM to turn on and off, I can afford Google Cloud.
I learned about cron and Google Cloud Scheduler, but they need an app URL to schedule tasks. I don't have an app URL because I don't have an app at all; I just want to run the concurrently.js file that sits on the virtual machine along with the other files. Can I do this kind of scheduling?
Any help is highly appreciated!!!
You can do this with Google Cloud. Here is the process:
Cloud Scheduler starts your Compute Engine VM
At startup, the Compute Engine VM runs a startup script that runs your process
At the end of the process, the VM shuts itself down
So for that you need to:
Call the Compute Engine start API
Set a startup script on your VM
Shut down the VM automatically at the end of the processing
If you get stuck on a step, let me know and I can give more specific help.
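The steps above can be sketched roughly as follows, in Python for illustration. start_url builds the address of the Compute Engine instances.start REST method, which Cloud Scheduler would call with an authenticated POST. run_then_shutdown is a hypothetical wrapper that the VM's startup script could invoke; the shutdown command shown is an assumption and depends on the image.

```python
import subprocess


def start_url(project, zone, instance):
    """URL of the Compute Engine instances.start method (called with POST)."""
    return (
        "https://compute.googleapis.com/compute/v1/"
        f"projects/{project}/zones/{zone}/instances/{instance}/start"
    )


def run_then_shutdown(job_cmd, shutdown_cmd=("sudo", "shutdown", "-h", "now"),
                      runner=subprocess.run):
    """Run the job, then power off the VM even if the job failed.

    runner is injectable so the logic can be exercised without actually
    launching processes; by default it is subprocess.run.
    """
    try:
        result = runner(list(job_cmd))
        return result.returncode
    finally:
        # Always reached, so a crashed job does not leave the VM running.
        runner(list(shutdown_cmd))
```

On the VM, the startup script could then call something like run_then_shutdown(["node", "concurrently.js"]). Putting the shutdown in a finally block is a deliberate choice: the main cost risk here is a VM that stays up after a failed run.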
I wrote a Node script that fetches data from WP.org themes/plugins. The theme script takes around 4-5 hours to complete (scraping and inserting data into BigQuery).
The problem arises when I deploy the script on Google App Engine: it works fine for 15 minutes and then stops. Is there any way to increase the execution time of scripts in App Engine?
These scripts will run weekly or every fortnight and need to run until they are done, but App Engine stops them after 15 minutes. They work fine on my localhost, so it's not an issue with Node.
The maximum allowed run time of a request depends on your selected scaling type. So it sounds like you will need to create a separate service to run this task, with the scaling type set to Basic or Manual.
https://cloud.google.com/appengine/docs/standard/nodejs/how-instances-are-managed#scaling_types
You could also try breaking up your task into multiple 10-minute tasks and chaining them together.
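A minimal sketch of that chaining idea, written in Python for illustration: each task processes items until its time budget is used up and returns a cursor for the next task. The names process_chunk and run_chain are made up, and run_chain is only a local stand-in for whatever actually enqueues the follow-up task (a task queue, a new request, and so on).

```python
import time


def process_chunk(items, cursor, budget_seconds, work, clock=time.monotonic):
    """Process items starting at cursor until the time budget runs out.

    Returns the next cursor, or None when every item has been handled.
    At least one item is processed per call, so the chain always advances.
    """
    deadline = clock() + budget_seconds
    while cursor < len(items):
        work(items[cursor])
        cursor += 1
        if cursor < len(items) and clock() >= deadline:
            return cursor  # hand this to the next chained task
    return None


def run_chain(items, budget_seconds, work, clock=time.monotonic):
    """Local stand-in for chaining: each loop pass plays one short task."""
    cursor, tasks = 0, 0
    while cursor is not None:
        cursor = process_chunk(items, cursor, budget_seconds, work, clock)
        tasks += 1
    return tasks
```

For a 10-minute limit you would pick a budget comfortably below 600 seconds and persist the cursor somewhere durable between tasks.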
I would like to perform some video processing task which can take a long time to complete.
I had thought of using Cloud Functions, but I found that a function can run for a maximum of 540 seconds.
Browsing the internet, I find that App Engine can be used to execute long running processes.
I need the 'scale to zero' functionality, so, I cannot use Flexible environment.
On https://cloud.google.com/appengine/docs/the-appengine-environments, I find that 'Maximum request timeout' in standard environment is 60 seconds.
Is there a way to execute long running task in standard environment?
You can use Cloud Tasks:
"all workers must send an HTTP response code (200-299) to the Cloud Tasks service before a deadline based on the instance scaling type of the service: 10 minutes for automatic scaling or up to 24 hours for manual scaling."
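For illustration, here is a sketch of the JSON body you might send to the Cloud Tasks REST API to enqueue an App Engine task. The field names (appEngineHttpRequest, httpMethod, relativeUri, body, scheduleTime) follow the v2 REST resource as I understand it, so please check them against the current reference. The handler path /process-video and the payload are hypothetical. The worker behind that path still has to return a 2xx response before the deadline quoted above.

```python
import base64
import json
from datetime import datetime, timedelta, timezone


def build_task(relative_uri, payload, delay_seconds=0, now=None):
    """Build the JSON body for a Cloud Tasks tasks.create request (REST)."""
    now = now or datetime.now(timezone.utc)
    # The REST API expects the request body as a base64-encoded string.
    body = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    task = {
        "appEngineHttpRequest": {
            "httpMethod": "POST",
            "relativeUri": relative_uri,
            "headers": {"Content-Type": "application/json"},
            "body": body,
        }
    }
    if delay_seconds:
        run_at = (now + timedelta(seconds=delay_seconds)).astimezone(timezone.utc)
        task["scheduleTime"] = run_at.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"task": task}


if __name__ == "__main__":
    request_body = build_task("/process-video", {"video": "a.mp4"}, delay_seconds=60)
    print(json.dumps(request_body, indent=2))
```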
I have Python code that reads data from one cloud system via a REST API using the requests module and then writes data to another cloud system via a REST API. This code runs for anywhere from 1 to 4 hours every week. Is there a place in Google Cloud Platform where I can execute this code on a periodic basis, sort of like a scheduled batch job? Is there a serverless option to do this in App Engine? I know about the App Engine cron service, but it seems to be only for calling a URL regularly. Any thoughts? I appreciate your help.
Google Cloud Scheduler could be the tool you are looking for. As it is mentioned in its documentation:
Cloud Scheduler is a fully managed enterprise-grade cron job scheduler. It allows you to schedule virtually any job, including batch, big data jobs, cloud infrastructure operations, and more. You can automate everything, including retries in case of failure to reduce manual toil and intervention.
See the quickstart for Cloud Scheduler, and also the tutorial on cron jobs.
You can use the Google Genomics API pipelines.run endpoint to run a long-running job on a Google Compute Engine virtual machine and then it will destroy the machine when it's done. If your job will run for less than 24 hours and it can handle a failure, then you can use a Preemptible VM to save cost.
Pipelines: Run
https://cloud.google.com/genomics/reference/rest/v2alpha1/pipelines/run
Preemptible Virtual Machines
https://cloud.google.com/preemptible-vms/
You could use Cloud Scheduler to kick off the job.
Pipelines may be preferable to using one of the serverless technologies, because serverless technologies tend not to handle long-running jobs as well.
You can use AI Platform Training to run any arbitrary Python package — it doesn’t have to be a machine learning job.
I have used Azure Scheduler before for quick jobs. It targeted a URL, which was an ASPX page or a Web API endpoint, and it did the job.
Now I have a job that takes up to 15-20 minutes. Of course, I am getting a timeout error after 30 seconds.
I'm trying to avoid creating a Windows Service or a console application that would run on an Azure VM; I would rather have a non-UI application that runs in the background.
Do you have any suggestions for what I should do?
You should use an Azure WebJob for this. WebJobs support simple scheduling via a cron expression (details here). Basically, you write a simple script file or exe that performs the work you want done and upload it to your Web App along with a cron schedule expression, and Azure WebJobs will make sure it runs on schedule.
For your scenario, you'll want to create a "Continuous" WebJob and ensure you've enabled "Always On" which ensures the background job continues running (i.e. it isn't request triggered).
WebJobs is certainly a good solution, but a WebJob shares resources with the Web App it is attached to.
You could consider using an Azure Cloud Service. I do that myself for longer-running tasks that are more CPU intensive.
For long-running WebJobs, you have to adjust the timeout value (2 minutes by default) or make sure your WebJob writes some output with Console.Write.
To achieve that, go to the Web App Settings > Application Settings and add the following configurations:
WEBJOBS_IDLE_TIMEOUT - Time in seconds after which a running triggered job's process is aborted if it is idle, meaning it has used no CPU time and produced no output.
SCM_COMMAND_IDLE_TIMEOUT - Time in seconds. By default, when your build process launches some command, it's allowed to run for up to 60 seconds without producing any output. If that is not long enough, you can make it longer, e.g. to make it 10 minutes: