I am running a triggered web job that runs every 15 minutes. It is a console application which moves data from SQL Server to a Mongo database. I have configured the WEBJOBS_IDLE_TIMEOUT and SCM_COMMAND_IDLE_TIMEOUT values to 3600. It sat in an idle state for the last 5 days and never aborted or timed out; I had to restart the web app to abort the web job and restart it.
Is there a way to configure the web job to abort after a specific time period?
The WEBJOBS_IDLE_TIMEOUT configuration only applies if your webjob has no CPU and no IO activity (see the WebJobs page in the Kudu wiki).
CPU activity is measured via Process.TotalProcessorTime, while IO activity is measured by inspecting the standard and error output streams of your webjob for new characters. Source: IdleManager.WaitForExit and Executable.ExecuteInternal.
If possible, let your webjob exit gracefully once it's done its job. If you want to forcefully abort a long-running operation that is still consuming CPU or IO, you'll have to code this into your webjob directly.
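A self-imposed timeout like the one described above could be sketched as follows. This is a minimal illustration in Python (a .NET console webjob would do the same thing with a `Task` and a `CancellationTokenSource`); `run_with_timeout` and its behaviour are my own assumption about how you might structure it, not anything Kudu provides:

```python
import os
import threading

def run_with_timeout(work, timeout_seconds):
    """Run `work` in a worker thread; if it is still busy after
    `timeout_seconds`, hard-exit the process so the run is aborted
    regardless of ongoing CPU/IO activity."""
    worker = threading.Thread(target=work, daemon=True)
    worker.start()
    worker.join(timeout_seconds)
    if worker.is_alive():
        print("Work exceeded the time limit, aborting process", flush=True)
        os._exit(1)  # hard exit; the triggered run will be recorded as failed
    return True
```

Because the abort happens inside the job itself, it works even when the operation is generating output or burning CPU, which is exactly the case the idle-timeout settings do not cover.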
Related
I've got a .NET worker service based on a cron schedule running in a Docker container and pushed up to Azure Container Apps. The schedule is managed within the application itself.
The scaling is set to have a minimum of 1 replica running at all times.
However, we've found that for some reason the application starts up, idles waiting for the schedule trigger for ~20-30 seconds, stops for 2 seconds, starts and idles for ~20-30 seconds again and then doesn't run again for ~5-6 minutes. During the idling time, the job might start if the cron schedule lines up while the process is running.
Is there any way to diagnose why it might be auto-killing the application?
I can't seem to find any logs showing fatal exceptions or anything along those lines, and running in other environments (locally, Azure Container Instances, etc.) doesn't reproduce the behavior. My suspicion is that it's the auto-scaling behavior: Azure is noticing that the process is idle for 20-30 seconds at a time and killing that replica, only for it to spin up again 5 minutes later. However, I can't seem to find anything to prove that theory.
I'm aware that other resource types might be better suited (Container Instances, App Service, Functions) though for now I'm stuck with Container Apps.
Found the cause of the issue based on this SO question:
Azure Container Apps Restarts every 30 seconds
Turns out, Azure was trying to run health checks on the container despite no HTTP ports being exposed. Believing the container was unhealthy, Azure killed and restarted it. Turning off HTTP ingress (and therefore the health checks) solved the issue.
I'm looking to use the API to change the number of web job instances I have running based on the size of a processing queue. I know I can set up rules in the portal, but the minimum aggregation time is 60 minutes, and I don't want the system waiting 60 minutes before scaling up if we suddenly get a burst of work.
The issue I have is that currently, if I scale out in the portal manually from, say, 1 to 5 instances, it kills the single running instance and then starts 5 new ones.
I assume the same thing would happen if I did this through the API; do you know if there is any way to avoid this?
Thanks
Si
UPDATE:
See below: I submitted 4 jobs, and as the first was processing I scaled out from 1 to 3 instances. This is what happened: the job that never finished then reran after the next 3 had finished, because its message would have popped back onto the queue when its processing initially failed.
if I scale out in the portal manually from say 1 to 5 instances it kills the single running instance and then starts 5 new ones.
In my test, scaling your webjob does not kill the single running instance. I created a WebJob from a template and wrote a timer trigger in it.
Here is the time I scaled my web app:
Here is the trigger log in Azure storage ('azure-jobs-host-output'):
If you find your webjob in an 'inactive instance' state in the Azure WebJobs dashboard, please do not worry about it. Your webjob is still running. Please have a look at David's reply in this thread. Here is a snippet:
This is actually a bug in what the Portal displays. The Portal ends up asking an arbitrary instance about the WebJob status, and if it happens to hit any instance other than the one that's actually running it, it will be reported as inactive.
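One way to make scale operations less disruptive is to have the webjob notice when Kudu asks it to stop and finish its current message cleanly: Kudu signals shutdown by creating the file whose path is in the `WEBJOBS_SHUTDOWN_FILE` environment variable. A minimal sketch in Python (the mechanism is Kudu's; the helper function and polling pattern are my own illustration):

```python
import os

def shutdown_requested():
    """Kudu asks a webjob to stop by creating the file whose path is in
    the WEBJOBS_SHUTDOWN_FILE environment variable. Poll this between
    queue messages so an in-flight message can finish cleanly instead of
    being killed mid-processing and popping back onto the queue."""
    shutdown_file = os.environ.get("WEBJOBS_SHUTDOWN_FILE")
    return bool(shutdown_file) and os.path.exists(shutdown_file)
```

A processing loop would check `shutdown_requested()` after each message and exit when it returns True, which avoids the "Never Finished" runs seen above when an instance is recycled during a scale-out.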
I have used Azure Scheduler before for quick jobs. It targets a URL, which is an ASPX page or Web API, and it did the job.
Now I have a job that takes up to 15-20 minutes, so of course I am getting a timeout error after 30 seconds.
I'm trying to avoid creating a Windows Service or a console application that would run on an Azure VM; I'd rather have a non-UI application that runs in the background.
Do you have any suggestion what should I do?
You should use an Azure WebJob for this. WebJobs support simple scheduling via a cron expression (details here). Basically, you upload a simple script file or exe that performs the work you want done to your Web App along with a cron schedule expression, and Azure WebJobs will make sure it runs on schedule.
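The schedule is supplied in a `settings.job` file deployed next to the WebJob's script or exe. The expression uses the six-field NCRONTAB format (seconds first); for example, this one fires every 15 minutes:

```json
{
  "schedule": "0 */15 * * * *"
}
```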
For your scenario, you'll want to create a "Triggered" WebJob with a cron schedule (a cron schedule only applies to triggered WebJobs), and make sure "Always On" is enabled so the site that fires the schedule stays loaded and the job runs reliably.
WebJobs are certainly a good solution, but a WebJob shares resources with the Web App it is attached to.
You could consider using an Azure Cloud Service. I do that myself for longer running tasks, that are more CPU intensive.
Read more
For long-running WebJobs, you have to tinker with the timeout value (2 minutes by default) or make sure your WebJob writes something to the console periodically.
To achieve that, go to the Web App Settings > Application Settings and add the following configurations:
WEBJOBS_IDLE_TIMEOUT - Time in seconds after which a running triggered job's process is aborted if it is idle, i.e. has no CPU time or output.
SCM_COMMAND_IDLE_TIMEOUT - Time in seconds. By default, when your build process launches some command, it's allowed to run for up to 60 seconds without producing any output. If that is not long enough, you can make it longer, e.g. set it to 600 to allow 10 minutes.
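The "make some Console.Writes" approach mentioned above can be sketched as a heartbeat thread that emits output while the real work runs, so the idle detector (which watches the job's stdout/stderr) never sees the job as silent. This Python version is an illustration of the pattern; the function names and the interval are my own choices, not part of any Azure API:

```python
import threading
import time

def heartbeat(stop_event, interval_seconds):
    """Emit periodic console output so idle detection, which watches the
    job's stdout/stderr for new characters, never considers the job idle."""
    while not stop_event.wait(interval_seconds):
        print("still working...", flush=True)

def run_long_job(work, interval_seconds=60):
    """Run `work` while a background thread keeps the output stream alive."""
    stop = threading.Event()
    t = threading.Thread(target=heartbeat, args=(stop, interval_seconds), daemon=True)
    t.start()
    try:
        work()
    finally:
        stop.set()
        t.join()
```

The same effect is achieved in a .NET webjob by calling `Console.WriteLine` from a timer while the long operation runs.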
I have a Scheduled Azure WebJob which runs every 5 minutes. It's not clear what happens if a run takes 10 minutes. Is a new one started in parallel with the one still running, or is it not started until the previous one has finished?
From this answer What happens when a scheduled WebJob runs for a long time :
As I understand it, scheduled webjobs are just triggered webjobs run using Azure Scheduler; if you open Azure Scheduler in the management portal you can see the webjobs and even configure them in more detail. (You can see the log too, which would give you the simple answer to your question.)
If you'd like to see what's going on: your scheduled webjob is run as a triggered webjob by Kudu, and if you look in the Kudu source you will see that a lock file is created when a job is started; if you try to start another job while that lock file exists, a ConflictException is thrown.
Azure Scheduler calls your job using a webhook that catches the ConflictException and gives you the "Error_WebJobAlreadyRunning" warning, which tells you: "Cannot start a new run since job is already running."
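The lock-file behaviour described above can be mimicked in a few lines. This Python sketch is an illustration of the pattern (exclusive file creation as a mutual-exclusion guard), not Kudu's actual code; the names `ConflictError`, `acquire_job_lock`, and `release_job_lock` are my own:

```python
import os

class ConflictError(Exception):
    """Raised when a run is requested while another run holds the lock."""

def acquire_job_lock(lock_path):
    """Create the lock file exclusively; if it already exists, another
    run is in progress, so refuse to start this one."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise ConflictError("Cannot start a new run since job is already running.")
    os.close(fd)

def release_job_lock(lock_path):
    """Remove the lock file so the next scheduled run can start."""
    os.remove(lock_path)
```

So to answer the question directly: a new run is not started in parallel; the attempt simply fails with the conflict warning until the previous run releases the lock.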
I have an Azure web job running continuously, but the logs indicate that over the weekend its status changed to Aborted, then Stopped. Although I did not use the website over the weekend, I am not sure why this would happen, as there are still a lot of messages on the queue that need to be processed.
What can cause a continuous web job to stop or abort?
Does it have a timeout period?
Can the occurrence of multiple errors also cause it to stop or abort?
The job itself doesn't have a timeout period, but the website does. Unless you enable the Always On option, the website (and its jobs) will unload after some idle time.
Another reason why a continuous job could stop is if you are running on the free tier and the job uses too much CPU time (I think you have 2.5 minutes CPU time for every 5 minutes).