I developed a time-triggered Azure WebJob and published it to Azure as a Triggered job. Everything works fine, but sometimes the WebJob goes into a shutdown state without logging any exception information in the WebJobs dashboard logs or the Kudu logs.
Before posting this question here, I read this blog about graceful shutdown.
Can anyone suggest how to resolve the above issue?
For Continuous jobs, there is a default grace period of 5 seconds for the job process to shut down before it is killed.
For Triggered jobs, when a shutdown request is detected, there is a default waiting period of 30 seconds for the job process to stop.
You can change the grace period of a job by specifying it (in seconds) in the settings.job file, where the name of the setting is stopping_wait_time, like so:
{ "stopping_wait_time": 60 }
Here is a similar issue you could refer to.
I have a WebJob in Azure, hosted on an App Service that is not used for anything else. I am currently deploying the WebJob from Visual Studio, but this will change in the future, as it's not in production. It's a .NET Core 3.1 application WebJob that compiles to an EXE, but that shouldn't matter to this question (and I'm aware of Azure Functions, but that is also not part of my question).
The WebJob is a continuous WebJob triggered by a queue. I have set it up to run 10 batches simultaneously. I have looked online, but the answers I found were unclear.
My question is: let's say I have 3 jobs running. Then I deploy a new version of the EXE file. This seems to work without problems. But what happens to the jobs that are running? Will they continue running to the end? Or will they fail and stop? I haven't quite managed to sort that out and wanted to ask here in case someone has helpful experience with this.
My queue related config is like this, if that's helpful:
.ConfigureWebJobs(b =>
{
    b.AddAzureStorageCoreServices();
    b.AddAzureStorage(a =>
    {
        a.BatchSize = 10;                                // up to 10 queue messages processed in parallel per instance
        a.NewBatchThreshold = 5;                         // fetch a new batch once 5 or fewer messages are still running
        a.MaxDequeueCount = 1;                           // move a message to the poison queue after 1 failed attempt
        a.MaxPollingInterval = TimeSpan.FromSeconds(20); // poll the queue at least every 20 seconds
    });
})
Thank you!
But what happens to the jobs that are running? Will they continue running to the end? Or will it fail and stop?
When a WebJob that uses the WebJobs SDK picks up a message from a queue, it acquires it with a 10-minute lease. If the job process dies while processing the message, the lease expires after 10 minutes and the message goes back into the queue. If the WebJob is restarted, it will pick up that message again. The message is only deleted if the function completes successfully.
Therefore, if the job dies and restarts immediately, as in the case of a redeploy, it might take up to 10 minutes for the message to be picked up again. Also, because of this, it is recommended to either save state yourself or make the function idempotent.
In the WebJobs Dashboard you will see two invocations for the same message. One of them will be marked as Never Finished because the function execution never completed.
Unfortunately, there is no out-of-the-box solution to prevent jobs from running during a deploy. You would have to create your own logic that notifies (through a queue message?) that a deploy is about to start and then aborts the host. The host abort will wait for any existing function to stop and will prevent new ones from starting. However, this is a very tricky situation if you have multiple instances of the WebJob, because only one of them will get the notification.
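As a rough illustration of the idempotency advice, here is a minimal sketch of a queue-triggered function that skips messages it has already handled. The queue name, the in-memory dedupe store, and the placeholder work are all assumptions for the example; a real multi-instance WebJob would need durable storage (for example a table) for the bookkeeping:
using System.Collections.Concurrent;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class OrderFunctions
{
    // In-memory record of already handled message bodies; illustrative only.
    private static readonly ConcurrentDictionary<string, bool> Processed =
        new ConcurrentDictionary<string, bool>();

    public static async Task ProcessOrderAsync([QueueTrigger("orders")] string orderId, ILogger log)
    {
        // If the host died mid-run, the lease expires and the same message is
        // redelivered; skipping already-handled IDs makes the function idempotent.
        if (!Processed.TryAdd(orderId, true))
        {
            log.LogInformation("Order {OrderId} already processed, skipping.", orderId);
            return;
        }

        await Task.Delay(100); // placeholder for the real order-processing work
    }
}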
We have a WebJob that is invoked with a QueueTrigger every time an order is submitted.
public async Task OrderJobAsync([QueueTrigger("newshipment")] string filename, TextWriter log)
{
    doSomething(); // process the shipment file named in the queue message
}
Over the past week we had some very high order activity, and our WebJob didn't run for a handful of these orders. We have email and Slack notifications set up to fire when a WebJob fails, but those were not triggered at all because the job simply did not run. Does anyone know what could have caused this? Could our order activity have triggered this WebJob too many times?
I believe this was fixed by adding the WEBJOBS_IDLE_TIMEOUT setting in our Azure configuration.
What I think was happening was that our WebJob queue was getting backed up during periods of high order activity, and some jobs did not run within the two-minute window that is the default idle timeout. I extended this to half an hour and we haven't had the problem since.
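For reference, the idle timeout is controlled by an App Service application setting with the value in seconds, so a half-hour timeout would look something like this (the value here is just the example from above):
WEBJOBS_IDLE_TIMEOUT = 1800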
So my WebJob runs on 10 instances, grabs 10 messages off the queue, and processes them, from what I can tell in my personal logs, but the WebJob log never shows it finishing and the status continues to be "running" even though it should be finished. This job does run for a while, about 45-60 minutes, since I'm syncing a ton of data for each call. I checked the process explorer and the thread says "Running", but when I look at the details I see the below:
Process Explorer Example Here
Not sure what to do to make the job change its status to "Success" and continue on with the next item in the queue.
Another related issue: I'm using a ServiceBusTrigger, but since the call takes more than 5 minutes to complete, the next instance of the job picks up the same item from the queue again, so then I have two processes running the same message off the queue. It keeps doing this every 5 minutes until I max out my available instance count, which is 10. Is there a way to stop this from happening? This may be related to the issue above.
In order to fix this, I had to add the following:
public async Task SyncTest([ServiceBusTrigger("syncqueue")] BrokeredMessage message, TextWriter log)
{
    // ... do the sync work ...

    // Explicitly complete the message so it is removed from the queue
    // and not redelivered to another instance.
    message.Complete();
}
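If the underlying problem is that the function runs longer than the queue's 5-minute lock duration, another knob that may help (in the older WebJobs SDK / BrokeredMessage era that this code implies) is letting the SDK keep renewing the message lock while the function runs. A sketch, with property names from Microsoft.Azure.WebJobs.ServiceBus that you should verify against your package version, and a placeholder connection string:
using System;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.ServiceBus;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();

        var serviceBusConfig = new ServiceBusConfiguration
        {
            ConnectionString = "<service-bus-connection-string>" // placeholder
        };

        // Keep renewing the message lock for up to an hour so a long-running
        // function does not have its message redelivered to another instance.
        serviceBusConfig.MessageOptions.AutoRenewTimeout = TimeSpan.FromHours(1);

        config.UseServiceBus(serviceBusConfig);

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}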
We have an Azure WebJob which is scheduled to run at 8:00 AM UTC daily (CRON - 0 00 08 * * *). Most days it triggers correctly, but on some days it triggers twice (the second run occurs ~10 seconds after the first). I can see in the WebJob history that when it triggered twice, the first run's trigger property (from the WebJob history JSON) shows as "External - " and the second run's trigger property shows as "Schedule - 0 0 8 * * *", but we don't have any external services triggering this WebJob.
When I checked the job scheduler log for more details, the "Web Job Invoked" status is only present on the days when the WebJob was triggered twice.
Your problem appears to be that you have two different things triggering your WebJob:
You probably have a settings.job (wiki) with a cron expression.
You may also have an Azure Scheduler Job Collection hitting your WebJob from the outside (possibly with a different schedule, which is why they don't always match).
The suggestion is to get rid of #2 and keep only the internal WebJobs scheduling via settings.job.
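For reference, the internal schedule in settings.job for a daily 8:00 AM UTC run would look something like this (matching the CRON expression from the question):
{ "schedule": "0 0 8 * * *" }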
Subject says it all, really :) Say I've got a pretty busy Azure continuous WebJob that is processing from an Azure queue:
public static void ProcessQueue([QueueTrigger("trigger")] Info info)
{ .... }
If I re-deploy the WebJob, I can see that any currently executing job seems to be aborted (I see a "Never Finished" status). Is that job replayed after I release, or is it lost forever?
Also, is there a nice way to make sure that no jobs are running when we deploy WebJobs, or is it up to the developer to code a solution for that (such as a config flag that is checked on every run)?
Thanks
When a WebJob that uses the WebJobs SDK picks up a message from a queue, it acquires it with a 10-minute lease. If the job process dies while processing the message, the lease expires after 10 minutes and the message goes back into the queue. If the WebJob is restarted, it will pick up that message again. The message is only deleted if the function completes successfully.
Therefore, if the job dies and restarts immediately, as in the case of a redeploy, it might take up to 10 minutes for the message to be picked up again. Also, because of this, it is recommended to either save state yourself or make the function idempotent.
In the WebJobs Dashboard you will see two invocations for the same message. One of them will be marked as Never Finished because the function execution never completed.
Unfortunately, there is no out-of-the-box solution to prevent jobs from running during a deploy. You would have to create your own logic that notifies (through a queue message?) that a deploy is about to start and then aborts the host. The host abort will wait for any existing function to stop and will prevent new ones from starting. However, this is a very tricky situation if you have multiple instances of the WebJob, because only one of them will get the notification.