I am using Azure WebJobs in a system that processes XML data received in real time via a web service and queued for later processing. Some of the WebJob functions are invoked at quite a high frequency (hundreds of times per minute). When I first trialed the system the logging seemed to perform well. However, now that several weeks' worth of log data has accumulated, it seems to stop updating and displays "indexing in process" fairly constantly.
How do I 'purge' or clear out the logs?
Can I, and should I, turn off logging selectively for the frequently invoked jobs? How can this be achieved?
My WebJobs are continuous and use the C# API. My question isn't the same as "Azure webjobs output logs indexing taking very long", although that answer is also relevant; I was specifically asking how to purge the logs and how to turn off logging selectively.
You would have configured the AzureWebJobsStorage application setting or connection string in your WebJob. The logs are stored in the blob storage of this storage account. You should be able to manually clear them up there.
Assuming that you are using the Azure WebJobs SDK, you can plug in a custom logger:
// Register a custom trace writer in addition to the default console/dashboard tracers.
JobHostConfiguration config = new JobHostConfiguration();
config.Tracing.Tracers.Add(new CustomTraceWriter(TraceLevel.Info));

JobHost host = new JobHost(config);
host.RunAndBlock();
CustomTraceWriter can wrap every write in a check against an application setting:
// CloudConfigurationManager.GetSetting returns a string, so compare it explicitly.
if (string.Equals(CloudConfigurationManager.GetSetting("EnableWebJobLogging"), "true", StringComparison.OrdinalIgnoreCase))
{
    ....
}
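For completeness, a minimal sketch of what CustomTraceWriter itself could look like, assuming the WebJobs SDK 1.x TraceWriter base class (the console output and the EnableWebJobLogging setting name are just placeholders):

using System;
using System.Diagnostics;
using Microsoft.Azure;             // CloudConfigurationManager (Microsoft.WindowsAzure.ConfigurationManager package)
using Microsoft.Azure.WebJobs.Host;

public class CustomTraceWriter : TraceWriter
{
    public CustomTraceWriter(TraceLevel level) : base(level)
    {
    }

    public override void Trace(TraceEvent traceEvent)
    {
        // Drop all output unless the app setting is present and set to "true".
        if (!string.Equals(CloudConfigurationManager.GetSetting("EnableWebJobLogging"),
                           "true", StringComparison.OrdinalIgnoreCase))
        {
            return;
        }

        // Write the trace wherever you like; Console is only a placeholder.
        Console.WriteLine("[{0}] {1}", traceEvent.Level, traceEvent.Message);
    }
}

Note that the dashboard/blob logs the question is about are driven by the dashboard connection; as far as I know, setting config.DashboardConnectionString = null stops the host writing to the azure-webjobs-hosts containers entirely, which is another option if you want to switch that logging off altogether.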
Related
I use data stored in a blob for some configuration for some Azure web apps, and I'd like to react to changes to it in near real time. Currently I just set a timed event and periodically check whether the ETag of the blob has changed; if it has, I download the new blob.
This is ok, but I don't want to poll the blob too often, and I also want to be reactive. The devs changing the values in the blob want to be able to test the new values quickly.
The web app scales up and down, and each instance of the web app needs to download the config file. So, as far as I can tell, I can't just use the event system that Azure Storage has, as that would only send a notification to one instance.
Is there a recommended way to do this?
As I understand it, you want centralized configuration management for your Azure web apps: once some config has been changed, your app services should reload their configuration automatically. Azure App Configuration provides exactly this kind of functionality.
You can also configure in code the conditions under which all configs are reloaded. There is a .NET Core sample here, and you can find other samples under the Enable dynamic configuration section of the documentation.
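As a rough illustration (not the linked sample itself), registering dynamic refresh with the Microsoft.Extensions.Configuration.AzureAppConfiguration package looks roughly like this; the connection-string variable, sentinel key and 30-second interval are placeholders:

using System;
using Microsoft.Extensions.Configuration;

class Program
{
    static void Main()
    {
        IConfiguration configuration = new ConfigurationBuilder()
            .AddAzureAppConfiguration(options =>
            {
                options.Connect(Environment.GetEnvironmentVariable("APP_CONFIG_CONNECTION_STRING"))
                       // Reload every setting when the sentinel key changes,
                       // checking App Configuration at most every 30 seconds.
                       .ConfigureRefresh(refresh =>
                           refresh.Register("TestApp:Settings:Sentinel", refreshAll: true)
                                  .SetCacheExpiration(TimeSpan.FromSeconds(30)));
            })
            .Build();

        Console.WriteLine(configuration["TestApp:Settings:Message"]);
    }
}

In an ASP.NET Core web app you would additionally call services.AddAzureAppConfiguration() and app.UseAzureAppConfiguration() so that incoming requests trigger the refresh check; each scaled-out instance then picks up the change on its own, which addresses the multi-instance concern.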
I know I can delete old files manually, but I need to automate the process. Some cron script would do the job, but as far as I know my changes will be lost when the App Service is reprovisioned.
The App Service runs Ubuntu.
Yes, by default logs are not automatically deleted (with the exception of Application Logging (Filesystem)). To delete logs automatically, set the Retention Period (Days) field (that's one way to do it).
You could automate the deletion by leveraging the Kudu Virtual File System (VFS) REST API. For a sample script, check out this discussion thread for a similar approach.
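Not from that thread, but as a sketch of the idea: deleting a file through the Kudu VFS API is essentially an authenticated HTTP DELETE against the SCM site (the site name, credentials and file path below are placeholders):

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class KuduLogCleanup
{
    static async Task Main()
    {
        // Placeholder values: use your own site name, deployment credentials and file path.
        string site = "your-app";
        string user = "$your-app";
        string password = "publish-profile-password";
        string filePath = "LogFiles/old.log";

        using (var client = new HttpClient())
        {
            string token = Convert.ToBase64String(Encoding.ASCII.GetBytes($"{user}:{password}"));
            client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", token);
            // Kudu's VFS API expects an If-Match header on deletes.
            client.DefaultRequestHeaders.Add("If-Match", "*");

            HttpResponseMessage response =
                await client.DeleteAsync($"https://{site}.scm.azurewebsites.net/api/vfs/{filePath}");
            Console.WriteLine(response.StatusCode);
        }
    }
}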
WebJobs is not yet supported for App Service on Linux, but you could use Azure Functions for running scripts if that fits your requirements.
I am using the default logging mechanism that Azure WebJobs provides; the logger type is TextWriter. I have three functions in the same WebJob with extensive logging, and a large number of log entries is generated every minute. With the default WebJob settings, all the logs go into blobs in the storage account. I do not want my storage account to just keep growing with months and months of old logs.
I need a way to clean the logs on a periodic basis. Is there any setting/configuration so that my logs get cleaned up periodically? Or should I write code to monitor the blob container 'azure-webjobs-hosts' and the files inside 'output-logs'? Is that the only place where the logs for my application are stored by default by the WebJob?
I tried searching the web but couldn't find any related posts. Any pointers would be of great help.
Based on my experience, we can achieve this by encoding the period in the Azure Storage container name. We can use a weekly/monthly/daily container name and then use a timer-triggered function to delete the container. For example, if we need to delete weekly data, we write each week's logs to that week's container and delete it the following week via the timer trigger.
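A minimal sketch of that approach, assuming the TimerTrigger from Microsoft.Azure.WebJobs.Extensions and the WindowsAzure.Storage SDK (the weekly naming scheme and CRON schedule are just examples):

using System;
using System.Globalization;
using Microsoft.Azure.WebJobs;
using Microsoft.WindowsAzure.Storage;

public class CleanupFunctions
{
    // Runs every Monday at 01:00 (requires the timer extension to be registered,
    // e.g. config.UseTimers() when building the JobHost) and deletes last week's container.
    public static void PurgeLastWeeksLogs([TimerTrigger("0 0 1 * * 1")] TimerInfo timer)
    {
        // Assumes the storage connection string is available as the AzureWebJobsStorage setting.
        var account = CloudStorageAccount.Parse(
            Environment.GetEnvironmentVariable("AzureWebJobsStorage"));
        var blobClient = account.CreateCloudBlobClient();

        // Example naming scheme: logs-2016-w23 (year plus week number).
        DateTime lastWeek = DateTime.UtcNow.AddDays(-7);
        int week = CultureInfo.InvariantCulture.Calendar.GetWeekOfYear(
            lastWeek, CalendarWeekRule.FirstDay, DayOfWeek.Monday);
        string containerName = string.Format("logs-{0:yyyy}-w{1}", lastWeek, week);

        var container = blobClient.GetContainerReference(containerName);
        container.DeleteIfExists();
    }
}

Note that this only covers logs you write to your own containers; as far as I know, the SDK's dashboard logs under azure-webjobs-hosts do not follow your naming scheme, so those would still need to be deleted by path or via a storage retention/lifecycle policy.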
I have a continuously running WebJob that pulls messages from a Service Bus queue, processes them, and persists data to a SQL database. The processing can sometimes be database intensive.
In trying to increase the performance of the webjob, I noticed one of the largest bottlenecks seems to be logging. I have logging enabled to blob storage and set the level to informational. When I turn off logging (via the portal) the message processing rate triples! Re-enabling the logging brings the performance back down.
Are there any tricks to get the logging performance up? I have checked the obvious things like setting up the storage account in the same location and resource group.
I believe that some improvements were made in WebJobs 2.0, which is still in pre-release (https://www.nuget.org/packages/Microsoft.Azure.WebJobs/2.0.0-beta1). Can you give that a shot to see if that helps?
I'm trying to figure out a solution for recurring data aggregation of several thousand remote XML and JSON data files, by using Azure queues and WebJobs to fetch the data.
Basically, an input endpoint URL of some sort would be called (with a data URL as a parameter) on an Azure website/app. It should trigger a WebJobs background job (or can the job run continuously and check the queue periodically for new work?), fetch the data URL, and then call back an external endpoint URL on completion.
Now the main concern is the volume and its performance/scaling/pricing overhead. There will be around 10,000 URLs to be fetched every 10-60 minutes (most URLs will be fetched once every 60 minutes). With regards to this scenario of recurring high-volume background jobs, I have a couple of questions:
Is Azure WebJobs (or Workers?) the right option for background processing at this volume, and can it scale accordingly?
For this sort of volume, which Azure website tier will be most suitable (comparison at http://azure.microsoft.com/en-us/pricing/details/app-service/)? Or would only a Cloud or VM(s) work at this scale?
Any suggestions or tips are appreciated.
Yes, Azure WebJobs is an ideal solution for this. Azure WebJobs scale with your Web App (formerly Websites), so if you increase your web app instances, you also increase your WebJob instances. There are ways to prevent this, but that's the default behavior. You could also set up autoscale to automatically scale your web app based on CPU or other performance rules you specify.
It is also possible to scale your WebJob independently of your web front end (WFE) by deploying the WebJob to a web app separate from the one where your WFE is deployed. This has the benefit of not taking up machine resources (CPU, RAM) that your WFE is using, while giving you the flexibility to scale your WebJob instances to the appropriate level. I'm not saying this is what you should do; you will have to do some load testing to determine whether this strategy is right (or necessary) for your situation.
You should consider at least the Basic tier for your web app. That would allow you to scale out to 3 instances if you needed to and also removes the CPU and Network I/O limits that the Free and Shared plans have.
As for the queue, I would definitely suggest using the WebJobs SDK and letting the JobHost (from the SDK) invoke your web job function for you instead of polling the queue yourself. This is a really slick solution and frees you from having to write the infrastructure code to retrieve messages from the queue, manage message visibility, delete the message, and so on. For a working example and a quick start on building your web job this way, take a look at the sample code that the Azure WebJobs SDK Queues template generates for you.
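For illustration, a queue-triggered function under the WebJobs SDK is roughly this small (the queue name and the assumption that the message body is the data URL are mine, not from the question):

using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public class Functions
{
    private static readonly HttpClient Http = new HttpClient();

    // The JobHost polls the "fetch-requests" queue and calls this method for each message;
    // visibility timeouts, retries and message deletion are handled by the SDK.
    public static async Task ProcessFetchRequest(
        [QueueTrigger("fetch-requests")] string dataUrl,
        TextWriter log)
    {
        string payload = await Http.GetStringAsync(dataUrl);
        await log.WriteLineAsync(string.Format("Fetched {0} characters from {1}", payload.Length, dataUrl));
        // ...parse the payload and call back the external endpoint here...
    }
}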