Azure Functions - Logging Consolidation - Controlling the host log? - azure

I run a system on top of a bunch of Azure Functions and I'm just tidying up some loose ends. I mostly abandoned the logging provided out of the box by Azure Functions because I found the flush timings to be super irregular, and I also wanted to consolidate the logs from all of my functions into one spot and be able to query them. This all works for the most part, but I have one annoying use case remaining: if a function binding is faulty (e.g. the Azure Function method signature is wrong because someone checked garbage into Git), the function won't be invoked and nothing is written to the function's own log; the error is instead placed into a different file (the host log).
Now, I guess I can just access the storage account that backs the Azure Function app and pull the host log from there, but I was wondering if there is a better way of directly controlling/intercepting the logging in Azure Functions. Does anyone know if there is at least a way of getting it to flush more quickly?

You can see host logs as well as function logs in associated Application Insights:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-monitoring#other-categories

Related

Two Azure Functions share an In-Proc collection

I have two Azure Functions. I think of them as "Producer-Consumer". One is an "HttpTrigger"-based Function (Producer) which can be fired at random times. It writes the input data to a static "ConcurrentDictionary". The second one is a "Timer Trigger" Azure Function (Consumer). It periodically reads the data from the same "ConcurrentDictionary" used by the Producer function and then does some processing.
Both functions are in the same .NET project (but in different classes). The in-memory data sharing through the static "ConcurrentDictionary" works perfectly fine when I run the application locally. While running locally, I assume that they are running under the same process. However, when I deploy these Functions to Azure (they are in the same Function App resource), I found that data sharing through the static "ConcurrentDictionary" is not working.
I am just curious to know, if in Azure Portal, both the Functions have their own process (Probably, that's why they are not able to share in-process static collection). If that is the case, what are my options that these two Functions work as proper "Producer-Consumer"? Will keeping both the Functions in the same class help?
Probably, the scenario is just opposite to what is described in the post - "https://stackoverflow.com/questions/62203987/do-azure-function-from-same-app-service-run-in-same-instance". As against the question in the post, I would like both the Functions to use the same static member of a static class instance.
I am sorry that I cannot experiment too much because the deployment is done through Azure-DevOps pipeline. Too many check-ins in repository is slightly inconvenient. As I mention, it works well locally. So, I don't know how to recreate what's happening in Azure Portal in local environment so that I can try different options? Is there any configurable thing which I am missing to apply?
Don't do that. Use an Azure queue, Event Grid, Service Bus or something else that is reliable, but don't try to use a shared object. It will fail as soon as scale-out happens or as soon as one of the processes dies. Think about functions as independent pieces and do not try to go against the framework.
Yes, it might work when you run the functions locally, but then you are running on a single machine and the runtime might use the same process. Once deployed, that is no longer true.
If you really, really don't want to decouple your logic into a fully separated producer and consumer, then write a single function that uses an in-process queue or collection and have that function deal with the processing.
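To make the decoupling advice concrete, here is a minimal sketch in Python (chosen only for brevity; the question itself is about .NET). The `MessageQueue` interface, `InMemoryQueue`, `producer` and `consumer` are all hypothetical names introduced for illustration. In a real deployment the queue would be a durable service such as an Azure Storage queue or Service Bus, and the in-memory class below exists only so the sketch can run locally.

```python
import json
from collections import deque
from typing import Dict, List, Optional, Protocol


class MessageQueue(Protocol):
    """Hypothetical minimal queue interface; a durable service would sit behind it."""

    def send(self, body: str) -> None: ...

    def receive(self) -> Optional[str]: ...


class InMemoryQueue:
    """Local stand-in for a durable queue. Only for running this sketch."""

    def __init__(self) -> None:
        self._items = deque()

    def send(self, body: str) -> None:
        self._items.append(body)

    def receive(self) -> Optional[str]:
        # Return the oldest message, or None when the queue is empty.
        return self._items.popleft() if self._items else None


def producer(queue: MessageQueue, payload: Dict) -> None:
    """What the HTTP-triggered function would do: serialize and enqueue."""
    queue.send(json.dumps(payload))


def consumer(queue: MessageQueue) -> List[Dict]:
    """What the timer-triggered function would do: drain and process messages."""
    processed = []
    while True:
        body = queue.receive()
        if body is None:
            break
        processed.append(json.loads(body))
    return processed
```

Because the two functions only share the queue, and not any in-process state, it no longer matters whether they run in the same process or on the same instance.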

Blob trigger affecting application insight logging in azure functions

I have two azure functions that exist in the same azure function app and they are both connected to the same instance of application insights:
TimerFunction uses a TimerTrigger and executes every 60 seconds and logs each log type for testing purposes.
BlobFunction uses a BlobTrigger and its functionality is irrelevant for this question.
It appears that when BlobFunction is enabled (it isn't being triggered by the way), it clogs up the application insights with polling, as I don't receive some of the log messages written in TimerFunction. If I disable BlobFunction, then the logs I see in the development tools monitor for TimerFunction are all there.
This is shown in the screenshot below. TimerFunction and BlobFunction were both running until I disabled BlobFunction at 20:24, where you can clearly see the logs working "normally", then at 20:26 I re-enabled BlobFunction and the logs written by TimerFunction are again intermittent, and missing my own logged info.
Here is the sample telemetry from the live metrics tab:
Am I missing something glaringly obvious here? What is going on?
FYI: My host.json file does not set any log levels, I took them all out in the process of testing this and it is currently a near-skeleton. I also changed the BlobFunction to use a HttpTrigger instead, and the issue disappeared, so I'm 99% certain it's because of the BlobTrigger.
EDIT:
I tried to add an Event Grid trigger instead as Peter Bons suggested, but my resource group shows no storage account for some reason. The way the linked article shows, and the way this video shows (https://www.youtube.com/watch?v=0sEzimJYhME&list=WL) just don't work for me. The options are just different, as shown below:
It is normal behavior that the polling clutters your logs. You can of course set a log level in host.json to filter out those messages, though you might lose some other valuable logging as well.
As for possibly missing telemetry: it could very well be that some logs are dropped due to sampling, which is enabled by default. I would also not be surprised if some logging is not shown in the portal. I've personally experienced logging being delayed by up to 10 minutes, or not available at all, on the Azure Functions log page in the portal. Try a direct query in App Insights as well.
Or you can go directly to the App Insights resource and create some queries yourself that filter out those messages using Search or Logs.
The other option is to not rely on polling with the BlobTrigger but instead use an Event Grid trigger that invokes the function once a blob is added. Here is an example of calling a function when an image is uploaded to an Azure Storage blob container. Because there is no polling involved, this is a much more efficient way of reacting to storage events.
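The two host.json adjustments mentioned in this answer (a per-category log level, and sampling) could look roughly like the output of the sketch below. It is written as a small Python helper that emits the JSON. The helper name `build_host_json` is hypothetical, and the category passed in is a placeholder: the real category name of the polling messages should be read from your own telemetry before filtering on it.

```python
import json


def build_host_json(noisy_category: str, disable_sampling: bool = True) -> str:
    """Return host.json text that quiets one log category and optionally turns sampling off."""
    host = {
        "version": "2.0",
        "logging": {
            "logLevel": {
                "default": "Information",
                # Only warnings and above for the noisy category (placeholder name).
                noisy_category: "Warning",
            },
            "applicationInsights": {
                # Sampling can drop individual log entries; disabling it keeps them all.
                "samplingSettings": {"isEnabled": not disable_sampling}
            },
        },
    }
    return json.dumps(host, indent=2)


if __name__ == "__main__":
    print(build_host_json("Example.Noisy.Category"))
```

Turning sampling off increases the telemetry volume sent to Application Insights, so it may be worth doing only while diagnosing missing log entries.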

Performance impact of writing Azure diagnostic logs to blob storage

Our C# web app, running on Azure, uses System.Diagnostics.Trace to write trace statements for debugging/troubleshooting. Once we enable blob storage for these logs (using the "Application Logging (blob)" option in the Azure portal), the response time for our application slows down considerably. If I turn this option off, the web app speeds up again (though obviously we don't get logs in blob storage anymore).
Does anyone know if this is expected? We certainly write a lot of trace statements on every request (100 or so per request), but I would not think this was unusual for a web application. Is there some way to diagnose why enabling blob storage for the logs dramatically slows down the execution of these trace statements? Is writing the trace statement synchronous with the logs being updated in blob storage, for instance?
I was unable to find any information about how logging to blob storage in Azure was implemented. However, this is what I was able to deduce:
I confirmed that disabling the global lock had no effect. Therefore, the performance problem was not directly related to lock contention.
I also confirmed that if I turn AutoFlush off, the performance problem did not occur.
From further cross referencing the source code for the .NET trace API, my conclusion is that it appears that when you enable blob storage for logs, it injects some kind of trace listener into your application (the same way you might add a listener in web.config) and it synchronously writes every trace statement it receives to blob storage.
As such, it seems that there are a few ways to work around this behavior:
1. Don't turn on AutoFlush, but manually flush periodically. This will prevent the synchronous blob writes from interrupting every log statement.
2. Write your own daemon that periodically copies local log files to blob storage, or something similar.
3. Don't use this blob storage feature at all but instead leverage the tracing functionality in Application Insights.
I ended up doing #3 because, as it turns out, we already had Application Insights configured and on, we just didn't realize it could handle trace logging and querying. After disabling sampling for tracing events, we now have a way to easily query for any log statement remotely and get the full set of traces subject to any criteria (keyword match, all traces for a particular request, all traces in a particular time period, etc.) Moreover, there is no noticeable synchronous overhead to writing log statements with the Application Insights trace listener, so nothing in our application has to change (we can continue using the .NET trace class). As a bonus, since Application Insights tracing is pretty flexible with the tracing source, we can even switch to another higher performance logging API (e.g. ETW or log4net) if needed and Application Insights still works.
Ultimately, you should consider using Application Insights for storing and querying your traces. Depending on why you wanted your logs in blob storage in the first place, it may or may not meet your needs, but it worked for us.
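The first workaround listed above (buffer log statements and flush them in batches rather than on every write) can be illustrated with a minimal sketch. It uses Python's standard `logging` module as a stand-in for the .NET trace API, so it shows the idea rather than the Azure implementation. `SlowSink` and `build_buffered_logger` are hypothetical names, and the sink merely records what it receives instead of writing to blob storage.

```python
import logging
import logging.handlers


class SlowSink(logging.Handler):
    """Stand-in for an expensive destination (e.g. a remote write); just records messages."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record.getMessage())


def build_buffered_logger(name: str, sink: logging.Handler, capacity: int):
    """Return (logger, buffer). Records reach the sink only when the buffer flushes."""
    # MemoryHandler holds records in memory and forwards them to `target`
    # when `capacity` is reached, when a record at flushLevel or above
    # arrives, or when flush() is called explicitly.
    buffer = logging.handlers.MemoryHandler(
        capacity=capacity, flushLevel=logging.ERROR, target=sink
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the sketch isolated from the root logger
    logger.addHandler(buffer)
    return logger, buffer


if __name__ == "__main__":
    sink = SlowSink()
    logger, buffer = build_buffered_logger("sketch.buffered", sink, capacity=100)
    for i in range(5):
        logger.info("trace %d", i)   # cheap: stays in memory
    buffer.flush()                   # one explicit, periodic flush
    print(len(sink.records))
```

The trade-off is the usual one for buffering: records still in memory are lost if the process dies before the next flush.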

Azure Function reaching timeout without doing anything

I have an Azure Function app in Node.js with a couple of Queue-triggered functions.
These were working great, until I saw a couple of timeouts in my function logs.
From that point, none of my triggered functions are actually doing anything. They just keep timing out even before executing the first line of code, which is a context.log()-statement to show the execution time.
What could be the cause of this?
Check your function app's storage account in the Azure portal; you'll likely see very high activity for files monitoring.
This is likely due to the interaction between Azure Files and requiring a large node_modules tree. Once the modules have been required once, functions will execute quickly because modules are cached, but these timeouts can throw the function app into a timeout -> restart loop.
There's a lot of discussion on this, along with one possible improvement (using webpack on server side modules) here.
Other possibilities:
decrease number of node modules if possible
move to dedicated instead of consumption plan (it runs on a different file system which has better performance)
use C# or F#, which don't suffer from these limitations

Azure Diagnostic Configuration from Separate Assembly

We are developing several Azure-based applications in C# and are attempting to centralize some common code in a utility library. One of the common functions is Diagnostic monitoring setup.
We created a class that simplifies the configuration of diag collection, log transfer, etc.
The main issue we are facing is that when we run our code while the class lives in a different assembly from the WebRole or WorkerRole, the diagnostic information is never collected and transferred to azure table storage. If we move the class to the same project as the Web/Worker role, then everything works as expected.
Is there something that either the DiagnosticMonitor.GetDefaultInitialConfiguration(); or the DiagnosticMonitor.Start(StorageConnectionStringKey, _diagConfig); doesn't like about being in another assembly? I'm stumped!
Any insight would be appreciated.
Thanks,
Matt
Which part is not working here? Trace Logs not getting transferred? That seems to be the one that most people have issues with.
We do something similar and have no issues. Typically when you don't see stuff getting transferred it's because the current process where the listener is getting configured is not always the same one where tracing occurs (especially when dynamically adding to trace listener collection). Notably, a lot of users find this issue with web apps in Windows Azure.
What are you expecting to see transferred? Perf counters? Traces? Event Logs? etc.
