Azure Function Failing on portal with Exception: System.TimeoutException - azure

I am new to Azure development and developed a function app
I published my function app to Azure portal. It is working fine on my development machine but on portal it's throwing following exception (some times)
The operation 'ScanLogs' with id '2eggec6de-54f5-4t34-5423-afffce5c6a43' did not complete in '00:02:00'.
I couldn't find solution to this error. Can somebody help me to understand what this error is about and why we get this?
following is timeout specified in host.json in prod.

Depending on the version of function you are running check here for the exact syntax in host.json - https://learn.microsoft.com/en-us/azure/azure-functions/functions-host-json-v1
The default timeout for a function on the consumption plan is 2 minutes, if you need it to run longer then change/add the functiontimeout value, i.e.
"functionTimeout": "00:05:00",
in host.json.
NB: On a Consumption plan this can't be more than 10 minutes so if you need it to run longer either find a way to break up your function into smaller chunks and maybe use a Durable function fan=out fan-in pattern or change it to run on a dedicated App Service Plan where it can run for as long as you like but obviously you'll have to pay to have the server running 24/7

Related

Stopping the listener 'Microsoft.Azure.WebJobs.Host.Blobs.Listeners.BlobListener' for function <XXX>

I have function app where I have one HttpTrigger and 3 BlobTrigger functions. After I deployed it, http trigger is working fine but for others functions which are blob triggers, it gives following errors
"Stopping the listener 'Microsoft.Azure.WebJobs.Host.Blobs.Listeners.BlobListener' for function " for one function
Stopping the listener 'Microsoft.Azure.WebJobs.Host.Listeners.CompositeListener' for function
" for another two
I verified with other environments and config values are same/similar so not sure why we are getting this issue in one environment only. I am using consumption mode.
Update: When file is placed in a blob function is not getting triggered.
Stopping the listener 'Microsoft.Azure.WebJobs.Host.Blobs.Listeners.BlobListener' for function
I was observed the same message when working on the Azure Functions Queue Trigger:
This message doesn't mean the error in function. Due to timeout of Function activity, this message will appear in the App Insights > Traces.
I have stopped sending the messages in the Queue for some time and has been observed the traces like Web Job Host Stopped and if you run the function again or any continuous activity is present in the Function, then this message will not appear in the traces.
If you are using elastic Premium and has VNET integrated, the non-http trigers needs Runtime scale monitoring enabled.
You can find Function App-->Configuration--> Function runtime settings and turn on Runtime scale monitoring.
If function app and storage account which holds the metadata of the function Private linked, you will need to add the app settings WEBSITE_CONTENTOVERVNET = 1.
Also, make sure you have private linked for blob, file, table and queue on storage account.
I created ticket with MS to fix this issue. After analysis I did some code changes as
Function was async but returning void so changed to return Task.
For the trigger I was using connection string from app settings. But then I changed it to azureWebJobStorage(even though bobth were same) in function trigger attribute param
It started working. So posting here in case it is helpful for others

Name or Service not known - intermittent error in Azure

I have a TimerTrigger which calls my own Azure Functions at a relatively high rate - a few times per second. It is not being stress tested. Every call takes just a 100ms and the purpose of the test is not a stress test.
This call to my own endpoint works about 9999 times out of 10000 but just once in a while I get the following error:
System.Net.Http.HttpRequestException: Name or service not known (app.mycustomdomain.com:443)
---> System.Net.Sockets.SocketException (0xFFFDFFFF): Name or service not known
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
I replaced my actual domain with "app.mycustomdomain.com" in the error message above. It is a custom domain set up to point to the Azure Function App using CNAME.
The Function App does not detect any downtime in the Azure Portal and I have Application Insights enabled and do not see any errors. So I assume the issue is somehow on the callers side and the call never actually happens.
What does this error indicate? And how can I alleviate the problem?
For your second question - alleviating the problem, one option would certainly be to build in retry using a library like Polly. High level you create a policy, e.g. for a simple retry:
var myPolicy = Policy
.Handle<SomeExceptionType>()
.Retry(3);
This would retry 3 times, to use the policy you can call a sync or async version of Execute:
await myPolicy.ExecuteAsync(async () =>
{
//do stuff that might fail up to three times
});
More complete samples are available
This library has lots of support for other approaches, e.g. with delays, exponential delays, etc.

Stackdriver-trace on Google Cloud Run failing, while working fine on localhost

I have a node server running on Google Cloud Run. Now I want to enable stackdriver tracing. When I run the service locally, I am able to get the traces in the GCP. However, when I run the service as Google Cloud Run, I am getting an an error:
"#google-cloud/trace-agent ERROR TraceWriter#publish: Received error with status code 403 while publishing traces to cloudtrace.googleapis.com: Error: The request is missing a valid API key."
I made sure that the service account has tracing agent role.
First line in my app.js
require('#google-cloud/trace-agent').start();
running locally I am using .env file containing
GOOGLE_APPLICATION_CREDENTIALS=<path to credentials.json>
According to https://github.com/googleapis/cloud-trace-nodejs These values are auto-detected if the application is running on Google Cloud Platform so, I don't have this credentials on the gcp image
There are two challenges to using this library with Cloud Run:
Despite the note about auto-detection, Cloud Run is an exception. It is not yet autodetected. This can be addressed for now with some explicit configuration.
Because Cloud Run services only have resources until they respond to a request, queued up trace data may not be sent before CPU resources are withdrawn. This can be addressed for now by configuring the trace agent to flush ASAP
const tracer = require('#google-cloud/trace-agent').start({
serviceContext: {
service: process.env.K_SERVICE || "unknown-service",
version: process.env.K_REVISION || "unknown-revision"
},
flushDelaySeconds: 1,
});
On a quick review I couldn't see how to trigger the trace flush, but the shorter timeout should help avoid some delays in seeing the trace data appear in Stackdriver.
EDIT: While nice in theory, in practice there's still significant race conditions with CPU withdrawal. Filed https://github.com/googleapis/cloud-trace-nodejs/issues/1161 to see if we can find a more consistent solution.

How to set Infinite Timeout for Azure Function app v2.0

I have a very long running process which is hosted using Azure Function App (though it's not recommended for long running processes) targeting v2.0. Earlier it was targeting v1.0 runtime so I didn't face any function timeout issue.
But now after updating the runtime to target v2.0, I am not able to find any way to set the function timeout to Infinite as it was in case of v1.0.
Can someone please help me out on this ?
From your comments it looks like breaking up into smaller functions or using something other than functions isn't an option for you currently. In such case, AFAIK you can still do it with v2.0 as long as you're ready to use "App Service Plan".
The max limit of 10 minutes only applies to "Consumption Plan".
In fact, documentation explicitly suggests that if you have functions that run continuously or near continuously then App Service Plan can be more cost-effective as well.
You can use the "Always On" setting. Read about it on Microsoft Docs here.
Azure Functions scale and hosting
Also, documentation clearly states that default value for timeout with App Service plan is 30 minutes, but it can be set to unlimited manually.
Changes in features and functionality
UPDATE
From our discussion in comments, as null value isn't working for you like it did in version 1.x, please try taking out the "functionTimeout" setting completely.
I came across 2 different SO posts mentioning something similar and the Microsoft documentation text also says there is no real limit. Here are the links to SO posts I came across:
SO Post 1
SO Post 2
One way of doing it is to implement Eternal orchestrations from Durable Functions. It allows you to implement an infinite loop with dynamic intervals. Of course, you need to slightly modify your code by adding support for the stop/start function at any time (you must pass the state between calls).
[FunctionName("Long_Running_Process")]
public static async Task Run(
[OrchestrationTrigger] DurableOrchestrationContext context)
{
var initialState = context.GetInput<object>();
var state = await context.CallActivityAsync("Run_Long_Running_Process", initialState);
if (state == ???) // stop execution when long running process is completed
{
return;
}
context.ContinueAsNew(state);
}
You cannot set an Azure Function App timeout to infinite. I believe the longest any azure function app will consistently run is 10 minuets. As you stated Azure functions are not meant for long running processes. You may need find a new solution for your app, especially if you will need to scale up the app at all in the future.

Azure function goes idle when running in Consumption Plan with ServiceBus Queue trigger

I have also asked this question in the MSDN Azure forums, but have not received any guidance as to why my function goes idle.
I have an Azure function running on a Consumption plan that goes idle (i.e. does not respond to new messages on the ServiceBus trigger queue) despite following the instructions outlined in this GitHub issue:
The configuration for the function is the following json:
{
"ConnectionStrings": {
"MyConnectionString": "Server=tcp:project.database.windows.net,1433;Database=myDB;User ID=user#project;Password=password;Encrypt=True;Connection Timeout=30;"
},
"Values": {
"serviceBusConnection": "Endpoint=sb://project.servicebus.windows.net/;SharedAccessKeyName=SharedAccessKeyName;SharedAccessKey=KEY_HERE",
}
}
And the function signature is:
public static void ProcessQueue([ServiceBusTrigger("queueName", AccessRights.Listen, Connection = "serviceBusConnection")] ...)
Based on the discussion in the GitHub issue, I believed that having either a serviceBusConnection entry OR an AzureWebJobServiceBus entry should be enough to ensure that the central listener triggers the function when a new message is added to the ServiceBusQueue, but that is proving to not be the case.
Can anyone clarify the difference between how those two settings are used, or notice anything else with the settings I provided that might be causing the function to not properly be triggered after a period of inactivity?
I suggest there are several possible causes for this behavior. I have several Azure subs and only one of them had issues with Storage/Service Bus-based triggers only popping up when app is not idle. So far I have observed that actions listed below will prevent triggers from working correctly:
Creating any Storage-based trigger, deleting (for any reason) the triggering object and re-creating it.
Corrupting azure function input parameters by deleting/altering associated objects without recompiling a function
Restarting functions app when one of the functions fails to compile/bind to trigger OR input parameter and hangs may cause same problems.
It has also been observed that using legacy Connection Strings setting for trigger binding will not work.
Clean deploy of an affected function app will most likely solve the problem if it was caused by any of the actions described above.
EDIT:
It looks like this is also caused by setting Authorization/Authentication on the functions app, but I have not yet figured out if it happens in general or when Auth has specific configuration. Tested on affected Azure sub by disabling auth at all - function going idle after 30-40 mins, queue trigger still initiates an execution, though with a delay as expected. I have found an old bug related to this, but it says issue resolved.

Resources