How to catch errors raised in Azure Device SDK? - azure

I am using the Azure Device SDK for .NET Core in order to connect my devices to Azure IoT Hub. From time to time the server rejects some messages (like twin updates or telemetry messages) from the devices and responds with status code 400. As a result there are exceptions thrown on client side but due to its asynchronous nature they are swallowed somewhere inside the Azure SDK and never thrown at my code.
How can I actually be notified about these errors so I can handle and display them?
I can also see from the Azure Device SDK code that it uses some kind of logging (EventSource) but this is never enabled in the code:
From Logging.Common.cs:
Log.IsEnabled() // always returns false
Can you point me to some way where I can 1) actually enable logging in the Azure Device SDK and 2) find the content that was actually logged?
Update: Details regarding exception that is swallowed somewhere
// Fired here after I send twin reported properties to server:
AmqpTransportHandler.VerifyResponseMessage:
if (status >= 400)
{
throw new InvalidOperationException("Service rejected the message with status: " + status);
}
// Then becomes caught and re-fired here:
AmqpTransportHandler.SendTwinPatchAsync:
throw AmqpClientHelper.ToIotHubClientContract(exception);
// Then it disappears somewhere in the "dance" of the async tasks

You can capture traces: https://github.com/Azure/azure-iot-sdk-csharp/tree/master/tools/CaptureLogs
Our sample demonstrates best practice regarding exception catching, for example: https://github.com/Azure/azure-iot-sdk-csharp/blob/master/iothub/device/samples/DeviceClientMqttSample/Program.cs

Related

Error reporting in Azure Functions vs App Insights

We call REST based APIs hosted by Azure Functions and fail to implement a consistent error handling supporting App Insights and wonder what can be done about it:
If we don't handle exceptions of the function, then App Insights
reports a 'failure', but the service returns only the the error code to the caller, but no error content:
Hence, the client receives a 500 and thats it.
If we handle the exception and log it (to AppInsights) then App Insights stops reporting a 'failure' hence monitoring on function level is broken. We can query for the exception, but they are out-of-context (i.e. we can see the exception by a custom query only) and we don't know which function is impacted actually.
How to marry up the two needs:
Let the function fail so that AppInsights reports the failure (and monitor can alert)
Return a bit more meaningful error message to the caller than 500.
Example on how it looks in AppInsights:
Exception is visible on the Exceptions tab, but the underlying operation has not failed
UPDATE:
According to Microsoft, App Insight Failures are exclusive to unhandled exceptions. Still, open whether there is a way to at least pass-through an error message.
If don't handle exceptions of the function, then App Insights reports a 'failure', returns the error code, but no error content. Hence, the client receives a 500 and thats it.
App Insights doesn't return anything, so what do you mean with returns the error code?
If we handle the exception, LOG it (to AppInsights), return simple error message in code 500, then App Insights doesn't log this as 'failure' hence monitoring is not possible.
Can you show how you do the logging? Because as soon as you log an exception using App Insights you should see it under failures.
This should work:
try
{
...
}
catch(Exception e)
{
logger.LogError(e, e.Message);
return httpRequest.CreateResponse(HttpStatusCode.InternalServerError, e.Message);
}

Azure Service Bus - random deserialization issues

I've been recently having problems with my Service Bus queue. Random messages (one can pass and the other not) are placed on the deadletter queue with the error message saying:
"DeadLetterReason": "Moved because of Unable to get Message content There was an error deserializing the object of type System.String. The input source is not correctly formatted."
"DeadLetterErrorDescription": "Des"
This happens even before my consumer has the chance to receive the message from the queue.
The weird part is that when I requeue the message through Service Bus Explorer it passes and is successfully received and handled by my consumer.
I am using the same version of Service Bus either for sending and receiving the messages:
Azure.Messaging.ServiceBus, version: 7.2.1
My message is being sent like this:
await using var client = new ServiceBusClient(connString);
var sender = client.CreateSender(endpointName);
var message = new ServiceBusMessage(serializedMessage);
await sender.SendMessageAsync(message).ConfigureAwait(true);
So the solution I have for now for the described issue is that I implemented a retry policy for the messages that land on the dead-letter queue. The message is cloned from the DLQ and added again to the ServiceBus queue and for the second time there is no problems and the message completes successfully. I suppose that this happens because of some weird performance issues I might have in the Azure infrastructure. But this approach bought me some time to investigate further.

Intermittent 501 response from InvokeDeviceMethodAsync - Azure IoT

InvokeDeviceMethodAsync is intermittently (and only recently) returning a status code of 501 within the responses (the response body is null).
I understand this means Not Implemented. However, the method IS implemented - in fact, it's the only method that is. The device is using Microsoft.Azure.Devices.Client (1.32.0-preview-001 since we're also previewing the Device Streams feature).
Setup, device side
This is all called at startup. After this, some invocations succeed, some fail.
var deviceClient = DeviceClient.CreateFromConnectionString(connectionDetails.ConnectionString, TransportType.Mqtt);
await deviceClient.SetMethodHandlerAsync("RedactedMethodName", RedactedMethodHandler, myObj, cancel).ConfigureAwait(true);
Call, server side
var methodInvocation = new CloudToDeviceMethod("RedactedMethodName")
{
ResponseTimeout = TimeSpan.FromSeconds(60),
ConnectionTimeout = TimeSpan.FromSeconds(60)
};
var invokeResponse = await _serviceClient.InvokeDeviceMethodAsync(iotHubDeviceId, methodInvocation, CancellationToken.None);
What have I tried?
Check code, method registration
Looking for documentation about 501: can't find any
Looking through the source for the libraries (https://github.com/Azure/azure-iot-sdk-csharp/search?q=501). Just looks like "not implemented", i.e. nothing registered
Turning on Distributed Tracing from the Azure portal, with sampling rate 100%. Waited a long time, but still says "Device is not synchronised with desired settings"
Exploring intellisense on the DeviceClient object. Not much there!
What next?
Well, I'd like to diagnose.
What possible reasons are there for the 501 response?
Are there and diagnostic tools, e.g. logging, I have access to?
It looks like, there is no response from the method within the responseTimeoutInSeconds value, so for test purpose (and the real response error) try to use a REST API to invoke the device method.
You should received a http status code described here.

Receiving empty bytes array for Imessage.content from Azure service bus

I am currently using two different platforms to perform sync actions. So, when action happens on platform 1 (C# -visual studio), that same action then needs to happen on platform 2 (Java). To do that, I use SB (Azure service bus) for sharing the Json messages between two platforms.
Action 1: platform 1 ----> Service bus -----> platform 2 .
Works fine. The message is sent and received exactly as it should be.
Action 2: platform 2 ----> Service bus ----> platform 1.
Does not work. The message is sent correctly to SB, but when it's picked up, the content property only contains an empty array, when it should be an array of bytes (containing the message). The other message attributes like contentType, replyTo and so on are set correctly, so the message is getting picked up fine. It's just that there is no content or in other words message body.
Action 3: Azure SB/Send messages -----> Service bus -----> platform 1.
Works fine. Another option is to send the JSON message directly to the Service Bus using Azure Service Bus/Send messages and put the JSON message in there.
For actions 2 and 3 exactly the same message is sent to the service bus (when I view it). One is sent directly to service bus, another is generated by the platform 2. I think the actual message somehow seems to be different in the background, which is why the message body is empty in the action 2.
Threads and processes are being handled correctly.
To set up receiver:
IMessageReceiver receive r= null;
if (config.isSessionsEnabled()) {
receiver = ClientFactory.acceptSessionFromConnectionStringBuilder(new ConnectionStringBuilder(connString), config.getSessionID(), ReceiveMode.PEEKLOCK);
} else {
receiver = ClientFactory.createMessageReceiverFromConnectionStringBuilder(new ConnectionStringBuilder(connString), ReceiveMode.PEEKLOCK);
}
For receiving message:
try {
message = receiver.receive(Duration.ofSeconds(1)); //message.Content will be empty array
if(message != null){
//process message
}
} catch (Exception e) {
}
}
Is my understanding correct or perhaps there is another explanation to it!
Is my understanding correct or perhaps there is another explanation to it!
It is very odd that if you send the message from Java platform and the content of message is empty.
Based on my understanding, it seems that message content is empty when you send to ABS. I recommand that you could check the Java send message code before sending to ABS.
You alse could use Azure Service bus exploer to get the message information after it sent to ABS [Before received from .Net platform].

How to stop an Azure WebJobs queue message from being deleted from an Azure Queue?

I'm using Azure WebJobs to poll a queue and then process the message.
Part of the message processing includes a hit to 3rd party HTTP endpoint. (e.g. a Weather api or some Stock market api).
Now, if the hit to the api fails (network error, 500 error, whatever) I try/catch this in my code, log whatever and then ... what??
If I continue .. then I assume the message will be deleted by the WebJobs SDK.
How can I:
1) Say to the SDK - please don't delete this message (so it will be retried automatically at the next queue poll and when the message is visible again).
2) Set the invisibility time value, when the SDK pops a message off the queue for processing.
Thanks!
Now, if the hit to the api fails (network error, 500 error, whatever) I try/catch this in my code, log whatever and then ... what??
The Webjobs SDK behaves like this: If your method throws an uncaught exception, the message is returned to the Queue with its dequeueCount property +1. Else, if all is well, the message is considered successfully processed and is deleted from the Queue - i.e. queue.DeleteMessage(retrievedMessage);
So don't gracefully catch the HTTP 500, throw an exception so the SDK gets the hint.
If I continue .. then I assume the message will be deleted by the WebJobs SDK.
From https://github.com/Azure/azure-content/blob/master/articles/app-service-web/websites-dotnet-webjobs-sdk-get-started.md#contosoadswebjob---functionscs---generatethumbnail-method:
If the method fails before completing, the queue message is not deleted; after a 10-minute lease expires, the message is released to be picked up again and processed. This sequence won't be repeated indefinitely if a message always causes an exception. After 5 unsuccessful attempts to process a message, the message is moved to a queue named {queuename}-poison. The maximum number of attempts is configurable.
If you really dislike the hardcoded 10-minute visibility timeout (the time the message stays hidden from consumers), you can change it. See this answer by #mathewc:
From https://stackoverflow.com/a/34093943/4148708:
In the latest v1.1.0 release, you can now control the visibility timeout by registering your own custom QueueProcessor instances via JobHostConfiguration.Queues.QueueProcessorFactory. This allows you to control advanced message processing behavior globally or per queue/function.
https://github.com/Azure/azure-webjobs-sdk-samples/blob/master/BasicSamples/MiscOperations/CustomQueueProcessorFactory.cs#L63
protected override async Task ReleaseMessageAsync(CloudQueueMessage message, FunctionResult result, TimeSpan visibilityTimeout, CancellationToken cancellationToken)
{
// demonstrates how visibility timeout for failed messages can be customized
// the logic here could implement exponential backoff, etc.
visibilityTimeout = TimeSpan.FromSeconds(message.DequeueCount);
await base.ReleaseMessageAsync(message, result, visibilityTimeout, cancellationToken);
}

Resources