Azure NotificationHub - Detect failed notifications - azure

I am trying to store failed notifications in a database, for example when the client has no internet access. This would let a background service check whether a notification is missing and, if so, create it again.
I therefore have the following on my Azure App Service Mobile backend:
var notStat = await hub.SendWindowsNativeNotificationAsync(wnsToast, tag);
telemetry.TrackTrace("failure : " + notStat.Failure + " | Results : " + notStat.Results + " | State : " + notStat.State + " | Success : " + notStat.Success + " | trackingID : " + notStat.TrackingId);
The snippet was meant to test how the client's state affects the outcome, but no matter what I do, the resulting log only says that the message was enqueued.
Question
So how do I detect failed notifications?
Conclusion
To sum up the discussion on the accepted answer:
When the notification has been sent, the NotificationId and other relevant data are stored in a separate table.
When the client receives the notification, it sends a message to the server stating that the notification was received. The entry is then removed from the table.
Notifications that were not received by the client are found by a background task. Every time the background task fires, e.g. every 6 hours, it retrieves all the missing notifications. This enables the background task to re-create the relevant notifications, so the user does not miss any notification.
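The bookkeeping described above can be sketched as follows. This is only a minimal illustration in Python with an in-memory SQLite table, not the code from the original backend; the class name, table name, and column names are all made up for the example.

```python
import sqlite3
import time


class PendingNotifications:
    """Tracks sent notifications until the client confirms receipt (hypothetical helper)."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "notification_id TEXT PRIMARY KEY, user_tag TEXT, payload TEXT, sent_at REAL)"
        )

    def record_sent(self, notification_id, user_tag, payload, sent_at=None):
        # Call right after the hub has accepted (enqueued) the notification.
        ts = time.time() if sent_at is None else sent_at
        self.conn.execute(
            "INSERT OR REPLACE INTO pending VALUES (?, ?, ?, ?)",
            (notification_id, user_tag, payload, ts),
        )
        self.conn.commit()

    def confirm_received(self, notification_id):
        # Call when the client reports that it received the notification.
        self.conn.execute(
            "DELETE FROM pending WHERE notification_id = ?", (notification_id,)
        )
        self.conn.commit()

    def missing(self, older_than_seconds, now=None):
        # Call from the background task: entries still unconfirmed after the grace period.
        now = time.time() if now is None else now
        cutoff = now - older_than_seconds
        return self.conn.execute(
            "SELECT notification_id, user_tag, payload FROM pending "
            "WHERE sent_at <= ? ORDER BY sent_at",
            (cutoff,),
        ).fetchall()
```

The grace period (older_than_seconds) is there so that a notification sent a moment before the background task fires is not treated as lost straight away.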

A result of Enqueued is expected; please refer to the troubleshooting guidance. For more insight into what happened, try setting EnableTestSend:
"result.State will simply state Enqueued at the end of the execution without any insight into what happened to your push. Now you can use the EnableTestSend boolean" (c) documentation
But be aware that when EnableTestSend is enabled, there are some limits (they are described on the same page, so I will not copy them here, to avoid the information going stale).
You can also use the Per Message Telemetry functionality or the REST API (Fiddler plus some documentation).
As a follow-up, there were some discussions on SO that you may find helpful: first and second.
Finally, I would highly recommend taking a look at the FAQ, if you have not yet. It is important to know how the different platforms handle notifications, so that you do not end up debugging behavior that is by design (for example, if the device is offline and several notifications are pending, possibly only the last one will be delivered).

Related

Apache Pulsar Client - Broker notification of Closed consumer - how to resume data feed?

TLDR: We use the Python client library to subscribe to a Pulsar topic. The logs show 'broker notification of consumer closed' when something happens server-side. According to the logs the subscription appears to be re-established, but we find later that the backlog was growing on the cluster because no messages were being sent to our subscription to consume.
We use an Apache Pulsar cluster that is opaque to us, with a namespace defined where we publish to and consume from topics, and it is losing its connection with our consumer.
We have a Python client consuming from a topic (with one Pulsar client subscription per thread).
Due to a problem on the Pulsar cluster, we see the following entry in our client logs:
"Broker notification of Closed consumer"
followed by:
"Created connection for pulsar://houpulsar05.mycompany.com:6650"
....for every thread in our agent.
Then we see the usual periodic log entries like this:
{"log":"2022-09-01 04:23:30.269 INFO [139640375858944] ConsumerStatsImpl:63 | Consumer [persistent://tenant/namespace/topicname, subscription-name, 0] , ConsumerStatsImpl (numBytesRecieved_ = 0, totalNumBytesRecieved_ = 6545742, receivedMsgMap_ = {}, ackedMsgMap_ = {}, totalReceivedMsgMap_ = {[Key: Ok, Value: 3294], }, totalAckedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 3294], })\n","stream":"stdout","time":"2022-09-01T04:23:30.270009746Z"}
This gives the appearance that some connection has been re-established to some other broker.
However, no messages are consumed. We have an alert on a Grafana dashboard that shows us the topic backlog and the subscription backlog. Eventually it hits either a count or a rate threshold, which alerts us that there is a problem. When we restart our agent, the subscription is re-established and the backlog can immediately be seen heading to 0.
Has anyone experienced such an issue?
Our code is typical:
consumer = client.subscribe(
    topic='my-topic',
    subscription_name='my-subscription',
    consumer_type=my_consumer_type,
    consumer_name=my_agent_name
)
while True:
    msg = consumer.receive()
    ex = msg.value()
I haven't yet found a readily available way (docker-compose or anything else) to run a multi-cluster Pulsar installation locally on Docker Desktop, so that I could try killing off a broker and see how the consumer reacts.
Currently the Python client only supports configuring one broker's address and doesn't support retrying the lookup yet. Here are two related PRs that add support for it:
https://github.com/apache/pulsar/pull/17162
https://github.com/apache/pulsar/pull/17410
Therefore, setting up a multi-node cluster might be no different from running a standalone.
If you specified only one broker in the service URL, you can simply test with a standalone: run a consumer and a producer that sends messages periodically, then restart the standalone. The "Broker notification of Closed consumer" message appears when the broker actively closes the connection, e.g. when your consumer has sent a SEEK command (via a seek call), the broker disconnects the consumer and the log line appears.
BTW, it would help to state your Python client version. GitHub issues might also be a better place to track this issue.
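Until the cause is found, the "restart the agent" workaround from the question could be automated inside the agent. The following is only a sketch under assumptions: StallWatchdog, make_consumer, and timeout_errors are hypothetical names, and the consumer is expected to offer receive(timeout_millis=...) and close(), as the Python client's consumer does. It closes and re-creates the consumer after a configurable period without messages.

```python
import time


class StallWatchdog:
    """Re-creates a consumer when nothing has arrived for too long (hypothetical helper)."""

    def __init__(self, make_consumer, max_idle_seconds,
                 timeout_errors=(TimeoutError,), clock=time.monotonic):
        # make_consumer: zero-argument callable returning a fresh consumer,
        # e.g. a lambda wrapping client.subscribe(...).
        self._make_consumer = make_consumer
        self._max_idle = max_idle_seconds
        # Exceptions that mean "no message within the timeout" (assumption:
        # set this to whatever your installed client version raises).
        self._timeout_errors = timeout_errors
        self._clock = clock
        self.consumer = make_consumer()
        self._last_message_at = clock()
        self.rebuilds = 0

    def receive(self, timeout_millis=1000):
        """Return the next message, or None if none arrived within the timeout."""
        try:
            msg = self.consumer.receive(timeout_millis=timeout_millis)
        except self._timeout_errors:
            msg = None
        now = self._clock()
        if msg is not None:
            self._last_message_at = now
            return msg
        if now - self._last_message_at >= self._max_idle:
            self._rebuild(now)
        return None

    def _rebuild(self, now):
        try:
            self.consumer.close()
        except Exception:
            pass  # the old consumer may already be unusable
        self.consumer = self._make_consumer()
        self._last_message_at = now
        self.rebuilds += 1
```

A topic that is legitimately quiet will also trigger a rebuild. With a durable subscription name that should only cost a reconnect, but pick max_idle_seconds with your normal traffic pattern in mind.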

Bots being stuck in reserved state in storage

This happened in EIL on 12/13/2021:
[screenshot: realtime]
None of the bots had jobs, confirmed on bot dashboard:
[screenshot: dashboard]
Apparently a message was lost because a K8s upgrade was going on and ops mistakenly called 3 pods to maintenance. Dispatch still had requests in progress for moving each pod/bot into storage. The solution in this case was the following:
I. Confirm that the request is still in progress and that the bot assignment still exists for each bot:
SELECT JobId FROM [iHerb_Scs_Wes_Agv_Dispatch].[request].[Requests]
WHERE Id = (SELECT RequestId FROM [iHerb_Scs_Wes_Agv_Dispatch].[dispatch].[BotAssignments] WHERE BotId = 'c45766cb-21e7-4a91-a509-017cf0e38580')
II. Emit a FleetJobCompleted event for each bot, using the job ID found in the previous step:
https://rabbit-cluster-scs-prod.iherbscs.net/#/queues/scs.wes.agv.dispatch/FleetJobCompleted
{
    "JobId": "EA5E165F-0A72-428B-9B2A-017DB3216120"
}
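When several bots are affected, the message bodies for step II can be generated instead of typed by hand. This is a small sketch only: fleet_job_completed_bodies is a hypothetical helper, and it assumes the job IDs have already been collected with the query from step I.

```python
import json


def fleet_job_completed_bodies(job_ids):
    """Return one JSON message body per job ID, in the shape shown above."""
    return [json.dumps({"JobId": str(job_id)}, indent=4) for job_id in job_ids]


if __name__ == "__main__":
    # Example job ID taken from the note above.
    for body in fleet_job_completed_bodies(["EA5E165F-0A72-428B-9B2A-017DB3216120"]):
        print(body)
```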

Azure Push Notification Works Randomly

I have a mobile application that uses a backend service to register for Azure push notifications. Things were working fine until 4 days ago; since then, most notifications are not delivered to the application.
I'm using a Service Bus queue and a WebJob to send the notifications. I can see that everything executes successfully for Android, but most of the time the notifications are not delivered to the app, and the notification State equals Enqueued while Success and Failure both equal 0.
I updated Microsoft.ServiceBus to the latest version but that didn't resolve the issue.
One last thing: Apple notifications used to work, but now they throw the exception "The remote server returned an error: (400) Bad Request. The supplied notification payload is invalid".
Is anyone else facing similar issues?
I have experienced the same issue when pushing notifications to iOS devices via Azure's Notification Hubs. I received the same error message when calling the "SendAppleNativeNotificationAsync" method on the hub.
I made sure that I had no illegal characters in my message by replacing "\" and "'". After reading a few posts regarding issues with max limit on notifications, we decided to limit our message size to 150 characters (a magic number, we didn't do any research to find out exactly how big a push notification message is allowed to be).
I also changed how the JSON payload was created, and I'm now using Newtonsoft.Json.Linq to create a JSON object with my payload. I previously created a simple json string for the payload, something like this :
var apnsMessage = "{\"aps\":{\"alert\":"+message+", \"sound\" : \"default\", \"badge\" : 1}}";
Now, my JSON object is created like so:
var jsonPayload = JObject.FromObject(new
{
    aps = new
    {
        alert = message.Replace("\"", "").Replace("'", ""),
        // APNs reads sound and badge from inside the aps dictionary
        sound = "default",
        badge = 1
    }
});
and I send the notification like this:
await Hub.SendAppleNativeNotificationAsync(jsonPayload.ToString());
Hope this helps you (or anyone else with the same issue) :)
EDIT:
Here is a simple helper for trimming/truncating strings :)
private static string GetTrimmedAndTruncatedString(string source, int length)
{
    return source.Length > length ? source.Substring(0, length) + "..." : source;
}
/Isa
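For comparison, here is the same idea sketched in Python: let a JSON serializer do the escaping instead of stripping characters by hand, and truncate by the encoded payload size rather than by a fixed character count. This is a sketch, not the answerer's code; build_apns_payload is a hypothetical name, and the 4096-byte limit is an assumption that should be checked against Apple's current documentation.

```python
import json

# Assumed size limit for a regular APNs payload; verify against Apple's docs.
APNS_MAX_PAYLOAD_BYTES = 4096


def build_apns_payload(message, max_bytes=APNS_MAX_PAYLOAD_BYTES):
    """Return a JSON payload string whose UTF-8 size is at most max_bytes."""

    def render(text):
        # json.dumps escapes quotes and backslashes, so no manual Replace is needed.
        body = {"aps": {"alert": text, "sound": "default", "badge": 1}}
        return json.dumps(body, ensure_ascii=False, separators=(",", ":"))

    text = message
    payload = render(text)
    # Drop one character at a time until the encoded payload fits.
    while len(payload.encode("utf-8")) > max_bytes and text:
        text = text[:-1]
        payload = render(text + "...")
    if len(payload.encode("utf-8")) > max_bytes:
        raise ValueError("max_bytes is too small even for an empty alert")
    return payload
```

Measuring bytes rather than characters matters for non-ASCII text, where one character can take several bytes in UTF-8.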

Pusher subscription fails silently

I am subscribing to a channel in Pusher on my local machine using the Javascript SDK, and I don't get any error.
However, when I publish an event to that channel it is not received by the subscriber.
I've looked at Pusher's debug console and saw that the message is indeed sent but the subscription never occurs, as the connection is somehow interrupted, apparently prior to the subscription request (i.e I get a disconnection message, as shown in the console screenshot below).
The code is pretty much boilerplate:
var pusher = new Pusher('PUSHER_KEY');
var channel = pusher.subscribe('game' + game.gameId);
channel.bind('statusChange', function(game) {
    console.log("GOT PUSHER - STATUS " + game.status);
    $scope.game.status = game.status;
});
Examining the channel.subscribed property shows that the subscription failed, as it equals false. I am on the sandbox plan (max 20 connections) and am only using 2 connections.
What can disrupt the connection?
The channel object:
Console screenshot:
I don't know exactly what the issue is, but enabling logging on the client side might help you find it:
Pusher.log = function(message) {
    if (window.console && window.console.log) {
        window.console.log(message);
    }
};
There are also some resources on the website for debugging this kind of problem: http://pusher.com/docs/debugging

Azure Servicebus AutoDeleteOnIdle

I'm trying to figure out the correct behavior when setting AutoDeleteOnIdle. I have a topic called MyGameMessages (not disclosing the game name since it might be considered advertisement).
What I do is that I create a subscription on each node in my server farm.
var manager = GetNameSpaceManager();
_subscriptionId = Guid.NewGuid().ToString();
var description = new SubscriptionDescription(topic, _subscriptionId);
description.AutoDeleteOnIdle = TimeSpan.FromHours(1);
manager.CreateSubscription(description);
Then I start up a thread that pretty much loops for eternity (or at least until signaled to quit)
while (_running)
{
    if (_subscriptionId == null)
        break;
    var message = client.Receive(TimeSpan.FromMinutes(1)); // MARK A
    if (message != null)
    {
        var body = message.GetBody<T>();
        // Do stuff with message
        message.Complete();
    }
}
Question A:
The first implementation had no timeout at MARK A. If no message is sent to this topic within one hour the subscription was autodeleted. Is this the behavior to expect? The client isn't really dead but I guess it just sits around waiting for a message. Is there no keep alive?
Question B:
Would it help to add the timeout as in MARK A or is it a better solution to create a new subscription every 50th minute (to create a small overlap just in case) and abandon the old one?
Thanks
Johan
Johan, the scenario you describe above should work as you expect. A pending receive call will keep the subscription alive even if no messages are flowing. Longer timeouts for the Receive are better, so that you do not have chatty traffic when message volume is low. One thing to confirm is whether you are setting the AutoDeleteOnIdle value on the Topic. In that case a receive on a subscription will NOT keep the Topic alive, and if no messages are sent to the Topic for one hour, it will get deleted. Deleting a Topic results in all of its Subscriptions being deleted too.
Are you still seeing this behavior of Subscriptions being deleted? If so, please create a ticket with Azure live site support and the product team can investigate the specifics.
