I have a service running in Azure Kubernetes Service that subscribes to a topic on an Azure Service Bus. As messages are received, a number of operations happen before calling a stored proc in an Azure SQL Database. It then publishes a message to another topic on the same Service Bus. This runs, processing thousands of messages without issue.
When the DBA changes the DTUs on that Azure SQL Database, the k8s service stops receiving messages: the logs indicate a message exists, but none is received. It also begins logging "Send Link Closed" errors for the topic the app attempts to publish to.
It never corrects out of this state.
Log entries for the subscribed-to topic:
my-subscribed-topic/Subscriptions/my-subscription-d363a5a7-2262-4c74-a134-3a94f6b3c290-Receiver: RenewLockAsync start. MessageCount = 1, LockToken = 5abf6b8a-21fe-4b16-938a-b179b29ebadc
my-subscribed-topic/Subscriptions/my-subscription-d363a5a7-2262-4c74-a134-3a94f6b3c290-Receiver: RenewLockAsync done. LockToken = 5abf6b8a-21fe-4b16-938a-b179b29ebadc
my-subscribed-topic/Subscriptions/my-subscription-d363a5a7-2262-4c74-a134-3a94f6b3c290: Processor RenewMessageLock complete. LockToken = 5abf6b8a-21fe-4b16-938a-b179b29ebadc
my-subscribed-topic/Subscriptions/my-subscription-d363a5a7-2262-4c74-a134-3a94f6b3c290: Processor RenewMessageLock start. MessageCount = 1, LockToken = 5abf6b8a-21fe-4b16-938a-b179b29ebadc
my-subscribed-topic/Subscriptions/my-subscription-eaf2b5f2-2d34-43e0-88b1-76414175422e-Receiver: ReceiveBatchAsync done. Received '0' messages. LockTokens =
Log entries for the published-to topic:
Send Link Closed. Identifier: published-to-topic-995469e6-d697-433a-aaea-112366bdc58a, linkException: Azure.Messaging.ServiceBus.ServiceBusException: The link 'G24:260656346:amqps://my-sb-resource.servicebus.windows.net/-34d9f631;1:454:455' is force detached. Code: publisher(link83580724). Details: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. (GeneralError). For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot..
Send Link Closed. Identifier: published-to-topic-6ba720c4-8894-474e-b321-0f84f569e6fc, linkException: Azure.Messaging.ServiceBus.ServiceBusException: The link 'G24:260657004:amqps://my-sb-resource.servicebus.windows.net/-34d9f631;1:456:457' is force detached. Code: publisher(link83581007). Details: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. (GeneralError). For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot..
Send Link Closed. Identifier: published-to-topic-865efa89-0775-4f5f-a5d0-9fde35fdabce, linkException: Azure.Messaging.ServiceBus.ServiceBusException: The link 'G24:260657815:amqps://my-sb-resource.servicebus.windows.net/-34d9f631;1:458:459' is force detached. Code: publisher(link83581287). Details: AmqpMessagePublisher.IdleTimerExpired: Idle timeout: 00:10:00. (GeneralError). For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot..
I can't think of a reason changing DTUs would have any impact whatsoever on maintaining a connection with the Service Bus. We've replicated the behavior three straight times, though.
Related
TL;DR: We use the Python client library to subscribe to a Pulsar topic. The logs show "Broker notification of Closed consumer" when something happens server-side. The subscription appears to be re-established according to the logs, but we later find that the backlog was growing on the cluster because no messages were being delivered to our subscription to consume.
We are running into an issue where an Apache Pulsar cluster we use, which is opaque to us and has a namespace defined where we publish/consume topics, is losing its connection with our consumer.
We have a Python client consuming from a topic (one Pulsar client subscription per thread).
Due to an issue on the Pulsar cluster, we see the following entry in our client logs:
"Broker notification of Closed consumer"
followed by:
"Created connection for pulsar://houpulsar05.mycompany.com:6650"
....for every thread in our agent.
Then we see the usual periodic log entries like this:
{"log":"2022-09-01 04:23:30.269 INFO [139640375858944] ConsumerStatsImpl:63 | Consumer [persistent://tenant/namespace/topicname, subscription-name, 0] , ConsumerStatsImpl (numBytesRecieved_ = 0, totalNumBytesRecieved_ = 6545742, receivedMsgMap_ = {}, ackedMsgMap_ = {}, totalReceivedMsgMap_ = {[Key: Ok, Value: 3294], }, totalAckedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 3294], })\n","stream":"stdout","time":"2022-09-01T04:23:30.270009746Z"}
This gives the appearance that some connection has been re-established to some other broker.
However, we do not get any messages being consumed. We have an alert on a Grafana dashboard which shows us the backlog on topics and subscriptions. Eventually it hits a count or rate threshold which alerts us that there is a problem. When we restart our agent, the subscription is re-established and the backlog can immediately be seen heading to 0.
Has anyone experienced such an issue?
Our code is typical:
import pulsar

# Client points at the single broker address we are given (see the log above).
client = pulsar.Client('pulsar://houpulsar05.mycompany.com:6650')

consumer = client.subscribe(
    topic='my-topic',
    subscription_name='my-subscription',
    consumer_type=my_consumer_type,
    consumer_name=my_agent_name
)

while True:
    msg = consumer.receive()
    ex = msg.value()
    # ... process the message, then acknowledge it ...
    consumer.acknowledge(msg)
I haven't yet found a readily available way (docker-compose or anything similar) to run a multi-broker Pulsar installation locally on Docker Desktop so I can try killing off a broker and see how the consumer reacts.
Currently the Python client only supports configuring one broker's address and doesn't support retry for lookup yet. Here are two related PRs to support it:
https://github.com/apache/pulsar/pull/17162
https://github.com/apache/pulsar/pull/17410
Therefore, setting up a multi-node cluster might be no different from a standalone.
If you only specified one broker in the service URL, you can simply test it with a standalone: run a consumer and a producer sending messages periodically, then restart the standalone. The "Broker notification of Closed consumer" log appears when the broker actively closes the connection, e.g. when your consumer has sent a SEEK command (via a seek call); the broker then disconnects the consumer and the log appears.
By the way, it would be better to include your Python client version. And a GitHub issue might be a better place to track this.
I am using ServiceBusProcessorClient to consume events from a topic:
ServiceBusProcessorClient serviceBusProcessorClient = new ServiceBusClientBuilder()
    .connectionString(busConnectionString)
    .processor()
    .disableAutoComplete()
    .topicName(topicName)
    .subscriptionName(subscriptionName)
    .processMessage(processMessage)
    .processError(context -> processError(context, countdownLatch))
    .maxConcurrentCalls(maxConcurrentCalls)
    .buildProcessorClient();
serviceBusProcessorClient.start();
But after killing the app, the message count in Azure Service Bus keeps decreasing until it reaches 0.
I cannot understand what is going wrong in my implementation.
The topic configuration: (screenshot: topic config)
The subscription configuration: (screenshot: subscription config)
It looks like Helm deletes using the background propagation policy, which lets the garbage collector delete dependents in the background. This is probably why your service is still processing messages even after you run uninstall.
You would have to kill the process directly, in addition to running helm uninstall, to stop any more messages from being processed.
I've recently been having problems with my Service Bus queue. Random messages (one can pass while another does not) are placed on the dead-letter queue with an error message saying:
"DeadLetterReason": "Moved because of Unable to get Message content There was an error deserializing the object of type System.String. The input source is not correctly formatted."
"DeadLetterErrorDescription": "Des"
This happens even before my consumer has the chance to receive the message from the queue.
The weird part is that when I requeue the message through Service Bus Explorer it passes and is successfully received and handled by my consumer.
I am using the same version of the Service Bus package for both sending and receiving the messages:
Azure.Messaging.ServiceBus, version: 7.2.1
My message is being sent like this:
await using var client = new ServiceBusClient(connString);
var sender = client.CreateSender(endpointName);
var message = new ServiceBusMessage(serializedMessage);
await sender.SendMessageAsync(message).ConfigureAwait(true);
So the solution I have for now is a retry policy for messages that land on the dead-letter queue: the message is cloned from the DLQ and added back onto the Service Bus queue, and on the second attempt there are no problems and the message completes successfully. I suppose this happens because of some odd performance issue in the Azure infrastructure, but this approach bought me some time to investigate further.
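For reference, here is a minimal sketch of that DLQ retry using the same Azure.Messaging.ServiceBus package; the queue name "my-queue" is a placeholder, connString is the same connection string used above, and in practice you would likely run this on a schedule and cap the number of retries.

await using var client = new ServiceBusClient(connString);

// Receive from the dead-letter sub-queue of the same queue.
var dlqReceiver = client.CreateReceiver("my-queue", new ServiceBusReceiverOptions
{
    SubQueue = SubQueue.DeadLetter
});
var sender = client.CreateSender("my-queue");

var deadLettered = await dlqReceiver.ReceiveMessagesAsync(maxMessages: 10);
foreach (var dead in deadLettered)
{
    // Clone the payload and application properties into a fresh message and resubmit it.
    var clone = new ServiceBusMessage(dead.Body);
    foreach (var prop in dead.ApplicationProperties)
    {
        clone.ApplicationProperties.Add(prop.Key, prop.Value);
    }
    await sender.SendMessageAsync(clone);

    // Remove the original from the dead-letter queue once the clone has been sent.
    await dlqReceiver.CompleteMessageAsync(dead);
}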
Does anyone know the difference between the receive and peek options in Azure Service Bus?
var client = new MessageReceiver("ServiceBusConnectionString", "Queue");
// difference between this one:
var peekResults = await client.PeekAsync(100);
// and this one
var receiveResults = await client.ReceiveAsync(100);
I see I can get the same results, but I want to know which one I should use and why. Internally, what would be the difference?
Peek will fetch messages without increasing the delivery count. It's a way to "preview" messages without removing them from the queue.
Receive will increase the delivery count. When received in ReceiveAndDelete mode, messages are removed from the queue. With PeekLock mode, messages remain on the queue until they are completed; if MaxDeliveryCount is exceeded, they are dead-lettered.
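To make the difference concrete, here is a minimal sketch using the same MessageReceiver as in the question (the connection string and queue name are placeholders):

var receiver = new MessageReceiver("ServiceBusConnectionString", "Queue", ReceiveMode.PeekLock);

// Peek: browses messages without locking them and without changing the delivery count.
// Peeked messages cannot be completed or abandoned; they simply stay on the queue.
var peeked = await receiver.PeekAsync(100);

// Receive (PeekLock): locks the messages and increments their delivery count.
// Each message must be settled, otherwise its lock expires and it becomes visible again;
// once MaxDeliveryCount is exceeded it is dead-lettered.
var received = await receiver.ReceiveAsync(100);
foreach (var message in received)
{
    await receiver.CompleteAsync(message.SystemProperties.LockToken);
}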
I understand that the MS Azure Queue service documentation http://msdn.microsoft.com/en-us/library/windowsazure/dd179363.aspx says first-in, first-out (FIFO) behavior is not guaranteed.
However, our application is such that ALL the messages have to be read and processed in FIFO order. Could anyone please suggest how to achieve guaranteed FIFO ordering using the Azure Queue service?
Thank you.
The docs say for Azure Storage queues that:
Messages in Storage queues are typically first-in-first-out, but sometimes they can be out of order; for example, when a message's visibility timeout duration expires (for example, as a result of a client application crashing during processing). When the visibility timeout expires, the message becomes visible again on the queue for another worker to dequeue it. At that point, the newly visible message might be placed in the queue (to be dequeued again) after a message that was originally enqueued after it.
Maybe that is good enough for you? Otherwise, use Service Bus.
The latest Service Bus release offers reliable messaging queuing: Queues, topics and subscriptions
Adding to @RichBower's answer... check out this: Azure Storage Queues vs. Azure Service Bus Queues
MSDN (link retired): http://msdn.microsoft.com/en-us/library/windowsazure/hh767287.aspx
learn.microsoft.com: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-azure-and-service-bus-queues-compared-contrasted
Unfortunately, many answers mislead toward Service Bus queues, but I assume from the tags that the question is about Storage queues. In Azure Storage Queues, FIFO is not guaranteed, whereas in Service Bus, FIFO message ordering is guaranteed, and even then only with the use of a concept called sessions.
A simple scenario: if another consumer receives a message from the queue, it is not visible to you as the second receiver, so you assume the second message you received is actually the first message (which is where FIFO fails :P).
Consider using Service Bus if this behavior is not acceptable for your requirement.
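As an illustration, here is a minimal sketch of session-based FIFO ordering with the newer Azure.Messaging.ServiceBus SDK, assuming a session-enabled queue named "my-queue" (the queue name, connString and the "LB" session id are placeholders):

await using var client = new ServiceBusClient(connString);

// All messages sent with the same SessionId are delivered in order within that session.
var sender = client.CreateSender("my-queue");
await sender.SendMessageAsync(new ServiceBusMessage("payload") { SessionId = "LB" });

// Accept the next available session and receive; messages within the session arrive in FIFO order.
ServiceBusSessionReceiver receiver = await client.AcceptNextSessionAsync("my-queue");
ServiceBusReceivedMessage message = await receiver.ReceiveMessageAsync();
await receiver.CompleteMessageAsync(message);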
I don't know how fast you want to process the messages, but if you need real FIFO behavior, don't allow Azure's queue to hand out more than one message at a time.
Use this in your Program.cs, at the top of the Main function.
static void Main()
{
    var config = new JobHostConfiguration();

    if (config.IsDevelopment)
    {
        config.UseDevelopmentSettings();
    }

    config.Queues.BatchSize = 1; // Number of messages to dequeue at the same time.
    config.Queues.MaxPollingInterval = TimeSpan.FromMilliseconds(100); // Polling interval for requests to the queue.

    JobHost host = new JobHost(config);

    // ...your initial information...

    // The following code ensures that the WebJob will be running continuously
    host.RunAndBlock();
}
This will get one message at a time, polling the queue every 100 milliseconds.
This is working perfectly with a logger WebJob that writes the trace information to files.
As mentioned here https://www.jayway.com/2013/12/20/message-ordering-on-windows-azure-service-bus-queues/, ordering is not guaranteed in Service Bus either, except when using receive-and-delete mode, which is risky.
You just need to follow the steps below to ensure message ordering:
1) Create a queue with sessions enabled (RequiresSession = true).
2) When sending a message to the queue, provide the session id like below:
var message = new BrokeredMessage(item);
message.SessionId = "LB";
Console.WriteLine("Response from Central Scoring System : " + item);
client.Send(message);
3) When creating the receiver, accept the session and then receive the messages:
var session = queueClient.AcceptMessageSession("LB");
var s = session.Receive();
var body = s.GetBody<string>();
var messageId = s.MessageId;
Console.WriteLine("Message Body:" + body);
Console.WriteLine("Message Id:" + messageId);
4) Messages sent with the same session id are automatically kept in order and delivered in order.
Thanks!!