From the doc: https://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#get-the-ttl-configuration-for-a-namespace, the difference between backlog quotas and TTL is a little confusing.
As I understand it so far: a message arrives at the broker, the broker finds all subscriptions on that topic and puts the message into each subscription's backlog (the backlog is per subscription). When a subscription acknowledges the message, it is removed from that subscription's backlog. Once the message is in no backlog (i.e. all subscriptions have acknowledged it), the message is considered acknowledged, and the retention policy kicks in to decide whether to delete it or keep it for some time.
If a message is not acknowledged in a backlog for some time and the backlog reaches its size limit, the backlog quota policy kicks in. So this is about size rather than time. If we use consumer_backlog_eviction, the message will be discarded from the backlog, but the question is: is it then considered acknowledged, so that the first retention policy kicks in?
And the TTL: if a message is not acknowledged within some time, will it be removed from all backlogs, considered acknowledged, and then handled by the first retention policy?
UPDATE:
to make this question more precise:
In the backlog quotas document, it says:
consumer_backlog_eviction: The broker will begin discarding backlog messages
Does discarding mean the message becomes acknowledged, so that the global retention policy can kick in?
producer_request_hold: The broker will hold and not persist produce request payload
Is it saying that the broker will not put new messages into the backlog? And are those incoming messages automatically acknowledged or not (say there is just one subscription at that moment)? Also, does this block the real producer? (I guess not; the broker just won't put new messages into the backlog anymore.)
(for TTL) If disk space is a concern, you can set a time to live (TTL) that determines how long unacknowledged messages will be retained.
Again, if the TTL is exceeded, the message will not be "retained". Does that mean it becomes acknowledged, or is it just thrown away?
And if we use consumer_backlog_eviction, this message will be discarded from the backlog, but question, is that considered acknowledged or not? so the first retention policy kicks in?
The message will be acknowledged and marked for deletion. The retention policy for acknowledged messages will then kick in at some point, depending on the configuration.
And the TTL, if a message is not acknowledged for some time, will it be removed from all backlogs? and then considered as acknowledged and then let the first retention policy handle it?
The TTL is applied to all backlogs, and outdated unconsumed messages are automatically acknowledged. Again, the retention policy for acknowledged messages will then kick in.
Related
Azure Service Bus entities (queues/topics) support a time to live (TTL). When the TTL passes, the message expires. On expiry, the system deletes the message or moves it to the dead-letter queue (DLQ). Does Service Bus have another setting to delete messages from the DLQ after a specified period? For instance, to avoid exceeding size quotas, we might like to delete messages from the DLQ after six months.
See also:
Do messages in dead letter queues in Azure Service Bus expire?
https://learn.microsoft.com/en-us/azure/service-bus-messaging/message-expiration?WT.mc_id=Portal-Microsoft_Azure_ServiceBus
Azure Service Bus doesn't have an expiration option on the dead-letter queues. This is likely intentional, as the system shouldn't just lose those messages but rather do something about them.
Sometimes, monitoring all dead-letter queues for total size and other metrics is inconvenient. One option is to create a centralized DLQ. That allows the following:
Monitoring a single "dead-letter" queue.
Receiving messages from a single entity for processing.
Keeping the size under control by specifying a TTL on the queue.
For example, say you've got two queues, test-dlq and test-dlq2. You'd configure both to auto-forward dead-lettered messages to a third queue, test-dlq-all. With that, whenever messages received by test-dlq or test-dlq2 are dead-lettered, they end up in the centralized "DLQ" queue (test-dlq-all).
The nice part is that whenever you have messages auto-forwarded, you'll always know where they were originally dead-lettered.
For example, say you've got two messages, each from a different queue, ending up in test-dlq-all, the centralized "DLQ". Inspecting them will reveal a system property, DeadLetterSource, stamped with the name of the queue each was originally dead-lettered in.
This solution lets you set a TTL on the test-dlq-all queue and have messages auto-purged.
Also worth mentioning: messages can end up dead-lettered either by explicit dead-lettering to the centralized "DLQ" or as a result of failed processing that exceeds MaxDeliveryCount. For that reason, it is also worth monitoring test-dlq-all's own DLQ.
I'm new to Pulsar, and after reading some documentation I am a little confused about message acknowledgment.
Say I have one topic and two subscriptions, SubA and SubB. A message is consumed by both SubA and SubB, but only SubA acknowledges it. Now, say after 2 days our retention policy kicks in and wants to delete all acknowledged messages older than 2 days. In this case, is the message considered acknowledged or not? (Only SubA acknowledged it; SubB didn't.)
The message is considered unacknowledged and will not be deleted. It is still being held for delivery in subscription SubB. Messages need to be acknowledged in all subscriptions before they are removed.
Like a traditional message broker, Pulsar keeps unacknowledged messages indefinitely. If this is not what you want, you can configure message TTL, which acknowledges a message after a configurable amount of time.
We include dates in the events sent to our hub. Whenever I connect a new Azure Function to our Event Hub with a new consumer group, it seems to receive all events ever sent to the hub. This is somewhat expected; however, I set the Message Retention on the hub to 1 day, so I expected the new consumer to receive at most one day's worth of events. Instead it seems to receive all events, even months-old events (based on the date within each message), and far more events than we generate in a day.
Based on this page:
https://blogs.msdn.microsoft.com/servicebus/2015/03/09/data-retention-in-event-hubs/
It seems like this retention period may be somewhat irrelevant, or misleading. If the "container" hasn't filled up yet, it could contain messages forever. If, for example, the container has a limit of 1000 messages before the event hub looks at it, but it takes a year to generate 1000 messages, does that mean any new consumer could get year-old messages, even with a 1-day "retention period"?
When the container does hit the limit of 1000 messages, are the messages older than 1 day discarded and the messages newer than 1 day ago (within the retention period) retained? Or is the whole container discarded?
From looking at our test and prod environments it seems like this container fits at least 50000 messages (or equivalent size).
Is a checkpoint the only way to limit this initial influx of messages for a new consumer group?
Retention time is the minimum guaranteed period, not the maximum or an exact value. A 1-day retention means you will have all the messages from the last day, but possibly some older messages too.
So you can rely on 1 day of retention, but be prepared to see older messages as well.
Sometimes there are messages in Azure Queues that are not picked up by Azure Functions and are also not visible from Storage Explorer.
These messages are created without any visibility delay.
Is there any way to know what those messages contain, and why they are not processed by our Azure Functions?
In the image you can see that we have a message in the queue, but it is not visible in the list and has been there for hours.
The Azure Queue API currently has no way to check invisible messages.
There are several situations in which a message will become invisible:
The message was added with a VisibilityTimeout in the Put Message request. The message will be invisible until this initial timeout expires.
The message has been retrieved (dequeued). Whenever a message is retrieved it will be invisible for the duration of the VisibilityTimeout specified by the Get Messages request, or 30 seconds by default.
The message has expired. Messages expire after 7 days by default, or after the MessageTTL specified in the Put Message request. Note: after a while these messages are automatically deleted, but until then they are there as invisible messages.
Use cases
Initial VisibilityTimeout
Messages are created with an initial VisibilityTimeout so that the message can be created now, but processed later (after the timeout expires), for whatever reason the creator has for wanting to delay this processing.
VisibilityTimeout on retrieving
The intended process for processing queue messages is:
The application dequeues one or more messages, optionally specifying the next VisibilityTimeout. This timeout should be bigger than the time it takes to process the message(s).
The application processes the message(s).
The application deletes the message(s). When processing fails, the message(s) are not deleted.
Message(s) for which processing failed will become visible again as soon as their VisibilityTimeout expires, so that they can be retried. To prevent endless retries, step 2 should start by checking the DequeueCount of the message: if it is bigger than the desired retry count, the message should be deleted instead of processed. It is good practice to copy such messages to a dead-letter / poison queue (for example a queue with the original queue name plus a -poison suffix).
MessageTTL
By default messages have a time-to-live of 7 days. If the application cannot keep up with the messages being added, a backlog can build up. Adjusting the TTL determines what happens to such a backlog.
Alternatively, the application could crash, so that the backlog builds up until the application is started again.
It seems that the message has expired. The following steps could reproduce the issue, so you can test it:
Add a message with a short TTL
After the message has expired
We are seeing some behaviour that we can't understand with Service Bus and dead letters, and wonder if someone can give us some insight into how the rules work.
If I create a topic with a TTL of 5 minutes ('LongTopic') with two subscriptions, 'Long' with a TTL of 5 minutes and 'Short' with a TTL of 5 seconds, and then send a test message to the topic, we don't get a dead letter on 'Short' after 5 seconds, but we do after about 1 minute. So it seems I can override the topic TTL with a shorter subscription TTL, but this doesn't necessarily mean the message will be dead-lettered immediately when the TTL expires.
If I create a topic with a TTL of 5 seconds ('ShortTopic') with the same two subscriptions, 'Long' (5 minutes) and 'Short' (5 seconds), and send a test message, we again don't get a dead letter on 'Short' after 5 seconds but do after about 1 minute, and we also get a dead-lettered message on 'Long' after about a minute. So it seems I can't override the topic TTL with a longer subscription TTL, and again the message isn't necessarily dead-lettered immediately when the TTL expires.
We have had topics with much longer TTLs (3000 days), and sometimes we see messages that are not forwarded from a subscription and are not dead-lettered for 1.5 hours, despite the TTL on the subscription being 1 minute.
Does anyone have any idea if this is expected behaviour? Or have a link to the rules about when a message might be dead lettered?
When you set the dead-letter info on a message, that tells Service Bus when the message should move to the dead-letter queue or be completed (i.e. deleted); which one happens depends on whether dead-lettering is enabled on the queue or subscription. If you have an active receiver on the queue or subscription, the dead-letter time will match the dead-letter interval within a few seconds. When you are just sending messages, the system runs a background task that checks the queue or subscription on a regular cadence. As you discovered experimentally, this check happens every 60 seconds.
Your next question is likely "why doesn't this behave the way I want it to?" Service Bus has been designed with a large number of optimizations to make sure that all messages are stored durably and that sends and receives happen as fast as possible. This means we spend a lot of engineering time being durable and fast for the primary scenarios: sending, receiving, and browsing messages.
The behavior you are seeing, which we call "proactive TTL", is actually quite new. It was first introduced into the Windows Azure service in April 2013. Before that, a user had to actively receive on a queue or subscription to force the bookkeeping code to run.
At this time, you will not see the proactive TTL behavior on lightly used queues and subscriptions. If you have a message that expires in more than 2 hours, that message won't move into the dead-letter queue, because the timer does not run on "idle entities". When Service Bus is seeing an unusually high amount of usage, this window can shrink considerably: an idle time as low as 2 minutes on your entity can cause the proactive TTL timer to stop running.