My app just disappears from the list of consumers in RabbitMQ Admin after working just fine for like 30-40 mins. AMQP lib used: node-amqp. Here's the connection:
const con = amqp.createConnection(options,{defaultExchangeName: 'amq.topic', reconnect: true})
The following event handlers are configured too: connect, ready, close, tag.change, error
The worst part is that i don't get any errors or close events, app just disconnects and logs nothing...
It just seems that connection is terminated out of being 'idle' for a while...
Has anyone had something similar? How did you deal with it?
Perhaps this helps someone. To resolve the issue we have to put heartbeat field to options and specify the interval in seconds the connection has to be checked and refreshed.
The heartbeat is doesn't have any default values, so if it is not explicitly added, amqp won't use it.
Related
TLDR: using python client library to subscribe to pulsar topic. logs show: 'broker notification of consumer closed' when something happens server-side. subscription appears to be re-established according to logs but we find later that backlog was growing on cluster b/c no msgs being sent to our subscription to consume
Running into an issue where we have an Apache-Pulsar cluster we are using that is opaque to us, and has a namespace defined where we publish/consume topics, is losing connection with our consumer.
We have a python client consuming from a topic (with one Pulsar Client subscription per thread).
We have run into an issue where, due to an issue on the pulsar cluster, we see the following entry in our client logs:
"Broker notification of Closed consumer"
followed by:
"Created connection for pulsar://houpulsar05.mycompany.com:6650"
....for every thread in our agent.
Then we see the usual periodic log entries like this:
{"log":"2022-09-01 04:23:30.269 INFO [139640375858944] ConsumerStatsImpl:63 | Consumer [persistent://tenant/namespace/topicname, subscription-name, 0] , ConsumerStatsImpl (numBytesRecieved_ = 0, totalNumBytesRecieved_ = 6545742, receivedMsgMap_ = {}, ackedMsgMap_ = {}, totalReceivedMsgMap_ = {[Key: Ok, Value: 3294], }, totalAckedMsgMap_ = {[Key: {Result: Ok, ackType: 0}, Value: 3294], })\n","stream":"stdout","time":"2022-09-01T04:23:30.270009746Z"}
This gives the appearance that some connection has been re-established to some other broker.
However, we do not get any messages being consumed. We have an alert on Grafana dashboard which shows us the backlog on topics and subscription backlog. Eventually it either hits a count or rate thresshold which will alert us that there is a problem. When we restart our agent, the subscription is re-establish and the backlog is can immediately be seen heading to 0.
Has anyone experienced such an issue?
Our code is typical:
consumer = client.subscribe(
topic='my-topic',
subscription_name='my-subscription',
consumer_type=my_consumer_type,
consumer_name=my_agent_name
)
while True:
msg = consumer.receive()
ex = msg.value()
i haven't yet found a readily-available way docker-compose or anything to run a multi-cluster pulsar installation locally on Docker desktop for me to try killing off a broker and see how consumer reacts.
Currently Python client only supports configuring one broker's address and doesn't support retry for lookup yet. Here are two related PRs to support it:
https://github.com/apache/pulsar/pull/17162
https://github.com/apache/pulsar/pull/17410
Therefore, setting up a multi-nodes cluster might be nothing different from a standalone.
If you only specified one broker in the service URL, you can simply test it with a standalone. Run a consumer and a producer sending messages periodically, then restart the standalone. The "Broker notification of Closed consumer" appears when the broker actively closes the connection, e.g. your consumer has sent a SEEK command (by seek call), then broker will disconnect the consumer and the log appears.
BTW, it's better to show your Python client version. And GitHub issues might be a better place to track the issue.
I have written a client which tries to connect to Azure service bus. As soon as the server starts up i get the below errors and i receive no messages present at the queue. I tried replacing the sb protocol with amqpwss, but it dint help.
2020-05-25 21:23:11 [ReactorThreadeebf108d-444b-4acd-935f-c2c2c135451d] INFO c.m.a.s.p.RequestResponseLink - Internal send link 'RequestResponseLink-Sender_0480eb_c31e1cc239bf471e811e53a30adc6488_G51' of requestresponselink to '$cbs' encountered error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: com.microsoft.azure.servicebus.amqp.AmqpException: The connection was inactive for more than the allowed 60000 milliseconds and is closed by container 'LinkTracker'. TrackingId:c31e1cc239bf471e811e53a30adc6488_G51, SystemTracker:gateway7, Timestamp:2020-05-25T21:23:10
at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:55)
at com.microsoft.azure.servicebus.primitives.RequestResponseLink$InternalSender.onClose(RequestResponseLink.java:759)
at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:66)
at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:42)
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176)
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:291)
at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:491)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.microsoft.azure.servicebus.amqp.AmqpException: The connection was inactive for more than the allowed 60000 milliseconds and is closed by container 'LinkTracker'. TrackingId:c31e1cc239bf471e811e53a30adc6488_G51, SystemTracker:gateway7, Timestamp:2020-05-25T21:23:10
... 10 common frames omitted
There is a similar issue opened in GitHub
what you posted here is the trace, not the error. Yes, the service
closes idle connections are 10 minutes. The client traces it and
reopens the connection. It is seamless, doesn't throw any exceptions
to the application. That can't be your problem. If your sends are
failing means there may be another problem, but not this one.
As i see the second line it is about the timeout of 6 secs, can you check the troubleshoot page if it helps. Also this.
we recommend adding "sync-publish=true" to the connection url
I have a Blazor server side app published on IIS 10.
When browsing to an arbitrary page and just letting it idle after a minute or so (sometimes only 45 sec, sometimes something between 1 and two minutes) the modal
Attempting to reconnect to server ...
appears for a couple of seconds.
In the browser console the logging shows either
Error: Connection disconnected with error 'Error: Server timeout
elapsed without receiving a message from the server.'.
or
Information: Connection disconnected.
Since this seems to be a timeout problem I added the following options to ConfigureServices in my startup.cs
services.AddServerSideBlazor()
.AddHubOptions(options =>
{
options.ClientTimeoutInterval = TimeSpan.FromMinutes(10);
options.KeepAliveInterval = TimeSpan.FromSeconds(3);
options.HandshakeTimeout = TimeSpan.FromMinutes(10);
});
This does not solve the problem though.
I also went to the advanced settings of my site in IIS and increased the connection timeout from the default 120 sec to 600 sec. This did not help either.
Those frequent disconnections only happen on the live site hosted on IIS 10.
If I start the app locally with Visual Studio the connection is stable.
Any hints of what I'm missing would be appreciated!
Update:
As suggested by #agua from mars in comment below I changed transport type like this
app.UseEndpoints(endpoints =>
{
endpoints.MapControllers();
endpoints.MapBlazorHub(options => { options.Transports = HttpTransportType.LongPolling; });
endpoints.MapFallbackToPage("/_Host");
});
With this change the connection is still closed. The console log shows
Information: (LongPolling transport) Poll terminated by server.
I also tried HttpTransportType.ServerSentEvents which does not work at all but gives this error
Error: Failed to start the connection: Error: Unable to connect to the
server with any of the available transports. ServerSentEvents failed:
Error: 'ServerSentEvents' does not support Binary.
Update 2:
The IIS is configured to use HTTP 1.1
I tried changing to HTTP/2 but this did not change anything regarding the disconnections.
This is related to application pool recycling in IIS as stated by #Programmer. You can reproduce this by going into the application pool, right click the pool and choose recycle to force it. Your blazor app will get the "reconnect modal screen".
For me, I did not want to disable pool recycle, so I added js in the _Hosts.cshtml file as
<script>Blazor.defaultReconnectionHandler._reconnectCallback = function (d) {document.location.reload();}</script>
to automatically reconnect when the server comes back up.
Try this out..
app.UseEndpoints(endpoints =>
{
//other settings
.
.
endpoints.MapBlazorHub(options => options.WebSockets.CloseTimeout = new TimeSpan(1, 1, 1));
//other settings
.
.
});
This could be related to IIS application pool recycling. Try disabling the recycling to see if that's casing the disconnection.
I suffer the same problem on my Blazor server too: Myspector.com
I am sure this comes from network of data provider. I use Othello in Germany with 4G and see disconnection in 5 sec . When I am with wifi with t online on same target server no disconnection at all.
I Think some operators are incompatible with Blazor server/websoscket....
My recent experience especially on a shared server, increase the pool memory. Connectivity issues went away when we bumped 256MB up to 1GB for a small user base.
I am trying to add a series of responses inside an intent handler and set a timer of 20 minutes which will trigger(at its end) a followup event.
So here is what I've tried:
agent.add(response_1);
//...
agent.add(response_n);
setTimeout(() => {
console.log("Setting follow up event")
agent.setFollowupEvent('20_MINUTES_PASSED');
}, 1200000);
Even though the log was displayed, my function execution stopped before it. I have checked the logs and I saw the message "Function execution took 26 ms, finished with status code: 200" displayed before "Setting follow up event".
I know that each function has a 3-5 sec timeout and I understand this is why the function finished its execution, but I cannot figure out how to trigger that event after those 20 minutes...
There are two issues with this idea: Cloud functions aren't meant to run for that long, you would have to use either a real server or some scheduling service for this. However, Dialogflow doesn't let you do this anyway, webhook requests time out after a few seconds. If you haven't send a response by then the agent will tell the user that your service is unavailable. You can also not initiate a new session without the users explicit request to do so, presumably because developers would quickly abuse this for spam etc. There is thus no way to trigger an event after 20 minutes.
The closest to what you are looking for are probably push notifications, but they are very limited compared to follow up events.
I made an app with a Notification Listener which should observe every notification. But after a certain time the Listener is destroyed or disconnected. What can I do to prevent this?
I tried calling the onConnected method but it doesn't work.
I want the Listener to stay always active.
PackageManager pm = getPackageManager();
pm.setComponentEnabledSetting(new ComponentName(this,com.xinghui.notificationlistenerservicedemo.NotificationListenerServiceImpl.class), PackageManager.COMPONENT_ENABLED_STATE_DISABLED,PackageManager.DONT_KILL_APP);
pm.setComponentEnabledSetting(new ComponentName(this,com.xinghui.notificationlistenerservicedemo.NotificationListenerServiceImpl.class), PackageManager.COMPONENT_ENABLED_STATE_ENABLED,PackageManager.DONT_KILL_APP);
that worked for me. check this out:Cannot get the NotificationListenerService class to work
thx to Hugo