No more devices are able to connect to the MQTT Adapter - eclipse-hono

No more devices are able to connect to the MQTT Adapter. The Adapter log contains the message DefaultConnectionLimitManager - Connection limit (1) exceeded. What does it mean?
[vert.x-eventloop-thread-1] DEBUG o.e.h.s.l.DefaultConnectionLimitManager - Connection limit (1) exceeded
[vert.x-eventloop-thread-1] DEBUG o.e.h.a.m.i.VertxBasedMqttProtocolAdapter - connection request from client [clientId: mosqsub] rejected due to Connection failed: CONNECTION_REFUSED_SERVER_UNAVAILABLE

The configured maximum number of concurrent connections is exceeded and the protocol adapter refuses to accept further connections to prevent resources from running out.
This limit can be configured in the protocol adapter (see Admin Guide). If it is not set, the protocol adapter determines a reasonable value based on the available memory.
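From a device's point of view, the rejection in the log above surfaces as a refused CONNECT. As a minimal sketch (assuming a Node.js client using the mqtt package; the host, port, and credentials are placeholders, not values from this deployment):

```typescript
import * as mqtt from "mqtt";

// Hypothetical adapter endpoint and Hono-style device credentials.
const client = mqtt.connect("mqtts://hono.example.com:8883", {
  username: "device-1@my-tenant",
  password: "secret",
});

client.on("connect", () => console.log("connected"));

// When the adapter has hit its connection limit, it answers the CONNECT
// packet with a "server unavailable" CONNACK code, which mqtt.js reports
// as an error event instead of a connect event.
client.on("error", (err) => {
  console.error("connection rejected:", err.message);
  client.end();
});
```

A client seeing this error only intermittently under load is a good hint that the limit, rather than credentials or networking, is the problem.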

Related

What is the use of the Node.js MongoDB driver keepAlive option?

I'm having a hard time understanding the keepAlive option passed to the Node.js MongoDB driver.
These docs say keepAlive takes a boolean value and that keepAliveInitialDelay is the time to wait before initiating keepAlive on the TCP socket.
These other docs say keepAlive takes an integer value and read: "The number of milliseconds to wait before initiating keepAlive on the TCP socket."
I tried both and failed to find any difference. I also tried both true and false for keepAlive, and 0, 1, and 30000 (the default) for keepAliveInitialDelay.
What is the correct way to use keepAlive?
What does "initiating" keepAlive do? What is the use of the keepAlive option?
Why did it not make any difference even after setting keepAlive to false, or setting it to 0 or 1?
The docs here state that keepAlive affects the artefacts Server, ReplicaSet, and Mongos. So which option does it map to in the server-side docs?
I'm using MongoDB driver v3.3 and MongoDB Atlas v4.2.
Thanks in advance.
The TCP keep-alive mechanism is described here. It is used for:
Checking for dead peers
Preventing disconnection due to network inactivity
When keep-alive is enabled, the driver instructs the network stack to periodically send ping packets to the server on the established connection. If the server's network stack does not respond, the connection is flagged as failed.
Without keep-alive, the driver wouldn't find out about some of the network issues until after the application issued a query (and was waiting for it to be executed).
The driver sets the network stack options here using setKeepAlive.
To see whether keep-alives are being sent, you need to use a tool like tcpdump to inspect traffic on the connections established by the driver.
The server uses the system-wide keep-alive value if it is under 300 seconds; otherwise it sets the keep-alive interval to 300 seconds.
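To make the two parameters concrete, here is a minimal sketch of what the driver effectively does under the hood, using Node's plain net module rather than the driver itself (the host and port are placeholders):

```typescript
import * as net from "net";

const socket = net.connect({ host: "cluster0.example.net", port: 27017 }, () => {
  // keepAlive (boolean) corresponds to the first argument: whether the OS
  // should send TCP keep-alive probes on this socket at all.
  // keepAliveInitialDelay (integer) corresponds to the second: how many
  // milliseconds the socket must be idle before the first probe is sent.
  socket.setKeepAlive(true, 30000);
});
```

The probes are generated by the operating system's network stack, not by JavaScript code, which is also why toggling the option shows no visible difference at the application level on a healthy network: it only matters when a peer dies silently.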

Azure AspNetCore WebApp under high load returns "The specified CGI application encountered an error and the server terminated the process"

I'm hosting my AspNetCore app in Azure (Windows P3v2 hosting plan). It works perfectly fine under normal load (5-10 requests/sec) but under high load (100-200 requests/sec) it starts to hang and requests return the following response:
The specified CGI application encountered an error and the server terminated the process.
And from the event logs I can get even more details:
An attempt was made to access a socket in a way forbidden by its access permissions aaa.bbb.ccc.ddd
I have to scale the instance count to 30 instances, and while each instance is getting just 3-5 requests per second, it works just fine. I believe that 30 hosts is too many to process this load, that the resources are underutilized, and I am trying to find the real bottleneck. If I set the instance count to 10, everything crashes and every request starts to return the error above. Resource utilization metrics for the high-load case with 30 instances enabled:
The service plan CPU usage is low, about 10-15% for each host
The service plan memory usage is around 30-40%
Dependencies respond quickly, 50-200 ms
Azure SQL DTU usage is about 5%
I discovered this useful article on current tier limits, and after running the Azure TCP connections diagnostics I identified a few possible issues:
High outbound TCP connections
High TCP Socket handle count - High TCP Socket handle count was detected on the instance .... During this period, the process dotnet.exe of site ... with ProcessId 8144 had the maximum open handle count of 17004.
So I dug deeper and found the following information:
Per my service plan tier, my TCP connection limit should be 8064, which is far from the numbers displayed above. Next I checked the socket state.
Even though I see that the number of active TCP connections is below the limit, I'm wondering if the open socket handle count could be an issue here. What can cause this socket handle leak (if any)? How can I troubleshoot and debug it?
I see that you have tried to isolate the possible cause of the error; just highlighting some of the reasons to revalidate/remediate:
1. On Azure App Service, connection attempts to local addresses (e.g. localhost, 127.0.0.1) and the machine's own IP will fail, unless another process in the same sandbox has created a listening socket on the destination port. Rejected connection attempts normally return the socket "forbidden" error quoted above.
For a peered VNet or on-premises network, kindly ensure that the IP address used is in the ranges listed for routing to the VNet; otherwise traffic is routed incorrectly.
2. On Azure App Service, the outbound TCP connections on the VM instance may be exhausted; limits are enforced on the maximum number of outbound connections that can be made for each VM instance.
Other causes, as highlighted in this blog:
Using client libraries which are not implemented to re-use TCP connections.
Application code or a client library that is leaking TCP socket handles.
A burst load of requests opening too many TCP socket connections at once.
With a higher-level protocol such as HTTP, this is encountered when the Keep-Alive option is not leveraged (see the sketch below).
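As an illustration of that last point (a sketch in Node.js/TypeScript rather than .NET; the same principle applies to HttpClient reuse and connection pooling there), reusing one keep-alive agent across requests keeps the socket handle count bounded instead of opening a fresh connection per request:

```typescript
import * as http from "http";

// One shared agent maintains a bounded pool of reusable TCP connections.
// Creating a new agent or client per request opens a new socket each time
// and can exhaust the instance's outbound connection limit under load.
const agent = new http.Agent({ keepAlive: true, maxSockets: 50 });

function fetchStatus(path: string): Promise<number> {
  return new Promise((resolve, reject) => {
    const req = http.request(
      { host: "backend.example.com", path, agent }, // hypothetical dependency
      (res) => {
        res.resume(); // drain the body so the socket returns to the pool
        resolve(res.statusCode ?? 0);
      }
    );
    req.on("error", reject);
    req.end();
  });
}
```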
I'm unsure whether you have already tried App Service Diagnostics to fetch more details; kindly give that a shot:
In the Azure portal, open the app in App Services.
Navigate to the Diagnose and solve problems blade.
Select "TCP Connections".
Consider optimizing the application to implement connection pooling for your .NET code, and observe the behavior locally. If feasible, restart the WebApp and then check to see if that helps.
If the issue still persists, kindly file a support ticket for a detailed/deeper investigation of the backend logs.

Cassandra DB connection Socket Exception: tried 127.0.0.1:49984

I am new to Cassandra. I installed Cassandra on a cloud server and it's up and running.
I downloaded "NoSql Manager" to connect to the Cassandra DB. While trying to connect, it gives the error below.
All hosts tried for query failed (tried 127.0.0.1:49984: SocketException 'A request to send or receive data was disallowed because the socket is not connected and (when sending on a datagram socket using a sendto call) no address was supplied') Details: A request to send or receive data was disallowed because the socket is not connected and (when sending on a datagram socket using a sendto call) no address was supplied
How do I connect?
Is your Cassandra node running on the same node as NoSQL Manager? If not, then you should supply the actual (external) IP address, and not 127.0.0.1.
Also, Apache Cassandra by default accepts client connections on port 9042. I'm not sure where 49984 is coming from, but try changing that to 9042, instead.
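Independent of NoSql Manager, a quick way to sanity-check the address and port is a minimal connection from code. A sketch using the DataStax Node.js driver (the IP address and data center name are placeholders for your deployment):

```typescript
import { Client } from "cassandra-driver";

// Use the node's external IP and the default CQL port 9042,
// not 127.0.0.1 and not an arbitrary local port like 49984.
const client = new Client({
  contactPoints: ["203.0.113.10:9042"], // hypothetical external address
  localDataCenter: "datacenter1",       // must match the node's configured DC
});

client
  .connect()
  .then(() => console.log("connected"))
  .catch((err) => console.error("connection failed:", err.message))
  .then(() => client.shutdown());
```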

Unable to connect to azure from a specific server

I have an Azure Service Bus queue which my application can't connect to. On my PC it works fine, and on our dev server it also works fine. We have deployed it to our test box and we are getting this error when trying to receive messages from the queue:
Microsoft.ServiceBus.Messaging.MessagingCommunicationException: Could
not connect to net.tcp://jeportal.servicebus.windows.net:9354/. The
connection attempt lasted for a time span of 00:00:14.9062482. TCP
error code 10060: A connection attempt failed because the connected
party did not properly respond after a period of time, or established
connection failed because connected host has failed to respond
168.62.48.238:9354. ---> System.ServiceModel.EndpointNotFoundException: Could not connect to
net.tcp://jeportal.servicebus.windows.net:9354/. The connection
attempt lasted for a time span of 00:00:14.9062482. TCP error code
10060: A connection attempt failed because the connected party did
not properly respond after a period of time, or established
connection failed because connected host has failed to respond
168.62.48.238:9354. ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly
respond after a period of time, or established connection failed
because connected host has failed to respond
168.62.48.238:9354
We have disabled the firewall and it still doesn't work. Any suggestions on troubleshooting?
If this is related to a firewall setting, you may want to try setting the connectivity mode to Http. More details at:
http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.connectivitysettings.mode.aspx
and:
http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.connectivitymode.aspx
Try increasing the timeouts on your bindings to 1 minute, and add your server application as an exception in Windows Firewall manually.
So this ended up being a simple issue of our network firewall being restricted. We had told our SAs to open up port 9354 going to the Service Bus. They said they did open it... but they didn't. I walked through it with them and we discovered it wasn't open.
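For anyone hitting the same symptom: before involving network admins, a plain TCP probe from the affected box (telnet, PowerShell's Test-NetConnection, or a small sketch like the one below, which reuses the namespace from the question) shows immediately whether port 9354 is reachable:

```typescript
import * as net from "net";

// Attempt a raw TCP connection to the Service Bus net.tcp port.
// A successful connect means the firewall path is open; a timeout or an
// error like 10060 means something on the way is blocking port 9354.
const socket = net.connect({ host: "jeportal.servicebus.windows.net", port: 9354 });
socket.setTimeout(15000);

socket.on("connect", () => {
  console.log("port 9354 is reachable");
  socket.end();
});
socket.on("timeout", () => {
  console.error("timed out: port 9354 is likely blocked");
  socket.destroy();
});
socket.on("error", (err) => console.error("connection failed:", err.message));
```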

Concurrent connections on Azure Worker Role

I would like to know the maximum number of concurrent connections allowed for an Azure worker role listening on port 21, having one instance.
The scenario is:
I have one instance of the worker role, with port 21 open and a TcpListener listening on port 21 for incoming connections. I have implemented it async so that I can listen for other client requests while processing one.
On the Azure side, I would like to know if there is any limitation on the number of concurrent requests on port 21 of a worker role.
Regards,
Vivek
I think you should first understand the networking principles. Only one process can listen on a particular port and protocol on a network interface at a time, so there are no concurrent listeners. How then is the communication handled? In a very simple outline the situation looks as follows (port numbers and IP addresses are intentionally fake):
There is a listener on port 21 on IP address 902.168.13.24 (and no other process can listen on that port).
A connection request comes from some other host to that IP address and that port.
After a handshake at the protocol level, the completed connection is queued on the listening socket until the server accepts it.
Accepting it produces a new, separate connected socket; the connection is identified by the four-tuple of client IP, client ephemeral port (e.g. 43251), server IP, and server port 21.
The listening socket on port 21 immediately continues accepting further connections, so many connections can be open on port 21 at once, as long as they come from distinct client address/port pairs.
All that happens at a very low protocol level, and a developer (unless developing a network adapter driver) does not see or care about these details.
Now the real question is how to accept more concurrent connections. If you are developing your own TCP server, you set the maximum number of allowed connections and manage them on your own (by managing threads). If you use some third-party server, it should have a configurable option for the maximum number of allowed concurrent connections.
You can check out a simple implementation of a TCP server here. As you can see, there is a private field _maxConnections, which is used to handle the connections.
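The same idea in a self-contained form, using Node's built-in net server as a stand-in for the linked sample (a sketch; the limit of 100 is arbitrary):

```typescript
import * as net from "net";

const server = net.createServer((conn) => {
  // Each accepted connection is its own socket; the listener on port 21
  // keeps accepting new clients while existing connections are served.
  conn.on("data", (chunk) => conn.write(chunk)); // trivial echo handler
});

// Cap the number of concurrently open connections; clients beyond the
// cap are refused until the count drops below the limit again.
server.maxConnections = 100;

server.listen(21, () => console.log("listening on port 21"));
```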
In summary, the maximum number of concurrent connections to a specific resource (socket) depends on the server that serves that resource. For example, the default maximum concurrent connections limit for IIS 8 is 4294967295.
The limit is 500K per VM or role instance: https://learn.microsoft.com/en-us/azure/azure-subscription-service-limits#networking-limits-1
