I'm running a node/express app and it keeps dying randomly after a few hours.
The process is always still up, with nothing in the logs and no CPU/memory spikes (or even much utilization), but it doesn't serve any requests anymore.
I suspect the process has too many open UDP sockets: lsof -i -a -p X | wc -l counted 9k+ network connections when I ssh'd into the Docker container running the node process, 98% of them UDP, like this:
node 6 root 223u IPv4 614173 0t0 UDP *:63025
node 6 root 224u IPv4 324249 0t0 UDP *:34622
node 6 root 225u IPv4 415898 0t0 UDP *:44176
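To count only the UDP sockets and watch them grow over time, something along these lines should work (just a sketch, with X being the node PID as above):
lsof -nP -i UDP -a -p X | wc -l
watch -n 60 'lsof -nP -i UDP -a -p X | wc -l'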
The number of connections grows at a rate of exactly 10 new connections per minute.
The only UDP-related functionality in my app is https://github.com/sazze/winston-logstash-udp
Details:
Node v6.12.3 in Docker on an AWS EC2 t2.medium
This behavior started happening after migrating from a debian:wheezy Docker base image to node:6.14.2-alpine.
Questions:
How can I further debug each UDP connection, e.g. its target, duration, ...? That would help find the underlying problem; see the commands sketched after this list.
What is Node's connection limit? I've read 4096.
This problem didn't occur under Debian; what differences are there that might be related?
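As a starting sketch for the first question (assuming a Linux container; X is the node PID as above), these commands should reveal more about each socket:
ss -u -a -n -p
lsof -nP -i UDP -a -p X
Unconnected UDP sockets won't show a peer address in either of those, so to see the actual targets you would have to watch the traffic itself, e.g.:
strace -f -e trace=network -p X
tcpdump -n -i any udp and not port 53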
Related
Node Exporter is always running on my local machine on localhost:9100 even if I haven't started it from a terminal, as shown by this error message:
FATA[0000] listen tcp :9100: bind: address already in use source="node_exporter.go:172"
From this I understand that the port number is already being used by another application, but the thing is I don't have anything hosted there.
This is what netstat | grep 9100 gives:
tcp 0 0 localhost:60232 localhost:9100 ESTABLISHED
tcp6 0 0 localhost:9100 localhost:60232 ESTABLISHED
All I had to do was "kill" whatever was using port 9100, where Node Exporter was running, using fuser -k 9100/tcp, as shown in How to kill a process running on a particular port in Linux?.
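If you want to confirm what is holding a port before killing it, something like this should show the owning process (a sketch; output varies):
sudo fuser -v 9100/tcp
sudo lsof -nP -i tcp:9100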
I have created a queue manager using these commands on a Linux machine:
crtmqm MQ1
strmqm MQ1
runmqsc MQ1
The queue manager is created successfully.
I wanted to know which port the queue manager MQ1 is running on. I tried all the ways I know, including netstat -au and the ps -ef command. It looks like it is running on a different port. I am unable to find the correct port number where it is running; could anyone help?
By default a new IBM MQ queue manager will not have a listener running on any port.
There is one default LISTENER object on a new queue manager which looks like this:
$echo "dis listener(SYSTEM.DEFAULT.LISTENER.TCP)"|runmqsc MQ1
....
1 : dis listener(SYSTEM.DEFAULT.LISTENER.TCP)
AMQ8630: Display listener information details.
LISTENER(SYSTEM.DEFAULT.LISTENER.TCP) CONTROL(MANUAL)
TRPTYPE(TCP) PORT(0)
IPADDR( ) BACKLOG(0)
DESCR( ) ALTDATE(yyyy-mm-dd)
ALTTIME(hh.mm.ss)
If you were to start this LISTENER, PORT(0) means it will start on the default port, which is 1414.
Best practice is not to use SYSTEM objects, but to create a new object such as:
DEFINE LISTENER(LISTENER.1414.TCP) TRPTYPE(TCP) PORT(1414) CONTROL(QMGR)
The CONTROL(QMGR) tells the queue manager to start the listener when the queue manager is started and stop it when the queue manager is ended.
You can manually start and stop the above listener with the commands:
START LISTENER(LISTENER.1414.TCP)
STOP LISTENER(LISTENER.1414.TCP)
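For example, the same definitions can be issued non-interactively by piping them into runmqsc, mirroring the dis listener example above:
echo "DEFINE LISTENER(LISTENER.1414.TCP) TRPTYPE(TCP) PORT(1414) CONTROL(QMGR)" | runmqsc MQ1
echo "START LISTENER(LISTENER.1414.TCP)" | runmqsc MQ1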
Use netstat as root with the -p option:
sudo netstat -nltp
[sudo] password for root:
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 1362/dnsmasq
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1580/sshd
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 1480/cupsd
The last column gives the PID and 'Program name'. If you are running the queue manager as your own user, you don't need sudo.
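Once a listener is actually running, the MQ listener process (typically runmqlsr) should show up in that output, so a filter like this is one way to spot it (a sketch):
sudo netstat -nltp | grep runmqlsr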
My server setup is nginx connecting directly to a node.js server (nginx and node.js are on the same node, and nginx forwards requests to node at 127.0.0.1:8000). The symptom is that sometimes there are 504s in the nginx log, and the node.js log doesn't show any sign of ever having received the request.
I then enabled TCP logging using iptables, logging all TCP packets to port 8000. After checking the TCP log, it seems that nginx was trying to establish a TCP connection with the node.js server but never succeeded: it just kept retransmitting SYN packets until the request was timed out by nginx. Here's an example (tcp + nginx log; an approximation of the logging rule is sketched after it):
13:44:44 sp:48103 dp:8000 SYN
13:44:45 sp:48103 dp:8000 SYN
13:44:47 sp:48103 dp:8000 SYN
13:44:51 sp:48103 dp:8000 SYN
13:44:59 sp:48103 dp:8000 SYN
13:45:15 sp:48103 dp:8000 SYN
13:45:44 nginx 504
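For reference, a rule along these lines would produce that kind of log (the exact rule used here is an assumption):
iptables -I OUTPUT -o lo -p tcp --dport 8000 --tcp-flags SYN,ACK SYN -j LOG --log-prefix "tcp8000: "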
During that period, the CPU load was light, memory usage was under 50%, and incoming requests were fewer than 50 per minute. Other requests were processed normally.
Server is Ubuntu 14.04.2 LTS
Any idea what's going on? It seems like an OS-level issue. Thank you in advance.
Check whether something is actually running on TCP port 8000 on your loopback interface. Try some commands like:
lsof -P -n -i tcp:8000
fuser 8000/tcp
ss -4lnt
netstat -4lnt
These should give you some hints on whether anything is listening at all, or whether it is only listening on a specific interface/address and not your loopback.
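If something is listening, it is also worth checking whether its listen backlog is overflowing, since a full accept queue produces exactly this pattern of unanswered SYNs. These are hedged suggestions, not part of the original list:
ss -lnt '( sport = :8000 )'
netstat -s | grep -i -e 'listen queue' -e 'SYNs to LISTEN'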
For my own sanity, does anyone know why rpcbind (Linux) opens a seemingly random port every time it's restarted? I know it uses port 111, but what is this other port that keeps opening up with it? Thanks.
[root@testmachine ~]# nmap -sU -p 0-65535 127.0.0.1
Starting Nmap 5.51 ( http://nmap.org ) at 2016-03-03 16:00 EST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.0000080s latency).
Not shown: 65533 closed ports
PORT STATE SERVICE
111/udp open|filtered rpcbind
819/udp open|filtered unknown
Nmap done: 1 IP address (1 host up) scanned in 3.11 seconds
[root@testmachine ~]# service rpcbind restart
Stopping rpcbind: [ OK ]
Starting rpcbind: [ OK ]
[root@testmachine ~]# nmap -sU -p 0-65535 127.0.0.1
Starting Nmap 5.51 ( http://nmap.org ) at 2016-03-03 16:00 EST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.0000080s latency).
Not shown: 65533 closed ports
PORT STATE SERVICE
111/udp open|filtered rpcbind
846/udp open|filtered unknown
Nmap done: 1 IP address (1 host up) scanned in 2.97 seconds
[root@testmachine ~]# service rpcbind restart
Stopping rpcbind: [ OK ]
Starting rpcbind: [ OK ]
[root@testmachine ~]# nmap -sU -p 0-65535 127.0.0.1
Starting Nmap 5.51 ( http://nmap.org ) at 2016-03-03 16:05 EST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.0000070s latency).
Not shown: 65533 closed ports
PORT STATE SERVICE
111/udp open|filtered rpcbind
892/udp open|filtered unknown
Nmap done: 1 IP address (1 host up) scanned in 2.86 seconds
More than likely, it's an RPC service. Try the rpcinfo command to see what it is.
Unlike most other network services (FTP, HTTP, SMTP, etc.), RPC services are bound to dynamic ports. Instead of connecting directly to the server, an RPC client first sends a request to the RPC port mapper (UDP/111 by default) to find out what port the server is on (a similar mechanism is used on Windows).
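For example (output will vary), you can list the registered RPC programs and filter for the mystery port from the scans above:
rpcinfo -p 127.0.0.1
rpcinfo -p | grep 819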
On a related note, nmap is great, but there are much easier ways to learn about the listening ports on your computer. Try this instead: sudo netstat -anp | grep LISTEN. It's much faster and will even give you the process name and number.
Also, nmap 5.51 is about five years old now. If you use it often, it's worth upgrading to get some new features.
The Debian man page for rpcbind tells me that:
All RPC servers must be restarted if rpcbind is restarted.
The OP didn't mention that they'd done that, so how would any of the RPC services have reregistered? Imagine my surprise, then, on repeating the OP's experiment and applying the rpcinfo -p suggestion from @SArcher to see that all the RPC services were still registered... and on their original ports, suggesting that @SArcher wasn't quite on the money.
If, however, we also apply the other great suggestion from @SArcher, namely to sudo netstat -anp, we get something more interesting. Now we can't |grep LISTEN as suggested because the OP's post says udp and UDP sockets are never in state LISTEN. What we do find is that rpcbind doesn't just have sockets on port 111 - its job - but also another "reserved" port picked seemingly at random when rpcbind starts, just as the OP says.
So "what is this other port for?" you ask. Sorry to tease but I just answered that in my description of:
Debian bug 870579: rpcbind callit replies from a random reserved udp port, making firewalling hard
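Concretely, something like this shows both the port 111 sockets and the extra reserved port (a sketch):
sudo netstat -anup | grep rpcbind
sudo ss -u -a -n -p | grep rpcbind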
On Red Hat there is a separate unit called rpcbind.socket, which gets started along with rpcbind.service. rpcbind.service first checks whether port 111 is available; if it is not available, it chooses another port and starts listening on that port.
On Red Hat, rpcbind.socket is started first and it takes port 111. In netstat, port 111 will be shown as used by systemd. When rpcbind is started it finds that port 111 is already used by systemd, and hence it chooses a different port. If you mask rpcbind.socket and then start rpcbind.service, rpcbind will start listening on port 111.
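To see which unit actually owns port 111 on such a system, something like this should confirm it (a sketch):
systemctl status rpcbind.socket rpcbind.service
sudo ss -ulnp '( sport = :111 )'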
I launched a simulator program, developed in C++, on my Ubuntu 11 machine. When I kill this process from the Linux process list and want to run it again, I face this error:
Error initializing sockets: port=6000. Address already in use
I used the lsof command to find the PID of the process:
saman@jack:~$ lsof -i:6000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rcssserve 8764 saman 3u IPv4 81762 0t0 UDP *:x11
After that I tried to kill PID 8764, but I still get the error.
How can I fix it?
I think the problem you are having is that if the socket is not shut down correctly, it is still reserved and waiting for a timeout before being closed by the kernel.
Try doing a netstat -nutap and see if there's a line like this:
tcp 0 0 AAA.AAA.AAA.AAA:6000 XXX.XXX.XXX.XXX:YYYY TIME_WAIT -
If that's the case, you just have to wait until the kernel drops it (approx. 30 secs) before you can open the socket on 6000 without conflict.
It would seem that port 6000 is used by the X windowing system (the GUI part of Linux) and is probably just restarted when you kill the process... Either you'll need to run the simulation without X windows running, or tweak the code to use a different port.
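To confirm whether it really is X (rather than a leftover instance of the simulator) holding port 6000, something like this should tell you (a sketch):
sudo lsof -nP -i :6000
sudo ss -anp '( sport = :6000 )'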