Socket files not closed by Node process - node.js

I am facing a strange issue with our Node service.
A large number of socket file descriptors remain open against the Node process, and the process's open-files limit (10240 files) has been reached, so I am getting EMFILE errors.
The service has stalled: it has stopped accepting new requests and can no longer send outbound requests to other services.
Nowhere in the code am I EXPLICITLY dealing with socket connections.
The Node process is still listening on its port. We are using PM2.
Similar question for Java: https://serverfault.com/questions/153983/sockets-found-by-lsof-but-not-by-netstat
Version details:
Node version: 8.16.0
Hapi: 14.2.0
request: 2.88.2 (used to send outbound requests)
Console output of the relevant commands:
[CONSOLE ~]$ lsof -p [PID] | wc -l
10253
[CONSOLE ~]$ ulimit -a
.
.
file size (blocks, -f) unlimited
.
.
max memory size (kbytes, -m) unlimited
open files (-n) 10240
.
.
.
[CONSOLE ~]$ netstat -np | grep [PORT]
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
[CONSOLE ~]$ netstat -a -n | grep [PORT]
tcp 0 0 0.0.0.0:[PORT] 0.0.0.0:* LISTEN
[CONSOLE ~]$ lsof -i :[PORT]
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
node [PID] glpd 15u IPv4 2270823542 0t0 TCP *:[PORT] (LISTEN)
[CONSOLE ~]$ lsof -p [PID]
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
.
.
.
node [PID] glpd 15u IPv4 2270823542 0t0 TCP *:[PORT] (LISTEN)
node [PID] glpd 16u sock 0,7 0t0 2271652082 protocol: TCP
node [PID] glpd 17u sock 0,7 0t0 2271551621 protocol: TCP
node [PID] glpd 18u sock 0,7 0t0 2271663118 protocol: TCP
node [PID] glpd 19u sock 0,7 0t0 2271662963 protocol: TCP
node [PID] glpd 20u sock 0,7 0t0 2271660595 protocol: TCP
node [PID] glpd 21u sock 0,7 0t0 2271652144 protocol: TCP
node [PID] glpd 22u sock 0,7 0t0 2271660631 protocol: TCP
node [PID] glpd 23u sock 0,7 0t0 2271662997 protocol: TCP
node [PID] glpd 24u sock 0,7 0t0 2271660660 protocol: TCP
node [PID] glpd 25u sock 0,7 0t0 2271663083 protocol: TCP
.
.
.
Has anyone come across this in Node?
EDIT:
Socket timeouts for all incoming requests to this service are disabled (set to false), since this is our main processing service and we cannot predict how long a request will take to process.
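For illustration, if the leak turns out to be on the outbound side, a request client with an explicit timeout and a capped keep-alive agent would look roughly like the sketch below (the limits and the callDownstream helper are placeholders, not our actual code):

// Sketch only: cap and time out outbound sockets made with `request`.
const http = require('http');
const request = require('request');

// One shared agent so sockets are reused instead of opened per call.
const agent = new http.Agent({
  keepAlive: true,
  maxSockets: 100,    // cap concurrent sockets
  maxFreeSockets: 10  // cap idle keep-alive sockets
});

function callDownstream(url, cb) {
  request({
    url: url,
    agent: agent,
    timeout: 30000  // fail the call instead of holding the socket forever
  }, function (err, res, body) {
    cb(err, body);
  });
}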

I had the same problem. In my case nodemailer did not close the connection after transport.close(). Updating to the latest nodemailer version solved the problem.
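For anyone hitting the same thing, the pattern in question looks roughly like this (host, port and credentials are placeholders); on the buggy versions the pooled SMTP connections could stay open even after close():

const nodemailer = require('nodemailer');

// Pooled transport: connections are reused across messages.
const transporter = nodemailer.createTransport({
  pool: true,
  host: 'smtp.example.com',
  port: 587,
  auth: { user: 'user', pass: 'pass' }
});

transporter.sendMail({ from: 'a@example.com', to: 'b@example.com', text: 'hi' }, function (err) {
  // On the affected versions, the pooled connections (and their fds)
  // survived this call; updating nodemailer fixed it.
  transporter.close();
});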

Related

UDP bind to 0.0.0.0 seems to be lost after a while

I'm binding a UDP socket to INADDR_ANY (0.0.0.0) with a port. The bind succeeds, but for some reason the binding seems to be lost after some unknown amount of time.
I noticed this by running lsof -i 4 to check the open network fds, and saw that the UDP binding disappeared after some time.
The bound port is "mdns", i.e. 5353.
$ lsof -i 4
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
foofoo 3642 pi 34u IPv4 1903802 0t0 UDP *:mdns
foofoo 3642 pi 35u IPv4 1907783 0t0 UDP *:47531
After a while:
$ lsof -i 4
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
foofoo 3642 pi 35u IPv4 1907783 0t0 UDP *:47531
AFAIK, the code did not close the socket for the "mdns" binding. Is there any case in which this could happen?
Thanks.
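If the process happens to be a Node one, a quick way to see whether your own side is closing the socket is to log its lifecycle events (a sketch using Node's dgram module; reuseAddr because port 5353 is usually shared with an mDNS daemon):

const dgram = require('dgram');

const sock = dgram.createSocket({ type: 'udp4', reuseAddr: true });
sock.on('listening', function () {
  console.log('bound to', sock.address());
});
sock.on('error', function (err) {
  console.error('socket error:', err); // e.g. EADDRINUSE at bind time
});
sock.on('close', function () {
  console.error('socket closed at', new Date()); // fires if anything closes it
});
sock.bind(5353);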

Unable to kill process Ubuntu 14.04

I am trying to kill processes on port 80. Here are the processes running on port 80:
lsof -i tcp:80
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 6233 root 13u IPv4 4216925 0t0 TCP *:http (LISTEN)
nginx 6235 opscode 13u IPv4 4216925 0t0 TCP *:http (LISTEN)
I have tried killing the processes using kill -9 <PID>, but they come back with a different PID. How can I kill the processes forcefully?
Your question is better suited to serverfault.com or askubuntu.com.
But I think your problem is that you have the nginx daemon started: killing individual worker processes just makes the master respawn them.
You can stop it with either systemctl stop nginx if you are using systemd, or service nginx stop if you are using System V init.
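If nginx was started manually rather than by the init system, signal the master process (PID 6233 above, the one owned by root) instead of the workers:
$ sudo kill -QUIT 6233   # graceful shutdown: the master stops its workers and exits
$ sudo nginx -s stop     # fast shutdown via nginx's own signal helper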

SSH server - Get pid of sshd process forwarding port #N

I'm running a server (Ubuntu Server 14.04) which allows clients to make an SSH tunnel from their device (a Raspberry Pi) so they can access their web server from the internet (as a means of traversing NATs). I can get a list of processes owned by the user (which is the same for all the devices) using ps -u username (this user only runs sshd to forward ports), but I can't filter those processes by the port they're forwarding. So the question is: how can I get the PID of the sshd that is forwarding port #N?
You can make use of the lsof command, since everything is a file on Linux.
Something like lsof -Pan -i | grep :PORT will get you what you ask for. It produces output like this when I run it for port 80 on my machine:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 1104 root 6u IPv4 23348 0t0 TCP *:80 (LISTEN)
nginx 1105 www-data 6u IPv4 23348 0t0 TCP *:80 (LISTEN)
nginx 1106 www-data 6u IPv4 23348 0t0 TCP *:80 (LISTEN)
nginx 1107 www-data 6u IPv4 23348 0t0 TCP *:80 (LISTEN)
nginx 1108 www-data 6u IPv4 23348 0t0 TCP *:80 (LISTEN)
More on lsof can be found in its man page (man lsof).
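For scripting, lsof's -t flag prints only the PIDs, so the grep can be dropped entirely (8080 is a placeholder port and 12345 a placeholder PID):
$ lsof -Pan -iTCP:8080 -sTCP:LISTEN -t
12345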

Why does netstat report a smaller number of open ports than lsof

I have Storm running on 2 machines.
Each machine runs a nimbus process (a fancy name for the master process) and worker processes.
I wanted to see the communication between them: what ports are open and how they connect to each other.
$ netstat -tulpn | grep -w 10669
tcp 0 0 :::6700 :::* LISTEN 10669/java
udp 0 0 :::42405 :::* 10669/java
$ lsof -i :6700
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 10669 storm 25u IPv6 57830 0t0 TCP host1:50778->host2:6700 (ESTABLISHED)
java 10669 storm 26u IPv6 57831 0t0 TCP host1:6700->host2:57339 (ESTABLISHED)
java 10669 storm 29u IPv6 57843 0t0 TCP host1:6700->host1:50847 (ESTABLISHED)
java 10669 storm 53u IPv6 57811 0t0 TCP *:6700 (LISTEN)
java 10681 storm 53u IPv6 57841 0t0 TCP host1:50780->host2:6700 (ESTABLISHED)
java 10681 storm 54u IPv6 57842 0t0 TCP host1:50847->host1:6700 (ESTABLISHED)
What I don't understand from the above output is why netstat does not show port 50778 being open in the process with PID 10669, whereas lsof clearly shows that the same process has an established connection host1:50778->host2:6700.
netstat -l limits the results to listening sockets, and prevents the display of sockets in other states.
Try this instead:
netstat -anp | egrep :6700
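With -a in place of -l, the ESTABLISHED sockets from the lsof listing above should show up as well, roughly like this (addresses will be numeric under -n; the symbolic names from the lsof output are kept here for readability):
$ netstat -anp | egrep :6700
tcp6 0 0 host1:50778 host2:6700 ESTABLISHED 10669/java
tcp6 0 0 host1:6700 host1:50847 ESTABLISHED 10669/java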

weird situation for nanomsg and linux

It's very strange.
I wrote a message-distribution server on top of nanomsg.
But after some time, when I restart the server, it fails because the listening port is already in use.
Here is the situation:
[root@vsmHost12 src]# lsof -n -i:3333
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
hsmcs 105013 root 20u IPv4 24845821 0t0 TCP *:dec-notes (LISTEN)
hsmcs 105013 root 66u IPv4 25366582 0t0 TCP 192.168.167.1:dec-notes->192.168.167.1:47826 (ESTABLISHED)
java 111946 root 20u IPv4 24845821 0t0 TCP *:dec-notes (LISTEN)
java 111946 root 34u IPv6 25366581 0t0 TCP 192.168.167.1:47826->192.168.167.1:dec-notes (ESTABLISHED)
It's not specific to Java; other daemons can show the same problem.
Look at the FD number and DEVICE number for hsmcs and java: they are the SAME!
Can anyone explain it?
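One plausible explanation: an identical FD number and DEVICE number in two processes means they share the same open file description, which typically happens when one process inherits the listening socket from the other across fork/exec without close-on-exec being set. A quick way to check the parent/child relationship for the PIDs above:
$ ps -o pid,ppid,comm -p 105013,111946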
