Can't open more than 1023 sockets - linux

I'm developing some code that is simulating network equipment. I need to run several thousand simulated "agents", and each needs to connect to a service. The problem is that after opening 1023 connections, the connects start to time out, and the whole thing comes crashing down.
The main code is in Go, but I've written a very trivial python script which reproduces the problem.
The one thing that is somewhat unusual is that we need to set the local address on the socket when we create it. This is because the equipment that the agents are connecting to expects the apparent IP to match what we say it should be. To achieve this, I have configured 10,000 virtual interfaces (eth0:1 to eth0:10000). These are assigned unique IP addresses in a private network.
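(Not essential to the question, but to make the setup concrete: a hedged sketch of how such an alias block can be created, assuming the eth0:N naming and the 1.b.1.d addressing described above; requires root.)

import subprocess

# Hypothetical sketch: bring up one alias per simulated agent.
# Interface names and addresses mirror the scheme in the question.
i = 0
for b in range(10, 30):
    for d in range(1, 100):
        i += 1
        subprocess.run(["ifconfig", "eth0:%d" % i,
                        "1.%d.1.%d" % (b, d), "up"], check=True)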
The python script is just this (it only runs up to ~2000 connects):
import socket

i = 0
for b in range(10, 30):
    for d in range(1, 100):
        i += 1
        ip = "1.%d.1.%d" % (b, d)
        print("Conn %i %s" % (i, ip))
        s = socket.create_connection(("1.6.1.1", 5060), 10, (ip, 5060))
If I remove the last argument to socket.create_connection (the source address), then I can get all 2000 connections.
The thing that is different with using a local address is that a bind must be made before the connection can be set up, so the output from this program running under strace looks like this:
Conn 1023 1.20.1.33
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(5060), sin_addr=inet_addr("1.20.1.33")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(5060), sin_addr=inet_addr("1.6.1.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
If I run without a local address, the AF_INET bind goes away, and it works.
So, it seems there must be some kind of limit on the number of binds that can be made. I've waded through all sorts of links about TCP tuning on Linux; I've tried messing with tcp_tw_reuse/tcp_tw_recycle, reduced the fin_timeout, and done other things that I can't remember.
This is running on Ubuntu Linux 11.04, kernel 2.6.38, 64-bit. It's a virtual machine on a VMware ESX cluster.
Just before posting this, I tried running a second instance of the python script, with the addresses starting at 1.30.1.1. The first script plowed through to 1023 connections, but the second one couldn't even get its first one done, indicating that the problem is related to the large number of virtual interfaces. Could some internal data structure be limited? Some max memory setting somewhere?
Can anyone think of some limit in Linux that would cause this?
Update:
This morning I decided to try an experiment. I modified the python script to use the "main" interface IP as the source IP, and ephemeral ports in the range 10000+. The script now looks like this:
import socket

for i in range(1, 2000):
    print("Conn %i" % i)
    s = socket.create_connection(("1.6.1.1", 5060), 10, ("1.1.1.30", i + 10000))
This script works just fine, so this adds to my belief that the problem is related to the large number of aliased IP addresses.

What a DOH moment. I was watching the server using netstat, and since I didn't see a large number of connections I didn't think there was a problem. But finally I wised up and checked the kernel log (/var/log/kern.log), in which I found this:
Mar 8 11:03:52 TestServer01 kernel: ipv4: Neighbour table overflow.
This led me to this posting: http://www.serveradminblog.com/2011/02/neighbour-table-overflow-sysctl-conf-tunning/ which explains how to increase the limit. Bumping the gc_thresh3 value immediately solved the problem.
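(For reference, the thresholds live under net.ipv4.neigh.default; a minimal sketch to inspect them before tuning, assuming the standard Linux /proc paths:)

# Read the ARP/neighbour-cache thresholds that were overflowing.
# gc_thresh3 is the hard maximum; raising it (e.g. sysctl -w
# net.ipv4.neigh.default.gc_thresh3=8192, value illustrative) is
# what fixed the problem here.
base = "/proc/sys/net/ipv4/neigh/default/"
for name in ("gc_thresh1", "gc_thresh2", "gc_thresh3"):
    with open(base + name) as f:
        print(name, "=", f.read().strip())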

You may want to look at sysctl settings related to net.ipv4. These include things like nf_conntrack_max and other relevant settings you may wish to tweak.

Are you absolutely certain that the issue is not on the server side, with connections not closing the sockets? i.e. what does lsof -n -p of the server process show? What does plimit -p of the server process show? The server side could be tied up, unable to accept any more connections, while the client side is getting the EINPROGRESS result.
Check the ulimit for the number of open files on both sides of the connection - 1024 is too close to a ulimit level to be a coincidence.
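If you'd rather check the limit from code than from the shell, a minimal sketch using only the Python standard library:

import resource

# The soft limit is what the process actually hits; a default of 1024
# would line up with connects failing around #1023.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft =", soft, "hard =", hard)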

Related

Need to measure pairwise (IP based) bandwidth used over time when using TC

I need to measure the datarates of packets between multiple servers. I need pairwise bandwidths between the servers (if possible even the ports), not the overall datarate per interface on each server.
Example output:

Timestamp | Server A to B | Server B to A | Server A to C | Server C to A
0         | 1             | 2             | 1             | 5
1         | 5             | 3             | 7             | 1
What I tried or thought of
tcpdump - I was capturing all the packets and looking at ip.len for getting the datarates. It worked quite well till I started testing along with TC.
Turns out tcpdump captures packets at a lower layer than TC, so the bandwidths I measure this way don't reflect the limits set by TC.
netstat - I tried using this by grepping the output and looking at the Recv-Q and Send-Q columns. But later I found out that these report the bytes that have been received and are buffered, waiting for the local process that is using the connection to read and consume them, so I can't use them to measure the bandwidth being used.
iftop - Amazing GUI and has all the things I need, but there is no good way to get output that can be processed. It might also overwhelm storage because of the amount of extra text it stores alongside the numbers.
bwm-ng - Gives overall datarate per interface on each server but not pairwise.
Please let me know if there are any other ways to achieve what I need.
Thanks in advance for your help.

python nmap: what to do to speed up the process

I am working on an idea to check the port status of websites using the nmap library for Python. I wrote some code, using This Link as a reference, that checks for the word 'open' in the dictionary I get back, so it can directly print the port number and status.
I need help getting results faster, as I am not able to at present. I want to check the port range 80 to 443; whenever I try that range it takes about 15 mins for one host (e.g. google.com), and I have about 4-5 host names, each with the 80 to 443 range to check.
The code image is for reference, to show what my code looks like, but I used a list for the host names and basically two for loops to drive it all: one over host names and the other over the range of port numbers.
any help is appreciated.
Thank you
Hmm, that's really slow for a scan, which would indicate to me that there's something wrong with the host you're scanning (IDS?) or a poor connection (you probably ruled this out). I believe IPs should be strings (as you correctly did in your script) but ports are numeric. Also, maybe try scanning your localhost or another local network machine, which will be much faster and won't block your IP.
Either way, some other ways to improve your speed include multi-threading (via threading, Queue and a host of other libraries) and passing the -T0 to -T5 timing flags to Nmap (-T0 is slowest, -T3 is the default, -T5 is "insane" mode). There are also IDS-evasive Nmap settings if you're set on scanning actual remote hosts.
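To illustrate both points, here is a hedged sketch using the python-nmap wrapper (the host and flags are examples only): it scans the whole range in a single nmap invocation with the -T4 timing template, instead of looping one scan() call per port.

import nmap  # python-nmap; assumes the nmap binary is installed

nm = nmap.PortScanner()
for host in ["scanme.nmap.org"]:  # example target; substitute your own list
    # One scan() call for the whole range is far faster than one per port.
    nm.scan(hosts=host, ports="80-443", arguments="-T4")
    for scanned in nm.all_hosts():
        for port, data in nm[scanned].get("tcp", {}).items():
            if data["state"] == "open":
                print(scanned, port, data["state"])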

Understanding linux DISPLAY variable

I am new to Linux and I had to set DISPLAY variable for running a Java application. Somehow I managed to do that, and I understand that display can be set using
<host>:<display>[.<screen>]
but what I am doing is <host>:1001.
Now, is this 1001 the 1001st display of this Linux machine? Are that many displays possible on one machine, or is my understanding wrong?
The DISPLAY variable is used by X11 to identify your display (and keyboard and mouse). Usually it'll be :0 on a desktop PC, referring to the primary monitor, etc.
If you're using SSH with X forwarding (ssh -X otherhost), then it'll be set to something like localhost:10.0. This tells X applications to send their output to, and receive their input from, TCP port 6010 on 127.0.0.1, which SSH will forward back to your original host.
And, yes, back in the day, when "thin client" computing meant an X terminal, it was common to have several hundred displays connected to the same host.
The DISPLAY values are usually like :0, :0.0, etc. when running under the X Window server on the same host. Big numbers like in :1001 are typical for SSH-forwarded X connections. The number is really added to 6000 to get the TCP port number; local displays start at 6000, and SSH-forwarded ones could start at 7000. (This offset differs between systems; e.g. 10 or 100 are also possible.)
Since these values are assigned dynamically, you should get the value for DISPLAY from an existing connection's environment, provided that the proper authorization token is also available (e.g. in ~/.Xauthority).
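To make the port arithmetic concrete, a small sketch (assuming the usual 6000 base mentioned above):

import os

# For DISPLAY like "localhost:10.0", the number after ":" is the display;
# 6000 + display is the usual TCP port (6010 in the SSH example above).
display = os.environ.get("DISPLAY", ":0")
number = int(display.split(":")[1].split(".")[0])
print("display", number, "-> TCP port", 6000 + number)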

Increasing the maximum number of TCP/IP connections in Linux

I am programming a server and it seems like my number of connections is being limited since my bandwidth isn't being saturated even when I've set the number of connections to "unlimited".
How can I increase or eliminate a maximum number of connections that my Ubuntu Linux box can open at a time? Does the OS limit this, or is it the router or the ISP? Or is it something else?
The maximum number of connections is impacted by certain limits on both the client and server sides, albeit a little differently.
On the client side:
Increase the ephemeral port range, and decrease the tcp_fin_timeout
To find out the default values:
sysctl net.ipv4.ip_local_port_range
sysctl net.ipv4.tcp_fin_timeout
The ephemeral port range defines the maximum number of outbound sockets a host can create from a particular IP address. The fin_timeout defines the minimum time these sockets will stay in the TIME_WAIT state (unusable after being used once).
Usual system defaults are:
net.ipv4.ip_local_port_range = 32768 61000
net.ipv4.tcp_fin_timeout = 60
This basically means your system cannot consistently guarantee more than (61000 - 32768) / 60 = 470 sockets per second. If you are not happy with that, you could begin with increasing the port_range. Setting the range to 15000 61000 is pretty common these days. You could further increase the availability by decreasing the fin_timeout. Suppose you do both, you should see over 1500 outbound connections per second, more readily.
To change the values:
sysctl net.ipv4.ip_local_port_range="15000 61000"
sysctl net.ipv4.tcp_fin_timeout=30
The above should not be interpreted as the factors limiting the system's capability for making new outbound connections per second; rather, these factors affect the system's ability to handle concurrent connections in a sustainable manner over long periods of "activity".
Default Sysctl values on a typical Linux box for tcp_tw_recycle & tcp_tw_reuse would be
net.ipv4.tcp_tw_recycle=0
net.ipv4.tcp_tw_reuse=0
These do not allow a connection from a "used" socket (in wait state) and force the sockets to last the complete time_wait cycle. I recommend setting:
sysctl net.ipv4.tcp_tw_recycle=1
sysctl net.ipv4.tcp_tw_reuse=1
This allows fast cycling of sockets in time_wait state and re-using them. But before you do this change make sure that this does not conflict with the protocols that you would use for the application that needs these sockets. Make sure to read post "Coping with the TCP TIME-WAIT" from Vincent Bernat to understand the implications. The net.ipv4.tcp_tw_recycle option is quite problematic for public-facing servers as it won’t handle connections from two different computers behind the same NAT device, which is a problem hard to detect and waiting to bite you. Note that net.ipv4.tcp_tw_recycle has been removed from Linux 4.12.
On the Server Side:
The net.core.somaxconn value has an important role. It limits the maximum number of requests queued to a listen socket. If you are sure of your server application's capability, bump it up from the default of 128 to something like 1024. Now you can take advantage of this increase by setting the listen backlog variable in your application's listen call to an equal or higher integer.
sysctl net.core.somaxconn=1024
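As a minimal illustration of pairing the two (port number arbitrary): the backlog you pass to listen() is silently capped by net.core.somaxconn, so after raising the sysctl the application must ask for a matching value.

import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 8080))
# Request a backlog matching the raised somaxconn; the kernel caps
# whatever is passed here at net.core.somaxconn.
srv.listen(1024)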
The txqueuelen parameter of your Ethernet cards also has a role to play. The default value is 1000, so bump it up to 5000 or even more if your system can handle it.
ifconfig eth0 txqueuelen 5000
echo "/sbin/ifconfig eth0 txqueuelen 5000" >> /etc/rc.local
Similarly bump up the values for net.core.netdev_max_backlog and net.ipv4.tcp_max_syn_backlog. Their default values are 1000 and 1024 respectively.
sysctl net.core.netdev_max_backlog=2000
sysctl net.ipv4.tcp_max_syn_backlog=2048
Now remember to start both your client and server side applications with increased FD ulimits in the shell.
Besides the above, one more popular technique used by programmers is to reduce the number of tcp write calls. My own preference is to use a buffer into which I push the data I wish to send to the client, and then at appropriate points I write out the buffered data to the actual socket. This technique allows me to use large data packets, reduces fragmentation, and lowers my CPU utilization both in userland and at the kernel level.
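A minimal sketch of that buffering technique (the class name and threshold are illustrative, not from any particular library):

FLUSH_THRESHOLD = 64 * 1024  # illustrative; tune to your workload

class BufferedSender:
    """Accumulate small payloads and emit them as one large write."""
    def __init__(self, sock):
        self.sock = sock
        self.buf = bytearray()

    def push(self, data):
        self.buf += data
        if len(self.buf) >= FLUSH_THRESHOLD:
            self.flush()

    def flush(self):
        if self.buf:
            self.sock.sendall(self.buf)  # one write call instead of many
            self.buf.clear()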
There are a couple of variables to set the max number of connections. Most likely, you're running out of file numbers first. Check ulimit -n. After that, there are settings in /proc, but those default to the tens of thousands.
More importantly, it sounds like you're doing something wrong. A single TCP connection ought to be able to use all of the bandwidth between two parties; if it isn't:
Check if your TCP window setting is large enough. Linux defaults are good for everything except really fast inet link (hundreds of mbps) or fast satellite links. What is your bandwidth*delay product?
Check for packet loss using ping with large packets (ping -s 1472 ...)
Check for rate limiting. On Linux, this is configured with tc
Confirm that the bandwidth you think exists actually exists using e.g., iperf
Confirm that your protocol is sane. Remember latency.
If this is a gigabit+ LAN, can you use jumbo packets? Are you?
Possibly I have misunderstood. Maybe you're doing something like Bittorrent, where you need lots of connections. If so, you need to figure out how many connections you're actually using (try netstat or lsof). If that number is substantial, you might:
Have a lot of bandwidth, e.g., 100mbps+. In this case, you may actually need to up the ulimit -n. Still, ~1000 connections (default on my system) is quite a few.
Have network problems which are slowing down your connections (e.g., packet loss)
Have something else slowing you down, e.g., IO bandwidth, especially if you're seeking. Have you checked iostat -x?
Also, if you are using a consumer-grade NAT router (Linksys, Netgear, DLink, etc.), beware that you may exceed its abilities with thousands of connections.
I hope this provides some help. You're really asking a networking question.
To improve upon the answer given by derobert:
You can determine what your OS connection limit is by catting nf_conntrack_max. For example:
cat /proc/sys/net/netfilter/nf_conntrack_max
You can use the following script to count the number of TCP connections to a given range of tcp ports. By default 1-65535.
This will confirm whether or not you are maxing out your OS connection limit.
Here's the script.
#!/bin/sh
OS=$(uname)
case "$OS" in
'SunOS')
AWK=/usr/bin/nawk
;;
'Linux')
AWK=/bin/awk
;;
'AIX')
AWK=/usr/bin/awk
;;
esac
netstat -an | $AWK -v start=1 -v end=65535 '
$NF ~ /TIME_WAIT|ESTABLISHED/ && $4 !~ /127\.0\.0\.1/ {
    if ($1 ~ /\./) { sip = $1 } else { sip = $4 }
    if (sip ~ /:/) { d = 2 } else { d = 5 }
    split(sip, a, /:|\./)
    if (a[d] >= start && a[d] <= end) {
        ++connections
    }
}
END { print connections }'
At the application level, here are some things a developer can do:
On the server side:
Check that your load balancer (if you have one) works correctly.
Turn slow TCP timeouts into fast, immediate 503 responses. If your load balancer works correctly, it should pick a working resource to serve, and that's better than hanging there with unexpected error messages.
E.g.: if you are using a Node server, you can use toobusy from npm.
An implementation looks something like:
var toobusy = require('toobusy');
app.use(function(req, res, next) {
    if (toobusy()) res.send(503, "I'm busy right now, sorry.");
    else next();
});
Why 503? Here are some good insights for overload:
http://ferd.ca/queues-don-t-fix-overload.html
We can do some work on the client side too:
Try to group calls in batches, reducing the traffic and the total number of requests between client and server.
Try to build a caching mid-layer to avoid unnecessary duplicate requests.
I'm trying to resolve this in 2022 on load balancers, and one way I found is to attach another IPv4 (or eventually IPv6) address to the NIC, so the limit is now doubled. Of course, you need to configure the second IP in the service that is trying to connect to the machine (in my case, another DNS entry).

Virtual Serial Port for Linux

I need to test a serial port application on Linux, however, my test machine only has one serial port.
Is there a way to add a virtual serial port to Linux and test my application by emulating a device through a shell or script?
Note: I cannot remap the port; it is hard-coded as ttyS2, and I need to test the application as it is written.
Complementing slonik's answer:
You can use socat to create a virtual serial port with the following procedure (tested on Ubuntu 12.04):
Open a terminal (let's call it Terminal 0) and run:
socat -d -d pty,raw,echo=0 pty,raw,echo=0
The code above returns:
2013/11/01 13:47:27 socat[2506] N PTY is /dev/pts/2
2013/11/01 13:47:27 socat[2506] N PTY is /dev/pts/3
2013/11/01 13:47:27 socat[2506] N starting data transfer loop with FDs [3,3] and [5,5]
Open another terminal and write (Terminal 1):
cat < /dev/pts/2
The port name in this command can differ from machine to machine; it depends on the previous output:
2013/11/01 13:47:27 socat[2506] N PTY is /dev/pts/**2**
2013/11/01 13:47:27 socat[2506] N PTY is /dev/pts/**3**
2013/11/01 13:47:27 socat[2506] N starting data transfer loop with FDs
You should use the numbers highlighted in the output above.
Open another terminal and write (Terminal 2):
echo "Test" > /dev/pts/3
Now back to Terminal 1 and you'll see the string "Test".
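If you'd rather drive the same test from code, a minimal sketch using pyserial (the /dev/pts numbers come from the socat output above and will differ on your machine):

import serial  # pyserial

reader = serial.Serial("/dev/pts/2", timeout=1)
writer = serial.Serial("/dev/pts/3", timeout=1)
writer.write(b"Test\n")
print(reader.readline())  # b'Test\n'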
You can use a pty ("pseudo-teletype", where a serial port is a "real teletype") for this. From one end, open /dev/ptyp5, and then attach your program to /dev/ttyp5; ttyp5 will act just like a serial port, but will send/receive everything it does via /dev/ptyp5.
If you really need it to talk to a file called /dev/ttys2, then simply move your old /dev/ttys2 out of the way and make a symlink from ptyp5 to ttys2.
Of course you can use some number other than ptyp5. Perhaps pick one with a high number to avoid duplicates, since all your login terminals will also be using ptys.
Wikipedia has more about ptys: http://en.wikipedia.org/wiki/Pseudo_terminal
Use socat for this:
For example:
socat PTY,link=/dev/ttyS10 PTY,link=/dev/ttyS11
There is also tty0tty http://sourceforge.net/projects/tty0tty/ which is a real null modem emulator for linux.
It is a simple kernel module - a small source file. I don't know why it only got thumbs down on sourceforge, but it works well for me. The best thing about it is that it also emulates the hardware pins (RTS/CTS, DSR/DTR). It even implements the TIOCMGET/TIOCMSET and TIOCMIWAIT ioctl commands!
On a recent kernel you may get compilation errors. This is easy to fix. Just insert a few lines at the top of the module/tty0tty.c source (after the includes):
#ifndef init_MUTEX
#define init_MUTEX(x) sema_init((x),1)
#endif
When the module is loaded, it creates 4 pairs of serial ports. The devices are /dev/tnt0 to /dev/tnt7 where tnt0 is connected to tnt1, tnt2 is connected to tnt3, etc.
You may need to fix the file permissions to be able to use the devices.
edit:
I guess I was a little quick with my enthusiasm. While the driver looks promising, it seems unstable. I don't know for sure, but I think it crashed a machine in the office I was working on from home. I can't check until I'm back in the office on Monday.
The second thing is that TIOCMIWAIT does not work. The code seems to be copied from some "tiny tty" example code. The handling of TIOCMIWAIT seems in place, but it never wakes up because the corresponding call to wake_up_interruptible() is missing.
edit:
The crash in the office really was the driver's fault. There was an initialization missing, and the completely untested TIOCMIWAIT code caused a crash of the machine.
I spent yesterday and today rewriting the driver. There were a lot of issues, but now it works well for me. There's still code missing for hardware flow control managed by the driver, but I don't need it because I'll be managing the pins myself using TIOCMGET/TIOCMSET/TIOCMIWAIT from user mode code.
If anyone is interested in my version of the code, send me a message and I'll send it to you.
You may want to look at Tibbo VSPDL for creating a linux virtual serial port using a Kernel driver -- it seems pretty new, and is available for download right now (beta version). Not sure about the license at this point, or whether they want to make it available commercially only in the future.
There are other commercial alternatives, such as http://www.ttyredirector.com/.
In open source, Remserial (GPL) may also do what you want, using Unix PTYs. It transmits the serial data in "raw form" to a network socket; stty-like setup of terminal parameters must be done when creating the port, and changing them later as described in RFC 2217 does not seem to be supported. You should be able to run two remserial instances to create a virtual null modem like com0com, except that you'll need to set up port speed etc. in advance.
Socat (also GPL) is like an extended variant of Remserial with many, many more options, including a "PTY" method for redirecting the PTY to something else, which can be another instance of Socat. For unit tests, socat is likely nicer than remserial because you can directly cat files into the PTY. See the PTY example on the manpage. A patch exists under "contrib" to provide RFC 2217 support for negotiating serial line settings.
Using the links posted in the previous answers, I coded a little example in C++ using a Virtual Serial Port. I pushed the code into GitHub: https://github.com/cymait/virtual-serial-port-example .
The code is pretty self-explanatory. First, you create the master process by running ./main master, and it will print to stderr the device it is using. After that, you invoke ./main slave device, where device is the device printed by the first command.
And that's it. You have a bidirectional link between the two process.
Using this example, you can test the application by sending all kinds of data and seeing if it works correctly.
Also, you can always symlink the device, so you don't need to re-compile the application you are testing.
Would you be able to use a USB->RS232 adapter? I have a few, and they just use the FTDI driver. Then, you should be able to rename /dev/ttyUSB0 (or whatever gets created) as /dev/ttyS2.
I can think of three options:
Implement RFC 2217
RFC 2217 covers a com port to TCP/IP standard that allows a client on one system to emulate a serial port to the local programs, while transparently sending and receiving data and control signals to a server on another system which actually has the serial port. Here's a high-level overview.
What you would do is find or implement a client com port driver that would implement the client side of the system on your PC - appearing to be a real serial port but in reality shuttling everything to a server. You might be able to get this driver for free from Digi, Lantronix, etc in support of their real standalone serial port servers.
You would then implement the server side of the connection locally in another program - allowing the client to connect and issuing the data and control commands as needed.
It's probably non trivial, but the RFC is out there, and you might be able to find an open source project that implements one or both sides of the connection.
Modify the linux serial port driver
Alternately, the serial port driver source for Linux is readily available. Take that, gut the hardware control pieces, and have that one driver run two /dev/ttySx ports, as a simple loopback. Then connect your real program to the ttyS2 and your simulator to the other ttySx.
Use two USB<-->Serial cables in a loopback
But the easiest thing to do right now? Spend $40 on two serial port USB devices, wire them together (null modem) and actually have two real serial ports - one for the program you're testing, one for your simulator.
-Adam
$ socat -d -d pty,link=/tmp/vserial1,raw,echo=0 pty,link=/tmp/vserial2,raw,echo=0
This will generate the symlinks /tmp/vserial1 and /tmp/vserial2, pointing at the virtual serial ports generated in /dev/pts/*.
Combining all the other amazingly useful answers, I found the command below to be VERY useful for testing on different types of Linux distros where there's no guarantee you're going to get the same /dev/pts/#'s every time and/or you need to test multiple pseudo serial devices and connections at once.
parallel 'i="{1}"; socat -d -d pty,raw,echo=0,link=$HOME/pty{1} pty,raw,echo=0,link=$HOME/pty$(($i+1))' ::: $(seq 0 2 3;)
Breaking this down:
parallel runs the same command for each argument supplied to it.
So for example if we run it with the --dryrun flag it gives us:
i="0"; socat -d -d pty,raw,echo=0,link=$HOME/pty0 pty,raw,echo=0,link=$HOME/pty$(($i+1))
i="2"; socat -d -d pty,raw,echo=0,link=$HOME/pty2 pty,raw,echo=0,link=$HOME/pty$(($i+1))
This is due to the $(seq x y z;) at the end, where
x = start #, y = increment by, and z = end # (or # of devices you need to spawn)
parallel 'i="{1}"; echo "make psuedo_devices {1} $(($i+1))"' ::: $(seq 0 2 3;)
Outputs:
make psuedo_devices 0 1
make psuedo_devices 2 3
Gathering all this together, the final command above symlinks the proper pseudo devices together, regardless of what is in /dev/pts/, into whatever directory is supplied to socat via the link flag.
pstree -c -a $PROC_ID gives:
perl /usr/bin/parallel i="{1}"; socat -d -d pty,raw,echo=0,link=$HOME/pty{1} pty,raw,echo=0,link=$HOME/pty$(($i+1)) ::: 0 2
├─bash -c i="0"; socat -d -d pty,raw,echo=0,link=$HOME/pty0 pty,raw,echo=0,link=$HOME/pty$(($i+1))
│ └─socat -d -d pty,raw,echo=0,link=/home/user/pty0 pty,raw,echo=0,link=/home/user/pty1
└─bash -c i="2"; socat -d -d pty,raw,echo=0,link=$HOME/pty2 pty,raw,echo=0,link=$HOME/pty$(($i+1))
└─socat -d -d pty,raw,echo=0,link=/home/user/pty2 pty,raw,echo=0,link=/home/user/pty3
ls -l $HOME/pty* yields:
lrwxrwxrwx 1 user user 10 Sep 7 11:46 /home/user/pty0 -> /dev/pts/4
lrwxrwxrwx 1 user user 10 Sep 7 11:46 /home/user/pty1 -> /dev/pts/6
lrwxrwxrwx 1 user user 10 Sep 7 11:46 /home/user/pty2 -> /dev/pts/7
lrwxrwxrwx 1 user user 10 Sep 7 11:46 /home/user/pty3 -> /dev/pts/8
This was all because I was trying to run tests against a platform where I needed to generate a lot of mock serial connections and test their input/output via containerization (Docker). Hopefully someone finds it useful.
