AFAIK, there are two ways to do IPC over sockets: UNIX domain sockets and TCP/IP sockets.
UNIX domain sockets know that they're executing on the same system, so they can avoid some checks and operations (like routing), which makes them faster and lighter than IP sockets. They also transfer the packets over the file system, meaning disk access is a natural part of the process (AFAIU, that's what using the file system implies).
IP sockets (especially TCP/IP sockets) are a mechanism allowing communication between processes over the network. In some cases, you can use TCP/IP sockets to talk with processes running on the same computer (by using the loopback interface).
My question is: in the latter case, where does the transfer of packets occur exactly? If they are being passed through memory, then, although it seems like there is a logical overhead, wouldn't IP sockets actually be more performant than UNIX sockets?
Is there something that I am missing? I understand that, logically, IP sockets introduce an overhead; I want to understand what happens to a message in both cases.
UNIX domain sockets ... They also transfer the packets over the file system, meaning disk access is a natural part of the process
This is wrong. While there is a special socket file in the file system, it only regulates access to the socket, using file system permissions. The data transfer itself happens purely in memory.
IP sockets ... where does the transfer of packets occur exactly?
Also in memory.
Unix variants map a lot of things over the filesystem that have absolutely no relation to actual disk drives.
Both cases you're describing happen in memory only; just the amount of layering and overhead differs. UNIX domain sockets only use the file system for naming, while IP sockets go through the full network stack.
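To make that concrete, here is a hedged sketch (the socket path /tmp/demo.sock and port 5000 are arbitrary choices, and error checking is omitted for brevity): both variants are set up almost identically, and in both cases the payload only ever moves through kernel memory; the UNIX domain socket's file system entry is just a name.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    /* UNIX domain socket: the file system entry is only a rendezvous name. */
    int us = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un ua = { .sun_family = AF_UNIX };
    strncpy(ua.sun_path, "/tmp/demo.sock", sizeof(ua.sun_path) - 1);
    unlink(ua.sun_path);                          /* remove a stale name */
    bind(us, (struct sockaddr *)&ua, sizeof(ua));

    /* TCP socket on the loopback interface: full network stack, no NIC. */
    int ts = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in ta = { .sin_family = AF_INET, .sin_port = htons(5000) };
    inet_pton(AF_INET, "127.0.0.1", &ta.sin_addr);
    bind(ts, (struct sockaddr *)&ta, sizeof(ta));

    printf("data written to either socket only ever moves through kernel memory\n");
    close(us); close(ts); unlink(ua.sun_path);
    return 0;
}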
Consider a client application that is going to receive UDP packets from the same IP address, but on different ports. The bit rates of both data streams are much lower than the overall network connection throughput (say, both are around 2 Kb/s). The machine running the client is going to have a modern ARMv7 or x86-64 CPU.
Which of the following two approaches is better in terms of efficient use of the target machine’s resources?
1. Run single-threaded, blocking on both sockets with the epoll system call, and read from one or both sockets sequentially.
2. Run in two threads, each dedicated to one of the two sockets, and use simple blocking I/O (not invoking epoll).
Is there a possibility of losing packets with the first approach when both sockets have data at the same time? Does the answer change with the number of CPU cores available on the target machine?
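As an illustration of the first approach (not a definitive answer), here is a hedged sketch; ports 5001/5002 and the 2 KB buffer are arbitrary assumptions, and error handling is omitted:

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <netinet/in.h>
#include <unistd.h>

static int make_udp_socket(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in a = { .sin_family = AF_INET,
                             .sin_port = htons(port),
                             .sin_addr.s_addr = htonl(INADDR_ANY) };
    bind(fd, (struct sockaddr *)&a, sizeof(a));
    return fd;
}

int main(void)
{
    int s1 = make_udp_socket(5001), s2 = make_udp_socket(5002);
    int ep = epoll_create1(0);

    struct epoll_event ev = { .events = EPOLLIN };
    ev.data.fd = s1; epoll_ctl(ep, EPOLL_CTL_ADD, s1, &ev);
    ev.data.fd = s2; epoll_ctl(ep, EPOLL_CTL_ADD, s2, &ev);

    char buf[2048];
    for (;;) {
        struct epoll_event ready[2];
        int n = epoll_wait(ep, ready, 2, -1);       /* block until data */
        for (int i = 0; i < n; i++) {
            ssize_t len = recv(ready[i].data.fd, buf, sizeof(buf), 0);
            if (len > 0) {
                /* process one datagram here */
            }
        }
    }
}

As a rough intuition (not a definitive answer): datagrams that arrive on one socket while the thread is reading the other simply wait in that socket's kernel receive buffer, so at ~2 Kb/s loss would only occur if the loop stalled long enough to overflow that buffer, regardless of the number of cores.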
I am building a message layer for processes running on an embedded Linux system. I am planning to use sockets. This system might be ported to different operating systems down the road, so portability is a concern; performance is a lower priority than portability.
I have a few questions regarding my way forward.
1. I am thinking of using internet sockets over TCP/IP for this communication between local processes, for the sake of portability. Is there any reason I should not do that and use domain sockets instead?
2. Does it really improve portability to use internet sockets instead of domain sockets?
3. If this is indeed the way forward, can you point me in the right direction (how to assign ports to each process, etc.) with some online resources?
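Not an answer to the portability question, but as a concrete starting point, a hedged sketch of the loopback-TCP direction (port 5000 is an arbitrary choice; in practice each process would get its own well-known port, e.g. from a shared configuration file, and error handling is omitted):

#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    int ls = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(ls, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    struct sockaddr_in a = { .sin_family = AF_INET, .sin_port = htons(5000) };
    inet_pton(AF_INET, "127.0.0.1", &a.sin_addr);   /* loopback only */

    bind(ls, (struct sockaddr *)&a, sizeof(a));
    listen(ls, 16);

    int c = accept(ls, NULL, NULL);                 /* one peer, for the sketch */
    char buf[256];
    ssize_t n = read(c, buf, sizeof(buf));
    if (n > 0)
        printf("received %zd bytes from a local peer\n", n);
    close(c);
    close(ls);
    return 0;
}

Binding to 127.0.0.1 rather than INADDR_ANY keeps the service unreachable from the network, which is usually what you want for purely local IPC.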
When doing network programming, no matter whether you use multiple processes, multiple threads, or select/poll (epoll), there is only one process/thread accepting connections on a given port. And if you want to take advantage of multiple cores, you need to create worker processes/threads. But what if the bottleneck is handling the network connections themselves? Is there a way to take advantage of multiple cores when dealing with network connections?
I found some materials, and it seems this is hard to achieve.
The three-way handshake is done implicitly by the kernel. And in an SMP architecture, the operating system is divided into several critical sections; the same critical section can't run on more than one core at the same time.
All modern operating systems that run on PC hardware already have their network stacks heavily optimized for multi-core CPUs. For example, the packet handling code that pushes data to and from the network card is going to be independent of the TCP/IP stack code so a hardware interrupt can run to completion without disturbing the TCP code.
For most real-world applications though, the bulk of the work is between the packets. Data that comes in has to be processed and data that goes out has to be generated. That's up to application code, and that code can take advantage of multiple cores either by using multiple threads or multiple processes. How you do that best is very application and operating system specific. Windows, for example, has I/O completion ports which combine job discovery with multi-threaded job dispatch. Linux has epoll.
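For the Linux side, here is a hedged sketch of the multiple-process variant (port 5000 and four workers are arbitrary assumptions, and error handling is trimmed): one listening socket is created before fork() and inherited by every worker, and each worker runs its own epoll loop, so connection handling is spread across cores.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <netinet/in.h>

#define NWORKERS 4                                  /* arbitrary worker count */

static void worker(int lfd)
{
    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN };
    /* EPOLLEXCLUSIVE (Linux >= 4.5) could be OR'ed in to reduce spurious
     * wakeups when several workers watch the same listening socket. */
    ev.data.fd = lfd;
    epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);

    char buf[4096];
    for (;;) {
        struct epoll_event ready[64];
        int n = epoll_wait(ep, ready, 64, -1);
        for (int i = 0; i < n; i++) {
            if (ready[i].data.fd == lfd) {          /* new connection */
                int c = accept(lfd, NULL, NULL);
                if (c < 0) continue;                /* another worker won the race */
                ev.events = EPOLLIN;
                ev.data.fd = c;
                epoll_ctl(ep, EPOLL_CTL_ADD, c, &ev);
            } else {                                /* data on an accepted socket */
                int c = ready[i].data.fd;
                ssize_t r = read(c, buf, sizeof(buf));
                if (r <= 0) { close(c); continue; } /* peer closed or error */
                write(c, buf, r);                   /* echo back, as a placeholder */
            }
        }
    }
}

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    fcntl(lfd, F_SETFL, O_NONBLOCK);                /* losers of the accept race must not block */

    struct sockaddr_in a = { .sin_family = AF_INET,
                             .sin_port = htons(5000),
                             .sin_addr.s_addr = htonl(INADDR_ANY) };
    bind(lfd, (struct sockaddr *)&a, sizeof(a));
    listen(lfd, 128);

    for (int i = 0; i < NWORKERS; i++)
        if (fork() == 0) { worker(lfd); _exit(0); }
    for (;;) pause();                               /* parent just keeps the workers alive */
}

On newer kernels, SO_REUSEPORT is an alternative: each worker opens its own listening socket on the same port and the kernel load-balances incoming connections between them.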
With just the network traffic, that's almost solely done by the network card (i.e. not the computer's CPU). Communication with the network card is usually single-threaded (queued by the OS so you can send/receive on multiple threads) because a NIC can only push/pop stuff off its stack one at a time.
It's up to your process to do what it needs in response to received data. That can be done on one thread, and you can spawn other threads upon receipt of data on that master thread and divide the work up that way. If you have a language that supports asynchronous communication, I would try to get it to do most of the work of using multiple threads.
Does anyone have an idea how many TCP socket connections are possible on a modern standard Linux server?
(There is in general less traffic on each connection, but all the connections have to be up all the time.)
I achieved 1600k concurrent idle socket connections, and at the same time 57k req/s, on a Linux desktop (16 GB RAM, i7-2600 CPU). It's a single-threaded HTTP server written in C with epoll. The source code is on GitHub, and there is a blog post about it here.
Edit:
I did 600k concurrent HTTP connections (client & server both on the same computer) with Java/Clojure. There is a detailed info post; HN discussion: http://news.ycombinator.com/item?id=5127251
The cost of a connection (with epoll):
the application needs some RAM per connection;
TCP buffers: 2 * 4k ~ 10k, or more;
epoll needs some memory per file descriptor; from epoll(7): "Each registered file descriptor costs roughly 90 bytes on a 32-bit kernel, and roughly 160 bytes on a 64-bit kernel."
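As a rough back-of-the-envelope example using those figures (and assuming ~1 KB of application state per connection, which is an arbitrary guess): ~10 KB of TCP buffers + ~160 bytes of epoll bookkeeping + ~1 KB of application state is on the order of 11 KB per connection, so 1 million mostly idle connections already costs something like 11 GB of RAM before the application does any real work.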
This depends not only on the operating system in question, but also on configuration, potentially real-time configuration.
For Linux:
cat /proc/sys/fs/file-max
will show the current maximum number of file descriptors allowed to be open simultaneously, system-wide. Check out http://www.cs.uwaterloo.ca/~brecht/servers/openfiles.html
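Note that file-max is the system-wide ceiling; each process additionally has its own descriptor limit (ulimit -n), which a long-running server can inspect and raise up to its hard limit, as in this hedged sketch:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0) {
        printf("soft fd limit: %llu, hard fd limit: %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
        rl.rlim_cur = rl.rlim_max;                  /* raise soft limit to the hard limit */
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            perror("setrlimit");
    }
    return 0;
}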
A limit on the number of open sockets is configurable in the /proc file system
cat /proc/sys/fs/file-max
The maximum number of incoming connections in the OS is defined by integer limits.
Linux itself allows billions of open sockets.
To use the sockets you need an application listening, e.g. a web server, and that will use a certain amount of RAM per socket.
RAM and CPU will introduce the real limits. (As of 2017, think millions, not billions.)
1 million is possible, but not easy. Expect to use X gigabytes of RAM to manage 1 million sockets.
Outgoing TCP connections are limited by port numbers, roughly 65,000 per IP address. You can have multiple IP addresses, but not an unlimited number of them.
This is a limit in TCP, not Linux.
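For example, the ephemeral port range actually used for outgoing connections can be seen with cat /proc/sys/net/ipv4/ip_local_port_range, and the ~65,000 limit applies per combination of source IP and destination IP:port, so connecting to many different destinations raises the effective ceiling.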
10,000? 70,000? is that all :)
FreeBSD is probably the server you want. Here's a little blog post about tuning it to handle 100,000 connections; it has had some interesting features, like zero-copy sockets, for some time now, along with kqueue to act as a completion-port mechanism.
Solaris could handle 100,000 connections back in the last century! They say Linux would be even better.
The best description I've come across is this presentation/paper on writing a scalable webserver. He's not afraid to say it like it is :)
Same for software: the cretins on the application layer forced great innovations on the OS layer. Because Lotus Notes keeps one TCP connection per client open, IBM contributed major optimizations for the "one process, 100.000 open connections" case to Linux. And the O(1) scheduler was originally created to score well on some irrelevant Java benchmark. The bottom line is that this bloat benefits all of us.
On Linux you should be looking at using epoll for async I/O. It might also be worth fine-tuning socket buffers so as not to waste too much kernel space per connection (see the sketch below).
I would guess that you should be able to reach 100k connections on a reasonable machine.
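For the socket-buffer part, a hedged sketch (8 KB is an arbitrary example value, and the kernel rounds and effectively doubles whatever you request):

#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int sz = 8 * 1024;                              /* arbitrary example value */
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &sz, sizeof(sz));
    setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sz, sizeof(sz));

    socklen_t len = sizeof(sz);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &sz, &len);
    printf("effective receive buffer: %d bytes\n", sz);  /* the kernel doubles the request */
    close(fd);
    return 0;
}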
It depends on the application. If there are only a few packets from each client, 100K is very easy for Linux. An engineer on my team did a test years ago; the result showed that when there is no packet from the client after the connection is established, Linux epoll can watch 400k file descriptors for readability at a CPU usage level under 50%.
Which operating system?
For windows machines, if you're writing a server to scale well, and therefore using I/O Completion Ports and async I/O, then the main limitation is the amount of non-paged pool that you're using for each active connection. This translates directly into a limit based on the amount of memory that your machine has installed (non-paged pool is a finite, fixed size amount that is based on the total memory installed).
For connections that don't see much traffic you can make them more efficient by posting 'zero byte reads', which don't use non-paged pool and don't affect the locked pages limit (another potentially limited resource that may prevent you from having lots of socket connections open).
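A hedged sketch of the zero-byte-read idea (it assumes a socket created for overlapped I/O and already associated with an I/O completion port; WSAStartup, the IOCP loop, and error handling are omitted):

#include <winsock2.h>
#include <string.h>
#pragma comment(lib, "ws2_32.lib")

/* Post an overlapped receive with a zero-length buffer, so no application
 * buffer (and no non-paged pool for it) stays locked while the connection
 * is idle. 'sock' is assumed to be an overlapped socket already associated
 * with an I/O completion port. */
static int post_zero_byte_read(SOCKET sock, WSAOVERLAPPED *ov)
{
    WSABUF buf = { 0, NULL };                       /* zero-length buffer */
    DWORD flags = 0;

    memset(ov, 0, sizeof(*ov));
    if (WSARecv(sock, &buf, 1, NULL, &flags, ov, NULL) == SOCKET_ERROR &&
        WSAGetLastError() != WSA_IO_PENDING)
        return -1;                                  /* real failure */

    /* When the completion for this read is dequeued from the IOCP, data has
     * arrived; only then post a normal WSARecv with a real buffer. */
    return 0;
}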
Apart from that, well, you will need to profile but I've managed to get more than 70,000 concurrent connections on a modestly specified (760MB memory) server; see here http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html for more details.
Obviously if you're using a less efficient architecture such as 'thread per connection' or 'select' then you should expect to achieve less impressive figures; but, IMHO, there's simply no reason to select such architectures for Windows socket servers.
Edit: see here http://blogs.technet.com/markrussinovich/archive/2009/03/26/3211216.aspx; the way that the amount of non-paged pool is calculated has changed in Vista and Server 2008 and there's now much more available.
Realistically for an application, more than 4,000-5,000 open sockets on a single machine becomes impractical. Just checking for activity on all the sockets and managing them starts to become a performance issue, especially in real-time environments.