Is there only one Unix Domain Socket in the communication between two processes?

There are two kinds of sockets: network sockets and Unix Domain Sockets.
When two processes communicate using network sockets, each process creates its own network socket, and the processes communicate over a connection between their sockets. There are two sockets, each belonging to a different process and each serving as that process's connection endpoint.
When two processes communicate using Unix Domain sockets, a Unix Domain socket is addressed by a filename in the filesystem.
Does that imply the two processes communicate by only one Unix Domain socket, instead of two?
Does the Unix Domain socket not belong to any process, i.e. is the Unix domain socket not a connection endpoint of any process, but somehow like a "middle point" between the two processes?

There are two sockets, one on each end of the connection. Each of them, independently, may or may not have a name in the filesystem.
The thing you see when you ls -l that starts with srwx is not really "the socket". It's a name that is bound to a socket (or was bound to a socket in the past - they don't automatically get removed when they're dead).
An analogy: think about TCP sockets. Most of them involve an endpoint with a well-known port number (22 SSH; 25 SMTP; 80 HTTP; etc.) A server creates a socket and binds to the well-known port. A client creates a socket and connects to the well-known port. The client socket also has a port number, which you can see in a packet trace (tcpdump/wireshark), but it's not a fixed number, it's just some number that was automatically chosen by the client's kernel because it wasn't already in use.
In unix domain sockets, the pathname is like the port number. If you want clients to be able to find your server socket, you need to bind it to a well-known name, like /dev/log or /tmp/.X11-unix/X0. But the client doesn't need to have a well-known name, so normally it doesn't do a bind(). Therefore the name /tmp/.X11-unix/X0 is only associated with the server socket. You can confirm this with netstat -x. About half the sockets listed will have pathnames, and the other half won't. Or write your own client/server pair, and call getsockname() on the client. Its name will be empty, while getsockname() on the server gives the pathname.
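The getsockname() experiment suggested above can be sketched in a few lines of Python. This is only a minimal sketch: the file name demo.sock and the function name are made up for illustration, and the "empty name" result is what I would expect on Linux.

```python
import os
import socket
import tempfile


def unix_socket_names():
    """Return (listener name, accepted-socket name, client name) for one connection."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")  # hypothetical well-known name
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)      # creates the srwx... entry in the filesystem
        server.listen(1)
        client.connect(path)   # the client never calls bind()
        conn, _ = server.accept()
        try:
            return server.getsockname(), conn.getsockname(), client.getsockname()
        finally:
            conn.close()
    finally:
        client.close()
        server.close()
        if os.path.exists(path):
            os.unlink(path)    # the name is not removed automatically
        os.rmdir(tmpdir)


listener_name, accepted_name, client_name = unix_socket_names()
print("listener:", repr(listener_name))
print("accepted:", repr(accepted_name))
print("client:  ", repr(client_name))
```

The listener and the accepted socket both report the path, while the client's name comes back empty, which matches the "about half have pathnames" observation from netstat -x.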
The ephemeral port number automatically assigned to a TCP client has no counterpart in unix domain socket addresses. In TCP it's necessary to have a local port number so incoming packets can be matched to the correct socket. Unix domain sockets are linked directly in their kernel data structures, so there's no need. A client can be connected to a server and have no name.
And then there's socketpair() which creates 2 unix domain sockets connected to each other, without giving a name to either of them.
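A small Python sketch of socketpair(), assuming a Unix-like system where the default family is AF_UNIX. The function name is only illustrative.

```python
import socket


def socketpair_roundtrip(payload=b"ping"):
    """Send a payload across a socketpair and report both sockets' names."""
    a, b = socket.socketpair()  # two connected sockets, neither bound to a path
    try:
        a.sendall(payload)
        received = b.recv(1024)
        return received, a.getsockname(), b.getsockname()
    finally:
        a.close()
        b.close()


received, name_a, name_b = socketpair_roundtrip()
print(received, repr(name_a), repr(name_b))
```

Both ends can talk to each other even though neither has a name in the filesystem.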
(Not mentioned here, and not really interesting: the "abstract" namespace.)

Related

TCP's socket vs Linux's TCP socket

Linux API and TCP protocol both have concepts called "socket". Are they the same concept, and does Linux's TCP socket implement TCP's socket concept?
Relation between connections and sockets:
I have heard that two connections can't share a Linux's TCP socket, and is it true?
Tanenbaum's Computer Networks (5ed 2011, Section 6.5.2 The TCP Service Model, p553) says:
A socket may be used for multiple connections at the same time. In other words, two or more connections may terminate at the same socket. Connections are identified by the socket identifiers at both ends.
Since the quote says two connections can share a "socket", does the book use a different "socket" concept from Linux's TCP socket? Does the book use TCP's socket concept?
Relation between processes and sockets:
I also heard that two processes can share a Linux's TCP socket. But if two processes can share a socket, can't the processes create their own connections on the socket at will, so there are two connections on the same Linux's TCP socket? Is it a contradiction to 1, where two connections can't share a Linux TCP socket?
Can two processes share a TCP's socket?
The book references a more abstract concept of a socket, one that is not tied to a particular OS or even a network/transport protocol. In the book, a socket is simply a uniquely defined connection endpoint. A connection is thus a pair (S1, S2) of sockets, and this pair should be unique in some undefined context. An example specific to TCP using my connection right now would have an abstract socket consisting of an interface IP address and a TCP port number. There are many, many connections between stackoverflow users like myself and the abstract socket [443, 151.101.193.69] but only a single connection from my machine [27165, 192.168.1.231] to [443, 151.101.193.69], which is a fake example using a non-routable IP address so as to protect my privacy.
If we get even more concrete and assume that stackoverflow and my computer are both running linux, then we can talk about the socket as defined by man 2 socket, and the linux API that uses it. Here a socket can be created in listening mode, and this is typically called a server. This socket can be shared (shared in the sense of shared memory or state) amongst multiple processes. However, when a peer connects to this listening socket, a new socket is created (as a result of the accept() call). The original listening socket may again be used to accept() another connection. I believe that if there are multiple processes blocked on the accept() system call, then exactly one of these is unblocked and returns with the newly created connected socket.
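To make the listening-socket versus accepted-socket distinction concrete, here is a minimal Python sketch over loopback TCP. The function name and the use of port 0 (let the kernel pick a free port) are my own choices for the illustration.

```python
import socket


def accept_two_clients():
    """Accept two connections on one listening socket and compare descriptors."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clients, accepted = [], []
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: kernel chooses a free port
        listener.listen(5)
        addr = listener.getsockname()
        for _ in range(2):
            clients.append(socket.create_connection(addr, timeout=5))
            conn, _peer = listener.accept()  # returns a NEW socket per connection
            accepted.append(conn)
        fds = [listener.fileno()] + [c.fileno() for c in accepted]
        local_ports = [c.getsockname()[1] for c in accepted]
        return addr[1], fds, local_ports
    finally:
        for s in clients + accepted + [listener]:
            s.close()


port, fds, local_ports = accept_two_clients()
print("listening port:", port)
print("file descriptors (listener, conn1, conn2):", fds)
print("local port of each accepted socket:", local_ports)
```

The three descriptors are distinct, yet both accepted sockets report the same local port as the listener.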
Let me know if there is something missing here.
Speaking as the docs you're reading do is convenient, but it's not really accurate.
Sockets are a general networking API. Their only relation with TCP is, you can set sockets up to use it. You can also set sockets up to talk over any other networking protocol the OS backs; also, you don't necessarily have to use sockets, many OS's still offer other networking APIs, some with substantial niche advantages.
The problem is that nontechnical language gives you an idea of how things are put together but glosses over implementation details, so you can't do any detailed reasoning from characterizations and analogies in layman's terms.
So ignore the concept you've formed of sockets. Read the actual docs, not tutorials. Write code to see if it works as you think it does. You'll learn that what you have now is a layman's understanding of "a socket", glossing over the differences between sockets you create with socket(), the ones you get from accept(), the ones you can find in Unix's filesystem, and so forth.
Even "connection" is somewhat of a simplification, even for TCP.
To give you an idea just how deep the rabbit hole goes, so is "sharing" -- you can send fd's over some kinds of sockets, and sockets are fd's, after fork() the two processes share the fd namespace, and you can dup() fd's...
A fully-set-up TCP network connection is a {host1:port1, host2:port2} pair with some tracked state at both ends and packets getting sent between those ends that update the state according to the TCP protocol i.e. rules. You can bind() a socket to a local TCP address, and connect() through that socket to remote (or local) addresses one after another, so in that sense connections can share a socket—but if you're running a server, accept()ed connections get their own dedicated socket, it's how you identify where the data you read() is coming from.
One of the common conflations is between the host:port pair a socket can be bound to and the socket itself. You wind up with one OS socket listening for new connections plus one per connection over a connection-based protocol like TCP, but they can all use the same host:port, it's easy to gloss over the reality and think of that as "a socket", and it looks like the book you're reading fell into that usage.

Understanding Client Server Connections [duplicate]

I understand the basics of how ports work. However, what I don't get is how multiple clients can simultaneously connect to say port 80. I know each client has a unique (for their machine) port. Does the server reply back from an available port to the client, and simply state the reply came from 80? How does this work?
First off, a "port" is just a number. All a "connection to a port" really represents is a packet which has that number specified in its "destination port" header field.
Now, there are two answers to your question, one for stateful protocols and one for stateless protocols.
For a stateless protocol (ie UDP), there is no problem because "connections" don't exist - multiple people can send packets to the same port, and their packets will arrive in whatever sequence. Nobody is ever in the "connected" state.
For a stateful protocol (like TCP), a connection is identified by a 4-tuple consisting of source and destination ports and source and destination IP addresses. So, if two different machines connect to the same port on a third machine, there are two distinct connections because the source IPs differ. If the same machine (or two behind NAT or otherwise sharing the same IP address) connects twice to a single remote end, the connections are differentiated by source port (which is generally a random high-numbered port).
Simply put, if I connect to the same web server twice from my client, the two connections will have different source ports from my perspective (and different destination ports from the web server's perspective). So there is no ambiguity, even though both connections have the same source and destination IP addresses.
Ports are a way to multiplex IP addresses so that different applications can listen on the same IP address/protocol pair. Unless an application defines its own higher-level protocol, there is no way to multiplex a port. If two connections using the same protocol simultaneously have identical source and destination IPs and identical source and destination ports, they must be the same connection.
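The 4-tuple idea can be checked with a short Python sketch, assuming a throwaway listener on the loopback interface. The function name is illustrative, not from any library.

```python
import socket


def connection_tuples(n=2):
    """Open n connections to one server port and return each client's 4-tuple."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clients = []
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(5)
        server_addr = listener.getsockname()
        for _ in range(n):
            clients.append(socket.create_connection(server_addr, timeout=5))
        # (source IP, source port, destination IP, destination port)
        return [c.getsockname() + c.getpeername() for c in clients]
    finally:
        for s in clients + [listener]:
            s.close()


for t in connection_tuples():
    print(t)
```

Only the source port should differ between the printed tuples; the other three fields are identical.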
Important:
I'm sorry to say that the response from "Borealid" is imprecise and somewhat incorrect - firstly there is no relation to statefulness or statelessness to answer this question, and most importantly the definition of the tuple for a socket is incorrect.
First remember below two rules:
Primary key of a socket: A socket is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL} not by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT} - Protocol is an important part of a socket's definition.
OS Process & Socket mapping: A process can be associated with (can open/can listen to) multiple sockets which might be obvious to many readers.
Example 1: Two clients connecting to same server port means: socket1 {SRC-A, 100, DEST-X,80, TCP} and socket2{SRC-B, 100, DEST-X,80, TCP}. This means host A connects to server X's port 80 and another host B also connects to the same server X to the same port 80. Now, how the server handles these two sockets depends on if the server is single-threaded or multiple-threaded (I'll explain this later). What is important is that one server can listen to multiple sockets simultaneously.
To answer the original question of the post:
Irrespective of stateful or stateless protocols, two clients can connect to the same server port because for each client we can assign a different socket (as the client IP will definitely differ). The same client can also have two sockets connecting to the same server port - since such sockets differ by SRC-PORT. With all fairness, "Borealid" essentially mentioned the same correct answer but the reference to state-less/full was kind of unnecessary/confusing.
To answer the second part of the question on how a server knows which socket to answer. First understand that for a single server process that is listening to the same port, there could be more than one socket (maybe from the same client or from different clients). Now as long as a server knows which request is associated with which socket, it can always respond to the appropriate client using the same socket. Thus a server never needs to open another port in its own node than the original one on which the client initially tried to connect. If any server allocates different server ports after a socket is bound, then in my opinion the server is wasting its resource and it must be needing the client to connect again to the new port assigned.
A bit more for completeness:
Example 2: It's a very interesting question: "can two different processes on a server listen to the same port?" If you do not consider protocol as one of the parameters defining sockets, then the answer is no. This is so because in such a case a single client trying to connect to a server port would have no mechanism to say which of the two listening processes it intends to connect to. This is the same theme asserted by rule (2). However, this is the WRONG answer because 'protocol' is also a part of the socket definition. Thus two processes in the same node can listen to the same port only if they are using different protocols. For example, two unrelated clients (say one is using TCP and another is using UDP) can connect and communicate to the same server node and to the same port, but they must be served by two different server processes.
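That port numbers are scoped per protocol can be seen with a rough Python sketch: a TCP socket and a UDP socket bound to the same port number at the same time. Both sockets live in one process here only to keep the sketch short, and the retry loop is my own precaution in case some other program already holds the UDP port.

```python
import socket


def bind_tcp_and_udp_same_port(attempts=5):
    """Bind a TCP and a UDP socket to the same port number; return their ports."""
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            tcp.bind(("127.0.0.1", 0))  # kernel picks a free TCP port
            tcp.listen(1)
            port = tcp.getsockname()[1]
            try:
                udp.bind(("127.0.0.1", port))  # same number, different protocol
            except OSError:
                continue  # UDP port taken by something else; try another
            return tcp.getsockname()[1], udp.getsockname()[1]
        finally:
            tcp.close()
            udp.close()
    raise RuntimeError("could not find a port free for both TCP and UDP")


tcp_port, udp_port = bind_tcp_and_udp_same_port()
print("TCP port:", tcp_port, "UDP port:", udp_port)
```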
Server Types - single & multiple:
When a server process is listening on a port, multiple sockets can simultaneously connect to and communicate with the same server process. If a server uses only a single child process to serve all the sockets, then the server is called single-process/threaded; if the server uses many sub-processes, each serving one socket, then the server is called a multi-process/threaded server. Note that irrespective of the server's type, a server can/should always use the same initial socket to respond back (no need to allocate another server port).
A Note on Parent/Child Process (in response to query/comment of 'Ioan Alexandru Cucu')
Wherever I mentioned any concept in relation to two processes, say A and B, consider that they are not related by a parent-child relationship. OS's (especially UNIX) by design allow a child process to inherit all file descriptors (FDs) from its parent. Thus all the sockets (which in UNIX-like OS's are also FDs) that process A is listening on can be listened on by many more processes A1, A2, ... as long as they are related to A by a parent-child relation. But an independent process B (i.e. one having no parent-child relation to A) cannot listen on the same socket. In addition, note that this rule of disallowing two independent processes from listening on the same socket is enforced by the OS (or its network libraries), and it's obeyed by most OS's by far. However, one can create one's own OS, which could very well violate this restriction.
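The inheritance of a listening socket by a child can be sketched in Python with os.fork() (Unix only). In this hypothetical demo only the child calls accept(), and it replies with its own PID so the parent can tell who served the connection. The timeout is just a guard so the child cannot block forever.

```python
import os
import socket


def child_accepts_on_inherited_socket():
    """Fork a child that accepts on the parent's listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    listener.settimeout(5)  # guard: the child gives up if nobody connects
    addr = listener.getsockname()
    pid = os.fork()
    if pid == 0:
        # Child: the listening descriptor was inherited across fork().
        try:
            conn, _ = listener.accept()
            conn.sendall(str(os.getpid()).encode())
            conn.close()
        finally:
            os._exit(0)  # leave without running the parent's cleanup code
    # Parent: never calls accept(); it only connects as a client.
    try:
        with socket.create_connection(addr, timeout=5) as client:
            data = client.recv(64)
    finally:
        listener.close()
        os.waitpid(pid, 0)
    return pid, int(data)


child_pid, serving_pid = child_accepts_on_inherited_socket()
print("child pid:", child_pid, "connection served by pid:", serving_pid)
```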
TCP / HTTP Listening On Ports: How Can Many Users Share the Same Port
So, what happens when a server listens for incoming connections on a TCP port? For example, let's say you have a web-server on port 80. Let's assume that your computer has the public IP address of 24.14.181.229 and the person that tries to connect to you has IP address 10.1.2.3. This person can connect to you by opening a TCP socket to 24.14.181.229:80. Simple enough.
Intuitively (and wrongly), most people assume that it looks something like this:
Local Computer | Remote Computer
--------------------------------
<local_ip>:80 | <foreign_ip>:80
^^ not actually what happens, but this is the conceptual model a lot of people have in mind.
This is intuitive, because from the standpoint of the client, he has an IP address, and connects to a server at IP:PORT. Since the client connects to port 80, then his port must be 80 too? This is a sensible thing to think, but actually not what happens. If that were to be correct, we could only serve one user per foreign IP address. Once a remote computer connects, then he would hog the port 80 to port 80 connection, and no one else could connect.
Three things must be understood:
1.) On a server, a process is listening on a port. Once it gets a connection, it hands it off to another thread. The communication never hogs the listening port.
2.) Connections are uniquely identified by the OS by the following 5-tuple: (local-IP, local-port, remote-IP, remote-port, protocol). If any element in the tuple is different, then this is a completely independent connection.
3.) When a client connects to a server, it picks a random, unused high-order source port. This way, a single client can have up to ~64k connections to the server for the same destination port.
So, this is really what gets created when a client connects to a server:
Local Computer     | Remote Computer        | Role
-----------------------------------------------------------
0.0.0.0:80         | <none>                 | LISTENING
24.14.181.229:80   | 10.1.2.3:<random_port> | ESTABLISHED
Looking at What Actually Happens
First, let's use netstat to see what is happening on this computer. We will use port 500 instead of 80 (because a whole bunch of stuff is happening on port 80 as it is a common port, but functionally it does not make a difference).
netstat -atnp | grep -i ":500 "
As expected, the output is blank. Now let's start a web server:
sudo python3 -m http.server 500
Now, here is the output of running netstat again:
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:500             0.0.0.0:*               LISTEN      -
So now there is one process that is actively listening (State: LISTEN) on port 500. The local address is 0.0.0.0, which is code for "listening on all local addresses". An easy mistake to make is to listen on address 127.0.0.1, which will only accept connections from the current computer. So this is not a connection; it just means that a process requested to bind() to an IP address and port, and that process is responsible for handling all connections to that port. This hints at the limitation that there can only be one process per computer listening on a port (there are ways to get around that using multiplexing, but this is a much more complicated topic). If a web-server is listening on port 80, it cannot share that port with other web-servers.
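The one-listener-per-port limitation can be seen in a few lines of Python. This is a sketch with default socket options (no SO_REUSEADDR or SO_REUSEPORT set), and both sockets are in one process just for brevity; the function name is made up.

```python
import errno
import socket


def second_bind_errno():
    """Try to bind a second socket to an address that is already listening."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        try:
            second.bind(first.getsockname())  # same IP and port as `first`
        except OSError as exc:
            return exc.errno
        return None  # the bind unexpectedly succeeded
    finally:
        first.close()
        second.close()


err = second_bind_errno()
print("second bind failed with:", errno.errorcode.get(err, err))
```

On Linux I would expect the second bind() to fail with EADDRINUSE.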
So now, let's connect a user to our machine:
quicknet -m tcp -t localhost:500 -p Test payload.
This is a simple script (https://github.com/grokit/dcore/tree/master/apps/quicknet) that opens a TCP socket, sends the payload ("Test payload." in this case), waits a few seconds and disconnects. Doing netstat again while this is happening displays the following:
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:500             0.0.0.0:*               LISTEN      -
tcp        0      0 192.168.1.10:500        192.168.1.13:54240      ESTABLISHED -
If you connect with another client and do netstat again, you will see the following:
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:500             0.0.0.0:*               LISTEN      -
tcp        0      0 192.168.1.10:500        192.168.1.13:26813      ESTABLISHED -
... that is, the client used another random port for the connection. So there is never any confusion between connections, even when they come from the same IP address.
Normally, for every connecting client the server forks a child process that communicates with the client (TCP). The parent server hands off to the child process an established socket that communicates back to the client.
When you send the data to a socket from your child server, the TCP stack in the OS creates a packet going back to the client and sets the "from port" to 80.
Multiple clients can connect to the same port (say 80) on the server because on the server side, after creating a socket and binding it (setting the local IP and port), listen() is called on the socket, which tells the OS to accept incoming connections.
When a client tries to connect to the server on port 80, the accept() call is invoked on the server socket. This creates a new socket for the client trying to connect, and similarly new sockets will be created for subsequent clients using the same port 80.
Ref
http://www.scs.stanford.edu/07wi-cs244b/refs/net2.pdf

Why does socketpair() allow SOCK_DGRAM type?

I've been learning about Linux socket programming recently, mostly from this site.
The site says that using the domain/type combination PF_LOCAL/SOCK_DGRAM...
Provides datagram services within the local host. Note that this
service is connectionless, but reliable, with the possible exception
that packets might be lost if kernel buffers should become exhausted.
My question, then, is why does socketpair(int domain, int type, int protocol, int sv[2]) allow this combination, when according to its man page...
The socketpair() call creates an unnamed pair of connected sockets in
the specified domain, of the specified type...
Isn't there a contradiction here?
I thought SOCK_DGRAM in the PF_LOCAL and PF_INET domains implied UDP, which is a connectionless protocol, so I can't reconcile the seeming conflict with socketpair()'s claim to create connected sockets.
Datagram sockets have "pseudo-connections". The protocol doesn't really have connections, but you can still call connect(). This associates a remote address and port with the socket, and then it only receives packets that come from that source, rather than all packets whose destination is the address/port that the socket is bound to, and you can use send() rather than sendto() to send back to this remote address.
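Here is a small Python sketch of that pseudo-connection over loopback UDP. The socket names a, b and c are just labels for this example, and the filtering shown is what I would expect on Linux.

```python
import socket


def connected_udp_filters_senders():
    """Show that a connect()ed UDP socket only hears from its chosen peer."""
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # will be "connected"
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # the chosen peer
    c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # an unrelated sender
    try:
        for s in (a, b, c):
            s.bind(("127.0.0.1", 0))
            s.settimeout(1.0)
        a.connect(b.getsockname())            # no packets sent; just records the peer
        c.sendto(b"from c", a.getsockname())  # not from the peer, so not delivered to a
        b.sendto(b"from b", a.getsockname())
        first = a.recv(1024)
        a.send(b"reply")                      # send() needs no destination address
        reply, sender = b.recvfrom(1024)
        return first, reply, sender == a.getsockname()
    finally:
        for s in (a, b, c):
            s.close()


print(connected_udp_filters_senders())
```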
An example where this might be used is the TFTP protocol. The server initially listens for incoming requests on the well-known port. Once a transfer has started, a different port is used, and the sender and receiver can use connect() to associate a socket with that pair of ports. Then they can simply send and receive on that new socket to participate in the transfer.
Similarly, if you use socketpair() with datagram sockets, it creates a pseudo-connection between the two sockets.
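A quick Python sketch of a datagram socketpair. Each send() arrives as its own message and no destination address is needed, because each socket already knows its peer. The function name is made up for the example.

```python
import socket


def dgram_pair_messages():
    """Send two datagrams across a SOCK_DGRAM socketpair and read them back."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        a.send(b"one")  # no address: the pair is already "connected"
        a.send(b"two")
        # Each recv() returns exactly one datagram, so boundaries are kept.
        return [b.recv(1024), b.recv(1024)]
    finally:
        a.close()
        b.close()


print(dgram_pair_messages())
```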

Maintaining more than 65535 connections on single IP

Reading the following article: 10M concurrent websockets
So, there are 1000 websocket servers listening on ports 10000-11000. When a connection is made to one of these servers, I assume they continue communication from a random established TCP connection with random ports. So, as one IP is used, and there are 64K ports, how can one maintain 10M connections? Are connections identified by IP-Port pairs? Can two different connections from different IPs to same port be established? How does this work under the hood?
When a connection is made to one of these servers, I assume they continue communication from a random established TCP connection with random ports.
Wrong assumption. They communicate with the clients using the same local port number they are listening on.
So, as one IP is used, and there are 64K ports, how can one maintain 10M connections?
Not a problem.
Are connections identified by IP-Port pairs?
Yes.
Can two different connections from different IPs to same port be established?
Yes.
How does this work under the hood?
See above. IP:port pairs. You answered your own question.
Sorry for totally changing my answer.
Linux can easily support millions of open sockets if the machine has enough memory and processing power. The TCP/IP stack allows this because the socket the OS targets for a given TCP packet is determined by the source and destination IP and port tuple.
The server implementing the websocket protocol need only listen to a single TCP socket, often defined by the HTTP or HTTPS port number, but not in this example. As part of standard TCP handshaking, the server OS and application open a unique socket for the TCP connection to the new client when the HTTP request which is a websocket request is received. The websocket package takes care of upgrading the protocol used on this new socket from standard HTTP to websocket.
In the example, a goroutine is started for each websocket socket.
The client side, the side initiating the TCP connections, is limited by the number of ephemeral ports its OS can open for a given destination host and port. Honestly, I don't know if this is a limitation of the client OS or the TCP/IP specification itself.
I think the part you are missing is a TCP connection is actually two pairs of IP:PORT.
One for the server, one for the client.
The listening side of a tcp socket is generally always the same IP/Port pair.
Example: net.Listen("tcp", ":8080") is listening on port 8080 (on all interfaces in this case)
The connecting (client) side usually uses a single outgoing IP along with a random port.
Example: net.Dial("tcp", "server:8080") selects a random available ephemeral port and then attempts to connect to server:8080.
So, in the above example, that connection is: client.ip:32768 -> server.ip:8080 (where 32768 is the ephemeral port selected)
the two pairs combined make a unique connection.
The server side can take as many connections from a single client as there are available (client side) ports. It can also take as many clients as there are IP addresses.
Think of it as, for one listening socket, you can theoretically have 2^16(ports) * 2^32(ipv4 addrs) connections.
In reality, there are reserved IPs, ports, memory limitations, etc so the number is far smaller.
For example, the ephemeral port range on Linux is 32768 - 61000, which means I'll start getting errors if I net.Dial("tcp", "server:8080") more than 28232 times, as I will have exhausted my ephemeral port range for the given server address. But if the server is listening on 2 separate ports, I can do 28232 to the first port, and another 28232 to the second port.
When you see people do the 10MM connection tests, they have to use multiple client IPs or multiple server IPs/Ports to achieve this (or a combo of both to get 10MM unique client:ip/server:ip pairs)
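The arithmetic behind those figures, as a tiny Python sketch. The range 32768 - 61000 is simply the one quoted above; the real range is configurable (on Linux via net.ipv4.ip_local_port_range), so treat these numbers as an illustration.

```python
PORT_BITS = 16
IPV4_ADDRESS_BITS = 32

# Upper bound on distinct (client IP, client port) peers for one listening socket.
theoretical_peers = 2 ** PORT_BITS * 2 ** IPV4_ADDRESS_BITS

# Ephemeral range quoted above (an assumption about the client's configuration).
low, high = 32768, 61000
ports_per_destination = high - low

# Each extra server port (or server IP) gives the client a fresh set of source ports.
server_ports = 2
connections_from_one_client_ip = ports_per_destination * server_ports

print(theoretical_peers)
print(ports_per_destination)
print(connections_from_one_client_ip)
```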

How do Unix Domain Sockets differentiate between multiple clients?

TCP has the tuple pairs (IP Addr/port/type) to tell one client from another. UDP passes the client IP and port. How does the unix domain keep track of different clients?
In other words the server creates a socket bound to some path say /tmp/socket. 2 or more clients connect to /tmp/socket. What is going on underneath that keeps track of data from client1 and client2? I imagine the network stack plays no part in domain sockets so is the kernel doing all the work here?
Is there a unix domain protocol format like there is an IP protocol format and TCP/UDP formats? Is the format of domain socket datagram protocols published somewhere? Is every unix different or does something like POSIX standardize it?
Thanks for any illumination. I could not find any information that explained this. Every source just glossed over how to use the domain sockets.
If you create a PF_UNIX socket of type SOCK_STREAM, and accept connections on it, then each time you accept a connection, you get a new file descriptor (as the return value of the accept system call). This file descriptor reads data from and writes data to a file descriptor in the client process. Thus it works just like a TCP/IP connection.
There's no “unix domain protocol format”. There doesn't need to be, because a Unix-domain socket can't be connected to a peer over a network connection. In the kernel, the file descriptor representing your end of a SOCK_STREAM Unix-domain socket points to a data structure that tells the kernel which file descriptor is at the other end of the connection. When you write data to your file descriptor, the kernel looks up the file descriptor at the other end of the connection and appends the data to that other file descriptor's read buffer. The kernel doesn't need to put your data inside a packet with a header describing its destination.
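To illustrate how the server tells stream clients apart, here is a rough Python sketch in which two clients connect to one path and the server answers each on its own accepted socket. The path demo.sock and the labels client1/client2 are made up for the example.

```python
import os
import socket
import tempfile


def serve_two_unix_clients():
    """Two clients connect to one path; the server replies per accepted socket."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")  # hypothetical server path
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    clients, conns = [], []
    try:
        server.bind(path)
        server.listen(2)
        for label in (b"client1", b"client2"):
            c = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            c.connect(path)
            c.sendall(label)  # buffered by the kernel until the server reads it
            clients.append(c)
        for _ in clients:
            conn, _addr = server.accept()  # one new descriptor per client
            conns.append(conn)
        # Each accepted socket carries only its own client's bytes.
        received = {conn.fileno(): conn.recv(1024) for conn in conns}
        for conn in conns:
            conn.sendall(b"hello " + received[conn.fileno()])
        replies = [c.recv(1024) for c in clients]
        return received, replies
    finally:
        for s in clients + conns + [server]:
            s.close()
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)


received, replies = serve_two_unix_clients()
print(received)
print(replies)
```

No addresses or headers are involved: the accepted descriptor itself is what identifies the client.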
For a SOCK_DGRAM socket, you have to tell the kernel the path of the socket that should receive your data, and it uses that to look up the file descriptor for that receiving socket.
If you bind a path to your client socket before you connect to the server socket (or before you send data if you're using SOCK_DGRAM), then the server process can get that path using getpeername (for SOCK_STREAM). For a SOCK_DGRAM, the receiving side can use recvfrom to get the path of the sending socket.
If you don't bind a path, then the receiving process can't get an id that uniquely identifies the peer. At least, not on the Linux kernel I'm running (2.6.18-238.19.1.el5).
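The SOCK_DGRAM case can be sketched in Python as follows. The file names server.sock and client.sock are hypothetical, and the empty address for the unbound sender is what I would expect on Linux.

```python
import os
import socket
import tempfile


def dgram_sender_addresses():
    """Return the sender address seen by recvfrom() for a bound and an unbound client."""
    tmpdir = tempfile.mkdtemp()
    server_path = os.path.join(tmpdir, "server.sock")
    client_path = os.path.join(tmpdir, "client.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    named = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    unnamed = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        server.bind(server_path)
        named.bind(client_path)               # this client has a path
        named.sendto(b"a", server_path)
        unnamed.sendto(b"b", server_path)     # this client never called bind()
        _, addr_named = server.recvfrom(1024)
        _, addr_unnamed = server.recvfrom(1024)
        return addr_named, addr_unnamed
    finally:
        for s in (server, named, unnamed):
            s.close()
        for p in (server_path, client_path):
            if os.path.exists(p):
                os.unlink(p)
        os.rmdir(tmpdir)


addr_named, addr_unnamed = dgram_sender_addresses()
print("bound sender:  ", repr(addr_named))
print("unbound sender:", repr(addr_unnamed))
```

The server can only reply with sendto() to the sender that has a path; the unbound sender gives it nothing to address.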
