How do Unix Domain Sockets differentiate between multiple clients?

TCP tells one client from another by the (IP address, port) pair on each end of the connection. UDP passes the client's IP address and port with each datagram. How do Unix domain sockets keep track of different clients?
In other words, the server creates a socket bound to some path, say /tmp/socket, and two or more clients connect to /tmp/socket. What is going on underneath that keeps the data from client1 and client2 separate? I imagine the network stack plays no part in domain sockets, so is the kernel doing all the work here?
Is there a Unix domain protocol format the way there are IP, TCP, and UDP packet formats? Is the format of domain socket datagram protocols published somewhere? Is every Unix different, or does something like POSIX standardize it?
Thanks for any illumination. I could not find any information that explained this; every source just showed how to use domain sockets and glossed over what happens underneath.

If you create a PF_UNIX socket of type SOCK_STREAM, and accept connections on it, then each time you accept a connection, you get a new file descriptor (as the return value of the accept system call). This file descriptor reads data from and writes data to a file descriptor in the client process. Thus it works just like a TCP/IP connection.
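For illustration, here is a minimal sketch of such a server in C (PF_UNIX and AF_UNIX are the same constant; the path /tmp/socket is the one from the question, and all error handling is omitted):

    /* Sketch: AF_UNIX stream server; each accept() yields a new fd per client. */
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    int main(void)
    {
        int srv = socket(AF_UNIX, SOCK_STREAM, 0);

        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, "/tmp/socket", sizeof(addr.sun_path) - 1);

        unlink("/tmp/socket");                    /* remove a stale name, if any */
        bind(srv, (struct sockaddr *)&addr, sizeof(addr));
        listen(srv, 5);

        for (;;) {
            int client = accept(srv, NULL, NULL); /* new fd for each client */
            char buf[256];
            ssize_t n = read(client, buf, sizeof(buf));
            if (n > 0)
                write(client, buf, n);            /* echo back to that client only */
            close(client);
        }
    }

Each connected client gets its own descriptor on the server side, and that descriptor is all the bookkeeping needed to tell the clients apart.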
There's no “unix domain protocol format”. There doesn't need to be, because a Unix-domain socket can't be connected to a peer over a network connection. In the kernel, the file descriptor representing your end of a SOCK_STREAM Unix-domain socket points to a data structure that tells the kernel which file descriptor is at the other end of the connection. When you write data to your file descriptor, the kernel looks up the file descriptor at the other end of the connection and appends the data to that other file descriptor's read buffer. The kernel doesn't need to put your data inside a packet with a header describing its destination.
For a SOCK_DGRAM socket, you have to tell the kernel the path of the socket that should receive your data, and it uses that to look up the file descriptor for that receiving socket.
If you bind a path to your client socket before you connect to the server socket (or before you send data if you're using SOCK_DGRAM), then the server process can get that path using getpeername (for SOCK_STREAM). For a SOCK_DGRAM, the receiving side can use recvfrom to get the path of the sending socket.
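A sketch of the client side of that, assuming the server above is listening on /tmp/socket (the client path /tmp/client.sock is made up for illustration; error handling omitted):

    /* Sketch: a stream client that binds its own path before connecting,
     * so the server can identify it with getpeername(). */
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    int main(void)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        struct sockaddr_un me = { .sun_family = AF_UNIX };
        strncpy(me.sun_path, "/tmp/client.sock", sizeof(me.sun_path) - 1);
        unlink(me.sun_path);
        bind(fd, (struct sockaddr *)&me, sizeof(me));   /* optional: gives us a name */

        struct sockaddr_un srv = { .sun_family = AF_UNIX };
        strncpy(srv.sun_path, "/tmp/socket", sizeof(srv.sun_path) - 1);
        connect(fd, (struct sockaddr *)&srv, sizeof(srv));

        write(fd, "hello", 5);
        /* On the server, getpeername() on the accepted fd now reports
         * "/tmp/client.sock"; without the bind() the peer name is empty. */
        close(fd);
        return 0;
    }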
If you don't bind a path, then the receiving process can't get an id that uniquely identifies the peer. At least, not on the Linux kernel I'm running (2.6.18-238.19.1.el5).

Related

What is the purpose of SOCK_DGRAM and SOCK_STREAM in the context of AF_UNIX sockets?

I understand that SOCK_DGRAM and SOCK_STREAM correspond to connectionless and connection-oriented network communication done using the INET address family.
Now I am trying to learn AF_UNIX sockets to carry out IPC between processes running on the same host, and there I see we still need to specify the socket type as SOCK_DGRAM or SOCK_STREAM. I am not able to understand, for AF_UNIX sockets, what the purpose of specifying the socket type is.
Can anyone please help me understand the significance of SOCK_DGRAM and SOCK_STREAM in the context of AF_UNIX sockets?
It happens that TCP is both a stream protocol and connection-oriented, whereas UDP is a datagram protocol and connectionless. However, it is possible to have a connection-oriented datagram protocol; that is what a block special file (or a Windows Mailslot) is.
(You can't have a connectionless stream protocol, though; it doesn't make sense, unless /dev/null counts.)
The flag SOCK_DGRAM does not mean the socket is connectionless, it means that the socket is datagram oriented.
A stream-oriented socket (and a character special file like /dev/random or /dev/null) provides (or consumes, or both) a continuous sequence of bytes, with no inherent structure. Structure is provided by interpreting the contents of the stream. Generally speaking there is only one process on either end of the stream.
A datagram-oriented socket provides (or consumes, or both) short messages which are limited in size and self-contained. Generally speaking, the server can receive datagrams from multiple clients using recvfrom (which provides the caller with an address to send replies to) and reply to them with sendto, specifying that address.
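For illustration, a sketch of that recvfrom/sendto pattern with an AF_UNIX datagram socket (the path /tmp/dgram.sock is made up; error handling omitted):

    /* Sketch: AF_UNIX datagram server replying to whichever client sent each message. */
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    int main(void)
    {
        int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, "/tmp/dgram.sock", sizeof(addr.sun_path) - 1);
        unlink(addr.sun_path);
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        for (;;) {
            char buf[512];
            struct sockaddr_un peer;
            socklen_t plen = sizeof(peer);

            /* Each recvfrom() returns one whole datagram plus the sender's
             * address (its bound path, if it bound one). */
            ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&peer, &plen);
            if (n < 0)
                break;

            /* Reply to that particular client (only possible if it bound a path). */
            sendto(fd, buf, n, 0, (struct sockaddr *)&peer, plen);
        }
        return 0;
    }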
The question also confused me for a while, but as Ben said, whether the socket type is SOCK_STREAM or SOCK_DGRAM, both give you the same way of doing inter-process communication between a client and a server; under the AF_UNIX domain it makes not one jot of difference.

Why does socketpair() allow SOCK_DGRAM type?

I've been learning about Linux socket programming recently, mostly from this site.
The site says that using the domain/type combination PF_LOCAL/SOCK_DGRAM...
Provides datagram services within the local host. Note that this
service is connectionless, but reliable, with the possible exception
that packets might be lost if kernel buffers should become exhausted.
My question, then, is why does socketpair(int domain, int type, int protocol, int sv[2]) allow this combination, when according to its man page...
The socketpair() call creates an unnamed pair of connected sockets in
the specified domain, of the specified type...
Isn't there a contradiction here?
I thought SOCK_DGRAM in the PF_LOCAL and PF_INET domains implied UDP, which is a connectionless protocol, so I can't reconcile the seeming conflict with socketpair()'s claim to create connected sockets.
Datagram sockets have "pseudo-connections". The protocol doesn't really have connections, but you can still call connect(). This associates a remote address and port with the socket, and then it only receives packets that come from that source, rather than all packets whose destination is the address/port that the socket is bound to, and you can use send() rather than sendto() to send back to this remote address.
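For illustration, a sketch with a UDP socket (the address 127.0.0.1 and port 9999 are made up; error handling omitted):

    /* Sketch: "connecting" a UDP socket just fixes its peer address,
     * after which plain send()/recv() can be used. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in peer;
        memset(&peer, 0, sizeof(peer));
        peer.sin_family = AF_INET;
        peer.sin_port = htons(9999);
        inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);

        connect(fd, (struct sockaddr *)&peer, sizeof(peer));

        send(fd, "ping", 4, 0);         /* no sendto() needed any more */

        char buf[64];
        recv(fd, buf, sizeof(buf), 0);  /* only datagrams from 127.0.0.1:9999 are delivered */

        close(fd);
        return 0;
    }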
An example where this might be used is the TFTP protocol. The server initially listens for incoming requests on the well-known port. Once a transfer has started, a different port is used, and the sender and receiver can use connect() to associate a socket with that pair of ports. Then they can simply send and receive on that new socket to participate in the transfer.
Similarly, if you use socketpair() with datagram sockets, it creates a pseudo-connection between the two sockets.
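For example, a minimal sketch (error handling mostly omitted):

    /* Sketch: an unnamed, connected pair of datagram sockets,
     * e.g. for parent/child IPC. */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
            return 1;

        /* Each write on one end arrives as a single datagram on the other. */
        write(sv[0], "first", 5);
        write(sv[0], "second", 6);

        char buf[64];
        ssize_t n = read(sv[1], buf, sizeof(buf)); /* reads "first" only */
        printf("got %zd bytes\n", n);

        close(sv[0]);
        close(sv[1]);
        return 0;
    }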

Linux UDP Socket: why select()?

I am new to Linux socket programming. Here I have a basic question:
for UDP, why do we need select()?
As UDP is stateless, a UDP server just handles whatever data it receives. There will be no new socket created when a new client sends data, right?
If so, select() will return/notify as soon as this socket has data. So we don't need to go through all the sockets to check which one is being notified (as there will be only one socket).
Is this true: non-blocking UDP socket + select() == blocking UDP socket?
Thanks!
The main benefit of select() is to be able to wait for input on multiple descriptors at once. So when you have multiple UDP sockets open, you put them all into the fd_set, call select(), and it will return when a packet is received on any of them. It updates the fd_set to indicate which ones have data available. You can also use it to wait for data from the network while also waiting for input from the user's terminal. Or you can handle both UDP and TCP connections in a single server (e.g. DNS servers can be accessed using either TCP or UDP).
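For illustration, a sketch that waits on two UDP sockets at once (the port numbers are invented; error handling omitted):

    /* Sketch: one select() call covering two UDP sockets. */
    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static int udp_listen(uint16_t port)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in a;
        memset(&a, 0, sizeof(a));
        a.sin_family = AF_INET;
        a.sin_addr.s_addr = htonl(INADDR_ANY);
        a.sin_port = htons(port);
        bind(fd, (struct sockaddr *)&a, sizeof(a));
        return fd;
    }

    int main(void)
    {
        int a = udp_listen(5000), b = udp_listen(5001);

        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(a, &rfds);
            FD_SET(b, &rfds);

            /* Blocks until at least one socket has a datagram waiting. */
            select((a > b ? a : b) + 1, &rfds, NULL, NULL, NULL);

            char buf[512];
            if (FD_ISSET(a, &rfds))
                recv(a, buf, sizeof(buf), 0);
            if (FD_ISSET(b, &rfds))
                recv(b, buf, sizeof(buf), 0);
        }
    }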
If you don't use select(), you would have to write a loop that continuously performs a non-blocking read on each socket. This is not as efficient, since it will spend lots of time performing unnecessary system calls (imagine a server that only gets one request a day, yet is continually calling recv() all day).
Your question seems to assume that the server can work with just one UDP socket. However, if the server has multiple IP addresses, it may need multiple sockets. UDP clients generally expect the response to come from the same IP they sent the request to. The standard socket API doesn't provide a way to know which IP the request was sent to, or to set the source address of the outgoing reply. So the common way to implement this is to open a separate socket bound to each IP, and use select() or epoll() to wait for a request on all of them concurrently. Then you send the reply through the same socket that the request was received on, and it will use that socket's bound IP as the source.
(Linux has socket extensions that make this unnecessary, see Setting the source IP for a UDP socket.)

Is there only one Unix Domain Socket in the communication between two processes?

There are two kinds of sockets: network sockets and Unix Domain Sockets.
When two processes communicate using network sockets, each process creates its own network socket, and the processes communicate over a connection between their sockets. There are two sockets, each belonging to a different process, and each is the connection endpoint for its process.
When two processes communicate using Unix Domain sockets, a Unix Domain socket is addressed by a filename in the filesystem.
Does that imply the two processes communicate by only one Unix Domain socket, instead of two?
Does the Unix Domain socket not belong to any process, i.e. is the Unix domain socket not a connection endpoint of any process, but somehow like a "middle point" between the two processes?
There are two sockets, one on each end of the connection. Each of them, independently, may or may not have a name in the filesystem.
The thing you see when you ls -l that starts with srwx is not really "the socket". It's a name that is bound to a socket (or was bound to a socket in the past - they don't automatically get removed when they're dead).
An analogy: think about TCP sockets. Most of them involve an endpoint with a well-known port number (22 SSH; 25 SMTP; 80 HTTP; etc.) A server creates a socket and binds to the well-known port. A client creates a socket and connects to the well-known port. The client socket also has a port number, which you can see in a packet trace (tcpdump/wireshark), but it's not a fixed number, it's just some number that was automatically chosen by the client's kernel because it wasn't already in use.
In unix domain sockets, the pathname is like the port number. If you want clients to be able to find your server socket, you need to bind it to a well-known name, like /dev/log or /tmp/.X11-unix/X0. But the client doesn't need to have a well-known name, so normally it doesn't do a bind(). Therefore the name /tmp/.X11-unix/X0 is only associated with the server socket. You can confirm this with netstat -x. About half the sockets listed will have pathnames, and the other half won't. Or write your own client/server pair, and call getsockname() on the client. Its name will be empty, while getsockname() on the server gives the pathname.
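A sketch of that check (it borrows the /tmp/.X11-unix/X0 path from above, so it assumes something is actually listening there; error handling omitted):

    /* Sketch: a client that never calls bind() has no pathname of its own. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        struct sockaddr_un srv = { .sun_family = AF_UNIX };
        strncpy(srv.sun_path, "/tmp/.X11-unix/X0", sizeof(srv.sun_path) - 1);
        connect(fd, (struct sockaddr *)&srv, sizeof(srv));

        struct sockaddr_un me;
        socklen_t len = sizeof(me);
        memset(&me, 0, sizeof(me));
        getsockname(fd, (struct sockaddr *)&me, &len);
        /* For an unbound client, len comes back as just sizeof(sa_family_t):
         * there is no local path at all. */
        printf("local name: \"%s\" (len %u)\n", me.sun_path, (unsigned)len);

        struct sockaddr_un peer;
        len = sizeof(peer);
        memset(&peer, 0, sizeof(peer));
        getpeername(fd, (struct sockaddr *)&peer, &len);
        printf("peer name: \"%s\"\n", peer.sun_path);  /* the server's bound path */

        close(fd);
        return 0;
    }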
The ephemeral port number automatically assigned to a TCP client has no counterpart in unix domain socket addresses. In TCP it's necessary to have a local port number so incoming packets can be matched to the correct socket. Unix domain sockets are linked directly in their kernel data structures, so there's no need. A client can be connected to a server and have no name.
And then there's socketpair() which creates 2 unix domain sockets connected to each other, without giving a name to either of them.
(Not mentioned here, and not really interesting: the "abstract" namespace.)

How reliable are unix domain sockets?

I'm trying to figure out a protocol to use with domain sockets and can't find information on how blindly domain sockets can be trusted.
Can data be lost? Are messages always received in the same order as sent? Even when using datagram sockets?
Are transfers atomic? When reading the socket, can I trust that I get the whole message on one read or do I have to check it myself?
From 'man AF_UNIX':
Valid types are: SOCK_STREAM, for a stream-oriented socket and
SOCK_DGRAM, for a datagram-oriented socket that preserves message
boundaries (as on most Unix implementations, Unix domain datagram sockets
are always reliable and don’t reorder datagrams);
