How does Linux select() work? - linux

Could someone explain to me how select() works? My mental model is wrong and I cannot work it out from the man page.
If I have a client with multiple sockets connected to different servers, and I periodically read some information from each server, how does the kernel know which socket to mark as readable? How does it know on which socket read() will not block? It seems impossible to predict without actually reading data from the server.

The kernel isn't predicting anything. It's telling you the current status of the socket's receive buffer. If the buffer is not empty, the socket is readable. If the buffer is empty, select() waits. When a packet arrives from the server, the kernel matches it to the correct socket using the IP address, protocol, and port numbers. The packet is put into the socket's receive queue and select() is notified that the status has changed.
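A minimal sketch of the behaviour described above, in Python, whose select and socket modules wrap the same system calls. The socketpair() stands in for a client/server connection and is an assumption made only to keep the example self-contained.

```python
import select
import socket

# A connected pair of sockets stands in for a client/server connection.
client, server = socket.socketpair()

# Nothing has been sent yet, so the client's receive buffer is empty.
# A timeout of 0 makes select() poll and return immediately.
readable_before, _, _ = select.select([client], [], [], 0)

# The "server" writes; the kernel queues the bytes on the client's socket.
server.sendall(b"hello")

# The receive buffer is now non-empty, so select() reports the socket.
readable_after, _, _ = select.select([client], [], [], 1.0)

# select() said data is waiting, so this recv() does not block.
data = client.recv(1024)

print(readable_before, readable_after == [client], data)

client.close()
server.close()
```

With several sockets in the first list, select() returns only those whose receive buffers currently hold data.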

Related

linux unix sockets vs tcp sockets send buffer size

I am comparing how many bytes a single send call can transfer on a TCP socket and on a Unix domain socket.
For the Unix domain socket the number is always 219264, but for TCP it is much higher. Why is there a difference? Both programs run on the same machine.
Note: the sockets are in non-blocking mode.
I checked the buffer sizes; these are the values:
Unix domain socket
receive buffer size = 212992
send buffer size = 212992
TCP socket
receive buffer size = 1062000
send buffer size = 2626560
Can someone explain why there is this difference?
The TCP buffer holds packets that have been sent but not yet acknowledged by the other end, and packets that were received out of order and are waiting for delayed packets to arrive before being presented to the application. Packets also stay in the buffer as long as the consuming application doesn't read() the data.
Over Unix sockets, neither waiting for ACKs nor packet ordering is an issue, so the buffer can be smaller.
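For reference, the buffer sizes can be queried with getsockopt(). This is a small Python sketch; the helper name buffer_sizes is made up for illustration, and the numbers printed depend on the machine's settings, so they will not necessarily match the ones in the question (in particular, the TCP socket here is unconnected, so it shows the initial defaults).

```python
import socket

def buffer_sizes(sock):
    """Return (receive, send) buffer sizes the kernel reports for sock."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd

# One Unix domain stream socket pair and one (unconnected) TCP socket.
unix_a, unix_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

unix_sizes = buffer_sizes(unix_a)
tcp_sizes = buffer_sizes(tcp)

print("unix domain socket: receive=%d send=%d" % unix_sizes)
print("TCP socket:         receive=%d send=%d" % tcp_sizes)

for s in (unix_a, unix_b, tcp):
    s.close()
```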

a UDP socket based rateless file transmission

I'm new to socket programming and I need to implement a UDP based rateless file transmission system to verify a scheme in my research. Here is what I need to do:
I want a server S to send a file to a group of peers A, B, C, etc. The file is divided into a number of packets. At the beginning, peers send a Request message to the server to initiate transmission. Whenever S receives a request from a client, it ratelessly transmits encoded packets to that client (the encoding is my own design and has erasure-correction capability, which is why I can transmit ratelessly via UDP). The client keeps collecting packets and tries to decode them. When it finally decodes all packets and reconstructs the file successfully, it sends a Stop message back to the server and S stops transmitting to this client.
Peers request the file asynchronously (they may request it at different times), and the server has to be able to serve multiple peers concurrently. The encoded packets for different clients are different (they are all encoded from the same set of source packets, though).
Here is what I'm thinking about the implementation. I have not much experience with unix network programming though, so I'm wondering if you can help me assess it, and see if it is possible or efficient.
I'm going to implement the server as a concurrent UDP server with two sockets and ports (similar to TFTP, according to the UNP book). One receives control messages, which in my context are the Request and Stop messages. The server will maintain a flag (initially 1) for each request. When it receives a Stop message from the client, the flag will be set to 0.
When the server receives a request, it will fork() a new process that uses the second socket and port to send encoded packets to the client. The server keeps sending packets to the client as long as the flag is 1. When it turns to 0, the sending ends.
The client program is easy to do. Just send a Request, recvfrom() the server, progressively decode the file and send a Stop message in the end.
Is this design workable? My main concerns are: (1) is forking multiple processes efficient, or should I use threads? (2) If I have to use multiple processes, how can the child process learn the value of the flag? Thanks for your comments.
Using UDP for file transfer is not the best idea. Neither the server nor the client can tell that a packet has been lost, so you would only find out during reconstruction, assuming you have some mechanism (such as a counter) to detect lost packets. It would then be hard to request just the packets that got lost. In the end you would have code that does what TCP sockets already do. So I suggest starting with TCP.
A typical server design involves a listener thread that spawns a worker thread whenever there is a new client request. The new thread handles communication with that particular client and then ends. You should keep a limit on the number of clients (threads) served simultaneously. Do not spawn a new process for each client: that is inefficient and unnecessary, as it gets you nothing that you can't achieve with threads.
Thread programming requires care, so do not cut corners. Otherwise you will have a hard time finding and diagnosing problems.
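The listener-plus-workers design can be sketched as follows in Python. This is only an illustration under several assumptions: it uses a single UDP socket rather than the two ports from the question, the message strings REQUEST, STOP and SHUTDOWN and the function names are made up, the "encoded packets" are placeholder byte strings, and SHUTDOWN exists only so the demo can terminate. A threading.Event plays the role of the per-client flag.

```python
import socket
import threading

def send_until_stopped(sock, client_addr, stop_flag):
    """Worker: keep sending packets to one client until its flag is set."""
    seq = 0
    while not stop_flag.is_set():
        sock.sendto(b"packet %d" % seq, client_addr)  # placeholder payload
        seq += 1
        stop_flag.wait(0.01)  # crude pacing between packets

def serve(sock, max_clients=8):
    """Listener: start a worker on REQUEST, signal it on STOP."""
    workers = {}  # client address -> (thread, stop flag)
    while True:
        msg, addr = sock.recvfrom(64)
        if msg == b"REQUEST" and addr not in workers and len(workers) < max_clients:
            flag = threading.Event()
            t = threading.Thread(target=send_until_stopped, args=(sock, addr, flag))
            workers[addr] = (t, flag)
            t.start()
        elif msg == b"STOP" and addr in workers:
            t, flag = workers.pop(addr)
            flag.set()
            t.join()
        elif msg == b"SHUTDOWN":  # hypothetical: lets the demo exit cleanly
            for t, flag in workers.values():
                flag.set()
                t.join()
            return

def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
    listener = threading.Thread(target=serve, args=(server,))
    listener.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    client.bind(("127.0.0.1", 0))
    client.sendto(b"REQUEST", server.getsockname())
    received = [client.recvfrom(64)[0] for _ in range(3)]
    client.sendto(b"STOP", server.getsockname())
    client.sendto(b"SHUTDOWN", server.getsockname())
    listener.join()
    client.close()
    server.close()
    return received

received = demo()
print(received)
```

Because the workers are threads in the same process, the flag is simply shared memory; with fork() the child gets its own copy and some form of inter-process communication would be needed instead.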
File transfer with UDP will be fun :(
Your struct/class for each message should contain a sequence number and a checksum. This should enable each client to detect, and ask for the retransmission of, any missing blocks at the end of the transfer.
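One possible layout for such a message, sketched in Python. The header format (two 32-bit fields), the use of CRC32 as the checksum, and the function names are all assumptions chosen for illustration.

```python
import struct
import zlib

# Header: sequence number and CRC32 of the payload, both unsigned 32-bit,
# in network byte order.
HEADER = struct.Struct("!II")

def pack_block(seq, payload):
    """Build one datagram: header followed by the payload bytes."""
    return HEADER.pack(seq, zlib.crc32(payload)) + payload

def unpack_block(datagram):
    """Return (seq, payload), or None if the checksum does not match."""
    seq, crc = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:]
    if zlib.crc32(payload) != crc:
        return None
    return seq, payload

def missing_blocks(received_seqs, total):
    """Sequence numbers in range(total) that were never received."""
    return sorted(set(range(total)) - set(received_seqs))

datagram = pack_block(7, b"abc")
print(unpack_block(datagram))
print(missing_blocks([0, 2, 3], 5))
```

At the end of the transfer the client would send the result of missing_blocks() back to the server to ask for retransmission.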
Where UDP might be a huge winner is on a local LAN. You could UDP-broadcast the entire file to all clients at once and then, at the end, ask each client in turn which blocks it has missing and send just those. I wish Kaspersky etc. would use such a scheme for updating all my local boxes.
I have used such a broadcast scheme on a CAN bus network with dozens of microcontrollers that need new images downloaded. Software upgrades take minutes instead of hours.

Reusing a port number in a UDP

In ASIO, is it possible to create another socket that has the same source port as an existing socket?
My UDP server application is calling receive_from using port 3000. It passes the packet off to a worker thread which will send the response (currently using a dynamic source port).
The socket in the other thread is created like this:
udp::socket sock2(io_service, udp::endpoint(udp::v4(), 0));
And responds to the original request using the sender_endpoint saved with the original packet.
What I'd like to be able to do is respond to the client using the same source port as the server is listening on, but I can't see how that can be done. If I try, I get an exception saying the address is in use. Is it possible to do what I'm asking? The reason I want this is that with dynamic ports, clients need to add special firewall rules in Windows to allow the reply packets to be read. I've found that if the source port of the reply is the same, Windows Firewall will allow it to pass back in.
The exception tells you as it is: you can't create two live sockets with the same source port. I don't know ASIO, but you should be able to create the socket before spinning off the thread, keeping reference to the socket and the thread for later use, and once the data sending thread is idle, joining back to it and sending any other stuff.
EDIT: with a little bit of effort, you can also make a socket for which you don't have to wait until the entire data from one thread has been sent: have a worker thread owning the socket listen on a queue for chunks of data (ideally exactly the size of the payload you intend to send) and send arbitrary chunks of payload to this queue, from multiple threads.
You should be able to use the SO_REUSEADDR socket option to bind multiple sockets to the same address. Having said that, you don't want to do this, because it's not specified which socket will receive incoming data on that port (you would have to check all sockets for incoming data).
The better option is just to use the same socket to send replies - this can safely be done from multiple threads without any additional synchronisation (as you are using UDP).
Send the reply on the same socket that you received the client's request on instead of creating a new one, but make sure you don't send on the same socket from both threads simultaneously.
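A minimal sketch of replying from the receiving socket, written with plain Python sockets rather than ASIO. Port 0 (let the OS choose) replaces port 3000 from the question only so the example runs anywhere; the variable names are illustrative.

```python
import socket

# The server socket: receives requests and also sends the replies.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
server_port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"request", ("127.0.0.1", server_port))

# Receive the request and remember who sent it.
request, sender_endpoint = server.recvfrom(1024)

# Reply on the SAME socket, so the reply's source port is the listening port.
server.sendto(b"reply", sender_endpoint)

reply, reply_source = client.recvfrom(1024)
print(reply, reply_source[1] == server_port)

client.close()
server.close()
```

The client sees the reply coming from the port it originally sent to, which is what the firewall behaviour in the question depends on.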

Sending from the same UDP socket in multiple threads

I have multiple threads which need to send UDP packets to different IP addresses (only to send, nothing needs to be received). Can I reuse the same UDP socket in all the threads?
Yes, I think you can.
Each packet is sent out individually, so the order in which they are received will be nondeterministic, but that is already the case with UDP.
So sending from multiple threads on the same socket is fine.
Although, if you're doing other stuff with the socket, such as bind(), close(), then you could end up with race conditions, so you might want to be careful.
System calls are supposed to be atomic, so formally it seems fine for UDP. Then again, kernels have bugs too, and you are inviting all sorts of nasty surprises. Why can't you use a socket per thread? It's not like TCP, where you need a connection. As an added bonus you'd get a separate send buffer for each descriptor.
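A small Python sketch of the shared-socket approach. It assumes a local receiver on loopback just so the result can be observed; the thread count, message format and function name are arbitrary.

```python
import socket
import threading

# Local receiver, only here so the sent datagrams can be counted.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
dest = receiver.getsockname()

# One UDP socket shared by all sending threads.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def send_batch(thread_id, count):
    """Send `count` datagrams tagged with this thread's id."""
    for i in range(count):
        sender.sendto(b"%d:%d" % (thread_id, i), dest)

threads = [threading.Thread(target=send_batch, args=(n, 5)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each sendto() produced exactly one whole datagram; arrival order may vary.
received = sorted(receiver.recvfrom(64)[0] for _ in range(20))
print(len(received), "datagrams received")

sender.close()
receiver.close()
```

Each datagram arrives intact because one sendto() call maps to one UDP packet; only the interleaving between threads is unspecified.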

How does buffering work in sockets on Linux

How does buffering work with sockets on Linux?
For example, what happens if the server does not read from the socket while the client keeps sending data?
How big is the socket's buffer? And will the client find out, so that it stops sending?
With a UDP socket the client will never know: the server side will just start dropping packets once its receive buffer is full.
TCP, on the other hand, implements flow control. The server's kernel gradually shrinks the advertised receive window, so the client can send less and less data, until the window reaches zero. The client's send buffer then fills up; after that a blocking send(2) blocks, while a non-blocking one fails with EAGAIN/EWOULDBLOCK.
TCP sockets use buffering in the protocol stack. The stack itself implements flow control so that if the server's buffer is full, it will stop the client stack from sending more data. Your code will see this as a blocked call to send(). The buffer size can vary widely from a few kB to several MB.
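The effect can be observed with a short Python sketch, assuming a loopback TCP connection whose server side never reads and a client switched to non-blocking mode so that a full buffer shows up as an exception instead of a stalled call. The amount queued before that happens depends on the kernel's buffer settings.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server, _ = listener.accept()  # accepted, but never read from

# Non-blocking: a full send buffer raises instead of blocking.
client.setblocking(False)

chunk = b"x" * 65536
queued = 0
blocked = False
while not blocked:
    try:
        queued += client.send(chunk)  # may accept only part of the chunk
    except BlockingIOError:           # EAGAIN/EWOULDBLOCK: no room left
        blocked = True

print("bytes accepted by the kernel before send() would block:", queued)

client.close()
server.close()
listener.close()
```

In blocking mode the same loop would simply stop inside send() until the server started reading.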
I'm assuming that you're using send() and recv() for client and server communication.
send() returns the number of bytes that were actually sent. This is not necessarily equal to the number of bytes you wanted to send, so it's up to you to notice this and send the rest.
recv() returns the number of bytes read into the buffer. If recv() returns 0, the peer has closed the connection.
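Both points can be sketched in Python, where send() returns a byte count and recv() returns an empty bytes object when the peer has closed. The helper name send_all is made up for illustration (Python's own sock.sendall() does the same job), and the socketpair() stands in for a real client/server connection.

```python
import socket

def send_all(sock, data):
    """Call send() repeatedly until every byte has been handed to the kernel."""
    sent = 0
    while sent < len(data):
        # send() may accept fewer bytes than offered; continue from there.
        sent += sock.send(data[sent:])
    return sent

a, b = socket.socketpair()

message = b"y" * 10000
total = send_all(a, message)
a.close()  # the sender is done; the reader will see end-of-stream

received = b""
while True:
    part = b.recv(4096)
    if not part:  # empty result (0 bytes): the peer closed the connection
        break
    received += part
b.close()

print(total, len(received))
```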
