Reusing a port number in a UDP socket - Linux

In ASIO, is it possible to create another socket that has the same source port as an existing socket?
My UDP server application is calling receive_from using port 3000. It passes the packet
off to a worker thread which will send the response (currently using a dynamic source port).
The socket in the other thread is created like this:
udp::socket sock2(io_service, udp::endpoint(udp::v4(), 0));
And responds to the original request using the sender_endpoint saved with the original packet.
What I'd like to be able to do is respond to the client using the same source port that the server is listening on, but I can't see how that can be done. If I try it, I get an "address in use" exception. Is it possible to do what I'm asking? The reason I want this is that if I use dynamic ports, clients need to add special firewall rules on Windows to allow the reply packets back in. I've found that if the source port in the reply is the same, Windows Firewall lets it through.

The exception tells it like it is: you can't create two live sockets with the same source port. I don't know ASIO, but you should be able to create the socket before spinning off the thread, keep references to the socket and the thread for later use, and once the data-sending thread is idle, join it and send any further data over the same socket.
EDIT: with a little bit of effort, you can also arrange things so that you don't have to wait until all the data from one thread has been sent: have a worker thread that owns the socket listen on a queue for chunks of data (ideally exactly the size of the payload you intend to send), and push payload chunks onto that queue from as many threads as you like.
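For what it's worth, here is one way that queue idea could look with Boost.Asio and C++11 threads. This is only an illustrative sketch (the Outgoing and UdpSender names are made up, and error handling is omitted): a single worker thread owns the socket and drains the queue, and any other thread just calls post().

#include <boost/asio.hpp>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

using boost::asio::ip::udp;

struct Outgoing {
    std::vector<char> payload;     // one datagram's worth of data
    udp::endpoint     destination; // where to send it
};

class UdpSender {
public:
    explicit UdpSender(udp::socket& sock) : sock_(sock), worker_([this] { run(); }) {}
    ~UdpSender() {
        { std::lock_guard<std::mutex> l(m_); done_ = true; }
        cv_.notify_one();
        worker_.join();
    }

    // Called from any thread: queue one payload for the sender thread.
    void post(Outgoing msg) {
        { std::lock_guard<std::mutex> l(m_); q_.push(std::move(msg)); }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> l(m_);
            cv_.wait(l, [this] { return done_ || !q_.empty(); });
            if (done_ && q_.empty()) return;
            Outgoing msg = std::move(q_.front());
            q_.pop();
            l.unlock();
            // Only this thread ever calls send_to on the socket.
            sock_.send_to(boost::asio::buffer(msg.payload), msg.destination);
        }
    }

    udp::socket&            sock_;
    std::mutex              m_;
    std::condition_variable cv_;
    std::queue<Outgoing>    q_;
    bool                    done_ = false;
    std::thread             worker_; // declared last so the other members exist before the thread starts
};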

You should be able to use the SO_REUSEADDR socket option to bind multiple sockets to the same address. But having said that, you don't want to do this, because it's not specified which socket will receive incoming data on that port (you would have to check all sockets for incoming data).
The better option is just to use the same socket to send replies - this can safely be done from multiple threads without any additional synchronisation (as you are using UDP).
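For example, a minimal sketch of that approach with Boost.Asio and C++11 (the echoed payload is just a placeholder for whatever reply you actually build):

#include <boost/asio.hpp>
#include <array>
#include <thread>

using boost::asio::ip::udp;

int main() {
    boost::asio::io_service io_service;
    udp::socket sock(io_service, udp::endpoint(udp::v4(), 3000)); // the one server socket

    std::array<char, 1500> buf;
    udp::endpoint sender_endpoint;
    std::size_t n = sock.receive_from(boost::asio::buffer(buf), sender_endpoint);

    // Hand the request to a worker, but reply on the SAME socket:
    // the reply then leaves with source port 3000, which is the port
    // the client's firewall already expects.
    std::thread worker([&sock, sender_endpoint, buf, n] {
        sock.send_to(boost::asio::buffer(buf.data(), n), sender_endpoint); // echo as a stand-in
    });
    worker.join();
}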

Send the reply on the same socket (the one you received the client's request on) instead of creating a new one,
but make sure you don't send on that socket from both threads simultaneously.

Related

TCP socket connection becomes unreliable with SO_REUSEADDR

I have an app where a single client talks to a single server. Normally, the client does a single connect, and then calls send repeatedly, and there's no problem.
However, I need to do a version where the client sets up a connection for each individual send (a bit like HTTP with and without keep-alive). In this version, the client calls socket, connect, send once, and then close.
The problem with this is that I very quickly run out of ephemeral client ports, and the connect fails. To get around this I call setsockopt with SO_REUSEADDR, and then bind to port 0, before calling connect (see here, for example).
This works, except that the TCP connection is no longer reliable. I get occasional incorrect data, presumably because there's still data around when the TCP connection is closed.
Is there any way to make this reliable (and fast)? shutdown before close doesn't help. Maybe I can get select to tell me if the socket is ready for output, but that seems like overkill.
Do you have to use TCP? If so, you will probably have to maintain an open connection and route your messages over that one connection.
There is SCTP, which may be a good fit for your use case - a reliable datagram protocol:
Like TCP, SCTP provides reliable, connection oriented data delivery with congestion control. Unlike TCP, SCTP also provides message boundary preservation, ordered and unordered message delivery, multi-streaming and multi-homing. Detection of data corruption, loss of data and duplication of data is achieved by using checksums and sequence numbers. A selective retransmission mechanism is applied to correct loss or corruption of data.
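For reference, opening a one-to-one style SCTP socket on Linux looks roughly like this (a sketch only; it assumes the kernel's SCTP support is available, and the address and port are made up):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP); /* TCP-style one-to-one association */

    struct sockaddr_in srv;
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_port = htons(5000);                /* illustrative port */
    inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);

    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) == 0) {
        const char msg[] = "hello";
        send(fd, msg, sizeof msg, 0);          /* SCTP preserves this message boundary */
    }
    close(fd);
    return 0;
}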

What kind of server needs select

I know that a server normally opens one port and listens on it.
Today I learnt that there is a function select on Unix-like systems. With select we can listen on multiple sockets.
I just can't imagine a case where we need to use select. If we have two sockets, it means that we are listening on two ports, right? So I have a question:
What kind of server would open more than one port but receive and process the same type of requests?
Using select helps with handling reads and writes on multiple sockets. It doesn't have to be multiple server sockets. The most typical use is for multiplexing a large number of client sockets.
You have a server with one listening socket. Each time you accept a connection, you add the new client socket to the multiplexing pool. select then returns any time any of those sockets has data available to read. The big win is that you're doing all this with one thread.
You also get a socket for each connection that you've accepted on the listening (server) socket.
Selecting among these (client) sockets and the server socket (readable => new connection) allows you to write apps such as chat servers efficiently.
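A bare-bones sketch of that pattern (plain POSIX sockets, error handling omitted, port number made up):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(7000);              /* illustrative port */
    bind(listener, (struct sockaddr *)&addr, sizeof addr);
    listen(listener, 16);

    fd_set master;
    FD_ZERO(&master);
    FD_SET(listener, &master);
    int maxfd = listener;

    for (;;) {
        fd_set readable = master;             /* select() modifies its argument */
        select(maxfd + 1, &readable, NULL, NULL, NULL);

        for (int fd = 0; fd <= maxfd; ++fd) {
            if (!FD_ISSET(fd, &readable)) continue;
            if (fd == listener) {             /* readable listener => new connection */
                int client = accept(listener, NULL, NULL);
                FD_SET(client, &master);
                if (client > maxfd) maxfd = client;
            } else {                          /* readable client => data (or EOF) */
                char buf[512];
                ssize_t n = recv(fd, buf, sizeof buf, 0);
                if (n <= 0) { close(fd); FD_CLR(fd, &master); }
                else        { send(fd, buf, (size_t)n, 0); } /* echo, as a placeholder */
            }
        }
    }
}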
Ummm... remember the difference between ports and sockets.
A "port" is like a telephone-number. But a single phone-number could be handling any number of "calls!"
A "socket," then, represents a single telephone-call: a currently active connection between this server and a particular client. Each connection, by definition, "takes place over a particular port," but any number of connections might exist at the same time.
(The "accept" operation corresponds to: picking up the phone.)
So, then, what select() buys you is the ability to monitor any number of sockets at one time. It examines all the sockets, waits (if necessary) for something to happen on any one of them, and returns one message to you. Now, the design of your server becomes "a simple loop." No matter how many sockets you're listening to, and no matter how many of them have messages waiting, select() will return messages to you one at a time.
It's basically the case that "every server out there will use a select() loop at its heart, unless there's an exceptionally wonderful reason not to."
Take a look here:
One traditional way to write network servers is to have the main
server block on accept(), waiting for a connection. Once a connection
comes in, the server fork()s, the child process handles the connection
and the main server is able to service new incoming requests.
With select(), instead of having a process for each request, there is
usually only one process that "multi-plexes" all requests, servicing
each request as much as it can.
So one main advantage of using select() is that your server will only
require a single process to handle all requests. Thus, your server
will not need shared memory or synchronization primitives for
different 'tasks' to communicate.
One major disadvantage of using select(), is that your server cannot
act like there's only one client, like with a fork()'ing solution. For
example, with a fork()'ing solution, after the server fork()s, the
child process works with the client as if there was only one client in
the universe -- the child does not have to worry about new incoming
connections or the existence of other sockets. With select(), the
programming isn't as transparent.
http://www.lowtek.com/sockets/select.html

Node clustering with websockets

I have a node cluster where the master responds to http requests.
The server also listens for websocket connections (via socket.io). A client connects to the server via the said websocket. Now the client chooses between various games (with each node process handling a game).
The questions I have are the following:
Should I open a new connection for each node process? How do I tell the client that he should connect to the exact node process X? (Because the server might handle incoming connection requests on its own.)
Is it possible to pass a socket to a node process, so that there is no need for opening a new connection?
What are the drawbacks if I just use one connection (in the master process) and pass the user messages to the respective node processes and the process messages back to the user? (I feel that it costs a lot of CPU to copy rather big objects when sending messages between the processes)
Is it possible to pass a socket to a node process, so that there is no
need for opening a new connection?
You can send a plain TCP socket to another node process as described in the node.js doc here. The basic idea is this:
const child = require('child_process').fork('child.js');
child.send('socket', socket);
Then, in child.js, you would have this:
process.on('message', (m, socket) => {
  if (m === 'socket') {
    // you have a socket here
  }
});
The 'socket' message identifier can be any message name you choose - it is not special. node.js has logic such that when you use child.send() and the data you are sending is recognized as a socket, it uses platform-specific interprocess communication to share that socket with the other process.
But, I believe this only works for plain sockets that do not yet have any local state established other than the TCP state. I have not tried it with an established webSocket connection myself, but I assume it does not work for that, because once a webSocket has higher-level state associated with it beyond just the TCP socket (such as encryption keys), there's a problem: the OS will not automatically transfer that state to the new process.
Should I open a new connection for each node process? How to tell the
client that he should connect to the exact node process X? (Because
the server might handle incoming connection-requests on its on)
This is probably the simplest means of getting a socket.io connection to the new process. If you make sure that your new process is listening on a unique port number and that it supports CORS, then you can just take the socket.io connection you already have between the master process and the client and send a message to the client on it that tells the client where to reconnect to (what port number). The client can then contain code to listen for that message and make a connection to that new destination.
What are the drawbacks if I just use one connection (in the master
process) and pass the user messages to the respective node processes
and the process messages back to the user? (I feel that it costs a lot
of CPU to copy rather big objects when sending messages between the
processes)
The drawbacks are as you surmise. Your master process just has to spend CPU energy being the middle man forwarding packets both ways. Whether this extra work is significant to you depends entirely upon the context and has to be determined by measurement.
Here's some more info I discovered. It appears that if an incoming socket.io connection that arrives on the master is immediately shipped off to a cluster child before the connection establishes its initial socket.io state, then this concept could work for socket.io connections too.
Here's an article on sending a connection to another server with implementation code. This appears to be done immediately at connection time, so it should work for an incoming socket.io connection that is destined for a specific cluster child. The idea here is that there's sticky assignment to a specific cluster process, and all incoming connections of any kind that reach the master are immediately transferred over to the cluster child before they establish any state.

Linux IPC multiple clients with daemon

This is really basic but I am blanking right now.
I have a daemon process and would like to have multiple clients be able to talk to it. I would like a client to be able to start up and then using a shared library, essentially 'register' with the daemon process. The daemon process would spawn a thread off for this new client and provide a communication pipe between the client and new thread.
I am thinking of a Unix datagram socket as a 'registration channel' for all clients to use initially, and then switching over to a client-specific channel, but I cannot figure out how to create unique names for the new datagram sockets without setting them up a priori.
The server and clients are on the same machine; I'd prefer to use datagram sockets so I don't have to deal with breaking a stream up into packets.
I will be sending small messages back and forth at a (very) high rate.
You can entirely avoid the problem of naming the client sockets, if you wish. Each client can create a connected pair of sockets using socketpair(). The client then sends one of the socket descriptors to the server over your well-known "registration channel". The server and client then have a private, connected, unnamed pair of sockets for their communication.
The socket descriptor is sent to the server using sendmsg(), with the descriptor carried in the msg's control message.
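A rough sketch of that descriptor-passing step (illustrative only; send_fd and register_with_daemon are made-up helper names, and error handling is minimal):

#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Send the descriptor fd_to_send over the already-connected UNIX socket chan. */
static int send_fd(int chan, int fd_to_send) {
    char dummy = 'r';                         /* must send at least one byte of real data */
    struct iovec iov = { &dummy, 1 };

    char ctrl[CMSG_SPACE(sizeof(int))];
    memset(ctrl, 0, sizeof ctrl);

    struct msghdr msg;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof ctrl;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type  = SCM_RIGHTS;              /* "this control message carries descriptors" */
    cm->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(chan, &msg, 0) < 0 ? -1 : 0;
}

/* Client side: make a private pair and register one end with the daemon. */
int register_with_daemon(int registration_sock) {
    int pair[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, pair) < 0) return -1;
    if (send_fd(registration_sock, pair[1]) < 0) return -1;
    close(pair[1]);                           /* the daemon now holds its own copy */
    return pair[0];                           /* the client talks on this end */
}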
These two answers have some relevant info/links:
How would I use a socket to have several processes communicate with a central process?
Sending file descriptor over UNIX domain socket, and select()
Basically I think you need to compromise and have a 2 stage process with a SOCK_STREAM socket as stage 1 and SOCK_DGRAM as stage 2.
So it will be like this:
server:
1. create a SOCK_STREAM socket "my.daemon.handshake"
2. accept a client
3. send a randomly generated string XXX to the client and close the socket
4. create a SOCK_DGRAM socket "my.daemon.XXX" and start processing it
5. repeat (2)

client:
1. connect to socket "my.daemon.handshake"
2. read to EOF -- get value XXX
3. start communicating with the server on socket "my.daemon.XXX"
4. profit!!!!
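A rough server-side sketch of the handshake above (illustrative only: the paths under /tmp and the token format are arbitrary choices, and error handling is omitted):

#include <sys/socket.h>
#include <sys/un.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int bind_unix(int type, const char *path) {
    int fd = socket(AF_UNIX, type, 0);
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    unlink(path);                                   /* remove a stale socket file */
    bind(fd, (struct sockaddr *)&addr, sizeof addr);
    return fd;
}

int main(void) {
    int handshake = bind_unix(SOCK_STREAM, "/tmp/my.daemon.handshake");
    listen(handshake, 8);

    for (;;) {
        int client = accept(handshake, NULL, NULL);

        char token[17], path[64];
        snprintf(token, sizeof token, "%016lx", (unsigned long)random()); /* the "XXX" value */
        snprintf(path, sizeof path, "/tmp/my.daemon.%s", token);

        int dgram = bind_unix(SOCK_DGRAM, path);    /* per-client datagram socket */
        write(client, token, strlen(token));        /* tell the client the suffix ... */
        close(client);                              /* ... and let it see EOF */

        /* hand dgram off to a worker thread here, then go back to accept() */
        (void)dgram;
    }
}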

a UDP socket based rateless file transmission

I'm new to socket programming and I need to implement a UDP based rateless file transmission system to verify a scheme in my research. Here is what I need to do:
I want a server S to send a file to a group of peers A, B, C, etc. The file is divided into a number of packets. At the beginning, peers send a Request message to the server to initiate transmission. Whenever S receives a request from a client, it ratelessly transmits encoded packets (the encoding scheme is my own design; it has erasure-correction capability, which is why I can transmit ratelessly via UDP) to that client. The client keeps collecting packets and tries to decode them. When it finally decodes all packets and reconstructs the file successfully, it sends back a Stop message to the server, and S stops transmitting to this client.
Peers request the file asynchronously (they may request the file at different times), and the server will have to be able to serve multiple peers concurrently. The encoded packets for different clients are different (they are all encoded from the same set of source packets, though).
Here is what I'm thinking about for the implementation. I don't have much experience with Unix network programming though, so I'm wondering if you can help me assess it and see whether it is feasible or efficient.
I'm going to implement the server as a concurrent UDP server with two sockets/ports (similar to TFTP, according to the UNP book). One is for receiving control messages, which in my context are the Request and Stop messages. The server will maintain a flag (initially 1) for each request. When it receives a Stop message from the client, the flag will be set to 0.
When the server receives a request, it will fork() a new process that uses the second socket and port to send encoded packets to the client. The server keeps sending packets to the client as long as the flag is 1. When it turns 0, the sending stops.
The client program is easy to do. Just send a Request, recvfrom() the server, progressively decode the file and send a Stop message in the end.
Is this design workable? My main concerns are: (1) is forking multiple processes efficient, or should I use threads? (2) If I have to use multiple processes, how can the flag be seen by the child process? Thanks for your comments.
Using UDP for file transfer is not the best idea. There is no way for the server or client to know if a packet has been lost, so you would only find out during reconstruction, assuming you have some mechanism (like a counter) to detect lost packets. It would then be hard to request just the packets that got lost, and in the end you would have code that does what TCP sockets already do. So I suggest starting with TCP.
The typical design of a server involves a listener thread that spawns a worker thread whenever there is a new client request. The new thread handles communication with that particular client and then ends. You should keep a limit on the number of clients (threads) that are served simultaneously. Do not spawn a new process for each client - that is inefficient and not needed, as it gets you nothing that you can't achieve with threads.
Thread programming requires carefulness, so do not cut corners. Otherwise you will have a hard time finding and diagnosing problems.
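As an illustration of that listener/worker pattern with a cap on concurrent clients (a C++11 sketch; the limit and port are made up, and the echo loop stands in for the real per-client work):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <condition_variable>
#include <mutex>
#include <thread>

static std::mutex m;
static std::condition_variable cv;
static int active_clients = 0;
static const int kMaxClients = 64;          // illustrative limit

static void serve_client(int fd) {
    char buf[512];
    ssize_t n;
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        send(fd, buf, (size_t)n, 0);        // echo, as a stand-in for real work
    close(fd);
    { std::lock_guard<std::mutex> l(m); --active_clients; }
    cv.notify_one();
}

int main() {
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(6000);            // illustrative port
    bind(listener, (sockaddr *)&addr, sizeof addr);
    listen(listener, 16);

    for (;;) {
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [] { return active_clients < kMaxClients; }); // enforce the limit
        ++active_clients;
        l.unlock();

        int fd = accept(listener, nullptr, nullptr);
        std::thread(serve_client, fd).detach();     // worker handles this client, then exits
    }
}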
File transfer with UDP will be fun :(
Your struct/class for each message should contain a sequence number and a checksum. This should enable each client to detect, and ask for the retransmission of, any missing blocks at the end of the transfer.
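For example, a block header along those lines might look like this (the field names, sizes and the trivial checksum are illustrative rather than prescribed; a real implementation might use CRC32 instead):

#include <stdint.h>
#include <stddef.h>

#pragma pack(push, 1)
struct BlockHeader {
    uint32_t file_id;     // which transfer this block belongs to
    uint32_t seq;         // block sequence number, starting at 0
    uint16_t payload_len; // number of valid payload bytes that follow
    uint16_t checksum;    // checksum of the payload
};
#pragma pack(pop)

// Trivial additive checksum, purely for illustration.
static uint16_t checksum16(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += data[i];
    return (uint16_t)(sum & 0xFFFF);
}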
Where UDP might be a huge winner is on a local LAN. You could UDP-broadcast the entire file to all clients at once and then, at the end, ask each client in turn which blocks it has missing and send just those. I wish Kaspersky etc. would use such a scheme for updating all my local boxes.
I have used such a broadcast scheme on a CAN bus network where there are dozens of microcontrollers that need new images downloaded. Software upgrades take minutes instead of hours.
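A minimal sketch of the broadcast side (the subnet broadcast address and port are made up, and real code should check the setsockopt and sendto return values):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof on); // required before broadcasting

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9000);                               // illustrative port
    inet_pton(AF_INET, "192.168.1.255", &dst.sin_addr);       // subnet broadcast address

    const char block[] = "one file block";                    // stand-in for an encoded block
    sendto(fd, block, sizeof block, 0, (struct sockaddr *)&dst, sizeof dst);
    close(fd);
    return 0;
}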
