I'm new to socket programming and I need to implement a UDP-based rateless file transmission system to verify a scheme in my research. Here is what I need to do:
I want a server S to send a file to a group of peers A, B, C, etc. The file is divided into a number of packets. At the beginning, a peer sends a Request message to the server to initialize transmission. Whenever S receives a request from a client, it ratelessly transmits encoded packets to that client (the encoding is my own design; it has erasure-correction capability, which is why I can transmit ratelessly over UDP). The client keeps collecting packets and tries to decode them. When it finally decodes all the packets and reconstructs the file successfully, it sends back a Stop message, and S stops transmitting to that client.
Peers request the file asynchronously (they may request the file at different times), so the server has to be able to serve multiple peers concurrently. The encoded packets for different clients are different (they are all encoded from the same set of source packets, though).
Here is what I'm thinking about for the implementation. I don't have much experience with Unix network programming, though, so I'm wondering if you can help me assess it and see whether it is feasible and efficient.
I'm going to implement the server as a concurrent UDP server with two sockets/ports (similar to TFTP, as described in the UNP book). One receives control messages, which in my context are the Request and Stop messages. The server will maintain a flag (initially 1) for each request. When it receives a Stop message from a client, that client's flag is set to 0.
When the server receives a Request, it will fork() a new process that uses the second socket and port to send encoded packets to that client. The server keeps sending packets to the client as long as the flag is 1; when it turns to 0, the sending ends.
The client program is straightforward: send a Request, recvfrom() the server, progressively decode the file, and send a Stop message at the end.
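Roughly, this is the client loop I have in mind (the decoder calls and the plain-text "REQUEST"/"STOP" messages are just placeholders for my own scheme):

#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>

/* Stand-ins for my rateless decoder (hypothetical interface). */
static int packets_seen;
static void decoder_feed(const char *pkt, ssize_t len) { (void)pkt; (void)len; packets_seen++; }
static int  decoder_done(void) { return packets_seen >= 100; }  /* pretend 100 packets suffice */

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in srv = {0};
    srv.sin_family = AF_INET;
    srv.sin_port = htons(9000);                  /* control port (assumed) */
    inet_pton(AF_INET, "192.0.2.1", &srv.sin_addr);

    sendto(fd, "REQUEST", 7, 0, (struct sockaddr *)&srv, sizeof srv);

    char buf[1500];
    for (;;) {
        ssize_t n = recvfrom(fd, buf, sizeof buf, 0, NULL, NULL);
        if (n <= 0)
            break;
        decoder_feed(buf, n);                    /* collect encoded packets */
        if (decoder_done())                      /* file reconstructed */
            break;
    }
    sendto(fd, "STOP", 4, 0, (struct sockaddr *)&srv, sizeof srv);
    close(fd);
    return 0;
}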
Is this design workable? My main concerns are: (1) is forking multiple processes efficient, or should I use threads? (2) If I do use multiple processes, how can the flag be seen by the child process? Thanks for your comments.
Using UDP for file transfer is not the best idea. There is no way for the server or client to know whether any packet has been lost, so you would only find out during reconstruction, assuming you have some mechanism (like a counter) to detect lost packets. It would then be hard to request just the packets that got lost. In the end you would have code that does what TCP sockets already do, so I suggest starting with TCP.
A typical server design involves a listener thread that spawns a worker thread whenever a new client request arrives. That new thread handles communication with that particular client and then ends. You should cap the number of clients (threads) served simultaneously. Do not spawn a new process for each client: that is inefficient and unnecessary, as it gains you nothing you can't achieve with threads.
Thread programming requires care, so do not cut corners; otherwise you will have a hard time finding and diagnosing problems.
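To make that concrete, here is a rough sketch of such a server, assuming plain-text "REQUEST"/"STOP" messages on the control socket and a placeholder encode_next_packet(). The per-client stop flag is just shared memory between the listener and the workers, which is exactly what threads give you for free over fork():

#include <arpa/inet.h>
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_CLIENTS 64

struct client {
    struct sockaddr_in addr;   /* where to send encoded packets */
    atomic_int active;         /* the "flag": 1 = keep sending, 0 = stop */
};

static struct client clients[MAX_CLIENTS];

/* Placeholder: produce the next rateless-encoded packet for this client. */
static size_t encode_next_packet(char *buf, size_t cap) { memset(buf, 0, cap); return cap; }

static void *worker(void *arg)
{
    struct client *c = arg;
    /* TFTP-style: each worker sends from its own (ephemeral) data port. */
    int data_fd = socket(AF_INET, SOCK_DGRAM, 0);
    char pkt[1200];

    while (atomic_load(&c->active)) {
        size_t n = encode_next_packet(pkt, sizeof pkt);
        sendto(data_fd, pkt, n, 0, (struct sockaddr *)&c->addr, sizeof c->addr);
        /* optionally pace the sending here */
    }
    close(data_fd);
    return NULL;
}

int main(void)
{
    int ctrl_fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in me = {0};
    me.sin_family = AF_INET;
    me.sin_addr.s_addr = htonl(INADDR_ANY);
    me.sin_port = htons(9000);                       /* control port (assumed) */
    bind(ctrl_fd, (struct sockaddr *)&me, sizeof me);

    for (;;) {
        char msg[64];
        struct sockaddr_in peer;
        socklen_t plen = sizeof peer;
        ssize_t n = recvfrom(ctrl_fd, msg, sizeof msg - 1, 0,
                             (struct sockaddr *)&peer, &plen);
        if (n <= 0)
            continue;
        msg[n] = '\0';

        if (strcmp(msg, "REQUEST") == 0) {
            for (int i = 0; i < MAX_CLIENTS; i++) {
                if (!atomic_load(&clients[i].active)) {
                    clients[i].addr = peer;
                    atomic_store(&clients[i].active, 1);
                    pthread_t tid;
                    pthread_create(&tid, NULL, worker, &clients[i]);
                    pthread_detach(tid);
                    break;
                }
            }
        } else if (strcmp(msg, "STOP") == 0) {
            for (int i = 0; i < MAX_CLIENTS; i++)
                if (atomic_load(&clients[i].active) &&
                    clients[i].addr.sin_addr.s_addr == peer.sin_addr.s_addr &&
                    clients[i].addr.sin_port == peer.sin_port)
                    atomic_store(&clients[i].active, 0);
        }
        /* NOTE: a real server must also handle slot reuse and worker
         * cleanup carefully; this sketch skips that. */
    }
}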
File transfer with UDP will be fun :(
Your struct/class for each message should contain a sequence number and a checksum. This should enable each client to detect, and ask for the retransmission of, any missing blocks at the end of the transfer.
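For example, something along these lines (field sizes are illustrative; any checksum, e.g. a CRC32 over the payload, would do):

#include <stdint.h>

#define BLOCK_SIZE 1024

/* One UDP message: the sequence number lets the receiver spot gaps,
 * the checksum lets it discard corrupted blocks. Sizes are illustrative. */
struct file_block {
    uint32_t seq;                 /* block index within the file */
    uint32_t checksum;            /* e.g. CRC32 over payload */
    uint16_t payload_len;         /* bytes actually used in payload */
    uint8_t  payload[BLOCK_SIZE];
};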
Where UDP might be a huge winner is on a local LAN. You could UDP-broadcast the entire file to all clients at once and then, at the end, ask each client in turn which blocks it is missing and send just those. I wish Kaspersky etc. would use such a scheme for updating all my local boxes.
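Enabling broadcast on the sending socket is just one socket option; a minimal sketch (address and port are placeholders):

#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    /* Broadcasting must be enabled explicitly. */
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof on);

    struct sockaddr_in dst = {0};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(5000);                          /* placeholder port */
    inet_pton(AF_INET, "192.168.1.255", &dst.sin_addr);  /* subnet broadcast */

    const char block[1024] = {0};                        /* one file block */
    sendto(fd, block, sizeof block, 0, (struct sockaddr *)&dst, sizeof dst);
    close(fd);
    return 0;
}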
I have used such a broadcast scheme on a CANBUS network where there are dozens of microcontrollers that need new images downloaded. Software upgrades take minutes instead of hours.
Related
I am working with NodeJS. My understanding is that if Node.js receives many requests, it processes them one after another in a queue. But if it receives n requests, say 4 requests that reach Node.js at the same time without any gap, which one will Node.js pick first to serve? What are the criteria and the reason for selecting one request out of many arriving at the same time?
Since all four requests arrive on the same physical internet connection, one of the requests' packets will get there before the others. As the packets converge on the last router before your server, one of them will get processed slightly before the others, and that packet will arrive at your server first. That packet will then reach the TCP stack in the OS first, which will notify node.js about it first, and node.js will start to process that first request. Since node.js runs your request handlers on a single thread, if the handler doesn't call anything asynchronous, it will send a response for the first request before it starts processing the second request.
If the first request's handler has non-blocking, asynchronous portions, then as soon as it makes an asynchronous call and returns control to the node.js event loop, the second request can start being processed.
But, if it receives n requests for example say 4 requests reached nodejs at same time without any gap in time, then which one will nodejs pick first to serve?
This is not possible. As the packets from each of the requests converge on the last router before your server, they will eventually get sequenced one after the other on the Ethernet connection to your server. The Ethernet connection doesn't send 4 requests in parallel; it sends packets one after the other.
So, your server will see one of the incoming packets before the others. Also, keep in mind that an incoming HTTP request is not just a single packet. It consists of establishing a TCP connection (with the back and forth that entails) and then the client sending the actual HTTP request over the TCP connection that has been established. If you're using HTTPS, there is even more involved in establishing the connection. So the whole notion of four incoming connections arriving at exactly the same moment is not possible. Even if it were (imagine you had four network cards with four physical connections to the internet), the underlying operating system would end up servicing one of the incoming network cards before the others. Whether it's a hardware interrupt at the lowest level or a polling loop, one of the network cards is going to be found to have incoming data before the others.
What is the criteria and reason to select any request from many requests at same time?
It doesn't work that way. The OS doesn't suddenly realize it has four requests that arrived at exactly the same moment and then run some algorithm to choose which request to serve first. Instead, some low-level hardware/software element (probably in an upstream router) will have forced the incoming packets into an order, either based on minute timing differences or just on how its software works (it checks hardware port A, then port B, then port C, for example), and one will physically arrive before the others at your server. This is not something your server gets to decide.
I have the following scenario:
A local PC receives data samples via Bluetooth at 50,000 bits/sec. The data is sent via UDP to some server. The server in turn distributes the data via a web page/JavaScript and WebSockets to connected browsers, where the data is processed. Eventually, the results arriving from the browsers are passed via UDP back to the local PC.
So far I'm experimenting with a strictly local setup, i.e. everything runs on one machine, which has a CPU with four cores. I've written server code in both node.js and golang. In both cases there is significant data loss, i.e. not every sample that is sent via UDP is successfully received by the server, even when only one WebSocket client is connected.
Where is the bottleneck causing the loss? Is it the fact that everything runs on a local machine? Could it be that the WebSocket bandwidth is too small? Would I be better off with WebRTC? Or is it something else entirely?
It is hard to say where exactly the bottleneck in your case is.
But UDP is an unreliable protocol (it can lose data), while WebSockets (which run over TCP) are not. This means the messages are probably being lost by a process that reads or writes the UDP data. Such packet loss might occur, for example, because these apps are too slow in general to read the data, or because the socket buffers are too small to absorb fluctuations in reading/writing speed caused by process scheduling or similar.
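If the readers can't always keep up, enlarging the UDP receive buffer is cheap to try; the kernel may cap it (see net.core.rmem_max on Linux), and both Go and Node expose the same option (SetReadBuffer on a UDPConn, setRecvBufferSize on a dgram socket). At the C socket level it looks roughly like this:

#include <stdio.h>
#include <sys/socket.h>

/* Ask the kernel for a larger receive buffer on UDP socket `fd`.
 * The kernel may clamp the value to its configured maximum. */
static void enlarge_rcvbuf(int fd)
{
    int want = 4 * 1024 * 1024;   /* 4 MiB, arbitrary example */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof want) < 0)
        perror("setsockopt(SO_RCVBUF)");

    int got = 0;
    socklen_t len = sizeof got;
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
    printf("receive buffer is now %d bytes\n", got);
}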
I am trying to write an internal transport system.
Data should be transferred from client to server using net sockets.
It is working fine except for the handling of network issues.
If I place a firewall between the client and server, neither side sees any error, so data will continue to fill the kernel buffer on the client side.
And if I restart the app at that moment, I will lose all the data in that buffer.
Questions:
Do we have any way to detect network issues?
Do we have any way to get data back from kernel buffers?
Node.js exposes the low-level socket API to you very directly. I'm assuming that you are using a TCP socket to send and receive data.
One way to ensure that there is an active connection between the client and server is to send heartbeat signals back and forth. If you fail to receive a heartbeat from the server while sending data, you can assume that the connection failed.
As for the second part of your question: There is no easy way to get data back from kernel buffers. If losing the data will be a problem, I would make sure to write it to disk.
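To make the heartbeat idea concrete, here is a rough sketch at the plain C socket level (the same logic can be written with Node's net module); the "PING"/"PONG" markers and the timeout are arbitrary, and a real protocol must keep heartbeats distinguishable from normal data:

#include <poll.h>
#include <unistd.h>

/* Returns 0 if the peer looks alive, -1 if no reply arrived in time.
 * `fd` is a connected TCP socket. */
static int heartbeat_check(int fd, int timeout_ms)
{
    write(fd, "PING", 4);                     /* tiny application-level probe */

    struct pollfd p = { .fd = fd, .events = POLLIN };
    if (poll(&p, 1, timeout_ms) <= 0)
        return -1;                            /* timeout or error: assume dead */

    char buf[16];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n <= 0)
        return -1;                            /* peer closed or connection reset */
    return 0;                                 /* got something back: still alive */
}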
I am learning the TCP/IP stack and client-server connections. I wrote a simple client and server, and they were able to transfer data to each other without any issues. I am running the client and server on the same machine. When I closed the server with Ctrl+C, I found the kernel was sending an RST instead of a FIN. (Please refer to my question: Active closure of server sockets.)
With a little more investigation, I realized one of my clients was in a read call and the corresponding server thread was in an infinite while loop doing nothing (some buggy, dumb coding on my part). When I removed that infinite while loop, I saw the expected behavior: FIN being sent in both directions.
So I am wondering why the TCP layer was sending RST in the first case.
Eventually, you give up on waiting for the other end to accept the data.
I want to build an application that handles many UDP exchanges with different servers simultaneously and in a non-blocking way. Each exchange should just send one short message (around 80 bytes including the UDP header) and receive one packet; it is not intended to do more than these two operations. So my first goal is to achieve as many requests per second as possible. I thought of using epoll() to realize my idea, but when I read that epoll is primarily meant for servers with long-lived client-server sessions, I was unsure whether it would make sense in my case. What is the best technique? (I use Linux.)
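For reference, this is roughly the epoll() pattern I had in mind, sketched with one non-blocking socket per outstanding request (addresses, port, and payload are placeholders):

#include <arpa/inet.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define NREQ 3                               /* how many servers to query */

int main(void)
{
    const char *servers[NREQ] = { "192.0.2.1", "192.0.2.2", "192.0.2.3" };
    int ep = epoll_create1(0);

    /* One non-blocking UDP socket per outstanding request; send right away. */
    for (int i = 0; i < NREQ; i++) {
        int fd = socket(AF_INET, SOCK_DGRAM | SOCK_NONBLOCK, 0);
        struct sockaddr_in dst = {0};
        dst.sin_family = AF_INET;
        dst.sin_port = htons(7000);          /* placeholder port */
        inet_pton(AF_INET, servers[i], &dst.sin_addr);
        connect(fd, (struct sockaddr *)&dst, sizeof dst);  /* fix peer address */
        send(fd, "query", 5, 0);             /* the ~80-byte request */

        struct epoll_event ev;
        ev.events = EPOLLIN;
        ev.data.fd = fd;
        epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
    }

    /* Collect the replies as they arrive, in any order. */
    int pending = NREQ;
    while (pending > 0) {
        struct epoll_event evs[NREQ];
        int n = epoll_wait(ep, evs, NREQ, -1);
        for (int i = 0; i < n; i++) {
            char buf[1500];
            ssize_t len = recv(evs[i].data.fd, buf, sizeof buf, 0);
            (void)len;                       /* buf[0..len) holds the reply */
            epoll_ctl(ep, EPOLL_CTL_DEL, evs[i].data.fd, NULL);
            close(evs[i].data.fd);
            pending--;
        }
    }
    close(ep);
    return 0;
}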