One or more UDP sockets in a real-time system - multithreading

I'm developing a real-time system for elevators, and I need some advice on the implementation of the network module. The network module is based on UDP broadcasting, and there are several things it needs to do:
Send a periodic heartbeat, so each node knows which elevators are alive
Broadcast new order/job
Broadcast if the elevator is doing the order/job
So far I've implemented only one UDP socket, which makes the payload of each message quite large. Each message contains a JSON object with the following attributes:
STATUS
newOrders
takenOrders
orderQueue
My question is: should I spawn multiple threads, each running UDP on a different port, for each elevator? If so, I could use a combination of one to send heartbeats, one to deal out orders, and one to broadcast the order queue. What are the pros and cons of running multiple UDP sockets on each client?
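For reference, here is a minimal sketch (in Python, with made-up field names) of the single-socket approach described in the question: one broadcast socket can carry every kind of message if each JSON payload is tagged with a type, which is one alternative to splitting traffic across ports.

```python
import json
import socket

def make_heartbeat(elevator_id, order_queue):
    """Build a heartbeat message; the field names here are illustrative."""
    return json.dumps({
        "type": "HEARTBEAT",   # the tag lets one socket carry every message kind
        "id": elevator_id,
        "orderQueue": order_queue,
    }).encode("utf-8")

def send_broadcast(payload, port=20017):
    """Broadcast a datagram on the local network (port number is arbitrary)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(payload, ("255.255.255.255", port))
```

With this layout a receiver dispatches on the "type" field, so heartbeats, new orders, and taken orders can share one port without each needing its own socket.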

Related

Parsing TCP payloads using a custom spec

My goal is to create a parser for TCP packets that use a custom spec from the Options Price Reporting Authority found here, but I have no idea where to start. I've never worked with low-level stuff, and I'd appreciate some guidance.
The big problem is that I don't have access to the actual network feed, because it costs a huge sum per month, so all I can work from is the specification. I don't even know if it's possible. Do you parse each byte step by step and hope for the best? Do you first re-create some example data using the byte layouts in the spec and then parse it? Isn't that also difficult, since (I think) TCP spreads the data across multiple segments?
That's quite an elaborate data feed. A quick review of the spec shows that it contains enough information to write a program in either nodejs or golang to ingest it.
Getting it to work will be a big job. Your question didn't mention your level of programming skill, or of your network engineering skill. So it's hard to guess how much learning lies ahead of you to get this done.
A few things.
It's a complex enough protocol that you will need to test it with correctly formatted sample data. You need a fairly large collection of sample packets in order to mock your data feed (that is, build a fake data feed for testing purposes). While nothing is impossible, it will be very difficult to build a bug-free program to handle this data without extensive tests.
If you have a developer relationship to the publisher of the data feed, you should ask if they offer sample data for testing.
It is not a TCP/IP data feed. It is an IP multicast datagram feed. With an IP multicast feed you set up a receiver to listen for the incoming data packets. They use multicast to achieve the very low latencies necessary for predatory algorithmic trading.
You won't use TCP sockets to receive it; you'll use a different programming interface: UDP datagrams.
If you're used to TCP's automatic recovery from errors, datagrams will be a challenge. With datagrams you cannot tell if you failed to receive data except by looking at sequence numbers. Most data feeds using IP and multicast have some provision for retransmitting data. Your spec is no exception. You must handle retransmitted data correctly or it will look like you have lots of duplicate data.
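As a sketch of the sequence-number bookkeeping described above (not the OPRA wire format itself, which you would take from the spec), here is one way to detect duplicates from retransmission and gaps from loss:

```python
class SequenceTracker:
    """Track datagram sequence numbers so retransmissions and gaps are visible.
    A real feed handler would also request retransmits per the spec."""

    def __init__(self):
        self.expected = None   # next sequence number we want
        self.seen = set()      # numbers already processed (dedup window)
        self.gaps = []         # (first_missing, last_missing) ranges

    def accept(self, seq):
        """Return True if this datagram is new, False if it is a duplicate."""
        if seq in self.seen:
            return False       # retransmitted copy of data we already have
        self.seen.add(seq)
        if self.expected is not None and seq > self.expected:
            # datagrams expected .. seq-1 were lost; record the gap
            self.gaps.append((self.expected, seq - 1))
        self.expected = seq + 1
        return True
```

In production the `seen` set would be bounded (e.g. a sliding window), but the principle is the same: without this bookkeeping, retransmitted data looks like duplicates and lost data goes unnoticed.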
Multicast data doesn't move over the public network. You'll need a virtual private network connection to the publisher, or to co-locate your servers in a data center where the feed is available on an internal network.
There's another, operational, spec you'll need to cope with to get this data. It's called the Common IP Multicast Distribution Network Recipient Interface Specification. This spec has a primer on the multicast dealio.
You can do this. When you have made it work, you will have gained some serious skills in network programming and network engineering.
But if you just want this data, you might try to find a reseller of the data that repackages it in an easier-to-consume format. That reseller probably also imposes a delay on the data feed.

Advantages of multiple UDP sockets

Hi, this question is from a test I had recently:
(code of a server using one thread for read actions and N threads for write actions, where N is the number of write actions that need to be done right now)
Will using multiple UDP sockets (one for each client) over a single one (one for all of them) have any advantages?
The official answer:
No, because the server is using one thread for read/write per client, which won't make it more efficient (students who addressed buffer overflow got full points).
My question is: under any circumstances, will changing a single UDP socket to many have an efficiency impact?
Thanks
Maybe yes, if you are using BSD systems (OSX, FreeBSD, etc.). BSD systems do not behave well when sending a lot of UDP packets on the same socket. If the sending queue of the socket is full, the packets get dropped rather than blocking the writer; a write on a UDP socket never blocks on BSD systems. If you use 10,000 UDP sockets sending to the same address, no packets will be dropped on the sending host. So there is definitely an advantage, since some OSs do crazy queue management.

Does each queue on ZeroMQ require its own port?

We are looking to build a facade in nodejs that will accept requests from a client and then farm them out, using the request/reply pattern, to a number of different backend services. We want these requests held on individual queues in the event that one of the backend services is down. From an initial reading of the ZeroMQ docs, it appears each queue is bound to its own port. When sending a message to a socket, there doesn't appear to be a way of naming a queue/topic to send to.
Is there a one-one mapping between ports and queues?
Thanks, Tom
ZeroMQ doesn't have the concept of "queues" or "topics". Your application consists of tasks, connected across some protocol, e.g. tcp://, and sending each other messages in various patterns. In your example one task will bind to an address:port and the workers will connect to it. The sender then sends requests to its socket, which deals them out to workers.
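A minimal sketch of that bind/connect pattern, assuming the pyzmq bindings and using the in-process transport so it runs in one process (a real deployment would use tcp:// addresses):

```python
import zmq

def round_trip(request: bytes) -> bytes:
    """One request/reply exchange over ZeroMQ's in-process transport.
    There is no named queue: the REP side binds an address, the REQ side
    connects to it, and messages flow between the sockets directly."""
    ctx = zmq.Context.instance()
    rep = ctx.socket(zmq.REP)
    rep.bind("inproc://backend")    # the address plays the role a "queue name" would
    req = ctx.socket(zmq.REQ)
    req.connect("inproc://backend")

    req.send(request)
    work = rep.recv()               # the worker side receives the request
    rep.send(b"reply:" + work)      # and answers on the same socket
    result = req.recv()

    req.close()
    rep.close()
    return result
```

The point is that the address (here "inproc://backend", or an address:port over TCP) is the rendezvous; queueing happens inside the sockets, not in a separately named broker queue.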
The best way to learn ZeroMQ is to work through at least the first couple of chapters of the Guide, before you design your own application. Many of the existing messaging concepts you're familiar with disappear into simpler patterns with ZeroMQ.

Epoll for many short living UDP requests

I want to build an application which handles many UDP exchanges with different servers simultaneously and without blocking. Each exchange should just send one short message (around 80 bytes including the UDP header) and receive one packet; they are not intended to do more than these two operations. So my first goal is to handle as many requests per second as possible. I thought of using epoll() for this, but then I read that epoll is primarily meant for servers with long-lived client-server sessions, so I'm unsure whether it makes sense in my case. What is the best technique? (I use Linux)
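A sketch of the readiness-loop approach the question is considering, in Python (the selectors module picks epoll on Linux automatically; the addresses and payloads are whatever your protocol uses):

```python
import selectors
import socket

def exchange_all(requests, timeout=1.0):
    """Send one datagram per (addr, payload) pair and collect one reply each,
    using a single readiness loop over all the sockets."""
    sel = selectors.DefaultSelector()
    replies = {}
    pending = 0
    for addr, payload in requests:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setblocking(False)
        sock.sendto(payload, addr)
        sel.register(sock, selectors.EVENT_READ, addr)  # remember who we asked
        pending += 1
    while pending:
        events = sel.select(timeout)
        if not events:
            break                        # remaining requests timed out
        for key, _mask in events:
            data, _src = key.fileobj.recvfrom(2048)
            replies[key.data] = data     # one reply per exchange, then done
            sel.unregister(key.fileobj)
            key.fileobj.close()
            pending -= 1
    sel.close()
    return replies
```

Since UDP has no sessions, "short-lived" costs nothing here: each socket is registered for exactly one reply and then dropped, and epoll handles thousands of such one-shot exchanges in one thread.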

How does an asynchronous socket server work?

I should state that I'm not asking about specific implementation details (yet), but just a general overview of what's going on. I understand the basic concept behind a socket, and need clarification on the process as a whole. My (probably very wrong) understanding is currently this:
A socket is constantly listening for clients that want to connect (in its own thread). When a connection occurs, an event is raised that spawns another thread to perform the connection process. During the connection process the client is assigned its own socket through which to communicate with the server. The server then waits for data from the client, and when data arrives an event is raised which spawns a thread to read the data from a stream into a buffer.
My questions are:
How off is my understanding?
Does each client socket require its own thread to listen for data on?
How is data routed to the correct client socket? Is this something taken care of by the guts of TCP/UDP/kernel?
In this threaded environment, what kind of data is typically being shared, and what are the points of contention?
Any clarifications and additional explanation would be greatly appreciated.
EDIT:
Regarding the question about what data is typically shared and points of contention, I realize this is more of an implementation detail than it is a question regarding general process of accepting connections and sending/receiving data. I had looked at a couple implementations (SuperSocket and Kayak) and noticed some synchronization for things like session cache and reusable buffer pools. Feel free to ignore this question. I've appreciated all your feedback.
One thread per connection is bad design (not scalable, overly complex) but unfortunately way too common.
A socket server works more or less like this:
A listening socket is set up to accept connections, and added to a socket set
The socket set is checked for events
If the listening socket has pending connections, new sockets are created by accepting the connections, and then added to the socket set
If a connected socket has events, the relevant IO functions are called
The socket set is checked for events again
This happens in one thread; you can easily handle thousands of connected sockets in a single thread, and there are few valid reasons for making this more complex by introducing threads.
while running
    select on socketset
    for each socket with events
        if socket is listener
            accept new connected socket
            add new socket to socketset
        else if socket is connection
            if event is readable
                read data
                process data
            else if event is writable
                write queued data
            else if event is closed connection
                remove socket from socketset
            end
        end
    done
done
The IP stack takes care of all the details of which packets go to which socket in which order. Seen from the application's point of view, a socket represents a reliable ordered byte stream (TCP) or an unreliable unordered sequence of packets (UDP).
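The pseudocode above maps almost line-for-line onto a real readiness loop; a minimal sketch in Python using the stdlib selectors module, with echoing standing in for the "process data" step:

```python
import selectors
import socket

def run_echo_server(listener, max_clients=1):
    """Single-threaded readiness loop: the listening socket and every
    connected socket live in the same selector (the "socket set")."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    served = 0
    while served < max_clients:
        for key, _mask in sel.select():
            if key.fileobj is listener:
                conn, _addr = listener.accept()   # pending connection
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                data = key.fileobj.recv(2048)
                if data:
                    key.fileobj.sendall(data)     # "process data": echo it back
                else:                             # closed connection
                    sel.unregister(key.fileobj)
                    key.fileobj.close()
                    served += 1
    sel.close()
```

The `max_clients` cutoff exists only so the sketch terminates; a real server loops forever, and the one thread serves every connection in the set.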
EDIT: In response to updated question.
I don't know either of the libraries you mention, but on the concepts you mention:
A session cache typically keeps data associated with a client, and can reuse this data for multiple connections. This makes sense when your application logic requires state information, but it's a layer higher than the actual networking end. In the above sample, the session cache would be used by the "process data" part.
Buffer pools are also an easy and often effective optimization for a high-traffic server. The concept is very easy to implement: instead of allocating/deallocating space for the data you read/write, you fetch a preallocated buffer from a pool, use it, then return it to the pool. This avoids the (sometimes relatively expensive) underlying allocation/deallocation mechanisms. This is not specific to networking; you can just as well use buffer pools for, e.g., something that reads chunks of files and processes them.
How off is my understanding?
Pretty far.
Does each client socket require its own thread to listen for data on?
No.
How is data routed to the correct client socket? Is this something taken care of by the guts of TCP/UDP/kernel?
TCP/IP is a number of layers of protocol. There's no "kernel" to it. It's pieces, each with a separate API to the other pieces.
The IP address is handled in one place.
The port # is handled in another place.
The IP addresses are matched up with MAC addresses to identify a particular host. The port # is what ties a TCP (or UDP) socket to a particular piece of application software.
In this threaded environment, what kind of data is typically being shared, and what are the points of contention?
What threaded environment?
Data sharing? What?
Contention? The physical channel is the number one point of contention. (Ethernet, for example depends on collision-detection.) After that, well, every part of the computer system is a scarce resource shared by multiple applications and is a point of contention.
