Linux Socket Input Notification on a Client-Server application

This question is a follow-up to this question and answer.
I'm building a minimalistic remote access program on Linux using SSL and socket programming.
The problem arose in the following protocol chain:
Client sends a command
Server receives it
Server spawns a child with stdin, stdout and stderr dup2-ed to the server-client socket (so input and output flow directly through the socket)
Server waits for the child, then waits for a new command
When using SSL, you cannot use plain read and write operations directly, which means the child using the SSL socket will send plain data (because it won't use SSL_write or SSL_read, but the client will, and this creates problems).
So, as you can read in the answer, one solution would be to create three additional sets of local sockets that only the server and its child share, so the data can flow unencrypted between them, and only then send it to the client with a proper SSL call.
So the question is: how do I even know when the child wants to read, so I can ask for input from the client? And how do I know when the child outputs something, so I can forward it to the client?
I suppose some threads should be created that monitor and put locks on the SSL structure to keep the order, but I still can't imagine how the server would get notified when the child application hits a scanf("%d") or something similar.

To illustrate what needs to be done, I'll use the following Python program. I've used Python only because it is easy to read; the same can be done in C, only with more lines of code that are harder to read.
Let's first do some initialization, i.e. create a socket server, an SSL context, accept a new client, wrap the client fd into an SSL socket and do some initial communication between client and socket server. Based on your previous question you probably know how to do this in C already, and the Python code is not that far away from what you do in C:
import socket
import ssl
import select
import os
local_addr = ('',8888) # where we listen
cmd = ['./cmd.pl'] # do some command reading stdin, writing stdout
ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
ctx.load_cert_chain('server_cert_and_key.pem')
srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(local_addr)
srv.listen(10)
try:
    cl,addr = srv.accept()
except:
    pass
cl = ctx.wrap_socket(cl,server_side=True)
print("new connection from {}".format(addr))
buf = cl.read(1024)
print("received '{}'".format(buf))
cl.write(b"hi!\n")
After this setup is done and we have an SSL connection to the client, we fork. The forked program reads from stdin and writes to stdout, and we want to forward the decrypted input from the SSL client to the program's stdin and forward the program's output, encrypted, to the SSL client. To do this we create a socketpair for exchanging data between the parent process and the forked program, and remap one side of the socketpair to stdin/stdout in the forked child before execv-ing the program. This also works very similarly in C:
print("forking subprocess")
tokid, toparent = socket.socketpair()
pid = os.fork()
if pid == 0:
    tokid.close()
    # child: remap stdin/stdout and exec cmd
    os.dup2(toparent.fileno(),0)
    os.dup2(toparent.fileno(),1)
    toparent.close()
    os.execv(cmd[0],cmd)
    # will not return
# parent process
toparent.close()
Now we need to read the data from the SSL client and the forked command and forward each to the other side. While one could probably do this with blocking reads inside threads, I prefer an event-based approach using select. The syntax for select in Python is a bit different (i.e. simpler) than in C, but the idea is exactly the same: the way we call it, it will return when we have data from either the client or the forked command, or after one second with no data:
# parent: wait for data from client and subprocess and forward these to
# subprocess and client
done = False
while not done:
    readable,_,_ = select.select([ cl,tokid ], [], [], 1)
    if not readable:
        print("no data for one second")
        continue
Since readable is not empty we have new data waiting for us to read. In Python we use recv on the file handle, in C we would need to use SSL_read on the SSL socket and recv or read on the plain socket (from socketpair). After the data is read we write it to the other side. In Python we can use sendall, in C we would need to use SSL_write on the SSL socket and send or write on the plain socket - and we would also need to make sure that all data were written, i.e. maybe try multiple times.
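The "maybe try multiple times" caveat is exactly what sendall hides. A minimal sketch of such a write-everything loop over a plain socket (send_all is my own name, not part of the program in this answer):

```python
import socket

def send_all(sock, data):
    """Keep calling send() until the kernel has accepted every byte."""
    view = memoryview(data)
    while len(view):
        n = sock.send(view)   # may accept fewer bytes than offered
        view = view[n:]

a, b = socket.socketpair()
send_all(a, b"hello world")
a.close()                     # so the reader sees EOF

chunks = []
while True:
    buf = b.recv(4096)
    if not buf:               # empty read means the peer closed
        break
    chunks.append(buf)
received = b"".join(chunks)
print(received)               # b'hello world'
```

The C version is the same loop around send or SSL_write, checking the return value each time.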
There is one notable thing when using select in connection with SSL sockets. If you SSL_read less than the maximum SSL frame size it might be that the payload of the SSL frame was larger than what was requested within SSL_read. In this case the remaining data will be buffered internally by OpenSSL and the next call to select might not show more available data even though there are the already buffered data. To work around this one either needs to check with SSL_pending for buffered data or just use SSL_read always with the maximum SSL frame size:
    for fd in readable:
        # Always try to read 16k since this is the maximum size for an
        # SSL frame. With lower read sizes we would need to explicitly
        # deal with pending data from SSL (man SSL_pending)
        buf = fd.recv(16384)
        print("got {} bytes from {}".format(len(buf),"client" if fd == cl else "subprocess"))
        writeto = tokid if fd == cl else cl
        if not buf:
            # eof
            writeto.close()
            done = True
            break # return from program
        else:
            writeto.sendall(buf)
print("connection done")
And that's it. The full program is also available here and the small program I've used to test is available here.

Related

Unable to run ZMQStream with Tornado Event loop using python 3.7

I've been trying to set up a server / client using zmq eventloop for REQ / REP messaging. Since python 3 doesn't support the eventloop provided by zmq, I'm trying to run it with tornado's eventloop.
I'm facing issues running zmqStream with tornado's event loop using python 3.
I created the server / client code using zmq's zmqStream and tornado's eventloop. The client is sending the correct messages, but the server doesn't seem to be responding to the message requests.
The server side code:
from tornado import ioloop
import zmq
def echo(stream, msg):
    stream.send_pyobj(msg)
ctx = zmq.Context()
socket = ctx.socket(zmq.REP)
socket.bind('tcp://127.0.0.1:5678')
stream = ZMQStream(socket)
stream.on_recv(echo)
ioloop.IOLoop.current().start()
The client side code:
import zmq
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5678")
for request in range(1, 10):
    print("Sending request ", request, "...")
    socket.send_string("Hello")
    # Get the reply.
    message = socket.recv_pyobj()
    print("Received reply ", request, "[", message, "]")
I was expecting the server to return back the request messages being sent by the client. But it is just not responding to the requests being sent.
Q : server doesn't seem to be responding
Step 0 :
One server-side SLOC, stream = ZMQStream( socket ), calls a function that is never imported in the MCVE, so it must (and does) fail to execute and yields no result: "ZMQStream" in dir() confirms this with False
Remedy:
repair the MCVE (ZMQStream has to be imported from zmq.eventloop.zmqstream) and also print( zmq.zmq_version() ) + the "ZMQStream" in dir() confirmation
Step 1:
Always prevent infinite deadlocks, unless due reason exists not to do so, by calling <aSocket>.setsockopt( zmq.LINGER, 0 ) prior to the respective .bind() or .connect(). Applications hung forever and un-released (yes, you read it correctly, infinitely blocked) resources are not welcome in distributed computing systems.
Step 2:
Avoid the blind distributed mutual deadlock that REQ/REP is always prone to run into. It will happen; one just never knows when. You may read heaps of details about this on Stack Overflow.
And the remedy? Avoid the blocking forms of .recv() wherever possible (fair .poll()-s are way smarter design-wise and resources-wise), and use additional sender-side signalisation before "throwing" either side into an infinitely blocking .recv(). Even then, a network delivery failure or another cause of a silent message drop may leave the soft signalling claiming a send that never resulted in a receive; the hard-wired REQ/REP behaviour then moves both sides into waiting for the other to send a message which the counterparty will never send, as it is itself waiting to .recv() the still-not-received one from the (still listening) opposite side.
Last, but not least:
The ZeroMQ Zen-of-Zero also has a Zero-Warranty: messages are either fully delivered (error-free) or not delivered at all. The REQ/REP mutual deadlocks are best resolved by never falling into them (ref. LINGER and poll() above).
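The poll-before-recv idea is not ZeroMQ-specific. Sketched with stdlib sockets it looks like this (zmq.Poller offers the analogous timeout-based call for REQ/REP sockets; recv_with_timeout is my own illustrative name):

```python
import select
import socket

a, b = socket.socketpair()

def recv_with_timeout(sock, timeout):
    """Return data if it arrives within `timeout` seconds, else None.
    Unlike a bare recv(), this can never block forever."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(4096)

first = recv_with_timeout(b, 0.1)    # nothing sent yet
a.sendall(b"ping")
second = recv_with_timeout(b, 0.1)   # data is waiting now
print(first, second)                 # None b'ping'
```

The timeout branch is where an application can log, retry, or give up instead of deadlocking.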

How can I send messages to multiple twitch channels using python 3 Select()?

I'm trying to create a Twitch bot in Python 3 that will simultaneously monitor and send messages to multiple channels. I've done this with threads but it's obviously demanding on the CPU and I've read that using Select() is more efficient. The code below allows me to read chat from multiple twitch channels but I'm at a loss for how to identify if the connections returned as writable are the ones I want to write to.
Can I pass in a list of objects that has the socket connection as well as an identifier so I know which ones have come back as writable?
I've read a number of stackoverflow posts related to using select() as well as other sources online but, as a hobbyist coder, I'm having trouble getting my head around this.
#!/usr/bin/env
import socket
import select
HOST = "irc.chat.twitch.tv"
PORT = 6667
NICK = "channelname"
PASS = 'oauth:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
CHANNEL = "channelname"
def create_sockets(usr_list):
    final_socket_list = []
    channels_first_element = 0
    channels_last_element = len(usr_list)
    for index in range(channels_first_element, channels_last_element):
        channel = usr_list[index]
        s = socket.socket()
        s.connect((HOST, PORT))
        s.setblocking(False)
        s.send(bytes("PASS " + PASS + "\r\n", "UTF-8"))
        s.send(bytes("NICK " + NICK + "\r\n", "UTF-8"))
        s.send(bytes("JOIN #" + channel + " \r\n", "UTF-8"))
        s.send(bytes('CAP REQ :twitch.tv/membership\r\n'.encode('utf-8')))
        s.send(bytes('CAP REQ :twitch.tv/commands\r\n'.encode('utf-8')))
        s.send(bytes('CAP REQ :twitch.tv/tags\r\n'.encode('utf-8')))
        final_socket_list.append(s)
    return final_socket_list

def main():
    alive = True
    user_list = ['channelone', 'channeltwo', 'channelthree']
    user_sockets = create_sockets(user_list)
    while alive:
        readable, writable, errorreads = select.select(user_sockets, user_sockets, [])
        if len(readable) != 0:
            for element in readable:
                print(str(element.recv(1024), "utf-8"))

if __name__ == "__main__":
    main()
The first thing to point out is that the arguments to select() are meant to tell select() when to return -- i.e. by including a socket in the first/read_fds argument, you are telling select() to return as soon as that socket has incoming-data that is ready-to-read, and by including a socket in the second/write_fds argument, you are telling select() to return as soon as that socket has buffer-space-ready-to-write-to.
Because of that, it's important (if you want to be CPU-efficient) to only include a socket in select()'s second/write_fds argument if you have data that you want to send to that socket as soon as there is space available in that socket's outgoing-data-buffer. If you just always pass all of your sockets to the write_fds argument (as you are currently doing in the posted code), then select() will pretty much always return immediately (because sockets typically almost always have buffer space available to write to), and you'll end up spinning the CPU at near 100% usage, which is very inefficient.
Note that for a light-duty server that is using blocking TCP sockets, it's usually sufficient to simply always pass [] as the second argument to select(), on the assumption that you will never actually fill any socket's outgoing-data-buffer (and if you do, that the next send() call on that socket will simply block until there is buffer space available, and that's okay). If you want to use non-blocking sockets, you can either make the simplifying assumption that no socket's outgoing-data-buffer will ever become full (in which case passing [] to the write_fds argument is fine), or to be 100% robust you'll need to include your own per-socket outgoing-data FIFO queue, and include each socket in the write_fds argument only-if its FIFO queue is non-empty, and when the socket indicates it is ready-for-write, send() as many bytes as you can from the head of the socket's FIFO queue.
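The per-socket outgoing FIFO idea above can be sketched as follows; this is a minimal illustration with one socket and an in-memory deque, not a full server:

```python
import select
import socket
from collections import deque

a, b = socket.socketpair()
a.setblocking(False)

# one pending-output queue per socket; a socket goes into write_fds
# only while its queue is non-empty
out_queue = {a: deque([b"hello ", b"world"])}

while out_queue[a]:
    _, writable, _ = select.select([], [a], [], 1)
    for sock in writable:
        data = out_queue[sock].popleft()
        sent = sock.send(data)          # may be a partial write
        if sent < len(data):
            out_queue[sock].appendleft(data[sent:])  # requeue the rest

received = b.recv(64)
print(received)   # b'hello world'
```

Once the queue drains, the socket simply stops appearing in the write_fds argument, so select() no longer wakes up for it.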
As for which sockets are the ones you want to write to, that's going to depend entirely on what your app is trying to do, and probably won't depend on which sockets have selected as writeable. Most programs include a Dictionary (or some other similar lookup mechanism) for quickly determining which data-object corresponds to a given socket, so when a socket select()'s as ready-for-read you can figure out which data-object the data coming from that socket should be viewed as "coming from". You can then write to any sockets you need to write to, in response.
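A sketch of the dictionary-lookup approach, using socketpairs in place of real Twitch connections (the channel names are placeholders):

```python
import select
import socket

# label each connection so a readable socket can be traced
# back to the channel it belongs to
conn_a, peer_a = socket.socketpair()
conn_b, peer_b = socket.socketpair()
channel_of = {conn_a: "channelone", conn_b: "channeltwo"}

peer_b.sendall(b"hi from two")          # only channel two has traffic

readable, _, _ = select.select(list(channel_of), [], [], 1)
seen = [(channel_of[sock], sock.recv(64)) for sock in readable]
print(seen)   # [('channeltwo', b'hi from two')]
```

The same dictionary can hold richer per-channel objects (name, parse state, outgoing queue) instead of a bare string.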

How to merge three TCP streams in realtime

I have three bits of networked realtime data logging equipment that output lines of ASCII text via TCP sockets. They essentially just broadcast the data that they are logging - there are no requests for data from other machines on the network. Each piece of equipment is at a different location on my network and each has a unique IP address.
I'd like to combine these three streams into one so that I can log it to a file for replay or forward it onto another device to view in realtime.
At the moment I have a PHP script looping over each IP/port combination, listening for up to 64Kb of data. As soon as the data is received, or it gets an EOL, it forwards that on to another process, which listens to the combined stream.
This works reasonably well but one of the data loggers outputs far more data than the others and tends to swamp the other machines so I'm pretty sure that I'm missing data. Presumably because it's not listening in parallel.
I've also tried three separate PHP processes writing to a shared file in memory (on /dev/shm) which is read and written out by a fourth process. Using file locking this seems to work but introduces a delay of a few seconds which I'd rather avoid.
I did find a PHP library that allows true multithreading using Pthreads called (I think) Amp but I'm still not sure how to combine the output. A file in RAM doesn't seem quick enough.
I've had a good look around on Google and can't see an obvious solution. There certainly doesn't seem to be a way to do this on Linux using command line tools that I've found unless I've missed something obvious.
I'm not too familiar with other languages but are there other languages that might be better suited to this problem ?
Based on the suggested solution below I've got the following code almost working, however I get an error 'socket_read(): unable to read from socket [107]: Transport endpoint is not connected'. This is odd as I've set the socket to accept connections and made it non-blocking. What am I doing wrong?
// Script to mix inputs from multiple sockets
// Run forever
set_time_limit (0);
// Define address and ports that we will listen on
$localAddress='';
// Define inbound ports
$inPort1=36000;
$inPort2=36001;
// Create sockets for inbound data
$inSocket1=createSocket($localAddress, $inPort1);
$inSocket2=createSocket($localAddress, $inPort2);
// Define buffer of data to read and write
$buffer="";
// Repeat forever
while (true) {
    // Build array of streams to monitor
    $readSockets=array($inSocket1, $inSocket2);
    $writeSockets=NULL;
    $exceptions=NULL;
    $t=NULL;
    // Count number of streams that have been modified
    $modifiedCount=socket_select($readSockets, $writeSockets, $exceptions, $t);
    if ($modifiedCount>0) {
        // Process inbound arrays first
        foreach ($readSockets as $socket) {
            // Get up to 64 Kb from this socket
            $buffer.=socket_read($socket, 65536, PHP_BINARY_READ);
        }
        // Process outbound socket array
        foreach ($writeSockets as $socket) {
            // Get up to 64 Kb from this socket and add it to any other data that we need to write out
            //socket_write($socket, $buffer, strlen($buffer));
            echo $buffer;
        }
        // Reset buffer
        $buffer="";
    } else {
        echo ("Nothing to read\r\n");
    }
}

function createSocket($address, $port) {
    // Function to create and listen on a socket
    // Create socket
    $socket=socket_create(AF_INET, SOCK_STREAM, 0);
    echo ("SOCKET_CREATE: " . socket_strerror(socket_last_error($socket)) . "\r\n");
    // Allow the socket to be reused otherwise we'll get errors
    socket_set_option($socket, SOL_SOCKET, SO_REUSEADDR, 1);
    echo ("SOCKET_OPTION: " . socket_strerror(socket_last_error($socket)) . "\r\n");
    // Bind it to the address and port that we will listen on
    $bind=socket_bind($socket, $address, $port);
    echo ("SOCKET_BIND: " . socket_strerror(socket_last_error($socket)) . " $address:$port\r\n");
    // Tell socket to listen for connections
    socket_listen($socket);
    echo ("SOCKET_LISTEN: " . socket_strerror(socket_last_error($socket)) . "\r\n");
    // Make this socket non-blocking
    socket_set_nonblock($socket);
    // Accept inbound connections on this socket
    socket_accept($socket);
    return $socket;
}
You don't necessarily need to switch languages; it just sounds like you're not familiar with the concept of IO multiplexing. Check out the documentation for the PHP select call here.
The concept of listening to multiple data inputs without knowing which one data will come from next is a common one and has standard solutions. There are variations on exactly how it's implemented, but the basic idea is the same: you tell the system that you're interested in receiving data from multiple sources simultaneously (TCP sockets in your case), and run a loop waiting for this data. On every iteration of the loop the system tells you which source is ready for reading. In your case that means you can piecemeal-read from all 3 of your sources without waiting for an individual one to reach 64KB before moving on to the next.
This can be done in lots of languages, including PHP.
UPDATE: Looking at the code you posted in your update, the issue that remains is that you're trying to read from the wrong thing, namely from the listening socket rather than the connection socket. You are ignoring the return value of socket_accept in your createSocket function which is wrong.
Remove these lines from createSocket:
// Accept inbound connections on this socket
socket_accept($socket);
Change your global socket creation code to:
// Create sockets for inbound data
$listenSocket1=createSocket($localAddress, $inPort1);
$listenSocket2=createSocket($localAddress, $inPort2);
$inSocket1=socket_accept($listenSocket1);
$inSocket2=socket_accept($listenSocket2);
Then your code should work.
Explanation: when you create a socket for binding and listening, its sole function then becomes to accept incoming connections and it cannot be read from or written to. When you accept a connection a new socket is created, and this is the socket that represents the connection and can be read/written. The listening socket in the meantime continues listening and can potentially accept other connections (this is why a single server running on one http port can accept multiple client connections).
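The listening-socket vs. connection-socket distinction can be demonstrated in a few lines. A stdlib Python sketch (the PHP socket functions in the question mirror these calls one-to-one):

```python
import socket

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(1)

client = socket.socket()
client.connect(listener.getsockname())

# accept() returns a NEW socket representing the connection;
# the listener itself can never be read from or written to
conn, addr = listener.accept()

client.sendall(b"data")
data = conn.recv(16)              # read from conn, not from listener
print(data)                       # b'data'
```

Reading from `listener` instead of `conn` here would fail with "Transport endpoint is not connected", which is exactly the error in the question.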

Using thread to write and select to read

Has anyone tried to create a socket in non-blocking mode, use a dedicated thread to write to the socket, but use the select system call to identify if data is available to read?
If the socket is non-blocking, the write call will return immediately and the application will not know the status of the write (if it passed or failed).
Is there a way of knowing the status of the write call without having to block on it?
Has anyone tried to create a socket in non-blocking mode, use a dedicated thread to write to the socket, but use the select system call to identify if data is available to read?
Yes, and it works fine. Sockets are bi-directional. They have separate buffers for reading and writing. It is perfectly acceptable to have one thread writing data to a socket while another thread is reading data from the same socket at the same time. Both threads can use select() at the same time.
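The simultaneous read-and-write claim is easy to verify. A Python sketch, with a socketpair standing in for a real network peer:

```python
import socket
import threading

# `a` is "our" socket; `b` plays the remote peer
a, b = socket.socketpair()
got = []

def writer():
    # one thread writes on socket a...
    a.sendall(b"to-peer")

def reader():
    # ...while another thread reads from the very same socket a
    got.append(a.recv(64))

t_read = threading.Thread(target=reader)
t_write = threading.Thread(target=writer)
t_read.start()
b.sendall(b"from-peer")    # the peer supplies data for the reader thread
t_write.start()
t_write.join()
t_read.join()

echo = b.recv(64)
print(got[0], echo)        # b'from-peer' b'to-peer'
```

This works because the kernel keeps independent read and write buffers for each socket, as the answer explains.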
If the socket is non-blocking, the write call will return immediately and the application will not know the status of the write (if it passed or failed).
The same is true for blocking sockets, too. Outbound data is buffered in the kernel and transmitted in the background. The difference between the two types is that if the write buffer is full (such as if the peer is not reading and acking data fast enough), a non-blocking socket will fail to accept more data and report an error code (WSAEWOULDBLOCK on Windows, EAGAIN or EWOULDBLOCK on other platforms), whereas a blocking socket will wait for buffer space to clear up and then write the pending data into the buffer. Same thing with reading. If the inbound kernel buffer is empty, a non-blocking socket will fail with the same error code, whereas a blocking socket will wait for the buffer to receive data.
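The buffer-full behaviour of a non-blocking socket is easy to provoke with a socketpair whose peer never reads; Python surfaces EAGAIN/EWOULDBLOCK as BlockingIOError:

```python
import socket

a, b = socket.socketpair()
a.setblocking(False)

blocked = False
try:
    # nobody reads from b, so a's outbound buffer eventually fills
    for _ in range(10000):
        a.send(b"x" * 65536)
except BlockingIOError:
    # a blocking socket would have waited here instead of raising
    blocked = True

print("would block:", blocked)
```

With a blocking socket the same loop would simply stall inside send() until the peer drained the buffer.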
select() can be used with both blocking and non-blocking sockets. It is just more commonly used with non-blocking sockets than blocking sockets.
Is there a way of knowing the status of the write call without having to block on it?
On non-Windows platforms, about all you can do is use select() or equivalent to detect when the socket can accept new data before writing to it. On Windows, there are ways to receive a notification when a pending read/write operation completes if it does not finish right away.
But either way, outbound data is written into a kernel buffer and not transmitted right away. Writing functions, whether called on blocking or non-blocking sockets, merely report the status of writing data into that buffer, not the status of transmitting the data to the peer. The only way to know the status of the transmission is to have the peer explicitly send back a reply message once it has received the data. Some protocols do that, and others do not.
Is there a way of knowing the status of the write call without having to block on it?
If the result of the write call is -1, then check errno for EAGAIN or EWOULDBLOCK. If it's one of those errors, then it's benign and you can go back to waiting on a select call. Sample code below.
int result = write(sock, buffer, size);
if ((result == -1) && ((errno == EAGAIN) || (errno == EWOULDBLOCK)))
{
    // write failed because socket isn't ready to handle more data. Try again later (or wait for select)
}
else if (result == -1)
{
    // fatal socket error
}
else
{
    // result == number of bytes sent.
    // TCP - may be less than the number of bytes passed in to write/send call.
    // UDP - number of bytes sent (should be the entire thing)
}

Linux: is there a way to use named fifos on the writer side in non-blocking mode?

I've found many questions and answers about pipes on Linux, but almost all discuss the reader side.
For a process that shall be ready to deliver data to a named pipe as soon as the data is available and a reading process is connected, is there a way to, in a non-blocking fashion:
wait (poll(2)) for a reader to open the pipe,
wait in a loop (again poll(2)) for a signal that writing to the pipe will not block, and
when such signal is received, check how many bytes may be written to the pipe without blocking
I understand how to do (2.), but I wasn't able to find consistent answers for (1.) and (3.).
EDIT: I was looking for (something like) FIONWRITE for pipes, but Linux does not have FIONWRITE (for pipes) (?)
EDIT2: The intended main loop for the writer (kind of pseudo code, target language is C/C++):
forever
    poll(can_read_command, can_write_to_the_fifo)
    if (can_read_command) {
        read and parse command
        update internal status
        continue
    }
    if (can_write_to_the_fifo) {
        length = min(data_available, space_for_nonblocking_write)
        write(output_fifo, buffer, length)
        update internal status
        continue
    }
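For (1.) and (2.) a minimal sketch follows, assuming Linux FIFO semantics. For (3.) there is indeed no FIONWRITE for pipes, though fcntl(fd, F_GETPIPE_SZ) at least reports the pipe's total capacity; this sketch only covers the first two points:

```python
import errno
import os
import select
import tempfile

path = os.path.join(tempfile.mkdtemp(), "fifo")
os.mkfifo(path)

# (1.) before any reader has opened the FIFO, a non-blocking open of
# the write end fails with ENXIO -- so "waiting for a reader" means
# retrying this open (or watching the situation from the read side)
got_enxio = False
try:
    os.open(path, os.O_WRONLY | os.O_NONBLOCK)
except OSError as e:
    got_enxio = (e.errno == errno.ENXIO)
print("no reader yet:", got_enxio)

rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)   # a reader arrives
wfd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)   # now this succeeds

# (2.) poll() reports when a write would not block
p = select.poll()
p.register(wfd, select.POLLOUT)
events = p.poll(1000)
writable = bool(events) and bool(events[0][1] & select.POLLOUT)
print("writable:", writable)

os.close(rfd)
os.close(wfd)
```

Note that POLLOUT on a pipe only guarantees that some write will not block; it does not tell you how many bytes fit, which is why (3.) has no clean answer.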
