Node.js child_process docs say
The optional sendHandle argument that may be passed to subprocess.send() is for passing a TCP server or socket object to the child process...Any data that is received and buffered in the socket will not be sent to the child.
So if I pass a socket to a child process, how do I not lose the buffered data?
I called socket.read() on the socket before sending it to check for buffered data. It returned null, yet data was still lost.
How do I pass a socket from one process to another without losing data?
The only solution is to never start reading from the socket. For example, in the internal/cluster/round_robin_handle module
this.handle = this.server._handle;
this.handle.onconnection = (err, handle) => this.distribute(err, handle);
This overwrites the normal processing of new TCP handles, so a child process can do it instead.
(I would have used the cluster module, except I needed some customization, like least connection load balancing.)
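To illustrate the "never start reading" idea outside of Node.js, here is a minimal Python sketch (Unix only, Python 3.9+). It is not Node's API, only an analogy: the helper names hand_off_unread and receive_handed_off are made up for this example, and a socketpair stands in for the IPC channel between parent and child. Data that arrived before the hand-off stays in the kernel buffer, so the receiving side still sees it as long as the sender never read from the socket.

```python
import socket

def hand_off_unread(conn, channel):
    """Pass conn's file descriptor over a Unix-domain channel.

    The important part is what is NOT done here: conn.recv() is never
    called, so nothing is pulled out of the kernel into this process.
    """
    socket.send_fds(channel, [b"fd"], [conn.fileno()])

def receive_handed_off(channel):
    """Rebuild a socket object from the descriptor sent by hand_off_unread."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    return socket.socket(fileno=fds[0])
```

In a real setup the two helpers would run in different processes. The point is only that the bytes sent before the hand-off are still readable afterwards.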
This question is a followup to this question and answer
I'm building a minimalistic remote access program on Linux using SSL and socket programming.
The problem arose in the following protocol chain
Client sends command
Server receives it
Server spawns a child, with input, output and error dup-ed with the server-client socket (so the input and output would flow directly through the sockets)
Server waits for the child and waits for a new command
When using SSL, you cannot use read and write operations directly, meaning the child using SSL sockets will send plain data (because it won't use SSL_write or SSL_read, but the client will, and this will create problems).
So, as you could read from the answer, one solution would be to create 3 additional sets of local sockets that only the server and its child will share, so the data could flow unencrypted, and only then send it to the client with a proper SSL command.
So the question is: how do I even know when the child wants to read, so I can ask the client for input? And how do I know when the child outputs something, so I can forward that to the client?
I suppose some threads should be created that monitor the SSL structure and put locks on it to keep the order, but I still can't imagine how the server would get notified when the child application hits a scanf("%d") or something similar.
To illustrate what needs to be done I use the following Python program. I've used Python only because it is easy to read, but the same can be done in C, only with more lines of code and harder to read.
Let's first do some initialization, i.e. create a socket server and an SSL context, accept a new client, wrap the client fd into an SSL socket and do some initial communication between client and socket server. Based on your previous question you probably know how to do this in C already, and the Python code is not that far away from what you do in C:
import socket
import ssl
import select
import os

local_addr = ('', 8888)  # where we listen
cmd = ['./cmd.pl']       # some command reading stdin, writing stdout

# PROTOCOL_TLS_SERVER negotiates the highest TLS version both sides support
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain('server_cert_and_key.pem')

srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(local_addr)
srv.listen(10)

# accept one client and upgrade the plain connection to SSL
cl, addr = srv.accept()
cl = ctx.wrap_socket(cl, server_side=True)
print("new connection from {}".format(addr))
buf = cl.recv(1024)
print("received '{}'".format(buf))
cl.sendall(b"hi!\n")
After this setup is done and we have an SSL connection to the client, we fork the program. This program reads from stdin and writes to stdout. We want to forward the decrypted input from the SSL client as input to the program, and forward the output of the program, encrypted, to the SSL client. To do this we create a socketpair for exchanging data between the parent process and the forked program, and remap one side of the socketpair to stdin/stdout in the forked child before calling execv on the program. This also works very similarly in C:
print("forking subprocess")
tokid, toparent = socket.socketpair()
pid = os.fork()
if pid == 0:
    tokid.close()
    # child: remap stdin/stdout and exec cmd
    os.dup2(toparent.fileno(), 0)
    os.dup2(toparent.fileno(), 1)
    toparent.close()
    os.execv(cmd[0], cmd)
    # will not return

# parent process
toparent.close()
Now we need to read the data from the SSL client and the forked command and forward it to the other side. While one could probably do this with blocking reads inside threads, I prefer an event-based approach using select. The syntax for select in Python is a bit different (i.e. simpler) than in C, but the idea is exactly the same: the way we call it, it returns when we have data from either the client or the forked command, or when one second has elapsed without data:
# parent: wait for data from client and subprocess and forward these to
# subprocess and client
done = False
while not done:
    readable, _, _ = select.select([cl, tokid], [], [], 1)
    if not readable:
        print("no data for one second")
        continue
Since readable is not empty we have new data waiting for us to read. In Python we use recv on the file handle, in C we would need to use SSL_read on the SSL socket and recv or read on the plain socket (from socketpair). After the data is read we write it to the other side. In Python we can use sendall, in C we would need to use SSL_write on the SSL socket and send or write on the plain socket - and we would also need to make sure that all data were written, i.e. maybe try multiple times.
There is one notable thing when using select in connection with SSL sockets. If you SSL_read less than the maximum SSL frame size, the payload of the SSL frame might be larger than what was requested in SSL_read. In this case the remaining data will be buffered internally by OpenSSL, and the next call to select might not show more available data even though the already buffered data are still waiting to be read. To work around this one either needs to check with SSL_pending for buffered data or just always use SSL_read with the maximum SSL frame size:
    for fd in readable:
        # Always try to read 16k since this is the maximum size for an
        # SSL frame. With lower read sizes we would need to explicitly
        # deal with pending data from SSL (man SSL_pending)
        buf = fd.recv(16384)
        print("got {} bytes from {}".format(len(buf), "client" if fd == cl else "subprocess"))
        writeto = tokid if fd == cl else cl
        if not buf:
            # eof (recv returns an empty bytes object)
            writeto.close()
            done = True
            break  # return from program
        else:
            writeto.sendall(buf)

print("connection done")
And that's it. The full program is also available here and the small program I've used to test is available here.
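For completeness, the other workaround mentioned above (checking for pending data instead of always reading 16k) could look roughly like this in Python. This is only a sketch: read_all_available is a hypothetical helper name, and it assumes a blocking ssl.SSLSocket, whose pending() method reports how many already decrypted bytes are waiting inside the SSL layer (the counterpart of SSL_pending in C).

```python
def read_all_available(tls_sock, chunk_size=1024):
    """Read once, then keep reading while the SSL layer still holds
    decrypted bytes that select() would not report as readable."""
    chunks = [tls_sock.recv(chunk_size)]
    # Stop on EOF (empty read) or when nothing is buffered any more.
    while chunks[-1] and tls_sock.pending() > 0:
        chunks.append(tls_sock.recv(min(chunk_size, tls_sock.pending())))
    return b"".join(chunks)
```

In the loop above one would call read_all_available(cl) for the SSL client instead of cl.recv(16384). The plain socketpair side has no pending() and keeps using recv.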
I have three bits of networked realtime data logging equipment that output lines of ASCII text via TCP sockets. They essentially just broadcast the data that they are logging - there are no requests for data from other machines on the network. Each piece of equipment is at a different location on my network and each has a unique IP address.
I'd like to combine these three streams into one so that I can log it to a file for replay or forward it onto another device to view in realtime.
At the moment I have a PHP script looping over each IP/port combination, listening for up to 64 KB of data. As soon as the data is received or it gets an EOL, it forwards it on to whatever listens to the combined stream.
This works reasonably well but one of the data loggers outputs far more data than the others and tends to swamp the other machines so I'm pretty sure that I'm missing data. Presumably because it's not listening in parallel.
I've also tried three separate PHP processes writing to a shared file in memory (on /dev/shm) which is read and written out by a fourth process. Using file locking this seems to work but introduces a delay of a few seconds which I'd rather avoid.
I did find a PHP library that allows true multithreading using Pthreads called (I think) Amp but I'm still not sure how to combine the output. A file in RAM doesn't seem quick enough.
I've had a good look around on Google and can't see an obvious solution. There certainly doesn't seem to be a way to do this on Linux using command line tools that I've found unless I've missed something obvious.
I'm not too familiar with other languages, but are there other languages that might be better suited to this problem?
Based on the suggested solution below I've got the following code almost working, however I get an error 'socket_read(): unable to read from socket [107]: Transport endpoint is not connected'. This is odd as I've set the socket to accept connections and made it non-blocking. What am I doing wrong?
// Script to mix inputs from multiple sockets
// Run forever
set_time_limit (0);
// Define address and ports that we will listen on
$localAddress='';
// Define inbound ports
$inPort1=36000;
$inPort2=36001;
// Create sockets for inbound data
$inSocket1=createSocket($localAddress, $inPort1);
$inSocket2=createSocket($localAddress, $inPort2);
// Define buffer of data to read and write
$buffer="";
// Repeat forever
while (true) {
    // Build array streams to monitor
    $readSockets=array($inSocket1, $inSocket2);
    $writeSockets=NULL;
    $exceptions=NULL;
    $t=NULL;
    // Count number of stream that have been modified
    $modifiedCount=socket_select($readSockets, $writeSockets, $exceptions, $t);
    if ($modifiedCount>0) {
        // Process inbound arrays first
        foreach ($readSockets as $socket) {
            // Get up to 64 Kb from this socket
            $buffer.=socket_read($socket, 65536, PHP_BINARY_READ);
        }
        // Process outbound socket array
        foreach ($writeSockets as $socket) {
            // Get up to 64 Kb from this socket and add it to any other data that we need to write out
            //socket_write($socket, $buffer, strlen($buffer));
            echo $buffer;
        }
        // Reset buffer
        $buffer="";
    } else {
        echo ("Nothing to read\r\n");
    }
}
function createSocket($address, $port) {
    // Function to create and listen on a socket
    // Create socket
    $socket=socket_create(AF_INET, SOCK_STREAM, 0);
    echo ("SOCKET_CREATE: " . socket_strerror(socket_last_error($socket)) . "\r\n");
    // Allow the socket to be reused otherwise we'll get errors
    socket_set_option($socket, SOL_SOCKET, SO_REUSEADDR, 1);
    echo ("SOCKET_OPTION: " . socket_strerror(socket_last_error($socket)) . "\r\n");
    // Bind it to the address and port that we will listen on
    $bind=socket_bind($socket, $address, $port);
    echo ("SOCKET_BIND: " . socket_strerror(socket_last_error($socket)) . " $address:$port\r\n");
    // Tell socket to listen for connections
    socket_listen($socket);
    echo ("SOCKET_LISTEN: " . socket_strerror(socket_last_error($socket)) . "\r\n");
    // Make this socket non-blocking
    socket_set_nonblock($socket);
    // Accept inbound connections on this socket
    socket_accept($socket);
    return $socket;
}
You don't necessarily need to switch languages; it just sounds like you're not familiar with the concept of IO multiplexing. Check out some documentation for the PHP select call here
The concept of listening to multiple data inputs without knowing which one will deliver data next is a common one and has standard solutions. There are variations in exactly how it's implemented, but the basic idea is the same: you tell the system that you're interested in receiving data from multiple sources simultaneously (TCP sockets in your case), and run a loop waiting for this data. On every iteration of the loop, the system tells you which sources are ready for reading. In your case that means you can read piecemeal from all 3 of your sources without waiting for an individual one to reach 64 KB before moving on to the next.
This can be done in lots of languages, including PHP.
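As a rough illustration of the multiplexing loop described above, here is a minimal sketch in Python using the standard selectors module. The function name merge_streams and the on_data callback are made up for this example, and it assumes the sockets passed in are already connected to the data loggers.

```python
import selectors

def merge_streams(socks, on_data, bufsize=4096):
    """Forward whatever arrives on any of the sockets to on_data,
    in arrival order, until every source has closed its connection."""
    sel = selectors.DefaultSelector()
    for sock in socks:
        sel.register(sock, selectors.EVENT_READ)
    open_count = len(socks)
    while open_count:
        # Blocks until at least one source has something to read.
        for key, _events in sel.select():
            data = key.fileobj.recv(bufsize)
            if data:
                on_data(key.fileobj, data)
            else:
                # Empty read means the peer closed this stream.
                sel.unregister(key.fileobj)
                open_count -= 1
    sel.close()
```

Here on_data could append to a log file or write to an outbound socket. Because no source is read until it reports data, one busy logger cannot make the loop sit waiting on a quiet one.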
UPDATE: Looking at the code you posted in your update, the issue that remains is that you're trying to read from the wrong thing, namely from the listening socket rather than the connection socket. You are ignoring the return value of socket_accept in your createSocket function which is wrong.
Remove these lines from createSocket:
// Accept inbound connections on this socket
socket_accept($socket);
Change your global socket creation code to:
// Create sockets for inbound data
$listenSocket1=createSocket($localAddress, $inPort1);
$listenSocket2=createSocket($localAddress, $inPort2);
$inSocket1=socket_accept($listenSocket1);
$inSocket2=socket_accept($listenSocket2);
Then your code should work.
Explanation: when you create a socket for binding and listening, its sole function then becomes to accept incoming connections and it cannot be read from or written to. When you accept a connection a new socket is created, and this is the socket that represents the connection and can be read/written. The listening socket in the meantime continues listening and can potentially accept other connections (this is why a single server running on one http port can accept multiple client connections).
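The difference between the two kinds of socket can be seen in a few lines of Python (used here only because it is compact; the same holds for the PHP socket functions). This is a sketch with a made-up helper name, accept_one, that connects to itself over the loopback interface on an arbitrary free port.

```python
import socket

def accept_one(host="127.0.0.1"):
    """Create a listening socket, connect a client to it and accept it.

    Returns (listener, conn, client): 'listener' only hands out
    connections, 'conn' is the socket that actually carries data.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    client = socket.create_connection((host, port))
    conn, _addr = listener.accept()
    return listener, conn, client
```

Reading from conn returns what the client sent, whereas calling recv on listener raises an OSError, which is the same "Transport endpoint is not connected" condition reported in the question.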
Has anyone tried to create a socket in non-blocking mode and use a dedicated thread to write to the socket, while using the select system call to detect when data is available to read?
If the socket is non-blocking, the write call will return immediately and the application will not know the status of the write (whether it passed or failed).
Is there a way of knowing the status of the write call without having to block on it?
Has anyone tried to create a socket in non-blocking mode and use a dedicated thread to write to the socket, while using the select system call to detect when data is available to read?
Yes, and it works fine. Sockets are bi-directional. They have separate buffers for reading and writing. It is perfectly acceptable to have one thread writing data to a socket while another thread is reading data from the same socket at the same time. Both threads can use select() at the same time.
If the socket is non-blocking, the write call will return immediately and the application will not know the status of the write (whether it passed or failed).
The same is true for blocking sockets, too. Outbound data is buffered in the kernel and transmitted in the background. The difference between the two types is that if the write buffer is full (such as if the peer is not reading and acking data fast enough), a non-blocking socket will fail to accept more data and report an error code (WSAEWOULDBLOCK on Windows, EAGAIN or EWOULDBLOCK on other platforms), whereas a blocking socket will wait for buffer space to clear up and then write the pending data into the buffer. Same thing with reading. If the inbound kernel buffer is empty, a non-blocking socket will fail with the same error code, whereas a blocking socket will wait for the buffer to receive data.
select() can be used with both blocking and non-blocking sockets. It is just more commonly used with non-blocking sockets than blocking sockets.
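To see the non-blocking behaviour described above, here is a small Python sketch. In Python the EAGAIN/EWOULDBLOCK condition surfaces as BlockingIOError. The helper name fill_send_buffer is made up, and the example assumes the peer is not reading, so the kernel buffer eventually fills up.

```python
import socket

def fill_send_buffer(sock, chunk=b"x" * 65536):
    """Write to a non-blocking socket until the kernel refuses more data.

    Returns the number of bytes the kernel accepted before the send
    buffer was full. Nothing here says the peer has received them.
    """
    sock.setblocking(False)
    queued = 0
    try:
        while True:
            queued += sock.send(chunk)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: buffer full, try again once writable.
        return queued
```

With a blocking socket the same loop would simply stall in send() at the point where this version raises.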
Is there a way of knowing the status of the write call without having to block on it?
On non-Windows platforms, about all you can do is use select() or equivalent to detect when the socket can accept new data before writing to it. On Windows, there are ways to receive a notification when a pending read/write operation completes if it does not finish right away.
But either way, outbound data is written into a kernel buffer and not transmitted right away. Writing functions, whether called on blocking or non-blocking sockets, merely report the status of writing data into that buffer, not the status of transmitting the data to the peer. The only way to know the status of the transmission is to have the peer explicitly send back a reply message once it has received the data. Some protocols do that, and others do not.
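A reply message of that kind might look like the following Python sketch. The wire format is purely an assumption made up for this example (a 4-byte big-endian length prefix, and a single ACK byte as the reply). Real protocols define their own.

```python
import socket
import struct

ACK = b"\x06"  # arbitrary one-byte acknowledgement chosen for this sketch

def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed before sending everything")
        buf.extend(part)
    return bytes(buf)

def send_with_ack(sock, payload):
    """Send a length-prefixed message and wait for the peer's ACK."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)
    return recv_exact(sock, 1) == ACK

def receive_and_ack(sock):
    """Receive one length-prefixed message, then confirm it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    payload = recv_exact(sock, length)
    sock.sendall(ACK)
    return payload
```

Only when send_with_ack returns True does the sender know that the application on the other side has actually read the message.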
Is there a way of knowing the status of the write call without having to block on it?
If the result of the write call is -1, then check errno for EAGAIN or EWOULDBLOCK. If it's one of those errors, then it's benign and you can go back to waiting on a select call. Sample code below.
// needs <errno.h> for errno and <unistd.h> for write()
ssize_t result = write(sock, buffer, size);
if ((result == -1) && ((errno == EAGAIN) || (errno == EWOULDBLOCK)))
{
    // write failed because socket isn't ready to handle more data. Try again later (or wait for select)
}
else if (result == -1)
{
    // fatal socket error
}
else
{
    // result == number of bytes sent.
    // TCP - May be less than the number of bytes passed in to write/send call.
    // UDP - number of bytes sent (should be the entire thing)
}
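Putting the three branches together with the "wait for select" step, a complete send loop might look like this in Python, where EAGAIN/EWOULDBLOCK shows up as BlockingIOError. The function name send_all_nonblocking and the 5 second timeout are assumptions made for this sketch.

```python
import select
import socket

def send_all_nonblocking(sock, data, timeout=5.0):
    """Send all of data on a non-blocking socket, handling partial
    writes and waiting with select() whenever the buffer is full."""
    sock.setblocking(False)
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            # May accept fewer bytes than offered (partial write).
            sent += sock.send(view[sent:])
        except BlockingIOError:
            # Benign: wait until the socket is writable again.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("socket did not become writable")
    return sent
```

Any other OSError raised by send() corresponds to the "fatal socket error" branch and is simply propagated to the caller.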
Is there a way to control each step of a tcp socket write to know the server side progress of a large sized image data transfer?
At worst, how can I alter the main node bin folder to add this event?
Finally, can someone explain to me why the max length of Node.js HTTP TCP socket writes is 1460?
From the documentation:
socket.write(data, [encoding], [callback])
Sends data on the socket.
The second parameter specifies the encoding in the case of a
string--it defaults to UTF8 encoding.
Returns true if the entire data was flushed successfully to the kernel
buffer. Returns false if all or part of the data was queued in user
memory. 'drain' will be emitted when the buffer is again free.
The optional callback parameter will be executed when the data is
finally written out - this may not be immediately.
Event: 'drain'
Emitted when the write buffer becomes empty. Can be
used to throttle uploads.
See also: the return values of socket.write()
So, you can either specify a callback to be called once the data is flushed, or you can register a "drain" event handler on the socket. This doesn't actually tell you about the progress on the other side, but a properly-implemented server will send appropriate notifications when its queue is full, which will trigger the events on the Node.js side.
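The same pattern exists outside Node.js. As an analogy only (this is Python's asyncio, not the Node.js API), the sketch below sends a large payload in chunks and waits for the write buffer to drain after each one, reporting progress as it goes. The names send_with_progress and demo, the chunk size and the on_progress callback are assumptions of this example. As noted above, the numbers reflect what has been handed to the local kernel, not what the remote side has processed.

```python
import asyncio
import socket

async def send_with_progress(writer, data, chunk_size=65536, on_progress=None):
    """Write data chunk by chunk, waiting for the local write buffer to
    drain after each chunk and reporting how much has been handed over."""
    sent = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        writer.write(chunk)
        await writer.drain()  # resumes once the buffer has room again
        sent += len(chunk)
        if on_progress is not None:
            on_progress(sent, len(data))
    return sent

async def demo(payload):
    """Send payload across a local socketpair and collect the progress."""
    left, right = socket.socketpair()
    _reader_l, writer_l = await asyncio.open_connection(sock=left)
    reader_r, writer_r = await asyncio.open_connection(sock=right)
    receiver = asyncio.create_task(reader_r.read())  # read until EOF
    progress = []
    await send_with_progress(
        writer_l, payload,
        on_progress=lambda done, total: progress.append(done))
    writer_l.close()
    await writer_l.wait_closed()
    received = await receiver
    writer_r.close()
    return received, progress
```

Running asyncio.run(demo(image_bytes)) returns the bytes seen by the receiving end together with the list of progress values.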
When I create a TCP socket in blocking mode and use the send (or sendto) functions, when will the function call return?
Will it have to wait till the other side of the socket has received the data? In that case, if there is a traffic jam on the internet, could it block for a long time?
Both the sender and the receiver (and possibly intermediaries) will buffer the data.
Sending data successfully is no guarantee that the receiving end has received it.
Normally, writes to a blocking socket won't block as long as there is space in the sending-side buffer.
Once the sender's buffer is full, then the write WILL block, until there is space for the entire write in it.
If the write is partially successful (the receiver closed the socket, shut it down or an error occurred), then the write might return fewer bytes than you had intended. A subsequent write should give an error or return 0 - such conditions are irreversible on TCP sockets.
Note that if a subsequent send() or write() gives an error, then some previously written data could be lost forever. I don't think there is a real way of knowing how much data actually arrived (or was acknowledged, anyway).
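A quick way to convince yourself that a blocking send returns before the peer has read anything is a sketch like the following (Python, using a local socketpair instead of a real TCP connection; the function name is made up for this example).

```python
import socket

def send_before_peer_reads(payload=b"small message"):
    """Send on a blocking socket while the peer is not reading yet.

    send() returns as soon as the kernel has buffered the bytes;
    the peer only picks them up afterwards.
    """
    sender, receiver = socket.socketpair()
    try:
        sent = sender.send(payload)              # returns without waiting for the peer
        delivered = receiver.recv(len(payload))  # peer reads later
        return sent, delivered
    finally:
        sender.close()
        receiver.close()
```

The call only starts to block once the payload no longer fits into the free space of the send buffer.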