'Feeding' a Netty 4 pipeline without a socket - linux

I have a set of Netty 4 handlers that I normally chain on top of a ServerBootstrap using EpollEventLoopGroups. However, the source of the data will not be a socket; instead, I will read from and write to two in-memory buffers. The solution can be Linux-specific.
For now I set up a ServerBootstrap listening on a loopback port, connect to it, and manually feed the data, but I wonder if I can do this without having to use a socket at all.
I considered writing a custom SocketChannel that extends LocalChannel, but there are a lot of details to consider and, honestly, I feel out of my depth.
I have found this repository but it is for Netty 3, not 4:
https://github.com/itm/netty-iostream

It sounds like you want to use EmbeddedChannel.
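For example, here is a minimal sketch (the LineBasedFrameDecoder and StringDecoder below only stand in for your own handler chain): an EmbeddedChannel runs the pipeline entirely in memory, so you can push buffers in and collect whatever your handlers write back, without any event loop or socket.

import io.netty.buffer.Unpooled;
import io.netty.channel.embedded.EmbeddedChannel;
import io.netty.handler.codec.LineBasedFrameDecoder;
import io.netty.handler.codec.string.StringDecoder;
import io.netty.util.CharsetUtil;

public class EmbeddedPipelineDemo {
    public static void main(String[] args) {
        // Stand-in handlers; substitute your own handler chain here.
        EmbeddedChannel ch = new EmbeddedChannel(
                new LineBasedFrameDecoder(8192),
                new StringDecoder(CharsetUtil.UTF_8));

        // Push bytes into the pipeline as if they had arrived from a socket.
        ch.writeInbound(Unpooled.copiedBuffer("hello\n", CharsetUtil.UTF_8));
        String decoded = ch.readInbound();
        System.out.println(decoded);              // "hello"

        // Anything the handlers write outbound is captured in memory
        // instead of being flushed to a real socket.
        ch.writeOutbound("response");
        Object outbound = ch.readOutbound();
        System.out.println(outbound);

        ch.finish();
    }
}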

Related

How should I communicate bytes to Python using pyo3?

I am new to Rust, and I am writing a library that creates background threads that listen for and handle TCP communications. I want to store the latest n bytes for every TCP client and have Python be able to fetch them. The way I am thinking about this is to have shared buffers, although I am not sure how to achieve that given Rust's memory model.
This is what I want to achieve at the end:
import tcp_server_pyo3
# How can I return something that would keep track of TCP connections?
listener = tcp_server_pyo3.start("127.0.0.1:6142")
# So I could print all the TCP clients that are currently connected
print(listener.connections())
# And this would return the latest n bytes received for the "client_id"
listener.read("client_id")
Below is what I have so far. It's currently able to create a listener and connection-handler threads. What should I add or change so that I can keep track of connections and read the latest received bytes from Python?
My code: tcp_server_pyo3.rs
I am not sure if I am thinking about this the right way. I thought of global variables, but people say not to use global variables in Rust.

node tcp server stick data together

I am using the net module in Node.js.
net.createServer(function(sock) {
    sock.on('data', function(data) {
        console.log(data);
    });
});
Then I used 2000 TCP clients to send data to the server to test how many clients it can support. For the first 20 minutes it was running OK, but after a period of time the data stuck together. For example, the data from a client is in JSON format and looks like this:
'{"value":1,"name":"tom"}'
Each client sent data with a different name, and the value was incremented each time. On the server side, the data I got looks like this:
{"value":1,"name":"tom"}{"value":2,"name":"tom"}{"value":3,"name":"tom"}.
They stick together, so I have to split them up before saving them into MongoDB.
The situation got worse the longer the server kept running: eventually the server couldn't receive any data while the clients were still sending it.
I'd like to ask how to make the server read one item at a time and keep working well as it runs. Thanks a lot.
The socket server doesn't know or care about your data format. No matter what, you need to split it up. It can be arbitrarily buffered or split before reaching your application. The fact that you're ending up with nice segments under lower load is just due to your payload fitting into a single packet, and the app keeping up reading buffers as they come.
The easiest thing to do is use a delimiter. Since you're using JSON, you can use a newline delimiter. (Just make sure your JSON objects are formatted on a single line per object.)
Then, you can use a transform stream that buffers the stream, waits for a delimiter, and emits parsed objects, as in the sketch below.
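A minimal sketch of that idea (names and the port are illustrative): an object-mode Transform buffers raw chunks, splits on newlines, and pushes one parsed object at a time.

var net = require('net');
var stream = require('stream');

function makeJsonLineParser() {
    var buffered = '';
    return new stream.Transform({
        readableObjectMode: true,
        transform: function (chunk, encoding, done) {
            buffered += chunk.toString('utf8');
            var lines = buffered.split('\n');
            buffered = lines.pop();                   // keep the trailing partial line
            for (var i = 0; i < lines.length; i++) {
                if (lines[i].trim()) {
                    this.push(JSON.parse(lines[i]));  // one complete object per line
                }
            }
            done();
        }
    });
}

net.createServer(function (sock) {
    sock.pipe(makeJsonLineParser()).on('data', function (obj) {
        console.log(obj);                             // a whole, parsed JSON object
    });
}).listen(7000);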
This repository can help you: stick (the node.js stick package), a solution to the sticky-packet problem of TCP for Node.js.
You can try this lib; it is a solution to the sticky-packet problem of TCP for Node.js.

Node.js flush socket after write

I'm implementing a tcp protocol in Node.
Full source:
https://github.com/roelandmoors/ads.js/blob/master/ads.js
specs:
http://infosys.beckhoff.com/content/1033/tcadsamsspec/html/tcadsamsspec_amstcppackage.htm?id=17754
The problem is that I use this to send a packet:
this.tcpClient.write(buf);
If I send multiple commands, they are combined into a single TCP packet.
This doesn't work.
There are more questions about this on SO, but they recommend using a delimiter.
But since I can't change the protocol this isn't an option.
Isn't there a simple solution to flush the socket?
socket.setNoDelay() doesn't help.
Edit: I also tried to use the drain event to send the next write, but the event is never called?
Update:
This seems to solve the problem, but it is very ugly and I don't know if it always works.
Instead of writing it directly I write to a buffer:
this.writeFILO.push(buf);
Every cycle(?) I'm writing a packet to the socket stream:
var sendCycle = function(ads) {
    if (ads.writeFILO.length > 0) {
        ads.tcpClient.write(ads.writeFILO.shift());
    }
    setTimeout(function() {
        sendCycle(ads);
    }, 0);
};
I refer to the socket.write(data, [encoding], [callback]) API:
The optional callback parameter will be executed when the data is finally written out - this may not be immediately.
So, set up a queue (array is fine) which holds messages to be sent.
When the above callback is called, check the queue and send the next message if needed.
This, however, does not guarantee what you're looking for; you'll have to test. Unfortunately, the docs don't explicitly state when there is an acknowledgement from the remote endpoint that it actually received the message...
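A rough sketch of that queue (the send/flushNext names are illustrative): the next message is only handed to socket.write() once the previous write's callback has fired.

var queue = [];
var busy = false;

function send(socket, buf) {
    queue.push(buf);
    flushNext(socket);
}

function flushNext(socket) {
    if (busy || queue.length === 0) return;
    busy = true;
    socket.write(queue.shift(), function () {
        // Fires once the data has been flushed from Node's buffers,
        // not when the remote end has acknowledged receiving it.
        busy = false;
        flushNext(socket);
    });
}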
In the end, as you concluded, TCP is a stream.
An interesting idea that just came to me: if you're forced to use an existing protocol, open two TCP connections.
When one connection acknowledges receiving a message (in whatever the higher-level protocol is), send the next one through the other connection... and so forth.
Anyway, nice challenge :)
I was wrong. TCP is a stream and the protocol works like a stream, but I didn't handle it like a stream.
PS: sending separate messages seemed to work with setImmediate().
I know that this is an old question, and I'm not 100% sure I understand what you are looking for, but there is a way to flush a socket in node. First you need to implement a Transform class.
See here for example: https://nodejs.org/api/stream.html#stream_implementing_a_transform_stream.
Then you can take your stream and pipe it through your transform before piping it into your socket.
I do not own this node module but I have seen an example of this here: https://github.com/yongtang/clamav.js/blob/master/index.js#L8
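A minimal sketch of that idea (variable names and the port are illustrative, not taken from the module above): a pass-through Transform piped in front of the socket gives you a per-chunk hook before anything reaches the socket.

var net = require('net');
var stream = require('stream');

var framer = new stream.Transform({
    transform: function (chunk, encoding, done) {
        // Add any per-message framing or padding here if the protocol needs it.
        this.push(chunk);
        done();
    }
});

var socket = net.connect(48898, '127.0.0.1');     // example port only
framer.pipe(socket);
framer.write(Buffer.from('one message'));
framer.write(Buffer.from('another message'));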

How to pipeline in node.js to redis?

I have lots of data to insert (SET / INCR) into a Redis DB, so I'm looking for pipelining / mass insertion through Node.js.
I couldn't find any good example or API for doing this in Node.js, so any help would be great!
Yes, I must agree that there is a lack of examples for this, but I managed to create a stream over which I sent several insert commands in a batch.
You should install the redis-stream module:
npm install redis-stream
And this is how you use the stream:
var redis = require('redis-stream'),
    client = new redis(6379, '127.0.0.1');

// Open stream
var stream = client.stream();

// Example of setting 10000 records
for (var record = 0; record < 10000; record++) {
    // Command is an array of arguments:
    var command = ['set', 'key' + record, 'value'];
    // Send command to stream, but parse it before
    stream.redis.write(redis.parse(command));
}

// Create event when stream is closed
stream.on('close', function () {
    console.log('Completed!');
    // Here you can create stream for reading results or similar
});

// Close the stream after batch insert
stream.end();
Also, you can create as many streams as you want and open/close them as you want at any time.
There are several examples of using a Redis stream in Node.js in the redis-stream node module.
In node_redis, all commands are pipelined:
https://github.com/mranney/node_redis/issues/539#issuecomment-32203325
You might want to look at batch() too. The reason why it'd be slower with multi() is because it's transactional. If something failed, nothing would be executed. That may be what you want, but you do have a choice for speed here.
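A rough sketch with node_redis (keys and values are illustrative): batch() queues the commands client-side and sends them together, without the MULTI/EXEC transaction that multi() wraps around them.

var redis = require('redis');
var client = redis.createClient();

var batch = client.batch();
for (var i = 0; i < 10000; i++) {
    batch.set('key' + i, 'value' + i);
}
batch.exec(function (err, replies) {
    if (err) throw err;
    console.log('inserted', replies.length, 'keys');
    client.quit();
});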
The redis-stream package doesn't seem to make use of Redis' mass-insert functionality, so it's also slower than the mass insert that Redis' site describes using redis-cli.
Another idea would be to use redis-cli and give it a file to stream from, which this NPM package does: https://github.com/almeida/redis-mass
Not keen on writing to a file on disk first? This repo: https://github.com/eugeneiiim/node-redis-pipe/blob/master/example.js
...also streams to Redis, but without writing to file. It streams to a spawned process and flushes the buffer every so often.
On Redis' site under mass insert (http://redis.io/topics/mass-insert) you can see a little Ruby example. The repo above basically ported that to Node.js and then streamed it directly to that redis-cli process that was spawned.
So in Node.js, we have:
var redisPipe = spawn('redis-cli', ['--pipe']);
spawn() returns a reference to a child process that you can pipe to with stdin. For example: redisPipe.stdin.write().
You can just keep writing to a buffer, streaming it to the child process, and then clearing it every so often. This won't fill up memory and will therefore be a bit lighter than perhaps the node_redis package (whose docs literally say that data is held in memory), though I haven't looked into it that deeply, so I don't know what the memory footprint ends up being. It could be doing the same thing.
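A rough sketch of that approach (keys, values, and the flush threshold are illustrative): build commands in the Redis protocol and stream them into a spawned redis-cli --pipe, flushing the in-memory buffer every so often.

var spawn = require('child_process').spawn;

var redisPipe = spawn('redis-cli', ['--pipe']);
redisPipe.stdout.pipe(process.stdout);
redisPipe.stderr.pipe(process.stderr);

// Encode one command in the Redis protocol, as the mass-insert docs describe.
function toRedisProtocol(args) {
    var out = '*' + args.length + '\r\n';
    args.forEach(function (arg) {
        arg = String(arg);
        out += '$' + Buffer.byteLength(arg) + '\r\n' + arg + '\r\n';
    });
    return out;
}

var buffer = '';
for (var i = 0; i < 100000; i++) {
    buffer += toRedisProtocol(['SET', 'key' + i, 'value' + i]);
    if (buffer.length > 64 * 1024) {              // flush every so often
        redisPipe.stdin.write(buffer);
        buffer = '';
    }
}
redisPipe.stdin.write(buffer);
redisPipe.stdin.end();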
Of course keep in mind that if something goes wrong, it all fails. That's what tools like fluentd were created for (and that's yet another option: http://www.fluentd.org/plugins/all - it has several Redis plugins)...But again, it means you're backing data on disk somewhere to some degree. I've personally used Embulk to do this too (which required a file on disk), but it did not support mass inserts, so it was slow. It took nearly 2 hours for 30,000 records.
One benefit to a streaming approach (not backed by disk) is if you're doing a huge insert from another data source. Assuming that data source returns a lot of data and your server doesn't have the hard disk space to support all of it - you can stream it instead. Again, you risk failures.
I find myself in this position as I'm building a Docker image that will run on a server with not enough disk space to accommodate large data sets. Of course it's a lot easier if you can fit everything on the server's hard disk...But if you can't, streaming to redis-cli may be your only option.
If you are really pushing a lot of data around on a regular basis, I would probably recommend fluentd to be honest. It comes with many great features for ensuring your data makes it to where it's going and if something fails, it can resume.
One problem with all of these Node.js approaches is that if something fails, you either lose it all or have to insert it all over again.
By default, node_redis, the Node.js library, sends commands in pipelines and automatically chooses how many commands go into each pipeline (https://github.com/NodeRedis/node-redis/issues/539#issuecomment-32203325). Therefore, you don't need to worry about this. However, other Redis clients may not use pipelines by default; you will need to check the client documentation to see how to take advantage of pipelining.

What are good sources to study the threading implementation of an XMPP application?

From my understanding, the XMPP protocol is based on an always-on connection where you have no immediate indication of when an XML message ends.
This means you have to evaluate the stream as it comes. This also means that, probably, you have to deal with asynchronous connections since the socket can block in the middle of an XML message, either due to message length or a connection being slow.
I would appreciate one source per answer so we can mod them up and see what's the favourite.
Do you want to deal with multiple connections at once? Good async socket processing is a must in that case, to avoid one thread per connection.
Otherwise, you just need an XML parser that can deal with a chunk of bytes at a time. Expat is the canonical example; if you're in Java, try XP. These types of XML parsers fire events as soon as they can, and buffer partial stanzas until the rest arrives.
Now, to address your assertion that there is no notification when a stanza ends, that's not really true. The important thing is not to process the XML stream as if it is a sequence of documents. Use the following pseudo-code:
stanza = null
while parser has more:
    switch on token type:
    START_TAG:
        elem = create element from parser state
        if stanza is not null:
            add elem as child of stanza
        stanza = elem
    END_TAG:
        parent = parent of stanza
        if parent is null:
            fire OnStanza event with stanza
        stanza = parent
This approach should work with an event-based or pull parser, and it only requires holding on to one pointer's worth of state. Obviously, you'll also need to handle attributes, character data, entity references (such as &amp;), and special-case the stream:stream tag so that it never becomes part of the stanza tree (which is what makes the null-parent check above fire once per top-level stanza), but this should get you started.
Igniterealtime.org provides an open-source XMPP server and client written in Java.
ejabberd is written in Erlang. I don't know the details of the ejabberd implementation, but one advantage of using Erlang is really inexpensive threads. I'll speculate that they start a thread per XMPP connection. In Erlang terminology these would be called processes, but they are not protected-memory address spaces; they are lightweight user-space threads.
