Using gRPC in client that must block on other fds - how? - linux

I have a C++ client binary that uses epoll() to block on various OS descriptors - for file I/O, socket I/O, OS timers; and now it also needs to be a gRPC client (including streaming replies).
Reading answers to related questions across the web (e.g. about servers) it appears that there is no easy way from C/C++ to ask gRPC to give an fd that can be incorporated into the existing epoll set and then trigger gRPC to do the read & processing for the incoming response. Is that correct?
Alternatively, is the reverse possible: to use files, sockets and timers that are otherwise unrelated to the gRPC service via the gRPC core iomgr framework (for reading local files, communicating with external network equipment and managing the client's internal high-frequency timer needs)?
The client in question is a single thread with RT priority (on an embedded (soft) real-time system using PREEMPT_RT). Given that gRPC creates other threads, could that be a problem?

Unfortunately, this isn't possible today, but it will be in the future once we finish the EventEngine effort. Once ready, that will allow you to inject your own event loop into gRPC. However, that probably won't be ready for public use until sometime next year.
For now, the only suggestion I can offer is that if you're using an insecure channel and don't need any name resolution or load balancing functionality, you may be able to use CreateInsecureChannelFromFd() to do the reverse (provide your own fd to use as a gRPC connection).
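For context, the kind of loop the question describes can be sketched as below. This is a minimal illustration in Python rather than C++, and it does not use gRPC at all: the worker thread is a hypothetical stand-in for a library that runs its own threads, and the pipe it writes to is an assumed hand-off mechanism, not something the answer above proposes. selectors.DefaultSelector is backed by epoll on Linux.

```python
import os
import selectors
import socket
import threading


def run_loop():
    """Single-threaded loop that blocks on several descriptors at once."""
    sel = selectors.DefaultSelector()  # epoll-backed on Linux

    # Stand-in for ordinary socket I/O the client already handles.
    sock_a, sock_b = socket.socketpair()
    # Pipe written by a background thread (hypothetical stand-in for a
    # library that insists on doing its work on its own threads).
    wake_r, wake_w = os.pipe()

    sel.register(sock_a, selectors.EVENT_READ, "socket")
    sel.register(wake_r, selectors.EVENT_READ, "worker")

    def worker():
        # Runs outside the loop thread and hands its result over via the pipe.
        os.write(wake_w, b"reply")

    thread = threading.Thread(target=worker)
    thread.start()
    sock_b.sendall(b"ping")

    seen = {}
    while len(seen) < 2:
        events = sel.select(timeout=5)
        if not events:
            break  # nothing arrived in time; give up instead of spinning
        for key, _mask in events:
            if key.data == "socket":
                seen["socket"] = sock_a.recv(16)
            else:
                seen["worker"] = os.read(wake_r, 16)

    thread.join()
    sel.close()
    sock_a.close()
    sock_b.close()
    os.close(wake_r)
    os.close(wake_w)
    return seen


if __name__ == "__main__":
    print(run_loop())
```

The loop thread only ever blocks in select(); whether extra threads are acceptable next to an RT-priority thread is a separate question that this sketch does not answer.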

Related

Does nodejs http/s module use worker threads to listen for a request?

I had a conversation with a co-worker that went down a rabbit hole about threads, and I was wondering whether something like expressjs, which uses the nodejs built-in https module, uses workers to listen for connections for each network request, or some other design.
Would anyone know how HTTP-type servers normally wait for connections? Threads? Workers?
Does nodejs http/s module use worker threads to listen for a request?
No, it does not. Nodejs is an event driven system. Wherever possible, it uses an underlying asynchronous capability in the OS and that is what it does for all TCP/UDP networking, including http.
So, it uses native asynchronous features in the OS. When some networking event happens (such as an incoming connection or an incoming packet), the OS notifies nodejs, which inserts an event into the event queue. When nodejs is not doing something else, it goes back to the event queue, pulls the next event out and runs the code associated with that event.
This is all managed for nodejs by the libuv library which provides the OS compatibility layer upon which nodejs runs (on several different platforms). Networking is part of what libuv provides.
This does not involve having a separate thread for every TCP socket. libuv itself uses a few system threads in order to do its job, but it does not create a new thread for each TCP socket you have connected and it's important to note that it is using the native asynchronous networking interfaces in the operating system (not the blocking interfaces).
The libuv project is here (including the source) if you want to learn more about that specific library.
libuv does use a thread pool for some operations that don't have a consistent, reliable asynchronous interface in the OS (like file system access), but that thread pool is not used or needed for networking.
There are also a zillion articles on the web about how various aspects of the nodejs event loop work, including some in the nodejs documentation itself.
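The idea of serving many connections from one thread can be sketched as follows. This is an illustration in Python using the standard selectors module, not how nodejs or libuv is implemented; the function name and the "req0", "req1" payloads are made up for the example. One selector watches the listening socket and every accepted connection, and the thread count is compared before and after to show that no per-connection thread was started.

```python
import selectors
import socket
import threading


def serve_without_threads(n_clients=3):
    """Accept and read from several connections using one thread."""
    threads_before = threading.active_count()

    sel = selectors.DefaultSelector()
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, "accept")

    # Clients live in the same process only to keep the sketch self-contained.
    clients = [socket.create_connection(listener.getsockname())
               for _ in range(n_clients)]
    for i, client in enumerate(clients):
        client.sendall(b"req%d" % i)

    received, conns = [], []
    while len(received) < n_clients:
        events = sel.select(timeout=5)
        if not events:
            break
        for key, _mask in events:
            if key.data == "accept":
                # The OS reported a pending connection: accept it and watch it.
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, "read")
                conns.append(conn)
            else:
                # The OS reported readable data on an accepted connection.
                received.append(key.fileobj.recv(64))
                sel.unregister(key.fileobj)

    extra_threads = threading.active_count() - threads_before
    for s in clients + conns + [listener]:
        s.close()
    sel.close()
    return sorted(received), extra_threads


if __name__ == "__main__":
    print(serve_without_threads())
```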

What is the best way to communicate between two servers?

I am building a web app which has two parts. In one part it uses a real time connection between the server and the client and in the other part it does some cpu intensive task to provide relevant data.
I am implementing the real-time communication in nodejs and the CPU-intensive part in python/java. What is the best way for the nodejs server to take part in duplex communication with the other server?
For a basic solution you can use Socket.IO if you are already using it and know how it works, it will get the job done since it allows for communication between a client and server where the client can be a different server in a different language.
If you want a more robust solution with additional options and controls or which can handle higher traffic throughput (though this shouldn't be an issue if you are ultimately just sending it through the relatively slow internet) you can look at something like ØMQ (ZeroMQ). It is a messaging queue which gives you more control and lots of different communications methods beyond just request-response.
When you set either up, I would recommend using your CPU-intensive server as the stable end (server) and your web server(s) as the clients, assuming that you are using a single server for your CPU-intensive tasks and running several NodeJS server instances to take advantage of multiple cores for your web server. This simplifies your communication, since you want to have a single point to connect to.
If you foresee needing multiple CPU servers you will want to setup a routing server that can route between multiple web servers and multiple CPU servers and in this case I would recommend the extra work of learning ØMQ.
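The "CPU server as the stable end, web server as the client" layout can be sketched with plain TCP sockets. This is neither Socket.IO nor ØMQ; it is a minimal stand-in that assumes newline-delimited JSON as the wire format, and the function names and the "sum" job are invented for illustration. Both ends run in one process here only so the sketch is self-contained.

```python
import json
import socket
import threading


def cpu_server(listener):
    """Stable end: accepts one web-server connection and answers its jobs."""
    conn, _addr = listener.accept()
    with conn, conn.makefile("rwb") as stream:
        for line in stream:  # one JSON job per line, until the peer closes
            job = json.loads(line)
            result = {"id": job["id"], "sum": sum(job["numbers"])}
            stream.write(json.dumps(result).encode() + b"\n")
            stream.flush()


def run_jobs():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    server = threading.Thread(target=cpu_server, args=(listener,), daemon=True)
    server.start()

    replies = []
    # The web server connects out to the single, well-known CPU server.
    with socket.create_connection(listener.getsockname()) as web, \
            web.makefile("rwb") as stream:
        for i in range(2):
            job = {"id": i, "numbers": [i, 10]}
            stream.write(json.dumps(job).encode() + b"\n")
            stream.flush()
            replies.append(json.loads(stream.readline()))

    server.join(timeout=5)
    listener.close()
    return replies


if __name__ == "__main__":
    print(run_jobs())
```

The connection stays open across jobs, so either side could also push messages unprompted; a real deployment would add reconnection and framing checks that are omitted here.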
You can use the http.request method that node provides to make HTTP requests (much like curl) from within node code.
The http.request method is also used for implementing authentication APIs.
You can put your callback in the request's success handler, and when the response data arrives in node, you can send it back to the user.
Meanwhile, in the background, the java/python server can serve node's requests for the CPU-intensive task.
I maintain a node.js application that intercommunicates among 34 tasks spread across 2 servers.
In your case, for communication between the web server and the app server you might consider mqtt.
I use mqtt for this kind of communication. There are mqtt clients for most languages, including node/javascript, python and java. In my case I publish json messages using mqtt 'topics', and any task that has registered to subscribe to a 'topic' receives its data when published. If you google "pub sub", "mqtt" and "mosquitto" you'll find lots of references and examples. Mosquitto (now an Eclipse project) is only one of a number of mqtt brokers that are available. Another very good broker that is written in Java is called hivemq.
This is a very simple, reliable solution that scales well. In my case literally millions of messages reliably pass through mqtt every day.
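The topic-based publish/subscribe pattern described above can be illustrated with a toy in-process broker. This is not MQTT and uses no MQTT client library; ToyBroker, the topic name and the payload are all hypothetical, and a real broker such as Mosquitto would sit between separate processes or machines.

```python
import json


class ToyBroker:
    """In-process stand-in for a broker: routes JSON messages by topic."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        # Serialise once, as a networked broker would only see bytes/text.
        message = json.dumps(payload)
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, json.loads(message))
        return len(callbacks)  # number of subscribers that were notified


def run_pubsub():
    broker = ToyBroker()
    web_inbox, log_inbox = [], []
    # Two independent tasks register interest in the same topic.
    broker.subscribe("jobs/result", lambda t, m: web_inbox.append(m))
    broker.subscribe("jobs/result", lambda t, m: log_inbox.append((t, m)))

    delivered = broker.publish("jobs/result", {"id": 7, "ok": True})
    ignored = broker.publish("jobs/other", {"id": 8})  # nobody subscribed
    return delivered, ignored, web_inbox, log_inbox


if __name__ == "__main__":
    print(run_pubsub())
```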
You must be looking for socketio
Socket.IO enables real-time bidirectional event-based communication.
It works on every platform, browser or device, focusing equally on reliability and speed.
Sockets have traditionally been the solution around which most realtime systems are architected, providing a bi-directional communication channel between a client and a server.

poll system call in linux drivers

I am learning Linux internals and came across the poll system call. As far as I understand, drivers use it to provide notification when data is ready to be read from the device and when the device is ready to accept data.
If the device has no data to read, the process sleeps and wakes up when data becomes available, and vice versa for the write case.
Can someone provide me concrete understanding of poll system call with some real example?
poll and select (the latter is very similar to poll, with some differences) are system calls used in the so-called asynchronous event-driven approach to handling clients' requests.
Basically, in network programming there are two major strategies for handling many connections from network clients by the server:
1) the more traditional threaded or process-oriented approach. Here the network server has a main process which listens on one specific network port (port 80 in the case of web servers) for incoming connections, and when a connection arrives, it spawns a new thread/process to handle it. The Apache HTTP server took this approach.
2) the aforementioned asynchronous event-driven approach, where (in the simplest case) the network server (for example a web server) is an application with only one process; it accepts connections (creating a socket for each new client) and then monitors those sockets with poll/select for incoming data. The Nginx web server took this approach.
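From user space, the sleep-until-ready behaviour of poll can be seen with a pipe. The sketch below uses Python's select.poll wrapper around the system call; it shows only the caller's side, not the driver-side poll handler the question asks about, and the helper names are made up for the example.

```python
import os
import select


def wait_readable(fd, timeout_ms):
    """Block in poll() until fd is readable or the timeout expires."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)  # list of (fd, event mask) pairs
    return [(ready_fd, bool(mask & select.POLLIN)) for ready_fd, mask in events]


def poll_pipe():
    read_fd, write_fd = os.pipe()
    try:
        # Nothing has been written yet: poll() sleeps, then times out.
        before = wait_readable(read_fd, 50)
        os.write(write_fd, b"x")
        # Data is now pending: poll() returns immediately with POLLIN set.
        after = wait_readable(read_fd, 50)
        return before, after == [(read_fd, True)]
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(poll_pipe())
```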

winsock application and multithreading - listening to socket event from another thread

assume we have an application which uses winsock to implement tcp communication.
for each socket we create a thread and block-receiving on it.
when data arrives, we would like to notify other threads (listening threads).
i was wondering what is the best way to implement this:
1) move away from this design and use a non-blocking socket, then the listening thread will have to iterate constantly and call a non-blocking receive, thus making it thread safe (no extra threads for the sockets)
2) use asynchronous procedure calls to notify listening threads - which again will have to alert-wait for apc to queue for them.
3) implement some thread safe message queue, where each socket thread will post messages to it, and the listener, again, will go over it every interval and pull data from it.
also, i read about WSAAsyncSelect, but i saw that this is used to send messages to a window. isnt there something similar for other threads? (well i guess apcs are...)
Thanks!
Use I/O completion ports. See the CreateIoCompletionPort() and the GetQueuedCompletionStatus() functions of the Win32 API (under File Management functions). In this instance, the socket descriptors are used in place of file handles.
You'll always be better off abstracting the mechanics of socket API (listening, accepting, reading & writing) in a separate layer from the application logic. Have an object that captures the state of a connection, which is created during an incoming connection and you can maintain buffers in this object for the incoming and outgoing traffic. This will allow your network interface layer to be independent of the application code. This will also make the code cleaner by separating the application functionality from the underlying communication mechanism.
The choice between blocking and non-blocking sockets depends on the level of scalability that your application needs to achieve. If your application needs to support hundreds of incoming connections, adopting a thread-per-socket approach is not going to be very wise. You'll be better off going for an I/O completion ports based implementation, which will make your app immensely scalable at the cost of added code complexity. However, if you only foresee a few tens of connections at any point in time, you can go for an asynchronous sockets model using Win32 events or messages. The Win32 events based approach doesn't scale very well beyond a certain limit, as you would have to manage multiple threads if the number of concurrent sockets exceeds 63 (WaitForMultipleObjects can wait on at most 64 handles). The Windows message based mechanism doesn't have this limitation, though. OTOH, the Win32 event based approach does not require a GUI window to work.
Check out WSAEventSelect along with WSAAsyncSelect API documentation in MSDN.
You might want to take a look at boost::asio package as well. It provides a neat (though a little complex) C++ abstraction over sockets API.
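The thread-safe message queue option from the question can be sketched as below. This is Python with socket.socketpair and queue.Queue, not Winsock, and it is not the I/O completion port design recommended above; the names are hypothetical. Note that the listener blocks on the queue instead of checking it at intervals.

```python
import queue
import socket
import threading


def socket_reader(sock, name, inbox):
    """One thread per socket, blocking in recv() as in the question's design."""
    while True:
        data = sock.recv(1024)
        if not data:
            inbox.put((name, None))  # None marks "peer closed"
            return
        inbox.put((name, data))


def run_queue_listener(n_sockets=2):
    inbox = queue.Queue()  # thread-safe, so no extra locking is needed
    pairs = [socket.socketpair() for _ in range(n_sockets)]
    readers = []
    for i, (rx, _tx) in enumerate(pairs):
        reader = threading.Thread(target=socket_reader,
                                  args=(rx, "sock%d" % i, inbox))
        reader.start()
        readers.append(reader)

    # Simulate remote peers sending one message each and disconnecting.
    for i, (_rx, tx) in enumerate(pairs):
        tx.sendall(b"msg%d" % i)
        tx.close()

    # Listener thread: sleeps in get() until a socket thread posts something.
    got, closed = [], 0
    while closed < n_sockets:
        name, data = inbox.get(timeout=5)
        if data is None:
            closed += 1
        else:
            got.append((name, data))

    for reader in readers:
        reader.join()
    for rx, _tx in pairs:
        rx.close()
    return sorted(got)


if __name__ == "__main__":
    print(run_queue_listener())
```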

How can a connection be represented only by a small space in a HTTP server running on node.js?

I have read that a HTTP server created in node.js does not create new threads for each incoming connection(request). Instead it executes a function that has been registered as a callback corresponding to the event of receiving a request.
It is said that each connection is represented by some small space in the heap. I cannot figure this out. Are connections not represented by sockets ? Should sockets not be opened for every connection made to the node.js server and this would mean each connection cannot be represented by just a space allocation in the javascript heap ?
It is described on the nodejs.org website that instead of spawning threads (2 MB overhead per thread!) per connection, the server uses select(), epoll, kqueue or /dev/poll to wait until a socket is ready to read / write. It is this method that allows node to avoid thread spawning per connection, and the overhead is that associated with the heap allocation of the socket descriptor for the connection. This implementation detail is largely hidden from developers, and the net.Socket API exposed by the runtime provides everything you need to take advantage of that feature without even thinking about it.
Node also exposes its own event API through events.EventEmitter. Many node objects implement events to provide asynchronous (non-blocking) event notification, which is perfect for I/O operations, which in other languages - such as PHP - are synchronous (blocking) by default. In the case of the node net.Socket API, events are triggered for several API methods dealing with socket I/O, and the callbacks that are passed by parameter to these methods are triggered when an event occurs. Events can have callback functions bound to them in a variety of different ways; accepting a callback function as a parameter is only a convenience for the developer.
Finally, do not confuse OS events with nodejs events. In the case of the net API, OS events are passed to the nodejs runtime, but nodejs events are javascript.
I hope this helps.
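The distinction between OS events and application-level events can be sketched like this. The Emitter class is a hypothetical, much-simplified imitation of the on/emit idea written in Python; it is not node's events.EventEmitter and not how the node runtime is built. The selector reports the OS-level readiness event, and the loop turns it into a "data" event for ordinary callbacks.

```python
import selectors
import socket


class Emitter:
    """Tiny on/emit registry: application-level events, no OS involvement."""

    def __init__(self):
        self._listeners = {}

    def on(self, event, callback):
        self._listeners.setdefault(event, []).append(callback)
        return self

    def emit(self, event, *args):
        listeners = list(self._listeners.get(event, []))
        for callback in listeners:
            callback(*args)
        return bool(listeners)  # True if anyone was listening


def run_emitter():
    emitter = Emitter()
    chunks = []
    emitter.on("data", chunks.append)

    local, remote = socket.socketpair()
    sel = selectors.DefaultSelector()
    sel.register(local, selectors.EVENT_READ)
    remote.sendall(b"hello")

    # OS event: the selector reports that the socket is readable.
    for key, _mask in sel.select(timeout=5):
        # Application event: hand the bytes to whoever subscribed to "data".
        emitter.emit("data", key.fileobj.recv(64))

    unheard = emitter.emit("close")  # no listener registered for this one
    sel.close()
    local.close()
    remote.close()
    return chunks, unheard


if __name__ == "__main__":
    print(run_emitter())
```

The per-connection state here is just the socket object plus its selector registration, which is the "small space in the heap" the question asks about.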
