Communicating with processes on the same host using internet sockets? - linux

I am building a message layer for processes running on an embedded Linux system. I am planning to use sockets. This system might be ported to different operating systems down the road, so portability is a concern. Performance is a lower priority than portability.
I have a few questions regarding my way forward.
I am thinking of using internet sockets over TCP/IP for this communication between local processes for the sake of portability. Is there any reason I should not do that and should use domain sockets instead?
Does using internet sockets instead of domain sockets really improve portability?
If this is indeed the way forward, can you point me in the right direction (how to assign ports to each process, etc.) with some online resources?
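For illustration only, here is a minimal sketch (not part of the original question) of a loopback-only TCP listener for one such process. The port number is an assumption; in practice each process would need its own agreed-upon port, e.g. from a shared configuration file.

    /* Sketch: one process listens on an assumed, per-process port, bound to loopback only. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        const unsigned short port = 5100;    /* hypothetical port reserved for this process */

        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return EXIT_FAILURE; }

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* 127.0.0.1 only: not reachable externally */

        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) { perror("bind"); return EXIT_FAILURE; }
        if (listen(fd, 8) < 0) { perror("listen"); return EXIT_FAILURE; }

        int client = accept(fd, NULL, NULL);             /* a peer process connects to 127.0.0.1:5100 */
        if (client >= 0) {
            char buf[256];
            ssize_t n = read(client, buf, sizeof(buf));
            if (n > 0) write(client, buf, (size_t)n);    /* echo the message back */
            close(client);
        }
        close(fd);
        return 0;
    }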

Related

How to avoid DBus for Linux in an embedded environment?

I am working on a Linux-based embedded project with C/C++ and Python applications, and we need an Inter-Process Communication (IPC) method to transport JSON-based messages between those applications. Initially DBus was an obvious option, since it is present in almost all Linux distributions and is stable, proven software, and there are libraries for many programming languages. DBus also has a very granular and nice permission system, which is a requirement for our project (for security reasons).
But unfortunately we have experienced some drawbacks of DBus:
We have hit some stability bugs: in certain congestion situations there were memory leaks which led to dead IPC, and only an application restart helped.
Merely using DBus introduced 3-5 MB of RAM usage per application (which, on a system with 512 MB of RAM and multiplied by 25 applications, leaves some room for improvement).
The data flow model (signals / methods) seems a bit too complicated for the use case we need.
Our next idea is to switch to one of the available message brokers, but we are also looking for some nice-to-have features:
The ability to broadcast or multicast messages to multiple applications.
Presence notification when applications connect to or disconnect from the bus server (the server can broadcast when new applications connect and when applications disconnect).
A watchdog for connected applications. Sometimes an app might misbehave on the IPC bus (by not answering IPC messages), and a server-side watchdog could detect that, disconnect the application, and inform the others that the application is dead.
How do we avoid DBus in this scenario?

Does the OS perform optimization of TCP/IP when used locally?

I am building a cross-platform application consisting of several modules that exchange data with each other.
That means my question relates to both Windows and Linux.
Q: When using TCP/IP for inter-process communication, is there any kind of special optimization performed by the OS when both endpoints are on localhost?
I've heard somewhere that in this case Windows can bypass the network drivers and just use shared memory. I have no source or proof for this statement, but the idea of switching off unused machinery sounds logical.
Is that true, and if so, where can I read the details?

Domain argument to socket() and socketpair()

I've been studying Linux socket programming recently, and the concepts are still swirling and unsettled in my head. Can someone confirm or correct my understanding of the domain argument to socket() and socketpair(): one should choose PF_LOCAL (or PF_UNIX) if one wants the socket communication to be strictly within the same computer, and one should choose PF_INET if the socket communication is meant to be between different computers -- is that correct?
No, the domain argument specifies the communications domain you want to use. See the man page for socket. For example, AF_INET means IPv4 internet protocols, AF_INET6 means IPv6 internet protocols, AF_APPLETALK means AppleTalk, and so forth. You almost certainly want AF_INET or AF_INET6.
Whether the other program you'll be communicating with is on the same machine or not isn't really relevant since you can communicate with the local host just fine using internet protocols.
However, there is a small performance penalty associated with using the internet domain protocols. If your application will be connecting only with other applications on the same machine, using the AF_LOCAL/AF_UNIX domain will be faster and will offer you some additional advantages such as file-level security controls on the sockets. Just be aware that you won't be able to use your code between different computers without modifying it if you go that route.
A good discussion of the pros and cons of this choice can be found here.
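As a rough illustration of that difference (a sketch with an assumed port and socket path, not taken from the answer above), the domain argument changes both the constant passed to socket() and the address structure handed to bind():

    /* Sketch: the same stream socket created in two different domains. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void)
    {
        /* Internet domain: addressed by IP and port, works locally or across machines. */
        int inet_fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in in_addr;
        memset(&in_addr, 0, sizeof(in_addr));
        in_addr.sin_family = AF_INET;
        in_addr.sin_port = htons(7000);                    /* hypothetical port */
        in_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        bind(inet_fd, (struct sockaddr *)&in_addr, sizeof(in_addr));

        /* Unix domain: same machine only, addressed by a filesystem path,
         * so ordinary file permissions control access to the socket. */
        int unix_fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un un_addr;
        memset(&un_addr, 0, sizeof(un_addr));
        un_addr.sun_family = AF_UNIX;
        strncpy(un_addr.sun_path, "/tmp/demo.sock", sizeof(un_addr.sun_path) - 1);  /* hypothetical path */
        unlink(un_addr.sun_path);                          /* remove a stale socket file, if any */
        bind(unix_fd, (struct sockaddr *)&un_addr, sizeof(un_addr));

        close(inet_fd);
        close(unix_fd);
        return 0;
    }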

Communication between processes: TCP vs Unix sockets, IPC vs NATS

I'm breaking a big application into several processes and I want the processes to communicate with each other.
For now everything will be on the same server, but later several servers on the same local network will have processes that need to communicate with each other (meaning a service on one server talking to a service on another server in the same VPC).
So my raw options are TCP or Unix sockets. I know that Unix sockets are only useful if you're on the same server, but we're thinking about writing our own implementation in which processes on the same server communicate over Unix sockets and processes on different servers communicate over TCP.
Is it worth it? Of course TCP sockets are slower than Unix sockets, since Unix socket traffic doesn't go through the network stack and doesn't get wrapped with TCP-related data. The question is: by how much? I couldn't find any published benchmarks comparing TCP and Unix sockets. If TCP adds 3%-5% overhead that's fine, but can it be more than that? I'd like to learn from the experience of big projects and other people over the years, but I didn't find anything relevant.
next...
Our project is a Node.js project.
Some people may say I could use a message broker, so I compared nats.io with node-ipc (https://www.npmjs.com/package/node-ipc) and found that node-ipc is 4 times faster, but NATS has the nice publish-subscribe feature... and performance is important.
So I have tons of options and no concrete decision.
Any information regarding the issue would be greatly appreciated.
The question is actually too broad to answer, but here is one answer for TCP vs Unix domain sockets:
Architect your code so that you can easily move between the two if necessary. The programming model is basically the same (both are bidirectional streams of data), and the read/write APIs at the OS level, as well as in most frameworks, are the same. In Node, for example, both inherit from the Readable/Writable stream interfaces. That means the only code you need to change when switching between them is the listener on the server side, where you call the TCP accept APIs instead of the Unix domain socket accept APIs, or the other way around. You can even have your application accept both types of connections and handle them identically afterwards.
TCP support is always nice because it gives you some flexibility. In my last measurement the overhead was a little bit higher (I think 30% versus TCP over loopback), but these are all micro-benchmarks and it won't matter for most applications. Unix domain sockets might have an advantage if you require some of their special features, e.g. the ability to send file descriptors across them.
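The same point can be sketched at the C level (the port and socket path below are assumptions): only the listener setup differs, and everything after accept() handles the connection identically.

    /* Sketch: the bind/listen step is the only transport-specific part;
     * the echo loop after accept() is shared between TCP and Unix sockets. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    static int listen_tcp(uint16_t port)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));
        listen(fd, 16);
        return fd;
    }

    static int listen_unix(const char *path)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
        unlink(path);                      /* remove a stale socket file, if any */
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));
        listen(fd, 16);
        return fd;
    }

    /* Transport-agnostic part: echo whatever the peer sends. */
    static void handle_connection(int conn)
    {
        char buf[512];
        ssize_t n;
        while ((n = read(conn, buf, sizeof(buf))) > 0)
            write(conn, buf, (size_t)n);
        close(conn);
    }

    int main(int argc, char **argv)
    {
        int listener = (argc > 1 && strcmp(argv[1], "--tcp") == 0)
                         ? listen_tcp(7100)                /* hypothetical port */
                         : listen_unix("/tmp/app.sock");   /* hypothetical path */
        for (;;)
            handle_connection(accept(listener, NULL, NULL));
    }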
And regarding TCP vs NATS & Co:
If you are not that experienced with network programming and protocol design, it makes sense to use ready-made IPC systems. That could be anything from HTTP to gRPC to Thrift. These are all point-to-point systems. NATS is different, since it's a message broker and not RPC. It also requires an extra component in the middle. Whether this makes sense depends entirely on the application.

Linux Networking Kernel

I am reading about kernel networking in Linux. I found this article helpful: http://www.ibm.com/developerworks/linux/library/l-linux-networking-stack/. After reading it I have a doubt: if I create some software on the Linux platform, let's say a chat program, do I have to make use of all those APIs (sk_buff and so on) to connect to another network? Please help me with it.
sk_buff is a kernel structure that is part of the kernel's TCP/IP stack. You shouldn't need to touch it directly, and in practice you would find it difficult to do so.
What you need instead is to learn the user-space APIs for network communication. For quickly learning the basics of network communication on Unix, it's tough to beat Beej's Guide.
If you want to create chat software, I would recommend checking out BSD sockets or any TCP/IP network programming guide for Linux. You don't need to understand what's going on inside the kernel in order to program a chat application.
sk_buff is relevant if you would like to create a new device driver, but you seem to be working above the protocol level.
If you want to create a chat application, you would create a server socket (listener) and clients that connect to the address where your server is listening and exchange information over TCP/IP.
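As a rough sketch of that listener/client pattern (in C, with an assumed port and most error handling omitted), the two halves look like this; running one instance with the argument server and another without it exchanges a single message over TCP:

    /* Minimal listener/client sketch (assumed port 9000, error handling trimmed). */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define PORT 9000   /* hypothetical chat port */

    /* Server: listen, accept one client, echo its messages back. */
    static void run_server(void)
    {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(PORT),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(srv, (struct sockaddr *)&addr, sizeof(addr));
        listen(srv, 4);

        int conn = accept(srv, NULL, NULL);
        char buf[256];
        ssize_t n;
        while ((n = read(conn, buf, sizeof(buf))) > 0)
            write(conn, buf, (size_t)n);        /* echo back to the client */
        close(conn);
        close(srv);
    }

    /* Client: connect to the server and send one line. */
    static void run_client(const char *msg)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(PORT) };
        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
        connect(fd, (struct sockaddr *)&addr, sizeof(addr));
        write(fd, msg, strlen(msg));
        char buf[256];
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) printf("server echoed: %.*s\n", (int)n, buf);
        close(fd);
    }

    int main(int argc, char **argv)
    {
        if (argc > 1 && strcmp(argv[1], "server") == 0)
            run_server();
        else
            run_client("hello from client\n");
        return 0;
    }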
sk_buff is the kernel data structure for a socket buffer. You do not have to touch it for your chat server. If you have taken an OS class you will have noticed that there is a process structure (struct task_struct in Linux), but does that mean you have to use it when you write a program? No. sk_buff is a similar case: the Linux kernel uses it to buffer certain data, and you don't have to be concerned with it.
For your chat server, have a look at Beej's guide; if I am not mistaken it includes a chat server implementation, and it is the best guide I know of for getting started with network programming on Linux, filled with humour. For a deeper understanding of network programming, look at Richard Stevens' Unix Network Programming, Volume 1 and Volume 2; it is considered the bible of network programming.