How to avoid DBus in an embedded Linux environment?

I am working on a Linux-based embedded project with C/C++ and Python applications, and we need an inter-process communication (IPC) method to transport JSON-based messages between those applications. Initially DBus was an obvious option, since it is present in almost all Linux distributions, it is stable and proven software, and there are libraries for many programming languages. DBus also has a very granular and nice permission system, which is a requirement for our project for security reasons.
Unfortunately, we have experienced some drawbacks with DBus:
We have hit stability bugs: in some specific congestion situations there were memory leaks which led to dead IPC, and only an application restart helped.
Merely using DBus added 3-5 MB of RAM per application (which, on a system with 512 MB of RAM and multiplied by 25 applications, leaves noticeable room for improvement).
The data-flow model (signals/methods) seems a bit too complicated for our use case.
Our next idea is to switch to one of the available message brokers, but we are also looking for some nice-to-have features:
The ability to broadcast or multicast messages to multiple applications.
Presence information for applications as they connect to or disconnect from the bus server (the server broadcasts when new applications connect and when applications disconnect; a minimal sketch of this idea follows below).
A watchdog for connected applications. Sometimes apps misbehave on the IPC bus (for example by not answering IPC messages); a server-side watchdog could detect this, disconnect the offending application, and inform the others that it is dead.
How do we avoid DBus in this scenario?
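If a plain socket-based hub turns out to be enough, the presence feature in particular is not much code. Below is a minimal, purely illustrative sketch (the socket path /tmp/ipc-hub.sock and the JSON event format are invented for this example, not part of any existing broker): a Unix-domain-socket hub that relays messages and broadcasts connect/disconnect events.

```c
/* presence_hub.c - illustrative sketch only: a tiny Unix-domain-socket hub
 * that broadcasts a JSON presence message whenever a client connects or
 * disconnects. Socket path and message format are invented for this example. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/un.h>

#define SOCK_PATH "/tmp/ipc-hub.sock"
#define MAX_FDS   65                          /* 1 listener + up to 64 clients */

static void broadcast(struct pollfd *fds, int nfds, const char *msg, int skip)
{
    for (int i = 1; i < nfds; i++)            /* fds[0] is the listening socket */
        if (i != skip)
            send(fds[i].fd, msg, strlen(msg), MSG_NOSIGNAL);
}

int main(void)
{
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    strncpy(addr.sun_path, SOCK_PATH, sizeof(addr.sun_path) - 1);
    unlink(SOCK_PATH);
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 16);

    struct pollfd fds[MAX_FDS];
    fds[0].fd = srv;
    fds[0].events = POLLIN;
    int nfds = 1;
    char note[64];

    for (;;) {
        poll(fds, nfds, -1);

        if ((fds[0].revents & POLLIN) && nfds < MAX_FDS) {   /* new client */
            int c = accept(srv, NULL, NULL);
            fds[nfds].fd = c;
            fds[nfds].events = POLLIN;
            fds[nfds].revents = 0;
            nfds++;
            snprintf(note, sizeof(note), "{\"event\":\"connected\",\"id\":%d}\n", c);
            broadcast(fds, nfds, note, nfds - 1);            /* tell everyone else */
        }

        for (int i = 1; i < nfds; i++) {
            if (!(fds[i].revents & (POLLIN | POLLHUP)))
                continue;
            char buf[4096];
            ssize_t r = recv(fds[i].fd, buf, sizeof(buf) - 1, 0);
            if (r <= 0) {                                    /* client disconnected */
                snprintf(note, sizeof(note),
                         "{\"event\":\"disconnected\",\"id\":%d}\n", fds[i].fd);
                close(fds[i].fd);
                fds[i--] = fds[--nfds];                      /* compact the array */
                broadcast(fds, nfds, note, 0);               /* 0 = skip nobody */
            } else {
                buf[r] = '\0';
                broadcast(fds, nfds, buf, i);                /* naive message relay */
            }
        }
    }
}
```

A watchdog would sit on top of this: the hub could time-stamp the last message from each client and broadcast a "disconnected" event for any client that stops answering.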

Related

Communicating with processes on the same host using internet sockets?

I am building a message layer for processes running on an embedded Linux system. I am planning to use sockets. This system might be ported to different operating systems down the road, so portability is a concern. Performance is below portability in priority order.
I have a few questions regarding my way forward.
I am thinking of using internet sockets over TCP/IP for this communication between local processes for the sake of portability. Is there any reason I should not do that and should use domain sockets instead?
Does using internet sockets instead of domain sockets really improve portability?
If this is indeed the way forward, can you point me in the right direction (how to use ports for each process etc.) with some online resources?
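One way to frame the portability question: the difference between the two options is confined to how the descriptor is created, since both yield a file descriptor you use with the same send()/recv() calls. A hedged sketch that isolates the choice behind one hypothetical helper (connect_local(), the socket path, and the port are made up for illustration):

```c
/* make_local_conn.c - illustrative only: the transport choice (TCP loopback vs.
 * Unix domain socket) is isolated in one function; the rest of the code just
 * uses the returned file descriptor with send()/recv(). */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: connect to a local peer either via 127.0.0.1:port
 * (portable to most platforms) or via an AF_UNIX path (Unix-like systems only). */
static int connect_local(int use_unix, const char *path, unsigned short port)
{
    if (use_unix) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un sa = { .sun_family = AF_UNIX };
        strncpy(sa.sun_path, path, sizeof(sa.sun_path) - 1);
        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) { close(fd); return -1; }
        return fd;
    } else {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in sa = { .sin_family = AF_INET, .sin_port = htons(port) };
        inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) { close(fd); return -1; }
        return fd;
    }
}

int main(void)
{
    /* Example usage: prefer the domain socket, fall back to loopback TCP. */
    int fd = connect_local(1, "/tmp/msglayer.sock", 5555);
    if (fd < 0)
        fd = connect_local(0, NULL, 5555);
    if (fd < 0) { perror("connect_local"); return 1; }
    send(fd, "hello\n", 6, 0);
    close(fd);
    return 0;
}
```

With this shape, porting to a platform without Unix domain sockets only means changing which branch gets used, not the message layer itself.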

How can I reduce system call overhead when sending a large number of UDP packets? (both Windows & Linux)

For example, I am sending 100000 UDP packets on Windows. For each packet, I need to call WSASendTo() once, so probably a lot of system call overhead is introduced. Is there a way to do bulk sending and reduce this overhead? I could not find a solution for Windows after googling for a while. Also, I would like to know if this is possible on Linux. Thanks.
On Windows you can use the Registered I/O API (RIO), available on Windows Server 2012, Windows 8, and later.
I've written quite a lot about it here and have done several performance comparisons with the previous APIs that are available on Windows. The performance tests can be found here.
In summary: "The Registered I/O Networking Extensions, RIO, is a new API that has been added to Winsock to support high-speed networking for increased networking performance with lower latency and jitter. These extensions are targeted primarily for server applications and use pre-registered data buffers and completion queues to increase performance. The increased performance comes from avoiding the need to lock memory pages and copy OVERLAPPED structures into kernel space when individual requests are issued, instead relying on pre-locked buffers, fixed sized completion queues, optional event notification on completions and the ability to return multiple completions from kernel space to user space in one go."
The results of my performance tests seem to imply that it works.
Use TransmitPackets() in Windows.
Microsoft has recently adopted the QUIC protocol, and to that end it had to improve UDP performance in Windows.
Therefore Windows now has new "UDP send segmentation" and "UDP receive coalescing" APIs.
This solves the call-overhead problem and takes advantage of hardware offloading on NICs that support it. It is currently the fastest method of sending UDP on Windows, with the caveat that it generally requires all UDP packets in a call to have the same size.
Read more about it here:
https://techcommunity.microsoft.com/t5/networking-blog/making-msquic-blazing-fast/ba-p/2268963
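On the Linux side of the original question, batching is possible with sendmmsg(), which submits many UDP datagrams in one system call (it does not use NIC segmentation offload like the Windows API above, but it does remove the per-packet syscall cost). A rough sketch, with the destination address and payloads invented for illustration:

```c
/* sendmmsg_sketch.c - Linux-only sketch: send a batch of UDP datagrams
 * with one system call instead of one sendto() per packet. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH 64

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(9999) };
    inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

    char payload[BATCH][32];
    struct iovec iov[BATCH];
    struct mmsghdr msgs[BATCH];
    memset(msgs, 0, sizeof(msgs));

    for (int i = 0; i < BATCH; i++) {
        snprintf(payload[i], sizeof(payload[i]), "packet %d", i);
        iov[i].iov_base = payload[i];
        iov[i].iov_len  = strlen(payload[i]);
        msgs[i].msg_hdr.msg_name    = &dst;
        msgs[i].msg_hdr.msg_namelen = sizeof(dst);
        msgs[i].msg_hdr.msg_iov     = &iov[i];
        msgs[i].msg_hdr.msg_iovlen  = 1;
    }

    /* One syscall for the whole batch; returns the number of datagrams sent. */
    int sent = sendmmsg(fd, msgs, BATCH, 0);
    if (sent < 0)
        perror("sendmmsg");
    else
        printf("sent %d datagrams in one call\n", sent);

    close(fd);
    return 0;
}
```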

Live socket monitoring with netlink inet_diag

My goal is to monitor sockets and relate them to the applications that created them.
I am aware of netstat, ss, lsof and so on, and that they can list all sockets together with their owning application.
I also know that I can parse /proc/net/tcp to get the sockets and relate them to applications via /proc/(PID), which is exactly what these tools do (or they use netlink sockets).
My research brought me to an article which explains how to get all sockets from the kernel via netlink using the inet_diag protocol. The user-space program sets up a netlink socket of the inet_diag type and sends a request to the kernel. The response consists of several messages which contain the sockets and additional related information.
This is really neat, but unfortunately the kernel sends this information only once per request, so I have to "poll" continuously.
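For reference, that one-shot dump might look roughly like the following sketch (IPv4/TCP only, error handling omitted; this is a reconstruction, not the article's code):

```c
/* inet_diag_dump.c - sketch of a single NETLINK_SOCK_DIAG dump request
 * for all IPv4 TCP sockets; each reply message is an inet_diag_msg. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/sock_diag.h>
#include <linux/inet_diag.h>
#include <netinet/in.h>      /* IPPROTO_TCP */
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_SOCK_DIAG);

    struct {
        struct nlmsghdr nlh;
        struct inet_diag_req_v2 req;
    } msg = {
        .nlh = {
            .nlmsg_len   = sizeof(msg),
            .nlmsg_type  = SOCK_DIAG_BY_FAMILY,
            .nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP,   /* dump everything once */
        },
        .req = {
            .sdiag_family   = AF_INET,
            .sdiag_protocol = IPPROTO_TCP,
            .idiag_states   = ~0U,                       /* all TCP states */
        },
    };
    send(fd, &msg, sizeof(msg), 0);

    char buf[16384];
    ssize_t len;
    while ((len = recv(fd, buf, sizeof(buf), 0)) > 0) {
        for (struct nlmsghdr *h = (struct nlmsghdr *)buf; NLMSG_OK(h, len);
             h = NLMSG_NEXT(h, len)) {
            if (h->nlmsg_type == NLMSG_DONE)
                return 0;                                /* end of this dump */
            struct inet_diag_msg *d = NLMSG_DATA(h);
            printf("inode %u  uid %u  state %u\n",
                   d->idiag_inode, d->idiag_uid, d->idiag_state);
        }
    }
    close(fd);
    return 0;
}
```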
Further research brought me to another article which continuously monitors IP address changes on interfaces using netlink route sockets. The socket is bound to a multicast group, and messages are then read from it in an endless loop.
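For comparison, the multicast-group subscription from that second article is quite compact; a sketch for IPv4 address changes only:

```c
/* rtnetlink_addr_watch.c - sketch: subscribe to the RTMGRP_IPV4_IFADDR
 * multicast group and block in recv() for address add/remove events. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    struct sockaddr_nl sa;
    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = RTMGRP_IPV4_IFADDR;      /* IPv4 address change notifications */
    bind(fd, (struct sockaddr *)&sa, sizeof(sa));

    char buf[8192];
    for (;;) {                              /* endless loop, as in the article */
        ssize_t len = recv(fd, buf, sizeof(buf), 0);
        if (len <= 0)
            break;
        for (struct nlmsghdr *h = (struct nlmsghdr *)buf; NLMSG_OK(h, len);
             h = NLMSG_NEXT(h, len)) {
            if (h->nlmsg_type == RTM_NEWADDR)
                printf("address added\n");
            else if (h->nlmsg_type == RTM_DELADDR)
                printf("address removed\n");
        }
    }
    close(fd);
    return 0;
}
```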
So I investigated whether the same is possible with inet_diag sockets. Unfortunately I am not really able to understand kernel code, but as far as I can tell there are no multicast groups for this socket family.
At this point I am stuck, and I need to know whether this approach is feasible at all, or whether somebody has other hints.
You can try dtrace if none of the tools you mentioned meets your requirements.
You can use a kprobe kernel module to hook all connect system calls, which lets you monitor sockets and relate them to the applications that created them.
Elkeid works just like this: the Elkeid Driver hooks kernel functions via kprobes, providing rich and accurate data collection capabilities, including kernel-level process execve probing, privilege-escalation monitoring, network auditing, and much more. The driver treats container-based monitoring as a first-class citizen alongside host-based data collection by supporting Linux namespaces. Compared to user-space agents on the market, Elkeid provides more comprehensive information with a massive performance improvement.
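A minimal kprobe module in that spirit (modeled on the kernel's own kprobe example and hooking tcp_v4_connect purely as an illustration) might look roughly like this:

```c
/* connect_probe.c - kernel-module sketch: log which process enters
 * tcp_v4_connect(). Build with a standard out-of-tree module Makefile. */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kprobes.h>
#include <linux/sched.h>

static int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
    /* Runs just before tcp_v4_connect(); current is the calling task. */
    pr_info("tcp_v4_connect by %s (pid %d)\n", current->comm, current->pid);
    return 0;
}

static struct kprobe kp = {
    .symbol_name = "tcp_v4_connect",
    .pre_handler = handler_pre,
};

static int __init connect_probe_init(void)
{
    return register_kprobe(&kp);
}

static void __exit connect_probe_exit(void)
{
    unregister_kprobe(&kp);
}

module_init(connect_probe_init);
module_exit(connect_probe_exit);
MODULE_LICENSE("GPL");
```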

Designing a multithreaded application (looking for design patterns)

I'm preparing to write a multithreaded network application, and at the moment I'm wondering what the best threading pattern for my program is. The whole application will handle up to 1000 descriptors (local files, network connections using various protocols, and additional descriptors for timer and signal handling). The application will be optimized for Linux. The program will run on regular personal computers, so I assume they will have at least a Pentium 4.
Here's my current idea:
One thread will handle network I/O using epoll.
A second thread will handle local-like I/O (disk I/O, timers, signal handling), also using epoll.
A third thread will handle the UI (CLI, GTK+ or Qt).
Handling each network connection in a separate thread would kill the CPU because of too many context switches.
Maybe there's a better way to do this?
Do you know of any documents/books about designing multithreaded applications? I'm looking for answers to questions like: what is a reasonable number of threads? etc.
You're on the right track. You want to use a thread pool pattern to handle the networking rather than one thread per network connection.
This website may also be helpful to you and lists the most common design patterns and in what situations they can be used.
http://sourcemaking.com/design_patterns/
To handle the disk I/O you might like to consider using mmap under Linux. It's very fast and efficient. That way you let the kernel do the work, and you probably won't need a separate thread for that.
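As a bare-bones illustration of letting the kernel do the work, a read-only mmap of a whole file looks like this (error handling trimmed):

```c
/* mmap_read.c - sketch: map a file read-only and access it as memory;
 * the kernel pages data in on demand, no explicit read() calls needed. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* The mapping behaves like an ordinary buffer. */
    size_t newlines = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (data[i] == '\n')
            newlines++;
    printf("%zu lines\n", newlines);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}
```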
I'm currently playing with Boost.Asio, which seems to be quite good. It uses epoll on Linux. Since it appears you are using a cross-platform GUI toolkit like Qt, Boost.Asio will also provide cross-platform support, so you will be able to use it on Windows or Linux. I think there might be a cross-platform mmap too.
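For reference, the "one thread handles network I/O with epoll" idea from the question can be sketched as a single loop like the one below (blocking sockets, echo-only handling, and no error checking, so purely illustrative):

```c
/* epoll_loop.c - sketch of the "one thread handles network I/O with epoll"
 * idea: one listener plus any number of accepted connections in one loop. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>

int main(void)
{
    int lsn = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port = htons(8080),
                                .sin_addr.s_addr = htonl(INADDR_ANY) };
    bind(lsn, (struct sockaddr *)&addr, sizeof(addr));
    listen(lsn, 128);

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = lsn };
    epoll_ctl(ep, EPOLL_CTL_ADD, lsn, &ev);

    for (;;) {
        struct epoll_event events[64];
        int n = epoll_wait(ep, events, 64, -1);      /* one wait covers all fds */
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == lsn) {                         /* new connection */
                int c = accept(lsn, NULL, NULL);
                ev.events = EPOLLIN; ev.data.fd = c;
                epoll_ctl(ep, EPOLL_CTL_ADD, c, &ev);
            } else {                                 /* readable client */
                char buf[4096];
                ssize_t r = recv(fd, buf, sizeof(buf), 0);
                if (r <= 0) { epoll_ctl(ep, EPOLL_CTL_DEL, fd, NULL); close(fd); }
                else        send(fd, buf, r, 0);     /* echo, as a placeholder */
            }
        }
    }
}
```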

What's the most widespread monitoring protocol/library?

I need to expose certain monitoring statistics from my application and I'm wondering what the most widespread framework or protocol is for doing this?
SNMP is a widely used, standard protocol. It's implemented in computers, routers, hubs, printers, and practically anything connected to the net. Although it's called the Simple Network Management Protocol, it's not restricted to network management.
It's an open standard, and consequently there is a huge array of management/monitoring solutions, from simple shell scripts and libraries up to enterprise monitoring suites (e.g. HP OpenView).
You can query synchronously for data or receive events (in SNMP-speak, traps). Each device will report a common set of data (relating mainly to the network status of that device) plus enterprise-specific data (e.g. CPU usage, printer status etc.).
It runs over UDP, and message consistency is a responsibility of the implementing library. This is a little unusual, but it's designed to operate even when the network is not functioning correctly (e.g. flooded with traffic/misconfigured etc.) and decisions about retry strategies, timeouts etc. need to be taken at the application level (unlike TCP).
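As an illustration of the synchronous query side, a single GET using the net-snmp C library might look roughly like this (the agent address and community string are placeholders; link against net-snmp):

```c
/* snmp_get.c - sketch using the net-snmp library: one synchronous GET of
 * sysDescr.0 against a hypothetical agent at "localhost", community "public". */
#include <net-snmp/net-snmp-config.h>
#include <net-snmp/net-snmp-includes.h>
#include <string.h>

int main(void)
{
    netsnmp_session session, *ss;
    netsnmp_pdu *pdu, *response = NULL;
    oid name[MAX_OID_LEN];
    size_t name_len = MAX_OID_LEN;

    init_snmp("snmp_get");                    /* library init (reads MIBs etc.) */

    snmp_sess_init(&session);                 /* set sane defaults */
    session.peername      = strdup("localhost");
    session.version       = SNMP_VERSION_2c;
    session.community     = (u_char *)"public";
    session.community_len = strlen("public");

    ss = snmp_open(&session);
    if (!ss) { snmp_perror("snmp_open"); return 1; }

    pdu = snmp_pdu_create(SNMP_MSG_GET);      /* synchronous GET request */
    read_objid("SNMPv2-MIB::sysDescr.0", name, &name_len);
    snmp_add_null_var(pdu, name, name_len);

    if (snmp_synch_response(ss, pdu, &response) == STAT_SUCCESS &&
        response->errstat == SNMP_ERR_NOERROR) {
        netsnmp_variable_list *vars;
        for (vars = response->variables; vars; vars = vars->next_variable)
            print_variable(vars->name, vars->name_length, vars);
    }

    if (response)
        snmp_free_pdu(response);
    snmp_close(ss);
    return 0;
}
```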
