How do MPI implementations (OpenMPI, MPICH) handle security/authentication?

How do OpenMPI and MPICH handle security when I send MPI messages between the processes over TCP/IP sockets?
In particular, how do they prevent other users of the same network from connecting to a listening socket and sending fake MPI messages?
The specific scenario is the following:
The administrators are trusted. Untrusted users do not have physical access to any hardware or network. Untrusted users do not have root access.
However, untrusted users can run their own programs in the cluster; the cluster nodes are typical Linux boxes. In particular, untrusted users can open TCP connections from any machine to any other machine in the cluster and send arbitrary messages.

J Teller's right; MPI doesn't really do this, and it shouldn't. That's a design decision based on the use case of MPI.
MPI users are the sorts of people who pay lots of money for interconnects with sub-microsecond latency. The overhead of some sort of cryptographic signing of messages would be completely unacceptable for this community.
And it wouldn't really help at any rate. The way MPI is used is as a message transport interface within a controlled environment: nodes in a limited-access cluster, or maybe machines in a compute lab. If a malicious user gains enough control of one of these nodes to interfere with MPI communications, there are far easier ways to disrupt the communication than sniffing packets, figuring out what stage of the computation is underway, and doing some kind of man-in-the-middle attack. One could just alter the memory of the running job or, more easily, simply overwrite the results on the shared file system. (Note that simply sending forged MPI messages might well be noticed, as the "real" messages would pile up, using resources and possibly crashing the job; similarly, intercepting messages without relaying them would almost certainly result in deadlock.)
These arguments don't apply so strongly to distributed computing, of course, say BOINC-style: but MPI isn't well suited for that sort of use anyway.
Nothing of course stops an MPI user who does have this sort of security requirement from simply sending a pgp-style signature along with every message and incorporating that into their code; but a mechanism for doing that is not part of MPI per se, and that's certainly the right decision.
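As a rough illustration of what "sending a signature along with every message" could look like in application code, here is a minimal sketch. Note that it swaps in a symmetric HMAC (shared secret key) rather than a PGP-style public-key signature, and it assumes the key has already been distributed to all ranks by some trusted channel. The names `sign`, `verify` and `TAG_LEN` are made up for this example; how the resulting bytes are actually sent (for instance as an MPI byte buffer) is left out.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(key: bytes, payload: bytes) -> bytes:
    """Return tag || payload, ready to be sent as one byte buffer."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify(key: bytes, message: bytes) -> bytes:
    """Check the tag on a received buffer and return the payload, or raise."""
    if len(message) < TAG_LEN:
        raise ValueError("message too short to carry a tag")
    tag, payload = message[:TAG_LEN], message[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking where the first mismatching byte is
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad message tag")
    return payload
```

This only authenticates the payload. It does not encrypt it and does not stop an attacker from replaying an old, validly signed message; a sequence number inside the payload would be one way to address that.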

I'm not quite an expert on this, but the basic answer is that MPI generally doesn't handle security. It relies on the underlying OS to provide the security level you're describing.
In my MPI distribution, this is handled by the mpd daemon (the daemon that launches MPI processes). mpdboot sets up a ring of mpd daemons on the cluster (one per node) in a sane way. Once that ring is set up, and if you trust the mpd daemons, you're all set: mpd will make sure that only processes you own connect to your MPI processes.
However, I don't quite understand the "sane way" in which the mpd ring is set up. In my distribution mpdboot is a Python script, so it's possible to take a look at it and see if it's secure enough for you. It is probably safe enough if the cluster you're running on is access controlled.

I'm with @Jonathan Dursi that securing MPI communication contributes little to the security of a well-configured cluster, but (a) management may insist, (b) for some reason you may want to run MPI over an untrusted network, and (c) it's fun to solve the problem.
As for the untrusted network situation, I used IPSec to protect the MPI communication network on a cluster before. Rather than using IPSec's automatic configuration scripts that require certificates, I loaded symmetric encryption keys via a shell script that I generated automatically with a Python script.
To isolate the MPI communication of individual jobs on the cluster, it should be possible to extend this approach to load newly generated IPSec keys in the prolog of each job, rather than only once at system startup. In this setup, jobs must not share nodes. Furthermore, it will be easier to set this up with a dedicated network for MPI, as ongoing connections (ssh, the job manager, etc.) must be left intact. You probably also don't want to include the head node(s), which the job management system usually needs to connect to, in the job's IPSec network, as the attacker could be sitting there.

Related

Most lightweight way to emulate a distributed system on linux

So I am taking this distributed systems class in which projects are done by simulating a distributed system using Android and multiple emulators. This approach is terrible for multiple reasons:
Android emulators are so resource-hungry that my poor laptop crashes most of the time.
Poor networking support between emulators. Need to do port forwarding on TCP and what not.
So what is the way to emulate a distributed system on my Linux machine that consumes minimal resources, mostly RAM and CPU time?
Is Docker the answer to all of this? Maybe create multiple containers with separate IP for each? Is that even possible?
My team maintains several production distributed systems, and we have to unit test them in such a way that we can capture protocol bugs.
We have a stub implementation of the clock and of the network that we inject into our classes. The network mimics the message-passing model used in many distributed systems papers: pick a message at random and deliver it. This models network latencies and inconsistencies very well. We have other things built in: being able to block/release or drop messages to/from sets of hosts, and a simple TCP model.
With this simple addition our unit tests are now what we call interaction tests. We can very quickly add however many servers we want, all running in a single process on a laptop.
Oh, and after doing this, you'll know why global variables and singletons are a Bad Thing.
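The stubbed network described above can be sketched in a few lines. This is only an illustration of the "pick a pending message at random and deliver it" idea, not the answerer's actual code: the class name `SimNetwork` and all of its methods are invented for this example, and the clock stub and TCP model are left out.

```python
import random


class SimNetwork:
    """In-process stand-in for a network: delivers pending messages in random order."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)   # seeded so a failing run can be replayed
        self._pending = []                # list of (src, dst, payload)
        self._handlers = {}               # host name -> callable(src, payload)
        self._blocked = set()             # hosts whose traffic is held back

    def register(self, host, handler):
        self._handlers[host] = handler

    def send(self, src, dst, payload):
        self._pending.append((src, dst, payload))

    def block(self, host):
        self._blocked.add(host)

    def release(self, host):
        self._blocked.discard(host)

    def drop_all(self, host):
        """Discard every pending message to or from the given host."""
        self._pending = [m for m in self._pending if host not in (m[0], m[1])]

    def step(self):
        """Deliver one randomly chosen deliverable message; return False if there is none."""
        ready = [i for i, (src, dst, _) in enumerate(self._pending)
                 if src not in self._blocked and dst not in self._blocked]
        if not ready:
            return False
        src, dst, payload = self._pending.pop(self._rng.choice(ready))
        self._handlers[dst](src, payload)
        return True

    def run(self, max_steps=10_000):
        """Deliver messages until none are deliverable; return the number delivered."""
        steps = 0
        while steps < max_steps and self.step():
            steps += 1
        return steps
```

Each "server" under test is just an object whose handler is registered with the network and which calls `send` instead of touching a real socket, which is why injected dependencies (rather than globals) matter here.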
You can run several docker containers on one Linux machine. Each container will get its own IP address and it will also be able to talk to other containers on the same host. How many systems do you want to simulate?

Performance implications of distribution of threads amongst processes in a server

The question title is pretty awkward, sorry about that.
I am currently working on the design of a server, and a comment came up from one of my co-workers that we should use multiple processes, since there was some performance hit to having too many threads in a single process (as opposed to having that same number of threads spread over multiple processes on the same machine).
The only thing I can think of which would cause this (other than bad OS scheduling), would be from increased contention (for example on the memory allocator), but I'm not sure how much that matters.
Is this a 'best practice'? Does anyone have some benchmarks they could share with me? Of course the answer may depend on the platform (I'm interested mostly in Windows/Linux/OS X, although I need to care about HP-UX, AIX, and Solaris to some extent).
There are of course other benefits to using a multi-process architecture, such as process isolation to limit the effect of a crash, but I'm interested in performance for this question.
For some context, the server is going to service long-running, stateful connections (so they cannot be migrated to other server processes) which send back a lot of data, and can also cause a lot of local DB processing on the server machine. It's going to use the proactor architecture in-process and be implemented in C++. The server will be expected to run for weeks/months without need of restart (although this may be implemented by rotating new instances transparently under some proxy).
Also, we will be using a multi-process architecture; my concern is more about scheduling connections to processes.

Multithreaded Corba Client

There is a lot on multithreading on the Corba server side, but I'm interested in the client side. We have a multithreaded client (Solaris, Orbix 6.3) with a Corba singleton "manager" that initialises the ORB. During runtime 'lsof' shows only one TCP connection to the Corba server, so all synchronous calls made from the client worker threads should be serialised.
I would like to change this arrangement to take advantage of parallelism, with each thread managing its own connection. I've changed the setup so that instead of a singleton each worker thread calls ORB_init(), etc.
Totally puzzled now: 'lsof' now shows 2 TCP connections, but there are 6 worker threads.
Something is not right; I would have expected as many TCP connections as there are worker threads. Maybe the approach is naive: does it make sense, for example, to call ORB_init() per thread?
I'd need someone's opinion on this. Sample code for a multithreaded client would greatly help. Again, using Orbix 6.3 on Solaris.
Kind regards,
Adrian
Connection management is implementation specific for plain CORBA. Each vendor has its own proprietary way of configuring this behavior. The RTCORBA specification, by contrast, defines a standardized way to configure how connections between client and server will be used.
I don't know how Orbix works or whether it supports RTCORBA; that is something you could probably get from their manuals. I do know that TAO has a lot of support for threading on the client side. By default, when multiple threads make an invocation to the same server, multiple TCP/IP transports can be opened at the same moment.
Thank you guys for your answers. I found, as Johnny says, that this is indeed implementation specific.
omniORB has for example maxGIOPConnectionPerServer - default 5. That's:
The maximum number of concurrent connections the ORB will open to a single server. If multiple threads on the client call the same server, the ORB opens additional connections to the server, up to the maximum specified by this parameter. If the maximum is reached, threads are blocked until a connection becomes free for them to use.
Unfortunately I haven't yet found out what's the equivalent (if any) for Orbix. It's definitely defaulting to 1 connection. Still googling...
I found out, though, that as part of a Solaris -> Linux migration we will be moving from Orbix to TAO in a number of months. Hoping TAO will be more friendly and customizable.
Orbix internally uses a lot of optimization routines to ensure that connections are used efficiently. Specifically, it's not going to open up multiple connections to the same server endpoint because it's able to multiplex multiple concurrent GIOP requests over the same TCP connection. CORBA deliberately hides connection management from client and server programmers.
I don't believe this is controllable through configuration. Send a support ticket to Progress Support to confirm. You might be able to force it to happen if you move away from the singleton model and initialize a different ORB for each client (each with its own unique ID), but that would be a very heavy-handed and costly solution to a problem that is a little vague. The underlying ORB is already built to optimize for concurrent requests, so I'm not sure what problem it is you're trying to solve.
In my honest opinion, I don't think there is such a concept as a multithreaded client for CORBA applications. On the server side there is only one object registered with the naming service, and it is available to all clients. If you look at the IOR of the object, it will be the same for all clients, so a client can establish at most one connection to that object. It also follows that you cannot get more than one remote object: however many times you look up the object from different clients, they all get the same reference. So, in order to support multithreading, the server actually has to support different thread policies; with the POA, the server can have different thread policies. Please go through Java Programming with CORBA for more.
I don't know exactly how Orbix works, but normally ORB initialization is done only once, even for a multithreaded setup. The multithreaded (server-side) ORB will start a number of worker threads (on demand, as needed, or a fixed number if so configured) to handle incoming connections. Each connection is handled by a worker, which looks up the servant that can handle the request. Normally the real call to the servant is also performed in an extra thread. But you won't see these threads with lsof. Try ps -eLf, or top -H with thread support enabled.
EDIT:
On the client side it depends on how many objects you want to call. For each object a caller thread is possible. It is also possible to have more than one caller thread per remote object, but only if it is called from different threads in the client-side logic. (Imagine having multiple threads with the remote object shared across them.)

Architecture to Sandbox the Compilation and Execution Of Untrusted Source Code

The SPOJ is a website that lists programming puzzles, then allows users to write code to solve those puzzles and upload their source code to the server. The server then compiles that source code (or interprets it if it's an interpreted language), runs a battery of unit tests against the code, and verifies that it correctly solves the problem.
What's the best way to implement something like this - how do you sandbox the user input so that it can not compromise the server? Should you use SELinux, chroot, or virtualization? All three plus something else I haven't thought of?
How does the application reliably communicate results outside of the jail while also ensuring that the results are not compromised? How would you prevent, for instance, an application from writing huge chunks of nonsense data to disk, or other malicious activities?
I'm genuinely curious, as this just seems like a very risky sort of application to run.
A chroot jail executed from a limited user account sounds like the best starting point (i.e. NOT root or the same user that runs your webserver)
To prevent huge chunks of nonsense data being written to disk, you could use disk quotas or a separate volume that you don't mind filling up (assuming you're not testing in parallel under the same user - or you'll end up dealing with annoying race conditions)
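As a complement to quotas, per-process resource limits can cap how much a single untrusted run may write or compute. The sketch below uses POSIX rlimits via Python's `resource` module; this is a different mechanism from the chroot jail and disk quotas mentioned above and is not a sandbox on its own. The helper name `run_limited` and the default limits are made up for the example, and it assumes a Linux/Unix host.

```python
import resource
import subprocess
import sys


def run_limited(cmd, cwd, cpu_seconds=5, max_file_bytes=1 << 20, timeout=10):
    """Run cmd in cwd with a CPU-time cap and a per-file size cap (Unix only)."""

    def apply_limits():
        # Runs in the child process after fork() and before exec().
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_FSIZE, (max_file_bytes, max_file_bytes))

    return subprocess.run(
        cmd,
        cwd=cwd,
        preexec_fn=apply_limits,
        capture_output=True,   # stdout/stderr go through pipes, not files
        timeout=timeout,       # wall-clock guard for processes that just sleep
    )
```

`RLIMIT_FSIZE` limits the size of each file the process writes, not the total across many files, so a quota or a dedicated volume is still needed to bound overall disk usage.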
If you wanted to do something more scalable and secure, you could use dynamic virtualized hosts with your own server/client solution for communication: you have a pool of 'agents' that receive instructions to copy and compile from X repository or share, then execute a battery of tests, and log the output back via the same server/client protocol. Your host process can watch for excessive disk usage and report warnings if required, the agents may or may not execute the code under a chroot jail, and if you're super paranoid you would destroy the agent after each run and spin up a new VM when the next sample is ready for testing.
If you're doing this at large scale in the cloud (e.g. 100+ agents running on EC2), you only ever have enough spun up to accommodate demand and therefore reduce your costs. Again, if you're going for scale you can use something like Amazon SQS to buffer requests, or if you're doing an experimental sample project then you could do something much simpler (just think of distributed parallel processing systems, e.g. SETI@home).

Which resources should one monitor on a Linux server running a web-server or database

When running any kind of server under load there are several resources that one would like to monitor to make sure that the server is healthy. This is specifically true when testing the system under load.
Some examples for this would be CPU utilization, memory usage, and perhaps disk space.
What other resource should I be monitoring, and what tools are available to do so?
As many as you can afford to, and can then graph/understand/look at the results. Monitoring resources is useful for not only capacity planning, but anomaly detection, and anomaly detection significantly helps your ability to detect security events.
You have a decent start with your basic graphs. I'd want to also monitor the number of threads, number of connections, network I/O, disk I/O, page faults (arguably this is related to memory usage), context switches.
I really like munin for graphing things related to hosts.
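For a quick look at a few of these counters without installing anything, they can be read straight from `/proc` on Linux. The sketch below is only an illustration; the function names are invented for this example, and the parsers take text as input so they can be tried on sample data. `ctxt` and `processes` in `/proc/stat` are totals since boot, so a rate needs two samples.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into the three load averages."""
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return {"1m": one, "5m": five, "15m": fifteen}


def parse_stat_counters(text, names=("ctxt", "processes")):
    """Pick single-value counters (e.g. context switches) out of /proc/stat."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] in names:
            counters[fields[0]] = int(fields[1])
    return counters


def parse_meminfo(text):
    """Parse /proc/meminfo into {field: value}; most values are in kB."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def snapshot():
    """Read the live values from /proc (Linux only)."""
    with open("/proc/loadavg") as f:
        load = parse_loadavg(f.read())
    with open("/proc/stat") as f:
        counters = parse_stat_counters(f.read())
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    return {"load": load, "counters": counters, "mem_kb": mem}
```

Calling `snapshot()` twice a few seconds apart and subtracting the `ctxt` values gives context switches over that interval.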
I use Zabbix extensively in production, which comes with a stack of useful defaults. Some examples of the sorts of things we've configured it to monitor:
Network usage
CPU usage (% user,system,nice times)
Load averages (1m, 5m, 15m)
RAM usage (real, swap, shm)
Disc throughput
Active connections (by port number)
Number of processes (by process type)
Ping time from remote location
Time to SSL certificate expiry
MySQL internals (query cache usage, num temporary tables in RAM and on disc, etc)
Anything you can monitor with Zabbix, you can also attach triggers to - so it can restart failed services; or page you to alert about problems.
Collect the data now, before performance becomes an issue. When it does, you'll be glad of the historical baselines, and the fact you'll be able to show what date and time problems started happening for when you need to hunt down and punish exactly which developer made bad changes :)
I ended up using dstat which is vmstat's nicer looking cousin.
This will show most everything you need to know about a machine's health, including:
CPU
Disk
Memory
Network
Swap
Use "df -h" to make sure that no partition runs full, which can lead to all kinds of funky problems. Watching the syslog is of course also useful; for that I recommend installing "logwatch" on your server, which sends you an email if weird things start showing up in your syslog.
Cacti is a good web-based monitoring/graphing solution. Very complete, very easy to use, with a large userbase including many large Enterprise-level installations.
If you want more 'alerting' and less 'graphing', check out nagios.
As for 'what to monitor', you want to monitor systems at both the system and application level, so yes: network/memory/disk I/O, interrupts and such at the system level. The application level gets more specific, so a webserver might measure hits/second, errors/second (non-200 responses), etc., and a database might measure queries/second, average query fulfillment time, etc.
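To make the application-level numbers concrete, here is a small sketch of the webserver case. It assumes you already have the HTTP status codes observed during some time window (for example, parsed from an access log); the function name `summarize` and its return keys are made up for this example, and "error" follows the definition above (any non-200 response).

```python
from collections import Counter


def summarize(status_codes, window_seconds):
    """Compute hits/second and errors/second for one observation window."""
    counts = Counter(status_codes)
    total = sum(counts.values())
    errors = total - counts[200]      # everything that is not a 200 response
    return {
        "hits_per_second": total / window_seconds,
        "errors_per_second": errors / window_seconds,
    }
```

In practice you may want to treat redirects (3xx) separately rather than counting them as errors.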
Beware the afore-mentioned slowquerylog in mysql. It should only be used when trying to figure out why some queries are slow. It has the side-effect of making ALL your queries slow while it's enabled. :P It's intended for debugging, not logging.
Think 'passive monitoring' whenever possible. For instance, sniff the network traffic rather than monitor it from your server -- have another machine watch the packets fly back and forth and record statistics about them.
(By the way, that's one of my favorites -- if you watch connections being established and note when they end, you can find a lot of data about slow queries or slow anything else, without putting any load on the server you care about.)
In addition to top and auth.log, I often look at mtop, and enable mysql's slowquerylog and watch mysqldumpslow.
I also use Nagios to monitor CPU, Memory, and logged in users (on a VPS or dedicated server). That last lets me know when someone other than me has logged in.
Network, of course :) Use MRTG to get some nice bandwidth graphs. They're just pretty most of the time, until a spammer finds a hole in your security and traffic suddenly increases.
Nagios is good for alerting, as mentioned, and is easy to get set up. You can then use the mrtg plugin to get alerts for your network traffic too.
I also recommend ntop as it shows where your network traffic is going.
I typically watch top and tail -f /var/log/auth.log.

Resources