So I am taking this distributed systems class in which projects are done by simulating a distributed system using Android and multiple emulators. This approach is terrible for multiple reasons:
Android emulators are so resource-hungry that my poor laptop crashes most of the time.
Networking support between emulators is poor; you need to set up TCP port forwarding and whatnot.
So what is a good way to emulate a distributed system on my Linux machine that consumes minimal resources, mostly RAM and CPU time?
Is Docker the answer to all of this? Maybe create multiple containers with a separate IP for each? Is that even possible?
My team maintains several production distributed systems, and we have to unit test them in such a way that we can capture protocol bugs.
We have a stub implementation of the clock and of the network that we inject into our classes. The network mimics the message-passing model used in many distributed systems papers: pick a message at random and deliver it. This models network latencies and inconsistencies very well. We have other things built in: being able to block/release or drop messages to/from sets of hosts, and a simple TCP model.
With this simple addition our unit tests are now what we call interaction tests. We can very quickly add however many servers we want, all running in a single process on a laptop.
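A minimal sketch of the idea (in Python; the class and method names here are made up for illustration, not our actual code):

```python
import random

class SimulatedNetwork:
    """Toy message-passing network: hold all in-flight messages
    and deliver one chosen at random, which naturally exercises
    reordering and arbitrary delay."""

    def __init__(self, seed=42):
        self.rng = random.Random(seed)   # seeded so failures reproduce
        self.in_flight = []              # (src, dst, payload) tuples
        self.blocked = set()             # hosts whose traffic is held

    def send(self, src, dst, payload):
        self.in_flight.append((src, dst, payload))

    def step(self, servers):
        """Deliver one randomly chosen deliverable message, if any."""
        candidates = [m for m in self.in_flight
                      if m[0] not in self.blocked and m[1] not in self.blocked]
        if not candidates:
            return False
        msg = self.rng.choice(candidates)
        self.in_flight.remove(msg)
        src, dst, payload = msg
        servers[dst].on_message(src, payload)  # servers get this net injected
        return True
```

An interaction test then instantiates N servers against one SimulatedNetwork and calls step() in a loop, asserting protocol invariants along the way.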
Oh, and after doing this, you'll know why global variables and singletons are a Bad Thing.
You can run several Docker containers on one Linux machine. Each container will get its own IP address and will be able to talk to the other containers on the same host. How many systems do you want to simulate?
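For instance, a rough sketch (in Python, shelling out to the docker CLI; the image name and node count are placeholders) that puts N containers on a user-defined bridge network, each with its own IP:

```python
import subprocess

N = 5                    # however many nodes you want to simulate
IMAGE = "mydistsys"      # placeholder: your application image

# User-defined bridge network; containers on it can also reach
# each other by container name.
subprocess.run(["docker", "network", "create", "simnet"], check=False)

for i in range(N):
    subprocess.run(["docker", "run", "-d", "--name", f"node{i}",
                    "--network", "simnet", IMAGE], check=True)

# Print each container's IP address on the simnet bridge.
for i in range(N):
    out = subprocess.run(
        ["docker", "inspect", "-f",
         "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}",
         f"node{i}"],
        capture_output=True, text=True, check=True)
    print(f"node{i}:", out.stdout.strip())
```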
I know it runs just fine, so it's OK for development, which is great, but won't it have considerably worse disk and/or network I/O performance because of AUFS?
If you put Cassandra data on a volume, disk I/O performance will be exactly the same as outside of containers, since AUFS will be bypassed entirely.
And even if you don't use a volume, performance will be fine as long as you don't commit Cassandra data into a new image to run that image later. And even if you do that, performance will be affected only during the first writes on each file; after that, it will be native.
You will not see any difference in network I/O performance unless your containers are dealing with hundreds of Mb/s of network traffic and/or thousands of connections per second. In that case, you can use tools like Pipework to assign MACVLAN interfaces or even native physical interfaces to your containers.
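To make the volume point concrete, a minimal sketch (assuming the official cassandra image; the host path is a placeholder):

```python
import subprocess

# Keep Cassandra's data directory on a host volume so disk writes
# bypass the copy-on-write layer entirely. /var/lib/cassandra is the
# data directory in the official image; /srv/cassandra-data is made up.
subprocess.run(["docker", "run", "-d", "--name", "cass",
                "-v", "/srv/cassandra-data:/var/lib/cassandra",
                "cassandra"], check=True)
```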
We are actually running Cassandra in Docker in production and have had to work through a lot of performance issues.
Networking: you should run this with --net=host to use the host's networking stack. Otherwise you will take a substantial hit to your network speeds. See this article for more information on recommended best practices.
Data volume: you should expose your data volume to the physical host. If you're operating in the cloud, note that where you place your data volume may limit your IOPS.
JVM: just because you run Cassandra in a container doesn't mean you can get away from tuning your JVM. You still need to adjust it to account for the system resources of the host machine.
Cluster name/seeds: these need to be configured; we changed the hard-coded values to a find-and-replace with environment variables using sed (see the sketch below).
The big takeaway is that, like any software, you need to do some configuration. It's not 100% plug and play.
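To illustrate the cluster name/seeds point, a rough Python equivalent of the sed-based find-and-replace (the config path and the stock default values are assumptions):

```python
import os
from pathlib import Path

# Entrypoint step, run before starting Cassandra: splice environment
# variables into cassandra.yaml in place of the hard-coded defaults.
conf = Path("/etc/cassandra/cassandra.yaml")   # assumed location
text = conf.read_text()
text = text.replace("cluster_name: 'Test Cluster'",
                    f"cluster_name: '{os.environ['CLUSTER_NAME']}'")
text = text.replace('- seeds: "127.0.0.1"',
                    f'- seeds: "{os.environ["SEEDS"]}"')
conf.write_text(text)
```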
Looking into the same thing; just found this on SlideShare:
"Docker uses Linux Ethernet Bridges for basic software routing. This will hose your network throughput. (50% hit)
Use the host network stack instead (10% hit)"
How do OpenMPI and MPICH handle security when I send MPI messages between the processes over TCP/IP sockets?
In particular, how do they prevent other users of the same network from connecting to a listening socket and sending fake MPI messages?
The specific scenario is the following:
The administrators are trusted. Untrusted users do not have physical access to any hardware or network. Untrusted users do not have root access.
However, untrusted users can run their own programs in the cluster; the cluster nodes are typical Linux boxes. In particular, untrusted users can open TCP connections from any machine to any other machine in the cluster and send arbitrary messages.
J Teller's right; MPI doesn't really do this, and it shouldn't. That's a design decision based on the use case of MPI.
MPI users are the sorts of people who pay lots of money for interconnects with sub-microsecond latency. The overhead of some sort of cryptographic signing of messages would be completely unacceptable for this community.
And it wouldn't really help at any rate. The way MPI is used is as a message transport within a controlled environment: nodes in a limited-access cluster, or maybe machines in a compute lab. If a malicious user gains enough control of one of these nodes to interfere with MPI communications, there are far easier ways to disrupt the communication than sniffing packets, figuring out what stage of the computation is underway, and doing some kind of man-in-the-middle attack. One could just alter the memory of the running job, or more easily, simply overwrite the results on the shared file system. (Note that simply sending forged MPI messages might well be noticed, as the "real" messages would pile up, using resources and possibly crashing the job; similarly, intercepting messages without relaying them would almost certainly result in deadlock.)
These arguments don't apply so strongly to distributed computing, of course, say BOINC-style: but MPI isn't well suited for that sort of use anyway.
Nothing, of course, stops an MPI user who does have this sort of security requirement from simply sending a PGP-style signature along with every message and incorporating that into their code; but a mechanism for doing that is not part of MPI per se, and that's certainly the right decision.
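For what it's worth, a toy sketch of that approach using mpi4py and an HMAC rather than PGP (key distribution is hand-waved; this is illustrative only and, as said, deliberately not part of MPI):

```python
import hashlib
import hmac

from mpi4py import MPI   # assumes mpi4py is installed

KEY = b"shared-secret-distributed-out-of-band"   # hand-waved
comm = MPI.COMM_WORLD

def signed_send(payload: bytes, dest: int):
    tag = hmac.new(KEY, payload, hashlib.sha256).digest()
    comm.send((payload, tag), dest=dest)

def signed_recv(source: int) -> bytes:
    payload, tag = comm.recv(source=source)
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("forged or corrupted MPI message")
    return payload

if comm.Get_rank() == 0:
    signed_send(b"hello", dest=1)
elif comm.Get_rank() == 1:
    print(signed_recv(source=0))
```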
I'm not quite an expert on this, but the basic answer is that MPI generally doesn't handle security. It relies on the underlying OS to provide the security level you're describing.
For my MPI distribution, this is built in using the mpd daemon (the daemon that launches MPI processes). mpdboot sets up a ring of mpd daemons on the cluster (one per node) in a sane way. Once that ring is set up, and if you trust the mpd daemons, then you're all set. mpd will make sure that only processes you own connect to your MPI processes.
However, I don't quite understand the "sane way" in which the mpd ring is set up. In my distribution, mpdboot is a Python script, so it's possible to take a look at it and see if it's secure enough for you. It is probably safe enough if the cluster you're running on is access-controlled.
I'm with @Jonathan Dursi that securing MPI communication contributes little to the security of a well-configured cluster, but (a) management may insist, (b) for some reason you may want to run MPI over an untrusted network, and (c) it's fun to solve the problem.
As for the untrusted-network situation, I have used IPSec to protect the MPI communication network on a cluster before. Rather than using IPSec's automatic configuration scripts, which require certificates, I loaded symmetric encryption keys via a shell script that I generated automatically with a Python script.
To isolate the MPI communication of individual jobs on the cluster, it should be possible to extend this approach to load newly generated IPSec keys in the prolog of each job, rather than only once at system startup. In this setup, jobs must not share nodes. Furthermore, it will be easier to set this up with a dedicated network for MPI, since ongoing connections (ssh, the job manager, etc.) must be left intact; and you probably don't want to include the head node(s), which the job management system usually needs to connect to, in the job's IPSec network, since the attacker could be sitting there.
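A rough sketch of the key-generation step (node addresses, SPIs, and algorithm choices are all placeholder assumptions; check the generated commands against ip-xfrm(8) before trusting them):

```python
import secrets

# Emit one shell script that installs pairwise symmetric IPSec keys
# via "ip xfrm" on every node. Run it (as root) on each node in the
# job prolog; regenerate the keys per job for isolation.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # placeholder addresses
enc_key = secrets.token_hex(32)    # 256-bit AES key shared by all pairs
auth_key = secrets.token_hex(20)   # 160-bit HMAC-SHA1 key

spi = 0x100
lines = ["#!/bin/sh"]
for src in nodes:
    for dst in nodes:
        if src == dst:
            continue
        lines.append(
            f"ip xfrm state add src {src} dst {dst} proto esp spi {spi:#x} "
            f"mode transport auth 'hmac(sha1)' 0x{auth_key} "
            f"enc 'cbc(aes)' 0x{enc_key}")
        lines.append(
            f"ip xfrm policy add src {src} dst {dst} dir out "
            f"tmpl src {src} dst {dst} proto esp mode transport")
        spi += 1

with open("load_ipsec_keys.sh", "w") as f:
    f.write("\n".join(lines) + "\n")
```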
The SPOJ is a website that lists programming puzzles, then allows users to write code to solve those puzzles and upload their source code to the server. The server then compiles that source code (or interprets it if it's an interpreted language), runs a battery of unit tests against the code, and verifies that it correctly solves the problem.
What's the best way to implement something like this - how do you sandbox the user input so that it can not compromise the server? Should you use SELinux, chroot, or virtualization? All three plus something else I haven't thought of?
How does the application reliably communicate results outside of the jail while also ensuring that the results are not compromised? How would you prevent, for instance, an application from writing huge chunks of nonsense data to disk, or other malicious activities?
I'm genuinely curious, as this just seems like a very risky sort of application to run.
A chroot jail executed from a limited user account sounds like the best starting point (i.e., NOT root or the same user that runs your web server).
To prevent huge chunks of nonsense data being written to disk, you could use disk quotas or a separate volume that you don't mind filling up (assuming you're not testing in parallel under the same user, or you'll end up dealing with annoying race conditions).
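A bare-bones sketch of that starting point (the jail path, uid, and limits are placeholders; the supervisor has to start as root to be able to chroot, and drops privileges before running the submission):

```python
import os
import resource

def run_submission(binary="/solution"):       # path inside the jail
    pid = os.fork()
    if pid == 0:                              # child: jail, limit, drop root
        os.chroot("/srv/jail")                # placeholder jail root
        os.chdir("/")
        # Cap file size (bytes) and CPU time (seconds) for the submission.
        resource.setrlimit(resource.RLIMIT_FSIZE, (1 << 20, 1 << 20))
        resource.setrlimit(resource.RLIMIT_CPU, (5, 5))
        os.setgid(65534)                      # "nobody"
        os.setuid(65534)
        os.execv(binary, [binary])
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```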
If you wanted to do something more scalable and secure, you could use dynamically virtualized hosts with your own server/client solution for communication: you have a pool of 'agents' that receive instructions to copy and compile from X repository or share, then execute a battery of tests and log the output back via the same server/client protocol. Your host process can watch for excessive disk usage and report warnings if required; the agents may or may not execute the code under a chroot jail; and if you're super paranoid you would destroy the agent after each run and spin up a new VM when the next sample is ready for testing. If you're doing this at large scale in the cloud (e.g. 100+ agents running on EC2) you only ever have enough spun up to accommodate demand and therefore reduce your costs. Again, if you're going for scale you can use something like Amazon SQS to buffer requests, or if you're doing an experimental sample project you could do something much simpler (just think of distributed parallel-processing systems, e.g. SETI@home).
I have an application which was ported from Windows to Linux. The same code now compiles with VS C++ and g++, but there is a difference in performance between running on Windows and running on Linux. The purpose of this application is caching: it's a node between a server and a client that caches client requests and server responses in a list, so that when any other client makes a request that was already processed by the server, this node responds instead of forwarding it to the server.
When this node runs on Windows, the client gets everything it needs in about 7 seconds. But when the same node runs on Linux (Ubuntu 9.04), the client starts up in 35 seconds. Every test is from scratch. I'm trying to understand the cause of this timing difference. A weird scenario is when the node runs on Linux but inside a virtual machine hosted by Windows: in that case, the load time is around 7 seconds, just as when running natively on Windows. So my impression is that there is a problem with networking.
The node uses the UDP protocol for sending and receiving network data, with boost::asio as the implementation. I tried changing all supported socket flags and the buffer sizes, but nothing helped.
Does anyone know why this is happening, or of any UDP-related network settings that might influence the performance?
Thanks.
If you suspect a network problem, take a network capture (Wireshark is great for this kind of problem) and look at the traffic.
Find out where the time is being spent, either based on the network capture or based on the output of a profiler.
Once you know that, you're halfway to a solution.
These timing differences can depend on many factors, but the first that comes to mind is that you are using a modern Windows version. XP already had features to keep recently used applications in memory, but in Vista this was optimized much further (the prefetcher/SuperFetch). For each application you load, a special load file is created that mirrors how the application looks in memory; the next time you load it, it should load much faster.
I don't know about Linux, but it may well need to load your app completely each time. You can compare the performance of the two systems much better by measuring while the application is running: leave your application open (if your design allows it) and compare again.
Your VM scenario backs up the idea that these differences in how the systems optimize memory are at play.
Basically, if you rule out other running applications and run your application in high-priority mode, performance should be close to equal; but it also depends on whether you use operating-system-specific code, how you access the file system, how you use the UDP protocol, etc.
We know that a single application instance can use multiple cores and even multiple physical processors. With cloud and cluster computing, or other special scenarios, I don't know whether a single instance can run across multiple computers, or across multiple OS instances.
This is important for me because, besides being considered bad practice, I use some static (as in C#) (global) variables, and my program will probably behave unexpectedly if those variables become shared between computers.
Update: I'll be more specific. I'm writing a .NET application that has one variable that counts the number of active IP connections, to prevent the number of connections from exceeding a per-computer limit. I am wondering whether, if I deploy that application on a cloud-computing host, I'll still have one instance of the variable per computer.
If you want to learn how to realize such a scenario (a single instance across multiple computers), I think you should read some articles about MPI.
"it has become a de facto standard for communication among processes that model a parallel program running on a distributed memory system."
Regarding your worries: obviously, you'll need to somehow consciously change your program to run as one instance across several computers. Otherwise, of course, no sharing takes place, and as Shy writes, there is nothing to worry about. This kind of stuff wouldn't happen automatically.
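To make that concrete, a tiny mpi4py sketch (illustrative only): each process, possibly on a different machine, gets its own copy of every variable, and sharing happens only through explicit calls:

```python
from mpi4py import MPI   # assumes mpi4py; run with e.g. mpirun -n 4

comm = MPI.COMM_WORLD
counter = 0        # "static" in spirit: one independent copy per process
counter += 1       # this increment is invisible to every other rank

# Sharing is explicit: sum everyone's counter and give the result to all.
total = comm.allreduce(counter, op=MPI.SUM)
print(f"rank {comm.Get_rank()}: local={counter}, global={total}")
```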
What programming environment (language) are you using? It should define exactly what "static" means. Most likely it does not allow any sharing of information between different computers except through explicit channels such as MPI or RPC.
However, abstracted high-level environments may be designed to hide the concept of "running on multiple computers" from you entirely. It is conceivable to design a virtual machine that can run on a cluster and will actually share static variables between different physical machines - but in this case your program is running on the single virtual machine, and whether that runs on one or more physical machines should not make any difference.
If you can't even imagine a situation where this happens, why are you worrying about it?
Usually, this could happen in a scenario involving RPC of some kind.
Well yes, there's distcc: the GCC compiler, distributed across machines.