How to update Julia on SSH clusters - Linux

I'm a PhD student in a lab with SSH clusters, and I have access to connect to each one of them (there's no queue system, since it is a small lab; as long as nobody is using a lot of cores on a given machine, I can run my programs on it).
The lab currently has no cluster administrator, so maintenance is in the hands of two researchers with computing experience. The clusters have a very old version of Julia (0.5.1) and I need a newer one in order to work; however, one of the two researchers in charge told me that updating Julia would take a very large amount of time and require stopping all running processes, so he is unwilling to update it on the clusters.
Is there a way I can update the Julia version on the clusters by myself, without interrupting or cancelling any of the running processes?
I believe none of the running processes use Julia, as I am the only one in the lab who works with it. The languages used for these processes are C, C++ and Fortran.

Julia does not need to be installed system-wide to be used. In fact, on all OSes - Linux, macOS and Windows - the Julia distribution is portable and self-contained.
So the easiest approach is to use juliaup to install Julia under your own user account on each node and let it manage all the Julia versions you need.
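If you want to avoid repeating the installation by hand on every node, a short script can drive it over ssh. The sketch below is only illustrative: the hostnames are placeholders, and the --yes flag is assumed to make the official juliaup installer accept its defaults non-interactively. Everything runs under your own account, so no system-wide files are touched and running jobs are unaffected.

    # Hypothetical sketch: install juliaup into your own home directory on every
    # node, without touching the system-wide Julia 0.5.1 or any running processes.
    import subprocess

    NODES = ["node01", "node02", "node03"]  # placeholder hostnames

    # Official juliaup installer; "--yes" is assumed to accept the defaults
    # non-interactively (install into ~/.juliaup and update the shell profile).
    INSTALL_CMD = "curl -fsSL https://install.julialang.org | sh -s -- --yes"

    for node in NODES:
        # Runs as your own user over ssh; no root access is needed.
        subprocess.run(["ssh", node, INSTALL_CMD], check=True)

After that, running juliaup add release (or a specific version) on each node gives you whatever Julia you need, independent of the system installation.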

Related

OS-specific build performance in Java

We are currently evaluating our next-generation company-wide developer PC configuration and have noticed something really weird.
Our rather large monolith has - on our current configuration - a build time of approx. 4.5 minutes (no tests, just compile).
For our next-generation configuration we upgraded several components: a moderate increase in processor frequency and IPC, double the number of CPU cores, and a switch from a small SATA SSD to an NVMe SSD rated at >3 GB/s. The next-generation configuration also switches from Windows 7 to Windows 10.
When executing the first tests, we noticed an almost identical build time (4.3 minutes), which was a lot less improvement than we expected.
During our experiments we tried at one point to run the build process from within a virtual Linux machine running on the Windows host. On the old configuration (Windows 7) we saw a drop in build times from 4.5 to ~3.7 minutes; on the Windows 10 host, we saw a decrease from 4.3 to 2.3 minutes. We have ruled out things like virus scanning.
We were rather astonished by these results and have tried to find a better explanation than the almost-religious and insulting statements people make about different operating systems.
So the question is: what could we have done wrong in configuring the Windows machine such that it is almost half the speed of a Linux system running virtualized on the very same Windows host? Especially as all the hardware advancements seem to be eaten up by the switch from Windows 7 to 10.
Another question is: how can we make the javac process use more cores? Right now, using HotSpot JDK 8, we see at most two cores really used by the build. I've read about sjavac, but that seems to be a rather experimental feature only available from OpenJDK 9 onward, right?
After almost a year of experimenting we came to the conclusion that it is indeed NTFS which is the evil-doer. If you use an NTFS user partition with a Linux host, you get results roughly similar to an all-Windows setup.
We benchmarked Gradle builds, Eclipse internal builds, starting up WildFly, and running database-centred tests on multiple devices. All our benchmarks consistently showed a speedup of at least 100% when switching from Windows to Linux (in some real-world benchmarks Windows took 3x as long as Linux, and some artificial benchmarks showed a speedup of 60x!). Especially on notebooks we experienced much less noise, as the combined processor load of a complete build is substantially less than on Windows.
Our conclusion was to switch from Windows to Linux, which we did over the course of the last year.
Regarding the parallelisation issue, we realized it was caused by some form of code entanglement. Resolving this helped Gradle and javac parallelise the build a lot (also have a look at Gradle composite builds).

Most lightweight way to emulate a distributed system on Linux

So I am taking a distributed systems class in which projects are done by simulating a distributed system using Android and multiple emulators. This approach is terrible for multiple reasons:
Android emulators are so resource-hungry that my poor laptop mostly just crashes.
Networking support between emulators is poor: you need to set up TCP port forwarding and whatnot.
So what is the way to emulate a distributed system on my Linux machine that consumes minimal resources, mostly RAM and CPU time?
Is Docker the answer to all of this? Maybe create multiple containers with separate IP for each? Is that even possible?
My team maintains several production distributed systems, and we have to unit test them in such a way that we can catch protocol bugs.
We have a stub implementation of the clock and of the network that we inject into our classes. The network mimics the message-passing model used in many distributed systems papers: pick a pending message at random and deliver it. This models network latencies and inconsistencies very well. We have other things built in: the ability to block/release or drop messages to/from sets of hosts, and a simple TCP model.
With this simple addition our unit tests have become what we call interaction tests. We can very quickly add however many servers we want, all running in a single process on a laptop.
Oh, and after doing this, you'll know why global variables and singletons are a Bad Thing.
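As an illustration only (not the poster's actual code), a random-delivery network stub along these lines could look like the following Python sketch; on_message is an assumed handler method on the server objects under test.

    # Minimal sketch of a "pick a message at random and deliver it" network stub.
    import random

    class StubNetwork:
        def __init__(self, seed=0):
            self.pending = []               # (src, dst, payload) not yet delivered
            self.blocked = set()            # hosts whose traffic is currently held
            self.rng = random.Random(seed)  # seeded for reproducible test runs

        def send(self, src, dst, payload):
            self.pending.append((src, dst, payload))

        def block(self, host):
            self.blocked.add(host)

        def release(self, host):
            self.blocked.discard(host)

        def step(self, servers):
            # Deliver one randomly chosen pending message, which models arbitrary
            # latency and reordering; servers maps host name -> server object.
            deliverable = [m for m in self.pending
                           if m[0] not in self.blocked and m[1] not in self.blocked]
            if not deliverable:
                return False
            msg = self.rng.choice(deliverable)
            self.pending.remove(msg)
            src, dst, payload = msg
            servers[dst].on_message(src, payload)
            return True

A test then repeatedly calls step (or drops messages instead of delivering them) until the protocol under test reaches the state being checked.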
You can run several docker containers on one Linux machine. Each container will get its own IP address and it will also be able to talk to other containers on the same host. How many systems do you want to simulate?
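As a sketch of that container approach, here is what starting a few nodes could look like with the Python Docker SDK (the docker package); the image name my-node-image is a placeholder for whatever runs your node process.

    # Rough sketch: a user-defined bridge network gives each container its own IP,
    # and containers on it can also reach each other by name.
    import docker

    client = docker.from_env()
    client.networks.create("distsys", driver="bridge")

    nodes = []
    for i in range(3):
        c = client.containers.run("my-node-image", detach=True,
                                  name=f"node{i}", network="distsys")
        nodes.append(c)

    for c in nodes:
        c.reload()  # refresh attributes to read the assigned address
        ip = c.attrs["NetworkSettings"]["Networks"]["distsys"]["IPAddress"]
        print(c.name, ip)

Plain docker network create and docker run on the command line achieve the same thing if you prefer not to script it.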

What is the difference between Hydra and Torque, and which is better: MPICH2 or Open MPI?

I have two questions:
What is the difference between Hydra and Torque? Or, to ask it another way: what more does Hydra have to offer compared to Torque? Do I need Hydra at all if I choose to use Torque (+ MAUI)?
Also, what is the advantage of MPICH2 over Open MPI, given that Open MPI supports InfiniBand and has also continuously supported the Windows platform? To me it looks like a Swiss Army knife. Am I wrong?
Torque and Hydra are two completely separate things. Torque is a distributed resource manager that allows batch execution of tasks (jobs) on a network of compute systems. Hydra is part of MPICH and is responsible for launching and controlling the processes that make up an MPI job. The way Torque and Hydra work together is that one submits a job to Torque, which reserves cluster resources and at some point starts the job. The mpiexec command in turn uses Hydra to start and control the processes that make up the MPI job on the compute nodes provided by Torque.
MPICH2 and Open MPI are both quite mature MPI implementations. While Open MPI supports more connection protocols, there is an InfiniBand-enabled version of MPICH called MVAPICH. MPICH is also the basis of several commercial MPI implementations, including Intel MPI and Microsoft MPI. While Open MPI has supported Windows for a long time, its Windows maintainer left some time ago and it is unclear whether it will continue to support that OS.
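To make the division of labour concrete, here is a minimal sketch assuming the mpi4py package built against MPICH. Torque reserves the nodes, and inside the job script a line such as mpiexec -n 4 python hello_mpi.py lets Hydra launch and control one process per allocated slot.

    # hello_mpi.py - minimal MPI program (assumes the mpi4py package).
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    print(f"rank {comm.Get_rank()} of {comm.Get_size()} "
          f"running on {MPI.Get_processor_name()}")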

User-Contributed Code Security

I've seen some websites that can run code from the browser, and the code is evaluated on the server.
What are the security best practices for applications that run user-contributed code, beyond preventing it from accessing and changing the server's sensitive information (for example, by using Python with a stripped-down version of the standard library)?
How do you prevent DoS attacks such as non-halting and/or CPU-intensive programs? (We can't use static code analysis here.) What about DoSing the type-checking system?
Python, Prolog and Haskell are suggested examples to talk about.
The "best practice" (am I really the only one who hates that phrase?) is probably just not to do it at all.
If you really must do it, set it up to run in a virtual machine (and I don't mean something like a JVM; I mean something that hosts an OS) so it's easy to restore the VM from a snapshot (or whatever the VM in question happens to call it).
In most cases, you'll need to go a bit beyond just that though. Without some extra work to lock it down, even a VM can use enough resources to reduce responsiveness so it can be difficult to kill and restart it (you usually can eventually, but "eventually" is rarely what you want). You also generally want to set some quotas to limit its total CPU usage, probably limit it to using a single CPU (and run it on a machine with at least two), limit its total memory usage, etc. In Windows, for example, you can do (at least most of that) by starting the VM in a job object, and limiting the resources available to the job object.
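On a Linux host, a lightweight complement to those quotas is to cap the child process directly with OS resource limits. A small sketch in Python (this only adds a layer; it does not replace the VM-level isolation described above):

    # Run an untrusted script with hard caps on CPU time and address space, plus
    # a wall-clock timeout to catch non-halting programs. Linux/POSIX only.
    import resource
    import subprocess

    def run_untrusted(path, cpu_seconds=2, mem_bytes=256 * 1024 * 1024, wall=5):
        def limits():
            # Applied in the child process just before exec.
            resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
            resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

        try:
            return subprocess.run(["python3", path], preexec_fn=limits,
                                  capture_output=True, timeout=wall)
        except subprocess.TimeoutExpired:
            return None  # killed after the wall-clock limit expired

Resource limits alone are not a sandbox, which is why the VM (or an OS-level equivalent such as cgroups or Windows job objects) remains the primary boundary.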

Is there any scenario where an application instance runs across multiple computers?

We know that a single application instance can use multiple cores and even multiple physical processors. With cloud and cluster computing, or other special scenarios, I don't know whether a single instance can run across multiple computers, or across multiple OS instances.
This is important to me because, besides being considered bad programming, I use some static (as in C#) global variables, and my program will probably behave unexpectedly if those variables become shared between computers.
Update: I'll be more specific. I'm writing a .NET application that has one variable counting the number of active IP connections, to prevent the number of connections from exceeding a per-computer limit. I am concerned about whether, if I deploy that application on a cloud-computing host, I'll still have one instance of that variable per computer.
If you want to learn how to realize such a scenario (a single instance across multiple computers), I think you should read some articles about MPI.
it has become a de facto standard for communication among processes that model a parallel program running on a distributed memory system.
Regarding your worries: Obviously, you'll need to somehow consciously change your program to run as one instance across several computers. Otherwise, no sharing of course takes place, and as Shy writes, there is nothing to worry about. This kind of stuff wouldn't happen automatically.
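A tiny Python analogue of that point: a module-level ("static") variable lives per process, so separate processes (let alone separate machines) each see their own copy unless you explicitly share state through something like MPI, RPC, or a shared store.

    # Each child process gets its own copy of the module-level variable.
    from multiprocessing import Process

    counter = 0  # module-level "static" variable

    def bump():
        global counter
        counter += 1
        print("child sees counter =", counter)   # each child prints 1

    if __name__ == "__main__":
        procs = [Process(target=bump) for _ in range(3)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print("parent sees counter =", counter)  # prints 0: nothing was shared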
What programming environment (language) are you using? It should define exactly what "static" means. Most likely it does not allow any sharing of information between different computers except through explicit channels such as MPI or RPC.
However, abstracted high-level environments may be designed to hide the concept of "running on multiple computers" from you entirely. It is conceivable to design a virtual machine that can run on a cluster and will actually share static variables between different physical machines - but in this case your program is running on the single virtual machine, and whether that runs on one or more physical machines should not make any difference.
If you can't even imagine a situation where this happens, why are you worrying about it?
Usually, this could happen in a scenario involving RPC of some kind.
Well yes, there's distcc. It's the GCC compiler, distributed.
