Running a foreign Jupyter notebook - what can happen - security

I read the security page from https://jupyter-notebook.readthedocs.io/en/stable/security.html , but that basically seems to say that you can secure your server, but that once you execute a notebook from someone else, the only thing protecting you is your decision not to execute code - which to me seems to be the major point of those notebooks.
Is it really the case that i am effectively granting whatever level of access to my whole system by executing code from a notebook - or is there a built-in way to sandbox this?

Running foreign jupyter notebooks does actually provide the risk of changes on your computer being made; for example delete files or change system settings.
This is especially easy since in built-in line magic commands are supported.
Even though all changes can be reversed (with more or less effort), the best way for you to secure yourself from unwanted changes is reading (and understanding) the code that you will execute.
This is actually not only a recommendation for notebooks specifically, but for all forms of code found online.

Related

Running (& compiling) untrusted user code

I want to create a application that contains a feature that allows users to submit code and the server will compile and run it, similar to Ideone & Spoj. How do I do this securely in a scalable manner?
Partial Solutions I'm aware of:
IDEA 1 - 3rd Party Services
The Sphere Engine. However this costs a LOT of money!
I'm not aware of any open source application I can run on my server to achieve this, or a cheaper alternative. Please correct me if i'm wrong.
IDEA 2 - VM
This would be the next most sensible choice. However, I'm unsure how to implement it. For example let's say I created a VM and started to run the user's code. This would restrict damage on MY system, but not the damage on the VM, which other users would have to use. Does that mean I have to create a new VM each and every time I want to compile and run user's code (which clearly is not scalable - correct me if I'm wrong.
Having not set up a thing, I assumed that services like TravisCI (which compiles code and runs it under test cases you provide), have a base virtual machine image, which boots up and processes your code. The next user to come along gets a separate VM booted from the same base image, your changes aren't stored.
So inside the VM, the user code can do whatever. All of its effects, except stuff written to the console will be erased at the end of the time limit.

Extract SAS Enterprise Guide into Unix Server runnable batch?

We have built a project in Enterprise Guide for the purpose of creating a easy understandable and maintainable code. The project contain a set of process flows which run should be done in specific order. This project we need to run on a Linux Server machine, where the SAS Metadata Server is running.
Basic idea is to extract this project into SAS code, which we would be able to run from command line in Linux as a batch job.
Question 1:
Is there any other way to schedule a batch job in Linux-hosted SAS Server? I have read about VBS scripting for scheduling/running batch jobs, but in order this to be done on Linux Server, a installation of WINE is required, which on a production machine which already runs a number of other important applications, is almost completely out of question.
Is there a way to specify a complete project export into SAS code, provided that I give the specific order of running process flows? I have tried out ordered list, which is able to make you a list of tasks to run in order (although there is no way to choose a whole process flow as a single task), but unfortunately, this ordered list itself is later not possible to be exported as a SAS code.
Current solution we do is the following:
We export each single process flow of the SAS EG project into SAS code, and then create another SAS code with %include lines to run all the extracted codes in order that we want. This is of course a possible solution, but definitely not the most elegant one.
Question 2:
Since I don't know how exactly the code is being exported afterwards, are there any dangers I should bear in mind with the solution I chose.
Is there any other, more elegant way?
You have a couple of options from what I'm familiar with, plus I suspect if Dom happens by he'll know more. These answers are based on EG 6.1, which is the current version (ships with 9.4); it's possible some of these things may not be true in earlier versions.
First, if you're running Enterprise Guide from Windows, you can schedule the job locally (on any Windows machine with Enterprise Guide). You're not scheduling the server directly, you schedule Windows to launch an EG process that connects to the server and does its magic. That's how I largely interact with scheduling (because I have a fairly 'light' scheduling need).
Second, from the blog post "Four Ways to Schedule SAS Tasks", options 3 and 4 may be helpful for you. The SAS Platform Suite is designed in part for scheduling, and the options using SAS Management Console to schedule via operating system tools, are both very helpful.
Third, you may want to look into SAS Stored Processes, which should be schedulable. A process flow can be converted into a stored process.
For your specific questions:
Question 1: When you export a process flow or a project, at least in 6.1 you have the option to change the order in which the programs are exported. It's manual, so it's probably not perfect, but it does give you that option. (The code seems to be by default in creation order, which is sub-optimal.) The project export does group process flows together, but you don't have the option of manipulating the order of process flows - you have to move each program around, which would be tedious. It also of course gives you less flexibility if you need to multiply run programs.
Question 2: As Stig Eide points out in comments, make sure your System Option LRECL is > 256 (the default) or you run some risk of code being cut off. In 9.2+ this is modifiable; just place LRECL=32767in your config.sas file.

How to determine that the shell script is safe

I downloaded this shell script from this site.
It's suspiciously large for a bash script. So I opened it with text editor and noticed
that behind the code there is a lot of non-sense characters.
I'm afraid of giving the script execution right with chmod +x jd.sh. Can you advise me how to recognize if it's safe or how to set it's limited rights in the system?
thank you
The "non-sense characters" indicate binary files that are included directly into the SH file. The script will use the file itself as a file archive and copy/extract files as needed. That's nothing unusual for an SH installer. (edit: for example, makeself)
As with other software, it's virtually impossible to decide wether or not running the script is "safe".
Don't run it! That site is blocked where I work, because it's known to serve malware.
Now, as to verifying code, it's not really possible without isolating it completely (technically difficult, but a VM might serve if it has no known vulnerabilities) and running it to observe what it actually does. A healthy dose of mistrust is always useful when using third-party software, but of course nobody has time to verify all the software they run, or even a tiny fraction of it. It would take thousands (more likely millions) of work years, and would find enough bugs to keep developers busy for another thousand years. The best you can usually do is run only software which has been created or at least recommended by someone you trust at least somewhat. Trust has to be determined according to your own criteria, but here are some which would count in the software's favor for me:
Part of a major operating system/distribution. That means some larger organization has decided to trust it.
Source code is publicly available. At least any malware caused by company policy (see Sony CD debacle) would have a bigger chance of being discovered.
Source code is distributed on an appropriate platform. Sites like GitHub enable you to gauge the popularity of software and keep track of what's happening to it, while a random web site without any commenting features, version control, or bug database is an awful place to keep useful code.
While the source of the script does not seem trustworthy (IP address?), this might still be legit. With shell scripts it is possible to append binary content at the end and thus build a type of installer. Years ago, Sun would ship the JDK for Solaris in exactly that form. I don't know if that's still the case, though.
If you wanna test it without risk, I'd install a Linux in a VirtualBox (free virtual-machine software), run the script there and see what it does.
Addendum on see what it does: There's a variety of tools on UNIX that you can use to analyze a binary program, like strace, ptrace, ltrace. What might also be interesting is running the script using chroot. That way you can easily find all files that are installed.
But at the end of the day this will probably yield more binary files which are not easy to examine (as probably any developer of anti-virus software will tell you). Therefore, if you don't trust the source at all, don't run it. Or if you must run it, do it in a VM where at least it won't be able to do too much damage or access any of your data.

Authenticating GTK app to run with root permissions

I have a UI app (uses GTK) for Linux that requires to be run as root (it reads and writes /dev/sd*).
Instead of requiring the user to open a root shell or use "sudo" manually every time when he launches my app, I wonder if the app can use some OS-provided API to get root permissions. (Note: gtk app's can't use "setuid" mode, so that's not an option here.)
The advantage here would be an easier workflow: The user could, from his default user account, double click my app from the desktop instead of having to open a root terminal and launch it from there.
I ask this because OS X offers exactly this: An app can ask the OS to launch an executable with root permissions - the OS (and not the app) then asks the user to input his credentials, verifies them and then launches the target as desired.
I wonder if there's something similar for Linux (Ubuntu, e.g.)
Clarification:
So, after the hint at PolicyKit I wonder if I can use that to get r/w access to the "/dev/sd..." block devices. I find the documention quite hard to understand, so I thought I'd first ask whether this is possible at all before I spend hours on trying to understand it in vain.
Update:
The app is a remote operated disk repair tool for the unsavvy Linux user, and those Linux noobs won't have much understanding of using sudo or even changing their user's group memberships, especially if their disk just started acting up and they're freaking out. That's why I seek a solution that avoids technicalities like this.
The old way, simple but now being phased out, is GKSu. Here is the discussion on GKSu's future.
The new way is to use PolicyKit. I'm not quite sure how this works but I think you need to launch your app using the pkexec command.
UPDATE:
Looking at the example code on http://hal.freedesktop.org/docs/polkit/polkit-apps.html, it seems that you can use PolicyKit to obtain authorization for certain actions which are described by .policy files in /usr/share/polkit-1/actions. The action for executing a program as another user is org.freedesktop.policykit.exec. I can't seem to find an action for directly accessing block devices, but I have to admit, the PolicyKit documentation breaks my brain too.
So, perhaps the simplest course of action for you is to separate your disk-mangling code that requires privileges into a command-line utility, and run that from your GUI application using g_spawn_[a]sync() with pkexec. That way you wouldn't have to bother with requesting actions and that sort of thing. It's probably bad practice anyway to run your whole GUI application as root.
Another suggestion is to ask the author of PolicyKit (David Zeuthen) directly. Or try posting your question to the gtk-app-devel list.

Running external code in a restricted environment (linux)

For reasons beyond the scope of this post, I want to run external (user submitted) code similar to the computer language benchmark game. Obviously this needs to be done in a restricted environment. Here are my restriction requirements:
Can only read/write to current working directory (will be large tempdir)
No external access (internet, etc)
Anything else I probably don't care about (e.g., processor/memory usage, etc).
I myself have several restrictions. A solution which uses standard *nix functionality (specifically RHEL 5.x) would be preferred, as then I could use our cluster for the backend. It is also difficult to get software installed there, so something in the base distribution would be optimal.
Now, the questions:
Can this even be done with externally compiled binaries? It seems like it could be possible, but also like it could just be hopeless.
What about if we force the code itself to be submitted, and compile it ourselves. Does that make the problem easier or harder?
Should I just give up on home directory protection, and use a VM/rollback? What about blocking external communication (isn't the VM usually talked to over a bridged LAN connection?)
Something I missed?
Possibly useful ideas:
rssh. Doesn't help with compiled code though
Using a VM with rollback after code finishes (can network be configured so there is a local bridge but no WAN bridge?). Doesn't work on cluster.
I would examine and evaluate both a VM and a special SELinux context.
I don't think you'll be able to do what you need with simple file system protection because you won't be able to prevent access to syscalls which will allow access to the network etc. You can probably use AppArmor to do what you need though. That uses the kernel and virtualizes the foreign binary.

Resources