I am trying to make a chroot'ed, sandboxed build environment which creates itself from a Git checkout before proceeding to build the application. One of the requirements is that the developers doing the git checkout and invoking the build should not need admin privileges on the host machine.
unshare -r chroot
works fine, except there is no /proc, which in turn means a lot of standard stuff won't work.
The various methods I have found to create /proc with mount require sudo rights.
Docker does this, but the developers have to be in the "docker" group, which effectively gives them uncontrolled root access; at that point I might as well give them sudo rights.
I have also found "proot", which does some kind of emulation to achieve this; however, it comes with a performance penalty.
You also need a mount namespace, which gives you the ability to perform recursive bind mounts (and plain bind mounts where there are no child mounts), pivot_root, and the ability to mount tmpfs, so use unshare -rm.
With a PID namespace you can also mount fresh instances of procfs.
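For example, a minimal sketch along these lines (the rootfs path is a placeholder, not from the question):
unshare -r -m --pid --fork /bin/bash -c '
  mount -t proc proc /path/to/rootfs/proc   # fresh procfs, visible inside the chroot
  exec chroot /path/to/rootfs /bin/sh
'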
I ended up using bubblewrap (bwrap). For a few things involving ttys, I had to let it run with a pseudo uid of 0 to make them work.
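A minimal bwrap invocation along those lines might look like this (the rootfs path is a placeholder, and --uid 0 is only needed for the tty-related cases mentioned above):
bwrap --unshare-user --unshare-pid \
      --uid 0 --gid 0 \
      --bind /path/to/rootfs / \
      --proc /proc --dev /dev \
      /bin/sh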
If I were doing it now, I think I would use podman.
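Rootless podman gets you something similar without any special group membership, assuming subuid/subgid ranges are set up for the user (the image name here is just an example):
podman run --rm -it docker.io/library/ubuntu:22.04 /bin/bash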
I want to know how I can add the local users of my server to a Docker container. I don't need to import their files; I just need a username/password/privileges with a new home directory in the Docker container for every user on my system. For example, suppose my host system contains the following users:
Host System:
admin: who has root access and rw access to all
bob: a regular non-sudo user
joe: another regular non-sudo user
Then the Docker Container must have users:
admin: who has root access and rw access to all
bob: a regular non-sudo user
joe: another regular non-sudo user
The Docker container and the host system are both running Linux, though the host is Red Hat and the container is Ubuntu.
EDIT: I don't want to mount /etc/ files if possible, as this can create a two-way security vulnerability, as pointed out by @caveman.
You would have to mount all relevant Linux files using -v, such as /etc/passwd, /etc/shadow, /etc/group, and /etc/sudoers. I can't recommend this due to the security risks: if anyone gets root access in the container, they can add users on the host or change passwords, since the mount works both ways.
The list of files is not exhaustive; for example, you also have to make sure the shell executables exist within the container. When testing this I had to make a symbolic link from /usr/bin/zsh to /bin/bash, for example, since my user has the zsh shell configured, which was not present in the Docker image.
If you want to use these users to interact with mounted files, you also have to make sure that user namespace remapping is disabled, or specify that you want to use the same user namespace as the host with the --userns=host flag. Again, this is not recommended since remapping is a security feature, so use with care.
Note: Once you have done all this you can use su - {username} to switch to any of your existing users. The -u option doesn't work, since Docker checks the /etc/passwd file before mounting and will give an error.
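Put together, the invocation described above would look roughly like this (the image name is just an example; the bind mounts are exactly the security trade-off mentioned earlier):
docker run -it \
  -v /etc/passwd:/etc/passwd \
  -v /etc/shadow:/etc/shadow \
  -v /etc/group:/etc/group \
  -v /etc/sudoers:/etc/sudoers \
  --userns=host \
  ubuntu bash
# then, inside the container: su - bob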
I read this article about why you shouldn't run containerized applications as the root user,
and I'd like someone to confirm my understanding:
Article brief
The article is basically saying that just as you wouldn't run binaries on your machine as root, but rather as a user with only the privileges required, you shouldn't run containerized applications as root either.
The recommendation of the author is to create a user with a known uid in the Dockerfile
and run the process as that user.
The start of the Dockerfile should look like this:
FROM <base image>
RUN groupadd -g 999 appuser && \
    useradd -r -u 999 -g appuser appuser
USER appuser
... <rest of Dockerfile> ...
Validating my understanding, and some questions:
1. Why bother?
Ok, I understand that it's not good to run a container process as root,
just like it's not good to run any process as root.
That's why we should create a user in the Dockerfile and run the application process as that user.
But, if it's possible to run:
~$ docker run -u 0 some_docker_image
then why bother adding a user to my Dockerfile and switching to that user?
The question boils down to the question: What are we "afraid" of? What is the threat?
If the answer is that we're afraid of some untrusted user connected to the system (who is not a sudoer),
then this user can't even run docker containers, unless he is a member of the "docker" group,
in which case - again - he could run the container with -u 0?
So I guess that we're not concerned about the user. We're concerned about the binary itself.
In that case, two possible options exist:
a. The binary is of our creation. In that case: why are we concerned?
b. The binary is of someone else's creation. In that case I can understand why we would like to switch users.
Am I missing something?
2. Why "Known uid"?
Why is it important to specify the uid of the newly created user, rather than just giving it a name?
3. Why in the start of the Dockerfile?
Is it important to create and switch to the new user at the beginning of the Dockerfile?
Seemingly, this is an approach that's hard to implement, since during the docker build process you usually need to run a lot of tasks that require root privileges, such as apt-get install, etc.
4. What about adding a user and adding it to sudoers?
I have a case in which I need to create a Docker image which, when the container runs, runs an ssh server. In order to run the ssh server, you need root privileges.
Is there any point in creating a user, adding it to sudoers, and then running the ssh server as root?
Running as root in Docker is dangerous for most of the same reasons as running as root directly on the host. The container has limited Linux privileges so there are some things it can't do (reconfigure the network, reboot the host), but it can do things like overwrite the application code inside the container.
Nobody's code is absolutely perfect, so one of the big reasons to run as non-root is to minimize the damage possible when a mistake does happen.
It doesn't matter what the user ID is, just that it's not 0. There's an argument to make it different from any uid the host might be using, but since your image could run on any host, it's just a guess.
You should create the user at the start of the Dockerfile, since that setup will change infrequently and Docker layer caching can skip it. But, you should use the USER directive and switch to the user at the end of the Dockerfile, after COPYing code in and RUNning the build. Do not RUN chown ... to make the non-root user own the code: you want most files to be owned by root, so that the non-root user can't overwrite them.
(In a compiled language, with a multi-stage build, you can consider the Dockerfile equivalent of the ./configure; make; sudo make install sequence, switching to a non-privileged user to do the build. I haven't seen this pattern in many Dockerfiles but I'd recognize it if I saw it.)
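A Dockerfile sketch of that layout (the base image, build commands, and file names here are only illustrative, not from the question):
FROM node:18
RUN groupadd -g 999 appuser && \
    useradd -r -u 999 -g appuser appuser
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# note: no chown -- the files stay owned by root so appuser can't overwrite them
USER appuser
CMD ["node", "server.js"]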
Do not add a user to /etc/sudoers. There are a couple of good reasons for this. The most basic one is that a container only runs a single process, and as already discussed we probably don't want it to be root. You can either configure it with no password (in which case you might as well be root) or hard-code a password in plain text in your Dockerfile (also a bad idea). You also usually don't want to run sudo inside a script (its behaviors of putting up random password prompts and hiding environment variables can cause trouble), and so correspondingly don't want to RUN sudo ... in a Dockerfile.
If you need to break into a container to debug it, you can always docker exec -u root ... to get a root shell there.
I have a tcpdump application in a CentOS container. I was trying to run tcpdump as nonroot. Following this forum post: https://askubuntu.com/questions/530920/tcpdump-permissions-problem (and some other documentation that reinforced this), I tried to use setcap cap_net_admin+eip /path/to/tcpdump in the container.
After running this, I tried to run tcpdump as a different user (with permissions to tcpdump) and got "Operation Not Permitted". I then tried to run it as root, which had previously been working, and also got "Operation Not Permitted". After running getcap, I verified that the permissions were what they should be. I thought it might be my specific use case, so I tried running the setcap command against several other executables. Every single executable returned "Operation Not Permitted" until I ran setcap -r /filepath.
Any ideas on how I can address this issue, or even work around it without using root to run tcpdump?
The NET_ADMIN capability is not included in containers by default because it could allow a container process to modify and escape any network isolation settings applied to the container. Therefore a binary that has been given this capability with setcap will fail to run, since root and every other user in the container is blocked from acquiring that capability. To run a container with it, you would need to add the capability with the command used to start your container, e.g.
docker run --cap-add NET_ADMIN ...
However, I believe all you need is NET_RAW (setcap cap_net_raw) which is included in the default capabilities. From man capabilities:
CAP_NET_RAW
* Use RAW and PACKET sockets;
* bind to any address for transparent proxying.
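So, assuming the default capability set, something along these lines should be enough for a non-root user (the tcpdump path may differ in your image):
setcap cap_net_raw+ep /usr/sbin/tcpdump   # run as root, e.g. while building the image
# then, as the unprivileged user:
tcpdump -i eth0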
I am running a VM on my machine and have mounted a host folder inside VM using sshfs (auto-mounted via fstab).
abc@xyz:/home/machine/test on /home/vm/test type fuse.sshfs (rw,relatime,user_id=0,group_id=0,allow_other)
That folder has an executable which I want to run inside the VM. But I also need some capabilities before running that executable. So my script looks like:
#!/bin/bash
# Some preprocessing.
sudo setcap CAP_DAC_OVERRIDE+ep /home/vm/test/my_exec
/home/vm/test/my_exec
But I am getting below error :
Failed to set capabilities on file `/home/vm/test/my_exec' (Operation not supported)
The value of the capability argument is not permitted for a file. Or the file is not a regular (non-symlink) file
But if I copy the executable inside the VM (say to /tmp/), then it works perfectly fine. Is this a known limitation of sshfs, or am I missing something here?
File capabilities are implemented on Linux with extended attributes (specifically the security.capability attribute), and not all filesystems implement extended attributes.
sshfs in particular does not.
sshfs can only perform operations which the remote user is authorized to perform. You're logged into the remote host as abc, so you can only perform actions over sshfs which abc can perform -- which doesn't include setcap, since that operation can only be performed by root. Using sudo on your local machine doesn't change that.
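One workaround in line with what the questioner already observed (paths taken from the question) is to copy the binary onto a local filesystem that supports extended attributes and set the capability there:
cp /home/vm/test/my_exec /tmp/my_exec
sudo setcap cap_dac_override+ep /tmp/my_exec
/tmp/my_exec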
I'm asking in both contexts: technically and stylistically.
Can my application/daemon keep a pidfile in /opt/my_app/run/?
Is it very bad to do so?
My need is this: my daemon runs under a specific user, and the implementor must mkdir a new directory in /var/run, chown, and chgrp it to make my daemon run. Seems easier to just keep the pidfile local (to the daemon).
I wouldn't put a pidfile under an application installation directory such as /opt/my_app/whatever. This directory could be mounted read-only, could be shared between machines, could be watched by a daemon that treats any change there as a possible break-in attempt…
The normal location for pidfiles is /var/run. Most unices will clean this directory on boot; under Ubuntu this is achieved by making /var/run an in-memory filesystem (tmpfs).
If you start your daemon from a script that's running as root, have it create a subdirectory /var/run/gmooredaemon and chown it to the daemon-running user before switching to that user (with su or similar) and starting the daemon.
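A sketch of such a startup wrapper (the user, group, daemon path, and --pidfile option are placeholders; your daemon's flags may differ):
#!/bin/sh
# runs as root at boot
mkdir -p /var/run/gmooredaemon
chown daemonuser:daemongroup /var/run/gmooredaemon
exec su -s /bin/sh daemonuser -c \
  '/opt/my_app/bin/gmooredaemon --pidfile /var/run/gmooredaemon/gmooredaemon.pid'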
On many modern Linux systems, if you start the daemon from a script or launcher that isn't running as root, you can put the pidfile in /run/user/$UID, which is a per-user equivalent of the traditional /var/run. Note that the root part of the launcher, or a boot script running as root, needs to create the directory (for a human user, the directory is created when the user logs in).
Otherwise, pick a location under /tmp or /var/tmp, but this introduces additional complexity because the pidfile's name can't be uniquely determined if it's in a world-writable directory.
In any case, make it easy (command-line option, plus perhaps a compile-time option) for the distributor or administrator to change the pidfile location.
The location of the pid file should be configurable. /var/run is standard for pid files, just as /var/log is standard for logs, but your daemon should allow you to override this setting in some config file.
/opt is used to install 'self-contained' applications, so nothing wrong here. Using /opt/my_app/etc/ for config files, /opt/my_app/log/ for logs and so on - common practice for this kind of application.
This way you can distribute your application as a TGZ file instead of maintaining a package for every package manager (at least DEB, since you tagged ubuntu). I would recommend this for in-house applications or situations where you have great control over the environment. The reasoning is that it makes no sense if the safe costs more than what you are putting inside (the work required to pack the application should not eclipse the effort required to write the application).
Another convention, if you're not running the script as root, is to put the pidfile in ~/.my_app/my_app.pid. It's simpler this way while still being secure, as the home directory is not world-writable.
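A minimal sketch of that convention (the directory and file names are just the example from above):
mkdir -p "$HOME/.my_app"
echo $$ > "$HOME/.my_app/my_app.pid"   # records this shell's PID; a real daemon would write its own PID after daemonizing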