Linux sandbox with C, secure? - linux

I'm developing a generic honeypot for TCP services as part of my BA thesis.
I'm currently using Chroot, Linux Namespaces, Secure Computing and Capabilities to provide some sort of a Sandbox.
My question is: Are there any points I have to be aware of? Since I have to mount /proc in the sandbox, I'm curious if it will affect the overall security of the host system.
(User namespaces are not an option btw.)
/* EDIT */
To be more clear: I'm using capabilities(7) and libseccomp to restrict the access to features such as syscalls for root and non-root users.
But what about files in /proc e.g. /proc/sys/* ? Should I blacklist files/directories with an empty bind-mount like firejail does?

As commented by Yann Droneaud, reading the src of systemd-nspawn helped a lot.
I found the following /proc subdirs which should be bind mounted read-only/inaccessible:
/* Make these files inaccessible to container payloads: they potentially leak information about kernel
* internals or the host's execution environment to the container */
PROC_INACCESSIBLE("/proc/kallsyms"),
PROC_INACCESSIBLE("/proc/kcore"),
PROC_INACCESSIBLE("/proc/keys"),
PROC_INACCESSIBLE("/proc/sysrq-trigger"),
PROC_INACCESSIBLE("/proc/timer_list"),
/* Make these directories read-only to container payloads: they show hardware information, and in some
* cases contain tunables the container really shouldn't have access to. */
PROC_READ_ONLY("/proc/acpi"),
PROC_READ_ONLY("/proc/apm"),
PROC_READ_ONLY("/proc/asound"),
PROC_READ_ONLY("/proc/bus"),
PROC_READ_ONLY("/proc/fs"),
PROC_READ_ONLY("/proc/irq"),
PROC_READ_ONLY("/proc/scsi"),

Related

How does AppArmor handle Linux-kernel mount namespaces?

I've searched through wiki of AppArmor's wiki as well as tried Internet searches for "apparmor mount namespace" (or similar). However, I always draw a blank as how AppArmor deals with them, which is especially odd considering that OCI containers could not exist without mount namespaces. Does AppArmor take mount namespaces into any account at all, or does it simply check for the filename passed to some syscall?
If a process inside a container switches mount namespaces does AppArmor take notice at all, or is it simply mount namespace-agnostic in that it doesn't care? For instance, if a container process switches into the initial mount namespace, can I write AppArmor MAC rules to prevent such a process from accessing senstive host files, while the same files inside its own container are allowed for access?
can I write AppArmor MAC rules to prevent such a process from
accessing senstive host files.
Just don't give container access to sensitive host filesystem part. That means don't mount them into container. This is out of scope of AppArmor to take care of if you do.
I would say that AppArmor is partially linux kernel mount namespace aware.
I think the attach_disconnected flag in apparmor is an indication that apparmor knows if you are in the main OS mount namespace or a separate mount namespace.
The attach_disconnected flag is briefly described at this link (despite the warning at the top of the page claiming to be a draft):
https://gitlab.com/apparmor/apparmor/-/wikis/AppArmor_Core_Policy_Reference
The following reference, from a ubuntu apparmor discussion, provides useful and related information although not directly answering your question.
https://lists.ubuntu.com/archives/apparmor/2018-July/011722.html
The following references, from a usenix presentation, provides a proposal to add security namespaces to the Linux kernel for use by frameworks such as apparmor. This does not directly show how / if apparmor currently uses kernel mount namespaces for decision making, but it's related enough to be of interest.
https://www.usenix.org/sites/default/files/conference/protected-files/security18_slides_sun.pdf
https://www.usenix.org/conference/usenixsecurity18/presentation/sun
I don't know if my response here is complete enough to be considered a full answer to your questions, however I don't have enough reputation points to put this into a comment. I also found it difficult to know when the AppArmor documentation meant "apparmor policy namespace" vs "linux kernel mount namespace", when the word "namespace" was specified alone.

Kernel configurations for lxc

I am configuring Linux kernel 3.10.31ltsi and want to add the needed support for LXC, as far as I understood, cgroups and namespaces shall be available for LXC, but what are the configurations in menuconfig that need to be included?
You should use a script called "lxc-checkconfig" (which is part of LXC) to check whether your kernel supports or not all required settings; see
https://linuxcontainers.org/lxc/manpages/man1/lxc-checkconfig.1.html
As a side note, I think that LXC uses by default all namespaces; this means that you should set
CONFIG_UTS_NS, CONFIG_IPC_NS, CONFIG_USER_NS, CONFIG_PID_NS,
CONFIG_NET_NS, and the mount namesapces (forgot it's config entry).
Regarding cgroups - not sure, probably the memory, cpu and I/O cgroups controllers are mandatory, and maybe some more; use the lxc-checkconfig script.

Can bash be used to communicate directly with hardware?

I am interested in writing my own tool in bash to act in place of my current network controller (wpa_supplicant) if possible. For example if I want to issue commands in order to begin a wps authentication session with a router's external registrar, is it possible, without using any pre-built tools, to communicate with the kernel to directly access the hardware? I have been told that I can achieve what I desire with a bash plugin called ctypes.sh but I am not too certain.
Generally speaking, the Linux kernel can interact with user-space through the following mechanisms:
Syscalls
Devices in /dev
Entries in /sys
Entries in /proc
Syscalls cannot be directly used from Bash but you need at least a binding through a C program.
You can create a Linux kernel driver or module which reads/writes data in an entry under /proc or /sys and then use a bash program to interact with it. Even if technically feasible, my personal opinion is that it is an overkill, and the usual C/C++ user-level programming with proper entries in /dev is much better.
No, this is generally not possible. Shell scripts can mess around in /proc, but they don't have the ability to perform arbitrary IOCTLs or even multi-step interactive IO. It's the wrong tool for the job.

What's the difference between a VFS and an NFS?

This might be a dumb question, but I'm struggling to find resources that clearly explain how a VFS is different from an NFS. Can they both be used for the same purpose?
Bonus question: Can you watch a VFS with inotify like you can an NFS?
"NFS" is a network filesystem that's been around for decades. Wikipedia has you covered on that front.
"VFS" is a more generic term that simply means "virtual filesystem". Within the context of Linux, it refers to the part of the kernel with which your user-space programs actually interact when they interact with "files". The VFS layer then passes requests to a concrete filesystem driver -- such as NFS, for example, or ext4, or others.
Read more here and here.
A virtual file system (VFS) is an abstraction layer on top of a more concrete file system.The purpose of a VFS is to allow client applications to access different types of concrete file systems in a uniform way, Where as Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystem in 1984, allowing a user on a client computer to access files over a computer network much more like local storage is accessed
A VFS can be used to access local and network storage devices transparently without the client application noticing the difference. It can be used to bridge the differences in Windows, Mac and Unix file systems, so that applications can access files on local file systems of those types without having to know what type of file system they are accessing Where as, NFS like many other protocols, builds on the Open Newtork Computing Remote Procedure Call (ONC RPC) system. The NFS is an open standard defined in Request for comments (RFC), allowing anyone to implement the protocol.
"VFS" is the name given to the entire layer in the kernel situated between the system calls and the filesystem drivers; it is not a filesystem in its own right.

Securty options in embeded linux?

I am working on an embedded Linux platform. In our platform there is only root user. Now we want to bring in security options like
1. Low Privileged user.
2. Allowing to run only executables from a particular location(only read permission).
3. Use Linux Containers
We have managed to add a low privileged user using the /etc/passwd file. But I have no idea how to do the rest. Is there any better options to implement security in the linux system. Any documentation or links are much appreciated.
Option two is achieved by the noexec flag on mounting. The slight challenge is figuring out exactly what to mount where; you'd want to mount / as noexec to get safety by default, but you need /sbin/mount to be executable. But you can probably make / read-only and mount all the writeable filesystems as noexec.

Resources