Run an untrusted C program in a sandbox in Linux that prevents it from opening files, forking, etc.?

I was wondering if there exists a way to run an untrusted C program under a sandbox in Linux. Something that would prevent the program from opening files, or network connections, or forking, exec, etc?
It would be a small program, a homework assignment, that gets uploaded to a server and has unit tests executed on it. So the program would be short lived.

I have used Systrace to sandbox untrusted programs both interactively and in automatic mode. It has a ptrace()-based backend which allows its use on a Linux system without special privileges, as well as a far faster and more powerful backend which requires patching the kernel.
It is also possible to create a sandbox on Unix-like systems using chroot(1), although that is not quite as easy or secure. Linux Containers and FreeBSD jails are a better alternative to chroot. Another alternative on Linux is to use a security framework like SELinux or AppArmor, which is what I would propose for production systems.
We would be able to help you more if you told us exactly what it is that you want to do.
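To make the chroot option concrete, here is a minimal sketch (mine, not taken from any particular tool) of the classic pattern: chroot into a prepared directory, drop root, then exec the untrusted binary. The jail path, the uid/gid and the binary name are all hypothetical, the jail must already contain the binary plus anything it needs, and the wrapper itself has to start as root.

/* Minimal chroot-jail launcher sketch. Must be started as root. */
#include <stdio.h>
#include <unistd.h>
#include <grp.h>
#include <sys/types.h>

int main(void)
{
    const char *jail = "/srv/jail";   /* hypothetical, pre-populated jail dir */
    uid_t run_uid = 65534;            /* e.g. the "nobody" user */
    gid_t run_gid = 65534;

    if (chroot(jail) != 0) { perror("chroot"); return 1; }
    if (chdir("/") != 0)   { perror("chdir");  return 1; }

    /* Drop root only after chrooting; a process that keeps root inside a
       chroot can usually break back out. Drop groups and gid before uid. */
    if (setgroups(0, NULL) != 0) { perror("setgroups"); return 1; }
    if (setgid(run_gid) != 0)    { perror("setgid");    return 1; }
    if (setuid(run_uid) != 0)    { perror("setuid");    return 1; }

    execl("/a.out", "a.out", (char *)NULL);   /* path as seen inside the jail */
    perror("execl");
    return 1;
}

As the edits below point out, this only covers the filesystem; resource limits and syscall restrictions still have to come from somewhere else.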
EDIT:
Systrace would work for your case, but I think that something based on Linux Security Modules (LSM), like AppArmor or SELinux, is a more standard, and thus preferred, alternative, depending on your distribution.
EDIT 2:
While chroot(1) is available on most (all?) Unix-like systems, it has quite a few issues:
It can be broken out of. If you are going to actually compile or run untrusted C programs on your system, you are especially vulnerable to this issue. And if your students are anything like mine, someone WILL try to break out of the jail.
You have to create a full independent filesystem hierarchy with everything that is necessary for your task. You do not have to have a compiler in the chroot, but anything that is required to run the compiled programs should be included. While there are utilities that help with this, it's still not trivial.
You have to maintain the chroot. Since it is independent, the chroot files will not be updated along with your distribution. You will have to either recreate the chroot regularly, or include the necessary update tools in it, which would essentially require that it be a full-blown Linux distribution. You will also have to keep system and user data (passwords, input files etc.) synchronized with the host system.
chroot() only protects the filesystem. It does not prevent a malicious program from opening network sockets or a badly-written one from sucking up every available resource.
The resource usage problem is common among all alternatives. Filesystem quotas will prevent programs from filling the disk. Proper ulimit (setrlimit() in C) settings can protect against memory overuse and fork bombs, as well as put a stop to CPU hogs. nice(1) can lower the priority of those programs so that the computer can still be used for any tasks deemed more important.
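For illustration, a minimal sketch of such a wrapper (the limit values and the ./a.out path are made up): it sets the limits with setrlimit(), lowers the priority with nice(), then execs the untrusted program, which inherits all of it.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <sys/resource.h>

static void set_limit(int resource, rlim_t value)
{
    struct rlimit rl = { .rlim_cur = value, .rlim_max = value };
    if (setrlimit(resource, &rl) != 0) {
        perror("setrlimit");
        exit(1);
    }
}

int main(void)
{
    set_limit(RLIMIT_AS,    256UL * 1024 * 1024);  /* 256 MiB of address space */
    set_limit(RLIMIT_CPU,   10);                   /* 10 seconds of CPU time */
    set_limit(RLIMIT_NPROC, 16);                   /* cap this user's processes, so a fork bomb fizzles */
    set_limit(RLIMIT_FSIZE, 1024 * 1024);          /* 1 MiB per written file */

    errno = 0;
    if (nice(10) == -1 && errno != 0)              /* lower the priority */
        perror("nice");

    execl("./a.out", "a.out", (char *)NULL);       /* the untrusted program */
    perror("execl");
    return 1;
}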

I wrote an overview of sandboxing techniques in Linux recently. I think your easiest approach would be to use Linux containers (lxc), if you don't mind forking and so on, which doesn't really matter in this environment. You can give the process a read-only root file system, an isolated loopback network connection, and you can still kill it easily, set memory limits, etc.
Strict-mode seccomp is going to be a bit difficult, as the sandboxed code cannot even allocate memory.
SELinux is the other option, but I think it might be more work than a container.
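To make the seccomp remark concrete, here is a minimal sketch of strict-mode seccomp, the mode the "cannot even allocate memory" comment refers to: after the prctl() only read(), write(), _exit() and sigreturn() are permitted, and any other syscall (including brk/mmap, i.e. malloc) kills the process.

#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/seccomp.h>

int main(void)
{
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0) {
        perror("prctl");
        return 1;
    }

    /* read() and write() on already-open descriptors still work... */
    const char msg[] = "inside strict seccomp\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);

    /* ...but anything else would be fatal, so leave via the raw exit syscall
       (exit_group, which glibc's exit() uses, is not on the whitelist). */
    syscall(SYS_exit, 0);
    return 0;   /* not reached */
}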

Firejail is one of the most comprehensive tools to do that - it supports seccomp, filesystem containers, capabilities and more:
https://firejail.wordpress.com/features-3/

You can use QEMU to test assignments quickly. The procedure below takes less than 5 seconds on my 5-year-old laptop.
Let's assume the student has to develop a program that takes unsigned ints, each on their own line, until a line with "-1" arrives. The program should then average all the ints and output "Average: %f". Here's how you could test the program completely isolated:
First, get root.bin from Jslinux, we'll use that as the userland (it has the tcc C-compiler):
wget https://github.com/levskaya/jslinux-deobfuscated/raw/master/root.bin
We want to put the student's submission in root.bin, so set up the loop device:
sudo losetup /dev/loop0 root.bin
(you could use fuseext2 for this too, but it's not very stable. If it stabilizes, you won't need root for any of this)
Make an empty directory:
mkdir mountpoint
Mount root.bin:
sudo mount /dev/loop0 mountpoint
Enter the mounted filesystem:
cd mountpoint
Fix rights:
sudo chown -R `whoami` .
mkdir -p etc/init.d
vi etc/init.d/rcS:
#!/bin/sh
cd /root
echo READY > /dev/ttyS0 2>&1
tcc assignment.c > /dev/ttyS0 2>&1
./a.out > /dev/ttyS0 2>&1
chmod +x etc/init.d/rcS
Copy the submission to the VM:
cp ~/student_assignment.c root/assignment.c
Exit the VM's root FS:
cd ..
sudo umount mountpoint
Now the image is ready, we just need to run it. It will compile and run the submission after booting.
mkfifo /tmp/guest_output
Open a separate terminal and start listening for guest output:
dd if=/tmp/guest_output bs=1
In another terminal:
qemu-system-i386 -kernel vmlinuz-3.5.0-27-generic -initrd root.bin -monitor stdio -nographic -serial pipe:/tmp/guest_output
(I just used the Ubuntu kernel here, but many kernels will work)
When the guest output shows "READY", you can send keys to the VM from the qemu prompt.
For example, to test this assignment, you could do
(qemu) sendkey 1
(qemu) sendkey 4
(qemu) sendkey ret
(qemu) sendkey 1
(qemu) sendkey 0
(qemu) sendkey ret
(qemu) sendkey minus
(qemu) sendkey 1
(qemu) sendkey ret
Now "Average: 12.000000" should appear on the guest output pipe. If it doesn't, the student failed.
Quit qemu: quit
A program passing the test is here: https://stackoverflow.com/a/14424295/309483. Just use tcclib.h instead of stdio.h.
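For reference, a sketch of a submission that should satisfy the assignment as described above; inside the JSLinux image you would include tcclib.h instead of stdio.h, as noted.

#include <stdio.h>

int main(void)
{
    long value, sum = 0, count = 0;

    /* Read integers one per line until -1 arrives, then print the average. */
    while (scanf("%ld", &value) == 1 && value != -1) {
        sum += value;
        count++;
    }
    if (count > 0)
        printf("Average: %f\n", (double)sum / count);
    return 0;
}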

Try User-mode Linux. It has about 1% performance overhead for CPU-intensive jobs, but it may be 6 times slower for I/O-intensive jobs.

Running it inside a virtual machine should offer you all the security and restrictions you want.
QEMU would be a good fit for that and all the work (downloading the application, updating the disk image, starting QEMU, running the application inside it, and saving the output for later retrieval) could be scripted for automated tests runs.

For sandboxing based on ptrace (as strace uses), check out:
the "sydbox" sandbox and the "pinktrace" programming library (it's C99, but there are bindings for Python and Ruby as far as I know).
Collected links related to the topic:
http://www.diigo.com/user/wierzowiecki/sydbox
(Sorry these are not direct links, but I don't have enough reputation points yet.)

seccomp and seccomp-bpf accomplish this with the least effort: https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt
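As a hedged sketch of what a seccomp-bpf filter looks like (the syscall whitelist here is arbitrary, and a real filter should also check seccomp_data.arch):

#include <stdio.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

int main(void)
{
    struct sock_filter filter[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* Allow write, exit_group and brk; kill the process on anything else. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_write,      3, 0),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit_group, 2, 0),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_brk,        1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len    = sizeof(filter) / sizeof(filter[0]),
        .filter = filter,
    };

    /* Needed so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) { perror("no_new_privs"); return 1; }
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0) { perror("seccomp"); return 1; }

    /* From here on only the whitelisted syscalls succeed. */
    const char msg[] = "still alive inside the filter\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    syscall(SYS_exit_group, 0);
    return 0;   /* not reached */
}

The libseccomp library (seccomp_init()/seccomp_rule_add()/seccomp_load()) wraps the same mechanism in a less error-prone API.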

OK, thanks for all the answers; they helped me a lot. But I would suggest none of them as a solution for the person who asked the original question. All the tools mentioned require too much work for the purpose of testing students' code as a teacher, tutor or professor. The best way in this case would, in my opinion, be VirtualBox. Granted, it emulates a complete x86 system and has little to do with sandboxing in the strict sense, but if I imagine my programming teacher, it would be the best option for him. So "apt-get install virtualbox" on Debian-based systems; everyone else, head over to http://virtualbox.org/, create a VM, add an ISO, click install, wait a while and be happy. It will be much easier than setting up User-mode Linux or doing some heavy strace work...
And if you are afraid of your students hacking you, I guess you have an authority problem, and one solution would be to threaten to sue the living daylights out of them if you can prove even one bit of malware in the work they hand in...
Also, if there is a class and even 1% of it is good enough to pull off such things, don't bore them with such simple tasks; give them bigger ones where they have to write more code. Integrative learning is best for everyone, so don't rely on old, deadlocked structures...
And of course, never use the same computer for important things (like writing attestations and exams) that you use for things like browsing the web and testing software.
Use an offline computer for important things and an online computer for everything else.
However, to everyone else who isn't a paranoid teacher (no offence intended; I'm just of the opinion that you should learn the basics about security and our society before you start teaching programmers...)
... where was I ... for everyone else:
happy hacking!!

Related

RE: Modifying bluetooth scan parameters via btmgmt

I have the same problem as #Hias in this topic:
https://unix.stackexchange.com/questions/420978
The most interesting answer is:
"modify bluetooth kernel module, set any discoveryFilter using Bluez (i.e.: RSSI -127)."
The answer is not clear to me.
Where is the "bluetooth kernel module"?
That is, what is the directory?
In the output of this command:
sudo btmgmt --index 1 find
Between "hci1 type 7 discovering on" and "hci1 type 7 discovering off" there is a time of 11 seconds (I counted in mind).
How to change this time?
If the --timer parameter is 5, for example, I must wait another six seconds before executing the command again, otherwise it gives me a "busy" output: 5+6=11.
For business needs I have to extend the scan time through the btmgmt command (or its configuration file, if one exists) and not through workarounds; the watch command is too draining on the Raspberry Pi and does not meet my goals.
The answer is not clear to me.
Where is the "bluetooth kernel module"?
That is, what is the directory?
You probably don't want to modify the kernel module. Unless you know C very well and have built kernel modules before, I would advise against it. Plus, modifying the kernel module would leave your program in an unusable state on anyone's computer unless they also patched their kernel the same way you did. That said, if you absolutely positively think that's the way to go, I'm assuming you're on raspbian, you ought to be able to just apt install linux-sources and then you'd go to the /usr/src/ directory and unpack the kernel from the tar archive found there. Then once you're finished editing the kernel you could compile a new kernel. Have a look over at the gentoo docs or linux-from-scratch pages for quick and easy ways to compile a kernel or a kernel module. Again, I would completely scrap the idea of messing with the kernel, but that's me.
For what you want to do with btmgmt, there is a Python wrapper for btmgmt which would give you fine-grained control over how you use that specific tool. With Python at your disposal, even if the btmgmt tool doesn't offer a specific feature that you are looking for, or you can't figure out how a specific feature works, you could build it into your script yourself.

Why many Linux distros use setuid instead of capabilities?

Capabilities(7) are a great way to avoid giving a process full root privileges, and AFAIK they can be used instead of setuid(2). According to this and many other sources,
"Unfortunately, still many binaries have the setuid bit set, while they should be replaced with capabilities instead."
As a simple example, on Ubuntu,
$ ls -l `which ping`
-rwsr-xr-x 1 root root 44168 May 8 2014 /bin/ping
As you know, setting suid/sgid on a file changes the effective user ID to root. So if the suid-enabled program contains a flaw, a non-privileged user can break out and become the equivalent of the root user.
My question is: why do many Linux distributions still use the setuid method when capabilities could be used instead, with fewer security concerns?
This may not give the reason why some dudes somewhere decided one way or another, but some auditing tools and interfaces may not yet know about capabilities.
An example is the proc_connector netlink interface and the programs based on it (like forkstat): there are events for a process changing its credentials, but not for it changing its capabilities.
FWIW, the reason why you may not get e.g. a net_raw+ep ping(8) instead of a setuid one on a Debian-like distro is that it depends on the setcap(8) utility from the libcap2-bin package already being installed before you install ping. From iputils-ping.postinst:
if command -v setcap > /dev/null; then
    if setcap cap_net_raw+ep /bin/ping; then
        chmod u-s /bin/ping
    else
        echo "Setcap failed on /bin/ping, falling back to setuid" >&2
        chmod u+s /bin/ping
    fi
else
    echo "Setcap is not installed, falling back to setuid" >&2
    chmod u+s /bin/ping
fi
Also notice that ping itself will drop any setuid privileges and switch to using capabilities on Linux upon starting, so your concerns about it may be a bit exaggerated. From ping.c:
int
main(int argc, char **argv)
{
    struct addrinfo hints = { .ai_family = AF_UNSPEC, .ai_protocol = IPPROTO_UDP,
                              .ai_socktype = SOCK_DGRAM, .ai_flags = getaddrinfo_flags };
    struct addrinfo *result, *ai;
    int status;
    int ch;
    socket_st sock4 = { .fd = -1 };
    socket_st sock6 = { .fd = -1 };
    char *target;

    limit_capabilities();
From ping_common.c:
void limit_capabilities(void)
{
    ...
    if (setuid(getuid()) < 0) {
        perror("setuid");
        exit(-1);
    }
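For comparison, a minimal sketch (my own, not iputils code) of the "keep only what you need" style of privilege dropping using libcap; link with -lcap. It assumes the process started with CAP_NET_RAW available, e.g. via setuid-root or file capabilities.

#include <stdio.h>
#include <sys/capability.h>

int main(void)
{
    cap_t caps = cap_init();                /* an empty capability set */
    cap_value_t keep[] = { CAP_NET_RAW };

    if (caps == NULL) { perror("cap_init"); return 1; }

    /* Raise CAP_NET_RAW in the permitted and effective sets, nothing else. */
    if (cap_set_flag(caps, CAP_PERMITTED, 1, keep, CAP_SET) != 0 ||
        cap_set_flag(caps, CAP_EFFECTIVE, 1, keep, CAP_SET) != 0) {
        perror("cap_set_flag");
        return 1;
    }
    if (cap_set_proc(caps) != 0) {          /* apply to this process */
        perror("cap_set_proc");
        return 1;
    }
    cap_free(caps);

    /* The process now holds CAP_NET_RAW and nothing else in its permitted
       and effective sets; everything else that setuid-root would have
       granted is gone. */
    return 0;
}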
Consider a different planet where capabilities patches were rejected, the submitter had to try harder to make a neighborly improvement, and came up with this:
filesystems can be mounted suid, nosuid, or capsuid.
if mounted capsuid, setuid bits on ${file} don't count unless .${file}.suid_capability also exists, is setuid, and is either 0-length or parseable in some standardized format:
-r-sr-xr-x. 1 carton carton 3812 Dec 27 15:39 t1
-r-Sr--r--. 1 carton carton 0 Dec 27 15:39 .t1.suid_capability
if the file is empty, setuid bit works as normal. If it has parseable contents, it works as partial-root capabilities.
if desired the .t1.suid_capability files can be annotated with fields for:
binding to a hash of their matching 't1' file
binding to an absolute path, so for example a setuid file within a chroot does not become "hot" until you chroot into it.
binding to a public key or hash nonce loaded at boot identifying the installed system like a uuid. If the private key or nonce is hidden from a backup server, traditional password resets would still work but some forms of rootkit persistence would become more difficult. It also lets you work with images of other systems in subdirectories, automatically making them effectively "nosuid"-mounted even if you are undisciplined about mount options and elect not to use the absolute path feature.
Now answer some questions about this planet relative to our own:
any missing functionality compared to our native planet?
which system works better with 'ls', 'tar', 'rsync', 'cpio', 'find', 'pax', 'mtree'? with gentoo's ebuild sandbox? with LXC?
which system works better with diskless or guest systems running on NFS or 9p roots?
which system works better with tripwire?
Now repeat those questions considering three systems instead of two:
our planet: mysterious capabilities metadata in filesystems
alternate planet using civilized Unix approach
mosvy's 'ping' example, where the binary drops privileges right after it starts
With the third system we are finally starting to lose features relative to the status quo: someone could write a hypothetical capabilities-aware tripwire that works less well in the 'ping' example. Has anyone done that, though? Anyone outside the NSA? Have the people forcing this complexity on us done the work to deliver (relatively small) value from the complexity? And if that's the feature-hill we want to die on (as opposed to reducing attack surface), is there maybe a better way to get it, like dm-verity, that once again moots the complexity?
Now let's make a slight refinement to mosvy's 'ping':
crt.o or some early runtime startup code reads the sidecar file and /etc/suid_capability_hash_nonce. It drops suid if /etc/suid_capability_hash_nonce exists but the sidecar file does not (in "alternate planet" terms, the existence of /etc/suid_capability_hash_nonce makes the whole system '-o capsuid'-mounted). The early runtime startup code handles parsing the sidecars and dropping to the enumerated capabilities so that logic doesn't have to be coded into main.c one-by-one.
Honestly, I think the 'ping' example is superior because programs know what capabilities they need, and the need changes only when the program's source code changes. By setting them in a drop_privileges()-style function the need will never get out of sync. Fanned-out distributions won't have to scramble to update capability sets.
If you disagree and want the sidecar/capabilities-style system, something equivalent could have been implemented without tampering with kernel, mount options, bootup, any of it: old dracut-nfsroot initrds would keep working. It is just a matter of style.
This is all a Socratic way of saying capabilities offers nothing beyond dm-verity + 'ping'-style privileges-dropping. There is nothing "unfortunate" about eschewing complexity that's not pulling its weight.
Every few years CADT developers invent some new form of metadata to cram into filesystem directories: Finder metadata, POSIX ACLs, SELinux contexts, NFS ACLs. What will they think of next? How long will it take to update all the filesystems, all the network protocols, all the fileutils tools, all the non-Linux storage operating systems, all the fancy licensed "enterprise" tape backup suites? Will anyone ever get around to updating all that stuff, or will we just limp along with substandard tools? Will even the core feature be documented usefully, or will it be the undocumented plaything of a few annoying "distributions"? Will the feature work most of the time, but stop systems from booting unexpectedly and create chicken-and-egg recovery problems? Will the feature actually stop any attacks?
Is it fair to all the people who have to clean up after them and tolerate complexity in their own systems that made different, simpler choices, just to unbreak interop that their new feature broke?
The value delivered is particularly low in this case, but we have enough experience to set the value bar high on this type of proposal. I think it was a mistake to accept the feature into Linux and applaud any distribution that avoids it.
Capability support was really enabled in 2008 when file capabilities were added to the kernel. So, that's 13 years and counting to replace setuid. Your question is very valid.
If capability support wasn't implemented in a backwardly compatible way in Linux, it would either have been rejected at birth, or things would have surely changed by now! I suspect it all comes down to the fact that there is no economic incentive to adopt them when their benefit is only evident temporarily when a bug in some code is found to be exploitable.
I think that people are resigned to all the ways that setuid binaries seem to be exploitable and the point fixes that people quickly roll out when another vulnerability surfaces. Buffer overflow -> launch shell -> root exploit -> code fix -> start over. They are comfortable with the idea that user identity = privilege (ie., root). This spills into how people want to describe the futility of breaking down all powerful root into independent capabilities, when a chain of exploits can yield all privilege. Clearly, the pervasive idea that privilege and identity should be equivalent is why Ambient capabilities even exist.
Capabilities, however, when used as they were intended to be used are not identity = privilege. They are capable binaries = privilege - where the privilege is reduced relative to a user identity executing arbitrary code, by the combination of those capability bits and the code in the actual program that has them.
If you write code that edits files, it is clear it shouldn't need to worry about being directly abused for raw ethernet packet formation, or loading kernel modules. Exploiting a bug in that code may well allow a malicious edit of a file but, unlike setuid, it won't allow strange packets to be sent on the network. At least not without tricking the code of some independently capable executable into doing it by proxy.
However, no one intentionally writes buggy code, so I think the answer to your question comes down to another question: "where exactly is the benefit of limiting how exploitable an exploit is?".
I suspect that if some distribution were to figure out how to restructure their code base to eliminate setuid-root binaries, in favor of file capabilities, it would fare better (certainly no worse) over time as code exploits are found vs. other distributions clinging to setuid-root. But, until such a distribution comes into being, I can't fault the idea that that is just an opinion.

Stripping down a kernel in linux?

I recently read a post (admittedly it's a few years old) and it was advice for a fast number-crunching program:
"Use something like Gentoo Linux with 64 bit processors as you can compile it natively as you install. This will allow you to get the maximum punch out of the machine as you can strip the kernel right down to only what you need."
Can anyone elaborate on what they mean by stripping down the kernel? Also, as this post is about 6 years old, which current version of Linux would be best for this (to aid my Google searches)?
There is some truth in the statement, as well as something somewhat nonsensical.
You do not spend resources on processes you are not running. So as a first instance I would try minimise the number of processes running. For that we quite enjoy Ubuntu server iso images at work -- if you install from those, log in and run ps or pstree you see a thing of beauty: six or seven processes. Nothing more. That is good.
That the kernel is big (in terms of source size or installation) does not matter per se. Much of that size stems from drivers you may not be using anyway. And the same rule applies again: what you do not run does not compete for resources.
So think about a headless server, stripped down -- rather than your average desktop installation with more than a screenful of processes trying to make the life of a desktop user easier.
You can create a custom linux kernel for any distribution.
Start by going to kernel.org and downloading the latest source. Then choose your configuration interface (these days you have the choice of plain-text 'config', ncurses-style 'menuconfig', KDE-style 'xconfig' and GNOME-style 'gconfig') and run make whateverconfig. After choosing all the options, type make to build your kernel, then make modules to compile all the selected modules for it. make install will copy the files to your /boot directory, and make modules_install copies the modules. Next, go to /boot and use mkinitrd to create the RAM disk needed to boot properly, if needed. Finally, add the kernel to your GRUB menu.lst by copying the latest entry and adding a similar one pointing to the new kernel version.
Of course, that's a basic overview and you should probably search for 'linux kernel compile' to find more detailed info. Selecting the necessary kernel modules and options takes a bit of experience - if you choose the wrong options, the kernel might not be bootable and you'll have to start over, which is a pain because selecting the options and compiling the kernel can take 15-30 minutes.
Ultimately, it isn't going to make a large difference to compile a stripped-down custom kernel unless your given task is very, very performance sensitive. It makes sense to remove things you're never going to use from the kernel, though, like say ISDN support.
I'd have to say this question is more suited to SuperUser.com, by the way, as it's not quite about programming.

How to "hibernate" a process in Linux by storing its memory to disk and restoring it later?

Is it possible to 'hibernate' a process in Linux?
Just like 'hibernate' on a laptop, I would like to write all the memory used by a process to disk and free up the RAM. Then later on, I could 'resume' the process, i.e. read all the data from disk back into RAM, and continue with my process.
I used to maintain CryoPID, which is a program that does exactly what you are talking about. It writes the contents of a program's address space, VDSO, file descriptor references and states to a file that can later be reconstructed. CryoPID started when there were no usable hooks in Linux itself and worked entirely from userspace (actually, it still does work, depending on your distro / kernel / security settings).
Problems were (indeed) sockets, pending RT signals, numerous X11 issues, and the glibc caching getpid() implementation, amongst many others. Randomization (especially of the VDSO) turned out to be insurmountable for the few of us working on it after Bernard walked away from it. However, it was fun and became the topic of several master's theses.
If you are just contemplating a program that can save its running state and re-start directly into that state, it's far, far easier to just save that information from within the program itself, perhaps when servicing a signal.
I'd like to put a status update here, as of 2014.
The accepted answer suggests CryoPID as a tool to perform checkpoint/restore, but I found the project to be unmaintained and impossible to compile with recent kernels.
Now, I have found two actively maintained projects providing the application checkpointing feature.
The first, the one I suggest because I had better luck running it, is CRIU,
which performs checkpoint/restore mainly in userspace and requires the kernel option CONFIG_CHECKPOINT_RESTORE enabled to work.
Checkpoint/Restore In Userspace, or CRIU (pronounced kree-oo, IPA: /krɪʊ/, Russian: криу), is a software tool for Linux operating system. Using this tool, you can freeze a running application (or part of it) and checkpoint it to a hard drive as a collection of files. You can then use the files to restore and run the application from the point it was frozen at. The distinctive feature of the CRIU project is that it is mainly implemented in user space.
The second is DMTCP; quoting from their main page:
DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications.
There is also a nice Wikipedia page on the topic: Application_checkpointing
The answers mentioning ctrl-z are really talking about stopping the process with a signal, in this case SIGTSTP. You can issue a stop signal with kill:
kill -STOP <pid>
That will suspend execution of the process. It won't immediately free the memory used by it, but as memory is required for other processes the memory used by the stopped process will be gradually swapped out.
When you want to wake it up again, use
kill -CONT <pid>
The more complicated solutions, like CryoPID, are really only needed if you want the stopped process to be able to survive a system shutdown/restart - it doesn't sound like you need that.
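The same can be done programmatically; a small sketch that stops a process given on the command line, waits a while, and resumes it:

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid_t pid = (pid_t)atoi(argv[1]);

    if (kill(pid, SIGSTOP) != 0) { perror("kill(SIGSTOP)"); return 1; }
    printf("%d stopped; its pages can now be swapped out under memory pressure\n", (int)pid);

    sleep(30);   /* ...or however long you like... */

    if (kill(pid, SIGCONT) != 0) { perror("kill(SIGCONT)"); return 1; }
    printf("%d resumed\n", (int)pid);
    return 0;
}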
The Linux kernel has now partially implemented the checkpoint/restart feature: https://ckpt.wiki.kernel.org/ (the current status is described there).
Some useful information is on LWN (Linux Weekly News):
http://lwn.net/Articles/375855/ and http://lwn.net/Articles/412749/
So the answer is "YES"
The issue is restoring the streams - files and sockets - that the program has open.
When your whole OS hibernates, local files and such can obviously be restored. Network connections don't survive, but code that accesses the network typically does more error checking and survives error conditions (or ought to).
If you did per-program hibernation (without application support), how would you handle open files? What if another process accesses those files in the interim? etc?
Maintaining state when the program is not loaded is going to be difficult.
Simply suspending the threads and letting it get swapped to disk would have much the same effect?
Or run the program in a virtual machine and let the VM handle suspension.
Short answer is "yes, but not always reliably". Check out CryoPID:
http://cryopid.berlios.de/
Open files will indeed be the most common problem. CryoPID states explicitly:
Open files and offsets are restored. Temporary files that have been unlinked and are not accessible on the filesystem are always saved in the image. Other files that do not exist on resume are not yet restored. Support for saving file contents for such situations is planned.
The same issues will also affect TCP connections, though CryoPID supports tcpcp for connection resuming.
I extended CryoPID, producing a package called Cryopid2, available from SourceForge. This can migrate a process as well as hibernating it (along with any open files and sockets - data in sockets/pipes is sucked into the process on hibernation and spat back into them when the process is restarted).
The reason I have not been active with this project is that I am not a kernel developer - both this (and/or the original CryoPID) need to get someone on board who can get them running with the latest kernels (e.g. Linux 3.x).
The CryoPID method does work - and is probably the best solution to general-purpose process hibernation/migration in Linux I have come across.
The short answer is "yes." You might start by looking at this for some ideas: ELF executable reconstruction from a core image (http://vx.netlux.org/lib/vsc03.html)
As others have noted, it's difficult for the OS to provide this functionality, because the application needs to have some error checking built in to handle broken streams.
However, on a side note, some programming languages and tools that use virtual machines explicitly support this functionality, such as the Self programming language.
This is sort of the ultimate goal of a clustered operating system. Matthew Dillon put a lot of effort into implementing something like this in his DragonFly BSD project.
Adding another workaround: you can use VirtualBox. Run your applications in a regular virtual machine and simply "save the machine state" whenever you want.
I know this is not quite an answer, but I thought it could be useful when there are no real options.
If for any reason you don't like VirtualBox, VMware and QEMU are just as good.
Ctrl-Z increases the chances the process's pages will be swapped, but it doesn't free the process's resources completely. The problem with freeing a process's resources completely is that things like file handles and sockets are kernel resources the process gets to use but doesn't know how to persist on its own. So Ctrl-Z is as good as it gets.
There was some research on checkpoint/restore for Linux back in the 2.2 and 2.4 days, but it never made it past the prototype stage. It is possible (with the caveats described in the other answers) for certain values of possible - if you can write a kernel module to do it, it is possible. But for the common value of possible (can I do it from the shell on a commercial Linux distribution), it is not yet possible.
There's Ctrl+Z in Linux, but I'm not sure it offers the features you specified. I suspect you asked this question because it doesn't.

Write my own 'everything is a file' interface

I would like to expose the settings and statistics of my program in an 'everything is a file' manner - sort of how /proc/ and /sys/ work.
As an example, imagine for a moment that apache2 had this type of interface. You would then be able to do something like this (hypothetical):
cd /apache2/virtual_hosts
mkdir 172.20.30.50
cd 172.20.30.50
echo '/www/example1' > DocumentRoot
echo 'www.example1.com' > ServerName
echo 1 > control/enabled
cat control/status
enabled true
uptime 4080
hits 0
Now, are there any tutorials or similar on how to do this? I'm mainly looking for techniques for 'pretending to be a file or dir'. I'm on Linux; POSIX or another more portable method would be preferable, but is not required.
On Linux, have a look at Fuse: implement a fully functional filesystem in a userspace program.
Simple library API
Simple installation (no need to patch or recompile the kernel)
Secure implementation
Userspace - kernel interface is very efficient
Usable by non privileged users
Runs on Linux kernels 2.4.X and 2.6.X
Has proven very stable over time
Look at compatible platforms here. In terms of tutorials, one of the good ones I've come across is here.
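To give a feel for it, here is a minimal sketch using the older FUSE 2.x high-level API: a filesystem exposing a single read-only file, /status, whose contents are generated on the fly (the file name and contents are made up to echo the example in the question). Build with gcc statusfs.c `pkg-config fuse --cflags --libs` -o statusfs and run ./statusfs <mountpoint>.

#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>

static const char *status_path = "/status";

static const char *status_text(void)
{
    /* In a real program this would be generated from live state. */
    return "enabled true\nuptime 4080\nhits 0\n";
}

static int statusfs_getattr(const char *path, struct stat *st)
{
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, status_path) == 0) {
        st->st_mode = S_IFREG | 0444;
        st->st_nlink = 1;
        st->st_size = strlen(status_text());
    } else {
        return -ENOENT;
    }
    return 0;
}

static int statusfs_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
                            off_t offset, struct fuse_file_info *fi)
{
    (void)offset; (void)fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    fill(buf, ".", NULL, 0);
    fill(buf, "..", NULL, 0);
    fill(buf, status_path + 1, NULL, 0);   /* "status" */
    return 0;
}

static int statusfs_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, status_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

static int statusfs_read(const char *path, char *buf, size_t size,
                         off_t offset, struct fuse_file_info *fi)
{
    const char *text = status_text();
    size_t len = strlen(text);
    (void)fi;
    if (strcmp(path, status_path) != 0)
        return -ENOENT;
    if ((size_t)offset >= len)
        return 0;
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, text + offset, size);
    return size;
}

static struct fuse_operations statusfs_ops = {
    .getattr = statusfs_getattr,
    .readdir = statusfs_readdir,
    .open    = statusfs_open,
    .read    = statusfs_read,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &statusfs_ops, NULL);
}

Writable settings files would add write/truncate handlers in the same style.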
In addition to FUSE, another solution is to export a 9p filesystem. wmii does this, for example.
Perhaps the way to do this is just use "real" files and use a change-notification library (inotify preferred) to detect when they change and update your behaviour accordingly.
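A minimal sketch of that approach: watch a settings directory with inotify and react when a file in it is written (the directory name is illustrative).

#include <stdio.h>
#include <unistd.h>
#include <sys/inotify.h>

int main(void)
{
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    int fd = inotify_init();
    if (fd == -1) { perror("inotify_init"); return 1; }

    /* Watch a hypothetical settings directory for writes/creates/deletes. */
    int wd = inotify_add_watch(fd, "/tmp/myapp-settings",
                               IN_CLOSE_WRITE | IN_CREATE | IN_DELETE);
    if (wd == -1) { perror("inotify_add_watch"); return 1; }

    for (;;) {
        ssize_t len = read(fd, buf, sizeof(buf));
        if (len <= 0) { perror("read"); return 1; }

        for (char *p = buf; p < buf + len; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->len > 0)
                printf("settings entry changed: %s\n", ev->name);
            /* ...re-read that file and update the program's behaviour here... */
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
}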
/proc and /sys are for kernel-to-user communication and not really intended for userspace programs' IPC - you're expected to use named pipes, sockets, shared memory, etc. for that.
(ab)using FUSE is not really a nice idea in this case, I think.
