Golang, calling a new system call in Linux

I have a Linux kernel with a custom system call. In C, I can use the standard C library syscall() to call a system call by its number. How can I call this new system call in Go?
In C, on Linux, there are also macros that can emit the needed inline assembly to make a system call directly.
I would hate to have to hack syscall_linux.go.
I see that in Go, syscall_linux.go is processed by a Perl script (mksyscall.pl) to generate the system call stubs. That is also pretty complicated, and hacking it to generate a new stub seems needlessly messy.

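The syscall package exposes syscall.Syscall and syscall.Syscall6, which take the system call number followed by three or six uintptr arguments, so you can invoke the new call by number without touching syscall_linux.go. For example, r1, _, errno := syscall.Syscall(nr, a1, a2, a3), where nr is your call's number, unused arguments are passed as 0, and a nonzero errno indicates failure.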

Related

Does executable file of C++ program contain object code of system calls also

We use Linux system calls like fork(), pthread(), signal() and so on in C or C++ programs and compile the program to generate an executable file (a.out). My question is whether a.out contains the object code of all the Linux system calls used, or whether the executable contains only calls to the system functions, which are then linked at run time. Suppose I move my a.out file to some other Linux operating system that implements system calls with a different syntax and try to compile it there: will it work?
In short: are the system call function definitions part of the a.out file?
User-space binaries don't contain the implementations of system calls. If they did, any user could inject arbitrary code into the kernel and take over the system.
Instead, a program switches to kernel mode by raising a software interrupt or executing a special instruction. The processor then executes the system call implementation inside the kernel.
A user-space library such as libc usually provides stubs that marshal the arguments of a system call into the form the kernel expects and trigger the switch to kernel mode. The library is usually linked dynamically, so these stubs don't appear in the executable file either.

How to invoke newly added system call by the function id without using syscall(__NR_mysyscall)

I am working with the Linux 3.9.3 kernel on Ubuntu 10.04. I have added a basic system call in the kernel directory of the linux-3.9.3 source tree. I am able to use it with syscall() by passing my new system call's number as an argument. But I want to invoke it directly by name, as with the getpid() or open() system calls. Can anyone help me add it to the GNU C library? I went through a few documents but did not get a clear idea of how to accomplish this.
Thanks!!!
Assuming you are on 64-bit x86-64 Linux, the relevant ABI is the x86-64 ABI. Read also the x86 calling conventions Wikipedia page, the Linux Assembly HOWTO, and syscalls(2).
System calls use a different convention from ordinary function calls: all arguments are passed in registers, and errors are signalled differently (Linux returns a negated errno code in the result register, while some other systems set the carry flag). Hence, you need a C wrapper to make your syscall available to C applications.
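To make that convention concrete, here is a minimal sketch that issues getpid with the x86-64 syscall instruction through GCC/Clang inline assembly. The helper names raw_syscall0 and raw_getpid are invented for this example, and on any other architecture the sketch simply defers to the libc syscall() function, so treat it as an illustration rather than a portable implementation.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* raw_syscall0 is a hypothetical helper name: issue a system call that
 * takes no arguments and return the kernel's raw result. */
static long raw_syscall0(long nr)
{
#if defined(__x86_64__)
    long ret;
    /* x86-64 Linux: the call number goes in rax and the result comes
     * back in rax. The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr)
                      : "rcx", "r11", "memory");
    return ret; /* a value in [-4095, -1] is a negated errno code */
#else
    /* Other architectures use different registers and instructions,
     * so this sketch falls back to libc there. */
    return syscall(nr);
#endif
}

/* getpid takes no arguments and cannot fail, which makes it a safe demo. */
long raw_getpid(void)
{
    return raw_syscall0(SYS_getpid);
}
```

Calling raw_getpid() should return the same value as the libc getpid() wrapper, which does the same register setup for you.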
You could look into the source code of existing C libraries, like GNU libc or musl libc, to see how their wrappers are written, and then put the wrapper for your syscall in a library of your own.
The musl libc source code is very readable; see e.g. its src/unistd/fsync.c.
I would suggest wrapping your new syscall in your own library without patching libc. Notice that some uncommon syscalls have their wrappers in a separate library, e.g. request_key(2) has its C wrapper in libkeyutils.
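A wrapper of that kind can be very small. The sketch below assumes a custom call that takes no arguments; the names mysyscall and NR_MYSYSCALL are placeholders, and the number defaults to SYS_getpid only so that the code also runs on an unmodified kernel. Replace it with the number your kernel actually assigns.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical: set this to the number assigned to your call in the
 * kernel's syscall table. SYS_getpid is only a stand-in so the sketch
 * runs on a stock kernel. */
#ifndef NR_MYSYSCALL
#define NR_MYSYSCALL SYS_getpid
#endif

/* Named wrapper, so callers write mysyscall() instead of
 * syscall(NR_MYSYSCALL). libc's syscall() loads the registers, enters
 * the kernel, and on failure returns -1 with errno set. */
long mysyscall(void)
{
    return syscall(NR_MYSYSCALL);
}
```

Compiled into a small static or shared library together with a header declaring long mysyscall(void), this gives applications a getpid()-style entry point without any change to glibc.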

Capture file system system calls on Linux platform

I want to capture all the system calls on a file system in great details. E.g. for write system call, I want to record the target file, number of bytes written and the offset that write occurs.
Currently, I want to implement such a logger with inotify. However, it cannot provide such details: for a write, for example, it reports neither the number of bytes written nor the offset.
An alternative is to use bbfs, implemented on FUSE. However, it introduces overhead when logging system calls and delays user operations to an intolerable degree.
Is there some library that can capture system calls on file system, just like ptrace when logging all system calls issued by a process?
There are many options for tracing in Linux. But this sounds like a pretty simple case. Have you investigated simply using the strace utility? It has lots of options that can control tracing granularity, will log arguments to almost all syscalls (including buffer contents if you want that) and exists and works basically everywhere without any setup beyond installing the package.
How about writing your own profiling tool using a wrapper? See GCC's -wrapper option:
-wrapper
Invoke all subcommands under a wrapper program. The name of the wrapper program and its parameters are passed as a comma separated list.

Learning x86 assembly on Mac/BSD: Kernel built-in functions? How to know arguments / order?

I have been playing around with yasm in an attempt to grasp a basic understanding of x86 assembly. From my tests, it seems you call functions from the kernel by setting the EAX register with the number of the function you want. Then, you push the function arguments onto the stack and issue a syscall (0x80) to execute the instruction. This is Mac OS X / BSD style, I know Linux uses registers to hold arguments instead of using the stack. Does this sound right? Is this the basic idea?
I am a little confused because where are the functions documented? How would I know what arguments, and in what order, to push them onto the stack? Should I look in syscall.h for the answers? It seems there would be a specific reference for supported kernel calls other than C headers.
Also, do standard C functions like printf() rely on the kernel's built-in functions for say, writing to stdout? In other words, does the C compiler know what the kernel functions are and is it trying to "figure out" how to take C code and translate it to kernel functions (which the assembler then translates to machine code)?
C code -> C compiler -> kernel calls / asm -> assembler -> machine binary
I'm sure these are really basic questions, but my understanding of everything that happens after the C compiler is rather muddy.
System Call Documentation
Make sure you have the Xcode Developer Tools installed for the UNIX manpages on Mac OS X, and then run man 2 intro on the command line. For a list of system calls, you can use syscall.h (which is useful for the system call numbers) or you can run man 2 syscalls. Then, to look up a specific system call, run man 2 syscall_name; e.g. for read, run man 2 read.
UNIX manpages are a historically significant documentation reference for UNIX systems. Pretty much any low-level POSIX function or system call is documented in them, as are most commands. Section 2 covers just system calls, so when you run man 2 pagename, you're asking for the manpage in the system calls section. Section 3 covers library functions, so you can run man 3 sprintf the next time you want to read about sprintf.
How C Libraries relate to System Calls
As for how C libraries implement their functionality, they usually build everything on top of system calls, especially in UNIX-like operating systems. malloc internally uses mmap() or brk() on a lot of platforms to get hold of the actual memory for your process, and I/O functions often use buffers together with read and write calls. If some other mechanism or library provides the needed functionality, they may choose to use that instead (e.g. some C libraries for DOS may use direct BIOS interrupts instead of calling only DOS interrupts, whereas C libraries for Windows might use Win32 API calls).
Often only a subset of the library functions will need system calls or underlying mechanisms to be implemented though, since the remainder can be written in terms of that subset.
To actually know what's going on with your specific implementation, you should investigate what's happening in a debugger (just keep stepping into all the function calls) or browse the source code of the C library you're using.
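As a small illustration of the malloc point above, the following sketch asks the kernel for anonymous memory with mmap(), which is roughly what an allocator does when it needs fresh pages. The names page_alloc and page_free are made up for this example; a real malloc adds bookkeeping, size classes, and reuse of freed blocks on top of calls like these.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: obtain len bytes of zero-filled memory straight
 * from the kernel. Returns NULL on failure. */
void *page_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: hand the mapping back to the kernel.
 * Returns 0 on success, -1 on failure. */
int page_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

Stepping into malloc in a debugger, as suggested above, is the way to see which of these calls your own C library actually makes.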
How your C code using C libraries relates to machine code
In your question you also suggested:
C code -> C compiler -> kernel calls / asm -> assembler -> machine binary
This is combining two very different concepts. Functions and function calls are supported at the machine code and assembly level, so your C code has a very direct mapping to machine code:
C code -> C compiler -> Assembler -> Linker -> Machine Binary
That is, the compiler translates your function calls in C to function calls in Assembly and system calls in C to system calls in Assembly.
However on most platforms, that machine code contains references to shared libraries and functions in those libraries, so your machine code might have a function that calls other functions from a shared library. The OS then loads that shared library's machine code (if it hasn't been loaded yet for something else) and then runs the machine code for the library function. Then if that library function calls system calls via interrupts, the kernel receives the system call request and does low-level operations directly with the hardware or the BIOS.
So in a protected mode OS, your machine code can be seen as doing the following:
         +<-----------------------------------+
         |                                    |
Function call to -> Other function calls -----+
              or -> System calls to -> Direct hardware access (inside kernel)
                                 or -> BIOS calls (inside kernel)
You can, of course, call system calls directly in your program as well, skipping the need for any libraries, but unless you're writing your own library, there's usually very little need to do this. If you want even lower-level access, you have to write kernel-level code such as drivers or kernel subsystems.
The recommended way is not to issue INT 0x80 yourself, but to use the wrapper functions from the C library. These are, of course, callable from assembly as well.
As for printf, it works like this:
printf behaves like fprintf(stdout, ...): it formats the text into the buffer of the FILE * stdout, and when that buffer is flushed the library calls write(1, ...) on file descriptor 1. write is a small wrapper function that loads the arguments into the proper registers and performs the kernel call.
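The last step of that chain can be sketched in a few lines. The function name print_unbuffered is invented for this example, and it skips everything stdio does before the kernel call (formatting, buffering, retrying partial writes), so it only shows the write(2) part.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: roughly what is left of a printf call once the
 * text has been formatted and the stdio buffer is flushed, namely a
 * single write(2) on file descriptor 1. Returns the number of bytes
 * written, or -1 with errno set on failure. */
ssize_t print_unbuffered(const char *s)
{
    return write(STDOUT_FILENO, s, strlen(s));
}
```

For instance, print_unbuffered("hello\n") hands six bytes to the kernel in one call, whereas printf("hello\n") may leave them in the stdout buffer until a flush.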

Linux kernel - add system call dynamically through module

Is there any way to add a system call dynamically, such as through a module? I have found places where I can override an existing system call from a module by changing the sys_call_table[] array so that my function is called instead of the native one while my module is installed, but can you do this with a new system call and a module?
No, sys_call_table is of fixed size:
const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = { ...
The best you can do, as you probably already discovered, is to intercept existing system calls.
Intercepting an existing system call (to have something done in the kernel) is not the right approach in some cases. What if, for example, your user-space driver needs to execute something in the kernel, send data to it, or read data from it?
For drivers, the usual way is the ioctl() call: it is a single system call, but it can dispatch to different kernel functions or driver modules depending on the parameters passed to it.
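From user space, that looks like the sketch below. A custom driver would define its own request codes (typically with the _IO, _IOR and _IOW macros) and handle them in its ioctl callback; since no such driver exists here, the example uses the standard FIONREAD request as a stand-in, and the function name pending_bytes is made up.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel how many unread bytes are queued
 * on fd. The request code (FIONREAD) selects which kernel routine the
 * single ioctl() system call ends up running. Returns -1 on failure. */
int pending_bytes(int fd)
{
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

Swapping FIONREAD for a driver-specific request code, and fd for a descriptor opened on the driver's device node, gives the pattern described above without adding or intercepting any system call.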
The above is for user-controlled kernel code execution.
For data passing, you can use procfs, or sysfs drivers to talk to the kernel.
PS: Intercepting a system call generally affects the entire OS, so you have to work out how to do it safely: what if another task is halfway through the system call when you modify or intercept the code?
