The mmap() docs mention the flag MAP_UNINITIALIZED, but the flag doesn't seem to be defined.
Tried on CentOS 7 and Xenial; neither distro has the flag defined in sys/mman.h as alleged.
Astonishingly, the internet doesn't seem to be aware of this. What's the story?
Edit: I understand from the docs that the flag is only honoured on embedded or low-security devices, but that doesn't mean the flag shouldn't be defined... How do you use it in portable code? Google has revealed code where it is defined as 0 in cases where not supported, except in my cases it's not defined at all.
In order to understand what to do about the fact that #include <sys/mman.h> does not define MAP_UNINITIALIZED, it is helpful to understand how the interface to the kernel is defined.
To build a kernel module, you will need the kernel headers used to build the kernel for the exact version of the kernel for which you wish to build the module. As you wish to run in userspace, you won't need these.
The headers that define the kernel API for userspace are largely in /usr/include/linux and /usr/include/asm (see this for how they are generated). One of the more important consumers of these headers is the C standard library, e.g., glibc, which must be built against some version of these headers. Since the linux kernel API is backwards compatible, you may have a glibc (or other library implementation) built against an older version of these headers than the kernel you are running. I'm by no means an expert on how all the various distros distribute glibc, but it is my impression that the kernel headers defining its userspace API are generally the version that glibc has been built against.
Finally, glibc defines its API through headers also installed under /usr/include such as /usr/include/sys. I don't know exactly what, if any, backward or forward compatibility is provided for applications built with older or newer glibc headers, but I'm guessing that the library .so version number gets bumped when backward compatibility would be broken.
So now we can understand your problem to be that the glibc headers don't actually define MAP_UNINITIALIZED for the distros/versions that you tried.
However, the linux kernel API has exposed MAP_UNINITIALIZED, as this patch demonstrates. If the glibc headers don't define it for you, you can use the linux kernel API headers and #include <linux/mman.h> if this defines it. Note that you will still need to #include <sys/mman.h> in order to get the prototype for mmap, among other things.
If your linux kernel API headers don't define MAP_UNINITIALIZED but you have a kernel version that implements it, you can define it yourself:
#define MAP_UNINITIALIZED 0x4000000
You don't have to worry that you are effectively using "newer" headers than your glibc was built with, because the glibc implementation of mmap is very thin:
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <sysdep.h>

#ifndef MMAP_PAGE_SHIFT
#define MMAP_PAGE_SHIFT 12
#endif

__ptr_t
__mmap (__ptr_t addr, size_t len, int prot, int flags, int fd, off_t offset)
{
  if (offset & ((1 << MMAP_PAGE_SHIFT) - 1))
    {
      __set_errno (EINVAL);
      return MAP_FAILED;
    }
  return (__ptr_t) INLINE_SYSCALL (mmap2, 6, addr, len, prot, flags, fd,
                                   offset >> MMAP_PAGE_SHIFT);
}

weak_alias (__mmap, mmap)
It is just passing your flags straight through to the kernel.
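The fallback described above can be sketched as follows. This is only an illustration: the helper name map_scratch is made up for this example, and the value 0x4000000 is the one quoted earlier in this answer. On a kernel built without CONFIG_MMAP_ALLOW_UNINITIALIZED the flag should simply have no effect and the pages come back zeroed as usual.

```c
#include <stddef.h>
#include <sys/mman.h>   /* mmap() prototype, MAP_FAILED, the usual MAP_* flags */
#include <linux/mman.h> /* kernel API header; may or may not define MAP_UNINITIALIZED */

/* Fallback for headers that do not provide the flag (value quoted in the answer). */
#ifndef MAP_UNINITIALIZED
#define MAP_UNINITIALIZED 0x4000000
#endif

/* Hypothetical helper: anonymous private mapping that asks the kernel to
 * skip zeroing. Returns NULL on failure instead of MAP_FAILED. */
void *map_scratch(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_UNINITIALIZED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would use map_scratch(4096) like any other mmap result and release it with munmap. Code that must not see stale data should still initialise the buffer itself, since whether the flag is honoured depends on the kernel configuration.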
The kernel normally needs to clear the memory, to protect the privacy of both kernel space and other process' memory.
Continue reading:
This flag is honored only if the kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED option. Because of the security implications, that option is normally enabled only on embedded devices (i.e., devices where one has complete control of the contents of user memory).
Short version of question: What parameter do I need to pass to the clone system call on x86_64 Linux system if I want to allocate a new TLS area for the thread that I am creating.
Long version:
I am working on a research project and for something I am experimenting with I want to create threads using the clone system call instead of using pthread_create. However, I also want to be able to use thread local storage. I don't plan on creating many threads right now, so it would be fine for me to create a new TLS area for each thread that I create with the clone system call.
I was looking at the man page for clone and it has the following information about the flag for the TLS parameter:
CLONE_SETTLS (since Linux 2.5.32)
The newtls argument is the new TLS (Thread Local Storage) descriptor.
(See set_thread_area(2).)
So I looked at the man page for set_thread_area and noticed the following which looked promising:
When set_thread_area() is passed an entry_number of -1, it uses a
free TLS entry. If set_thread_area() finds a free TLS entry, the value of
u_info->entry_number is set upon return to show which entry was changed.
However, after experimenting with this some, it appears that set_thread_area is not implemented on my system (Ubuntu 10.04 on an x86_64 platform). When I run the following code I get an error that says: set_thread_area() failed: Function not implemented
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <linux/unistd.h>
#include <asm/ldt.h>

int main()
{
    struct user_desc u_info;
    u_info.entry_number = -1;
    int rc = syscall(SYS_set_thread_area, &u_info);
    if (rc < 0) {
        perror("set_thread_area() failed");
        exit(-1);
    }
    printf("entry_number is %d", u_info.entry_number);
}
I also saw that when I use strace to see what happens when pthread_create is called, I don't see any calls to set_thread_area. I have also been looking at the nptl pthread source code to try to understand what they do when creating threads. But I don't completely understand it yet, and I think it is more complex than what I'm trying to do since I don't need something that is as robust as the pthread implementation. I'm assuming that the set_thread_area system call is for x86 and that there is a different mechanism used for x86_64. But for the moment I have not been able to figure out what it is, so I'm hoping this question will help me get some ideas about what I need to look at.
I am working on a research project and for something I am experimenting with I want to create threads using the clone system call instead of using pthread_create
In the exceedingly unlikely scenario where your new thread never calls any libc functions (either directly, or by calling something else which calls libc; this also includes dynamic symbol resolution via PLT), then you can pass whatever TLS storage you desire as the new_tls parameter to clone.
You should ignore all references to set_thread_area -- they only apply to 32-bit/ix86 case.
If you are planning to use libc in your newly-created thread, you should abandon your approach: libc expects TLS to be set up a certain way, and there is no way for you to arrange for such setup when you call clone directly. Your new thread will intermittently crash when libc discovers that you didn't set up TLS properly. Debugging such crashes is exceedingly difficult, and the only reliable solution is ... to use pthread_create.
The other answer is absolutely correct in that setting up a thread outside of libc's control is guaranteed to cause trouble at a certain point. You can do it, but you can no longer rely on libc's services, definitely not on any of the pthread_* functions or thread-local variables (defined as such using __thread or thread_local).
That being said, you can set one of the segment registers used for TLS (GS and FS) even on x86-64. The system call to look for is arch_prctl(ARCH_SET_GS, ...).
You can see an example comparing setting up TLS registers on i386 and x86-64 in this piece of code.
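As a rough illustration of the x86-64 side, the sketch below calls arch_prctl through syscall(2), since not every glibc exposes a wrapper for it. The helper names are invented for this example. It touches only the GS base, which glibc does not normally use on x86-64; changing the FS base underneath libc would break its own TLS, which is exactly the danger described in the answers above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* for syscall() */
#endif
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

#if defined(__x86_64__)
#include <asm/prctl.h>  /* ARCH_SET_GS, ARCH_GET_GS, ARCH_GET_FS */

/* Point the GS base at 'area'. Returns 0 on success, -1 with errno set. */
int set_gs_base(void *area)
{
    return (int)syscall(SYS_arch_prctl, ARCH_SET_GS, (unsigned long)area);
}

/* Read back the current GS base into *out. */
int get_gs_base(unsigned long *out)
{
    return (int)syscall(SYS_arch_prctl, ARCH_GET_GS, out);
}

/* Read the FS base, i.e. the TLS block that libc set up for this thread. */
int get_fs_base(unsigned long *out)
{
    return (int)syscall(SYS_arch_prctl, ARCH_GET_FS, out);
}
#else
/* arch_prctl is x86-specific; report "not implemented" elsewhere. */
int set_gs_base(void *area)         { (void)area; errno = ENOSYS; return -1; }
int get_gs_base(unsigned long *out) { (void)out;  errno = ENOSYS; return -1; }
int get_fs_base(unsigned long *out) { (void)out;  errno = ENOSYS; return -1; }
#endif
```

A thread created with a raw clone could call set_gs_base on its own private block and then address its per-thread data relative to GS, as long as it stays away from libc's thread-local machinery.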
Per customer requirements, I installed CentOS 5.6 with the default kernel. With this kernel installed, the time.h file includes the #define CLOCK_MONOTONIC.
Now, a real-time kernel was installed along with the kernel-devel and our code would like to use CLOCK_MONOTONIC_RAW. It does exist as a part of the kernel's header files, but when I compile our code, it does not find it in the standard userspace includes.
My question is, what is the proper procedure to including/replacing the time.h found by default with the real-time kernel? From my research, it looks like symlinks are bad, so how should it be handled? What is the procedure or process? Upgrading to CentOS 6.0 or 5.7 is not an option per customer requirements. Thanks.
Well, userspace code uses userspace headers. Kernel modules use kernel headers (and that's why symlinks are bad, because you would be mixing userspace code with kernel headers).
To get the definition of CLOCK_MONOTONIC_RAW, you will have to update glibc — for CLOCK_ definitions, the "borderline" (they still count as userspace though!) headers in /usr/include/linux are not used.
With CentOS 5 default install, you are screwed, because both glibc (2.5) and the kernel (2.6.18) are too old; glibc-2.12 (commit glibc-2.12~111) and kernel-2.6.28 are the first to have MONOTONIC_RAW. That means it's got to be CentOS 6, or something else better.
You can try cheating your way in by using something like #ifndef CLOCK_MONOTONIC_RAW, #define CLOCK_MONOTONIC_RAW 4, #endif in your code, but that counts as unportable.
The definition of CLOCK_MONOTONIC_RAW is in /usr/local/include/linux/time.h on our Fedora 11 install, but this header appears to be basically unusable. It doesn't declare clock_gettime or define clockid_t, but it happily defines struct timespec and struct itimerspec. The former is preceded by "#ifndef _STRUCT_TIMESPEC" so you can turn it off, but the latter is completely unprotected, which means that you can't include <time.h> and <linux/time.h> in the same file without getting conflicting definitions.
There might be some contortion of #include directives that you could use to get this working using the headers in /usr/include, but I gave up and just copied the linux version to the source code directory for my project and then commented out the extra junk that I didn't need. So much for portability.
I'm working on a piece of software that monitors other processes' system calls using ptrace(2). Unfortunately, most modern operating systems implement some kind of fast user-mode syscalls, which are called vsyscalls in Linux.
Is there any way to disable the use of vsyscalls/vDSO for a single process or, if that is not possible, for the whole operating system?
Try echo 0 > /proc/sys/kernel/vsyscall64
If you're trying to ptrace on gettimeofday calls and they aren't showing up, what time source is the system using (pmtimer, acpi, tsc, hpet, etc.)? I wonder if you'd humor me by trying to force your timer to something older like pmtimer. It's possible one of the many gtod timer-specific optimizations is causing your ptrace calls to be avoided, even with vsyscall set to zero.
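To answer the "what time source" question from code rather than by hand, something like the following could be used. It is only a sketch: the helper name is invented, and it assumes the usual sysfs location for the clock source is present on the system in question.

```c
#include <stdio.h>
#include <string.h>

/* sysfs file naming the clock source currently in use (assumed to exist). */
#define CLOCKSOURCE_PATH \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/* Hypothetical helper: copy the current clock source name into buf.
 * Returns 0 on success, -1 if the file cannot be read. */
int current_clocksource(char *buf, size_t len)
{
    if (len == 0)
        return -1;
    FILE *f = fopen(CLOCKSOURCE_PATH, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(buf, (int)len, f);
    fclose(f);
    if (ok == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

The neighbouring available_clocksource file lists the alternatives, and writing one of those names to current_clocksource as root switches the timer at runtime.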
Is there any way to disable the use of vsyscalls/vDSO for a single process or, if that is not possible, for the whole operating system?
It turns out there IS a way to effectively disable linking vDSO for a single process without disabling it system-wide using ptrace!
All you have to do is to stop the traced process before it returns from execve and remove the AT_SYSINFO_EHDR entry from the auxiliary vector (which comes directly after environment variables along the memory region pointed to in rsp). PTRACE_EVENT_EXEC is a good place to do this.
AT_SYSINFO_EHDR is what the kernel uses to tell the dynamic linker where the vDSO is mapped in the process's address space. If this entry is not present, ld.so seems to act as if the system hasn't mapped a vDSO.
Note that this doesn't unmap the vDSO from the process's memory; the dynamic linker merely ignores it when linking other shared libraries. A malicious program will still be able to interact with it if its author really wants to.
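One way to check from inside the traced program whether the entry is still visible is to ask libc for it. This is a small sketch assuming glibc 2.16 or later (for getauxval); the function name vdso_base is made up.

```c
#include <elf.h>       /* AT_SYSINFO_EHDR */
#include <sys/auxv.h>  /* getauxval(), glibc 2.16 or later */

/* Returns the address at which the kernel advertised the vDSO, or 0 when
 * the AT_SYSINFO_EHDR entry is absent from this process's auxiliary vector. */
unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}
```

If the tracer removed the entry as described, vdso_base() should return 0 in the child, even though the vDSO pages themselves may still show up in /proc/self/maps.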
I know this answer is a bit late, but I hope this information will spare some poor soul a headache
For newer systems echo 0 > /proc/sys/kernel/vsyscall64 might not work. In Ubuntu 16.04 vDSO can be disabled system-wide by adding the kernel parameter vdso=0 in /etc/default/grub under the parameter: GRUB_CMDLINE_LINUX_DEFAULT.
IMPORTANT: The GRUB_CMDLINE_LINUX_DEFAULT parameter might be overwritten by other configuration files in /etc/default/grub.d/..., so double-check where you add your custom configuration.
Picking up on Tenders McChiken's approach, I did create a wrapper that disables vDSO for an arbitrary binary, without affecting the rest of the system: https://github.com/danteu/novdso
The general procedure is quite simple:
use ptrace to wait for return from execve(2)
find the address of the auxvector
overwrite the AT_SYSINFO_EHDR entry with AT_IGNORE, telling the application to ignore the following value
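The rewrite in the last step can be illustrated on an in-memory copy of an auxiliary vector. This is a sketch of the idea only, not code taken from the linked project: a real tracer would read and write the tracee's stack with PTRACE_PEEKDATA and PTRACE_POKEDATA instead of dereferencing a pointer, and the function name is invented.

```c
#include <elf.h>    /* AT_NULL, AT_IGNORE, AT_SYSINFO_EHDR */
#include <link.h>   /* ElfW() picks the 32- or 64-bit ELF types */
#include <stddef.h>

/* Walk an auxiliary vector (terminated by an AT_NULL entry) and retag every
 * AT_SYSINFO_EHDR entry as AT_IGNORE, leaving its value untouched.
 * Returns the number of entries that were changed. */
size_t hide_vdso_entry(ElfW(auxv_t) *auxv)
{
    size_t changed = 0;
    for (; auxv->a_type != AT_NULL; auxv++) {
        if (auxv->a_type == AT_SYSINFO_EHDR) {
            auxv->a_type = AT_IGNORE;
            changed++;
        }
    }
    return changed;
}
```

Retagging the entry rather than deleting it keeps the vector the same length, so nothing after it on the stack has to be moved.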
I know this is an older question, but nobody has mentioned a third useful way of disabling the vDSO on a per-process basis. You can override the libc functions with your own versions that perform the actual system call, using LD_PRELOAD.
A simple shared library for overriding the gettimeofday and time functions, for example, could look like this:
vdso_override.c:
#include <time.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/syscall.h>

int gettimeofday(struct timeval *restrict tv, struct timezone *restrict tz)
{
    return syscall(__NR_gettimeofday, (long)tv, (long)tz, 0, 0, 0, 0);
}

time_t time(time_t *tloc)
{
    return syscall(__NR_time, (long)tloc, 0, 0, 0, 0, 0);
}
This uses the libc wrapper to issue a raw system call (see syscall(2)), so the vDSO is circumvented. You would have to overwrite all system calls that the vDSO exports on your architecture in this way (listed at vdso(7)).
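On x86-64 the vDSO also exports clock_gettime, so a fuller override file would probably want a function like the one below as well. This is a sketch under the assumption of a 64-bit system, where the libc and kernel struct timespec layouts match; it could be added to the same source file.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* for syscall() */
#endif
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Same idea as the overrides above, applied to clock_gettime():
 * always enter the kernel instead of taking the vDSO fast path. */
int clock_gettime(clockid_t clk, struct timespec *ts)
{
    return (int)syscall(SYS_clock_gettime, clk, ts);
}
```

With this in place, a tracer watching the preloaded program should see a clock_gettime system call for every call the program makes through libc.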
Compile with
gcc -fpic -shared -o vdso_override.so vdso_override.c
Then run any program in which you want to disable VDSO calls as follows:
LD_PRELOAD=./vdso_override.so <some program>
This of course only works if the program you are running is not actively trying to circumvent this. While you can override a symbol using LD_PRELOAD, if the target program really wants to, there is a way to find the original symbol and use that instead.
I am trying to interface to a microcontroller from my linux box via RS232 serial.
I have written the driver and implemented a protocol between the PC and the microcontroller, which uses a tty (/dev/ttyS0) device already present in the kernel as a module (e.g. by calling open, close, etc.). However, when I try to compile, it says it cannot find a reference to open, write, read, etc.
How do I just use an existing device driver from within a driver? Is there something else I need to include?
If not, how can I use the serial port easily from within a driver?
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/uaccess.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/cdev.h>
#include <linux/spinlock.h>
#include <linux/termios.h>
#include <linux/fcntl.h>
#include <linux/unistd.h>
Normally you should do such a thing in userspace - implement your device's protocol in a normal, userspace program.
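For comparison, talking to /dev/ttyS0 from userspace needs nothing beyond the standard termios interface. The sketch below uses a made-up helper name, and raw 8N1 mode is only an assumption about what the microcontroller expects.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* for cfmakeraw() */
#endif
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: open a serial device and switch it to raw 8N1 mode
 * at the given speed (e.g. B9600). Returns the fd, or -1 on failure. */
int open_serial(const char *path, speed_t baud)
{
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    struct termios tio;
    if (tcgetattr(fd, &tio) != 0) {   /* fails if path is not a terminal */
        close(fd);
        return -1;
    }
    cfmakeraw(&tio);                  /* no echo, no line editing, 8-bit chars */
    tio.c_cflag |= CLOCAL | CREAD;    /* ignore modem lines, enable receiver */
    tio.c_cflag &= ~CSTOPB;           /* one stop bit */
    cfsetispeed(&tio, baud);
    cfsetospeed(&tio, baud);
    tio.c_cc[VMIN]  = 1;              /* read() blocks until at least 1 byte */
    tio.c_cc[VTIME] = 0;              /* no inter-byte timeout */

    if (tcsetattr(fd, TCSANOW, &tio) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After open_serial("/dev/ttyS0", B9600) succeeds, the protocol can be implemented with plain read and write calls on the returned descriptor.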
It is possible, but definitely not recommended to do these things in the kernel. For example, the ppp driver implements a network driver on top of a serial driver. I don't know how it works in that case, but I'd expect that a userspace helper program opens the device, initialises its parameters etc, then passes the file descriptor into the kernel using some system call.
You cannot call arbitrary library functions from the kernel - or indeed, any library functions at all (except libraries which are actually shipped as part of the kernel). This includes kernel system calls. There are equivalent functions which it may be possible to call - for example, filp_open.
In most cases you can't just call the normal syscalls from the kernel, as they expect pointers to point to userspace data, but in the kernel yours (allocated via kmalloc etc.) will normally point to kernel-space data. The two can't be freely mixed.
Is it possible to compile a linux kernel(2.6) module that includes functionality defined by non-kernel includes?
For example:
kernelmodule.h
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h> // printk()
// ...
#include <openssl/sha.h>
// ...
Makefile
obj-m := kernelmodule.o

all:
	$(MAKE) -C /lib/modules/`uname -r`/build M=`pwd` modules

clean:
	$(MAKE) -C /lib/modules/`uname -r`/build M=`pwd` clean
	$(RM) Module.markers modules.order
The kernel module I have written and am trying to compile uses functionality found in a number of openssl include files.
The standard makefile presented above doesn't allow includes outside of the linux headers. Is it possible to include this functionality, and if so, could you please point me in the right direction.
Thanks,
Mike
The kernel cannot use userspace code and must stand alone (i.e. be completely self contained, no libraries), therefore it does not pick up standard headers.
It is not clear what benefit there is in trying to pick up userspace headers. If there are things in there that would be valid to use (constants, perhaps some macros, provided they don't call any userspace functions), then it may be better to duplicate them and include only the kernel-compatible parts that you need.
It is not possible to link the kernel with libraries designed for userspace use - even if they don't make any OS calls - because the linking environment in the kernel cannot pick them up.
Instead, recompile any functions to be used in the kernel (assuming they don't make any OS or library calls - e.g. malloc - in which case they'll need to be modified anyway). Incorporate them into your own library to be used in your kernel modules.
Recent versions of linux contain cryptographic functions anyway, including various SHA hashes - perhaps you can use one of those instead.
Another idea would be to stop trying to do crypto in kernel-space and move the code to userspace. Userspace code is easier to write / debug / maintain etc.
I have taken bits of userspace code that I've written and converted it to work in kernel space (i.e. using kmalloc(), etc), it's not that difficult. However, you are confined to the kernel's understanding of C, not userspace, which differs slightly .. especially with various standard int types.
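To give an idea of what such dual-use code can look like, here is a small self-contained routine that should build in both environments. FNV-1a is used purely as a short stand-in (it is not a cryptographic hash and not a replacement for SHA), and the typedef names are invented for this example; the point is the header switch and the absence of any libc calls.

```c
#ifdef __KERNEL__
#include <linux/types.h>   /* u8, u32, size_t */
typedef u8  byte_t;
typedef u32 word_t;
#else
#include <stddef.h>        /* size_t */
#include <stdint.h>        /* uint8_t, uint32_t */
typedef uint8_t  byte_t;
typedef uint32_t word_t;
#endif

/* 32-bit FNV-1a hash: no allocation and no library calls, so the same
 * source can be compiled into a module or a userspace test program. */
word_t fnv1a32(const void *data, size_t len)
{
    const byte_t *p = data;
    word_t h = 2166136261u;    /* FNV offset basis */

    while (len--) {
        h ^= *p++;
        h *= 16777619u;        /* FNV prime */
    }
    return h;
}
```

Keeping the logic in a file like this makes it possible to unit-test it in userspace first and only then pull it into the module build.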
Just linking against user space DSO's is not possible — the Linux kernel is monolithic, completely self contained. It does not use userspace libc, libraries or other bits as others have noted.
9/10 times, you will find what you need somewhere in the kernel. It's very likely that someone else ran into the same need you have and wrote some static functions in some module to do what you want .. just grab those and re-use them.
In the case of crypto, as others have said, just use what's in the kernel. One thing to note, you'll need them to be enabled in kconfig which may or may not happen depending on what the user selects when building it. So, watch out for dependencies and be explicit, you may have to hack a few entries in kconfig that also select the crypto API you want when your module is selected. Doing that can be a bit of a pain when building out of tree.
So on the one hand we have "just copy and rename stuff while adding overall bloat", on the other you have "tell people they must have the full kernel source". It's one of the quirks that come with a monolithic kernel.
With a Microkernel, almost everything runs in userspace, no worries linking against a DSO for some driver ... it's a non issue. Please don't take that statement as a cue to re-start kernel design philosophy in comments, that's not in the scope of this question.