Weird syscall numbers on Linux 32 bits

The story
I have a C program that automatically generates a list of syscall numbers, since I prefer generating such lists from a real-world reference over maintaining hand-written files when applicable. The target is an Ada package. I ran a test with the classic “Hello world” using the common write syscall… and it failed. I figured out that the syscall number was wrong: 64 instead of 4.
I generated the list from a C program that includes <asm-generic/unistd.h>. The platform is 32-bit, and no toolchain targeting a 64-bit platform was ever installed.
Example definitions from this unistd.h: #define __NR_write 64 (should be 4), #define __NR_read 63 (should be 3), #define __NR_getuid 174 (should be 24), and so on…
I ran a text search through all files in /usr/** for occurrences of __NR_write that would be part of the expected definition, and found none.
The question
Why does this header specify weird syscall numbers? Why are the expected definitions found nowhere? Is this a new ABI?
Note: the platform is Ubuntu 12.04, 32 bits.
Update
I figured something out by running this command:
find /usr/include/ -name "unistd*" -exec nano '{}' \;
It shows that the header /usr/include/i386-linux-gnu/asm/unistd_32.h contains the correct numbers, and that header is included from /usr/include/i386-linux-gnu/asm/unistd.h, but many of the symbols are not defined when <asm/unistd.h> is included.
Update 2
Not only do the numbers differ, but many names do too, e.g. __NR_socket vs. __NR_socketcall. The start of an explanation may be found in the possible duplicate: arch/x86/include/asm/unistd.h vs. include/asm-generic/unistd.h.

If you start with /usr/include/sys/syscall.h (as indicated in syscall(2)) and repeatedly follow the include directives you arrive at /usr/include/asm/unistd_32.h. Hence I recommend you use this header.
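As a minimal sketch of what using that header chain looks like in practice: including <sys/syscall.h> gives SYS_write the value that matches the architecture being compiled for, so the number never has to be hard-coded. The helper name write_via_syscall is made up for this illustration.

```c
#define _DEFAULT_SOURCE      /* asks glibc to declare syscall() */
#include <string.h>
#include <sys/syscall.h>     /* SYS_write, resolved for the target architecture */
#include <unistd.h>          /* syscall() */

/* write_via_syscall is a hypothetical helper, not part of any library.
 * It invokes the write system call by number and returns what the kernel
 * returned: the number of bytes written, or -1 with errno set. */
long write_via_syscall(int fd, const char *text)
{
    return syscall(SYS_write, fd, text, strlen(text));
}
```

Calling write_via_syscall(1, "Hello world\n") should print the greeting on standard output whatever the numeric value of SYS_write is on the platform.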

From the source of asm-generic/unistd.h:
/*
 * This file contains the system call numbers, based on the
 * layout of the x86-64 architecture, which embeds the
 *               ^^^^^^
 * pointer to the syscall in the table.
 *
 ...
 */

Related

How would I lookup Linux syscalls by name/number in Go?

I'd like to lookup Linux syscalls for amd64 and i386 by name/number in Go, and was wondering if there's a built-in mapping available somewhere within the Go standard library, or a third-party module.
I can see here that the Go developers have hardcoded Linux syscall numbers into the syscall module:
i386: https://golang.org/src/syscall/zsysnum_linux_386.go
amd64: https://golang.org/src/syscall/zsysnum_linux_amd64.go
It looks like they've generated each of these files using GCC: https://golang.org/src/syscall/mksysnum_linux.pl
Example syscalls (i386):
// mksysnum_linux.pl /usr/include/asm/unistd_32.h
// Code generated by the command above; DO NOT EDIT.
// +build 386,linux
package syscall
const (
SYS_RESTART_SYSCALL = 0
SYS_EXIT = 1
SYS_FORK = 2
SYS_READ = 3
SYS_WRITE = 4
SYS_OPEN = 5
SYS_CLOSE = 6
...
Would my best bet be to hard-code this mapping within my code, or is there a maintained mapping available somewhere?
I'm not looking for the mapping between syscall names/numbers on a particular Linux system, I'm looking for a (likely) mapping between syscall names/numbers on any (modern) Linux system on amd64/i386.
I understand that syscall numbers may change, but this is intended as a best-effort approach.
The mapping comes from the kernel headers; for example, the i386 mapping is in /usr/include/asm/unistd_32.h.
You should read that file side by side with the Perl script that parses it. The script is only about a page long and matches a very small number of #define patterns in the header file; some of the patterns match many times in a row, which yields the whole list of syscalls by name and number.
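To illustrate the kind of pattern matching involved, here is a rough sketch that recognizes only the simplest header form, "#define __NR_<name> <number>". It is not a port of the Perl script, which handles more cases; parse_nr_define is a hypothetical helper name.

```c
#include <stdio.h>

/* parse_nr_define is a hypothetical helper. It accepts lines of the form
 * "#define __NR_<name> <decimal number>", stores the name (without the
 * __NR_ prefix) and the number, and returns 1 on a match, 0 otherwise. */
int parse_nr_define(const char *line, char name[64], long *number)
{
    /* %63[A-Za-z0-9_] reads the identifier that follows the __NR_ prefix;
     * %ld fails on anything that is not a plain integer, such as "(". */
    return sscanf(line, "#define __NR_%63[A-Za-z0-9_] %ld", name, number) == 2;
}
```

Feeding every line of the header through such a function and keeping the matches would give a name/number table; definitions written as expressions are skipped by this sketch.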
Also refer to this question (cross-site dupe):
Where do you find the syscall table for Linux?

nasm system calls Linux

I have got a question about linux x86 system calls in assembly.
When I am creating a new assembly program with nasm on Linux, I'd like to know which system calls I have to use for a specific task (for example reading a file, writing output, or simply exiting...). I know some syscalls because I've seen them in examples around the internet (such as eax=1, ebx=1, int 0x80 to exit with a return value of 1), but nothing more... How could I know if there are other arguments for the exit syscall? Or for another syscall? I'm looking for docs that explain which syscalls take which arguments in which registers.
I've read the man page about exit function etc. but it didn't explain to me what I'm asking.
Hope I was clear enough,
Thank you!
The x86 wiki (which I just updated again :) has links to the system call ABI (what the numbers are for every call, where to put the params, what instruction to run, and which registers will be clobbered on return). This is not documented in the man pages because it's architecture-specific. The same goes for binary constants: they don't have to be the same on every architecture.
Run grep -r O_APPEND /usr/include for your target architecture to recursively search the .h files.
Even better is to set things up so you can use the symbolic constants in your asm source, for readability and to avoid the risk of errors.
GCC does run the C preprocessor on .S files, but including most C header files will also pull in C prototypes, which are not valid assembly.
Or convert the #defines to NASM macros with sed or something similar. For example, feed some #include <> lines to the C preprocessor and have it print out just the macro definitions.
printf '#include <%s>\n' unistd.h sys/stat.h |
gcc -dD -E - |
sed -ne 's/^#define \([A-Za-z_0-9]*\) \(.\)/\1\tequ \2/p'
That turns every non-empty #define into a NASM symbol equ value. The resulting file has many lines of error: expression syntax error when I tried to run NASM on it, but manually selecting some valid lines from that may work.
Some constants are defined in multiple steps, e.g. #define S_IRGRP (S_IRUSR >> 3). This might or might not work when converted to NASM equ symbol definitions.
Also note that in C, 0666 is an octal constant. In NASM, you need either 0o666 or 666o; a leading 0 is not special. Otherwise, NASM syntax for hex and decimal constants is compatible with C.
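An alternative to post-processing the preprocessor output is to let the C compiler evaluate the constants and print them as NASM equ lines. The sketch below assumes this approach is acceptable for your build; format_equ, PRINT_EQU and print_open_flags are names made up for the illustration. Values are printed in decimal, which sidesteps the octal difference mentioned above.

```c
#include <fcntl.h>
#include <stdio.h>

/* format_equ is a hypothetical helper: it renders "NAME equ VALUE" with the
 * value in decimal, which NASM and C read the same way. */
int format_equ(char *out, size_t size, const char *name, long value)
{
    return snprintf(out, size, "%s equ %ld", name, value);
}

/* #sym keeps the macro's name as text, while (sym) expands to its value,
 * so multi-step definitions are already evaluated by the compiler. */
#define PRINT_EQU(sym)                                         \
    do {                                                       \
        char line_[64];                                        \
        format_equ(line_, sizeof line_, #sym, (long)(sym));    \
        puts(line_);                                           \
    } while (0)

/* Prints a few open(2) flags; the selection is arbitrary. */
void print_open_flags(void)
{
    PRINT_EQU(O_RDONLY);
    PRINT_EQU(O_WRONLY);
    PRINT_EQU(O_CREAT);
    PRINT_EQU(O_APPEND);
}
```

The output of print_open_flags() can be redirected to a file and pulled into the NASM source with %include. It has to be built with a compiler that targets the same architecture as the assembly code, since the values may differ between architectures.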
Perhaps you are looking for something like linux/syscalls.h[1], which you have on your system if you've installed the Linux source code via apt-get or whatever your distro uses.
[1] http://lxr.free-electrons.com/source/include/linux/syscalls.h#L326

How to disassemble these instructions

I am writing a little disassembler using riscv-spec-v2.0 and have some questions about the following instructions and how to correctly disassemble them:
1. The FENCE instruction has "pred" and "succ" bit fields in imm.
2. AMO instructions have "aq" and "rl" bits in funct7.
3. Float instructions have an "rm" bit field in funct3.
All of these bit fields seem to lack mappings in the assembler. E.g. page 50 just says "FENCE" but not what to do with the immediate. Or page 33 has an example of putting .aq or .rl at the end but not what to do if both are present.
4. SCALL and SBREAK are the same as ECALL and EBREAK, but there is also ERET. So why not drop SCALL and SBREAK and just use ECALL, EBREAK and ERET? Otherwise it is hard to disassemble these opcodes.
The current RISC-V assembler is terse for common defaults:
1. "FENCE" with no arguments is treated as a full fence (all bits set).
2. It is OK to have both .aq and .rl on the same instruction.
3. The rounding mode is not shown if not specified.
4. ECALL and EBREAK will be the new standard names (will be clarified in the revised user ISA manual).
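For the disassembler side, extracting the fields in question could be sketched as follows. The bit positions are my reading of the standard instruction formats (pred in bits 27:24, succ in bits 23:20, aq in bit 26, rl in bit 25, rm in bits 14:12), so check them against the spec version you target. All function names here are hypothetical.

```c
#include <stdint.h>

/* FENCE: predecessor set in bits 27:24, successor set in bits 23:20. */
unsigned fence_pred(uint32_t insn) { return (insn >> 24) & 0xFu; }
unsigned fence_succ(uint32_t insn) { return (insn >> 20) & 0xFu; }

/* AMO: aq is bit 26 and rl is bit 25, the two low bits of funct7. */
unsigned amo_aq(uint32_t insn) { return (insn >> 26) & 1u; }
unsigned amo_rl(uint32_t insn) { return (insn >> 25) & 1u; }

/* Floating point: the rounding mode rm is the funct3 field, bits 14:12. */
unsigned fp_rm(uint32_t insn) { return (insn >> 12) & 7u; }

/* Writes the letters for a 4-bit pred/succ set into out (at least 5 bytes).
 * Bit 3 is I (input), bit 2 is O (output), bit 1 is R, bit 0 is W. */
void fence_set_to_string(unsigned set, char out[5])
{
    static const char letters[4] = { 'i', 'o', 'r', 'w' };
    int n = 0;
    for (int bit = 3; bit >= 0; bit--) {
        if (set & (1u << bit))
            out[n++] = letters[3 - bit];
    }
    out[n] = '\0';
}
```

A disassembler built on these helpers could print a bare "fence" when both sets are 0xF, append .aq, .rl or both according to the two AMO bits, and omit the rounding mode when rm holds whatever value it treats as the default.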

Where is OPEN_MAX defined for Linux systems?

OPEN_MAX is the constant that defines the maximum number of open files allowed for a single program.
According to Beginning Linux Programming 4th Edition, Page 101 :
The limit, usually defined by the constant OPEN_MAX in limits.h, varies from system to system, ...
On my system, the file limits.h in directory /usr/lib/gcc/x86_64-linux-gnu/4.6/include-fixed does not have this constant. Am I looking at the wrong limits.h, or has the location of OPEN_MAX changed since 2008?
For what it's worth, the 4th edition of Beginning Linux Programming was published in 2007; parts of it may be a bit out of date. (That's not a criticism of the book, which I haven't read.)
It appears that OPEN_MAX is deprecated, at least on Linux systems. The reason appears to be that the maximum number of files that can be opened simultaneously is not fixed, so a macro that expands to an integer literal is not a good way to get that information.
There's another macro, FOPEN_MAX, that should be similar; I can't think of a reason why OPEN_MAX and FOPEN_MAX, if they're both defined, should have different values. But FOPEN_MAX is mandated by the C language standard, so systems don't have the option of not defining it. The C standard says that FOPEN_MAX
expands to an integer constant expression that is the minimum number of files that
the implementation guarantees can be open simultaneously
(If the word "minimum" is confusing, it's a guarantee that a program can open at least that many files at once.)
If you want the current maximum number of files that can be opened, take a look at the sysconf() function; on my system, sysconf(_SC_OPEN_MAX) returns 1024. (The sysconf() man page refers to a symbol OPEN_MAX. This is not a count, but a value recognized by sysconf(). And it's not defined on my system.)
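A minimal sketch of querying the limit at run time, using sysconf() as described above and, for comparison, getrlimit() with RLIMIT_NOFILE. The two helper names are made up for this example.

```c
#include <sys/resource.h>
#include <unistd.h>

/* current_open_limit is a hypothetical helper. sysconf() returns -1 when
 * the limit is indeterminate or the query fails. */
long current_open_limit(void)
{
    return sysconf(_SC_OPEN_MAX);
}

/* soft_nofile_limit is a hypothetical helper that reads the soft
 * RLIMIT_NOFILE value, i.e. the limit that "ulimit -n" reports.
 * Returns -1 if getrlimit() fails. */
long soft_nofile_limit(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;
    return (long)lim.rlim_cur;
}
```

Because the value can change while the program runs (for example after a setrlimit() call), it is safer to query it when needed than to cache it at startup.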
I've searched for OPEN_MAX (word match, so excluding FOPEN_MAX) on my Ubuntu system, and found the following (these are obviously just brief excerpts):
/usr/include/X11/Xos.h:
# ifdef __GNU__
# define PATH_MAX 4096
# define MAXPATHLEN 4096
# define OPEN_MAX 256 /* We define a reasonable limit. */
# endif
/usr/include/i386-linux-gnu/bits/local_lim.h:
/* The kernel header pollutes the namespace with the NR_OPEN symbol
and defines LINK_MAX although filesystems have different maxima. A
similar thing is true for OPEN_MAX: the limit can be changed at
runtime and therefore the macro must not be defined. Remove this
after including the header if necessary. */
#ifndef NR_OPEN
# define __undef_NR_OPEN
#endif
#ifndef LINK_MAX
# define __undef_LINK_MAX
#endif
#ifndef OPEN_MAX
# define __undef_OPEN_MAX
#endif
#ifndef ARG_MAX
# define __undef_ARG_MAX
#endif
/usr/include/i386-linux-gnu/bits/xopen_lim.h:
/* We do not provide fixed values for
ARG_MAX Maximum length of argument to the `exec' function
including environment data.
ATEXIT_MAX Maximum number of functions that may be registered
with `atexit'.
CHILD_MAX Maximum number of simultaneous processes per real
user ID.
OPEN_MAX Maximum number of files that one process can have open
at anyone time.
PAGESIZE
PAGE_SIZE Size of bytes of a page.
PASS_MAX Maximum number of significant bytes in a password.
We only provide a fixed limit for
IOV_MAX Maximum number of `iovec' structures that one process has
available for use with `readv' or writev'.
if this is indeed fixed by the underlying system.
*/
Aside from the link given by cste, I would like to point out that there is a /proc/sys/fs/file-max entry that provides the number of files THE SYSTEM can have open at any given time.
Here's some docs:
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Directory_Server/8.2/html/Performance_Tuning_Guide/system-tuning.html
Note that this is not to say that there's a GUARANTEE you can open that many files - if the system runs out of some resource (e.g. "no more memory available"), then it may well fail.
FOPEN_MAX indicates that the C library allows at least this many files to be opened (as discussed), but other limits may be hit first. Say, for example, the SYSTEM limit is 4000 files, and applications already running have 3990 files open. Then you won't be able to open more than 7 files [since stdin, stdout and stderr take up three slots too]. And if rlimit is set to 5, then you can only open 2 files of your own.
In my opinion, the best way to know whether you can open a file is to open it. If that fails, you have to do something else. If you have a process that needs to open MANY files [e.g. a multithreaded search/compare on a machine with 256 cores and 8 threads per core, where each thread uses three files (file "A", "B" and "diff")], then you may need to ensure that your FOPEN_MAX allows for 3 * 8 * 256 files being opened before you start creating threads, as a thread that fails to open its files is useless. But for most ordinary applications, just try to open the file; if it fails, tell the user (log it, or something), and/or try again...
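The "just try to open it" approach could look like the sketch below. try_open is a hypothetical helper; it relies on the POSIX behaviour of fopen() setting errno on failure, where EMFILE means the per-process limit was reached and ENFILE means the system-wide limit was reached.

```c
#include <errno.h>
#include <stdio.h>

/* try_open is a hypothetical helper. On success it stores the stream in
 * *stream and returns 0. On failure it stores NULL and returns the errno
 * value, e.g. EMFILE (per-process limit) or ENFILE (system-wide limit). */
int try_open(const char *path, FILE **stream)
{
    *stream = fopen(path, "r");
    if (*stream == NULL)
        return errno;
    return 0;
}
```

A caller can then decide what to do per error: report ENOENT to the user, or close some files and retry on EMFILE.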
I suggest using grep to find this constant under /usr/include:
grep -rn --color OPEN_MAX /usr/include
...
...
/usr/include/stdio.h:159: FOPEN_MAX Minimum number of files that can be open at once.
...
...
Hope it helps you

How does linux capability.h use 32-bit mask for 34 elements?

The file in /usr/include/linux/capability.h #defines 34 possible capabilities.
It goes like:
#define CAP_CHOWN 0
#define CAP_DAC_OVERRIDE 1
.....
#define CAP_MAC_ADMIN 33
#define CAP_LAST_CAP CAP_MAC_ADMIN
each process has capabilities defined thusly
typedef struct __user_cap_data_struct {
__u32 effective;
__u32 permitted;
__u32 inheritable;
} * cap_user_data_t;
I'm confused - a process can have 32-bits of effective capabilities, yet the total amount of capabilities defined in capability.h is 34. How is it possible to encode 34 positions in a 32-bit mask?
Because you haven't read all of the manual.
The capget manual starts by convincing you not to use it:
These two functions are the raw kernel interface for getting and setting thread capabilities. Not only are these system calls specific to Linux, but the kernel API is likely to change and use of these functions (in particular the format of the cap_user_*_t types) is subject to extension with each kernel revision, but old programs will keep working.
The portable interfaces are cap_set_proc(3) and cap_get_proc(3); if possible you should use those interfaces in applications. If you wish to use the Linux extensions in applications, you should use the easier-to-use interfaces capsetp(3) and capgetp(3).
Current details
Now that you have been warned, some current kernel details. The structures are defined as follows.
#define _LINUX_CAPABILITY_VERSION_1 0x19980330
#define _LINUX_CAPABILITY_U32S_1 1
#define _LINUX_CAPABILITY_VERSION_2 0x20071026
#define _LINUX_CAPABILITY_U32S_2 2
[...]
effective, permitted, inheritable are bitmasks of the capabilities
defined in capability(7). Note the CAP_* values are bit indexes and
need to be bit-shifted before ORing into the bit fields.
[...]
Kernels prior to 2.6.25 prefer 32-bit capabilities with version _LINUX_CAPABILITY_VERSION_1, and kernels 2.6.25+ prefer 64-bit capabilities with version _LINUX_CAPABILITY_VERSION_2. Note, 64-bit capabilities use datap[0] and datap[1], whereas 32-bit capabilities only use datap[0].
where datap is defined earlier as a pointer to a __user_cap_data_struct. So a 64-bit value is simply represented by two __u32 words, one in each element of an array of two __user_cap_data_struct.
This alone tells me never to use this API, so I didn't read the rest of the manual.
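To make the two-word layout concrete, here is a sketch of testing a capability bit index such as 33 against an array of two structures. The struct is redeclared under a made-up name so the example does not depend on kernel headers, and cap_is_effective is a hypothetical helper; as far as I know, linux/capability.h offers CAP_TO_INDEX and CAP_TO_MASK macros for the same arithmetic.

```c
#include <stdint.h>

/* Mirrors the three __u32 fields of __user_cap_data_struct. The name
 * cap_words is hypothetical, used here to avoid needing kernel headers. */
struct cap_words {
    uint32_t effective;
    uint32_t permitted;
    uint32_t inheritable;
};

/* cap is a bit index such as CAP_MAC_ADMIN (33), valid range 0..63.
 * Indexes 0..31 live in data[0], indexes 32..63 in data[1]. */
int cap_is_effective(const struct cap_words data[2], unsigned cap)
{
    unsigned word = cap >> 5;                   /* which array element */
    uint32_t mask = UINT32_C(1) << (cap & 31);  /* which bit inside it */
    return (data[word].effective & mask) != 0;
}
```

With this layout, capability 33 is bit 1 of data[1].effective, which is how 34 (or up to 64) capabilities fit even though each field is only 32 bits wide.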
They aren't bit masks, they're just constants. E.g. the value of CAP_MAC_ADMIN has more than one bit set: in binary, 33 is 100001.
