Question about file seeking position

My previous question was about reading and writing raw data, but a new problem has arisen; it seems there is no end to them...
The problem is this: the offset parameters of functions like lseek() and fseek() are all 4 bytes wide, so moving by more than 4G is impossible. I know that in Win32 there is a function SetFilePointer(..., Low, High, ...) that combines two values into a 64-bit offset, which is what I want.
But if I want to create an app on Linux or Unix (to create a file or to write raw drive sectors directly), how can I seek to an offset beyond 4G?
Thanks, waiting for your replies...

The offset parameter of lseek is of type off_t. In 32-bit compilation environments, this type defaults to a 32-bit signed integer - however, if you compile with this macro defined before all system includes:
#define _FILE_OFFSET_BITS 64
...then off_t will be a 64-bit signed type.
For fseek, the fseeko function is identical except that it uses the off_t type for the offset, which allows the above solution to work with it too.
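As a rough illustration of the above, here is a minimal sketch that seeks to 5 GiB with both lseek() and fseeko(). The helper names seek_fd_past_4g and seek_stream_past_4g are made up for this example, and the _POSIX_C_SOURCE line is only an assumption to make fseeko()/ftello() visible when compiling in a strict ISO C mode.

```c
/* Both macros must come before any system header. */
#define _FILE_OFFSET_BITS 64      /* make off_t 64 bits wide on 32-bit targets */
#define _POSIX_C_SOURCE 200809L   /* expose fseeko()/ftello() under strict -std=c99 */

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: seek a file descriptor to 5 GiB from the start.
 * Returns the resulting offset, or (off_t)-1 on error. */
off_t seek_fd_past_4g(int fd)
{
    off_t target = (off_t)5 * 1024 * 1024 * 1024;  /* does not fit in 32 bits */
    return lseek(fd, target, SEEK_SET);
}

/* Same idea for a stdio stream, using fseeko()/ftello(). */
off_t seek_stream_past_4g(FILE *fp)
{
    off_t target = (off_t)5 * 1024 * 1024 * 1024;
    if (fseeko(fp, target, SEEK_SET) != 0)
        return (off_t)-1;
    return ftello(fp);
}
```

Seeking past the end of a regular file is allowed and does not allocate anything until something is written there, so this can be tried on an empty temporary file.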

A 4-byte unsigned integer can represent values up to 4294967295 (and a 4-byte signed one, such as a 32-bit off_t, only up to 2147483647), so if you want to move further than that, you need to use lseek64(). In addition, you can use fgetpos() and fsetpos() to change the position in the file.

On Windows, use _lseeki64(), on Linux, lseek64().
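I recommend using lseek64() on both systems by doing something like this:
#ifdef _WIN32
#include <io.h>
#define lseek64 _lseeki64
#else
/* glibc only declares lseek64() if this is defined before any system header */
#define _LARGEFILE64_SOURCE
#include <unistd.h>
#endif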
That's all you need.

Related

Why do both struct timeval with different sizes work when calling select on the same ARM64 Linux?

I used an ARMv7 and an ARMv8 toolchain to compile the same .c file as shown below. Then I ran the two produced programs (select32 and select64) on an ARM64 Linux using QEMU.
#include <sys/select.h>
#include <stdio.h>
int main()
{
struct timeval t = {10, 999999};
printf("sizeof timeval is %d, sizeof(tv_sec) is %d, sizeof(tv_usec) is %d\nstart select()\n",
(int) sizeof(struct timeval), (int)sizeof(t.tv_sec), (int)sizeof(t.tv_usec));
select(0,0,0,0, &t);
printf("select() ends\n");
}
I found that both programs slept for about 10 seconds between "start select()" and "select() ends":
/ # ./select64
sizeof timeval is 16, sizeof(tv_sec) is 8, sizeof(tv_usec) is 8
start select()
select() ends
/ # ./select32
sizeof timeval is 8, sizeof(tv_sec) is 4, sizeof(tv_usec) is 4
start select()
select() ends
Why did the 32-bit application also sleep 10 seconds, although (I think) it passed the kernel a form of struct timeval that the kernel didn't expect? Is it some code in the kernel, or some code in the C library, that found out something weird and then converted the struct?

When you compile with a 32-bit toolchain, the members of struct timeval are 32 bit, and when you compile with a 64-bit toolchain, they're 64 bit. That means that the values you provided are automatically converted to the appropriate type by the compiler, and the data you passed it is of the right size.
The C compiler will automatically convert values of type int, which you've provided, to the appropriate type when using them in a struct initializer if the type used is larger (like long, which is probably what's used here under the hood).
The system calls for 32-bit and 64-bit programs differ, so when you call select in this case, you're really calling two different system calls, one of which is probably 32-bit EABI and the other, 64-bit.

The kernel distinguishes between 32-bit and 64-bit syscalls. The 32-bit select passes an old_timeval32 or compat_timeval structure, while the 64-bit select passes the timeval structure type. The 32-bit one is deprecated due to the Year 2038 problem.
It is similar with timespec: the 32-bit pselect passes a timespec structure while the 64-bit pselect passes a timespec64 structure. The names differ between implementations, for example compat_timespec for the 32-bit one and timespec for the 64-bit one.
Here are some pieces of code from the Linux kernel used in Raspberry Pi OS. There are explicit conversion routines in compat.c to copy data between the two types:
static int compat_put_timeval(struct compat_timeval __user *o,
struct timeval *i)
{
return (put_user(i->tv_sec, &o->tv_sec) ||
put_user(i->tv_usec, &o->tv_usec)) ? -EFAULT : 0;
}
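The kernel routine above copies values out to the 32-bit layout. The opposite direction can be sketched in plain user-space C as follows. This is only an illustration of the idea, not kernel code: the struct names timeval32 and timeval64 and the function widen_timeval are invented for this example.

```c
#include <stdint.h>

/* Layout a 32-bit process passes (hypothetical name; mirrors the idea of
 * the kernel's compat_timeval / old_timeval32). */
struct timeval32 {
    int32_t tv_sec;
    int32_t tv_usec;
};

/* Layout a 64-bit kernel works with internally (hypothetical name). */
struct timeval64 {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* Widen field by field. Reinterpreting the 8-byte struct as the 16-byte
 * one would misread it, so each member is converted separately. */
struct timeval64 widen_timeval(struct timeval32 in)
{
    struct timeval64 out;
    out.tv_sec = in.tv_sec;
    out.tv_usec = in.tv_usec;
    return out;
}
```

With the values from the question, {10, 999999} in the 8-byte layout becomes the same pair in the 16-byte layout, which is why both programs sleep for the same time.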

Incorrect address returned by crypt() on Solaris x64

When debugging a shared library loaded with dlopen(), I found an interesting thing. The address returned by the crypt() function when called from my library is only 32 bits wide; when I try to look at that address in the debugger, it says that it is a bad address. Adding an offset to this address, in my case 0xffffffff00000000, gives the correct result. Looking at the crypt sources, it is clear that the string returned by crypt is a static char array, but it is not clear why the address is only 32 bits wide.
Thank you in advance to any ideas and help

Did you #include <unistd.h> or #include <crypt.h> in your code so that it had the function prototype declaring crypt() as returning char *?
If you don't have a function prototype, C defaults to assuming functions return int, even if that's only 32 bits on a 64-bit machine. This often breaks functions that return pointers (which work by accident on 32-bit systems where int is the same size as a pointer).
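The effect can be simulated with ordinary integers. The sketch below is not what the compiler literally emits; it only models "keep the low 32 bits, then sign-extend", and the function name as_seen_through_int is made up. The narrowing conversion is implementation-defined in C; the usual two's-complement wrap-around (as on GCC and Clang) is assumed.

```c
#include <stdint.h>

/* Models what a caller sees when it believes a pointer-returning function
 * returned a 32-bit int: only the low 32 bits of the address survive, and
 * they are sign-extended when widened back to 64 bits.
 * Assumes two's-complement wrap-around for the narrowing conversion. */
uint64_t as_seen_through_int(uint64_t real_address)
{
    int32_t truncated = (int32_t)(uint32_t)real_address;
    return (uint64_t)(int64_t)truncated;
}
```

Small addresses pass through unchanged, which is why the bug can stay hidden; an address with anything set above bit 31 comes back with its upper half lost.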

Using <linux/types.h> in user programs, or <stdint.h> in driver module code...does it matter?

I'm developing a device driver module and associated user libraries to handle the ioctl() calls. The library takes the pertinent info and puts it into a struct, which gets passed into the driver module and unpacked there and then dealt with (I'm omitting a lot of steps, but that's the overall idea).
Some of the data being passed through the struct via the ioctl() is uint32_t type. I've discovered that that type is defined in <stdint.h> AND <linux/types.h>. So far I've been using <linux/types.h> to define that value, including down in the user libraries. But I understand it is bad form to use <linux/*.h> libraries in user space, so if I remove those and use <stdint.h> instead, then when my driver module includes the struct definition, it will have to be including <stdint.h> also.
It seems to me that the point of <linux/types.h> is to define types in kernel files, so I'm not sure if that means using <stdint.h> is a bad idea there. I also found that when trying to compile my driver module with <stdint.h>, I get compilation errors about redefinitions that won't go away, even if I replace all instances of <linux/types.h> with <stdint.h> (and put it at the top of the include order).
Is it a bad idea to use linux/*.h includes in user-space code?
Is it a bad idea to use <stdint.h> in kernel-space code?
If the answers to both of those are yes, then how do I handle the situation where a structure containing uint32_t is shared by both the user library and the driver module?

Is it a bad idea to use linux/*.h includes in user-space code?
Yes, usually. The typical situation is that you should be using the C-library headers (in this case, stdint.h and friends), interface with the C library through those user-space types, and let the library handle talking with the kernel through kernel types.
You're not in a typical situation though. In your case, you're writing the driver library. So you should be presenting an interface to userspace using stdint.h, but using the linux/*.h headers when you interface to your kernel driver.
So the answer is no, in your case.
Is it a bad idea to use stdint.h in kernel-space code?
Most definitely yes.
See also: http://lwn.net/Articles/113349/
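One common way to share the structure is a small header that both sides include and that uses the double-underscore types from <linux/types.h>, which are available to user space as well as to the kernel. The following is only a sketch: the file name, the struct mydrv_args, the magic byte 'M' and the command MYDRV_IOC_WRITE are all hypothetical.

```c
/* mydrv_ioctl.h: hypothetical header shared by the driver and the library. */
#ifndef MYDRV_IOCTL_H
#define MYDRV_IOCTL_H

#include <linux/types.h>   /* __u32: usable from kernel and user space */
#include <linux/ioctl.h>   /* _IOW and the other ioctl encoding macros */

/* Hypothetical argument block passed through ioctl(). */
struct mydrv_args {
    __u32 reg;
    __u32 value;
};

/* Hypothetical command number; 'M' is an arbitrary magic byte. */
#define MYDRV_IOC_WRITE _IOW('M', 1, struct mydrv_args)

#endif /* MYDRV_IOCTL_H */
```

The library can still expose uint32_t from <stdint.h> in its own public API and copy the values into this struct just before calling ioctl().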

Fixed length integers in the Linux kernel
The Linux kernel already has fixed length integers which might interest you. In v4.9 under include/asm-generic/int-ll64.h:
typedef signed char s8;
typedef unsigned char u8;
typedef signed short s16;
typedef unsigned short u16;
typedef signed int s32;
typedef unsigned int u32;
typedef signed long long s64;
typedef unsigned long long u64;
LDD3 has a chapter about data sizes as well: https://static.lwn.net/images/pdf/LDD3/ch11.pdf
LDD3 mentions there that the best printk strategy is to just cast to the largest integer type possible with the correct signedness: %lld or %llu. %ju appears to be unavailable in the printk formatting centerpiece, lib/vsprintf.c.

How does linux capability.h use 32-bit mask for 34 elements?

The file in /usr/include/linux/capability.h #defines 34 possible capabilities.
It goes like:
#define CAP_CHOWN 0
#define CAP_DAC_OVERRIDE 1
.....
#define CAP_MAC_ADMIN 33
#define CAP_LAST_CAP CAP_MAC_ADMIN
Each process has capabilities defined like this:
typedef struct __user_cap_data_struct {
__u32 effective;
__u32 permitted;
__u32 inheritable;
} * cap_user_data_t;
I'm confused - a process can have 32-bits of effective capabilities, yet the total amount of capabilities defined in capability.h is 34. How is it possible to encode 34 positions in a 32-bit mask?

Because you haven't read all of the manual.
The capget manual starts by convincing you not to use it:
These two functions are the raw kernel interface for getting and setting thread capabilities. Not only are these system calls specific to Linux, but the kernel API is likely to change and use of these functions (in particular the format of the cap_user_*_t types) is subject to extension with each kernel revision, but old programs will keep working.
The portable interfaces are cap_set_proc(3) and cap_get_proc(3); if possible you should use those interfaces in applications. If you wish to use the Linux extensions in applications, you should use the easier-to-use interfaces capsetp(3) and capgetp(3).
Current details
Now that you have been warned, some current kernel details. The structures are defined as follows.
#define _LINUX_CAPABILITY_VERSION_1 0x19980330
#define _LINUX_CAPABILITY_U32S_1 1
#define _LINUX_CAPABILITY_VERSION_2 0x20071026
#define _LINUX_CAPABILITY_U32S_2 2
[...]
effective, permitted, inheritable are bitmasks of the capabilities defined in capability(7). Note the CAP_* values are bit indexes and need to be bit-shifted before ORing into the bit fields.
[...]
Kernels prior to 2.6.25 prefer 32-bit capabilities with version _LINUX_CAPABILITY_VERSION_1, and kernels 2.6.25+ prefer 64-bit capabilities with version _LINUX_CAPABILITY_VERSION_2. Note, 64-bit capabilities use datap[0] and datap[1], whereas 32-bit capabilities only use datap[0].
where datap is defined earlier as a pointer to a __user_cap_data_struct. So you just represent each 64-bit value with two __u32 fields, spread over an array of two __user_cap_data_struct.
This alone tells me never to use this API, so I didn't read the rest of the manual.
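To make the two-word layout concrete, here is a small sketch of how a capability number maps to a word and a bit. The helper names cap_word, cap_bit and cap_is_set are invented for this example; as far as I know, linux/capability.h offers CAP_TO_INDEX and CAP_TO_MASK macros for the same purpose.

```c
#include <stdint.h>

/* Which 32-bit word of the two-element array holds capability `cap`. */
unsigned cap_word(unsigned cap)
{
    return cap >> 5;                    /* cap / 32 */
}

/* Bit mask for `cap` within that word. */
uint32_t cap_bit(unsigned cap)
{
    return (uint32_t)1 << (cap & 31);   /* 1 << (cap % 32) */
}

/* Test a capability in a two-word set, e.g. the `effective` fields of
 * datap[0] and datap[1] gathered into one array. */
int cap_is_set(const uint32_t words[2], unsigned cap)
{
    return (words[cap_word(cap)] & cap_bit(cap)) != 0;
}
```

For CAP_MAC_ADMIN (33) this gives word 1 and mask 0x2, so it lands in the second structure of the array rather than overflowing the first.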

The CAP_* constants aren't bit masks themselves, they're bit indexes. E.g. CAP_MAC_ADMIN is 33, which in binary is 100001; it has to be shifted into a mask before use, and bit 33 doesn't fit in a single 32-bit word.

printf ptr: can the leading 0x be eliminated?

The Linux printf renders %p arguments as hex digits with a leading 0x. Is there a way to make it not print the 0x? (Needs to work on both 32 and 64 bit.)

You can use the format specifier for uintptr_t from <inttypes.h>:
#include <inttypes.h>
[...]
printf("%"PRIxPTR"\n", (uintptr_t) p);
This works like %x for the uintptr_t type, which is an integer type capable of roundtrip conversion from/to any pointer type.
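A slightly fuller sketch of the same idea, formatting into a buffer so the result can be inspected. The helper name format_ptr_hex is made up for this example.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: write the pointer value as bare hex digits
 * (no leading 0x) into buf. Returns whatever snprintf() returns. */
int format_ptr_hex(char *buf, size_t size, const void *p)
{
    return snprintf(buf, size, "%" PRIxPTR, (uintptr_t)p);
}
```

Because PRIxPTR expands to the right length modifier for the platform, the same source works on both 32-bit and 64-bit builds. If fixed-width output is wanted, a width and zero flag can be added by hand, but the width then has to be chosen per pointer size.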

Use %llx with the pointer cast to unsigned long long; it will work on 64-bit for sure. Tried and tested.

Use %lx or %08lx, with the pointer cast to unsigned long. It works with gcc on both 32-bit and 64-bit Linux, because long int there is the same width as void *. It doesn't work with MSVC, because long int is always 32 bits in MSVC.
If you want it to work on all compilers, you can use %llx and cast your pointer to unsigned long long int, though that is not efficient on 32-bit.
If you want efficiency as well, define a different macro for each case.
