Linux struct msghdr :: msg_iovlen type - linux

No practical reason, just wondering.
Why in Linux in msghdr struct, they use size_t type for msg_iovlen field? I found it a bit confusing, as size_t usually means "how much bytes".
Btw, in FreeBSD they use u_int for that field, and int is in Posix standard.

See size_t:
size_t is the unsigned integer type of the result of the sizeof operator as well as the sizeof... operator and the alignof operator... size_t can store the maximum size of a theoretically possible object of any type (including array). On many platforms (an exception are systems with segmented addressing)

Related

simple_read_from_buffer/simple_write_to_buffer vs. copy_to_user/copy_from_user

I recently wrote a module implementing these functions.
What is the difference between the two? From my understanding, the copy_..._user functions are more secure. Please correct me if I'm mistaken.
Furthermore, is it a bad idea to mix the two functions in one program? For example, I used simple_read_from_buffer in my misc dev read function, and copy_from_user in my write function.
Edit: I believe I've found the answer to my question from reading fs/libfs.c (I wasn't aware that this was where the source code was located for these functions); from my understanding the simple_...() functions are essentially a wrapper around the copy_...() functions. I think it was appropriate in my case to use copy_from_user for the misc device write function as I needed to validate that the input matched a specific string before returning it to the user buffer.
I will still leave this question open though in case someone has a better explanation or wants to correct me!
simple_read_from_buffer and simple_write_to_buffer are just convenience wrappers around copy_{to,from}_user for when all you need to do is service a read from userspace from a kernel buffer, or service a write from userspace to a kernel buffer.
From my understanding, the copy_..._user functions are more secure.
Neither version is "more secure" than the other. Whether or not one might be more secure depends on the specific use case.
I would say that simple_{read,write}_... could in general be more secure since they do all the appropriate checks for you before copying. If all you need to do is service a read/write to/from a kernel buffer, then using simple_{read,write}_... is surely faster and less error-prone than manually checking and calling copy_{from,to}_user.
Here's a good example where those functions would be useful:
#define SZ 1024
static char kernel_buf[SZ];
static ssize_t dummy_read(struct file *filp, char __user *user_buf, size_t n, loff_t *off)
{
return simple_read_from_buffer(user_buf, n, off, kernel_buf, SZ);
}
static ssize_t dummy_write(struct file *filp, char __user *user_buf, size_t n, loff_t *off)
{
return simple_write_to_buffer(kernel_buf, SZ, off, user_buf, n);
}
It's hard to tell what exactly you need without seeing your module's code, but I would say that you can either:
Use copy_{from,to}_user if you want to control the exact behavior of your function.
Use a return simple_{read,write}_... if you don't need such fine-grained control and you are ok with just returning the standard values produced by those wrappers.

Function for unaligned memory access on ARM

I am working on a project where data is read from memory. Some of this data are integers, and there was a problem accessing them at unaligned addresses. My idea would be to use memcpy for that, i.e.
uint32_t readU32(const void* ptr)
{
uint32_t n;
memcpy(&n, ptr, sizeof(n));
return n;
}
The solution from the project source I found is similar to this code:
uint32_t readU32(const uint32_t* ptr)
{
union {
uint32_t n;
char data[4];
} tmp;
const char* cp=(const char*)ptr;
tmp.data[0] = *cp++;
tmp.data[1] = *cp++;
tmp.data[2] = *cp++;
tmp.data[3] = *cp;
return tmp.n;
}
So my questions:
Isn't the second version undefined behaviour? The C standard says in 6.2.3.2 Pointers, at 7:
A pointer to an object or incomplete type may be converted to a pointer to a different
object or incomplete type. If the resulting pointer is not correctly aligned 57) for the
pointed-to type, the behavior is undefined.
As the calling code has, at some point, used a char* to handle the memory, there must be some conversion from char* to uint32_t*. Isn't the result of that undefined behaviour, then, if the uint32_t* is not corrently aligned? And if it is, there is no point for the function as you could write *(uint32_t*) to fetch the memory. Additionally, I think I read somewhere that the compiler may expect an int* to be aligned correctly and any unaligned int* would mean undefined behaviour as well, so the generated code for this function might make some shortcuts because it may expect the function argument to be aligned properly.
The original code has volatile on the argument and all variables because the memory contents could change (it's a data buffer (no registers) inside a driver). Maybe that's why it does not use memcpy since it won't work on volatile data. But, in which world would that make sense? If the underlying data can change at any time, all bets are off. The data could even change between those byte copy operations. So you would have to have some kind of mutex to synchronize access to this data. But if you have such a synchronization, why would you need volatile?
Is there a canonical/accepted/better solution to this memory access problem? After some searching I come to the conclusion that you need a mutex and do not need volatile and can use memcpy.
P.S.:
# cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 10 (v7l)
BogoMIPS : 1581.05
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10
This code
uint32_t readU32(const uint32_t* ptr)
{
union {
uint32_t n;
char data[4];
} tmp;
const char* cp=(const char*)ptr;
tmp.data[0] = *cp++;
tmp.data[1] = *cp++;
tmp.data[2] = *cp++;
tmp.data[3] = *cp;
return tmp.n;
}
passes the pointer as a uint32_t *. If it's not actually a uint32_t, that's UB. The argument should probably be a const void *.
The use of a const char * in the conversion itself is not undefined behavior. Per 6.3.2.3 Pointers, paragraph 7 of the C Standard (emphasis mine):
A pointer to an object type may be converted to a pointer to a
different object type. If the resulting pointer is not correctly
aligned for the referenced type, the behavior is undefined.
Otherwise, when converted back again, the result shall compare
equal to the original pointer. When a pointer to an object is
converted to a pointer to a character type, the result points to the
lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining
bytes of the object.
The use of volatile with respect to the correct way to access memory/registers directly on your particular hardware would have no canonical/accepted/best solution. Any solution for that would be specific to your system and beyond the scope of standard C.
Implementations are allowed to define behaviors in cases where the Standard does not, and some implementations may specify that all pointer types have the same representation and may be freely cast among each other regardless of alignment, provided that pointers which are actually used to access things are suitably aligned.
Unfortunately, because some obtuse compilers compel the use of "memcpy" as an
escape valve for aliasing issues even when pointers are known to be aligned,
the only way compilers can efficiently process code which needs to make
type-agnostic accesses to aligned storage is to assume that any pointer of a type requiring alignment will always be aligned suitably for such type. As a result, your instinct that approach using uint32_t* is dangerous is spot on. It may be desirable to have compile-time checking to ensure that a function is either passed a void* or a uint32_t*, and not something like a uint16_t* or a double*, but there's no way to declare a function that way without allowing a compiler to "optimize" the function by consolidating the byte accesses into a 32-bit load that will fail if the pointer isn't aligned.

About IOCTL system call

The prototype of IOCTL system call in linux is
int ioctl(struct inode *, struct file *, unsigned int, unsigned long);
All other file operations like read(),write(),llseek(),mmap() etc.. have only struct file * as argument. But, why IOCTL call needs struct inode * to be passed.
Is there any specific use of it?
Which kernel version you are taking about, now ioctl doesn't has the inode pointer as its parameter. Previously there used to be, but I think from 2.6.36 kernel onwards it has been removed.
The prototype of ioctl is, at least according to the manpage, int ioctl(int d, int request, ...);. The ... bit is important - variadic arguments, meaning the remaining arguments depend on the first ones, much like printf. Any use for a struct inode * would stem from the specific ioctl request you're making.

copy_to_user and copy_from_user with structs

I have a simple question: when i have to copy a structure's content from userspace to kernel space for example with an ioctl call (or viceversa) (for simplicity code hasn't error check):
typedef struct my_struct{
int a;
char b;
} my_struct;
Userspace:
my_struct s;
s.a = 11;
s.b = 'X';
ioctl(fd, MY_CMD, &s);
Kernelspace:
int my_ioctl(struct inode *inode, struct file *filp, unsigned int cmd,
unsigned long arg)
{
...
my_struct ks;
copy_from_user(&ks, (void __user *)arg, sizeof(ks));
...
}
i think that size of structure in userspace (variable s) and kernel space (variable ks) could be not the same (without specify the __attribute__((packed))). So is a right thing specifing the number of byte in copy_from_user with sizeof macro? I see that in kernel sources there are some structures that are not declared as packed so, how is ensured the fact that the size will be the same in userspace and kernelspace?
Thank you all!
Why should the layout of a struct be different in kernel space from user space? There is no reason for the compiler to layout data differently.
The exception is if userspace is a 32bit program running on a 64bit kernel. See http://www.x86-64.org/pipermail/discuss/2002-June/002614.html for a tutorial how to deal with this.
The userspace structure should come from kernel header, so struct definition should be the same in user and kernel space. Do you have any real example ?
Of course, if you play with different packing options on two side of an ABI, whatever it is, you are in trouble. The problem here is not sizeof.
If your question is : does packing options affect binary interface, the answer is yes.
If your question is, how can I solve a packing mismatch, please provide more information

64 bit portability issues

All this originated from me poking at a compiler warning message (C4267) when attempting the following line:
const unsigned int nSize = m_vecSomeVec.size();
size() returns a size_t which although typedef'd to unsigned int, is not actually a unsigned int. This I believe have to do with 64 bit portability issues, however can someone explain it a bit better for me? ( I don't just want to disable 64bit warnings.)
It depends on the implementation. std::size_t for example has a minimal required size. But there is no upper limit. To avoid these kind of situations, always use the proper typedef:
const std::vector<T>::size_type nSize = m_vecSomeVec.size();
You will be always on the safe side then.
When compiling for a 64-bit platform, size_t will be a 64-bit type. Because of this, Visual Studio gives warnings about assigning size_ts to ints when 'Detect 64-bit Portability Issues' is enabled.
Visual C++ gets this information about size_t through the __w64 token, e.g. __w64 unsigned int.
Refer the below link for more on 64 bit porting issues..
http://www.viva64.com/en/a/0065/
If size_t is typedef:ed to unsigned int, then of course it is an unsigned int, on your particular platform. But it is abstracted so that you cannot depend on it always being an unsigned int, it might be larger on some other platform.
Probably it has not been made larger since it would cost too much to do so, and e.g. vectors with more than 2^32 items in them are not very common.
Depending on the compiler, int may be 32-bits in 64-bit land.

Resources