Is it possible to obtain the socket ID in the Linux kernel from an sk_buff struct?
I know I could get the socket using this code:
const struct tcphdr *th = tcp_hdr(skb);
struct sock *sk = __inet_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest);

if (sk) {
    struct socket *sock = sk->sk_socket;
}
Where could I find the ID, and what is the maximum value of this ID?
A socket is a file.
You'll find, inside the struct socket, a struct file *file member.
I recommend looking at this question, specifically the link "things you never should do in the Kernel" in the accepted answer, because I'm worried about why you're trying to retrieve the file descriptor from a socket structure in the kernel (usually, you want to do the exact opposite).
To retrieve the file descriptor for a given file inside the kernel, you'll need to iterate over the fdtable (search for files_fdtable())... this is a tremendous amount of work, especially if there is a huge number of open files.
The maximum value for a file descriptor is bounded by the number of open files allowed for the process (see RLIMIT_NOFILE); the current capacity of the process's fd table can be retrieved with something like:
files_fdtable(current->files)->max_fds;
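For illustration, here is a minimal sketch of that iteration, assuming process context and the current process's table. find_fd_for_file() is a made-up name, and real code needs the RCU read lock held, as below:

#include <linux/fdtable.h>
#include <linux/fs.h>
#include <linux/sched.h>

/* Hypothetical helper: scan the current process's fd table for the
 * descriptor that refers to @target; returns -1 if none is found. */
static int find_fd_for_file(struct file *target)
{
    struct files_struct *files = current->files;
    struct fdtable *fdt;
    unsigned int fd;
    int ret = -1;

    rcu_read_lock();
    fdt = files_fdtable(files);
    for (fd = 0; fd < fdt->max_fds; fd++) {
        if (rcu_dereference(fdt->fd[fd]) == target) {
            ret = fd;
            break;
        }
    }
    rcu_read_unlock();
    return ret;
}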
Related
My objective is to be able to determine how many bytes have been transferred into the write end of a pipe. Perhaps one would need to access the f_pos member of the struct file structure from linux/fs.h associated with this pipe.
[struct file snippet from fs.h]
Is it possible to access this value from a userspace program? Again, I'd just like to be able to determine (perhaps based on the f_pos value) how many bytes are stored in the kernel buffer backing the pipe.
I have a feeling this isn't possible, and that one has to keep reading until read(int fd, void *buf, size_t count) returns fewer bytes than count... at that point, I assume, all bytes have been "emptied out".
The number of bytes available for reading from a pipe can be requested with:

ioctl(fd, FIONREAD, &nbytes);

Here fd is the pipe's file descriptor, and nbytes, where the result will be stored, is an int.
Taken from: man 7 pipe.
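A minimal userspace sketch of this, assuming a freshly created pipe:

#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    int nbytes = 0;

    if (pipe(fds) == -1)
        return 1;
    write(fds[1], "hello", 5);      /* put 5 bytes into the pipe */

    if (ioctl(fds[0], FIONREAD, &nbytes) == 0)
        printf("%d bytes waiting in the pipe\n", nbytes);  /* prints 5 */

    close(fds[0]);
    close(fds[1]);
    return 0;
}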
The number of bytes available for writing is a different story.
I've set up a UDP socket and call sendto() with a different recipient on each call.
I would like to use writev() in order to benefit from scatter/gather I/O, but writev() does not allow me to specify the recipient's address/port the way sendto() does. Any suggestions?
On Linux, there is sendmmsg(2)
The sendmmsg() system call is an extension of sendmsg(2) that allows the caller to transmit multiple messages on a socket using a single system call. (This has performance benefits for some applications.)
The prototype is:
int sendmmsg(int sockfd, struct mmsghdr *msgvec, unsigned int vlen,
             unsigned int flags);

struct mmsghdr {
    struct msghdr msg_hdr;  /* Message header */
    unsigned int  msg_len;  /* Number of bytes transmitted */
};
Since both the address and the I/O vector are specified in struct msghdr, you can both send to multiple destinations and make use of scatter/gather.
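A hedged sketch of how that might look; the destinations and iovecs are supplied by the caller, and error handling is trimmed:

#define _GNU_SOURCE
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send two datagrams, each with its own recipient and its own
 * scatter/gather list, in a single system call. */
int send_two(int sockfd,
             struct sockaddr_in *dst1, struct iovec *iov1, size_t n1,
             struct sockaddr_in *dst2, struct iovec *iov2, size_t n2)
{
    struct mmsghdr msgs[2];

    memset(msgs, 0, sizeof(msgs));
    msgs[0].msg_hdr.msg_name    = dst1;
    msgs[0].msg_hdr.msg_namelen = sizeof(*dst1);
    msgs[0].msg_hdr.msg_iov     = iov1;
    msgs[0].msg_hdr.msg_iovlen  = n1;

    msgs[1].msg_hdr.msg_name    = dst2;
    msgs[1].msg_hdr.msg_namelen = sizeof(*dst2);
    msgs[1].msg_hdr.msg_iov     = iov2;
    msgs[1].msg_hdr.msg_iovlen  = n2;

    return sendmmsg(sockfd, msgs, 2, 0);    /* messages sent, or -1 */
}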
You can use writev to send a coalesced set of buffers to a single end point if you use connect to specify the end point beforehand. From the (OSX) manpage for connect(2):
datagram sockets may use connect() multiple times to change their association
You cannot use writev to send each buffer to a different endpoint.
A potential downside of using connect / writev instead of n separate sendto calls is that the connect is yet another system call per writev.
If the set of recipients is limited (and known in advance) it may be preferable to use a separate socket per recipient and just connect each socket once.
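For that connect-once pattern, a sketch (send_coalesced() is an illustrative name):

#include <sys/socket.h>
#include <sys/uio.h>

/* Coalesce several buffers into one datagram for a single peer:
 * associate the socket with the destination, then gather-write. */
ssize_t send_coalesced(int sockfd, const struct sockaddr *dst, socklen_t len,
                       struct iovec *iov, int iovcnt)
{
    if (connect(sockfd, dst, len) == -1)
        return -1;
    return writev(sockfd, iov, iovcnt);  /* all iovs form one datagram */
}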
In source/arch/x86/kernel/msr.c, the msr_open callback for the character device uses the following construct to extract the minor number of the character device file used:
static int msr_open(struct inode *inode, struct file *file)
{
    unsigned int cpu = iminor(file_inode(file));
    [...]
}
My question is:
Why not directly call iminor with the first argument of the function, like:
unsigned int cpu = iminor(inode);
The construct is used in other callbacks (e.g. read and write) as well, where the inode is not passed as an argument, so I guess this is due to copy/paste, or is there a deeper meaning to it?
An inode is a data structure on a traditional Unix-style file system such as UFS or ext3. An inode stores basic information about a regular file, directory, or other file system object.
- http://www.cyberciti.biz/tips/understanding-unixlinux-filesystem-inodes.html
Same deal here: file_inode(file) just returns the inode behind the struct file, so in msr_open the two forms are equivalent. The file-based form is needed in callbacks such as read and write, which only receive the struct file, and was most likely carried over to msr_open for uniformity.
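For reference, file_inode() (include/linux/fs.h, kernels 3.9 and later) is just an accessor for the inode cached in the struct file:

static inline struct inode *file_inode(const struct file *f)
{
    return f->f_inode;
}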
I am working on a Linux kernel module that requires me to check data right before it is written to a local disk. The data to be written is fetched from a remote disk. Therefore, I know that the data from the fetch is stored in the page cache. I also know that Linux has a data structure that manages block I/O requests in-flight called the bio struct.
The bio struct contains a list of structures called bio_vecs.
struct bio_vec {
    /* pointer to the physical page on which this buffer resides */
    struct page *bv_page;

    /* the length in bytes of this buffer */
    unsigned int bv_len;

    /* the byte offset within the page where the buffer resides */
    unsigned int bv_offset;
};
It has a list of these because the block representation in memory may not be physically contiguous. What I want to do is grab each piece of the buffer using the list of bio_vecs and put them together as one so that I could take an MD5 hash of the block. How do I use the pointer to the page, the length of the buffer and its offset to get the raw data in the buffer? Are there already functions for this or do I have to write my own?
You can use the bio_data(struct bio *bio) function to access the data.
Working with what bio_data() returns can be troublesome since its return type is void * (so "%s" won't work as-is), but that is easily tackled with a little type casting.
Following is a piece of code that will do the job:

char *ptr = (char *)bio_data(bio);
int i;

for (i = 0; i < 4096; i++) {  /* 4096 assuming the bio covers one 4 KB chunk */
    printk("%c", *ptr);
    ptr++;
}
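Note, though, that bio_data() only gives you the first segment. To walk every bio_vec and feed each piece to a hash, as the question asks, a sketch along these lines should work on kernels 3.14 and later; update_hash() is a placeholder for your MD5 update routine:

#include <linux/bio.h>
#include <linux/highmem.h>

extern void update_hash(const void *data, unsigned int len); /* placeholder */

static void hash_bio(struct bio *bio)
{
    struct bio_vec bvec;
    struct bvec_iter iter;

    bio_for_each_segment(bvec, bio, iter) {
        /* map the page, then hash bv_len bytes starting at bv_offset */
        char *buf = kmap_atomic(bvec.bv_page);

        update_hash(buf + bvec.bv_offset, bvec.bv_len);
        kunmap_atomic(buf);
    }
}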
I have a socket server in C/Linux. Each time I create a new socket, it is assigned a file descriptor. I want to use these FDs as unique IDs for each client. If they are guaranteed to always be assigned in increasing order (which is the case on the Ubuntu box I am running), then I could just use them as array indices.
So the question: Are the file descriptors that are assigned from linux sockets guaranteed to always be in increasing order?
Let's look at how this works internally (I'm using kernel 4.1.20). The way file descriptors are allocated in Linux is with __alloc_fd. When you do an open syscall, do_sys_open is called. This routine gets a free file descriptor from get_unused_fd_flags:
long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
{
    ...
    fd = get_unused_fd_flags(flags);
    if (fd >= 0) {
        struct file *f = do_filp_open(dfd, tmp, &op);
get_unused_fd_flags calls __alloc_fd, setting the minimum and maximum fd:
int get_unused_fd_flags(unsigned flags)
{
    return __alloc_fd(current->files, 0, rlimit(RLIMIT_NOFILE), flags);
}
__alloc_fd gets the file descriptor table for the process, and gets the fd as next_fd, which is actually set from the previous time it ran:
int __alloc_fd(struct files_struct *files,
               unsigned start, unsigned end, unsigned flags)
{
    ...
    fd = files->next_fd;
    ...
    if (start <= files->next_fd)
        files->next_fd = fd + 1;
So you can see how file descriptors indeed grow monotonically... up to a certain point. When the fd reaches the maximum, __alloc_fd will try to find the smallest unused file descriptor:
if (fd < fdt->max_fds)
    fd = find_next_zero_bit(fdt->open_fds, fdt->max_fds, fd);
At this point the file descriptors no longer grow monotonically; instead, allocation jumps around looking for free descriptors. After this, if the table gets full, it will be expanded:
error = expand_files(files, fd);
At which point they will grow again monotonically.
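A quick userspace sketch of the non-monotonic case; once a descriptor is closed, the kernel hands back the lowest free slot:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int a = open("/dev/null", O_RDONLY);
    int b = open("/dev/null", O_RDONLY);

    printf("a=%d b=%d\n", a, b);   /* e.g. a=3 b=4 */
    close(a);

    int c = open("/dev/null", O_RDONLY);
    printf("c=%d\n", c);           /* c == a: the freed slot is reused */
    return 0;
}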
Hope this helps
FDs are guaranteed to be unique for the lifetime of the socket. So yes, in theory, you could probably use the FD as an index into an array of clients. However, I'd caution against this for at least a couple of reasons:
As has already been said, there is no guarantee that FDs will be allocated monotonically. accept() would be within its rights to return a highly-numbered FD, which would then make your array inefficient. So short answer to your question: no, they are not guaranteed to be monotonic.
Your server is likely to end up with lots of other open FDs - stdin, stdout and stderr to name but three - so again, your array is wasting space.
I'd recommend some other way of mapping from FDs to clients. Indeed, unless you're going to be dealing with thousands of clients, searching through a list of clients should be fine - it's not really an operation that you should need to do a huge amount.
Do not depend on the monotonicity of file descriptors. Always refer to the remote system via an address:port pair.