So when I call an ioctl on a device, with an ioctl number, how does it know which function to call?
The ioctl(2) system call enters the kernel through this function in fs/ioctl.c:
SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
{
struct file *filp;
int error = -EBADF;
int fput_needed;
filp = fget_light(fd, &fput_needed);
if (!filp)
goto out;
error = security_file_ioctl(filp, cmd, arg);
if (error)
goto out_fput;
error = do_vfs_ioctl(filp, fd, cmd, arg);
out_fput:
fput_light(filp, fput_needed);
out:
return error;
}
Note that a file descriptor fd is already associated with the call. The kernel then calls fget_light() to look up a filp (roughly, a file pointer, but don't confuse this with the standard I/O FILE * file pointer). The call into security_file_ioctl() checks whether the loaded security module will allow the ioctl (whether by name, as in AppArmor and TOMOYO, or by labels, as in SMACK and SELinux), as well as whether or not the user has the correct capability (capabilities(7)) to make the call. If the call is allowed, then do_vfs_ioctl() is called to either handle common ioctls itself:
switch (cmd) {
case FIOCLEX:
set_close_on_exec(fd, 1);
break;
/* ... */
If none of those common cases match, then the kernel calls a helper routine:
static long vfs_ioctl(struct file *filp, unsigned int cmd,
unsigned long arg)
{
int error = -ENOTTY;
if (!filp->f_op || !filp->f_op->unlocked_ioctl)
goto out;
error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
if (error == -ENOIOCTLCMD)
error = -EINVAL;
out:
return error;
}
Drivers supply their own .unlocked_ioctl function pointer, like this pipe implementation in fs/pipe.c:
const struct file_operations rdwr_pipefifo_fops = {
.llseek = no_llseek,
.read = do_sync_read,
.aio_read = pipe_read,
.write = do_sync_write,
.aio_write = pipe_write,
.poll = pipe_poll,
.unlocked_ioctl = pipe_ioctl,
.open = pipe_rdwr_open,
.release = pipe_rdwr_release,
.fasync = pipe_rdwr_fasync,
};
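The dispatch described above can be modelled in plain user-space C. This is only a sketch: model_file, model_fops, model_vfs_ioctl and demo_ioctl are hypothetical names invented for the illustration, not kernel APIs, and ENOIOCTLCMD is redefined locally because the kernel does not expose it to user space.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ENOIOCTLCMD is kernel-internal; 515 is its value in include/linux/errno.h,
 * repeated here only so the model compiles in user space. */
#ifndef ENOIOCTLCMD
#define ENOIOCTLCMD 515
#endif

struct model_file;

/* Hypothetical, cut-down stand-in for struct file_operations. */
struct model_fops {
    long (*unlocked_ioctl)(struct model_file *filp, unsigned int cmd,
                           unsigned long arg);
};

/* Hypothetical, cut-down stand-in for struct file. */
struct model_file {
    const struct model_fops *f_op;
};

/* Mirrors the control flow of the vfs_ioctl() helper quoted above. */
static long model_vfs_ioctl(struct model_file *filp, unsigned int cmd,
                            unsigned long arg)
{
    long error;

    if (!filp->f_op || !filp->f_op->unlocked_ioctl)
        return -ENOTTY;          /* the driver supplies no ioctl handler */

    error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
    if (error == -ENOIOCTLCMD)
        error = -EINVAL;         /* the handler did not recognise cmd */
    return error;
}

/* Example "driver" handler: it knows exactly one command, number 1. */
static long demo_ioctl(struct model_file *filp, unsigned int cmd,
                       unsigned long arg)
{
    (void)filp;
    if (cmd != 1)
        return -ENOIOCTLCMD;
    return (long)arg * 2;
}

static const struct model_fops demo_fops = { .unlocked_ioctl = demo_ioctl };
```

A file whose f_op points at demo_fops reaches demo_ioctl for every command. A file with no handler gets -ENOTTY, and an unknown command comes back as -EINVAL, as in the helper above.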
There's a map in the kernel. You can register your own ioctl codes if you write a driver.
Edit: I wrote an ATA over Ethernet driver once and implemented a custom ioctl for tuning the driver at runtime.
A simplified explanation:
The file descriptor you pass to ioctl points to the inode structure that represents the device you are going to ioctl.
The inode structure contains the device number dev_t i_rdev, which is used as an index to find the device driver's file_operations structure. In this structure, there is a pointer to the ioctl function defined by the device driver.
You can read Linux Device Drivers, 3rd Edition for a more detailed explanation. It may be a bit outdated, but a good read nevertheless.
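The device number mentioned in this explanation is also visible from user space: stat(2) reports it as st_rdev for device nodes. A minimal sketch, assuming a Linux system with glibc-style major()/minor() macros; device_numbers is a hypothetical helper name.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Fills *maj and *min with the device number of the character or block
 * special file at path. Returns 0 on success, -1 if stat() fails or the
 * path is not a device node. */
static int device_numbers(const char *path, unsigned int *maj,
                          unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISCHR(st.st_mode) && !S_ISBLK(st.st_mode))
        return -1;
    *maj = major(st.st_rdev);
    *min = minor(st.st_rdev);
    return 0;
}
```

On a typical Linux system, calling device_numbers("/dev/null", &maj, &min) should yield major 1 and minor 3, which is the number the kernel uses to find the matching driver when the node is opened.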
I am writing a kernel driver to send/receive data with a PCI Express device. For this first version of the driver I am creating a character device interface where the user can read data using a file.
Background
I want to implement a blocking read where the user requests data and the driver populates a user buffer. In order to block the user's read call, I am using a completion structure.
When the driver is loaded and the user requests a read the driver blocks as expected. If I were to finish the read then everything runs fine.
The problem
In order to be safe, whenever the module is removed I call the complete_all function, just in case someone removes the module or device in the middle of a read transaction.
Neither the remove nor the exit function is called, and both the module and the user application are blocked. I've tried the following three functions (shown with their associated results).
wait_for_completion(&dev->read_complete); //Blocks indefinitely, I need to reset the computer
retval = wait_for_completion_interruptible(&dev->read_complete); //I can kill the user application manually and then remove the driver
retval = wait_for_completion_killable(&dev->read_complete); //Same as interruptible
My expectation is that when the remove function is called I can call complete_all(&dev->read_complete) and the read function will return an error.
In order to remove external factors I've made a repo on github, so if anyone wants to see the behavior for themselves they just need to clone and follow the instructions:
Kernel Module Completion Test
The relevant parts of the module are here (/src/mymodule.c)
typedef struct {
struct cdev cdv;
struct class *cls;
struct device *dev;
struct completion complete;
} mymodule_t;
mymodule_t mymod;
//Sysfs 'mymodule_test' attribute (all but the actual function is left out for brevity)
static ssize_t mymodule_test_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
{
int retval = 0;
int value = 0;
if (sscanf(buf, "%d", &value) == 1)
{
retval = strlen(buf);
}
if (value)
{
printk("Value is: %d\n", value);
if (!completion_done(&mymod.complete))
{
complete(&mymod.complete);
}
printk("Sent Completion\n");
}
return retval;
}
//FOPS (all but 'read' function is left out for brevity)
ssize_t mymodule_read(struct file *filp, char * buf, size_t count, loff_t *f_pos)
{
printk("Read!\n");
if (completion_done(&mymod.complete))
{
reinit_completion(&mymod.complete);
}
printk("Wait for Completion\n");
wait_for_completion_interruptible(&mymod.complete);
printk("After Completion\n");
return 0;
}
static int __init mymodule_init(void)
{
...
//Register class and device
//Configure character driver with fops
init_completion(&mymod.complete);
...
}
static void __exit mymodule_exit(void)
{
...
if (!completion_done(&mymod.complete))
{
printk("Send a completion!\n");
complete(&mymod.complete);
}
//Clean up the rest of the module
...
}
module_init(mymodule_init);
module_exit(mymodule_exit);
Here is the userland application I use to exercise this:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <termios.h>
#include "mymodule.h"
#define FILEPATH "/dev/mymodule0"
#define TEST_SIZE 10
int main(void)
{
int fn = -1;
char buf[TEST_SIZE];
printf("Attempting to open file module file...\n");
fn = open(FILEPATH, O_RDWR);
if (fn < 0)
{
printf("Failed to open file!\n");
return -1;
}
printf("Attempting to read from the file...\n");
read(fn, &buf, TEST_SIZE);
printf("Finished reading from file\n");
return 0;
}
Here is the dmesg output when I
load the module
run the user application (it opens the file, attempts to read 10 characters, then exits)
write '1' to the sysfs attribute
unload the module
[3217633.993937] Registering Driver
[3217633.993995] Driver Initialized!
[3217643.747791] Opened!
[3217643.747800] Read!
[3217643.747801] Wait for Completion
[3217646.436780] Value is: 1
[3217646.436792] Sent Completion
[3217646.436806] After Completion
[3217646.437010] Closed!
[3217727.378388] Cleanup Module
[3217727.378393] Check if we need to complete anything
[3217727.378395] Send a completion!
[3217727.378397] Unregistering Character Driver
[3217727.378400] Give back all the numbers we requested
[3217727.378402] Remove the class driver
[3217727.378571] Release the class
[3217727.378593] Finished Cleanup Module, Exiting
If I run the following commands:
load the module
run the user application
unload the module
[3218223.442777] Registering Driver
[3218223.442934] Driver Initialized!
[3218229.378396] Opened!
[3218229.378419] Read!
[3218229.378422] Wait for Completion
then the module doesn't unload. If this were a real device, like a USB hard drive, it is possible that the user could remove the device in the middle of a read transaction. It seems like something is wrong, or perhaps I'm missing something. Am I missing something?
While a USB device can be removed at any time, its driver (i.e. the kernel module) cannot be unloaded during certain operations on that device (e.g. reading). It is the driver that reports the absence of the device to the upper layer (e.g. the filesystem).
I am moving the user-space interaction from sysfs to a node under "/dev", registered with misc_register(), using the ioctl method.
Can we resolve the client structure (struct i2c_client) from the inode? Could somebody please tell me how to get the client structure inside ioctl? I need to do an I2C transfer inside ioctl.
I referred to this link:
http://stackoverflow.com/questions/2635038/inode-to-device-information
but couldn't find an answer.
Could someone please suggest a solution?
You can set this up when your device is opened, in your open function. (This part of the code is copied from one of the mainline drivers, drivers/i2c/i2c-dev.c, to make things easy for you.)
static int my_i2c_device_open(struct inode *inode, struct file *file)
{
unsigned int minor = iminor(inode);
struct i2c_client *client;
struct i2c_adapter *adap;
struct i2c_dev *i2c_dev;
i2c_dev = i2c_dev_get_by_minor(minor);
if (!i2c_dev)
return -ENODEV;
adap = i2c_get_adapter(i2c_dev->adap->nr);
if (!adap)
return -ENODEV;
client = kzalloc(sizeof(*client), GFP_KERNEL);
if (!client) {
i2c_put_adapter(adap);
return -ENOMEM;
}
snprintf(client->name, I2C_NAME_SIZE, "i2c-dev %d", adap->nr);
client->adapter = adap;
file->private_data = client;
return 0;
}
When you call ioctl, you can retrieve the i2c_client from the file pointer of your device:
static long my_i2c_device_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
    struct i2c_client *client = file->private_data;
    /* ... use client for the I2C transfer here ... */
    return 0;
}
Hope this makes your life easy.
This reference might help:
Reason to pass data using struct inode and struct file in Linux device driver programming
You build yourself a structure equivalent to "struct scull_dev" in the example above and store a reference to the i2c_client structure in it. In the ioctl function you can later retrieve the main control structure, and through it the reference to the i2c_client, using container_of.
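The container_of step can be tried out in user space. The sketch below uses a simplified container_of built on offsetof; fake_cdev, fake_i2c_client, my_dev and dev_from_cdev are hypothetical stand-ins, not the real kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space version of the kernel's container_of(): given a
 * pointer to a member, recover a pointer to the enclosing structure. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, cut-down stand-ins for struct cdev and struct i2c_client. */
struct fake_cdev { int dummy; };
struct fake_i2c_client { int addr; };

/* Per-device control structure, playing the role of struct scull_dev. */
struct my_dev {
    struct fake_i2c_client *client;  /* reference stored when the device is set up */
    struct fake_cdev cdev;           /* embedded, as in the scull example */
};

/* What open() would do with inode->i_cdev: map it back to the my_dev. */
static struct my_dev *dev_from_cdev(struct fake_cdev *cdev)
{
    return container_of(cdev, struct my_dev, cdev);
}
```

In a real driver the usual pattern is for open() to store the recovered pointer in filp->private_data, so that the ioctl handler only has to read it back from there.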
I have an exam question and I can't quite see how to solve it.
There is a driver that needs its ioctl method implemented and tested.
I have to write the ioctl() method, the associated test program as well as the common IOCTL definitions.
The ioctl() method should only handle one command. In this command, I need to transmit a data structure from user space to kernel space.
Below is the structure shown:
struct data
{
char label [10];
int value;
};
The driver must print the IOCTL command data, using printk();
Device name is "/dev/mydevice"
The test program must validate driver mode using an initialized data structure.
I hope someone can help.
Thanks in advance.
My suggestion:
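/* Signature of the .unlocked_ioctl hook; the older .ioctl hook, which also
 * took a struct inode *, was removed in Linux 2.6.36. */
static long f_on_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
    struct data pass_data;

    switch (cmd) {
    case PASS_STRUCT:
        /* copy_from_user() returns the number of bytes it could NOT copy,
         * so any non-zero result is a failure. */
        if (copy_from_user(&pass_data, (void __user *)arg, sizeof(pass_data))) {
            printk(KERN_ALERT "PASS_STRUCT: copy_from_user failed\n");
            return -EFAULT;
        }
        /* Guard against a label that is not NUL-terminated. */
        pass_data.label[sizeof(pass_data.label) - 1] = '\0';
        printk(KERN_ALERT "Message PASS_STRUCT : %d and %s\n",
               pass_data.value, pass_data.label);
        break;
    default:
        return -ENOTTY;
    }
    return 0;
}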
Definitions:
Common.h
#define SYSLED_IOC_MAGIC 'k'
#define PASS_STRUCT _IOW(SYSLED_IOC_MAGIC, 1, struct data)
The test program:
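#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include "Common.h" /* PASS_STRUCT; struct data must be declared before use */

int main(void)
{
    /* Arrays cannot be assigned with '=', so initialise the struct instead. */
    struct data data_pass = { .label = "hej", .value = 2 };
    int fd = open("/dev/mydevice", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (ioctl(fd, PASS_STRUCT, &data_pass) < 0)
        perror("ioctl");
    close(fd);
    return 0;
}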
Is this completely wrong??
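One part that can be checked from user space without loading any driver is the command number itself. The sketch below assumes the Linux UAPI header linux/ioctl.h is installed; ioc_fields and decode_ioctl are hypothetical helper names. It decodes PASS_STRUCT with the standard _IOC_* macros to confirm that direction, magic, number and size are encoded as intended.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/ioctl.h>   /* _IOW and the _IOC_* decoding macros */

struct data {
    char label[10];
    int value;
};

#define SYSLED_IOC_MAGIC 'k'
#define PASS_STRUCT _IOW(SYSLED_IOC_MAGIC, 1, struct data)

/* Decoded view of an ioctl command number. */
struct ioc_fields {
    unsigned int dir;   /* _IOC_NONE, _IOC_READ, _IOC_WRITE or both */
    unsigned int type;  /* the "magic" character */
    unsigned int nr;    /* sequence number within that magic */
    size_t size;        /* size of the argument type */
};

/* Splits a command number into the four fields packed by _IOW and friends. */
static struct ioc_fields decode_ioctl(unsigned long cmd)
{
    struct ioc_fields f;

    f.dir = _IOC_DIR(cmd);
    f.type = _IOC_TYPE(cmd);
    f.nr = _IOC_NR(cmd);
    f.size = _IOC_SIZE(cmd);
    return f;
}
```

For PASS_STRUCT the decoded direction should be _IOC_WRITE (user space writes data to the driver), the type 'k', the number 1, and the size equal to sizeof(struct data). The driver and the test program must agree on the layout of struct data for that size to match.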
Hi, I would like to know how it is possible to call/run the following function from user space.
static ssize_t lm70_sense_temp(struct device *dev,
struct device_attribute *attr, char *buf)
{
//some code
.
.
status = sprintf(buf, "%d\n", val); /* millidegrees Celsius */
.
.
//some code
}
This function is defined in the lm70.c driver, located in the kernel/drivers/hwmon folder of the Linux source. Is it possible to pass the values of this function's internal variables to the user application? I would like to retrieve the value of the val variable in the above function...
I don't know the kernel internals well. However, I grepped for lm70_sense_temp in the entire kernel source tree, and it appears only in the file linux-3.7.1/drivers/hwmon/lm70.c, first as a static function, then as the argument to DEVICE_ATTR.
Then I googled for linux kernel DEVICE_ATTR and immediately found device.txt, which shows that you should probably read that value through sysfs, i.e. under /sys; read sysfs-rules.txt as well. So a user application could very probably read something relevant under /sys/.
I'm downvoting your question because I feel that you could have searched a few minutes like I did (and I am not a kernel expert).
You don't need to call this function from user space to get that value - it is already exported to you via sysfs.
You could use grep to find which hwmon device it is:
grep -rl "lm70" /sys/class/hwmon/*/name /sys/class/hwmon/*/*/name
Then you can read the temperature input from your user space program, e.g:
#include <stdio.h>
#include <stdlib.h>   /* atoi */
#include <fcntl.h>
#include <unistd.h>   /* read, close */
#define SENSOR_FILE "/sys/class/hwmon/hwmon0/temp1_input"
int readSensor(void)
{
    int fd, val = -1;
    char buf[32];
    ssize_t n;

    fd = open(SENSOR_FILE, O_RDONLY);
    if (fd < 0) {
        printf("Failed to open %s\n", SENSOR_FILE);
        return val;
    }
    /* Leave room for the terminating NUL that atoi() needs. */
    n = read(fd, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        val = atoi(buf);
        printf("Sensor value = %d\n", val);
    } else {
        printf("Failed to read %s\n", SENSOR_FILE);
    }
    close(fd);
    return val;
}
As others have already stated - you can't call kernel code from user space, thems the breaks.
You cannot call a driver function directly from user space.
If that function is exported with EXPORT_SYMBOL or EXPORT_SYMBOL_GPL then we can write a simple kernel module and call that function directly. The result can be sent to user space through FIFO or shared memory.
But in your case, this function is not exported, so you should not do it this way.
I'm studying system calls in Linux and I read the implementation of the read() system call:
SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
struct file *file;
ssize_t ret = -EBADF;
int fput_needed;
file = fget_light(fd, &fput_needed);
if (file) {
loff_t pos = file_pos_read(file);
ret = vfs_read(file, buf, count, &pos);
file_pos_write(file, pos);
fput_light(file, fput_needed);
}
return ret;
}
This is the definition of fget_light()
struct file *fget_light(unsigned int fd, int *fput_needed)
{
struct file *file;
struct files_struct *files = current->files;
*fput_needed = 0;
if (likely((atomic_read(&files->count) == 1))) {
file = fcheck_files(files, fd);
} else {
rcu_read_lock();
file = fcheck_files(files, fd);
if (file) {
if (atomic_long_inc_not_zero(&file->f_count))
*fput_needed = 1;
else
/* Didn't get the reference, someone's freed */
file = NULL;
}
rcu_read_unlock();
}
return file;
}
Can you explain to me what fget_light() does?
Each task has a file descriptor table. This file descriptor table is indexed by file descriptor number, and contains information (file descriptions) about each open file.
Like many other objects in the kernel, file descriptions are reference-counted. This means that when some part of the kernel wants to access a file description, it has to take a reference, do whatever it needs to do, and release the reference. When the reference count drops to zero, the object can be freed. For file descriptions, open() increments the reference count and close() decrements it, so file descriptions cannot be released while they are open and/or the kernel is using them (e.g: imagine a thread in your process close()ing a file while another thread is still read()ing the file: the file description will not actually be released until the read fput()s its reference).
To get a reference to a file description from a file descriptor, the kernel has the function fget(), and fput() releases that reference. Since several threads may be accessing the same file description at the same time on different CPUs, fget() and fput() must use appropriate locking. In modern times they use RCU; mere readers of the file descriptor table incur no/almost no cost.
But RCU is not enough optimization. Consider that it's very common to have processes which are not multi-threaded. In this case you don't have to worry about other threads from the same process accessing the same file description. The only task with access to our file descriptor table is us. So, as an optimization, fget_light()/fput_light() don't touch the reference count when the current file descriptor table is only used in a single task.
struct file *fget_light(unsigned int fd, int *fput_needed)
{
struct file *file;
/* The file descriptor table for our _current_ task */
struct files_struct *files = current->files;
/* Assume we won't need to touch the reference count,
* since the count won't reach zero (we are not close(),
* and hope we don't run concurrently to close()),
* fput_light() won't actually need to fput().
*/
*fput_needed = 0;
/* Check whether we are actually the only task with access to the fd table */
if (likely((atomic_read(&files->count) == 1))) {
/* Yep, get the reference to the file description */
file = fcheck_files(files, fd);
} else {
/* Nope, we'll need some locking */
rcu_read_lock();
/* Get the reference to the file description */
file = fcheck_files(files, fd);
if (file) {
/* Increment the reference count */
if (atomic_long_inc_not_zero(&file->f_count))
/* fput_light() will actually need to fput() */
*fput_needed = 1;
else
/* Didn't get the reference, someone's freed */
/* Happens if the file was close()d and all the
* other accessors finished their work and called fput().
*/
file = NULL;
}
rcu_read_unlock();
}
return file;
}
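The decision that fget_light() makes can be modelled in user space. This sketch is single-threaded, so it uses plain integers where the kernel uses atomics and RCU; sketch_file, sketch_files, sketch_fget_light and sketch_fput_light are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_FDS 4

/* Hypothetical stand-in for struct file: only the reference count. */
struct sketch_file {
    long f_count;
};

/* Hypothetical stand-in for struct files_struct. */
struct sketch_files {
    int count;                                /* tasks sharing this table */
    struct sketch_file *fd[SKETCH_MAX_FDS];   /* indexed by descriptor number */
};

/* Same decision as fget_light(): take a reference only if the descriptor
 * table is shared with another task. */
static struct sketch_file *sketch_fget_light(struct sketch_files *files,
                                             unsigned int fd, int *fput_needed)
{
    struct sketch_file *file;

    *fput_needed = 0;
    if (fd >= SKETCH_MAX_FDS)
        return NULL;
    file = files->fd[fd];
    if (file && files->count != 1) {
        file->f_count++;        /* the kernel does this atomically under RCU */
        *fput_needed = 1;
    }
    return file;
}

/* Drops the reference only if sketch_fget_light() actually took one. */
static void sketch_fput_light(struct sketch_file *file, int fput_needed)
{
    if (fput_needed)
        file->f_count--;
}
```

With count equal to 1 the lookup leaves f_count untouched and sets fput_needed to 0. With a shared table it increments f_count and sets fput_needed to 1, so the matching sketch_fput_light() call undoes exactly what was done.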
Basically, the function translates the fd passed by the user to the syscall into the kernel-internal file structure pointer by calling fcheck_files(), which looks into the file table of the process (that would be its files parameter).