linux wake_up_interruptible() having no effect - linux

I am writing a "sleepy" device driver for an Operating Systems class.
The way it works is, the user accesses the device via read()/write().
When the user writes to the device like so: write(fd, &wait, size), the device is put to sleep for the amount of time in seconds of the value of wait. If the wait time expires then driver's write method returns 0 and the program finishes. But if the user reads from the driver while a process is sleeping on a wait queue, then the driver's write method returns immediately with the number of seconds the sleeping process had left to wait before the timeout would have occurred on its own.
Another catch is that 10 instances of the device are created, and each of the 10 devices must be independent of each other. So a read to device 1 must only wake up sleeping processes on device 1.
Much code has been provided, and I have been charged with the task of mainly writing the read() and write() methods for the driver.
The way I have tried to solve the problem of keeping the devices independent of each other is to include two global static arrays of size 10: one of type wait_queue_head_t, and one of type int (Boolean flags). Both of these arrays are initialized once when I open the device via open(). The problem is that when I call wake_up_interruptible(), nothing happens, and the program terminates upon timeout. Here is my write method:
ssize_t sleepy_write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos){
struct sleepy_dev *dev = (struct sleepy_dev *)filp->private_data;
ssize_t retval = 0;
int mem_to_be_copied = 0;
if (mutex_lock_killable(&dev->sleepy_mutex))
{
return -EINTR;
}
// check size
if(count != 4) // user must provide 4 byte Int
{
return EINVAL; // = 22
}
// else if the user provided valid sized input...
else
{
if((mem_to_be_copied = copy_from_user(&long_buff[0], buf, count)))
{
return -EFAULT;
}
// check for negative wait time entered by user
if(long_buff[0] > -1)// "long_buff[]"is global,for now only holds 1 value
{
proc_read_flags[MINOR(dev->cdev.dev)] = 0; //****** flag array
retval = wait_event_interruptible_timeout(wqs[MINOR(dev->cdev.dev)], proc_read_flags[MINOR(dev->cdev.dev)] == 1, long_buff[0] * HZ) / HZ;
proc_read_flags[MINOR(dev->cdev.dev)] = 0; // MINOR numbers for each
// device correspond to array indices
// devices 0 - 9
// "wqs" is array of wait queues
}
else
{
printk(KERN_INFO "user entered negative value for sleep time\n");
}
}
mutex_unlock(&dev->sleepy_mutex);
return retval;}
Unlike the many examples on this topic, I am switching the flag back to zero immediately before the call to wait_event_interruptible_timeout() because flag values seem to be lingering between subsequent runs of the program. Here is the code for my read method:
ssize_t sleepy_read(struct file *filp, char __user *buf, size_t count,
loff_t *f_pos){
struct sleepy_dev *dev = (struct sleepy_dev *)filp->private_data;
ssize_t retval = 0;
if (mutex_lock_killable(&dev->sleepy_mutex))
return -EINTR;
// switch the flag
proc_read_flags[MINOR(dev->cdev.dev)] = 1; // again device minor numbers
// correspond to array indices
// TODO: this is not waking up the process in write!
// wake up the queue
wake_up_interruptible(&wqs[MINOR(dev->cdev.dev)]);
mutex_unlock(&dev->sleepy_mutex);
return retval;}
The way I am trying to test the program is to have two main.c's, one for writing to the device and one for reading from the device, and I just ./a.out them in separate consoles in my ubuntu installation in Virtual Box. Another thing, the way it is set up now, neither the writing or reading a.outs return until timeout occurs. I apologize for the spotty formatting of the code. I'm not sure exactly what is going on here, so any help would be much appreciated! Thanks!

Your write method holds sleepy_mutex while it waits for the event, so the read method blocks in mutex_lock_killable(&dev->sleepy_mutex) until the writer unlocks the mutex. That happens only when the writer's timeout expires and the write method returns. This is the behaviour you observe.
Usually, wait_event* is executed outside of any critical section. That can be achieved by using the _lock-suffixed variants of these macros (for example wait_event_interruptible_lock_irq()), or by simply wrapping the cond argument in a spinlock acquire/release pair:
int check_cond(void)
{
    int res;

    spin_lock(&lock);
    res = <cond>;
    spin_unlock(&lock);
    return res;
}
...
/* The macro takes the wait queue head itself, not a pointer to it. */
wait_event_interruptible(wq, check_cond());
Unfortunately, the wait_event family of macros cannot be used when the condition check must be protected by a mutex, because the condition is evaluated after the task state has been set to sleeping, and taking a mutex may itself sleep. In that case, you can use the wait_woken() function with manual condition-checking code, or rewrite your code so that it does not need a mutex lock/unlock around the condition check.
To achieve the "reader wakes the writer if it is sleeping" functionality, you can adapt the code from this answer: https://stackoverflow.com/a/29765695/3440745.
Writer code:
// Declare a local variable at the beginning of the function
int cflag;
...
// Outside of any critical section (after mutex_unlock())
cflag = proc_read_flags[MINOR(dev->cdev.dev)];
// The macro takes the wait queue head itself, not a pointer to it.
wait_event_interruptible_timeout(wqs[MINOR(dev->cdev.dev)],
    proc_read_flags[MINOR(dev->cdev.dev)] != cflag, long_buff[0]*HZ);
Reader code:
// Holding the mutex protects this increment from a concurrent one.
proc_read_flags[MINOR(dev->cdev.dev)]++;
wake_up_interruptible_all(&wqs[MINOR(dev->cdev.dev)]);
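The same "remember the counter, sleep until it changes or the timeout expires" idea can be tried out in user space. What follows is only an analogue built on POSIX condition variables, not kernel code: struct sleepy, sleepy_wait() and sleepy_wake() are hypothetical names, and unlike the kernel version the mutex is held around the check because pthread_cond_timedwait() releases it atomically while sleeping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical user-space stand-in for one "sleepy" device. */
struct sleepy {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned long   generation;   /* bumped by every wake-up */
};

#define SLEEPY_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Sleeps until sleepy_wake() is called or timeout_ms elapses.
 * Returns 1 if woken by sleepy_wake(), 0 on timeout. */
int sleepy_wait(struct sleepy *s, long timeout_ms)
{
    struct timespec deadline;
    unsigned long seen;
    int woken;

    assert(timeout_ms >= 0);
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s->lock);
    seen = s->generation;             /* plays the role of cflag */
    while (s->generation == seen) {   /* loop guards against spurious wake-ups */
        if (pthread_cond_timedwait(&s->cond, &s->lock, &deadline) == ETIMEDOUT)
            break;
    }
    woken = (s->generation != seen);
    pthread_mutex_unlock(&s->lock);
    return woken;
}

/* Wakes every thread currently sleeping in sleepy_wait(). */
void sleepy_wake(struct sleepy *s)
{
    pthread_mutex_lock(&s->lock);
    s->generation++;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
}
```

Because the waiter compares against the value it saw on entry instead of resetting a flag to zero, a wake-up left over from an earlier run cannot be mistaken for a new one.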

Related

What is the purpose of putting a thread on a wait queue with a condition when only one thread is allowed to enter?

On this request
ssize_t foo_read(struct file *filp, char *buf, size_t count, loff_t *ppos)
{
    foo_dev_t *foo_dev = filp->private_data;

    if (down_interruptible(&foo_dev->sem))
        return -ERESTARTSYS;
    foo_dev->intr = 0;
    outb(DEV_FOO_READ, DEV_FOO_CONTROL_PORT);
    wait_event_interruptible(foo_dev->wait, (foo_dev->intr == 1));
    if (put_user(foo_dev->data, buf)) {
        up(&foo_dev->sem);   /* release the semaphore on the error path too */
        return -EFAULT;
    }
    up(&foo_dev->sem);
    return 1;
}
With this completion
irqreturn_t foo_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
foo->data = inb(DEV_FOO_DATA_PORT);
foo->intr = 1;
wake_up_interruptible(&foo->wait);
return 1;
}
Assuming foo_dev->sem is initially 1, only one thread at a time is allowed to execute the section after down_interruptible(&foo_dev->sem), and it makes sense for threads waiting for that semaphore to be put in a queue. (As I understand it, making foo_dev->sem greater than one would be a problem in that code.)
So if only one thread ever gets through, what is the use of the foo_dev->wait queue? Isn't it possible to suspend the current thread, save its pointer as a global *curr, and wake it up when its request completes?
Yes, it is possible to put a single thread to sleep (using set_current_state() and schedule()) and resume it later (using wake_up_process()).
But this requires writing some code to check the wakeup condition and to handle the case where there is no thread to wake up.
Wait queues provide ready-made functions and macros for waiting on a condition and waking the waiter later, so the resulting code is much shorter: the single macro wait_event_interruptible() checks for the event and puts the thread to sleep, and the single macro wake_up_interruptible() resumes the thread, if there is one.

issue with copy_from_user in kernel

I'm trying to use this function to copy a buffer from user space to one in the kernel.
Both buffers were allocated. I'm using a while loop in case not all the bytes were copied on the first try, but for some reason nothing is copied and the program is stuck in the while loop.
What can be the reasons for that?
void my_copy_from_user(const char* source_buff, char* dest_buff, int size_to_copy){
    int not_copied = size_to_copy;
    int left = size_to_copy;
    while( not_copied ){
        not_copied = copy_from_user(dest_buff, source_buff, left);
        dest_buff += (left - not_copied);
        source_buff += (left - not_copied);
        left = not_copied;
    }
}
It is possible that it is legitimately failing for reasons that you cannot recover from.
Please look at: http://lxr.free-electrons.com/source/arch/x86/lib/usercopy_32.c#L681
unsigned long _copy_from_user(void *to, const void __user *from, unsigned n)
{
if (access_ok(VERIFY_READ, from, n))
n = __copy_from_user(to, from, n);
else
memset(to, 0, n);
return n;
}
This is the underlying implementation for copy_from_user for Linux on x86 processors. It first checks access_ok. If access is not allowed, it will fail and return with n (the number of bytes you requested to copy) immediately. This would cause an infinite loop.
Two points:
I do not think you should invoke copy_from_user in a loop like that. If it fails to copy in kernel mode, there is a reason why. This is a different beast from read() functions when reading from sockets, etc, where you are encouraged to read() in a loop.
Are you sure that you are passing in the correct dest_buff to copy_from_user?
Tips:
Printk all the values and see what's happening. Is left being changed or not? It is likely not.
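To see why the retry loop can never finish, here is a small user-space model. It is only a sketch: model_copy_from_user() is a hypothetical stand-in that mimics the return convention described above (bytes not copied, with the whole request refused when access is denied), and copy_once() shows the usual alternative of making a single attempt and reporting -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for copy_from_user(): returns the number of bytes
 * NOT copied. When access is refused it copies nothing and returns n. */
unsigned long model_copy_from_user(void *to, const void *from,
                                   unsigned long n, int access_ok)
{
    if (!access_ok) {
        memset(to, 0, n);
        return n;
    }
    memcpy(to, from, n);
    return 0;
}

/* Single attempt: any shortfall is reported as -EFAULT, never retried. */
int copy_once(void *to, const void *from, unsigned long n, int access_ok)
{
    if (model_copy_from_user(to, from, n, access_ok) != 0)
        return -EFAULT;
    return 0;
}

/* Runs the question's retry loop against a refused access, capped at
 * max_tries so that it terminates. Returns how many attempts were made. */
int retries_without_progress(unsigned long n, int max_tries)
{
    char dst[16];
    const char src[16] = "abc";
    unsigned long left = n;
    int tries = 0;

    assert(n <= sizeof dst);
    while (left != 0 && tries < max_tries) {
        left = model_copy_from_user(dst, src, left, 0);
        tries++;
    }
    return tries;
}
```

In the model, every refused attempt leaves the remaining count unchanged, so only the artificial cap stops the loop. A real handler would typically return -EFAULT to its caller after the first failed copy.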

Unclear logic behind pl011_tx_chars() in amba-pl011 Linux kernel module

I'm trying to understand how Linux driver for AMBA serial port (amba-pl011.c) sends characters in non-DMA mode. For port operations, this driver registers only following callbacks:
static struct uart_ops amba_pl011_pops = {
.tx_empty = pl011_tx_empty,
.set_mctrl = pl011_set_mctrl,
.get_mctrl = pl011_get_mctrl,
.stop_tx = pl011_stop_tx,
.start_tx = pl011_start_tx,
.stop_rx = pl011_stop_rx,
.enable_ms = pl011_enable_ms,
.break_ctl = pl011_break_ctl,
.startup = pl011_startup,
.shutdown = pl011_shutdown,
.flush_buffer = pl011_dma_flush_buffer,
.set_termios = pl011_set_termios,
.type = pl011_type,
.release_port = pl011_release_port,
.request_port = pl011_request_port,
.config_port = pl011_config_port,
.verify_port = pl011_verify_port,
.poll_init = pl011_hwinit,
.poll_get_char = pl011_get_poll_char,
.poll_put_char = pl011_put_poll_char };
As you can see, there's no character sending operation among them, namely, pl011_tx_chars() function is not listed there. Since pl011_tx_chars() is declared static, it is not exposed outside the module. I found that within the module it is called only from pl011_int() function which is an interrupt handler. It is called whenever UART011_TXIS occurs:
if (status & UART011_TXIS) pl011_tx_chars(uap);
The function pl011_tx_chars() itself writes characters from circular buffer to UART01x_DR port until the fifo queue size is reached (function returns then so more data will be written at the next interrupt) or until circular buffer is empty (pl011_stop_tx() is called then). As we can see, pl011_start_tx() and pl011_stop_tx() are listed in AMBA port operations (so they can be called as callbacks despite their local static declaration). Seems reasonable, thing is, these two function do something very simple:
static void pl011_stop_tx(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
uap->im &= ~UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
pl011_dma_tx_stop(uap);
}
static void pl011_start_tx(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
if (!pl011_dma_tx_start(uap)) {
uap->im |= UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
}
}
Since I don't have CONFIG_DMA_ENGINE set, pl011_dma_tx_start() and pl011_dma_tx_stop() are just stubs:
static inline void pl011_dma_tx_stop(struct uart_amba_port *uap)
{
}
static inline bool pl011_dma_tx_start(struct uart_amba_port *uap)
{
return false;
}
Seems like the only thing that pl011_start_tx() does is to arm UART011_TXIM interrupt while the only thing that pl011_stop_tx() does is to disarm it. Nothing initiates the transmission!
I looked at serial_core.c - it's the only file where the start_tx operation is invoked, in four places (via the registered callback). The most promising place is the uart_write() function. It fills the circular buffer with data and calls the local static uart_start() function, which is very simple:
static void __uart_start(struct tty_struct *tty)
{
struct uart_state *state = tty->driver_data;
struct uart_port *port = state->uart_port;
if (!uart_circ_empty(&state->xmit) && state->xmit.buf &&
!tty->stopped && !tty->hw_stopped)
port->ops->start_tx(port);
}
static void uart_start(struct tty_struct *tty)
{
struct uart_state *state = tty->driver_data;
struct uart_port *port = state->uart_port;
unsigned long flags;
spin_lock_irqsave(&port->lock, flags);
__uart_start(tty);
spin_unlock_irqrestore(&port->lock, flags);
}
As you can see, no one sends initial characters to the UART port, circular buffer is filled and everything is waiting for UART011_TXIS interrupt.
Is it possible that arming the UART011_TXIM interrupt instantly emits UART011_TXIS? I looked into DDI0183.pdf (PrimeCell® UART (PL011) Technical Reference Manual), Chapter 3: Programmers Model, section 3.4: Interrupts, subsection 3.4.3 UARTTXINTR. What it says is:
....
The transmit interrupt changes state when one of the following events occurs:
• If the FIFOs are enabled and the transmit FIFO reaches the programmed trigger
level. When this happens, the transmit interrupt is asserted HIGH. The transmit
interrupt is cleared by writing data to the transmit FIFO until it becomes greater
than the trigger level, or by clearing the interrupt.
• If the FIFOs are disabled (have a depth of one location) and there is no data
present in the transmitters single location, the transmit interrupt is asserted HIGH.
It is cleared by performing a single write to the transmit FIFO, or by clearing the
interrupt.
....
The note below is even more interesting:
....
The transmit interrupt is based on a transition through a level, rather than on the level
itself. When the interrupt and the UART is enabled before any data is written to the
transmit FIFO the interrupt is not set. The interrupt is only set once written data leaves
the single location of the transmit FIFO and it becomes empty.
....
The emphasis above is mine. I don't know if my English is not sufficient, but from the words above I can't find where it states that unlocking transmit interrupt can be used for triggering transmit routine. What am I missing?
The ARM docs say that the PL011 is a "16550-ish" UART. This sort of gets them off the hook for fully specifying its behavior and instead sends you to the 16550 docs, which state in the "FIFO interrupt mode operation" section...
When the XMIT FIFO and transmitter interrupts are enabled (FCR0=1,
IER1=1), XMIT interrupts will occur as follows: A. The transmitter
holding register interrupt (02) occurs when the XMIT FIFO is empty; it
is cleared as soon as the transmitter holding register is written to
(1 to 16 characters may be written to the XMIT FIFO while servicing
this interrupt) or the IIR is read.
So, it appears that if the FIFO and TX holding register are empty and you enable TX interrupts, you should immediately see a TX interrupt that kickstarts the sending process and fills the holding register and then the FIFO. Once those drain back down below the FIFO trigger, then another interrupt will be generated to keep the process going for as long as there is more buffered data to be sent.

Why accessing pthread keys' sequence number is not synchronized in glibc's NPTL implementation?

Recently, when I looked into how thread-local storage is implemented in glibc, I found the following code, which implements the API pthread_key_create():
int
__pthread_key_create (key, destr)
pthread_key_t *key;
void (*destr) (void *);
{
/* Find a slot in __pthread_kyes which is unused. */
for (size_t cnt = 0; cnt < PTHREAD_KEYS_MAX; ++cnt)
{
uintptr_t seq = __pthread_keys[cnt].seq;
if (KEY_UNUSED (seq) && KEY_USABLE (seq)
/* We found an unused slot. Try to allocate it. */
&& ! atomic_compare_and_exchange_bool_acq (&__pthread_keys[cnt].seq,
seq + 1, seq))
{
/* Remember the destructor. */
__pthread_keys[cnt].destr = destr;
/* Return the key to the caller. */
*key = cnt;
/* The call succeeded. */
return 0;
}
}
return EAGAIN;
}
__pthread_keys is a global array accessed by all threads. I don't understand why the read of its member seq is not synchronized as in the following:
uintptr_t seq = __pthread_keys[cnt].seq;
although it is synchronized when modified later.
FYI, __pthread_keys is an array of type struct pthread_key_struct, which is defined as follows:
/* Thread-local data handling. */
struct pthread_key_struct
{
/* Sequence numbers. Even numbers indicated vacant entries. Note
that zero is even. We use uintptr_t to not require padding on
32- and 64-bit machines. On 64-bit machines it helps to avoid
wrapping, too. */
uintptr_t seq;
/* Destructor for the data. */
void (*destr) (void *);
};
Thanks in advance.
In this case, the loop can avoid an expensive lock acquisition. The atomic compare and swap operation done later (atomic_compare_and_exchange_bool_acq) will make sure only one thread can successfully increment the sequence value and return the key to the caller. Other threads reading the same value in the first step will keep looping since the CAS can only succeed for a single thread.
This works because the sequence value alternates between even (empty) and odd (occupied). Incrementing the value to odd prevents other threads from acquiring the slot.
Just reading the value is fewer cycles than the CAS instruction typically, so it makes sense to peek at the value, before doing the CAS.
There are many wait-free and lock-free algorithms that take advantage of the CAS instruction to achieve low-overhead synchronization.
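The peek-then-CAS pattern described above can be sketched with C11 atomics. This is an illustration under assumptions, not glibc's code: SLOT_COUNT, slot_seq, slot_alloc() and slot_free() are hypothetical names, and the usability check that glibc does with KEY_USABLE is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SLOT_COUNT 4   /* hypothetical size; glibc uses PTHREAD_KEYS_MAX */

/* Sequence numbers: even means free, odd means in use. */
static _Atomic uintptr_t slot_seq[SLOT_COUNT];

/* Returns the index of the claimed slot, or -1 if every slot is taken. */
int slot_alloc(void)
{
    for (int i = 0; i < SLOT_COUNT; ++i) {
        /* Cheap peek. A stale value is harmless: the CAS below rechecks it. */
        uintptr_t seq = atomic_load_explicit(&slot_seq[i], memory_order_relaxed);
        uintptr_t want = seq + 1;

        if ((seq & 1) != 0)
            continue;   /* odd: someone already owns this slot */
        /* Only one thread can move the slot from seq to seq + 1. */
        if (atomic_compare_exchange_strong(&slot_seq[i], &seq, want))
            return i;
    }
    return -1;
}

/* Releases a slot previously returned by slot_alloc(). */
void slot_free(int i)
{
    assert(i >= 0 && i < SLOT_COUNT);
    atomic_fetch_add(&slot_seq[i], 1);   /* odd -> even */
}
```

If two threads peek the same even value, one CAS succeeds and the other fails, so the loser simply moves on to the next slot.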

Uninterruptible read/write calls

At some point during my C programming adventures on Linux, I encountered flags (possibly ioctl/fcntl?), that make reads and writes on a file descriptor uninterruptible.
Unfortunately I cannot recall how to do this, or where I read it. Can anyone shed some light?
Update0
To refine my query, I'm after the same blocking and guarantees that fwrite() and fread() provide, sans userspace buffering.
You can avoid EINTR from read() and write() by ensuring all your signal handlers are installed with the SA_RESTART flag of sigaction().
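A minimal sketch of installing such a handler follows. The names on_signal(), install_restarting_handler() and handler_restarts() are made up for illustration; only sigaction() and SA_RESTART come from POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static void on_signal(int signo)
{
    (void)signo;   /* nothing to do; the handler only needs to exist */
}

/* Installs on_signal() for signo with SA_RESTART set.
 * Returns 0 on success, -1 on failure (errno set by sigaction()). */
int install_restarting_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted calls where supported */
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if the current disposition of signo has SA_RESTART set. */
int handler_restarts(int signo)
{
    struct sigaction cur;

    if (sigaction(signo, NULL, &cur) != 0)
        return 0;
    return (cur.sa_flags & SA_RESTART) != 0;
}
```

The flag has to be set for every handler the process installs, since a single handler without it can still make a blocking call fail with EINTR.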
However this does not protect you from short reads / writes. This is only possible by putting the read() / write() into a loop (it does not require an additional buffer beyond the one that must already be supplied to the read() / write() call.)
Such a loop would look like:
/* Returns the number of bytes read. If that is less than `count', then
 * errno == 0 indicates end of file, otherwise errno indicates the error
 * that occurred. Needs <errno.h> and <unistd.h>. */
ssize_t hard_read(int fd, void *buf, size_t count)
{
    ssize_t rv;
    size_t total_read = 0;

    while (total_read < count)
    {
        rv = read(fd, (char *)buf + total_read, count - total_read);
        if (rv == 0)
            errno = 0;
        if (rv < 1)
        {
            if (errno == EINTR)
                continue;
            else
                break;
        }
        total_read += rv;
    }
    return total_read;
}
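The write side can be handled with the same kind of loop. This is a sketch along the same lines, with hard_write() as a hypothetical counterpart name: it retries on EINTR, continues after short writes, and stops on any other error.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes up to `count' bytes. Returns the number of bytes written; a value
 * smaller than `count' means write() failed and errno says why. */
ssize_t hard_write(int fd, const void *buf, size_t count)
{
    size_t total = 0;

    assert(buf != NULL || count == 0);
    while (total < count) {
        ssize_t rv = write(fd, (const char *)buf + total, count - total);

        if (rv < 0 && errno == EINTR)
            continue;       /* interrupted before anything was written */
        if (rv <= 0)
            break;          /* real error, or no progress possible */
        total += (size_t)rv;
    }
    return (ssize_t)total;
}
```

A caller compares the result with count, for example `if (hard_write(fd, data, len) != (ssize_t)len) perror("write");`.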
Do you wish to disable interrupts while reading/writing, or guarantee that nobody else will read/write the file while you are?
For the second, you can use fcntl() with F_GETLK to test for a record lock, F_SETLK to acquire or release one without blocking, and F_SETLKW to acquire one, waiting if necessary. However, since POSIX locks are only advisory, Linux does not enforce them; they are meaningful only between cooperating processes.
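As a rough sketch of the locking call, the helper below places or removes an advisory lock on a whole file. set_whole_file_lock() is a hypothetical name; the struct flock fields and the F_SETLKW command are the standard POSIX ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* type is F_RDLCK, F_WRLCK or F_UNLCK. Blocks until the lock can be
 * placed. Returns 0 on success, -1 on failure with errno set. */
int set_whole_file_lock(int fd, short type)
{
    struct flock fl = {0};

    assert(type == F_RDLCK || type == F_WRLCK || type == F_UNLCK);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;   /* 0 means "through the end of the file" */
    return fcntl(fd, F_SETLKW, &fl);
}
```

Every process that touches the file has to make the same call before reading or writing; a process that skips it is not stopped by the lock.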
The first task involves diving into ring zero and disabling interrupts on your local processor (or all, if you're on an SMP system). Remember to enable them again when you're done!

Resources