Unclear logic behind pl011_tx_chars() in amba-pl011 Linux kernel module - linux

I'm trying to understand how Linux driver for AMBA serial port (amba-pl011.c) sends characters in non-DMA mode. For port operations, this driver registers only following callbacks:
static struct uart_ops amba_pl011_pops = {
.tx_empty = pl011_tx_empty,
.set_mctrl = pl011_set_mctrl,
.get_mctrl = pl011_get_mctrl,
.stop_tx = pl011_stop_tx,
.start_tx = pl011_start_tx,
.stop_rx = pl011_stop_rx,
.enable_ms = pl011_enable_ms,
.break_ctl = pl011_break_ctl,
.startup = pl011_startup,
.shutdown = pl011_shutdown,
.flush_buffer = pl011_dma_flush_buffer,
.set_termios = pl011_set_termios,
.type = pl011_type,
.release_port = pl011_release_port,
.request_port = pl011_request_port,
.config_port = pl011_config_port,
.verify_port = pl011_verify_port,
.poll_init = pl011_hwinit,
.poll_get_char = pl011_get_poll_char,
.poll_put_char = pl011_put_poll_char };
As you can see, there's no character sending operation among them, namely, pl011_tx_chars() function is not listed there. Since pl011_tx_chars() is declared static, it is not exposed outside the module. I found that within the module it is called only from pl011_int() function which is an interrupt handler. It is called whenever UART011_TXIS occurs:
if (status & UART011_TXIS) pl011_tx_chars(uap);
The function pl011_tx_chars() itself writes characters from circular buffer to UART01x_DR port until the fifo queue size is reached (function returns then so more data will be written at the next interrupt) or until circular buffer is empty (pl011_stop_tx() is called then). As we can see, pl011_start_tx() and pl011_stop_tx() are listed in AMBA port operations (so they can be called as callbacks despite their local static declaration). Seems reasonable, thing is, these two function do something very simple:
static void pl011_stop_tx(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
uap->im &= ~UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
pl011_dma_tx_stop(uap);
}
static void pl011_start_tx(struct uart_port *port)
{
struct uart_amba_port *uap = (struct uart_amba_port *)port;
if (!pl011_dma_tx_start(uap)) {
uap->im |= UART011_TXIM;
writew(uap->im, uap->port.membase + UART011_IMSC);
}
}
Since I don't have CONFIG_DMA_ENGINE set, pl011_dma_tx_start() and pl011_dma_tx_stop() are just stubs:
static inline void pl011_dma_tx_stop(struct uart_amba_port *uap)
{
}
static inline bool pl011_dma_tx_start(struct uart_amba_port *uap)
{
return false;
}
Seems like the only thing that pl011_start_tx() does is to arm UART011_TXIM interrupt while the only thing that pl011_stop_tx() does is to disarm it. Nothing initiates the transmission!
I looked at serial_core.c - it's the only file where start_tx operation is invoked, in four places (by the registered callback). The most promissing place is uart_write() function. It fills circular buffer with data and calls local static uart_start() function which is very simple:
static void __uart_start(struct tty_struct *tty)
{
struct uart_state *state = tty->driver_data;
struct uart_port *port = state->uart_port;
if (!uart_circ_empty(&state->xmit) && state->xmit.buf &&
!tty->stopped && !tty->hw_stopped)
port->ops->start_tx(port);
}
static void uart_start(struct tty_struct *tty)
{
struct uart_state *state = tty->driver_data;
struct uart_port *port = state->uart_port;
unsigned long flags;
spin_lock_irqsave(&port->lock, flags);
__uart_start(tty);
spin_unlock_irqrestore(&port->lock, flags);
}
As you can see, no one sends initial characters to the UART port, circular buffer is filled and everything is waiting for UART011_TXIS interrupt.
Is it possible that arming UART011_TXIM interrupt instantly emits UART011_TXIS? I looked into DDI0183.pdf (PrimeCell® UART (PL011) Technical Referecne Manual), Chapter 3: Programmers Model, section 3.4: Interrupts, subsection 3.4.3 UARTTXINTR. What it says is:
....
The transmit interrupt changes state when one of the following events occurs:
• If the FIFOs are enabled and the transmit FIFO reaches the programmed trigger
level. When this happens, the transmit interrupt is asserted HIGH. The transmit
interrupt is cleared by writing data to the transmit FIFO until it becomes greater
than the trigger level, or by clearing the interrupt.
• If the FIFOs are disabled (have a depth of one location) and there is no data
present in the transmitters single location, the transmit interrupt is asserted HIGH.
It is cleared by performing a single write to the transmit FIFO, or by clearing the
interrupt.
....
The note below is even more interesting:
....
The transmit interrupt is based on a transition through a level, rather than on the level
itself. When the interrupt and the UART is enabled before any data is written to the
transmit FIFO the interrupt is not set. The interrupt is only set once written data leaves
the single location of the transmit FIFO and it becomes empty.
....
The emphasis above is mine. I don't know if my English is not sufficient, but from the words above I can't find where it states that unlocking transmit interrupt can be used for triggering transmit routine. What am I missing?

The ARM docs say that the PL011 is a "16550-ish" UART. This sort of gets them off the hook for fully specifying its behavior and instead sends you to the 16550 docs, which state in the "FIFO interrupt mode operation" section...
When the XMIT FIFO and transmitter interrupts are enabled (FCR0e1,
IER1e1), XMIT interrupts will occur as follows: A. The transmitter
holding register interrupt (02) occurs when the XMIT FIFO is empty; it
is cleared as soon as the transmitter holding register is written to
(1 to 16 characters may be written to the XMIT FIFO while servicing
this interrupt) or the IIR is read.
So, it appears that if the FIFO and TX holding register are empty and you enable TX interrupts, you should immediately see a TX interrupt that kickstarts the sending process and fills the holding register and then the FIFO. Once those drain back down below the FIFO trigger, then another interrupt will be generated to keep the process going for as long as there is more buffered data to be sent.

Related

linux socket: lifetime of ancillary data for sendmsg

I use cmsg to activate timestamping on linux socket tx.
ssize_t sendWithOptions
(int sd, std::vector<uint8_t> &payload, uint32_t destIP, int flags)
{
msghdr msg { };
.... // filling standard
std::array<uint8_t, CMSG_LEN(sizeof(__u32))> buf;
msg.msg_control = buf.data();
msg.msg_controlen = buf.size();
auto cmsg { CMSG_FIRSTHDR ( &msg ) };
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SO_TIMESTAMPING;
cmsg->cmsg_len = buf.size();
*(reinterpret_cast<__u32>(CMSG_DATA (cmsg)) = static_cast<__u32>(flags);
return sendmsg ( sd, &msg, MSG_DONTWAIT );
}
Leaving the function, "buf" is automatically destroyed, but does sendmsg need this buffer to live longer?
Do I have a guarantee that the function does not need this buffer once it has returned the number of bytes sent.
Except for specific interfaces, it is generally the case that operating system calls do not rely on user-space to maintain data structures affecting their operation after they are finished. The exceptions will be spelled out in the manual pages.
With sendmsg, in particular, you can rely on the call to complete immediately - whether successful or not. It's fine therefore to use a dynamically allocated buffer as you're doing, and destroy it immediately after the call.
As an example of one exception, aio_write(2) is specifically intended to allow user-space to queue a write operation that will be completed asynchronously. For this call, the data is not consumed until it can be successfully written. Hence, you must not modify the data structures provided in the call until you have confirmed it is complete. That caveat is called out in the NOTES section of the manual page:
... The control block must not be changed while the write operation is in progress. The buffer area being written out must not be accessed during the operation or undefined results may occur. The memory areas involved must remain valid.
In summary: check the manual page for the system call. But most of the time, you don't need to worry about it.

How to make mprotect() to make forward progress after handling pagefaulte exception? [duplicate]

I want to write a signal handler to catch SIGSEGV.
I protect a block of memory for read or write using
char *buffer;
char *p;
char a;
int pagesize = 4096;
mprotect(buffer,pagesize,PROT_NONE)
This protects pagesize bytes of memory starting at buffer against any reads or writes.
Second, I try to read the memory:
p = buffer;
a = *p
This will generate a SIGSEGV, and my handler will be called.
So far so good. My problem is that, once the handler is called, I want to change the access write of the memory by doing
mprotect(buffer,pagesize,PROT_READ);
and continue normal functioning of my code. I do not want to exit the function.
On future writes to the same memory, I want to catch the signal again and modify the write rights and then record that event.
Here is the code:
#include <signal.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
char *buffer;
int flag=0;
static void handler(int sig, siginfo_t *si, void *unused)
{
printf("Got SIGSEGV at address: 0x%lx\n",(long) si->si_addr);
printf("Implements the handler only\n");
flag=1;
//exit(EXIT_FAILURE);
}
int main(int argc, char *argv[])
{
char *p; char a;
int pagesize;
struct sigaction sa;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = handler;
if (sigaction(SIGSEGV, &sa, NULL) == -1)
handle_error("sigaction");
pagesize=4096;
/* Allocate a buffer aligned on a page boundary;
initial protection is PROT_READ | PROT_WRITE */
buffer = memalign(pagesize, 4 * pagesize);
if (buffer == NULL)
handle_error("memalign");
printf("Start of region: 0x%lx\n", (long) buffer);
printf("Start of region: 0x%lx\n", (long) buffer+pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+2*pagesize);
printf("Start of region: 0x%lx\n", (long) buffer+3*pagesize);
//if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
handle_error("mprotect");
//for (p = buffer ; ; )
if(flag==0)
{
p = buffer+pagesize/2;
printf("It comes here before reading memory\n");
a = *p; //trying to read the memory
printf("It comes here after reading memory\n");
}
else
{
if (mprotect(buffer + pagesize * 0, pagesize,PROT_READ) == -1)
handle_error("mprotect");
a = *p;
printf("Now i can read the memory\n");
}
/* for (p = buffer;p<=buffer+4*pagesize ;p++ )
{
//a = *(p);
*(p) = 'a';
printf("Writing at address %p\n",p);
}*/
printf("Loop completed\n"); /* Should never happen */
exit(EXIT_SUCCESS);
}
The problem is that only the signal handler runs and I can't return to the main function after catching the signal.
When your signal handler returns (assuming it doesn't call exit or longjmp or something that prevents it from actually returning), the code will continue at the point the signal occurred, reexecuting the same instruction. Since at this point, the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.
So to make it work, you have to call mprotect in the signal handler. Unfortunately, as Steven Schansker notes, mprotect is not async-safe, so you can't safely call it from the signal handler. So, as far as POSIX is concerned, you're screwed.
Fortunately on most implementations (all modern UNIX and Linux variants as far as I know), mprotect is a system call, so is safe to call from within a signal handler, so you can do most of what you want. The problem is that if you want to change the protections back after the read, you'll have to do that in the main program after the read.
Another possibility is to do something with the third argument to the signal handler, which points at an OS and arch specific structure that contains info about where the signal occurred. On Linux, this is a ucontext structure, which contains machine-specific info about the $PC address and other register contents where the signal occurred. If you modify this, you change where the signal handler will return to, so you can change the $PC to be just after the faulting instruction so it won't re-execute after the handler returns. This is very tricky to get right (and non-portable too).
edit
The ucontext structure is defined in <ucontext.h>. Within the ucontext the field uc_mcontext contains the machine context, and within that, the array gregs contains the general register context. So in your signal handler:
ucontext *u = (ucontext *)unused;
unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_RIP];
will give you the pc where the exception occurred. You can read it to figure out what instruction it
was that faulted, and do something different.
As far as the portability of calling mprotect in the signal handler is concerned, any system that follows either the SVID spec or the BSD4 spec should be safe -- they allow calling any system call (anything in section 2 of the manual) in a signal handler.
You've fallen into the trap that all people do when they first try to handle signals. The trap? Thinking that you can actually do anything useful with signal handlers. From a signal handler, you are only allowed to call asynchronous and reentrant-safe library calls.
See this CERT advisory as to why and a list of the POSIX functions that are safe.
Note that printf(), which you are already calling, is not on that list.
Nor is mprotect. You're not allowed to call it from a signal handler. It might work, but I can promise you'll run into problems down the road. Be really careful with signal handlers, they're tricky to get right!
EDIT
Since I'm being a portability douchebag at the moment already, I'll point out that you also shouldn't write to shared (i.e. global) variables without taking the proper precautions.
You can recover from SIGSEGV on linux. Also you can recover from segmentation faults on Windows (you'll see a structured exception instead of a signal). But the POSIX standard doesn't guarantee recovery, so your code will be very non-portable.
Take a look at libsigsegv.
You should not return from the signal handler, as then behavior is undefined. Rather, jump out of it with longjmp.
This is only okay if the signal is generated in an async-signal-safe function. Otherwise, behavior is undefined if the program ever calls another async-signal-unsafe function. Hence, the signal handler should only be established immediately before it is necessary, and disestablished as soon as possible.
In fact, I know of very few uses of a SIGSEGV handler:
use an async-signal-safe backtrace library to log a backtrace, then die.
in a VM such as the JVM or CLR: check if the SIGSEGV occurred in JIT-compiled code. If not, die; if so, then throw a language-specific exception (not a C++ exception), which works because the JIT compiler knew that the trap could happen and generated appropriate frame unwind data.
clone() and exec() a debugger (do not use fork() – that calls callbacks registered by pthread_atfork()).
Finally, note that any action that triggers SIGSEGV is probably UB, as this is accessing invalid memory. However, this would not be the case if the signal was, say, SIGFPE.
There is a compilation problem using ucontext_t or struct ucontext (present in /usr/include/sys/ucontext.h)
http://www.mail-archive.com/arch-general#archlinux.org/msg13853.html

ping pong DMA in kernel

I am using a SoC to DMA data from the programmable logic side (PL) to the processor side (PS). My overall scheme is to allocate two separate dma buffers and ping-pong data to these buffers to the user-space application can access data without collisions. I have successfully done this using single dma transactions in a loop (in a kthread) with each transaction either on the first or second buffer. I was able to notify a poll() method of either the first or second buffer.
Now I would like to investigate scatter-gather and/or cyclic DMA.
struct scatterlist sg[2]
struct dma_async_tx_descriptor *chan_desc;
struct dma_slave_config config;
sg_init_table(sg, ARRAY_SIZE(sg));
addr_0 = (unsigned int *)dma_alloc_coherent(NULL, dma_size, &handle_0, GFP_KERNEL);
addr_1 = (unsigned int *)dma_alloc_coherent(NULL, dma_size, &handle_1, GFP_KERNEL);
sg_dma_address(&sg[0]) = handle_0;
sg_dma_len(&sg[0]) = dma_size;
sg_dma_address(&sg[1]) = handle_1;
sg_dma_len(&sg[1]) = dma_size;
memset(&config, 0, sizeof(config));
config.direction = DMA_DEV_TO_MEM;
config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
config.src_maxburst = DMA_MAX_BURST_SIZE / DMA_SLAVE_BUSWIDTH_4_BYTES;
dmaengine_slave_config(dma_dev->chan, &config);
chan_desc = dmaengine_prep_slave_sg(dma_dev->chan, &sg[0], 2, DMA_DEV_TO_MEM, DMA_CTRL_ACK | DMA_PREP_INTERRUPT);
chan_desc->callback = sync_callback;
chan_desc->callback_param = dma_dev->cmp;
init_completion(dma_dev->cmp);
dmaengine_submit(chan_desc);
dma_async_issue_pending(dma_dev->chan);
dma_free_coherent(NULL, dma_size, addr_0, handle_0);
dma_free_coherent(NULL, dma_size, addr_1, handle_1);
This works fine for a single run through the scatterlist and then calls the call_back at sync_callback. My thought was to chain the scatterlist in a loop but I won't get the callback.
IS THERE A WAY to have a callback for each descriptor in the scatterlist? I wondered if I could use dmaengine_prep_slave_cyclic (calls callback after every transaction) but it looks to me like this is for a single buffer when reviewing dmaengine.h. Looking at DMA Engine API Guide it looks like there is another option dmaengine_prep_interleaved_dma using a dma_interleaved_template that sounds interesting but hard to find info about.
In the end I just want to signal the user space in some manner as to what buffer is ready.

linux wake_up_interruptible() having no effect

I am writing a "sleepy" device driver for an Operating Systems class.
The way it works is, the user accesses the device via read()/write().
When the user writes to the device like so: write(fd, &wait, size), the device is put to sleep for the amount of time in seconds of the value of wait. If the wait time expires then driver's write method returns 0 and the program finishes. But if the user reads from the driver while a process is sleeping on a wait queue, then the driver's write method returns immediately with the number of seconds the sleeping process had left to wait before the timeout would have occurred on its own.
Another catch is that 10 instances of the device are created, and each of the 10 devices must be independent of each other. So a read to device 1 must only wake up sleeping processes on device 1.
Much code has been provided, and I have been charged with the task of mainly writing the read() and write() methods for the driver.
The way I have tried to solve the problem of keeping the devices independent of each other is to include two global static arrays of size 10. One of type wait_head_queue_t, and one of type Int(Bool flags). Both of these arrays are initialized once when I open the device via open(). The problem is that when I call wake_up_interruptible(), nothing happens, and the program terminates upon timeout. Here is my write method:
ssize_t sleepy_write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos){
struct sleepy_dev *dev = (struct sleepy_dev *)filp->private_data;
ssize_t retval = 0;
int mem_to_be_copied = 0;
if (mutex_lock_killable(&dev->sleepy_mutex))
{
return -EINTR;
}
// check size
if(count != 4) // user must provide 4 byte Int
{
return EINVAL; // = 22
}
// else if the user provided valid sized input...
else
{
if((mem_to_be_copied = copy_from_user(&long_buff[0], buf, count)))
{
return -EFAULT;
}
// check for negative wait time entered by user
if(long_buff[0] > -1)// "long_buff[]"is global,for now only holds 1 value
{
proc_read_flags[MINOR(dev->cdev.dev)] = 0; //****** flag array
retval = wait_event_interruptible_timeout(wqs[MINOR(dev->cdev.dev)], proc_read_flags[MINOR(dev->cdev.dev)] == 1, long_buff[0] * HZ) / HZ;
proc_read_flags[MINOR(dev->cdev.dev)] = 0; // MINOR numbers for each
// device correspond to array indices
// devices 0 - 9
// "wqs" is array of wait queues
}
else
{
printk(KERN_INFO "user entered negative value for sleep time\n");
}
}
mutex_unlock(&dev->sleepy_mutex);
return retval;}
Unlike the many examples on this topic, I am switching the flag back to zero immediately before the call to wait_event_interruptible_timeout() because flag values seem to be lingering between subsequent runs of the program. Here is the code for my read method:
ssize_t sleepy_read(struct file *filp, char __user *buf, size_t count,
loff_t *f_pos){
struct sleepy_dev *dev = (struct sleepy_dev *)filp->private_data;
ssize_t retval = 0;
if (mutex_lock_killable(&dev->sleepy_mutex))
return -EINTR;
// switch the flag
proc_read_flags[MINOR(dev->cdev.dev)] = 1; // again device minor numbers
// correspond to array indices
// TODO: this is not waking up the process in write!
// wake up the queue
wake_up_interruptible(&wqs[MINOR(dev->cdev.dev)]);
mutex_unlock(&dev->sleepy_mutex);
return retval;}
The way I am trying to test the program is to have two main.c's, one for writing to the device and one for reading from the device, and I just ./a.out them in separate consoles in my ubuntu installation in Virtual Box. Another thing, the way it is set up now, neither the writing or reading a.outs return until timeout occurs. I apologize for the spotty formatting of the code. I'm not sure exactly what is going on here, so any help would be much appreciated! Thanks!
Your write method hold sleepy_mutex while wait event. So read method waits on mutex_lock_killable(&dev->sleepy_mutex) while the mutex become unlocked by the writer. It is occured only when writer's timeout exceeds, and write method returns. It is the behaviour you observe.
Usually, wait_event* is executed outside of any critical section. That can be achieved by using _lock-suffixed variants of such macros, or simply wrapping cond argument of such macros with spinlock acquire/release pair:
int check_cond()
{
int res;
spin_lock(&lock);
res = <cond>;
spin_unlock(&lock);
return res;
}
...
wait_event_interruptible(&wq, check_cond());
Unfortunately, wait_event-family macros cannot be used, when condition checking should be protected with a mutex. In that case, you can use wait_woken() function with manual condition checking code. Or rewrite your code without needs of mutex lock/unlock around condition checking.
For achive "reader wake writer, if it is sleep" functionality, you can adopt code from that answer https://stackoverflow.com/a/29765695/3440745.
Writer code:
//Declare local variable at the beginning of the function
int cflag;
...
// Outside of any critical section(after mutex_unlock())
cflag = proc_read_flags[MINOR(dev->cdev.dev)];
wait_event_interruptible_timeout(&wqs[MINOR(dev->cdev.dev)],
proc_read_flags[MINOR(dev->cdev.dev)] != cflag, long_buff[0]*HZ);
Reader code:
// Mutex holding protects this flag's increment from concurrent one.
proc_read_flags[MINOR(dev->cdev.dev)]++;
wake_up_interruptible_all(&wqs[MINOR(dev->cdev.dev)]);

Atmega8 microcontroller reading from bluetooth

I want to read a byte sent by Bluetooth to the Atmega8 to process it. I found this function online to receive a byte
uint8_t receiveByte()
{
// Wait until a byte has been received
while((UCSRA&(1<<RXC)) == 0);
// Return received data
return UDR;
}
but it doesn't work by say turning a led on if 'a' was sent, so when I changed it and enabled port c to HIGH just before the while loop, and turned it low after it, but port c never gone low - which means this loop is infinite.
so my question is how to fix it or how can I read a byte from Bluetooth module
its atmega8-16pu and I configured it as follows:
/** define the cpu clock frequency*/
#define F_CPU 8000000UL
and fuse = 0xD9C4, from this site http://www.engbedded.com/fusecalc/

Resources