What's the functionality of the function pm_runtime_put_sync()?

The function pm_runtime_put_sync() is called in spi-omap2-mcspi.c.
Can somebody please explain what this function call actually does?
Thank you!

It calls __pm_runtime_idle(dev, RPM_GET_PUT) internally, which is documented as:

/**
 * __pm_runtime_idle - Entry point for runtime idle operations.
 * @dev: Device to send idle notification for.
 * @rpmflags: Flag bits.
 *
 * If the RPM_GET_PUT flag is set, decrement the device's usage count and
 * return immediately if it is larger than zero. Then carry out an idle
 * notification, either synchronous or asynchronous.
 *
 * This routine may be called in atomic context if the RPM_ASYNC flag is set,
 * or if pm_runtime_irq_safe() has been called.
 */
int __pm_runtime_idle(struct device *dev, int rpmflags)

Here are the source and Documentation.
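In short: pm_runtime_put_sync() drops the reference your driver holds on the device and, if the usage count reaches zero, synchronously runs the runtime-PM idle callback, which may power the device down. A typical pairing in a driver looks like this sketch (my_driver_transfer is a made-up name, not taken from spi-omap2-mcspi.c):

#include <linux/pm_runtime.h>

static int my_driver_transfer(struct device *dev)
{
    int ret;

    /* Increment the usage count and resume the device if it is suspended. */
    ret = pm_runtime_get_sync(dev);
    if (ret < 0) {
        pm_runtime_put_noidle(dev);   /* undo the count on failure */
        return ret;
    }

    /* ... touch the hardware while it is guaranteed to be powered ... */

    /*
     * Decrement the usage count; if it drops to zero, run the idle
     * callback synchronously (this is the pm_runtime_put_sync() part).
     */
    pm_runtime_put_sync(dev);
    return 0;
}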

rcu_dereference() vs rcu_dereference_protected()?

Can anyone explain what the difference is between rcu_dereference() and rcu_dereference_protected()?
rcu_dereference() contains barrier code, while rcu_dereference_protected() does not.
When should rcu_dereference() be used, and when rcu_dereference_protected()?
In short:
rcu_dereference() should be used on the read side, protected by rcu_read_lock() or similar.
rcu_dereference_protected() should be used on the write side (update side), either by a single writer or under the lock that prevents several writers from concurrently modifying the dereferenced pointer. In such cases the pointer cannot be modified outside of the current thread, so neither compiler barriers nor CPU barriers are needed.
If in doubt, using rcu_dereference() is always safe, and its performance penalty (compared to rcu_dereference_protected()) is low.
The exact description of rcu_dereference_protected() in kernel 4.6:
/**
 * rcu_dereference_protected() - fetch RCU pointer when updates prevented
 * @p: The pointer to read, prior to dereferencing
 * @c: The conditions under which the dereference will take place
 *
 * Return the value of the specified RCU-protected pointer, but omit
 * both the smp_read_barrier_depends() and the READ_ONCE(). This
 * is useful in cases where update-side locks prevent the value of the
 * pointer from changing. Please note that this primitive does -not-
 * prevent the compiler from repeating this reference or combining it
 * with other references, so it should not be used without protection
 * of appropriate locks.
 *
 * This function is only for update-side use. Using this function
 * when protected only by rcu_read_lock() will result in infrequent
 * but very ugly failures.
 */
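A short sketch of the two call sites side by side (struct foo, gp, gp_lock, do_something() and new_foo are all hypothetical):

struct foo {
    int data;
};

struct foo __rcu *gp;        /* RCU-protected global pointer */
DEFINE_SPINLOCK(gp_lock);    /* serializes updates to gp */

/* Read side: gp may change at any time, so rcu_dereference() must be
 * used, and only under rcu_read_lock(). */
rcu_read_lock();
p = rcu_dereference(gp);
if (p)
    do_something(p->data);
rcu_read_unlock();

/* Update side: gp_lock excludes concurrent writers, so the plain read
 * performed by rcu_dereference_protected() is safe. */
spin_lock(&gp_lock);
old = rcu_dereference_protected(gp, lockdep_is_held(&gp_lock));
rcu_assign_pointer(gp, new_foo);
spin_unlock(&gp_lock);
synchronize_rcu();           /* wait for pre-existing readers to finish */
kfree(old);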

How does SIGSTOP work in the Linux kernel?

I am wondering how SIGSTOP works inside the Linux kernel. How is it handled? And how does the kernel stop running when it is handled?
I am familiar with the kernel code base. So, if you can reference kernel functions that will be fine, and in fact that is what I want. I am not looking for high level description from a user's perspective.
I have already instrumented get_signal_to_deliver() with printk() statements (it is compiling right now), but I would like someone to explain things in more detail.
It's been a while since I touched the kernel, but I'll try to give as much detail as possible. I had to look up some of this stuff in various other places, so some details might be a little messy, but I think this gives a good idea of what happens under the hood.
When a signal is raised, the TIF_SIGPENDING flag is set in the thread_info of the target task. Before returning to user mode, the kernel tests this flag with test_thread_flag(TIF_SIGPENDING), which will return true (because a signal is pending).
The exact details of where this happens seem to be architecture dependent, but you can see an example for um:
void interrupt_end(void)
{
    struct pt_regs *regs = &current->thread.regs;

    if (need_resched())
        schedule();
    if (test_thread_flag(TIF_SIGPENDING))
        do_signal(regs);
    if (test_and_clear_thread_flag(TIF_NOTIFY_RESUME))
        tracehook_notify_resume(regs);
}
Anyway, it ends up calling arch_do_signal(), which is also architecture dependent and is defined in the corresponding signal.c file (see the example for x86):
void arch_do_signal(struct pt_regs *regs)
{
    struct ksignal ksig;

    if (get_signal(&ksig)) {
        /* Whee! Actually deliver the signal. */
        handle_signal(&ksig, regs);
        return;
    }

    /* Did we come from a system call? */
    if (syscall_get_nr(current, regs) >= 0) {
        /* Restart the system call - no handlers present */
        switch (syscall_get_error(current, regs)) {
        case -ERESTARTNOHAND:
        case -ERESTARTSYS:
        case -ERESTARTNOINTR:
            regs->ax = regs->orig_ax;
            regs->ip -= 2;
            break;
        case -ERESTART_RESTARTBLOCK:
            regs->ax = get_nr_restart_syscall(regs);
            regs->ip -= 2;
            break;
        }
    }

    /*
     * If there's no signal to deliver, we just put the saved sigmask
     * back.
     */
    restore_saved_sigmask();
}
As you can see, arch_do_signal() calls get_signal(), which is also in signal.c.
The bulk of the work happens inside get_signal(). It's a huge function, but eventually it processes the special case of SIGSTOP here:
if (sig_kernel_stop(signr)) {
    /*
     * The default action is to stop all threads in
     * the thread group. The job control signals
     * do nothing in an orphaned pgrp, but SIGSTOP
     * always works. Note that siglock needs to be
     * dropped during the call to is_orphaned_pgrp()
     * because of lock ordering with tasklist_lock.
     * This allows an intervening SIGCONT to be posted.
     * We need to check for that and bail out if necessary.
     */
    if (signr != SIGSTOP) {
        spin_unlock_irq(&sighand->siglock);

        /* signals can be posted during this window */

        if (is_current_pgrp_orphaned())
            goto relock;

        spin_lock_irq(&sighand->siglock);
    }

    if (likely(do_signal_stop(ksig->info.si_signo))) {
        /* It released the siglock. */
        goto relock;
    }

    /*
     * We didn't actually stop, due to a race
     * with SIGCONT or something like that.
     */
    continue;
}
See the full function here.
do_signal_stop() does the necessary processing to handle SIGSTOP; you can also find it in signal.c. It sets the task state to TASK_STOPPED with set_special_state(TASK_STOPPED), a macro defined in include/linux/sched.h that updates the current process descriptor's state (see the relevant line in signal.c). Further down, it calls freezable_schedule(), which in turn calls schedule(). schedule() calls __schedule() (also in the same file) in a loop until an eligible task is found. __schedule() attempts to find the next task to schedule (next in the code), while the current task is prev. The state of prev is checked, and because it was changed to TASK_STOPPED, deactivate_task() is called, which moves the task from the run queue to the sleep queue:
} else {
    ...
    deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);
    ...
}
deactivate_task() (also in the same file) removes the process from the run queue by clearing the on_rq field of the task_struct and calling dequeue_task(), which moves the process to the new (waiting) queue.
Then, schedule() checks the number of runnable processes and selects the next task to enter the CPU according to the scheduling policies in effect (I think this is a little bit out of scope by now).
At the end of the day, SIGSTOP moves a process from the runnable queue to a waiting queue until that process receives SIGCONT.
Nearly every time there is an interrupt, the kernel suspends some process from running and switches to running the interrupt handler (the only exception being when there is no process running). Likewise, the kernel will suspend processes that run too long without giving up the CPU (and technically that's the same thing: it just originates from the timer interrupt or possibly an IPI). Ordinarily in these cases, the kernel then puts the suspended process back on the run queue and when the scheduling algorithm decides the time is right, it is resumed.
In the case of SIGSTOP, the same basic thing happens: the affected processes are suspended due to the reception of the stop signal. They just don't get put back on the run queue until SIGCONT is sent. Nothing extraordinary here: SIGSTOP is just instructing the kernel to make a process non-runnable until further notice.
[One note: you seemed to imply that the kernel stops running with SIGSTOP. That is of course not the case. Only the SIGSTOPped processes stop running.]
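You can watch this from userspace with a small demo: the parent stops and continues a child and observes the state transitions with waitpid(). A minimal sketch (error checking omitted):

#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int status;
    pid_t pid = fork();

    if (pid == 0) {              /* child: sleep until signaled */
        for (;;)
            pause();
    }

    sleep(1);                    /* crude: let the child reach pause() */

    kill(pid, SIGSTOP);          /* the kernel moves the child off the run queue */
    waitpid(pid, &status, WUNTRACED);
    if (WIFSTOPPED(status))
        printf("child stopped by signal %d\n", WSTOPSIG(status));

    kill(pid, SIGCONT);          /* the child becomes runnable again */
    waitpid(pid, &status, WCONTINUED);
    if (WIFCONTINUED(status))
        printf("child continued\n");

    kill(pid, SIGKILL);          /* clean up */
    waitpid(pid, &status, 0);
    return 0;
}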

Magnetometer sampling with Android NDK through callbacks

I have been using the NDK APIs to obtain samples from the magnetometer through a callback function. According to the documentation, that is, the header files sensor.h and looper.h, I simply have to create an event queue and pass the callback function as an argument.
/*
 * Creates a new sensor event queue and associate it with a looper.
 */
ASensorEventQueue* ASensorManager_createEventQueue(ASensorManager* manager,
        ALooper* looper, int ident, ALooper_callbackFunc callback, void* data);
After creating the queue and enabling the sensor, the callbacks occur and I am able to retrieve samples as events are triggered. So far so good. The problem arises when I call from Java the native function that does the proper initialization and then want to access the retrieved samples (consider N samples) from that initialization function. Again, according to the API, the callbacks keep occurring as long as 1 is returned, and returning 0 stops them.
/**
 * For callback-based event loops, this is the prototype of the function
 * that is called. It is given the file descriptor it is associated with,
 * a bitmask of the poll events that were triggered (typically ALOOPER_EVENT_INPUT),
 * and the data pointer that was originally supplied.
 *
 * Implementations should return 1 to continue receiving callbacks, or 0
 * to have this file descriptor and callback unregistered from the looper.
 */
typedef int (*ALooper_callbackFunc)(int fd, int events, void* data);
The thing is, I can't figure out how these callbacks really work. I mean, after initializing the event queue and so forth, the callbacks occur right away, and it seems there is no way to access the retrieved samples from outside the callback function after returning 0. In fact, the callback function apparently keeps looping until 0 is returned. Is there any way to do this? How does this callback function really work?
This was a silly question from me. Declaring a static array (global or not) to save the data did the trick, obviously. Then just pass the array to Java through JNI inside the callback function after collecting N samples (see the sketch below).
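For completeness, a sketch of that approach (N, g_samples, g_count and sensor_callback are names I made up; the queue pointer is assumed to be passed as the data argument when the queue was created):

#include <android/sensor.h>

#define N 128

static ASensorEvent g_samples[N];   /* static buffer later handed to Java via JNI */
static int g_count = 0;

/* Matches the ALooper_callbackFunc prototype. */
static int sensor_callback(int fd, int events, void* data)
{
    ASensorEventQueue* queue = (ASensorEventQueue*)data;
    ASensorEvent event;

    while (ASensorEventQueue_getEvents(queue, &event, 1) > 0) {
        if (event.type == ASENSOR_TYPE_MAGNETIC_FIELD && g_count < N)
            g_samples[g_count++] = event;
    }

    if (g_count >= N) {
        /* pass g_samples to Java through JNI here, then ... */
        return 0;   /* ... unregister: no more callbacks */
    }
    return 1;       /* keep receiving callbacks */
}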

Proper handling of context data in libaio callbacks?

I'm working with kernel-level async I/O (i.e. libaio.h). Prior to submitting a struct iocb using io_submit(), I set the callback using io_set_callback(), which sticks a function pointer in iocb->data. Finally, I get the completed events using io_getevents() and run each callback.
I'd like to be able to use some context information within the callback (e.g. a submission timestamp). The only method I can think of is to continue using io_getevents(), but have iocb->data point to a struct with both the context and the callback.
Are there any other methods for doing something like this, and is iocb->data guaranteed to be untouched when using io_getevents()? My understanding is that there is another method by which libaio will automatically run callbacks, which would be an issue if iocb->data weren't pointing to a function.
Any clarification here would be nice. The documentation on libaio really seems to be lacking.
One solution, which I would imagine is typical, is to "derive" from iocb and then cast the pointer you get back from io_getevents() to your own struct. Something like this:

struct my_iocb {
    struct iocb cb;     /* must be the first member, so the cast below is valid */
    void* userdata;
    // ... anything else
};
When you issue your jobs, whether you do it one at a time or in a batch, you provide an array of pointers to iocb structs, which means they may point to my_iocb as well.
When you retrieve the notifications back from io_getevents(), you simply cast the io_event::obj pointer to your own type:
struct io_event events[512];
int num_events = io_getevents(ioctx, 1, 512, events, NULL);

for (int i = 0; i < num_events; ++i) {
    struct my_iocb* job = (struct my_iocb*)events[i].obj;
    // .. do stuff with job
}
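Filling the context in at submission time then looks something like this sketch (assuming my_iocb has also been given a struct timespec submitted member; fd and buf are whatever you are reading into):

#include <libaio.h>
#include <time.h>

static int submit_read(io_context_t ioctx, struct my_iocb* job,
                       int fd, void* buf, size_t len)
{
    io_prep_pread(&job->cb, fd, buf, len, 0);
    clock_gettime(CLOCK_MONOTONIC, &job->submitted);  /* context: submission time */

    struct iocb* list[1] = { &job->cb };
    return io_submit(ioctx, 1, list);   /* returns the number of iocbs submitted */
}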
If you don't want to block in io_getevents(), but would rather be notified via a file descriptor (so that you can block in select() or epoll(), which might be more convenient), I would recommend using the (undocumented) eventfd integration.
You can tie an iocb to an eventfd file descriptor with io_set_eventfd(struct iocb *iocb, int eventfd). Whenever the job completes, it increments the eventfd counter by one.
Note that if you use this mechanism, it is very important never to reap more jobs from the io context (with io_getevents()) than the eventfd counter said there were; otherwise you introduce a race condition between reading the eventfd counter and reaping the jobs.
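A sketch of that pattern, continuing the my_iocb example above (error handling omitted; assume at most 64 in-flight jobs so the events array is large enough):

#include <libaio.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

int efd = eventfd(0, 0);

/* Before submitting, tie each iocb to the eventfd. */
io_set_eventfd(&job.cb, efd);

/* Later, e.g. after epoll() reports efd readable: reading the eventfd
 * atomically fetches and resets the completion count. */
uint64_t nready;
read(efd, &nready, sizeof(nready));

/* Reap exactly nready events, no more, to avoid the race above. */
struct io_event events[64];
int n = io_getevents(ioctx, (long)nready, (long)nready, events, NULL);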

ucontext across threads

Are contexts (the objects manipulated by the functions in ucontext.h) allowed to be shared across threads? That is, can I call swapcontext() with the second argument being a context created by makecontext() on another thread? A test program, like the sketch below, seems to show this working on Linux. I can't find documentation one way or the other on this, whereas Windows fibers appear to explicitly support such a use case. Is this safe and OK to do in general? Is it standard POSIX behavior that this should work?
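A minimal version of such a test (all names are mine; error checking omitted):

#include <pthread.h>
#include <stdio.h>
#include <ucontext.h>

static ucontext_t worker_ctx, main_ctx;
static char stack[64 * 1024];

static void worker(void)
{
    printf("running in a context created on another thread\n");
    /* returning here resumes uc_link, i.e. main_ctx */
}

static void* make_ctx(void* arg)
{
    /* Create (but do not run) the context on a second thread. */
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = stack;
    worker_ctx.uc_stack.ss_size = sizeof(stack);
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, make_ctx, NULL);
    pthread_join(t, NULL);

    /* Swap from the main thread into the context made on the other thread. */
    swapcontext(&main_ctx, &worker_ctx);
    printf("back in main\n");
    return 0;
}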
Actually, there was NGPT, a threading library for Linux, which used not the current 1:1 threading model (each user thread is a kernel thread, or LWP) but an M:N threading model (several user threads correspond to another, smaller number of kernel threads).
According to ftp://ftp.uni-duisburg.de/Linux/NGPT/ngpt-0.9.4.tar.gz/ngpt-0.9.4/pth_sched.c:170 (pth_scheduler), it was possible to move user thread contexts between native (kernel) threads:
/*
 * See if the thread is unbound...
 * Break out and schedule if so...
 */
if (current->boundnative == 0)
    break;

/*
 * See if the thread is bound to a different native thread...
 * Break out and schedule if not...
 */
if (current->boundnative == this_sched->lastrannative)
    break;
To save and restore user threads, ucontext can be used (see ftp://ftp.uni-duisburg.de/Linux/NGPT/ngpt-0.9.4.tar.gz/ngpt-0.9.4/pth_mctx.c:64), and it seems this was the preferred method (mcsc):
/*
 * save the current machine context
 */
#if PTH_MCTX_MTH(mcsc)
#define pth_mctx_save(mctx) \
    ( (mctx)->error = errno, \
      getcontext(&(mctx)->uc) )
#elif
....

/*
 * restore the current machine context
 * (at the location of the old context)
 */
#if PTH_MCTX_MTH(mcsc)
#define pth_mctx_restore(mctx) \
    ( errno = (mctx)->error, \
      (void)setcontext(&(mctx)->uc) )
#elif PTH_MCTX_MTH(sjlj)
...

#if PTH_MCTX_MTH(mcsc)
/*
 * VARIANT 1: THE STANDARDIZED SVR4/SUSv2 APPROACH
 *
 * This is the preferred variant, because it uses the standardized
 * SVR4/SUSv2 makecontext(2) and friends which is a facility intended
 * for user-space context switching. The thread creation therefore is
 * straight-foreward.
 */
So even though NGPT is dead and unused, it selected the *context() family for switching user threads, even between kernel threads. I assume that using the *context() family is safe enough on Linux.
There can be some problems when mixing ucontexts and the native threads library. I will consider NPTL, which has been the standard Linux native threading library since glibc 2.4. The main problem is THREAD_SELF, the pointer to the struct pthread of the current thread. TLS (thread-local storage) also works via THREAD_SELF. THREAD_SELF is usually stored in a register (r2 on powerpc, %gs on x86, etc.). get/setcontext might save and restore this register, breaking the internals of the native pthread library (e.g. thread-local storage, thread identification, etc.).
The glibc setcontext does not save/restore the %gs register, in order to stay compatible with pthreads:
/* Restore the FS segment register.  We don't touch the GS register
   since it is used for threads.  */
movl    oFS(%eax), %ecx
movw    %cx, %fs
You should check whether setcontext saves and restores the THREAD_SELF register on the architecture you are interested in. Also, your code may not be portable between OSes and libcs.
From the man page:

    In a System V-like environment, one has the type ucontext_t defined in
    <ucontext.h> and the four functions getcontext(2), setcontext(2),
    makecontext() and swapcontext() that allow user-level context switching
    between multiple threads of control within a process.

Sounds like that's what it's for.
EDIT: although this discussion seems to indicate that you shouldn't be mixing them.
