I found a function called task_running in the kernel, whose logic is as follows:
static inline int task_current(struct rq *rq, struct task_struct *p)
{
return rq->curr == p;
}
static inline int task_running(struct rq *rq, struct task_struct *p)
{
#ifdef CONFIG_SMP
return p->on_cpu;
#else
return task_current(rq, p);
#endif
}
Is there any difference between rq->curr and p->on_cpu? I think they both indicate that the process is currently running on a CPU. Why is a separate check needed under SMP?
On a multiprocessor system (CONFIG_SMP), the scheduler constantly needs to acquire and release the spinlocks p->pi_lock (for tasks, i.e. struct task_struct) and rq->lock (for runqueues, i.e. struct rq). Since lock contention is one of the main causes of slowdown on multiprocessor systems, the on_cpu field of task_struct was added to avoid having to acquire rq->lock just to look at rq->curr.
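To make the cost difference concrete, here is a minimal sketch (hypothetical helpers, not actual kernel code): rq->curr is only stable while rq->lock is held, whereas p->on_cpu is a single word that can be read without any lock:
static bool check_running_locked(struct rq *rq, struct task_struct *p)
{
    bool running;

    /* rq->curr may change under us unless we hold the runqueue lock */
    raw_spin_lock(&rq->lock);
    running = (rq->curr == p);
    raw_spin_unlock(&rq->lock);
    return running;
}
static bool check_running_lockless(struct task_struct *p)
{
    /* on_cpu is set before a task starts running and cleared after it
     * switches out, so a plain read needs no runqueue lock */
    return p->on_cpu;
}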
On a uniprocessor system (no CONFIG_SMP), there is only one CPU with one runqueue, and there is no need to acquire runqueue locks when checking the running task. Since the scheduler does not do any such locking in that configuration, the on_cpu field can also be dropped from task_struct, saving some memory and eliminating all the code that deals with it. In fact, if you take a look at the definition of struct task_struct, you can see:
struct task_struct {
/* ... */
#ifdef CONFIG_SMP
int on_cpu;
/* ... */
#endif
/* ... */
}
Here's the relevant commit that implemented this optimization. It was part of this patch series by Peter Zijlstra to reduce lock contention in the scheduler (expand the "related" field on the Patchwork page to list everything).
I want to retrieve the sessionid from a task_struct in an eBPF program. I have the following code:
struct task_struct *task;
u32 sessionid;
task = (struct task_struct *)bpf_get_current_task();
sessionid = task->sessionid;
This runs, but the sessionid always ends up being -1. I read in this answer that I can use task_session to retrieve it, but I get an error about invalid memory access. I believe I need to use bpf_probe_read to copy the task_struct that task points to onto the BPF stack, but I can't get it to work. Is there anything I'm missing?
After a bit more digging through the task_struct definition, I realised you could do this:
struct task_struct *task;
struct pid_link pid_link;
struct pid pid;
unsigned int sessionid;

task = (struct task_struct *)bpf_get_current_task();

/* read the session pid_link of the thread-group leader onto the BPF
 * stack, then follow its pid pointer the same way */
bpf_probe_read(&pid_link, sizeof(pid_link), (void *)&task->group_leader->pids[PIDTYPE_SID]);
bpf_probe_read(&pid, sizeof(pid), (void *)pid_link.pid);

/* numbers[0].nr holds the session ID in the initial PID namespace */
sessionid = pid.numbers[0].nr;
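For reference, here is the same lookup wrapped in a self-contained helper, with the group_leader dereference also made explicit through bpf_probe_read. This is a sketch only, and it assumes a pre-4.19 kernel where task_struct still has the pids[] array of struct pid_link (later kernels moved it to signal_struct):
static inline unsigned int get_current_sessionid(void)
{
    struct task_struct *task, *leader;
    struct pid_link link;
    struct pid spid;

    task = (struct task_struct *)bpf_get_current_task();

    /* copy each kernel pointer/struct onto the BPF stack before use */
    bpf_probe_read(&leader, sizeof(leader), &task->group_leader);
    bpf_probe_read(&link, sizeof(link), &leader->pids[PIDTYPE_SID]);
    bpf_probe_read(&spid, sizeof(spid), link.pid);

    /* numbers[0].nr is the ID in the initial PID namespace */
    return spid.numbers[0].nr;
}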
As I was going through the chunk of Linux char driver code below, I came across the structure pointer current in the printk calls.
I want to know which structure current points to and what its members are.
What purpose does this structure serve?
ssize_t sleepy_read (struct file *filp, char __user *buf, size_t count, loff_t *pos)
{
printk(KERN_DEBUG "process %i (%s) going to sleep\n",
current->pid, current->comm);
wait_event_interruptible(wq, flag != 0);
flag = 0;
printk(KERN_DEBUG "awoken %i (%s)\n", current->pid, current->comm);
return 0;
}
It is a pointer to the task_struct of the current process, i.e. the process that issued the system call.
From the docs:
The Current Process
Although kernel modules don't execute sequentially as applications do,
most actions performed by the kernel are related to a specific
process. Kernel code can know the current process driving it by
accessing the global item current, a pointer to struct task_struct,
which as of version 2.4 of the kernel is declared in
<asm/current.h>, included by <linux/sched.h>. The current pointer
refers to the user process currently executing. During the execution
of a system call, such as open or read, the current process is the one
that invoked the call. Kernel code can use process-specific
information by using current, if it needs to do so. An example of this
technique is presented in "Access Control on a Device File", in
Chapter 5, "Enhanced Char Driver Operations".
Actually, current is not properly a global variable any more, like it
was in the first Linux kernels. The developers optimized access to the
structure describing the current process by hiding it in the stack
page. You can look at the details of current in <asm/current.h>. While
the code you'll look at might seem hairy, we must keep in mind that
Linux is an SMP-compliant system, and a global variable simply won't
work when you are dealing with multiple CPUs. The details of the
implementation remain hidden to other kernel subsystems though, and a
device driver can just include <linux/sched.h> and refer to the
current process.
From a module's point of view, current is just like the external
reference printk. A module can refer to current wherever it sees fit.
For example, the following statement prints the process ID and the
command name of the current process by accessing certain fields in
struct task_struct:
printk("The process is \"%s\" (pid %i)\n",
current->comm, current->pid);
The command name stored in current->comm is the base name of the
program file that is being executed by the current process.
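To see what "hiding it in the stack page" means in practice, this is roughly what <asm/current.h> looked like on x86 in the 2.4 era (modern kernels use a per-CPU variable instead):
/* The task_struct sits at the bottom of the 8 KiB kernel stack, so
 * masking the stack pointer with ~8191 yields the current task --
 * one instruction, and inherently per-CPU because every task has
 * its own kernel stack. */
static inline struct task_struct *get_current(void)
{
    struct task_struct *current;
    __asm__("andl %%esp,%0; " : "=r" (current) : "0" (~8191UL));
    return current;
}
#define current get_current()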
Here is the complete structure that current points to (note that this listing is from an old 2.x-era kernel; modern versions of task_struct have many more fields):
task_struct
Each task_struct data structure describes a process or task in the system.
struct task_struct {
/* these are hardcoded - don't touch */
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
long counter;
long priority;
unsigned long signal;
unsigned long blocked; /* bitmap of masked signals */
unsigned long flags; /* per process flags, defined below */
int errno;
long debugreg[8]; /* Hardware debugging registers */
struct exec_domain *exec_domain;
/* various fields */
struct linux_binfmt *binfmt;
struct task_struct *next_task, *prev_task;
struct task_struct *next_run, *prev_run;
unsigned long saved_kernel_stack;
unsigned long kernel_stack_page;
int exit_code, exit_signal;
/* ??? */
unsigned long personality;
int dumpable:1;
int did_exec:1;
int pid;
int pgrp;
int tty_old_pgrp;
int session;
/* boolean value for session group leader */
int leader;
int groups[NGROUPS];
/*
* pointers to (original) parent process, youngest child, younger sibling,
* older sibling, respectively. (p->father can be replaced with
* p->p_pptr->pid)
*/
struct task_struct *p_opptr, *p_pptr, *p_cptr,
*p_ysptr, *p_osptr;
struct wait_queue *wait_chldexit;
unsigned short uid,euid,suid,fsuid;
unsigned short gid,egid,sgid,fsgid;
unsigned long timeout, policy, rt_priority;
unsigned long it_real_value, it_prof_value, it_virt_value;
unsigned long it_real_incr, it_prof_incr, it_virt_incr;
struct timer_list real_timer;
long utime, stime, cutime, cstime, start_time;
/* mm fault and swap info: this can arguably be seen as either
mm-specific or thread-specific */
unsigned long min_flt, maj_flt, nswap, cmin_flt, cmaj_flt, cnswap;
int swappable:1;
unsigned long swap_address;
unsigned long old_maj_flt; /* old value of maj_flt */
unsigned long dec_flt; /* page fault count of the last time */
unsigned long swap_cnt; /* number of pages to swap on next pass */
/* limits */
struct rlimit rlim[RLIM_NLIMITS];
unsigned short used_math;
char comm[16];
/* file system info */
int link_count;
struct tty_struct *tty; /* NULL if no tty */
/* ipc stuff */
struct sem_undo *semundo;
struct sem_queue *semsleeping;
/* ldt for this task - used by Wine. If NULL, default_ldt is used */
struct desc_struct *ldt;
/* tss for this task */
struct thread_struct tss;
/* filesystem information */
struct fs_struct *fs;
/* open file information */
struct files_struct *files;
/* memory management info */
struct mm_struct *mm;
/* signal handlers */
struct signal_struct *sig;
#ifdef __SMP__
int processor;
int last_processor;
int lock_depth; /* Lock depth.
We can context switch in and out
of holding a syscall kernel lock... */
#endif
};
I want to know what the maximum size of a wait queue is. How do I find the upper limit of a wait queue?
According to the source, the wait queue is just a task list, so there is no maximum size:
struct __wait_queue_head {
spinlock_t lock;
struct list_head task_list;
};
typedef struct __wait_queue_head wait_queue_head_t;
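A short sketch of why there is no limit: each sleeping task links its own wait queue entry (typically living on that task's kernel stack) into task_list, so the head never grows and the list length is simply the number of current waiters:
static DECLARE_WAIT_QUEUE_HEAD(wq);
static int flag;

static int sleeper(void)
{
    /* links a wait queue entry for the current task into wq.task_list
     * and sleeps until the condition becomes true */
    return wait_event_interruptible(wq, flag != 0);
}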
I'm thinking of creating a web script that depends on cron jobs, and I'm wondering: could a large number of crontab entries ever damage the server?
Let's say I have 50 cron jobs to be run every day; would that ever hurt the server?
If not, what's the maximum number of cron jobs that can be added on a Linux server with 512 MB of memory?
When you create a new job, the cron daemon calls the function job_add (in job.c); this function allocates memory for the job and adds it to the tail of the job list.
The job is allocated on the heap, so theoretically you're limited only by the RAM installed on your machine (see the sketch of job_add after the struct definitions below).
Some notes from the CRON code:
The job structure:
typedef struct _job {
struct _job *next;
entry *e;
user *u;
} job;
Each user crontab entry is defined by:
typedef struct _entry {
struct _entry *next;
uid_t uid;
gid_t gid;
char **envp;
char *cmd;
bitstr_t bit_decl(minute, MINUTE_COUNT);
bitstr_t bit_decl(hour, HOUR_COUNT);
bitstr_t bit_decl(dom, DOM_COUNT);
bitstr_t bit_decl(month, MONTH_COUNT);
bitstr_t bit_decl(dow, DOW_COUNT);
int flags;
#define DOM_STAR 0x01
#define DOW_STAR 0x02
#define WHEN_REBOOT 0x04
} entry;
And the user struct:
typedef struct _user {
struct _user *next, *prev; /* links */
char *name;
time_t mtime; /* last modtime of crontab */
entry *crontab; /* this person's crontab */
} user;
You can see that these structs do not consume a lot of memory.
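For context, job_add itself is tiny. Lightly simplified from job.c, it just appends a heap-allocated node to a singly linked list, which is why available memory is the only hard limit:
static job *jhead = NULL, *jtail = NULL;

void job_add(entry *e, user *u)
{
    job *j;

    /* if this entry is already queued, do nothing */
    for (j = jhead; j != NULL; j = j->next)
        if (j->e == e && j->u == u)
            return;

    /* build a job queue element on the heap */
    if ((j = (job *)malloc(sizeof(job))) == NULL)
        return;
    j->next = NULL;
    j->e = e;
    j->u = u;

    /* add it to the tail of the list */
    if (jhead == NULL)
        jhead = j;
    else
        jtail->next = j;
    jtail = j;
}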
If you're curious about how cron is implemented, you can look at the code here: cron ubuntu source.