Source code for shell commands in linux (C language) - linux

Currently, I'm implementing a simple shell in C language as my term project. I used fork and exec to execute the commands. However, some commands must be executed internally (fork and exec aren't allowed).
Where can I find the source code for the shell commands?

Depends on which shell you want:
bash? zsh? csh?
I'd go with something smaller like the busybox shell: http://busybox.net/downloads/

It depends on the shell command. For commands like cd, all you end up doing is calling chdir(2).
But for things like shell variables (i.e. bash's var=value), the details will greatly depend on the internals of your implementation.
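To illustrate the cd case, here is a minimal sketch of how a shell might dispatch a built-in before falling back to fork and exec. The name run_builtin and its return convention are assumptions made up for this example, not taken from any particular shell.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if argv[0] named a built-in and it was
 * handled here, 0 if the caller should fork/exec the command instead.
 */
int run_builtin(char **argv)
{
    if (argv == NULL || argv[0] == NULL)
        return 0;

    if (strcmp(argv[0], "cd") == 0) {
        /* With no argument, fall back to $HOME as most shells do. */
        const char *target = argv[1] ? argv[1] : getenv("HOME");
        if (target == NULL)
            fprintf(stderr, "cd: HOME not set\n");
        else if (chdir(target) != 0)
            perror("cd");
        return 1;
    }
    return 0;
}
```

The point of running cd inside the shell process is that chdir(2) only affects the calling process; a forked child changing its own directory would leave the shell where it was.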

Take a gander at Linux Application Development by Michael K. Johnson and Erik W. Troan.
In my Edition (2nd) you develop a simple shell (ladsh) as part of some examples (in 10.7) in pipes and process handling. A great educational resource.
Proved very useful for me.
A snippet:
struct childProgram {
pid_t pid; /* 0 if exited */
char ** argv; /* program name and arguments */
};
struct job {
int jobId; /* job number */
int numProgs; /* total number of programs in job */
int runningProgs; /* number of programs running */
char * text; /* name of job */
char * cmdBuf; /* buffer various argv's point into */
pid_t pgrp; /* process group ID for the job */
struct childProgram * progs; /* array of programs in job */
struct job * next; /* to track background commands */
};
void freeJob(struct job * cmd) {
int i;
for (i = 0; i < cmd->numProgs; i++) {
free(cmd->progs[i].argv);
}
free(cmd->progs);
if (cmd->text) free(cmd->text);
free(cmd->cmdBuf);
}
You can find the full source here under ladsh1.c, ladsh2.c and so on.

Related

How to resolve this mistake in Peterson's algorithm for process synchronization

Information
I was reading Tanenbaum's book Modern Operating Systems, and it has a code snippet introducing Peterson's algorithm, a software-only solution for process synchronization.
Here's the snippet.
```
#define FALSE 0
#define TRUE 1
#define N 2 /* number of processes */
int turn; /* whose turn is it? */
int interested[N]; /* all values initially 0 (FALSE) */
void enter_region(int process) /* process is 0 or 1 */
{
int other; /* number of the other process */
other = 1 - process; /* the opposite of process */
interested[process] = TRUE; /* show that you are interested */
turn = process; /*set flag*/
while (turn == process && interested[other] == TRUE); /* null statement */
}
void leave_region(int process) { /* process: who is leaving */
interested[process] = FALSE; /* indicate departure from critical region */
}
```
The question is:
Isn't there a mistake? [Edit] Shouldn't it be turn = other, or is there perhaps another mistake?
This version of the algorithm violates the rules of mutual exclusion.
[Edit]
I think this version violates the rules of mutual exclusion: if the first process sets its interested variable and then stops, and the other process runs, the second process can end up busy-waiting after setting its interested and turn variables, without any need, since no process is in the critical section.
Any answer and help is appreciated. Thanks!
If process 0 sets interested[0], and then process 1 runs enter_region up until the loop, then process 0 will be able to exit the loop because turn == process is no longer true for it. turn, in this case, really means "turn to wait", and protects against exactly the situation you described. In contrast, if the code did turn = other then process 0 would not be able to exit the loop until process 1 started waiting.
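The argument above can be checked with a small experiment. The following is a minimal sketch, not the book's code: it assumes C11 atomics (sequentially consistent by default) in place of the plain int variables, because plain variables give no ordering guarantees on modern compilers and CPUs, and it adds sched_yield() to the wait loop. The names worker and run_peterson_demo are made up for illustration.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int turn;              /* whose turn is it to wait? */
static atomic_int interested[2];     /* both initially 0 (FALSE) */
static long counter;                 /* protected by the Peterson lock */
static long iterations_per_thread;

static void enter_region(int process)
{
    int other = 1 - process;
    atomic_store(&interested[process], 1);
    atomic_store(&turn, process);    /* "I will wait if we collide" */
    while (atomic_load(&turn) == process && atomic_load(&interested[other]))
        sched_yield();               /* busy-wait, but let the other thread run */
}

static void leave_region(int process)
{
    atomic_store(&interested[process], 0);
}

static void *worker(void *arg)
{
    int process = *(int *)arg;
    for (long i = 0; i < iterations_per_thread; i++) {
        enter_region(process);
        counter++;                   /* critical region */
        leave_region(process);
    }
    return NULL;
}

/* Hypothetical driver: runs two workers and returns the final counter. */
long run_peterson_demo(long iterations)
{
    pthread_t threads[2];
    int ids[2] = {0, 1};

    counter = 0;
    iterations_per_thread = iterations;
    atomic_store(&turn, 0);
    atomic_store(&interested[0], 0);
    atomic_store(&interested[1], 0);

    for (int i = 0; i < 2; i++)
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

If mutual exclusion holds, run_peterson_demo(20000) returns 40000, since no increment is lost. Build with -pthread.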

Configure kern.log to give more info about a segfault

Currently I can find in kern.log entries like this:
[6516247.445846] ex3.x[30901]: segfault at 0 ip 0000000000400564 sp 00007fff96ecb170 error 6 in ex3.x[400000+1000]
[6516254.095173] ex3.x[30907]: segfault at 0 ip 0000000000400564 sp 00007fff0001dcf0 error 6 in ex3.x[400000+1000]
[6516662.523395] ex3.x[31524]: segfault at 7fff80000000 ip 00007f2e11e4aa79 sp 00007fff807061a0 error 4 in libc-2.13.so[7f2e11dcf000+180000]
(You see, apps causing segfault are named ex3.x, means exercise 3 executable).
Is there a way to ask kern.log to log the complete path? Something like:
[6...] /home/user/cclass/ex3.x[3...]: segfault at 0 ip 0564 sp 07f70 error 6 in ex3.x[4...]
So I can easily figure out from who (user/student) this ex3.x is?
Thanks!
Beco
That log message comes from the kernel in a fixed format. As per show_signal_msg, it includes only the executable's name without the path, truncated to fit TASK_COMM_LEN (16 bytes, including the terminating NUL). Non-x86 architectures have equivalent code for reporting segmentation faults.
As mentioned by Makyen, without significant changes to the kernel and a recompile, the message given to klogd which is passed to syslog won't have the information you are requesting.
I am not aware of any log transformation or injection functionality in syslog or klogd which would allow you to take the name of the file and run either locate or file on the filesystem in order to find the full path.
The best way to get the information you are looking for is to use crash interception software like apport, abrt, or corekeeper. These tools store the process metadata from the /proc filesystem, including the process's command line, which includes the directory the binary was run from, assuming it was run with a full path rather than found via PATH.
The other more generic way would be to enable core dumps, and then to set /proc/sys/kernel/core_pattern to include %E, in order to have the core file name including the path of the binary.
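As a rough illustration of the kind of /proc metadata mentioned above, the sketch below resolves the /proc/<pid>/exe symlink to get the full path of a running process's binary. It is not how apport, abrt, or corekeeper are implemented; the helper name exe_path_of is an assumption, and the lookup only works while the process still exists, so it cannot be applied after the fact to a PID found in kern.log.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes the path of the executable behind `pid`
 * into buf (NUL-terminated) and returns its length, or -1 on error.
 */
ssize_t exe_path_of(pid_t pid, char *buf, size_t len)
{
    char link[64];
    ssize_t n;

    if (len == 0)
        return -1;
    snprintf(link, sizeof link, "/proc/%ld/exe", (long)pid);
    n = readlink(link, buf, len - 1);   /* readlink does not NUL-terminate */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```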
The short answer is: No, it is not possible without making code changes and recompiling the kernel. The normal solution to this problem is to instruct your students to name their executable <student user name>_ex3.x so that you can easily have this information.
However, it is possible to get the information you desire from other methods. Appleman1234 has provided some alternatives in his answer to this question.
How do we know the answer is "Not possible to get the full path in the kern.log segfault messages without recompiling the kernel":
We look in the kernel source code to find out how the message is produced and if there are any configuration options.
The files in question are part of the kernel source. You can download the entire kernel source as an rpm package (or other type of package) for whatever version of linux/debian you are running from a variety of places.
Specifically, the output that you are seeing is produced from whichever of the following files is for your architecture:
linux/arch/sparc/mm/fault_32.c
linux/arch/sparc/mm/fault_64.c
linux/arch/um/kernel/trap.c
linux/arch/x86/mm/fault.c
An example of the relevant function from one of the files(linux/arch/x86/mm/fault.c):
/*
* Print out info about fatal segfaults, if the show_unhandled_signals
* sysctl is set:
*/
static inline void
show_signal_msg(struct pt_regs *regs, unsigned long error_code,
unsigned long address, struct task_struct *tsk)
{
if (!unhandled_signal(tsk, SIGSEGV))
return;
if (!printk_ratelimit())
return;
printk("%s%s[%d]: segfault at %lx ip %p sp %p error %lx",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), address,
(void *)regs->ip, (void *)regs->sp, error_code);
print_vma_addr(KERN_CONT " in ", regs->ip);
printk(KERN_CONT "\n");
}
From that we see that the process identifier is printed from tsk->comm (where tsk is a struct task_struct *), and the location from regs->ip (where regs is a struct pt_regs *).
Then from linux/include/linux/sched.h
struct task_struct {
...
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- initialized normally by setup_new_exec */
The comment makes it clear that the path for the executable is not stored in the structure.
For regs->ip where struct pt_regs *regs, it is defined in whichever of the following are appropriate for your architecture:
arch/arc/include/asm/ptrace.h
arch/arm/include/asm/ptrace.h
arch/arm64/include/asm/ptrace.h
arch/cris/include/arch-v10/arch/ptrace.h
arch/cris/include/arch-v32/arch/ptrace.h
arch/metag/include/asm/ptrace.h
arch/mips/include/asm/ptrace.h
arch/openrisc/include/asm/ptrace.h
arch/um/include/asm/ptrace-generic.h
arch/x86/include/asm/ptrace.h
arch/xtensa/include/asm/ptrace.h
From there we see that struct pt_regs is defining registers for the architecture. ip is just: unsigned long ip;
Thus, we have to look at what print_vma_addr() does. It is defined in mm/memory.c
/*
* Print the name of a VMA.
*/
void print_vma_addr(char *prefix, unsigned long ip)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
/*
* Do not print if we are in atomic
* contexts (in exception stacks, etc.):
*/
if (preempt_count())
return;
down_read(&mm->mmap_sem);
vma = find_vma(mm, ip);
if (vma && vma->vm_file) {
struct file *f = vma->vm_file;
char *buf = (char *)__get_free_page(GFP_KERNEL);
if (buf) {
char *p;
p = d_path(&f->f_path, buf, PAGE_SIZE);
if (IS_ERR(p))
p = "?";
printk("%s%s[%lx+%lx]", prefix, kbasename(p),
vma->vm_start,
vma->vm_end - vma->vm_start);
free_page((unsigned long)buf);
}
}
up_read(&mm->mmap_sem);
}
Which shows us that a path was available. We would need to check that it was the path, but looking a bit further in the code gives a hint that it might not matter. We need to see what kbasename() did with the path that is passed to it. kbasename() is defined in include/linux/string.h as:
/**
* kbasename - return the last part of a pathname.
*
* @path: path to extract the filename from.
*/
static inline const char *kbasename(const char *path)
{
const char *tail = strrchr(path, '/');
return tail ? tail + 1 : path;
}
Which, even if the full path is available prior to it, chops off everything except for the last part of a pathname, leaving the filename.
Thus, no amount of runtime configuration options will permit printing out the full pathname of the file in the segmentation fault messages you are seeing.
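The effect of kbasename() is easy to reproduce in userspace. This is a small sketch that copies the same strrchr logic under a different name (base_name, chosen here to avoid confusion with the kernel function) so the truncation can be tried on an ordinary path:

```c
#include <string.h>

/* Userspace copy of the kbasename() logic shown above. */
const char *base_name(const char *path)
{
    const char *tail = strrchr(path, '/');   /* last '/' in the path, if any */
    return tail ? tail + 1 : path;
}
```

Passing "/home/user/cclass/ex3.x" yields "ex3.x", which is exactly the part that ends up in the log message.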
NOTE: I've changed all of the links to kernel source to point to archives, rather than the original locations. Those links will get close to the code as it was at the time I wrote this, 2014-09. As should be no surprise, the code does evolve over time, so the code which is current when you're reading this may or may not be similar or perform in the way which is described here.

How to use proc_pid_cmdline in kernel module

I am writing a kernel module to get the list of PIDs with their complete process names. proc_pid_cmdline() gives the complete process name; /proc/*/cmdline uses the same function to get it. (struct task_struct)->comm gives a hint of what the process is, but not the complete path.
I have included the function name, but it gives error because it does not know where to find the function.
How to use proc_pid_cmdline() in a module ?
You are not supposed to call proc_pid_cmdline().
It is a non-public function in fs/proc/base.c:
static int proc_pid_cmdline(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
However, what it does is simple:
get_cmdline(task, m->buf, PAGE_SIZE);
That is not likely to return the full path, though, and it will not be possible to determine the full path in every case. The argv[0] value may be overwritten, the file could be deleted or moved, etc. A process may exec() in a way which obscures the original command line, and all kinds of other maladies.
A scan of my Fedora 20 system /proc/*/cmdline turns up all kinds of less-than-useful results:
-F
BUG:
WARNING: at
WARNING: CPU:
INFO: possible recursive locking detecte
ernel BUG at
list_del corruption
list_add corruption
do_IRQ: stack overflow:
ear stack overflow (cur:
eneral protection fault
nable to handle kernel
ouble fault:
RTNL: assertion failed
eek! page_mapcount(page) went negative!
adness at
NETDEV WATCHDOG
ysctl table check failed
: nobody cared
IRQ handler type mismatch
Machine Check Exception:
Machine check events logged
divide error:
bounds:
coprocessor segment overrun:
invalid TSS:
segment not present:
invalid opcode:
alignment check:
stack segment:
fpu exception:
simd exception:
iret exception:
/var/log/messages
--
/usr/bin/abrt-dump-oops
-xtD
I have managed to solve a version of this problem. I wanted to access the cmdline of all PIDs but within the kernel itself (as opposed to a kernel module as the question states), but perhaps these principles can be applied to kernel modules as well?
What I did was, I added the following function to fs/proc/base.c
int proc_get_cmdline(struct task_struct *task, char * buffer) {
int i;
int ret = proc_pid_cmdline(task, buffer);
for(i = 0; i < ret - 1; i++) {
if(buffer[i] == '\0')
buffer[i] = ' ';
}
return 0;
}
I then added the declaration in include/linux/proc_fs.h
int proc_get_cmdline(struct task_struct *, char *);
At this point, I could access the cmdline of all processes within the kernel.
To access the task_struct, perhaps you could refer to kernel: efficient way to find task_struct by pid?.
Once you have the task_struct, you should be able to do something like:
char cmdline[256];
proc_get_cmdline(task, cmdline);
if(strlen(cmdline) > 0)
printk(" cmdline :%s\n", cmdline);
else
printk(" cmdline :%s\n", task->comm);
I was able to obtain the commandline of all processes this way.
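For comparison, the same NUL-to-space conversion can be tried from userspace, where /proc/<pid>/cmdline exposes the arguments separated by NUL bytes. This is only a sketch under that assumption; the helper name read_cmdline is made up, and it takes the PID as a string so that "self" can be passed as well.

```c
#include <stdio.h>

/*
 * Hypothetical helper: reads /proc/<pid>/cmdline into buf, replaces the
 * NUL separators between arguments with spaces, and returns the length
 * of the resulting string, or -1 on error. `pid` is a string such as
 * "1234" or "self".
 */
long read_cmdline(const char *pid, char *buf, size_t len)
{
    char path[64];
    FILE *f;
    size_t n;

    if (len == 0)
        return -1;
    snprintf(path, sizeof path, "/proc/%s/cmdline", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, len - 1, f);
    fclose(f);

    /* Drop trailing NULs, then turn the remaining separators into spaces. */
    while (n > 0 && buf[n - 1] == '\0')
        n--;
    for (size_t i = 0; i < n; i++)
        if (buf[i] == '\0')
            buf[i] = ' ';
    buf[n] = '\0';
    return (long)n;
}
```

Kernel threads have an empty cmdline, so a result of 0 is the userspace counterpart of the fallback to task->comm above.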
To get the full path of the binary behind a process:
char *exepathp = NULL;
struct file *exe_file = NULL;
struct mm_struct *mm;
char exe_path[1000];
// straight up stolen from get_mm_exe_file
mm = get_task_mm(current); // takes a reference; NULL for kernel threads
if (mm) {
    down_read(&mm->mmap_sem); // lock read
    exe_file = mm->exe_file;
    if (exe_file) get_file(exe_file);
    up_read(&mm->mmap_sem); // unlock read
    mmput(mm); // drop the reference taken by get_task_mm
}
// reduce exe path to a string; d_path may return an ERR_PTR, check with IS_ERR
if (exe_file) {
    exepathp = d_path(&exe_file->f_path, exe_path, sizeof(exe_path));
    fput(exe_file); // drop the reference taken by get_file
}
Here current stands for the task_struct of the process you are interested in (in the kernel, current itself refers to the calling task). The variable exepathp gets the string of the full path. This is slightly different from the process cmd: it is the path of the binary which was loaded to start the process. Combining this path with the process cmd should give you the full path.

Getting the number of files in a specific directory in Linux

Is there any way to get the total number of files in a specific directory without iterating with readdir(3)?
I mean only direct members of a specific directory.
It seems that the only way to get the number of files is to call readdir(3) repeatedly until it returns NULL.
Is there any other way to get the number in O(1)? I need a solution that works in Linux.
Thanks.
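scandir() example, requires dirent.h (plus stdio.h and stdlib.h):
struct dirent **namelist;
int n = scandir(".", &namelist, NULL, alphasort); // "." == current directory.
if (n < 0)
{
perror("scandir");
exit(1);
}
printf("files found = %d\n", n); // the count includes "." and ".."
for (int i = 0; i < n; i++)
free(namelist[i]); // each entry is allocated separately
free(namelist);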
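If only the count is needed, a plain readdir(3) loop avoids allocating and sorting the whole list. A minimal sketch follows; the name count_entries is an assumption for this example, and the work is still O(n) in the number of entries.

```c
#include <dirent.h>
#include <string.h>

/*
 * Hypothetical helper: counts the direct entries of `path`, skipping
 * "." and "..". Returns -1 if the directory cannot be opened.
 */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    long count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);
    return count;
}
```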
I don't think it is possible in O(1).
Just think about inode structure. There's no clue for this.
But if it's OK to get the number of files in a filesystem,
you can use statfs(2).
#include <sys/vfs.h> /* or <sys/statfs.h> */
int statfs(const char *path, struct statfs *buf);
struct statfs {
__SWORD_TYPE f_type; /* type of file system (see below) */
__SWORD_TYPE f_bsize; /* optimal transfer block size */
fsblkcnt_t f_blocks; /* total data blocks in file system */
fsblkcnt_t f_bfree; /* free blocks in fs */
fsblkcnt_t f_bavail; /* free blocks available to
unprivileged user */
fsfilcnt_t f_files; /* total file nodes in file system */
fsfilcnt_t f_ffree; /* free file nodes in fs */
fsid_t f_fsid; /* file system id */
__SWORD_TYPE f_namelen; /* maximum length of filenames */
__SWORD_TYPE f_frsize; /* fragment size (since Linux 2.6) */
__SWORD_TYPE f_spare[5];
};
You can get the number of inodes in use via f_files - f_ffree. Note that this counts files across the entire filesystem, not in a single directory.
BTW, This is very interesting question. so I voted it up.
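The f_files - f_ffree computation can be sketched as below. The helper name inodes_in_use is made up for this example, and some filesystems report 0 for both fields, in which case the result is simply 0.

```c
#include <sys/vfs.h>

/*
 * Hypothetical helper: returns the number of inodes in use on the
 * filesystem containing `path`, or -1 on error.
 */
long long inodes_in_use(const char *path)
{
    struct statfs fs;

    if (statfs(path, &fs) != 0)
        return -1;
    return (long long)fs.f_files - (long long)fs.f_ffree;
}
```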
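In a shell script it's simple:
ls -A directory_path | wc -l
(ll is usually an alias for ls -l, whose output starts with a "total" line that would inflate the count by one.)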

How to find definition of structure when reading c program on linux?

I am reading source code of xl2tpd, and face lots of problems when reading this code. For example I cannot find where the structure lac is defined. How do I find the definition of this structure?
I have used ctags and vim to read this code, but failed to find the structure. I googled and could not find the structure. Is there any method that can make the code reading process more comfortable? That is, I can jump to definition of most variables, functions and structures?
Try cscope with vim. Follow the steps below:
1) Run cscope -R in the xl2tpd directory. It will create the file cscope.out.
2) Open a file where the structure lac is used with vim.
3) Use :cs f g lac. It will list the files where lac is defined.
4) Choose file.h. It contains the definition.
If you are particularly interested in the definition of struct lac, here it is:
struct lac
{
struct lac *next;
struct host *lns; /* LNS's we can connect to */
struct schedule_entry *rsched;
int tun_rws; /* Receive window size (tunnel) */
int call_rws; /* Call rws */
int rxspeed; /* Tunnel rx speed */
int txspeed; /* Tunnel tx speed */
int active; /* Is this connection in active use? */
int hbit; /* Permit hidden AVP's? */
int lbit; /* Use the length field? */
int challenge; /* Challenge authenticate the peer? */
unsigned int localaddr; /* Local IP address */
unsigned int remoteaddr; /* Force remote address to this */
char authname[STRLEN]; /* Who we authenticate as */
char password[STRLEN]; /* Password to authenticate with */
char peername[STRLEN]; /* Force peer name to this */
char hostname[STRLEN]; /* Hostname to report */
char entname[STRLEN]; /* Name of this entry */
int authpeer; /* Authenticate our peer? */
int authself; /* Authenticate ourselves? */
int pap_require; /* Require PAP auth for PPP */
int chap_require; /* Require CHAP auth for PPP */
int pap_refuse; /* Refuse PAP authentication for us */
int chap_refuse; /* Refuse CHAP authentication for us */
int idle; /* Idle timeout in seconds */
int autodial; /* Try to dial immediately? */
int defaultroute; /* Use as default route? */
int redial; /* Redial if disconnected */
int rmax; /* Maximum # of consecutive redials */
int rtries; /* # of tries so far */
int rtimeout; /* Redial every this many # of seconds */
char pppoptfile[STRLEN]; /* File containing PPP options */
int debug;
struct tunnel *t; /* Our tunnel */
struct call *c; /* Our call */
};
When going through third-party code, there are a few tools that I have found invaluable:
Source Navigator
lxr
ctags
and, of course, the oldest and greatest of all: grep
I believe that the Eclipse CDT also allows you to quickly find the definition of any variable you are looking at, but I have not actually used it - I prefer using console programs for my actual C coding.
None of those are vim-based, although at least ctags can be used via vim or emacs. Nevertheless, they can be very useful when exploring a new codebase that you know nothing about...
Are you talking about this?
The source code already comes with a tags file.
Loading any file (common.h in my case) in Vim you can use :tag lac to jump to the first definition of lac or :tselect lac to choose between the 3 occurrences in this project and :tag gconfig to jump to the unique definition of gconfig.
See :help tags.
I'm using vim + cscope and had the same issue. I found a workaround.
In vim, search for the text instead of the definition. For example, in the Linux kernel source code, if you're trying to find "struct file", use this command:
cs find t struct file {
In most cases this quickly gives you the exact definition. Take care not to put quotation marks around the text struct file {.
Hope this helps.

Resources