How to extend MIPS bus error handlers? - linux

I'm writing a Linux driver in MIPS architecture.
there I implement read operation - read some registers content.
Here is the code:
static int proc_seq_show(struct seq_file *seq, void *v)
{
volatile unsigned reg;
//here read registers and pass to user using seq_printf
//1. read reg number 1
//2. read reg number 2
}
int proc_open(struct inode *inode, struct file *filp)
{
return single_open(filp,&proc_seq_show, NULL);
}
static struct file_operation proc_ops = {.read = &seq_read, .open=&seq_open};
My problem is that reading register content sometimes causing kernel oops - bus error, and read operation is prevented. I can't avoid it in advance.
Since this behavior is acceptable I would like to ignore this error and continue to read the other registers.
I saw bus error handler in the kernel (do_be in traps.c), there is an option to add my own entry to the __dbe_table. An entry looks like that:
struct exception_table_entry {unsigned long insn, nextinsn; };
insn is the instruction that cause the error. nextinsn is the next instruction to be performed after exception.
In the driver I declare an entry:
struct exception_table_entry e __attribute__ ((section("__dbe_table"))) = {0,0};
but I don't know how to initialize it. How can I get the instruction address of the risky line in C? how can I get the address of the fix line? I have tried something with labels and addresses of label - but didn't manage to set correctly the exception_table_entry .
The same infrastructure is available in x86, does someone know how they use it?
Any help will be appreciated.

Related

How to access the /proc file system's iiterate function pointer

I am trying to create a simple linux rootkit that can be used to hide processess. The method I have chosen to try is to replace the pointer to "/proc" 's iterate function with a pointer to a custom one that will hide the processess I need. To do this I have to first save a pointer to it's originale iterate function so it can be replaced later. The '/proc' file system's iterate function can be accessed by accessing the 'iterate' function pointer which is a member of it's 'file_operations' structure.
I have tried the following two methods to access it, as can be seen in the code segment as label_1 and label_2, each tested with the other commented out.
static struct file *proc_filp;
static struct proc_dir_entry *test_proc;
static struct proc_dir_entry *proc_root;
static struct file_operations *fs_ops;
int (*proc_iterate) (struct file *, struct dir_context *);
static int __init module_init(void)
{
label_1:
proc_filp = filp_open("/proc", O_RDONLY | O_DIRECTORY, 0);
fs_ops = (struct file_operations *) proc_filp->f_op;
printk(KERN_INFO "file_operations is %p", fs_ops);
proc_iterate = fs_ops->iterate;
printk(KERN_INFO "proc_iterate is %p", proc_iterate);
filp_close(proc_filp, NULL);
label_2:
test_proc = proc_create("test_proc", 0, NULL, &proc_fops);
proc_root = test_proc->parent;
printk(KERN_INFO "proc_root is %p", proc_root);
fs_ops = (struct file_operations *) proc_root->proc_fops;
printk(KERN_INFO "file_operations is %p", fs_ops);
proc_iterate = fs_ops->iterate;
printk(KERN_INFO "proc_iterate is %p", proc_iterate);
remove_proc_entry("test_proc", NULL);
return 0;
}
As per method 1, I open '/proc' as a file, then follow the file pointer to access it's 'struct file_operations' (f_op), and then follow this pointer to try and locate 'iterate'. However, although I am able to successfully access the 'f_op' structure, somehow it's 'iterate' seems to point to NULL. The following dmesg log shows the output of this.
[ 47.707558] file_operations is 00000000b8d10f59
[ 47.707564] proc_iterate is (null)
As per method 2, I create a new proc directory entry, then try to access it's parent directory (which should point to '/proc' itself), and then try to access it's 'struct file_operations' (proc_fops), and then try to follow on to 'iterate'. However, using this method I am not even able to access the '/proc' directory, as 'proc_root = test_proc->parent;' seems to return NULL. This causes a 'kernel NULL pointer dereference' error in the code that follows. The following dmesg log shows the output of this.
[ 212.078552] proc_root is (null)
[ 212.078567] BUG: unable to handle kernel NULL pointer dereference at 000000000000003
Now, I know that things have changed in the linux kernel, and we are not allowed to write at various addresses (like change the iterate pointer to point at a custom function for example) and that this can be overcome by making those pages writable in the kernel, but that will come later in my attempt to create this rootkit. At present I can't even figure out how to even read the original 'iterate' pointer.
So, I have these questions :
[1] Is something wrong in the following code ? How to fix it ?
[2] Is there any other way to access the pointer to the /proc 's iterate function ?
Tested on Arch Linux running linux 4.15.7

How to get the value of RIP(program counter) from inside a system call?

RIP is a register that stores the address of the current instruction. I am writing a system call to get the program counter of a user level process. Taking the following code for an example, the sys_getrip() is the system call that I am working on. How shall I implement it?
The platform that I am working on is Ubuntu 14.04 with Linux kernel 3.14.4.
/*A system call in the kernel level*/
void sys_getrip(){
char *pc =//What's the magic here?
printk(KERN_INFO, "The current program counter of %d is %p", current->pid, pc);
}
/*A test function running in user level*/
void main(void){
...
sys_getrip();
...
...
sys_getrip();
...
}
A similar problem has been posted here: How to print exact value of the program counter in C. I tried the approach in this post by implementing the sys_getrip() as following
void sys_getrip(void){
printk(KERN_INFO "RIP = %p\n", __builtin_return_address(0));
}
In this case, however, different invokes of sys_getrip() in main() gave the same RIP value.

OpenCL: Struct within struct sending to internal function on device side

Got a question about struct handling in OpenCL that I didn't find on here. I've gathered the all of the data I use in a struct, which itself consists of several structs. I want to do the following:
typedef struct tag_OwnStruct
{
float a;
float b;
float c;
int d;
float e;
int f;
}OwnStruct;
typedef struct tag_DataStruct
{
OwnStruct g;
//+ Alot of other structs... not written for simplicity
}DataStruct;
void PrintOwnStruct(OwnStruct* g)
{
printf("Current lane id : %f\n",g->a);
}
__kernel void Test(__global DataStruct *data)
{
PrintOwnStruct(&data->g);
}
So I want, from the given data I get sent in from the host side to the device, to send the reference to a struct inside of it. That doesn't work somehow, and I don't know why. I've tried the same thing in plain C code and it worked..
If I change PrintOwnStruct to :
void PrintOwnStruct(OwnStruct g)
{
printf("Current lane id : %f\n",g.a);
}
and call the function as : PrintOwnStruct(data->g) the code will run on the device side. Is there any other way to do this? Since I'm not sending the reference to the function, is it being passed by value? And shouldn't that be slower than passing function parameters by reference?
So the problem appears (from comments) to be the confusion between __private and __global address spaces, and possibly that the compiler/runtime is not very helpful in informing about the mix of pointers.
void PrintOwnStruct(OwnStruct* g)
{
printf("Current lane id : %f\n",g->a);
}
__kernel void Test(__global DataStruct *data)
{
PrintOwnStruct(&data->g);
}
The __global DataStruct *data is a pointer to something in __global address space [in other words having the same address for all CL threads], the argument to void PrintOwnStruct OwnStruct* g) declares an argument that is pointing to OwnStruct in the default __private address space [in other words on the stack of this thread].
The correct thing to do is to maintain the address space for both pointers to __global by declaring the function PrintOwnStruct(__global OwnStruct* g).
I'm pretty sure SOME OpenCL compilers will give an error for this, but apparently not this one. I expect that true syntax errors, such as adding %-&6 to the code will in fact give you a kernel that doesn't run at all, so when you call clCreateKernel or clBuildProgram, you'll get an error - which can be displayed by clGetProgramBuildInfo. But if the compiler isn't detecting different address spaces, then it's a bug/feature of the compiler.
[In fact, if your compiler is based on Clang, you may want to have a look at this bug:
https://llvm.org/bugs/show_bug.cgi?id=19957 - half an hour of googling gives a result of some sort! :) ]
In the newer CL2.0 the default address-space is generic, which allows "any" address space to be used.

Configure kern.log to give more info about a segfault

Currently I can find in kern.log entries like this:
[6516247.445846] ex3.x[30901]: segfault at 0 ip 0000000000400564 sp 00007fff96ecb170 error 6 in ex3.x[400000+1000]
[6516254.095173] ex3.x[30907]: segfault at 0 ip 0000000000400564 sp 00007fff0001dcf0 error 6 in ex3.x[400000+1000]
[6516662.523395] ex3.x[31524]: segfault at 7fff80000000 ip 00007f2e11e4aa79 sp 00007fff807061a0 error 4 in libc-2.13.so[7f2e11dcf000+180000]
(You see, apps causing segfault are named ex3.x, means exercise 3 executable).
Is there a way to ask kern.log to log the complete path? Something like:
[6...] /home/user/cclass/ex3.x[3...]: segfault at 0 ip 0564 sp 07f70 error 6 in ex3.x[4...]
So I can easily figure out from who (user/student) this ex3.x is?
Thanks!
Beco
That log message comes from the kernel with a fixed format that only includes the first 16 letters of the executable excluding the path as per show_signal_msg, see other relevant lines for segmentation fault on non x86 architectures.
As mentioned by Makyen, without significant changes to the kernel and a recompile, the message given to klogd which is passed to syslog won't have the information you are requesting.
I am not aware of any log transformation or injection functionality in syslog or klogd which would allow you to take the name of the file and run either locate or file on the filesystem in order to find the full path.
The best way to get the information you are looking for is to use crash interception software like apport or abrt or corekeeper. These tools store the process metadata from the /proc filesystem including the process's commandline which would include the directory run from, assuming the binary was run with a full path, and wasn't already in path.
The other more generic way would be to enable core dumps, and then to set /proc/sys/kernel/core_pattern to include %E, in order to have the core file name including the path of the binary.
The short answer is: No, it is not possible without making code changes and recompiling the kernel. The normal solution to this problem is to instruct your students to name their executable <student user name>_ex3.x so that you can easily have this information.
However, it is possible to get the information you desire from other methods. Appleman1234 has provided some alternatives in his answer to this question.
How do we know the answer is "Not possible to the the full path in the kern.log segfault messages without recompiling the kernel":
We look in the kernel source code to find out how the message is produced and if there are any configuration options.
The files in question are part of the kernel source. You can download the entire kernel source as an rpm package (or other type of package) for whatever version of linux/debian you are running from a variety of places.
Specifically, the output that you are seeing is produced from whichever of the following files is for your architecture:
linux/arch/sparc/mm/fault_32.c
linux/arch/sparc/mm/fault_64.c
linux/arch/um/kernel/trap.c
linux/arch/x86/mm/fault.c
An example of the relevant function from one of the files(linux/arch/x86/mm/fault.c):
/*
* Print out info about fatal segfaults, if the show_unhandled_signals
* sysctl is set:
*/
static inline void
show_signal_msg(struct pt_regs *regs, unsigned long error_code,
unsigned long address, struct task_struct *tsk)
{
if (!unhandled_signal(tsk, SIGSEGV))
return;
if (!printk_ratelimit())
return;
printk("%s%s[%d]: segfault at %lx ip %p sp %p error %lx",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), address,
(void *)regs->ip, (void *)regs->sp, error_code);
print_vma_addr(KERN_CONT " in ", regs->ip);
printk(KERN_CONT "\n");
}
From that we see that the variable passed to printout the process identifier is tsk->comm where struct task_struct *tsk and regs->ip where struct pt_regs *regs
Then from linux/include/linux/sched.h
struct task_struct {
...
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- initialized normally by setup_new_exec */
The comment makes it clear that the path for the executable is not stored in the structure.
For regs->ip where struct pt_regs *regs, it is defined in whichever of the following are appropriate for your architecture:
arch/arc/include/asm/ptrace.h
arch/arm/include/asm/ptrace.h
arch/arm64/include/asm/ptrace.h
arch/cris/include/arch-v10/arch/ptrace.h
arch/cris/include/arch-v32/arch/ptrace.h
arch/metag/include/asm/ptrace.h
arch/mips/include/asm/ptrace.h
arch/openrisc/include/asm/ptrace.h
arch/um/include/asm/ptrace-generic.h
arch/x86/include/asm/ptrace.h
arch/xtensa/include/asm/ptrace.h
From there we see that struct pt_regs is defining registers for the architecture. ip is just: unsigned long ip;
Thus, we have to look at what print_vma_addr() does. It is defined in mm/memory.c
/*
* Print the name of a VMA.
*/
void print_vma_addr(char *prefix, unsigned long ip)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
/*
* Do not print if we are in atomic
* contexts (in exception stacks, etc.):
*/
if (preempt_count())
return;
down_read(&mm->mmap_sem);
vma = find_vma(mm, ip);
if (vma && vma->vm_file) {
struct file *f = vma->vm_file;
char *buf = (char *)__get_free_page(GFP_KERNEL);
if (buf) {
char *p;
p = d_path(&f->f_path, buf, PAGE_SIZE);
if (IS_ERR(p))
p = "?";
printk("%s%s[%lx+%lx]", prefix, kbasename(p),
vma->vm_start,
vma->vm_end - vma->vm_start);
free_page((unsigned long)buf);
}
}
up_read(&mm->mmap_sem);
}
Which shows us that a path was available. We would need to check that it was the path, but looking a bit further in the code gives a hint that it might not matter. We need to see what kbasename() did with the path that is passed to it. kbasename() is defined in include/linux/string.h as:
/**
* kbasename - return the last part of a pathname.
*
* #path: path to extract the filename from.
*/
static inline const char *kbasename(const char *path)
{
const char *tail = strrchr(path, '/');
return tail ? tail + 1 : path;
}
Which, even if the full path is available prior to it, chops off everything except for the last part of a pathname, leaving the filename.
Thus, no amount of runtime configuration options will permit printing out the full pathname of the file in the segment fault messages you are seeing.
NOTE: I've changed all of the links to kernel source to be to archives, rather than the original locations. Those links will get close to the code as it was at the time I wrote this, 2104-09. As should be no surprise, the code does evolve over time, so the code which is current when you're reading this may or may not be similar or perform in the way which is described here.

How to use proc_pid_cmdline in kernel module

I am writing a kernel module to get the list of pids with their complete process name. The proc_pid_cmdline() gives the complete process name;using same function /proc/*/cmdline gets the complete process name. (struct task_struct) -> comm gives hint of what process it is, but not the complete path.
I have included the function name, but it gives error because it does not know where to find the function.
How to use proc_pid_cmdline() in a module ?
You are not supposed to call proc_pid_cmdline().
It is a non-public function in fs/proc/base.c:
static int proc_pid_cmdline(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
However, what it does is simple:
get_cmdline(task, m->buf, PAGE_SIZE);
That is not likely to return the full path though and it will not be possible to determine the full path in every case. The arg[0] value may be overwritten, the file could be deleted or moved, etc. A process may exec() in a way which obscures the original command line, and all kinds of other maladies.
A scan of my Fedora 20 system /proc/*/cmdline turns up all kinds of less-than-useful results:
-F
BUG:
WARNING: at
WARNING: CPU:
INFO: possible recursive locking detecte
ernel BUG at
list_del corruption
list_add corruption
do_IRQ: stack overflow:
ear stack overflow (cur:
eneral protection fault
nable to handle kernel
ouble fault:
RTNL: assertion failed
eek! page_mapcount(page) went negative!
adness at
NETDEV WATCHDOG
ysctl table check failed
: nobody cared
IRQ handler type mismatch
Machine Check Exception:
Machine check events logged
divide error:
bounds:
coprocessor segment overrun:
invalid TSS:
segment not present:
invalid opcode:
alignment check:
stack segment:
fpu exception:
simd exception:
iret exception:
/var/log/messages
--
/usr/bin/abrt-dump-oops
-xtD
I have managed to solve a version of this problem. I wanted to access the cmdline of all PIDs but within the kernel itself (as opposed to a kernel module as the question states), but perhaps these principles can be applied to kernel modules as well?
What I did was, I added the following function to fs/proc/base.c
int proc_get_cmdline(struct task_struct *task, char * buffer) {
int i;
int ret = proc_pid_cmdline(task, buffer);
for(i = 0; i < ret - 1; i++) {
if(buffer[i] == '\0')
buffer[i] = ' ';
}
return 0;
}
I then added the declaration in include/linux/proc_fs.h
int proc_get_cmdline(struct task_struct *, char *);
At this point, I could access the cmdline of all processes within the kernel.
To access the task_struct, perhaps you could refer to kernel: efficient way to find task_struct by pid?.
Once you have the task_struct, you should be able to do something like:
char cmdline[256];
proc_get_cmdline(task, cmdline);
if(strlen(cmdline) > 0)
printk(" cmdline :%s\n", cmdline);
else
printk(" cmdline :%s\n", task->comm);
I was able to obtain the commandline of all processes this way.
To get the full path of the binary behind a process.
char * exepathp;
struct file * exe_file;
struct mm_struct *mm;
char exe_path [1000];
//straight up stolen from get_mm_exe_file
mm = get_task_mm(current);
down_read(&mm->mmap_sem); //lock read
exe_file = mm->exe_file;
if (exe_file) get_file(exe_file);
up_read(&mm->mmap_sem); //unlock read
//reduce exe path to a string
exepathp = d_path( &(exe_file->f_path), exe_path, 1000*sizeof(char) );
Where current is the task struct for the process you are interested in. The variable exepathp gets the string of the full path. This is slightly different than the process cmd, this is the path of binary which was loaded to start the process. Combining this path with the process cmd should give you the full path.

Resources