I have a question about the filp_open function:
I can get the error number from the IS_ERR function but I do not understand the meaning of the error number.
Where can I find the filp_open error number definitions?
You should not use filp_open to read or write files in kernel mode, for (obvious) security reasons. Other reasons can be found in this answer and this answer (taken from this comment). The official documentation also recommends not using filp_open:
This is the helper to open a file from kernelspace if you really have to. But in generally you should not do this, so please move along, nothing to see here..
Error code definitions
The kernel uses the same error numbers (errno) in kernel space as in the user space. So, as OmnipotentEntity pointed out, you can see man errno for a reference on what the errors generally mean.
It is also helpful to have a look at the actual implementation of filp_open and its possible error sources, such as file_open_name and build_open_flags.
Note that IS_ERR does not return the error but merely returns whether the supplied pointer is an error value or not. You have to use PTR_ERR to retrieve the error value from the pointer in case IS_ERR is true. Example:
fptr = filp_open(...);
if (IS_ERR(fptr)) {
    /* PTR_ERR() returns a long holding the negative errno value */
    printk(KERN_ERR "filp_open failed: %ld\n", PTR_ERR(fptr));
}
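To see how the error value travels inside the pointer, here is a minimal userspace sketch. It assumes stand-in versions of ERR_PTR, IS_ERR and PTR_ERR modelled on the kernel helpers; fake_open() and report_open() are hypothetical names made up for this illustration and are not kernel functions.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins modelled on the kernel's err.h helpers (assumption). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* The highest 4095 addresses are reserved for encoded -errno values. */
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical stand-in for a call such as filp_open(). */
void *fake_open(int fail_with)
{
    static int dummy_file;
    return fail_with ? ERR_PTR(-fail_with) : (void *)&dummy_file;
}

/* Hypothetical helper: returns 0 on success, else the positive errno. */
int report_open(int fail_with)
{
    void *fptr = fake_open(fail_with);

    if (IS_ERR(fptr)) {
        long err = PTR_ERR(fptr);   /* negative errno, e.g. -ENOENT */
        printf("open failed: %ld (%s)\n", err, strerror((int)-err));
        return (int)-err;
    }
    return 0;
}
```

Calling report_open(ENOENT) prints the negative number together with the text that man errno lists for it, which is usually all you need to interpret the value.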
Related
I am writing a kernel module that will return error codes should something go wrong. My problem is that these are the same codes that will be returned by init_module. I currently have only one situation in which my kernel module will fail, and the error code would have been -1, which is interpreted as a permissions problem. This means it would be indistinguishable from the case where the process actually lacks permission to load modules. So, which codes should I use? A number lower than the lowest error defined in the kernel errno headers?
As Tsyvarev said, if an error number describes the problem with loading your module well, you may (and indeed should) usually return that error code (negated) from init_module(). But you should make exceptions to this rule for errors that insmod handles specially or that are documented for the init_module system call, because, for example, returning -ENOENT from the init_module() function makes insmod misleadingly output "Unknown symbol in module". Instead of those errors, it is better to use less ambiguous error codes such as EIDRM, EUCLEAN or ECANCELED.
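As a rough illustration of that advice, the following is a userspace-compilable sketch of the decision an init function makes. In a real module this logic would sit in the function registered as the module's init routine; example_init() and its device_ready parameter are hypothetical.

```c
#include <errno.h>

/*
 * Hypothetical init logic: report a module-specific failure with an
 * error code that is neither -EPERM (-1) nor -ENOENT.
 */
int example_init(int device_ready)
{
    if (!device_ready)
        return -ECANCELED;  /* negated errno, unlikely to be misread */

    return 0;               /* 0 means the module loaded successfully */
}
```

The point is only the choice of constant: the value comes from the standard errno headers, so no private number below the defined range is needed.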
I tried adding this inside the brk system call function :
void *addr = sbrk(0);
printk("current-add-is-%p-\n", addr);
But kernel compilation failed with an error about an implicit declaration of the sbrk function, and I could not find where sbrk is defined.
All I need is to know, whenever a user process tries to extend its program break, what its current program break address is, so that I can measure how much memory processes are requesting.
Thank you.
Looks like you are trying to do something wrong.
There is no 'sbrk' syscall, only 'brk'. Inside the kernel it would be named sys_brk, but you have no reason to call it. If you want to learn how to find the current break address, read brk's sources.
However, where exactly did you put this in if you did not happen to find brk's sources?
Add this line of code:
printf("Address of program break is %p\n", (void *)sbrk(0));
It will print a message to the terminal with the hex address of the program break (e.g., 0x#### #### ####).
If you want the address in a format other than hex, cast it to an integer type and use a matching specifier such as %lu. The use of sbrk(0) is documented in the man pages (Linux Programmer's Manual).
To see the documentation, type man sbrk on the command line.
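From user space, the same call can also show how far the break moves when memory is requested. The following is a minimal sketch assuming Linux with glibc; measure_break_growth() is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() in glibc under strict -std modes */
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: grows the program break by `bytes`, shrinks it
 * back, and returns how far the break moved (or -1 if sbrk() failed).
 */
long measure_break_growth(intptr_t bytes)
{
    char *before = sbrk(0);             /* current program break */

    if (sbrk(bytes) == (void *)-1)      /* ask for more heap */
        return -1;

    char *after = sbrk(0);              /* break after the request */
    long moved = (long)(after - before);

    sbrk(-bytes);                       /* give the memory back */
    return moved;
}
```

This only works in a user process, since sbrk() is a C library wrapper; it is not available inside the kernel, which is why the kernel-side attempt above failed to compile.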
I have a program that, when compiled with Microsoft Visual C++ 2008 Express, crashes on the line
comparison_vectors = new vec_element[(rbfnetparams->comparison_vector_length)+1];
with the error Unhandled exception at 0x7c93426d in myprog.exe: 0xC0000005: Access violation reading location 0x00000000
rbfnetparams->comparison_vector_length evaluates to 4 (as it should; I checked in the debugger), and the program still crashes here when I change the line as a test to:
comparison_vectors = new vec_element[5];
vec_element is a structure with several ints, doubles and a few bools, but no methods or constructor. The thing runs if I replace new with malloc, but then crashes on another new elsewhere. It does not crash every time this line is run, only sometimes, but seems to do so after the same number of iterations of this line each time. Memory usage is only 10MB at this point in the program.
This gets stranger as the same program DOES compile and run under gcc on Solaris, which usually shows up far more errors than Windows does.
Any help would be appreciated, as I am at a loss as to how to debug this one.
Access violation reading location 0x00000000 means "you dereferenced a NULL pointer." It looks like once in a while rbfnetparams is NULL when you reach this line, and thus you get the error.
I can't explain why comparison_vectors = new vec_element[5]; crashes. Is it the same error message?
Check if rbfnetparams is NULL before the line, and see if it gets hit (or add a conditional break point). Then decide if the fact that rbfnetparams is NULL is a symptom of a bigger bug somewhere else.
Dereferencing a NULL pointer is undefined. It's possible that the Solaris compiler does an optimization that masks the bug. That's allowed by the Standard (read the whole series referenced from that post).
I have a question about assert() in Linux: can I use it in the kernel?
If no, what techniques do you usually use if, for example I don't want to enter NULL pointer?
The corresponding kernel macros are BUG_ON and WARN_ON. The former is for when you want to make the kernel panic and bring the system down (i.e., unrecoverable error). The latter is for when you want to log something to the kernel log (viewable via dmesg).
As @Michael says, in the kernel, you need to validate anything that comes from userspace and just handle it, whatever it is. BUG_ON and WARN_ON are to catch bugs in your own code or problems with the hardware.
One option would be to use the macro BUG_ON(). It will printk a message, and then panic() (i.e. crash) the kernel.
http://kernelnewbies.org/KernelHacking-HOWTO/Debugging_Kernel
Of course, this should only be used as an error handling strategy of last resort (just like assert)...
No. Unless you're working on the kernel core rather than on a module, you should do your best to never crash (technically, abort()) the kernel. If you don't want to use a NULL pointer, just don't do it. Check it before using it, and produce an error log if it is NULL.
The closest thing you might want to use if you're actually handling a fatal case is the panic() function or the BUG_ON macro, which abort execution and produce diagnostic messages, a stack trace and a list of modules. WARN_ON produces similar diagnostics without aborting.
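The check-and-return pattern described above might look like the sketch below. It compiles in user space by assuming a stand-in for WARN_ON that logs and returns the condition; struct counter and counter_increment() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for WARN_ON(): logs and returns the condition. */
static int warn_on(int cond, const char *expr, const char *file, int line)
{
    if (cond)
        fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
    return cond;
}
#define WARN_ON(cond) warn_on(!!(cond), #cond, __FILE__, __LINE__)

struct counter { int value; };   /* hypothetical structure */

/* Hypothetical function: rejects a NULL argument instead of crashing. */
int counter_increment(struct counter *c)
{
    if (WARN_ON(c == NULL))
        return -EINVAL;          /* log the bug, then fail gracefully */

    c->value++;
    return 0;
}
```

The caller gets an ordinary error code and the log still records where the bad pointer showed up, so nothing has to be brought down.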
Well, dereferencing null pointer will produce an oops, which you can use to find the offending code. Now, if you want to assert() a given condition, you can use
BUG_ON(condition)
A less lethal mechanism is WARN_ON, which will produce a backtrace without crashing the kernel.
I use this macro, it uses BUG() but adds some more info I normally use for debugging, and of course you can edit it to include more info if you wish:
#define ASSERT(x) \
do { if (x) break; \
printk(KERN_EMERG "### ASSERTION FAILED %s: %s: %d: %s\n", \
__FILE__, __func__, __LINE__, #x); dump_stack(); BUG(); \
} while (0)
BUG_ON() is the appropriate way to do it. It checks whether the condition is true and, if so, calls the macro BUG().
How BUG() handles the rest is explained very well in the following article:
http://kernelnewbies.org/FAQ/BUG
Back story: While running a program under strace I noticed that '/dev/urandom' is being opened. I would like to know where this call is coming from (it is not part of the program itself, it is part of the system).
So, using gdb, I am trying to break (using catch syscall open) program execution when the open call is issued, so I can see a backtrace. The problem is that open is being called a lot, several hundred times, so I can't narrow down the specific call that is opening /dev/urandom. How should I go about narrowing down the specific call? Is there a way to filter by arguments, and if so how do I do it for a syscall?
Any advice would be helpful -- maybe I am going about this all wrong.
GDB is a pretty powerful tool, but has a bit of a learning curve.
Basically, you want to set up a conditional breakpoint.
First use the -i flag to strace or objdump -d to find the address of the open function or more realistically something in the chain of getting there, such as in the plt.
Set a breakpoint at that address (if you have debug symbols, you can use those instead, omitting the *, but I'm assuming you don't, though you may well have them for library functions if nothing else):
break * 0x080482c8
Next you need to make it conditional.
(Ideally you could compare a string argument to a desired string. I couldn't get this to work within the first few minutes of trying.)
Let's hope we can assume the string is a constant somewhere in the program or one of the libraries it loads. You could look in /proc/pid/maps to get an idea of what is loaded and where, then use grep to verify the string is actually in a file, objdump -s to find its address, and gdb to verify that you've actually found it in memory by combining the high part of the address from maps with the low part from the file. (EDIT: it's probably easier to use ldd on the executable than look in /proc/pid/maps)
Next you will need to know something about the ABI of the platform you are working on, specifically how arguments are passed. I've been working on ARM lately, and that's very nice as the first few arguments just go in registers r0, r1, r2, etc. 32-bit x86 is a bit less convenient: it seems they go on the stack, i.e., *($esp+4), *($esp+8), *($esp+12).
So let's assume we are on an x86, and we want to check that the first argument in esp+4 equals the address we found for the constant we are trying to catch it passing. Only, esp+4 is a pointer to a char pointer. So we need to dereference it for comparison.
cond 1 *(char **)($esp+4)==0x8048514
Then you can type run and hope for the best.
If you catch your breakpoint condition, and looking around with info registers and the x command to examine memory seems right, then you can use the return command to percolate back up the call stack until you find something you recognize.
(Adapted from a question edit)
Following Chris's answer, here is the process that eventually got me what I was looking for:
(I am trying to find what functions are calling the open syscall on "/dev/urandom")
use ldd on executable to find loaded libraries
grep through each lib (shell command) looking for 'urandom'
open library file in hex editor and find address of string
find out how parameters are passed in syscalls (for open, the file name is the first parameter; on x86_64 it is passed in rdi, though your mileage may vary)
now we can set the conditional breakpoint: break open if $rdi == _addr_
run program and wait for break to hit
run bt to see backtrace
After all this I find that glib's g_random_int() and g_rand_new() use urandom. Gtk+ and ORBit were calling these functions -- if anybody was curious.
Like Andre Puel said:
break open if strcmp($rdi,"/dev/urandom") == 0
Might do the job.