I get an error in my C program at runtime. I found some material about the "double free or corruption" error, but nothing relevant.
Here is my code:
void compute_crc32(const char* filename, unsigned long * destination)
{
FILE* tmp_chunk = fopen(filename, "rb");
printf("\n\t\t\tCalculating CRC...");
fflush(stdout);
Crc32_ComputeFile(tmp_chunk, destination);
printf("\t[0x%08lX]", *destination);
fflush(stdout);
fclose(tmp_chunk);
printf("\t[ OK ]");
fflush(stdout);
}
It seems the
fclose(tmp_chunk);
raises this glibc error :
*** glibc detected *** ./crc32: double free or corruption (out): 0x09ed86f0 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x75ee2)[0xb763cee2]
/lib/i386-linux-gnu/libc.so.6(fclose+0x154)[0xb762c424]
./crc32[0x80498be]
./crc32[0x8049816]
./crc32[0x804919c]
./crc32[0x8049cc2]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75e04d3]
./crc32[0x8048961]
In the console output, the last CRC is displayed but not the last "[ OK ]"...
I have never had this type of error, and I searched for hours on Google without finding anything relevant to my case... please help :)
Now I have another error :
*** glibc detected *** ./xsplit: free(): invalid next size (normal): 0x095a66f0 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x75ee2)[0xb7647ee2]
/lib/i386-linux-gnu/libc.so.6(fclose+0x154)[0xb7637424]
./xsplit[0x80497f7]
./xsplit[0x804919c]
./xsplit[0x8049cd6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75eb4d3]
./xsplit[0x8048961]
What the hell is this ? I'm lost... :(
*** glibc detected *** ./crc32: double free or corruption
Glibc is telling you that you've corrupted the heap.
The tools to find such corruption on Linux are Valgrind and AddressSanitizer.
Chances are, either one of them will immediately tell you what your problem is.
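Since Crc32_ComputeFile is not shown in the question, the following is only a guess at the class of bug involved: a read loop that writes past a heap buffer, which then surfaces later inside fclose(). The sketch below is a hypothetical stand-in (checksum_stream and CHUNK_SIZE are made-up names, and it sums bytes rather than computing a real CRC) that shows a read loop which never asks fread() for more than the buffer holds.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096 /* hypothetical read size, not from the question */

/* Hypothetical stand-in for Crc32_ComputeFile: sums the bytes of an
 * already opened stream. Returns 0 on success, -1 on failure. */
int checksum_stream(FILE *fp, unsigned long *destination)
{
    unsigned char *chunk;
    unsigned long sum = 0;
    size_t n;

    assert(fp != NULL && destination != NULL);

    chunk = malloc(CHUNK_SIZE);
    if (chunk == NULL)
        return -1;

    /* Request at most CHUNK_SIZE bytes, the exact size of the allocation.
     * Asking for more would overwrite neighbouring heap metadata. */
    while ((n = fread(chunk, 1, CHUNK_SIZE, fp)) > 0) {
        for (size_t i = 0; i < n; i++)
            sum += chunk[i];
    }

    free(chunk);
    if (ferror(fp))
        return -1;
    *destination = sum;
    return 0;
}
```

To look for the real culprit, one option is to rebuild with gcc -g -fsanitize=address and rerun the program, or to run the existing binary under valgrind ./crc32; both typically report the first invalid write rather than the later crash in fclose().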
I wrote an MPI program in which I use shared memory through the MPI_Win_allocate_shared call, then I ran the program on a virtual machine with 4 CPUs on Azure.
Everything works well with 1 or 2 processes, but it doesn't work with 3 or 4.
I know that MPI_Win_allocate_shared works only if the processes are on the same node, so I thought the problem was related to that. I tried to solve it with a hostfile containing "AzureVM slots=4 max_slots=8", but I still get the error.
I'll report the error below:
mpiexec -np 3 --hostfile my_host --oversubscribe tables
[AzureVM][[37487,1],1][btl_openib_component.c:652:init_one_port] ibv_query_gid failed (mlx4_0:1, 0)
[AzureVM][[37487,1],0][btl_openib_component.c:652:init_one_port] ibv_query_gid failed (mlx4_0:1, 0)
[AzureVM][[37487,1],2][btl_openib_component.c:652:init_one_port] ibv_query_gid failed (mlx4_0:1, 0)
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: AzureVM
Local device: mlx4_0
--------------------------------------------------------------------------
[AzureVM:01918] 2 more processes have sent help message help-mpi-btl-openib.txt / error in device init
[AzureVM:01918] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[AzureVM:1930] *** An error occurred in MPI_Win_allocate_shared
[AzureVM:1930] *** reported by process [2456748033,2]
[AzureVM:1930] *** on communicator MPI_COMM_WORLD
[AzureVM:1930] *** MPI_ERR_RMA_SHARED: Memory cannot be shared
[AzureVM:1930] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[AzureVM:1930] *** and potentially your MPI job)
[AzureVM:01918] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
Makefile:54: recipe for target 'table' failed
make: *** [table] Error 71
Please, could someone explain to me how to solve the problem? Thank you in advance!
Hi, have you solved the problem?
Consider adding these two lines (following the guide)
MPI_Comm nodecomm;
MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, &nodecomm);
And after that, allocate the memory on the new communicator with
// define alloc_length (sth like: int alloc_length = 10 * sizeof(int);)
// mem receives the base pointer (e.g. int *mem;)
MPI_Win win;
MPI_Win_allocate_shared(alloc_length, 1, MPI_INFO_NULL, nodecomm, &mem, &win);
I had the same problem (a similar error log at least) and solved it exactly in the way I described above
To better understand, see this. I tested the code at the end of the answer chosen as the best one, and unfortunately, it didn't work for me. I modified it as follows:
#include <stdio.h>
#include <mpi.h>
#define ARRAY_LEN 32
int main() {
MPI_Init(NULL, NULL);
int * baseptr;
MPI_Comm nodecomm;
MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
MPI_INFO_NULL, &nodecomm);
int nodesize, noderank;
MPI_Comm_size(nodecomm, &nodesize);
MPI_Comm_rank(nodecomm, &noderank);
MPI_Win win;
int size = (noderank == 0)? ARRAY_LEN * sizeof(int) : 0;
MPI_Win_allocate_shared(size, 1, MPI_INFO_NULL,
nodecomm, &baseptr, &win);
if (noderank != 0) {
MPI_Aint size;
int disp_unit;
MPI_Win_shared_query(win, 0, &size, &disp_unit, &baseptr);
}
for (int i = noderank; i < ARRAY_LEN; i += nodesize)
baseptr[i] = noderank;
MPI_Barrier(nodecomm);
if (noderank == 0) {
for (int i = 0; i < nodesize; i++)
printf("%4d", baseptr[i]);
printf("\n");
}
MPI_Win_free(&win);
MPI_Finalize();
}
Now, if you save the code above as test.cpp, running
mpic++ test.cpp && mpirun -n 8 ./a.out
will output 0 1 2 3 4 5 6 7
Some right tips I took from here
Good luck!
I wrote a simple kernel module to learn the module_param feature of kernel modules. However, if I give the S_IWUGO, S_IRWXUGO or S_IALLUGO permissions for the perm field, I get the following compilation error:
[root@localhost param]# make -C $KDIR M=$PWD modules
make: Entering directory `/usr/src/kernels/3.11.10-301.fc20.i686+PAE'
CC [M] /root/ldd/misc/param/param/hello.o
/root/ldd/misc/param/param/hello.c:6:168: error: negative width in bit-field ‘<anonymous>’
module_param(a, int, S_IWUGO);
^
make[1]: *** [/root/ldd/misc/param/param/hello.o] Error 1
make: *** [_module_/root/ldd/misc/param/param] Error 2
make: Leaving directory `/usr/src/kernels/3.11.10-301.fc20.i686+PAE'
Compilation is successful for S_IRUGO or S_IXUGO (permissions not containing write permission). I suppose I must be doing something wrong because, from what I know, write permission is legal. What am I doing wrong here?
The program:
#include<linux/module.h>
#include<linux/stat.h>
int a = 2;
module_param(a, int, S_IXUGO);
int f1(void){
return 0;
}
void f2(void){
}
module_init(f1);
module_exit(f2);
MODULE_AUTHOR("lavya");
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("experiment with parameters");
Linux does not accept the S_IWOTH permission.
If you follow the macro chain behind module_param, you arrive at __module_param_call, which includes:
BUILD_BUG_ON_ZERO((perm) < 0 || (perm) > 0777 || ((perm) & 2))
S_IWOTH == 2 so the test fails.
The negative width in bit-field error is merely an artefact of the implementation of BUILD_BUG_ON_ZERO.
Linux probably refuses to make module parameters world-writable for security reasons. You should be able to use narrower permissions such as S_IWUSR | S_IWGRP.
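To see where the odd error message comes from, here is a user-space sketch of the same trick. The BUILD_BUG_ON_ZERO definition matches the one older kernels used; CHECK_PERM and checked_perm are hypothetical names for illustration, and the S_I* constants are local copies of the usual octal values so that the snippet builds outside the kernel tree. It relies on GCC/Clang accepting a struct with only an unnamed bit-field.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the usual octal permission bits (assumption: same
 * values as in <linux/stat.h>). */
#define S_IWUSR 0200
#define S_IWGRP 0020
#define S_IWOTH 0002

/* If e is true, the bit-field width becomes -1 and compilation fails with
 * "negative width in bit-field"; if e is false, the width is 0 and the
 * whole expression evaluates to 0. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))

/* Hypothetical stand-in for the check quoted from __module_param_call. */
#define CHECK_PERM(perm) \
    BUILD_BUG_ON_ZERO((perm) < 0 || (perm) > 0777 || ((perm) & 2))

int checked_perm(void)
{
    /* Owner and group write only: the check passes and contributes 0. */
    size_t zero = CHECK_PERM(S_IWUSR | S_IWGRP);
    assert(zero == 0);

    /* Uncommenting the next line reproduces the compile error, because
     * S_IWOTH sets bit 2:
     * size_t bad = CHECK_PERM(S_IWUSR | S_IWGRP | S_IWOTH);
     */
    return S_IWUSR | S_IWGRP;
}
```

The point of the design is that the check costs nothing at runtime: a bad permission value stops the build instead of producing a module that fails when loaded.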
I am writing a kernel module to get the list of PIDs with their complete process names. proc_pid_cmdline() gives the complete process name; /proc/*/cmdline uses the same function to get it. (struct task_struct)->comm gives a hint of what process it is, but not the complete path.
I have included the function name, but it gives an error because it does not know where to find the function.
How do I use proc_pid_cmdline() in a module?
You are not supposed to call proc_pid_cmdline().
It is a non-public function in fs/proc/base.c:
static int proc_pid_cmdline(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
However, what it does is simple:
get_cmdline(task, m->buf, PAGE_SIZE);
That is not likely to return the full path though, and it will not be possible to determine the full path in every case. The argv[0] value may be overwritten, the file could be deleted or moved, etc. A process may exec() in a way which obscures the original command line, and all kinds of other maladies.
A scan of my Fedora 20 system /proc/*/cmdline turns up all kinds of less-than-useful results:
-F
BUG:
WARNING: at
WARNING: CPU:
INFO: possible recursive locking detecte
ernel BUG at
list_del corruption
list_add corruption
do_IRQ: stack overflow:
ear stack overflow (cur:
eneral protection fault
nable to handle kernel
ouble fault:
RTNL: assertion failed
eek! page_mapcount(page) went negative!
adness at
NETDEV WATCHDOG
ysctl table check failed
: nobody cared
IRQ handler type mismatch
Machine Check Exception:
Machine check events logged
divide error:
bounds:
coprocessor segment overrun:
invalid TSS:
segment not present:
invalid opcode:
alignment check:
stack segment:
fpu exception:
simd exception:
iret exception:
/var/log/messages
--
/usr/bin/abrt-dump-oops
-xtD
I have managed to solve a version of this problem. I wanted to access the cmdline of all PIDs but within the kernel itself (as opposed to a kernel module as the question states), but perhaps these principles can be applied to kernel modules as well?
What I did was, I added the following function to fs/proc/base.c
int proc_get_cmdline(struct task_struct *task, char * buffer) {
int i;
int ret = proc_pid_cmdline(task, buffer);
for(i = 0; i < ret - 1; i++) {
if(buffer[i] == '\0')
buffer[i] = ' ';
}
return 0;
}
I then added the declaration in include/linux/proc_fs.h
int proc_get_cmdline(struct task_struct *, char *);
At this point, I could access the cmdline of all processes within the kernel.
To access the task_struct, perhaps you could refer to kernel: efficient way to find task_struct by pid?.
Once you have the task_struct, you should be able to do something like:
char cmdline[256];
proc_get_cmdline(task, cmdline);
if(strlen(cmdline) > 0)
printk(" cmdline :%s\n", cmdline);
else
printk(" cmdline :%s\n", task->comm);
I was able to obtain the commandline of all processes this way.
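The NUL-to-space step used above can be tried in user space. This is a minimal sketch with a hypothetical helper name (join_cmdline); it assumes buf holds len bytes in the /proc/<pid>/cmdline layout, where arguments are separated by NUL bytes, and it always leaves the buffer terminated.

```c
#include <assert.h>
#include <stddef.h>

/* Turn a NUL-separated argument block of len bytes into one printable,
 * NUL-terminated string by replacing the separators with spaces. */
void join_cmdline(char *buf, size_t len)
{
    assert(buf != NULL);
    if (len == 0)
        return;

    /* Replace every separator except the final byte. */
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\0')
            buf[i] = ' ';
    }
    /* Guarantee termination even if the block was truncated. */
    buf[len - 1] = '\0';
}
```

Forcing the last byte to NUL matters when the argument block is longer than the buffer, since a truncated copy would otherwise not be safe to pass to strlen() or a %s format.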
To get the full path of the binary behind a process:
char *exepathp = NULL;
struct file *exe_file = NULL;
struct mm_struct *mm;
char exe_path[1000];
/* taken straight from get_mm_exe_file */
mm = get_task_mm(current);
if (mm) {
    down_read(&mm->mmap_sem); /* lock for reading */
    exe_file = mm->exe_file;
    if (exe_file)
        get_file(exe_file);
    up_read(&mm->mmap_sem); /* unlock */
    mmput(mm); /* drop the reference taken by get_task_mm */
}
if (exe_file) {
    /* reduce the exe path to a string; d_path may return an ERR_PTR */
    exepathp = d_path(&exe_file->f_path, exe_path, sizeof(exe_path));
    fput(exe_file); /* drop the reference taken by get_file */
}
Here current is the task struct for the process you are interested in. The variable exepathp gets the string of the full path. This is slightly different from the process cmd: it is the path of the binary which was loaded to start the process. Combining this path with the process cmd should give you the full path.
I'm a college student and as part of a Networks Assignment I need to do an implementation of the Stop-and-Wait Protocol. The problem statement requires using 2 threads. I am a novice to threading but after going through the man pages for the pthreads API, I wrote the basic code. However, I get a segmentation fault after the thread is created successfully (on execution of the first line of the function passed to pthread_create() as an argument).
typedef struct packet_generator_args
{
int max_pkts;
int pkt_len;
int pkt_gen_rate;
} pktgen_args;
/* generates and buffers packets at a mean rate given by the
pkt_gen_rate field of its argument; runs in a separate thread */
void *generate_packets(void *arg)
{
pktgen_args *opts = (pktgen_args *)arg; // error occurs here
buffer = (char **)calloc((size_t)opts->max_pkts, sizeof(char *));
if (buffer == NULL)
handle_error("Calloc Error");
//front = back = buffer;
........
return 0;
}
The main thread reads packets from this buffer and runs the stop-and-wait algorithm.
pktgen_args thread_args;
thread_args.pkt_len = DEF_PKT_LEN;
thread_args.pkt_gen_rate = DEF_PKT_GEN_RATE;
thread_args.max_pkts = DEF_MAX_PKTS;
/* initialize sockets and other data structures */
.....
pthread_t packet_generator;
pktgen_args *thread_args1 = (pktgen_args *)malloc(sizeof(pktgen_args));
memcpy((void *)thread_args1, (void *)&thread_args, sizeof(pktgen_args));
retval = pthread_create(&packet_generator, NULL, &generate_packets, (void *)thread_args1);
if (retval != 0)
handle_error_th(retval, "Thread Creation Error");
.....
/* send a fixed no. of packets to the receiver, waiting for an ack for each. If
the ack is not received till timeout occurs resend the pkt */
.....
I have tried debugging using gdb but am unable to understand why a segmentation fault is occurring at the first line of my generate_packets() function. Hopefully, one of you can help. If anyone needs additional context, the entire code can be obtained at http://pastebin.com/Z3QtEJpQ. I am in a real jam here, having spent hours on this. Any help will be appreciated.
You initialize your buffer as NULL:
char **buffer = NULL;
and then in main(), without further ado, you try to index it:
while (!buffer[pkts_ackd]); /* wait as long as the next pkt has not
Basically, my semi-educated guess is that your thread hasn't generated any packets yet and you crash when trying to access an element through a NULL pointer.
[162][04:34:17] vlazarenko@alluminium (~/tests) > cc -ggdb -o pthr pthr.c 2> /dev/null
[163][04:34:29] vlazarenko@alluminium (~/tests) > gdb pthr
GNU gdb 6.3.50-20050815 (Apple version gdb-1824) (Thu Nov 15 10:42:43 UTC 2012)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries .. done
(gdb) run
Starting program: /Users/vlazarenko/tests/pthr
Reading symbols for shared libraries +............................. done
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x000000010000150d in main (argc=1, argv=0x7fff5fbffb10) at pthr.c:205
205 while (!buffer[pkts_ackd]); /* wait as long as the next pkt has not
(gdb)
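One way to avoid the race described above is to allocate the buffer before the thread starts and to wait on a condition variable instead of spinning on buffer[pkts_ackd]. The following is only a sketch under stated assumptions: MAX_PKTS, pkts_ready, consume_all and the static payload are hypothetical, and the real program would send each packet and wait for its ack where the comment indicates.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_PKTS 4 /* hypothetical packet count */

static char **buffer = NULL;
static int pkts_ready = 0; /* number of slots filled so far */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready = PTHREAD_COND_INITIALIZER;

/* Producer: fills one slot at a time and wakes the consumer. */
static void *generate_packets(void *arg)
{
    static char payload[] = "pkt"; /* placeholder packet contents */
    (void)arg;
    for (int i = 0; i < MAX_PKTS; i++) {
        pthread_mutex_lock(&lock);
        buffer[i] = payload;
        pkts_ready = i + 1;
        pthread_cond_signal(&ready);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Consumer: returns the number of packets handled, or -1 on error. */
int consume_all(void)
{
    pthread_t tid;
    int pkts_ackd = 0;

    /* Allocate BEFORE creating the thread, so buffer is never NULL
     * when either thread touches it. */
    buffer = calloc(MAX_PKTS, sizeof *buffer);
    if (buffer == NULL)
        return -1;
    pkts_ready = 0;
    if (pthread_create(&tid, NULL, generate_packets, NULL) != 0) {
        free(buffer);
        buffer = NULL;
        return -1;
    }

    while (pkts_ackd < MAX_PKTS) {
        pthread_mutex_lock(&lock);
        while (pkts_ready <= pkts_ackd) /* sleep until the slot is filled */
            pthread_cond_wait(&ready, &lock);
        char *pkt = buffer[pkts_ackd];
        pthread_mutex_unlock(&lock);
        assert(pkt != NULL);
        /* send pkt and wait for its ack here */
        pkts_ackd++;
    }

    pthread_join(tid, NULL);
    free(buffer);
    buffer = NULL;
    return pkts_ackd;
}
```

Besides removing the NULL dereference, the condition variable avoids the busy-wait, and the mutex ensures the main thread actually observes the producer's writes.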
Here is the error I've got:
http://pastebin.com/VadUW6fy
drivers/built-in.o: In function `gem_rxmac_reset':
clkdev.c:(.text+0x212238): undefined reference to `__bad_udelay'
drivers/built-in.o: In function `divide.part.4':
clkdev.c:(.text.unlikely+0x7214): undefined reference to `__udivdi3'
clkdev.c:(.text.unlikely+0x7244): undefined reference to `__umoddi3'
I googled and found this patch: https://lkml.org/lkml/2008/4/7/82
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -174,6 +174,10 @@ static inline void timespec_add_ns(struct timespec *a, u64 ns)
{
ns += a->tv_nsec;
while(unlikely(ns >= NSEC_PER_SEC)) {
+ /* The following asm() prevents the compiler from
+ * optimising this loop into a modulo operation. */
+ asm("" : "+r"(ns));
+
ns -= NSEC_PER_SEC;
a->tv_sec++;
}
but it failed to apply (maybe due to a newer version of the file):
patching file linux/time.h
Hunk #1 FAILED at 174.
1 out of 1 hunk FAILED -- saving rejects to file linux/time.h.rej
Surprisingly, the file time.h.rej is not present!
I should have read more closely. The patch is for timespec_add_ns(), and you have gem_rxmac_reset() and divide.part.4 functions failing. Probably unrelated to the patch you found -- instead, probably standard 64-bit div / mod functions don't have an implementation on your target platform.
Do you have a Sun GEM or Apple GMAC NIC? If not, you can probably just disable that driver and get rid of the first error message.
For the second, you might need to implement a similar asm trick in the clkdev.c file -- when I skimmed my copy for a repeated subtraction operation I didn't spot one -- but maybe you can simply steal a newer clkdev.c or clkdev.h to fix this problem? (It's a long shot, there's only one entry in git log drivers/clk/clkdev.c.)