I have read a lot about fclose() and I have two questions about it:
1) My program crashes after a call to fclose(pFile), even though I check that pFile isn't NULL and set it to NULL right after fclose() (see code below).
I should mention that the crash doesn't happen every time the program runs, only under specific circumstances that I haven't yet been able to pin down.
After the crash, I get an error: double free or corruption (!prev).
So I guess that somehow fclose() is being called twice on the same stream; what I don't understand is how this could possibly happen.
if (NULL != pFile)
{
    fclose(pFile);
    pFile = NULL;
}
pFile = fopen(fullFilePath.c_str(), "r");
2) I know that calling fclose() twice on the same stream usually leads to a crash, but I couldn't find an answer for the case where I have two handles to the same file (let's say there are two threads of the same process, each holding one handle). In this case, if both threads call fclose() on the same file but through two different handles, is that OK? It won't cause a crash?
Related
I was going through the book The Linux Programming Interface. On page 73 in Chapter 4,
it is written:
fd = open("w.log", O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, S_IRUSR | S_IWUSR);
I read that the O_TRUNC flag is used to truncate the file length to zero, which destroys any existing data in the file.
The O_APPEND flag is used to append data to the end of the file.
The kernel records a file offset, sometimes also called the read-write offset or pointer. This is the location in the file at which the next read() or write() will commence.
I am confused: if the file is truncated and the kernel does the subsequent writing at the end of the file, why is the append flag needed to explicitly say that writes should go to the end of the file?
Without the append flag (if the file is truncated), the kernel writes at the end of the file for the subsequent write() call anyway.
O_APPEND flag is used to append data to the end of the file.
That's true, but incomplete enough to be potentially misleading. And I suspect that you are in fact confused in that regard.
The kernel records a file offset, sometimes also called the read-write offset or pointer. This is the location in the file at which the next read() or write() will commence.
That's also incomplete. There is a file offset associated with at least each seekable file. That is the position where the next read() will commence. It is where the next write() will commence if the file is not open in append mode, but in append mode every write happens at the end of the file, as if it were repositioned with lseek(fd, 0, SEEK_END) before each one. In that case, then, the current file offset might not be the position where the next write() will commence.
I am confused that if the file is truncated and the kernel does the subsequent writing at the end of the file, why is the append flag needed to explicitly tell it to append at the end of the file?
It is not needed to cause the first write (by any process) after truncation to occur at the end of the file because immediately after the file has been truncated there isn't any other position.
Without the append flag (if the file is truncated), the kernel writes at the end of the file for the subsequent write() function call.
It is not needed for subsequent writes either, as long as the file is not repositioned or externally modified. Otherwise, the location of the next write depends on whether the file is open in append mode or not.
In practice, it is not necessarily the case that every combination of flags is useful, but the combination of O_TRUNC and O_APPEND has observably different effect than does either flag without the other, and the combination is useful in certain situations.
O_APPEND rarely makes sense with O_TRUNC. I think no combination of the C fopen modes will produce that combination (on POSIX systems, where this is relevant).
O_APPEND ensures that every write is done at the end of the file, automatically, regardless of the write position. In particular, this means that if multiple processes are writing to the file, they do not stomp over each other's writes.
Note that POSIX does not require the atomic behavior of O_APPEND. It requires that an automatic seek take place to the (current) end of the file before the write, but it doesn't require that position to still be the end of the file when the write occurs. Even on implementations which feature atomic O_APPEND, it might not work for all file systems. The Linux man page for open(2) cautions that O_APPEND doesn't work atomically on NFS.
Now, if every process uses O_TRUNC when opening the file, it will be clobbering everything that every other process wrote. That conflicts with the idea that the processes shouldn't be clobbering each other's writes, for which O_APPEND was specified.
O_APPEND is not required for appending to a file by a single process that is understood to be the only writer. It is possible to just seek to the end and then start writing new data. Sometimes O_APPEND is used in the exclusive case anyway simply because it's a programming shortcut: we don't have to bother making an extra call to position to the end of the file. Compare:
FILE *f = fopen("file.txt", "a");
// check f and start writing
versus:
FILE *f = fopen("file.txt", "r+");
// check f
fseek(f, 0, SEEK_END); // go to the end, also check this for errors
// start writing
We can imagine a group of processes appending to a file with O_APPEND, where the first one also passes O_TRUNC to truncate it first. But this seems awkward to program; it's not easy for a process to tell whether it is the first one to open the file.
If such a situation is required on, say, boot-up, where the old file from before the boot is irrelevant for some reason, just have a boot-time action (script or whatever) remove the old file before these multiple processes are started. Each one then uses O_CREAT to create the file if necessary (in case it is the first process) but without O_TRUNC (in case they are not the first process), and with O_APPEND to do the atomic (if available) appending thing.
The two are entirely independent. The file is simply opened with O_APPEND because it's a log file.
The author wants concurrent messages to concatenate instead of overwrite each other (e.g. if the program forks), and if an admin or log rotation tool truncates the file then new messages should start being written at line #1 instead of at line #1000000 where the last log entry was written. This would not happen without O_APPEND.
I have a program which opens a static file with sys_open() and expects to receive a file descriptor equal to zero (= stdin). I have the ability to write to the file, remove it, or modify it, so I tried to create a symbolic link from the static file name to /dev/stdin. It opens stdin, but returns the lowest available fd (not equal to zero). How can I cause the syscall to return zero, without hooking the syscall or modifying the program itself? Is that even possible?
(It's part of a challenge, not a real case scenario)
Thank you as always
POSIX guarantees that the lowest available FD will be returned, so you can just invoke the program with stdin closed:
./myprogram 0>&-
Is there a way from Linux userspace to replace the pages of a mapped file (or mmap'd pages within a certain logical address range) with empty pages (mapped from /dev/null, or maybe a single empty page, mapped repeatedly over the top of the pages mapped from the file)?
For context, I want to find a fix for this JDK bug:
https://bugs.openjdk.java.net/browse/JDK-4724038
To summarize the bug: it is not currently possible to unmap files in Java until the JVM can garbage collect the MappedByteBuffer that wraps an mmap'd file, because forcibly unmapping the file could give rise to security issues due to race conditions (e.g. native code could still be trying to access the same address range that the file was mapped to, and the OS may have already mapped a new file into that same logical address range).
I'm looking to replace the mapped pages in the logical address range, and then unmap the file. Is there any way to accomplish this?
(Bonus points if you know a way of doing this in other operating systems too, particularly Windows and Mac OS X.)
Note that this doesn't have to be an atomic operation. The main goal is to separate the unmapping of the memory (or the replacing of the mapped file contents with zero-on-read pages) from the closing of the file, since that will solve a litany of issues on both Linux (which has a low limit on the number of file descriptors per process) and Windows (the fact you can't delete a file while it is mapped).
UPDATE: see also: Memory-mapping a file in Windows with SHARE attribute (so file is not locked against deletion)
On Linux you can use mmap with MAP_FIXED to replace the mapping with any mapping you want. If you replace the entire mapping the reference to the file will be removed.
The reason the bug has remained in the JDK so long is fundamentally the race condition between unmapping the memory and mapping the dummy memory: some other memory could end up mapped there in the meantime (potentially by native code). I have been over the OS APIs, and there exists no syscall-level atomic operation that unmaps a file and maps something else at the same address. However, there are solutions that block the whole process while swapping out the mapping from underneath it.
The unmap works correctly in finalize without a guard because the GC has proven the object is unreachable first, so there is no race.
Highly Linux specific solution:
1) vfork()
2) send parent a STOP signal
3) unmap the memory
4) map the zeros in its place
5) send parent a CONT signal
6) _exit (which unblocks parent thread)
In Linux, memory mapping changes propagate to the parent.
The code actually looks more like this (vfork() is bonkers, man):
#include <pthread.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int unmap(void *addr, size_t length)
{
    int wstatus;
    int err;
    pid_t child;
    pid_t parent;
    int thread_cancel_state;
    sigset_t signal_set;
    sigset_t old_signal_set;

    parent = getpid();
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &thread_cancel_state);
    sigfillset(&signal_set);
    pthread_sigmask(SIG_SETMASK, &signal_set, &old_signal_set);
    if (0 == (child = vfork())) {
        int child_err = 0;
        kill(parent, SIGSTOP);
        if (-1 == munmap(addr, length))
            child_err = 1;
        /* MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS: replace the range
         * with zero-filled pages at the same address. */
        else if (MAP_FAILED == mmap(addr, length, PROT_NONE,
                                    MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS,
                                    -1, 0))
            child_err = 1;
        kill(parent, SIGCONT);
        _exit(child_err);
    }
    if (child > 0) {
        waitpid(child, &wstatus, 0);
        err = !WIFEXITED(wstatus) || WEXITSTATUS(wstatus) != 0;
    } else {
        err = 1;
    }
    pthread_sigmask(SIG_SETMASK, &old_signal_set, &signal_set);
    pthread_setcancelstate(thread_cancel_state, &thread_cancel_state);
    return err;
}
Under Windows you can stop all threads but this one using SuspendThread, which feels tailor-made for this. However, enumerating threads is going to be hard because you're racing against CreateThread. You have to use the thread-enumeration APIs in ntdll.dll (you cannot use ToolHelp here, trust me) and SuspendThread each one but your own, carefully using only VirtualAlloc to allocate memory (because SuspendThread just broke all the heap allocation routines), and you have to do all that in a loop until you find no more.
There's some writeup here that I don't quite feel like I can distill down accurately:
http://forums.codeguru.com/showthread.php?200588-How-to-enumerate-threads-in-currently-running-process
I did not find any solution for Mac OS X.
I have a very strange bug. If I do:
int fd = open("/proc/...", O_WRONLY);
write(fd, argv[1], strlen(argv[1]));
close(fd);
everything works, even for a very long string with length > 1024.
If I do:
FILE *fd = fopen("/proc/...", "wb");
fwrite(argv[1], 1, strlen(argv[1]), fd);
fclose(fd);
the string is cut at around 1024 characters.
I'm running an ARM embedded device with a 3.4 kernel. I have debugged in the kernel, and I see that the string is already cut by the time it reaches the very early function vfs_write (I spotted this function with a WARN_ON instruction to get the stack).
The problem is the same with fputs vs. puts.
I can use fwrite for a very long string (>1024) if I write to a standard rootfs file. So the problem is really linked to how the kernel handles /proc.
Any idea what's going on?
Probably the problem is with buffering.
The issue is that special files, such as those under /proc, are, well..., special; they are not always simple streams of bytes, and may have to be written to (or read from) with specific sizes and/or offsets. You do not say which file you are writing to, so it is impossible to be sure.
The call to fwrite() assumes that the output stream is a plain stream of bytes, so it does smart, fancy things such as buffering, splitting, and copying the given data. On a regular file this will just work, but on a special file, funny things may happen.
Just to be sure, try running strace on both versions of your program and compare the outputs. If you wish, post them for additional comments.
I developed my own log-processing program. To process logs originating from printk(), I read from the kernel ring buffer like this:
#define _PATH_KLOG "/proc/kmsg"
CGR_INT kernelRingBufferFileDescriptor = open(_PATH_KLOG, O_RDONLY|O_NONBLOCK);
CGR_CHAR kernelLogMessage[MAX_KERNEL_RING_BUFFER + 1] = {'\0'};
while (1)
{
    ...
    read(kernelRingBufferFileDescriptor, kernelLogMessage + residueSize, MAX_KERNEL_RING_BUFFER);
    ...
}
My program is in user space. I remember that whenever someone uses read() to read data from the ring buffer (like I did above), the part that is read is cleared from the ring buffer. Is that the case, or is it not?
I am confused about this, since there is always something in the ring buffer, and as a result my program is very busy processing all these logs. So I am not sure whether some module keeps sending me logs, or whether I am reading the same logs again and again because they are not cleared.
To figure it out, I used klogctl() to check the ring buffer:
CGR_CHAR buf[MAX_KERNEL_RING_BUFFER] = {0};
int byteCount = klogctl(4, buf, MAX_KERNEL_RING_BUFFER - 1); /* 4 -- Read and clear all messages remaining in the ring buffer */
printf("%s %d: data read from kernel ring buffer = \"%s\"\n",__FILE__, __LINE__, buf);
and I keep getting data all the time. Since klogctl() with argument 4 reads and clears the ring buffer, I am inclined to believe some module DOES keep sending me logs all the time.
Can anyone tell me: does read() clear the ring buffer?
Become root and run cat /proc/kmsg >> File1.txt and then cat /proc/kmsg >> File2.txt. Compare File1.txt and File2.txt, and you will immediately know whether the ring buffer is getting cleared on read(), because cat internally invokes read() anyway!
Also read about ring buffers and how they behave in the kernel documentation here:
http://www.mjmwired.net/kernel/Documentation/trace/ring-buffer-design.txt
EDIT: I found something interesting in the book Linux Device Drivers by Jonathan Corbet:
The printk function writes messages into a circular buffer that is __LOG_BUF_LEN bytes long: a value from 4 KB to 1 MB chosen while configuring the kernel. The function then wakes any process that is waiting for messages, that is, any process that is sleeping in the syslog system call or that is reading /proc/kmsg. These two interfaces to the logging engine are almost equivalent, but note that reading from /proc/kmsg consumes the data from the log buffer, whereas the syslog system call can optionally return log data while leaving it for other processes as well. In general, reading the /proc file is easier and is the default behavior for klogd. The dmesg command can be used to look at the content of the buffer without flushing it; actually, the command returns to stdout the whole content of the buffer, whether or not it has already been read.
So in your particular case, if you are using a plain read(), I think the buffer is indeed getting cleared and new data is constantly being written into it, and hence you find some data all the time! Kernel experts can correct me here.
From reading the do_syslog function, it seems that messages are cleared when they're read.
By your description, you get the same behavior with klogctl(4), which also clears the buffer, so it makes sense.
So maybe there's indeed someone that keeps writing messages.
You can find which printk it is by its text and disable it, and see what you get. Or you can add the jiffies value to the message, so you'll know whether you keep getting new messages or whether they are the same ones.