mmap'ed file - determine which page is dirty - linux

I have a fixed-size file. The file has been ftruncate()'ed to a size = N * getpagesize(). The file holds fixed-size records. I have a writer process which maps the entire file by mmap(...MAP_SHARED...) and modifies records randomly (accessed like an array). I have a reader process which also does mmap(...MAP_SHARED...). Now the reader process needs to determine which page has changed in its mapping as a result of the writer process writing to a random record. Is there a way to do this in userspace? I'm on Linux - x86_64. Platform-specific code/hacks are welcome. Thank you for your time.
Edit: I have no liberty to modify the writer process' code to give me an indication of the modified records in some way.

Relevant documentation:
http://lwn.net/Articles/230975/
https://www.kernel.org/doc/Documentation/vm/pagemap.txt
Determine the virtual page number of each mapped page (i.e. divide the virtual address by the page size, typically 4096), multiply it by 8, and seek to that offset in /proc/<pid>/pagemap
Read 8 bytes; bits 0-54 of this entry are the page frame number (PFN)
Open /proc/kpageflags, seek to PFN * 8, and read 8 bytes
If the DIRTY flag is set, the page is dirty (in other words, the writer has written to it)
Repeat for each page in your mapped file
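
A minimal sketch of those steps in C, assuming the reader has the file mapped at addr for npages pages (check_dirty_pages is an illustrative name, not anything from the question). Note that reading PFNs from pagemap and reading /proc/kpageflags generally requires root (CAP_SYS_ADMIN), and the kernel clears the DIRTY flag again when it writes the page back, so this only catches pages dirtied since the last write-back:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define PFN_MASK    ((1ULL << 55) - 1)   /* pagemap bits 0-54 hold the PFN */
#define PM_PRESENT  (1ULL << 63)         /* pagemap bit 63: page present   */
#define KPF_DIRTY   4                    /* kpageflags bit 4: dirty        */

static void check_dirty_pages(void *addr, size_t npages)
{
    long psize = sysconf(_SC_PAGESIZE);
    int pagemap = open("/proc/self/pagemap", O_RDONLY);
    int kpageflags = open("/proc/kpageflags", O_RDONLY);
    if (pagemap < 0 || kpageflags < 0)
        return;

    for (size_t i = 0; i < npages; i++) {
        uint64_t vpn = ((uintptr_t)addr + i * psize) / psize;
        uint64_t entry, flags;

        /* Each pagemap entry is 8 bytes, indexed by virtual page number. */
        if (pread(pagemap, &entry, 8, vpn * 8) != 8 || !(entry & PM_PRESENT))
            continue;

        /* kpageflags is indexed by page frame number, 8 bytes per entry. */
        uint64_t pfn = entry & PFN_MASK;
        if (pread(kpageflags, &flags, 8, pfn * 8) != 8)
            continue;

        if (flags & (1ULL << KPF_DIRTY))
            printf("page %zu of the mapping is dirty\n", i);
    }
    close(pagemap);
    close(kpageflags);
}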

It's going to be very, very ugly. Most likely, it's just not worth trying to do this and you're better off changing whatever painted you into this corner.
You can use a shared bitmap protected by a lock. The writer write-protects each page (e.g. with mprotect()). If it writes into a protected page, it faults. You'll have to catch the fault, unprotect the page, lock the bitmap, and set the bit corresponding to that page in the bitmap. This tells the reader that the page was modified (a sketch of the writer-side handler follows the reader's steps below).
The reader operates as follows (this is the painful part):
Lock the bitmap.
Make a list of modified pages.
Communicate that list of modified pages to the writer. The writer must protect those pages again and clear their bits in the bitmap. The reader must wait for the writer to complete this before it starts reading, or changes can be lost.
The reader can now read the modified pages.
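
A minimal sketch of the writer-side fault handler described above (keeping in mind the Edit says the writer cannot actually be modified). dirty_bitmap and mapping_base are illustrative names, and the bit is set with a plain OR for brevity; a real implementation would use an atomic read-modify-write or the shared lock, bearing in mind that ordinary locks are not safe to take inside a signal handler:

#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

uint8_t *dirty_bitmap;   /* shared with the reader, e.g. another MAP_SHARED region */
void    *mapping_base;   /* start of the writer's mapping of the data file         */

static void on_write_fault(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    uintptr_t psize = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t page  = (uintptr_t)si->si_addr & ~(psize - 1);
    size_t    idx   = (page - (uintptr_t)mapping_base) / psize;

    /* Record the page as dirty, then make it writable again so the
     * faulting store is restarted and succeeds. */
    dirty_bitmap[idx / 8] |= (uint8_t)(1u << (idx % 8));
    mprotect((void *)page, psize, PROT_READ | PROT_WRITE);
}

void install_fault_handler(void)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);
}

The writer would mprotect() the whole mapping PROT_READ right after mmap(), and again for each page the reader hands back in the re-protect step above.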

Related

how to determine page(cache) boundaries when writing to a file

In Linux, when writing to a file, the kernel maintains multiple in-memory pages (4 KB in size). Data is first written to these pages, and a background process (bdflush) sends the data to the disk drive.
Is there a way to determine page boundaries when writing sequentially to a file?
Can I assume it is always 1-4096: page 1 and 4097-8192: page 2?
Or can it vary?
Say I start writing from offset 10 (i.e. the first 10 bytes were already written to the file previously and I set the file position to 10 before writing): will the page boundary still be
1-4096 : page 1
OR
10-5096 : page 1?
Reason for asking:
I can use sync_file_range
http://man7.org/linux/man-pages/man2/sync_file_range.2.html
to flush data from kernel pages to the disk drive in an orderly manner.
If I can determine page boundaries, I can call sync_file_range only when a page boundary is reached, so that unnecessary sync_file_range calls are avoided.
Edit:
The only positive indication of such boundary alignment I could find was in the mmap man page, which asks for the offset to be a multiple of the page size:
"offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE)"
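
For what it's worth, the page cache tracks file data in page-sized blocks aligned to multiples of the page size from offset 0, regardless of the offset at which writing starts (so with 4096-byte pages, bytes 0-4095 belong to one page and 4096-8191 to the next). A hedged sketch of calling sync_file_range only for pages that have been completely written, assuming a descriptor fd and illustrative names such as flush_completed_pages:

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* prev_off: where the last flush (or the writing) started,
 * new_off:  current write position after the latest write() */
static void flush_completed_pages(int fd, off_t prev_off, off_t new_off)
{
    off_t psize = (off_t)sysconf(_SC_PAGESIZE);
    off_t start = prev_off & ~(psize - 1);   /* page containing prev_off       */
    off_t end   = new_off & ~(psize - 1);    /* page new_off currently sits in */

    /* Only pages lying entirely below `end` are fully written, so ask
     * the kernel to start write-back for just that range. */
    if (end > start)
        sync_file_range(fd, start, end - start, SYNC_FILE_RANGE_WRITE);
}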

Linux file dirty page write back order

In Linux, for a single file, what is the dirty page write-back (to disk) order? Is it from beginning to end, or out of order?
Scenario 1: without overwrite
A file (on disk) is created and a large amount of data is written (sequentially) quickly. I presume this data would end up in multiple page-cache pages. When the dirty pages are written back, are they written back in order?
E.g. say the server shuts down before the file write completes.
Now, after reboot, can we have the disk file in the state below:
|--correct data --|---data unset/garbage--|--correct data--|
i.e. I understand the last set of bytes in the file can be incomplete, but can data in the middle be incomplete?
Scenario 2: with overwrite (attempt to use the file as a circular/ring buffer)
A file is created and data written; after reaching a maximum size, fsync is called (i.e. data + metadata synchronized). Then the file pointer is moved to the beginning of the file and data is written sequentially (no fsync done).
Now, due to a server shutdown, can we have the disk file in the state below after reboot:
|--Newly written data--|--Old data--|--New data--|...
i.e. for new data, some pages were written to disk out of order
OR
can I assume it is always
|--Newly written data--|----Newly written data--|--Old data--|
i.e. old data and new data will not mix up (if present, old data would only be at the end of the file)
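
To make scenario 2 concrete, a minimal sketch of the ring-style write pattern described above (fd, MAX_SIZE and ring_write are illustrative names; short writes and errors are ignored). The fsync() only makes the finished pass durable before wrapping; it does not by itself constrain the order in which the kernel later writes back the un-synced pass:

#include <unistd.h>

#define MAX_SIZE (64 * 1024 * 1024)   /* illustrative maximum file size */

/* Append `len` bytes, wrapping to the start of the file once MAX_SIZE
 * would be exceeded; *pos tracks the current file offset. */
static void ring_write(int fd, const void *buf, size_t len, off_t *pos)
{
    if (*pos + (off_t)len > MAX_SIZE) {
        fsync(fd);                /* make the completed pass durable */
        lseek(fd, 0, SEEK_SET);   /* wrap to the beginning           */
        *pos = 0;
    }
    if (write(fd, buf, len) == (ssize_t)len)
        *pos += (off_t)len;
}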

What is a quick way to check if file contents are null?

I have a rather large file (32 GB) which is an image of an SD card, created using dd.
I suspected that the file is empty (i.e. filled with the null byte \x00) starting from a certain point.
I checked this using python in the following way (where f is an open file handle with the cursor at the last position I could find data at):
for i in xrange(512):
    if set(f.read(64*1048576)) != set(['\x00']):
        print i
        break
This worked well (in fact it revealed some data at the very end of the image), but took >9 minutes.
Has anyone got a better way to do this? There must be a much faster way, I'm sure, but cannot think of one.
Looking at a guide about memory buffers in Python, I suspected that the comparator itself was the issue. In most non-typed languages memory copies are not very obvious despite being a killer for performance.
In this case, as Oded R. established, reading into a preallocated buffer and comparing it with a previously prepared nul-filled one is much more efficient.
size = 512
data = bytearray(size)
cmp = bytearray(size)  # stays all zeroes, used for comparison
And when reading:
f = open(FILENAME, 'rb')
f.readinto(data)
if data != cmp:
    pass  # non-zero bytes found in this block
Two things that need to be taken into account are:
The size of the compared buffers should be equal, but comparing bigger buffers should be faster up to some point (I would expect memory fragmentation to be the main limit).
The last buffer may not be the same size; reading the file into the prepared buffer will keep the trailing zeroes where we want them.
Here the comparison of the two buffers will be quick, there will be no attempt to cast the bytes to strings (which we don't need), and since we reuse the same memory all the time, the garbage collector won't have much work either... :)

Changing the head of a large Fortran binary file without dealing with the whole body

I have a large binary file (~ GB size) generated from a Fortran 90 program. I want to modify something in the head part of the file. The structure of the file is very complicated and contains many different variables, which I want to avoid going into. After reading and re-writing the head, is it possible to "copy and paste" the remainder of the file without knowing its detailed structure? Or, even better, can I avoid re-writing the whole file altogether and just make changes to the original file? (Not sure if it matters, but the length of the header will be changed.)
Since you are changing the length of the header, I think you have to write a new, revised file. You could avoid having to "understand" the records after the header by opening the file with stream access and just reading bytes (or perhaps four-byte words, if the file is a multiple of four bytes) until you reach EOF, and copying them to the new file. But if the file was originally created as sequential access and you want to access it that way in the future, you will have to handle the record length information for the header record(s), including altering the value(s) to be consistent with the changed length of the record(s). This record length information is typically a four-byte integer at the beginning and end of each record, but it depends on the compiler.

potential dangers of merging two files of unknown size?

I have a binary file that I need to insert a header at the beginning of. I was thinking of opening a new file, writing the header data, and then copying the data from the binary file to this new file. Since the binary file is about 1 megabyte, are there any dangers to making this file using fwrite? One specific concern would be something like unintentionally overwriting data, similar to what happens if using gets and the input is longer than the buffer.
There's no risk. Allocate a buffer of a given size, read that many bytes into it from the source file, write the buffer back out to the destination file. The operations (file read / file write) all take a maximum number of bytes so as long as your buffer is the size you claim it is, it won't be overrun.
Also, the approach you describe is pretty much the only way to do it. I've never heard of a filesystem that has an "insert these bytes at the beginning of this file" operation.
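
A minimal sketch of that approach, with illustrative names like prepend_header; each fread()/fwrite() is bounded by the buffer size, which is why the buffer cannot be overrun:

#include <stdio.h>

static int prepend_header(const char *src_path, const char *dst_path,
                          const void *header, size_t header_len)
{
    FILE *src = fopen(src_path, "rb");
    FILE *dst = fopen(dst_path, "wb");
    char buf[4096];
    size_t n;

    if (!src || !dst)
        return -1;

    fwrite(header, 1, header_len, dst);            /* new header goes first */
    while ((n = fread(buf, 1, sizeof buf, src)) > 0)
        fwrite(buf, 1, n, dst);                    /* copy at most n bytes  */

    fclose(src);
    fclose(dst);
    return 0;
}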
