How might one go about implementing a disk fragmenter? - utility

I have a few ideas I would like to try out in the Disk Defragmentation Arena. I came to the conclusion that, as a precursor to the implementation, it would be useful to be able to put a disk into a fragmented state. This seems to me to be a state that is more difficult to achieve than a defragmented one. I would assume that the commercial defragmenter companies have probably solved this issue.
So my question.....
How might one go about implementing a fragmenter? What makes sense given the context in which it would be used, namely to test a defragmenter?

Maybe instead of fragmenting the actual disk, you should test your defragmentation algorithm on a simulated/mock disk? Only once you're satisfied that the algorithm itself works as specified would you do the testing on actual disks using the actual disk API.
You could even take snapshots of actual fragmented disks (yours or someone else's) and use that data as a mock model for testing.

How best to fragment depends on the file system.
In general, concurrently open a large number of files. Opening a file will create a new directory entry but won't cause a block to be written for that file. Now go through each file in turn, writing one block to each. Each write typically consumes the next free block, which leaves all your files fragmented with respect to each other.
Fragmenting existing files is another matter. Basically, do the same thing, but do it on a copy of each existing file, then delete the original and rename the copy into its place.
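For concreteness, here is a rough sketch of the interleaved-write idea; the file names, file count and block size are arbitrary assumptions, and filesystems with delayed allocation may partially defeat the effect:

    // Sketch: fragment a fresh set of files by interleaving their writes.
    // The file names, file count and block size are arbitrary assumptions.
    #include <fstream>
    #include <string>
    #include <vector>

    int main() {
        const int    kFileCount     = 64;
        const size_t kBlockSize     = 4096;  // roughly one filesystem block
        const int    kBlocksPerFile = 256;   // ~1 MiB per file
        const std::string block(kBlockSize, 'x');

        // Open all the files first, so none of them owns any data blocks yet.
        std::vector<std::ofstream> files;
        for (int i = 0; i < kFileCount; ++i)
            files.emplace_back("frag_" + std::to_string(i) + ".bin", std::ios::binary);

        // Write one block to each file per pass; successive free blocks tend to be
        // handed out to different files, interleaving them on disk.
        for (int pass = 0; pass < kBlocksPerFile; ++pass) {
            for (auto& f : files) {
                f.write(block.data(), block.size());
                f.flush();  // push each block to the OS before moving to the next file
            }
        }
    }  // destructors close all the files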

I may be oversimplifying here, but if you artificially fragment the disk, won't any tests you run hold true only for the fragmentation created by your fragmenter rather than for any real-world fragmentation? You may end up optimising for assumptions in the fragmenter tool that don't represent real-world occurrences.
Wouldn't it be easier and more accurate to take some disk images of fragmented disks? Do you have any friends or colleagues who trust you not to do anything anti-social with their data?

Fragmentation is a mathematical problem in the sense that you are trying to maximize the distance the hard drive's head travels while performing a specific operation. So in order to fragment something effectively, you need to define that specific operation first.

Related

external multithreading sort

I need to implement an external multithreaded sort. I don't have experience in multithreaded programming, I'm not sure whether my algorithm is good enough, and I don't know how to complete it. My idea is:
A thread reads the next block of data from the input file
Sorts it using a standard algorithm (std::sort)
Writes it to another file
After this I have to merge these files. How should I do this?
If I wait until the input file has been entirely processed before merging, I end up with a lot of temporary files.
If I try to merge files straight after sorting, I cannot come up with an algorithm that avoids merging files of quite different sizes, which will lead to O(N^2) complexity.
I also suppose this is a very common task; however, I cannot find a good ready-made algorithm on the internet. I would be very grateful for a link to one, especially to a C++ implementation.
Well, the answer isn't that simple, and it actually depends on many factors, amongst them the number of items you wish to process, and the relative speed of your storage system and CPUs.
But the question is why use multithreading at all here. Is the data too big to be held in memory? Are there so many items that even a quicksort can't sort them fast enough? Do you want to take advantage of multiple processors or cores? We don't know.
I would suggest that you first write some test routines to measure the time needed to read and write the input and output files, as well as the CPU time needed for sorting. Please note that I/O is generally A LOT slower than CPU execution (they aren't really even comparable), and I/O may not be efficient if you read data in parallel (there is one disk head which has to move back and forth, so reads are in effect serialized; even a solid-state drive is still a single device, with input and output channels). That is, the additional overhead of reading/writing temporary files may more than eliminate any benefit from multithreading.

So I would say: first try an algorithm that reads the whole file into memory, sorts it and writes it out, and put in some time counters to check their relative speeds. If I/O is even some 30% of the total time (yes, that little!), it's definitely not worth it, because with all that reading/merging/writing of temporary files that fraction will rise a lot more, so a solution processing all the data at once would be preferable.
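As an illustration of the kind of timing harness meant here, a minimal sketch using std::chrono (the file names and the one-number-per-line format are assumptions):

    // Sketch of the suggested timing test: read everything, sort in memory,
    // write it out, and report where the time goes.
    #include <algorithm>
    #include <chrono>
    #include <cstdint>
    #include <fstream>
    #include <iostream>
    #include <vector>

    int main() {
        using Clock = std::chrono::steady_clock;

        auto t0 = Clock::now();
        std::vector<std::int64_t> data;
        std::ifstream in("input.txt");                     // placeholder name
        for (std::int64_t v; in >> v;) data.push_back(v);
        auto t1 = Clock::now();

        std::sort(data.begin(), data.end());
        auto t2 = Clock::now();

        std::ofstream out("output.txt");                   // placeholder name
        for (std::int64_t v : data) out << v << '\n';
        auto t3 = Clock::now();

        auto ms = [](auto d) {
            return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
        };
        std::cout << "read:  " << ms(t1 - t0) << " ms\n"
                  << "sort:  " << ms(t2 - t1) << " ms\n"
                  << "write: " << ms(t3 - t2) << " ms\n";
    }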
To conclude, I don't see why you would use multithreading here; the only reason, in my opinion, would be if the data were actually delivered in blocks. But then again, take into account my considerations above about relative I/O and CPU speeds and the additional overhead of reading/writing the temporary files. And a hint: your file access must be very efficient, e.g. reading/writing in larger chunks using application buffers rather than item by item (which saves on system calls); otherwise this may have a detrimental effect, particularly if the file(s) are stored on a machine other than yours (e.g. a server).
Hope you find my suggestions useful.
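Regarding the merge step the question asks about, the usual way to combine many sorted runs without the O(N^2) blow-up is a k-way merge with a priority queue, which is O(N log k) for k runs. A rough sketch, assuming the runs contain newline-separated integers and using made-up file names:

    // Sketch: k-way merge of sorted runs (one integer per line) using a min-heap.
    // File names and the record format are assumptions for illustration.
    #include <cstdint>
    #include <fstream>
    #include <queue>
    #include <string>
    #include <utility>
    #include <vector>

    int main() {
        std::vector<std::string> runNames = {"run0.txt", "run1.txt", "run2.txt"};  // placeholders
        std::vector<std::ifstream> runs;
        for (const auto& name : runNames) runs.emplace_back(name);

        // Min-heap entries: (current value, index of the run it came from).
        using Entry = std::pair<std::int64_t, std::size_t>;
        std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

        // Prime the heap with the first value from each run.
        for (std::size_t i = 0; i < runs.size(); ++i) {
            std::int64_t v;
            if (runs[i] >> v) heap.emplace(v, i);
        }

        std::ofstream out("sorted.txt");  // placeholder output name
        while (!heap.empty()) {
            auto [value, idx] = heap.top();
            heap.pop();
            out << value << '\n';
            std::int64_t next;
            if (runs[idx] >> next) heap.emplace(next, idx);  // refill from the same run
        }
    }

In practice you would also cap the number of runs merged at once (merging in groups) so you don't run out of open file handles.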

Is there a way to show linux buffer cache misses?

I am trying to measure the effects of adding memory to a LAMP server.
How can I find which processes try to read from the Linux buffer cache, but miss and read from disk instead?
SystemTap is one of the best ways to do this, but fair warning: it's difficult to get a great answer. The kernel simply doesn't provide this data directly. You have to infer it based on how many times the system requested a read and how many times a disk was read from. Usually they line up fairly well and you can attribute the difference to the VFS cache, but not always. One problem is LVM: LVM is a "block device", but so are the underlying disk(s), so if you're not careful it's easy to double-count the disk reads.
A while back I took a stab at it and wrote this:
https://sourceware.org/systemtap/wiki/WSCacheHitRate
I do not claim that it is perfect, but it works better than nothing, and usually generates reasonable output as long as the environment is fairly "normal". It does attempt to account for LVM in a fairly crude way.
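Once you have the two counters, the inference itself is simple arithmetic; a toy sketch with made-up numbers (the real counts would come from tracing, e.g. the script above, sampled over the same interval):

    // Toy sketch: approximate page-cache hit rate from two counters.
    #include <cstdint>
    #include <iostream>

    int main() {
        std::uint64_t vfs_reads  = 120000;  // hypothetical: read requests seen at the VFS layer
        std::uint64_t disk_reads = 4500;    // hypothetical: reads that reached the block device

        // Reads not visible at the block layer are attributed to the cache.
        double hit_rate = vfs_reads
            ? 100.0 * static_cast<double>(vfs_reads - disk_reads) / vfs_reads
            : 0.0;
        std::cout << "approximate cache hit rate: " << hit_rate << "%\n";
    }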

Can file size be used to detect a partial append?

I'm thinking about ways for my application to detect a partially-written record after a program or OS crash. Since records are only ever appended to a file (never overwritten), is a crash while writing guaranteed to yield a file size that is shorter than it should be? Is this guaranteed even if the file was opened in read-write mode instead of append mode, so long as writes are always at the end of the file? This would greatly simplify crash recovery, since comparing the last record's expected size and position with the actual file size would be enough to detect a partial write.
I understand that random-access writes can be reordered by the filesystem, but I'm having trouble finding information on whether this can happen when appending. I imagine an out-of-order append would require the filesystem to create a "hole" at the tail of the (sparse) file, write blocks beyond the hole, and then fill in the blocks in between, but I'm hoping that such an approach would be so inefficient that nobody would ever implement their filesystem that way.
I suppose another problem might be a filesystem updating the directory entry's file size field before appending the new blocks to the file, and the OS crashing in between. Does this ever happen in practice? (ext4, perhaps?) Is there a quick way to detect it? (And what happens when trying to read the unwritten blocks that should exist according to the file's size?)
Is there anything else, such as write reordering performed by a disk/flash drive, that would get in the way of using file size as a way to detect a partial append? I don't expect to be able to compensate for this sort of drive trickery in my application, but it would be good to know about.
If you want to be SURE that you're never going to lose records, you need a consistent journaling or transactional system for your files.
There is absolutely no guarantee that a write will have been fulfilled unless you either set O_DIRECT [which you probably do not want to do], or you use markers that indicate "this has been fully committed" and are only written when the file is closed. You can either do that in the main file or, for example, have a separate file that records, externally, the "last written record". If you open and close that file, it should be safe as long as the APP is what is crashing; if the OS crashes [or is otherwise abruptly stopped, e.g. power cut, disk unplugged, etc.], all bets are off.
Write reordering and write caching is/can be done at all levels - the C library, the OS, the filesystem module and the hard disk/controller itself are all ABLE to reorder writes.
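As one possible shape for the marker idea (a per-record variation rather than a write-on-close marker), here is a minimal sketch; the framing format, file name and lack of checksumming are all simplifying assumptions:

    // Sketch: append records framed as [length][payload][commit byte], then on
    // startup keep only fully committed records. The framing format and file
    // name are illustrative assumptions, not the poster's design; a real
    // implementation would add a checksum and fsync() at appropriate points.
    #include <cstdint>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    const char kCommit = 0x01;

    void append_record(std::ofstream& out, const std::string& payload) {
        std::uint32_t len = static_cast<std::uint32_t>(payload.size());
        out.write(reinterpret_cast<const char*>(&len), sizeof len);
        out.write(payload.data(), len);
        out.put(kCommit);  // the marker goes last
        out.flush();       // hands data to the OS only; not crash-durable by itself
    }

    std::vector<std::string> recover(const std::string& path) {
        std::vector<std::string> records;
        std::ifstream in(path, std::ios::binary);
        std::uint32_t len;
        while (in.read(reinterpret_cast<char*>(&len), sizeof len)) {
            std::string payload(len, '\0');
            if (!in.read(&payload[0], len)) break;            // torn payload
            char marker;
            if (!in.get(marker) || marker != kCommit) break;  // torn or missing marker
            records.push_back(std::move(payload));
        }
        return records;  // everything from the first incomplete record onward is dropped
    }

    int main() {
        std::ofstream out("records.log", std::ios::binary | std::ios::app);  // placeholder
        append_record(out, "first record");
        append_record(out, "second record");
        out.close();

        for (const auto& r : recover("records.log"))
            std::cout << r << '\n';
    }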

How to parallelize file reading and writing

I have a program which reads data from 2 text files and then saves the result to another file. Since there is a lot of data to be read and written, which causes a performance hit, I want to parallelize the reading and writing operations.
My initial thought is, using 2 threads as an example, to have one thread read/write from the beginning of the file and another thread read/write from the middle. Since my files are formatted as lines, not fixed-size records (each line may contain a different number of bytes), seeking by byte offset does not work for me. The only solution I could think of is to use getline() to skip over the preceding lines first, which is probably not efficient.
Is there a good way to seek to a specified line in a file? Or do you have any other ideas for parallelizing file reading and writing?
Environment: Win32, C++, NTFS, Single Hard Disk
Thanks.
-Dbger
Generally speaking, you do NOT want to parallelize disk I/O. Hard disks do not like random I/O because they have to continuously seek around to get to the data. Assuming you're not using RAID, and you're using hard drives as opposed to some solid-state memory, you will see a severe performance degradation if you parallelize I/O (and even with those technologies, you can still see some performance degradation when doing lots of random I/O).
To answer your second question, there really isn't a good way to seek to a certain line in a file; you can only explicitly seek to a byte offset using the read function (see this page for more details on how to use it).
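If you do need to jump to arbitrary lines, one workaround is a single sequential pass that records the byte offset of each line, after which you can seek directly; a rough sketch (the file name is a placeholder):

    // Sketch: one sequential pass records the byte offset of every line, after
    // which any line can be reached with a direct seek.
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    int main() {
        std::ifstream in("data.txt", std::ios::binary);  // placeholder name

        // Pass 1: remember where each line starts.
        std::vector<std::streampos> lineStart;
        std::string line;
        lineStart.push_back(in.tellg());
        while (std::getline(in, line)) lineStart.push_back(in.tellg());
        lineStart.pop_back();  // the last entry is end-of-file, not a line start

        // Later: jump straight to, say, line 1000 (0-based) without re-reading.
        std::size_t target = 1000;
        if (target < lineStart.size()) {
            in.clear();                   // clear the EOF/fail state left by pass 1
            in.seekg(lineStart[target]);
            std::getline(in, line);
            std::cout << "line " << target << ": " << line << '\n';
        }
    }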
Queuing multiple reads and writes won't help when you're running against one disk. If your app also performed a lot of work in CPU then you could do your reads and writes asynchronously and let the CPU work while the disk I/O occurs in the background. Alternatively, get a second physical hard drive: read from one, write to the other. For modestly sized data sets that's often effective and quite a bit cheaper than writing code.
This isn't really an answer to your question but rather a re-design (which we all hate but can't help doing). As already mentioned, trying to speed up I/O on a hard disk with multiple threads probably won't help.
However, it might be possible to use another approach depending on data sensitivity, throughput needs, data size, etc. It would not be difficult to create a structure in memory that maintains a picture of the data and allows easy/fast updates of the lines of text anywhere in the data. You could then use a dedicated thread that simply monitors that structure and whose job it is to write the data to disk. Writing data sequentially to disk can be extremely fast; it can be much faster than seeking randomly to different sections and writing it in pieces.
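A minimal sketch of that dedicated-writer idea, with worker code handing finished lines to a queue that a single thread drains to disk; the class name, file format and simple shutdown policy are illustrative assumptions:

    // Sketch: workers enqueue finished lines; one dedicated thread drains the
    // queue and writes sequentially to disk.
    #include <condition_variable>
    #include <fstream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    class LineWriter {
    public:
        explicit LineWriter(const std::string& path)
            : out_(path), writer_([this] { run(); }) {}

        ~LineWriter() {
            {
                std::lock_guard<std::mutex> lock(m_);
                done_ = true;
            }
            cv_.notify_one();
            writer_.join();  // drains whatever is still queued, then exits
        }

        void enqueue(std::string line) {
            {
                std::lock_guard<std::mutex> lock(m_);
                pending_.push(std::move(line));
            }
            cv_.notify_one();
        }

    private:
        void run() {
            std::unique_lock<std::mutex> lock(m_);
            for (;;) {
                cv_.wait(lock, [this] { return done_ || !pending_.empty(); });
                while (!pending_.empty()) {
                    std::string line = std::move(pending_.front());
                    pending_.pop();
                    lock.unlock();          // write without holding the lock
                    out_ << line << '\n';
                    lock.lock();
                }
                if (done_) return;
            }
        }

        std::ofstream out_;
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::string> pending_;
        bool done_ = false;
        std::thread writer_;  // declared last so everything else exists before it starts
    };

    int main() {
        LineWriter w("output.txt");  // placeholder file name
        for (int i = 0; i < 1000; ++i)
            w.enqueue("updated line " + std::to_string(i));
    }  // destructor flushes remaining lines and joins the writer thread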

Can running 'cat' speed up subsequent file random access on a linux box?

On a Linux box with plenty of memory (a few gigs), I need to randomly access a big file as fast as possible.
I was thinking of doing a cat myfile > /dev/null before accessing it, so that the file's pages go into memory sequentially, and are hence faster to reach than with cold random access.
Does this approach make sense to you?
While doing that may force the contents of the file into the system's cache, you are better off using posix_fadvise() (with the POSIX_FADV_WILLNEED advice) or the (blocking) readahead() call to make the kernel precache the data you will need.
EDIT:
You might also want to try using the POSIX_FADV_RANDOM advice to disable readahead altogether.
There's an article with a decent explanation of usage here: Advising the Linux Kernel on File I/O
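For reference, a minimal sketch of the posix_fadvise() approach (the path is a placeholder and error handling is kept to a minimum):

    // Sketch: hint the kernel to prefetch a file before random access.
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        const char* path = "/path/to/myfile";  // placeholder
        int fd = open(path, O_RDONLY);
        if (fd < 0) { std::perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0) { std::perror("fstat"); close(fd); return 1; }

        // Ask the kernel to start reading the whole file into the page cache.
        int rc = posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);
        if (rc != 0) std::fprintf(stderr, "posix_fadvise failed: %d\n", rc);

        // ... proceed with the random reads (pread() etc.) on fd ...
        close(fd);
        return 0;
    }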
As the others said, you'll need to benchmark it in your particular case.
It is quite possible it will result in a significant performance increase though.
On traditional rotating media (i.e. a hard disk), sequential access (cat file > /dev/null, or fadvise) is much faster than random access.
Only one way to be sure that any (possibly premature?) optimization is worthwhile: benchmark it.
It could theoretically speed up the access (especially if you access almost everything from the file), but I wouldn't bet on a big difference.
The only really useful approach is to benchmark it for your specific case.
If you really want the speed, I'd recommend trying memory-mapped I/O instead of trying to hack something up with cat. Of course, it depends on the size of the file you're trying to access and the type of access you want... this may not be possible...
readahead is a good call too...
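A bare-bones sketch of the memory-mapped approach on Linux, with a placeholder path:

    // Sketch: memory-map the file and let page faults pull data in on demand.
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int fd = open("/path/to/myfile", O_RDONLY);  // placeholder
        if (fd < 0) { std::perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return 1; }

        void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { std::perror("mmap"); close(fd); return 1; }
        madvise(p, st.st_size, MADV_RANDOM);  // access pattern is random, not sequential

        const char* bytes = static_cast<const char*>(p);
        std::printf("first byte: %d\n", bytes[0]);  // any bytes[offset] up to st.st_size is valid

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }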
Doing "cat" on a big file might bring the data in and blow more valuable data out of the cache; this is not what you want.
If performance is at all important to you, you'll be doing regular performance testing anyway (and soak tests etc), so continue to do that and watch your graphs, figures etc.
