Does the offset of `pread` represent the real offset on disk?

Random reads are usually slower than sequential reads. I want to make my program faster, so my question is: do the following two reads amount to sequential reads on the disk?
pread(fd, buf, size, offset);
pread(fd, buf, size, offset + size);
Or, under which conditions can these two reads be sequential?
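For concreteness, here is a minimal compilable sketch of the two adjacent reads being asked about (pread(2) also takes a buffer argument; the file name, read size, and offset are made up for the example):
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const size_t size = 4096;               /* example read size */
    const off_t offset = 0;                 /* example starting offset */
    char *buf = malloc(2 * size);
    int fd = open("datafile", O_RDONLY);    /* hypothetical file */

    if (fd < 0 || buf == NULL)
        return 1;

    /* Two reads that are adjacent in terms of file offsets. */
    pread(fd, buf, size, offset);
    pread(fd, buf + size, size, offset + size);

    close(fd);
    free(buf);
    return 0;
}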
Thanks!

Related

What's the difference between read_bytes and read_char in <linux/taskstats.h>?

I'm trying to use Taskstats to measure a program. I'm confused about the meaning of read_bytes and read_char: what is the difference between them, and do they count the I/O size of actual disk reads?
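For reference, a minimal sketch of one way to look at the corresponding per-process counters: as far as I understand, /proc/[pid]/io (documented in proc(5)) exposes the same split, where rchar counts bytes passed through read()-style syscalls (possibly served from the page cache) and read_bytes counts bytes actually fetched from the storage layer.
#include <stdio.h>

int main(void)
{
    /* Dump this process's own I/O accounting counters. */
    FILE *f = fopen("/proc/self/io", "r");
    char line[128];

    if (f == NULL)
        return 1;
    while (fgets(line, sizeof line, f) != NULL)
        fputs(line, stdout);   /* prints rchar, wchar, read_bytes, write_bytes, ... */
    fclose(f);
    return 0;
}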

How to avoid programs in status D

I wrote programs that read and write data (each opens one input file and one output file, reads part of the input file, processes it, then writes to the output file, and the cycle repeats), with a total I/O rate of about 200 MB/s. However, for most of the running time they are in status D, which means waiting for I/O (as shown in the figure). I used dd to check the write speed on my system, which is about 1.8 GB/s.
Are my programs inefficient?
Or does my hard disk have problems?
How can I deal with it?
If you are using ifort, you must explicitly enable buffered I/O, either with the -assume buffered_io flag when compiling or by setting buffered='yes' in the open statement.
If you are using gfortran, this is the default, so the problem must be something else.
Edit
I can add that, depending on how you read and write the data, most of the time can be spent parsing it, i.e. decoding ASCII characters like "123" and converting them from base 10 to base 2 until they are machine-readable data, then doing the opposite when writing. This is the case if you structure your code like this:
real :: vector1(10)
do
  read(5,*) vector1   ! each line has 10 values
  write(6,*) vector1
enddo
If you instead do the following, it will be much faster:
character(1000) :: line1   ! use enough characters so the whole line fits
do
  read(5,'(A)') line1
  write(6,'(A)') line1
enddo
Now you are just pumping ASCII through the program without even knowing whether it is digits or maybe "ääåö(=)&/&%/(¤%/&Rhgksbks---31". With these modifications I think you should reach the maximum speed of your disk.
Notice also that there is a write cache in most drives, which is faster than the disk read/write speeds, meaning that you might first be throttled by the read speed, and after filling up the write cache, be throttled by the write speed, which is usually lower than the read speed.

Is overwriting a small file atomic on ext4?

Assume we have a file of FILE_SIZE bytes, and:
FILE_SIZE <= min(page_size, physical_block_size);
file size never changes (i.e. truncate() or append write() are never performed);
file is modified only by completely overwriting its contents using:
pwrite(fd, buf, FILE_SIZE, 0);
Is it guaranteed on ext4 that:
Such writes are atomic with respect to concurrent reads?
Such writes are transactional with respect to a system crash?
(i.e., after a crash the file's contents come entirely from some previous write, and we'll never see a partial write or an empty file)
Is the second guarantee true:
with data=ordered?
with data=journal or alternatively with journaling enabled for a single file?
(using ioctl(fd, EXT4_IOC_SETFLAGS, EXT4_JOURNAL_DATA_FL))
when physical_block_size < FILE_SIZE <= page_size?
I've found a related question which links to a discussion from 2011. However:
I didn't find an explicit answer for my question 2.
I wonder, if the above is true, is it documented somewhere?
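For the sub-question about enabling journaling for a single file, here is a minimal sketch of how I would try it from userspace; I'm assuming the generic FS_IOC_GETFLAGS/FS_IOC_SETFLAGS interface and FS_JOURNAL_DATA_FL from <linux/fs.h>, which I believe correspond to the EXT4_IOC_SETFLAGS/EXT4_JOURNAL_DATA_FL names mentioned above, and the file name is made up:
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("smallfile", O_RDWR);   /* hypothetical file */
    int flags = 0;

    if (fd < 0)
        return 1;
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)   /* read current inode flags */
        return 1;
    flags |= FS_JOURNAL_DATA_FL;                  /* ask ext4 to journal data, not just metadata */
    if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
        return 1;
    close(fd);
    return 0;
}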
From my experiment it was not atomic.
Basically, my experiment was to have two processes, one writer and one reader. The writer writes to a file in a loop and the reader reads from the same file.
Writer Process:
char buf[][18] = {
    "xxxxxxxxxxxxxxxx",   // 16 'x' characters plus NUL padding
    "yyyyyyyyyyyyyyyy"    // 16 'y' characters plus NUL padding
};
int i = 0;
while (1) {
    pwrite(fd, buf[i], 18, 0);   // overwrite the start of the file every iteration
    i = (i + 1) % 2;
}
Reader Process
char readbuf[18];
while (1) {
    pread(fd, readbuf, 18, 0);
    // check whether readbuf matches buf[0] or buf[1]
}
After running both processes for a while, I could see that readbuf was sometimes xxxxxxxxxxxxxxxxyy or yyyyyyyyyyyyyyyyxx.
So it definitively shows that these writes are not atomic. In my case, 16-byte writes were always atomic.
The answer was: POSIX doesn't mandate atomicity for writes/reads except for pipes. The 16-byte atomicity that I saw was kernel specific and may change in the future.
Details of the answer in the actual post:
write(2)/read(2) atomicity between processes in linux
I am familiar with the theory of filesystems in general, not with the implementation of ext4. Take this as an educated guess.
Yes, I believe single-sector reads and writes will be atomic, because:
The link you provided quotes: "Currently concurrent reads/writes are atomic only wrt individual pages, however are not on the system call."
Disk sector (512-byte) writes are atomic according to Stephen Tweedie. In a private email conversation with him, he acknowledged that this guarantee is only as good as the hardware.
Ext filesystems overwrite data in place; there is no copy-on-write and no new allocation.
There is some effort to implement inline data, where very small files' data can fit in the inode itself. If you only need to store a few bytes, that may have an impact.
I am not sure about one page, but it would make little sense in full journaling mode to send less than a page to the journal before committing.
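Since the conditions above depend on how FILE_SIZE compares to page_size and physical_block_size, here is a small hedged sketch of how one could query those values: sysconf(3) for the page size, and the BLKSSZGET/BLKPBSZGET ioctls from <linux/fs.h> for the device's logical/physical sector sizes (the device path is made up):
#include <fcntl.h>
#include <linux/fs.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    int logical = 0, physical = 0;
    int fd = open("/dev/sda", O_RDONLY);   /* hypothetical block device */

    if (fd < 0)
        return 1;
    ioctl(fd, BLKSSZGET, &logical);        /* logical sector size, often 512 */
    ioctl(fd, BLKPBSZGET, &physical);      /* physical sector size, e.g. 512 or 4096 */
    printf("page=%ld logical=%d physical=%d\n", page_size, logical, physical);
    close(fd);
    return 0;
}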

Optimal buffer size with Node.js?

I have a situation where I need to take a stream and chunk it up into Buffers. I plan to write an object transform stream which takes regular input data, and outputs Buffer objects (where the buffers are all the same size). That is, if my chunker transform is configured at 8KB, and 4KB is written to it, it will wait until an additional 4KB is written before outputting an 8KB Buffer instance.
I can choose the size of the buffer, as long as it is in the ballpark of 8KB to 32KB. Is there an optimal size to pick? The reason I'm curious is that the Node.js documentation speaks of using SlowBuffer to back a Buffer, and allocating a minimum of 8KB:
In order to avoid the overhead of allocating many C++ Buffer objects for small blocks of memory in the lifetime of a server, Node allocates memory in 8Kb (8192 byte) chunks. If a buffer is smaller than this size, then it will be backed by a parent SlowBuffer object. If it is larger than this, then Node will allocate a SlowBuffer slab for it directly.
Does this imply that 8KB is an efficient size, and that if I used 12KB, there would be two 8KB SlowBuffers allocated? Or does it just mean that the smallest efficient size is 8KB? What about simply using multiples of 8KB? Or, does it not matter at all?
Basically it's saying that if your Buffer is less than 8KB, it'll try to fit it into a pre-allocated 8KB chunk of memory. It'll keep putting Buffers in that 8KB chunk until one doesn't fit, then it'll allocate a new 8KB chunk. If the Buffer is larger than 8KB, it'll get its own memory allocation.
You can actually see what's happening by looking at the node source for buffer here:
if (this.length <= (Buffer.poolSize >>> 1) && this.length > 0) {
  if (this.length > poolSize - poolOffset)
    createPool();
  this.parent = sliceOnto(allocPool,
                          this,
                          poolOffset,
                          poolOffset + this.length);
  poolOffset += this.length;
} else {
  alloc(this, this.length);
}
Looking at that, it actually looks like it'll only put the Buffer into a pre-allocated chunk if it's less than or equal to 4KB (Buffer.poolSize >>> 1, which is 4096 when Buffer.poolSize = 8 * 1024).
As for an optimal size to pick in your situation, I think it depends on what you end up using it for. But, in general, if you want a chunk less than or equal to 8KB, I'd pick something less than or equal to 4KB that fits evenly into that 8KB pre-allocation (4KB, 2KB, 1KB, etc.). Otherwise, chunk sizes greater than 8KB shouldn't make much of a difference.

When a user types the size command in Linux/Unix, what does the result mean?

I've been wondering about the sizes of the bss, data, and text sections that I have, so I typed the size command.
The result is
text data bss dec hex filename
5461 580 24 ....
What do the numbers mean? Is the unit bits, bytes, kilobytes, or megabytes?
In addition, how can I reduce the size of the bss, data, and text sections of the file? (Not using the strip command.)
That command shows a list of the sections found in an object file and their sizes. The unit is decimal bytes, unless a different display format was specified. There is most likely a man page for the size command too.
"reduce the size" - modify source code. Take things out.
As for the part about reducing segment size, you have some leeway in moving parts from data to bss by not initializing them. This is only an option if the program initializes the data in another way.
You can reduce data or bss by replacing arrays with dynamically allocated memory, using malloc and friends.
Note that the bss takes no space in the executable and reducing it just for the sake of having smaller numbers reported by size is probably not a good idea.
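To illustrate these points, here is a small sketch in C of how initialization and dynamic allocation move an array between the data segment, the bss segment, and the heap; the exact numbers size reports will vary with the compiler and platform:
#include <stdlib.h>

/* Lands in .data: initialized with non-zero values, so the initial
 * contents must be stored in the executable itself. */
int initialized_table[1000] = { 1 };

/* Lands in .bss: uninitialized (or zero-initialized) data takes no
 * space in the executable, only in memory at run time. */
int zeroed_table[1000];

int main(void)
{
    /* Heap allocation: contributes to neither .data nor .bss as
     * reported by size; the program must fill it in itself. */
    int *dynamic_table = malloc(1000 * sizeof *dynamic_table);

    if (dynamic_table == NULL)
        return 1;
    free(dynamic_table);
    return 0;
}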
