A disk write on Linux goes to memory first and is written to the disk later, at an appropriate time. Also, while the disk completes the write, the CPU is free to run other processes.
In that case, disk writes should not affect the computing performance of Linux. Is this correct?
To release the CPU from performing read and write operations on peripherals, Direct Memory Access (DMA) is used. The DMA controller, assuming your Linux system has one, is instructed by the CPU to perform the data transfer, so the CPU still needs to initiate it. Additionally, the DMA controller works on a bus that the rest of your system might also want to use. Even so, your CPU should not be affected much by a data transfer.
As said in the title, I don't really understand the usage of this syscall. I was writing a program that writes some data to a file, and the tutorial I followed told me to use the sys_sync syscall. But my problem is: why and when should we use this? Isn't the data already written to the file?
The manual says:
sync - Synchronize cached writes to persistent storage
So it is written to the file cache in memory, not on disk.
You rarely have to use sync unless you are writing really important data and need to make sure that data is on disk before you go on. One example of systems that use sync a lot are databases (such as MySQL or PostgreSQL).
So in other words, it is theoretically in your file, just not on disk. If you lose power, you could therefore lose the data. This is especially true if you have a lot of RAM and many writes in a row: the kernel may keep favoring writes to the cache for a long while, increasing the risk of data loss.
But how can a file not be on the disk? I understand the concept of a cache, but if I wrote to the disk, why would the data be in a different place?
First, when you write to a file, you send the data to the kernel. You don't send it directly to the disk. Some kernel driver is then responsible for writing the data to disk. In my days on Apple 2 and Amiga computers, I would actually read from and write to the disk directly. At least the Amiga had DMA, so you could set up a buffer, then tell the disk I/O to do a read or a write, and it would send you an interrupt when done. On the Apple 2, you had to write loops in assembly language with precise timings to read/write data on floppy disks... A different era!
You could, of course, still access the disk directly (but with a kernel like Linux, you'd have to make sure the kernel gives you a free hand to do that...).
Cache is primarily used for speed. It is very slow to write to disk (as far as a human is concerned, it looks extremely fast, but compared to how much data the CPU can push to the drive, it's still slow).
So what happens is that the kernel has a task to write data to disk. That task wakes up as soon as data appears in the cache and ends once all the caches are transferred to disk. This task works in parallel. You can have one such task per drive (which is especially useful when you have a system such as RAID 1).
If your application fills up the cache, then a further write will block until some of the cache can be replaced.
and the tutorial I've seen told me to use sys_sync syscall
Well that sounds silly, unless you're doing filesystem write benchmarking or something.
If you have one really critical file that you want to make sure is "durable" wrt. power outages before you do something else (like sending a network packet to acknowledge a complete transfer), use fsync(fd) to sync just that one file's data and metadata.
(In asm, call number SYS_fsync from sys/syscall.h, with the file descriptor as the first register arg.)
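As a rough sketch of the "fsync before you acknowledge" pattern described above, here is how it might look in Python, whose os.fsync() wraps the same system call. The helper name durable_write and the file name are mine, purely for illustration.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path, returning only after fsync() reports success."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # write() may accept fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Up to this point the data may live only in the kernel's page cache.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = os.path.join(tmp_dir, "transfer.bin")  # hypothetical file name
        durable_write(target, b"payload received\n")
        # Only now would it be reasonable to send the acknowledgement packet.
        print(os.path.getsize(target), "bytes synced")
```

Note that this syncs one file only, which is the point: it avoids forcing write-back of everything else on the system the way sync would.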
But my problem is why and when should we use this?
Generally never use the sync system call in programs you're writing.
There are interactive use-cases where you'd normally use the wrapper command of the same name, sync(1). e.g. with removable media, to get the kernel started doing write-back now, so unmount will take less time once you finish typing it. Or for some benchmarking use-cases.
The system shutdown scripts may run sync after unmounting filesystems (and remounting / read-only), before making a reboot(2) system call.
Re: why sync(2) exists
No, your data isn't already on disk right after echo foo > bar.txt.
Most OSes, including Linux, do write-back caching, not write-through, for file writes.
You don't want write() system calls to wait for an actual magnetic disk when there's free RAM, because the traditional way to do I/O is synchronous so simple single-threaded programs wouldn't be able to do anything else (like reading more data or computing anything) while waiting for write() to return. Blocking for ~10 ms on every write system call would be disastrous; that's as long as a whole scheduler timeslice. (It would still be bad even with SSDs, but of course OSes were designed before SSDs were a thing.) Even just queueing up the DMA would be slow, especially for small file writes that aren't a whole number of aligned sectors, so even letting the disk's own write-back write caching work wouldn't be good enough.
Therefore, file writes do create "dirty" pages of kernel buffers that haven't yet been sent to the disk. Sometimes we can even avoid the I/O entirely, e.g. for tmp files that get deleted before anything triggers write-back. On Linux, the kernel flusher threads wake up every dirty_writeback_centisecs (default 500, i.e. 5 seconds) and write back pages that have been dirty for longer than dirty_expire_centisecs (default 3000, i.e. 30 seconds), unless the system runs low on free pages first. (Heuristics for what "low" means use other tunable values.)
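If you want to see what your own machine uses, the write-back tunables are exposed as files under /proc/sys/vm on Linux. A small sketch for reading them follows; the function name read_vm_tunable is made up here, and dirty_expire_centisecs is a related tunable I'm adding to the one named above. Distros and laptop-mode tools may change these values, so don't assume the kernel defaults.

```python
from pathlib import Path


def read_vm_tunable(name, base="/proc/sys/vm"):
    """Return the integer value of a vm tunable, or None if it can't be read."""
    try:
        return int((Path(base) / name).read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no /proc, or an unexpected file format.
        return None


if __name__ == "__main__":
    for name in ("dirty_writeback_centisecs", "dirty_expire_centisecs"):
        value = read_vm_tunable(name)
        if value is None:
            print(f"{name}: not available on this system")
        else:
            # The unit is hundredths of a second.
            print(f"{name} = {value} ({value / 100:g} s)")
```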
If you really want writes to flush to disk immediately and wait for data to be on disk, mount with -o sync. Or for one program, have it use open(O_SYNC) or O_DSYNC (for just the data, not metadata like timestamps).
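For the per-program option, a minimal sketch in Python (os.open passes the flags straight through to open(2)). The helper name and the journal file name are hypothetical; O_SYNC and O_DSYNC are only exposed by the os module on platforms that have them, hence the getattr check.

```python
import os
import tempfile


def open_for_sync_writes(path, data_only=False):
    """Open path so each write() returns only after the data reaches storage."""
    flag_name = "O_DSYNC" if data_only else "O_SYNC"
    sync_flag = getattr(os, flag_name, None)
    if sync_flag is None:
        raise NotImplementedError(f"{flag_name} is not available on this platform")
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag
    return os.open(path, flags, 0o644)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        journal = os.path.join(tmp_dir, "journal.log")  # hypothetical file name
        fd = open_for_sync_writes(journal, data_only=True)
        try:
            # Each write() now blocks until the device reports completion,
            # so expect it to be much slower than a cached write.
            os.write(fd, b"record 1\n")
        finally:
            os.close(fd)
```

Expect every write to cost roughly one device round trip, which is why this is normally reserved for things like journals and databases.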
See Are file reads served from dirtied pages in the page cache?
There are other advantages to write-back, including delayed allocation even at the filesystem level. The FS can wait until it knows how big the file will be before even deciding where to put it, allowing better decisions that reduce fragmentation. e.g. a small file can go into a gap that would have been a bad place to start a potentially-large file. (It just has to reserve space to make sure it can put it somewhere.) XFS was one of the first filesystems to do "lazy" delayed allocation, and ext4 has also had the feature for a while.
https://en.wikipedia.org/wiki/XFS#Delayed_allocation
https://en.wikipedia.org/wiki/Allocate-on-flush
https://lwn.net/Articles/323169/
I am learning computer organization but struggling with the following concept. In non-DMA scenarios, do all disk reads follow the following sequence to get into main memory:
Disk storage surface -> Disk registers -> CPU registers -> Main memory
Similarly for writes, is the sequence:
Main memory -> CPU registers -> Disk registers -> Disk storage surface
(I know that in a DMA scenario, the CPU only initiates the transfer after which the content of the disks are transferred directly to main memory).
If yes, before DMA came, was the above sequence a serious bottleneck as overall CPU registers' capacity is much less compared to main memory and storage disk? Or it is so fast that a human user won't notice in non-DMA modes?
PS: Please bear with my rudimentary terminology, but I hope I conveyed what I want to ask.
Yes, what you describe is what happened in the bad old days with programmed-I/O instead of DMA.
For example, IDE disk-controller hardware used to be less well standardized, so the Linux drivers defaulted to programmed I/O (i.e. a copy loop using x86 IN instructions, since ATA predated memory-mapped I/O registers being common). For decent performance, you had to manually enable DMA in your boot scripts.
But before doing that, you had to check that manually enabling DMA didn't lead to lockups or, far worse, cause data corruption.
re: memory-mapped file: nothing to do with how the data gets from disk into the pagecache (or vice versa). mmap() just means your process's address space includes a shared mapping of the same pages that the OS is using to cache the file's contents.
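To illustrate the "same pages" point, here is a small sketch, assuming Linux (or another OS with a unified page cache): a store through a shared mapping is immediately visible to an ordinary read() on the same file, with no explicit flush, because both paths go through the same cached pages. The function name is made up for the example.

```python
import mmap
import os
import tempfile


def mapped_write_then_read():
    """Modify a file through mmap, then read it back with a normal read()."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
    path = tmp.name
    try:
        # Length 0 maps the whole file; on Unix the default mapping is MAP_SHARED.
        with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mapping:
            mapping[0:5] = b"HELLO"   # a plain memory store, no write() call
            f.seek(0)
            return f.read()           # served from the same page-cache pages
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(mapped_write_then_read())
```

Nothing here says when the modified page reaches the disk; that is still up to write-back (or msync/fsync).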
More than once I have stumbled upon the terms "non-coherent" and "coherent" memory in tech papers related to graphics programming. I have been searching for a simple and clear explanation, but found mostly 'hardcore' papers of this type. I would be glad to receive a layman's-style answer on what coherent memory actually is on GPU architectures and how it compares to other (probably non-coherent) memory types.
Memory is memory. But different things can access that memory. The GPU can access memory, the CPU can access memory, maybe other hardware bits, whatever.
A particular thing has "coherent" access to memory if changes made by others to that memory are visible to the reader. Now, you might think this is foolishness. After all, if the memory has been changed, how could someone possibly be unable to see it?
Simply put, caches.
It turns out that changing memory is expensive. So we do everything possible to avoid changing memory unless we absolutely have to. When you write a single byte from the CPU to a pointer in memory, the CPU doesn't write that byte yet. Or at least, not to memory. It writes it to a local copy of that memory called a "cache."
The reason for this is that, generally speaking, applications do not write (or read) single bytes. They are more likely to write (and read) lots of bytes, in small chunks. So if you're going to perform an expensive operation like a memory load or store, you should load or store a large chunk of memory. So you store all of the changes you're going to make to a chunk of memory in a cache, then make a single write of that cached chunk to actual memory at some point in the future.
But if you have two separate devices that use the same memory, you need some way to be certain that writes one device makes are visible to other devices. Most GPUs can't read the CPU cache. And most CPU languages don't have language-level support to say "hey, that stuff I wrote to memory? I really mean for you to write it to memory now." So you usually need something to ensure visibility of changes.
In Vulkan, memory which is labeled by VK_MEMORY_PROPERTY_HOST_COHERENT_BIT means that, if you read/write that memory (via a mapped pointer, since that's the only way Vulkan lets you directly write to memory), you don't need to use functions vkInvalidateMappedMemoryRanges/vkFlushMappedMemoryRanges to make sure the CPU/GPU can see those changes. The visibility of any changes is guaranteed in both directions. If that flag isn't available on the memory, then you must use the aforementioned functions to ensure the coherency of the specific regions of data you want to access.
With coherent memory, one of two things is going on in terms of hardware. Either CPU access to the memory is not cached in any of the CPU's caches, or the GPU has direct access to the CPU's caches (perhaps due to being on the same die as the CPU(s)). You can usually tell that the latter is happening, because on-die GPU implementations of Vulkan don't bother to offer non-coherent memory options.
If memory is coherent then all threads accessing that memory must agree on the state of the memory at all times, e.g.: if thread 0 reads memory location A and thread 1 reads the same location at the same time, both threads should always read the same value.
But if memory is not coherent then threads 0 and 1 might read back different values. Thread 0 could think that location A contains a 1, while thread 1 thinks that the same location contains a 2. The different threads would have an incoherent view of the memory.
Coherence is hard to achieve with a high number of cores. Often every core must be aware of memory accesses from all other cores. So if you have 4 cores in a quad-core CPU, coherence is not that hard to achieve, as every core must be informed about the addresses of memory accesses by 3 other cores; but in a GPU with 16 cores, every core must be made aware of the memory accesses by 15 other cores. The cores exchange data about the content of their caches using so-called "cache coherence protocols".
This is why GPUs often only support limited forms of coherency. If some memory locations are read only or are only accessed by a single thread, then no coherence is required. If caches are small and coherence is not always required but only at specific instructions of the program, then it is possible to achieve correct behavior of the program using cache flushes before or after specific memory accesses.
If your hardware offers both coherent and non-coherent memory types, then you can expect that non-coherent memory will be faster, but if you try to run parallel algorithms using this memory they will fail in really weird ways.
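The stale-read scenario and the flush-based fix described above can be illustrated with a toy model. This is a deliberately simplified simulation in plain Python, not how any real CPU or GPU implements its caches; the class and method names are invented for the example.

```python
class CachedCore:
    """A core with a private write-back cache and no coherence protocol."""

    def __init__(self, memory):
        self.memory = memory   # shared dict standing in for main memory
        self.cache = {}        # this core's private copies
        self.dirty = set()     # addresses written locally but not yet flushed

    def read(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        return self.cache[addr]

    def write(self, addr, value):
        # The store lands in the private cache only.
        self.cache[addr] = value
        self.dirty.add(addr)

    def flush(self):
        """Push local modifications out to shared memory."""
        for addr in self.dirty:
            self.memory[addr] = self.cache[addr]
        self.dirty.clear()

    def invalidate(self):
        """Drop clean cached copies so the next read refetches from memory."""
        self.cache = {a: v for a, v in self.cache.items() if a in self.dirty}


def stale_read_demo():
    memory = {"A": 1}
    core0, core1 = CachedCore(memory), CachedCore(memory)
    first = core0.read("A")    # core 0 caches the original value
    core1.write("A", 2)        # core 1 updates only its own cache
    stale = core0.read("A")    # core 0 still sees its old copy
    own = core1.read("A")      # core 1 sees its own write
    core1.flush()              # writer makes the change visible in memory
    core0.invalidate()         # reader discards its stale copy
    fresh = core0.read("A")    # now both agree
    return first, stale, own, fresh


if __name__ == "__main__":
    print(stale_read_demo())
```

The middle of the demo is the "incoherent view": two cores disagree about A until the writer flushes and the reader invalidates, which is the role explicit flush/invalidate calls play on non-coherent memory.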
As far as I know, the CPU is usually faster than an I/O device (like the HDD, the network, RAM, etc.), so when copying a file the bottleneck is usually I/O-bound (right?).
If under some condition that I/O device is faster than the CPU (like in a virtual machine) is it possible to keep the CPU busy moving data (like from buffer to kernel space, from kernel space to user space)? And does it then become CPU-bound?
It depends on the program and the conditions where the program is run.
It would be highly unlikely that the speed of a program copying data would be throttled by the CPU speed. However it could be the case if for example the computer runs other programs that use CPU intensively and with higher priority than the program executing the copy.
The most common bottleneck would be the speed of the persistent storage medium (e.g. a hard drive).
Then, the amount of RAM available.
Then, the CPU being unavailable.
Only if an I/O device were so fast that it outperformed the CPU could the copy become CPU-bound. This is largely hypothetical, though, since the CPU does not usually perform the copy itself but commands other hardware to do so.
And in real systems, the bandwidth available to I/O devices is far lower than CPU and RAM bandwidth.
If copy is done efficiently, copying RAM data to HDD should not stress the CPU.
Data from RAM and Northbridge can be copied to the HDD via the Southbridge.
See also here.
If copy is done inefficiently, of course a program could read every single byte with the CPU and copy it.
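To make the efficient/inefficient contrast a bit more concrete, here is a small sketch of a user-space copy loop that also counts how many read calls it makes. The function name and the sizes are arbitrary choices for the example, and it uses in-memory streams so it runs anywhere; with real files each read would typically be a system call.

```python
import io


def copy_stream(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunk_size pieces; return (bytes_copied, read_calls)."""
    total = 0
    read_calls = 0
    while True:
        chunk = src.read(chunk_size)
        read_calls += 1
        if not chunk:          # empty result means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total, read_calls


if __name__ == "__main__":
    payload = bytes(10_000)
    for size in (4096, 1):
        copied, calls = copy_stream(io.BytesIO(payload), io.BytesIO(), size)
        print(f"chunk_size={size}: copied {copied} bytes in {calls} read calls")
```

With one-byte chunks the per-call overhead dominates and the CPU does far more work for the same data. Even the chunked version still moves every byte through user space; calls such as os.sendfile (where available) let the kernel do the transfer instead.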
Furthermore, as one can infer, the answer also depends on the hardware and architecture of the system.
Wrong answer, I am afraid. At least not always correct.
If I copy a folder with some 50,000 files (of different sizes) in Windows Explorer, then Task Manager reports that the copy is mostly CPU-bound (i.e. it reports low disk usage and very high CPU usage).
I'm a bit confused about the whole idea of IO; I want to know how the CPU reads from the disk (a SATA disk for example) ?
When a program with read()/write() is compiled with a reference to a specific file and the CPU encounters this reference, does it read from the disk directly (via memory-mapped I/O ports)? Or does it write to RAM and then write back to disk?
I'd suggest reading:
http://www.makelinux.net/books/ulk3/understandlk-CHP-13-SECT-1
With a supplement of:
http://en.wikipedia.org/wiki/Direct_memory_access
With regards to buffering in RAM: most programming languages and operating systems buffer at least part of I/O operations (read and write) to memory. This is usually done asynchronously: i.e. a buffer is created, filled, and then processed. For a read, the CPU would (working with the disk controller) create IO instructions to fetch data and a place to put it in memory, fill that space, and then present its contents to the program making the request. For a write request, this would be queuing write operations and their associated data and then sending them off to the IO controller and eventually the disk to be executed. Buffering can happen in multiple places: on the CPU's caches, in RAM, (sometimes) on the disk controller, or on the hard disk itself. How much buffering is done, and exactly how the abstract sequence of operations I've mentioned is handled, differs depending on your hardware architecture, OS, and task.
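As one small illustration of the language-level layer of buffering mentioned above, this sketch shows Python's own file object holding a small write in a user-space buffer until flush() is called. The function name is invented for the example. Note that flush() only hands the data to the OS; whether it has reached the disk is a separate question.

```python
import os
import tempfile


def buffered_write_demo():
    """Return the file size seen before and after flushing a buffered write."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb", buffering=4096) as f:
            f.write(b"abc")                      # smaller than the buffer, so it stays in it
            size_before = os.path.getsize(path)  # the OS has not seen the data yet
            f.flush()                            # hand the buffer to the kernel
            size_after = os.path.getsize(path)
        return size_before, size_after
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(buffered_write_demo())
```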
Main memory is the only large storage area (millions to billions of bytes) that the processors can access directly.
That is how "Operating System Concepts" puts it.
So if you want to run a program or manipulate some data, they (program and data) must be in Main memory.