Syncing a file system that has no file on it - linux

Say I want to synchronize the data buffers of a file system to disk (in my case, that of a USB stick partition) on a Linux box.
While searching for a function to do that, I found the following:
DESCRIPTION
sync() causes all buffered modifications to file metadata and data to be written to the underlying file systems.
syncfs(int fd) is like sync(), but synchronizes just the file system containing file referred to by the open file descriptor fd.
But what if the file system has no file on it that I can open and pass to syncfs? Can I "abuse" the dot file? Does it appear on all file systems?
Is there another function that does what I want? Perhaps by providing a device file with major / minor numbers or some such?

Yes, I think you can do that. Every file system has at least one inode, the one for its root directory, so you can open the "." entry and pass that descriptor to syncfs. You can also play around with ls -i to see the inode numbers.
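A minimal sketch of that idea, written in Python and reaching syncfs(2) through ctypes. The helper name sync_filesystem is hypothetical, and the sketch assumes a Linux C library that exports syncfs. It opens the directory itself, so it does not matter whether the file system contains any regular file.

```python
import ctypes
import os


def sync_filesystem(directory):
    """Flush the file system containing `directory` (hypothetical helper, Linux only)."""
    # CDLL(None) looks symbols up in the C library already loaded into the process.
    libc = ctypes.CDLL(None, use_errno=True)
    # A directory can be opened read-only, so no regular file is needed.
    fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        if libc.syncfs(fd) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)
```

Calling sync_filesystem("/mnt/usb") (an example mount point, not one taken from the question) would then flush only that file system rather than every mounted one.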
Could you avoid the problem by mounting the file system with the sync option, or would the performance cost be a problem? Did you have a look at remounting? In particular cases that syncs the file system as well.
I do not know what your application is, but I have had problems synchronizing files to a USB stick with a FAT32 file system, which resulted in weird read and write errors. I cannot imagine any other valid reason why you would sync an empty file system.

From man 8 sync description:
"sync writes any data buffered in memory out to disk. This can include (but is not
limited to) modified superblocks, modified inodes, and delayed reads and writes. This
must be implemented by the kernel; The sync program does nothing but exercise the sync(2)
system call."
So note that it is all about modifications (modified inodes, superblocks, etc.). If nothing has been modified, there is nothing to sync.

Related

If the size of the file exceeds the maximum size of the file system, what happens?

For example, on a FAT32 partition the maximum file size is 4GB, but I was able to create a 5GB file with vim. When I saved the file and opened it again, the console output was broken like a staircase. I have three questions:
1. If the size of the file exceeds the maximum size of the file system, what happens?
2. In my case, why did the output break?
3. In Unix, the stat() system call can succeed up to 2GB (2^31 - 1). Does this have anything to do with the file system? Is there a relationship between the limits of data in stat() and the limits of each feature in the file system?
"If the size of the file exceeds the maximum size of the file system, what happens?"
By definition, that can never happen. What really happens is that some system call (probably write(2)) fails, and the code making that call should handle the failure.
Notice that FAT32 filesystems restrict the maximal size of a file to 4 gigabytes (more precisely, 2^32 - 1 bytes). Use a better file system on your USB key if you want more (or use split(1) to cut large files into smaller chunks before copying them to your FAT32-formatted USB key).
If you are using <stdio.h>, notice that fflush(3), fprintf(3), fclose(3) (and most other standard functions) can fail (e.g. because the underlying write(2) fails).
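The same discipline can be sketched in Python, where a failing write(2) surfaces as an OSError. The helpers write_all and append_chunk are hypothetical names. The sketch assumes that hitting a file size limit is reported as EFBIG, which is what write(2) documents for exceeding the maximum file size.

```python
import errno
import os


def write_all(fd, data):
    """Write every byte of `data` to fd, or let OSError propagate."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # may write fewer bytes than requested
        view = view[written:]


def append_chunk(path, data):
    """Append data to path; return False if the file size limit was hit."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        write_all(fd, data)
        return True
    except OSError as exc:
        if exc.errno == errno.EFBIG:  # file would exceed the maximum size
            return False
        raise  # any other error (ENOSPC, EIO, ...) is still reported
    finally:
        os.close(fd)
```

The point of the design is that no write is assumed to succeed: a short write is retried, and a hard failure is either handled explicitly or passed up to the caller.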
"the console output was broken like a staircase"
Probably because your pseudoterminal was left in a broken state. See stty(1), reset(1), termios(3), and read "The TTY demystified".
"In Unix, the stat() system call can succeed up to 2GB (2^31 - 1)"
You are misunderstanding stat(2). Read its documentation again.
Read Advanced Linux Programming, then syscalls(2).
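For context, 2^31 - 1 is the largest value of a signed 32-bit off_t. stat(2) documents EOVERFLOW for a program built with such an off_t that stats a larger file, so the limit belongs to how the program was compiled, not to the file system. A small Python sketch (make_sparse_file is a hypothetical helper) shows stat reporting whatever logical size the file system accepted:

```python
import os


def make_sparse_file(path, size):
    """Create `path` with a logical length of `size` bytes and return st_size."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # ftruncate extends the file without writing data blocks
        # (a sparse file on file systems that support holes).
        os.ftruncate(fd, size)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

On a file system that allows large files, make_sparse_file(path, 5 * 2**30) should report 5368709120. On FAT32 the ftruncate call is expected to raise OSError instead, because the file system itself refuses the size.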
"I was able to create a 5GB file with vim"
To understand the behavior of vim, first read its documentation, then study its source code (it is free software, so you can, and perhaps should, study its code).
You could also use strace(1) to understand what system calls are done by some command or process.

Level 2 I/O in Linux using readdir() possible?

I am trying to traverse a directory structure and open every file in that structure. To traverse, I am using opendir() and readdir(). Since I already have the entity, it seems stupid to build a path and open the file -- that presumably forces Linux to find the directory and file I just traversed.
Level 2 I/O (open, creat, read, write) requires a path. Is there any call to either open a filename inside a directory, or open a file given an inode?
You probably should use nftw(3) to recursively traverse a file tree.
Otherwise, in a portable way, construct your directory + filename path using e.g.
snprintf(pathbuf, sizeof(pathbuf), "%s/%s", dirname, filename);
(or perhaps using asprintf(3) but don't forget to later free the result)
And to answer your question about opening a file in a directory: you could use openat(2), which is specified by POSIX.1-2008 and available on Linux. But I believe that you should really use nftw or construct your path as suggested above. Read also about O_PATH and O_TMPFILE in open(2).
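As a rough illustration of the openat(2) approach, here is a Python sketch in which the dir_fd argument of os.stat and os.open plays the role of the directory descriptor. The function name read_files_in is hypothetical, and the sketch handles a single directory level only.

```python
import os
import stat


def read_files_in(dirpath):
    """Return {name: contents} for the regular files directly inside dirpath."""
    contents = {}
    dir_fd = os.open(dirpath, os.O_RDONLY | os.O_DIRECTORY)
    try:
        for name in os.listdir(dir_fd):
            # Both calls resolve `name` relative to dir_fd,
            # so no path string is built.
            info = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
            if not stat.S_ISREG(info.st_mode):
                continue  # skip subdirectories, symlinks, FIFOs, ...
            fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
            with os.fdopen(fd, "rb") as handle:
                contents[name] = handle.read()
    finally:
        os.close(dir_fd)
    return contents
```

For a recursive walk, os.fwalk yields a directory descriptor for each directory it visits, which can be passed as dir_fd in the same way.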
BTW, the kernel has to access the directory several times anyway (although the metadata is cached by the kernel's file system code), because another process could have written to it while you are traversing it.
Don't even think of opening a file through its inode number: this would violate several file system abstractions! (It might barely be possible with insane and disgusting tricks, e.g. debugfs, but this could very seriously harm your filesystem!)
Remember that files are generally inodes, and can have zero names (a process opened a file and then unlink(2)-ed it while keeping the open file descriptor), one name (the usual case), or several names (e.g. /foo/bar1 and /gee/bar2 could be hard-linked using link(2)).
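The "several names" case can be sketched in Python with os.link, which wraps link(2). The helper hard_link_demo is hypothetical, and the file names bar1 and bar2 are only borrowed from the example above. It assumes a file system that supports hard links.

```python
import os


def hard_link_demo(directory):
    """Give one file two names; return (same_inode, links_before, links_after)."""
    first = os.path.join(directory, "bar1")
    second = os.path.join(directory, "bar2")
    with open(first, "w") as handle:
        handle.write("one inode, two names\n")
    os.link(first, second)  # link(2): add a second name for the same inode
    same_inode = os.stat(first).st_ino == os.stat(second).st_ino
    links_before = os.stat(second).st_nlink
    os.unlink(first)  # removes one name only; the data stays reachable
    links_after = os.stat(second).st_nlink
    os.unlink(second)  # last name gone, so the file itself is released
    return same_inode, links_before, links_after
```

On a file system with real inodes this is expected to return (True, 2, 1): both names share one inode, and removing one name only decrements the link count.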
Some file systems (e.g. FAT ...) don't have real inodes. The kernel fakes something in that case.

I/O Performance in Linux

Suppose file A is in a directory that has 10000 files, and file B is in a directory that has 10 files. Would reading/writing file A be slower than reading/writing file B?
Would the answer differ between journaling file systems?
No.
Browsing the directory and opening a file will be slower (whether or not that's noticeable in practice depends on the filesystem). Input/output on the file is exactly the same.
EDIT:
To clarify, the "file" in the directory is not really the file, but a link ("hard link", as opposed to symbolic link), which is merely a kind of name with some metadata, but otherwise unrelated to what you'd consider "the file". That's also the historical reason why deleting a file is done via the unlink syscall, not via a hypothetical deletefile call. unlink removes the link, and if that was the last link (but only then!), the file.
It is perfectly legal for one file to have a hundred links in different directories, and it is perfectly legal to open a file and then move it to a different place or even unlink it (while it remains open!). It does not affect your ability to read/write on the file descriptor in any way, even when a file (to your knowledge) does not even exist any more.
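A short Python sketch of the "unlink it while it remains open" case. The helper read_after_unlink and the file name scratch.txt are hypothetical.

```python
import os


def read_after_unlink(directory):
    """Unlink a file while it is open; return (name_gone, data_read_back)."""
    path = os.path.join(directory, "scratch.txt")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.unlink(path)  # the name disappears immediately
        name_gone = not os.path.exists(path)
        os.write(fd, b"still here")  # the descriptor keeps working
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)
    finally:
        os.close(fd)  # only now can the storage be freed
    return name_gone, data
```

Here the write and the read both happen after the name has been removed, which is the behavior described above: the link is gone, but the file lives on until the last descriptor is closed.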
In general, once a file has been opened and you have a handle to it, the performance of accessing that file will be the same no matter how many other files are in the same directory. You may be able to detect a small difference in the time it takes to open the file, as the OS will have to search for the file name in the directory.
Journaling aims to reduce the recovery time after a file system crash. IMHO, it will not affect the read/write speed of files (see "Journaling ext2").

nodejs open nfs files by inode (or the fastest way to reopen a file)

I am currently writing a caching system that will hold serialized (json) data on disk and in memory in order to reduce I/O load on a database.
The system will work by holding the last X number of accessed files in memory and read other files from disk.
I have read that there are systems out there that reduce I/O load on NFS systems (which I may use in the future) by opening files by inode.
My questions are:
1. Is there a way to open files on an NFS file system by inode in nodejs? If not, what homework would I need to do to make it happen?
2. Is it absolutely impossible to open a file on a local file system by inode?
3. If it is in fact impossible, is there a faster way to reopen a file? It seems unnecessarily repetitive to have the OS stat the file over and over.
1. No, there is no user-accessible way to open files by inode, because doing so would, in some cases, allow users to bypass filesystem ACLs.
2. Yes. Same reason.
3. Most competent NFS clients, including the Linux kernel, will cache stat results locally.

Is there a file mutex in Linux? How to implement it?

In Windows, if I open a file with MS Word and then try to delete it, the system will stop me. It prevents the file from being deleted.
Is there a similar mechanism in Linux?
How can I implement it when writing my own program?
There is no similar mechanism in Linux. In fact, I find that feature of Windows to be an incredible misfeature and a big problem.
It is not typical for a program to hold a file open that it is working on anyway unless the program is a database and updating the file as it works. Programs usually just open the file, write contents and close it when you save your document.
vim's .swp file is updated as vim works, and vim holds it open the whole time, so even if you delete it, the file doesn't really go away. vim will just lose its recovery ability if you delete the .swp file while it's running.
In Linux, if you delete a file while a process has it open, the system keeps it in existence until all references to it are gone. The name in the filesystem that refers to the file will be gone. But the file itself is still there on disk.
If the system crashes while the file is still open, it will be cleaned up and removed from the disk when the system comes back up.
The reason this is such a problem in Windows is that mandatory locking frequently prevents operations that should succeed from succeeding. For example, a backup process should be able to read a file that is being written to. It shouldn't have to stop the process that is doing the writing before the backup proceeds. In many other cases, operations that should be able to move forward are blocked for silly reasons.
The semantics of most Unix filesystems (such as Linux's ext2 family of file systems) are that a file can be unlink(2)'d at any time, even if it is open. However, after such a call, any process that already has the file open can continue to read and write it through its open file descriptor. The filesystem does not actually free the storage until all open file descriptors have been closed. These are very long-standing semantics.
You may wish to read more about file locking in Unix and Linux (e.g., the Wikipedia article on file locking). Basically, mandatory and advisory locks exist on Linux, but they are not guaranteed to prevent what you want to prevent.
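To make the advisory part concrete, here is a hedged Python sketch using flock(2) through fcntl.flock. The helper try_exclusive_lock is a hypothetical name. A second attempt to lock the same file is refused, but nothing stops another program from ignoring the lock or unlinking the file.

```python
import fcntl
import os


def try_exclusive_lock(path):
    """Return a locked fd, or None if the file is already locked elsewhere."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail at once instead of waiting for the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # the lock is held until this descriptor is closed
```

Cooperating programs that all call try_exclusive_lock on the same path get mutual exclusion among themselves. A program that skips the call, or simply runs rm on the file, is not affected, which is why this does not reproduce the Windows behavior.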
