I'm writing a shell script in bash in which I create some links between files, but I'm not sure which kind of link to use (physical or symbolic). Doing some research, I've noticed that it's more common to use symbolic links than physical ones. My question is: why use symbolic links, which require an indirection (an extra inode that stores the path of the target file), instead of hard links that point directly to the file's inode?
In other words:
Why
ln -s ...
instead of
ln -P ...
The main reason for symlinks is that a 'soft' link can cross filesystem boundaries. The file representing the symlink contains a string that is the actual path of the file being pointed at. As long as that path keeps resolving to the same place, the symlink will work. If you move the file at the end of the symlink, the symlink becomes stale (aka "dangling"), because the path it pointed at no longer exists.
A hard (aka physical) link works at the inode layer. Since inode numbers are only unique within a single filesystem, you cannot hardlink ACROSS filesystems; you could quite easily run into a duplicate inode situation if that were allowed. The benefit is that no matter where you move the target of a hardlink, the links pointing at the resource will "follow", because they point at the inode itself and don't care what the actual path/resource name is.
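A quick way to see both behaviors from a shell (the file names here are just examples):
echo "original data" > target.txt
ln target.txt hard.txt        # hard link: another name for the same inode
ln -s target.txt soft.txt     # symlink: a small file holding the path "target.txt"
mv target.txt moved.txt       # move the original name away
cat hard.txt                  # still prints "original data" -- the inode is untouched
cat soft.txt                  # fails: the symlink now dangles, "target.txt" is gone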
Off the top of my head:
Symbolic links work across filesystems. If you don't want to keep track of which filesystem the source file and the link live on, or if you sometimes move files across filesystems, symbolic links are less of a hassle.
$#&#$& emacs backup files. When you edit, say, file.txt and save a change in emacs, emacs renames the original file to file.txt~ and writes your changes under the original file name. If there was a hard link to file.txt, it now points to the inode of file.txt~, which is probably not what you want. A symbolic link to file.txt will still resolve to the updated file, which now lives in a new inode.
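To make that concrete, here is the rename-and-replace pattern in plain shell (no emacs required; the link names are made up):
echo "version 1" > file.txt
ln file.txt hard.txt          # hard link to the original inode
ln -s file.txt soft.txt       # symlink to the name "file.txt"
mv file.txt file.txt~         # what the editor does: rename the original...
echo "version 2" > file.txt   # ...then write the new content under a fresh inode
cat hard.txt                  # prints "version 1" -- stuck on the old inode
cat soft.txt                  # prints "version 2" -- follows the name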
A hardlink only works within the same filesystem; it is simply another name for the same inode. A file is only deleted when the last link to its inode is gone. Hardlinks are usually only allowed for files, not directories.
A symlink is an actual file containing a path to another file. Symlinks work across filesystems, and they can point to any file type, including directories.
Hard links don't make sense across filesystems, since they're basically pointers to inodes on the local FS. Symlinks don't care; since they're just names, they can live anywhere.
If you look at a directory listing and see a symlink, you know it points to some other location. Hard links, on the other hand, offer no such clue: you might not realize you were playing around in some important file until you stat both of its names.
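For completeness, the clues a hard link does leave are the inode number and the link count (file names here are hypothetical):
ls -li somefile otherfile          # identical inode numbers in the first column = same file
stat -c '%h' somefile              # a link count greater than 1 means other names exist
find . -xdev -samefile somefile    # list all names of that inode below the current directory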
I have a large directory of gzip-compressed files.
I want to use a tool to index these files. The tool works by walking over a folder and reading all text files. Unfortunately, the tool doesn't support reading gzip-compressed text files, and it's not practical for me to temporarily decompress all the files so the tool can access them. It would consume a massive amount of disk space, and even though disk space is cheap, it's still impractical in my use-case.
I also don't have access to the tool to modify it to add support for gzip-compression.
So I was thinking of a way to insert a middle-man, between the tool and my files, that would transparently perform the decompression on the fly.
To that end, is there any way for me, under Linux, to create a sort of symbolic filesystem that mirrors my folder contents and creates a "fake" file for each original file, so that, when read, it silently calls a script that accesses the original file, pipes it through gunzip, and returns the output? The effect would be that, from the tool's perspective, it is reading uncompressed files without my having to decompress them all at once.
Are there any other solutions that I'm overlooking?
There are a few approaches that occur to me, each with varying amounts of difficulty. The options are ordered by how easy they would be IMHO.
Option 1 -- A compressed-at-rest filesystem
Several modern file systems support compression at rest -- i.e., the data is stored compressed, and decompressed for you on demand. You could set up a partition of your disk with one of these filesystems (I would recommend zfs), and then copy all of your data into the partition.
Once you've done that, you'd have the disk usage of compressed data, but would be able to interact with the filesystem as if it were uncompressed.
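As a rough sketch of what that setup could look like with ZFS (the pool name, device, source directory and mount point below are assumptions, not from the question; run as root):
zpool create indexpool /dev/sdb                      # assumes a spare block device /dev/sdb
zfs create -o compression=lz4 -o mountpoint=/data indexpool/data
# decompress each .gz once into the new dataset; ZFS stores the result compressed at rest
for f in /srcdir/*.gz; do gunzip -c "$f" > "/data/$(basename "${f%.gz}")"; done
zfs get compressratio indexpool/data                 # shows how much space the compression saves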
Option 2 -- FUSE Wrapper
If you're willing to do some coding for this, FUSE would be an attractive option. FUSE is a library that effectively lets you describe a file system in user space and implement reading/writing as callbacks into your own code.
If you weren't worried about performance, it would be relatively straightforward to write a Python script that mirrors a directory tree and wraps every read call with gunzip.
Between option 1 and 2, I would lean towards 1. It will be more performant than any script you could hack out yourself, and it gives you the added convenience of being able to use the data directly.
I've been studying the Linux operating system for a while now. I understand what file systems are, but I'm curious as to how they're made. Is it possible for a programmer to create their own custom file system in Linux? Is it possible to combine multiple file systems together, and how much control do we have over a file system? Thanks.
Also, does anyone know of any online sources or books that talk about Linux file systems?
Certainly a programmer can create a file system; everyone can, you just have to run the appropriate command (e.g. mkfs). Beyond that, a programmer can in theory implement the logic for what you probably mean by a "custom made filesystem", just as a programmer can change, remove or add anything they want in any part of the system they use. It is questionable, though, how many programmers are actually able to create a working and usable file system from scratch, since that is quite a complex thing to do.
Combining multiple filesystems is certainly possible, but maybe you should define in more detail what you actually mean by that. You can certainly use multiple filesystems inside a single system by simply mounting them, and you can mount one filesystem inside another. You can even use a loopback device to store a whole filesystem inside a file contained in another filesystem, as sketched below. What you cannot do is take two separate file systems, hold a magic wand over them and declare them one from now on. Well, actually you can do that, but it won't work as expected ;-)
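A minimal loopback sketch, assuming ext4 and a mount point /mnt/inner that you create yourself:
dd if=/dev/zero of=fs.img bs=1M count=64   # a 64 MB file to hold the filesystem
mkfs.ext4 fs.img                           # put an ext4 filesystem inside that file (answer yes if it warns that fs.img is not a block device)
sudo mkdir -p /mnt/inner
sudo mount -o loop fs.img /mnt/inner       # one filesystem now lives inside another
# sudo umount /mnt/inner when you're done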
About the last question, how much control we have... well, that is difficult to answer without a metric to measure it by. We can certainly configure a filesystem, and we can use it and its content. We can even create it, mount it, check it, examine it, monitor it, repair it, or destroy and damage it... I personally would call that a fair amount of "control" over filesystems.
Just a simple question, borne out of learning about file systems:
Is it possible for a single file to simultaneously exist in two or more directories?
I'd like to know if this is possible in Linux as well as Windows.
Yes, you can do this with either hard links or soft links (and perhaps with shortcuts on Windows; I'm not sure about that). Note that this is different from making a copy of the file! With either kind of link the file is only stored once, unlike when you make a copy.
In the case of hard links, the same file (on disk) is referenced in two different places. You cannot distinguish between the 'original' and the 'new one'. If you delete one of them, the other is unaffected; a file is only actually deleted when the last "reference" is removed. An important detail is that, because of the way hard links work, you cannot create them for directories.
Soft links, also referred to as symbolic links, are a bit like shortcuts in Windows, but at a lower level. If you open one for reading or writing, the operation is transparently redirected to the target file, yet it is still possible to distinguish the link from the file itself (for example with ls -l or lstat).
In Windows, the use of soft links is fairly uncommon, but there is support for it: on NTFS the built-in mklink command can create both symbolic links and hard links, much like ln on Unix.
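A quick Linux sketch of the same file appearing in two directories (directory and file names are made up):
mkdir -p dirA dirB
echo "hello" > dirA/report.txt
ln dirA/report.txt dirB/report.txt        # hard link: two names, one inode, one copy on disk
ln -s ../dirA/report.txt dirB/report.lnk  # soft link: a tiny file storing a relative path
ls -li dirA dirB                          # the hard-linked names share the same inode number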
My understanding is that 'files' are effectively just pointers to the memory location corresponding to the file's content. If you 'rm' a file, you certainly must be deleting that pointer. If rm actually 'erases' the data, I would guess each bit gets written over (set to 0 or something). But I know that there are special procedures/programs (e.g. srm) to make sure that data isn't 'recoverable' --- which suggests that nothing is actually being overwritten...
So, is deleting the pointer to a memory address the only thing rm does? Is the data still sitting there in a contiguous block like it was before?
My understanding is that 'files' are effectively just pointers to the memory location corresponding to the file's content.
Be careful with your terminology. The files (and pointers) are on disk, not in memory (RAM).
If you 'rm' a file, you certainly must be deleting that pointer.
Yes. What happens exactly is heavily filesystem-dependent. Some keep a bitmap of which blocks are free or busy, so rm just has to flip the bit for each block that is freed. Other filesystems use more sophisticated methods of tracking free space.
which suggests that nothing is actually being overwritten...
Correct. You can find various "undelete" utilities, though depending on the filesystem, recovery can get rather complex. Stuff you saved years ago could still be sitting around -- or it could have been overwritten; it all depends on minute details. For example, see e2fsprogs.
So, is deleting the pointer to a memory address the only thing rm does?
Well, it also has to remove the "directory entry" that gives metadata about the file. (Sometimes it just wipes out the first byte of the filename).
Is the data still sitting there in a contiguous block like it was before?
Yes, the data is still there, but don't assume it is in a contiguous block. Files can be fragmented all over the disk, with lots of pointers that tell the filesystem how to reassemble them. And if you are using RAID, things get really complex.
Yes. rm simply deletes the pointer. If you have multiple pointers to the file (hard links), then deleting one of those pointers with rm leaves the others completely untouched and the data still available.
Deleting all of those links still does not touch the data; however, the OS is now free to reuse the blocks that were previously reserved for storing it.
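You can watch the link count at work in a shell (hypothetical file names):
echo "precious data" > data.txt
ln data.txt backup-name.txt           # second name for the same inode
stat -c 'links: %h' data.txt          # prints "links: 2"
rm data.txt                           # removes one name only
cat backup-name.txt                   # the data is still fully accessible
stat -c 'links: %h' backup-name.txt   # prints "links: 1"; rm'ing this last name frees the blocks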
It's worth noting that any process which opens a file holds a file handle for it, and this adds to the overall count of references to the file. If you delete all of the names from your filesystem but a running process still has an open file handle for the file, the reference count is not zero and the file is not really deleted. Only when that final handle is closed will the filesystem register the disk space as released, and only at that point is the OS free to overwrite the blocks previously reserved for that data.
You may or may not be able to recover that data at any point in the future depending on whether any reuse of the blocks in question has occurred.
Incidentally, you have no guarantee that your data is sitting there in a contiguous block in the first place.
I did this on the command line (Ubuntu 12.04):
mv some_arbit_file required_file
Is there any way I can recover required_file? I had put so much work into it. I usually back up my files, but I forgot this time. I will appreciate any help.
Thanks.
You can't "undo" your command. Maybe you can found some recovery tools that will help you...
By the way, next time, to avoid this kind of trouble, I suggest you use the -i option. It will warn you if you're about to overwrite a file.
You can create aliases in ~/.bashrc:
alias mv="mv -i"
# you can also add aliases for cp and rm
alias cp="cp -i"
alias rm="rm -i"
To prevent this kind of problem in the future you need a backup system, ideally one that does some form of version control and doesn't require any effort on your part to maintain. You don't want to be thinking about which files need backing up; the answer should be "all of them, yes, even the contents of /tmp, every hour at least".
Using git/svn or another file versioning system is not a bad idea, but what happens if you forget to add new files to the repo? What happens if the repo itself gets deleted? What happens when you try to shovel your entire music collection into it, or want to store special unix files like pipes and symlinks? I also find git repos insanely easy to corrupt, especially if I manually delete things or move them about.
A more robust system is to first buy a new hard disk of a suitably large size - one or two terabytes should be enough. Designate this your backup drive. Then get something like Rsnapshot and make it work. You will then have versioned backups that can be recovered with standard file tools, even from a half-broken machine booted from CD with no network after a catastrophic failure of your boot drive (yes, I am speaking from experience here).
Between Rsnapshot and Apple's Time Machine (which works in a very similar way), I have recovered from several SSD failures, a hard disk crash, and a power cut that silently corrupted half my music collection, which I didn't discover until a week later.
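A minimal rsnapshot setup looks roughly like this (the paths, retention counts and schedule are assumptions; fields in rsnapshot.conf must be separated by tabs, and older versions spell retain as interval):
# /etc/rsnapshot.conf (excerpt)
snapshot_root	/mnt/backupdrive/snapshots/
retain	hourly	6
retain	daily	7
backup	/home/	localhost/
# crontab entries to drive it
0 */4 * * *	/usr/bin/rsnapshot hourly
30 3 * * *	/usr/bin/rsnapshot daily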
You could try photorec from the testdisk package. It works when you delete a file, but I don't actually know whether overwriting one counts as a deletion.
For the future, I would suggest using a service like Dropbox for all your important files. It lets you recover previous versions of every file.
Have a look at this ext3 recovery page.
Because recovery is hard, I try to be proactive and use git in every directory. This way, when I make a mistake, I can git checkout the old file to get it back.
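For example, a minimal version of that workflow (the file name matches the question; everything else is illustrative):
git init                                   # once, in the directory you want protected
git add required_file
git commit -m "snapshot before risky work"
# ...later, after an accidental mv clobbers the file:
git checkout -- required_file              # restore the last committed version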