ext4 enable hashes for directory entries [closed] - linux

According to kernel.org, there is the possibility to store dentries in trees instead of lists, but you need to enable a flag (EXT4_INDEX_FL) in the inode structure. Is this enabled by default, or do I have to format my partition with some flags?
I need to store lots of small files (same old problem) of about 130 KB each, and I understood that this will help speed up lookups, and also that it is recommended to store those files in a two-level directory hierarchy. Is there something else I need to consider so that this doesn't blow up if I want to store something close to 60,000,000 files of this kind? (Maybe other values for block size or the number of blocks in a group.)

This option is referred to by the e2fsprogs suite as dir_index. It's enabled by default, and you can verify that it's enabled on a file system by running tune2fs -l DEVICE as root.
It is indeed recommended that you shard your files manually so that you don't have a huge number of files in the same directory. While using B-trees makes the operation O(log n) instead of O(n), for large numbers of files, the operation can still be expensive.
If you know you're going to be creating a large number of files, you can set the inode ratio to 4096 with the -i option; this will create a larger number of inodes so that you can hold more files. You can also see common settings for a large number of situations in /etc/mke2fs.conf.
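For example, a rough sketch (run as root; /dev/sdXN is just a placeholder for your own partition, and mkfs.ext4 of course wipes the existing file system):
# check whether the dir_index feature is already on
tune2fs -l /dev/sdXN | grep 'Filesystem features'
# turn it on for an existing file system and re-index the directories
# (the fsck must run on an unmounted file system)
tune2fs -O dir_index /dev/sdXN
e2fsck -fD /dev/sdXN
# create a new file system with one inode per 4096 bytes,
# leaving room for a very large number of small files
mkfs.ext4 -i 4096 /dev/sdXN
Instead of picking -i by hand, you can also select one of the usage types defined in /etc/mke2fs.conf with mkfs.ext4 -T TYPE.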

Related

Space needed for reserved blocks on a partition [closed]

By default, the ext2/3/4 filesystems reserve 5% of their capacity so they can keep running when disk space is getting low.
I also believe it has something to do with helping to avoid fragmentation between files, or something like that (I haven't been able to find concrete information about this, and I'm a bit of a newbie in this domain).
My question is: when do we need to keep this 5%, when can we reduce it to something like 1-2%, and when can we remove it entirely?
The elements that I'm considering at the moment are the following:
The 5% rule was decided something like 20 years ago, when the reserved size wasn't much more than ~100 MB, which is totally different now; if we're only talking about space needed to execute commands and such, do we really need 20 GB?
Could it ever be a good idea to remove this reserved space? If some of it is needed to avoid fragmentation somehow, I believe we should at least keep 1-2% available.
Is this space really only useful for partitions that are related in some way to root? I mean, if we have a partition for some folder in /home (something personal, data from a database, or something else not related in any way to the OS), this space may not be needed.
I've seen more and more articles on the web explaining how to reduce the reserved blocks, so I believe it may not be a bad idea 100% of the time, but I've not found articles that explain in depth when it can and cannot be done, and what it exactly does and implies.
So if some of you could provide comprehensive information (as well as a simple answer to the questions I raised above), I would be very thankful.
Those 5% are really kept so the root user can log in and perform some operations when the filesystem is full. And yes, you can decrease the amount (I have done this in the past) to 1-2%, depending on the disk size. Be aware that for many filesystems this has to be defined when you create them and is hard (if possible at all) to change later; on ext2/3/4, however, you can adjust it afterwards with tune2fs -m.
As for zeroing it: yes, that's also possible. But it would be wise to keep some space reserved for root on /, /root (or wherever the root user's home is), /tmp and possibly /var/tmp.
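On ext2/3/4 this boils down to a couple of commands; a quick sketch (run as root, /dev/sdXN is a placeholder):
# show how many blocks are currently reserved
tune2fs -l /dev/sdXN | grep -i 'reserved block count'
# shrink the reservation to 1% on an existing file system
tune2fs -m 1 /dev/sdXN
# or set it at creation time
mkfs.ext4 -m 1 /dev/sdXN
# drop it entirely on a pure data partition (e.g. /home or a database volume)
tune2fs -m 0 /dev/sdXN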

See available space on all storage devices, mounted or unmounted, through a Linux command? [closed]

I've seen that df -H --total gives me the total space available but only of mounted devices, and lsblk gives me the sizes of all storage devices, yet not how much space is available within them.
Is there a way I could see the sum total, available storage space of all devices, e.g. hard disks, thumb drives, etc., in one number?
The operation of mounting a medium makes the operating system analyze the file system.
Before a medium is mounted, it exists only as a block device, and about the only thing the OS knows about it is its capacity.
Other than that, it is just a stream of bytes that is not interpreted in any way. That stream of bytes very probably contains the information about used and unused blocks, but, depending on the file system type, in very different places, so the OS cannot know it without mounting and analyzing the file system.
You could write a specific application that extracts that information, but I would consider that to be temporarily mounting the file system. Standard Unix/Linux doesn't come with such an application.
From the df man page, I'd say "No", but the wording indicates that it may be possible on some systems/distributions with some version(s) of df.
The other problem is how things can be accessed. For example, the system I'm using right now has three 160 GB disks in it... but df will show one of them at / and the other two as a software-based RAID-1 setup on /home.
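If temporarily mounting is acceptable, here is a rough sketch of that idea (assumes root, GNU coreutils, and that the unmounted partitions actually hold mountable file systems; it prints one line per device rather than a single total):
# walk all unmounted partitions, mount each read-only in a scratch
# directory, ask df how much space is available, then unmount again
for dev in $(lsblk -rno NAME,TYPE,MOUNTPOINT | awk '$2 == "part" && $3 == "" { print "/dev/" $1 }'); do
    mnt=$(mktemp -d)
    if mount -o ro "$dev" "$mnt" 2>/dev/null; then
        df -H --output=source,avail "$mnt" | tail -n 1
        umount "$mnt"
    fi
    rmdir "$mnt"
done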

Destroying data using shred against an ext4 filesystem [closed]

I'm running shred against block devices with a couple of ext4 filesystems on them.
The block devices are virtual drives (RAID-1 and RAID-5); the controller is a PERC H710P.
The command:
shred -v /dev/sda; shred -v /dev/sdc ...
From the shred man/info page I understand that shred may not be effective on journaling filesystems, but only when shredding individual files.
Can anyone please explain whether shredding the block device is a safe way to destroy all of the data on it?
This is a complex issue.
The only way that is 100% effective is physical destruction. The problem is that the drive firmware can mark sectors as bad and remap them to a pool of spares. These sectors are effectively no longer accessible to you but the old data may be recoverable from those sectors by other means (such as an alternate firmware or physically removing the platters).
That being said, running shred on the block device does not have the issues due to journaling.
The problem with journaling is that, for partial overwrites to be recoverable, you cannot actually overwrite the original data, so the overwrite of the file takes place in a second physical location, leaving the first intact. Writing directly to the block device is not subject to journaling.
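For what it's worth, a sketch of the invocation (the device name is a placeholder; one random pass followed by a final pass of zeros is a common compromise between time and thoroughness):
# overwrite the whole block device: one random pass, then a pass of zeros
shred -v -n 1 -z /dev/sdX
# the caveat above still applies: sectors the drive has remapped to its
# spare pool are out of reach for any host-side overwrite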

How to store data permanently in the /tmp directory in Linux [closed]

Is there any way to store data in the Linux /tmp directory permanently? As I know, Linux clears its /tmp directory when the system is rebooted, but I want to store data there permanently.
As @bereal said, this defeats the purpose of the /tmp directory. Let me quote the Linux Filesystem Hierarchy Standard:
The /tmp directory must be made available for programs that require temporary files.
Programs must not assume that any files or directories in /tmp are preserved between invocations of the program.
You'll find a better place to store permanent data.
Since it's Linux, you are free to do what you want (as root). When /tmp is cleared depends on your system and can be changed; there is no particular magic involved. A good summary seems to be here: https://serverfault.com/questions/377348/when-does-tmp-get-cleared.
Of course, if you are root you can set up an entirely different global directory, say "/not-quite-tmp" or such. But I assume that some programs not under your control write to /tmp and you want to inspect, or in any case persist, those files.
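A minimal sketch of that idea, reusing the /not-quite-tmp name from above (the 1777 mode mirrors the world-writable, sticky-bit permissions of /tmp):
mkdir /not-quite-tmp
chmod 1777 /not-quite-tmp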
While you are trying to do the wrong thing, it is still possible.
The /tmp directory is cleared according to the TMPTIME setting. The default is apparently 0, which means "clear on every startup".
The value can be changed in /etc/default/rcS (it is set in days).
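For example, on older Debian/Ubuntu systems that still read /etc/default/rcS (a sketch; systemd-based systems handle /tmp cleanup through other mechanisms):
# /etc/default/rcS
TMPTIME=30      # keep files in /tmp for 30 days across reboots
# TMPTIME=0     # the default: clear /tmp on every boot
# TMPTIME=-1    # a negative value means never clear /tmp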

Does unix 'find' give the same order every time? [closed]

If I run find (Ubuntu, specifically), can I expect it to give me the same order of results every time? (Assuming, of course, that the actual files haven't changed.)
In other words, if I run
$ find foo
and it gives me
bar.txt
foo.txt
can I expect that it will never give me
foo.txt
bar.txt
?
The answer is "probably" but you shouldn't rely on it because any number of things can affect it.
What order do you want the files in? Decide on that and then use a find command (perhaps piped into sort) which reproducibly gets the result you need.
The order of the files is determined by the fine details of the filesystem format and the filesystem driver. You can't rely on it. Depending on the filesystem and operating system, here are things that might change the order:
A file is created or removed in a traversed directory (even if none of the listed files changed).
The files are moved around (e.g. transferred to a different filesystem or restored from backup).
A defragmenter or filesystem check ran and decided to move things around.
If you want a reproducible order, sort the results. find … | sort will do nicely if none of the file names contain newlines.
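For example (assuming GNU find, sort and xargs; the -print0 variant is the newline-safe form):
# stable, sorted output when the file names are well-behaved
find foo | sort
# NUL-delimited variant that survives any file name
find foo -print0 | sort -z | xargs -0 ls -ld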
