tar extracting most recent file - linux

Using bash, I have dir of /home/user/logs/
Aug 2 15:34 backup.20120802.tar.gz
Aug 3 00:26 backup.20120803.tar.gz
Aug 4 00:25 backup.20120804.tar.gz
Aug 15 06:39 backup.20120816.tar.gz
This gets updated every few days, but if something goes wrong I want to automatically restore the most recent backup. How can I use bash to extract only the most recent one?

ls -t1 /home/user/logs/ | head -1
gives you the name of the most recently modified file in /home/user/logs/ (the name only, without the directory).
So you could do:
cd /dir/to/extract
tar -xzf "/home/user/logs/$(ls -t1 /home/user/logs/ | head -1)"
NOTE:
this assumes that /home/user/logs/ is flat and contains nothing but "*.tar.gz" files

If the time stamps may not always be reliable, try sorting by the date embedded in the file name.
ls -1 /home/user/logs/backup.*.tar.gz | sort -t . -k2rn | head -1
Ideally, you should not parse the output from ls, but if there are only regularly named files matching the wildcard, it may be the easiest solution; sort expects line-oriented input, anyway, so the task becomes more involved in the general case of completely arbitrary file names. (This may make no sense to you, but it would be perfectly okay as far as Unix is concerned to have a file named backup.20120816.tar.gz(newline)backup.20380401.tar.gz.)
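For names that follow the backup.YYYYMMDD.tar.gz pattern, the shell's own glob expansion can stand in for ls: matches come back sorted, and fixed-width dates sort chronologically. The following is only a minimal sketch under that naming assumption; the function names latest_backup and restore_latest are made up for illustration.

```shell
# latest_backup DIR: print the backup whose name carries the latest date.
# Assumes names of the form backup.YYYYMMDD.tar.gz (glob results are sorted).
latest_backup() {
    latest=
    for f in "$1"/backup.*.tar.gz; do
        # With no match the pattern stays literal, so check that the file exists.
        [ -e "$f" ] && latest=$f
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# restore_latest DIR DEST: extract the latest backup from DIR into DEST.
restore_latest() {
    archive=$(latest_backup "$1") || return 1
    tar -xzf "$archive" -C "$2"
}
```

It could be called as restore_latest /home/user/logs /dir/to/extract.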

Related

How to chain 'mimetype -b' and 'find' command to get file names and file type in same csv?

I would like to get filenames, creation dates, modification dates and file mime-types from directory structure. I've made a script which reads as follows :
#!/bin/bash
output="file_list.csv"
## columns
echo '"File name";"Creation date";"Modification date";"Mime type"' > $output
## content
find $1 -type f -printf '"%f";"%Tc";"%Cc";"no idea!"\n' >> $output
which gives me encouraging results :
"File name";"Creation date";"Modification date";"Mime type"
"Exercice 4 Cluster.xlsx";"ven. 27 mars 2020 10:35:46 CET";"mar. 17 mars 2020 19:14:18 CET";"no idea!"
"Exercice 5 Bayes.xlsx";"ven. 27 mars 2020 10:36:30 CET";"ven. 20 mars 2020 16:18:54 CET";"no idea!"
"Exercice 3 Régression.xlsx";"ven. 27 mars 2020 10:36:46 CET";"mer. 28 août 2019 17:21:10 CEST";"no idea!"
"Archers et Clustering.xlsx";"ven. 27 mars 2020 10:37:34 CET";"lun. 16 mars 2020 14:12:05 CET";"no idea!"
...
but I'm missing one crucial thing: how do I get the files' MIME types? It would be great if I could chain the 'mimetype -b' command onto each file found by 'find', and write the result in the appropriate column.
Thanks in advance,
Cyril
You might try using the -exec option of the find command, in which the brackets {} represent the name of the current file.
Then, you could remove the \n from the -printf format: the rest of each line now comes from mimetype, and AFAIK its output already ends with a newline, so the \n should not be necessary.
Last, you want to have a closing quote after your MIME type, so instead of the -b option you should use --output-format, which gives you more control over what is displayed.
Hence the third command of your script should look like this:
find "$1" -type f -printf '"%f";"%Tc";"%Cc";"' -exec mimetype --output-format '%m"' {} \; >> "$output"
This is what I came up with:
for entry in *; do stat --printf='"%n";"%z";"%y";"' "$entry"; file -00 --mime-type "$entry" | cut -d $'\0' -f2 | tr -d '\n'; echo '"'; done
Uses a shell "for loop" to perform a stat on the directory entries in the current directory. Then uses file to get the MIME type, and pipes that to cut to get only the MIME type (by excluding the file name, which is also printed by file). tr removes the newline that cut appends, so the closing quote lands on the same line. The variable is quoted so that file names containing spaces work.
The format for stat is what I believe was requested -- the file name, the last change date, the last modification date (both in ISO format, but could easily be made to UNIX seconds-since-epoch by upper-casing Z and Y).
Availability:
file: probably its own package if you are on Linux? But should be preinstalled on macOS I'm guessing.
bash/zsh: easily accessible both on Linux and macOS.
stat and cut: part of coreutils so should be preinstalled on most systems.
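The two answers can be combined so that find does the recursion and a small inline shell script formats each row. This is a rough sketch, assuming GNU stat (for -c) and the file utility; the function name list_with_mime is hypothetical, and the two date columns are the status change time and the modification time, not a true creation date.

```shell
# list_with_mime DIR: print one CSV row per regular file below DIR.
# Columns: file name; last status change; last modification; MIME type.
list_with_mime() {
    find "$1" -type f -exec sh -c '
        for path do
            # stat -c is GNU-specific: %z = status change, %y = modification
            ctime=$(stat -c %z "$path")
            mtime=$(stat -c %y "$path")
            mime=$(file -b --mime-type "$path")
            printf "\"%s\";\"%s\";\"%s\";\"%s\"\n" "${path##*/}" "$ctime" "$mtime" "$mime"
        done
    ' sh {} +
}
```

The header line can still be written first with echo, followed by list_with_mime "$1" >> "$output".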

Finding the oldest folder in a directory in linux even when files inside are modified

I have two folders, A and B, each containing two files.
They were created in the order below:
mkdir A
cd A
touch a_1
touch a_2
cd ..
mkdir B
cd B
touch b_1
touch b_2
cd ..
From the above, I need to find which folder was created first (not modified first).
ls -c <path_to_root_before_A_and_B> | tail -1
This outputs "A" (no issues here).
Now I delete the file a_1 inside directory A
and execute the command again:
ls -c <path_to_root_before_A_and_B> | tail -1
This time it shows "B".
Directory A still contains the file a_2 and was created first, yet the ls command now shows "B". How can I overcome this?
How To Get File Creation Date Time In Bash-Debian
You'll want to read the link above for that. Files and directories store the same kinds of timestamps (access, modification and status change), which means directories do not save their creation date there. Deleting a_1 changed directory A itself, which updated A's change time, so ls -c now sorts A as the most recently changed entry. Methods like the ls -i one mentioned earlier may work sometimes, but when I ran it just now it got really old files mixed up with really new files, so I don't think it works exactly how you think it might.
Instead, try touching a file immediately after creating a directory, and name it something like .DIRBIRTH so that it is hidden. Then, when trying to find the order in which the directories were made, just check which .DIRBIRTH has the oldest modification date.
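The marker-file idea might look roughly like this. It is a sketch only: the helper names mkdir_stamped and oldest_stamped are invented here, and the -ot ("older than") test is a common shell extension rather than strict POSIX.

```shell
# mkdir_stamped DIR: create DIR and drop a hidden marker whose
# modification time records when the directory was made.
mkdir_stamped() {
    mkdir "$1" && touch "$1/.DIRBIRTH"
}

# oldest_stamped ROOT: print the subdirectory of ROOT with the oldest marker.
oldest_stamped() {
    oldest=
    for marker in "$1"/*/.DIRBIRTH; do
        [ -e "$marker" ] || continue
        if [ -z "$oldest" ] || [ "$marker" -ot "$oldest" ]; then
            oldest=$marker
        fi
    done
    [ -n "$oldest" ] && dirname "$oldest"
}
```

Because only the marker's timestamp is compared, adding or deleting other files inside A or B does not change the result.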
Assuming that all the stars align (you're using a version of GNU stat(1) that supports the file birth time formats, you're using a filesystem that records them, and a Linux kernel version new enough to support the statx(2) syscall), this script should print out all immediate subdirectories of the directory passed as its argument, sorted by creation time:
#!/bin/sh
rootdir=$1
find "$rootdir" -maxdepth 1 -type d -exec stat -c "%W %n" {} + | tail -n +2 \
| sort -k1,1n | cut --complement -d' ' -f1

Quickly list random set of files in directory in Linux

Question:
I am looking for a performant, concise way to list N randomly selected files in a Linux directory using only Bash. The files must be randomly selected from different subdirectories.
Why I'm asking:
In Linux, I often want to test a random selection of files in a directory for some property. The directories contain 1000's of files, so I only want to test a small number of them, but I want to take them from different subdirectories in the directory of interest.
The following returns the paths of 50 "randomly"-selected files:
find /dir/of/interest/ -type f | sort -R | head -n 50
The directory contains many files, and resides on a mounted file system with slow read times (accessed through ssh), so the command can take many minutes. I believe the issue is that the first find command finds every file (slow), and only then prints a random selection.
If you are using locate and updatedb updates regularly (daily is probably the default), you could:
$ locate /home/james/test | sort -R | head -5
/home/james/test/10kfiles/out_708.txt
/home/james/test/10kfiles/out_9637.txt
/home/james/test/compr/bar
/home/james/test/10kfiles/out_3788.txt
/home/james/test/test
How often do you need it? Do the work periodically in advance to have it quickly available when you need it.
Create a refreshList script.
#!/usr/bin/env bash
find /dir/of/interest/ -type f | sort -R | head -n 50 >/tmp/rand.list
mv -f /tmp/rand.list ~
Put it in your crontab.
0 7-20 * * 1-5 nice -n 19 ~/refreshList
Then, during working hours (07:00 to 20:00, Monday to Friday), you will always have a ~/rand.list that's under an hour old.
If you don't want to use cron and aren't too picky about how old it is, just write a function that refreshes the file after you use it every time.
randFiles() {
  cat ~/rand.list
  { find /dir/of/interest/ -type f |
      sort -R | head -n 50 >/tmp/rand.list
    mv -f /tmp/rand.list ~
  } &
}
If you can't run locate and the find command is too slow, is there any reason this has to be done in real time?
Would it be possible to use cron to dump the output of the find command into a file and then do the random pick out of there?
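Splitting the work along those lines could be sketched as follows: one slow step that caches the full listing, and one fast step that samples from the cache. The function names are hypothetical, shuf is assumed to be available (it is part of GNU coreutils), and file names containing newlines are not handled.

```shell
# refresh_file_list DIR LIST: (slow) walk DIR once and store every file path in LIST.
# Writes to a temporary file first so LIST is never seen half-written.
refresh_file_list() {
    find "$1" -type f > "$2.tmp" && mv -f "$2.tmp" "$2"
}

# pick_random LIST N: (fast) print N randomly chosen lines from the cached LIST.
pick_random() {
    shuf -n "$2" "$1"
}
```

refresh_file_list would be the part run from cron; pick_random ~/all.list 50 then returns a fresh random selection each time without touching the slow file system.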

Linux Command : Why does the redirection operator - | i.e. piping fail here?

I was working my way through a primer on Shell (Bash) Scripting and had the following doubt :
Why doesn't the following command print the contents of cp's directory: which cp | ls -l
Doesn't piping, by definition, mean that we pass the output of one command to another, i.e. redirect the output?
Can someone help me out? I am a newbie.
The output of which is being piped to the standard input of ls. However, ls doesn't take anything on standard input. You want it (I presume) to be passed as a parameter. There are a couple of ways of doing that:
which cp | xargs ls -l
or
ls -l `which cp`
or
ls -l $(which cp)
In the first example, the xargs command takes the standard output of the previous command and makes each whitespace-separated item a parameter to the command whose name immediately follows xargs. So, for instance
find / | xargs ls -l
will do an ls -l on each file in the filesystem (there are some issues with this with peculiarly named files but that's beyond the scope of this answer).
The remaining two are broadly equivalent and use the shell to do this, expanding the output from which into the command line for ls.
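As for the peculiarly named files: one common way around that issue is to separate names with NUL bytes instead of whitespace. A small sketch, assuming a find with -print0 and an xargs with -0 (both GNU and BSD have them); the function name long_list_files is made up:

```shell
# long_list_files DIR: run ls -ld on every regular file below DIR.
# -print0 / -0 pass names NUL-separated, so spaces in names are safe.
long_list_files() {
    find "$1" -type f -print0 | xargs -0 ls -ld
}
```

Without -print0 and -0, a file called "two words.txt" would reach ls as two separate arguments.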
It would be,
$ ls -l $(which cp)
-rwxr-xr-x 1 root root 130304 Mar 24 2014 /bin/cp
OR
$ which cp | xargs ls -l
-rwxr-xr-x 1 root root 130304 Mar 24 2014 /bin/cp
To pass the output of one command as a parameter to another command through a pipe, you can use xargs after the pipe symbol.
From man xargs
xargs - build and execute command lines from standard input.
xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input. Blank lines on the standard input are ignored.

How to view last created file?

I have uploaded a file to a Linux computer. I do not know its name. How can I list files by their creation date to find it?
ls -lat
will show a list of all files sorted by modification time, newest first. The -t flag does the sorting, -l gives the long listing and -a includes hidden files. If you only need the filename (for a script maybe) then try something like:
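ls -lAt | head -2 | tail -1 | awk '{print $9}'
This lists the files newest first (-A, unlike -a, leaves out the . and .. entries, which would otherwise often come first because the directory itself was just modified), keeps the first 2 rows (the first one will be something like 'total 260'), then keeps the last of those (the one that shows the details of the newest file), and finally prints the 9th column, which contains the filename. Note that this breaks for filenames containing spaces.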
find / -cmin -5
will print the files whose status changed in the last five minutes, which includes newly created files (-ctime would count in days, not minutes). Increase the period one minute at a time to find your file.
Assuming you know the folder you'll be searching in, the easiest solution is:
ls -t | head -1
# use -A in case the file can start with a dot
ls -tA | head -1
ls -t will sort by time, newest first (from ls --help itself)
head -1 will only keep 1 line at the top of anything
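Use ls -lUt or ls -lUtr, as you wish. Note that -U means "use the creation time" in BSD and macOS ls, whereas in GNU ls it means "do not sort". You can take a look at the ls command documentation by typing man ls in a terminal.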
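For use in a script, the newest entry can also be found without parsing ls output at all. The sketch below compares modification times, not true creation times; the function name newest_file is hypothetical, and -nt ("newer than") is a widely supported shell extension rather than strict POSIX.

```shell
# newest_file DIR: print the most recently modified regular file in DIR,
# including hidden files. Names with spaces are handled.
newest_file() {
    newest=
    for f in "$1"/* "$1"/.[!.]*; do
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

For a freshly uploaded file the modification time is normally the upload time, so newest_file /path/to/upload/dir should point at it.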
