How to delete folders except two with linux

How to delete folders except two with linux - linux

I have many directories of backup starting with "backup_".
I want to keep only the two last created folders.
I did this command to show the last two created:
ls -1 -t -d */ | head -2
The problem is i don't know how to exclude the result of that command from remove command (rm -rf | ...).
I know grep -v only works with strings.

In general, xargs is the tool you want to use to pass a generated list of names as arguments to a command. In your case, you just need to invert the head -2 to a command that prints everything except the first 2 lines. eg:
cmd-to-generate-file-list | sed -e 1,2d | xargs rm
The sed will delete the first two lines, and xargs will call rm with each line of output as an argument. Note that it is not generally safe to use ls to generate the file list, but that is a different issue entirely.

A zsh specific approach:
setopt extended_glob # Turn on extended globbing if it's not already enabled
dirs=( backup_*(#q/om) ) # Match only directories, sorted by modification time - newest first
rm -rf "${dirs[#]:2}" # Delete all but the first two elements of that array of directory names
See the documentation for more on zsh glob qualifiers like the above uses. They can make things with filenames that are tedious or difficult to do in other shell dialects trivial.

Related

Obtaining a flat list of all blobs in a .git/objects/ folder

In the .git/objects/ folder there are many folders with files within such as ab/cde.... I understand that these are actually blobs abcde...
Is there a way to obtain a flat file listing of all blobs under .git/objects/ with no / being used a delimitor between ab and cde in the example above? For e.g.
abcde....
ab812....
74axs...
I tried
/.git/objects$ du -a .
This does list recursively all folders and files within the /objects/ folder but the blobs are not listed since the command lists the folder followed by the filename (as the OS recognizes them, as opposed to git). Furthermore, the du command does not provide a flat listing in a single column -- it provides the output in two columns with a numeric entry (disk usage) in the first column.

I think you should start round here (git version 2.37.2):
git rev-list --all --objects --filter=object:type=blob
Doing it this way offers the advantage of not only checking the directory where the unpacked objects are but also the objects that are already packed (which are not in that directory anymore).

If you are in the .git/objects/ folder
Try this.
find . -type f | sed -e 's/.git\/objects\///' | sed -e 's/\///'
sed -e requires the sed script, which means a find/replace pattern.
's/.git\/objects\///' finds .git/objects/ and replace it to '' which is nothing. therefore sed command remove the pattern.
\ in the pattern is an escape character.
After first sed command ends,
the results will be (in linux.)
61/87c3f3d6c61c1a6ea475afb64265b83e73ec26
To remove / which refers a directory sign,
sed -e 's/\///'
If you are in the directory which contains .git
find .git/objects/ -type f | sed -e 's/.git\/objects\///' | sed -e 's/\///'
try this.

How can I delete the oldest n group of files with the same prefix?

In Linux I use InfluxDB which can make a backup of the database for archival purposes. Each backup comprises a series of files with the same prefix "/tank/Backups/var/Influxdb/20191225T235655Z." and different extensions.
I wanted to write a bash script which first deletes the oldest existing backups, then creates a new one (here I paste only the removal):
ls -tp /tank/Backups/var/Influxdb/* | grep -v '/$' | sed -E 's/\..+//' | \
sort -ru | sed 's/$/.*/' | tail -n +4 | xargs -d '\n' -r rm --
However, when I run the script as "sudo", I get
rm: cannot remove '/tank/Backups/var/Influxdb/20191225T235655Z.*': No such file or directory
When I run the quoted script, except the latest part, I get:
/tank/Backups/var/Influxdb/20190930T215357Z.*
/tank/Backups/var/Influxdb/20190930T215352Z.*
which is correct. Also, if I manually write
sudo /tank/Backups/var/Influxdb/20190930T215357Z.*
the command succeeds.
Why is the script reporting an error?
I'm using Ubuntu 18.04 and the folder "/tank" is a ZFS volume.

Better do :
find /tank/Backups/var/Influxdb/* -mtime +5 -delete
to remove files older than 5 days.
Then, you can run the next command

Explaining the Error
This answer is only here to explain the error and give a deeper understanding of what is happening. If you are simply looking for an elegant solution search for other answers.
When I run the quoted script, except the latest part, I get:
/tank/Backups/var/Influxdb/20190930T215357Z.*
/tank/Backups/var/Influxdb/20190930T215352Z.*
which is correct
The listed strings are not what you want. When you pass these paths to rm it sees them just as literal strings, that is, two files whose names end with a literal *. Since you don't have such files you get an error.
When you type rm * manually into your console bash (not rm!) does globbing. bash searches files and replaces the * with the list of found files. Only after that bash executes rm foundFile1 foundFile2 .... rm never sees the *.
Strings inside a pipeline are not processed by bash, but by the commands in the pipeline, in your case rm. rm does not glob.
You could run bash inside your pipeline and let it expand the * you inserted earlier. To this end, replace the last command in your pipeline with xargs -r bash -c 'rm -- $*' --. However, note that your paths are not quoted here. If there are spaces or literal * in your filenames the command will break. This is necessary for globbing as quoted "*" are not expanded by bash.
To quote your files you have to insert the * glob inside the bash command:
ls -tp /tank/Backups/var/Influxdb/* | grep -v '/$' | sed -E 's/\..+//' |
sort -ru | tail -n +4 | xargs -d\\n -L1 -r bash -c 'rm -- "$0."*'
Above command is only a simple fix for your command. It is neither elegant nor very robust. Using tools like find is strongly recommended.

Remove part of filename with common delimiter

I have a number of files with the following naming:
name1.name2.s01.ep01.RANDOMWORD.mp4
name1.name2.s01.ep02.RANDOMWORD.mp4
name1.name2.s01.ep03.RANDOMWORD.mp4
I need to remove everything between the last . and ep# from the file names and only have name1.name2.s01.ep01.mp4 (sometimes the extension can be different)
name1.name2.s01.ep01.mp4
name1.name2.s01.ep02.mp4
name1.name2.s01.ep03.mp4

This is a simpler version of #Jesse's [answer]
for file in /path/to/base_folder/* #Globbing to get the files
do
epno=${file#*.ep}
mv "$file" "${file%.ep*}."ep${epno%%.*}".${file##*.}"
#For the renaming part,see the note below
done
Note : Didn't get a grab of shell parameter expansion yet ? Check [ this ].

Using Linux string manipulation (refer: http://www.tldp.org/LDP/abs/html/string-manipulation.html) you could achieve like so:
You need to do per file-extension type.
for file in <directory>/*
do
name=${file}
firstchar="${name:0:1}"
extension=${name##${firstchar}*.}
lastchar=$(echo ${name} | tail -c 2)
strip1=${name%.*$lastchar}
lastchar=$(echo ${strip1} | tail -c 2)
strip2=${strip1%.*$lastchar}
mv $name "${strip2}.${extension}"
done

You can use rename (you may need to install it). But it works like sed on filenames.
As an example
$ for i in `seq 3`; do touch "name1.name2.s01.ep0$i.RANDOMWORD.txt"; done
$ ls -l
name1.name2.s01.ep01.RANDOMWORD.txt
name1.name2.s01.ep02.RANDOMWORD.txt
name1.name2.s01.ep03.RANDOMWORD.txt
$ rename 's/(name1.name2.s01.ep\d{2})\..*(.txt)$/$1$2/' name1.name2.s01.ep0*
$ ls -l
name1.name2.s01.ep01.txt
name1.name2.s01.ep02.txt
name1.name2.s01.ep03.txt
Where this expression matches your filenames, and using two capture groups so that the $1$2 in the replacement operation are the parts outside the "RANDOMWORD"
(name1.name2.s01.ep\d{2})\..*(.txt)$

Listing entries in a directory using grep

I'm trying to list all entries in a directory whose names contain ONLY upper-case letters. Directories need "/" appended.
#!/bin/bash
cd ~/testfiles/
ls | grep -r *.*
Since grep by default looks for upper-case letters only (right?), I'm just recursively searching through the directories under testfiles for all names who contain only upper-case letters.
Unfortunately this doesn't work.
As for appending directories, I'm not sure why I need to do this. Does anyone know where I can start with some detailed explanations on what I can do with grep? Furthermore how to tackle my problem?

No, grep does not only consider uppercase letters.
Your question I a bit unclear, for example:
from your usage of the -r option, it seems you want to search recursively, however you don't say so. For simplicity I assume you don't need to; consider looking into #twm's answer if you need recursion.
you want to look for uppercase (letters) only. Does that mean you don't want to accept any other (non letter) characters, but which are till valid for file names (like digits or dashes, dots, etc.)
since you don't say th it i not permissible to have only on file per line, I am assuming it is OK (thus using ls -1).
The naive solution would be:
ls -1 | grep "^[[:upper:]]\+$"
That is, print all lines containing only uppercase letters. In my TEMP directory that prints, for example:
ALLBIG
LCFEM
WPDNSE
This however would exclude files like README.TXT or FILE001, which depending on your requirements (see above) should most likely be included.
Thus, a better solution would be:
ls -1 | grep -v "[[:lower:]]\+"
That is, print all lines not containing an lowercase letter. In my TEMP directory that prints for example:
ALLBIG
ALLBIG-01.TXT
ALLBIG005.TXT
CRX_75DAF8CB7768
LCFEM
WPDNSE
~DFA0214428CD719AF6.TMP
Finally, to "properly mark" directories with a trailing '/', you could use the -F (or --classify) option.
ls -1F | grep -v "[[:lower:]]\+"
Again, example output:
ALLBIG
ALLBIG-01.TXT
ALLBIG005.TXT
CRX_75DAF8CB7768
LCFEM/
WPDNSE/
~DFA0214428CD719AF6.TMP
Note a different option would to be use find, if you can live with the different output (e.g. find ! -regex ".*[a-z].*"), but that will have a different output.

The exact regular expression depend on the output format of your ls command. Assuming that you do not use an alias for ls, you can try this:
ls -R | grep -o -w "[A-Z]*"
note that with -R in ls you will recursively list directories and files under the current directory. The grep option -o tells grep to only print the matched part of the text. The -w options tell grep to consider as match only for whole words. The "[A-Z]*" is a regexp to filter only upper-cased words.
Note that this regexp will print TEST.txt as well as TEXT.TXT. In other words, it will only consider names that are formed by letters.

It's ls which lists the files, not grep, so that is where you need to specify that you want "/" appended to directories. Use ls --classify to append "/" to directories.
grep is used to process the results from ls (or some other source, generally speaking) and only show lines that match the pattern you specify. It is not limited to uppercase characters. You can limit it to just upper case characters and "/" with grep -E '^[A-Z/]*$ or if you also want numbers, periods, etc. you could instead filter out lines that contain lowercase characters with grep -v -E [a-z].
As grep is not the program which lists the files, it is not where you want to perform the recursion. ls can list paths recursively if you use ls -R. However, you're just going to get the last component of the file paths that way.
You might want to consider using find to handle the recursion. This works for me:
find . -exec ls -d --classify {} \; | egrep -v '[a-z][^/]*/?$'
I should note, using ls --classify to append "/" to the end of directories may also append some other characters to other types of paths that it can classify. For instance, it may append "*" to the end of executable files. If that's not OK, but you're OK with listing directories and other paths separately, this could be worked around by running find twice - once for the directories and then again for other paths. This works for me:
find . -type d | egrep -v '[a-z][^/]*$' | sed -e 's#$#/#'
find . -not -type d | egrep -v '[a-z][^/]*$'

How to delete multiple files at once in Bash on Linux?

I have this list of files on a Linux server:
abc.log.2012-03-14
abc.log.2012-03-27
abc.log.2012-03-28
abc.log.2012-03-29
abc.log.2012-03-30
abc.log.2012-04-02
abc.log.2012-04-04
abc.log.2012-04-05
abc.log.2012-04-09
abc.log.2012-04-10
I've been deleting selected log files one by one, using the command rm -rf see below:
rm -rf abc.log.2012-03-14
rm -rf abc.log.2012-03-27
rm -rf abc.log.2012-03-28
Is there another way, so that I can delete the selected files at once?

Bash supports all sorts of wildcards and expansions.
Your exact case would be handled by brace expansion, like so:
$ rm -rf abc.log.2012-03-{14,27,28}
The above would expand to a single command with all three arguments, and be equivalent to typing:
$ rm -rf abc.log.2012-03-14 abc.log.2012-03-27 abc.log.2012-03-28
It's important to note that this expansion is done by the shell, before rm is even loaded.

Use a wildcard (*) to match multiple files.
For example, the command below will delete all files with names beginning with abc.log.2012-03-.
rm -f abc.log.2012-03-*
I'd recommend running ls abc.log.2012-03-* to list the files so that you can see what you are going to delete before running the rm command.
For more details see the Bash man page on filename expansion.

If you want to delete all files whose names match a particular form, a wildcard (glob pattern) is the most straightforward solution. Some examples:
$ rm -f abc.log.* # Remove them all
$ rm -f abc.log.2012* # Remove all logs from 2012
$ rm -f abc.log.2012-0[123]* # Remove all files from the first quarter of 2012
Regular expressions are more powerful than wildcards; you can feed the output of grep to rm -f. For example, if some of the file names start with "abc.log" and some with "ABC.log", grep lets you do a case-insensitive match:
$ rm -f $(ls | grep -i '^abc\.log\.')
This will cause problems if any of the file names contain funny characters, including spaces. Be careful.
When I do this, I run the ls | grep ... command first and check that it produces the output I want -- especially if I'm using rm -f:
$ ls | grep -i '^abc\.log\.'
(check that the list is correct)
$ rm -f $(!!)
where !! expands to the previous command. Or I can type up-arrow or Ctrl-P and edit the previous line to add the rm -f command.
This assumes you're using the bash shell. Some other shells, particularly csh and tcsh and some older sh-derived shells, may not support the $(...) syntax. You can use the equivalent backtick syntax:
$ rm -f `ls | grep -i '^abc\.log\.'`
The $(...) syntax is easier to read, and if you're really ambitious it can be nested.
Finally, if the subset of files you want to delete can't be easily expressed with a regular expression, a trick I often use is to list the files to a temporary text file, then edit it:
$ ls > list
$ vi list # Use your favorite text editor
I can then edit the list file manually, leaving only the files I want to remove, and then:
$ rm -f $(<list)
or
$ rm -f `cat list`
(Again, this assumes none of the file names contain funny characters, particularly spaces.)
Or, when editing the list file, I can add rm -f to the beginning of each line and then:
$ . ./list
or
$ source ./list
Editing the file is also an opportunity to add quotes where necessary, for example changing rm -f foo bar to rm -f 'foo bar' .

Just use multiline selection in sublime to combine all of the files into a single line and add a space between each file name and then add rm at the beginning of the list. This is mostly useful when there isn't a pattern in the filenames you want to delete.
[$]> rm abc.log.2012-03-14 abc.log.2012-03-27 abc.log.2012-03-28 abc.log.2012-03-29 abc.log.2012-03-30 abc.log.2012-04-02 abc.log.2012-04-04 abc.log.2012-04-05 abc.log.2012-04-09 abc.log.2012-04-10

A wild card would work nicely for this, although to be safe it would be best to make the use of the wild card as minimal as possible, so something along the lines of this:
rm -rf abc.log.2012-*
Although from the looks of it, are those just single files? The recursive option should not be necessary if none of those items are directories, so best to not use that, just for safety.

I am not a linux guru, but I believe you want to pipe your list of output files to xargs rm -rf. I have used something like this in the past with good results. Test on a sample directory first!
EDIT - I might have misunderstood, based on the other answers that are appearing. If you can use wildcards, great. I assumed that your original list that you displayed was generated by a program to give you your "selection", so I thought piping to xargs would be the way to go.

if you want to delete all files that belong to a directory at once.
For example:
your Directory name is "log" and "log" directory include abc.log.2012-03-14, abc.log.2012-03-15,... etc files. You have to be above the log directory and:
rm -rf /log/*

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string