In Linux terminal, how to delete all files in a directory except one or two - linux

In a Linux terminal, how to delete all files from a folder except one or two?
For example.
I have 100 image files in a directory and one .txt file.
I want to delete all files except that .txt file.

From within the directory, list the files, filter out all not containing 'file-to-keep', and remove all files left on the list.
ls | grep -v 'file-to-keep' | xargs rm
To avoid issues with spaces in filenames (remember to never use spaces in filenames), use find and -0 option.
find 'path' -maxdepth 1 -not -name 'file-to-keep' -print0 | xargs -0 rm
Or mixing both, use grep option -z to manage the -print0 names from find

In general, using an inverted pattern search with grep should do the job. As you didn't define any pattern, I'd just give you a general code example:
ls -1 | grep -v 'name_of_file_to_keep.txt' | xargs rm -f
The ls -1 lists one file per line, so that grep can search line by line. grep -v is the inverted flag. So any pattern matched will NOT be deleted.
For multiple files, you may use egrep:
ls -1 | grep -E -v 'not_file1.txt|not_file2.txt' | xargs rm -f
Update after question was updated:
I assume you are willing to delete all files except files in the current folder that do not end with .txt. So this should work too:
find . -maxdepth 1 -type f -not -name "*.txt" -exec rm -f {} \;

find supports a -delete option so you do not need to -exec. You can also pass multiple sets of -not -name somefile -not -name otherfile
user#host$ ls
1.txt 2.txt 3.txt 4.txt 5.txt 6.txt 7.txt 8.txt josh.pdf keepme
user#host$ find . -maxdepth 1 -type f -not -name keepme -not -name 8.txt -delete
user#host$ ls
8.txt keepme

Use the not modifier to remove file(s) or pattern(s) you don't want to delete, you can modify the 1 passed to -maxdepth to specify how many sub directories deep you want to delete files from
find . -maxdepth 1 -not -name "*.txt" -exec rm -f {} \;
You can also do:
find -maxdepth 1 \! -name "*.txt" -exec rm -f {} \;

In bash, you can use:
$ shopt -s extglob # Enable extended pattern matching features
$ rm !(*.txt) # Delete all files except .txt files

Related

cat files in subdirectories using linux commands

I have the following directories:
P922_101
P922_102
.
.
Each directory, for instance P922_101 has following subdirectories:
140311_AH8MHGADXX 140401_AH8CU4ADXX
Each subdirectory, for instance 140311_AH8MHGADXX has the following files:
1_140311_AH8MH_P922_101_1.fastq.gz 1_140311_AH8MH_P922_101_2.fastq.gz
2_140311_AH8MH_P922_101_1.fastq.gz 2_140311_AH8MH_P922_101_2.fastq.gz
And files in 140401_AH8CU4ADXX are:
1_140401_AH8CU_P922_101_1.fastq.gz 1_140401_AH8CU_P922_4001_2.fastq.gz
2_140401_AH8CU_P922_101_1.fastq.gz 2_140401_AH8CU_P922_4001_2.fastq.gz
I want to do 'cat' for the files in the subdirectories in the following way:
cat 1_140311_AH8MH_P922_101_1.fastq.gz 2_140311_AH8MH_P922_101_1.fastq.gz
1_140401_AH8CU_P922_101_1.fastq.gz 2_140401_AH8CU_P922_101_1.fastq.gz > P922_101_1.fastq.gz
which means that files ending with _1.fastq.gz should be concatenated into a single file and files ending with _2.fatsq.gz into another file.
It should be run for all files in subdirectories in all directories. Could someone give a linux solution to do this?
Since they're compressed, you should probably use gzip -dc (decompress and write to stdout) -
find /somePath -type f -name "*.fastq.gz" -exec gzip -dc {} \; | \
tee -a /someOutFolder/out.txt
You can use find for this:
find /top/path -mindepth 2 -type f -name "*_1.fastq.gz" -exec cat {} \; > one_file
find /top/path -mindepth 2 -type f -name "*_2.fastq.gz" -exec cat {} \; > another_file
This will look for all the files starting from /top/path and having a name matching the pattern _1.fastq.gz / _2.fastq.gz and cat them into the desired file. -mindepth 2 makes find look for files that are at least under the current directory; this way, files in /top/path won't be matched.
Note that you will probably need zcat instead of cat, for gz files.
As you keep adding details in comments, let's see what else we can do:
Say you have the list of directories in a file directories_list, each line containing one:
while read directory
do
find $directory -mindepth 2 -type f -name "*_1.fastq.gz" -exec cat {} \; > $directory/output
done < directories_list

Capture the output of command line in a variable in Unix

I want to capture the output of the command below in a variable.
Command:
find . -iname 'FIL*'.TXT
The output is :
./FILE1.TXT
I want to capture './FILE1.TXT' into 'A' variable. But when I am trying
A=`find . -iname 'FIL*'.TXT`
then this command is displaying the data of the file. But I want ./FILE1.TXT value in the variable A.
# ls *.txt
test1.txt test.txt
# find ./ -maxdepth 1 -iname "*.txt"
./test1.txt
./test.txt
# A=$(find ./ -maxdepth 1 -iname "*.txt")
# echo $A
./test1.txt ./test.txt
You can ignore -maxdepth 1 if you want to. I had to use it for this example.
Or with a single file:
# ls *.txt
test.txt
# find ./ -maxdepth 1 -iname "*.txt"
./test.txt
# A=$(find ./ -maxdepth 1 -iname "*.txt")
# echo $A
./test.txt
Do you try ?
A="`find . -iname 'FIL*'.TXT`"
and
A="`find . -iname 'FIL*'.TXT -print`"
A file does not have any value, but does have a content. Use the following to display that content.
find . -iname 'FIL*'.TXT -exec cat {} \;
If you want all the contents (of all such files) in a variable, then
A=$(find . -iname 'FIL*'.TXT -exec cat {} \;)
BTW you could have used
find . -iname 'FIL*.TXT' -print0 | xargs -0 cat
If you want the names of such files in a variable, try
A=$(find . -iname 'FILE*.txt' -print)
BTW, on some several recent interactive shells (zsh, bash version 4 but not earlier versions) just write
A=**/FILE*.txt
My feeling is that the ** feature is by itself worth switching to a newer shell, but it is just my opinion.
Also, don't forget that files may have several or no names. Read about inodes ...

How to copy all the files with the same suffix to another directory? - Unix

I have a directory with unknown number of subdirectories and unknown level of sub*directories within them. How do I copy all the file swith the same suffix to a new directory?
E.g. from this directory:
> some-dir
>> foo-subdir
>>> bar-sudsubdir
>>>> file-adx.txt
>> foobar-subdir
>>> file-kiv.txt
Move all the *.txt files to:
> new-dir
>> file-adx.txt
>> file-kiv.txt
One option is to use find:
find some-dir -type f -name "*.txt" -exec cp \{\} new-dir \;
find some-dir -type f -name "*.txt" would find *.txt files in the directory some-dir. The -exec option builds a command line (e.g. cp file new.txt) for every matching file denoted by {}.
Use find with xargs as shown below:
find some-dir -type f -name "*.txt" -print0 | xargs -0 cp --target-directory=new-dir
For a large number of files, this xargs version is more efficient than using find some-dir -type f -name "*.txt" -exec cp {} new-dir \; because xargs will pass multiple files at a time to cp, instead of calling cp once per file. So there will be fewer fork/exec calls with the xargs version.

how to find files containing a string using egrep

I would like to find the files containing specific string under linux.
I tried something like but could not succeed:
find . -name *.txt | egrep mystring
Here you are sending the file names (output of the find command) as input to egrep; you actually want to run egrep on the contents of the files.
Here are a couple of alternatives:
find . -name "*.txt" -exec egrep mystring {} \;
or even better
find . -name "*.txt" -print0 | xargs -0 egrep mystring
Check the find command help to check what the single arguments do.
The first approach will spawn a new process for every file, while the second will pass more than one file as argument to egrep; the -print0 and -0 flags are needed to deal with potentially nasty file names (allowing to separate file names correctly even if a file name contains a space, for example).
try:
find . -name '*.txt' | xargs egrep mystring
There are two problems with your version:
Firstly, *.txt will first be expanded by the shell, giving you a listing of files in the current directory which end in .txt, so for instance, if you have the following:
[dsm#localhost:~]$ ls *.txt
test.txt
[dsm#localhost:~]$
your find command will turn into find . -name test.txt. Just try the following to illustrate:
[dsm#localhost:~]$ echo find . -name *.txt
find . -name test.txt
[dsm#localhost:~]$
Secondly, egrep does not take filenames from STDIN. To convert them to arguments you need to use xargs
find . -name *.txt | egrep mystring
That will not work as egrep will be searching for mystring within the output generated by find . -name *.txt which are just the path to *.txt files.
Instead, you can use xargs:
find . -name *.txt | xargs egrep mystring
You could use
find . -iname *.txt -exec egrep mystring \{\} \;
Here's an example that will return the file paths of a all *.log files that have a line that begins with ERROR:
find . -name "*.log" -exec egrep -l '^ERROR' {} \;
there's a recursive option from egrep you can use
egrep -R "pattern" *.log
If you only want the filenames:
find . -type f -name '*.txt' -exec egrep -l pattern {} \;
If you want filenames and matches:
find . -type f -name '*.txt' -exec egrep pattern {} /dev/null \;

Linux command for removing all ~ files

What command can I use in Linux to check if there is a file in a given directory (or its subdirectories) that contains a ~at the end of the file's name?
For example, if I'm at a directory called t which contains many subdirectories, etc, I would like to remove all files that end with a ~.
Watch out for filenames with spaces in them!
find ./ -name "*~" -type f -print0 | xargs -0 rm
with GNU find
find /path -type f -name "*~" -exec rm {} +
or
find /path -type f -name "*~" -delete
find ./ -name '*~' -print0 | xargs -0 rm -f
Here find will search the directory ./ and all sub directories, filtering for filenames that match the glob '*~' and printing them (with proper quoting courtesy of alberge). The results are passed to xargs to be appended to rm -f and the resulting string run in a shell. You can use multiple paths, and there are many other filters available (just read man find).
you can use a find, grep, rm combination, something like
find | grep "~" | xargs rm -f
Probably others have better ideas :)

Resources