Select files from a list of files - Linux

I have a list of filenames which look like this:
tRapTrain.Isgf3g.2853.2.v1.primary.RC.txt tRapTrain.Yox1.txt
tRapTrain.Isgf3g.2853.2.v1.primary.txt tRapTrain.Ypr015c.txt
tRapTrain.Isgf3g.2853.2.v1.secondary.RC.txt tRapTrain.Yrm1.txt
tRapTrain.Isgf3g.2853.2.v1.secondary.txt tRapTrain.Zbtb12.2932.2.v1.primary.RC.txt
Now I need to select the files ending in primary.txt and all the files where no final suffix is present. Final suffix == primary.RC.txt, secondary.RC.txt, secondary.txt.
So my desired output will be:
tRapTrain.Isgf3g.2853.2.v1.primary.txt
tRapTrain.Yox1.txt
tRapTrain.Ypr015c.txt
tRapTrain.Yrm1.txt
I tried to do it with ls tRap*primary.txt but can't figure out how to do both selections at once. Any help is appreciated.

You can use find:
find * -type f -not -name "*.secondary.RC.txt" -not -name "*.primary.RC.txt" -not -name "*.secondary.txt" -print
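If the list lives in a single directory and everything shares the tRap prefix shown above, you can keep the search flat and pre-filter by prefix (a sketch; -maxdepth is a GNU find extension):
find . -maxdepth 1 -type f -name 'tRapTrain.*' -not -name '*.primary.RC.txt' -not -name '*.secondary.RC.txt' -not -name '*.secondary.txt' -print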

Using shopt with extended globbing:
$ shopt -s extglob
$ ls !(*primary.RC.txt|*secondary.RC.txt|*secondary.txt)
Meaning:
!(pattern-list)
Matches anything except one of the given patterns.
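The same extglob pattern works anywhere a glob does, so if you need to do more than list the files you can loop over them. A minimal sketch, where the echo stands in for whatever you actually do with each file:
shopt -s extglob
for f in !(*primary.RC.txt|*secondary.RC.txt|*secondary.txt); do
    echo "selected: $f"   # replace with cp/mv/processing as needed
done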

I would use an inverted grep match:
ls tRap* | grep -v "\.RC\." | grep -v "\.secondary\."
This should get rid of anything with ".RC." or ".secondary." in the title, which sounds like what you want.
This may not be the most elegant, but it works.
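If you prefer a single pipe stage, the two inverted matches can be folded into one extended-regex grep; this should be equivalent in effect, assuming the same naming scheme as above:
ls tRap* | grep -Ev '\.RC\.|\.secondary\.'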

Related

Can't find a file by pattern [duplicate]

I am having a hard time getting find to look for matches in the current directory as well as its subdirectories.
When I run find *test.c it only gives me the matches in the current directory. (does not look in subdirectories)
If I try find . -name *test.c I would expect the same results, but instead it gives me only matches that are in a subdirectory. When there are files that should match in the working directory, it gives me: find: paths must precede expression: mytest.c
What does this error mean, and how can I get the matches from both the current directory and its subdirectories?
Try putting it in quotes -- you're running into the shell's wildcard expansion, so what you're actually passing to find will look like:
find . -name bobtest.c cattest.c snowtest.c
...causing the syntax error. So try this instead:
find . -name '*test.c'
Note the single quotes around your file expression -- these will stop the shell (bash) expanding your wildcards.
What's happening is that the shell is expanding "*test.c" into a list of files. Try escaping the asterisk as:
find . -name \*test.c
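One way to see the difference is to prefix the command with echo, so the shell shows exactly what find would receive; the file names below are the hypothetical ones from the earlier example:
$ echo find . -name *test.c
find . -name bobtest.c cattest.c snowtest.c
$ echo find . -name '*test.c'
find . -name *test.c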
From find manual:
NON-BUGS
Operator precedence surprises
The command find . -name afile -o -name bfile -print will never print afile because this is actually equivalent to find . -name afile -o \( -name bfile -a -print \). Remember that the precedence of -a is higher than that of -o, and when there is no operator specified between tests, -a is assumed.
“paths must precede expression” error message
$ find . -name *.c -print
find: paths must precede expression
Usage: find [-H] [-L] [-P] [-Olevel] [-D ... [path...] [expression]
This happens because *.c has been expanded by the shell, resulting in find actually receiving a command line like this:
find . -name frcode.c locate.c word_io.c -print
That command is of course not going to work. Instead of doing things this way, you should enclose the pattern in quotes or escape the wildcard:
$ find . -name '*.c' -print
$ find . -name \*.c -print
Try putting it in quotes:
find . -name '*test.c'
I see this question is already answered. I just want to share what worked for me: I was missing a space between \( and -name. So the correct way of choosing files while excluding some of them is like below:
find . -name 'my-file-*' -type f -not \( -name 'my-file-1.2.0.jar' -or -name 'my-file.jar' \)
I came across this question when I was trying to find multiple filenames that I could not combine into a regular expression as described in Chris J's answer. Here is what worked for me:
find . -name one.pdf -o -name two.txt -o -name anotherone.jpg
-o or -or is logical OR. See Finding Files on Gnu.org for more information.
I was running this on Cygwin.
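Note that if you combine -o with an explicit action such as -print or -exec, the operator-precedence surprise quoted from the find manual above applies, so group the alternatives in escaped parentheses, for example:
find . \( -name one.pdf -o -name two.txt -o -name anotherone.jpg \) -print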
You can try this:
cat $(file $( find . -readable) | grep ASCII | tr ":" " " | awk '{print $1}')
With that, you can find all readable ASCII files and read them with cat.
If you also want to specify the file size and exclude executables:
cat $(file $( find . -readable ! -executable -size 1033c) | grep ASCII | tr ":" " " | awk '{print $1}')
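A slightly more robust sketch of the same idea (still GNU find, and still assuming the file names contain no colons or spaces) lets find hand the names to file directly instead of relying on word splitting:
find . -type f -readable ! -executable -size 1033c -exec file {} + | grep ASCII | cut -d: -f1 | xargs cat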
In my case I was missing a trailing / in the path.
find /var/opt/gitlab/backups/ -name *.tar

bash: filter the files where NOT to search

I have created a script that searches for the specified keywords in specified directories:
find "$directory" -type f -name "*.properties" -exec grep -Fi "$keyword" {} +
The problem I faced is that $directory contains two types of files, sample files and config files (e.g. config vs. sample.config), where sample.config is only an example, so I'm not interested in including it in the search.
The question is: how do I exclude these 'sample.*' files from my search results?
To exclude the sample.config files as asked, add ! -name sample.config to the find command, for example:
find $(<$SRC) -type f -name "*.properties" ! -name sample.config -exec grep -Fi "$keyword" --color {} +
However, *.properties can't match sample.config, so this will not change the result.
Alternatively, one command to search for $keyword across all four of your file types while excluding sample.*:
msr -rp dir1,dir2,dirN -f "\.(properties|pl|xml|ini)$" --nf "^sample\." -it "keyword"
Use -PAC or -P -A -C to remove color, line numbers, etc. and get plain output.
Use -l to just list the file paths and show the distribution (count + percentage).
msr.gcc* is a single-executable tool for searching/replacing in files or pipes, from my open-source project (see the tools directory at https://github.com/qualiu/msr), with cross-platform and OS-bit builds. Built-in documentation: https://qualiu.github.io/msr/usage-by-running/msr-CentOS-7.html; for a demo, a performance comparison with findstr and grep, tests, etc., see the project home page.
Using Nahuel's suggestion, I've modified it a bit and it started working for me as:
find $(<$SRC) -type f -name "*.properties" ! -name "sample.*" -exec grep -Fi "$keyword" --color {} +
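If the sample files ever end up in their own directory instead of just following the sample.* naming convention, another option is to prune that directory; here samples/ is a hypothetical subdirectory name:
find "$directory" -path "$directory/samples" -prune -o -type f -name "*.properties" -exec grep -Fi "$keyword" --color {} +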

How do I find the number of all .txt files in a directory and all subdirectories, using specifically the find command and the wc command?

So far I have this:
find -name ".txt"
I'm not quite sure how to use wc to find out the exact number of files. When using the command above, all the .txt files show up, but I need the exact number of files with the .txt extension. Please don't suggest using other commands as I'd like to specifically use find and wc. Thanks
Try:
find . -name '*.txt' | wc -l
The -l option to wc tells it to return just the number of lines.
Improvement (requires GNU find)
The above will give the wrong number if any .txt file name contains a newline character. This will work correctly with any file names:
find . -iname '*.txt' -printf '1\n' | wc -l
-printf '1\n' tells find to print just the line 1 for each file name found. This avoids problems with file names containing difficult characters.
Example
Let's create two .txt files, one with a newline in its name:
$ mkdir -p dir1/dir2 && touch dir1/dir2/a.txt $'dir1/dir2/b\nc.txt'
Now, let's run the find command:
$ find . -name '*.txt'
./dir1/dir2/b?c.txt
./dir1/dir2/a.txt
To count the files:
$ find . -name '*.txt' | wc -l
3
As you can see, the answer is off by one. The improved version, however, works correctly:
$ find . -iname '*.txt' -printf '1\n' | wc -l
2
find -type f -name "*.h" -mtime +10 -print | wc -l
This worked out for me (it counts the .h files last modified more than 10 days ago).
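A related newline-safe variant that still uses only find and wc, as the question requires (though -printf is specific to GNU find), prints one character per match and counts characters instead of lines:
find . -name '*.txt' -printf '.' | wc -c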

Display multiple files in Linux/Unix

I'm looking to display 3 different files, if they exist. I thought the following would work, but it doesn't:
ls -R | grep 6-atom2D.vector$ 6-atom2D.klist 6-atom2D.struct
How can I do it?
Knowing the (base) filenames, you can use find:
find . -name '6-atom2D.vector' -o -name '6-atom2D.klist' -o -name '6-atom2D.struct'
It searches recursively by default.
For case-insensitive search, use -iname instead.
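If the three files are expected in the current directory only, a brace expansion with ls is a shorter alternative; the 2>/dev/null just hides the error for any of them that does not exist:
ls 6-atom2D.{vector,klist,struct} 2>/dev/null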
ls -R | egrep "6-atom2D\.vector$|6-atom2D\.klist|6-atom2D\.struct"
If $ is supposed to be end of line regexp, then you might need to use \> instead. That works for me at least.
Edit: Backslash before .

How to replace and then search with sed and find

I'm making a bash script which would call like
script.sh 172.16.1.1
I'm trying to strip the dots from the IP and then search for the matching files to delete them, but it doesn't work:
echo $1 | find -name '*.`sed 's/\.*//g'`' -printf "%f\n" -delete
files look like
eth0-2:120.1721611 eth1-2:120.1721611
Try this command inside that script.
I think this may help you for your requirement.
$ find -name "*echo "$1" | sed 's/\.*//g'" -printf "%f\n" -delete
I am passing only the name portion for that particular field; if you passed it for the whole command it would produce a different result.
The given command searches from the current directory downwards.
If you need to search from the root or your home directory, use / or ~ in the find command, like:
$ find ~ -name "*$(echo "$1" | sed 's/\.*//g')" -printf "%f\n" -delete
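Since -delete is irreversible, a safer sketch is to build the dotless pattern once and preview the matches before deleting; ip_nodots is just a hypothetical variable name:
ip_nodots=$(echo "$1" | sed 's/\.//g')           # 172.16.1.1 -> 1721611
find . -type f -name "*${ip_nodots}" -print      # preview what would be removed
# find . -type f -name "*${ip_nodots}" -delete   # run this once the list looks right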
