Nested find on the basis of two conditions - linux

I want to find files whose names start with show and that were created in a particular month. I have tried the following:
for i in `find /home/data -type d -name "$MONTH"`;
do find $i -type f -name "show*" -printf "%h\n"|uniq >tempfile1;
done;
but I get this error:
-bash: /home/data/testdata/2017/Apr/25: Is a directory
How can I fix that?

If you run that small bit through ShellCheck, several issues become apparent:
Instead of looping over the output of find, use -exec.
You should enclose $i in quotes to prevent globbing and word splitting. That may be the cause of your issue.
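A minimal sketch of the -exec suggestion, assuming GNU find (the question already relies on its -printf). The function name find_show_dirs and the sort -u step are illustrative choices, not part of the original answer:

```bash
# Hypothetical helper: list the directories below ROOT/.../MONTH that contain
# at least one regular file whose name starts with "show".
find_show_dirs() {
    local root=$1 month=$2
    # The outer find locates the month directories and passes each one to an
    # inner find through -exec, so no directory name is ever word-split.
    find "$root" -type d -name "$month" \
        -exec find {} -type f -name 'show*' -printf '%h\n' \; | sort -u
}

# Example call (the path is the one from the question):
# find_show_dirs /home/data "$MONTH" > tempfile1
```

Redirecting once, after the whole pipeline, also avoids overwriting tempfile1 on every iteration as the original loop does.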


Linux FIND searching files with names within single quotes [duplicate]

This question already has answers here:
How can I store the "find" command results as an array in Bash
(8 answers)
Closed 1 year ago.
I am trying to save the results of a find command into an array so that I can then run some awk commands on them.
My actual code is: files_arr=( "$(find "$1" -type f \( -name "\'*[[:space:]]*\'" -o -name "*" \) -perm -a=r -print )" )
This code should find all readable files, with or without spaces in their names, and return them in my array.
The problem is that when I have a directory named 'not easy' containing the files 'file one' and 'file two', what I get is: not easy/file one
What I want to get is: 'not easy'/'file one'. I was thinking about using sed to add quotes, but it would add quotes even to a simple one-word file name that doesn't need them.
Thank you for your advice.
Try this out:
mapfile -d '' files_arr < <(find . -type f -name "'*[[:space:]]*'" -perm -a=r -print0)
declare -p files_arr # To see what's in the array
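As a rough sketch of how the same NUL-delimited approach could cover every readable file, not only those with quotes in their names: the helper names collect_files and show_quoted are made up for illustration, and mapfile -d '' assumes bash 4.4 or newer. printf '%q' produces shell-escaped output (backslash style), which is not exactly the 'not easy'/'file one' form asked for, but it can be pasted back into a shell safely.

```bash
# Hypothetical helper: collect readable regular files under DIR into the
# global array files_arr, one element per file, regardless of spaces.
collect_files() {
    # -print0 and mapfile -d '' both use NUL as the separator.
    mapfile -d '' files_arr < <(find "$1" -type f -perm -a=r -print0)
}

# Print each element in a shell-quoted form, for display only.
show_quoted() {
    local f
    for f in "${files_arr[@]}"; do
        printf '%q\n' "$f"
    done
}

# Example:
# collect_files "$1"; show_quoted
```

Because each array element already holds one complete path, the names can be passed on as "${files_arr[@]}" without adding any quote characters to the data itself.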

Why cat command not working in script

I have the following script, and it has an error. I am trying to merge all the files into one large file. From the command line, the cat command works fine and the content is written to the redirected file. From the script it works sometimes but not always. I don't know why it behaves inconsistently. Please help.
#!/bin/bash
### For loop starts ###
for D in `find . -type d`
do
combo=`find $D -maxdepth 1 -type f -name "combo.txt"`
cat $combo >> bigcombo.tsv
done
Here is the output of bash -x app.sh
++ find . -type d
+ for D in '`find . -type d`'
++ find . -maxdepth 1 -type f -name combo.txt
+ combo=
+ cat
^C
UPDATE:
The following worked for me. There was an issue with the path. I still don't know what the issue was, so an answer is welcome.
#!/bin/bash
### For loop starts ###
rm -rf bigcombo.tsv
for D in `find . -type d`
do
psi=`find $D -maxdepth 1 -type f -name "*.psi_filtered"`
# This will give us only the directory path from find result i.e. removing filename.
directory=$(dirname "${psi}")
cat $directory"/selectedcombo.txt" >> bigcombo.tsv
done
The obvious problem is that you are attempting to cat a file which doesn't exist.
Secondary problems are related to efficiency and correctness. Running two nested loops is best avoided, though splitting the action into two steps is merely inelegant here; the inner loop will only execute once, at most. Capturing command results into variables is a common beginner antipattern; a variable which is only used once can often be avoided, and avoids littering the shell's memory with cruft (and coincidentally solves the multiple problems with missing quoting - a variable which contains a file or directory name should basically always be interpolated in double quotes). Redirection is better performed outside any containing loop;
rm file
while something; do
another thing >>file
done
will open, seek to the end of the file, write, and close the file as many times as the loop runs, whereas
while something; do
another thing
done >file
only performs the open, seek, and close actions once, and avoids having to clear the file before starting the loop. Though your script can be refactored to not have any loops at all;
find ./*/ -type f -name "*.psi_filtered" -execdir cat selectedcombo.txt \; > bigcombo.tsv
Depending on your problem, it might be an error for there to be directories which contain combo.txt but which do not contain any *.psi_filtered files. Perhaps you want to locate and examine these directories.
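One possible way to locate those directories, sketched under the assumption of GNU find (for -maxdepth and -quit). The function name dirs_missing_psi is hypothetical:

```bash
# Hypothetical helper: print every directory under ROOT that holds a
# combo.txt but no *.psi_filtered file.
dirs_missing_psi() {
    find "$1" -type f -name 'combo.txt' -exec sh -c '
        for f do
            d=$(dirname "$f")
            # -quit stops at the first match; empty output means none exist.
            if [ -z "$(find "$d" -maxdepth 1 -type f -name "*.psi_filtered" -print -quit)" ]; then
                printf "%s\n" "$d"
            fi
        done' sh {} +
}

# Example:
# dirs_missing_psi .
```

The paths are passed to sh as positional parameters rather than interpolated into the script text, so directory names with spaces are handled.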

Exclude range of directories in find command

I have a directory called test which has subfolders named by day of the month: 01, 02, ..., 31. All of these subfolders contain .bz2 files. I need to search for all files with the .bz2 extension using the find command, but excluding a particular range of directories. I know about find . -name "*.bz2" -not -path "./01/*", but writing -not -path "./01/*" over and over would be tedious if I wanted to skip 10 directories. So how would I skip the subdirectories 01..19 in my find command?
You can use wildcards in the pattern for the option -not -path:
find ./ -type f -name "*.bz2" -not -path "./0*/*" -not -path "./1*/*"
This will exclude all directories starting with 0 or 1. Or even better:
find ./ -type f -name "*.bz2" -not -path "./[01]*/*"
Firstly, you can help find by using -prune rather than -not -path - that will avoid even looking inside the relevant directories.
To your main point - you can build a wildcard for your example (numeric 01 to 19):
find . -path './0[1-9]' -prune -o -path './1[0-9]' -prune -o -print
If your range is less convenient (e.g. 05 to 25) you might want to build the range into a bash variable, then interpolate that into the find command:
a=("-path ./"{05..25}" -prune -o")
find . ${a[*]} -print
(you might want to echo "${a[*]}" or printf '%s\n' ${a[*]} to see how it's working)
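A variant of the same idea that stores one find argument per array element, so the expansion does not depend on word splitting. This is only a sketch: the function name find_bz2_skipping is invented here, ROOT is assumed to be given without a trailing slash, and the bounds are assumed to be plain integers such as 5 and 25 (not 05, which printf would read as octal).

```bash
# Hypothetical helper: list *.bz2 files under ROOT, skipping the two-digit
# subdirectories FIRST..LAST (for example 5 and 25).
find_bz2_skipping() {
    local root=$1 first=$2 last=$3 n
    local -a prune=()
    for ((n = first; n <= last; n++)); do
        # Each word below becomes its own array element, i.e. its own argument.
        prune+=( -path "$root/$(printf '%02d' "$n")" -prune -o )
    done
    find "$root" "${prune[@]}" -type f -name '*.bz2' -print
}

# Example:
# find_bz2_skipping . 5 25
```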
Personally, I find the find command somewhat cumbersome as a standalone tool. Therefore, I always end up using a combination of find, just for the recursive file search, and grep to do the actual exclusion/inclusion. Finally I hand the results over to a third command which performs the action, like rm to remove files for example.
My generic command would look something like this:
find [root-path] | grep (-v)? -E "cond1|cond2|...|condN" | [action-performing-tool]
root-path is where to start the search recursively
the optional -v inverts the matching results.
cond1 - condN are the conditions for the matching. When -v is involved, these are the conditions not to match.
the action-performing-tool does the actual work
For example you want to remove all files not matching some conditions in the current directory:
find . -not -name "\." | grep -v -E "cond1|cond2|cond3|...|condN" | xargs rm -rf
As you can see, we search the current directory, indicated by the dot as root-path; then we invert the matching results, because we want all files not matching our conditions; and finally we pass all files found to rm in order to delete them. I add -rf to delete recursively and forcibly. I used the find command with -not -name "\." to exclude the current directory itself, which is normally indicated by the dot.
For the actual question: assume we have a directory containing .git and .metadata directories and we want to exclude them from our search:
find . -not -name "\." | grep -v -E "\.git|\.metadata" | [action-performing-tool]
Hope that helps!
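Since a newline-separated pipeline breaks on file names containing spaces or newlines when it reaches xargs, here is a hedged NUL-delimited sketch of the same find, grep, action pattern. It assumes the GNU versions of the tools (find -print0, grep -z, xargs -0 -r); the function name list_excluding and the anchored regex are illustrative additions, and printf stands in for the action-performing tool.

```bash
# Hypothetical helper: list everything under ROOT except paths that have a
# component matching the extended regex NAMES (e.g. '\.git|\.metadata').
list_excluding() {
    # Every path travels through the pipeline as one NUL-terminated record.
    find "$1" -mindepth 1 -print0 |
        grep -zvE "(^|/)($2)(/|\$)" |
        xargs -0 -r printf '%s\n'
}

# Example:
# list_excluding . '\.git|\.metadata'
```

Anchoring the pattern between slashes means a file such as .gitignore is kept, while .git itself and everything below it is dropped.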
If you want to exclude a child directory under a parent directory, then this might be useful:
E.g., you have a parent directory "ParentDir" and it has two child directories, "Child1" and "Child2". You want to read files from "Child2" only and skip "Child1". Then this will help.
find ./ParentDir ! -path "./ParentDir/Child1*" -name "*.<extension>"

Using Perl-based rename command with find in Bash

I just stumbled upon Perl today while playing around with Bash scripting. When I tried to remove blank spaces in multiple file names, I found this post, which helped me a lot.
After a lot of struggling, I finally understand the rename and substitution commands and their syntax. I wanted to try to replace all "_(x)" at the end of file names with "x", due to duplicate files. But when I try to do it myself, it just does not seem to work. I have three questions with the following code:
Why is nothing executed when I run it?
I used redirection to show me the success note as an error, so I know what happened. What did I do wrong about that?
After a lot of research, I still do not entirely understand file descriptors and redirection in Bash, or the syntax for the substitute function in Perl. Can somebody give me a link to a good tutorial?
find -name "*_(*)." -type f | \
rename 's/)././g' && \
find -name "*_(*." -type f | \
rename 's/_(//g' 2>&1
You either need to use xargs or you need to use find's ability to execute commands:
find -name "*_(*)." -type f | xargs rename 's/)././g'
find -name "*_(*." -type f | xargs rename 's/_(//g'
Or:
find -name "*_(*)." -type f -exec rename 's/)././g' {} +
find -name "*_(*." -type f -exec rename 's/_(//g' {} +
In both cases, the file names are added to the command line of rename. As it was, rename would have to read its standard input to discover the file names — and it doesn't.
Does the first find find the files you want? Is the dot at the end of the pattern needed? Do the regexes do what you expect? OK, let's debug some of those too.
You could do it all in one command with a more complex regex:
find . -name "*_(*)" -type f -exec rename 's/_\((\d+)\)$/$1/' {} +
The find pattern is corrected to lose the requirement of a trailing dot. If the _(x) is inserted before the extension, then you'd need "*_(*).*" as the pattern for find (and you'll need to revise the Perl regexes).
The Perl substitute needs dissection:
The \( matches an open parenthesis.
The ( starts a capture group.
The \d+ looks for 'one or more digits'.
The ) stops the capture group. It is the first and only, so it is given the number 1.
The \) matches a close parenthesis.
The $ matches the end of the file name.
The $1 in the replacement puts the value of capture group 1 into the replacement text.
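To try the pattern on plain strings without renaming anything, the same substitution can be approximated with sed -E. This is a sketch, not the rename command itself: [0-9] stands in for Perl's \d, \1 for $1, and the function name strip_paren_suffix is made up.

```bash
# Hypothetical helper: print NAME with a trailing "_(digits)" replaced by
# the digits alone; names without that suffix are printed unchanged.
strip_paren_suffix() {
    printf '%s\n' "$1" | sed -E 's/_\(([0-9]+)\)$/\1/'
}

# Examples:
# strip_paren_suffix 'photo_(2)'        # prints photo2
# strip_paren_suffix 'report_(10).txt'  # unchanged: the suffix is not at the end
```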
In your code, the 2>&1 sent the error messages from the second rename command to standard output instead of standard error. That really doesn't help much here.
You need two separate tutorials; you are not going to find one tutorial that covers I/O redirection in Bash and regular expressions in Perl.
The 'official' Perl regular expression tutorial is:
perlretut, also available as perldoc perlretut on your machine.
The Bash manual covers I/O redirection, but it is somewhat terse:
I/O Redirections.

Find Directories With No Files in Unix/Linux

I have a list of directories
/home
/dir1
/dir2
...
/dir100
Some of them have no files in them. How can I use Unix find to list those?
I tried
find . -name "*" -type d -size 0
Doesn't seem to work.
Does your find have predicate -empty?
You should be able to use find . -type d -empty
If you're a zsh user, you can always do this. If you're not, maybe this will convince you:
echo **/*(/^F)
**/* will expand to every child node of the present working directory and the () is a glob qualifier. / restricts matches to directories, and F restricts matches to non-empty ones. Negating it with ^ gives us all empty directories. See the zshexpn man page for more details.
-empty reports empty leaf dirs.
If you want to find empty trees then have a look at:
http://code.google.com/p/fslint/source/browse/trunk/fslint/finded
Note that script can't be used without the other support scripts,
but you might want to install fslint and use it directly?
You can also use:
find . -type d -links 2
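On traditional Unix filesystems a directory has two links (its name in the parent and its own . entry), and each subdirectory adds one more through its .. entry. Regular files do not add links, so this matches directories that have no subdirectories.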
The answer of Pimin Konstantin Kefalou prints folders that have only 2 links even when they contain other files.
The easiest way I have found is:
for directory in $(find . -type d); do
# -z is true when the inner find prints nothing, i.e. the directory has no files
if [ -z "$(find "$directory" -maxdepth 1 -type f)" ]; then echo "$directory"
fi
done
If you have names with spaces, quote "$directory"; note that the for loop itself still splits such names on whitespace.
You can replace . by your reference folder.
I haven't been able to do it with one find instruction.
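For what it is worth, the loop can be folded into a single outer find that runs a small test per directory. This is a sketch assuming GNU find (-maxdepth, -quit); it still starts an inner find for each directory, and the name dirs_without_files is hypothetical.

```bash
# Hypothetical helper: print every directory under ROOT that has no regular
# file as a direct child (it may still contain subdirectories).
dirs_without_files() {
    # -exec ... \; acts as a test here: -print only runs when sh exits with 0.
    find "$1" -type d -exec sh -c '
        [ -z "$(find "$1" -maxdepth 1 -type f -print -quit)" ]
    ' sh {} \; -print
}

# Example:
# dirs_without_files .
```

Unlike the for loop, this passes each directory name as a single argument, so names with spaces are handled.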
