Get file globs from a file and find these files - linux

I have a file called "file.txt" and it contains globs. Contents:
*.tex
*.pdf
*C*.png
I need to read these globs from the file and then find the files matching them in the current directory (preferably using find, but anything else is fine too).
I used
grep "" file.txt | xargs find . -name
but I get this error:
find: paths must precede expression: `*.pdf'
I'm using Ubuntu.

The original code needs the -n 1 argument to be passed to xargs, so that only one glob is passed to each copy of find, since each glob expression needs to be preceded by -name (and joined to any other -name expressions with -o, the "or" operator).
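That minimal fix to the original attempt looks like this (one find run per glob, so three passes over the directory here):
grep "" file.txt | xargs -n 1 find . -name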
More efficient is to run find just once, after constructing an expression that puts all your -name operators on a single command line, separated with -os.
#!/usr/bin/env bash
# ^^^^- MUST be run with bash, not /bin/sh
find_expr=( -false )                # seed the expression so every real test can be added with -o
while IFS= read -r line; do
  find_expr+=( -o -name "$line" )   # one "-o -name <glob>" per input line
done <file.txt
find . '(' "${find_expr[@]}" ')' -print
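With the sample file.txt above, the array expands so that the script effectively runs:
find . '(' -false -o -name '*.tex' -o -name '*.pdf' -o -name '*C*.png' ')' -print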


Passing linux command as a command line argument to shell script

I have the following command:
"find . -type f -regextype posix-extended -regex './ctrf.*|./rbc.*' -exec basename {} \;"
and I want to execute it from a shell script.
I am storing the command in a variable in the shell script, like:
Find_Command=$1
and executing it with:
Files="$(${Find_Command})"
It is not working.
Best Practice: Accept An Array, Not A String
First, your shell script should take the command to run as a series of separate arguments, not a single argument.
#!/usr/bin/env bash
readarray -d '' Files < <("$@")
echo "Found ${#Files[@]} files" >&2
printf ' - %q\n' "${Files[@]}"
called as:
./yourscript find . -type f -regextype posix-extended -regex './ctrf.*|./rbc.*' -printf '%f\0'
Note that there's no reason to use the external basename command: find -printf can print just the filename for you directly.
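For example, with GNU find this prints just the filename (no directory part) for every regular file:
find . -type f -printf '%f\n'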
Fallback: Parsing A String To An Array Correctly
If you must accept a string, you can use the answers in Reading quoted/escaped arguments correctly from a string to convert that string to an array safely.
Compromising complete shell compatibility to avoid needing nonstandard tools, we can use xargs:
#!/usr/bin/env bash
readarray -d '' Command_Arr < <(xargs printf '%s\0' <<<"$1")
readarray -d '' Files < <("${Command_Arr[@]}")
echo "Found ${#Files[@]} files" >&2
printf ' - %q\n' "${Files[@]}"
...with your script called as:
./yourscript $'find . -type f -regextype posix-extended -regex \'./ctrf.*|./rbc.*\' -printf \'%f\\0\''
If you want to run a command specified in a variable and save the output in another variable, you can use the following commands.
command="find something"
output=$($command)
Or if you want to store output in array:
typeset -a output=( $($command) )
However, storing filenames in variables and then accessing files through those variables is a bad idea: filenames can contain any character except NUL, so there is no safe delimiter you can use to separate them (see https://mywiki.wooledge.org/BashPitfalls).
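The standard workaround is to delimit on NUL instead; a minimal sketch:
find . -type f -print0 |
while IFS= read -r -d '' file; do
    printf 'Found: %s\n' "$file"
done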
I'm not sure what you're trying to accomplish, but your find command contains an error: the -exec option must be terminated with \; to mark the end of the -exec parameters. Aside from that, this looks like the XY problem; see https://xyproblem.info/
If you want to get the basenames of regular files with the extension .ctrf or .rbc, use the bash script below.
for x in **/*.+(ctrf|rbc); do basename "$x"; done
Or zsh script
basename **/*.(ctrf|rbc)(#q.)
Make sure you have enabled the 'extended glob' option in your shell; the bash version also needs 'globstar' so that ** recurses into subdirectories.
To enable both in bash, run the following command.
shopt -s extglob globstar
And for zsh
setopt extendedglob
You should use an array instead of a string for Find_Command:
#!/usr/bin/env bash
Find_Command=(find . -type f -regextype posix-extended -regex '(./ctrf.*|./rbc.*)' -exec basename {} \;)
Files=($("${Find_Command[@]}"))
The second statement assumes you don't have special characters (like spaces) in your file names.
Use eval:
Files=$(eval "${Find_Command}")
Be mindful of keeping the parameter sanitized and secure.
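For example (the find command here is only an illustration):
Find_Command='find . -type f -name "*.txt"'
Files=$(eval "$Find_Command")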

How to read out a file line by line and for every line do a search with find and copy the search result to destination?

I hope you can help me with the following problem:
The Situation
I need to find files in various folders and copy them to another folder. The files and folders can contain white spaces and umlauts.
The filenames contain an ID and a string like:
"2022-01-11-02 super important file"
The filenames I need to find are collected in a textfile named ids.txt. This file only contains the IDs but not the whole filename as a string.
What I want to achieve:
I want to read out ids.txt line by line.
For every line in ids.txt I want to do a find search and copy cp the result to destination.
So far I tried:
for n in $(cat ids.txt); do find /home/alex/testzone/ -name "$n" -exec cp {} /home/alex/testzone/output \; ; done
while read -r ids; do find /home/alex/testzone -name "$ids" -exec cp {} /home/alex/testzone/output \; ; done < ids.txt
The output folder remains empty. Not using -exec gives no search results either.
I was thinking that -name "$ids" is the root cause here: my files contain the ID plus a string, so I should search for names containing the ID followed by a variable string (a trailing *).
As the argument for -name I also tried "$ids*", "$ids""*" and so on, with no luck.
Is there an argument that I can use in conjunction with find instead of using the star in the -name argument?
Do you have any solution for me to automate this process in a bash script to read out ids.txt file, search the filenames and copy them over to specified folder?
In the end I would like to create a bash script that takes ids.txt and the search-folder and the output-folder as arguments like:
my-id-search.sh /home/alex/testzone/ids.txt /home/alex/testzone/ /home/alex/testzone/output
EDIT:
This is some example content of the ids.txt file where only ids are listed (not the whole filename):
2022-01-11-01
2022-01-11-02
2020-12-01-62
EDIT II:
Going on with the solution from tripleee:
#!/bin/bash
grep . "$1" | while read -r id; do
    echo "The search term is: $id"; echo
    find /home/alex/testzone -name "$id*" -exec cp {} /home/alex/testzone/ausgabe \;
done
In case my ids.txt file contains empty lines, -name "$id*" becomes -name "*", which in turn finds and copies all files.
Trying to prevent empty lines from being read does not seem to work; they should already be filtered out by the grep . "$1" | expression. What am I doing wrong?
If your destination folder is always the same, the quickest and absolutely most elegant solution is to run a single find command to look for all of the files.
sed 's/.*/-o\n-name\n&*/' ids.txt |
xargs sh -c 'find /home/alex/testzone -false "$@" -exec cp {} /home/alex/testzone/output \;' _
The -false predicate is a bit of a hack to allow the list of actual predicates to start with -o (as in "or").
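For the three IDs in the example ids.txt, the sed command hands find this predicate list (one token per line):
-o
-name
2022-01-11-01*
-o
-name
2022-01-11-02*
-o
-name
2020-12-01-62*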
This could fail if ids.txt is too large to fit into a single xargs invocation, or if your sed does not understand \n to mean a literal newline.
(Here's a fix for the latter case:
xargs printf '-o\n-name\n%s*\n' <ids.txt |
...
Still the inherent problem with using xargs find like this is that xargs could split the list between -o and -name or between -name and the actual file name pattern if it needs to run more than one find command to process all the arguments.
A slightly hackish solution to that is to ensure that each pair is a single string, and then separately split them back out again:
xargs printf '-o_-name_%s*\n' <ids.txt |
xargs bash -c 'arr=("$@"); find -false ${arr[@]/-o_-name_/-o -name } -exec cp {} "$0" \;' /home/alex/testzone/ausgabe
where we temporarily hold the arguments in an array where each file name and its flags is a single item, and then replace the flags into separate tokens. This still won't work correctly if the file names you operate on contain literal shell metacharacters like * etc.)
A more mundane solution fixes your while read attempt by adding the missing wildcard in the -name argument. (I also took the liberty to rename the variable, since read will only read one argument at a time, so the variable name should be singular.)
while read -r id; do
    find /home/alex/testzone -name "$id*" -exec cp {} /home/alex/testzone/output \;
done < ids.txt
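Folded into the my-id-search.sh wrapper the question asks for, that loop might look like this (a sketch: positional arguments as in the question, empty lines skipped, no error checking):
#!/bin/bash
# Usage: my-id-search.sh ids.txt search-folder output-folder
ids_file=$1
search_dir=$2
output_dir=$3
while read -r id; do
    [ -n "$id" ] || continue    # skip blank lines so -name "$id*" never degrades to -name "*"
    find "$search_dir" -name "$id*" -exec cp {} "$output_dir" \;
done < "$ids_file"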
Please try the following bash script copier.sh
#!/bin/bash
IFS=$'\n'           # make newlines the only separator
set -f              # disable globbing
file="files.txt"    # name of file containing filenames
finish="finish"     # destination directory
while read -r n; do (
    du -a | awk '{for(i=2;i<=NF;++i)printf $i" "; print " "}' |
        grep "$n" | sed 's/ *$//g' | xargs -I '{}' cp '{}' "$finish"
);
done < "$file"
which recursively copies all the files named in files.txt from . and its subdirectories to ./finish.
This version works even if there are spaces in the directory names or file names.

Passing filename as variable from find's exec into a second exec command

From reading this stackoverflow answer I was able to remove the file extension from the files using find:
find . -name "S4*" -execdir basename {} .fastq.gz ';'
returned:
S9_S34_R1_001
S9_S34_R2_001
I'm making a batch script where I want to extract the filenames with the above prefix to pass as arguments into a program. At the moment I'm doing this with a loop, but I'm wondering if it can be achieved using find.
for i in $(ls | grep 'S9_S34*' | cut -d '.' -f 1); do echo "$i"_trim.log "$i"_R1_001.fastq.gz "$i"_R2_001.fastq.gz; done >> trim_script.sh
Is it possible to do something as follows:
find . -name "S4*" -execdir basename {} .fastq.gz ';' | echo {}_trim.log {}_R1_001.fastq.gz {}_R2_001.fastq.gz {}\ ; >> trim_script.sh
You don't need basename at all, or -exec, if all you're doing is generating a series of strings that contain your files' names within them; the -printf action included in GNU find can do all that for you, as it provides a %P directive that inserts the file's name with the starting-point directory stripped off:
find . -name "S4*" \
-printf '%P_trim.log %P_R1_001.fastq.gz %P_R2_001.fastq.gz %P\n' \
>trim_script.sh
That said, be sure you only do this if you trust your filenames. If you're truly running the result as a script, there are serious security concerns if someone could create a S4$(rm -rf ~).txt file, or something with a similarly malicious name.
What if you don't trust your filenames, or don't have the GNU version of find? Then consider making find pass them into a shell (like bash or ksh) that supports the %q extension, to generate a safely-escaped version of those names (note that you should run the script with the same interpreter you used for this escaping):
find . -name "S4*" -exec bash -c '
for file do # iterates over "$@", so processes each file in turn
file=${file##*/} # get the basename
printf "%q_trim.log %q_R1_001.fastq.gz %q_R2_001.fastq.gz %q\n" \
"$file" "$file" "$file" "$file"
done
' _ {} + >trim_script.sh
Using -exec ... {} + invokes the smallest possible number of subprocesses -- not one per file found, but instead one per batch of filenames (using the largest possible batch that can fit on a command line).
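To see the batching, compare the two terminators (echo stands in for any command; '*.log' is just an illustration):
find . -name '*.log' -exec echo {} \;    # one echo process per file found
find . -name '*.log' -exec echo {} +     # one echo process per batch of files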

Terminal find Command: Manipulate Output String

I am trying to manipulate the filename from the find command:
find . -name "*.xib" -exec echo '{}' ';'
For example, this might print:
./Views/Help/VCHelp.xib
I would like to make it:
./Views/Help/VCHelp.strings
What I tried:
find . -name "*.xib" -exec echo ${'{}'%.*} ';'
But, the '{}' is not being recognized as a string or something...
I also tried the following:
find . -name "*.xib" -exec filename='{}' ";" -exec echo ${filename%.*} ";"
But it is trying to execute a command called "filename" instead of assigning the variable:
find: filename: No such file or directory
You can't use Parameter Expansion on a literal string. Try storing it in a variable first:
find . -name '*.xib' -exec bash -c "f='{}' ; echo \${f%.xib}.strings" \;
-exec sees the first argument after it as the command, therefore you can't simply give it filename='{}': find doesn't use sh to execute what you give it. If you want to run shell constructs, you need to wrap them in sh or bash.
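A more defensive version of that wrapper (a common hardening, not part of the original answer) passes the filename in as a positional argument instead of splicing {} into the script text:
find . -name '*.xib' -exec bash -c 'echo "${1%.xib}.strings"' _ {} \;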
Or use sed:
find . -name '*.xib' | sed 's/\.xib$/.strings/'
For such a simple search, you can use a pure bash solution:
#!/bin/bash
shopt -s globstar
shopt -s nullglob
found=( **/*.xib )
for f in "${found[@]}"; do
echo "${f%xib}strings"
done
Turning the globstar shell option on enables ** to "match all files and zero or more directories and subdirectories" (as quoted from the bash reference manual), so **/*.xib matches .xib files at any depth. The nullglob option helps if there's no match: in that case, the glob expands to nothing instead of the literal **/*.xib. This solution is safe for filenames containing unusual characters.
find . -name "*.xib" | sed -e 's/\.xib$/.strings/'

Strip leading dot from filenames bash script

I have some files in a bunch of directories that have a leading dot and thus are hidden. I would like to revert that and strip the leading dot.
I was unsuccessful with the following:
for file in `find files/ -type f`;
do
base=`basename $file`
if [ `$base | cut -c1-2` = "." ];
then newname=`$base | cut -c2-`;
dirs=`dirname $file`;
echo $dirs/$newname;
fi
done
Which fails on the condition statement:
[: =: unary operator expected
Furthermore, some files have a space in their names, and the loop splits those names apart.
Any help would be appreciated.
The easiest way to delete something from the start of a variable is to use ${var#pattern}.
$ FILENAME=.bashrc; echo "${FILENAME#.}"
bashrc
$ FILENAME=/etc/fstab; echo "${FILENAME#.}"
/etc/fstab
See the bash man page:
${parameter#word}
${parameter##word}
The word is expanded to produce a pattern just as in pathname expansion. If the pattern matches the beginning of the value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the "#" case) or the longest matching pattern (the "##" case) deleted.
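The difference between the two forms shows up when the pattern can match prefixes of several lengths:
$ v=archive.tar.gz
$ echo "${v#*.}"     # shortest match of '*.' deleted
tar.gz
$ echo "${v##*.}"    # longest match of '*.' deleted
gz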
By the way, with a more selective find command you don't need to do all the hard work. You can have find only match files with a leading dot:
find files/ -type f -name '.*'
Throwing that all together, then:
find files/ -type f -name '.*' -printf '%p\0' |
while read -r -d $'\0' path; do
    dir=$(dirname "$path")
    file=$(basename "$path")
    mv "$dir/$file" "$dir/${file#.}"
done
Additional notes:
To handle file names with spaces properly you need to quote variable names when you reference them. Write "$file" instead of just $file.
For extra robustness the -printf '\0' and read -d $'\0' use NUL characters as delimiters so even file names with embedded newlines '\n' will work.
find files/ -name '.*' -printf '%f\n' | while read -r f; do
    mv "files/$f" "files/${f#.}"
done
This script:
- works with any file you can throw at it, even if its name contains spaces, newlines or other nefarious characters
- works no matter how many subdirectories deep the hidden file is
- unlike the other answers so far, doesn't require you to change the rest of the script when you change the path given to find
Note: I included an echo so that you can test it as a dry run first. Remove the single echo if you are satisfied with the results.
find . -name '.*' -exec sh -c 'for arg; do d="${arg%/*}"; f="${arg##*/}"; echo mv "$arg" "$d/${f#.}"; done' _ {} +
