How to assign the output of find to an array - Linux

In Linux shell scripting I am trying to store the output of find in an array, as below:
#!/bin/bash
arr=($(find . -type -f))
but it gives an error saying -type should contain only one character. Can anybody tell me where the issue is?
Thanks

If you are using bash 4, the readarray command can be used along with process substitution.
readarray -t arr < <(find . -type f)
Properly supporting all file names, including those that contain newlines, requires a bit more work, along with a version of find that supports -print0:
while read -d '' -r; do
arr+=( "$REPLY" )
done < <(find . -type f -print0)
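With a newer bash (4.4 or later), readarray also accepts a -d option, so the NUL-delimited find output can be loaded without an explicit loop; a minimal sketch:
readarray -d '' arr < <(find . -type f -print0)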

I suggest the following script:
#!/bin/bash
listoffiles=$(find . -type f)
nfiles=$(echo "${listoffiles}" | wc -l)
unset myarray
for i in $(seq 1 ${nfiles}) ; do
myarray[$((i-1))]=$(echo "${listoffiles}" | sed -n $i'{p;q}')
done
You cannot rely on Bash's automatic array instantiation through the myarr=( one two three ) syntax, because it treats all whitespace it sees within the parentheses the same way (spaces as well as newlines). So you have to handle the resulting multiline variable listoffiles more or less manually, which is what the above script does.
echo without the -n option prints a trailing newline at the very end of the variable, but that's fine in our case because find doesn't emit one itself (you can check this with echo -n "${listoffiles}").
And I use sed to extract the relevant i-th line, with $i being expanded by the shell before being handed to sed as the address at the start of sed's own script.
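To see the whitespace problem concretely, try the naive array assignment on a file whose name contains a space (a throwaway demo directory via mktemp, so it doesn't touch your files):
cd "$(mktemp -d)"
touch "hello world"
arr=($(find . -type f))
echo "${#arr[@]}"    # prints 2: "./hello world" was split into two elements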

Related

listing file in unix and saving the output in a variable (Oldest File fetching for a particular extension)

This might be a very simple thing for a shell scripting programmer, but I am pretty new to it. I was trying to execute the below command in a shell script and save the output into a variable:
inputfile=$(ls -ltr *.{PDF,pdf} | head -1 | awk '{print $9}')
The command works fine when I run it from the terminal but fails when executed through a shell script (sh). Why does the command fail? Does it mean the shell script doesn't support the command, or am I doing it wrong? Also, how do I know whether a command will work in a shell script or not?
Just to give you a glimpse of my requirement: I was trying to get the oldest file from a particular directory (I also want to make sure upper-case and lower-case extensions are handled). Is there any other way to do this?
The above command will work correctly only if BOTH *.pdf and *.PDF files exist in the directory you are currently in.
If you would like to execute it in a directory with only one of those you should consider using e.g.:
inputfiles=$(find . -maxdepth 1 -type f \( -name "*.pdf" -or -name "*.PDF" \) | xargs ls -1tr | head -1 )
NOTE: The above command doesn't work with file names containing newlines, or with a very long list of found files.
Parsing ls is always a bad idea. You need another strategy.
How about making a function that gives you the oldest file among the ones given as arguments? The following works in Bash (adapt to your needs):
get_oldest_file() {
# get oldest file among files given as parameters
# return is in variable get_oldest_file_ret
local oldest f
for f do
[[ -e $f ]] && [[ ! $oldest || $f -ot $oldest ]] && oldest=$f
done
get_oldest_file_ret=$oldest
}
Then just call as:
get_oldest_file *.{PDF,pdf}
echo "oldest file is: $get_oldest_file_ret"
Now, you probably don't want to use brace expansions like this at all. In fact, you very likely want to use the shell options nocaseglob and nullglob:
shopt -s nocaseglob nullglob
get_oldest_file *.pdf
echo "oldest file is: $get_oldest_file_ret"
If you're using a POSIX shell, it's going to be a bit trickier to have the equivalent of nullglob and nocaseglob.
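For the record, a rough sketch of one POSIX-leaning workaround: list the spellings explicitly and skip patterns that didn't match anything. Note this only covers .pdf/.PDF, and the -ot test operator, while widely supported, is an extension to POSIX test:
oldest=
for f in *.pdf *.PDF; do
    [ -e "$f" ] || continue    # unmatched patterns stay literal; skip them
    if [ -z "$oldest" ] || [ "$f" -ot "$oldest" ]; then
        oldest=$f
    fi
done
echo "oldest file is: $oldest"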
Is perl an option? It's ubiquitous on Unix.
I would suggest:
perl -e 'print ((sort { -M $b <=> -M $a } glob ( "*.{pdf,PDF}" ))[0]);';
Which:
uses glob to fetch all files matching the pattern.
sorts, using -M, which is relative modification time (in days).
fetches the first element ([0]) off the sort.
Prints that.
As @gniourf_gniourf says, parsing ls is a bad idea: it means leaving globs unquoted, and it generally fails to account for funny characters in file names.
find is your friend:
#!/bin/sh
get_oldest_pdf() {
#
# echo path of oldest *.pdf (case-insensitive) file in current directory
#
find . -maxdepth 1 -mindepth 1 -iname "*.pdf" -printf '%T@ %p\n' \
| sort -n \
| head -1 \
| cut -d' ' -f2-
}
whatever=$(get_oldest_pdf)
Notes:
find has numerous ways of formatting the output, including
things like access time and/or write time. I used '%T@ %p\n',
where %T@ is the last write time in UNIX time format, including the fractional part.
That timestamp never contains a space, so it's safe to use space as the separator.
The numeric sort and head -1 pick the oldest item, sorting by the time,
and cut removes the time from the output.
I used the (in my opinion) much easier to read and maintain pipe notation, with the help of \.
The shell code should run on any POSIX shell, though options such as -printf, -maxdepth and -mindepth are GNU find extensions.
You could easily adjust the function to parametrize the pattern, the
time used (access/write), the search depth or the starting directory.
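Following up on that last note, a hypothetical parametrized variant (the name get_oldest and its arguments are mine, not from the original) might look like:
get_oldest() {
    # usage: get_oldest PATTERN [MAXDEPTH]
    # echoes the path of the oldest match, using the same pipeline as above
    find . -maxdepth "${2:-1}" -mindepth 1 -iname "$1" -printf '%T@ %p\n' \
        | sort -n \
        | head -1 \
        | cut -d' ' -f2-
}
oldest_pdf=$(get_oldest '*.pdf')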

Bash loop through directory including hidden file

I am looking for a way to write a simple loop in bash over everything my directory contains, i.e. files, directories and links, including hidden ones.
I would prefer it to be specifically in bash, but it should be as general as possible. Of course, file names (and directory names) can contain whitespace, line breaks, and symbols. Everything but "/" and ASCII NUL (0x00), even as the first character. Also, the result should exclude the . and .. directories.
Here is a generator of files the loop has to deal with:
#!/bin/bash
mkdir -p test
cd test
touch A 1 ! "hello world" \$\"sym.dat .hidden " start with space" $'\n start with a newline'
mkdir -p ". hidden with space" $'My Personal\nDirectory'
So my loop should look like this (but it has to deal with the tricky stuff above):
for i in * ; do
echo ">$i<"
done
My closest try was the use of ls and a bash array, but it is not working:
IFS=$(echo -en "\n\b")
l=( $(ls -A .) )
for i in ${l[@]} ; do
echo ">$i<"
done
unset IFS
Or using bash arrays, but the ".." directory is not excluded:
IFS=$(echo -en "\n\b")
l=( [[:print:]]* .[[:print:]]* )
for i in ${l[@]} ; do
echo ">$i<"
done
unset IFS
* doesn't match files beginning with ., so you just need to be explicit:
for i in * .[^.]*; do
echo ">$i<"
done
.[^.]* will match all files and directories starting with ., followed by a non-. character, followed by zero or more characters. In other words, it's like the simpler .*, but excludes . and ... If you need to match something like ..foo, then you might add ..?* to the list of patterns.
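In bash specifically, another option worth mentioning is the dotglob shell option, which makes * match hidden entries as well (globs never produce . and .., so no filtering is needed); a minimal sketch:
shopt -s dotglob nullglob
for i in *; do
    echo ">$i<"
done
shopt -u dotglob nullglob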
As chepner noted in the comments below, this solution assumes you're running GNU bash along with GNU find and GNU sort...
GNU find can be prevented from recursing into subdirectories with the -maxdepth option. Then use -print0 to end every filename with a 0x00 byte instead of the newline you'd usually get from -print.
The sort -z sorts the filenames between the 0x00 bytes.
Then, you can use sed to get rid of the dot and dot-dot directory entries (although GNU find seems to exclude the .. already).
I also used sed to get rid of the ./ in front of every filename. basename could do that too, but older systems didn't have basename, and you might not trust it to handle the funky characters right.
(These sed commands each required two cases: one for a pattern at the start of the string, and one for the pattern between 0x00 bytes. These were so ugly I split them out into separate functions.)
The read command doesn't have a -z or -0 option like some commands, but you can fake it with -d "" and blanking the IFS environment variable.
The additional -r option prevents a backslash-newline combo from being interpreted as a line continuation. (A file called backslash\\nnewline would otherwise be mangled to backslashnewline.) It might be worth seeing if other backslash-combos get interpreted as escape sequences.
remove_dot_and_dotdot_dirs()
{
sed \
-e 's/^[.]\{1,2\}\x00//' \
-e 's/\x00[.]\{1,2\}\x00/\x00/g'
}
remove_leading_dotslash()
{
sed \
-e 's/^[.]\///' \
-e 's/\x00[.]\//\x00/g'
}
IFS=""
find . -maxdepth 1 -print0 |
sort -z |
remove_dot_and_dotdot_dirs |
remove_leading_dotslash |
while read -r -d "" filename
do
echo "Doing something with file '${filename}'..."
done
It may not be the most favorable way, but I tried the thing below:
while read line ; do echo $line; done <<< $(ls -a | grep -v -w ".")
Try the find command, something like:
find .
That will list all the files in all directories, recursively.
To output only regular files, excluding the . entry and the leading ./ prefix, try:
find . -type f -printf %P\\n
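And to list everything directly under the current directory (not just regular files), hidden entries included, while dropping . and the leading ./, this GNU find variant should work:
find . -mindepth 1 -maxdepth 1 -printf '%P\n'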

Optional Command line parameter Shell

Let's call the following replace.sh:
if "$1" !=""
then
REPLACE_AS=$1
else
REPLACE_AS="Tebow"
fi
find . ! -regex ".*[/]\.svn[/]?.*" -type f -print0 | xargs -0 -n 1 sed -i -e 's/SANCHEZ/'$REPLACE_AS'/g'
Sorry for the primitive question. I am trying to make the command line parameter optional, i.e. if someone doesn't pass one and just runs this script, it uses Tebow. That seems to work. However, if I run the script with a command line argument, it doesn't work.
For example:
./test.sh
this will replace it with Tebow.
however
./test.sh Smith
will not replace the Sanchez string with Smith.
You can use shell parameter expansion and the notation:
REPLACE_AS="${1:-Tebow}"
to do in one line what you do in 6.
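A quick illustration of the :- operator:
set --               # no positional parameters
echo "${1:-Tebow}"   # prints: Tebow
set -- Smith
echo "${1:-Tebow}"   # prints: Smith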
Additionally, your code as written should be:
if [ "$1" != "" ]
then REPLACE_AS="$1"
else REPLACE_AS="Tebow"
fi
The test command is [; it needs spaces around its operands and a ] at the end. You need quotes around "$1" in case it contains spaces; the quotes around "Tebow" are optional because it doesn't contain spaces (but the uniformity is good).
There's nothing to stop you writing:
xargs -0 -n 1 sed -i -e 's/SANCHEZ/'"${1:-Tebow}"'/g'
but the clarity of the variable is good, especially if you'll refer to it several times.
Also, I would leave the pipe at the end of the line and start the second command (xargs) on the next line, for clarity (which, again, is very important):
find . ! -regex ".*[/]\.svn[/]?.*" -type f -print0 |
xargs -0 -n 1 sed -i -e 's/SANCHEZ/'"$REPLACE_AS"'/g'
Sometimes, but not often, I'll indent the second command.
Note the double quotes around "$REPLACE_AS"; it prevents problems with spaces in the replacement text.

ambiguous redirection

I'm trying to go through the current directory and all subdirectories, and add some annotations to each file that ends in .sql.
Here's a snippet of the code:
HEADER="--SQL HEADER"
for f in 'find . -name *.sql';
do
echo $f
echo -e $HEADER > $f.tmp;
FNAME=${f//\//_/};
echo -e "\n\n--MORE ANNOTATIONS ${FNAME%.*}:1" >> $f.tmp;
cat $f >> $f.tmp;
mv $f.tmp $f;
rm $f.tmp
done;
I'm a beginner at bash, so I think some of the errors I'm getting might be due to the find statement in the loop.
But this is the error I get:
find . -name X.sql A.sql W.sql E.sql S.sql
./annotate.sh: line 6: $f.tmp: ambiguous redirect
./annotate.sh: line 8: $f.tmp: ambiguous redirect
./annotate.sh: line 9: $f.tmp: ambiguous redirect
mv: invalid option -- n
Try `mv --help' for more information.
rm: invalid option -- n
Try `rm --help' for more information.
any help would be greatly appreciated =)
Here's the problem. Your "echo" gives it away:
echo $f
outputs
find . -name X.sql A.sql W.sql E.sql S.sql
I think the problem is you have straight single quotes ('') in the find command, instead of backquotes (``). So it's not really running find, but simply expanding the wildcards.
You may have to quote the wildcard so it gets passed to find instead of evaluated by the shell:
for f in `find . -name \*.sql`;
However, there are several problems in your script, which you should address if you want to use it more than once. See ormaaj's answer.
The problem, as already pointed out, is that find isn't actually being executed. However, this pattern is very wrong anyway. Iterating with a for loop over the output of a command substitution doesn't work: splitting the output into words requires word-splitting, which in turn requires leaving the expansion unquoted, and that is broken even if pathname expansion is disabled, because filenames can contain newlines.
Preferably, use -exec. First write this script to a file and chmod u+x scriptname:
#!/usr/bin/env bash
header="--SQL HEADER"
for f in "$#"; do
echo "$f" >&2
fname=${f//\//_/}
cat - "$f" <<EOF >"$f.tmp"
${header}$'\n\n'
--MORE ANNOTATIONS ${fname%.*}:1
EOF
mv "$f.tmp" "$f"
done
Then run find like this:
find . -name '*.sql' -exec scriptname {} +
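If you'd rather not keep a separate script file, the same body can be passed to an inline shell with bash -c; here the trailing _ fills $0 so the found files become the positional parameters:
find . -name '*.sql' -exec bash -c '
    for f do
        # same body as the script above
        echo "$f" >&2
    done
' _ {} +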
Alternatively, (and assuming this is a recent version of Bash), use globstar and no find (ksh has a similar feature if you prefer). This may be slower depending upon the job - the shell must pre-generate the list of files.
#!/usr/bin/env bash
shopt -s globstar
for f in ./**/*.sql; do
...
Alternatively, if you have Bash 4 and a system with the necessary GNU utilities, use -print0.
find . -name '*.sql' -print0 | while IFS= read -rd '' f; do
# <body of the above for loop here>
done
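One caveat with this pipe form: the while loop runs in a subshell, so any variables you set inside it (a counter, an array) are lost when the loop ends. If that matters, feed the loop from process substitution instead:
while IFS= read -rd '' f; do
    # <body of the above for loop here>
done < <(find . -name '*.sql' -print0)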
See: http://mywiki.wooledge.org/UsingFind

ls conditional in a linux script for loop

I'm trying to create a simple for loop that will take all the outputs of an ls command and put each output into a variable. So far I have:
for i in ls /home/svn
do
echo $i
done
but it gives me an error.
Because the ls needs to be executed:
for i in $(ls ...); do
echo $i
done
Also, you might want to consider shell globbing instead:
for i in /home/svn/*; do
echo $i
done
... or find, which allows very fine-grained selection of the properties of items to find:
for i in $(find /home/svn -type f); do
echo $i
done
Furthermore, if you can have white space in the segments of the path or the file names themselves, use a while loop (previous example adjusted):
find /home/svn -type f | while IFS= read -r i; do
echo $i
done
while reads the input line-wise; the empty IFS and the -r option make sure that whitespace and backslashes in the names are preserved.
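If the file names may even contain newlines, the NUL-delimited variant of the same loop is safer (assuming your find supports -print0):
find /home/svn -type f -print0 | while IFS= read -r -d '' i; do
    echo "$i"
done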
Concerning the calling of basename you have two options:
# Call basename
echo $(basename $i)
# ... or use string substitution
echo ${i##*/}
To explain the substitution: # removes the shortest front-anchored match of the pattern from the string, ## removes the longest front-anchored match, % removes the shortest back-anchored match, and %% removes the longest back-anchored match.
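A quick demonstration of all four operators:
p=/home/svn/dir/file.tar.gz
echo "${p#*/}"     # home/svn/dir/file.tar.gz  (shortest prefix match of */)
echo "${p##*/}"    # file.tar.gz               (longest prefix match)
echo "${p%.*}"     # /home/svn/dir/file.tar    (shortest suffix match of .*)
echo "${p%%.*}"    # /home/svn/dir/file        (longest suffix match)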
You don't need to use ls to go over files in this case, the following will do the job: for i in /home/svn/*; do echo $i; done
You want to assign the output of the ls command to i, so you need to enclose it in backticks or the $() operator:
for i in $(ls /home/svn)
do
echo $i
done
That's because you're doing it wrong in the first place.
for i in /home/svn/*
do
echo "$i"
done
