Create new file but add number if filename already exists in bash - linux

I found similar questions but not in Linux/Bash
I want my script to create a file with a given name (via user input) but add number at the end if filename already exists.
Example:
$ create somefile
Created "somefile.ext"
$ create somefile
Created "somefile-2.ext"

The following script can help you. You should not be running several copies of the script at the same time to avoid race condition.
name=somefile
if [[ -e $name.ext || -L $name.ext ]] ; then
i=0
while [[ -e $name-$i.ext || -L $name-$i.ext ]] ; do
let i++
done
name=$name-$i
fi
touch -- "$name".ext

Easier:
touch file`ls file* | wc -l`.ext
You'll get:
$ ls file*
file0.ext file1.ext file2.ext file3.ext file4.ext file5.ext file6.ext

To avoid the race conditions:
name=some-file
n=
set -o noclobber
until
file=$name${n:+-$n}.ext
{ command exec 3> "$file"; } 2> /dev/null
do
((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3
And in addition, you have the file open for writing on fd 3.
With bash-4.4+, you can make it a function like:
create() { # fd base [suffix [max]]]
local fd="$1" base="$2" suffix="${3-}" max="${4-}"
local n= file
local - # ash-style local scoping of options in 4.4+
set -o noclobber
REPLY=
until
file=$base${n:+-$n}$suffix
eval 'command exec '"$fd"'> "$file"' 2> /dev/null
do
((n++))
((max > 0 && n > max)) && return 1
done
REPLY=$file
}
To be used for instance as:
create 3 somefile .ext || exit
printf 'File: "%s"\n' "$REPLY"
echo something >&3
exec 3>&- # close the file
The max value can be used to guard against infinite loops when the files can't be created for other reason than noclobber.
Note that noclobber only applies to the > operator, not >> nor <>.
Remaining race condition
Actually, noclobber does not remove the race condition in all cases. It only prevents clobbering regular files (not other types of files, so that cmd > /dev/null for instance doesn't fail) and has a race condition itself in most shells.
The shell first does a stat(2) on the file to check if it's a regular file or not (fifo, directory, device...). Only if the file doesn't exist (yet) or is a regular file does 3> "$file" use the O_EXCL flag to guarantee not clobbering the file.
So if there's a fifo or device file by that name, it will be used (provided it can be open in write-only), and a regular file may be clobbered if it gets created as a replacement for a fifo/device/directory... in between that stat(2) and open(2) without O_EXCL!
Changing the
{ command exec 3> "$file"; } 2> /dev/null
to
[ ! -e "$file" ] && { command exec 3> "$file"; } 2> /dev/null
Would avoid using an already existing non-regular file, but not address the race condition.
Now, that's only really a concern in the face of a malicious adversary that would want to make you overwrite an arbitrary file on the file system. It does remove the race condition in the normal case of two instances of the same script running at the same time. So, in that, it's better than approaches that only check for file existence beforehand with [ -e "$file" ].
For a working version without race condition at all, you could use the zsh shell instead of bash which has a raw interface to open() as the sysopen builtin in the zsh/system module:
zmodload zsh/system
name=some-file
n=
until
file=$name${n:+-$n}.ext
sysopen -w -o excl -u 3 -- "$file" 2> /dev/null
do
((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3

Try something like this
name=somefile
path=$(dirname "$name")
filename=$(basename "$name")
extension="${filename##*.}"
filename="${filename%.*}"
if [[ -e $path/$filename.$extension ]] ; then
i=2
while [[ -e $path/$filename-$i.$extension ]] ; do
let i++
done
filename=$filename-$i
fi
target=$path/$filename.$extension

Use touch or whatever you want instead of echo:
echo file$((`ls file* | sed -n 's/file\([0-9]*\)/\1/p' | sort -rh | head -n 1`+1))
Parts of expression explained:
list files by pattern: ls file*
take only number part in each line: sed -n 's/file\([0-9]*\)/\1/p'
apply reverse human sort: sort -rh
take only first line (i.e. max value): head -n 1
combine all in pipe and increment (full expression above)

Try something like this (untested, but you get the idea):
filename=$1
# If file doesn't exist, create it
if [[ ! -f $filename ]]; then
touch $filename
echo "Created \"$filename\""
exit 0
fi
# If file already exists, find a similar filename that is not yet taken
digit=1
while true; do
temp_name=$filename-$digit
if [[ ! -f $temp_name ]]; then
touch $temp_name
echo "Created \"$temp_name\""
exit 0
fi
digit=$(($digit + 1))
done
Depending on what you're doing, replace the calls to touch with whatever code is needed to create the files that you are working with.

This is a much better method I've used for creating directories incrementally.
It could be adjusted for filename too.
LAST_SOLUTION=$(echo $(ls -d SOLUTION_[[:digit:]][[:digit:]][[:digit:]][[:digit:]] 2> /dev/null) | awk '{ print $(NF) }')
if [ -n "$LAST_SOLUTION" ] ; then
mkdir SOLUTION_$(printf "%04d\n" $(expr ${LAST_SOLUTION: -4} + 1))
else
mkdir SOLUTION_0001
fi

A simple repackaging of choroba's answer as a generalized function:
autoincr() {
f="$1"
ext=""
# Extract the file extension (if any), with preceeding '.'
[[ "$f" == *.* ]] && ext=".${f##*.}"
if [[ -e "$f" ]] ; then
i=1
f="${f%.*}";
while [[ -e "${f}_${i}${ext}" ]]; do
let i++
done
f="${f}_${i}${ext}"
fi
echo "$f"
}
touch "$(autoincr "somefile.ext")"

without looping and not use regex or shell expr.
last=$(ls $1* | tail -n1)
last_wo_ext=$($last | basename $last .ext)
n=$(echo $last_wo_ext | rev | cut -d - -f 1 | rev)
if [ x$n = x ]; then
n=2
else
n=$((n + 1))
fi
echo $1-$n.ext
more simple without extension and exception of "-1".
n=$(ls $1* | tail -n1 | rev | cut -d - -f 1 | rev)
n=$((n + 1))
echo $1-$n.ext

Related

bash: set variable inside loop when piping find and grep [duplicate]

i want to compute all *bin files inside a given directory. Initially I was working with a for-loop:
var=0
for i in *ls *bin
do
perform computations on $i ....
var+=1
done
echo $var
However, in some directories there are too many files resulting in an error: Argument list too long
Therefore, I was trying it with a piped while-loop:
var=0
ls *.bin | while read i;
do
perform computations on $i
var+=1
done
echo $var
The problem now is by using the pipe subshells are created. Thus, echo $var returns 0.
How can I deal with this problem?
The original Code:
#!/bin/bash
function entropyImpl {
if [[ -n "$1" ]]
then
if [[ -e "$1" ]]
then
echo "scale = 4; $(gzip -c ${1} | wc -c) / $(cat ${1} | wc -c)" | bc
else
echo "file ($1) not found"
fi
else
datafile="$(mktemp entropy.XXXXX)"
cat - > "$datafile"
entropy "$datafile"
rm "$datafile"
fi
return 1
}
declare acc_entropy=0
declare count=0
ls *.bin | while read i ;
do
echo "Computing $i" | tee -a entropy.txt
curr_entropy=`entropyImpl $i`
curr_entropy=`echo $curr_entropy | bc`
echo -e "\tEntropy: $curr_entropy" | tee -a entropy.txt
acc_entropy=`echo $acc_entropy + $curr_entropy | bc`
let count+=1
done
echo "Out of function: $count | $acc_entropy"
acc_entropy=`echo "scale=4; $acc_entropy / $count" | bc`
echo -e "===================================================\n" | tee -a entropy.txt
echo -e "Accumulated Entropy:\t$acc_entropy ($count files processed)\n" | tee -a entropy.txt
The problem is that the while loop is part of a pipeline. In a bash pipeline, every element of the pipeline is executed in its own subshell [ref]. So after the while loop terminates, the while loop subshell's copy of var is discarded, and the original var of the parent (whose value is unchanged) is echoed.
One way to fix this is by using Process Substitution as shown below:
var=0
while read i;
do
# perform computations on $i
((var++))
done < <(find . -type f -name "*.bin" -maxdepth 1)
Take a look at BashFAQ/024 for other workarounds.
Notice that I have also replaced ls with find because it is not good practice to parse ls.
A POSIX compliant solution would be to use a pipe (p file). This solution is very nice, portable, and POSIX, but writes something on the hard disk.
mkfifo mypipe
find . -type f -name "*.bin" -maxdepth 1 > mypipe &
while read line
do
# action
done < mypipe
rm mypipe
Your pipe is a file on your hard disk. If you want to avoid having useless files, do not forget to remove it.
So researching the generic issue, passing variables from a sub-shelled while loop to the parent. One solution I found, missing here, was to use a here-string. As that was bash-ish, and I preferred a POSIX solution, I found that a here-string is really just a shortcut for a here-document. With that knowledge at hand, I came up with the following, avoiding the subshell; thus allowing variables to be set in the loop.
#!/bin/sh
set -eu
passwd="username,password,uid,gid
root,admin,0,0
john,appleseed,1,1
jane,doe,2,2"
main()
{
while IFS="," read -r _user _pass _uid _gid; do
if [ "${_user}" = "${1:-}" ]; then
password="${_pass}"
fi
done <<-EOT
${passwd}
EOT
if [ -z "${password:-}" ]; then
echo "No password found."
exit 1
fi
echo "The password is '${password}'."
}
main "${#}"
exit 0
One important note to all copy pasters, is that the here-document is setup using the hyphen, indicating that tabs are to be ignored. This is needed to keep the layout somewhat nice. It is important to note, because stackoverflow doesn't render tabs in 'code' and replaces them with spaces. Grmbl. SO, don't mangle my code, just cause you guys favor spaces over tabs, it's irrelevant in this case!
This probably breaks on different editor(settings) and what not. So the alternative would be to have it as:
done <<-EOT
${passwd}
EOT
This could be done with a for loop, too:
var=0;
for file in `find . -type f -name "*.bin" -maxdepth 1`; do
# perform computations on "$i"
((var++))
done
echo $var

Checking root integrity via a script

Below is my script to check root path integrity, to ensure there is no vulnerability in PATH variable.
#! /bin/bash
if [ ""`echo $PATH | /bin/grep :: `"" != """" ]; then
echo "Empty Directory in PATH (::)"
fi
if [ ""`echo $PATH | /bin/grep :$`"" != """" ]; then echo ""Trailing : in PATH""
fi
p=`echo $PATH | /bin/sed -e 's/::/:/' -e 's/:$//' -e 's/:/ /g'`
set -- $p
while [ ""$1"" != """" ]; do
if [ ""$1"" = ""."" ]; then
echo ""PATH contains ."" shift
continue
fi
if [ -d $1 ]; then
dirperm=`/bin/ls -ldH $1 | /bin/cut -f1 -d"" ""`
if [ `echo $dirperm | /bin/cut -c6 ` != ""-"" ]; then
echo ""Group Write permission set on directory $1""
fi
if [ `echo $dirperm | /bin/cut -c9 ` != ""-"" ]; then
echo ""Other Write permission set on directory $1""
fi
dirown=`ls -ldH $1 | awk '{print $3}'`
if [ ""$dirown"" != ""root"" ] ; then
echo $1 is not owned by root
fi
else
echo $1 is not a directory
fi
shift
done
The script works fine for me, and shows all vulnerable paths defined in the PATH variable. I want to also automate the process of correctly setting the PATH variable based on the above result. Any quick method to do that.
For example, on my Linux box, the script gives output as:
/usr/bin/X11 is not a directory
/root/bin is not a directory
whereas my PATH variable have these defined,and so I want to have a delete mechanism, to remove them from PATH variable of root. lot of lengthy ideas coming in mind. But searching for a quick and "not so complex" method please.
No offense but your code is completely broken. Your using quotes in a… creative way, yet in a completely wrong way. Your code is unfortunately subject to pathname expansions and word splitting. And it's really a shame to have an insecure code to “secure” your PATH.
One strategy is to (safely!) split your PATH variable into an array, and scan each entry. Splitting is done like so:
IFS=: read -r -d '' -a path_ary < <(printf '%s:\0' "$PATH")
See my mock which and How to split a string on a delimiter answers.
With this command you'll have a nice array path_ary that contains each fields of PATH.
You can then check whether there's an empty field, or a . field or a relative path in there:
for ((i=0;i<${#path_ary[#]};++i)); do
if [[ ${path_ary[i]} = ?(.) ]]; then
printf 'Warning: the entry %d contains the current dir\n' "$i"
elif [[ ${path_ary[i]} != /* ]]; then
printf 'Warning: the entry %s is not an absolute path\n' "$i"
fi
done
You can add more elif's, e.g., to check whether the entry is not a valid directory:
elif [[ ! -d ${path_ary[i]} ]]; then
printf 'Warning: the entry %s is not a directory\n' "$i"
Now, to check for the permission and ownership, unfortunately, there are no pure Bash ways nor portable ways of proceeding. But parsing ls is very likely not a good idea. stat can work, but is known to have different behaviors on different platforms. So you'll have to experiment with what works for you. Here's an example that works with GNU stat on Linux:
read perms owner_id < <(/usr/bin/stat -Lc '%a %u' -- "${path_ary[i]}")
You'll want to check that owner_id is 0 (note that it's okay to have a dir path that is not owned by root; for example, I have /home/gniourf/bin and that's fine!). perms is in octal and you can easily check for g+w or o+w with bit tests:
elif [[ $owner_id != 0 ]]; then
printf 'Warning: the entry %s is not owned by root\n' "$i"
elif ((0022&8#$perms)); then
printf 'Warning: the entry %s has group or other write permission\n' "$i"
Note the use of 8#$perms to force Bash to understand perms as an octal number.
Now, to remove them, you can unset path_ary[i] when one of these tests is triggered, and then put all the remaining back in PATH:
else
# In the else statement, the corresponding entry is good
unset_it=false
fi
if $unset_it; then
printf 'Unsetting entry %s: %s\n' "$i" "${path_ary[i]}"
unset path_ary[i]
fi
of course, you'll have unset_it=true as the first instruction of the loop.
And to put everything back into PATH:
IFS=: eval 'PATH="${path_ary[*]}"'
I know that some will cry out loud that eval is evil, but this is a canonical (and safe!) way to join array elements in Bash (observe the single quotes).
Finally, the corresponding function could look like:
clean_path() {
local path_ary perms owner_id unset_it
IFS=: read -r -d '' -a path_ary < <(printf '%s:\0' "$PATH")
for ((i=0;i<${#path_ary[#]};++i)); do
unset_it=true
read perms owner_id < <(/usr/bin/stat -Lc '%a %u' -- "${path_ary[i]}" 2>/dev/null)
if [[ ${path_ary[i]} = ?(.) ]]; then
printf 'Warning: the entry %d contains the current dir\n' "$i"
elif [[ ${path_ary[i]} != /* ]]; then
printf 'Warning: the entry %s is not an absolute path\n' "$i"
elif [[ ! -d ${path_ary[i]} ]]; then
printf 'Warning: the entry %s is not a directory\n' "$i"
elif [[ $owner_id != 0 ]]; then
printf 'Warning: the entry %s is not owned by root\n' "$i"
elif ((0022 & 8#$perms)); then
printf 'Warning: the entry %s has group or other write permission\n' "$i"
else
# In the else statement, the corresponding entry is good
unset_it=false
fi
if $unset_it; then
printf 'Unsetting entry %s: %s\n' "$i" "${path_ary[i]}"
unset path_ary[i]
fi
done
IFS=: eval 'PATH="${path_ary[*]}"'
}
This design, with if/elif/.../else/fi is good for this simple task but can get awkward to use for more involved tests. For example, observe that we had to call stat early before the tests so that the information is available later in the tests, before we even checked that we're dealing with a directory.
The design may be changed by using a kind of spaghetti awfulness as follows:
for ((oneblock=1;oneblock--;)); do
# This block is only executed once
# You can exit this block with break at any moment
done
It's usually much better to use a function instead of this, and return from the function. But because in the following I'm also going to check for multiple entries, I'll need to have a lookup table (associative array), and it's weird to have an independent function that uses an associative array that's defined somewhere else…
clean_path() {
local path_ary perms owner_id unset_it oneblock
local -A lookup
IFS=: read -r -d '' -a path_ary < <(printf '%s:\0' "$PATH")
for ((i=0;i<${#path_ary[#]};++i)); do
unset_it=true
for ((oneblock=1;oneblock--;)); do
if [[ ${path_ary[i]} = ?(.) ]]; then
printf 'Warning: the entry %d contains the current dir\n' "$i"
break
elif [[ ${path_ary[i]} != /* ]]; then
printf 'Warning: the entry %s is not an absolute path\n' "$i"
break
elif [[ ! -d ${path_ary[i]} ]]; then
printf 'Warning: the entry %s is not a directory\n' "$i"
break
elif [[ ${lookup[${path_ary[i]}]} ]]; then
printf 'Warning: the entry %s appears multiple times\n' "$i"
break
fi
# Here I'm sure I'm dealing with a directory
read perms owner_id < <(/usr/bin/stat -Lc '%a %u' -- "${path_ary[i]}")
if [[ $owner_id != 0 ]]; then
printf 'Warning: the entry %s is not owned by root\n' "$i"
break
elif ((0022 & 8#$perms)); then
printf 'Warning: the entry %s has group or other write permission\n' "$i"
break
fi
# All tests passed, will keep it
lookup[${path_ary[i]}]=1
unset_it=false
done
if $unset_it; then
printf 'Unsetting entry %s: %s\n' "$i" "${path_ary[i]}"
unset path_ary[i]
fi
done
IFS=: eval 'PATH="${path_ary[*]}"'
}
All this is really safe regarding spaces and glob characters and newlines inside PATH; the only thing I don't really like is the use of the external (and non-portable) stat command.
I'd recommend you get a good book on Bash shell scripting. It looks like you learned Bash from looking at 30 year old system shell scripts and by hacking away. This isn't a terrible thing. In fact, it shows initiative and great logic skills. Unfortunately, it leads you down to some really bad code.
If statements
In the original Bourne shell the [ was a command. In fact, /bin/[ was a hard link to /bin/test. The test command was a way to test certain aspects of a file. For example test -e $file would return a 0 if the $file was executable and a 1 if it wasn't.
The if merely took the command after it, and would run the then clause if that command returned an exit code of zero, or the else clause (if it exists) if the exit code wasn't zero.
These two are the same:
if test -e $file
then
echo "$file is executable"
fi
if [ -e $file ]
then
echo "$file is executable"
fi
The important idea is that [ is merely a system command. You don't need these with the if:
if grep -q "foo" $file
then
echo "Found 'foo' in $file"
fi
Note that I am simply running grep and if grep is successful, I'm echoing my statement. No [ ... ] are necessary.
A shortcut to the if is to use the list operators && and ||. For example:
grep -q "foo" $file && echo "I found 'foo' in $file"
is the same as the above if statement.
Never parse ls
You should never parse the ls command. You should use stat instead. stat gets you all the information in the command, but in an easily parseable form.
[ ... ] vs. [[ ... ]]
As I mentioned earlier, in the original Bourne shell, [ was a system command. In Kornshell, it was an internal command, and Bash carried it over too.
The problem with [ ... ] is that the shell would first interpolate the command before the test was performed. Thus, it was vulnerable to all sorts of shell issues. The Kornshell introduced [[ ... ]] as an alternative to the [ ... ] and Bash uses it too.
The [[ ... ]] allows Kornshell and Bash to evaluate the arguments before the shell interpolates the command. For example:
foo="this is a test"
bar="test this is"
[ $foo = $bar ] && echo "'$foo' and '$bar' are equal."
[[ $foo = $bar ]] && echo "'$foo' and '$bar' are equal."
In the [ ... ] test, the shell interpolates first which means that it becomes [ this is a test = test this is ] and that's not valid. In [[ ... ]] the arguments are evaluated first, thus the shell understands it's a test between $foo and $bar. Then, the values of $foo and $bar are interpolated. That works.
For loops and $IFS
There's a shell variable called $IFS that sets how read and for loops parse their arguments. Normally, it's set to space/tab/NL, but you can modify this. Since each PATH argument is separated by :, you can set IFS=:", and use a for loop to parse your $PATH.
The <<< Redirection
The <<< allows you to take a shell variable and pass it as STDIN to the command. These both more or less do the same thing:
statement="This contains the word 'foo'"
echo "$statement" | sed 's/foo/bar/'
statement="This contains the word 'foo'"
sed 's/foo/bar/'<<<$statement
Mathematics in the Shell
Using ((...)) allows you to use math and one of the math function is masking. I use masks to determine whether certain bits are set in the permission.
For example, if my directory permission is 0755 and I and it against 0022, I can see if user read and write permissions are set. Note the leading zeros. That's important, so that these are interpreted as octal values.
Here's your program rewritten using the above:
#! /bin/bash
grep -q "::" <<<"$PATH" && echo "Empty directory in PATH ('::')."
grep -q ":$" <<<$PATH && "PATH has trailing ':'"
#
# Fix Path Issues
#
path=$(sed -e 's/::/:/g' -e 's/:$//'<<<$PATH);
OLDIFS="$IFS"
IFS=":"
for directory in $PATH
do
[[ $directory == "." ]] && echo "Path contains '.'."
[[ ! -d "$directory" ]] && echo "'$directory' isn't a directory in path."
mode=$(stat -L -f %04Lp "$directory") # Differs from system to system
[[ $(stat -L -f %u "$directory") -eq 0 ]] && echo "Directory '$directory' owned by root"
((mode & 0022)) && echo "Group or Other write permission is set on '$directory'."
done
I'm not 100% sure what you want to do or mean about PATH Vulnerabilities. I don't know why you care whether a directory is owned by root, and if an entry in the $PATH is not a directory, it won't affect the $PATH. However, one thing I would test for is to make sure all directories in your $PATH are absolute paths.
[[ $directory != /* ]] && echo "Directory '$directory' is a relative path"
The following could do the whole work and also removes duplicate entries
export PATH="$(perl -e 'print join(q{:}, grep{ -d && !((stat(_))[2]&022) && !$seen{$_}++ } split/:/, $ENV{PATH})')"
I like #kobame's answer but if you don't like the perl-dependency you can do something similar to:
$ cat path.sh
#!/bin/bash
PATH="/root/bin:/tmp/groupwrite:/tmp/otherwrite:/usr/bin:/usr/sbin"
echo "${PATH}"
OIFS=$IFS
IFS=:
for path in ${PATH}; do
[ -d "${path}" ] || continue
paths=( "${paths[#]}" "${path}" )
done
while read -r stat path; do
[ "${stat:5:1}${stat:8:1}" = '--' ] || continue
newpath="${newpath}:${path}"
done < <(stat -c "%A:%n" "${paths[#]}" 2>/dev/null)
IFS=${OIFS}
PATH=${newpath#:}
echo "${PATH}"
$ ./path.sh
/root/bin:/tmp/groupwrite:/tmp/otherwrite:/usr/bin:/usr/sbin
/usr/bin:/usr/sbin
Note that this is not portable due to stat not being portable but it will work on Linux (and Cygwin). For this to work on BSD systems you will have to adapt the format string, other Unices don't ship with stat at all OOTB (Solaris, for example).
It doesn't remove duplicates or directories not owned by root either but that can easily be added. The latter only requires the loop to be adapted slightly so that stat also returns the owner's username:
while read -r stat owner path; do
[ "${owner}${stat:5:1}${stat:8:1}" = 'root--' ] || continue
newpath="${newpath}:${path}"
done < <(stat -c "%A:%U:%n" "${paths[#]}" 2>/dev/null)

Running diff and have it stop on a difference

I have a script running that is checking multiples directories and comparing them to expanded tarballs of the same directories elsewhere.
I am using diff -r -q and what I would like is that when diff finds any difference in the recursive run it will stop running instead of going through more directories in the same run.
All help appreciated!
Thank you
#bazzargh I did try it like you suggested or like this.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]];
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
But this only works with files that exist in both directories. If one file is missing I won't get information about that. Also the directories I am working with have over 300.000 files so it seems to be a bit of overhead to do a find for each file and then diff.
I would like something like this to work, with and elif statement that checks if $runid.tmp contains data and breaks if it does. I added 2> after the first if statement so stderr is sent to the $runid.tmp file.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]] 2> /tmp/$runid.tmp;
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
elif [[ -s /tmp/$runid.tmp ]];
then echo differs: $file >> /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
Would this work?
You can do the loop over files with 'find' and break when they differ. eg for dirs foo, bar:
for file in $(find foo -type f); do if [[ $(diff -q $file ${file/#foo/bar}) ]]; then echo differs: $file; break; else echo same: $file; fi; done
NB this will not detect if 'bar' has directories that do not exist in 'foo'.
Edited to add: I just realised I overlooked the really obvious solution:
diff -rq foo bar | head -n1
It's not 'diff', but with 'awk' you can compare two files (or more) and then exit when they have a different line.
Try something like this (sorry, it's a little rough)
awk '{ h[$0] = ! h[$0] } END { for (k in h) if (h[k]) exit }' file1 file2
Sources are here and here.
edit: to break out of the loop when two files have the same line, you may have to do the loop in awk. See here.
You can try the following:
#!/usr/bin/env bash
# Determine directories to compare
d1='./someDir1'
d2='./someDir2'
# Loop over the file lists and diff corresponding files
while IFS= read -r line; do
# Split the 3-column `comm` output into indiv. variables.
lineNoTabs=${line//$'\t'}
numTabs=$(( ${#line} - ${#lineNoTabs} ))
d1Only='' d2Only='' common=''
case $numTabs in
0)
d1Only=$lineNoTabs
;;
1)
d2Only=$lineNoTabs
;;
*)
common=$lineNoTabs
;;
esac
# If a file exists in both directories, compare them,
# and exit if they differ, continue otherwise
if [[ -n $common ]]; then
diff -q "$d1/$common" "$d2/$common" || {
echo "EXITING: Diff found: '$common'" 1>&2;
exit 1; }
# Deal with files unique to either directory.
elif [[ -n $d1Only ]]; then # fie
echo "File '$d1Only' only in '$d1'."
else # implies: if [[ -n $d2Only ]]; then
echo "File '$d2Only' only in '$d2."
fi
# Note: The `comm` command below is CASE-SENSITIVE, which means:
# - The input directories must be specified case-exact.
# To change that, add `I` after the last `|` in _both_ `sed commands`.
# - The paths and names of the files diffed must match in case too.
# To change that, insert `| tr '[:upper:]' '[:lower:]' before _both_
# `sort commands.
done < <(comm \
<(find "$d1" -type f | sed 's|'"$d1/"'||' | sort) \
<(find "$d2" -type f | sed 's|'"$d2/"'||' | sort))
The approach is based on building a list of files (using find) containing relative paths (using sed to remove the root path) for each input directory, sorting the lists, and comparing them with comm, which produces 3-column, tab-separated output to indicated which lines (and therefore files) are unique to the first list, which are unique to the second list, and which lines they have in common.
Thus, the values in the 3rd column can be diffed and action taken if they're not identical.
Also, the 1st and 2nd-column values can be used to take action based on unique files.
The somewhat complicated splitting of the 3 column values output by comm into individual variables is necessary, because:
read will treat multiple tabs in sequence as a single separator
comm outputs a variable number of tabs; e.g., if there's only a 1st-column value, no tab is output at all.
I got a solution to this thanks to #bazzargh.
I use this code in my script and now it works perfectly.
for file in $(find ${intfolder} -type f);
do if [[ $(diff -q $file ${file/#${intfolder}/${EXPANDEDROOT}/${runid}/$(basename ${intfolder})}) ]] 2> ${resultfile}.tmp;
then echo differs: $file > ${resultfile}.tmp 2>&1; break;
elif [[ -s ${resultfile}.tmp ]];
then echo differs: $file >> ${resultfile}.tmp 2>&1; break;
else echo same: $file > /dev/null;
fi; done
thanks!

Check whether a certain file type/extension exists in directory [duplicate]

This question already has answers here:
Test whether a glob has any matches in Bash
(22 answers)
Closed 1 year ago.
How would you go about telling whether files of a specific extension are present in a directory, with bash?
Something like
if [ -e *.flac ]; then
echo true;
fi
#!/bin/bash
count=`ls -1 *.flac 2>/dev/null | wc -l`
if [ $count != 0 ]
then
echo true
fi
#/bin/bash
myarray=(`find ./ -maxdepth 1 -name "*.py"`)
if [ ${#myarray[#]} -gt 0 ]; then
echo true
else
echo false
fi
This uses ls(1), if no flac files exist, ls reports error and the script exits; othewise the script continues and the files may be be processed
#! /bin/sh
ls *.flac >/dev/null || exit
## Do something with flac files here
shopt -s nullglob
if [[ -n $(echo *.flac) ]] # or [ -n "$(echo *.flac)" ]
then
echo true
fi
#!/bin/bash
files=$(ls /home/somedir/*.flac 2> /dev/null | wc -l)
if [ "$files" != "0" ]
then
echo "Some files exists."
else
echo "No files with that extension."
fi
You need to be carful which flag you throw into your if statement, and how it relates to the outcome you want.
If you want to check for only regular files and not other types of file system entries then you'll want to change your code skeleton to:
if [ -f file ]; then
echo true;
fi
The use of the -f restricts the if to regular files, whereas -e is more expansive and will match all types of filesystem entries. There are of course other options like -d for directories, etc. See http://tldp.org/LDP/abs/html/fto.html for a good listing.
As pointed out by #msw, test (i.e. [) will choke if you try and feed it more than one argument. This might happen in your case if the glob for *.flac returned more than one file. In that case try wrapping your if test in a loop like:
for file in ./*.pdf
do
if [ -f "${file}" ]; then
echo 'true';
break
fi
done
This way you break on the first instance of the file extension you want and can keep on going with the rest of the script.
The top solution (if [ -e *.flac ];) did not work for me, giving: [: too many arguments
if ls *.flac >/dev/null 2>&1; then it will work.
You can use -f to check whether files of a specific type exist:
#!/bin/bash
if [ -f *.flac ] ; then
echo true
fi
bash only:
any_with_ext () (
ext="$1"
any=false
shopt -s nullglob
for f in *."$ext"; do
any=true
break
done
echo $any
)
if $( any_with_ext flac ); then
echo "have some flac"
else
echo "dir is flac-free"
fi
I use parentheses instead of braces to ensure a subshell is used (don't want to clobber your current nullglob setting).
shopt -s nullglob
set -- $(echo *.ext)
if [ "${#}" -gt 0 ];then
echo "got file"
fi
For completion, with zsh:
if [[ -n *.flac(#qN) ]]; then
echo true
fi
This is listed at the end of the Conditional Expressions section in the zsh manual. Since [[ disables filename globbing, we need to force filename generation using (#q) at the end of the globbing string, then the N flag (NULL_GLOB option) to force the generated string to be empty in case there’s no match.
Here is a solution using no external commands (i.e. no ls), but a shell function instead. Tested in bash:
shopt -s nullglob
function have_any() {
[ $# -gt 0 ]
}
if have_any ./*.flac; then
echo true
fi
The function have_any uses $# to count its arguments, and [ $# -gt 0 ] then tests whether there is at least one argument. The use of ./*.flac instead of just *.flac in the call to have_any is to avoid problems caused by files with names like --help.
Here's a fairly simple solution:
if [ "$(ls -A | grep -i \\.flac\$)" ]; then echo true; fi
As you can see, this is only one line of code, but it works well enough. It should work with both bash, and a posix-compliant shell like dash. It's also case-insensitive, and doesn't care what type of files (regular, symlink, directory, etc.) are present, which could be useful if you have some symlinks, or something.
I tried this:
if [ -f *.html ]; then
echo "html files exist"
else
echo "html files dont exist"
fi
I used this piece of code without any problem for other files, but for html files I received an error:
[: too many arguments
I then tried #JeremyWeir's count solution, which worked for me:
count=`ls -1 *.flac 2>/dev/null | wc -l`
if [ $count != 0 ]
then
echo true
fi
Keep in mind you'll have to reset the count if you're doing this in a loop:
count=$((0))
This should work in any borne-like shell out there:
if [ "$(find . -maxdepth 1 -type f | grep -i '.*\.flac$')" ]; then
echo true
fi
This also works with the GNU find, but IDK if this is compatible with other implementations of find:
if [ "$(find . -maxdepth 1 -type f -iname \*.flac)" ]; then
echo true
fi

Creating bash script of a complex linux command

I have few long commands that I will be using on a day to day basis. So I though it would be better to have a bash script where I could pass arguments, thus saving typing. I guess this is the norm in Linux but I am kind of new to it. Could someone show me how to do it. A example is the following command
cut -f <column_number> <filename> | sort | uniq -c |
sort -r -k1 -n | awk '{printf "%-15s %-10d\n", $2,$1}'
so i want this in a script where i can pass the filename and column number (preferably in any order) and get the desired ouput instead of having to type the whole thing everytime.
Create a file say myscript.sh -
#!/bin/bash
if [ $# -ne 2 ]; then
echo Usage: myscript.sh column_number file_path
exit
fi
if ! [ -f $2 ]; then
echo File doesnt exist
exit
fi
if [ `echo $1 | grep -E ^[0-9]+$ | wc -l` -ne 1 ]; then
echo First argument must be a number
exit
fi
cut -f 10 $1 $2 | sort | uniq -c |
sort -r -k1 -n | awk '{printf "%-15s %-10d\n", $2,$1}'
Make sure this file is executable using command chmod +x mytask.sh
You can invoke it like sh myscript.sh 30 myfile.sh or ./myscript.sh 30 myfile.sh
The first line of above script specifies the shell you would like your script to be executed in. $1 and $2 refer to the first and second command line arguments.
About argument validity checks:
First check ensures that there are exactly two arguments passed to the script.
Second check ensures the file pointed by the argument two is existing
Third check ensures that the number passed as first argument is really a number. It uses regular expression for that purpose. May be someone provide a better replacement for this check but this is what came to my mind instantly.
To accept the filename and column number in any order, you'll need to use option switches. Bash's getopts allows you to specify and process options so you can call your script using scriptname -f filename -c 12 or scriptname -c 12 -f filename for example.
#!/bin/bash
options=":f:c:"
while getopts $options option
do
case $option in
f)
filename=$OPTARG
;;
c)
col_num=$OPTARG
;;
\?)
usage_function # not shown
exit 1
;;
*)
echo "Invalid option"
usage_function
exit 1
;;
esac
done
shift $((OPTIND - 1))
if [[ -z $filename || -z $col_num ]]
then
echo "Missing option"
usage_function
exit 1
fi
if [[ $col_num == *[^0-9]* ]]
then
echo "Invalid integer"
usage_function
exit 1
fi
# other checks
cut -f 10 $col_num "$filename" | ...

Resources