BASH: If statement needed to run if number of files in directory is 2 or greater - linux

I have the following BASH script:
http://pastebin.com/CX4RN1QW
There are two sections within the script that I want to run only if the number of files in the directory is 2 or greater. They are marked by ## Begin file test here and ## End file test.
I am very sensitive about the script; I don't want anything else to change, even if it simplifies things.
I have tried:
if [ "$(ls -b | wc -l)" -gt 1 ];
But that didn't work.

Instead of using the external ls command, you can use a glob to check for the existence of files in a directory:
EDIT I missed that you were looking for 2 or more files. Updated.
shopt -s nullglob # cause unmatched globs to return empty, rather than the glob itself
files=(*) # put all files in the current directory into an array
if (( ${#files[@]} >= 2 )); then # true only if the directory contains at least 2 entries
...
fi
shopt -u nullglob # disable nullglob again (optional)

You would need ls -1 there for it to work, since -b doesn't make it print one item per line. Alternatively use find, since it does that by default.
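If you go the find route, a count-based guard might look like this (a sketch, not from the original thread; line counting misfires on filenames containing newlines):
if [ "$(find . -maxdepth 1 -type f | wc -l)" -ge 2 ]; then # count regular files directly in the current directory (includes hidden files)
: # run the sections between ## Begin file test here and ## End file test
fi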

Related

`mv somedir/* someotherdir` when somedir is empty

I am writing an automated bash script that moves some files from one directory to another directory, but the first directory may be empty:
$ mv somedir/* someotherdir/
mv: cannot stat 'somedir/*': No such file or directory
How can I write this command without generating an error if the directory is empty? Should I just use rm and cp instead? I could write a conditional check to see if the directory is empty first, but that feels like overkill.
I'm surprised the command fails if the directory is empty, so I'm trying to find out if I'm missing some simple solution.
Environment:
bash
RHEL
If you really want full control over the process, it might look like:
#!/usr/bin/env bash
# ^^^^- bash, not sh
restore_nullglob=$(shopt -p nullglob) # store the initial state of the nullglob setting
shopt -s nullglob # unconditionally enable nullglob
source_files=( somedir/* ) # store matching files in an array
if (( ${#source_files[@]} )); then # if that array isn't empty...
mv -- "${source_files[@]}" someotherdir/ # ...move the files it contains...
else # otherwise...
echo "No files to move; doing nothing" >&2 # ...write an error message.
fi
eval "$restore_nullglob" # restore nullglob to its original setting
Explaining the moving parts:
When nullglob is set, the shell expands *.txt to an empty list if no .txt files exist; by default (with nullglob unset), it expands *.txt to the literal string *.txt when there are no matching files.
source_files is an array above -- bash's native mechanism to store a list. ${#source_files[@]} expands to the length of that array, whereas ${source_files[@]} on its own expands to its contents.
(( )) creates an arithmetic context, in which expressions are treated as math. In such a context, 0 is falsey, and positive numbers are truthy. Thus, (( ${#source_files[@]} )) is true only if there is at least one file listed in the array source_files.
BTW, note that saving and restoring nullglob isn't really essential in an independent script; the purpose of showing how to do it is so you can safely use this code in larger scripts that might make assumptions about whether or not nullglob is set, without disrupting other code.
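A quick way to see the nullglob difference interactively (assuming nothing in the current directory matches the pattern):
shopt -u nullglob; echo nomatch-*.txt # prints the literal pattern: nomatch-*.txt
shopt -s nullglob; echo nomatch-*.txt # prints an empty line: the glob expands to nothing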
find somedir -type f -exec mv -t someotherdir/. '{}' +
Saves you the check, may not be what you want, though.
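Note that -t is a GNU mv extension; on systems without it, a more portable (if slower, one mv invocation per file) variant might be:
find somedir -type f -exec mv '{}' someotherdir/ ';'
Like the original, this also picks up files in subdirectories of somedir, which the glob version would not.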
Are you aware of the output stream and the error stream? The output stream has number 1, while the error stream has number 2. If you don't want to see a result, you can redirect it to the garbage bin.
Excuse me?
Well, let's have a look at this case: when the directory is empty, an error is generated and that error is shown in the error stream (2). You can redirect this, using 2>/dev/null (/dev/null being the UNIX/Linux garbage bin), so your command becomes:
$ mv somedir/* someotherdir/ 2>/dev/null
Following up on Dominique, to report all errors except the empty directory one use:
mv somedir/* someotherdir 2>&1 | grep -v No.such
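One caveat with that pipeline: the surviving error messages end up on stdout, and the exit status you see is grep's, not mv's. A small variant that keeps the filtered messages on stderr (a sketch):
mv somedir/* someotherdir 2>&1 | grep -v 'No such file' >&2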

Find out if a backup ran by searching the newest file

I'd like to write a short and simple script that searches for a file using a specific filter, and checks the age of that file. I want it to write a short output and an error code. This should be accessible for an NRPE server.
The script itself works, but I only have a problem when the file does not exist. This happens with that command:
newestfile=$(ls -t $path/$filter | head -1)
When the files exist, everything works as it should. When nothing matches my filter, I get the following output (I changed the filter to *.zip to demonstrate):
ls: cannot access '/backup/*.zip': No such file or directory
But I want to get the following output and then just exit the script with code 1:
there are no backups with the filter *.zip in the directory /backup
I am pretty sure this is a very easy problem but I just don't know what's wrong. By the way, I am still "new" to bash scripts.
Here is my whole code:
#!/bin/bash
# Set the variables
path=/backup
filter=*.tar.gz
# Find the newest file
newestfile=$(ls -t $path/$filter | head -1)
# check if we even have a file
if [ ! -f $newestfile ]; then
echo "there are no backups with the filter $filter in the directory $path"
exit 1
fi
# check how old the file is that we found
if [[ $(find "$newestfile" -mtime +1 -print) ]]; then
echo "File $newestfile is older than 24 hours"
exit 2
else
echo "the file $newestfile is younger than 24 hours"
exit 0
fi
Actually, with your code you should also get an error message bash: no match: /backup/*.zip
UPDATE: Fixed the proposed solution, and the missing quotes in the original solution:
I suggest the following approach:
shopt -u failglob # Turn off errors from globbing
pathfilter="/backup/*.tar.gz" # Quotes keep the wildcards from being expanded here already
# First see whether we have any matching files
files=($pathfilter)
if [[ ! -e ${files[0]} ]]
then
# .... No matching files
else
# Now you can safely fetch the newest file
# Note: This does NOT work if you have filenames
# containing newlines
newestfile=$(ls -tA $pathfilter | head -1)
fi
I don't like using ls for this task, but I don't see an easy way in bash to do it better.
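For what it's worth, with GNU find you can avoid parsing ls entirely (a sketch using find's -printf, a GNU extension; still line-oriented, so filenames containing newlines remain a problem):
newestfile=$(find /backup -maxdepth 1 -name '*.tar.gz' -printf '%T@\t%p\n' | sort -rn | head -1 | cut -f2-)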

Delete files in one directory that do not exist in another directory or its child directories

I am still a newbie in shell scripting and trying to come up with a simple script. Could anyone give me some direction here? Here is what I need.
Files in path 1: /tmp
100abcd
200efgh
300ijkl
Files in path2: /home/storage
backupfile_100abcd_str1
backupfile_100abcd_str2
backupfile_200efgh_str1
backupfile_200efgh_str2
backupfile_200efgh_str3
Now I need to delete the file 300ijkl in /tmp, as the corresponding backup file is not present in /home/storage. The /tmp directory contains more than 300 files. I need to delete the files in /tmp for which the corresponding backup files are not present, and the file names in /tmp will match file names in /home/storage or directories under /home/storage.
Appreciate your time and response.
You can also approach the deletion using grep. You can loop through the files in /tmp, checking with ls piped to grep, and deleting if there is not a match:
#!/bin/bash
[ -z "$1" -o -z "$2" ] && { ## validate input
printf "error: insufficient input. Usage: %s tmpfiles storage\n" ${0//*\//}
exit 1
}
for i in "$1"/*; do
fn=${i##*/} ## strip path, leaving filename only
## if file in backup matches filename, skip rest of loop
ls "${2}"* | grep -q "$fn" &>/dev/null && continue
printf "removing %s\n" "$i"
# rm "$i" ## remove file
done
Note: the actual removal is commented out above; test and ensure there are no unintended consequences before performing the actual delete. Call it passing the path to tmp (without trailing /) as the first argument and /home/storage as the second argument:
$ bash scriptname /path/to/tmp /home/storage
You can solve this by
making a list of the files in /home/storage
testing each filename in /tmp to see if it is in the list from /home/storage
Given the linux+shell tags, one might use bash:
make the list of files from /home/storage an associative array
make the subscript of the array the filename
Here is a sample script to illustrate ($1 and $2 are the parameters to pass to the script, i.e., /home/storage and /tmp):
#!/bin/bash
declare -A InTarget
while read path
do
name=${path##*/}
InTarget[$name]=$path
done < <(find $1 -type f)
while read path
do
name=${path##*/}
[[ -z ${InTarget[$name]} ]] && rm -f $path
done < <(find $2 -type f)
It uses two interesting shell features:
name=${path##*/} is a POSIX shell feature which allows the script to perform the basename function without an extra process (per filename). That makes the script faster.
done < <(find $2 -type f) is a bash feature which lets the script read the list of filenames from find without making the assignments to the array run in a subprocess. Here the reason for using the feature is that if the array were updated in a subprocess, it would have no effect on the array value in the script, which is passed to the second loop (see the short demonstration after the links below).
For related discussion:
Extract File Basename Without Path and Extension in Bash
Bash Script: While-Loop Subshell Dilemma
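A minimal demonstration of the subshell pitfall that the second feature avoids:
count=0
find . -type f | while read -r f; do count=$((count+1)); done
echo "$count" # prints 0: the loop ran in a subshell, so the increments were lost
count=0
while read -r f; do count=$((count+1)); done < <(find . -type f)
echo "$count" # prints the real file count: the loop ran in the current shell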
I spent quite some time on this today because I needed to delete files which have the same name but different extensions, so if anyone is looking for a quick implementation, here you go:
#!/bin/bash
# We need some reference to files which we want to keep and not delete;
# let's assume you want to keep the .jpeg files in the first folder, so you
# need to map the names in the second folder to that file extension first.
FILES_TO_KEEP=`ls -1 ${2} | sed 's/\.pdf$/.jpeg/g'`
#iterate through files in first argument path
for file in ${1}/*; do
# In my case, I did not want to do anything with directories, so let's continue cycle when hitting one.
if [[ -d $file ]]; then
continue
fi
# let's omit the path from the iterated file with basename so we can compare it to the files we want to keep
NAME_WITHOUT_PATH=`basename $file`
# I use a mac, which is equal to having poor-quality CLI tools
# when it comes to operating with strings;
# this should be a safe check to see if FILES_TO_KEEP contains NAME_WITHOUT_PATH
if [[ $FILES_TO_KEEP == *"$NAME_WITHOUT_PATH"* ]];then
echo "Not deleting: $NAME_WITHOUT_PATH"
else
# If it does not contain file from the other directory, remove it.
echo "deleting: $NAME_WITHOUT_PATH"
rm -rf $file
fi
done
Usage: sh deleteDifferentFiles.sh path/from/where path/source/of/truth

Writing a function to replace duplicate files with hardlinks

I need to write a bash script that iterates through the files of a specified directory and replaces duplicates of files with hardlinks. Right now, my entire function looks like this:
#! /bin/bash
# sameln --- remove duplicate copies of files in specified directory
D=$1
cd $D #go to directory specified as default input
fileNum=0 #loop counter
DIR=".*|*"
for f in $DIR #for every file in the directory
do
files[$fileNum]=$f #save that file into the array
fileNum=$((fileNum+1)) #increment the counter
done
for((j=0; j<$fileNum; j++)) #for every file
do
if [ -f "$files[$j]" ] #access that file in the array
then
for((k=0; k<$fileNum; k++)) #for every other file
do
if [ -f "$files[$k]" ] #access other files in the array
then
test[cmp -s ${files[$j]} ${files[$k]}] #compare if the files are identical
[ln ${files[$j]} ${files[$k]}] #change second file to a hard link
fi
done
fi
done
Basically:
Loop through all files of depth 1 in specified directory
Put file contents into array
Compare each array item with every other array item and replace duplicates with hardlinks
The test directory has four files: a, b, c, d
a and b are different, but c and d are duplicates (they are empty). After running the script, ls -l shows that all of the files still only have 1 hardlink, so the script appears to have basically done nothing.
Where am I going wrong?
DIR=".*|*"
for f in $DIR #for every file in the directory
do
echo $f
done
This code outputs
.*|*
You should not loop over files like this. Look into the find command. As you see, your code doesn't work because the first loop is already faulty.
BTW, don't name your variables all uppercase, those are reserved for system variables, I believe.
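For reference, a corrected version of that first loop using actual globbing might look like this (a sketch, assuming bash):
shopt -s nullglob dotglob # expand to nothing on no match; include hidden files
files=(*) # one array element per directory entry, no manual loop needed
fileNum=${#files[@]} # the count the original loop maintained by hand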
You may be making this process a bit harder on yourself than necessary. There is already a Linux command, fdupes, that scans a directory conducting a byte-by-byte, md5sum, and date & time comparison to determine whether files are duplicates of one another. It can easily find and return groups of files that are duplicates. You are left with only using the results.
Below is a quick example of using this tool for the job. NOTE: this quick example works only for filenames that do not contain spaces. You will have to modify it if you are dealing with filenames containing spaces. This is intended to show an approach to using a tool that already does what you want. Also note the actual ln command is commented out below; the program just prints what it would do. After testing, you can remove the comment from the ln command once you are satisfied with the results.
#! /bin/bash
# sameln --- remove duplicate copies of files in specified directory using fdupes
[ -d "$1" ] || { # test valid directory supplied
printf "error: invalid directory '%s'. usage: %s <dir>\n" "$1" "${0//\//}"
exit 1
}
type fdupes &>/dev/null || { # verify fdupes is available in path
printf "error: 'fdupes' required. Program not found within your path\n"
exit 1
}
pushd "$1" &>/dev/null # go to directory specified as default input
declare -a files # declare files and dupes array
declare -a dupes
## read duplicate files into files array
IFS=$'\n' read -d '' -a files < <(fdupes --sameline .)
## for each list of duplicates
for ((i = 0; i < ${#files[@]}; i++)); do
printf "\n duplicate files %s\n\n" "${files[i]}"
## split into original files (no internal 'spaces' allowed in filenames)
dupes=( ${files[i]} )
## for the 1st duplicate on
for ((j = 1; j < ${#dupes[@]}; j++)); do
## create hardlink to original (actual command commented)
printf " ln -f %s %s\n" "${dupes[0]}" "${dupes[j]}"
# ln -f "${dupes[0]}" "${dupes[j]}"
done
done
exit 0
Output/Example
$ bash rmdupes.sh dat
duplicate files ./output.dat ./tmptest ./env4.dat.out
ln -f ./output.dat ./tmptest
ln -f ./output.dat ./env4.dat.out
duplicate files ./vh.conf ./vhawk.conf
ln -f ./vh.conf ./vhawk.conf
duplicate files ./outfile.txt ./newfile.txt
ln -f ./outfile.txt ./newfile.txt
duplicate files ./z1 ./z1cpy
ln -f ./z1 ./z1cpy

If multiple directories exist then move the directories - test if a globbing pattern matches anything

I want to know how I can use an if statement in a shell script to check the existence of multiple directories.
For example, if /tmp has subdirectories test1, test2, test3, I want to move them to another directory.
I am using if [ -d /tmp/test* ]; then mv test* /pathOfNewDir
but it does not work on the if statement part.
The -d test only accepts one argument, so you'll need to test each directory individually. I would also not recommend moving test* as it may match more than you intended.
Use the double-bracket test syntax (e.g. if [[ -d ... ]]), which is bash-specific but tends to be clearer and have fewer gotchas than the single-bracket syntax. If you just need to check a few directories, you can do it with a simple statement like if [[ -d /tmp/test1 && -d /tmp/test2 && -d /tmp/test3 ]]; then...
Unfortunately, the shell's file-testing operators (such as -d and -f) operate on a single, literal path only:
A conditional such as [ -d /tmp/test* ] won't work, because if /tmp/test* expands to multiple matches, you'll get a syntax error (only 1 argument accepted).
The bash variant [[ -d /tmp/test* ]] doesn't work either, because no globbing (pathname expansion) is performed inside [[ ... ]].
To test whether a globbing pattern matches anything, the cleanest approach is to define an auxiliary function (this solution is POSIX-compliant):
exists() { [ -e "$1" ]; }
Invoke it with an unquoted pattern, e.g.:
exists foo* && echo 'HAVE MATCHES'
# or, in an `if` statement:
if exists foo*; then # ...
The only caveat is that if shopt -s failglob is in effect in bash, an error message will be printed to stderr if there's no match, and the rest of the command will not be executed.
See below for an explanation of the function.
Applied to your specific scenario, we get (using bash syntax):
# Define aux. function
exists() { [[ -e $1 ]]; }
exists /tmp/test*/ && mv /tmp/test*/ /path/to/new/dir
Note the trailing / in /tmp/test*/ to ensure that only directories match, if any.
&& ensures that the following command is only executed if the function's exit code indicates true.
mv /tmp/test*/ ... moves all matching directories at once to the new target directory.
Alternatively, capture the globbing results in a helper array variable:
if matches=(/tmp/test*/) && [[ -e ${matches[0]} ]]; then
mv "${matches[#]}" /path/to/new/dir
fi
Or, process matches individually:
for d in /tmp/test*/; do
[[ -e $d ]] || break # exit, if no actual match
# Process individual match.
mv "$d" /path/to/new/dir
done
Explanation of auxiliary function exists() { [ -e "$1" ]; }:
It takes advantage of several shell features:
If you invoke it with an unquoted pattern such as exists foo*, the shell will expand foo* to all matching files/directories and pass their names as individual arguments to the function.
If there are no matches, the pattern will be passed as is to the function - this behavior is mandated by POSIX.
Caveat: bash has configuration items that allow changing this behavior (shell options failglob and nullglob) - though by default it acts as mandated by POSIX in this case. (zsh, sadly, by default fails if there's no match.)
Inside the function, it's sufficient to examine the 1st argument ($1) to determine whether any matches were found:
If the 1st argument, $1, refers to an actual, existing filesystem item (as indicated by the exit code of the -e file-test operator), the implication is that the pattern indeed matched something (at least one, possibly more items).
Otherwise, the implication is that the pattern was passed as is, implying that no matches were found.
Note that the exit code of the -e test - due to being the last command in the function - implicitly serves as the exit code of the function as a whole.
It looks like you may want to use find:
find /tmp -maxdepth 1 -name "test*" -type d -exec mv \{\} /target/directory \;
This finds all test* directories directly under /tmp without recursion and moves them to /target/directory.
This approach uses ls and grep to create a list of matching directories or write an error in case no such directories are found:
IFS="
" # input is separated with newlines
if dirs=$( ls -1 -F | grep "^test.*/" | tr -d "/" )
then
# directories found - move them:
for d in $dirs
do
mv "$d" "$target_directory"/
done
else
# no directories found - send error
fi
While it would seem feasible to use find for such a task, find does not directly provide feedback on the number of matches as required by the OP according to the comments.
Note: Using ls for the task introduces a few limitations on filenames. This approach will not work with filenames containing newlines or wildcard characters.
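That said, if you do want a count from find, one can be derived from its output, with the same newline caveat (a sketch; the $target_directory variable follows the answer above):
count=$(find /tmp -maxdepth 1 -type d -name 'test*' | wc -l)
if [ "$count" -ge 1 ]; then
find /tmp -maxdepth 1 -type d -name 'test*' -exec mv '{}' "$target_directory"/ ';'
else
echo "no test* directories found" >&2
fi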
