bash: Truncate filenames, adding incrementing number for duplicates

I would like to shorten all filenames in a given location, and where the truncation produces duplicates, include an incrementing number in the filename. I am most of the way there, thanks to this solution: bash: Truncate Filenames, keeping them unique
I've made a small modification allowing it to cover multiple file extensions (as the original was just written for jpegs). My script:
num=1
length=8
for file in *
do
newname=$file
extension=${file: -4}
until [[ ! -f $newname ]]
do
(( sublen = length - ${#num} ))
printf -v newname '%.*s%d' "$sublen" "$file" "$num"
(( num++ ))
done
mv "$file" "$newname""$extension"
done
Original file list:
DumbFilename.txt
DumbFilename2.txt
DumbFilename3.txt
GrossFilename.txt
GrossFilename2.txt
GrossFilename3.txt
LongFilename.doc
LongFilename2.doc
LongFilename3.doc
UglyFilename.doc
UglyFilename2.doc
UglyFilename3.doc
Output from the above code:
DumbFil1.txt
DumbFil2.txt
DumbFil3.txt
GrossFi4.txt
GrossFi5.txt
GrossFi6.txt
LongFil7.doc
LongFil8.doc
LongFil9.doc
UglyFi10.doc
UglyFi11.doc
UglyFi12.doc
My only issue is that the numbers increment in one long sequence. I only want it to increment for each instance of a duplicate, like this:
DumbFil1.txt
DumbFil2.txt
DumbFil3.txt
GrossFi1.txt
GrossFi2.txt
GrossFi3.txt
LongFil1.doc
LongFil2.doc
LongFil3.doc
UglyFil1.doc
UglyFil2.doc
UglyFil3.doc
How do I go about this?

Would you please try the following:
length=8
for file in *; do
newname=$file
extension=${file: -4}
for (( num=1; ; num++ )); do
(( sublen = length - ${#num} ))
printf -v newname '%.*s%d%s' "$sublen" "$file" "$num" "$extension"
[[ ! -f $newname ]] && break
done
mv -- "$file" "$newname"
done
By the way, your original script tests whether $newname exists before the extension is appended, so the check can miss an existing file such as DumbFil1.txt and mv will then overwrite it.

Put the truncated filename in a variable. On the next iteration, check if the current truncated filename is the same as the previous one. If it's different, reset num to 1 to start the sequence over.
length=8
num=1
last_truncated=
for file in *
do
newname=$file
extension=${file: -4}
until [[ ! -f $newname ]]
do
(( sublen = length - ${#num} ))
truncated=${newname:0:$sublen}
# reset $num with new prefix
if [[ $truncated != $last_truncated ]]
then num=1
fi
newname=$truncated$num
(( num++ ))
done
last_truncated=$truncated
mv "$file" "$newname""$extension"
done
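Another way to get a separate counter per truncated name is to keep the counters in an associative array, which does not depend on duplicates being adjacent in the listing. The following is only a sketch under some assumptions: it needs Bash 4 or later, `shorten_names` is a hypothetical helper name, the extension is taken to be everything after the last dot, and it prints the planned renames instead of calling mv or checking for files that already exist.

```bash
#!/usr/bin/env bash
# Sketch only: print "old -> new" pairs instead of renaming anything.
# shorten_names is a hypothetical helper; it needs Bash 4+ for "local -A".
shorten_names() {
    local length=$1; shift
    local -A seen=()                 # key: provisional prefix + extension
    local file base ext key num sublen
    for file in "$@"; do
        base=$file ext=
        if [[ $file == ?*.* ]]; then # split off the last extension, if any
            base=${file%.*}
            ext=.${file##*.}
        fi
        key=${base:0:length-1}$ext   # group by the prefix that leaves room for one digit
        num=$(( ${seen[$key]:-0} + 1 ))
        seen[$key]=$num
        sublen=$(( length - ${#num} ))
        printf '%s -> %.*s%d%s\n' "$file" "$sublen" "$base" "$num" "$ext"
    done
}

shorten_names 8 DumbFilename.txt DumbFilename2.txt GrossFilename.txt LongFilename.doc
```

Once the printed pairs look right, the printf line could be replaced by an mv call, ideally with an existence check like the one in the answers above.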

Related

How can I add a third parameter to this bash script?

I want to add a third parameter to a script that changes file names from upper to lower case or from lower to upper case. The third parameter should specify which file's name must be changed. What's wrong with this script? Thank you in advance.
#!/bin/bash
if test "$1" = "lower" && test "$2" = "upper"
then
for file in *; do
if [ $0 != "$file" ] && [ $0 != "./$file" ]; then
mv "$file" "$(echo $file | tr [:lower:] [:upper:])";
fi
fi
done
elif test "$1" = "upper" && test "$2" = "lower"
then
for file in *; do
if [ $0 != "$file" ] && [ $0 != "./$file" ]; then
mv "$file" "$(echo $file | tr [:upper:] [:lower:])";
fi
done
fi
if [ "$1" = "lower" ] && [ "$2" = "upper" ] && [ "$3" = "$file" ];
then
for file in * ; do
if [ $0 != "$file" ] && [ $0 != "./$file" ]; then
mv "$file" "$(echo $file | tr [:lower:] [:upper:])";
fi
done
fi
If I am guessing correctly what you want, try
#!/bin/bash
case $1:$2 in
upper:lower | lower:upper ) ;;
*) echo "Syntax: $0 upper|lower lower|upper files ..." >&2; exit 1;;
esac
from=$1
to=$2
shift; shift
for file; do
mv "$file" "$(echo "$file" | tr "[:$from:]" "[:$to:]")"
done
This has the distinct advantage that it allows more than three arguments, where the first two specify the operation to perform.
Notice also how we take care to always quote strings which contain a file name. See also When to wrap quotes around a shell variable?
The above script should in fact also work with /bin/sh; we do not use any Bash-only features so it should run under any POSIX sh.
However, a much better design would probably be to use an option to decide what mapping to apply, and simply accept a (possibly empty) list of options and a list of file name arguments. Then you can use Bash built-in parameter expansion, too. Case conversion parameter expansion operations are available in Bash 4 only, though.
#!/bin/bash
op=',,'
# XXX FIXME: do proper option parsing
case $1 in -u) op='^^'; shift;; esac
for file; do
eval mv "\$file" "\${file$op}"
done
This converts to lowercase by default, and switches to uppercase instead if you pass in -u before the file names.
In both of these scripts, we use for file as shorthand for for file in "$@", i.e. we loop over the (remaining) command-line arguments. Perhaps this is the detail you were looking for.
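To make the for file shorthand concrete, here is a small sketch that also shows the Bash 4 case-conversion expansions without eval. `convert_names` is a hypothetical helper name, and it only prints the converted names rather than renaming files.

```bash
#!/usr/bin/env bash
# convert_names is a hypothetical helper: it prints the converted names
# rather than renaming files. Requires Bash 4+ for ${var,,} and ${var^^}.
convert_names() {
    local upper=0 file
    if [[ ${1-} == -u ]]; then
        upper=1
        shift
    fi
    for file; do                      # equivalent to: for file in "$@"
        if (( upper )); then
            printf '%s\n' "${file^^}"
        else
            printf '%s\n' "${file,,}"
        fi
    done
}

convert_names "My File.TXT" other.Doc
convert_names -u "My File.TXT"
```

Because the loop iterates over the quoted positional parameters, a name containing spaces stays a single argument.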
Forgive me if I grossly misunderstand, but I think you may have misunderstood how argument passing works.
The named/numbered arguments represent the values you pass in on the command line in their ordinal positions. Each can theoretically have any value that can be stuck in a string. You don't need a third parameter, just a third value.
Let's try a sample.
#!/usr/bin/env bash
me=${0##*/} # strip the path
use="
$me { upper | lower } file
changes the NAME of the file given to be all upper or all lower case.
"
# check for correct arguments
case $# in
2) : exactly 2 arguments passed - this is correct ;;
*) echo "Incorrect usage - $me requires exactly 2 arguments $use" >&2
exit 1 ;;
esac
declare -l lower action # these variables will downcase anything put in them
declare -u upper # this one will upcase anything in it
declare newname # create a target variable with unspecified case
action="$1" # stored the *lowercased* 1st argument passed as $action
case $action in # passed argument has been lowercased for simpler checking
upper) upper="$2" # store *uppercased* 2nd arg.
newname="$upper" # newname is now uppercase.
;;
lower) lower="$2" # store *lowercased* 2nd arg.
newname="$lower" # newname is now lowercase.
;;
*) echo "Incorrect usage - $me requires 1st arg to be 'upper' or 'lower' $use" >&2
exit 1 ;;
esac
if [[ -e "$2" ]] # confirm the argument exists
then echo "Renaming $2 -> $newname:"
ls -l "$2"
echo " -> "
mv "$2" "$newname" # rename the file
ls -l "$newname"
else echo "'$2' does not exist. $use" >&2
exit 1
fi
First of all, there is a structural problem in this script: in the first if block, done should come before the closing fi.
Below is the corrected version.
if test "$1" = "lower" && test "$2" = "upper"
then
for file in *; do
if [ $0 != "$file" ] && [ $0 != "./$file" ]; then
mv "$file" "$(echo $file | tr [:lower:] [:upper:])";
fi
done
fi
Secondly the question you asked:
#!/bin/bash -xe
[ $# -ne 3 ] && echo "Usage: {lower} {upper} {fileName} " && exit 1
if [ "$1" = "lower" ] && [ "$2" = "upper" ] && [ -f "$3" ];
then
mv "$3" "$(echo "$3" | tr '[:lower:]' '[:upper:]')";
fi
Hope this helps.

Unix - Replace column value inside while loop

I have comma separated (sometimes tab) text file as below:
parameters.txt:
STD,ORDER,ORDER_START.xml,/DML/SOL,Y
STD,INSTALL_BASE,INSTALL_START.xml,/DML/IB,Y
with below code I try to loop through the file and do something
while read line; do
if [[ $1 = "$(echo "$line" | cut -f 1)" ]] && [[ "$(echo "$line" | cut -f 5)" = "Y" ]] ; then
//do something...
if [[ $? -eq 0 ]] ; then
// code to replace the final flag
fi
fi
done < <text_file_path>
I wanted to update the last column of the file to N if the above operation is successful, however below approaches are not working for me:
sed 's/$f5/N/'
'$5=="Y",$5=N;{print}'
$(echo "$line" | awk '$5=N')
Update: a few considerations that I missed at first and that should add clarity, apologies!
The parameters file may also contain lines whose last field is already "N".
The final flag needs to be updated only if the "//do something" code has executed successfully.
After looping through all lines, i.e. after exiting the while loop, the flags for all rows are to be set to "Y".
Perhaps invert the operations and do the processing in awk.
$ awk -v f1="$1" 'BEGIN {FS=OFS=","}
f1==$1 && $5=="Y" { # do something
$5="N"}1' file
not sure what "do something" operation is, if you need to call another command/script it's possible as well.
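As a sketch of calling another command from awk, the function below uses awk's system() and flips the flag only when the command exits with status 0. `update_flags` and its argument order are assumptions made for this example, and the command passed in is a placeholder for the real "do something" step.

```bash
#!/usr/bin/env bash
# update_flags is a hypothetical helper:
#   $1 = value to match in column 1
#   $2 = command to run for each matching line (placeholder for "do something")
#   $3 = parameter file
# It prints the updated lines to stdout and leaves the file untouched.
update_flags() {
    awk -v f1="$1" -v cmd="$2" 'BEGIN { FS = OFS = "," }
        $1 == f1 && $5 == "Y" {
            # system() returns the exit status of cmd; flip the flag only on success
            if (system(cmd) == 0) $5 = "N"
        }
        { print }' "$3"
}

# Example call (assumes parameters.txt exists in the current directory):
#   update_flags STD true parameters.txt
```

Note that awk processes backslash escapes in values passed with -v, so a command containing backslashes would need extra care.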
with bash:
(
IFS=,
while read -ra fields; do
if [[ ${fields[0]} == "$1" ]] && [[ ${fields[4]} == "Y" ]]; then
# do something
fields[4]="N"
fi
echo "${fields[*]}"
done < file | sponge file
)
I run that in a subshell so the effects of altering IFS are localized.
This uses sponge to write back to the same file. You need the moreutils package to use it, otherwise use
done < file > tmp && mv tmp file
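The temporary-file variant can be sketched as a function that only flips the flag when the processing step succeeds, which is what the question's update asks for. `mark_done` and `process_line` are hypothetical names; `process_line` stands in for the real work and always succeeds here.

```bash
#!/usr/bin/env bash
# process_line is a hypothetical stand-in for the real "do something" step.
process_line() {
    true
}

# mark_done KEY FILE: set column 5 to N on lines where column 1 is KEY,
# column 5 is Y, and process_line succeeds.
mark_done() {
    local key=$1 file=$2 tmp
    local IFS=,                       # used by read to split and by [*] to join
    local -a fields
    tmp=$(mktemp) || return 1
    while read -ra fields; do
        if [[ ${fields[0]} == "$key" && ${fields[4]} == "Y" ]] &&
           process_line "${fields[@]}"; then
            fields[4]=N
        fi
        printf '%s\n' "${fields[*]}"
    done < "$file" > "$tmp" && mv -- "$tmp" "$file"
}
```

Declaring IFS as a local variable keeps the change confined to the function, so no subshell is needed. The replaced file takes on the permissions of the temporary file, which may or may not matter in your setup.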
Perhaps a bit simpler, less bash-specific
while IFS= read -r line; do
case $line in
"$1",*,Y)
# do something
line="${line%Y}N"
;;
esac
echo "$line"
done < file
To replace ,N at the end of the line($) with ,Y:
sed 's/,N$/,Y/' file

How to show the file with most hard links in a directory in bash

Does anyone know the specific command for how to show the file with most hard links in a directory on terminal on unix?
This solution works for all filenames (including ones with newlines in them), skips non-files, and prints the paths of all files that have the maximum number of links:
dir=$1
# Support empty directories and directories containing files whose
# names begin with '.'
shopt -s nullglob dotglob
declare -i maxlinks=0 numlinks
maxlinks_paths=()
for path in "$dir"/* ; do
[[ -f $path ]] || continue # skip non-files
numlinks=$(stat --format '%h' -- "$path")
if (( numlinks > maxlinks )) ; then
maxlinks=numlinks
maxlinks_paths=( "$path" )
elif (( numlinks == maxlinks )) ; then
maxlinks_paths+=( "$path" )
fi
done
# Print results with printf and '%q' to quote unusual filenames
(( ${#maxlinks_paths[*]} > 0 )) && printf '%q\n' "${maxlinks_paths[@]}"
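If a quick answer is enough and the file names contain no newlines, a shorter pipeline is possible. This sketch assumes GNU find (for -printf with %n, the hard-link count); `most_linked` is a hypothetical function name, and unlike the script above it prints only one path when several files tie.

```bash
#!/usr/bin/env bash
# most_linked DIR: print the path of one regular file in DIR with the highest
# hard-link count. Assumes GNU find (-printf) and no newlines in file names.
most_linked() {
    find "$1" -maxdepth 1 -type f -printf '%n\t%p\n' |
        sort -t $'\t' -k1,1nr |
        head -n 1 |
        cut -f 2-
}

# Example call:
#   most_linked /some/directory
```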

Running diff and have it stop on a difference

I have a script running that checks multiple directories and compares them to expanded tarballs of the same directories elsewhere.
I am using diff -r -q and what I would like is that when diff finds any difference in the recursive run it will stop running instead of going through more directories in the same run.
All help appreciated!
Thank you
@bazzargh I did try it like you suggested or like this.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]];
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
But this only works with files that exist in both directories. If one file is missing I won't get information about that. Also the directories I am working with have over 300.000 files so it seems to be a bit of overhead to do a find for each file and then diff.
I would like something like this to work, with an elif statement that checks if $runid.tmp contains data and breaks if it does. I added 2> after the first if statement so stderr is sent to the $runid.tmp file.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]] 2> /tmp/$runid.tmp;
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
elif [[ -s /tmp/$runid.tmp ]];
then echo differs: $file >> /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
Would this work?
You can do the loop over files with 'find' and break when they differ. eg for dirs foo, bar:
for file in $(find foo -type f); do if [[ $(diff -q $file ${file/#foo/bar}) ]]; then echo differs: $file; break; else echo same: $file; fi; done
NB this will not detect if 'bar' has directories that do not exist in 'foo'.
Edited to add: I just realised I overlooked the really obvious solution:
diff -rq foo bar | head -n1
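The pipeline works because head exits after one line, and diff is then terminated by SIGPIPE the next time it tries to write, so it stops at or shortly after the first reported difference. A small sketch of wrapping this for use in a script follows; the function names are made up for this example, and messages that diff writes to stderr are not captured.

```bash
#!/usr/bin/env bash
# first_difference DIR1 DIR2: print at most the first line reported by diff -rq.
first_difference() {
    diff -rq "$1" "$2" | head -n 1
}

# dirs_match DIR1 DIR2: succeed only if diff reports nothing on stdout.
dirs_match() {
    [[ -z $(first_difference "$1" "$2") ]]
}

# Example call:
#   dirs_match foo bar || echo "foo and bar differ"
```

This also reports files that exist in only one of the two trees, since diff -rq prints an "Only in" line for them.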
It's not 'diff', but with 'awk' you can compare two files (or more) and then exit when they have a different line.
Try something like this (sorry, it's a little rough)
awk '{ h[$0] = ! h[$0] } END { for (k in h) if (h[k]) exit }' file1 file2
Sources are here and here.
edit: to break out of the loop when two files have the same line, you may have to do the loop in awk. See here.
You can try the following:
#!/usr/bin/env bash
# Determine directories to compare
d1='./someDir1'
d2='./someDir2'
# Loop over the file lists and diff corresponding files
while IFS= read -r line; do
# Split the 3-column `comm` output into indiv. variables.
lineNoTabs=${line//$'\t'}
numTabs=$(( ${#line} - ${#lineNoTabs} ))
d1Only='' d2Only='' common=''
case $numTabs in
0)
d1Only=$lineNoTabs
;;
1)
d2Only=$lineNoTabs
;;
*)
common=$lineNoTabs
;;
esac
# If a file exists in both directories, compare them,
# and exit if they differ, continue otherwise
if [[ -n $common ]]; then
diff -q "$d1/$common" "$d2/$common" || {
echo "EXITING: Diff found: '$common'" 1>&2;
exit 1; }
# Deal with files unique to either directory.
elif [[ -n $d1Only ]]; then
echo "File '$d1Only' only in '$d1'."
else # implies: if [[ -n $d2Only ]]; then
echo "File '$d2Only' only in '$d2'."
fi
# Note: The `comm` command below is CASE-SENSITIVE, which means:
# - The input directories must be specified case-exact.
# To change that, add `I` after the last `|` in _both_ `sed` commands.
# - The paths and names of the files diffed must match in case too.
# To change that, insert `| tr '[:upper:]' '[:lower:]'` before _both_
# `sort` commands.
done < <(comm \
<(find "$d1" -type f | sed 's|'"$d1/"'||' | sort) \
<(find "$d2" -type f | sed 's|'"$d2/"'||' | sort))
The approach is based on building a list of files (using find) containing relative paths (using sed to remove the root path) for each input directory, sorting the lists, and comparing them with comm, which produces 3-column, tab-separated output to indicate which lines (and therefore files) are unique to the first list, which are unique to the second list, and which lines they have in common.
Thus, the values in the 3rd column can be diffed and action taken if they're not identical.
Also, the 1st and 2nd-column values can be used to take action based on unique files.
The somewhat complicated splitting of the 3 column values output by comm into individual variables is necessary, because:
read will treat multiple tabs in sequence as a single separator
comm outputs a variable number of tabs; e.g., if there's only a 1st-column value, no tab is output at all.
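The tab-counting trick can be tried in isolation on two small lists. This is a minimal sketch: `classify_comm_line` is a hypothetical helper name, and it assumes the compared names contain no tab characters themselves.

```bash
#!/usr/bin/env bash
# classify_comm_line is a hypothetical helper. It assumes the compared
# names themselves contain no tab characters.
classify_comm_line() {
    local line=$1
    local stripped=${line//$'\t'}
    # The number of leading tabs tells us which comm column the value is in.
    case $(( ${#line} - ${#stripped} )) in
        0) printf 'only in first: %s\n' "$stripped" ;;
        1) printf 'only in second: %s\n' "$stripped" ;;
        *) printf 'in both: %s\n' "$stripped" ;;
    esac
}

comm <(printf '%s\n' a b c) <(printf '%s\n' b c d) |
    while IFS= read -r line; do
        classify_comm_line "$line"
    done
```

Reading with IFS= keeps the leading tabs intact, which a plain read would strip.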
I got a solution to this thanks to @bazzargh.
I use this code in my script and now it works perfectly.
for file in $(find ${intfolder} -type f);
do if [[ $(diff -q $file ${file/#${intfolder}/${EXPANDEDROOT}/${runid}/$(basename ${intfolder})}) ]] 2> ${resultfile}.tmp;
then echo differs: $file > ${resultfile}.tmp 2>&1; break;
elif [[ -s ${resultfile}.tmp ]];
then echo differs: $file >> ${resultfile}.tmp 2>&1; break;
else echo same: $file > /dev/null;
fi; done
thanks!

How to use shell logical operators for If else case

I need some help to write a script for the following scenario.
The requirement is that, based on the number of configuration files (*.cfg) inside a given directory, I need to load all the configuration file names without the file extension into an array. If there is only one configuration file in the directory, then the array will be assigned the value "" (not the name of the only available configuration file).
I am trying to do this using logical operators. This is what i have tried so far.
[`ls *.cfg |wc -l`] || code_to_initialize_array;
My problem here is that, how do I integrate the case where i have only one configuration file.
Short code:
#!/bin/bash
array=(*.cfg)
array=("${array[@]%.cfg}")
[ ${#array[@]} -eq 1 ] && array=""
#!/bin/bash
config=(*.cfg) #glob instead ls usage
num=${#config[@]}
case $num in
0)
echo "No config file"
;;
1)
echo "Only one config file"
;;
*)
code_to_initialize_array
;;
esac
You can have this example script for your requirement. It's detailed and variable names are long but you could have your own customizations. Using readarray is safer than A=($(...)) since it doesn't depend on IFS and is not subject to pathname expansion.
#!/bin/bash
DIR=/path/to/somewhere
readarray -t FILES < <(compgen -G "${DIR%/}/*.cfg") ## Store matches to array.
FILES_COUNT=${#FILES[@]} ## Match count.
FILES_NAMES=("${FILES[@]##*/}") ## No directory parts.
FILES_NAMES_WITHOUT_CFG=("${FILES_NAMES[@]%.cfg}") ## No .cfg extension.
if [[ FILES_COUNT -gt 0 ]]; then
printf "File: %s\n" "${FILES[@]}"
printf "Name: %s\n" "${FILES_NAMES[@]}"
printf "Name (no .cfg): %s\n" "${FILES_NAMES_WITHOUT_CFG[@]}"
printf "Total: %d\n" "$FILES_COUNT"
fi
Note that each entry has the same index number. So ${FILES[1]} is ${FILES_NAMES[1]} and also ${FILES_NAMES_WITHOUT_CFG[1]}. Entries begin with index 0.
You can also have other details through this:
if [[ FILES_COUNT -gt 0 ]]; then
for I in "${!FILES[@]}"; do
printf "File: %s\n" "${FILES[I]}"
printf "Name: %s\n" "${FILES_NAMES[I]}"
printf "Name (no .cfg): %s\n" "${FILES_NAMES_WITHOUT_CFG[I]}"
printf "Index number: %d\n\n" "$I"
done
printf "Total: %d\n" "$FILES_COUNT"
fi
I've always liked abusing a for loop for a situation like this.
for x in *.cfg; do
[[ -f $x ]] && code_to_initialize_array
break
done
The explicit break means the loop iterates only once, no matter how many .cfg files you have. If you have none, *.cfg will be treated literally, so the [[ -f $x ]] checks if the "first" cfg file actually exists before trying to run code_to_initialize_array.
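The literal *.cfg problem can also be avoided with nullglob, which turns an unmatched pattern into an empty list. The sketch below puts the whole requirement into one function under a few assumptions: `load_cfg_names` and the global array `cfg_names` are names invented for this example, and the single-file case is represented as an array holding one empty string.

```bash
#!/usr/bin/env bash
# load_cfg_names DIR: fill the global array cfg_names (a name chosen for this
# sketch) with the .cfg base names found in DIR.
load_cfg_names() {
    local dir=${1%/} restore
    local -a files
    restore=$(shopt -p nullglob) || true  # remember the current nullglob setting
    shopt -s nullglob                     # no match -> empty array, not a literal *.cfg
    files=( "$dir"/*.cfg )
    $restore                              # put nullglob back the way it was
    files=( "${files[@]##*/}" )           # drop the directory part
    cfg_names=( "${files[@]%.cfg}" )      # drop the extension
    if (( ${#files[@]} == 1 )); then
        cfg_names=( "" )                  # single file: one empty entry
    fi
}

# Example call:
#   load_cfg_names /path/to/somewhere
#   printf 'Name: %s\n' "${cfg_names[@]}"
```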
