Shell script for file name conversion in Linux

I am pretty new to Unix and have little exposure to shell scripting. I need a script that converts certain string values in file names to special characters. It has to run in such a way that files under all sub-directories are renamed as well.
For example:
From: abc(GE)xyz(PR).txt
To: abc>xyz%.txt
I'm OK with setting up an if condition for each required special character, but I'm not sure which options to pass or how to do it for all sub-directories.
Thanks,
Jeel

Here's one approach:
# given a filename, execute any desired replacements.
update_name() {
    local orig_name_var=$1
    local dest_name_var=$2
    # locals are prefixed so they cannot shadow the caller's variables
    local _orig=${!orig_name_var}
    local _new=$_orig
    _new=${_new//(GE)/">"}
    _new=${_new//(PR)/"%"} # repeat for additional substitutions
    # explicit %s format: the new name may itself contain "%"
    printf -v "$dest_name_var" '%s' "$_new"
}
while IFS= read -r -d '' orig_name; do
    update_name orig_name new_name
    [[ $orig_name = "$new_name" ]] && continue
    if ! [[ -e $orig_name ]]; then
        orig_dirname=${orig_name%/*}
        orig_basename=${orig_name##*/}
        update_name orig_dirname new_dirname
        if [[ -e $new_dirname/$orig_basename ]]; then
            # we already renamed the directory this file is in
            orig_name=$new_dirname/$orig_basename
        fi
    fi
    mv -- "$orig_name" "$new_name"
done < <(find . '(' -name '*(GE)*' -o -name '*(PR)*' ')' -print0)
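A variation on the same idea, offered as a sketch rather than a drop-in replacement: if find lists entries depth-first (-depth) and only the last path component is renamed, a directory is always renamed after everything inside it, so no path has to be re-derived. The function name rename_tree and the use of -mindepth 1 (available in GNU and BSD find, not required by POSIX) are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch: rename depth-first, touching only the last path component.
# rename_tree is a hypothetical helper name, not part of the answer above.
rename_tree() {
    local root=$1 path dir base new
    local gt='>' pct='%'          # replacement characters kept in variables
    while IFS= read -r -d '' path; do
        dir=${path%/*}            # parent directory, not yet renamed
        base=${path##*/}          # last component, the only part renamed here
        new=${base//"(GE)"/$gt}
        new=${new//"(PR)"/$pct}   # repeat for additional substitutions
        [[ $base == "$new" ]] && continue
        mv -- "$path" "$dir/$new"
    done < <(find "$root" -mindepth 1 -depth \
                 '(' -name '*(GE)*' -o -name '*(PR)*' ')' -print0)
}
```

Calling rename_tree . processes the current directory tree; the starting directory itself is left alone because of -mindepth 1.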

Related

Bash Loop with counter gives a count of 1 when no item found. Why?

In the function below my counter works fine as long as an item is found in $DT_FILES. If the find is empty for that folder the counter gives me a count of 1 instead of 0. I am not sure what I am missing.
What the script does here is: 1) build a variable containing all the parent folders; 2) loop through each folder, cd into it and make a list of all files that contain the string "-DT-"; 3) if it finds a file that does not end with ".tif", copy that DT file and give the copy a .tif extension. Very simple.
I count the number of times the loop creates a new file with the ".tif" extension.
So I am not sure why I am getting a count of 1 at times.
function create_tifs()
{
IFS=$'\n'
# create list of main folders
LIST=$( find . -maxdepth 1 -mindepth 1 -type d )
for f in $LIST
do
echo -e "\n${OG}>>> Folder processed: ${f} ${NONE}"
cd ${f}
DT_FILES=$(find . -type f -name '*-DT-*' | grep -v '.jpg')
if (( ${#DT_FILES} ))
then
count=0
for b in ${DT_FILES}
do
if [[ "${b}" != *".tif" ]]
then
# cp -n "${b}" "${b}.tif"
echo -e "TIF created ${b} as ${b}.tif"
echo
((count++))
else
echo -e "TIF already done ${b}"
fi
done
fi
echo -e "\nCount = ${count}"
}
I can't repro your problem, but your code contains several dubious constructs. Here is a refactoring which might coincidentally also remove whatever problem you were experiencing.
#!/bin/bash
# Don't use non-portable function definition syntax
create_tifs() {
    # Don't pollute global namespace; don't attempt to parse find output
    # See also https://mywiki.wooledge.org/BashFAQ/020
    local f
    for f in ./*/; do
        # prefer printf over echo -e
        # print diagnostic messages to standard error >&2
        # XXX What are these undeclared global variables?
        printf "\n%s>>> Folder processed: %s %s" "$OG" "$f" "$NONE" >&2
        # Again, avoid parsing find output
        # (bash -c rather than sh -c, because [[ ... ]] is a Bash feature)
        find "$f" -name '*-DT-*' -not -name '*.jpg' -exec bash -c '
            for b; do
                if [[ "${b}" != *".tif" ]]
                then
                    # cp -n "${b}" "${b}.tif"
                    printf "TIF created %s as %s.tif\n" "$b" "$b" >&2
                    # print one line for wc
                    printf ".\n"
                else
                    # XXX No newline, really??
                    printf "TIF already done %s" "$b" >&2
                fi
            done' _ {} +
    # Missing done!
    done |
    # Count lines produced by printf inside tif creation
    wc -l |
    sed 's/.*/Count = &/'
}
This could be further simplified by using find ./*/ instead of looping over f but then you don't (easily) get to emit a diagnostic message for each folder separately. Similarly, you could add -not -name '*.tif' but then you don't get to print "tif already done" for those.
Tangentially perhaps see also Correct Bash and shell script variable capitalization; use lower case for your private variables.
Printing a newline before your actual message (like in the first printf) is a weird antipattern, especially when you don't do that consistently. The usual arrangement would be to put a newline at the end of each emitted message.
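One plausible explanation for the stray count, judging only from the code as posted in the question: count=0 sits inside the if (( ${#DT_FILES} )) block, so for a folder without matches the variable is never reset and the final echo prints whatever an earlier folder left behind. The sketch below reproduces that effect with made-up data; count_demo and the folder names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: a counter that is only reset inside the "found something" branch
# keeps its old value for folders where nothing was found.
count_demo() {
    local folder files count
    for folder in with_match empty; do
        files=''
        if [[ $folder == with_match ]]; then
            files='a-DT-1'          # pretend find returned one file
        fi
        if (( ${#files} )); then
            count=0                 # the reset happens only here
            count=$((count + 1))
        fi
        printf '%s: Count = %s\n' "$folder" "$count"
    done
}
count_demo
```

This prints "with_match: Count = 1" followed by "empty: Count = 1". Moving count=0 above the if gives 0 for the empty folder.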
If you've got Bash 4.0 or later you can use globstar instead of (the error-prone) find. Try this Shellcheck-clean code:
#! /bin/bash -p
shopt -s dotglob extglob nullglob globstar
function create_tifs
{
    local dir dtfile
    local -i count
    for dir in */; do
        printf '\nFolder processed: %s\n' "$dir" >&2
        count=0
        for dtfile in "$dir"**/*-DT-!(*.jpg); do
            if [[ $dtfile == *.tif ]]; then
                printf 'TIF already done %s\n' "$dtfile" >&2
            else
                cp -v -n -- "$dtfile" "$dtfile".tif
                count+=1
            fi
        done
        printf 'Count = %d\n' "$count" >&2
    done
    return 0
}
shopt -s ... enables some Bash settings that are required by the code:
dotglob enables globs to match files and directories whose names begin with a dot (.). find shows such files by default.
extglob enables "extended globbing" (including patterns like !(*.jpg)). See the extglob section in glob - Greg's Wiki.
nullglob makes globs expand to nothing when nothing matches (otherwise they expand to the glob pattern itself, which is almost never useful in programs).
globstar enables the use of ** to match paths recursively through directory trees.
Note that globstar is potentially dangerous in versions of Bash prior to 4.3 because it follows symlinks, possibly leading to processing the same file or directory multiple times, or getting stuck in a cycle.
The -v option with cp causes it to print details of what it does. You might prefer to drop the option and print a different format of message instead.
See the accepted, and excellent, answer to Why is printf better than echo? for an explanation of why I used printf instead of echo.
I didn't use cd because it often leads to problems in programs.
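The effect of nullglob on a counting loop can be seen in isolation. This is a small sketch with a hypothetical helper, count_matches; it counts loop iterations over a glob in an empty directory, first without and then with nullglob.

```shell
#!/usr/bin/env bash
# Sketch: how nullglob changes the number of loop iterations
# when a glob matches nothing. count_matches is a hypothetical helper.
count_matches() {
    local dir=$1 f n=0
    for f in "$dir"/*-DT-*; do
        n=$((n + 1))
    done
    printf '%d\n' "$n"
}

empty_dir=$(mktemp -d)
shopt -u nullglob
count_matches "$empty_dir"   # prints 1: the loop sees the literal pattern
shopt -s nullglob
count_matches "$empty_dir"   # prints 0: the glob expands to nothing
rmdir "$empty_dir"
```

A loop that runs once over an unexpanded pattern is a common source of off-by-one counts, which is why the code above enables nullglob before iterating.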

Moving files with whitespaces and rename, if files already exist in the new folder

I need to move files with names like source (new file).c to another directory, and rename a file if one with the same name already exists there.
I tried a lot of things like
for file in $(find ~/path/ -type f -name "*.c"); do
or
IFS=$'\0' for file in $(find ~/path/ -type f -name "*.c"); do
Update 1: For the rename condition I tried if [ -f /this/is/the/path/${file} ]; then
or if [ -f "$file" ]; then
or if [ -f "$HOME/some/path/$file" ]; then
I want user input via something like read -p "some message" msg, but I can't figure it out, and the if statement doesn't work either; I don't know why.
When I run the script, I get errors for the split names. Example:
mv: cannot stat '(new': No such file or directory
Can someone help me with this?
Update 2: solution for finding names with whitespace: find ... | while IFS= read -r name; do your command; done
Update 3: solution for the if condition that didn't work: see the answer marked as correct.
My regrets
Would you please try the following:
#!/bin/bash
path="/path/to/the/source" # original directory
dest="/path/to/the/dest" # destination directory
find "$path" -type f -name "*.c" -print0 | while IFS= read -r -d "" file; do
    f=${file##*/} # extracts filename by removing everything before "/"
    if [[ -f $dest/$f ]]; then
        # if the file already exists
        for ((;;)); do # then enter an infinite loop until a proper filename is given
            IFS= read -r -p "'$f' already exists. Input new name: " f2 < /dev/tty
            if [[ ! $f2 =~ [[:alnum:]_] ]]; then
                echo "'$f2' is not a valid filename."
            elif [[ ! -f $dest/$f2 ]]; then
                # a proper filename is input
                f=$f2
                break
            else # the new name is taken as well; ask again
                f=$f2
            fi
        done
    fi
    mv -- "$file" "$dest/$f" # $f is the original name or the name just entered
done
find ... -print0 uses a null character as the filename delimiter, which protects filenames that contain blank characters (space, tab, newline, ...).
read -d "" is the counterpart of -print0 and splits the input on the null characters.
read < /dev/tty takes the user's reply from the terminal, which avoids a conflict with the loop's standard input, the pipeline fed by the find command.
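To see why the for file in $(find ...) attempts from the question fail, the two loop styles can be compared on a single file whose name contains spaces. This is only a sketch; the function names and the temporary directory are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: count loop iterations for one file named "source (new file).c".
count_with_for() {
    local n=0 word
    # Unquoted $(...) is split on whitespace, so one name becomes several words.
    for word in $(find "$1" -type f -name '*.c'); do
        n=$((n + 1))
    done
    printf '%d\n' "$n"
}

count_with_read() {
    local n=0 file
    # NUL-delimited names are read back whole.
    while IFS= read -r -d '' file; do
        n=$((n + 1))
    done < <(find "$1" -type f -name '*.c' -print0)
    printf '%d\n' "$n"
}

demo_dir=$(mktemp -d)
touch "$demo_dir/source (new file).c"
count_with_for  "$demo_dir"   # more than 1: the name was split into pieces
count_with_read "$demo_dir"   # 1
rm -rf "$demo_dir"
```

The pieces produced by the first loop are what mv complains about in the error message quoted in the question.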

Recursive directory listing in shell without using ls

I am looking for a script that recursively lists all files using export and read link, without using ls options. I have tried the following code, but it does not fulfill the purpose. Could you please help?
My Code-
#!/bin/bash
for i in `find . -print|cut -d"/" -f2`
do
if [ -d $i ]
then
echo "Hello"
else
cd $i
echo *
fi
done
Here's a simple recursive function which does a directory listing:
list_dir() {
    local i # do not use a global variable in our for loop
    # ...note that 'local' is not POSIX sh, but even ash
    # and dash support it.
    [ -n "$1" ] || set -- . # if no parameter is passed, default to '.'
    for i in "$1"/*; do # look at directory contents
        # skip the unexpanded pattern left behind by an empty directory
        [ -e "$i" ] || [ -L "$i" ] || continue
        if [ -d "$i" ]; then # if our content is a directory...
            list_dir "$i" # ...then recurse.
        else # if our content is not a directory...
            echo "Found a file: $i" # ...then list it.
        fi
    done
}
Alternately, if by "recurse", you just mean that you want the listing to be recursive, and can accept your code not doing any recursion itself:
#!/bin/bash
# ^-- we use non-POSIX features here, so shebang must not be #!/bin/sh
while IFS='' read -r -d '' filename; do
    if [ -f "$filename" ]; then
        echo "Found a file: $filename"
    fi
done < <(find . -print0)
Doing this safely calls for using -print0, so that names are separated by NULs (the only character which cannot exist in a filename; newlines within names are valid).

Running diff and have it stop on a difference

I have a script running that checks multiple directories and compares them to expanded tarballs of the same directories elsewhere.
I am using diff -r -q and what I would like is that when diff finds any difference in the recursive run it will stop running instead of going through more directories in the same run.
All help appreciated!
Thank you
@bazzargh I did try it like you suggested, or like this.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]];
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
But this only works for files that exist in both directories. If a file is missing from one of them, I won't get any information about that. Also, the directories I am working with contain over 300,000 files, so running a find and then a separate diff for every file seems like a lot of overhead.
I would like something like this to work, with an elif statement that checks whether $runid.tmp contains data and breaks if it does. I added 2> after the first if condition so that stderr is sent to the $runid.tmp file.
for file in $(find $dir1 -type f);
do if [[ $(diff -q $file ${file/#$dir1/$dir2}) ]] 2> /tmp/$runid.tmp;
then echo differs: $file > /tmp/$runid.tmp 2>&1; break;
elif [[ -s /tmp/$runid.tmp ]];
then echo differs: $file >> /tmp/$runid.tmp 2>&1; break;
else echo same: $file > /dev/null; fi; done
Would this work?
You can do the loop over files with 'find' and break when they differ. eg for dirs foo, bar:
for file in $(find foo -type f); do if [[ $(diff -q $file ${file/#foo/bar}) ]]; then echo differs: $file; break; else echo same: $file; fi; done
NB this will not detect if 'bar' has directories that do not exist in 'foo'.
Edited to add: I just realised I overlooked the really obvious solution:
diff -rq foo bar | head -n1
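The one-liner can be wrapped so that a script gets an exit status to branch on. This is a sketch under the assumption that stopping at the first reported difference is enough; first_diff is a hypothetical name. head exits after one line, and diff is then terminated by SIGPIPE the next time it writes, so it may still do some comparing before it stops.

```shell
#!/usr/bin/env bash
# Sketch: report the first difference between two directory trees.
# Returns 0 when diff reports nothing, 1 otherwise.
first_diff() {
    local first
    first=$(diff -rq -- "$1" "$2" | head -n 1)
    if [[ -n $first ]]; then
        printf '%s\n' "$first"
        return 1
    fi
    return 0
}
```

A call such as first_diff foo bar || echo "trees differ" then works in a script. Unlike a per-file loop, this also catches files that exist in only one of the two trees, because diff -r prints an "Only in ..." line for them.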
It's not 'diff', but with 'awk' you can compare two files (or more) and then exit when they have a different line.
Try something like this (sorry, it's a little rough)
awk '{ h[$0] = ! h[$0] } END { for (k in h) if (h[k]) exit 1 }' file1 file2
Sources are here and here.
edit: to break out of the loop when two files have the same line, you may have to do the loop in awk. See here.
You can try the following:
#!/usr/bin/env bash
# Determine directories to compare
d1='./someDir1'
d2='./someDir2'
# Loop over the file lists and diff corresponding files
while IFS= read -r line; do
    # Split the 3-column `comm` output into indiv. variables.
    lineNoTabs=${line//$'\t'}
    numTabs=$(( ${#line} - ${#lineNoTabs} ))
    d1Only='' d2Only='' common=''
    case $numTabs in
        0)
            d1Only=$lineNoTabs
            ;;
        1)
            d2Only=$lineNoTabs
            ;;
        *)
            common=$lineNoTabs
            ;;
    esac
    # If a file exists in both directories, compare them,
    # and exit if they differ, continue otherwise
    if [[ -n $common ]]; then
        diff -q "$d1/$common" "$d2/$common" || {
            echo "EXITING: Diff found: '$common'" 1>&2;
            exit 1; }
    # Deal with files unique to either directory.
    elif [[ -n $d1Only ]]; then
        echo "File '$d1Only' only in '$d1'."
    else # implies: if [[ -n $d2Only ]]; then
        echo "File '$d2Only' only in '$d2'."
    fi
    # Note: The `comm` command below is CASE-SENSITIVE, which means:
    #  - The input directories must be specified case-exact.
    #    To change that, add `I` after the last `|` in _both_ `sed` commands.
    #  - The paths and names of the files diffed must match in case too.
    #    To change that, insert `| tr '[:upper:]' '[:lower:]'` before _both_
    #    `sort` commands.
done < <(comm \
    <(find "$d1" -type f | sed 's|'"$d1/"'||' | sort) \
    <(find "$d2" -type f | sed 's|'"$d2/"'||' | sort))
The approach is based on building a list of files (using find) containing relative paths (using sed to remove the root path) for each input directory, sorting the lists, and comparing them with comm, which produces 3-column, tab-separated output to indicate which lines (and therefore files) are unique to the first list, which are unique to the second list, and which lines they have in common.
Thus, the values in the 3rd column can be diffed and action taken if they're not identical.
Also, the 1st and 2nd-column values can be used to take action based on unique files.
The somewhat complicated splitting of the 3 column values output by comm into individual variables is necessary, because:
read will treat multiple tabs in sequence as a single separator
comm outputs a variable number of tabs; e.g., if there's only a 1st-column value, no tab is output at all.
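The tab counting can be tried on two tiny lists. A sketch, with made-up input (a, b and b, c) standing in for the sorted file lists; show_columns is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: classify comm output lines by the number of leading tabs.
# 0 tabs = only in the first list, 1 tab = only in the second, 2 tabs = in both.
show_columns() {
    local line stripped
    while IFS= read -r line; do
        stripped=${line//$'\t'}
        printf 'tabs=%d name=%s\n' "$(( ${#line} - ${#stripped} ))" "$stripped"
    done < <(comm <(printf '%s\n' a b) <(printf '%s\n' b c))
}
show_columns
```

This prints tabs=0 name=a, tabs=2 name=b and tabs=1 name=c, one per line. The counting relies on file names that contain no tab characters themselves.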
I got a solution to this thanks to @bazzargh.
I use this code in my script and now it works perfectly.
for file in $(find ${intfolder} -type f);
do if [[ $(diff -q $file ${file/#${intfolder}/${EXPANDEDROOT}/${runid}/$(basename ${intfolder})}) ]] 2> ${resultfile}.tmp;
then echo differs: $file > ${resultfile}.tmp 2>&1; break;
elif [[ -s ${resultfile}.tmp ]];
then echo differs: $file >> ${resultfile}.tmp 2>&1; break;
else echo same: $file > /dev/null;
fi; done
thanks!

Find all files where no part of the path of the file is a symbolic link

Is there an easy way to find all files where no part of the path of the file is a symbolic link?
Short:
find myRootDir -type f -print
This would answer the question.
Take care not to add a slash at the end of the specified directory (myRootDir, not myRootDir/).
This prints only real files reached through real paths: no symlinked files, and no files inside symlinked directories.
But...
If you want to make sure that the path of the specified directory itself contains no symlink, here is a little bash function that can do the job:
isPurePath() {
    if [ -d "$1" ];then
        while [ ! -L "$1" ] && [ ${#1} -gt 0 ] ;do
            set -- "${1%/*}"
            if [ "${1%/*}" == "$1" ] ;then
                [ ! -L "$1" ] && return
                set -- ''
            fi
        done
    fi
    false
}
if isPurePath /usr/share/texmf/dvips/xcolor ;then echo yes; else echo no;fi
yes
if isPurePath /usr/share/texmf/doc/pgf ;then echo yes; else echo no;fi
no
So you can find all files where no part of the path is a symbolic link by running this command:
isPurePath myRootDir && find myRootDir -type f -print
If anything is printed, no part of the path is a symlink.
You can use this script (copy and paste the whole code into a shell):
cat<<'EOF'>sympath
#!/bin/bash
cur="$1"
# stop once no "/" is left, so relative paths cannot loop forever
while [[ $cur == */* ]]; do
    cur="${cur%/*}"
    if test -L "$cur"; then
        echo >&2 "$cur is a symbolic link"
        exit 1
    fi
done
EOF
${cur%/*} is a bash parameter expansion
EXAMPLE
chmod +x sympath
./sympath /tmp/foo/bar/base
/tmp/foo/bar is a symbolic link
I don't know any easy way, but here's an answer that fully answers your question, using two methods (that are, in fact, essentially the same):
Using an auxiliary script
Create a file called hasnosymlinkinname (or choose a better name --- I've always sucked at choosing names):
#!/bin/bash
# make the name absolute (same test as in the one-liner version)
if [[ "$1" = /* ]]; then
    name=$1
else
    name="$(pwd)/$1"
fi
IFS=/ read -r -a namearray <<< "$name"
for ((i=0;i<${#namearray[@]}; ++i)); do
    IFS=/ read name <<< "${namearray[*]:0:i+1}"
    [[ -L "$name" ]] && exit 1
done
exit 0
Then chmod +x hasnosymlinkinname. Then use with find:
find /path/where/stuff/is -exec ./hasnosymlinkinname {} \; -print
The script works like this: using IFS trickery, we decompose the filename into the parts of its path (separated by /) and put each part in an array, namearray. Then we loop through the cumulative parts of the array (joined with / thanks to some more IFS trickery), and if any of these partial paths is a symlink (see the -L test), we exit with a non-success return code (1); otherwise, we exit with a success return code (0).
Then find runs this script on all files in /path/where/stuff/is. If the script exits with a success return code, the name of the file is printed out (but instead of -print you could do whatever else you like).
Using a one(!)-liner (if you have a large screen) to impress your grand-mother (or your dog)
find /path/where/stuff/is -exec bash -c 'if [[ "$0" = /* ]]; then name=$0; else name="$(pwd)/$0"; fi; IFS=/ read -r -a namearray <<< "$name"; for ((i=0;i<${#namearray[@]}; ++i)); do IFS=/ read name <<< "${namearray[*]:0:i+1}"; [[ -L "$name" ]] && exit 1; done; exit 0' {} \; -print
Note
This method is 100% safe regarding spaces or funny symbols that could appear in file names. I don't know how you'll use the output of this command, but please make sure that you'll use a good method that will also be safe regarding spaces and funny symbols that could appear in a file name, i.e., don't parse its output with another script unless you use -print0 or similar smart thing.
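For comparison, the same component-by-component check can be written with plain parameter expansion instead of splitting on IFS. This is a sketch, not the author's script: has_symlink_component is a hypothetical name, and it walks the path exactly as given, so for a relative path nothing above the current directory is inspected.

```shell
#!/usr/bin/env bash
# Sketch: succeed (status 0) if the path or any of its leading
# components is a symbolic link, fail (status 1) otherwise.
has_symlink_component() {
    local path=$1
    while [[ -n $path ]]; do
        [[ -L $path ]] && return 0
        [[ $path == */* ]] || break   # no slash left: last component checked
        path=${path%/*}               # drop the last component
    done
    return 1
}
```

Because it is a shell function rather than a separate script, it would be called from a loop that reads find ... -print0 output with read -r -d '', printing only the names for which it fails.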
