Get grandparent directory in bash script - rename files for a directory in their paths - linux

I have the following script, which I normally use when I get a bunch of files that need to be renamed to the directory name which contains them.
The problem now is I need to rename the file to the directory two levels up. How can I get the grandparent directory to make this work?
With the following script I get errors like this one:
"mv: cannot move ./48711/zoom/zoom.jpg to ./48711/zoom/./48711/zoom.jpg: No such file or directory". This is running on CentOS 5.6.
I want the final file to be named: 48711.jpg
#!/bin/bash
function dirnametofilename() {
    for f in $*; do
        bn=$(basename "$f")
        ext="${bn##*.}"
        filepath=$(dirname "$f")
        dirname=$(basename "$filepath")
        mv "$f" "$filepath/$dirname.$ext"
    done
}
export -f dirnametofilename
find . -name "*.jpg" -exec bash -c 'dirnametofilename "{}"' \;
find .

Another method could be to use
(cd ../../; pwd)
If this were executed in any top-level path such as /, /usr/, or /usr/share/, you would get /; one level deeper, you would start seeing distinct results: /usr/share/man/ would return /usr, /my/super/deep/path/is/awesome/ would return /my/super/deep/path, and so on.
You could store this in a variable as well:
GRANDDADDY="$(cd ../../; pwd)"
and then use it for the rest of your script.
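For example, a minimal sketch of the generic technique (GRANDNAME is a hypothetical variable name added here for illustration; this assumes the working directory is at least two levels deep):

GRANDDADDY="$(cd ../../; pwd)"           # full path, e.g. /my/super/deep/path
GRANDNAME="$(basename "$GRANDDADDY")"    # name only, e.g. path
echo "Grandparent path: $GRANDDADDY"
echo "Grandparent name: $GRANDNAME"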

Assuming filepath doesn't end in /, which it shouldn't if you use dirname, you can do
Parent="${filepath%/*}"
Grandparent="${filepath%/*/*}"
So do something like this
[[ "${filepath%/*/*}" == "" ]] && echo "Path isn't long enough" || echo "${filepath%/*/*}"
Also, this likely won't work if you're using relative paths (as with find .), in which case you will want to use
filepath=$(dirname "$f")
filepath=$(readlink -f "$filepath")
instead of
filepath=$(dirname "$f")
Also you're never stripping the extension, so there is no reason to get it from the file and then append it again.
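Putting those pieces together, here is a hedged sketch of the OP's function (assuming GNU readlink is available; "grandparent" here means the parent of the file's containing directory, per the question):

function dirnametofilename() {
    for f in "$@"; do
        bn=$(basename "$f")
        ext="${bn##*.}"
        filepath=$(dirname "$f")
        filepath=$(readlink -f "$filepath")   # absolute path, so %/* is safe
        grandparent="${filepath%/*}"          # parent of the containing directory
        mv "$f" "$filepath/$(basename "$grandparent").$ext"
    done
}

With the question's layout, ./48711/zoom/zoom.jpg would be renamed to ./48711/zoom/48711.jpg.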

Note:
* This answer solves the OP's specific problem, in whose context "grandparent directory" means: the parent directory of the directory containing a file (it is the grandparent path from the file's perspective).
* By contrast, given the question's generic title, other answers here focus (only) on getting a directory's grandparent directory; the succinct answer to the generic question is: grandParentDir=$(cd ../..; printf %s "$PWD") to get the full path, and grandParentDirName=$(cd ../..; basename -- "$PWD") to get the dir. name only.
Try the following:
find . -name '*.jpg' \
-execdir bash -c \
'old="$1"; new="$(cd ..; basename -- "$PWD").${old##*.}"; echo mv "$old" "$new"' - {} \;
Note: echo was prepended to mv to be safe - remove it to perform the actual renaming.
-execdir ... \; executes the specified command in the specific directory that contains a given matching file and expands {} to the filename of each.
bash -c is used to execute a small ad-hoc script:
$(cd ..; basename -- "$PWD") determines the parent directory name of the directory containing the file, which is the grandparent path from the file's perspective.
${old##*.} is a Bash parameter expansion that returns the input filename's suffix (extension).
Note how {} - the filename at hand - is passed as the 2nd argument to the command in order to bind to $1, because bash -c uses the 1st one to set $0 (which is set to dummy value - here).
Note that each file is merely renamed, i.e., it stays in its original directory.
Caveat:
Each directory with a matching file should only contain 1 matching file, otherwise multiple files will be renamed to the same target name in sequence - effectively, only the last file renamed will survive.
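With the OP's layout, a dry run (with the echo still in place) would print something like this (hypothetical output):

$ find . -name '*.jpg' \
  -execdir bash -c \
  'old="$1"; new="$(cd ..; basename -- "$PWD").${old##*.}"; echo mv "$old" "$new"' - {} \;
mv ./zoom.jpg 48711.jpg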

Can't you use realpath ../../ or readlink -f ../../ ? See this, readlink(1), realpath(3), canonicalize_file_name(3), and realpath(1). You may want to install the realpath package on Debian or Ubuntu. Probably CentOS has an equivalent package. (readlink should always be available; it is in GNU coreutils.)
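For example (assuming GNU coreutils is installed):

grandparent=$(readlink -f ../..)   # canonical absolute path of the grandparent directory
echo "$grandparent"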

Related

Convert all files of a specified extension within a directory to pdf, recursively for all sub-directories

I'm using the following code (from this answer) to convert all CPP files in the current directory to a file named code.pdf and it works well:
find . -name "*.cpp" -print0 | xargs -0 enscript -Ecpp -MLetter -fCourier8 -o - | ps2pdf - code.pdf
I would like to improve this script to:
Make it a .sh file that can take an argument specifying the extension instead of having it hardcoded to CPP;
Have it run recursively, visiting all subdirectories of the current directory;
For each subdirectory encountered, convert all files of the specified extension to a single PDF that is named $NameOfDirectory$.PDF and is placed in that subdirectory;
First, if I understand it correctly, this requirement:
For each subdirectory encountered, convert all files of the specified extension to a single PDF that is named $NameOfDirectory$.PDF
is unwise. If that means, say, a/b/c/*.cpp gets enscripted to ./c.pdf, then you're screwed if you also have a/d/x/c/*.cpp, since both directories' contents map to the same PDF. It also means that *.cpp (i.e. CPP files in the current dir) gets enscripted to a file named ./..pdf.
Something like this, which names the PDF according to the desired extension and places it in each subdirectory alongside its source files, doesn't have those problems:
#!/usr/bin/env bash
# USAGE: ext2pdf [<ext> [<root_dir>]]
# DEFAULTS: <ext> = cpp
# <root_dir> = .
ext="${1:-cpp}"
rootdir="${2:-.}"
shopt -s nullglob
find "$rootdir" -type d | while read d; do
# With "nullglob", this loop only runs if any $d/*.$ext files exist
for f in "$d"/*.${ext}; do
out="$d/$ext".pdf
# NOTE: Uncomment the following line instead if you want to risk name collisions
#out="${rootdir}/$(basename "$d")".pdf
enscript -Ecpp -MLetter -fCourier8 -o - "$d"/*.${ext} | ps2pdf - "$out"
break # We only want this to run once
done
done
First, if I understand correctly, what you are using is in fact wrong - find will retrieve files from all sub-directories. To work recursively while only taking files from the current directory, use something like this (I named it do.bash):
#!/bin/bash
ext=$1
if ls *.$ext &> /dev/null; then
    enscript -Ecpp -MLetter -fCourier8 -o - *.$ext | ps2pdf - "$(basename "$(pwd)").pdf"
fi
for subdir in */; do
    if [ "$subdir" == "*/" ]; then break; fi
    cd "$subdir"
    /path/to/do.bash "$ext"
    cd ../
done
The checks make sure that a file with the extension or a subdirectory actually exists. This script operates on the current directory and calls itself recursively; if you do not want to use a full path, put it somewhere in your PATH, though a full path is fine.
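Hypothetical usage, assuming the script was saved as /path/to/do.bash and made executable:

cd /path/to/project
/path/to/do.bash cpp    # produces <dirname>.pdf in each directory that contains .cpp files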

Bash script to change file names of different extension [duplicate]

This question already has answers here:
Batch renaming files with Bash
(10 answers)
Closed 4 years ago.
We are working on an AngularJS project, where the compiled output contains many files with extensions like js, css, woff, etc., each with a dynamic hash as part of the file name.
I am working on a simple bash script to find the files with the mentioned extensions and move them to another folder, with the hash removed by searching for the first instance of '.'.
Please note that the file extensions .woff and .css should be retained.
/src/main.1cc794c25c00388d81bb.js ==> /dst/main.js
/src/polyfills.eda7b2736c9951cdce19.js ==> /dst/polyfills.js
/src/runtime.a2aefc53e5f0bce023ee.js ==> /dst/runtime.js
/src/styles.8f19c7d2fbe05fc53dc4.css ==> /dst/styles.css
/src/1.620807da7415abaeeb47.js ==> /dst/1.js
/src/2.93e8bd3b179a0199a6a3.js ==> /dst/2.js
/src/some-webfont.fee66e712a8a08eef580.woff ==> /dst/some-webfont.woff
/src/Web_Bd.d2138591460eab575216.woff ==> /dst/Web_Bd.woff
Bash code:
#!/bin/bash
echo Process web binary files!
echo Processing the name change for js files!!!!!!!!!!!!!
sfidx=0;
SFILES=./src/*.js #{js,css,voff}
DST=./dst/
for files in $SFILES
do
    echo $(basename $files)
    cp $files ${DST}"${files//.*}".js
    sfidx=$((sfidx+1))
done
echo Number of target files detected in srcdir $sfidx!!!!!!!!!!
The above code has 2 problems,
I need to list the file extensions in one place in the for loop, instead of running the script once per extension. However, this attempt fails, and I am not sure what needs to be changed.
SFILES=./src/*.{js,css,voff}
cp: cannot stat `./src/*.{js,css,voff}': No such file or directory
Second, the cp command fails for the reason below; I need some help to figure out the correct syntax.
cp $files ${DST}"${files//.*}".js
1.620807da7415abaeeb47.js
cp: cannot create regular file `./dst/./src/1.620807da7415abaeeb47.js.js': No such file or directory
Here is a relatively simple command to do it:
find ./src -type f \( -name \*.js -o -name \*.css -o -name \*.woff \) -print0 |
    while IFS= read -r -d $'\0' line; do
        dest="./dst/$(basename "$line" | sed -E 's/(\..{20}\.)(js|css|woff)/\.\2/g')"
        echo "Copying $line to $dest"
        cp "$line" "$dest"
    done
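If the hash is not always exactly 20 characters, a hedged variant of the sed expression (assuming the hash is always lowercase hexadecimal) could match any run of hex digits instead:

dest="./dst/$(basename "$line" | sed -E 's/\.[0-9a-f]+\.(js|css|woff)$/.\1/')"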
This is based on the original code and is Shellcheck-clean:
#!/bin/bash
shopt -s nullglob # Make globs that match nothing expand to nothing
echo 'Process web binary files!'
echo 'Processing the name change for js, css, and woff files!!!!!!!!!!!!!'
srcfiles=( src/*.{js,css,woff} )
destdir=dst
for srcpath in "${srcfiles[#]}" ; do
filename=${srcpath##*/}
printf '%s\n' "$filename"
nohash_base=${filename%.*.*} # Remove the hash and suffix from the end
suffix=${filename##*.} # Remove everything up to the final '.'
newfilename=$nohash_base.$suffix
cp -- "$srcpath" "$destdir/$newfilename"
done
echo "Number of target files detected in srcdir ${#srcfiles[*]}!!!!!!!!!"
The code uses an array instead of a string to hold the list of files because it is easier (and generally safer, because it can handle files with names that contain spaces and other special characters). See Arrays [Bash Hackers Wiki] for information about using arrays in Bash.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for information about using ${var##pattern} etc. for extracting parts of strings.
See Correct Bash and shell script variable capitalization for an explanation of why it is best to avoid uppercase variable names (such as SFILES).
shopt -s nullglob prevents strange things happening if the glob pattern(s) fail to match. See Why is nullglob not default? for more information.
See Bash Pitfalls #2 (cp $file $target) for why it's generally better to use cp -- instead of plain cp (though it's not necessary in this case (since neither argument can begin with '-')).
It's best to keep Bash code Shellcheck-clean. When run on the code in the question it identifies the key problem, and recommends the use of arrays as a way to fix it. It also identifies several other potential problems.
Your problem is precedence of the expansions. Here is my solution:
#!/bin/bash
echo Process web binary files!
echo Processing the name change for js files!!!!!!!!!!!!!
sfidx=0;
SFILES=$(echo ./src/*.{js,css,voff})
DST=./dst/
for file in $SFILES
do
    new=${file##*/}              # basename
    new="${DST}${new%\.*}.js"    # escape \ the .
    echo "copying $file to $new" # sanity check
    cp $file "$new"
    sfidx=$((sfidx+1))
done
echo Number of target files detected in srcdir $sfidx!!!!!!!!!!
With three files in ./src all named "gash" I get:
Process web binary files!
Processing the name change for js files!!!!!!!!!!!!!
copying ./src/gash.js to ./dst/gash.js
copying ./src/gash.css to ./dst/gash.js
copying ./src/gash.voff to ./dst/gash.js
Number of target files detected in srcdir 3!!!!!!!!!!
(You might be able to get around this using eval, but that can be a security issue.)
new=${file##*/} - remove the longest string on the left ending in / (remove leading directory names, as basename). If you wanted to use the external non-shell basename program then it would be new=$(basename $file).
${new%\.*} - remove the shortest string on the right starting . (remove the old file extension)
A possible approach is to have the find command generate a shell script, then execute it.
src=./src
dst=./dst
find "$src" \( -name \*.js -o -name \*.woff -o -name \*.css \) \
-printf 'p="%p"; f="%f"; cp "$p" "'"${dst}"'/${f%%%%.*}.${f##*.}"\n'
This will print the shell commands you want to execute. If they are what you want,
just pipe the output to a shell:
find "$src" \( -name \*.js -o -name \*.woff -o -name \*.css \) \
-printf 'p="%p"; f="%f"; cp "$p" "'"${dst}"'/${f%%%%.*}.${f##*.}"\n'|bash
(or |bash -x if you want to see what is going on.)
If you have files named, e.g., ./src/dir1/a.xyz.js and ./src/dir2/a.uvw.js they will both end up as ./dst/a.js, the second overwriting the first. To avoid this, you might want to use cp -i instead of cp.
If you are absolutely sure that there will never be spaces or other strange characters in your pathnames, you can use less quotes (to the horror of some shell purists)
find $src \( -name \*.js -o -name \*.woff -o -name \*.css \) \
-printf "p=%p; f=%f; cp \$p ${dst}/\${f%%%%.*}.\${f##*.}\\n"|bash
Some final remarks:
%p and %f are expanded by -printf as the full pathname and the basename of the processed file. They enable us to avoid the basename command. Unfortunately, there is no such directive for the file extension, so we must use parameter expansion in the shell to compose the final name.
In the -printf argument, we must use %% to write a single percent character. Since we need two of them, there have to be four...
${f%%.*} expands in the shell as the value of $f with everything removed from the first dot onwards
${f##*.} expands in the shell as the value of $f with everything removed up to the last dot (i.e., it expands to the file extension)
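A quick sanity check of those two expansions, using one of the question's file names:

f=main.1cc794c25c00388d81bb.js
echo "${f%%.*}"   # prints: main
echo "${f##*.}"   # prints: js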

Removing Colons From Multiple FIles on Linux

I am trying to take some directories and transfer them from Linux to Windows. The problem is that the files on Linux have colons in them, and I need to copy these directories (I cannot alter them directly since they are needed as they are on the server) over to files with names that Windows can use. For example, the name of a directory on the server might be:
IAPLTR2b-ERVK-LTR_chr9:113137544-113137860_-
while I need it to be:
IAPLTR2b-ERVK-LTR_chr9-113137544-113137860_-
I have about sixty of these directories and I have collected the names of the files with their absolute paths in a file I call directories.txt. I need to walk through this file changing the colons to hyphens. Thus far, my attempt is this:
#!/bin/bash
$DIRECTORIES=`cat directories.txt`
for $i in $DIRECTORIES;
do
    cp -r "$DIRECTORIES" "`echo $DIRECTORIES | sed 's/:/-/'`"
done
However I get the error:
./my_shellscript.sh: line 10: =/bigpartition1/JKim_Test/test_bs_1/129c-test-biq/IAPLTR1_Mm-ERVK-LTR_chr10:104272652-104273004_+.fasta: No such file or directory ./my_shellscript.sh: line 14: `$i': not a valid identifier
Can anyone here help me identify what I am doing wrong and maybe what I need to do?
Thanks in advance.
This monstrosity will rename the directories in situ:
find tmp -depth -type d -exec sh -c '[ -d "{}" ] && echo mv "{}" "$(echo "{}" | tr : -)"' \;
I use -depth so it descends down into the deepest subdirectories first.
The [ -d "{}" ] is necessary because as soon as the subdirectory is renamed, its parent directory (as found by find) may no longer exist (having been renamed).
Change "echo mv" to "mv" if you're satisfied it will do what you want.

Synchronize content of directories in Linux

Let's assume I have following source directory
source/
subdir1/file1
subdir1/file2
subdir2/file3
subdir3/file4
and target directory
target
subdir1/file5
subdir2/file6
subdir4/file7
I would like to move the content of the source subdirectories into the matching target subdirectories, so the result looks like this:
target
subdir1/file1
subdir1/file2
subdir1/file5
subdir2/file6
subdir2/file3
subdir3/file4
subdir4/file7
Is there some Linux command to do this or must I write a script myself?
To summarize, it is important to move, not copy. That rules out cp and rsync but allows mv. mv, however, has the issue that it is not good at merging the old directory into the new.
In the examples that you gave, the target directory had the complete directory tree but lacked files. If that is the case, try:
cd /source ; find . -type f -exec sh -c 'mv "$1" "/target/$1"' _ {} \;
The above starts by selecting the source as the current directory with cd /source. Next, we use find which is the usual *nix utility for finding files/directories. In this case, give find the -type f option to tell it to look only for files. With the -exec option, we tell it to move any such files found to the target directory.
You have choices for how to deal with conflicts between the two directory trees. You can give mv the -f option and it will overwrite files in the target without asking, or you can give it the -n option and it will never overwrite a target file, or your can give it the -i option and it will ask you each time.
In case the target directory tree is incomplete
If the target directory tree is missing some directories that are in the source, the we have to create them on the fly. This adds just minor complication:
cd /source ; find . -type f -exec sh -c 'mkdir -p "/target/${1%/*}"; mv "$1" "/target/$1"' _ {} \;
The mkdir -p command assures that the directory we want exists before we try to move the file there.
Additional notes
The form ${1%/*} is an example of one of the shell's powerful features called "parameter expansion". This particular feature is suffix removal. In general, it looks like ${parameter%word}, which tells bash to expand word and remove it from the end of parameter. In our case, the name of the parameter is 1, meaning the first argument to the script. We want to remove the file name and just leave behind the directory that the file is in. So, the word /* tells the shell to remove the last slash and any characters which follow.
The commands above use both single and double quotes. They have to be copied exactly for the command to work properly.
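A quick illustration of that expansion with one of the example paths:

set -- ./subdir1/file1    # make $1 the path, as in the find-invoked script
echo "${1%/*}"            # prints: ./subdir1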
To sync directories, you can use rsync.
Example:
rsync -avzh source/ target/
More info: man rsync.
To move (not copy):
rsync --remove-source-files -avzh source/ target/
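Note that --remove-source-files deletes only the files that were transferred; the now-empty directories remain under source/. Assuming GNU find, they can be cleaned up afterwards with:

find source/ -depth -type d -empty -delete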

Remove all files in a directory (do not touch any folders or anything within them)

I would like to know whether rm can remove all files within a directory (but not the subfolders or files within the subfolders)?
I know some people use:
rm -f /direcname/*.*
but this assumes the filename has an extension which not all do (I want all files - with or without an extension to be removed).
Although find allows you to delete files using -exec rm {} \;, you can instead use
find /direcname -maxdepth 1 -type f -delete
which is faster. Using -delete implies the -depth option, which means directory contents are processed before the directory itself. The spelled-out -exec equivalent is:
find /direcname -maxdepth 1 -type f -exec rm {} \;
Explanation:
find searches for files and directories within /direcname
-maxdepth restricts it to looking for files and directories that are direct children of /direcname
-type f restricts the search to files
-exec rm {} \; runs the command rm {} for each file (after substituting the file's path in place of {}).
I would like to know whether rm can remove all files within a directory (but not the subfolders or files within the subfolders)?
That's easy:
$ rm folder/*
Without the -r, the rm command won't touch sub-directories or the files they contain. This will only remove the files in folder and not the sub-directories or their files.
You will see errors telling you that folder/foo is a directory and cannot be removed, but that's actually okay with you. If you want to eliminate these messages, just redirect STDERR:
$ rm folder/* 2> /dev/null
By the way, the exit status of the rm command will not be zero (because of the directories it refuses to remove), so you can't use it to check for errors. If that's important, you'll have to loop:
$ for file in *
> do
>     if [[ -f $file ]]; then
>         rm "$file" || echo "Error in removing file '$file'"
>     fi
> done
This should work in BASH even if the file names have spaces in them.
You can use
find /direcname -maxdepth 1 -type f -exec rm -f {} \;
A shell solution (without the non-standard find -maxdepth) would be
for file in .* *; do
    test -f "$file" && rm "$file"
done
Some shells, notably zsh and perhaps bash version 4 (but not version 3), have a syntax to do that.
With zsh you might just type
rm /dir/path/*(.)
and if you would want to remove any file whose name starts with foo, recursively in subdirectories, you could do
rm /dir/path/**/foo*(.)
The double-star feature (along with better interactive completion) is IMHO enough to switch to zsh for interactive shells. YMMV
The dot in the parenthesized suffix indicates that you want only regular files (not symlinks or directories) to be expanded by the zsh shell.
Unix isn't DOS. There is no special "extension" field in a file name. Any characters after a dot are just part of the name and are called the suffix. There can be more than one suffix, for example .tar.gz. The shell glob character * matches across the . character; it is oblivious to suffixes. So the MS-DOS *.* is just * in Unix.
Almost. * does not match files which start with a .. Objects named with a leading dot are, by convention, "hidden". They do not show up in ls either unless you specify -a.
(This means that the . and .. directory entries for "self" and "parent" are considered hidden.)
To match hidden entries also, use .*
The rm command does not remove directories (when not operated recursively with -r).
Try rm <directory> and see. Even if the directory is empty, it will refuse.
So, the way you remove all (non-hidden) files, pipes, devices, sockets and symbolic links from a directory (but leave the subdirectories alone) is in fact:
rm /path/to/directory/*
to also remove the hidden ones which start with .:
rm /path/to/directory/{*,.*}
This syntax is brace expansion. Brace expansion is not pattern matching; it is just a short-hand for generating multiple arguments, in this case:
rm /path/to/directory/* /path/to/directory/.*
This expansion takes place first, and then globbing takes place to generate the names to be deleted.
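As an illustration (hypothetical directory contents), note that the .* pattern also matches the . and .. entries, which rm will refuse to remove with a harmless complaint:

$ echo /path/to/directory/{*,.*}
/path/to/directory/a /path/to/directory/b /path/to/directory/. /path/to/directory/.. /path/to/directory/.hidden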
Note that various solutions posted here have various issues:
find /path/to/directory -type f -delete
# -delete is not Unix standard; GNU find extension
# without -maxdepth 1 this will recurse over all files
# -maxdepth is also a GNU extension
# -type f finds only files; so this neglects to delete symlinks, fifos, etc.
The GNU find solutions have the benefit that they work even if the number of directory entries to be deleted is huge: too large to pass in a single call to rm. Another benefit is that the built-in -delete does not have issues with passing funny path names to an external command.
The portable workaround for the problem of too many directory entries is to list the entries with ls and pipe to xargs:
( cd /path/to/directory ; ls -a | xargs rm -- )
The parentheses mean "do these commands in a sub-process"; this way the effect of the cd is forgotten, which is useful in scripting. ls -a includes the hidden files.
We now include a -- after rm which means "this is the last option; everything else is a non-option argument". This guards us against directory entries whose names are indistinguishable from options. What if a file is called -rf and ends up the first argument? Then you have rm -rf ... which will blow off subdirectories.
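On GNU systems, a more robust variant of the same idea (it copes with names containing spaces or newlines, and skips . and ..) is to have find emit NUL-terminated names:

find /path/to/directory -maxdepth 1 -type f -print0 | xargs -0 rm --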
The easiest way to do this is to use:
rm *
In order to remove directories, you must specify the option -r
rm -r
so your directories and anything contained in them will not be removed by using
rm *
Per the man page for rm, its purpose is to remove files, which is why this works.