renaming and replacing by string in git repo - linux

I am working on a project named XXX.
I want to replace every instance of XXX with YYY. (I wish to replace the string inside all files and also rename files/directories that contain the string XXX to YYY).
What I have done and where I'm stuck:
git checkout -b renameFix
// in zsh
sed -i -- 's/XXX/YYY/g' **/*(D.) // replace
zmv '(**/)(*XXX*)' '$1${2//XXX/YYY}' // rename files and dirs
Now, when I run git status, I get "fatal: unknown index entry format 0x2f700000" error.
Is there a different approach I can use?

Your problem is that your glob (**/*(D.)) traverses the .git directory, so sed rewrites git's own internal files.
You can either remove the D qualifier, to avoid globbing "hidden" files (files whose names start with a period), or add another qualifier to filter out paths under .git.
Something like this might work:
ignore_git() { [[ $REPLY != .git/* ]] }
printf '%s\n' **/*(D.+ignore_git)
I have added the printf so that you can verify that you list the files that you want.
You can also take a look at git ls-files which can produce a list of files that git tracks.
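As a rough sketch of the git ls-files route, the replacement can be limited to tracked files so that nothing under .git is touched. This assumes GNU sed and GNU xargs; the function name replace_in_tracked is made up for illustration and is not part of git.

```shell
#!/bin/bash
# replace_in_tracked OLD NEW: hypothetical helper, not a git command.
# Rewrites OLD to NEW in every file git tracks in the current repository.
# OLD and NEW are pasted into a sed expression unescaped, so they must not
# contain '/' or other characters that are special to sed.
replace_in_tracked() {
    local old=$1 new=$2
    # -z / -0 pass NUL-separated names, so spaces in paths are safe;
    # -r (GNU xargs) skips running sed when no files are listed.
    git ls-files -z | xargs -0 -r sed -i -- "s/$old/$new/g"
}
```

Running replace_in_tracked XXX YYY from the top of the working tree would then cover the "replace" half of the task. Tracked binary files and symlinks are passed to sed as well, so it is worth reviewing git diff before committing.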

Related

Git Diff and Copy

I am wanting to run a Static Code Analysis (PMD) report against the files that have been added or modified as part of a pull request on bitbucket. The files that have been modified etc are available locally within the pipeline image, however I need to do a git diff to identify the changes ONLY between the source branch (pulling from) and the target branch (to be merged into). I will then be executing the PMD CLI (with rulesets etc) against a directory that will contain only the "changed files" to highlight any issues with those files specifically as part of the change.
I basically want to copy out the files indicated in the git diff result. I hope this provides some more context.
I have tried finding some examples and done testing however I am just not getting it right due to my lack of understanding on these crazy linux commands :)
So far I have the below command, but results in an empty folder.
git diff --name-only --pretty $BITBUCKET_PR_DESTINATION_BRANCH $BITBUCKET_BRANCH | xargs -i {} cp {} -t ~/branch-diff/
xargs might have problems with a large number of files, since the argument list could grow too big. I propose something like:
for name in $(git diff --name-only --pretty "$BITBUCKET_PR_DESTINATION_BRANCH" "$BITBUCKET_BRANCH"); do cp "$name" ~/branch-diff/; done
As a result you will have all these files in one directory (without the directory tree). Whether that is really what you need is another question.
Firstly, the issues with your current solution:
xargs doesn't play nicely with filenames that contain spaces. You may not have that problem now, and you can work around it, but it's better to just avoid this if possible.
cp does not build a directory tree - which you can trivially verify - so it wouldn't do what you asked anyway.
git does not produce pathnames relative to the current path, but to the working tree base.
The filenames produced by git diff $BITBUCKET_PR_DESTINATION_BRANCH $BITBUCKET_BRANCH don't even have to exist in your working tree, but only in (at least one of) the branches.
If there's a diff between the two versions of a file, you haven't said which one you want copied!
A functional script using standard file tools would look something like:
#!/bin/bash
# diffcopy.sh
#
DESTDIR="$1"
BRANCH1="$2"
BRANCH2="$3"
# so relative paths match git output
SRCDIR="$(git rev-parse --show-toplevel)"
# choose the branch whose files we want to copy
git checkout "$BRANCH2"
# make the output directory
mkdir -p "$DESTDIR"
# sync the changed files
rsync -a --files-from=<(git diff --name-only "${BRANCH1}".."${BRANCH2}") "$SRCDIR" "$DESTDIR"
# restore working copy
git checkout -
There may be a better way to do this purely in git, but I don't know it.
If you have the GNU variant of cp and xargs, you can do this:
git diff --name-only -z $BITBUCKET_PR_DESTINATION_BRANCH $BITBUCKET_BRANCH |
xargs -0 cp --target-directory="$HOME/branch-diff/" --parents
This does not spawn a cp per file, but copies as many files as possible with one cp process. By specifying --target-directory, the destination can come first on the cp command, and xargs can paste as many source file names at the end of the cp command as it likes. --parents keeps the directory names of the source files.
The -z in git diff separates file names by a NUL character instead of line breaks, and the -0 of xargs knows how to take the NUL separated path list apart without stumbling over whitespace characters in file names.
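The xargs -0 / cp --parents half of that pipeline can be sketched on its own as a small function. This is only an illustration under the same GNU cp and GNU xargs assumption; copy_with_parents is a hypothetical name.

```shell
#!/bin/bash
# copy_with_parents DEST: hypothetical helper. Reads NUL-separated relative
# paths on stdin and copies them under DEST, keeping their directories.
# Requires GNU cp (--parents, --target-directory) and GNU xargs (-r).
copy_with_parents() {
    local dest=$1
    mkdir -p -- "$dest" || return
    # -r: do nothing at all if the incoming list is empty.
    xargs -0 -r cp --parents --target-directory="$dest" --
}

# Possible use, run from the top of the working tree because git prints
# paths relative to it; --diff-filter=d leaves out deleted files:
#   git diff --name-only -z --diff-filter=d \
#       "$BITBUCKET_PR_DESTINATION_BRANCH" "$BITBUCKET_BRANCH" |
#       copy_with_parents "$HOME/branch-diff"
```

The files copied are whatever is currently checked out, so the branch whose versions you want analysed has to be the one in the working tree.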

for each pair of files with the same prefix, execute code

I have a large list of directories, each of which contains a varied number of "paired" files. By paired, I mean the prefix is the same for two files, and the pairs are denoted as "a" and "b". The prefix does not follow a defined pattern either. My broader intentions are to write a bash script that will list all subdirectories in a given directory, cd into each directory, find the pairs of files, and execute a function on the pairs. Here is an example directory:
Dir1
123_a.txt
234_a.txt
123_b.txt
234_b.txt
Dir2
345_a.txt
345_b.txt
Dir3
456_a.txt
567_a.txt
678_a.txt
456_b.txt
567_b.txt
678_b.txt
I can use this code to loop through each directory:
for d in ./*/ ; do (cd "$d" && script.sh); done
In script.sh, I have been working on writing a script that will find all pairs of files (which is the problem I am struggling to figure out), and then call the function I want to apply to those files. This is the gist of what I have been trying:
for file in ./*_a.txt; do (find the paired file with *_b.txt && run_function.sh); done
I've broken the problem into needing to get the value of "*" for the _a.txt files, then searching the directory using this value for the matching _b.txt suffix, and making a subdirectory that I can put them into so I can then apply run_function.sh. So Dir1 would contain subdirectories 123 and 234.
Let me know if this doesn't make sense. The part of the problem I'm struggling with is matching files without a defined prefix.
Thanks for your help.
Use parameter expansion:
#!/bin/bash
file=123_a.txt
prefix=${file%_a.txt} # remove _a.txt from the right
second=${prefix}_b.txt
if [[ -f $second ]] ; then
run_function "$file" "$second"
fi
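Extended to a whole directory, the same parameter expansion could be wrapped in a loop over the _a.txt files. The following is a minimal sketch; for_each_pair is a made-up name, and it only looks at files directly inside the given directory.

```shell
#!/bin/bash
# for_each_pair DIR CMD [ARGS...]: hypothetical helper that runs CMD once
# per (prefix_a.txt, prefix_b.txt) pair found directly inside DIR.
for_each_pair() {
    local dir=${1%/} a b
    shift
    for a in "$dir"/*_a.txt; do
        [[ -e $a ]] || continue        # the glob matched nothing
        b=${a%_a.txt}_b.txt            # same prefix, "b" suffix
        if [[ -f $b ]]; then
            "$@" "$a" "$b"
        else
            printf 'no partner for %s\n' "$a" >&2
        fi
    done
}
```

With that defined, something like for d in ./*/; do for_each_pair "$d" run_function.sh; done would visit every subdirectory without needing cd.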

When using "mv" command for moving and renaming, script sees it as a directory instead of new filename

I have a question regarding the mv command.
In order to move and rename the file, I did mv file someDir/File2 in my terminal and it moved file into someDir with new name called File2.
However, when I do it with shell script, it sees the File2 part as a directory, instead of the new name for the file.
So I have two variables, NEWDIR=newDir and NEWF=newName
for i in *.txt ; do
mv $i $NEWDIR/$NEWF
done
I run this script, it says the following:
mv: target 'newName' is not a directory.
mv requires the destination to be a directory only if more than one source argument is given.
In this case, that can be caused by your variables being split due to lack of quoting. Use double quotes -- as http://shellcheck.net/ directs -- around all expansions.
for i in *.txt ; do
mv "$i" "$NEWDIR/$NEWF"
done
Note that only the last file iterated over will actually be left behind with the given name -- the rest will be overwritten by their successors.
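To see the word splitting in isolation, here is a small throwaway demonstration. The file and directory names are invented, and everything happens in a temporary directory.

```shell
#!/bin/bash
# Scratch directory so nothing real is touched (all names are made up).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir newDir
touch 'my file.txt'
NEWDIR=newDir
NEWF=newName
f='my file.txt'

# Unquoted: mv receives "my", "file.txt" and "newDir/newName", i.e. two
# sources, so it insists that the last argument be a directory and fails.
mv $f $NEWDIR/$NEWF 2>/dev/null || echo "unquoted mv failed"

# Quoted: mv receives exactly one source and one destination.
mv "$f" "$NEWDIR/$NEWF" && echo "quoted mv succeeded"
```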

Bash command-line to rename wildcard

In my /opt/myapp dir I have a remote, automated process that will be dropping files of the form <anything>-<version>.zip, where <anything> could literally be any alphanumeric filename, and where <version> will be a version number. So, examples of what this automated process will be delivering are:
fizz-0.1.0.zip
buzz-1.12.35.zip
foo-1.0.0.zip
bar-3.0.9.RC.zip
etc. Through controls outside the scope of this question, I am guaranteed that only one of these ZIP files will exist under /opt/myapp at any given time. I need to write a Bash shell command that will rename these files and move them to /opt/staging. For the rename, the ZIP files need to have their version dropped. And so /opt/myapp/<anything>-<version>.zip is renamed and moved to /opt/staging/<anything>.zip. Using the examples above:
/opt/myapp/fizz-0.1.0.zip => /opt/staging/fizz.zip
/opt/myapp/buzz-1.12.35.zip => /opt/staging/buzz.zip
/opt/myapp/foo-1.0.0.zip => /opt/staging/foo.zip
/opt/myapp/bar-3.0.9.RC.zip => /opt/staging/bar.zip
The directory move is obvious and easy, but the rename is making me pull my hair out. I need to somehow save off the <anything> and then re-access it later on in the command. The command must be generic and can take no arguments.
My best attempt (which doesn't even come close to working) so far is:
file=*.zip; file=?; mv file /opt/staging
Any ideas on how to do this?
for file in *.zip; do
[[ -e $file ]] || continue # handle zero-match case without nullglob
mv -- "$file" /opt/staging/"${file%-*}.zip"
done
${file%-*} removes everything after the last - in the filename. Thus, we change fizz-0.1.0.zip to fizz, and then add a leading /opt/staging/ and a trailing .zip.
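As a quick check of that expansion against the example names from the question, the destination can be computed without moving anything. staged_name is a hypothetical helper used only for this illustration.

```shell
#!/bin/bash
# staged_name FILE: hypothetical helper that prints the destination path
# for FILE without touching the filesystem.
staged_name() {
    local file=${1##*/}                 # drop any leading directory
    printf '%s\n' "/opt/staging/${file%-*}.zip"
}

for f in fizz-0.1.0.zip buzz-1.12.35.zip foo-1.0.0.zip bar-3.0.9.RC.zip; do
    printf '%s => %s\n' "$f" "$(staged_name "$f")"
done
```

Because % removes the shortest matching suffix, a dash inside <anything> is kept, whereas a dash inside the version would cut the name at the wrong place.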
To make this more generic (working with multiple extensions), see the following function (callable as a command; function body could also be put into a script with a #!/bin/bash shebang, if one removed the local declarations):
stage() {
local file ext
for file; do
[[ -e $file ]] || continue
[[ $file = *-*.* ]] || {
printf 'ERROR: Filename %q does not contain a dash and a dot\n' "$file" >&2
continue
}
ext=${file##*.}
mv -- "$file" /opt/staging/"${file%-*}.$ext"
done
}
...with that function defined, you can run:
stage *.zip *.txt
...or any other pattern you so choose.
f=foo-1.3.4.txt
echo ${f%%-*}.${f##*.}

How to remove the extension of a file?

I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.
To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
cd "$1" || exit 1
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that quoting your variables is what keeps this loop safe for filenames containing spaces; the remaining pitfall is that, if no file matches, the loop runs once with the literal string *.bak. Read David W.'s answer for a more robust solution, and this document for alternative solutions.
There are several ways to remove file suffixes:
In BASH and Kornshell, you can use parameter expansion. Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of % on a US keyboard.
If you use a double filter (i.e. ## or %%), you are trying to filter on the biggest match. If you have a single filter (i.e. # or %), you are trying to filter on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo ${file#*/} # Matches "this/" and will print out "is/my/file/name.txt"
echo ${file##*/} # Matches "this/is/my/file/" and will print out "name.txt"
echo ${file%/*} # Matches "/name.txt" and will print out "this/is/my/file"
echo ${file%%/*} # Matches "/is/my/file/name.txt" and will print out "this"
Notice that this is a glob match and not a regular expression match! If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match on the period and all characters after it. Since it is a single %, it will match on the smallest glob on the right side of the string. If the filter can't match anything, the result is the same as your original string.
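Put together, the four filters are enough to split a path into its parts. This is a small sketch with an invented file name that has two extensions, to make the difference between the single and double forms visible.

```shell
#!/bin/bash
file="this/is/my/file/name.tar.gz"

dir=${file%/*}           # shortest "/*" from the right: the directory part
base=${file##*/}         # longest "*/" from the left: the file name
stem=${base%.*}          # shortest ".*" from the right: drop the last extension
ext=${base##*.}          # longest "*." from the left: keep the last extension
first_stem=${base%%.*}   # longest ".*" from the right: drop all extensions

printf '%s\n' "$dir" "$base" "$stem" "$ext" "$first_stem"
```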
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=${file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while IFS= read -r -d $'\0' file
do
printf 'mv -- %q %q\n' "$file" "${file%.bak}"
done | tee find.out
The find command finds the files you specify. The -print0 separates the names of the files with a NUL character, which is one of the few characters not allowed in a file name. The -d $'\0' means that your input separator is the NUL character. See how nicely find -print0 and read -d $'\0' work together? IFS= and -r keep read from trimming whitespace or interpreting backslashes, and printf %q quotes each name so that the generated commands are safe to run.
You should almost never use the for file in $(find . -name "*.bak") method. This will fail if the files have any white space in the name.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
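The preview-then-run idea can also be folded into one function that prints the planned renames unless told to perform them. This is a sketch rather than a drop-in tool; strip_bak and its "run" argument are invented for illustration.

```shell
#!/bin/bash
# strip_bak DIR [run]: hypothetical helper. Prints the planned renames for
# every *.bak file under DIR; performs them only when the second argument
# is the word "run".
strip_bak() {
    local dir=$1 mode=${2:-dry} file
    while IFS= read -r -d '' file; do
        if [[ $mode == run ]]; then
            mv -- "$file" "${file%.bak}"
        else
            printf 'mv -- %q %q\n' "$file" "${file%.bak}"
        fi
    done < <(find "$dir" -type f -name '*.bak' -print0)
}
```

Calling strip_bak somedir first shows what would happen, and strip_bak somedir run then carries it out.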
rename .bak '' *.bak
(rename is in the util-linux package)
Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done
You can always use the find command to get all the subdirectories
find . -name "*.bak" -exec bash -c 'for FILENAME; do mv --force "$FILENAME" "${FILENAME%.bak}"; done' bash {} +
