How can I move/group specific folders in bash? - linux

I have a folder structure like the following:
2020-123-1
2020-123-2
2020-123-3
2020-124-1
2020-124-2
...
I need to create folders from the first 2 numbers and omit whatever follows the second dash (-). Then I need to put the prior folders under the newly created ones with the correct name.
2020-123
->2020-123-1
->2020-123-2
->2020-123-3
2020-124
->2020-124-1
->2020-124-2
I tried to write a script in bash like this:
ls -d */ > folder.txt
cut -f1,2 -d"-" folder.txt |cut -f1 -d"/" |sort|uniq > mainfolder.txt
while read line; do mkdir $line ; done < mainfolder.txt
while read line; do mv $(cut -f1,2 -d"-" $line) $line/ ; done < folder.txt
I couldn't make the last line work, I know it has issues.

Actually, you don't have to parse the directory names and build the hierarchy in separate steps. With the -p option of mkdir, an awk one-liner will do the job:
awk -F'-' '{top=$1 FS $2;printf "mkdir -p %s; mv %s %s\n",top, $0, top}' dir.txt
The output with your example:
mkdir -p 2020-123; mv 2020-123-1 2020-123
mkdir -p 2020-123; mv 2020-123-2 2020-123
mkdir -p 2020-123; mv 2020-123-3 2020-123
mkdir -p 2020-124; mv 2020-124-1 2020-124
mkdir -p 2020-124; mv 2020-124-2 2020-124
Note
This one-liner only prints the commands without executing them. Examine the output, adjust the printf format or values if needed, and pipe the output to sh once everything looks right.
I didn't quote the filenames, since your example doesn't contain any special characters. Quote them if your real names do.

So the final script is as follows:
ls -d */ | cut -f1 -d"/" > folder.txt
awk -F'-' '{top=$1 FS $2;printf "mkdir -p %s; mv %s %s\n",top, $0, top}' folder.txt |sh

In pure bash:
#!/bin/bash
for src in *-*-*; do
destdir=${src%-*}
[[ -d $destdir ]] || mkdir "$destdir" || exit
# This just prints out the command that will be called.
# Remove the "echo" in the actual script after making sure it will run as intended.
echo mv "$src" "$destdir"
done
In the script above it is assumed that each name to be moved contains exactly two dashes. If a name can contain more than two dashes, then the destdir=${src%-*} line should be replaced with these two lines:
suffix=${src#*-*-}
destdir=${src%"-$suffix"}
For detailed information, read the "Shell Parameter Expansion" section of the Bash Reference Manual.
Another good read is: Why you shouldn't parse the output of ls
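To make the difference between the two expansions concrete, here is a minimal sketch. The helper name dest_for and the sample names are made up for illustration; nothing is created or moved.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the destination directory for a folder name,
# keeping only the first two dash-separated fields.
dest_for() {
    local src=$1
    local suffix=${src#*-*-}          # everything after the second dash
    printf '%s\n' "${src%"-$suffix"}" # strip "-<suffix>" from the end
}

# Compare the two-step form with the simpler ${name%-*} form.
for name in 2020-123-1 2020-124-2 2020-123-1-extra; do
    printf '%s -> %s (simple form: %s)\n' \
        "$name" "$(dest_for "$name")" "${name%-*}"
done
```

For 2020-123-1-extra the two-step form yields 2020-123, while the simple form yields 2020-123-1, because % removes only the shortest matching suffix.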

Related

Bash regex for just numbers and dots

There's a folder with two files in it like: filename-3.0.1-extra.jar and filename-3.0.1.jar. The number and dots in the middle are the version, which can change. I'm trying to copy filename-3.0.1.jar to another folder.
Something like:
cp folder1/filename-*.jar otherfolder/
But the wildcard * matches both files. I'm trying to copy just the file without the -extra at the end. So I'm trying to match filename on just numbers and dots when I copy, something like this:
cp folder1/filename-[0-9.].jar otherfolder/.
But that's not the right syntax for the regex. Would appreciate any help here!
UPDATE:
I got it somewhat working with this:
ls | grep -e "filename-[0-9]\.[0-9]\.[0-9]\.jar"
But the regex seems a bit rigid. Is there a way to shorten it to something like "filename-([0-9]+[\.])+jar"?
So that even cases like filename-32.430.3.jar would also get captured?
Using extglob you can do this:
shopt -s extglob
cp folder1/filename-+([0-9.]).jar otherfolder/
Here +([0-9.]) will match 1 or more of any digits or dots.
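If you want to check the pattern before copying anything, a small sketch like the following may help. The function name is_plain_release is hypothetical, and the names are just strings, not real files.

```shell
#!/usr/bin/env bash
shopt -s extglob   # enable extended patterns before bash parses the function below

# Hypothetical helper: succeed only for names of the form
# filename-<digits and dots>.jar
is_plain_release() {
    [[ $1 == filename-+([0-9.]).jar ]]
}

for name in filename-3.0.1.jar filename-32.430.3.jar filename-3.0.1-extra.jar; do
    if is_plain_release "$name"; then
        echo "match:    $name"
    else
        echo "no match: $name"
    fi
done
```

Only the first two names match; the -extra one is rejected because of the letters and the extra dash before .jar.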
Based on your edited question it appears you're trying to use a grep with a regular expression. You can use this grep solution:
printf '%s\n' *.* | grep -E '^filename-([0-9]+\.)+jar$'
filename-3.0.1.jar
you can do something like
cp "folder1/${##*.}" otherfolder
or
cd folder1 && cp -r -v $(ls | grep -E '^filename-[0-9.]+\.jar$') ../otherfolder/ && cd ..
Given:
$ ls -1 *.jar
filename-3.0.1-extra.jar
filename-3.0.1.jar
You can use a loop and filter out those that match *-extra*:
for fn in *.jar; do # with this glob, what DO you want
[[ $fn != @(*-extra*) ]] && echo "$fn" # and what you DONT want
done
Prints:
filename-3.0.1.jar
So your loop could be:
for fn in *.jar; do
[[ $fn != @(*-extra*) ]] && cp "$fn" otherfolder/
done

Commands work on terminal but not in shell script

The following commands work in my terminal but not in my shell script. I later found out that my terminal shell was /bin/tcsh. Can somebody tell me what changes I need to make for /bin/sh? Here are the commands I need to change:
cp source_dir/*/dir1/*.xml destination_dir/
Error in sh-> cp: cannot stat `source_dir/*/dir1/*.xml': No such file or directory
sed -i "s+${initial_name}+${final_name}+" $file_name
This one does not complain, but it does not work either.
I am adding an example for testing. The code is meant to rename the xml files and also the names inside the xml files. For example:
The file name crr.ya.na.aa.xml should be changed to aa.xml
The same name inside crr.ya.na.aa.xml should also be changed from crr.ya.na.aa to aa
Here is the code:
#!/bin/sh
# Create dir structure for testing
rm -rf audience
mkdir audience
mkdir audience/dir1 audience/dir2 audience/dir3
mkdir audience/dir1/ipxact audience/dir2/ipxact audience/dir3/ipxact
touch audience/dir1/ipxact/crr.ya.na.aa.xml
echo "<spirit:name>crr.ya.na.aa</spirit:name>" > audience/dir1/ipxact/crr.ya.na.aa.xml
touch audience/dir2/ipxact/crr.ya.na.bb.xml
echo "<spirit:name>crr.ya.na.bb</spirit:name>" > audience/dir2/ipxact/crr.ya.na.bb.xml
touch audience/dir3/ipxact/crr.ya.na.cc.xml
echo "<spirit:name>crr.ya.na.cc</spirit:name>" > audience/dir3/ipxact/crr.ya.na.cc.xml
# Create a dir for ipxact_drop files if it does not exist
mkdir -p ipxact_drop
rm -rf ipxact_drop/*
cp audience/*/ipxact/*.xml ipxact_drop/
ls ipxact_drop/ > ipxact_drop_files.log
cat ipxact_drop_files.log | \
awk '{ split($0,a,"."); print a[length(a)-1] "." a[length(a)] }' ipxact_drop_files.log > file_names.log
cat ipxact_drop_files.log | \
awk '{ split($0,a,"."); print "mv ipxact_drop/" $0 " ipxact_drop/" a[length(a)-1] "." a[length(a)] }' ipxact_drop_files.log > command.log
chmod +x command.log
./command.log
while read line
do
echo ipxact_drop/$line
initial_name=`grep -m 1 crr ipxact_drop/$line | sed -e 's/<spirit:name>//' | sed -e 's/<\/spirit:name>//' `
final_name="${line%.*}"
echo $initial_name
echo $final_name
sed -i "s+${initial_name}+${final_name}+" ipxact_drop/$line
done < file_names.log
echo " ***** SCRIPT RUN FINISHED *****"
Only the sed command at the end is not working.
I was reading some other posts and understood that xml files can cause problems in scripts. Here is what has worked for me up to now:
To remove the cp error: replace #!/bin/sh -f with #!/bin/sh. For sh, the -f option disables filename expansion (globbing), so the * patterns were passed to cp literally.
To remove the sed error for the test input: replace sed -i ...... with sed -i.back ....
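The effect of -f can be reproduced in isolation with set -f, which turns on the same option inside a running script. This is only a sketch; the scratch directory and file names are made up.

```shell
#!/bin/sh
# Scratch directory with two XML files (names are illustrative only).
tmp=$(mktemp -d)
touch "$tmp/a.xml" "$tmp/b.xml"

# Print how many arguments were received.
count_args() { echo "$#"; }

set -f                                    # same option as "#!/bin/sh -f"
noglob_count=$(count_args "$tmp"/*.xml)   # pattern stays literal: one argument
set +f                                    # restore filename expansion
glob_count=$(count_args "$tmp"/*.xml)     # pattern expands: two arguments

echo "with -f: $noglob_count argument(s), without: $glob_count argument(s)"
rm -rf "$tmp"
```

With the option set, cp would receive the unexpanded pattern as a file name, which matches the "cannot stat" error quoted in the question.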

How do I copy files at same location ending with "*100000.prm" with different name "*full.prm" in linux?

#!/bin/bash
for FILE in *1000000.wgt; do
BASE=${FILE%1000000.wgt}
[[ -e $BASE.trs && -e $BASE.1000000.wgt ]] && cp "$FILE" "$BASE.trs" "$BASE.wav" /some/dir
done
This script does what you need according to your comment.
eg: 'xyz_100000.prm' is to be copied with name 'xyz_full.prm' at the same location.
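#!/bin/sh
# Glob results are not word-split, so no IFS change is needed.
for FILE in *1000000.prm; do
# Replace the trailing "1000000.prm" with "full.prm" (the dot is escaped).
new_name=$(echo "$FILE" | sed 's/1000000\.prm$/full.prm/')
cp "$FILE" "$new_name"
done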
Demonstration:
➜ ls
a1000000.prm b1000000.prm copy.sh
➜ ./copy.sh
➜ ls
afull.prm bfull.prm copy.sh
I'd suggest this:
for i in *1000000.prm; do cp "$i" "${i%1000000.prm}full.prm"; done
Read the Parameter Expansion section of the bash man page.
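As a rough illustration of that expansion, the sketch below builds the new name in a helper (full_name is a made-up name) and tries it in a scratch directory, including a name that contains a space.

```shell
#!/bin/sh
# Hypothetical helper: print the copy's name for a file ending in 1000000.prm.
full_name() {
    printf '%s\n' "${1%1000000.prm}full.prm"
}

tmp=$(mktemp -d)
touch "$tmp/xyz_1000000.prm" "$tmp/a b1000000.prm"

for f in "$tmp"/*1000000.prm; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
    cp -- "$f" "$(full_name "$f")"   # quoting keeps names with spaces intact
done

ls "$tmp"
rm -rf "$tmp"
```

The % operator removes the shortest matching suffix, so only the trailing 1000000.prm is replaced and the rest of the name is left alone.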

Linux: Update directory structure for millions of images which are already in prefix-based folders

This is basically a follow-up to Linux: Move 1 million files into prefix-based created Folders
The original question:
I want to write a shell command to rename all of those images into the
following format:
original: filename.jpg new: /f/i/l/filename.jpg
Now, I want to take all of those files and add an additional level to the directory structure, e.g:
original: /f/i/l/filename.jpg new: /f/i/l/e/filename.jpg
Is this possible to do with command line or bash?
One way to do it is to simply loop over all the directories you already have, and in each bottom-level subdirectory create the new subdirectory and move the files:
for d in ?/?/?/; do (
cd "$d" &&
printf '%.4s\0' * | uniq -z |
xargs -0 bash -c 'for prefix do
s=${prefix:3:1}
mkdir -p "$s" && mv "$prefix"* "$s"
done' _
) done
That probably needs a bit of explanation.
The glob ?/?/?/ matches all directory paths made up of three single-character subdirectories. Because it ends with a /, everything it matches is a directory so there is no need to test.
( cd "$d" && ...; )
executes ... after cd'ing to the appropriate subdirectory. Putting that block inside ( ) causes it to be executed in a subshell, which means the scope of the cd will be restricted to the parenthesized block. That's easier and safer than putting cd .. at the end.
We then collect the subdirectory names first, by finding the unique initial strings of the files:
printf '%.4s\0' * | uniq -z | xargs -0 ...
That extracts the first four letters of each filename, nul-terminating each one, then passes this list to uniq to eliminate duplicates, providing the -z option because the input is nul-terminated, and then passes the list of unique prefixes to xargs, again using -0 to indicate that the list is nul-terminated. xargs executes a command with a list of arguments, issuing the command several times only if necessary to avoid exceeding the command-line limit. (We probably could have avoided the use of xargs but it doesn't cost that much and it's a lot safer.)
The command called with xargs is bash itself; we use the -c option to pass it a command to be executed. That command iterates over its arguments using the for prefix do form (with no in list, a for loop iterates over the positional parameters). Each argument is a unique prefix; we extract the fourth character from the prefix to construct the new subdirectory and then mv all files whose names start with the prefix into the newly created directory.
The _ at the end of the xargs invocation will be passed to bash (as with all the rest of the arguments); bash -c uses the first argument following the command as the $0 argument to the script, which is not part of the positional parameters iterated over by the for loop. So putting the _ there means that the argument list constructed by xargs will be precisely $1, $2, ... in the execution of the bash command.
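The role of the _ placeholder is easy to check on its own. The following sketch uses made-up arguments (a, b, c) and does not touch any files.

```shell
#!/usr/bin/env bash
# With a placeholder: "_" becomes $0, so the loop sees a b c.
with_placeholder=$(bash -c 'for arg do printf "%s " "$arg"; done' _ a b c)

# Without it: "a" is consumed as $0, so the loop sees only b c.
without_placeholder=$(bash -c 'for arg do printf "%s " "$arg"; done' a b c)

echo "with _:    $with_placeholder"
echo "without _: $without_placeholder"

# printf reuses its format for every argument, so %.4s truncates
# each name to its first four characters.
printf '%.4s\n' filename.jpg filter.png
```

Without the placeholder, the first prefix produced by xargs would silently be dropped from the loop.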
Okay, so I've created a very crude solution:
#!/bin/bash
for file1 in *; do
if [[ -d "$file1" ]]; then
cd "$file1"
for file2 in *; do
if [[ -d "$file2" ]]; then
cd "$file2"
for file3 in *; do
if [[ -d "$file3" ]]; then
cd "$file3"
for file4 in *; do
if [[ -f "$file4" ]]; then
echo "mkdir -p ${file4:3:1}/; mv $file4 ${file4:3:1}/;"
mkdir -p "${file4:3:1}/"; mv -- "$file4" "${file4:3:1}/";
fi
done
cd ..
fi
done
cd ..
fi
done
cd ..
fi
done
I should warn that this is untested, as my actual structure varies slightly, but I wanted to keep the question/answer consistent with the original question for clarity.
That being said, I'm sure a much more elegant solution exists than this one.

Bash: Move files to specific folder if name contains one of a list of strings

I have a script that queries the Twitter API for several queries, and then writes the raw data to a file with the query in the name, plus a timestamp. I'd like to have a script that, given the list of query strings (regexs?) and for all files in a folder, if one of the query strings is a substring in that file, move it to a specific folder. Right now I have just a script with just a few dozen mv commands, but I'd like a simpler and more maintainable version. Here's an example of what I'm doing now:
mv /home/nick/TwitterSearchToDatabase/queries_for_amita/*femin* /home/nick/TwitterSearchToDatabase/queries_for_amita/feminism
mv /home/nick/TwitterSearchToDatabase/queries_for_amita/*patriarchy* /home/nick/TwitterSearchToDatabase/queries_for_amita/feminism
mv /home/nick/TwitterSearchToDatabase/queries_for_amita/*yesallwomen* /home/nick/TwitterSearchToDatabase/queries_for_amita/feminism
mv /home/nick/TwitterSearchToDatabase/queries_for_amita/*womanpower* /home/nick/TwitterSearchToDatabase/queries_for_amita/feminism
I would use a for loop:
for i in femin patriarchy yesallwomen womanpower; do
mv /home/nick/TwitterSearchToDatabase/queries_for_amita/*$i* /home/nick/TwitterSearchToDatabase/queries_for_amita/feminism
done
That way the list is in the first line so it is easy to amend.
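A slightly more defensive variant of the same loop might look like this. It is only a sketch: move_matches is a hypothetical helper, and it assumes the target folder is a feminism subdirectory of the directory being scanned, as in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move every regular file in directory $1 whose name
# contains one of the remaining arguments into $1/feminism.
move_matches() {
    local dir=$1; shift
    local dest=$dir/feminism
    local key file
    mkdir -p "$dest"
    shopt -s nullglob                 # unmatched patterns expand to nothing
    for key in "$@"; do
        for file in "$dir"/*"$key"*; do
            # -f skips directories, including the feminism folder itself
            [[ -f $file ]] && mv -- "$file" "$dest"/
        done
    done
    shopt -u nullglob
}

keywords=(femin patriarchy yesallwomen womanpower)
# Example call (path taken from the question):
# move_matches /home/nick/TwitterSearchToDatabase/queries_for_amita "${keywords[@]}"
```

Keeping the keywords in an array keeps the list in one place, and nullglob avoids an mv error when a keyword matches no file.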
I would separate the data (the words that map to feminism) from the code.
When you have more categories (feminism, so, and others), you can create one keyword file per category, named after the target folder, and check each file name against these keyword files before moving it.
With ${from} where the files come from, ${to} where you want them, and ${keyfiledir} holding the keyword files, you get something like
for keyfile in ${keyfiledir}/*; do
key="${keyfile##*/}"
find $from -type f | sed 's#.*/##' | while read -r file; do
echo "${file}" | grep -q -f "${keyfiledir}"/"${key}" && mv "${from}"/"${file}" "${to}"/"${key}"
done
done
How does that work? I tested the solution above with the following script.
from=fromdir
to=todir
keyfiledir=keyfiledir
rm -rf ${from} ${to} ${keyfiledir}
mkdir ${from} ${to} ${keyfiledir}
mkdir ${to}/feminism ${to}/so
touch ${from}/yesallwomen ${from}/women ${from}/some_femin ${from}/"help move"
cat <<EOF > ${keyfiledir}/feminism
femin
patriarchy
yesallwomen
womanpower
EOF
touch ${from}/yesallwomen ${from}/women ${from}/some_femin
cat <<EOF > ${keyfiledir}/so
stack
exchange
help
EOF
test ! -d "${from}" && echo " Wrong dir ${from}" && exit 1
test ! -d "${to}" && echo " Wrong dir ${to}" && exit 1
test ! -d "${keyfiledir}" && echo " Wrong dir ${keyfiledir}" && exit 1
for keyfile in ${keyfiledir}/*; do
key="${keyfile##*/}"
find $from -type f | sed 's#.*/##' | while read -r file; do
echo "${file}" | grep -q -f "${keyfiledir}"/"${key}" && mv "${from}"/"${file}" "${to}"/"${key}"
done
done
echo "Not moved"
ls ${from}
echo "Moved"
ls -R ${to}
A simple combination of mv and egrep should suffice. egrep can take a pattern list from a file (and then you get to use full regexp syntax, not just glob syntax.) Make sure to exclude the name of the target folder.
cd /home/nick/TwitterSearchToDatabase/queries_for_amita
mv $(ls | egrep -f patterns.txt | grep -v '^feminism$') feminism
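If file names may contain spaces, the same idea can be written without parsing ls. The sketch below is one possible variant, not the answer's method: move_by_patterns is a hypothetical helper, and the pattern file is assumed to hold one extended regex per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move regular files in directory $1 whose names match
# any extended regex listed in file $2 into directory $3.
move_by_patterns() {
    local dir=$1 patterns=$2 dest=$3 path
    for path in "$dir"/*; do
        [[ -f $path ]] || continue    # directories (including $dest) are skipped
        # Match against the bare file name, not the full path.
        if grep -qE -f "$patterns" <<< "${path##*/}"; then
            mv -- "$path" "$dest"/
        fi
    done
}

# Example call (paths taken from the question, patterns.txt as in the answer):
# cd /home/nick/TwitterSearchToDatabase/queries_for_amita
# move_by_patterns . patterns.txt feminism
```

Because only regular files are considered, the target folder does not need to be filtered out with grep -v.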
