Bash script for moving files and their parent directory - linux

I have searched, and a lot of topics dance around what I am trying to accomplish. I have over 2,000 m4a files buried in with 17,000 mp3s. The directory structure is /home/me/Music/MP3/Artist/Album/song.m4a. I want to use 'find' to discover the m4a songs and move them, together with their album and artist directories, to /home/me/Music/M4A/Artist/Album/song.m4a. I have been unsuccessful with find's -exec switch and mv, and I have been unsuccessful using basename and/or dirname to create a script. The parent and grandparent directories have me thrown. Moving the files themselves is not a problem; the problem is creating the directory structure AND moving the files. In a piecemeal effort, I have exported the file list with find /home/me/Music/MP3 -name "*.m4a" >> dir.sh (partly because I wanted to see the file locations and the number of songs). I then ran sed 's/MP3/M4A/g' dir.sh to replace MP3 with M4A. Dropping the song.m4a part from each entry, as in this sample from dir.sh, would leave me with a list of Artist/Album directories to run through a while loop with mkdir: /home/me/Music/M4A/Metallica/Re-load/Metallica - The Unforgiven.m4a. Unfortunately, this is where I am stuck: dirname yields a '.'.

find /home/me/Music/MP3 -name \*.m4a | sed -e 's/.*/mkdir -p "$(dirname "&M4A")"; mv "&" "&M4A"/' | sed -e 's/MP3\([^"]*\)M4A"/M4A\1"/g' > moveit_baby.sh
bash moveit_baby.sh
should do the job, but check "moveit_baby.sh" before you call it.
Depending on your sed implementation you will need \(\) or plain () in the second sed. Of course the string "MP3" should neither be part of Artist, Album or song name, otherwise you need a more complex filter, see below.
You might optimize further by inserting mkdir -p only when the dirname changes. More complex decisions like that are easier to make in a while read loop:
find /home/me/Music/MP3 -name \*.m4a | while IFS= read -r file
do
# do anything you want to "$file" here
done
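The loop body above is left open. A minimal sketch of one way to fill it in follows; the function name move_m4a and its two arguments are made up for illustration, and it assumes no file name contains a newline. It strips the source prefix from each path, recreates Artist/Album under the destination, and then moves the file.

```shell
# Hypothetical helper: move every *.m4a under $1 to the same relative
# path under $2, creating Artist/Album directories as needed.
# Assumes file names contain no newlines.
move_m4a() (
    src=${1%/}
    dst=${2%/}
    find "$src" -type f -name '*.m4a' | while IFS= read -r file; do
        rel=${file#"$src"/}                  # e.g. Artist/Album/song.m4a
        mkdir -p "$dst/$(dirname "$rel")"    # create Artist/Album under $dst
        mv -- "$file" "$dst/$rel"
    done
)
```

It would be called as move_m4a /home/me/Music/MP3 /home/me/Music/M4A. Album and artist directories that become empty stay behind under MP3; trying it on a copy of a few albums first seems prudent.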

Related

Rename multiple filename with random numeric extension after one specific alphanumeric word in Linux

I have a folder/subfolders that contain some files with filenames that end with a random numeric extension:
DWH..AUFTRAG.20211123115143.A901.3801176
DWH..AUFTRAGSPOSITION.20211122002147.A901.3798013
I would like to remove everything after A901 from the above filenames.
For example:
DWH..AUFTRAG.20211123115143.A901 (remove this .3801176)
DWH..AUFTRAGSPOSITION.20211122002147.A901 (remove this .3798013) from the filename
How do I use rename or any other command in Linux to remove everything after A901 from the file name while keeping the rest of the name as it is?
I can see there are five '.' (dots) before the number, so I used a quick workaround.
I made some files in a folder, and also made a subfolder with some files inside it, all following the name pattern that you gave.
I came up with this command:
find "$PWD"|grep A901|while read F; do mv "${F}" `echo ${F}|cut -d . -f 1-5`;done
When executed it worked for me.
terminal output below.
rexter@rexter:~/Desktop/test$ find $PWD
/home/rexter/Desktop/test
/home/rexter/Desktop/test/test1
/home/rexter/Desktop/test/test1/DWH..AUFTRAG.20211123115143.A901.43214
/home/rexter/Desktop/test/test1/DWH..AUFTRAGSPOSITION.2021112200fsd2147.A901.31244324
/home/rexter/Desktop/test/DWH..AUFTRAG.20211123115143.A901.321423
/home/rexter/Desktop/test/DWH..AUFTRAGSPOSITION.20211122002147.A901.3124325
rexter@rexter:~/Desktop/test$ find "$PWD"|grep A901|while read F; do mv "${F}" `echo ${F}|cut -d . -f 1-5`;done
rexter@rexter:~/Desktop/test$ find $PWD
/home/rexter/Desktop/test
/home/rexter/Desktop/test/test1
/home/rexter/Desktop/test/test1/DWH..AUFTRAG.20211123115143.A901
/home/rexter/Desktop/test/test1/DWH..AUFTRAGSPOSITION.2021112200fsd2147.A901
/home/rexter/Desktop/test/DWH..AUFTRAG.20211123115143.A901
/home/rexter/Desktop/test/DWH..AUFTRAGSPOSITION.20211122002147.A901
rexter@rexter:~/Desktop/test$
I don't know if this is the proper way to do it, but it makes things work.
Let me know if it is useful to you.
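One caveat with the cut approach is that it counts dots in the whole path, so a dot in any directory name shifts the fields. A sketch that avoids this by using shell parameter expansion instead of cut is shown below; the function name strip_after_a901 is invented for illustration, and it assumes no file name contains a newline.

```shell
# Hypothetical helper: under directory $1, rename every file whose name
# contains ".A901." so that the name ends at ".A901".
# Assumes file names contain no newlines.
strip_after_a901() (
    find "$1" -type f -name '*.A901.*' | while IFS= read -r f; do
        # ${f%.A901.*} drops the last ".A901." and everything after it
        mv -- "$f" "${f%.A901.*}.A901"
    done
)
```

Running strip_after_a901 "$PWD" should give the same result as the command above on the sample files, without depending on how many dots appear earlier in the path.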

How to search multiple DOCX files for a string within a Word field?

Is there any Windows app that will search for a string of text within fields in a Word (DOCX) document? Apps like Agent Ransack and its big brother FileLocator Pro can find strings in the Word docs but seem incapable of searching within fields.
For example, I would like to be able to find all occurrences of the string "getProposalTranslations" within a collection of Word documents that have fields with syntax like this:
{ AUTOTEXTLIST \t "<wr:out select='$.shared_quote_info' datasource='getProposalTranslations'/>" }
Note that string doesn't appear within the text of the document itself but rather only within a field. Essentially the DOCX file is just a zip file, I believe, so if there's a tool that can grep within archives, that might work. Note also that I need to be able to search across hundreds or perhaps thousands of files in many directories, so unzipping the files one by one isn't feasible. I haven't found anything on my own and thought I'd ask here. Thanks in advance.
This script should accomplish what you are trying to do. Let me know if that isn't the case. I don't usually write entire scripts because it can hurt the learning process, so I have commented each command so that you might learn from it.
#!/bin/sh
# Create ~/tmp/WORDXML folder if it doesn't exist already
mkdir -p ~/tmp/WORDXML
# Iterate through each file passed to this script
for FILE in "$@"; do
{
# unzip it into ~/tmp/WORDXML (-d sets the target directory, so
# relative file names given on the command line keep working)
# > /dev/null 2>&1 discards all output to the terminal
unzip -o "$FILE" -d ~/tmp/WORDXML > /dev/null 2>&1
# find all of the xml files
find ~/tmp/WORDXML -type f -name '*.xml' | \
# open them in xmllint to make them pretty. Discard errors.
xargs xmllint --recover --format 2> /dev/null | \
# search for and report if found
grep 'getProposalTranslations' && echo " [^ found in file '$FILE']"
# remove the temporary contents
rm -rf ~/tmp/WORDXML/*
}; done
# remove the temporary folder
rm -rf ~/tmp/WORDXML
Save the script wherever you like. Name it whatever you like. I'll name it docxfind. Make it executable by running chmod +x docxfind. Then you can run the script like this (assuming your terminal is running in the same directory): ./docxfind filenames...
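Since the question mentions hundreds or thousands of files in many directories, here is a rough sketch of a variant that walks a directory tree and streams the XML parts with unzip -p instead of extracting them to disk. The function name docx_grep and its arguments are invented for illustration. It assumes Info-ZIP unzip is installed and that the search string is stored contiguously in the XML; Word can split field text across several runs and escapes characters such as < and ', so a short plain token like getProposalTranslations is the safer thing to search for.

```shell
# Hypothetical helper: print each .docx under directory $2 whose XML
# parts contain the fixed string $1. Nothing is extracted to disk.
# Assumes Info-ZIP unzip and file names without newlines.
docx_grep() (
    find "$2" -type f -name '*.docx' | while IFS= read -r doc; do
        # -p writes the matching archive members to stdout
        if unzip -p "$doc" 'word/*.xml' 2>/dev/null | grep -qF -- "$1"; then
            printf '%s\n' "$doc"
        fi
    done
)
```

For example, docx_grep getProposalTranslations /path/to/documents would list the matching documents, one per line.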

Sort files according to their filetype

After an HD problem and some work, I have a bunch of files with names like "f1234", "f1235", etc.
My goal is to sort these files according to their filetype. For example, I want to move all the PDF files into the "pdfs" directory.
For one file, I can run "file f1234", and if it's a PDF, I can "mv f1234 pdfs/". But I have thousands of files... Can you help me with a bash or zsh command to sort all the PDFs in one pass? Thanks
The hard part here is reliably turning the output of file into a directory name. I think probably the best candidate for that is the mime-type of the file rather than the human readable output of file. I'd use something like:
mkdir sorted
for f in f*
do
d=$(file -b --mime-type "$f" | tr / -)
mkdir -p "sorted/$d"
mv "$f" "sorted/$d/"
done
Obviously I'd test that out a bit before running it on your files, but something pretty close to that should work.
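One way to test it first is a dry run that only reports which directories the loop would create and how many files would go into each. The sketch below does that; the function name preview_types is made up for illustration, and it relies on the same file -b --mime-type call as the loop above.

```shell
# Hypothetical helper: count how many of the given files would land in
# each MIME-type directory, without moving anything.
preview_types() {
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip directories and unmatched globs
        file -b --mime-type "$f" | tr / -
    done | sort | uniq -c
}
```

Calling preview_types f* prints one line per would-be directory with a file count in front, which makes unexpected types easy to spot before anything is moved.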

How to make this (l)unix script dynamically accept directory name in for-loop?

I am teaching myself more (l)unix skills and wanted to see if I could begin to write a program that will eventually read all .gz files and expand them. However, I want it to be super dynamic.
#!/bin/bash
dir=~/derp/herp/path/goes/here
for file in $(find dir -name '*gz')
do
echo $file
done
So when I execute this file, I simply go
bash derp.sh.
I don't like this. I feel the script is too brittle.
How can I rework my for loop so that I can say
bash derp.sh ~/derp/herp/path/goes/here (1)
I tried re-coding it as follows:
for file in $*
However, I don't want to have to type in
bash derp.sh ~/derp/herp/path/goes/here/*.gz.
How could I rewrite this so I could simply type what is in (1)? I feel I must be missing something simple?
Note
I tried
for file in $*/*.gz and that obviously did not work. I appreciate your assistance. My sources have been a Wrox Unix text, Software Carpentry v5, and the man pages. Unfortunately, I haven't found anything that will do what I want.
Thanks,
GeekyOmega
for dir in "$@"
do
for file in "$dir"/*.gz
do
echo "$file"
done
done
Notes:
In the outer loop, dir is assigned successively to each argument given on the command line. The special form "$@" is used so that directory names that contain spaces will be processed correctly.
The inner loop runs over each .gz file in the given directory. By placing $dir in double-quotes, the loop will work correctly even if the directory name contains spaces. This form will also work correctly if the gz file names have spaces.
#!/bin/bash
for file in $(find "$@" -name '*.gz')
do
echo "$file"
done
You'll probably prefer "$@" instead of $*; if you were to have spaces in filenames, like with a directory named My Documents and a directory named Music, $* would effectively expand into:
find My Documents Music -name '*.gz'
where "$@" would expand into:
find "My Documents" "Music" -name '*.gz'
Requisite note: Using for file in $(find ...) is generally regarded as a bad practice, because it does tend to break if you have spaces or newlines in your directory structure. Using nested for loops (as in John's answer) is often a better idea, or using find -print0 and read as in this answer.
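As a sketch of a whitespace-safe alternative to for file in $(find ...), the version below lets find hand the file names to a small inline shell through -exec ... {} +, so names are never split on spaces. Note that this swaps in -exec rather than the -print0 and read combination mentioned above, because -exec ... + works in any POSIX shell; the function name list_gz is made up for illustration.

```shell
# Hypothetical helper: print every .gz file under the given directories.
# find passes the names as separate arguments to the inner sh, so spaces
# in directory or file names are preserved.
list_gz() {
    find "$@" -type f -name '*.gz' -exec sh -c '
        for file do
            printf "%s\n" "$file"    # replace with the real per-file work
        done
    ' sh {} +
}
```

In derp.sh this could be called as list_gz "$@", so that bash derp.sh ~/derp/herp/path/goes/here works as requested in (1).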

Compare files content with similar names on two folders

I have two folders (I'll use database names as example):
MongoFolder/
CassandraFolder/
These two folders have similar files inside like:
MongoFolder/
MongoFile
MongoStatus
MongoConfiguration
MongoPlugin
CassandraFolder/
CassandraFile
CassandraStatus
CassandraConfiguration
The contents of those files are also very similar: the code or configuration is the same, with only the database name changed from Mongo to Cassandra.
How can I compare these two folders so that the result shows the files missing from one or the other (for example, the file CassandraPlugin in CassandraFolder), and also checks that the contents of matching files are the same apart from the database name?
This will give you the names of the missing files (minus the database name):
find MongoFolder/ CassandraFolder/ | \
sed -e s/Mongo//g -e s/Cassandra//g | sort | uniq -u
Output:
Folder/Plugin
The following provides a full diff, including missing files and changed content:
cp -r CassandraFolder cmpFolder
# rename files
find cmpFolder -name "Cassandra*" -print | while read file; do
mongoName=`echo "$file" | sed 's/Cassandra/Mongo/'`
mv "$file" "$mongoName"
done
# fix content
find cmpFolder -type f -exec perl -pi -e 's/Cassandra/Mongo/g' {} \;
# inspect result
diff -r MongoFolder cmpFolder # or use a gui tool like kdiff3
I haven't tested this, though; feel free to fix bugs or to ask if something specific is unclear.
Instead of mv you can also use rename, but its syntax differs between flavours of Linux.
