Like a vlookup but in bash to match filenames in a directory against a ref file and return full description - linux

I am aware there isn't a special bash function to do this and we will have to build this with available tools -- e.g. sed, awk, grep, etc.
We dump files into a directory and while their filename looks random, they can be mapped to their full description. For example:
/tmp/abcxyz.csv
/tmp/efgwaz.csv
/tmp/mnostu.csv
In filemapping.dat, we have:
abcxyz, customer_records_abcxyz
efgwaz, routernodes_logs_efgwaz
mnostu, products_campaign_mnostu
We need to go through each of them in the directory recursively and rename the file with its full description. Final outcome:
/tmp/customer_records_abcxyz.csv
/tmp/routernodes_logs_efgwaz.csv
/tmp/products_campaign_mnostu.csv
I found something similar here, but I am not sure how to apply it at directory level when there is only one lookup/reference file. Please help. Thanks!

I would try something like this:
sed 's/,/.csv/;s/$/.csv/' filemapping.dat | xargs -n2 mv
Either cd to /tmp beforehand, or modify the sed command to include the path name.
The sed commands simply replace the comma and the line end with the string ".csv", so each line becomes an old-name/new-name pair that xargs -n2 passes to mv.
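For the recursive case, a loop that reads the mapping line by line may be easier to adapt. This is only a minimal sketch, assuming the "key, description" layout shown in the question and keys that contain no glob characters; the function name rename_from_map is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename_from_map MAPFILE DIR
# For every "KEY, DESCRIPTION" line, rename each KEY.csv found under DIR
# (at any depth) to DESCRIPTION.csv in the same directory.
rename_from_map() {
    local map=$1 dir=$2 key desc
    while IFS=, read -r key desc; do
        desc=${desc# }                                # drop the single space after the comma
        [ -n "$key" ] && [ -n "$desc" ] || continue   # skip blank or malformed lines
        # find does the recursive lookup; the small sh -c script keeps the
        # directory part of the path ("${1%/*}") and swaps only the file name.
        find "$dir" -type f -name "$key.csv" \
            -exec sh -c 'mv -- "$1" "${1%/*}/$2.csv"' sh {} "$desc" \;
    done < "$map"
}
```

It would be called as rename_from_map filemapping.dat /tmp. Files whose key is not listed in the mapping are left alone.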

Related

Add extra file extension to all filenames in a directory via Linux command line

I want to add ".sbd" after all files ending in ".utf8" in a directory.
I do not want to replace the extensions, but really want to add to them, so the filenames will look like "filename.utf8.sbd".
I think I should adapt the following code, but I can't work out exactly how:
for f in *.utf8 ; do mv "$f" "$f.sbd" ; done
Can anyone help me? I am very new to the command line
Thanks a bunch!
Your code should work as written: because "$f" is quoted, it also copes with spaces in the names.
For a pathologically big directory, where you may prefer to stream the names instead of expanding the glob, you can use something like this:
ls | grep '\.utf8$' | while IFS= read -r i; do mv -- "$i" "$i.sbd"; done
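If the concern is a glob that matches nothing, or names that start with a dash, a guarded version of the original for loop is one option. A minimal sketch, assuming it is run from the directory that holds the files; the function name add_suffix and its two arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add_suffix OLD_EXT NEW_EXT
# Appends ".NEW_EXT" to every file in the current directory ending in ".OLD_EXT".
add_suffix() {
    local f
    for f in ./*."$1"; do
        [ -e "$f" ] || continue   # the glob matched nothing: skip the literal pattern
        mv -- "$f" "$f.$2"        # append, do not replace, the extension
    done
}
```

Calling add_suffix utf8 sbd would turn filename.utf8 into filename.utf8.sbd.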

How to specify the tar final structure

I have this structure:
release/folder1/file1
release/folder2/file2
...
release/folderN/fileN
I want to include all those folders (folder1, folder2 ... folderN) in a tar file.
The key is that I want these folders to be in the final tar within another directory named MYAPP so when you open the tar you can see this:
MYAPP/folder1/file1
MYAPP/folder2/file2
...
MYAPP/folderN/fileN
How can I achieve this without renaming the original "release" directory and/or creating new directories?
Is it possible to achieve this just in the tar process?
Thanks
Add
--transform=s#^release/#MYAPP/#
to your tar command line.
The argument of --transform is a sed-style replace expression that tar applies to each file path before it is stored in the archive (use tar -tf to show the names of the files stored in the archive).
The expression s#^release/#MYAPP/# searches (s) for release/ at the beginning of the string (^) and replaces it with MYAPP/.
The / at the end of the search and replace strings is needed to be sure the complete name of the component is release (to not replace release.txt). The # character is just a regex delimiter. Usually / is used as a regex delimiter but we prefer to use a different delimiter here to avoid the need to escape / (because it is used in the search and replace strings).
Read more in the documentation of tar and sed.

Linux rename files based on input file

I need to rename hundreds of files in Linux from the command line, changing the unique identifier in each name. For the sake of example, I have a file containing:
old_name1 new_name1
old_name2 new_name2
and need to change the names from old to new IDs. The file names contain the IDs, but have extra characters as well. My plan is therefore to end up with:
abcd_old_name1_1234.txt ==> abcd_new_name1_1234.txt
abcd_old_name2_1234.txt ==> abcd_new_name2_1234.txt
Use of rename is obviously fairly helpful here, but I am struggling to work out how to iterate through the file of the desired name changes and pass this as input into rename?
Edit: To clarify, I am looking to make hundreds of different rename commands, the different changes that need to be made are listed in a text file.
Apologies if this is already answered. I've had a good hunt, but can't find a similar case.
rename 's/^(abcd_)old_name(\d+_1234\.txt)$/$1new_name$2/' *.txt
This should work, depending on whether you have that package (the Perl rename) installed. Also have a look at qmv (renameutils).
If you want more options, use e.g.
shopt -s globstar
rename 's/(abcd_)old_name(\d+_1234\.txt)$/$1new_name$2/' folder/**/*.txt
(finds all txt files in subdirectories of folder; the leading ^ is dropped because rename now sees the whole path, not just the file name), or, to do the same using GNU find:
find folder -type f -iname '*.txt' -exec rename 's/(abcd_)old_name(\d+_1234\.txt)$/$1new_name$2/' {} +
while read -r old_name new_name; do
rename "s/$old_name/$new_name/" *"$old_name"*.txt
done < file_with_names
In this way, you read the IDs from file_with_names and rename the files replacing $old_name with $new_name leaving the rest of the filename untouched.
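The rename command differs between distributions (the Perl version takes an expression as above, the util-linux version takes plain "from to" strings), so a variant that needs only mv may be more portable. This is a minimal sketch, assuming bash, the two-column file from the question, and IDs without glob characters; the function name rename_ids is made up here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename_ids MAPFILE
# MAPFILE holds "old_id new_id" pairs, one per line; matching *.txt files in
# the current directory get the old ID replaced by the new one.
rename_ids() {
    local map=$1 old new f
    while read -r old new; do
        [ -n "$old" ] && [ -n "$new" ] || continue   # skip blank or incomplete lines
        for f in ./*"$old"*.txt; do
            [ -e "$f" ] || continue                  # no file matched this ID
            mv -- "$f" "${f/"$old"/$new}"            # replace the first occurrence of the old ID
        done
    done < "$map"
}
```

Like the loop above, this matches by substring, so an ID such as old_name1 would also match a file containing old_name10.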
I was about to write a PHP function to do this for myself, but I came upon a faster method:
Run ls and copy and paste the directory contents from the terminal window into Excel. You may need an online tool to remove or add line breaks. Assuming that your file names are in column A, use one of the following formulas in another column:
="mv "&A1&" prefix"&A1&"suffix"
or
="mv "&A1&" "&substitute(A1,"jpeg","jpg")&"suffix"
or
="mv olddirectory/"&A1&" newdirectory/"&A1
Back in Linux, create a new file with nano rename.txt and paste in the values from Excel. They should look something like this:
mv oldname1.jpg newname1.jpg
mv oldname2.jpg newname2.jpg
Then close nano and run bash rename.txt. Bash just runs every line in the file as if you had typed it.
And you are done! This method gives verbose output on errors, which is handy.

How to use a string as an argument in a for ... in ... statement in Linux shell

I'm familiar with the structure of
for file in foo/folder\ with\ spaces/foo2/*.txt
do
#do some stuff...
done
However, I want to put foo/folder with spaces/foo2/*.txt into a variable and then use it. Something like this:
myDirectory="foo/folder with spaces/foo2/*.txt"
for file in $myDirectory
do
# do some stuff
done
But what I've written here doesn't work, and it won't work even if I do
myDirectory="foo/folder\ with\ spaces/foo2/*.txt"
or
for file in "$myDirectory" ...
Any help? Is this even possible?
Don't parse ls:
# your files are expanded here
# note lack of backslashes and location of quotes
myfiles=("foo/folder with spaces/foo2/"*.txt)
# iterate over the array with this
for file in "${myfiles[@]}"; do ...
Parsing ls is a bad idea, instead just do the shell globbing outside of the quotes.
You could also do:
mydir="folder/with spaces"
for file in "$mydir"/*; do
...
done
Also look into how find and xargs work. Many of these sorts of problems can be solved using those. Look at the -print0 and -0 options in particular if you want to be safe.
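As a rough illustration of the -print0 idea, the loop below reads NUL-separated names from find, so spaces and even newlines in paths survive. It assumes bash and a find that supports -print0 and -maxdepth (GNU and BSD find do); list_txt is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list_txt DIR
# Visits every *.txt file directly inside DIR, whatever characters its name contains.
list_txt() {
    local dir=$1 file
    # read -d '' splits on the NUL bytes that -print0 emits between names.
    while IFS= read -r -d '' file; do
        printf 'found: %s\n' "$file"   # do some stuff with "$file" here
    done < <(find "$dir" -maxdepth 1 -type f -name '*.txt' -print0)
}
```

The same stream can be handed to xargs with its -0 option, for example find "$mydir" -name '*.txt' -print0 | xargs -0 wc -l.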
Try using the ls command in the for loop. This works for me:
for file in `ls "$myDirectory"`

Shell script to use a list of filenames in a CSV to copy files from one directory to another

I have a list of files that I need to copy. I want to recursively search a drive and copy those files to a set location if the filename exists in the list. The list is a text file.
The text file would look something like this:
A/ART-FHKFX1.jpg
B/BIG-085M.jpg
B/BIG-085XL.jpg
L/LL-CJFK.jpg
N/NRT-56808EA.jpg
P/PFE-25.10.jpg
P/PFE-7/60.jpg
P/PFE-7L.20.jpg
P/PFE-8.25.jpg
P/PFE-9.15.jpg
P/PFE-D11.1.tiff
P/PFE-D11.1.tiff
P/PFE-D12.2.tiff
P/PFE-D12.2.tiff
Using find will take a lot of time; try to use locate if possible.
What should happen when there are several matches? For example, when searching for foo.bar with both a/foo.bar and b/foo.bar present, what would you do?
Your CSV seems to include a path. Given the previous point, I'll assume those paths are actually valid from where the script is run, so in that case just do this:
#!/bin/bash
# Copy each path read from stdin into the destination directory given as $1.
while IFS= read -r path; do
cp -- "$path" "$1"
done
then call it like this:
teh_script /path/to/destination < csv-file.csv
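A slightly more defensive variant might report entries that do not exist and keep going. This is a sketch under the same assumption as above (the listed paths are valid relative to the current directory); the function name copy_listed and its two arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy_listed LISTFILE DEST
# Copies every path named in LISTFILE into DEST, reporting missing entries on stderr.
copy_listed() {
    local list=$1 dest=$2 path
    mkdir -p -- "$dest"
    while IFS= read -r path; do
        [ -n "$path" ] || continue               # skip empty lines
        if [ -f "$path" ]; then
            cp -- "$path" "$dest/"
        else
            printf 'missing: %s\n' "$path" >&2   # keep going, but say what was skipped
        fi
    done < "$list"
}
```

It would be called as copy_listed csv-file.csv /path/to/destination. Files that share a base name end up overwriting each other in the destination, which ties back to the question of what to do with several matches.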
