Merge files to directories based on match of filename to directory name

I am pretty new to scripting, so please go easy on me. I am aware that there is another thread related to this, but it does not exactly cover my issue.
I have a directory containing files and another directory containing the corresponding folders that I need to move each file to. Each file corresponds to the destination directory like:
DS-123.txt
/DS-123_alotofstuffhere/
I would like to automate the move based on a match of the first 6 characters of the filename to the first 6 of the directory.
I have this:
filesdir=$(ls ~/myfilesarehere/)
dir=$(ls ~/thedirectoriesareinthisfolder/)
for i in $filesdir; do
for j in $dir; do
if [[${i:6} == ${j:6}]]; then
cp $i $j
fi
done
done
But when I run the script, I get the following error:
es: line 6: [[_DS-123_morefilenametext.fasta: command not found
I am using Linux (not sure what version on the supercomputer, sorry).

It's better to use arrays and globbing to hold the list of files and directories, instead of ls. With that change and a correction to the [[ ... ]] part, your code becomes this:
files=(~/myfilesarehere/*)
dirs=(~/thedirectoriesareinthisfolder/*)
for i in "${files[@]}"; do
    [[ -f "$i" ]] || continue # skip if not a regular file
    for j in "${dirs[@]}"; do
        [[ -d "$j" ]] || continue # skip if not a directory
        ii="${i##*/}" # get the basename of file
        jj="${j##*/}" # get the basename of dir
        if [[ ${ii:0:6} == ${jj:0:6} ]]; then
            cp "$i" "$j"
            # need to break unless a file has more than one destination directory
        fi
    done
done
The [[ -d "$j" ]] check is necessary because your dirs array could contain some files too. To be safer, I have added a check for $i being a regular file as well.
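For reference, here is a minimal sketch of the two things that went wrong in the original test. [[ is a shell keyword and must be followed by whitespace; without it, [[${i:6} expands to a single word that bash tries to run as a command, which is where "command not found" comes from. Also, ${var:6} means "everything from offset 6 onward", while ${var:0:6} means "the first 6 characters". The sample value below is only for illustration.

```shell
#!/bin/bash
name="DS-123_alotofstuffhere"   # illustrative value, not a real path

tail_part=${name:6}      # offset only: everything after the first 6 characters
head_part=${name:0:6}    # offset 0, length 6: the first 6 characters

echo "$tail_part"        # _alotofstuffhere
echo "$head_part"        # DS-123

# [[ needs whitespace after it, and ]] needs whitespace before it
if [[ $head_part == "DS-123" ]]; then
    echo "prefix matches"
fi
```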
Here is a solution that doesn't use arrays, as suggested by @triplee:
for i in ~/myfilesarehere/*; do
    [[ -f "$i" ]] || continue # skip if not a regular file
    for j in ~/thedirectoriesareinthisfolder/*; do
        [[ -d "$j" ]] || continue # skip if not a directory
        ii="${i##*/}" # get the basename of file
        jj="${j##*/}" # get the basename of dir
        if [[ ${ii:0:6} == ${jj:0:6} ]]; then
            cp "$i" "$j"
            # need to break unless a file has more than one destination directory
        fi
    done
done
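As a variation on the loops above, the inner loop can be replaced by a glob built from the file's 6-character prefix, so only candidate directories are visited. This is a sketch, not the answerer's method; copy_by_prefix is a hypothetical helper name and the two directory arguments are assumptions.

```shell
#!/bin/bash
# copy_by_prefix FILES_DIR DIRS_DIR   (hypothetical helper)
# Copies each regular file in FILES_DIR into every directory under DIRS_DIR
# whose name starts with the same first 6 characters as the file name.
copy_by_prefix() {
    local files_dir=$1 dirs_dir=$2 f name d
    for f in "$files_dir"/*; do
        [[ -f $f ]] || continue              # skip anything that is not a regular file
        name=${f##*/}                        # basename of the file
        # the prefix is quoted, so it is matched literally; only the trailing * is a glob
        for d in "$dirs_dir/${name:0:6}"*/; do
            [[ -d $d ]] || continue          # glob matched nothing
            cp -- "$f" "$d"
        done
    done
}

# Example call with the paths from the question:
# copy_by_prefix ~/myfilesarehere ~/thedirectoriesareinthisfolder
```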

Related

extracting files that doesn't have a dir with the same name

sorry for that odd title. I didn't know how to word it the right way.
I'm trying to write a script to split my wiki files into those that have a directory with the same name and those that don't. I'll elaborate further.
here is my file system:
What I need to do is print a list of the files that have a directory with the same name, and another list of those without.
So my ultimate goal is getting:
with dirs:
Docs
Eng
Python
RHEL
To_do_list
articals
without dirs:
orphan.txt
orphan2.txt
orphan3.txt
I managed to get those files with dirs. Here is my code:
getname () {
file=$( basename "$1" )
file2=${file%%.*}
echo $file2
}
for d in Mywiki/* ; do
if [[ -f $d ]]; then
file=$(getname $d)
for x in Mywiki/* ; do
dir=$(getname $x)
if [[ -d $x ]] && [ $dir == $file ]; then
echo $dir
fi
done
fi
done
But I'm stuck on getting those without. If this is the wrong way of doing this, please point me to the right one.
any help appreciated. Thanks.

Here's a quick attempt.
for file in Mywiki/*.txt; do
    nodir=${file##*/}
    if test -d "${file%.txt}"; then
        printf "%s\n" "$nodir"        # has a matching directory: standard output
    else
        printf "%s\n" "$nodir" >&3    # orphan: file descriptor 3
    fi
done >with 3>without
This shamelessly uses standard output for the non-orphans. Maybe more robustly open another separate file descriptor for that.
Also notice how everything needs to be quoted unless you specifically require the shell to do whitespace tokenization and wildcard expansion on the value of a token. Here's the scoop on that.
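The "separate file descriptor" idea mentioned above could look roughly like this. It is a sketch under the assumption that every wiki page is a .txt file and its directory has the same name minus the extension; split_wiki is a hypothetical helper name, and the output file names with and without are taken from the loop above.

```shell
#!/bin/bash
# split_wiki WIKI_DIR   (hypothetical helper)
# Writes names of .txt files that have a same-named directory to ./with
# and the remaining .txt files to ./without, leaving standard output free.
split_wiki() {
    local wiki=$1 file
    for file in "$wiki"/*.txt; do
        [[ -f $file ]] || continue               # unmatched glob or not a file
        if [[ -d ${file%.txt} ]]; then
            printf '%s\n' "${file##*/}" >&3      # non-orphan
        else
            printf '%s\n' "${file##*/}" >&4      # orphan
        fi
    done 3>with 4>without
}

# Example: split_wiki Mywiki
```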

That may not be the most efficient way of doing it, but you could take all files, remove the extension, and then check whether there is a directory with that name.
Like this (untested code):
for file in Mywiki/* ; do
    if [ -f "$file" ]; then
        dirname=$(getname "$file")   # getname is the function from the question
        if [ ! -d "Mywiki/$dirname" ]; then
            echo "$file"
        fi
    fi
done

To List all the files in current dir
list1=`ls -p | grep -v /`
To List all the files in current dir without extension
list2=`ls -p | grep -v / | sed 's/\.[a-z]*//g'`
To List all the directories in current dir
list3=`ls -d */ | sed -e "s/\///g"`
Now you can get the desired directory listing using the intersection of list2 and list3 (see "Intersection of two lists in Bash").
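One way to compute that intersection is comm -12 on two sorted lists. The sketch below builds the lists with globs instead of parsing ls; names_with_dirs is a hypothetical helper name, and stripping everything after the first dot is an assumption about how the files are named.

```shell
#!/bin/bash
# names_with_dirs DIR   (hypothetical helper)
# Prints each name that exists in DIR both as a file (extension stripped)
# and as a directory. The body runs in a subshell so the cd does not leak.
names_with_dirs() (
    cd "$1" || exit 1
    comm -12 \
        <(for f in *;  do [[ -f $f ]] && printf '%s\n' "${f%%.*}"; done | sort -u) \
        <(for d in */; do [[ -d $d ]] && printf '%s\n' "${d%/}";   done | sort -u)
)

# Example: names_with_dirs Mywiki
```

comm -12 suppresses the lines unique to either input and keeps only the common ones, which is why both lists are passed through sort first.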

How can I batch rename multiple images with their path names and reordered sequences in bash?

My pictures are kept in the folder with the picture-date for folder name, for example the original path and file names:
.../Pics/2016_11_13/wedding/DSC0215.jpg
.../Pics/2016_11_13/afterparty/DSC0234.jpg
.../Pics/2016_11_13/afterparty/DSC0322.jpg
How do I rename the pictures into the format below, with continuous sequences and 4-digit padding?
.../Pics/2016_11_13_wedding.0001.jpg
.../Pics/2016_11_13_afterparty.0002.jpg
.../Pics/2016_11_13_afterparty.0003.jpg
I'm using Bash 4.1, so only mv command is available. Here is what I have now but it's not working
#!/bin/bash
p=0
for i in *.jpg;
do
mv "$i" "$dirname.%03d$p.JPG"
((p++))
done
exit 0

Let's say you have something like .../Pics/2016_11_13/wedding/XXXXXX.jpg; then go into the directory .../Pics/2016_11_13; from there, you should have a bunch of subdirectories like wedding, afterparty, and so on. Launch this script (disclaimer: I didn't test it):
#!/bin/bash
# bash rather than sh, because (( )) is not POSIX
for subdir in *; do                          # scan directory
    [ ! -d "$subdir" ] && continue           # skip non-directory
    prognum=0                                # progressive number
    for file in "$subdir"/*; do              # scan subdirectory
        [ -f "$file" ] || continue           # skip anything that is not a regular file
        (( prognum = prognum + 1 ))          # increment progressive
        newname=$(printf %4.4d "$prognum")   # format it
        newname="$subdir.$newname.jpg"       # compose the new name
        if [ -f "$newname" ]; then           # check to not overwrite anything
            echo "error: $newname already exists."
            exit 1
        fi
        # do the job, move or copy
        cp "$file" "$newname"
    done
done
Please note that I skipped the "date" (2016_11_13) part - I am not sure about it. If you have a single date, then it is easy to add these digits in # compose the new name. If you have several dates, then you can add a nested for loop for scanning the "date" directories. One more reason I skipped this is to let you develop something by yourself, something you can be proud of...
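For readers who want to see the nested loop over the "date" directories, here is one possible sketch. It assumes the layout <pics>/<date>/<event>/*.jpg from the question, copies rather than moves, and keeps a single counter across all directories; copy_pics_flat is a hypothetical helper name. Numbers follow the shell's sorted glob order, which may differ from the order shown in the question.

```shell
#!/bin/bash
# copy_pics_flat PICS_DIR   (hypothetical helper)
# Copies PICS_DIR/<date>/<event>/*.jpg to PICS_DIR/<date>_<event>.NNNN.jpg,
# using one counter across all date and event directories.
copy_pics_flat() {
    local pics=$1 n=0 date_dir event_dir file date event new
    for date_dir in "$pics"/*/; do               # each date directory
        date=${date_dir%/}; date=${date##*/}
        for event_dir in "$date_dir"*/; do       # each event directory
            event=${event_dir%/}; event=${event##*/}
            for file in "$event_dir"*.jpg; do
                [[ -f $file ]] || continue       # unmatched glob or not a file
                n=$((n + 1))
                printf -v new '%s/%s_%s.%04d.jpg' "$pics" "$date" "$event" "$n"
                if [[ -e $new ]]; then           # never overwrite
                    echo "error: $new already exists" >&2
                    return 1
                fi
                cp -- "$file" "$new"
            done
        done
    done
}

# Example: copy_pics_flat ./Pics
```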

Using only mv and bash builtins:
#! /bin/bash
shopt -s globstar
cd Pics || exit
p=1
# recursive glob for .jpg files
for i in **/*.jpg
do
    # (date)/(event)/(filename).jpg
    if [[ $i =~ (.*)/(.*)/(.*).jpg ]]
    then
        newname=$(printf "%s_%s.%04d.jpg" "${BASH_REMATCH[@]:1:2}" "$p")
        echo mv "$i" "$newname"
        ((p++))
    fi
done
globstar is a bash 4.0 feature, and regex matching is available even in OS X's antique bash.

Linux: Piping output to unique files

I have a folder filled with hundreds of text files on which I want to run a Linux command called mint. This command outputs a text value which I want stored in unique files, one for each file I have in the folder. Is there a way to run the command using the * character to represent all my input files, while still sending the output of each run to its own file?
Example:
$ mint * > uniqueFile.krn

With the bugs fixed and caveats closed:
#!/bin/bash
# ^^^^ - bash, not sh, for [[ ]] support
for f in *; do
    [[ $f = *.krn ]] && continue # skip files already ending in .krn
    mint "$f" >"$f.krn"
done
Or, with a prefix:
for f in *; do
    [[ $f = int_* ]] && continue
    mint "$f" >"int_$f"
done
You can also avoid recreating output files that already exist unless the source file changed:
for f in *; do
    # don't process output files
    [[ $f = int_* ]] && continue
    # if a non-empty output file exists, and is newer than our source file, don't run mint again
    [[ -s "int_$f" && "int_$f" -nt "$f" ]] && continue
    # ...if we got through the above conditions, then go ahead with creating the output
    mint "$f" >"int_$f"
done
To explain:
test -s filename is true only if a file by the given name exists and is non-empty
test file1 -nt file2 is true only if both files exist, and file1 is newer than file2.
[[ ]] is a ksh-extended shell syntax derived from that for the test command, adding support for pattern-matching tests (i.e. [[ $string = *.txt ]] will be true only if $string expands to a value ending in .txt), and relaxing quoting rules (it's safe to write [[ -s $f ]], but test -s "$f" needs the quotes to work with all possible filenames).
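The -s and -nt tests described above combine into a small "is the output stale?" check. This is only a sketch; needs_update is a hypothetical helper name and is not part of mint or bash.

```shell
#!/bin/bash
# needs_update SRC OUT   (hypothetical helper)
# Succeeds (exit status 0) when OUT should be regenerated: it is missing,
# empty, or not newer than SRC.
needs_update() {
    local src=$1 out=$2
    ! [[ -s $out && $out -nt $src ]]
}

# Example use inside the loop from the answer:
# needs_update "$f" "int_$f" && mint "$f" >"int_$f"
```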

Thanks for all the suggestions! Shiping's solution worked great, I just appended a prefix to the file name. Like so:
$ for file in * ; do mint "$file" > "int_$file" ; done

Change directories in shell scripts with string variables

I have a series of directories that are only different by a numerical tag.
arr=(0 1 2 3)
i=0
while [ $i -le ${arr}]
do
dir="~Documents/seed"
dir+=${arr[i]}
echo $dir #works
cd dir #directory not found
#do other things#
done
Is it possible to do this?

This might be easier:
#!/bin/bash
for d in ~/Documents/seed*
do
    if [ -d "$d" ]; then
        echo "$d"
    fi
done
Note:
You have tarfiles in ~/Documents too (with names that also match the wildcard), so I have added an if statement that checks if it is a directory or a file and only reacts to directories.
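If you do want to build the path from the number and cd into it, two details of the original attempt matter: cd dir looks for a directory literally named dir (the variable needs to be written as "$dir"), and a tilde inside double quotes is not expanded, so "$HOME" is the safer spelling. A minimal sketch, with visit_seed_dirs as a hypothetical helper name and the numbers 0 to 3 taken from the question:

```shell
#!/bin/bash
# visit_seed_dirs BASE   (hypothetical helper)
# Enters BASE/seed0 .. BASE/seed3 in turn, skipping any that do not exist.
visit_seed_dirs() {
    local base=$1 n dir
    for n in 0 1 2 3; do
        dir="$base/seed$n"              # build the path from the numeric tag
        [[ -d $dir ]] || continue       # skip missing directories
        (
            cd "$dir" || exit           # subshell: the caller's directory is untouched
            pwd                         # do other things here
        )
    done
}

# Example: visit_seed_dirs "$HOME/Documents"
```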

Creating a pathname to check a file doesn't exist there / Permission denied error

Hello from a Linux Bash newbie!
I have a list.txt containing a list of files which I want to copy to a destination($2). These are unique images but some of them have the same filename.
My plan is to loop through each line in the text file, with the copy to the destination occurring when the file is not there, and a mv rename happening when it is present.
The problem I am having is creating the pathname to check the file against. In the code below, I am taking the filename only from the pathname, and I want to add that to the destination ($2) with the "/" in between to check the file against.
When I run the program below I get "Permission Denied" at line 9 which is where I try and create the path.
for line in $(cat list.txt)
do
file=$[ basename $line ]
path=$[ $2$file ]
echo $path
if [ ! -f $path ];
then
echo cp $line $2
else
echo mv $line.DUPLICATE $2
fi
done
I am new to this so appreciate I may be missing something obvious but if anyone can offer any advice it would be much appreciated!

Submitting this since the OP is new to Bash scripting and no good answer has been posted yet.
DESTINATION="$2"
while IFS= read -r line; do
    file="${line##*/}"
    path="$DESTINATION/$file"
    if [[ ! -f $path ]]; then
        cp "$line" "$path"
    else
        mv "$line" "$path.DUP"
    fi
done < list.txt
There is no logic for counting duplicates at present, to keep things simple (which means the code takes care of only one duplicate entry per name). As an alternative, you can remove repeated lines from list.txt beforehand (for example with uniq) to avoid the duplicate situation.

@anubhava: Your script looks good. Here is a small addition to it to work with several dupes.
It adds a number to the $path.DUP name.
UniqueMove()
{
    local COUNT=0
    # mv -n never overwrites, so keep trying higher numbers until the source is gone
    while [ -f "$1" ]
    do
        (( COUNT++ ))
        mv -n "$1" "$2$COUNT"
    done
}
while IFS= read -r line; do
    file="${line##*/}"
    path="$2/$file"
    if [[ ! -f $path ]]; then
        cp "$line" "$path"
    else
        UniqueMove "$line" "$path.DUP"
    fi
done < list.txt
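If the source files should stay where they are, a copy-only variant is possible: pick the first free name in the destination and copy to it. This is a different technique from the mv -n loop above, shown only as a sketch; copy_unique is a hypothetical helper name and the .DUP<n> suffix simply follows the naming used in this thread.

```shell
#!/bin/bash
# copy_unique SRC DEST_DIR   (hypothetical helper)
# Copies SRC into DEST_DIR under its own name, or under name.DUP1,
# name.DUP2, ... if that name is already taken. Nothing is overwritten.
copy_unique() {
    local src=$1 dest_dir=$2 name target n=0
    name=${src##*/}                     # basename of the source
    target=$dest_dir/$name
    while [[ -e $target ]]; do          # find the first free name
        n=$((n + 1))
        target=$dest_dir/$name.DUP$n
    done
    cp -- "$src" "$target"
}

# Example, with the destination passed as the script's second argument:
# while IFS= read -r line; do copy_unique "$line" "$2"; done < list.txt
```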
