Loop through directory and files with date string to find the file with highest suffix (e.g. "firsttable_20230113093000_12") - linux

I am looking to adapt a shell script to find a way to cycle through files that have different table names as well as different dates between files that have the same table name, and return the highest suffix file.
An example of my files in a given directory:
firsttable_20230112093000_1
firsttable_20230112093000_2
firsttable_20230112093000_3
firsttable_20230112093000_4
firsttable_19990202090000_1
firsttable_19990202090000_2
secondtable_20220112090000_1
secondtable_20220112090000_2
secondtable_20220112090000_3
Desired Result:
firsttable_20230112093000_4
firsttable_19990202090000_2
secondtable_20220112090000_3
What's been done
Originally I only needed to find the highest suffix as the dates would be the same for all tables, and what I had worked:
allTables=(
'firsttable'
'secondtable'
'thirdtable'
...
)
for table in "${allTables[@]}"; do
    substring="_2"
    searchString="$table$substring"
    # Check if the file for a given table exists:
    if ls "$Path/$searchString"* 1> /dev/null 2>&1; then
        echo "$searchString* files exist. Proceeding..."
        lastFile=$(ls "$Path/$searchString"* | sort -rV | head -n1)
        echo "Highest suffix file: $lastFile"
    else
        echo "File searchString not found: '$Path/$searchString' "
    fi
done
If I was to apply that to my new directory shown above, it would only be able to find:
Highest suffix file: firsttable_20230112093000_4
Highest suffix file: secondtable_20220112090000_3
I need to find a way to make the script also look at the dates and see if they are different, and if they are, treat them as such. Would this require a regex to assess the filename? The filename format stays the same: "tablename_$$$$$$$$$$$$$$_nn" (underscore placing after table and date, suffix can go above single figures, date is always 14 characters)
Thanks in advance for any help!
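One possible approach, given here as a minimal sketch rather than a definitive answer: treat everything up to and including the 14-digit date as a grouping key and remember the largest numeric suffix seen for each key. The function name highest_suffix_files is hypothetical, and the sketch assumes bash 4 or later (for associative arrays) and file names without embedded newlines.

```shell
#!/usr/bin/env bash
# Sketch: print the highest-suffix file name for every table/date pair in a directory.
# highest_suffix_files is a hypothetical name; requires bash 4+ (associative arrays).
highest_suffix_files() {
    local dir=$1 path name prefix suffix
    local re='^(.+_[0-9]{14})_([0-9]+)$'   # tablename_<14 digits>_<suffix>
    local -A best=()                       # "table_date" -> highest suffix seen so far

    for path in "$dir"/*; do
        name=${path##*/}
        [[ $name =~ $re ]] || continue     # skip names that do not fit the format
        prefix=${BASH_REMATCH[1]}
        suffix=${BASH_REMATCH[2]}
        # 10# forces base 10 so a suffix such as 08 is not read as octal
        if [[ -z ${best[$prefix]+x} ]] || (( 10#$suffix > 10#${best[$prefix]} )); then
            best[$prefix]=$suffix
        fi
    done

    for prefix in "${!best[@]}"; do
        printf '%s_%s\n' "$prefix" "${best[$prefix]}"
    done | sort
}
```

Calling highest_suffix_files "$Path" prints one file name per table/date combination, sorted by name. Because the suffixes are compared as numbers, _10 correctly beats _9.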

Related

Recursively appending names of all files in a directory with exif specific png meta data field (aesthetic_score) with linux / EXIFtool

I am trying to rename all PNG files in a directory (recursively) so that a specific metadata field is appended to the end of each file name.
The metadata field is named "aesthetic_score" and its values range from 1.0 to 9.0.
When I type:
exiftool -Aesthetic_score -G1 -s testn.png
the result is:
[PNG] Aesthetic_score : 7.0
I would like to append this value to the names of the png files, recursively within a directory.
Note: I would like to swap the word aesthetic for the word chad in the appended text, and not all files will have this data field:
input file:
filename001.png (metadata aesthetic_score:7.0)
output:
filename001-chad-score-70.png
I tried to use Digikam and JExifToolGui-2.01, without success.
I am trying to perform this task in the cmd line, although other solutions are welcome. Thank you for your help.
So, this might work for you, I can't really test it; note that you would need to get rid of the echo before the mv for it to actually do something (rename rather than just show what it would do).
while IFS= read -r name
do
newname=$(exiftool -G1 -s "$name"|awk '$2~/FileName/{name=$4}; $2~/Aesthetic_score/{basename=gensub(/^(.+)\....$/,"\\1","1",name);ext=gensub(/^.*\.(...)$/,"\\1","1",name);gsub(/\./,"",$4);print basename"."$4"."ext}')
echo mv "$name" "$newname"
done < <(find . -iname '*.png')
Basically the find at the very end finds all the pngs.
The while loop reads every name that find prints, passes each file through exiftool (using your specs), and parses the output using awk, which then outputs the new name; that name is captured in the shell variable newname.
And finally the mv (without the echo) renames the files.
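The exact target format from the question (filename001-chad-score-70.png) can also be built with plain parameter expansion once the score is known. The following is a minimal sketch: chad_name is a hypothetical helper, and the exiftool call in the comment assumes that the -s3 option (print values only) and the Aesthetic_score tag behave as expected on your files.

```shell
#!/usr/bin/env bash
# chad_name is a hypothetical helper: given a file name and a score such as 7.0,
# it prints "<base>-chad-score-<score without the dot>.<ext>".
chad_name() {
    local file=$1 score=$2
    local base=${file%.*}    # everything before the last dot (directory part included)
    local ext=${file##*.}    # everything after the last dot
    printf '%s-chad-score-%s.%s\n' "$base" "${score//./}" "$ext"
}

# Possible use inside the loop above (untested assumption about exiftool output):
#   score=$(exiftool -s3 -Aesthetic_score "$name")
#   [ -n "$score" ] && echo mv "$name" "$(chad_name "$name" "$score")"
```

Files without the field produce an empty score and are skipped by the -n check. As before, drop the echo to actually rename.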

If statement comparing variable to files in list

I am using a terminal emulator. I have a folder with save files in it and am trying to determine whether the entered text matches any file in the list.
I created a variable called saveFiles using ls, displaying only the files with .save and removing the extension from the output:
saveFiles=$(cd "${0%/*}"/save; ls *.save* | ls *.save*; cd "${0%/*}")
echo -n ">"
read -r "name"
So $saveFiles equals:
Savegame1 savegame2 savegame3
I'm trying to make an if statement that tests whether the entered variable equals any of the files in the folder.
The following script works except when I type a letter contained at the end of the file. So if one of the files is called savegame, if I type game it comes up with a match because game.save is contained in the string.
if [[ $saveFiles = *"$name".save* ]]
then
scene=$(cat "save/$name".save)
fi
I need to find a way to test whether any of the strings in $saveFiles are equal to the entered variable $name.
To reiterate, files in folder:
Save1.save
Save2.save
...
Read `$name`
If $name matches any file in the list then load scene otherwise repeat.
I hope this isn't confusing. Please feel free to ask me to clarify further. Thank you.
Maybe I am not understanding the question correctly, but why don't you first request the file name and then query the file system with precisely that name, e.g.
read name
if [[ -f "${name}.save" ]]; then
echo "Found the file ${name}.save"
fi
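To cover the "otherwise repeat" part of the question, the same file test can be wrapped in a loop. This is only a sketch: save_exists and prompt_for_save are hypothetical names, and the save directory is assumed to be ./save as in the question.

```shell
#!/usr/bin/env bash
# save_exists is a hypothetical helper: succeeds only if <dir>/<name>.save is a regular file.
save_exists() {
    local name=$1 save_dir=${2:-save}
    [[ -n $name && -f "$save_dir/$name.save" ]]
}

# prompt_for_save (also hypothetical) keeps asking until an existing save is named,
# then prints that name; it fails if the input ends first.
prompt_for_save() {
    local save_dir=${1:-save} name
    while IFS= read -r -p '>' name; do
        if save_exists "$name" "$save_dir"; then
            printf '%s\n' "$name"
            return 0
        fi
        echo "No save named '$name', try again." >&2
    done
    return 1
}
```

A caller could then write name=$(prompt_for_save) && scene=$(cat "save/$name.save"). Because the whole file name is tested, typing game no longer matches savegame.save.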

Rename file names in Linux based on conditions

I have some files in Linux directory like below.
email_Tracking_export_2018_08_26.zip
email_Tracking_export_2018_08_27.zip
email_Tracking_export_2018_08_28.zip
email_Tracking_export_2018_08_29.zip
email_Tracking_export_2018_09_03.zip
email_Tracking_export_history_Novemeber.zip
email_Tracking_export_history_December.zip
email_Tracking_export_history_january.zip
email_Tracking_export_history_february.zip
email_Tracking_export_history_march.zip
email_Tracking_export_history_April.zip
Now I want to change the files names to be like below.
email_Tracking_export_2018_08_26.zip
email_Tracking_export_2018_08_27.zip
email_Tracking_export_2018_08_28.zip
email_Tracking_export_2018_08_29.zip
email_Tracking_export_2018_09_03.zip
email_Tracking_export_2017_11_01.zip
email_Tracking_export_2017_12_01.zip
email_Tracking_export_2018_01_01.zip
email_Tracking_export_2018_02_01.zip
email_Tracking_export_2018_03_01.zip
email_Tracking_export_2018_04_01.zip
Conditions:
If the file names are already in yyyy_mm_dd format, leave them as is.
If the file names contain the month in alphabetical form, convert them to yyyy_mm_dd.
If that month has already passed in the current year, use the current year; if not, the year should be the previous year.
How can I achieve that in bash/Linux?
for f in email_Tracking_export_*.zip; do
case "$f" in
email_Tracking_export_????_??_??.zip) : ignore ;;
*) date=$(stat -c %Y "$f") # mod time in seconds
fmtdate=$(date --date="@$date" +%Y_%m_%d) # formatted
mv "$f" email_Tracking_export_$fmtdate.zip
;;
esac
done
Here are the steps, broken into small parts; you can look up how to do each one and combine them into a bash script:
Make a key value pair of mapping, which maps a month to its numerical value. (Take care of lowercase/uppercase)
For each file, check their format.
Generate the new name for each file.
Then, use the mv command to rename the file with the new name.
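The steps above can be sketched roughly as follows. The function names month_to_number and new_name are hypothetical, bash 4 or later is assumed (for the ${var,,} lower-casing), and the reference year and month are passed in explicitly so that the year rule is easy to check. Matching on the first three letters is an assumption that also tolerates misspellings such as "Novemeber".

```shell
#!/usr/bin/env bash
# month_to_number (hypothetical): map a month name to its two-digit number,
# using only the first three letters, case-insensitively.
month_to_number() {
    local key=${1,,}
    case ${key:0:3} in
        jan) echo 01 ;; feb) echo 02 ;; mar) echo 03 ;; apr) echo 04 ;;
        may) echo 05 ;; jun) echo 06 ;; jul) echo 07 ;; aug) echo 08 ;;
        sep) echo 09 ;; oct) echo 10 ;; nov) echo 11 ;; dec) echo 12 ;;
        *) return 1 ;;
    esac
}

# new_name FILE REF_YEAR REF_MONTH (hypothetical): print the name FILE should have.
new_name() {
    local file=$1 ref_year=$2 ref_month=$3 month mm year
    month=${file#email_Tracking_export_history_}
    if [[ $month == "$file" ]]; then     # not a "history" file: leave it unchanged
        printf '%s\n' "$file"
        return 0
    fi
    month=${month%.zip}
    mm=$(month_to_number "$month") || return 1
    year=$ref_year
    # A month later than the reference month has not happened yet this year
    if (( 10#$mm > 10#$ref_month )); then
        year=$(( ref_year - 1 ))
    fi
    printf 'email_Tracking_export_%s_%s_01.zip\n' "$year" "$mm"
}
```

A rename loop could then call mv -- "$f" "$(new_name "$f" "$(date +%Y)" "$(date +%m)")" for each file whose new name differs from the old one.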

bash - opening an image only when a corresponding text file exists

I came across a problem in Bash when I would try to only open images based upon the information stored in .txt files about them. I am trying to sort a number of images by size or height, and display an image with them in the sorted order, but if there exists a .jpg in the folder without a .txt file with the same name, it should not process it.
I have the sorting piece of my situation done, and am trying to figure out how I would go about opening only the images that have a .jpg extension WITH a .txt file.
I figured a solution would look like me putting every .jpg's name (without extension) in a list and then process through the list and run something like:
if [ -f "$filename.txt" ]; then ~~~
but I came across the problem of iterating through without a for-loop, or else all the pictures would open multiple times. My attempt was:
for i in *jpg; do
y=$y ${i%.jpg}
done
if [ -f $y.txt ]; then
(sorting parts)
This only looked at the last filename in y, as it should, but I am trying to figure out a way to look at each separate filename and see if there exists that textfile, in order to include it in the sorting.
Thanks so much for your help!
Collecting a list of file names in a single variable is an antipattern. You want to collect them in an array instead.
a=()
for f in *.jpg; do
    # Skip any image that has no matching .txt file
    if [ ! -e "${f%.jpg}".txt ]; then
        continue
    fi
    a+=("$f")
done
# now do things with "${a[@]}"
Frequently, you don't really need to collect the files in an array -- just do everything you were doing inside the for loop to each individual file as you traverse the files.
(And actually y=$y ${i%.jpg} doesn't append to y -- it sets y to itself for the duration of attempting to execute a file named i sans the .jpg extension, which would most likely fail in the vast majority of cases.)
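The difference between the quoted and unquoted forms can be seen in a short sketch. The file names here are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical file names, used only for illustration.
names=(photo1.jpg photo2.jpg)

# Quoted: a single assignment whose value contains a space, so y grows each time.
y=
for i in "${names[@]}"; do
    y="$y ${i%.jpg}"
done
echo "[$y]"                     # prints [ photo1 photo2]

# The form VAR=value command sets VAR only for that one command.
z=outer
z=inner bash -c 'echo "$z"'     # prints inner
echo "$z"                       # prints outer
```

This is why the unquoted y=$y ${i%.jpg} leaves y untouched and tries to run a command named after the file.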
I would do the file check first such that find just reports files that have a corresponding text file. The following snippet will just display jpg files that have a corresponding txt file:
find . -maxdepth 1 -name "*.jpg" -exec /bin/bash -c '[ -e "${0%.*}.txt" ] && echo "$0";' {} \;

How do you format output string in bash script for input by another script?

I need to unzip a bunch of student assignment (jar) files so that I can use a script to submit the contents to the Moss (Stanford) plagiarism detection server. I did the same thing in Java, which was trivial, but I'm trying to re-implement it as a bash script.
I am trying to do the following:
Get a list of student names (each student has a directory).
In each student directory, sub-directories exist numbered from 1 to the latest submission. I need to get the directory with the highest number.
Each of those submission directories contains a jar file that I need. I copy each jar into a temp directory with the same name as the student and unzip it.
I need that temp directory listing formatted as a string in the form
/tempDir/studentName1/*.languageExt /tempDir/studentName2/*.languageExt
The student directory has the basic structure:
Student_Root_Directory:
Student1
Student2
Student1
Sub-Directories: 1 2 3 4 5
1: student1.jar
2: student1.jar
...
Student2
Sub-Directories: 1 2 3
1: student2.jar
...
To do the first 3 steps above I did:
#!/bin/bash
# Extract all jar files into a temp directory called /home/moss/tempJarFiles/studentName
# $1 is the command line argument that contains the path to the institution submission dir.
# $2 is the language extension: .c, .cpp, .java, .py
students=`ls $1`
student_dir=$1
languageExt=$2
mossDir="/home/moss"
tempDir="/home/moss/tempJarStorage"
for student in $students
do
latestSubmissionDir=`ls -t $student_dir/$student | head -1`
for jarDir in $latestSubmissionDir
do
mkdir $tempDir/$student
cp $student_dir/$student/$jarDir/*.jar $tempDir/$student
unzip -d $tempDir/$student/ -o -j $tempDir/$student/$student.jar *.$languageExt
rm $tempDir/$student/$student.jar
done
done
...which results in a number of student directories being created in a temp directory that contains only the unzipped contents for the student submissions.
I need the ls output of the new temp directories formatted as a string that contains:
/tempDir/studentName1/*.languageExt /tempDir/studentName2/*.languageExt
I have tried variations on
find "$tempDir" -iname "*.$languageExt" -printf "%p/*.$languageExt"
using iname and not - but I either have output that contains extra directory information such as $tempDir/*.languageExt (when I just need the subdirectories $tempDir/$studentName/*.languageExt) or I have output where the path for every source file is also listed such as:
$tempDir/$studentName/studentNameA.java
$tempDir/$studentName/studentNameB.java
when I only need
$tempDir/$studentName/*.java
I think this should be really easy and I'm just over thinking it. Any hints for improving the script also appreciated.
Here's a revised version of the script that may work:
#!/bin/bash
# Extract all jar files into a temp directory called /home/moss/tempJarFiles/studentName
# $1 is the command line argument that contains the path to the institution submission dir.
# $2 is the language extension: c, cpp, java, py
students_dir=$1
languageExt=$2
studentPathsT=( "$students_dir"/*/ )
mossDir='/home/moss'
tempDir='/home/moss/tempJarStorage'
for studentPathT in "${studentPathsT[@]}"; do
student=$(basename "$studentPathT")
mkdir "$tempDir/$student"
submissionDirsT=( "$studentPathT"*/ )
latestSubmissionDirT=${submissionDirsT[${#submissionDirsT[@]}-1]}
cp "$latestSubmissionDirT"*.jar "$tempDir/$student/"
unzip -d "$tempDir/$student/" -o -j "$tempDir/$student/*.jar" "*.$languageExt"
rm "$tempDir/$student"/*.jar
done
# Note that at this point `"$tempDir"/*/*.$languageExt` would expand
# to all extracted submission files, across all students.
# Finally, output each student's extracted files as an unexpanded glob à la
# /{tempDir}/{studentName1}/*.{languageExt}
for pT in "$tempDir"/*/; do
echo "$pT*.$languageExt"
# Note: If there is a chance that your filenames contain
# embedded newlines (rare in practice) using `echo` won't work properly
# as @Charles Duffy points out.
# If that is a concern, use
# printf '%s\0' "$pT*.$languageExt"
# and process the output with a utility that can process NUL characters
# as separators, such as `xargs -0`.
done
It avoids using ls and only uses pathname expansion and array variables so as to properly deal with paths that contain embedded spaces and other shell metacharacters.
The suffix ...T in variable names indicates that a particular path or array of paths is *T*erminated, i.e., that it ends in a /.
The assumption is that the numbered subdirectories do not go beyond 9, as the implicit lexical sorting of pathname expansion is relied upon; if the numbers go higher, explicit numerical sorting must be applied.
Note that the globs (pathname patterns) passed to unzip are intentionally double-quoted, as they should be interpreted by unzip, not the shell.
Note that, based on your original code, I've assumed that $languageExt does NOT start with . (e.g., cpp rather than .cpp), despite what your comment says.
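For the case mentioned above where the numbered subdirectories go beyond 9, the latest one can be picked by comparing the names as numbers instead of relying on glob order. This is a minimal sketch: latest_numbered_dir is a hypothetical helper, not part of the script above.

```shell
#!/usr/bin/env bash
# latest_numbered_dir (hypothetical): print the subdirectory of $1 whose all-digit
# name is numerically largest, with a trailing slash; fail if there is none.
latest_numbered_dir() {
    local parent=${1%/} d n best=''
    for d in "$parent"/*/; do
        n=${d%/}                              # drop the trailing slash
        n=${n##*/}                            # keep only the directory name
        [[ $n =~ ^[0-9]+$ ]] || continue      # ignore non-numeric names
        # 10# forces base 10 so names such as 08 are not read as octal
        if [[ -z $best ]] || (( 10#$n > 10#$best )); then
            best=$n
        fi
    done
    [[ -n $best ]] || return 1
    printf '%s/%s/\n' "$parent" "$best"
}
```

In the loop above, the array lookup could then be replaced with latestSubmissionDirT=$(latest_numbered_dir "$studentPathT"), which keeps the trailing slash the rest of the script expects.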
