String manipulation - getting file name from absolute file path

I have a variable containing the path to a file, which I obtain from the tk_getOpenFile function. The $file variable would be something like this:
/home/usr/Documents/Plugin-2-Linux.pdpk
I need some sort of split to get only the Plugin-2-Linux part. Please note that the path may not be the same every time. So what I need is to get the string between the last / and the .pdpk and put it in another variable, $filename.

set filename [file rootname [file tail $file]]
file tail returns the part after the last / (not counting trailing /s), and file rootname returns the part before the last dot (.).
See the Tcl man page for file.

Related

Recursively appending names of all files in a directory with exif specific png meta data field (aesthetic_score) with linux / EXIFtool

I am trying to rename all files in a directory (recursively) so that a specific metadata field is appended to the end of each png file name.
The metadata field name is "aesthetic_score", with values ranging from 1.0 to 9.0.
When I type:
exiftool -Aesthetic_score -G1 -s testn.png
the result is:
[PNG] Aesthetic_score : 7.0
This is the value I would like to append to the png file names, recursively within a directory.
Note that I would like to swap the word aesthetic for the word chad in the appended text, and that not all files will have this data field:
input file:
filename001.png (metadata aesthetic_score:7.0)
output:
filename001-chad-score-70.png
I tried to use Digikam and JExifToolGui-2.01, without success.
I am trying to perform this task on the command line, although other solutions are welcome. Thank you for your help.

So, this might work for you; I can't really test it. Note that you need to remove the echo before the mv for it to actually rename the files rather than just show what it would do.
while IFS= read -r name
do
newname=$(exiftool -G1 -s "$name" | awk '$2~/FileName/{name=$4}; $2~/Aesthetic_score/{basename=gensub(/^(.+)\....$/,"\\1","1",name);ext=gensub(/^.*\.(...)$/,"\\1","1",name);gsub(/\./,"",$4);print basename"-chad-score-"$4"."ext}')
# skip files without the field; keep the renamed file in its original directory
[ -n "$newname" ] && echo mv "$name" "$(dirname "$name")/$newname"
done < <(find . -iname '*.png')
Basically, the find at the very end finds all the pngs.
The while loop takes every name that find produces, passes each file through exiftool (using your specs) and parses the output using awk (gawk, since it relies on gensub), which then outputs the new name, which gets captured in the shell variable newname. Files without the field produce no new name and are skipped. File names containing spaces are not handled by the awk field splitting.
And finally the mv (without the echo) renames the files.
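The renaming rule itself can also be written with plain shell parameter expansion once the score is known. The following is only a sketch under assumptions: new_name is a hypothetical helper, not part of exiftool, and the score is passed in as an argument rather than read from the file.

```shell
# Hypothetical helper: builds the new path from a file path and a score.
new_name() {
    path=$1
    score=$2
    digits=$(printf '%s' "$score" | tr -d '.')   # 7.0 -> 70
    stem=${path%.*}                               # path without the last extension
    ext=${path##*.}                               # last extension, e.g. png
    printf '%s-chad-score-%s.%s\n' "$stem" "$digits" "$ext"
}

new_name ./pics/filename001.png 7.0
# prints ./pics/filename001-chad-score-70.png
```

Because the directory part of the path is left untouched, the renamed file stays next to the original when the result is passed to mv.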

grab a number after file extension

I have gigabytes of files with the following naming convention:
H__Flights_SCP_Log_Analysis_Log_Store_Extracted_File_Store_Aircraft_023_Logs_06Apr2021_164418_dtd_slotb_MDN_Gateway_Logs_audit_audit.log.1
The number at the end (1) is important and I need to grab it. Currently my code looks like this:
#!/bin/bash
#this grabs all the log files in the folder that will be converted and places it in a variable
files=$(find ./ -name "*.log*")
#I iterate through each file in files to convert each one
for file in $files
do
#this grabs the file name except the file extension and places it in a variable
name=${file%.*}
#this converts the file and places it in a file with the same name plus a csv extension
ausearch -if "$file" --format csv >> "$name.csv"
done
This works fine to convert the logs and name them except that it does not grab the number at the end of the file extension. How could I grab that?

Continuing with OP's current use of parameter substitution ...
$ file='H__Flights_SCP_Log_Analysis_Log_Store_Extracted_File_Store_Aircraft_023_Logs_06Apr2021_164418_dtd_slotb_MDN_Gateway_Logs_audit_audit.log.1'
$ mynum="${file##*.}"
$ echo "${mynum}"
1
# or
$ mynum="${file//*./}"
$ echo "${mynum}"
1
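If some of the files end in .log with no number after it, the same expansion would return "log" instead. A minimal sketch of a guard for that case follows; log_number is a hypothetical helper name, and the file names are shortened stand-ins for the real ones.

```shell
# Hypothetical helper: prints the trailing number of a rotated log name,
# or fails if the text after the last dot is not purely digits.
log_number() {
    suffix=${1##*.}                 # text after the last dot
    case $suffix in
        ''|*[!0-9]*) return 1 ;;    # empty or contains a non-digit
        *) printf '%s\n' "$suffix" ;;
    esac
}

log_number 'audit_audit.log.1'                      # prints 1
log_number 'audit_audit.log' || echo 'no number'    # prints no number
```

Inside the conversion loop, the returned number could then be appended to the csv name so that audit.log.1 and audit.log.2 no longer write to the same output file.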

Iterate through files in a directory, create output files, linux

I am trying to iterate through every file in a specific directory (called sequences), and perform two functions on each file. I know that the functions (the 'blastp' and 'cat' lines) work, since I can run them on individual files. Ordinarily I would have a specific file name as the query, output, etc., but I'm trying to use a variable so the loop can work through many files.
(Disclaimer: I am new to coding.) I believe that I am running into serious problems with trying to use my file names within my functions. As it is, my code will execute, but it creates a bunch of extra unintended files. This is what I intend for my script to do:
Line 1: Iterate through every file in my "sequences" directory. (All of which end with ".fa", if that is helpful.)
Line 3: Recognize the filename as a variable. (I know, I know, I think I've done this horribly wrong.)
Line 4: Run the blastp function using the file name as the argument for the "query" flag, always use "database.faa" as the argument for the "db" flag, and output the result in a new file that has the same name as the initial file, but with ".txt" at the end.
Line 5: Output parts of the output file from line 4 into a new file that has the same name as the initial file, but with "_top_hits.txt" at the end.
for sequence in ./sequences/{.,}*;
do
echo "$sequence";
blastp -query $sequence -db database.faa -out ${sequence}.txt -evalue 1e-10 -outfmt 7
cat ${sequence}.txt | awk '/hits found/{getline;print}' | grep -v "#">${sequence}_top_hits.txt
done
When I ran this code, it gave me six new files derived from each file in the directory (and they were all in the same directory - I'd prefer to have them all in their own folders. How can I do that?). They were all empty. Their suffixes were, ".txt", ".txt.txt", ".txt_top_hits.txt", "_top_hits.txt", "_top_hits.txt.txt", and "_top_hits.txt_top_hits.txt".
If I can provide any further information to clarify anything, please let me know.

If you're only interested in *.fa files I would limit your input to only those matching files like this:
for sequence in sequences/*.fa;
do

I can suggest the following improvements:
for fasta_file in ./sequences/*.fa # ";" is not necessary if you already have a new line for your "do"
do
    # ${variable%something} is $variable with the trailing
    # string "something" removed
    # basename path/to/file is the name of the file
    # without the full path
    # $(some command) allows you to use the result of the command as a string
    # Combining the above, we can form a string based on our fasta file
    # This string can be useful to name stuff in a clean manner later
    sequence_name=$(basename "${fasta_file%.fa}")
    echo "${sequence_name}"
    # Create a directory for the results for this sequence
    # -p option avoids a failure in case the directory already exists
    mkdir -p "${sequence_name}"
    # Define the name of the file for the results
    # (including our previously created directory in its path)
    blast_results=${sequence_name}/${sequence_name}_blast.txt
    blastp -query "${fasta_file}" -db database.faa \
        -out "${blast_results}" \
        -evalue 1e-10 -outfmt 7
    # Define a file name for the top hits
    top_hits=${sequence_name}/${sequence_name}_top_hits.txt
    # alternatively, using "%"
    #top_hits=${blast_results%_blast.txt}_top_hits.txt
    # No need to cat: awk can take a file as argument
    awk '/hits found/{getline;print}' "${blast_results}" \
        | grep -v "#" > "${top_hits}"
done
I made more intermediate variables, with (hopefully) meaningful names.
I used \ to escape line ends and allow putting commands in several lines.
I hope this improves code readability.
I haven't tested. There may be typos.
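To see what the name-building steps produce without running blastp, here is a small sketch; the path ./sequences/sample1.fa is a made-up example and does not need to exist.

```shell
fasta_file=./sequences/sample1.fa            # hypothetical input path
without_ext=${fasta_file%.fa}                # ./sequences/sample1
sequence_name=$(basename "$without_ext")     # sample1
top_hits=${sequence_name}/${sequence_name}_top_hits.txt

echo "$sequence_name"   # prints sample1
echo "$top_hits"        # prints sample1/sample1_top_hits.txt
```

Only string operations are involved here, so it is a cheap way to check the output names before starting a long run.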

You should be using *.fa if you only want files with a .fa ending. Additionally, if you want to write your output to new folders, you need to create those directories first using
mkdir 'folder_name'
then you need to point your -o outputs at those folders, something like this
'command' -o /path/to/output/folder
To help you test this script out, you can run each line one by one to test them. You need to make sure each line works by itself before combining.
One last thing: be careful with your use of semicolons. It should look something like this:
for filename in *.fa; do 'command'; done

All files in one dir, linux

Today I tried a script in linux to get all files in one dir. It was pretty straightforward, but I found something interesting.
#!/bin/bash
InputDir=/home/XXX/
for file in $InputDir'*'
do
echo $file
done
The output is:
/home/XXX/fileA /home/XXX/fileB
But when I just input the dir directly, like:
#!/bin/bash
InputDir=/home/XXX/
for file in /home/XXX/*
do
echo $file
done
The output is:
/home/XXX/fileA
/home/XXX/fileB
It seems that in the first script there was only one iteration, and all the file names were stored in $file in that first iteration, separated by spaces. But in the second script, each iteration stored a single file name in $file, and there was more than one iteration. What exactly is the difference between these two scripts?
Thanks very much; maybe my question is a little bit naive.

The behavior is correct and "as expected".
for file in $InputDir'*' means assign "/home/XXX/*" to $file (note the quotes). Since you quoted the asterisk, it is not expanded at this point. When the shell sees echo $file, it first expands the variable and then performs glob expansion on the unquoted result. So after the first step, it sees
echo /home/XXX/*
and after glob expansion, it sees:
echo /home/XXX/fileA /home/XXX/fileB
Only now, it will execute the command.
In the second case, the pattern /home/XXX/* is expanded before the for is executed and thus, each file in the directory is assigned to file and then the body of the loop is executed.
This will work:
for file in "$InputDir"*
but it's brittle; it will fail, for example, when you forget to add a / to the end of the variable $InputDir.
for file in "$InputDir"/*
is a little bit better (Unix will ignore double slashes in a path) but it can cause trouble when $InputDir is not set or empty: You'll suddenly list files in the / (root) folder. This can happen, for example, because of a typo:
inputDir=...
for file in "$InputDir"/*
Case matters on Unix :-)
To help you understand code like this, use set -x ("enable tracing") in a line before the code you want to debug.
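The difference in the number of iterations can be checked directly. This is a small sketch that uses a temporary directory created with mktemp -d as a stand-in for /home/XXX/; the file names fileA and fileB mirror the ones in the question.

```shell
dir=$(mktemp -d)                 # stand-in for /home/XXX
touch "$dir/fileA" "$dir/fileB"

quoted=0
for file in "$dir/"'*'; do       # asterisk quoted: one word, no expansion
    quoted=$((quoted + 1))
done

unquoted=0
for file in "$dir"/*; do         # asterisk unquoted: one word per file
    unquoted=$((unquoted + 1))
done

echo "quoted: $quoted, unquoted: $unquoted"   # prints quoted: 1, unquoted: 2
rm -r "$dir"
```

The quoted loop runs once with the literal pattern as its only value, while the unquoted loop runs once per matching file.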

The difference is the quoting of '*'. In the first case the loop only executes once, with $file equal to /home/XXX/* which then expands to all the files in the directory when passed to echo. In the second case it executes once per file, with $file equal to each file name in turn.
Bottom line - change:
for file in $InputDir'*'
to:
for file in $InputDir*
or, better, and to make it more readable - change:
InputDir=/home/XXX/
for file in $InputDir'*'
to:
InputDir=/home/XXX
for file in $InputDir/*

String Manipulation in Batch file

I'm running a FOR loop to retrieve the (absolute) path name for ALL *.properties files in the root folder: "C:\ExecutionSDKTest_10.2.2\". Now I'm trying to slice the path name so that only the file name remains. E.g. if the absolute path is "C:\ExecutionSDKTest_10.2.2\Test-101", I only want the "Test-101" part (without the quotes, of course). I have:
FOR %%G IN (C:\ExecutionSDKTest_10.2.2\*.properties) DO (
REM Ignore "C:\ExecutionSDKTest_10.2.2\" ??
java -jar %1 %%G > Logs\%%G.log
)
So the absolute file path name is stored in G, but I'd only like the file name. How can I achieve this goal?

You can use
%%~nG
to get just the file name without its extension, or
%%~nxG
if you want the file name and extension.
