Linux: get specific field from filename - linux

I am currently learning Linux bash scripting.
I have files in a folder with the following filename pattern:
ABC01_-12ab_STRINGONE_logicMatches.txt
DEF02_-12ab_STRINGTWO_logicMatches.txt
JKL03_-12ab_STRINGTHREE_logicMatches.txt
I want to extract STRINGONE, STRINGTWO and STRINGTHREE as a list. To see if my idea works, I wanted to echo the result in bash first.
Code of my bash script (executed in the folder where the files are located):
#!/bin/bash
for element in 'folder' do out='cut -d "_" -f2 $element | echo $out' done
Actual result:
error: unexpected end of file
Desired result:
STRINGONE
STRINGTWO
STRINGTHREE
(echoed in bash)

Your idea is right, but the syntax of the file glob (matching the text files) and of the command substitution (running the cut command) is wrong. You need to do:
for file in folder/*.txt; do
    # This condition skips the loop body if no .txt files are found,
    # instead of throwing errors
    [ -f "$file" ] || continue
    # The command-substitution syntax $(..) runs the command and stores the
    # result in the variable 'out'
    out=$(cut -d "_" -f3 <<< "$file")
    echo "$out"
done
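If you would rather avoid calling cut for every file, the same split can be done by the shell itself. This is only a minimal sketch, assuming the filenames always follow the four-field underscore pattern shown above:
for file in folder/*.txt; do
    [ -f "$file" ] || continue
    # Split the basename on "_" and keep the third field
    IFS=_ read -r _ _ name _ <<< "$(basename "$file")"
    echo "$name"
done
The two leading underscore variables swallow the first two fields, and the trailing one catches the rest of the name.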

Related

Get current directory (not full path) with filename only when sub folder is present in Linux bash

I have prepared a bash script to get only the directory (not the full path) together with the file name of a given file. The directory part should be included only when the file is located in a sub-directory.
For example:
If the input is src/email/${sub_dir}/Bank_Casefeed.email, the output should be ${sub_dir}/Bank_Casefeed.email.
If the input is src/layouts/Bank_Casefeed.layout, the output should be Bank_Casefeed.layout. I can easily get this using the basename command.
src/basefolder is always constant. In some cases (after the src/email (base folder) directory), sub-directories will be present.
The script below works. I can use it (only if the module is email) to get the output, but it should also work when a sub-directory is present in other modules. Maybe I should count the directories? If there are more than two directories (src/basefolder), the script should pick up the sub-directory. Is there a better way to handle both scenarios?
#!/bin/bash
filename=`basename src/email/${sub_dir}/Bank_Casefeed.email`
echo "filename is $filename"
fulldir=`dirname src/email/${sub_dir}/Bank_Casefeed.email`
dir=`basename $fulldir`
echo "subdirectory name: $dir"
echo "concatenate $filename $dir"
Entity=$dir/$filename
echo $Entity
Using shell parameter expansion:
sub_dir='test'
files=( "src/email/${sub_dir}/Bank_Casefeed.email" "src/email/Bank_Casefeed.email" )
for f in "${files[@]}"; do
if [[ $f == *"/$sub_dir/"* ]]; then
echo "${f/*\/$sub_dir\//$sub_dir\/}"
else
basename "$f"
fi
done
test/Bank_Casefeed.email
Bank_Casefeed.email
I know there might be an easier way to do this. But I believe you can just manipulate the input string. For example:
#!/bin/bash
sub_dir='test'
DIRNAME1="src/email/${sub_dir}/Bank_Casefeed.email"
DIRNAME2="src/email/Bank_Casefeed.email"
echo $DIRNAME1 | cut -f3- -d'/'
echo $DIRNAME2 | cut -f3- -d'/'
This will remove the first two directories.
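If you prefer plain parameter expansion over cut, the same "drop the first two path components" idea can be written directly in the shell. This is only a sketch, assuming every path starts with src/<module>/:
#!/bin/bash
sub_dir='test'
for f in "src/email/${sub_dir}/Bank_Casefeed.email" "src/email/Bank_Casefeed.email"; do
    rest=${f#*/}     # strip "src/"
    rest=${rest#*/}  # strip the module directory, e.g. "email/"
    echo "$rest"
done
It prints test/Bank_Casefeed.email and Bank_Casefeed.email, matching the output above.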

Linux : check if something is a file [ -f not working ]

I am currently trying to list the size of all files in a directory that is passed as the first argument to the script, but the -f test does not seem to work, or maybe I am missing something.
Here is the code :
for tmp in "$1/*"
do
echo $tmp
if [ -f "$tmp" ]
then num=`ls -l $tmp | cut -d " " -f5`
echo $num
fi
done
How would I fix this problem?
I think the error is in your glob syntax: globs are not expanded inside either single or double quotes.
for tmp in "$1"/*; do
..
Do the above to expand the glob outside the quotes.
There are a couple more possible improvements in your script:
Double-quote your variables to prevent word-splitting, e.g. echo "$tmp"
Backtick command substitution `..` is legacy syntax with several issues; use the $(..) syntax instead.
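For example, the assignment inside your loop could be rewritten as below (a sketch with both suggestions applied; note that parsing ls output is still fragile, as discussed further down):
num=$(ls -l "$tmp" | cut -d " " -f5)
echo "$num"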
The [ -f "filename" ] condition checks that the file exists and is a regular file. For reference, here is the relevant list of file test operators:
-b FILE
FILE exists and is block special
-c FILE
FILE exists and is character special
-d FILE
FILE exists and is a directory
-e FILE
FILE exists
-f FILE
FILE exists and is a regular file
-g FILE
FILE exists and is set-group-ID
-G FILE
FILE exists and is owned by the effective group ID
I suggest you try with [ -e "filename" ] and see if it works.
Cheers!
At least on the command line, this piece of script does it:
for tmp in *; do echo "$tmp"; if [ -f "$tmp" ]; then num=$(ls -l "$tmp" | sed -e 's/  */ /g' | cut -d ' ' -f5); echo "$num"; fi; done
If cut uses a space as the delimiter, it cuts at every single space, and ls -l output often has more than one space between columns, so the field count can easily go wrong. I'm guessing that in your case you just happened to echo a space, which looks like nothing. The sed command squeezes those runs of spaces down to a single one.
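As an aside, parsing ls output is fragile. If GNU coreutils is available, asking stat for the size directly avoids the column counting altogether; this is a sketch, assuming stat -c %s is supported on your system:
for tmp in "$1"/*; do
    if [ -f "$tmp" ]; then
        num=$(stat -c %s "$tmp")   # file size in bytes
        echo "$tmp $num"
    fi
done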

What do these lines of Unix/Linux do?

I am a Unix/Linux shell script newbie and I have been asked to look at a script which contains the lines below. The details in this question are vague, but the person who wrote this code left no documentation and has since passed away. Can anyone advise what these lines actually do?
There are two specific pieces of code. The first is simply the line source polys.sh, where polys.sh is a text file with contents:
failure="020o 040a"
success="002[a-d] 003[a-r] 004[a-s] 005[a-u]"
Representing various parameters, I think, to do with the calculations the shell script performs. The nature of the calculations is, I am told, not important because the aim is to just get the script running.
The second piece of code is below and the relevant lines are delimited by Start and Stop comments. What I can tell you is that: $arg1 is blank, $opt1 is also blank, $poly is the path and name of a text file and ./search I believe to be a folder.
if [ $search == "yes" ]
then
# Search stage for squares containing zeros
#
# Start.
output="$outputs/search/"`basename $poly`
./search $opt1 $arg1 < $poly 2>&1 | tee $output
if tail -n1 $output | grep -v "success"
# End.
then
echo "SEARCH FAILURE" >> $output
continue
fi
# Save approximations
#
echo -n "SEARCH SUCCESS " >> $output
cat /tmp/iters >> $output
cp /tmp/zeros $inputs/search/`basename $poly`
else
echo "No search"
fi
EDIT Initial disclaimer as advised by Mr. Charles Duffy:
The below explanations assume you won't hit expansion-related bugs; please correct your code as advised by shellcheck.net to be assured that these explanations are correct
source polys.sh includes the code from the script polys.sh into the current script, as if it had been written there; with just a filename (no path), the file is looked up relative to the current working directory.
Within that file:
failure="020o 040a"
success="002[a-d] 003[a-r] 004[a-s] 005[a-u]"
are two variable declarations; the variable $failure is set to "020o 040a" and $success to "002[a-d] 003[a-r] 004[a-s] 005[a-u]". As the file was sourced, these two variables are available in your script (do echo "$failure" and echo "$success" to see for yourself).
output="$outputs/search/`basename $poly`" has two parts to explain:
"$outputs/search/"
sets the variable $output to "$outputs/search/", i.e., to the value of the variable $outputs followed by the string "/search/".
`basename $poly`
anything in backticks is a command substitution, which runs the command and substitutes its output; the command basename $poly gets the base file or folder name from $poly, if it is a file path (e.g., basename $poly for poly="/dev/file.txt" yields file.txt); that output is then appended as a string to "$outputs/search/".
./search $opt1 $arg1 < $poly 2>&1 | tee $output is two commands, separated by a pipe |:
./search $opt1 $arg1 < $poly 2>&1
runs the executable file ./search (./ refers to the current directory) with two arguments, the $opt1 and $arg1 variables. $poly should hold a file path, and that file's contents are redirected to the command's standard input (using <). All error output (stderr, file descriptor 2) is redirected (>) to standard output (stdout, or &1; the ampersand indicates that this is a file descriptor, not a file path, otherwise the output would go to a file named 1).
tee $output
tee copies its stdin to stdout and also to the files named as arguments. So tee "/home/nick/output" would save its stdin to the file "/home/nick/output" while still passing it through to stdout.
if tail -n1 $output | grep -v "success"
tail -n1 $output
prints the last line of the file whose path is stored in the $output variable.
grep -v "success"
prints any line that does not contain "success" (-v inverts the match). Applied to the single line produced by tail -n1, the if condition succeeds exactly when that last line does not contain "success" (e.g., if the last line is "fail", the condition succeeds and the failure branch runs).
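Putting the shellcheck advice into practice, a quoting-corrected sketch of the delimited block could look like the following (same logic as the original; $opt1 and $arg1 are left unquoted only because they are reported to be blank):
output="$outputs/search/$(basename "$poly")"
./search $opt1 $arg1 < "$poly" 2>&1 | tee "$output"
if tail -n1 "$output" | grep -v "success"
then
    echo "SEARCH FAILURE" >> "$output"
    continue
fi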

Pass string to script

I have a script, download, that takes a string and checks whether a file with that name exists. If it doesn't, the script downloads it. All the filenames are listed in a file.
This command is not working:
cat filenames | ./download
Download source:
filename=$1
if [ ! -f $1 ];
then
wget -q http://www.example.com/nature/life/${filename}.rdf
fi
Sample filename file:
file1
file2
file3
file4
How do I pass the command output from the cat to the download script?
In your script $1 is the positional argument from the command line. ./download somefile would work, but cat filenames | ./download streams the data to download's standard input, which you ignore.
You should read the advanced bash scripting guide, which will give you a good base for how bash scripting works. To fix this, change your command to:
cat filenames | xargs -n 1 ./download
This will run ./download for each filename in your list. However, the filenames may have spaces or other special characters in them, which would break your script. You should look into alternative ways of doing this, to avoid these problems.
Specifically, use a while loop to read your file. This handles each line safely, provided the names were written to the file correctly, and avoids the problems cat would have with filenames like: fi/\nle.
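A sketch of that while-read loop, assuming one filename per line in filenames:
while IFS= read -r name; do
    ./download "$name"
done < filenames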
You can pass a filename to a file that contains file names to your script:
./download filenames
And then loop through file names from the file name in $1:
#!/bin/bash
# Do sanity check
fname=$1
for f in $(<"$fname"); do
    if [ ! -f "$f.rdf" ]; then
        wget -q "http://www.example.com/nature/life/${f}.rdf"
    fi
done

Assigning a variable after the contents are 'cut' in bash

I am iterating through a folder of files using bash, but I need to cut off the preceding path. For instance, if I have '/temp/test/filename', I want to cut off the '/temp/test/' and store the file name in a variable so I can write a log with the filename in it.
Can anyone help me out? The problem is that the variable temp is always empty.
Here is my bash code:
#!/bin/bash
for file in /temp/test/*
do
if [[ ! -f "$file" ]]
then
continue
fi
temp="$file"|cut -d'/' -f3
$file > /var/log/$temp$(date +%Y%m%d%H%M%S).log
done
exit
Try this:
$ x=/temp/test/filename
$ echo ${x##*/}
filename
Another solution is to use basename :
$ basename /temp/test/filename
filename
The first solution uses parameter expansion, which is handled by bash itself (a builtin, no external command), so it is also faster.
Your line temp="$file"|cut -d'/' -f3 is broken:
when you want to store the output of a command in a variable, you should use var=$(command)
you need to pass the value to the command's stdin with a here-string (<<<) or with echo value | command
Finally, if you want to use cut:
$ temp=$(cut -d/ -f4 <<< /temp/test/filename)
$ echo $temp
filename
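Applied to the loop from the question, the fixed assignment would look roughly like this (a sketch; the "$file" > ... line is kept as in the question, where it runs each file and logs its output - use cat "$file" there if you meant to copy the file's contents instead):
#!/bin/bash
for file in /temp/test/*; do
    [[ -f "$file" ]] || continue
    temp=${file##*/}    # just the file name, with /temp/test/ stripped
    "$file" > "/var/log/${temp}$(date +%Y%m%d%H%M%S).log"
done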
