Linux script to search for string in a file

I am a newbie to shell scripting. I have a requirement to read a file line by line and match each line against a specific string. If it matches, print x; if it doesn't match, print y.
Here is what I am trying, but I am getting unexpected results: 700 lines of output when my /tmp/l3.txt has only 10 lines. Somewhere I am going wrong in the loop. I appreciate your help.
for line in `cat /tmp/l3.txt`
do
    if echo $line | grep "abc.log" ; then
        echo "X" >> /tmp/l4.txt
    else
        echo "Y" >> /tmp/l4.txt
    fi
done

I don't understand the urge to do looping ...
awk '{if($0 ~ /abc\.log/){print "x"}else{print "y"}}' /tmp/l3.txt > /tmp/l4.txt
EDIT after inquiry ...
Of course, your spec wasn't overly precise, and I'm jumping to conclusions regarding your line format ... we basically take the whole line that matched abc.log, replace everything up to the directory abc and everything from /logs to the end of the line with nothing, which leaves us with clusterX/xyz.
awk '{if($0 ~ /abc\.log/){print gensub(/.+\/abc\/(.+)\/logs.*/, "\\1", 1)}else{print "y"}}' /tmp/l3.txt > /tmp/l4.txt
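For instance, assuming the lines look something like the made-up path below (the path itself is just an illustration), the whole thing reduces to clusterX/xyz:
echo '/data01/abc/clusterX/xyz/logs/abc.log' | awk '{if($0 ~ /abc\.log/){print gensub(/.+\/abc\/(.+)\/logs.*/, "\\1", 1)}else{print "y"}}'
clusterX/xyz
Note that gensub() is specific to GNU awk; on other awks you would need match()/substr() instead.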

cat /tmp/l3.txt | while read line    # read the entire line into the variable "line"
do
    if [ -n "$(echo "$line" | grep "abc.log")" ]    # if grep found a match, the value is non-empty ("-n")
    then
        echo "X" >> /tmp/l4.txt    # echo "X" into l4.txt
    else
        echo "Y" >> /tmp/l4.txt    # if empty, echo "Y" into l4.txt
    fi
done
The while read statement will read the entire line if only one variable is given (in this case "line"); if you have a fixed number of fields, you can specify a variable for each field, i.e. "| while read field1 field2" etc... The -n test checks whether there is a value; -z tests whether it is empty.
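For example, a quick sketch of the multi-field form with made-up input:
echo "alpha beta gamma" | while read field1 rest
do
    echo "first field: $field1, remainder: $rest"
done
# prints: first field: alpha, remainder: beta gamma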

Why worry about cat and the rest before grep? You can simply test the return of grep and append all matching lines to /tmp/l4.txt, or append "Y":
[ -f tmpfile.tmp ] && :> tmpfile.tmp                 # test for existing tmpfile & truncate
if grep "abc.log" /tmp/l3.txt >> tmpfile.tmp ; then  # write all matching lines to tmpfile
    cat tmpfile.tmp >> /tmp/l4.txt                   # if grep matched, append them to /tmp/l4.txt
else
    echo "Y" >> /tmp/l4.txt                          # write "Y" to /tmp/l4.txt
fi
rm tmpfile.tmp                                       # cleanup
Note: if you don't want the matching lines themselves appended to /tmp/l4.txt, just replace cat tmpfile.tmp >> /tmp/l4.txt with echo "X" >> /tmp/l4.txt, and then you can remove the first and last lines.

I think the "awk" answer above is better. However, if you really need to do it with a bash loop, you can use:
PATTERN="abc.log"
OUTPUTFILE=/tmp/l4.txt
INPUTFILE=/tmp/l3.txt
while IFS= read -r line
do
    grep -q "$PATTERN" <<< "$line" && echo X || echo Y
done < "$INPUTFILE" >> "$OUTPUTFILE"
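A pure-bash variant of the same loop (a sketch) avoids spawning a grep process per line by using the shell's own pattern match:
while IFS= read -r line
do
    [[ $line == *"abc.log"* ]] && echo X || echo Y
done < /tmp/l3.txt >> /tmp/l4.txt
Note the subtle difference: *"abc.log"* is a literal substring match, while grep "abc.log" treats the unescaped dot as "any character".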

Related

How do I use for to loop over potentially-empty lines of output from egrep?

I'm trying to print out the blank lines in a text file, but I also want it to print numbers so I can see how many lines of white space the egrep returned, using:
for x in $(egrep '$^' txtfile); do echo '$x'; done
but this doesn't echo or return anything. Is there any way to know how many blank lines the egrep command returned?
for is the wrong tool for this job; the right one (if you don't want to use grep -c but really do want to read each line of output) is a while read loop, as discussed in BashFAQ #1:
#!/usr/bin/env bash
# ^^^^- important: bash, not sh
count=0
while IFS= read -r x; do
    echo "Read a line: <$x>" >&2
    (( ++count ))
done < <(egrep '^$' txtfile)
echo "Read $count lines" >&2
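If the count is all you need, the grep -c mentioned above does it in one step (a minimal sketch):
count=$(grep -c '^$' txtfile)
echo "Read $count lines" >&2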

How to test for certain characters in a file

I am currently running a script with an if statement. Before I run the script, I want to make sure the file provided as the first argument has certain characters.
If the file does not have those certain characters in certain spots, then the output would be "File is Invalid" on the command line.
For the if statement to be true, the file needs to have at least one hyphen in field one, line one, and at least one comma in field one, line one.
How would I create an if statement, with perhaps a test command, to validate that those certain characters are present?
Thanks
I'm new to Linux/Unix; this is my homework, so I haven't really tried anything, only brainstormed possible solutions.
function usage
{
    echo "usage: $0 filename ..."
    echo "ERROR: $1"
}
if [ $# -eq 0 ]
then
    usage "Please enter a filename"
else
    name="Yaroslav Yasinskiy"
    echo $name
    date
    while [ $# -gt 0 ]
    do
        if [ -f $1 ]
        then
            if <--------- here is where the answer would be
            starting_data=$1
            echo
            echo $1
            cut -f3 -d, $1 > first
            cut -f2 -d, $1 > last
            cut -f1 -d, $1 > id
            sed 's/$/:/' last > last1
            sed '/last:/ d' last1 > last2
            sed 's/^ *//' last2 > last3
            sed '/first/ d' first > first1
            sed 's/^ *//' first1 > first2
            sed '/id/ d' id > id1
            sed 's/-//g' id1 > id2
            paste -d\ first2 last3 id2 > final
            cat final
            echo ''
        else
            echo
            usage "Could not find file $1"
        fi
        shift
    done
fi
In answer to your direct question:
For the if statement to be true, the file needs to have at least one
hyphen in Field 1 line 1 and at least one comma in Field one Line one.
How would I create an if statement with perhaps a test command to
validate those certain characters are present?
Bash provides all the tools you need. While you can call awk, you really just need to read the first line of the file into two variables (say a and b) and then use [[ $a =~ regex ]], where the regex is an extended regular expression that verifies that the first field (contained in $a) contains both a '-' and a ','.
For details on the [[ =~ ]] expression, see bash(1) - Linux manual page under the section labeled [[ expression ]].
Let's start with read. When you provide two variables, read will read the first field (based on normal word-splitting given by IFS, the Internal Field Separator, default $' \t\n' - space, tab, newline). So by doing read -r a b you read the first field into a and the rest of the line into b (you don't care about b for your test).
Your regex can be ([-]+.*[,]+|[,]+.*[-]+), which is an (x|y), e.g. x OR y, expression where x is [-]+.*[,]+ (one or more '-' and one or more ',') and y is [,]+.*[-]+ (one or more ',' and one or more '-'). So by using the '|' your regex will accept either a hyphen, then zero or more characters, then a comma, or a comma, then zero or more characters, then a hyphen in the first field.
How do you read the line? With simple redirection, e.g.
read -r a b < "$1"
So your conditional test in your script would look something like:
if [ -f "$1" ]
then
    read -r a b < "$1"
    if [[ $a =~ ([-]+.*[,]+|[,]+.*[-]+) ]]    # <-- here is where the ...
    then
        starting_data=$1
        ...
    else
        echo "File is Invalid" >&2            # redirection to 2 (stderr)
    fi
else
    echo
    usage "Could not find file $1"
fi
shift
...
Example Test Files
$ cat valid
dog-food, cat-food, rabbit-food
50lb 16lb 5lb
$ cat invalid
dogfood, catfood, rabbitfood
50lb 16lb 5lb
Example Use/Output
$ read -r a b < valid
if [[ $a =~ ([-]+.*[,]+|[,]+.*[-]+) ]]; then
echo "file valid"
else
echo "file invalid"
fi
file valid
and for the file without the certain characters:
$ read -r a b < invalid
if [[ $a =~ ([-]+.*[,]+|[,]+.*[-]+) ]]; then
echo "file valid"
else
echo "file invalid"
fi
file invalid
Now you really have to concentrate on eliminating the spawning of at least a dozen subshells where you call cut 3 times, sed 7 times, paste once and then cat. While it is good that you are thinking through what you need to do, and getting it working, as mentioned in my comment, any time you are looping you want to reduce the number of subshells spawned to the greatest extent possible. I suspect, as @Mig answered, awk will be the proper tool and can likely eliminate all 12 subshells, replacing them with a single call to awk.
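For instance, the whole cut/sed/paste pipeline above collapses to something like this sketch (assuming a comma-separated file laid out id,last,first with a header line, which is what the pipeline implies):
awk -F, 'NR > 1 { gsub(/-/, "", $1); gsub(/^ */, "", $2); gsub(/^ */, "", $3); print $3, $2 ":", $1 }' "$1" > final
That is one awk process in place of a dozen cut/sed/paste calls, with no intermediate files.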
I personally would use awk for this whole part, since you want to test fields and create a string from concatenated fields. Awk is perfect for that.
But here is a small script which shows how you could just test your file's first line:
if [[ $(head -n 1 file.csv | awk '$1~/-/ && $1~/,/ {print "MATCH"}') == 'MATCH' ]]; then
    echo "yes"
else
    echo "no"
fi
It looks like overkill when not doing the whole thing in awk, but it works. I am sure there is a way to test it with only one regex, but that would involve knowing which flavour of awk you have, because I don't think they all use the same regex engine. Therefore I left it out for the sake of simplicity.
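If you would rather not compare marker strings at all, you can let awk's exit status drive the if directly; a sketch under the same assumptions:
if head -n 1 file.csv | awk '{ exit !($1 ~ /-/ && $1 ~ /,/) }'; then
    echo "yes"
else
    echo "no"
fi
awk exits 0 (success) only when the first field of the first line contains both a hyphen and a comma, so no "MATCH" string is needed.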

Add a special value to a variable

I want to add a specific value to the value of a variable. This is my script:
x 55;
y 106;
Now I want to change the value of x from 55 to 60.
In general, how can we apply a math expression to the values of variables in a script?
Others might come up with something simpler (e.g. sed, awk, ...), but this quick and dirty script works. It assumes your input file is exactly like you posted:
this is my script.
x 55;
y 106;
And the code:
#!/bin/bash
#
if [ $# -ne 1 ]
then
    echo "ERROR: usage $0 <file>"
    exit 1
else
    inputfile=$1
    if [ ! -f "$inputfile" ]
    then
        echo "ERROR: could not find $inputfile"
        exit 1
    fi
fi
tempfile="/tmp/tempfile.$$"
>"$tempfile"
while read -r line
do
    firstelement=$(echo "$line" | awk '{print $1}')
    if [ "$firstelement" == 'x' ]
    then
        secondelement=$(echo "$line" | awk '{print $2}' | cut -d';' -f1)
        (( secondelement = secondelement + 5 ))
        echo "$firstelement $secondelement;" >>"$tempfile"
    else
        echo "$line" >>"$tempfile"
    fi
done <"$inputfile"
mv "$tempfile" "$inputfile"
So it reads the input file line by line. If the line starts with the variable x, it takes the number that follows, adds 5 to it and writes the result to a temp file. If the line does not start with x, it writes the line, unchanged, to the temp file. Lastly, the temp file overwrites the input file.
Copy this code into a file, make it executable and run it with the input file as an argument.
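Assuming you saved it as bump.sh (the name is just for illustration), a run looks like this:
$ ./bump.sh input
$ cat input
this is my script.
x 60;
y 106;
And as hinted above, awk can do the same in one line (a sketch that assumes the same "x 55;" layout):
awk '$1 == "x" { $2 = ($2 + 0 + 5) ";" } 1' input > tmp && mv tmp input
$2 + 0 coerces "55;" to the number 55, 5 is added and the semicolon re-attached, and the trailing 1 prints every line, changed or not.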

find string in file using bash

I need to find strings matching some regexp pattern and represent the search result as an array for iterating through it with a loop. Do I need to use sed? In general I want to replace some strings, but analyse them before replacing.
Using sed and diff:
sed -i.bak 's/this/that/' input
diff input input.bak
GNU sed will create a backup file before substitutions, and diff will show you those changes. However, if you are not using GNU sed:
mv input input.bak
sed 's/this/that/' input.bak > input
diff input input.bak
Another method, using a pure-bash read loop and pattern matching:
pattern="/X"
subst=that
while IFS='' read -r line; do
    if [[ $line = *"$pattern"* ]]; then
        echo "changing line: $line" 1>&2
        echo "${line//$pattern/$subst}"
    else
        echo "$line"
    fi
done < input > output
The best way to do this would be to use grep to get the lines, and populate an array with the result using newline as the internal field separator:
#!/bin/bash
# get just the desired lines
results=$(grep "mypattern" mysourcefile.txt)
# change the internal field separator to be a newline
IFS=$'\n'
# populate an array from the result lines
lines=($results)
# return the third result
echo "${lines[2]}"
You could build a loop to iterate through the results in the array; the traditional and simple way is bash's array expansion (note that plain $lines would give you only the first element):
for line in "${lines[@]}"; do
    echo "$line"
done
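On bash 4+ you can skip the IFS juggling entirely with mapfile (a.k.a. readarray); a minimal sketch:
mapfile -t lines < <(grep "mypattern" mysourcefile.txt)
echo "${lines[2]}"    # third matching line, as before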
FYI: here is a similar concept I created for fun. I thought it would be good to show how to loop over a file like this. This is a script where I look at a Linux sudoers file and check that each line contains one of the valid words in my valid_words array. It ignores comment ("#") and blank ("") lines with sed. In this example we would probably want to print only the invalid lines, but this script prints both.
#!/bin/bash
# -- Inspect a sudoers file, look for valid and invalid lines.
file="${1}"
declare -a valid_words=( _Alias = Defaults includedir )
actual_lines=$(cat "${file}" | wc -l)
functional_lines=$(cat "${file}" | sed '/^\s*#/d;/^\s*$/d' | wc -l)
while read -r line ;do
    # -- set the line to nothing "" if it has a comment or is an empty line.
    line="$(echo "${line}" | sed '/^\s*#/d;/^\s*$/d')"
    # -- if not set to nothing "", check if the line is valid against our list of valid words.
    if ! [[ -z "$line" ]] ;then
        unset found
        for each in "${valid_words[@]}" ;do
            found="$(echo "$line" | egrep -i "$each")"
            [[ -z "$found" ]] || break;
        done
        [[ -z "$found" ]] && { echo "Invalid=$line"; sleep 3; } || echo "Valid=$found"
    fi
done < "${file}"
echo "actual lines: $actual_lines functional lines: $functional_lines"

Parsing Command Output in Bash Script

I want to run a command that gives the following output and parse it:
[VDB VIEW]
[VDB] vhctest
[BACKEND] domain.computername: ENABLED:RW:CONSISTENT
[BACKEND] domain.computername: ENABLED:RW:CONSISTENT
...
I'm only interested in some key words, such as 'ENABLED' etc. I can't just search for ENABLED, as I need to parse one line at a time.
This is my first script, and I want to know if anyone can help me?
EDIT:
I now have:
cmdout=`mycommand`
while read -r line
do
    #check for key words in $line
done < $cmdout
I thought this did what I wanted, but it always seems to output the following right before the command output:
./myscript.sh: 29: cannot open ... : No such file
I don't want to have to write to a file to achieve this.
Here is the pseudocode:
cmdout=`mycommand`
loop each line in $cmdout
    if line contains $1
        if line contains $2
            output 1
        else
            output 0
The reason for the error is that
done < $cmdout
thinks that the contents of $cmdout is a filename.
You can either do:
done <<< $cmdout
or
done <<EOF
$cmdout
EOF
or
done < <(mycommand) # without using the variable at all
or
done <<< $(mycommand)
or
done <<EOF
$(mycommand)
EOF
or
mycommand | while
...
done
However, the last one creates a subshell and any variables set in the loop will be lost when the loop exits.
"How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?"
"I set variables in a loop. Why do they suddenly disappear after the loop terminates? Or, why can't I pipe data to read?"
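To see that caveat in action, here is a minimal sketch (mycommand stands in for whatever you are running):
count=0
mycommand | while read -r line; do
    (( count++ ))
done
echo "$count"    # prints 0: the loop body ran in a subshell

count=0
while read -r line; do
    (( count++ ))
done < <(mycommand)
echo "$count"    # prints the real line count: no subshell involved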
$ cat test.sh
#!/bin/bash
while read -r line ; do
    if [ `echo "$line" | grep "$1" | wc -l` != 0 ]; then
        if [ `echo "$line" | grep "$2" | wc -l` != 0 ]; then
            echo "output 1"
        else
            echo "output 0"
        fi
    fi
done
USAGE
$ cat in.txt | ./test.sh ENABLED RW
output 1
output 1
This isn't the best solution, but it's a word-by-word translation of what you want and should give you something to start with while you add your own logic.
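A tighter version of the same idea (a sketch) uses grep -q and tests its exit status directly, skipping the wc -l comparison:
#!/bin/bash
while read -r line ; do
    if echo "$line" | grep -q "$1"; then
        if echo "$line" | grep -q "$2"; then
            echo "output 1"
        else
            echo "output 0"
        fi
    fi
done
grep -q prints nothing and exits 0 on the first match, which is exactly what if wants.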
