find string in file using bash - linux

I need to find strings matching some regexp pattern and represent the search result as an array, so I can iterate through it with a loop. Do I need to use sed? In general I want to replace some strings, but analyse them before replacing.

Using sed and diff:
sed -i.bak 's/this/that/' input
diff input input.bak
GNU sed will create a backup file before substitutions, and diff will show you those changes. However, if you are not using GNU sed:
mv input input.bak
sed 's/this/that/' input.bak > input
diff input input.bak
Another method, using a plain bash read loop (no grep needed):
pattern="/X"
subst=that
while IFS='' read -r line; do
    if [[ $line = *"$pattern"* ]]; then
        echo "changing line: $line" 1>&2
        echo "${line//$pattern/$subst}"
    else
        echo "$line"
    fi
done < input > output

The best way to do this would be to use grep to get the lines, and populate an array with the result using newline as the internal field separator:
#!/bin/bash
# get just the desired lines
results=$(grep "mypattern" mysourcefile.txt)
# change the internal field separator to be a newline
IFS=$'\n'
# populate an array from the result lines
lines=($results)
# return the third result
echo "${lines[2]}"
You can then build a loop to iterate through the results; the simple, traditional way is bash's array iteration:
for line in "${lines[@]}"; do
echo "$line"
done
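On bash 4+, mapfile (also known as readarray) populates the same array without touching IFS at all. A minimal sketch under the same pattern/file assumptions as above:
# read matching lines into an array, one element per line
mapfile -t lines < <(grep "mypattern" mysourcefile.txt)
# iterate safely, even over lines containing spaces or glob characters
for line in "${lines[@]}"; do
    printf '%s\n' "$line"
done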

FYI: here is a similar concept I created for fun. I thought it would be good to show how to loop over a file with this. It is a script that inspects a Linux sudoers file and checks that each line contains one of the entries in my valid_words array. It ignores comment ("#") and blank ("") lines via sed. In this example we would probably want to print only the invalid lines, but this script prints both.
#!/bin/bash
# -- Inspect a sudoer file, look for valid and invalid lines.
file="${1}"
declare -a valid_words=( _Alias = Defaults includedir )
actual_lines=$(wc -l < "${file}")
functional_lines=$(sed '/^\s*#/d;/^\s*$/d' "${file}" | wc -l)
while read -r line; do
    # -- set the line to nothing "" if it is a comment or an empty line.
    line="$(echo "${line}" | sed '/^\s*#/d;/^\s*$/d')"
    # -- if not empty, check whether the line matches our list of valid words.
    if ! [[ -z "$line" ]]; then
        unset found
        for each in "${valid_words[@]}"; do
            found="$(echo "$line" | egrep -i "$each")"
            [[ -z "$found" ]] || break
        done
        [[ -z "$found" ]] && { echo "Invalid=$line"; sleep 3; } || echo "Valid=$found"
    fi
done < "${file}"
echo "actual lines: $actual_lines functional lines: $functional_lines"

Related

Unix - Replace column value inside while loop

I have a comma-separated (sometimes tab-separated) text file as below:
parameters.txt:
STD,ORDER,ORDER_START.xml,/DML/SOL,Y
STD,INSTALL_BASE,INSTALL_START.xml,/DML/IB,Y
with below code I try to loop through the file and do something
while read line; do
    if [[ $1 = "$(echo "$line" | cut -f 1)" ]] && [[ "$(echo "$line" | cut -f 5)" = "Y" ]]; then
        # do something...
        if [[ $? -eq 0 ]]; then
            # code to replace the final flag
            :
        fi
    fi
done < <text_file_path>
I want to update the last column of the file to N if the above operation is successful; however, the approaches below are not working for me:
sed 's/$f5/N/'
'$5=="Y",$5=N;{print}'
$(echo "$line" | awk '$5=N')
Update: a few considerations which I missed at first and which should give more clarity, apologies!
The parameters file may contain lines whose last-field flag is already "N".
The final flag needs to be updated only if the "do something" code has executed successfully.
After looping through all lines, i.e. after exiting the while loop, the flags for all rows need to be set back to "Y".
Perhaps invert the operations and do the processing in awk:
$ awk -v f1="$1" 'BEGIN {FS=OFS=","}
  f1==$1 && $5=="Y" { # do something
                      $5="N" }
  1' file
Not sure what the "do something" operation is; if you need to call another command/script, that is possible as well, as the sketch below shows.
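For instance, here is a sketch that shells out per matching line with awk's system(); process_line is a hypothetical helper script (not from the question) whose exit status signals success:
awk -v f1="$1" 'BEGIN {FS=OFS=","}
f1==$1 && $5=="Y" {
    # system() runs a shell command and returns its exit status;
    # flip the flag only if the external step succeeded
    if (system("./process_line " $2) == 0)
        $5 = "N"
}
1' file
Since the field is spliced into a shell command, this assumes it contains no spaces or shell metacharacters.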
With bash:
(
    IFS=,
    while read -ra fields; do
        if [[ ${fields[0]} == "$1" ]] && [[ ${fields[4]} == "Y" ]]; then
            # do something
            fields[4]="N"
        fi
        echo "${fields[*]}"
    done < file | sponge file
)
I run that in a subshell so the effects of altering IFS are localized.
This uses sponge to write back to the same file. You need the moreutils package to use it, otherwise use
done < file > tmp && mv tmp file
Perhaps a bit simpler, less bash-specific:
while IFS= read -r line; do
    case $line in
        "$1",*,Y)
            # do something
            line="${line%Y}N"
            ;;
    esac
    echo "$line"
done < file
To replace ,N at the end of a line (anchored with $) with ,Y:
sed 's/,N$/,Y/' file
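With GNU sed (as noted at the top of this page, other seds handle -i differently), the same reset can be applied in place after the loop finishes, which covers the third requirement above:
sed -i 's/,N$/,Y/' parameters.txt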

pass line to function - overwrite using sed line by line - BASH

Why does this code not work? What is the problem with passing $line to a function?
function a {
echo $1 | grep $2
}
while read -r line; do
a $line "LAN"
done < database.txt
Another question: I have to overwrite a txt file line by line, possibly using the sed command, but not the whole line, only the part that needs to change. Something like this:
while read -r line; do
echo $line | sed "s/STRING1/STRING2/"
done < namefile
EDIT
I give you an example for my second question.
input file:
LAN 1:
[text]11111[text]
[text]22222[text]
[text]33333[text]
LAN 2:
[text]11111[text]
[text]22222[text]
[text]33333[text]
output file:
LAN 1:
[text]44444[text]
[text]22222[text]
[text]33333[text]
LAN 2:
[text]11111[text]
[text]22222[text]
[text]33333[text]
I have to overwrite database.txt, so I plan to do this line by line, using a counter for the LAN sections. This is my code:
while read -r line; do
    echo "$line" | grep -q LAN
    if [ $? = "0" ]; then
        net_count=$((net_count+1))
    fi
    if [ $net_count = <lan choose before> ]; then # variable that contains lan number chosen by user
        echo "$line" | fgrep -q "11111"
        if [ "$?" = "0" ]; then
            echo $line | sed "s/11111/44444/" > database.txt
            break
        fi
    fi
done < database.txt
Thank you all
Running sed once for every line is typically over a thousand times slower than running just one sed instance processing your whole file of input. Don't do it.
If you want to do string manipulation on a line-by-line basis, use native bash primitives for the purpose, as documented in BashFAQ #100:
a() {
    local line=$1 regex=$2
    if [[ $line =~ $regex ]]; then
        printf '%s\n' "$line"
    fi
}
while IFS= read -r line; do
    a "$line" LAN
done <database.txt
Likewise, for substring replacements, an appropriate parameter expansion primitive exists:
while read -r line; do
    printf '%s\n' "${line//STRING1/STRING2}"
done < namefile
That said, those approaches are only appropriate if you need to iterate line-by-line. Typically, it makes more sense to use a single grep or sed operation, and iterate over the results of those calls if you need to do native bash operations. For instance, the following iterates over the output from grep, as emitted by a process substitution:
regex=LAN
while IFS= read -r line; do
    echo "Read line from grep: $line"
done < <(grep -e "$regex" <database.txt)
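Applied to the LAN example in the question, the same no-loop philosophy gives a single awk pass. A sketch, assuming the user's chosen section number is in $lan, and writing to a temp file so the input is not truncated mid-read:
lan=1   # section number chosen by the user (assumed to be set elsewhere)
awk -v lan="$lan" '
    /^LAN /      { count++ }               # each "LAN n:" header starts a new section
    count == lan { sub(/11111/, "44444") } # substitute only inside the chosen section
    { print }
' database.txt > database.tmp && mv database.tmp database.txt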
You need to put $line in quotes for the whole line to be passed as a single parameter. Otherwise, every space character splits the string into multiple parameters to the function:
#!/bin/bash
function a {
    echo "$1" | grep "$2"
}
while read -r line; do
    a "$line" "LAN"
done < database.txt
And for the second question, if you want to print only the lines that you modify, you can use the following code:
while read -r line; do
    echo "$line" | sed -n "s/STRING1/STRING2/p"
done < namefile
Here, -n suppresses the automatic printing of each line, and the p flag prints only the lines where the substitution was actually made.
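For completeness: once no per-line bash logic is needed, that whole loop collapses into a single sed invocation over the file, which is the faster approach recommended earlier:
sed -n 's/STRING1/STRING2/p' namefile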

Linux script to search for string in a file

I am a newbie to shell scripting. I have a requirement to read a file line by line and match each line against a specific string. If it matches, print x; if it doesn't match, print y.
Here is what I am trying. But I am getting unexpected results: 700 lines of output where my /tmp/l3.txt has only 10 lines. Somewhere I am going wrong in the loop. I appreciate your help.
for line in `cat /tmp/l3.txt`
do
    if echo $line | grep "abc.log" ; then
        echo "X" >> /tmp/l4.txt
    else
        echo "Y" >> /tmp/l4.txt
    fi
done
I don't understand the urge to do looping ...
awk '{if($0 ~ /abc\.log/){print "x"}else{print "y"}}' /tmp/l3.txt > /tmp/l4.txt
EDIT after inquiry ...
Of course, your spec wasn't overly precise, and I'm jumping to conclusions regarding your line format ... we basically take the whole line that matched abc.log and replace everything up to the directory abc and from /logs to the end of line with nothing, which leaves us with clusterX/xyz.
awk '{if($0 ~ /abc\.log/){print gensub(/.+\/abc\/(.+)\/logs/, "\\1", 1)}else{print "y"}}' /tmp/l3.txt > /tmp/l4.txt
cat /tmp/l3.txt | while read line    # read the entire line into the variable "line"
do
    if [ -n "$(echo "$line" | grep "abc.log")" ]    # if grep produced any output ("-n")
    then
        echo "X" >> /tmp/l4.txt    # echo "X" (or the value of "line") into l4.txt
    else
        echo "Y" >> /tmp/l4.txt    # if empty, echo "Y" into l4.txt
    fi
done
The while read statement will read the entire line if only one variable is given (in this case "line"); if you have a fixed number of fields, you can specify a variable for each field, e.g. "while read field1 field2" etc... The -n test checks whether there is a value; -z tests whether it is empty.
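For example, a minimal sketch of the multi-variable form, assuming whitespace-separated input (the trailing junk variable soaks up any extra fields):
while read -r field1 field2 junk
do
    echo "first=$field1 second=$field2"
done < /tmp/l3.txt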
Why worry about cat and the rest before grep? You can simply test the return of grep, appending all matching lines to /tmp/l4.txt or appending "Y":
[ -f tmpfile.tmp ] && : > tmpfile.tmp               # truncate an existing tmpfile
if grep "abc.log" /tmp/l3.txt >> tmpfile.tmp ; then # write all matching lines to tmpfile
    cat tmpfile.tmp >> /tmp/l4.txt                  # grep matched: append the lines to /tmp/l4.txt
else
    echo "Y" >> /tmp/l4.txt                         # no match: write "Y" to /tmp/l4.txt
fi
rm tmpfile.tmp                                      # cleanup
Note: if you don't want the result of the grep appended to /tmp/l4.txt, just replace cat tmpfile.tmp >> /tmp/l4.txt with echo "X" >> /tmp/l4.txt, and you can remove the first and last lines.
I think the "awk" answer above is better. However, if you really need to interact using a bash loop, you can use:
PATTERN="abc.log"
OUTPUTFILE=/tmp/14.txt
INPUTFILE=/tmp/13.txt
while read line
do
grep -q "$PATTERN" <<< "$line" > /dev/null 2>&1 && echo X || echo Y
done < $INPUTFILE >> $OUTPUTFILE

shell bash replacing tags in a line with values from a different file

I am trying to read lines from a file; if a line contains a tag, the text within the tag is used to look up a value in a separate properties file, and the tag is replaced with that value before the line, with all tags replaced, is written to a different file.
So the initial file being read have lines that would adhere to the following format:
testkey "TEST-KEY" "[#key_location#]:///[#key_name#]"
Where [# and #] house the tag text.
The properties file would then contain lines like:
key_location=location_here
key_name=test_key_name
So the end result I am trying to achieve is that the line is written to a new file, but the tags are replaced with the values from the property file, so using the above content:
testkey "TEST-KEY" "loaction_here:///test_key_name"
I am not sure how best to handle the tags and deal with multiple tags in one line and am pretty lost. Any help would be greatly appreciated.
Skeleton code:
while read line
do
    if [[ $line == *"[#"* ]]
    then
        : # found a tag and need to deal with it
    else
        echo "$line" >> $NEW_FILE
    fi
done < $INITIAL_FILE
EDIT
Lines within the file could contain one or more tags, not always two like in the example given.
You'll have to do some looping and global sed replacements. The following is probably not optimal but it will get you started:
#!/bin/bash
declare -A props
while read line ; do
    key=$(echo $line | sed -r 's/^(.*)=.*/\1/')
    value=$(echo $line | sed -r 's/^.*=(.*)/\1/')
    props[$key]=$value
done < values.properties

replace() {
    line=$1
    for key in "${!props[@]}"; do
        line=$(echo $line | sed "s/\[#$key#\]/${props[$key]}/g")
    done
    echo $line
}

while read line ; do
    while [[ $line == *"[#"*"#]"* ]] ; do
        line=$(replace "$line")
        echo Iter: $line
    done
    echo DONE: $line
done < $INITIAL_FILE
The snippet prints to stdout and it includes intermediate results so that you can check how it works. I think you will easily be able to modify it to write to a file, etc.
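As a speed-up, the sed call inside replace() can be swapped for bash's native pattern substitution, which avoids spawning a subshell per key per line. A sketch using the same props array:
replace() {
    local line=$1 key
    for key in "${!props[@]}"; do
        # quoted pattern => the tag is matched literally, not as a glob
        line=${line//"[#$key#]"/${props[$key]}}
    done
    printf '%s\n' "$line"
}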
There are a number of ways to do this (e.g. counting the # characters). But simply, you can use * wildcards around and between the tag markers in a glob match, so
if [[ $line == *"[#"*"#]"*"[#"*"#]"* ]]; then
    echo "has tags"
else
    echo "does not have tags"
fi
would work in your case (this exact pattern matches lines with two tags), e.g.
$ echo "$line"
testkey "TEST-KEY" "[#key_location#]:///[#key_name#]"
$ if [[ $line == *"[#"*"#]"*"[#"*"#]"* ]]; then echo "has tags"; fi
has tags

Create new file but add number if filename already exists in bash

I found similar questions but not in Linux/Bash
I want my script to create a file with a given name (via user input) but add number at the end if filename already exists.
Example:
$ create somefile
Created "somefile.ext"
$ create somefile
Created "somefile-2.ext"
The following script can help you. To avoid a race condition, you should not run several copies of the script at the same time.
name=somefile
if [[ -e $name.ext || -L $name.ext ]] ; then
    i=0
    while [[ -e $name-$i.ext || -L $name-$i.ext ]] ; do
        let i++
    done
    name=$name-$i
fi
touch -- "$name".ext
Easier:
touch file`ls file* | wc -l`.ext
You'll get:
$ ls file*
file0.ext file1.ext file2.ext file3.ext file4.ext file5.ext file6.ext
To avoid the race conditions:
name=some-file
n=
set -o noclobber
until
    file=$name${n:+-$n}.ext
    { command exec 3> "$file"; } 2> /dev/null
do
    ((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3
And in addition, you have the file open for writing on fd 3.
With bash-4.4+, you can make it a function like:
create() { # fd base [suffix [max]]
    local fd="$1" base="$2" suffix="${3-}" max="${4-}"
    local n= file
    local -   # ash-style local scoping of options in 4.4+
    set -o noclobber
    REPLY=
    until
        file=$base${n:+-$n}$suffix
        eval 'command exec '"$fd"'> "$file"' 2> /dev/null
    do
        ((n++))
        ((max > 0 && n > max)) && return 1
    done
    REPLY=$file
}
To be used for instance as:
create 3 somefile .ext || exit
printf 'File: "%s"\n' "$REPLY"
echo something >&3
exec 3>&- # close the file
The max value can be used to guard against infinite loops when the files can't be created for other reason than noclobber.
Note that noclobber only applies to the > operator, not >> nor <>.
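A quick illustration of those operator differences (assuming a regular file named existing is already present):
set -o noclobber
echo hi > existing     # fails: bash: existing: cannot overwrite existing file
echo hi >> existing    # succeeds: appending does not clobber
echo hi >| existing    # succeeds: >| explicitly overrides noclobber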
Remaining race condition
Actually, noclobber does not remove the race condition in all cases. It only prevents clobbering regular files (not other types of files, so that cmd > /dev/null for instance doesn't fail) and has a race condition itself in most shells.
The shell first does a stat(2) on the file to check if it's a regular file or not (fifo, directory, device...). Only if the file doesn't exist (yet) or is a regular file does 3> "$file" use the O_EXCL flag to guarantee not clobbering the file.
So if there's a fifo or device file by that name, it will be used (provided it can be open in write-only), and a regular file may be clobbered if it gets created as a replacement for a fifo/device/directory... in between that stat(2) and open(2) without O_EXCL!
Changing the
{ command exec 3> "$file"; } 2> /dev/null
to
[ ! -e "$file" ] && { command exec 3> "$file"; } 2> /dev/null
Would avoid using an already existing non-regular file, but not address the race condition.
Now, that's only really a concern in the face of a malicious adversary that would want to make you overwrite an arbitrary file on the file system. It does remove the race condition in the normal case of two instances of the same script running at the same time. So, in that, it's better than approaches that only check for file existence beforehand with [ -e "$file" ].
For a working version without race condition at all, you could use the zsh shell instead of bash which has a raw interface to open() as the sysopen builtin in the zsh/system module:
zmodload zsh/system
name=some-file
n=
until
    file=$name${n:+-$n}.ext
    sysopen -w -o excl -u 3 -- "$file" 2> /dev/null
do
    ((n++))
done
printf 'File is "%s"\n' "$file"
echo some text in it >&3
Try something like this
name=somefile.ext
path=$(dirname "$name")
filename=$(basename "$name")
extension="${filename##*.}"
filename="${filename%.*}"
if [[ -e $path/$filename.$extension ]] ; then
    i=2
    while [[ -e $path/$filename-$i.$extension ]] ; do
        let i++
    done
    filename=$filename-$i
fi
target=$path/$filename.$extension
Use touch or whatever you want instead of echo:
echo file$((`ls file* | sed -n 's/file\([0-9]*\)/\1/p' | sort -rh | head -n 1`+1))
Parts of expression explained:
list files by pattern: ls file*
take only number part in each line: sed -n 's/file\([0-9]*\)/\1/p'
apply reverse human sort: sort -rh
take only first line (i.e. max value): head -n 1
combine all in pipe and increment (full expression above)
Try something like this (untested, but you get the idea):
filename=$1
# If the file doesn't exist, create it
if [[ ! -f $filename ]]; then
    touch "$filename"
    echo "Created \"$filename\""
    exit 0
fi
# If the file already exists, find a similar filename that is not yet taken
digit=1
while true; do
    temp_name=$filename-$digit
    if [[ ! -f $temp_name ]]; then
        touch "$temp_name"
        echo "Created \"$temp_name\""
        exit 0
    fi
    digit=$(($digit + 1))
done
Depending on what you're doing, replace the calls to touch with whatever code is needed to create the files that you are working with.
This is a much better method I've used for creating directories incrementally.
It could be adjusted for filenames too.
LAST_SOLUTION=$(echo $(ls -d SOLUTION_[[:digit:]][[:digit:]][[:digit:]][[:digit:]] 2> /dev/null) | awk '{ print $(NF) }')
if [ -n "$LAST_SOLUTION" ] ; then
mkdir SOLUTION_$(printf "%04d\n" $(expr ${LAST_SOLUTION: -4} + 1))
else
mkdir SOLUTION_0001
fi
A simple repackaging of choroba's answer as a generalized function:
autoincr() {
    f="$1"
    ext=""
    # Extract the file extension (if any), with preceding '.'
    [[ "$f" == *.* ]] && ext=".${f##*.}"
    if [[ -e "$f" ]] ; then
        i=1
        f="${f%.*}"
        while [[ -e "${f}_${i}${ext}" ]]; do
            let i++
        done
        f="${f}_${i}${ext}"
    fi
    echo "$f"
}
touch "$(autoincr "somefile.ext")"
Without looping, and without using regex or shell expr:
last=$(ls $1* | tail -n1)
last_wo_ext=$(basename "$last" .ext)
n=$(echo $last_wo_ext | rev | cut -d - -f 1 | rev)
if [ "$n" = "$last_wo_ext" ]; then  # no -number suffix found yet
    n=2
else
    n=$((n + 1))
fi
echo $1-$n.ext
A simpler version, without extension handling and without the "-1" exception:
n=$(ls $1* | tail -n1 | rev | cut -d - -f 1 | rev)
n=$((n + 1))
echo $1-$n.ext
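Finally, if strictly sequential numbering is not a requirement, mktemp already solves the create-a-unique-file problem atomically, with no race condition. A sketch, assuming GNU mktemp, which allows a suffix after the run of X placeholders:
file=$(mktemp somefile-XXXXXX.ext) || exit
printf 'Created "%s"\n' "$file"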
