AWK with If condition - linux

I am trying to replace the following string, for example:
from
['55',2,1,10,30,23],
to
['55',2,555,10,30,23],
OR
['55',2,1,10,30,23],
to
['55',2,1,10,9999,23],
I searched around and found this:
$ echo "[55,2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="[55"){$2=10}{print}}'
[55,10,1,10,30,23],
but it's not working in my case, since there is a ' around the value of $1 in my if condition:
$ echo "['55',2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="['55'"){$2=10}{print}}'
['55',2,1,10,30,23],

The problem is not in the awk code, it's the shell expansion. You cannot have single quotes in a singly-quoted shell string. This is the same problem you run into when you try to put the input string into single quotes:
$ echo '['55',2,1,10,30,23],'
[55,2,1,10,30,23],
-- the single quotes are gone! And this makes sense, because they did their job of quoting the [ and the ,2,1,10,30,23], (the 55 is unquoted here), but it is not what we wanted.
A solution is to quote the sections between them individually and squeeze them in manually:
$ echo '['\''55'\'',2,1,10,30,23],'
['55',2,1,10,30,23],
Or, in this particular case, where nothing nefarious is between where the single quotes should be,
echo '['\'55\'',2,1,10,30,23],' # the 55 is now unquoted.
Applied to your awk code, that looks like this:
$ echo "['55',2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="['\'55\''"){$2=10}{print}}'
['55',10,1,10,30,23],
Alternatively, since this doesn't look very nice if you have many single quotes in your code, you can write the awk code into a file, say foo.awk, and use
echo "['55',2,1,10,30,23]," | awk -F, -f foo.awk
Then you don't have to worry about shell quoting mishaps in the awk code because the awk code is not subject to shell expansion anymore.
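For instance, foo.awk for the question's first replacement (third field becomes 555) could look like this -- a sketch; the single quote can be written directly because no shell expansion happens inside the file:
BEGIN { OFS = "," }
$1 == "['55'" { $3 = 555 }
{ print }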

I think how to match and replace is not the problem for you. The problem you are facing is how to match a single quote ' in a field.
To avoid escaping each ' in your code, and to make the code more readable, you can assign the quote to a variable and use that variable instead, for example like this:
echo "['55' 1
['56' 1"|awk -v q="'" '$1=="["q"55"q{$2++}7'
['55' 2
['56' 1
In the above example, the 2nd field is incremented only on the line starting with ['55'.
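Applied to the data from the question, the same idea looks like this (a sketch; 555 is the replacement value the question asks for):
$ echo "['55',2,1,10,30,23]," | awk -F',' -v q="'" 'BEGIN{OFS=","} $1=="["q"55"q{$3=555} {print}'
['55',2,555,10,30,23],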

Related

Using awk command in Bash

I'm trying to loop an awk command using a bash script, and I'm having a hard time including a variable within the single quotes of the awk command. I'm thinking I should be doing this completely in awk, but I feel more comfortable with bash right now.
#!/bin/bash
index="1"
while [ $index -le 13 ]
do
awk "'"/^$index/ {print}"'" text.txt
done
Use the standard approach -- -v option of awk to set/pass the variable:
awk -v idx="$index" '$0 ~ "^"idx' text.txt
Here I have set the awk variable idx to the value of the shell variable $index. Inside awk, I simply use idx as an awk variable.
$0 ~ "^"idx matches if the record starts with (^) whatever the variable idx contains; if so, print the record.
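A minimal sketch of the whole loop with this fix (note the original loop never increments index, so it would also run forever; the increment below is added for that reason):
#!/bin/bash
index=1
while [ "$index" -le 13 ]
do
    awk -v idx="$index" '$0 ~ "^"idx' text.txt
    index=$((index + 1))
done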
awk '/'"$index"'/' text.txt
# A little play on the script part: split the awk command
# and sandwich the bash variable in between using double quotes.
# Note awk prints by default, so idiomatic awk omits the '{print}' too.
should do; alternatively, use grep like
grep "$index" text.txt # Mind the double quotes
Note: -le is used for comparing numbers, so you may change index="1" to index=1.

shell scripts variable passed to awk and double quotes needed to preserve

I have some logs called ts.log that look like
[957670][DEBUG:2016-11-30 16:49:17,968:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{pstm-9805256} Parameters: []
[957670][DEBUG:2016-11-30 16:49:17,968:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{pstm-9805256} Types: []
[957670][DEBUG:2016-11-30 16:50:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} ResultSet
[957670][DEBUG:2016-11-30 16:51:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} Header: [LAST_INSERT_ID()]
[957670][DEBUG:2016-11-30 16:52:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} Result: [731747]
[065417][DEBUG:2016-11-30 16:53:17,986:sdk.protocol.process.InitProcessor.process(InitProcessor.java:61)]query String=requestid=10547
I have a script in which there's something like
#!/bin/bash
begin=$1
cat ts.log | awk -F '[ ,]' '{if($2 ~ /^[0-2][0-9]:[0-6][0-9]:[0-6][0-9]/ && $2>="16:50:17"){print $0}}'
Instead of typing the time like 16:50:17 into the script, I want to pass the shell's $1 to awk so that all I need to do is run ./script with the time (hh:mm:ss) as the argument. The script will look like
#!/bin/bash
begin=$1
cat ts.log | awk -v var=$begin -F '[ ,]' '{if($2 ~ /^[0-2][0-9]:[0-6][0-9]:[0-6][0-9]/ && $2>="var"){print $0}}'
But the double quotes need to be there or it won't work.
I tried 2>"\""var"\"" but it doesn't work.
So is there a way to keep the double quotes there?
Preferred result: run ./script with the time as $1, and have it extract the log from that time onward.
There are many ways to do what you want.
Option 1: Using double quotes enclosing awk program
#!/bin/bash
begin=$1
awk -F '[ ,]' "\$2 ~ /^..:..:../ && \$2 >= \"${begin}\" " ts.log
Inside double-quoted strings, bash does variable substitution, so $begin or ${begin} will be replaced with the shell variable's value (whatever the user passed in).
Undesired effect: awk special variables starting with $ must be escaped with '\' or bash will try to replace them before executing awk.
To get a double quote character (") inside a bash double-quoted string, it has to be escaped with '\', so in bash " \"16:50\" " becomes "16:50". (This doesn't apply to single-quoted strings, which have no variable expansion or escaped characters at all in bash.)
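A quick illustration of both escapes at an interactive prompt:
$ begin=16:50:17
$ echo "\$2 >= \"$begin\""
$2 >= "16:50:17"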
To see what variable substitutions are made when bash executes the script, you can execute it with debug option (it's very enlightening):
$ bash -x yourscript.sh 16:50
Option 2: Using awk variables
#!/bin/bash
begin=$1
awk -F '[ ,]' -v begin=$begin '$2 ~ /^..:..:../ && $2 >= begin' ts.log
Here an awk variable begin is created with the option -v varname=value.
Awk variables can be used anywhere in the awk program like any other awk variable (no double quotes or $ needed).
There are other options, but I think you can work with these two.
In both options I've changed your script a bit:
It doesn't need cat to send data to awk, because awk can run your program over one or more data files given as parameters after the program.
Your awk program doesn't need to include print at all (as #fedorqui said), because a basic awk program is composed of pattern {action} pairs, where the pattern is the same as what you used in the if statement and the default action is {print $0}.
I've also changed the time pattern, primarily to keep the script readable; in a log file there is almost no chance of an 8-character string with two colons inside that is not a time (in the regexp, . matches any character).
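A quick usage sketch, assuming either version is saved as script.sh (the file name is an assumption):
$ ./script.sh 16:50:17
This prints every log line whose timestamp field is 16:50:17 or later (the last four lines of the sample log above).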

Bash prompt scripting

I'm trying to follow a guide found here, but I do not like what I see as a cop out. In the script they set PS1 to
PS1="<code> `cat /proc/loadavg | awk '{print $1}'` <more code>"
My problem with this is I would like to know if it is possible to write it with single quotes like:
PS1='<code> `cat /proc/loadavg | awk \'{print $1}\'` <more code>'
So it is evaluated every time I run a command, not just once. It seems the presence of the single quotes in awk is forcing me to use double quotes. I would like to have this run after every prompt, and I have another awk tidbit of code I would like to run here as well.
If this would be too cumbersome for bash, then I'm fine not having it; it's more a proof of concept anyway.
You can't put a single quote into single quotes, you have to end the single quotes, insert the quote, and start single quotes again:
PS1='$(code | awk '\''{print $1}'\'')'
# or
PS1='$(code | awk '"'"'{print $1}'"'"')'
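Applied to the prompt from the question, that pattern might look like this (a sketch; the "load: " text and the trailing \$ are placeholders, and the $(...) command substitution is re-evaluated each time the prompt is drawn):
PS1='load: $(cat /proc/loadavg | awk '\''{print $1}'\'') \$ '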

find words in two quotes unix

I would like to display the last word on each of these lines. I tried, for example, searching for the word value, but that got me nowhere, so I thought of matching the words between quotes; however, my file contains other quoted words that I don't need. What I actually want is to display the values of the select tag in my HTML file. My attempt:
grep '*' hosts.html | awk '{print $NF}'
For example:
value='www.visit-tunisia.com'>www.visit-tunisia.com
value='www.watania1.tn'>www.watania1.tn
value='www.watania2.tn'>www.watania2.tn
I would have
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
You need to set the field separator to >; you do this with the -F option:
$ awk -F'>' '{print $NF}' hosts.html
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
Note: I'm not sure what you are trying to achieve with grep '*' hosts.html.
Interpreting the comment liberally, you have input lines which might contain:
value='www.visit-tunisia.com'>www.visit-tunisia.com
value='www.watania1.tn'>www.watania1.tn
value='www.watania2.tn'>www.watania2.tn
and you would like the names which are repeated on a line as the output:
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
This can be done using sed and capturing parentheses.
sed -n -e "s/.*'\([^']*\)'.*\1.*/\1/p"
The -n says "don't print unless I say to do so". The s///p command prints if the substitute works. The pattern looks for a stream of 'anything' (.*), a single quote, captures what's inside up to the next single quote ('\([^']*\)') followed by any text, the captured text (the first \1), and anything. The replacement text is what was captured (the second \1).
Example:
$ cat data
www and wotnot
value='www.visit-tunisia.com'>www.visit-tunisia.com
blah
value='www.watania1.tn'>www.watania1.tn
hooplah
value='www.watania2.tn'>www.watania2.tn
if 'nothing' is required, nothing will be done.
$ sed -n -e "s/.*'\([^']*\)'.*\1.*/\1/p" data
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
nothing
$
Clearly, you can refine the [^']* part of the match if you want to. I used double quotes around the expression since the pattern matches on single quotes. Life is trickier if you need to allow both single and double quotes; at that point, I'd put the script into a file and run sed -f script data to make life easier.
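For reference, a sketch of that script-file variant which allows either quote style (the file name script is illustrative):
s/.*'\([^']*\)'.*\1.*/\1/p
s/.*"\([^"]*\)".*\1.*/\1/p
Run it with sed -n -f script data; the -n option is still given on the command line.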
sed 's/.*>\(.*\)/\1/g' your_file

Simple linux script help

I have a text file with the following structure:
text1;text2;text3;text4
...
I need to write a script that gets 2 arguments: the column we want to search in and the content we want to find.
So the script should output only the lines (WHOLE LINES!) where the content (arg2) is found in column x (arg1).
I tried with egrep and sed, but I'm not experienced enough to finish it. I would appreciate some guidance...
Given the added information that you need to output the entire line, awk is easiest:
awk -F';' -v col=$col -v pat="$val" '$col ~ pat' $input
Explaining the above, the -v options set awk variables without needing to worry about quoting issues in the body of the awk script. Pre-POSIX versions of awk won't understand the -v option, but will recognize the variable assignment without it. The -F option sets the field separator. In the body, we are using a pattern with the default action (which is print); the pattern uses the variables we set with -v for both the column ($ there is awk's "field index" operator, not a shell variable) and the pattern (and pat can indeed hold an awk-style regex).
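A minimal wrapper script sketch around that one-liner (the argument order and the third, file-name argument are assumptions; use == instead of ~ if you want exact matches rather than regex matches):
#!/bin/bash
col=$1      # arg 1: column number to search
val=$2      # arg 2: content/pattern to find
input=$3    # arg 3: file to search (assumed)
awk -F';' -v col="$col" -v pat="$val" '$col ~ pat' "$input"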
cat text_file.txt | cut -d';' -f column_num | grep pattern
It prints only the column that matched, not the entire line. Let me think if there is a simple solution for that.
Python
#!/usr/bin/env python
import sys

column = 1  # the column to search
value = "the data you're looking for"

with open("your file", "r") as source:
    for line in source:
        fields = line.strip().split(';')
        if fields[column] == value:
            print line
There's also a solution with egrep. It's not a very beautiful one but it works:
egrep "^([^;]+;){`expr $col - 1`}$value;([^;]+;){`expr 3 - $col`}([^;]+){`expr 4 - $col`}$" filename
or even shorter:
egrep "^([^;]+;){`expr $col - 1`}$value(;|$)" filename
grep -B1 -i "string from previous line" |grep -iv 'check string from previous line' |awk -F" " '{print $1}'
This will print your line.
