awk usage in a variable - linux

The actlist file contains around 15 records. I want to print/store each row in a variable to perform further actions. The script runs, but echo $j displays a blank value. What is the issue?
my script:
#/usr/bin/sh
acList=/root/john/actlist
Rowcount=`wc -l $acList | awk -F " " '{print $1}'`
for ((i=1; i<=Rowcount; i++)); do
j=`awk 'FNR == $i{print}' $acList`
echo $j
done
file: actlist
cat > actlist
5663233332 2223 2
5656556655 5545 5
4454222121 5555 5
.
.
.

The issue comes down to quoting and to the way the shell interpolates variables.
More specifically, when you write
j=`awk "FNR == $i{print}" $acList`
the AWK code is enclosed in double quotes. This is necessary if you want the shell to substitute $i with the value stored in the shell variable i before awk runs.
On the other hand, if you write
j=`awk 'FNR == $i{print}' $acList`
i.e. with single quotes, the shell passes $i to awk untouched. Inside awk, i is an unset variable, so $i means $0 (the whole line), and FNR never equals it for this data; nothing is printed and j stays empty.
Hence the fixed code will read:
#!/bin/bash
acList=/root/john/actlist
Rowcount=`wc -l $acList | awk -F " " '{print $1}'`
for ((i=1; i<=Rowcount; i++)); do
j=`awk "FNR == $i{print}" $acList`
echo $j
done
Remember: it is always the shell that does variable interpolation before calling other commands.
Having said that, there are some places, in supplied code, where some improvements could be devised. But that's another story.
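A common alternative to embedding the shell variable in a double-quoted awk program is to hand the value over with awk's -v option, so the awk code can stay in single quotes. The following is only a minimal sketch: it assumes a throwaway sample file created with mktemp in place of /root/john/actlist, and the awk variable name n and the rows array are arbitrary choices for illustration.

```bash
#!/bin/bash
# Sample data standing in for the actlist file (an assumption for this demo).
acList=$(mktemp)
printf '%s\n' '5663233332 2223 2' '5656556655 5545 5' '4454222121 5555 5' > "$acList"

rowcount=$(wc -l < "$acList")
rows=()
for ((i = 1; i <= rowcount; i++)); do
    # -v copies the shell value of i into the awk variable n,
    # so the awk program itself can stay single-quoted.
    j=$(awk -v n="$i" 'FNR == n' "$acList")
    rows+=("$j")
    echo "$j"
done
rm -f "$acList"
```

Keeping the awk program in single quotes avoids having to escape awk's own $ fields, which becomes awkward quickly once the program is double-quoted.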

Unfortunately, all your script does is print the contents of the input file, so without more information we can't help you work out the right approach for whatever it is you really want to do. Chances are, though, that this is the right starting point:
acList=/root/john/actlist
awk '
{ print }
' "$acList"

I think you would probably be better off with this for parsing your file:
#!/bin/bash
actlist=/root/john/actlist   # path taken from the question
while read a b c; do
    echo $a, $b, $c
done < "$actlist"
Output:
5663233332, 2223, 2
5656556655, 5545, 5
4454222121, 5555, 5
Updated
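Whilst the above demonstrates the concept I was suggesting, as @EdMorton rightly says in the comments section, the following code would be more robust for a production environment.
#!/bin/bash
actlist=/root/john/actlist   # path taken from the question
# -r stops read from mangling backslashes; IFS is left at its default
# so that read still splits each line into a, b and c.
while read -r a b c; do
    echo "$a, $b, $c"
done < "$actlist"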

Related

How to search the full string in file which is passed as argument in shell script?

I am passing an argument to a shell script, and I have to match that argument against a file and extract the corresponding information. Could you please tell me how I can do this?
Example:
I have below details in file-
iMedical_Refined_load_Procs_task_id=970113
HV_Rawlayer_Execution_Process=988835
iMedical_HV_Refined_Load=988836
DHS_RawLayer_Execution_Process=988833
iMedical_DHS_Refined_Load=988834
If I pass 'hv' as the argument, it should pick 'iMedical_HV_Refined_Load' and give the result '988836'.
If I pass 'dhs', it should pick 'iMedical_DHS_Refined_Load' and give the result '988834'.
I tried the logic below, but it does not give the correct result. What changes do I need to make?
echo $1 | tr a-z A-Z
g=${1^^}
echo $g
echo $1
val=$(awk -F= -v s="$g" '$g ~ s{print $2}' /medaff/Scripts/Aggrify/sltconfig.cfg)
echo "TASK ID is $val"
Assuming your matching criterion is the string right after an _ delimiter and the output needed is the number after the = character, you can try this sed (the I flag for case-insensitive matching is a GNU sed extension):
$ sed -n "/_$1/I{s/[^=]*=\(.*\)/\1/p}" input_file
$ read -r input
hv
$ sed -n "/_$input/I{s/[^=]*=\(.*\)/\1/p}" input_file
988836
$ read -r input
dhs
$ sed -n "/_$input/I{s/[^=]*=\(.*\)/\1/p}" input_file
988834
If I'm reading it right, 2 quick versions -
$: cat 1
awk -F= -v s="_${1^^}_" '$1~s{print $2}' file
$: cat 2
sed -En "/_${1^^}_/{s/^.*=//;p;}" file
Both basically the same logic.
In pure bash -
$: cat 3
while IFS='=' read key val; do [[ "$key" =~ "_${1^^}_" ]] && echo "$val"; done < file
That's a lot less efficient, though.
If you know for sure there will be only one hit, all these could be improved a bit by short-circuit exits, but on such a small sample it won't matter at all. If you have a larger dataset to read, then I strongly suggest you formalize your specs better than "in this set I should get...".
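The short-circuit exit mentioned above could look roughly like the sketch below for the awk variant. It is an illustration under assumptions: the config is written to a temporary file instead of /medaff/Scripts/Aggrify/sltconfig.cfg, lookup_task_id is a hypothetical helper name, and index() is used here for a plain substring test rather than the regex match from the answers.

```bash
#!/bin/bash
# Sample config standing in for sltconfig.cfg (an assumption for this demo).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
iMedical_Refined_load_Procs_task_id=970113
HV_Rawlayer_Execution_Process=988835
iMedical_HV_Refined_Load=988836
DHS_RawLayer_Execution_Process=988833
iMedical_DHS_Refined_Load=988834
EOF

# lookup_task_id is a hypothetical helper, not part of the original answers.
lookup_task_id() {
    local key=${1^^}   # upper-case the argument (bash 4+)
    # index() returns non-zero when _KEY_ occurs in the key field;
    # exit stops reading the file after the first hit.
    awk -F= -v s="_${key}_" 'index($1, s) { print $2; exit }' "$cfg"
}

hv_id=$(lookup_task_id hv)
dhs_id=$(lookup_task_id dhs)
echo "hv=$hv_id dhs=$dhs_id"
rm -f "$cfg"
```

Requiring an underscore on both sides is what keeps 'hv' from matching HV_Rawlayer_Execution_Process, where HV starts the line.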

Linux Bash: Use awk(substr) to get parameters from file input

I have a .txt-file like this:
'SMb_TSS0303' '171765' '171864' '-' 'NC_003078' 'SMb20154'
'SMb_TSS0302' '171758' '171857' '-' 'NC_003078' 'SMb20154'
I want to extract the following as parameters:
-'SMb'
-'171765'
-'171864'
-'-' (minus)
-> need them without quotes
I am trying to do this in a shell script:
#!/bin/sh
file=$1
cat "$1"|while read line; do
echo "$line"
parent=$(awk {'print substr($line,$0,5)'})
echo "$parent"
done
This echoes 'SMb
As far as I understood awk's substr, I thought it would work like this:
substr(s, a, b) => returns b characters from string s, starting at position a
Firstly, I do not understand why I can extract 'SMb with 0-5; secondly, I can't extract any other parameter I need, because moving the start does not work.
E.g. $1,6 gives an empty echo. I would expect Mb_TSS
Desired final output:
#!/bin/sh
file=$1
cat "$1"|while read line; do
parent=$(awk {'print substr($line,$0,5)'})
start=$(awk{'print subtrs($line,?,?')})
end=$(awk{'print subtrs($line,?,?')})
strand=$(awk{'print subtrs($line,?,?')})
done
echo "$parent" -> echos SMb
echo "$start" -> echos 171765
echo "$end" -> echos 171864
echo "$strand" -> echos -
My guess is that the items in each line are being treated as single strings or something? Maybe I am also handling the file parsing wrongly, but nothing I have tried works.
Really unclear exactly what you're trying to do. But I can at least help you with the awk syntax:
while read -r line
do
parent=$(echo $line | awk '{print substr($1,2,3)}')
start=$(echo $line | awk '{print substr($2,2,6)}')
echo $parent
echo $start
done < file
This outputs:
SMb
171765
SMb
171758
You should be able to figure out how to get the rest of the fields.
This is quite an inefficient way to do this but based on the information in the question I'm unable to provide a better answer at the moment.
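One way to avoid starting awk for every field is to let read split the line and strip the quotes with parameter expansion. This is only a sketch under assumptions: the two sample lines from the question are written to a temporary file, and the results array exists purely so the values can be inspected afterwards.

```bash
#!/bin/bash
# Sample input standing in for the .txt file (an assumption for this demo).
input=$(mktemp)
cat > "$input" <<'EOF'
'SMb_TSS0303' '171765' '171864' '-' 'NC_003078' 'SMb20154'
'SMb_TSS0302' '171758' '171857' '-' 'NC_003078' 'SMb20154'
EOF

results=()
# read splits each line on whitespace; "_" soaks up the unused trailing fields.
while read -r name start end strand _; do
    name=${name//\'/}       # remove every single quote
    start=${start//\'/}
    end=${end//\'/}
    strand=${strand//\'/}
    parent=${name%%_*}      # keep the text before the first underscore
    results+=("$parent $start $end $strand")
    echo "$parent $start $end $strand"
done < "$input"
rm -f "$input"
```

Because the loop reads from a redirection rather than a pipe, the variables are still set in the current shell after the loop ends.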
The question was originally tagged python, so let me propose a Python solution:
with open("input.txt") as f:
    for l in f:  # iterate over the lines of the opened file
        data = [x.strip("'").partition("_")[0] for x in l.split()[:4]]
        print("\n".join(data))
It opens the file, splits each line the way awk would, considers only the first 4 fields, and strips off the quotes to create the list. It then displays the items separated by newlines.
That prints:
SMb
171765
171864
-
SMb
171758
171857
-

Operation Precedence

I want to store the result of a command into an array variable. I'm having trouble because the command itself contains variables that must be resolved before its execution. For example:
for ((i=1; i<=4; i++)); do
NEXT=$(( i + 1 ))
MYARRAY[i]=$(cat $VARIABLE | uniq | sed -n '$NEXTp')
done
The "cat $VARIABLE" command is being processed correctly. The problem is with "$NEXT" substitution that is immediately followed by a "p" character. How can I force the script to resolve $NEXT variable before executing the command and store the results inside MYARRAY[i]?
Thanks.
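The typical solution is to delimit the variable name with braces and to use double quotes so that the shell expands it: sed -n "${NEXT}p". Within single quotes, as in the original '$NEXTp', no expansion happens at all, and without the braces the shell would look for a variable named NEXTp.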
Notice that what you are doing is fairly atypical. It is more usual to assign to an array using something like:
IFS='
'
MYARRAY=( $( < $VARIABLE uniq | sed -n 1,5p ))
This will assign MYARRAY[0], which your original code does not do, but it's not clear to me whether that is intentional or an attempt to adjust the indexing. As always, UUOC (useless use of cat) is to be discouraged, and although uniq can take $VARIABLE as an argument, redirection is a good idiom, so I'm using it in the example to demonstrate a simple way to eliminate UUOC in 99.9% of the cases where it appears.
You could add the 'p' to NEXT before using it in the expression:
for ((i=1; i<=4; i++)); do
NEXT=$(expr $i + 1)
NEXT+='p'
MYARRAY[i]=$(cat $VARIABLE | uniq | sed -n $NEXT)
done
You can use script like this:
for ((i=1; i<=4; i++)); do
NEXT=$(expr $i + 1)
MYARRAY[i]=$(cat $VARIABLE | uniq | sed -n $NEXT'p')
done
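For comparison, the ${NEXT}p form can be dropped straight into the original loop as long as the sed expression is double-quoted. A minimal sketch, assuming a small sample file created with mktemp stands in for whatever $VARIABLE points to:

```bash
#!/bin/bash
# Sample file standing in for $VARIABLE (an assumption for this demo).
VARIABLE=$(mktemp)
printf '%s\n' alpha alpha beta gamma gamma delta epsilon > "$VARIABLE"

MYARRAY=()
for ((i = 1; i <= 4; i++)); do
    NEXT=$((i + 1))
    # Double quotes let the shell expand NEXT; the braces separate
    # the variable name from the sed command letter "p".
    MYARRAY[i]=$(uniq < "$VARIABLE" | sed -n "${NEXT}p")
done
printf '%s\n' "${MYARRAY[@]}"
rm -f "$VARIABLE"
```

With this sample data, uniq yields five distinct lines, and the loop stores lines 2 through 5 of that output in MYARRAY[1] through MYARRAY[4].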

bash print first to nth column in a line iteratively

I am trying to get the column names of a file and print them iteratively. I guess the problem is with the print $i but I don't know how to correct it. The code I tried is:
#! /bin/bash
for i in {2..5}
do
set snp = head -n 1 smaller.txt | awk '{print $i}'
echo $snp
done
Example input file:
ID Name Age Sex State Ext
1 A 12 M UT 811
2 B 12 F UT 818
Desired output:
Name
Age
Sex
State
Ext
But the output I get is a blank screen.
You'd better just read the first line of your file and store the result as an array:
read -a header < smaller.txt
and then printf the relevant fields:
printf "%s\n" "${header[@]:1}"
Moreover, this uses bash only, and involves no unnecessary loops.
Edit. To also answer your comment, you'll be able to loop through the header fields thus:
read -a header < smaller.txt
for snp in "${header[@]:1}"; do
echo "$snp"
done
Edit 2. Your original method had many many mistakes. Here's a corrected version of it (although what I wrote before is a much preferable way of solving your problem):
for i in {2..5}; do
snp=$(head -n 1 smaller.txt | awk "{print \$$i}")
echo "$snp"
done
set probably doesn't do what you think it does.
Because of the single quotes in awk '{print $i}', the $i never gets expanded by bash.
This algorithm is not good since you're calling head and awk 4 times, whereas you don't need a single external process.
Hope this helps!
You can print it using awk itself:
awk 'NR==1{for (i=2; i<=5; i++) print $i}' smaller.txt
The main problem with your code is that your assignment syntax is wrong. Change this:
set snp = head -n 1 smaller.txt | awk '{print $i}'
to this:
snp=$(head -n 1 smaller.txt | awk -v i="$i" '{print $i}')
That is:
Do not use set. set is for setting shell options, numbered parameters, and so on, not for assigning arbitrary variables.
Remove the spaces around =.
To run a command and capture its output as a string, use $(...) (or `...`, but $(...) is less error-prone).
Pass the loop index to awk with -v i="$i". Inside single quotes the shell never expands $i, so awk would otherwise see its own unset variable i and print the whole line.
That said, I agree with gniourf_gniourf's approach.
Here's another alternative; not necessarily better or worse than any of the others:
for n in $(head -n 1 smaller.txt)
do
echo ${n}
done
Something like:
for x1 in $(head -n1 smaller.txt );do
echo $x1
done

Counter increment in Bash loop not working

I have the following simple script where I am running a loop and want to maintain a COUNTER. I am unable to figure out why the counter is not updating. Is it due to the subshell that's getting created? How can I fix this?
#!/bin/bash
WFY_PATH=/var/log/nginx
WFY_FILE=error.log
COUNTER=0
grep 'GET /log_' $WFY_PATH/$WFY_FILE | grep 'upstream timed out' | awk -F ', ' '{print $2,$4,$0}' | awk '{print "http://domain.example"$5"&ip="$2"&date="$7"&time="$8"&end=1"}' | awk -F '&end=1' '{print $1"&end=1"}' |
(
while read WFY_URL
do
echo $WFY_URL #Some more action
COUNTER=$((COUNTER+1))
done
)
echo $COUNTER # output = 0
First, you are not increasing the counter. Changing COUNTER=$((COUNTER)) into COUNTER=$((COUNTER + 1)) or COUNTER=$[COUNTER + 1] will increase it (the $[...] form is an older, deprecated syntax; prefer $((...))).
Second, as you surmise, it is trickier to propagate subshell variables back to the parent shell. Variables set in a subshell are not available outside it; they are local to the child process.
One way to solve it is using a temp file for storing the intermediate value:
TEMPFILE=/tmp/$$.tmp
echo 0 > $TEMPFILE
# Loop goes here
# Fetch the value and increase it
COUNTER=$[$(cat $TEMPFILE) + 1]
# Store the new value
echo $COUNTER > $TEMPFILE
# Loop done, script done, delete the file
unlink $TEMPFILE
COUNTER=1
while [ Your != "done" ]
do
echo " $COUNTER "
COUNTER=$[$COUNTER +1]
done
TESTED BASH: Centos, SuSE, RH
COUNTER=$((COUNTER+1))
is quite a clumsy construct in modern programming.
(( COUNTER++ ))
looks more "modern". You can also use
let COUNTER++
if you think that improves readability. Sometimes, Bash gives too many ways of doing things - Perl philosophy I suppose - when perhaps the Python "there is only one right way to do it" might be more appropriate. That's a debatable statement if ever there was one! Anyway, I would suggest the aim (in this case) is not just to increment a variable but (general rule) to also write code that someone else can understand and support. Conformity goes a long way to achieving that.
HTH
Try to use
COUNTER=$((COUNTER+1))
instead of
COUNTER=$((COUNTER))
I think this single awk call is equivalent to your grep|grep|awk|awk pipeline: please test it. Your last awk command appears to change nothing at all.
The problem with COUNTER is that the while loop is running in a subshell, so any changes to the variable vanish when the subshell exits. You need to access the value of COUNTER in that same subshell. Or take @DennisWilliamson's advice, use a process substitution, and avoid the subshell altogether.
awk '
  /GET \/log_/ && /upstream timed out/ {
    split($0, a, ", ")
    split(a[2] FS a[4] FS $0, b)
    print "http://example.com" b[5] "&ip=" b[2] "&date=" b[7] "&time=" b[8] "&end=1"
  }
' "$WFY_PATH/$WFY_FILE" | {
  while read WFY_URL
  do
    echo $WFY_URL #Some more action
    (( COUNTER++ ))
  done
  echo $COUNTER
}
count=0
base=1
(( count += base ))
Instead of using a temporary file, you can avoid creating a subshell around the while loop by using process substitution.
while ...
do
...
done < <(grep ...)
By the way, you should be able to transform all that grep, grep, awk, awk, awk into a single awk.
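A minimal runnable sketch of the process-substitution approach follows. It assumes a printf of three made-up URLs in place of the real grep/awk pipeline, since the nginx log is not available here.

```bash
#!/bin/bash
COUNTER=0
# printf stands in for the grep/awk pipeline (an assumption for this demo).
while read -r WFY_URL; do
    echo "$WFY_URL"   # Some more action
    COUNTER=$((COUNTER + 1))
done < <(printf '%s\n' 'http://domain.example/a' 'http://domain.example/b' 'http://domain.example/c')
echo "$COUNTER"
```

Here the loop runs in the current shell and only the producer runs in a separate process, so COUNTER keeps its value after done.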
Starting with Bash 4.2, there is a lastpipe option that
runs the last command of a
pipeline in the current shell context. The lastpipe option has no
effect if job control is enabled.
bash -c 'echo foo | while read -r s; do c=3; done; echo "$c"'
bash -c 'shopt -s lastpipe; echo foo | while read -r s; do c=3; done; echo "$c"'
3
minimalist
counter=0
((counter++))
echo $counter
There were two conditions that together caused the expression ((var++)) to fail for me:
bash is set to strict mode (set -euo pipefail) and the increment starts at zero (0).
Starting at one (1) is fine, but with zero the post-increment expression evaluates to 0 (the old value), so ((...)) returns exit status 1, which counts as a failure in strict mode.
I can use either ((var+=1)) or var=$((var+1)) to avoid this behavior.
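The exit statuses involved can be inspected without actually enabling set -e. The sketch below is illustrative only; the names post_status, add_status and assign_status are made up for the demo, and each command is followed by || so that it would not abort even under errexit.

```bash
#!/bin/bash
# Record the exit status of each increment style, starting from zero each time.
var=0
post_status=0
((var++)) || post_status=$?          # expression value is the old value, 0

var=0
add_status=0
((var += 1)) || add_status=$?        # expression value is the new value, 1

var=0
assign_status=0
var=$((var + 1)) || assign_status=$? # a plain assignment always succeeds

echo "post=$post_status add=$add_status assign=$assign_status"
```

Only the post-increment from zero reports a non-zero status, which is the case that trips set -e.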
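This is all you need to do (as a standalone command, use the ((...)) form; a bare $((COUNTER++)) would make the shell try to run the resulting number as a command):
((COUNTER++))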
Here's an excerpt from Learning the bash Shell, 3rd Edition, pp. 147, 148:
bash arithmetic expressions are equivalent to their counterparts in
the Java and C languages.[9] Precedence and associativity are the same
as in C. Table 6-2 shows the arithmetic operators that are supported.
Although some of these are (or contain) special characters, there is
no need to backslash-escape them, because they are within the $((...))
syntax.
..........................
The ++ and -- operators are useful when you want to increment or
decrement a value by one.[11] They work the same as in Java and C,
e.g., value++ increments value by 1. This is called post-increment;
there is also a pre-increment: ++value. The difference becomes evident
with an example:
$ i=0
$ echo $i
0
$ echo $((i++))
0
$ echo $i
1
$ echo $((++i))
2
$ echo $i
2
See http://www.safaribooksonline.com/a/learning-the-bash/7572399/
This is a simple example
COUNTER=1
for i in {1..5}
do
echo $COUNTER;
#echo "Welcome $i times"
((COUNTER++));
done
The source script has a problem with the subshell.
In the first example, you probably do not need the explicit subshell, but we don't know what is hidden under "Some more action".
The most popular answer has a hidden bug: it increases I/O and won't work with a subshell, because it restores the counter inside the loop.
Do not forget to add the '\' sign; it informs the bash interpreter about line continuation. I hope it will help you or anybody. In my opinion, though, this script should be fully converted to an AWK script, or else rewritten in Python using regular expressions, or in Perl, although Perl's popularity has declined over the years. Better to do it in Python.
Version without the explicit subshell parentheses (the pipeline still runs the while loop in a subshell, so the final echo still prints 0):
#!/bin/bash
WFY_PATH=/var/log/nginx
WFY_FILE=error.log
COUNTER=0
grep 'GET /log_' $WFY_PATH/$WFY_FILE | grep 'upstream timed out' |\
awk -F ', ' '{print $2,$4,$0}' |\
awk '{print "http://example.com"$5"&ip="$2"&date="$7"&time="$8"&end=1"}' |\
awk -F '&end=1' '{print $1"&end=1"}' |\
#( #unneeded bracket
while read WFY_URL
do
echo $WFY_URL #Some more action
COUNTER=$((COUNTER+1))
done
# ) unneeded bracket
echo $COUNTER # output = 0
Version with subshell if it is really needed
#!/bin/bash
TEMPFILE=/tmp/$$.tmp #I've got it from the most popular answer
WFY_PATH=/var/log/nginx
WFY_FILE=error.log
COUNTER=0
grep 'GET /log_' $WFY_PATH/$WFY_FILE | grep 'upstream timed out' |\
awk -F ', ' '{print $2,$4,$0}' |\
awk '{print "http://example.com"$5"&ip="$2"&date="$7"&time="$8"&end=1"}' |\
awk -F '&end=1' '{print $1"&end=1"}' |\
(
while read WFY_URL
do
echo $WFY_URL #Some more action
COUNTER=$((COUNTER+1))
done
echo $COUNTER > $TEMPFILE #store counter only once, do it after loop, you will save I/O
)
COUNTER=$(cat $TEMPFILE) #restore counter
unlink $TEMPFILE
echo $COUNTER # now prints the number of lines read
It seems that you didn't update the counter in the script; use counter++.

Resources