Awk print lines starting with regex (IP address) - linux

Im trying to read a file for those lines which has IP address on the first column.
my command below does not return any value.
cat test.csv | awk '$1 == "^[[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}]" { print $0 }'
The regex can capture IP address.
Tried the below too,
cat test_1.csv | awk '$1~/^[[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\]/ {print $0}'
test.csv
1.1.1.1 ipaddress gateway
2.2.2.2 ipaddress_2 firewall
www.google.com domain google

You can do it more easily with grep:
grep -P '^\d+(\.\d+){3}\s' test.csv
or
grep -P '^\d{1,3}(\.\d{1,3}){3}\s' test.csv

When you use {1,3} (Interval expressions) in GNU awk, you have to use either --re-interval (or) --posix option to enable it.
Use :
awk --posix '$1 ~ /^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/' file
(OR)
awk --re-interval '$1 ~ /^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/' file
From man awk:
r{n,m}
One or two numbers inside braces denote an interval expression.
Interval expressions are only available if either --posix or
--re-interval is specified on the command line.

Related

How do you change column names to lowercase with linux and store the file as it is?

I am trying to change the column names to lowercase in a csv file. I found the code to do that online but I dont know how to replace the old column names(uppercase) with new column names(lowercase) in the original file. I did something like this:
$cat head -n1 xxx.csv | tr "[A-Z]" "[a-z]"
But it simply just prints out the column names in lowercase, which is not enough for me.
I tried to add sed -i but it did not do any good. Thanks!!
Using awk (readability winner) :
concise way:
awk 'NR==1{print tolower($0);next}1' file.csv
or using ternary operator:
awk '{print (NR==1) ? tolower($0): $0}' file.csv
or using if/else statements:
awk '{if (NR==1) {print tolower($0)} else {print $0}}' file.csv
To change the file for real:
awk 'NR==1{print tolower($0);next}1' file.csv | tee /tmp/temp
mv /tmp/temp file.csv
For your information, sed using the in place edit switch -i do the same: it use a temporary file under the hood.
You can check this by using :
strace -f -s 800 sed -i'' '...' file
Using perl:
perl -i -pe '$_=lc() if $.==1' file.csv
It replace the file on the fly with -i switch
You can use sed to tell it to replace the first line with all lower-case and then print the rest as-is:
sed '1s/.*/\L&/' ./xxx.csv
Redirect the output or use -i to do an in-place edit.
Proof of Concept
$ echo -e "COL1,COL2,COL3\nFoO,bAr,baZ" | sed '1s/.*/\L&/'
col1,col2,col3
FoO,bAr,baZ

bash: awk print with in print

I need to grep some pattern and further i need to print some output within that. Currently I am using the below command which is working fine. But I like to eliminate using multiple pipe and want to use single awk command to achieve the same output. Is there a way to do it using awk?
root#Server1 # cat file
Jenny:Mon,Tue,Wed:Morning
David:Thu,Fri,Sat:Evening
root#Server1 # awk '/Jenny/ {print $0}' file | awk -F ":" '{ print $2 }' | awk -F "," '{ print $1 }'
Mon
I want to get this output using single awk command. Any help?
You can try something like:
awk -F: '/Jenny/ {split($2,a,","); print a[1]}' file
Try this
awk -F'[:,]+' '/Jenny/{print $2}' file.txt
It is using muliple -F value inside the [ ]
The + means one or more since it is treated as a regex.
For this particular job, I find grep to be slightly more robust.
Unless your company has a policy not to hire people named Eve.
(Try it out if you don't understand.)
grep -oP '^[^:]*Jenny[^:]*:\K[^,:]+' file
Or to do a whole-word match:
grep -oP '^[^:]*\bJenny\b[^:]*:\K[^,:]+' file
Or when you are confident that "Jenny" is the full name:
grep -oP '^Jenny:\K[^,:]+' file
Output:
Mon
Explanation:
The stuff up until \K speaks for itself: it selects the line(s) with the desired name.
[^,:]+ captures the day of week (in this case Mon).
\K cuts off everything preceding Mon.
-o cuts off anything following Mon.

How Can I Perform Awk Commands Only On Certain Fields

I have CSV columns that I'm working with:
info,example-string,super-example-string,otherinfo
I would like to get:
example-string super example string
Right now, I'm running the following command:
awk -F ',' '{print $3}' | sed "s/-//g"
But, then I have to paste the lines together to combine $2 and $3.
Is there anyway to do something like this?
awk -F ',' '{print $2" "$3}' | sed "s/-//g"
Except, where the sed command is only performed on $3 and $2 stays in place? I'm just concerned later on if the lines don't match up, the data could be misaligned.
Please note: I need to keep the pipe for the SED command. I just used a simple example but I end up running a lot of commands after that as well.
Try:
$ awk -F, '{gsub(/-/," ",$3); print $2,$3}' file
example-string super example string
How it works
-F,
This tells awk to use a comma as the field separator.
gsub(/-/," ",$3)
This replaces all - in field 3 with spaces.
print $2,$3
This prints fields 2 and 3.
Examples using pipelines
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}'
example-string super example string
In a pipeline with sed:
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}' | sed 's/string/String/g'
example-String super example String
Though best solution will be either use a single sed or use single awk. Since you have requested to use awk and sed solution so providing this. Also considering your actual data will be same as shown sample Input_file.
awk -F, '{print $2,$3}' Input_file | sed 's/\([^ ]*\)\([^-]*\)-\([^-]*\)-\([^-]*\)/\1 \2 \3 \4/'
Output will be as follows.
example-string super example string

How to run grep inside awk?

Suppose I have a file input.txt with few columns and few rows, the first column is the key, and a directory dir with files which contain some of these keys. I want to find all lines in the files in dir which contain these key words. At first I tried to run the command
cat input.txt | awk '{print $1}' | xargs grep dir
This doesn't work because it thinks the keys are paths on my file system. Next I tried something like
cat input.txt | awk '{system("grep -rn dir $1")}'
But this didn't work either, eventually I have to admit that even this doesn't work
cat input.txt | awk '{system("echo $1")}'
After I tried to use \ to escape the white space and the $ sign, I came here to ask for your advice, any ideas?
Of course I can do something like
for x in `cat input.txt` ; do grep -rn $x dir ; done
This is not good enough, because it takes two commands, but I want only one. This also shows why xargs doesn't work, the parameter is not the last argument
You don't need grep with awk, and you don't need cat to open files:
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.
If input.txt is not a file, then tweak the above to:
real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*
All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
Try following
awk '{print $1}' input.txt | xargs -n 1 -I pattern grep -rn pattern dir
First thing you should do is research this.
Next ... you don't need to grep inside awk. That's completely redundant. It's like ... stuffing your turkey with .. a turkey.
Awk can process input and do "grep" like things itself, without the need to launch the grep command. But you don't even need to do this. Adapting your first example:
awk '{print $1}' input.txt | xargs -n 1 -I % grep % dir
This uses xargs' -I option to put xargs' input into a different place on the command line it runs. In FreeBSD or OSX, you would use a -J option instead.
But I prefer your for loop idea, converted into a while loop:
while read key junk; do grep -rn "$key" dir ; done < input.txt
Use process substitution to create a keyword "file" that you can pass to grep via the -f option:
grep -f <(awk '{print $1}' input.txt) dir/*
This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to
awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
grep requires parameters in order: [what to search] [where to search]. You need to merge keys received from awk and pass them to grep using the \| regexp operator.
For example:
arturcz#szczaw:/tmp/s$ cat words.txt
foo
bar
fubar
foobaz
arturcz#szczaw:/tmp/s$ grep 'foo\|baz' words.txt
foo
foobaz
Finally, you will finish with:
grep `commands|to|prepare|a|keywords|list` directory
In case you still want to use grep inside awk, make sure $1, $2 etc are outside quote.
eg. this works perfectly
cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'
// notice the space after grep and before file name

Assigning output of a command to a variable(BASH)

I need to assign the output of a command to a variable. The command I tried is:
grep UUID fstab | awk '/ext4/ {print $1}' | awk '{print substr($0,6)}'
I try this code to assign a variable:
UUID=$(grep UUID fstab | awk '/ext4/ {print $1}' | awk '{print substr($0,6)}')
However, it gives a syntax error. In addition I want it to work in a bash script.
The error is:
./upload.sh: line 12: syntax error near unexpected token ENE=$( grep UUID fstab | awk '/ext4/ {print $1}' | awk '{print substr($0,6)}'
)'
./upload.sh: line 12: ENE=$( grep UUID fstab | awk '/ext4/ {print $1}' | awk '{print substr($0,6)}'
)'
well, using the '$()' subshell operator is a common way to get the output of a bash command. As it spans a subshell it is not that efficient.
I tried :
UUID=$(grep UUID /etc/fstab|awk '/ext4/ {print $1}'|awk '{print substr($0,6)}')
echo $UUID # writes e577b87e-2fec-893b-c237-6a14aeb5b390
it works perfectly :)
EDIT:
Of course you can shorten your command :
# First step : Only one awk
UUID=$(grep UUID /etc/fstab|awk '/ext4/ {print substr($1,6)}')
Once more time :
# Second step : awk has a powerful regular expression engine ^^
UUID=$(cat /etc/fstab|awk '/UUID.*ext4/ {print substr($1,6)}')
You can also use awk with a file argument ::
# Third step : awk use fstab directlty
UUID=$(awk '/UUID.*ext4/ {print substr($1,6)}' /etc/fstab)
Just for trouble-shooting purposes, and something else to try to see if you can get this to work, you could also try to use "backticks", e.g,
cur_dir=`pwd`
would save the output of the pwd command in your variable cur_dir, though using $() approach is generally preferable.
To quote from a pages given to me on http://unix.stackexchange.com:
The second form `COMMAND` (using backticks) is more or less obsolete for Bash, since it
has some trouble with nesting ("inner" backticks need to be escaped)
and escaping characters. Use $(COMMAND), it's also POSIX!

Resources