I have a text file with the following structure:
text1;text2;text3;text4
...
I need to write a script that gets 2 arguments: the column we want to search in and the content we want to find.
So the script should output only the lines (WHOLE LINES!) that match content(arg2) found in column x(arg1).
I tried with egrep and sed, but I'm not experienced enough to finish it. I would appreciate some guidance...
Given your added information of needing to output the entire line, awk is easiest:
awk -F';' -v col=$col -v pat="$val" '$col ~ pat' $input
Explaining the above, the -v options set awk variables without needing to worry about quoting issues in the body of the awk script. Pre-POSIX versions of awk won't understand the -v option, but will recognize the variable assignment without it. The -F option sets the field separator. In the body, we are using a pattern with the default action (which is print); the pattern uses the variables we set with -v for both the column ($ there is awk's "field index" operator, not a shell variable) and the pattern (and pat can indeed hold an awk-style regex).
cat text_file.txt| cut -d';' column_num | grep pattern
It prints only the column that is matched and not the entire line. let me think if there is a simple solution for that.
Python
#!/usr/bin/env python
import sys
column = 1 # the column to search
value = "the data you're looking for"
with open("your file","r") as source:
for line in source:
fields = line.strip().split(';')
if fields[column] == value:
print line
There's also a solution with egrep. It's not a very beautiful one but it works:
egrep "^([^;]+;){`expr $col - 1`}$value;([^;]+;){`expr 3 - $col`}([^;]+){`expr 4 - $col`}$" filename
or even shorter:
egrep "^([^;]+;){`expr $col - 1`}$value(;|$)" filename
grep -B1 -i "string from previous line" |grep -iv 'check string from previous line' |awk -F" " '{print $1}'
This will print your line.
Related
I have a path ./test/test1 and I need to extract the test1 part.
I can do that with
cut -d '/' -f 3
But I may also have a path like ./test/test1/test1a in which case I need to extract the test1a part.
I can do this in a similar manner by switching 2, 3, 4 to suit my needs.
But how can I achieve this if I have a list which contains some paths.
./test/test1
./test/test1/test1a/
./test/test1/test1a/example
How can I always make sure I extract the last part of the string after the last / delimiter? How do I start cutting from the last string up till the delimiter?
EDIT: Expected output:
test1
test1a
example
You can easily cut after the last delimiter, using awk, as you can see here:
cat conf.conf.txt | awk -F "/" '{ print $NF}'
(For your information: NF in awk stands for "Number of Fields".)
However, as the second line ends with a slash, the second result will be empty, as you see here:
test1
example
Is that what you want?
path=./foo/bar/baz
basename "$path"
# or pure shell:
echo "${path##*/}"
Both return baz. The counterpart, dirname, returns ./foo/bar.
It's not entirely clear what you mean by a "list". Perhaps you have the paths in an array, or just a space separated string. In either case, you can use basename, but the way you will use it depends on the data. If you have a space separated string, you can just use:
$ cat a.sh
#!/bin/sh
list='./test/test1
./test/test1/test1a/
./test/test1/test1a/example'
basename -a $list
$ ./a.sh
test1
test1a
example
That form will fail if there are characters in IFS in any of the names. If you have the names in an array, it is slightly easier to deal with that issue:
#!/bin/sh
list=('./test/with space/test1'
./test/test1/test1a/
./test/test1/test1a/example)
basename -a "${list[#]}"
Clean one-liner solution :
<<<"${test2}" mawk -F/ '$!_=$-_=$(NF-=_==$NF)'
test1
test1a
example
Tested and confirmed working on mawk-1.3.4, mawk 1.996, macos nawk, and gawk 5.1.1,
including invocation flags of -c/-P/-t/-S
—- The 4Chan Teller
I am trying to fetch the output of a variable stored in a file in another shell script.
Example:
cat abc.log
var1=2
var2=2
var3=25
I am writing a script to fetch the value of var3.
Thank you in advance.
awk -F= '$1 ~ /^[[:space:]]*var3/ { print $2 }' abc.log
Set the field delimiter to = and then where the line contains "var3", print the second field.
Alternatively, you could:
source abc.log
and then:
echo $var3
Using sed you can isolate 25 with particularity with:
sed -n '/^[[:space:]]*var3=/s/^[^=]*=//p' file
Explanation
This is the general substitution form s/find/replace/ with a matching expression preceding it. The total form is /match/s/find/replace/. The option -n suppresses the normal printing of pattern-space and the p at the end tells sed to print the line where the match and substitution took place. Specifically,
/match/ locates a line with any number of preceding whitespace characters followed by var3=. The POSIX [:space:] character class matches any whitespace,
the /find/ is all characters anchored from the '^' beginning that are not the [^=] character and then match the literal '=' character, and finally
the /replace/ is the empty-string leaving the 25 alone which is printed.
Example Use/Output
$ sed -n '/^[[:space:]]*var3=/s/^[^=]*=//p' file
25
A grep one-liner, if your grep has support for Perl-compatible regular expressions (the -P option; not all greps support that)
grep -Po '^\s*var3=\K.*' abc.log
or,
grep -Po '^\s*var3=\K.*' abc.log | tail -n1
in order to get the last value of the var3, if multiple var3s is a possibility.
I want to know how to use two 'grep' and 'sed' utilities or something else in order to replace the substring. I will explain what I want to do below.
We have the file 'test.txt' with the following string:
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='AA5', A6='keyword_A'
After searching 'keyword_A' using grep, I want to replace the value of A5 with other string, for example, "NEW".
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
I tried to use two commands like
grep keyword_A test.txt | sed -e 's/blabla/blabla/'
After trying all I know, I gave up at all.
Please let me know the right solution.
First, you never need grep and sed. Sed has a full regular-expression search engine, so it is a superset of grep. This command will read test.txt, change the lines that you've indicated, and print the entire result on standard output:
sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/g" < test.txt
If you want to store the results back into the file test.txt, use the -i (in-place editing) switch to sed:
sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/g" -i.bak test.txt
If you want to select only the indicated lines, modify those, and print only those lines to standard out, use a combination of the p (print) command and the -n (no output) switch.
sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/gp" -n test.txt
Using grep+sed is always the wrong approach. Here's one way to do it with GNU awk:
$ awk '/keyword_A/{ $0=gensub(/(A5({[^}]+})?=\047)[^\047]+/,"\\1NEW",1) } 1' file
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
Using a couple variables you could define the keyword and replacement ( if they change at all ):
q="keyword_A"
r="NEW"
Then with sed:
sed -r "s/^(.+\{.+\}=')(.+)('.+"${q}".+)$/\1"${r}"\3/" file
Result:
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
A5="NEW"
A6="keyword_A"
# with sed
sed "s/='[^']*\(',[[:blank:]]*A6='${A6}'\)/='${A5}\1/" YourFile
# with awk
awk -F "'" -v A5="${A5}" -v A6="${A6}" '
BEGIN { OFS="\047" }
$12 == A6 { $10 = A5; $0 = $0 }
7
' YourFile
Change by the end of the string, for sed and using ' as field separator in awk instead of traditional space.
assuming there is no ' in value (or need to treat the escaping method) for awk version
We can just directly replace the fifth column when the sting keyword_A is found as shown below:
awk -F, 'BEGIN{OFS=",";}/keyword_A/{$5="A5{ATTR}='"'"NEW"'"'"}1' filename
Couple of slight alternatives:
sed -r "/keyword_A/s/(A5[^']*')[^']*/\1NEW/"
awk -F"'" '/keyword_A/{$10 = "NEW"}1' OFS="'"
Of course the negative with awk is afterwards you would have to rename the new file.
i am trying to replace the following string for ex:
from
['55',2,1,10,30,23],
to
['55',2,555,10,30,23],
OR
['55',2,1,10,30,23],
to
['55',2,1,10,9999,23],
i search around and find this :
$ echo "[55,2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="[55"){$2=10}{print}}'
[55,10,1,10,30,23],
but it's not working in my case since there is " ' " around the value of $1 in my if condition :
$ echo "['55',2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="['55'"){$2=10}{print}}'
['55',2,1,10,30,23],
The problem is not in the awk code, it's the shell expansion. You cannot have single quotes in a singly-quoted shell string. This is the same problem you run into when you try to put the input string into single quotes:
$ echo '['55',2,1,10,30,23],'
[55,2,1,10,30,23],
-- the single quotes are gone! And this makes sense, because they did their job of quoting the [ and the ,2,1,10,30,23], (the 55 is unquoted here), but it is not what we wanted.
A solution is to quote the sections between them individually and squeeze them in manually:
$ echo '['\''55'\'',2,1,10,30,23],'
['55',2,1,10,30,23],
Or, in this particular case, where nothing nefarious is between where the single quotes should be,
echo '['\'55\'',2,1,10,30,23],' # the 55 is now unquoted.
Applied to your awk code, that looks like this:
$ echo "['55',2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="['\'55\''"){$2=10}{print}}'
['55',10,1,10,30,23],
Alternatively, since this doesn't look very nice if you have many single quotes in your code, you can write the awk code into a file, say foo.awk, and use
echo "['55',2,1,10,30,23]," | awk -F, -f foo.awk
Then you don't have to worry about shell quoting mishaps in the awk code because the awk code is not subject to shell expansion anymore.
I think how to match and replace is not the problem for you. The problem you were facing is, how to match a single quote ' in field.
To avoid to escape each ' in your codes, and to make your codes more readable, you can assigen the quote to a variable, and use the variable in your codes, for example like this:
echo "['55' 1
['56' 1"|awk -v q="'" '$1=="["q"55"q{$2++}7'
['55' 2
['56' 1
In the above example, only in line with ['55', the 2nd field got incremented.
I have a large text file that contains multiple columns of data. I'm trying to write a script that accepts a column number and keyword from the command line and searches for any hits before displaying the entire row of any matches.
I've been trying something along the lines of:
grep $fileName | awk '{if ($'$columnNumber' == '$searchTerm') print $0;}'
But this doesn't work at all. Am I on the right lines? Thanks for any help!
The -v option can be used to pass shell variables to awk command.
The following may be what you're looking for:
awk -v s=$SEARCH -v c=$COLUMN '$c == s { print $0 }' file.txt
EDIT:
I am always trying to write more elegant and tighter code. So here's what Dennis means:
awk -v s="$search" -v c="$column" '$c == s { print $0 }' file.txt
Looks reasonable enough. Try using set -x to look at exactly what's being passed to awk. You can also use different and/or more awk things, including getting rid of the separate grep:
awk -v colnum=$columnNumber -v require="$searchTerm"
"/$fileName/ { if (\$colnum == require) print }"
which works by setting awk variables (colnum and require, in this case) and then using the literal string $colnum to get the desired field, and the variable require to get the required-string.
Note that in all cases (with or without the grep command), any regular expression meta-characters in $fileName will be meta-y, e.g., this.that will match the file named this.that but also the file named thisXthat.