Returning a column with grep

I'm attempting to search through a file and return a particular column based on whether a particular value is present in the column. For example, if I search for "Red" in the file:
One Two Three
Cat Dog Chicken
Blue Black Red
Blah Blah Blah
I want returned:
Three
Chicken
Red
Blah
I would even accept just knowing which column grep or any other search command found a match in, so I could use cut, but I can't even find that much.

This is one way:
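Store every field in the array a, keyed by line and column number (a[NR,i]), and save the number of the matching column in p. In the END block, print a[line,p] for every line. Note that ~ is a regular-expression match, and if several columns match, the last one found wins.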
$ awk -v text=Blue '{for (i=1; i<=NF; i++) {a[NR,i]=$i; if ($i~text) {p=i}}} END{ for (i=1; i<=NR; i++) print a[i,p]}' a
One
Cat
Blue
Blah
$ awk -v text=Red '{for (i=1; i<=NF; i++) {a[NR,i]=$i; if ($i~text) {p=i}}} END{ for (i=1; i<=NR; i++) print a[i,p]}' a
Three
Chicken
Red
Blah
Update
For exact matches, replace ~ with == (thanks konsolebox):
awk -v text=Blue '{for (i=1; i<=NF; i++) {a[NR,i]=$i; if ($i==text) {p=i}}} END{ for (i=1; i<=NR; i++) print a[i,p]}' a
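The difference between the two operators can be checked on a single line. This is a minimal sketch, assuming the search text Bl (which is not part of the original example): ~ treats the text as a regular expression and matches any field that contains it, while == only matches a field that is exactly equal to it.

```shell
# One line from the sample data; "Bl" is an assumed search text for illustration.
line='Blue Black Red'

# Regex match: prints the number of every field that contains "Bl".
regex_cols=$(printf '%s\n' "$line" |
    awk -v text=Bl '{for (i=1; i<=NF; i++) if ($i ~ text) print i}')

# Exact match: prints the number of every field that is exactly "Bl".
exact_cols=$(printf '%s\n' "$line" |
    awk -v text=Bl '{for (i=1; i<=NF; i++) if ($i == text) print i}')

printf 'regex match columns: %s\n' "$(printf '%s' "$regex_cols" | tr '\n' ' ')"
printf 'exact match columns: %s\n' "$exact_cols"
```

With the regex form, a short search text can select a column you did not intend, so == is the safer choice when the whole field is known.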

One possibility, depending on how you respond to the questions I posted in my comment:
awk -v tgt="Red" '
NR==FNR {for (i=1;i<=NF;i++) if ($i==tgt) cols[i]; next}
{sep=""; for (i=1;i<=NF;i++) if (i in cols) {printf "%s%s", sep, $i; sep=OFS}; print ""}
' file file
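The question also mentions that knowing just the column number would be enough, so that cut could do the rest. A minimal sketch of that idea, assuming space-separated fields as in the sample and that only the first matching column is wanted (the temporary file and the variable names col and result are made up for the illustration):

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'One Two Three' 'Cat Dog Chicken' 'Blue Black Red' 'Blah Blah Blah' > "$tmp"

# Print the number of the first field that equals the target, then stop reading.
col=$(awk -v tgt="Red" '{for (i=1; i<=NF; i++) if ($i == tgt) {print i; exit}}' "$tmp")

# Only call cut when a column was actually found.
if [ -n "$col" ]; then
    result=$(cut -d' ' -f"$col" "$tmp")
    printf '%s\n' "$result"
fi
rm -f "$tmp"
```

This reads the file twice, like the two-pass awk above, but avoids holding the whole file in memory.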

Related

Using awk, subtract with previous row in all columns and print the result

I need a one-liner for Linux using awk that subtracts the previous row from each row, in all columns, and prints the differences.
I have input as
2021-02-15_16 101242 102108 17572 84538
2021-02-15_17 101235 102077 17625 84445
Expected output
2021-02-15_17 -7 -31 53 -93
I tried this by myself but with no luck.
cat test |awk 'NR==1{s=$3;next}{s-=$3}END{print s}' --> this displays only for 1 column
cat test | awk 'NR==1 {for(i=3; i<=NF; i++){s=$i;next}{s-=$i}{print s}}'

You may use this awk:
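awk '{line = $0} NR > 1 {for (i=2; i<=NF; ++i) $i -= a[i]; print} {split(line, a)}' file
2021-02-15_17 -7 -31 53 -93
The unmodified line is saved first because assigning to $i rebuilds $0; splitting the modified record would store the differences rather than the original values, which matters once the file has more than two rows. To make it more readable:
awk '{
    line = $0
}
NR > 1 {
    for (i=2; i<=NF; ++i)
        $i -= a[i]
    print
}
{
    split(line, a)
}' file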
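To check the behaviour on more than two rows, here is a minimal sketch. The third input row is invented for the test, and the array name prev is my own. It keeps a copy of the unmodified line, since assigning to $i rewrites $0 and the next row must be compared against the original values, not against the differences that were just printed.

```shell
# Two rows from the question plus one hypothetical third row.
tmp=$(mktemp)
printf '%s\n' \
    '2021-02-15_16 101242 102108 17572 84538' \
    '2021-02-15_17 101235 102077 17625 84445' \
    '2021-02-15_18 101240 102080 17600 84450' > "$tmp"

diffs=$(awk '
    { line = $0 }                 # copy of the record before any field is changed
    NR > 1 {
        for (i = 2; i <= NF; ++i) # every column after the timestamp
            $i -= prev[i]
        print
    }
    { split(line, prev) }         # original values become the previous row
' "$tmp")

printf '%s\n' "$diffs"
rm -f "$tmp"
```

Using NF instead of a fixed column count lets the same command handle files with more or fewer numeric columns.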

search for a string in a file which is delimited and then print the string until the next delimiter is reached in linux

I have below text in a file
1|2|SID1=/some/path|SID2=/some/path|4|5
1|2|SID1=/some/path|tel|path|SID2=/some/path|6|5|ord|til
1|2|SID1=/some/path|id1|id2|id3|SID2=/some/path|4|8|dea
In Linux, how do I search for SID1 and SID2 in each line and print each match only up to the next delimiter? The output should be:
SID1=/some/path SID2=/some/path
SID1=/some/path SID2=/some/path
SID1=/some/path SID2=/some/path

Perl to the rescue:
perl -lne 'print join " ", /SID[12]=[^|]*/g' file.txt
Explanation: Perl reads the file line by line (-n). All parts of the line containing SID followed by 1 or 2 followed by = followed by anything but | are printed with a space between them.

I feel like I'm missing a better solution but this works
Oneline:
awk -F'|' '{a=0; for (i=1; i<=NF; i++) {if ($i ~ /^SID[[:digit:]]*=/) { printf "%s%s", a?OFS:(NR>1)?ORS:"", $i; a++ }}} END {print ""}' file
Explained:
awk -F'|' '{
    # Reset our field tracking.
    a=0
    # Loop over all the fields in the line.
    for (i=1; i<=NF; i++) {
        # If the current field starts with "SID#=" then
        if ($i ~ /^SID[[:digit:]]*=/) {
            # Print out the field with the appropriate separator.
            # When "a" is set we are in a line and want to print out a
            # leading OFS. Otherwise if this is not the first line we want to
            # print out a leading ORS. Otherwise do nothing.
            printf "%s%s", a?OFS:(NR>1)?ORS:"", $i
            # Set our field tracking.
            a=1
        }
    }
}
END {
    # Print out the final newline.
    print ""
}' file
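A somewhat simpler variant is to build each output line in a variable and print it once per input line, which removes the separator bookkeeping and the END block. This is only a sketch under the assumption that a line without any SID field should produce no output; the variable name out is my own.

```shell
# Sample data from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1|2|SID1=/some/path|SID2=/some/path|4|5
1|2|SID1=/some/path|tel|path|SID2=/some/path|6|5|ord|til
1|2|SID1=/some/path|id1|id2|id3|SID2=/some/path|4|8|dea
EOF

sids=$(awk -F'|' '{
    out = ""
    for (i = 1; i <= NF; i++)
        if ($i ~ /^SID[0-9]+=/)                 # field starts with SID, digits, =
            out = out (out == "" ? "" : OFS) $i # join matches with OFS (a space)
    if (out != "") print out                    # skip lines with no match
}' "$tmp")

printf '%s\n' "$sids"
rm -f "$tmp"
```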

Storing command output lines into array based on new line character

I have the variable below, and I run the following command to print the output line by line:
a="My name is A. Her Name is B. His Name is C"
echo "$a" | awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}'
The output is
is A
is B
is C
When I store the results in an array, whitespace is treated as the element separator, so every word becomes its own element. I want each line of the output stored in its own array element. Storing it like this:
x=($(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a"))
gives:
${x[0]} = is
${x[1]} = A
...and so on.
What I expect is:
${x[0]} = is A
${x[1]} = is B
${x[2]} = is C
Also, echo ${#x[@]} prints 6; it should be 3.

Try this. IFS= and -r keep read from trimming whitespace or interpreting backslashes:
x=()
while IFS= read -r v; do
x+=("$v")
done < <(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a")

You can also use the mapfile command (bash version 4 or higher):
tempX=$(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a")
mapfile -t x <<< "$tempX"
~$ echo "${x[0]}"
is A
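The temporary variable is not strictly needed. A minimal sketch, assuming bash 4 or later, that feeds mapfile directly from a process substitution and checks the element count asked about in the question:

```shell
#!/usr/bin/env bash
# Requires bash 4+ for mapfile; process substitution is also a bash feature.
a="My name is A. Her Name is B. His Name is C"

# -t strips the trailing newline from each line before storing it.
mapfile -t x < <(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a")

echo "${#x[@]}"           # number of elements
printf '%s\n' "${x[@]}"   # one element per line
```

Reading from the process substitution rather than a pipe matters here: in a pipeline, mapfile would run in a subshell and the array would be lost when it exits.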

How can "squeeze-repeated" words?

Is there a way to squeeze repeated words, similar to squeezing repeated characters with tr -s?
For example, I would like to change:
hello.hello.hello.hello
to
hello

This can be a way:
$ cat a
hello hello bye but bye yeah
hello yeah
$ awk 'BEGIN{OFS=FS=" "}
{ for (i=1; i<=NF; i++) {
if (!($i in a)) {printf "%s%s",$i,OFS; a[$i]=$i}
};
delete a;
print ""
}' a
hello bye but yeah
hello yeah
You can change the field separator:
$ cat a
hello|hello|bye|but|bye|yeah
hello|yeah
$ awk 'BEGIN{OFS=FS="|"} {for (i=1; i<=NF; i++) {if (!($i in a)) {printf "%s%s",$i,OFS; a[$i]=$i}}; delete a; print ""}' a
hello|bye|but|yeah|
hello|yeah|
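The commands above drop every repeated word on a line and leave a trailing separator. If the goal is closer to tr -s, that is, collapsing only adjacent repeats, a sketch along these lines may fit better. It assumes "." as the separator, as in the question, and the second input line is an invented test case.

```shell
squeezed=$(printf '%s\n' 'hello.hello.hello.hello' 'a.a.b.a' |
    awk -F'[.]' -v OFS='.' '{
        out = $1
        for (i = 2; i <= NF; i++)
            # Appending "" forces a string comparison with the previous field.
            if (($i "") != ($(i-1) ""))
                out = out OFS $i
        print out          # no trailing separator
    }')

printf '%s\n' "$squeezed"
```

The first line collapses to hello, while a.a.b.a becomes a.b.a because the last a is not adjacent to the earlier ones.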

Merging Multiple records into a Unique records with all the non-null values

Suppose I have 3 records :
P1||1234|
P1|56001||
P1|||NJ
I want to merge these 3 records into one with all the attributes. Final record :
P1|56001|1234|NJ
Is there any way to achieve this in Unix/Linux?

I assume you are asking for a solution using bash, awk, sed and so on.
You could try something like this:
$ cat test.txt
P1||1234|
P1|56001||
P1|||NJ
$ awk -F'|' '{ for (i = 1; i <= NF; i++) print $i }' test.txt | grep -E '.+' | sort | uniq | awk '{ printf "%s%s", c, $0; c = "|" } END { printf "\n" }'
1234|56001|NJ|P1
Briefly, the first awk splits each line on '|' and prints every field on its own line. grep -E removes the empty lines, then sort and uniq remove duplicate attributes. Finally, the second awk joins the lines with '|' as the separator.
Update:
If I understand correctly, this is what you are looking for:
$ awk -F'|' '{ for (i = 1; i <= NF; i++) if ($i != "") col[i] = $i; if (NF > n) n = NF } END { for (i = 1; i <= n; i++) printf "%s%s", col[i], (i == n ? "\n" : "|") }' test.txt
P1|56001|1234|NJ

In your example, the first row has 1234 and the second row has 56001.
I don't see why 56001 comes before 1234 in your final result; I assume it is a typo.
An awk one-liner can do the job:
awk -F'|' '{for(i=2;i<=NF;i++)if($i)a[$1]=(a[$1]?a[$1]"|":"")$i}END{print $1"|"a[$1]}'
with your data:
kent$ echo "P1||1234|
P1|56001||
P1||NJ"|awk -F'|' '{for(i=2;i<=NF;i++)if($i)a[$1]=(a[$1]?a[$1]"|":"")$i}END{print $1"|"a[$1]}'
P1|1234|56001|NJ
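Both answers above assume a single key. A minimal sketch for several keys, keeping each value in its original column position: the P2 records are invented for the test, and the array names seen, order and val are my own.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
P1||1234|
P1|56001||
P2|7||
P1|||NJ
P2|||NY
EOF

merged=$(awk -F'|' -v OFS='|' '
{
    if (!($1 in seen)) { seen[$1]; order[++n] = $1 }  # remember first-seen key order
    if (NF > maxnf) maxnf = NF                        # widest record seen so far
    for (i = 2; i <= NF; i++)
        if ($i != "") val[$1, i] = $i                 # last non-empty value per column
}
END {
    for (k = 1; k <= n; k++) {
        line = order[k]
        for (i = 2; i <= maxnf; i++)
            line = line OFS val[order[k], i]          # empty when the column was never set
        print line
    }
}' "$tmp")

printf '%s\n' "$merged"
rm -f "$tmp"
```

Testing with $i != "" rather than if ($i) keeps a field whose value is 0, which a plain truth test would discard.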
