Returning a column with grep - search

I'm attempting to search through a file and return a particular column based on whether a particular value is present in the column. For example, if I search for "Red" in the file:
One Two Three
Cat Dog Chicken
Blue Black Red
Blah Blah Blah
I want returned:
Three
Chicken
Red
Blah
I would even accept just knowing which column grep or any other search command found a match in, so I could use cut, but I can't even find that much.

This is one way:
Store all the data in the matrix a[line][column]. Save the column number in p. Finally print all the items a[line][p].
$ awk -v text=Blue '{for (i=1; i<=NF; i++) {a[NR,i]=$i; if ($i~text) {p=i}}} END{ for (i=1; i<=NR; i++) print a[i,p]}' a
One
Cat
Blue
Blah
$ awk -v text=Red '{for (i=1; i<=NF; i++) {a[NR,i]=$i; if ($i~text) {p=i}}} END{ for (i=1; i<=NR; i++) print a[i,p]}' a
Three
Chicken
Red
Blah
Update
To have exact matches, replace ~ with == (thanks konsolebox):
awk -v text=Blue '{for (i=1; i<=NF; i++) {a[NR,i]=$i; if ($i==text) {p=i}}} END{ for (i=1; i<=NR; i++) print a[i,p]}' a
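The difference between the two operators can be seen on a single line of the sample data. This is only a small sketch; the search text Bl is chosen for illustration and is not part of the original question.

```shell
line='Blue Black Red'

# ~ treats text as a regular expression, so "Bl" matches inside Blue and Black.
regex_hits=$(printf '%s\n' "$line" |
  awk -v text=Bl '{n=0; for (i=1; i<=NF; i++) if ($i ~ text) n++; print n}')

# == compares whole fields, so "Bl" matches no field at all.
exact_hits=$(printf '%s\n' "$line" |
  awk -v text=Bl '{n=0; for (i=1; i<=NF; i++) if ($i == text) n++; print n}')

echo "regex: $regex_hits, exact: $exact_hits"
```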

One possibility, depending on how you respond to the questions I posted in my comment:
awk -v tgt="Red" '
NR==FNR {for (i=1;i<=NF;i++) if ($i==tgt) cols[i]; next}
{sep=""; for (i=1;i<=NF;i++) if (i in cols) {printf "%s%s", sep, $i; sep=OFS}; print ""}
' file file
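Since the question mentions cut, here is a rough sketch of that route: find the number of the first matching column with awk, then hand it to cut. It assumes fields are separated by exactly one space (cut does not collapse repeated delimiters) and uses a temporary file holding the sample data.

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'One Two Three' 'Cat Dog Chicken' 'Blue Black Red' 'Blah Blah Blah' > "$tmp"

# Print the index of the first field that equals the target, then stop reading.
col=$(awk -v tgt="Red" '{for (i=1; i<=NF; i++) if ($i == tgt) {print i; exit}}' "$tmp")

# Extract that column with cut, but only if a match was found.
result=""
if [ -n "$col" ]; then
  result=$(cut -d' ' -f"$col" "$tmp")
  printf '%s\n' "$result"
fi
rm -f "$tmp"
```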

Related

Using awk, subtract with previous row in all columns and print the result

I need a one-line awk command for Linux that subtracts the previous row from each row, in every column, and prints the differences.
My input is:
2021-02-15_16 101242 102108 17572 84538
2021-02-15_17 101235 102077 17625 84445
Expected output
2021-02-15_17 -7 -31 53 -93
I tried this by myself but with no luck.
cat test |awk 'NR==1{s=$3;next}{s-=$3}END{print s}' --> this displays only for 1 column
cat test | awk 'NR==1 {for(i=3; i<=NF; i++){s=$i;next}{s-=$i}{print s}}'
You may use this awk:
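awk 'NR > 1 {out = $1; for (i=2; i<=5; ++i) out = out OFS ($i - a[i]); print out} {split($0,a)}' file
2021-02-15_17 -7 -31 53 -93
The differences are built in a separate string so that $0 stays unmodified and split($0,a) always stores the original values of the current row. Assigning to $i instead would rebuild $0 from the differences, which gives wrong results from the third row onward.
To make it more readable:
awk 'NR > 1 {
    out = $1
    for (i=2; i<=5; ++i)
        out = out OFS ($i - a[i])
    print out
}
{
    split($0,a)
}' file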
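With more than two rows, each row has to be compared against the original values of the row before it, not against previously printed differences. The following is a sketch under two assumptions: every row has the same number of columns (so NF replaces the hard-coded 5), and the third input row is hypothetical data added only to show the behaviour.

```shell
diffs=$(printf '%s\n' \
  '2021-02-15_16 101242 102108 17572 84538' \
  '2021-02-15_17 101235 102077 17625 84445' \
  '2021-02-15_18 101240 102080 17600 84450' |
  awk 'NR > 1 {
         # Build the output separately so that $0 is never modified.
         out = $1
         for (i = 2; i <= NF; ++i) out = out OFS ($i - prev[i])
         print out
       }
       { split($0, prev) }   # remember the raw values of this row
      ')
printf '%s\n' "$diffs"
```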

search for a string in a file which is delimited and then print the string until the next delimiter is reached in linux

I have below text in a file
1|2|SID1=/some/path|SID2=/some/path|4|5
1|2|SID1=/some/path|tel|path|SID2=/some/path|6|5|ord|til
1|2|SID1=/some/path|id1|id2|id3|SID2=/some/path|4|8|dea
In Linux, how do I search for SID1 and SID2 in each line and print only up to the next delimiter? The output should be:
SID1=/some/path SID2=/some/path
SID1=/some/path SID2=/some/path
SID1=/some/path SID2=/some/path
Perl to the rescue:
perl -lne 'print join " ", /SID[12]=[^|]*/g' file.txt
Explanation: Perl reads the file line by line (-n). All parts of the line containing SID followed by 1 or 2 followed by = followed by anything but | are printed with a space between them.
I feel like I'm missing a better solution, but this works.
One-liner:
awk -F'|' '{a=0; for (i=1; i<=NF; i++) {if ($i ~ /^SID[[:digit:]]*=/) { printf "%s%s", a?OFS:(NR>1)?ORS:"", $i; a++ }}} END {print ""}' file
Explained:
awk -F'|' '{
    # Reset our field tracking.
    a=0
    # Loop over all the fields in the line.
    for (i=1; i<=NF; i++) {
        # If the current field starts with SID<digits>= then
        if ($i ~ /^SID[[:digit:]]*=/) {
            # Print out the field with the appropriate separator.
            # When a is set we are inside a line and want to print out a
            # leading OFS. Otherwise, if this is not the first line, we want to
            # print out a leading ORS. Otherwise do nothing.
            printf "%s%s", a?OFS:(NR>1)?ORS:"", $i
            # Set our field tracking.
            a=1
        }
    }
}
END {
    # Print out the final newline.
    print ""
}' file
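A possibly simpler variant collects the matching fields of each line into a string and prints it once per line, so no END block or ORS bookkeeping is needed. It is only a sketch and assumes that every input line should produce an output line, even an empty one when no SID field is present.

```shell
sids=$(printf '%s\n' \
  '1|2|SID1=/some/path|SID2=/some/path|4|5' \
  '1|2|SID1=/some/path|tel|path|SID2=/some/path|6|5|ord|til' \
  '1|2|SID1=/some/path|id1|id2|id3|SID2=/some/path|4|8|dea' |
  awk -F'|' '{
    out = ""; sep = ""
    # Append every field that starts with SID<digits>= to the output string.
    for (i = 1; i <= NF; i++)
      if ($i ~ /^SID[0-9]+=/) { out = out sep $i; sep = OFS }
    print out
  }')
printf '%s\n' "$sids"
```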

Storing command output lines into array based on new line character

I have the variable below, and I run the following command to print its parts one per line.
a="My name is A. Her Name is B. His Name is C"
echo "$a" | awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}'
The output is
is A
is B
is C
When I store the results in an array, the shell splits on spaces, so every word becomes its own element. I want each line of the output stored in its own array element instead.
x=($(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a"))
This gives:
${x[0]} = is
${x[1]} = A
..and so on...
What I expect is:
${x[0]} = is A
${x[1]} = is B
${x[2]} = is C
Also, echo ${#x[@]} prints 6; it should print 3.
OK, try the following:
i=0
while IFS= read -r v; do
    x[i]="$v"
    (( i++ ))
done < <(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a")
You can also use the mapfile command (bash version 4 or higher):
tempX=$(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a")
mapfile -t x <<< "$tempX"
~$ echo "${x[0]}"
is A
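The temporary variable is optional: mapfile can read directly from a process substitution. A minimal sketch, assuming bash 4 or later (mapfile, process substitution, and here-strings are not available in plain sh):

```shell
a="My name is A. Her Name is B. His Name is C"

# Read each line printed by awk into its own array element.
mapfile -t x < <(awk -F '[nN]ame |\\.' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$a")

echo "${#x[@]}"            # number of elements
printf '%s\n' "${x[@]}"    # one element per line
```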

How can "squeeze-repeated" words?

How can I squeeze repeated words, similar to squeezing repeated characters with tr -s ''?
I would like to change for example:
hello.hello.hello.hello
to
hello
This can be a way:
$ cat a
hello hello bye but bye yeah
hello yeah
$ awk 'BEGIN{OFS=FS=" "}
{
    for (i=1; i<=NF; i++) {
        if (!($i in a)) {printf "%s%s",$i,OFS; a[$i]=$i}
    }
    delete a
    print ""
}' a
hello bye but yeah
hello yeah
You can change the field separator:
$ cat a
hello|hello|bye|but|bye|yeah
hello|yeah
$ awk 'BEGIN{OFS=FS="|"} {for (i=1; i<=NF; i++) {if (!($i in a)) {printf "%s%s",$i,OFS; a[$i]=$i}}; delete a; print ""}' a
hello|bye|but|yeah|
hello|yeah|
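The versions above leave a trailing separator on each line. The sketch below avoids that by building the line in a string first, and uses the dot-separated input from the question. Two points are assumptions rather than part of the answer: the second input line is made up for illustration, and split("", seen) is used in place of delete to clear the array in awks that lack whole-array delete.

```shell
squeezed=$(printf '%s\n' 'hello.hello.hello.hello' 'hello.bye.hello.yeah' |
  awk -F'[.]' -v OFS='.' '{
    out = ""; sep = ""
    split("", seen)                 # forget the words seen on the previous line
    for (i = 1; i <= NF; i++)
      if (!($i in seen)) {          # first time this word appears on the line
        seen[$i]
        out = out sep $i
        sep = OFS
      }
    print out
  }')
printf '%s\n' "$squeezed"
```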

Merging Multiple records into a Unique records with all the non-null values

Suppose I have 3 records :
P1||1234|
P1|56001||
P1|||NJ
I want to merge these 3 records into one with all the attributes. Final record :
P1|56001|1234|NJ
Is there any way to achieve this in Unix/Linux?
I assume you are asking for a solution using bash, awk, sed, etc.
You could try something like
$ cat test.txt
P1||1234|
P1|56001||
P1|||NJ
$ cat test.txt | awk -F'|' '{ for (i = 1; i <= NF; i++) print $i }' | egrep '.+' | sort | uniq | awk 'BEGIN{ c = "" } { printf "%s%s", c, $0; c = "|" } END{ printf "\n" }'
1234|56001|NJ|P1
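Briefly, the first awk splits each line on the '|' separator and prints every field on its own line. egrep removes the empty lines. After that, sort and uniq remove duplicate values. Finally, the second awk joins the lines back together with '|' as the separator. Note that the values come out in sorted order, not in their original column order.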
Update:
If I understand correctly, this is what you are looking for:
$ cat test.txt | awk -F'|' '{ for (i = 1; i <= NF; i++) if($i) col[i]=$i } END{ for (i = 1; i <= length(col); i++) printf "%s%s", col[i], (i == length(col) ? "\n" : "|")}'
P1|56001|1234|NJ
In your example, the first row has 1234 and the second row has 56001.
I don't see why 56001 comes before 1234 in your final result; I assume it is a typo.
An awk one-liner could do the job:
awk -F'|' '{for(i=2;i<=NF;i++)if($i)a[$1]=(a[$1]?a[$1]"|":"")$i}END{print $1"|"a[$1]}'
with your data:
kent$ echo "P1||1234|
P1|56001||
P1||NJ"|awk -F'|' '{for(i=2;i<=NF;i++)if($i)a[$1]=(a[$1]?a[$1]"|":"")$i}END{print $1"|"a[$1]}'
P1|1234|56001|NJ
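If the input can contain several keys and each value should stay in its own column, the merge can be done per key and per column position. This is a sketch under stated assumptions: the first field is the key, a later non-empty value in the same column overwrites an earlier one, and the P2 records are hypothetical data added only to show the grouping.

```shell
merged=$(printf '%s\n' 'P1||1234|' 'P1|56001||' 'P1|||NJ' 'P2|7||' 'P2|||CA' |
  awk -F'|' -v OFS='|' '
    {
      # Remember keys in order of first appearance.
      if (!($1 in seen)) { seen[$1]; order[++n] = $1 }
      if (NF > maxnf) maxnf = NF
      # Keep every non-empty value under its key and column number.
      for (i = 2; i <= NF; i++) if ($i != "") val[$1, i] = $i
    }
    END {
      for (k = 1; k <= n; k++) {
        line = order[k]
        for (i = 2; i <= maxnf; i++) line = line OFS val[order[k], i]
        print line
      }
    }')
printf '%s\n' "$merged"
```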

Resources