Add a line counter to lines matching a pattern - linux

I need to prepend a line counter to lines matching specific patterns in a file, while still outputting the lines that do not match this pattern.
For example, if my file looks like this:
aaa 123
bbb 456
aaa 666
ccc 777
bbb 999
and the patterns I want to count are 'aaa' and 'ccc', I'd like to get the following output:
1:aaa 123
bbb 456
2:aaa 666
3:ccc 777
bbb 999
Preferably I'm looking for a Linux one-liner. The shell or tool doesn't matter as long as it's installed by default in most distros.

With awk:
awk '{if ($1=="aaa" || $1=="ccc") {a++; $0=a":"$0}} {print}' file
1:aaa 123
bbb 456
2:aaa 666
3:ccc 777
bbb 999
Explanation
Loop through the lines, checking whether the first field is aaa or ccc. If so, increment the counter a and prepend it, followed by a colon, to the line ($0). Finally, print the line in all cases: a matched line now starts with the counter, otherwise the original line is printed unchanged.
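If the list of keys changes often, the same idea can take the keys from a variable instead of hard-coding them. This is only a sketch: the variable name keys, the arrays k and want, and the counter name count are made up for illustration, and the sample data is the one from the question written to a temporary file.

```shell
# Assumption: the sample input from the question, stored in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'aaa 123' 'bbb 456' 'aaa 666' 'ccc 777' 'bbb 999' > "$tmp"

# keys is a hypothetical space-separated list of first-field values to count.
numbered=$(awk -v keys='aaa ccc' '
BEGIN { n = split(keys, k, " "); for (i = 1; i <= n; i++) want[k[i]] }
($1 in want) { $0 = (++count) ":" $0 }   # prepend the running counter
1                                        # print every line, changed or not
' "$tmp")

printf '%s\n' "$numbered"
rm -f "$tmp"
```

Since the match is an exact comparison on the first field, a line starting with aaab would not be counted.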

The following approach uses Perl:
open(FH, "<", "abc.txt") or die "Cannot open abc.txt: $!";
my $incremental_val = 1;
while(my $line = <FH>){
chomp($line);
if($line =~ m/^aaa / || $line =~ m/^ccc /){
print "$incremental_val : $line\n";
$incremental_val++;
next;
}
print "$line\n";
}
close FH;
The output will be as follows.
1 : aaa 123
bbb 456
2 : aaa 666
3 : ccc 777
bbb 999

Related

How to compare a file w.r.t a keyword file and replace mismatch strings of column 3 with correct string in column 3 of keyword file

I want to look up every line of search_file in keyword_file and write an output_file in which an incorrect third-column string is replaced with the correct string taken from keyword_file. It should also warn the user about entries in search_file that are missing the third column and do not exist in keyword_file (for example, "ggg coms").
Note that keyword_file may contain a different number of lines than search_file. For example:
search_file
aaa coms 123
bbb coms 234
ccc
ddd coms 456
eez coms 789
fkk coms 987
ggg coms
hhh coms 989
....
keyword_file
aaa coms 789
bbb coms 234
ccc coms 878
ddd coms 456
ttt coms 654
eee coms 789
Output
aaa coms 789
bbb coms 234
ccc coms 878
ddd coms 456
eez coms 789
fkk coms 987
hhh coms 989
....
I tried the following awk command, but it was not able to retain column #1 entries of search_file in the Output.
awk 'FNR==NR{a[$1]=$0} FNR!=NR&&a[$1]{print $1,$2,$3}' search_file keyword_file
Thank you in advance for your help :)
Could you please try the following? It was written and tested against the shown samples only.
awk '
{
key=$1
}
FNR==NR{
a[key]=$3
next
}
(key in a){
$0=key OFS $2 OFS a[key]
}
1
' keyword_file search_file
Explanation: the same code with detailed comments.
awk ' ##Starting awk program from here.
{
key=$1 ##Run this command on each line of Input_file and create variable key with value of 1st field.
}
FNR==NR{ ##Checking condition if FNR==NR which will be TRUE when keyword_file is being read.
a[key]=$3 ##Creating array a with index key and value of 3rd field here.
next ##next will skip all further statements from here.
}
(key in a){ ##Checking condition if key is present in array a then do following.
$0=key OFS $2 OFS a[key] ##Setting value of key OFS 2nd field OFS array a value with index key here.
}
1 ##1 will print edited/non-edited values for all lines.
' keyword_file search_file ##Mentioning Input_file names here.
Why OP's code didn't work: you were close, but you only printed lines whose first field is common to both input files. What I did instead: when the first field is present in both input files, re-create the line with the new last value, and then the trailing 1 prints every line, edited or not.
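One thing to watch with the command above: it rebuilds the line from $2 of search_file, so a short line such as ccc comes out with an empty second field, and ggg coms is passed through without a warning. A possible variation, sketched under the assumption that the whole keyword_file line should win whenever the key exists, and that a warning on stderr is acceptable. The array name full and the warning text are invented for this example.

```shell
# Assumption: the sample files from the question, created in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'aaa coms 123' 'bbb coms 234' 'ccc' 'ddd coms 456' \
    'eez coms 789' 'fkk coms 987' 'ggg coms' 'hhh coms 989' > "$dir/search_file"
printf '%s\n' 'aaa coms 789' 'bbb coms 234' 'ccc coms 878' 'ddd coms 456' \
    'ttt coms 654' 'eee coms 789' > "$dir/keyword_file"

result=$(awk '
FNR == NR  { full[$1] = $0; next }      # keyword_file: remember the whole line per key
$1 in full { print full[$1]; next }     # known key: emit the keyword_file version
NF < 3     {                            # unknown key and incomplete line: warn and skip
    print "warning: no keyword entry for \"" $0 "\"" | "cat 1>&2"
    next
}
{ print }                               # unknown key but complete line: keep as is
' "$dir/keyword_file" "$dir/search_file" 2> "$dir/warnings")

warnings=$(cat "$dir/warnings")
printf '%s\n' "$result"
rm -rf "$dir"
```

On the sample data this prints the seven lines of the expected Output, and the warning for ggg coms goes to stderr (captured in warnings here only so it can be inspected).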

shell duplicate spaces in file

Is it possible to remove multiple spaces from a text file and save the changes in the same file using awk or grep?
Input example:
aaa    bbb   ccc
ddd     yyyy
Output I want:
aaa bbb ccc
ddd yyyy
Simply assign $1 to itself. This forces awk to rebuild the line using OFS (a single space by default), which collapses every run of whitespace between fields and drops leading and trailing whitespace.
awk '{$1=$1} 1' Input_file
EDIT: Since OP asked how to keep the leading spaces while squeezing the rest, try the following.
awk '
match($0,/^ +/){
spaces=substr($0,RSTART,RLENGTH)
}
{
$1=$1
$1=spaces $1
spaces=""
}
1
' Input_file
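Neither awk command above changes Input_file itself; both print to standard output. To save the result in the same file, a common pattern is to write to a temporary file and move it over the original. A minimal sketch, assuming a two-line sample with repeated spaces (the spacing is invented for the demo):

```shell
# Assumption: a temporary file stands in for Input_file.
input=$(mktemp)
printf '%s\n' 'aaa    bbb   ccc' 'ddd     yyyy' > "$input"

# Write the squeezed text to a second temporary file, then replace the original
# only if awk succeeded. Redirecting straight to "$input" would truncate it
# before awk had read anything.
tmp=$(mktemp)
awk '{ $1 = $1 } 1' "$input" > "$tmp" && mv "$tmp" "$input"

squeezed=$(cat "$input")
printf '%s\n' "$squeezed"
rm -f "$input"
```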
Using sed
sed -i -E 's#[[:space:]]+# #g' input_file
For removing spaces at the start
sed -i -E 's#[[:space:]]+# #g; s#^ ##' input_file
Demo:
$ cat test.txt
aaa    bbb   ccc
ddd     yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$ sed -i -E 's#[[:space:]]+# #g' test.txt
$ cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy

Linux shell command to copy text data from a file to another

file_1 contents:
aaa 111 222 333
bbb 444 555 666
ccc 777 888 999
file_2 contents:
ddd
eee
fff
How do I copy only part of the text from file_1 to file_2, so that file_2 becomes:
ddd 111 222 333
eee 444 555 666
fff 777 888 999
Try with awk:
awk 'NR==FNR{a[FNR]=$2FS$3FS$4;next} {print $0, a[FNR]}' file_1 file_2
Explanation:
NR is the number of the current input line counted across all files; FNR is the line number within the current file. You can see that with
$ awk '{print NR,FNR}' file_1 file_2
1 1
2 2
3 3
4 1
5 2
6 3
So, the condition NR==FNR is only true when reading the first file, and that's when the columns $2, $3, and $4 get saved in a[FNR]. After reading file_1, the condition NR==FNR becomes false and the block {print $0, a[FNR]} is executed, where $0 is the whole line in file_2.
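The command above hard-codes three columns and only prints the result. A sketch of the same NR==FNR idiom that keeps everything after the first field, however many columns there are, and then writes the result back to file_2. The array name rest and the temporary directory are assumptions made for the demo.

```shell
# Assumption: the sample files from the question, created in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'aaa 111 222 333' 'bbb 444 555 666' 'ccc 777 888 999' > "$dir/file_1"
printf '%s\n' ddd eee fff > "$dir/file_2"

# First file (NR == FNR): strip the first field and store the remainder by line number.
# Second file: print its line followed by the stored remainder.
merged=$(awk '
NR == FNR { sub(/^[^ ]+ /, ""); rest[FNR] = $0; next }
{ print $0, rest[FNR] }
' "$dir/file_1" "$dir/file_2")

# awk has finished reading by now, so overwriting file_2 is safe.
printf '%s\n' "$merged" > "$dir/file_2"
file_2_now=$(cat "$dir/file_2")
printf '%s\n' "$file_2_now"
rm -rf "$dir"
```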

awk how to print the rest

My file contains lines like this:
any1 aaa bbb ccc
The delimiter is a space, and the number of words in a line is unknown.
I want to put the first word into var1. That is simple with
awk '{print $1}'
Now I want to put the rest of the line into var2 with awk.
How can I print the rest of the line with awk?
Better to use read here:
s="any1 aaa bbb ccc"
read var1 var2 <<< "$s"
echo "$var1"
any1
echo "$var2"
aaa bbb ccc
For an awk-only solution use:
echo "$s" | awk '{print $1; print substr($0, index($0, " ")+1)}'
any1
aaa bbb ccc
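The here-string (<<<) used above is a bash feature. If the lines come from a file, as in the question, a plain while read loop does the same split for every line and also works in a POSIX sh. A small sketch. The second sample line (any2 ddd) and the first=/rest= output format are invented for illustration.

```shell
# Assumption: a temporary file stands in for the file from the question.
tmp=$(mktemp)
printf '%s\n' 'any1 aaa bbb ccc' 'any2 ddd' > "$tmp"

# read puts the first word into var1 and everything that is left into var2.
out=$(while read -r var1 var2; do
    printf 'first=%s rest=%s\n' "$var1" "$var2"
done < "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```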
$ var=$(awk '{sub(/^[^[:space:]]+[[:space:]]+/,"")}1' file)
$ echo "$var"
aaa bbb ccc
or in general to skip some number of fields use a RE interval:
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){1}/,"")}1' file
aaa bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){2}/,"")}1' file
bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){3}/,"")}1' file
ccc
Note that doing this gets much more complicated if you have a FS that's more than a single char, and the above is just for the default FS since it additionally skips any leading blanks if present (remove the first [[:space:]]* if you have a non-default but still single-char FS).
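awk solution (note that the output keeps a leading space, because the emptied first field is still followed by OFS):
awk '{$1 = ""; print $0;}'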

merge specific line using awk and sed

I want to merge specific lines.
Input :
AAA
BBB
CCC
DDD
EEE
AAA
BBB
DDD
CCC
EEE
Output Should be
AAA
BBB
CCC DDD
EEE
AAA
BBB
DDD
CCC EEE
I want to search for CCC and merge the next line with it.
I have tried with awk but without success.
Use awk patterns: if the line matches /CCC/, print the line with a space at the end instead of a newline and go on to the next line. Otherwise (1), print the line.
awk '/CCC/ { printf("%s ", $0); next } 1' file
Using sed:
sed '/CCC/ { N; s/\n/ / }' file
Using awk:
awk '{ ORS=(/CCC/ ? FS : RS) }1' file
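To check the three commands against each other, they can be run on the sample input from the question. This is only a quick comparison sketch. The variable names are made up, and the sed form with a bare closing brace assumes GNU sed, as found on most Linux distros.

```shell
# Assumption: the sample input from the question, stored in a temporary file.
tmp=$(mktemp)
printf '%s\n' AAA BBB CCC DDD EEE AAA BBB DDD CCC EEE > "$tmp"

via_printf=$(awk '/CCC/ { printf("%s ", $0); next } 1' "$tmp")
via_sed=$(sed '/CCC/ { N; s/\n/ / }' "$tmp")
via_ors=$(awk '{ ORS = (/CCC/ ? FS : RS) } 1' "$tmp")

printf '%s\n' "$via_printf"
rm -f "$tmp"
```

All three should give the same text on this input. They can differ on edge cases, for example when CCC is the very last line of the file.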
