merge specific line using awk and sed


I want to merge specific lines.
Input:
AAA
BBB
CCC
DDD
EEE
AAA
BBB
DDD
CCC
EEE
Output should be:
AAA
BBB
CCC DDD
EEE
AAA
BBB
DDD
CCC EEE
I want to search for CCC and merge the next line with it.
I have tried with awk but did not succeed.

Use awk patterns: if the line matches /CCC/, print it followed by a space instead of a newline and skip to the next line. Otherwise, the trailing 1 (an always-true pattern) prints the line unchanged.
awk '/CCC/ { printf("%s ", $0); next } 1' file

Using sed:
sed '/CCC/ { N; s/\n/ / }' file
Using awk:
awk '{ ORS=(/CCC/ ? FS : RS) }1' file
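As a quick check, the three commands above can be run against the sample input from the question. This is only a minimal sketch, assuming a POSIX shell with mktemp available; the variable names (input, out_printf, out_sed, out_ors) are made up for illustration, and the sed command is written in the more portable {N;s/\n/ /;} form.

```shell
# Hypothetical demo: build the sample input in a temporary file.
input=$(mktemp)
printf '%s\n' AAA BBB CCC DDD EEE AAA BBB DDD CCC EEE > "$input"

# 1) awk: print a CCC line followed by a space, then skip to the next line.
out_printf=$(awk '/CCC/ { printf("%s ", $0); next } 1' "$input")

# 2) sed: on a CCC line, append the next line (N) and replace the newline with a space.
out_sed=$(sed '/CCC/{N;s/\n/ /;}' "$input")

# 3) awk: choose the output record separator per line (space after CCC, newline otherwise).
out_ors=$(awk '{ ORS = (/CCC/ ? FS : RS) } 1' "$input")

printf '%s\n' "$out_sed"
rm -f "$input"
```

All three should print the merged output shown in the question. One caveat: if CCC happens to be the last line of the file, the awk variants end the output with a space and no final newline.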

Related

shell duplicate spaces in file

Is it possible to remove multiple spaces from a text file and save the changes in the same file using awk or grep?
Input example:
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy

Simply reassign $1 to itself. This forces awk to rebuild the record using OFS (a single space by default), which collapses each run of spaces into one.
awk '{$1=$1} 1' Input_file
EDIT: Since the OP asked how to keep only the leading spaces, try the following.
awk '
match($0, /^ +/) {
  spaces = substr($0, RSTART, RLENGTH)   # remember the leading spaces
}
{
  $1 = $1            # squeeze runs of spaces
  $1 = spaces $1     # put the leading spaces back
  spaces = ""
}
1
' Input_file
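The question also asks to save the result in the same file, which the awk commands above do not do on their own. A minimal sketch, assuming your awk has no in-place option, is to write to a temporary file and move it over the original; data and tmp are hypothetical names, and the sample content is invented to contain runs of spaces.

```shell
# Hypothetical sample file with runs of spaces between words.
data=$(mktemp)
printf 'aaa   bbb  ccc\nddd      yyyy\n' > "$data"

# Reassigning $1 makes awk rebuild each record with a single-space OFS.
tmp=$(mktemp)
awk '{ $1 = $1 } 1' "$data" > "$tmp" && mv "$tmp" "$data"

squeezed=$(cat "$data")
printf '%s\n' "$squeezed"
rm -f "$data"
```

Because mv replaces the original file, its permissions and ownership may change; copying the content back with cat "$tmp" > "$data" is one way to avoid that.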

Using sed (with -i the file must be passed as an argument, not redirected on stdin):
sed -i -E 's#[[:space:]]+# #g' input_file
To also remove the space left at the start of each line:
sed -i -E 's#[[:space:]]+# #g; s#^ ##' input_file
Demo:
$ cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$ sed -i -E 's#[[:space:]]+# #g' test.txt
$ cat test.txt
aaa bbb ccc
ddd yyyy
Output I want:
aaa bbb ccc
ddd yyyy
$

How to compare two columns in same file and store the difference in new file with the unchanged column according to it?

Row Actual Expected
1 AAA BBB
2 CCC CCC
3 DDD EEE
4 FFF GGG
5 HHH HHH
I want to compare the Actual and Expected columns and store the rows that differ in a file, like this:
Row Actual Expected
1 AAA BBB
3 DDD EEE
4 FFF GGG
I have used awk -F, '{if ($2!=$3) {print $1,$2,$3}}' Sample.csv, but it seems to compare only integer values, not string values.

You can use awk to do this:
awk '{if($2!=$3) print $0}' oldfile > newfile
where
$2 and $3 are the second and third columns
!= is true when the second and third columns do not match
$0 is the whole line
> newfile redirects the output to a new file
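One likely reason the attempt with -F, printed nothing useful: if the rows contain no commas, a comma separator leaves $2 and $3 empty, so they always compare equal. The sketch below assumes the columns are separated by whitespace, as displayed in the question; the names sample and diff_rows are hypothetical. It also keeps the header row explicitly.

```shell
# Hypothetical copy of the sample data, whitespace-separated.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Row Actual Expected
1 AAA BBB
2 CCC CCC
3 DDD EEE
4 FFF GGG
5 HHH HHH
EOF

# Keep the header (NR == 1) and every row whose 2nd and 3rd fields differ.
diff_rows=$(awk 'NR == 1 || $2 != $3' "$sample")
printf '%s\n' "$diff_rows"
rm -f "$sample"
```

When both fields look like numbers, awk compares them numerically (so 1 and 1.0 are equal); appending an empty string, as in $2"" != $3"", forces a string comparison.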

I prefer an awk solution (it can handle more fields and is easier to understand), but for a tab-delimited file you could use
sed -r '/\t([^ ]*)\t\1$/d' Sample.csv

Assuming the file uses tab or some other delimiter to separate the columns, then tsv-filter from eBay's TSV Utilities supports this type of field comparison directly. For the file above:
$ tsv-filter --header --ff-str-ne 2:3 file.tsv
Row Actual Expected
1 AAA BBB
3 DDD EEE
4 FFF GGG
The --ff-str-ne option compares two fields in a row for non-equal strings.
Disclaimer: I'm the author.

get paragraph with awk, and start-of-line regexp

I use awk to get paragraphs from a textfile, like so:
awk -v RS='' -v ORS='\n\n' '/pattern/' ./textfile
Say I have the following textfile:
aaa bbb ccc
aaa bbb ccc
aaa bbb ccc

aaa ccc
bbb aaa ccc
bbb aaa ccc

ccc bbb aaa
ccc bbb aaa
ccc bbb aaa
Now I only want the paragraph in which one of the (original) lines starts with "bbb" (hence the second paragraph). However, the regexp anchor ^ no longer works, I presume because of RS='': awk now only matches it at the beginning of the paragraph.
Is there another way?

^ means start-of-string. You want start-of-line which is (^|\n), e.g.:
$ awk -v RS='' -v ORS='\n\n' '/(^|\n)bbb/' file
aaa ccc
bbb aaa ccc
bbb aaa ccc
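To try this end to end, here is a minimal sketch that recreates the three paragraphs in a temporary file and applies the (^|\n) pattern. It assumes the paragraphs are separated by blank lines, as paragraph mode (RS='') requires; paras and hit are hypothetical names.

```shell
# Hypothetical copy of the text file: three paragraphs separated by blank lines.
paras=$(mktemp)
cat > "$paras" <<'EOF'
aaa bbb ccc
aaa bbb ccc
aaa bbb ccc

aaa ccc
bbb aaa ccc
bbb aaa ccc

ccc bbb aaa
ccc bbb aaa
ccc bbb aaa
EOF

# RS='' reads one paragraph per record; (^|\n) anchors bbb to the start of any line in it.
hit=$(awk -v RS='' -v ORS='\n\n' '/(^|\n)bbb/' "$paras")
printf '%s\n' "$hit"
rm -f "$paras"
```

Only the second paragraph is selected, because in the other two "bbb" is always preceded by a space rather than by a newline or the start of the record.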

In Linux command line console, how to get the sub-string from a file?

The content of the file is fixed.
Example:
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
....
Which command can I use to get all the sub-strings AAA, BBB, CCC, etc.?

You can use cut, awk, or perl for this.
cat >> file.data << EOF
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
EOF
AWK
awk '{ print $2 }' file.data
AAA
BBB
CCC
CUT
cut -d " " -f2 file.data
AAA
BBB
CCC
PERL
perl -alne 'print $F[1] ' file.data
AAA
BBB
CCC

You can use cut:
cut -d' ' -f 2 file
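The cut and awk approaches differ when columns are separated by more than one space: cut treats every single space as a delimiter, while awk's default field splitting collapses runs of blanks. A small sketch, using an invented line with two spaces before AAA (line, with_awk and with_cut are hypothetical names):

```shell
# Hypothetical input line: note the two spaces before AAA.
line='2016-03-28T00:02  AAA 2016-03-28T00:03'

# awk splits on runs of blanks, so AAA is still field 2.
with_awk=$(printf '%s\n' "$line" | awk '{ print $2 }')

# cut sees an empty field between the two spaces, so field 2 is empty.
with_cut=$(printf '%s\n' "$line" | cut -d' ' -f2)

printf 'awk: [%s]\ncut: [%s]\n' "$with_awk" "$with_cut"
```

For the fixed, single-space format shown in the question both commands give the same result.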

You can use AWK for this:
jayforsythe$ cat > file
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
jayforsythe$ awk '{ print $2 }' file
AAA
BBB
CCC
To save the result to another file, simply add the redirection operator:
jayforsythe$ awk '{ print $2 }' file > file2

awk how to print the rest

My file contains lines like this:
any1 aaa bbb ccc
The delimiter is a space. The number of words in the line is unknown.
I want to put the first word into var1. That is simple with
awk '{print $1}'
Now I want to put the rest of the line into var2 with awk.
How can I print the rest of the line with awk?

Better to use read here:
s="any1 aaa bbb ccc"
read var1 var2 <<< "$s"
echo "$var1"
any1
echo "$var2"
aaa bbb ccc
For an awk-only solution, use:
echo "$s" | awk '{print $1; print substr($0, index($0, " ")+1)}'
any1
aaa bbb ccc

$ var=$(awk '{sub(/^[^[:space:]]+[[:space:]]+/,"")}1' file)
$ echo "$var"
aaa bbb ccc
Or, in general, to skip some number of fields, use a regexp interval:
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){1}/,"")}1' file
aaa bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){2}/,"")}1' file
bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){3}/,"")}1' file
ccc
Note that doing this gets much more complicated if you have a FS that's more than a single char, and the above is just for the default FS since it additionally skips any leading blanks if present (remove the first [[:space:]]* if you have a non-default but still single-char FS).

awk solution:
awk '{$1 = ""; print $0;}'
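One thing to be aware of with this approach: emptying $1 makes awk rebuild the record, so the separator that followed the first field stays at the front of the line, and any runs of spaces between the remaining words are squeezed to one. The sketch below shows the effect and one possible way to drop the leading space; s, rest_raw and rest are hypothetical names.

```shell
s='any1 aaa bbb ccc'

# Emptying $1 rebuilds the record, leaving the separator in front of the rest.
rest_raw=$(printf '%s\n' "$s" | awk '{ $1 = ""; print }')

# Same idea, with the single leading space removed afterwards.
rest=$(printf '%s\n' "$s" | awk '{ $1 = ""; sub(/^ /, ""); print }')

printf '[%s]\n[%s]\n' "$rest_raw" "$rest"
```

If the original spacing between the remaining words matters, the sub()-based answers above are the safer choice, since they do not rebuild the record.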
