Accessing text between specific words in UNIX multiple times - linux

If the file looks like this:
ram_file
abc
123
end_file
tony_file
xyz
456
end_file
bravo_file
uvw
789
end_file
Now I want to access the text between ram_file and end_file, tony_file and end_file, and bravo_file and end_file simultaneously. I tried the sed command, but I don't know how to specify *_file in it.
Thanks in advance

This awk command should do the job for you.
This solution treats end_file as the end of a block, and every other xxxx_file line as the start of a block.
It will not print text that sits between blocks, if there is any, such as the line "do not print this" in my example.
awk '/end_file/{f=0} f; /_file/ && !/end_file/ {f=1}' file
abc
123
xyz
456
uvw
789
cat file
ram_file
abc
123
end_file
do not print this
tony_file
xyz
456
end_file
nor this data
bravo_file
uvw
789
end_file
If you would like some formatting, that is easy to do with awk:
awk -F_ '/end_file/{printf (f?RS:"");f=0} f; /file/ && !/end_file/ {f=1;print "-Block-"++c"--> "$1}' file
-Block-1--> ram
abc
123
-Block-2--> tony
xyz
456
-Block-3--> bravo
uvw
789
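The first one-liner above is compact, so here is the same logic spread over one rule per line with comments. This is only a sketch: the variable name `inside` and the shortened sample input are illustrative choices, not part of the original answer.

```shell
# Feed a shortened copy of the sample file to awk on stdin.
out=$(printf '%s\n' ram_file abc 123 end_file 'do not print this' \
                    tony_file xyz 456 end_file |
  awk '
    /end_file/             { inside = 0 }  # closing marker: stop printing
    inside                 { print }       # print only while inside a block
    /_file/ && !/end_file/ { inside = 1 }  # any other *_file line opens a block
  ')

printf '%s\n' "$out"
```

The order of the rules matters: the closing test runs first so that end_file itself is not printed, and the opening test runs last so that the xxxx_file line is not printed either.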

Related

Shell - Delete line if the line has only one column? [duplicate]

How do I delete a line if it contains only one column (a single string)?
abc def geh
ijk
123 xyz 345
mno
Expected output
abc def geh
123 xyz 345
A simple awk does the job without regex:
awk 'NF > 1' file
abc def geh
123 xyz 345
This also works when a line has leading or trailing spaces, or when a line contains only whitespace.
Many options are available. One of them is:
grep " " myfile.txt
The output matches the expected result. This command keeps only the lines that contain at least one space.
This works only if a single-string line has no space before or after it; if it might, this one works too:
awk 'NF > 1' myfile.txt
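To see where the two commands differ, here is a small sketch on made-up input (the four test lines are assumptions, not data from the question): a single word followed by a trailing space and a whitespace-only line both contain a space, so grep keeps them, while awk counts fields and drops them.

```shell
# Four lines: two columns, one column, one column plus a trailing space,
# and a line made of spaces only.
input=$(printf '%s\n' 'abc def' 'ijk' 'mno ' '   ')

# grep keeps every line that contains a space anywhere.
grep_count=$(printf '%s\n' "$input" | grep -c ' ')

# awk keeps only lines with more than one whitespace-separated field.
awk_out=$(printf '%s\n' "$input" | awk 'NF > 1')

printf 'grep kept %s lines\n' "$grep_count"
printf 'awk kept: %s\n' "$awk_out"
```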

How to Print All line between matching first occurrence of word?

input.txt
ABC
CDE
EFG
XYZ
ABC
PQR
EFG
From the above file I want to print the lines between 'ABC' and the first occurrence of 'EFG'.
Expected output :
ABC
CDE
EFG
ABC
PQR
EFG
How can I print the lines from one word to the first occurrence of a second word?
EDIT: In case you want to print all occurrences of lines between ABC and EFG and leave out the others, then try the following.
awk '/ABC/{found=1} found;/EFG/{found=""}' Input_file
Could you please try the following.
awk '/ABC/{flag=1} flag && !count;/EFG/{count++}' Input_file
$ awk '/ABC/,/EFG/' file
Output:
ABC
CDE
EFG
ABC
PQR
EFG
This might work for you (GNU sed):
sed -n '/ABC/{:a;N;/EFG/!ba;p}' file
Turn off implicit printing by using the -n option.
Gather up lines between ABC and EFG and then print them. Repeat.
If you want to only print between the first occurrence of ABC to EFG, use:
sed -n '/ABC/{:a;N;/EFG/!ba;p;q}' file
To print the second through fourth occurrences, use:
sed -En '/ABC/{:a;N;/EFG/!ba;x;s/^/x/;/^x{2,4}$/{x;p;x};x;}' file
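If only the first ABC to EFG block is wanted, another possible awk variant is to exit as soon as the end marker is printed, so the rest of the file is never read. This is a sketch of that idea, not one of the answers above; the variable name `found` is reused from the earlier answer for readability.

```shell
out=$(printf '%s\n' ABC CDE EFG XYZ ABC PQR EFG |
  awk '
    /ABC/          { found = 1 }  # start marker seen
    found          { print }      # print while inside the block
    found && /EFG/ { exit }       # stop reading after the first end marker
  ')

printf '%s\n' "$out"
```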

Read a file for specific string and read lines after the match

I have a file which looks like:
AA
2
3
4
CCC
111
222
333
XXX
12
23
34
I am looking for an awk command that searches for the string 'CCC' in the above and prints all the lines that occur after 'CCC', but stops reading as soon as it reaches 'XXX'.
A very simple command does the read for me, but it does not stop at XXX.
awk '$0 == "CCC" {i=1;next};i && i++' c.out
Could you please try the following.
Solution 1st: With sed.
sed -n '/CCC/,/XXX/p' Input_file
Solution 2nd: With awk.
awk '/CCC/{flag=1} flag; /XXX/{flag=""}' Input_file
Solution 3rd: In case you want to print from string CCC to XXX but not these strings themselves, do the following.
awk '/CCC/{flag=1;next} /XXX/{flag=""} flag' Input_file
"Do something between this and that" can easily be solved with a range pattern:
awk '/CCC/,/XXX/' # prints everything between CCC and XXX (inclusive)
But it's not exactly what you've asked. You wanted to print everything after CCC and quit (stop reading) on XXX. This translates to
awk '/XXX/{exit};f;/CCC/{f=1}'
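A quick way to compare the two commands is to run both on the sample data from the question. The sketch below passes the data on stdin instead of a file name; the variable names are illustrative.

```shell
data=$(printf '%s\n' AA 2 3 4 CCC 111 222 333 XXX 12 23 34)

# Range pattern: includes the CCC and XXX lines themselves.
inclusive=$(printf '%s\n' "$data" | awk '/CCC/,/XXX/')

# Flag plus exit: prints only the lines after CCC and stops at XXX.
exclusive=$(printf '%s\n' "$data" | awk '/XXX/{exit} f; /CCC/{f=1}')

printf 'inclusive:\n%s\n' "$inclusive"
printf 'exclusive:\n%s\n' "$exclusive"
```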

bash Sort uniq list of numbers and strings

I would like to sort and merge a list in the following format
123 ABC
1 ABC
345 BGF
3 BGF
to
124 ABC
348 BGF
I need this in bash. Thank you.
Using awk you can do this:
awk '{a[$2]+=$1} END{for (i in a) print a[i], i}' file
124 ABC
348 BGF
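One caveat: awk does not guarantee any particular order for `for (i in a)`, so the lines may come out in a different order on other inputs or awk implementations. Since the question also asks for sorting, a possible sketch is to pipe the sums through sort; sorting on the second column is an assumption about the wanted order.

```shell
out=$(printf '%s\n' '123 ABC' '1 ABC' '345 BGF' '3 BGF' |
  # Add up column 1 per key in column 2, then print "sum key".
  awk '{ sum[$2] += $1 } END { for (key in sum) print sum[key], key }' |
  # Order the result by key; LC_ALL=C keeps the order locale-independent.
  LC_ALL=C sort -k2,2)

printf '%s\n' "$out"
```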

Multiline trimming

I have an HTML file that I want to trim. I want to remove a section from the beginning all the way to a given string, and from another string to the end. How do I do that, preferably using sed?
With GNU sed:
sed '/mark1/,/mark2/d;/mark3/,$d'
This input:
abc
def
mark1
ghi
jkl
mno
mark2
pqr
stu
mark3
vwx
yz
becomes
abc
def
pqr
stu
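The command above deletes the span between two markers and everything from a third marker on. For the trimming described in the question (drop everything from the start of the file through one string, and from another string to the end), a possible sketch is shown below. The marker names START and STOP and the sample lines are hypothetical. Note that with the `1,/START/` address the end pattern is only checked from line 2 onward, so this assumes START is not on the first line.

```shell
out=$(printf '%s\n' header1 header2 START keep1 keep2 STOP footer |
  # Delete line 1 through the first START line, then STOP through the last line.
  sed '1,/START/d; /STOP/,$d')

printf '%s\n' "$out"
```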
You can use awk:
$ cat file
mark1 dsf
abc
def
before mark2 after
blah mark1
ghi
jkl
mno
wirds mark2 here
pqr
stu
mark3
vwx
yz
$ awk -vRS="mark2" '/mark1/{gsub("mark1.*","")}/mark3/{ gsub("mark3.*","");print;f=1 } !f ' file
after
blah
here
pqr
stu
