Multiline trimming - linux

I have an HTML file that I want to trim. I want to remove everything from the beginning up to a given string, and everything from another string to the end. How do I do that, preferably using sed?

With GNU sed:
sed '/mark1/,/mark2/d;/mark3/,$d'
this
abc
def
mark1
ghi
jkl
mno
mark2
pqr
stu
mark3
vwx
yz
becomes
abc
def
pqr
stu
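The command above deletes a marked region in the middle plus a tail. If the goal is literally "drop everything up to one string and everything from another string on", a minimal sketch could look like the following. The function name `trim_outside` and the marker strings `mark1` and `mark2` are placeholders for illustration, not something from the question.

```shell
# Hypothetical helper: keep only the lines strictly between mark1 and mark2.
#   1,/mark1/d   deletes line 1 through the first line matching mark1
#   /mark2/,$d   deletes the next line matching mark2 through the end of input
trim_outside() {
    sed '1,/mark1/d;/mark2/,$d' "$@"
}

# Demo on inline data; with a real file, pass its name as an argument instead.
printf '%s\n' head junk mark1 ghi jkl mark2 tail | trim_outside
```

One caveat: with the address `1,/mark1/` the search for the end of the range starts on line 2, so a mark1 sitting on the very first line is not treated as the end of the range. GNU sed accepts `0,/mark1/` for that case.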

You can use awk:
$ cat file
mark1 dsf
abc
def
before mark2 after
blah mark1
ghi
jkl
mno
wirds mark2 here
pqr
stu
mark3
vwx
yz
$ awk -vRS="mark2" '/mark1/{gsub("mark1.*","")}/mark3/{ gsub("mark3.*","");print;f=1 } !f ' file
after
blah
here
pqr
stu

Related

How to Print All line between matching first occurrence of word?

input.txt
ABC
CDE
EFG
XYZ
ABC
PQR
EFG
From the above file I want to print the lines between 'ABC' and the first following occurrence of 'EFG'.
Expected output :
ABC
CDE
EFG
ABC
PQR
EFG
How can I print lines from one word to the first occurrence of a second word?
EDIT: In case you want to print all occurrences of lines between ABC and EFG and leave out the others, then try the following.
awk '/ABC/{found=1} found;/EFG/{found=""}' Input_file
Could you please try the following. It prints only the first ABC-to-EFG block.
awk '/ABC/{flag=1} flag && !count;/EFG/{count++}' Input_file
$ awk '/ABC/,/EFG/' file
Output:
ABC
CDE
EFG
ABC
PQR
EFG
This might work for you (GNU sed):
sed -n '/ABC/{:a;N;/EFG/!ba;p}' file
Turn off implicit printing by using the -n option.
Gather up lines between ABC and EFG and then print them. Repeat.
If you want to only print between the first occurrence of ABC to EFG, use:
sed -n '/ABC/{:a;N;/EFG/!ba;p;q}' file
To print the second through fourth occurrences, use:
sed -En '/ABC/{:a;N;/EFG/!ba;x;s/^/x/;/^x{2,4}$/{x;p;x};x;}' file
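For picking out a single numbered block, an awk counter is arguably easier to read than the hold-space version. The sketch below is only one way to do it; `nth_block` is a made-up name, and it assumes ABC never appears inside a block before the closing EFG.

```shell
# Hypothetical helper: print only the n-th ABC..EFG block.
#   $1          block number (1-based)
#   remaining   input files (stdin if none)
nth_block() {
    n=$1; shift
    # c counts ABC lines; print while we are in block n; stop at its EFG.
    awk -v n="$n" '/ABC/ {c++} c == n; /EFG/ && c == n {exit}' "$@"
}

# Demo using the sample input from the question.
printf '%s\n' ABC CDE EFG XYZ ABC PQR EFG | nth_block 2
```

Exiting at the closing EFG keeps lines that follow the block (such as XYZ after the first one) out of the output.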

Display indentation of long lines in following lines in vim

When I have a long indented line in vim, it wraps at the end of the window automatically (just visually). I'd like to show the indentation on the continuation lines as well. Is it possible?
This visually indents lines that have been wrapped:
:set wrap
:let &showbreak=' '
Note that the indent width is fixed; it doesn't try to match the indent of the previous line.
Before
abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc
After
abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc
abc abc abc abc abc abc abc abc abc abc abc abc
abc abc abc abc abc abc

accessing text between specific words in UNIX multiple times

If the file is like this:
ram_file
abc
123
end_file
tony_file
xyz
456
end_file
bravo_file
uvw
789
end_file
Now I want to access the text between ram_file and end_file, tony_file and end_file, and bravo_file and end_file simultaneously. I tried the sed command, but I don't know how to specify *_file in it.
Thanks in advance
This awk should do the job for you.
This solution treats end_file as the end of a block, and every other xxxx_file line as the start of a block.
It will not print text that sits between blocks, if there is any, such as the "do not print this" line in my example.
awk '/end_file/{f=0} f; /_file/ && !/end_file/ {f=1}' file
abc
123
xyz
456
uvw
789
cat file
ram_file
abc
123
end_file
do not print this
tony_file
xyz
456
end_file
nor this data
bravo_file
uvw
789
end_file
If you want some formatting, it is easy to do with awk:
awk -F_ '/end_file/{printf (f?RS:"");f=0} f; /file/ && !/end_file/ {f=1;print "-Block-"++c"--> "$1}' file
-Block-1--> ram
abc
123
-Block-2--> tony
xyz
456
-Block-3--> bravo
uvw
789
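If only one named block is needed rather than all of them, the same flag idea can be narrowed down. This is a sketch under the assumption that header lines are exactly `<name>_file` and the terminator is exactly `end_file`; the helper name `block_of` is invented for the example.

```shell
# Hypothetical helper: print the body of one named block.
#   $1          block name without the _file suffix (e.g. tony)
#   remaining   input files (stdin if none)
block_of() {
    name=$1; shift
    awk -v start="${name}_file" '
        $0 == start      { f = 1; next }   # header line: start printing after it
        $0 == "end_file" { f = 0 }         # terminator: stop printing
        f                                  # print lines while inside the block
    ' "$@"
}

# Demo on inline data shaped like the file in the question.
printf '%s\n' ram_file abc 123 end_file skipped tony_file xyz 456 end_file | block_of tony
```

Comparing whole lines with `==` instead of matching a regex avoids accidental hits on lines that merely contain the block name.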

How to delete the matching pattern from given occurrence

I'm trying to delete matching patterns, starting from the second occurrence, using sed or awk. The input file contains the information below:
abc
def
abc
ghi
jkl
abc
xyz
abc
I want to delete the pattern abc starting from its second instance. The output should be as below:
abc
def
ghi
jkl
xyz
Neat sed solution:
sed '/abc/{2,$d}' test.txt
abc
def
ghi
jkl
xyz
$ awk '$0=="abc"{c[$0]++} c[$0]<2; ' file
abc
def
ghi
jkl
xyz
Just change the "2" to "3" or whatever number you want to keep the first N occurrences instead of just the first 1.
One way using awk:
$ awk 'f&&$0==p{next}$0==p{f=1}1' p="abc" file
abc
def
ghi
jkl
xyz
Just set p to the pattern for which only the first instance should be printed.
Taken from: unix.com
Using awk '!x[$0]++' removes duplicate lines. x is an array whose elements start out as 0 (unset), and it is indexed by $0, the whole line. The first time a line is seen, x[$0] is 0; because ++ here is the postfix form, the expression yields 0 and only then increments x[$0] to 1. So !x[$0]++ is true and $0 is printed by default. If the same line appears again, x[$0] is already nonzero, so !x[$0]++ is false and $0 is not printed.
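To make the difference concrete: `!x[$0]++` drops every repeated line, while the earlier counter-based command only drops repeats of abc. A small sketch comparing the two follows; the function names `dedupe_all` and `dedupe_abc` are made up for the comparison.

```shell
# Hypothetical wrappers around the two one-liners discussed above.
dedupe_all() { awk '!x[$0]++' "$@"; }                         # drop any repeated line
dedupe_abc() { awk '$0 == "abc" {c[$0]++} c[$0] < 2' "$@"; }  # drop only repeated abc

# Same input for both: abc and def each appear twice.
printf '%s\n' abc def abc def | dedupe_all   # prints abc, def
printf '%s\n' abc def abc def | dedupe_abc   # prints abc, def, def
```

On the sample input from the question both give the same result, because abc is the only line that repeats there.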

Re-ordering columns with a Perl one-liner

How do you reorganize this with a one-liner?
foo r1.1 abc
foo r10.1 pqr
qux r2.1 lmn
bar r33.1 xpq
# In fact there could be more fields that precede the column with "rxx.x".
Into this
r1.1 foo abc
r10.1 foo pqr
r2.1 qux lmn
r33.1 bar xpq
Basically, put second column into the first and everything else that succeeds it, after.
Assuming your text is in the file "test", this will do it:
perl -lane 'print "$F[1] $F[0] $F[2]"' test
If you have more than three columns, you will want something like:
perl -lane 'print join q( ),$F[1],$F[0],@F[2..@F-1]'
$ perl -pale '$_ = "@F[1,0,2..$#F]"' file
If it's tab-separated, a little more is needed:
$ perl -pale 'BEGIN { $"="\t"; } $_ = "@F[1,0,2..$#F]"' file
Content of 'infile':
foo r1.1 abc
foo r10.1 pqr
qux r2.1 lmn
bar r33.1 xpq
Perl one-line:
perl -pe 's/\A(\S+\s+)(\S+\s+)/$2$1/' infile
Result:
r1.1 foo abc
r10.1 foo pqr
r2.1 qux lmn
r33.1 bar xpq
The basic answers have been provided by others, so I considered the case of fixed-width data with possibly empty fields:
>cat spacedata.txt
foo r1.1 abc
foo r10.1 pqr
qux r2.1 lmn
bar r33.1 xpq
r1.2 cake
is r1.2 alie
>perl -lpwE '$_=pack "A7A5A*", (unpack "A5A7A*")[1,0,2];' spacedata.txt
r1.1 foo abc
r10.1 foo pqr
r2.1 qux lmn
r33.1 bar xpq
r1.2 cake
r1.2 is alie
file
a 5 ss
b 3 ff
c 2 zz
cat file | awk '{print $2, $1, $3}' # will print column 2,1,3
5 a ss
3 b ff
2 c zz
#or if you want to sort by column and print to new_file
cat file | sort -n -k2 | awk '{print $0}' > new_file
new_file
c 2 zz
b 3 ff
a 5 ss
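The awk command above prints exactly three columns. When the number of columns varies, one possible approach is to swap the first two fields in place and let awk print the whole record. This is a sketch only; the name `swap_first_two` is invented, and it assumes whitespace-separated input with at least two fields per line.

```shell
# Hypothetical helper: swap columns 1 and 2, keep all remaining columns.
# Assigning to a field makes awk rebuild the record with single spaces (OFS).
swap_first_two() {
    awk '{ t = $1; $1 = $2; $2 = t } 1' "$@"
}

# Demo: the first line has an extra fourth column that is carried along.
printf '%s\n' 'foo r1.1 abc extra' 'qux r2.1 lmn' | swap_first_two
```

Because the record is rebuilt, runs of spaces or tabs in the input collapse to single spaces in the output; setting OFS would change that separator.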
