Vi: delete only the non-first (second, third, …) occurrences of duplicated lines - vim

How can I delete every occurrence of a duplicated line in a text file except the first one? (This question might be related.)
I need to keep the original order; otherwise I would have used :sort u.
Example:
hsdf
asdf
csdf
csdf
hsdf
dsdf
jsdf
asdf
results in
hsdf
asdf
csdf
dsdf
jsdf
instead of
asdf
csdf
dsdf
hsdf
jsdf

Sometimes you don't need to think much. Just take that big hammer and make a bang.
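So what's the plan? Loop over all lines, store each one in an associative array (a dictionary), and if a line is already there, delete it from the buffer. It looks dumb enough to work on the first attempt. Note that :g/./ only visits non-empty lines, so blank lines are left alone, and that the foo->has_key(...) method-call syntax needs a reasonably recent Vim; on older versions has_key(foo, getline(".")) does the same thing: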
:let foo = {}
:g/./if foo->has_key(getline(".")) | delete | else | let foo[getline(".")] = 1 | endif
:unlet foo
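For readers who want to check the logic outside Vim, here is a minimal Python sketch of the same "remember what you have seen" idea. It is not the answer's code: the function name dedupe_keep_first is made up for illustration, and keeping empty lines untouched is an assumption chosen to mirror the fact that :g/./ skips them.

```python
def dedupe_keep_first(lines):
    """Drop repeated non-empty lines, keeping the first occurrence and the order."""
    seen = set()
    result = []
    for line in lines:
        if line and line in seen:
            continue  # a repeat of an earlier non-empty line: drop it
        seen.add(line)
        result.append(line)
    return result


# The sample from the question.
sample = ["hsdf", "asdf", "csdf", "csdf", "hsdf", "dsdf", "jsdf", "asdf"]
print("\n".join(dedupe_keep_first(sample)))
```

The set plays the role of the dictionary foo; order is preserved because lines are only ever appended in the order they are first met.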

Related

VIM - Reformat text to one line paragraphs

I have a text file like the following:
--------
FOX&DOGS
The quick brown. Fox
jumped.
Over the lazy dogs.
-------------------
I want to change it as follows:
--------
FOX&DOGS
The quick brown. Fox jumped.
Over the lazy dogs.
-------------------
So in general:
preserve empty line/lines
have new-lines just after any period_newline ".\n" (end of paragraph... In the above example I don’t want to cut line after "brown." for instance: there is just a period but not followed by newline, so it isn’t an end of a paragraph, so it has to stay on the same line)
My solution:
%s/\n\n/#\r#\r/ | %s/\.\n/\.#\r/ | %j | s/# /\r/g | $$d
The idea is a bit crude:
mark all ends of paragraph and empty lines (I have chosen "#" as marker)
join all lines in a long single one
substitute the marker "# " (there is a space after #) with carriage return "\r" (newline)
delete last empty line created during this procedure
It seemed to work so I also created an alias in vimrc:
command Par %s/\n\n/#\r#\r/ | %s/\.\n/\.#\r/ | %j | s/# /\r/g | $$d
The problem:
If there aren’t any empty lines, it fails with the error "pattern not found" and doesn’t change anything. It seems some sort of conditional is needed (if the pattern is found, substitute it; otherwise don't stop, but continue with the other commands).
Any idea to solve in a simple way?
Maybe I found a solution:
add a blank line after the last one, so that the pattern “\n\n” is always found even if it isn’t present in the original file, and the error can’t block next commands.
in the end we will have to remove 2 blank lines at the bottom created by the substitution “s/# /\r/g”
So the command I tried is:
$ | put _ | %s/\n\n/#\r#\r/ | %s/\.\n/\.#\r/ | %j | s/# /\r/g | $$d | $$d
$ go to the last line
append a blank line
mark newlines involving blank lines (also last blank line added) with # character
mark newlines involving period (the last line can’t end with period due to the marker # added at the previous step)
join all lines in a long one
replace markers “# ” with a newline (this creates two more blank lines at the bottom, which have to be removed)
remove the two last blank lines added
Limitations:
if a paragraph ends with a punctuation mark other than “period”, it doesn’t work at all.
Any idea to improve my raw oneliner is welcome!
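The rule described above (keep a line break after a line that ends with a period, keep empty lines, join everything else) can also be stated directly, without markers. The following Python sketch is only an illustration of that rule, not a replacement for the Vim one-liner; the function name join_paragraph_lines is hypothetical, and the sample assumes there is an empty line after the FOX&DOGS heading, which the example as shown does not make explicit.

```python
def join_paragraph_lines(text):
    """Join wrapped lines; break only after a trailing period or at empty lines."""
    out = []
    pending = []  # pieces of the output line currently being assembled
    for line in text.split("\n"):
        if line == "":
            if pending:  # an empty line also ends whatever was being assembled
                out.append(" ".join(pending))
                pending = []
            out.append("")  # preserve the empty line itself
            continue
        pending.append(line)
        if line.endswith("."):  # ".\n" marks the end of a paragraph
            out.append(" ".join(pending))
            pending = []
    if pending:  # text left over that did not end with a period
        out.append(" ".join(pending))
    return "\n".join(out)


sample = "FOX&DOGS\n\nThe quick brown. Fox\njumped.\nOver the lazy dogs."
print(join_paragraph_lines(sample))
```

Because nothing is searched for and replaced, the absence of empty lines is not an error case here; the same limitation about paragraphs ending in other punctuation marks still applies.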

Ask for explanation of one vim command

let i=1 | g/aaa\zs/s//\=i/ | let i=i+1
The above command adds a counter after each matched pattern, so the following text is changed from
aaab
aaab
aaab
to
aaa1b
aaa2b
aaa3b
'|' joins commands into one command. I expected the commands to be executed sequentially: first let i=1, then g/aaa\zs/s//\=i/, and finally let i=i+1. Judging from the result above, however, both s//\=i/ and let i=i+1 are executed by the g command. Can anyone explain?
The following command does not work as expected, and I don't know why.
let i=1 | g/aaa\zs/s//\=i | let i=i+1
In s//\=i/, the replacement string is terminated and the | is treated as an argument by global. However, when you remove the trailing /, the replacement string to s consumes the | let i=i+1. From the help doc for sub-replace-special, you can find: "When the substitute string starts with "\=" the remainder is interpreted as an expression." So the expression i | let i=i+1 is evaluated, but the increment is not available outside of that evaluation.
You should understand your first command as:
let i=1 | g/aaa\zs/ ( s//\=i/ | let i=i+1 )
(The parentheses are only here for explanation; they would cause a syntax error if typed.)
i.e. everything after the g/<pattern>/ is a single command given as an argument to the global g command.
So indeed: we start with let i=1, then for all lines matching pattern aaa we execute: s//\=i/ | let i=i+1 (substitution, then incrementing i).
Your second command does not work because s does not function the same way as g, and it does need an ending / after the expression to substitute to pattern.
Usually, the | separates two Ex commands, and they are then indeed executed sequentially. But some commands take a | as part of their arguments. :global is one of them (full list at :help :bar). So, the special application of commands over matching lines is applied with both the :s and the :let commands (the latter of which can be shortened as :let i+=1 BTW).
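To make the execution order concrete, here is a rough Python model of what the first command does. It is only an analogy (the function name number_matches and the plain substring search are assumptions for illustration): the loop body corresponds to the command list that :global runs once per matching line, and the increment sits inside that loop rather than after it.

```python
def number_matches(lines, needle="aaa"):
    """Insert a running counter right after the first `needle` on each matching line."""
    counter = 1  # let i=1 (runs once, before :global starts)
    result = []
    for line in lines:  # g/aaa\zs/ visits each matching line in turn
        pos = line.find(needle)
        if pos == -1:
            result.append(line)  # non-matching lines are not touched
            continue
        end = pos + len(needle)  # \zs: the match starts right after "aaa"
        result.append(line[:end] + str(counter) + line[end:])  # s//\=i/
        counter += 1  # let i=i+1, executed for every matching line
    return result


print(number_matches(["aaab", "aaab", "aaab"]))
```

If the increment were outside the loop, every line would receive the same number, which is what the sequential reading of the command would predict.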

Multiple :g and :v commands in one statement

I have this file
foo
foo bar
foo bar baz
bar baz
foo baz
baz bar
bar
baz
foo 42
foo bar 42 baz
baz 42
I want to
Select lines which contain foo and do NOT contain bar
Delete lines which contain foo and do NOT contain bar
I read somewhere (can't find the link) that I have to use :exec with | for this.
I tried the following, but it doesn't work
:exec "g/foo" # works
:exec "g/foo" | exec "g/bar" -- first returns lines with foo, then with bar
:exec "g/foo" | :g/bar -- same as above
And of course, if I cannot select a line, I cannot execute normal dd on it.
Any ideas?
Edit
Note for the bounty:
I'm looking for a solution that uses proper :g and :v commands, and does not use regex hacks, as the conditions may not be the same (I can have 2 includes, 3 excludes).
Also note that the last 2 examples of things that don't work, they do work for just deleting the lines, but they return incorrect information when I run them without deleting (ie, viewing the selected lines) and they behave as mentioned above.
I'm no vim wizard, but if all you want to do is "Delete lines which contain foo and do NOT contain bar" then this should do (I tried on your example file):
:v /bar/s/.*foo.*//
EDIT: actually this leaves empty lines behind. You probably want to add an optional newline to that second search pattern.
This might still be hackish to you, but you can write some vimscript to make a function and specialized command for this. For example:
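command! -nargs=* -range=% G <line1>,<line2>call MultiG(<f-args>)
fun! MultiG(...) range
  let pattern = ""
  let command = ""
  for i in a:000
    if i[0] == "-"
      " -word: the line must not contain the word (negative lookahead)
      let pattern .= "\\(.*\\<".strpart(i,1)."\\>\\)\\@!"
    elseif i[0] == "+"
      " +word: the line must contain the word (positive lookahead)
      let pattern .= "\\(.*\\<".strpart(i,1)."\\>\\)\\@="
    else
      " anything else is the command to run on the matching lines
      let command = i
    endif
  endfor
  exe a:firstline.",".a:lastline."g/".pattern."/".command
endfun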
This creates a command that allows you to automate the "regex hack". This way you could do
:G +foo -bar
to get all lines with foo and not bar. If an argument doesn't start with + or - then it is considered the command to add on to the end of the :g command. So you could also do
:G d +foo -bar
to delete the lines, or even
:G norm\ foXp +two\ foos -bar
if you escape your spaces. It also takes a range like :1,3G +etc, and you can use regex in the search terms but you must escape your spaces. Hope this helps.
This is where regular expressions get a bit cumbersome. You need to use the zero width match \(search_string\)\@=. If you want to match a list of items in any order, the search_string should start with .* (so the match starts from the start of the line each time). To match a non-occurrence, use \@! instead.
I think these commands should do what you want (for clarity I am using # as the delimiter, rather than the usual /):
Select lines which contain foo and bar:
:g#\(.*foo\)\@=\(.*bar\)\@=
Select lines which contain foo, bar and baz
:g#\(.*foo\)\@=\(.*bar\)\@=\(.*baz\)\@=
Select lines which contain foo and do NOT contain bar
:g#\(.*foo\)\@=\(.*bar\)\@!
Delete lines which contain foo and bar
:g#\(.*foo\)\@=\(.*bar\)\@=#d
Delete lines which contain foo and do NOT contain bar
:g#\(.*foo\)\@=\(.*bar\)\@!#d
You won't achieve your requirements unless you're willing to use some regular expressions, since the expressions are what drive :global and its opposite :vglobal.
This is no hacking around but how the commands are supposed to work: they need an expression to work with. If you're not willing to use regular expressions, I'm afraid you won't be able to achieve it.
Answer terminates here if you're not willing to use any regular expressions.
Assuming that we are nice guys with an open mind, we need a regular expression that is true when a line contains foo and not bar.
Suggestion number 5 of Prince Goulash is almost there, but because it is not anchored at the start of the line it doesn't work if foo occurs after bar.
This expression does the job (i.e. it prints all the matching lines):
:g/^\(.*\<bar\>\)\@!\(.*\<foo\>\)\@=/
If you want to delete them, add the delete command:
:g/^\(.*\<bar\>\)\@!\(.*\<foo\>\)\@=/d
Description:
^ starting from the beginning of the line
\(.*\<bar\>\) the word bar
\@! must never appear
\(.*\<foo\>\)\@= but the word foo has to appear anywhere on the line
The two patterns could also be swapped:
:g/^\(.*\<foo\>\)\@=\(.*\<bar\>\)\@!/
yields the same results.
Tested with the following input:
01 foo
02 foo bar
03 foo bar baz
04 bar baz
05 foo baz
06 baz bar
07 bar
08 baz
09 foo 42
10 foo bar 42 baz
11 42 foo baz
12 42 foo bar
13 42 bar foo
14 baz 42
15 baz foo
16 bar foo
Regarding multiple includes/excludes:
Each exclude is made of the pattern
\(.*\<what_to_exclude\>\)\@!
Each include is made of the pattern
\(.*\<what_to_include\>\)\@=
To print all the lines that contain foo but not bar nor baz:
g/^\(.*\<bar\>\)\@!\(.*\<baz\>\)\@!\(.*\<foo\>\)\@=/
Print all lines that contain foo and 42 but neither bar nor baz:
g/^\(.*\<bar\>\)\@!\(.*\<baz\>\)\@!\(.*\<foo\>\)\@=\(.*\<42\>\)\@=/
The sequence of the includes and excludes is not important, you could even mix them:
g/^\(.*\<bar\>\)\@!\(.*\<42\>\)\@=\(.*\<baz\>\)\@!\(.*\<foo\>\)\@=/
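The include/exclude rule can be cross-checked outside Vim. The Python sketch below is an illustration only, not part of the answer: select_lines is a made-up name, and Python's \b is used as an approximation of Vim's \< and \> word boundaries (Vim's depend on the 'iskeyword' option). It is run against the numbered test input listed above.

```python
import re


def select_lines(lines, include=(), exclude=()):
    """Keep lines containing every word in `include` and no word in `exclude`."""
    def has_word(word, line):
        # \b approximates Vim's \< and \> around a literal word
        return re.search(r"\b" + re.escape(word) + r"\b", line) is not None

    return [
        line for line in lines
        if all(has_word(w, line) for w in include)
        and not any(has_word(w, line) for w in exclude)
    ]


test_input = [
    "01 foo", "02 foo bar", "03 foo bar baz", "04 bar baz",
    "05 foo baz", "06 baz bar", "07 bar", "08 baz",
    "09 foo 42", "10 foo bar 42 baz", "11 42 foo baz", "12 42 foo bar",
    "13 42 bar foo", "14 baz 42", "15 baz foo", "16 bar foo",
]
print(select_lines(test_input, include=["foo"], exclude=["bar"]))
```

As with the Vim patterns, the order in which includes and excludes are listed makes no difference, because each condition is tested against the whole line.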
One might think a combination like :g/foo/v/bar/d would work, but unfortunately this isn't possible, and you will have to resort to one of the proposed work-arounds.
As described in the help, behind the scenes the :global command works in two stages,
first marking the lines on which to operate,
then performing the operation on them.
Out of interest, I had a look at the relevant parts in the Vim source: In ex_cmds.c, ex_global(), you will find that the global flag global_busy prevents repeated execution of the command while it is busy.
You want to employ a negative look ahead. This article gives more or less the specific example you are trying to achieve.
http://www.littletechtips.com/2009/09/vim-negative-match-using-negative-look.html
I changed it to
:g/foo\(.*bar\)\@!/d
Please let us know if you consider this a regex hack.
I will throw my hat in the ring. As vim's documentation explicitly states recursive global commands are invalid and the regex solution will get pretty hairy quickly, I think this is job for a custom function and command. I have created the :G command.
The usage is as :G followed by patterns surrounded by /. Any pattern that should not match is prefixed with a !.
:G /foo/ !/bar/ d
This will delete all lines that match /foo/ and does not match /bar/
:G /42 baz/ !/bar/ norm A$
This will append a $ to all lines matching /42 baz/ and that don't match /bar/
:G /foo/ !/bar/ !/baz/ d
This will delete all lines that match /foo/ and does not match /bar/ and does not match /baz/
The script for the :G command is below:
function! s:ManyGlobal(args) range
  let lnums = {}
  let patterns = []
  let cmd = ''
  let threshold = 0
  " One leading pattern: optional ! or v, then /.../ closed by an unescaped /
  let regex = '\m^\s*\(!\|v\)\=/.\{-}\%(\\\)\@<!/\s\+'
  let args = a:args
  " Peel the patterns off the front of the argument string
  while args =~ regex
    let pat = matchstr(args, regex)
    let pat = substitute(pat, '\m^\s*\ze/', '', '')
    call add(patterns, pat)
    let args = substitute(args, regex, '', '')
  endwhile
  " Whatever is left is the command; default to :nu when nothing is left
  if args =~ '^\s*$'
    let cmd = 'nu'
  else
    let cmd = args
  endif
  " Count +1 per matching include and -1 per matching exclude, per line
  for p in patterns
    if p =~ '^\(!\|v\)'
      let op = '-'
    else
      let op = '+'
      let threshold += 1
    endif
    let marker = "let l:lnums[line('.')] = get(l:lnums, line('.'), 0)" . op . "1"
    exe a:firstline . "," . a:lastline . "g" . substitute(p, '^\(!\|v\)', '', '') . marker
  endfor
  " Keep only the lines whose count equals the number of include patterns
  let patterns = []
  for k in keys(lnums)
    if threshold == lnums[k]
      call add(patterns, '\%' . k . 'l')
    endif
  endfor
  exe a:firstline . "," . a:lastline . 'g/\m' . join(patterns, '\|') . "/ " . cmd
endfunction
command! -nargs=+ -range=% G <line1>,<line2>call <SID>ManyGlobal(<q-args>)
The function basically parses out the arguments then goes and marks all matching lines with each given pattern separately. Then executes the given command on each line that is marked the proper amount of times.
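The counting idea can be modelled in a few lines of Python. This is a sketch of the bookkeeping only, not a port of the function: many_global is a hypothetical name, the patterns are Python regular expressions rather than Vim ones, and it returns 1-based line numbers instead of running a command on them.

```python
import re


def many_global(lines, include=(), exclude=()):
    """Return 1-based numbers of lines matching every include and no exclude."""
    score = {}  # line number -> includes matched minus excludes matched
    for pattern, delta in [(p, 1) for p in include] + [(p, -1) for p in exclude]:
        for nr, line in enumerate(lines, start=1):
            if re.search(pattern, line):  # one separate pass per pattern
                score[nr] = score.get(nr, 0) + delta
    threshold = len(include)  # a line must have been hit by every include
    return [nr for nr in sorted(score) if score[nr] == threshold]


sample = [
    "foo", "foo bar", "foo bar baz", "bar baz", "foo baz", "baz bar",
    "bar", "baz", "foo 42", "foo bar 42 baz", "baz 42",
]
print(many_global(sample, include=[r"foo"], exclude=[r"bar"]))
```

Since an exclude can only lower a line's score, a line reaches the threshold exactly when it matched all includes and none of the excludes. Lines that no pattern touched never enter the dictionary, which mirrors how the Vim function only considers lines present in lnums.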
All right, here's one which actually simulates recursive use of global commands. It allows you to combine any number of :g commands, at least theoretically. But I warn you, it isn't pretty!
Solution to the original problem
I use the Unix program nl (bear with me!) to insert line numbers, but you can also use pure Vim for this.
:%!nl -b a
:exec 'norm! qaq'|exec '.,$g/foo/d A'|exec 'norm! G"apddqaq'|exec '.,$v/bar/d'|%sort|%s/\v^\s*\d+\s*
Done! Let's see the explanation and general solution.
General solution
This is the approach I have chosen:
Introduce explicit line numbering
Use the end of the file as a scratch space and operate on it repeatedly
Sort the file, remove the line numbering
Using the end of the file as a scratch space (:g/foo/m$ and similar) is a pretty well-known trick (you can find it mentioned in the famous answer number one). Also note that :g preserves relative ordering of the lines – this is crucial. Here we go:
Preparation: Number lines, clear "accumulator" register a.
:%!nl
qaq
The iterative bit:
:execute global command, collect matching lines by appending them into the accumulator register with :d A.
paste the collected lines at the end of the file
repeat for range .,$ (the scratch space, or in our case, the "match" space)
Here's an extended example: delete lines which do contain 'foo', do not contain 'bar', do contain '42' (just for the demonstration).
:exec '.,$g/foo/d A' | exec 'norm! G"apddqaq' | exec '.,$v/bar/d A' | exec 'norm! G"apddqaq' | exec '.,$g/42/d A' | exec 'norm! G"apddqaq'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(this is the repeating bit)
When the iterative bit ends, the lines .,$ contain the matches for your convenience. You can delete them (dVG) or whatever.
Cleanup: Sort, remove line numbers.
:%sort
:%s/\v^\s*\d+\s*
I'm sure other people can improve on the details of the solution, but if you absolutely need to combine multiple :gs and :vs into one, this seems to be the most promising solution.
The built-in solutions look very complex.
One easy way would be to use LogiPat plugin:
Doc: http://www.drchip.org/astronaut/vim/doc/LogiPat.txt.html
Plugin: http://www.drchip.org/astronaut/vim/index.html#LOGIPAT
With this, you can easily search for patterns.
For example, to search for lines containing foo and not bar, use:
:LogiPat "foo"&!("bar")
This would highlight all the lines matching the logical pattern (if you have set hls).
That way you can cross-check whether you got the correct lines, and then traverse with 'n', and delete with 'dd', if you wish.
I realize you explicitly stated that you want solutions using :g and :v, but I firmly believe this is a perfect example of a case where you really should use an external tool.
:%!awk '\!/foo/ || /bar/'
There's no need to re-invent the wheel.
Select lines which contain foo and do NOT contain bar
Delete lines which contain foo and do NOT contain bar
This can be done by combining global and substitute commands:
:v/bar/s/.*foo.*//g

Vim: Column-wise increment inside and outside?

By "outside", I mean solutions that do not use Vim's scripting hacks but try to reuse basic *nix tools. By "inside", I mean solutions that get the column increment with Vim's own features, such as scripting.
1 1
1 2
1 3
1 ---> 4
1 5
1 6
. .
. .
Vim has a script that does column-wise incrementing, VisIncr. It has gathered about 50/50 up- and down-votes, perhaps because it smacks a bit of reinventing the wheel. How do you column-increment in Vim without using such a script? The other question is: how do you column-increment without (outside) Vim?
Most elegant, reusable and preferably-small wins the race!
I don't see a need for a script, a simple macro would do
"a yyp^Ayy
then play it, or map to play it.
Of course, there is always the possibility that I misunderstood the question entirely...
The optimal choice of technique depends heavily on the actual circumstances
of the transformation. At least two points of variation affect the
implementation:
Are the lines to operate on the only ones in the file? If not, is the range
of lines defined by context (i.e. is it separated by blank lines, like
a paragraph), or is it arbitrary and to be specified by the user?
Do those lines already contain numbers that should be changed, or is it
necessary to insert new ones, leaving the text on the lines intact?
Since there is no information to answer these questions, below we will try to
construct a flexible solution.
A general solution is a substitution operating on the beginnings of the lines
in the range specified by the user. Visual mode is probably the simplest way
of selecting an arbitrary range of lines, so we assume here that boundaries of
the range are defined by the visual selection.
:'<,'>s/^\d\+/\=line(".")-line("''")+1/
If it is necessary to number every line in a buffer, the command can be
simplified as follows.
:%s/^\d\+/\=line('.')/
In any case, if the number should be merely inserted at the beginnings of the
lines (without modifying the ones that already exist), one can change the
pattern from ^\d\+ to ^, and optionally add a separator:
:'<,'>s/^/\=(line(".")-line("''")+1).' '/
or
:%s/^/\=line('.').' '/
respectively.
For a solution based on command-line tools, one can consider using stream
editors like Sed or text extraction and reporting tools like AWK.
To number each of the lines in a file using Sed, run the commands
$ sed = filename | sed 'N;s/\n/ /'
In order to do the same in AWK, use the command
$ awk '{print NR " " $0}' filename
which could be easily modified to limit numbering to a particular range of lines
satisfying a certain condition. For example, the following command numbers the
lines two through eight.
$ awk '{print (2<=NR && NR<=8 ? ++n " " : "") $0}' filename
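For comparison, the same range-limited numbering can be written as a short Python function. This is a sketch for illustration rather than a recommended tool; the name number_range and its parameters are assumptions, with first and last being 1-based and inclusive like NR in the AWK command.

```python
def number_range(lines, first, last):
    """Prefix lines first..last (1-based, inclusive) with a counter starting at 1."""
    out = []
    n = 0  # plays the role of the AWK variable n
    for nr, line in enumerate(lines, start=1):  # nr corresponds to NR
        if first <= nr <= last:
            n += 1
            out.append(f"{n} {line}")
        else:
            out.append(line)  # lines outside the range pass through unchanged
    return out


print("\n".join(number_range(["a", "b", "c", "d"], 2, 3)))
```

Calling it with first=1 and last=len(lines) numbers every line, which corresponds to the simpler awk '{print NR " " $0}' form.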
Having an interest in how commands similar to those from the script linked in
the question statement are implemented, one can use the following command as
a reference.
vnoremap <leader>i :call EnumVisualBlock()<cr>
function! EnumVisualBlock() range
  if visualmode() != "\<c-v>"
    return
  endif
  let [l, r] = [virtcol("'<"), virtcol("'>")]
  let [l, r] = [min([l, r]), max([l, r])]
  let start = matchstr(getline("'<"), '^\d\+', col("'<")-1)
  let off = start - line("'<")
  let w = max(map([start, line("'>") + off], 'len("".v:val)'))
  exe "'<,'>" 's/\%'.l.'v.*\%<'.(r+1).'v./'.
  \ '\=printf("%'.w.'d",line(".")+off).repeat(" ",r-l+1-w)'
endfunction
If you want to change 1 1 1 1 ... to 1 2 3 4 ... (with those numbers on different lines):
:let i=1 | g/1/s//\=i/g | let i+=1
If some of 1 1 1 1 ... are in the same line:
:let g:i = 0
:func! Inc()
: let g:i+=1
: return g:i
:endfun
:%s/1/\=Inc()/g

I need to find a match using sed and delete 2 lines before this match and 3 lines after it

I need to find a match using "sed", delete the 2 lines before the match and the 3 lines after it, and print the output. How can I do that?
If the file is not huge, try this:
awk 'NR==FNR{if($0~/matchWord/){for(i=NR-2;i<=NR+3;i++){if(i!=NR)a[i]++}}}\
NR>FNR{if(!(FNR in a))print $0}' file file
I didn't test it, but it should work. Note that it keeps the matching line itself and removes only its neighbours.
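The two-pass idea (first collect the line numbers to drop, then print everything else) is easy to check in Python. The sketch below is an illustration under stated assumptions: delete_around_matches is a made-up name, it uses a plain substring test where the AWK version uses a regular expression, and, like the AWK version, it keeps the matching line itself.

```python
def delete_around_matches(lines, needle, before=2, after=3):
    """Drop `before` lines ahead of and `after` lines following each matching line."""
    doomed = set()  # 0-based indices of lines to delete (the AWK array a)
    for i, line in enumerate(lines):  # first pass: find matches
        if needle in line:
            for j in range(i - before, i + after + 1):
                if j != i:  # the matching line itself is kept
                    doomed.add(j)
    # second pass: emit every line that was not marked
    return [line for i, line in enumerate(lines) if i not in doomed]


sample = [f"line{k}" for k in range(1, 11)]
sample[4] = "matchWord here"
print("\n".join(delete_around_matches(sample, "matchWord")))
```

Indices that fall before the first line or after the last one are simply never encountered in the second pass, so matches near either end of the input need no special handling. A matching line that lies within the window of another match is still deleted, as in the AWK version.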
First off, you do not want to do this in sed. 2nd, your question is ill posed: what do you do if you have a match on lines 5 and 8? Does line 8 get deleted and line 6 is kept? Assuming that's not a concern, this seems to do what you want:
#!/bin/sed -nf
1{ h; d; }
H
2,5d
g
/^\([^\n]*\n\)\{2\}match/!P
/^\([^\n]*\n\)\{2\}match/{
s/\n[^\n]*$//
N
}
s/[^\n]*\n//
h
$p
Note: if the match occurs in the last 3 lines of the file, this does not behave as desired. That case is left as an exercise for the (masochistic) reader.
sed '/matchWord/,+3d;:flag;1,2!{P;N;D};N;bflag' file

Resources