Search for string and get count in vi editor

I want to search for a string and find the number of occurrences in a file using the vi editor.

The way is
:%s/pattern//gn

You need the n flag. To count words use:
:%s/\i\+/&/gn
and a particular word:
:%s/the/&/gn
See count-items documentation section.
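If you simply type in:
:%s/pattern/pattern/g
then the status line will give you the number of substitutions, and therefore matches, in vi as well. Unlike the n flag, this really performs the substitution, so the buffer is marked as modified.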

:%s/string/string/g
will give the answer.

(Similar to what Gustavo said, but additionally:)
For any previous search, you can simply do:
:%s///gn
A pattern is not needed, because it is already in the search register (@/).
"%" - do s/ in the whole file
"g" - search global (with multiple hits in one line)
"n" - prevents any replacement of s/ -- nothing is deleted! nothing must be undone!
(see :help :s_flags for more information)
(This way, it works perfectly with "Search for visually selected text", as described in vim-wikia tip171)

:g/xxxx/d
This will delete all the lines containing the pattern and report how many lines were deleted (a count of lines, not of occurrences). Undo with u to get them back afterwards.

Short answer:
:%s/string-to-be-searched//gn
For learning:
The vi editor has 3 modes; this command is typed in Command-line mode. Its parts are:
: switches from Command mode to Command-line mode. Whatever you type after : goes on the command line.
%s runs the substitute command on all lines. The range % means the entire file. The syntax for substituting all occurrences is :%s/old-text/new-text/g
g applies to all occurrences in a line. Without the g flag, only the first occurrence in each line is substituted.
n reports the number of occurrences instead of substituting.
// (the double slash) omits the replacement text, because we only want to find.
Once you have the number of occurrences, you can press n to step through them one by one (N steps backward).
To find and count within a particular range, for example lines 1 to 10:
:1,10s/hello//gn
Please note that % (the whole file) is replaced by comma-separated line numbers.
To find and replace within lines 1 to 10, drop the n flag, because n prevents the substitution:
:1,10s/helo/hello/g

Use
:%s/pattern/\0/g
when the pattern string is too long and you don't want to type it all again.

I suggest doing:
Search either with * to do a "bounded search" for what's under the cursor, or do a standard /pattern search.
Use :%s///gn to get the number of occurrences. Or you can use :%s///n to get the number of lines with occurrences.
** I really wish I could find a plug-in that reports "match N of N1 on N2 lines" with every search, but alas.
Note:
Don't be confused by the tricky wording of the output. The former command might give you something like 4 matches on 3 lines where the latter might give you 3 matches on 3 lines. While technically accurate, the latter is misleading and should say '3 lines match'. So, as you can see, there really is never any need to use the latter ('n' only) form. You get the same info, more clearly, and more by using the 'gn' form.
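To make the difference between the two numbers concrete, here is a minimal Python sketch that mimics the "N matches on M lines" report. It is only a model: the function name count_matches and the sample text are made up for illustration, and Python's re syntax is not identical to Vim's.

```python
import re

def count_matches(text, pattern):
    """Return (total matches, number of lines with at least one match)."""
    total = 0
    lines_hit = 0
    for line in text.splitlines():
        hits = len(re.findall(pattern, line))  # like the g flag: every match on the line
        total += hits
        if hits:
            lines_hit += 1                     # without g, Vim counts at most one per line
    return total, lines_hit

# Hypothetical sample buffer.
sample = "the cat and the hat\nno match here\nthe end\nover the top"
matches, lines = count_matches(sample, r"the")
print(f"{matches} matches on {lines} lines")   # 4 matches on 3 lines
```

The first number corresponds to what :%s///gn reports, the second to what :%s///n reports.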

Related

vim Search Replace should use replaced text in following searches

I have a data file (comma separated) that has a lot of NAs (It was generated by R). I opened the file in vim and tried to replace all the NA values to empty strings.
Here is a sample slimmed down version of a record in the file:
1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1
Once I am done with the search-replace, the intended output should be:
1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1
In other words, all the NAs should be replaced except the words NATIONAL, NANA and AMERICANA.
I used the following command in vim to do this:
1, $ s/\,NA\,/\,\,/g
But, it doesn't seem to work. Here is the output that I get:
1,1,,NA,,NATIONAL,,1,NANA,1,AMERICANA,1
As you can see, there is one ,NA, that is left out of the replacement process.
Does anyone have a good way to fix it? Thanks.
A trivial solution is to run the same command again and it will take care of the remaining ,NA,. However, it is not a feasible solution because my actual data file has 100s of columns and 500K+ rows each with a variable number of NAs.
, doesn't have a special meaning so you don't have to escape it:
:1,$s/,NA,/,,/g
Which doesn't solve your problem.
You can use % as a shorthand for 1,$:
:%s/,NA,/,,/g
Which doesn't solve your problem either.
The best way to match all those NA words to the exclusion of other words containing NA would be to use word boundaries:
:%s/,\<NA\>,/,,/g
Which still doesn't solve your problem.
Word boundaries also make those commas useless. You used them to restrict the match to NA, and they are what causes the problem, so drop them:
:%s/\<NA\>//g
See :help :range and :help \<.
Use % instead of 1,$ (% means "the buffer" aka the whole file).
You don't need \,. , works fine.
Vim finds discrete, non-overlapping matches, so in ,NA,NA,NA, it only finds the first and third ,NA,: the middle one doesn't have its own separate surrounding commas, because its neighbours already consumed them. We can exclude certain characters of our regex from the match with \zs (match start) and \ze (match end). The pattern still requires the surrounding characters, but the match itself doesn't include them, so we can match every NA in ,NA,NA,NA,.
TL;DR: %s/,\zsNA\ze,//g
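The non-overlapping behaviour is not specific to Vim. As a rough illustration, the Python sketch below reproduces both the faulty result from the question and the intended one. The function names are made up, and the lookbehind/lookahead pair is Python's closest equivalent of \zs and \ze, not Vim syntax.

```python
import re

record = "1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1"

def naive_replace(line):
    # The commas are part of the match, so adjacent ",NA," fields
    # compete for the same comma and every second one is skipped.
    return re.sub(r",NA,", ",,", line)

def lookaround_replace(line):
    # The commas are required but not consumed (zero-width assertions),
    # so every NA that stands alone between two commas is removed.
    return re.sub(r"(?<=,)NA(?=,)", "", line)

print(naive_replace(record))       # 1,1,,NA,,NATIONAL,,1,NANA,1,AMERICANA,1
print(lookaround_replace(record))  # 1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1
```

Like the \zs/\ze pattern, this version does not touch an NA in the first or last field of a line, because it requires a comma on both sides.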

How to search for multiple overlapping occurrences of same word on the same line?

I want to search for all occurrences of a word, both on the same line and across multiple lines within a given file. For example:
ABCCG*CAT*AD*CAT*TT
DFGBBB*CAT*YYUAB
Manually searching for the word 'CAT' I found two when using /CAT, when in fact there are three occurrences of that word in the file.
What is the command to find all occurrences of a given word in a file irrespective of the fact that it may occur multiple times within a line?
Note: There are no * in the file. I have used it in the example above to denote the positions of the string CAT.
What if the multiple occurrences were to overlap on the same line? For example:
ABCCG*TNTNT*ADCATDD
DFGBBB*TNT*YYUAB
Searching for the word TNT using :%s/TNT//gn would still give me 2, when in fact there are three occurrences.
Is there a way to identify overlapping occurrences in the same line using Vim?
To get a count of the total number of all matches of an item, including “overlapping” string cases, you actually need to use the :%s command (long form: :%substitute) and tell it three things:
do not actually perform the substitution (n flag; in this case, a mnemonic for “noop” I guess)
consider multiple matches on the same line to be separate matches (g flag for “global“)
do a “non-greedy“ match (\{-}; somewhat arcane but worth reading up on; see below)
Putting all that together, here's what it looks like:
:%s/[T]\{-}NT//gn
So, given the following text from the question:
ABCCG*TNTNT*ADCATDD
DFGBBB*TNT*YYUAB
…vim will then report this:
3 matches on 2 lines
If/when you do actually want a count of just the number of matching lines, you can omit the g and vim will use its default of reporting a count just for the number of lines that contain a match. And if you don’t want to count “overlapping” strings, then omit the \{-} part.
The vim docs actually have very good info about this stuff.
For more help on counting items in vim, see :help count-items:
Counting words, lines, etc. count-items
To count how often any pattern occurs in the current buffer use the substitute
command and add the 'n' flag to avoid the substitution. The reported number
of substitutions is the number of items. Examples:
:%s/./&/gn          characters
:%s/\i\+/&/gn       words
:%s/^//n            lines
:%s/the/&/gn        "the" anywhere
:%s/\<the\>/&/gn    "the" as a word
You might want to reset 'hlsearch' or do ":nohlsearch".
Add the 'e' flag if you don't want an error when there are no matches.
And for more help with doing “non-greedy“ matching, see :help non-greedy:
non-greedy
If a "-" appears immediately after the "{", then a shortest match
first algorithm is used (see example below). In particular, "\{-}" is
the same as "*" but uses the shortest match first algorithm. BUT: A
match that starts earlier is preferred over a shorter match: "a\{-}b"
matches "aaab" in "xaaab".
Example matches
ab\{2,3}c         "abbc" or "abbbc"
a\{5}             "aaaaa"
ab\{2,}c          "abbc", "abbbc", "abbbbc", etc.
ab\{,3}c          "ac", "abc", "abbc" or "abbbc"
a[bc]\{3}d        "abbbd", "abbcd", "acbcd", "acccd", etc.
a\(bc\)\{1,2}d    "abcd" or "abcbcd"
a[bc]\{-}[cd]     "abc" in "abcd"
a[bc]*[cd]        "abcd" in "abcd"
The } may optionally be preceded with a backslash: \{n,m\}.
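To see why the [T]\{-}NT pattern from the answer above reports 3, it may help to model it outside Vim. The Python sketch below is an approximation under stated assumptions: T*?NT stands in for [T]\{-}NT, the helper names are made up, and the lookahead version is a different, more general technique than the one in the answer.

```python
import re

# The text from the question, without the * markers.
TEXT = "ABCCGTNTNTADCATDD\nDFGBBBTNTYYUAB"

def count_non_overlapping(pattern, text):
    # Ordinary scanning: each match consumes its characters.
    return len(re.findall(pattern, text))

def count_overlapping(literal, text):
    # A zero-width lookahead consumes nothing, so matches may overlap.
    return len(re.findall(f"(?={re.escape(literal)})", text))

print(count_non_overlapping("TNT", TEXT))     # 2
print(count_overlapping("TNT", TEXT))         # 3
print(count_non_overlapping("T*?NT", TEXT))   # 3
print(count_non_overlapping("T*?NT", "ANT"))  # 1, although "ANT" contains no TNT
```

The last line shows the catch: because the leading T is optional, the non-greedy pattern really counts occurrences of NT. That happens to equal the overlapping TNT count for this text, but not in general. In Vim, a zero-width pattern such as T\zeNT should behave like the lookahead version, though that is not tested here.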

Vim: substitution in a range that is less than a line

Let's say I have the following line of code:
something:somethingElse:anotherThing:woahYetAnotherThing
And I want to replace each : with a ; except the first one, such that the line looks like this:
something:somethingElse;anotherThing;woahYetAnotherThing
Is there a way to do this with the :[range]s/[search]/[replace]/[options] command without using the c option to confirm each replace operation?
As far as I can tell, the smallest range that s acts on is a single line. If this is true, then what is the fastest way to do the above task?
I'm fairly new to vim myself; I think you're right about range being lines-only (not 100% certain), but for this specific example you might try replacing all of the instances with a global flag, and then putting back the first one by omitting the global -- something like :s/:/;/g|s/;/:/.
Note: if the line contains a ; before the first : then this will not work.
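The two-step trick and its caveat can be sketched in Python as follows. The function names are made up, and split_first is an alternative added for comparison, not something the answer proposes.

```python
def two_step(line):
    # Step 1: turn every ':' into ';'  (like :s/:/;/g)
    # Step 2: turn the first ';' back into ':'  (like :s/;/:/)
    return line.replace(":", ";").replace(";", ":", 1)

def split_first(line):
    # Keep everything up to and including the first ':' untouched,
    # then replace only in the remainder.
    head, sep, tail = line.partition(":")
    return head + sep + tail.replace(":", ";")

line = "something:somethingElse:anotherThing:woahYetAnotherThing"
print(two_step(line))          # something:somethingElse;anotherThing;woahYetAnotherThing
print(two_step("a;b:c:d"))     # a:b;c;d  (wrong: the pre-existing ';' was restored instead)
print(split_first("a;b:c:d"))  # a;b:c;d
```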
Here you go...
:%s/\(:.*\):/\1;/|&|&|&|&
This is a simple regex substitute that takes care of one single not-the-first :.
The & command repeats the last substitute.
The | syntax separates multiple commands on one line. So, each substitute is repeated as many times as there are |& things.
Here is how you could use a single keystroke to do what you want (by mapping capital Q):
map Q :s/:/;/g\|:s/;/:<Enter>j
Every time you press Q the current line will be modified and the cursor will move to the next line.
In other words, you could just keep hitting Q multiple times to edit each successive line.
Explanation:
This will operate globally on the current line:
:s/:/;/g
This will switch the first semi-colon back to a colon:
:s/;/:
The answer by @AlliedEnvy combines these into one statement.
My map command assigns @AlliedEnvy's answer to the capital Q character.
Another approach (what I would probably do if I only had to do this once):
f:;r;;.
Then you can repeatedly press ;. until you reach the end of the line.
(Your choice to replace with a semi-colon makes this somewhat confusing)
Explanation:
f: - go to the first colon
; - go to the next colon (repeat in-line search)
r; - replace the current character with a semi-colon
; - repeat the last in-line search (again)
. - repeat the last command (replace current character with a semi-colon)
Long story short:
fx - moves to the next occurrence of x on the current line
; repeats the last inline search
While the other answers work well for this particular case, here's a more general solution:
Create a visual selection starting from the second element to the end of the line. Then, limit the substitution to the visual area by including \%V:
:'<,'>s/\%V:/;/g
Alternatively, you can use the vis.vim plugin
:'<,'>B s/:/;/g

Vim: delete until character for all lines containing a pattern

I'm learning the power of g and want to delete all lines containing an expression, to the end of the sentence (marked by a period). Like so:
There was a little sheep. The sheep was black. There was another sheep.
(Run command to find all sentences like There was and delete to the next period).
The sheep was black.
I've tried:
:g/There was/d\/\. in an attempt to "delete forward until the next period" but I get a trailing characters error.
:g/There was/df. but get a df. is not an editor command error.
Any thoughts?
The action associated with g must be able to act on the line without needing position information from the pattern match that g implies. In the command you are using, the delete forward command needs a starting position that is not being provided.
The problem is that g only indicates a line match, not a specific character position for its pattern match. I did the following and it did what I think you want:
:g/There was/s/There was[^.]*[.]//
This found lines that matched the pattern There was, and performed a substitution of the regular expression There was[^.]*[.] with the empty string.
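This is roughly equivalent to the following, except that the trailing g flag also handles several matches on the same line: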
:1,$s/There was[^.]*[.]//g
I'm not sure what the g is getting you in your use case, except the automatic application to the entire file line range (same as 1,$ or %). The g in this latter example has to do with applying the substitution to all patterns on the same line, not with the range of lines affected by the substitution command.
I'd just use a regex:
%s/There was\_.\{-}\.\s\?//ge
Note how \_. allows for cross-line sentences
You can use :norm like this:
:g/There was/norm 0weldf.
This finds lines with "There was" then executes the normal commands 0weldf..
0: go to beginning of line
w: go to next word (in this case, "was")
e: go the end of the word (so cursor is on the 's' of "was")
l: move one character to the right (so we don't delete any of "was")
df.: delete until the next '.', inclusive.
If you want to keep the period use dt. instead of df..
If you don't want to delete from the beginning of the line and instead want to work on sentences, the :%s command is probably more appropriate here (e.g. :%s/\(There was\)[^.]*\./\1/g, or :%s/\(There was\)[^.]*\./\1./g if you want to keep the period at the end of the sentence).
Use search and replace:
:%s/There was[^.]*\.\s*//g
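The same substitution can be tried outside Vim. This Python sketch applies an equivalent regular expression to the example text from the question; the function name remove_sentences is made up for illustration.

```python
import re

def remove_sentences(text, start="There was"):
    # Match the start phrase, everything up to the next period,
    # the period itself, and any whitespace after it.
    pattern = re.escape(start) + r"[^.]*\.\s*"
    return re.sub(pattern, "", text)

text = "There was a little sheep. The sheep was black. There was another sheep."
print(repr(remove_sentences(text)))  # 'The sheep was black. '
```

Note the trailing space that is left behind: the space before the second "There was" belongs to the preceding sentence, so the pattern does not consume it.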

Find first non-matching line in VIM

I sometimes have to look into various log and trace files on Windows, and I generally use Vim for that.
My problem is that I still can't find any analog of grep -v inside Vim: finding a line in the buffer that does not match a given regular expression. E.g. a log file is filled with lines which contain the phrase all is ok somewhere in the middle, and I need to find the first line which doesn't contain all is ok.
I can write a custom function for that, yet at the moment that seems to be an overkill and likely to be slower than a native solution.
Is there any easy way to do it in VIM?
I believe if you simply want to have your cursor end up at the first non-matching line you can use visual as the command in your global command. So:
:v/pattern/visual
will leave your cursor at the first non-matching line. Or:
:g/pattern/visual
will leave your cursor at the first matching line.
You can use the negative look-behind operator @<!
e.g. to find all lines not containing "a", use /\v^.+(^.*a.*$)@<!$
(\v just means that operators like ( and @<! don't have to be backslash-escaped)
The simpler method is to delete all lines matching or not matching the pattern (:g/PATTERN/d or :g!/PATTERN/d respectively)
I'm often in your case, so to "clean" the logs files I use :
:g/all is ok/d
Your grep -v can be achieved with
:v/error/d
This removes all lines that do not contain error.
It's probably already too late, but I think that this should be said somewhere.
Vim (since version about 7.4) comes with a plugin called LogiPat, which makes searching for lines which don't contain some string really easy. So using this plugin finding the lines not containing all is ok is done like this:
:LogiPat !"all is ok"
And then you can jump between the matching (or in this case not matching) lines with n and N.
You can also use logical operations like & and | to join different strings in one pattern:
:LP !("foo"|"bar")&"baz"
LP is shorthand for LogiPat, and this command will search for lines that contain the word baz and contain neither foo nor bar.
I just managed a somewhat klutzy procedure using the "g" command:
:%g!/search/p
This says to print out the non-matching lines... not sure if that worked, but it did end up with the cursor positioned on the first non-matching line.
(substitute some other string for "search", of course)
You can search with following line and press n to jump to the first non-matching line
^\(.*all is ok\)\@!.*$
Breakdown of operators:
^ -> means start of the line
\( and \) -> To match a whole string multiple times, it must be grouped into one item. This is done by putting "\(" before it and "\)" after it.
\@! -> Matches with zero width if the preceding atom does NOT match at the current position.
.* -> Matches any character repeated 0 or more times
$ -> end of the line
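The breakdown above can be checked against a small model. The Python sketch below uses a negative lookahead, (?!...), which plays the role of Vim's \@!. The log lines and the function name are made up for illustration.

```python
import re

# Hypothetical log contents.
log = [
    "10:00 all is ok",
    "10:01 all is ok",
    "10:02 disk full",
    "10:03 all is ok",
]

# Start of line, then assert that "all is ok" does not occur anywhere
# on the line, then match the whole line.
NOT_OK = re.compile(r"^(?!.*all is ok).*$")

def first_non_matching(lines):
    """Return the 1-based number of the first line without 'all is ok'."""
    for number, line in enumerate(lines, start=1):
        if NOT_OK.match(line):
            return number
    return None

print(first_non_matching(log))  # 3
```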
You can iterate through the non-matches using g and a null substitution:
:g!/pattern/s/^//c
If you reply "n" each time you won't even mark the file as changed.
You need ctrl-C to escape from the cycle (or keep going to the bottom of the file).
