vim Search Replace should use replaced text in following searches - vim

I have a data file (comma separated) that has a lot of NAs (It was generated by R). I opened the file in vim and tried to replace all the NA values to empty strings.
Here is a sample slimmed down version of a record in the file:
1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1
Once I am done with the search-replace, the intended output should be:
1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1
In other words, all the NAs should be replaced except the words NATIONAL, NANA and AMERICANA.
I used the following command in vim to do this:
1, $ s/\,NA\,/\,\,/g
But, it doesn't seem to work. Here is the output that I get:
1,1,,NA,,NATIONAL,,1,NANA,1,AMERICANA,1
As you can see, there is one ,NA, that is left out of the replacement process.
Does anyone have a good way to fix it? Thanks.
A trivial solution is to run the same command again and it will take care of the remaining ,NA,. However, it is not a feasible solution because my actual data file has 100s of columns and 500K+ rows each with a variable number of NAs.

, doesn't have a special meaning so you don't have to escape it:
:1,$s/,NA,/,,/g
Which doesn't solve your problem.
You can use % as a shorthand for 1,$:
:%s/,NA,/,,/g
Which doesn't solve your problem either.
The best way to match all those NA words to the exclusion of other words containing NA would be to use word boundaries:
:%s/,\<NA\>,/,,/g
Which still doesn't solve your problem.
With word boundaries in place, the commas become redundant: you added them only to restrict the match to standalone NA, and they are what causes the missed match. So drop them:
:%s/\<NA\>//g
See :help :range and :help \<.
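The word-boundary approach can be checked outside Vim. The sketch below uses Python's re module as a stand-in, assuming that \b behaves like Vim's \< and \> for this data (Vim's idea of a word depends on the 'iskeyword' option). The helper name strip_na is made up for illustration.

```python
import re

def strip_na(record: str) -> str:
    # Remove NA only where it stands alone as a whole word.
    # Rough Python analog of the Vim command :%s/\<NA\>//g
    return re.sub(r"\bNA\b", "", record)

record = "1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1"
print(strip_na(record))     # NATIONAL, NANA and AMERICANA are left alone
print(strip_na("NA,1,NA"))  # NA in the first and last field is removed too
```

Because no comma is part of the match, an NA in the first or last field of a record is handled as well.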

Use % instead of 1,$ (% means "the buffer" aka the whole file).
You don't need \,. , works fine.
Vim finds discrete, non-overlapping matches, so in ,NA,NA,NA, it only finds the first and third ,NA,. The middle one has no surrounding commas of its own, because the neighbouring matches have already consumed them. We can exclude certain characters of the regex from the match itself with \zs (match start) and \ze (match end). The pattern still requires the surrounding characters to be present, but the match does not include them, so every NA in ,NA,NA,NA, is matched.
TL;DR: %s/,\zsNA\ze,//g
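The non-overlapping behaviour is not specific to Vim. As a rough illustration, here is the same experiment in Python's re module, where lookbehind and lookahead assertions play approximately the role of \zs and \ze (an analogy, not an exact equivalence). The variable names are illustrative.

```python
import re

record = "1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1"

# Commas are part of each match, so adjacent fields compete for the same comma
# and the middle NA of a run is skipped.
consuming = re.sub(r",NA,", ",,", record)

# Lookarounds require the commas to be there without consuming them,
# so every comma-delimited NA is matched.
lookaround = re.sub(r"(?<=,)NA(?=,)", "", record)

print(consuming)   # 1,1,,NA,,NATIONAL,,1,NANA,1,AMERICANA,1
print(lookaround)  # 1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1
```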

Related

Searching for an exact match with a singular digit

I'm trying to search for only a singular digit in vim by itself. For example, if there are two sets of digits 1 and 123 and I want to search for 1, I would only want the singular 1 digit to be found.
I have tried using regular expressions like \<1> and \%(a)#
You almost had the right solution. You want:
\<1\>
This is because each angled bracket needs to be escaped. Alternatively, you could use:
\v<1>
The \v ("very magic") modifier tells vim to treat more characters as special without needing to be escaped (for example, (){}+<> all become special rather than literal text). Read :h /\v for more on this.
A great reference for learning regex in vim is vimregex.com. The \<\> characters are explained in 4.1 "Anchors".
If you want to match text like 1.23 this is possible too. Two different approaches:
Modify the iskeyword option so that it includes .. This will also affect how w moves
Use \v<1(\d|\.)@!, which basically means "a 1 at the beginning of a word, that isn't followed by some other digit or a period."
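To see why the plain word-boundary pattern is not enough for text like 1.23, here is a small sketch in Python's re module, assuming its \b and negative lookahead (?!...) behave like Vim's \< \> and \@! on this input. The sample string is invented for the demonstration.

```python
import re

text = "1 123 1.23"

# Plain word boundaries: "." is not a word character,
# so the 1 in 1.23 also counts as a standalone word.
bounded = [m.start() for m in re.finditer(r"\b1\b", text)]

# Negative lookahead: reject a 1 that is followed by a digit or a period.
strict = [m.start() for m in re.finditer(r"\b1(?![\d.])", text)]

print(bounded)  # [0, 6]
print(strict)   # [0]
```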

How to find and remove part of word in vim?

I'm new to vim. I have a huge text file as follows:
ZK792.6,ZK792.6(let-60),cel-miR-62(18),0.239
UTR3,IV:11688688-11688716,0.0670782
ZC449.3b,ZC449.3(ZC449.3),cel-miR-62(18),0.514
UTR3,X:5020692-5020720,0.355907
First, I would like to delete all rows with even numbers (2,4,6...).
Second, I would like to remove (18) from the entire file. For example:
cel-miR-62(18) would become cel-miR-62.
Third, how can I delete all parentheses, including their contents?
Would someone help me with this?
For the first one:
:g/[02468]\>/d
where :g matches all lines by the regex between the slashes and runs d (delete line) on the matching lines. The regex is quite easy to read, the only interesting symbol there is perhaps the \>, which matches end of a word.
For the second question:
:%s/\V(18)//g
where % is the specification meaning "all lines of the file", s is the substitute command, \V sets the "very nomagic" mode of regexes (not sure what your default is, you might not need this) and the final g makes vim substitute all occurrences on each line (with an empty string, the one between slashes). Make sure that :set gdefault? prints nogdefault (the default setting of gdefault), otherwise, drop the final g from the substitute command.
To remove every even line (or every other line):
:g/^/+d
To remove every instance of (18):
:%s/(18)//g
Remove all the parenthetical content:
:%s/(.\{-})//g
Note: the pattern in third answer is a non-greedy match.
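For readers who want to check the effect on the sample data, the following sketch reproduces two of the steps in Python: keeping every other line and removing parenthesized text. It is only an analog, assuming Python's non-greedy .*? corresponds to Vim's .\{-}; the variable names are illustrative.

```python
import re

lines = [
    "ZK792.6,ZK792.6(let-60),cel-miR-62(18),0.239",
    "UTR3,IV:11688688-11688716,0.0670782",
    "ZC449.3b,ZC449.3(ZC449.3),cel-miR-62(18),0.514",
    "UTR3,X:5020692-5020720,0.355907",
]

# Keep lines 1, 3, 5, ... which is the effect of :g/^/+d on this sample.
odd_lines = lines[::2]

# Non-greedy: each "(...)" group is matched and removed separately.
non_greedy = [re.sub(r"\(.*?\)", "", line) for line in odd_lines]

# Greedy, for contrast: one match runs from the first "(" to the last ")".
greedy = [re.sub(r"\(.*\)", "", line) for line in odd_lines]

print(non_greedy[0])  # ZK792.6,ZK792.6,cel-miR-62,0.239
print(greedy[0])      # ZK792.6,ZK792.6,0.239
```

The greedy variant swallows the text between the two parenthesized groups, which is why the non-greedy form is the one to use here.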

replacing part of regex matches

I have several functions that start with get_ in my code:
get_num(...) , get_str(...)
I want to change them to get_*_struct(...).
Can I somehow match the get_* regex and then replace according to the pattern so that:
get_num(...) becomes get_num_struct(...),
get_str(...) becomes get_str_struct(...)
Can you also explain some logic behind it, because the theoretical regex aren't like the ones used in UNIX (or vi, are they different?) and I'm always struggling to figure them out.
This has to be done in the vi editor as this is main work tool.
Thanks!
To transform get_num(...) to get_num_struct(...), you need to capture the correct text in the input. And, you can't put the parentheses in the regular expression because you may need to match pointers to functions too, as in &get_distance, and uses in comments. However, and this depends partially on the fact that you are using vim and partially on how you need to keep the entire input together, I have checked that this works:
%s/get_\w\+/&_struct/g
On every line, find every expression starting with get_ and continuing with at least one letter, number, or underscore, and replace it with the entire matched string followed by _struct.
Darn it; I shouldn't answer these things on spec. Note that other regex engines might use \& instead of &. This depends on having magic set, which is default in vim.
For an alternate way to do it:
%s/get_\(\w*\)(/get_\1_struct(/g
What this does:
\w matches to any "word character"; \w* matches 0 or more word characters.
\(...\) tells vim to remember whatever matches .... So, \(\w*\) means "match any number of word characters, and remember what you matched". You can then access it in the replacement with \1 (or \2 for the second group, etc.)
So, the overall pattern get_\(\w*\)( looks for get_, followed by any number of word chars, followed by (.
The replacement then just does exactly what you want.
(Sorry if that was too verbose - not sure how comfortable you are with vim regex.)
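Both techniques (reusing the whole match, and reusing a captured group) exist in most regex engines, only with slightly different spelling. A minimal sketch in Python's re module, where \g<0> stands in for Vim's & and (...) needs no backslashes:

```python
import re

source = "get_num(...) , get_str(...)"

# \g<0> is the whole match, like & in Vim's replacement string.
whole_match = re.sub(r"get_\w+", r"\g<0>_struct", source)

# \1 is the first captured group, like \1 after \(...\) in Vim.
captured = re.sub(r"get_(\w*)\(", r"get_\1_struct(", source)

print(whole_match)  # get_num_struct(...) , get_str_struct(...)
print(captured)     # get_num_struct(...) , get_str_struct(...)
```

The first form also renames uses without a following parenthesis (such as &get_num); the second only touches call-like occurrences.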

Find first non-matching line in VIM

It happens sometimes that I have to look into various log and trace files on Windows and generally I use for the purpose VIM.
My problem though is that I still can't find any analog of grep -v inside of VIM: find in the buffer a line not matching given regular expression. E.g. log file is filled with lines which somewhere in a middle contain phrase all is ok and I need to find first line which doesn't contain all is ok.
I can write a custom function for that, yet at the moment that seems to be an overkill and likely to be slower than a native solution.
Is there any easy way to do it in VIM?
I believe if you simply want to have your cursor end up at the first non-matching line you can use visual as the command in your global command. So:
:v/pattern/visual
will leave your cursor at the first non-matching line. Or:
:g/pattern/visual
will leave your cursor at the first matching line.
You can use the negative look-behind operator @<!
e.g. to find all lines not containing "a", use /\v^.+(^.*a.*$)@<!$
(\v just means that operators such as ( and @<! do not have to be backslash-escaped)
A simpler method is to delete all lines matching or not matching the pattern (:g/PATTERN/d or :g!/PATTERN/d respectively).
I'm often in the same situation, so to "clean" the log files I use:
:g/all is ok/d
Your grep -v can be achieved with
:v/error/d
This removes all lines that do not contain error.
It's probably already too late, but I think that this should be said somewhere.
Vim (since version about 7.4) comes with a plugin called LogiPat, which makes searching for lines which don't contain some string really easy. So using this plugin finding the lines not containing all is ok is done like this:
:LogiPat !"all is ok"
And then you can jump between the matching (or in this case not matching) lines with n and N.
You can also use logical operations like & and | to join different strings in one pattern:
:LP !("foo"|"bar")&"baz"
LP is shorthand for LogiPat, and this command will search for lines that contain the word baz and contain neither foo nor bar.
I just managed a somewhat klutzy procedure using the "g" command:
:%g!/search/p
This says to print out the non-matching lines... not sure if that worked, but it did end up with the cursor positioned on the first non-matching line.
(substitute some other string for "search", of course)
You can search with following line and press n to jump to the first non-matching line
^\(.*all is ok\)\@!.*$
Breakdown of operators:
^ -> means start of the line
\( and \) -> To match a whole string multiple times, it must be grouped into one item. This is done by putting "\(" before it and "\)" after it.
\@! -> Matches with zero width if the preceding atom does NOT match at the current position.
.* -> Matches any character repeated 0 or more times
$ -> end of the line
You can iterate through the non-matches using g and a null substitution:
:g!/pattern/s/^//c
If you reply "n" each time, you won't even mark the file as changed.
You need ctrl-C to escape from the loop (or keep going to the bottom of the file).
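The logic shared by these answers (skip lines that contain the phrase, stop at the first one that does not) can be sketched outside Vim as well. The Python below is an illustration only: the log lines and the helper name first_non_matching are invented, and the regex variant assumes Python's (?!...) corresponds to Vim's \@!.

```python
import re

log = "\n".join([
    "10:00 job 1 all is ok",
    "10:01 job 2 all is ok",
    "10:02 job 3 failed",
    "10:03 job 4 all is ok",
])

def first_non_matching(text: str, needle: str):
    # Return (line_number, line) for the first line lacking needle, else None.
    for number, line in enumerate(text.splitlines(), start=1):
        if needle not in line:
            return number, line
    return None

# Same idea as a single regex: the negative lookahead fails on every line
# that contains the phrase, so the first match is the first "bad" line.
match = re.search(r"^(?!.*all is ok).*$", log, flags=re.MULTILINE)

print(first_non_matching(log, "all is ok"))  # (3, '10:02 job 3 failed')
print(match.group(0))                        # 10:02 job 3 failed
```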

Search for string and get count in vi editor

I want to search for a string and find the number of occurrences in a file using the vi editor.
THE way is
:%s/pattern//gn
You need the n flag. To count words use:
:%s/\i\+/&/gn
and a particular word:
:%s/the/&/gn
See count-items documentation section.
If you simply type in:
%s/pattern/pattern/g
then the status line will give you the number of matches in vi as well.
:%s/string/string/g
will give the answer.
(similar as Gustavo said, but additionally: )
For any previous search, you can simply do:
:%s///gn
A pattern is not needed, because it is already in the search register (@/).
"%" - do s/ in the whole file
"g" - search global (with multiple hits in one line)
"n" - prevents any replacement of s/ -- nothing is deleted! nothing must be undone!
(see :help :s_flags for more information)
(This way, it works perfectly with "Search for visually selected text", as described in vim-wikia tip171)
:g/xxxx/d
This will delete all the lines with pattern, and report how many deleted. Undo to get them back after.
Short answer:
:%s/string-to-be-searched//gn
For learning:
There are 3 modes in VI editor as below
: you are entering from Command to Command-line mode. Now, whatever you write after : is on CLI(Command Line Interface)
%s specifies all lines. Specifying the range as % means do substitution in the entire file. Syntax for all occurrences substitution is :%s/old-text/new-text/g
g specifies all occurrences in the line. With the g flag, every occurrence in the line is substituted. Without it, only the first occurrence in each line is substituted.
n specifies to output number of occurrences
//double slash represents omission of replacement text. Because we just want to find.
Once you have the number of occurrences, you can press n (or N to go backwards) to step through the occurrences one by one.
For finding and counting in particular range of line number 1 to 10:
:1,10s/hello//gn
Please note that % (the whole file) is replaced by comma-separated line numbers.
For finding and replacing in a particular range, lines 1 to 10 (drop the n flag here, because n only counts and never replaces):
:1,10s/helo/hello/g
use
:%s/pattern/\0/g
when pattern string is too long and you don't like to type it all again.
I suggest doing:
Search either with * to do a "bounded search" for what's under the cursor, or do a standard /pattern search.
Use :%s///gn to get the number of occurrences. Or you can use :%s///n to get the number of lines with occurrences.
** I really wish I could find a plug-in that would display a message like "match N of N1 on N2 lines" with every search, but alas.
Note:
Don't be confused by the tricky wording of the output. The former command might give you something like 4 matches on 3 lines where the latter might give you 3 matches on 3 lines. While technically accurate, the latter is misleading and should say '3 lines match'. So, as you can see, there really is never any need to use the latter ('n' only) form. You get the same info, more clearly, and more by using the 'gn' form.
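The difference between the two counts (all matches versus lines with a match) is easy to reproduce. A minimal Python sketch with invented sample text and a hypothetical helper count_matches:

```python
import re

text = "\n".join([
    "the cat saw the dog",
    "nothing here",
    "the end",
    "over there",
])

def count_matches(pattern: str, body: str):
    # Return (total matches, number of lines with at least one match).
    per_line = [len(re.findall(pattern, line)) for line in body.splitlines()]
    return sum(per_line), sum(1 for n in per_line if n)

matches, matching_lines = count_matches(r"the", text)
print(f"{matches} matches on {matching_lines} lines")  # 4 matches on 3 lines
```

The first number corresponds to what the gn form reports, the second to what the n-only form counts.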

Resources