I have a comma-separated data file that has a lot of NAs (it was generated by R). I opened the file in vim and tried to replace all the NA values with empty strings.
Here is a sample slimmed down version of a record in the file:
1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1
Once I am done with the search-replace, the intended output should be:
1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1
In other words, all the NAs should be replaced except the words NATIONAL, NANA and AMERICANA.
I used the following command in vim to do this:
1, $ s/\,NA\,/\,\,/g
But, it doesn't seem to work. Here is the output that I get:
1,1,,NA,,NATIONAL,,1,NANA,1,AMERICANA,1
As you can see, there is one ,NA, that is left out of the replacement process.
Does anyone have a good way to fix it? Thanks.
A trivial solution is to run the same command again and it will take care of the remaining ,NA,. However, it is not a feasible solution because my actual data file has 100s of columns and 500K+ rows each with a variable number of NAs.
A comma doesn't have a special meaning, so you don't have to escape it:
:1,$s/,NA,/,,/g
Which doesn't solve your problem.
You can use % as a shorthand for 1,$:
:%s/,NA,/,,/g
Which doesn't solve your problem either.
The best way to match all those NA words to the exclusion of other words containing NA would be to use word boundaries:
:%s/,\<NA\>,/,,/g
Which still doesn't solve your problem.
With word boundaries in place, the commas become redundant. You added them to restrict the match to NA, but they are what causes the problem, so drop them:
:%s/\<NA\>//g
See :help :range and :help \<.
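To see why the word-boundary version is enough on its own, here is a small sketch in Python. Using Python's re module as a stand-in for Vim's regex engine is an assumption of this example: \b plays the role of Vim's \< and \>, and the second sample row is made up.

```python
import re

# \b is Python's counterpart of Vim's \< and \> word-boundary atoms.
WHOLE_WORD_NA = re.compile(r"\bNA\b")

def blank_na(row: str) -> str:
    """Remove every stand-alone NA field, leaving longer words untouched."""
    return WHOLE_WORD_NA.sub("", row)

row = "1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1"
print(blank_na(row))        # 1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1

# An NA in the first or last column has a comma on one side only,
# but it is still delimited by word boundaries.
print(blank_na("NA,7,NA"))  # ,7,
```

One caveat of the Python version: a field such as NA-1 would also lose its NA, because - is not a word character.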
Use % instead of 1,$ (% means "the buffer" aka the whole file).
You don't need \,. , works fine.
Vim finds discrete, non-overlapping matches. So in ,NA,NA,NA, it only finds the first ,NA, and the third ,NA, because the middle one doesn't have its own separate surrounding commas: they were already consumed by its neighbours. We can exclude certain characters of the pattern from the match with \zs (match start) and \ze (match end). The pattern still requires the surrounding characters to be there, but the match itself doesn't include them, so we can match every NA in ,NA,NA,NA,.
TL;DR: %s/,\zsNA\ze,//g
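The non-overlapping behaviour is easy to reproduce outside Vim. The sketch below uses Python's re module as a stand-in (an assumption of this example, not something from the thread); lookbehind and lookahead take the place of \zs and \ze.

```python
import re

row = "1,1,NA,NA,NA,NATIONAL,NA,1,NANA,1,AMERICANA,1"

# The commas are part of each match, so the comma that ends one match
# cannot also start the next one: the middle NA is skipped.
consuming = re.sub(r",NA,", ",,", row)

# Lookbehind/lookahead require the commas without consuming them,
# much like \zs and \ze in the Vim pattern.
non_consuming = re.sub(r"(?<=,)NA(?=,)", "", row)

print(consuming)      # 1,1,,NA,,NATIONAL,,1,NANA,1,AMERICANA,1
print(non_consuming)  # 1,1,,,,NATIONAL,,1,NANA,1,AMERICANA,1
```

Like the Vim pattern, the lookaround version needs a comma on both sides, so an NA in the very first or very last column would be left alone.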
I have a tricky problem. I need to make a minor change to a large number of xml files (500+). The change involves switching a value from 'false' to 'true.' The line that needs to change looks like this:
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
And it needs to become:
<VoltageIsMeasuredLineLine>true</VoltageIsMeasuredLineLine>
Unfortunately there are numerous instances of this set of tags in each file, so we can't do a simple find and replace. The thing that makes this set of tags unique is that they come some lines after:
<CID>STATIONNAME.BUS.STATIONNAME.DKV</CID>
However, each file has a different station name, so I had used wildcards to filter them out.
<CID>.*.BUS.*.DKV</CID>
So the code looks like this:
<CID>STATIONNAME.BUS.STATIONNAME.DKV</CID>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
And other sections in the code look like:
<CID>STATIONNAME.COLR.STATIONNAME.FCLR</CID>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<tag>Some Number of Other lines</tag>
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
So I'm using the CID .BUS .DKV line as a starting point. Basically I need to change the first occurrence of the VoltageIsMeasuredLineLine line that comes AFTER the CID .BUS .DKV line. But there are a lot of other lines in between (none of which are consistent from file to file) that I don't care about and that are messing up my search.
It was suggested that I try a lookahead, but it did not work. This is the code I was told to try:
(?!<CID>.*.BUS.*.DKV</CID>(.*?)<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
However, that pattern also returns the sections without .BUS and .DKV, which are the really important factors in determining this section's uniqueness. How can I modify this lookahead so that it only returns sections that have .BUS and .DKV in the CID part?
Another idea I had was to select everything in between the CID and Voltage parts, keep the selections in groups, and then print the first two groups as-is, and replace the third. Like this:
(<CID>.*.BUS.*.DKV</CID>)(.*)(<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>)
And replace with
\1\2<VoltageIsMeasuredLineLine>true</VoltageIsMeasuredLineLine>
But something is still wrong with the CID part. I'm sure these wildcards are part of the problem but I've hit a wall. Any help appreciated!
Try the following in Notepad++ (version >= 6.0) using Replace.
Activate the option ". matches newline" and
set in Find what:
(<CID>[A-Za-z\.]*BUS[A-Za-z\.]*</CID>.*?<VoltageIsMeasuredLineLine>)false
and in Replace with:
\1true
The assumption is that every STATIONNAME.BUS.STATIONNAME.DKV has one corresponding VoltageIsMeasuredLineLine (as I read from your question)
The trick is to use a non-greedy search (.*?). It finds the first VoltageIsMeasuredLineLine after the matching CID line.
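The same find/replace pair can be tried out before touching 500 files. This is only a sketch under assumptions: Python's re module with re.DOTALL stands in for Notepad++ with ". matches newline", and the station name ALPHA and the tag lines are invented sample data.

```python
import re

# Hypothetical sample: the station name ALPHA and the <tag> lines are made up.
xml = """\
<CID>ALPHA.COLR.ALPHA.FCLR</CID>
<tag>other</tag>
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
<CID>ALPHA.BUS.ALPHA.DKV</CID>
<tag>other</tag>
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
<CID>ALPHA.COLR.ALPHA.FCLR</CID>
<VoltageIsMeasuredLineLine>false</VoltageIsMeasuredLineLine>
"""

# Same pattern as in "Find what"; re.DOTALL lets . cross line breaks.
pattern = re.compile(
    r"(<CID>[A-Za-z\.]*BUS[A-Za-z\.]*</CID>.*?<VoltageIsMeasuredLineLine>)false",
    re.DOTALL,
)

# \1 puts back everything up to the opening tag; only "false" is swapped.
fixed = pattern.sub(r"\1true", xml)
print(fixed)
```

Only the section that follows the BUS/DKV line changes. Note that if that section's value were already true, the non-greedy match would run on to the next false, which may belong to a different section.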
I'm having a bit of trouble using parentheses in a vim substitution. I just need to add a set of parentheses around 3 digits, but I can't seem to find where I'm supposed to place them. For example, I have to place them around the area code of a phone number such as 2015551212.
Right now I have a substitution that separates the numbers and puts a hyphen between them, for example 201 555-1212. So I just need the parentheses. The final result should look like: (201) 555-1212
The string I have so far is this: s/\(\d\{3}\)\(\d\{3}\)/\1 \2-/g
How might I go about doing this?
Thanks
Just add the parens around the \1 in your replacement.
s/\(\d\{3\}\)\(\d\{3\}\)/(\1) \2-/g
If you want to go in reverse, and change "(800) 555-1212" to "8005551212", you can use something like this:
s/(\(\d\d\d\))\ \(\d\d\d\)-\(\d\d\d\d\)/\1\2\3/g
Instead of the \d\d\d, you could use \d\{3\}, but that is more trouble to type.
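Both directions can be checked quickly with the same capture groups. The sketch below is in Python rather than Vim, which is an assumption of this example; the function names are made up, and parentheses are capturing by default in Python, so it is the literal ones that need escaping.

```python
import re

def format_phone(digits: str) -> str:
    """2015551212 -> (201) 555-1212"""
    return re.sub(r"(\d{3})(\d{3})(\d{4})", r"(\1) \2-\3", digits)

def strip_phone(formatted: str) -> str:
    """(800) 555-1212 -> 8005551212"""
    # \( and \) match literal parentheses; the unescaped pairs capture.
    return re.sub(r"\((\d{3})\) (\d{3})-(\d{4})", r"\1\2\3", formatted)

print(format_phone("2015551212"))     # (201) 555-1212
print(strip_phone("(800) 555-1212"))  # 8005551212
```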
I have a very large file, and I want to remove the newline character at the end of each line, so as to merge all the lines, except where the next line starts with the character £.
So, if I have this:
data1
data2
£data3
data4
data5
I would like to end up with this:
data1data2
£data3data4data5
I was thinking of something like
:%s/\n(but not \n£)//g
Any ideas?
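Just remove all newlines, then add them again where they should be. Or use a negative look-ahead, but this is simpler, easier, and more comprehensible to anyone. Note that in Vim the replacement needs \r, not \n, to insert a line break, and the first command needs the % range to cover the whole file.
:%s/\n//
:%s/£/\r£/g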
The solution offered by @pb2q will remove each newline together with the character that follows it, if that character is neither “£” nor a newline (a collection doesn’t match a newline by default), while in your question you asked to remove only the newline. This can be fixed by using either \ze or a negative look-ahead:
%s/\n\ze\_[^£]
%s/\n£\@!
Note two things. First, you can omit the replacement string if you only want to delete text (unless you need substitution flags, which you don’t in this case). Second, \_ adds newline to a collection. It can also be written as [^£\n], but I don't think that is the best choice: anyone coming from a PCRE-capable language reads [^£\n] as “match anything except ‘£’ and newline”, while in Vim it really means “match anything (including newline) except ‘£’”.
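The difference between consuming the next character and merely looking at it can be shown on the sample data. This is a sketch in Python, used here as a stand-in for Vim (an assumption of this example); in Python a negated class such as [^£] also matches a newline.

```python
import re

text = "\n".join(["data1", "data2", "£data3", "data4", "data5"])

# The character after the newline is part of the match, so it is deleted too.
eats_next_char = re.sub(r"\n[^£]", "", text)

# The negative lookahead only checks the next character; nothing but the
# newline itself is removed.
joined = re.sub(r"\n(?!£)", "", text)

print(eats_next_char)
print(joined)
```

The first result loses the leading d of data2, data4 and data5; the second prints the two lines asked for in the question.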
I would use the following :global command:
:g/^[^£]/-j!
It goes through all the lines that start with any character but £,
going from top to bottom, and joins each of those lines with the
preceding one via the :join command.
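For comparison, here is a line-by-line sketch of the same idea in Python. It is not the :global command itself, only an approximation of its effect; the function name and the marker parameter are made up for this example.

```python
def join_continuations(lines, marker="£"):
    """Append every non-empty line that does not start with `marker`
    to the line before it, as ':g/^[^£]/-j!' does in the buffer."""
    merged = []
    for line in lines:
        # ^[^£] needs at least one character, so empty lines are left alone.
        if merged and line and not line.startswith(marker):
            merged[-1] += line
        else:
            merged.append(line)
    return merged

sample = ["data1", "data2", "£data3", "data4", "data5"]
print(join_continuations(sample))  # ['data1data2', '£data3data4data5']
```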
Consider the following text file:
something
something
something = someother thing
other thing = third thing
another thing = forth thing
I want to make it look like this:
something
something
keyword something = someother thing
keyword other thing = third thing
keyword another thing = forth thing
That is, I want to add keyword to the start of each line that contains an equals sign.
Can I do this with global command, or how do you recommend I should do this?
:g/=/s/^/keyword /
or
:g/=/normal ikeyword
Note the space after "keyword"
For this type of problem, it's also quite common to use a solution like:
:%!sed '/=/s/^/keyword /'
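If the file is being processed by a script rather than in the editor, the same filter-then-prefix logic is short to write. A minimal sketch in Python, assuming the lines are already in a list; the function name is made up.

```python
def add_keyword(lines, keyword="keyword"):
    """Prefix `keyword` and a space to every line that contains '='."""
    return [f"{keyword} {line}" if "=" in line else line for line in lines]

sample = [
    "something",
    "something",
    "something = someother thing",
    "other thing = third thing",
]
for line in add_keyword(sample):
    print(line)
```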
I'm not quite sure what you're attempting to accomplish. Your title suggests a common pattern, but I don't see one in your example. So I'll show you both.
Making Changes Among Things With A Common Pattern
You can do search and replace with the following:
:s/<regex you are searching for>/<string to replace with>/g
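s/pattern/replacement/ does search and replace on the current line, and the extra g flag applies the change to every match on that line instead of only the first. Prefix the command with % (:%s/pattern/replacement/g) to cover the whole file.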
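The effect of the g flag is easy to see on a line with several matches. The sketch below uses Python's re.sub as a stand-in (an assumption of this example): count=1 behaves like :s without g, and the default behaves like :s with g.

```python
import re

line = "thing thing thing"

# Like :s/thing/item/ (no g flag): only the first match on the line changes.
first_only = re.sub(r"thing", "item", line, count=1)

# Like :s/thing/item/g: every match on the line changes.
every_match = re.sub(r"thing", "item", line)

print(first_only)   # item thing thing
print(every_match)  # item item item
```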
Multi-Line Edit
Vim also lets you edit multiple lines at once. Say you want to edit the following three lines:
something = someother thing
other thing = third thing
another thing = fourth thing
Put your cursor on the s at the first something line.
Press <ctrl>-v outside of insert mode to go into Visual Block mode.
Scroll down to the a on the bottom line. All three starting characters of all 3 lines should be highlighted.
Press A to append after the block or I to insert before it, then start typing. When you hit escape, your change is applied to every selected line. You can also use other commands such as y and d.