A VIM search/replace pattern

How do I do a VIM search/replace under the following conditions:
a) the line contains str1
b) the line also contains str2
c) the line does NOT contain str3
On such a line, I want to replace xxx with yyy.

This will replace the () string in every line of a file containing the string return and the string aLib at some point after that:
:%s/return.*aLib.*\zs()\ze/(aHandle)/
The % is the operating range of the command, which includes all lines in the file.
\zs denotes the start of the replaced string and \ze denotes its end.
If you have to replace more than one instance per line, put a g at the end of the command after the last /. If you wish to confirm each substitution, put the c flag at the end.
By assuming the () string is never present if aHandle is present, this command doesn't answer your question exactly, but based on the sample provided in the comments, this may match your needs.

Regex doesn't handle "and" and "not" very well. There are ways to do it, but they are a bit tricky. A very simple function could solve your problem:
function! Rep()
  let l = getline('.')
  if stridx(l, 'str1') >= 0 && stridx(l, 'str2') >= 0 && stridx(l, 'str3') < 0
    " the e flag suppresses the error when xxx is not on the line
    execute 's/xxx/yyy/ge'
  endif
endfunction
You can
either put it in your .vimrc file,
or save it in a foo.vim file, open that file in Vim, and type :so %
Then open your target file and type :%call Rep(). For example, these lines:
str1 str2 str3 xxx
str3 str2 xxx
str2 xxx
*str1 str2 xxx
*xxx str2 xxx str2 xxx str1
would be changed into:
str1 str2 str3 xxx
str3 str2 xxx
str2 xxx
*str1 str2 yyy
*yyy str2 yyy str2 yyy str1
I think replacing str1,str2,xxx,yyy to your real values in that function isn't hard for you, is it?
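For comparison, the same line filter can be sketched outside Vim. This is a minimal Python sketch and not part of the Vim answer; the helper name replace_on_matching_lines and its parameters are made up for illustration. It uses plain substring tests, like stridx() in the function above.

```python
def replace_on_matching_lines(text, required=("str1", "str2"), forbidden=("str3",),
                              old="xxx", new="yyy"):
    """Replace old with new only on lines that contain every required
    substring and none of the forbidden ones (hypothetical helper)."""
    result = []
    for line in text.split("\n"):
        has_all = all(s in line for s in required)
        has_none = not any(s in line for s in forbidden)
        result.append(line.replace(old, new) if has_all and has_none else line)
    return "\n".join(result)


sample = "\n".join([
    "str1 str2 str3 xxx",
    "str3 str2 xxx",
    "str2 xxx",
    "str1 str2 xxx",
    "xxx str2 xxx str2 xxx str1",
])
print(replace_on_matching_lines(sample))
```

Only the last two sample lines satisfy all three conditions, so only they are changed.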
EDIT
Your real problem seems to be much easier than the one you described in the question. Try this line:
:%s/\(\s*return\s*aLib\.[^(]\+(\)\s*\()\)/\1aHandle\2/
Note that it could be written more briefly if it only had to work for your example, but I want it to be robust.

I think @Kent's answer (the function) is the "most correct" approach for complex searches/substitutions. It is easier to change afterward, easier to understand when you come back to it later, and also quicker to write than this. But I think it is interesting to see whether this can be done with a "oneliner".
I welcome comments if this does not work; what interests me is whether it is possible, not whether it is the "best solution".
A "oneliner" for all situations described in the original question:
:v/str3/s/\v(.*str1&.*str2)@=.{-}\zsxxx/yyy/g
Explanation:
:v/str3/ all lines NOT containing "str3"
s/ substitute
\v "very magic" (to not have to escape regex expressions like {}), :h \v
(.*str1&.*str2)@= positive look-ahead, :h \@=
.*str1 any character 0 or more times ("greedy") followed by "str1"; the preceding .* is important for both & and @= to work
& the patterns before and after & must both match (unlike |, which means "or"); both patterns must start at the same place, which is why .* is used before str1 and str2, :h \&
@= the atom before it (in the parentheses) must be present, but it is not included in the match; the atom must however start at the beginning of the line for this to work (.* makes sure of that).
.{-} any character 0 or more times ("not greedy"); {-} instead of * is important if more than one occurrence of xxx should be substituted (the g flag alone is not enough in this case)
\zs sets the start of the match (what to substitute).
xxx finally, the string to substitute
/yyy/ replace the match with yyy
g all occurrences on the line
This gives the same result as the example in @Kent's answer, and it handles any order of str1, str2, str3 and xxx.
I repeat that this is an alternative if for some reason a function is not an alternative. As I understand OP, e.g. str2 is always after str1 in his case, so this is more a general solution for situations like this (and perhaps more of academic interest).
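The "and"/"not" logic of the oneliner can also be expressed with look-aheads in other regex dialects. Here is a rough Python sketch, assuming Python's re module rather than Vim's engine; the names LINE_OK and substitute are hypothetical. Because Python look-behinds must be fixed width, the sketch only uses the regex to test the line and then does a plain string replacement.

```python
import re

# Three look-aheads anchored at the start of the line:
# str1 present AND str2 present AND str3 absent, in any order.
LINE_OK = re.compile(r"^(?=.*str1)(?=.*str2)(?!.*str3)")


def substitute(line, old="xxx", new="yyy"):
    """Replace every old with new, but only if the line passes LINE_OK."""
    return line.replace(old, new) if LINE_OK.search(line) else line


for line in ["xxx str2 xxx str1", "str1 str2 str3 xxx", "str2 xxx"]:
    print(substitute(line))
```

Each look-ahead starts at the same position (the beginning of the line), which plays the same role as the leading .* in the Vim pattern.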

Related

Replace double quotes that come in pairs

I'd like to replace double quotes " characters which come in pairs. Let me explain what I mean.
"Some sentence"
Here the double quotes should be replaced because they come in a pair.
"Some sentence
Here nothing should be replaced, because there is no matching pair for the first quote character.
I'd like to replace the first quote character with „.
❯ echo „ |hexdump -C
00000000 e2 80 9e 0a
And the second quote character with ”
❯ echo ” |hexdump -C
00000000 e2 80 9d 0a
Summing it up, the following:
Hi, "how
are you"
Should be the following after the replacement is made.
Hi, „how
are you”
I've come up with the following code, but it fails to work:
'sed -r s/(\")(.+)(\")/\1\xe2\x80\x9e\3\xe2\x80\x9d/g'
" hi " gives "„"”.
EDIT
As requested in the comments, here comes a sample from a file to be modified. Important note: the file is structured - perhaps it may help. The file is always a srt file, i.e. movie subtitle format.
104
00:10:25,332 --> 00:10:27,876
Kobieta mówi do drugiej:
"Widzisz to, co ja?"
105
00:10:28,001 --> 00:10:30,904
A tamta: "No to co?
Każdy wygląda tak samo."
Your expression doesn't work because you have three capturing groups: The three sets of (). You are putting the 1st (the first quote) and the 3rd (the last quote) in the output and ignoring the 2nd, which is the part you want to keep.
There's no reason to capture the quotes, since you don't want to inject them into the output. Only the bit in the middle needs to be captured.
There is also a flaw, the (.*) will itself match against a string containing a quote. So /"(.*)"/ would match the entire sequence "one"two", with the capture, (.*), matching one"two. Use [^"]* to match a sequence of non-quote characters.
Fixing this, and treating the entire text file as one line with -z, which only works if there are no nul characters in the text file, it appears this works:
sed -zE 's/"([^"]+)"/„\1”/g'
sed -rn ':a;s/"([^"]*)"/„\1”/g;/"/!{p;b;};$p;N;ba'
It substitutes all "xx" with „xx”. If the result contains no more " it is printed and we restart with next line. Else we concatenate the next line and we restart. The $p is just here to print the last lines if they contain a dangling ".
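The same idea (capture only the text between the quotes, and forbid quotes inside the capture) can be sketched in Python for comparison. This is only an illustration under the assumption that the whole file fits in one string; the function name curl_paired_quotes is made up.

```python
import re


def curl_paired_quotes(text):
    """Turn each "..." pair into „...”; a dangling quote is left alone.
    [^"]+ also matches newlines, so a pair may span several lines."""
    return re.sub(r'"([^"]+)"', "„\\1”", text)


print(curl_paired_quotes('Hi, "how\nare you"'))
print(curl_paired_quotes('"Some sentence'))
```

Since the string is processed as a whole, no line-joining loop like the one in the second sed command is needed here.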

Tcl - How to replace ? with -

(You'd think this would be easy, but I'm stumped.)
I'm converting an iOS note to a text file, and the note contains "0." and "?" whenever there is a list or bullet.
This was a bulleted list
? item 20
? Item 21
? Item 22
I'm having a lot of trouble replacing the "?"
I don't want to replace a legitimate question mark at the end of a sentence,
but I want to replace the "?" bullets with "-" (preferably anywhere in the line, not just at the beginning)
I tried these searches - no luck
set line "? item 20"
set index_bullet [string first "(\s|\r|\n)(\?)" $line]
set index_bullet [string first "(!\w)(\?)" $line]
set index_bullet [string first ^\? $line]
This works, but it would match any question mark
set index_bullet [string first \? $line]
Does anyone know what I'm doing wrong?
How do I find and replace only question mark bullets with a "-"?
Thank you very much in advance
If you're really wanting to replace a question mark where you've got a regular expression that describes the rule, the regsub command is the right way. (The string first command finds literal substrings only. The string match command uses globbing rules.) In this case, we'll use the -all option so that every instance is replaced:
set line "? item 20"
set replaced [regsub -all {(\s|^)\?(\s)} $line {\1-\2}]
puts "'$line' --> '$replaced'"
# Prints: '? item 20' --> '- item 20'
The main trick to using regular expressions in Tcl is, as much as possible, to keep REs and their replacements in braces so that you can use Tcl metacharacters (e.g., backslash or square brackets) without having to fiddle around a lot.
Also, \s by default will match a newline.
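For readers who want to try the rule outside Tcl, here is a rough Python equivalent of that regsub call. It assumes Python's re module, which treats this particular pattern the same way; the names BULLET and fix_bullets are hypothetical.

```python
import re

# A question mark counts as a bullet when it is preceded by whitespace
# (or the start of the string) and followed by whitespace.
BULLET = re.compile(r"(\s|^)\?(\s)")


def fix_bullets(line):
    """Replace bullet-style question marks with a dash."""
    return BULLET.sub(r"\1-\2", line)


print(fix_bullets("? item 20"))
print(fix_bullets("Is this a question? Yes"))
```

A question mark at the end of a sentence directly follows a letter, so it does not match and is left alone.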
It seems likely that a character used to indicate a list item is the first character on the line or the first character after optional whitespace. To match a question mark at the beginning of a line:
string match {\?*} $line
or
string match \\?* $line
The braces or doubled backslash keeps the question mark from being treated as a string match metacharacter.
To find a question mark after optional whitespace:
string match {\?*} [string trimleft $line]
The command returns 1 if it finds a match, and 0 if it doesn't.
To do this with string first, use
if {[string first ? [string trimleft $line]] eq 0} ...
but in that case, keep in mind that the index returned from string first isn't the true location of the question mark. (Use == instead of eq if you have an older Tcl.)
When you have determined that the line contains a question mark in the first non-whitespace position, a simple
set line [regsub {\?} $line -]
will perform a single substitution regardless of where it is.
Documentation:
regsub,
string,
Syntax of Tcl regular expressions
I figured it out.
I did it in two steps:
1) First find the "?"
set index_bullet [string first "\?" $line]
2) Then filter out "?" that is not a bullet
set index_question_mark [string first "\w\?" $line]
I have a solution, but please post if you have a better way of doing this.
Thanks!

VIM: delete strings with the same pattern

I need to find all pairs of strings that have the same pattern.
For example:
another string, that is not interesting
la-di-da-di __pattern__ -di-la-di-la
la-di-da-da-di-la __pattern__ -la-da-li-la
and yet another usual string
So I want to delete strings with __pattern__ inside.
I don't know how to do it just with builtin commands and now I have the function, that doesn't work properly:
function! DelDup(pattern)
echom a:pattern
redir => l:count
execute "normal! :%s/a:pattern//n\<cr>"
redir END
echo l:count
endfunction
Here I try to run ":%s/a:pattern//n" to find the count of occurrences of pattern in the text.
And at the same time I try to put it into the variable "l:count".
Then I tried to echo the count I got, but nothing happens when I try to do it.
So my remaining problem in writing the function is that I can't capture the result of the command execution in a variable.
If you have another solution -- please describe it to me.
Update:
Excuse me for the bad description. I want to delete only the strings that have pattern-twins in the text.
I'm not sure if I understand your question correctly, but I'm assuming you want to remove all lines where there are at least 2 matches. If that's the case you can use the following command:
:g/\(__pattern__.*\)\{2,}/d
How this works is that it deletes all the lines where there is a match (:g/../d).
The pattern is made up of a group (\(..\)) which needs to be matched at least 2 times (\{2,}). And the pattern has a .* at the end so it matches everything between the matches of the pattern.
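The rule "delete every line with at least two matches" can be sketched in Python as follows. This is an illustration only: it treats the pattern as a literal substring, and the function name drop_lines_with_repeats and the sample lines are made up.

```python
def drop_lines_with_repeats(text, needle="__pattern__", minimum=2):
    """Keep only the lines where needle occurs fewer than minimum times."""
    kept = [line for line in text.split("\n") if line.count(needle) < minimum]
    return "\n".join(kept)


sample = "\n".join([
    "another string",
    "la __pattern__ di __pattern__ da",
    "only one __pattern__ here",
])
filtered = drop_lines_with_repeats(sample)
print(filtered)
```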
There are many ways to count occurrences of a pattern, and I'm quite sure there exists a Q/A on the subject. Let's do it yet another way and chain it with the next step. (Yes, this is completely obfuscated, but it makes it possible to obtain the information programmatically without having to parse the localized result of :substitute after redirection.)
" declare a list that contain all matches
let matches = []
" replace each occurrence of the "pattern" with:
" the result of the expression "\=" that can be
" interpreted as the last ([-1]) element of the
" list "matches" returned by the function (add)
" that adds the current match (submatch(0)) to the
" list
:%s/thepattern/\=add(matches, submatch(0))[-1]/gn
" The big caveat of this command is that it modifies
" the current buffer.
" We need something like the following to leave it unmodified:
:g/thepattern/call substitute(getline('.'), 'thepattern', '\=add(matches, submatch(0))[-1]', 'g')
" Note however that this flavour won't work with multi-lines patterns
" Now you can test the number of matches or do anything fancy with it
if len(matches) > 1
" replaces matches with nothing
:%s/thepattern//g
endif
Only if you want to define this as a function you'll need to play with:
exe 'normal :%s/'.escape(a:pattern, '/\').'/replacement..../flags....'
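The count-then-act flow above (collect all matches, then remove them only if there is more than one) looks like this as a small Python sketch. It is not a translation of the Vim internals; the function name delete_if_repeated is hypothetical and Python's re syntax is assumed for the pattern.

```python
import re


def delete_if_repeated(text, pattern):
    """Count the matches of pattern in text; if there is more than one,
    remove them all. Returns the (possibly modified) text and the count."""
    count = sum(1 for _ in re.finditer(pattern, text))
    if count > 1:
        text = re.sub(pattern, "", text)
    return text, count


print(delete_if_repeated("a X b\nc X d", "X"))
print(delete_if_repeated("only X", "X"))
```

Counting with finditer leaves the text untouched, so nothing is modified unless the threshold is reached.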

Octave strcat ignores added spaces

Octave adds spaces with strcat
In Octave I run these commands:
strcat ("hel", " ", "lo")
I get this result:
ans = hello
Instead of what I expected:
ans = hel lo
strcat to me sounds like "concatenate strings". A space is a valid character, so adding a space should be OK. Matlab has the same behaviour, so it's probably intended.
I find it counterintuitive. Does this behavior make sense?
Hmm. It works how it is defined:
"strcat removes trailing white space in the arguments (except within cell arrays), while cstrcat leaves white space untouched. "
From http://www.gnu.org/software/octave/doc/interpreter/Concatenating-Strings.html
So the question could be: Should this behaviour be changed.
strcat takes the input parameters and trims the trailing spaces, but not the leading spaces. If you pass a parameter that consists only of one or more spaces, it is collapsed to an empty string.
That behavior is a manifestation of how "cellstr" works where spaces at the end are removed.
Work around 1
If you put the space up against the 'lo', it is a leading space and not removed.
strcat ("hel", " lo")
ans = hel lo
Work around 2
use cstrcat instead:
cstrcat("hel", " ", "lo")
ans = hel lo
Work around 3
Use sprintf, can be faster than strcat.
sprintf("%s%s%s\n", "hel", " ", "lo")
ans = hel lo
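To make the difference concrete, here is a small Python model of the two behaviours for plain character arguments. It is only a sketch of the documented rule quoted above, not Octave's implementation: the names strcat_like and cstrcat_like are made up, and using rstrip() assumes that Python's notion of trailing whitespace is close enough to Octave's.

```python
def strcat_like(*parts):
    """Model of strcat on char arguments: trailing whitespace of every
    argument is dropped before joining (assumption: rstrip() semantics)."""
    return "".join(p.rstrip() for p in parts)


def cstrcat_like(*parts):
    """Model of cstrcat: arguments are joined exactly as given."""
    return "".join(parts)


print(repr(strcat_like("hel", " ", "lo")))
print(repr(strcat_like("hel", " lo")))
print(repr(cstrcat_like("hel", " ", "lo")))
```

An argument that is only a space consists entirely of trailing whitespace, which is why it vanishes in the first call but survives as a leading space in the second.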

Remove escapes from a string, or, "how can I get \ out of the way?"

Escape characters cause a lot of trouble in R, as evidenced by previous questions:
Change the values in a column
Can R paste() output "\"?
Replacing escaped double quotes by double quotes in R
How to gsub('%', '\%', ... in R?
Many of these previous questions could be simplified to special cases of "How can I get \ out of my way?"
Is there a simple way to do this?
For example, I can find no arguments to gsub that will remove all escapes from the following:
test <- c('\01', '\\001')
The difficulty here is that "\1", although it's printed with two glyphs, is actually, in R's view a single character. And in fact, it's the very same character as "\001" and "\01":
nchar("\1")
# [1] 1
nchar("\001")
# [1] 1
identical("\1", "\001")
# [1] TRUE
So, you can in general remove all backslashes with something like this:
(test <- c("\\hi\\", "\n", "\t", "\\1", "\1", "\01", "\001"))
# [1] "\\hi\\" "\n" "\t" "\\1" "\001" "\001" "\001"
eval(parse(text=gsub("\\", "", deparse(test), fixed=TRUE)))
# [1] "hi" "n" "t" "1" "001" "001" "001"
But, as you can see, "\1", "\01", and "\001" will all be rendered as 001 (since to R they are all just different names for "\001").
EDIT: For more on the use of "\" in escape sequences, and on the great variety of characters that can be represented using them (including the disallowed nul string mentioned by Joshua Ulrich in a comment above), see this section of the R language definition.
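The same distinction between an escape sequence in source code and a literal backslash in the data is not specific to R. As a side illustration in Python (an analogy only; Python's escape rules are similar here but not identical to R's in general):

```python
# Three spellings of the same one-character string (octal escape for code 1).
single = ["\1", "\01", "\001"]
print([len(s) for s in single])
print(single[0] == single[1] == single[2])

# A real backslash followed by the digit 1: two characters.
literal = "\\1"
print(len(literal))
print(literal.replace("\\", ""))

# A newline contains no backslash character, so there is nothing to remove.
newline = "\n"
print(newline.replace("\\", "") == newline)
```

Only the string that actually contains a backslash character is changed by the replacement; the escape sequences were already converted to single characters when the source was parsed.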
I just faced the same issue. If you want to handle any \x where x is a character, I am not sure how (I wish I knew), but to fix it for a specific escape sequence, say \n, you can do
new = gsub("\n","",old,fixed=T)
in my case, I only had \n
