Vim Substitute: How to replace multiple occurrences with additional parts in string - vim

I have a string like this:
"test", "test2", "test3"
I want to replace test* with some value. That means it would look like this after substitution:
"abc", "abc", "abc"
This is what I tried so far:
:s/test\(\d\)/abc/g => didn't work as expected.
:s/test\p/abc/g => deletes quotes in first occurrence.
:s/test\d/abc/g => first occurrence remains unchanged.
Could you please help me with the right syntax?

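The attempts fail because \d requires exactly one digit, so the bare "test" is skipped, while \p matches any printable character and therefore consumes the closing quote. Instead, substitute 'test' followed by zero or more digits: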
:s/test\d*/abc/g
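
For comparison, the same pattern can be tried outside Vim. The following is a minimal sketch using Python's re module (not Vim script), assuming the sample line from the question. It suggests why \d* works where \d alone leaves the first occurrence untouched.

```python
import re

line = '"test", "test2", "test3"'

# \d* allows zero digits, so the bare "test" is replaced as well.
fixed = re.sub(r"test\d*", "abc", line)

# \d requires exactly one digit, so the first occurrence is left alone.
partial = re.sub(r"test\d", "abc", line)

print(fixed)
print(partial)
```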

Related

Match multiple instances of pattern in parenthesis

For the following string, is it possible for regex to return the comma delimited matches within the square brackets?
"root.path.definition[id=1234,test=blah,scope=A,B,C,D]"
Expected output would be:
["id=1234", "test=blah", "scope=A,B,C,D"]
The closest I have gotten so far is the following:
(?<=\[)(.*?)(?=\])
But this will only return one match for everything within the square brackets.
One option is to use the re module and first get the part between the square brackets using a capturing group and a negated character class.
\[([^][]*)]
That part will match:
\[ Match [ char
([^][]*) Capture group 1, match 0+ times any char other than [ and ]
] Match ] char
Then get the separate parts by matching the key value pairs separated by a comma.
\w+=.*?(?=,\w+=|$)
That part will match:
\w+ Match 1+ word characters
= Match literally
.*?(?=,\w+=|$) Match as few characters as possible until you encounter either a comma followed by 1+ word characters and =, or the end of the string
For example
import re
s = "root.path.definition[id=1234,test=blah,scope=A,B,C,D]"
m = re.search(r"\[([^][]*)]", s)
if m:
    print(re.findall(r"\w+=.*?(?=,\w+=|$)", m.group(1)))
Output
['id=1234', 'test=blah', 'scope=A,B,C,D']
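
If the pairs are needed as keys and values rather than as strings, the two-step approach above could be wrapped as follows. This is only a sketch: the function name bracket_pairs is hypothetical, and it assumes each key contains no "=" so that splitting on the first "=" is safe.

```python
import re

def bracket_pairs(s):
    """Return the key=value pairs inside the first [...] of s as a dict."""
    # Step 1: capture the text between the square brackets.
    m = re.search(r"\[([^][]*)]", s)
    if not m:
        return {}
    # Step 2: split that text into key=value items, as in the answer above.
    items = re.findall(r"\w+=.*?(?=,\w+=|$)", m.group(1))
    # Step 3: split each item on its first "=" only, so values may contain commas.
    return dict(item.split("=", 1) for item in items)

pairs = bracket_pairs("root.path.definition[id=1234,test=blah,scope=A,B,C,D]")
print(pairs)
```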
If you can make use of the third-party regex module, which supports variable-length lookbehind, this might also be an option, matching the keys and values using lookarounds to assert the [ and ]:
(?<=\[[^][]*)\w+=.*?(?=,\w+=|])(?=[^][]*])
For example
import regex
s = "root.path.definition[id=1234,test=blah,scope=A,B,C,D]"
print(regex.findall(r"(?<=\[[^][]*)\w+=.*?(?=,\w+=|])(?=[^][]*])", s))
Output
['id=1234', 'test=blah', 'scope=A,B,C,D']

Lua - match only words outside {} braces in string and replace or append the words with substring

I have various strings with forms similar to:
This is a sentence outside braces{sentence{} with some words. {This is a
sentence inside braces with some words.}{This is a second sentence
inside braces.} Maybe some more words here for another sentence.
With Lua, I want to only match specific words in the string which are outside the "{}" braces. For example, I might want to match the word "sentence" outside the braces but not the occurrences of "sentence" inside the braces. I want to only match the bolded occurrences of the word not the italicized ones.
How to do it?
EDIT: What if I want to append to or replace the matched words while keeping the substrings inside the braces intact?
Example: append "word" to sentence:
This is a sentenceword outside braces{sentence{} with some words. {This is a
sentence inside braces with some words.}{This is a second sentence
inside braces.} Maybe some more words here for another sentenceword.
The simplest way to do this is to replace every brace-delimited section with a zero-length string in a temporary variable, which you can then search for whatever you like.
You can easily do this using Lua's pattern matching and the following simple gsub code:
local tempStr = startStr:gsub("{.-}","")
The .- matches the shortest possible run of characters between { and }, and gsub then replaces each match with an empty string.
Edit: The issue with the above method, as DarkWiiPlayer has pointed out, is that the first opening brace is paired with the first closing brace, which is incorrect when braces are nested.
The way around that is to use balanced braces (%b), as DarkWiiPlayer has recommended in his answer, like so:
local tempStr = startStr:gsub("%b{}","")
local function weird_match(word, str)
    return str:gsub("%b{}", ''):match(word)
end
Replace balanced pairs of { and } with the empty string
Find the desired pattern (word) in the resulting string
Return the matched word (or its captures, if it has any)
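
The answers above find words outside the braces but do not cover the follow-up about appending to or replacing them while leaving the braced parts intact. As a rough sketch in Python rather than Lua, one way to approach it is to walk the string with a nesting counter and substitute only in the segments at depth zero. The function name sub_outside_braces is hypothetical, and the handling of an unclosed "{" (kept unchanged) is an assumption, not something the thread specifies.

```python
import re

def sub_outside_braces(text, pattern, repl):
    """Apply re.sub only to the parts of text that are not inside {...}."""
    out = []    # finished pieces of the result
    buf = []    # characters of the segment currently being collected
    depth = 0   # current brace nesting depth
    for ch in text:
        if ch == "{":
            if depth == 0:
                # Leaving an outside segment: substitute in it, then flush.
                out.append(re.sub(pattern, repl, "".join(buf)))
                buf = []
            depth += 1
            buf.append(ch)
        elif ch == "}" and depth > 0:
            buf.append(ch)
            depth -= 1
            if depth == 0:
                # A top-level group just closed: copy it through unchanged.
                out.append("".join(buf))
                buf = []
        else:
            buf.append(ch)
    tail = "".join(buf)
    # An unclosed "{" leaves depth > 0; that tail is kept as it is.
    out.append(tail if depth > 0 else re.sub(pattern, repl, tail))
    return "".join(out)

sample = "a sentence {a sentence {nested sentence}} another sentence."
result = sub_outside_braces(sample, r"\bsentence\b", "sentenceword")
print(result)
```

A counter is used here because Python's re module has no counterpart to Lua's %b, so nested braces cannot be matched with a single pattern.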

define a character string containing "

I wish to define a character variable as a"", as in my.string <- 'a""'. Nothing I have tried works; I always get "a\"\"", or some variation thereof.
I have been reading the documentation for: grep, strsplit, regex, substr, gregexpr and other functions for clues on how to tell R that " is a character I want to keep unchanged, and I have tried maybe a hundred variations of a"" by adding \\, \, /, //, [], _, $, [, #.
The only potential example I can find on the internet of a string including " is: ‘{}>=40*" years"’ from here: http://cran.r-project.org/doc/manuals/R-lang.html However, that example is for performing a mathematical operation.
Sorry for such a very basic question. Thank you for any advice.
The backslashes are an artifact of the print method. By default, print surrounds your string with quotes and therefore escapes the quotes inside it. You can disable this by setting the argument quote to FALSE.
For example, you can use:
print(my.string,quote=FALSE)
[1] a""
But I would use cat or write, like this:
cat(my.string)
a""
write(my.string,"")
a""
Using substr, one sees that the backslashes seem just to be an artefact of printing:
substr(my.string,2,2)
gives
[1] "\""
Also, the string length is as you want it:
> nchar(my.string)
[1] 3
If you want to print your string without the backslashes, use noquote:
> noquote(my.string)
[1] a""

How to split a string into a list of words in TCL, ignoring multiple spaces?

Basically, I have a string that consists of multiple, space-separated words. The thing is, however, that there can be multiple spaces instead of just one separating the words. This is why [split] does not do what I want:
split "a    b"
gives me this:
{a {} {} {} b}
instead of this:
{a b}
Searching Google, I found a page on the Tcler's wiki, where a user asked more or less the same question.
One proposed solution would look like this:
split [regsub -all {\s+} "a    b" " "]
which seems to work for simple strings. But a test string such as [string repeat " " 4] (used string repeat because StackOverflow strips multiple spaces) will result in regsub returning " ", which split would again split up into {{} {}} instead of an empty list.
Another proposed solution was this one, to force a reinterpretation of the given string as a list:
lreplace "a list with many spaces" 0 -1
But if there's one thing I've learned about TCL, it is that you should never use list functions (starting with l) on strings. And indeed, this one will choke on strings containing special characters (namely { and }):
lreplace "test \{a b\}"
returns test {a b} instead of test \{a b\} (which would be what I want, every space-separated word split up into a single element of the resulting list).
Yet another solution was to use a 'filter':
proc filter {cond list} {
set res {}
foreach element $list {if [$cond $element] {lappend res $element}}
set res
}
You'd then use it like this:
filter llength [split "a list with many spaces"]
Again, same problem. This would call llength on a string, which might contain special characters (again, { and }) - passing it "\{a b\}" would result in TCL complaining about an "unmatched open brace in list".
I managed to get it to work by modifying the given filter function, adding a {*} in front of $cond in the if, so I could use it with string length instead of llength, which seemed to work for every possible input I've tried to use it on so far.
Is this solution safe to use as it is now? Would it choke on some special input I didn't test so far? Or, is it possible to do this right in a simpler way?
The easiest way is to use regexp -all -inline to select and return all words. For example:
# The RE matches any non-empty sequence of non-whitespace characters
set theWords [regexp -all -inline {\S+} $theString]
If instead you define words to be sequences of alphanumerics, you instead use this for the regular expression term: {\w+}
You can use regexp instead:
From tcl wiki split:
Splitting by whitespace: the pitfalls
split { abc def  ghi}
{} abc def {} ghi
Usually, if you are splitting by whitespace and do not want those blank fields, you are better off doing:
regexp -all -inline {\S+} { abc def  ghi}
abc def ghi
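
The same pitfall and the same fix can be illustrated outside Tcl. The following is a minimal sketch in Python (not Tcl), assuming strings like those in the question. Matching runs of non-whitespace treats braces as ordinary characters and yields no empty fields.

```python
import re

# Splitting on a single space keeps an empty field for every extra space.
naive = "a    b".split(" ")

# Matching runs of non-whitespace never produces empty fields.
words = re.findall(r"\S+", " abc def  ghi")

# Braces are ordinary characters here, so nothing needs to be balanced.
braces = re.findall(r"\S+", "test {a b}")

# A string made only of spaces gives an empty list.
blank = re.findall(r"\S+", " " * 4)

print(naive, words, braces, blank)
```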

How do I do pattern matching in strings?

Is there a way to iterate over a comma-separated string, then doing something with the matches? So far I have:
for a in string.gmatch("this, is, a commaseparated, string", "(.-)[,]") do
    print (a)
end
The problem is the last entry in the table is not found. In C it is possible to match against NULL to check whether you are at the end of a string. Is there something similar in Lua?
Try this:
for a in string.gmatch("this, is, a commaseparated, string", "([^,]+),?") do
    print (a)
end
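The Lua pattern ([^,]+),? captures one or more non-comma characters that are optionally followed by a comma, so the last field is matched even though no comma follows it. Note that each captured field keeps any leading space.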
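
The difference between the two patterns can also be seen in a short Python sketch (Python's re module, not Lua patterns), assuming the string from the question. The first expression mirrors the original attempt and the second mirrors the fix.

```python
import re

s = "this, is, a commaseparated, string"

# Requiring a trailing comma drops the last field, as in the question.
with_comma = re.findall(r"(.*?),", s)

# Matching runs of non-comma characters also picks up the last field.
fields = re.findall(r"[^,]+", s)

# The fields keep their leading spaces; strip them if they are unwanted.
cleaned = [f.strip() for f in fields]

print(with_comma)
print(fields)
print(cleaned)
```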
