Get the length of the string in substitution

I'd like to calculate the length of a replace string used in a substitution. That is, "bar" in :s/foo/bar. Suppose I have access to this command string, I can run and undo it, and may separate the parts marked by / with split(). How would I get the string length of the replace string if it contains special characters like \1, \2 etc or ~?
For instance if I have
:s/\v(foo)|(bars)/\2\rreplace/
the replace length would be strlen("bars\rreplace") = 12.
EDIT: Just to be clear, I hope to use this to move the cursor past the text that was affected by a substitute operation. I'd appreciate alternative solutions as well.

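Use a sub-replace-expression (see :help sub-replace-expression). Inside it, submatch(2) takes the place of \2. If the expression calls a custom function, the function can store a length in a variable as a side effect, and you can read that variable afterwards. The function below records the length of the replacement text; use strlen(submatch(0)) instead if you want the length of the original match:
function! Replace()
  " Equivalent of \2\rreplace
  let l:replacement = submatch(2) . "\r" . 'replace'
  " Side effect: remember the length for later use
  let g:replaceLength = strlen(l:replacement)
  return l:replacement
endfunction
:s/\v(foo)|(bars)/\=Replace()/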

Related

In Swift how to obtain the "invisible" escape characters in a string variable into another variable

In Swift I can create a String variable such as this:
let s = "Hello\nMy name is Jack!"
And if I use s, the output will be:
Hello
My name is Jack!
(because the \n is a linefeed)
But what if I want to programmatically obtain the raw characters in the s variable? As in if I want to actually do something like:
let sRaw = s.raw
I made the .raw up, but something like this. So that the literal value of sRaw would be:
Hello\nMy name is Jack!
and it would literally print the string, complete with literal "\n"
Thank you!

The newline is the "raw character" contained in the string.
How exactly you formed the string (in this case from a string literal with an escape sequence in source code) is not retained (it is only available in the source code, but not preserved in the resulting program). It would look exactly the same if you read it from a file, a database, the concatenation of multiple literals, a multi-line literal, a numeric escape sequence, etc.
If you want to print newline as \n you have to convert it back (by doing text replacement) -- but again, you don't know if the string was really created from such a literal.
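The same point can be sketched in Python, used here only for illustration since the question is about Swift. The variable names are made up for the example. It shows that the string holds a single newline character, that a string built another way is indistinguishable from it, and that getting \n back for display is plain text replacement:

```python
s = "Hello\nMy name is Jack!"

# Built without any escape sequence in a literal; the result is the same string.
same = "Hello" + chr(10) + "My name is Jack!"

# The string contains one newline character, not a backslash followed by "n".
newline_count = s.count("\n")
backslash_count = s.count("\\")

# Converting back for display: replace the newline with the two characters \ and n.
escaped = s.replace("\n", "\\n")

print(same == s)        # True
print(newline_count)    # 1
print(backslash_count)  # 0
print(escaped)          # Hello\nMy name is Jack!
```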

You can do this with escaped characters such as \n:
let secondaryString = "really"
let s = "Hello\nMy name is \(secondaryString) Jack!"
// Split on the newline character, then rejoin with the two characters \ and n.
let r = s.split(separator: "\n", omittingEmptySubsequences: false).joined(separator: "\\n")
print(r) // -> "Hello\nMy name is really Jack!"
However, once the string s is generated the \(secondaryString) has already been interpolated to "really" and there is no trace of it other than the replaced word. I suppose if you already know the interpolated string you could search for it and replace it with "\\(secondaryString)" to get the result you want. Otherwise it's gone.

Efficient way to insert characters between other characters in a string

What is an efficient way in MATLAB to replace/insert one symbol (in series of symbols) with several others that correspond to the one that is being replaced?
For example, consider having a string Eq: Eq = 'A*exp(-((x-xc)/w)^2)'. Is there a way to replace * with .*, / with ./, \ with .\, and ^ with .^ without writing four separate strrep() lines?

Regular expressions will do the job nicely. Regular expressions simply find patterns in text. You specify what kind of pattern you are looking for by a regular expression, and the output gives you the locations of where the pattern occurred.
For our particular case, not only do we want to find where patterns occur, we also want to replace those patterns with something else. Specifically, use the function regexprep from MATLAB to replace matches in a string with something else. What you want to do is replace all *, /, \ and ^ symbols by adding a . in front of each.
regexprep takes the string to search as its first input and the pattern to find as its second. In our case, we want to find any one of *, /, \ and ^. To specify this pattern, you put those symbols inside [] brackets. Regular expressions reserve \ as the escape character, so a literal backslash is written \\, and ^ is written \^ so that it is read as a literal character rather than as the negation marker of a bracket expression.
The third input is what you want to replace each match with. In our case, we simply want to reuse each matched character with a . added in front of it. This is written \.$0 in the replacement syntax. $0 stands for the entire matched text, which here is the single symbol matched by the pattern. The \ before the . marks it as a literal character.
Without further ado:
>> Eq = 'A*exp(-((x-xc)/w)^2)';
>> out = regexprep(Eq, '[*/\\\^]', '\.$0')
out =
A.*exp(-((x-xc)./w).^2)
The pattern we are looking for is [*/\\\^], which matches any of *, /, \ (written \\ in the regex) and ^ (written \^ in the regex). Each matched symbol is replaced with the same symbol preceded by a . character, using \.$0.
As a more complicated example, let's make sure that we include all of the symbols you're looking for in a sample equation:
>> A = 'A*exp(-((x-xc)/w)^2) \ b^2';
>> out = regexprep(A, '[*/\\\^]', '\.$0')
out =
A.*exp(-((x-xc)./w).^2) .\ b.^2
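
For readers more familiar with other regex engines, the same idea (one character class, one replacement that reuses the whole match) can be sketched in Python. This is only an illustration of the technique, not MATLAB code, and add_dots is a hypothetical helper name:

```python
import re

def add_dots(expr):
    # The class matches any one of * / \ ^.
    # Only the backslash needs escaping; "^" is literal because it is
    # not the first character inside the brackets.
    # \g<0> in the replacement stands for the whole match.
    return re.sub(r"[*/\\^]", r".\g<0>", expr)

print(add_dots("A*exp(-((x-xc)/w)^2) \\ b^2"))
# A.*exp(-((x-xc)./w).^2) .\ b.^2
```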

I'd go with regexp as in rayryeng's answer. But here's another approach, just to provide an alternative.
ops = '*/\^'; %// operators that need a dot
ii = find(ismember(Eq, ops)); %// find where dots should be inserted
[~, jj] = sort([1:numel(Eq) ii-.5]); %// will be used to properly order the result
result = [Eq repmat('.',1,numel(ii))]; %// insert dots at the end
result = result(jj); %// properly order the result
And a variant:
ops = '*/\^'; %// operators that need a dot
ii = find(ismember(Eq, ops)); %// find where dots should be inserted
jj = sort([1:numel(Eq) ii-.5]); %// dot locations are marked with fractional part
result = Eq(ceil(jj)); %// repeat characters where the dots will be placed
result(mod(jj,1)>0) = '.'; %// place dots at indices with fractional part

The vectorize function already does almost all of what you want except that it does not convert mldivide (\) to ldivide (.\).
By "efficient," do you mean fewer lines of code or faster? Regular expressions are almost always slower than other approaches and less readable. I don't think they're necessary or a good choice in this case. If you only need to convert your string once, then speed is less of a concern than readability (strrep will still be faster). If you need to do it many times, this simple code that you alluded to is 4–5 times faster than regexprep for short strings like your example (and much faster for longer strings):
out = strrep(Eq,'*','.*');
out = strrep(out,'/','./');
out = strrep(out,'\','.\');
out = strrep(out,'^','.^');
If you want one line, use:
out = strrep(strrep(strrep(strrep(Eq,'*','.*'),'/','./'),'\','.\'),'^','.^');
which will also be slightly faster still. Or create your own version of vectorize and call that.
Where regular expressions shine is in more complex cases, e.g., if your string is already partially vectorized: Eq = 'A.*exp(-((x-xc)/w)^2)'. Even still, the vectorize function just uses strrep and then calls strfind to "remove any possible '..*', '../', etc." and replace them with the proper element-wise operators because it's faster (symbolic math strings can get very large, for example).

Meaning of $ in a string?

I came across this
__date__ = "$Date: 2011/06$"
and found this in the docs
$$ is an escape; it is replaced with a single $.
$identifier names a substitution placeholder matching a mapping key of "identifier". By default, "identifier" must spell a Python identifier. The first non-identifier character after the $ character terminates this placeholder specification.
${identifier} is equivalent to $identifier. It is required when valid identifier characters follow the placeholder but are not part of the placeholder, such as "${noun}ification".
but I don't understand it.
Could someone explain in plain English what the $ is for and, preferably, give some examples?

To Python, those dollar signs mean nothing at all. Just like the 'D' or 'a' that follow, the dollar sign is merely a character in a string.
To your source-code control system, the dollar signs indicate a substitution command. When you check out a new copy of your source code, that string is replaced with the timestamp of the last committed change to the file.
Reference:
http://svnbook.red-bean.com/en/1.6/svn.advanced.props.special.keywords.html
http://www.badgertronics.com/writings/cvs/keywords.html

This syntax is used for string substitution with string.Template. For example, if a string contains a placeholder that should take different values, you can use it as follows:
import string
mytext = "$dog is an animal"
replaceDogtoCat = {"dog":"cat"}
mytemplate = string.Template(mytext)
print(mytemplate.substitute(replaceDogtoCat))  # output: cat is an animal
replaceDogtoGoat = {"dog":"goat"}
print(mytemplate.substitute(replaceDogtoGoat))  # output: goat is an animal
$dog is a placeholder that gets replaced when substitute is executed.
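
The three rules quoted in the question can be tried out directly. The following is a minimal sketch using string.Template from the standard library; the template texts and variable names are invented for the example. It also shows that a string which is never passed to Template keeps its dollar signs untouched:

```python
from string import Template

# "$$" is an escape that yields a single "$"; "${amount}" is a placeholder.
price = Template("Cost: $$${amount}").substitute(amount=5)

# Braces are required here so that "ification" is not read as part of the name.
word = Template("${noun}ification").substitute(noun="class")

# safe_substitute leaves unknown placeholders in place instead of raising KeyError.
partial = Template("$who likes $what").safe_substitute(who="tim")

# A plain string is never interpreted; the dollar signs are ordinary characters.
date = "$Date: 2011/06$"

print(price)    # Cost: $5
print(word)     # classification
print(partial)  # tim likes $what
print(date)     # $Date: 2011/06$
```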

How do I remove lines from a string that begin with a specific string in Lua?

How do I remove lines from a string that begin with another string in Lua? For instance, I want to remove every line of the string result that begins with <Table. This is the code I've written so far:
for line in result:gmatch"<Table [^\n]*" do line = "" end

string.gmatch is used to iterate over all occurrences of a pattern. To replace text that matches a pattern, you need to use string.gsub.
Another problem is that your pattern <Table [^\n]* will match every line containing <Table, not just the lines that begin with it.
Lua patterns have no beginning-of-line anchor (^ only anchors at the start of the whole string), but this almost works:
local str = result:gsub("\n<Table [^\n]*", "")
except that it misses a match on the first line. My solution is a second pass that handles the first line:
local str1 = result:gsub("\n<Table [^\n]*", "")
local str2 = str1:gsub("^<Table [^\n]*\n", "")
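
For comparison, the intended behaviour (drop every line that begins with a given prefix, together with its newline) can be sketched in Python rather than Lua. The function name strip_lines and the sample text are hypothetical and serve only to make the expected result concrete:

```python
def strip_lines(text, prefix):
    # keepends=True keeps each line's own terminator, so dropping a line
    # also drops its newline and leaves no empty line behind.
    kept = [line for line in text.splitlines(keepends=True)
            if not line.startswith(prefix)]
    return "".join(kept)

sample = "<Table a>\nkeep <Table b>\n<Table c>\nlast"
print(repr(strip_lines(sample, "<Table ")))
# 'keep <Table b>\nlast'
```

The first line is removed like any other, and a line that merely contains <Table is kept, which are the two cases the Lua patterns above have to handle separately.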

The LPEG library is perfect for this kind of task.
Just write a function to create custom line strippers:
local mk_striplines
do
  local lpeg = require "lpeg"
  local P = lpeg.P
  local Cs = lpeg.Cs
  local lpegmatch = lpeg.match
  -- end of line: any of the common newline conventions
  local eol = P"\n\r" + P"\r\n" + P"\n" + P"\r"
  local eof = P(-1)
  -- the rest of a line including its terminator, or an empty line
  local linerest = (1 - eol)^1 * (eol + eof) + eol
  mk_striplines = function (pat)
    pat = P (pat)
    local matchline = pat * linerest
    -- substitution capture: matching lines become "", all others are kept
    local striplines = Cs (((matchline / "") + linerest)^1)
    return function (str)
      return lpegmatch (striplines, str)
    end
  end
end
Note that the argument to mk_striplines() may be a string or a pattern, so the result is very flexible:
mk_striplines (P"<Table" + P"</Table>") would create a stripper that drops lines starting with either of two patterns.
mk_striplines (P"x" * P"y"^0) drops each line starting with an x followed by any number of y’s. You get the idea.
Usage example:
local linestripper = mk_striplines "foo"
local test = [[
foo lorem ipsum
bar baz
buzz
foo bar
xyzzy
]]
print (linestripper (test))

The other answers provide good solutions to actually stripping lines from a string, but don't address why your code is failing to do that.
Reformatting for clarity, you wrote:
for line in result:gmatch"<Table [^\n]*" do
  line = ""
end
The first part is a reasonable way to iterate over result and extract all spans of text that begin with <Table and continue up to but not including the next newline character. The iterator returned by gmatch returns a copy of the matching text on each call, and the local variable line holds that copy for the body of the for loop.
Since the matching text is copied to line, changes made to line do not and cannot modify the actual text stored in result.
This is due to a more fundamental property of Lua strings. All strings in Lua are immutable. Once created, they cannot be changed. A variable holding a string actually holds a reference to an immutable string object managed by the Lua runtime, which permits only two operations on it: creation (interning) of a new string, and collection by the garbage collector of a string with no remaining references.
So any approach to editing the content of the string stored in result is going to require the creation of an entirely new string. Where string.gmatch provides an iteration over the content but cannot allow it to be changed, string.gsub provides for creation of a new string where all text matching a pattern has been replaced by something new. But even string.gsub is not changing the immutable source text; it is creating a new immutable string that is a copy of the old with substitutions made.
Using gsub could be as simple as this:
result = result:gsub("<Table [^\n]*", "")
but that will disclose other defects in the pattern itself. First, and most obviously, nothing requires that the pattern match at only the beginning of the line. Second, the pattern does not include the newline, so it will leave the line present but empty.
All of that can be refined by careful and clever use of the pattern library. But it doesn't change the fact that you are starting with XML text and are not handling it with XML aware tools. In that case, any approach based on pattern matching or even regular expressions is likely to end in tears.

result = result:gsub('%f[^\n%z]<Table [^\n]*', '')
The start of this pattern, %f[^\n%z], is a frontier pattern which will match at any transition from either a newline or zero character to another character, and for frontier patterns the position before the first character counts as a zero character. In other words, using that prefix allows the rest of the pattern to match at either the first line or any other start-of-line.
Reference: the Lua 5.3 manual, section 6.4.1 on string patterns

how to understand below vim script entries?

Question 1:
I only know the bash-style form let var = value. How should I read the following syntax in Vim script?
let g:counter += 1
return g:counter . '. '
Question 2:
What does '<C-\>^>' mean? What key sequence is it in Vim?
map '<C-\>^>'
I'd like to add to my question, please forgive me.
The Vim key map looks like this:
map <C-\>^] :GtagsCursor<CR>
I press the keys like this:
Ctrl-\ Shift-. and then ]
This doesn't work. What is wrong?

Question 1:
The two lines should be inside a function; otherwise the return doesn't make any sense.
The global variable g:counter should also already be defined.
The first line does the same as:
let g:counter = g:counter+1
that is, it increments the variable g:counter by 1.
The second line:
return g:counter . '. '
For example, if the variable's value after the increment is 10, the line returns the string "10. " (with a trailing space).
The dot concatenates two strings. The first string is the variable's value, which is converted to a string automatically, and the second string is '. '.
Question 2:
map <C-\>^>
Note that I removed the single quotes from your map command.
The key sequence is:
Ctrl-\ Shift-6 Shift-.
Shift-6 is ^
Shift-. is >

Regarding the first question, you should probably type :help eval.txt or :help usr_41.txt inside Vim and read a good chunk of it.
