Lua 'plain' string.gsub

I've hit a small block with string parsing. I have a string like:
footage/down/temp/cars_[100]_upper/cars_[100]_upper.exr
and I'm having difficulty using gsub to delete a portion of the string. Normally I would do this:
lineA = "footage/down/temp/cars_[100]_upper/cars_[100]_upper.exr"
lineB = "footage/down/temp/cars_[100]_upper/"
newline = lineA:gsub(lineB, "")
which would normally give me 'cars_[100]_upper.exr'
The problem is that gsub doesn't like the [] or other special characters in the string and, unlike string.find, gsub doesn't have a 'plain' flag to turn off pattern matching.
I can't manually edit the lines to escape the special characters, as I'm writing a file comparison script.
Any help to get from lineA to newline using lineB would be most appreciated.

Taking from page 181 of Programming in Lua 2e:
The magic characters are:
( ) . % + - * ? [ ] ^ $
The character '%' works as an escape for these magic characters.
So, we can just come up with a simple function to escape these magic characters, and apply it to your input string (lineB):
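function literalize(str)
  -- Prefix every magic character with "%". The outer parentheses keep only
  -- gsub's first return value (the new string, not the match count).
  return (str:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", function(c) return "%" .. c end))
end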
lineA = "footage/down/temp/cars_[100]_upper/cars_[100]_upper.exr"
lineB = literalize("footage/down/temp/cars_[100]_upper/")
newline = lineA:gsub(lineB, "")
print(newline)
Which of course prints: cars_[100]_upper.exr.

You may use another approach like:
local i1, i2 = lineA:find(lineB, nil, true)
local result = lineA:sub(i2 + 1)
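The two lines above assume that lineB actually occurs in lineA; if it does not, find returns nil and i2 + 1 raises an error. A minimal sketch that guards against that case follows. The helper name removePlain is illustrative only and is not part of the answer.

```lua
-- Hypothetical helper (name is illustrative): remove the first literal
-- occurrence of `needle` from `s`, with pattern matching disabled.
local function removePlain(s, needle)
  -- The fourth argument `true` makes string.find do a plain substring search.
  local i1, i2 = s:find(needle, 1, true)
  if not i1 then
    return s -- needle not present: return the input unchanged
  end
  return s:sub(1, i1 - 1) .. s:sub(i2 + 1)
end

local lineA = "footage/down/temp/cars_[100]_upper/cars_[100]_upper.exr"
local lineB = "footage/down/temp/cars_[100]_upper/"
print(removePlain(lineA, lineB)) --> cars_[100]_upper.exr
```

Unlike gsub, this removes only the first occurrence, which is enough for stripping a directory prefix.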

You can also escape punctuation in a text string, str, using:
str:gsub ("%p", "%%%0")
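As a rough sketch of how that one-liner might be used: gsub returns two values (the new string and the number of substitutions), so it can help to wrap the call in a small function that keeps only the string. The name escapePattern is an assumption made for this example.

```lua
-- Hypothetical wrapper around the one-liner above.
local function escapePattern(str)
  -- "%p" matches any punctuation character. In the replacement string,
  -- "%%" is a literal percent sign and "%0" is the whole match.
  -- The outer parentheses drop gsub's second return value (the match count).
  return (str:gsub("%p", "%%%0"))
end

print(escapePattern("cars_[100]_upper/")) --> cars%_%[100%]%_upper%/

local lineA = "footage/down/temp/cars_[100]_upper/cars_[100]_upper.exr"
local lineB = "footage/down/temp/cars_[100]_upper/"
-- A local assignment with one name keeps only gsub's first result.
local newline = lineA:gsub(escapePattern(lineB), "")
print(newline) --> cars_[100]_upper.exr
```

This escapes more characters than strictly necessary (for example "/" and "_"), which is harmless because "%" followed by any non-alphanumeric character stands for that character itself.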

Related

Remove spaces from a string but not new lines in lua

I used string.gsub(str, "%s+") to remove spaces from a string, but I don't want it to remove newlines. For example:
str = "string with\nnew line"
string.gsub(str, "%s+")
print(str)
and I'm expecting the output to be like:
stringwith
newline
What pattern should I use to get that result?
It seems you want to match any whitespace matched with %s but exclude a newline char from the pattern.
You can use a reverse %S pattern (that matches any non-whitespace char) in a negated character set, [^...], and add a \n there:
local str = "string with\nnew line"
str = string.gsub(str, "[^%S\n]+", "")
print(str)
This prints:
stringwith
newline
"%s" matches any whitespace character, including newlines. If you want to match only a space, use " ". If you want to match a specific number of spaces, either write them out explicitly, e.g. "     ", or use string.rep(" ", 5).

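To compare the whitespace patterns discussed in the Lua answers above, here is a small sketch on a made-up input that contains a space, a tab, and a newline. The input string and variable names are assumptions chosen for illustration.

```lua
local str = "a \tb\nc d"

-- "%s+" removes every run of whitespace, newlines included.
local all = str:gsub("%s+", "")
-- " +" removes only runs of the space character; tabs and newlines stay.
local spacesOnly = str:gsub(" +", "")
-- "[^%S\n]+" removes whitespace other than "\n".
local keepNewlines = str:gsub("[^%S\n]+", "")

assert(all == "abcd")
assert(spacesOnly == "a\tb\ncd")
assert(keepNewlines == "ab\ncd")
```
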
Remove newline character from a string?

I have a string that is like so:
"string content here
"
because it is too long to fit on the screen in one line
The string is the name of a file I would like to read, but I always get an error message that the file name wasn't found, because the string includes the newline character, which obviously isn't in the file name. I cannot rename the file, and I have tried the strip function to remove it, but this doesn't work. How can I remove the newline (enter) character from my string so I can load my file?
You can use the function strip to remove any leading and trailing whitespace, including a newline, from a string.
>> text = "hello" + newline; %Create test string.
>> disp(text)
hello
>> text_stripped = strip(text);
>> disp(text_stripped)
hello
>>
In the above ">>" has been included to better present the removal of the whitespace in the string.
Consider replacing the newline character with nothing using strrep.
As an example:
s = sprintf('abc\ndef') % Create a string s with a newline character in the middle
s = strrep(s, newline, '') % Replace newline with nothing
Alternatively, you could use regular expressions if there are several characters causing you issues.
Alternatively, you could use strip if you know the newline always occurs at the beginning or end.

Haskell - how to pattern match on backslash character?

I want to replace \n with a space in a String with a recursive function using pattern matching, but I can't figure out how to match the \ char.
This is my function:
replace :: String -> String
replace ('\\':'n':xs) = ' ' : replace xs
replace (x:xs) = x : replace xs
replace "" = ""
In ('\':'n':xs) the backslash would escape the single quote and mess up the code, so I wrote ('\\':'n':xs) expecting that the first \ would escape the escape of the second \ and would match a backslash in a String. However, it doesn't.
This is what happens when I try the function in GHCi:
*Example> replace "m\nop"
"m\nop"
*Example> replace "m\\nop"
"m op"
How can I match a single backslash?
\n is a single character. If we use \n in a string like "Hello\nWorld!", then the resulting list looks like this: ['H','e','l','l','o','\n','W','o','r','l','d','!']. \n denotes a newline character, a single ASCII byte 10. However, since a newline isn't really easy to type in many programming languages, the escape sequence \n is used instead in string literals.
If you want to pattern match on a newline, you must use the whole escape sequence:
replace :: String -> String
replace ('\n':xs) = ' ' : replace xs
replace (x:xs) = x : replace xs
replace "" = ""
Otherwise, you will only match the literal \.
Exercise: Now that replace works, try to use map instead of explicit recursion.

gsubbing a string with a pattern containing a newline character in Lua

Does string.gsub recognize the newline character in a string literal? I have a scenario in which I am trying to gsub a portion of a string indicated by a given operator from the start of the operator to the newline like so:
local function removeComments(str, operator)
local new_Sc = (str):gsub(operator..".*\n", "");
return new_Sc;
end
local source = [[
int hi = 123; //a basic comment
char ok = "abc"; //another comment
]];
source = removeComments(source, "//");
print(source);
however in the output I see that it removed the rest of the string literal after the first comment:
int hi = 123;
I tried using the literal newline character by using string.char(10) like so (str):gsub(operator..".*"..string.char(10), ""); however I still got the same output; it removes the comment and the rest of the string instead of the start of the comment to the newline.
So is there any way to gsub a string literal for a pattern containing a newline character?
Thanks
The problem you are facing is akin to greedy vs. lazy matching in regular expressions (.* vs .*?).
In Lua patterns, X.*\n means "match X, then match as many as possible characters followed by a newline". gsub has no special handling for a newline, hence it will try to continue matching until the last newline, subbing as many characters as it can. You want to match as few characters as possible, which is represented by .- in Lua patterns.
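The difference between the greedy ".*" and the lazy ".-" can be seen on a short two-line input. This is only an illustrative sketch; the sample string is made up.

```lua
local source = "int a; //one\nint b; //two\n"

-- Greedy: ".*" runs to the end of the string, then backtracks to the LAST
-- newline, so everything after the first "//" is removed.
local greedy = source:gsub("//.*\n", "")
-- Lazy: ".-" stops at the FIRST newline after each "//".
local lazy = source:gsub("//.-\n", "\n")

assert(greedy == "int a; ")
assert(lazy == "int a; \nint b; \n")
```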
Also, I am not sure if it is intended or not, but this strategy will not remove the comment from the last line, if it is not (properly) ended by a newline. I am not sure if it can be represented by a single pattern, but this function will remove comments from all lines:
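local function removeComments(str, operator)
  -- Remove comments that are terminated by a newline, keeping the newline.
  local new_Sc = str:gsub(operator..".-\n", "\n");
  -- Remove a trailing comment on a last line that has no newline.
  new_Sc = new_Sc:gsub(operator.."[^\n]*$", "");
  return new_Sc;
end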
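One possible single-pattern variant, offered as a sketch rather than as the answer's method: match the operator followed by any run of non-newline characters. Because "[^\n]*" can never cross a line break, it also covers a last line with no trailing newline. The function name stripLineComments is hypothetical, and escaping the operator with "%p" is an added assumption so that operators such as "--" (which contains a magic character) are treated literally.

```lua
-- Hypothetical single-pattern variant of removeComments.
local function stripLineComments(str, operator)
  -- Escape punctuation so the operator is matched literally.
  -- A local assignment with one name keeps only gsub's first result.
  local escaped = operator:gsub("%p", "%%%0")
  -- Remove the operator and everything after it up to, but not including,
  -- the next newline (or the end of the string).
  return (str:gsub(escaped .. "[^\n]*", ""))
end

local source = 'int hi = 123; //a basic comment\nchar ok = "abc"; //another comment'
assert(stripLineComments(source, "//") == 'int hi = 123; \nchar ok = "abc"; ')
assert(stripLineComments("x = 1 -- note\ny = 2", "--") == "x = 1 \ny = 2")
```

Like the original, this knows nothing about string literals, so an operator that appears inside a quoted string would also be treated as the start of a comment.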

Perl Force Interpolation of Literal String [duplicate]

In perl suppose I have a string like 'hello\tworld\n', and what I want is:
'hello world
'
That is, "hello", then a literal tab character, then "world", then a literal newline. Or equivalently, "hello\tworld\n" (note the double quotes).
In other words, is there a function for taking a string with escape sequences and returning an equivalent string with all the escape sequences interpolated? I don't want to interpolate variables or anything else, just escape sequences like \x, where x is a letter.
Sounds like a problem that someone else would have solved already. I've never used the module, but it looks useful:
use String::Escape qw(unbackslash);
my $s = unbackslash('hello\tworld\n');
You can do it with 'eval':
my $string = 'hello\tworld\n';
my $decoded_string = eval "\"$string\"";
Note that there are security issues tied to that approach if you don't have 100% control of the input string.
Edit: If you want to ONLY interpolate \x substitutions (and not the general case of 'anything Perl would interpolate in a quoted string') you could do this:
my $string = 'hello\tworld\n';
$string =~ s#([^\\A-Za-z_0-9])#\\$1#gs;
my $decoded_string = eval "\"$string\"";
That does almost the same thing as quotemeta - but exempts '\' characters from being escaped.
Edit2: This still isn't 100% safe because if the last character is a '\' - it will 'leak' past the end of the string though...
Personally, if I wanted to be 100% safe I would make a hash with the subs I specifically wanted and use a regex substitution instead of an eval:
my %sub_strings = (
'\n' => "\n",
'\t' => "\t",
'\r' => "\r",
);
$string =~ s/(\\n|\\t|\\r)/$sub_strings{$1}/gs;

Resources