What are the correct rules for Groovy escaping? - string

In the Groovy manual you can find these two pieces of text:
Any Groovy expression can be interpolated in all string literals,
apart from single and triple single quoted strings.
Slashy string
...
Only forward slashes need to be escaped with a backslash
They are obviously contradictory: according to the second sentence, /$a/ will be interpreted as '$a', but according to the first one it will be interpreted as '-the value of variable a-'. In practice, it works the second way: the variable is interpolated.
Interestingly, a dollar sign before something that looks like a variable has to be escaped in single-quoted strings, too. Real-life examples are here. Groovy tries to read $ as a variable name prefix even when not interpolating.
It seems that the explanation of dollar slashy strings states the rule correctly for ALL strings:
except to escape the dollar of a string subsequence that would start
like a GString placeholder sequence, ...
Could you formulate correct, non-contradictory escaping rules for Groovy?
The practical tests were done with the Gradle plugin for IntelliJ.

Related

Is there a definitive documented answer on double quoted string escaping?

Say, for example, I want to write this header line in an HTTP response:
Content-Disposition: attachment; filename="I can't believe it's not header!.jpg"
It contains a mix of quotes. Doubling the quotes doesn't work (despite being the cleanest approach):
Header set always Content-Disposition "attachment; filename=""I can't believe it's not header!.jpg"""
It throws the error Header has too many arguments.
Good old backslash works:
Header always set Content-Disposition "attachment; filename=\"I can't believe it's not header!.jpg\""
but the docs provide examples where backslashes are used unescaped, so I assume that "a\b" is parsed the same as "a\\b" because \b isn't special like \ and ". We know what they say about assumptions. Am I just being dense? Where are the docs?
Update: I opened a bug as I found other oddities.
Backslash-escapes are certainly the standard way of escaping characters in Apache config files, so backslash-escaping double quotes inside a string that is itself delimited by double quotes is certainly the way to go.
However, where is this documented? The page in the Apache docs that covers configuration file syntax does not explicitly cover this. (The only mention of backslashes is in regard to continuing directives across multiple lines, something which is rarely required.)
The Apache docs for mod_log_config (a base module) do state:
Literal quotes and backslashes should be escaped with backslashes.
This is where the argument is (always) enclosed in double quotes. The same happens to apply to pretty much all string arguments in all modules.
but the docs provide examples where backslashes are used unescaped, so I assume that "a\b" is parsed the same as "a\\b" because \b isn't special like \ and ".
I can't see where you are referring to? The link you provide does not seem to include such an example?
If the argument is an ordinary string then "a\b" would be seen as "ab" (the literal b is unnecessarily escaped). And "a\\b" would be "a\b" (the backslash itself is escaped for a literal backslash). However, if the argument takes a regex (as many of those examples on the Apache expressions page do) then \b itself is a special meta-character that asserts a word-boundary - there is no backslash-escape in this instance.
Note that arguments in Apache config files only need to be surrounded by double quotes if the value contains spaces. Many examples in the Apache docs include double quotes, but this is not a requirement. Spaces themselves can often be backslash-escaped (to avoid having to double quote the argument), but this tends to be less readable. For regex arguments it is often preferable to use \s instead (any whitespace character).

create a string with quotes and back slash using python

I would like to create a string in the following exact format:
"\"\"\"\nThis is a beautiful world.\n"
But the code:
test ="\"\"\"\nthis is a beautiful world.\n"
test
gives the output:
'"""\nthis is a beautiful world.\n'
Please help me get the exact text.
My string test should look exactly like the string it was initialized to, but after initialization it actually gives the output shown above. I want to concatenate the test string to another string.
The backslash (\) is the escape character in most programming languages. It is used to put characters into a string that would otherwise be hard to include. For example, to put a double quote into a double-quoted string, write \":
a = "he said, \"hello\" to me"
Printing a gives:
he said, "hello" to me
Here the backslash tells the interpreter to treat the following character literally instead of as the end of the string.
To include a backslash itself, write two backslashes:
a = "\\"
Here, the value of a is \.
If that is still unclear, try this tutorial.
For your code, try this:
test = "\\\"\\\"\\\"\\nThis is a beautiful world.\\n"
And if you also want the double quotes at the ends:
test = "\"\\\"\\\"\\\"\\nThis is a beautiful world.\\n\""
The first thing to note is that just typing the variable name when running Python interactively returns the canonical string representation of the object and not (necessarily) the plain value of the object.
For strings this means (among other things) that quotes are added around the output (in your example, the outermost single quotes) and any newlines are replaced with "\n".
This means that, although your output does show "\n", the actual string contains a newline character in its place.
To check what a string looks like, you should use the print() function to, well, print it.
>>> test = "\"\"\"\nthis is a beautiful world."
>>> test
'"""\nthis is a beautiful world.'
>>> print(test)
"""
this is a beautiful world.
>>>
Also, when running the code from a file, lines just containing variable names will not result in any output.
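The difference between the representation and the value can be checked directly. The following is a minimal sketch using the string from the session above; the names shown and value are only illustrative.

```python
# The string from the question: three double quotes, a newline, then text.
test = "\"\"\"\nthis is a beautiful world."

shown = repr(test)   # what the interactive prompt echoes for a bare name
value = str(test)    # what print() writes

print(shown)
print(value)
print(len(test))

# The echoed form spells the newline as two characters: backslash and n.
assert "\\n" in shown
# The string itself holds one real newline and no backslash at all.
assert "\n" in test and "\\" not in test
```

Comparing len(test) with len(repr(test)) is a quick way to see that the quotes and the backslash belong to the representation only.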
To answer the question
There are a few ways to handle that.
Assuming that the desired output is
\"\"\"\nThis is a beautiful world.\n
that is, the outermost double quotes are not supposed to be part of the string:
While using double quotes ("…") to denote strings: escape any \ or " by prepending it with \:
>>> test ="\\\"\\\"\\\"\\nthis is a beautiful world.\\n"
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
Within regular strings, \ introduces an escape sequence. For example, \n is interpreted as a newline and \b as a backspace. If you need a literal \ in a string, you actually need to write two: \\.
If you are usually using "…" for string notation, this allows for a more consistent coding style but it is (especially in this case) quite ugly and might be hard to understand at a glance.
As your string contains a lot of " characters, just use single quotes ('…') to designate the string. This removes the need to escape ":
>>> test = '\\"\\"\\"\\nthis is a beautiful world.\\n'
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
This is less consistent (if "…" is usually used for strings), but it allows the code to be quite a bit closer to the desired output.
Use raw strings (r'…' or r"…") to disable the interpretation of escape sequences and allow the use of " within the string:
>>> test = r'\"\"\"\nthis is a beautiful world.\n'
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
or even
>>> test = r"\"\"\"\nthis is a beautiful world.\n"
>>> print(test)
\"\"\"\nthis is a beautiful world.\n
This allows the code to be identical to the desired output, but it has some limitations when it comes to freely mixing " and ' within a single string as it is not possible to escape the quotation marks within the string without also adding \ to the string output. This can be seen in the second example, where we use \" to escape the double quote within r"…" in the code but where the \s are still present in the output. While this works well in this specific case, I would recommend against using \' within r'…' or \" within r"…" to avoid confusion.
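As a quick check that the spellings discussed in this answer really denote the same string, they can be compared directly. This is only a sketch; the variable names are made up for the comparison.

```python
# Target text (backslashes are literal): \"\"\"\nthis is a beautiful world.\n
escaped_double = "\\\"\\\"\\\"\\nthis is a beautiful world.\\n"
escaped_single = '\\"\\"\\"\\nthis is a beautiful world.\\n'
raw_single = r'\"\"\"\nthis is a beautiful world.\n'
raw_double = r"\"\"\"\nthis is a beautiful world.\n"

# All four literals produce exactly the same string object contents.
assert escaped_double == escaped_single == raw_single == raw_double

# None of them contains a real newline; the \n is a backslash followed by n.
assert "\n" not in raw_single

print(raw_single)
```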

String lexical rule in ANTLR with greedy wildcard and escape character

From the book "The Definitive ANTLR 4 Reference":
Our STRING rule isn’t quite good enough yet because it doesn’t allow
double quotes inside strings. To support that, most languages define
escape sequences starting with a backslash. To get a double quote
inside a double-quoted string, we use \". To support the common escape
characters, we need something like the following:
STRING : '"' ( ESC | . )*? '"' ;
fragment
ESC : '\\"' | '\\\\' ; // 2-char sequences \" and \\
ANTLR itself needs to escape the escape character, so that’s why we need \\ to
specify the backslash character. The loop in STRING now matches either
an escape character sequence, by calling fragment rule ESC, or any
single character via the dot wildcard. The *? subrule operator
terminates the (ESC |.)*?
That sounds fine, but when I read that I noticed a certain ambiguity in the choice between ESC and the . wildcard. As far as STRING is concerned, it is possible to match an input "Hi\"" by matching the escape character \ to the ., and to consider the following escaped double-quote as closing the string. This would even be less greedy and so would conform better to the use of ?.
The problem, of course, is that if we do that, then we have an extra double-quote at the end that does not get matched to anything.
So I wrote the following grammar:
grammar String;
anything: STRING '"'? '\r\n';
STRING: '"' (ESC|.)*? '"';
fragment
ESC: '\\"' | '\\\\';
which accepts an optional lone double-quote character right after the string. This grammar still parses "Orange\"" as a full string.
So my question is: why is this the accepted parse, as opposed to the one taking "Orange\" as the STRING, followed by an isolated double-quote "? Note that the latter would be less greedy, which would seem to conform better to the use of ?, so one could think it would be preferable.
After some more experimentation, I realize the explanation is that the choice operator | is order-dependent (but only under the non-greedy operator ?): ESC is tried before the . wildcard. If I invert the two and write (.|ESC)*?, I do get the other parse: "Orange\" as the STRING, followed by an isolated double-quote.
This is not really surprising, but an interesting reminder that ANTLR is not as declarative as we may sometimes expect (in the sense that logic-or is order-independent but | is not). It is also a good reminder that the non-greedy operator ? does not extend its minimization capabilities to all choices, but just to the first one that matches the input (@sepp2k adds that order dependency only applies to the non-greedy case).
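The same order dependence can be reproduced with an ordinary backtracking regex engine. The sketch below uses Python's re module as an analogy only; it is not how ANTLR's lexer is implemented, and the pattern and variable names are assumptions made for the illustration.

```python
import re

# The input from the question, 10 characters: "Orange\""
text = r'"Orange\""'

# Analogue of '"' (ESC | .)*? '"' with ESC = \" or \\
esc_first = re.compile(r'"(?:\\["\\]|.)*?"')
# Analogue of '"' (. | ESC)*? '"'
dot_first = re.compile(r'"(?:.|\\["\\])*?"')

esc_match = esc_first.match(text).group(0)
dot_match = dot_first.match(text).group(0)

# ESC first: the backslash and the quote after it are consumed as one unit,
# so the string only ends at the final quote.
assert esc_match == text
# Dot first: the backslash is consumed alone, so the next quote closes the
# string and one quote is left over.
assert dot_match == text[:-1]

print(esc_match)
print(dot_match)
```

In both patterns the lazy loop tries to leave as early as possible, but when it has to take one more iteration it tries the alternatives from left to right, which is what decides between the two parses.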

two asterisk string match in TCL

Help me solve a problem in Tcl.
In my macros I want to find a string that contains two asterisks (**).
I tried the following command:
string match \*\* string_name
But it doesn't work. Can you explain where I made a mistake and how to do it correctly?
Thanks in advance!
What you are actually passing to the interpreter is string match ** string_name. You need to pass the actual backslashes to the interpreter so that it then will understand two escaped asterisks, and to do that you need to add a couple more backslashes:
string match \\*\\* $s
Or use braces:
string match {\*\*} $s
Note that the above will match only if $s contains 2 asterisks, and nothing else. To allow for anything before and after the asterisks, you can use more asterisks...
string match {*\*\**} $s
There are a few other ways to check if a string has double asterisks, you can for instance use string first (and since this one does not support expressions, you can actually get away without having to escape anything):
string first ** $s
If you get something greater than -1, then ** is present in $s.
Or if you happen to know some regular expressions:
regexp -- {\*\*} $s
Those are the most common I think.
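For comparison, here are the same three checks in Python: substring search, glob matching, and regex. This is only a rough analogue, with a hypothetical helper name. Python's fnmatch does not use backslash as an escape, so a literal asterisk in a glob is written as the character class [*].

```python
import fnmatch
import re

def contains_double_asterisk(s):
    """Run three equivalent checks for two consecutive asterisks in s."""
    by_find = s.find("**") != -1                   # like: string first ** $s
    by_glob = fnmatch.fnmatchcase(s, "*[*][*]*")   # like: string match {*\*\**} $s
    by_regex = re.search(r"\*\*", s) is not None   # like: regexp -- {\*\*} $s
    return by_find, by_glob, by_regex

for sample in ["a**b", "**", "a*b", "plain"]:
    print(sample, contains_double_asterisk(sample))
```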

Lua - How to remove quotes around integers in strings

So I have this string:
{"scores":{"1":["John",60],"2":["Jude",60],"3":["Max",60],"4":["Kyle",60],"5":["Smith",60],"6":["Mark",50],"7":["Luke",40],"8":["Anne",30],"9":["Bruce",20],"10":["kazuo",10]}}
There are a number of integers there that have quotes around them, and I want to get rid of them. How do I do that? I already tried out:
print(string.gsub(string, '/"(\d)"/', "%1"));
but it does not work. :(
Lua does not have regular expressions like Perl; instead, it has patterns. These are similar, with a few differences.
There is no need for the delimiting slashes / /, and the escape character is %, not \. Otherwise, your attempt is essentially correct:
print(string.gsub(str, '"(%d+)"', "%1"))
Where str is the variable containing the input string. Also note that string.gsub returns 2 values, which are both printed, the second result being the number of substitutions. Use an extra pair of parentheses to keep only the first result.
You can simplify the notation a little using the colon (:) operator:
print((str:gsub('"(%d+)"', "%1")))
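For readers more used to regular expressions, the two return values of gsub have a close counterpart in Python's re.subn. The snippet below is an illustrative analogue on a shortened sample of the input, not Lua code; the variable names are assumptions.

```python
import re

# Shortened sample of the string from the question.
data = '{"scores":{"1":["John",60],"2":["Jude",60]}}'

# re.subn returns (new_string, number_of_substitutions), much like gsub.
# Only quoted runs of digits match, so "John" and "scores" are left alone.
result, count = re.subn(r'"(\d+)"', r"\1", data)

print(result)
print(count)
```

Here the escape character is \ and the back-reference is \1, where the Lua pattern uses %d and %1.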
