I saw the operator r#"" in Rust but I can't find what it does. It came in handy for creating JSON:
let var1 = "test1";
let json = r#"{"type": "type1", "type2": var1}"#;
println!("{}", json) // => {"type2": "type1", "type2": var1}
What's the name of the operator r#""? How do I make var1 evaluate?
I can't find what it does
It has to do with string literals and raw strings. I think it is explained pretty well in this part of the documentation, in the code block that is posted there you can see what it does:
"foo"; r"foo"; // foo
"\"foo\""; r#""foo""#; // "foo"
"foo #\"# bar";
r##"foo #"# bar"##; // foo #"# bar
"\x52"; "R"; r"R"; // R
"\\x52"; r"\x52"; // \x52
It negates the need to escape special characters inside the string.
The r character at the start of a string literal denotes a raw string literal. It's not an operator, but rather a prefix.
In a normal string literal, there are some characters that you need to escape to make them part of the string, such as " and \. The " character needs to be escaped because it would otherwise terminate the string, and the \ needs to be escaped because it is the escape character.
In raw string literals, you can put an arbitrary number of # symbols between the r and the opening ". To close the raw string literal, you must have a closing ", followed by the same number of # characters as there are at the start. With zero or more # characters, you can put literal \ characters in the string (\ characters do not have any special meaning). With one or more # characters, you can put literal " characters in the string. If you need a " followed by a sequence of # characters in the string, just use the same number of # characters plus one to delimit the string. For example: r##"foo #"# bar"## represents the string foo #"# bar. The literal doesn't stop at the quote in the middle, because it's only followed by one #, whereas the literal was started with two #.
To answer the last part of your question, there's no way to have a string literal that evaluates variables in the current scope. Some languages, such as PHP, support that, but not Rust. You should consider using the format! macro instead. Note that for JSON, you'll still need to double the braces, even in a raw string literal, because the string is interpreted by the macro.
fn main() {
let var1 = "test1";
let json = format!(r#"{{"type": "type1", "type2": {}}}"#, var1);
println!("{}", json) // => {"type2": "type1", "type2": test1}
}
If you need to generate a lot of JSON, there are many crates that will make it easier for you. In particular, with serde_json, you can define regular Rust structs or enums and have them serialized automatically to JSON.
The first time I saw this weird notation is in glium tutorials (old crate for graphics management) and is used to "encapsulate" and pass GLSL code (GL Shading language) to shaders of the GPU
https://github.com/glium/glium/blob/master/book/tuto-02-triangle.md
As far as I understand, it looks like the content of r#...# is left untouched, it is not interpreted in any way. Hence raw string.
Related
In the regex below, \s denotes a space character. I imagine the regex parser, is going through the string and sees \ and knows that the next character is special.
But this is not the case as double escapes are required.
Why is this?
var res = new RegExp('(\\s|^)' + foo).test(moo);
Is there a concrete example of how a single escape could be mis-interpreted as something else?
You are constructing the regular expression by passing a string to the RegExp constructor.
\ is an escape character in string literals.
The \ is consumed by the string literal parsing…
const foo = "foo";
const string = '(\s|^)' + foo;
console.log(string);
… so the data you pass to the RegEx compiler is a plain s and not \s.
You need to escape the \ to express the \ as data instead of being an escape character itself.
Inside the code where you're creating a string, the backslash is a javascript escape character first, which means the escape sequences like \t, \n, \", etc. will be translated into their javascript counterpart (tab, newline, quote, etc.), and that will be made a part of the string. Double-backslash represents a single backslash in the actual string itself, so if you want a backslash in the string, you escape that first.
So when you generate a string by saying var someString = '(\\s|^)', what you're really doing is creating an actual string with the value (\s|^).
The Regex needs a string representation of \s, which in JavaScript can be produced using the literal "\\s".
Here's a live example to illustrate why "\s" is not enough:
alert("One backslash: \s\nDouble backslashes: \\s");
Note how an extra \ before \s changes the output.
As has been said, inside a string literal, a backslash indicates an escape sequence, rather than a literal backslash character, but the RegExp constructor often needs literal backslash characters in the string passed to it, so the code should have \\s to represent a literal backslash, in most cases.
A problem is that double-escaping metacharacters is tedious. There is one way to pass a string to new RegExp without having to double escape them: use the String.raw template tag, an ES6 feature, which allows you to write a string that will be parsed by the interpreter verbatim, without any parsing of escape sequences. For example:
console.log('\\'.length); // length 1: an escaped backslash
console.log(`\\`.length); // length 1: an escaped backslash
console.log(String.raw`\\`.length); // length 2: no escaping in String.raw!
So, if you wish to keep your code readable, and you have many backslashes, you may use String.raw to type only one backslash, when the pattern requires a backslash:
const sentence = 'foo bar baz';
const regex = new RegExp(String.raw`\bfoo\sbar\sbaz\b`);
console.log(regex.test(sentence));
But there's a better option. Generally, there's not much good reason to use new RegExp unless you need to dynamically create a regular expression from existing variables. Otherwise, you should use regex literals instead, which do not require double-escaping of metacharacters, and do not require writing out String.raw to keep the pattern readable:
const sentence = 'foo bar baz';
const regex = /\bfoo\sbar\sbaz\b/;
console.log(regex.test(sentence));
Best to only use new RegExp when the pattern must be created on-the-fly, like in the following snippet:
const sentence = 'foo bar baz';
const wordToFind = 'foo'; // from user input
const regex = new RegExp(String.raw`\b${wordToFind}\b`);
console.log(regex.test(sentence));
\ is used in Strings to escape special characters. If you want a backslash in your string (e.g. for the \ in \s) you have to escape it via a backslash. So \ becomes \\ .
EDIT: Even had to do it here, because \\ in my answer turned to \.
I have a raw string literal which is very long. Is it possible to split this across multiple lines without adding newline characters to the string?
file.write(r#"This is an example of a line which is well over 100 characters in length. Id like to know if its possible to wrap it! Now some characters to justify using a raw string \foo\bar\baz :)"#)
In Python and C for example, you can simply write this as multiple string literals.
# "some string"
(r"some "
r"string")
Is it possible to do something similar in Rust?
While raw string literals don't support this, it can be achieved using the concat! macro:
let a = concat!(
r#"some very "#,
r#"long string "#,
r#"split over lines"#);
let b = r#"some very long string split over lines"#;
assert_eq!(a, b);
It is possible with indoc.
The indoc!() macro takes a multiline string literal and un-indents it at compile time so the leftmost non-space character is in the first column.
let testing = indoc! {"
def hello():
print('Hello, world!')
hello()
"};
let expected = "def hello():\n print('Hello, world!')\n\nhello()\n";
assert_eq!(testing, expected);
Ps: I really think we could use an AI that recommend good crates to Rust users.
In Swift I can create a String variable such as this:
let s = "Hello\nMy name is Jack!"
And if I use s, the output will be:
Hello
My name is Jack!
(because the \n is a linefeed)
But what if I want to programmatically obtain the raw characters in the s variable? As in if I want to actually do something like:
let sRaw = s.raw
I made the .raw up, but something like this. So that the literal value of sRaw would be:
Hello\nMy name is Jack!
and it would literally print the string, complete with literal "\n"
Thank you!
The newline is the "raw character" contained in the string.
How exactly you formed the string (in this case from a string literal with an escape sequence in source code) is not retained (it is only available in the source code, but not preserved in the resulting program). It would look exactly the same if you read it from a file, a database, the concatenation of multiple literals, a multi-line literal, a numeric escape sequence, etc.
If you want to print newline as \n you have to convert it back (by doing text replacement) -- but again, you don't know if the string was really created from such a literal.
You can do this with escaped characters such as \n:
let secondaryString = "really"
let s = "Hello\nMy name is \(secondaryString) Jack!"
let find = Character("\n")
let r = String(s.characters.split(find).joinWithSeparator(["\\","n"]))
print(r) // -> "Hello\nMy name is really Jack!"
However, once the string s is generated the \(secondaryString) has already been interpolated to "really" and there is no trace of it other than the replaced word. I suppose if you already know the interpolated string you could search for it and replace it with "\\(secondaryString)" to get the result you want. Otherwise it's gone.
Is there a functional equivalent in Swift to Scala's Raw String or the verbatim string literal in C#?
Sample raw string without escape characters (not syntactically correct):
val secretKey = """long\^578arandom&61~8791escaped&*^%#(chars"""
I've tried briefly gripping through the language docs but haven't found a functional equivalent yet.
Swift 5 supports raw strings now. With this feature, backslashes and quote marks are interpreted as to their respective literal symbols. They are not treated as escapes characters or string terminators in raw strings.
To use raw strings, # symbol is used(same as python uses ‘r’ or ‘R’). Here are the number of variations for using raw strings in swift 5:
let myPets = #"The name of my dog is "barky" and my cat is "smily"."#
//The name of my dog is "barky" and my cat is "smily".
Inside of the raw string # is used for string interpolation instead of usual backslash of swift.
let val = 1
let result = #"The answer is \#(val)."#
//The answer is 1
If you want to use # inside of a raw string, place ## at the beginning and at the end.
let str = ##"I am happy bla#blablabla"##
//"I am happy bla#blablabla"
Raw strings will be helpful for regular expressions I guess, lesser backslashes in regex definition. For example:
let regex_Prev = "\\\\[A-Z]+[A-Za-z]+\\.[a-z]+"
Now we can write:
let regex_Swift5version = #"\\[A-Z]+[A-Za-z]+\.[a-z]+"#
Supposedly, this is implemented in Swift 5, see SE-0200 – Support Raw Text. From the document:
You may pad a string literal with one or more # characters:
#"She said, "This is dialog!""#
// Equivalent to "She said, \"This is dialog!\""
#"A \"quote"."#
// Backslash interpreted as an extra character.
Currently expected Swift 5 release date: “Early 2019”.
There is currently no such function, swift is new. In the conference speakers encouraged us to report anything that you feel swift needs. Therefore I suggest you to report that you need a raw string function like Scala.
This question already has answers here:
What is the syntax for a multiline string literal?
(5 answers)
Closed 1 year ago.
Is it possible to write something like:
fn main() {
let my_string: &str = "Testing for new lines \
might work like this?";
}
If I'm reading the language reference correctly, then it looks like that should work. The language ref states that \n etc. are supported (as common escapes, for inserting line breaks into your string), along with "additional escapes" including LF, CR, and HT.
Another way to do this is to use a raw string literal:
Raw string literals do not process any escapes. They start with the
character U+0072 (r), followed by zero or more of the character U+0023
(#) and a U+0022 (double-quote) character. The raw string body can
contain any sequence of Unicode characters and is terminated only by
another U+0022 (double-quote) character, followed by the same number
of U+0023 (#) characters that preceded the opening U+0022
(double-quote) character.
All Unicode characters contained in the raw string body represent
themselves, the characters U+0022 (double-quote) (except when followed
by at least as many U+0023 (#) characters as were used to start the
raw string literal) or U+005C (\) do not have any special meaning.
Examples for string literals:
"foo"; r"foo"; // foo
"\"foo\""; r#""foo""#; // "foo"
"foo #\"# bar";
r##"foo #"# bar"##; // foo #"# bar
"\x52"; "R"; r"R"; // R
"\\x52"; r"\x52"; // \x52
If you'd like to avoid having newline characters and extra spaces, you can use the concat! macro. It concatenates string literals at compile time.
let my_string = concat!(
"Testing for new lines ",
"might work like this?",
);
assert_eq!(my_string, "Testing for new lines might work like this?");
The accepted answer with the backslash also removes the extra spaces.
Every string is a multiline string in Rust.
But if you have indents in your text like:
fn my_func() {
const MY_CONST: &str = "\
Hi!
This is a multiline text!
";
}
you will get unnecessary spaces. To remove them you can use indoc! macros from indoc crate to remove all indents: https://github.com/dtolnay/indoc
There are two ways of writing multi-line strings in Rust that have different results. You should choose between them with care depending on what you are trying to accomplish.
Method 1: Dangling whitespace
If a string starting with " contains a literal line break, the Rust compiler will "gobble up" all whitespace between the last non-whitespace character of the line and the first non-whitespace character of the next line, and replace them with a single .
Example:
fn test() {
println!("{}", "hello
world");
}
No matter how many literal (blank space) characters (zero or a hundred) appear after hello, the output of the above will always be hello world.
Method 2: Backslash line break
This is the exact opposite. In this mode, all the whitespace before a literal \ on the first line is preserved, and all the subsequent whitespace on the next line is also preserved.
Example:
fn test() {
println!("{}", "hello \
world");
}
In this example, the output is hello world.
Additionally, as mentioned in another answer, Rust has "raw literal" strings, but they do not enter into this discussion as in Rust (unlike some other languages that need to resort to raw strings for this) supports literal line breaks in quoted content without restrictions, as we can see above.