Enhancing String Literals Delimiters to Support Raw Text Swift - string

I recently found these code snippets in the Swift 5 book.
print(#"Write an interpolated string in Swift using \(multiplier)."#)
// Prints "Write an interpolated string in Swift using \(multiplier)."
print(#"6 times 7 is \#(6 * 7)."#)
// Prints "6 times 7 is 42."
I learnt it was an accepted proposal in Swift 5 for enhancing string literals delimiters to support raw text, with so many examples given.
My question is when and how is it used in practical cases because from the examples given above, I would still clearly achieve what I want to even without the # signs!

To give just one example where it is very useful: writing regular expressions. Previously this was a nightmare, because every backslash in the pattern had to be escaped again for the string literal. E.g.
let regex1 = "\\\\[A-Z]+[A-Za-z]+\\.[a-z]+"
Can now be replaced with
let regex2 = #"\\[A-Z]+[A-Za-z]+\.[a-z]+"#
Much easier to write. Now when you find a regex online, you can just copy and paste it in without having to spend ages escaping special characters.
Edit: You can read more here:
https://www.hackingwithswift.com/articles/162/how-to-use-raw-strings-in-swift

Related

How to reverse strings that contain surrogate pairs in Dart?

I was playing with algorithms using Dart and as I actually followed TDD, I realized that my code has some limitations.
I was trying to reverse strings as part of an interview problem, but I couldn't get the surrogate pairs correctly reversed.
const simple = 'abc';
const emoji = '🍎🍏🐛';
const surrogate = '👮🏽‍♂️👩🏿‍💻';
String rev(String s) {
return String.fromCharCodes(s.runes.toList().reversed);
}
void main() {
print(simple);
print(rev(simple));
print(emoji);
print(rev(emoji));
print(surrogate);
print(rev(surrogate));
}
The output:
abc
cba
🍎🍏🐛
🐛🍏🍎
👮🏽‍♂️👩🏿‍💻
💻‍🏿👩️♂‍🏽👮
You can see that the simple emojis are correctly reversed as I'm using the runes instead of just simply executing s.split('').toList().reversed.join(''); but the surrogate pairs are reversed incorrectly.
How can I reverse a string that might contain surrogate pairs using the Dart programming language?
When reversing strings, you must operate on graphemes, not characters nor code units. Use grapheme_splitter.
Dart 2.7 introduced a new package that supports grapheme cluster-aware operations. The package is called characters. characters is a package for characters represented as Unicode extended grapheme clusters.
Dart's standard String class uses the UTF-16 encoding. This is a common choice in programming languages, especially those that offer support for running both natively on devices, and on the web.
UTF-16 strings usually work well, and the encoding is transparent to the developer. However, when manipulating strings, and especially when manipulating strings entered by users, you may experience a difference between what the user perceives as a character, and what is encoded as a code unit in UTF-16.
Source: "Announcing Dart 2.7: A safer, more expressive Dart" by Michael Thomsen, section "Safe substring handling"
The package will also help to reverse your strings with emojis the way a native programmer would expect.
Using simple Strings, you find issues:
String hi = 'Hi 🇩🇰';
print('String.length: ${hi.length}');
// Prints 7; would expect 4
With characters
String hi = 'Hi 🇩🇰';
print(hi.characters.length);
// Prints 4
print(hi.characters.last);
// Prints 🇩🇰
It's worth taking a look at the source code of the characters package. It's far from simple, but it looks easier to digest and better documented than grapheme_splitter. The characters package is also maintained by the Dart team.

Converting Unicode in Swift

I currently have a string as follows which I received through an API call:
\n\nIt\U2019s a great place to discover Berlin and a comfortable place
to come home to.
And I want to convert it into something like this which is more readable:
It's a great place to discover Berlin and a comfortable place to come
home to.
I've taken a look at this post, but that's manually writing down every conversion, and there may be more of these unicode scalar characters introduced.
What I understand is \u{2019} is unicode scalar, but the format for this is \U2019 and I'm quite confused. Are there any built in methods to do this conversion?
This answer suggests using the NSString method stringByFoldingWithOptions.
The Swift String class has a concept called a "view" which lets you operate on the string under different encodings. It's pretty neat, and there are some views that might help you.
If you're dealing with strings in Swift, read this excellent post by Mike Ash. He discusses the idea of what a string really is with great detail and has some helpful hints for Swift 2.
Assuming you are already splitting the string and can get the offending format separately:
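func convertFormat(stringOrig: String) -> Character? {
    // Take the hex digits that follow the "U", e.g. "2019" from "\U2019".
    guard let hex = stringOrig.split(separator: "U").last,
          // The digits are hexadecimal, so parse with radix 16 (Int("2019") would read them as decimal).
          let value = UInt32(hex, radix: 16),
          let scalar = Unicode.Scalar(value) else {
        return nil
    }
    return Character(scalar)
}
This will convert the String "\U2019" to the Character represented by "\u{2019}", or return nil if the digits are not a valid Unicode scalar value.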

What's the point of nesting brackets in Lua?

I'm currently teaching myself Lua for iOS game development, since I've heard lots of very good things about it. I'm really impressed by the level of documentation there is for the language, which makes learning it that much easier.
My problem is that I've found a Lua concept that nobody seems to have a "beginner's" explanation for: nested brackets for quotes. For example, I was taught that long strings with escaped single and double quotes like the following:
string_1 = "This is an \"escaped\" word and \"here\'s\" another."
could also be written without the overall surrounding quotes. Instead one would simply replace them with double brackets, like the following:
string_2 = [[This is an "escaped" word and "here's" another.]]
Those both make complete sense to me. But I can also write the string_2 line with "nested brackets," which include equal signs between both sets of the double brackets, as follows:
string_3 = [===[This is an "escaped" word and "here's" another.]===]
My question is simple. What is the point of the syntax used in string_3? It gives the same result as string_1 and string_2 when given as an input for print(), so I don't understand why nested brackets even exist. Can somebody please help a noob (me) gain some perspective?
It would be used if your string contains a substring that is equal to the delimiter. For example, the following would be invalid:
string_2 = [[This is an "escaped" word, the characters ]].]]
Therefore, in order for it to work as expected, you would need to use a different string delimiter, like in the following:
string_3 = [===[This is an "escaped" word, the characters ]].]===]
I think it's safe to say that not a lot of string literals contain the substring ]], in which case there may never be a reason to use the above syntax.
It helps to, well, nest them:
print [==[malucart[[bbbb]]]bbbb]==]
Will print:
malucart[[bbbb]]]bbbb
But if that's not useful enough, you can use them to put whole programs in a string:
loadstring([===[print "o m g"]===])()
Will print:
o m g
I personally use them for my static/dynamic library implementation. In the case you don't know if the program has a closing bracket with the same amount of =s, you should determine it with something like this:
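local c = 0
-- Plain-text search (4th argument true), so "]" is not treated as a pattern character.
while prog:find("]" .. string.rep("=", c) .. "]", 1, true) do
c = c + 1
end
-- "[" .. string.rep("=", c) .. "[" is now a safe opening delimiter for prog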

How to extract youtube video id from a URL in Lua

I need to extract youtube video id (e.g., brSU-lAACiA) from URL below that is in a Lua string variable.
local string = "a:2:{s:8:\"td_video\";s:60:\"http:\/\/www.youtube.com\/watch?v=brSU-lAACiA&feature=autoshare\";s:13:\"td_last_video\";s:60:\"http:\/\/www.youtube.com\/watch?v=brSU-lAACiA&feature=autoshare\";}"
What should be the pattern?
I think I got it.
local string = "a:2:{s:8:\"td_video\";s:60:\"http:\/\/www.youtube.com\/watch?v=brSU-lAACiA&feature=autoshare\";s:13:\"td_last_video\";s:60:\"http:\/\/www.youtube.com\/watch?v=brSU-lAACiA&feature=autoshare\";}"
pattern = "v=(...........)"
local vidid = string.match(string, pattern)
There are 11 dots because YouTube video IDs are exactly 11 characters long. I'm not an expert in writing these patterns, so if there are easier or shorter methods, please share them with me.
Your own solution works fine, but the time may come when YouTube decides to use video IDs that are not exactly 11 characters. This is an alternative solution for you:
local vidid = string.match(string, "%?v=(.-)&")
The pattern "%?v=(.-)&" matches a literal ?, followed by v=, then 0 or more characters, and ends at the first &. The characters between v= and & are captured; note the use of - for a non-greedy match.

How do I do a multi-line string in node.js?

With the rise of node.js, multi-line strings are becoming more necessary in JavaScript.
Is there a special way to do this in Node.JS, even if it does not work in browsers?
Are there any plans or at least a feature request to do this that I can support?
I already know that you can use \n\ at the end of every line, that is not what I want.
node v4 and current versions of node
As of ES6 (and so Node v4 and later), a new "template literal" syntax was added to JavaScript (delimited by back-ticks "`"), which can also be used to construct multi-line strings, as in:
`this is a
single string`
which evaluates to: 'this is a\nsingle string'.
Note that the newline at the end of the first line is included in the resulting string.
Template literals were added to allow programmers to construct strings where values or code could be directly injected into a string literal without having to use util.format or other templaters, as in:
let num=10;
console.log(`the result of ${num} plus ${num} is ${num + num}.`);
which will print "the result of 10 plus 10 is 20." to the console.
Older versions of node
Older versions of node can use a "line continuation" character, allowing you to write multi-line strings such as:
'this is a \
single string'
which evaluates to: 'this is a single string'.
Note that the newline at the end of the first line is not included in the resulting string.
Multiline strings are a current part of JavaScript (since ES6) and are supported in node.js v4.0.0 and newer.
var text = `Lorem ipsum dolor
sit amet, consectetur
adipisicing
elit. `;
console.log(text);
What exactly are you looking for when you mean multiline strings.
Are you looking for something like:
var str = "Some \
String \
Here";
Which would print as "Some String Here"?
If so, keep in mind that the above is valid Javascript, but this isn't:
var str = "Some \
String \
Here";
What's the difference? A space after the \ (the second version has trailing whitespace after a backslash, which is invisible here). Have fun debugging that.
As an aside to what folks have been posting here, I've heard that concatenation can be much faster than join in modern javascript vms. Meaning:
var a =
[ "hey man, this is on a line",
"and this is on another",
"and this is on a third"
].join('\n');
Will be slower than:
var a = "hey man, this is on a line\n" +
"and this is on another\n" +
"and this is on a third";
In certain cases. http://jsperf.com/string-concat-versus-array-join/3
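To check the join-versus-concatenation claim on your own machine, a rough timing sketch such as the following may help. The helper names (viaJoin, viaConcat, timeIt) and the iteration count are made up for illustration, and the numbers will vary with the Node version and the engine's optimizations, so treat the output as indicative only.

```javascript
// Build the same three-line string both ways and time many repetitions.
const lines = [
  'hey man, this is on a line',
  'and this is on another',
  'and this is on a third'
];

function viaJoin() {
  return lines.join('\n');
}

function viaConcat() {
  return lines[0] + '\n' + lines[1] + '\n' + lines[2];
}

// Hypothetical helper: elapsed milliseconds for `iterations` calls of fn.
function timeIt(fn, iterations) {
  const start = process.hrtime.bigint();
  let sink = 0;
  for (let i = 0; i < iterations; i++) {
    sink += fn().length; // use the result so the call has an observable effect
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs) / 1e6;
}

const joinMs = timeIt(viaJoin, 1e6);
const concatMs = timeIt(viaConcat, 1e6);
console.log(`join: ${joinMs.toFixed(1)} ms, concat: ${concatMs.toFixed(1)} ms`);
```

Both functions produce the same string, so any difference in the timings comes from how the string is built.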
As another aside, I find this one of the more appealing features in Coffeescript. Yes, yes, I know, haters gonna hate.
html = '''
<strong>
cup of coffeescript
</strong>
'''
It's especially nice for html snippets. I'm not saying it's a reason to use it, but I do wish it would land in ecma land :-(.
Josh
In addition to accepted answer:
`this is a
single string`
which evaluates to: 'this is a\nsingle string'.
If you want to use string interpolation but without a new line,
just add backslash as in normal string:
`this is a \
single string`
=> 'this is a single string'.
Bear in mind manual whitespace is necessary though:
`this is a\
single string`
=> 'this is asingle string'
Take a look at the mstring module for node.js.
This is a simple little module that lets you have multi-line strings in JavaScript.
Just do this:
var M = require('mstring')
var mystring = M(function(){/***
Ontario
Mining and
Forestry
Group
***/})
to get
mystring === "Ontario\nMining and\nForestry\nGroup"
And that's pretty much it.
How It Works
In Node.js, you can call the .toString method of a function, and it will give you the source code of the function definition, including any comments. A regular expression grabs the content of the comment.
Yes, it's a hack. Inspired by a throwaway comment from Dominic Tarr.
note: The module (as of 2012/13/11) doesn't allow whitespace before the closing ***/, so you'll need to hack it in yourself.
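The toString trick described under "How It Works" can be sketched roughly as follows. This is not the mstring module's actual source: the multiline function and its regular expression are assumptions written for illustration. It relies on the engine returning the function's source text with comments intact, so it will break if the code is minified.

```javascript
// Hypothetical helper (not mstring's real implementation): pulls the body
// of a /*** ... ***/ comment out of a function's source text.
function multiline(fn) {
  const source = fn.toString();
  // Capture everything between the "/***" line and the "***/" line.
  const match = /\/\*\*\*\s*\n([\s\S]*?)\n\s*\*\*\*\//.exec(source);
  if (match === null) {
    throw new Error('no /*** ... ***/ comment found');
  }
  return match[1];
}

const text = multiline(function () {/***
Ontario
Mining and
Forestry
Group
***/});

console.log(JSON.stringify(text));
```

Unlike the limitation mentioned in the note above, this sketch tolerates whitespace before the closing ***/ because the pattern allows \s* there.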
Take a look at CoffeeScript: http://coffeescript.org
It supports multi-line strings, interpolation, array comprehensions and lots of other nice stuff.
If you use io.js, it has support for multi-line strings as they are in ECMAScript 6.
var a =
`this is
a multi-line
string`;
See "New String Methods" at http://davidwalsh.name/es6-io for details and "template strings" at http://kangax.github.io/compat-table/es6/ for tracking compatibility.
Vanilla JavaScript does not support multi-line strings. Language pre-processors are turning out to be feasible these days.
CoffeeScript, the most popular of these has this feature, but it's not minimal, it's a new language. Google's traceur compiler adds new features to the language as a superset, but I don't think multi-line strings are one of the added features.
I'm looking to make a minimal superset of javascript that supports multiline strings and a couple other features. I started this little language a while back before writing the initial compiler for coffeescript. I plan to finish it this summer.
If pre-compilers aren't an option, there is also the script tag hack where you store your multi-line data in a script tag in the html, but give it a custom type so that it doesn't get evaled. Then later using javascript, you can extract the contents of the script tag.
Also, if you put a \ at the end of any line inside a string literal, it will cause the newline to be ignored as if it wasn't there. If you want the newline, then you have to end the line with "\n\".
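As a small illustration of the two line endings just described, here is a sketch using ordinary quoted strings. The variable names are made up for the example, and the continuation lines start at column 0 on purpose, since any leading indentation would become part of the string.

```javascript
// A backslash at the very end of the source line removes the line break.
const joined = 'first line \
second line';

// Ending the line with \n followed by a backslash keeps a newline in the
// string: the \n adds it, and the final backslash continues the literal.
const withNewline = 'first line\n\
second line';

console.log(JSON.stringify(joined));
console.log(JSON.stringify(withNewline));
```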

Resources