How do I do a multi-line string in node.js? - string

With the rise of node.js, multi-line strings are becoming more necessary in JavaScript.
Is there a special way to do this in Node.JS, even if it does not work in browsers?
Are there any plans or at least a feature request to do this that I can support?
I already know that you can use \n\ at the end of every line, that is not what I want.

node v4 and current versions of node
As of ES6 (and so Node v4 and later), a new "template literal" syntax was added to JavaScript (denoted by back-ticks "`") which can also be used to construct multi-line strings, as in:
`this is a
single string`
which evaluates to: 'this is a\nsingle string'.
Note that the newline at the end of the first line is included in the resulting string.
Template literals were added to allow programmers to construct strings where values or code could be directly injected into a string literal without having to use util.format or other templaters, as in:
let num=10;
console.log(`the result of ${num} plus ${num} is ${num + num}.`);
which will print "the result of 10 plus 10 is 20." to the console.
Older versions of node
Older versions of Node can use a "line continuation" character, allowing you to write multi-line strings such as:
'this is a \
single string'
which evaluates to: 'this is a single string'.
Note that the newline at the end of the first line is not included in the resulting string.
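The difference between the two forms can be checked side by side. This is a minimal sketch; the variable names are made up for illustration, and JSON.stringify is only used to make the newline visible.

```javascript
// Template literal: the line break in the source is part of the string.
const withNewline = `this is a
single string`;

// Line continuation: the backslash-newline pair is dropped from the string.
const withoutNewline = 'this is a \
single string';

console.log(JSON.stringify(withNewline));    // "this is a\nsingle string"
console.log(JSON.stringify(withoutNewline)); // "this is a single string"
```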

Multiline strings are a current part of JavaScript (since ES6) and are supported in node.js v4.0.0 and newer.
var text = `Lorem ipsum dolor
sit amet, consectetur
adipisicing
elit. `;
console.log(text);

What exactly are you looking for when you say multi-line strings?
Are you looking for something like:
var str = "Some \
String \
Here";
Which would print as "Some String Here"?
If so, keep in mind that the above is valid Javascript, but this isn't:
var str = "Some \
String \
Here";
What's the difference? A space after the \. Have fun debugging that.

As an aside to what folks have been posting here, I've heard that concatenation can be much faster than join in modern JavaScript VMs. Meaning:
var a =
[ "hey man, this is on a line",
"and this is on another",
"and this is on a third"
].join('\n');
Will be slower than:
var a = "hey man, this is on a line\n" +
"and this is on another\n" +
"and this is on a third";
In certain cases. http://jsperf.com/string-concat-versus-array-join/3
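If you want to check this on your own Node version, a rough timing loop along these lines may help. It is only a sketch: the helper names (viaJoin, viaConcat, time) are made up here, the iteration count is arbitrary, and a JIT can distort micro-benchmarks like this, so treat the numbers as a hint rather than proof.

```javascript
const lines = [
  "hey man, this is on a line",
  "and this is on another",
  "and this is on a third"
];

// Build the string with Array.prototype.join.
function viaJoin() {
  return lines.join("\n");
}

// Build the same string with the + operator.
function viaConcat() {
  return lines[0] + "\n" + lines[1] + "\n" + lines[2];
}

// Hypothetical helper: run fn repeatedly and print the elapsed milliseconds.
function time(label, fn, iterations) {
  const start = process.hrtime();
  for (let i = 0; i < iterations; i++) fn();
  const [seconds, nanoseconds] = process.hrtime(start);
  console.log(label + ": " + (seconds * 1e3 + nanoseconds / 1e6).toFixed(1) + " ms");
}

time("join", viaJoin, 1e6);
time("concat", viaConcat, 1e6);
```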
As another aside, I find this one of the more appealing features in Coffeescript. Yes, yes, I know, haters gonna hate.
html = '''
<strong>
cup of coffeescript
</strong>
'''
It's especially nice for html snippets. I'm not saying it's a reason to use it, but I do wish it would land in ecma land :-(.
Josh

In addition to accepted answer:
`this is a
single string`
which evaluates to: 'this is a\nsingle string'.
If you want to use string interpolation but without a new line,
just add backslash as in normal string:
`this is a \
single string`
=> 'this is a single string'.
Bear in mind manual whitespace is necessary though:
`this is a\
single string`
=> 'this is asingle string'
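The same works with an interpolated value. A small sketch, with made-up variable names:

```javascript
const name = "world";

// The backslash-newline is dropped, so the interpolated value lands on the same line.
const joined = `hello \
${name}`;

// Without the backslash the line break stays in the string.
const split = `hello
${name}`;

console.log(JSON.stringify(joined)); // "hello world"
console.log(JSON.stringify(split));  // "hello\nworld"
```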

Take a look at the mstring module for node.js.
This is a simple little module that lets you have multi-line strings in JavaScript.
Just do this:
var M = require('mstring')
var mystring = M(function(){/***
Ontario
Mining and
Forestry
Group
***/})
to get
mystring === "Ontario\nMining and\nForestry\nGroup"
And that's pretty much it.
How It Works
In Node.js, you can call the .toString method of a function, and it will give you the source code of the function definition, including any comments. A regular expression grabs the content of the comment.
Yes, it's a hack. Inspired by a throwaway comment from Dominic Tarr.
note: The module (as of 2012/13/11) doesn't allow whitespace before the closing ***/, so you'll need to hack it in yourself.
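The toString trick described above can be sketched roughly as follows. This is not the mstring module's actual source; the multiline helper and its regular expression are assumptions made for illustration, and this version tolerates whitespace before the closing ***/.

```javascript
// Hypothetical helper, not the mstring module's real implementation.
function multiline(fn) {
  // Function.prototype.toString returns the function's source text, comments included.
  const source = fn.toString();
  // Capture everything between the "/***" line and the "***/" line.
  const match = /\/\*\*\*\r?\n([\s\S]*?)\r?\n\s*\*\*\*\//.exec(source);
  if (!match) {
    throw new Error("no /*** ... ***/ comment found");
  }
  return match[1];
}

const text = multiline(function () {/***
Ontario
Mining and
Forestry
Group
***/});

console.log(JSON.stringify(text)); // "Ontario\nMining and\nForestry\nGroup"
```

Keep in mind that anything that strips comments from the source (a minifier, for instance) would break this approach.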

Take a look at CoffeeScript: http://coffeescript.org
It supports multi-line strings, interpolation, array comprehensions and lots of other nice stuff.

If you use io.js, it has support for multi-line strings as they are in ECMAScript 6.
var a =
`this is
a multi-line
string`;
See "New String Methods" at http://davidwalsh.name/es6-io for details and "template strings" at http://kangax.github.io/compat-table/es6/ for tracking compatibility.

Vanilla JavaScript does not support multi-line strings. Language pre-processors are turning out to be feasible these days.
CoffeeScript, the most popular of these, has this feature, but it's not minimal, it's a new language. Google's Traceur compiler adds new features to the language as a superset, but I don't think multi-line strings are one of the added features.
I'm looking to make a minimal superset of javascript that supports multiline strings and a couple other features. I started this little language a while back before writing the initial compiler for coffeescript. I plan to finish it this summer.
If pre-compilers aren't an option, there is also the script tag hack where you store your multi-line data in a script tag in the html, but give it a custom type so that it doesn't get evaled. Then later using javascript, you can extract the contents of the script tag.
Also, if you put a \ at the end of a line inside a string literal, it will cause the newline to be ignored as if it wasn't there. If you want the newline, then you have to end the line with "\n\".

Related

Meaning of `$` in AJAX functions [duplicate]

The code in question is here:
var $item = $(this).parent().parent().find('input');
What is the purpose of the dollar sign in the variable name, why not just exclude it?
A '$' in a variable means nothing special to the interpreter, much like an underscore.
From what I've seen, many people using jQuery (which is what your example code looks like to me) tend to prefix variables that contain a jQuery object with a $ so that they are easily identified and not mixed up with, say, integers.
The dollar sign function $() in jQuery is a library function that is frequently used, so a short name is desirable.
In your example the $ has no special significance other than being a character of the name.
However, in ECMAScript 6 (ES6) a $ followed by { also marks a placeholder inside a template literal:
var user = 'Bob'
console.log(`We love ${user}.`); //Note backticks
// We love Bob.
The dollar sign is treated just like a normal letter or underscore (_). It has no special significance to the interpreter.
Unlike many similar languages, identifiers (such as function and variable names) in JavaScript can contain not only letters, numbers and underscores, but can also contain dollar signs. They are even allowed to start with a dollar sign, or consist only of a dollar sign and nothing else.
Thus, $ is a valid function or variable name in Javascript.
Why would you want a dollar sign in an identifier?
The syntax doesn't really enforce any particular usage of the dollar sign in an identifier, so it's up to you how you wish to use it. In the past, it has often been recommended to start an identifier with a dollar sign only in generated code - that is, code created not by hand but by a code generator.
In your example, however, this doesn't appear to be the case. It looks like someone just put a dollar sign at the start for fun - perhaps they were a PHP programmer who did it out of habit, or something. In PHP, all variable names must have a dollar sign in front of them.
There is another common meaning for a dollar sign in an interpreter nowadays: the jQuery object, whose name only consists of a single dollar sign ($). This is a convention borrowed from earlier Javascript frameworks like Prototype, and if jQuery is used with other such frameworks, there will be a name clash because they will both use the name $ (jQuery can be configured to use a different name for its global object). There is nothing special in Javascript that allows jQuery to use the single dollar sign as its object name; as mentioned above, it's simply just another valid identifier name.
The $ sign is an identifier for variables and functions.
https://web.archive.org/web/20160529121559/http://www.authenticsociety.com/blog/javascript_dollarsign
That has a clear explanation of what the dollar sign is for.
Here's an alternative explanation: http://www.vcarrer.com/2010/10/about-dollar-sign-in-javascript.html
The dollar sign is also used in ECMAScript 2015 and later inside template literals, where ${...} marks an interpolated expression.
Example:
var a = 5;
var b = 10;
console.log(`Sum is equal: ${a + b}`); // 'Sum is equal: 15'
Here is a working example:
https://es6console.com/j3lg8xeo/
Notice this sign " ` ": it is a backtick, not a normal quote.
You can also encounter $ when working with the jQuery library.
In regular expressions, the $ sign matches the end of the input (or the end of each line when the m flag is set).
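For the regular expression meaning, a short sketch (variable names are made up) shows how $ behaves with and without the m flag:

```javascript
const text = "first line\nsecond line";

// Without the m flag, $ only matches at the very end of the input.
const endOfInput = text.match(/line$/g);

// With the m flag, $ also matches just before each line break.
const endOfEachLine = text.match(/line$/gm);

console.log(endOfInput.length);    // 1
console.log(endOfEachLine.length); // 2
```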
When using jQuery, the usage of $ symbol as a prefix in the variable name is merely by convention; it is completely optional and serves only to indicate that the variable holds a jQuery object, as in your example.
This means that when another jQuery function needs to be called on the object, you wouldn't need to wrap it in $() again. For instance, compare these:
// the usual way
var item = $(this).parent().parent().find('input');
$(item).hide(); // this is a double wrap, but required for code readability
item.hide(); // this works but is very unclear how a jQuery function is getting called on this
// with $ prefix
var $item = $(this).parent().parent().find('input');
$item.hide(); // direct call is clear
$($item).hide(); // this works too, but isn't necessary
With the $ prefix the variables already holding jQuery objects are instantly recognizable and the code more readable, and eliminates double/multiple wrapping with $().
No reason. Maybe the person who coded it came from PHP. It has the same effect as if you had named it "_item" or "item" or "item$$".
As a suffix (like "item$", pronounced "items"), it can signify an observable such as a DOM element as a convention called "Finnish Notation" similar to the Hungarian Notation.
I'll add this:
In the Chromium browser's developer console (haven't tried others), $ is a native function that acts just like document.querySelector, most likely an alias inspired by jQuery's $.
Here is a good short video explanation: https://www.youtube.com/watch?v=Acm-MD_6934
According to Ecma International, Identifier Names are tokens that are interpreted according to the grammar given in the “Identifiers” section of chapter 5 of the Unicode standard, with some small modifications. An Identifier is an IdentifierName that is not a ReservedWord (see 7.6.1). The Unicode identifier grammar is based on both normative and informative character categories specified by the Unicode Standard. The characters in the specified categories in version 3.0 of the Unicode standard must be treated as in those categories by all conforming ECMAScript implementations. This standard specifies specific character additions:
The dollar sign ($) and the underscore (_) are permitted anywhere in an IdentifierName.
Further reading can be found on: http://www.ecma-international.org/ecma-262/5.1/#sec-7.6
Ecma International is an industry association founded in 1961 and dedicated to the standardization of Information and Communication Technology (ICT) and Consumer Electronics (CE).
"Using the dollar sign is not very common in JavaScript, but
professional programmers often use it as an alias for the main
function in a JavaScript library.
In the JavaScript library jQuery, for instance, the main function $
is used to select HTML elements. In jQuery $("p"); means "select all
p elements". "
via https://www.w3schools.com/js/js_variables.asp
I might add that using it for jQuery allows you to do things like this, for instance:
$.isArray(myArray);
let $ = "Hello";
let $$ = "World!";
let $$$$$$$$$$$ = $ + " " + $$;
alert($$$$$$$$$$$);
This displays a "Hello World!" alert box.
As you can see, $ is just a normal character as far as JavaScript identifiers or variable names go. In fact you can use a huge range of Unicode characters as variable names that look like dollar or other currency signs!
Just be careful as the $ sign is also used as a reference to the jQuery namespace/library:
$("p").text("I am using some jquery");
// ...is the same as...
jQuery("p").text("I am using some jquery");
$ is also used in the new Template Literal format using string interpolation supported in JavaScript version ES6/2015:
var x = `Hello ${name}`;

How can I transform RegEx categories into plain RegEx?

This question is based on this question. During coding, I got some new things popping up and because the initial question is properly answered, I want to describe my issues in this question.
My goal is to have a RegEx which filters out everything except the following:
Alphanumeric allowed
non-Latin e.g. Chinese or Japanese allowed
.,-?!"'=$|<>[]{} allowed
Works with NodeJS 8.9.4
During implementation of the answer from the main question, I've found out, that this only works with newer Node versions (because of the supported ES version). Sadly, our project runs on 8.9.4 which can't be changed in any way. So upgrading is not an option.
I've started searching around and found this page: https://github.com/slevithan/xregexp/blob/master/tools/output/categories.js
With the help of another question, I've tried to put something together which matches my requirements. I came up with:
/[^\(?:[A-Za-z\xAA\xB5\xBA\xC0-\xD6\xD8-\xF6\xF8-\u02C1\u02C6-\u02D1\u02E0-\u02E4\u02EC\u02EE\u0370-\u0374\u0376\u0377\u037A-\u037D\u037F\u0386\u0388-\u038A\u038C\u038E-\u03A1\u03A3-\u03F5\u03F7-\u0481\u048A-\u052F\u0531-\u0556\u0559\u0560-\u0588\u05D0-\u05EA\u05EF-\u05F2\u0620-\u064A\u066E\u066F\u0671-\u06D3\u06D5\u06E5\u06E6\u06EE\u06EF\u06FA-\u06FC\u06FF\u0710\u0712-\u072F\u074D-\u07A5\u07B1\u07CA-\u07EA\u07F4\u07F5\u07FA\u0800-\u0815\u081A\u0824\u0828\u0840-\u0858\u0860-\u086A\u0870-\u0887\u0889-\u088E\u08A0-\u08C9\u0904-\u0939\u093D\u0950\u0958-\u0961\u0971-\u0980\u0985-\u098C\u098F\u0990\u0993-\u09A8\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09BD\u09CE\u09DC\u09DD\u09DF-\u09E1\u09F0\u09F1\u09FC\u0A05-\u0A0A\u0A0F\u0A10\u0A13-\u0A28\u0A2A-\u0A30\u0A32\u0A33\u0A35\u0A36\u0A38\u0A39\u0A59-\u0A5C\u0A5E\u0A72-\u0A74\u0A85-\u0A8D\u0A8F-\u0A91\u0A93-\u0AA8\u0AAA-\u0AB0\u0AB2\u0AB3\u0AB5-\u0AB9\u0ABD\u0AD0\u0AE0\u0AE1\u0AF9\u0B05-\u0B0C\u0B0F\u0B10\u0B13-\u0B28\u0B2A-\u0B30\u0B32\u0B33\u0B35-\u0B39\u0B3D\u0B5C\u0B5D\u0B5F-\u0B61\u0B71\u0B83\u0B85-\u0B8A\u0B8E-\u0B90\u0B92-\u0B95\u0B99\u0B9A\u0B9C\u0B9E\u0B9F\u0BA3\u0BA4\u0BA8-\u0BAA\u0BAE-\u0BB9\u0BD0\u0C05-\u0C0C\u0C0E-\u0C10\u0C12-\u0C28\u0C2A-\u0C39\u0C3D\u0C58-\u0C5A\u0C5D\u0C60\u0C61\u0C80\u0C85-\u0C8C\u0C8E-\u0C90\u0C92-\u0CA8\u0CAA-\u0CB3\u0CB5-\u0CB9\u0CBD\u0CDD\u0CDE\u0CE0\u0CE1\u0CF1\u0CF2\u0D04-\u0D0C\u0D0E-\u0D10\u0D12-\u0D3A\u0D3D\u0D4E\u0D54-\u0D56\u0D5F-\u0D61\u0D7A-\u0D7F\u0D85-\u0D96\u0D9A-\u0DB1\u0DB3-\u0DBB\u0DBD\u0DC0-\u0DC6\u0E01-\u0E30\u0E32\u0E33\u0E40-\u0E46\u0E81\u0E82\u0E84\u0E86-\u0E8A\u0E8C-\u0EA3\u0EA5\u0EA7-\u0EB0\u0EB2\u0EB3\u0EBD\u0EC0-\u0EC4\u0EC6\u0EDC-\u0EDF\u0F00\u0F40-\u0F47\u0F49-\u0F6C\u0F88-\u0F8C\u1000-\u102A\u103F\u1050-\u1055\u105A-\u105D\u1061\u1065\u1066\u106E-\u1070\u1075-\u1081\u108E\u10A0-\u10C5\u10C7\u10CD\u10D0-\u10FA\u10FC-\u1248\u124A-\u124D\u1250-\u1256\u1258\u125A-\u125D\u1260-\u1288\u128A-\u128D\u1290-\u12B0\u12B2-\u12B5\u12B8-\u12BE\u12C0\u12C2-\u12C5\u12C8-\u12D6\u12D8-\
u1310\u1312-\u1315\u1318-\u135A\u1380-\u138F\u13A0-\u13F5\u13F8-\u13FD\u1401-\u166C\u166F-\u167F\u1681-\u169A\u16A0-\u16EA\u16F1-\u16F8\u1700-\u1711\u171F-\u1731\u1740-\u1751\u1760-\u176C\u176E-\u1770\u1780-\u17B3\u17D7\u17DC\u1820-\u1878\u1880-\u1884\u1887-\u18A8\u18AA\u18B0-\u18F5\u1900-\u191E\u1950-\u196D\u1970-\u1974\u1980-\u19AB\u19B0-\u19C9\u1A00-\u1A16\u1A20-\u1A54\u1AA7\u1B05-\u1B33\u1B45-\u1B4C\u1B83-\u1BA0\u1BAE\u1BAF\u1BBA-\u1BE5\u1C00-\u1C23\u1C4D-\u1C4F\u1C5A-\u1C7D\u1C80-\u1C88]+/g
My current example string is:
Test=😕查看
°°^ Marting 10202029 Offline!"§$%&/()!"§$%&/()After this we want to keep the allowed special chars: .,-?!"'=$|<>[]{}
Somehow, the answer from the first question works better than the categories I parsed myself. So there must be something wrong, which I'm unable to find.
At the end, I want to put everything inside a var.replace() command, to replace everything bad with a single whitespace.
For testing, I'm using: https://regexr.com/
You can either use regexpu to transpile the regex into an ES6-compliant regex, or you may go to the Unicode Utilities: UnicodeSet page and get the code point ranges manually.
In your case, paste [^\p{L}\p{N}] into the Input field, check Abbreviate and Escape, then click Show Set. Add the .,?!"'=$|<>[\]{}- at the end of the character class. Then, double the backslashes (also, escape the ' or ", your string literal delimiter char, I escaped ' below) and put inside the pattern_from_uu variable definition in this JavaScript code and then, all you need to define the regex is const reg = new RegExp(pattern, "gu") or const reg = new RegExp(pattern, "u"):
const pattern_from_uu = '[^0-9A-Za-z\\u00AA\\u00B2\\u00B3\\u00B5\\u00B9\\u00BA\\u00BC-\\u00BE\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02C1\\u02C6-\\u02D1\\u02E0-\\u02E4\\u02EC\\u02EE\\u0370-\\u0374\\u0376\\u0377\\u037A-\\u037D\\u037F\\u0386\\u0388-\\u038A\\u038C\\u038E-\\u03A1\\u03A3-\\u03F5\\u03F7-\\u0481\\u048A-\\u052F\\u0531-\\u0556\\u0559\\u0560-\\u0588\\u05D0-\\u05EA\\u05EF-\\u05F2\\u0620-\\u064A\\u0660-\\u0669\\u066E\\u066F\\u0671-\\u06D3\\u06D5\\u06E5\\u06E6\\u06EE-\\u06FC\\u06FF\\u0710\\u0712-\\u072F\\u074D-\\u07A5\\u07B1\\u07C0-\\u07EA\\u07F4\\u07F5\\u07FA\\u0800-\\u0815\\u081A\\u0824\\u0828\\u0840-\\u0858\\u0860-\\u086A\\u0870-\\u0887\\u0889-\\u088E\\u08A0-\\u08C9\\u0904-\\u0939\\u093D\\u0950\\u0958-\\u0961\\u0966-\\u096F\\u0971-\\u0980\\u0985-\\u098C\\u098F\\u0990\\u0993-\\u09A8\\u09AA-\\u09B0\\u09B2\\u09B6-\\u09B9\\u09BD\\u09CE\\u09DC\\u09DD\\u09DF-\\u09E1\\u09E6-\\u09F1\\u09F4-\\u09F9\\u09FC\\u0A05-\\u0A0A\\u0A0F\\u0A10\\u0A13-\\u0A28\\u0A2A-\\u0A30\\u0A32\\u0A33\\u0A35\\u0A36\\u0A38\\u0A39\\u0A59-\\u0A5C\\u0A5E\\u0A66-\\u0A6F\\u0A72-\\u0A74\\u0A85-\\u0A8D\\u0A8F-\\u0A91\\u0A93-\\u0AA8\\u0AAA-\\u0AB0\\u0AB2\\u0AB3\\u0AB5-\\u0AB9\\u0ABD\\u0AD0\\u0AE0\\u0AE1\\u0AE6-\\u0AEF\\u0AF9\\u0B05-\\u0B0C\\u0B0F\\u0B10\\u0B13-\\u0B28\\u0B2A-\\u0B30\\u0B32\\u0B33\\u0B35-\\u0B39\\u0B3D\\u0B5C\\u0B5D\\u0B5F-\\u0B61\\u0B66-\\u0B6F\\u0B71-\\u0B77\\u0B83\\u0B85-\\u0B8A\\u0B8E-\\u0B90\\u0B92-\\u0B95\\u0B99\\u0B9A\\u0B9C\\u0B9E\\u0B9F\\u0BA3\\u0BA4\\u0BA8-\\u0BAA\\u0BAE-\\u0BB9\\u0BD0\\u0BE6-\\u0BF2\\u0C05-\\u0C0C\\u0C0E-\\u0C10\\u0C12-\\u0C28\\u0C2A-\\u0C39\\u0C3D\\u0C58-\\u0C5A\\u0C5D\\u0C60\\u0C61\\u0C66-\\u0C6F\\u0C78-\\u0C7E\\u0C80\\u0C85-\\u0C8C\\u0C8E-\\u0C90\\u0C92-\\u0CA8\\u0CAA-\\u0CB3\\u0CB5-\\u0CB9\\u0CBD\\u0CDD\\u0CDE\\u0CE0\\u0CE1\\u0CE6-\\u0CEF\\u0CF1\\u0CF2\\u0D04-\\u0D0C\\u0D0E-\\u0D10\\u0D12-\\u0D3A\\u0D3D\\u0D4E\\u0D54-\\u0D56\\u0D58-\\u0D61\\u0D66-\\u0D78\\u0D7A-\\u0D7F\\u0D85-\\u0D96\\u0D9A-\\u0DB1\\u0DB3-\\u0DBB\\u0DBD\\u0DC0-\\u0DC6\\u0DE6-\\u0DEF\\u0
E01-\\u0E30\\u0E32\\u0E33\\u0E40-\\u0E46\\u0E50-\\u0E59\\u0E81\\u0E82\\u0E84\\u0E86-\\u0E8A\\u0E8C-\\u0EA3\\u0EA5\\u0EA7-\\u0EB0\\u0EB2\\u0EB3\\u0EBD\\u0EC0-\\u0EC4\\u0EC6\\u0ED0-\\u0ED9\\u0EDC-\\u0EDF\\u0F00\\u0F20-\\u0F33\\u0F40-\\u0F47\\u0F49-\\u0F6C\\u0F88-\\u0F8C\\u1000-\\u102A\\u103F-\\u1049\\u1050-\\u1055\\u105A-\\u105D\\u1061\\u1065\\u1066\\u106E-\\u1070\\u1075-\\u1081\\u108E\\u1090-\\u1099\\u10A0-\\u10C5\\u10C7\\u10CD\\u10D0-\\u10FA\\u10FC-\\u1248\\u124A-\\u124D\\u1250-\\u1256\\u1258\\u125A-\\u125D\\u1260-\\u1288\\u128A-\\u128D\\u1290-\\u12B0\\u12B2-\\u12B5\\u12B8-\\u12BE\\u12C0\\u12C2-\\u12C5\\u12C8-\\u12D6\\u12D8-\\u1310\\u1312-\\u1315\\u1318-\\u135A\\u1369-\\u137C\\u1380-\\u138F\\u13A0-\\u13F5\\u13F8-\\u13FD\\u1401-\\u166C\\u166F-\\u167F\\u1681-\\u169A\\u16A0-\\u16EA\\u16EE-\\u16F8\\u1700-\\u1711\\u171F-\\u1731\\u1740-\\u1751\\u1760-\\u176C\\u176E-\\u1770\\u1780-\\u17B3\\u17D7\\u17DC\\u17E0-\\u17E9\\u17F0-\\u17F9\\u1810-\\u1819\\u1820-\\u1878\\u1880-\\u1884\\u1887-\\u18A8\\u18AA\\u18B0-\\u18F5\\u1900-\\u191E\\u1946-\\u196D\\u1970-\\u1974\\u1980-\\u19AB\\u19B0-\\u19C9\\u19D0-\\u19DA\\u1A00-\\u1A16\\u1A20-\\u1A54\\u1A80-\\u1A89\\u1A90-\\u1A99\\u1AA7\\u1B05-\\u1B33\\u1B45-\\u1B4C\\u1B50-\\u1B59\\u1B83-\\u1BA0\\u1BAE-\\u1BE5\\u1C00-\\u1C23\\u1C40-\\u1C49\\u1C4D-\\u1C7D\\u1C80-\\u1C88\\u1C90-\\u1CBA\\u1CBD-\\u1CBF\\u1CE9-\\u1CEC\\u1CEE-\\u1CF3\\u1CF5\\u1CF6\\u1CFA\\u1D00-\\u1DBF\\u1E00-\\u1F15\\u1F18-\\u1F1D\\u1F20-\\u1F45\\u1F48-\\u1F4D\\u1F50-\\u1F57\\u1F59\\u1F5B\\u1F5D\\u1F5F-\\u1F7D\\u1F80-\\u1FB4\\u1FB6-\\u1FBC\\u1FBE\\u1FC2-\\u1FC4\\u1FC6-\\u1FCC\\u1FD0-\\u1FD3\\u1FD6-\\u1FDB\\u1FE0-\\u1FEC\\u1FF2-\\u1FF4\\u1FF6-\\u1FFC\\u2070\\u2071\\u2074-\\u2079\\u207F-\\u2089\\u2090-\\u209C\\u2102\\u2107\\u210A-\\u2113\\u2115\\u2119-\\u211D\\u2124\\u2126\\u2128\\u212A-\\u212D\\u212F-\\u2139\\u213C-\\u213F\\u2145-\\u2149\\u214E\\u2150-\\u2189\\u2460-\\u249B\\u24EA-\\u24FF\\u2776-\\u2793\\u2C00-\\u2CE4\\u2CEB-\\u2CEE\\u2CF2\\u2CF3\\u2CFD\\u2D00-\\u2D25\\u2D27\\u2D2D\
\u2D30-\\u2D67\\u2D6F\\u2D80-\\u2D96\\u2DA0-\\u2DA6\\u2DA8-\\u2DAE\\u2DB0-\\u2DB6\\u2DB8-\\u2DBE\\u2DC0-\\u2DC6\\u2DC8-\\u2DCE\\u2DD0-\\u2DD6\\u2DD8-\\u2DDE\\u2E2F\\u3005-\\u3007\\u3021-\\u3029\\u3031-\\u3035\\u3038-\\u303C\\u3041-\\u3096\\u309D-\\u309F\\u30A1-\\u30FA\\u30FC-\\u30FF\\u3105-\\u312F\\u3131-\\u318E\\u3192-\\u3195\\u31A0-\\u31BF\\u31F0-\\u31FF\\u3220-\\u3229\\u3248-\\u324F\\u3251-\\u325F\\u3280-\\u3289\\u32B1-\\u32BF\\u3400-\\u4DBF\\u4E00-\\uA48C\\uA4D0-\\uA4FD\\uA500-\\uA60C\\uA610-\\uA62B\\uA640-\\uA66E\\uA67F-\\uA69D\\uA6A0-\\uA6EF\\uA717-\\uA71F\\uA722-\\uA788\\uA78B-\\uA7CA\\uA7D0\\uA7D1\\uA7D3\\uA7D5-\\uA7D9\\uA7F2-\\uA801\\uA803-\\uA805\\uA807-\\uA80A\\uA80C-\\uA822\\uA830-\\uA835\\uA840-\\uA873\\uA882-\\uA8B3\\uA8D0-\\uA8D9\\uA8F2-\\uA8F7\\uA8FB\\uA8FD\\uA8FE\\uA900-\\uA925\\uA930-\\uA946\\uA960-\\uA97C\\uA984-\\uA9B2\\uA9CF-\\uA9D9\\uA9E0-\\uA9E4\\uA9E6-\\uA9FE\\uAA00-\\uAA28\\uAA40-\\uAA42\\uAA44-\\uAA4B\\uAA50-\\uAA59\\uAA60-\\uAA76\\uAA7A\\uAA7E-\\uAAAF\\uAAB1\\uAAB5\\uAAB6\\uAAB9-\\uAABD\\uAAC0\\uAAC2\\uAADB-\\uAADD\\uAAE0-\\uAAEA\\uAAF2-\\uAAF4\\uAB01-\\uAB06\\uAB09-\\uAB0E\\uAB11-\\uAB16\\uAB20-\\uAB26\\uAB28-\\uAB2E\\uAB30-\\uAB5A\\uAB5C-\\uAB69\\uAB70-\\uABE2\\uABF0-\\uABF9\\uAC00-\\uD7A3\\uD7B0-\\uD7C6\\uD7CB-\\uD7FB\\uF900-\\uFA6D\\uFA70-\\uFAD9\\uFB00-\\uFB06\\uFB13-\\uFB17\\uFB1D\\uFB1F-\\uFB28\\uFB2A-\\uFB36\\uFB38-\\uFB3C\\uFB3E\\uFB40\\uFB41\\uFB43\\uFB44\\uFB46-\\uFBB1\\uFBD3-\\uFD3D\\uFD50-\\uFD8F\\uFD92-\\uFDC7\\uFDF0-\\uFDFB\\uFE70-\\uFE74\\uFE76-\\uFEFC\\uFF10-\\uFF19\\uFF21-\\uFF3A\\uFF41-\\uFF5A\\uFF66-\\uFFBE\\uFFC2-\\uFFC7\\uFFCA-\\uFFCF\\uFFD2-\\uFFD7\\uFFDA-\\uFFDC\\U00010000-\\U0001000B\\U0001000D-\\U00010026\\U00010028-\\U0001003A\\U0001003C\\U0001003D\\U0001003F-\\U0001004D\\U00010050-\\U0001005D\\U00010080-\\U000100FA\\U00010107-\\U00010133\\U00010140-\\U00010178\\U0001018A\\U0001018B\\U00010280-\\U0001029C\\U000102A0-\\U000102D0\\U000102E1-\\U000102FB\\U00010300-\\U00010323\\U0001032D-\\U0001034A\\U00010350-\\U000
10375\\U00010380-\\U0001039D\\U000103A0-\\U000103C3\\U000103C8-\\U000103CF\\U000103D1-\\U000103D5\\U00010400-\\U0001049D\\U000104A0-\\U000104A9\\U000104B0-\\U000104D3\\U000104D8-\\U000104FB\\U00010500-\\U00010527\\U00010530-\\U00010563\\U00010570-\\U0001057A\\U0001057C-\\U0001058A\\U0001058C-\\U00010592\\U00010594\\U00010595\\U00010597-\\U000105A1\\U000105A3-\\U000105B1\\U000105B3-\\U000105B9\\U000105BB\\U000105BC\\U00010600-\\U00010736\\U00010740-\\U00010755\\U00010760-\\U00010767\\U00010780-\\U00010785\\U00010787-\\U000107B0\\U000107B2-\\U000107BA\\U00010800-\\U00010805\\U00010808\\U0001080A-\\U00010835\\U00010837\\U00010838\\U0001083C\\U0001083F-\\U00010855\\U00010858-\\U00010876\\U00010879-\\U0001089E\\U000108A7-\\U000108AF\\U000108E0-\\U000108F2\\U000108F4\\U000108F5\\U000108FB-\\U0001091B\\U00010920-\\U00010939\\U00010980-\\U000109B7\\U000109BC-\\U000109CF\\U000109D2-\\U00010A00\\U00010A10-\\U00010A13\\U00010A15-\\U00010A17\\U00010A19-\\U00010A35\\U00010A40-\\U00010A48\\U00010A60-\\U00010A7E\\U00010A80-\\U00010A9F\\U00010AC0-\\U00010AC7\\U00010AC9-\\U00010AE4\\U00010AEB-\\U00010AEF\\U00010B00-\\U00010B35\\U00010B40-\\U00010B55\\U00010B58-\\U00010B72\\U00010B78-\\U00010B91\\U00010BA9-\\U00010BAF\\U00010C00-\\U00010C48\\U00010C80-\\U00010CB2\\U00010CC0-\\U00010CF2\\U00010CFA-\\U00010D23\\U00010D30-\\U00010D39\\U00010E60-\\U00010E7E\\U00010E80-\\U00010EA9\\U00010EB0\\U00010EB1\\U00010F00-\\U00010F27\\U00010F30-\\U00010F45\\U00010F51-\\U00010F54\\U00010F70-\\U00010F81\\U00010FB0-\\U00010FCB\\U00010FE0-\\U00010FF6\\U00011003-\\U00011037\\U00011052-\\U0001106F\\U00011071\\U00011072\\U00011075\\U00011083-\\U000110AF\\U000110D0-\\U000110E8\\U000110F0-\\U000110F9\\U00011103-\\U00011126\\U00011136-\\U0001113F\\U00011144\\U00011147\\U00011150-\\U00011172\\U00011176\\U00011183-\\U000111B2\\U000111C1-\\U000111C4\\U000111D0-\\U000111DA\\U000111DC\\U000111E1-\\U000111F4\\U00011200-\\U00011211\\U00011213-\\U0001122B\\U00011280-\\U00011286\\U00011288\\U0001128A-\\U0001128D\\U0
001128F-\\U0001129D\\U0001129F-\\U000112A8\\U000112B0-\\U000112DE\\U000112F0-\\U000112F9\\U00011305-\\U0001130C\\U0001130F\\U00011310\\U00011313-\\U00011328\\U0001132A-\\U00011330\\U00011332\\U00011333\\U00011335-\\U00011339\\U0001133D\\U00011350\\U0001135D-\\U00011361\\U00011400-\\U00011434\\U00011447-\\U0001144A\\U00011450-\\U00011459\\U0001145F-\\U00011461\\U00011480-\\U000114AF\\U000114C4\\U000114C5\\U000114C7\\U000114D0-\\U000114D9\\U00011580-\\U000115AE\\U000115D8-\\U000115DB\\U00011600-\\U0001162F\\U00011644\\U00011650-\\U00011659\\U00011680-\\U000116AA\\U000116B8\\U000116C0-\\U000116C9\\U00011700-\\U0001171A\\U00011730-\\U0001173B\\U00011740-\\U00011746\\U00011800-\\U0001182B\\U000118A0-\\U000118F2\\U000118FF-\\U00011906\\U00011909\\U0001190C-\\U00011913\\U00011915\\U00011916\\U00011918-\\U0001192F\\U0001193F\\U00011941\\U00011950-\\U00011959\\U000119A0-\\U000119A7\\U000119AA-\\U000119D0\\U000119E1\\U000119E3\\U00011A00\\U00011A0B-\\U00011A32\\U00011A3A\\U00011A50\\U00011A5C-\\U00011A89\\U00011A9D\\U00011AB0-\\U00011AF8\\U00011C00-\\U00011C08\\U00011C0A-\\U00011C2E\\U00011C40\\U00011C50-\\U00011C6C\\U00011C72-\\U00011C8F\\U00011D00-\\U00011D06\\U00011D08\\U00011D09\\U00011D0B-\\U00011D30\\U00011D46\\U00011D50-\\U00011D59\\U00011D60-\\U00011D65\\U00011D67\\U00011D68\\U00011D6A-\\U00011D89\\U00011D98\\U00011DA0-\\U00011DA9\\U00011EE0-\\U00011EF2\\U00011FB0\\U00011FC0-\\U00011FD4\\U00012000-\\U00012399\\U00012400-\\U0001246E\\U00012480-\\U00012543\\U00012F90-\\U00012FF0\\U00013000-\\U0001342E\\U00014400-\\U00014646\\U00016800-\\U00016A38\\U00016A40-\\U00016A5E\\U00016A60-\\U00016A69\\U00016A70-\\U00016ABE\\U00016AC0-\\U00016AC9\\U00016AD0-\\U00016AED\\U00016B00-\\U00016B2F\\U00016B40-\\U00016B43\\U00016B50-\\U00016B59\\U00016B5B-\\U00016B61\\U00016B63-\\U00016B77\\U00016B7D-\\U00016B8F\\U00016E40-\\U00016E96\\U00016F00-\\U00016F4A\\U00016F50\\U00016F93-\\U00016F9F\\U00016FE0\\U00016FE1\\U00016FE3\\U00017000-\\U000187F7\\U00018800-\\U00018CD5\\U00018D00-\\U00018
D08\\U0001AFF0-\\U0001AFF3\\U0001AFF5-\\U0001AFFB\\U0001AFFD\\U0001AFFE\\U0001B000-\\U0001B122\\U0001B150-\\U0001B152\\U0001B164-\\U0001B167\\U0001B170-\\U0001B2FB\\U0001BC00-\\U0001BC6A\\U0001BC70-\\U0001BC7C\\U0001BC80-\\U0001BC88\\U0001BC90-\\U0001BC99\\U0001D2E0-\\U0001D2F3\\U0001D360-\\U0001D378\\U0001D400-\\U0001D454\\U0001D456-\\U0001D49C\\U0001D49E\\U0001D49F\\U0001D4A2\\U0001D4A5\\U0001D4A6\\U0001D4A9-\\U0001D4AC\\U0001D4AE-\\U0001D4B9\\U0001D4BB\\U0001D4BD-\\U0001D4C3\\U0001D4C5-\\U0001D505\\U0001D507-\\U0001D50A\\U0001D50D-\\U0001D514\\U0001D516-\\U0001D51C\\U0001D51E-\\U0001D539\\U0001D53B-\\U0001D53E\\U0001D540-\\U0001D544\\U0001D546\\U0001D54A-\\U0001D550\\U0001D552-\\U0001D6A5\\U0001D6A8-\\U0001D6C0\\U0001D6C2-\\U0001D6DA\\U0001D6DC-\\U0001D6FA\\U0001D6FC-\\U0001D714\\U0001D716-\\U0001D734\\U0001D736-\\U0001D74E\\U0001D750-\\U0001D76E\\U0001D770-\\U0001D788\\U0001D78A-\\U0001D7A8\\U0001D7AA-\\U0001D7C2\\U0001D7C4-\\U0001D7CB\\U0001D7CE-\\U0001D7FF\\U0001DF00-\\U0001DF1E\\U0001E100-\\U0001E12C\\U0001E137-\\U0001E13D\\U0001E140-\\U0001E149\\U0001E14E\\U0001E290-\\U0001E2AD\\U0001E2C0-\\U0001E2EB\\U0001E2F0-\\U0001E2F9\\U0001E7E0-\\U0001E7E6\\U0001E7E8-\\U0001E7EB\\U0001E7ED\\U0001E7EE\\U0001E7F0-\\U0001E7FE\\U0001E800-\\U0001E8C4\\U0001E8C7-\\U0001E8CF\\U0001E900-\\U0001E943\\U0001E94B\\U0001E950-\\U0001E959\\U0001EC71-\\U0001ECAB\\U0001ECAD-\\U0001ECAF\\U0001ECB1-\\U0001ECB4\\U0001ED01-\\U0001ED2D\\U0001ED2F-\\U0001ED3D\\U0001EE00-\\U0001EE03\\U0001EE05-\\U0001EE1F\\U0001EE21\\U0001EE22\\U0001EE24\\U0001EE27\\U0001EE29-\\U0001EE32\\U0001EE34-\\U0001EE37\\U0001EE39\\U0001EE3B\\U0001EE42\\U0001EE47\\U0001EE49\\U0001EE4B\\U0001EE4D-\\U0001EE4F\\U0001EE51\\U0001EE52\\U0001EE54\\U0001EE57\\U0001EE59\\U0001EE5B\\U0001EE5D\\U0001EE5F\\U0001EE61\\U0001EE62\\U0001EE64\\U0001EE67-\\U0001EE6A\\U0001EE6C-\\U0001EE72\\U0001EE74-\\U0001EE77\\U0001EE79-\\U0001EE7C\\U0001EE7E\\U0001EE80-\\U0001EE89\\U0001EE8B-\\U0001EE9B\\U0001EEA1-\\U0001EEA3\\U0001EEA5-\\U0001EEA9\\
U0001EEAB-\\U0001EEBB\\U0001F100-\\U0001F10C\\U0001FBF0-\\U0001FBF9\\U00020000-\\U0002A6DF\\U0002A700-\\U0002B738\\U0002B740-\\U0002B81D\\U0002B820-\\U0002CEA1\\U0002CEB0-\\U0002EBE0\\U0002F800-\\U0002FA1D\\U00030000-\\U0003134A.,?!"\'=$|<>[\\]{}-]';
let pattern = pattern_from_uu.replace(/\\U000([a-f\d]+)/gi, "\\u{$1}");
console.log("Your regex is:\n/" + pattern + "/gu");
const texts = ["Test=😕查看","°°^ Marting 10202029 Offline!\"§$%&/()!\"§$%&/()After this we want to keep the allowed special chars: .,-?!\"'=$|<>[]{}"];
const reg = new RegExp(pattern, "gu")
for (const text of texts) {
console.log(text.replace(reg, ""));
}
See the generated regex demo.
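The same idea can be sketched with a much shorter, hand-written list of ranges. Everything here is an assumption for illustration: the ranges are a small subset and NOT the full \p{L}\p{N} set, the names (allowedRanges, sanitize) are made up, and unlike the snippet above it replaces each run of disallowed characters with a single space, as the question asks. It only relies on the u flag and \u{...} escapes from ES2015.

```javascript
// Illustrative subset only: NOT the full \p{L}\p{N} set.
const allowedRanges = [
  [0x30, 0x39],     // ASCII digits
  [0x41, 0x5A],     // A-Z
  [0x61, 0x7A],     // a-z
  [0x3040, 0x30FF], // Hiragana and Katakana
  [0x4E00, 0x9FFF]  // CJK Unified Ideographs
];

// Turn a code point into a \u{...} escape usable with the u flag.
const toEscape = (codePoint) => "\\u{" + codePoint.toString(16) + "}";

const rangeSource = allowedRanges
  .map(([low, high]) => toEscape(low) + "-" + toEscape(high))
  .join("");

// The allowed special characters, with ] escaped and - placed last.
const allowedPunctuation = ".,?!\"'=$|<>[\\]{}-";

// Match every run of characters that is NOT in the allowed set.
const disallowed = new RegExp("[^" + rangeSource + allowedPunctuation + "]+", "gu");

// Hypothetical helper: replace each run of disallowed characters with one space.
function sanitize(input) {
  return input.replace(disallowed, " ");
}

console.log(sanitize("Test=😕查看")); // "Test= 查看"
```

Note that a plain space is not in the allowed set either, so existing spaces are matched and written back as a single space.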

Enhancing String Literals Delimiters to Support Raw Text Swift

I recently found these code snippets in the Swift 5 book.
print(#"Write an interpolated string in Swift using \(multiplier)."#)
// Prints "Write an interpolated string in Swift using \(multiplier)."
print(#"6 times 7 is \#(6 * 7)."#)
// Prints "6 times 7 is 42."
I learnt it was an accepted proposal in Swift 5 for enhancing string literals delimiters to support raw text, with so many examples given.
My question is when and how is it used in practical cases because from the examples given above, I would still clearly achieve what I want to even without the # signs!
To give just one example where it is very useful: writing a regex. Previously it was a nightmare, as you had to escape all special characters. E.g.
let regex1 = "\\\\[A-Z]+[A-Za-z]+\\.[a-z]+"
Can now be replaced with
let regex2 = #"\\[A-Z]+[A-Za-z]+\.[a-z]+"#
Much easier to write. Now when you find a regex online, you can just copy and paste it in without having to spend ages escaping special characters.
Edit:
You can read more here:
https://www.hackingwithswift.com/articles/162/how-to-use-raw-strings-in-swift

What's the point of nesting brackets in Lua?

I'm currently teaching myself Lua for iOS game development, since I've heard lots of very good things about it. I'm really impressed by the level of documentation there is for the language, which makes learning it that much easier.
My problem is that I've found a Lua concept that nobody seems to have a "beginner's" explanation for: nested brackets for quotes. For example, I was taught that long strings with escaped single and double quotes like the following:
string_1 = "This is an \"escaped\" word and \"here\'s\" another."
could also be written without the overall surrounding quotes. Instead one would simply replace them with double brackets, like the following:
string_2 = [[This is an "escaped" word and "here's" another.]]
Those both make complete sense to me. But I can also write the string_2 line with "nested brackets," which include equal signs between both sets of the double brackets, as follows:
string_3 = [===[This is an "escaped" word and "here's" another.]===]
My question is simple. What is the point of the syntax used in string_3? It gives the same result as string_1 and string_2 when given as an input for print(), so I don't understand why nested brackets even exist. Can somebody please help a noob (me) gain some perspective?
It would be used if your string contains a substring that is equal to the delimiter. For example, the following would be invalid:
string_2 = [[This is an "escaped" word, the characters ]].]]
Therefore, in order for it to work as expected, you would need to use a different string delimiter, like in the following:
string_3 = [===[This is an "escaped" word, the characters ]].]===]
I think it's safe to say that not a lot of string literals contain the substring ]], in which case there may never be a reason to use the above syntax.
It helps to, well, nest them:
print [==[malucart[[bbbb]]]bbbb]==]
Will print:
malucart[[bbbb]]]bbbb
But if that's not useful enough, you can use them to put whole programs in a string (loadstring is the Lua 5.1 name; Lua 5.2 and later use load instead):
loadstring([===[print "o m g"]===])()
Will print:
o m g
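I personally use them for my static/dynamic library implementation. If you don't know whether the program contains a closing bracket with the same number of =s, you can determine a safe level with something like this:
local c = 0
-- Raise the level until the closing delimiter no longer occurs in prog.
-- The fourth argument (true) makes string.find do a plain-text search.
while string.find(prog, "]" .. string.rep("=", c) .. "]", 1, true) do
  c = c + 1
end
-- do stuff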
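The level search above can also be done outside Lua, for example in a build script that generates Lua source. The following is a minimal sketch in Python; the function name lua_long_string is hypothetical. It additionally guards against text that ends in "]" or "]=", which would merge with the closing delimiter. It does not handle Lua's rule that a newline directly after the opening bracket is dropped, so text starting with a newline would need extra care.

```python
def lua_long_string(text: str) -> str:
    """Wrap text in the lowest-level Lua long bracket that cannot close early."""
    level = 0
    while True:
        closer = "]" + "=" * level + "]"
        # The first occurrence of the closer must be the one appended here.
        # Searching text + closer also catches text ending in "]" or "]=",
        # which would otherwise combine with the delimiter and close early.
        if (text + closer).find(closer) == len(text):
            return "[" + "=" * level + "[" + text + closer
        level += 1
```

For example, lua_long_string('a ]] b') yields [=[a ]] b]=], while text without any closing brackets keeps the plain [[...]] form.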

lexer/parser ambiguity

How does a lexer solve this ambiguity?
/*/*/
How is it that it doesn't just say, oh yeah, that's the beginning of a multi-line comment, followed by another multi-line comment?
Wouldn't a greedy lexer just return the following tokens?
/*
/*
/
I'm in the midst of writing a shift-reduce parser for CSS and yet this simple comment thing is in my way. You can read this question if you want some more background information.
UPDATE
Sorry for leaving this out in the first place. I'm planning to add extensions to the CSS language in this form /* # func ( args, ... ) */ but I don't want to confuse an editor which understands CSS but not this extension comment of mine. That's why the lexer just can't ignore comments.
One way to do it is for the lexer to enter a different internal state on encountering the first /*. For example, flex calls these "start conditions" (matching C-style comments is one of the examples on that page).
The simplest way would probably be to lex the comment as one single token - that is, don't emit a "START COMMENT" token, but instead continue reading in input until you can emit a "COMMENT BLOCK" token that includes the entire /*(anything)*/ bit.
Since comments are not relevant to the actual parsing of executable code, it's fine for them to basically be stripped out by the lexer (or at least, clumped into a single token). You don't care about token matches within a comment.
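The single-token approach can be sketched as a scanner with two states. This is a minimal illustration in Python, not a CSS lexer: the token names COMMENT and CHAR are made up, every non-comment character is emitted as a placeholder token, and raising on an unterminated comment is just one possible choice. The comment text is kept in the token, which matters if extension syntax lives inside comments.

```python
def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source into ("COMMENT", text) and ("CHAR", ch) tokens."""
    tokens = []
    i = 0
    while i < len(source):
        if source.startswith("/*", i):
            # Comment state: look for the terminator only after the
            # two-character opener, so the "*" of "/*" is never reused.
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError(f"unterminated comment at offset {i}")
            tokens.append(("COMMENT", source[i:end + 2]))
            i = end + 2
        else:
            # Default state: placeholder for the real token rules.
            tokens.append(("CHAR", source[i]))
            i += 1
    return tokens
```

With this scanner, tokenize("/*/*/") returns a single COMMENT token containing all five characters, because the second /* is consumed as ordinary comment content.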
In most languages, this is not ambiguous: the first slash and asterisk are consumed to produce the "start of multi-line comment" token. It is followed by a slash, which is plain "content" within the comment, and finally the last two characters are the "end of multi-line comment" token.
Since the first 2 characters are consumed, the first asterisk cannot also be used to produce an end of comment token. I just noted that it could produce a second "start of comment" token... oops, that could be a problem, depending on how much context is available to the parser.
I speak here of tokens, assuming a parser-level handling of the comments. But the same applies to a lexer, whereby the underlying rule is to start with '/*' and then not stop till '*/' is found. Effectively, a lexer-level handling of the whole comment wouldn't be confused by the second "start of comment".
Since CSS does not support nested comments, your example would typically parse into a single token, COMMENT.
That is, the lexer would see /* as a start-comment marker and then consume everything up to and including a */ sequence.
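Use the regexp's algorithm: once you reach a closing */, search backwards from the current location towards the beginning of the string for the opening /*.
// currentLocation points at the '/' of a closing "*/".
if (chars[currentLocation] == '/' && chars[currentLocation - 1] == '*') {
    // Start at currentLocation - 3 so that the '*' of the opening "/*"
    // is not the same character as the '*' of the closing "*/".
    for (int i = currentLocation - 3; i >= 0; i--) {
        if (chars[i] == '/' && chars[i + 1] == '*') {
            // .......
        }
    }
}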
It's like applying the regexp /\*([^*]|\*+[^*/])*\*+/ greedy and bottom-up.
One way to solve this would be to have your lexer return:
/
*
/
*
/
And have your parser deal with it from there. That's what I'd probably do for most programming languages, as the /'s and *'s can also be used for multiplication and other such things, which are all too complicated for the lexer to worry about. The lexer should really just be returning elementary symbols.
If what the token is starts to depend too much on context, what you're looking for may very well be a simpler token.
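That being said, CSS is not a programming language, so /'s and *'s are barely overloaded. Afaik * is otherwise only the universal selector and / only separates some shorthand values, and outside of strings the sequence /* only ever starts a comment. So I'd be very tempted to just pass the whole thing as a comment token unless you have a good reason not to: /\*.*?\*/ (non-greedy, with . allowed to match newlines)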
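One caveat when matching a whole comment with a single regexp: a greedy .* runs to the last */ in the input, so two separate comments and everything between them collapse into one match. A minimal sketch in Python (the sample CSS string is made up) compares the greedy and non-greedy forms:

```python
import re

css = "a { } /* one */ b { } /* two */"

# Greedy: ".*" extends to the LAST "*/" in the input.
greedy = re.findall(r"/\*.*\*/", css, re.DOTALL)

# Non-greedy: ".*?" stops at the FIRST "*/" after each "/*".
lazy = re.findall(r"/\*.*?\*/", css, re.DOTALL)

# The input from the question is matched as one whole comment.
tricky = re.findall(r"/\*.*?\*/", "/*/*/", re.DOTALL)
```

Here greedy holds one oversized match, lazy holds the two comments separately, and tricky holds the single five-character comment.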

Resources