Variable amount of params express js - node.js

I am trying to define a route in express js that takes an unknown amount N of parameters. It should match the following routes, capturing all digit groups:
/scope
/scope/1/12
/scope/1/12/123
etc.
I wrote a regex for the matching of the n-amount of numbers, as follows:
/(?:\/?(\d+)\/?)/g
The global /g flag, however, doesn't seem to be allowed; see (The regex parser of express js on github). Am I doing something wrong here? I could work around this in a very ugly and dirty way by doing something like:
^\/scope\/?(\d+)?\/?(\d+)?\/?(\d+)?
But this is not dynamic, feels dirty, and if I add deeper levels of scoping I will always need to add more /?(\d+) regex parts, which is a model that does not fit my business logic. I am sure there must be a better way...
Okay, after a discussion with #vks, which was useful but unfortunately did not answer the question, we came to the conclusion that this is not a regex problem. With the /g modifier, a regex capturing all digit groups can quite easily be written, even in JavaScript's very limited regex engine.
The question can now be formulated more clearly: since Express does not accept a full regex from beginning to end, but rather wraps the regex you use in a route in its own start and end anchors and does not allow the /g modifier, what is the idiomatic Express way to solve this problem?

^\/scope(?:\/\d+)*$
You can try this. See demo.
https://regex101.com/r/eZ0yP4/30
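Since the route pattern cannot carry the /g flag, one way to combine the two regexes from this thread is sketched below: validate the whole path with the anchored pattern above, then pull out the digit groups with a separate global match. This is only a sketch. parseScopeIds is a hypothetical helper name, and the code deliberately uses no Express calls so that it runs in plain Node.js; inside a route handler the input would presumably be req.path.

```javascript
// Anchored pattern from the answer: "/scope" followed by zero or more "/<digits>" parts.
const SCOPE_PATH = /^\/scope(?:\/\d+)*$/;

// Hypothetical helper: returns the numeric parts of a scope path,
// or null when the path does not have the expected shape.
function parseScopeIds(path) {
  if (!SCOPE_PATH.test(path)) {
    return null;
  }
  // The global flag is fine here because this regex is not handed to the router.
  const groups = path.match(/\d+/g) || [];
  return groups.map(Number);
}

console.log(parseScopeIds('/scope'));          // []
console.log(parseScopeIds('/scope/1/12/123')); // [ 1, 12, 123 ]
console.log(parseScopeIds('/scope/abc'));      // null
```

Returning null for a non-matching path lets the caller respond with a 404 instead of silently treating the request as an empty scope.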


Arbitrary lookaheads in PLY

I am trying to parse a config, which would translate to a structured form. This new form requires that comments within the original config be preserved. The parsing tool is PLY. I am running into an issue with my current approach, which I will describe in detail below, with links to code as well. The config file is going to contain multiple config blocks, each of which is going to be of the following format:
<optional comments>
start_of_line request_stmts(one or more)
indent reply_stmts (zero or more)
include_stmts (type 3)(zero or more)
An example config file looks like this.
While I am able to partially parse the config file with the grammar below, I fail to accommodate comments which would exist within the block.
For example, a block like this raises syntax errors, and any comments in a block of config fail to parse.
<optional comments>
start_of_line request_stmts(type 1)(one or more)
indent reply_stmts (type 2)(one or more)
<comments>
include_stmts (type 3)(one or more)(optional)
The parser.out mentions one shift/reduce conflict which I think arises because once the reply_stmts are parsed, a comments section which follows could mark start of a new block or comments within the subblock. Current grammar parsing result for the example file
[['# test comment ', '# more of this', '# does this make sense'], 'DEFAULT', [['x', '=',
'y']], [['y', '=', '1']], ['# Transmode', '# maybe something else', '# comment'],
'/random/location/test.user']
As you might notice, the second config block completely misses the username, request_stmt, and reply_stmt sections.
What I have tried
I have tried moving the comments section around in the grammar, by specifying it before specific blocks or in the statement grammar. In the code link pasted above, the comments section has been specified in the overall statement grammar. Both of these approaches fail to parse comments within a config block.
username : comments username
| username
include_stmt : comments includes
| includes
I have two main questions:
Is there a mistake in my implementation or understanding of LR parsing which, once fixed, would let me achieve what I want?
Is there a better way to achieve the same goal than my current approach? (PLY-fu, different parser, different grammar)
P.S. I wasn't able to include the actual code in the question; it is mentioned in the comments.
You are correct that the problem is that when the parser sees a comment, it cannot know whether the comment belongs to the same section or whether the previous section is finished. In the former case, the parser needs to shift the comment, while in the latter case it needs to reduce the configuration section.
Since there could be any number of comments, the necessary lookahead could be arbitrarily large, in which case LR parsing wouldn't be possible. But a simple trick can reduce the lookahead to two tokens: just combine consecutive comments into a single token.
Any LR(k) grammar has an equivalent LR(1) grammar. In effect, the LR(1) grammar works by delaying all decisions for k-1 tokens, accumulating these tokens into the parser state. That's a massive increase in grammar size, but it's usually possible to achieve the same effect in other ways, and that's certainly the case here.
The basic idea is that any comment is (temporarily) accumulated into a list of comments. When a non-comment token is encountered, this temporary list is attached to that token.
This can be done either in the lexical scanner or in the parser actions, depending on your inclinations.
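The accumulate-and-attach idea can be sketched independently of PLY. The snippet below is written in JavaScript purely for illustration, not as PLY code: it assumes the scanner has already produced a flat list of tokens with hypothetical type and value fields, and that comment tokens have the type 'COMMENT'. The function name attachComments is also made up.

```javascript
// Hypothetical post-lexing pass: comment tokens are buffered and then attached
// to the next non-comment token, so the parser never sees a COMMENT token.
function attachComments(tokens) {
  const result = [];
  let pending = []; // comments seen since the last non-comment token

  for (const token of tokens) {
    if (token.type === 'COMMENT') {
      pending.push(token.value);
    } else {
      result.push({ ...token, comments: pending });
      pending = [];
    }
  }
  // Comments after the last real token have nothing to attach to,
  // so they are returned separately.
  return { tokens: result, trailing: pending };
}

const { tokens, trailing } = attachComments([
  { type: 'COMMENT', value: '# test comment' },
  { type: 'COMMENT', value: '# more of this' },
  { type: 'NAME', value: 'DEFAULT' },
  { type: 'COMMENT', value: '# Transmode' },
  { type: 'INCLUDE', value: '/random/location/test.user' },
]);
console.log(tokens.map((t) => [t.type, t.comments.length])); // [ [ 'NAME', 2 ], [ 'INCLUDE', 1 ] ]
console.log(trailing); // []
```

With the comments carried on the tokens themselves, the grammar no longer needs a comments rule, which is where the shift/reduce conflict came from.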
Before attempting all that, you should make sure that retaining comments is really useful to your application. Comments are normally not relevant to the semantics of a program (or configuration file), and it would certainly be much simpler for the lexer to just drop comments into the bit-bucket. If your application will end up reformatting the input, then it will have to retain comments. But if it only needs to extract information from the configuration, putting a lot of effort into handling comments is hard to justify.

Capybara: Should I get rid of extracted constants or keep them?

I was wondering about some best practices regarding extraction of selectors to constants. As a general rule, it is usually recommended to extract magic numbers and string literals to constants so they can be reused, but I am not sure if this is really a good approach when dealing with selectors in Capybara.
At the moment, I have a file called "selectors.rb" which contains the selectors that I use. Here is part of it:
SELECTORS = {
checkout: {
checkbox_agreement: 'input#agreement-1',
input_billing_city: 'input#billing\:city',
input_billing_company: 'input#billing\:company',
input_billing_country: 'input#billing\:country_id',
input_billing_firstname: 'input#billing\:firstname',
input_billing_lastname: 'input#billing\:lastname',
input_billing_postcode: 'input#billing\:postcode',
input_billing_region: 'input#billing\:region_id',
input_billing_street1: 'input#billing\:street1',
....
}
In theory, I put my selectors in this file, and then I could do something like this:
find(SELECTORS[:checkout][:input_billing_city]).click
There are several problems with this:
If I want to know the selector that is used, I have to look it up
If I change the name in selectors.rb, I could forget to change it somewhere else in the file which will result in find(nil).click
With the example above, I can't use this selector with fill_in(SELECTORS[:checkout][:input_billing_city]), because it requires an ID, name or label
There are probably a few more problems with that, so I am considering getting rid of the constants. Has anyone been in a similar spot? What is a good way to deal with this situation?
Someone mentioned the SitePrism gem to me: https://github.com/natritmeyer/site_prism
A Page Object Model DSL for Capybara
SitePrism gives you a simple, clean and semantic DSL for describing
your site using the Page Object Model pattern, for use with Capybara
in automated acceptance testing.
It is very helpful in that regard and I have adjusted my code accordingly.

Ternary operator should not be used on a single line in Node.js. Why?

Consider the following sample codes:
Sample 1:
var IsAdminUser = (User.Privileges == AdminPrivileges)
? 'yes'
: 'no';
console.log(IsAdminUser);
Sample 2:
var IsAdminUser = (User.Privileges == AdminPrivileges)?'yes': 'no';
console.log(IsAdminUser);
I am very comfortable with the second sample and I code in that style, but I was told that it is the wrong way of doing it, without any supporting reasons.
Why is it recommended not to use a single-line ternary operator in Node.js?
Can anyone shed some light on the reason for this?
Thanks in advance for your help.
With all coding standards, they are generally for readability and maintainability. My guess is the author finds it more readable on separate lines. The compiler / interpreter for your language will handle it all the same. As long as you / your project have a set standard and stick to it, you'll be fine. I recommend that the standards be worked on or at least reviewed by everyone on the project before casting them in stone. I think that if you're breaking it up on separate lines like that, you may as well define an if/else conditional block and use that.
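As a quick check of the claim that the interpreter treats both layouts the same, the sketch below runs the two samples from the question side by side. User and AdminPrivileges are stand-in values invented here, since the question does not show how they are defined. As far as I know, automatic semicolon insertion only applies where the next token cannot continue the statement, so a line starting with ? is still parsed as part of the ternary.

```javascript
// Stand-in values (assumptions; the question does not define these).
const AdminPrivileges = 'admin';
const User = { Privileges: 'admin' };

// Sample 1 layout: the operators lead each continuation line.
const multiLine = (User.Privileges == AdminPrivileges)
  ? 'yes'
  : 'no';

// Sample 2 layout: everything on a single line.
const singleLine = (User.Privileges == AdminPrivileges) ? 'yes' : 'no';

console.log(multiLine, singleLine, multiLine === singleLine); // yes yes true
```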
Be wary of coding standards rules that do not have a justification.
Personally, I do not like the ternary operator as it feels unnatural to me and I always have to read the line a few times to understand what it's doing. I find separate if/else blocks easier for me to read. Personal preference of course.
It is in fact wrong to put the ? on a new line; even though it doesn’t hurt in practice.
The reason is a JS feature called “Automatic Semicolon Insertion”. When a var statement ends with a newline (without a trailing comma, which would indicate that more declarations are to follow), your JS interpreter should automatically insert a semicolon.
This semicolon would have the effect that IsAdminUser is assigned a boolean value (namely the result of User.Privileges == AdminPrivileges). After that, a new (invalid) expression would start with the question mark of what you think is a ternary operator.
As mentioned, most JS interpreters are smart enough to recognize that you have a newline where you shouldn’t have one, and implicitly fix your ternary operator. And, when minifying your script, the newline is removed anyway.
So, no problem in practice, but you’re relying on an implicit fix of common JS engines. It’s better to write the ternary operator like this:
var foo = bar ? "yes" : "no";
Or, for larger expressions:
var foo = bar ?
"The operation was successful" : "The operation has failed.";
Or even:
var foo = bar ?
"Congratulations, the operation was a total success!" :
"Oh, no! The operation has horribly failed!";
I completely disagree with the person who made this recommendation. The ternary operator is a standard feature of all 'C' style languages (C,C++,Java,C#,Javascript etc.), and most developers who code in these languages are completely comfortable with the single line version.
The first version just looks weird to me. If I was maintaining code and saw this, I would correct it back to a single line.
If you want verbose, use if-else. If you want neat and compact use a ternary.
My guess is the person who made this recommendation simply wasn't very familiar with the operator, so found it confusing.
Because it's easier on the eye and easier to read. It's much easier to see what your first snippet is doing at a glance - I don't even have to read to the end of a line. I can simply look at one spot and immediately know what values IsAdminUser will have for what conditions. Much the same reason as why you wouldn't write an entire if/else block on one line.
Remember that these are style conventions and are not necessarily backed up by objective (or technical) reasoning.
The reason for having ? and : on separate lines is so that it's easier to figure out what changed if your source control has a line-by-line comparison.
If you've just changed the stuff between the ? and : and everything is on a single line, the entire line can be marked as changed (based on your comparison tool).

What's the name for hyphen-separated case?

This is PascalCase: SomeSymbol
This is camelCase: someSymbol
This is snake_case: some_symbol
So my question is whether there is a widely accepted name for this: some-symbol? It's commonly used in URLs.
There isn't really a standard name for this case convention, and there is disagreement over what it should be called.
That said, as of 2019, there is a strong case to be made that kebab-case is winning:
https://trends.google.com/trends/explore?date=all&q=kebab-case,spinal-case,lisp-case,dash-case,caterpillar-case
spinal-case is a distant second, and no other terms have any traction at all.
Additionally, kebab-case has entered the lexicon of several javascript code libraries, e.g.:
https://lodash.com/docs/#kebabCase
https://www.npmjs.com/package/kebab-case
https://v2.vuejs.org/v2/guide/components-props.html#Prop-Casing-camelCase-vs-kebab-case
However, there are still other terms that people use. Lisp has used this convention for decades as described in this Wikipedia entry, so some people have described it as lisp-case. Some other forms I've seen include caterpillar-case, dash-case, and hyphen-case, but none of these is standard.
So the answer to your question is: No, there isn't a single widely-accepted name for this case convention analogous to snake_case or camelCase, which are widely-accepted.
It's referred to as kebab-case. See lodash docs.
It's also sometimes known as caterpillar-case
This is the most famous case, and it has many names:
kebab-case: It's the name most adopted by official software
caterpillar-case
dash-case
hyphen-case or hyphenated-case
lisp-case
spinal-case
css-case
slug-case
friendly-url-case
As the character (-) is referred to as "hyphen" or "dash", it seems more natural to name this "dash-case", or "hyphen-case" (less frequently used).
As mentioned in Wikipedia, "kebab-case" is also used. Apparently (see answer) this is because the character would look like a skewer... It needs some imagination though.
Used in lodash lib for example.
Recently, "dash-case" was used by
Angular (https://angular.io/guide/glossary#case-types)
NPM modules
https://www.npmjs.com/package/case-dash (removed ?)
https://www.npmjs.com/package/dasherize
Adding the correct link here: Kebab Case,
which is all lowercase with - separating words.
I've always called it, and heard it be called, 'dashcase.'
There is no standardized name.
Libraries like jQuery and lodash refer to it as kebab-case. So does the Vue.js JavaScript framework. However, I am not sure whether it's safe to declare that it's referred to as kebab-case in the JavaScript world.
I've always known it as kebab-case.
On a funny note, I've heard people call it a SCREAM-KEBAB when all the letters are capitalized.
Kebab Case Warning
I've always liked kebab-case as it seems the most readable when you need whitespace. However, some programs interpret the dash as a minus sign, and it can cause problems as what you think is a name turns into a subtraction operation.
first-second // first minus second?
ten-2 // ten minus two?
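JavaScript is one language where this happens. The sketch below uses a made-up styles object and size variable to show that dot access on a kebab-cased property name is parsed as a subtraction, while bracket access with a quoted key reads the property that was meant.

```javascript
// Hypothetical data: one kebab-cased key and one plain key.
const styles = { 'font-size': 12, font: 20 };
const size = 2;

// Bracket notation with a quoted key reads the kebab-cased property.
const intended = styles['font-size'];

// Dot notation stops at the dash: this is (styles.font) - (size).
const surprising = styles.font-size;

console.log(intended);   // 12
console.log(surprising); // 18
```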
Also, some frameworks parse dashes in kebab cased property. For example, GitHub Pages uses Jekyll, and Jekyll parses any dashes it finds in an md file. For example, a file named 2020-1-2-homepage.md on GitHub Pages gets put into a folder structured as \2020\1\2\homepage.html when the site is compiled.
Snake_case vs kebab-case
A safer alternative to kebab-case is snake_case, or SCREAMING_SNAKE_CASE, as underscores cause less confusion when compared to a minus sign.
I'd simply say that it was hyphenated.
Worth mentioning, from vim-abolish:
https://github.com/tpope/vim-abolish/blob/master/doc/abolish.txt#L152
dash-case or kebab-case
In Salesforce, it is referred to as kebab-case. See below:
https://developer.salesforce.com/docs/component-library/documentation/lwc/lwc.js_props_names
Here is a more recent discombobulation. Documentation everywhere in angular JS and Pluralsight courses and books on angular, all refer to kebab-case as snake-case, not differentiating between the two.
It's too bad caterpillar-case did not stick, because snake_case and caterpillar-case are easily remembered and actually look like what they represent (if you have a good imagination).
My ECMAScript proposal for String.prototype.toKebabCase.
String.prototype.toKebabCase = function () {
  return this.valueOf().replace(/-/g, ' ').split('')
    .reduce((str, char) => char.toUpperCase() === char ?
      `${str} ${char}` :
      `${str}${char}`, ''
    ).replace(/ * /g, ' ').trim().replace(/ /g, '-').toLowerCase();
};
This casing can also be called a "slug", and the process of turning a phrase into it "slugify".
https://hexdocs.pm/slugify/Slug.html

wildcards in node-http-proxy router table

Can somebody tell me how to use wildcards in the router table of node-http-proxy?
For example, for wildcard subdomains, something like *.domain.de
I know that regular expressions are used, but I can't get it to work.
I tried
'([a-zA-Z0-9_]).domain.de': '127.0.0.1:8085',
and
'([^.]*).domain.de' : '127.0.0.1:8085'
but none seem to redirect.
I've not done this myself, but I would think that the whole string needs to be a regular expression. So it would be something like:
'[a-zA-Z0-9_]+\\.domain\\.de': '127.0.0.1:8085',
Note the escaping of the dots (the backslash is doubled because a JavaScript string literal would otherwise drop it) and the + quantifier, without which only a one-character subdomain would match. In fact, this would be simpler (though perhaps not as secure) if that format is correct:
'.*\\.domain\\.de': '127.0.0.1:8085',
Or even:
'\\w*\\.domain\\.de': '127.0.0.1:8085',
Sadly, and as usual with all things Node, you are expected to "know" this stuff - mainly by reading the source code :( This is one of the key issues that puts me off using Node in the real world.
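The patterns in this thread can be checked without the proxy at all. The sketch below assumes that the router turns each key into a regular expression roughly as new RegExp('^' + key, 'i'); that is an assumption about node-http-proxy's internals, not something confirmed here, and compileRoute is a hypothetical helper. What is certain is how JavaScript treats the strings themselves: a single backslash before . or w inside a string literal is dropped, so it has to be doubled to reach the regex engine.

```javascript
// Assumption: the router compiles each table key like this.
function compileRoute(key) {
  return new RegExp('^' + key, 'i');
}

// First attempt from the question: no quantifier, so only one character
// is allowed before the (unescaped, match-anything) dot.
const singleChar = compileRoute('([a-zA-Z0-9_]).domain.de');

// Quantifier added and backslashes doubled so the dots stay literal.
const anyLabel = compileRoute('[a-zA-Z0-9_]+\\.domain\\.de');

// With single backslashes the escapes never reach the regex engine.
const lostEscape = '\w*\.domain\.de';

console.log(singleChar.test('a.domain.de'));    // true
console.log(singleChar.test('www.domain.de'));  // false
console.log(anyLabel.test('www.domain.de'));    // true
console.log(anyLabel.test('wwwXdomain.de'));    // false
console.log(lostEscape === 'w*.domain.de');     // true
```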
