Pattern match in nodejs rest url - node.js

In my Node app I am using router.use to do the token validation.
I want to skip validation for a few URLs, so I want to check whether the URL matches and, if so, call next().
But the URL I want to skip has a URL parameter.
E.g., this is the URL: /service/:appname/getall.
This has to be matched against /service/blah/getall and return true.
How can this be achieved without splitting the URL by '/'?
Thanks in advance.

Each parameter in the template matches :[^/]+, i.e. a : followed by one or more characters other than /.
If you find the parameters in the template and replace each of them with a regex that matches any string, you can do what you asked for.
let template = '/service/:appname/getall'
let url = '/service/blah/getall'
// find params and replace them with a capturing regex group
template = template.replace(/:[^/]+/g, '([^/]+)')
// the template is now the regex string '/service/([^/]+)/getall',
// which is essentially '/service/ ANYTHING THAT'S NOT A '/' /getall'
// convert to a regex and only match from start to end
template = new RegExp(`^${template}$`)
// ^ = beginning
// $ = end
// the template is now /^\/service\/([^\/]+)\/getall$/
const matches = url.match(template)
// matches will be null if there is no match.
console.log(matches)
// ["/service/blah/getall", "blah"]
// it will be [full_match, param1, param2...]
Edit: use \w+ instead of [^/]+ to find the parameter names in the template, because:
The name of route parameters must be made up of “word characters” ([A-Za-z0-9_]). https://expressjs.com/en/guide/routing.html#route-parameters
I believe this is true for most parsers so I have updated my answer. The following test data will only work with this updated method.
let template = '/service/:app-:version/get/:amt';
let url = '/service/blah-v1.0.0/get/all';
template = template.replace(/:\w+/g, `([^/]+)` );
template = new RegExp(`^${template}$`);
let matches = url.match(template);
console.log(url);
console.log(template);
console.log(matches);
// Array(4) ["/service/blah-v1.0.0/get/all", "blah", "v1.0.0", "all"]
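To tie this back to the original question, the matching can be wrapped in a small helper that a router.use handler consults before validating the token. This is a minimal sketch, not the answerer's code: the names compileTemplate, shouldSkipValidation, skipTemplates and validateToken are hypothetical, and the skip list is an assumed example.

```javascript
// Hypothetical helper: turn a route template into an anchored RegExp.
function compileTemplate(template) {
  // Escape regex metacharacters in the literal parts of the template.
  const escaped = template.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Replace each :param (word characters only) with a capture group.
  const source = escaped.replace(/:\w+/g, '([^/]+)');
  return new RegExp(`^${source}$`);
}

// Assumed skip list, compiled once at startup.
const skipTemplates = ['/service/:appname/getall'];
const skipPatterns = skipTemplates.map(compileTemplate);

function shouldSkipValidation(path) {
  return skipPatterns.some((pattern) => pattern.test(path));
}

console.log(shouldSkipValidation('/service/blah/getall')); // true
console.log(shouldSkipValidation('/service/blah/delete')); // false

// In an Express app this could be wired up roughly as follows
// (validateToken is a hypothetical validator):
// router.use((req, res, next) => {
//   if (shouldSkipValidation(req.path)) return next();
//   validateToken(req, res, next);
// });
```

Compiling the templates once, rather than on every request, keeps the per-request work down to a few regex tests.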

Related

Is there a regex to be able to match two url's , one that has a wildcard and one that doesn't?

I am writing a program in Nodejs with the following scenarios.
I have an array of url's that include wildcards, such as the following:
https://*.example.com/example/login
http://www.example2.com/*/example2/callback
Secondly, I have an incoming redirect url that I need to validate matches what is in the array of url's above. I was wondering if there was a way using Regex or anything else that I can use something like arr.includes(incomingRedirectUrl) and compare the two.
I can match non-wildcard url's using array.includes(incomingRedirectUrl), but when it comes to matching the array that has wildcards, I cannot think of a solution.
For example,
https://x.example.com/example/login should work because it matches the first url in the above example, only replacing the "*" with the x.
Is there a way I can achieve this? Or do I have to break down the url's using something like slice at the "*" to compare the two?
Thanks in advance for any help.
for (let i = 0; i < arr.length; i++) {
if (arr[i].indexOf('*') !== -1) {
wildcardArr.push(arr[i]);
} else {
noWildcardArr.push(arr[i]);
}
}
Note: the reason I check noWildcardArr first is that most of the valid redirect URLs do not contain a wildcard.
if (noWildcardArr.includes(incomingRedirectUrl)) {
//Validated correct url, proceed with the next part of my code (this part already works)
} else if (wildcardArr.includes(incomingRedirectUrl)) {
//need to figure out this logic here, not sure if the above is possible without formatting wildcardArr but url should be validated if url matches with wildcard
} else {
log.error('authorize: Bad Request - Invalid Redirect URL');
context.res = {
status: 400,
body: 'Bad Request - Invalid Redirect URL',
};
}
You could compile your URL array into proper regex and then iterate over them to see if it matches. Similar to something like a web framework would do that allows URL path parameters such as /users/:id.
function makeMatcher(urls) {
const compiled = urls.map(url => {
// regex escape the url but dont escape *
let exp = url.replace(/[-[\]{}()+?.,\\^$|#\s]/g, '\\$&');
// replace * with .+ for the wildcard
exp = exp.replaceAll('*', '.+');
// the expression is used to create the match function
return new RegExp(`^${exp}$`);
});
// return the match function, which returns true on the first match,
// or false if there is no match at all
return function match(url) {
return compiled.some(regex => regex.test(url));
};
}
const matches = makeMatcher([
'https://*.example.com/example/login',
'http://www.example2.com/*/example2/callback'
]);
// these 2 should match
console.log(matches('https://x.example.com/example/login'));
console.log(matches('http://www.example2.com/foo/example2/callback'));
// this one not
console.log(matches('http://nope.example2.com/foo/example2/callback'));
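One caveat with replacing * by .+ is that .+ also matches /, so a wildcard meant for a subdomain can absorb extra path segments. For redirect validation that may be too permissive. The sketch below is a stricter variant under the assumption that a wildcard should never span /, ?, # or a backslash. The name makeStrictMatcher is hypothetical, and this is not a complete redirect-URL validator.

```javascript
// Hypothetical stricter variant of makeMatcher: each * matches one or more
// characters, but never '/', '?', '#' or a backslash.
function makeStrictMatcher(patterns) {
  const compiled = patterns.map((pattern) => {
    // Regex-escape the pattern but leave * alone (same character set as above).
    const escaped = pattern.replace(/[-[\]{}()+?.,\\^$|#\s]/g, '\\$&');
    // split/join replaces every * without relying on String.prototype.replaceAll.
    const source = escaped.split('*').join('[^/?#\\\\]+');
    return new RegExp(`^${source}$`);
  });
  return (url) => compiled.some((regex) => regex.test(url));
}

const isAllowed = makeStrictMatcher([
  'https://*.example.com/example/login',
  'http://www.example2.com/*/example2/callback',
]);

console.log(isAllowed('https://x.example.com/example/login')); // true
console.log(isAllowed('http://www.example2.com/foo/example2/callback')); // true
// With .+ this URL would match the first pattern; here it is rejected.
console.log(isAllowed('https://evil.test/x.example.com/example/login')); // false
```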

How to extract value from a url

I have a few requests coming in which follow the pattern below
contacts/id/
contacts/x/id/name
contacts/x/y/id/address
contacts/z/address/
I want to extract the value which follows right after 'contacts'
In above cases,
1. id
2. x
3. x
4. z
Here is my regex
(?<=contacts)\/[^\/]+
https://regex101.com/r/ePmv5Y/1
But it also matches the leading '/', e.g. /id, /x, etc.
How do I adjust it to get rid of this leading slash?
We can use match() here:
var urls = ["contacts/id/", "contacts/x/id/name", "contacts/x/y/id/address", "contacts/z/address/"];
for (var i=0; i < urls.length; ++i) {
var output = urls[i].match(/\bcontacts\/(.*?)\//)[1];
console.log(urls[i] + " => " + output);
}
"I have a few requests coming in"
If you mean HTTP requests, then this is likely the pathname of the requested URL, and it will start with a /. (In a Node.js server, req.url holds this pathname followed by any query string.)
To match on a URL pathname, you can use this expression: ^\/contacts\/([^/?]+). Here's a link to another regular expression builder that demonstrates it and includes an explanation for every character: https://regexr.com/6tugf
The [^/?] is a negated set that matches any token which is not a / or a ? and the + means that it matches 1 or more of those tokens. It's important to include the ? because otherwise it could match into the query string portion of the URL — for example, in this URL:
https://domain.tld/contacts/x/id/name?filter=recent # URL
/contacts/x/id/name?filter=recent # req.url in Node.js
/contacts/x/id/name # pathname
?filter=recent # query string
And here's a runnable code snippet demonstrating the same expression, using String.prototype.match():
const contactIdRegexp = /^\/contacts\/([^/?]+)/;
const inputs = [
'/contacts/id/', // id
'/contacts/x/id/name', // x
'/contacts/x/y/id/address', // x
'/contacts/z/address/', // z
'/contacts/x/id/name?filter=recent', // x
];
for (const str of inputs) {
const id = str.match(contactIdRegexp)?.[1];
console.log(id);
}
You can add the / inside the lookbehind:
(?<=contacts\/)[^\/]+
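As a quick check of the lookbehind version in JavaScript, here is a small sketch run against the four sample paths from the question. The variable names are illustrative only.

```javascript
// The lookbehind asserts that "contacts/" precedes the match without consuming it,
// so the match itself contains no slash.
const afterContacts = /(?<=contacts\/)[^\/]+/;

const paths = [
  'contacts/id/',
  'contacts/x/id/name',
  'contacts/x/y/id/address',
  'contacts/z/address/',
];

// match() returns null when nothing matches, so guard with optional chaining.
const firstSegments = paths.map((path) => path.match(afterContacts)?.[0]);

console.log(firstSegments); // [ 'id', 'x', 'x', 'z' ]
```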
If you would rather proceed without a regex, you can try the following.
// build the URL object
const url = new URL(`${req.protocol}://${req.get('host')}${req.originalUrl}`);
// extract the pathname and split it on "/"
const pathName = url.pathname.split("/");
// index 0 is the empty string before the leading "/" and index 1 is "contacts",
// so index 2 holds the required value
const val = pathName[2];
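The same idea can be tried outside a request handler. The sketch below assumes the input is a req.url-style string (a path plus an optional query string). The helper name segmentAfterContacts and the base URL http://localhost are placeholders, not part of the original answer.

```javascript
// Hypothetical helper. The base URL is a placeholder that is only needed so
// the URL constructor can parse a relative, req.url-style string.
function segmentAfterContacts(requestUrl) {
  const { pathname } = new URL(requestUrl, 'http://localhost');
  // filter(Boolean) drops the empty strings produced by leading/trailing slashes.
  const segments = pathname.split('/').filter(Boolean);
  return segments[0] === 'contacts' ? segments[1] : undefined;
}

console.log(segmentAfterContacts('/contacts/x/id/name?filter=recent')); // x
console.log(segmentAfterContacts('/contacts/id/')); // id
console.log(segmentAfterContacts('/other/id/')); // undefined
```

Because the query string is parsed off by the URL object, it never leaks into the extracted segment.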

extract a string using regular expression in node

I'm trying to use exec for a regular expression in node. I know the expression works via testing it with an extension in VSCode but when I run the node app it keeps returning null.
str = '\r\nProgram Boot Directory: \\SIMPL\\app01\r\nSource File: C:\\DRI\\DRI\\DRI Conf Room v2 20180419aj\r\nProgram File: DRI Conf Room v2 20180419aj.smw\r\n';
var regex = /\Program File:(.*?)\\/;
var matched = regex.exec(str);
console.log(matched);
You don't have to escape the P at the beginning (\P). The real problem is the \\ at the end of the pattern, which matches a literal backslash, and there is none after Program File: in the string. The string ends with \r\n, so you could match that instead.
If you don't want the leading whitespace in the first capturing group, you could add \s* to match zero or more whitespace characters: /Program File:\s*(.*?)\r\n/
For example:
str = '\r\nProgram Boot Directory: \\SIMPL\\app01\r\nSource File: C:\\DRI\\DRI\\DRI Conf Room v2 20180419aj\r\nProgram File: DRI Conf Room v2 20180419aj.smw\r\n';
var regex = /Program File:(.*?)\r\n/;
var matched = regex.exec(str);
console.log(matched[0]);
console.log(matched[1]);
You can also use the RegExp constructor, which is useful when the pattern is built from a string:
var str = '\r\nProgram \r\nProgram File: DRI 0180419aj.smw\r\n'
.replace(/[\r\n]/g, ''); // removes the new lines before we search
var pattern = 'Program File.+' // write your raw pattern
var re = new RegExp(pattern); // feed that into the RegExp constructor with flags if needed
var result = re.exec(str); // run your search
console.log(result)
Not really sure what your pattern should do, so I just put one there, that matches whatever starts with Program File. If you want all matches, not just the first, just change it to
var re = new RegExp(pattern,'g');
Hope that helps!
The regex syntax you use looks off. Try it like this:
const regex = /^Program File:\s*(.*?)$/gm;
const str = `
Program Boot Directory: \\\\SIMPL\\\\app01
Source File: C:\\\\DRI\\\\DRI\\\\DRI Conf Room v2 20180419aj
Program File: DRI Conf Room v2 20180419aj.smw
`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
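If only one field is needed, the loop above can be reduced to a single match with the m flag. In JavaScript, a multiline $ matches before \r as well as \n, so the \r\n line endings in the original string need no special handling. The helper readField below is a hypothetical name used for illustration.

```javascript
const programOutput =
  '\r\nProgram Boot Directory: \\SIMPL\\app01\r\nSource File: C:\\DRI\\DRI\\DRI Conf Room v2 20180419aj\r\nProgram File: DRI Conf Room v2 20180419aj.smw\r\n';

// Hypothetical helper: return the value after "<label>:" on its own line, or null.
function readField(text, label) {
  // Escape regex metacharacters so the label is matched literally.
  const escapedLabel = label.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // ^ and $ are line anchors because of the m flag; [ \t]* trims spaces and tabs.
  const pattern = new RegExp(`^${escapedLabel}:[ \\t]*(.*?)[ \\t]*$`, 'm');
  const found = text.match(pattern);
  return found ? found[1] : null;
}

console.log(readField(programOutput, 'Program File')); // DRI Conf Room v2 20180419aj.smw
console.log(readField(programOutput, 'Missing Field')); // null
```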

how to display title of websites in node.js

I want to get the titles of different sites, like this:
localhost:1234/index/?url=google.com&url=www.yahoo.com/&url=twitter.com
If I go to this URL, it should crawl all the sites mentioned in the URL and display the title of each website.
- Google
- Yahoo
- Twitter
var Urls = 'localhost:1234/index/?url=google.com&url=www.yahoo.com/&url=twitter.com';
// remove all special characters like '/' '&' and '='
Urls = Urls.replace(/\&/g, '').replace(/\//g, '').replace(/\=/g, '');
// split it based on url
Urls = Urls.split('url');
// remove the first element, as it is not required
Urls.shift();
Urls.forEach(function (url) {
//split each element based on '.'
url = url.split('.');
url.forEach(function (ele) {
// if its not 'www' and 'com'
if (ele !== 'www' && ele !== 'com') {
// the title of url
console.log(ele);
}
})
})
You need to remove all special characters as above using regular expressions, and if the URLs contain ".org", ".in", etc., those also need to be included in the if condition.
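An alternative to stripping characters by hand is to let the built-in URL and URLSearchParams classes do the parsing. The sketch below makes two assumptions: an http:// scheme is prepended, because the URL constructor would otherwise read localhost: as the scheme, and the helper siteLabel is a hypothetical name. Like the code above, it only derives a label from the hostname and does not fetch the page's actual title.

```javascript
// Scheme added so that "localhost" is parsed as the host rather than the scheme.
const requestUrl = new URL(
  'http://localhost:1234/index/?url=google.com&url=www.yahoo.com/&url=twitter.com'
);

// getAll() returns every value of a repeated query parameter, in order.
const targets = requestUrl.searchParams.getAll('url');
console.log(targets); // [ 'google.com', 'www.yahoo.com/', 'twitter.com' ]

// Hypothetical helper: naive label taken from the hostname. It drops "www"
// and keeps the first remaining label, so it ignores multi-part suffixes.
function siteLabel(target) {
  const { hostname } = new URL(`http://${target}`);
  const labels = hostname.split('.').filter((label) => label !== 'www');
  return labels[0];
}

console.log(targets.map(siteLabel)); // [ 'google', 'yahoo', 'twitter' ]
```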

How to use stringByAddingPercentEncodingWithAllowedCharacters() for a URL in Swift 2.0

I was using this, in Swift 1.2
let urlwithPercentEscapes = myurlstring.stringByAddingPercentEscapesUsingEncoding(NSUTF8StringEncoding)
This now gives me a warning asking me to use
stringByAddingPercentEncodingWithAllowedCharacters
I need to use a NSCharacterSet as an argument, but there are so many and I cannot determine what one will give me the same outcome as the previously used method.
An example URL I want to use will be like this
http://www.mapquestapi.com/geocoding/v1/batch?key=YOUR_KEY_HERE&callback=renderBatch&location=Pottsville,PA&location=Red Lion&location=19036&location=1090 N Charlotte St, Lancaster, PA
The URL character sets for encoding seem to be sets that trim my URL, e.g.:
The path component of a URL is the component immediately following the
host component (if present). It ends wherever the query or fragment
component begins. For example, in the URL
http://www.example.com/index.php?key1=value1, the path component is
/index.php.
However, I don't want to trim any aspect of it.
When I used my string (for example, myurlstring) as it was, the request would fail.
But when I used the following, there were no issues. It encoded the string with some magic and I could get my URL data.
let urlwithPercentEscapes = myurlstring.stringByAddingPercentEscapesUsingEncoding(NSUTF8StringEncoding)
As it
Returns a representation of the String using a given encoding to
determine the percent escapes necessary to convert the String into a
legal URL string
Thanks
For the given URL string the equivalent to
let urlwithPercentEscapes = myurlstring.stringByAddingPercentEscapesUsingEncoding(NSUTF8StringEncoding)
is the character set URLQueryAllowedCharacterSet
let urlwithPercentEscapes = myurlstring.stringByAddingPercentEncodingWithAllowedCharacters( NSCharacterSet.URLQueryAllowedCharacterSet())
Swift 3:
let urlwithPercentEscapes = myurlstring.addingPercentEncoding( withAllowedCharacters: .urlQueryAllowed)
This character set is meant for the query component, i.e. everything after the question mark in the URL string.
Since the method stringByAddingPercentEncodingWithAllowedCharacters can return nil, use optional bindings as suggested in the answer of Leo Dabus.
It will depend on your url. If your url is a path you can use the character set
urlPathAllowed
let myFileString = "My File.txt"
if let urlwithPercentEscapes = myFileString.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed) {
print(urlwithPercentEscapes) // "My%20File.txt"
}
Creating a Character Set for URL Encoding
urlFragmentAllowed
urlHostAllowed
urlPasswordAllowed
urlQueryAllowed
urlUserAllowed
You can create also your own url character set:
let myUrlString = "http://www.mapquestapi.com/geocoding/v1/batch?key=YOUR_KEY_HERE&callback=renderBatch&location=Pottsville,PA&location=Red Lion&location=19036&location=1090 N Charlotte St, Lancaster, PA"
let urlSet = CharacterSet.urlFragmentAllowed
.union(.urlHostAllowed)
.union(.urlPasswordAllowed)
.union(.urlQueryAllowed)
.union(.urlUserAllowed)
extension CharacterSet {
static let urlAllowed = CharacterSet.urlFragmentAllowed
.union(.urlHostAllowed)
.union(.urlPasswordAllowed)
.union(.urlQueryAllowed)
.union(.urlUserAllowed)
}
if let urlwithPercentEscapes = myUrlString.addingPercentEncoding(withAllowedCharacters: .urlAllowed) {
print(urlwithPercentEscapes) // "http://www.mapquestapi.com/geocoding/v1/batch?key=YOUR_KEY_HERE&callback=renderBatch&location=Pottsville,PA&location=Red%20Lion&location=19036&location=1090%20N%20Charlotte%20St,%20Lancaster,%20PA"
}
Another option is to use URLComponents to properly create your url
Swift 3.0 (From grokswift)
Creating URLs from strings is a minefield for bugs. Just miss a single / or accidentally URL encode the ? in a query and your API call will fail and your app won’t have any data to display (or even crash if you didn’t anticipate that possibility). Since iOS 8 there’s a better way to build URLs using NSURLComponents and NSURLQueryItems.
func createURLWithComponents() -> URL? {
var urlComponents = URLComponents()
urlComponents.scheme = "http"
urlComponents.host = "www.mapquestapi.com"
urlComponents.path = "/geocoding/v1/batch"
let key = URLQueryItem(name: "key", value: "YOUR_KEY_HERE")
let callback = URLQueryItem(name: "callback", value: "renderBatch")
let locationA = URLQueryItem(name: "location", value: "Pottsville,PA")
let locationB = URLQueryItem(name: "location", value: "Red Lion")
let locationC = URLQueryItem(name: "location", value: "19036")
let locationD = URLQueryItem(name: "location", value: "1090 N Charlotte St, Lancaster, PA")
urlComponents.queryItems = [key, callback, locationA, locationB, locationC, locationD]
return urlComponents.url
}
Below is the code to access url using guard statement.
guard let url = createURLWithComponents() else {
print("invalid URL")
return nil
}
print(url)
Output:
http://www.mapquestapi.com/geocoding/v1/batch?key=YOUR_KEY_HERE&callback=renderBatch&location=Pottsville,PA&location=Red%20Lion&location=19036&location=1090%20N%20Charlotte%20St,%20Lancaster,%20PA
In Swift 3.1, I am using something like the following:
let query = "param1=value1&param2=" + (valueToEncode.addingPercentEncoding(withAllowedCharacters: .alphanumerics) ?? "")
It's safer than .urlQueryAllowed and the others, because it will encode every character other than A-Z, a-z and 0-9. This works better when the value you are encoding may use special characters like ?, &, =, + and spaces.
In my case where the last component was non latin characters I did the following in Swift 2.2:
extension String {
func encodeUTF8() -> String? {
//If I can create an NSURL out of the string nothing is wrong with it
if let _ = NSURL(string: self) {
return self
}
//Get the last component from the string this will return subSequence
let optionalLastComponent = self.characters.split { $0 == "/" }.last
if let lastComponent = optionalLastComponent {
//Get the string from the sub sequence by mapping the characters to [String] then reduce the array to String
let lastComponentAsString = lastComponent.map { String($0) }.reduce("", combine: +)
//Get the range of the last component
if let rangeOfLastComponent = self.rangeOfString(lastComponentAsString) {
//Get the string without its last component
let stringWithoutLastComponent = self.substringToIndex(rangeOfLastComponent.startIndex)
//Encode the last component
if let lastComponentEncoded = lastComponentAsString.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.alphanumericCharacterSet()) {
//Finally append the original string (without its last component) to the encoded part (encoded last component)
let encodedString = stringWithoutLastComponent + lastComponentEncoded
//Return the string (original string/encoded string)
return encodedString
}
}
}
return nil;
}
}
Swift 4.0
let encodedData = myUrlString.addingPercentEncoding(withAllowedCharacters: CharacterSet.urlHostAllowed)
