I am having trouble dealing with special characters in a REST URI. We need to handle them because the URI can reference an ID, and that ID may contain a special character, for example:
http://myServer/v1/myResource/123<abc>
The solution we currently have is not the easiest for consumers of the API, in that special characters need to be double-encoded. So, for the above, the URI would look like:
http://myServer/v1/myResource/123%253Cabc%253E
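For illustration, the double-encoded form can be reproduced with a short JavaScript sketch (assuming the ID is encoded as a single path segment):

```javascript
// Single encoding: '<' -> '%3C', '>' -> '%3E'
const once = encodeURIComponent('123<abc>');   // "123%3Cabc%3E"

// Double encoding: the '%' signs themselves become '%25'
const twice = encodeURIComponent(once);        // "123%253Cabc%253E"

console.log(`http://myServer/v1/myResource/${twice}`);
```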
I am hoping we can do something better, and I have two questions.
(1) Why isn't single-encoding the URI enough? Specifically, if I change the URI to the following:
http://myServer/v1/myResource/123%3Cabc%3E
then I get an error along the lines of:
A potentially dangerous Request.Path value was detected from the client (<)
This is happening, in part, because something is decoding the URI prematurely. Interestingly, the URI does work when I try it on the same machine as the REST web server. That is, the "premature" URI decoding only happens when the REST request comes from a remote machine.
(2) requestPathInvalidCharacters
By setting requestPathInvalidCharacters to an empty string, most of the characters only need to be single-encoded. This is a reasonable solution for us, though I haven't been able to determine whether it opens us up to any security concerns. From what I can tell it does not, since this web server will only be used for REST. However, I am not certain.
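For reference, a minimal sketch of what that looks like in web.config, shown here cleared as described (by default the rejected set includes characters such as <, >, *, %, &, :, \):

```xml
<!-- web.config: clears the set of characters ASP.NET rejects in the
     request path (the default set includes <, >, *, %, &, :, \). -->
<system.web>
  <httpRuntime requestPathInvalidCharacters="" />
</system.web>
```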
Does setting requestPathInvalidCharacters to an empty string open any security issues when the server is only used for a REST API and, if so, can you provide an example where security might be an issue?
Thanks in advance,
Eric
Related
I'm familiar with using templates in NodeJS like EJS to escape data for an HTML context.
However, what would be the recommended way to safely output data from an API? Since the intended usage is not known, it can't simply be escaped with HTML encoding.
Currently I'm basically just doing res.json({}) for the output.
I'm thinking that while some fields of incoming data can be validated (like 'email'), other fields that are more vague (like 'description') could contain any of the characters someone might use for XSS, like < and ;. The options on OWASP seem limited (https://cheatsheetseries.owasp.org/cheatsheets/Nodejs_Security_Cheat_Sheet.html); there is node-esapi (https://github.com/ESAPI/node-esapi), but it was last updated seven years ago.
Is it up to the recipient to handle? So if someone sends "alert(0);" as their description, I allow it through, as that is valid JSON: {"description":"alert(0);"}?
If someone wants to send <script>tweet(document.cookie)</script> in a description let them do so. They may have perfectly valid and legitimate reasons to do that. Perhaps they're writing an article about security and this is just an example of an XSS attack.
This isn't a threat to your database but to your web pages.
Security is neither a server-only nor a client-only job. It's a bit of both and the way you mitigate threats depends on the context.
When writing to a database, it's not XSS you have to worry about but things like SQL injection for example.
XSS is a threat for web applications and the way to mitigate that threat is to properly encode and/or escape any user-controlled input before it gets into the DOM.
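To make that concrete, here is a minimal browser-side sketch (the endpoint and element ID are illustrative): the API returns the raw JSON, and the consumer inserts the value as text rather than HTML at the point where it enters the DOM:

```javascript
// Fetch the raw JSON from the API (endpoint is illustrative).
fetch('/api/profile/42')
  .then((res) => res.json())
  .then(({ description }) => {
    // Unsafe: innerHTML would parse <script> or <img onerror=...> markup.
    // document.querySelector('#description').innerHTML = description;

    // Safe: textContent treats the value as plain text in this HTML
    // context, so "<script>..." is displayed, never executed.
    document.querySelector('#description').textContent = description;
  });
```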
I'm currently trying to implement an Express application using the Serverless framework on API Gateway + Lambda. Everything worked as expected until I started introducing request signing on our end. The signing works by signing the complete URL, including the query string, with a secret token. Unfortunately, it seems that either API Gateway or CloudFront is re-sorting the query string alphabetically, which causes the checksum generated on our side to differ from the one the client generated.
What our Express server sees:
https://example.com/endpoint?build_number=1&platform=ios
What the client was sending:
https://example.com/endpoint?platform=ios&build_number=1
As you can see, the query parameters got re-sorted alphabetically, which is not behaviour I would expect.
Any idea?
I'd suggest that your algorithm is destined to give you issues, because the query string is a set of key/value pairs with no intrinsic ordering.
There should not be an expectation that it will pass through any particular system in any particular order. The same is true of request headers. Some libraries that build HTTP requests store query string parameters in an intermediate dictionary/hash structure, so even absent the issue you see here (which I suspect to be API Gateway, since CloudFront claims to preserve the ordering), the order may not survive the trip. Relying on it is arguably a sub-optimal design, since ?color=red&size=large is (again arguably, but pretty compellingly so) exactly the same thing as ?size=large&color=red.
My guess would be that API Gateway may be optimizing its ability to perform caching (which does not actually use the CloudFront cache -- it has its own implementation) by canonicalizing the query string ordering.
But, as I suggest above, your algorithm should require a binary, lexical sort (case-sensitive, rather than "alphabetical", which might be assumed to be case-insensitive) of the query parameters on the sending end, and the same thing again on the receiving end.
This seems like unnecessary complexity, but it is almost certainly why the various AWS signing algorithms require the query string keys and values (and headers, for the same reason) to be sorted before signing: you simply can't rely on client libraries, proxies, or other entities to handle them consistently.
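A minimal Node.js sketch of that idea (the path, parameters, and secret are illustrative): both sides canonicalize the query string with a code-point sort before computing the HMAC, so re-ordering in transit no longer matters:

```javascript
const crypto = require('crypto');

// Canonicalize the query string: encode each pair consistently, then apply
// a binary (UTF-16 code unit) sort, which is case-sensitive by default in JS.
function canonicalQueryString(params) {
  return Object.entries(params)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .sort()
    .join('&');
}

// Sign the canonical form with a shared secret; client and server do the
// same, so intermediaries re-sorting parameters can't break the checksum.
function sign(path, params, secret) {
  const canonical = `${path}?${canonicalQueryString(params)}`;
  return crypto.createHmac('sha256', secret).update(canonical).digest('hex');
}

// Example: the order the parameters arrive in no longer matters.
console.log(sign('/endpoint', { platform: 'ios', build_number: '1' }, 'shared-secret'));
console.log(sign('/endpoint', { build_number: '1', platform: 'ios' }, 'shared-secret'));
```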
I have been looking around SO and other online resources but can't seem to find how this is done. I was wondering how things like magnet links work on torrent websites. They automatically open up an application and pass it the appropriate params. How could I create one to send custom params to a program from the web?
Thanks
s654m
I wouldn't say this is an answer, but it is too long to fit in a comment.
Apps tend to register as authorities that can open a specific scheme. I don't know how it's done in desktop apps (especially because it will vary by OS), but on Android you can catch schemes or base URLs with intent filters.
The way it works (and I'm pretty sure the functionality is cross-OS) is:
Your app tells the system it can "read" a specific scheme or base URL (it could be magnet:// or even http://www.twitter.com/).
When you try to open a URI (uniform resource identifier, a supergroup that includes URLs), the system searches for any application that was registered for that kind of URI. I guess it runs from more specific and complete formats down to the base. So, for instance, this tweet: https://twitter.com/korcholis/status/491724155176222720 may be traced in this order:
https://twitter.com/korcholis/status/491724155176222720 Oh, no registrar? Moving on
https://twitter.com/korcholis/status Nothing yet? Ok
https://twitter.com/korcholis Nnnnnnope?
https://twitter.com Anybody? Ah, you, "Totally Random Twitter Client", know how to handle these links? Then it's yours
This random twitter client gets the full URI and does something accordingly.
As you can see, nobody had a chance to claim https:// itself, since another application caught the URI before them. In this case, that "nobody" would be your browsers.
The system also defines, somehow, a default handler. This is the real reason browsers battle to be your default browser of choice: they want to be the default applications that catch http://, https://, and probably some more.
The true wonder here is that, as long as there's an app that catches a scheme, you can set whichever one you want. For instance, it's common practice for apps from the same developer to register the same schemes, in case the developer wants to share tasks between them; this effectively makes the user work with a group of apps. So, one app can just offer data such as:
my-own-scheme://user/12
While another app is registered to get links that start with
my-own-scheme://
So, if you want to make your own schemes, that's fine, as long as they don't collide with others'. And if you want to read others' schemes, well, that's up to you to research. See? This is not a real answer, but I hope it removes almost all doubt.
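On Android specifically, the registration described above is done with an intent filter in the manifest. A minimal sketch using the hypothetical my-own-scheme from the example (the activity name is illustrative):

```xml
<!-- AndroidManifest.xml: registers this activity for any URI whose
     scheme is my-own-scheme, e.g. my-own-scheme://user/12 -->
<activity android:name=".DeepLinkActivity" android:exported="true">
  <intent-filter>
    <action android:name="android.intent.action.VIEW" />
    <category android:name="android.intent.category.DEFAULT" />
    <category android:name="android.intent.category.BROWSABLE" />
    <data android:scheme="my-own-scheme" />
  </intent-filter>
</activity>
```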
We know the URL itself is not a secure way to pass or store information. Too many programs will perform unexpected processing on the URL or even ship it over the network, and generally speaking, the URL is not treated with a high regard for its privacy.
In the past we've seen Bitcoin wallets, for example, which relied on keeping a URL secret, but they found out the hard way that there are too many ways in which a URL (sent via Skype, or emailed, or even just typed into the Google Chrome omnibar) will get stored by a remote server, and possibly displayed publicly.
And so I thought the URL would be forsaken forever as a means of carrying any private data, despite being extremely convenient... except now I've seen a few sites that use URL fragments -- the portion of the URL after the '#' -- as a kind of 'secure' storage. I think the expectation is that Google won't parse the fragment and let it show up in search results, so the data shouldn't be published.
But that seems like a pretty weak basis for the security of your product. There would be a huge benefit to having a way to securely move data in URL fragments, but can we really rely on that?
So, I would really like to understand... Can anyone explain, what is the security model for fragment identifiers?
Tyler Close and others who did the security architecture for Waterken did the relevant research for this. They use unguessable strings in URI fragments as web-keys:
This leakage of a permission bearing URL via the Referer header is only a problem in practice if the target host of a hyperlink is different from the source host, and so potentially malicious. RFC 2616 foresaw the danger of such leakage of information and so provided security guidance in section 15.1.3:
"Because the source of a link might be private information or might reveal an otherwise private information source, … Clients SHOULD NOT include a Referer header field in a (non-secure) HTTP request if the referring page was transferred with a secure protocol."
Unfortunately, clients have implemented this guidance to the letter, meaning the Referer header is sent if both the referring page and the destination page use HTTPS, but are served by different hosts.
This enthusiastic use of the Referer header would present a significant barrier to implementation of the web-key concept were it not for one unrelated, but rather fortunate, requirement placed on use of the Referer header. Section 14.36 of RFC 2616, which governs use of the Referer header, states that: "The URI MUST NOT include a fragment." Testing of deployed web browsers has shown this requirement is commonly implemented.
Putting the unguessable permission key in the fragment segment produces an https URL that looks like: <https://www.example.com/app/#mhbqcmmva5ja3>.
Fetching a representation
Placing the key in the URL fragment component prevents leakage via the Referer header but also complicates the dereference operation, since the fragment is also not sent in the Request-URI of an HTTP request. This complication is overcome using the two cornerstones of Web 2.0: JavaScript and XMLHttpRequest.
So, yes, you can use fragment identifiers to hold secrets, though those secrets could be stolen and exfiltrated if your application is susceptible to XSS, and there is no equivalent of http-only cookies for fragment identifiers.
I believe Waterken mitigates this by removing the secret from the fragment before it runs any application code in the same way many sensitive daemons zero-out their argv.
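A minimal sketch of that pattern in browser JavaScript (the endpoint and header name are illustrative, not Waterken's actual API): read the key from the fragment, scrub it from the address bar, then send it explicitly with the request:

```javascript
// The unguessable key lives after '#', so the browser never sends it
// in the Request-URI or the Referer header.
const key = window.location.hash.slice(1); // e.g. "mhbqcmmva5ja3"

// Scrub the secret from the address bar before application code runs,
// in the spirit of daemons zeroing out their argv (mentioned above).
history.replaceState(null, '', window.location.pathname);

// Hand the key to the server explicitly, via XMLHttpRequest's successor, fetch.
fetch('/app/resource', { headers: { Authorization: `WebKey ${key}` } })
  .then((res) => res.json())
  .then((data) => console.log(data));
```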
The part after the # is not any more secure than any other part of the URL. The only difference is that it MAY be omitted from the web server access log. But the web server is not the threat.
As long as you store the secret, either in a URL or somewhere else where it can become public, it is insecure. That is why we invented passwords: they are supposed to exist only in people's heads.
The problem is not to find a way to store a secret in a URL.
That is impossible because, as you say, it will probably become public. If all you need is the URL, and it goes public, nobody cares what the original data is, because they have what they need: the URL. So relying on the URL alone for authentication is... moronic.
The problem is to store your secrets in a secure way, and to create secure systems.
I don't know if the title is clear enough; anyway, what I need to do is quite simple. I have some content you can access by an API call on my server. This content is user-related, so when you request access to it, you must first wait for the owner to authorize you. Since this content will probably be embedded into blog articles or forum posts, I want it to be accessible only from the URL the user authorized.
The only way that came to my mind is to check, in some secure way, where the request is coming from. The problem with this approach is that anybody could create a fake request that uses a valid URL while actually coming from a non-authorized one.
I'm looking for a way to solve this problem, even if it doesn't involve checking the actual URL but uses some other approach entirely. Feel free to ask questions if this is not clear enough.
With Sessions:
If you generate a secure token (most languages have libraries to do such a thing), you will have to persist it, probably in a session on your server. When you render the page that will access the other content, you can add that token to the link/form post/ajax request on the page you wish to be able to access it from.
You would then match that token against the value in the user session; if the token doesn't match, you return an error of some sort. This solution relies on the security of your session.
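A minimal Express sketch of the session approach (route names and the payload are illustrative, and express-session is assumed):

```javascript
const express = require('express');
const session = require('express-session');
const crypto = require('crypto');

const app = express();
app.use(session({ secret: 'session-signing-secret', resave: false, saveUninitialized: false }));

// Render the page that links to the protected content, embedding a token.
app.get('/page', (req, res) => {
  const token = crypto.randomBytes(32).toString('hex'); // secure random token
  req.session.contentToken = token;                      // persist it in the session
  res.send(`<a href="/content?token=${token}">View content</a>`);
});

// Serve the content only if the token matches the one in the session.
app.get('/content', (req, res) => {
  if (req.query.token !== req.session.contentToken) {
    return res.status(403).send('Invalid token');
  }
  res.json({ content: 'the protected content' }); // illustrative payload
});

app.listen(3000);
```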
Without Sessions:
If you don't have sessions, to get around server persistence, you can use a trick that Amazon S3 uses for security. You would create something like a JSON string that grants authorization for the next 30 seconds, 5 minutes, whatever is appropriate. It would need to include a timestamp so that the value changes. You would use a secret key on your server that you combine with the JSON string to create a hash value.
Your request would have to include the JSON string as one request parameter. You would need to base64-encode it (or use some other means) so that you don't run into special characters that aren't allowed over HTTP. The second parameter would be the output of your hash operation.
When you get the request, you would decode the JSON string so it is exactly the same as before and hash it with your secret key. If that value matches the one sent with the request, it means those are the two values you sent to the page that ultimately requested the content.
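A Node.js sketch of that scheme (field names and the TTL are illustrative; 'base64url' requires Node 16+):

```javascript
const crypto = require('crypto');
const SECRET = process.env.TOKEN_SECRET || 'dev-only-secret'; // server-side secret key

// Create the grant: a JSON string with an expiry timestamp, base64url-encoded
// (URL-safe), plus an HMAC of that encoded string under the secret key.
function issueGrant(ttlSeconds) {
  const json = JSON.stringify({ expires: Date.now() + ttlSeconds * 1000 });
  const data = Buffer.from(json).toString('base64url');
  const hash = crypto.createHmac('sha256', SECRET).update(data).digest('hex');
  return { data, hash }; // both go into the request as parameters
}

// Verify: recompute the HMAC over the received string, compare in constant
// time, then check that the grant has not expired.
function verifyGrant(data, hash) {
  const expected = crypto.createHmac('sha256', SECRET).update(data).digest('hex');
  if (hash.length !== expected.length ||
      !crypto.timingSafeEqual(Buffer.from(hash), Buffer.from(expected))) {
    return false;
  }
  const { expires } = JSON.parse(Buffer.from(data, 'base64url').toString());
  return Date.now() < expires;
}

// Usage: the server embeds the pair in the page; the later request echoes it back.
const grant = issueGrant(30);
console.log(verifyGrant(grant.data, grant.hash)); // true (within 30 seconds)
```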
Warnings:
You need to make sure you're using up-to-date algorithms and properly audited security libraries to do this stuff; do not try to write your own. There may be other ways around this depending on what context this ultimately ends up in, but I think it should be relatively secure. Also, I'm not a security expert; I would consult one if you're dealing with very sensitive information.