I'm currently trying to implement an Express application using the Serverless framework on API Gateway + Lambda. Everything worked as expected until I started introducing request signing on our end. The signing scheme signs the complete URL, including the query string, using a secret token. Unfortunately, it seems that either API Gateway or CloudFront is re-sorting the query string alphabetically, which causes the checksum generated on our side to differ from the one the client generated.
What our Express server sees:
https://example.com/endpoint?build_number=1&platform=ios
What the client was sending:
https://example.com/endpoint?platform=ios&build_number=1
As you can see, the query parameters got re-sorted alphabetically, which is not behaviour I would expect.
Any idea?
I'd suggest that your algorithm is destined to give you issues, because the query string is a set of key/value pairs with no intrinsic ordering.
There should not be an expectation that it will pass through any particular system in any particular order; the same is true of request headers. Some libraries that build HTTP requests store query string parameters in an intermediate dictionary/hash structure, so even absent the issue you see here (which I suspect is API Gateway, since CloudFront claims to preserve the ordering), the ordering is not guaranteed to survive. Relying on it is arguably a sub-optimal design, since ?color=red&size=large is (again arguably, but pretty compellingly so) exactly the same thing as ?size=large&color=red.
My guess would be that API Gateway may be optimizing its ability to perform caching (which does not actually use the CloudFront cache -- it has its own implementation) by canonicalizing the query string ordering.
But, as I suggest above, your algorithm should require a binary, lexical sort of the query parameters (case-sensitive, rather than "alphabetical", which might be assumed to be case-insensitive) on the sending end, and the same thing again on the receiving end.
This seems like unnecessary complexity, but it is almost certainly why the various AWS signing algorithms require the query string keys and values (and the headers, for the same reason) to be sorted before signing: you simply can't rely on client libraries, proxies, or other entities to handle them consistently.
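For illustration, here's a rough sketch of what such a canonicalization could look like on both ends (TypeScript/Node; the HMAC-SHA256 scheme, route and secret are made up, not your actual algorithm):

import { createHmac } from "crypto";

// Build a canonical query string: percent-encode each pair, then sort with a
// plain byte-wise (case-sensitive) comparison so both ends agree on the order.
function canonicalQuery(params: Record<string, string>): string {
  return Object.entries(params)
    .map(([k, v]) => [encodeURIComponent(k), encodeURIComponent(v)])
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([k, v]) => `${k}=${v}`)
    .join("&");
}

// Sign path + canonical query with the shared secret; the receiver repeats the
// exact same canonicalization before comparing signatures.
function sign(path: string, params: Record<string, string>, secret: string): string {
  return createHmac("sha256", secret)
    .update(`${path}?${canonicalQuery(params)}`)
    .digest("hex");
}

// Both calls produce the same signature, regardless of the order in which the
// parameters happen to arrive:
sign("/endpoint", { platform: "ios", build_number: "1" }, "secret");
sign("/endpoint", { build_number: "1", platform: "ios" }, "secret");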
Related
So we were starting a new project from scratch and one of the developers suggested that we shouldn't have any GET API requests, since POST APIs are better in every way (at least when using a mobile client).
On looking into this further, it does seem that POST can do everything GET can do, and do it better:
slightly more secure, as parameters are not in the URL
larger size limit than a GET request
So is there even a single reason to have a GET API? (This will only be used from a mobile client, so browser-specific caching doesn't affect us.)
Is there ever a need to have GET request API as POST is better in every way?
In general, yes. In your specific circumstances -- maybe no.
GET and POST are method tokens.
The request method token is the primary source of request semantics
They are a form of metadata included in the HTTP request so that general-purpose components can be aware of the request semantics and contribute constructively.
POST is, in a sense, the wildcard method - it can mean anything. But one consequence of this is that, because the method has unconstrained semantics, general-purpose components can't do anything useful other than pass the request along.
GET, however, has safe semantics (which include idempotent semantics). Because the request is idempotent, general-purpose components know that they can resend a GET request when the server returns no response (i.e. messages being lost on an unreliable transport); general-purpose components also know that representations of the resource can be pre-fetched, reducing perceived latency.
You dismissed caching as a concern earlier, but you may want to rethink that - the cache constraint is an important element that helped the web take over the world.
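As a quick illustration (an Express-style sketch in TypeScript; the routes and values are invented), a GET response can carry caching metadata that any general-purpose intermediary understands, while a POST can only be forwarded:

import express from "express";

const app = express();
app.use(express.json());

// GET: safe and idempotent. A shared cache (CDN, reverse proxy) that sees this
// response may store it and re-serve it for 60 seconds without touching the
// origin server again.
app.get("/products/:id", (req, res) => {
  res.set("Cache-Control", "public, max-age=60");
  res.json({ id: req.params.id, name: "example" });
});

// POST: unconstrained semantics. Intermediaries can't cache, retry or pre-fetch
// this; all they can do is pass the request along.
app.post("/products/:id/query", (req, res) => {
  res.json({ id: req.params.id, filters: req.body });
});

app.listen(3000);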
Reducing everything to POST reduces HTTP from an application for transferring documents over a network to dumb transport.
Using HTTP for transport isn't necessarily wrong: Simple Object Access Protocol (SOAP) works that way, as does gRPC. You still get authorization, and conditional requests; features of HTTP that you might otherwise need to roll your own.
You aren't doing REST at that point, but that's OK; not everybody has to.
That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. (Fielding, 2008)
I am writing a backend for my application that will accept query parameters from the front end and then query my DB based on those parameters. This sounds to me like it should be a GET request, but since I'm passing a lot of params, some of them optional, I think it would be easiest to do a POST request and send the search params in a request body. I know I can convert my params to a query string and append it to my GET request, but there has to be a better way, because I will be passing different data types and will end up having to parse the params on the backend anyway if I do it this way.
This depends heavily on the context, but I would prefer using a GET request in your scenario.
What Request Method should I use
According to the widely accepted convention, one uses:
GET to read existing data
POST to create something new
More details can be found here: https://www.restapitutorial.com/lessons/httpmethods.html
How do I pass the parameters
Regarding the way to pass parameters: this is a less obvious choice. Unless there's something sensitive in the request parameters, it is perfectly fine to send them as part of the URL.
Parameters may be either part of path:
myapi/customers/123
or a query string:
myapi?customer=123
Both options are feasible, and I'd say the choice depends heavily on the application's domain model. One popular rule of thumb (see the sketch after this list) is:
use "parameters as a part of a path" for mandatory parameters
use "parameters as a query string" for optional parameters.
I'd recommend using POST in cases where there are a lot of parameters/options. There are a few reasons why I think it's better than GET:
Your url will be cleaner looking
You hide internal structure from the user (it's still visible if they use the Developer Tools of the browser though)
People can't easily change the options to adjust your query. Having them in the URL makes it simple to just modify and reload with other values; it's more work to do this with a POST.
However, if it's of any use that the URL you end up with can be bookmarked or shared, then you'd want all parameters encoded as part of the query, so using GET would be best in that case.
Another answer stated that POST should be used for creating something new, but I disagree. That might apply to PUT, but it's perfectly fine to use POST to allow more complex structures to be passed even when retrieving existing data.
For example, with POST you can send a JSON body object that has nested structure. This can be very handy and would be difficult to explode into a traditional GET query. You also have to worry about URL-encoding your data then decoding it when receiving it, which is a hassle.
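For illustration, a sketch of such a search endpoint (Express/TypeScript; the body shape is invented) that accepts a nested JSON body via POST:

import express from "express";

const app = express();
app.use(express.json());

// A search "command" sent as a JSON body: nested criteria like these are
// awkward to flatten into a query string but trivial to accept via POST.
app.post("/search", (req, res) => {
  const { text, filters, sort } = req.body as {
    text?: string;
    filters?: { price?: { min?: number; max?: number }; tags?: string[] };
    sort?: { field: string; direction: "asc" | "desc" };
  };
  // ...translate the criteria into a database query here (omitted)...
  res.json({ text, filters, sort, results: [] });
});

app.listen(3000);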
For simple frontend-to-backend communication you don't really need REST to start with, as it targets cases where the server is accessed by a plethora of clients not under your control, or where a client has to access plenty of different servers and should work with all of them. REST should be aimed for if you see benefit in a server that can evolve freely in the future without fear of breaking clients, as they will adapt to changes quite easily. Such strong properties, however, come at a price in terms of development overhead and careful design. Don't get me wrong, you can still aim for a REST architecture, but for such a simple application-to-backend scenario it sounds like overkill.
In a REST architecture a server will usually tell clients how it wants to receive input data. Think of HTML forms, where the method and enctype attributes specify which HTTP method to use and which representation format to convert the input to. Which HTTP method to use depends on the actual use case. If a server constantly receives the same request for the same input parameters and calculating the result is costly, then caching the response once and serving further requests from that cache can take a lot of unnecessary computation off the server. The BBC, for example, claims that the cache is the single most important technology in keeping sites scalable and fast. I once read that they cache most articles for only a minute, but this is sufficient to spare them from retrieving the same content thousands and thousands of times, freeing up resources for other requests or tasks. It is no surprise that caching is also one of the few constraints REST has.
HTTP by default allows caches to store response representations for requested URIs (including any query, path or matrix parameters) if they were requested via safe operations, such as HEAD or GET requests. Any unsafe operation invoked on a URI, however, will lead to cache invalidation and therefore the removal of any stored representations for that target URI. Hence, any follow-up requests for that URI will reach the server so that a response can be processed for the requesting client.
Unfortunately, caching isn't the only factor to consider when deciding between GET and POST; the representation format the client is currently processing also influences the decision. Think of a client processing the previous HTML response received from a server. The HTML response contains a form that teaches the client which fields the server expects as input, as well as the choices the client can make for certain input parameters. HTML is a perfect example of a media type that restricts which HTTP methods are available (GET, as the default method, and POST are supported) and which are not (all of the other HTTP methods). Other representation formats might only support POST in practice (e.g. while application/soap+xml would allow either GET or POST, at least in SOAP 1.2, I have never seen GET requests in reality, so everything is exchanged with POST).
A further point that may prevent you from using GET requests is the de facto limitation on URI length most HTTP implementations have. If you exceed this limitation, some HTTP frameworks might not be able to process the exchanged message. Looking at the Web, however, one can find a workaround for this limitation. In most Web shops the checkout area is split into several pages, where each page consists of a form that collects some input, like address information, bank or payment data, and which together act as a kind of wizard that guides the user through the payment process. Such a wizard style can be implemented here as well: parts of the request are sent via POST to a dedicated endpoint that collects the data, and on the final "page" of the wizard the server asks for confirmation of the collected data and uses that resource as the GET target. This way the response remains cacheable even though the input data exceeded the typical URI limitation imposed by some HTTP frameworks.
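A minimal sketch of that wizard idea (Express/TypeScript; the endpoints and the in-memory store are invented for illustration):

import express from "express";

const app = express();
app.use(express.json());

// Naive in-memory store for the collected wizard input; a real implementation
// would persist this per checkout or session.
const drafts = new Map<string, Record<string, unknown>>();

// Each wizard "page" POSTs its part of the input to the draft resource.
app.post("/checkout/:id/data", (req, res) => {
  const current = drafts.get(req.params.id) ?? {};
  drafts.set(req.params.id, { ...current, ...req.body });
  res.status(204).end();
});

// The final confirmation is a plain GET on a short URI, so the response stays
// cacheable even though the accumulated input was large.
app.get("/checkout/:id/confirmation", (req, res) => {
  res.set("Cache-Control", "private, max-age=60");
  res.json({ id: req.params.id, data: drafts.get(req.params.id) ?? {} });
});

app.listen(3000);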
While the arguments listed by Always Learning aren't wrong, I wouldn't rely on them from a security standpoint. They may filter out people with little knowledge, but they won't hinder knowledgeable ones (and there are plenty out there) for long from modifying the request before sending it to your server. So simply recommending POST as a way of making user edits harder feels odd to me.
So, in summary, I'd base the decision whether to use POST or GET for sending data to the server mainly on whether the response should be cacheable, as it is often requested, or not. In cases where the URI might get so large that certain HTTP frameworks fail to process the request, you are basically forced to use POST anyway, unless you can split the actual request into multiple smaller requests that act as a wizard for data collection until a final confirmation request triggers the actual final HTTP call.
This question is a mirror of a bug report I made on Parse's help forum.
Now, I know that the one on Parse's site is not a question but a report, and I do not want to just leave a mirror of the report here; I only want to check that my concerns are well-founded with people who probably have more experience than me.
The problem is that it seems Parse is not generating the HMAC signature in the right way.
First test: I took a proxy (Charles Proxy), set a breakpoint on an update request and changed a field, leaving the signature untouched, then executed the request. The server accepted the request and the fields were updated accordingly (including the field modified at the breakpoint, of course).
Second test: instead of modifying the request, I just changed the signature to make sure the server actually checks the signature value. The request got rejected, as expected.
Third test: instead of modifying the value of an existing field, I added a brand-new field to the request and executed it. The server accepted the request: if the added field doesn't exist it is added to the updated row, otherwise it is simply updated.
Now, are my concerns well-founded? Did I misunderstand the OAuth RFC in any part regarding signature generation? How is it possible that Parse's employees/users have never noticed such a HUGE bug?
Please, I know that this question can generate a broad discussion, but given its importance (not only for me, but for all Parse users), please leave time for someone informed to give a valid response.
EDIT:
I'm digging inside the Parse iOS SDK to find out why this is actually happening. After some research and a little reverse engineering of their static library, I found that they are using a modified library called OAuthCore (they apparently just renamed the methods, prefixing them with 'PF'). I then got confirmation by looking at an old open-source version of their SDK (found by googling for the modified library names). Now, the library does its job and works as expected, sticking closely enough to the RFC. The problem is that, obviously, OAuth does not cover the entire HTTP request, just part of it. What I was expecting, and how it should be IMHO, is that when you make a request to update a field (or make a purchase, log in, or send sensitive data) the 'dirty' fields are sent as request parameters, so that they are included in the signature/verification process done through the OAuth protocol. Instead, update requests (specifically, POST requests to https://api.parse.com/2/update) set the request body to the JSON string representing the actual update. To be honest, this was clear even before all of this: by looking at the request I should have realized that the JSON text was being sent as the raw body of the request instead of an x-www-form-urlencoded body (i.e. with the parameters URL-encoded and &-concatenated in the request body).
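To make the gap concrete, here is a rough sketch (TypeScript/Node, not Parse's code) of how an OAuth 1.0a HMAC-SHA1 signature base string is built per RFC 5849: only the oauth_* parameters, query parameters and a form-encoded body enter the normalized parameter string, so a raw application/json body is never covered by the signature.

import { createHmac } from "crypto";

// RFC 3986 / RFC 5849 percent-encoding (unreserved characters only).
const enc = (s: string) =>
  encodeURIComponent(s).replace(/[!'()*]/g, c => "%" + c.charCodeAt(0).toString(16).toUpperCase());

// The base string covers the method, the base URI and the normalized
// parameters. A raw JSON body is simply not part of this material, so the
// HMAC cannot detect tampering with it.
function signatureBaseString(method: string, baseUrl: string, params: Record<string, string>): string {
  const normalized = Object.entries(params)
    .map(([k, v]) => [enc(k), enc(v)])
    .sort(([ak, av], [bk, bv]) => (ak === bk ? (av < bv ? -1 : 1) : ak < bk ? -1 : 1))
    .map(([k, v]) => `${k}=${v}`)
    .join("&");
  return [method.toUpperCase(), enc(baseUrl), enc(normalized)].join("&");
}

// The signing key is consumerSecret&tokenSecret; the digest is base64-encoded.
function hmacSha1Signature(baseString: string, consumerSecret: string, tokenSecret: string): string {
  return createHmac("sha1", `${enc(consumerSecret)}&${enc(tokenSecret)}`)
    .update(baseString)
    .digest("base64");
}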
So while this is technically the "correct" behaviour for the library, I feel it is not how things should be in a production environment used by thousands of people. What I'll do now is try to patch it without breaking functionality; should I manage to do that, I'll share the patch.
Still hoping to get a response from Parse directly.
EDIT 2: Parse has closed my question, treating it as a bug report rather than a question. No comment on the major security flaws their implementation implies.
Below is a copy of the reported bug:
I was playing around with the Parse iOS SDK and I found a major bug that seriously threatens the security of the apps developed using Parse as a backend.
Now, I'm sorry if I'm not using the bug issue reporting tool, but I do not own a Facebook account and I'm not willing to create one.
Premise: the Parse APIs seem to conform to the OAuth 1.0a protocol (RFC 5849). The part of the RFC relevant to this bug is on page 18 (Signature).
In OAuth, according to the above-mentioned RFC, each request should have an authentication header composed like this:
OAuth realm="Example",
oauth_consumer_key="0685bd9184jfhq22",
oauth_token="ad180jjd733klru7",
oauth_signature_method="HMAC-SHA1",
oauth_signature="wOJIO9A2W5mFwDgiDvZbTSMK%2FPY%3D",
oauth_timestamp="137131200",
oauth_nonce="4572616e48616d6d65724c61686176",
oauth_version="1.0"
This ensures not only that a request is authorized but also its integrity, since the HMAC signature enforces it. As a matter of fact, the signature should be calculated over a normalized string composed of the request parameters and signed with the client shared secret concatenated with the token shared secret (see section 3.4.2, page 25 of the RFC). This way a malicious user should not EVER be able to modify the request before it reaches the server. The server, in fact, should check that the signature matches the whole request and reject it if it doesn't.
Sadly enough, Parse seems not to conform fully to the above. By using a simple proxy I'm able to modify requests at will: changing the user ID performing the request, changing the value of a parameter in the request, even ADDING A FIELD AND A VALUE THAT WERE NOT INCLUDED IN THE REQUEST AT ALL.
Now it is really easy to imagine the drawbacks all of this can lead to. In particular I'm thinking of the mobile developers who enable in-app purchases in their apps, relying on Parse being secure enough that their users will not be able to "cheat", thus losing the income and nullifying the efforts they made for their apps.
Now, while I wasn't able to test this on the other SDKs, I'm pretty sure the same bug is reproducible there too, or, even worse, the problem is that the server is not checking the signature at all.
Waiting for a response from a Parse employee about this bug.
Regards, Antonio
It is impressive that you have dug into the framework to check security issues. I am not an expert in OAuth, but I just want to comment on your worry about in-app purchases. It is not necessary to worry about in-app purchases because they are handled completely by the App Store; any purchase is handled by iOS's StoreKit.framework. Parse has nothing to do with in-app purchases. If you want to check whether a person has bought anything, you only need to use the functions provided by StoreKit.framework, not Parse.
Technically these are two questions, but they are so heavily related I didn't want to split them up; if the community feels I should, I will.
Following a recent question I am implementing SCRAM for a website login and web service API. Client environments will be .NET and JavaScript (with Java likely in the future).
My first issue is basic: the protocol utilises a client and a server key as key steps in the authentication process, and yet, in order to be validated, both need to be known by both parties in advance since the protocol doesn't allow for exchange of these (to do so would result in a bit of a chicken and egg scenario). If you consider a JavaScript client, for example, this means both keys are likely to be constants defined in the source, thus making them easy to fetch. So: why bother? Is it just to mitigate against an 'Eve' who, for some reason, hasn't bothered to get the JS or client source code, which will necessarily be public!?
Secondly, like practically any other authentication mechanism it requires a client + server nonce.
Given that the authentication nonce, by definition, should never be used more than once (at least by the same user), this presumably means that a server must maintain a record of all nonce values used by all users, forever. Unlike other data that we regularly archive off, such a table is only ever going to get bigger, and queries against it are likely to get slower and slower!
If that's correct, then it's technically unfeasible to implement this or almost any other authentication mechanism! Since I know that's plainly ridiculous, it must be common to define some additional scope that factors in a reasonable timescale as well.
As always with authentication and encryption, despite being a very experienced software developer I feel like I'm going back to school! What am I missing!?
both need to be known by both parties in advance since the protocol doesn't allow for exchange of these (to do so would result in a bit of a chicken and egg scenario).
Yes, that's correct. Challenge-response isn't a key-exchange protocol. It only specifies, once client and server share a key, how to compute the same value from that key without transmitting the key in clear over the network.
If you consider a Javascript client, for example, this means both keys are likely to be constants defined in the source - thus making them easy to fetch.
That's not a good idea. Instead, client and server can agree on a key during a preliminary registration process.
Given that the authentication nonce, by definition, should never be used more than once (at least by the same user), this presumably means that a server must maintain a record of all nonce values used by all users forever.
NO. A new nonce should be generated for each new session using pseudo-random number generation. It's very improbable that you will get the same nonce twice, and anyway it doesn't matter if a nonce has already been used, as long as the attacker doesn't know that.
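A sketch of what that looks like in practice (Node/TypeScript, purely illustrative):

import { randomBytes } from "crypto";

// 16 random bytes = 128 bits of entropy; the chance of ever producing the same
// nonce twice is negligible, so nothing has to be stored forever.
function newNonce(): string {
  return randomBytes(16).toString("base64");
}

// A fresh, unpredictable value is generated for every authentication exchange.
const serverNonce = newNonce();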
I'm building a web API very similar to what StackOverflow provide.
However, in my case security is important since the data is private.
I must use HTTP.
I can't use SSL.
What solution(s) do you recommend?
EDIT: authentication != encryption
Nearly every public API works by passing an authentication token for each web request.
This token is usually assigned in one of two ways.
First, some other mechanism (usually logging into a website) will allow the developer to retrieve a permanent token for use in their particular application.
The other way is to provide a temporary token on request. Usually you have a web method to which they pass a username/password, and you return a limited-use token based on whether it is authenticated and authorized to perform any API actions.
After the dev has the token, they pass it as a parameter to every web method you expose. Your methods first validate the token before performing the action.
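A bare-bones sketch of that validation step (Express/TypeScript; the token store and routes are invented for illustration):

import express from "express";

const app = express();

// Illustrative token store; a real service would back this with a database and
// issue tokens from a login or registration endpoint.
const validTokens = new Set<string>(["example-permanent-token"]);

// Every exposed web method validates the token before doing any work.
app.use((req, res, next) => {
  const token = req.query.token as string | undefined;
  if (!token || !validTokens.has(token)) {
    res.status(401).json({ error: "invalid or missing token" });
    return;
  }
  next();
});

app.get("/api/items", (_req, res) => {
  res.json([{ id: 1 }]);
});

app.listen(3000);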
As a side note the comment you made about "security is important" is obviously not true. If it was then you'd do this over SSL.
I wouldn't even consider this as "minimal" security in any context as it only provides a false belief that you have any sort of security in place. As Piskvor pointed out, anyone with even a modicum of interest could either listen in or break this in some way.
First of all, I suggest you read this excellent article: http://piwik.org/blog/2008/01/how-to-design-an-api-best-practises-concepts-technical-aspects/
The solution is very simple. It is a combination of a Flickr-like API (token based) and the authentication method used by the payment gateway I use (highly secure), but with a private password/salt instead.
To prevent unauthorized users from using the API without having to send the password in the request (in my case, in the clear, since there is no SSL), they must add a signature that consists of an MD5 hash of a concatenation of both private and public values:
Well-known values, such as the username or even the API route
A user pass phrase
A unique code generated by the user (can be used only once)
If we request /api/route/ and the pass phrase is kdf8*s#, the signature would be computed as follows:
// requires: using System; using System.Security.Cryptography; using System.Text;
string uniqueCode = Guid.NewGuid().ToString();   // single-use value, rejected by the server on reuse
string signature = Convert.ToHexString(          // hex-encoded MD5 over route + pass phrase + unique code
    MD5.HashData(Encoding.UTF8.GetBytes("/api/route/kdf8*s#" + uniqueCode)));
The URL of the HTTP request will then be:
string requestUrl =
string.Format("http://example.org/api/route/?code={0}&sign={1}", uniqueCode, signature);
Server side, you will have to reject any new request with the same unique code, preventing an attacker from simply reusing the same URL to his advantage, which was the situation I wanted to avoid.
Since I didn't want to store the codes already used by API consumers, I decided to replace the unique code with ticks. Ticks represent the number of 100-nanosecond intervals that have elapsed since 12:00:00 midnight, January 1, 0001.
On the server side, I only accept ticks (a timestamp) within a tolerance of +/- 3 minutes (in case client and server are not time-synchronized). This means a potential attacker will be able to reuse the URL within that window, but not permanently. Security is reduced a little, but it's still good enough for my case.
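A sketch of the server-side check for the ticks variant (shown in TypeScript/Node rather than C#; the names and route are illustrative): recompute the MD5 signature from the received ticks and reject anything outside the +/- 3-minute window.

import { createHash } from "crypto";

const TOLERANCE_MS = 3 * 60 * 1000;                    // +/- 3 minutes
const TICKS_PER_MS = 10_000n;                          // .NET ticks are 100 ns each
const TICKS_AT_UNIX_EPOCH = 621_355_968_000_000_000n;  // DateTime(1970,1,1).Ticks

function verify(route: string, passPhrase: string, ticks: string, sign: string): boolean {
  // 1. Recompute the MD5 signature over route + pass phrase + ticks.
  const expected = createHash("md5").update(route + passPhrase + ticks).digest("hex");
  if (expected !== sign.toLowerCase()) return false;

  // 2. Convert .NET ticks to a Unix timestamp (ms) and enforce the time window.
  const sentMs = Number((BigInt(ticks) - TICKS_AT_UNIX_EPOCH) / TICKS_PER_MS);
  return Math.abs(Date.now() - sentMs) <= TOLERANCE_MS;
}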
Short answer: if it's supposed to be usable through usual clients (browser requests/AJAX), you're screwed.
As long as you are using an unencrypted transport, an attacker could just remove any sort of in-page encryption code through a MITM attack. Even SSL doesn't provide perfect security - but plain HTTP would require some out-of-page specific extensions.
HTTP provides only transport - no secure identification, no secure authentication, and no secure authorization.
Example security hole - a simple HTTP page:
<script src="http://example.com/js/superstrongencryption.js"></script>
<script>
encryptEverything();
</script>
This may look secure, but it has a major flaw: you have no guarantee, at all, that you're actually loading the superstrongencryption.js file you're requesting. With plain HTTP, you send a request somewhere and something comes back; there is no way to verify that it actually came from example.com, nor do you have any way to verify that it is actually the right file (and not just function encryptEverything(){return true}).
That said, you could theoretically build something very much like SSL into your HTTP requests and responses: cryptographically encrypt and sign every request, and do the same with every response. You'll need to write a special client (plus server-side code, of course) for this though; it won't work with standard browsers.
HTTP digest authentication provides very good authentication, and all the HTTP client libraries I've used support it. It doesn't provide any encryption at all, though.