Background
I'm having trouble with the design and implementation of a REST service that publishes content some users cannot view (medical information, you know, my country's laws). I'm using an ABAC-like/RBAC system to protect it, but what concerns me is that I may be violating the REST pattern. My service performs the following process for each query (a rough code sketch follows the steps):
The security middleware reads a token from a session that an app/webpage sends via the Authorization header or cookies.
ABAC/RBAC rules are applied to determine whether the user can access the resource.
After the token is authorized, my service executes the query and filters the results, hiding content that the requesting user cannot see (if needed; POST, PUT and DELETE operations are almost exempt from this step). The filtering is done using ABAC/RBAC rules.
An operation report is stored in the logs.
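A rough sketch of that flow; every helper here (read_token, abac.can_access, abac.filter_visible, and so on) is a hypothetical placeholder, not a specific library:

# Hypothetical per-request flow; not tied to any framework.
def handle_request(request):
    token = read_token(request)              # 1. token from Authorization header or cookie
    user = authenticate(token)
    if user is None:
        return respond(401, "not authenticated")
    if not abac.can_access(user, request.resource, request.method):
        return respond(403, "not allowed")            # 2. ABAC/RBAC decision
    results = run_query(request)                      # 3. execute the query
    visible = abac.filter_visible(user, results)      #    and hide what this user may not see
    audit_log(user, request)                          # 4. operation report
    return respond(200, visible)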
I already know that sessions violate the REST pattern, but I can replace them with BASIC/DIGEST authorization. My real question is the following:
Question
Does hiding resources from list/retrieve operations violate the REST pattern? As far as I know, REST is stateless, so... what happens if I use some context variables (the user id) to filter my results? Am I violating REST, or not at all?
If I am, what are your recommendations? How can I implement this without breaking REST conventions?
First of all, client-side sessions don't violate REST at all. REST says the communication between client and server must be stateless; in other words, the server should not require any information that isn't available in the request itself in order to respond to it properly. If the client keeps a session and sends all the information needed on every request, that's fine.
As to your question, there's nothing wrong with changing the response based on the authenticated user. REST is an architectural style that attempts to apply the successful design decisions behind the web itself to software development. When you log in to Stack Overflow, what you see as your profile is different from what I see, even though we are both using the same URI, right? That's how REST is supposed to work.
I'd recommend returning 401 (Unauthorized) if the user is not authorized to access a resource, and 404 (Not Found) if you don't want to confirm that the resource even exists.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4
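As a sketch of that status-code choice (the helper names are made up for illustration):

def get_record(user, record_id):
    if user is None:
        return respond(401)        # no valid credentials were supplied
    record = db.find(record_id)
    if record is None or not can_see(user, record):
        return respond(404)        # don't even confirm that the record exists
    return respond(200, record)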
A GET is meant to return a representation of the resource. Nowhere does it say that you must return everything you know about that resource.
Exactly what representation is returned will depend on the request headers. For example, you might return either JSON or XML depending on what the client requested. Extending this line of thinking, it is OK to return different representations of a resource based on the client's authentication without violating REST principles.
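For instance, a sketch of one resource whose representation varies by media type and by the authenticated user (the patient fields and helpers are illustrative assumptions):

import json

def represent_patient(patient, user, accept_header):
    data = {"id": patient.id, "name": patient.name}
    if user.can_view_medical_details:          # richer representation for authorized users
        data["diagnoses"] = patient.diagnoses
    if "application/xml" in accept_header:     # honour content negotiation
        return to_xml(data)                    # to_xml is a hypothetical serializer
    return json.dumps(data)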
Related
It is a multi-tenant serverless system.
The system has groups with permissions.
Users derive permissions based on the groups they are in.
If it makes a difference, we are using Cognito for authentication and it is a stateless application.
For example:
GET endpoint for sites (the sites that the logged-in user has access to, based on the groups they are in)
GET endpoint for devices (the devices that the logged-in user has access to, based on the groups they are in)
In REST APIs. "The idea is that the data returned by an endpoint should depend solely on the parameters passed meaning two different users should receive the same result for the identical request.
"
What should the REST URI look like to satisfy the above-stated idea? Since the deciding factor for the list here is "groups", and thus effective permissions, I was thinking we could pass the groups a user is in via the URI, in sorted order, to leverage caching on the GET endpoints as well. Is there a better way to do it?
In REST APIs. "The idea is that the data returned by an endpoint should depend solely on the parameters passed meaning two different users should receive the same result for the identical request. "
No, this is not strictly true. It can be a desirable property, but it is absolutely not required. In fact, if you build a proper hypermedia REST API, you would likely want to hide links/actions that the current user is not allowed to use.
Furthermore, a shared cache will never store a response and serve it to different users if an Authorization header is present on the request.
Anyway, there could be other reasons to want this; maybe it's a simpler design for your case, and there is a pretty reasonable solution.
What I'm inferring from your question is that you might have two endpoints:
/sites
/devices
They return different things depending on who's accessing them. Instead of using those kinds of routes, you could just do:
/user/1234/sites
/user/1234/devices
Now every user has their own separate 'sites' and 'devices' collection. The additional benefit is that if you ever want to let a user find the list of sites or devices from another user, the API is ready to support that.
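A minimal sketch of a handler behind such a route, assuming the authenticated user is resolved elsewhere and the helpers are placeholders:

def get_user_sites(requesting_user, user_id):
    # handles GET /user/<user_id>/sites
    if requesting_user.id != user_id and not requesting_user.is_admin:
        return respond(403)                   # or 404 if existence should stay hidden
    sites = db.sites_visible_to(user_id)      # derived from the user's group permissions
    return respond(200, sites)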
The idea is that the data returned by an endpoint should depend solely on the parameters passed
This is called the statelessness constraint, but if you check, the parameters always include the auth parameters precisely because of it. The idea is to keep session data on the client side, because managing sessions becomes a problem when you have several million users and multiple servers all around the world. Since the parameters include auth data, the response can depend on that data, so you can use the exact same endpoints for users with different permissions.
As for the responses, you might want to send back hyperlinks that represent the available operations. The concept is the same here: if the user does not have permission for an operation, they won't get a hyperlink for it, and in theory they should never get a 403 either, because the client must follow the hyperlinks it got from the service instead of hardcoding URI templates. So you have to handle fewer errors and junk requests, and another benefit is that you can change your URI templates without breaking clients. This is called hypermedia as the engine of application state, and it is part of the uniform interface constraint.
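A sketch of what that might look like when building a representation (the permission names and link format are assumptions, not a standard):

def site_representation(site, user):
    links = {"self": {"href": f"/sites/{site.id}", "method": "GET"}}
    if user.has_permission("site:update"):
        links["update"] = {"href": f"/sites/{site.id}", "method": "PUT"}
    if user.has_permission("site:delete"):
        links["delete"] = {"href": f"/sites/{site.id}", "method": "DELETE"}
    return {"id": site.id, "name": site.name, "_links": links}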
So we were starting a new project from scratch, and one of the developers suggested that we shouldn't have any GET API requests, since POST APIs are better in every way (at least when using a mobile client).
On looking further into this, it does seem POST can do everything GET can do, and do it better:
slightly more secure, as parameters are not in the URL
larger payload limit than a GET request
So is there even a single reason to have a GET API? (This will only be used from a mobile client, so browser-specific caching doesn't affect us.)
Is there ever a need to have a GET request API, if POST is better in every way?
In general, yes. In your specific circumstances -- maybe no.
GET and POST are method tokens.
The request method token is the primary source of request semantics
They are a form of metadata included in the HTTP request so that general-purpose components can be aware of the request semantics and contribute constructively.
POST is, in a sense, the wildcard method; it can mean anything. But one consequence of this is that, because the method has unconstrained semantics, general-purpose components can't do anything useful other than pass the request along.
GET, however, has safe semantics (which include idempotent semantics). Because the request is idempotent, general-purpose components know that they can resend a GET request when the server returns no response (i.e., when messages are lost on an unreliable transport), and they know that representations of the resource can be pre-fetched, reducing perceived latency.
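For illustration only, a sketch of a client-side helper that leans on those semantics; send is an assumed callable that performs one HTTP exchange:

import time

IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE"}    # safe to repeat per HTTP semantics

def send_with_retry(method, url, send, attempts=3):
    for attempt in range(attempts):
        try:
            return send(method, url)
        except TimeoutError:
            if method not in IDEMPOTENT or attempt == attempts - 1:
                raise            # a lost POST may already have taken effect; don't repeat it
            time.sleep(2 ** attempt)    # back off, then resend the idempotent request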
You dismissed caching as a concern earlier, but you may want to rethink that - the cache constraint is an important element that helped the web take over the world.
Reducing everything to POST reduces HTTP from an application for transferring documents over a network to dumb transport.
Using HTTP for transport isn't necessarily wrong: Simple Object Access Protocol (SOAP) works that way, as does gRPC. You still get authorization, and conditional requests; features of HTTP that you might otherwise need to roll your own.
You aren't doing REST at that point, but that's OK; not everybody has to.
That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. (Fielding, 2008)
I'm working on a small Node/Angular application.
A superadmin should be able to create/edit/delete new client accounts, within views delivered directly from the node application.
The clients on the other hand communicate with the backend/database through Angular and a REST API that the node application delivers. The clients need a username/password to login to their account.
Question: I have this route map; is it right of me to think that the :client needs to be in the URL of the REST API, so the backend knows which data to fetch?
(The : in the URL indicates that it's a variable.)
Route map Superadmin
/admin/client – POST
/admin/client/:id – GET
/admin/client/:id – PUT
/admin/client/:id – DELETE
/admin/clients – GET
Route map API JSON
/v1/:client/candidate – POST
/v1/:client/candidate/:id – GET
/v1/:client/candidate/:id – PUT
/v1/:client/candidate/:id – DELETE
/v1/:client/candidates – GET
/v1/:client/settings – GET
/v1/:client/settings – PUT
I think this is a little difficult to answer, because it would imply one way is "right" and another is "wrong", when really there could be multiple ways of solving this problem. Here's what I would say, though, about how you structure the API endpoints.
If we focus on these API endpoints specifically:
/v1/:client/candidate – POST
/v1/:client/candidate/:id – GET
/v1/:client/candidate/:id – PUT
/v1/:client/candidate/:id – DELETE
/v1/:client/candidates – GET
/v1/:client/settings – GET
/v1/:client/settings – PUT
Here we have a set of APIs that allow someone to look up and perform actions on resources for a specific client. In doing this, you've essentially opened things up to allow anyone to access anyone's data (until you add security). Building APIs like this would be more useful for a "superadmin" like the one you've described in your question, who would need to access multiple client profiles throughout their day. But as you might imagine, you'd need to restrict access to these endpoints to only those who have "superadmin" access OR are in fact the client themselves.
If instead the main use case of these API endpoints was to serve the clients, I would instead remove the :client parameter:
/v1/candidate – POST
/v1/candidate/:id – GET
/v1/candidate/:id – PUT
/v1/candidate/:id – DELETE
/v1/candidates – GET
/v1/settings – GET
/v1/settings – PUT
Since you mentioned that the client would need to login to hit these APIs, you already know who the client is when they make the request. You can instead look up the client from the request, and access these resources based on who is making the call. Personally I think this makes things a little easier to follow, since the request is always asking for my data, rather than some "client's" data, which you then need to verify they have access to.
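A sketch of that lookup, assuming a verified token (for example a Cognito JWT) carries a custom claim identifying the client; the claim name and helpers are assumptions:

def list_candidates(request):
    claims = verify_token(request.headers["Authorization"])   # hypothetical JWT validation
    client_id = claims["custom:client_id"]                    # assumed custom claim
    return respond(200, db.candidates_for(client_id))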
But again, this all depends on how you architect your application, what the use cases are, who is going to be accessing the system, etc. It might make sense to separate the "superadmin" APIs from the normal "client" APIs like I've described above, or it might be better to keep them all together. The answer will probably end up being whichever is easier to understand and maintain in the long term.
tl;dr
I am considering a webservice design model which consist of several services/subdomains, each of which may be implemented in different platforms and hosted in different servers.
The main issue is authentication. If a request for Jane's resources comes in, can a split system authenticate that request as hers?
All services access the same DB layer, of course. So I have in mind a single point of truth each service can use to authenticate each request.
For example, Jane accesses www.site.com, which renders stuff in her browser. The browser may send client-side requests to different subdomains of site.com, with requests like:
from internalapi.site.com fetch /user/users_secret_messages.json
from imagestore.site.com fetch /images/list_of_images
The authentication issue is: another user (or an outsider) can craft a request that can fool a subdomain into giving them information they should not access.
So I have in mind a single point of truth: a central resource accessible by each service that can be used to authenticate each request.
In this pseudocode, AuthService.verify_authentication() refers to the central resource:
# server-side code:
def get_user_profile():
    auth_token = request.cookie['auth_token']
    user = AuthService.verify_authentication(auth_token)
    if user is None:
        response.write("you are unauthorized / not logged in")
    else:
        response.write(json.dumps(fetch_profile(user)))
Question: What existing protocols, software or even good design practices exist to enable flawless authentication across multiple subdomains?
I've seen how OAuth takes the headache out of managing 3rd-party access, and I wonder if something similar exists for this kind of authentication. I also got the idea from Kerberos and TACACS.
This idea was the result of teamthink, as a way to simplify architecture (rather than handle heavy loads).
I built a system that did this a little while ago. We were building shop.megacorp.com, and had to share a login with www.megacorp.com, profile.megacorp.com, customerservice.megacorp.com, and so on.
The way it worked was in two parts.
Firstly, all signon was handled through a set of pages on accounts.megacorp.com. The signup link from our pages went there, with a return URL as a parameter (so https://accounts.megacorp.com/login?return=http://shop.megacorp.com/cart). The login process there would redirect back to the return URL after completion. The login page also set an authentication cookie, scoped to the whole of the megacorp.com domain.
Secondly, authentication was handled on the various sites by grabbing the cookie from the request, then forwarding it via an internal web service call to accounts.megacorp.com. We could have done this as a straightforward SOAP or REST query, with the cookie as a parameter, but what we actually did was send an HTTP request with the cookie added to the headers (sort of as if the user had sent the request directly). That URL would then come back with a 200 if the cookie was valid, serving up some information about the user, or a 401 or something if it wasn't. We could then deal with the user accordingly.
Needless to say, we didn't want to make a request to accounts.megacorp.com for every user request, so after a successful authentication, we would mark the user's session as authenticated. We'd store the cookie value and a timestamp, and if subsequent requests had the same cookie value, and were within some timeout of the timestamp, we'd treat them as authenticated without passing them on.
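A sketch of that pattern (the timeout value and helper are assumptions): forward the cookie once, then cache the verdict locally.

import time

AUTH_CACHE = {}      # cookie value -> (user_info, verified_at)
AUTH_TTL = 300       # seconds before we re-check with accounts.megacorp.com

def authenticate(cookie_value):
    cached = AUTH_CACHE.get(cookie_value)
    if cached and time.time() - cached[1] < AUTH_TTL:
        return cached[0]
    user = ask_accounts_service(cookie_value)   # internal request with the cookie attached
    if user is not None:
        AUTH_CACHE[cookie_value] = (user, time.time())
    return user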
Note that because we pass the cookie as a cookie in the authentication request, the code to validate it on accounts.megacorp.com is exactly the same as for handling a direct request from a user, so it was trivial to implement correctly. So, in response to your desire for "existing protocols [or] software", I'd say that the protocol is HTTP, and the software is whatever you can use to validate cookies (a standard part of any web container's user handling). The authentication service is as simple as a web page which prints the user's name and details, and which is marked as requiring a logged-in user.
As for "good design practices", well, it worked, and it decoupled the login and authentication processes from our site pretty effectively. It did introduce a runtime dependency on a service on accounts.megacorp.com, which turned out to be somewhat unreliable. That's hard to avoid.
And actually, now I think back, the request to accounts.megacorp.com was actually a SOAP request, and we got a SOAP response back with the user details, but the authentication was handled with a cookie, as I described. It would have been simpler and better to make it a REST request, where our system just did a GET on a standard URL and got some XML or JSON describing the user in return.
Having said all that, if you share a database between the applications, you could just have a table, in which you record (username, cookie, timestamp) tuples, and do lookups directly in that, rather than making a request to a service.
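Roughly (the table and column names here are assumptions):

def authenticate_from_shared_db(cookie_value, max_age_seconds=1800):
    row = db.query("SELECT username, created_at FROM sessions WHERE cookie = ?",
                   cookie_value)
    if row is None or age_in_seconds(row.created_at) > max_age_seconds:
        return None          # unknown or expired cookie
    return row.username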
The only other approach i can think of is to use public-key cryptography. The application handling login could use a private key to make a signature, and use that as the cookie. The other applications could have the corresponding public key, and use that to verify it. The keys could be per-user or there could just be one. That would not involve any communication between applications, or a shared database, following the initial key distribution.
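A sketch of the signature idea using Ed25519; this assumes the third-party Python "cryptography" package and a dot-separated cookie layout of my own invention:

import base64
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()   # held only by the login application
verify_key = signing_key.public_key()        # distributed to the other applications

def make_cookie(username):
    sig = signing_key.sign(username.encode())
    return username + "." + base64.urlsafe_b64encode(sig).decode()

def check_cookie(cookie):
    payload, _, sig_b64 = cookie.rpartition(".")
    try:
        verify_key.verify(base64.urlsafe_b64decode(sig_b64), payload.encode())
        return payload                       # the signed username can be trusted
    except (InvalidSignature, ValueError):
        return None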
I have an SOA which makes heavy use of nonces (i.e., one-time, one-use security tokens).
My app takes a nonce from a client, verifies it, then sends a new nonce back to said client as part of every reply. Also included in each reply are the results of business logic operations that executed right after the nonce was authenticated.
The nonce verification and generation are operationally coupled with the business logic, since both occur in response to every client request. However I don't want the two to be coupled in code. What's the right way to partition them in accordance with SOA principles? Is it too much to break the security and business logic into two separate services, with one calling the other as part of each reply to each client request?
Yes, it makes sense to separate them. But I don't think they should have any awareness of each other at all (i.e., call each other directly).
I'll dive into a specific example and technology of how something similar is implemented.
In the web framework Struts2, all incoming requests pass through a stack of operations (called interceptors) before arriving at a user-defined object (called an action). The action then accesses the business tier.
When submitting a web form there is the issue of double submission. One way to protect against this is with a token that is sent along with the form submission. We need to create a unique token, place it in a hidden field, and then, when we receive the request, only process it if the token is good. This prevents users from doing something like accidentally buying something more than once.
In Struts2 there is a special server-side token tag which creates the hidden field for us, so there is something that needs to be done for each form. The token interceptor, if active, will enforce that this value always exists and is valid when the form is received, and will redirect requests that fail the check somewhere else.
The idea of implementing a nonce interceptor/filter that checks that the incoming nonce value is good and, for responses, adds the correct nonce value should be completely independent of the business logic.
The example here is with HTML forms, but adding an interceptor (or whatever you call "that which handles cross-cutting concerns at the request/response level" in your technology) which adds such a value to JSON or XML messages should be pretty easy and will likely produce the most elegant result.
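In other stacks the same shape can be a decorator or middleware; a sketch, with the nonce store and helpers left as assumptions:

def nonce_interceptor(handler):
    def wrapped(request):
        if not nonce_store.consume(request.headers.get("X-Nonce")):
            return respond(400, "missing, reused or expired nonce")
        response = handler(request)                         # business logic never touches the nonce
        response.headers["X-Nonce"] = nonce_store.issue()   # new nonce for the next call
        return response
    return wrapped

@nonce_interceptor
def place_order(request):    # plain business logic, decorated with the cross-cutting concern
    return respond(200, create_order(request.body))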
The following is a link to struts2 interceptor reference (it might clarify the idea better):
http://struts.apache.org/2.2.1.1/docs/interceptors.html
The following two links are both interceptors which manage tokens:
http://struts.apache.org/2.2.1.1/docs/token-interceptor.html
http://struts.apache.org/2.2.1.1/docs/token-session-interceptor.html
I expect only the first few paragraphs of each link will be useful but something like it for your technology should be nice.
I think what you outlined above would be in keeping with SOA principles. You're keeping two distinct sets of operations separated: one service has the business logic, the other has the security logic.
This would be especially true if you have (or the potential of having) other services that would rely on nonces.