I've read a lot about microservices, but one question remains: security.
What I would like to do is something similar to Netflix, i.e. one general backend and several dedicated backends, one for each front end (for example mobile devices, desktop app, ...).
On top of that I plan to put my firewall/security layer. Here is the problem: how do I authorize a request once at that layer, rather than in each microservice?
Is it possible to expose certain microservices to the whole internet, and others only to trusted sources? If so, is that the right way to do it?
You've read a lot, but there is still more worth reading. Please take a look at:
Building Microservices (it covers every aspect of microservices, including security)
The Practice of Cloud System Administration (not strictly microservice-related, but it contains a lot of useful information)
One solution would be to assign a public IP to the microservice you want to expose, and then perhaps set up a VPN that trusted entities use to access the other microservices.
I have a single back-end running node/express providing API endpoints and 2 static (react) front-ends. The front-ends interact with the users and communicate with the back-end.
I need to use HTTPS throughout once we enter the production stage.
The front-ends will require their own domain names.
I've been thinking about the simplest way to configure these and have come up with Option 1 (see diagram): the Node.js API server runs on one VPS, and since the front-ends are static sites they can be hosted on separate servers (UPDATE: I meant to say hosting providers) and hence get their own domains. As an option, and I'm unsure whether it's needed, add Cloudflare in front of the front-ends to provide a layer of security.
This will allow front-ends to have separate domain names.
As this is a start-up project and I doubt it will see a large number of visitors, I'm wondering if the above is over-engineered and unnecessarily complicated.
So I'm considering Option 2: hosting the back-end API app and the two front-ends on the same Linux VPS. As the front-ends are static, I added them to the Node.js public folder, which allowed me to access them as http://serverIP:8080/siteA.
As I want to access each front-end as http://siteA.com, I'm assuming I require a reverse proxy (nginx).
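For reference, the Option 2 setup I have now looks roughly like this (a minimal sketch; the folder names and port are just what I'm using as examples):

```ts
// server.ts - one Express app serving the API plus both static front-ends
import express from "express";
import path from "path";

const app = express();

// API endpoints live under /api
app.get("/api/health", (_req, res) => {
  res.json({ status: "ok" });
});

// The two static React builds sit in the public folder
app.use("/siteA", express.static(path.join(__dirname, "public", "siteA")));
app.use("/siteB", express.static(path.join(__dirname, "public", "siteB")));

// Everything on one VPS, one port: http://serverIP:8080/siteA
app.listen(8080, () => console.log("listening on 8080"));
```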
The questions to help me decide between the two options are:
For a start-up operation, and given the above scenario, which option is best?
I understand that Node.js needs to listen on a port regardless. For the API I don't mind the port number (it isn't something end users see, i.e. http://10.20.30.40:3000), but the two front-ends require their own domain names (www.siteA.com, www.siteB.com). Will I therefore need a reverse proxy (nginx) regardless of whether they are static sites or not?
I'm concerned that someone could attack the API endpoints directly (http://10.20.30.40:3000). Is it true that Option 2 is safer than Option 1 in this respect, i.e. that because all the sites are hosted on the same VPS I could block malicious direct API calls and keep the API from being exposed to the outside world?
My developer once told me that Option 1 is best because nginx adds unneeded complication, but I'm not sure what he meant, hence my confusion; to be honest, I don't think he wanted to add nginx to the server.
I'm looking for high-level guidance to get me on the right track. Thanks.
This is, as you yourself suspected, unnecessarily complex, and incorrect in some places. Here's a better (and widely used across the industry) design. I strongly recommend dropping the whole VM approach and going for a shared computing unit, unless you are using that machine for some other computation and utilizing it this way saves your company a lot of money, which I strongly doubt. Otherwise you're just creating problems for yourself.
One of the most common mistakes you can make with Node.js is serving static content from the public folder (at least for serious projects). Don't. Use a CDN instead: you'll get better telemetry (depending on the CDN), redundancy, faster delivery, and so on. If you aren't expecting high traffic volumes and the performance of delivering that static content isn't critically important yet, you can even go with a regular hosting provider; I've done this with Namecheap and GoDaddy before.
Use shared (or dedicated, depending on size) Node.js hosting for your app and use CI/CD to deploy it. You can use CNAME records to map whatever domain name you want your app on (e.g. https://something.com) to the cloud-hosting provider's URL for your app. I've used several platforms (Azure, Heroku, Namecheap) for the apps, and primarily Azure DevOps to manage the CI/CD pipeline, although Jenkins is super popular as well. I'd recommend Heroku, since it provides a very easy setup.
When designing any API over HTTP, you should assume people will call it directly. See this answer for more details: How to prevent non-browser clients from sending requests to my server. I'm not saying don't put something like Cloudflare in front of it, but you may be overthinking this; look at your traffic first and add it when you need it. As long as you have the right authentication / authorization mechanism in place, securing the API shouldn't be a big problem on these platforms, and if you deploy to one of them you won't have to deal with ports either. Unless you reach absolutely massive scale, it will definitely be cheaper for you to operate with high reliability this way.
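For instance, a minimal token check on the API itself might look something like this (just a sketch using Express and the jsonwebtoken package; the secret handling and route names are placeholders):

```ts
// auth middleware sketch: verify a bearer token before any /api route runs
import express from "express";
import jwt from "jsonwebtoken";

const app = express();
const JWT_SECRET = process.env.JWT_SECRET ?? "change-me"; // placeholder

app.use("/api", (req, res, next) => {
  const header = req.headers.authorization ?? "";
  const token = header.startsWith("Bearer ") ? header.slice(7) : null;
  if (!token) return res.status(401).json({ error: "missing token" });
  try {
    // verify() throws if the signature is invalid or the token has expired
    (req as any).user = jwt.verify(token, JWT_SECRET);
    next();
  } catch {
    res.status(401).json({ error: "invalid token" });
  }
});

app.get("/api/orders", (req, res) => {
  res.json({ orders: [], user: (req as any).user });
});

app.listen(3000);
```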
You won't need to deal with nginx anymore.
I am working on implementing a microservices-based application using Node.js. While searching for examples of how to implement an API gateway, I came across the following article, which seems to provide a good example: https://memz.co/api-gateway-microservices-docker-node-js/. Examples of the API gateway pattern in Node.js seem a little hard to come by so far, so this article looked like a really good one.
There are a few items that are still unclear to me and that I'm having trouble finding documentation on.
1) Security is a major item for the app I am developing, and I'm having trouble seeing where authentication should take place. Using Passport, should I put the authentication logic in the API gateway and pass the JWT along with the request to the corresponding microservice, since the logged-in user's information is needed for certain activities? The only issue there seems to be that every microservice would then need Passport in order to verify the JWT and get the user's profile information. Also, would each microservice be inaccessible to the outside world except through the API gateway, since that seems to be the aim?
2) How does this scenario change if I need to scale to multiple servers, each running Docker images? How would this affect load balancing? It seems like something would have to sit at a higher level to deal with it.
I can tell that much depends on your application requirements. Really.
I'm now past five years of experience with production microservices, using several languages, on systems ranging from medium to very large scale.
None of them shared the same requirements, and without a deep understanding of what you need and what your business (product) requirements are, it's hard to know the right answer. That said, I'll try to share some experience to help you get it right.
Ideally you want security to be encapsulated in an external service, so that you can update and apply new policies faster. You'll also be able to revoke all existing tokens should you find a breach in your system, or if someone on your team inadvertently pushes a secret key (or certificate) to an external service.
You could handle authentication in each individual service or with an edge network tool (such as the API gateway). Be careful choosing how to handle it, because each approach has its own trade-offs:
Choosing the API gateway, your services stay lighter and don't need to know anything about the authentication steps, but at some point you will need to know who the authenticated user is, and you need some plain reference to them (a JSON record, or a link or ID into a "user profile" service). How you do that is up to your requirements, and we could go even deeper into the pros and cons of each possible choice for your case; there is a small sketch of this option after these two paragraphs.
Choosing to handle it at the service level requires you (and your teams) to understand the security process better (you can hide it behind a good library), and the services will need support from your security team (which may also just be you; the more services implement security, the more things you have to think about to avoid adding unnecessary features). The big problem here is that you'll often end up stopping your tasks to think about what would help this particular service, and you'll be tempted to extend your authentication service (and, unless you really know what you're doing, don't add a single call that isn't needed for authentication purposes).
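To make the first option concrete, here is a rough sketch of a gateway that verifies the JWT once and forwards only a plain user reference to the downstream service, so the services themselves never need Passport or the signing key (Express plus jsonwebtoken; the header name and the service URL are made up for illustration, and request body forwarding is omitted):

```ts
// api-gateway sketch: authenticate once, then proxy with a plain user reference
import express from "express";
import jwt from "jsonwebtoken";

const app = express();
const JWT_SECRET = process.env.JWT_SECRET ?? "change-me"; // placeholder
const ORDERS_SERVICE = "http://127.0.0.1:3001";           // internal-only service

app.use(async (req, res) => {
  try {
    const token = (req.headers.authorization ?? "").replace(/^Bearer /, "");
    const claims = jwt.verify(token, JWT_SECRET) as jwt.JwtPayload;

    // Downstream services only receive an opaque user id, never the token
    const upstream = await fetch(ORDERS_SERVICE + req.url, {
      method: req.method,
      headers: { "x-user-id": String(claims.sub) },
    });
    res.status(upstream.status).send(await upstream.text());
  } catch {
    res.status(401).json({ error: "unauthenticated" });
  }
});

app.listen(8080);
```

Whether you forward a bare ID, a fuller profile record, or a re-signed internal token is exactly the kind of trade-off mentioned above.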
One thing is easy to determine: you surely need to think about tokens (JWT, JWE or, again, whatever your requirements impose).
JWT has good benefits, but its payload is signed, not encrypted: anyone holding the token can read it, so never put sensitive data in it, or anything you wouldn't publicly share about your user (an ID is probably fine, while security questions or 2FA material would not be). JWE is the encrypted variant of the spec. An opaque token (one with no meaning in itself) requires a backend lookup to get the data, but it works much like a cookie session and the data never leaves your servers.
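You can see the "signed but readable" property directly: the payload of any JWT can be decoded without knowing the key (a small sketch; the payload values are invented):

```ts
import jwt from "jsonwebtoken";

// Sign a token the way a login endpoint might
const token = jwt.sign({ sub: "user-42", role: "member" }, "server-secret", {
  expiresIn: "1h",
});

// Anyone holding the token can read the payload without the secret:
const payload = JSON.parse(
  Buffer.from(token.split(".")[1], "base64url").toString("utf8")
);
console.log(payload); // { sub: 'user-42', role: 'member', iat: ..., exp: ... }

// Only verification needs the secret, and tampering breaks the signature
jwt.verify(token, "server-secret");   // ok
// jwt.verify(token, "wrong-secret"); // would throw JsonWebTokenError
```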
You need to define the boundaries of your services yourself, and do yourself a favor: make each service's boundaries clean, well defined and standard.
Try to define common policies and standardize interactions. I know it may seem easier to add a queue here, a REST endpoint there, an RPC over there, but you'll soon end up with a pile of IPC mechanisms you can no longer manage, and it will catch up with you.
Also, if your business solution is already a heavy lift, I don't think it's a good idea to build the API gateway, the security layer and so on yourself. I'd go with open-source, community-supported (or even company-backed, if you have some budget), production-tested solutions.
By definition microservice architectures are very dynamic; you'll fight to keep them immutable between deployment versions, but unless you're a big firm you cannot afford to keep thousands of servers live. This means you'll discover bugs that only show up under circumstances you cannot reproduce in other environments (it often happens that you can't reproduce them at all).
By choosing to develop the whole stack yourself, you accept having to deal with maintenance and bug-hunting across that whole stack. So when a page that involves 25 interacting services fails to load, you know it may be failing because of a bug in: your API gateway, your security implementation, your token parser, your user account service, your business services A to N, your database service (if any), your database load balancer (if any), or your database instance.
I know it's tempting to do everything yourself, but try to keep it simple and do only what you need to do. By following this path you'll be thinking about your product, which I think is the most important thing to do right now.
To complete my answer, about the scaling issues:
it doesn't matter much. Whichever choice you pick, it will scale:
The API gateway should be able to work against a pool of backends: from that server you can redirect to N backend machines that you bring live when you need them, you can even have an API to support automatic registration of new instances, or you can simply point it at an Elastic Load Balancer, HAProxy or an equivalent, and as you add backends behind them it will just work (you've moved the multiple-IPs issue from the API gateway one layer down; see the sketch after this list).
If you handle authentication at the service level (and you have an API gateway), see #1.
If you handle authentication at the service level (without an API gateway), then you need to look at some other layer of your stack: load balancing (layer 3 or layer 7) or DNS. You can use several DNS features to serve different IPs, even advanced ones like Anycast if you need latency-based distribution.
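As a toy illustration of the pool-of-backends point above, the gateway-side logic can be as simple as round-robin selection over a list that you refresh from instance registration, an ELB, or configuration (the backend addresses here are hardcoded purely for the sketch):

```ts
// round-robin backend picker: the only state a naive gateway pool needs
const backends = [
  "http://10.0.0.11:3000",
  "http://10.0.0.12:3000",
  "http://10.0.0.13:3000",
];
let next = 0;

function pickBackend(): string {
  const target = backends[next];
  next = (next + 1) % backends.length;
  return target;
}

// e.g. inside the gateway's request handler:
//   const upstream = await fetch(pickBackend() + req.url, { headers: {...} });
```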
I know this answer introduces a lot of other questions, but I have really tried to answer yours. The fact is that you need to understand and evaluate a lot of things when planning a microservice architecture, and I wouldn't write a single line of code without a well-written plan printed on every wall of my office.
You'll often need to step back from a single service, refocus, and review the global picture to check that everything is going fine.
I don't want to scare you; rather, I'm trying to make you think so that you succeed.
I just want to make sure you have correctly evaluated all of the possibilities before deciding to build everything from scratch.
P.S. Should you choose to use an API gateway, be sure to limit your services to accepting requests only through it. On a single machine, just listen on localhost; across multiple machines you'll need more advanced networking rules, depending on your operating system and environment.
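On a single machine, that restriction is just the bind address (a sketch; the port number is arbitrary):

```ts
import express from "express";

const service = express();
service.get("/internal/orders", (_req, res) => res.json([]));

// Bind to loopback only: reachable by the gateway on the same host,
// invisible to the outside world
service.listen(3001, "127.0.0.1", () =>
  console.log("orders service on 127.0.0.1:3001")
);
```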
Good Luck!
While there are many social networks in the wild, most rely on data stored on a central site owned by a third party.
I'd like to build a solution where data remains local on members' systems. Think of the project as an address book which automagically updates a contact's data as soon as that contact changes their coordinates. This base idea might get extended later on...
Updates will be transferred using public/private-key cryptography, via a central host. The sole role of the host is to be a store-and-forward intermediary. Private keys remain private on each member's system.
If two clients are both online and a P2P connection can be established, the clients can transfer data telegrams without the central host.
Thus, sender and receiver will be the only parties able to create authentic messages.
Questions:
Are there existing protocols I should adopt?
Are there any security concerns I should keep in mind?
Are there existing services that should be integrated or used somehow?
More technically:
Should I use services provided by e.g. Amazon or Google?
Or is it better to use a plain web server? If so, why?
Which algorithm and key length should be used?
UPDATE-1
I googled my own question title and found this academic project, developed in 2008/09: http://www.lifesocial.org/.
The solution you are describing sounds remarkably like email, with encrypted messages as the payload, and an application rather than a human being creating the messages.
It doesn't really sound like "p2p" - in most P2P protocols, the only requirement for central servers is discovery - you're using store & forward.
As a quick proof of concept, I'd set up an email server and build an application that sends emails, encrypted using PGP, to addresses registered on that server; the tooling and libraries are available, so you should be able to get that up and running in days rather than weeks. In my experience, building a throw-away PoC for this kind of question is a great way of sifting out the nugget of the idea.
The second issue is that the nature of a social network is that it's a network. Your design may require you to store more than the data of the two direct contacts - you may also have to store their friends, or at least the public interactions those friends have had.
This may not be part of your plan, but if it is, you need to think it through early on - you may end up having to transmit the entire social graph to each participant for local storage, which creates a scalability problem....
The paper about Safebook might be interesting for you.
Also you could take a look at other distributed OSN and see what they are doing.
None of the federated networks mentioned on http://en.wikipedia.org/wiki/Distributed_social_network is actually distributed. What Stefan intends to do is indeed new and was only explored by some proprietary folks.
I've been thinking about the same concept for the last two years. I've finally decided to give it a try using Python.
I've spent the better part of last night and this morning writing a socket communication script and server. I also plan to remove the central server from the equation, as it's just plain cumbersome and there's no point to it when all the members could keep copies of their friends' keys.
Each profile could be accessed via a hashed string of someone's public key. My social network relies on nodes and pods. Pods are computers which have their ports open to the network. They help with relaying traffic as most firewalls block incoming socket requests. Nodes store information and share it with other nodes. Each node will get a directory of active pods which may be used to relay their traffic.
The PeerSoN project looks like something you might be interested in: http://www.peerson.net/index.shtml
They have done a lot of research and the papers are available on their site.
Some thoughts about it:
protocols to use: you could look at existing P2P programs and how they are designed
security concerns: privacy. Take great care not to open doors: a whole system can be compromised because of one door left open.
services: you could integrate with the regular social networks through their APIs
People will have to install a program on their computers and remember to open it every time, like any P2P client. Leaving everything on a web server has a smaller footprint and requires less user action.
Somehow you'll need a centralized server to manage searches; you can't just broadcast across the internet to find friends. Otherwise you'll have to rely upon email requests to add someone, and to do that you'll need to know their email address in advance.
The fewer friends/contacts who use your program, the fewer people will want to use it, since it won't have contact information available.
I see that your server will be a store and forward, so the update problem is solved.
We're trying to implement the Gatekeeper design pattern as recommended in Microsoft's Security Best Practices for Azure, but I'm having some trouble determining how to do that.
To give some background on the project, we're taking an already developed website using the traditional layered approach (presentation, business, data, etc.) and converting it over to use Azure. The client would like some added security built around this process since it will now be in the cloud.
The initial suggestion for handling this was to use queues and have worker roles process requests entered into the queue. Some of the concerns we've come across are how to properly serialize the objects and indicate which methods need to run on them, as well as the latency inherent in such an approach.
We've also looked at setting up some WCF services in the worker role, but I'm having a little trouble wrapping my head around exactly how to handle this. (In addition to this being my first Azure project, it would also be my first attempt at WCF.) We'd run into the same object serialization issue here.
Another thought was to set up some web services in another web role, but that seems to open the same security issue since we won't be able to perform IP-based security on the request.
I've searched and searched but haven't really found any samples that do what we're trying to do (or I didn't recognize them as doing so). Can anyone provide some guidance with code samples? Thanks.
Please do not take this the wrong way, but it sounds like you are in danger of over-engineering a solution based on the "requirement" that 'the client would like some added security'. The Gatekeeper pattern described on page 13 of the Security Best Practices For Developing Windows Azure Applications document is a very big gun which you should only fire at large targets, i.e., scenarios where you actually need hardened applications storing highly sensitive data. Building something like this can cost a lot of time and performance, so make sure you weigh the pros and cons thoroughly.
Have you considered leveraging SQL Azure firewall as an additional (and possibly acceptable) security measure? You can specify access on an IP address level and even configure it programmatically through stored procedures. You can block all external access to your database, making your Azure application (web/worker roles) the only "client" that is allowed to gain access.
To answer one of your questions specifically, you can secure access to a WCF service using X.509 certificates and implement message security; if you also need an SSL connection to protect data in transit you would need to use both message and transport security. It's not the simplest thing on earth, but it's possible. You can make it so only the servers that have the correct certificate can make the WCF request. Take a look at this thread for more details and a few more pointers: http://social.msdn.microsoft.com/Forums/en-US/windowsazuresecurity/thread/1f77046b-82a1-48c4-bb0d-23993027932a
Also, WCF makes it easy to exchange objects as long as you mark them Serializable. So making WCF calls would dramatically simplify how you exchange objects back and forth with your client(s).
I'm working on an external web site (in DMZ) that needs to get data from our internal production database.
All of the designs that I have come up with are rejected because the network department will not allow a connection of any sort (WCF, Oracle, etc.) to come inside from the DMZ.
The suggestions that have come from the networking side generally fall into two categories:
1) Export the required data to a server in the DMZ, and somehow keep exporting modified/inserted records to it over time, or
2) Poll from inside, continually asking a service in the DMZ whether it has any requests that need to be serviced.
I'm averse to suggestion 1 because I don't like the idea of a database sitting in the DMZ. Option 2 seems like a ridiculous amount of extra complication for the nature of what's being done.
Are these the only legitimate solutions? Is there an obvious solution I'm missing? Is the "No connections in from DMZ" decree practical?
Edit: One line I'm constantly hearing is that "no large company allows a web site to connect inside to get live production data. That's why they send confirmation emails". Is that really how it works?
I'm sorry, but your networking department are on crack or something like that - they clearly do not understand what the purpose of a DMZ is. To summarize - there are three "areas" - the big, bad outside world, your pure and virginal inside world, and the well known, trusted, safe DMZ.
The rules are:
Connections from outside can only get to hosts in the DMZ, and on specific ports (80, 443, etc);
Connections from the outside to the inside are blocked absolutely;
Connections from the inside to either the DMZ or the outside are fine and dandy;
Only hosts in the DMZ may establish connections to the inside, and again, only on well known and permitted ports.
Point four is the one they haven't grasped - the "no connections from the DMZ" policy is misguided.
Ask them "How does our email system work then?" I assume you have a corporate mail server, maybe exchange, and individuals have clients that connect to it. Ask them to explain how your corporate email, with access to internet email, works and is compliant with their policy.
Sorry, it doesn't really give you an answer.
I am a security architect at a Fortune 50 financial firm. We had these same conversations. I don't agree with your network group. I understand their angst, and I understand that they would like a better solution, but most places don't opt for the better choices (due to ignorance on their part [i.e. the network guys, not you]).
Two options if they are hard set on this:
You can use a SQL proxy solution like GreenSQL (I don't work for them, I just know of them); they are at greensql dot com.
The approach they refer to, which most large orgs use, is a tiered web model: a front-end web server (accessed by the public at large), a mid-tier (the application or services layer where the actual processing occurs), and a database tier. The mid-tier is the only thing that can talk to the database tier. In my opinion this model is optimal for most large orgs. BUT, that being said, most large orgs run into one of the following: a vendor-provided product that does not support a middle tier; an application developed without a middle tier where the transition requires development resources they don't have to spare for building the mid-tier services; or simply no priority at the company to go that route.
It's a gray area, with no solid right or wrong, so if they are speaking in terms of finality then they are clearly wrong. I applaud their zeal; as a security professional I understand where they are coming from. BUT we have to enable the business to function securely. That's the challenge and the gauntlet I always throw down to myself: how can I deliver what my customers (my developers, my admins, my DBAs, business users) want, within reason? And if I tell someone no, I always try to offer an alternative that meets most of their needs.
Honestly, it should be an open conversation. Here's where I think you can get some room: ask them to threat-model the risk they are looking to mitigate, and ask them to offer alternative solutions that let your web apps function. If they are saying the tiers can't talk, put the onus on them to provide a solution; if they can't, you default to it working. State that you will open connections from the DMZ to the database ONLY on the approved ports. Let them know that the DMZ is for offering external services, and external services aren't much good without internal data for anything more than file-transfer solutions.
Just my two cents; I hope this helps. And try to go easy on my security brethren: we have some less experienced, misguided members in our flock who cling to old ways of doing things. As the world evolves, the threats evolve, and so does our approach to mitigation.
Why don't you replicate your database servers? You can ensure that the connection is from the internal servers to the external servers and not the other way.
One way is to use the Microsoft Sync Framework: you can build a simple Windows service that synchronizes changes from the internal database to your external database (which can reside on a separate DB server) and then use that in your public-facing website. The advantage is that your sync logic can filter out sensitive data and keep only what is really necessary. And since control of the data stays entirely on your internal servers (you PUSH data out instead of pulling it), I don't think IT will have an issue with that.
The connection is never formed inbound, only outbound, which means no inbound ports need to be opened.
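The Sync Framework does the heavy lifting for you, but the essential shape of that outbound push is easy to see; here is a sketch in Node instead (every name and endpoint below is hypothetical, and the point is only that the internal side initiates the connection and filters what leaves):

```ts
// internal push job sketch: runs inside the network, connects outward only
type CustomerRow = { id: number; name: string; email: string; ssn: string };

// hypothetical helper: read rows changed since the last successful push
declare function getChangedRowsSince(cursor: string): Promise<CustomerRow[]>;

let lastPush = new Date(0).toISOString();

async function pushChanges(): Promise<void> {
  const rows = await getChangedRowsSince(lastPush);

  // Filter out sensitive columns before anything leaves the internal network
  const publicRows = rows.map(({ id, name }) => ({ id, name }));

  // Outbound HTTPS call to the external/DMZ database's ingest endpoint
  await fetch("https://dmz-db.example.com/ingest", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(publicRows),
  });

  lastPush = new Date().toISOString();
}

// Run on a schedule, like the Windows service mentioned above
setInterval(() => void pushChanges(), 60_000);
```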
I'm mostly with Ken Ray on this; however, there appears to be some missing information. Let's see if I get this right:
You have a web application.
Part of that web application needs to display data from a different production server (not the one that normally backs your site).
The data you want/need is handled by a completely different application internally.
This data is critical to the normal flow of your business and only a limited set needs to be available to the outside world.
If I'm on track, then I would have to say that I agree with your IT department and I wouldn't let you directly access that server either.
Just take option 1. Have the production server export the data you need to a commonly accessible drop location. Have the other db server (one in the DMZ) pick up the data and import it on a regular basis. Finally, have your web app ONLY talk to the db server in the dmz.
Given how a lot of people build sites these days, I would also be loath to just open a SQL port from the web server in the DMZ to the internal database. Quite frankly, I could be convinced to open the connection if I was assured that: 1. you only used stored procs to access the data you need; 2. the credentials used to access the database were stored encrypted and the account was completely restricted to running only those procs; 3. those procs contained zero dynamic SQL and were limited to SELECTs; 4. your code was built right.
A regular IT person would probably not be qualified to answer all of those questions. And if this database belongs to a third-party product, I would bet you might lose support if you started accessing it from outside its normal application.
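For what it's worth, points 1 and 3 amount to the web app only ever issuing calls like the following (a sketch using node-postgres; the procedure name, account and connection details are invented):

```ts
import { Pool } from "pg";

// Account with no table privileges, only EXECUTE on a whitelisted set of procs
const pool = new Pool({
  host: "internal-db.example.local",
  user: "web_readonly",
  password: process.env.DB_PASSWORD, // stored encrypted, injected at runtime
  database: "production",
});

// No dynamic SQL: a fixed, parameterized call to a read-only stored procedure
export async function getPublicOrders(customerId: number) {
  const result = await pool.query(
    "SELECT * FROM get_public_orders($1)",
    [customerId]
  );
  return result.rows;
}
```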
Before talking about your particular problem I want to deal with the Update that you provided.
I haven't worked for a "large" company (though "large" is hard to judge without context), but I have built my share of web applications for the non-profit and the university department I used to work for. In both situations I always connected from the web server in the DMZ to the production DB on the internal network. I am pretty sure many large companies do this too; think, for example, of how SharePoint's architecture is set up: back-end indexing, database and other servers, connected to by front-facing web servers located in the DMZ.
Also, the practice of sending confirmation e-mails (which I believe refers to the confirmations you get when registering for a site) doesn't usually have much to do with security; it's more a method of verifying that a user has entered a valid e-mail address.
Now, with that out of the way, let's look at your problem. Unfortunately, other than the two solutions you presented, I can't think of any other way to do this. Still, here are some things you might want to think about:
Solution 1:
Depending on the sensitivity of the data that you need to work with, extracting it onto a server on the DMZ - whether using a service or some sort of automatic synchronization software - goes against basic security common sense. What you have done is move the data from a server behind a firewall to one that is in front of it. They might as well just let you get to the internal db server from the DMZ.
Solution 2:
I am no networking expert, so please correct me if I am wrong, but a polling mechanism still requires some communication from the web server to tell the database server what data it needs, which means a port has to be open somewhere; again, you might as well ask them to let you reach the internal database without the hassle, because you haven't really added any security with this method.
So, I hope this at least provides you with some arguments for being allowed to access the data directly. To me it seems like there are many misconceptions in your network department over what a secure, database-backed web application architecture should look like.
Here's what you could do... it's a bit of a stretch, but it should work...
Write a service that sits on the server in the DMZ. It will listen on three ports, A, B, and C (pick whatever port numbers make sense). I'll call this the DMZ tunnel app.
Write another service that lives anywhere on the internal network. It will connect to the DMZ tunnel app on port B. Once this connection is established, the DMZ tunnel app no longer needs to listen on port B. This is the "control connection".
When something connects to port A of the DMZ tunnel app, it will send a request over the control connection for a new DB/whatever connection. The internal tunnel app will respond by connecting to the internal resource. Once this connection is established, it will connect back to the DMZ tunnel app on port C.
After possibly verifying some tokens (this part is up to you) the DMZ tunnel app will then forward data back and forth between the connections it received on port A and C. You will effectively have a transparent TCP proxy created from two services running in the DMZ and on the internal network.
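The core of both services is just relaying bytes between two sockets. Here is only that relay step (a sketch with Node's net module; host names and ports are placeholders, and the connection pairing and token checking described above are omitted):

```ts
import net from "net";

// Pipe two established sockets into each other until either side closes.
// In the DMZ tunnel app, `a` would be the connection accepted on port A and
// `b` the matching connection accepted on port C.
function relay(a: net.Socket, b: net.Socket): void {
  a.pipe(b);
  b.pipe(a);
  const close = () => {
    a.destroy();
    b.destroy();
  };
  a.on("error", close);
  b.on("error", close);
  a.on("close", close);
  b.on("close", close);
}

// The internal tunnel app's side of one request: dial the internal resource,
// then call back out to the DMZ tunnel app on port C and bridge the two.
function openInternalLeg(dmzHost: string, portC: number, dbPort: number): void {
  const toDb = net.connect(dbPort, "internal-db.example.local");
  const toDmz = net.connect(portC, dmzHost);
  relay(toDb, toDmz);
}
```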
And, for the best part, once this is done you can explain what you did to your IT department and watch their faces as they realize that you did not violate the letter of their security policy, but you are still being productive. I tell you, they will hate that.
If no development solution can be applied because of system engineering restrictions in the DMZ, then put the ball back in their court.
Put your website on the intranet and tell them: 'Now I need inbound HTTP (80) or HTTPS (443) connections to that application. Set up whatever you want: reverse proxies, ISA Server, protocol breaks, SSL... I will adapt my application if necessary.'
As for ISA, I guess they already have one if you run Exchange with external connections.
Lots of companies choose this solution when a resource needs to be shared between the intranet and the public.
Setting up a dedicated intranet network with strict security rules is the best way to make administration, integration and deployment easier. What is easier is well known, and what is well known is mastered: fewer security breaches.
More and more system engineers (like mine) prefer to maintain an intranet network with a small, well-understood 'breach' like HTTP rather than open up other protocols and ports.
By the way, if they knew WCF services they would have accepted that solution; it is the most secure option if well designed.
Personally, I use these two methods: TCP services (HTTP or not) and ISA Server.