We would like to add an additional layer of security to our database, and we want to make sure that even if the DB files leak to the public, no one will be able to reach the actual data inside them.
Additionally, we want to make sure that even if an encryption key leaks, it will only allow one or a few rows to be decrypted, while the others remain inaccessible.
What are the best practices for doing that?
In case it matters: we are using Rails (5.1) + Postgres (9.6), and our database is running in AWS RDS.
From the way your question is asked, you are simply not ready for this and are heading for a world of pain if you try to tackle it. Again, applying a lot of inference: the fact that you seem confused about where to start makes me think there are a lot of other security measures you have not put in place, measures which might give you greater benefits for less cost and risk.
There are several different conventional models for applying encryption to data at rest. What you ask as additional requirements will require a tremendously complex application tier managing many, many encryption keys. Most experts would quake at the prospect of trying to implement this. Where are you going to store all the keys? What you've not asked is also very telling - what impact do you think opaque data has on query performance?
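To make the key-management problem a little more concrete: one way applications avoid storing a separate key for every row is to derive each row key from a single master key. The sketch below is only an illustration, assuming HKDF (RFC 5869) with SHA-256 built on Python's standard hmac module. The names derive_row_key, "patients" and the row IDs are hypothetical, and the actual encryption step (which should come from a vetted library) is deliberately left out.

```python
import hashlib
import hmac
import secrets

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256


def hkdf(master_key: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869): extract a pseudorandom key, then expand it."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key is too long for HKDF")
    # Extract step: an empty salt is replaced by a block of zero bytes.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, master_key, HASH).digest()
    # Expand step: chain HMAC blocks until enough output is available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_row_key(master_key: bytes, table: str, row_id: int) -> bytes:
    """Hypothetical helper: one 256-bit key per (table, row) pair."""
    info = f"{table}:{row_id}".encode("utf-8")
    return hkdf(master_key, info)


# Assumption: in practice the master key would live in a KMS or HSM,
# never in the database or next to the DB files.
master_key = secrets.token_bytes(32)
key_a = derive_row_key(master_key, "patients", 1)
key_b = derive_row_key(master_key, "patients", 2)
```

Note what this does and does not buy you: a leaked row key exposes only that row, but a leaked master key still exposes everything, so the question of where the key is stored does not go away.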
Yes, if your application sits in AWS there are different potential risks of direct file access than on a dedicated device - but you deal with that by using filesystem or block level encryption or the native capabilities of your DBMS (understanding what impact the latter has on your data access).
Related
Very closely related: How to protect strings without SecureString?
Also closely related: When would I need a SecureString in .NET?
Extremely closely related (OP there is trying to achieve something very similar): C# & WPF - Using SecureString for a client-side HTTP API password
The .NET Framework has a class called SecureString. However, even Microsoft no longer recommends its use for new development. According to the first linked Q&A, at least one reason for that is that the string will be in memory in plaintext anyway for at least some amount of time (even if it's a very short amount of time). At least one answer also extended the argument: if an attacker has access to the server's memory, security is in practice probably shot already, so it won't help you. (The second linked Q&A implies that there was even discussion of dropping this from .NET Core entirely.)
That being said, Microsoft's documentation on SecureString does not recommend a replacement, and the consensus on the linked Q&A seems to be that that kind of a measure wouldn't be all that useful anyway.
My application, which is an ASP.NET Core application, makes extensive use of API Calls to an external vendor using the HttpClient class. The generally-recommended best practice for HttpClient is to use a single instance rather than creating a new instance for each call.
However, our vendor requires that all API Calls include our API Key as a header with a specific name. I currently store the key securely, retrieve it in Startup.cs, and add it to our HttpClient instance's headers.
Unfortunately, this means that my API Key will be kept in plaintext in memory for the entire lifecycle of the application. I find this especially troubling for a web application on a server; even though the server is maintained by corporate IT, I've always been taught to treat even corporate networks as semi-hostile environments and not to rely purely on corporate firewalls for application security in such cases.
Does Microsoft have a recommended best practice for cases like this? Is this a potential exception to their recommendation against using SecureString? (Exactly how that would work is a separate question). Or is the answer on the other Q&A really correct in saying that I shouldn't be worried about plaintext strings living in memory like this?
Note: Depending on responses to this question, I may post a follow-up question about whether it's even possible to use something like SecureString as part of HttpClient headers. Or would I have to do something tricky like populate the header right before using it and then remove it from memory right afterwards? (That would create an absolute nightmare for concurrent calls though). If people think that I should do something like this, I would be glad to create a new question for that.
You are being WAY too paranoid.
Firstly, if a hacker gets root access to your web server, you have WAY bigger problems than your super-secret web app credentials being stolen. Way, way, way bigger problems. Once the hackers are on your side of the airtight hatchway, it is game over.
Secondly, once your infosec team detects the intrusion (if they don't, again, you've got WAY bigger problems) they're going to tell you and the first thing you're going to do is change every key and password you know of.
Thirdly, if a hacker does get root access to your webserver, their first thought isn't going to be "let's take a memory dump for later analysis". A dumpfile is rather large (will take time to transfer over the wire, and the network traffic might well be noticed) and (at least on Windows) hangs the process until it's complete (so you'd notice your web app was unresponsive) - both of which are likely to raise some red flags.
No, hackers are there to grab as much valuable information in the least amount of time, because they know their access could be discovered at any second. So they're going to go for the low-hanging fruit first - usernames and passwords. Then they'll move on to trying to find out what's connected to that server, and since your DB credentials are likely in a config file on that server, they will almost certainly switch their attentions to that far more interesting target.
So all things considered, your API key is pretty darn unlikely to be compromised - and even if it is, it won't be because of something you did or didn't do. There are far more productive ways of focusing your time than trying to secure something that already is (or should be) incredibly secure. And, at the end of the day, no matter how many layers of security you put in place... that API or SSL key is going to be raw, in memory, at some stage.
I would like to get my head around how best to set up a client-server architecture where security is of the utmost importance.
So far I have the following. I hope someone can tell me whether it's good enough, whether there are other things I need to think about, or whether I have the wrong end of the stick and need to rethink things.
Use SSL certificate on the server to ensure the traffic is secure.
Have a firewall set up between the server and client.
Have a separate sql db server.
Have a separate db for my security model data.
Store my passwords in the database using a secure hashing function such as PBKDF2.
Passwords generated using a salt which is stored in a different db to the passwords.
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
I would really like to know whether there are any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
I have tried searching for some diagrams which could help me understand but I cannot find any which seem to be appropriate.
Thanks in advance
Hardening your architecture can be a challenging task, and sharding your services across multiple servers and over-engineering your architecture for a semblance of security could prove to be your largest security weakness.
However, a number of questions arise when you come to design your IT infrastructure which can't be answered in a single SO answer (I will try to find some good white papers and append them).
There are a few things I would advise. They are somewhat opinionated, backed up by my own thoughts on the matter.
Your Questions
I would really like to know whether there are any other steps or layers I need to make this secure. Is storing everything in the cloud wise, or should I have some physical servers as well?
Settle for the cloud. You do not need to store things on physical servers anymore unless you have current business processes running core business functions that are already working on local physical machines.
Running physical servers increases your system administration requirements for things such as HDD encryption and physical security requirements which can be misconfigured or completely ignored.
Use SSL certificate on the server to ensure the traffic is secure.
This is normally a no-brainer and I would go with a straight "Yes"; however, you must take the context into consideration. If you are running something such as a blog or documentation website that does not transfer any sensitive information over HTTP at any point in time, then why use HTTPS? HTTPS has its own overhead; it's minimal, but it's still there. That said, if in doubt, enable HTTPS.
Have a firewall set up between the server and client.
That is advisable. You may also want to opt for a service such as CloudFlare WAF, though I haven't personally used it.
Have a separate sql db server.
Yes, however not necessarily for security purposes. Database servers and Web Application servers have different hardware requirements and optimizing both simultaneously is not very feasible. Additionally, having them on separate boxes increases your scalability quite a bit which will be beneficial in the long run.
From a security perspective; it's mostly another illusion of, "If I have two boxes and the attacker compromises one [Web Application Server], he won't have access to the Database server".
At first sight this might seem to be the case, but it rarely is. Compromising the Web Application server is still almost a guaranteed Game Over. I will not go into much detail on this (unless you specifically ask me to); however, it's still a good idea to keep both services separate from each other in their own boxes.
Have a separate db for my security model data.
I'm not sure I understood this. What security model are you referring to exactly? Care to share a diagram or two (maybe an ERD) so we can get a better understanding?
Store my passwords in the database using a secure hashing function such as PBKDF2.
Obviously yes. What I am about to say, however, is controversial and may be flagged by some people (it's a bit of a hot debate): I recommend using BCrypt instead of PBKDF2, because BCrypt is slower to compute (and therefore slower to crack).
See - https://security.stackexchange.com/questions/4781/do-any-security-experts-recommend-bcrypt-for-password-storage
Passwords generated using a salt which is stored in a different db to the passwords.
If you use BCrypt I do not see why this is required (I may be wrong). I go into more detail on username and password hashing in the following StackOverflow answer, which I would recommend you read - Back end password encryption vs hashing
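As a rough illustration of why the salt does not need a database of its own, the sketch below stores a random per-password salt in the same record as the hash. It uses PBKDF2-HMAC-SHA256 only because that ships with Python's standard library; BCrypt libraries likewise embed the salt in the hash string they produce. The function names, the record format and the iteration count are assumptions made for this example, not recommended parameters.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return a self-describing record: scheme$iterations$salt$digest."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


record = hash_password("correct horse battery staple")
```

The salt only has to be unique per password, not secret, which is why keeping it next to the hash is the usual arrangement.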
Use cloud based infrastructure such as AWS to ensure that the system is easily scalable.
This purely depends on your goals, budget and requirements. I would personally go for AWS, however you should read some more on alternative platforms such as Google Cloud Platform before making your decision.
Last Remarks
All of the things you mentioned are important, and it's good that you are even considering them (most people just ignore such questions or go with the most popular answer). However, there are a few additional things I want to point out:
Internal Services - Make sure that no unrequired services and processes are running on the server, especially in production. These services will normally be running old versions of their software (since you won't be administering them) that could be used as an entry point for compromising your server.
Code Securely - This may seem like another no-brainer yet it is still overlooked or not done properly. Investigate what frameworks you are using, how they handle security and whether they are actually secure. As a developer (and not a pen-tester) you should at least use an automated web application scanner (such as Acunetix) to run security tests after each build that is pushed to make sure you haven't introduced any obvious, critical vulnerabilities.
Limit Exposure - Goes somewhat hand-in-hand with my first point. Make sure that services are only exposed to other services that depend on them and nothing else. As a rule of thumb, keep everything entirely closed and open up gradually when strictly required.
My last few points may come off as broad. The intention is to keep a certain philosophy when developing your software and infrastructure rather than a permanent rule to tick on a check-box.
There are probably a few things I have missed out. I will update the answer accordingly over time if need be. :-)
Can UUID on database level be used as a security measure instead of a true rights control?
Consider a web application where all servlets implement "normal" access control by having a session id connected to the user calling it (through the web client). All users are therefore authenticated.
The next level of security needed is checking whether an authenticated user actually "owns" the data being changed. In a web application this could for example be editing some text in a form. The client (JavaScript) makes sure a user doesn't do something wrong by accident. The issue is of course that any number of network tools could easily repeat the call made by the browser and, by only changing the ID, edit a different row in the database table behind the servlet that the user does not "own".
My question is whether it would be sufficient to use UUIDs as keys in the database table, thereby making it practically impossible to guess a valid ID (https://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates)? As far as I know a similar approach is used in Google Photos (http://www.theverge.com/2015/6/23/8830977/google-photos-security-public-url-privacy-protected) but I'm not sure it is 100% comparable.
Another option is of course to have every servlet verify that the user is only performing an action on its own data, but in a big application with 200+ servlets and 50-100 tables this could be a very cumbersome task where mistakes could easily happen. In my mind this weakens the security far more, but I'm not sure if that is true.
I'm leaning towards the UUID solution, but I'm also curious if there are other obvious approaches to this problem that I ought to consider.
Update:
I should probably have clarified that my plan would be to use UUIDv4, which is supposed to be random. I know that entropy comes into play here with regard to how random the UUIDs actually are, but as far as I have read, Java (which is the selected platform/language) uses SecureRandom, which is supposed to be "cryptographically strong" (link).
And in that case wiki states (link):
In other words, only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%.
Using UUIDs in this manner has two major issues:
If there are no additional authentication methods, any attacker could simply guess UUIDs until they find one belonging to someone else. Google Photos doesn't need to worry about this as much, because they only use UUIDs to obfuscate publicly-shared photo views; you still need to authenticate to modify the photos. This is especially dangerous because:
UUIDs are intended to be unique, not random. There are likely to be predictable patterns in your UUIDs that an attacker would be able to observe and take advantage of. In addition, even without a clear pattern, the number of UUIDs an attacker needs to test to find a valid one swiftly decreases as your userbase grows.
I will always recommend using secure, continuously-checked authentication. However, if you have a fairly small userbase, and you are only using this to obfuscate public data access, then using UUIDs in this manner might be alright. Even then, you should be using actual random strings, and not UUIDs.
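To put rough numbers on how the guessing effort shrinks as the userbase grows, here is a small back-of-the-envelope calculation. It assumes version 4 UUIDs whose 122 random bits all come from a cryptographically secure generator (an assumption that the predictable patterns mentioned above would invalidate) and an attacker who guesses uniformly at random. The function name is made up for this example.

```python
UUID4_RANDOM_BITS = 122  # 128 bits minus the 6 fixed version and variant bits


def expected_guesses(valid_ids: int) -> int:
    """Average number of uniform random guesses needed to hit any valid ID."""
    return 2 ** UUID4_RANDOM_BITS // valid_ids


for users in (1, 1_000, 1_000_000, 1_000_000_000):
    print(f"{users:>13,} valid IDs -> about {expected_guesses(users):.2e} guesses")
```

The effort falls in direct proportion to the number of valid IDs, and the calculation says nothing about IDs that leak or follow a pattern.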
Another option is of course to have every servlet verify that the user
is only performing an action on its own data, but in a big application
with 200+ servlets and 50-100 tables this could be a very cumbersome
task where mistakes could easily happen. In my mind this weakens the
security far more, but I'm not sure if that is true.
With a large legacy application, adding in security later is always a complex task. And you're right - the more complicated an application, the harder it is to verify security. Complexity is the main enemy of security.
However, this is still the best way to go, rather than trying to obscure insecure direct object reference problems.
If you are using these UUIDs in the query string, then this information within URLs may be logged in various locations, including the user's browser, the web server, and any forward or reverse proxy servers between the two endpoints. URLs may also be displayed on-screen, bookmarked or emailed around by users. They may be disclosed to third parties via the Referer header when any off-site links are followed. Placing direct object references into the URL increases the risk that they will be captured by an attacker. An existing user of the application whose access to certain data is later revoked will still be able to access that data by using a previously bookmarked URL (or their browser history). Even where the ID is passed outside of the URL mechanism, a local attacker who knows (or has figured out) how your system works could have purposely saved IDs just for the occasion.
As said by other answers, GUIDs/UUIDs are not meant to be unguessable, they are just meant to be unique. Granted, the Java implementation does actually generate cryptographically secure random numbers. However, what if this implementation changes in future releases, or what if your system is ported elsewhere where this functionality is different? If you're going to do this, you might as well generate your own cryptographically secure random numbers using your own implementation to use as identifiers. If you have 128bits of entropy in your identifiers, it is completely infeasible for anyone ever to guess them (even if they had all of the world's computing power).
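A minimal sketch of generating such identifiers yourself, shown here in Python with the standard secrets module, which draws from the operating system's cryptographically secure random source. The function name and the URL-safe text encoding are assumptions made for illustration.

```python
import secrets


def new_identifier(num_bytes: int = 16) -> str:
    """Return an unguessable identifier carrying num_bytes * 8 bits of entropy."""
    return secrets.token_urlsafe(num_bytes)


# 16 random bytes = 128 bits, encoded as 22 URL-safe characters
identifier = new_identifier()
```

Because the randomness comes from an API documented for security use, the guarantee does not depend on how a particular UUID implementation happens to behave.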
However, for the above reasons I recommend you implement access checks instead.
You are trying to bypass authorisation controls by hoping that the key is unguessable. This is a security no-no. Depending on whom you ask, they may refer to it as an insecure direct object reference or a violation of the complete mediation principle.
As noted by F. Stephen Q, the fact that UUIDs are unique does not imply that they are unpredictable. The threat here is that if a user knows a few UUIDs, say his own, does that allow him to predict other people's UUIDs? This is a very real threat, see: Cautionary note: UUIDs generally do not meet security requirements. Especially note what the UUID RFC says:
Do not assume that UUIDs are hard to guess; they should not be used as
security capabilities (identifiers whose mere possession grants
access), for example.
You can use UUIDs for keys, but you still need to do authorisation checks. When a user wants to access his data, the database should identify the owner of the data, and the server logic needs to enforce that the current user is the same as the database claims the owner is.
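A minimal sketch of such an ownership check, using an in-memory SQLite table so that it is self-contained. The table layout, the update_note function and the user IDs are hypothetical; in a real application current_user_id would come from the authenticated session, never from a request parameter.

```python
import sqlite3
import uuid


class Forbidden(Exception):
    """Raised when the current user does not own the requested row."""


def update_note(db: sqlite3.Connection, current_user_id: int, note_id: str, new_text: str) -> None:
    # The owner condition is part of the WHERE clause, so a row that belongs
    # to someone else (or does not exist) is simply never touched.
    cursor = db.execute(
        "UPDATE notes SET body = ? WHERE id = ? AND owner_id = ?",
        (new_text, note_id, current_user_id),
    )
    if cursor.rowcount != 1:
        raise Forbidden("note not found or not owned by the current user")
    db.commit()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE notes (id TEXT PRIMARY KEY, owner_id INTEGER NOT NULL, body TEXT)")
note_id = str(uuid.uuid4())  # UUID used as a key, not as an access control
db.execute("INSERT INTO notes VALUES (?, ?, ?)", (note_id, 1, "original"))
db.commit()

update_note(db, current_user_id=1, note_id=note_id, new_text="edited by owner")
```

Centralising the check in one data-access function like this, rather than repeating it in every servlet, is one way to keep the number of places where a mistake can happen small.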
I am working on a personal project and I have been considering the security of sensitive data. I want to use an API for accessing the Backend, and I want to keep the Backend on a different server from the one the user will log on to. This then requires cross-domain access to the data.
Considering that a lot of access and transactions will take place, I have the following questions, and I hope those who have tried and tested cross-domain access can guide me onto the right path. I don't want to assume, implement, and then run into trouble and have to redesign after I have launched the service, thereby losing sleep. I know there is no single right way to do many things in programming, but there are so many wrong ways.
How safe is it in handling sensitive data (even with https).
Does it have issues handling a lot of users transactions.
Does it have any downsides I have not mentioned.
These questions are asked because some posts I have read this evening discouraged the use of cross-domain access while others encouraged it. I decided to hear from professionals who have actually used it at a bigger scale.
I am actually building a Mobile App, using Laravel as the backend.
Thanks..
How safe is it in handling sensitive data (even with https).
SSL is generally considered safe (it's used everywhere and is considered the standard). Hitting a different server does not make it any less safe. The data still has to traverse the pipes and reach its destination, which carries the same risks regardless of the server.
Does it have issues handling a lot of users transactions.
I don't see why it would. A server is a server. Ultimately, your server's ability to handle a high volume of transactions is going to be based on its power, the efficiency of your code, and your application's ability to scale.
Does it have any downsides I have not mentioned.
Authentication is the only thing that comes to mind. I'm confused by your question as to how users would log into one server but access data from another. It seems that would all just be one application. If you want to revise your question, I'll update my answer.
Okay, so we have to store our clients' private medical records online, and the web site will also receive a lot of requests, so we have to use some scaling solutions.
We can have our own share of a datacenter and run something like Zend Server Cluster Manager on it, but services like Amazon EC2 look a lot easier to manage, and they are far cheaper too. We just don't know if they are secure enough!
Are they?
Any better solutions?
More info: I know that there is a reference server and it's highly secured and without it, even the decrypted data on the cloud server would be useless. It would be a bunch of meaningless numbers that aren't even linked to each other.
Making the question more clear: Are there any secure storage and process service providers that guarantee there won't be leaks from their side?
First off, you should contact AWS and explain what you're trying to build and the kind of data you deal with. As far as I remember, they have regulations in place to accommodate most if not all the privacy concerns.
E.g., in Germany such a thing is called an "Auftragsdatenvereinbarung". I have no idea how this relates and translates to other countries. AWS offers this.
But no matter whether you go with AWS or another cloud computing service, the issue stays the same. Therefore, what is possible is probably best answered by a lawyer, and based on that hopefully well-educated (and expensive) recommendation, I'd go cloud shopping, or maybe not. If you're in the EU, there are a ton of regulations, especially regarding medical records, and some countries add more on top.
From what I remember it's basically required to have end to end encryption when you deal with these things.
Last but not least, security also depends on the setup, the application, etc.
For complete and full security, I'd recommend a system that is not connected to the Internet. All others can fail.
You should never outsource highly sensitive data. Your company and only your company should have access to it - in both software and hardware terms. Even if your hoster is generally trusted, someone there might just steal hardware.
Depending on the size of your company, you should have your own servers - preferably even inaccessible to the technicians in your datacenter (supposing you don't own the datacenter ;).
So the more important the data is, the fewer outside people should have access to it by any means. In the best case you can name every person who has access to it in any way.
(Update: This might not apply to anonymous data, but as you're speaking of customers I don't think that applies here?)
(On a third thought: There are probably laws to take into consideration regarding how you have to handle that kind of information ;)