This may be a very newbie question, but exactly what do I need so that I can say my network is considered "secure"?
To be more specific, if I have a website that deals with login/signup and lots of money transactions, what do I need to protect it?
So far I know I need an EV SSL certificate and login system protections such as brute-force login protection, password hashing, and key stretching. Is there anything I missed?
Besides, is a firewall really necessary in my case? I just feel like everything I want to do can be accomplished by the server itself, so is there really a need to get a software/hardware firewall?
To be completely blunt, you should probably hire a security professional to assess and make recommendations about your site. Alternatively, a part or full-time network administrator with security experience/certifications might be a good hire.
I recommend the "don't do-it-yourself" approach not because I want to increase work for my peers, or because I don't believe you are a fully competent individual. Rather, I recommend it because security is really, really hard to get right, and any site that handles money is an ideal target for any attacker out there. From a professional perspective, you would be best served by getting an expert to secure your network, perhaps on an ongoing basis; this is a situation that security professionals are very used to, and very well equipped to handle. From a legal perspective, getting an expert opinion on such a sensitive matter is essential due diligence, and trying to do it entirely on your own opens you to significant liability if your system gets breached and attackers are able to carry off your customers' data. And as your business grows and you gain more visibility online, a breach only becomes more and more likely without ongoing, professional help.
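For the password items the question lists (hashing and key stretching), here is a minimal sketch using PBKDF2 from Python's standard library. The function names and the iteration count are assumptions made for illustration only; a professional review would choose the algorithm and parameters for a real site.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; tune it to your hardware and current guidance.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the password itself is never stored."""
    salt = os.urandom(16)  # unique random salt per user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the candidate password and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, expected_key)
```

The many iterations are the "stretching": each guess costs an attacker the same repeated work, which slows offline brute forcing if the stored hashes ever leak.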
I can appreciate that seeing "basic auth" and "safe enough" in the same sentence is a lot like reading "Is parachuting without a parachute still safe?", so I'll do my best to clarify what I am getting at.
From what I've seen online, people typically describe basic HTTP auth as being unsecured due to the credentials being passed in plain text from the client to the server; this leaves you open to having your credentials sniffed by a nefarious person or man-in-the-middle in a network configuration where your traffic may be passing through an untrusted point of access (e.g. an open AP at a coffee shop).
To keep the conversation between you and the server secure, the typical solution is to use an SSL-based connection, where your credentials might be sent in plain text, but the communication channel between you and the server is itself secured.
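To make the "plain text" point concrete: a Basic auth header is only Base64-encoded, not encrypted, so anyone who captures the request can reverse it. A small sketch in Python (the helper names are hypothetical, and the credentials are the well-known example pair from the HTTP authentication RFCs):

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header: str) -> tuple[str, str]:
    """Recover the credentials from a captured header; no key or secret is needed."""
    scheme, _, token = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)                     # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(decode_basic_auth(header))  # ('Aladdin', 'open sesame')
```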
So, onto my question...
In the situation of replicating one CouchDB instance from an EC2 instance in one region (e.g. us-west) to another CouchDB instance in another region (e.g. singapore) the network traffic will be traveling across a path of what I would consider "trusted" backbone servers.
Given that (assuming I am not replicating highly sensitive data) would anyone/everyone consider basic HTTP auth for CouchDB replication sufficiently secure?
If not, please clarify what scenarios I am missing here that would make this setup unacceptable. I do understand for sensitive data this is not appropriate, I just want to better understand the ins and outs for non-sensitive data replicated over a relatively-trusted network.
Bob is right, it is better to err on the side of caution, but I disagree. Bob could be right in this case (see details below), but the problem with his general approach is that it ignores the cost of paranoia. It leaves "peace dividend" money on the table. I prefer Bruce Schneier's assessment that it is a trade-off.
Short answer
Start replicating now! Do not worry about HTTPS.
The greatest risk is not wire sniffing, but your own human error, followed by software bugs, which could destroy or corrupt your data. Make a replica! If you will be replicating regularly, plan to move to HTTPS or something equivalent (SSH tunnel, stunnel, VPN).
Rationale
Is HTTPS easy with CouchDB 1.1? It is as easy as HTTPS can possibly be, or in other words, no, it is not easy.
You have to make an SSL key pair, purchase a certificate or run your own certificate authority—you're not foolish enough to self-sign, of course! The user's hashed password is plainly visible from your remote couch! To protect against cracking, will you implement bi-directional SSL authentication? Does CouchDB support that? Maybe you need a VPN instead? What about the security of your key files? Don't check them into Subversion! And don't bundle them into your EC2 AMI! That defeats the purpose. You have to keep them separate and safe. When you deploy or restore from backup, copy them manually. Also, password-protect them so if somebody gets the files, they can't steal (or worse, modify!) your data. When you start CouchDB or replicate, you must manually input the password before replication will work.
In a nutshell, every security decision has a cost.
A similar question is, "Should I lock my house at night?" It depends. Your profile says you are in Tucson, so you know that some neighborhoods are safe, while others are not. Yes, it is always safer to lock all of your doors all of the time. But what is the cost to your time and mental health? The analogy breaks down a bit because the time invested in worst-case security preparedness is much greater than twisting a bolt lock.
Amazon EC2 is a moderately safe neighborhood. The major risks are opportunistic, broad-spectrum scans for common errors. Basically, organized crime is scanning for common SSH accounts and web apps like WordPress, so they can steal a credit card or other database.
You are a small fish in a gigantic ocean. Nobody cares about you specifically. Unless you are specifically targeted by a government or organized crime, or somebody with resources and motivation (hey, it's CouchDB—that happens!), then it's inefficient to worry about the boogeyman. Your adversaries are casting broad nets to get the biggest catch. Nobody is trying to spear-fish you.
I look at it like high-school integral calculus: measuring the area under the curve. Time goes to the right (x-axis). Risky behavior goes up (y-axis). When you do something risky you save time and effort, but the graph spikes upward. When you do something the safe way, it costs time and effort, but the graph moves down. Your goal is to minimize the long-term area under the curve, but each decision is case-by-case. Every day, most Americans ride in automobiles: the single most risky behavior in American life. We intuitively understand the risk-benefit trade-off. Activity on the Internet is the same.
As you imply, basic authentication without transport layer security is 100% insecure. Anyone on EC2 that can sniff your packets can see your password. Assuming that no one can is a mistake.
In CouchDB 1.1, you can enable native SSL. In earlier versions, use stunnel. Adding SSL/TLS protection is so simple that there's really no excuse not to.
I just found this statement from Amazon which may help anyone trying to understand the risk of packet sniffing on EC2.
Packet sniffing by other tenants: It is not possible for a virtual instance running in promiscuous mode to receive or "sniff" traffic that is intended for a different virtual instance. While customers can place their interfaces into promiscuous mode, the hypervisor will not deliver any traffic to them that is not addressed to them. This includes two virtual instances that are owned by the same customer, even if they are located on the same physical host. Attacks such as ARP cache poisoning do not work within EC2. While Amazon EC2 does provide ample protection against one customer inadvertently or maliciously attempting to view another's data, as a standard practice customers should encrypt sensitive traffic.
http://aws.amazon.com/articles/1697
Okay, so we have to store our clients' private medical records online, and the web site will also receive a lot of requests, so we have to use some scaling solutions.
We can have our own share of a datacenter and run something like Zend Server Cluster Manager on it, but services like Amazon EC2 look a lot easier to manage, and they are far cheaper too. We just don't know if they are secure enough!
Are they?
Any better solutions?
More info: I know that there is a reference server and it's highly secured and without it, even the decrypted data on the cloud server would be useless. It would be a bunch of meaningless numbers that aren't even linked to each other.
Making the question more clear: Are there any secure storage and process service providers that guarantee there won't be leaks from their side?
First off, you should contact AWS and explain what you're trying to build and the kind of data you deal with. As far as I remember, they have regulations in place to accommodate most if not all the privacy concerns.
E.g., in Germany such a thing is called an "Auftragsdatenvereinbarung". I have no idea how this relates and translates to other countries. AWS offers this.
But no matter whether you go with AWS or another cloud computing service, the issue stays the same. Therefore, what is possible is probably best answered by a lawyer; based on that hopefully well-educated (and expensive) recommendation, I'd go cloud shopping, or maybe not. If you're in the EU, there are a ton of regulations, especially with regard to medical records, and some countries add more on top.
From what I remember it's basically required to have end to end encryption when you deal with these things.
Last but not least, security also depends on the setup, the application, etc.
For complete and full security, I'd recommend a system that is not connected to the Internet. All others can fail.
You should never outsource highly sensitive data. Your company and only your company should have access to it - in both software and hardware terms. Even if your hoster is generally trusted someone there might just steal hardware.
Depending on the size of your company you should have your own custom servers, preferably even inaccessible to the technicians in your datacenter (supposing you don't own the datacenter ;).
So the more important the data is, the less access outside people should have to it by any means. In the best case you can name every person who has access to it in any way.
(Update: This might not apply to anonymous data, but as you're speaking of customers I don't think that applies here?)
(On a third thought: There are probably laws to take into consideration regarding how you have to handle that kind of information ;)
I know SO isn't traditionally used this way (or maybe it is), but I've been learning about webapp security and was thinking it would be nice and encouraging to hear from SO experts what they think of this article (I'm reading it now, it's on session security).
http://carsonified.com/blog/dev/how-to-create-bulletproof-sessions/
Maybe we can have a discussion of some kind, point out what the author misstated/forgot and what better practices are there?
For example, when it comes to a different security topic like SQL injection, many people recommend things like mysql_real_escape_string, but the experts will tell you that nothing beats prepared statements. From the comments, this article seems to have its problems, so I'm wondering how far on the good or bad side its content is.
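To illustrate the prepared-statement point from the SQL injection example, here is a minimal sketch. It swaps in Python's sqlite3 module with an in-memory database instead of PHP and MySQL, and the table, column names, and helper function are hypothetical.

```python
import sqlite3

# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, balance INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 100), ("bob", 50)])


def find_user(conn: sqlite3.Connection, name: str) -> list:
    """Look up a user with a bound parameter instead of string concatenation."""
    # The ? placeholder passes the value separately from the SQL text,
    # so quotes inside `name` cannot change the structure of the query.
    query = "SELECT name, balance FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


malicious = "nobody' OR '1'='1"
print(find_user(conn, "alice"))    # [('alice', 100)]
print(find_user(conn, malicious))  # []
```

With string concatenation the second lookup would have matched every row; with a bound parameter the whole input is treated as a literal name that matches nothing.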
I think the article is quite nice; however, these are just the basic concepts, and anybody who seriously tries to build a security-aware application will already have addressed things like this. In other words, the level of the article is quite low.
Issues like a man-in-the-middle attack are not addressed here (although I can imagine that something like this is usually outside the scope of the application layer). Another possible vulnerability is random number generation. Depending on how session key generation is implemented, the entropy of the session keys could be much lower than the maximum possible entropy, which may or may not make brute-force attacks feasible.
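On the entropy point, a minimal sketch of generating session keys from the operating system's cryptographic random source rather than a general-purpose random number generator. It uses Python's secrets module; the function name and the 32-byte size are assumptions for illustration.

```python
import secrets


def new_session_id(num_bytes: int = 32) -> str:
    """Return a URL-safe session id backed by num_bytes of CSPRNG output."""
    # 32 random bytes give 256 bits of entropy, far beyond brute-force range.
    return secrets.token_urlsafe(num_bytes)


session_id = new_session_id()
print(len(session_id))  # 43
```

A generator seeded from something guessable, such as the current time, can produce ids that look just as long while carrying far less entropy, which is exactly the weakness described above.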
So the right solution really depends on your security requirements; there is no single security solution that works in all cases. To illustrate, imagine that you've got a valid session id and you know which IP address the session is bound to. Also assume that the target in this example is a bank. Now I can perform a request to transfer money to my account, and make this work by spoofing my IP address and providing the stolen session. OK, the reply to my request will never arrive since the IP address is spoofed, but who cares, I got the money since the server accepted my request.
The point is that depending on the context, your security requirements and thus your security solution(s) may greatly vary.
We know that every executable file can be reverse engineered (disassembled, decompiled). No matter how strong the security you implement, if crackers want to crack it, they will! It is just a question of time.
What about websites? May we say that a website can be completely safe from attacks by hackers (we assume that the hosting is not vulnerable)? If not, then what is the reason?
Yes it is always possible to do. There is always a way in.
It's like my grandfather always said:
Locks are meant to keep the honest people out
May we say that website can be completely safe from attacks of hackers?
No. Even the most secure technology in the world is vulnerable to social engineering attacks, for one thing.
You can easily write a webapp that is mathematically proven to be secure... But that proof will only hold as long as the underlying operating system, interpreter|compiler, and hardware are secure, which is never the case.
The key thing to remember is that websites are usually part of a huge and complex system and it doesn't really matter if the hacker enters the system through the web application itself or some other part of the entire infrastructure. If someone can get access to your servers, routers, DNS or whatever, they can bring down even the best web application. In my experience a lot of systems are vulnerable in some way or another. So "completely secure" means either "we're trying really hard to secure the platform" or "we have no clue whatsoever, but we hope everything is okay". I have seen both.
To sum up and add to the posts that precede:
Web as a shared resource - websites are useful so long as they are accessible. Render the web site inaccessible, and you've broken it. Denial-of-service attacks, which amount to flooding the server so that it can no longer respond to legitimate requests, will always be a factor. It's a game of keep away - big server sites find ways to distribute, hackers find ways to deluge.
Dynamic data = dynamic risk - if the user can input data, there's a chance for a hacker to be a menace. Today the big concepts are cross-site scripting and SQL injection, but once one avenue for cracking is closed off, chances are high that another mechanism will arise. You could, conceivably, argue that a totally static site can be secure from this, but then how many useful sites fit that bill?
Complexity = the more complex, the harder to secure - given the rapid change of technology, I doubt that any web developer could say with 100% confidence that a modern website was secure - there's too much unknown code. Taking the host aside (the server, network protocols, OS, and maybe database), there's still all the great new libraries in Java EE and .NET. And even a less enterprise-y architecture will have some serious complexity that makes knowing all potential inputs and outputs of the code prohibitively difficult.
The authentication problem = by definition, the web site lets a remote user do something useful on a server that is far away. Knowing and trusting the other end of the communication is an old challenge. These days server-side authentication is relatively well implemented and understood and (so far as I know!) no one's managed to hack PKI. But getting user authentication ironed out is still quite tricky. It's doable, but it's a tradeoff between difficulty for the user and for configuration, and a system with a higher risk of vulnerability. And even a strong system can be broken when users don't follow the rules or when accidents happen. All this doesn't apply if you want to make a public site for all users, but that severely limits the features you'll be able to implement.
I'd say that web sites simply change the nature of the security challenge from the challenges of client side code. The developer does not need to be as worried about code replication, but the developer does need to be aware of the risks that come from centralizing data and access to a server (or collection of servers). It's just a different sort of problem.
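To make the "dynamic data = dynamic risk" item concrete, here is a minimal sketch of the usual defence against cross-site scripting: escaping user input before it is placed into HTML. It uses html.escape from Python's standard library; the render_comment helper is hypothetical, and a real site would normally rely on its template engine's auto-escaping.

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap a user-supplied comment in a paragraph, escaping HTML metacharacters."""
    # html.escape converts <, >, & and quotes into entities, so the browser
    # displays the input as text instead of interpreting it as markup.
    return "<p>" + html.escape(user_input) + "</p>"


print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```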
Websites suffer greatly from injection and cross site scripting attacks
Cross-site scripting carried out on websites were roughly 80% of all documented security vulnerabilities as of 2007
Also, part of a website (in some web sites a great deal) is sent to the client in the form of CSS, HTML and JavaScript, which is then open for inspection by anyone.
Not to nitpick, but your definition of "good hosting" does not assume the HTTP service running on the host is completely free from exploits.
Popular web servers such as IIS and Apache are often patched in order to protect against such exploits, which are often discovered the same way exploits in local executables are discovered.
For example, a malformed HTTP request could cause a buffer overrun on the server, leading to part of its data being executed.
It's not possible to make anything 100% secure.
All that can be done is to make something hard enough to break into, that the time and effort spent doing so makes it not worth doing.
Can I crack your site? Sure, I'll just hire a few suicide bombers to blow up your servers. Or... I'll blow up those power plants that power up your site, or I do some sort of social engineering, and DDOS attacks would quite likely be effective in a large scale not to mention atom bombs...
Short answer: yes.
This might be the wrong website to discuss that. However, it is widely known that security and usability are inversely related. See this post by Bruce Schneier for example (which refers to another website, but on Schneier's blog there's a lot of interesting readings on the issue).
Assuming the server itself isn't compromised, and has no other clients sharing it, static code should be fine. Things usually only start to get funky when there's some sort of scripting language involved. After all, I've never seen a compromised "It Works!" page.
Saying 'completely secure' is a bad thing as it will state two things:
there has not been a proper threat analysis, because secure enough would be the 'correct' term
since security is always a tradeoff, it means that a system that is completely secure will have abysmal usability, and the site will be a huge resource hog because security has been taken to insane levels.
So instead of trying to achieve "complete security" you should:
Do a proper threat analysis
Test your application (or have someone professional test it) against common attacks
Apply best practices, not extreme measures
The short of it is that you have to strike a balance between ease of use and security, much of the time, and decide what provides the optimal level of both for your purposes.
An excellent case in point is passwords. The easy way to go about it is to just have one, use it everywhere, and make it something easy to remember. The secure way to go about it is to have a randomly generated variable-length sequence of characters across the encoding spectrum that only the user himself knows.
Naturally, if you go too far on the easy side, the user's data is easy to pick off. If you go too far on the side of security, however, practical application could end up leading to situations that compromise the added value of the security measures (e.g. people can't remember their whole keychain of passwords and corresponding user names, and therefore write them all down somewhere; if the list is compromised, the security measures that had been put into place are for naught). Hence, most of the time a balance gets struck, and places ask that you put a number in your password and tell you not to do anything stupid like tell it to other people.
Even if you remove the possibility of a malicious person with the keys to everything leaking data from the equation, human stupidity is infinite. There is no such thing as 100% security.
May we say that website can be completely safe from attacks of hackers (we assume that hosting is not vulnerable)?
Well if we're going to start putting constraints on the attacker, then of course we can design a completely secure system: we just have to bar all of the attacker's attacks from the scenario.
If we assume the attacker actually wants to get in (and isn't bound by the rules of your engagement), then the answer is simply no, you can't be completely safe from attacks.
Yes, it's possible for a website to be completely secure, for a reasonable definition of 'complete' that includes your original premise that the hosting is not vulnerable. The problem is the same as with any software that contains defects; people create software of a complexity that is slightly beyond their capability to manage and thus flaws remain undetected until it's too late.
You could start smaller and prove all your work correct and safe as you construct it, remaking any off-the-shelf components that haven't been designed to that stringent degree of quality, but unfortunately that leaves you at a massive commercial disadvantage compared to the people who can write 99% safe software in 1% of the time. Therefore there's rarely a good business reason for going down this path.
The answer to this question lies close to the ideas about computational theory that arise from considering the halting problem. http://en.wikipedia.org/wiki/Halting_problem To wit, if you could say with clarity that you'd devised a way to programmatically determine whether any particular program was secure, you might be close to disproving the undecidability of the halting problem on the class of machines you were working with. Since the undecidability of the halting problem has been proven, we know that over Turing machines you would be unable to prove security in general, since the problem of security reduces to the halting problem. Even for finite machines, where you might be able to decide all of the states of the program, Minsky would tell us that the time required to build a complete state tree for even simplistic modern-day machines and web servers would be huge. You might know a lot about a specific piece of code, but as soon as you changed or updated the code, a complete retest would be required. Fundamentally this is interesting because it all boils down to the concept of information and meaning. Read about automated theorem proving to understand more about the limits of computational systems. http://en.wikipedia.org/wiki/Automated_theorem_proving
The fact is hackers are always one step ahead of developers; you can never consider a site to be bulletproof and 100% safe. You just avoid malicious stuff as much as you can!
In fact, you should follow a whitelist approach rather than a blacklist approach when it comes to security.
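As a rough illustration of the whitelist approach: rather than trying to list every bad input, accept only values known to be good and reject everything else. The field names, the username pattern, and the helper functions below are assumptions made for the example.

```python
import re

# Hypothetical allowlists: only these values or shapes are ever accepted.
ALLOWED_SORT_FIELDS = {"name", "date", "price"}
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_sort_field(value: str) -> str:
    """Return the field if it is on the allowlist, otherwise refuse it."""
    if value not in ALLOWED_SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {value!r}")
    return value


def is_valid_username(value: str) -> bool:
    """Accept only 3 to 20 letters, digits, or underscores; nothing else."""
    return USERNAME_PATTERN.fullmatch(value) is not None
```

A blacklist would have to anticipate every dangerous string in advance; a whitelist fails closed, so an input nobody thought of is rejected by default.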