I know the concept of building a simple P2P network without any server. My problem is with securing the network. The network should have some administrative nodes. So there are two kinds of nodes:
Nodes with privileges
Nodes without privileges
The first question is: Can I assign some nodes more rights than others, like the privileges to send a broadcast message?
How can I secure the network against modified nodes that are trying to get privileges?
I'm really interested in answers and resources that can help me. It is important to me to understand this, and I'm happy to add further information if anything is unclear.
You seem lost, and I used to do research in this area, so I'll take a shot. I feel this question is borderline off-topic, but I tend to err toward leaving things open.
See the P2P networks Chord, CAN, Tapestry, and Pastry for examples of P2P networks as well as pseudo-code. These works are all based on distributed hash tables (DHTs) and have been around for over 10 years now. Many of them have open source implementations you can use.
As for "privileged nodes", your question contradicts itself. You want a P2P network, but you also want nodes with more rights than others. By definition, your network is no longer P2P because peers are no longer equally privileged.
Your question points to trust within P2P networks - a problem that academics have focused on since the introduction of DHTs. I feel that no satisfactory answer has been found yet that solves all problems in all cases. Here are a few approaches which will help you:
(1) Bitcoin addresses malicious users by forcing all users within its network to perform computationally intensive work. For any member to forge bitcoins, they would need more computational power than everyone else, to prove they had done more work than everyone else.
(2) Give privileges based on reputation. You can calculate reputation in any number of ways. One simple example - for each transaction in your system (file sent, database look up, piece of work done), the requester sends a signed acknowledgement (using private/public keys) to the sender. Each peer can then present the accumulation of their signed acknowledgements to any other peer. Any peer who has accumulated N acknowledgements (you determine N) has more privileges.
(3) Own a central server that hands out privileges. This one is the simplest and you get to determine what trust means for you. You're handing it out.
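To make approach (1) a bit more concrete, here is a minimal hashcash-style sketch of proof of work. It is not Bitcoin's actual protocol; the function names, the 8-byte nonce encoding and the difficulty of 12 bits are all assumptions made for illustration. The point is the asymmetry: finding a nonce takes many hash attempts, while checking one takes a single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    as_int = int.from_bytes(digest, "big")
    return len(digest) * 8 - as_int.bit_length()


def is_valid(message: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check any peer can run: one hash, then compare to the difficulty."""
    digest = hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def mine(message: bytes, difficulty_bits: int) -> int:
    """Expensive search: try nonces until the hash meets the difficulty."""
    for nonce in count():
        if is_valid(message, nonce, difficulty_bits):
            return nonce


# A peer attaches the nonce to its message; receivers only call is_valid().
nonce = mine(b"broadcast: hello peers", difficulty_bits=12)
print(nonce, is_valid(b"broadcast: hello peers", nonce, 12))
```

Each extra bit of difficulty roughly doubles the expected work for the sender, while the receiver's check stays at one hash.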
That's the skinny version - good luck.
I'm guessing that the administrative nodes are different from normal nodes by being able to tell other nodes what to do (and the regular nodes should obey).
You have to give the admin nodes some kind of way to prove themselves that can be verified by other nodes but not forged by them (like a policeman's ID). The most standard way I can think of is by using TLS certificates.
In (very) short, you create pairs of files called key and certificate. The key is secret and belongs to one identity, and the certificate is public.
You create a CA certificate, and distribute it to all of your nodes.
Using that CA, you create "administrative node" certificates, one for each administrative node.
When issuing a command, an administrative node presents its certificate to the "regular" node. The regular node, using the CA certificate you provided beforehand, can make sure the administrative node is genuine (because the certificate was actually signed by the CA), and it's OK to do as it asks.
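The verification step above could look roughly like this with Python's standard ssl module. This is only a minimal sketch: the file names in the comments (ca.pem, node.pem, node.key) are hypothetical, and it assumes the CA signs certificates for administrative nodes only, so any peer that passes verification is treated as an administrator.

```python
import ssl


def make_regular_node_context(ca_file=None, cert_file=None, key_file=None):
    """TLS context for a regular node that accepts commands over TLS.

    The connecting peer must present a certificate signed by the CA in
    ca_file, otherwise the TLS handshake fails.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # demand a certificate from the peer
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # trust only our own CA
    if cert_file:
        # The node's own certificate and key, needed to act as a TLS server.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


# Hypothetical usage on a regular node:
#   ctx = make_regular_node_context("ca.pem", "node.pem", "node.key")
#   tls_conn = ctx.wrap_socket(accepted_socket, server_side=True)
#   # Reaching this point means the peer's certificate chained to our CA.
```

If regular nodes also receive certificates from the same CA, the node would additionally have to inspect a field of the peer certificate to tell administrators apart.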
Pros:
TLS/SSL is used by many other products to create a secure tunnel, preventing "man in the middle" attacks and impersonations
There are ready-to-use libraries and sample projects for TLS/SSL in practically every language, from .net to C.
There are revocation lists, to "cancel" certificates that have been stolen (although you'll have to find a way to distribute these)
Certificate verification is offline - a node needs no external resources (except for the CA certificate) for verification
Cons:
Since SSL/TLS is a widely-used system, there are many tools to exploit misconfigured / old clients / servers
There are some exploits found in such libraries (e.g. "heartbleed"), so you might need to patch your software a lot.
This solution still requires some serious coding, but it's usually better to rely on an existing and proven system than to go around inventing your own.
I can appreciate that seeing "basic auth" and "safe enough" in the same sentence is a lot like reading "Is parachuting without a parachute still safe?", so I'll do my best to clarify what I am getting at.
From what I've seen online, people typically describe basic HTTP auth as being unsecured due to the credentials being passed in plain text from the client to the server; this leaves you open to having your credentials sniffed by a nefarious person or man-in-the-middle in a network configuration where your traffic may be passing through an untrusted point of access (e.g. an open AP at a coffee shop).
To keep the conversation between you and the server secure, the solution is to typically use an SSL-based connection, where your credentials might be sent in plain text, but the communication channel between you and the server is itself secured.
So, onto my question...
In the situation of replicating one CouchDB instance from an EC2 instance in one region (e.g. us-west) to another CouchDB instance in another region (e.g. singapore) the network traffic will be traveling across a path of what I would consider "trusted" backbone servers.
Given that (assuming I am not replicating highly sensitive data) would anyone/everyone consider basic HTTP auth for CouchDB replication sufficiently secure?
If not, please clarify what scenarios I am missing here that would make this setup unacceptable. I do understand for sensitive data this is not appropriate, I just want to better understand the ins and outs for non-sensitive data replicated over a relatively-trusted network.
Bob is right, it is better to err on the side of caution, but I disagree. Bob could be right in this case (see details below), but the problem with his general approach is that it ignores the cost of paranoia. It leaves "peace dividend" money on the table. I prefer Bruce Schneier's assessment that it is a trade-off.
Short answer
Start replicating now! Do not worry about HTTPS.
The greatest risk is not wire sniffing, but your own human error, followed by software bugs, which could destroy or corrupt your data. Make a replica! If you will replicate regularly, plan to move to HTTPS or something equivalent (SSH tunnel, stunnel, VPN).
Rationale
Is HTTPS easy with CouchDB 1.1? It is as easy as HTTPS can possibly be, or in other words, no, it is not easy.
You have to make an SSL key pair, purchase a certificate or run your own certificate authority—you're not foolish enough to self-sign, of course! The user's hashed password is plainly visible from your remote couch! To protect against cracking, will you implement bi-directional SSL authentication? Does CouchDB support that? Maybe you need a VPN instead? What about the security of your key files? Don't check them into Subversion! And don't bundle them into your EC2 AMI! That defeats the purpose. You have to keep them separate and safe. When you deploy or restore from backup, copy them manually. Also, password-protect them so if somebody gets the files, they can't steal (or worse, modify!) your data. When you start CouchDB or replicate, you must manually input the password before replication will work.
In a nutshell, every security decision has a cost.
A similar question is, "should I lock my house at night?" It depends. Your profile says you are in Tucson, so you know that some neighborhoods are safe, while others are not. Yes, it is always safer to always lock all of your doors all of the time. But what is the cost to your time and mental health? The analogy breaks down a bit because time invested in worst-case security preparedness is much greater than twisting a bolt lock.
Amazon EC2 is a moderately safe neighborhood. The major risks are opportunistic, broad-spectrum scans for common errors. Basically, organized crime is scanning for common SSH accounts and web apps like Wordpress, so they can steal a credit card or other database.
You are a small fish in a gigantic ocean. Nobody cares about you specifically. Unless you are specifically targeted by a government or organized crime, or somebody with resources and motivation (hey, it's CouchDB—that happens!), then it's inefficient to worry about the boogeyman. Your adversaries are casting broad nets to get the biggest catch. Nobody is trying to spear-fish you.
I look at it like high-school integral calculus: measuring the area under the curve. Time goes to the right (x-axis). Risky behavior goes up (y-axis). When you do something risky you save time and effort, but the graph spikes upward. When you do something the safe way, it costs time and effort, but the graph moves down. Your goal is to minimize the long-term area under the curve, but each decision is case-by-case. Every day, most Americans ride in automobiles: the single most risky behavior in American life. We intuitively understand the risk-benefit trade-off. Activity on the Internet is the same.
As you imply, basic authentication without transport layer security is 100% insecure. Anyone on EC2 that can sniff your packets can see your password. Assuming that no one can is a mistake.
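To see why a sniffed request gives the password away, note that the Basic scheme only base64-encodes "user:password"; it does not encrypt anything. A minimal sketch (the credentials and function names here are made up for illustration):

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the Authorization header value a client sends with Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def sniff_credentials(header_value: str):
    """What an eavesdropper does: decode the header, no key needed."""
    _scheme, _, token = header_value.partition(" ")
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


header = basic_auth_header("admin", "s3cret")
print(header)                     # Basic YWRtaW46czNjcmV0
print(sniff_credentials(header))  # ('admin', 's3cret')
```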
In CouchDB 1.1, you can enable native SSL. In earlier versions, use stunnel. Adding SSL/TLS protection is so simple that there's really no excuse not to.
I just found this statement from Amazon which may help anyone trying to understand the risk of packet sniffing on EC2.
Packet sniffing by other tenants: It is not possible for a virtual instance running in promiscuous mode to receive or "sniff" traffic that is intended for a different virtual instance. While customers can place their interfaces into promiscuous mode, the hypervisor will not deliver any traffic to them that is not addressed to them. This includes two virtual instances that are owned by the same customer, even if they are located on the same physical host. Attacks such as ARP cache poisoning do not work within EC2. While Amazon EC2 does provide ample protection against one customer inadvertently or maliciously attempting to view another's data, as a standard practice customers should encrypt sensitive traffic.
http://aws.amazon.com/articles/1697
While there are many social networks in the wild, most rely on data stored on a central site owned by a third party.
I'd like to build a solution where data remains local on members' systems. Think of the project as an address book, which automagically updates a contact's data as soon as the contact changes its coordinates. This base idea might get extended later on...
Updates will be transferred using public/private key cryptography using a central host. The sole role of the host is to be a store and forward intermediate. Private keys remain private on each member's system.
If two clients are both online and a p2p connection could be established, the clients could transfer data telegrams without the central host.
Thus, sender and receiver will be the only parties which are able to create authentic messages.
Questions:
Are there certain protocols which I should adopt?
Are there any security concerns I should keep in mind?
Are there certain services which should be integrated or used somehow?
More technically:
Use e.g. Amazon or Google provided services?
Or better use a raw web-server? If yes: Why?
Which algorithm and key length should be used?
UPDATE-1
I googled my own question title and found this academic project developed 2008/09: http://www.lifesocial.org/.
The solution you are describing sounds remarkably like email, with encrypted messages as the payload, and an application rather than a human being creating the messages.
It doesn't really sound like "p2p" - in most P2P protocols, the only requirement for central servers is discovery - you're using store & forward.
As a quick proof of concept, I'd set up an email server, and build an application that sends emails to addresses registered on that server, encrypted using PGP - the tooling and libraries are available, so you should be able to get that up and running in days, rather than weeks. In my experience, building a throw-away PoC for this kind of question is a great way of sifting out the nugget of my idea.
The second issue is that the nature of a social network is that it's a network. Your design may require you to store more than the data of the two direct contacts - you may also have to store their friends, or at least the public interactions those friends have had.
This may not be part of your plan, but if it is, you need to think it through early on - you may end up having to transmit the entire social graph to each participant for local storage, which creates a scalability problem....
The paper about Safebook might be interesting for you.
Also you could take a look at other distributed OSN and see what they are doing.
None of the federated networks mentioned on http://en.wikipedia.org/wiki/Distributed_social_network is actually distributed. What Stefan intends to do is indeed new and was only explored by some proprietary folks.
I've been thinking about the same concept for the last two years. I've finally decided to give it a try using Python.
I've spent the better part of last night and this morning writing a sockets communication script & server. I also plan to remove the central server from the equation as it's just plain cumbersome and there's no point to it when all the members could keep copies of their friends' keys.
Each profile could be accessed via a hashed string of someone's public key. My social network relies on nodes and pods. Pods are computers which have their ports open to the network. They help with relaying traffic as most firewalls block incoming socket requests. Nodes store information and share it with other nodes. Each node will get a directory of active pods which may be used to relay their traffic.
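As a rough illustration of addressing a profile by a hash of its public key: the sketch below assumes SHA-256 and assumes the key is available as serialized bytes (for example a PEM file's contents). Both are my assumptions, not details of the design described above.

```python
import hashlib


def profile_id(public_key_bytes: bytes) -> str:
    """Derive a stable profile identifier from a serialized public key."""
    return hashlib.sha256(public_key_bytes).hexdigest()


# Placeholder bytes stand in for a real serialized public key.
alice_key = b"-----BEGIN PUBLIC KEY-----\n...placeholder...\n-----END PUBLIC KEY-----\n"
print(profile_id(alice_key))
```

Anyone holding the public key can recompute the identifier, so a node can check that the profile it fetched really belongs to the key it expected.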
The PeerSoN project looks like something you might be interested in: http://www.peerson.net/index.shtml
They have done a lot of research and the papers are available on their site.
Some thoughts about it:
protocols to use: you could look closely at existing P2P programs and their design
security concerns: privacy. Take a great care to not open doors: a whole system can get compromised 'cause you have opened some door.
services: you could integrate with the regular social networks through their APIs
People will have to install a program on their computers and remember to open it every time, like any P2P client. Leaving everything on a web server has a smaller footprint and requires less user action.
Somehow you'll need a centralized server to manage the searches. You can't just broadcast to the whole internet to find friends. Or you'll have to rely upon email requests to add someone, and to do that you'll need to know the email address in advance.
The fewer friends /contacts use your program, the fewer ones will want to use it, since it won't have contact information available.
I see that your server will be a store and forward, so the update problem is solved.
I consider myself to be quite a good programmer but I know very little about server administration. I'm sorry if these questions are noobish but I would really appreciate some advice or links on steps I can take to make this more secure.
I've completed a project for a client that involves storing some very sensitive information, ie personal details of big donors. From a programming perspective it's protected using user authentication.
I don't mind spending some money if it means the info will be more secure, what other steps should I take?
Can the database be encrypted somehow so that even if the server is compromised people can't just dump the MySQL database and have everything?
Is it worth purchasing an ssl certificate?
The site is currently hosted on a personal hosting plan with a reasonably trustworthy host. Would a virtual private server be more secure? Are there special hosts I can use that take additional steps to protect info (ie would it be more secure on amazon s3)?
As a side note to the specific question, I would recommend reading some books on computer/programming security. Some good ones are 19 Deadly Sins of Software Security and Writing Solid Code.
You don’t need to encrypt the database itself, just encrypt the data before storing it. (Make sure to use real, cryptographically-secure algorithms instead of making one up yourself.)
Using SSL is definitely an important step if you want to avoid MITM attacks or snooping. A certificate allows you to use SSL without having to take extra steps like installing a self-signed one on each of the client systems (not to mention other benefits like revocation of compromised certs and such).
It depends on just how sensitive the information is and how bad leakage would be. You may want to read some reviews of hosts to get an idea of how good the host is. (If possible, sort the reviews ascending by rating and look at the bad reviews to see if they are objective problems that could apply to you and/or have to do with security, or if they are just incidental or specific issues to that reviewer.) As for the “cloud”, you would kind of be taking a chance since real-world security and privacy of it has yet to be determined. Obviously, if you do go with it, you’ll want a notable, trustworthy host like Amazon or Microsoft since they have benefits like accountability and work constantly and quickly to fix any problems.
HTH
Assume we have a server S and a few clients (C). Whenever a client updates the server, an internal database on the server is updated and replicated to the other clients. This is all done using sockets in an intranet environment.
I believe that an attacker can fairly easily sniff this plain text traffic. My colleagues believe I am overly paranoid because we are behind a firewall.
Am I being overly paranoid? Do you know of any exploit (link please) that took advantage of a situation such as this, and what can be done differently? Clients were rewritten in Java but the server is still using C++.
Can anything in code protect against an attack?
Inside your company's firewall, you're fairly safe from direct hack attacks from the outside. However, statistics that I won't trouble to dig out claim that most of the damage to a business' data is done from the INside. Most of that is simple accident, but there are various reasons for employees to be disgruntled and not found out; and if your data is sensitive they could hurt your company this way.
There are also boatloads of laws about how to handle personal ID data. If the data you're processing is of that sort, treating it carelessly within your company could also open your company up to litigation.
The solution is to use SSL connections. You want to use a pre-packaged library for this. You provide private/public keys for both ends and keep the private keys well hidden with the usual file access privileges, and the problem of sniffing is mostly taken care of.
SSL provides both encryption and authentication. Java has it built in and OpenSSL is a commonly used library for C/C++.
Your colleagues are naïve.
One high-profile attack occurred at Heartland Payment Systems, a credit card processor that one would expect to be extremely careful about security. Assuming that internal communications behind their firewall were safe, they failed to use something like SSL to ensure their privacy. Hackers were able to eavesdrop on that traffic, and extract sensitive data from the system.
Here is another story with a little more description of the attack itself:
Described by Baldwin as "quite a sophisticated attack," he says it has been challenging to discover exactly how it happened. The forensic teams found that hackers "were grabbing numbers with sniffer malware as it went over our processing platform," Baldwin says. "Unfortunately, we are confident that card holder names and numbers were exposed." Data, including card transactions sent over Heartland's internal processing platform, is sent unencrypted, he explains, "As the transaction is being processed, it has to be in unencrypted form to get the authorization request out."
You can do many things to prevent a man in the middle attack. For most internal data, within a firewall/IDS protected network you really don't need to secure it. However, if you do wish to protect the data you can do the following:
Use PGP encryption to sign and encrypt messages
Encrypt sensitive messages
Use hash functions to verify that the message sent has not been modified.
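For the last point, a plain hash on its own can be recomputed by whoever modifies the message, so the sketch below uses a keyed hash (HMAC) instead. That is my substitution, and it assumes both ends already share a secret key; the key and function names are hypothetical.

```python
import hashlib
import hmac


def tag_message(key: bytes, message: bytes) -> bytes:
    """Sender side: compute an HMAC-SHA256 tag to send along with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag_message(key, message), tag)


shared_key = b"hypothetical shared secret"
msg = b"UPDATE accounts SET balance = 100 WHERE id = 7"
tag = tag_message(shared_key, msg)

print(verify_message(shared_key, msg, tag))                          # True
print(verify_message(shared_key, msg.replace(b"100", b"900"), tag))  # False
```

This only detects tampering; it does not hide the message contents, so it complements encryption rather than replacing it.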
It is a good standard operating procedure to secure all data; however, securing data has very large costs. With secure channels you need to have a certificate authority, and allow for extra processing on all machines that are involved in communication.
You're being paranoid. You're talking about data moving across an, ideally, secured internal network.
Can information be sniffed? Yea, it can. But it's sniffed by someone who has already breached network security and got within the firewall. That can be done in innumerable ways.
Basically, for the VAST majority of businesses, no reason to encrypt internal traffic. There are almost always far far easier ways of getting information, from inside the company, without even approaching "sniffing" the network. Most such "attacks" are from people who are simply authorized to see the data in the first place, and already have a credential.
The solution is not to encrypt all of your traffic, the solution is to monitor and limit access, so that if any data is compromised, it is easier to detect who did it, and what they had access to.
Finally, consider, the sys admins, and DBAs pretty much have carte blanche to the entire system anyway, as inevitably, someone always needs to have that kind of access. It's simply not practical to encrypt everything to keep it away from prying eyes.
Finally, you're making a big ado about something that is just as likely written on a sticky tacked on the bottom of someone's monitor anyway.
Do you have passwords on your databases? I certainly hope the answer to that is yes. Nobody would believe that password protecting a database is overly paranoid. Why wouldn't you have at least the same level of security* on the same data flowing over your network. Just like an unprotected DB, unprotected data flow over the network is vulnerable not only to sniffing but is also mutable by a malicious attacker. That is how I would frame the discussion.
*By same level of security I mean use SSL as some have suggested, or simply encrypt the data using one of the many available encryption libraries around if you must use raw sockets.
Just about every "important" application I've used relies on SSL or some other encryption methodology.
Just because you're on the intranet doesn't mean you may not have malicious code running on some server or client that may be trying to sniff traffic.
The minimum an attacker needs is access to a device inside your network from which he can sniff either the entire traffic or the traffic between a client and a server.
Anyway, if the attacker is already inside, sniffing should be just one of the problems you'll have to take into consideration.
There are not many companies I know of which use secure sockets between clients and servers inside an intranet, mostly because of the higher costs and lower performance.
There are a few questions (C#, Java) that cover how one might implement automatic updates. It appears initially easy to provide automatic updates, and there are seemingly no good reasons not to provide automatic updates for most software.
However, none appear to cover the security aspects of automatic updates.
How safe are automatic updates now?
How safe should they be?
How safe can they be?
My main issue is that the internet is, for all intents and purposes, a wild west where one cannot assume anything about any data they receive. Automatic updates over the internet appear inherently risky.
A company computer gets infected, spoofs the DNS (only a small percentage of which win), and makes the other company computers believe that the update server for a common application is elsewhere, they download the 'new' application and become infected.
As a developer, what possible attacks are there, and what steps should I take to protect my customers from abuse?
-Adam
With proper use of cryptography your updates can be very safe. Protect the site you distribute your updates from with SSL. Sign all your updates with GPG/PGP or something else, and make your clients verify the signature before applying the update. Take steps to make sure your server and keys are kept extremely secure.
Adequate is very subjective. What is adequate for an internet game may be completely inadequate for the security system for our nuclear missiles. You have to decide how much potential damage could occur if someone managed to break your security.
The most obvious attack would be an attacker supplying changed binaries through his "evil" update server. So you should ensure that the downloaded data can be verified to originate from you, using a digital signature.
To ensure security, obviously you should avoid distributing the private key used for the signature. Therefore, you could implement some variation of RSA message signing.
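The idea behind RSA signing can be shown with a deliberately tiny textbook example. This is a toy, not something to ship: the key is built from two small known primes, there is no padding scheme, and all names are made up for illustration. In a real updater you would use a vetted crypto library or GPG and a properly generated key.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Far too small for real use.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q   # public modulus, shipped inside the client
E = 65537   # public exponent, shipped inside the client
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, never leaves the vendor


def digest_as_int(data: bytes) -> int:
    """Hash the update and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign_update(data: bytes) -> int:
    """Vendor side: needs the private exponent D."""
    return pow(digest_as_int(data), D, N)


def verify_update(data: bytes, signature: int) -> bool:
    """Client side: needs only the public values N and E."""
    return pow(signature, E, N) == digest_as_int(data)


update = b"contents of app-2.0.bin"
sig = sign_update(update)
print(verify_update(update, sig))                 # True
print(verify_update(update + b" + malware", sig)) # False
```

The client can check an update without being able to produce a valid signature itself, which is why only the public half needs to be distributed.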
Connecting to your update server via SSL can be sufficient, provided your client will refuse to connect if they get an invalid certificate and your server requires negotiating a reasonable level of connection security (and the client also supports that).
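A client-side sketch of those two conditions using Python's standard library follows. The function names are mine, and TLS 1.2 as the minimum version is an assumption about what counts as a reasonable level of connection security.

```python
import ssl
import urllib.request


def make_update_context() -> ssl.SSLContext:
    """TLS settings for the updater: verify the server, refuse old protocols."""
    ctx = ssl.create_default_context()  # checks certificate chain and hostname
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def fetch_update(url: str) -> bytes:
    """Download an update, failing instead of falling back to an insecure path."""
    if not url.startswith("https://"):
        raise ValueError("updates must be fetched over https")
    # urlopen raises an error if the certificate does not verify.
    with urllib.request.urlopen(url, context=make_update_context()) as resp:
        return resp.read()
```

The important part is that a verification failure is treated as a hard error; an updater that lets the user click through a certificate warning loses most of the protection.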
However realistically almost anything you do is going to be at least as secure as the route via which your users get the first install of your software anyhow. If your users initially download your installer via plain http, it is too late to start securing things on the updates.
This is also true to some extent even if they get your initial software via https or digitally signed - as most users can easily be persuaded to click OK on almost any security warning they see on that.
there are seemingly no good reasons not to provide automatic updates for most software.
There are good reasons not to force an update.
bug fixes may break code
users may not want to risk breaking production systems that rely on older features