Should I put my Blockchain behind an API? - hyperledger-fabric

I am using a Hyperledger Fabric Blockchain.
My blockchain is "private", which means only people we accept can participate.
Right now, the blockchain is open to the Internet. Is it safe?
Should I put my Blockchain behind an API that would be in charge of Read / Write operations?
From the "data certification" point of view, the fewer steps between the data and the blockchain, the better.
Does it make sense from a security point of view?

I don't get what you mean by the blockchain being open to the internet.
If you are referring to the data structure of blocks forming a chain, it's just some files stored in the peer component. If you want to discuss it being open, I'd rather frame it as the peer component being open to the internet.
If you manage to protect the peer component from illegal access, the only legal access Fabric provides is through a correctly authorized certificate, so from that point on, I wouldn't worry that much.
Putting API middleware in front of the blockchain, however, will provide an easier interface for other users. From a realistic implementation point of view, such an interface will be required at some point, so it should be protected against other security threats.

What you may want to do is create a "standard" API for your network. This would be an app which can request and store crypto material (keys and certificates) from the CA and which provides some sort of authentication mechanism, allowing authorized users to use that crypto to make requests to the network.
When you begin onboarding users to your network, you can give them the option of using your standard API or making their own. This gives them the convenience of getting started quickly with a pre-built solution, plus the freedom to build their own interface into the network if they would like to.
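To make that concrete, here is a minimal sketch of what the core of such a "standard" API could look like, using the Node.js fabric-ca-client and fabric-network SDKs. The CA URL, enrollment secret, wallet path, channel, chaincode and function names are all placeholders; a real app would also put its own user authentication (sessions, JWTs, etc.) in front of this:

```
// Hypothetical "standard API" core: enroll with the CA once, store the crypto
// material in a wallet, then submit requests on behalf of authorized users.
const FabricCAServices = require('fabric-ca-client');
const { Wallets, Gateway } = require('fabric-network');
const fs = require('fs');

async function main() {
  // 1. Request and store crypto material from the CA (placeholder URL/secret).
  const ca = new FabricCAServices('https://ca.example.com:7054');
  const enrollment = await ca.enroll({ enrollmentID: 'appUser', enrollmentSecret: 'appUserpw' });
  const wallet = await Wallets.newFileSystemWallet('./wallet');
  await wallet.put('appUser', {
    credentials: { certificate: enrollment.certificate, privateKey: enrollment.key.toBytes() },
    mspId: 'Org1MSP',
    type: 'X.509',
  });

  // 2. Use that identity to read/write on behalf of an authenticated caller.
  const ccp = JSON.parse(fs.readFileSync('./connection-profile.json', 'utf8'));
  const gateway = new Gateway();
  await gateway.connect(ccp, { wallet, identity: 'appUser', discovery: { enabled: true } });
  const contract = (await gateway.getNetwork('mychannel')).getContract('mychaincode');
  console.log((await contract.evaluateTransaction('queryAll')).toString());
  gateway.disconnect();
}

main().catch(console.error);
```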

Right now, the blockchain is open to the Internet. Is it safe?
First of all, blockchain is secure because of its design. A blockchain stores data using rules that are extremely difficult for attackers to manipulate, and as the number of blocks grows, it becomes harder for an attacker to manipulate older blocks.
However, even with a super secure blockchain system, we can't control the security of each user account or third-party system.
That's why most security breaches in public blockchains happen in third-party systems or because of human error (not inside the blockchain itself).
Should I put my Blockchain behind an API that would be in charge of Read / Write operations?
An API can be seen as a "bridge" between the blockchain and us. Of course, we need this bridge to read / write transactions to the blockchain. It doesn't matter what bridge you are using, as long as you can ensure the security of your bridge design.
From the "data certification" point of view, less steps between data
and blockchain, the better. Does it make sense in a security point of
view ?
One of the important aspects of a private blockchain is limiting who can access it. By limiting user interaction, we reduce the attack surface and therefore the risk of a security breach.

Related

Need some ideas to achieve data marketplace through hyperledger

I am trying to create a data marketplace where a party can transact with other parties, agree on a set of terms and sell data from one to another.
Here data security is of utmost concern. A party makes data available on the hyperledger; this data should be secured, and no one else should get hold of it. If an interested party wants this data, they have to transact with the data owner and agree on a set of terms. Only then will the interested party get the data. From then on, only those two parties should hold the data; everyone else should be unable to get hold of it.
I would like to know what components of hyperledger can be used here. I am aware of the private data concept in hyperledger, but I am not sure how and where it would fit.
Would love to hear some comments from experts regarding this.
Edit:
I am thinking of using private data to share data securely whenever two parties agree on a transaction. For that, I am considering upgrading the chaincode every time two parties agree on a set of terms for data sharing. The only thing that concerns me is that every endorsing peer would need to install and upgrade the new version of the chaincode simultaneously, which could be undesirable, because data exchanges on this platform could be very frequent.
You can refer to this and get some ideas:
When to use a collection within a channel vs. a separate channel
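On the chaincode-upgrade concern specifically: you shouldn't need a new chaincode version per deal. A common pattern, sketched below with made-up collection and function names, is to declare a private data collection per buyer/seller pairing once in the collection configuration, and pass the actual data through the transient field so it is distributed only to authorized peers and never written to the public ledger:

```
// Minimal fabric-contract-api sketch: store and read deal data in a private
// data collection shared only by the two transacting organizations.
'use strict';
const { Contract } = require('fabric-contract-api');

class DataMarketContract extends Contract {
  // Seller invokes this with the payload in the transient map, so the data is
  // sent only to collection members and never reaches the ordering service.
  async storeDealData(ctx, dealId) {
    const transient = ctx.stub.getTransient(); // Map of Buffers
    const payload = transient.get('payload');
    if (!payload) {
      throw new Error('payload must be provided in the transient map');
    }
    await ctx.stub.putPrivateData('collectionBuyerSeller', dealId, payload);
  }

  // Only peers belonging to the collection can read the data back.
  async readDealData(ctx, dealId) {
    const data = await ctx.stub.getPrivateData('collectionBuyerSeller', dealId);
    return data.toString();
  }
}

module.exports.contracts = [DataMarketContract];
```

The collection named here would be declared once in the chaincode's collection configuration with a policy along the lines of OR('BuyerMSP.member', 'SellerMSP.member'); on Fabric 2.x you could also look at implicit per-organization collections, which avoid predefining a collection per pairing.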

node.js api gateway implementation and passport authentication

I am working on implementing a microservices-based application using node.js. While searching for examples of how to implement an api gateway, I came across the following article, which seems to provide an example of implementing one: https://memz.co/api-gateway-microservices-docker-node-js/. Examples of the api gateway pattern in node.js seem to be a little hard to come by so far, but this article seemed like a really good one.
There are a few items that are still unclear and that I am still having trouble finding documentation on.
1) Security is a major item for the app I am developing, and I am having trouble seeing where the authentication should take place. Using passport, should I add the authentication items in the api gateway and pass the jwt token along with the request to the corresponding microservice, since the user's logged-in information is needed for certain activities? The only issue here seems to be that all of the microservices would need passport in order to verify the jwt token and get the user's profile information. Would the microservices be, technically, inaccessible to the outside world except through the api gateway, as this seems to be the aim?
2) How does this scenario change if I need to scale to multiple servers with docker images on each one? How would this affect load balancing? It seems like something would have to sit at a higher level to deal with that.
I can tell you that much depends on your application requirements. Really.
I'm now past 5 years of experience with production microservices, using several languages, on systems ranging from medium to very large scale.
None of them shared the same requirements, and without a deep understanding of what you need and what your business (product) requirements are, it would be hard to know the right answer. That said, I'll try to share some experience to help you get it right.
Ideally you want the security to be encapsulated in an external service, so that you can update and apply new policies faster. You'll also be able to deprecate all existing tokens should you find a breach in your system or if someone on your team inadvertently pushes some secret key (or cert) to an external service.
You could handle authentication in each single service or in an edge network tool (such as the API Gateway). Be careful choosing how to handle it, because each option has its own trade-offs:
Choosing the API Gateway, your services remain lighter and do not need to know anything about the authentication steps, but at some point you'll need to know who the authenticated user is, and you'll need some plain reference to them (a JSON record, a link or an ID to a "user profile" service); a minimal gateway sketch follows these two options. How you do it is up to your requirements, and we could go even deeper into the pros and cons of each possible choice for your case.
Choosing to handle it at the service level requires you (and your teams) to understand the security process in more depth (you can hide it behind a good library), and they'll need support from your security team (which may also be yourself, by the way; the more services implement security, the more things you'll have to think about to avoid adding unnecessary features). The big problem here is that you'll often end up stopping your tasks to think about what would help you out on this particular service, and you'll be tempted to extend your authentication service (and God, unless you really know what you're doing, don't add a single call not needed for authentication purposes).
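As promised, here is a minimal sketch of the first option (auth at the edge), using Express with the jsonwebtoken and http-proxy-middleware packages; the secret, service URLs and the x-user-id header name are all placeholders:

```
// Verify the token once, at the gateway; downstream services stay auth-agnostic.
const express = require('express');
const jwt = require('jsonwebtoken');
const { createProxyMiddleware } = require('http-proxy-middleware');

const app = express();

app.use((req, res, next) => {
  const header = req.headers.authorization || '';
  const token = header.replace(/^Bearer /, '');
  try {
    const claims = jwt.verify(token, process.env.JWT_SECRET);
    // Forward a plain reference to the user instead of the raw token.
    req.headers['x-user-id'] = claims.sub;
    next();
  } catch (err) {
    res.status(401).json({ error: 'invalid or missing token' });
  }
});

// Route to internal services that are not reachable from the outside.
app.use('/users', createProxyMiddleware({ target: 'http://users-service:3001' }));
app.use('/orders', createProxyMiddleware({ target: 'http://orders-service:3002' }));

app.listen(8080);
```

Downstream services can then trust the x-user-id header only because they are unreachable except through the gateway (see the P.S. at the end of this answer).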
One thing is easy to determine: you surely need to think about tokens (jwt, jwe or, again, whatever your requirements impose).
JWT has good benefits, but its payload is only encoded, not encrypted, so anyone holding the token can read it; never put sensitive data in there, or anything you wouldn't publicly share about your user (e.g. an ID is probably fine, while security questions or the resolution of a 2FA challenge would not be). JWE is an encrypted form of the spec. A common opaque token (with no meaning) would require a backend call to get the data, but it works much like cookie sessions and the data never leaves your servers.
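You can see the first point for yourself: a JWT's payload is plain base64url-encoded JSON, so no key is needed to read it. A tiny demonstration with a made-up token (assumes Node 16+ for the 'base64url' encoding):

```
// A signed JWT hides nothing: the payload decodes without any key.
const token = 'eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9.signature-goes-here';
const payload = token.split('.')[1];
console.log(Buffer.from(payload, 'base64url').toString());
// -> {"sub":"42"}   readable by anyone holding the token
```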
You need to define the boundaries of your services yourself, and do yourself a favor: make each service's boundaries clean, defined and standard.
Try to define common policies and standardize interactions. I know it may be easier to add a queue here, a REST endpoint there, an RPC there, but you'll soon end up with a bunch of IPC mechanisms you will not be able to handle anymore, and they will quickly demand your attention.
Also, if your business solution is pretty heavy to build, I don't think it's a good idea to write the API Gateway, security and so on yourself. I'd go with open source, community-supported (or even company-backed, if you have some budget) and production-tested solutions.
By definition microservice architectures are very dynamic; you'll fight to keep them immutable between deployment versions, but unless you're a big firm you cannot afford to keep thousands of servers live. This means you'll discover bugs that only present themselves under certain circumstances you cannot spot in other environments (and often you won't be able to reproduce them).
By choosing to develop the whole stack yourself, you agree to deal with maintenance and bug discovery across your whole stack. So when you try to load a page that has 25 services interacting, you know it may be failing because of a bug in: your API Gateway, your security implementation, your token parser, your user account service, your business services A to N, your database service (if any), your database load balancer (if any), or your database instance.
I know it's tempting to do everything yourself, but try to keep it flat and do only what you need to do. By following this path you'll think about your product, which I think is the most important thing to do right now.
To complete my answer, about the scaling issues:
it doesn't matter. Whichever choice you pick, it will scale seamlessly:
The API Gateway should be able to work against a pool of backends: from that server you should be able to redirect to N backend machines that you can bring live when you need to. You can even have an API to support automatic registration of new instances, or even more simply put the IP of an Elastic Load Balancer, HAProxy or an equivalent behind it, and as you add backends to them it will just work; you have simply moved the multiple-IPs issue from the API Gateway one layer down.
If you handle authentication at the service level (and you have an API Gateway), see #1.
If you handle authentication at the service level (without an API Gateway), then you need to look at some other level in your stack: load balancing (layer 3 or layer 7) or the DNS level. You can use several DNS features to serve different IPs, including advanced features like Anycast if you need latency-based distribution.
I know this answer introduces a lot of other questions, but I really tried to answer yours. The fact is that you need to understand and evaluate a lot of things when planning a microservice architecture, and I'd not write a single SLOC without a well-written plan printed on every wall of my office.
You'll often need to step back from a single service to review the global vision and check that everything is going fine.
I don't want to scare you; rather, I'm trying to make you think so you can succeed.
I just want you to make sure you have correctly evaluated all of the possibilities before deciding to do everything from scratch.
P.S. Should you choose to use an API gateway, be sure to limit services to only accept requests coming through it. On a single machine, just have each service listen on localhost; across multiple machines you'll need some network rules, depending on your operating system.
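On a single machine that can be as simple as binding each internal service to the loopback interface, e.g.:

```
// Internal service reachable by the gateway on the same host, but not from
// the outside network.
const http = require('http');
http.createServer((req, res) => res.end('internal only\n'))
    .listen(3001, '127.0.0.1');
```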
Good Luck!

Private secured P2P Network

I know the concept of building a simple P2P network without any server. My problem is with securing the network. The network should have some administrative nodes, so there are two kinds of nodes:
Nodes with privileges
Nodes without privileges
The first question is: can I assign some nodes more rights than others, such as the privilege to send a broadcast message?
How can I secure the network against modified nodes that are trying to gain privileges?
I'm really interested in answers and resources that can help me. It is important to me to understand this, and I'm happy to add further information if anything is unclear.
You seem lost, and I used to do research in this area, so I'll take a shot. I feel this question is borderline off-topic, but I tend to err toward leaving things open.
See the P2P networks Chord, CAN, Tapestry, and Pastry for examples of P2P networks as well as pseudo-code. These works are all based on distributed hash tables (DHTs) and have been around for over 10 years now. Many of them have open source implementations you can use.
As for "privileged nodes", your question contradicts itself. You want a P2P network, but you also want nodes with more rights than others. By definition, your network is no longer P2P because peers are no longer equally privileged.
Your question points to trust within P2P networks, a problem that academics have focused on since the introduction of DHTs. I feel that no satisfactory answer has been found yet that solves all problems in all cases. Here are a few approaches which will help you:
(1) Bitcoin addresses malicious users by forcing all users within the network to perform computationally intensive work. For any member to forge bitcoins, they would need more computational power than everyone else combined, in order to prove they had done more work than everyone else.
(2) Give privileges based on reputation. You can calculate reputation in any number of ways. One simple example: for each transaction in your system (file sent, database lookup, piece of work done), the requester sends a signed acknowledgement (using private/public keys) to the sender; a toy sketch follows this list. Each peer can then present the accumulation of their signed acknowledgements to any other peer. Any peer who has accumulated N acknowledgements (you determine N) gets more privileges.
(3) Run a central server that hands out privileges. This one is the simplest, and you get to determine what trust means for you, since you're the one handing it out.
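As promised, a toy sketch of option (2) using Node's built-in crypto module with Ed25519 keys; the acknowledgement text is made up, and a real system would also need a way to exchange and pin public keys:

```
// The requester signs an acknowledgement of a completed transaction, which the
// serving peer can later present as reputation.
const crypto = require('crypto');

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

// Requester signs a record of the completed transaction.
const ack = Buffer.from('peer-A delivered file-42 at 2024-01-01T00:00:00Z');
const signature = crypto.sign(null, ack, privateKey);

// Any peer holding the requester's public key can verify it offline.
console.log(crypto.verify(null, ack, publicKey, signature)); // true
```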
That's the skinny version - good luck.
I'm guessing that the administrative nodes differ from normal nodes in being able to tell other nodes what to do (and the regular nodes should obey).
You have to give the admin nodes some way of proving themselves that can be verified by other nodes but not forged by them (like a policeman's ID). The most standard way I can think of is to use TLS certificates.
In (very) short, you create pairs of files called a key and a certificate. The key is secret and belongs to one identity, and the certificate is public.
You create a CA certificate, and distribute it to all of your nodes.
Using that CA, you create "administrative node" certificates, one for each administrative node.
When issuing a command, an administrative node presents its certificate to the "regular" node. The regular node, using the CA certificate you provided beforehand, can make sure the administrative node is genuine (because the certificate was actually signed by the CA), and it's OK to do as it asks.
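A minimal sketch of that check with Node's tls module follows; the file names are placeholders, and the CA plus per-node certificates would be created beforehand (e.g. with openssl):

```
// Regular node: accept a command only from peers whose certificate was signed
// by the CA certificate we distributed beforehand.
const tls = require('tls');
const fs = require('fs');

const server = tls.createServer({
  key: fs.readFileSync('regular-node-key.pem'),
  cert: fs.readFileSync('regular-node-cert.pem'),
  ca: [fs.readFileSync('ca-cert.pem')],   // the shared CA certificate
  requestCert: true,                      // ask the peer for its certificate
  rejectUnauthorized: true,               // refuse certs not signed by our CA
}, (socket) => {
  // Only peers presenting a CA-signed ("administrative") certificate get here.
  socket.write('command accepted\n');
  socket.end();
});

server.listen(9000);
```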
Pros:
TLS/SSL is used by many other products to create a secure tunnel, preventing "man in the middle" attacks and impersonations
There are ready-to-use libraries and sample projects for TLS/SSL in practically every language, from .net to C.
There are revocation lists, to "cancel" certificates that have been stolen (although you'll have to find a way to distribute these)
Certificate verification is offline - a node needs no external resources (except for the CA certificate) for verification
Cons:
Since SSL/TLS is a widely used system, there are many tools to exploit misconfigured or outdated clients and servers
Exploits are occasionally found in such libraries (e.g. "Heartbleed"), so you might need to patch your software often.
This solution still requires some serious coding, but it's usually better to rely on an existing and proven system than to go around inventing your own.

Is basic HTTP auth in CouchDB safe enough for replication across EC2 regions?

I can appreciate that seeing "basic auth" and "safe enough" in the same sentence is a lot like reading "Is parachuting without a parachute still safe?", so I'll do my best to clarify what I am getting at.
From what I've seen online, people typically describe basic HTTP auth as being unsecured due to the credentials being passed in plain text from the client to the server; this leaves you open to having your credentials sniffed by a nefarious person or man-in-the-middle in a network configuration where your traffic may be passing through an untrusted point of access (e.g. an open AP at a coffee shop).
To keep the conversation between you and the server secure, the typical solution is to use an SSL-based connection: your credentials might still be sent in plain text, but the communication channel between you and the server is itself secured.
So, onto my question...
In the situation of replicating one CouchDB instance from an EC2 instance in one region (e.g. us-west) to another CouchDB instance in another region (e.g. singapore) the network traffic will be traveling across a path of what I would consider "trusted" backbone servers.
Given that (assuming I am not replicating highly sensitive data) would anyone/everyone consider basic HTTP auth for CouchDB replication sufficiently secure?
If not, please clarify what scenarios I am missing here that would make this setup unacceptable. I do understand for sensitive data this is not appropriate, I just want to better understand the ins and outs for non-sensitive data replicated over a relatively-trusted network.
Bob is right that it is better to err on the side of caution, but I disagree with applying that blanket rule here. Bob could be right in this case (see details below), but the problem with his general approach is that it ignores the cost of paranoia. It leaves "peace dividend" money on the table. I prefer Bruce Schneier's assessment that security is a trade-off.
Short answer
Start replicating now! Do not worry about HTTPS.
The greatest risk is not wire sniffing, but your own human error, followed by software bugs, either of which could destroy or corrupt your data. Make a replica! If you will replicate regularly, plan to move to HTTPS or something equivalent (an SSH tunnel, stunnel, a VPN).
Rationale
Is HTTPS easy with CouchDB 1.1? It is as easy as HTTPS can possibly be, or in other words: no, it is not easy.
You have to make an SSL key pair and purchase a certificate or run your own certificate authority (you're not foolish enough to self-sign, of course!). The user's hashed password is plainly visible from your remote couch! To protect against cracking, will you implement bi-directional SSL authentication? Does CouchDB support that? Maybe you need a VPN instead? What about the security of your key files? Don't check them into Subversion! And don't bundle them into your EC2 AMI! That defeats the purpose. You have to keep them separate and safe. When you deploy or restore from backup, copy them manually. Also, password-protect them so that if somebody gets the files, they can't steal (or worse, modify!) your data. When you start CouchDB or replicate, you must then manually input the password before replication will work.
In a nutshell, every security decision has a cost.
A similar question is, "Should I lock my house at night?" It depends. Your profile says you are in Tucson, so you know that some neighborhoods are safe, while others are not. Yes, it is always safer to lock all of your doors all of the time. But what is the cost to your time and mental health? The analogy breaks down a bit because the time invested in worst-case security preparedness is much greater than twisting a bolt lock.
Amazon EC2 is a moderately safe neighborhood. The major risks are opportunistic, broad-spectrum scans for common errors. Basically, organized crime is scanning for common SSH accounts and web apps like Wordpress, so they can steal a credit card or other database.
You are a small fish in a gigantic ocean. Nobody cares about you specifically. Unless you are specifically targeted by a government or organized crime, or somebody with resources and motivation (hey, it's CouchDB: that happens!), then it's inefficient to worry about the boogeyman. Your adversaries are casting broad nets to get the biggest catch. Nobody is trying to spear-fish you.
I look at it like high-school integral calculus: measuring the area under the curve. Time goes to the right (x-axis). Risky behavior goes up (y-axis). When you do something risky, you save time and effort, but the graph spikes upward. When you do something the safe way, it costs time and effort, but the graph moves down. Your goal is to minimize the long-term area under the curve, but each decision is case-by-case. Every day, most Americans ride in automobiles: the single most risky behavior in American life. We intuitively understand the risk-benefit trade-off. Activity on the Internet is the same.
As you imply, basic authentication without transport-layer security is 100% insecure. Anyone on EC2 who can sniff your packets can see your password. Assuming that no one can is a mistake.
In CouchDB 1.1, you can enable native SSL. In earlier versions, use stunnel. Adding SSL/TLS protection is so simple that there's really no excuse not to.
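Once SSL is enabled, kicking off replication over HTTPS is a single authenticated POST to _replicate. A sketch follows; the host names, database name and credentials are placeholders, and it assumes Node 18+ for the built-in fetch:

```
// Trigger one-shot replication between two CouchDB instances over HTTPS with
// basic auth; the TLS tunnel keeps the credentials from crossing the wire in
// the clear.
async function replicate() {
  const auth = Buffer.from('admin:secret').toString('base64');
  const res = await fetch('https://us-west.example.com:6984/_replicate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Basic ${auth}` },
    body: JSON.stringify({
      source: 'https://admin:secret@us-west.example.com:6984/mydb',
      target: 'https://admin:secret@singapore.example.com:6984/mydb',
    }),
  });
  console.log(await res.json());
}

replicate().catch(console.error);
```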
I just found this statement from Amazon which may help anyone trying to understand the risk of packet sniffing on EC2.
Packet sniffing by other tenants: It is not possible for a virtual instance running in promiscuous mode to receive or "sniff" traffic that is intended for a different virtual instance. While customers can place their interfaces into promiscuous mode, the hypervisor will not deliver any traffic to them that is not addressed to them. This includes two virtual instances that are owned by the same customer, even if they are located on the same physical host. Attacks such as ARP cache poisoning do not work within EC2. While Amazon EC2 does provide ample protection against one customer inadvertently or maliciously attempting to view another's data, as a standard practice customers should encrypt sensitive traffic.
http://aws.amazon.com/articles/1697

Considerations regarding a p2p social network

While there are many social networks in the wild, most rely on data stored on a central site owned by a third party.
I'd like to build a solution where data remains local on members' systems. Think of the project as an address book which automagically updates a contact's data as soon as that contact changes their coordinates. This base idea might get extended later on...
Updates will be transferred using public/private key cryptography using a central host. The sole role of the host is to be a store and forward intermediate. Private keys remain private on each member's system.
If two clients are both online and a p2p connection can be established, the clients can transfer data telegrams without the central host.
Thus, sender and receiver will be the only parties able to create authentic messages.
Questions:
Do certain protocols exist which I should adopt?
Are there any security concerns I should keep in mind?
Do certain services exist which should be integrated or used somehow?
More technically:
Use e.g. Amazon- or Google-provided services?
Or better to use a raw web server? If yes: why?
Which algorithm and key length should be used?
UPDATE-1
I googled my own question title and found this academic project developed 2008/09: http://www.lifesocial.org/.
The solution you are describing sounds remarkably like email, with encrypted messages as the payload and an application rather than a human being creating the messages.
It doesn't really sound like "p2p": in most P2P protocols, the only requirement for central servers is discovery, whereas you're using store & forward.
As a quick proof of concept, I'd set up an email server and build an application that sends emails, encrypted using PGP, to addresses registered on that server; the tooling and libraries are available, so you should be able to get that up and running in days rather than weeks. In my experience, building a throw-away PoC for this kind of question is a great way of sifting out the nugget of an idea.
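For the PoC itself, the openpgp and nodemailer packages cover both halves. A rough sketch, where the recipient's armored public key, the SMTP host and the addresses are all placeholders:

```
// Encrypt a contact update with the recipient's PGP public key and hand it to
// an ordinary mail server for store & forward.
const openpgp = require('openpgp');   // openpgp v5 API
const nodemailer = require('nodemailer');

async function sendUpdate(recipientArmoredKey, update) {
  const publicKey = await openpgp.readKey({ armoredKey: recipientArmoredKey });
  const encrypted = await openpgp.encrypt({
    message: await openpgp.createMessage({ text: JSON.stringify(update) }),
    encryptionKeys: publicKey,
  });

  const transport = nodemailer.createTransport({ host: 'smtp.example.com', port: 587 });
  await transport.sendMail({
    from: 'alice@example.com',
    to: 'bob@example.com',
    subject: 'contact-update',
    text: encrypted,   // only the recipient's private key can decrypt this
  });
}
```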
The second issue is that the nature of a social network is that it's a network. Your design may require you to store more than the data of your two direct contacts: you may also have to store their friends, or at least the public interactions those friends have had.
This may not be part of your plan, but if it is, you need to think it through early on: you may end up having to transmit the entire social graph to each participant for local storage, which creates a scalability problem...
The paper about Safebook might be interesting for you.
Also you could take a look at other distributed OSN and see what they are doing.
None of the federated networks mentioned at http://en.wikipedia.org/wiki/Distributed_social_network is actually distributed. What Stefan intends to do is indeed new and has so far only been explored by some proprietary folks.
I've been thinking about the same concept for the last two years. I've finally decided to give it a try using Python.
I've spent the better part of last night and this morning writing a socket communication script & server. I also plan to remove the central server from the equation, as it's just plain cumbersome and there's no point to it when all the members could keep copies of their friends' keys.
Each profile could be accessed via a hashed string of someone's public key. My social network relies on nodes and pods. Pods are computers which have their ports open to the network; they help with relaying traffic, as most firewalls block incoming socket requests. Nodes store information and share it with other nodes. Each node will get a directory of active pods which may be used to relay its traffic.
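Deriving the profile address described above is a one-liner with Node's crypto module; a sketch, where the key type and hash choice are just examples:

```
// A profile ID is simply a hash of the member's public key: anyone holding
// the key can compute the ID, and the ID itself reveals nothing.
const crypto = require('crypto');

const { publicKey } = crypto.generateKeyPairSync('ed25519');
const der = publicKey.export({ type: 'spki', format: 'der' });
const profileId = crypto.createHash('sha256').update(der).digest('hex');
console.log(profileId); // 64-character hex string used to look up the profile
```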
The PeerSoN project looks like something you might be interested in: http://www.peerson.net/index.shtml
They have done a lot of research and the papers are available on their site.
Some thoughts about it:
protocols to use: you could model them on existing P2P programs and their design
security concerns: privacy. Take great care not to open doors: a whole system can get compromised because you left some door open.
services: you could integrate with the regular social networks through their APIs
People will have to install a program on their computers and remember to open it every time, like any P2P client. Keeping everything on a web server has a smaller footprint and requires less user action.
Somehow you'll need a centralized server to manage searches. You can't just broadcast across the internet to find friends. Otherwise you'll have to rely upon email requests to add someone, and to do that you'll need to know the email address in advance.
The fewer friends / contacts who use your program, the fewer new people will want to use it, since contact information won't be available on it.
I see that your server will be store-and-forward, so the update problem is solved.
