What is the transactions-per-second throughput of Hyperledger Fabric with Kafka? - hyperledger-fabric

I want to know how many transactions per second Hyperledger Fabric can process with the Kafka ordering service.
I'm creating a network. The project is not very large yet, but if my product ends up doing millions of transactions per second around the globe and I want to log those transactions on Hyperledger Fabric, can Fabric handle them all?
If yes, how many nodes do I need to set up initially, and what server specs are needed to deploy the network?
And if using Fabric is not a good idea, kindly let me know which blockchain I should use for an immutable consortium ledger.

Hyperledger Fabric is a scalable platform, so you can scale it to fit your needs. But since your project will do millions of transactions per second, you can do more than just add new nodes to increase TPS (transactions per second). For example:
Using IPFS-based storage to keep bulky data off the chain, with only the proof hash stored on Hyperledger Fabric.
Indexing the underlying CouchDB for the speed and scalability you need.
Using the high-throughput network provided by Hyperledger Fabric itself.
Increasing the number of endorser channels.
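The off-chain storage idea in the list above can be sketched as follows. This is a minimal illustration, not Fabric or IPFS code: the two dictionaries are hypothetical stand-ins for an off-chain store and the ledger, and SHA-256 is assumed as the proof hash.

```python
import hashlib

# Hypothetical stand-ins: a real deployment would use IPFS (or another
# content store) and a Fabric chaincode call instead of these dicts.
off_chain_store = {}
ledger = {}


def record(tx_id: str, payload: bytes) -> str:
    """Keep the bulky payload off-chain and only its digest on the ledger."""
    digest = hashlib.sha256(payload).hexdigest()
    off_chain_store[digest] = payload
    ledger[tx_id] = digest
    return digest


def verify(tx_id: str) -> bool:
    """Re-hash the off-chain payload and compare it with the on-chain digest."""
    digest = ledger[tx_id]
    payload = off_chain_store.get(digest)
    return payload is not None and hashlib.sha256(payload).hexdigest() == digest


record("tx1", b"large document contents")
print(verify("tx1"))
```

The ledger entry stays the same small size however large the payload is, and any change to the off-chain data makes `verify` fail.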
As for hardware requirements, an experiment at IBM on a single node with 4 vCPUs, 16 GB of memory, and SSDs produced the following figures:
Using 2 endorsers -> 785 TPS
Using 4 endorsers -> 948 TPS
Using 8 endorsers -> 1265 TPS
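To put those figures next to the question, here is a small arithmetic sketch. The measured numbers are the ones quoted above; the target of one million TPS is an assumption standing in for "millions of transactions per second".

```python
# Figures quoted above: number of endorsers -> measured TPS.
measured_tps = {2: 785, 4: 948, 8: 1265}

# Assumed lower bound for "millions of transactions per second".
target_tps = 1_000_000


def tps_per_endorser(results: dict) -> dict:
    """Throughput divided by endorser count, to show diminishing returns."""
    return {n: tps / n for n, tps in results.items()}


def shortfall_factor(best_tps: float, target: float) -> float:
    """How many times larger the target is than the best measured figure."""
    return target / best_tps


per_endorser = tps_per_endorser(measured_tps)
best = max(measured_tps.values())
for n, value in per_endorser.items():
    print(f"{n} endorsers: {value:.1f} TPS per endorser")
print(f"target is about {shortfall_factor(best, target_tps):.0f}x the best figure")
```

Throughput per endorser drops as endorsers are added, so in this setup adding endorsers alone would not close a gap of that size.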

Related

Simulate Hyperledger Fabric network with 5000 users

I am new to Hyperledger Fabric.
I read its documentation and followed the test network they provided on their website, so the test-network provides a bunch of terminal commands to add a third organization and its peer. I like that everything is ready to run on terminals, but the problem is the high level of abstraction over many details.
Goal:
I would like to simulate a permissioned blockchain network with 5000 users. Each user should be able to broadcast a transaction to the channel every 15 seconds. The orderers should package these transactions into a block every 15 seconds and let the connected users verify new blocks.
Questions:
Should I create a new peer for each user?
Or can I use a single peer and let each user use the app?
I could not find a single tutorial on adding more peers dynamically.
From reading the documentation, I think each user should have their own peer and app to broadcast transactions. However, creating 5000 peers one by one would be very time-consuming.
I know these questions may sound naive, considering that my other options, such as building my blockchain network simulation with socketio or grpc, would be less painful at the moment. I don't want to avoid reading the HLF docs, but the high level of abstraction and the learning time make me wonder whether I should use one of the other options for my simulation. As Linus Torvalds puts it simply:
Talk is cheap, show me the code!
In the HLF case, I don't want the already-provided terminal commands; I want to really understand and modify the source code of the peers.
Thank you for any recommendation or direction.
You need 5000 users (as registered in the CA), not 5000 peers. A single peer should be enough (although some more peers can be useful to distribute the endorsements and improve performance).
So, you should:
Register 5000 users in your Fabric-CA
Enroll their cryptographic material from the Fabric-CA
Run the 5000 clients (peer command, Fabric SDK based application or whatever).
Fabric CA related stuff: https://hyperledger-fabric-ca.readthedocs.io/en/latest/deployguide/use_CA.html.
Obviously, you should prepare some kind of script to do that. Don't do it manually.
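A minimal sketch of such a script, assuming the `fabric-ca-client` CLI is used and that a registrar identity is already enrolled. It only builds the command lines and does not run them. The CA address, certificate path, user names, secrets, and MSP directories are all hypothetical placeholders, and the flags should be checked against your Fabric CA version.

```python
import shlex

CA_HOST = "localhost:7054"  # hypothetical CA address
TLS_CERT = "tls-cert.pem"   # hypothetical path to the CA's TLS certificate


def user_commands(index: int) -> list:
    """Build the register and enroll commands for one user."""
    name = f"user{index}"
    secret = f"{name}pw"  # placeholder only; use real secrets in practice
    register = [
        "fabric-ca-client", "register",
        "--id.name", name, "--id.secret", secret, "--id.type", "client",
        "-u", f"https://{CA_HOST}", "--tls.certfiles", TLS_CERT,
    ]
    enroll = [
        "fabric-ca-client", "enroll",
        "-u", f"https://{name}:{secret}@{CA_HOST}",
        "-M", f"users/{name}/msp", "--tls.certfiles", TLS_CERT,
    ]
    return [register, enroll]


def build_script(user_count: int) -> str:
    """Return one shell line per command for all users."""
    lines = []
    for i in range(1, user_count + 1):
        for cmd in user_commands(i):
            lines.append(shlex.join(cmd))
    return "\n".join(lines)


script = build_script(5000)
print(script.splitlines()[0])
```

Writing `script` to a file gives a shell script with one register and one enroll command per user.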
If your purpose is only testing, maybe you can use cryptogen instead of Fabric-CA.
It seems that you are trying to run some kind of performance test. Hyperledger Caliper is designed for this kind of test. Maybe you can configure Caliper with 5000 workers (although I'm not sure whether you can configure a rate below 1 TPS to simulate one request every 15 seconds).
As for the orderers, you can configure your ordering service with a batch timeout of 15 seconds, but keep in mind that your 5000 transactions every 15 seconds may reach the batch size limit before the timeout expires, in which case the block is cut earlier.
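The interaction between batch timeout and batch size can be sketched with a simplified model. It assumes a steady arrival rate and considers only the orderer's `BatchTimeout` and `MaxMessageCount` settings, ignoring the byte-size limits; the value 500 is a hypothetical setting chosen for illustration.

```python
def block_cut(tx_rate: float, batch_timeout_s: float, max_message_count: int) -> tuple:
    """Return (seconds until the block is cut, transactions in that block).

    Simplified model: a block is cut when either the timeout expires or the
    message-count limit is reached, whichever happens first.
    """
    seconds_to_fill = max_message_count / tx_rate
    if seconds_to_fill < batch_timeout_s:
        return seconds_to_fill, max_message_count
    return batch_timeout_s, round(tx_rate * batch_timeout_s)


users = 5000
interval_s = 15
rate = users / interval_s  # about 333 transactions per second overall

# Hypothetical limit of 500 messages: blocks are cut long before 15 seconds.
print(block_cut(rate, batch_timeout_s=15, max_message_count=500))
# A limit of at least 5000 messages lets the 15 second timeout take effect.
print(block_cut(rate, batch_timeout_s=15, max_message_count=5000))
```

Under these assumptions, getting one block every 15 seconds requires a message-count limit of at least 5000.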

What kind of consensus used in Hyperledger Fabric?

I don't know if this question makes sense. I know Raft is a consensus algorithm and uses etcd to distribute the data, and I know etcd in the Raft ordering service has a similar job to ZooKeeper in the Kafka ordering service, but what I don't understand is: what kind of consensus is used in the Kafka ordering service?
Right now the ordering service can use Raft or Kafka (deprecated), but Raft is a consensus algorithm and Kafka is not. Or are both of them just part of the ordering phase of consensus? Does that mean Fabric uses a consensus algorithm as one part of its consensus? Then what kind of consensus is used in Fabric? I've read somewhere that Fabric is not PBFT yet.
Let's talk about it as ordering and consensus and bring in Kafka and Raft.
In a distributed system, where messages are going to multiple nodes, those nodes need a way to know which message came first, which came second, and so on. Think of it as transactions on your bank account. If you have $20 in your account and someone pays you $30 so your account goes to $50, and then you pay me $50 and your account goes to $0, it's a valid sequence. But if your bank messes up the order and you start with $20 and the transfer to me for $50 comes next, that check is going to bounce.
So that sequence (also known as order) is important, and in Fabric this is done by the ordering node.
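The bank account example can be written out as a few lines of code. This is only an illustration of why order matters, with a hypothetical `apply_in_order` helper; it is not how Fabric validates transactions.

```python
def apply_in_order(balance: int, transactions: list) -> tuple:
    """Apply signed amounts one by one, rejecting any that would overdraw."""
    rejected = []
    for amount in transactions:
        if balance + amount < 0:
            rejected.append(amount)
        else:
            balance += amount
    return balance, rejected


# Deposit of $30 first, then the $50 payment: both succeed.
print(apply_in_order(20, [+30, -50]))
# Payment first: it bounces, and only the deposit is applied.
print(apply_in_order(20, [-50, +30]))
```

The same two transactions lead to different final balances depending on their order, which is why every node has to agree on one sequence.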
For redundancy, to mitigate malicious intent, for decentralization and other reasons, you may not want just one node providing order. But, if you have n ordering nodes, how do you make sure they come up with one order of messages and not n variations of that order? You get a consensus among those nodes on the order of those messages.
As one of the responders posted, you can achieve that consensus with Raft or Kafka. Both are crash fault tolerant (CFT) approaches, which means that, theoretically, as long as a majority of the ordering nodes are good (2 out of 3, 3 out of 5, etc.), you are in good shape.
You are correct that Raft in Fabric relies on etcd (more precisely, on the Raft implementation from the etcd project), but I think that's an implementation detail and not tied to the consensus conceptually. etcd is an open source key-value store used to hold and manage information that distributed systems need to keep running. It's used by other projects too; for example, I believe Kubernetes uses it to manage its configuration and metadata.
I am not aware of a Byzantine fault tolerant library (where, I think, the system keeps functioning as long as fewer than one third of the ordering nodes are faulty) being available for Hyperledger Fabric yet, although there have been and continue to be discussions about it, and the Fabric documentation states that Raft CFT is a stepping stone to a BFT consensus library for Fabric in the future.
I would also reiterate reviewing the link to The Ordering Service Docs that was posted by another poster as good material to review for more information.
I also really like this introduction to RAFT video, it's not related to Fabric, but does an excellent job of explaining RAFT in general, if you are interested.
In its entirety, a consensus in the blockchain is a mechanism that ensures all copies of a distributed ledger are the same.
Hyperledger Fabric achieves consensus by relying on a backend service (known as the ordering service) that intermediates the messages between senders and receivers. Prior to version 1.4 the ordering service used Kafka, and later Raft. This backend service ensures that all receivers see messages in the same order. It follows that if all receivers see messages in the same order, they will perform the same actions, commits, etc., and consensus is achieved.
Hyperledger Fabric uses crash fault tolerance (CFT) to achieve consensus for single-org as well as multi-org systems. The crash fault tolerant model guarantees that the system withstands failures such as crashes and network partitions. With N nodes in your consensus system, CFT can withstand fewer than N/2 such crashes (for example, 1 of 3 or 2 of 5), because a majority of nodes must stay available.
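The node counts behind these fault tolerance figures can be worked out with integer arithmetic. The sketch below uses the standard majority rule for CFT (2 of 3, 3 of 5) and the standard one-third bound for BFT; the function names are illustrative and not part of any Fabric API.

```python
def cft_quorum(n: int) -> int:
    """Smallest majority of n nodes."""
    return n // 2 + 1


def cft_tolerated_crashes(n: int) -> int:
    """Crashed nodes a majority-based (CFT) cluster of n nodes can survive."""
    return (n - 1) // 2


def bft_tolerated_faults(n: int) -> int:
    """Byzantine nodes a classic BFT cluster of n nodes can survive (n >= 3f + 1)."""
    return (n - 1) // 3


for n in (3, 4, 5, 7):
    print(n, cft_quorum(n), cft_tolerated_crashes(n), bft_tolerated_faults(n))
```

Going from 3 to 4 nodes does not increase the number of tolerated crashes, which is why odd cluster sizes are the usual choice for majority-based ordering.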
For more information, you can read this article which does a good job on explaining consensus in Hyperledger Fabric.
I am not an expert on the subject, but I will try to respond to your questions.
Apache ZooKeeper (used in Kafka) does not use a consensus algorithm; it is a centralized service that saves configuration and exposes endpoints (https://zookeeper.apache.org/). ZooKeeper is used as a central communication point, and it uses Zab to propagate state updates. If you want more info, go here: https://kafka.apache.org/intro
Fabric's Raft ordering service, in turn, is built on the Raft implementation from etcd. Raft is a leader/follower type consensus algorithm.
So Raft is the consensus used in Hyperledger Fabric as of 2.x, but as it is a leader/follower type algorithm, it is not Byzantine fault tolerant (at its core; modifications can be made to make it PBFT).
I recommend you read the Hyperledger documentation, which is very complete and probably explains it better than me: https://hyperledger-fabric.readthedocs.io/en/release-2.2/orderer/ordering_service.html
Also see the Raft documentation if you want to understand how the algorithm works: https://raft.github.io/

Why is Hyperledger Fabric's performance so bad in our deployment benchmark?

We use Hyperledger Fabric and Composer to build a system with one channel and 3 peers (8 GB, 8 CPUs), but we ran into the 2 problems below:
Performance is very low: 10 TPS. Is this the limit of Composer or a fault in our implementation?
Size is very large: more than 20 times that of SQL Server. With 1000 records we use 40 MB in SQL, but 700 MB with the Hyperledger system above (3 peers). (I think that if it were linear, the Hyperledger size should be 40*4 = 160 MB.) Is this size normal or is it our fault? What are the best practices for size optimization?
Thanks in advance.
"Performance is very low: 10 tps. Is it the limit of the composer or our fault in implementation?"
I don't know about composer but it's definitely not the limit of Fabric. More data about your deployment would be helpful.
"Size is very large, it's more than 20 times compare to SQL Server. With 1000 records in SQL we've used 40MB but with Hyperledger system as above (3 peer) is 700MB (I think if it's linear the size of Hyperledger should be = 40*4 = 160MB)."
A transaction with a single endorsement weighs about 3K.
1000*3K is about 3MB... something here is off ;)
How many blocks and transactions do you have?
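The back-of-the-envelope check above can be written out as follows. It assumes "3K" means 3 KiB per transaction and that the reported 700 MB covers all three peers together; both are readings of the posts, not measured facts.

```python
KIB = 1024
MIB = KIB * KIB


def expected_ledger_bytes(tx_count: int, bytes_per_tx: int, peers: int) -> int:
    """Rough ledger size if every peer stores every transaction once."""
    return tx_count * bytes_per_tx * peers


per_peer = expected_ledger_bytes(1000, 3 * KIB, peers=1)
all_peers = expected_ledger_bytes(1000, 3 * KIB, peers=3)
observed = 700 * MIB  # figure reported in the question

print(f"expected per peer: {per_peer / MIB:.2f} MiB")
print(f"expected for 3 peers: {all_peers / MIB:.2f} MiB")
print(f"observed / expected: {observed / all_peers:.0f}x")
```

Even counting all three peers, the estimate is far below the observed size, which supports the suspicion that something other than transaction data is taking up the space.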
I conducted some similar testing. I took an existing microservice and replaced the MySQL, Redis, and Cassandra parts with API calls to a business network built with Hyperledger Composer. Storage for the original microservice was 750 times more efficient than CouchDB under Fabric. You can read all about this experiment in this article on Evaluating Hyperledger Composer.

How many nodes can we create under Hyperledger Fabric?

Is there any limit on the number of nodes that can be created when configuring Hyperledger Fabric?
I have gone through the answer below, but it's not clear to me what it is explaining.
Limit of number of nodes in Hyperledger
When I say number of nodes, it could be the number of stakeholders (marked as organizations), peers, or endorser nodes.
The answer on that post is now incorrect. Fabric does not currently use Byzantine fault tolerance; it only has crash tolerance through Kafka ordering. Byzantine fault tolerance is estimated to come around Fabric 1.4.
With Kafka, there is no limit on the number of nodes. There is a performance hit as you introduce nodes; Hyperledger Sawtooth is known to be better for node scalability.
There is no limit on the number of nodes in Fabric (that's the idea behind a distributed system), but be aware that as you add more and more nodes, you may see transaction performance suffer.
According to my recent conversations with teams that have implemented Hyperledger Fabric on version 1.1, performance seems to be okay for up to 16 to 18 nodes. It seems to be a trade-off for the faster finality demonstrated by Hyperledger Fabric.
In Hyperledger Fabric, nodes can be orderers, endorsing peers, or clients.
If we are talking about how many Byzantine nodes, then the precise answer is as follows: a) There is no limit on Byzantine peers and clients. If there are too many of them, a client just won't be able to get its transaction endorsed. However, the integrity of the system is not endangered. b) Since the consensus algorithm is run between the orderers, the limit depends on the specific algorithm used. Remember that Hyperledger Fabric supports pluggable consensus, meaning that the consensus algorithm is not necessarily hardcoded. In its current implementation, Hyperledger Fabric runs "Kafka", which is NOT Byzantine fault tolerant. This means that even one Byzantine orderer can compromise the whole system! However, there are plans for BFT-Smart, which is Byzantine fault tolerant and supports up to 33% faulty nodes, as the above answer says.
If we are talking about the total number of nodes, then the precise answer is as follows: a) There is (theoretically) no limit on the number of clients and peers. b) The practical limit on orderers again depends on the consensus. For BFT, this translates to practically 10 (maybe 20) orderers.

What is the scale at which Hyperledger Fabric has been deployed?

I am looking for information on how many peer nodes, ordering nodes, and CA servers are required to handle 1 million transactions per minute. Which deployment strategy is helpful: Docker Swarm or Kubernetes? Which one is ideal for scaling and extensibility?
The scaling of Hyperledger Fabric depends on the chosen consensus method. Consensus methods that support Byzantine fault tolerance can handle fewer than 1000 transactions per second with fewer than 20 nodes. For more transactions or more nodes, non-BFT consensus methods can be chosen. However, those methods cannot guarantee the correctness of transactions the way BFT methods do.
