I want to build a secure Hyperledger Fabric infrastructure to manage nodes that run on physical devices.
The front-end user application writes to Hyperledger. It contacts a random node and, if the node answers, the application sends the request and payload.
What is the best way to guarantee private communication between the off-chain front-end app and Hyperledger?
I have already created a private domain secured by an SSL certificate for every node, but this method doesn't sound scalable: what if we have 10k nodes? Is there a better approach?
If your intent is to communicate directly with the peer, its endpoint can already be secured with TLS.
Ideally, however, your web app would communicate with your back-end server (let's say a Node.js Express server). Your Express server would be secured with TLS and your web app would communicate with it via HTTPS. Your Express server would then use the Fabric Node SDK to communicate with your network, which is also TLS-secured communication. You're not configuring anything more extensive than you would when building a TLS-secured web server in the first place.
To your last point, who owns the 10k nodes? An organization would only be expected to own a few nodes, and your few nodes would handle your transactions; you wouldn't be submitting to other organizations' peers. Owning so many peers in a network would defeat the purpose of Fabric's consensus, because you could compromise the network by always being able to satisfy the policy quorum yourself.
Should an app have a single Gateway for all users and switch to the relevant identity for requests, or multiple Gateways (one per user) with identity initially set and never changed?
Should the Gateway be connected and disconnected after each request, or should it be initially connected once and left open?
The following applies to v1.4 of the Node.js implementation of Gateway in the fabric-network package.
You should have a single gateway for each individual user. Once you disconnect a gateway you should not use it again (i.e. don't attempt to call connect again).
A basic pattern for long-running multi-user apps is to create a gateway for each user and cache it. Use a stale policy to disconnect and discard a gateway if its user hasn't used it for a while.
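The cache-with-stale-policy pattern can be sketched independently of the SDK. The following is a minimal illustration in Python, not the fabric-network API: `connect` is a hypothetical factory that returns a gateway-like object for a user, and that object is assumed to expose a `disconnect()` method. The clock is injectable only to make the idle timeout easy to test.

```python
import time
from typing import Any, Callable, Dict, Tuple


class GatewayCache:
    """Caches one gateway-like object per user and evicts idle ones.

    `connect` and the `disconnect()` method on its result are hypothetical
    stand-ins for whatever the real SDK provides.
    """

    def __init__(self, connect: Callable[[str], Any], max_idle_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._connect = connect
        self._max_idle = max_idle_seconds
        self._clock = clock
        # user id -> (gateway, time of last use)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, user_id: str) -> Any:
        """Return the cached gateway for this user, creating it on first use."""
        self.evict_stale()
        entry = self._entries.get(user_id)
        gateway = entry[0] if entry else self._connect(user_id)
        self._entries[user_id] = (gateway, self._clock())
        return gateway

    def evict_stale(self) -> None:
        """Disconnect and drop every gateway that has been idle for too long."""
        now = self._clock()
        for user_id, (gateway, last_used) in list(self._entries.items()):
            if now - last_used > self._max_idle:
                # A disconnected gateway is discarded, never reconnected.
                gateway.disconnect()
                del self._entries[user_id]
```

Eviction here runs lazily on each `get`; a long-running app might instead call `evict_stale` from a periodic timer so that idle gateways are released even when no requests arrive.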
I'm setting up a production environment of Hyperledger Fabric 1.4 and one of my concerns is connectivity with third-party systems. Since the infrastructure is not running inside a VPN and publicly available third-party systems are generating load for our network, I am skeptical about allowing connections over the public network directly into the Hyperledger Composer API. Does anybody have experience with the performance of an intermediary host that is the only one allowed to communicate with the Hyperledger network?
I don't see a problem with that, if you need that kind of setup. If you use Composer you will have an API to communicate with your network. Nothing stops you from creating another app that solely communicates with this API.
The performance depends on other factors, like number of requests, size of data, frequency of data.
Also, don't forget that the Hyperledger API needs to be secured. As for public access, there shouldn't be any; the whole point of Hyperledger is to allow only known entities to connect and do whatever needs to be done.
I am interested in why we might use one over the other, i.e. the pros and cons of these approaches.
From what I understood both provide fault-handling and Endpoint Resolution. My assumption is to use ReverseProxy for external clients (outside the cluster) and ServicePartitionClient within a cluster.
The reverse proxy works on the service side, as a gateway that routes HTTP requests to other services running inside the cluster. Reverse proxy users can be inside or outside the cluster.
Pro: Can be accessed by anyone who understands HTTP.
Con: Restricted to HTTP. Requires intimate knowledge about service name and partitioning strategy at the caller side.
The partition client runs on the client side, to call services. Depending on the underlying communication technology implemented by TCommunicationClient, partition clients can be used inside or outside the cluster (they are not restricted to Service Remoting).
You could write code that uses the partition client to call the reverse proxy with retry support.
Pro: Can be accessed by anyone who has the TCommunicationClient.
Con: Requires intimate knowledge about service name and partitioning strategy at the caller side.
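Both approaches rest on the same idea: resolve an endpoint, call it, and on a transient failure resolve again and retry. A minimal sketch of that loop in Python follows. It is not the Service Fabric API; `resolve`, `call`, and `TransientError` are hypothetical names introduced only for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a moved replica)."""


def call_with_retry(resolve: Callable[[bool], str],
                    call: Callable[[str], T],
                    max_attempts: int = 3,
                    backoff_seconds: float = 0.0) -> T:
    """Resolve an endpoint, call it, and re-resolve and retry on transient errors.

    `resolve(refresh)` returns an address; `refresh=True` asks it to bypass
    any cached address. `call(endpoint)` performs the actual request.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[TransientError] = None
    for attempt in range(max_attempts):
        # After a failure the cached address may be stale, so force a fresh lookup.
        endpoint = resolve(attempt > 0)
        try:
            return call(endpoint)
        except TransientError as error:
            last_error = error
            time.sleep(backoff_seconds * (attempt + 1))
    assert last_error is not None
    raise last_error
```

With the reverse proxy this loop runs on the service side, so the caller only issues a plain HTTP request; with the partition client the loop runs in the caller's own process.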
I was working on a side project and I decided to redesign my skeleton project as microservices. So far I haven't found any open-source project that follows this pattern. After a lot of reading and searching I arrived at this design, but I still have some questions and thoughts.
Here are my questions and thoughts:
How can I make the API gateway smart enough to load balance requests if I have 2 nodes of the same microservice?
If one of the microservices is down, how should the discovery service know?
Is there any similar implementation? Is my design right?
Should I use Eureka or something similar?
Your design seems OK. We are also building our microservice project using the API gateway approach. All the services, including the gateway service (GW), are containerized (we use Docker) Java applications (Spring Boot or Dropwizard). A similar architecture could be built using Node.js as well. Some topics to mention related to your question:
Authentication/Authorization: The GW service is the single entry point for the clients. All authentication and authorization operations are handled in the GW using JSON Web Tokens (JWT), for which Node.js libraries exist as well. We keep authorization information such as the user's roles in the JWT. Once the token is generated in the GW and returned to the client, the client sends it in an HTTP header with each request, and we check whether the token has expired and whether the client has the role required to call the specific service. With this approach you don't need to keep track of the user's session on the server side. In fact, there is no session: the required information is in the JWT.
Service discovery / load balancing: We use Docker and Docker Swarm, a clustering tool bundled with Docker Engine (since Docker 1.12). Our services are Docker containers. The containerized approach makes it easy to deploy, maintain, and scale the services. At the beginning of the project, we used HAProxy, Registrator, and Consul together to implement service discovery and load balancing, similar to your drawing. Then we realized we don't need them for service discovery and load balancing as long as we create a Docker network and deploy our services using Docker Swarm. With this approach you can easily create isolated environments for your services (dev, beta, prod) on one or multiple machines by creating a different network for each environment. Once you create the network and deploy the services, service discovery and load balancing are no longer your concern. Within the same Docker network, each container can resolve the other containers through DNS and communicate with them. With Docker Swarm, you can scale a service with one command. For each request to a service, Docker load balances the request to an instance of that service.
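To make the stateless check concrete, here is a minimal sketch of issuing and verifying an HS256-signed token that carries roles, using only the Python standard library. It illustrates the mechanism and is not the code from the project described above; the function names and the `roles` claim name are assumptions, and a production gateway would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import List, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> str:
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(secret: bytes, user: str, roles: List[str], ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Build a signed token whose payload carries the user's roles and expiry."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": user, "roles": roles, "exp": int(now) + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(secret, header + '.' + payload)}"


def is_authorized(secret: bytes, token: str, required_role: str,
                  now: Optional[float] = None) -> bool:
    """Check signature, expiry, and role without any server-side session."""
    now = time.time() if now is None else now
    try:
        header, payload, signature = token.split(".")
        expected = _sign(secret, header + "." + payload)
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
            return False
        claims = json.loads(_b64url_decode(payload))
    except ValueError:
        return False  # malformed token
    return claims.get("exp", 0) > now and required_role in claims.get("roles", [])
```

Because everything needed for the decision is inside the signed token, any gateway instance holding the secret can validate a request, which is what makes the approach sessionless.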
Your design is OK.
If your API gateway needs to implement CAS or some other kind of auth (and that's probably the case) via one of the services, i.e. some kind of user service, and should also track all requests and modify the headers to carry the requester metadata (for internal ACL/scoping usage), then your API gateway should be written in Node, but should sit behind HAProxy, which will take care of load balancing and HTTPS.
Discovery is in the correct position. If you are looking for one that fits your design, look no further than Consul.
You can use consul-template or your own micro discovery framework for the services and the API gateway, so they share endpoint data on boot.
ACL/authorization should be implemented per service, and the first request from the API gateway should be subject to all authorization middleware.
It's smart to have the API gateway assign a request ID to each request so that its lifecycle can be tracked within the "inner" system.
I would add Redis for messaging, workers, queues, and fast in-memory needs such as caching and cache invalidation (you can't run a full microservice architecture without one), or use RabbitMQ if you have many more distributed transactions and a lot of messaging.
Spin all this up in containers (Docker) so it will be easier to maintain and assemble.
As for BI, why would you need a service for that? You could have an external ELK stack (Elasticsearch, Logstash, Kibana) and get dashboards, log aggregation, and a huge data warehouse at once.
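The request ID suggestion in this answer can be sketched as a small header helper. This is only an illustration in Python: the `X-Request-ID` header name is a common convention assumed here, not something the answer specifies, and `with_request_id` is a hypothetical function.

```python
import uuid
from typing import Dict, Mapping

REQUEST_ID_HEADER = "X-Request-ID"  # assumed header name (common convention)


def with_request_id(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the headers that is guaranteed to carry a request ID.

    An ID that is already present is kept, so inner services can call this
    too and the same ID follows the request through every hop.
    """
    wanted = REQUEST_ID_HEADER.lower()
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == wanted and value:
            return dict(headers)
    forwarded = {k: v for k, v in headers.items() if k.lower() != wanted}
    forwarded[REQUEST_ID_HEADER] = str(uuid.uuid4())
    return forwarded
```

If every service logs this ID with each log line, a log aggregator can reconstruct the full path of a single request across services.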
Hi, we have a UI component deployed to Bluemix on Node.js which makes REST service calls (JSON/XML) to services deployed in a data center. These calls go through the IBM DataPower gateway, which acts as a security proxy.
DataPower establishes an HTTPS mutual authentication connection (using certificates that are exchanged offline) with the caller.
Although this method is secure, it is time-consuming to set up, and if the connection is set up again for each service request it will slow down responses for the end user.
To optimize response time we are looking for a solution that can pool connections between the Node.js app deployed on Bluemix and the DataPower security proxy. Does anyone have any experience in this area?
Regarding "it is time-consuming to set up": in DataPower you can create a multi-protocol gateway (MPGW) in front of your services to act as a router. The MPGW will match service calls based on their URI and route them accordingly. In this scenario, you only need to configure a single endpoint in the Bluemix Cloud Integration service in order to work with all your services. One downside of this approach is that it will be harder to control access to specific on-premise services, because they will all be exposed to your Bluemix app as a single service.
Regarding response times, where are you seeing the bottleneck?
If establishing the TCP connections is causing too much overhead, you should be able to configure your Node.js app to use or re-use persistent connections via keepalive settings, or you can look into setting up a connection pool that manages that for you (e.g. https://www.npmjs.com/package/generic-pool seems a popular choice).
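The idea behind a connection pool is that the expensive setup (here, the mutual TLS handshake) happens once per pooled connection rather than once per request. A minimal sketch in Python follows. It is not the generic-pool API: `create` is a hypothetical factory that opens one connection, and a real pool would also validate connections and discard broken ones.

```python
import queue
from contextlib import contextmanager
from typing import Any, Callable, Iterator, Optional


class ConnectionPool:
    """Opens at most `size` connections lazily and hands them out for reuse."""

    def __init__(self, create: Callable[[], Any], size: int):
        self._create = create
        self._idle: "queue.LifoQueue[Any]" = queue.LifoQueue()
        # Each None is a free slot whose connection has not been opened yet.
        for _ in range(size):
            self._idle.put(None)

    @contextmanager
    def connection(self, timeout: Optional[float] = None) -> Iterator[Any]:
        """Borrow a connection; blocks while all `size` connections are in use."""
        conn = self._idle.get(timeout=timeout)
        if conn is None:
            try:
                conn = self._create()  # the expensive handshake happens here
            except Exception:
                self._idle.put(None)   # give the slot back if opening failed
                raise
        try:
            yield conn
        finally:
            # Return the open connection so the next caller skips the handshake.
            self._idle.put(conn)
```

Sequential requests made through `with pool.connection() as conn:` reuse the same open connection, and a second connection is opened only when two requests overlap.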
On the DataPower side, make sure the front/back persistent timeout is set according to your requirements: http://www-01.ibm.com/support/knowledgecenter/SS9H2Y_7.2.0/com.ibm.dp.doc/mpgw_availableproperties_serviceview.html?lang=en
Other timeout values in DataPower can be found at http://www-01.ibm.com/support/docview.wss?uid=swg21469404