I am new to Akka, but I have been working on web projects for several years.
Previously, we used an "nginx + Tomcat cluster" setup. If one of the Tomcat servers crashed (a physical machine crash), the other members of the Tomcat cluster would take over the requests destined for the failed server. That is how we achieved high availability.
In our new web project, we plan to use Akka to achieve fault tolerance and scalability. One server hosts the master actor, whose job is to dispatch request messages from outside to the child actors. The child actors may be distributed locally or remotely. If any child actor fails, the master actor will recover it.
My question is: if the master actor fails (maybe the machine crashes), how can I achieve high availability in that case? If the machine where the master actor resides crashes, we have to restart it manually, and during that period the service is down. That is unacceptable for us.
Could anyone tell me how to solve this issue?
I would suggest using an external load balancer. For example, you can report the addresses of all your Akka cluster nodes to the middle tier of your web app (and cache them there for some time) and load balance requests from the middle tier to your Akka backend (for example, by picking a random cluster node every time).
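A minimal sketch of that middle-tier logic, written in plain Python rather than against any Akka API. The names `NodeDirectory`, `fetch_nodes`, `send` and `send_with_failover` are hypothetical; how the addresses are actually reported and how a request is sent depend on your setup. It only illustrates caching the node list for a while, picking nodes at random, and moving on to another node when one is unreachable.

```python
import random
import time


class NodeDirectory:
    """Caches the backend node addresses for a limited time."""

    def __init__(self, fetch_nodes, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch_nodes = fetch_nodes  # hypothetical callable returning a list of addresses
        self._ttl = ttl_seconds
        self._clock = clock
        self._nodes = []
        self._expires_at = None  # None means nothing is cached yet

    def nodes(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._nodes = list(self._fetch_nodes())
            self._expires_at = now + self._ttl
        return list(self._nodes)

    def invalidate(self):
        self._expires_at = None


def send_with_failover(directory, send, request, rng=random):
    """Try the cached nodes in random order until one accepts the request."""
    candidates = directory.nodes()
    rng.shuffle(candidates)
    last_error = None
    for node in candidates:
        try:
            return send(node, request)
        except ConnectionError as err:
            last_error = err  # node is down or unreachable, try the next one
    directory.invalidate()  # every cached node failed, so refresh on the next call
    raise RuntimeError("no backend node reachable") from last_error
```

With this shape, a crashed backend machine costs one failed attempt per request until the cached list is refreshed, instead of stopping the service.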
If your Akka backend requires some kind of single point of resource acquisition for your child actors, you can try using a Cluster Singleton as the master actor. In this case all cluster nodes should send client requests to the Cluster Singleton, and the Cluster Singleton can then distribute tasks among the cluster nodes in some particular way.
Related
I have a three-node cluster configured for VoltDB. Currently two applications are running and all the traffic is going to a single node (only one server).
Since we have a three-node cluster and the data is replicated across all the nodes, can I run one service on one node and the other service on another node? Is that possible?
Yes, as long as both these services use the same database, they can both point to different nodes in the cluster, and VoltDB will reroute the data to the proper partition accordingly.
However, it is recommended to connect applications to all of the nodes in a cluster, so they can send requests to the cluster more evenly. Depending on which client is being used, there are optimizations that send each request to the optimal server based on which partition is involved. This is often called "client affinity". Clients can also simply send to each node in a round-robin style. Both client affinity and round-robin are much more efficient than simply sending all traffic to 1 node.
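To make the two routing styles concrete, here is a rough sketch in plain Python. It does not use the VoltDB client API, and the class names are made up for illustration. In particular, the affinity router assumes a simple hash-modulo mapping from key to node, whereas a real client would learn the actual partition layout from the cluster.

```python
import itertools
import zlib


class RoundRobinRouter:
    """Cycles through the nodes so each one gets an equal share of requests."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(list(nodes))

    def pick(self, partition_key=None):
        # The key is ignored: the next node in the cycle is always chosen
        return next(self._cycle)


class AffinityRouter:
    """Sends every request for a given partition key to the same node.

    Assumption: key -> node is a plain hash modulo the node count.
    """

    def __init__(self, nodes):
        self._nodes = list(nodes)

    def pick(self, partition_key):
        # crc32 is stable across processes, unlike Python's built-in hash() for str
        bucket = zlib.crc32(str(partition_key).encode("utf-8")) % len(self._nodes)
        return self._nodes[bucket]
```

Either router spreads load over all three nodes. The affinity version additionally keeps requests for the same key on the same node, which is the idea behind avoiding an extra hop inside the cluster.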
Also, be cautious of running applications on the same hosts as VoltDB nodes, because they could unpredictably starve the VoltDB process of resources that it needs. However, for applications that behave well and on servers where there are adequate resources, they can be co-located and many VoltDB customers do this.
Full Disclosure: I work at VoltDB.
I'm planning to host many jobs in Service Fabric as stateless services that run asynchronously. The plan is to host them on multiple nodes and run them in parallel with a queue mechanism. The only issue (maybe) is this: if I follow this design, with multiple jobs running on many nodes at the same time and hitting the same database, could it cause a database issue? In a typical on-prem application there used to be a SQL queue, so SQL could read the messages and process them. But in this scenario the Service Fabric nodes themselves drive the database, which may cause slowness at the DB level.
Has anyone faced this issue, or deployed asynchronous background processes on all SF nodes running in parallel for data-centric work?
I'm working on a project with Node.js that involves a server. Due to the large number of jobs, I need to perform clustering to divide the jobs between different servers (different physical machines). Note that my jobs have nothing to do with the internet, so I cannot use stateless connections (or Redis to keep state) and a load balancer in front of the servers to distribute the connections.
I have already read about the "cluster" module but, from what I understood, it seems to scale only across multiple processors on the same machine.
My question: is there any suitable distributed module available in Node.js for my work? What about Apache Mesos? I have heard that Mesos can abstract multiple physical machines into a single server. Is that correct? If yes, is it possible to use the Node.js cluster module on top of Mesos, since we would then have only one virtual server?
Thanks
My question: is there any suitable distributed module available in Node.js for my work?
Don't know.
I have heard that mesos can abstract multiple physical machines into a single server? is it correct?
Yes. Almost. It allows you to pool resources (CPU, RAM, disk) across multiple machines, and gives you the ability to allocate resources for your applications and to run and manage those applications. So you can ask Mesos to run X instances of node.js and specify how many resources each instance needs.
http://mesos.apache.org
https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
If yes, it is possible to use the node.js cluster module on top of the mesos, since now we have only one virtual server?
Admittedly, I don't know anything about node.js or clustering in node.js. Going by http://nodejs.org/api/cluster.html, it just forks off a bunch of child workers and then round robins the connection between them. You have 2 options off the top of my head:
Run node.js on Mesos using an existing framework such as Marathon. This will be the fastest way to get something going on Mesos. https://github.com/mesosphere/marathon
Create a Mesos framework for node.js, which essentially does what the node.js cluster module is doing, but across machines. http://mesos.apache.org/documentation/latest/app-framework-development-guide/
In both of these solutions, you have the option of letting Mesos create as many instances of node.js as you need, or of using Mesos to run the node.js cluster module on each machine and letting it manage all the workers on that machine.
I didn't google, but there might already be a node.js mesos framework out there!
So I have an app I am working on and I am wondering if I am doing it correctly.
I am running cluster on my node.js app, here is a link to cluster. I couldn't find anywhere that states if I should only run cluster on a single server or if it is okay to run it on a cluster of servers. If I continue down the road I am going I will have a cluster inside a cluster.
So that the answers are not just opinions, here is my question: was the cluster package made to do what I am doing (a cluster of workers on a single server, inside a cluster of servers)?
Thanks in advance!
Cluster wasn't specifically designed for that, but there is nothing about it which would cause a problem. If you've designed your app to work with cluster, it's a good indication that your app will also scale across multiple servers. The main gotcha would be if you're doing anything stateful on the filesystem. For example, if a user uploads a photo and you store it on the server disk, that would be problematic when scaling out across multiple servers (that don't share the same disk).
I have a Java-based server accepting client requests. The requests are CPU-bound jobs with no dependencies between them. My server has a thread pool with as many threads as there are processors (or cores) in the system, but server performance is low and client requests wait for a thread to become available. Can a cluster help me in this scenario? I want to distribute the jobs to the cluster nodes so that the clients' wait time can be eliminated. Please also tell me which framework I should use. Can RMI help me? Should I use Hazelcast?
You can use the distributed ExecutorService to distribute your operations to the different nodes and offload them to your own threadpool.
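The shape of that approach, sketched in Python with a local pool as a stand-in: submit each independent job to an executor, get a future back, and collect the results as they complete. This is only an illustration of the pattern, not the distributed ExecutorService itself. `run_jobs` and `sum_of_squares` are made-up names, and a local Python thread pool will not actually speed up CPU-bound work; in the distributed case the executor would run the tasks on other cluster members.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_jobs(job, inputs, executor):
    """Submit each independent job and collect the results keyed by input.

    Assumes the inputs are hashable and unique.
    """
    futures = {executor.submit(job, item): item for item in inputs}
    results = {}
    for future in as_completed(futures):
        results[futures[future]] = future.result()  # re-raises if the job failed
    return results


def sum_of_squares(n):
    # Stand-in for a CPU-bound client request
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # Local pool sized to the core count, as in the question
    with ThreadPoolExecutor(max_workers=os.cpu_count() or 1) as pool:
        print(run_jobs(sum_of_squares, [10, 100, 1000], pool))
```

Because the calling code only depends on submit and future, the local pool can be replaced by an executor that ships tasks to other nodes without changing how jobs are submitted or collected.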
There are some pretty good compute grid frameworks that will do the job. You can start by googling "java grid computing" or "java cluster computing". To name a few:
JPPF
GridGain
HTCondor
Hadoop
Unicore
etc ...