Can I use a cluster for many CPU-bound jobs? - multithreading

I have a Java-based server that accepts client requests. The requests are CPU-bound jobs with no dependencies between them. The server has a thread pool with as many threads as there are processors (or cores) in the system, but performance is low and client requests wait for a thread to become available. Can a cluster help in this scenario? I want to distribute the jobs to the nodes of a cluster so that the client request wait time can be eliminated. Please also tell me which framework I should use. Can RMI help me? Should I use Hazelcast?

You can use the distributed ExecutorService to distribute your operations to the different nodes and offload them to your own threadpool.
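As a minimal sketch of that idea, the code below submits independent CPU-bound jobs through the standard java.util.concurrent.ExecutorService interface. The class JobSubmitter and the job SumOfSquares are hypothetical names made up for illustration. The sketch runs on a local pool; Hazelcast's IExecutorService (obtained from HazelcastInstance.getExecutorService) extends the same interface, so passing one to runAll instead is the assumed path to distribution, which is not shown or tested here. The job is Serializable because a distributed executor has to ship it to another node.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class JobSubmitter {

    // Hypothetical CPU-bound job with no dependency on other jobs.
    // Serializable so that a distributed executor could send it to another node.
    public static class SumOfSquares implements Callable<Long>, Serializable {
        private static final long serialVersionUID = 1L;
        private final int n;

        public SumOfSquares(int n) {
            this.n = n;
        }

        @Override
        public Long call() {
            long total = 0;
            for (int i = 1; i <= n; i++) {
                total += (long) i * i;
            }
            return total;
        }
    }

    // Submits one job per input to the given executor and returns the
    // results in the same order as the inputs.
    public static List<Long> runAll(ExecutorService executor, List<Integer> inputs)
            throws Exception {
        List<Future<Long>> futures = new ArrayList<>();
        for (int n : inputs) {
            futures.add(executor.submit(new SumOfSquares(n)));
        }
        List<Long> results = new ArrayList<>();
        for (Future<Long> future : futures) {
            results.add(future.get()); // blocks until that job has finished
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Local pool sized to the number of cores, as in the question.
        ExecutorService local =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            System.out.println(runAll(local, Arrays.asList(3, 10, 100)));
        } finally {
            local.shutdown();
        }
    }
}
```

Because runAll depends only on the ExecutorService interface, the submitting code does not change when the executor behind it does.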

There are some pretty good compute grid frameworks that will do the job. You can start by googling "java grid computing" or "java cluster computing". To name a few:
JPPF
GridGain
HTCondor
Hadoop
Unicore
etc ...

Related

How to run two services on different nodes with VoltDB

I have a three-node cluster configured for VoltDB. Currently two applications are running and all of their traffic goes to a single node (only one server).
Since we have a 3-node cluster and the data is replicated across all the nodes, can I run one service against one node and the other service against another node? Is that possible?
Yes, as long as both these services use the same database, they can both point to different nodes in the cluster, and VoltDB will reroute the data to the proper partition accordingly.
However, it is recommended to connect applications to all of the nodes in a cluster, so they can send requests to the cluster more evenly. Depending on which client is being used, there are optimizations that send each request to the optimal server based on which partition is involved. This is often called "client affinity". Clients can also simply send to each node in a round-robin style. Both client affinity and round-robin are much more efficient than simply sending all traffic to 1 node.
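To make the round-robin idea concrete, here is a minimal sketch in plain Java. It is not VoltDB client code and uses no VoltDB API; the class RoundRobinNodes and the host names are hypothetical. It only shows how a caller could cycle through a fixed list of nodes so that no single node receives all the traffic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinNodes {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinNodes(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes); // defensive copy
    }

    // Returns the node the next request should be sent to.
    // floorMod keeps the index valid even after the counter overflows.
    public String pick() {
        int index = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical host names for a three-node cluster.
        RoundRobinNodes selector =
                new RoundRobinNodes(Arrays.asList("node1", "node2", "node3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(selector.pick());
        }
    }
}
```

Client affinity goes one step further than this: instead of cycling blindly, the client picks the node that owns the partition a request touches.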
Also, be cautious of running applications on the same hosts as VoltDB nodes, because they could unpredictably starve the VoltDB process of resources that it needs. However, for applications that behave well and on servers where there are adequate resources, they can be co-located and many VoltDB customers do this.
Full Disclosure: I work at VoltDB.

Is it possible to isolate spark cluster nodes for each individual application

We have a Spark cluster comprising 16 nodes. Is it possible to limit nodes 1 & 2 to application 'A'; nodes 3, 4, 5 to application 'B'; nodes 10, 11, 12, 15 to application 'C'; and so on?
From the documentation, I understand that we can set some properties to control spark executor cores, number of executors to be launched, memories etc. But, I am curious to know if I can achieve the above use case.
One obvious way to do that is to configure 3 different clusters with the desired topology. Otherwise you're out of luck: Spark has no provision for this, because it is usually a bad idea and generally against the design principles of Spark and of clustering in general. Why? If you assign application A to specific hosts and it sits idle while application B is running at 100%, you have 2 idle hosts that could be working for B, so you are wasting costly computing resources. Usually, what you want is to assign a certain amount of resources per application and let the system decide how to allocate them (scheduling in plain Spark is pretty elementary, but running under YARN or Mesos you can be more sophisticated).
Another reason why it's a bad idea is that you don't want rules that name a specific host or set of hosts. What if you assign nodes 1 & 2 to application A and they both go down? Besides not using your resources efficiently, tying your apps to specific hosts also makes it difficult to make them resilient to failure by rescheduling them on other hosts.
You may have other ways to do something similar, though. If you're running Spark under YARN or Mesos, you can define queues or quotas and limit the amount of resources that each application can use at a given time.
In general, it depends on why you want to statically allocate resources to applications. If it's for resource management, you should instead look at schedulers and queues. If it's for security, you should have multiple clusters, keeping in mind that you'd be losing performance.

Hazelcast map-reduce uses only one CPU core

I run a map-reduce job on a single-node Hazelcast cluster and it consumes only about one CPU (120-130%). I can't find how to configure Hazelcast to use all of the available CPU. Is it possible at all?
EDIT:
While Hazelcast does not support in-node parallelism, a competing open-source in-memory data grid (IMDG) solution does: Infinispan. See this article to learn more about that.
The current implementation of mapping and reducing is single-threaded. Hazelcast is not meant to run as a single-node environment, and the map-reduce framework is designed to support scale-out, not to exhaust the whole CPU. You can start multiple nodes on your machine to parallelize processing and utilize the CPU that way, but it seems to me that you might be using Hazelcast for a problem that it is not meant to solve. Can you elaborate on your use case?

Which class in Storm instantiates the threads for each bolt and spout?

I need to know how Storm manages the number of parallel workers for each bolt. Neither IRichBolt nor IRichSpout implements Runnable. How does Storm manage multithreading?
It's rather too broad to discuss fully, but here's what I can share. Very briefly, spouts and bolts in Storm are the components that actually process the data. In Storm terminology their running instances are known as tasks (so a parent interface such as IRichSpout does not need to implement something like Runnable). The threads responsible for carrying out these tasks are called executors. From the doc:
in Storm’s terminology "parallelism" is specifically used to describe the so-called parallelism hint, which means the initial number of executor (threads) of a component (spout or bolt)
These executors (threads) are in turn spawned by the worker process. From the doc:
A worker process executes a subset of a topology. A worker process belongs to a specific topology and may run one or more executors for one or more components (spouts or bolts) of this topology
A machine in a Storm cluster may run one or more such worker processes for one or more topologies, and each process runs executors for a specific topology (you can even change the number of executors at run time using the Storm rebalancing mechanism).
For internal communication within these worker processes, Storm uses various message queues backed by the LMAX Disruptor. The workers maintain their own threads, such as a receiver thread and a sender thread, for managing incoming and outgoing messages.
You can take a look at this doc page for a better overview, and at this very nice article explaining how Storm handles parallelism. This might help you dig further and share your findings :)
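The task / executor / worker relationship described above can be modeled in plain Java. This is a simplified sketch and not Storm's implementation: the names WorkerModel, Task, CountingTask, and ExecutorThread are hypothetical, and the index-based dispatch stands in for Storm's stream groupings. It shows why a component does not need to implement Runnable: the executor is the Runnable, and it calls into the tasks it owns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkerModel {

    // A "task": a plain component instance. Note that it is not a Runnable.
    public interface Task {
        void process(String tuple);
    }

    // Hypothetical task that only counts the tuples it receives.
    public static class CountingTask implements Task {
        final AtomicInteger seen = new AtomicInteger();

        @Override
        public void process(String tuple) {
            seen.incrementAndGet();
        }
    }

    // An "executor": the thread body that owns one or more tasks and drives them.
    public static class ExecutorThread implements Runnable {
        private final List<Task> tasks;
        private final List<String> tuples;

        ExecutorThread(List<Task> tasks, List<String> tuples) {
            this.tasks = tasks;
            this.tuples = tuples;
        }

        @Override
        public void run() {
            // Simplification: hand each tuple to exactly one owned task, by index.
            for (int i = 0; i < tuples.size(); i++) {
                tasks.get(i % tasks.size()).process(tuples.get(i));
            }
        }
    }

    // A "worker": one process (this JVM) hosting one thread per executor.
    // Returns the total number of tuples processed by all tasks.
    public static int runWorker(int executorCount, int tasksPerExecutor, List<String> tuples)
            throws InterruptedException {
        ExecutorService threads = Executors.newFixedThreadPool(executorCount);
        List<CountingTask> allTasks = new ArrayList<>();
        for (int e = 0; e < executorCount; e++) {
            List<Task> owned = new ArrayList<>();
            for (int t = 0; t < tasksPerExecutor; t++) {
                CountingTask task = new CountingTask();
                owned.add(task);
                allTasks.add(task);
            }
            threads.execute(new ExecutorThread(owned, tuples));
        }
        threads.shutdown();
        threads.awaitTermination(10, TimeUnit.SECONDS);

        int total = 0;
        for (CountingTask task : allTasks) {
            total += task.seen.get();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        // 2 executors with 2 tasks each; every executor receives the same 3 tuples.
        System.out.println(runWorker(2, 2, Arrays.asList("a", "b", "c"))); // prints 6
    }
}
```

In this model the parallelism hint corresponds to executorCount, and the number of tasks per component can be larger than the number of threads that run them.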

Presto configuration

As I set up a cluster of Presto and try to do some performance tuning, I wonder if there's a more comprehensive configuration guide of Presto, e.g. how can I control how many CPU cores a Presto worker can use. And is it good practice if I start multiple presto workers on a single server (in which case I don't need a dedicated server to run the coordinator)?
Besides, I don't quite understand the task.max-memory argument. Will the presto worker start multiple tasks for a single query? If yes, maybe I can use task.max-memory together with the -Xmx JVM argument to control the level of parallelism?
Thanks in advance.
Presto is a multithreaded Java program and works hard to use all available CPU resources when processing a query (assuming the input table is large enough to warrant such parallelism). You can artificially constrain the amount of CPU resources that Presto uses at the operating system level using cgroups, CPU affinity, etc.
There is no reason or benefit to starting multiple Presto workers on a single machine. You should not do this because they will needlessly compete with each other for resources and likely perform worse than a single process would.
We use a dedicated coordinator in our deployments that have 50+ machines because we found that having the coordinator process queries would slow it down while it performs the query coordination work, which has a negative impact on overall query performance. For small clusters, dedicating a machine to coordination is likely a waste of resources. You'll need to run some experiments with your own cluster setup and workload to determine which way is best for your environment.
You can have a single Presto process act as both a coordinator and worker, which can be useful for tiny clusters or testing purposes. To do so, add this to the etc/config.properties file:
coordinator=true
node-scheduler.include-coordinator=true
Your idea of starting a dedicated coordinator process on a machine shared with a worker process is interesting. For example, on a machine with 16 processors, you could use cgroups or CPU affinity to dedicate 2 cores to the coordinator process and restrict the worker process to 14 cores. We have never tried this, but it could be a good option for small clusters.
A task is a stage in a query plan that runs on a worker (the CLI shows the list of stages while the query is running). For a query like SELECT COUNT(*) FROM t, there will be a task on every worker that performs the table scan and partial aggregation, and another task on a single worker for the final aggregation. More complex queries that have joins, subqueries, etc., can result in multiple tasks on every worker node for a single query.
-Xmx must be higher than task.max-memory, or at least equal to it. Otherwise you are likely to see OOM errors, as I have experienced before.
Also, since Presto 0.113 the way Presto manages query memory, and the corresponding configuration properties, have changed.
Please refer to this link:
https://prestodb.io/docs/current/installation/deployment.html
As for your question about how many CPU cores a Presto worker can use, I think that is controlled by the parameter task.concurrency, which is 16 by default.
