Presto configuration

As I set up a Presto cluster and try to do some performance tuning, I wonder if there is a more comprehensive configuration guide for Presto, e.g. how I can control how many CPU cores a Presto worker can use. Also, is it good practice to start multiple Presto workers on a single server (in which case I wouldn't need a dedicated server to run the coordinator)?
I also don't quite understand the task.max-memory property. Will a Presto worker start multiple tasks for a single query? If so, can I use task.max-memory together with the -Xmx JVM argument to control the level of parallelism?
Thanks in advance.

Presto is a multithreaded Java program and works hard to use all available CPU resources when processing a query (assuming the input table is large enough to warrant such parallelism). You can artificially constrain the amount of CPU resources that Presto uses at the operating system level using cgroups, CPU affinity, etc.
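For example, a rough sketch of the cgroups approach on Linux (untested; the cgroup name and core range are illustrative, using the libcgroup tools):
cgcreate -g cpuset:/presto-worker          # create a cpuset cgroup for the worker
cgset -r cpuset.cpus=0-13 presto-worker    # allow cores 0-13 only (illustrative range)
cgset -r cpuset.mems=0 presto-worker       # a cpuset also needs a memory node assigned
cgexec -g cpuset:presto-worker bin/launcher run    # start Presto inside the cgroup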
There is no reason or benefit to starting multiple Presto workers on a single machine. You should not do this because they will needlessly compete with each other for resources and likely perform worse than a single process would.
We use a dedicated coordinator in our deployments that have 50+ machines because we found that having the coordinator also process queries slowed down the query coordination work, which had a negative impact on overall query performance. For small clusters, dedicating a machine to coordination is likely a waste of resources. You'll need to run some experiments with your own cluster setup and workload to determine which way is best for your environment.
You can have a single Presto process act as both a coordinator and worker, which can be useful for tiny clusters or testing purposes. To do so, add this to the etc/config.properties file:
coordinator=true
node-scheduler.include-coordinator=true
Your idea of starting a dedicated coordinator process on a machine shared with a worker process is interesting. For example, on a machine with 16 processors, you could use cgroups or CPU affinity to dedicate 2 cores to the coordinator process and restrict the worker process to 14 cores. We have never tried this, but it could be a good option for small clusters.
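As a sketch of that idea (we haven't tested it either; the core ranges and the separate installation directories are illustrative), on Linux you could pin each JVM with taskset:
taskset -c 0,1 coordinator/bin/launcher start     # coordinator pinned to 2 cores
taskset -c 2-15 worker/bin/launcher start         # worker pinned to the remaining 14 cores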
A task is a stage in a query plan that runs on a worker (the CLI shows the list of stages while the query is running). For a query like SELECT COUNT(*) FROM t, there will be a task on every worker that performs the table scan and partial aggregation, and another task on a single worker for the final aggregation. More complex queries that have joins, subqueries, etc., can result in multiple tasks on every worker node for a single query.
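If you want to see how a particular query breaks down into stages, you can ask the CLI for the distributed plan, which prints one plan fragment per stage:
presto> EXPLAIN (TYPE DISTRIBUTED) SELECT count(*) FROM t;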

-Xmx must be higher than (or at least equal to) task.max-memory; otherwise you are likely to run into OOM issues, as I have experienced before.
Also, since Presto 0.113 the way Presto manages query memory, and the corresponding configuration properties, have changed. Please refer to this link:
https://prestodb.io/docs/current/installation/deployment.html
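As an illustrative sketch (the values are made up; size them for your own hardware, keeping the heap comfortably above the per-node memory limit):
etc/jvm.config:
-Xmx16G
etc/config.properties (pre-0.113):
task.max-memory=8GB
etc/config.properties (0.113 and later):
query.max-memory-per-node=8GB
query.max-memory=40GB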

Regarding your question about how many CPU cores a Presto worker can use: I think this is controlled by the task.concurrency parameter, which defaults to 16.
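For example, in etc/config.properties (16 is just the default, shown here for illustration):
task.concurrency=16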

Related

What is the proper setup for Spark with Cassandra?

After using and playing around with the Spark connector, I want to utilize it in the most efficient way for our batch processes.
Is the proper approach to set up a Spark worker on the same host where a Cassandra node runs? Does the Spark connector ensure data locality?
I am a bit concerned that a memory-intensive Spark worker will bring down the entire machine, and with it a Cassandra node. So I'm unsure whether I should place the workers on the Cassandra nodes or keep them separate (which means no data locality). What is the common approach, and why?
This depends on your particular use case. Some things to be aware of:
1) CPU sharing: while memory will not be shared between Spark and Cassandra (the heaps are separate), there is nothing stopping Spark executors from stealing time on C* CPU cores. This can lead to load and slowdowns in C* if the Spark process is very CPU intensive. If it isn't, this isn't much of a problem.
2) Network speed: if your network is very fast, there is much less value in locality than if you are on a slower network.
So you have to ask yourself: do you want a simpler setup (everything in one place), or a more complicated but more isolated one?
For instance, DataStax (the company I work for) ships Spark running colocated with Cassandra by default, but we also offer the option of running it separately. Most of our users colocate, possibly because of this default; those who don't usually separate the two to make scaling easier.
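If you do colocate, a rough sketch of capping Spark so that Cassandra keeps some headroom on a standalone cluster (the flag values are illustrative and my_app.jar is a placeholder):
spark-submit \
  --conf spark.cores.max=8 \
  --conf spark.executor.memory=8g \
  my_app.jar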

Is it possible to isolate Spark cluster nodes for each individual application?

We have a Spark cluster comprising 16 nodes. Is it possible to limit nodes 1 & 2 to application 'A'; nodes 3, 4, 5 to application 'B'; nodes 10, 11, 12, 15 to application 'C'; and so on?
From the documentation, I understand that we can set some properties to control Spark executor cores, the number of executors to be launched, memory, etc. But I am curious to know whether I can achieve the above use case.
One obvious way to do that is to configure three different clusters with the desired topology; otherwise you're out of luck, because Spark does not have any provision for this.
That is because it is usually a bad idea, and generally against the design principles of Spark and of clustering in general. Why? If you assign application A to specific hosts but it sits idle while application B is running at 100%, you have two idle hosts that could be working for B, so you are wasting costly computing resources. Usually, what you want is to assign a certain amount of resources per application and let the system decide how to allocate them through scheduling (plain Spark is pretty elementary here, but under YARN or Mesos you can be more sophisticated).
Another reason it's a bad idea is that you don't want rules that name a specific host or set of hosts. What if you assign nodes 1 & 2 to application A and they both go down? Besides not using your resources efficiently, tying your app to specific hosts also makes it difficult to keep it resilient to failure by rescheduling it on other hosts.
There are other ways to do something similar, though: if you're running Spark under YARN or Mesos, you can define queues or quotas and limit the amount of resources each application can use at a given time.
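For example (the queue name and resource numbers are illustrative, and app.jar is a placeholder; the queue itself is defined in your YARN scheduler configuration):
spark-submit \
  --master yarn \
  --queue teamA \
  --num-executors 10 \
  --executor-cores 2 \
  --executor-memory 4g \
  app.jar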
In general, it depends on why you want to statically allocate resources to applications. If it's for resource management, you should instead look at schedulers and queues. If it's for security, you should have multiple clusters, keeping in mind that you would lose some performance.

Do multiple Spark applications running on YARN have any impact on each other?

Do multiple Spark jobs running on YARN have any impact on each other?
For example, if the traffic on one streaming job increases too much, does it have any effect on the second job? Will it slow it down, or are there other consequences?
I have enough resources for both applications to run concurrently.
Yes, they do. Depending on how your scheduler is set up (static vs. dynamic allocation), they either share just the network bandwidth (important for shuffles) and disk throughput (important for shuffles and for reading or writing data locally or on HDFS), or, with dynamic allocation, also the memory and CPUs. Still, running your two jobs in parallel rather than sequentially is on average beneficial, because the network and disk resources are not in constant use. How much you gain depends mostly on the amount of shuffling your jobs need.
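For reference, a minimal sketch of the dynamic allocation settings mentioned above, in spark-defaults.conf (the executor cap is illustrative):
spark.dynamicAllocation.enabled       true
spark.shuffle.service.enabled         true
spark.dynamicAllocation.minExecutors  1
spark.dynamicAllocation.maxExecutors  20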

What are the benefits of running multiple Spark instances per node (physical machine)?

Is there any advantage to starting more than one Spark instance (master or worker) on a particular machine/node?
The Spark standalone documentation doesn't explicitly say anything about starting a cluster or multiple workers on the same node. It does seem to implicitly assume that one worker equals one node.
Their hardware provisioning page says:
Finally, note that the Java VM does not always behave well with more than 200 GB of RAM. If you purchase machines with more RAM than this, you can run multiple worker JVMs per node. In Spark’s standalone mode, you can set the number of workers per node with the SPARK_WORKER_INSTANCES variable in conf/spark-env.sh, and the number of cores per worker with SPARK_WORKER_CORES.
So aside from working with large amounts of memory or testing cluster configuration, is there any benefit to running more than one worker per node?
I think the obvious benefit is to improve resource utilization of the hardware on each box without losing performance. In terms of parallelism, one big executor with multiple cores seems to perform about the same as multiple executors with fewer cores.
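As a concrete sketch of the spark-env.sh approach the documentation describes (the numbers are illustrative, for a hypothetical 32-core, 256 GB box):
# conf/spark-env.sh
SPARK_WORKER_INSTANCES=2    # two worker JVMs per node
SPARK_WORKER_CORES=16       # cores per worker
SPARK_WORKER_MEMORY=110g    # memory per worker, leaving headroom for the OS and other daemons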

Hazelcast map-reduce uses only one CPU core

I run a map-reduce job on a single-node Hazelcast cluster and it consumes only about one CPU core (120-130%). I can't find a way to configure Hazelcast to use all available CPUs. Is that possible at all?
EDIT:
While Hazelcast does not support in-node parallelism, another competing open-source in-memory data grid (IMDG) solution, Infinispan, does. See this article to learn more about that.
The current implementation of mapping and reducing is single threaded. Hazelcast is not meant to run in a single-node environment, and the map-reduce framework is designed to support scale-out rather than exhausting the whole CPU of one machine. You can start multiple nodes on your machine to parallelize processing and utilize the CPU that way, but it seems to me that you might be using Hazelcast for a problem it is not meant to solve. Can you elaborate on your use case?
