I have developed an application using the SMACK stack and am currently working on a research project to automatically distribute resources within the cluster (we use DC/OS).
In order to scale Kafka or Cassandra up or down, the Marathon framework does its job of easily launching more or fewer instances.
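For context, scaling a Marathon app boils down to a single REST call that updates the instance count. The snippet below is only a minimal sketch of that call; the Marathon endpoint `http://marathon.mesos:8080` and the app id `/cassandra` are placeholder assumptions, not my actual deployment:

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

// Minimal sketch: scale a Marathon app by PUTting a new "instances" value
// to /v2/apps/{appId}. Endpoint and app id are placeholders for illustration.
object MarathonScaler {
  def scale(marathonUrl: String, appId: String, instances: Int): Int = {
    val url  = new URL(s"$marathonUrl/v2/apps$appId")
    val conn = url.openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("PUT")
    conn.setRequestProperty("Content-Type", "application/json")
    conn.setDoOutput(true)
    val body = s"""{"instances": $instances}"""
    val out  = conn.getOutputStream
    try out.write(body.getBytes(StandardCharsets.UTF_8)) finally out.close()
    conn.getResponseCode // 200 on success; Marathon returns a deployment id in the body
  }

  def main(args: Array[String]): Unit = {
    // e.g. grow the (hypothetical) Cassandra app to 5 instances
    println(MarathonScaler.scale("http://marathon.mesos:8080", "/cassandra", 5))
  }
}
```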
Now I want to do this "at-runtime-scaling" for Apache Spark.
The configuration option spark.cores.max, described in the Spark Config Manual, works well for setting an initial limit on how many CPU cores the application may request:
> When running on a standalone deploy cluster or a Mesos cluster in "coarse-grained" sharing mode, the maximum amount of CPU cores to request for the application from across the cluster (not from each machine). If not set, the default will be spark.deploy.defaultCores on Spark's standalone cluster manager, or infinite (all available cores) on Mesos.
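Roughly, this limit gets pinned at the moment the SparkContext/SparkSession is created (or via `--conf` on spark-submit). The following is just a sketch of that; the app name, master URL and the value of 8 cores are placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// spark.cores.max is read once at startup; from then on the scheduler
// never requests more than 8 cores across the whole Mesos cluster.
val conf = new SparkConf()
  .setAppName("smack-analytics")           // placeholder app name
  .setMaster("mesos://leader.mesos:5050")  // placeholder Mesos master URL
  .set("spark.cores.max", "8")             // the initial, fixed core limit

val spark = SparkSession.builder().config(conf).getOrCreate()
```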
Spark already launches new executors automatically, but only as long as spark.cores.max is not exceeded (leaving RAM aside for the moment).
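As far as I can tell, the cap is only read at startup: the driver can inspect it, but changing the driver-side conf afterwards does not influence the scheduler. A small sketch, reusing the `spark` session from above:

```scala
// Reusing the `spark` session from the previous sketch: the cap can be read
// back, but (as far as I can tell) mutating the conf here changes nothing.
val cap = spark.sparkContext.getConf.getOption("spark.cores.max")
println(s"effective core cap: ${cap.getOrElse("unset -> all available cores on Mesos")}")
```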
My question now is: how can I raise this limit at runtime (i.e. allow Spark to request more CPUs from the cluster) without having to re-deploy the application with an updated configuration?