I have developed an application using the SMACK stack and am currently working on a research project to automatically distribute resources within the cluster (we use DC/OS).
In order to scale Kafka or Cassandra up or down, the Marathon framework does its job of easily launching more or fewer instances.
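For context, scaling a Marathon app boils down to a single REST call that updates the instance count. The snippet below is only a minimal sketch of that call; the Marathon endpoint `http://marathon.mesos:8080` and the app id `/cassandra` are placeholder assumptions, not my actual deployment:

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

// Minimal sketch: scale a Marathon app by PUTting a new "instances" value
// to /v2/apps/{appId}. Endpoint and app id are placeholders for illustration.
object MarathonScaler {
  def scale(marathonUrl: String, appId: String, instances: Int): Int = {
    val url  = new URL(s"$marathonUrl/v2/apps$appId")
    val conn = url.openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("PUT")
    conn.setRequestProperty("Content-Type", "application/json")
    conn.setDoOutput(true)
    val body = s"""{"instances": $instances}"""
    val out  = conn.getOutputStream
    try out.write(body.getBytes(StandardCharsets.UTF_8)) finally out.close()
    conn.getResponseCode // 200 on success; Marathon returns a deployment id in the body
  }

  def main(args: Array[String]): Unit = {
    // e.g. grow the (hypothetical) Cassandra app to 5 instances
    println(MarathonScaler.scale("http://marathon.mesos:8080", "/cassandra", 5))
  }
}
```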
Now I want to do this "at-runtime-scaling" for Apache Spark.
The configuration option spark.cores.max, described in the Spark Config Manual, works well for setting an initial limit on how many CPU cores the application may request:
> When running on a standalone deploy cluster or a Mesos cluster in "coarse-grained" sharing mode, the maximum amount of CPU cores to request for the application from across the cluster (not from each machine). If not set, the default will be spark.deploy.defaultCores on Spark's standalone cluster manager, or infinite (all available cores) on Mesos.
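Roughly, this limit gets pinned at the moment the SparkContext/SparkSession is created (or via `--conf` on spark-submit). The following is just a sketch of that; the app name, master URL and the value of 8 cores are placeholders:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// spark.cores.max is read once at startup; from then on the scheduler
// never requests more than 8 cores across the whole Mesos cluster.
val conf = new SparkConf()
  .setAppName("smack-analytics")           // placeholder app name
  .setMaster("mesos://leader.mesos:5050")  // placeholder Mesos master URL
  .set("spark.cores.max", "8")             // the initial, fixed core limit

val spark = SparkSession.builder().config(conf).getOrCreate()
```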
Spark already launches new executors automatically, but only as long as spark.cores.max is not exceeded (leaving RAM aside for the moment).
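As far as I can tell, the cap is only read at startup: the driver can inspect it, but changing the driver-side conf afterwards does not influence the scheduler. A small sketch, reusing the `spark` session from above:

```scala
// Reusing the `spark` session from the previous sketch: the cap can be read
// back, but (as far as I can tell) mutating the conf here changes nothing.
val cap = spark.sparkContext.getConf.getOption("spark.cores.max")
println(s"effective core cap: ${cap.getOrElse("unset -> all available cores on Mesos")}")
```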
My question now is: how can I raise this limit at runtime (i.e. allow Spark to request more CPUs from the cluster) without having to re-deploy the application with an updated configuration?