4

I'm trying to optimize an expensive function for which I can choose sample points. The difficulty is that many function evaluations may be computed in parallel, taking varying amounts of time. I don't know which keywords to search for to find existing literature(/implementations).

So at a time, I might have already computed function values at 18 points, with 15 still being computed, and I want to start evaluating the function another point. Without the running jobs, I could make a model and find which next point might provide the most information. But now, I need to somehow tell the model that there are also 15 points for which I don't have values, but near which I don't want to evaluate the function.

Specifically, I'm looking for something with minimal assumptions and no gradient information. But I'll be happy if I just know the keyword(s) to search for. (I can come up with several hacks to that kind of work, but I wonder if there is any 'real' solution).


For illustration, say every column is a function evaluation point and every row is a timestep in the below ascii-figure. An X means the evaluation is still being computed and a * in the bottom row means it's done. A new evaluation starts as soon as an old one terminates, with others still busy.

XXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXX
 XXXXXXXXXXX XXXXX
 XXXXXXXXXXX XXXXX
 XXXXXXXXXXX XXXXX
 XXXXXXXXXXX X  XXX X
 XXXX XXXXXX X  XXX XX
 XXXX XXXXXX X  XXX  XX
 XXXX XXXXXX X  XXX  XX
 XXXX XXXXXX X  XXX  XX
 XXXX XXXXXX X  XXX  XX
 XXXX X XXXX X  XXX  XXX
 XXXX X XXXX X  X X  XXXX
 XXXX X XXXX X  X X  XXXX
  XXX X XXXX X  X X  XXXXX
  XXX X XXXX X  X X  XXXXX
  XXX X XXXX X  X X  XXXXX
  XXX X XXXX X  X X  XXXXX
  XXX X XX X X  X X  XXXXXX
  XXX   XX X X  X X  XXXXXXX
  XXX   XX X X  X X  XXXXXXX
  XXX   XX X X  X X  XXXXXXX
  XXX   XX   X  X    X XXXXXXXX
  XXX   XX   X  X    X XXXXXXXX
  XXX   XX   X  X    X XXXXXX XX
  XXX   XX   X  X    X XXXXXX XX
  XXX   XX   X       X XXXXXX XXX
  XX    XX   X       X XXXXXX XXXX
**  ****  *** ******* *      *

Each time an evaluation completes, how to choose the next point to evaluate, given several complete and several incomplete evaluations?

Mark
  • 213
  • 1
  • 6

2 Answers2

1

SMAC has a parallel version available.

I'm not sure it handles the single runs as you want, but it sounds strange to me if it don't. Give a try!

Vincenzo Lavorini
  • 1,794
  • 1
  • 11
  • 22
0

I would use a parallelized Gaussian process regression model. The next points to query in this framework are given by the posterior extrema or those of maximum expected improvement. Here's an implementation.

Emre
  • 10,541
  • 1
  • 31
  • 39