
I have a set of points randomly located around a center. I want to change the locations of the points by iteratively optimizing some objective function, so that they become scattered (the definition of "scattered" is vague; it could be that the distance between every pair is above some value).

Since there are too many points, I can't afford to compute the distance between every pair; I would prefer linear time complexity. The determinant of the covariance matrix seems fine, but its derivatives look a bit complex to implement.

What should be a good objective function?
I guess this is more or less the same as asking, how to measure the "scatteredness" of a set of points?

I hope I've explained my problem clearly, thanks in advance.


Update

Following user3658307's method, taking a subset $P_t$ and comparing it against the entire set $P$ will give good results in my case.

dontloo
  • What is your objective? What is wrong with just randomly placing them within some distance from the center? You seem to think two being close is not good. What is wrong with placing them on a hexagonal grid? That will maximize the distance between the points given the area. You seem to think that is not good, presumably because it is too regular. What is "random enough"? – Ross Millikan Jun 09 '17 at 03:36
  • @RossMillikan hi thanks for the reply, yes "two being close is not good"; a hexagonal grid would be ideal (points evenly distributed in an area), but actually the space can be unbounded, and I'd like to do it in an iterative, non-discrete optimization way so that it can be combined with my other objective functions. – dontloo Jun 09 '17 at 03:59
  • Why then not pick all the points on a hexagonal grid out to some maximum radius? No iteration needed. Pick the radius using whatever random distribution you want, then compute the proper spacing to give the proper number of points. When you can avoid iteration, or avoid iterating over certain variables, that is usually a good thing. – Ross Millikan Jun 09 '17 at 04:02
  • @RossMillikan yeah, it's a bit complex: actually the locations of the points should obey some order (some points should be relatively closer than others), but the order is defined via another objective function, so I cannot place them on a hexagonal grid in the right order in advance; that's why I'm thinking of combining two objective functions. – dontloo Jun 09 '17 at 04:11
  • Then it is hard to give advice. Maybe start with a grid and perturb the points randomly over some small radius. If you can't say what you are looking for, how can we help find it? – Ross Millikan Jun 09 '17 at 04:30
  • And what about the standard deviation? This is the measure you are looking for... The greater the covariance, the greater the dispersion. That is how one measures "scatteredness". – Brethlosze Jun 09 '17 at 05:43
  • @hyprfrco hi thanks, I've tried that; if I remember correctly, it tended to push the points in two opposite directions in order to maximize the variance. – dontloo Jun 09 '17 at 06:05
  • @RossMillikan as mentioned in the question, I'm looking for a continuous function $f(P)$ that reaches its optimum when the distance between every pair of points in $P$ is above some value; or, if it helps to simplify the solution, this condition can be changed as long as the "scatteredness" is guaranteed. I'm probably not interested in discrete optimization methods since I want to use it together with another objective function. – dontloo Jun 09 '17 at 06:19
  • No, you don't want any discrete optimization. I insist you need some probability density, not necessarily a single-cluster one: just draw a distribution you feel comfortable with, and then compute the fitness against it. This is cheaper than fitting point by point. – Brethlosze Jun 09 '17 at 06:41
  • I agree with the other commenters that the easiest way to scatter the points is just to use a random distribution (e.g. a multidimensional Gaussian). However, if you want a (cheap, differentiable) energy to minimize, one idea is $E(P)=\sum_{i\in P_s} \sum_{j\in P_s} ||p_i - p_j||_k^k$, where $P_s\subset P$ is a random subset of $P$, changing every iteration. This is essentially the same as stochastic gradient descent in neural networks. The expected value of the gradient is the true gradient, btw. Also, using high $k$ penalizes larger distances more. – user3658307 Jun 09 '17 at 22:06
  • @user3658307 thank you, I think an energy function is exactly what I'm looking for, but computing the distance between every pair is still not affordable in my case; do you happen to know other, even cheaper energy functions or ways to approximate this? – dontloo Jun 12 '17 at 02:40
  • 1
    @dontloo Er, yes, like I say in my comment just above, you can use a random subset of pairwise distances as a (stochastic) energy function. $|P_S| << |P|$, in other words. This is how deep neural networks minimize essentially the same energies, but with tens of millions of input points. – user3658307 Jun 12 '17 at 03:12
  • @user3658307 ah I missed that, I got it, thank you lots – dontloo Jun 12 '17 at 03:26

1 Answer


Just to expand my comment into an answer :)

As noted by others, the easiest way to scatter points about an origin is to use some random distribution centered there, and then "move" the points by drawing new ones from that distribution.
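
For concreteness, here is a minimal sketch of that resampling approach, assuming 2-D points and a Gaussian distribution; the names `center`, `sigma`, and `n` are illustrative, not prescribed:

```python
import numpy as np

# "Move" the points by simply redrawing them from a Gaussian around the center.
rng = np.random.default_rng(0)
center, sigma, n = np.zeros(2), 5.0, 10_000   # illustrative values
P = rng.normal(loc=center, scale=sigma, size=(n, 2))
```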

However, it can be useful to have an energy function to optimize instead (it lets you add other terms, for example). The obvious choice is the sum of squared pairwise distances, but this can be expensive. An alternative employed elsewhere (e.g. in machine learning) is to use a random subset to compute the gradient, with the subset changing at every step of the descent.

More exactly, given a point set $P$, at every step $t$ during an iterative optimization procedure, choose a random subset $P_t\subset P$ such that $|P_t|\ll|P|$ and maximize: $$ E_t(P_t) = \sum_{p_i\in P_t}\sum_{p_j\in P_t} || p_i - p_j ||_k^k $$ where $k$ denotes the order of the Minkowski metric. Often $k=2$ is fine, though varying $k$ can be useful if you want to weight outliers more or less. This energy is also nice because it is differentiable. You could also maximize the determinant or a matrix norm of the covariance matrix of $P_t$.

Also, although this seems like a hack, note that, in probabilistic terms, this approach is expected to work on average: the expected value of the stochastic gradient is the true gradient. See here.
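
As a concrete (but hypothetical) illustration of the stochastic ascent on $E_t$, here is a sketch in Python/NumPy assuming 2-D points, $k=2$, and plain gradient ascent with a fixed step size; `subset_size` and `step` are made-up names:

```python
import numpy as np

def scatter_step(P, subset_size=64, step=1e-3, rng=None):
    """One ascent step on E_t(P_t) = sum_{i,j in P_t} ||p_i - p_j||^2."""
    if rng is None:
        rng = np.random.default_rng()
    idx = rng.choice(len(P), size=subset_size, replace=False)
    Q = P[idx]                                   # the random subset P_t
    # For k = 2: dE_t/dp_i = 4 * sum_j (p_i - p_j)
    #                      = 4 * (|P_t| * p_i - sum_j p_j)
    grad = 4.0 * (subset_size * Q - Q.sum(axis=0))
    P[idx] += step * grad                        # ascend: spread P_t apart
    return P

rng = np.random.default_rng(0)
P = rng.normal(size=(10_000, 2))                 # points around a center
for t in range(1_000):
    P = scatter_step(P, rng=rng)
```

Each step touches only the $|P_t|$ sampled points, so the per-iteration cost is independent of $|P|$.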


Another idea, using derivative-free optimization: let $$ \mathcal{E}_t(P_t) = \min_{\substack{p,q\in P_t \\ p\neq q}} ||p - q||_k^k + \alpha\,f(P_t) $$ be your energy function at time $t$, where $f$ is the other energy term you want to add and $\alpha\in\mathbb{R}$ controls its weight. Then maximize it with a gradient-less optimizer like differential evolution or cross-entropy (CE) optimization. Notice you still change the random subset $P_t$ each time.

Notice this is a little closer to your original desire of simply ensuring that the points are farther apart than some threshold, rather than continually increasing the average distance.
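
A rough sketch of this derivative-free variant, using SciPy's `differential_evolution` as the gradient-less optimizer. It assumes 2-D points, $k=2$, a stub $f\equiv 0$, and a search box of side $2R$ per coordinate, all of which are assumptions rather than anything prescribed above (for simplicity it also re-places the subset inside the box each step instead of warm-starting from the current positions):

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
P = rng.normal(size=(1_000, 2))
R, subset_size, alpha = 10.0, 8, 0.0             # illustrative values

def f(Q):
    return 0.0                                   # stub for the extra energy term

def neg_energy(x):
    Q = x.reshape(-1, 2)                         # candidate positions for P_t
    diff = Q[:, None, :] - Q[None, :, :]         # pairwise differences
    dist2 = (diff ** 2).sum(axis=-1)             # squared distances (k = 2)
    np.fill_diagonal(dist2, np.inf)              # exclude p = q from the min
    return -(dist2.min() + alpha * f(Q))         # negate: DE minimizes

for t in range(100):
    idx = rng.choice(len(P), size=subset_size, replace=False)
    bounds = [(-R, R)] * (2 * subset_size)       # box per coordinate
    res = differential_evolution(neg_energy, bounds, maxiter=20, seed=t)
    P[idx] = res.x.reshape(-1, 2)                # write the optimized subset back
```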

user3658307