
Note to moderators: this question has not been answered in another post yet, because I want to use spherical coordinates.

In an optimization algorithm, I have a categorical variable with N+1 categories. To represent these categories, I want to manage only N parameters, because the optimization process is far more efficient with fewer parameters to optimize.

So let's imagine that this variable is represented by a point in $\mathbb{R}^{N+1}$, where each axis corresponds to a category: the point lies on the unit N-sphere, and its distance to each category determines which category the variable is equal to. Using N coordinates instead of N+1 means using spherical coordinates instead of Cartesian ones.

The optimization algorithm manages each of the N parameters in [0, 1]. But using these directly as spherical coordinates (multiplying them by $\frac{\pi}2$) doesn't work, because the parameters do not all have the same weight on the displacement of the point on the N-sphere. I need a given change in any parameter to always induce the same amount of displacement of the point on the N-sphere. A corollary is that if I sample these N parameters uniformly at random, I must obtain uniformly distributed points on the N-sphere.

As a first attempt, I uniformly sampled N parameters in [0, 1]: the 1st to (N-1)th were the cosines of the polar angles, and the last one was the normalized azimuthal angle.

Then I calculated the distance between each point and each category:

  • when N=1 (circle), the average distance to the 1st and 2nd categories was 0.745, so the distribution is uniform;
  • when N=2 (sphere), the average distance to the 1st, 2nd and 3rd categories was 0.943, so the distribution is uniform. BUT:
  • starting from N=3, points seemed to be closer to the first category (so the distribution is NOT uniform anymore): the average distance to the 1st category was 0.943, while it was 1.067 for the last 3 categories;
  • as N increases, the non-uniformity becomes worse and worse.

Here is minimal Python code to reproduce my issue:

import math
import numpy as np

def spherical_to_cartesian(spherical_coords):
    angles = [math.acos(spherical_coords[i]) for i in range(len(spherical_coords)-1)] + [spherical_coords[-1]*math.pi/2]

    N = len(angles) + 1
    cartesian_coords = np.zeros(N)
    sin_cumul = 1

    for i in range(0, N-1):
        cartesian_coords[i] = np.cos(angles[i]) * sin_cumul
        sin_cumul *= np.sin(angles[i])

    cartesian_coords[-1] = sin_cumul
    return cartesian_coords


def distance_to_categories(angles):
    coords = spherical_to_cartesian(angles)
    dist = []

    vertex = len(angles)+1
    for a in range(vertex):
        category = np.zeros(vertex)
        category[a] = 1
        dist.append(math.dist(coords, category))

    return np.array(dist)


points = 100_000
N = 20
dist_to_categories = np.ndarray(shape=(points, N+1))

for i in range(points):
    angles = []
    for j in range(N):
        angles.append(np.random.uniform(0, 1))
    dist_to_categories[i] = distance_to_categories(angles)

mean = [np.mean(dist_to_categories[:,i]) for i in range(dist_to_categories.shape[1])]
print(mean)

So the question is: how do we sample spherical coordinates so that the points are uniformly distributed on the N-sphere?

(Please note that I only want to consider points on the N-sphere whose Cartesian coordinates are all non-negative.)

Baxlan
  • It works for $N=1$ because you never take the arccosine, and works for $N=2$ because the arccosine is the correct function for the second spherical coordinate (related to Archimedes' result on a sphere and its enclosing cylinder), but does not work for $N>2$ because the arccosine is then the wrong function. – Henry Jan 30 '25 at 15:21
  • It might be easier to select $N+1$ values $x_0,x_1,\ldots x_N$ iid $N(0,1)$ and then say $\phi_i= \operatorname{atan2}\left(\sqrt{\sum\limits_{j=i+1}^{N} x_j^2}, x_i\right)$ – Henry Jan 30 '25 at 15:22
  • @Henry Thanks for your comments. The goal is to correctly convert spherical to Cartesian coordinates, because these spherical coordinates are actually variables in a larger optimization problem, and I need the smallest possible number of parameters to ease the optimization process. – Baxlan Jan 30 '25 at 15:32
  • @Henry I edited the OP to clarify the question – Baxlan Feb 05 '25 at 13:17
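
Henry's second comment can be sketched in a few lines. This is only an illustration: the function name `gaussian_to_angles` and the seed are assumptions, not something from the thread. Since the first argument of `atan2` is non-negative, every angle lands in $[0, \pi]$, so the sign of the last Cartesian coordinate is not recovered (which does not matter when only non-negative coordinates are of interest).

```python
import numpy as np

def gaussian_to_angles(x):
    """Angles phi_0 .. phi_{N-1} of a point x = (x_0, ..., x_N) in R^(N+1).

    phi_i = atan2(sqrt(sum_{j>i} x_j^2), x_i), as in the comment above.
    """
    x = np.asarray(x, dtype=float)
    # tail[i] = sqrt(x_{i+1}^2 + ... + x_N^2)
    tail = np.sqrt(np.cumsum(x[::-1] ** 2)[::-1][1:])
    return np.arctan2(tail, x[:-1])

rng = np.random.default_rng(0)   # arbitrary seed
N = 5
x = rng.standard_normal(N + 1)   # N+1 iid N(0, 1) values
phi = gaussian_to_angles(x)      # N angles, each in [0, pi]
print(phi)
```

This goes from Cartesian samples to angles, which is handy as a reference distribution, but it does not by itself reduce the number of free parameters to N.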

1 Answer


OK, I found out (here and here) that each Cartesian coordinate $x_n$ of a uniformly distributed point on the N-sphere in $\mathbb{R}^{N+1}$, taken on its own, follows a beta distribution: $$\frac{x_n+1}2 \sim B\left(\frac{N}{2}, \frac{N}{2}\right)$$
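
As a quick numerical sanity check of this marginal law (a sketch, not part of the original derivation), one can draw uniform points by normalising Gaussian vectors and compare one coordinate with the beta distribution using a Kolmogorov-Smirnov test. The values of `N`, `n_points` and the seed are arbitrary choices.

```python
import numpy as np
from scipy.stats import beta, kstest

rng = np.random.default_rng(0)       # arbitrary seed
N = 4                                # N-sphere embedded in R^(N+1)
n_points = 20_000

# Uniform points on the N-sphere: normalise iid standard normal vectors.
g = rng.standard_normal((n_points, N + 1))
x = g / np.linalg.norm(g, axis=1, keepdims=True)

# Compare (x_n + 1) / 2 with Beta(N/2, N/2) for the first coordinate.
result = kstest((x[:, 0] + 1) / 2, beta(N / 2, N / 2).cdf)
print(result.statistic, result.pvalue)
```

A small KS statistic is what one would expect if the claim holds; by symmetry the same check can be run on any other column of `x`.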

Nevertheless, the coordinates are actually dependent because: $$\sum_i{x_i^2} = 1$$ So once the first coordinate $x_1$ has been sampled, the second one is sampled as if it were the first coordinate of an (N-1)-sphere whose radius is $\sin(\arccos(x_1)) = \sqrt{1-x_1^2}$, and so on. After $k$ coordinates the remaining radius is $\sqrt{1-x_1^2-\dots-x_k^2}$, which equals $\prod_{i \le k}\sin(\theta_i)$ in terms of the angles defined below:

$$\begin{align} & x_1 \sim 2B\left(\frac{N}2, \frac{N}2\right)-1 \\ & x_2 \sim \left[2B\left(\frac{N-1}2, \frac{N-1}2\right)-1\right] \sqrt{1-x_1^2} \\ & x_3 \sim \left[2B\left(\frac{N-2}2, \frac{N-2}2\right)-1\right] \sqrt{1-x_1^2-x_2^2} \\ & \dots \\ & x_{N} \sim \left[2B\left(\frac{1}2, \frac{1}2\right)-1\right] \sqrt{1-\sum_{i=1}^{N-1}{x_i^2}} \\ & x_{N+1} = \sqrt{1-\sum_{i=1}^{N}{x_i^2}} \end{align}$$

We know that: $$\begin{align} & x_1 = \cos(\theta_1) \\ & x_2 = \cos(\theta_2)\sin(\theta_1) \\ & \dots \\ & x_{N} = \cos(\theta_{N}) \prod^{N-1}_{i=1}{\sin(\theta_i)} \\ & x_{N+1} = \prod^{N}_{i=1}{\sin(\theta_i)} \end{align}$$

so we can show that $\forall i \in \{1,\dots,N\}$: $$\boxed{\theta_i \sim \arccos\left[2B\left(\frac{N-i+1}2, \frac{N-i+1}2\right)-1\right]}$$
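
For completeness, here is one way to see where the boxed law comes from (a sketch using the standard surface element of the unit N-sphere, with every angle taken in $[0, \pi]$ as the arccosine implies). The surface element factorises over the angles, $$dS = \sin^{N-1}(\theta_1)\,\sin^{N-2}(\theta_2)\cdots\sin(\theta_{N-1})\;d\theta_1 \cdots d\theta_N,$$ so for a uniform point the angles are independent, with densities $$p_i(\theta_i) \propto \sin^{N-i}(\theta_i), \qquad \theta_i \in [0, \pi].$$ Substituting $t = \frac{1+\cos(\theta_i)}2$, so that $\sin(\theta_i) = 2\sqrt{t(1-t)}$ and $|d\theta_i| = \frac{dt}{\sqrt{t(1-t)}}$, gives $$p(t) \propto \left[t(1-t)\right]^{\frac{N-i+1}2 - 1}, \qquad t \in [0, 1],$$ which is the density of $B\left(\frac{N-i+1}2, \frac{N-i+1}2\right)$. For $i = N$ this is $B\left(\frac12, \frac12\right)$, i.e. $\theta_N$ is uniform on $[0, \pi]$.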

And that's how we sample N spherical coordinates to be uniformly distributed on the N-sphere.

Thus, my function becomes:

import math
import numpy as np
from scipy.stats import beta

def spherical_to_cartesian(spherical_coords):
    # spherical_coords are not really spherical coordinates: they are uniformly
    # distributed variables in [0, 1] that must be converted to beta-distributed ones
    N = len(spherical_coords)
    # inverse-CDF (ppf) transform; abs() keeps every angle in [0, pi/2]
    angles = [math.acos(abs(2*beta.ppf(spherical_coords[i], (N-i)/2, (N-i)/2) - 1)) for i in range(N)]

    N += 1  # number of Cartesian coordinates
    cartesian_coords = np.zeros(N)
    sin_cumul = 1

    for i in range(0, N-1):
        cartesian_coords[i] = np.cos(angles[i]) * sin_cumul
        sin_cumul *= np.sin(angles[i])

    cartesian_coords[-1] = sin_cumul
    return cartesian_coords

With this conversion function, my points are uniformly distributed for all N. (The absolute value is there because, for my specific use, I am only interested in the part of the N-sphere where all the Cartesian coordinates are non-negative.)
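
One way to check this claim numerically is to repeat the mean-distance test from the question with the beta-based conversion. The sketch below is a vectorised variant written for illustration: the name `uniform_to_orthant_points`, the sizes and the seed are assumptions, but the mapping is the same one (beta quantile, absolute value, cumulative product of sines).

```python
import numpy as np
from scipy.stats import beta

def uniform_to_orthant_points(u):
    """Map rows of u in [0, 1]^N to points on the non-negative part of the N-sphere."""
    u = np.atleast_2d(np.asarray(u, dtype=float))    # shape (n_points, N)
    N = u.shape[1]
    a = (N - np.arange(N)) / 2                       # Beta parameters N/2, ..., 1/2
    cosines = np.abs(2 * beta.ppf(u, a, a) - 1)      # cos(theta_i), folded into [0, 1]
    sines = np.sqrt(1 - cosines ** 2)
    ones = np.ones((u.shape[0], 1))
    # prefix[:, i] = product of the first i sines (empty product = 1)
    prefix = np.hstack([ones, np.cumprod(sines, axis=1)])
    return np.hstack([cosines, ones]) * prefix

rng = np.random.default_rng(0)                       # arbitrary seed
N, n_points = 5, 10_000
pts = uniform_to_orthant_points(rng.uniform(size=(n_points, N)))

# Mean Euclidean distance from the samples to each of the N+1 category vertices.
categories = np.eye(N + 1)
mean_dist = np.linalg.norm(pts[:, None, :] - categories[None, :, :], axis=2).mean(axis=0)
print(mean_dist)
```

If the points are uniform on that part of the sphere, the N+1 printed means should agree up to sampling noise.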

(Note that it works perfectly when sampling spherical coordinates, but not when directly sampling Cartesian coordinates, and I don't know why. When doing so, the points seem to be closer to the last unit vector than to the other ones.)
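
The code for that direct Cartesian attempt is not shown, so the following is only a guess at the cause. After $x_1,\dots,x_k$ have been drawn, the radius left for the next coordinate is $\sqrt{1-x_1^2-\dots-x_k^2}$. For $k \ge 2$ this is smaller than the product $\prod_{i \le k}\sqrt{1-x_i^2}$, so using that product would inflate the later coordinates. A sketch of sequential Cartesian sampling with the exact remaining radius (function name, sizes and seed are illustrative assumptions):

```python
import numpy as np
from scipy.stats import beta

def sample_cartesian_sequentially(N, n_points, rng):
    """Draw uniform points on the N-sphere one Cartesian coordinate at a time."""
    x = np.zeros((n_points, N + 1))
    remaining = np.ones(n_points)            # 1 - (x_1^2 + ... + x_k^2) so far
    for k in range(N):
        a = (N - k) / 2                      # Beta parameters N/2, ..., 1/2
        b = beta.rvs(a, a, size=n_points, random_state=rng)
        x[:, k] = (2 * b - 1) * np.sqrt(remaining)
        # clip guards against tiny negative values from rounding
        remaining = np.clip(remaining - x[:, k] ** 2, 0.0, None)
    # The last coordinate is fixed up to sign by the unit-norm constraint;
    # a random sign covers the whole sphere (use +1 only for non-negative values).
    x[:, N] = rng.choice([-1.0, 1.0], size=n_points) * np.sqrt(remaining)
    return x

rng = np.random.default_rng(0)               # arbitrary seed
N, n_points = 4, 20_000
pts = sample_cartesian_sequentially(N, n_points, rng)
print(pts.var(axis=0))                       # each should be close to 1 / (N + 1)
```

Taking absolute values of every column afterwards restricts the sample to the part of the sphere with non-negative coordinates.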

Baxlan