2

I have a sample space of all the undirected graphs having n vertices and m edges. Then what is the expected value of the size of the largest connected component?

My original problem is that using the G(n, p) model (family of graphs having n vertices and any two vertices would be joined by an edge with probability p. See: Erdős–Rényi model), I need to calculate the expected size of the largest connected component in G(n, p). If the above problem can be solved then this original problem can be easily solved.

PS: I need to exactly calculate the value of the expectation for any fixed n and p given in the input and not just upper or lower bounds.

thestyx
  • 31

1 Answers1

1

I am answering my own question as I have solved the original problem.
The solution is extremely computationally intensive.

Let S be our required expected size of the largest connected component in G(n, p). and let $P_i^n$ be the probability that a graph of n vertices formed from G(n, p) have the largest connected component(s) of size $i$. then

$$S = \sum_{i = 1}^{i = n}iP_i^n$$

Now, for the calculation for $P_i^n$, first choose a set of $i$ vertices and define the below terms

  • $f_i$ = the probability that a specific set of already chosen $i$ vertices is connected
  • $g_i^n$ = the probability that no vertex from the set of $i$ chosen vertices make an edge to other remaining $(n-i)$ vertices.

Then $P_i^n$ would be $$P_i^n = \binom{n}{i} f_i g_i^n \sum_{j = 1}^{j = i} P_j^{n - i} $$

This gives a recursive relation for $P_i^n$ (Note that in the summation term would go for $j = i$ and not for $j = i - 1$ as there can be more than one connected component of size i can be possible.

Calculation for $g_i^n$
$g_i$ can be easily calculated as the probability of not forming an edge between two vertices is $1 - p$ and there are total $i * (n - i)$ possible edges which shouldn't be formed, thus $$g_i^n = (1 - p)^{i(n - i)}$$

Calculation for $f_i$
The calculation of $f_i$ is described here: Exact probability of random graph being connected
$$f_i = 1 - \sum_{k = 1}^{i - 1}f_k \binom{i - 1}{k - 1} (1 - p)^{k(i - k)}$$

The calculation of $f_i$ is also recursive. Overall the exact calculation for the required expectation $S$ is much computationally intensive.

Practically, generating a random set of graphs from the G(n, p) model, and calculating the average size of the largest connected component using DFS/BFS would be much easier than the exact method.

thestyx
  • 31