Does the dominant prime factor contribute about $62\%$ of the value of the logarithm of numbers?

Question

Let $n = p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}$ be the prime factorization of $n$, with primes $p_i < p_{i+1}$. We define the minor prime factor and the dominant prime factor of $n$ as the primes which make the smallest and the largest contributions, respectively, to the value of $n$. Clearly, the dominant and the minor prime factors of $n$ are not necessarily its largest and smallest prime factors. E.g. if $n = 2^8 3^6 5^2 7^2$ then the largest and the smallest prime factors of $n$ are $7$ and $2$ respectively, whereas its dominant and minor prime factors are $3$ and $5$ respectively, since $3^6 = 729$ is greater than $2^8 = 256$, $5^2=25$ and $7^2=49$, and $5^2 = 25$ is the smallest of the four.

Definition 1: The minor prime factor of $n$ is defined as the prime factor $p_m$ such that $p_m^{a_m} \le p_i^{a_i}$ for all $i$.

Definition 2: If a number has two or more distinct prime factors, then the dominant prime factor of $n$ is defined as the prime factor $p_d$ such that $p_d^{a_d} \ge p_i^{a_i}$ for all $i \ne d$.

Note: If a number has only one distinct prime factor then its minor and its dominant prime factors are the same i.e. $p_m = p_d$.
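The two definitions can be sketched in plain Python as follows. This is only an illustration: the helper names `prime_power_factors` and `minor_and_dominant` are made up for this sketch, and trial division is assumed to be fast enough for the small inputs shown.

```python
def prime_power_factors(n):
    """Return {p: p**a} for every prime power p**a exactly dividing n (n >= 2)."""
    powers = {}
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            powers[p] = q
        p += 1
    if n > 1:
        powers[n] = n  # whatever is left is a single prime
    return powers


def minor_and_dominant(n):
    """Return (p_m, p_d): the primes whose prime powers are smallest and largest."""
    powers = prime_power_factors(n)
    p_m = min(powers, key=powers.get)
    p_d = max(powers, key=powers.get)
    return p_m, p_d


# The example from the question: 2^8 * 3^6 * 5^2 * 7^2.
print(minor_and_dominant(2**8 * 3**6 * 5**2 * 7**2))  # (5, 3)
# A prime power has a single prime factor, so p_m = p_d.
print(minor_and_dominant(3**4))  # (3, 3)
```

Distinct primes never have equal prime powers, so `min` and `max` never have to break a tie.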

Since $\frac{a_1 \ln p_1}{\ln n} + \frac{a_2 \ln p_2}{\ln n} + \cdots + \frac{a_k \ln p_k}{\ln n} = 1$, we can say that each term in this sum is the contribution of the respective prime to the value of $\ln n$, and we can ask what, on average, the contribution of the minor/dominant prime to the logarithm of a number is.
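As a worked instance of this decomposition, take the example $n = 2^8 3^6 5^2 7^2$ from above (values rounded to three decimals):

$$\ln n = 8\ln 2 + 6\ln 3 + 2\ln 5 + 2\ln 7 \approx 5.545 + 6.592 + 3.219 + 3.892 = 19.248$$

$$\frac{8\ln 2}{\ln n} \approx 0.288,\quad \frac{6\ln 3}{\ln n} \approx 0.342,\quad \frac{2\ln 5}{\ln n} \approx 0.167,\quad \frac{2\ln 7}{\ln n} \approx 0.202$$

So the dominant prime $3$ contributes about $34\%$ and the minor prime $5$ about $17\%$ of $\ln n$; the four shares sum to $1$ up to rounding.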

Let $p_d$ be the dominant prime factor of $n$. Experimental data for all $n \le 4 \times 10^{8}$, and averages over several smaller intervals up to $10^{100}$, suggest that

$$ \lim_{x \to \infty}\frac{1}{x}\sum_{n = 2}^x \frac{a_d\ln p_d}{\ln n} \approx 0.62 $$

Conjecture: The dominant prime factor contributes approximately $62\%$ of the value of the logarithm of natural numbers.

Can this be proved or disproved?

Update 24 Feb 2023: Digging into the literature, I read about the Golomb-Dickman constant, whose value $0.6243299$ is suspiciously close to the observed mean contribution of the dominant prime factor. This constant is related to the largest prime factor (not the dominant prime factor) through the Dickman function.

SageMath Code:

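# Running sums of ln(p_d^a_d)/ln(n), the dominant prime power's share of ln(n).
# r_max1: a prime power n counts as contributing 1 (upper bound).
# r_max2: a prime power n counts as contributing 0 (lower bound).
n      = 2
step   = target = 10^6
r_max1 = r_max2 = 0
while True:
    x = prime_factors(n)
    max1 = 1
    max2 = 1
    for factor in x:
        # Largest power of this prime that divides n.
        check = 1
        while n % (check*factor) == 0:
            check = check*factor
        if check > max1:
            max1 = check
            max2 = check

    # n has a single distinct prime factor: exclude it from the lower bound.
    if len(x) == 1:
        max2 = 1

    l      = ln(n).n()
    lmax1  = ln(max1).n()
    lmax2  = ln(max2).n()
    r_max1 = r_max1 + lmax1/l
    r_max2 = r_max2 + lmax2/l

    # Report the running averages every `step` integers.
    if n == target:
        print(n, 'lower_bound:', r_max2/n, 'upper_bound:', r_max1/n, 'mean:', (r_max2/n + r_max1/n)/2)
        target = target + step
    n = n + 1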
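For readers without SageMath, the same two running averages can be sketched in plain Python. This is not the code used for the data in the question: it swaps Sage's `prime_factors` for a smallest-prime-factor sieve, the function name `dominant_share_bounds` is invented here, and it divides by the number of terms $x-1$ rather than by $x$.

```python
from math import isqrt, log


def dominant_share_bounds(x):
    """Average of ln(p_d^a_d)/ln(n) over 2 <= n <= x.

    Returns (lower, upper): a prime power n counts as 0 in `lower`
    and as 1 in `upper`, mirroring r_max2 and r_max1 in the loop above.
    """
    # spf[n] = smallest prime factor of n, filled in by a sieve.
    spf = list(range(x + 1))
    for p in range(2, isqrt(x) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, x + 1, p):
                if spf[m] == m:
                    spf[m] = p

    lower = upper = 0.0
    for n in range(2, x + 1):
        m, best = n, 1
        while m > 1:  # peel off one prime power at a time
            p, q = spf[m], 1
            while m % p == 0:
                m //= p
                q *= p
            best = max(best, q)
        share = log(best) / log(n)
        upper += share
        if best != n:  # n is not a prime power
            lower += share

    count = x - 1  # number of terms n = 2, ..., x
    return lower / count, upper / count


lo, hi = dominant_share_bounds(10**5)
print('lower_bound:', lo, 'upper_bound:', hi, 'mean:', (lo + hi) / 2)
```

The sieve makes each factorization cost only as many steps as $n$ has prime factors, which is what keeps a pure-Python loop over $10^5$ integers practical; for windows near $10^{15}$ or beyond, a real factoring routine would be needed instead.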

Just to clarify : We compare the prime powers in the factorization of the number and determine the largest and smallest of those , right ? — Peter, Feb 18 '23 at 08:23
"My initial intuition was that contributions of the dominant prime should also decrease but this is not the case." You have not tested enough values. Your graph ends at 10 million, I would say it is a very very small number for this conjecture. Test up to $10^{15}$ to have significant results. — Lourrran, Feb 18 '23 at 08:30
Where you write "while the average contribution the minor prime decreased as $n$ increased, the average contribution the minor prime increased", I think the second "minor" should be "dominant"? (And "of" is missing twice.) — joriki, Feb 18 '23 at 08:40
I still do not understand the definition of the dominant prime factor , if it is not the prime in the largest prime power in the product of prime powers in the prime factorization of $n$. — Peter, Feb 18 '23 at 08:53
@Peter Given the prime factorization of $n$, find the maximum value of $p^a$ and that is your dominant prime. E.g. if $n = 2^5 \cdot 199^2 \cdot 1997^1$, even though 2 has the highest power and 1997 is the largest prime, $199^2$ is greater than both $2^5$ and $1997^1$ so 199 is our dominant prime. — Nilotpal Sinha, Feb 18 '23 at 09:04
@NilotpalSinha This is not different from what I described. $199^2$ is the largest prime power and therefore $199$ the dominant prime factor. But why do we have no dominant prime factor in the case of $n$ being a prime power ? Is it just a definition that in this case , we have value $1$ ? Because we cannot compare the prime power to any other ? — Peter, Feb 18 '23 at 09:17
@Peter The sum of the logarithmic contributions of all the prime factors including the dominant and the minor must be 1 as explained in the post. If a number has only 1 distinct prime factor then we can either call it the minor or the dominant prime factor but not both, because if we do then the sum of the contributions of the dominant and the minor will be 2 instead of 1. Hence to avoid double counting in this case, we define the lone prime factor as the minor and define the dominant factor to be 1 so that the sum of the logarithmic contributions remains 1. — Nilotpal Sinha, Feb 18 '23 at 09:23
This is your choice, your definition. You can perfectly well decide that for numbers $n$ with only 1 factor, minor factor = dominant factor = $n$; it is consistent. The sum of logarithmic contributions uses all distinct factors; you say that it uses the minor factor plus the dominant factor plus other factors; just say that it uses all distinct factors, and you will avoid inconsistency. — Lourrran, Feb 18 '23 at 09:45
As Peter, I consider that you should change your definition for those numbers with only 1 prime factor : when you have many distinct prime factors, the weight of each factor is small, when you have few prime factors, the weight of each factor is high, except when only 1 prime factor ??? — Lourrran, Feb 18 '23 at 09:45
After some reflection, we have numbers with more than 1 factor, where the dominant has a weight which is close to 50% (with some small variations that we want to analyze), and numbers with only 1 factor, where we have to decide if it is 0 or 1. Maybe the good decision is to exclude numbers with only 1 prime factor to see something. The impact of those numbers is so important that they mask anything else. — Lourrran, Feb 18 '23 at 10:47
@Lourrran I have been thinking along similar lines. We have three possibilities when it comes to numbers with only 1 prime factor. We can decide if (minor, dominant) is (1,0) or (0,1), or exclude these numbers, i.e. (0,0). The data so far shows that for (1,0) it increases as shown in the post so it might converge to a value 0.5865. For (0,1) it decreases and might converge to a value less than 0.653. Continued below ... — Nilotpal Sinha, Feb 18 '23 at 12:40
@Lourrran And if we take the midpoint of these two, we get 0.619, which the data for the case (0,0), when we exclude all numbers that have only one prime factor, seems to be approaching. So the data seems to suggest that regardless of how we define it, the contribution of the dominant prime approaches some limiting value close to 0.61 — Nilotpal Sinha, Feb 18 '23 at 12:40
@Lourrran Updated the definition to keep it consistent and added more empirical insights to the question. — Nilotpal Sinha, Feb 20 '23 at 07:10
The number of prime factors increases like $\ln(\ln(n))$ so the dominant share cannot decline faster than $\frac{1}{\ln(\ln(n))}$ which is very slow. — Zoe Allen, Feb 20 '23 at 07:26
Process is very long for very big numbers, ok. Instead of starting with $n=2$, you can start with $n=10^{15}$, and run up to $10^{15}+10^6$ ; then start with $n=10^{25}$, and run up to $10^{25}+10^6$. And compare these 2 results. Only $10^6$ numbers to process in each range, but very big numbers. — Lourrran, Feb 20 '23 at 08:33
I suspect that for most integers, the largest prime factor is the dominant prime factor, so any average formed using the dominant prime will equal the corresponding average formed using the largest prime. — Gerry Myerson, Feb 24 '23 at 06:08
@GerryMyerson Yes, this seems to be true. I am experimentally verifying this and it indeed seems that the DPF = GPF for most integers. For $n < 1.2 \times 10^7$, less than $2\%$ of integers have $GPF \ne DPF$ and this proportion is decreasing. — Nilotpal Sinha, Feb 24 '23 at 06:25
https://oeis.org/A102749 tabulates numbers for which the greatest prime isn't the dominant prime. It is speculated there that the sequence has density zero. — Gerry Myerson, Feb 24 '23 at 06:41

K. Makabre · Accepted Answer · 2024-01-28T01:03:55.147

I will prove that the limit exists and is precisely the Golomb-Dickman constant $\lambda$. First, let me sketch how one can show that the expected value of $\frac{\log(P_1(n))}{\log(n)}$, where $P_1(n)$ denotes the largest prime factor of $n$, is precisely $\lambda\approx0.62433$ via probability theory.

We say that an integer $n\geq1$ is $y$-smooth if every prime divisor of $n$ is $\leq y$. We also say that $n\geq1$ is $y$-powersmooth if every prime power divisor of $n$ is $\leq y$. Let's denote by $S(x,y)$ and $PS(x,y)$ the set of $y$-smooth and $y$-powersmooth numbers $\leq x$ respectively. It is a theorem of Dickman that $$F_S(\nu):=\lim_{x\to+\infty}\frac{1}{x}|S(x,x^{\nu})|=\rho(1/\nu)$$ where $\rho(t)$ is Dickman's function satisfying the differential equation $t\rho'(t)+\rho(t-1)=0$.
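The difference between the two notions is easy to see on small inputs. Below is a minimal Python sketch of both definitions; the function names are chosen for this illustration only, $1$ is treated as $y$-smooth and $y$-powersmooth for every $y$, and the counts are done by brute force.

```python
def prime_powers(n):
    """Yield (p, p**a) for each maximal prime power p**a dividing n."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            yield p, q
        p += 1
    if n > 1:
        yield n, n  # leftover prime


def is_smooth(n, y):
    """True if every prime divisor of n is <= y."""
    return all(p <= y for p, _ in prime_powers(n))


def is_powersmooth(n, y):
    """True if every prime power divisor of n is <= y."""
    return all(q <= y for _, q in prime_powers(n))


def S(x, y):
    return [n for n in range(1, x + 1) if is_smooth(n, y)]


def PS(x, y):
    return [n for n in range(1, x + 1) if is_powersmooth(n, y)]


# 72 = 2^3 * 3^2 is 3-smooth, but its prime powers 8 and 9 exceed 3.
print(is_smooth(72, 3), is_powersmooth(72, 3), is_powersmooth(72, 9))
print(len(S(100, 5)), len(PS(100, 5)))
```

Here `PS(100, 5)` is exactly the set of divisors of $60 = 2^2\cdot3\cdot5$, which shows how much more restrictive powersmoothness is for small $y$.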

For large $x\gg0$, choose a random $tx\in[0,x]$ where $t\in[0,1]$ (that is, consider the random variable $xT\sim U(0,x)$ where $U(0,x)$ is the discrete uniform distribution that "looks" continuous for large $x$). What is the probability that $\frac{\log(P_1(tx))}{\log(tx)}\leq\nu$? Well $$tx\in S(x,x^\nu)\iff P_1(tx)\leq x^\nu\iff \frac{\log(P_1(tx))}{\log(tx)}\leq\frac{\nu\log(x)}{\log(tx)}=\frac{\nu}{1+\frac{\log(t)}{\log(x)}}\sim\nu$$ so, for large $x\gg0$, we have $P\left(\frac{\log(P_1(xT))}{\log(xT)}\leq\nu\right)\sim P(xT\in S(x,x^\nu))=\frac{1}{x}|S(x,x^\nu)|\sim F_S(\nu)$. This means that $F_S(\nu)$ is the cumulative distribution function of $\frac{\log(P_1(n))}{\log(n)}$ so we can compute the expected value as $$\lim_{x\to+\infty}\frac{1}{x}\sum_{n\leq x}\frac{\log(P_1(n))}{\log(n)}=\int_0^1\nu F_S'(\nu)d\nu=\int_0^1\frac{-\rho'(1/\nu)}{\nu}d\nu$$ $$=\int_0^1\rho\left(\frac{1}{\nu}-1\right)d\nu=\int_0^\infty\frac{\rho(t)}{(1+t)^2}dt=\lambda$$
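The last integral can be checked numerically. The sketch below is an assumption-laden illustration rather than part of the proof: it tabulates $\rho$ on a uniform grid by applying the trapezoid rule to $\rho'(t)=-\rho(t-1)/t$ with $\rho=1$ on $[0,1]$, truncates the integral at a hypothetical cutoff `t_max`, and uses made-up function names.

```python
from math import log


def dickman_rho_grid(t_max=12, m=1000):
    """Return [rho(i/m) for i = 0..t_max*m], using rho = 1 on [0, 1] and the
    trapezoid rule for rho(t) = rho(t - h) - integral of rho(s - 1)/s ds."""
    n = t_max * m
    h = 1.0 / m
    rho = [1.0] * (n + 1)
    for i in range(m + 1, n + 1):
        left = rho[i - 1 - m] / ((i - 1) * h)   # rho(s - 1)/s at s = t - h
        right = rho[i - m] / (i * h)            # rho(s - 1)/s at s = t
        rho[i] = rho[i - 1] - 0.5 * h * (left + right)
    return rho


def golomb_dickman(t_max=12, m=1000):
    """Trapezoid approximation of the integral of rho(t)/(1+t)^2 over [0, t_max]."""
    rho = dickman_rho_grid(t_max, m)
    h = 1.0 / m
    f = [r / (1.0 + i * h) ** 2 for i, r in enumerate(rho)]
    return h * (sum(f) - 0.5 * (f[0] + f[-1]))


rho = dickman_rho_grid()
print(rho[2000], 1 - log(2))   # rho(2) should match 1 - ln 2
print(golomb_dickman())        # should be close to 0.62433
```

The cutoff is harmless because $\rho(t)$ decays faster than exponentially, so the tail beyond $t=12$ is far below the discretization error.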

Now we denote the greatest prime power divisor of $n\geq1$ as $Q_1(n)$. For example, $$Q_1(24)=Q_1(2^3\times3)=2^3=8\text{ and }Q_1(72)=Q_1(2^3\times3^2)=3^2=9$$ We want to compute the expected value of $\frac{\log(Q_1(n))}{\log(n)}$ in a similar manner as before. The key idea is to note that, for large $x\gg0$ and $tx\in[0,x]$ chosen uniformly, we have $$tx\in PS(x,x^\nu)\iff Q_1(tx)\leq x^\nu\iff\frac{\log(Q_1(tx))}{\log(tx)}\leq\frac{\nu\log(x)}{\log(tx)}\sim\nu$$ so, again, $P\left(\frac{\log(Q_1(xT))}{\log(xT)}\leq\nu\right)\sim\frac{1}{x}|PS(x,x^\nu)|$. Thus, if we had $\lim\limits_{x\to+\infty}\frac{|S(x,x^\nu)|-|PS(x,x^\nu)|}{x}=0$ (which I'll prove in just a moment), it would immediately follow that $$\lim_{x\to+\infty}\frac{1}{x}|PS(x,x^\nu)|=\lim_{x\to+\infty}\frac{1}{x}|S(x,x^\nu)|=F_S(\nu)=\rho(1/\nu)$$ $$\Rightarrow\lim_{x\to+\infty}\frac{1}{x}\sum_{n\leq x}\frac{\log(Q_1(n))}{\log(n)}=\int_0^1\nu F_S'(\nu)d\nu=\int_0^\infty\frac{\rho(t)}{(1+t)^2}dt=\lambda$$ as desired. So it suffices to show that $\lim\limits_{x\to+\infty}\frac{|S(x,x^\nu)|-|PS(x,x^\nu)|}{x}=0$.

First observe that every $y$-powersmooth number is $y$-smooth, i.e. $PS(x,y)\subseteq S(x,y)$. Thus we have that $$0\leq|S(x,y)\setminus PS(x,y)|=|S(x,y)|-|PS(x,y)|$$ For every prime $p\leq y$, denote by $M_p(x,y)=\{p^{1+\lfloor\log_p y\rfloor}k\leq x : k\geq1\}$ the set of multiples of $p^{1+\lfloor\log_p y\rfloor}$ which are $\leq x$, where $p^{1+\lfloor\log_p y\rfloor}$ is the least power of $p$ that exceeds $y$.

Now observe that, if $n\in S(x,y)\setminus PS(x,y)$, then $n$ is a product of powers of primes $\leq y$ as $n=\prod_{p_i\leq y}p_i^{\epsilon_i}$ where $\epsilon_i$ can be zero and, at the same time, $\exists p_i\leq y$ such that $p_i^{\epsilon_i}>y\Rightarrow \epsilon_i\geq1+\lfloor\log_{p_i}y\rfloor$. But this means that $n\in S(x,y)\setminus PS(x,y)$ implies that $\exists p\leq y$ prime such that $n\in M_p(x,y)$ so $S(x,y)\setminus PS(x,y)\subseteq\bigcup_{p\leq y}M_p(x,y)$. Thus, we can bound $|S(x,y)|-|PS(x,y)|$ by $$|S(x,y)|-|PS(x,y)|=|S(x,y)\setminus PS(x,y)|\leq\left|\bigcup_{p\leq y}M_p(x,y)\right|\leq\sum_{p\leq y}|M_p(x,y)|$$ $$=\sum_{p\leq y}\left\lfloor\frac{x}{p^{1+\lfloor\log_p y\rfloor}}\right\rfloor\leq\sum_{p\leq y}\frac{x}{p^{1+\lfloor\log_p y\rfloor}}\leq\sum_{p\leq y}\frac{x}{p^{\log_p(y)}}=\frac{x}{y}\sum_{p\leq y}1=\frac{\pi(y)}{y}x$$ $$\Rightarrow\boxed{\therefore 0\leq\frac{|S(x,y)|-|PS(x,y)|}{x}\leq\frac{\pi(y)}{y}}$$ Finally, note that $\frac{\pi(y)}{y}\sim\frac{1}{\log(y)}\to0$ by the prime number theorem which, in particular, implies $\lim\limits_{x\to+\infty}\frac{|S(x,x^\nu)|-|PS(x,x^\nu)|}{x}=0$. In fact, we only need $\frac{\pi(y)}{y}\to 0$ (i.e. the zero natural density of prime numbers) which is not that hard to prove.$\ \square$
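The boxed inequality can be sanity-checked by brute force for small $x$ and $y$. The following Python sketch is only a numerical illustration under the conventions $P_1(1)=Q_1(1)=1$; the names `largest_prime_and_power` and `gap_and_bound` are invented for it.

```python
def largest_prime_and_power(n):
    """Return (P1, Q1): the largest prime factor and the largest prime power
    divisor of n, with P1 = Q1 = 1 for n = 1."""
    P1 = Q1 = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            P1, Q1 = max(P1, p), max(Q1, q)
        p += 1
    if n > 1:  # leftover prime
        P1, Q1 = max(P1, n), max(Q1, n)
    return P1, Q1


def gap_and_bound(x, y):
    """Return (|S(x,y)| - |PS(x,y)|, pi(y) * x / y)."""
    gap = 0
    for n in range(1, x + 1):
        P1, Q1 = largest_prime_and_power(n)
        if P1 <= y < Q1:  # y-smooth but not y-powersmooth
            gap += 1
    # k is prime exactly when its largest prime factor is k itself.
    pi_y = sum(1 for k in range(2, y + 1) if largest_prime_and_power(k)[0] == k)
    return gap, pi_y * x / y


print(largest_prime_and_power(24), largest_prime_and_power(72))  # (3, 8) (3, 9)
for y in (10, 50, 100):
    gap, bound = gap_and_bound(10**4, y)
    print(y, gap, bound, gap <= bound)
```

For fixed $x$ the bound is far from tight, but only its limit $\pi(y)/y\to0$ is needed in the argument.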
