According to Wikipedia, the density of Sophie Germain primes is expected to be

$$ 2C{\frac {n}{(\ln n)^{2}}}\approx 1.32032{\frac {n}{(\ln n)^{2}}}, $$

where $C$ is the twin prime constant. However, since Sophie Germain primes are defined to be primes $p$ such that $2p+1$ is also prime, would it be more accurate to say

$$ 2C{\frac {n}{(\ln n)(\ln 2n)}} $$

instead?

Or equivalently, denoting the number of Sophie Germain primes below a positive integer $n$ by $\pi_{SG}(n)$:

$$ \pi_{SG}(n) \sim 2C \int_2^{n} \frac{dt}{\ln t \ln 2t}? $$


EDIT

I understand that the contribution of the $\ln 2$ term is small, but it is still more accurate. Asymptotically there is no doubt that Wikipedia is correct; however, for the SG prime-counting function it is definitely more accurate to include that term.

Klangen
  • No, the prime number theorem is $\pi(x)\sim x/\log(x)$, and not $x/2\log(x)$, for $q=2p+1$. – Dietrich Burde Apr 08 '19 at 11:19
  • But on this page for instance (albeit within the context of Cunningham chains): https://primes.utm.edu/glossary/page.php?sort=CunninghamChain the integral in the calculation of the density takes into account the multiplicative factors in the Sophie Germain primes... – Klangen Apr 08 '19 at 11:21
  • It says $B_2N/log(N)^2$ for $k=2$, so without a factor of $1/2$? And $B_2=1.32032$, so exactly what wikipedia says. – Dietrich Burde Apr 08 '19 at 11:23
  • @Klangen Firstly in your link above, the multiplicative factors are in the conjectured integral. Secondly, even if you take the multiplicative factors, compared to $n$ the contribution of $\log 2$ is so small that it gets eaten up in the asymptotic and you end up with the form without these factors – Nilotpal Sinha Apr 08 '19 at 11:25
  • @DietrichBurde Did you take a look at the primes.edu page? Specifically the integral on the bottom of the page contains these multiplicative constants – Klangen Apr 08 '19 at 12:13
  • Question edited for clarity – Klangen Apr 08 '19 at 12:19
  • Note, the Wikipedia article says the formula's estimate is too small for $n=10^4$ and $10^7$. Your correction would make the estimates even smaller, hence less accurate for those numbers. (The "official" formula may, of course, eventually be an overestimate; it might be of interest to know if and when that happens.) – Barry Cipra Apr 08 '19 at 12:35
  • To be fair the only possible length greater than 3, chains are almost all elements being 29 mod 30. –  Apr 09 '19 at 01:25
  • Your integral formula, corrected to $2C\int_2^n{dt\over\ln t\ln2t}$ (you had some $t$'s and $n$'s reversed), is actually quite good. See Section 3.5 on pages 11-12 at https://www.utm.edu/staff/caldwell/preprints/Heuristics.pdf – Barry Cipra Apr 10 '19 at 09:42
  • @BarryCipra Yes indeed thank you for noticing the erroneous formula – Klangen Apr 10 '19 at 12:31
  • @BarryCipra That paper is really interesting, thank you for posting it! – Klangen Apr 10 '19 at 12:36
  • Perhaps of interest to you is this observation : https://math.stackexchange.com/questions/3790597/why-does-this-ratio-5-occur-relating-prime-twins-and-sophie-germain-primes – mick Jun 14 '23 at 22:13

1 Answer

Given that the infinitude of Sophie Germain primes is still only conjectural, it's hopeless here to prove that one formula gives a better estimate, asymptotically, than another. However, we can discuss the relationships of the various formulas and how they relate to current counts of the Sophie Germain primes.

The main thing we can prove is the following: For $n\ge7$, we have

$${n\over\ln n\ln2n}\lt{n\over(\ln n)^2}\lt\int_2^n{dt\over\ln t\ln2t}$$

The first inequality is actually true for $n\ge2$, since $\ln2n\gt\ln n$ for all $n\gt1$. The second inequality needs to be checked for $n=7$ (it turns out the inequality points the other way for $n\le6$), after which we can show that

$$f(x)=\int_2^x{dt\over\ln t\ln2t}-{x\over(\ln x)^2}$$

is an increasing function by computing its derivative,

$$f'(x)={1\over\ln x\ln2x}-{1\over(\ln x)^2}+{2\over(\ln x)^3}$$

and showing that $f'(x)\ge0$ for all $x\ge7$. We do this by noting that, since $\ln2x$, $\ln x$, and $\ln(x/2)$ are all positive when $x\ge7$, we have

$$\begin{align} f'(x) &=\frac{(\ln x)^2-(\ln x)(\ln 2x)+2\ln 2x}{(\ln x)^3\ln 2x}\\ &\gt\frac{(\ln x)^2-(\ln x)(\ln 2x)+\ln 2\,\ln 2x}{(\ln x)^3\ln 2x}\\ &=\frac{(\ln x)^2-\ln(x/2)\,\ln 2x}{(\ln x)^3\ln 2x}\\ &\ge\frac{(\ln x)^2-\left(\frac{\ln(x/2)+\ln 2x}{2}\right)^2}{(\ln x)^3\ln 2x}\\ &=\frac{(\ln x)^2-\left(\frac{\ln x^2}{2}\right)^2}{(\ln x)^3\ln 2x}\\ &=\frac{(\ln x)^2-(\ln x)^2}{(\ln x)^3\ln 2x}\\ &=0 \end{align}$$

where the strict inequality uses $2\gt\ln2$ together with $\ln2x\gt0$, and the key step invokes the Arithmetic-Geometric Mean inequality $\sqrt{ab}\le{a+b\over2}$ for $a,b\ge0$. (This is why we noted that $\ln(x/2)$ is positive for $x\ge7$.)
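As a numerical sanity check (not a substitute for the proof), here is a minimal Python sketch that evaluates the three expressions for small $n$ and samples the sign of $f'(x)$ on a grid. The function names `lower`, `middle`, `upper` and `f_prime` are just labels chosen for this illustration, and the integral is approximated with a plain composite Simpson's rule, so the values are only approximate.

```python
import math


def integrand(t):
    """The integrand 1 / (ln t * ln 2t)."""
    return 1.0 / (math.log(t) * math.log(2 * t))


def simpson(f, a, b, steps=10_000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def lower(n):
    """n / (ln n * ln 2n)"""
    return n / (math.log(n) * math.log(2 * n))


def middle(n):
    """n / (ln n)^2"""
    return n / math.log(n) ** 2


def upper(n):
    """Approximation of the integral from 2 to n of dt / (ln t * ln 2t)."""
    return simpson(integrand, 2, n)


def f_prime(x):
    """The derivative f'(x) as written in the answer."""
    lx, l2x = math.log(x), math.log(2 * x)
    return 1 / (lx * l2x) - 1 / lx**2 + 2 / lx**3


# Compare the three expressions for small n.
for n in range(2, 11):
    print(n, lower(n) < middle(n), middle(n) < upper(n))

# Sample f'(x) on a logarithmic grid from 7 up to 7 * 10^12.
xs = [7 * 10 ** (k / 10) for k in range(121)]
print(all(f_prime(x) > 0 for x in xs))
```

The loop prints, for each $n$, whether the first and the second inequality hold, which makes the switch between $n=6$ and $n=7$ visible.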

Now experimentally, the integral (multiplied by $2C$) seems to give a surprisingly accurate approximation to the actual count of Sophie Germain primes, $\pi_{SG}(n)$, while the simple fraction $n/(\ln n)^2$ seems to systematically undercount them. For example, for $n=10^{14}$, we have (in decreasing numerical order)

$$\begin{align} 2C\int_2^n{dt\over\ln t\ln2t}&\approx132822400531\\ \pi_{SG}(n)&=132822315652\\ 2C{n\over(\ln n)^2}&\approx127055347336\\ 2C{n\over\ln n\ln2n}&\approx124380891673 \end{align}$$

where the exact value is taken from the OEIS sequence A092816. Other comparisons for powers of $10$ from $10^3$ to $10^{11}$ can be found on page 12 in Chris Caldwell's paper, "An Amazing Prime Heuristic." They all show $2Cn/(\ln n)^2$ substantially undercounting $\pi_{SG}(n)$, while the integral sometimes undercounts and sometimes overcounts, but always gives a better approximation.
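For readers who want to reproduce this kind of comparison at a smaller scale, the following is a rough Python sketch. It counts Sophie Germain primes $p\le n$ with a simple sieve and evaluates the three estimates. The helper names are made up for this example, the value used for $C$ is a truncated decimal, and the integral is computed with Simpson's rule after the substitution $u=\ln t$ (which turns the integrand into $e^u/(u(u+\ln 2))$), so the figures it prints are approximations rather than reference values.

```python
import math

TWIN_PRIME_CONSTANT = 0.6601618158  # C, truncated; 2C is approximately 1.32032


def prime_sieve(limit):
    """Return a bytearray whose entry k is 1 exactly when k is prime (k <= limit)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Cross out the multiples of i, starting at i*i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return is_prime


def count_sophie_germain(n):
    """Number of primes p <= n such that 2p + 1 is also prime."""
    is_prime = prime_sieve(2 * n + 1)
    return sum(1 for p in range(2, n + 1) if is_prime[p] and is_prime[2 * p + 1])


def integral_estimate(n, steps=20_000):
    """2C * integral from 2 to n of dt / (ln t * ln 2t), via Simpson's rule in u = ln t."""
    a, b = math.log(2), math.log(n)
    h = (b - a) / steps  # steps must be even for Simpson's rule
    ln2 = math.log(2)

    def g(u):
        # With t = e^u: dt = e^u du, ln t = u, ln 2t = u + ln 2.
        return math.exp(u) / (u * (u + ln2))

    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return 2 * TWIN_PRIME_CONSTANT * total * h / 3


def estimates(n):
    """The three estimates discussed above, keyed by a short label."""
    two_c = 2 * TWIN_PRIME_CONSTANT
    return {
        "integral": integral_estimate(n),
        "fraction": two_c * n / math.log(n) ** 2,
        "fraction_2n": two_c * n / (math.log(n) * math.log(2 * n)),
    }


n = 10**6
print("actual count:", count_sophie_germain(n))
for label, value in estimates(n).items():
    print(label, round(value))
```

A sieve in pure Python is only practical up to roughly $10^7$ or $10^8$; the larger values quoted above come from OEIS and Caldwell's paper, not from code like this.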

If that relationship persists, then changing $n/(\ln n)^2$ to $n/(\ln n\ln2n)$ will not help. On the other hand, if the conjectured infinitude of Sophie Germain primes is wrong, then $2Cn/(\ln n\ln2n)$ would, asymptotically, be the closest of the three "estimates," but only because it tends to infinity more slowly than the other expressions.

Barry Cipra