Finding missing bits mathematically in a DLP situation (full problem)

Question

In a preparatory question we had to recover decimal digits $@$ of $r$ and $s$ given

$g=51234$, $h=90403$, $N=111649$, $r=3@497$, $s=276@3$,

with $r$ and $s$ the smallest positive solutions to $g^r\equiv h\pmod N$ and $h^s\equiv g\pmod N$.

For large $N$ of unknown factorization, we can't compute $\varphi(N)$ from $N$ as was possible in the above. What techniques can be used to solve the problem?

I tried to use Index Calculus on the example below but failed. This is a very simple problem posed in 2016:

Recover all digits of $r$, $s$ and the factorization of $N$.

We are given $N$, $g$, $h$, $r=\log_g h$ such that $g^r\equiv h\pmod N$, and $s=\log_h g$ such that $h^s\equiv g\pmod N$, with @ standing for an unknown decimal digit:

N = 4770047289861054128673165840475881666985708841069569122247779994575999373990855387188690474135592158102615485877459618836427811081668578893542268141889988869764086285203345206116260923268064851230188829163155218509489708275219402417
g = 1818674628639967921918316591104926407701223436868786162485485191757839034354239408133296195009525639079428182610287045427973123535470060353099447741037582550708917693998695104714221823196243333713913383763090393730760094204741650090
h = 2208658095572997139683467669564020944425802120964728746970717796984260065210935180456703377157524093154996948648279266706057554344748359754567027091354206701512345543745638076145129569549991043944481795915889787265463492997755593901
r =  1@6826295140541573@9405922358@031348321@068314394@23991550@60902965@70229914@6203562@77909154@2484058239@305290554@451041944@77284864589306634379124981555276837338619890402355995522612897221369937487171781010716680@8004465050769716
s =  57@2623169175139588682565993777078629@51369239581656482190187775@98993533114834@931596544472993@0093163506369690@0954113499@984660633390860@46262557530048@277694556546720@7679243674115857@55557012193132332836046587@22597144319762@@

Editor's note: the original question was changed for clarity, de-emphasizing the numerical example, and using the simpler notation introduced in this answer to the original preparatory question.

fgrieu · Accepted Answer · 2018-07-08T21:56:35.547

Resolution plan

$N$ is a composite with no trivial factor, and our plan will be:

Find what's missing in $r$ (14 decimal digits) from what's known, the values of $N$, $g$, $h$, and the relation $g^r\equiv h\pmod N$, using a minor variation of the baby-step giant-step Discrete Log algorithm.
Similarly find the missing decimal digits in $s$.
Raising $g^r\equiv h\pmod N$ to the power $s$ then substituting $h^s\equiv g\pmod N$ gives $g^{r\cdot s}\equiv g\pmod N$. Hence $r\cdot s-1$, which we can now compute, is a multiple of the order of $g$, and thus has a fair chance of being a multiple of $\varphi(N)/2$.
Note: This is not guaranteed! In the preparatory problem $g$ is of order $\varphi(N)/2$, and at best we can reasonably hope that in the full problem the order is $\varphi(N)/2$ or $\varphi(N)$. However, if $g$ has been randomly selected, its order is likely a multiple of $\varphi(N)/c$ for some small $c$ dividing $\varphi(N)$, and $c$ can be searched for.
Factor $N$ from its value and that of the multiple of $\varphi(N)/2$ just obtained.
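The answer does not say which method is used for the last step of the plan. As a minimal sketch, assuming the usual square-root-of-unity approach (the one behind the Miller-Rabin test), and with the hypothetical name `factor_from_exponent_multiple`, splitting $N$ from a multiple $m$ of $\varphi(N)/2$ could look like this. It relies on $a^m\equiv1\pmod N$ holding for the bases $a$ that are tried, which is the case when $N$ is a product of two odd primes and $m$ is a multiple of $\varphi(N)/2$.

```python
from math import gcd

def factor_from_exponent_multiple(n, m, max_base=1000):
    """Try to split odd composite n, given m such that a**m == 1 (mod n).

    Hypothetical helper, not taken from the answer. Returns (p, q) with
    p * q == n, or None if no base below max_base reveals a factor.
    """
    # Write m = 2**t * odd with odd not divisible by 2.
    t, odd = 0, m
    while odd % 2 == 0:
        odd //= 2
        t += 1
    for a in range(2, max_base):
        d = gcd(a, n)
        if d != 1:
            return d, n // d          # lucky: a shares a factor with n
        x = pow(a, odd, n)
        for _ in range(t):
            y = x * x % n
            # x is a square root of 1 other than +1 or -1: it splits n.
            if y == 1 and x != 1 and x != n - 1:
                p = gcd(x - 1, n)
                return p, n // p
            x = y
    return None
```

In the full problem one would call it with `m = r * s - 1` once both exponents are known. If the order of $g$ is only $\varphi(N)/c$ for a small $c$, trying `m = c * (r * s - 1)` for small `c` is one way to carry out the search mentioned in the note.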

Finding the full $r$ with baby-step giant-step

We are working in base $b$ and missing digits of $r$ at index $i$ in set $\Bbb M$ (starting at index 0 on the right), and can write $\displaystyle r=\hat r+\sum_{i\in\Bbb M}x_i\,b^i$ with unknowns $x_i\in[0,b)$. Here $b=10$, $\hat r$ is obtained by replacing @ with 0 in the given form of $r$, and the 14 indexes of the @ starting from the right are $\Bbb M=\{16,106,116,126,137,146,154,163,172,181,191,201,212,229\}$.
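The decomposition above is mechanical to extract from the masked string. A small sketch, with the hypothetical helper name `split_masked`, that returns $\hat r$ and the list of missing indexes counted from the right:

```python
def split_masked(masked, base=10, wildcard="@"):
    """Return (r_hat, missing) for a digit string with unknown digits.

    r_hat is the value with every wildcard replaced by 0; missing lists
    the positions of the wildcards, index 0 being the rightmost digit.
    Hypothetical helper for illustration only.
    """
    r_hat = int(masked.replace(wildcard, "0"), base)
    missing = [i for i, c in enumerate(reversed(masked)) if c == wildcard]
    return r_hat, missing

# Preparatory problem: r = 3@497 has one unknown digit at index 3.
r_hat, missing = split_masked("3@497")
```

Applied to the masked $r$ of the full problem (with surrounding spaces stripped), it should reproduce the 14 indexes of $\Bbb M$ listed above.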

We split $\Bbb M$ about evenly into disjoint subsets $\Bbb U$ and $\Bbb V$ and rewrite $g^r\equiv h\pmod N$ as

$$\begin{array}{llll} g^r&\equiv&h&\pmod N\\ g^{\left(\hat r+\displaystyle\sum_{i\in\Bbb M}x_i\,b^i\right)}&\equiv&h&\pmod N\\\\ g^{\hat r}\cdot\displaystyle\prod_{i\in\Bbb M}\left(g^{(b^i)}\right)^{x_i}&\equiv&h&\pmod N\\ g^{\hat r}\cdot\displaystyle\prod_{i\in\Bbb U}\left(g^{(b^i)}\right)^{x_i}&\equiv&h\cdot\displaystyle\prod_{i\in\Bbb V}\left(g^{(-b^i)}\right)^{x_i}&\pmod N \end{array}$$

We pre-compute $g^{-1}\bmod N$ (using e.g. the half-extended Euclidean algorithm), and $\left(g^{-1}\right)^{(b^i)}\bmod N$ for ${i\in\Bbb V}$. We can then sequentially compute all $b^{|\Bbb V|}$ values of $\displaystyle\left(h\cdot\prod_{i\in\Bbb V}\left(g^{(-b^i)}\right)^{x_i}\right)\bmod N$ and store them (alongside the matching tuple of $x_i$ for $i\in\Bbb V$) in a data structure allowing fast search, e.g. a hash table. We start from $h$, which corresponds to all $x_i=0$ with ${i\in\Bbb V}$, and perform one modular multiplication for each of the other values (we need $|\Bbb V|$ intermediary variables, including $h$).

We pre-compute $g^{\hat r}\bmod N$ and the $g^{(b^i)}\bmod N$ for ${i\in\Bbb U}$, then sequentially compute $g^{\hat r}\cdot\displaystyle\left(\prod_{i\in\Bbb U}\left(g^{(b^i)}\right)^{x_i}\right)\bmod N$ and search for each value in said data structure. We start from $g^{\hat r}\bmod N$, which corresponds to all $x_i=0$ with ${i\in\Bbb U}$, and perform one modular multiplication for each of the other values (we need $|\Bbb U|$ intermediary variables, including $g^{\hat r}$). When we get a match, that gives us all the $x_i$ for $i\in\Bbb M$ (those for $i\in\Bbb V$ come from the data structure, those for $i\in\Bbb U$ are the ones that led to the match), hence the full $r$.
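The two phases just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the answer's actual program: the name `recover_masked_exponent` is hypothetical, a `dict` plays the role of the fast-search data structure, and for brevity each product is recomputed from scratch rather than with one modular multiplication per entry as described above. `pow(g, -1, n)` needs Python 3.8 or later.

```python
from itertools import product

def recover_masked_exponent(g, h, n, masked, base=10, wildcard="@"):
    """Find r matching `masked` with g**r == h (mod n), or None.

    Meet-in-the-middle over the unknown digits; hypothetical sketch.
    """
    # Indexes of unknown digits, 0 = rightmost, and r_hat (wildcards as 0).
    missing = [i for i, c in enumerate(reversed(masked)) if c == wildcard]
    r_hat = int(masked.replace(wildcard, "0"), base)
    half = len(missing) // 2
    v_idx, u_idx = missing[:half], missing[half:]   # the sets V and U

    # Table phase: h * prod_{i in V} (g^-(b^i))^{x_i} mod n -> tuple of x_i.
    g_inv = pow(g, -1, n)
    v_pows = [pow(g_inv, base**i, n) for i in v_idx]
    table = {}
    for xs in product(range(base), repeat=len(v_idx)):
        val = h % n
        for p, x in zip(v_pows, xs):
            val = val * pow(p, x, n) % n
        table[val] = xs

    # Search phase: g^r_hat * prod_{i in U} (g^(b^i))^{x_i} mod n.
    u_pows = [pow(g, base**i, n) for i in u_idx]
    start = pow(g, r_hat, n)
    for xs in product(range(base), repeat=len(u_idx)):
        val = start
        for p, x in zip(u_pows, xs):
            val = val * pow(p, x, n) % n
        if val in table:
            # Combine the digits from both halves into the full exponent.
            r = r_hat
            for i, x in zip(u_idx, xs):
                r += x * base**i
            for i, x in zip(v_idx, table[val]):
                r += x * base**i
            return r
    return None
```

For a quick check on toy data (not the numbers of the question), one can take a prime modulus such as `2**61 - 1`, pick a secret exponent, mask a few of its digits, and confirm that the returned value satisfies $g^r\equiv h\pmod N$ and agrees with the known digits.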

The worst case computing effort is dominated by about $2\,b^{|\Bbb M|/2}$ (here 20,000,000) modular multiplications modulo $N$ (here, costing about $2\,\lceil(\log_2N)/32+1\rceil^2<1500$ elementary multiplications of 32-bit variables and additions of the 64-bit result using schoolbook algorithms). We are talking minutes of runtime, and less than 3 Megabytes of memory (and by making $|\Bbb V|$ one smaller, we reduce memory requirement by a factor $b$, at the expense of runtime).
