The simplest way to proceed is rejection sampling. For this we first need $M \geq N$. If this isn't the case (maybe you're playing D&D with a d6, so $N=20,M=6$), then you should roll at least $k=\lceil \log_M(N) \rceil$ times and treat each $k$-tuple of rolls as a single roll of a die with $M^k \geq N$ sides. For the discussion of the rejection approach, let us assume you have already done this and renamed $M^k$ as $M$ if need be.
Now rejection sampling amounts to assigning each of the numbers $1,2,\dots,N$ to $q(M,N)$ of the possible rolls, where $q$ is the quotient when $M$ is divided by $N$. If you get one of the assigned rolls, you terminate with its number; otherwise you start over.
This rejection method takes constant space. Each attempt fails with probability $\frac{r(M,N)}{M}$ (where $r$ is the remainder when $M$ is divided by $N$), so the average number of attempts is $\frac{M}{M-r(M,N)}$. In theory this can run forever, but the probability of long runtimes decays exponentially fast: the chance of needing more than $t$ attempts is $\left(\frac{r(M,N)}{M}\right)^t$.
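The rejection scheme above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive implementation: the function name `roll_dn_by_rejection` and the `roll` parameter (a callable that returns one roll of the $M$-sided die) are assumptions introduced here, and the particular labeling of $k$-tuples (reading them as base-$M$ numbers) is just one convenient choice.

```python
import random

def roll_dn_by_rejection(n, m, roll=None):
    """Simulate a fair n-sided die using only a fair m-sided die.

    `roll` is a hypothetical callable returning an integer in 1..m;
    by default it uses Python's pseudorandom generator.
    Returns (result, number_of_m_sided_rolls_used).
    """
    if roll is None:
        roll = lambda: random.randint(1, m)

    # Smallest k with m**k >= n, found with integers to avoid float log errors.
    k, big_m = 1, m
    while big_m < n:
        k, big_m = k + 1, big_m * m

    q = big_m // n  # each target value is assigned q of the big_m labels
    used = 0
    while True:
        # Read k rolls as a k-digit base-m number: a label in 0..big_m-1.
        label = 0
        for _ in range(k):
            label = label * m + (roll() - 1)
            used += 1
        if label < q * n:  # labels 0..q*n-1 are the assigned ones
            return label % n + 1, used
        # Otherwise the label is one of the r unassigned ones: start over.
```

For $N=20,M=6$ this uses pairs of rolls, so the effective die has 36 sides with $q=1$ and $r=16$: an attempt fails with probability $16/36$, and the average is $36/20=1.8$ attempts, or $3.6$ rolls of the d6.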
A different way to proceed is to set yourself up to use a dM to generate a random number from any discrete distribution and then apply it to this setting. To do that, note that if $X_n$ are iid rolls from an $M$-sided die then $U=\sum_{n=1}^\infty (X_n-1) M^{-n}$ is uniformly distributed on $[0,1]$. Then given whatever other univariate random variable $X$ you like, if $Q_X$ is its quantile function then $Q_X(U)$ has the same distribution as $X$. This technique is usually called inverse transform sampling; it is the probability integral transformation run in reverse.
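For the case of interest here the quantile function can be written out explicitly. As a small worked case, assuming $X$ is a fair dN with values $1,\dots,N$ and taking the usual generalized inverse as the definition of $Q_X$:

$$F_X(x)=\frac{\lfloor x \rfloor}{N} \quad (0 \leq x \leq N), \qquad Q_X(u)=\min\{x : F_X(x) \geq u\}=\lceil N u \rceil \quad (0<u\leq 1).$$

So $Q_X(U)=k+1$ exactly when $U \in (k/N,(k+1)/N]$ for $k=0,1,\dots,N-1$, and each of these $N$ intervals has length $1/N$.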
Now at first glance this seems silly, because you can't do infinitely many rolls. But because $Q_X$ for discrete $X$ is piecewise constant, you don't need to know exactly what $U$ is. For simulating a dN, you only need to resolve which of the intervals $(k/N,(k+1)/N]$ will eventually contain $U$. Given $R$ rolls and a current value of $U$, say $U_R$, you know that the final value of $U$ will be somewhere between $U_R$ and $U_R+\sum_{n=R+1}^\infty (M-1) M^{-n}=U_R+M^{-R}$. If these two numbers fall in the same interval of the form above, then you are done computing.
Again the runtime of this alternative method is random and not bounded. The space footprint of this method is also random and not bounded. The advantage of it is that it can sometimes conserve prior information instead of just starting over, so that the probability that you will finish in the next step sometimes improves as you go on. Also, in the setting with $M<N$, although you need to roll at least $k=\lceil \log_M(N) \rceil$ times, you do not have to perform a multiple of $k$ rolls, which could be good if for some reason $k$ were large.
Let's do a demonstration of the method with $N=20,M=6$. I roll a 4, putting me in $[3/6,4/6]$, but $\lceil 3 \cdot 20/6 \rceil = 10$ while $\lceil 4 \cdot 20/6 \rceil=14$, so I have only narrowed it down to the 4 possibilities 11 through 14 so far (a 10 would require $U$ to be exactly $3/6$, which has probability zero). Then I roll a 6, putting me in $[23/36,24/36]$. I'm still not done because $\lceil 23 \cdot 20/36 \rceil = 13$ while $\lceil 24 \cdot 20/36 \rceil=14$, so I have narrowed it to 13 or 14 but it isn't yet clear which. I roll again and get a $1$, and now my roll is understood as a 13.
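The interval-narrowing procedure, including the demonstration above, can be sketched as follows. This is a hedged illustration under stated assumptions rather than a reference implementation: the name `roll_dn_by_intervals` and the `roll` callable are introduced here, the interval is tracked with exact integers as $[a/M^R,(a+1)/M^R]$, and endpoints of that interval are ignored because $U$ lands exactly on one with probability zero.

```python
import random

def roll_dn_by_intervals(n, m, roll=None):
    """Simulate a fair n-sided die from m-sided rolls by interval narrowing.

    `roll` is a hypothetical callable returning an integer in 1..m.
    Returns (result, number_of_rolls_used).
    """
    if roll is None:
        roll = lambda: random.randint(1, m)

    # Invariant: U lies in [a/d, (a+1)/d], where d = m**used.
    a, d, used = 0, 1, 0
    while True:
        x = roll()
        used += 1
        a, d = a * m + (x - 1), d * m
        k = (a * n) // d  # k/n <= a/d < (k+1)/n
        # Done once the upper end (a+1)/d is also at most (k+1)/n.
        if (a + 1) * n <= (k + 1) * d:
            return k + 1, used
        # Otherwise a breakpoint lies strictly inside the interval: roll again.
```

Feeding it the rolls 4, 6, 1 with `n=20, m=6` returns `(13, 3)`, matching the demonstration. The integer `a` grows by one base-$M$ digit per roll, which is where the unbounded space footprint shows up.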
This "pseudo-probability integral transformation method" does not always succeed at retaining entropy at every step. Returning to this $N=20,M=6$ example, when you expand it all out, the algorithm does two rolls and assigns 20 of the 36 possibilities an outcome number, one to each outcome. At this point each outcome has probability $1/36$ but needs $1/20=(9/5)(1/36)$, so it needs an additional $(4/5) (1/36)$ of unconditional probability. So we use the subsequent rolls to divide the other sixteen possibilities for the first two rolls among the 20 possible d20 outcomes. This goes as follows:
- 80/20 (finish one number, leave one short by $(3/5) (1/36)$)
- 60/40 (finish the number you just started, leave one short by $(2/5) (1/36)$)
- 40/60 (finish the number you just started, leave one short by $(1/5) (1/36)$)
- 20/80 (finish the number you just started and the next one)
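and this pattern occurs four times in all, once for each block of five outcomes. There is no way to implement these 80/20 or 60/40 splits with a d6 while conserving entropy, because the number $3/5$ (resp. $4/5$) splits $[3/6,4/6]$ (resp. $[4/6,5/6]$) in the same proportion as each splits $[0,1]$ itself, so the one sub-interval left unresolved after another roll is split exactly as before. You can think of that as coming about because $1/5=0.\overline{1}$ in base 6, so that $3/5=0.\overline{3}$ and $4/5=0.\overline{4}$.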
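The bookkeeping above can be checked by brute force. The sketch below is an illustration only; the helper names `classify_cells` and `refine` are made up here, and it assumes $M^{\text{rolls}} > N$ so that each cell contains at most one breakpoint. It enumerates the 36 two-roll cells, counts those already resolved, records how each remaining cell is split, and then shows that one more roll leaves the split unchanged.

```python
from fractions import Fraction

def classify_cells(n=20, m=6, rolls=2):
    """Classify each cell [a/d, (a+1)/d], d = m**rolls, for simulating a dn.

    Returns (resolved, splits): `resolved` counts cells lying inside a single
    interval (k/n, (k+1)/n]; `splits` lists, for every other cell, the
    fractions of the cell on each side of the breakpoint it contains.
    Assumes d > n, so a cell holds at most one breakpoint.
    """
    d = m ** rolls
    resolved, splits = 0, []
    for a in range(d):
        k = (a * n) // d
        if (a + 1) * n <= (k + 1) * d:
            resolved += 1
        else:
            # Relative position of the breakpoint (k+1)/n inside the cell.
            p = Fraction((k + 1) * d, n) - a
            splits.append((p, 1 - p))
    return resolved, splits

def refine(p, m=6):
    """Roll once more inside a cell split at p.

    Returns (index of the sub-cell that stays unresolved, its split point).
    """
    scaled = p * m
    index = int(scaled)  # truncation; p is positive here
    return index, scaled - index

resolved, splits = classify_cells()
print(resolved, len(splits))   # resolved cells, unresolved cells
print(splits[:4])              # the 80/20, 60/40, 40/60, 20/80 pattern
print(refine(Fraction(3, 5)))  # sub-cell index 3, i.e. [3/6, 4/6], split again at 3/5
```

Because `refine` returns the same split point it was given for each of $1/5, 2/5, 3/5, 4/5$, every further roll resolves an unresolved cell with probability $5/6$ and otherwise leaves you facing the same split.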