That $\mathcal{F}:L_1\rightarrow C_0$ is not surjective follows from the steps outlined in the exercise in Rudin's book. See here for details if you get stuck.
As for the density of $\mathcal{F}(L_1)$ in $C_0$, every proof I have seen makes use of the Stone-Weierstrass theorem in one way or another. The Stone-Weierstrass theorem is covered in analysis courses (Rudin's more basic book, Principles of Mathematical Analysis, has a discussion of it). Below I outline two solutions.
- Here is a sketch of a classic solution to the problem in the OP. The details of statements 1-4 can be proved with material covered up to chapter 9 in Rudin's book, and I leave them to the OP.
Theorem 9.2 in Rudin's RCA states conditions under which $\widehat{f}=\mathcal{F}f$, $f\in L_1$, is differentiable, with $\widehat{f}'(t)=-i\widehat{xf(x)}(t)$. By induction and under the assumption that $x^nf(x)\in L_1$, it follows that $\widehat{f}\in C^n$ and $\widehat{f}^{(n)}(t)=(-i)^n\widehat{x^nf(x)}(t)$.
On the other hand, if $\phi\in C^\infty\cap L_1$, $\phi'\in L_1$, and $\lim_{|x|\rightarrow\infty}x\phi(x)=0$, then an application of integration by parts yields $\widehat{\phi'}(t)=it\widehat{\phi}(t)$. By induction, if $\phi\in C^\infty$ is such that $\phi^{(k)}\in L_1$ for all $0\leq k\leq n$, then $\widehat{\phi^{(n)}}(t)=(it)^n\widehat{\phi}(t)$. A comment along these lines appears as Remarks 9.3 of Rudin (idem).
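For reference, the integration by parts behind the first identity can be written out as follows. This is only a sketch, assuming the normalization $\widehat{\phi}(t)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\phi(x)e^{-ixt}\,dx$ (the same one used later in this answer for elements of $R$); the boundary term vanishes because $x\phi(x)\rightarrow0$ forces $\phi(x)\rightarrow0$ as $|x|\rightarrow\infty$.

$$\widehat{\phi'}(t)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\phi'(x)e^{-ixt}\,dx=\frac{1}{\sqrt{2\pi}}\Big[\phi(x)e^{-ixt}\Big]_{x=-\infty}^{x=\infty}+\frac{it}{\sqrt{2\pi}}\int_{\mathbb{R}}\phi(x)e^{-ixt}\,dx=it\,\widehat{\phi}(t)$$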
Consider the space $\mathcal{S}\subset C^\infty(\mathbb{R})$ of functions $\phi$ such that for any $n,m\in\mathbb{Z}_+$, the map
$x\mapsto x^m\phi^{(n)}(x)$ is bounded on $\mathbb{R}$. The function $\phi(x)=e^{-x^2}$ is an example of a function in $\mathcal{S}$.
It is easy to check that $\mathcal{S}\subset L_1(\mathbb{R})$: $|\phi(x)|=(1+x^2)\frac{|\phi(x)|}{1+x^2}\leq C\frac{1}{1+x^2}\in L_1$, where $C$ is some constant. Also, for any $m,n\in\mathbb{Z}_+$, $x\mapsto x^m\phi^{(n)}(x)$ is itself a function in $\mathcal{S}$.
These observations, put together, imply that
$$\big|(it)^m \frac{d^n}{dt^n}\widehat{\phi}(t)\big|=\big|(it)^m(-i)^n\widehat{x^n\phi(x)}(t)\big|=\Big| \mathcal{F}\Big(\frac{d^m}{dx^m}\big(x^n\phi(x)\big)\Big)(t)\Big|$$
Since $x\mapsto \frac{d^m}{dx^m}\big(x^n\phi(x)\big)$ is in $\mathcal{S}\subset L_1$,
$$\big|(it)^m \frac{d^n}{dt^n}\widehat{\phi}(t)\big|\leq \sup_{t\in\mathbb{R}}\Big| \mathcal{F}\Big(\frac{d^m}{dx^m}\big(x^n\phi(x)\big)\Big)(t)\Big|\leq\Big\|\frac{d^m}{dx^m}\big(x^n\phi(x)\big)\Big\|_1$$
This shows that the Fourier transform maps $\mathcal{S}$ into $\mathcal{S}$. The Fourier inversion theorem then implies that the Fourier transform maps $\mathcal{S}$ onto $\mathcal{S}$ (in fact $\mathcal{F}^4\phi=\phi$).
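The two rules driving this argument (differentiation turns into multiplication by $it$, and multiplication by $x$ turns into $i\frac{d}{dt}$) can be sanity-checked numerically on the Gaussian $\phi(x)=e^{-x^2/2}$. What follows is only a rough sketch: it assumes the normalization $\widehat{\phi}(t)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\phi(x)e^{-ixt}\,dx$, and the helper names (`fourier_transform`, `rule_errors`), the truncation window and the step sizes are illustrative choices, not part of the proof.

```python
import cmath
import math


def fourier_transform(f, t, half_width=12.0, steps=4000):
    """Approximate (1/sqrt(2*pi)) * integral of f(x)*exp(-i*x*t) dx over
    [-half_width, half_width] with the composite trapezoidal rule.
    The window and step count are assumptions suited to Gaussian-like f."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-1j * x * t)
    return total * h / math.sqrt(2.0 * math.pi)


def phi(x):
    return math.exp(-x * x / 2.0)


def dphi(x):
    # Derivative of phi, computed by hand.
    return -x * math.exp(-x * x / 2.0)


def x_phi(x):
    return x * math.exp(-x * x / 2.0)


def rule_errors(t, h=1e-4):
    """Return the absolute errors in the two rules at the point t."""
    phi_hat = fourier_transform(phi, t)
    # Rule 1: the transform of phi' equals (i*t) times the transform of phi.
    err_derivative = abs(fourier_transform(dphi, t) - 1j * t * phi_hat)
    # Rule 2: the derivative of phi_hat (central difference with step h)
    # equals -i times the transform of x*phi(x).
    numeric_derivative = (
        fourier_transform(phi, t + h) - fourier_transform(phi, t - h)
    ) / (2.0 * h)
    err_multiplication = abs(
        numeric_derivative - (-1j) * fourier_transform(x_phi, t)
    )
    return err_derivative, err_multiplication
```

Calling `rule_errors(t)` at a few values of `t` should return two small numbers; the second is limited by the finite-difference step rather than by the quadrature.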
- It is easy to check that $\mathcal{S}$ is dense in $C_0$ by the Stone-Weierstrass theorem ($\mathcal{S}$ contains, for example, all smooth functions that have compact support).
Comment: The space $\mathcal{S}$ is of considerable interest on its own. It is called Schwartz space. Under a suitable topology, $\mathcal{S}$ is a locally convex vector space. Its dual is the space of tempered distributions.
- Another, perhaps more abstract, proof can be obtained from the complex Stone-Weierstrass theorem for $\mathcal{C}_0$. Let $R=\mathcal{F}(L_1)$ be the range of the Fourier transform defined on $L_1$. It is known that $R\subset C_0(\mathbb{R};\mathbb{C})$. $R$ is a (complex) vector ring, that is,
- $R$ is a complex linear space,
- $R$ is closed under multiplication (this is basically the effect of Fourier transform on convolution of functions)
- $R$ is closed under conjugation, i.e., if $\phi\in R$, then $\overline{\phi}\in R$. Indeed, if $\phi(t)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(x)e^{-ixt}\,dx$, then $\overline{\phi}(t)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}e^{-ixt}\overline{f(-x)}\,dx$.
- $R$ separates points of $\mathbb{R}$, that is, if $s,t\in\mathbb{R}$ with $s\neq t$, then there is $\phi\in R$ such that $\phi(t)\neq\phi(s)$. Take for instance $f(x)=e^{- x^2/2}$, so that $\phi(t)=\widehat{f}(t)=e^{-t^2/2}$. The family $\{\phi(\cdot-a):a\in\mathbb{R}\}$ separates points (each translate lies in $R$, being the transform of $e^{iax}f(x)$).
By the Stone-Weierstrass theorem, the uniform closure of $R$ in $C_0(\mathbb{R};\mathbb{C})$ is $C_0(\mathbb{R};\mathbb{C})$.
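The separation step can be checked numerically: the translate $\phi(\cdot-a)$ of the Gaussian should be the transform of the modulated function $e^{iax}e^{-x^2/2}$, which is again in $L_1$. The sketch below assumes the normalization $\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(x)e^{-ixt}\,dx$; the names `fourier_transform`, `modulated_gaussian` and `translate_error`, as well as the quadrature parameters, are hypothetical choices made for illustration.

```python
import cmath
import math


def fourier_transform(f, t, half_width=12.0, steps=4000):
    """Approximate (1/sqrt(2*pi)) * integral of f(x)*exp(-i*x*t) dx over
    [-half_width, half_width] with the composite trapezoidal rule.
    The window and step count are assumptions suited to Gaussian-like f."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-1j * x * t)
    return total * h / math.sqrt(2.0 * math.pi)


def gaussian(x):
    return math.exp(-x * x / 2.0)


def modulated_gaussian(a):
    """Return x -> exp(i*a*x) * exp(-x^2/2)."""
    return lambda x: cmath.exp(1j * a * x) * gaussian(x)


def translate_error(a, t):
    """Absolute difference between the numerical transform of the modulated
    Gaussian at t and the translated Gaussian exp(-(t-a)^2/2)."""
    return abs(fourier_transform(modulated_gaussian(a), t) - gaussian(t - a))
```

Given $s\neq t$, taking $a=s$ yields $\phi(s-a)=1$ while $\phi(t-a)=e^{-(t-s)^2/2}<1$, so this single family already separates the two points.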
Here is the (real) version of the Stone-Weierstrass theorem I am considering:
Theorem (l.c.H): Suppose $(X,\tau)$ is a locally compact Hausdorff space and let $\mathcal{E}\subset\mathcal{C}_0(X;\mathbb{R})$ be a Stone lattice or ring. Define $Z_{\mathcal{E}}=\{x\in X: \phi(x)=0,\,\forall\phi\in\mathcal{E}\}$. If $\mathcal{E}$ separates the points of $X\setminus Z_{\mathcal{E}}$, then $\overline{\mathcal{E}}=\{\phi\in\mathcal{C}_0(X;\mathbb{R}):\phi(z)=0,\,\forall z\in Z_{\mathcal{E}}\}$.
In this case, $\mathbb{R}$ with the usual topology is locally compact Hausdorff and $Z_R=\emptyset$. The proof of the theorem above follows from the usual Stone-Weierstrass theorem for compact sets by using the one-point compactification $X\cup\{\Delta\}$.
Here are the classic version of the Stone-Weierstrass theorem and its complex counterpart:
Theorem (real): Suppose $S$ is a compact Hausdorff space and $\mathcal{E}\subset\mathcal{C}(S;\mathbb{R})$ is either a Stone lattice or a ring. Let $Z:=\{x\in S: \phi(x)=0,\,\forall\phi\in \mathcal{E}\}$. If $\mathcal{E}$ separates points of $S_0=S\setminus Z$, i.e., for any points $s,t \in S_0$ such that $s\neq t$ there is $\phi\in\mathcal{E}$ such that $\phi(s)\neq\phi(t)$, then $\overline{\mathcal{E}}^u=\mathcal{C}_Z(S):=\{\phi\in\mathcal{C}(S;\mathbb{R}):\phi(x)=0,\,\forall x\in Z \}$ (when $Z =\emptyset$, $\mathcal{C}_Z(S)=\mathcal{C}(S;\mathbb{R})$).
Theorem (complex): Let $S$ be a compact set and let $\mathcal{E}$ be a complex ring of bounded complex-valued continuous functions on $S$ that
is closed under complex conjugation, i.e., $f\in\mathcal{E}$ implies that
$\overline{f}\in\mathcal{E}$. If $\mathcal{E}$ separates points, then either $\overline{\mathcal{E}}^u=\mathcal{C}(S,\mathbb{C})$, or there is (a unique) $z\in S$ such that $g(z)=0$ for all $g\in\mathcal{E}$ and $\overline{\mathcal{E}}^u=\{f\in\mathcal{C}(S,\mathbb{C}): f(z)=0\}$.