This proof is the proverbial "long and winding road". It can be made somewhat less tedious by the use of Dynkin's multiplicative system theorem, which I state and discuss in this answer.
Note that the random variables $W_t : W \to \mathbb{R}$ are just the evaluation functionals on $W$: $W_t(\omega) = \omega(t)$, which are continuous with respect to the norm topology on $W$.
Lemma 1. The random variables $W_t, t \in [0,1]$, generate the Borel $\sigma$-field $\mathcal{B}$ of $W$. That is, any $\sigma$-field on $W$ which makes all $W_t$ measurable must contain $\mathcal{B}$.
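Proof. Let $\bar{B}(\omega_0, r)$ be a closed ball in $W$, so that $\omega \in \bar{B}(\omega_0, r)$ iff $|\omega(t) - \omega_0(t)| \le r$ for all $t \in [0,1]$. By continuity, it is sufficient that this hold for all $t \in [0,1] \cap \mathbb{Q}$. (Closed balls are used here because a strict inequality at every rational $t$ only yields $\le r$ at irrational $t$.) Then we have
$$\bar{B}(\omega_0,r) = \bigcap_{t \in \mathbb{Q} \cap [0,1]} W_t^{-1}([\omega_0(t) - r, \omega_0(t) + r]).$$
This is a countable intersection of preimages of intervals under the $W_t$, so it belongs to any $\sigma$-field that makes all $W_t$ measurable. Each open ball is a countable union of closed balls, $B(\omega_0, r) = \bigcup_{n} \bar{B}(\omega_0, r - 1/n)$, so such a $\sigma$-field must contain all the open balls. Since $W$ is separable, every open set is a countable union of open balls, so the open balls generate $\mathcal{B}$, and the $\sigma$-field must contain $\mathcal{B}$ as well. (Conversely, each $W_t$ is continuous and hence Borel, so the $\sigma$-field generated by the $W_t$ is exactly $\mathcal{B}$.)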
Lemma 2. The random variables $f(W_t), f \in \mathcal{S}(\mathbb{R}), t \in [0,1]$, also generate $\mathcal{B}$.
Proof. Take a sequence of functions $f_n \in \mathcal{S}(\mathbb{R})$ converging pointwise to the identity function $x$. Then $f_n(W_t) \to W_t$ pointwise, so any $\sigma$-field making all $f(W_t)$ measurable also makes all $W_t$ measurable, and hence by the previous lemma it contains $\mathcal{B}$.
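One concrete choice of such a sequence, offered as an illustration rather than the only option, is
$$f_n(x) = x\, e^{-x^2/n}, \qquad n = 1, 2, \dots$$
Each $f_n$ is a polynomial times a Gaussian, hence Schwartz, and for fixed $x$ we have $e^{-x^2/n} \to 1$, so $f_n(x) \to x$. Thus $W_t = \lim_n f_n(W_t)$ is a pointwise limit of measurable functions, which is what the proof uses.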
Now we invoke Dynkin. Let $M$ be the set of all random variables of the form $f(W_{t_1}, \dots, W_{t_n})$ for $f \in \mathcal{S}(\mathbb{R}^n)$. It is easily checked that $M$ is a multiplicative system (if $X,Y \in M$ then $XY \in M$), because a product of two Schwartz functions is again Schwartz. Also, $M$ contains every $f(W_t)$ from Lemma 2 (take $n = 1$), so $M$ generates $\mathcal{B}$.
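To spell out the product closure when the two variables depend on different times, here is a short sketch; the names $g$, $h$, and $s_1, \dots, s_m$ are notation introduced only for this illustration. Suppose $X = f(W_{t_1}, \dots, W_{t_n})$ and $Y = g(W_{s_1}, \dots, W_{s_m})$ with $f \in \mathcal{S}(\mathbb{R}^n)$ and $g \in \mathcal{S}(\mathbb{R}^m)$. Then
$$XY = h(W_{t_1}, \dots, W_{t_n}, W_{s_1}, \dots, W_{s_m}), \qquad h(x,y) = f(x)\,g(y).$$
Every partial derivative of $h$ factors as a derivative of $f$ times a derivative of $g$, and
$$(1 + |x|^2 + |y|^2)^k \le (1 + |x|^2)^k (1 + |y|^2)^k,$$
so the rapid decay of $f$ and $g$ gives rapid decay of $h$. Hence $h \in \mathcal{S}(\mathbb{R}^{n+m})$ and $XY \in M$.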
Let $E$ be the $L^2$-closure of $M$, and let $H$ be the set of all bounded Borel functions which are in $E$. (I would just say $H = E \cap L^\infty(\mu)$, but technically Dynkin's theorem is about measurable functions, not equivalence classes thereof.) Clearly $H$ is a vector space and $M \subset H$.
To see $H$ is closed under bounded convergence, suppose that $X_n \in H$, $X_n \to X$ pointwise, and $|X_n| \le C$ for all $n$. Then since $\mu$ is a finite measure, by dominated convergence we have $X_n \to X$ in $L^2(\mu)$. $E$ was $L^2$-closed, so $X \in E$ and hence $X \in H$.
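For completeness, the domination used here can be written out. Since $|X_n| \le C$ for all $n$, the pointwise limit also satisfies $|X| \le C$, so
$$|X_n - X|^2 \le (|X_n| + |X|)^2 \le 4C^2, \qquad \int_W 4C^2 \, d\mu = 4C^2 \mu(W) < \infty.$$
The constant $4C^2$ is therefore an integrable dominating function, and dominated convergence gives $\int_W |X_n - X|^2 \, d\mu \to 0$. The same bound shows that $X$ is a bounded Borel function, as membership in $H$ requires.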
To see $H$ contains the constants, let $f_n \in \mathcal{S}(\mathbb{R})$ with $f_n \to 1$ pointwise and boundedly. Then by dominated convergence $f_n(W_t) \to 1$ in $L^2$, so $1 \in E$ and hence $1 \in H$.
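As a sanity check, the convergence can be illustrated numerically. The sketch below assumes that $\mu$ is Wiener measure, so that $W_t \sim N(0,t)$, and it picks the particular sequence $f_n(x) = e^{-x^2/n}$; both are assumptions made for this illustration, and the function names are hypothetical. Under these assumptions, $E[e^{-a W_t^2}] = (1 + 2at)^{-1/2}$ gives the closed form $\|1 - f_n(W_t)\|_{L^2}^2 = 1 - 2(1 + 2t/n)^{-1/2} + (1 + 4t/n)^{-1/2}$, which the code compares against a direct Riemann sum.

```python
import numpy as np

def l2_error_numeric(n, t=1.0, half_width=12.0, points=200_001):
    """Approximate E[(1 - f_n(W_t))^2] for f_n(x) = exp(-x^2/n), W_t ~ N(0, t)."""
    # Uniform grid covering +/- half_width standard deviations of W_t.
    x = np.linspace(-half_width, half_width, points) * np.sqrt(t)
    dx = x[1] - x[0]
    # Density of N(0, t); this is the assumed law of W_t under Wiener measure.
    density = np.exp(-x**2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    integrand = (1.0 - np.exp(-x**2 / n)) ** 2 * density
    # Riemann sum; the integrand is negligible at the grid endpoints.
    return float(np.sum(integrand) * dx)

def l2_error_exact(n, t=1.0):
    """Closed form, using E[exp(-a W_t^2)] = (1 + 2 a t)^(-1/2)."""
    return 1.0 - 2.0 * (1.0 + 2.0 * t / n) ** -0.5 + (1.0 + 4.0 * t / n) ** -0.5

# Print the squared L^2 distance from f_n(W_1) to the constant 1 for growing n.
for n in (1, 10, 100, 1000):
    print(n, l2_error_numeric(n), l2_error_exact(n))
```

The printed errors should shrink as $n$ grows, consistent with $f_n(W_t) \to 1$ in $L^2$. This only checks one choice of $f_n$ and is not part of the proof.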
Having verified the myriad hypotheses of Dynkin's theorem, we obtain its conclusion: that $H$ contains all bounded $\sigma(M)$-measurable functions. We know from Lemma 2 that $\sigma(M) \supset \mathcal{B}$, so in fact $H$ contains all the bounded Borel functions. Hence so does $E$. But the bounded Borel functions are dense in $L^2(\mu)$. Since $E$ is the closure of $M$ and contains a dense set, $M$ must itself be dense, and we are done.
With regard to your hint: Dynkin's theorem is sort of a "functional monotone class lemma". You could also use the ordinary monotone class lemma: let $\mathcal{P}$ be something like the collection of all events of the form $\{X \ne 0\}$ where $X \in M$, and let $\mathcal{L}$ be all the events $A$ such that $1_A$ is in $E$. Show that $\sigma(\mathcal{P}) = \mathcal{B}$, that $\mathcal{P} \subset \mathcal{L}$, that $\mathcal{P}$ is closed under intersection, and that $\mathcal{L}$ is a monotone class (in the sense of the $\pi$-$\lambda$ theorem: it contains $W$ and is closed under proper differences and increasing unions). Conclude that $\mathcal{B} \subset \mathcal{L}$, hence $E$ contains all indicator functions, hence all simple functions, hence is dense.
I don't see how the martingale convergence theorem is useful here, though. The martingale representation theorem would be helpful but I think its proof depends on this very fact or something similar.