The definitions aren't actually that different. In both, the elements of $V\otimes W$ are equivalence classes of linear combinations of objects labeled by pairs of vectors (one from $V$ and one from $W$). The definitions seem different because in the second one the equivalence classes are seen directly in the notation $Z/E$ and explicitly defined in the process of forming the quotient, whereas in the first one you have to think a bit to realize that equivalence classes are even there.
The equivalence relation in the first definition is, in fact, simple equality: each equivalence class contains those linear combinations of the $[v,w]$ that represent the same function from $\mathcal{L}(V,W;\mathbf{R})$ to $\mathbf{R}$. Remember that an expression like $a_1[v_1,w_1]+\cdots+a_n[v_n,w_n]$ is not the unique name of a function: due to bilinearity, the same function is named by many other expressions of that form.
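To make this concrete, here is a short check, assuming (as I read your first definition) that $[v,w]$ denotes the evaluation function $f\mapsto f(v,w)$ on $\mathcal{L}(V,W;\mathbf{R})$ and that the elements of $\mathcal{L}(V,W;\mathbf{R})$ are bilinear. For every $f$ in that space,

$$[v_1+v_2,w](f)=f(v_1+v_2,w)=f(v_1,w)+f(v_2,w)=\bigl([v_1,w]+[v_2,w]\bigr)(f),$$

so the two expressions $[v_1+v_2,w]$ and $[v_1,w]+[v_2,w]$ name the same function.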
If you have the impression that these equivalence classes are large sets, I've misled you. In fact, each equivalence class is a set of size $1$, since all the different linear combinations it contains are equal: they represent one and the same function, and it is functions that make up the set, not formal expressions representing functions. Here we are simply following the usual convention that, when we define a set, equal expressions represent a single element. You relied on this convention yourself when you wrote $\operatorname{span}\{[v,w]\mid v\in V,\ w\in W\}$ as a shorthand for the set of linear combinations of the functions $[v,w]$. In general, when the equivalence relation is equality, the equivalence classes have size $1$ because of what it means to be an element of a set.
This is why $[[v,w]]$ has to be introduced in the first place: we need a mathematical way of talking about an unevaluated linear combination, that is, of talking about the form of a linear combination (the terms and coefficients it contains) rather than the function from $\mathcal{L}(V,W;\mathbf{R})$ to $\mathbf{R}$ that it represents. We can't do this with $Z'=\operatorname{span}\{[v,w]\mid v\in V,\ w\in W\}$, since its elements are just functions and the usual ways of understanding and manipulating sets don't distinguish between different descriptions of the same element. We can do it in $Z=\operatorname{span}\{[[v,w]]\mid v\in V,\ w\in W\}$, however, since its elements are the formal linear combinations of the pairs $[[v,w]]$, and formally different combinations are distinct elements.
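For example, taking $v_1$ and $v_2$ nonzero and treating the symbols $[[v,w]]$ as a basis of $Z$ (which is how I read your definition):

$$[[v_1+v_2,w]]\neq[[v_1,w]]+[[v_2,w]]\ \text{in } Z,\qquad\text{whereas}\qquad[v_1+v_2,w]=[v_1,w]+[v_2,w]\ \text{in } Z'.$$

The left-hand side of the first relation is a single basis symbol and the right-hand side is a combination of other basis symbols, so the two cannot coincide in $Z$.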
This answers your third question too: $Z/E$, where $E$ is the subspace of $Z$ that you defined above, is isomorphic to $Z'$. In the case of $Z/E$, vast numbers of elements of the enormous, infinite-dimensional space $Z$ are declared equivalent in forming the quotient by $E$. In the case of $Z'$, vast numbers of expressions appearing in the definition of $Z'$ represent the same function and hence contribute just one element to $Z'$. And there is an obvious correspondence between equivalence classes in $Z/E$ and functions in $Z'$: $[[v,w]]+E\mapsto [v,w]$, extended linearly.
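Spelled out as a sketch, and assuming $E$ is spanned by the usual bilinearity relations (elements such as $[[v_1+v_2,w]]-[[v_1,w]]-[[v_2,w]]$ and $[[av,w]]-a[[v,w]]$, together with their counterparts in the second slot), the correspondence comes from the linear map

$$\Phi\colon Z\to Z',\qquad \Phi\Bigl(\sum_i a_i[[v_i,w_i]]\Bigr)=\sum_i a_i[v_i,w_i].$$

$\Phi$ is onto by the definition of $Z'$, and it sends each spanning element of $E$ to $0$ because the functions $[v,w]$ are evaluated on bilinear maps, so it induces a well-defined linear map $Z/E\to Z'$. What is left to check for the isomorphism is that nothing outside $E$ is sent to $0$, that is, $\ker\Phi=E$; one way to do that is to pick bases of $V$ and $W$.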