Getting the precise asymptotic growth is not so easy and is usually done using the saddle point method. Getting the asymptotic growth up to a factor of $n$, with matching upper and lower bounds, can be done in a relatively low-tech, elementary way as follows. First, a lower bound:
Lower bound: We have
$$B_n \ge \frac{k^n}{k!}$$
for every positive integer $k$.
Note that this is slightly better (by a factor of $e$) than what we get from picking a single term in Dobinski's formula.
Proof. By Burnside's lemma, the number of orbits of the action of $S_k$ on the set of functions $[n] \to [k]$ (where $[n] = \{ 1, 2, \dots, n \}$) is the average number of fixed points over all permutations, and the identity permutation alone contributes $k^n$ fixed points; hence $\frac{k^n}{k!}$ is a lower bound on the number of orbits. These orbits correspond to partitions of $[n]$ into at most $k$ non-empty subsets, so there are at most $B_n$ of them. $\Box$
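As a quick sanity check (a sketch, not part of the argument; exact Bell numbers come from sympy), the inequality is easy to verify for small parameters:

```python
from math import factorial

from sympy import bell  # exact Bell numbers

# Verify the Burnside lower bound B_n >= k^n / k! for small n and k,
# in the equivalent integer form k! * B_n >= k^n to stay exact.
for n in range(1, 13):
    for k in range(1, n + 1):
        assert factorial(k) * bell(n) >= k**n, (n, k)
print("B_n >= k^n / k! holds for all 1 <= k <= n <= 12")
```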
For fixed $k$ this already shows (without Dobinski's formula) that $B_n$ grows faster than any exponential. But the real significance of this bound is that we can optimize $k$ as a function of $n$. We consider the function $f(k) = \frac{k^n}{k!}$ and compute the effect of increasing $k$ by $1$, namely
$$\frac{f(k+1)}{f(k)} = \frac{(k+1)^n}{k^n (k+1)} = \frac{\left( 1 + \frac{1}{k} \right)^n}{k+1}.$$
We want $k = k_n$ to be the smallest positive integer such that this ratio is $\le 1$, so that increasing $k$ no longer improves the bound. Equivalently, if $x_n$ denotes the unique positive real solution to the equation
$$x + 1 = \left( 1 + \frac{1}{x} \right)^n$$
(the LHS is increasing in $x$ and the RHS is decreasing, so the solution exists and is unique), then $k_n = \lceil x_n \rceil$. Applying the trusty exponential inequality $(1 + t)^n \le e^{nt}$ (valid for $t \ge -1$) to the RHS gives
$$x_n \le x_n + 1 \le \exp \left( \frac{n}{x_n} \right),$$
and multiplying the outer inequality $x_n \le \exp \left( \frac{n}{x_n} \right)$ through by $\frac{n}{x_n}$ yields
$$\frac{n}{x_n} \exp \left( \frac{n}{x_n} \right) \ge n.$$
The equation $we^w = n$ has solution $w = W(n)$, where $W$ is the Lambert $W$ function; this is, as far as I know, the simplest argument explaining why the Lambert $W$ function appears in the asymptotics of the Bell numbers. Since $w \mapsto we^w$ is increasing for $w \ge 0$, the inequality above forces
$$\begin{align*} \frac{n}{x_n} &\ge W(n) \Leftrightarrow \\
x_n &\le \frac{n}{W(n)} \Rightarrow \\
k_n &\le \boxed{ \left\lceil \frac{n}{W(n)} \right\rceil }. \end{align*}$$
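To see how this plays out numerically, here is a small sketch (the helper `x_n` is just my name for a bisection solver; $W$ comes from `scipy.special.lambertw`), solving for $x_n$ and comparing $k_n = \lceil x_n \rceil$ against the bound $\lceil n/W(n) \rceil$:

```python
from math import ceil, log, log1p

from scipy.special import lambertw  # W(n): the solution of w e^w = n

def x_n(n, iters=100):
    """Solve x + 1 = (1 + 1/x)^n for x > 0 by bisection.

    In log form the gap n*log(1 + 1/x) - log(x + 1) is strictly
    decreasing in x, so the positive root is unique."""
    lo, hi = 1e-9, float(max(n, 2))
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n * log1p(1 / mid) >= log(mid + 1):
            lo = mid  # still below the root
        else:
            hi = mid
    return lo

for n in [10, 100, 1000]:
    x = x_n(n)
    bound = n / lambertw(n).real  # the upper bound n / W(n)
    print(f"n={n}: x_n={x:.4f}, k_n={ceil(x)}, "
          f"n/W(n)={bound:.4f}, ceil={ceil(bound)}")
```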
We can prove a matching lower bound on $k_n$, but we don't really need it; using this choice of $k$ is already enough, and it gives:
Corollary: We have
$$\boxed{ B_n \ge \frac{\left\lceil \frac{n}{W(n)} \right\rceil^n}{\left\lceil \frac{n}{W(n)} \right\rceil!} }.$$
This lower bound has the correct asymptotic growth up to a factor of $O(\sqrt{n})$ or so. You can get something that looks more like the other bounds on Wikipedia, or in Claude's answer and comment, by applying Stirling's formula. I like this bound because it has a nice conceptual interpretation: heuristically speaking, it says that a random set partition resembles the set partition obtained from the fibers of a random function $[n] \to [k]$ where $k \approx \frac{n}{W(n)}$. This suggests the following, all of which are true:
- the expected number of parts of a random set partition is roughly $\frac{n}{W(n)}$ (checked numerically right after this list),
- the expected size of a random part of a random set partition is roughly $W(n)$,
- the expected number of parts of size $k$ has roughly a Poisson shape $\frac{W(n)^k}{k!}$.
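The first bullet is cheap to check exactly for moderate $n$ (a sketch via sympy's Stirling and Bell numbers; the mean below is $\sum_k k\,S(n,k)/B_n$, straight from the definition, so no simulation is needed). The agreement is loose, as befits a heuristic:

```python
from scipy.special import lambertw
from sympy import bell
from sympy.functions.combinatorial.numbers import stirling  # kind=2 default

# Exact mean number of parts of a uniform random partition of [n].
for n in [10, 30, 100]:
    mean_parts = sum(k * stirling(n, k) for k in range(1, n + 1)) / bell(n)
    print(f"n={n}: E[#parts] = {float(mean_parts):.3f}, "
          f"n/W(n) = {n / lambertw(n).real:.3f}")
```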
In turn, one way to motivate the significance of the number $k \approx \frac{n}{W(n)}$ here is that it inverts the coupon collector time! It is in fact the exact solution to $n = k \log k$, so it tells us when a random function $[n] \to [k]$ is likely to be surjective, hence when the fibers of such a function are likely to all be non-empty and so form a partition of $[n]$ into exactly $k$ parts.
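To spell that out: by definition $W(n) e^{W(n)} = n$, so $k = \frac{n}{W(n)} = e^{W(n)}$, and therefore

$$k \log k = e^{W(n)} \cdot W(n) = n.$$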
There is a matching (up to a factor of $O(n)$) upper bound which can be derived by a similar argument, but we need to reason about the decomposition $B_n = \sum_k S(n, k)$ of the Bell numbers into the Stirling numbers of the second kind; I will edit it in later.
Edit: Here is a sketch of the upper bound. We use:
Upper bound: We have
$$S(n, k) \le \frac{k^n}{k!}$$
for every $1 \le k \le n$.
Proof. $S(n, k) \, k!$ is the number of surjective functions $[n] \to [k]$, which is at most the number of all functions, namely $k^n$. $\Box$
Corollary: We have
$$B_n \le \sum_{k=1}^n \frac{k^n}{k!}.$$
Note the similarity to Dobinski's formula, but without the factor of $e^{-1}$, and truncated to $n$ terms. Now we can apply the following simple idea: just bound this sum by $n$ times its largest term. We already determined that this largest term occurs at $k = k_n$, so we get:
Corollary: We have
$$\boxed{ B_n \le n \frac{k_n^n}{k_n!} }$$
where $k_n$ maximizes $\frac{k^n}{k!}$.
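The sandwich is easy to watch numerically (a sketch; exact Bell numbers from sympy, and the largest term found by direct search):

```python
from math import factorial

from sympy import bell

# Check the sandwich: max term <= B_n <= sum <= n * max term.
for n in [10, 20, 50]:
    terms = [k**n / factorial(k) for k in range(1, n + 1)]
    top, total = max(terms), sum(terms)
    print(f"n={n}: k_n^n/k_n! = {top:.3e}, B_n = {float(bell(n)):.3e}, "
          f"sum = {total:.3e}, n*max = {n * top:.3e}")
```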
This is within exactly a factor of $n$ of our lower bound, which was actually $B_n \ge \frac{k_n^n}{k_n!}$. From here we finally need a lower bound on $k_n$ to locate it more precisely; I believe, but have not carefully checked, that it is always equal to either $\left\lceil \frac{n}{W(n)} \right\rceil$ or $\left\lfloor \frac{n}{W(n)} \right\rfloor$.
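That belief is at least easy to probe numerically (a sketch, not a proof; `argmax_k` is my helper, and working with logs via `lgamma` avoids overflow). If the belief is right, the printed list is empty:

```python
from math import ceil, floor, lgamma, log

from scipy.special import lambertw

def argmax_k(n):
    """The index k_n maximizing k^n / k!, located by comparing logs."""
    return max(range(1, n + 1), key=lambda k: n * log(k) - lgamma(k + 1))

# Collect any n for which k_n is neither floor nor ceil of n / W(n).
bad = [n for n in range(2, 2001)
       if argmax_k(n) not in (floor(n / lambertw(n).real),
                              ceil(n / lambertw(n).real))]
print("exceptions up to 2000:", bad)
```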
This means either our lower or our upper bound must be correct to within a factor of $\sqrt{n}$. A natural guess is that we should take the geometric mean of our bounds to get a better estimate, giving
$$B_n \approx \sqrt{n} \frac{k_n^n}{k_n!}$$
whose accuracy we can test numerically. For $n = 10$ we have
$$B_{10} = \boxed{ 115975 }$$
$$\frac{10}{W(10)} = 5.72 \dots $$
$$k_{10} = 6$$
$$\frac{6^{10}}{6!} = 83980.8$$
$$10 \frac{6^{10}}{6!} = 839808$$
$$\sqrt{10} \frac{6^{10}}{6!} = 265570.6 \dots $$
which is not bad but suggests that a better estimate, taking the hint from Dobinski's formula, would be to divide by $e$; this gives
$$e^{-1} \sqrt{10} \frac{6^{10}}{6!} = \boxed{ 97697.9 \dots }$$
and suggests (without proof) the estimate
$$\boxed{ B_n \approx e^{-1} \sqrt{n} \frac{ \left( \frac{n}{W(n)} \right)^n}{ \left( \frac{n}{W(n)} \right)!} }$$
which I believe (but have not checked carefully) is correct asymptotically, or close to it, possibly off by a small constant factor; here the non-integer factorial should be interpreted via the Gamma function, $z! = \Gamma(z + 1)$. Comparing to the other known asymptotics can be done, again, using Stirling's formula.
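For what it's worth, the estimate is easy to test beyond $n = 10$ (a sketch; I read the non-integer factorial as $\Gamma(z+1)$ via `lgamma` and work in logs to avoid overflow, with exact $B_n$ from sympy). If the guess is close, the printed ratios should stay near a constant:

```python
from math import exp, lgamma, log

from scipy.special import lambertw
from sympy import bell
from sympy import log as slog

# Ratio of the proposed estimate e^{-1} sqrt(n) a^n / Gamma(a + 1),
# where a = n / W(n), to the exact Bell number B_n.
for n in [10, 30, 100, 300]:
    a = n / lambertw(n).real
    log_est = -1 + 0.5 * log(n) + n * log(a) - lgamma(a + 1)
    log_bn = float(slog(bell(n)))  # log of the exact Bell number
    print(f"n={n}: estimate / B_n = {exp(log_est - log_bn):.4f}")
```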
I think we should be able to recover the correct square root factor rigorously from Dobinski's formula by estimating the contributions of the terms other than the dominant term $k = k_n$ more closely; they should end up looking approximately like a Gaussian times the dominant term, so we'll get a factor that looks like a Gaussian integral.