It's reasonable to question this interchange of operations. You are trying to justify
$$
\frac{d}{dt}\int_{-\infty}^{\infty}e^{-isx}u(t,x)dx = \int_{-\infty}^{\infty}e^{-isx}u_{t}(t,x)dx.
$$
The natural norm for the heat equation is the $L^1(\mathbb{R})$ norm, because temperatures are typically positive and the total heat (or energy) is proportional to the integral of $u$ over all $x\in\mathbb{R}$. The natural conditions to justify this interchange are a continuity condition
$$
\lim_{h\downarrow 0}\int_{-\infty}^{\infty}|u(t+h,x)-u(t,x)|dx =0
$$
and a derivative condition
$$
\lim_{h\downarrow 0}\int_{-\infty}^{\infty}\left|\frac{1}{h}\{u(t+h,x)-u(t,x)\}-u_t(t,x)\right|dx=0.
$$
These two conditions, which treat $t\mapsto u(t,\cdot)$ as a vector-valued function with values in $L^1(\mathbb{R})$, together justify the interchange, provided you assume $u_{t}(t,x)\in L^1(\mathbb{R})$ in the spatial variable $x$. With anything weaker, the interchange becomes almost impossible to justify, which is why the issue is often ignored in order to arrive at a solution.
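As a sketch of why the derivative condition is enough, write $\hat{u}(t,s)=\int_{-\infty}^{\infty}e^{-isx}u(t,x)dx$ (the notation $\hat{u}$ is introduced here only for illustration). Since $|e^{-isx}|=1$, for every real $s$ and every $h > 0$,
$$
\left|\frac{\hat{u}(t+h,s)-\hat{u}(t,s)}{h}-\int_{-\infty}^{\infty}e^{-isx}u_t(t,x)dx\right| \le \int_{-\infty}^{\infty}\left|\frac{1}{h}\{u(t+h,x)-u(t,x)\}-u_t(t,x)\right|dx.
$$
The right side does not depend on $s$ and tends to $0$ as $h\downarrow 0$ by the derivative condition, so the difference quotient of $\hat{u}$ converges, uniformly in $s$, to the integral of $e^{-isx}u_t(t,x)$. As written, this gives the right-hand derivative in $t$; a two-sided derivative would need the same limit as $h\to 0$ from both sides.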
You can show that these conditions do hold for the final solution obtained by the Fourier transform method. Both conditions hold for all $t > 0$. At $t=0$ the first condition holds, but the second may fail: it requires spatial smoothness of the initial data $u(0,x)$. These turn out to be natural conditions in the setting of $C_0$ semigroup theory, at least for $t > 0$. In $C_0$ semigroup theory the first condition must hold at $t=0$, while the second cannot hold unless $u(0,x)$ is in the domain of the generator of the semigroup, which is $\frac{d^2}{dx^2}$ in this case. $C_0$ semigroup theory is a vector-valued theory devised to deal with time evolution systems.
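To indicate why the first condition holds at $t=0$, here is a sketch under the assumption that the equation is normalized as $u_t=u_{xx}$ (suggested by the generator above, but not stated explicitly) and that $u_0=u(0,\cdot)\in L^1(\mathbb{R})$. The Fourier transform method then gives $u(t,\cdot)=G_t*u_0$ with the Gaussian kernel $G_t(x)=\frac{1}{\sqrt{4\pi t}}e^{-x^2/(4t)}$, which has integral $1$. Substituting $z=\sqrt{t}\,y$ in the convolution integral and exchanging the order of integration leads to
$$
\int_{-\infty}^{\infty}|u(t,x)-u_0(x)|dx \le \int_{-\infty}^{\infty}G_1(y)\left(\int_{-\infty}^{\infty}|u_0(x-\sqrt{t}\,y)-u_0(x)|dx\right)dy.
$$
For each fixed $y$ the inner integral tends to $0$ as $t\downarrow 0$ by continuity of translation in $L^1(\mathbb{R})$, and it is bounded by $2\|u_0\|_{L^1}$, so dominated convergence gives the continuity condition at $t=0$. No smoothness of $u_0$ is used here, in contrast with the derivative condition.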
It's reasonable to impose these additional stability conditions and to restrict the functions to $L^1$. Why? Because solutions are not necessarily unique unless you impose such conditions. Requiring the solution to be in $L^1$ and to satisfy stability conditions of this type leads to a unique solution, and that solution can be obtained using the Fourier transform method because the interchange of operations is then permitted.