Let's take a function $f(x)=x^2$ and a point $x_0$. Calculate the difference:
$$\Delta y = f(x_0+\Delta x)-f(x_0) = (x_0+\Delta x)^2-x_0^2 = 2x_0\cdot \Delta x + (\Delta x)^2$$
Hmm, it looks like $\Delta y = a\cdot \Delta x + (\Delta x)^2$, where $a=2x_0$ is a number. Assuming it is not zero (i.e. $x_0\neq 0$), we see that when $|\Delta x| \ll |a|$ the second term is negligible, so $\Delta y \approx a\cdot\Delta x$ (and the quality of this approximation is governed by how strongly the inequality $|\Delta x| \ll |a|$ holds).
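A quick numerical check of this (a sketch in Python; the sample point $x_0=1$ is an arbitrary choice, not anything from the discussion above) shows that the error $\Delta y - a\cdot\Delta x$ is exactly $(\Delta x)^2$, so it shrinks much faster than the linear term:

```python
# For f(x) = x**2 we have Delta_y = 2*x0*dx + dx**2,
# so the linear part a*dx dominates once |dx| << |a|.
def f(x):
    return x ** 2

x0 = 1.0        # arbitrary sample point, so a = 2*x0 = 2.0
a = 2 * x0

for dx in [0.1, 0.01, 0.001]:
    dy = f(x0 + dx) - f(x0)   # exact increment
    linear = a * dx           # linear approximation
    error = dy - linear       # equals dx**2, up to float rounding
    print(dx, dy, linear, error)
```

Shrinking $\Delta x$ by a factor of 10 shrinks the error by a factor of 100, which is exactly the $(\Delta x)^2$ behaviour.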
Now let's think about infinitesimals. What do we mean by saying that $\Delta x$ is infinitesimally small? We could try to define:
$\Delta x$ is infinitesimal if for all real $a\neq 0$ we have $|\Delta x| < |a|$
But... there is no such non-zero "infinitesimal": for every non-zero $\Delta x$ we can choose $a=\Delta x /2$, which violates the condition. We either need to invent a new kind of number or accept "infinitesimal" as a loosely defined concept, meaning "for very small $\Delta x$ we have $\Delta y\approx a\cdot\Delta x$".
As you will see, in mathematics we usually don't speak about "infinitesimal" $dy$ and $dx$ – we speak about the ratio $dy/dx$, which often happens to be a real number. (Such a ratio need not exist: for example, if your function "jumps", i.e. is not continuous.)
I'd suggest thinking about them in a loose way ("$dy/dx \approx \Delta y/\Delta x$, and the approximation becomes better and better as we take $\Delta x$ smaller and smaller"), while also learning rigorous analysis, in which you work with derivatives rather than infinitesimal quantities. Eventually you will get used to this way of thinking and to proving rigorous theorems – good luck!
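The loose picture above can be seen numerically. Here is a small sketch (again with $f(x)=x^2$ and the arbitrary choice $x_0=1$) where the difference quotient $\Delta y/\Delta x$ visibly approaches the derivative $f'(x_0)=2x_0=2$:

```python
# The difference quotient Delta_y / Delta_x for f(x) = x**2 at x0.
# Algebraically it equals 2*x0 + dx, so it tends to f'(x0) = 2*x0
# as dx -> 0.
def f(x):
    return x ** 2

x0 = 1.0  # arbitrary sample point
for dx in [0.1, 0.01, 0.001, 0.0001]:
    quotient = (f(x0 + dx) - f(x0)) / dx
    print(dx, quotient)
```

Each printed quotient is $2 + \Delta x$, so halving $\Delta x$ halves the distance to the true derivative $2$.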
(A small disclaimer: there is also a way to make sense of "infinitesimal" equations like $dy = f'(x)\,dx$ using the language of differential forms, but then both sides are not numbers – they are linear functions. You can read about this stuff, for example, in the books mentioned here.)