No, they're not all the same and it's important to understand the differences between them.
Start with a simple optimization problem
$\min f(x)$
subject to
$g(x) = 0$
where we can assume for simplicity that $f$ and $g$ are smooth (at least twice continuously differentiable).
The Lagrangian function is
$L(x,\lambda)=f(x)+\lambda g(x)$
Note that $L$ is a function of $x$ and $\lambda$. The first-order necessary condition for a point $x^{*}$ to be a minimizer is that there is a $\lambda^{*}$ such that $(x^{*},\lambda^{*})$ is a stationary point of $L$. In the method of multipliers, we try to solve the nonlinear system of equations
$\nabla_{x,\lambda} L(x,\lambda)=0$
This is typically done by alternately minimizing with respect to $x$ and updating $\lambda$. Given a Lagrange multiplier estimate $\lambda^{(k)}$, we minimize $L(x,\lambda^{(k)})$ with respect to $x$ to get $x^{(k)}$. Then we update $\lambda$ with
$\lambda^{(k+1)}=\lambda^{(k)} +\alpha_{k} g(x^{(k)})$
where $\alpha_{k}$ is a step size parameter that can be set in various ways.
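To make this concrete, here's a minimal sketch of that iteration in Python, using `scipy.optimize.minimize` for the inner minimization over $x$. The example problem, the fixed step size, and the iteration count are just illustrative choices, not anything from the discussion above.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative problem (my own choice, not from the discussion above):
#   minimize x1^2 + 2*x2^2   subject to   g(x) = x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + 2.0 * x[1]**2
g = lambda x: x[0] + x[1] - 1.0

lam = 0.0        # Lagrange multiplier estimate
alpha = 0.5      # fixed step size, picked by hand for this example
x = np.zeros(2)
for k in range(50):
    # inner step: minimize L(x, lam) = f(x) + lam*g(x) over x,
    # warm-starting from the previous iterate
    x = minimize(lambda x: f(x) + lam * g(x), x).x
    # multiplier update: lam <- lam + alpha * g(x)
    lam = lam + alpha * g(x)

print(x, lam)    # approaches (2/3, 1/3) with lam close to -4/3
```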
A penalty function for our problem is a function that is $0$ if $g(x)=0$ and greater than $0$ when $g(x) \neq 0$. A commonly used penalty function is the quadratic penalty function
$\phi(g(x))=g(x)^{2}$
In the penalty function method, we solve an unconstrained problem of the form
$\min_{x} f(x)+\rho \phi(g(x))$
where $\rho$ is a penalty parameter that is increased until the solution of the penalized problem is close to satisfying $g(x)=0$. Note that $\rho$ is not a Lagrange multiplier in this case.
For problems with inequality constraints of the form $g(x) \leq 0$, a commonly used penalty function is
$\phi(g(x))=\max(g(x),0)^{2}$.
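Here's the same kind of sketch for the quadratic penalty method, using the same made-up equality-constrained problem; the factor-of-ten schedule for increasing $\rho$ and the stopping tolerance are arbitrary illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

# Same illustrative equality-constrained problem as before:
#   minimize x1^2 + 2*x2^2   subject to   g(x) = x1 + x2 - 1 = 0
f      = lambda x: x[0]**2 + 2.0 * x[1]**2
grad_f = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])
g      = lambda x: x[0] + x[1] - 1.0
grad_g = lambda x: np.array([1.0, 1.0])

rho = 1.0
x = np.zeros(2)
for k in range(8):
    # minimize the penalized objective f(x) + rho*g(x)^2 for the current rho;
    # the exact gradient is supplied so the subproblem is solved accurately
    x = minimize(lambda x: f(x) + rho * g(x)**2, x,
                 jac=lambda x: grad_f(x) + 2.0 * rho * g(x) * grad_g(x)).x
    if abs(g(x)) < 1e-6:   # stop once the constraint is nearly satisfied
        break
    rho *= 10.0            # otherwise increase the penalty parameter

print(x, g(x), rho)        # x approaches (2/3, 1/3); g(x) shrinks like 1/rho
```

Note that the iterates only satisfy the constraint approximately, and the penalized subproblems become increasingly ill-conditioned as $\rho$ grows; that's a large part of why the augmented Lagrangian approach below tends to work better.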
An augmented Lagrangian function combines the penalty function idea with the Lagrangian:
$\hat{L}(x,\lambda; \rho)=f(x)+\lambda g(x) + \rho \phi(g(x))$
Augmented Lagrangian methods minimize $\hat{L}$ with respect to $x$, update the
Lagrange multiplier estimate $\lambda$ and then (if necessary) update the penalty parameter $\rho$ in each iteration. In practice, augmented Lagrangian methods outperform simple penalty methods and the method of multipliers.
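A minimal sketch of that loop on the same toy problem follows. The multiplier update $\lambda \leftarrow \lambda + 2\rho\, g(x)$ is the first-order update that matches the $\rho\, g(x)^{2}$ form of the penalty term used above, and the rule for increasing $\rho$ (only when the constraint violation isn't shrinking fast enough) is just one simple choice among many.

```python
import numpy as np
from scipy.optimize import minimize

# Same illustrative problem as before:
#   minimize x1^2 + 2*x2^2   subject to   g(x) = x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + 2.0 * x[1]**2
g = lambda x: x[0] + x[1] - 1.0

lam, rho = 0.0, 1.0
x = np.zeros(2)
viol_prev = np.inf
for k in range(20):
    # minimize the augmented Lagrangian f + lam*g + rho*g^2 over x
    x = minimize(lambda x: f(x) + lam * g(x) + rho * g(x)**2, x).x
    # first-order multiplier update (the factor 2 matches the rho*g^2 penalty term)
    lam = lam + 2.0 * rho * g(x)
    viol = abs(g(x))
    if viol < 1e-6:
        break
    if viol > 0.25 * viol_prev:   # feasibility not improving fast enough,
        rho *= 10.0               # so increase the penalty parameter
    viol_prev = viol

print(x, lam, rho)   # approaches (2/3, 1/3) with lam close to -4/3
```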
Merit functions are used in a variety of nonlinear programming algorithms. You'll most commonly see them used in sequential quadratic programming (SQP) methods. In these methods, a search direction, $d^{(k)}$, is computed at each iteration. The step is from $x^{(k)}$ to
$x^{(k+1)}=x^{(k)}+\alpha_{k} d^{(k)}$
where the step size parameter $\alpha_{k}$ is determined by minimizing a
merit function
$\min_{\alpha} M(x^{(k)}+\alpha d^{(k)})$
The merit function is typically something like a penalized objective function or an augmented Lagrangian, but there's a great deal of freedom in the form of the merit function.
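As a rough sketch of how this looks, here's an $\ell_{1}$ penalty merit function minimized along a direction with `scipy.optimize.minimize_scalar`. The current iterate, the search direction, and the penalty weight $\mu$ are all made up for illustration; in a real SQP code $d^{(k)}$ would come from a quadratic programming subproblem, and the line search would typically only require sufficient decrease in the merit function rather than an exact minimum.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Same illustrative problem; x_k and d_k are made-up stand-ins for a current
# SQP iterate and the search direction its QP subproblem might produce.
f = lambda x: x[0]**2 + 2.0 * x[1]**2
g = lambda x: x[0] + x[1] - 1.0
mu = 10.0                                  # penalty weight in the merit function

def merit(x):
    # an l1-penalty merit function: objective plus weighted constraint violation
    return f(x) + mu * abs(g(x))

x_k = np.array([0.0, 0.0])
d_k = np.array([2.0, 1.0])

# choose the step size by minimizing the merit function along the direction
res = minimize_scalar(lambda a: merit(x_k + a * d_k),
                      bounds=(0.0, 1.0), method='bounded')
x_next = x_k + res.x * d_k
print(res.x, x_next)   # for this made-up direction the step lands near (2/3, 1/3)
```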
These functions and the associated methods are described in many textbooks on nonlinear optimization. A good discussion can be found in *Numerical Optimization* by Nocedal and Wright.