I have a problem with the definition of a probability density function (PDF).

Usually this concept is defined in terms of a given distribution function, but I would like to know whether it can be defined in one shot (i.e. for both the discrete and the continuous case) without passing through the CDF.

Thus, can we say that a PDF is any function $f : \mathbb{R} \to [0, 1]$ that satisfies the following two basic requirements?

  1. $f \geq 0$,
  2. $\int f d \lambda = 1$, where $\lambda$ is the Lebesgue measure on $\mathbb{R}$.

If this is correct, does this definition encompass in one shot both the discrete and continuous case (thanks to the Lebesgue integration)?

I would say no: condition (2) seems ill-defined for the discrete case, since it is based on the Lebesgue measure, under which every point has measure zero.

  • Is this last intuition correct (which would imply that we need to explicitly add a third condition, with summation instead of integration, to deal with the discrete case), or is my intuition simply wrong?
  • If it is wrong, what am I missing?
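
A worked check of the measure-zero concern may help here. This is only a sketch: the mass function $p$, its countable support $S$, and the counting measure $\nu$ are notation introduced for illustration, not part of the question. Take a discrete distribution with $p(x) > 0$ for $x \in S$, $p(x) = 0$ otherwise, and $\sum_{x \in S} p(x) = 1$. Regarded as a function on $\mathbb{R}$, $p$ is nonnegative and measurable, so condition (2) can be evaluated for it, but it fails:

$$\lambda(S) = \sum_{x \in S} \lambda(\{x\}) = 0 \quad\Longrightarrow\quad \int p \, d\lambda = \int_S p \, d\lambda = 0 \neq 1.$$

Replacing $\lambda$ by the counting measure $\nu$ on $S$ turns the same integral into the familiar sum:

$$\int p \, d\nu = \sum_{x \in S} p(x) = 1.$$

Under these assumptions, the two conditions keep their form in both cases; what changes is the reference measure against which $f$ is integrated.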

As always, any feedback is enormously appreciated.
Thank you for your time.

Kolmin
  • Any Lebesgue measurable function with those two properties defines some density, yes. Part of the advantage of passing through the CDF is that the CDF is unique but the PDF is not. – Ian Oct 12 '16 at 14:55
  • Thanks! Do you mean that it is not unique in the sense that it is unique only up to a.e. equivalence? – Kolmin Oct 12 '16 at 14:59
  • Also, am I right in writing that condition (2) saves the day through Lebesgue integration? – Kolmin Oct 12 '16 at 15:00
  • @Ian: I just edited the question to emphasize some points that are still not clear to me. Thanks for any feedback (just in case)! :) – Kolmin Oct 12 '16 at 18:30
  • Continuous random variables are those that have PDFs. Discrete random variables have PMFs. You can unify the two using distribution theory if you really want. – Ian Oct 12 '16 at 18:32
  • What you're defining are called absolutely continuous probability measures. I think if you look up the measure theoretic theory of probability, you'll have a lot of your questions answered. Also, note that you can't take any function and integrate with respect to Lebesgue measure; it has to be measurable, of course. – Callus - Reinstate Monica Oct 12 '16 at 18:33
  • This definition does not encompass both the discrete and the continuous case. It is the definition for a continuous RV. Test this definition on any of your favorite discrete distributions. – Ranc Oct 12 '16 at 18:36
  • @Ian: Thus, just to get the point straight, distribution theory is where I can find a definition of density function that is independent of distribution functions and unifies the cases of continuous and discrete r.v., right? – Kolmin Oct 12 '16 at 22:21
  • @Ranc: Thus, you are saying that I do indeed need a third condition that deals with the specific case of discrete r.v., right? Would this do the job? – Kolmin Oct 12 '16 at 22:23
  • @Callus: Of course it has to be measurable. Beyond that, when you refer to absolute continuity, are you implicitly referring to the representability of a distribution function in terms of a density function? Because this is exactly what I don't want, namely a reference to distribution functions when talking about density functions. Can we basically say that the take-home lesson is that we need three conditions, where the third one deals with the case of discrete r.v.? – Kolmin Oct 12 '16 at 22:26
  • The kinds of variables that have classical densities are continuous random variables. These are better referred to as absolutely continuous random variables, although this terminology is rarely used in practice. This means that the CDF is absolutely continuous (as a real-valued function of a real variable). It also means that the pushforward measure, aka the law, of the random variable is absolutely continuous with respect to the Lebesgue measure. The latter is a definition of "continuous random variable" which strictly speaking does not involve the CDF. – Ian Oct 12 '16 at 22:31
  • (Cont.) Another such definition is the one that you gave prior to your edits. Anyway, by considering "densities" which are no longer functions, you can assign a "density" to any real-valued or vector-valued random variable. For example, for a discrete random variable the density is a linear combination of Dirac delta functions. But now the notion of "integration" is not really Lebesgue integration anymore. – Ian Oct 12 '16 at 22:31
  • @Ian Thanks a lot for your comments: a lot of food for thought. I have basically two comments. First, you write "The kinds of variables that have classical densities are continuous random variables". In a sense, my question is the following: when you write something like this (something that I do actually understand), what is the definition of density you have in mind? Is there a definition of density that is independent of distributions (i.e. something that looks like what I wrote)? – Kolmin Oct 13 '16 at 13:48
  • @Ian The second point is that I am not completely sure I follow your second comment: for example, what do you mean when you write that this is not Lebesgue integration anymore? Could you give me some references about the context/field you have in mind? – Kolmin Oct 13 '16 at 13:49
  • A "classical" density is a nonnegative Lebesgue measurable function whose Lebesgue integral over the whole line (or $\mathbb{R}^n$) is $1$. So this is pretty much what you were writing before your edits (except that densities do not have to be bounded), and yes, this makes sense without passing through the CDF. – Ian Oct 13 '16 at 13:55
  • We sometimes treat distributions (in the sense of distribution theory, not probability theory) which aren't functions as if they were functions. Certain types of such distributions, such as the Dirac delta or the weak derivative of the Cantor function, correspond to probability distributions, in which case we think of them as the "density" of such a distribution. – Ian Oct 13 '16 at 13:55
  • @Ian Wow! I mean, thanks a lot: I am really starting to see things! Two other things, which are most probably related. (1) If what I wrote (with the exception of boundedness) is correct, this works only for continuous r.v., because otherwise the Lebesgue measure "kills" single points. Thus, what about discrete r.v.? Do we have to come up with another definition? Isn't it ugly? (2) Why not bounded? [wild guess: we allow infinite values at single points, because at the end of the day we don't really care about them thanks, again, to the fact that the Lebesgue measure kills them]. – Kolmin Oct 13 '16 at 14:13
  • 1. Yes, what you wrote is for continuous random variables. It doesn't directly extend to discrete random variables. It "formally" extends to discrete (and singular continuous, and mixed) random variables through the lens of distribution theory. 2. There are absolutely continuous functions with unbounded derivative, such as $\sqrt{x}$. The distribution with CDF $f(x)=\begin{cases} 0 & x<0 \\ \sqrt{x} & x \in [0,1] \\ 1 & x>1 \end{cases}$ is a continuous distribution but its density is not bounded. – Ian Oct 13 '16 at 23:06
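
As a rough numerical illustration of the last example (a sketch, not part of the thread): differentiating $\sqrt{x}$ on $(0,1)$ gives the density $1/(2\sqrt{x})$. The helper names `density` and `midpoint_integral`, the cutoffs, and the grid size below are all assumptions chosen for illustration. The code evaluates the density ever closer to $0$ and integrates it over $[a, 1]$ with a plain midpoint rule, which should come out close to $1 - \sqrt{a}$.

```python
import math


def density(x: float) -> float:
    """Density of the CDF sqrt(x) on [0, 1]: 1 / (2 sqrt(x)) on (0, 1], else 0."""
    if 0.0 < x <= 1.0:
        return 0.5 / math.sqrt(x)
    return 0.0


def midpoint_integral(f, a: float, b: float, n: int = 200_000) -> float:
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# The density grows without bound as x approaches 0 from the right.
values_near_zero = [density(10.0 ** -k) for k in (2, 4, 6, 8)]

# The mass on [a, 1] is sqrt(1) - sqrt(a) = 1 - sqrt(a), which tends to 1 as a -> 0.
cutoffs = [1e-2, 1e-4, 1e-6]
tail_masses = [midpoint_integral(density, a, 1.0) for a in cutoffs]

if __name__ == "__main__":
    for k, v in zip((2, 4, 6, 8), values_near_zero):
        print(f"density(1e-{k}) = {v:.1f}")
    for a, m in zip(cutoffs, tail_masses):
        print(f"mass on [{a:g}, 1] ~ {m:.4f}  (exact: {1 - math.sqrt(a):.4f})")
```

Running it should show the density values increasing while the integrated mass stays below, and approaches, $1$. The midpoint rule is used only because it never evaluates the integrand at the singular endpoint; it is not meant as an accurate quadrature for this kind of singularity.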