I can't find a rigorous introduction of the notation $\mathbb{E}_xX_t$ for a stochastic process $X$. Where I have seen it so far, the authors usually just say that it means the process starts at $x$. But with this intuition alone, the following formula would always make sense to me: $$ \mathbb{E}_xX_t=\mathbb{E}(X_t+x). $$ Nevertheless, it seems that this is true for Lévy processes but not for general Markov processes. Could someone give a rigorous introduction of the notation $\mathbb{E}_x$ that clears up this confusion?
3 Answers
A Markov process is a tuple $$(X_t,t \geq 0, \mathbb{P}^x, x \in \mathbb{R}^d, \mathcal{F}_t, t \geq 0)$$ consisting of a stochastic process $(X_t)_{t \geq 0}$ which is adapted to the filtration $(\mathcal{F}_t)_{t \geq 0}$ and a family of probability measures $(\mathbb{P}^x)_{x \in \mathbb{R}^d}$. This tuple is called a Markov process if $\mathbb{P}^x(X_0 = x)=1$ and $(X_t)_{t \geq 0}$ satisfies the Markov property, i.e. $$\mathbb{P}^x(X_t \in B \mid \mathcal{F}_s) = \mathbb{P}^{X_s}(X_{t-s} \in B) \quad \mathbb{P}^x\text{-a.s.}$$ for all $s \leq t$ and $x \in \mathbb{R}^d$. For each $x \in \mathbb{R}^d$, we write $$\mathbb{E}^x f(X_t) := \int_{\Omega} f(X_t) \, d\mathbb{P}^x.$$
In general, $$\mathbb{E}^x f(X_t) = \mathbb{E}^0 f(x+X_t) \tag{1}$$ does not hold true. To see this, consider for instance the process $$X_t(\omega) := \omega \cdot e^t$$ on $\Omega := \mathbb{R}$. If we define $\mathbb{P}^x := \delta_x$ and $\mathcal{F}_t := \{\emptyset,\Omega\}$, then it is not difficult to see that $(X_t)_{t \geq 0}$ is a Markov process. However,
$$\mathbb{E}^x f(X_t)= f(x e^t) \neq f(x) = f(x+0 \cdot e^t) = \mathbb{E}^0 f(x+X_t),$$
i.e. $(1)$ is not satisfied.
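This counterexample can be checked numerically. In the sketch below, $f(y)=y^2$ is my own arbitrary choice of test function (any non-affine $f$ exposes the gap); both expectations are computed in closed form since $\mathbb{P}^x = \delta_x$ makes the process deterministic.

```python
import math

def f(y):
    # arbitrary non-affine test function; any such f exposes the gap
    return y ** 2

def E_x(x, t):
    # E^x f(X_t): under P^x = delta_x the only path has omega = x,
    # so X_t(omega) = omega * e^t = x * e^t deterministically
    return f(x * math.exp(t))

def E_0_shifted(x, t):
    # E^0 f(x + X_t): under P^0 = delta_0 we have X_t = 0 for all t
    return f(x + 0.0 * math.exp(t))

x, t = 1.0, 1.0
print(E_x(x, t))          # f(e) = e^2 ≈ 7.389
print(E_0_shifted(x, t))  # f(1) = 1.0
```

For $x = t = 1$ the two sides differ by $e^2 - 1$, so $(1)$ indeed fails for this Markov process.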
Lévy processes are, essentially, the only processes satisfying $(1)$ for a large class of functions $f$ (e.g. all bounded and continuous functions). Recall that a Lévy process $(L_t)_{t \geq 0}$ on a probability space $(\Omega,\mathcal{A},\mathbb{P})$ has càdlàg sample paths, independent and stationary increments and satisfies $L_0 = 0$ almost surely. In order to show that $(L_t)_{t \geq 0}$ is a Markov process, we have to find a suitable family of probability measures $\mathbb{P}^x$. One possible approach is to define a larger space $\tilde{\Omega} := \mathbb{R}^d \times \Omega$ endowed with the product $\sigma$-algebra $\tilde{\mathcal{A}} := \mathcal{B}(\mathbb{R}^d) \otimes \mathcal{A}$. Then
$$\mathbb{P}^x := \delta_x \otimes \mathbb{P}$$
defines a measure on $(\tilde{\Omega},\tilde{\mathcal{A}})$. Moreover, we set $\tilde{L}(t,(x,\omega)) = x+L_t(\omega)$. The process $(\tilde{L}_t)_{t \geq 0}$ still has independent and stationary increments, and $\mathbb{P}^x(\tilde{L}_0 = x)=1$. Moreover, because of the independence and stationarity of the increments, it is not difficult to see that $(\tilde{L}_t)_{t \geq 0}$ is a Markov process. (Here, it is crucial that $(L_t)_{t \geq 0}$ is a Lévy process; for an arbitrary stochastic process $(L_t)_{t \geq 0}$ this construction does not yield a Markov process!) Finally, it follows from the very definition of $\mathbb{P}^x$ that $$\mathbb{E}^x f(\tilde{L}_t) = \mathbb{E}f(x+L_t),$$ i.e. $(1)$ holds.
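The identity $\mathbb{E}^x f(\tilde{L}_t) = \mathbb{E}f(x+L_t)$ can be illustrated by simulation. In this sketch, Brownian motion as the concrete Lévy process and $f(y)=y^2$ are my own choices; for that choice the exact value $\mathbb{E}[(x+B_t)^2] = x^2 + t$ is standard and serves as a check.

```python
import math
import random

random.seed(0)

def mc_E_x_f(x, t, n=200_000):
    # Monte Carlo estimate of E^x f(L~_t) = E f(x + B_t) for Brownian
    # motion: B_t ~ N(0, t), so each sample is x + sqrt(t) * Z, Z ~ N(0, 1)
    s = math.sqrt(t)
    total = 0.0
    for _ in range(n):
        total += (x + s * random.gauss(0.0, 1.0)) ** 2  # f(y) = y^2
    return total / n

x, t = 1.0, 1.0
est = mc_E_x_f(x, t)
exact = x ** 2 + t  # E[(x + B_t)^2] = x^2 + t for f(y) = y^2
print(est, exact)
```

The Monte Carlo estimate agrees with $x^2+t$ up to sampling error, consistent with the translation property $(1)$ for this Lévy process.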
As mentioned above, it is possible to show under certain additional assumptions on the associated semigroup $T_t f(x) := \mathbb{E}^x f(X_t)$ that Lévy processes are the only stochastic processes satisfying $(1)$ for a large class of functions. This is, however, not easy to prove.
- Thanks a lot for this answer. I was searching for this explanation for quite a while. I have two more questions about this. First, do you know if it is possible to get the $\mathbb{P}^x$ without extending the space? And second, I have now seen a few times the notation $\mathbb{E}(X_t|X_0=x)$. Is this notation simply defined by $\mathbb{E}(X_t|X_0=x):=\mathbb{E}^xX_t$? – Alexander Mar 13 '17 at 17:03
- @drogan 1. Well, yes... the idea is then to define $\mathbb{P}^x(L_t \in B) := \mathbb{P}(L_t \in B-x)$ for any Borel set $B$. This defines a measure $\mathbb{P}^x$ on $\mathcal{F}_{\infty} := \sigma(L_t; t \geq 0)$, and $(L_t)_{t \geq 0}$ is a Markov process with respect to this family of probability measures. 2. Yeah. – saz Mar 13 '17 at 18:52
- I have tried to prove that such a $\mathbb{P}_x$ on $\sigma(L_t:t\geq 0)$ exists but have not been successful so far. Could you give some advice on how to do it? My approach was to use Carathéodory's extension theorem. – Alexander Mar 19 '17 at 12:33
- @drogan Yeah, that's a good idea. So where are you stuck? – saz Mar 20 '17 at 15:55
- My approach is the following. Let $I:=\{\{X_{t_1}\in A_1,\ldots,X_{t_n}\in A_n\}:n\in\mathbb{N},\,t_1,\ldots,t_n\geq 0,\,A_1,\ldots,A_n\in\mathcal{B}(\mathbb{R})\}$ and $\mathcal{A}:=\{A_1\cup\ldots\cup A_n:n\in\mathbb{N},\,A_1,\ldots,A_n\in I\}$. If I didn't make a mistake, $\mathcal{A}$ is an algebra. I now need a $\mu_0:\mathcal{A}\to[0,1]$ that is finitely additive and countably subadditive with $\mu_0(X_t\in A)=\mathbb{P}(X_t+x\in A)$ for all $t\geq 0$, $A\in\mathcal{B}(\mathbb{R})$. But I don't see why such a $\mu_0$ exists. – Alexander Mar 20 '17 at 18:55
- @drogan For $B := \{X_{t_1} \in A_1,\ldots,X_{t_n} \in A_n\}$ we define $\mu_0(B) := \mathbb{P}(X_{t_1} \in A_1-x, \ldots,X_{t_n} \in A_n-x)$. This definition can be extended to $\mathcal{A}$ (defined in your previous comment) using the inclusion-exclusion formula. If I'm not mistaken, it shouldn't be too difficult to check that the thus-defined mapping $\mu_0: \mathcal{A} \to [0,1]$ is finitely additive and countably subadditive. – saz Mar 21 '17 at 16:25
- Sorry for all these questions, but is the definition of $\mu_0(B)$ here independent of the representation of $B$? – Alexander Mar 24 '17 at 15:25
- @drogan Well, why should this not be the case? – saz Mar 24 '17 at 15:47
- Maybe I am confusing something. Don't I have to prove that $\{X_{t_1}\in A_1,\ldots,X_{t_n}\in A_n\}=\{X_{s_1}\in B_1,\ldots,X_{t_n}\in B_n\}$ implies $\mathbb{P}(\{X_{t_1}\in A_1-x,\ldots,X_{t_n}\in A_n-x\})=\mathbb{P}(\{X_{s_1}\in B_1-x,\ldots,X_{t_n}\in B_n-x\})$ ($A_1,\ldots,A_n,B_1,\ldots,B_n\in\mathcal{B}(\mathbb{R})$)? – Alexander Mar 24 '17 at 18:21
- @drogan How does $s_1$ come into play...? – saz Mar 24 '17 at 19:26
- Sorry, there is a typo. The first equation should be $\{X_{t_1}\in A_1,\ldots,X_{t_n}\in A_n\}=\{X_{s_1}\in B_1,\ldots,X_{s_n}\in B_n\}$. – Alexander Mar 24 '17 at 19:32
- What I mean is the following. Let $B\in I$. Then there exist $A_1,\ldots,A_n\in\mathcal{B}(\mathbb{R})$, $t_1,\ldots,t_n\geq 0$ such that $B=\{X_{t_1}\in A_1,\ldots,X_{t_n}\in A_n\}$. But maybe there also exist $B_1,\ldots,B_n\in\mathcal{B}(\mathbb{R})$, $s_1,\ldots,s_n\geq 0$ with $B=\{X_{s_1}\in B_1,\ldots,X_{s_n}\in B_n\}$. – Alexander Mar 24 '17 at 19:37
- In that case we would then need $\mathbb{P}(\{X_{t_1}\in A_1-x,\ldots,X_{t_n}\in A_n-x\})=\mathbb{P}(\{X_{s_1}\in B_1-x,\ldots,X_{s_n}\in B_n-x\})$. – Alexander Mar 24 '17 at 19:39
- @drogan WLOG assume that $A_j \neq \mathbb{R}^d$ and $B_i \neq \mathbb{R}^d$ for all $i,j$. Now write down what it means for $\omega \in \Omega$ to be an element of the set $B$ having the two representations... you will easily see that necessarily both representations are exactly the same. – saz Mar 24 '17 at 20:01
- Let us continue this discussion in chat. – Alexander Mar 25 '17 at 15:10
- But can't I find a counterexample to that? For example the following: Let $\hat{X}$ be a nonnegative Lévy process (for example a Poisson process) and $\omega_0\in\Omega$ with $\mathbb{P}(\{\omega_0\})=0$. Then we define $$ X_t(\omega):= \begin{cases} -1, & \text{for }t=1,\omega=\omega_0 \\ -2, & \text{for }t=2,\omega=\omega_0 \\ \hat{X}_t(\omega), & \text{else}. \end{cases} $$ Then $X$ is also a Lévy process with $$ \{X_1\in\{-1\}\}=\{X_2\in\{-2\}\}=:B. $$ So here the representation of $B$ wouldn't be unique? – Alexander Mar 26 '17 at 18:24
- @drogan Why should $\{X_1 \in \{-1\}\} = \{X_2 \in \{-2\}\}$ hold true...? – saz Mar 26 '17 at 18:28
- We should have $\{X_1 \in \{-1\}\} =\{\omega_0\}= \{X_2 \in \{-2\}\}$ in this example. – Alexander Mar 26 '17 at 18:52
- @drogan I see, because you assumed that the process $\hat{X}$ is non-negative... well, such things only work on null sets, and they are not of importance since their measure is zero. Anyway, discussing these things in comments is quite tedious... please open a new question. In order to ensure that this question is not closed as a duplicate of this one, please state clearly what you want to do, what you have tried and where you got stuck. – saz Mar 26 '17 at 19:08
- Ok, thank you. I will do that. – Alexander Mar 27 '17 at 11:41
- @saz Could you please clarify the choice of the process $\tilde{L}_t (x, \omega) = x + L_t (\omega)$? I am able to show that $\tilde{L}_t ( x, \omega)$, $t \geq 0$, as a stochastic process on $(\tilde{\Omega}, \tilde{\mathcal{A}}, \mathbb{P}^x)$, indeed has independent and stationary increments, càdlàg paths and starts almost surely at $x$. However, I do not see how this is in line with the definition of the Markov process above, since the process $X_t$ in the tuple $(X_t,t \geq 0, \mathbb{P}^x, x \in \mathbb{R}^d, \mathcal{F}_t, t \geq 0)$ seems to be independent of the choice of $x$. – Holden Nov 27 '19 at 03:24
- In other words, shouldn't $\tilde{L}_t$ be independent of $x$, so that for different $x \in \mathbb{R}^d$ one would have the same process $\tilde{L}_t$, but different probability measures $\mathbb{P}^x$ that correspond to different distributions of $\tilde{L}_t$? Alternatively, would it be reasonable to redefine the Markov process as a tuple $(X^x_t,t \geq 0, \mathbb{P}^x, x \in \mathbb{R}^d, \mathcal{F}_t, t \geq 0)$? – Holden Nov 27 '19 at 03:24
- @Holden I think you are misunderstanding something about the construction of the new probability space. We consider a new space $\tilde{\Omega} = \mathbb{R}^d \times \Omega$ and the family of measures $\mathbb{P}^x := \delta_x \otimes \mathbb{P}$. Note that $(\tilde{\Omega},\tilde{\mathcal{A}},\mathbb{P}^x)$ is, for each fixed $x \in \mathbb{R}^d$, a measure space. If we set $\tilde{L}_t((y,\omega)) := y+L_t(\omega)$, then $(\tilde{L}_t)_{t \geq 0}$ is a stochastic process on the "enlarged" space $\tilde{\Omega}$. – saz Nov 27 '19 at 07:18
- Consequently, we have a) a stochastic process on $\tilde{\Omega}$, b) a filtration (=the natural filtration), c) a family of probability measures on $(\tilde{\Omega},\tilde{\mathcal{A}})$. These are exactly the ingredients which are required in the definition of a Markov process. – saz Nov 27 '19 at 07:19
- Since in your original answer you wrote $L_{t} (x, \omega)$, I thought that for every fixed $x$ you were considering something like $\mathbb{R}^d \times \Omega \rightarrow \mathbb{R}^d$, $(y, \omega) \mapsto L^x_{t} ( \omega ) = L_t(\omega)+x$, which depends on $x$ and is independent of $y$; for each fixed $x$ this is again a stochastic process on the extended space $( \tilde{\Omega}, \tilde{\mathcal{A}}, \mathbb{P}^x)$ and has independent and stationary increments, càdlàg paths and starts almost surely at $x$, but it is not a Markov process in the sense of the above definition. – Holden Nov 27 '19 at 17:45
- Then what about showing that $\mathbb{P}^x ( \tilde{L}_0 = x ) = 1$? For example, if we assume that $L_0 ( \omega ) = 0$ for all $\omega \in \Omega$ instead of only a.s., then \begin{align} \mathbb{P}^x ( \tilde{L}_0 = x) &= \delta_x \otimes \mathbb{P} ( \{ (y, \omega) \in \mathbb{R}^d \times \Omega : L_0(\omega) + y = x \} ) \\ &= \delta_x \otimes \mathbb{P} ( \{ y \in \mathbb{R}^d : y = x \} \times \Omega ) = \delta_x \otimes \mathbb{P} ( \{ x \} \times \Omega ) = 1. \end{align} Is there a way to show this when $L_0 = 0$ only a.s.? – Holden Nov 27 '19 at 17:58
- @Holden Well, it's essentially the same calculation: $$(\delta_x \otimes \mathbb{P})(\{(y,\omega); L_0(\omega)+y=x\}) \geq (\delta_x \otimes \mathbb{P})(\{(x,\omega); L_0(\omega)+x=x\}) =\mathbb{P}(L_0=0)=1.$$ Since $\mathbb{P}^x=\delta_x \otimes \mathbb{P}$ is a probability measure, this gives $\mathbb{P}^x(\tilde{L}_0=x)=1$. (I just realized that you wrote plenty of further comments... I don't have the time now to read them all; this is just the answer to your last comment.) – saz Nov 27 '19 at 18:15
A stochastic process $X$ is simply a random variable taking values in a space of functions, for example the set of all continuous functions from $[0,\infty)$ to $\mathbb R$. Since $X$ is a random variable, it has a law, which is a probability measure on the space of functions.
Oftentimes, one considers stochastic processes as solutions to stochastic differential equations, in which case there is freedom to choose the starting point $X_0$. If we start $X_0$ at some deterministic value $x$, then the law of the resulting process started at $x$ is denoted by $\mathbb P_x$. As mentioned above, this is a probability measure on the space of functions (let us call this space $\Omega$ from now on; for example $\Omega=C([0,\infty),\mathbb R)$).
Now $\mathbb E_x$ denotes expectation with respect to the measure $\mathbb P_x$. In other words, for an arbitrary stochastic process $Y$, we define $\mathbb E_x Y=\int_{\Omega}Y\ d\mathbb P_x$.
Thus we have a collection of different measures $\{\mathbb P_x\colon x\in\mathbb R\}$ on the same space $\Omega$, one for each value of $x\in\mathbb R$. Note that these measures are mutually singular, and $\mathbb P_x(X_0=x)=1$ for all $x$.
Lastly, you are asking whether $\mathbb E_x X_t=\mathbb E(X_t+x)$. This will be the case if the distribution $\mathbb P_x$ equals the distribution $\mathbb P_0$ shifted by $x$. Formally, there is a translation operator $\tau_x\colon \Omega\to\Omega$ sending a (deterministic) path $(y_t)_{t\geq 0}$ to $(y_t+x)_{t\geq 0}$. If $\mathbb P_x$ is the pushforward of $\mathbb P_0$ under the map $\tau_x$, denoted $\mathbb P_x=(\tau_x)_*\mathbb P_0$, then $\mathbb E_x X_t=\mathbb E_0(X_t+x)$ for all $x$ and all $t$.
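The pushforward relation $\mathbb P_x=(\tau_x)_*\mathbb P_0$ can be made concrete on discretized paths: sample paths under $\mathbb P_0$, apply $\tau_x$ pointwise, and statistics of the shifted paths reproduce $\mathbb E_0(X_t+x)$. A minimal sketch, using Brownian increments as my own choice of example process:

```python
import math
import random

random.seed(1)

def sample_path_P0(times):
    # one Brownian path under P_0 (started at 0), discretized at `times`
    path, prev_t, value = [], 0.0, 0.0
    for t in times:
        value += random.gauss(0.0, math.sqrt(t - prev_t))
        prev_t = t
        path.append(value)
    return path

def tau(x, path):
    # translation operator tau_x acting on a discretized path
    return [y + x for y in path]

times = [0.5, 1.0]
x, n = 2.0, 100_000

# law of X_1 under P_x = (tau_x)_* P_0: push each P_0 path forward by tau_x
mean_shifted = sum(tau(x, sample_path_P0(times))[-1] for _ in range(n)) / n
print(mean_shifted)  # ≈ E_0(X_1) + x = 0 + 2 = 2
```

The empirical mean of $X_1$ under the pushed-forward law agrees with $\mathbb E_0(X_1+x) = x$ up to Monte Carlo error, as the translation identity predicts.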
- Thank you. This makes things clearer. The following is still not clear to me. So in the case of a Lévy process $X$ I can prove that it is a Markov process. Then I can use the Kolmogorov theorem to prove that there exists a transition function $(P_t)_t$ such that for every $x\in\mathbb{R}$ there exists a probability measure $\mathbb{P}_x$ with $\mathbb{P}_x(X_0=x)=1$ and $\mathbb{E}_x(f(X_t)|\mathcal{F}_s)=(P_{t-s}f)(X_s)$. How do I now prove that $\mathbb{P}_x$ is the pushforward of $\mathbb{P}$ under $\tau_x$? Is it easy to see that, or should I better ask about it in a new post? – Alexander Mar 10 '17 at 12:01
It's $$\mathbb E[X_t\mid X_0=x]$$
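This reading is unproblematic whenever $\mathbb{P}(X_0=x)>0$. A toy sketch with a two-state Markov chain and a uniformly random initial state, where the conditional expectation is an elementary average (the transition probabilities below are my own arbitrary choice):

```python
import random

random.seed(2)

# two-state chain on {0, 1}; transition probabilities are an arbitrary choice:
# P(X_1 = 1 | X_0 = 0) = 0.3 and P(X_1 = 1 | X_0 = 1) = 0.8
P = {0: 0.3, 1: 0.8}

def sample():
    x0 = random.randint(0, 1)                 # uniformly random start
    x1 = 1 if random.random() < P[x0] else 0  # one transition step
    return x0, x1

samples = [sample() for _ in range(200_000)]

# empirical E[X_1 | X_0 = 0]: average X_1 over the runs with X_0 = 0
vals = [x1 for x0, x1 in samples if x0 == 0]
emp = sum(vals) / len(vals)
print(emp)  # ≈ 0.3
```

When $\mathbb{P}(X_0=x)=0$, however, this conditioning is no longer elementary, which is exactly the issue raised in the comments below.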
- I have seen it in the context where $X$ is a Lévy process starting at 0. Then for $x\neq 0$ we would have $\mathbb{P}(X_0=x)=0$, and the above is not well defined since we would divide by zero. – Alexander Mar 09 '17 at 20:29
- @drogan It is unlikely to see $E_x(X_t)$ with $x\ne0$ in a context where $X_0=0$ almost surely. Could you explain where you would have seen this? – Did Mar 09 '17 at 20:49
- @Did The book is "Introductory Lectures on Fluctuations of Lévy Processes with Applications" by Andreas Kyprianou. In chapter 1 he defines that a Lévy process starts at zero with probability 1. In chapter 11 he then uses the notation $\mathbb{E}_x$ in combination with a Lévy process, for example in the expression $\mathbb{E}_x(e^{-q\tau}G(X_\tau))$ on page 309. I must admit that this is not the same as $E_xX_t$. But wouldn't that lead to a similar problem if we interpret it as $\mathbb{E}(e^{-q\tau}G(X_\tau)|X_0=x)$? – Alexander Mar 09 '17 at 21:38
- @drogan Once again, $E_x$ obviously refers to $E(\,\cdot \mid X_0=x)$. (Please use @.) – Did Mar 09 '17 at 21:40