Optimal Switching at Poisson Random Intervention Times

This paper introduces a new class of optimal switching problems, where the player is allowed to switch at a sequence of exogenous Poisson arrival times, and the underlying switching system is governed by an infinite horizon backward stochastic differential equation system. The value function and the optimal switching strategy are characterized by the solution of the underlying switching system. In a Markovian setting, the paper gives a complete description of the structure of switching regions by means of the comparison principle.


Introduction
obtained the solutions by a verification approach. Ly Vath and Pham [18] employed the viscosity solution approach to determine an explicit solution in the two-regime case, which was extended to the multi-regime case in [22]. Bayraktar and Egami [1] obtained an even more general result by making extensive use of one-dimensional diffusions. Notwithstanding, most of the results in this spectrum are based on the assumption that the switching system is a one-dimensional diffusion (such as a one-dimensional geometric Brownian motion) and the time horizon is infinite. On the other hand, if the switching system is governed by a more general stochastic process, such as a multi-dimensional diffusion or a BSDE, it is often a formidable task to determine the structure of the switching regions. In such a situation, attention focuses instead on the characterization of the value function and the optimal switching strategy, either by using a system of variational inequalities as in Tang and Yong [24], or by using a system of reflected BSDEs as in Hamadene and Jeanblanc [12], Hamadene and Zhang [13], and Hu and Tang [14]. Numerical solution is therefore also an important aspect of this method (see Carmona and Ludkovski [6] and Porchet et al. [23]). Finally, if the underlying switching system is modeled by non-diffusive processes such as Markov chains, we refer to Bayraktar and Ludkovski [2].
In this paper, we try to cover both spectra of methods to tackle our optimal switching problem. In Section 2, we present a general optimal switching model, where the underlying switching system is governed by an infinite horizon BSDE system. For a general introduction to BSDEs, we refer to the seminal paper by Pardoux and Peng [20] and the follow-up survey by El Karoui et al. [11]; see also the two monographs by Ma and Yong [19] and Yong and Zhou [25] and the references therein. Our main result in Section 2 shows that if the underlying switching system follows a "penalized version" of an infinite horizon BSDE system, then the value of the corresponding optimal switching problem is precisely the solution of this penalized equation (see Theorem 2.2). The basic observation comes from the optimal stopping time representation for the one-dimensional penalized BSDE, first established by Liang [16] in a finite horizon setting. In this paper, we also prove an infinite horizon version in Lemma 2.3.
We take the other spectrum of methods in Section 3, where we work in a Markovian setting with a one-dimensional geometric Brownian motion and two regimes. This simplification enables us to fully describe the structure of the switching regions (see Theorem 3.1). The basic observation therein is that we can consider the difference of the value functions for the two switching regimes, and then employ the comparison principle for a one-dimensional equation.
The paper is organized as follows: We present our general optimal switching model in Section 2, and characterize the value of the optimal switching problem and the associated optimal switching strategy by the solution of an infinite horizon BSDE system. In Section 3, we work out a specific example in a Markovian setting, and give a complete description of the structure of switching regions. All the technical details are provided in the Appendix.

The Optimal Switching Model
Let (W_t)_{t≥0} be an n-dimensional standard Brownian motion defined on a filtered probability space (Ω, F, F = {F_t}_{t≥0}, P), where the filtration F is the minimal augmented Brownian filtration.
For any fixed time t ≥ 0, let {T_n}_{n≥0} be the arrival times of a Poisson process (N_s)_{s≥t} with intensity λ and minimal augmented filtration {H^{(t,λ)}_s}_{s≥t}. We follow the convention that T_0 = t and T_∞ = ∞, and throughout this paper we assume that the Brownian motion and the Poisson process are independent. Given the parameter set (t, λ), let G^{(t,λ)} = {G^{(t,λ)}_s}_{s≥t} be the enlarged filtration with G^{(t,λ)}_s = F_s ∨ H^{(t,λ)}_s. Moreover, given the Poisson arrival time T_n, define the pre-T_n σ-field G^{(t,λ)}_{T_n} for n ≥ 0. Let the superscript * denote the matrix transpose, and let · denote the inner product in R^d. Let H^2_a(R^{d×n}) be the space of all F-progressively measurable processes Z, valued in R^{d×n}, endowed with the norm ||Z||^2 = E[∫_0^∞ e^{2as} |Z_s|^2 ds] < ∞.
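Since the intervention times are exogenous Poisson arrivals, they can be simulated directly from i.i.d. exponential inter-arrival gaps, independently of the Brownian motion. The following minimal Python sketch (all parameter values are hypothetical) generates the sequence {T_n} and checks that the mean inter-arrival time is close to 1/λ:

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_arrival_times(t, lam, horizon, rng):
    """Arrival times T_1 < T_2 < ... of a rate-lam Poisson process on (t, horizon],
    built from i.i.d. Exp(lam) inter-arrival gaps (by convention T_0 = t)."""
    times, s = [], t
    while True:
        s += rng.exponential(1.0 / lam)
        if s > horizon:
            return np.array(times)
        times.append(s)

T = poisson_arrival_times(t=0.0, lam=2.0, horizon=10_000.0, rng=rng)
gaps = np.diff(np.concatenate(([0.0], T)))
print(T.size, gaps.mean())   # mean inter-arrival time should be near 1/lam = 0.5
```

An independent Brownian motion would simply be drawn from a separate stream of Gaussian increments, which is how the independence assumption above is realized in simulation.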

Infinite Horizon BSDE System
We introduce the following infinite horizon BSDE system, which will be used to characterize the value of the optimal switching problem introduced in the next subsection. The driver f_s = (f^1_s, ..., f^d_s)* and the parameter a are the given data, and the impulse term MY^i_t is defined as MY^i_t = max_{j≠i} (Y^j_t − g^{ij}_t). Note that (2.1) is nothing but the penalized equation of a multi-dimensional reflected BSDE. A solution to (2.1) is a pair of F-progressively measurable processes (Y, Z) valued in R^d × R^{d×n}. The solution Y^i represents the payoff in regime i, and the impulse term MY^i_t represents the payoff available if the player switches from the current regime i to another regime j, where g^{ij}_t is the associated cost of switching from i to j.
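In the switching literature the impulse term is typically MY^i_t = max_{j≠i}(Y^j_t − g^{ij}_t), the best payoff reachable from regime i net of the switching cost; assuming that form here (the display is elided in this copy), a toy computation:

```python
import numpy as np

def impulse_term(Y, g, i):
    """M Y^i = max over j != i of (Y^j - g_ij): the best payoff attainable by
    switching out of regime i, net of the switching cost g_ij."""
    d = len(Y)
    return max(Y[j] - g[i][j] for j in range(d) if j != i)

Y = np.array([1.0, 2.5, 1.8])           # hypothetical payoffs in regimes 0, 1, 2
g = np.array([[0.0, 0.4, 0.9],          # g[i][j]: cost of switching i -> j
              [0.3, 0.0, 0.5],
              [0.8, 0.6, 0.0]])
print(impulse_term(Y, g, 0))            # max(2.5 - 0.4, 1.8 - 0.9) = 2.1
```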
We impose the following assumptions on the data of the infinite horizon BSDE system (2.1).
Assumption 1 The driver f_s(y, z) : Ω × R_+ × R^d × R^{d×n} → R^d is monotone in y and Lipschitz continuous in z, i.e. there exist constants a_1, a_2 > 0 such that for any y, ȳ ∈ R^d and z, z̄ ∈ R^{d×n},

(y − ȳ) · (f_s(y, z) − f_s(ȳ, z)) ≤ −a_1 |y − ȳ|^2 and |f_s(y, z) − f_s(y, z̄)| ≤ a_2 |z − z̄|,

and it has linear growth in both components (y, z). Moreover, the parameter a satisfies the structure condition (2.4). It is known that structure conditions such as (2.4) are critical for solving infinite horizon BSDE systems. However, the structure condition (2.4) is slightly different from the standard ones in Section 3 of Darling and Pardoux [8] and Section 2 of Briand and Hu [5], owing to an additional term coming from the Poisson intensity λ. The switching cost g^{ij} satisfies the following assumption.

Assumption 2
The switching cost g^{ij} for 1 ≤ i, j ≤ d is a bounded F-progressively measurable process valued in R, and satisfies (1) g^{ii}_t = 0; and (2) inf_{t≥0} (g^{ij}_t + g^{ji}_t) ≥ C > 0 for i ≠ j.

The proof of Proposition 2.1 essentially follows from Section 3 of Darling and Pardoux [8] and Section 2 of Briand and Hu [5], although those papers consider a random terminal time. For completeness and the reader's convenience, we give the proof of Proposition 2.1 in the Appendix.
We conclude this subsection by observing that (Y, Z) solves (2.1) if and only if the corresponding discounted pair (U, V), with U_t = e^{rt} Y_t and V_t = e^{rt} Z_t, solves (2.6), where the driver f̃_s = (f̃^1_s, ..., f̃^d_s)* is given by

f̃^i_s(y, z) = e^{rs} f^i_s(e^{−rs} y, e^{−rs} z) − r y^i for (y, z) ∈ R^d × R^{d×n},

and the impulse term M̃U^i_t is defined analogously to MY^i_t, with the switching costs g^{ij}_t replaced by their discounted counterparts e^{rt} g^{ij}_t. Hence, as a direct consequence of Proposition 2.1, (2.6) admits a unique solution (U, V) ∈ S^2_{a−r}(R^d) × H^2_{a−r}(R^{d×n}).
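The equivalence between (2.1) and (2.6) is the usual discounting transform. Assuming U_t = e^{rt} Y_t and V_t = e^{rt} Z_t, a one-line check of the driver formula, written for a generic driver F and suppressing the impulse term:

```latex
% If dY_t = -F_t(Y_t, Z_t)\,dt + Z_t\,dW_t and U_t = e^{rt}Y_t, V_t = e^{rt}Z_t,
% then by the product rule
\begin{aligned}
dU_t &= r e^{rt} Y_t\,dt + e^{rt}\,dY_t \\
     &= rU_t\,dt - e^{rt} F_t\!\left(e^{-rt}U_t,\, e^{-rt}V_t\right)dt + V_t\,dW_t \\
     &= -\tilde F_t(U_t, V_t)\,dt + V_t\,dW_t,
\qquad \tilde F_t(u,v) := e^{rt} F_t\!\left(e^{-rt}u,\, e^{-rt}v\right) - ru,
\end{aligned}
% which matches the driver \tilde f^i_s(y,z) = e^{rs} f^i_s(e^{-rs}y, e^{-rs}z) - r y^i.
```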

Optimal Switching Representation: Main Results
Consider the following optimal switching problem. Given d switching regimes, a player starts in regime i ∈ {1, ..., d} at some fixed time t ≥ 0, and makes her switching decisions sequentially at the Poisson arrival times {T_n}_{n≥0} associated with the Poisson process (N_s)_{s≥t}. Hence, her switching strategy is represented as

u_s = Σ_{k≥0} α_k 1_{[T_k, T_{k+1})}(s), s ≥ t, (2.7)

where the α_k are G^{(t,λ)}_{T_k}-measurable random variables valued in {1, ..., d}, representing the regime that the player switches to at the Poisson arrival time T_k. Define the control set K_i(t, λ) as

K_i(t, λ) = { G^{(t,λ)}-progressively measurable process (u_s)_{s≥t} : u has the form (2.7) with α_0 = i }.
The resulting expected payoff associated with any control u ∈ K_i(t, λ) is given by (2.8) for any r ≤ a, where the running payoff f_s = (f^1_s, ..., f^d_s)* and the parameter a satisfy Assumption 1, with (Y, Z) given as the solution of the infinite horizon BSDE system (2.1), and where the switching cost g^{ij} satisfies Assumption 2. The player maximizes her expected payoff by choosing an optimal control u* ∈ K_i(t, λ). Note that if a ≥ 0, then r can take the value zero, and (2.8) corresponds to a non-discounted optimal switching problem. However, if a < 0, then the discounting is necessary, which is not the case for the finite horizon optimal switching problem.
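Although the exact payoff functional (2.8) is elided in this copy, its ingredients — a discounted running payoff plus switching costs paid at Poisson arrival times — can be illustrated by pricing one fixed (non-optimal) strategy by Monte Carlo. Everything below (the running profits h_1 = 0 and h_2(x) = min(x, 1), the parameter values, and the "switch at the first arrival" rule) is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def payoff_once(x0, lam, r, horizon, dt, g12, rng):
    """One sample of the discounted payoff of the naive strategy 'switch from
    regime 1 to regime 2 at the first Poisson arrival'.  The running profits
    h1(x) = 0, h2(x) = min(x, 1), and the GBM parameters b = 0.05, sigma = 0.3
    are illustrative stand-ins for the data of the elided payoff (2.8)."""
    T1 = rng.exponential(1.0 / lam)            # first Poisson arrival time
    x, total, regime = x0, 0.0, 1
    for k in range(int(horizon / dt)):
        s = k * dt
        if regime == 1 and s >= T1:            # switch at the first arrival ...
            regime = 2
            total -= np.exp(-r * T1) * g12     # ... and pay the switching cost
        h = 0.0 if regime == 1 else min(x, 1.0)
        total += np.exp(-r * s) * h * dt       # discounted running payoff
        # exact geometric Brownian motion step
        x *= np.exp((0.05 - 0.5 * 0.3**2) * dt + 0.3 * np.sqrt(dt) * rng.normal())
    return total

est = np.mean([payoff_once(1.0, 2.0, 0.5, 20.0, 0.01, 0.1, rng) for _ in range(200)])
print(est)
```

The optimal control problem then amounts to maximizing this expectation over all admissible regime choices at the arrival times.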
Our main result is the following representation result for the above optimal switching problem, which is a counterpart of the finite horizon case in Section 4 of Liang [16].
Theorem 2.2 Suppose that Assumptions 1 and 2 hold. Let (Y, Z) be the unique solution to the infinite horizon BSDE system (2.1). Then the value of the optimal switching problem (2.8) is given by Y^i_t a.s. for t ≥ 0, and the optimal switching strategy is τ*_0 = t and α*_0 = i, with (2.9) for k ≥ 0. Hence, the optimal switching strategy at any time s ≥ t is u*_s = Σ_{k≥0} α*_k 1_{[τ*_k, τ*_{k+1})}(s).

To prove Theorem 2.2, we first observe that the solution Y^i_t to (2.1) is the value of the optimal switching problem (2.8) with the associated optimal switching strategy u*, if and only if the solution U^i_t to (2.6) is the value of the following optimal switching problem (2.10) (without discounting), with the optimal switching strategy τ*_0 = t and α*_0 = i. From now on, we will mainly work with the formulation (2.10), and prove that its value is given by U^i_t. The proof crucially depends on the following lemma about the optimal stopping time representation for the infinite horizon BSDE system (2.6); since its proof is quite long, it is postponed to the next subsection. The new feature of this optimal stopping problem is that the player is only allowed to stop at exogenous Poisson arrival times. This kind of optimal stopping problem was first introduced by Dupuis and Wang [10] in a Markovian setting.

Lemma 2.3 Suppose that Assumptions 1 and 2 hold. Let (U, V) be the unique solution to the infinite horizon BSDE system (2.6). For n ≥ 0 and 1 ≤ i ≤ d, consider the auxiliary optimal stopping time problem (2.12), where the control set R_{T_{n+1}}(t, λ) consists of the stopping times valued in the Poisson arrival times {T_k}_{k≥n+1}. Then its value is given by ỹ^{i,(t,λ)}_{T_n} a.s. for t ≥ 0, and the optimal stopping time is given by (2.11).

We take the above lemma for granted for the moment, and proceed to prove Theorem 2.2.
Proof. For any switching strategy u ∈ K_i(t, λ) of the form (2.7), we consider the auxiliary optimal stopping time problem (2.12) starting from T_0 = t, stopping at the first Poisson arrival time T_1, and switching to α_1; this gives (2.13). Thanks to Lemma 2.3, the value of the optimal stopping time problem (2.12) starting from T_1 is given by ỹ^{α_1,(t,λ)}_{T_1}. We consider such an optimal stopping time problem stopping at the Poisson arrival time T_2, and switching to α_2; this gives (2.14). By plugging (2.14) into (2.13), and repeating the above procedure M times, we obtain the corresponding M-step estimate. Since P{ω : T_n(ω) < ∞ for all n ≥ 0} = 1, the player actually makes only a finite number of switches on [0, ∞), i.e. the switching strategy is finite. Recall also that the solution U_T converges to zero in L^2 as T ↑ ∞, since r ≤ a. Hence, letting M ↑ ∞, taking the supremum over u ∈ K_i(t, λ), and using Lemma 2.3 once again, we derive that U^i_t dominates the value of (2.10).

We prove the reverse inequality by considering the switching strategy u = u* defined in (2.11). From Lemma 2.3, τ*_1 is the optimal stopping time for ỹ^{i,(t,λ)}_t. By the definition of α*_1, and similarly for τ*_2, which is the optimal stopping time for the problem restarted at τ*_1, we obtain (2.16). Plugging (2.16) into (2.15), and repeating the above procedure as many times as necessary, we obtain the analogous identity for any M ≥ 0. Since the switching strategy is finite and U_T converges to zero in L^2, we conclude that ỹ^{i,(t,λ)}_t = U^i_t, and that u* is the optimal switching strategy for the optimal switching problem (2.10).

Optimal Stopping Representation: Proof of Lemma 2.3
The proof is adapted from the proof of Theorem 1.2 in Liang [16], where a finite horizon problem was considered. First, we introduce an equivalent formulation (2.17) of the optimal stopping time problem (2.12). Notice that (2.17) is a discrete optimal stopping problem, as the player is allowed to stop only at the sequence of indices n + 1, n + 2, .... The optimal stopping time is then some integer-valued random variable N*_{n+1}. We will mainly work with the formulation (2.17) from now on.

The proof is based on two observations. The first observation is that the solution to the infinite horizon BSDE system (2.6) at the Poisson arrival time T_n can be calculated recursively as in (2.18). Indeed, by applying Itô's formula to e^{−λt} U^i_t, we obtain (2.19) for any T ≥ T_n. Next, we use integration by parts and the conditional density λ e^{−λ(x−T_n)} dx of T_{n+1} given T_n to simplify (2.19). Plugging the two resulting expressions into (2.19), we conclude (2.18) by taking T ↑ ∞.
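The conditional law used above — the inter-arrival time T_{n+1} − T_n is exponential with rate λ — can be sanity-checked by Monte Carlo, for instance against the Laplace transform E[e^{−μ(T_{n+1}−T_n)}] = λ/(λ + μ) (the values of λ and μ below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, mu, n_samples = 2.0, 1.0, 200_000

# Inter-arrival gaps T_{n+1} - T_n of a rate-lam Poisson process are Exp(lam),
# i.e. they have density lam * exp(-lam * x) dx on (0, infinity).
gaps = rng.exponential(1.0 / lam, size=n_samples)

mc = np.exp(-mu * gaps).mean()           # Monte Carlo E[e^{-mu (T_{n+1}-T_n)}]
exact = lam / (lam + mu)                 # closed form from the density
print(mc, exact)
```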
The second observation is that if we define Û^i = max{M̃U^i, U^i}, then Û^i satisfies the recursive equation (2.20). We show that the value of the discrete optimal stopping problem (2.21) is given by y^{i,(t,λ)}_{T_n} = Û^i_{T_n} a.s. for n ≥ 0, with the optimal stopping time N*_n = inf{k ≥ n : U^i_{T_k} ≤ M̃U^i_{T_k}}.

Proof. Without loss of generality, we may assume that f̃^i_s(U_s, V_s) = 0; otherwise we only need to consider the equation with the driver absorbed into the payoff. For any N ∈ N_n(t, λ), we obtain an upper estimate, where we used (2.20) in the second inequality. Taking the supremum over N ∈ N_n(t, λ) gives one inequality. To prove the reverse inequality, we first show that (Û^i_{T_{m∧N*_n}})_{m≥n} is a G^{(t,λ)}-martingale. Indeed, this follows from (2.20) and the definition of N*_n, which we used in the second to last equality. Hence N*_n is the optimal stopping time for the optimal stopping problem (2.21).

We are now in a position to prove Lemma 2.3. From (2.18) and the definition of Û, ỹ^{i,(t,λ)}_{T_n} is expressed through Û^i_{T_{n+1}}, which is the value of the optimal stopping problem (2.21). Hence we obtain one inequality for any N ∈ N_{n+1}(t, λ). To prove the reverse inequality, we take N*_{n+1} = inf{k ≥ n + 1 : U^i_{T_k} ≤ M̃U^i_{T_k}}, which is the optimal stopping time for y^{i,(t,λ)}_{T_{n+1}}. Hence the value of the optimal stopping time problem (2.12) is given by ỹ^{i,(t,λ)}_{T_n} = U^i_{T_n}, and the optimal stopping time is N*_{n+1}, or equivalently, τ*_{n+1} = T_{N*_{n+1}}.
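The recursion behind Û is the discrete Snell envelope. On a purely deterministic toy sequence (so the conditional expectation between arrival times degenerates to the next value), backward induction and the first-entry stopping rule look as follows; the sequence values below are hypothetical:

```python
def snell_envelope(reward):
    """Discrete optimal stopping by backward induction, V_k = max(reward_k, V_{k+1});
    the optimal rule stops at the first index k with reward_k >= V_k, the analogue
    of N* = inf{k : U_{T_k} <= MU_{T_k}}.  In the paper V_{k+1} would be replaced
    by a conditional expectation given the pre-T_k sigma-field; this deterministic
    toy sequence is purely illustrative."""
    V = list(reward)
    for k in range(len(V) - 2, -1, -1):
        V[k] = max(reward[k], V[k + 1])
    n_star = next(k for k in range(len(V)) if reward[k] >= V[k])
    return V, n_star

V, n_star = snell_envelope([1.0, 3.0, 2.0, 5.0, 4.0])
print(V, n_star)   # V = [5.0, 5.0, 5.0, 5.0, 4.0], stop at k = 3
```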

The Structure of Switching Regions in a Markovian Case
In this section, we investigate the switching regions of the optimal switching problem (2.12) in a Markovian setting. Specifically, we assume that there are two switching regimes (d = 2), and that the Brownian motion is one-dimensional (n = 1). Moreover, Assumptions 1 and 2 are replaced by the following.

Assumption 3 The driver f_s(y, z) has the form f_s(y, z) = h(X_s) − a_1 y, where X is a geometric Brownian motion starting from X_0 = x ∈ R_+ with constant drift b and constant volatility σ > 0:

dX_s = b X_s ds + σ X_s dW_s,

h = (h_1, h_2)* is nonnegative and Lipschitz continuous, and a_1 > max{b, 0} is large enough so that the structure condition holds for a = −a_1 + 2.5λ.

Assumption 4 The switching cost g^{ij} for i, j ∈ {1, 2} is a constant, and satisfies (1) g^{ii} = 0; and (2) g^{ij} + g^{ji} > 0 for i ≠ j.

Under Assumptions 3 and 4, the solution to the infinite horizon BSDE system (2.1) is Markovian, i.e. there exist measurable functions v = (v_1, v_2) such that Y_t = v(X_t). By Proposition 2.1, v(X_t) is the solution to the equation (3.1) for i = 1, 2 and j = 3 − i. Without loss of generality, we may also assume that h(0) = 0. From Theorem 2.2, by choosing r = −a_1, we know that v_i(X_0) = v_i(x) is the value of the optimal switching problem (3.2). Moreover, the optimal switching strategy is given as in (2.9): τ*_0 = 0 and α*_0 = i, with α*_{k+1} = 3 − α*_k. Therefore, the player will switch from regime i to regime j if X lies in the following switching region S_i at Poisson arrival times:

S_i = {x ∈ R_+ : v_j(x) − g_{ij} ≥ v_i(x)}.

On the other hand, the player will stay in regime i if X lies in the continuation region C_i:

C_i = {x ∈ R_+ : v_j(x) − g_{ij} < v_i(x)}.

We further set x_1 = inf S_1 and x_2 = sup S_2, with the usual convention that inf ∅ = ∞ and sup ∅ = 0.
To distinguish the regimes 1 and 2, we impose the following assumption, which includes several interesting cases for applications.

Assumption 5
The difference of the running profits is non-negative, F(x) = h_2(x) − h_1(x) ≥ 0, and strictly increasing on (0, ∞); moreover, the switching cost from regime 1 to regime 2 is positive: g_12 > 0.
The above assumption has a clear financial meaning. The non-negativity means that regime 2 is more favorable than regime 1. The monotonicity implies that this advantage grows as the state increases. Thus, it is natural to assume that the corresponding switching cost g_12 from regime 1 to regime 2 is positive.

The Structure of Switching Regions: Proof of Theorem 3.1
The proof relies on several basic properties of the value function v(·), and the associated comparison principle. First, we prove that the value function v(·) has at most linear growth and is Lipschitz continuous by employing the optimal switching representation (3.2). The proof is adapted from Ly Vath and Pham [18], and is provided in the Appendix.
Proposition 3.2 Suppose that Assumptions 3 and 4 hold. Then the value function v(·) of the optimal switching problem has at most linear growth and is Lipschitz continuous, for i = 1, 2 and x, x̄ ∈ R_+.
Given the above linear growth and Lipschitz continuity of v(·), the nonlinear Feynman-Kac formula (see Section 6 of [8]) implies that v(·) is the unique (viscosity) solution to the system of ODEs (3.3) for i = 1, 2 and j = 3 − i, where the operator L = (1/2) σ^2 x^2 d^2/dx^2 + b x d/dx. Note that (3.3) is in fact the penalized equation for the system of variational inequalities (3.4). Moreover, the following comparison principle for (3.3) also holds, whose proof is likewise provided in the Appendix for completeness.
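The display (3.3) is elided in this copy; a standard penalized form consistent with the surrounding discussion would be a_1 v_i − L v_i = h_i + λ (v_j − g_ij − v_i)^+, j = 3 − i. Assuming that reading, the following sketch solves the system by a monotone finite-difference scheme with a frozen-active-set iteration, and reads off a numerical proxy for S_1 = {v_2 − g_12 ≥ v_1}; all parameter values and the running profits are hypothetical:

```python
import numpy as np

# Hypothetical data: GBM drift b, volatility sigma, monotonicity constant a1,
# Poisson intensity lam, constant switching costs, and illustrative running
# profits with h2 - h1 nonnegative and increasing.
b, sigma, a1, lam = 0.05, 0.3, 0.6, 2.0
g = np.array([[0.0, 0.1], [0.05, 0.0]])
h = [lambda x: np.zeros_like(x), lambda x: np.minimum(x, 1.0)]

n, xmax = 400, 10.0
x = np.linspace(0.0, xmax, n)
dx = x[1] - x[0]

# Monotone finite-difference matrix for (-L + a1) with
# L = (sigma^2 x^2 / 2) d^2/dx^2 + b x d/dx; L degenerates at x = 0.
A = np.zeros((n, n))
A[0, 0] = a1
for k in range(1, n - 1):
    c2 = 0.5 * sigma**2 * x[k]**2 / dx**2
    c1 = b * x[k] / (2 * dx)
    A[k, k - 1], A[k, k], A[k, k + 1] = -(c2 - c1), 2 * c2 + a1, -(c2 + c1)

v = np.zeros((2, n))
for _ in range(200):
    v_old = v.copy()
    for i in range(2):
        j = 1 - i
        # freeze the active set of the penalization lam * (v_j - g_ij - v_i)^+
        active = (v[j] - g[i, j] - v[i]) > 0
        M = A + np.diag(lam * active.astype(float))
        rhs = h[i](x) + lam * active * (v[j] - g[i, j])
        M[-1, :] = 0.0
        M[-1, -1], M[-1, -2] = 1.0, -1.0      # crude Neumann condition v'(xmax) = 0
        rhs[-1] = 0.0
        v[i] = np.linalg.solve(M, rhs)
    if np.max(np.abs(v - v_old)) < 1e-9:
        break

# Approximate switching region of regime 1: where v2 - g12 >= v1.
S1 = x[v[1] - g[0, 1] >= v[0]]
print(S1.min() if S1.size else "S1 empty on the grid")
```

With these illustrative data one expects, in line with the discussion below, that S_1 is a half line [x_1, ∞) when a_1 g_12 < F(∞).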
Proposition 3.3 Suppose that v_i(·) is a subsolution of the following ODE for x ∈ R_+, and that v_i(·) has at most linear growth. Then v_i(x) ≤ 0 for x ∈ R_+. A similar statement holds when v_i(·) is a supersolution.
We now turn to the structure of the switching region S_1. Consider the difference G_1 = v_1 − (v_2 − g_12), so that S_1 = {x ∈ R_+ : G_1(x) ≤ 0}; it is easy to see that G_1(·) is the solution of the ODE (3.5). From Proposition 3.2, G_1(·) has at most linear growth and is Lipschitz continuous.

Lemma 3.4 Suppose that Assumptions 3, 4 and 5 hold. Then, differentiating (3.5), it is easy to see that G'_1(·) is a subsolution of the corresponding ODE, where H(x) = 1_{[0,∞)}(x). Since a_1 > b, by using the subsolution property in Proposition 3.3, we conclude that G_1(·) is non-increasing, so that S_1 is a half line.

Proposition 3.5 Suppose that Assumptions 3, 4 and 5 hold.

1. If a_1 g_12 ≥ F(∞), then S_1 = ∅.

2. If a_1 g_12 < F(∞), then x_1 ∈ (0, ∞) and S_1 = [x_1, ∞).

Proof.
2. Similarly to the proof of part 1, we have x_1 ≠ 0. Therefore, we only need to prove that x_1 ≠ ∞. If not, then G_1(x) > 0 for x ∈ (0, ∞), and (3.5) reduces to a linear ODE. By the comparison principle in Proposition 3.3, we have G_1(x) ≤ G(x), where G(x) is the solution with linear growth to the corresponding ODE. The Feynman-Kac formula gives a probabilistic representation of G, and by Fatou's lemma we deduce that G_1(∞) < 0. This is a contradiction.
The financial intuition behind Proposition 3.5 is that the structure of the switching region of regime 1 depends on the "instant loss" a_1 g_12 due to the switching cost and the "net running profit" F = h_2 − h_1. If a_1 g_12 ≥ F(∞), so that the loss from switching cannot be compensated by the net running profit, then one has no incentive to switch. If a_1 g_12 < F(∞), so that the net running profit may exceed the loss due to the switching cost in some states, then one switches once the net running profit reaches a certain level at a Poisson arrival time.
2. Thanks to Lemma 3.6, we only need to show that x_2 ≠ 0 and x_2 ≠ ∞.
Hence, (3.7) reduces to a linear ODE. By the comparison principle in Proposition 3.3, we have G_2(x) ≤ G(x), where G(x) is the solution with linear growth to the corresponding ODE. By continuity of both G_2(·) and G(·), G_2(0) ≤ G(0) = g_21 < 0, which is a contradiction.
The Feynman-Kac formula gives a probabilistic representation of G. By Fatou's lemma, we then obtain a contradiction. Thus, we have x_2 ≠ ∞.
The financial intuition behind Proposition 3.7 is as follows: (1) when the switching cost from a "higher" regime to a "lower" regime is non-negative, it is never optimal to switch; (2) if the switching cost is negative and can "compensate" the loss due to switching to the "lower" regime in some states, the player switches to the "lower" regime at some level at Poisson arrival times; (3) if the profit from the (negative) switching cost exceeds the maximal loss in the "net running profit", the player switches from the "higher" regime to the "lower" regime at Poisson arrival times.

A Appendix
A.1 Proof of Proposition 2.1

Existence of solutions to the infinite horizon BSDE system (2.1). The idea is to truncate the infinite horizon BSDE system (2.1) on [0, ∞) to a finite horizon one (A.1) on [0, n] for any n ≥ 0. Then, for n ≥ m ≥ 0, we consider the difference (A.2) of the two equations truncated on the time intervals [0, n] and [0, m]. By the monotonicity and Lipschitz conditions in Assumption 1, the second term on the RHS of (A.2) is dominated by (A.3) for any constants δ_1, δ_2 > 0, where we used the elementary inequality 2ab ≤ (1/δ^2) a^2 + δ^2 b^2. Similarly, the third term on the RHS of (A.2) is dominated by (A.4).

By plugging (A.3) and (A.4) into (A.2), choosing δ_1^2 > 1 and δ_1^2 + δ_2^2 < δ, and choosing a as in the structure condition (2.4), we obtain the basic a priori estimate. Taking expectation at t = 0, we have E[∫_m^n e^{2as} |f_s(0, 0)|^2 ds] ↓ 0 as m, n ↑ ∞. Hence, (Z(n))_{n≥0} is a Cauchy sequence in H^2_a(R^{d×n}), and converges to some limit process, denoted by Z. On the other hand, taking the supremum over t ≥ 0 and then taking expectation, the standard argument using the BDG inequality implies that the martingale term is in fact uniformly integrable, so by taking m, n ↑ ∞, we deduce that (Y(n))_{n≥0} is a Cauchy sequence in S^2_a(R^d), and converges to some limit process, denoted by Y.

It is standard to check that (Y, Z) indeed satisfies (2.1), so in order to verify that it is a solution to the infinite horizon BSDE system (2.1), we only need to prove (A.5). Indeed, since (Y(n))_{n≥0} is a Cauchy sequence in S^2_a(R^d), for any ε > 0 there exists n large enough such that the corresponding tail is smaller than ε. Letting t ↑ ∞ and noting that Y_t(n) = 0 for t ≥ n, we obtain the desired convergence (A.5).
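The truncation argument can be visualized in the simplest deterministic case, where the BSDE reduces to a backward ODE with driver f(y) = c − a_1 y: the truncated solutions Y(n)_t = (c/a_1)(1 − e^{−a_1 (n−t)}) (terminal value zero at time n) converge to the infinite horizon solution c/a_1 in the weighted sup norm sup_t e^{at}|·| when a < 0, mirroring the role of the weight e^{2as} in the norms above. All constants here are hypothetical:

```python
import numpy as np

# Scalar deterministic toy: driver f(y) = c - a1 * y (monotone with constant a1).
# Infinite-horizon solution: Y_t = c/a1.  Truncation on [0, n] with Y(n)_n = 0:
# Y(n)_t = (c/a1) * (1 - exp(-a1 * (n - t))).
c, a1, a = 1.0, 0.6, -0.2          # hypothetical constants with a < 0 < a1

def weighted_gap(n):
    """sup over t in [0, n] of e^{a t} |Y(n)_t - c/a1|, evaluated on a grid."""
    t = np.linspace(0.0, n, 2000)
    gap = (c / a1) * np.exp(-a1 * (n - t))      # |Y(n)_t - c/a1|
    return np.max(np.exp(a * t) * gap)

gaps = [weighted_gap(n) for n in (5, 10, 20, 40)]
print(gaps)    # decreasing to zero: the truncations form a Cauchy sequence
```

Without the negative weight a < 0 the unweighted gap sup_t |Y(n)_t − c/a_1| would stay of size c/a_1 near t = n, which is why the weighted spaces H^2_a and S^2_a are used.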
Uniqueness of solutions to the infinite horizon BSDE system (2.1). The proof of uniqueness is similar to the proof of existence, so we only sketch it. Let (Y, Z) and (Ỹ, Z̃) be two solutions to the infinite horizon BSDE system (2.1). Denote δY^i_t = Y^i_t − Ỹ^i_t and δZ^i_t = Z^i_t − Z̃^i_t. Then (δY, δZ) satisfies the equation (A.6). Applying Itô's formula to e^{2at} |δY_t|^2, we obtain

e^{2at} |δY_t|^2 = e^{2aT} |δY_T|^2 − ∫_t^T 2a e^{2as} |δY_s|^2 ds + ∫_t^T 2 e^{2as} δY_s · δf_s ds − ∫_t^T e^{2as} |δZ_s|^2 ds − ∫_t^T 2 e^{2as} δY_s · δZ_s dW_s, (A.7)

where δf_s denotes the difference of the two drivers. Using the monotonicity and Lipschitz conditions in Assumption 1, choosing δ_1^2 = δ, and taking expectation in (A.7) at t = 0, we obtain the first estimate. Since lim_{T↑∞} E[e^{2aT} |δY_T|^2] = 0, we conclude that δZ = 0 in H^2_a(R^{d×n}). On the other hand, choosing δ_1^2 = δ, taking the supremum over t ≥ 0, and taking expectation in (A.7), since lim_{T↑∞} E[e^{2aT} |δY_T|^2] = 0 and the martingale term is uniformly integrable, we conclude that E[sup_{t≥0} e^{2at} |δY_t|^2] = 0.

A.2 Proof of Proposition 3.2
For any switching strategy u ∈ K_i(0, λ), the running profit in (3.2) satisfies the estimate (A.9) for any M ≥ 1, which can be proved by induction as in [18]. Indeed, (A.9) obviously holds for M = 1. Suppose that (A.9) holds for some M ≥ 1; we consider M + 1. When g^{α_M, α_{M+1}} ≥ 0, (A.9) obviously holds since a_1 ≥ 0. When g^{α_M, α_{M+1}} ≤ 0, we obtain the corresponding estimate. Once again, from the arbitrariness of u ∈ K_i(0, λ), we obtain that the RHS of the above inequality is a lower bound of v_i(·), so v_i(·) has at most linear growth. We conclude by showing the Lipschitz continuity of v_i(·): indeed, the required estimate holds for any x, x̄ ∈ R_+.

A.3 Proof of Proposition 3.3
We only prove the subsolution property, as the supersolution property is similar. The proof relies on the comparison principle for the ODE (3.4) on a finite interval. For any x_0 ∈ (0, ∞) and ε > 0, since b < a_1, we can choose p > 0 and q > 1 such that the required inequalities hold. With such p, q, we then choose C_1, C_2, C_3 > 0 accordingly. Now we consider the corresponding auxiliary function w_i, for which we have the stated bounds. Since v_i(·) is a subsolution of the ODE (3.4) and the terms involving p, q are negative by the choice of p, q, we obtain that (−L + a_1) w_i(x) ≤ 0 for x ∈ (0, ∞).