MARKOVIAN STRATEGIES FOR PIECEWISE DETERMINISTIC DIFFERENTIAL GAMES WITH CONTINUOUS AND IMPULSE CONTROLS

Abstract. This paper is concerned with the Markovian feedback strategies of piecewise deterministic differential games and their applications to business and management decision-making problems that involve multiple agents and continuous and impulse controls. For a class of piecewise deterministic differential games in finite or infinite horizons, we formulate conditions for the value functions in the form of quasi-variational inequalities, prove a verification theorem, and derive a criterion for the Markovian regime change in a certain case. These results are applied to a technology adoption problem that involves multiple companies engaged in extraction of an exhaustible resource with different technologies. Using the model proposed by Long et al. [16], we show the existence of a pure Markovian strategy and develop an algorithm for computing the solutions.


1. Introduction. Business and management planning often involves two types of decisions: one leads to gradual and slow changes, and the other leads to abrupt and sudden changes. The former includes setting production rates, adjusting investment proportions, and raising or lowering taxes. The latter includes adopting new technologies, enacting new policies, and entering into new business partnerships. Accordingly, the business and management environment often has two types of features. Some vary slowly, such as the available capital and the market shares. Others vary abruptly, such as the technology adopted and the business regulations enforced. The task of a business strategist is to make optimal gradual changes as well as to choose the optimal timing of the abrupt changes. For a single agent, this is an optimal control problem with continuous and impulse controls. For multiple agents, it is a multi-regime game that is piecewise in time.
Optimal timing for regime switch has been a subject of study in recent years, especially in the area of technology adoption and investment strategies. For single-agent optimal control problems, Tomiyama [21] studies the investment decision of a firm whose capital goods have a delivery lag. This work is continued by Tomiyama and Rossana in [22], who include the switch time as an argument of optimization. Amit [1] is interested in the petroleum recovery process that has two phases, the primary phase during which the resource recovers by itself and the secondary phase during which the resource recovers using artificial means. He derives necessary conditions for the optimal controls. Both Tomiyama and Amit deal with optimal control problems in finite horizon. In contrast, Makris [17] derives necessary conditions for the optimal regime switch in infinite horizon, and uses the results to study a model of capital controls. Boucekkine et al. [5] study a two-stage optimal control problem for technology adoption that involves obsolescence and learning costs. Following this direction, Valente [23] studies the switch from exhaustible resources to renewable resources. Saglam [20] extends the results to a multi-stage problem. Boucekkine et al. [3] consider a control problem with two types of regime switch: one concerns technological or institutional regimes, and the other features regimes defined by given threshold values, such as pollution. Their method is applied to a problem of optimal management of natural resources under ecological irreversibility. For multi-agent differential game models, the results are much sparser. Reinganum [19] considers the optimal timing for adopting a new technology by finding a Nash equilibrium; Fudenberg and Tirole [9] consider the timing of adopting a new technology for n identical firms. Boucekkine et al. [4] study political regime changes in resource-rich countries and examine the "oil impedes democracy" hypothesis.
Finally, Long et al. [16] study a two-company competition model of technology adoption for extraction of an exhaustible resource. It is interesting to note that although many of the above works provide closed-loop feedback strategies, none gives pure Markovian strategies.
In this paper we develop Markovian feedback strategies for autonomous piecewise deterministic differential games, and apply the results to a class of business and management problems. Differential game models have long been used in economics and management, for example in resource and environmental management and in marketing and investment strategies (cf. [7,8,11,14,15,18]). However, ordinary differential games mostly model gradually changing processes and thus are not sufficient for processes that also involve regime changes. For processes that intertwine gradual and abrupt changes, strategies that synthesize both continuous and impulse controls are needed. Piecewise deterministic differential games are tools for determining such synthesized strategies. A piecewise deterministic differential game is a type of differential game that involves continuous evolution of the state variables and abrupt regime changes at certain jump times. Between such instants the state variables evolve deterministically according to the system dynamics of the economic interrelationship. This theory has been developed in recent years and has been used in some models of economics ([2,6,7,10,12,13,16]). In this paper, we apply it to a class of business and management problems.
The problems under study are finding Markovian strategies for single or multiple agents in a system that consists of finite sets of state variables and regimes. The state variables are governed by a system of autonomous differential equations. The agents can use continuous controls to alter the rates of change of the state variables. At any time the system is in one of a finite number of regimes. The regimes are changed by the impulse controls of the agents. Finally, each agent has an instantaneous payoff as well as a lumpsum payoff when the regime switches. The objective of each agent is to choose the optimal continuous and impulse controls to obtain the maximum possible total payoff, assuming that the other agents are doing the same. The precise mathematical form of the problems is given in Section 2. For such problems, we derive conditions for Markovian strategies as a set of quasi-variational inequalities. We show that classical solutions to the quasi-variational inequalities are optimal by proving a verification theorem. In addition, in the case where the system has only one state variable, we derive a necessary condition for regime change. To illustrate the application of our results, we perform a thorough analysis of the technology adoption problem in the resource extraction process proposed in [16], showing the existence of Markovian strategies and describing how the strategies can be numerically computed. These results are new: so far there exist no similar methods in the literature for finding Markovian strategies that involve both continuous and impulse controls for multi-company competition problems.
This paper is organized as follows. In Section 2 we derive a system of quasi-variational inequalities for Markovian strategies of the general piecewise deterministic differential game model and provide a verification theorem which guarantees the optimality of the Markovian strategies. We also derive a necessary condition for the regime switch in the case where there is one state variable. In Section 3 we apply the results of Section 2 to the resource extraction model proposed in [16], deriving the Markovian strategies in the cases where there is one company with multiple technologies, and where there are multiple companies with two technologies. Numerical examples are presented, including one with the same parameter values as in [16] for comparison. In Section 4 we give brief concluding remarks. Finally, in the Appendix we give proofs of Theorems 2.1 and 3.3.

2. Markovian continuous and impulse controls. In this section we derive conditions for the Markovian strategies in the form of quasi-variational inequalities, provide a verification theorem, and derive a necessary condition for the regime change when the system involves only one state variable. In what follows, we use "player(s)" for "agent(s)" and "mode(s)" for "regime(s)" to be consistent with the terminology used in the literature of differential games.
The mathematical form of the problems is formulated as follows. The general differential game involves n players, denoted by i = 1, . . . , n, d state variables represented by a d-tuple y = (y_1, . . . , y_d) ∈ R^d, and m modes σ ∈ {σ_1, . . . , σ_m}. We use Ω and Σ to denote the sets of possible values of y and σ, respectively. The state variables evolve deterministically except at certain jump times t < t_1 < t_2 < · · · < T, at which the mode and the values of the state variables change abruptly. Here, T is either a finite positive number or ∞, and the jump times are determined endogenously as described in Subsection 2.1. Between two jumps the state variables are governed by a system of differential equations,

y′(s) = f^{σ_l}(y(s), u(s)) for t_l < s < t_{l+1}, y(t_l) = x_l, l = 1, 2, . . . ,
where σ_l is the mode for t_l < s < t_{l+1}, u = (u_1, . . . , u_n) represents the continuous controls of the players, and x_l ∈ Ω is the value of the state variables at t_l. Switching of the mode is caused by the impulse controls of the players, given by a mapping σ′ = F(x, σ, ξ), where F is continuous in x and ξ = (ξ_1, . . . , ξ_n) represents the impulse controls of the players. We assume that the payoffs are exponentially discounted. Let g_i^σ(x, u) be the instantaneous payoff for player i when the state is (x, σ). Also, let γ_i^{σσ′}(x) be the lumpsum payoff for player i when the mode changes from σ to σ′. Then the exponentially discounted total payoff for player i for s > t, when the initial values are θ(t) = σ and y(t) = x, is given by the expression in which E_t^i is the expectation operator conditional on player i's information available at t, ρ_i > 0 is the discount factor, V̄_i^σ is the lumpsum payoff for player i in mode σ at the time when the state variable y(s) reaches the boundary ∂Ω, and χ_(t,∞)(·) is the characteristic function of the interval (t, ∞). The sets of admissible continuous and impulse controls on any interval [t, T) are, respectively, U_i[t, T) and X_i[t, T) for i = 1, . . . , n, where U_i and X_i are the sets of possible values of u_i and ξ_i, respectively. An optimal strategy profile is a pair {u*, ξ*}, where u* ≡ u*(·) = (u_1*(·), . . . , u_n*(·)) and ξ* ≡ ξ*(·) = (ξ_1*(·), . . . , ξ_n*(·)) satisfy the equilibrium conditions (2.4) on [t, T), i = 1, . . . , n. (Throughout this paper, we use the subscript "−i" to denote the sub-vector with all the components j ∈ {1, . . . , n} \ {i}.)
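In symbols, the total payoff described above can be sketched as follows (a hedged reconstruction of the elided display from the verbal description; τ denotes the first time y(s) reaches ∂Ω, and σ_{l−1} → σ_l is the mode jump at t_l):

```latex
J_i^{\sigma}(t,x)
 = E_t^i\!\left[\int_t^{T} e^{-\rho_i (s-t)}\, g_i^{\theta(s)}\!\big(y(s),u(s)\big)\,ds
 \;+\; \sum_{t<t_l<T} e^{-\rho_i (t_l-t)}\, \gamma_i^{\sigma_{l-1}\sigma_l}\!\big(y(t_l)\big)
 \;+\; e^{-\rho_i(\tau-t)}\, \bar V_i^{\theta(\tau)}\!\big(y(\tau)\big)\,\chi_{(t,\infty)}(\tau)\right].
```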

2.1. Quasi-variational inequalities. We derive conditions for the Markovian strategies. Let V_i^σ(x), for i ∈ {1, . . . , n} and σ ∈ Σ, denote the value functions associated with a profile {u*(·), ξ*(·)} satisfying (2.4). A Markovian strategy is a mapping (x, σ) → (u*, ξ*), where u* = (u_1*, . . . , u_n*) are the continuous controls and ξ* = (ξ_1*, . . . , ξ_n*) are the impulse controls, depending only on (x, σ). The impulse controls form a Nash equilibrium. For any i ∈ {1, . . . , n} and σ ∈ Σ we define a function on X_i; the payoff of player i at the state (x, σ) when the players take the impulse controls ξ = (ξ_1, . . . , ξ_n) is then given accordingly. The result of the actions ξ* is the mode change σ → σ*, and the change σ → σ* can be made only when the corresponding relation holds with equality. We derive the condition for u* as follows. If the mode does not immediately change at t, the value function must satisfy the inequality (2.7) for t′ > t near t, where u*(·) = (u_1*(·), . . . , u_n*(·)) ∈ U[t, t′) is the part of the Nash equilibrium strategy profile u*(·) on [t, t′), and y(·) is the solution of the associated initial-value problem. The result is a Markovian strategy for the differential game. Assuming that V_i^σ is differentiable, with f^σ = (f_1^σ, . . . , f_d^σ) and D the gradient operator, substituting the right-hand side into (2.7), dividing the result by t′ − t, and taking the limit t′ → t, we obtain (2.9). The equilibrium continuous controls u* = (u_1*, . . . , u_n*) then satisfy the inequalities (2.9) for any u_i ∈ U_i, i = 1, . . . , n.
It follows from the above derivations that if the value function V_i^σ is differentiable in Ω, both inequalities (2.6) and (2.9) must hold, and at least one of them holds with equality. This leads to the quasi-variational inequalities (2.10), where u* ∈ U, ξ* ∈ X, and σ* ∈ Σ satisfy the stated conditions for any ξ_i ∈ X_i, i = 1, . . . , n.
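A schematic form of such quasi-variational inequalities, stated under our reading of the derivation (the intervention to σ* yields the continuation value plus the lumpsum payoff; the paper's exact display may differ in detail), is:

```latex
\min\Big\{ \rho_i V_i^{\sigma}(x)
   - \max_{u_i \in U_i}\big[\, g_i^{\sigma}(x,u_i,u_{-i}^{*}) + D V_i^{\sigma}(x)\cdot f^{\sigma}(x,u_i,u_{-i}^{*}) \,\big],\;
   V_i^{\sigma}(x) - V_i^{\sigma^{*}}(x) - \gamma_i^{\sigma\sigma^{*}}(x) \Big\} = 0,
 \qquad i=1,\dots,n,\ \sigma\in\Sigma .
```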
(H): Let ξ̄_i denote the impulse control of player i taking no action, and let ξ̄ = (ξ̄_1, . . . , ξ̄_n). Then F(x, σ, ξ̄) = σ for any (x, σ) ∈ Ω × Σ. Also, the functions γ_i^{σσ′}(x) satisfy γ_i^{σσ}(x) = 0 for any x ∈ Ω and i ∈ {1, . . . , n}. The first part of Hypothesis (H) indicates that if all players take no action, the mode does not change; the second part indicates that if the mode does not change, no player receives a lumpsum payoff. Under this hypothesis we assert the following.

Theorem 2.1 (verification theorem) asserts that classical solutions of the quasi-variational inequalities (2.10) yield strategy profiles satisfying (2.4). The lengthy proof is deferred to the Appendix.

2.2. Mode change. We next derive a criterion for mode changes when the state space is one-dimensional, that is, when d = 1 and Ω is an interval. In this case, V_i^σ(x) is a function of a single variable and DV_i^σ(x) is a scalar for any i and σ. Substituting a solution u* of the first inequality in (2.11) gives the system (2.14). If in a mode σ the system (2.14) can be solved for p = (p_1, . . . , p_n) for any x ∈ Ω, then the following condition holds for the mode change σ → σ*.
Theorem 2.2. Let Hypothesis (H) hold and let d = 1. Suppose x* ∈ Ω and there is a neighborhood (a, b) ⊂ Ω of x* such that the mode is σ for x ∈ (a, x*) and is σ′ for x ∈ (x*, b). Also suppose that the system (2.14) is uniquely solvable for p_i, i ∈ N. Then the following relations hold: (2.15)
Proof. Since the mode is σ for x ∈ (a, x*) and is σ′ for x ∈ (x*, b), by (2.10), the value functions satisfy the system (2.16). Let the solution (p_1, . . . , p_n) of (2.14) be written as p_i = p_i(y_1, . . . , y_n) for i = 1, . . . , n.
Then, system (2.16) can be written as (2.17). Hence the system of differential equations can be written in integral form. Since the value functions are optimal, ∂_{x*}V_i^{σ′}(x) = 0 for any x ∈ (x*, b) and i ∈ N. Thus, by differentiating the right-hand side of the integral equation with respect to x*, we obtain a system of equations for i = 1, . . . , n, which is equivalent to a system in (p_1*, . . . , p_n*) for i = 1, . . . , n. Using the initial conditions in (2.17), we obtain (2.15). This completes the proof.
3. Application: Technology adoption in a resource extraction process. In this section, we solve a technology adoption problem for a resource extraction process as an illustration of the results in Section 2. In a recent paper [16], Long et al. study a differential game model of two competing companies extracting an exhaustible resource with two available technologies. Each company has at its disposal a continuous control and an impulse control: the former is the day-to-day consumption rate of the resource, and the latter is the timing of adopting the new technology. The authors of [16] propose a kind of "closed-loop" strategy for the companies, with Markovian continuous controls and non-Markovian impulse controls. In this section we develop pure Markovian feedback strategies with both continuous and impulse controls, and prove their optimality.
To motivate our formulation, we first describe the model proposed in [16]. The differential game model consists of two companies, indicated by a subscript i ∈ {1, 2}, and two technologies, old and new. The new technology is more efficient, but there is a cost for adopting it. Depending on which technology each company uses, there are four possible modes (1, 1), (1, 2), (2, 1), and (2, 2), where the first component represents the technology that Company 1 is using and the second component the one that Company 2 is using, with 1 representing the old technology and 2 the new. It is assumed that the companies extract the resource in proportion to their consumption rates. Let γ_i^j be the constant of proportionality for Company i using Technology j. Then the reciprocal of γ_i^j represents the efficiency of Technology j for Company i, and the rate of extraction of the resource by Company i is −γ_i^j u_i(t). Hence the resource stock y(t) is governed by the differential equation ẏ(t) = −γ_1^{j_1} u_1(t) − γ_2^{j_2} u_2(t). It is assumed in [16] that the instantaneous payoff for Company i is ln u_i(t) and the lumpsum cost of Company i adopting the new technology is α_i + β_i y(t) for some positive constants α_i and β_i. In addition, the payoff is exponentially discounted with a rate ρ ∈ (0, 1). Thus the total payoff of Company i starting extraction at time t, when the resource stock is x and the mode is σ, is the discounted integral of ln u_i minus the discounted adoption cost, where T, either finite or infinite, is the time when the extraction ends, and χ_(t,T)(·) is the characteristic function of the interval (t, T). (It is shown below that T is actually finite.) Each company has two types of controls: the continuous control u_i(t), which is the instantaneous rate of consumption of the resource, and the impulse control ξ_i, which is the switch to the new technology. A strategy of a company consists of both continuous and impulse controls at any time. A Markovian strategy is one in which both u_i and ξ_i depend directly on (y, σ), but not directly on t.
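The total payoff just described can be written out as follows (a hedged reconstruction of the elided display, using the affine adoption cost α_i + β_i x that matches the switching cost ω in Subsection 3.1; t_i* denotes the time at which Company i adopts the new technology):

```latex
J_i^{\sigma}(t,x) = \int_t^{T} e^{-\rho (s-t)} \ln u_i(s)\,ds
 \;-\; e^{-\rho (t_i^{*}-t)} \big(\alpha_i + \beta_i\, y(t_i^{*})\big)\, \chi_{(t,T)}(t_i^{*}),
```

with y(t) = x and the mode at t equal to σ.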
This means there are functions φ_i and ψ_i defined on R_+ × Σ, where Σ is the set of possible modes, such that u_i(t) = φ_i(y(t), σ(t)) and ξ_i(t) = ψ_i(y(t), σ(t)). The closed-loop strategies proposed in [16] assume that the company that switches to the new technology second chooses its switching time based on the state (y, σ) at the time when the first company switches. For example, if Company 1 first switches to the new technology at t_1, then Company 2 switches at t_2 based not on y(t_2) but on y(t_1). The authors term such a strategy a "piecewise closed-loop" strategy. Such a strategy ignores the effect on a company's switching time of the observations the company makes after the other company has switched to the new technology. It also makes the strategy rather complicated and difficult to generalize to multi-company and multi-technology cases.
We prove the existence of pure Markovian strategies of the continuous and impulse controls, and develop a computing algorithm for the Markovian strategies in two cases. One case is the optimal control problem with one company and any number of technologies. The other is the differential game model with any number of companies and two technologies. The more general case of n companies and m technologies quickly becomes extraordinarily complicated as n and m increase, and we therefore content ourselves with these two special cases. It will be shown that in the first case the value functions can be solved analytically, while in the second case they cannot, but can be computed numerically.
3.1. One company with multiple technologies. In this subsection we consider the case where there is one company with multiple technologies. The company wants to choose the optimal rate of consumption and optimal times to change technology. This is a typical optimal control problem with continuous and impulse controls. We first give the form of the value function and derive the criterion for the value of the resource stock at the mode change, and then formulate a computing algorithm for determining the optimal continuous and impulse controls. An example is provided to illustrate the algorithm.
Suppose there are m technologies, denoted by σ = 1, . . . , m. The resource stock is governed by the differential equation ẏ(t) = −γ^σ u(t). The quasi-variational inequalities consist of the continuous-control and switching parts, where ω^{σσ′}(x) = α^{σσ′} + β^{σσ′} x is the cost of the technology change from type σ to type σ′. It is easy to see that the maximizing consumption rate is u^σ = 1/(γ^σ DV^σ), so the quasi-variational inequalities (2.10) take the corresponding form for σ = 1, . . . , m.
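For one company the maximization over the continuous control can be carried out in closed form; under the logarithmic payoff the first-order condition gives (a sketch consistent with the initial-condition equation (A.20) specialized to n = 1):

```latex
u^{\sigma}(x) = \frac{1}{\gamma^{\sigma}\, D V^{\sigma}(x)}, \qquad
\rho V^{\sigma}(x) + \ln \gamma^{\sigma} + \ln D V^{\sigma}(x) + 1 = 0 .
```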
3.1.1. Value function in the terminal mode. The following theorem gives the value function in the terminal mode.
Theorem 3.1. Suppose mode σ continues until the end of extraction, when the resource is exhausted. Also suppose that V^σ(x_0) = V_0^σ for some x_0 ≥ 0. Then for any x > x_0 the value function V^σ is given by (3.6). Proof. In mode σ, V^σ(x) satisfies the stated initial value problem. By differentiating both sides of the differential equation, we find that the derivative P^σ = V_x^σ satisfies an initial value problem with a separable equation, whose solution gives (3.6).
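The separable equation and its solution can be written out as follows (a reconstruction from the relation ρV^σ + ln γ^σ + ln DV^σ + 1 = 0, i.e., (A.20) with n = 1; P_0^σ is fixed by the initial value V^σ(x_0) = V_0^σ):

```latex
D P^{\sigma} = -\rho\,(P^{\sigma})^{2}, \qquad
P^{\sigma}(x) = \frac{P_0^{\sigma}}{1 + \rho P_0^{\sigma}(x - x_0)}, \qquad
V^{\sigma}(x) = V_0^{\sigma} + \frac{1}{\rho}\,\ln\!\big(1 + \rho P_0^{\sigma}(x - x_0)\big),
```

where P_0^σ = e^{−1−ρV_0^σ}/γ^σ. One can verify directly that ρV^σ + ln γ^σ + ln P^σ + 1 = 0 holds for all x ≥ x_0.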

3.1.2. Criterion for change of technology.
We show that if the technology changes from type σ to type σ′ as x decreases across x*, then the relation (3.9) holds. Furthermore, for x > x* the value function V^σ(x) is given by (3.10).
Theorem 3.2. Suppose the mode changes from σ to σ′ as x decreases across x* and there is no other mode change in a neighborhood of x*. Then x* satisfies (3.9) and the value function V^σ(x) in mode σ is given by (3.10).
3.1.3. Computing algorithm. Theorems 3.1 and 3.2 give the value function within each mode and the criterion for x* at a mode change. Based on these results we can find the Markovian feedback controls as follows. Let σ be the terminal mode, in which the extraction of the resource ends. Tracing backward, the last technology change occurs at x* = x*_{σ′σ} satisfying equation (3.9), where σ′ is the mode before the technology change and V^σ(x) is the value function in the terminal mode σ, given by (3.6) with V_0^σ = 0. Using (3.6) with V_0^{σ′} = V^σ(x*) − ω^{σ′σ}(x*), we can find the value function V^{σ′}(x) in mode σ′. We can then find the resource stock at the previous technology change by (3.9), and the value function before that change by (3.6). This process is repeated until x reaches the initial resource stock. In each mode, the optimal continuous control is given by (3.4), and at a change of mode, the impulse control σ′ → σ is taken at the x* determined by (3.9).
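Within a single mode the quantities used by this backward recursion are all available in closed form. A minimal Python sketch (the function name and example parameters are ours; P_0 is taken from the relation ρV_0 + ln γ + ln P_0 + 1 = 0, i.e., (A.20) with n = 1):

```python
import math

def mode_value(x, x0, V0, gamma, rho):
    """Value function in a fixed mode (shape from Theorem 3.1).
    P0 solves the initial relation rho*V0 + ln(gamma) + ln(P0) + 1 = 0."""
    P0 = math.exp(-1.0 - rho * V0) / gamma
    z = 1.0 + rho * P0 * (x - x0)
    V = V0 + math.log(z) / rho      # value function V(x)
    P = P0 / z                      # shadow price P(x) = DV(x)
    u = 1.0 / (gamma * P)           # optimal consumption rate u(x)
    t = math.log(z) / rho           # time to run the stock from x down to x0
    return V, P, u, t

# Example: one mode with gamma = 1, rho = 0.05, V(0) = 0.
V, P, u, t = mode_value(10.0, 0.0, 0.0, 1.0, 0.05)
```

Given the switch stocks x* from (3.9), chaining `mode_value` across modes with the initial value updated by V_0^{σ′} = V^σ(x*) − ω^{σ′σ}(x*) reproduces the recursion described above.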
Suppose the modes are ordered so that γ^{σ′} < γ^σ if σ′ > σ, and suppose that α^{σσ′} ≥ 0 and β^{σσ′} ≥ 0 if σ′ > σ, with at least one of α^{σσ′} and β^{σσ′} positive. We propose the following algorithm for computing the optimal controls.
Step 1.: Find the value function V^m(x; 0) by (3.6). Then, for each σ < m, find x*_{σm} by solving (3.9). These are the resource stocks at which the company switches from mode σ to m directly. Draw lines connecting 0 and x*_{σm} for all x*_{σm} > 0 and label each line by m. If there exists a σ < m that is not connected to m, find the largest m′ > σ such that x*_{σm′} > 0 and draw a line connecting 0 and x*_{σm′}. If no such m′ exists, connect 0 by σ to any x*_{σ′σ} for all σ′ < σ such that x*_{σ′σ} > 0. Repeat this process until each mode is either labeled on a segment connecting 0 or there is a mode σ̄ > σ such that x*_{σσ̄} > 0.
Step 2.: For any σ′ that is labeled on a line segment connecting 0, and each σ < σ′, find x*_{σσ′} by solving the corresponding equation. If x*_{σσ′} is larger than the stock at which σ′ is entered, draw a line connecting the two stocks and label the line by σ′. Repeat this step for all modes less than σ′, and so on.
Step 3.: Let x_p and σ_1 denote the present resource stock and mode, respectively. Identify all paths that connect x_p and 0 by line segments, in the form σ_1 → σ_2 → · · · → σ_k. For each such path, compute V^{σ_1}(x_p; x*_{σ_1σ_2}) and choose the maximum of these values. The optimal impulse controls consist of switching the modes labeled on the segments of the chosen path, at the resource stocks that are the nodes of the path. The continuous control in mode σ_i is given by (3.14) for i = 1, . . . , k − 1.
We can find the time t in terms of x as follows. From Eq. (3.2) we find dt = −dy/(γ^σ u^σ(y)) = −P^σ(y) dy in mode σ.
The continuous controls are given by (3.14). As shown in Fig. 3.3, since x*_{12} < x*_{23}, there is no connection between x*_{23} and x_p. The only path connecting 0 and x_p is 1 → 3. This means Technology 2 will never be adopted, and the company will use Technology 1 until the resource stock reaches x = 301.87, at which time t*_{13} = 22.59. The continuous controls are given by (3.14) with V^1(x*_{13}) = V^3(x*_{13}; 0) − ω^{13}(x*_{13}) = 31.30. The extraction ends when t = 64.95. Fig. 3.4 shows how the extraction and the consumption rates change with time.

3.2. Multiple companies with two technologies. In this subsection we consider the case of n companies with two technologies. The modes can now be represented by n-tuples σ = (µ_1, . . . , µ_n), where µ_i ∈ {1, 2} means that Company i is using Technology µ_i. Let Σ denote the set of all 2^n possible modes. The resource stock equation has the form (3.16), and the value functions V_i^σ satisfy the inequalities (3.17), where ω_i represents the cost of Company i switching from Technology µ_i to Technology µ′_i. The maximizing consumption rates u_i^{µ_i} satisfy the first-order conditions, whose solution is u_i^{µ_i} = 1/(γ_i^{µ_i} DV_i^σ), i = 1, . . . , n. Therefore, the differential inequalities for the value functions become

ρV_i^σ + ln γ_i^{µ_i} + ln DV_i^σ + Σ_{j=1}^n DV_i^σ/DV_j^σ ≥ 0, for i = 1, . . . , n.
The corresponding initial value problem (3.19) cannot be solved explicitly. However, we show that it has a unique classical solution for any nonnegative initial values V_{1,0}^σ, . . . , V_{n,0}^σ.
Theorem 3.3. For any σ ∈ Σ, Problem (3.19) has a unique classical solution (V_1^σ, . . . , V_n^σ) defined for all x > 0.
The lengthy proof of this theorem is deferred to the Appendix. Based on the proof, the system of differential equations (3.19) can be solved in two steps:
Step 1.: Solve for the functions P_i^σ(x) = DV_i^σ(x), x ≥ 0, i = 1, . . . , n, from the differential equations derived in the Appendix.
Step 2.: Integrate to obtain V_i^σ(x) = V_{i,0}^σ + ∫_{x_0}^x P_i^σ(s) ds. As shown in the proof of Theorem 3.3 in the Appendix, V_i^σ then satisfies (3.19).
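A minimal numerical sketch of the two steps for n = 2 (the parameter values and routine names are ours): the initial shadow prices P_i(x_0) solve the algebraic system ρV_{i,0} + ln γ_i + ln P_i + Σ_j P_i/P_j = 0 of (A.20), and DP_i is then obtained at each x from the linear system in the proof of Theorem 3.3:

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.integrate import solve_ivp

rho = 0.05
gamma = np.array([2.0, 2.0])     # illustrative extraction coefficients (assumed values)

def initial_P(V0):
    """Solve rho*V0_i + ln(gamma_i) + ln(P_i) + sum_j P_i/P_j = 0 for P(x0)."""
    def eqs(logP):
        P = np.exp(logP)          # solve in log-space to keep every P_i > 0
        return [rho * V0[i] + np.log(gamma[i]) + logP[i] + np.sum(P[i] / P)
                for i in range(len(P))]
    return np.exp(fsolve(eqs, np.zeros(len(V0))))

def dP(x, P):
    """DP from the linear system obtained by differentiating the algebraic relations."""
    n = len(P)
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = sum(1.0 / P[j] for j in range(n))          # coefficient of DP_i
        for j in range(n):
            if j != i:
                A[i, j] = -P[i] / P[j] ** 2                  # coefficient of DP_j
    return np.linalg.solve(A, -rho * P)

P0 = initial_P(np.zeros(2))                       # terminal mode: V_{i,0} = 0
sol = solve_ivp(dP, (0.0, 100.0), P0, rtol=1e-8)  # shadow prices P_i(x) on [0, 100]
```

Consistent with the proof of Theorem 3.3, the computed DP_i stay negative and the P_i stay positive along the whole interval.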

3.2.1. Change of mode.
At certain values of x one or more companies may change technology, resulting in a mode change. Suppose that Company i changes its technology from µ_i to µ′_i as x decreases across x*. Let σ and σ′ be the modes before and after the change, respectively. We suppose that no other mode change occurs in a neighborhood of x*. In this case Theorem 2.2 ensures that the relations (3.22) hold for some positive constants p_j*, j ∈ {1, . . . , n} \ {i}, provided that the system is uniquely solvable for p_1, . . . , p_n for any y_1, . . . , y_n ∈ R. The solvability of the system is given by Lemma A.1 in the Appendix with a_i = ρy_i + ln γ_i^{µ_i}. Therefore (3.22) holds. Furthermore, if a solution x* of (3.22) exists, the value functions V_1^σ(x), . . . , V_n^σ(x) satisfy equations (3.19) with the initial conditions determined at x*.
Step 1.: Start with the mode σ_0 = (2, . . . , 2). Solve the initial value problem (3.19) with σ = σ_0 and V_{i,0}^σ = 0. Then, for each i = 1, . . . , n, solve (3.22) with µ_i = 1 and µ′_i = 2, and let the solutions be denoted x_i*. If some x_i* is not positive, change the corresponding component of the mode to 1 and repeat this process with the new mode. Continue until all x_j* are positive. The resulting mode is the terminal mode, and the index i for which x_i* is the minimum identifies the last company to adopt the new technology. Denote the terminal mode by σ_1, and denote this company by m_1.
Step 2.: Let σ_2 be the mode whose m_1-th component is 1 and whose other components are the same as those of σ_1. Solve the initial value problem (3.19) with the initial values determined at x*_{m_1}. Then, for each i ∈ {1, . . . , n} such that the i-th component of σ_2 is 2, solve (3.22) and denote the solution by x_i*. Find the minimum of {x_i*} and denote its index by m_2. Repeat this process until the mode reaches the present mode σ_p.
Step 3.: The impulse controls are σ_p → · · · → σ_2 → σ_1, taken at the resource stocks x*_{m_p}, . . . , x*_{m_1}, respectively. The continuous control of Company i in mode σ_k is u_i = 1/(γ_i^{µ_i^k} DV_i^{σ_k}), where µ_i^k is the i-th component of σ_k, x*_{m_0} = 0, and x*_{m_p} ≡ x_p is the present resource stock. The time t*_{m_k} at which a company adopts the new technology can be found from the differential equation (3.16), which can be written in the form dt = −dy / Σ_j (1/P_j^σ(y)), where P_j^σ = DV_j^σ. Integrating both sides over the intervals (x*_{m_{k−1}}, x*_{m_k}) successively for k = p, . . . , 2, 1, with t_p = 0, we obtain the adoption times for k = p − 1, . . . , 1, and the total extraction time.
Step 1. We first solve (3.19) for σ = (2, 2) and V_{i,0}^σ = 0 numerically, using the method described after Theorem 3.3. System (3.19) consists of equations in P_1 and P_2, where P_i = DV_i^{2,2}; the equations can be solved for P_1 and P_2 as in (3.25). The initial values P_1(0) and P_2(0) satisfy the equations ln γ_1^2 + ln P_1(0) + 1 + P_1(0)/P_2(0) = 0 and ln γ_2^2 + ln P_2(0) + 1 + P_2(0)/P_1(0) = 0, and we find the numerical solution. We then solve system (3.22) for i = 1, 2, in one form for i = 1 and another for i = 2, where p_1* and p_2* are positive constants. Solving the systems numerically, we find x_1* = 975.26 and x_2* = 132.80.
Step 2. Since x_2* < x_1*, the mode is (2, 2) for 0 < x < x_2* and (2, 1) for x_2* < x < x_3*, where x_3* is the resource stock at which Company 1 changes technology. We find V_1^{2,1}(x) and V_2^{2,1}(x) by solving the corresponding initial value problem. We then solve the equations (3.22) for x_3*, for some constant p*. The solution is x_3* = 949.01. This completes Step 2.
If γ_2^2 in (3.24) is changed to 1.2 while the other parameters remain the same, repeating the computation gives x_1* = 962.62 and x_2* = −53.68. Since x_2* < 0, the terminal mode is (2, 1). In Steps 2 and 3, we find x_3* = 917.58, which corresponds to t_3* = 4.60. This is the time when Company 1 adopts the new technology. The resource is finally exhausted at t = 20.89. The graphs of the resource stock and the consumption rates are shown in Fig. 3.6.
These results are substantially different from those in [16]. In particular, it is interesting to note that even when both companies are using the same old technology with the same efficiency, their continuous controls need not be the same, and the companies' different future technology-switching times affect their consumption rates early on.

4. Conclusions. For a class of piecewise deterministic differential games in finite or infinite horizons, we derive Markovian feedback strategies, formulate a set of quasi-variational inequalities for the value functions, and prove a verification theorem. We also prove a necessary condition for the mode change in the case where there is only one state variable.
The results are applied to the technology adoption problem, proposed in [16], for multiple companies extracting an exhaustible resource. We derive the optimal Markovian continuous and impulse controls in two cases: one company with multiple technologies, and multiple companies with two technologies. For both cases we prove the existence of classical solutions and develop computing schemes to obtain numerical solutions. Our results improve on the "closed-loop" strategies proposed in [16] by taking into account the information a company observes after another company has switched technologies, and by allowing more companies and/or technologies.
Our results for this technology adoption problem are the first concerning Markovian strategies for multi-company interactions with continuous and impulse controls; no similar result seems to exist in the literature. On the other hand, the verification theorem that we prove in this paper is valid only for classical solutions. For more general cases where solutions exist in some weaker form, such as viscosity solutions, a new proof is needed. In addition, so far we obtain the necessary condition for the regime change only when the state space is one-dimensional; the higher-dimensional case is left for future study.
Using the mean value theorem on the left-hand side, we obtain a system in which θ_i is between z_i and z̄_i; thus ln θ_i > ln z_i. The system can then be written in matrix form. Multiplying the i-th column of the coefficient matrix by z_i and using the first relation in (A.5), we see that the matrix on the right-hand side of (A.7) is strictly diagonally dominant. By the Lévy–Desplanques theorem it is invertible, so system (A.6) has only the trivial solution. This proves the uniqueness of the solution of (A.3), which is equivalent to (A.1).
Note that any solution of (A.1) is necessarily positive, so the functions F_1, . . . , F_n are positive. Furthermore, by the Implicit Function Theorem, these functions are continuously differentiable with respect to (a_1, . . . , a_n). The next lemma concerns the determinant of the matrix

B = [ Σ_{j=1}^n a_j      −a_2      · · ·      −a_n
         −a_1      Σ_{j=1}^n a_j      · · ·      −a_n
          ⋮               ⋮            ⋱           ⋮
         −a_1         −a_2        · · ·   Σ_{j=1}^n a_j ].
We show that on the right-hand side, every product a_1^{δ_1} · · · a_n^{δ_n} has a positive coefficient, where δ_1, . . . , δ_n are nonnegative integers with δ_1 + · · · + δ_n = n. The coefficient of this product in det(B) can be computed explicitly, and it is positive if and only if inequality (A.12) holds. The right-hand side of (A.12) is a function of δ_1, . . . , δ_n on the set {δ_i ≥ 0 : δ_1 + · · · + δ_n = n}.
At this point the right-hand side takes the stated value, so (A.12) is true for any δ_1, . . . , δ_n. This proves that the coefficient of a_1^{δ_1} · · · a_n^{δ_n} in det(B) is positive.
Since all coefficients of (Σ_{k=1}^n a_k)^n are no more than n!, inequality (A.10) follows.
Proof of Theorem 3.3. Let P_i^σ = DV_i^σ for i = 1, . . . , n. Differentiating the left-hand side of (3.19) with respect to x, we obtain equations for the P_i. We show that this system can be uniquely solved for the DP_i^σ if P_i^σ > 0 for i = 1, . . . , n. The linear system for DP_i^σ, i = 1, . . . , n, can be written in the matrix form (A.14), Aq = b, where A is the coefficient matrix, q = (DP_1^σ, . . . , DP_n^σ)^T, and b = −ρ(P_1^σ, . . . , P_n^σ)^T. (Here "T" denotes transposition.) We first show that the determinant of A is positive. Dividing the i-th row of A by P_i^σ and then multiplying the i-th column of A by P_i^σ, the matrix becomes the matrix B of Lemma A.2 with a_i = 1/P_i^σ. By the properties of determinants, det(A) = det(B), and by Lemma A.2, det(B) > 0. So det(A) > 0, and system (A.14) is uniquely solvable. We next show that its solution is negative. By the symmetry of the system, it suffices to show that DP_1^σ < 0. Using Cramer's rule, DP_1^σ = det(C)/det(A) for a suitable matrix C. Dividing the i-th row of C by P_i^σ and multiplying the i-th column of C by P_i^σ, we see that det(C) equals the determinant of a matrix that is negative. This proves that system (A.13) can be uniquely solved for the DP_i^σ if P_i^σ > 0 for i = 1, . . . , n, and that each DP_i^σ is negative. We write the differential equations in the form DP_i^σ = f_i(P_1^σ, . . . , P_n^σ), i = 1, . . . , n.
(A.19) It can be seen that the functions f_i are differentiable with respect to the P_j^σ because they are quotients of determinants of matrices whose entries are differentiable functions of the variables P_j^σ, j = 1, . . . , n. The initial values P_i^σ(x_0), i = 1, . . . , n, are derived from the initial values V_i^σ(x_0) in (3.19), which lead to the equations

ρV_{i,0}^σ + ln γ_i^σ + ln P_i^σ(x_0) + Σ_{j=1}^n P_i^σ(x_0)/P_j^σ(x_0) = 0, i = 1, . . . , n. (A.20)

By Lemma A.1, this system has a unique solution (P_1^σ(x_0), . . . , P_n^σ(x_0)). Since the functions f_i in (A.19) are differentiable, the initial-value problem has a solution (P_1(x), . . . , P_n(x)) on an interval [x_0, x_0 + δ) for some δ > 0. We show that the solution exists for all x > x_0. Suppose by contradiction that the solution exists on the maximal interval [x_0, x_0 + δ̄) for some δ̄ < ∞. If P_i(x) stays positive as x → x_0 + δ̄ for all i, then by the differentiability of the f_i the solution can be extended beyond δ̄. Therefore there is an i ∈ {1, . . . , n} such that P_i(x) → 0 as x → x_0 + δ̄. We show that there is a constant C > 0 such that the following bound holds. Observe that the quantity in the brackets on the right-hand side is positive and bounded; let C be an upper bound. Then DP_1^σ ≥ −C(P_1^σ)^2.

This inequality leads to
A similar derivation leads to the corresponding inequality for i = 2, . . . , n. Hence the solution (P_1^σ(x), . . . , P_n^σ(x)) exists for all x > x_0. Finally, let