OPTIMAL INVESTMENT PROBLEM WITH DELAY UNDER PARTIAL INFORMATION

Abstract. In this paper, we investigate the optimal investment problem in the presence of delay under partial information. We assume that the financial market consists of one risk-free asset (bond) and one risky asset (stock), and that only the price of the risky asset can be observed from the financial market. The objective of the investor is to maximize the expected utility of the terminal wealth and the average of the path segment. Using filtering theory, we establish the separation principle and reduce the problem to the complete information case. Explicit expressions for the value function and the corresponding optimal strategy are obtained by solving the corresponding Hamilton-Jacobi-Bellman equation. Furthermore, we study the sensitivity of the optimal investment strategy to the model parameters in a numerical section, in which both the full and partial information schemes are simulated and compared.

1. Introduction. Optimal investment problems have been a topic of much interest in actuarial science. Several papers are concerned with maximizing the utility of terminal wealth. For example, Browne [4] found the optimal investment strategy for exponential utility maximization, and also for ruin probability minimization, in a diffusion risk model. Yang and Zhang [20] considered the same investment problem for a jump-diffusion risk process. We shall refer to this type of problem as the full information case, since these papers work within a framework in which all the parameters of the model can be observed. In insurance and mathematical finance, a large number of existing papers deal with the problem in this framework. It is clearly more realistic to assume that investors have only partial information: prices and interest rates are published and available to the public, but the drift and the paths of the Brownian motions are mathematical tools for model building and are certainly not observable. We shall call this situation the case of partial information. In the field of insurance, there are few papers on this topic. Bai and Guo [2] assume that the drift term of the geometric Brownian motion followed by the risky asset is driven by a stochastic differential equation, and obtain the optimal strategy for this special case.
On the other hand, since practical problems often depend on the past, we shall consider schemes with delay. This gives rise to stochastic control problems with delay. Øksendal et al. [11] investigated a controlled system with delay and jumps in the state equation and established both sufficient and necessary stochastic maximum principles. The portfolio management problem of Merton's type in which the risky asset return is related to the return history was solved by Chang et al. [5]. Zeng and Li [21] considered optimal time-consistent investment and reinsurance policies for the mean-variance problem. Shen and Zeng [17] studied the optimal investment and reinsurance problem with delay under the mean-variance criterion. A and Li [1] studied the optimal investment and excess-of-loss reinsurance problem with delay for Heston's stochastic volatility model.
The portfolio selection problem with partial information has been studied extensively in the mathematical finance literature; see, for example, Lakner [8], Björk et al. [3], Hata and Sheu [7], Xiong [19], Liang and Song [10], and Peng and Hu [15]. However, few papers study optimal control problems with both partial information and delay. Here we also mention Pang and Hussain [12,13,14] for some related work.
The aim of this paper is to find the optimal investment strategy with delay under partial information. The only information available to the investor is the past prices of the stock and the bond. This leads to an optimal control problem under partial information. It should be mentioned that both the model formulation and the technique in this paper are different from those of Peng and Hu [15]. We apply filtering theory to reformulate the problem as a completely observed one. Then, by the dynamic programming approach, we obtain explicit expressions for the value function and the optimal investment strategy. The solution of the full information case is also obtained. Furthermore, simulations are presented to compare the value functions under the two schemes. The utility in the full information case is higher than that in the partial information case.
The rest of this paper is organized as follows. We first formulate the problem in Section 2. In Section 3, we reformulate the problem into a completely observable case by using the separation principle. Closed-form expressions for the optimal strategy and the value function are obtained by solving the corresponding HJB equation in Section 4. In Section 5, the solution of the full information case is given. Finally, several numerical simulations are presented in Section 6.
2. The model. Let (Ω, F, {F_t}, P) be a probability space with filtration {F_t : t ≥ 0} satisfying the usual conditions. The terminal time is T < ∞. We consider a financial market consisting of a risk-free asset and a risky asset (e.g., stock). Specifically, the price process of the risk-free asset is given by where (r(t))_{t≥0} denotes the interest rate process, which is assumed to be nonnegative and F_t-adapted. The price process of the risky asset is given by where W(t) is a standard Brownian motion on the probability space (Ω, F, {F_t}, P), µ(t) is the appreciation rate process, and σ(t) is the volatility process.
In reality, it is natural that only the price of the risky asset can be observed, since this price is published and available to the public. Namely, G_t = σ(S(s) : s ≤ t), rather than F^W_t = σ(W(s) : s ≤ t), is the only information available to the investor at time t. In this case, neither the appreciation rate of the risky asset nor the underlying Brownian motion W can be observed by the investor. Let v(t) represent the amount invested in the risky asset at time t, and let X(t) denote the wealth process at time t. Then the dollar amount invested in the risk-free asset is X(t) − v(t). In addition, we assume that there exists no capital inflow into or outflow from the investor's current wealth. We denote by Z(t) and Y(t) the path segment and its average respectively, i.e., The dynamics of the wealth process X = (X(t))_{t≥0} can be written as As stated by A and Li [1], the function f represents the capital inflow/outflow amount, where X(t) − Z(t) accounts for the absolute performance of the wealth between t − δ and t, and X(t) − Y(t) reflects the average performance of the wealth over the time horizon [t − δ, t]. In this way, the capital inflow/outflow is related to the past performance of the wealth. A good past performance may bring more gain to the investor, who can then pay part of the gain as a dividend to stakeholders. On the contrary, a poor past performance may force the investor to inject capital to cover the loss, so that the final performance objective remains achievable. To make the problem solvable, we assume that and so the dynamics of X(t) can be rewritten as where c(t) = r(t) − a − b, and a, b are nonnegative constants.
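As a rough illustration of how the delay terms enter the wealth dynamics, the following Python sketch simulates one path with an Euler scheme. It is only a sketch under stated assumptions: the weighted average Y(t) = ∫_{−δ}^{0} e^{λs} X(t + s) ds, the lagged value Z(t) = X(t − δ), the drift cX(t) + aY(t) + bZ(t) + (µ − r)v with c = r − a − b, the diffusion σv, constant coefficients, a constant strategy v, and a flat initial history are all assumed forms in the spirit of A and Li [1]. The function name and the parameter values are hypothetical.

```python
import math
import random


def simulate_delayed_wealth(x0=1.0, v=0.5, mu=0.08, r=0.03, sigma=0.2,
                            a=0.1, b=0.05, lam=0.5, delta=1.0,
                            T=2.0, n_steps=400, seed=0):
    """Euler scheme for an ASSUMED delayed wealth equation
        dX = [c X + a Y + b Z + (mu - r) v] dt + sigma v dW,  c = r - a - b,
    with Y(t) = int_{-delta}^0 exp(lam*s) X(t+s) ds and Z(t) = X(t - delta).
    Returns the list [X(0), X(dt), ..., X(T)]."""
    rng = random.Random(seed)
    dt = T / n_steps
    lag = int(round(delta / dt))          # grid steps in one delay window
    c = r - a - b
    # Left-endpoint quadrature weights for s = -delta, -delta + dt, ..., -dt.
    weights = [math.exp(lam * (k - lag) * dt) * dt for k in range(lag)]
    # Assumed flat initial history: X(s) = x0 for s in [-delta, 0].
    path = [x0] * (lag + 1)
    for _ in range(n_steps):
        x = path[-1]                      # X(t)
        z = path[-1 - lag]                # Z(t) = X(t - delta)
        window = path[-1 - lag:-1]        # X(t - delta), ..., X(t - dt)
        y = sum(w * xs for w, xs in zip(weights, window))  # approximates Y(t)
        dw = rng.gauss(0.0, math.sqrt(dt))                 # Brownian increment
        drift = c * x + a * y + b * z + (mu - r) * v
        path.append(x + drift * dt + sigma * v * dw)
    return path[lag:]                     # drop the pre-history on [-delta, 0)


if __name__ == "__main__":
    wealth = simulate_delayed_wealth()
    print("terminal wealth on one simulated path:", wealth[-1])
```

Setting a = b = 0 switches the delay terms off, which gives a quick sanity check against the delay-free wealth equation.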
We are now in a position to introduce the optimal investment problem with delay and partial information for the investor. Suppose that the investor is concerned with both the terminal wealth X(T) and the average wealth Y(T). The investor's objective is to find an admissible strategy v(·) that maximizes the following exponential utility of the terminal wealth and the average wealth: where U (x, y) = κ − q ρ e −ρ(x+ y) with κ, q, ρ > 0. The value function for our problem is defined by
Remark 1. In the above utility function, the term −ρ(X(T ) + Y (T )) is a key quantity in much of the classical literature on stochastic optimal control problems with delay, such as Elsanousi et al. [6].
3. Separation principle. In this section, we use stochastic filtering tools to deal with the unobservable appreciation rate process of the risky asset. Based on the separation principle, we derive a representation for the wealth process. We outline the main steps here. By Itô's formula, we have Employing Lemma 5.6 in [18], the innovation process Substituting (4) into (1), we then have
Theorem 3.1. Under any admissible strategy v(t), the corresponding wealth process X(t) satisfies the following stochastic delay differential equation (SDDE):
Proof. The proof is essentially the same as that of Theorem 3.1 of [19], except that our wealth process contains the delay terms Y(t) and Z(t). Since these two terms are observable and have no relationship with µ(t) or v(t), we omit the proof here.
In order to obtain explicit solutions, we assume that in (1), σ(t) = σ and where W (t) is a Brownian motion independent of W (t). According to the Kalman-Bucy filtering theory, where γ is the solution to the following Riccati equation:
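The filtering step can be sketched numerically as follows. The sketch assumes, purely for illustration, a linear drift model dµ(t) = (α + βµ(t)) dt + σ_µ dW̃(t), observed through the log-price increments d log S(t) = (µ(t) − σ²/2) dt + σ dW(t), with W̃ independent of W. Under these assumptions the Kalman-Bucy filter produces a conditional mean m(t) and a conditional variance γ(t) that solves a scalar Riccati equation. The model form, the names `alpha`, `beta`, `sigma_mu`, and all numerical values are assumptions rather than the paper's own specification.

```python
import math
import random


def kalman_bucy_drift_filter(alpha=0.02, beta=-0.5, sigma_mu=0.1, sigma=0.2,
                             mu0=0.05, m0=0.0, gamma0=0.04,
                             T=10.0, n_steps=5000, seed=1):
    """Euler discretisation of the Kalman-Bucy filter for an ASSUMED model:
        d mu    = (alpha + beta*mu) dt + sigma_mu dW_tilde   (hidden drift)
        d log S = (mu - sigma^2/2) dt + sigma dW             (observation)
    Returns (hidden drift path, filtered mean path, variance path)."""
    rng = random.Random(seed)
    dt = T / n_steps
    sq = math.sqrt(dt)
    mu, m, gamma = mu0, m0, gamma0
    mus, ms, gammas = [mu], [m], [gamma]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sq)           # noise in the observed price
        dw_tilde = rng.gauss(0.0, sq)     # independent noise in the drift
        d_log_s = (mu - 0.5 * sigma ** 2) * dt + sigma * dw
        # Innovation: observed increment minus its filtered prediction.
        innovation = d_log_s - (m - 0.5 * sigma ** 2) * dt
        m += (alpha + beta * m) * dt + (gamma / sigma ** 2) * innovation
        # Riccati equation: gamma' = 2*beta*gamma + sigma_mu^2 - gamma^2/sigma^2.
        gamma += (2.0 * beta * gamma + sigma_mu ** 2
                  - gamma ** 2 / sigma ** 2) * dt
        mu += (alpha + beta * mu) * dt + sigma_mu * dw_tilde
        mus.append(mu)
        ms.append(m)
        gammas.append(gamma)
    return mus, ms, gammas


def stationary_gamma(beta, sigma_mu, sigma):
    """Positive root of 2*beta*g + sigma_mu^2 - g^2/sigma^2 = 0."""
    return sigma ** 2 * (beta + math.sqrt(beta ** 2 + (sigma_mu / sigma) ** 2))


if __name__ == "__main__":
    mus, ms, gammas = kalman_bucy_drift_filter()
    print("final hidden drift:", mus[-1], "filtered estimate:", ms[-1])
    print("final variance:", gammas[-1],
          "stationary value:", stationary_gamma(-0.5, 0.1, 0.2))
```

In this sketch the variance γ(t) is deterministic and does not depend on the observations, so it can be computed in advance; only the conditional mean reacts to the observed prices.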

4. HJB equation and its solution.
To solve the problem, the arguments and assumptions of [9] can be adapted to this paper with minor modifications, since the present work contains the additional process μ(t). If the value function V(t, x, y, μ) ∈ C^{1,2,1,2} and its derivatives V_t, V_x, V_y, V_xμ and V_xx are continuous on [0, T] × R^3, then V satisfies the following Hamilton-Jacobi-Bellman (HJB) equation: with boundary condition where

Thus, the HJB equation can be written as
Theorem 4.1. Let W ∈ C^{1,2,1,2} be a solution to the HJB equation (11) subject to the boundary condition (10). Then the value function V(t, x, y, μ) given by (2) coincides with W(t, x, y, μ).
Proof. The proof is standard, so we only give a sketch here.
Firstly, utilizing Theorem 4.2 in [9] with f = 0 and with inf replaced by sup, we obtain the optimality principle: for any stopping time τ, Fix (t, x, y, μ) ∈ [0, T) × R^3, τ ∈ (t, T) (deterministic), and v ∈ Π. Then Eq. (12) implies that Applying Itô's formula, we arrive at Dividing both sides by τ − t and letting τ ↓ t, we see that Conversely, for any ε > 0 and 0 ≤ t < τ ≤ T with τ − t > 0 small enough, there exists v* such that Here the notation X^{v*} indicates that X is obtained with the control v*. Applying Itô's formula, dividing by τ − t and letting τ ↓ t, we see that Combining this with (13), we see that equality holds. The rest of the proof is routine.
Finally, we construct an example in which an explicit solution to (11) can be obtained. Such examples are hard to find because of the time delay and the partial information. We will propose a numerical scheme in Section 6 for the solution.
Example 1. Suppose (a − λ) = (c + ) and b = e −λδ . Let the utility function U be given by U (x, y) = κ − q ρ e −ρ(x+ y) with κ, q, ρ > 0. Then is a solution to (11) and the optimal control is given by where and Substituting the above expression into (11), we have In the above HJB equation, the maximum is attained at v * (t) given by (16), and maximum is 1 Consequently, we deduce that the HJB equation (11) can be written as follows.
H (x + y) Recalling (a − λ) = (c + ) and b = e −λδ , we have (17). Set Thus, G satisfies the following equation: Note that (20) can then be easily verified for the functions H and G given in our example.
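The pointwise maximization over v in an HJB equation of this type is typically a concave quadratic problem. As a generic illustration (not the paper's formula (16)), suppose the terms involving v have the form A·v + (1/2)·B·v² with B < 0; here A and B are hypothetical placeholders for the coefficients, with A collecting the first-order terms in v and B playing the role of σ² times the second derivative of the value function in x. The maximizer is then v* = −A/B and the maximum is −A²/(2B), which the sketch below checks against a brute-force grid search.

```python
def maximize_concave_quadratic(A, B):
    """Maximise g(v) = A*v + 0.5*B*v**2 over real v, assuming B < 0.
    Returns (v_star, g(v_star)) from the first-order condition A + B*v = 0."""
    if B >= 0:
        raise ValueError("B must be negative for a finite maximum")
    v_star = -A / B
    return v_star, -A * A / (2.0 * B)


def grid_maximum(A, B, lo=-10.0, hi=10.0, n=200001):
    """Brute-force check: largest value of g on an evenly spaced grid."""
    step = (hi - lo) / (n - 1)
    return max(A * (lo + i * step) + 0.5 * B * (lo + i * step) ** 2
               for i in range(n))


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only for illustration.
    v_star, g_max = maximize_concave_quadratic(0.05, -0.04)
    print("closed form:", v_star, g_max)
    print("grid search:", grid_maximum(0.05, -0.04))
```

The closed form makes explicit why the sign condition matters: if B were nonnegative, the supremum over v would be infinite and no optimal control would exist.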

5. Solution of the problem under full information. In this section, we investigate the optimal investment problem under full information, in order to assess the effect of partial information in an example in the next section. The dynamics of the controlled process is given by where µ(t) can be observed. The value function V(t, x, y) is the same as (2), except that here the admissible control v is adapted to the filtration generated by W. The set of all admissible controls is denoted by Π. Then V satisfies the following Hamilton-Jacobi-Bellman (HJB) equation: with boundary condition where

SHUAIQI ZHANG, JIE XIONG AND XIN ZHANG
Thus, the HJB equation can be written as We see that if the value function V(t, x, y) ∈ C^{1,2,1} and its derivatives V_t, V_x, V_y and V_xx are continuous on [0, T] × R^2, then V satisfies the HJB equation.
Theorem 5.1. Let J ∈ C^{1,2,1} be a solution to the HJB equation (23) subject to the boundary condition (22). Then the value function V(t, x, y) coincides with J. Furthermore, let v ∈ Π such that The proof follows from the same arguments as those for Theorem 4.1, so we omit it here.
As in the last section, we solve the HJB equation (23) when the utility function is of a special form.
Example 2. Let the utility function U be given by U (x, y) = κ − q ρ e −ρ(x+ y) with κ, q, ρ > 0. Then is the solution of (23), and the optimal control is given by where and
Proof. Note that in the above HJB equation, the maximum is attained at v * (t) given by (24) and maximum is 1 It is easy to check that K(T ) = 1, L(T ) = 0, We can then verify (23) through simple calculations.
6.1. Sensitivity of α. In detail, from (7), the filtered estimate of the appreciation rate process increases as the value of α increases, as shown in Figure 1. Accordingly, the higher the return of a stock, the more money should be invested in the stock market. In other words, an increase in the appreciation rate process raises the control, as illustrated in Figure 2.
Therefore, more investment in the stock leads to higher expected returns, as shown in Figure 3.
6.2. Sensitivity of σ. Since σ is the volatility rate of the stock price, a highly volatile stock is inherently riskier. It is therefore reasonable to reduce the amount invested in the stock, as Figure 4 shows.
Consequently, a relatively stable price leads to a higher utility, which agrees with intuition. For more details, see Figure 5.
6.3. Sensitivity of β. Figure 6 shows that the appreciation rate of the stock price goes up when β rises.
As demonstrated in the sensitivity analysis of α, the higher the appreciation rate is, the more money is invested in the stock; see Figure 7.
Hence, the optimality of the strategy is reflected by the increase of the value function; see Figure 8.
6.4. Comparison of V (x) and V (x). As expected, the value function is higher in the full information case than in the partial information case; see Figure 9.