Robust equilibrium control-measure policy for a DC pension plan with state-dependent risk aversion under mean-variance criterion

In reality, when facing a defined contribution (DC) pension fund investment problem, the fund manager may not have sufficient confidence in the reference model and rather considers some similar alternative models. In this paper, we investigate the robust equilibrium control-measure policy for an ambiguity-averse and risk-averse fund manager under the mean-variance (MV) criterion. The ambiguity aversion is introduced by adopting the model uncertainty robustness framework developed by Anderson et al. The risk aversion model is state-dependent and takes a linear form of the current wealth level after contribution. Moreover, the fund manager faces stochastic labor income risk and allocates his wealth between a risk-free asset and a risky asset. We also propose two complicated ambiguity preference functions which are economically meaningful and facilitate analytical tractability. Due to the time-inconsistency of the resulting stochastic control problem, we attack it by using the game theoretical framework and the concept of subgame perfect Nash equilibrium. The extended Hamilton-Jacobi-Bellman-Isaacs (HJBI) equations and the verification theorem for our problem are established. The explicit expressions for the robust equilibrium policy and the corresponding robust equilibrium value function are derived by stochastic control techniques. In addition, we discuss two special cases of our model, which show that our results extend some existing works in the literature. Finally, some numerical experiments are conducted to demonstrate the effects of model parameters on our robust equilibrium policy.


(Communicated by Phillip Yam)

1. Introduction. Due to the demographic threat and financial market development, DC pension plans are playing an increasingly important role nowadays. Many countries have shifted their pension schemes from defined benefit (DB) plans to DC plans to ease the pressure on the public financial system, since DC plans can transfer the investment risk from the sponsor to the retiree (Poterba et al. [35]). Although the optimal investment problem for DC plans has been widely investigated from different aspects, only a few studies incorporate model uncertainty. Here, ambiguity refers to the Knightian uncertainty, which is distinct from the famous notion of risk introduced by Knight [20]. As is well known, the expected returns of risky assets are extremely difficult to forecast with adequate precision, and investors are skeptical about the reliability of historical estimates. Moreover, our lack of knowledge about the actual state process, or estimation error, inevitably introduces ambiguity into the problem. Thus, rational investors should take ambiguity aversion into account. Rather than making ad-hoc decisions about how much error is contained in the estimation, an ambiguity-averse investor (AAI) will instead consider alternative models which are close to the reference model. This methodology for handling uncertainty has recently been widely implemented in quantitative finance, for portfolio selection, asset pricing and insurance with model uncertainty or model misspecification. We review some of the prominent studies in the following.
Anderson et al. [1] introduce the concept of ambiguity aversion into the Lucas model and formulate a robust control problem for investors. Following this work, Maenhout [30, 31] innovates a "homothetic robustness" framework and investigates the effect of ambiguity on intertemporal portfolio choice in a setting with constant investment opportunities and in a setting with a mean-reverting equity risk premium, respectively. Later, a great deal of work builds on Maenhout [30] to address the implications of ambiguity for portfolio choice. Liu [27] studies the optimal investment and consumption policy for an investor under "homothetic robustness" and obtains the robust optimal policy under recursive preferences. Flor and Larsen [15] determine the optimal investment policy for an AAI with a stochastic interest rate. Munk and Rubtsov [33] introduce the stochastic interest rate and inflation into a portfolio management problem for an AAI. Liu et al. [26] study the role of ambiguity aversion in option pricing under an equilibrium model with rare-event premia. Xu et al. [47] consider a robust equilibrium pricing model under Heston's SV model. Zhang and Siu [53] investigate a reinsurance and investment problem with model uncertainty and formulate the problem as a zero-sum stochastic differential game. Lin et al. [25] and Korn et al. [21] discuss the optimal reinsurance problem and the optimal reinsurance-investment problem with model uncertainty by using a stochastic differential game approach. Yi et al. [49] and Yi et al. [50] study the robust reinsurance-investment problem for an AAI under the expected exponential utility framework and the MV criterion, respectively. Pun and Wong [36] consider the robust optimal reinsurance-investment problem with multi-scale stochastic volatility under a general concave utility function. Zeng et al. [51] extend the analysis to a reinsurance-investment problem for an AAI who faces uncertainties regarding models in the financial and insurance markets with jumps. Pun [37] establishes a general tractable framework for robust time-inconsistent stochastic control problems. Zeng et al. [52] provide a derivative-based optimal policy for an ambiguity-averse pension investor who faces not only risks from stochastic income and market return volatility but also uncertain economic conditions.
Recently, an overwhelming amount of literature has investigated time-inconsistent stochastic control problems, where the objective function contains time-inconsistent terms such that the classical Pontryagin's and Bellman's optimality principles are not applicable. The MV portfolio selection problem pioneered by Markowitz [32] is a typical example. Typical studies for DC pension plans include Yao et al. [48], Vigna [40], Guan and Liang [16], Blake et al. [7] and Liu et al. [28].
In the literature, there are basically two ways to handle the time-inconsistency. The first approach is to investigate the pre-committed problem, where the policy optimizing the initial objective function is treated as optimal over the whole investment period; see Li and Ng [24] and Zhou and Li [55]. The other popular approach is to formulate the problem under the game theoretic framework in order to derive a (time-consistent) equilibrium policy, which consistently optimizes the objective function anticipated at every time point in a manner similar to dynamic programming but uses the concept of subgame perfect equilibrium. The primitive idea of this approach can be traced back to Strotz [39]. Related papers include Basak and Chabakauri [3], Björk et al. [6], Björk et al. [5], Li and Li [23] and Wu et al. [46]. The equilibrium policy focuses on local optimality, while the pre-commitment policy emphasizes global optimality, which is criticized for time-inconsistency in efficiency (see Cui [11]) because of its strong commitment.
The risk aversion model under the MV criterion has also attracted a surge of interest in the past few years. The investor's risk aversion attitude may be time-dependent or even state-dependent, i.e., it may depend on the current realizations of state variables. Björk et al. [5] describe the risk aversion in a fractional form of the current wealth level. This model is also adopted by Li and Li [23] to study a reinsurance optimization problem. Hu et al. [18] define the risk aversion as a linear function of the current wealth level in a continuous-time setting. Wu [44] and Wang and Chen [41] adopt the same model for the risk aversion parameter as that in Björk et al. [5] and investigate the optimal policy for portfolio selection and for DC pension plans in discrete time, respectively. Cui et al. [12] and Cui et al. [13] propose a flexible behavioral risk aversion model which takes a piece-wise linear form of the surplus or shortage with respect to a preset investment target, in the continuous-time setting and the multi-period setting, respectively.
As mentioned above, the main goal of this paper is to investigate the equilibrium policy for a DC pension plan in continuous time under the MV criterion, where the fund manager's risk aversion is state-dependent and he acknowledges that the estimated model (also called the reference model) contains significant parameter uncertainty, i.e., the market (true model) may deviate from the reference model. To the best of our knowledge, this problem has not been studied yet. In this paper, we incorporate the stochastic labor income into our model and adopt a more realistic state-dependent risk aversion model which is a linear function of the current wealth level after contribution. Moreover, we define alternative models which are equivalent to the reference model in terms of probability measure. In addition, inspired by Pun [37], we propose two ambiguity preference functions which are economically meaningful and facilitate analytical tractability. We adopt a minimax formulation to construct the robust decision rule, and we use the concept of subgame perfect equilibrium to characterize a time-consistent policy for the proposed problem. With the extended dynamic programming approach derived in this paper, we characterize the robust equilibrium policy via an extended Hamilton-Jacobi-Bellman-Isaacs (HJBI) system. We derive the explicit expression of the robust equilibrium policy by solving the extended HJBI system and show that, for our specified ambiguity preference functions, the robust equilibrium policy is linear in the current wealth level and the current labor income level.
Compared to the existing literature, the main contributions of our paper are fourfold: 1) We adopt a state-dependent risk aversion function to describe the fund manager's risk attitude and incorporate the model uncertainty according to the rule of model misspecification proposed by Anderson et al. [1]. Based on these, we investigate the asset allocation problem for a DC pension plan in a continuous-time setting under the MV criterion and seek the robust equilibrium policy. This problem is still an open question. 2) We incorporate the stochastic labor income factor into our model and link it to the contribution premiums, which leads to a more complicated problem than that in Pun [37]. Moreover, our choice of the ambiguity preference functions extends the specifications of the ambiguity preference function in Maenhout [30] and Pun [37]. 3) We provide a rigorous mathematical definition for the robust equilibrium control-measure policy which extends and improves the definition in Zeng et al. [51]. A verification theorem is established to show that solving the extended HJBI system is a sufficient condition for robust optimality. Our work extends the existing analyses for DC pension plans under the MV criterion to their robust counterparts. 4) Some numerical results are provided to illustrate the effects of model parameters on the robust equilibrium policy, which sheds light on our theoretical results.
The rest of this paper is organized as follows. In Section 2, we introduce the market structure as well as the reference model and alternative models. The robust optimization problem for a DC pension plan is then proposed. Section 3 proceeds with the deduction of the extended HJBI system as well as the verification theorem for our problem. The explicit robust equilibrium policy is obtained for our choice of ambiguity preference functions. Two special cases are discussed in Section 4. In Section 5, we present some numerical examples to illustrate the effects of model parameters on the robust equilibrium policy. Finally, Section 6 concludes the paper.
2.1. The reference model. Let $(\Omega, \mathcal{F}, \{\mathcal{F}^P_t\}_{t\in[0,T]}, P)$ be a filtered complete probability space satisfying the usual conditions, where $T$ is a positive and finite constant representing the investment time horizon (retirement date); $\{\mathcal{F}^P_t\}_{t\in[0,T]}$ is generated by two standard one-dimensional $P$-Brownian motions $W^P(t)$ and $W^P_0(t)$, where $W^P_0(t)$ is independent of $W^P(t)$; $\mathcal{F}^P_t$ denotes the information available until time $t$, and $P$ is the reference measure. Hereafter, we suppose that all stochastic processes are well-defined and adapted to this probability space, and that all functions are measurable and uniformly bounded on $[0,T]$. In addition, we assume that there are no transaction costs or taxes in the market and that trading takes place continuously.
The reference model will be defined over the reference measure $P$. We consider an asset allocation problem in the accumulation phase of a DC pension fund. The financial market under consideration consists of a risk-free asset and a risky asset. Denote the prices of the risk-free asset and the risky asset by $B(t)$ and $S(t)$, respectively, whose dynamics under the reference measure $P$ are specified as follows:
$$dB(t) = r(t)B(t)\,dt, \tag{1}$$
$$dS(t) = S(t)\big[\mu(t)\,dt + \sigma(t)\,dW^P(t)\big], \tag{2}$$
where $r(t) > 0$ is the risk-free interest rate at time $t$, and $\mu(t)\ (> r(t))$ and $\sigma(t)$ are, respectively, the appreciation rate and the volatility rate of the risky asset at time $t$. We assume that $r(t)$, $\mu(t)$ and $\sigma(t)$ are deterministic functions of $t$. For the sake of simplicity, we consider only one risky asset in our model, which can be interpreted as a stock market index.
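Under constant coefficients, both assets admit closed-form prices, which a short simulation confirms; the values r = 0.03, mu = 0.08 and sigma = 0.2 below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Reference-model asset dynamics with constant coefficients
# (hypothetical illustrative values):
#   dB(t) = r B(t) dt  and  dS(t) = S(t) [mu dt + sigma dW^P(t)].
r, mu, sigma, T = 0.03, 0.08, 0.20, 1.0
B0, S0, n_paths = 1.0, 1.0, 200_000

rng = np.random.default_rng(0)
W_T = np.sqrt(T) * rng.standard_normal(n_paths)  # W^P(T) ~ N(0, T)

B_T = B0 * np.exp(r * T)                                    # riskless bond price
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)  # exact GBM solution

print(B_T)         # exp(0.03), deterministic
print(S_T.mean())  # ≈ S0 * exp(mu * T)
```

The sample mean of $S(T)$ recovers $S_0 e^{\mu T}$ up to Monte Carlo error, reflecting that the appreciation rate $\mu$ governs the expected growth of the risky asset.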
In a DC pension plan, the plan member continuously contributes part of his labor income to the pension account before retirement. Upon retirement, he can convert his fund into an annuity. We assume that the plan member receives a continuous stream of non-negative labor income over time. Similar to Battocchio and Menoncin [4] and Cairns et al. [9], let $L(t)$ be the labor income at time $t$, which satisfies the following stochastic differential equation (SDE):
$$dL(t) = L(t)\big[\alpha(t)\,dt + \varphi(t)\,dW^P(t) + \beta(t)\,dW^P_0(t)\big], \tag{3}$$
where $\alpha(t) > 0$ is the appreciation rate of the labor income at time $t$, $\varphi(t)$ is the volatility rate of the labor income at time $t$, measuring how the financial market risk source $W^P(t)$ affects the labor income, while $\beta(t)$ is a non-hedgeable volatility whose risk source is not related to the financial market. This non-hedgeable risk source is represented by a one-dimensional standard Brownian motion $W^P_0(t)$, which is assumed to be independent of $W^P(t)$.

Remark 1. In standard financial economics, one often speaks of an endowment, or equivalently an income stream, to describe the concept of labor income introduced here. The reason for adopting this concept is to keep in line with the relevant literature in insurance. The labor income (or salary in some papers) plays an important role in pension plans and has been analyzed in many studies. Note that in equation (3), the labor income is influenced by risk sources from both the financial and non-financial markets. This assumption is consistent with the studies in Battocchio and Menoncin [4] and Cairns et al. [9], but differs from the studies in Deelstra et al. [14], Bodie et al. [8] and Chen et al. [10], which all assume that the labor income is influenced only by the risk from the financial market.
Suppose that the plan member continuously contributes a constant proportion, $c \in [0, 1]$, of his labor income to the pension plan at any time point. Denote by $u(t)$ the dollar amount invested in the risky asset at time $t$ and by $X^u(t)$ the wealth level of the pension fund after contribution under control $u(t)$. Let $O := [0, T] \times \mathbb{R} \times \mathbb{R}^+$. In this paper, we restrict ourselves to feedback controls, i.e., controls of the form $u(t) = u\big(t, X^u(t), L(t)\big)$, where the mapping $u: O \to \mathbb{R}$ is measurable. Then, the wealth process under the probability measure $P$ follows
$$dX^u(t) = \big[r(t)X^u(t) + \theta(t)u(t) + cL(t)\big]\,dt + \sigma(t)u(t)\,dW^P(t), \tag{4}$$
where $\theta(t) = \mu(t) - r(t)$ is the excess return rate at time $t$.
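As a consistency check on the wealth dynamics, consider the deterministic special case $u \equiv 0$ with a riskless income $L(t) = L_0 e^{\alpha t}$: the wealth equation reduces to a linear ODE with a closed-form solution, which an Euler scheme should reproduce. All parameter values below are illustrative assumptions:

```python
import numpy as np

# Wealth dynamics sanity check in the deterministic special case u = 0:
#   dX(t) = [r X(t) + c L(t)] dt,  with L(t) = L0 * exp(alpha * t),
# compared against the closed-form ODE solution. Values are illustrative.
r, alpha, c, T, n = 0.03, 0.02, 0.10, 1.0, 100_000
X0, L0 = 1.0, 0.5
dt = T / n

X = X0
for k in range(n):                 # explicit Euler scheme
    L = L0 * np.exp(alpha * k * dt)
    X += (r * X + c * L) * dt      # u = 0, so no diffusion term

# Closed form: X(T) = e^{rT} X0 + c L0 (e^{alpha T} - e^{rT}) / (alpha - r)
X_exact = np.exp(r * T) * X0 + c * L0 * (np.exp(alpha * T) - np.exp(r * T)) / (alpha - r)
print(abs(X - X_exact))  # small O(dt) discretization error
```

The Euler error shrinks linearly in the step size, so refining the grid brings the simulated terminal wealth to the ODE solution.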
Remark 2. The contribution term $cL(t)dt$ in equation (4) is the same as that in Cairns et al. [9], whereas Battocchio and Menoncin [4] adopt $c\,dL(t)$. The most significant difference between them is whether the contribution is proportional to the labor income itself or to the changes of the labor income. Ma [29] points out that the assumption of Battocchio and Menoncin [4] is incorrect and that the correct expression should be $cL(t)dt$, because $c\,dL(t)$ implies that the change of the labor income $dL(t)$ (rather than the labor income $L(t)$) contributes to the growth of the pension wealth. If this were true, a contribution from a constant labor income would not increase the pension wealth.
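Ma's observation in Remark 2 can be made concrete with a two-line computation: for a constant labor income $L(t) \equiv L_0$, the stream $cL(t)dt$ accumulates $cL_0T$ in contributions, while $c\,dL(t)$ accumulates nothing (the numbers are illustrative):

```python
# Remark 2 made concrete: constant labor income L(t) = L0 over [0, T].
c, L0, T, n = 0.10, 2.0, 10.0, 1_000
dt = T / n

# Contribution stream c * L(t) dt accumulates c * L0 * T:
contrib_cLdt = sum(c * L0 * dt for _ in range(n))
# Contribution c * dL(t) accumulates nothing, since dL(t) = 0:
contrib_cdL = sum(c * 0.0 for _ in range(n))

print(contrib_cLdt)  # ≈ c * L0 * T
print(contrib_cdL)   # 0.0
```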

2.2. Robust stochastic control problem. The above reference model corresponds to the traditional framework for an ambiguity-neutral fund manager (ANFM) whose preliminary knowledge of ambiguity is captured by the probability measure $P$. To incorporate the model uncertainty into our problem for an ambiguity-averse fund manager (AAFM), we assume that the fund manager does not know precisely whether the reference model is the true model and prefers to consider some alternative models in the decision process. Loosely speaking, robustness is achieved by hedging against all adverse alternative models that are reasonably "similar" to the reference one. Mathematically, we adopt the concept of measure equivalence to characterize the model "similarity". Parallel to Anderson et al. [2] and Maenhout [30], the alternative models are defined by a class of probability measures equivalent to $P$:
$$\mathcal{Q} := \{Q \mid Q \sim P\}.$$
According to the Girsanov theorem (see Karatzas and Shreve [19]), for each $Q \in \mathcal{Q}$, there exists a pair of progressively measurable stochastic processes $q(t) = \big(q_1(t), q_2(t)\big)$, i.e. $q(t) = q(t, x, l, u)$, satisfying the Novikov condition
$$E^P\left[\exp\left\{\frac{1}{2}\int_0^T \big(q_1^2(t) + q_2^2(t)\big)\,dt\right\}\right] < \infty, \tag{5}$$
such that for $t \in [0, T]$, we have the following Radon-Nikodym derivative:
$$\Lambda(t) := \left.\frac{dQ}{dP}\right|_{\mathcal{F}^P_t} = \exp\left\{\int_0^t q_1(s)\,dW^P(s) + \int_0^t q_2(s)\,dW^P_0(s) - \frac{1}{2}\int_0^t \big(q_1^2(s) + q_2^2(s)\big)\,ds\right\}. \tag{6}$$
Then $\{\Lambda(t)\}_{t\in[0,T]}$ is a positive $P$-martingale, and under the alternative measure $Q$,
$$dW^Q(t) = dW^P(t) - q_1(t)\,dt, \qquad dW^Q_0(t) = dW^P_0(t) - q_2(t)\,dt, \tag{7}$$
where $W^Q(t)$ and $W^Q_0(t)$ are two one-dimensional standard Brownian motions. For each $Q \in \mathcal{Q}$, we can rewrite the SDEs of the labor income process in (3) and the wealth process in (4) under the measure $Q$ as
$$dL(t) = L(t)\big[\big(\alpha(t) + \varphi(t)q_1(t) + \beta(t)q_2(t)\big)\,dt + \varphi(t)\,dW^Q(t) + \beta(t)\,dW^Q_0(t)\big], \tag{8}$$
$$dX^u(t) = \big[r(t)X^u(t) + \theta(t)u(t) + cL(t) + \sigma(t)u(t)q_1(t)\big]\,dt + \sigma(t)u(t)\,dW^Q(t). \tag{9}$$

Remark 3. We notice that the labor income process and the wealth process under the alternative model differ from (3) and (4) only in their drift terms, as they should.
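The effect of the measure change can be seen in a small Monte Carlo experiment: for a constant distortion $q_1$, the density $\Lambda(T) = \exp(q_1 W^P(T) - q_1^2 T/2)$ has $P$-expectation one, and reweighting by $\Lambda(T)$ shifts the mean of $W^P(T)$ from $0$ to $q_1 T$, a standard Girsanov computation. The value $q_1 = 0.5$ is an arbitrary illustration:

```python
import numpy as np

# Measure change via the Radon-Nikodym density for a constant q1:
#   Lambda(T) = exp(q1 * W^P(T) - 0.5 * q1**2 * T).
# E^P[Lambda(T)] = 1, and E^P[Lambda(T) * W^P(T)] = q1 * T, i.e. under Q
# the process W^P gains drift q1 (so W^P(t) - q1 * t is a Q-Brownian motion).
q1, T, n_paths = 0.5, 1.0, 1_000_000
rng = np.random.default_rng(42)
W_T = np.sqrt(T) * rng.standard_normal(n_paths)

Lam = np.exp(q1 * W_T - 0.5 * q1**2 * T)
print(Lam.mean())          # ≈ 1.0 (P-martingale property)
print((Lam * W_T).mean())  # ≈ q1 * T (mean of W^P(T) under Q)
```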
We can see that the dissimilarity between the alternative model $Q$ and the reference model $P$ is characterized by $q(t)$. In other words, choosing the measure $Q$ is equivalent to determining the pair of stochastic processes $q(t)$ for a given $P$. Noting that the measure $Q$ is fully characterized by $\{q(t)\}_{t\in[0,T]}$ and the reference model, we denote by $(u, q) = \big(\{u(t, x, l)\}_{t\in[0,T]}, \{q(t, x, l, u)\}_{t\in[0,T]}\big)$ the control-measure policy, a convention also adopted by Pun [37]. Let $\mathcal{U} \times \mathcal{Q}$ be the set of admissible control-measure policies (the definition of the admissible control-measure policy will be given below).
Inspired by Maenhout [30] and Pun [37], an AAFM seeks a robust optimal policy that minimizes the MV cost function under the worst scenario among $\mathcal{Q}$ for every $(t, x, l) \in O$. The robust stochastic control problem can then be formulated as follows:
$$V(t,x,l) = \inf_{u\in\mathcal{U}}\sup_{Q\in\mathcal{Q}}\left\{\frac{\gamma(x)}{2}\mathrm{Var}^Q_{t,x,l}\big[X^u(T)\big] - E^Q_{t,x,l}\big[X^u(T)\big] - D_{t,x,l}(Q\|P)\right\}, \tag{10}$$
where $E^Q_{t,x,l}[\cdot]$ and $\mathrm{Var}^Q_{t,x,l}[\cdot]$ are the expectation and variance conditional on the event $\{X^u(t) = x, L(t) = l\}$ under the measure $Q$, respectively; $\gamma(x) = 2\gamma x$ is the fund manager's state-dependent risk aversion function, and $\gamma > 0$ is the risk aversion coefficient. The intuition behind this risk aversion function is clear: the larger the current wealth level after contribution, the less risk-averse the fund manager. Similar to the discussion in Wang and Chen [42], our choice of the risk aversion function is reasonable from both a dimensional and an economic point of view. $D_{t,x,l}(Q\|P) \ge 0$ is the generalized Kullback-Leibler (KL) divergence between $Q$ and $P$, which is introduced to regularize the choices of $Q$. In this paper, we consider $D_{t,x,l}(Q\|P)$ of the form
$$D_{t,x,l}(Q\|P) = E^Q_{t,x,l}\left[\int_t^T\left(\frac{q_1^2(s)}{2\xi\,\Phi_1(t,x,l)} + \frac{q_2^2(s)}{2\xi\,\Phi_2(t,x,l)}\right)ds\right], \tag{11}$$
where $\xi > 0$ is an aggregate measure of the ambiguity aversion, and the deterministic functions $\Phi_1 \ge 0$, $\Phi_2 \ge 0$ stand for the fund manager's preference on ambiguity aversion, measuring the degree of confidence in the reference model. Similar to Pun [37], we immediately have the following remarks.
Remark 4. (i) $D_{t,x,l}(Q\|P)$ acts as the penalty for the model choice, and the magnitude of the penalty depends on the preference functions $\Phi_i(\cdot,\cdot,\cdot)$ $(i = 1, 2)$. The larger the preference functions are, the less the deviations from the reference model are penalized.
(ii) When $\Phi_1 \equiv \Phi_2 \equiv 1$, the penalty reduces to $D_{t,x,l}(Q\|P) = \frac{1}{\xi}D_{KL}(Q\|P)$, where $D_{KL}(Q\|P)$ is the KL divergence between $Q$ and $P$. If $\xi \downarrow 0$, the fund manager is completely convinced that the true model is the reference model, and the robust stochastic control problem is reduced to an ordinary control problem where no model uncertainty is considered, as $D_{KL}(Q\|P) \equiv 0$. If $\xi \uparrow \infty$, the term $D_{KL}(Q\|P)$ vanishes and all candidate measures are penalized identically. The scenario degenerates to the case studied by Zhang and Siu [53].
(iii) Pun [37] characterizes the sources of uncertainty with a more general state-dependent ambiguity aversion, while we simply assume that the ambiguity aversion level $\xi$ is a constant for analytical tractability. It is particularly important to note that Pun [37] only provides a general mathematical framework under state-dependent ambiguity aversion. For the concrete illustrative example, i.e., the robust MV portfolio selection problem, the ambiguity aversion is also assumed to be a constant to facilitate analysis.
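For a constant drift distortion $q_1$ (and $q_2 = 0$) over $[0, T]$, the KL divergence admits the closed form $D_{KL}(Q\|P) = E^Q[\log(dQ/dP)] = q_1^2 T/2$, a standard computation that a simulation can confirm ($q_1 = 0.5$ and $T = 1$ are arbitrary illustrative values):

```python
import numpy as np

# KL divergence for a constant drift distortion q1 over [0, T]:
#   D_KL(Q||P) = E^Q[log(dQ/dP)] = 0.5 * q1**2 * T.
# Sample W^P(T) directly under Q, where it is N(q1*T, T) distributed.
q1, T, n_paths = 0.5, 1.0, 1_000_000
rng = np.random.default_rng(7)
W_T_under_Q = q1 * T + np.sqrt(T) * rng.standard_normal(n_paths)

log_density = q1 * W_T_under_Q - 0.5 * q1**2 * T  # log Lambda(T)
print(log_density.mean())                          # ≈ 0.5 * q1**2 * T
```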
Remark 5. The AAFM attains his robustness by considering a worst-case scenario Q * ∈ Q and a control u * to hedge against the worst-case scenario. These worst-case options are determined in the following way: for every fixed admissible control u, we propose a measure Q * (u) ∈ Q which provides the biggest possible penalty for the model choice (the worst-case model when u is fixed); note that this defines Q * (·) as a mapping from the set of admissible controls into Q. Then, we minimize the resulting worst-case penalized cost function over all admissible controls. This gives us the minimized worst-case value function attained at a specific optimal control u * . Finally, we say that our worst-case model Q * is the one corresponding to u * , namely, Q * = Q * (u * ).
We rewrite the objective function as
$$J(t,x,l;u,q) = E^Q_{t,x,l}\big[F\big(x, X^u(T)\big)\big] + G\big(x, E^Q_{t,x,l}\big[X^u(T)\big]\big) - D_{t,x,l}(Q\|P), \tag{12}$$
where
$$F(x,y) = \frac{\gamma(x)}{2}y^2 - y, \qquad G(x,y) = -\frac{\gamma(x)}{2}y^2.$$
Now we give the definition of the admissible control-measure policy.
(ii) $q$ satisfies the Novikov condition (5);
(iii) for each initial point $(t, x, l) \in O$, the SDE (9) admits a unique strong solution.

3. Robust equilibrium policy. In this section, we proceed with the solution of Problem (10) by reformulating it in the game theoretic framework. We aim to derive the robust equilibrium policy and the corresponding robust equilibrium value function.
3.1. Definition of robust equilibrium policy. To deal with time-inconsistent control problems, following Björk et al. [6], we reformulate our problem in the game theoretic framework to characterize the robust equilibrium policy. More precisely, we view it as a non-cooperative game in which every time point $t \in [0, T]$ is regarded as a player, player $t$. The rule is that player $t$ can only choose the control $u(t, x, l)$ and simultaneously plays a Stackelberg game (in economics, the leader firm determines its quantity first and then the follower firms determine their quantities sequentially based on the leader firm's quantity) against the "nature" control $q\big(t, x, l, u(t, x, l)\big)$ (i.e., $u$ is the "leader" and $q$ is the "follower"), which alters the probability measure. Under the condition that player $t$ already knows that the robust policy of player $s$ is $\big(u^*\big(s, X^{u^*}(s), L(s)\big),\, q^*\big(s, X^{u^*}(s), L(s), u^*(s, X^{u^*}(s), L(s))\big)\big)$ for all $s \in (t, T]$, the robust equilibrium policy for player $t$ is then $\big(u^*(t, x, l),\, q^*(t, x, l, u^*(t, x, l))\big)$. Note that at time $t$, player $t$ solves a robust control problem on a time set of Lebesgue measure zero, so the control has no impact. Therefore, for each time point $t$, we instead investigate a robust stochastic control problem over $[t, t+\epsilon]$ given that the players $s \in [t+\epsilon, T]$ have chosen robust optimal controls, where $\epsilon > 0$ is the minimal time elapse. We give the formal definition of the equilibrium policy as follows.
Definition 3.1. Given an admissible control-measure policy $(u^*, q^*)$, fix an arbitrary initial point $(t, x, l) \in O$, choose an arbitrary admissible control-measure policy $(u, q)$ and a real number $\epsilon > 0$, and define an $\epsilon$-policy $(u_\epsilon, q_\epsilon)$ by
$$u_\epsilon(s, y, z) = u(s, y, z)\,\mathbf{1}_{\{t \le s < t+\epsilon\}} + u^*(s, y, z)\,\mathbf{1}_{\{t+\epsilon \le s \le T\}},$$
$$q_\epsilon(s, y, z, v) = q(s, y, z, v)\,\mathbf{1}_{\{t \le s < t+\epsilon\}} + q^*(s, y, z, v)\,\mathbf{1}_{\{t+\epsilon \le s \le T\}},$$
where $\mathbf{1}_A$ is the indicator function of a subset $A$.
If, for all admissible $(u, q)$ and all $(t, x, l) \in O$,
$$\liminf_{\epsilon \downarrow 0} \frac{J(t,x,l; u_\epsilon, q^*) - J(t,x,l; u^*, q^*)}{\epsilon} \ge 0 \quad \text{and} \quad \limsup_{\epsilon \downarrow 0} \frac{J(t,x,l; u^*, q_\epsilon) - J(t,x,l; u^*, q^*)}{\epsilon} \le 0,$$
then $(u^*, q^*)$ is called an equilibrium control-measure policy. The equilibrium value function is defined by
$$V(t,x,l) := J\big(t,x,l; u^*, q^*\big).$$
Remark 6. (i) Definition 3.1 clarifies the time-consistency in terms of both the control and the measure, since the supremum problem with respect to the measure also requires a time-consistent treatment on $\mathcal{Q}$. As Pun [37] points out, a similar robust equilibrium problem was studied in Zeng et al. [51]. However, they identify the worst-case measure as a prior step and regard the worst-case objective as an ordinary objective. That is, the time-consistency in their work is only with respect to the control.
(ii) The inequalities in Definition 3.1 keep the order "inf sup" according to the robust rule suggested in Hansen et al. [17].
(iii) Definition 3.1 shows that if the robust policy at time $s \in (t, T]$ is $\big(u^*\big(s, X^{u^*}(s), L(s)\big), q^*\big(s, X^{u^*}(s), L(s), u^*(s, X^{u^*}(s), L(s))\big)\big)$, then the robust optimal policy at time $t$ is $\big(u^*(t, x, l), q^*(t, x, l, u^*(t, x, l))\big)$. In other words, the robust optimal policy viewed at time $t$ remains robust optimal when viewed at time $s$. This is exactly the meaning of time-consistency. Thus, by Definition 3.1, the equilibrium control-measure policy is time-consistent and robust optimal in the sense of (10). The equilibrium control-measure policy is therefore termed the robust equilibrium policy, and the corresponding equilibrium value function is termed the robust equilibrium value function. In the following, we establish the robust equilibrium policy and its corresponding robust equilibrium value function in accordance with Definition 3.1.
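The $\epsilon$-policy in Definition 3.1 (play a deviation on $[t, t+\epsilon)$ and revert to the candidate equilibrium afterwards) can be written as a small helper function; the concrete policies below are hypothetical placeholders used only for illustration:

```python
# The epsilon-policy of Definition 3.1: deviate to u_dev on [t, t+eps),
# then revert to the candidate equilibrium u_star on [t+eps, T].
def make_eps_policy(u_dev, u_star, t, eps):
    def u_eps(s, x, l):
        return u_dev(s, x, l) if t <= s < t + eps else u_star(s, x, l)
    return u_eps

# Hypothetical feedback policies, for illustration only:
u_star = lambda s, x, l: 0.5 * x   # candidate equilibrium control
u_dev = lambda s, x, l: 0.0        # an arbitrary admissible deviation

u_eps = make_eps_policy(u_dev, u_star, t=0.0, eps=0.1)
print(u_eps(0.05, 2.0, 1.0))  # deviation active on [t, t+eps)
print(u_eps(0.50, 2.0, 1.0))  # back to u_star afterwards
```

Sending eps to zero, as in the liminf/limsup conditions, measures the first-order effect of an instantaneous deviation from the candidate equilibrium.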

3.2. The recursion for the robust equilibrium value function. We introduce two functions which will play a central role in later derivations:
$$f^{u,q}(t,x,l,y) = E^Q_{t,x,l}\big[F\big(y, X^u(T)\big)\big], \tag{13}$$
$$g^{u,q}(t,x,l) = E^Q_{t,x,l}\big[X^u(T)\big]. \tag{14}$$
To obtain a robust equilibrium policy, we start by establishing a recursive formula for the robust equilibrium value function $V$. To this end, we first derive the recursive equation for $J(t, x, l; u, q)$ in (12); the result is presented in the following lemma.

Lemma 3.3. For $s > t$ and any admissible control-measure policy $(u, q) \in \mathcal{U} \times \mathcal{Q}$, we have
$$J(t,x,l;u,q) = E^Q_{t,x,l}\big[J\big(s, X^u(s), L(s); u, q\big)\big] - I^{u,q}, \tag{15}$$
where $I^{u,q} = I^{u,q}(t, x, l, s)$ is given by
$$\begin{aligned} I^{u,q} ={}& E^Q_{t,x,l}\big[f^{u,q}\big(s, X^u(s), L(s), X^u(s)\big) - f^{u,q}\big(s, X^u(s), L(s), x\big)\big] \\ &+ E^Q_{t,x,l}\big[G\big(X^u(s), g^{u,q}\big(s, X^u(s), L(s)\big)\big)\big] - G\big(x,\, E^Q_{t,x,l}\big[g^{u,q}\big(s, X^u(s), L(s)\big)\big]\big). \tag{16} \end{aligned}$$
Proof. See Appendix A.
Remark 7. Note that the sources of time-inconsistency in (12) come from two aspects. First, the current state $x$ appears in the functions $F$ and $G$, and the penalty term $D_{t,x,l}$ depends on $x$ and $l$. As a consequence, the objective function changes as the state variables vary. Second, in the term $G\big(x, E^Q_{t,x,l}[X^u(T)]\big)$ we have, even setting aside the appearance of $x$, a nonlinear function $G$ acting on the conditional expectation. Because of these features, the term $I^{u,q}$ is nonzero in general; it quantifies the fund manager's incentive to deviate during the period $[t, s]$ from the policy that is optimal at time $t$, which leads to the violation of Bellman's principle of optimality. Another point that deserves special attention is that, as mentioned in the third point of Remark 4, the ambiguity aversion in Pun [37] is state-dependent, which leads to an additional term in $I^{u,q}$, namely the term $L^{u,q}C$ in equation (8) of Pun [37] related to the state-dependent ambiguity aversion. However, that term is an infinitesimal and vanishes in the HJBI equation under his setting.
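The mean-variance part of the objective can be decomposed with $F(x, y) = \frac{\gamma(x)}{2}y^2 - y$ and $G(x, y) = -\frac{\gamma(x)}{2}y^2$, a standard choice in the framework of Björk et al. [6]; we present these forms as an assumption about the paper's definitions. The identity $\frac{\gamma(x)}{2}\mathrm{Var}[X] - E[X] = E[F(x, X)] + G(x, E[X])$ is pure algebra and can be checked numerically with $\gamma(x) = 2\gamma x$:

```python
import numpy as np

# Algebraic identity behind the MV objective rewriting (assumed forms):
#   (gamma(x)/2) * Var[X] - E[X] = E[F(x, X)] + G(x, E[X]),
# with F(x, y) = (gamma(x)/2) y^2 - y and G(x, y) = -(gamma(x)/2) y^2.
gamma_coef = 0.8                       # risk aversion coefficient gamma
x = 1.5                                # current wealth after contribution
gamma_x = 2.0 * gamma_coef * x         # state-dependent gamma(x) = 2*gamma*x

rng = np.random.default_rng(1)
X = rng.normal(1.0, 0.3, size=10_000)  # sample of terminal wealth values

lhs = 0.5 * gamma_x * X.var() - X.mean()
rhs = (0.5 * gamma_x * X**2 - X).mean() - 0.5 * gamma_x * X.mean() ** 2
print(abs(lhs - rhs))                  # 0 up to floating-point error
```

The nonlinear dependence of the $G$-term on the conditional expectation is precisely what breaks the tower property and hence Bellman's principle.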
The recursion for the robust equilibrium value function $V$ depends on the recursive equation of $J$ in (15) and the equilibrium concept in Definition 3.1. Suppose that the robust equilibrium policy $(u^*, q^*)$ exists and the corresponding robust equilibrium value function at time $t$ is $V(t, x, l)$. For any initial point $(t, x, l) \in O$ and a real number $\epsilon > 0$, based on Definition 3.1, we consider a robust optimal control problem over $[t, t+\epsilon]$ where the policies over $[t+\epsilon, T]$ are fixed as $u^*$ and $q^*$, and the policies over $[t, t+\epsilon)$ are arbitrary, written as $u$ and $q$. Then, in view of Definitions 3.1 and 3.2, equation (15) implies the following expression for the robust equilibrium value function:
$$V(t,x,l) = \inf_{u}\sup_{q}\Big\{E^Q_{t,x,l}\big[V\big(t+\epsilon, X^{u_\epsilon}(t+\epsilon), L(t+\epsilon)\big)\big] - I^{u_\epsilon,q_\epsilon}\Big\}, \tag{17}$$
with $I^{u_\epsilon,q_\epsilon} = I^{u_\epsilon,q_\epsilon}(t, x, l, t+\epsilon)$; here, $g(\cdot,\cdot,\cdot) = g^{u^*,q^*}(\cdot,\cdot,\cdot)$ and $f(\cdot,\cdot,\cdot,\cdot) = f^{u^*,q^*}(\cdot,\cdot,\cdot,\cdot)$ are given in (13) and (14), respectively.
Note that $I^{u_\epsilon,q_\epsilon} \equiv 0$ when the sources of time-inconsistency in (12) do not exist. Equation (17) without the term $I^{u_\epsilon,q_\epsilon}$ is similar to the dynamic programming equation. For this reason, we call (17) the extended dynamic programming equation (EDPE).
3.3. Extended HJBI system and verification theorem. We now present the extended HJBI system of equations and the verification theorem for our robust stochastic control problem. Before this, we first introduce some notation for convenience. Let
$$C^{1,2,2}(O) = \big\{\psi(t,x,l) \,\big|\, \psi(\cdot,x,l) \text{ is once continuously differentiable on } [0,T] \text{ and } \psi(t,\cdot,\cdot) \text{ is twice continuously differentiable on } \mathbb{R}\times\mathbb{R}^+\big\}.$$
For any test function $\phi(t, x, l) \in C^{1,2,2}(O)$ and any admissible policy $(u, q) \in \mathcal{U} \times \mathcal{Q}$, the controlled infinitesimal operator $A^{u,q}$ is given by
$$\begin{aligned} A^{u,q}\phi(t,x,l) ={}& \phi_t + \big[r(t)x + \theta(t)u + cl + \sigma(t)u q_1\big]\phi_x + l\big[\alpha(t) + \varphi(t)q_1 + \beta(t)q_2\big]\phi_l \\ &+ \tfrac{1}{2}\sigma^2(t)u^2\phi_{xx} + \tfrac{1}{2}l^2\big(\varphi^2(t) + \beta^2(t)\big)\phi_{ll} + \sigma(t)u\,l\,\varphi(t)\,\phi_{xl}. \end{aligned}$$
Following the derivations in Björk et al. [6] and Pun [38], using the EDPE (17), Definition 3.1 and the infinitesimal operator, we can obtain the following extended HJBI system of equations and verification theorem.
Theorem 3.4 (Verification theorem). Assume that for all $y \in \mathbb{R}$, there exist functions $V, g \in C^{1,2,2}(O)$ and $f \in C^{1,2,2,2}(O \times \mathbb{R})$ satisfying the following extended HJBI system of equations:
(1) The function $V(t, x, l)$ is determined by the HJBI equation
$$\inf_{u}\sup_{q}\Big\{A^{u,q}V(t,x,l) - A^{u,q}f(t,x,l,x) + A^{u,q}f(t,x,l,y)\big|_{y=x} - A^{u,q}G\big(x, g(t,x,l)\big) + G_y\big(x, g(t,x,l)\big)A^{u,q}g(t,x,l) - \frac{q_1^2}{2\xi\Phi_1(t,x,l)} - \frac{q_2^2}{2\xi\Phi_2(t,x,l)}\Big\} = 0, \tag{20}$$
with terminal condition $V(T, x, l) = F(x, x) + G(x, x)$.
(2) For each fixed $y \in \mathbb{R}$, the function $f(t, x, l, y)$ is defined by
$$A^{u^*,q^*}f(t,x,l,y) = 0, \qquad f(T,x,l,y) = F(y,x). \tag{21}$$
(3) The function $g(t, x, l)$ is defined by
$$A^{u^*,q^*}g(t,x,l) = 0, \qquad g(T,x,l) = x. \tag{22}$$
Moreover, assume that there is an admissible policy $(u^*, q^*)$ which realizes the infimum and the supremum in equation (20) for all $(t, x, l) \in O$. Then $(u^*, q^*)$ is a robust equilibrium policy and $V$ is the corresponding robust equilibrium value function. Furthermore, $f$ and $g$ have the probabilistic interpretations in (13) and (14).
Proof. This theorem can be proved in a similar way as that in Theorem 5.2 of Björk et al. [6] and Theorem 5 of Pun [38], thus we omit it.
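As a sanity check on the controlled infinitesimal operator, one can assemble $A^{u,q}$ symbolically from the Itô drift, variance and covariation rates of the state dynamics under $Q$; the drift-distorted dynamics coded below are our reconstruction and should be read as an assumption rather than the paper's exact equations:

```python
import sympy as sp

# Assemble the controlled infinitesimal operator A^{u,q} symbolically from
# the (reconstructed, hence assumed) Q-dynamics of wealth X and income L:
#   drift_X = r*x + theta*u + c*l + sigma*u*q1,   vol_X  = sigma*u
#   drift_L = l*(alpha + varphi*q1 + beta*q2),    vols_L = (l*varphi, l*beta)
t, x, l = sp.symbols('t x l')
r, theta, c, sigma, u, q1, q2, alpha, varphi, beta = sp.symbols(
    'r theta c sigma u q1 q2 alpha varphi beta')

drift_X = r * x + theta * u + c * l + sigma * u * q1
drift_L = l * (alpha + varphi * q1 + beta * q2)
var_X = (sigma * u) ** 2                # quadratic variation rate of X
var_L = l**2 * (varphi**2 + beta**2)    # quadratic variation rate of L
cov_XL = sigma * u * l * varphi         # covariation rate (shared dW^Q)

def A(psi):
    """Apply the generator A^{u,q} to a smooth test function psi(t, x, l)."""
    return (sp.diff(psi, t) + drift_X * sp.diff(psi, x) + drift_L * sp.diff(psi, l)
            + sp.Rational(1, 2) * var_X * sp.diff(psi, x, 2)
            + sp.Rational(1, 2) * var_L * sp.diff(psi, l, 2)
            + cov_XL * sp.diff(psi, x, l))

# For psi = x*l, Ito's formula gives drift_X*l + drift_L*x + cov_XL exactly.
result = sp.expand(A(x * l))
expected = sp.expand(drift_X * l + drift_L * x + cov_XL)
print(sp.simplify(result - expected))  # 0
```

Such a symbolic helper is convenient for verifying by hand the algebra that arises when substituting candidate value functions into the extended HJBI system.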
Remark 8. (i) The above theorem shows that once we solve the extended HJBI system, we have solved our problem. Note that we have a system of equations (20)-(22) determining the functions $V$, $f$ and $g$ simultaneously. To solve for $V$, we need to know $f$ and $g$; however, they are determined by the robust equilibrium policy $(u^*, q^*)$, which in turn is determined by equation (20).
(ii) It is noteworthy that the specification of the ambiguity preference functions Φ1(t, x, l) and Φ2(t, x, l) determines the solvability of the extended HJBI system.

Remark 9. (i) Note that the function space in Björk et al. [6] is L². From their derivations, we see that the integrability condition is used to eliminate the stochastic integral part when computing the expectation of the value function (i.e., the formula below equation (5.6) in Björk et al. [6]). However, we do not need to confine ourselves to this space, because we can obtain that result by using Dynkin's theorem (see Øksendal [34]). The conditions in Dynkin's theorem, in turn, are guaranteed by the smoothness of the functions, which is assumed in Theorem 3.4. Many studies have also removed the above restriction on the function space; see Liang and Song [22], Wu et al. [45], Zeng et al. [51], Zhu et al. [56] and so on.
(ii) We state the verification theorem by assuming the smoothness of the functions V, f and g. However, finding conditions that guarantee V, f and g are regular enough to satisfy the extended HJBI system is a difficult open problem, even in the case without uncertainty. Yi et al. [49], Zheng et al. [54] and Zeng et al. [52] all study robust optimal control problems under the exponential utility and identify proper conditions that guarantee their assumptions. Under the MV framework, this problem is still open. Another related open problem is to prove the existence and uniqueness of solutions to the extended HJBI system. Due to its complexity, there has been little progress on this problem in the literature. These issues are left for future research; see the discussions of these open problems in Björk et al. [6] and Pun [37].

3.4. Robust equilibrium policy for specified ambiguity preference functions. To obtain an analytical solution for our problem, in this subsection we derive the robust equilibrium policy and the corresponding robust equilibrium value function for specified ambiguity preference functions.
Inspired by Pun [37], we choose the ambiguity preference functions Φ1(t, x, l) and Φ2(t, x, l) of the form (23)-(24), with (25), where ∂G(x, y)/∂y |_{y=g(t,x,l)} = −2g(t, x, l).

Remark 10. (i) Maenhout [30] points out that, if we simply choose the ambiguity preference function to be a constant or a state-independent parameter, the robustness to model uncertainty will vanish as the state variables change. In other words, an ambiguity preference function should implicitly impose homotheticity, so that the robustness does not wear off as the state variables (in our case, the current wealth x and the current labor income l) vary. Therefore, he proposes a preference function which scales the preference parameter θ by the value function, i.e., Ψ(t, W) = θ/((1 − γ)V(t, W)) (equation (13) in Maenhout [30]). Zeng et al. [52], Yi et al. [49] and Yi et al. [50] adopt this construction to describe the preference function under the power (CRRA) and exponential (CARA) utility frameworks and the MV framework, respectively. Differently, Pun and Wong [36] define the preference function as φ(t, x, y) = −V_{xx}(t, x, y)/V_x²(t, x, y) (equation (7) in Pun and Wong [36]) for a general utility function, and Pun [37] chooses the ambiguity preference function of the form given in equation (23) of Pun [37]. Notice that the value function in Pun [37] can be rewritten as V(t, x) = h(t, x, x) + G(x, g(t, x)), so his preference function is identical to that in Pun and Wong [36]. However, we do not simply set our ambiguity preference function to be χ1(t, x, l) + χ2(t, x, l) + χ3(t, x, l). The reason is that our selection of the ambiguity preference functions (23)-(24) is based on the expressions of q*1(t, x, l, u), q*2(t, x, l, u) and H(t, x, l; q*) in equations (55), (56) and (11), such that the optimal control u deduced from the HJB equation (57) takes a linear form in the variables x and l. Only with this linear form of u can we separate equations (21) and (22) into a system of ODEs by collecting terms of each order in x and l; in other words, only then can we obtain analytical solutions for our problem.
Obviously, our ambiguity preference functions are state-dependent, so they retain the homothetic robustness property and are therefore economically meaningful. Moreover, the numerator of Φ1(t, x, l) has dimension (dollar)², and its denominator has dimension (dollar)⁴, while the numerator of Φ2(t, x, l) is dimensionless and its denominator has dimension (dollar)². Therefore, both Φ1(t, x, l) and Φ2(t, x, l) have dimension 1/(dollar)². Noting that our value function has dimension (dollar)², our choice of the ambiguity preference functions is also consistent with the preference function in Maenhout [30] in terms of dimensional analysis.
(ii) The specification of the ambiguity preference functions Φ i (t, x, l) (i = 1, 2) is subject to Ansatzes which ensure that both Φ1(t, x, l) and Φ2(t, x, l) are positive.
(iii) Zeng et al. [51] investigate the robust equilibrium policy for a reinsurance-investment problem with jumps under the MV criterion. However, in order to obtain an analytical solution, they simply assume the preference functions to be nonnegative constants. As discussed in point (i), this causes the robustness to model uncertainty to vanish as the state variables change.
Therefore, the ambiguity preference functions Φ i (i = 1, 2) are reasonable: each Φ i is monotonically decreasing with respect to the risk-tolerance function R i (i.e., a higher risk aversion implies more robustness), so Φ i preserves the relation between risk aversion and ambiguity aversion.
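The separation step described in point (i) of Remark 10 — substituting an ansatz and collecting terms by the order of x and l — can be illustrated with a toy symbolic computation. The ansatz and the equation below are hypothetical stand-ins chosen for brevity, not the paper's equations (21)-(22):

```python
import sympy as sp

# Toy illustration of "separating by the order of x and l": substitute a
# linear ansatz into a (hypothetical) first-order equation and collect the
# coefficients of x and l.  Each coefficient must vanish identically, which
# yields one ODE in t per power of the state variables.
t, x, l = sp.symbols('t x l')
g1, g2 = sp.Function('g1'), sp.Function('g2')

g = g1(t) * x + g2(t) * l                                      # linear ansatz
expr = sp.diff(g, t) + x * sp.diff(g, x) + l * sp.diff(g, l)   # toy equation = 0

ode_x = sp.expand(expr).coeff(x, 1).coeff(l, 0)   # coefficient of x: g1' + g1
ode_l = sp.expand(expr).coeff(l, 1).coeff(x, 0)   # coefficient of l: g2' + g2
print(ode_x, ode_l)
```

Setting each collected coefficient to zero gives a closed system of ODEs for the time-dependent functions, which is exactly why the linearity of u in x and l is essential for tractability.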
With the specification of the ambiguity preference functions in (23) and (24), we obtain the following result.
Proposition 3.5. The robust equilibrium policy is given by (28)-(30), where λ i (i = 1, 2) and χ i (i = 1, 2, 3) are given in (25), and f and g are determined by equations (21) and (22). The robust equilibrium value function is given by (31). Proof. See Appendix B.
Remark 12. From expression (30), we can see that the robust equilibrium control consists of two parts: the first part has a form similar to the result obtained in Pun [37], while the second part is an additional term arising from the fund's contribution.
Obviously, if we can determine f and g by solving equations (21) and (22) with the robust equilibrium policy given in (28)-(30), then V given in (31) is the robust equilibrium value function.
We are now ready to present the main result for our problem.
Theorem 3.6. For a DC pension fund with the ambiguity preference functions Φ i (t, x, l) (i = 1, 2) given in (23) and (24), the robust time-inconsistent control problem (10) admits an explicit robust equilibrium value function, and the robust equilibrium control-measure policy (u*, q*) is given by (33)-(35), where the deterministic functions g i (t) (i = 1, 2) and f i (t) (i = 1, · · · , 5) satisfy the system of ODEs (64)-(70) given in Appendix C.
Proof. See Appendix C.
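Since (64)-(70) are deterministic ODEs with terminal conditions at t = T, in practice they are integrated backward in time. A minimal numerical sketch of that recipe follows; the right-hand side and terminal values below are placeholders, not the actual system from Appendix C:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of backward integration of a terminal-value ODE system such as
# (64)-(70) for y = (g1, g2, f1, ..., f5).  The dynamics and terminal data
# here are hypothetical placeholders used only to show the time-reversal trick.
T = 10.0

def rhs(t, y):
    # placeholder right-hand side; the true coefficient structure is in Appendix C
    A = -0.1 * np.eye(7)
    return A @ y

yT = np.ones(7)   # placeholder terminal conditions y(T)

# Substitute s = T - t so the terminal-value problem becomes an initial-value
# problem on [0, T]; the sign of the dynamics flips accordingly.
sol = solve_ivp(lambda s, z: -rhs(T - s, z), (0.0, T), yT, dense_output=True)

def y_at(t):
    """Value of (g1, g2, f1, ..., f5) at calendar time t in [0, T]."""
    return sol.sol(T - t)
```

With the true right-hand sides from (64)-(70) plugged into `rhs` and the true terminal data into `yT`, `y_at(t)` supplies the coefficients entering the equilibrium policy (33)-(35) at each time point.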
Remark 13. (i) Theorem 3.6 shows that the robust equilibrium policy (33) is linear with respect to the current wealth level and the current labor income level, which extends the result in Pun [37] by incorporating labor income risk. Moreover, the robust equilibrium value function is quadratic in the current wealth level and the current labor income level. In addition, noting the dependence of g j (j = 1, 2) and f i (i = 1, · · · , 5) in (64)-(70) on the aggregate ambiguity aversion level ξ, the influence of robustness on u* and q* i (i = 1, 2) is highly nonlinear. By contrast, Maenhout [30] and Liu [27] have shown that the ambiguity aversion level can be understood as an extra risk aversion parameter under the power (CRRA) and recursive utility functions, so for those utility functions robustness merely enlarges the investor's risk aversion level.
(ii) As mentioned in Remark 9, it is a difficult issue to find conditions ensuring that the assumptions of the verification theorem are satisfied. In Björk et al. [6], the authors take the linear-quadratic regulator as an example and discuss the technical conditions of the verification theorem; since the equilibrium state dynamics are linear under their setting, the assumptions of the verification theorem can be easily checked. Zeng et al. [51] construct a robust control problem under the MV criterion, and although they simply set the ambiguity preference parameters to be constants, they are still not able to find conditions ensuring their assumptions. In our case, although the robust equilibrium control u* is linear with respect to x and l, the expressions of q* i (i = 1, 2) make the equilibrium state dynamics highly nonlinear, which makes it even more difficult to find such conditions. Therefore, this issue will not be investigated in this paper and remains open.

Remark 14. According to (34) and (35), the ambiguity preference functions in (23) and (24) can be rewritten in terms of m(t, x, l) and f 2 (t). With these expressions, the Ansatzes made in the second point of Remark 10 now become m(t, x, l) > 0 and f 2 (t) > 0.
4. Special cases. In this section, we examine two special cases of our model.

Special case 1. If there is no labor income in our model, i.e., L(t) ≡ 0, our problem reduces to an MV portfolio selection problem for an AAFM. The robust equilibrium control-measure policy in Theorem 3.6 can then be simplified to (39)-(41), and the robust equilibrium value function simplifies accordingly. It is worth noting that the robust equilibrium policy (39)-(41) obtained in this case is essentially the same as that given in Pun [37] (i.e., the case of state-dependent risk aversion).
Special case 2. Suppose the fund manager is ambiguity-neutral, i.e., the ambiguity aversion parameter ξ equals zero. As discussed in the second point of Remark 4, in this case we have D KL (Q ∥ P) ≡ 0. Therefore, the robust equilibrium control-measure policy in Theorem 3.6 becomes q*1(t, x, l, u*) = 0, q*2(t, x, l, u*) = 0 (44), the system of ODEs for f̂ i (t) (i = 1, · · · , 5) and ĝ i (t) (i = 1, 2) simplifies correspondingly, and the robust equilibrium value function is given by V̂.

5. Numerical illustration. In this section, we carry out several numerical experiments to illustrate the effects of model parameters on the robust equilibrium policy derived in this paper. Unless otherwise stated, the basic parameters are set as in Table 1. Most of them are selected from existing numerical studies for comparison purposes; see Zeng et al. [52], Yi et al. [49] and Wang and Li [43]. The financial market parameters are assumed to be independent of time. It is known from (33) that the robust equilibrium policy u* is driven by the wealth process X u (t) and the labor income process L(t). Due to the random nature of X u (t) and L(t), the robust equilibrium policy is itself a stochastic process. In the following experiments, to get an idea of the evolution of the robust equilibrium policy over time, we plot its sample paths, as we do for other stochastic processes.
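Sample paths of this kind can be generated by a standard Euler-Maruyama scheme. The sketch below is a stylized stand-in consistent with the model's verbal description (risky asset with appreciation rate μ and volatility σ, labor income with hedgeable and non-hedgeable volatility components, contribution rate c, and a policy linear in (x, l)); the dynamics, coefficients and policy are illustrative assumptions, not the paper's exact equations:

```python
import numpy as np

# Euler-Maruyama sketch of one sample path of wealth X and labor income L.
# All dynamics and parameter values are hypothetical stand-ins for illustration.
rng = np.random.default_rng(0)
r, mu, sigma = 0.05, 0.2, 0.2        # risk-free rate, appreciation rate, volatility
alpha, phi, beta = 0.1, 0.1, 0.1     # labor income drift, hedgeable/non-hedgeable vols
c = 0.1                              # contribution rate
T, n = 10.0, 1000
dt = T / n

def policy(t, x, l):
    # hypothetical linear equilibrium policy u*(t, x, l) = a x + b l
    return 0.5 * x + 0.2 * l

x, l = 1.0, 1.0                      # initial wealth and initial labor income
path = [(0.0, x, l)]
for k in range(n):
    t = k * dt
    dW = rng.normal(0.0, np.sqrt(dt))    # shock driving the risky asset
    dWt = rng.normal(0.0, np.sqrt(dt))   # independent non-hedgeable shock
    u = policy(t, x, l)
    # wealth: risk-free growth + excess return on amount u + contribution c * l
    x += (r * x + (mu - r) * u + c * l) * dt + sigma * u * dW
    # labor income: hedgeable (dW) and non-hedgeable (dWt) components
    l += l * (alpha * dt + phi * dW + beta * dWt)
    path.append((t + dt, x, l))

print(path[-1])
```

Averaging the resulting policy values u*(t, X(t), L(t)) over many such paths is one way to reproduce the qualitative comparative statics discussed below.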
(1) Effects of the risky asset's appreciation rate and volatility rate. Let the risky asset's appreciation rate and volatility rate be μ = 0.2, 0.3, 0.4 and σ = 0.1, 0.2, 0.3, respectively, with the other parameters as in Table 1. The influence of μ on the robust equilibrium policy is shown in Figure 1. We see that the amount invested in the risky asset basically decreases over time for each μ. As μ increases, the fund manager increases his investment in the risky asset at every time point. This result is intuitive: the higher the appreciation rate of the risky asset, the larger its expected return, which leads to more investment in the risky asset. Figure 2 illustrates the influence of σ on the robust equilibrium policy. We observe that the amount invested in the risky asset decreases as σ increases. This result is reasonable, since the fund manager faces greater risk when the volatility rate of the risky asset increases and therefore decreases his investment in it. The decreasing pattern over time also matches practice: as the retirement time approaches, the usual advice to a pension plan member is to reduce the investment in the risky asset, that is, the fund manager shifts his investment from the risky asset to the risk-free asset as the terminal time approaches. The above trends are consistent with the results of Zeng et al. [52]. In Figures 1 and 2, we also show the values of m(t, x, l) and f 2 (t) for each case; both take positive values during the whole investment period. This means that the conditions in Remark 14 are satisfied, that is, our results are valid.

Remark 15. It is worth pointing out that the values of m(t, x, l) and f 2 (t) are also positive in the following experiments; however, we will no longer display their specific values due to space limitations.
(2) Effects of the labor income's appreciation rate and volatility rate. Increasing the labor income's appreciation rate α from 0.1 to 0.3 with step size 0.1 while keeping the other parameters unchanged, we obtain Figure 3. The influence of α on the robust equilibrium policy is broadly similar to that of μ. When the labor income's hedgeable volatility rate ϕ and non-hedgeable volatility rate β are increased from 0.1 to 0.3 with step size 0.1, respectively, while the other parameters keep their initial values, Figures 4-5 are obtained. We can see that the amount invested in the risky asset decreases over time for each hedgeable/non-hedgeable volatility rate of the labor income, and that the investment in the risky asset decreases as the hedgeable/non-hedgeable volatility rate of the labor income increases. The reason for this result is similar to that for the risky asset's volatility rate.

(3) Effects of the risk aversion coefficient, the investment horizon and the contribution rate. First, let the risk aversion coefficient γ increase from 0.5 to 2.5 with step size 1, while keeping the other parameters at their initial values. The robust equilibrium policy corresponding to increasing γ is shown in Figure 6. We note that: a) the investment in the risky asset decreases over time for each γ; b) the investment in the risky asset increases as γ increases. The reason is that the larger the value of γ, the less risk-averse the fund manager becomes, so he pursues a riskier policy, i.e., invests more in the risky asset. This is consistent with the observation that an investor with high risk aversion will lower the speculative investment in the risky asset.
Second, let the investment horizon T increase from 6 to 10 with step size 2; the robust equilibrium policies under different values of T are shown in Figure 7. We can draw some conclusions from this figure. a) The allocation to the risky asset decreases during the whole investment horizon for each T. b) The shorter the investment horizon T, the larger the amount invested in the risky asset. This is easy to understand: the fund manager has less confidence in controlling the investment uncertainty in the future when the investment horizon becomes longer, which results in smaller investment in the risky asset for a longer investment horizon. Note that in the case T = 10, when approaching the terminal time, namely from t = 8 to t = 10, the amount invested in the risky asset is negative. This means that i) the fund manager should short sell the risky asset and invest the proceeds, together with the net wealth he holds at that time, in the risk-free asset, and ii) the amount of money borrowed from t = 8 to t = 10 first increases and then decreases. In a word, a long horizon brings more uncertainty and erodes the AAFM's confidence in the reference model. The trend highlighted in this figure is consistent with the results obtained in Wu et al. [45].
Third, we examine the effect of the contribution rate c on the robust equilibrium policy. Let c increase from 0 to 0.3 with step size 0.1, keeping the other parameters at their initial values. The relationship between the contribution rate and the robust equilibrium policy is depicted in Figure 8. As the contribution rate increases, the amount invested in the risky asset increases. This result is reasonable, since an increasing contribution rate implies higher pension fund accumulation, allowing the fund manager to invest more and earn more. Note the case c = 0, in which our model degenerates to a portfolio selection model for an AAFM: compared with the other three cases, the fluctuation of the investment in the risky asset is smallest in this case.
(4) Effect of the aggregate ambiguity aversion. Increasing the aggregate ambiguity aversion level ξ from 0.5 to 2.5 with step size 1 while keeping the other parameters unchanged, we obtain Figure 9, which depicts the influence of the aggregate ambiguity aversion on the robust equilibrium policy. We find that the amount invested in the risky asset decreases with respect to ξ. When ξ increases, the fund manager becomes more ambiguity-averse, i.e., he loses more confidence in the reference model and seeks more robustness against model uncertainty; therefore, he decreases his investment in the risky asset. As mentioned in Remark 13, Maenhout [31] and Liu [27] point out that the ambiguity aversion level can be understood as an extra risk aversion coefficient, and a higher ambiguity aversion level implies a decrease in the investment in the risky asset. Although the influence of ξ on the robust equilibrium policy in our problem is highly nonlinear, as discussed in Remark 13, the trend shown in Figure 9 is broadly consistent with their conclusions.
Next, we investigate the effect of model uncertainty on the robust value function. Recall that an ANFM, i.e., a fund manager who fully trusts the reference model, will follow the robust equilibrium policy given in (43), and the value function for the fund manager in this case is given by (52). To measure the increased spread of the value function due to aversion to model uncertainty (i.e., for an AAFM), compared to the case of model certainty (i.e., for an ANFM), we define a discrepancy function U(t, x, l), where V̂(t, x, l) and V(t, x, l) represent the value functions for an ANFM (i.e., without ambiguity aversion) and for an AAFM with ambiguity aversion level ξ, respectively. The effect of the aggregate ambiguity aversion level ξ on the discrepancy function is shown in Figure 10. The higher the ambiguity aversion level ξ, the more ambiguity-averse the AAFM, i.e., the more skeptical he is about the reference model. We see from Figure 10 that the discrepancy function mainly increases with respect to time, which means that the utility deviation for an AAFM with a longer time horizon is bigger than that with a shorter one. Moreover, as ξ increases, the discrepancy function increases. This is easy to understand: the more ambiguity-averse the AAFM is, the more conservative the policy he seeks, and in order to compensate for his robust behavior, the larger the utility deviation he suffers, i.e., the greater the value of the discrepancy function. This result is consistent with the results in Zeng et al. [52].
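The comparison behind Figure 10 can be sketched numerically. Since the display defining U(t, x, l) is not reproduced above, the relative-difference form below, and the quadratic stand-ins for the two value functions, are purely illustrative assumptions:

```python
# Hypothetical sketch of the discrepancy comparison in Figure 10.  Both value
# functions and the form of U are assumed stand-ins, not the paper's formulas.
T = 10.0

def V_neutral(t, x, l):
    # stand-in for the ANFM value function V-hat: quadratic in (x, l)
    return 1.2 * x**2 + 0.5 * x * l + 0.3 * l**2 + (T - t)

def V_averse(t, x, l, xi):
    # stand-in for the AAFM value function: ambiguity aversion xi lowers utility
    return V_neutral(t, x, l) - 0.1 * xi * t * (x + l)

def discrepancy(t, x, l, xi):
    """Assumed form U = (V-hat - V) / |V-hat|: utility deviation from ambiguity."""
    vn = V_neutral(t, x, l)
    return (vn - V_averse(t, x, l, xi)) / abs(vn)

# the discrepancy grows with both the ambiguity aversion level and elapsed time
print([round(discrepancy(5.0, 1.0, 1.0, xi), 4) for xi in (0.5, 1.5, 2.5)])
```

Under these stand-ins the discrepancy is monotone in ξ and in t, matching the qualitative behavior reported for Figure 10.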
Remark 16. Our discrepancy function U(t, x, l) is adopted from Yi et al. [50]. Zeng et al. [52], however, use equation (48) of their paper to define the utility improvement obtained by considering the ambiguity aversion, where J(t, x, v, l) and J̄(t, x, v, l) represent the value functions with and without the ambiguity aversion, respectively. Zheng et al. [54] describe this by defining a utility loss function, where V(t, x, s) and V 0 (t, x, s) represent the value functions with and without the ambiguity aversion, respectively. Noting that the optimization problems in Zeng et al. [52] and Zheng et al. [54] are constructed under the order "sup inf", while our optimization model adopts the order "inf sup", our discrepancy function is essentially the same as their utility improvement (or loss) functions, and all of these functions describe the utility deviations caused by considering model uncertainty.

(5) Effects of the initial wealth and the initial labor income. Finally, we show the effects of the initial wealth X 0 and the initial labor income L 0 on the robust equilibrium policy. Letting the initial wealth/initial labor income increase from 1 to 3 with step size 1, while keeping the other parameters at their initial values, Figures 11-12 are obtained. We can see that the amount invested in the risky asset increases as the initial wealth/initial labor income increases. This result is easy to understand, since a larger initial wealth/initial labor income means higher pension fund accumulation, allowing the fund manager to invest more.
All the above numerical results and analyses further demonstrate the influences of the model parameters on the robust equilibrium policy. These observations help provide a more intuitive understanding of the theoretical results obtained in this paper.

Figure 11. Effect of X 0 on the robust equilibrium policy.

6. Conclusion. In this article, we study a robust equilibrium investment problem for an AAFM in continuous time under the MV criterion, where the risk aversion function takes a linear, state-dependent form and the model uncertainty follows the model misspecification framework proposed by Anderson. The financial market under consideration consists of one risk-free asset and one risky asset. Meanwhile, the AAFM faces stochastic labor income risk and aims to develop a robust equilibrium policy. With our choice of ambiguity preference functions, explicit expressions for the robust equilibrium control-measure policy and the corresponding robust equilibrium value function have been derived by adopting the concept of subgame perfect equilibrium and applying the extended dynamic programming approach. Our work enriches the literature on DC pension plans by introducing meaningful ambiguity preference functions under the MV criterion with state-dependent risk aversion. Two special cases of our model and the associated results are also discussed. Finally, the economic implications of our theoretical results are illustrated through a series of numerical experiments.

Acknowledgments. The authors thank the two anonymous reviewers for their detailed and valuable comments and suggestions, which have helped us to substantially improve the presentation and quality of this manuscript.
In light of (53), g^{u,q} can be written as

g^{u,q}(t, x, l) = E^Q_{t,x,l}[ g^{u,q}(s, X^u(s), L(s)) ],

and thus

G(x, g^{u,q}(t, x, l)) = E^Q_{t,x,l}[ G(X^u(s), g^{u,q}(s, X^u(s), L(s))) ] + { G(x, E^Q_{t,x,l}[ g^{u,q}(s, X^u(s), L(s)) ]) − E^Q_{t,x,l}[ G(X^u(s), g^{u,q}(s, X^u(s), L(s))) ] }
= E^Q_{t,x,l}[ G(X^u(s), g^{u,q}(s, X^u(s), L(s))) ] + I^{u,q}_G(t, x, l, s).
It remains to check that the candidate equilibrium policy (u*, q*) given in (33)-(35) is admissible.