Distributionally robust chance constrained problems under general moments information

In this paper, we focus on distributionally robust chance constrained problems (DRCCPs) under general moment information sets. Using convex analysis, we obtain an equivalent convex programming form for DRCCP under the assumption that the first and second order moments belong to corresponding convex and compact sets. We give some examples of support functions of matrix sets to show the tractability of the equivalent convex program, and we obtain the closed form solution of the worst case VaR optimization problem. We then present an equivalent convex programming form for DRCCP under the assumption that the first order moment set and the support subsets are convex and compact. We also give an equivalent form for the distributionally robust nonlinear chance constrained problem under the assumption that the first order moment set and the support set are convex and compact. Moreover, we provide illustrative examples of our results.

Distributionally robust chance constrained programming (DRCCP), which can be viewed as a combination of chance constraints and distributionally robust optimization, has become a significant and effective approach to practical optimization problems involving uncertainty, and has seen great development in both theory and applications [5,13,14,27,31,35,36,37]. In this paradigm, the distribution of the uncertain parameter is not precisely known but belongs to a given information set, and the optimal solution is required to be feasible for all realizations of the uncertain distribution. DRCCP provides a computationally viable methodology for immunizing chance constrained programming against distribution uncertainty.
Most distributionally robust chance constrained methods are developed with the aim of achieving a computationally tractable model for the corresponding DRCCPs, in which how to describe the distribution set is an important issue. Recently, Calafiore and El Ghaoui [5] showed that a distributionally robust individual chance constrained problem can be rewritten as a second-order cone problem under exact second order moment information. Zymler et al. [36] obtained a semidefinite program for a distributionally robust joint chance constraint under exact second order moment information. Yang and Xu [34] proved that DRCCP for nonlinear uncertainties under exact second order moment information is equivalent to a robust optimization problem. Yet most previous results on the tractability of DRCCPs are restricted to the case in which the distribution information is described by exact second order moments. As pointed out by Delage and Ye [6], treating empirical moments as exact moments is not reliable. To strengthen safety, they replaced the exact moments by ambiguous moment information and discussed the distributionally robust optimization problem (DROP) under ellipsoidal moment information. Moreover, Natarajan et al. [17] considered a distributionally robust expected utility problem involving an interval moment uncertainty set. Wiesemann et al. [30] described the distribution information set for DROPs by a family of support subsets given by conic representable confidence sets together with a first order moment residing on an affine manifold.
On the other hand, Ding et al. [7] discussed DRCCP under an interval moment uncertainty set and obtained its equivalent convex program. Zhang et al. [35] considered a distributionally robust chance constrained appointment scheduling problem under an ellipsoidal moment uncertainty set and derived an approximate semidefinite programming model. Wang et al. [28] showed that a distributionally robust chance constrained support vector machine problem can be transformed into a second order cone problem under exact second order moment information, and discussed the problem under interval first order moment uncertainty, ellipsoidal first order moment uncertainty, and second order moment uncertainty controlled by the F-norm, respectively.
We note that, in the works mentioned above, DROPs and DRCCPs have been discussed under various uncertain moment sets, such as ellipsoidal moment sets and interval moment sets. Clearly, ellipsoidal and interval moment sets are all convex and compact. A natural question is therefore: can we obtain equivalent forms for DRCCPs involving general convex and compact information sets?
The main purpose of this paper is to make a new attempt at studying DRCCPs. The rest of this paper is organized as follows. In Section 2, we obtain the convex programming form for DRCCP under the assumption that the first and second order moments belong to corresponding convex and compact sets. We also present some computationally tractable examples for the support functions of matrix sets and the equivalent convex programming. Moreover, we apply one result to the worst case VaR optimization problem and obtain its closed form solution. To avoid computing the support function of matrix sets, in Section 3 we derive the convex programming form for DRCCP under the assumption that the first order moment set and the support subsets are convex and compact. In Section 4, for the distributionally robust nonlinear chance constrained problem in which the first order moment set and the support set are convex and compact, we obtain the equivalent form and give an example in which this form is convex. Finally, we summarize the results in Section 5.

2. DRCCP under general first and second order moment information.
In this section, we consider the following DRCCP: where $x$ is the decision vector, $\chi \subset \mathbb{R}^n$ is a convex and compact set, $c \in \mathbb{R}^n$ is a deterministic cost vector, $\xi \in \mathbb{R}^k$ is a random vector, and $w_0(x)$, $w(x)$ are two affine functions. Let $\mathcal{P}$ denote the information set of the involved probability distributions, which includes the true distribution $P$. Employing model (1), Ghaoui et al. [8] discussed the worst case VaR problem and presented several convex reformulations, Zymler et al. [37] considered two types of worst case VaR of nonlinear portfolios and developed tractable representations, and Wang et al. [28] showed that a distributionally robust chance constrained support vector machine problem can be transformed into a second order cone problem. With only limited historical data, it is hard to obtain exact distribution information. In this section, we suppose the ambiguity information set has the following form, where $\mathcal{P}$ denotes the set of all probability measures on the measurable space $(\mathbb{R}^k, \mathcal{B})$, with $\mathcal{B}$ the Borel $\sigma$-algebra on $\mathbb{R}^k$. Denote by $S^k$ the space of symmetric matrices of dimension $k$. Generally speaking, the exact first (respectively, second) order moment $\mu$ (respectively, $\Sigma$) is not far from the empirical first (respectively, second) order moment $\mu_0$ (respectively, $\Sigma_0$). Thus, we can assume that the deviations of $(\mu, \Sigma)$ from $(\mu_0, \Sigma_0)$ lie in prescribed sets, where $A, \Xi \in S^k$, $\zeta \in \mathbb{R}^k$, and, for $i = 1, 2$, $U_i$ is a given nonempty, convex and compact set with $0 \in U_i$. The decision makers' confidence in $\mu_0$ and $\Sigma_0$ is reflected by the magnitude of the sets $U_1$ and $U_2$, respectively. Some examples of $U_i$ are as follows: (i) with appropriate choices, they are exactly the ellipsoidal set $(\mu - \mu_0)^T \Sigma_0^{-1} (\mu - \mu_0) \le \lambda_1$ and the LMI $\Sigma \preceq \lambda_2 \Sigma_0$ mentioned in [6], respectively; (ii) let $U_1 = \{\zeta \mid \underline{\mu} - \mu_0 \le \zeta \le \overline{\mu} - \mu_0\}$ with $A$ being the unit matrix, and define $U_2$ accordingly; then they are exactly the interval sets $\underline{\mu} \le \mu \le \overline{\mu}$ and $0 \preceq \underline{\Sigma} \preceq \Sigma \preceq \overline{\Sigma}$ mentioned in [7,17], respectively.
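The two constructions above can be checked numerically. The following sketch, with hypothetical values for $\mu_0$, $\Sigma_0$, $\lambda_1$, $\lambda_2$ (all names here are illustrative, not from the paper), tests whether a candidate pair $(\mu, \Sigma)$ lies in the ellipsoidal/LMI moment set of example (i):

```python
import numpy as np

def in_ellipsoidal_moment_set(mu, Sigma, mu0, Sigma0, lam1, lam2, tol=1e-9):
    """Check (mu - mu0)^T Sigma0^{-1} (mu - mu0) <= lam1 and
    Sigma <= lam2 * Sigma0 in the Loewner (positive semidefinite) order."""
    d = mu - mu0
    first_ok = d @ np.linalg.solve(Sigma0, d) <= lam1 + tol
    # Sigma <= lam2*Sigma0 iff lam2*Sigma0 - Sigma is positive semidefinite
    eigs = np.linalg.eigvalsh(lam2 * Sigma0 - Sigma)
    second_ok = eigs.min() >= -tol
    return first_ok and second_ok

mu0, Sigma0 = np.zeros(2), np.eye(2)
print(in_ellipsoidal_moment_set(np.array([0.1, 0.1]), 1.5 * np.eye(2),
                                mu0, Sigma0, 1.0, 2.0))   # True: both tests pass
print(in_ellipsoidal_moment_set(np.array([2.0, 0.0]), 1.5 * np.eye(2),
                                mu0, Sigma0, 1.0, 2.0))   # False: mean deviation too large
```

The LMI test reduces to an eigenvalue check because $\Sigma \preceq \lambda_2 \Sigma_0$ holds exactly when $\lambda_2 \Sigma_0 - \Sigma$ has no negative eigenvalue.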
It is worth mentioning that the distributionally robust individual chance constrained problem discussed in [7] is a special case of the DRCCP in this paper. Next, we devote ourselves to obtaining a tractable reformulation of DRCCP (1) under the distribution information set $\mathcal{P}_1$. Denote the support function of a set $U \subset \mathbb{R}^k$ as follows, and similarly denote the support function of a set $S \subset S^k$, where $\langle Y, X \rangle = \mathrm{tr}(YX)$ is the trace scalar product.
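In standard notation, consistent with the trace scalar product just introduced (the paper's own displays may differ slightly in typesetting), these two support functions read:

```latex
\delta^{*}\left(y \mid U\right) = \sup_{\zeta \in U} \, y^{T}\zeta,
\qquad
\delta^{*}\left(Y \mid S\right) = \sup_{\Xi \in S} \, \langle Y, \Xi \rangle
= \sup_{\Xi \in S} \operatorname{tr}(Y\Xi).
```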
Theorem 2.2. DRCCP (1) under the information set $\mathcal{P}_1$ is equivalent to the following convex programming problem. Proof. Firstly, we rewrite problem (1) in the following equivalent form. By constructing the information set $U = \{(\mu, \Sigma) : \mu \in U_\mu, \Sigma \in U_\Sigma\}$, the chance constraint in (4) can be reformulated as follows. Let $1_S$ denote the indicator function of the set $S$. Letting $y = w(x)^T \xi$ and arguing as in the proof of Theorem 2.2 in [7], by employing the strong duality theorem of Isii [10] and Lemma 2.1, we can reformulate the problem in the following form, where $\lambda_0, \lambda_1, \lambda_2 \in \mathbb{R}$ are the dual variables for the constraints. When $\lambda_2 \le 0$, the feasible region is empty; thus $\lambda_2 > 0$. As in the proof of Theorem 2.1 in [17], taking $y^* = -\frac{\lambda_1}{2\lambda_2}$ in the first constraint and $y^* = \frac{\eta - \lambda_1}{2\lambda_2}$ in the second constraint, and using the following change of variables, problem (6) can be rewritten as follows. In what follows, we prove that problem (8) is equivalent to the following problem (9). For fixed $x$, suppose that $(\mu^*, \Sigma^*, p^*, t^*, z^*, s^*, \eta^*)$ is an optimal solution of (8). Assume, for contradiction, that for some $(\mu_1, \Sigma_1) \in U$, at least one of the constraints is violated at $(p^*, t^*, z^*, s^*, \eta^*)$, which means that one of the following inequalities holds. Suppose the first inequality holds. Then we obtain a contradiction with the fact that $(\mu^*, \Sigma^*, p^*, t^*, z^*, s^*, \eta^*)$ is an optimal solution of (8). Thus, the first claim holds, and the second can be proved similarly. In conclusion, problem (8) is equivalent to problem (9). Then, problem (4) can be reformulated as follows. By the definition of the support function, the above problem can be reformulated further. Clearly, the third constraint of the above system does not hold when $\eta = 0$, which implies that $\eta > 0$.
From the first and second inequalities of the above problem, we obtain the following, and (5) can then be simplified. To investigate the convexity of problem (4), we introduce a new variable $v$, so that (10) simplifies as follows. It is well known that $x^T \Sigma x$ is convex in $x$ when $\Sigma \succeq 0$. This fact shows that the corresponding convexity inequality holds for any $\theta \in [0, 1]$ and $x_1, x_2 \in \mathbb{R}^n$. From the above discussion, we get the conclusion.

2.1. Examples. Theorem 2.2 shows that DRCCP (1) under the information set $\mathcal{P}_1$ can be reformulated as a convex program. It follows from (3) that the key issue for the computational tractability of this convex program is the computation of the support functions of the vector and matrix sets involved. For computationally tractable support functions of vector sets, we refer the reader to [3]. For the matrix $\Xi$, if we identify $U_2$ with its vectorized image $U(\mathrm{vec}(\Xi))$, then we can handle the support function of the matrix set by the method of [3]. Nevertheless, it is worthwhile to consider the support function of the matrix set directly, since the structure of the matrix is destroyed when it is transformed into vector form. Therefore, we next give some examples of support functions of matrix sets, in which the term $\delta^*(w(x)w^T(x) \mid U_2)$ can be reformulated in computationally tractable form.
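The reason the vectorization route works at all is that $\mathrm{vec}(\cdot)$ preserves the trace scalar product, so the matrix support function equals a vector support function over the vectorized set. A minimal numerical check of this identity (random symmetric matrices, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3)); W = (W + W.T) / 2     # symmetric, like w(x)w(x)^T
Xi = rng.standard_normal((3, 3)); Xi = (Xi + Xi.T) / 2  # symmetric, like Xi in U_2

# <W, Xi> = tr(W Xi) equals the Euclidean inner product of the vectorizations,
# so sup over U(vec(Xi)) reproduces the matrix support function value.
lhs = np.trace(W @ Xi)
rhs = W.flatten() @ Xi.flatten()
assert abs(lhs - rhs) < 1e-12
```

The downside, as noted above, is that constraints such as positive semidefiniteness of $\Xi$ do not translate into simple vector-set constraints, which is why direct matrix-set formulas are still needed.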
Example 2.1. Let $U_2 = \{\Xi \in S^k \mid \Theta_j \preceq \Xi \preceq \Xi_j,\ j = 1, 2, \cdots, J\}$, where $\Theta_j \preceq 0$ and $\Xi_j \succeq 0$ are given matrices. By the Schur complement, the support function of $U_2$ is given as follows. Example 2.2. Consider the set $U_2$ defined as follows, where $\tau > 0$ and $D \succ 0$. By using the following inequality, the support function of $U_2$ can be represented as follows. If $D$ is the unit matrix, then $U_2 = \{\Xi \in S^k \mid \|\Xi\|_F \le \sqrt{\tau}\}$ and the corresponding support function reduces to $\sqrt{\tau}\,\|w(x)\|_2^2$. Example 2.3. Let $U_2 = \{\Xi \in S^k \mid \|\Xi\|_{\sigma_p} \le \tau\}$ with $\tau > 0$, where the Schatten norm $\|\cdot\|_{\sigma_p}$ is defined in [4] as follows. Here $A \in S^k$ and $\sigma_i(A)$ is the absolute value of the $i$-th largest eigenvalue of $A$ (which has real eigenvalues). Let $A, B \in S^k$ and $p, q \in [1, +\infty]$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Then it is easy to see that $\mathrm{tr}(A^T B) \le \|A\|_{\sigma_p} \|B\|_{\sigma_q}$. Thus, the support function can be given as follows. Example 2.4. Let $U_2 = \{\Xi \in S^k \mid \mathrm{tr}(C_j \Xi) \le b_j,\ \Xi \succeq 0,\ j = 1, 2, \cdots, J\}$, where $C_j \in S^k$ and $b_j \in \mathbb{R}$ are given for $j = 1, 2, \cdots, J$. For the support function of $U_2$, we have the following. Example 2.5. Consider the set $U_2$ defined as follows, where $\beta_j \in \mathbb{R}$, $D_j \in S^k$ for $j = 1, 2, \cdots, J$, and $\Xi_0 \succeq 0$ are given. Then the support function of $U_2$ can be given as follows. In the above examples, we derive some computationally tractable support functions of matrix sets. Next, we show two computationally tractable instances of problem (3).
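The Frobenius-ball case in Example 2.2 can be verified directly: by Cauchy-Schwarz in the Frobenius inner product, $\delta^*(W \mid \{\Xi : \|\Xi\|_F \le \sqrt{\tau}\}) = \sqrt{\tau}\,\|W\|_F$, attained at $\Xi^* = \sqrt{\tau}\, W / \|W\|_F$; for the rank-one matrix $W = w w^T$ one has $\|W\|_F = \|w\|_2^2$, recovering the formula $\sqrt{\tau}\,\|w(x)\|_2^2$. A sketch with hypothetical data:

```python
import numpy as np

def support_frobenius_ball(W, tau):
    """delta^*(W | {Xi : ||Xi||_F <= sqrt(tau)}) = sqrt(tau) * ||W||_F,
    by Cauchy-Schwarz in the Frobenius inner product."""
    return np.sqrt(tau) * np.linalg.norm(W, 'fro')

tau = 2.0
w = np.array([1.0, 2.0, 2.0])            # ||w||_2 = 3
W = np.outer(w, w)                       # rank one: ||W||_F = ||w||_2^2 = 9
val = support_frobenius_ball(W, tau)
assert abs(val - np.sqrt(tau) * 9.0) < 1e-12

# the maximizer is feasible and attains the bound
Xi_star = np.sqrt(tau) * W / np.linalg.norm(W, 'fro')
assert np.linalg.norm(Xi_star, 'fro') <= np.sqrt(tau) + 1e-12
assert abs(np.trace(W @ Xi_star) - val) < 1e-9
```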

2.2. An application to risk management.
Recently, Value-at-Risk (VaR) has become one of the most popular risk measures in risk management. The VaR can be defined as follows, where $x$ denotes the vector of asset weights and $\xi$ denotes the random vector of relative asset returns. When the returns follow a normal distribution $N(\mu, \Sigma)$, the VaR can be expressed as follows, where $\kappa(\varepsilon) = -\Phi^{-1}(\varepsilon)$ and $\Phi^{-1}(\cdot)$ is the inverse cumulative standard normal distribution function. In practice, the exact distribution is not easy to acquire from limited historical data. In 2003, Ghaoui et al. [8] presented the exact formulation of (11) by setting $\kappa(\varepsilon) = \sqrt{\varepsilon^{-1}(1-\varepsilon)}$ under first and second order moment information.
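Under normal returns, the VaR formula above is $\mathrm{VaR}_\varepsilon(x) = -\mu^T x + \kappa(\varepsilon)\sqrt{x^T \Sigma x}$ with $\kappa(\varepsilon) = -\Phi^{-1}(\varepsilon)$. A quick consistency check, with a hypothetical two-asset portfolio (all numbers illustrative), compares this closed form against the quantile of the loss distribution:

```python
import numpy as np
from scipy.stats import norm

def gaussian_var(x, mu, Sigma, eps):
    """VaR_eps(x) = -mu^T x + kappa(eps) * sqrt(x^T Sigma x), kappa(eps) = -Phi^{-1}(eps)."""
    kappa = -norm.ppf(eps)
    return -mu @ x + kappa * np.sqrt(x @ Sigma @ x)

x = np.array([0.6, 0.4])
mu = np.array([0.05, 0.03])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
eps = 0.05

analytic = gaussian_var(x, mu, Sigma, eps)
# the loss -x^T xi is N(-mu^T x, x^T Sigma x); its (1-eps)-quantile is the VaR
check = norm.ppf(1 - eps, loc=-mu @ x, scale=np.sqrt(x @ Sigma @ x))
assert abs(analytic - check) < 1e-12
```

The two agree because $\Phi^{-1}(1-\varepsilon) = -\Phi^{-1}(\varepsilon)$.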
However, the optimal portfolio strategy may become infeasible when we treat the empirical moments as the exact moments. In this subsection, we investigate the VaR (11) under the information set $\mathcal{P}_1$. Let $w_0(x) = -\gamma$ and $w(x) = -x$. From Theorem 2.2, we have the following, and thus we obtain the result (13). It is worth mentioning that (13) degenerates to (12) with $\kappa(\varepsilon) = \sqrt{\varepsilon^{-1}(1-\varepsilon)}$ when we set $U_1 = \{0\}$ and $U_2 = \{0\}$ (no uncertainty about $\mu$ and $\Sigma$). Next we give two examples to show how to use (13) to compute the VaR.
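The degenerate case makes the price of robustness visible: the moment-based factor $\sqrt{(1-\varepsilon)/\varepsilon}$ of [8] always dominates the Gaussian factor $-\Phi^{-1}(\varepsilon)$, so the worst case VaR is more conservative than the normal-distribution VaR at the same level. A one-line comparison:

```python
import math
from scipy.stats import norm

eps = 0.05
kappa_gauss = -norm.ppf(eps)               # Gaussian quantile factor
kappa_worst = math.sqrt((1 - eps) / eps)   # distribution-free moment-based factor

# the worst case factor dominates the Gaussian one at every eps in (0, 1/2)
assert kappa_worst > kappa_gauss
```

At $\varepsilon = 0.05$ the factors are roughly $1.645$ versus $\sqrt{19} \approx 4.359$, so the moment-based worst case VaR is well over twice the Gaussian VaR for the same portfolio variance.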
where $\Sigma_2 \succ 0$ and $C \in \mathbb{R}^{k \times k}$ with $c \in \mathbb{R}^k$. Then, the support function can be rewritten as follows, and thus we obtain the result.

3. DRCCP under general first order moment and support subset information. In Section 2, we obtained the equivalent convex programming form for DRCCP under the information set $\mathcal{P}_1$. We notice that the term $\delta^*(w(x)w^T(x) \mid U_2)$ in this convex program has only a few tractable forms, which limits the practical applicability of the model. In 2014, Wiesemann et al. [30] discussed distributionally robust programming under conic representable distribution sets with support and first order moment information. In this WKS-type ambiguity set, the authors used interval probabilities of support subsets, instead of the variance, to describe the deviation. Many stochastic optimization problems have been studied under WKS-type ambiguity sets [11,12]. In this section, we consider the following information set, where $X_0, X_j$ are convex and compact sets, $U_\mu$ is defined in (2), and $\mathcal{J} = \{1, 2, \cdots, J\}$. Suppose that $\underline{p}_j, \overline{p}_j \in [0, 1]$ and $\underline{p}_j \le \overline{p}_j$ for all $j \in \mathcal{J}$. We remark that the information set $\mathcal{P}_2$, in which the first order moment set and the support subsets are assumed to be convex and compact, generalizes the conic representable information set considered by Wiesemann et al. [30]. We also suppose that $\mathcal{P}_2$ satisfies the following conditions: (C1) there exists a distribution $P \in \mathcal{P}_2$ such that $P(\xi \in X_j) \in (\underline{p}_j, \overline{p}_j)$ whenever $\underline{p}_j < \overline{p}_j$ for $j \in \mathcal{J}$; (C2) $X_J$ is bounded and almost surely contains all realizations of the random variable, i.e., $P(\xi \in X_J) = 1$; (C3) for all $j, j' \in \mathcal{J}$ with $j \neq j'$, either $X_j \subsetneq X_{j'}$, $X_{j'} \subsetneq X_j$, or $X_j \cap X_{j'} = \emptyset$.
Condition (C1) ensures that we can apply the strong duality theorem in [21] to reformulate DRCCP (1). Condition (C2) states that $X_J$ contains the support of the random variable $\xi$. Condition (C3) induces a strict partial order on the confidence sets $X_j$ with respect to the $\subsetneq$-relation and requires that incomparable sets be disjoint. Next, we devote ourselves to obtaining the equivalent form for DRCCP (1) under the information set $\mathcal{P}_2$ satisfying conditions (C1)-(C3).
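For simple families of confidence sets, condition (C3) can be verified mechanically. A sketch for one-dimensional closed intervals (the function name and data are illustrative, not from the paper):

```python
def nested_or_disjoint(intervals):
    """Check the (C3)-style requirement for confidence sets given as
    closed intervals (a, b): every pair must be nested or disjoint."""
    for i, (a1, b1) in enumerate(intervals):
        for a2, b2 in intervals[i + 1:]:
            nested = (a1 <= a2 and b2 <= b1) or (a2 <= a1 and b1 <= b2)
            disjoint = b1 < a2 or b2 < a1
            if not (nested or disjoint):
                return False
    return True

# a laminar family: every pair is nested or disjoint
assert nested_or_disjoint([(0, 10), (1, 3), (5, 8), (6, 7)])
# (0,5) and (3,8) overlap without nesting, violating the condition
assert not nested_or_disjoint([(0, 5), (3, 8)])
```

Families satisfying this nested-or-disjoint property are often called laminar; the partition of $X_J$ used in the proof of Theorem 3.1 exploits exactly this structure.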
Theorem 3.1. DRCCP (1) under the information set $\mathcal{P}_2$ can be rewritten as the following convex programming problem, where $x \in \mathbb{R}^n$, $t \in \mathbb{R}$, $\beta \in \mathbb{R}^k$, and $\lambda_j, \gamma_j \in \mathbb{R}$ are decision variables.
Proof. By constructing the following information set, the chance constraint in (1) can be reformulated as follows, where $1_S$ represents the indicator function of the set $S$. Then, the inner subordinate problem $\sup_{P \in \mathcal{P}_{21}} \mathbb{E}_P[1_S]$ can be expanded as follows, where $\beta \in \mathbb{R}^k$ and $\gamma_j, \lambda_j \in \mathbb{R}$ are the dual variables for the constraints. Strong duality holds by condition (C1). By Lemma 2.1, problem (17) is equivalent to the following. By condition (C3), we divide $X_J$ into nonempty and disjoint sets, where $D(j)$ denotes the index set of strict subsets of $X_j$. Then, the left-hand side of (16) can be rewritten as follows. Similar to the proof of Theorem 2.1, the above problem can be reformulated as the infimum over $\beta, \gamma_j, \lambda_j, \eta$ shown below. Next we claim that $\eta > 0$. Suppose $\eta = 0$; then the constraints containing $\eta$ become the following. Integrating both sides of these constraints, we have the following, where $\zeta(\cdot)$ is the underlying nonnegative distribution measure. Then, from $\underline{p}_j \le p_j \le \overline{p}_j$, one obtains a contradiction. Thus, we have $\eta > 0$. Dividing all constraints of problem (19) by $\eta$ and substituting the scaled variables $(\frac{\beta}{\eta}, \cdots)$, we proceed as follows. The robust counterparts of constraints (21) are given below. Since the left-hand side of the above inequality is linear in $\xi$, the optimum must be attained on the boundary of $\widetilde{X}_j$. Since $X_j$ and $\widetilde{X}_j$ share the same boundary, the above problem can be reformulated accordingly. By a similar discussion for constraint (22) and the definition of the support function, problem (20) can be reformulated as follows. Thus, constraint (16) can be rewritten as claimed. This completes the proof.
Theorem 3.1 shows that DRCCP (1) under the information set $\mathcal{P}_2$ satisfying conditions (C1)-(C3) can be rewritten as an equivalent convex program. When $\mathcal{P}_2$ does not satisfy condition (C3), an equivalent tractable form of DRCCP (1) is difficult to obtain; instead, by choosing a partition of the sets $X_j$ in which each element satisfies condition (C3), we can obtain a convex approximation of DRCCP (1). Compared with the equivalent form (3) of DRCCP under the information set $\mathcal{P}_1$, in problem (15) we only need to compute support functions of vector sets. We remark that the method proposed in [30] for solving distributionally robust programs under conic representable distribution sets is not suitable for solving problem (15). We now give the following examples to illustrate the efficiency of our results.

4. Distributionally robust nonlinear chance constrained problem under general first order moment and support information. In the preceding discussions, the constraint function in DRCCP (1) is bilinear in $x$ and $\xi$. In many practical problems the constraint function may have a more complicated and flexible structure. In this section, we consider the following distributionally robust nonlinear chance constrained problem, where $g(\cdot, x)$ is concave for all $x \in \mathbb{R}^n$. We notice that forcing a nonlinear constraint function into the form of DRCCP (1) would result in unnecessary conservatism. Here we consider the following distribution set under support and first order moment information, where $U_\xi$ is a convex and compact set and $U_\mu$ is defined in (2). Let $f : \mathbb{R}^k \to (-\infty, +\infty]$. We denote the convex conjugate of $f$ and the concave conjugate of $f$ as follows, where $\mathrm{Dom}(f)$ is the effective domain of $f$. We need the following lemma to establish our main result.
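The conjugates referred to here are the standard ones of convex analysis; in the usual notation (which may differ slightly from the paper's displays),

```latex
f^{*}(y) = \sup_{x \in \mathbb{R}^{k}} \left\{ y^{T}x - f(x) \right\},
\qquad
f_{*}(y) = \inf_{x \in \mathrm{Dom}(f)} \left\{ y^{T}x - f(x) \right\},
```

the first being the convex (Legendre-Fenchel) conjugate and the second the concave conjugate.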
Lemma ([1]). Let $f, -g : \mathbb{R}^k \to (-\infty, +\infty]$ be two proper, convex and lower semicontinuous functionals. If there exists an element $x_0 \in \mathrm{Dom}(f) \cap \mathrm{Dom}(g)$ such that either $f$ or $g$ is continuous at $x_0$, then the following equality holds. Now we give our main result of this section.
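Under the hypotheses of this lemma, the equality in question is the standard Fenchel duality identity, which in the conjugate notation above reads

```latex
\inf_{x \in \mathbb{R}^{k}} \left\{ f(x) - g(x) \right\}
= \sup_{y \in \mathbb{R}^{k}} \left\{ g_{*}(y) - f^{*}(y) \right\},
```

with $f^{*}$ the convex conjugate of $f$ and $g_{*}$ the concave conjugate of $g$; the continuity assumption at $x_0$ is the constraint qualification that makes the supremum attained.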

5. Conclusions. In this paper, we have mainly focused on equivalent deterministic forms for DRCCPs under several general convex moment information sets. We demonstrated some examples of support functions of matrix sets and of the equivalent convex programming, and we derived the closed form solution of the worst case VaR optimization problem. Compared with the results of [28], we directly obtain the equivalent convex programming form for DRCCP and establish a uniform framework for DRCCP under general second order moment information.
How to obtain more computationally tractable examples of support functions of matrix sets remains an open problem. The equivalent deterministic form of the distributionally robust nonlinear chance constrained problem under the information sets $\mathcal{P}_1$ and $\mathcal{P}_2$ also needs further study. As pointed out in [2], the larger the uncertainty set $\mathcal{P}$ is, the worse the optimal solution's performance under $\mathcal{P}$ will be for distributionally robust optimization problems. Clearly, DRCCPs become more conservative as the uncertainty set $\mathcal{P}$ grows. Therefore, how to reduce the conservatism of DRCCP deserves further consideration.
We note that, in some real applications, the sets $U_\mu$ and $U_\Sigma$ might be unknown and need to be estimated. Sun and Xu [26] presented a comprehensive convergence analysis of distributionally robust optimization problems in which the ambiguity sets are described by sequentially increasing distribution information. Thus, it is important and necessary to consider the perturbation analysis of DRCCPs, which requires further investigation.