Distribution-free solutions to the extended multi-period newsboy problem

This paper concerns the distribution-free, multi-period newsboy problem, in which the newsboy must decide the order quantity of newspapers for the subsequent period without knowing the distribution of the demand. The Weak Aggregating Algorithm (WAA), developed in the theory of learning and prediction with expert advice, makes decisions based only on historical information and provides a theoretical guarantee for the resulting decision-making method. Building on the WAA and stationary expert advice, this paper provides distribution-free methods for extended multi-period newsboy problems in which shortage cost and integral order quantities are considered. In particular, we provide an alternative proof of the theoretical result which guarantees that the cumulative gain achieved by our proposed method is as large as that of the best stationary expert advice. Numerical examples are provided to illustrate the effectiveness of the proposed methods.


1. Introduction. This paper continues the study of the classical newsboy (or newsvendor) problem, in which the decision maker (newsboy) has to choose an order quantity of a perishable product for the subsequent period without knowing the demand in that period. The tradeoff is between the risk of overstocking and the risk of understocking: the former forces disposal at a value below the unit purchasing cost, and the latter forfeits the opportunity to earn a profit. The newsboy problem is a classical inventory problem that aids decision-making in the fashion and sporting goods industries, at both the manufacturing and the retail level. The demand is revealed only after the order quantity has been decided. In the probabilistic demand framework, the goal is to choose the order quantity that maximizes the expected profit for a given product. In reality, however, the distributional information needed to characterize the demand is often limited, inaccurate, or unavailable; the only information available is the past demands.
The newsboy problem is important from both theoretical and practical points of view. Several approaches have been proposed for the case in which statistical information about the demand is unavailable. Bayesian updating was introduced to solve inventory problems with an unknown demand distribution. This method is used under the assumption that the demand distribution belongs to a parametric family of distributions. Originally, Scarf considered the case in which the demand distribution belongs to the exponential class [21]. Later, Murray and Silver incorporated the Bayesian approach into stochastic inventory models [19].
Under the assumption that only the mean and standard deviation of the demand distribution are known, Scarf proposed a min-max solution, which maximizes the total profit obtained from the worst possible demand distribution, and presented a closed form expression for the order quantity [20]. Gallego and Moon presented a simpler proof of Scarf's result and extended the single period model by considering fixed costs and resource constraints [7]. Moon and Choi generalized the results Gallego and Moon obtained [7] to the case where customers may balk if the available inventory is below a certain level [16]. Moon and Choi examined the newsvendor problem with an integrated view, beginning with the raw materials to the final product stage [17]. Vairaktarakis developed a minimax regret approach for the distribution-free multi-item newsboy problem under a budgetary constraint and two types of uncertainty [22]. At the same time, Moon and Silver focused on the distribution-free model by developing a heuristic method for the multi-item newsvendor problem under a budget constraint and fixed ordering costs [18] . Alfares and Elmorra extended the results obtained by Gallego and Moon [7] by incorporating a shortage penalty cost [2].
The classical newsboy problem is mostly formulated for a single period. It is practically and theoretically important to extend the single-period newsboy problem to the multi-period newsboy problem. Assuming that the probability density function of demand is given for each period, Keisuke provided an ordering plan that maximizes the expected profit [11]. The Bayesian method has also been used to analyze multi-period perishable-inventory systems [15]. Several non-parametric approaches have been proposed for multi-period inventory management. For the related setting of repeated capacity booking problems, stochastic gradient algorithms have mainly been studied [4,9,12,13,8]. Besbes and Muharremoglu emphasized the importance of censoring information through an asymptotic analysis [3], which is one of the most recent contributions in this area.
Approaches developed in computer science are used to study fully probability-free problems [1,6]. The theory of prediction with expert advice is one of these approaches [5]. It aims to develop algorithms that compete with a countable number of experts. The Weak Aggregating Algorithm (WAA) is an online learning approach that combines the predictions of different experts [10]. Based on the assumption that newsboy's environment is stationary, Levina et al. applied the WAA to solve the multi-period, distribution-free newsboy problem [14]. Assuming that the experts' recommendations are fixed (i.e., each expert keeps the order quantity at the same value throughout all periods), they provided newsboy with an explicit ordering rule. This ordering rule makes no statistical assumptions about future demand and makes decisions based only on historical demand; more importantly, the theoretical result guarantees that the ordering rule achieves a cumulative gain as large as that of the best expert, which can be seen as a benchmark. A stationary environment means that the demands fluctuate smoothly and usually take values in a fixed range. This phenomenon is practically related to seasonal products; for example, the demand for a seasonal vegetable does not fluctuate wildly within the season. Therefore, applying the WAA to stationary expert advice has a practical justification. Building on their work, we later considered experts who are allowed to switch between different order quantities, and thus proposed newsboy strategies for both real-valued and integer-valued order quantities for the non-stationary newsboy problem [23].
In Levina et al.'s newsboy problem, no cost is incurred if the order quantity is less than the demand. Furthermore, they only considered real-valued order quantities and demands in their model. In reality, however, failure to meet demand is always associated with a penalty (shortage cost). The discrete multi-period newsboy problem, in which the order quantities and demands are integers, has already been studied by Zhang et al. [23]. In this paper, we extend Levina et al.'s study by introducing the practical factors of shortage cost and integer-valued order quantities and demands. We aim to present ordering rules and their theoretical guarantees for the extended multi-period newsboy problems in which shortage cost and integral order quantities are considered. In our extensions, we first provide an alternative proof which guarantees the same competitive performance for the ordering rule presented by Levina et al. [14]. This proof is short and easy to understand, and becomes the foundation of our further discussion.
The rest of this paper is organized as follows. In Section 2, an alternative proof of Levina et al.'s main result is presented. In Section 3, the ordering rules for the multi-period newsboy problem with shortage cost and their competitive performance are provided, and the conclusions for the multi-period newsboy problem in which a loss function is used to evaluate the ordering rule's performance are briefly stated. In Section 4, the ordering rule is extended to the discrete multi-period newsboy problem. In Section 5, numerical examples are presented to illustrate the feasibility and advantage of our proposed ordering rule. Finally, Section 6 gives a brief conclusion.
The Weak Aggregating Algorithm (WAA) [10] is a method of online prediction with expert advice; it makes decisions by merging the advice of a pool of experts and aims to develop algorithms that compete with a benchmark set of "experts", who can be free agents or decision strategies.
Given a set of experts who provide decisions in every period, the decision maker employs the WAA to combine these decisions in a certain way. In this paper we refer to the decision maker as newsboy without necessarily implying that the decision problem is the newsboy problem. The WAA starts with an initial distribution of weights on the experts and recomputes the weights assigned to the experts in every period. The weights are recomputed after the actual demand becomes known, to reflect the change in newsboy's level of trust in each of the experts' decisions.
We denote the set of experts by $\Theta$ and assume that $\Theta$ is a measurable space. Both the experts and newsboy choose their decisions from a decision set $T$. The demand set is denoted by $\Omega$. There is a gain function $\pi$ defined on $\Omega \times T$. In period $n$, given newsboy's decision $\gamma_n \in T$ and the real demand $\omega_n \in \Omega$, newsboy's gain is $g_n := \pi(\omega_n, \gamma_n)$; given expert $\theta$'s decision $\gamma^{\theta}_n$, his gain is $g^{\theta}_n := \pi(\omega_n, \gamma^{\theta}_n)$. Therefore, the cumulative gains of newsboy and expert $\theta$ during the first $n$ periods are $G_n := \sum_{i=1}^{n} g_i$ and $G^{\theta}_n := \sum_{i=1}^{n} g^{\theta}_i$. The main parameter of the WAA is a probability measure $q(d\theta)$ on $\Theta$, which is interpreted as the prior weights assigned to the experts. The weights are recomputed in every period, and in period $n$ are represented by a probability measure $p_n(d\theta)$. The pseudo-code of the WAA for the multi-period newsboy problem is as follows:
• The initial cumulative gains for newsboy and experts are 0: $G_0 := 0$ and $G^{\theta}_0 := 0$, $\theta \in \Theta$.
• In each period $n = 1, 2, \cdots$:
  - The experts' weights are recomputed: $p_n(d\theta) := \beta_n^{G^{\theta}_{n-1}} q(d\theta) \big/ \int_{\Theta} \beta_n^{G^{\theta}_{n-1}} q(d\theta)$, where $\beta_n := e^{1/\sqrt{n}}$, $\theta \in \Theta$;
  - Experts give their decisions $\gamma^{\theta}_n$, $\theta \in \Theta$;
  - Newsboy announces his decision $\gamma_n := \int_{\Theta} \gamma^{\theta}_n \, p_n(d\theta)$;  (1)
  - The demand $\omega_n$ is obtained;
  - The cumulative gains are updated: $G_n := G_{n-1} + \pi(\omega_n, \gamma_n)$ and $G^{\theta}_n := G^{\theta}_{n-1} + \pi(\omega_n, \gamma^{\theta}_n)$, $\theta \in \Theta$.
Levina et al. proved a theoretical guarantee on the performance of the WAA for the case of a bounded gain function [14]. The result is presented in Lemma 2.1.
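The pseudo-code above can be sketched in Python for the special case of a finite expert pool with a uniform prior, so that the integrals become sums. This is only an illustration under those assumptions; the function names `waa_decision` and `run_waa` are hypothetical and do not come from [10] or [14].

```python
import math

def waa_decision(cum_gains, decisions, prior, n):
    """One WAA step in period n for a finite pool of experts.

    cum_gains[i] is expert i's cumulative gain after period n-1,
    decisions[i] is the decision expert i proposes for period n,
    prior[i] is the prior weight q of expert i.
    """
    # beta_n ** G = exp(G / sqrt(n)); subtracting the largest exponent
    # leaves the normalized weights unchanged and avoids overflow.
    exps = [g / math.sqrt(n) for g in cum_gains]
    top = max(exps)
    raw = [q * math.exp(e - top) for q, e in zip(prior, exps)]
    total = sum(raw)
    weights = [r / total for r in raw]
    # Newsboy's decision is the weighted average of the experts' decisions.
    decision = sum(w * g for w, g in zip(weights, decisions))
    return decision, weights

def run_waa(demands, expert_orders, gain):
    """Run the WAA against stationary experts (assumed uniform prior)."""
    k = len(expert_orders)
    prior = [1.0 / k] * k
    expert_gains = [0.0] * k   # G^theta_0 = 0
    total_gain = 0.0           # G_0 = 0
    for n, demand in enumerate(demands, start=1):
        order, _ = waa_decision(expert_gains, expert_orders, prior, n)
        total_gain += gain(demand, order)
        expert_gains = [g + gain(demand, y)
                        for g, y in zip(expert_gains, expert_orders)]
    return total_gain, expert_gains
```

For example, `run_waa(demands, list(range(11)), lambda d, y: 2.0 * min(y, d) - 1.0 * y)` compares the aggregated decision with eleven fixed order quantities under an illustrative price of 2 and cost of 1.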
Based on Lemma 2.1, they further provided a specialized bound for the case of discrete prior distributions, which is presented in Lemma 2.2.
Lemma 2.2. Let $\pi \in [-L, 0]$ and let the prior $q$ be discrete. The WAA guarantees that, for all $N$ and $\theta \in \Theta$,
Correspondingly, we refer to the decision as the order quantity when applying the WAA to the newsboy problem. For the multi-period newsboy problem, let $c$ ($c > 0$) be the unit cost of the newspaper, $p$ ($p > c$) be the newsboy's unit selling price, and $B > 0$ be an upper bound on the number of newspapers bought by newsboy. Let $G^{(y)}_N$ denote the cumulative gain of the fixed ordering strategy (stationary expert advice) that keeps the order quantity at the same value $y \in [0, B]$ throughout all periods. Levina et al. proved that newsboy has a strategy that guarantees for all $N = 1, 2, \cdots$ [14].
2. An alternative proof. We first introduce salvage into Levina et al.'s multi-period newsboy problem. Let $s$ ($s < c$) be the unit salvage value. Given order quantity $y$ and demand $d$, newsboy's gain in a period is $g = p\min(y, d) + s\max(y - d, 0) - cy$ (5), which can be rewritten as $g = (p - s)\min(y, d) - (c - s)y$ (6). By redefining $p := p - s$, $c := c - s$, this case immediately reduces to the case studied by Levina et al. [14]. The primary issue in applying the WAA to this newsboy problem is the computation of the integral (1). Let $d_{(1)}, d_{(2)}, \cdots, d_{(n-1)}$ be the order statistics of the demands in the first $n - 1$ periods and set $d_{(0)} := 0$, $d_{(n)} := B$. The expert corresponding to $\theta = y \in [0, B]$ always suggests the same order quantity $y$ throughout all periods. According to (1) and following the computation technique presented in reference [14], the explicit formula y n is Letting a n , based on the order statistics of demands, we obtain and the third equation holds based on the fact that Similarly, b n is expressed as Thus, the explicit ordering rule is obtained. When $s = 0$, the ordering rule (7) is the one obtained by Levina et al. [14]. The theoretical guarantee for ordering rule (7) is obtained by an alternative proof that is easier to follow than the one presented by Levina et al. [14].
Proof. Consider $\Theta = [0, B]$ for a given finite $B$. According to gain function (5), the largest gain in one period is obtained when the order quantity equals the demand and the demand is $B$, which produces the gain $B(p - c)$; the worst situation is that the order quantity is $B$ while the demand is 0, which yields the lowest gain $-B(c - s)$. Figure 1 describes the relationship between gain $g$ and order quantity $y$ in one period. Thus, the gain function satisfies $-B(c - s) \le g \le B(p - c)$, a range of width $B(p - s)$. Intuitively, bounding the integral in this way is reasonable since the best-performing values of $y$ will asymptotically have much higher weights than the poorly performing ones. As a result, most of the weight will concentrate in the neighborhood of the best values. Since, on this neighborhood, G (θ) substituting this into (2) gives We obtain the final result after substituting $(p - s)$ for $m$ and $B(p - s)$ for $L$, using $m \le p - s$ and $L = B(p - s)$.
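The two facts used in the proof, namely that the gain with salvage coincides with the reduced form obtained from $p := p - s$, $c := c - s$, and that the one-period gain lies between $-B(c - s)$ and $B(p - c)$, can be checked numerically on an integer grid. The sketch below assumes the salvage gain $p\min(y, d) + s\max(y - d, 0) - cy$; the function names and the parameter values in the usage note are illustrative only.

```python
def gain_with_salvage(y, d, p, c, s):
    """One-period gain: revenue plus salvage of leftovers minus purchase cost."""
    return p * min(y, d) + s * max(y - d, 0) - c * y

def reduced_gain(y, d, p, c, s):
    """Same gain after the substitution p := p - s, c := c - s."""
    return (p - s) * min(y, d) - (c - s) * y

def check_gain_bounds(p, c, s, B):
    """Verify the identity and the bounds for all integer y, d in 0..B."""
    lowest, highest = -B * (c - s), B * (p - c)
    for y in range(B + 1):
        for d in range(B + 1):
            g = gain_with_salvage(y, d, p, c, s)
            # The two forms of the gain agree.
            assert abs(g - reduced_gain(y, d, p, c, s)) < 1e-9
            # Worst case: y = B, d = 0; best case: y = d = B.
            assert lowest - 1e-9 <= g <= highest + 1e-9
    return lowest, highest
```

With the illustrative values `check_gain_bounds(2.0, 1.0, 0.5, 10)`, the returned bounds differ by $B(p - s) = 15$, the value substituted for $L$ in the proof.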

3. Extension to the case with shortage cost and loss function.
Noticing that the computation of the explicit ordering rule and the theoretical guarantee are independent of the term $ld$ (the unit shortage cost $l$ times the demand $d$), we can redefine the gain function by $g := g + ld$ and obtain the new gain function $g = (p + l - s)\min(y, d) - (c - s)y$, which is exactly the gain function (6) with $p$ replaced by $p + l$. Thus, the explicit ordering rule is where , k = 0, · · · , n − 1.
The theoretical guarantee is obtained from Theorem 2.3 by replacing p by p + l.
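The reduction used above can be written out step by step. The derivation below assumes that the one-period gain with salvage value $s$ and unit shortage cost $l$ has the form shown in its first line, which is not displayed in the text, and uses $(y-d)^{+} = y - \min(y,d)$ and $(d-y)^{+} = d - \min(y,d)$.

```latex
\begin{align*}
g &= p\min(y,d) + s\,(y-d)^{+} - l\,(d-y)^{+} - cy \\
  &= p\min(y,d) + s\,\bigl(y-\min(y,d)\bigr) - l\,\bigl(d-\min(y,d)\bigr) - cy \\
  &= (p+l-s)\min(y,d) - (c-s)\,y - l\,d ,
\end{align*}
so that
\[
g + l\,d = (p+l-s)\min(y,d) - (c-s)\,y .
\]
```

The term $l\,d$ does not depend on the order quantity, so in each period it changes every expert's gain and newsboy's gain by the same amount. The normalized WAA weights and the difference between newsboy's cumulative gain and any expert's cumulative gain are therefore unaffected.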

3.2. Extension to the case with loss function. Originally, loss functions were used to evaluate the competitive performance of online algorithms. By defining a suitable loss function, the WAA can be applied to the multi-period newsboy problem. The difference is that, with a loss function, the weight assigned to an expert must increase as his loss gets smaller. Newsboy uses the WAA to combine the experts' decisions and update the weights so that newsboy's cumulative loss is competitive with the best expert's cumulative loss.
Keeping the notation introduced above, let $\lambda = l(d, y)$ be the loss function. We consider the same experts $\theta = y \in [0, B]$, who always recommend the order quantity $y$ throughout all periods. Given decision $y_n$ and demand $d_n$ in period $n$, newsboy's cumulative loss is $L_n := L_{n-1} + l(d_n, y_n)$ and the experts' cumulative losses are $L^{y}_n := L^{y}_{n-1} + l(d_n, y)$. The loss function is defined as For the cases without and with shortage cost, the ordering rules based on the loss function are identical to (7) and (12), respectively. On the other hand, the loss function (14) implies Subtracting the left-hand and right-hand sides of inequalities (8) and (13) for all N = 1, 2, · · · .
Remark 2. For Levina et al.'s newsboy problem with salvage and shortage cost, newsboy can apply the WAA with the loss function to obtain a strategy that recommends order quantities according to (12) and guarantees (16) for all N = 1, 2, · · · .
4. Ordering rules for the discrete multi-period newsboy problem. The ordering rules and their competitive performance presented above are based on the assumption that the newspapers are "infinitely divisible"; in other words, the demands and order quantities can be any real number in the range [0, B]. This section considers the more practical (discrete) multi-period newsboy problem, in which the order quantities and demands are integers in {0, 1, · · · , B}. In the absence of strong perceptions or initial information about the market, we consider a uniform prior on {0, 1, · · · , B}. Similarly, the order statistics d (0) , d (1) , · · · , d (n−1) are integers and belong to the set {0, 1, · · · , B}. Suppose d (k+1) = d (k) + m k , k = 0, 1, · · · , n − 1, where the m k are integers, d (0) = 0 and d (n) = B. We first define For the discrete multi-period newsboy problem without salvage and shortage cost, the gain function is g = p min(y, d) − cy. Similarly, the ordering rule for this extended problem and its theoretical guarantee can be obtained by applying the WAA to stationary expert advice. The ordering rule is presented in Theorem 4.1. The theoretical guarantee for this ordering rule is presented in Theorem 4.2. We provide simple proofs of both.
Theorem 4.1. For the discrete multi-period newsboy problem without salvage and shortage cost, the ordering rule obtained by applying the WAA is: where Proof. According to decision-making process of WAA, newsboy's order quantity is

YONG ZHANG, XINGYU YANG AND BAIXUN LI
In order to give an explicit online ordering rule, we split the numerator e n of (18) into several parts, i.e., Similarly, we obtain
Theorem 4.2. For the discrete multi-period newsboy problem, newsboy has a randomized online ordering rule that recommends order quantities according to (17) and guarantees for all N = 1, 2, · · · .
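For the discrete problem with a uniform prior on {0, 1, · · · , B}, the WAA weights can be computed directly by summing over the B + 1 stationary experts. The sketch below is not the closed-form rule (17); it uses brute-force sums and the gain g = p min(y, d) − cy. Turning the weighted average into an integer order by randomized rounding is an assumption made here for illustration, and the function names are hypothetical.

```python
import math
import random

def stationary_weights(past_demands, p, c, B):
    """WAA weights of the fixed order quantities 0..B under a uniform prior."""
    n = len(past_demands) + 1  # index of the period being decided
    # Cumulative gain of each stationary expert on the observed demands.
    gains = [sum(p * min(y, d) - c * y for d in past_demands)
             for y in range(B + 1)]
    # beta_n ** G = exp(G / sqrt(n)), shifted by the maximum for stability.
    scaled = [g / math.sqrt(n) for g in gains]
    top = max(scaled)
    raw = [math.exp(x - top) for x in scaled]
    total = sum(raw)
    return [r / total for r in raw]

def integer_order(past_demands, p, c, B, rng=random):
    """Weighted-average order, rounded at random to a neighboring integer.

    Assumption: the rounding keeps the expected order equal to the
    weighted average; the source does not display its randomization.
    """
    weights = stationary_weights(past_demands, p, c, B)
    mean_order = sum(y * w for y, w in enumerate(weights))
    low = math.floor(mean_order)
    frac = mean_order - low
    order = low + (1 if rng.random() < frac else 0)
    return order, mean_order
```

Calling `integer_order([], p, c, B)` in the first period returns an order next to B/2, since all experts then carry equal weight; as demands accumulate, the weights concentrate on the order quantities with the largest cumulative gain.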
For the discrete multi-period newsboy problem with salvage (with salvage and shortage cost), the ordering rule and its competitive performance can be obtained by replacing p by p − s (p by p + l − s) and c by c − s in (17) and (19).
5. Numerical analysis. In this section we provide numerical examples to illustrate the competitive performance of our proposed ordering rules. For simplicity, we only consider the discrete multi-period newsboy problem, in which the order quantities and demands are integers. For convenience, we denote the ordering rule (17) proposed in this paper by DAS. First, in subsection 5.1 we illustrate DAS's competitive performance by comparing it with its benchmark, the best expert. Second, in subsection 5.2 we show DAS's advantage by comparing it with the ordering rule ANS that we proposed in [23], which also treats the discrete newsboy problem. In order to show the wide applicability of DAS, we use different parameter settings taken from different references.
The cumulative gains of DAS and the experts in the cases without and with shortage cost are presented in Table 1. We conclude that the cumulative gain DAS achieves is almost as large as that achieved by the best expert. When N = 100 and N = 200, for the case with shortage cost the best expert is the fifth expert, whose prediction is 5; however, when N = 300 and N = 400 the best expert is the fourth expert, whose prediction is 4. For the case without shortage cost the best expert is always the fourth expert, whose prediction is 4. Table 1 also shows that the cumulative gains DAS achieves are larger in the case without shortage cost. This accords with our intuition. The results presented in Table 1 and Figures 2 and 3 are computed from a single generated instance. To show the competitive performance of DAS more reliably, we generate demand sequences 40 times. For the multi-period newsboy problem without and with shortage cost, we compute the average (AVE) and standard deviation (STD) of the ratios of the cumulative gains DAS achieves to the cumulative gains the best expert achieves, for trial numbers (TN) equal to 10, 20, 30 and 40. The results are presented in Table 2. The AVEs are very high (above 98%) in both cases (slightly higher for the case with shortage cost), which illustrates the competitive performance of DAS. The STDs show that DAS's performance fluctuates more noticeably in the case without shortage cost.
The ordering rule ANS proposed in [23] also treats the discrete multi-period newsboy problem. For comparison, we set p = 2, c = 1, N = 30 and B = 21, following the numerical examples presented in [23]. We generate demand sequences 10 times. For each trial, we compute DAS's cumulative gain and ANS's cumulative gain. The results presented in Table 3 show that DAS achieves a larger cumulative gain in each trial.
6. Conclusions. This paper continues the study of the multi-period newsboy problem. In this problem, the demands are revealed incrementally and the order quantities are decided before the demands are observed.
In particular, the analysis of the classical multi-period newsboy problem is extended by considering salvage, shortage cost and integral order quantities. The Weak Aggregating Algorithm is further applied to provide ordering rules. Theoretically, the cumulative gains of the proposed ordering rules are as large as those of the best experts, who suggest fixed order quantities. Numerical examples are provided to illustrate the competitive performance of the proposed ordering rules.