OPTIMAL INFORMATION POLICY IN DISCRETE-TIME QUEUES WITH STRATEGIC CUSTOMERS

Abstract. This paper studies optimal information revelation policies in a discrete-time Geo/Geo/1 queue. Revealing the queue length to arriving customers plays an important role in their decision whether to join the system or balk. We consider policies where a service provider discloses the information to some customers and conceals it from others, depending on the number of waiting customers. This partial information disclosure policy helps the service provider minimize the idle period of the system and maximize the revenue.


1. Introduction. In recent years, there has been an emerging trend to study queueing systems in connection with economic analysis and strategic customer behavior. Economic analysis of queueing systems was initiated by Naor [16], who considered an observable M/M/1 queue with a linear cost-reward structure. That study presumed that an arriving customer observes the number of customers in the queue before deciding whether to join or balk. Edelson and Hildebrand [5] complemented it by considering the corresponding unobservable M/M/1 queue, where arriving customers make their decision without observing the number of customers in the queue.
In such studies, a certain cost-reward structure is levied on the queueing system. This determines the customers' dislike for waiting and their desire for service. A customer acts strategically and enters the queue only when the expected waiting cost is less than the reward received upon being served. This holds true in both the observable and unobservable cases. The monograph of Hassin and Haviv [9] reported various results in the economic analysis of several queueing systems.
At call centers, the information revelation policy is a key factor in customers' behavior [1]. When the environment is congestion-prone, such as the queues in computer systems, data networks, voting locations and amusement parks, a decision made by one customer influences the utility of other customers [10]. Users want to maximize their profit, but are unwilling to regulate their actions. They select from the alternative actions, taking into consideration various assumptions regarding system parameters such as cost and expected waiting time. Most of these systems assume that providing the queue-length information helps customers optimize their expected payoffs. Shone et al. [17] compared the results of unobservable and observable M/M/1 queues and concluded that each model, depending on the settings, can perform better than the other. Literature focusing on various equilibrium strategies in continuous-time Markovian queueing systems is readily available; see [3], [19] (with setup times), [24] (working vacation), [21] (retrials), [2] (with catastrophes), [8] (partial information on the service time distribution).
Discrete-time queueing systems are more appropriate for performance evaluation and modelling in telecommunication networks and digital communications, owing to the discrete nature of their functioning. In discrete-time systems, the time axis is divided into fixed-length intervals called slots, and arrivals and their onward transmissions may take place at the same time around slot boundaries. The analysis of a discrete-time queueing model differs from that of the corresponding continuous-time model. The advantage of examining a discrete-time queue is that one may obtain the equivalent continuous-time result from it as a limiting case; the converse, however, is not true. A complete study of discrete-time queueing models has been presented in [11] and [22]. There are comparatively fewer studies on equilibrium strategies in discrete-time queues; some popular works are [14] (with multiple vacations), [20] (single working vacation), [23] (with server breakdowns and repairs), [15] (pricing analysis), [6] (observable queue with delayed multiple vacations), [13] (with multiple and single vacation policies). Large and Norman [12] considered both informed and uninformed customers with endogenously determined arrivals and batch services. The uninformed customers were non-strategic and followed a determined threshold policy. Optimal information disclosure policies in an M/M/1 queue have been studied by Simhon et al. [18]. They discussed the model under the following three information policies based on the queue length: (i) always inform customers, (ii) never inform customers, (iii) inform customers only if the queue length is below a specified threshold. They compared the three policies for revenue optimization and concluded that the third policy is never optimal. The present work extends the above model to discrete-time queues with geometric arrivals and services.
The purpose of this work is to ascertain the conditions in discrete-time queues under which the service provider may optimize its revenue by employing both observable and unobservable cases. In particular, the stress is on the analysis of policies in which the service provider releases the queue length information to some customers and suppresses it from others. The service provider earns a fixed revenue from each customer that enters the queue. In order to maximize the revenue, the service provider has to increase the effective arrival rate, that is, to minimize the idle period of the system. We compare the results obtained by adhering to the following distinct policies: (i) informed policy (arriving customers are always informed about the queue length); (ii) uninformed policy (customers are never informed about the queue length); (iii) selective policy (customers are informed according to a threshold policy, when the queue length is below the given threshold). The selective policy seems to be intuitive. Our analysis shows, however, that in all circumstances either sharing the information with all customers or hiding it from all of them results in a higher expected revenue. We find an optimal information policy, that is, the selective threshold information disclosure policy that maximizes the revenue. The disclosure of the queue length may stop customers from joining very long queues. It may be socially optimal to conceal information; however, it is not optimal to promote suppression of information when the service provider chooses to reveal the queue length. Glazer and Hassin [7] reported that by hiding the exact items of a good sold on a payment, the firm's income and the social benefit might increase simultaneously.
The rest of this paper is organized as follows. In Section 2, we discuss the system model. Section 3 analyzes the model and finds the equilibrium strategy under the threshold information policy. Section 4 compares the results of the three policies. Section 5 presents the impact of system parameters on the equilibrium behavior through numerical results. Section 6 concludes the paper.
2. Model description. We consider a Geo/Geo/1 queueing system under the late arrival system with delayed access (LAS-DA) policy. Let the time axis be divided into slots of equal length, with slot boundaries marked $0, 1, \ldots, t, \ldots$. A potential arrival occurs in the interval $(t-, t)$ and a potential departure takes place in the interval $(t, t+)$. The various time epochs at which events take place are described in Fig. 1. The state of the system changes only around the slot boundaries, i.e., both arrivals and departures are possible only at slot boundaries. The inter-arrival times are independent and geometrically distributed with probability mass function (p.m.f.) $a_n = P(A = n) = \bar{\lambda}^{n-1}\lambda$, $n \ge 1$, $0 < \lambda < 1$, where the random variable $A$ denotes a generic inter-arrival time and, for any real number $x \in [0, 1]$, we write $\bar{x} = 1 - x$. The service times are also independent and geometrically distributed with p.m.f. $P(S = n) = \bar{\mu}^{n-1}\mu$, $n \ge 1$, $0 < \mu < 1$, where the random variable $S$ denotes a generic service time. The service discipline is first-come first-served (FCFS). We define $\rho = \lambda/\mu$ as the maximum load of the queue.
The problem associated with queueing games is that an arriving customer decides whether to enter the system or balk depending on his expected sojourn time and the reward received for service completion. Individual customers' joining decisions may affect the waiting delay and thus the benefit of all customers. The decision of an arriving customer depends considerably on the information he has acquired about the queue length at the instant of his arrival. We assume that a fixed reward $R$ is gained by each customer after service completion and that, at the same time, customers are charged a waiting cost of $C$ units per time unit spent in the system. Using a linear cost-reward function, we study a customer's expected net benefit, defined as $R - CW$, where $W$ represents the expected sojourn time of an arriving customer.
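The geometric laws above can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names and the truncation point are hypothetical, and the values of $\lambda$ and $\mu$ are the underloaded parameters used later in the numerical section.

```python
def geometric_pmf(n: int, p: float) -> float:
    """P(X = n) = (1 - p)**(n - 1) * p for n >= 1, and 0 otherwise."""
    if n < 1:
        return 0.0
    return (1.0 - p) ** (n - 1) * p


def truncated_mean(p: float, n_max: int = 500) -> float:
    """Mean of the geometric law summed up to n_max (the tail is negligible)."""
    return sum(n * geometric_pmf(n, p) for n in range(1, n_max + 1))


lam, mu = 0.5, 0.6                       # per-slot arrival and service probabilities
mean_interarrival = truncated_mean(lam)  # close to 1 / lam slots
mean_service = truncated_mean(mu)        # close to 1 / mu slots
rho = lam / mu                           # maximum load of the queue
```

The mean service time $1/\mu$ computed here is the quantity compared with the reward $R$ when a customer faces an empty system.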
In our analysis, we set $C = 1$ throughout this paper, without loss of generality. When an arriving customer encounters an empty system, he joins the system if his net benefit is positive, i.e., $R > 1/\mu$. We assume that this criterion holds, so that customers are attracted to join an empty system. Customers seek to maximize their expected net benefit by making decisions only at their arrival instants. Their decisions are irrevocable in the sense that retrials of balking customers and reneging of entering customers are not allowed. Further, the customers are assumed to be identical and enter the queue if and only if the reward exceeds the expected sojourn time.
We study a state-dependent information revelation policy in which the threshold parameter $\xi$ is a mapping from the queue length $k$ into the interval $[0, 1]$. Thus, $\xi(k)$, $k \ge 0$, is the probability that the service provider imparts the information to an arriving customer when there are $k$ waiting customers in the queue. We consider only threshold policies that are independent of time, that is, stationary state revelation policies. We distinguish two special cases, observable and unobservable: for the observable case, $\xi(k) = 1$ for all $k \ge 0$, and for the unobservable case, $\xi(k) = 0$ for all $k \ge 0$. The former information revelation policy is denoted by $\xi^+$ and the latter by $\xi^-$. The main objective of the service provider is to maximize the revenue generated by completion of services. Let the benefit function $\Delta_\xi$ be the long-run rate of service completions when the information disclosure policy $\xi$ is imposed; then $\Delta_\xi = \mu(1 - P_0^{\xi})$, where $P_0^{\xi}$ is the steady-state idle probability. Since the mean service rate is a constant, the problem of profit maximization over the policies $\xi \in \Omega$, $\max_{\xi \in \Omega} \Delta_\xi$, is equivalent to $\max_{\xi \in \Omega} (1 - P_0^{\xi})$. Thus, the objective is to minimize the steady-state idle probability of the system, that is, $\min_{\xi \in \Omega} P_0^{\xi}$, where $\Omega$ is the set of information revelation policies mapping from $\mathbb{N}$ to $[0, 1]$.
An informed customer enters the system if the queue length is strictly less than the threshold $L = \lfloor R\mu \rfloor$, see [16]. An uninformed customer makes a decision based on his expected sojourn time, denoted by $W_\xi$. This expected sojourn time may be computed from the given system parameters and the service provider's information policy. Customers are aware of the policy applied by the service provider through trials or via exogenous sources. However, once they arrive at the system, customers are forward-looking and maximize their expected utility with respect to their join/balk decisions. When the queue length is small, the arrival rate of customers can be increased by revealing the queue length information. Otherwise, the service provider may not reveal the queue length information.
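The joining rule of an informed customer can be illustrated with a short sketch. It assumes that a customer who finds $n$ customers in the system has expected sojourn time $(n+1)/\mu$ and joins when this does not exceed $R$ (ties resolved in favor of joining, which is an assumption). Exact rational arithmetic is used so that the floor in $L = \lfloor R\mu \rfloor$ is not affected by rounding; the function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def naor_threshold(R: Fraction, mu: Fraction) -> int:
    """Threshold L = floor(R * mu), computed exactly."""
    return floor(R * mu)


def informed_customer_joins(queue_length: int, L: int) -> bool:
    """An informed customer joins iff the observed queue length is below L."""
    return queue_length < L


R, mu = Fraction(50), Fraction(3, 5)   # reward and service probability (0.6)
L = naor_threshold(R, mu)              # buffer size of the informed queue
```

With these parameters the informed queue behaves as a finite-buffer system: a customer who sees 29 customers still joins, whereas one who sees 30 balks.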
3. Equilibrium analysis of threshold strategies. In this section, we consider threshold strategies in which arriving customers are informed of the queue length if it is less than or equal to some threshold, say $D$, and otherwise receive no information, that is, $\xi_D(k) = 1$ for $k \le D$ and $\xi_D(k) = 0$ for $k > D$.
Let $P_n(t-)$ denote the probability that there are $n \ge 0$ customers in the system at time $t-$. The evolution of the system length process forms a discrete-time Markov chain (DTMC) with state space $\{0, 1, 2, \ldots\}$. Fig. 2 illustrates the state transition diagram for the selective threshold policy $\xi_D$. When the queue length exceeds the threshold value $D$, customers do not have information about the queue length. Let us assume that an uninformed customer enters the queue with probability $f$, where $f \in [0, 1]$, so that the effective arrival rate is $\lambda f$. When the queue length is at most $D$, customers do have information about the queue length. When $D \ge L - 1$, uninformed customers never join the queue, as their expected sojourn time, conditioned on the queue length being at least $L$, is inevitably higher than the reward $R$. For queue lengths greater than $D$, the expected sojourn time $W_\xi(D, f)$ depends on the threshold $D$ and the joining probability $f$. Therefore, a threshold policy with a threshold $D \ge L - 1$ is equivalent to the observable model, and the unique equilibrium is $f = 0$. Hereafter, we only consider $R\mu \ge 2$, that is, we take threshold policies with $D \in \{0, 1, \ldots, L - 2\}$.
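The one-slot dynamics of this chain can be sketched as follows. The sketch is an illustration under stated assumptions rather than a transcription of the balance equations: informed arrivals (states $n \le D \le L - 2$) always join, uninformed arrivals join with probability $f$, and, because of delayed access, a customer arriving at an empty system cannot depart in its arrival slot. The function names are hypothetical.

```python
def join_probability(n: int, D: int, f: float) -> float:
    """Probability that an arrival finding n customers joins under xi_D."""
    return 1.0 if n <= D else f


def transition_probs(n: int, lam: float, mu: float, D: int, f: float):
    """Return (down, stay, up) one-slot transition probabilities from state n."""
    a = lam * join_probability(n, D, f)  # effective arrival probability
    if n == 0:
        # Delayed access: no departure is possible from an empty system.
        return 0.0, 1.0 - a, a
    down = (1.0 - a) * mu                # no arrival joins, one departure
    up = a * (1.0 - mu)                  # an arrival joins, no departure
    return down, 1.0 - down - up, up


lam, mu, D, f = 0.5, 0.6, 5, 0.4
rows = [transition_probs(n, lam, mu, D, f) for n in range(12)]
```

Each row sums to one, and the only place where $f$ enters is the effective arrival probability in states above the threshold.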
Relating the states of the system at two consecutive epochs $t-$ and $(t+1)-$, prior to potential arrivals, we obtain the balance equations (1a)-(1f), in which, for the sake of simplicity, we write $t$ instead of $t-$. Let us define the steady-state probabilities $P_n = \lim_{t \to \infty} P_n(t)$, $n \ge 0$. In the steady state, equations (1a)-(1f) reduce to (2a)-(2f). Using (2a)-(2c) recursively, we obtain $P_n = \frac{1}{\bar{\mu}} r^n P_0$, $1 \le n \le D$,

VEENA GOSWAMI AND GOPINATH PANDA
where $r = \frac{\lambda\bar{\mu}}{\bar{\lambda}\mu}$. Similarly, (2d)-(2f) yield $P_n$ for $n \ge D + 1$. Combining all, the steady-state probabilities are given by (3). Using the normalization condition $\sum_{n=0}^{\infty} P_n = 1$, we get the idle probability $P_0$ (that is, the steady-state probability that the system is empty) as a function of the joining probability $f$ and the threshold $D$, denoted $P_0(D, f)$ and given by (4).
In the computation of $P_0(D, f)$, we assume $\lambda f < \mu$, which is required for the convergence of the associated geometric series.
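A minimal numerical sketch of the idle probability follows. It assumes the birth-death structure described in this section (arrival probability $\lambda$ in states $n \le D$, $\lambda f$ in states $n > D$, no departure from an empty system in the arrival slot). The closed form in `idle_probability` is our own normalization of the geometric solution $P_n = r^n P_0/\bar{\mu}$ and is not copied from (4); the second function recomputes the same quantity from the truncated recursion as a cross-check. All names are hypothetical.

```python
def idle_probability(lam: float, mu: float, D: int, f: float) -> float:
    """Closed-form P0(D, f) obtained by normalising the geometric solution."""
    if not lam * f < mu:
        raise ValueError("lam * f < mu is needed for convergence")
    r = lam * (1 - mu) / ((1 - lam) * mu)
    informed_mass = sum(r ** n for n in range(1, D + 1)) / (1 - mu)  # states 1..D
    tail_mass = r ** D * lam / (mu - lam * f)                        # states > D
    return 1.0 / (1.0 + informed_mass + tail_mass)


def idle_probability_numeric(lam, mu, D, f, n_max=4000):
    """Same quantity from the birth-death recursion, truncated at n_max."""
    weights = [1.0]                       # weights[n] = P_n / P_0
    for n in range(n_max):
        a = lam if n <= D else lam * f            # joining probability in state n
        up = a if n == 0 else a * (1 - mu)        # delayed access in state 0
        a_next = lam if n + 1 <= D else lam * f
        down = (1 - a_next) * mu                  # transition n + 1 -> n
        weights.append(weights[-1] * up / down)
    return 1.0 / sum(weights)


p0_closed = idle_probability(0.65, 0.6, 5, 0.5)
p0_numeric = idle_probability_numeric(0.65, 0.6, 5, 0.5)
```

Setting $D = 0$ and $f = 1$ with $\lambda < \mu$ recovers the idle probability $1 - \lambda/\mu$ of the ordinary Geo/Geo/1 queue, which is a useful sanity check on the sketch.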

Remark 1. The continuous-time results follow from those of the discrete-time queue as a limiting case. For the optimal information revelation policies in strategic continuous-time queues, we assume that the inter-arrival and service times are independent and exponentially distributed with rates $\alpha$ and $\beta$, respectively. The time axis is slotted into intervals of equal length $\Delta$, such that $\lambda = \alpha\Delta$ and $\mu = \beta\Delta$, where $\Delta > 0$ is sufficiently small. Using $\lambda = \alpha\Delta$, $\mu = \beta\Delta$ and taking the limit as $\Delta \to 0$ in (3)-(4), we get expressions that exactly match the corresponding results reported in [18].
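The limiting argument of Remark 1 can be illustrated numerically. The sketch below is an assumption-laden illustration: the discrete-time idle probability is our own normalization of the geometric solution of Section 3, and the continuous-time expression is the limit we derive from it with $\rho = \alpha/\beta$; neither formula is copied from (4) or from [18]. All names are hypothetical.

```python
def discrete_idle_probability(lam: float, mu: float, D: int, f: float) -> float:
    """Discrete-time P0(D, f) under the threshold policy (own derivation)."""
    r = lam * (1 - mu) / ((1 - lam) * mu)
    informed_mass = sum(r ** n for n in range(1, D + 1)) / (1 - mu)
    return 1.0 / (1.0 + informed_mass + r ** D * lam / (mu - lam * f))


def continuous_idle_probability(alpha: float, beta: float, D: int, f: float) -> float:
    """Limit of the expression above as the slot length tends to zero."""
    rho = alpha / beta
    head = sum(rho ** n for n in range(D + 1))
    return 1.0 / (head + rho ** (D + 1) / (1 - rho * f))


alpha, beta, D, f = 0.5, 0.6, 3, 0.7
limit_value = continuous_idle_probability(alpha, beta, D, f)
errors = [
    abs(discrete_idle_probability(alpha * d, beta * d, D, f) - limit_value)
    for d in (1e-2, 1e-4, 1e-6)   # slot lengths Delta
]
```

The gap between the discrete-time value and its limit shrinks roughly in proportion to the slot length $\Delta$.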
Let $P(i \mid i > D)$ be the conditional probability that there are $i$ customers in the system, given that the number in the system is greater than $D$; it is given by (5). The average sojourn time of an uninformed arriving customer is defined in (6). Substituting (3) and (5) in (6), we obtain the average sojourn time in closed form as (7). Our next objective is to find the equilibrium strategy $f^*$, which determines whether an uninformed customer enters the queue or not. In Lemma 3.1, we show the existence and uniqueness of the equilibrium.
Lemma 3.1. The strategy of Geo/Geo/1 queue with a threshold information policy has a unique equilibrium.
Proof. To enter the queue when the queue length is less than $L$, and to balk otherwise, is a dominant strategy for the informed customers, hence unique. Now, we study the strategic behavior of the uninformed customers. It is observed from (7) that the expected sojourn time $W_\xi(D, f)$ of an uninformed customer who enters the system increases with $f$, so that the expected payoff of joining, $R - W_\xi(D, f)$, decreases as $f$ increases, whereas the expected payoff of balking is independent of $f$. Therefore, the two payoff functions either meet once or do not meet. In the former case, a mixed equilibrium strategy $f^*$ exists at the intersection point of the two payoff functions. Since the expected payoff of joining decreases with $f$ and the expected payoff of balking is independent of $f$, the game has the "avoid-the-crowd" property. It is well known that this property ensures the uniqueness of the equilibrium, see [9, p. 7]. In the latter case (where the two payoff functions do not meet), we get a unique pure equilibrium strategy: all uninformed customers enter ($f^* = 1$) when $R > W_\xi(D, f)$ for every $f \in [0, 1]$; otherwise all of them balk ($f^* = 0$). So, in both cases, there exists a unique equilibrium strategy.
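The monotonicity used in this proof can be checked with a small sketch. It assumes that an uninformed customer who finds $i > D$ customers has expected sojourn time $(i+1)/\mu$ and that, given $i > D$, the queue length is geometric with ratio $\lambda f\bar{\mu}/((1 - \lambda f)\mu)$; the resulting closed form is our own derivation and is not copied from (7). The function name is hypothetical.

```python
def expected_sojourn_uninformed(lam: float, mu: float, D: int, f: float) -> float:
    """Expected sojourn time of an uninformed customer who joins (own derivation)."""
    if not lam * f < mu:
        raise ValueError("lam * f < mu is needed for a stable tail")
    return (D + 1) / mu + (1 - lam * f) / (mu - lam * f)


lam, mu, D = 0.65, 0.6, 5
grid = [k / 20 for k in range(18)]     # f = 0.00, 0.05, ..., 0.85
sojourn = [expected_sojourn_uninformed(lam, mu, D, f) for f in grid]
```

At $f = 0$ the sketch returns $(D+2)/\mu$, the value quoted in the proof of Theorem 3.2, and with no informed states ($D = -1$) and $f = 1$ it returns $\bar{\lambda}/(\mu - \lambda)$, the sojourn time of the uninformed queue.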
The following theorem presents the expression for mixed equilibrium strategy.
Theorem 3.2. If a service provider employs the selective policy $\xi_D$ and the queue length is less than $L$, then all informed customers join the queue, whereas the uninformed customers join the queue with the probability $f^*$ given in (9).
Proof. We begin the equilibrium analysis with the case in which the system is stable and arriving customers are attracted to join the system, i.e., $\rho < 1$ and $R > \bar{\lambda}/(\mu - \lambda)$. In this situation, if the $\xi^-$ policy is applied, then all uninformed customers enter the system, as their expected sojourn time is $\bar{\lambda}/(\mu - \lambda)$, which is smaller than the service completion reward. For the service provider, this policy is optimal and no other policy can perform better. To find the system equilibrium otherwise, we consider the following two scenarios of the system: (i) overloaded system, i.e., $\rho > 1$; (ii) underloaded system with bounded reward, i.e., $\rho < 1$ and $R < \bar{\lambda}/(\mu - \lambda)$. The unique equilibrium (Lemma 3.1) is either a pure equilibrium with no entry of uninformed customers, $f^* = 0$, or a mixed equilibrium in which some uninformed customers enter, $f^* \in (0, 1)$. In the former case, no uninformed customer joins the queue when the queue length is greater than the threshold $D$. If an uninformed customer knew that the queue length was $D + 1$ and joined, his expected sojourn time would be $(D + 2)/\mu$. If $f = 0$ is an equilibrium, then an uninformed customer is not better off by joining the queue. This results in $(D + 2)/\mu \ge R$. (8) As we consider threshold policies with $D \le L - 2$, a pure equilibrium exists only if $R\mu = D + 2$. Next, we analyze the case of a mixed equilibrium. We deduce the fraction of uninformed customers that enter the queue, where each player must be indifferent between joining and balking. Thus, at equilibrium, $W_\xi(D, f^*) = R$. Solving (7) for $f$, we get the expression (9). The equilibrium $f^*$ represents a pure equilibrium for $D = R\mu - 2$ and a mixed equilibrium otherwise.
Using (9), we find the customers' equilibrium strategy under the information revelation policy $\xi_D$ adopted by the service provider. Thus, we get the result.
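The indifference condition $W_\xi(D, f^*) = R$ can be solved explicitly in a short sketch. The sojourn-time expression is the one we derive from the geometric tail above the threshold (it is not copied from (7)), the closed form for the root is obtained by solving that expression for $f$, and the clipping to $[0, 1]$ covers the pure equilibria. Names and parameter values are illustrative.

```python
def expected_sojourn_uninformed(lam: float, mu: float, D: int, f: float) -> float:
    """Expected sojourn time of a joining uninformed customer (own derivation)."""
    return (D + 1) / mu + (1 - lam * f) / (mu - lam * f)


def equilibrium_join_probability(lam: float, mu: float, R: float, D: int) -> float:
    """Solve W(D, f) = R for f, clipped to [0, 1]."""
    T = R * mu - D - 1                  # T = mu * (R - (D + 1) / mu)
    if T <= 1:
        return 0.0                      # (D + 2) / mu >= R: joining never pays
    root = mu * (T - 1) / (lam * (T - mu))
    return min(1.0, root)               # root > 1 means every uninformed customer joins


lam, mu, R = 0.65, 0.6, 50              # overloaded example
f_star = equilibrium_join_probability(lam, mu, R, D=5)
w_at_f_star = expected_sojourn_uninformed(lam, mu, 5, f_star)
```

For the overloaded example the root lies strictly inside $(0, 1)$ and the sojourn time at $f^*$ equals the reward, while for $D = R\mu - 2$ the sketch returns a joining probability of zero (up to rounding), in line with the pure equilibrium discussed in the proof.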
3.1. Geo/Geo/1/L informed queue. In this case, we have two more equations in addition to (1a)-(1c), where (1c) now holds up to $n = L - 2$; the steady-state probabilities $P_n$, $0 \le n \le L$, follow from the resulting system. The outside observer's observation epoch in LAS-DA falls in the interval $(t+, (t+1)-)$. Since nothing happens in this interval, the outside observer's distribution $P_n^o$ is the same as the distribution of the number in the system observed at $(t+1)-$. Hence, $P_n = P_n^o$, $0 \le n \le L$. The expected number of customers in the system is $\sum_{n=1}^{L} n P_n$. The expected waiting time in the system ($W_s$) can be obtained using Little's law. We note that Little's law holds at the outside observer's observation epoch, and since $P_n = P_n^o$ in the LAS-DA case, it applies directly to the distribution $\{P_n\}$.

3.2. Geo/Geo/1 uninformed queue. The steady-state probabilities of the Geo/Geo/1 uninformed queue, the expected number of customers in the system and, by Little's law, the expected waiting time in the system are obtained in the same manner.
4. Comparison of policies. In this section, we determine the optimal information revelation policy that the service provider may follow to optimize the revenue. For the three types of policies ($\xi_D$, $\xi^+$ and $\xi^-$), we deduce and compare the steady-state idle probabilities at equilibrium. When the threshold policy $\xi_D$ is applied, the steady-state idle probability at equilibrium, $P_0(D)$, is obtained by substituting (9) into (4); this gives (10). Once the queue length information is shared with all arriving customers, we have the finite-buffer Geo/Geo/1/L queue, whose steady-state idle probability $P_0^{\xi^+}$ is given by (11). When the queue length information is not shared with the arriving customers, there exists a unique mixed equilibrium. The customers join the queue with a certain probability $f$ and balk with the complementary probability $\bar{f}$. The model can be thought of as the infinite-buffer Geo/Geo/1 queue with effective arrival rate $\lambda f^*$ and service rate $\mu$. The steady-state idle probability $P_0^{\xi^-}$ and the expected sojourn time $W_{\xi^-}$ for this model are given by (12) and (13), respectively. Since all arriving customers are indifferent between joining and balking, $W_{\xi^-} = R$. Solving (13) for $f$ and substituting it in (12), we get the equilibrium idle probability under $\xi^-$. The social welfare per time unit, when all customers follow the same mixed strategy $f$, is denoted by $\Delta_s(f)$ and given by (14). Let $f_s^*$ be the root of the equation $\Delta_s'(f) = 0$, obtained by differentiating $\Delta_s(f)$ with respect to $f$. On simplification, $\Delta_s'(f) = 0$ reduces to a quadratic equation in $f$.

The above quadratic equation has two roots. If $f_o^*$ denotes the socially optimal joining probability, then there are two cases for the existence of $f_o^*$. The second derivative $\Delta_s''(f)$ is negative for any value of $f$ satisfying $\lambda f < \mu$. Therefore, the function $\Delta_s(f)$ in (14) is strictly concave in the interval $[0, \mu/\lambda)$ and attains a unique maximum at the point $f = f_s^*$. The following theorem compares the information policies at equilibrium, using the steady-state idle probability as the performance indicator.
Theorem 4.1. To maximize the revenue over the set of deterministic threshold-based information disclosure strategies, a service provider may use either the full information strategy $\xi^+$ or the uninformed strategy $\xi^-$.
Proof. We divide the proof into two parts. In the first part, we assume $\rho > 1$ and prove that $P_0(D) > P_0^{\xi^+}$; in the second part, we examine the case $\rho < 1$ and show that $P_0(D) > P_0^{\xi^-}$. Using (10) and (11), the first claim is restated as an explicit inequality. Taking $T_1 = R\mu - D - 1$, we note that $r^{\lfloor R\mu \rfloor - D} > r^{R\mu - D - 1}$. Thus, for any $T_1 \in [R\mu - \lfloor R\mu \rfloor + 1, R\mu - 1]$ and $\rho > 1$, the inequality is expressed through a function $g(T_1)$. We observe that $g(1) = 0$, where 1 is the smallest possible value of $T_1$. Taking the derivative of $g(T_1)$ with respect to $T_1$, we see that, for $g'(T_1) \ge 0$, our claim is to show that (18) holds. One may establish that $\ln(x) \ge \frac{x-1}{x}$ for any $x > 0$ with the help of the discrete mean value theorem [4]. Thus, (18) is valid and we infer that our claim $g'(T_1) \ge 0$ is true.
Next, in the second part, that is, for $\rho < 1$, we examine the inequality $P_0(D) > P_0^{\xi^-}$, which corresponds to (20). Setting $T_2 = D + 1$ in (20) and simplifying, we obtain an inequality in $T_2$. This inequality holds for $T_2 = 1$, the smallest possible value of $T_2$: substituting $T_2 = 1$ and simplifying further, we get (22), which is valid for any $R\mu \ge 2$ and $\rho < 1$. Let us define the function $h(T_2)$; differentiating it with respect to $T_2$, we obtain (24). To show that $h'(T_2) \le 0$, we divide the right-hand side of (24) by $\rho R^{T_2 - 1}$ and, on simplification, our claim reduces to (25). Since $\ln(r)(1 - \rho) < 0$ and also $(R\mu - T_2) \ge 1$, we get a bound under which (25) reduces to an inequality that is true for any $r > 0$.
Remark 2. We may deduce similar results for the early arrival system (EAS) from those of the above LAS-DA system. Since the outside observer's distributions for both systems are the same, we can replace $\{P_n\}$ at the outside observer's epoch of the EAS system and get the results. Let $Q_n$ and $Q_n^o$ be the steady-state probabilities that there are $n$ customers in the system at an arbitrary epoch and at the outside observer's observation epoch, respectively. Hence, using the relation $Q_n^o = P_n$ and observing Fig. 3, we obtain $P_0 = \bar{\lambda} Q_0$ together with the corresponding relations for $n \ge 1$; solving these equations gives $\{Q_n\}$ in terms of $\{P_n\}$.
5. Numerical results. In this section, we present some numerical experiments to display the effect of the system parameters on the behavior of the customers under the three different strategies explained above. In particular, we are interested in the values of the queueing parameters that help the system manager minimize the idle probability of the system by controlling the information shared with the arriving customers. We analyze both the overloaded and underloaded situations and present the optimal performance of the system in each case. Firstly, we consider the underloaded Geo/Geo/1/L queue with $\lambda = 0.5$, $\mu = 0.6$ and the service reward $R = 50$. Here, the buffer capacity is $L = \lfloor R\mu \rfloor = 30$.
So, the threshold $D$ of the selective policy can take values from 0 to 28. The behavior of customers under the different policies against the threshold $D$ is illustrated in Fig. 4. Here, the steady-state idle probability $P_0$ for the threshold policy $\xi_D$ is higher than that of the uninformed policy ($\xi^-$) and lower than that of the informed policy ($\xi^+$). Hence, the service provider maximizes the revenue by employing the policy $\xi^-$. When the queue is overloaded, with parameters $\lambda = 0.65$, $\mu = 0.6$ and the same service reward $R = 50$ as in the underloaded queue, exactly the opposite behavior is displayed in Fig. 5. Here, the optimal policy is the informed one. In both cases, we observe that the service provider benefits by employing either the informed policy $\xi^+$ or the uninformed policy $\xi^-$.
Secondly, we compute the expected waiting time of the joining customers for different joining probabilities and a particular threshold value $D = 5$. For the underloaded Geo/Geo/1 queue with the parameters $\lambda = 0.5$, $\mu = 0.6$, the uninformed policy outperforms both the informed and the threshold policies (see Fig. 6). The expected waiting times increase with the joining probability $f \in [0, 0.9]$, subject to the stability criterion $\lambda f < \mu$. For different values of the threshold $D$, we observe a sharp increase in the expected waiting time. This is intuitive: the higher the threshold, the more customers an arriving customer encounters, which in turn increases his waiting time in the queue. In Fig. 7, the corresponding behavior is described for the overloaded queue with parameters $\lambda = 0.65$, $\mu = 0.6$ and $D = 5$. In this case, the expected waiting time of an arriving customer who joins the queue under the threshold policy lies between the expected waiting times under the other two policies. In both cases, the uninformed policy outperforms the other two policies with respect to the expected waiting times.
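The idle-probability comparison of this section can be explored with the following sketch. It is an illustration under stated assumptions, not a reproduction of Figs. 4 and 5: the three idle probabilities are our own normalizations of the birth-death chains described in Sections 3 and 4 (they are not copied from (10)-(12)), informed customers are blocked at $L = 30$, and uninformed customers use the equilibrium joining probability obtained from the indifference condition, clipped to $[0, 1]$. All function names are hypothetical.

```python
def geometric_ratio(lam, mu):
    """r = lam * (1 - mu) / ((1 - lam) * mu)."""
    return lam * (1 - mu) / ((1 - lam) * mu)


def equilibrium_f(lam, mu, R, D):
    """Joining probability of uninformed customers under xi_D."""
    T = R * mu - D - 1
    if T <= 1:
        return 0.0
    return min(1.0, mu * (T - 1) / (lam * (T - mu)))


def idle_threshold(lam, mu, R, D):
    """Equilibrium idle probability under the threshold policy xi_D."""
    f, r = equilibrium_f(lam, mu, R, D), geometric_ratio(lam, mu)
    informed = sum(r ** n for n in range(1, D + 1)) / (1 - mu)
    return 1.0 / (1.0 + informed + r ** D * lam / (mu - lam * f))


def idle_informed(lam, mu, L):
    """Idle probability of the Geo/Geo/1/L queue (policy xi+)."""
    r = geometric_ratio(lam, mu)
    interior = sum(r ** n for n in range(1, L)) / (1 - mu)   # states 1..L-1
    boundary = (1 - lam) / (1 - mu) * r ** L                 # state L, no arrivals join
    return 1.0 / (1.0 + interior + boundary)


def idle_uninformed(lam, mu, R):
    """Equilibrium idle probability when nobody is informed (policy xi-)."""
    f = min(1.0, (R * mu - 1) / (lam * (R - 1)))
    return 1.0 - lam * f / mu


R, mu, L = 50, 0.6, 30
under = [idle_threshold(0.5, mu, R, D) for D in range(L - 1)]   # lam = 0.5
over = [idle_threshold(0.65, mu, R, D) for D in range(L - 1)]   # lam = 0.65
```

In this sketch, no threshold $D \in \{0, \ldots, 28\}$ gives a smaller idle probability than $\xi^-$ in the underloaded case or than $\xi^+$ in the overloaded case, which agrees with the statement of Theorem 4.1.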
Arriving customers benefit if they follow the uninformed policy.
6. Conclusions. In this paper, we have considered the equilibrium behavior in the discrete-time Geo/Geo/1 queue from a game-theoretic perspective. We studied policies in which the service provider, depending on the queue length, discloses information to some customers and hides it from others. This helps the provider minimize the idle period of the system and thereby maximize the revenue. On the other hand, it helps the arriving customers choose a policy that minimizes their expected waiting time, which in turn increases their net benefit. These results might provide managers with useful information to examine pricing issues and to prepare customers to adopt optimal strategies. Various numerical results for the system under consideration have been presented. The results discussed in this paper tend to their continuous-time counterparts in the limiting case. The policy studied in this paper may be applied to analyze more complex models, such as discrete-time batch arrival queues with various vacation policies, which is left for future investigation.