Cooperative dynamic advertising via state-dependent payoff weights

We consider an infinite horizon cooperative advertising differential game with nontransferable utility (NTU). The values of each firm are parametrized by a common discount rate and advertising costs. First we characterize the set of efficient solutions with a constant payoff weight. We show that there does not exist a constant weight that supports an agreeable cooperative solution. Then we consider a linear state-dependent payoff weight and derive an agreeable cooperative solution for a restricted parameter space.


1. Introduction. Dynamic individual rationality is a necessary condition for stable cooperative agreements in dynamic games. That is, the individual cooperative value payoff dominates the noncooperative equilibrium value at all time instants. One distinguishes time consistency [10] and agreeability [7] of cooperative solutions. The former refers to payoff dominance along the cooperative state trajectory, while the latter to payoff dominance along any state trajectory. Clearly, if a cooperative solution is agreeable it is also time consistent. Agreeability is a robust concept in the sense that cooperation is always beneficial even if the cooperative state trajectory is perturbed [4,5,6,20].
One way or the other the individual values under the cooperative solution are generated by efficient controls. That is, the grand coalition forms and jointly maximizes the sum of payoffs. The resulting controls yield an undominated value by the very definition of joint payoff maximization. In order to determine the individual cooperative values we further need to distinguish between games with transferable utility (TU) [11] and games with nontransferable utility (NTU) [19]. In TU games one can always construct agreeable payoff distributions. In a first step each agent is granted her noncooperative equilibrium value. Since the value of the grand coalition always (weakly) exceeds the sum of noncooperative values, cooperation is then incentivised in a second step by adding a share of the cooperation dividend [6,Eq. (9)]. In NTU games, however, the set of agreeable cooperative solutions may very well be empty, because the current individual cooperative value is simply the discounted stream of payoffs given the agents play the efficient controls. Side payments are not allowed and if the agents are sufficiently asymmetric agreeability may not be given over the entire state space for constant payoff weights [16,17].
In the present paper we consider an infinite horizon two-firm nonzero-sum NTU advertising differential game and introduce a state-dependent payoff weight. The game is symmetrically antagonistic in the sense that an increase of payoffs for one firm necessarily comes with a loss for the other firm. Since the objectives of the firms are perfectly misaligned, it seems especially hard to support cooperation at the bounds of the state space. It is shown that for this very reason constant weights fail to support an agreeable cooperative solution over the entire state space.
Related Literature. State-dependent payoff weights were already used in noncooperative infinite horizon difference [15] and differential games [3] as well as in finite horizon cooperative NTU difference games [18]. To the best of my knowledge no paper has yet studied an infinite horizon cooperative NTU differential game with state-dependent weights.
Sorger [15] considers a discrete time cooperative bargaining game in every period, and cooperation is then supported as an equilibrium given that the agents threaten to play noncooperatively in each period. The so-called welfare weight is a state-dependent fraction of the cooperation dividend, viz. the difference between the cooperative and noncooperative value of each agent.
In de-Paz et al. [3] and Marín-Solano [8] the authors consider agents with heterogeneous time preference rates. Here time consistent per instance cooperation is supported as an equilibrium of the game played between the t-generations. That is, the agents act cooperatively within, but noncooperatively across time. It is noteworthy that a time consistent cooperative equilibrium is not necessarily individually rational if the players are sufficiently asymmetric with respect to discounting [9]. In their framework they allow for time- and state-dependent payoff weights.
From a methodological point of view the paper of Yeung and Petrosyan [18] is related to the present approach. They consider a discrete time NTU cooperative game with finite horizon in which the cooperative controls solve the weighted joint payoff maximization problem of the grand coalition. The cooperative controls are then parametrized in the per period payoff weights. Since the time horizon is finite and time discrete, one can construct time varying payoff weights that support a time consistent cooperative solution by backward induction. That is, in the final stage one fixes weights that support cooperation. Such weights always exist by jointly maximizing payoffs. Given the weights in the final stage, one then fixes weights in the second-to-final stage that support cooperation and so forth. This approach is not applicable in the present paper, since there is no final stage (infinite horizon) and one cannot distinguish discrete stages (continuous time).
The remainder of the paper is organized as follows: In Section 2 we set up the model. In Section 3 we solve the differential game for the noncooperative Nash equilibrium in state feedback strategies. In Section 4 we introduce the cooperative differential game terminology and set up the optimal control problem that ultimately yields the values of the cooperative game. In Section 4.1 we solve for the efficient controls under a constant weight and show that there does not exist an agreeable cooperative solution. In Section 4.2 we solve for the efficient controls under a state-dependent weight and show that there exists an agreeable cooperative solution for a restricted parameter space. The game is analytically tractable and closed form solutions are derived for the feedback strategies and value functions. Section 5 concludes.
2. Model. Sorger [14] studies a noncooperative advertising differential game that rests on the optimal control model of Sethi [13]. The game is played over an infinite horizon $[0, \infty)$, where $t \in [0, \infty)$ refers to the current time. Two firms $i \in \{1, 2\}$ compete, and the respective opponent of firm $i$ is indicated by $j = 3 - i$. The firms sell a homogeneous good and the market demand is normalized to unity. Let $x(t) \in X = [0, 1]$ denote the market share of firm 1 and $1 - x(t)$ that of firm 2. Each firm can advertise at rate $u_i(t) \in U_i = \mathbb{R}_+$ to increase its demand. The evolution of $x(t)$ over $s \in [t, \infty)$ is described by the dynamic system (1), where $x$ refers to the current position of the game and $u = (u_1, u_2) \in U_1 \times U_2$.
Advertising has two effects on profits: a negative direct cost effect and an indirect effect of larger future profits due to an increase of demand. The trade-off is captured by the profit functions $F_i : X \times U_i \to \mathbb{R}_+$, where $\gamma > 0$ is a cost parameter. Since the time horizon is infinite and the state equation as well as the payoff functions are time invariant, we consider stationary feedback strategies $u_i = \mu_i(x)$. Let $\mathcal{U}_i = \{\mu_i : X \to U_i \mid \mu_i \text{ is Lipschitz continuous in } x\}$ denote the set of admissible strategies. The objective functional of each firm in the subgame $t \in [0, \infty)$ is given by the discounted stream of payoffs $J_i(x, \mu(\cdot); \rho) = \int_t^\infty e^{-r(s-t)} F_i(x(s), \mu_i(x(s)); \gamma)\,ds$. For computational reasons it is convenient to define $\kappa := \gamma r^2 > 0$. For a given $r$ one thus reparametrizes the cost parameter as $\gamma = \kappa r^{-2}$. We collect the exogenous parameters in $\rho = (\gamma, r) = (\kappa r^{-2}, r)$. The infinite horizon differential game is then denoted by $\Gamma(x)$.

3. Noncooperative equilibrium. We assume that the noncooperative equilibrium value serves as the lower bound against which a cooperative agreement is compared, because individual rationality requires that each firm be better off by cooperating than with what it can guarantee itself by acting noncooperatively. We consider state-dependent strategies, and since the time horizon is infinite we also consider an autonomous game. As the disagreement strategies we thus fix the state feedback equilibrium strategies.
Definition 3.1. The pair $\phi(x; \rho) = (\phi_1(x; \rho), \phi_2(x; \rho)) \in \mathcal{U}_1 \times \mathcal{U}_2 = \mathcal{U}$ of state feedback strategies constitutes a Nash equilibrium for $\Gamma(x)$ if for all $i \in \{1, 2\}$ and $x \in X$ it holds that:

2} and x ∈ X the following pair of coupled differential equations:

We call $D_i(x; \rho) = J_i(x, \phi(\cdot); \rho)$ the current noncooperative disagreement value.

Proposition 1. There exists a noncooperative equilibrium for $\Gamma(x)$ characterized by the following values and strategies:

Proof. We show how to solve the game by guessing and verifying $D_i(x; \rho)$. The maximizers of the Hamilton-Jacobi-Bellman equation (HJBe) (2) are given by We consider solutions with nonnegative controls and thus seek solutions that satisfy . Substituting the maximizers into the HJBe for $i = 1$ yields From the structure of the maximized HJBe we conjecture that the value functions are linear with respect to the state variable. Since the firms are symmetrically antagonistic with respect to the payoff functions and the state equation, we further conjecture that the value functions are mirrored along the line $x = \frac{1}{2}$, i.e.
The game can be regarded as a symmetric antagonistic nonzero-sum differential game, because the values are mirrored along the symmetric steady state. The gain of one firm along the state necessarily comes with a loss for the other firm. It seems that for this kind of game, support of agreeability over the entire state space is especially hard, because the preferences are completely misaligned on $X$ and at the bounds of the state space the minimal value of one firm coincides with the maximal value of the other one, i.e. $\arg\max_{x \in X} D_i(x; \rho) = \arg\min_{x \in X} D_j(x; \rho)$.

4. Cooperative solution.
We assume that the firms do not want to stick to the noncooperative equilibrium, but that they want to exploit a surplus from cooperation. Let us first state the assumptions of the cooperative game. At $t = 0$ the firms can agree on implementing control paths according to the feedback strategies $\sigma(x; \rho) \in \mathcal{U}$. In order to determine the cooperative controls $\sigma(x; \rho)$ we assume that the firms implement efficient controls derived from jointly maximizing the convex combination of instantaneous payoffs. Here $\lambda : X \to [0, 1]$ is an arbitrary function called a state-dependent payoff weight. The joint cooperative payoff functional is accordingly defined by $J(x, \mu(\cdot); \rho) = \int_t^\infty e^{-r(s-t)} F(x(s), \mu(x(s)); \gamma)\,ds$.
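The construction above can be illustrated numerically. The following is a minimal sketch, not the paper's solution method: it forms a convex combination of two instantaneous payoffs with a state-dependent weight and approximates a discounted integral of the form $\int_0^\infty e^{-rs}(\cdot)\,ds$ by truncation. The payoff functions `f1` and `f2`, the convention that the weight multiplies firm 1's payoff, and the helper names are all assumptions made for illustration; only the weight $\lambda(x) = 1 - x$ is taken from Section 4.2.

```python
import math


def linear_weight(x):
    """State-dependent weight lambda(x) = 1 - x (the weight studied in Section 4.2)."""
    return 1.0 - x


def f1(x, u, gamma=0.75):
    """HYPOTHETICAL payoff of firm 1: market share minus quadratic advertising cost."""
    return x - 0.5 * gamma * u * u


def f2(x, u, gamma=0.75):
    """HYPOTHETICAL payoff of firm 2: mirrored market share minus quadratic cost."""
    return (1.0 - x) - 0.5 * gamma * u * u


def weighted_payoff(x, u, lam=linear_weight, payoffs=(f1, f2)):
    """Convex combination lam(x) * F1 + (1 - lam(x)) * F2.

    ASSUMPTION: the weight multiplies firm 1's payoff.
    """
    w = lam(x)
    if not 0.0 <= w <= 1.0:
        raise ValueError("payoff weight must lie in [0, 1]")
    return w * payoffs[0](x, u[0]) + (1.0 - w) * payoffs[1](x, u[1])


def discounted_integral(integrand, r, horizon=60.0, steps=60000):
    """Trapezoidal approximation of int_0^horizon exp(-r s) * integrand(s) ds.

    The infinite horizon is truncated; the neglected tail is of order
    exp(-r * horizon) for a bounded integrand.
    """
    h = horizon / steps
    total = 0.0
    for k in range(steps + 1):
        s = k * h
        end_weight = 0.5 if k in (0, steps) else 1.0
        total += end_weight * math.exp(-r * s) * integrand(s)
    return total * h
```

For a constant integrand equal to one, `discounted_integral` should be close to $1/r$, which gives a quick sanity check on the truncation and step size.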
If the firms agree to cooperate, then they agree on implementing the controls that solve the optimal control program (9).

Theorem 4.1 (cf. [2, Theorem 3.4]). The pair $\sigma(x; \rho) \in \mathcal{U}$ constitutes an optimal solution to the infinite horizon control problem (9) if there exists a continuously differentiable function $C : X \to \mathbb{R}$ that satisfies for all $x \in X$ the following differential equation: and $C(x)$ is bounded for all $x \in X$.
Since we consider a NTU game, the individual cooperative value is simply given by the payoff stream under the cooperative controls.
Definition 4.2. The current individual cooperative value $A_i : X \to \mathbb{R}$, $i \in \{1, 2\}$, is defined as follows: $A_i(x; \rho) = J_i(x, \sigma(\cdot; \rho); \rho)$.

We assume that the cooperative agreement is stable over time if it is agreeable. That is, for both firms the individual cooperative value payoff dominates the noncooperative equilibrium value at any position $x \in X$ of the game. Define the cooperation dividend by $E_i(x; \rho) = A_i(x; \rho) - D_i(x; \rho)$. If there exists an $x \in X$ such that (11) does not hold, the firms abandon the agreement and play noncooperatively in the remaining game. For later reasoning we provide an equivalent definition of agreeability.
Definition 4.4 (Agreeability 2). A cooperative solution $\sigma(x; \rho) \in \mathcal{U}$ is agreeable at $t = 0$ if the following relation holds: Put differently, the firms stick to the agreement at any time instant $t \in [0, \infty)$ if there does not exist a state such that the current cooperative value of either firm is strictly payoff dominated by its noncooperative equilibrium value.
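The verbal statement of agreeability can be turned into a simple numerical check. The sketch below is an illustration under stated assumptions: it evaluates the cooperation dividend (cooperative value minus noncooperative value) of both firms on a grid over $X = [0, 1]$ and reports the worst case. The value functions used in the usage example are hypothetical placeholders, not the closed forms derived in the paper, and a grid check can only indicate, not prove, agreeability.

```python
def worst_dividend(coop_values, noncoop_values, grid_points=1001):
    """Smallest cooperation dividend A_i(x) - D_i(x) over both firms and a grid on [0, 1].

    coop_values and noncoop_values are pairs of callables (A_1, A_2) and (D_1, D_2).
    """
    grid = [k / (grid_points - 1) for k in range(grid_points)]
    return min(
        a(x) - d(x)
        for x in grid
        for a, d in zip(coop_values, noncoop_values)
    )


def is_agreeable_on_grid(coop_values, noncoop_values, grid_points=1001):
    """True if no grid state leaves either firm strictly worse off under cooperation."""
    return worst_dividend(coop_values, noncoop_values, grid_points) >= 0.0


# HYPOTHETICAL linear, mirrored disagreement values (shape only, numbers made up).
D = (lambda x: x, lambda x: 1.0 - x)

# HYPOTHETICAL cooperative values that dominate everywhere.
A_good = (lambda x: x + 0.1, lambda x: 1.0 - x + 0.1)

# HYPOTHETICAL cooperative values that fail near the bounds of the state space.
A_bad = (lambda x: 0.6, lambda x: 0.6)
```

With these placeholders, `is_agreeable_on_grid(A_good, D)` holds while `is_agreeable_on_grid(A_bad, D)` fails, because the flat cooperative value is dominated for a firm whose market share is close to one. This mirrors the difficulty at the bounds of the state space described in Section 3.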

4.1. Constant payoff weight.
In this section we consider a constant weight $\lambda \in [0, 1]$ and show that no such weight renders the cooperative controls agreeable. To this end we first solve (9) for the parametrized cooperative controls $\sigma(x; \rho, \lambda)$. Then we derive the individual cooperative value $A_i(x; \rho, \lambda)$ and thus the cooperation dividend $E_i(x; \rho, \lambda) = A_i(x; \rho, \lambda) - D_i(x; \rho)$. Finally we apply Definition 4.4 to check whether there exists a weight such that the cooperative controls are agreeable over the entire state space. It turns out that for every weight $\lambda \in [0, 1]$ there exists an $x \in X$ such that either firm 1 has an incentive to play noncooperatively, $E_1(x; \rho, \lambda) < 0$, or firm 2 does, $E_2(x; \rho, \lambda) < 0$.

Proposition 2. For $\lambda \in [0, 1]$ there exists a cooperative solution to the optimal control problem (9) characterized by the following value function and strategies: where

Proof. The proof follows the same lines as that of Proposition 1.

Let $y_1(s; x, \rho, \lambda)$ denote the parametrized solution of (1) with controls (14) and (15). Next we show that there does not exist a weight $\lambda \in [0, \frac{1}{2}]$ such that both firms benefit from cooperation over the entire state space $X$. It thus suffices to show that for all weights $\lambda \in [0, \frac{1}{2}]$ there exists a state $x \in X$ such that either $E_1(x; \rho, \lambda) < 0$ or $E_2(x; \rho, \lambda) < 0$ holds. Let us first compute the individual cooperative value. Since it suffices to show that one of the firms has an incentive to play noncooperatively, we compute the individual cooperative value of firm 1 only.

4.2. Variable payoff weight. In this section we consider the variable weight $\lambda(x) = 1 - x$. It will be shown that the cooperative controls are agreeable for a restricted parameter space. To this end we first solve (9) for the cooperative controls $\sigma(x; \rho)$. Then we derive the individual cooperative value $A_i(x; \rho) = J_i(x, \sigma(\cdot; \rho); \rho)$ and thus the cooperation dividend $E_i(x; \rho) = A_i(x; \rho) - D_i(x; \rho)$. Finally we apply Definition 4.3 to show that the cooperative controls are agreeable over the entire state space if the costs are bounded from above by γ ≤ 11+7 where $r > 0$ is an arbitrary number.
where the parameters of the value function are given by and $g(x)$ and $\omega(\rho)$ are auxiliaries respectively defined as follows:

Proof. The proof follows the same lines as that of Proposition 1.

Let $y_2(s; x, \rho)$ denote the parametrized solution of (1) with controls (18) and (19). The steady state $\lim_{s \to \infty} y_2(s; x, \rho) = \frac{1}{2}$ is globally asymptotically stable for all $(x, \rho) \in X \times \mathbb{R}^2_{++}$. Depending on the initial condition $x$, the state $y_2(s; x, \rho)$ approaches the steady state from either below or above. We can thus readily compute the control paths $\{\sigma(y_2(s; x, \rho); \rho) \mid s \in [t, \infty)\}$, where we need to distinguish three cases.
Proposition 5. The individual cooperative values are given by

Proof. During the course of integration we are going to use the following definitions where $\Phi(z, 1, a)$ is the so-called Lerch transcendent, defined for $z < 1$ and $\Re(a) > 0$, and $s \in [t, \infty)$ implies $q \in [0, \infty)$.
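For readers who want to evaluate $\Phi(z, 1, a)$ numerically, the following minimal sketch sums the power series $\Phi(z, 1, a) = \sum_{n \ge 0} z^n / (n + a)$. It is an illustration under assumptions: the series converges only for $|z| < 1$, which is narrower than the range stated above, $a$ is taken to be real and positive, and the function name and tolerance are chosen here for illustration. For $s = 1$ the function also has the standard integral representation $\Phi(z, 1, a) = \int_0^\infty e^{-aq} / (1 - z e^{-q})\,dq$, which explains why it appears when integrating over $q \in [0, \infty)$.

```python
import math


def lerch_phi_s1(z, a, tol=1e-14, max_terms=1_000_000):
    """Power-series value of Phi(z, 1, a) = sum_{n>=0} z**n / (n + a).

    ASSUMPTIONS: real |z| < 1 (series convergence) and real a > 0.
    """
    if not abs(z) < 1.0:
        raise ValueError("the power series requires |z| < 1")
    if a <= 0.0:
        raise ValueError("this sketch requires a > 0")
    total = 0.0
    power = 1.0  # holds z**n
    for n in range(max_terms):
        term = power / (n + a)
        total += term
        if abs(term) < tol:
            break
        power *= z
    return total
```

A convenient check is the special case $a = 1$, where the series reduces to $-\ln(1 - z)/z$; for example, `lerch_phi_s1(0.5, 1.0)` should agree with `2 * math.log(2)` up to the chosen tolerance.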
The proof for $x \in (\frac{1}{2}, 1]$ follows the same lines, but is omitted. Let us briefly illustrate the results with the example $\rho = (\frac{3}{4}, 1)$. Figure 2 illustrates the noncooperative $\phi_i(x)$ and cooperative $\sigma_i(x)$ strategies as well as the values $D_i(x)$ and $A_i(x)$ of firm 1 (left panel) and firm 2 (right panel). Since $A_i(x) > D_i(x)$ for all firms $i \in \{1, 2\}$ and states $x \in X$, the cooperative solution $\sigma(x)$ is agreeable.

5. Conclusion.
We studied a cooperative advertising differential game with nontransferable utility. It was shown that there does not exist a constant weight that supports an agreeable cooperative solution over the entire state space. After modifying the coalitional payoff with a state-dependent weighting function we were able to partially resolve the issue and showed the existence of an agreeable cooperative solution for a restricted parameter space. While the time preference rate can take arbitrary values, we derived an upper bound on the advertising cost parameter that is inversely related to the time preference rate. In a follow-up paper we want to investigate classes (e.g. linear-state or linear-quadratic) of analytically tractable games in order to derive general results on the functional form of the payoff weight such that a cooperative solution always satisfies agreeability over the entire state space.