FAIR-FIXTURE: MINIMIZING CARRY-OVER EFFECTS IN FOOTBALL LEAGUES

Abstract. We study a sports scheduling problem with the objective of minimizing carry-over effects in round robin tournaments. In the first part, focusing on tournaments that allow the minimum number of breaks (at most one) for each team, we formulate an integer programming model and provide an efficient heuristic algorithm to solve this computationally expensive problem. We apply the algorithm to the current Turkish Professional Football League and present an alternative scheduling template. In the second part, we discuss how the carry-over effects can be further decreased if the number of breaks is allowed to be slightly larger, and we represent this trade-off numerically.

1. Introduction. Sports, especially football, have become a multi-billion dollar industry. Both national and international tournaments are followed by people across the world, and countries compete for the right to host worldwide football tournaments. Football clubs make huge investments (in players, stadiums, marketing, etc.) to attract the attention of fans, media and broadcasting companies. Game schedules play a prominent role in attracting audiences: fans prefer their teams to play in local stadiums, and broadcasting companies are interested in the schedule that brings the highest advertisement revenues.
In general, sports scheduling problems consist of determining (i) the opponents, (ii) the date and (iii) the venue of each game of a league. They are interesting combinatorial optimization problems for operational researchers. Football league scheduling has been studied from a variety of perspectives, and there is no single model that solves all types of scheduling problems. Objectives may include minimizing the distance travelled by the teams, maximizing fairness among teams, maximizing TV audiences and gate attendance, or only finding a feasible schedule with respect to a given set of operational constraints.
To reach the end of the season in good shape, it is important for teams to balance their efforts during the tournament. Playing against a strong or weak opponent affects a team's overall performance. Since team strength levels are not equal, some teams may benefit from the way the league is scheduled. For example, if a team plays against a strong opponent one week, it is likely to be tired the following week because of the great effort spent. Assume that team i plays team j, and team j is a strong team. Next week, when team i plays team k, team k will have an advantage over team i because team i is likely to be tired from the previous week. In the literature, such cases are described as team k receiving a carry-over effect from team j. Especially in leagues where schedules are prepared with respect to a fixed ordering of teams (i.e., teams follow the same ordering of opponents every week), this creates a large benefit for teams that follow stronger teams' schedules. Obviously, carry-over effects (coe) will be present in any schedule, as each team always follows some previous opponent. However, the magnitude of carry-over effects can be diminished considerably by a fair scheduling of the league. In the literature, carry-over effects are represented by a carry-over matrix C, where $c_{ij}$ is the number of times team i gives a carry-over effect to team j, and the carry-over effects value of a league is given by $\sum_{i \in T} \sum_{j \in T} c_{ij}^2$ [15], where T is the set of teams. Ideally, one would want to minimize the coe value to create a balanced and fair schedule. The lowest carry-over effects value possible in a single round robin tournament with n teams is n(n − 1), which is attained if each team follows every other team exactly once.
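As a hedged illustration of the definition above, the following Python sketch computes the carry-over matrix and the coe value of a half-season. The data layout (`schedule[k][i]` is the opponent of team i in week k, with teams numbered from 0), the function names, and the cyclic treatment of the last week (its successor is taken to be the first week, the convention used later for the model in Section 2) are assumptions made for this example.

```python
from typing import List

Schedule = List[List[int]]  # schedule[k][i] = opponent of team i in week k


def coe_matrix(schedule: Schedule) -> List[List[int]]:
    """Return C, where C[i][j] counts how often team i gives a carry-over effect to team j."""
    weeks, n = len(schedule), len(schedule[0])
    c = [[0] * n for _ in range(n)]
    for k in range(weeks):
        nxt = (k + 1) % weeks  # assumption: the last week is followed by the first week
        for team in range(n):
            # `team` meets i in week k and j in the next week, so i carries over to j
            i, j = schedule[k][team], schedule[nxt][team]
            c[i][j] += 1
    return c


def coe_value(schedule: Schedule) -> int:
    """Sum of squared entries of the carry-over matrix."""
    return sum(v * v for row in coe_matrix(schedule) for v in row)


# 4-team single round robin: weeks {0-1, 2-3}, {0-2, 1-3}, {0-3, 1-2}
four_teams = [[1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
print(coe_value(four_teams))  # 12
```

In this small example every off-diagonal entry of C equals 1, so the coe value reaches the lower bound n(n − 1) = 12 for n = 4.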
In addition to coe values, determining the location of games is important in creating a fair schedule. In a double round robin tournament of n teams, each team meets every other team twice in a season. The first game is played in the home town of one of the teams (that team plays "home (H)" and its opponent plays "away (A)"), and the second game is played in the other team's home town. A string of H and A symbols that represents the sequence of home and away games of a team in a half-season is called a "pattern" [5]. Schedules are arranged to give each team an equal number of home and away games, and further arranged so that a team does not play consecutive home or away games (when possible). Two consecutive home games or two consecutive away games constitute a "break" for that team. It can be shown that a league of n teams requires a total of at least n − 2 breaks in a half-season [3]; n − 2 teams have 1 break and 2 teams have no breaks. Many of the European premier leagues (Belgium, England, Germany, Italy, Portugal, Spain, Turkey) follow this scheme, in which no team plays more than two home (or away) matches in a row [8].
In this paper we focus on creating a fair schedule by minimizing carry-over effects while finding a feasible schedule that allows at most one break for each team in a half-season, with no breaks in the first and last weeks. To the best of our knowledge, this is the first paper to consider these two issues together. In this respect our study builds on two streams of literature:
i) Minimizing carry-over effects was first introduced by [15], which proposed an exact algorithm for balanced schedules when n is a power of 2 and a heuristic for $n = p^m + 1$, where p is an odd prime and m ≥ 1. Since then many approaches have been developed to further improve solutions, such as a starters-based method [1], constraint programming [16,10], a permutation scheme (an extension of the polygon method) [13], tabu search [11], iterated local search with a multistart strategy [9] for the weighted carry-over effects problem, and multi-phase scheduling [14] for the extended carry-over effects problem. [7] claimed that carry-over effects do not necessarily create a disadvantage for teams. However, there are real-life application studies [6] that give examples of problems for specific teams [9]. Further, balancing carry-over effects remains an interesting fairness problem for leagues and a difficult combinatorial optimization problem for researchers.
ii) Minimizing the total number of breaks in a schedule is a hard problem to solve, though its complexity status is still open. A well-known constructive algorithm for minimizing the number of breaks is the canonical form [2,3,4]. A common approach to such problems is to separate the problem into three phases. In the first phase a set of feasible patterns is generated, in the second phase patterns are matched with each other in a way that minimizes the number of breaks, and in the last phase teams are assigned to patterns to find the final schedule.
In the next section, we give an integer programming model that solves this problem exactly, but since it quickly becomes impractical for tournaments with more than 10 teams, we provide a heuristic method in Section 3. We show the impact of our heuristic on coe values by comparing it with the current Turkish Professional Football League in Section 4. Computational results are presented in Section 5. Also in Section 5, we further investigate the trade-off between the number of breaks and coe values by allowing more than 1 break for each team. We compare our heuristic results with three real schedules from season 2015-16 to show how the heuristic can be adapted to the break restrictions of individual leagues. Concluding remarks are given in Section 6.
2. Mathematical model. The following integer programming model optimally solves the problem that minimizes carry-over effects while keeping the number of breaks to at most 1 for each team, with no breaks in the first and last weeks of a half-season. To keep the modeling approach general (it focuses on the number of breaks and carry-over effects), operational constraints are not taken into account.
Let T, P and K represent the sets of teams, patterns and weeks, respectively. There are three sets of decision variables. The first set is related to patterns: $x_{ip}$ equals 1 if team i ∈ T follows pattern p ∈ P and 0 otherwise, and $y_p$ equals 1 if pattern p ∈ P is used and 0 otherwise. The next set of decision variables is related to opponent, week and location assignments: $x_{ijk}$ equals 1 if teams i ∈ T and j ∈ T play against each other in week k ∈ K and 0 otherwise, and $h_{ik}$ gives the location of the game; it equals 1 if team i ∈ T plays at home in week k ∈ K and 0 if i plays away. The last set of decision variables covers the carry-over effects: $c_{ijk}$ equals 1 if team i ∈ T gives a carry-over effect to team j ∈ T in week k ∈ K and 0 otherwise, and $c_{ij}$ represents the total carry-over effect from team i ∈ T to team j ∈ T, i.e., the sum of $c_{ijk}$ over all weeks k ∈ K. The parameter $s_{pk}$ equals 1 if the location for pattern p in week k is H, and 0 otherwise.
Note that the pattern of game locations for a team determines the number of breaks for that team, and when the number of breaks is limited to at most 1, we can determine the set of all possible patterns at the outset. For a league of n teams, when no breaks are allowed in the first and the last week, the number of possible patterns is 2n − 6. For example, for a league of 8 teams, the set of all possible patterns is shown in Table 1. There are two patterns with no breaks (patterns 1 and 6), and for each pattern p = 1, ..., n − 3 there is a complementary pattern p + (n − 3) in which A (H) is replaced with H (A).

Table 1. Pattern set for a league of 8 teams.

Pattern   W1 W2 W3 W4 W5 W6 W7
1         A  H  A  H  A  H  A
2         A  H  H  A  H  A  H
3         A  H  A  A  H  A  H
4         A  H  A  H  H  A  H
5         A  H  A  H  A  A  H
6         H  A  H  A  H  A  H
7         H  A  A  H  A  H  A
8         H  A  H  H  A  H  A
9         H  A  H  A  A  H  A
10        H  A  H  A  H  H  A
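The pattern set can be enumerated mechanically. The sketch below is one way to do so, under the assumption (read off Table 1) that a single break may only fall in weeks 3 to n − 2, so that the first two and the last two games always alternate. The function name and the string encoding of patterns are choices made for this illustration.

```python
from typing import List


def one_break_patterns(n: int) -> List[str]:
    """All home/away patterns over n - 1 weeks with at most one break,
    where a break (a repeated venue) may only occur in weeks 3..n-2."""
    weeks = n - 1
    patterns = []
    for start in "AH":
        # None means "no break"; otherwise the 1-based week that repeats the previous venue
        for break_week in [None] + list(range(3, n - 1)):
            venue, row = start, []
            for week in range(1, weeks + 1):
                if week > 1 and week != break_week:
                    venue = "H" if venue == "A" else "A"  # alternate the venue
                row.append(venue)
            patterns.append("".join(row))
    return patterns


pats = one_break_patterns(8)
print(len(pats))  # 10, i.e. 2n - 6 for n = 8
print(pats[1])    # AHHAHAH (pattern 2 of Table 1)
```

With this ordering, the pattern at position p and the pattern at position p + (n − 3) are complements of each other, matching the numbering of Table 1.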
The objective is to minimize the total coe value over all teams. Each team follows one pattern (Constraint 2). Each used pattern is identified (Constraint 3) to make sure that the first pattern (with no breaks) is used (Constraint 4) and that, for every pattern used, its complementary pattern is also in the solution (Constraint 5). If team i plays team j in week k, the same is true for team j as well (Constraint 6). $x_{iik}$ is set to 0 for all i and k (Constraint 7). Each pair of teams meets once during a half-season, and each team plays only one opponent per week (Constraints 8 and 9). For opponent assignment, locations (determined by the patterns) must be taken into account. If a pattern p is assigned to team i, then the location (H or A) of the pattern in week k must be the location of team i in that week (Constraint 10). Two teams that are both scheduled to play home (or away) in week k cannot be assigned to each other for that week (Constraints 11 and 12). Finally, to calculate the coe value, teams following each other are tracked each week. If team l plays team i in week k and team j in week k + 1, then i gives a carry-over effect to team j (Constraint 13). It is customary to use the first week's schedule to calculate the coe for the last week (Constraint 14), since the mirrored schedule will be followed in the next half-season. The total coe between two teams i and j is the sum of all coe values over the half-season (Constraint 15).
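A sketch of a formulation consistent with the description above is given below, numbered to match the constraint references in the text. The algebraic form of each line, in particular the linking inequalities in (3) and (11) to (14), is an assumption inferred from the prose and should be read as one plausible rendering, not as the verified model. All variables are binary except $c_{ij}$, which is a non-negative integer.

```latex
\begin{align}
\min\ & \sum_{i \in T} \sum_{j \in T} c_{ij}^2 \tag{1}\\
\text{s.t.}\ & \sum_{p \in P} x_{ip} = 1 && \forall i \in T \tag{2}\\
& x_{ip} \le y_p \le \sum_{i' \in T} x_{i'p} && \forall i \in T,\ p \in P \tag{3}\\
& y_1 = 1 \tag{4}\\
& y_p = y_{p+(n-3)} && p = 1, \dots, n-3 \tag{5}\\
& x_{ijk} = x_{jik} && \forall i, j \in T,\ k \in K \tag{6}\\
& x_{iik} = 0 && \forall i \in T,\ k \in K \tag{7}\\
& \sum_{k \in K} x_{ijk} = 1 && \forall i, j \in T,\ i \ne j \tag{8}\\
& \sum_{j \in T} x_{ijk} = 1 && \forall i \in T,\ k \in K \tag{9}\\
& h_{ik} = \sum_{p \in P} s_{pk}\, x_{ip} && \forall i \in T,\ k \in K \tag{10}\\
& x_{ijk} \le 2 - h_{ik} - h_{jk} && \forall i, j \in T,\ k \in K \tag{11}\\
& x_{ijk} \le h_{ik} + h_{jk} && \forall i, j \in T,\ k \in K \tag{12}\\
& c_{ijk} \ge x_{lik} + x_{lj,k+1} - 1 && \forall l, i, j \in T,\ k < |K| \tag{13}\\
& c_{ij|K|} \ge x_{li|K|} + x_{lj1} - 1 && \forall l, i, j \in T \tag{14}\\
& c_{ij} = \sum_{k \in K} c_{ijk} && \forall i, j \in T \tag{15}
\end{align}
```

In this rendering the objective is quadratic in $c_{ij}$, and the lower-bounding inequalities (13) and (14) suffice because the minimization pushes each $c_{ijk}$ to the smallest value they allow.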
3. Solution method. Existing methods for minimizing carry-over effects cannot be applied to this problem because, to the best of our knowledge, none of the studies on carry-over effects considers leagues with at most one break for each team and break-free first and last weeks, even though this is the most common practice. In this section we propose a constructive heuristic that quickly creates a high-quality initial feasible solution (mainly by dividing the integrated model, INT, into two phases: pattern assignment and opponent assignment) and then improves the fairness among the teams by decreasing the total coe value of the league.
3.1. Pattern and opponent assignment. Our heuristic finds a feasible schedule that allows n − 2 teams to have 1 break and 2 teams to have no breaks. This is done in two phases. The first phase is assigning patterns to teams and the second phase is assigning opponents to teams while abiding by the patterns selected for each team in the first phase.
i) Pattern assignment: Since the set of possible patterns can be identified at the outset, the problem of pattern assignment simply becomes selecting n patterns (for n teams) out of the 2n − 6 possible patterns. We do this by finding a feasible solution to the integer programming model PAT, which consists of only Constraints 2, 3, 4 and 5 of the integrated model INT. In this model, we also keep track of each team's home-away availability by including Constraint 10 (which is redundant for pattern assignment).
ii) Opponent assignment: In this part, teams are matched each week in a way that does not violate the home and away games determined by the pattern assignment above. We find a feasible solution to the new integer program OPP, which consists of Constraints 6, 7, 8, 9, 11 and 12 of the integrated model INT. Team i's home-away availability, $h_{ik}$, is a parameter of the OPP model since patterns are already chosen in the previous phase.
Both of these models can be solved almost instantaneously using a commercial solver. The computational difficulty of the INT model is overcome with these two models, but obviously at the expense of the optimality of the solution. In the next subsection we propose an improvement method that exhaustively searches for an assignment that further decreases the total coe value obtained by these two models.

3.2. Week swap. Most heuristic approaches in the literature decrease the coe value by swapping opponents. In our case, however, such swapping may give infeasible solutions, since location patterns are not taken into account in random swapping. We suggest a simple heuristic that is easy to apply and reduces the coe value without violating the constraint on the number of breaks.
Algorithm 1 below gives a pseudocode of the heuristic. The input is the opponent assignment that shows whether teams i and j play in week k, and the output is the new schedule with the lowest coe value found. As a first step, we calculate the coe value for the given initial feasible solution. The main idea behind swapping in our heuristic is as follows. Note that the carry-over value of a league is calculated as the sum of the squares of the individual coe values for each pair of teams. To minimize this sum of squares, our strategy is to try to reduce individual $c_{ij}$ values when possible. To decide which coe values to target, the algorithm uses a parameter f and finds two teams with a coe value larger than f, together with the weeks (k, k + 1) during which the first carry-over effect takes place. f can take a value between 1 and the largest coe value for any pair of teams in that schedule. The value of f determines the number of iterations of our heuristic: a small f allows for a larger number of swaps. As expected, this increases the running time, so this parameter should be chosen taking the problem size into account.
Next, the opponent assignment for week k + 1 is swapped with week 0, i.e., games in week 0 are moved to week k + 1 and games in week k + 1 are moved to week 0. After the swap, the new coe value is calculated for the schedule. Note that the H, A patterns of teams will not be the same after swapping, and the solution may have become infeasible. If the new coe value is smaller than the current one, the solution is checked for feasibility by substituting it into the pattern assignment model (PAT) described above. If there is a feasible pattern set that fits the new assignment, the schedule becomes the current solution and the coe value is updated. Otherwise, the solution is ignored, and games in week k + 1 are swapped with games in week 1. This is repeated until all weeks have been tried, looking for a feasible and better (lower coe value) assignment.
Then, we move on to the next pair of teams with a $c_{ij}$ value larger than f, follow the same steps with those two teams, and move on to the next pair.
For example, let 2 and 7 be two teams with a carry-over value of 4, and let 3, 7, 12 and 15 be the weeks when the carry-over effects take place. First, the opponents of all teams in week 3 are switched with the opponents in week 1. If the new coe value is larger than the current coe value, the solution is ignored and the swap is reversed. If it is smaller, the feasibility of the new schedule is checked with respect to the pattern sets of teams, using PAT. If there is a feasible pattern set, the coe value and the schedule are updated; otherwise the opponents are switched back. Next, the opponents of all teams in week 3 are switched with the opponents in week 2, and the schedule is again checked for an improvement in coe value and for feasibility. This swapping is repeated until all weeks are covered. Once all week swaps are done for week 3, we repeat the same steps for weeks 7, 12 and 15. Then the next two teams with $c_{ij}$ ≥ f are found and the same steps are repeated.
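The swap loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the algorithm as implemented for the experiments: the schedule layout (`schedule[k][i]` is the opponent of team i in week k, weeks and teams numbered from 0), the threshold test `>= f`, the cyclic successor of the last week, and the `is_feasible` callback (a hypothetical stand-in for solving the PAT model) are all choices made for this example.

```python
from typing import Callable, List, Tuple

Schedule = List[List[int]]  # schedule[k][i] = opponent of team i in week k


def coe_matrix(schedule: Schedule) -> List[List[int]]:
    weeks, n = len(schedule), len(schedule[0])
    c = [[0] * n for _ in range(n)]
    for k in range(weeks):
        for team in range(n):
            # assumption: the last week is followed by the first week
            c[schedule[k][team]][schedule[(k + 1) % weeks][team]] += 1
    return c


def coe_value(schedule: Schedule) -> int:
    return sum(v * v for row in coe_matrix(schedule) for v in row)


def week_swap(schedule: Schedule, f: int,
              is_feasible: Callable[[Schedule], bool]) -> Tuple[Schedule, int]:
    """Swap whole weeks to lower the coe value; accept only feasible improvements."""
    best = [week[:] for week in schedule]
    best_val = coe_value(best)
    weeks, n = len(best), len(best[0])
    c = coe_matrix(best)
    # pairs of teams whose carry-over count reaches the threshold f
    targets = [(i, j) for i in range(n) for j in range(n) if c[i][j] >= f]
    for i, j in targets:
        # weeks k+1 in which some team meets j right after meeting i
        hot = [(k + 1) % weeks for k in range(weeks)
               if any(best[k][t] == i and best[(k + 1) % weeks][t] == j
                      for t in range(n))]
        for w in hot:
            for other in range(weeks):
                if other == w:
                    continue
                cand = [week[:] for week in best]
                cand[w], cand[other] = cand[other], cand[w]
                val = coe_value(cand)
                # the feasibility check (PAT in the text) runs only for improving swaps
                if val < best_val and is_feasible(cand):
                    best, best_val = cand, val
    return best, best_val


# 6-team schedule built with the circle method; every swap is treated as feasible here
six = [[5, 4, 3, 2, 1, 0], [2, 5, 0, 4, 3, 1], [4, 3, 5, 1, 0, 2],
       [1, 0, 4, 5, 2, 3], [3, 2, 1, 0, 5, 4]]
improved, value = week_swap(six, f=2, is_feasible=lambda s: True)
print(coe_value(six), value)
```

Because a rejected candidate is simply discarded, no explicit "swap back" step is needed; the current solution is only replaced when a swap both lowers the coe value and passes the feasibility check.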

Algorithm 1:
Input: A feasible solution to the opponent assignment model OPP.
Step 1. Calculate $c_{ij}$ ∀ i, j and let min =

Description of the league: The Turkish Professional Football League (TPFL) has been organized as a mirrored double round robin tournament. Similar to other major European leagues, it consists of 18 teams, and each team follows a home-away pattern with at most 1 break, with break-free first and last weeks. The Turkish Football Federation (TFF) schedules the league after determining the first and last days of the season, taking anticipated FIFA official national match days into account. The fixture is generated considering hard and soft constraints; hard constraints are the requirements with respect to home-away concerns, and soft constraints are in general related to resource scarcity and the availability of teams due to international games. At the beginning of the season, a fixture with only the games and their weeks is announced. The exact day of the week on which each game will be played is announced only one week before the game to avoid any game matching. Game days are determined mostly by considering broadcasting companies' requests.
Scheduling of a half-season: A predetermined fixture template (known as the canonical schedule or the circle method) for 18 teams is used each year. Teams are randomly assigned numbers from 1 to 18 to find the final schedule for that season. In this template each team (except the randomly chosen team 17) follows the same sequence of opponents (teams 1-2-8-4-6-10-12-14-16-18-15-13-11-9-5-3-7). Odd numbered teams play home during the first week and alternate to away the next week. If a team encounters its own number in the sequence, then it plays team 17 and has a break in that week. For example, if team 8 plays team 1 in the first week, then its opponent would be team 2 in the second week, team 17 in the third week, team 4 in the fourth week, etc. Since 8 is even, the first game would be away and pattern for team 8 would be H-A-A-H-A-H-A-.... In this way, hard constraints are enforced via the template. After this initial assignment, the schedule is updated manually to account for specific soft constraints of teams. Also, some pairs of teams are given priority at the beginning (when random numbers are assigned) to ensure that the two teams are not both assigned an odd number or both an even number. For example, teams of the same city should not both play home in the same week (for reasons such as stadium sharing, limited security resources, etc.), and therefore they are assigned complementary patterns, i.e., they never play home in the same week. Because teams follow the same ordering of opponents, this assignment, although easy to apply, results in high coe values each year, as given in Table 2. This is indeed aligned with the recent finding by [12] that the canonical form results in maximizing the carry-over effects.
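To illustrate why a template in which teams follow a common opponent sequence yields a high coe value, the following sketch builds a generic circle-method schedule and evaluates it. It does not reproduce the TFF numbering above: teams are labelled 0 to n − 1, the fixed team is n − 1, the modular pairing rule is the textbook circle construction, and the last week is assumed to be followed by the first week when counting carry-over effects.

```python
from typing import List

Schedule = List[List[int]]  # schedule[k][i] = opponent of team i in week k


def circle_schedule(n: int) -> Schedule:
    """Single round robin for an even n: team n-1 is fixed, the others rotate."""
    m = n - 1
    schedule = []
    for r in range(m):
        opp = [0] * n
        for i in range(m):
            # team r meets the fixed team; everyone else meets (2r - i) mod (n - 1)
            opp[i] = m if i == r else (2 * r - i) % m
        opp[m] = r
        schedule.append(opp)
    return schedule


def coe_value(schedule: Schedule) -> int:
    weeks, n = len(schedule), len(schedule[0])
    c = [[0] * n for _ in range(n)]
    for k in range(weeks):
        for team in range(n):
            # assumption: the last week is followed by the first week
            c[schedule[k][team]][schedule[(k + 1) % weeks][team]] += 1
    return sum(v * v for row in c for v in row)


print(coe_value(circle_schedule(18)))  # 3876
print(18 * 17)                         # 306, the lower bound n(n - 1)
```

In this construction each rotating team j is followed by team j + 2 (mod n − 1) in n − 3 of the n − 1 weeks, which is what concentrates the carry-over counts and inflates the sum of squares.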
Since we cannot access information on the initial assignment, the coe values below belong to fixtures which are manually modified by the TFF after the initial assignment, i.e., the actual schedule of that year. As can be seen easily, there is no significant difference between years, as the same template is used every year.

5. Computational results. In this section we first show the performance of our heuristic, and then apply our algorithm to the TPFL. To evaluate the trade-off between the number of breaks and carry-over effects numerically, we extend the problem to the cases when more than one break is allowed for teams. All computations are run via CPLEX 12.6 on a 3.40-GHz Intel Core i7-3770 Processor with 32.0 GB of RAM.

5.1. Heuristic results for the TPFL. Constructing the initial template: We first provide the computational results for 8, 10, 12, 14, 16 and 18 teams in Table 3. Since the exact integer program for an 8-team league takes more than 48 hours to give a solution with a 90% gap, we cannot use the exact solutions of the INT model for comparison. We refer the reader to [11] for the best known values in the literature. However, note that the best known values are for leagues where neither the number of breaks nor the type of breaks is a limiting concern. Therefore, the results here are not expected to be smaller. The algorithm is able to find a solution in about 24 minutes for a league of 18 teams. When we run our heuristic for an 18-team schedule with at most one break for each team and break-free first and last weeks, the initial Pattern and Opponent Assignment phases create a schedule with a coe value of 1104. Since we are solving two easy mathematical models, this solution is obtained instantaneously. We further improve this solution to 944 in the Week Swap phase. The improvement algorithm takes about 24 minutes to find the best solution. 944 is a better coe value than those of many European leagues in season 2015-16 (for example, the coe values for Germany and Spain are 1054 and 5183, respectively) and certainly better than the average coe value over the last 4 years of the TPFL. The final template for an 18-team league is given in Table 7 in the Appendix.

5.2. When more than 1 break is allowed for a team. Keeping at most 1 break for each team makes it harder to minimize coe, as expected, since opponent assignment is done only after committing to certain H and A pattern restrictions. In this section we further investigate how the coe value changes when we allow more than 1 break for each team while still trying to create a fair schedule.
When increasing the number of breaks, we have two concerns in keeping the schedule fair: i) Minimize the number of consecutive breaks. We take into account the ordering of breaks and make sure that breaks are not consecutive. For example, we would prefer AAHAA over AAAHA, although both have 2 breaks. ii) Minimize the difference between the numbers of breaks of the teams. Since we aim for a fair schedule, we design a schedule that does not allow a large difference between teams' numbers of breaks.
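The two concerns can be checked on individual patterns with a few lines of Python. The sketch below is illustrative only; the function names and the convention that a break is reported at the second of the two equal weeks (1-based) are assumptions of this example.

```python
from typing import List


def break_weeks(pattern: str) -> List[int]:
    """1-based weeks in which the venue repeats the previous week's venue."""
    return [k + 1 for k in range(1, len(pattern)) if pattern[k] == pattern[k - 1]]


def has_consecutive_breaks(pattern: str) -> bool:
    """True if two breaks fall in adjacent weeks (three equal venues in a row)."""
    b = break_weeks(pattern)
    return any(later - earlier == 1 for earlier, later in zip(b, b[1:]))


def break_spread(patterns: List[str]) -> int:
    """Difference between the largest and smallest number of breaks over all teams."""
    counts = [len(break_weeks(p)) for p in patterns]
    return max(counts) - min(counts)


print(break_weeks("AAHAA"), has_consecutive_breaks("AAHAA"))  # [2, 5] False
print(break_weeks("AAAHA"), has_consecutive_breaks("AAAHA"))  # [2, 3] True
```

Both example patterns have two breaks, but only AAAHA places them in adjacent weeks, which is the situation concern i) is meant to rule out; `break_spread` corresponds to concern ii).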
We adapt the Week Swap phase by using the following integer programming model instead of PAT in Step 4 of Algorithm 1. In this way, we check both for feasibility and for the number of breaks in the schedule. Here, $h_{ik}$ equals 1 if team i plays at home in week k and 0 otherwise, $r_{ik}$ equals 1 if there is a break in week k for team i and 0 otherwise, and $b_i$ represents the number of breaks of team i. Parameters u and v represent the lower and upper bounds on the number of breaks for each team, to address concern ii) above. By changing the u and v parameters, we allow different minimum and maximum numbers of breaks for each team and observe the trade-off between the number of breaks and the coe value.

BRK: Min
The objective of the model BRK is to minimize the total number of breaks. For opponents i and j in week k, i.e., $x_{ijk} = 1$, Constraint 17 guarantees that the game is played at home for only one team. Constraints 18 and 19 count the number of breaks and Constraint 20 gives the total number of breaks. Constraint 21 determines the minimum and the maximum number of breaks allowed for each team. Constraint 22 prevents consecutive breaks. Table 4 gives the computational results obtained with the BRK model in Step 4 of Algorithm 1. Additionally, to observe the effect of the initial schedule on the heuristic, solutions with three different initial schedules are presented. The first is a random feasible schedule, the second is the canonical schedule, and the third is the actual TPFL schedule for season 2014-15. The maximum and minimum numbers of breaks allowed per team are given in the first column, "Break (Each)", and the "Break" columns display the total number of breaks over all teams. Remarkable improvements in coe values are obtained if more breaks are allowed, since the number of feasible solutions increases. For a random initial feasible schedule the week swap heuristic successfully decreases the coe value from 1140 to 944, as explained in Section 5.1. A further improvement to 650 is achieved if the number of breaks is allowed to be between 1 and 2, and to 544 if it is allowed to be between 2 and 3. The decrease in the coe value (from 944 to 650) is around 30%, whereas the total number of breaks increases from 16 to 31, which is an increase of almost one for each team on average (remember that in the original problem there are 16 breaks: 1 for each of sixteen teams and 0 for two teams). The coe values for the "Canonical" and "TPFL" initial schedules are 3876 and 3707, respectively. Independent of the initial solution, our heuristic is able to find schedules that have low coe values and that distribute breaks among teams as fairly as possible.
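For orientation, one formulation consistent with this description is sketched below, with numbering chosen to match the constraint references in the text. The algebraic forms are assumptions inferred from the prose; $x_{ijk}$ is treated as given data (the opponent assignment being checked), and a break in week k is taken to mean that team i has the same venue in weeks k − 1 and k.

```latex
\begin{align}
\min\ & \sum_{i \in T} b_i \tag{16}\\
\text{s.t.}\ & h_{ik} + h_{jk} = 1 && \forall i, j \in T,\ k \in K \text{ with } x_{ijk} = 1 \tag{17}\\
& r_{ik} \ge h_{ik} + h_{i,k-1} - 1 && \forall i \in T,\ k \ge 2 \tag{18}\\
& r_{ik} \ge 1 - h_{ik} - h_{i,k-1} && \forall i \in T,\ k \ge 2 \tag{19}\\
& b_i = \sum_{k \in K} r_{ik} && \forall i \in T \tag{20}\\
& u \le b_i \le v && \forall i \in T \tag{21}\\
& r_{ik} + r_{i,k+1} \le 1 && \forall i \in T,\ k < |K| \tag{22}
\end{align}
```

In this sketch, (18) forces $r_{ik} = 1$ after two home games and (19) after two away games. With a positive lower bound u, these two inequalities only bound $r_{ik}$ from below, so an exact break count would additionally need inequalities that bound $r_{ik}$ from above; whether the original model includes such terms is not stated in the text.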
When we further relax the constraints on the distribution of the breaks (by removing the lower bound from Constraint (21) in the BRK model), we get the results in Table 5. The decrease in the coe value is more visible in leagues with a larger number of teams. For example, for an 18-team league, we can decrease the coe value from 944 to 646, but we would need to allow some teams to have 2 breaks and the total number of breaks to increase from 16 to 30. Whether this is an advisable trade-off should be evaluated individually for each league. Countries indeed consider different priorities for their football leagues, as we discuss next.
Finally, we present how our heuristic performs under the break specifications of three real leagues which do not use the canonical schedule. Table 6 shows the coe values and break numbers for the season 2015-16 schedules of the Czech Republic, Germany and France, together with our heuristic results. We have chosen the French and Czech leagues as they are known [8] to have a small number of breaks and small coe values, and Germany for an 18-team league comparison. The first five columns give information on the current situation of each league. The last two columns, "Coe (H)" and "Break (H)", show our heuristic results for the coe value and the total number of breaks. Using the same number of individual breaks, our heuristic was able to create a schedule with a smaller coe value in each case, with an average decrease of 18% in coe values.
6. Concluding remarks. We addressed a scheduling problem for football leagues that aims to maximize fairness among teams. Fairness is provided by minimizing both the number of breaks for each team and the total coe value of the league. Providing fairness is important in creating tournament schedules, because teams' motivation and fans' attention are in direct proportion to fairness of the schedule. We introduced a three-phase solution method to minimize number of breaks and carry-over effect values at the same time. We start with finding an initial solution that allows at most one break for each team, with break-free first and last weeks, as such leagues are the most common in practice. Then, we provide a simple and easy to apply heuristic method that decreases coe value while keeping the same number of breaks. Next we propose a secondary method that captures the trade-off between number of breaks and coe values. We provide numerical examples to underline this trade-off and further show the performance of our heuristic in comparison to three real league schedules.
Although the setting in this paper is a football league, the modeling approach and the solution heuristics proposed in this study can be used by organizers of any single or double round robin tournament where fairness is a major concern.

Appendix. The table below represents the outcome (resulting template) of our heuristic for an 18-team league. Further modifications (with respect to soft constraints of a league) can be made manually on this template. Here the first team in a cell represents the Home team. For example, in the first week Team 1 plays at home with Team 11.