Throughput of flow lines with unreliable parallel-machine workstations and blocking

Flow lines in which workstations and buffers are linked along a single flow path one after another are widely used for modeling manufacturing systems. In this paper we consider the flow lines with multiple independent unreliable machines at each workstation and blocking. The processing times, time to failure and time to repair of each machine are assumed to exponentially distributed and blocking after service blocking protocol is also assumed. An approximate analysis for throughput in the flow lines is presented. The method developed here is based on the decomposition method using the subsystems with three workstations including virtual station and two buffers between workstations. Some numerical examples are presented for accuracy of approximation.


(Communicated by Kok Lay Teo)
Abstract. Flow lines in which workstations and buffers are linked along a single flow path one after another are widely used for modeling manufacturing systems. In this paper we consider the flow lines with multiple independent unreliable machines at each workstation and blocking. The processing times, time to failure and time to repair of each machine are assumed to exponentially distributed and blocking after service blocking protocol is also assumed. An approximate analysis for throughput in the flow lines is presented. The method developed here is based on the decomposition method using the subsystems with three workstations including virtual station and two buffers between workstations. Some numerical examples are presented for accuracy of approximation.
1. Introduction. The main ingredients of manufacturing systems are workstations, called also service stations that consist of a set of machines and storage buffers in between workstations. One of the system most widely used for modeling manufacturing systems is a flow line, sometimes called transfer line, serial line or tandem queue in queueing terminology in which workstations and buffers are linked along a single flow path one after another. A pair of workstation and a buffer in front of the workstation is called a stage. The processing time, called also service time of a machine is a time period to be taken to process a part while the machine is working without failure.
There is an extensive literature for the analysis of throughput of flow lines with finite buffers and unreliable servers e.g. see the monographs [1,2,6,11], the survey papers [3,10] and the references therein. However, significant work has focused on analyzing the throughput of the systems with single machine in a workstation.
The system with exponential processing times can be modeled by finite state Markov chains. However, the number of states of the Markov chain increases drastically as the number of stages increases, which makes analytical or numerical solutions intractable. For the systems with long line, approximation methods such as decomposition and aggregation method have been developed. The basic idea of decomposition method is to decompose the long line into subsystems that are mathematically tractable, and to derive a set of equations that determine the unknown parameters of each subsystem, and finally to develop an algorithm to solve the equations. For more about decomposition method, see e.g. Gershwin [6]. The aggregation method is to replace a two-workstation-one-buffer line by a single workstation with the same throughput as the original two-workstation line. For more about aggregation method, see e.g. Li and Meerkov [11].
In order to improve production rate and/or reliability, a workstation has multiple machines in parallel [3]. Approximate solutions for the reliable lines with multiple machines using decomposition method are presented by [4,15,16]. The literature on the analysis of the unreliable systems with multiple machines in a workstation and finite buffer is more scarce than the reliable system. Diamantidis and Papadopoulos [5] and Liu et al. [12] derive algorithmic solutions for the two-workstation-onebuffer line. The approaches in [5] and [12] can be applied only to the system with small number of stages. Patchong and Willaeys [13] propose a method that replaces each parallel-machine stage by a single equivalent machine in order to obtain the system using the classically-structured flow line with machines in series. Li [9] applies the method in [13] to analyze the complex systems with multiple unreliable machines in a workstation. We have compared the original system with the equivalent single-machine system obtained by the method [13] using simulation. Numerical experiments show that the approach in [13] works well for the system when the buffer size is large and this fact is also indicated in [11]. It is necessary to reflect the state of each machine in a workstation and dependence between stages to improve the accuracy of approximation especially for the system with small size of buffer. It does not seem to be easy to develop a method reflecting such dependency effectively and there are few results for analysis of the long flow line with multiple unreliable machines in a workstation and finite buffer considering the states of each machine.
The purpose of this paper to propose an approximate solution for the throughput of flow lines in which each workstation has multiple unreliable machines in parallel and a buffer of finite size based on the decomposition method. To reflect the dependence between consecutive stages, we use the subsystems with three workstations including a virtual workstation for arrival to the system and two buffers instead of the subsystems with two workstations and one buffer that are mostly used in decomposition technique. Each subsystem is modeled by the two-stage tandem queue with two buffers and external arrivals and is analyzed using the level dependent quasi-birth-and-death (LDQBD) process.
The paper is organized as follows. Section 2 describes the model in detail and the original system is decomposed into subsystems. In Section 3 the formulae for the parameters of the subsystems are derived in terms of the states of the adjacent subsystems. Approximation of the subsystems and an algorithm for throughput are presented in Sections 4 and 5. Numerical results and some concluding remarks are given in Sections 6 and 7, respectively. 2. The model and decomposition. We consider a flow line L with N + 1 stages labeled S i , i = 0, 1, · · · , N . Each stage S i has a workstation M i with m i identical machines in parallel and a finite buffer B i of size b i in front of M i , i = 1, 2, · · · , N as depicted in Figure 1. The initial stage S 0 consists of one workstation M 0 with  We assume that the initial workstation M 0 is never starved and each machine at M 0 starts new process immediately after process completion unless the machine is blocked. When a machine completes processing a part at a stage and finds that the next stage is full at that time, the the part is held at the machine where it just completed its process until the destination can accommodate it. This type of blocking is called blocking after service (BAS) rule. The machines of the last workstation M N are never blocked and the part at M N leaves the system immediately after completing its process. The processing time of each machine at M i has exponentially distributed with rate µ i . Each machine is either up (operational) or under repair(broken-down) at any time. We assume the operation dependent failure, that is, each machine at each stage can be failed only while it is working. The time to failure and time to repair of a machine at M i are exponentially distributed with rate ν i and γ i , respectively. We decompose the system L with the subsystems L i , 0 ≤ i ≤ N − 1. Subsystem L 0 consists of M 0 and S 1 in series and L i , 1 ≤ i ≤ N − 1 consists of M i−1 , S i and S i+1 in series, where M i−1 is a virtual station which supplies the parts to S i as depicted in Figure 2. The departure process from the second stage S i+1 in L i may be affected by blocking due to full buffer at S i+2 . Once the arrival rate to S i , departure rate from S i+1 and failure rate at S i+1 in the subsystem L i are determined, one can analyze the subsystem L i and the throughput of L is approximated by the production rate of M N at S N .
3.1. Stochastic processes. Let X i (t) be the number of parts at the stage S i which includes the parts in the machines being repaired but excludes the parts that are blocked to enter the next stage and D i (t) be the number of machines being repaired at M i at time t. Let Z i (t) = (X i (t), D i (t)) and Ψ 0 (t) = (D 0 (t), Z 1 (t)), It can be seen that the state space of Ψ Ψ Ψ 0 = {Ψ 0 (t), t ≥ 0} is Denote the stationary version of X i (t) by X i and D i , Z i and Ψ i are similarly defined. We can easily see from the definition of X i and D i that Similarly, it can be seen that for given Z i = (n, h), Z i+1 takes values on Let π i (n, h, j, k) = P (Ψ i = (n, h, j, k)), (n, h, j, k) ∈ S S S i (Ψ) and define π i|i+1 (n, h|j, k) = π i (n, h, j, k) π i,2 (j, k) , where π i,1 (n, h) = (n,h,j,k)∈S S Si,2(n,h) π i (n, h, j, k), π i (n, h, j, k).

3.2.
Distribution of the number of machines blocked. Let Y i be the number of machines blocked at M i in steady state. Now we consider the state space of random variable Y i given that Z i = (n, h) and where . .
is given as follows: and (1) is immediate from the assumption that M N is never blocked. It can be seen from that given Z i = (n, h), Thus (2) is immediate from the relations (5) and (6) with Y N = 0. We have for and (3) for 1 ≤ y ≤ y i (n, h) is immediately obtained from the relation (6).

3.3.
Processing rates, failure rates and arrival rates. Let W i be the number of machines working at M i , 0 ≤ i ≤ N . It can be easily seen from (4) that W i , 1 ≤ i ≤ N − 1 can be expressed in terms of the states of S i+1 as

YANG WOO SHIN AND DUG HEE MOON
Combining (4) for Y i+1 and (7), we have Since M 0 is never starved, we can assume that there are infinite number of parts in front of M 0 . Setting X 0 = ∞ in (7) and (8), we have be the processing rates. The following proposition is immediate from (7) and (9).
. The conditional processing rate at M i given Z i = (n, h) is given by Proposition 3. The conditional failure rates at M i . are given as The conditional arrival rate to the stage S i+1 given Ψ i+1 = ψ = (n, h, j, k) is the processing rate at M i , that is, h, j, k, y) and Z i = (x, d) are given, it can be easily seen from (8) that W i is given by where y i (ψ * ) is the number of machines blocked at M i when Ψ * i+1 = ψ * and y i (ψ * ) = max{n + max(j + y − l i+2 , 0) − l i+1 , 0} Since M 0 is never starved, setting X 0 = ∞, we have The following proposition is immediately obtained by the law of total probability. Proposition 4. The conditional mean arrival rate λ i+1 (ψ) is given as where ψ * = (n, h, j, k, y), π * 0|1,2 (d|ψ * ) = P (D 0 = d|Ψ * 1 = ψ * ) and for 4. Approximation of L i . For an approximation of the subsystem L i , we consider the two-stage flow lineL i which consists of the stages S a i and S d i+1 in series as shown Figure 3. The stage S a 0 has m 0 parallel identical machines that are never starved and the processing rate of each machine is µ 0 . The stage S a i , 1 ≤ i ≤ N − 1 has a workstation M a i with m i identical machines in parallel and a buffer of size b i + m i−1 that counts the buffer size b i of B i and the potential number of blocked machines m i−1 at M i−1 . The second stage S d i+1 ofL i consists of m i+1 machines in parallel and a buffer of size b i+1 . The parts at S a i can be blocked to enter the second stage due to full buffer at S d i+1 and the BAS blocking rule is assumed. Each server in M a i and M d i+1 is either up (operational) or under repair(broken-down) at any time. The operation dependent failure is assumed.
Let X a i (t) be the number of parts at the stage S a i which includes the parts in the machines being repaired but excludes the ones blocked to enter the next stage at time t and D a i (t) the number of machines being repaired at M a i at time t and set Z a i (t) = (X a i (t), D a i (t)). The random variables X d . Let X a i and D a i be the stationary version of X a i (t) and D a i (t), respectively and X d i+1 , D d i+1 andΨ i are similarly defined. The inter arrival time to S a i is assumed to be exponential distribution whose rate depends on the state ofΨ i and denote it by λ a i (ψ) whenΨ i = ψ. The processing time and time to failure of each machine at M a i are exponentially distributed with rates µ a i (ψ) and ν a i (ψ) whenΨ i = ψ, respectively. The times to repair of each machine at M a i and M d i+1 are exponentially distributed with parameters γ i and γ i+1 , respectively. The processing time and time to repair of the machines at M d i+1 are assumed to be exponential with rates µ d i+1 (j, k) and ν d i+1 (j, k), respectively that depend on the state Z d i+1 = (j, k). ThenΨ Ψ Ψ i is a level dependent quasi-birth-and-death (LDQBD) process on the state space S S S i (Ψ), 0 ≤ i ≤ N −1. For description of LDQBD process, one can refer to [8]. Denote The ((j, k), (j , k ))−component B is the transition rate from (h, j, k) ∈ h h h to (h, j , k ) ∈ h h h that is given by where ∆ 0 (h, j, k) is the positive number that makes Q 0 e = 0 and e is the column vector of appropriate dimension whose components are all 1. Similarly, the components of the block matrices A are the transition rate from (h, j, k) ∈ h h h to (h + 1, j , k ) ∈ h + 1 h + 1 h + 1 and the transition rate from (h, j, k) ∈ h h h to (h − 1, j , k ) ∈ h − 1 h − 1 h − 1, respectively and are given as follows: is the transition rate from (n, h, j, k) ∈ n n n to (n, h , j , k ) ∈ n n n and is given by where ∆ i (n, h, j, k) is the positive number satisfying Q i e = 0. Similarly, the components of A (n) i and C (n) i are the transition rate from (n, h, j, k) ∈ n n n to (n + 1, h , j , k ) ∈ n + 1 n + 1 n + 1 and the transition rate from (n, h, j, k) ∈ n n n to (n − 1, h , j , k ) ∈ n − 1 n − 1 n − 1, respectively and are given as follows: i ((h, j, k), (h , j , k )) = µ a i (n, h, j, k), (h , j , k ) = (h, j + 1, k), 0, otherwise.
Let Y a i and Y d i+1 be the numbers of machines blocked at M a i and M d i+1 , respectively and , (n, h, ) ∈ S S S N (Z) from the formula (2) as follows:
The arrival rates λ a i (ψ) are chosen from (16) and (17) as follows : for ψ = (n, h, j, k) and ψ * = (n, h, j, k, y), The conditional distributionπ * i|i+1,i+2 (x, d|ψ * ) in (25) and (26) is approximated bŷ h, j, k, y) and THROUGHPUT OF FLOW LINES 911 5. Algorithm. Once the parameters λ a i (ψ), µ a i (ψ), ν a i (ψ), µ d i+1 (j, k) and ν d i+1 (j, k) are determined,π π π i (n), 0 ≤ n ≤ K i can be calculated by the formulâ π π π i (n) =π π π i (n − 1)R i (n), n = 1, 2, · · · , K i , where R i (n), 1 ≤ n ≤ K i are obtained recursively as follows: The vectorπ π π i (0) is the unique solution of the equation with the normalizing condition Ki n=0π π π i (n)e = 1. Now we present an iterative algorithm for the equations (18)-(26) for the parameters and throughput. LetL Step 0: [Initial step] Letting no machines at departure stage be blocked, set  Step 2: [Stopping Criterion] Calculate the stationary distributionπ π π N −1 and throughput k)) and check stopping criterion where > 0 is a given tolerance. If the condition (27)   For solving the subsystemL i , we need to invert K i matrices of the size |n n n| in S S S i . Analytic proof of the convergence of the iteration scheme is not given, but extensive numerical experiments show the convergence of the iteration. We can see from the experiments that the necessary number of iterations in Algorithm A is less sensitive to the number of servers and buffer size, but it increases as the number of stages increases.
6. Numerical results. Approximations (App) are compared with the simulations (Sim) for accuracy check of the approximation. Simulation models for the systems in the tables are developed with ARENA [7]. Simulation run time is set to 220,000 unit times including 20,000 unit times of warm-up period. Ten replications are conducted for each case and the half length of 95% confidence interval (c.i.) are obtained. Tolerance = 10 −5 is used for stopping criterion (27).
In the following tables, the columns corresponding to the vector m m m = (m 0 , · · · , m N ) denote the server allocation. For example, the vector (3, 2, 1, 1, 2, 3) in the column corresponding to m m m means that m 0 = 3, m 1 = 2, m 2 = 1, m 3 = 1, m 4 = 2, m 5 = 3 in Table 2. The numbers 0, 3 and 5 corresponding to b i in Tables 1 -3 mean that b i = 0, 3, 5, i = 1, 2, · · · , N . The columns corresponding to (ν i , γ i ) are for the failure rate and repair rate of a machine. The relative percentage error is calculated by Err(%) = |App − Sim| × 100/Sim. Two sets of parameters are selected for investigating the effect of repair time length when isolated efficiencies are same. Thus we determine (ν i , γ i ) as (0.04, 0.2) and (0.1, 0.05). It means that the pairs (MTTF, MTTR) of mean time to failure (MTTF) and mean time to repair (MTTR) are (25, 5) and (10, 2), respectively. Note that the isolated efficiencies of machines are same with 0.833. From this parameter setting, we observed that frequent failures with small repair time result better production rate in the numerical experiments below. That is, the production rates of the system with (0.1, 0.5) is higher than that of the case (0.04.0.2). It means that although the isolated efficiency is the same, the production rate is affected by MTTR. As the repair time becomes longer, the production rate decreases. Table 1 presents the comparison with DDX algorithm [6] for the system with single machine in a workstation. Numerical results show that the approximation performs well over the DDX algorithm for the system with single machine at a workstation especially in the case of the system with small size of buffers.
Gershwin's decomposition method [6] uses the two-machine one-buffer system as a subsystem and modifies the failure rate and repair rate of each machine by reflecting blocking and starvation. The modified rates of failure and repair are independent of the state. The approximation assumption adopted in [6] is that the probability of simultaneous blocking and starvation is negligible. However, the probability of a station being simultaneously blocked and starved can be increased as the length of the line increases and buffer size decreases. The method proposed in this paper reflects the states of each machine in a workstation such as the numbers of parts being processed, the number of failed machines and dependence between stages which improve the accuracy of approximation especially for the system with small size of buffer.
In Tables 2-6, approximations are compared with simulation for the moderate length of the line with N = 6, 10, 15 under the various configurations. The scenarios for Tables 5 and 6 are given in Table 4. The efficiency of each machine is fixed by γi νi+γi = 5 6 ≈ 0.83 and balanced lines (Tables 2 and 5) with m i µ i = 1 and unbalanced lines (Tables 3 and 6) with µ i = 1.0 are considered. In case of the balanced system with l i = m i + b i = 5, the accuracy is similar to that of the case of l i = 3 and the results of the case is omitted in Table 5. Numerical results show that the approximation performs well even for the long line with small buffer. Furthermore, the approximation works better for the system with balanced line and large failure rate than that with the unbalanced line and small failure rate.
From the numerical experiment, we observed followings. As the increase of line length, the approximation errors increase slightly. When we design parallel manufacturing systems, allocating machines evenly for all stations is the best with respect to the production rate. It is known that the optimal buffer allocation which maximizes production rate in single-machine flow lines follows the reverse-bowl shape, and it means that the buffer size should be maximized in the middle of flow lines. However we assume that the buffer sizes ahead of stations are same. It can be seen from Table 2 that the allocation of machines (1, 2, 3, 3, 2, 1) generates a higher production rate than (3, 2, 1, 1, 2, 3). Therefore, if we should inevitably assign some stations with parallel machines, the reverse-bowl shape is better than the bowl shape. This phenomenon can be observed in the long lines with 10 stations. In the Line 2 of Table 5, the machine allocation is (1, 1, 2, 2, 3, 3, 2, 2, 1, 1) and the total number of machines is 18. In the Line 2 of Table 6, the production rate from the simulation is 0.4597 when (ν i , γ i ) is set to (0.04, 0.2). If we change the machine allocations to (3, 2, 2, 1, 1, 1, 1, 2, 2, 3), the production rate obtained from simulation   is 0.4535 and it is smaller than 0.4597. These patterns are consistently applied to the unbalanced lines as shown in Table 3.
The number of iterations are presented in Table 7 for the systems of short line with N = 5, moderate size of line N = 10 and long line N = 15 with (µ i , ν i , γ i ) =   (1.0/m i , 0.1, 0.5) and stopping criterion = 10 −5 . From Table 7 the reader may notice that the number of iteration is less sensitive to the number of servers and buffer size than to the length of the line.

7.
Conclusions. In this paper we have developed a method for an approximation of throughput in a flow line with parallel unreliable machines and finite buffer based on a decomposition method. We use the subsystems with three workstations and two buffers and each workstation of the subsystem has parallel unreliable machines. Each subsystem is approximated by a two-stage and two-buffer line in which the arrival rate, processing rate, failure rate and repair rate at each stage are dependent on the state of the system and is modeled by LDQBD process. The method proposed here reflects the dependency between two consecutive stages effectively and works excellently even for the unbalanced system with small buffer. This approach is different from the decomposition method mostly used for analysis of the line with single-machine workstations A limit of current method is that the method tends to have difficulties as buffer size and/or the number of machines increase.