A FRAILTY MODEL FOR INTERVENTION EFFECTIVENESS AGAINST DISEASE TRANSMISSION WHEN IMPLEMENTED WITH UNOBSERVABLE HETEROGENEITY

. For an intervention against the spread of communicable diseases, the idealized situation is when individuals fully comply with the intervention and the exposure to the infectious agent is comparable across all individuals. Some level of non-compliance is likely where the intervention is widely imple- mented. The focus is on a more accurate view of its eﬀects population-wide. A frailty model is applied. Qualitative analysis, in mathematical terms, re- veals how large variability in compliance renders the intervention less eﬀective. This ﬁnding sharpens our vague, intuitive and empirical notions. An eﬀective reproduction number in the presence of frailty is deﬁned and is shown to be invariant with respect to the time-scale of disease progression. This makes the results in this paper valid for a wide spectrum of acute and chronic infectious diseases. Quantitative analysis by comparing numerical results shows that they are also robust with respect to assumptions on disease progression structure and distributions, such as with or without the latent period and the assumed distributions of latent and infectious periods.


1.
Introduction. This manuscript extends a previous work [20] which examined how variability of the latent and infectious periods of an infectious disease could affect the threshold value ρ > 0 by assuming an intervention against transmission applied homogeneously across individuals at a rate φ so that, when φ ≥ ρ, the controlled reproduction number, as defined in [20], would be kept below unity. The objective here is to address how between-individual variability in adherence makes the theoretically proven effective control measure less effective when φ ≥ ρ.
1.1. Brief review of the previous work. A branching process is often used to approximate the cumulative number of infected individuals during the initial phase of an epidemic, when an infectious "seed" is planted in an infinitely large susceptible population [2]. The basic reproduction number R 0 is the mean value of the branching process and plays a very important role. According to theory ( [19], [17]), if R 0 ≤ 1, the branching process will become extinct with certainty. If R 0 > 1, there exists a positive probability that the branching process will grow infinitely large. The branching process approximation is valid only for the establishment phase during which the depletion of the susceptible individuals is negligible and no intervention has taken place.
Assuming a very large population with homogeneous mixing, the infectious contact process (as defined in [3]) is a Poisson process with constant rate β. If an infected individual has a random infectious period with finite mean µ I , the basic reproduction number is R 0 = βµ I (see [1]). This paper only considers the case R 0 > 1. If the branching process does not die out during the establishment phase, the epidemic does not grow infinitely large. The depletion of the susceptible population depletes the reproduction number over time according to R t = R 0 S(t) n where S(t) is the expected number of susceptible individuals and n is the total population size, which is assumed to be constant. R t will be depleted to a threshold value 1 at some point of time. After that, two scenarios might happen.
1. R t remains less than 1. The result is an end of the epidemic with a positive proportion 0 < η < 1 of the population cumulatively infected. Under certain circumstances (e.g. closed population, no death), η satisfies a final size equation 1 − η = s 0 e −R0η , η > 0, where s 0 is the initial proportion of susceptible individuals in the population at the beginning of the epidemic. This equation holds true in a very large population with homogeneous mixing so that the infectious contact process is a Poisson process. It holds true for arbitrarily distributed latent and infectious periods, regardless whether there is a latent period [10]. 2. Due to loss of immunity of recovered individuals or by demography, there is replacement of susceptible individuals so that R t is sustained at R t ≈ 1 in the long run, resulting an asymptotic equilibrium condition S(t) n ≈ 1 R0 . In both scenarios, R 0 , defined at the very beginning of the outbreak, transcends to the long run outcome, in theory.
In practice, R 0 is not only depleted by the transmission dynamics, but also by interventions during the outbreak. The long run outcome, such as the final size η, does not transcend back to R 0 . It is manifested as if the reproduction number had been R c < R 0 at the very beginning and there had been no intervention. Such an analogy is appropriate given the generality of the transcendental relationships between R 0 and the final size or the asymptotic equilibrium level, as proven in [10].
In [20], R c is called the controlled reproduction number and modelled in a quarantine-isolation paradigm. Infected individuals are quarantined (or isolated) at a constant rate φ > 0, assuming that the intervention is applied homogeneously across individuals. This is the idealized situation.
A latent period is a random duration T E during which an individual is infected but not yet able to infect other susceptible individuals through contact. With quarantine rate φ, the probability that an individual during the latent period escapes from being quarantined and becomes infectious is is the probability density function (p.d.f.) of T E . The infectious contact process is a "thinned" Poisson process with rate β ∞ 0 e −φx f E (x)dx which returns to β when φ = 0. The infectious period T I has survival function F I (x). Infectious individuals are isolated at rate φ from contacting other individuals. The effective mean duration of infectiousness among infectious individuals is ∞ 0 e −φx F I (x)dx which returns to The threshold condition is βL which is recognized as the equilibrium distribution in renewal process theory. At a randomly chosen time t, "currently" infectious individuals form a prevalence cohort. The infectious period of individuals in this cohort is length-biased because those with longer infectious period have a large probability of being included. If the system is under equilibrium (i.e. the prevalence proportions for susceptible, latent, infectious, recovered and immune individuals are approximately constant), the length-biased infectious period is on average longer than µ I , which is µ I (1 + cv 2 ) and cv is the coefficient of variation of the infectious period T I defined as the ratio of standard deviation to its mean. The corresponding length-biased p.d.f. is x µ I f I (x). Meanwhile, the time from the beginning of infectiousness of these individuals in the prevalence cohort to the observation time t, is identically distributed, denoted by W, with mean equals to 1 2 . Under equilibrium, if an individual at a randomly chosen time t, assuming susceptible at t − ∆t, comes in contact with any of the infectious individuals in the prevalence cohort, G = T E + W is the total latent period plus part of the lengthbiased infectious period, where W is the duration between the time of infection of individuals who are in the prevalence cohort and time t. An illustration is given in Figure 1 in which the red line is the infectious period and the blue line is the latent period. In this figure, only two of the four individuals are in the prevalence cohort and W is defined. The mean value for G is µ E + 1 2 µ I (1 + cv 2 ). Assuming independency, the p.d.f. for G = T E + W , denoted by f G (x), is defined by the convolution between f E (x)and f W (x). Its Laplace transform is the product 1.2. Highlights of this paper. The quarantine-isolation paradigm in [20] is for the convenience of discussion. In this manuscript, the control measure is generalized into many types of intervention applied to individuals during their latent period and/or their infectious period. Such interventions include, but are not limited to, condom use for the reduction of sexual transmission of diseases, prophylactic intervention and use of antiviral drugs to reduce transmission, and so on. Several factors may influence the risk of infection separately from any impact of the intervention. One of them is between-individual heterogeneity in compliance that individuals may not fully comply with the intervention. When individuals fully comply with the intervention and the exposure to the infectious agent is comparable across all individuals being studied, this is the idealized situation. The impact of an intervention is sometimes called the efficacy of the intervention. Since some level of non-compliance is likely where an intervention is widely implemented, the focus will be on effectiveness that provides a more accurate view of possible effects population-wide.
Throughout the paper, the notation ρ is the threshold parameter calculated by solving (2). The condition R c (ρ) = 1 is a reflection of efficacy under the idealized situation. In the presence of between-individual heterogeneity in the implementation, at the calculated threshold value ρ, the actually realized controlled reproduction number, denoted by R v (ρ), will be in the range 1 < R v (ρ) < R 0 . This is a reflection of effectiveness.
Section 2 is a qualitative analysis. A frailty model, with a random effect z > 0, is used to characterize unobservable heterogeneity. The variability of z is measured according to the convex order given by Definition 2.2, which is a more natural way to describe dispersion and a more general definition than variance. It shows that the more variable the frailty z according to convex order, the larger the value of R v (φ). The control measure is most effective if applied homogeneously across all individuals. It also contains an important finding that, given the threshold ρ under the idealized situation, the value of R v (ρ) > 1 is invariant with respect to the time scale of disease progression, regardless if the disease is acute (e.g. measured in days like influenza) or chronic (e.g. HIV, viral hepatitis). Given the model for frailty, R v (ρ) only depends on R 0 and f G . The latter includes assumptions on the existence of the latent period and the distributions of the latent and the infectious periods.
Section 3 is a quantitative analysis to evaluate how much influence f G has on R v (ρ). It turns out that the influence is quite weak under the numerical results calculated in 32 different cases. Detailed calculation formula and associated tables are given in Appendix.
The results presented in this paper are not only invariant to the time scale of the disease progression, but also robust with respect to model assumptions in the disease progression.
2.1. The frailty model, efficacy and effectiveness. In survival analysis, the frailty model is a random effect model to account for unobservable heterogeneity between individuals [7]. In the proportional hazard model h(x|z) = zh 0 (x), z > 0, where h 0 (x) > 0 is the baseline hazard function, z > 0 is called the frailty parameter and is assumed to be random with mean value E(z) = 1 and p.d.f. ξ(z). It can be alternatively written in terms of survival functions F (x|z) = e −zH0(x) , where H 0 (x) = x 0 h 0 (u)du is the cumulative baseline hazard. If there is no heterogeneity, then ξ(z) degenerates to z ≡ 1 without variation and the survival function is F 0 (x) = e −H0(x) . In the presence of unobserved heterogeneity, the frailty model has the survival function The importance of Laplace transform in the context of frailty modelling in survival analysis was pointed out in [6].
In the current context, the intervention is associated with a rate φ > 0 under the idealized situation. The baseline hazard function is h 0 (x) = φ, H 0 (x) = φx and F 0 (x) = e −φx . If there is no heterogeneity in control measures, R c is a fraction of R 0 and this fraction is In the presence of frailty with a non-degenerated p.d.f. ξ(z), replacing e −φx with L[ξ](φx) > e −φx , the following inequality is established where R c (φ) reflects the efficacy in the idealized situation and R v (φ) reflects the effectiveness in the presence of frailty when the intervention is applied in a large population. The inequalities are mathematical phrases that qualitatively describe the conventional wisdom, that, the control measure is most effective if applied homogeneously across all individuals.

2.2.
Variability of the frailty parameter according to convex order. Unobservable heterogeneity is measured by variability for z. The order of variability for two non-negative random variables is described based on "marjorization" and defined in [11]. If X 1 and X 2 have equal mean value µ 1 = µ 2 (should they exist), corresponding to p.d.f. f 1 and f 2 , respectively, the verbal description for X 2 being more dispersed (spread out) than X 1 is about the change of signs between. f 1 and f 2 and their corresponding survival functions F 1 and F 2 as shown in Figure 2 This is the definition of the convex order X 1 ≤ cv X 2 . showing that X 2 is more "spread out" than X 1 .
Definition 2.1. X 1 ≤ cv X 2 if and only if µ 1 = µ 2 plus the following two statements: has two sign changes and the sign sequence is: has one sign change and the sign sequence is: +, −.
By this definition, the larger the distribution in convex order, the more it "spreads out" around its mean value. It is mathematically equivalent [11] to the following definition.
for which these expectations exist.
The convex order is stronger than variance comparison as the choice of a convex function Ψ(x) = x 2 leads to the conclusion that The following identity, first proven in [5], It can be shown that L[f G ](z) is log-convex with respect to z. This gives additional insight into (5), that, the more variable the frailty z according to convex order, the larger the value of R v (φ).

2.3.
Scale invariance with respect to disease natural history. The natural history in the current context refers to assumptions in f G , which include the existence of the latent period and the distributions of the latent and the infectious periods. Assuming f * G (y) is the p.d.f. for G according to the standard time scale λ = 1, on a transformed time by a scale parameter λ, x = y/λ, f G (x; λ) = λf * G (λx). Because the Laplace transform of a p.d.f. of a non-negative random variable is a survival function (arising from the mixture of an exponential survival function with the p.d.f. as the mixing distribution, see [11] Both R v (φ) and R c (φ), divided by R 0 , are plotted in Figure 3, re-scaled by λ. The following statements hold.
It is a survival function (of φ) constructed from a mixture of the exponential survival function e −sy with the mixing function f * G (y). The threshold condition R 0 L[f G ](ρ) = 1 is equivalent to say that ρ is the 1 − R −1 0 th percentile corresponding to this survival function. This percentile is unchanged with respect to the scale transformation x = y/λ.   Figure 3 are heavy tailed because the class of distributions with decreasing hazard rates are closed under mixtures (see [15], pp. 407-409 for details). L[f * G ](φ) arises from a mixture of the exponential distribution.

The dashed line is
. Both correspond to decreasing hazard functions. Because ρ and ρ are the 1 − R −1 0 th percentiles corresponding to these survival functions, if R 0 is large, these threshold values take place on the far right end of these tails. Numerical results will show that, when the variance of z is getting large, the ratio ρ /ρ increases steeply, highlighting the difficulty in achieving the same objective if there is frailty in the implementation of intervention measures.

Robustness of
The equation (8) shows that R v (ρ) is a function of (R 0 , f G ), of which, f G incorporates assumptions about whether there is a latent period and assumptions of distributions for the latent and the infectious periods. The integrand in (8), is given by (9).
G(z; R 0 , f G ) is a log-convex function of z, monotonically decreasing, satisfying: Figure 4 gives a schematic presentation. The dependence on f G mainly shows on the right hand tails. Another log-convex function with a simple Pareto form has the same features as G(z; R 0 , f G ) but no longer depends on f G . On the other hand, the p.d.f. ξ(z) for frailty, satisfying E[z] = 1, tends to decrease to zero quickly when z is getting large, such as the Gamma distribution given below. Intuitively, and is expected to be in close agreement. For numerical demonstration, we choose the Gamma distribution The Laplace transform is L[ξ](s) = (1 + sx) −1/v . As illustrated in Figure 5, the shape of ξ(z; v) changes dramatically at v = 1. When 0 < v < 1, z is "more or less" concentrated at E[z] = 1. One may loosely call this as "almost homogeneous" in the control measure with some mild variability. When v > 1, it is "highly heterogeneous".
When ξ(z) is given by (11), For any x, F (f railty) (x) is an increasing function of v. The variance v ranks the frailty variable z according to stochastic order. According to (5), We also compare it with the approximation G(z; R 0 , f G ) will be specified by assumptions for f G and G * (z; R 0 ) is given by (10).
We restrict discussions to distributions of the latent and the infectious periods that their Laplace transforms can be expressed explicitly.
3.1. In the absence of a latent period. Assuming the infectious period has a finite mean µ, f G (x) = 1 µ F I (x). According to (9), We consider two parametric models for the infectious period, both with mean µ and var[T I ] = θµ 2 . The first model is the Gamma distribution. In this case, where y(R 0 , θ) = ρθµ. It is a function of θ and R 0 because the threshold ρ must satisfy R0 ρµ 1 − (1 + ρθµ) −1/θ = 1. Then (13) is written as where The second model is the inverse-Gaussian distribution. In this case, In a similar manner, the expression for (15) is and (13) is expressed as When θ → 0, the infectious period degenerates to a fixed point µ and In this case, The corresponding expression for (9)is The Gamma distribution includes the exponential distribution as a special case when θ = 1, the inverse-Gaussian distribution does not include the exponential distribution as a special case. G Gamma (z) and G inv−Gaussian (z) can be numerically calculated for each R 0 , θ. Figure 6 presents the case when R 0 = 3 with θ ranging from 0 (i.e. the special case G Deg (z)) to θ = 10. They include a wide range of the infectious period distributions for a constant point µ with no random variation to very large variance 10µ 2 . Figure  6 also includes the Pareto function G * (z; R 0 ) given by (10) as the approximation, shown as the dark thick line.
In spite of the differences shown in Figure 6 when z becomes large, the product G Gamma (z) z  Figure 7 shows close agreements among the integrands in (17) and (19) in a very wide range of θ at each v. One expects R v (R 0 , θ), as areas covered by these curves, to be robust with respect to f G and in close agreement with the approximation R * v (R 0 ) given by (14). Table 1 shows the numerical results for θ = 0, 0.2, 1, 2 and 10 at R 0 = 2 and R 0 = 3. Although R v (R 0 , θ) depends on the choice of the distribution family as well as the shape parameter θ, they all fall into a very narrow range around R * v (R 0 ) for each R 0 and v, and R * v (R 0 ) does not involve any assumption of the distribution f G . Incidentally, G * (z; R 0 ) given by (10) is a special case of G Gamma (z) when θ = 1, that is, exponentially distributed infectious period.  (14) along with R v (R 0 , θ) with respect to Gamma and inverse-Gaussian distributed infectious period at R 0 = 2 and R 0 = 3, noticing that R * v (R 0 ) is identical to Gamma distributed infectious period with θ = 1.

3.2.
In the presence of a latent period. In this case, f G (x) is the convolution between f E (x) and f W (x) = 1 µ F I (x). From (8) and (13) R where θ may involve parameters that specifies the latent and the infectious period distributions. For notations, µ E is the mean latent period and θ E is the shape parameter of the latent period distribution. Without index, µ is the mean infectious period and θ is the shape parameter of the infectious period distribution. For the chosen distributions of the latent period and infectious period with explicit Laplace transforms, the variance of the latent period is θ E µ 2 E and the variance of the infectious period is θµ 2 . In the following combination of latent period and infectious period distributions, L[f E ](s) and L[F I ](s) have analytic forms and (21) can be straightforwardly computed.

Gamma latent period + Gamma infectious period (including degenerated and exponential distributions)
: where µ E is the mean latent period and θ E is the shape parameter of the latent period distribution and µ is the mean latent period and θ is the shape parameter of the infectious period distribution.

Inverse-Gaussian latent period + inverse-Gaussian infectious period
Without repeating the investigation for robustness in the previous subsection, numerical results from selected special cases when both the latent and infectious period distribution are chosen from the Gamma family are presented.

3.3.
Numerical results for special cases. For these numerical results, 16 distribution models for f G will presented at two level of R 0 . In total, there are 32 sets of special cases.
The first 4 models assume Gamma distributed infectious period without the latent period. In particular, The p.d.f. for the infectious period T I is involving the incomplete Gamma function. The mean value for G is E The shapes of f G (x) is illustrated in Figure 8, which includes the following two special cases. 1. When θ → 0, T I has a degenerated distribution with mass at a constant µ. Figure 8 is µ −1 . Numerical calculation for R v (ρ), with ξ(z; v) defined in (11) is given by (17) for 0 < θ < ∞. Table 1 contains results corresponding to θ → 0, θ = 1 and θ = 2. The next 12 models include a latent period so that G = T E + W. Four special cases are considered: (i) constant latent period+constant infectious period, (ii) exponentially distributed latent period + exponentially distributed infectious period, (iii) exponentially distributed latent period + constant infectious period, and (iv) exponentially distributed latent period + constant infectious period. The mean latent period is parameterized as µ E = lµ, l ≥ 0 and µ is the mean infectious period.

The scale parameter in
The shapes of f G (x) from combinations of assumed latent period and infectious period distributions are displayed in Figure 9 (a)-(c) with respectively assumed relative length of the average latent period to the average length of the infectious period l = 0.5, 1 and 2. Figure 9. Shapes of f G (x) in the special cases in the presence of a latent period.
Numerical calculation for R v (ρ), with ξ(z; v) given by (11), can be derived from (21) applied to each of the special cases. Detailed calculation formula and corresponding tables are given in Appendix. The left panel of Figure 10 provides a comparison of the results for R v (ρ) at v in the range from 0 to 4 by 0.25 increment. For each v, there are 16 points (in black) corresponding to R 0 = 3 and 16 points (in blue) corresponding to R 0 = 2. The trends representing R v (ρ) as functions of v are calculated as average and plotted as lines. It shows that assumptions on the structure (e.g. with or without latent period), on the types of distributions of these periods as well as the relative lengths between the latent and the infectious periods have little influence on R v (ρ). The predominant parameters are R 0 and v. For example: If R 0 = 3, a control measure at intervention rate ρ that in theory could have made the epidemic under control (i.e.

R c (ρ) = 1), may result in an outbreak as if manifested by a reproduction number
The right panel of Figure 10 is analogous to the left panel, representing the final size η through the approximate final size equation 1 − η = exp(−R v η). The final size, as a measure, is only applicable in certain epidemics, typically without replacement of the susceptible population and recovered individuals have immunity. However, unlike R v (ρ), which is mainly a theoretical parameter, the final size (where applicable) is also an observable quantity, also known as the infection attack rate. Once again, assumptions on the structure (e.g. with or without latent period), on the types of distributions of these periods as well as the relative lengths between the latent and the infectious periods have little influence on η.
Remark 1. It is well known that univariate frailty models are not identifiable from the survival information alone. In the current context, the true value of control rate φ is not identifiable from v in ξ(z; v). Meanwhile,.the basic reproduction number R 0 is a theoretical value that is difficult to estimate. The observed (or estimated) parameters during or after the outbreak, such as R v or η, cannot identify R 0 , φ and v. If one follows some guidelines with a given control rate at a threshold φ = ρ with proven (in theory) efficacy to get the epidemic under control and if one has observed an outbreak the ends with final size about 35% of the population, it could arise from an outbreak with R 0 = 3 and the control measure implemented with relatively good but not perfect adherence at v = 0.5, or from an outbreak with R 0 = 2 and the control measure implemented with poor adherence at v = 1.25, or an outbreak with R 0 = 1.23 with no intervention at all. An outbreak with infection attack rate 35% is nonetheless a large outbreak. In the absence of knowledge of R 0 along with unobservable frailty, it leaves impression as if the control measure that looks good on paper "does not work at all" in practice. Figure 11 echoes the comments made at the end of Section 2 that, if R 0 is large, the control thresholds ρ and ρ take place on the far right end tails of the survival functions corresponding to L[f * G ](φ) and ∞ 0 L[f * G ](φz)ξ(z)dz and both are heavy tailed. Consequently, the ratio ρ /ρ increases steeply if the variance of z is getting large. Of all the 32 cases examined, the predominant parameters are R 0 and v. The assumptions on distributions of the latent period and/or infectious period have very little effect on the quantitative results. However, the assumption on whether there is a latent period does have some influence. Generally speaking, the presence of a latent period somewhat make the control efforts easier. The ratio ρ /ρ does not depend on the time scale on disease progression. Figure 11 demonstrates that it is important to ensure strong adherence to minimize the frailty. When v > 1, in order to get the outbreak under control, the level of difficulty, represented by the ratio ρ /ρ, increases steeply. For instance, at v = 4, it requires increasing to the control threshold by 7 folds at R 0 = 2 and by 20 folds at R 0 = 3.

4.1.
On the use of frailty model versus more structured models for intervention heterogeneity. The idealized situation, assuming homogeneous intervention effect at rate φ on all individuals, is the null hypothesis. The frailty model by introducing a random effect z with E[z] = 1 is one of the many alternative hypotheses. One could argue the choice of this alternative hypothesis versus more structured hypotheses, such as spatial structure and time-dependence. Indeed, different alternative hypotheses need to be evaluated and some need more complex models. They also lead to different answers to different questions.
The frailty model addresses unobservable (and even un-quantifiable) heterogeneity that could arise in many situations, such as non-adherence (e.g. condom use), "leakage" (e.g. imperfect quarantine or isolation), or, in a prophylactic intervention, individuals may not take the prescribed dose of the medication provided. The frailty model in this paper provides some qualitative insights. More in-depth questions could be posed in each of the situation as alternative hypotheses and targeted models can be developed and applied. Even with that, there will still be unobservable part of heterogeneity and some random effect model may need to be added on top of the structured models. Take the condom use example, condom coverage can be explicitly modelled by space and time, yet there is still unobservable heterogeneity between and within individuals in adherence.
A related issue is the dependence between individuals. Under the null hypothesis, the intervention is applied homogeneously, and independently, on all individuals. Alternatively, one could question about independency and model correlation such as in a spatial structured model. The frailty model also introduces dependency. If a random effect ξ(z) is introduced in F (f railty) (x) = ∞ 0 e −zφx ξ(z)dz, the intervention is no longer applied independently on all individuals. The proof of this is similar to the proof that a mixed Poisson process no longer preserves the independent increment property. Structured models can explicitly describe dependency and provide answers to targeted questions. The frailty model addresses the unobservable aspect and provides qualitative insight for more general questions.
An important aspect of model is to order thoughts and sharpen vague intuitive notions. Empirical wisdom has told us that the control measure is most effective if applied homogeneously across all individuals and the more variable in adherence, the less the effectiveness. The first half of the sentence is mathematically expressed by (5). The notion variability is vague and intuitive. Definition 2.1 provides a verbal description and an illustration of the general concept of variability, which is sharpened in its mathematical equivalence, Definition 2.2. In return, variability according to Definition 2.2 leads to the order of corresponding to the second half of the sentence.
The findings that R v (ρ), along with its derived quantities, is scale invariant and robust to the disease progression natural history f G have wide application importance. Although quantitative analysis is used to investigate the robustness, the emphasis of the value of these results should be on the qualitative aspect. The frailty model is aimed to draw general conclusions and is complementary to models for specific questions on spatial-temporal heterogeneity and spatial dependence. 4.2. Non-identifiability and challenges in the design of intervention studies. The non-identifiability problem in Remark 1 poses challenges in the design of intervention studies at the population level. They are confounding factors that hinders the ability to distinguish the 'pure' impact of the intervention without the distorting influence of compliance from the effectiveness since some level of non-compliance are likely. Both the pure impact (efficacy) and population level effectiveness are important objectives in intervention studies. 4.3. Relation to other disciplines in science. The concept of frailty goes back to Greenwood and Yule [4] on accident proneness. The term frailty was introduced in demography [18]. It has been widely used to model heterogeneity in life insurance [13]. As extensions of the proportional hazard model [7], frailty models and are widely used in clinical applications where the study population must be considered as a heterogeneous sample, i.e. a mixture of individuals with different hazards.
Application of frailty models has been employed in a growing number of empirical works on a large variety of themes, including scale-free networks and dynamic systems. Examples include the population of cities, the study of stock markets, DNA sequences, family names, human behavior, geomagnetic records, train delays, reaction kinetics, air travel networks, hydrological phenomena, earthquakes, world bank records, voting processes, internet, citation of scientific papers, among others. A brief review of this distribution and a list of references of these studies are provided in Picoli et al. [14]. This manuscript adds another application field in this growing theme.

Constant infectious period, no latent period. The survival function for
of which, ρ is calculated by solving R 0 (1 − e −ρµ ) = ρµ. The scale parameter for the uniform distribution U [0, µ] is λ = µ −1 . Given any µ, the threshold ρ is scaled according to µ −1 . R v (ρ) is invariant to the length of the infectious period. With respect to R 0 = 2 and R 0 = 3, the threshold ρ is calculated as ρ = 1.594/µ at R 0 = 2 and ρ = 2.821/µ at R 0 = 3. Results are presented in Table 2. Exponentially distributed infectious period, no latent period. In this case, G = W is identically distributed as the infectious period T I with f G (x) = 1 µ e −x/µ . The scale parameter is µ −1 . The threshold ρ = µ −1 (R 0 − 1) is a function of R 0 , scaled according to µ −1 . For each of the two scenarios R 0 = 2 and R 0 = 3, ρ = 1/µ at R 0 = 2 and ρ = 2/µ at R 0 = 3. R v (ρ) is calculated according to The quantities are presented in Table 3. Gamma distributed infectious period, variance/mean ratio =2 Gamma distributed infectious period, variance/mean ratio =0.5  Table 4. Gamma distributed infectious period in four scenarios.
Gamma distributed infectious period, no latent period. The calculation of R v as a function of R 0 and θ was given by (17) and extensively covered in the previous section. R v (θ; R 0 ) according to selected values for θ and v, stratified by R 0 = 2 and R 0 = 3 presented in Table 1. Table 4 add the final size η (when applicable) and the ratio ρ /ρ for the following two scenarios: 1. T I with larger than exponential variance at θ = 2. The threshold is calculated as ρµ = 0.719 at R 0 = 2 and ρµ = 1.5 at R 0 = 3, respectively. 2. T I with smaller than exponential variance at θ = 0.5. The threshold is calculated as ρµ = 1.236 at R 0 = 2 and ρµ = 2.372 at R 0 = 3, respectively.
Constant latent period+constant infectious period. In this case, T E ≡ lµ and T I ≡ µ without random variation. Although the sum of the latent and the infectious period is constant, the sum G = T E + W is random, which is the constant lµ plus an uniformly distributed W between 0 and µ. The mean of G is When l is given, the threshold ρ is scaled according to µ −1 . At the threshold ρ, R v (ρ) is calculated by Although l+1 l [1 + (ρµ) yv] −1/v dy does not give explicit mathematical forms, it is easy to evaluate numerically. The preserved quantity is the product ρµ. Let us consider l = 0.5, 1 and 2 at R 0 = 2 and R 0 = 3. At l = 0.5, ρ = 0.714/µ when R 0 = 2 and ρ = 1.153/µ when R 0 = 3; at l = 1, ρ = 0.468/µ when R 0 = 2 and ρ = 0.748/µ when R 0 = 3; at l = 2, ρ = 0.279/µ when R 0 = 2 and ρ = 0.443/µ when R 0 = 3. Numerical results under each of these are displayed in Table 5.

PING YAN
The latent period is half the infectious period: l = 1/2 R 0 = 2, l = 0.5, The latent period equals to the infectious period: l = 1 The latent period is twice the infectious period: l = 2 The mean value of G is E[G] = (l + 1) µ. The threshold condition is R0 (1+lρµ)(1+ρµ) = 1. When l is given, the threshold ρ is scaled according to µ −1 . At the threshold ρ, R v (ρ) is calculated by The quantity ρµ is preserved. In addition, the relative lengths of the latent period and the infectious period can be reversed, without changing the numerical results Mean latent period equals to the mean infectious period: l = 1 Either  Table 6. Exponentially distributed latent period and exponentially distributed infectious period.
Constant latent period+exponentially distributed infectious period. The mean value for G = T E + W is E[G] = (l + 1) µ with p.d.f.
The preserved quantity is ρµ. Results are given in Table 7.
Exponentially distributed latent period + constant infectious period. In this setting, G = T E + W has the mean E[G] = l + 1 2 µ with p.d.f.