Estimating the division rate from indirect measurements of single cells

Is it possible to estimate how a growing and dividing population depends on a given trait when this trait is not directly accessible to experimental measurement, by using measurements of another variable instead? This article addresses this general question for a recent and popular model of bacterial growth, the so-called incremental or adder model. In this model, the division rate depends on the increment of size between birth and division, whereas the most accessible trait is the size itself. We prove that the division rate can be estimated from size measurements, state a reconstruction formula first in a deterministic and then in a statistical setting, and solve the problem numerically on simulated and experimental data. Although this is a severely ill-posed inverse problem, the numerical results are satisfactory.

Mathematics Subject Classification: Primary: 35R30, 92B05; Secondary: 35Q62, 62G30, 45Q05.
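
To make the model in the abstract concrete, the following minimal sketch simulates one cell lineage under the adder model. It rests on assumptions that are not stated in the text above: $B$ is treated as a hazard rate per unit of size increment, so that the increment at division $A$ satisfies $P(A > a) = \exp\left(-\int_0^a B(s)\,ds\right)$; the rate is $B(a) = a^2$, as in the numerical protocols; and each cell divides into two equal halves. The function names are hypothetical.

```python
import random


def sample_increment(rng):
    """Draw the size increment at division for B(a) = a**2.

    Assumption: B is a hazard per unit of increment, so the survival
    function is exp(-a**3 / 3) and inverse-transform sampling applies.
    """
    e = rng.expovariate(1.0)         # E ~ Exp(1)
    return (3.0 * e) ** (1.0 / 3.0)  # solves a**3 / 3 = E


def simulate_lineage(n_generations, birth_size=1.0, seed=0):
    """Return (birth sizes, increments) along one lineage."""
    rng = random.Random(seed)
    sizes, increments = [], []
    x = birth_size
    for _ in range(n_generations):
        a = sample_increment(rng)
        sizes.append(x)
        increments.append(a)
        x = (x + a) / 2.0            # assumed division into two equal cells
    return sizes, increments


sizes, increments = simulate_lineage(20000)
mean_increment = sum(increments) / len(increments)
mean_birth_size = sum(sizes) / len(sizes)
```

Under these assumptions the mean increment is $3^{1/3}\,\Gamma(4/3) \approx 1.29$, and the mean birth size settles near the mean increment. The inverse problem of the article corresponds to recovering $B$ when only sizes, and not increments, are observed.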

Figure 1.  Protocol 1 – Reconstruction of $B$ when both $U_{B,x}$ and $\mathcal{L}_B$ are (almost) exactly known. The oracle choice for $h$ gives us the value 1/4.75

Figure 2.  Results of Protocols 1 and 2. $x$ stands for size, $\xi$ for frequency and $a$ for increment of size. Estimation of the division rate $B(a) = a^2$ as a function of the increment of size $a$ (Subfigure 2g), and of all the necessary intermediate functions (Subfigures 2a to 2b)

Figure 3.  Protocol 2 – Reconstruction of $B$ when $U_{B,x}$ is (almost) exactly known but not $\mathcal{L}_B$. The oracle choice for $h$ gives us the value 1/5

Figure 4.  Protocol 3 – Reconstruction of $B$ when $U_{B,x}$ is reconstructed from $X_1,\ldots,X_n$ i.i.d. $\sim$ $U_{B,x}$ but $\mathcal{L}_B$ is (almost) exactly known. The oracle choice for $h_3$ gives us values that range between $1/3.25$ for $n = 500$ and $1/4.75$ for $n = 50\; 000$. We set $\varpi_n = 1/n$

Figure 5.  Results of Protocol 3 for $n = 2000$ and $M = 100$ Monte Carlo samples. $x$ stands for size, $\xi$ for frequency and $a$ for increment of size. Estimation of the division rate $B(a) = a^2$ as a function of the increment of size $a$ (Subfigure 5e), and of intermediate functions (Subfigures 5a to 5d). In beige, the zone where 95% of the 100 Monte Carlo samples lie

Figure 6.  Results of Protocols 3 (left) and 4 (right) – Estimation of the division rate $B(a) = a^2$ as a function of the increment of size $a$ for different $n$ (from top to bottom: 500, 5 000, 10 000) and $M = 100$ Monte Carlo samples (the beige zone contains 95% of the 100 reconstructions). The result improves for $x \lesssim 2$ when $n$ increases but remains very poor for larger $x$; the difference between the blue curve (Protocol 1 or 2) and the beige zone shows the influence of the sampling noise. However, Protocol 4 does not significantly worsen the results of Protocol 3
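
The 95% zones mentioned in the captions of Figures 5 and 6 can be illustrated with a short sketch. It assumes that the zone is built pointwise, from the 2.5% and 97.5% quantiles of the $M = 100$ reconstructions at each grid point; the captions do not say how the zone is actually computed. The curves below are synthetic stand-ins ($a^2$ plus artificial Gaussian noise), not reconstructions from the article.

```python
import random
import statistics


def pointwise_band(curves):
    """Pointwise 2.5% and 97.5% quantiles of curves given on a common grid."""
    low, high = [], []
    for values in zip(*curves):                    # all curves at one grid point
        cuts = statistics.quantiles(values, n=40)  # cut points every 2.5%
        low.append(cuts[0])
        high.append(cuts[-1])
    return low, high


rng = random.Random(1)
grid = [0.02 * k for k in range(101)]              # increments a in [0, 2]
# Synthetic stand-ins for M = 100 reconstructions: a**2 plus artificial noise.
curves = [[a * a + rng.gauss(0.0, 0.1) for a in grid] for _ in range(100)]
low, high = pointwise_band(curves)

# Fraction of all curve values that fall inside the band.
inside = sum(lo <= v <= hi for c in curves for v, lo, hi in zip(c, low, high))
coverage = inside / (len(curves) * len(grid))
```

A pointwise band of this kind covers about 95% of the values at each grid point, which is a weaker statement than saying that 95% of the whole curves lie inside the zone.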

Figure 7.  Results of Protocols 3 and 4 – Reduction of the mean error over $M = 100$ samples (in log-scale) as a function of the sample size (from $n = 500$ to $n = 50\; 000$). Empirical errors are computed over the following regular grids: (a)-(e) $[0;6]$, $\Delta x = \tfrac{6}{500}$; (f)-(h) $[-10;10]$, $\Delta \xi = 0.05$; (i)-(j) $[0;2.25]$, $\Delta a = \tfrac{1}{\sqrt{n}}$; (k) $[0;2]$, $\Delta a = \tfrac{1}{\sqrt{n}}$

Figure 8.  Protocol 4 – Reconstruction of $B$ when both $U_{B,x}$ and $\mathcal{L}_B$ are reconstructed from $X_1,\ldots,X_n$ i.i.d. $\sim$ $U_{B,x}$. The parameter $h_1$ is automatically chosen by the kernel smoothing function $\text{ksdensity}$; $h_2$ is deduced from $h_1$. The oracle choice for $h_3$ gives us values that range between $1/3.25$ for $n = 500$ and $1/4.5$ for $n = 50\; 000$. We set $\varpi_n = 1/n$
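
The kernel smoothing step of Protocol 4 can be sketched as follows. This is a generic Gaussian kernel density estimator with a normal-reference bandwidth $h = \hat\sigma\,(4/(3n))^{1/5}$, written with the Python standard library only. It is not the $\text{ksdensity}$ routine itself, and whether that routine selects $h_1$ with exactly this rule is an assumption. The sample is a synthetic gamma-distributed stand-in, not a draw from $U_{B,x}$, and the function names are hypothetical.

```python
import math
import random
import statistics


def normal_reference_bandwidth(sample):
    """Bandwidth h = sigma * (4 / (3 n))**(1/5), tuned for Gaussian-like data."""
    n = len(sample)
    sigma = statistics.stdev(sample)
    return sigma * (4.0 / (3.0 * n)) ** 0.2


def gaussian_kde(sample, h):
    """Return the function x -> Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


rng = random.Random(2)
# Synthetic stand-in for size measurements X_1, ..., X_n (n = 2000).
sample = [rng.gammavariate(4.0, 0.5) for _ in range(2000)]
h1 = normal_reference_bandwidth(sample)
density = gaussian_kde(sample, h1)

# Riemann sum of the estimate on a regular grid over [-2, 10].
step = 0.05
total_mass = step * sum(density(-2.0 + step * k) for k in range(241))
```

The Riemann sum is a quick sanity check that the estimate integrates to approximately one on the chosen grid.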

Figure 9.  Results of Protocol 4 for $n = 2000$ and $M = 100$ Monte Carlo samples, with an initial division rate $B(a) = a^2$. From (a) to (j): successive steps of Protocol 4. We see that the main errors come from the estimation of the Fourier transform (Subfigures 9g and 9h)

Figure 10.  Testing the procedure on experimental data

Figure 12.  Speed of convergence of each step of Protocol 4 in a log-log scale

Figure 11.  Testing the procedure on experimental data
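
The convergence plots of Figures 7 and 12 display errors against the sample size $n$ on logarithmic scales. A common way to read such a plot is to fit the slope of $\log(\text{error})$ against $\log(n)$, which estimates the exponent $\alpha$ in a decay of the form $\text{error} \approx C\,n^{-\alpha}$. The sketch below does this by ordinary least squares. The error values are synthetic, generated from an exact $n^{-1/2}$ law purely for illustration; they are not results from the article.

```python
import math


def loglog_slope(ns, errors):
    """Least-squares slope of log(error) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


ns = [500, 2000, 5000, 10000, 50000]
# Synthetic errors decaying like n**(-1/2); NOT values from the article.
errors = [2.0 * n ** -0.5 for n in ns]
slope = loglog_slope(ns, errors)  # estimated exponent is -slope
```

For a severely ill-posed problem, the fitted slope for the final step would be expected to be much flatter than for the direct density estimation steps.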

Table 1.  Errors of Protocols 1 and 2 for the intermediate steps

| Reconstruction of | $\mathcal L_B$ | $\mathcal G_B$ | $\mathcal G^*_B$ |
|---|---|---|---|
| Numerical sampling | $[0;6]$, $\Delta x = \tfrac{6}{500}$ | $[0;6]$, $\Delta x = \tfrac{6}{500}$ | $[-50;50]$, $\Delta \xi = 0.05$ |
| Protocol 1 | - | - | - |
| Protocol 2 | 0.0478 | 0.0417 | 0.0417 |

Table 2.  Errors of Protocols 1 and 2 for $B$ as a function of the numerical sampling

| Reconstruction of | $B$ | $B$ |
|---|---|---|
| Numerical sampling | $[0;2]$, $\Delta a = 0.01$ | $[0;2.5]$, $\Delta a = 0.01$ |
| Protocol 1 | 0.0730 | 0.2065 |
| Protocol 2 | 0.0849 | 0.1321 |
