A log-Gaussian Cox process with sequential Monte Carlo for line narrowing in spectroscopy

    *Corresponding author: Teemu Härkönen 
    We propose a statistical model for narrowing line shapes in spectroscopy that are well approximated as linear combinations of Lorentzian or Voigt functions. We introduce a log-Gaussian Cox process to represent the peak locations, thereby providing uncertainty quantification for the line narrowing. The Bayesian formulation of the method allows for robust and explicit inclusion of prior information as probability distributions for the parameters of the model. Estimation of the signal and its parameters is performed using a sequential Monte Carlo algorithm, followed by an optimization step to determine the peak locations. Our method is validated using a simulation study and applied to a mineralogical Raman spectrum.

    Mathematics Subject Classification: Primary: 62F15, 62L12; Secondary: 78M31.


    Figure 1.  On top, a spectrum (blue) consisting of $ N = 3 $ Lorentzian line shapes centered at $ ( l_1, l_2, l_3)^T $, shown in red, yellow, and purple, respectively. Successful line narrowing, or deconvolution, would yield three individual Dirac delta functions located at $ ( l_1, l_2, l_3)^T $. The aim of this paper is to construct approximate samples of the Dirac delta functions using linear prediction, which are then modelled as a log-Gaussian Cox process
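    A spectrum of the kind described in Figure 1 can be sketched numerically as follows. This is a minimal illustration only: the area-normalised Lorentzian parameterisation, the wavenumber grid, and the values of the amplitudes, locations, and $ \gamma $ are assumptions chosen for the example and are not taken from the paper.

```python
import numpy as np

def lorentzian(nu, location, gamma):
    """Area-normalised Lorentzian; gamma is the half-width at half-maximum.

    This parameterisation is an assumption made for illustration.
    """
    return (gamma / np.pi) / ((nu - location) ** 2 + gamma ** 2)

def spectrum(nu, amplitudes, locations, gamma):
    """Linear combination of Lorentzians that share a single width gamma."""
    nu = np.asarray(nu, dtype=float)[:, None]            # shape (n_points, 1)
    a = np.asarray(amplitudes, dtype=float)[None, :]     # shape (1, N)
    l = np.asarray(locations, dtype=float)[None, :]      # shape (1, N)
    return np.sum(a * lorentzian(nu, l, gamma), axis=1)  # shape (n_points,)

# Hypothetical values for N = 3 line shapes (not from the paper).
nu = np.linspace(0.0, 1000.0, 2001)
locations = np.array([300.0, 500.0, 560.0])
amplitudes = np.array([1.0, 0.6, 0.8])
gamma = 15.0

y = spectrum(nu, amplitudes, locations, gamma)
```

    Line narrowing aims to recover `locations` from `y`; the two peaks at 500 and 560 overlap, which is the situation where narrowing the line shapes helps.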

    Figure 2.  On top, a summary of the distribution of posterior samples from $ \pi( \mathit{\boldsymbol{x}}_{\text{LN}} \mid \boldsymbol{y}) $ for the spectrum in Figure 1. In the middle, marginalized posterior samples according to Eq. (20) and the corresponding LGCP estimate. At the bottom, the posterior $ \pi( \mathit{\boldsymbol{l}} \mid \mathit{\boldsymbol{z}}) $ for the line shape locations $ \mathit{\boldsymbol{l}} $ obtained by sampling the LGCP local maxima

    Figure 3.  Simulation-based calibration histogram for the rank statistics of the true number of peaks $ N^*_s $. The number of peaks was estimated as the number of local maxima in samples from the LGCP fits, using a GP length scale of $ 0.025 $. The histogram is consistent with a uniform distribution. The solid black line shows the expected value and the shaded gray areas show the 99% confidence intervals
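    The two ingredients named in the Figure 3 caption, counting local maxima in a sampled intensity and computing a rank statistic for the true number of peaks, can be sketched as below. The function names, the test curve, and the random tie-breaking rule for the discrete rank are assumptions made for this illustration; the paper's own implementation may differ.

```python
import numpy as np

def count_local_maxima(f):
    """Count strict interior local maxima of a 1-D array."""
    f = np.asarray(f, dtype=float)
    interior = f[1:-1]
    return int(np.sum((interior > f[:-2]) & (interior > f[2:])))

def sbc_rank(true_value, draws, rng):
    """Rank of true_value among posterior draws.

    Ties are broken uniformly at random, a common choice for discrete
    quantities such as a peak count (an assumption here).
    """
    draws = np.asarray(draws)
    below = int(np.sum(draws < true_value))
    ties = int(np.sum(draws == true_value))
    return below + int(rng.integers(0, ties + 1))

# Hypothetical intensity sample with two well-separated bumps.
x = np.linspace(0.0, 1.0, 501)
f = (np.exp(-0.5 * ((x - 0.2) / 0.025) ** 2)
     + np.exp(-0.5 * ((x - 0.7) / 0.025) ** 2))
n_peaks = count_local_maxima(f)

# Hypothetical peak counts from posterior samples, ranked against a true count of 3.
rng = np.random.default_rng(0)
rank = sbc_rank(3, [1, 2, 4, 5], rng)
```

    Repeating the rank computation over many simulated data sets and plotting a histogram of the ranks gives the kind of calibration check shown in the figure; a well-calibrated posterior yields approximately uniform ranks.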

    Figure 4.  At the top, posterior distributions for the line shape parameter $ \gamma $ and the number of line shapes $ N $, with the true parameter values used to generate the synthetic spectrum shown in red. At the bottom, the corresponding synthetic spectrum (in blue) and the corresponding location posterior $ \pi( \mathit{\boldsymbol{l}} \mid \mathit{\boldsymbol{z}}) $ (in red). Black dots denote the locations used to generate the spectrum

    Figure 5.  At the top, posterior distributions for the line shape parameter $ \gamma $ and the number of line shapes $ N $. At the bottom, the observed Raman spectrum of anorthite (in blue) and the corresponding location posterior $ \pi( \mathit{\boldsymbol{l}} \mid \mathit{\boldsymbol{z}}) $ (in red). Black dots denote the peak locations reported in the literature [10]

    Table 1.  Prior distributions for the Lorentz and Voigt line shape parameters $ {\mathit{\boldsymbol{\theta}}} $, the Fourier self-deconvolution cut-off parameter $ M $, and the GP covariance parameters $ \psi $

    | Prior | Lorentz | Voigt |
    |---|---|---|
    | $ \pi_0(\gamma) $ | $ \mathcal{U}( 1, 30) $ | $ \mathcal{U}( 1, 30) $ |
    | $ \pi_0(\sigma \mid \gamma) $ | NA | $ \mathcal{N}_+( 0.5 \times \gamma, (0.05 \times \gamma)^2 ) $ |
    | $ \pi_0(M) $ | $ \mathcal{U}( 10, 80) $ | $ \mathcal{U}( 10, 80) $ |
    | $ \pi_0(\log\{\sigma_\lambda\}) $ | $ t( 0,100^2, 10) $ | $ t( 0.01,100^2, 10) $ |
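    Drawing from the Voigt column of Table 1 could look like the sketch below. Several readings are assumptions: $ \mathcal{N}_+ $ is taken to be a normal distribution truncated to positive values, the arguments of $ t(\cdot, \cdot, \cdot) $ are read as location, squared scale, and degrees of freedom, and $ M $ is treated as continuous. The function name and dictionary keys are hypothetical.

```python
import numpy as np

def sample_voigt_priors(n, rng):
    """Draw n independent samples from the Voigt-column priors of Table 1."""
    gamma = rng.uniform(1.0, 30.0, size=n)

    # sigma | gamma ~ N_+(0.5 * gamma, (0.05 * gamma)^2), sampled by rejection.
    mean, sd = 0.5 * gamma, 0.05 * gamma
    sigma = rng.normal(mean, sd)
    bad = sigma <= 0.0
    while np.any(bad):
        sigma[bad] = rng.normal(mean[bad], sd[bad])
        bad = sigma <= 0.0

    # Cut-off parameter, treated as continuous here (an assumption).
    M = rng.uniform(10.0, 80.0, size=n)

    # log(sigma_lambda) ~ t(location=0.01, scale^2=100^2, dof=10), as read above.
    log_sigma_lambda = 0.01 + 100.0 * rng.standard_t(10, size=n)

    return {
        "gamma": gamma,
        "sigma": sigma,
        "M": M,
        "log_sigma_lambda": log_sigma_lambda,
    }

rng = np.random.default_rng(0)
prior_draws = sample_voigt_priors(1000, rng)
```

    For the Lorentz column, the same sketch applies with the $ \sigma $ draw omitted and the $ t $ location set to 0.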
