
Gaussian mixture models for clustering and calibration of ensemble weather forecasts

  • Most weather forecasting centers now produce ensemble forecasts, which provide information about the probability distribution of weather variables and give a more complete description of the atmosphere than a single run of the meteorological model. However, ensembles may suffer from bias and from under- or over-dispersion errors that need to be corrected, and these distribution errors may depend on the weather regime. In this paper, we propose various extensions of the Gaussian mixture model, and of its associated inference tools, for ensemble data sets. The proposed models are then used to identify clusters that correspond to different types of distribution errors. Finally, a standard calibration method known as Nonhomogeneous Gaussian Regression (NGR) is applied cluster by cluster to correct the ensemble forecast distributions. The proposed methodology is shown to be effective, interpretable and easy to use. The clustering algorithms are illustrated on simulated and real data. The calibration method is applied to real medium-range temperature and wind forecasts for 3 stations in France.

    Mathematics Subject Classification: Primary: 62P12, 62H30.

  • Figure 1.  Daily mean temperature at 2 m, Millau, January 2015, 6 pm (ECMWF). The solid line represents the observed time series. Boxplots represent the distribution of the ensemble forecasts at a 3-day horizon

    Figure 2.  Selection of the number of clusters based on the BIC criterion for the difficult case and various numbers of ensemble members. The true number of classes was set to $ K = 4 $. The sample size is fixed to $ n = 200 $

    Figure 3.  Root mean square errors with respect to the sample size for the regular case and ensemble size $ M = 50 $

    Figure 4.  Root mean square errors with respect to the ensemble size for the difficult case and sample size $ n = 200 $

    Figure 5.  Accuracy scores, with respect to the ensemble size, for the difficult case and sample size $ n = 200 $

    Figure 6.  Temperature variable - Predictive performance score (CRPSS) of proposed $ NGR_Z $ with respect to $ Raw $ and $ NGR $ for each station, hour, forecasting horizon, and for two clustering inputs: $ GMM_{uni} $ (clustering obtained from temperature ensembles) and $ GMM_{multi} $ (clustering obtained from temperature and wind components). Dashed line is the threshold to reach to outperform the model $ NGR_Z $

    Figure 7.  Wind variable (U component) - Predictive performance score (CRPSS) of proposed $ NGR_Z $ with respect to $ Raw $ and $ NGR $ for each station, hour, forecasting horizon, and for two clustering inputs: $ GMM_{uni} $ (clustering obtained from temperature ensembles) and $ GMM_{multi} $ (clustering obtained from temperature and wind components). Dashed line is the threshold to reach to outperform the model $ NGR_Z $

    Figure 8.  Rank histograms for 3-day temperature ensemble forecasts at Millau station (January, 6 pm). First column: raw ensembles grouped by univariate clusters; Second column: raw ensembles grouped by multivariate clusters; Third column: calibrated ensembles with the proposed $ NGR_Z $ method, using multivariate clusters. The dashed line is the density level that a uniform rank histogram would reach

    Figure 9.  Temperature at Millau, January 2015, 6 pm. First row: raw ECMWF ensemble forecasts (3-day horizon); Second row: calibrated ensemble forecasts with the proposed $ NGR_Z $ method; Black line: observations. Background colors highlight the different clusters inferred by the $ GMM_{multi} $ method
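The rank histograms of Figure 8 and the CRPSS values of Figures 6 and 7 can be illustrated with a small sketch. It assumes the usual definitions: the rank of an observation is the number of ensemble members strictly below it, the ensemble CRPS is computed from the empirical distribution of the members, and the skill score is one minus the ratio of model CRPS to reference CRPS. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def rank_histogram(ensembles: Sequence[Sequence[float]],
                   observations: Sequence[float]) -> List[int]:
    """Count how often the observation falls at each rank among the members."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)  # M members give M + 1 possible ranks
    for members, obs in zip(ensembles, observations):
        # Rank = number of members strictly below the observation (ties ignored).
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


def ensemble_crps(members: Sequence[float], obs: float) -> float:
    """CRPS of the empirical ensemble distribution: E|X - y| - 0.5 E|X - X'|."""
    m = len(members)
    term_obs = sum(abs(x - obs) for x in members) / m
    term_spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return term_obs - 0.5 * term_spread


def crpss(crps_model: float, crps_reference: float) -> float:
    """Skill score: positive when the model has a lower CRPS than the reference."""
    return 1.0 - crps_model / crps_reference


# Toy data, made up for illustration only.
toy_ensembles = [[1.0, 2.0, 3.0], [0.5, 1.5, 2.5], [2.0, 2.2, 2.4]]
toy_observations = [2.5, 0.0, 2.3]
print(rank_histogram(toy_ensembles, toy_observations))
```

A flat rank histogram suggests a calibrated ensemble, while a U shape suggests under-dispersion and a dome shape over-dispersion.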

    Table 1.  Parameters of the Gaussian mixture models for the simulation study

    | $ d $ | Case | Cluster 1 ($ \mu_1 $, $ \sigma_1 $) | Cluster 2 ($ \mu_2 $, $ \sigma_2 $) | Cluster 3 ($ \mu_3 $, $ \sigma_3 $) | Cluster 4 ($ \mu_4 $, $ \sigma_4 $) |
    |---|---|---|---|---|---|
    | 1 | Regular | 0, 1 | 2, 0.3 | 7, 1 | 10, 1 |
    | 1 | Difficult | 2, 2 | 3, 0.3 | 3, 1 | 4, 4 |
    | 3 | Regular | (0, 0, 0), 1 | (2, 0, 0), 0.3 | (7, 7, 7), 1 | (10, 10, 10), 1 |
    | 3 | Difficult | (2, 2, 2), 2 | (3, 3, 3), 0.3 | (3, 3, 3), 1 | (4, 4, 4), 4 |
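As a rough illustration of the kind of experiment summarised in Table 1 and Figure 2, the sketch below draws a plain sample from the one-dimensional regular case and selects the number of clusters by BIC. It uses the standard EM algorithm for a univariate Gaussian mixture, not the ensemble extensions proposed in the paper. Equal mixture weights, the random seeds, the candidate range of 1 to 6 clusters and all function names are assumptions made for this example.

```python
import math
import random


def log_normal_pdf(x, mu, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def e_step(xs, weights, mus, variances):
    """Return responsibilities and the observed-data log-likelihood."""
    resp, loglik = [], 0.0
    for x in xs:
        logs = [math.log(w) + log_normal_pdf(x, m, v)
                for w, m, v in zip(weights, mus, variances)]
        top = max(logs)
        lse = top + math.log(sum(math.exp(l - top) for l in logs))  # log-sum-exp
        loglik += lse
        resp.append([math.exp(l - lse) for l in logs])
    return resp, loglik


def fit_gmm_1d(xs, k, n_iter=100, tol=1e-6, seed=0):
    """Standard EM for a k-component 1-D Gaussian mixture."""
    rng = random.Random(seed)
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    weights = [1.0 / k] * k
    mus = rng.sample(list(xs), k)  # random data points as initial means
    variances = [var_all] * k
    trace = []                     # log-likelihood before each M-step
    for _ in range(n_iter):
        resp, loglik = e_step(xs, weights, mus, variances)
        trace.append(loglik)
        if len(trace) > 1 and trace[-1] - trace[-2] < tol:
            break
        # M-step; the small floors guard against empty or collapsed components.
        sizes = [max(sum(r[j] for r in resp), 1e-12) for j in range(k)]
        total = sum(sizes)
        weights = [s / total for s in sizes]
        mus = [sum(r[j] * x for r, x in zip(resp, xs)) / sizes[j] for j in range(k)]
        variances = [max(sum(r[j] * (x - mus[j]) ** 2
                             for r, x in zip(resp, xs)) / sizes[j], 1e-6)
                     for j in range(k)]
    _, final_loglik = e_step(xs, weights, mus, variances)
    return weights, mus, variances, final_loglik, trace


def bic(loglik, k, n):
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return -2.0 * loglik + n_params * math.log(n)


# Table 1, regular case, d = 1: (mu_k, sigma_k). Equal weights are an assumption.
REGULAR_1D = [(0.0, 1.0), (2.0, 0.3), (7.0, 1.0), (10.0, 1.0)]
rng = random.Random(42)
sample = [rng.gauss(*rng.choice(REGULAR_1D)) for _ in range(200)]

scores = {}
for k in range(1, 7):
    _, _, _, loglik, _ = fit_gmm_1d(sample, k)
    scores[k] = bic(loglik, k, len(sample))
best_k = min(scores, key=scores.get)
print("BIC-selected number of clusters:", best_k)
```

Because EM only finds a local maximum, a more careful experiment would restart each fit from several initialisations and keep the run with the highest likelihood.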

    Table 2.  Estimated marginal parameters of GMM for Millau 6 pm, 3 days forecasting. $ GMM_{uni} $ clusters are obtained from temperature ensembles only; $ GMM_{multi} $ clusters are obtained from temperature and wind components

    | Variables | Parameters | $ GMM_{uni} $ cluster 1 | $ GMM_{uni} $ cluster 2 | $ GMM_{uni} $ cluster 3 | $ GMM_{multi} $ cluster 1 | $ GMM_{multi} $ cluster 2 | $ GMM_{multi} $ cluster 3 |
    |---|---|---|---|---|---|---|---|
    | TMP | $ \mu_k $ | $ -2.21 $ | $ 1.68 $ | $ 5.51 $ | $ -1.30 $ | $ 3.19 $ | $ 3.39 $ |
    | TMP | $ \sigma^2_k $ | $ 3.33 $ | $ 2.42 $ | $ 3.47 $ | $ 4.38 $ | $ 5.05 $ | $ 4.42 $ |
    | TMP | $ \pi_k $ | $ 0.25 $ | $ 0.39 $ | $ 0.36 $ | $ 0.32 $ | $ 0.33 $ | $ 0.35 $ |
    | U10 | $ \mu_k $ | | | | $ 2.39 $ | $ -0.40 $ | $ 2.56 $ |
    | U10 | $ \sigma^2_k $ | | | | $ 2.70 $ | $ 6.80 $ | $ 6.70 $ |
    | V10 | $ \mu_k $ | | | | $ -1.26 $ | $ 0.51 $ | $ -0.98 $ |
    | V10 | $ \sigma^2_k $ | | | | $ 3.29 $ | $ 2.42 $ | $ 2.63 $ |

    Table 3.  Millau January 2015 at 6 pm, fitted temperature NGR-Z coefficients for each cluster

    | Clusters | Mean intercept | Mean slope | Var intercept | Var slope |
    |---|---|---|---|---|
    | 1 | $ 1.28 $ | $ 1.03 $ | $ 1.00 $ | $ 1.00 $ |
    | 2 | $ 0.55 $ | $ 0.87 $ | $ 0.21 $ | $ 1.04 $ |
    | 3 | $ 1.52 $ | $ 0.72 $ | $ 0.42 $ | $ 0.81 $ |
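To show how cluster-wise coefficients such as those in Table 3 might be used, the sketch below turns a raw ensemble into a calibrated Gaussian forecast. It assumes the usual NGR form, in which the predictive mean is an affine function of the ensemble mean and the predictive variance an affine function of the ensemble variance. The exact parameterisation used in the paper is not given on this page, so this form, the use of the sample variance, the function names and the toy ensemble are all assumptions.

```python
import math
import statistics

# Table 3 coefficients per cluster:
# (mean intercept, mean slope, variance intercept, variance slope).
NGR_Z_COEFFS = {
    1: (1.28, 1.03, 1.00, 1.00),
    2: (0.55, 0.87, 0.21, 1.04),
    3: (1.52, 0.72, 0.42, 0.81),
}


def ngr_z_predictive(members, cluster):
    """Return (mean, variance) of the calibrated Gaussian for one forecast case.

    Assumed form: N(a + b * ensemble_mean, c + d * ensemble_variance),
    with (a, b, c, d) chosen according to the cluster label.
    """
    a, b, c, d = NGR_Z_COEFFS[cluster]
    ens_mean = statistics.mean(members)
    ens_var = statistics.variance(members)  # sample variance, an assumption
    return a + b * ens_mean, c + d * ens_var


def gaussian_crps(mu, sigma, obs):
    """Closed-form CRPS of N(mu, sigma^2) evaluated at the observation."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Hypothetical 5-member temperature ensemble assigned to cluster 2.
toy_members = [1.0, 2.0, 3.0, 4.0, 5.0]
calibrated_mean, calibrated_var = ngr_z_predictive(toy_members, cluster=2)
print(calibrated_mean, calibrated_var,
      gaussian_crps(calibrated_mean, math.sqrt(calibrated_var), obs=3.0))
```

Under this reading, cluster 1 (variance intercept 1.00, slope 1.00) keeps the ensemble spread and adds a constant, whereas cluster 3 shrinks both the mean response (slope 0.72) and the spread (slope 0.81).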