Foundations of Data Science

Open Access Articles

Evaluation of EDISON's data science competency framework through a comparative literature analysis
Karl R. B. Schmitt, Linda Clark, Katherine M. Kinnaird, Ruth E. H. Wertz and Björn Sandstede
2021 doi: 10.3934/fods.2021031

During the emergence of Data Science as a distinct discipline, discussions of what exactly constitutes Data Science have been a source of contention, with no clear resolution. These disagreements have been exacerbated by the lack of a clear single disciplinary 'parent.' Many early efforts at defining curricula and courses exist, with the EDISON Project's Data Science Framework (EDISON-DSF) from the European Union being the most complete. The EDISON-DSF includes both a Data Science Body of Knowledge (DS-BoK) and a Competency Framework (CF-DS). This paper takes a critical look at how EDISON's CF-DS compares to recent work and other published curricular or course materials. We identify areas of strong agreement and disagreement with the framework. Results from the literature analysis provide strong insights into which topics the broader community sees as belonging in (or not in) Data Science, at both the curricular and course levels. This analysis can provide important guidance for groups working to formalize the discipline and for any college or university looking to build its own undergraduate Data Science degree or program.

Analysis of the feedback particle filter with diffusion map based approximation of the gain
Sahani Pathiraja and Wilhelm Stannat
2021, 3(3): 615-645 doi: 10.3934/fods.2021023

Control-type particle filters have been receiving increasing attention over the last decade as a means of obtaining sample based approximations to the sequential Bayesian filtering problem in the nonlinear setting. Here we analyse one such type, namely the feedback particle filter and a recently proposed approximation of the associated gain function based on diffusion maps. The key purpose is to provide analytic insights on the form of the approximate gain, which are of interest in their own right. These are then used to establish a roadmap to obtaining well-posedness and convergence of the finite $N$ system to its mean field limit. A number of possible future research directions are also discussed.

An international initiative of predicting the SARS-CoV-2 pandemic using ensemble data assimilation
Geir Evensen, Javier Amezcua, Marc Bocquet, Alberto Carrassi, Alban Farchi, Alison Fowler, Pieter L. Houtekamer, Christopher K. Jones, Rafael J. de Moraes, Manuel Pulido, Christian Sampson and Femke C. Vossepoel
2021, 3(3): 413-477 doi: 10.3934/fods.2021001

This work demonstrates the efficiency of using iterative ensemble smoothers to estimate the parameters of an SEIR model. We have extended a standard SEIR model with age-classes and compartments of sick, hospitalized, and dead. The data conditioned on are the daily numbers of accumulated deaths and the number of hospitalized. Also, it is possible to condition the model on the number of cases obtained from testing. We start from a wide prior distribution for the model parameters; then, the ensemble conditioning leads to a posterior ensemble of estimated parameters yielding model predictions in close agreement with the observations. The updated ensemble of model simulations has predictive capabilities and includes uncertainty estimates. In particular, we estimate the effective reproductive number as a function of time, and we can assess the impact of different intervention measures. By starting from the updated set of model parameters, we can make accurate short-term predictions of the epidemic development assuming knowledge of the future effective reproductive number. Also, the model system allows for the computation of long-term scenarios of the epidemic under different assumptions. We have applied the model system to data sets from several countries, i.e., the four European countries Norway, England, The Netherlands, and France; the province of Quebec in Canada; the South American countries Argentina and Brazil; and the four US states Alabama, North Carolina, California, and New York. These countries and states all have vastly different developments of the epidemic, and we could accurately model the SARS-CoV-2 outbreak in all of them. We realize that more complex models, e.g., with regional compartments, may be desirable, and we suggest that the approach used here should also be applicable to these models.

Ajay Jasra, Kody J. H. Law and Vasileios Maroulas
2019, 1(1): i-iii doi: 10.3934/fods.20191i



