During the emergence of Data Science as a distinct discipline, discussions of what exactly constitutes Data Science have been a source of contention, with no clear resolution. These disagreements have been exacerbated by the lack of a single clear disciplinary 'parent.' Many early efforts at defining curricula and courses exist, with the EDISON Project's Data Science Framework (EDISON-DSF) from the European Union being the most complete. The EDISON-DSF includes both a Data Science Body of Knowledge (DS-BoK) and a Competency Framework (CF-DS). This paper takes a critical look at how EDISON's CF-DS compares to recent work and other published curricular or course materials. We identify areas of strong agreement and disagreement with the framework. Results from the literature analysis provide strong insights into which topics the broader community sees as belonging in (or not in) Data Science, at both the curricular and course levels. This analysis can provide important guidance for groups working to formalize the discipline and for any college or university looking to build its own undergraduate Data Science degree or programs.
Figure 1. Examples of the reduction process and resulting item counts for merging items from EDISON Core Data Science Skills Table & EDISON Knowledge Table from [19] into the List of Topics that were investigated in our combined studies
Table 1. Unique item counts by Knowledge/Skill in each competency category. The Merged row indicates how often a Knowledge item and a Skill item in that competency were merged. The table uses five acronyms: DSDA (Data Science Analytics), DSENG (Data Science Engineering), DSDM (Data Management), DSRM (Research Methods and Project Management), and DSBA (Domain Related Competencies and Business Analytics Competencies)
Item type | DSDA | DSENG | DSDM | DSRM | DSBA | Total |
Skill | 6 | 4 | 6 | 5 | 2 | 23 |
Knowledge | 7 | 2 | 6 | 5 | 1 | 21 |
Merged | 8 | 8 | 3 | 1 | 7 | 27 |
Total | 21 | 14 | 15 | 11 | 10 | 71 |
Table 2. Counts & Coverage Percentage of Topics from EDISON DSF for each Curricular-Level data source
Measure | Park City | IADSS | Wu | BHEF | ACM |
Count of Topics | 31 | 43 | 43 | 45 | 58 |
% Coverage of EDISON CF-DS | 44% | 61% | 61% | 63% | 82% |
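The coverage row can be reproduced from the counts. The following is a minimal sketch, assuming that each percentage is the topic count divided by the 71 unique items totalled in Table 1 and rounded to a whole percent; the names `TOPIC_COUNTS`, `TOTAL_ITEMS`, and `coverage_percent` are illustrative and do not come from the paper.

```python
# Topic counts per curricular-level source, copied from Table 2.
TOPIC_COUNTS = {"Park City": 31, "IADSS": 43, "Wu": 43, "BHEF": 45, "ACM": 58}

# Assumption: the denominator is the 71 unique items totalled in Table 1.
TOTAL_ITEMS = 71


def coverage_percent(count, total=TOTAL_ITEMS):
    """Return the share of framework items covered, as a whole-number percent."""
    return round(100 * count / total)


# Map each source to its rounded coverage percentage.
coverage = {source: coverage_percent(n) for source, n in TOPIC_COUNTS.items()}

for source, pct in coverage.items():
    print(f"{source}: {pct}%")
```

Under this assumption the computed values match the percentages reported in the table.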
Table 3. Summary of topic agreement from literature analysis of EDISON CF-DS items that should persist (or not) in Data Science
Measure | Positive Agreement | Negative Agreement | Indeterminate | Total Consensus |
Count | 34 | 14 | 23 | 48 |
Percent | 48% | 20% | 32% | 68% |
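The arithmetic behind this summary can be checked with a short sketch. It assumes that Total Consensus is the sum of the positive and negative agreement counts and that percentages are taken over all classified items; the function name `summarize_agreement` is hypothetical.

```python
def summarize_agreement(positive, negative, indeterminate):
    """Summarize agreement counts as whole-number percentages.

    Assumption: total consensus = positive + negative agreement, and every
    percentage is taken over the sum of all three categories.
    """
    total = positive + negative + indeterminate
    consensus = positive + negative

    def pct(part):
        return round(100 * part / total)

    return {
        "total_items": total,
        "total_consensus": consensus,
        "percent_positive": pct(positive),
        "percent_negative": pct(negative),
        "percent_indeterminate": pct(indeterminate),
        "percent_consensus": pct(consensus),
    }


# Counts copied from Table 3 (curricular-level analysis).
curricular = summarize_agreement(positive=34, negative=14, indeterminate=23)
print(curricular)
```

The same function applied to the course-level counts in Table 5 (15, 41, 15) yields a total consensus of 56 items.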
Table 4. Counts & Coverage Percentage of Topics from EDISON DSF for each Course-level data source
Source | Hardin et al.: Smith | Hardin et al.: Auckland | Hardin et al.: UC B/D | Hardin et al.: St. Olaf | Hardin et al.: Purdue | Dusen et al.: Data-8 (various schools) |
Count of Topics | 22 | 7 | 25 | 20 | 24 | 23 |
% Coverage | 31% | 10% | 35% | 28% | 34% | 32% |
Source | Cetinkaya-Rundel: Data-Science-Box (various schools) | Yan and Davis: U. Massachusetts Dartmouth | European DSA: Foundations of Data Science, Big Data (2 courses) |
Count of Topics | 24 | 23 | 28 |
% Coverage | 34% | 32% | 39% |
Table 5. Summary of topic agreement from literature analysis of EDISON CF-DS items that should be included (or not) in an introductory Data Science course
Measure | Positive Agreement | Negative Agreement | Indeterminate | Total Consensus |
Count | 15 | 41 | 15 | 56 |
Percent | 21% | 58% | 21% | 79% |
[1] ACM Data Science Task Force. Available from: http://dstf.acm.org/.
[2] AICPA, PFP Body of Knowledge. Available from: https://www.aicpa.org/interestareas/personalfinancialplanning/membership/pfsbodyofknowledge.html.
[3] American Statistical Association, Curriculum guidelines for undergraduate programs in statistical science. Available from: http://www.amstat.org/education/curriculumguidelines.cfm.
[4] P. Anderson, J. Bowring, R. McCauley, G. Pothering and C. Starr, An undergraduate degree in data science: Curriculum and a decade of implementation experience, Proceedings of the 45th ACM Technical Symposium on Computer Science Education, SIGCSE '14, ACM, New York, NY, USA, 2014, 145–150. doi: 10.1145/2538862.2538936.
[5] P. Anderson, J. McGuffee and D. Uminsky, Data science as an undergraduate degree, Proceedings of the 45th ACM Technical Symposium on Computer Science Education, SIGCSE '14, ACM, New York, NY, USA, 2014, 705–706. doi: 10.1145/2538862.2538868.
[6] J. Blitzstein, Teaching data science and storytelling, in The Data Science Handbook, Data Science Bookshelf, 2015, 174–187.
[7] Business-Higher Education Forum (BHEF), Webinar: Data science and analytics (DSA)-enabled graduate competency map | BHEF, 2019. Available from: https://s3.goeshow.com/dream/DataSummit/Data%20Summit%202018/BHEF_2016_DSA_competency_map_1.pdf.
[8] I. Cárdenas-Navia and B. K. Fitzgerald, The broad application of data science and analytics: Essential tools for the liberal arts graduate, Change: The Magazine of Higher Learning, 47 (2015), 25–32. doi: 10.1080/00091383.2015.1053754.
[9] B. Cassel and H. Topi, Strengthening Data Science Education Through Collaboration, Workshop on Data Science Education Workshop Report, 2015.
[10] CC2020 Task Force, Computing Curricula 2020: Paradigms for Global Computing Education, ACM, New York, NY, USA, 2020. doi: 10.1145/3467967.
[11] M. Cetinkaya-Rundel, Computing infrastructure and curriculum design for introductory data science, Proceedings of the 50th ACM Technical Symposium on Computer Science Education, SIGCSE '19, ACM, New York, NY, USA, 2019, 1236–1236. doi: 10.1145/3287324.3287556.
[12] Civil Engineering Body of Knowledge 3 Task Committee, Civil Engineering Body of Knowledge: Preparing the Future Civil Engineer, 3rd edition, American Society of Civil Engineers, Reston, VA, 2019. doi: 10.1061/9780784415221.
[13] National Research Council, Training Students to Extract Value from Big Data: Summary of a Workshop, The National Academies Press, Washington, DC, 2014. Available from: https://www.nap.edu/catalog/18981/training-students-to-extract-value-from-big-data-summary-of.
[14] Data Science Association, About the Data Science Association. Available from: https://www.datascienceassn.org/.
[15] R. D. De Veaux, M. Agarwal, M. Averett, B. S. Baumer and A. Bray, et al., Curriculum guidelines for undergraduate programs in data science, Ann. Rev. Statist. Appl., 4 (2017), 15–30. doi: 10.1146/annurev-statistics-060116-053930.
[16] Y. Demchenko, A. Belloum, W. Los, T. Wiktorski and A. Manieri, et al., EDISON data science framework: A foundation for building data science profession for research and industry, IEEE International Conference on Cloud Computing Technology and Science (CloudCom), Luxembourg, Luxembourg, 2016. doi: 10.1109/CloudCom.2016.0107.
[17] Y. Demchenko, L. Comminiello and G. Reali, Designing customisable data science curriculum using ontology for data science competences and body of knowledge, Proceedings of the 2019 International Conference on Big Data and Education - ICBDE'19, ACM Press, London, United Kingdom, 2019, 124–128.
[18] EDISON Project, Data science training and data science education - EU. Available from: http://edsa-project.eu/.
[19] EDISON Project, EDISON: Building the data science profession. Available from: https://edison-project.eu/.
[20] U. Fayyad and H. Hamutcu, Toward foundations for data science and analytics: A knowledge framework for professional standards, Harvard Data Science Review. Available from: https://hdsr.mitpress.mit.edu/pub/6wx0qmkl/release/2.
[21] D. G. Freelon, ReCal: Intercoder reliability calculation as a Web service, Internat. J. Internet Sci., 5 (2010), 20–33. Available from: http://dfreelon.org/publications/2010_ReCal_Intercoder_reliability_calculation_as_a_web_service.pdf.
[22] L. Haas, A. Hero and R. A. Lue, Highlights of the national academies report on "Undergraduate data science: Opportunities and options", Harvard Data Science Review, 1. Available from: https://hdsr.mitpress.mit.edu/pub/z4sb5j9l/release/3.
[23] J. Hardin, R. Hoerl and N. J. Horton, et al., Data science in statistics curricula: Preparing students to "think with data", Amer. Statist., 69 (2015), 343–353. doi: 10.1080/00031305.2015.1077729.
[24] S. C. Hicks and R. A. Irizarry, A guide to teaching data science, Amer. Statist., 72 (2018), 382–391. doi: 10.1080/00031305.2017.1356747.
[25] T. K. Hira, Personal finance: Past, present and future, Networks Financial Institute Policy Brief, (2009), 23pp. doi: 10.2139/ssrn.1522299.
[26] Joint Task Force on Computing Curricula, Association for Computing Machinery (ACM) and IEEE Computer Society, Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science, ACM, New York, NY, USA, 2013. Available from: https://www.acm.org/binaries/content/assets/education/cs2013_web_final.pdf.
[27] A. Manieri, S. Brewer, R. Riestra, Y. Demchenko and M. Hemmje, et al., Data science professional uncovered: How the EDISON Project will contribute to a widely accepted profile for data scientists, IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), Vancouver, BC, Canada, 2015. doi: 10.1109/CloudCom.2015.57.
[28] P. W. G. Morris, L. Crawford, D. Hodgson, M. M. Shepherd and J. Thomas, Exploring the role of formal bodies of knowledge in defining a profession - The case of project management, Internat. J. Project Management, 24 (2006), 710–721. doi: 10.1016/j.ijproman.2006.09.012.
[29] National Academies of Sciences, Data Science for Undergraduates: Opportunities and Options, National Academies Press, 2018. Available from: https://www.nap.edu/catalog/25104/data-science-for-undergraduates-opportunities-and-options.
[30] C. Pompa and T. Burke, Data science and analytics skills shortage: Equipping the APEC workforce with the competencies demanded by employers, Asia-Pacific Economic Cooperation Secretariat, Singapore, 2017. Available from: https://www.apec.org/Publications/2017/11/Data-Science-and-Analytics-Skills-Shortage.
[31] R. Rawlings-Goss, L. Cassel, M. Cragin, C. Cramer and A. Dingle, et al., Keeping Data Science Broad: Negotiating the Digital & Data Divide, Technical report, South Big Data Hub, 2018. Available from: https://par.nsf.gov/biblio/10075971.
[32] S. R. Singer, N. R. Nielsen and H. A. Schweingruber, Discipline-Based Education Research: Understanding and Improving Learning in Undergraduate Science and Engineering, National Academies Press, 2012. doi: 10.17226/13362.
[33] E. Van Dusen, A. Suen, A. Liang and A. Bhatnagar, Accelerating the advancement of data science education, Proceedings of the 18th Python in Science Conference, (2019), 1–4. doi: 10.25080/Majora-7ddc1dd1-000.
[34] M. A. Waller and S. E. Fawcett, Data science, predictive analytics, and big data: A revolution that will transform supply chain design and management, J. Business Logistics, 34 (2013), 77–84. doi: 10.1111/jbl.12010.
[35] J. M. Wing and D. Banks, Highlights of the inaugural data science leadership summit, Harvard Data Science Review, 1. doi: 10.1162/99608f92.e45fcb79.
[36] H. Wu, Systematic study of big data science and analytics programs, ASEE Annual Conference & Exposition Proceedings, ASEE Conferences, Columbus, Ohio, 2017. doi: 10.18260/1-2--28900.
[37] D. Yan and G. E. Davis, A first course in data science, J. Statist. Education, 27 (2019), 99–109. doi: 10.1080/10691898.2019.1623136.
[38] P. Zorn, C. S. Schumacher and M. J. Siegel, 2015 CUPM Curriculum Guide to Majors in the Mathematical Sciences, The Mathematical Association of America, 2015. Available from: https://www.maa.org/sites/default/files/pdf/CUPM/pdf/CUPMguide_print.pdf.