
Facilitating team-based data science: Lessons learned from the DSC-WAV project

  • * Corresponding author: Chelsey Legacy

  • While coursework provides undergraduate data science students with some relevant analytic skills, many are not given the rich experiences with data and computing they need to be successful in the workplace. Additionally, students often have limited exposure to team-based data science and the principles and tools of collaboration that are encountered outside of school.

    In this paper, we describe the DSC-WAV program, an NSF-funded data science workforce development project in which teams of undergraduate sophomores and juniors work with a local non-profit organization on a data-focused problem. To help students develop a sense of agency and improve confidence in their technical and non-technical data science skills, the project promoted a team-based approach to data science, adopting several processes and tools intended to facilitate this collaboration.

    Evidence from the project evaluation, including participant survey and interview data, is presented to document the degree to which the project was successful in engaging students in team-based data science, and how the project changed the students' perceptions of their technical and non-technical skills. We also examine opportunities for improvement and offer insight to other data science educators who may want to implement a similar team-based approach to data science projects at their own institutions.

    Mathematics Subject Classification: Primary: 97K80; Secondary: 97P99.


  • Figure 1.  A sample Kanban board used by a DSC-WAV team. This board is implemented through GitHub Projects. Each task is represented by a "card" and linked to an issue in the GitHub repository. As students work on tasks, they move them from "To do" (i.e., the sprint backlog), to "In progress," to "Done." A quick look at the board helps everyone on the team understand who is working on what. This team used tags to estimate how long each task would take to complete.

    Figure 2.  Sample timeline for Cohort 1 that was distributed to teams. A similar schedule was used for later cohorts.

    Table 1.  Number of participants (and percentages) from each DSC-WAV team that responded to the end-of-project survey or were interviewed.

    Team                                          Survey $N$ (%)    Interview $N$ (%)
    Cohort 1 (Spring 2020)
      Team 01 ($N=5$)                             ——                4 (80%)
      Team 02 ($N=5$)                             ——                4 (80%)
      Team 03 ($N=5$)                             ——                3 (60%)
      Team 04 ($N=5$)                             ——                5 (100%)$^{\dagger}$
      Team 05 ($N=5$)                             ——                5 (100%)
    Cohort 2 (Fall 2020)
      Team 06 ($N=4$)                             4 (100%)          ——
      Team 07 ($N=3$)                             1 (33%)           ——
      Team 08 ($N=4$)                             3 (75%)           ——
      Team 09 ($N=3$)                             1 (33%)           ——
      Team 10 ($N=8$)$^{\dagger\dagger}$          4 (50%)           ——
      Team 11 ($N=2$)                             0 (0%)            ——
    Cohort 3 (Spring 2021)
      Team 12 ($N=5$)                             1 (20%)           3 (60%)
      Team 13 ($N=5$)                             0 (0%)            3 (60%)
      Team 14 ($N=5$)                             0 (0%)            ——
      Team 15 ($N=4$)                             3 (75%)           ——

    $^{\dagger}$ One participant emailed their interview responses.
    $^{\dagger\dagger}$ Three participants dropped out mid-project.