Collaborative Proposal: Learning Linkages: Integrating Data Streams of Multiple Modalities and Timescales

Project: Research project

Project Details


As more and more student work is conducted on computers and online, new possibilities are opened for the study of learning. One of the most alluring features of this computer-collected data is the ease with which it can be collected. We can now collect vast amounts of learning-related data with relative ease, and analyze this data in an automated manner. In the best cases, the analysis provides a rich picture of student learning. At the same time, it is clear that this type of computer-collected data cannot capture all important learning phenomena. Even if we record every mouse movement and key press, there is still much that is missed.
This observation leads to a set of important questions: What types of learning phenomena can we capture and trace with computer-collected data, and what types do we miss? How can we make the most of computer-collected data without pushing the boundaries of reasonable inferences? Are there ways we can enrich computer-collected data, while still retaining some of the benefits of easy collection and use?
In this proposed project, we take some first steps toward providing answers to these questions. In doing so, our efforts are narrowed in a few respects. First, we are concerned with one type of computer-collected learning data, what we will call fine-grained computer-collected (FGCC) data. The distinguishing feature of FGCC data is that it samples learning events at a short timescale, and tracks them over an intermediate timescale, from minutes to hours. Furthermore, we propose to conduct our investigations in the context of classroom enactments that make use of two computer-supported learning environments, ChemVLab and the Carnegie Mellon Algebra Tutor. Our studies will make use of log file data of the sort that has been previously studied. But we will augment this FGCC data with data from a range of other modalities, in order to build a more complete picture of individual students, as well as of entire classes.
In sum, our proposed project has two main aims.
(1) Map the limits of FGCC data. First, we will answer the question: What does FGCC data capture and what does it miss? To answer this question, we will look first at the ability of FGCC data to predict a wide variety of learning outcomes captured by a range of measures. Second, using data from the multiple modalities, we will build a set of learning narratives for a subset of the students observed. We will then compare these narratives to the pictures provided by the FGCC data alone.
(2) Explore the integration of FGCC data with data from other modalities. Second, we will explore the possibility of combining FGCC data with data from other modalities, in a manner that makes it possible to fluidly mine the full corpus of data. We will then map the limits of automated analysis of this enriched multi-modal data, just as we did for the FGCC data alone. We will ask: Can we better predict outcomes? Can we capture more of the full learning narrative? How will the integrated data produce actionable knowledge to improve science teaching and learning?
Effective start/end date: 9/1/14 to 8/31/17


  • National Science Foundation (DRL-1418020)

