CMU-CS-13-117
Computer Science Department
School of Computer Science, Carnegie Mellon University



Integrating Representation Learning and Skill Learning
in a Human-Like Intelligent Agent

Nan Li

June 2013

Ph.D. Thesis

CMU-CS-13-117.pdf


Keywords: Intelligent agent, learner modeling, representation learning, complex problem solving

Building an intelligent agent that simulates human learning of math and science could benefit both cognitive science, by contributing to the understanding of human learning, and artificial intelligence, by advancing the goal of creating human-level intelligence. However, constructing such a learning agent currently requires manual encoding of prior domain knowledge. Besides being a poor model of how humans acquire prior knowledge, manual knowledge encoding is both time-consuming and error-prone. Previous research has shown that one of the key factors that distinguish experts from novices is how they represent knowledge: experts view the world in terms of deep functional features, while novices view it in terms of shallow perceptual features. Moreover, since the performance of learning algorithms is sensitive to representation, deep features are also important for effective machine learning.

In this work, we propose an efficient algorithm that acquires representation knowledge in the form of “deep features” for specific domains, and demonstrate its effectiveness in the domain of algebra as well as in synthetic domains. We integrate this algorithm into a learning agent, SimStudent, which learns procedural knowledge by observing a tutor solve sample problems and by receiving feedback while actively solving problems on its own. We show that learning representations enhances the generality of the learning agent by reducing the requirements for knowledge engineering. Moreover, we propose an approach that automatically discovers student models using the extended SimStudent. By fitting the discovered model to real student learning curve data, we show that the discovered model is as good as or better than human-generated models, and demonstrate how the discovered model may be used to improve a tutoring system’s instructional strategy.

115 pages


