Ari Bader-Natal

SpellBEE (2000-2008)

Testing the Teacher's Dilemma

As part of my dissertation research, I explored the viability of supporting peer-to-peer learning as games. The idea was to develop two-person games in which each player would select or design problems for the other player to solve. If the game was structured as a purely competitive activity, students would have motivation to stump each other. If the game was structured as a purely cooperative activity, they would have motivation to offer easy questions that the other would likely answer correctly. A better game structure wouldn't be strictly competitive or cooperative, but rather would be something different. Vygotsky's Zone of Proximal Development suggests a path forward: Create a game in which each player is motivated to find and challenge the limits of what the other player is currently capable of. The Teacher's Dilemma was the model we used to describe the category of games with this property.
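
The incentive argument above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the logistic success curve, the three payoff rules, and the names `p_correct`, `expected_tutor_payoff`, and `best_difficulty` are hypothetical and are not the payoff structure defined in the dissertation. It only shows why a tutor's best choice moves from one extreme to the other under purely competitive or purely cooperative scoring, and lands in between under a mixed rule.

```python
import math

def p_correct(difficulty: float, ability: float, slope: float = 10.0) -> float:
    """Assumed (illustrative) logistic chance that the tutee solves a problem."""
    return 1.0 / (1.0 + math.exp(slope * (difficulty - ability)))

def expected_tutor_payoff(difficulty: float, ability: float, structure: str) -> float:
    """Expected score for the tutor under three hypothetical scoring rules."""
    p = p_correct(difficulty, ability)
    if structure == "competitive":
        return 1.0 - p           # tutor scores when the tutee fails
    if structure == "cooperative":
        return p                 # tutor scores when the tutee succeeds
    if structure == "mixed":
        return difficulty * p    # tutor scores the difficulty, but only on success
    raise ValueError(f"unknown structure: {structure}")

def best_difficulty(ability: float, structure: str, levels: list) -> float:
    """Difficulty level that maximizes the tutor's expected payoff."""
    return max(levels, key=lambda d: expected_tutor_payoff(d, ability, structure))

# Eleven difficulty levels from 0.0 (trivial) to 1.0 (hardest).
LEVELS = [i / 10 for i in range(11)]
```

Calling `best_difficulty(0.5, structure, LEVELS)` for each rule picks the hardest level under the competitive rule, the easiest under the cooperative rule, and an intermediate level under the mixed rule.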

In software simulations, populations of rational agents playing iterated Teacher's Dilemma games did converge on the desired ZPD-finding strategies. But just because something works in theory doesn't mean that it will work in practice. To what extent does a rational agent in a game-theoretic simulation represent the strategies used by a third-grade student playing a game online with another student? Part of the goal of SpellBEE was to find out. Would the structure of the Teacher's Dilemma games have any impact on the choice of challenges that students present to one another? Rather than arranging for a series of small classroom studies in a local school, we opted to put the experiment online and open it up to any teachers who wanted to participate. Between 2001 and 2008, thousands of elementary school teachers and tens of thousands of their students participated. Lo and behold, the strategies that the students adopted did indicate that, for the most part, the students were attempting to identify what their peers were capable of and chose to pose problems that were "in the zone."

SpellBEE was a symmetric turn-taking game, in which each turn had four steps: (1) both partners selected a problem for the other to solve, (2) each attempted to solve the problem posed to them, (3) each got feedback on their own solution, and (4) each got feedback on how their partner did on the challenge they had posed. Each game was ten turns long.
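
The turn structure can be sketched as a short loop. This is a minimal illustration, not SpellBEE's actual code: `TurnRecord`, `play_game`, and the `choose` and `attempt` callbacks are hypothetical names, and the feedback steps are modeled simply as a shared record that both players can read.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class TurnRecord:
    posed: Dict[str, str]    # player -> problem that player posed to their partner
    solved: Dict[str, bool]  # player -> whether they solved the problem posed to them

def play_game(
    players: Tuple[str, str],
    choose: Callable[[str, str, List[TurnRecord]], str],
    attempt: Callable[[str, str], bool],
    turns: int = 10,
) -> List[TurnRecord]:
    """Run a symmetric two-player game; both players act in every turn."""
    a, b = players
    partner = {a: b, b: a}
    history: List[TurnRecord] = []
    for _ in range(turns):
        # (1) Both partners select a problem for the other to solve.
        posed = {p: choose(p, partner[p], history) for p in players}
        # (2) Each attempts the problem posed to them by their partner.
        solved = {p: attempt(p, posed[partner[p]]) for p in players}
        # (3) and (4) Feedback: the record holds each player's own result and
        # their partner's result, and is visible to `choose` on later turns.
        history.append(TurnRecord(posed, solved))
    return history
```

Passing the history back into `choose` is what would let a player adapt later challenges to how their partner has done so far.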

While many teachers had their students play each other, many of the two-player games involved students who were thousands of miles away from one another. Here's a picture of one day in 2005. The lines connect pairs of students who played games together. While the scale isn't particularly impressive from the perspective of web-based games today, I think it was pretty remarkable for an educational research experiment ten years ago.

Desktop status indicator tool for Mac OS.

Appendix A of my dissertation discusses BEEmail, a proof-of-concept web application demonstrating an asynchronous Teacher's Dilemma game built on a fully-decentralized architecture.

My publications based on the SpellBEE research

Bader-Natal, A. The Teacher's Dilemma: A game-based approach for motivating appropriate challenge among peers. Ph.D. Dissertation, Brandeis University, May 2008.

In classroom-based studies, peer tutoring has proved to be an effective learning strategy, both for the tutees and for their peer tutors. Today, the increasingly widespread availability of computers and internet access in the homes and after-school programs of students offers a new venue for peer learning. In seeking to translate the successes of peer-assisted learning from the classroom to the Internet, one major hurdle to overcome is that of motivation. When teachers are no longer supervising student activity and when participation itself becomes voluntary, peer tutoring protocols may stop being educationally productive. In order to successfully leverage these peer interactions, we must find a way to facilitate and motivate learning among a group of unsupervised peers. In this dissertation, we respond to this challenge by reconceptualizing the interactions among peers within the context of a different medium: that of games. In designing a peer-tutoring experience as a two-player game, we gain a valuable set of tools and techniques for affecting student participation, engagement, goals, and strategies.

Our contributions:

  1. We define a set of criteria for games -- the Teacher's Dilemma criteria -- that motivates players to challenge one another with problems of appropriate difficulty;
  2. We show three games that satisfy the Teacher's Dilemma criteria when played by rational players under idealized conditions;
  3. We demonstrate, using computer simulations of strategic dynamics, that game-play will converge towards meeting these criteria, through time, under more realistic conditions;
  4. We design a suite of software that incorporates a Teacher's Dilemma game into several web-based activities for different learning domains;
  5. We collect data from thousands of students using these activities, and examine how the games actually affected the game-play strategy and learning among these students.
The game-theoretic analysis establishes the possibility for a game-based mechanism for motivating appropriate challenges, the simulations support the plausibility of this approach given non-optimal players, the implemented software systems demonstrate the scalability of this model, and the data analysis supports the real-world applicability of this game-based approach to motivating appropriate challenges for learning among unsupervised peers.

Bader-Natal, A. Incorporating students' probabilistic expectations into a peer-driven tutoring game. Brandeis CS Tech Report CS-08-269, 2008.

Games provide a promising mechanism for intelligent tutoring systems in that they offer means to influence motivation and structure interactions. We have designed and released several game-based tutoring systems in which students learn to identify the best game strategies to adopt, and, in doing so, create for each other increasingly productive learning environments. Here, we first detail the core game underlying our deployed systems, designed to leverage human intelligence in tutoring systems through the tutor's identification of "appropriate" challenges for their tutee. While this game works well for task domains in which problem difficulty is known, it cannot be applied to domains if nothing is known about a problem beyond its correct solution. We introduce a second, more robust, game here capable of addressing this larger set of task domains. By incorporating player-generated probability estimates (in place of a difficulty metric), we show that a game can be designed to simultaneously elicit best-effort responses from tutees, honest statements of probability estimates from tutees, and appropriate challenges from tutors. We derive a set of constraints on the parameterized version of this game necessary for rational players to converge on this "Teacher's Dilemma" learning environment. Beyond providing a foundation for future tutoring systems, this work offers a new mechanism with which to simultaneously leverage and enhance the knowledge of peer learners.

Bader-Natal, A. and Pollack, J. Evaluating Problem Difficulty Rankings Using Sparse Student Response Data. Supplementary Proceedings of the 13th International Conference on Artificial Intelligence in Education, 2007.

Problem difficulty estimates play important roles in a wide variety of educational systems, including determining the sequence of problems presented to students and the interpretation of the resulting responses. The accuracy of these metrics is therefore important, as they can determine the relevance of an educational experience. For systems that record large quantities of raw data, these observations can be used to test the predictive accuracy of an existing difficulty metric. In this paper, we examine how well one rigorously developed – but potentially outdated – difficulty scale for American-English spelling fits the data collected from seventeen thousand students using our SpellBEE peer-tutoring system. We then attempt to construct alternate metrics that use collected data to achieve a better fit. The domain-independent techniques presented here are applicable when the matrix of available student-response data is sparsely populated or non-randomly sampled. We find that while the original metric fits the data relatively well, the data-driven metrics provide approximately 10% improvement in predictive accuracy. Using these techniques, a difficulty metric can be periodically or continuously recalibrated to ensure the relevance of the educational experience for the student.

Bader-Natal, A. and Pollack, J. Assessing Learning in a Peer-Driven Tutoring System. Proceedings of the 13th International Conference on Artificial Intelligence in Education, IOS Press, 2007.

In many intelligent tutoring systems, a detailed model of the task domain is constructed and used to provide students with assistance and direction. Reciprocal tutoring systems, however, can be constructed without needing to codify a full-blown model for each new domain. This provides various advantages: these systems can be developed rapidly and can be applied to complex domains for which detailed models are not yet known. In systems built on the reciprocal tutoring model, detailed validation is needed to ensure that learning indeed occurs. Here, we provide such validation for SpellBEE, a reciprocal tutoring system for the complex task domain of American-English spelling. Using a granular definition of response accuracy, we present a statistical study designed to assess and characterize student learning from collected data. We find that students using this reciprocal tutoring system exhibit learning at the word, syllable, and grapheme levels of task granularity.

Bader-Natal, A. and Pollack, J. BEEweb: A Multi-Domain Platform for Reciprocal Peer-Driven Tutoring Systems. Proceedings of the 8th International Conference on Intelligent Tutoring Systems, Springer, 2006.

Tutoring systems that engage each student as both a tutee and a tutor can be powerfully enhanced by motivating each tutor to try to appropriately challenge their tutee. The BEEweb platform is presented as a foundation upon which to build such systems, based upon the Reciprocal Tutoring protocol and the Teacher's Dilemma. Three systems that have recently been built on the BEEweb platform are introduced.

Bader-Natal, A. and Pollack, J.B. Motivating Appropriate Challenges in a Reciprocal Tutoring System. Proceedings of the 12th International Conference on Artificial Intelligence in Education, IOS Press, 2005. Nominated for Best Paper Award.

Formalizing a student model for an educational system requires an engineering effort that is highly domain-specific. This model-specificity limits the ability to scale a tutoring system across content domains. In this work we offer an alternative, in which the task of student modeling is not performed by the system designers. We achieve this by using a reciprocal tutoring system in which peer-tutors are implicitly tasked with student modeling. Students are motivated, using the Teacher's Dilemma, to use these models to provide appropriately-difficult challenges. We implement this as a basic literacy game in a spelling-bee format, in which players choose words for each other to spell across the internet. We find that students are responsive to the game's motivational structure, and we examine the effect on participants' spelling accuracy, challenge difficulty, and tutoring skill.