Automatic Question Generation

Developing assessment material is a lot of work. School teachers spend long hours preparing good test and revision material for students. To prevent cheating, many teachers even write several versions of each test, multiplying the work to be done. And that is before grading! The problem is exacerbated by the rising popularity of online courses, in which far more students take tests and work through revision and homework material.

Furthermore, access to a large, varied pool of assessment items is a key to the success of many adaptive courses of study. Without that, proficiency estimates can be compromised and personalized courses of study can become less effective.

Thousands of students use the Aristotle App to learn and revise daily. User data shows that Math is the most revised subject.

Machine Generated Questions

Machine-generated questions have been a component of intelligent tutoring systems for decades. Most research falls into two categories: the solution-oriented approach and the template-based approach.

Solution-Oriented Approach

Here, questions are generated based on the set of skills and concepts required to solve them. For example, skills related to addition include adding single-digit numbers, adding multi-digit numbers, adding three or more numbers, and carrying digits.

Since solution-oriented approaches group problems based on skills, they lend themselves well to adaptivity. As a student answers questions, one can identify the skillset he or she is struggling with, and then recommend material to remediate the lacunae. However, a major drawback of solution-oriented approaches is that developing questions even for a topic as simple as addition requires a fair amount of labor and domain expertise.
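As a rough illustration of how skill-based grouping supports adaptivity, the sketch below tracks accuracy per skill and picks the weakest one to remediate. The skill labels come from the addition example above; the class name, method names, and the "lowest accuracy wins" rule are assumptions made for this example, not a description of any particular tutoring system.

```python
from collections import defaultdict

# Hypothetical skill labels, based on the addition example in the text.
SKILLS = ["single_digit", "multi_digit", "three_or_more", "carrying"]


class SkillTracker:
    """Tracks per-skill accuracy (illustrative only)."""

    def __init__(self):
        self.attempts = defaultdict(int)
        self.correct = defaultdict(int)

    def record(self, skill, is_correct):
        """Record one answered question for the given skill."""
        self.attempts[skill] += 1
        if is_correct:
            self.correct[skill] += 1

    def accuracy(self, skill):
        """Fraction of correct answers, or None if the skill is untested."""
        if self.attempts[skill] == 0:
            return None
        return self.correct[skill] / self.attempts[skill]

    def weakest_skill(self):
        """Return the attempted skill with the lowest accuracy, or None."""
        attempted = [s for s in SKILLS if self.attempts[s] > 0]
        if not attempted:
            return None
        return min(attempted, key=self.accuracy)


tracker = SkillTracker()
tracker.record("single_digit", True)
tracker.record("carrying", False)
tracker.record("carrying", True)
print(tracker.weakest_skill())  # carrying
```

A real system would likely use a proper proficiency model rather than raw accuracy; the point here is only that questions tagged by skill make this kind of lookup possible.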

Template-Based Approach

In this approach, a question template is used to represent a potentially large class of problems. For example, consider a familiar question:
Find all roots of _ x² + _ x + _

The underlines are “holes” that must be filled in by the question generator. A template might also specify valid ways to fill in the holes. For example, if each hole may only be filled with an integer from 1 through 10, the template yields 10³ = 1000 possible questions. The instructor may wish to restrict the template further, for instance to permit only quadratics with real, distinct roots. In theory, a template whose holes range over unbounded sets can generate an infinite number of questions.
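To make the counting concrete, here is a minimal sketch that enumerates every way of filling the three holes with integers from 1 through 10, optionally keeping only quadratics with real, distinct roots (positive discriminant). The function and variable names are invented for this example.

```python
from itertools import product

HOLE_VALUES = range(1, 11)  # each hole takes an integer from 1 through 10


def discriminant(a, b, c):
    """Discriminant of a*x^2 + b*x + c."""
    return b * b - 4 * a * c


def quadratic_questions(real_distinct_only=False):
    """Yield (a, b, c, question_text) for every way of filling the holes."""
    for a, b, c in product(HOLE_VALUES, repeat=3):
        if real_distinct_only and discriminant(a, b, c) <= 0:
            continue  # roots would be repeated or complex
        yield a, b, c, f"Find all roots of {a}x² + {b}x + {c}"


all_questions = list(quadratic_questions())
restricted = list(quadratic_questions(real_distinct_only=True))
print(len(all_questions))  # 1000
print(len(restricted))     # a smaller pool: only positive discriminants
```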

Advantages of this approach:

Disadvantages of this approach:

• Templates tend to group the problems based on appearance, not skills.

Our Approach

We use the template-based approach, but each template corresponds to a skill, so our approach conforms to the solution-oriented approach as well. The first step is to define what constitutes a skill for each template-based class.

For example, the class of problems above (find all roots of _ x² + _ x + _) might not represent a single skill on its own. But once we add restrictions, such as requiring the holes to be integers in the set {1, 2, …, 10} and the discriminant to be positive so that the roots are real, it represents a skill expected of a Grade 10 student.

The Art of Templating

Our first task was to devise a templating language. We decided that it would be a good exercise to define a domain-specific language (DSL) that formalizes the space of possible templates. This DSL must let instructors specify the following:

Step-by-Step Solutions

To provide an excellent math learning experience, we wanted to guide students through their math problems, step-by-step. A good step-by-step solution for a problem (such as “simplify x + ½ + x + ⅓”) should be detailed and have good explanations of what happens along the way. These steps should also feel intuitive and be pedagogically correct — not just any step-by-step solution, but one that a tutor would show their student.

Depending on the distractor (wrong answer) chosen by the student, we can tell whether the student has a learning gap, has a misconception, or has just made a silly mistake. We then focus on the step where the student went wrong.
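One way to picture distractor-based diagnosis is to generate each wrong option from a specific error pattern, so that the chosen option points back to its likely cause. The sketch below does this for a two-number addition question. The error patterns and the diagnosis labels attached to them are assumptions made for illustration; they are not taken from the Aristotle App.

```python
def add_without_carry(a, b):
    """Add column by column, dropping every carry (a common error pattern)."""
    result, place = 0, 1
    while a > 0 or b > 0:
        result += ((a % 10 + b % 10) % 10) * place
        a, b, place = a // 10, b // 10, place * 10
    return result


def build_options(a, b):
    """Map each answer option for a + b to a hypothetical diagnosis label."""
    correct = a + b
    options = {correct: "correct"}
    # setdefault keeps the first label if two patterns yield the same number,
    # so a distractor can never overwrite the correct answer.
    options.setdefault(add_without_carry(a, b), "learning gap: carrying")
    options.setdefault(abs(a - b), "misconception: subtracted instead of added")
    options.setdefault(correct + 1, "silly mistake: off by one")
    return options


options = build_options(47, 38)
print(options[85])  # correct
print(options[75])  # learning gap: carrying
```

For questions where an error pattern happens to produce the correct answer (for example, an addition with no carry), that distractor simply drops out, so a real generator would need a fallback to keep the number of options constant.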

Step-by-step solution in Aristotle App


The question and answer are simply strings with variable names denoting the “holes.” Variables come in two flavors: generated (num1 and num2) and derived (sum). Generated variables are bound to a sample set, which could be a range of integers, numbers with 2 decimal places, or even the set of Fibonacci numbers. Derived variables are defined by mathematical expressions.

In general, constraints are useful for narrowing the skill set covered by the template and for ensuring that instantiations of the template are sufficiently varied.
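The pieces described above (question and answer strings, generated variables bound to sample sets, derived variables, and constraints) could be represented along the following lines. This is only a sketch: the dictionary layout, the field names, and the `instantiate` helper are hypothetical and do not reproduce the actual DSL syntax. The variable names `num1`, `num2`, and `sum` follow the text.

```python
import random

# A hypothetical template; field names are illustrative, not the real DSL.
ADDITION_TEMPLATE = {
    "question": "What is {num1} + {num2}?",
    "answer": "{sum}",
    # Generated variables: each is bound to a sample set.
    "generated": {"num1": range(10, 100), "num2": range(10, 100)},
    # Derived variables: each is defined by an expression over other variables.
    "derived": {"sum": lambda v: v["num1"] + v["num2"]},
    # Constraint: the units digits must produce a carry (narrows the skill).
    "constraints": [lambda v: v["num1"] % 10 + v["num2"] % 10 >= 10],
}


def instantiate(template, rng):
    """Draw one candidate; return (question, answer) or None if a constraint fails."""
    values = {name: rng.choice(sample_set)
              for name, sample_set in template["generated"].items()}
    for name, expression in template["derived"].items():
        values[name] = expression(values)
    if not all(check(values) for check in template["constraints"]):
        return None
    return (template["question"].format(**values),
            template["answer"].format(**values))


rng = random.Random(0)
result = None
while result is None:  # retry until the constraint is met
    result = instantiate(ADDITION_TEMPLATE, rng)
question, answer = result
print(question, "->", answer)
```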

Nested DSL Structure

Just as complex mathematical functions can be written as compositions of simpler functions, we can compose complex DSLs from simpler ones.

For example,

DSL1 — template for series resistors simplification

DSL2 — template for parallel resistors simplification

DSL3 — template for a combination of series and parallel resistors simplification

DSL3 can then be composed by applying DSL1 and DSL2 as many times as required to reduce the combined circuit to its simplified form.
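The composition idea can be sketched with plain functions standing in for the three templates. Here `series` plays the role of DSL1 and `parallel` the role of DSL2, and the mixed network (DSL3) is built purely by nesting them. The `Network` type and all function names are invented for this illustration; only the standard series and parallel resistance formulas are assumed.

```python
from fractions import Fraction
from typing import NamedTuple


class Network(NamedTuple):
    text: str       # human-readable description of the circuit
    ohms: Fraction  # equivalent resistance


def resistor(ohms):
    """A single resistor, the base case of the composition."""
    return Network(f"{ohms}Ω", Fraction(ohms))


def series(*parts):
    """Stand-in for DSL1: resistances in series add up."""
    text = "(" + " in series with ".join(p.text for p in parts) + ")"
    return Network(text, sum(p.ohms for p in parts))


def parallel(*parts):
    """Stand-in for DSL2: reciprocals of parallel resistances add up."""
    text = "(" + " in parallel with ".join(p.text for p in parts) + ")"
    return Network(text, 1 / sum(1 / p.ohms for p in parts))


# Stand-in for DSL3: a 4Ω resistor in series with (6Ω parallel to 3Ω).
mixed = series(resistor(4), parallel(resistor(6), resistor(3)))
print(mixed.text)  # (4Ω in series with (6Ω in parallel with 3Ω))
print(mixed.ohms)  # 6
```

Using `Fraction` keeps the arithmetic exact, which matters when the expected answer must match a student's response digit for digit.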

The Problem Generation Algorithm
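Great, so we have a template. Now how do we actually generate questions? We use the simplest possible algorithm: repeatedly draw random values for the generated variables from their sample sets, and keep the instantiations that satisfy the template's constraints.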

This naive algorithm performs well given one key assumption: a large enough fraction of the sample space (the set of all possible questions, i.e. the Cartesian product of the sample sets) must meet the constraints specified in the template. For instance, if 100 questions are desired and the algorithm can afford 100,000 iterations, then roughly 1 in 1,000 sampled questions needs to be valid. This isn’t too daunting. As long as we offer an expressive library of sample sets and constraints, instructors can be expected to provide templates meeting this requirement.
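A random generate-and-check loop of the kind discussed here might look like the sketch below. It assumes sample sets are given as a dictionary of sequences and constraints as predicates over the drawn values; the `generate` function, its parameters, and the duplicate check are all assumptions made for this example rather than details of the prototype.

```python
import random


def generate(sample_sets, constraints, wanted, max_iterations=100_000, seed=None):
    """Randomly sample variable bindings and keep those meeting every constraint."""
    rng = random.Random(seed)
    names = list(sample_sets)
    accepted, seen = [], set()
    for _ in range(max_iterations):
        if len(accepted) == wanted:
            break
        values = {name: rng.choice(sample_sets[name]) for name in names}
        key = tuple(values[name] for name in names)
        if key in seen:
            continue  # assumption: we do not want the same question twice
        seen.add(key)
        if all(check(values) for check in constraints):
            accepted.append(values)
    return accepted  # may hold fewer than `wanted` if valid samples are rare


# Quadratics with integer coefficients 1..10 and real, distinct roots.
sample_sets = {"a": range(1, 11), "b": range(1, 11), "c": range(1, 11)}
constraints = [lambda v: v["b"] ** 2 - 4 * v["a"] * v["c"] > 0]
questions = generate(sample_sets, constraints, wanted=100, seed=0)
print(len(questions))
```

The iteration cap is what makes the "large enough fraction" assumption matter: if valid bindings are too rare, the loop runs out of budget and returns a short list instead of hanging.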

It is very difficult to come up with a more efficient general approach. For some problems, algorithms do exist for generating valid instances directly (see Euclid’s formula for Pythagorean triples). But for others, this is mathematically impossible. In many cases, introducing heuristics may improve the random algorithm. For instance, it may be possible to identify and skip a large chunk of the sample space that leads to solutions that are too large, non-integral, negative, and so on.
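As an example of direct construction, Euclid's formula produces a Pythagorean triple from any pair of integers m > n > 0 as (m² − n², 2mn, m² + n²), and the triple is primitive when m and n are coprime and of opposite parity. The sketch below uses it to generate triples without any trial and error; the function name and the `max_m` bound are choices made for this example.

```python
from math import gcd


def primitive_triples(max_m):
    """Yield primitive Pythagorean triples (a, b, c) via Euclid's formula."""
    for m in range(2, max_m + 1):
        for n in range(1, m):
            # Coprime m and n of opposite parity give primitive triples.
            if (m - n) % 2 == 1 and gcd(m, n) == 1:
                yield m * m - n * n, 2 * m * n, m * m + n * n


triples = list(primitive_triples(4))
print(triples)  # [(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25)]
```

Every generated triple is valid by construction, in contrast to random sampling of (a, b, c), where almost all candidates would be rejected.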

The Prototype

We chose to implement the assessment generator in Python for several reasons:

