Creating MCQs, Part III: Alternative Types of Multiple Choice Questions

This three-part whitepaper explores some possibilities and ideas in creating multiple choice questions (MCQs). In Part III, we look at alternatives to the basic MCQ. The “basic” MCQ has four options from which the examinee must select one as the correct answer. It is the most widely used type. Most questions on the GRE and SAT are in this format.
Other question types are useful for various reasons. They might suit the subject matter better; they might be easier to create; they might be more appropriate for certain groups of learners. We’ll look at the following types here:

  • True/False questions
  • Matching columns of words and/or numbers
  • Assertion/Reason questions
  • Relationship analysis
  • Sentence completion

Section 1: True/False Questions
The basic True/False (or, “T/F”) question asks whether a statement is true or false. In fact, the simplest type of MCQ examination consists of only T/F questions; for example, there might be fifty such questions, with one point for a correct answer and no points for an incorrect answer.
Usually, however, exams include the T/F question type as one of several types. Each T/F question is a group of a few statements (four to six) that each need to be marked as true or false. The two most common variations on this are:
>> If the examinee marks all the statements in the list correctly, then he is awarded a point for the question; if even one statement is marked incorrectly, the examinee is awarded no points.
>> More commonly, the examinee is awarded points for each correct answer.
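The difference between these two grading variations can be sketched in a few lines of Python. This is only an illustration: the function names `score_all_or_nothing` and `score_per_statement`, and the sample answer key, are assumptions made for the example and are not part of any particular grading system.

```python
def score_all_or_nothing(key, answers):
    """One point only if every statement in the group is marked correctly."""
    return 1 if answers == key else 0


def score_per_statement(key, answers):
    """One point for each statement that is marked correctly."""
    return sum(1 for k, a in zip(key, answers) if k == a)


# Hypothetical answer key for a group of six statements (True means "T").
key = [True, False, False, False, False, True]
# An examinee who marks only the fifth statement incorrectly.
answers = [True, False, False, False, True, True]

print(score_all_or_nothing(key, answers))  # 0
print(score_per_statement(key, answers))   # 5
```

The same answer sheet earns nothing under the first variation and five points under the second, which is one reason the second is usually seen as the more forgiving scheme.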
Here are examples of T/F questions with school-level chemistry as the subject matter.
Question: Mark each of the following statements as true or false.
Statements:

  1. Copper sulphate is blue in colour. (T/F)
  2. Hydrogen sulphide smells of almonds. (T/F)
  3. Benzaldehyde is a green-coloured gas. (T/F)
  4. Chlorine has a pleasant, sweet smell. (T/F)
  5. Nitrous sulphide is also called laughing gas. (T/F)
  6. Formaldehyde is used as a preservative. (T/F)

This particular set has no “theme” except that they are each about one particular substance. The substances mentioned are of different types; the mentioned characteristics of those substances are of different types (colour, smell); some statements are about characteristics of the substances, while others are not.
The examiner might choose to use a theme for a set of questions. Here’s an example that uses the theme of physical properties (that is, characteristics) of individual substances:
Question: Mark each of the following statements as true or false.
Statements:

  1. Nitrous oxide is a yellow-coloured gas. (T/F)
  2. Copper oxide is blue in colour. (T/F)
  3. Benzaldehyde smells of almonds. (T/F)
  4. Chlorine is a green-coloured gas. (T/F)
  5. Ammonia smells of rotten eggs. (T/F)
  6. Potassium permanganate is purple in colour. (T/F)

Here, all the statements mention a physical property (colour, smell). This kind of structure allows the examiner to easily create groups of T/F questions.
A small but important modification that the examiner can make is to provide a “Don’t Know” option. This can help in more accurate grading by discouraging guessing. The examinee can be awarded one point for a correct answer, no points for a “don’t know,” and a negative point for an incorrect answer. With this scheme, the more the examinee tends to guess, the more he is penalised.
This rationale can be illustrated as follows, with a T/F question set consisting of 20 statements:
Say the examinee knows the answers to 12 of them and guesses the other eight. Of the eight guesses, he will probably get around 4 correct, so his score would be 16 out of 20. Now suppose the Don’t Know option exists, combined with negative marking for incorrect answers. The examinee would mark 12 statements correctly, and he might say “don’t know” for eight. His score would be 12 out of 20. (If he guessed eight times despite the Don’t Know option, his score would still be around 12 out of 20.)
At an extreme, consider an examinee who knows none of the answers. Without a Don’t Know option, he would guess 20 times, and his score would be around 10. With the Don’t Know option, his score would be zero if he chose not to guess, and close to zero if he chose to guess.
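The arithmetic in these two scenarios can be reproduced with a short Python sketch. It assumes that each guess on a T/F statement has a 50% chance of being right and that an incorrect answer costs exactly one point under negative marking. The function name `expected_score` and its parameters are illustrative choices, not part of any standard.

```python
def expected_score(known, guessed, wrong_penalty=0.0, p_correct=0.5):
    """Expected score when `known` statements are answered correctly and
    `guessed` statements are answered at random.

    A correct answer earns 1 point, an incorrect answer costs
    `wrong_penalty` points, and a "don't know" earns 0 points.
    """
    expected_right = guessed * p_correct
    expected_wrong = guessed * (1 - p_correct)
    return known + expected_right - wrong_penalty * expected_wrong


# 20 statements: 12 known, 8 guessed, no penalty for wrong answers.
print(expected_score(12, 8))                     # 16.0
# Same examinee, guessing despite negative marking.
print(expected_score(12, 8, wrong_penalty=1.0))  # 12.0
# Same examinee, marking "don't know" for the eight unknown statements.
print(expected_score(12, 0, wrong_penalty=1.0))  # 12.0
# An examinee who knows nothing and guesses all 20.
print(expected_score(0, 20))                     # 10.0
print(expected_score(0, 20, wrong_penalty=1.0))  # 0.0
```

These are expected values. An individual examinee who guesses under negative marking may land a little above or below them, which is why the text says "around 12" and "close to zero."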
Section 2: Matching Two Columns
The “Match Columns” type of MCQ is close to the “Fill in the Blanks” type, which we don’t need to discuss. In both types, the examinee must look at a piece of data and supply a corresponding, or matching, piece of data. In a “fill in the blanks” question, the data looked at is a sentence and the data supplied is a word. In a “match columns” question, the data looked at could be a word, phrase, or sentence, and the data to be supplied can also be any of these.
This question type is useful in many situations, but it is particularly suited to exercises in recall. Our first example is a question that only serves the purpose of reinforcement as opposed to recall:
Question: Match each word in column 1 with the corresponding fact in column 2.

Column 1            | Column 2
Distillation        | Element no. 2
Beaker              | Used to purify a substance
Molecular chemistry | Is larger than a pipette
Helium              | Involves chemistry as well as physics

 Here, the answers are obvious to everyone except the most clueless. “Distillation” is a process, so the match can only be “Used to purify a substance.” Similarly, the phrase “Element no. 2” in the second column cannot relate to any word in the first column except for “Helium.”
If the examiner wants to test his students’ recall of facts, the above example is poorly written (because the answers are obvious). A question like this can, however, be useful if the purpose is reinforcement.
Example #2, below, is tightly structured. Each item in column 1 is a substance, and each item in column 2 is a physical characteristic (colour or smell).
Question: Match each substance in column 1 with the corresponding physical property in column 2.

Column 1          | Column 2
Copper sulphate   | Smells like rotten eggs
Chlorine gas      | Blue in colour
Hydrogen sulphide | Smells of almonds
Benzaldehyde      | Green in colour

With this structure, guessing is discouraged because there are no give-away clues—unlike in example #1.
In this question, every item in column 1 is a substance and every item in column 2 is a property. In many cases, it is difficult to construct a question like this one: for example, if the range of topics to be covered is wide, or if the topics do not fit into a category (like “properties of substances” above). In such cases, the question can be constructed with groups of items in a column sharing a common theme.
For example #3, let’s choose six substances for column 1, and two themes for column 2, “physical properties” and “uses.” So column 2 would mention three physical properties of chemicals, and three industrial uses:
Question: Match each substance in column 1 with the corresponding sentence in column 2.

Column 1          | Column 2
Acetone           | This colourless gas smells like rotten eggs.
Chlorine          | This is a blue-coloured solid at room temperature.
Hydrogen sulphide | This liquid is used as an industrial solvent.
Formaldehyde      | This gas is green in colour.
Copper sulphate   | It is used in the production of plastic.
Hydrochloric acid | This substance is used as a preservative.

Such a question (as opposed to example #2) would be used if the physical characteristics of all six substances have not been discussed, or if the industrial uses of all six substances have not been discussed in the course.
Do remember that in a “match columns” question, assuming four rows, the examinee will always get a score of four even if he knows only three of the matches, because the fourth pair is the only one left over. For that reason, if the examiner wants to ensure that every awarded point corresponds to one fact that the student knows, the question would have to consist of five items.
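The elimination effect can be checked by brute force. The sketch below counts how many complete one-to-one matchings remain possible once the examinee knows some of the pairs. The function name `consistent_matchings` and the index-based representation of the columns are assumptions made for the illustration.

```python
from itertools import permutations


def consistent_matchings(n_rows, known):
    """Count the one-to-one matchings of n_rows column-1 items to n_rows
    column-2 items that agree with the pairs the examinee already knows.

    `known` maps a column-1 row index to its column-2 row index.
    """
    count = 0
    for perm in permutations(range(n_rows)):
        if all(perm[left] == right for left, right in known.items()):
            count += 1
    return count


# Four rows, three pairs known: only one matching is left, so the
# fourth pair comes for free.
print(consistent_matchings(4, {0: 1, 1: 3, 2: 0}))  # 1
# Four rows, two pairs known: two matchings remain, so the examinee
# still has to know (or guess) something.
print(consistent_matchings(4, {0: 1, 1: 3}))        # 2
```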
Section 3: Assertion/Reason
The Assertion/Reason question format presents the test-taker with two elements: one statement followed by another, where the second might be a reason for, or an explanation of, the first. For example, the Assertion might be “There are seven colours in the rainbow,” and the Reason might be “Because light is split into seven colours by water in the atmosphere.” The examinee is asked about the truth (or otherwise) of each statement, and whether the Reason actually is an explanation of the Assertion.
This kind of question gives the examiner an opportunity to check the depth of understanding of the examinee. Guessing is very difficult with this format. If an examinee gets the answer correct, it is usually after proper understanding and reasoning. According to an unreferenced source, A/R questions “encourage higher-order thinking on the part of the student.”
The answer choices in a typical A/R question are usually the following, with “A” meaning “Assertion” and “R” meaning “Reason”:
(1) Both A and R are true and R is a correct explanation of A.
(2) Both A and R are true but R is not a correct explanation of A.
(3) A is true but R is false.
(4) A is false but R is true.
If needed, a fifth option can be added:
(5) Both A and R are false.
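The five options cover every combination of truth values exactly once, which can be made explicit with a small Python sketch. The function name `assertion_reason_option` and its parameters are hypothetical, chosen only to illustrate how the options relate to each other.

```python
def assertion_reason_option(a_true, r_true, r_explains_a=False):
    """Return the number (1-5) of the correct option for an A/R question.

    `r_explains_a` only matters when both statements are true.
    """
    if a_true and r_true:
        return 1 if r_explains_a else 2
    if a_true:
        return 3
    if r_true:
        return 4
    return 5


print(assertion_reason_option(True, True, r_explains_a=True))  # 1
print(assertion_reason_option(True, True))                     # 2
print(assertion_reason_option(True, False))                    # 3
print(assertion_reason_option(False, True))                    # 4
print(assertion_reason_option(False, False))                   # 5
```

If the fifth option is left out, the examiner has to make sure that no question in the set has both statements false.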
 
The A/R question tests for three broad knowledge aspects at the same time.
>> It tests the student’s recall of information for two statements; in this sense, it combines two T/F questions into one.
>> It tests whether the student can correlate diverse information. That is, if the Assertion and Reason don’t have many words in common, the student might know the individual facts but fail to make the correlation.
>> If there is a correlation between A and R, the question tests whether the student can conduct a proper reasoning.
A word of caution: The truth of “this statement explains that one” can depend on context. It can also depend on the student’s interpretation. Let’s look at an example of a poorly worded question of this type:
Assertion: One metre is longer than one yard.
Reason: One yard equals 3 feet, and one inch is about 2.5 centimetres.
Choose from the following:
(1) Both A and R are true and R is a correct explanation of A.
(2) Both A and R are true but R is not a correct explanation of A.
(3) A is true but R is false.
(4) A is false but R is true.
(5) Both A and R are false.
Here, both (1) and (2) can be considered correct.
>> If the student were to know the facts presented in the Reason, and also the fact that one foot equals 12 inches, he might select (1) as the answer.
>> But suppose he realises that there is no mention of the relationship between feet and inches. He would then choose (2) as the answer.
Section 4: Relationship Analysis
Relationship Analysis (RA) is a question type often used on exams that involve numerical calculations. The examinee is presented with two values, and is asked about the relationship between them (in terms of greater/lesser).
The question is usually phrased such that the student must properly work out the mathematical problem to be able to answer. It is particularly useful because the examiner can gauge the student’s understanding at a deeper level without the need to think up plausible distracters.
Let’s take an example from algebra. The first value is [(a + 3b) (3a - b) - 5ab], which we can call “Value 1.” Expanding the product gives 3a^2 + 8ab - 3b^2, and subtracting 5ab leaves 3a^2 + 3ab - 3b^2. So Value 1 is equal to [3(a^2 + ab - b^2)], which we can call “Value 2.”
The student will know that Value 1 equals Value 2 only if he works out Value 1 on paper. How would this be framed as an MCQ?
If the examiner tries to construct a standard four-choice MCQ, he might ask something like “What is Value 1 equal to?” For that, he would supply Value 2 as one of the four answer options, along with three incorrect answers (distracters). The question might look like:
Question: What does (a + 3b) (3a - b) - 5ab reduce to?
Answer Options:

  A. 3(a^2 + ab - b^2)
  B. 4(a^2 + 2ab + b^2)
  C. (6a + 5b + 7ab + a^2 + b^2)
  D. (a^2 + 3ab)

Here, (A) is the correct answer. There are two difficulties for the examiner here:
>> Options B, C, and D should be plausible. In this example, options (C) and (D) are not mathematically plausible. Implausible distracters are, as a rule of thumb, a waste of time and effort for both examiner and examinee, and creating plausible distracters takes time.
>> More importantly, what can the examiner gauge about the student if he chooses, say, (C)? Nothing, except that the student has the answer wrong.
It is quite different with an RA-type question. Let’s convert the above problem to an RA question:
Question: Look at the two values below and choose from the four answer options.
Value 1: (a + 3b) (3a - b) - 5ab
Value 2: 3(a^2 + ab - b^2)
Answer Options:

  A. Value 1 is greater.
  B. Value 2 is greater.
  C. The two values are equal.
  D. The relationship cannot be determined from the information given.

The correct answer here is C (the values are equal). The advantage with this RA question, as compared to the standard four-choice MCQ, is that the examiner can gauge the examinee at different levels even if he doesn’t get the answer correct. This can be elaborated as follows:
>> If the examinee has chosen either A or B, then he has performed the calculation and committed an error.
>> If he has chosen C, he has performed the calculation and committed no error.
>> If he has chosen D, the examinee has not performed the correct calculation, or, he has performed it and committed a gross error.
With this understanding, different points can be awarded depending on which answer was chosen. Just as an example, four points might be awarded for choosing C, one point for choosing A or B, and no points for choosing D.
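As a minimal sketch, such a partial-credit scheme is just a lookup table. The point values below are the ones from the example above. The names `RA_POINTS` and `score_ra`, and the sample responses, are assumptions made for the illustration.

```python
# Points per answer choice for an RA question whose correct answer is C.
RA_POINTS = {"A": 1, "B": 1, "C": 4, "D": 0}


def score_ra(choice):
    """Return the points for one RA answer; unknown or blank choices earn 0."""
    return RA_POINTS.get(choice.strip().upper(), 0)


# Hypothetical responses from four examinees to the same question.
responses = ["C", "A", "D", "c"]
print([score_ra(r) for r in responses])    # [4, 1, 0, 4]
print(sum(score_ra(r) for r in responses))  # 9
```

An examiner who prefers a different weighting only needs to change the table. The grading logic stays the same.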
Section 5: Sentence Completion
An examinee’s ability to fill in the blanks in a complex sentence (or group of sentences) can indicate comprehensive grasp of the subject matter. The other key advantage of this kind of question is that the examiner can choose exactly what to test for.
Let’s say the learner should, during a course, have gained a basic understanding of personal computers, how they work, and the associated terminology. Here’s an example of a “Sentence Completion” question:
Although it appears complex, a personal computer is a/an ________ that carries out instructions fed to it. It takes what the user feeds into it via the keyboard and mouse, while using what is present in its ________. It gives out results in a ________ format, sometimes on the monitor, and sometimes on other devices such as the printer.
The three blanks are, in order, A, B, and C. The options are:
For blank A: Object, Instrument, Machine, Gadget
For blank B: Body, Memory, Instructions, Brain
For blank C: Presentable, Human-recognisable, Consistent, Machine-neutral
This might seem like three “standard” multiple-choice questions grouped together. Actually, it goes beyond that, because the three sentences come one after the other—meaning that a deeper understanding of the subject matter is required to get all three answers correct.
In the above example, each of the three blanks (and their associated options) tests for something distinct, as follows:
>> In blank A, the correct answer is “Machine.” If an incorrect answer were chosen, it would indicate that the examinee’s fundamental concept of a computer is off track.
>> In blank B, the correct answer is “Memory.” This is a technical detail. If an incorrect answer were chosen, it would indicate that the examinee’s factual knowledge is lacking.
>> In blank C, the correct answer is “Human-recognisable.” A knowledge of the answer involves a knowledge of terminology, and also an awareness of the context of the entire statement. If an incorrect answer were chosen—for example, “Presentable”—it would indicate poor recall on the part of the examinee, but more importantly, the absence of a contextual and/or comprehensive understanding of the paragraph. (This is a subtle point!)
Following that reasoning, the examiner can choose each blank to test for a different aspect of understanding. This implies that the examinee’s overall understanding can be tested by the paragraph with its 12 answer options.
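To illustrate, the sketch below grades the personal-computer example and reports which aspect of understanding each wrong blank points to. The correct answers come from the example above. The short aspect labels, and the names `BLANKS` and `diagnose`, are paraphrases and assumptions made for this illustration.

```python
# Correct answer for each blank, and the aspect of understanding it probes.
BLANKS = {
    "A": ("Machine", "fundamental concept"),
    "B": ("Memory", "factual detail"),
    "C": ("Human-recognisable", "contextual understanding"),
}


def diagnose(answers):
    """Return (score, gaps): the number of correct blanks and the list of
    aspects for which the examinee's answer was wrong or missing."""
    gaps = [
        aspect
        for blank, (correct, aspect) in BLANKS.items()
        if answers.get(blank) != correct
    ]
    return len(BLANKS) - len(gaps), gaps


print(diagnose({"A": "Machine", "B": "Memory", "C": "Presentable"}))
# (2, ['contextual understanding'])
```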
 
This part of the whitepaper, Part III, is about types of multiple-choice questions other than the “basic,” four-option type. For more information and tips, see Part I and Part II.
©Focalworks 2008
 
