1.  What is the most appropriate number of items?

In principle, the more items a test includes, the more reliable and valid it will be as an assessment. As a rough estimate, about 40 items can be included in a one-hour test.

The number of items per section or thematic unit should reflect the emphasis that section or unit received in teaching.
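The proportional distribution of items can be sketched in Python. This is only an illustrative sketch: the function name, the unit names, and the use of class hours as the measure of emphasis are assumptions, not part of the original recommendations. It uses the largest-remainder method so that the counts always add up to the total.

```python
from math import floor


def allocate_items(total_items, emphasis):
    """Split total_items across units in proportion to their teaching emphasis.

    emphasis maps a unit name to a non-negative weight (assumed here to be
    class hours). The largest-remainder method keeps the sum equal to
    total_items.
    """
    total_weight = sum(emphasis.values())
    if total_weight <= 0:
        raise ValueError("emphasis weights must sum to a positive number")

    # Exact (possibly fractional) share of items for each unit.
    exact = {unit: total_items * w / total_weight for unit, w in emphasis.items()}

    # Start from the rounded-down share.
    counts = {unit: floor(share) for unit, share in exact.items()}

    # Give the leftover items to the units with the largest fractional parts.
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(exact, key=lambda u: exact[u] - counts[u], reverse=True)
    for unit in by_remainder[:leftover]:
        counts[unit] += 1
    return counts


# Hypothetical course: 40 items, weights are class hours per unit.
plan = allocate_items(40, {"Unit A": 10, "Unit B": 6, "Unit C": 4})
print(plan)  # {'Unit A': 20, 'Unit B': 12, 'Unit C': 8}
```

Any other measure of emphasis (number of sessions, learning objectives covered) could be used as the weight in the same way.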

2.  When writing items, what can I do to avoid giving clues that make it easier to choose the correct answer even when the student does not know it?

  • Avoid grammatical inconsistency
  • Prevent the correct answer from being longer and more elaborate than the others
  • Avoid absurd alternatives
  • Avoid absolute determiners (such as always, never, etc.)


3. Which format is more appropriate: actual questions or unfinished sentences?

  • Items phrased as questions are slightly easier than items phrased as incomplete sentences
  • There is no special effect on discrimination: items discriminate in a similar way (they separate the most knowledgeable students from those who lack knowledge) whether they are formulated as questions or as incomplete sentences
  • Actual questions increase the reliability (by around .065) and the validity (correlations with other criteria) of the test as a whole


4. What should I know about using negative particles (no, never, except, less) in the formulation of the questions?

In general, negative particles should not be included in the formulation of the item because they can lead to mistakes even when the student knows the correct answer. If one is used, the negative particle should be made clearly recognizable (underlined, italicized, or in UPPERCASE).


5. What information should items provide?

It is preferable to keep the questions short and not provide more information than necessary.


6. What evaluation is made regarding the use of "all of the above" and "none of the above" as possible answers to the items?

  • The answer "all of the above" is not recommended because it can provide a strong clue for discarding some alternatives. If a student only knows that one of the proposed alternatives is true, then she/he only needs to guess between that response and "all of the above"
  • Regarding the use of "none of the above" answer:
    • Questions that include it tend to be more difficult (especially when it is the correct answer)
    • They are slightly less discriminating (they differentiate less between those who know more and those who know less)
    • In items that pose small problems with numerical answers, this answer (none of the above) attracts those who do not trust their own result
  • Issues to consider for use:
    • It is preferable that they appear among the first questions of the test
    • Use them in place of bad distractors when no better option is available, and in no more than a fourth or a fifth of the items
    • It must be the right answer in a similar proportion (in the fourth or fifth of the items that have this option)
    • Use it in relatively difficult questions and where there is clearly only one correct answer


7. As for the number of answer alternatives, the various positions are based on multiple investigations. Overall, it is considered that:

  • Two alternatives (right or wrong, or the classic true-false) discriminate better only in the upper part of the distribution: those who know more (and do not answer at random) are better differentiated, while the rest of the distribution appears less differentiated
  • Three alternatives discriminate and inform better in the center of the distribution (the "best" and the "worst" are less differentiated from each other)
  • Four or more alternatives work best in the lower part of the distribution: those who know the least are more likely to guess, and more alternatives give them more opportunities to make mistakes


8. As for the number of possible answers, what is the most practical option?

From a practical standpoint, the best option appears to be three answers (one true and two false), because it saves time in preparing the test and makes the test shorter.


9. Is it appropriate to use true-false questions? What should I know about their difficulty?

  • They are easier (there is a greater chance of finding the right answer by guessing)
  • They are less discriminating
  • The reliability of the whole test is lower when the number of items is kept constant

The difficulty of these questions depends on two factors: their formulation and which answer is the right one. Let's look at how these elements relate:

Formulation of the item    Easier when the correct answer is...    More difficult when the correct answer is...
Positive                   True                                    False
Negative                   False                                   True


To achieve sufficient reliability and reduce the influence of guessing, you need more items (5 True-False items for every 3 Multiple Choice items to obtain comparable reliability).


10. Items with multiple correct answers allow different styles and ways of answering (and, consequently, different instructions for students) and of correction. Below are some examples:

a)   The item may be a question (or its equivalent, an incomplete statement) in which all the answers belong to the same area and are presented as possible answers to the initial question.

b)   The item can also be formulated so that the question is a simple header for the remaining questions, which can be quite independent of each other but fall within the same general subject. The easiest way would be to ask "Which of the following statements are true (or false)?"

In principle, it is preferable that the answers or subquestions belong to the same field, because this helps the student focus on the topic. The model described in option a) is therefore recommended.

Note that:

a) these items are acceptable in terms of reliability and validity,

b) guessing is better controlled than in the alternative multiple true/false method,

d) they are easier to correct mechanically or electronically.


11. There are multiple ways to minimize the effect of responses by guessing. Here are some examples:

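Corrected score = R - W / (k - 1)    (Thurstone, 1919; Holzinger, 1924)

where R is the number of right answers, W the number of wrong answers, and k the number of possible responses per item.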

This formula does not assume that all items have the same number of responses, although in that case (the most common) it is easier to apply.

If the items have different numbers of possible responses, each item is scored as follows:

Right answer = 1; wrong answer = -1 / (k - 1), where k is the number of possible responses of that item    (Thurstone, 1919; Holzinger, 1924)

Corrected score = R + O / k    (Gulliksen, 1950)

In this formula the final total is the number of right answers (R) plus a credit for the omitted ones (O). Each skipped or omitted answer is given the value 1 / k (k is the number of responses for each question; if all the questions have four answers, each skipped answer is worth .25). The student is thus discouraged from answering randomly by rewarding omission instead of punishing the mistake.
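The two scoring approaches in this section can be sketched in Python. This is an illustrative sketch that assumes the conventional forms of these formulas: right answers minus wrong answers divided by (k - 1) for the penalty approach, and right answers plus 1 / k per omitted answer for the omission-credit approach. The function names and the example numbers are hypothetical.

```python
def corrected_score(right, wrong, k):
    """Penalty approach: right answers minus 1/(k - 1) per wrong answer.

    Assumes every item has the same number of alternatives k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    return right - wrong / (k - 1)


def corrected_score_mixed(responses):
    """Per-item version for tests whose items differ in number of alternatives.

    responses is a list of (outcome, k) pairs, where outcome is "right",
    "wrong", or "omitted" and k is that item's number of alternatives.
    """
    score = 0.0
    for outcome, k in responses:
        if outcome == "right":
            score += 1.0
        elif outcome == "wrong":
            score -= 1.0 / (k - 1)
        # Omitted items neither add nor subtract points.
    return score


def omission_credit_score(right, omitted, k):
    """Omission-credit approach: right answers plus 1/k per omitted answer."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return right + omitted / k


# Hypothetical 40-item test with four alternatives per item:
# 28 right, 9 wrong, 3 omitted.
print(corrected_score(right=28, wrong=9, k=4))          # 25.0
print(omission_credit_score(right=28, omitted=3, k=4))  # 28.75

# A student who guesses all 40 items is expected to get 10 right and 30 wrong,
# which the penalty approach brings down to zero.
print(corrected_score(right=10, wrong=30, k=4))         # 0.0
```

The two approaches produce scores on different scales, so a pass mark chosen for one should not be reused unchanged for the other.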


12. Circumstances in which it is advisable to use guessing correction formulas:

  • When there is little time to answer; in this case there are more random answers, although this circumstance should be avoided.
  • In difficult tests (or when it can be presumed that students have little preparation), or when the requirements for passing are rather low, as may be the case in less demanding selection tests. Generally, the formula is useful when many students will presumably not know what to answer in many questions.


13.    UPF provides a correction service for exams (test-type) through optical scanning. If you are interested and/or want more information, please follow this link: http://www.upf.edu/bibtic/serveis/lectura/

14.    If you are interested in making tests via your "Global classroom" (Aula Global service) and don't know how to do it, you may want to check the General Settings Guide to Global Classroom for Teachers (http://www.upf.edu/bibtic/lafactoria/guiespdi.html), where you can find information about all the possibilities, or just refer to the section "Questionnaires" (http://www.upf.edu/bibtic/lafactoria/_pdf/Questionaripreguntes.pdf).


Recommendations extracted from:

Morales, P. Las pruebas objetivas: normas, modalidades y cuestiones discutidas. Facultad de Ciencias Humanas y Sociales, Universidad Pontificia Comillas, Madrid (latest version, 17 Dec 2006).