Posted By Doug Peterson
In Part 1 of this series, I covered how an assessment must be reliable and valid, with “valid” meaning “tests what it’s supposed to test.” This means that before you can start writing items for an assessment, you need to know what topics to cover and how many items are needed for each topic.
This process starts with a Job Task Analysis (JTA) exercise. You need to understand what tasks a person in the position in question performs or supervises, how often they do the task, how important the task is to their job, and how hard it is to perform the task. The results of the JTA are typically used to develop a competency model.
From the JTA/competency model you then develop a Test Content Outline (TCO), which might also be called a test blueprint or a test specification. This document drives the test item development process. Test items developed this way can then be easily mapped back to individual tasks/competencies in the JTA or competency model, ensuring that your assessment is testing what it is supposed to test.

The TCO describes the content areas to be covered in the assessment. The next step is to determine how many items should be written for each content area. There are several factors that must be taken into account when performing this step:

  • Criticality of the content – Is this content “must know” or “nice to have”? Required knowledge necessitates more thorough testing, which means more items.
  • Size of the content area – A larger content area requires more test items than a smaller content area.
  • Homogeneity – Does everything in the content area require the same knowledge, skills or abilities? If so, fewer questions are needed.
  • Consequences – What happens if the learner doesn’t grasp the concepts in the content area? Do they have to take more training? Do they lose their job? As the stakes go higher, you need more items to ensure that the learner’s true knowledge is being assessed.
  • Available resources during testing – If the test is going to be open book and/or open notes, you will need more (and more difficult) items to truly assess the learner’s knowledge vs. their ability to quickly look things up.

Using the factors listed above, test content areas should be weighted to help determine the number of items to be written for each area. This is best done by a group of Subject Matter Experts (SMEs) in an exercise similar to the Angoff method of determining a cut score. Each SME should rate each content area of the TCO on three factors:

1. Criticality – 0 (unimportant) to 4 (extremely critical)

2. Difficulty – 0.5 (easy), 1.0 (moderate), or 1.5 (hard)

3. Size relative to the other content areas – 0 (too small to include) to 4 (very large)

The ratings from the SMEs for each factor (criticality, difficulty and size) should then be averaged to come up with a single rating for each factor for each content area. Then it’s time for a little math.

Number of items on the test for a content area = criticality x size x difficulty

Number of items needing to be written for a content area = 3 x number of items on the test for a content area
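
To make the arithmetic concrete, here is a minimal Python sketch of the averaging step and the two formulas above. The SME ratings, the content area, and the function name `items_for_area` are invented for illustration only. The result is left unrounded, since how to round is a judgment call this post does not prescribe.

```python
from statistics import mean

# Hypothetical ratings from four SMEs for a single content area.
sme_ratings = {
    "criticality": [4, 3, 4, 3],          # 0 (unimportant) to 4 (extremely critical)
    "difficulty": [1.0, 1.5, 1.5, 1.0],   # 0.5 (easy), 1.0 (moderate), 1.5 (hard)
    "size": [2, 2, 3, 1],                 # 0 (too small to include) to 4 (very large)
}

def items_for_area(ratings):
    """Return (items on the test, items to write) for one content area."""
    # Average the SME ratings to get one value per factor.
    criticality = mean(ratings["criticality"])
    difficulty = mean(ratings["difficulty"])
    size = mean(ratings["size"])

    # Items on the test = criticality x size x difficulty.
    on_test = criticality * size * difficulty
    # Items to write = 3 x items on the test.
    to_write = 3 * on_test
    return on_test, to_write

on_test, to_write = items_for_area(sme_ratings)
print(f"Items on the test: {on_test:.2f}")   # 3.5 x 2 x 1.25 = 8.75
print(f"Items to write:    {to_write:.2f}")  # 3 x 8.75 = 26.25
```

With these made-up ratings, the content area would get roughly 9 items on the test, drawn from a pool of about 26 written items.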

If a content area is determined to require fewer than 4 to 6 items on the test, it should be dropped or combined with another content area. Once you know how many items you need to write, you can assume an average of 10 minutes to write an item, and you should allow time for revisions (0.25 x total writing time).
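
The time estimate can be sketched the same way. This is only an illustration: the function names and the default cutoff of 4 items are assumptions (the guideline above gives a range of 4 to 6, so pick whatever value suits your program).

```python
MINUTES_PER_ITEM = 10    # average time to write one item
REVISION_FACTOR = 0.25   # revision time as a fraction of total writing time

def writing_time_minutes(items_to_write):
    """Estimate total minutes: writing time plus the revision allowance."""
    writing = items_to_write * MINUTES_PER_ITEM
    revisions = REVISION_FACTOR * writing
    return writing + revisions

def is_too_small(items_on_test, minimum=4):
    """Flag a content area that should be dropped or combined.

    The default minimum of 4 is an assumption; the guideline is 4 to 6.
    """
    return items_on_test < minimum

# Example: 18 items to write -> 180 minutes writing + 45 minutes revisions.
print(writing_time_minutes(18))   # 225.0
print(is_too_small(3))            # True
print(is_too_small(6))            # False
```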