14 Validity

Validity has long been one of the major deities in the pantheon of the psychometrician. It is universally praised, but the good works done in its name are remarkably few.
— Robert Ebel

As noted by Ebel (1961), validity is universally considered the most important feature of a testing program. Validity encompasses everything relating to the testing process that makes score inferences useful and meaningful. All of the topics covered in this course provide evidence supporting the validity of scores. Scores that are consistent and based on items written according to specified content standards with appropriate levels of difficulty and discrimination are more useful and meaningful than scores that do not have these qualities. Correct measurement, sound test construction, reliability, and certain item properties are thus all prerequisites for validity.

These notes begin with a definition of validity and some related terms. Three common sources of validity evidence are then discussed: test content, documented in what’s referred to as a test blueprint or test outline; relationships with criterion variables; and theoretical models of the construct being measured. These three sources are then brought together within a unified view of validity. Finally, threats to validity are addressed.

14.1 Objectives

Learning objectives connected to these notes:

Define validity in terms of test score interpretation and use, and identify and describe examples of this definition in context.
Compare and contrast three main types of validity evidence (content, criterion, and construct), with examples of how each type is established, including the validation process involved with each.
Explain the structure and function of a test blueprint, and how it is used to provide evidence of content validity.
Calculate and interpret a validity coefficient, describing what it represents and how it supports criterion validity.
Describe how unreliability can attenuate a correlation, and how to correct for attenuation in a validity coefficient.
Identify appropriate sources of validity evidence for given testing applications and describe how certain sources are more appropriate than others for certain applications.
Describe the unified view of validity and how it differs from and improves upon the traditional view of validity.
Identify threats to validity, including features of a test, testing process, or score interpretation or use, that impact validity. Consider, for example, the issues of content underrepresentation and misrepresentation, and construct-irrelevant variance.

R analysis in this module is minimal. We’ll run correlations and make adjustments to them using functions that ship with base R, and we’ll simulate scores using epmr.

# R setup for this module
library("epmr")
# Functions we'll use
# cor() from the stats package, which is loaded by default with R
# rsim() from epmr to simulate scores
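
As a preview of the correlation work mentioned above, the following is a minimal sketch of a validity coefficient and its correction for attenuation. It is an illustration under stated assumptions, not the procedure used later in these notes: scores are simulated with rnorm() rather than rsim(), and the reliabilities (rel_x, rel_y), the true-score correlation (rho_true), and all object names are hypothetical values chosen for the example. The correction shown is the standard one, dividing the observed correlation by the square root of the product of the two reliabilities.

```r
# Sketch only: simulated data with assumed (hypothetical) parameter values
set.seed(14)
n <- 5000         # number of simulated test takers
rel_x <- 0.80     # assumed reliability of the test scores
rel_y <- 0.70     # assumed reliability of the criterion scores
rho_true <- 0.60  # assumed correlation between true scores

# True scores on the test and the criterion, each with variance 1
t_x <- rnorm(n)
t_y <- rho_true * t_x + sqrt(1 - rho_true^2) * rnorm(n)

# Observed scores: true score plus random error, with error variance
# set so that true variance / observed variance equals the reliability
x <- t_x + rnorm(n, sd = sqrt(1 / rel_x - 1))
y <- t_y + rnorm(n, sd = sqrt(1 / rel_y - 1))

# Validity coefficient: correlation between observed test and criterion
r_xy <- cor(x, y)

# Correction for attenuation: estimate of the true-score correlation
r_corrected <- r_xy / sqrt(rel_x * rel_y)

round(c(observed = r_xy, corrected = r_corrected), 2)
```

Because both sets of scores contain measurement error, the observed coefficient should fall below the true-score correlation, and the corrected value should land close to rho_true. In practice the reliabilities are themselves estimates, so a corrected coefficient is best read as an approximation.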