Part 1 of 2
New York City, like the cities of New Haven and Chicago, has a long history of employment discrimination litigation related to its firefighter testing.
Since the 1970s and cases like Guardians, the city has been under scrutiny for its woefully low number of black firefighters.
In 2007, the city faced another lawsuit over its firefighter hiring practices, and in July of 2009, a U.S. District Court judge found that the city had violated Title VII by administering written exams from 1999 to 2007 that had high levels of adverse impact. The city marshaled an inadequate defense. In January of 2010, the same judge (Nicholas Garaufis) found the city liable for a pattern and practice of disparate treatment on those same exams. An adverse impact finding, particularly for written exams, and especially for public safety tests, is not earth-shattering. But a finding of disparate treatment in this situation is less common.
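For readers newer to the field: adverse impact on a written exam is typically screened with the Uniform Guidelines' four-fifths rule, under which a group selection rate below 80% of the highest group's rate is taken as evidence of adverse impact. A minimal sketch of the arithmetic, using invented pass rates rather than the actual exam figures:

```python
# The Uniform Guidelines' "four-fifths rule," the customary first screen
# for adverse impact. The pass rates below are invented for illustration;
# they are not the actual NYC exam figures.
def impact_ratio(group_rate, highest_rate):
    """Selection rate of one group divided by the highest group's rate."""
    return group_rate / highest_rate

ratio = impact_ratio(0.60, 0.90)
print(f"impact ratio = {ratio:.2f}")  # -> impact ratio = 0.67
print("adverse impact indicated" if ratio < 0.80 else "no adverse impact indicated")
```

The four-fifths rule is only a rule of thumb; in litigation, statistical significance tests on the rate difference typically carry more weight.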
This case, while only one example and limited in its impact, has some valuable lessons for test users and sheds some light on how judges look at our field. In particular, I describe below nine points the judge specifically made and what lessons we can draw from them:
1) While the city conducted a job analysis with an "extensive" list of tasks and surveyed incumbents, the city offered "no evidence of 'the relationship of abilities to tasks.'" The city did conduct a linkage exercise, but the judge found that the SMEs were confused about what they were supposed to do and didn't understand several of the abilities they were rating.
Lesson: simply having subject matter experts (SMEs) link essential tasks and knowledge, skills, and abilities (KSAs) is not sufficient. You need to ensure they understand the statements they are linking as well as how exactly they are supposed to be linking them.
2) In conducting the job analysis, the city inappropriately retained tasks and KSAs that could be learned on the job. It is quite clear (e.g., per the Uniform Guidelines) that only tasks and KSAs that are required upon entry to the job should be identified as critical in terms of exam development.
Lesson: when you develop exams based on job analysis results, make sure you focus only on those tasks and KSAs that are required upon entry to the job. This should be determined by your SMEs.
3) The city relied to some extent upon the work of a previous test developer, Dr. Frank Landy (who sadly recently passed away). In addition to a tenuous link between Dr. Landy's work and the current exams, the judge makes it clear that "reliance on the stature of a test-maker cannot stand in for a proper showing of validity." At the same time, the judge emphasizes that exams should be constructed by "testing professionals."
Lesson: tests should be developed by people who know what they're doing. This means HR professionals with the requisite background in test validation and construction in conjunction with job experts. Do not rely solely on previous efforts, particularly when (as in this case) the results of those efforts were either incomplete or not fully relevant to your current situation.
4) The city performed no "sample testing" to ensure that the questions were reliable as well as "comprehensible and unambiguous."
Lesson: few steps in the test development process are as easy--or as valuable--as pilot testing. I have yet to see an exam that didn't benefit from a "trial run" with a group of incumbents. Not only will you catch unintended flaws, you will verify that the exam is doing what you claim it is.
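One concrete thing pilot data buys you is an internal-consistency reliability estimate. Below is a minimal sketch of computing Cronbach's alpha from a matrix of scored pilot responses; the examinees, items, and numbers are all simulated, not drawn from the exams at issue:

```python
# A minimal sketch of estimating internal-consistency reliability
# (Cronbach's alpha) from pilot-test item responses. All data simulated.
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = pilot examinees, columns = scored items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Simulate 200 pilot examinees answering 20 items that all tap one ability.
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))
responses = (ability + rng.normal(size=(200, 20)) > 0).astype(int)
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Low alpha on a pilot sample is exactly the kind of early warning about unreliable, ambiguous items that the city never gave itself the chance to see.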
5) There was insufficient evidence that the exams actually measured the (nine cognitive) KSAs the city claimed they intended to measure. Plaintiffs were able to suggest the opposite by analyzing convergent and discriminant validity and by conducting a factor analysis.
Lesson: there are two linkages of primary importance in test development. The first was described in #1. The second is the link between critical KSAs and the exam(s). At the very least, you must be able to show evidence of a logical link between the two. When you claim to be measuring cognitive abilities, you incur an additional responsibility: gathering statistical evidence that supports this claim.
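To make the factor analysis point concrete, here is a rough sketch of the kind of structure check an expert might run on subscale scores: if an exam truly measures several distinct abilities, more than one substantial factor should emerge. The scores below are simulated for illustration; this is not the plaintiffs' actual analysis or data.

```python
# A rough sketch of a factor-structure check on an exam's subscale scores.
# If the exam measured nine distinct abilities, several substantial factors
# should emerge; one dominant factor suggests otherwise. Data simulated.
import numpy as np

rng = np.random.default_rng(1)
general = rng.normal(size=(500, 1))                         # one shared factor
subscales = general + rng.normal(scale=0.5, size=(500, 9))  # nine "ability" scores

corr = np.corrcoef(subscales, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_factors = int((eigvals > 1.0).sum())  # Kaiser criterion: eigenvalue > 1
print("eigenvalues:", np.round(eigvals, 2))
print("factors with eigenvalue > 1:", n_factors)
```

In this simulation the nine subscales share one general factor, so a single eigenvalue dominates and the Kaiser criterion retains only one factor; evidence like that undercuts a claim of measuring nine distinct KSAs.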
Next time: more lessons and the relief order.