HR Tests - Recruitment, assessment, and personnel selection: March 2017

Wednesday, March 29, 2017

Which employment tests work best? An update.

I'm not sure how I missed this one, but three researchers are updating Schmidt & Hunter's famous study of the validity of personnel selection procedures. And unlike much of the research in this area, it's free, both as a webpage and as a PDF! It's also a working paper, so proceed with caution.

Oh, and some of the results as they stand will probably make your head spin.

Let's back up. In 1998, Frank Schmidt and John Hunter published "The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 85 Years of Research Findings." Via meta-analysis, the authors reported the criterion-related validity coefficients of 19 selection methods. What does this mean? Basically, how well each type of employment test predicts job (or training) performance, expressed as a correlation between test scores and a measure of performance. It's what some consider the "gold standard" of validity because it's based on actual performance, not subject matter expert judgment. (I'll leave that debate for another day.)
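
If you've never seen one computed, here's a minimal sketch of a criterion-related validity coefficient: a plain Pearson correlation between test scores and later performance ratings. The data and the function name `validity_coefficient` are made up for illustration, and the published meta-analytic estimates typically involve statistical corrections that this sketch doesn't attempt.

```python
from math import sqrt


def validity_coefficient(test_scores, performance):
    """Pearson correlation between test scores and a performance criterion."""
    n = len(test_scores)
    if n != len(performance) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(test_scores) / n
    mean_y = sum(performance) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(test_scores, performance))
    var_x = sum((x - mean_x) ** 2 for x in test_scores)
    var_y = sum((y - mean_y) ** 2 for y in performance)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: six hires, their test scores, and supervisor ratings.
scores = [52, 61, 70, 74, 80, 88]
ratings = [2.9, 3.1, 3.0, 3.8, 3.6, 4.2]

r = validity_coefficient(scores, ratings)
print(f"validity coefficient: {r:.2f}")
```

The closer the result is to 1, the better the test ranks people in the same order as their eventual performance; a value near 0 means the test tells you little.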

The bottom line is that when they crunched these numbers, there were several "winners", including cognitive ability, work samples, integrity tests, and structured interviews. It still stands as one of the most frequently cited research articles in the field of industrial/organizational psychology.

Since then, there have been statistical advancements that researchers can use to improve the accuracy of their estimates, and of course more primary research has been done, which can be included in meta-analyses.

So all that said, what are the updated results (as currently reported)? Here are some highlights, which now include analyses of 31 different types of selection methods:

Cognitive ability or "general mental ability" (GMA) still reigns supreme. Casting aside for now that these tests tend to result in adverse impact, the criterion validity coefficient went up.
Unstructured interviews match structured interviews. "Whaaat?" you say? Take a look, and check the explanation for why this is, statistically. Remember there are other factors in play that should inform your decision about how to structure interviews (e.g., legal defensibility, merit rules).
Validity of interviews (both types) went up. Bottom line: interviews can work.
Validity of work samples went down. The authors use the value found in a 2005 study, and the results may have to do with these tests being used in different sectors than originally envisioned. Whatever the explanation, I scratch my head on this one because I'm a big believer in work sample tests.
Conscientiousness went down. But the authors remind us that studies using work-specific measures show improved results.
The T&E point method still under-performs. Many organizations have become besotted with this training and experience (T&E) approach, which gives candidates points based on how they respond to self-report inventories. These inventories are easy to develop and easy to automate, but they don't predict performance well compared to the other methods. Best used as an initial hurdle (if at all).
Strictly relying on years of experience or years of education doesn't work. Sorta calls the whole resume review thing into question, doesn't it?
Job knowledge tests still performed quite well. Some things never go out of style, like a well-developed test of concepts applicants must know to do the job.
Emotional intelligence measures did a decent job. They're not close to the top, but they're not useless either.
The best results still come from combining tests. The authors are fans of combining cognitive ability tests with integrity or job knowledge tests, but there are other combinations that work as well, and they out-perform using a single measure.
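
To see why combining tests helps, here's a rough sketch using the standard formula for the multiple correlation of two predictors. The input values are illustrative assumptions, not figures from the working paper, and `combined_validity` is a hypothetical helper name.

```python
from math import sqrt


def combined_validity(r1, r2, r12):
    """Multiple correlation R of two predictors with one criterion.

    r1, r2: validity of each test on its own.
    r12:    correlation between the two tests.
    """
    if abs(r12) >= 1:
        raise ValueError("the two tests must not be perfectly correlated")
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return sqrt(r_squared)


# Illustrative values only: two tests with validities of 0.50 and 0.40.
uncorrelated = combined_validity(0.50, 0.40, 0.0)  # tests measure different things
overlapping = combined_validity(0.50, 0.40, 0.5)   # tests overlap substantially

print(f"uncorrelated tests: R = {uncorrelated:.2f}")
print(f"overlapping tests:  R = {overlapping:.2f}")
```

The less the second test overlaps with the first, the more it adds, which is one way to think about why pairings of quite different measures tend to do well.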

So what does all this mean? Well, nothing and everything. Nothing, because how you test for a job should be based on that job. You should never just blindly pick an employment test out of a hat. The choice should be based on what competencies are required for that job, day one, along with other factors such as organizational context and operational considerations.

Everything, because the only real way we know whether these things work is to look at the data, and this type of research is about as good as we get. For now. Let's see where we're at a few years from now, when we have more information about emerging forms of measurement, such as simulations and VR. Until then, keep using good tests.