Monday, August 25, 2008

Adverse impact on personality and work sample tests

The latest issue (Autumn, 2008) of Personnel Psychology has so much good stuff in it that I'm going to split it into two parts.

The first part, which I'll do today, focuses on the more selection-oriented articles which have to do with adverse impact on personality tests and in work sample exercises. In my next post I'll talk about three more articles that have to do with strategic HRM practices.

Today let's talk about adverse impact. It's a persistent dilemma, particularly given many employers' desire to promote diversity and the legal consequences that adverse impact can bring. One of the "holy grails" of employee assessment is finding a tool that is generally valid, inexpensive to implement, and does not result in large amounts of adverse impact.

One type of instrument that has been suggested as fitting these criteria is the personality test. Personality tests are easy to administer and can be valid predictors of performance, but our knowledge of group differences on them has up until now been limited. In this issue of Personnel Psych, Foldes, Duehr, and Ones present meta-analytic evidence that attempts to fill in the blanks.

Their study of Big 5 personality factors and facets is based on over 700 effect sizes. So what did they find? There is definitely value to separating the factors from the facets, as they show different levels of group difference. And most of the group differences (in cases with decent sample sizes) were small to moderate. Here are some of the largest and most robust findings (e.g., 90% confidence interval does not include zero):

- Whites scored higher than Asians on even-temperedness (an aspect of emotional stability; d=.38)
- Hispanics scored higher than Whites on self-esteem (an aspect of emotional stability; d=.25)
- Blacks outscored Asians on global measures of emotional stability (d=.58)
- Blacks outscored Asians on global measures of extraversion (d=.41)
- Hispanics outscored Blacks on sociability (d=.30)
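
For anyone rusty on what d means: it's the standardized mean difference, i.e. the gap between two group means expressed in pooled standard deviation units. Here's a minimal Python sketch of the textbook formula. The function name and the scores are made up for illustration, and the meta-analysis may apply corrections that aren't shown here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) using the pooled SD.

    Positive values mean group_a scored higher on average.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scale scores, invented purely to exercise the function.
scores_a = [2, 4, 6]
scores_b = [1, 3, 5]
print(cohens_d(scores_a, scores_b))
```

A d of .30, then, means the two group averages sit about a third of a standard deviation apart.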

The article includes a very useful chart that summarizes the findings and includes indications of when adverse impact may occur given certain selection ratios. What I take away from all this is that the classic racial discrimination situation employers are worried about in the U.S. (Whites scoring higher than another group) is less of a concern with personality tests than with, say, cognitive ability tests. But (and this is a big but), it doesn't take much group difference to result in adverse impact (see Sackett & Ellingson, 1997).
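
To get a feel for how little group difference it takes, here's a rough Python sketch. It rests on several assumptions that are mine, not the article's: scores are normally distributed in both groups with equal standard deviations, selection is a single top-down cut score, and the selection ratio is defined within the higher-scoring group (a simplification, not necessarily the setup Sackett & Ellingson use). The .80 threshold is the commonly cited "four-fifths" rule of thumb, and the function name is hypothetical.

```python
from statistics import NormalDist


def adverse_impact_ratio(d, higher_group_selection_ratio):
    """Selection rate of the lower-scoring group divided by that of the
    higher-scoring group, for one cut score on a normally distributed test.

    d: standardized mean difference between the groups (in SD units).
    higher_group_selection_ratio: fraction of the higher-scoring group that
        passes the cut (assumption: the ratio is defined within that group).
    """
    std_normal = NormalDist()  # mean 0, SD 1
    # Cut score, in z units of the higher-scoring group, that passes the
    # requested fraction of that group.
    cutoff = std_normal.inv_cdf(1 - higher_group_selection_ratio)
    # The lower-scoring group's mean sits d SDs lower, so relative to its own
    # distribution the same cut score is d SDs further out.
    lower_group_rate = 1 - std_normal.cdf(cutoff + d)
    return lower_group_rate / higher_group_selection_ratio


# Sweep a few d values from this post across several selection ratios.
for d in (0.25, 0.38, 0.73):
    for selection_ratio in (0.10, 0.50, 0.90):
        ratio = adverse_impact_ratio(d, selection_ratio)
        flag = "below 4/5" if ratio < 0.80 else "ok"
        print(f"d={d:.2f}  SR={selection_ratio:.2f}  AI ratio={ratio:.2f}  {flag}")
```

Under these assumptions, even d = .25 hovers right around the four-fifths line when half of the higher-scoring group passes, and it falls well below that line once you get very selective.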

The second article is also about group differences. This time it's work sample tests and it's a meta-analysis of Black-White differences by Roth, Bobko, McFarland, and Buster.

The authors analyzed 40 effect sizes in their quest to dig further into this subject--and it's a good thing they did. A group difference (d) benchmark often cited for these exercises is .38 in favor of Whites. These authors obtained a value of .73, but with an important caveat--this value depends greatly on the particular work sample test.

For example, in-basket and technical exercises (e.g., reading a construction map) yielded d values of .74 and .76, respectively. On the lower end, oral briefings and role-plays had d values of .22 and .21, respectively. Scheduling exercises were in the middle at d=.52.

Why the difference? The authors provide data indicating that the more saturated a measure is with cognitive ability or job knowledge, the higher the d values. The more the exercise requires demonstrating social skills, the lower the d values.

Bottom line? Your choice of selection measure should always be based on the KSAs required per the job analysis. But given a choice between different exercises, consideration should be given to the group differences described above. Blindly selecting a work sample over, say, a cognitive ability test may not yield the diversity dividends you anticipate (in addition to the fact that work samples may not be as predictive as we previously thought!).

Some important caveats should be noted about both of these pieces of research: (1) adverse impact is heavily dependent on factors other than group differences, such as applicant population, selection ratio, and stage in the selection process; and (2) from a legal perspective, adverse impact is only a problem if you don't have the validity evidence to back it up. Of course you should have this evidence anyway, because that's how you're deciding how to filter your candidates...right?

