Sunday, November 28, 2010

Research potpourri

Here's a very quick round-up of several recent pieces of research:

Bott et al. on how individual differences and motivation impact score elevation on non-cognitive measures (e.g., personality tests)

Ackerman, et al. on cognitive fatigue during testing

Robie, et al. on the impact of coaching and speeding on Big Five and IM scores

Evers, et al. on the Dutch review process for evaluating test quality (pretty cool; in press version here)

And now, back to figuring out why I eat too much on Thanksgiving. Every year.

Saturday, November 20, 2010

New issues of Personnel Psych and IJSA: Muslim applicants, selection perceptions, and more

Two of the big journals have come out with new issues, so let's take a look at the highlights:

First, in the Winter 2010 issue of Personnel Psychology:

King and Ahmad describe the results of several experiments that found interviewers and raters altered their behavior for confederates exhibiting obvious Muslim-identified behavior (e.g., clothing) depending on whether applicants exhibited stereotype-inconsistent behavior. For those that didn't, reactions were shorter and more negative. On the other hand, no difference was found in offers between those dressed in Muslim-identified clothing and those that weren't. So behavior--specifically its stereotypicality--and not simply something obvious like dress, may be key in predicting/preventing discriminatory behavior.

Do you consistently read the "Limitations" section of journal articles? Brutus et al. did, for three major I/O journals from 1995 to 2008, and found that threats to internal validity were the most commonly reported limitation. Interestingly, they also found that the nature of limitations reported changed over time (e.g., more sampling issues due to volunteers, variance issues). You can see an in press version here.

Next up, Henderson reports impressive criterion-related validity (combined operational = .86) for a test battery consisting of a g-saturated exam and a strength/endurance exam after following a firefighter academy class for 23 years. He suggests that employers have considerable latitude in choosing exams as long as they are highly loaded on these two factors, and also suggests approximately equal weighting.

Struggling to communicate the utility of sound assessment? Last but not least, Winkler et al. describe the results of sharing utility information with a sample of managers and found that using a causal chain analysis--rather than simply a single attribute--increased understanding, perceived usefulness, and intent to use.

Let's switch now to the December 2010 issue of the International Journal of Selection and Assessment (IJSA). There's a lot of great content in this issue, so take a deep breath:

Interested in unproctored internet testing? You'll want to check out Guo and Drasgow's piece on verification testing. The authors recommend using a Z-test over the likelihood ratio test for detecting cheating, although both did very well.

Walsh et al. discuss the moderating effect that cultural practices (performance orientation, uncertainty avoidance) have on selection fairness perceptions.

Speaking of selection perceptions, Oostrom et al. found that individual differences play a role in determining how applicants respond--particularly openness to experience. The authors recommend considering the nature of your applicant pool before implementing programs to improve perceptions of the assessment process.

Those of you interested in cut scores (and hey, who isn't) should check out Hoffman et al.'s piece on using a difficulty-anchored rating scale and the impact it has on SME judgments.

Back to perceptions for a second, Furnham and Chamorro-Premuzic asked a sample of students to rate seventeen different assessment methods for their accuracy and fairness. Not surprisingly, panel interviews and references came out on top in terms of fairness, while those that looked the most like a traditional test (e.g., drug, job knowledge, intelligence) were judged least accurate and fair. Interestingly, self-assessed intelligence moderated the perceptions (hey, if I think I'm smart I might not mind intelligence tests!).

And now for something completely different (those of you that get that reference click here for a trip down memory lane). Garcia-Izquierdo et al. studied the information requested in online job application forms from a sample of companies found on the Spanish Stock Exchange. A surprisingly high percentage of firms asked for information on their applications that at best would be off-putting, and at worst could lead to lawsuits, such as age/DOB, nationality, and marital status. The authors suggest this area of e-recruitment is ripe for scientist-practitioner collaboration.

Last but not least, a piece that ties the major topics of this post together: selection perceptions and recruitment. Schreurs et al. gathered data from 340 entry-level applicants to a large financial services firm and found that applicant perceptions, particularly of warmth/respect, mediated the relationship between expectations and attraction/pursuit intentions. This reinforces other research that has underlined the importance of making sure organizational representatives put your best foot/face forward.

Friday, November 12, 2010

Observer ratings of personality: An exciting possibility?

Using measures of personality to predict job performance continues to be one of the most active areas in I/O psychology. This is due to several things, including research showing that personality measures can be used to usefully predict job performance, and a persistent interest on the part of managers in tools that go beyond cognitive measures.

Historically the bulk of research on personality measures used for personnel assessment has been done using self-report measures—i.e., questionnaires that individuals fill out themselves. But there are other ways of measuring personality, and a prominent method involves having other people rate one’s personality. This is known as “other-rating” or “observer rating.”

Observer ratings are in some sense similar to self-report measures that ask the rater to describe their reputation (such as the Hogan Personality Inventory) in that the focus is on visible behavior rather than internal processes; in the case of observer ratings this takes on even greater fidelity since there is no need to “guess” at how behavior is perceived.

Research on observer ratings exists, but historically has not received nearly the attention given to self-ratings. Fortunately, in the November 2010 issue of Psychological Bulletin, Connelly and Ones present the results of several meta-analyses with the intent of clarifying some of the questions surrounding the utility of observer ratings and, with some caveats, largely succeed. The study was organized around the five-factor model of personality, namely Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Emotional Stability.

The results of three separate meta-analyses with a combined 263 independent samples of nearly 45,000 individuals yielded several interesting results, including:

- Self- and other-ratings correlate strongly, but not perfectly, and the correlations are stronger for the more visible traits, such as Extraversion and Conscientiousness, than for those less visible (e.g., Emotional Stability).

- Each source of rating seems to contribute unique variance, which makes sense given that you’re in large part measuring two different things—how someone sees themselves, their thoughts and feelings, compared to how they behave. Which is more important for recruitment and assessment? Arguably other-ratings, since we are concerned primarily not with inner processes but with behavioral results. And it certainly is not uncommon to find someone who rates themselves low on a trait (e.g., Extraversion) but is able to “turn it on” when need be.

- Self/other agreement is impacted by the intimacy between rater and ratee, primarily the quality of observations one has had, not simply the quantity. This is especially true for traits low in visibility, such as Emotional Stability.

- Inter-rater reliability is not (even close to) perfect, so multiple observers are necessary to overcome individual idiosyncrasies. The authors suggest gathering data from five observers to achieve a reliability of .80.

- Observer ratings showed an increased ability to predict job performance compared to self-ratings. This was particularly true for Openness, but for all traits except Extraversion, observer ratings out-predicted self-ratings.

- Just as interesting, observer ratings added incremental validity to self-ratings, but the opposite did not hold true. This was especially true for Conscientiousness, Agreeableness, and Openness.

- Not only were observer ratings better at predicting job performance, the corrected validities were considerably higher than have been documented for self-report measures, and in one case—Conscientiousness—the validity value (.55) was higher than that reported previously for cognitive ability (.50), although it should be noted that the corrections were slightly different. Values before correcting for unreliability in the predictor were significantly lower.
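To make the "five observers for .80" suggestion above a bit more concrete, here is a minimal sketch using the standard Spearman-Brown formula, which estimates the reliability of an average across several raters. The single-rater reliability of .45 is a hypothetical value chosen purely for illustration (it is not a figure quoted from Connelly and Ones), and the function names are my own.

```python
import math


def spearman_brown(single_rater_reliability: float, n_raters: int) -> float:
    """Estimated reliability of the average of n_raters ratings (Spearman-Brown)."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


def raters_needed(single_rater_reliability: float, target: float) -> int:
    """Smallest whole number of raters whose averaged rating reaches the target."""
    r = single_rater_reliability
    k = target * (1 - r) / (r * (1 - target))
    # Small tolerance so an exact whole-number answer is not rounded up by float error.
    return math.ceil(k - 1e-9)


# Hypothetical single-observer reliability, assumed only for this illustration.
assumed_single_rater = 0.45

print(raters_needed(assumed_single_rater, 0.80))
print(round(spearman_brown(assumed_single_rater, 5), 2))
```

Under that assumed starting value, five observers are enough to reach roughly .80. A lower single-rater reliability would call for more observers, which is part of why the logistics of collecting these ratings matter.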

Why do observer ratings out-perform self-ratings? Perhaps other-ratings are more related to observable job performance. Perhaps they do not contain as much error in the form of bias (such as self-management tendencies). Or perhaps because the measure is completely contextualized in terms of work behavior, whereas the questions in most self-report measures are not restricted to the workplace.

Despite these exciting results, some caution is in order before getting too carried away. First, the authors were unable to obtain a very large sample for observer ratings—sample sizes were around 1,000 for each trait. Second, there are currently a limited number of assessment tools that offer an observer rating option—the NEO-PI being a prominent example. Finally, there is an obvious logistical hurdle in obtaining observer ratings of personality. It is difficult to see how in most cases an employer would obtain this type of data to use in assisting with selection decisions.

However, even with that said the results are promising and we know of at least one way that technology may enable us to take advantage of this technique. What we need is a system where trait information can be gathered from a large number of raters relatively quickly and easily and the information is readily available to employers. One technology is an obvious solution, and it’s one of my favorite topics: social networking websites. Sites like Honestly (in particular), LinkedIn, and (to a lesser extent) Facebook have enormous potential to allow for trait ratings, although the assessment method would have to be carefully designed and some of the important moderators (such as degree of intimacy) would have to be built in.

This is an intriguing possibility, and raises hope that employers could soon have easy access to data that would allow them to add substantial validity to their selection decisions. But remember: let’s not put the cart before the horse. The question of what assessment method to use should always come after an analysis of what the job requires. Don’t jump headlong into other-ratings of personality when the job primarily requires analytical and writing skill.

This is just another tool to put in your belt, although it’s one that has a lot of exciting potential.

Saturday, November 06, 2010

Multimedia on the cheap: Xtranormal and Toondoo

When it comes to recruiting, one of the most important things an organization can do is make themselves stand out from other employers. And one of the best ways to do this is by using creative multimedia technologies that demonstrate both creativity and an openness to innovative approaches.

I've written before about some of the exciting technologies out there that employers can use for this purpose, such as simulations and realistic videos. But today I'd like to talk about two more abstract--but perhaps more fun--websites that are easy to use and have a lot of potential.

Oh yes, and they're free (for the basic version). Free is good.

The first is Xtranormal. You may have seen some of these videos on YouTube or elsewhere, as it seems to have rapidly become the technology of choice for creating quick animated videos. They have both a web-based design option ("text-to-movie") as well as a more fully featured downloadable version called State.

As the designer, you simply choose a setting, the number of actors, and type in the script. The website adds computer-generated voices for you. You also have control over other features, such as the camera angles for each scene, emotional expressions, and sounds.

Now I'm no designer--as you will quickly see--but I was able to make this video in all of about 10 minutes*.

If you're intrigued, also try which is similar but I found to be slightly more cumbersome to use.

The other technology is slightly more old school--panel cartoons. Yes, like the ones you see in the paper.

Here's an example, which took me about 10 minutes using Toondoo:


Neither of these websites is perfect. You'll find there's a small learning curve, and you'll wonder why they made certain design decisions. But there really is no reason not to at least try some of these tools.

*Note: In my day job I'm an HR Manager for the California Attorney General's office, but this blog is neither sanctioned nor supported by my employer. For better or worse, these animations were entirely my creation. But what kind of recruiter would I be if I didn't use this opportunity to promote, promote, promote!

Tuesday, November 02, 2010

November '10 J.A.P.

The November 2010 issue of the Journal of Applied Psychology is out, so let's take a look at the relevant articles:

Woods & Hampson report the results of a fascinating study looking at the relationship between childhood personality and adult occupational choice. The participants (N ~ 600) were given a 5-factor personality inventory when they were between 6 and 12 years old, then reported their occupation 40 years later using Holland's vocational types. Results? Openness/Intellect and Conscientiousness scores were correlated with occupational choice as adults. Implication? Career choice may be determined fairly early on in some cases and be influenced by how we're hard-wired as well as through friends, family, etc.

Perhaps even more interesting, the authors found that for the most strongly sex-typed work environments (think construction, nursing), the results related to Openness/Intellect were moderated by gender, supporting the idea that gender stereotyping in these jobs is impacted by individual differences as well as other factors (e.g., cultural norms).

Next up, a study by Maltarich, et al. on the relationship between cognitive ability and voluntary turnover, with some (what I thought were) counter-intuitive results. The authors focused on cognitive demands as the moderating factor, so think for a second about what you would expect to find for a job with high cognitive demands (think lawyer)--who would be most likely to leave, those with high cognitive ability or those with low? I naturally assumed the latter.

Turns out it was neither. The authors found a curvilinear relationship, such that those low and high in cognitive ability were more likely to leave than those in the middle.

What about jobs lower in cognitive demands? Who would you expect to have higher voluntary turnover--those with high cognitive ability or lower cognitive ability? I assumed high, and again I was wrong. Turns out the relationship the authors found in that case was more of a straight negative linear relationship: the higher the cognitive ability, the less likely to leave.

What might explain these relationships? Check out I/O at Work's post about this article for more details including potential explanations from the authors. It certainly has implications for selection decisions based on cognitive ability scores (and reminds me of Jordan v. New London).

Do initial impressions of candidates matter during an interview? The next article, by Barrick et al. helps us tease out the answer. The authors found that candidates that made a better impression in the opening minutes of an interview received higher interview scores (r=.44) and were more likely to receive an internship offer (r=.22). Evaluations of initial competence impacted interview outcomes not only with the same interviewer, but with separate interviewers, and even separate interviewers who skipped rapport building.

But perhaps more interestingly, the authors found that assessments of candidate liking and similarity were not significantly related to other judgments made by separate interviewers. Thus, while these results support the idea that initial impressions matter, they also provide strong support for using a panel interview with a diverse makeup, so that bias unrelated to competence is less likely to influence selection decisions.

Finally, Dierdorff et al. describe the results of a field study that looked at who might benefit most from frame-of-reference training (FOR). As a reminder, FOR is used to provide raters with a context for their rating and typically involves discussing the multi-dimensional nature of performance and rating anchors, and conducting practice sessions with feedback to increase accuracy and effectiveness (Landy & Conte, 2009).

In this case, the authors were interested in finding out whether individual differences might impact the effectiveness of FOR. What they found was that the negative impact of having a motivational tendency to avoid performance can be mitigated by having higher levels of learning self-efficacy. In other words, FOR training may be particularly effective for individuals that believe they are capable of learning and applying training, and overall results may be enhanced by encouraging this belief among all the raters.


Landy, F. and Conte, J. (2009). Work in the 21st century: An introduction to industrial and organizational psychology (Third Edition). Hoboken, NJ: Wiley-Blackwell.