HR Tests - Recruitment, assessment, and personnel selection: Situational judgment tests

Thursday, March 05, 2015

Mega research update

I hope you like research, because there's a lot of it coming your way...and much of it is free as of this posting!

Without further ado...

Let's start with the Journal of Applied Psychology, January 2015 issue:

- We see a lot of research involving large candidate groups, but much less for individuals. In this meta-analysis of individual assessments, the authors found support for their usefulness, but it varied significantly across studies. Highest validity was found for managerial jobs and assessments that included a cognitive ability test.

- Being in the wrong job can be frustrating for both the employee and the employer. In this study, the authors show a relationship between poor vocational fit and counterproductive work behaviors (CWBs).

- Speaking of CWB, there may be more of them going on than you would think based on the assessment literature...

- And even more on CWB! These authors found support for both self- and acquaintance-reported personality ratings, specifically conscientiousness and agreeableness, in predicting "workplace deviance".

- Unfortunately, gender bias still exists in selection. In this meta-analysis, the authors found this to be particularly the case in male-dominated jobs. On a positive note, they do suggest ways of mitigating this: provide clear evidence of the competence of applicants, encourage careful decision making, and use experienced raters.

- The over-/under-prediction of cognitive ability tests debate for different ethnic groups continues. In this study, the authors find support for overprediction for African Americans, suggesting the tests are not predictively biased against that group.

Next, the March issue of J.A.P.:

- More support for the predictive validity of emotional intelligence, but more importantly, how the concept overlaps with other constructs such as the Big 5 and self-efficacy.

- Not all situational judgment tests (SJTs) are equal, and according to these authors, in a large number of instances the context that is presumably so important? Not so much.

- Speaking of SJTs, these researchers suggest that putting the "situational" back in SJTs--i.e., assessing how the situation is analyzed rather than the response options--is a useful method.

- A fascinating update of effect size benchmarks that can be used for a variety of purposes.

- Trying to predict safety-related behavior? This research suggests that personality traits, particularly agreeableness, can usefully predict this behavior.

Moving on to the March issue of IJSA (free right now!):

- Some guidelines on preparing norms for personality inventories.

- Evidence that different cultures have different procedural justice perceptions of different selection mechanisms

- Some important findings on the equivalence and stability of job performance ratings over time

- Development of a new measure of subjective career success

- More evidence that both technical knowledge and prosocial knowledge are important factors in predicting medical student clinical performance

- This study found that CWBs are under-reported and organizational commitment increases the likelihood that peers will report them

- Evidence that forced-choice and Likert-type scales used in personality inventories have similar measurement properties

On to the Spring issue of Personnel Psych (also free right now!):

- This meta-analysis on narcissism showed that it is related to leadership emergence (through extraversion) and leadership effectiveness in a curvilinear fashion.

- More evidence of the importance of political skill--particularly the aspects of networking ability, interpersonal influence, and apparent sincerity--in predicting a range of important outcomes, including task performance beyond GMA and the Big 5. It would be interesting to see how this is related to emotional intelligence (yes this is a foreshadowing).

Turning to the March issue of Psych Bulletin:

- More on narcissism: this time, researchers found that men consistently report higher levels of narcissism compared to women, which is interesting when taken in combination with the study above.

In the December issue of Industrial and Organizational Psychology:

- The first focal article calls out researchers for using incorrect assumptions about criterion reliabilities, thus impacting criterion validity values. They make suggestions for how to improve meta-analyses moving forward.

- The second makes the important argument that utility analyses should consider measures of well-being when determining the effectiveness of interventions (such as an employment test).

Finally, in the January issue of JOB (also free right now):

- a proposal for improving the calculation and reporting of Cronbach's alpha

- a fascinating study showing that high conscientiousness may hinder performance during stressful situations

- in support of EI, this study found a link between emotion recognition ability and income (interestingly through political skill and interpersonal facilitation...remember the earlier study on political skill?).
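If it's been a while since you computed Cronbach's alpha by hand, here's a minimal sketch of the standard textbook formula (not the improved procedure the article above proposes). The response data and the function name `cronbach_alpha` are made up for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Standard Cronbach's alpha; rows = respondents, columns = items."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents answering 3 Likert-type items
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

The point estimate alone says nothing about precision, which is part of why calls for better reporting keep coming up.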

That's all for now!

Sunday, January 04, 2015

2014 Research of the Year (+ research update)

Happy New Year! As I've done in previous years, I present below the research articles I ran across in this area that I think were the most impactful and/or important of 2014. But first, let's catch up on two issues:

First, the Winter issue of Personnel Psychology:

- Situational judgment tests have been shown to be useful for measuring interpersonal skills, but beware: levels of "angry hostility" moderate that relationship. (Is there a happy hostility?)

- When hiring leaders, should you look for those that have a busy home life, or be wary of them? In this fascinating study, the authors found that leader family-to-work conflict negatively impacts followers in that it can increase their burnout. However, family-to-work enrichment increased follower engagement through leader engagement. So the answer is, as usual, not simple: home/family life can be a good thing for followers if it makes the leader more engaged; but if the home/family life is increasing burnout, the leader may pass that along to others. So it would seem it all depends on how the individual is handling their life outside of work!

Let's look at the November issue of the Journal of Applied Psychology:

- Are men or women perceived as better leaders? According to this meta-analysis, it depends on how you ask the question. If you limit the question to other-ratings, women are rated significantly higher. But if you look at self-ratings, men rate themselves significantly higher. Which leads to the next question: is it a biological perception or a gender perception, and if the latter, what traits are the most important?

- An intriguing study of how applicant confidence interacts with and can be altered by the recruitment experience, in this case among recruits for the U.S. military.

- Next, a study of employment and job search efficacy. Not surprisingly, within-person frequency of job search behavior correlated with job offers; interestingly, the relationship between perceived job search progress and efficacy beliefs was moderated by beliefs of internal attribution.

- Last but not least, more evidence of the importance of defining the criteria when predicting job performance. In this meta-analysis, the researchers found more support for personality traits out-predicting cognitive ability in predicting counterproductive work behavior, that the two predictors are approximately equal in predicting organizational citizenship behaviors, and that cognitive ability outperforms personality when predicting task and overall performance. So do you want high task performance, OCBs, or do you want to avoid CWB? :) (of course the situation is even more complicated depending on whether you're looking at individual, team, leader performance, over what period of time, etc.)

Okay, on to the awards! Without further ado, here are my nominations for Research of the Year for 2014:

1) Important advancements in our understanding of weight-based discrimination at work: Vanhove & Gordon.

2) A study of applicants posting faux pas on their social networking sites: Roulin.

3) Two important looks at assessments delivered remotely via mobile devices: Arthur, Doverspike, Munoz, Taylor, & Carr, and Morelli, Mahan, & Illingworth.

4) Two fascinating looks at personality at work: Judge, Simon, Hurst, & Kelley; and Wille & De Fruyt

5) An excellent study of how effective staffing and training practices impact firm-level flexibility and adaptability: Kim & Ployhart.

6) An important study of the movement of impactful I/O researchers to business schools: Aguinis, Bradley, & Brodersen.

7) The relationship between conscientiousness and job performance is more accurately described as curvilinear: Carter, Dalal, Boyce, O'Connell, Kung & Delgado

Finally, honorable mention to two great developments in 2014: the change of some publishers to making access to articles more affordable, and the announcement of an additional journal, the Journal of Personnel Assessment and Decisions.

I'm continually amazed at the quality of thought and research in our area and the passion and practicality you exhibit. Here's to an amazing 2014 and more in 2015!

Wednesday, August 06, 2014

Research update

I can't believe it's been three months since a research update. I was waiting until I got critical mass, and with the release of the September issue of IJSA, I think I've hit it.

So let's start there:

- Experimenting with using different rating scales on SJTs (with "best and worst" response format doing the best of the traditional scales)

- Aspects of a semi-structured interview added incremental validity over cognitive ability in predicting training performance

- Studying the use of preselection methods (e.g., work experience) prior to assessment centers in German companies

- The proposed general factor of personality may be useful in selection contexts (this one was a military setting)

- Evidence that effective leaders show creativity and political skill

- Investigating the relationship (using survey data) between personality facets and CWBs (with emotional stability playing a key role)

- Corrections for indirect range restriction boosted the upper end of structured interview validity substantially

- A method of increasing the precision of simulations that analyze group mean differences and adverse impact

- A very useful study that looked at the prediction of voluntary turnover as well as performance using biodata and other applicant information, including recruitment source, among a sample of call center applicants. Results? Individuals who had previously applied, chose to submit additional information, were employed, or were referrals had significantly less voluntary turnover.

Moving on...let's check out the May issue of JAP; there are only two articles but both worth looking at:

- First, a fascinating study of the firm-level impact of effective staffing and training, suggesting that they allow organizations greater flexibility and adaptability (e.g., to changing financial conditions).

- Second, another study of SJT response formats. The researchers found, using a very large sample, the "rate" format (e.g., "rate each of the following options in terms of effectiveness") to be superior in terms of validity, reliability, and group differences.

Next, the July issue of JOB, which is devoted to leadership:

- You might want to check out this overview/critique of the various leadership theories.

- This study suggests that newer models proposing morality as an important component of leadership success have methodological flaws.

- Last, a study of why Whites oppose affirmative action programs

Let's move to the September issue of Industrial and Organizational Psychology:

- The first focal article discusses the increasing movement of I/O psychology to business schools. The authors found evidence that this is due in large part to some of the most active and influential I/O researchers moving to business schools.

- The second is about stereotype threat--specifically its importance as a psychological construct and the paucity of applied research about it.

Coming into the home stretch, the Summer issue of Personnel Psych:

- The distribution of individual performance may not be normal if, as these researchers suggest, "star performers" have emerged

- Executives with high levels of conscientiousness and who display transformational leadership behavior may directly contribute to organizational performance

Rounding out my review, check out a few recent articles from PARE:

- I'm not even gonna attempt to summarize this, so here's the title: Multiple-Group confirmatory factor analysis in R – A tutorial in measurement invariance with continuous and ordinal indicators

- Improving exploratory factor analysis for ordinal data

- Improving multidimensional adaptive testing

Last but not least, it's not related to recruitment or assessment, but check out this study that found productivity increases during bad weather :)

That's all folks!

Tuesday, August 07, 2012

Research update

Okay, way past update time. Let's take a look at the latest research (this month's themes: core self evaluations, SJTs, and, of course, personality tests):

Let's start with a couple from the July issue of Journal of Applied Psychology:

- First, an application of signaling theory to selection. The authors point out that viewing selection through this lens, where the focus is on the honesty of communication between applicant and employer, can help shed light on the field and point to future directions.

- Speaking of honesty, the other study is about a proposed way to reduce faking on personality tests. Specifically, the authors looked at the efficacy of providing applicants feedback about their honesty midway through the test; looks like they found mixed results.

Next, let's look at one from the August issue of the Journal of Organizational Behavior:

- The authors studied the impact that psychological capital (e.g., optimism, self-efficacy) has on job search behavior. They found a positive relationship between capital and perceived employability, which itself was related to various good (i.e., problem-focused) and not-so-good (i.e., symptom-focused) coping strategies.

Next, two from the Autumn Personnel Psychology:

- More support for the idea of contextualizing personality inventories. What does that mean? Essentially tailoring the test for work situations, and better yet specific work environments. In this study the mean criterion-related validity jumped from .11 (non-contextualized) to .25 (contextualized).

- Second, what looks like a fascinating study of what factors impact applicant attraction at various stages in the recruitment process. Interestingly, perceived fit was the strongest predictor of attraction but was not a significant predictor of job choice (the strongest predictor was job characteristics). In addition, organizational characteristics and recruitment process characteristics became more important in later stages.

Okay, those were warm-ups. Let's get into the heavy hitter, the September issue of the International Journal of Selection and Assessment:

- First, more on the importance of core self evaluation (CSE). In this study the authors found support for CSE explaining incremental variance in performance over ability and conscientiousness. They propose that CSE does so through its impact on learning motivation.

- Think situational judgment tests can't be coached? Think again.

- Should O*NET information be based on analyst ratings or incumbent ratings? Yes. Looks like each provides value.

- Applicant reactions: always a popular topic. This time the location is Mumbai, India. Not surprisingly, resumes and interviews fared well, while graphology and honesty tests did not. However, in an interesting twist work sample tests were rated unfavorably.

- Do recruiters care about volunteer experience? Not really.

- Might test-takers get fatigued at the end of a long SJT? Yes. Might it impact the psychometric properties? Yes. Might it impact subgroup differences? Umm...sort of.

- How many different ways can you analyze the reliability of an SJT? Turns out, quite a few.

- Aaahhhh yes, emotional intelligence. Haven't heard from you in a while. In this study the authors found positive applicant reactions and incremental validity over ability and the Big 5 across three samples (for the MSCEIT).

- Last but not least, let's end with another personality test--well, integrity test really--the venerable Personnel Reaction Blank. This time the authors looked at cross-cultural generalizability (U.S. and Singapore) as well as gender differences.

That's all for now!

Sunday, November 20, 2011

Research update #583: Impression management and a lot more

Okay, I've got a lot of ground to cover this time, so buckle up...

Let's start with the December issue of IJSA:

- Looks like how much applicants try to make themselves look good varies by country

- Is applicant faking behavior related to job performance? Kinda depends on your definitions.

- Research has found that emotional intelligence can be related to work attitudes. This appears to be in part because of an increased situational judgment effectiveness.

- Speaking of situational judgment...in terms of job knowledge, knowing what to do is different than knowing what not to do

- What impact does a resume have on a recruiter? Depends on what assumptions they make about you after reading it.

- How do people select--and continue with--an executive coach? By looking at things like their ability to forge a partnership.

- How do Canadian firms do in terms of using tests other than interviews? Not so well, it turns out.

Let's move to the October issue of JASP, where there's just one article but it's a good one. Researchers continued the (depressing) finding that applicant names impact pre-interview impressions. Specifically, the more a name was Anglicized, the more favorable the impression was when hiring for an outside sales job.

Next comes the November issue of JAP:

- A new meta-analysis of the FFM of personality and its relationship to OCBs and task performance.

- Measures of interest haven't gotten a lot of love as selection devices. Looks like we need to tease out the constructs a little because they could be more helpful than we thought.

- Applicants trying to create a certain image during an interview are better off doing this after an initial flub or relying solely on self-promotion rather than making up an image.

A few from the November issue of JPSP:

- Another on impression management (not selection-specific) that goes into more detail about the topic (e.g., how many tactics people use, their accuracy)

- A caution about using the Revised NEO-PI in different cultures due to DIF.

Next, a call for more transparency in false-positive findings.

Last but not least, those of you interested in the potential of social ratings of performance being used for selection might be interested in this study of RateMyProfessors.com, which found student ratings are likely to be useful measures of teacher quality.

Saturday, September 24, 2011

Research update + happy anniversary...to me!

Two things this time: we've got a lot of research to go over, and then a bit of a celebration! First the research.

The September issue of the Journal of Applied Psychology is out. Let's see what it has to offer:

- Using performance ratings as an assessment or a criterion? You'll want to look at Ng, et al.'s study of leniency and halo errors among superiors, peers, and subordinates of a sample of military officers.

- Speaking of criteria, you may be interested in Northcraft et al.'s study of how characteristics of the feedback environment influence resource use among competing tasks. Interesting stuff.

- Okay, let's turn to something more traditional. Berry, et al. look at correlations between cognitive ability tests and performance among different ethnic groups. Not surprising to those of you familiar with the research, the largest difference found was between White and Black samples.

- Another traditional (but always interesting) topic: designing Pareto-optimal selection systems when applicants belong to a mixture of populations. Check out De Corte, et al.'s piece. Oh, you might be interested in the in-press version.

- Dr. Lievens (a co-author on the previous study) has been busy. He and Fiona Patterson collaborate on a study of the incremental validity of simulations, both low fidelity (SJTs in this case) and high fidelity (assessment centers), beyond knowledge tests. Yes, both had incremental validity, and interestingly ACs showed incremental validity beyond SJTs. Check out the in press version as well.

- Wondering whether re-testing degrades criterion-related validity or impacts group differences? You're in luck because Van Iddekinge, et al. present the results of a study of just that. Short version? Re-testing actually did a lot of good.

- I know what you're thinking: "Might Lancaster's mid-P correction to Fisher's exact test improve adverse impact analysis?" Check out Biddle & Morris' study for an answer.

- And now that you've had your fill of that statistical analysis, you find your mind wandering to effect size indices for analyzing measurement equivalence. I'm right there with ya. So are Nye & Drasgow.
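For the statistically curious, here's a minimal sketch of what the mid-P idea mentioned above looks like for a one-sided (lower-tail) test on a 2x2 hiring table. This is my own illustration, not Biddle & Morris' procedure; the applicant counts and function names are hypothetical. The exact test counts the full probability of the observed table, while the mid-P version counts only half of it, which makes the test less conservative.

```python
from math import comb

def hypergeom_pmf(k, focal_n, total_hired, total_n):
    """P(exactly k focal-group hires) with all table margins fixed."""
    return (comb(focal_n, k)
            * comb(total_n - focal_n, total_hired - k)
            / comb(total_n, total_hired))

def lower_tail_p_values(focal_hired, focal_n, total_hired, total_n):
    """Return (Fisher exact p, Lancaster mid-P) for the lower tail."""
    # Smallest possible number of focal-group hires given the margins
    k_min = max(0, total_hired - (total_n - focal_n))
    p_below = sum(hypergeom_pmf(k, focal_n, total_hired, total_n)
                  for k in range(k_min, focal_hired))
    p_obs = hypergeom_pmf(focal_hired, focal_n, total_hired, total_n)
    exact = p_below + p_obs        # full weight on the observed table
    mid_p = p_below + 0.5 * p_obs  # half weight on the observed table
    return exact, mid_p

# Hypothetical data: 2 of 10 focal-group applicants hired,
# 7 of 10 reference-group applicants hired (9 hires out of 20 total)
exact, mid_p = lower_tail_p_values(focal_hired=2, focal_n=10,
                                   total_hired=9, total_n=20)
print(f"Fisher exact p = {exact:.4f}, mid-P = {mid_p:.4f}")
```

The mid-P value is always smaller than the exact one whenever the observed table has nonzero probability, and the gap matters most with small applicant pools.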

Let's turn now to the October issue of the Journal of Personality and Social Psychology because there are a few articles I think might interest you...

- First, Lara Kammrath with a fascinating study of how people's understanding of trait behaviors influence their anticipation of how others will react.

- Speaking of fascinating, George, et al. present the results of a 50-year longitudinal study of personality traits predicting career behavior and success among women. It makes you realize again how much has changed since the 1960s!

- I tell ya, this issue is chock full of goodness. Carlson et al. demonstrate that people can make valid distinctions between how they see themselves and how others see them--potentially informing the debate on personality inventories.

- Lastly, a piece by Specht et al. on how personality changes over the life span and why this might be. Fascinating implications for using personality inventories for selection.

Bonus article: remember how I mentioned using performance ratings above? Well you might be interested in an article by Lynn & Sturman in the most recent Journal of Applied Social Psychology where they found that restaurant customers sometimes rated the performance of same-race servers as better than those of different races--but it depended on the criterion.

FINALLY, I'm proud to announce that this blog has officially been going strong for five years. My first post (now incredibly hard to read) was in September of 2006. Back then the only other similar blog was Jamie Madigan's great (but now sadly defunct) blog, Selection Matters. My first email subscriber (from DDI if you're curious) came on a month later. Now I have almost 150 email subscribers and at least a couple hundred more who follow the feed. Around 3,000 individuals visit the site each month from over a hundred countries/territories (U.S., India, and Canada are 1-2-3). It's a labor of love and I thank you for reading!

Sunday, October 03, 2010

How to hire an attorney

What's the best way for an organization to hire an attorney with little job experience? What should they look for? LSAT scores? Law school grades? Interviewing ability? A multi-year project that issued its final report in 2008 gives us some guidance. And while the study focused on ways law schools should select among applicants, it's also instructive for the hiring process. (By the way, individuals looking for personal representation may find the following interesting as well.)

Recall that the formalization of the "accomplishment record" approach occurred in 1984 with a publication by Leaetta Hough. She showed, using a sample of attorneys, that scores using this behavioral consistency technique correlated with job performance but not with aptitude tests or grades, and showed smaller ethnic and gender differences.

But in my (limited) experience, many hiring processes for attorneys have consisted of a resume/application, writing sample, and interview. Is that the best way to predict how well someone will perform on the job?

Assessment research would strongly point to cognitive ability tests being strong predictors of performance for cognitively complex jobs. This is at least part of the logic of hurdles like the Law School Admission Test (LSAT), a very cognitively-loaded assessment. When you're at the point of hire, however, LSAT scores are relatively pointless. Applicants have--at the very least--been through law school, and may have previous experience (such as an internship) you can use to determine their qualifications.

So what we appear to have at the point of hire is a mish-mash of assessment tools, relying heavily on un-proven filters (e.g., resume review) followed by a measure of questionable value (the writing sample) and the interview, which in many cases isn't conducted in a structured way that would maximize validity.

So what should we do to improve the selection of attorneys (besides using better interviews)? Some research done by a psychology professor and law school dean at UC Berkeley may offer some answers.

The investigators took a multi-phase approach to the study. The first part resulted in 26 factors of lawyer effectiveness--things like analysis and reasoning, writing, and integrity/honesty. In the second phase they identified several off-the-shelf assessments they wanted to investigate for usefulness, and they developed three new assessments--a situational judgment test (SJT), a biodata measure (BIO), and other measures, including optimism and a measure of emotional intelligence (facial recognition). In the final phase, they administered the assessments online to over 1,000 current and former law students and looked at the relationship between predictors and job performance (N for that part was about 700, using self, peer, and supervisor ratings).

Okay, so enough with the preamble--what did they find?

1) LSAT scores and undergraduate GPA (UGPA) predicted only a few of the 26 performance factors, mainly ones that overlapped with LSAT factors such as analysis and reasoning, and rarely higher than r=.1. Results using first-year law school GPA (1L GPA) were similar.

2) The scores from the BIO, SJT, and several scales of the Hogan Personality Inventory predicted many more dimensions of job performance compared to LSAT scores, UGPA, and 1L GPA.

3) The correlations between BIO and SJT and job performance were substantially higher--in the .2-.3 range--compared to LSAT, UGPA, and 1L GPA. The BIO measure was particularly effective in predicting a large number of performance dimensions using multiple rating sources.

4) In general, there were no race or gender subgroup differences on the new predictors.
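To put those correlations in perspective, squaring r gives the proportion of variance in performance ratings that a predictor accounts for. A quick sketch of the arithmetic, using round values taken from the ranges above (the labels and the helper name are mine, not the report's):

```python
def variance_explained(r):
    """Proportion of criterion variance accounted for by a correlation r."""
    return r ** 2

# Round values from the ranges reported above
examples = [
    ("LSAT/UGPA, upper end", 0.10),
    ("BIO/SJT, low end", 0.20),
    ("BIO/SJT, high end", 0.30),
]

for label, r in examples:
    print(f"{label}: r = {r:.2f}, r^2 = {variance_explained(r):.1%}")
```

So moving from r = .1 to r = .3 means going from about 1% to about 9% of variance accounted for, which is a bigger jump than the raw correlations suggest.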

These results strongly suggest that when it comes to hiring attorneys with limited work experience, organizations would be well advised to use professionally developed assessments, such as biodata measures, situational judgment tests, and personality inventories, rather than rely exclusively on "quick and dirty" measures such as grades and LSAT scores. Yet another proof of the rule that the more time spent developing a measure, the better the results.

On a final note, several years back I did a small exploratory study looking at the correlation between law school quality and job performance. I found two small to moderate results: law school quality was positively correlated with job knowledge, but negatively correlated with "relationships with people."

References:
Here is the project homepage.
You can see an executive summary of the final report here.
A listing of the reports and biographies is here.
The final report is here.

Saturday, April 03, 2010

ClicFlic offers assessment innovation

A while back I posted about a creative use of technology that Vestas was using for onboarding. At the time I wrote about the potential I saw for the use of such technology for assessment, but actually creating these videos was a bit of a mystery from the customer side. Now I've come across a vendor that allows us to create these tools.

I don't post about specific products very often--usually I focus on research and best practices--but I have made occasional exceptions. When I see a product that I think has the potential to be innovative, highly effective, and highly valid, I want to share the wealth.

Such is the case with ClicFlic. In a nutshell, ClicFlic allows customers to create customized interactive web-based videos that can be used for things like situational judgment tests (SJTs). But we've seen that before, right? What I hadn't seen was the branching ability of ClicFlic.

Historically, video-based testing, whether Internet-enabled or not, presents all candidates with the same content. A situation is presented, and the candidate is provided with either several pre-determined responses or an open-ended response area. But much like traditional computerized adaptive testing (CAT), ClicFlic allows for the creation of branching videos. In other words, what the user sees in the next segment will vary depending on how they respond on the current one.
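To make the branching idea concrete, here's a rough sketch of how a branching SJT could be represented: each segment lists its response options, and each option carries a point value and the segment to show next. Everything here (segment names, prompts, points, the `run_assessment` function) is invented for illustration and is not ClicFlic's actual format or API.

```python
# Hypothetical branching SJT; all names and values are illustrative only.
SEGMENTS = {
    "greeting": {
        "prompt": "An upset customer approaches the counter.",
        "options": {
            "apologize": {"points": 2, "next": "customer_calms"},
            "cite_policy": {"points": 0, "next": "customer_escalates"},
        },
    },
    "customer_calms": {
        "prompt": "The customer explains the problem calmly.",
        "options": {
            "offer_refund": {"points": 2, "next": None},
            "offer_credit": {"points": 1, "next": None},
        },
    },
    "customer_escalates": {
        "prompt": "The customer raises their voice.",
        "options": {
            "call_manager": {"points": 1, "next": None},
            "argue": {"points": 0, "next": None},
        },
    },
}

def run_assessment(responses, start="greeting"):
    """Walk the branches for a list of choices; return (score, path seen)."""
    segment_id, score, path = start, 0, []
    for choice in responses:
        if segment_id is None:  # no further segments to show
            break
        option = SEGMENTS[segment_id]["options"][choice]
        path.append(segment_id)
        score += option["points"]
        segment_id = option["next"]  # the response picks the next video
    return score, path

print(run_assessment(["apologize", "offer_refund"]))
print(run_assessment(["cite_policy", "call_manager"]))
```

Two candidates can end up seeing entirely different second segments, which is exactly what sets this apart from fixed-content video tests.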

Although most of the examples you'll see on their website involve customer service or training applications, the technology is easily adaptable to assessment situations, as you can see from this example.

I had an opportunity to speak with Mike Russiello, President and CEO of ClicFlic (and co-founder of Brainbench) and he allowed me to peek "under the hood"--what I saw looked plug-and-play easy. The scripting branches are easy to generate, videos are simple to upload, and you can quickly assign points to different responses. The videos are Flash-based and you can easily generate the HTML to place them on a webpage.

Want to learn more? Check out the examples on their website--the demo on the front page will give you a good feel for the technology. Here are some others that will give you an idea of the possibilities. For assessment-specific usages, here you can select several different types of items with some characters you may recognize.

Questions? You can learn more about how the tools are built here. You may also run into Mike at SIOP if you have questions. Finally, they're also planning on an upcoming webcast through tmgov.org.

I hope this sparks some interest for you and maybe even some ideas about where this technology could be taken even further (RJPs anyone?).

Saturday, March 20, 2010

March 2010 J.A.P.

The March 2010 issue of the Journal of Applied Psychology is out. Let's take a look:

Do women make better leaders? According to a study by Rosette and Tost, it varies with how success is attributed, the level of the position, perceptions of double-standards, and expectations. So the answer? A very solid "it depends."

Who should determine SJT scoring? Motowidlo and Beier suggest in their research study that situational judgment test (SJT) scoring keys based on input from subject matter experts (SMEs) contribute differentially to the prediction of job performance compared to keys based on general knowledge about trait effectiveness. What does this mean? That your ability to predict performance using SJTs depends in part on who is determining the scoring, and getting SME input may boost the effectiveness.

Do Americans work to live or live to work? Based on an analysis from Highhouse, et al., it's looking more and more like the former.

Need more evidence of the value of confirmatory testing? Naquin, et al. performed three experimental studies that demonstrated higher levels of lying when using email compared to pen and paper.

Do you like your leaders proactive? According to research conducted in China by Ning et al., you're not alone.

Finally, a slight correction to an article by Ilies et al. published last July on the relationship between personality and organizational citizenship behaviors.

Wednesday, March 03, 2010

Personnel Psychology, Spring 2010: SJTs, affect, and job offer timing

The Spring 2010 (v.63, #1) issue of Personnel Psychology is out. Let's look at the highlights:

First out of the gate, a great meta-analysis for anyone interested in situational judgment tests (SJTs; and who isn't?). Christian, et al. looked at 84 studies and found some pretty interesting things:

1) SJTs reported in the literature have been used to measure a variety of things, including leadership skills (37%), some type of composite (33%), interpersonal skills (12.5%), personality tendencies (9.6%), teamwork skills (4.4%) and job knowledge/skills (3%).

2) Criterion-related validity depends--as you might expect--on the match between predictor and performance measure. Conscientiousness measures, for example, predicted task performance much better than managerial performance (rho=.39 and .06 respectively). The highest correlations (albeit based on relatively small samples) were for teamwork skills and personality composites predicting task performance (.50 and .45 respectively).

3) Video-based SJTs tended to have stronger criterion-related validity values compared to paper-based measures. This was particularly true when measuring interpersonal skills (.47 compared to .27).

Second, a small but interesting study by Johnson, et al. on the relationship between trait affect (i.e., being generally disposed to feeling positive or negative emotions) and job performance. Results from 120 matched employee-supervisor pairs from a variety of jobs using both explicit (survey) and implicit (word fragment completion) measures of affect found substantial correlations, particularly between positive affect and performance (in the .50 range), and particularly when using implicit measures.

Something to add to a selection battery, perhaps? Could be perceived negatively by applicants, however, and I can see some questions being raised about the link to medical issues. But the same types of concerns were originally leveled at personality tests and were mitigated by creating measures specifically tied to work behavior. Definitely an area for more research.

Third, check out this study by Becker, et al. on the impact that job offer timing has on acceptance, performance and turnover. The authors found (using data from a Fortune 500 engineering technology company) that for both student and experienced samples, faster offers were associated with higher acceptance rates. Specifically, for experienced candidates, the difference between 2 weeks and 3 weeks taken to make the offer was substantial, whereas for the students 3 weeks versus 4 weeks was important. But, no differences were found in terms of either performance ratings or turnover among employees hired through different offer speeds.

Implication? The study suggests that offer time does impact the likelihood that the offer will be accepted, but viewed broadly this may not have long-term impacts in terms of how employees do on the job. Maybe in cases of good candidate-employer fit, candidates are willing to wait.

Last but not least are the book reviews. Two books are particularly relevant for us, The Structured Interview (Pettersen & Durivage) and Outliers (Gladwell). The first is received very positively and sounds like a great source for anyone wanting more details about the support for and use of structured interviews. The latter is "well worth [a] few evenings" but requires you to overlook the lack of evidence and convenient inferences.

Final notes: those of you interested in multisource performance ratings should check out Hoffman, et al.'s article, which reinforces the impact of having raters from different levels. Chuang and Liao's article also includes a useful measure of a high-performance work system.

Thursday, February 04, 2010

Lessons from NYC Fire case - part 2

Part 2 of 2

Last time I discussed five important lessons we can take away from recent rulings in the Vulcan v. City of New York case. In this post I'll review the remaining lessons and also discuss the relief order.

----

6) The city failed to provide sufficient evidence that the exam(s) tested for a sufficient number of the critical KSAs. They also failed to explain why they chose not to measure several KSAs identified as critical.

Lesson: the courts do not require employers to measure every single critical KSA. But there is an expectation that employers attempt to measure a sufficient number that represent a significant portion of the job requirements. In this case, that included non-cognitive abilities such as resistance to stress, teamwork, and conscientiousness, that were not measured.

7) The city failed to adequately consider how to measure a significant number of essential KSAs. While some of their concerns were valid (e.g., structured interviews for all applicants would be an operational nightmare), there are many different forms of testing that should have been considered, including situational judgment tests (SJTs) and biodata, which can be used to measure non-cognitive components.

Lesson: triers of fact expect employers to be up on the various assessment methods available and be able to explain why they chose not to use certain ones. This includes tests that are relatively easy to develop (e.g., SJTs) as well as ones that require substantial resources and statistical expertise (e.g., biodata).

8) The city failed to conduct a reading level analysis on the exams to ensure that the reading level was not "pointlessly high." The plaintiff introduced evidence suggesting the reading level was above 12th grade; in addition, it appeared to exceed the reading level of materials at the academy.

Lesson: never forget that every assessment method is in some sense measuring additional KSAs beyond those you intend. For written exams, reading comprehension is always a requirement (barring accommodation). It's quite easy to conduct a reading level analysis (MS Word has it built in) to ensure that the level is reasonable and matches other job-related material.

9) The city failed to show that the cutoff scores (pass points) established for the exams were based on adequate rationale, namely "the necessary qualifications for the job of entry-level firefighter." Instead, the cutoff scores were based on operational need (the number of job openings expected). This is particularly important in multiple-hurdle selection processes such as in this case, where a failure on one exam component precludes an applicant from participating in the rest of the (potentially compensatory) assessment process.

Lesson: ultimately applicants have to pass the test(s) to be considered for employment. Cutoff scores should be established using the expertise of both SMEs and test developers and should be based on the minimum competency levels required upon entry to the job. At a minimum (and I would not rely solely upon this), the scores should be analyzed to identify any logical "break-points."
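Going back to lesson 8 for a moment: the reading level check is easy to approximate outside of Word, too. Below is a rough Python sketch using the Flesch-Kincaid grade level formula. The vowel-group syllable counter is my own crude assumption (real tools use better syllable rules), and the function names are made up for illustration, so treat the output as a ballpark figure rather than a match for any particular product.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count runs of vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text: str) -> float:
    """Approximate Flesch-Kincaid grade level for a block of text."""
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


# Made-up snippets standing in for exam text and academy material.
exam_text = (
    "Candidates demonstrating exceptional organizational capabilities "
    "receive preferential consideration."
)
academy_text = "The cat sat. The dog ran."

print(f"exam: {fk_grade(exam_text):.1f}, academy: {fk_grade(academy_text):.1f}")
```

The useful comparison is relative: run the exam and the job-related material through the same function and see whether the exam comes out noticeably higher.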

----

After ruling for the plaintiffs on both the adverse impact and disparate treatment claims, the judge issued a relief order on 1/10/10. In it, he imposes several things, including the following:

1) The city must develop a new testing procedure for entry-level firefighter in conjunction with the relevant parties. Following the development of the test, there will be a hearing to determine if this test should be used rather than the current test (developed in 2007 and not at issue in this litigation).

2) The court shall develop a process by which the approximately 7,400 applicants covered by this case can file a claim for monetary relief.

3) The city will identify 293 black candidates on the eligibility list and offer them priority hiring. (No quotas are being imposed, although the judge leaves this possibility open)

4) Retroactive seniority for those hired.

In addition, several other issues are up for debate, including the appointment of a special master or monitor, standards that will be relied upon in constructing the new exam, and the need for additional relief.

---

So what did we learn from all this? If you follow--fairly closely--best practices when developing and administering exams, you will be on solid ground defending them. If you don't, and your exam has a discriminatory effect, you may be called on it--and it's not a pleasant process. I'll leave you with this quote from the January ruling on disparate treatment:

"The history of the City's efforts to remedy its discriminatory firefighter hiring policies can be summarized as follows: 34 years of intransigence and deliberate indifference, bookended by identical judicial declarations that the City's hiring policies are illegal."

Tuesday, November 24, 2009

Want better prediction? Gather more data.

That's the bottom line from a study in the November 2009 issue of the Journal of Applied Psychology.

Oh & Berry looked at how much incremental validity personality ratings from peers and supervisors add to self-ratings on a five-factor model measure. What were the results? Increases of 50-74% in operational validity across personality facets. They also looked at differential prediction of task and contextual performance (unfortunately those results weren't reported in the abstract). Bottom line? If you're using a personality assessment for promotions, strongly consider gathering data from co-workers.

Speaking of self-presentation, in the same issue Barrick et al. report the results of a meta-analysis of how self-presentation tactics (e.g., appearance, non-verbal behavior) impact interview ratings and later job performance. Results? "What you see in the interview may not be what you get on the job and...the unstructured interview is particularly impacted by these self-presentation tactics." An important reminder that who the candidate seems to be affects your assessment, and another reason to collect multiple sources of data.

There are a number of other great articles in this issue, such as:

How Major League Baseball CEO personalities impact important outcomes (like, um, winning).

How SJT and biodata measures add to the prediction of college student performance.

How personality scale validities change over time among a group of medical students.

Differences among letters of recommendation in academia between genders.

Saturday, October 31, 2009

New Job Simulations Report

The U.S. Merit Systems Protection Board (MSPB) just released a great, easily digestible report on job simulations.

The report includes several things, such as:

- Job simulations defined and advantages/disadvantages

- Types of job simulations (SJT, work samples, etc.) and concrete examples

- Benchmark data on satisfaction with candidate quality as well as how federal agencies currently use simulations (just don't look at GPA compared to job knowledge tests in Figure 2)

- Survey data on why simulations aren't used more often in the federal government (time and expertise, sadly, were the top reasons)

- A 5-step strategy for using job simulations

- References to support the use of simulations (and good selection in general)

A great, free resource for anyone wanting to learn more about one of the best selection mechanisms you can use. And particularly relevant as more and more organizations move to using training and experience (T&E) questionnaires as their first (quick but not particularly valid) hurdle.

Sunday, July 26, 2009

July 2009 J.A.P.: SJTs and more

Situational judgment tests (SJTs) have a long tradition of successfully being used in employment tests. These types of (typically multiple-choice) items describe a job-related scenario then ask the test-taker to endorse the proper response. The question itself usually takes one of two forms:

1) What SHOULD be done in this situation? ("knowledge instruction")

2) What WOULD you do in this situation? ("behavioral tendency instruction")

What are the practical differences between the two? Previous meta-analytic research, specifically McDaniel et al.'s 2007 study, revealed that knowledge instruction items tend to be more highly correlated with cognitive ability, while behavioral tendency items show higher correlations with personality constructs. In terms of criterion-related validity, there appeared to be no significant difference between the two.

But there were limitations to that study, and two of them are addressed in a study found in the July 2009 issue of the Journal of Applied Psychology. Specifically, Lievens et al. addressed the inconsistency in stem content by keeping it the same while altering the response instruction, and also looked at a large population of applicants, rather than incumbents, which tended to dominate McDaniel et al.'s 2007 sample.

Results? Consistent with the 2007 study, knowledge instructions were again more highly correlated with cognitive ability, and there was no meaningful difference in criterion-related validity (the criterion being grades in interpersonally-oriented courses in medical school). Contrary to some research in low-stakes settings, there was no mean score difference between the two response instructions.

Practical implications? The authors suggest knowledge instruction items may be superior due to their resistance to faking. My only concern is that these items are likely to result in adverse impact in many applied settings. Like all assessment situations, the decision will involve a variety of factors, including the KSAs required on the job, the size and nature of the applicant pool, the legal environment, etc. But at least this type of research supports the fact that both response instructions seem to WORK. By the way, you can see an in-press version of this article here.

Other content in this journal? There's quite a bit, but here's a sample:

Content validity <> criterion-related validity

More evidence that selection procedures can impact unit as well as organizational performance

Self-ratings appear to be culturally bound

Wednesday, June 17, 2009

Summer '09 Personnel Psychology

The Summer 2009 issue of Personnel Psychology covers a lot of ground. Take a look:

Kuncel & Tellegen demonstrate (with undergrads) that when inflating on personality inventories, people don't always max out their self-presentation; in fact for some traits a moderate level of endorsement is seen as more desirable.

Bledow & Frese describe how a situational judgment test can be used to predict not only overall job performance, but a particular construct--in this case, initiative. Participants were employees and supervisors at six banks in Germany.

This one particularly caught my eye. Yang & Diefendorff discovered (using ~200 employees in Hong Kong), among other things, that agreeableness and conscientiousness seem to moderate the relationship between negative emotions and counterproductive work behaviors (CWBs). Implication? If you're hiring for a job prone to negative emotions (e.g., customer service), consider adding a personality inventory to your screening process to prevent CWBs.

De Pater, et al. studied both students and employees to determine that challenging job experiences reported by participants predicted promotability ratings above and beyond current job performance and job tenure. This has implications for both career development and performance management.

Want to know more about what executive coaches do? Then check out Bono et al.'s study of similarities and differences between practicing coaches that are also I/O psychologists versus those that aren't. (Turns out they do a lot of the same things)

Last but definitely not least, Aguinis et al. describe a web-based frame of reference training they used to decrease the amount of bias inherent in personality-based job analysis. The article describes in detail how the training was implemented, and it had quite dramatic effects. Useful stuff for anyone looking to add this tool to their assessment procedure (in this case they used Raymark et al.'s personality-related personnel requirements form, which they describe as superior to Hogan & Rybicki's performance improvement characteristics tool, which I've actually used and found quite user friendly).

Wednesday, February 18, 2009

Get in the game

Longtime readers know that I've considered one of the holy grails in our field to be a way of combining the interactivity and engaging content of video game technology (VGT) with recruitment and assessment. Yes, part of this is because I enjoy the occasional Nuka-Cola and killing the occasional troll, but there is so much potential in the marriage of these two fields that we can't ignore it.

Up until now, the best efforts have gone one of two ways. The first is creating an entire first person video game for recruitment purposes--this is what the U.S. Army did. The second is using VGT but in a very basic and limited way--this is what the FAA is doing. But to my knowledge no one has created a web-based tool that showcases the basic functionality of VGT while also serving as an assessment tool. In fact many people may not even know what this might look like.

Well I ran across something the other day (hat tip) that gets us pretty darn close. It's actually an onboarding program designed by Vestas, a Danish energy company. It takes the form of a situational judgment test (SJT) that leads new employees through an orientation of what Vestas does and their approach to their business.

I think once you've watched, you'll agree with me that the potential is vast.

So why this type of technology over, say, existing SJT solutions such as those offered by companies like Ergometrics and Biddle? Those definitely still have a place, and live actors are obviously higher fidelity, but here are some advantages to think about:

1. You can do more, and show more, with VGT. Need to show someone hanging onto the bottom of a helicopter then jumping to a rooftop? Not a problem, no wires required. Need to show someone underwater? Scaling a mountainous peak? Again, much easier (and cheaper).

2. No screen actors required. No more worrying about makeup or getting the right shot--you create what you want. Of course voice talent is still very important if you decide to use sound.

3. It's just plain more modern. For folks that grew up watching cartoons and playing video games, they will naturally gravitate more toward something that feels familiar. Text job descriptions that link to an ATS? Yawn.

4. It will make you stand out. Yes, I know the unemployment rate is high here in the U.S., but don't think that means the end of competing for the most qualified. Now's the time to plan how you're going to compete when the pendulum swings the other way again.

5. It will stand the test of time. People still watch old cartoons. Very few old shows are on. That video shot of the desktop computer in the background may look outdated sooner than you'd like.

But perhaps most importantly:

6. VGT holds the promise of a truly interactive experience, where candidates explore their future work environment, make decisions, and learn about the organization. This has the potential to be both a realistic job preview that helps candidates decide whether to apply, as well as a measurement tool that gauges how well the candidate meets job requirements. (Yes this sounds a bit like Second Life but need not be so complex)

So what do we need to do moving forward? Here are some things we need to make this work:

1. More education. What do projects like this need to succeed? How much do they cost? What are the challenges and potential roadblocks?

2. Outreach to the VGT industry including the big companies (Activision Blizzard, EA, etc.) as well as the smaller shops, industry groups, schools, etc. No doubt they have much to teach us--but we have a lot to share as well. (As an aside, Activision has a very attractive Careers page that showcases some of their work, but they dump applicants right into their ATS like most companies--failed opportunity to continue the brand experience with a game-like character sheet!)

3. What are the psychometric implications? Is this just another version of unproctored Internet testing, or is there more here? How does this relate to run-of-the-mill adaptive testing? Are there demographic differences in willingness or performance?

Now what may throw a big monkey wrench into this is cost. Video games are not cheap (WOW cost $63M to develop). But we're not talking multi-user, latest video card, and all that stuff. This could be much shorter, more cartoonish, and much simpler.

I think this is the most exciting thing happening in assessment; I hope there are enough developers out there that agree.

Tuesday, November 25, 2008

Giving thanks for research

It's almost Thanksgiving here in the U.S., a time to give thanks, and I'd like to thank a largely unsung group of people. Thank you to all the researchers out there who try to help us put some science around the art we call personnel recruitment and selection. Thank you for all your work and insights.

What better way to celebrate this wish of thanks than by talking about a new issue of the International Journal of Selection and Assessment (v16, #4)! As usual it's chock-full of good articles, so let's take a look at some of them.

First, a study of applicant perceptions of credit checks, something many of us do for sensitive positions. Using samples of undergraduates, Kuhn and Nielsen found mostly negative reactions, especially for older participants, but they varied with the explanation given as well as privacy expectations. Worth a look for any of you that conduct large numbers of background checks (and if you do, don't miss the Oppler et al. study below).

Next up, a fascinating study of police officer selection in the Netherlands. Using data from over 3,000 applicants, De Meijer et al. found evidence for differential validity between ethnic majority and minority participants. Specifically, cognitive ability tests predicted training performance for minorities but not for those in the majority. Performance prediction for the latter group was low for cognitive ability tests and somewhat better using non-cognitive ability variables. By the way, the dissertation of the primary author, a fascinating look at similar issues, can be found here.

The third article is one of those articles that almost (...almost) makes me want to pay for it, and anybody interested in electronic applicant issues take note. In this study, Dunleavy et al. used simulations to show the tremendous impact that small numbers of applicants can have on adverse impact (AI) analysis. In fact, the authors reveal situations where AI can be caused or masked by a single applicant applying multiple times! The authors present ways of identifying and handling these cases. Scary stuff. Hope the OFCCP is reading.
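To see how a single repeat applicant can move the numbers, here is a small Python sketch. It is not the authors' simulation method (the abstract doesn't describe it); it just computes the familiar selection-rate ratio behind the four-fifths rule of thumb on made-up data. The record layout, the group labels, and the "keep the first application" rule for handling duplicates are all my assumptions.

```python
from typing import List, Tuple

# Each record is (applicant_id, group, passed). All data below are invented.
Record = Tuple[str, str, bool]


def selection_rate(records: List[Record], group: str) -> float:
    """Share of a group's application records that passed."""
    members = [r for r in records if r[1] == group]
    if not members:
        raise ValueError(f"no applicants in group {group!r}")
    return sum(1 for r in members if r[2]) / len(members)


def impact_ratio(records: List[Record], focal: str, reference: str) -> float:
    """Focal group's selection rate divided by the reference group's."""
    reference_rate = selection_rate(records, reference)
    if reference_rate == 0:
        raise ValueError("reference group selection rate is zero")
    return selection_rate(records, focal) / reference_rate


def dedupe_keep_first(records: List[Record]) -> List[Record]:
    """Keep only each applicant's first record (one possible handling rule)."""
    seen = set()
    unique = []
    for record in records:
        if record[0] not in seen:
            seen.add(record[0])
            unique.append(record)
    return unique


# Reference group: 10 applicants, 5 pass. Focal group: 5 applicants, 2 pass.
reference = [(f"R{i}", "reference", i < 5) for i in range(10)]
focal = [
    ("F0", "focal", True),
    ("F1", "focal", True),
    ("F2", "focal", False),
    ("F3", "focal", False),
    ("F4", "focal", False),
]
# Applicant F4 applies three more times and fails each time.
repeats = [("F4", "focal", False)] * 3
all_records = reference + focal + repeats

raw_ratio = impact_ratio(all_records, "focal", "reference")
clean_ratio = impact_ratio(dedupe_keep_first(all_records), "focal", "reference")
print(f"with repeats: {raw_ratio:.2f}, deduplicated: {clean_ratio:.2f}")
```

With these invented numbers the ratio is 0.80 when each person counts once and 0.50 when every application counts, so one persistent applicant is enough to push the result well under the four-fifths threshold.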

Fourth, Lievens and Peeters present results of a study of elaboration and its impact on faking situational judgment tests. Using master's students, the researchers found that requiring elaboration on items (i.e., the reason they chose the response) had several positive results. It reduced faking on items with high familiarity. It also reduced the percentage of "fakers" in the top of the distribution. Lastly, candidates reported that the elaboration allowed them to better demonstrate their KSAs. This could be a great strategy for those of you worried about the inflation effects of administering SJTs online.

Next, Furnham et al. with a study of assessment center ratings. The authors found that expert ratings of "personal assertiveness", "toughness and determination", and "curiosity" were significantly correlated with participant personality scores, particularly Extraversion. Correlations with intelligence test scores were low.

Last but definitely not least, Oppler et al. discuss results of a rare empirical study of financial history and its relationship to counterproductive work behaviors (CWBs). Using a "random sample of 2519 employees" the authors found that those with financial history "concerns" were significantly more likely to demonstrate CWBs after hire. Great support for conducting these types of checks.

There are other articles in here, so I encourage you to check them all out. Thank goodness for research!

Tuesday, July 01, 2008

A review of situational judgment tests

In the latest issue of Personnel Review, Dr. Filip Lievens and colleagues provide an empirical review of situational judgment tests (SJTs), focusing on studies from 1990-2007.

SJTs, sometimes referred to as low fidelity simulations, present test takers with a scenario and ask them to select the appropriate response. Candidates may be asked to select what they "should" do, what they "would" do, the best response, the worst response, or some combination of the above. Here's an example:

You have been assigned lead responsibility for two weeks in the absence of your supervisor. On your first day in this role, one of your new direct reports comes into your office and complains that they were sexually harassed by the security guard when they entered the building. They ask that the situation be kept confidential. What would be your first action in response to this situation?

1. Contact the security guard and conduct an interview to obtain all the facts.
2. Assure the direct report you will look into the situation but cannot guarantee confidentiality.
3. Contact your supervisor to obtain instruction on next steps.
4. Conduct informal interviews with your other direct reports to determine if they have been harassed.

SJTs have some great benefits, and this article points them out. First, they can be valid predictors of performance--particularly when based on job analysis. Second, they show incremental validity beyond cognitive ability and personality tests, making them a valuable addition. Third, group differences tend to be reduced compared to ability tests, particularly when the cognitive load is low. Fourth, applicant perceptions of SJTs tend to be positive. And fifth, SJTs allow you to test large candidate groups simultaneously. I would add that they allow for all kinds of scoring possibilities as well (e.g., +1 for correct response, -1 for incorrect).
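As a quick illustration of that +1/-1 idea, here is a minimal Python sketch that scores the example item above. The scoring key is hypothetical: treating option 2 as the keyed response is an assumption for illustration only, and in practice the key would come from SMEs. The function and variable names are also made up.

```python
from typing import Dict, Optional

# Hypothetical key for one four-option item: +1 for the keyed response,
# -1 for every other option. Which option is keyed is an assumption.
ITEM_KEY: Dict[int, int] = {1: -1, 2: 1, 3: -1, 4: -1}


def score_item(response: Optional[int], key: Dict[int, int]) -> int:
    """Points for one response; skipped or invalid responses score 0."""
    if response is None:
        return 0
    return key.get(response, 0)


def score_test(
    responses: Dict[str, int], keys: Dict[str, Dict[int, int]]
) -> int:
    """Sum item scores over every item on the test."""
    return sum(
        score_item(responses.get(item_id), key)
        for item_id, key in keys.items()
    )


# A three-item test that reuses the same key, purely for demonstration.
keys = {"q1": ITEM_KEY, "q2": ITEM_KEY, "q3": ITEM_KEY}
candidate = {"q1": 2, "q2": 3}  # q3 was skipped

print(score_test(candidate, keys))
```

Other schemes (partial credit per option, or separate "best" and "worst" picks) would only change the values stored in each item's key.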

SJTs aren't without drawbacks--two major ones to be exact. The first is they can be susceptible to faking, practice, and coaching effects--although how they're built plays a large role in how big these effects are. The second is that we don't always know exactly what SJTs are measuring--is it job knowledge? Personality? Cognitive ability? The authors point out that more research is needed.

Overall, a very good review of a test method that every assessment professional should have in their tool belt. You can read an in press version here.

Tuesday, March 25, 2008

Too fat or too thin? You may not get hired.

Job candidates that are either too fat or too thin may have a more difficult time getting hired than those in the middle weight ranges according to a study by Swami, et al. reported in the most recent issue of the Journal of Applied Social Psychology.

Weighting in line
The authors found that when men were asked to rate a variety of female pictures for either a management position or for providing help (N=30 and 28, respectively), they were less likely to hire or help women with body mass indices (BMI) over 30 or under 15. Those with a slender body (BMI = 19-20) were most likely to be hired or helped. This shouldn't be surprising, given that studies have consistently linked physical attributes, including weight, with employment decisions, but it's certainly a reminder to watch your biases when evaluating candidates!

Predict-ability
In another article, Truxillo et al. found a relationship between cognitive ability and the ability to accurately judge one's performance on an employment test. Using a video-based situational judgment test of customer service skills, the authors found that those with high cognitive ability were able to predict their performance while those with low cognitive ability were not. Practical implications? Providing thorough test feedback may be particularly important for candidates lower in cognitive ability as they may be more likely to be surprised (and dismayed) by the results. This means providing information prior to the test as well as afterward (e.g., how it was developed, how it is scored, how you can improve your performance).

Working IT
In a third study, Johnson, et al. found gender and ethnic group differences in how IT careers are perceived as well as in self-efficacy related to IT. Using data from 159 African- and 98 Anglo-Americans, the authors found that African American men reported higher levels of IT self-efficacy than all other groups, whereas Anglo women reported the lowest levels. In addition, Anglos had more negative stereotypes of IT professionals than did African Americans. This study had a small sample size, but the implication is that how people see their own ability related to an occupation, as well as how they perceive those in it, influences their career choices. This will in turn impact your applicant demographics as well as your recruiting success.

The rest
There are some other interesting reads in here, including:

When emotional displays of leaders may increase follower performance

How to give performance feedback

Self-perceptions of ethical behavior

Friday, February 15, 2008

2008 PTC-NC Conference

The Personnel Testing Council of Northern California (PTC-NC) is hosting its annual conference on March 20-21 in Concord.

They've lined up quite an agenda with some great presentations. Here's a sample:

- Disparate Impact and Employment Testing by Michael Harris

- Situational Judgment Testing by Jim Outtz

- Personality Assessments by Bob Hogan

Oh yeah, and I'll be doing a session on demographic application patterns and adverse impact of an on-line T&E system.

For more information contact Jerimiah Honer at jhoner@spb.ca.gov