Saturday, September 01, 2012
Is "big data" relevant for recruitment and selection?
For the uninitiated, the big data movement is all about storing, mining, and extracting information from large datasets. In the modern HR world, this data is typically found in human resources management systems--systems that contain information like time to fill, assessment results, compensation, and performance management measures.
Why has this become a hot topic? It's hard to say. Could be the buzz generated by Moneyball, the book and movie about Oakland A's GM Billy Beane and his use of data to uncover measures (e.g., on-base percentage) that bucked conventional predictors such as batting average. Or it could be Google's study of what predicts supervisory success (shocker: it's not technical skill). I suspect it's also been fueled, in no small part, by software and consulting companies hoping to capitalize on the interest.
When I was chatting with the reporter, I found it difficult to answer some of the questions about the application of this movement to recruitment and assessment. It was only later that I realized why: for us, this is an old idea.
Asking if the analysis of large data sets is applicable to selection is like asking if sunshine is applicable to farming: it's a foundation upon which the practice exists. Not only is the research that surrounds our field grounded in data analysis, but its most impactful work has in large part been based on the analysis of large data sets -- things like the findings on the predictive power of cognitive ability and conscientiousness, and the U.S. Army's Project A. The entire profession of personnel psychology is founded upon the idea that through analyzing data we can help answer big questions like what predicts job success, leadership, and organizational attraction.
So I'm guessing I'm not the only one who is observing this trend and thinking: what took you so long?
Now, does all this mean the big data movement is pointless? Absolutely not. To the extent that organizations are renewing their interest in using data analysis to guide decisions, booyah. But here are my big four concerns:
1) The results are only as good as the data. Let's say you sic your analytical software on your HR data and find out that candidates recruited from the Northeast don't perform as well as those recruited from the South. Easy enough, looks like we need to shift our recruiting resources. But not so fast. What is your performance measure? What if I told you that the majority of your supervisors come from the South--might that impact the results? What if one of the success criteria is knowledge of customers in the South--but you're interested in expanding into other territories? Without first looking deeply at what our measures are, we risk coming to some very misleading conclusions. (Some of you will recognize this as "the criterion problem.")
2) There's analysis, and then there's analysis. Anybody can run a correlation. But do you know about power? Statistical significance versus practical significance? Multivariate analysis? Collinearity? If that sounds like gibberish, please seek professional help.
3) With apologies to Kurt Lewin, nothing is as practical as a good theory. What if you find out that answers to "what's your favorite color" predict success as a senior manager? What does this mean? We can make all kinds of guesses, but without having a theoretical framework in place, we're letting results drive "the truth" rather than logically positing a relationship and seeing if the data support it. Basically what I'm saying is: beware fishing expeditions. All you have is a correlation; don't make the mistake of inferring causation.
4) What do you do with the results? So employees who drive to work outperform those who take public transit. Does that mean you force all your employees to drive? What if your analysis uncovers something uncomfortable about your current executive leaders--what the heck do you do with that? (a: bury it, b: post and pray, c: use consultantspeak to obfuscate results, d: re-run analysis).
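To make concerns 2 and 3 a little more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the data are simulated random noise rather than real HR records, the function names (pearson_r, approx_p_value, count_chance_hits) are made up, and the p-value uses the Fisher z approximation rather than an exact test. It shows two things: a tiny correlation can be "statistically significant" with a big enough sample, and if you test enough meaningless predictors, some will look significant by chance alone.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value for H0: no correlation.

    Uses the Fisher z approximation (an assumption of this sketch,
    reasonable for moderately large n), not an exact t-test.
    """
    if abs(r) >= 1.0:
        return 0.0  # atanh is undefined at +/-1; treat as "certainly nonzero"
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def count_chance_hits(n_employees=200, n_predictors=100, alpha=0.05, seed=0):
    """Fishing expedition: correlate pure-noise predictors with pure-noise
    'performance' and count how many clear the significance bar anyway."""
    rng = random.Random(seed)
    performance = [rng.gauss(0, 1) for _ in range(n_employees)]
    hits = 0
    for _ in range(n_predictors):
        predictor = [rng.gauss(0, 1) for _ in range(n_employees)]
        r = pearson_r(predictor, performance)
        if approx_p_value(r, n_employees) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    # Statistical vs. practical significance: r = 0.03 with 10,000 people.
    r, n = 0.03, 10_000
    print("p-value:", approx_p_value(r, n))    # well under 0.05
    print("variance explained:", r ** 2)       # under a tenth of a percent

    # Fishing: none of these predictors mean anything, by construction.
    print("'significant' noise predictors:", count_chance_hits())
```

With 100 junk predictors and a 0.05 cutoff, you would expect around five "discoveries" on average, none of them real. And a correlation of 0.03 in a sample of 10,000 clears the significance bar while explaining less than a tenth of one percent of the variance in performance. That is your "favorite color" finding in a nutshell.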
Without first thinking about these and other important questions, the "big data" movement, when applied to important questions like what predicts organizational behavior, has the potential to create all kinds of erroneous, wasteful, and potentially risky conclusions.
On the other hand, if this ends up creating an additional sense of energy around evidence-based management, I may end up looking back at this as an extremely positive development in helping organizations succeed.