The Problem with Hiring Algorithms

If something like hunger can affect a hiring manager’s decision—let alone classism, sexism, lookism, and other “isms”—then why not rely on the less capricious, more objective decisions of machine-learning algorithms?

In 2004, when a “webcam” was relatively unheard-of tech, Mark Newman knew that it would be the future of hiring. One of the first things the 20-year-old did, after getting his degree in international business, was to co-found HireVue, a company offering a digital interviewing platform. Business trickled in. While Newman lived at his parents’ house, in Salt Lake City, the company, in its first five years, made just $100,000 in revenue. HireVue later received some outside capital, expanded and, in 2012, boasted some 200 clients (including Nike, Starbucks, and Walmart), which would pay HireVue, depending on project volume, between $5,000 and $1 million. Recently, HireVue, which was bought earlier this year by the Carlyle Group, has become the source of some alarm, or at least trepidation, for its foray into the application of artificial intelligence in the hiring process. No longer does the company merely offer clients an “asynchronous” interviewing service, a way for hiring managers to screen thousands of applicants quickly by reviewing their video interviews. HireVue can now give companies the option of letting machine-learning algorithms choose the “best” candidates for them, based on, among other things, applicants’ tone, facial expressions, and sentence construction.

If that gives you the creeps, you’re not alone. A 2017 Pew Research Center report found few Americans to be enthused, and many worried, by the prospect of companies using hiring algorithms. More recently, around a dozen interviewees assessed by HireVue’s AI told the Washington Post that it felt “alienating and dehumanizing to have to wow a computer before being deemed worthy of a company’s time.” They also wondered how their recording might be used without their knowledge. Several applicants mentioned passing on the opportunity because thinking about the AI interview, as one of them told the paper, “made my skin crawl.” Had these applicants sat for a standard 30-minute interview, composed of a half-dozen questions, the AI could have analyzed up to 500,000 data points. Nathan Mondragon, HireVue’s chief industrial-organizational psychologist, told the Washington Post that those points “become ingredients in the person’s calculated score,” between 1 and 100, on which hiring decisions can depend. New scores are ranked against a store of traits (mostly having to do with language use and verbal skills) from previous candidates for a similar position who went on to thrive on the job.

HireVue wants you to believe that this is a good thing. After all, their pitch goes, humans are biased. If something like hunger can affect a hiring manager’s decision—let alone classism, sexism, lookism, and other “isms”—then why not rely on the less capricious, more objective decisions of machine-learning algorithms? No doubt some job seekers agree with the sentiment Loren Larsen, HireVue’s Chief Technology Officer, shared recently with the Telegraph: “I would much prefer having my first screening with an algorithm that treats me fairly rather than one that depends on how tired the recruiter is that day.” Of course, the appeal of AI hiring isn’t just about doing right by the applicants. As a 2019 white paper, from the Society for Industrial and Organizational Psychology, notes, “AI applied to assessing and selecting talent offers some exciting promises for making hiring decisions less costly and more accurate for organizations while also being less burdensome and (potentially) fairer for job seekers.” 

Do HireVue’s algorithms treat potential employees fairly? Some researchers in machine learning and human-computer interaction doubt it. Luke Stark, a postdoc at Microsoft Research Montreal who studies how AI, ethics, and emotion interact, told the Washington Post that HireVue’s claims (that its automated software can glean a worker’s personality and predict their performance from such things as tone) should make us skeptical:

Systems like HireVue, he said, have become quite skilled at spitting out data points that seem convincing, even when they’re not backed by science. And he finds this “charisma of numbers” really troubling because of the overconfidence employers might lend them while seeking to decide the path of applicants’ careers.

The best AI systems today, he said, are notoriously prone to misunderstanding meaning and intent. But he worried that even their perceived success at divining a person’s true worth could help perpetuate a “homogenous” corporate monoculture of automatons, each new hire modeled after the last.

Eric Siegel, an expert in machine learning and author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, echoed Stark’s remarks. In an email, Siegel told me, “Companies that buy into HireVue are inevitably, to a great degree, falling for that feeling of wonderment and speculation that a kid has when playing with a Magic Eight Ball.” That, in itself, doesn’t mean HireVue’s algorithms are completely unhelpful. “Driving decisions with data has the potential to overcome human bias in some situations, but also, if not managed correctly, could easily instill, perpetuate, magnify, and automate human biases,” he said. 

The extent to which HireVue’s algorithms succumb to the latter, as opposed to living up to the former, can only be determined by a detailed audit, which would mean making the system much more transparent than it is. It’d be interesting, for instance, to see HireVue convert to text some of their interviews with candidates, to see how accurate the natural-language processing is—Siri and other voice recognition systems are far from perfect. “Besides,” Siegel went on, “I see the bias question as actually secondary to the snake oil question: The system’s predictive performance and scientific merit is likely much closer to a Magic Eight Ball than HireVue’s customers realize. This mostly is a result of the simple fact that personality attributes cannot be ‘accurately’ predicted by machine learning or any other type of technology.”

HireVue, of course, would have us believe otherwise. “Consistent with generally accepted legal, professional, and validation standards established within the field of psychology, our data scientists and [industrial-organizational] psychologists continuously evaluate the degree to which evidence and theory support the interpretations and employment decisions made based on assessment results, while ensuring protected groups are not adversely impacted,” the company’s Web site states. “The result is a highly valid, bias-mitigated assessment that helps to enhance human decision making while actively promoting diversity and equal opportunity regardless of gender, ethnicity, age, or disability status.”

I asked Siegel, who recently wrote an article in Scientific American titled “The Media’s Coverage of AI is Bogus,” to read HireVue’s Web site. He noted how the company claims to “carefully design our products to provide clear understanding about what is being predicted, the confidence in the prediction and appropriate explanation of the data.” This is dubious, Siegel said. “I’d be interested to see exactly what it is they claim to predict and at what confidence! It seems to me the confidence level they report would either be too low for them to sell their product, given the terms in which they describe the product, or not credible.”

Interested employers can test HireVue or other AI hiring systems to see if they really work as advertised. One way is to have the AI system evaluate the last class of new hires and then measure how well their later performance correlates with how the AI system initially scored that group. Or, if employers were willing to hire a cohort of candidates who didn’t score in the top tier (or who declined to take the recorded interview), the future performance of those hires could be measured against that of hires rated more highly by the AI system. (Would HireVue refund fees if its picks couldn’t materially outperform these baselines?)
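
For readers who want to see what such a check might look like, here is a minimal sketch in Python. It is not HireVue’s method or anyone’s published audit procedure. The function names (`pearson`, `mean_gap`), the 1-to-5 performance ratings, and every number in it are hypothetical, invented only to illustrate the two tests described above: correlating initial AI scores with later performance, and comparing top-tier hires against a lower-scoring cohort.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def mean_gap(top_tier, lower_tier):
    """Average performance of top-scored hires minus that of lower-scored hires."""
    return mean(top_tier) - mean(lower_tier)


# Hypothetical data, made up for illustration only.
ai_scores = [82, 45, 67, 91, 58, 73]          # AI interview scores (1-100)
performance = [3.9, 3.1, 3.0, 4.2, 3.6, 3.4]  # later manager ratings (1-5)

r = pearson(ai_scores, performance)
gap = mean_gap([3.9, 4.2, 3.4], [3.1, 3.0, 3.6])

print(f"score/performance correlation: {r:.2f}")
print(f"top-tier minus lower-tier rating: {gap:.2f}")
```

A correlation near zero, or a negligible gap between the two cohorts, would suggest the scores add little. A real evaluation would need far more hires than six, and a performance measure the employer actually trusts.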

Researchers have been aspiring for a long time to make the process of hiring a little more scientific. The first issue of the Journal of Applied Psychology, published in 1917, identified recruitment and selection—or hiring—as the “supreme problem facing the field,” one of “diagnosing each individual, and steering him toward his fittest place, which is really the culminating problem of efficiency, because human capacities are after all the chief national resources.” 

Should employers rely on AI to help them “diagnose” job candidates and steer them toward the right positions? “Outside the particulars of HireVue, algorithmic scoring (assigning of probabilities) in and of itself is not unethical,” Siegel told me. “But it is greatly challenging to implement it in a way that avoids ethical pitfalls.” Not understanding algorithmic scoring is one such pitfall. “The users would need to be educated that the system is only delivering probabilities,” he said. “It might tell you, ‘There is a three times better chance than average, namely a 60 percent chance, that this individual will perform above benchmark XYZ if you hire them.’ Still a lot of uncertainty, but perhaps a data point worth taking into consideration when assessing a candidate—the same exact issue as using such probabilities to determine sentencing and parole in predictive policing.”
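
The arithmetic behind Siegel’s example can be spelled out in a few lines of Python. This is only a sketch under one assumption: that “three times better chance than average” means the predicted probability divided by the average (base) rate. The function name `implied_base_rate` is hypothetical, and the only inputs are the figures Siegel gives.

```python
def implied_base_rate(predicted_probability, lift):
    """Base rate implied by a prediction stated as 'lift times better than average'.

    Assumes lift = predicted probability / average probability.
    """
    if lift <= 0 or not 0 <= predicted_probability <= 1:
        raise ValueError("expected a probability in [0, 1] and a positive lift")
    return predicted_probability / lift


predicted = 0.60  # Siegel's "60 percent chance"
lift = 3.0        # "three times better chance than average"

base_rate = implied_base_rate(predicted, lift)
miss_rate = 1 - predicted

print(f"implied average chance: {base_rate:.0%}")
print(f"chance this hire still falls short: {miss_rate:.0%}")
```

On that reading, even a candidate flagged as three times better than average still carries a substantial chance of missing the benchmark, which is the uncertainty Siegel is pointing to.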

Dane Holmes, Head of Human Capital Management at Goldman Sachs, for his part, doesn’t, as of yet, think an algorithmic score is a data point worth considering. Goldman Sachs uses HireVue’s digital interviewing platform without the AI assessments. But the firm isn’t entirely against involving machines in the hiring process. “We’re experimenting with résumé-reading algorithms that will help candidates identify the business departments best suited to their skills and interests,” Holmes wrote in Harvard Business Review. He suspects that, at some point, certain companies may begin to rely exclusively on algorithms to assess résumés and interviews. “But I don’t see us ever eliminating the human element at Goldman Sachs; it’s too deeply embedded in our culture, in the work we do, and in what we believe drives success.”

That may be the winning, and perhaps the most ethical, strategy for now—if Amazon’s recent woes relying on hiring algorithms are anything to go by. Last year, Reuters ran the headline, “Amazon scraps secret AI recruiting tool that showed bias against women.” Nevertheless, college students are learning to expect HireVue to be a part of working life. The “Student Resource” page on Duke University’s Economics Department Web site shows the typical questions included in HireVue’s video interviews, and lists some recording tips. One says, “If there is a feature where you can get rid of the picture of yourself on the screen, it makes it easier to look directly into the camera.”

Brian Gallagher is Ethical Systems’ Communications Director. Follow him on Twitter @BSGallagher.