
Don’t Embed Your Bias in Your Algorithm

Aug 21, 2018

AI, algorithms, and machine learning are the buzzwords in recruitment at the moment. Although many don’t really know what these terms entail, everybody wants them. But because we lack an understanding of the true nature of what we are building and how we are building it, this is a dangerous time. We are laying down an infrastructure that will probably last for decades, without proper thought about what we are building.

The analogy in this article with the building of physical infrastructure is apt. Build bridges too low for a bus to pass under and you block public transport from reaching the city centre, and with it you block access to that area for the underprivileged who live in the suburbs, long after discriminatory laws have been repealed.

The Case of Probation

Why is this such a big risk for algorithms too? A non-HR example explains it best. The algorithm that helps judges decide who gets probation is racially biased, even though skin color is not in its database. This happens for two reasons.

One is, for example, the question of whether the offender will have trouble finding a job. Academic research in the Netherlands last year showed that an applicant with a typically white Dutch name who admitted to having served time for a violent crime was still invited to interviews three times as often as an applicant with a Muslim name and a spotless record. Although not having a job does predict recidivism, the question is racially loaded in practice, because recruiters, however unconsciously, are biased.

The second question is whether the offender has convicted criminals among family or close friends. Black people are stopped and checked by the police roughly ten times more often than white people, and they also have a higher conviction rate. Police practice therefore builds bias into this question.

This leads to results like these: a black woman with no priors who served time for a theft got a risk score of 8 on a 10-point scale, while a white man who served time for murder and had been to jail three times before got a risk score of 2 on that same scale. The big-data analysis shows that the predictions are, on average, fairly accurate: defendants labeled high risk were indeed often re-arrested. But when the algorithm was wrong (either predicting high risk with no re-arrest, or low risk followed by a re-arrest) it routinely underestimated the probability of white recidivism and overestimated the probability of black recidivism.
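
To make that distinction concrete, here is a minimal Python sketch with made-up numbers, not figures from the actual study: two groups can have exactly the same overall accuracy while one group absorbs all the false “high risk” labels and the other all the missed re-arrests.

```python
import numpy as np

# Made-up illustrative numbers -- NOT figures from the actual study.
# actual: 1 = re-arrested, 0 = not re-arrested
# predicted: 1 = labelled high risk, 0 = labelled low risk
actual    = np.array([0, 0, 1, 1, 1, 1,  0, 0, 0, 0, 1, 1])
predicted = np.array([1, 1, 1, 1, 1, 1,  0, 0, 0, 0, 0, 0])
group     = np.array(["A"] * 6 + ["B"] * 6)

for g in ("A", "B"):
    a, p = actual[group == g], predicted[group == g]
    accuracy = (a == p).mean()
    false_positive_rate = ((p == 1) & (a == 0)).sum() / (a == 0).sum()
    false_negative_rate = ((p == 0) & (a == 1)).sum() / (a == 1).sum()
    print(f"group {g}: accuracy {accuracy:.2f}, "
          f"wrongly labelled high risk {false_positive_rate:.2f}, "
          f"wrongly labelled low risk {false_negative_rate:.2f}")

# Both groups come out with the same accuracy (0.67), yet every false
# "high risk" label lands on group A and every missed re-arrest on group B.
```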

The Recruitment Implications

I’ve been talking about this subject in recruitment for some time now, and most people agree that recruiting is biased. Just compare your workforce with the general population, or with the student population in your field. That is human. But it should be considered a human weakness, and we should not build that weakness into our algorithms. Our algorithms should warn us about our bias, not reinforce it.

Currently in the Netherlands there is a big debate about judges from minority groups: only 2 percent of our 3,600 judges have a migration background. People close to the selection process, such as a recently retired head of the judges’ association, say that enough people of color with all the right education and experience apply; they just don’t get selected.

So building an algorithm that mimics human behavior is probably not the best idea. Even if you don’t let the algorithm see ethnic background, other data points will point to a migrant background, and the algorithm will conclude: based on past human behavior, these data points, which are more common among migrants, are not what we are looking for.
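
Here is a small sketch of how that proxy effect works, using invented features and an invented “hired” label that stands in for past (biased) decisions: the model never sees ethnicity, yet a correlated feature such as postcode ends up carrying the same signal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Invented setup: the protected attribute is never shown to the model,
# but "postcode" correlates strongly with it (think segregated suburbs).
migrant_background = rng.integers(0, 2, n)            # hidden from the model
postcode = np.where(rng.random(n) < 0.9,              # 90% of the time the postcode
                    migrant_background,               # mirrors the hidden attribute
                    1 - migrant_background)
skill = rng.normal(size=n)                            # genuinely job-relevant

# The labels we train on are past human decisions: skill matters,
# but so does bias against the migrant group.
past_hired = (skill - 1.5 * migrant_background
              + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Train only on the "neutral" features: skill and postcode.
X = np.column_stack([skill, postcode.astype(float)])
model = LogisticRegression().fit(X, past_hired)
print("coefficient on skill:   ", round(model.coef_[0, 0], 2))
print("coefficient on postcode:", round(model.coef_[0, 1], 2))
# The postcode coefficient comes out strongly negative: without ever
# seeing ethnic background, the model reconstructs the human bias
# through a proxy.
```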

How We Start Fixing This

Select on relevant data. I’ve written an article on this before, based on data from Haver about call-center and retail sales jobs. They built their model from theory, from scratch: they broke the work down into 21 subtasks and developed a test for each of them. The result was a remarkable increase in quality of hire, but also the elimination of bias.
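
As a rough illustration of that theory-first approach (the subtask names and weights below are invented, not the actual model): each subtask gets its own test, and a candidate’s score is simply an aggregate of test results, with no biographical data anywhere.

```python
# Invented example: each subtask of the job gets its own test, and a
# candidate's score is a weighted aggregate of test results only.
SUBTASK_WEIGHTS = {
    "handle_inbound_call": 0.30,
    "log_ticket_accurately": 0.25,
    "upsell_relevant_product": 0.25,
    "de_escalate_complaint": 0.20,
    # ...one entry per subtask identified in the job analysis
}

def candidate_score(test_results: dict) -> float:
    """Aggregate per-subtask test scores (each between 0 and 1)."""
    return sum(weight * test_results[task]
               for task, weight in SUBTASK_WEIGHTS.items())

print(candidate_score({
    "handle_inbound_call": 0.9,
    "log_ticket_accurately": 0.7,
    "upsell_relevant_product": 0.6,
    "de_escalate_complaint": 0.8,
}))  # 0.755
```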

Build a testable candidate profile. This is similar to the first point, but instead of starting from scratch, you start with your existing population and have them take tests. You let your current employees, whose performance you already know, take several tests … psychometric or cognitive, whatever is most relevant. Based on that data, you build a profile that can be tested without prejudice, as in the sketch below.
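
Roughly, in code, and again with hypothetical tests and data: the profile is nothing more than a model over the test scores of people whose performance you already know, so a new candidate is scored only on what they can demonstrate.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical test scores for 200 current employees (the tests named
# here are placeholders -- pick whatever is relevant for the role).
tests = ["numerical_reasoning", "conscientiousness", "working_memory"]
employee_scores = rng.normal(size=(200, len(tests)))
good_performer = (employee_scores @ np.array([0.8, 0.5, 0.3])
                  + rng.normal(scale=0.5, size=200) > 0).astype(int)

# The "candidate profile" is just a model over test results: no CV,
# no name, no photo, nothing that can smuggle in a proxy.
profile = LogisticRegression().fit(employee_scores, good_performer)

# A new candidate takes the same tests and is scored on nothing else.
candidate = np.array([[0.4, 1.1, -0.2]])
print("predicted fit:", round(profile.predict_proba(candidate)[0, 1], 2))
```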

Experiment. If you have to start by training your algorithms on existing data, leave plenty of room for experiments. Hire a meaningful number of people the algorithm does not recommend, and give the lessons from those experiments heavy weight in the algorithm’s training. Make sure the algorithm learns from the people it said you should not have hired: was it wrong or right? If it was wrong, it should adjust its settings.
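
One way to put that into practice is sketched below, assuming an already-trained classifier (any model with predict_proba and sample_weight support) and an assumed exploration rate: a share of hires is deliberately drawn from candidates the algorithm rejects, and those experimental hires get extra weight when the model is retrained.

```python
import numpy as np

rng = np.random.default_rng(2)
EXPLORE_RATE = 0.15   # assumed share of hires taken against the model's advice

def choose_hires(model, candidate_features, n_hires):
    """Hire mostly on the model's ranking, but deliberately include some
    candidates it would reject, so the model can be proven wrong."""
    scores = model.predict_proba(candidate_features)[:, 1]
    ranked = np.argsort(-scores)                      # best first
    n_explore = int(n_hires * EXPLORE_RATE)
    picks = list(ranked[: n_hires - n_explore])       # the model's top choices
    picks += list(rng.choice(ranked[n_hires - n_explore:],
                             size=n_explore, replace=False))
    return np.array(picks)

def retrain(model, features, outcomes, was_experiment):
    """Refit with extra weight on the experimental hires, so that what
    the model learns from being contradicted counts heavily."""
    weights = np.where(was_experiment, 5.0, 1.0)      # assumed weighting
    return model.fit(features, outcomes, sample_weight=weights)
```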

We Are Building the Future

We need to remind ourselves that we are building the future. The algorithms we build now will probably be the foundation of much of the labor market for decades to come. We need to be very careful. We might want to bring philosophers or social scientists on board to ask questions about what we are building. We must make sure we do not build our human weaknesses into systems that will probably last a long time.

 
