Like it or not, much of what we encounter online is mediated by computer-run algorithms — complex formulas that help determine our Facebook feeds, Netflix recommendations, Spotify playlists or Google ads.

But algorithms, like humans, can make mistakes. Last month, users found that the photo-sharing site Flickr's new image-recognition technology was labeling dark-skinned people as "apes" and auto-tagging photos of Nazi concentration camps as "jungle gym" and "sport."

How does this happen? Zeynep Tufekci, an assistant professor at the University of North Carolina at Chapel Hill's School of Information and Library Science, tells NPR's Arun Rath that biases can enter algorithms in various ways — not just intentionally.

"More often," she says, "they come through the complexity of the program and the limits of the data they have. And if there are some imperfections in your data — and there always [are] — that's going to be reflected as a bias in your system."


Interview Highlights

On bias in the Facebook "environment"

These systems have very limited input capacity. So for example, on Facebook, which is most people's experience with an algorithm, the only thing you can do to signal to the algorithm that you care about something is to either click on "Like" or to comment on it. The algorithm, by forcing me to only "Like" something, is creating an environment — to be honest, my Facebook feed is full of babies and engagements and happy vacations, which I don't mind. I mean, I like that. When I see it, I click on "Like" — and then Facebook shows me more babies.

And it doesn't show me the desperate, sad news that I also care about a lot, that might be coming from a friend who doesn't have "likable" news.
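
To make that dynamic concrete, here is a minimal Python sketch, invented for illustration rather than taken from Facebook's actual ranking code, of a feed that learns only from "Like" and comment signals. The topic names and the engagement_log are hypothetical; the point is that content a user cares about but would never "Like" gets no positive signal and sinks.

```python
# A minimal sketch of the feedback loop described above, not Facebook's
# actual system: when "Like" and comments are the only signals a feed
# ranker learns from, content you care about but would never "Like"
# (a friend's sad news) keeps losing out to babies and vacations.
from collections import defaultdict

# Hypothetical engagement log of (topic, user_liked_or_commented) pairs.
engagement_log = [
    ("baby_photos", True), ("baby_photos", True), ("vacation", True),
    ("engagement", True), ("sad_news", False), ("sad_news", False),
]

def learn_scores(log):
    """Score each topic by its observed Like/comment rate, the only input available."""
    shown, engaged = defaultdict(int), defaultdict(int)
    for topic, signal in log:
        shown[topic] += 1
        engaged[topic] += int(signal)
    return {topic: engaged[topic] / shown[topic] for topic in shown}

def rank_feed(candidate_topics, scores):
    """Show highest-scoring topics first; topics that never earn a Like sink."""
    return sorted(candidate_topics, key=lambda t: scores.get(t, 0.0), reverse=True)

scores = learn_scores(engagement_log)
print(rank_feed(["sad_news", "baby_photos", "vacation", "engagement"], scores))
# -> ['baby_photos', 'vacation', 'engagement', 'sad_news']
```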

How biases creep into computer code

One, they can be programmed in directly, but I think that's rare. I don't think programmers sit around thinking, you know, "Let us make life hard for a certain group" or not. More often, they come through the complexity of the program and the limits of the data they have. And if there are some imperfections in your data — and there always [are] — that's going to be reflected as a bias in your system.

Sometimes [biases] can come in through the confusing complexity. A modern program can be so multi-branch that no one person has all the scenarios in their head.

For example, increasingly, hiring is being done by algorithms. And an algorithm that looks at your social media output can figure out fairly reliably if you are likely to have a depressive episode in the next six months — before you've exhibited any clinical signs. So it's completely possible for a hiring algorithm to discriminate and not hire people who might be in that category.

It's also possible that the programmers and the hiring committee [have] no idea that's what's going on. All they know is, well, maybe we'll have lower turnover. They can test that. So there's these subtle but crucial biases that can creep into these systems that we need to talk about.
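
As a rough illustration of that point (not any real vendor's system), the sketch below ranks candidates with weights fit to fictional historical turnover data. No one programmed a rule about depression; a hypothetical social-media "negativity" feature simply correlates with early turnover in that data, so the candidates who score high on it never reach the shortlist.

```python
# A hypothetical illustration, not any real hiring product: a model tuned
# only to predict retention can quietly screen out people at risk of a
# depressive episode if a social-media-derived feature stands in for that risk.

# Invented candidates: (name, years_of_experience, social_media_negativity),
# where the last feature is the kind of signal Tufekci describes.
candidates = [
    ("A", 6, 0.1),
    ("B", 5, 0.8),
    ("C", 2, 0.2),
    ("D", 7, 0.9),
]

# Weights "learned" from fictional historical turnover data. Nobody typed
# "penalize depression"; negativity just correlated with leaving early.
W_EXPERIENCE, W_NEGATIVITY = 0.3, -3.0

def retention_score(years, negativity):
    """Higher score means the model predicts the candidate stays longer."""
    return W_EXPERIENCE * years + W_NEGATIVITY * negativity

shortlist = sorted(candidates, key=lambda c: retention_score(c[1], c[2]),
                   reverse=True)[:2]
print([name for name, _, _ in shortlist])
# -> ['A', 'C']: both high-negativity candidates are dropped, including the
# most experienced one, and the hiring team only sees "better retention."
```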

How to limit human bias in computer programs

We can test it under many different scenarios. We can look at the results and see if there's discrimination patterns. In the same way that we try to judge decision-making in many fields, when the decision making is done by humans, we should apply a similar critical lens — but with a computational bent to it, too.

The fear I have is that every time this is talked about, people talk about it as if it's math or physics, therefore some natural, neutral world. And they're programs! They're complex programs. They're not like laws of physics or laws of nature. They're created by us. We should look into what they do and not let them do everything. We should make those decisions explicitly.
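
One concrete form that critical lens can take is an audit of a system's outputs. The sketch below, run on invented selection data, compares selection rates across groups and flags a gap using the "four-fifths" (80 percent) rule of thumb from U.S. employment guidance, used here only as an example threshold.

```python
# A minimal sketch of an output audit: run the system, then compare outcomes
# across groups and look for discrimination patterns. The selection data is
# invented; the 80 percent ("four-fifths") threshold is a common rule of thumb
# from U.S. employment guidelines, used here only as an example cutoff.
from collections import Counter

# (group, was_selected) pairs produced by some decision system under test.
outcomes = [("group_a", True)] * 40 + [("group_a", False)] * 60 \
         + [("group_b", True)] * 20 + [("group_b", False)] * 80

def selection_rates(outcomes):
    """Fraction of each group that the system selected."""
    shown, selected = Counter(), Counter()
    for group, picked in outcomes:
        shown[group] += 1
        selected[group] += int(picked)
    return {group: selected[group] / shown[group] for group in shown}

def disparate_impact_ratio(rates):
    """Ratio of the lowest group selection rate to the highest."""
    return min(rates.values()) / max(rates.values())

rates = selection_rates(outcomes)
ratio = disparate_impact_ratio(rates)
print(rates)            # {'group_a': 0.4, 'group_b': 0.2}
print(round(ratio, 2))  # 0.5, well below the 0.8 rule of thumb
if ratio < 0.8:
    print("Flag for review: selection rates differ sharply across groups.")
```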

Copyright 2015 NPR. To see more, visit http://www.npr.org/.

Transcript

ARUN RATH, HOST:

Like it or not, much of what you encounter online is mediated by computer-run algorithms - complex formulas that determine what kind of movies you'd like or what should show up in your news feed. Behind the scenes, algorithms have a much broader impact as big institutions use them to make sense of massive amounts of data that define our digital lives.

But there's a problem with that, according to Zeynep Tufekci, professor of information and library science at the University of North Carolina at Chapel Hill. She says that like humans, algorithms can make mistakes and even reflect bias.

ZEYNEP TUFEKCI: These systems have very limited input capacity. So for example, on Facebook, which is most people's experience with an algorithm, the only thing you can do to signal to the algorithm that you care about something is to either click on Like or to comment on it. The algorithm, by forcing me to only like something - it's creating an environment - to be honest, my Facebook feed is full of babies and engagements and happy vacations, which I don't mind. I mean, I like that. When I see it, I click on Like, and then Facebook shows me more babies.

And it doesn't always show me the desperate, sad news that I also care about a lot that might be coming from a friend who doesn't have likeable news.

RATH: Now, when we're talking about algorithms behaving badly, for lack of a better term, there's also the ones that are used to process our information out there. I remember when iPhoto added facial recognition. At first, it seemed to have a problem recognizing darker-skinned members of my own family. It seems like it's gotten better. But just recently Flickr rolled out image recognition that tagged dark-skinned people as apes or animals. That clearly can't be intentional? How are biases working into computer code there?

TUFEKCI: Well, there are multiple ways they can creep in. One, they can be programmed in directly. But I think that's rare. I don't think programmers sit around thinking, you know, let us make life hard for a certain group or not. More often, they come through the complexity of the program and the limits of the data they have. And if there are some imperfections in your data - and there always [are] - that's going to be reflected as a bias in your system.

And sometimes they can come in through the confusing complexity. A modern program can be so multi-branched that no one person has all the scenarios in their head. For example, increasingly, hiring is being done by algorithms. And an algorithm that looks at your social media output can figure out fairly reliably if you are likely to have a depressive episode in the next six months before you've exhibited any clinical signs.

So it's completely possible for a hiring algorithm to discriminate and not hire people who might be in that category. It's also possible that the programmers and the hiring committee [have] no idea that's what's going on. All they know is, well, maybe we'll have lower turnover. They can test that. So there's these subtle but crucial biases that can creep into these systems that we need to talk about.

RATH: So how do we fix this? I mean, is there a way, you know - if humans are doing the programming, how do you limit human bias in the programs?

TUFEKCI: Well, we can test it under many different scenarios. We can look at the results and see if there's discrimination patterns. In the same way that we try to judge decision-making in many fields when the decision-making is done by humans, we should apply a similar critical lens, but with a computational bent to it, too.

The fear I have is that every time this is talked about, people talk about it as if it's math or physics, therefore some natural, neutral world. And they're programs. They're complex programs. They're not like laws of physics or laws of nature. They're created by us. We should look into what they do and not let them do everything. We should make those decisions explicitly.

RATH: Zeynep Tufekci is a professor of information and library science at the University of North Carolina at Chapel Hill. Thanks very much.

TUFEKCI: Thank you.

Transcript provided by NPR, Copyright NPR.
