Problem solve Get help with specific problems with your technologies, process and projects.

Algorithmic bias top problem enterprises must tackle

No enterprise today would roll out an obviously biased AI tool, but many remain unaware of the risks of unconscious bias in algorithms, which can produce equally dangerous results.

Most enterprises deploying AI tools today are aware of the potential risks of bias. An algorithm that discriminates against certain classes of people is plainly unacceptable, and most AI practitioners are at least somewhat attuned to this risk.

But a deeper problem lurks inside AI systems that could undermine enterprises' good intentions. It's a problem that, by definition, few are even aware of: unconscious algorithmic bias.

"Bias is insidious because if you have biased data, it doesn't matter how much you have, it's still biased. And those kinds of errors aren't going to be washed out in the quantity of data," said Cheryl Martin, chief data scientist at Alegion, an AI training platform.

Several high-profile and obvious examples of biased AI have put the issue of algorithmic bias on many AI practitioners' agendas. Many are familiar with Microsoft's chatbot Tay, which quickly devolved into a racist and misogynistic troll, or the Google image classification service that classed images of African-Americans as images of gorillas.

Enterprises now know to scratch AI tools that deliver these kinds of obviously biased results. But relatively few are thinking about bias in a more systematic way.

In a recent survey conducted by O'Reilly Media, just 40% of respondents said they check for fairness and bias when building models. Only 17% said they consider measures of bias and fairness when evaluating the success of their models.

Bias often starts with bad data

Without systematically attacking algorithmic bias, it can creep into models in subtle ways.

For example, Martin used the example of an image recognition tool that aims to identify the gender of people in images. Most stock libraries have higher percentages of images of women in kitchens than men, so the algorithm will be trained to expect women to be in pictures in kitchens. This isn't a problem with the algorithm. There was no malicious intent on the part of the developer. The algorithm simply encoded a pre-existing bias.

To fix bias, you have to be aware that it's arising from the data, and you have to correct for the bias that's in the data.
Cheryl MartinAlegion

Martin said addressing this kind of unconscious algorithmic bias requires users to curate data sets before using the data in training algorithms to ensure equal representation. Engineers should also review algorithms' outputs and check for apparent bias before putting AI tools out in the world.

"To fix bias, you have to be aware that it's arising from the data, and you have to correct for the bias that's in the data," Martin said.

Build a diverse team

For Sanjay Srivastava, chief digital officer at consulting firm Genpact, building diverse teams also helps address the issue of unconscious bias. Involving people who come from different backgrounds can help a team identify potential edge use cases or gaps in training data that could cause an AI service to work better for certain populations than others.

He said the key to eliminating unconscious algorithmic bias is to get a broad view of how your tool works and what it aims to accomplish. Only then can you see the areas in which it falls short.

"You have to have people that can bring in common sense and be attuned to political issues and historical context," Srivastava said. "You have to have a diverse set of backgrounds to pick up that nuance."

An image recognition tool can learn bias from training data.
AI algorithms learn from what they see, even when it's biased.

Srivastava said examining every AI tool for potential unconscious bias requires a tradeoff. On the one hand, engineers are under pressure to deliver working products to business units quickly, as time to market is usually a factor. But, on the other hand, involving broad teams and making efforts to eliminate bias can be time-consuming. As the issue of unconscious bias grows in prominence, he expects more enterprises to figure out how to strike this balance.

"You'd be hard pressed to find companies that aren't working on conscious bias," he said. "The issue is unconscious bias. I think the future of AI is very bright. There will be bumps in the road, but you have to learn, and I think it's important to be vigilant."

Dig Deeper on AI ethics issues

Join the conversation


Send me notifications when other members comment.

Please create a username to comment.

How are you working to eliminate unconscious bias from AI algorithms?
"You'd be hard pressed to find companies that aren't working on conscious bias"
Really? What I see is a lot of algorithm builders, and their cultures, living in denial.

Nov 2016 is the obvious example. Some Pollsters are still living in denial that their algorithms were biased.

Insurance company actuaries have the longest historical track record of building corporate algorithms. Repeatedly when they are presented with bias in their algorithms denial of reality is their response.

Is 2018 now somehow different from 2016? or 1968?

Insurance company redlining is the classic historical case. In the 1960s and 1970s the Kennedy Expressway carried suburbanites and O'Hare travelers to Chicago's loop. Chicago is flatland. But the expressway went high in the air over a railroad bridge which went over local highway bridge. Then from the spot (unusually high for flatland Chicago) it went straight down to Hubbard's cave under the Railroads and streets. Of course this spot is precisely where traffic congestion would come to a halt.

Heavy semi-trucks and commuter cars and taxis did not mix well. There was an extremely high number of crashes, injury, death and property damage.

This stretch of expressway went through the far east edge of the zipcode, township and ward. So the zipcode had an extremely high rate of crashes and claims. All those crashes and claims came from suburban commuters and taxis.

The residents of that inner city zipcode were forced to pay extremely high auto insurance premiums due to the crashes they were not involved in.

Of course, the southeast corner of that zipcode was where the mafia chop shops were located. Organized car theft rings would steal cars from O'Hare parking lots; drive the cars to the chop shop and leave the unsaleable parts in a nearby vacant inner city lot. The CPD collected statistics by where the stolen car (parts) were recovered; not by where the cars were stolen.

Cars of people in that zip code were not being stolen. But the people in that zip code faced extremely high insurance premiums. When I moved from that zip code to Bloomington IL (Home of big insurance companies) my monthly premium dropped from $1,200 to $120 for minimum coverage. My 100,000 mile/year business usage of the car doing insurance investigations and setting up loss prevention programs statewide was exactly the same from both locations. Nothing changed except my zipcode.

That $1200 vs $120 difference was due solely to a biased algorithm. That bias still exists to some degree in all kinds of insurance premiums due to the algorithms of the guys who have been in the algorithm business the longest.
I take your point. What happened with readlining in the 60s and 70s was a clear example of conscious, intentional bias. We may still see some of those same kinds of problems today, but I doubt that any bank is doing it specifically to disadvantage certain groups. They may have bad data or data scientists who lack a broad base of experiences building their models. But those are problems of unconscious bias.