
Check model accuracy with Facebook AI's new data set

Facebook paid people for their likenesses in its new Casual Conversations data set. The data set can help researchers check for bias in their AI models.

Facebook AI has released a benchmarking data set to help AI researchers evaluate their computer vision and audio models for bias. The Casual Conversations data set consists of more than 45,000 videos from 3,011 people and is freely available to researchers and academics.

According to Facebook AI, the data set, unveiled April 8, is unique in that each subject labeled their age and gender themselves, and every participant explicitly consented to the use of their likeness. Participants received payment from Facebook.

Benchmarking data set

Trained annotators also labeled participants' skin tones using the Fitzpatrick Skin Type scale, a classification system that divides skin tones into six shades. Developed in 1975, the scale has traditionally been used to determine how sun exposure will affect a person's skin.

Annotators also flagged videos that were recorded in low ambient lighting.

Researchers can use this data set to test the accuracy of models that automatically detect age, gender and skin tone, noted Kashyap Kompella, CEO of RPA2AI Research.

"For example, if your software works well only for lighter skin tones but not for darker skin tones, that indicates you've got work to do -- collecting more data and retraining your algorithms," he said.

Ted Kwartler, vice president of AI trust at AI and autoML vendor DataRobot, said that while he is encouraged by Facebook's approach to transparency with how it's collecting data, "paid, expert-curated and crowd-sourced labeling is fraught with issues."

For example, Kwartler said, the people paid to record videos for the data set -- whom he called "actors" working in a competitive industry -- could feel compelled to lie about their ages. Confirmation bias could also creep into skin tone labeling if one evaluator shares an assessment with another, influencing that person's judgment, he said. Kwartler is also a member of the Federal Advisory Committee on Data for Evidence Building.

AI bias

Facebook, in a statement responding to TechTarget, noted that the people in the data set were not professional actors, but were sourced by an outside vendor, thus removing any problem associated with professional competitiveness.

In the statement, Facebook pointed to an accompanying research paper, which notes that Facebook employed eight people to annotate all participants using the Fitzpatrick scale. No one annotator could see the annotations made by another rater.

Facebook said it aggregated the eight annotators' votes into a weighted histogram and selected the most-voted skin type as the final annotation.
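In simplified, unweighted form, that aggregation is a majority vote over the eight independent annotations. The sketch below is illustrative only; the tie-breaking rule and the exact weighting from Facebook's paper are assumptions, not reproduced from it:

```python
from collections import Counter

def final_skin_type(votes):
    """Pick the most-voted Fitzpatrick type from independent annotator votes.

    `votes` is a list of integers 1-6, one per annotator. Ties are broken
    toward the lower type here purely for determinism.
    """
    counts = Counter(votes)
    # Highest count wins; on a tie, -kv[0] prefers the smaller type number.
    best = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return best[0]

# Eight hypothetical annotator votes for one participant.
print(final_skin_type([3, 3, 4, 3, 2, 3, 4, 3]))  # -> 3
```

Because no annotator could see another's labels, each vote is an independent judgment, which is what makes this kind of aggregation meaningful.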

The annotators were trained to use the Fitzpatrick scale, according to Facebook.

Need for diversity

The social media giant's data set comes as the AI industry is seeing a renewed effort to bring more diversity into the AI creation process. Google, for example, has come under fire recently for terminating two prominent AI ethics researchers, Timnit Gebru and Margaret Mitchell. Gebru, a Black woman, has expressed criticism about the way Google treats women and people of color.


Researchers say that more diversity in AI creation leads to fairer, less biased AI models.

That's important, because image recognition and computer vision systems have historically "failed" dark-skinned people, Kompella said.

"From washroom sensors to online exam proctoring tools, the list of products with coded bias is long," he said.

As AI demand continues to accelerate, so, too, has the demand for high-quality, representative, accurate and bias-aware training data, said Nick McQuire, chief of enterprise research at CCS Insight.

"As an industry, we are [in an ancient era] when it comes to understanding fairness and bias in AI," he said.

While "it remains to be seen how Facebook will equip developers with this research, in the main, it is good news that we are seeing continued progress in the field given its importance to not only AI developers, but society at large as well," he continued.

The Casual Conversations data set is free for researchers to obtain, but using it comes with some caveats.

According to the agreement with Facebook to which users must consent before downloading the data set, the data cannot be used to "measure, detect, predict or otherwise label the race, ethnicity, age or gender of individuals, [or] label facial attributes of individuals." Nor may users modify, translate or create any derivative works based on the data set.

In short, researchers can use the data set only to evaluate the accuracy of their own models.

The agreement notes that if Facebook suspects users are mishandling the data set, the company can audit their use, storage and distribution of the system.

Good first steps

While the Casual Conversations data set could help researchers, it is not a cure-all for bias.

"Progress in AI comes not just from breakthroughs in machine learning techniques but from having large reference benchmark data sets," Kompella said.

He pointed out that the data set contains only three gender categories -- male, female and other -- which some may find insufficiently inclusive.

The data set is a good first step, but it's not a truly holistic solution to mitigating bias throughout the entire AI lifecycle, Kwartler said.

"It's made by data scientists for data scientists. But the model's behavior and impact are not addressed," he said. "There are many involved in the AI lifecycle beyond data scientists, including business users, executives, citizen data scientists, IT teams and more. Solutions must be built for multiple personas who are applying models to solve business problems."

Meanwhile, Facebook has often come under fire for data privacy violations. Most recently, security experts have criticized Facebook for downplaying a data leak that affected millions of users.

"The bar of trust in Facebook is admittedly low at the moment, but in the field of AI research it is still a credible and, in some respects, an engaged player," McQuire said.

"Much of the trust in this data will be determined by how transparent its process is moving forward as the research expands, and how ultimately it makes it available to communities beyond Facebook," he added.
