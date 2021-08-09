



Posted by: Mark Díaz and Emily Denton, Research Scientists, Google Research, Ethical AI Team

Data underlies much of machine learning (ML) research and development, helping to structure what a machine learning algorithm learns and how models are evaluated and benchmarked. However, data collection and labeling can be complicated by challenges such as unconscious bias, restricted data access, and privacy concerns. As a result, machine learning datasets can reflect unfair social biases along dimensions such as race, gender, and age.

Methods for examining datasets that reveal how different social groups are represented are an important part of ensuring that ML model and dataset development is in line with our AI Principles. Such methods can inform the responsible use of ML datasets and point to potential mitigations for unfair outcomes. For example, previous studies have shown that some object recognition datasets are biased towards images from North America and Western Europe, which prompted Google’s Crowdsource effort to balance image representation in other parts of the world.

Today, we use the COCO Captions dataset as a case study to demonstrate some of the features of Know Your Data (KYD), a dataset exploration tool recently introduced at Google I/O. Using this tool, we find a range of gender and age biases in COCO Captions. These biases can be traced to both dataset collection and annotation practices. KYD is a dataset analysis tool that complements the growing suite of responsible AI tools being developed by Google and the broader research community. Currently, KYD only supports analysis of a small set of image datasets, but we are working hard to make the tool accessible beyond this set.

Introducing Know Your Data

Know Your Data aims to help ML research, product, and compliance teams understand datasets, improve data quality, and mitigate fairness and bias issues. KYD provides a variety of features that allow users to explore and examine machine learning datasets. Users can filter, group, and investigate correlations based on annotations that already exist in a particular dataset. KYD also displays labels computed automatically by Google’s Cloud Vision API, giving users an easy way to explore data based on signals that were not originally present in the dataset.

KYD Case Study

As a case study, we use the COCO Captions dataset to explore some of these features. The COCO Captions dataset is an image dataset that contains five human-generated captions for each of over 300,000 images. Given the rich annotations provided by the free-form text, we focus our analysis on signals that already exist in the dataset.

Gender Bias Investigation

Previous studies have demonstrated undesirable gender biases in computer vision datasets, including pornographic imagery of women and image label correlations that align with harmful gender stereotypes. We use KYD to investigate gender bias within COCO Captions by examining gendered correlations within the image captions. We find a gender bias in the depiction of different activities across the images in the dataset, as well as biases in how people of different genders are described by annotators.

The first part of the analysis aimed to reveal gender biases regarding the various activities shown in the dataset. We examined images captioned with words that describe various activities and analyzed their relationship to gendered caption words such as “man” and “woman.” The KYD Relations tab makes it easy to explore the relationship between two different signals in a dataset by visualizing how much more (or less) often the two signals co-occur than would be expected by chance. Each cell shows a positive (blue) or negative (orange) correlation between two specific signal values, along with the strength of that correlation.
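The idea of comparing observed co-occurrence against chance can be sketched in a few lines of Python. This is a minimal illustration, not KYD's actual computation: the function names, the whitespace tokenizer, the independence baseline, and the toy captions are all assumptions made for the example.

```python
from typing import Iterable, List, Set


def tokenize(caption: str) -> Set[str]:
    """Lowercase a caption and split it into a set of words, stripping punctuation."""
    return {word.strip(".,!?\"'").lower() for word in caption.split()} - {""}


def cooccurrence_ratio(captions: Iterable[str], word_a: str, word_b: str) -> float:
    """Observed / expected co-occurrence of two words, assuming independence.

    A ratio above 1 means the words appear together more often than chance
    would predict; a ratio below 1 means less often.
    """
    token_sets: List[Set[str]] = [tokenize(c) for c in captions]
    total = len(token_sets)
    count_a = sum(word_a in tokens for tokens in token_sets)
    count_b = sum(word_b in tokens for tokens in token_sets)
    count_both = sum(word_a in tokens and word_b in tokens for tokens in token_sets)
    if count_a == 0 or count_b == 0:
        raise ValueError("both words must occur at least once")
    # Expected joint count if the two words were statistically independent.
    expected = count_a * count_b / total
    return count_both / expected


# Hypothetical toy captions, not taken from COCO Captions.
toy_captions = [
    "A woman cooking in a kitchen",
    "A woman cooking dinner",
    "A man surfing a wave",
    "A man cooking on a grill",
    "A man surfing at sunset",
    "A woman reading a book",
]

ratio_woman_cooking = cooccurrence_ratio(toy_captions, "woman", "cooking")
ratio_man_cooking = cooccurrence_ratio(toy_captions, "man", "cooking")
```

With these toy captions, “woman” and “cooking” co-occur at about 1.33 times the chance rate, while “man” and “cooking” co-occur at about 0.67 times the chance rate. Matching on whole words rather than substrings matters here, because “woman” contains “man.”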

KYD also allows users to filter rows in the relations table based on substring matches. Using this feature, we first looked up caption words containing “-ing” as a simple way to filter by verb. We immediately saw strong gendered correlations:

Using KYD to analyze the relationship between any word and gendered words. Each cell indicates whether the two respective words co-occur in the same caption more often (up arrow) or less often (down arrow) than pure chance.
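The “-ing” substring filter described above is a heuristic, and a small Python sketch shows why its results still need a manual look. The function name and the vocabulary below are hypothetical and only illustrate substring matching; they are not part of KYD.

```python
from typing import Iterable, List


def filter_by_substring(words: Iterable[str], substring: str) -> List[str]:
    """Return the distinct words that contain `substring`, sorted alphabetically."""
    return sorted(word for word in set(words) if substring in word)


# Hypothetical caption vocabulary, chosen to show both hits and false positives.
vocabulary = ["cooking", "surfing", "shopping", "kitchen", "building", "ring", "man"]

ing_words = filter_by_substring(vocabulary, "ing")
```

Here the filter keeps the activity words “cooking,” “shopping,” and “surfing,” but it also keeps “building” and “ring,” which are usually nouns. A substring match is a quick way to narrow the table, not a reliable verb detector.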

Digging further into these correlations, we found that some activities stereotypically associated with women, such as “shopping” and “cooking,” co-occur with images captioned with “women” or “woman” at a higher rate than with images captioned with “men” or “man.” In contrast, captions that describe many physically intensive activities, such as “skateboarding,” “surfing,” and “snowboarding,” co-occur at a higher rate with images captioned with “man” or “men.”

Individual image captions may not use stereotypical or derogatory language, as in the example below. However, if certain gender groups are over- (or under-) represented within a particular activity across the whole dataset, models developed from the dataset are at risk of learning stereotypical associations. KYD makes it easy to surface and quantify this risk and to plan mitigations for it.

An image with one of the captions: “Two women cooking in a beige and white kitchen.” Image licensed under CC-BY 2.0.

In addition to examining biases regarding the social groups depicted in the various activities, we also investigated biases in how annotators describe the appearance of people they perceive as male or female. Inspired by media scholars who have examined the “male gaze” embedded in other forms of visual media, we investigated how often individuals perceived as women in COCO were described using adjectives that position them as objects of desire. KYD allowed us to easily examine the co-occurrence of words associated with binary gender (such as “female/girl/woman” and “male/man/boy”) and words associated with evaluating physical attractiveness. Importantly, these are captions written by human annotators, who make a subjective assessment of the gender of the people in the image and choose a descriptor for attractiveness. We see that the words “attractive,” “beautiful,” “pretty,” and “sexy” are overrepresented when describing people who are perceived as women, compared to those who are perceived as men, echoing what the media scholarship mentioned above describes about how gender is viewed in visual media.

KYD screenshot showing the relationship between words describing attractiveness and gendered words. For example, “attractive” and “male/man/boy” co-occur 12 times, but would be expected to co-occur about 60 times by chance (a ratio of 0.2x). On the other hand, “attractive” and “female/woman/girl” co-occur 2.62 times more often than chance.

KYD also allows us to manually inspect the images behind each relation by clicking on the relation in question. For example, we can see images whose captions contain a female term (such as “woman”) and the word “beautiful.”

Age Bias Investigation

Adults over the age of 65 have been shown to be underrepresented in datasets relative to their presence in the general population. A first step toward improving age representation is to allow developers to assess it in their datasets. By looking at caption words that describe different activities and analyzing their relationship to caption words that describe age, KYD helped us assess the range of example captions depicting older adults. Having example captions of adults in a range of environments and activities is important for a variety of tasks, such as image captioning and pedestrian detection.

The first trend revealed by KYD is how rarely annotators describe people as older adults in captions detailing various activities. The Relations tab also shows a tendency for “elderly,” “old,” and “older” not to occur with verbs that describe various physical activities that may be important for a system to detect. It is important to note that, compared to “young,” “old” is more often used to describe something other than a person, such as belongings or clothing, so these relations also capture some uses that do not describe people.

KYD screenshot showing relations for age-related words.

The underrepresentation of captions containing references to older adults that we examined here could be rooted in a relative lack of images depicting older adults, as well as in a tendency for annotators to omit age-related terms for older people when describing people in images. While the intersection of “old” and “running” shows a negative relation, manual inspection reveals that it contains no older people and a number of locomotives. KYD makes it easy to inspect relations both quantitatively and qualitatively to identify the strengths of a dataset and the areas that need improvement.
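The qualitative half of that inspection amounts to pulling up the examples behind a relation and reading them. A minimal Python sketch follows, assuming captions are available as plain strings; the helper name and the captions are hypothetical, and this is not how KYD itself retrieves examples.

```python
from typing import Iterable, List


def captions_with_all(captions: Iterable[str], terms: Iterable[str]) -> List[str]:
    """Return the captions that contain every term as a whole word (case-insensitive)."""
    wanted = {term.lower() for term in terms}
    matches = []
    for caption in captions:
        words = {word.strip(".,!?\"'").lower() for word in caption.split()}
        if wanted <= words:  # every wanted term appears in this caption
            matches.append(caption)
    return matches


# Hypothetical captions, not taken from COCO Captions.
toy_captions = [
    "An old steam train running along the tracks",
    "A young man running on the beach",
    "An old suitcase on a bench",
]

to_review = captions_with_all(toy_captions, ["old", "running"])
```

In this toy set the only caption matching both “old” and “running” describes a train rather than a person, which is the kind of ambiguity that a count alone would hide.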

Conclusion

Understanding the contents of ML datasets is an important first step in developing suitable strategies to mitigate the downstream impact of unfair dataset bias. The above analysis points to several potential mitigations. For example, correlations between particular activities and social groups, which can lead trained models to reproduce social stereotypes, can potentially be mitigated by “dataset balancing.” However, as our analysis of how different genders are described by annotators shows, mitigations that focus solely on dataset balancing are not sufficient. We found that the final dataset reflects the annotators’ subjective judgments of the people depicted in the images, which suggests that a deeper look at image annotation methods is needed. One solution for data practitioners developing image captioning datasets is to consider integrating guidelines that have been developed for writing image descriptions that are sensitive to race, gender, and other identity categories.

The above case study covers only some of the KYD features. For example, Cloud Vision API signals are also integrated into KYD and can be used to infer signals that annotators have not labeled directly. We encourage the broader ML community to perform their own KYD case studies and share their findings.

KYD complements other dataset analysis tools being developed across the ML community, including Google’s growing Responsible AI toolkit. We look forward to ML practitioners using KYD to better understand their datasets and mitigate potential bias and fairness concerns. If you have any feedback about KYD, please contact us at knowyourdata-feedback@google.com.

Acknowledgments

The analysis and write-up in this post were conducted with equal contribution by Emily Denton, Mark Díaz, and Alex Hanna. We thank Marie Pellat, Ludovic Peran, Daniel Smilkov, Nikhil Thorat, and Tsung-Yi for their contributions to and reviews of this post.

