The AI industry has made significant efforts to address biases in algorithms based on the lightness or darkness of people’s skin tones. However, recent research from Sony AI calls for a more comprehensive approach that also accounts for red and yellow skin hues. In a paper published last month, William Thong and Alice Xiang of Sony AI, together with Przemyslaw Joniak of the University of Tokyo, propose a “multidimensional” measurement of skin color aimed at making AI systems more diverse and representative.
The issue of skin color bias in AI systems has been documented by researchers for years. A notable 2018 study by Joy Buolamwini and Timnit Gebru found that commercial facial-analysis systems were least accurate on darker-skinned women. Since then, companies have stepped up efforts to test the accuracy of their systems across a diverse range of skin tones.
Sony’s research emphasizes that the scales currently used to evaluate skin tone focus primarily on how light or dark skin is, a narrow framing that leaves many biases undetected and unaddressed. Alice Xiang, Sony’s global head of AI Ethics, says the goal of the work is to replace scales that measure only light versus dark. In their blog post, Sony’s researchers specifically note that such scales overlook biases against East Asian, South Asian, Hispanic, and Middle Eastern populations, among others who do not fit neatly along a light-to-dark spectrum.
To illustrate the impact of their proposed measurement, Sony’s researchers found that common image datasets overrepresent people with lighter, redder skin tones and underrepresent those with darker, yellower skin, which can make the AI systems trained on them less accurate. They also found, for instance, that Twitter’s image cropper and two other image-generating algorithms favored redder skin tones, while other AI systems misclassified people with redder skin hues as “more smiley.”
As a solution, Sony suggests adopting an automated approach based on the preexisting CIELAB color standard, moving away from the manual categorization employed by the Monk scale, an approach Sony considers limiting.
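To make the idea concrete, here is a minimal Python sketch (not Sony’s released code) of how a two-dimensional skin color descriptor could be computed automatically in CIELAB: the L* channel captures the light-to-dark axis, while a hue angle derived from the a* (red–green) and b* (yellow–blue) channels captures the red-to-yellow axis. The function name, the patch-averaging step, and the use of scikit-image for the color conversion are illustrative assumptions.

```python
# Illustrative sketch: a two-dimensional skin color descriptor in CIELAB.
import numpy as np
from skimage import color  # pip install scikit-image


def skin_tone_descriptors(rgb_patch: np.ndarray) -> tuple[float, float]:
    """Return (perceptual lightness L*, hue angle in degrees) for an
    sRGB skin patch shaped (H, W, 3) with values in [0, 1]."""
    lab = color.rgb2lab(rgb_patch)                  # convert sRGB -> CIELAB
    L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
    lightness = float(L.mean())                     # light-vs-dark dimension
    hue = float(np.degrees(np.arctan2(b.mean(), a.mean())))  # red-vs-yellow dimension
    return lightness, hue


# Example: a flat patch of a light, reddish skin tone.
patch = np.full((8, 8, 3), [0.85, 0.62, 0.55])
print(skin_tone_descriptors(patch))  # larger hue angle = yellower, smaller = redder
```

Because both values are computed directly from pixel data, a descriptor like this could be applied to whole datasets without the manual labeling that a fixed categorical scale requires.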
However, critics argue that part of the appeal of the Monk Skin Tone Scale, named after its creator Ellis Monk, lies in its simplicity. The scale intentionally includes only ten skin tones to offer diversity without introducing inconsistencies associated with having more categories. Ellis Monk emphasizes that cognitive limitations make it challenging for individuals to accurately and reliably differentiate between a larger number of scale points.
Monk also pushes back against the notion that his scale fails to consider undertones and hues. He explains that the research behind the scale focused on determining which undertones to prioritize along the scale and at which points.
Major players in the AI field, including Google and Amazon, have reportedly appreciated Sony’s research. Both companies have expressed their intention to review the paper.
In conclusion, Sony AI’s research sheds light on the limitations of current skin tone scales in AI systems and calls for a multidimensional approach that includes red and yellow hues. By adopting the CIELAB color standard and moving away from manual categorization, Sony hopes to address biases against a wider range of populations and create more diverse and accurate AI systems. The response from industry leaders suggests an openness to considering, and potentially integrating, these proposals into existing AI technologies.