Researchers at Sony AI have introduced a new benchmark dataset of more than 10,000 human images designed to help artificial intelligence developers identify and mitigate biases in computer vision models. The Fair Human-Centric Image Benchmark, or FHIBE, is the first publicly available, globally diverse, and consent-based human image dataset for evaluating bias across a wide variety of computer vision tasks. The dataset and an accompanying paper were published in the journal Nature on November 5, 2025.
The new dataset addresses persistent challenges in the AI industry related to biased and ethically compromised training data. Many existing AI models were developed using flawed datasets, often assembled by scraping images from the web without user consent. These practices can produce biased or harmful models, and the lack of ethically sourced, globally representative benchmarks has made it difficult to assess whether a model functions equitably on a global scale. FHIBE was created to establish a new global benchmark for fairness evaluation in computer vision models and to catalyze industry-wide improvements in responsible and ethical protocols across the entire data lifecycle.
A New Standard for Ethical Data Collection
The development of FHIBE stems from Sony’s long-standing commitment to responsible AI, which began in 2018 with the establishment of the Sony Group AI Ethics Guidelines. Over three years, a global team of Sony AI researchers, engineers, and project managers worked to develop rigorous procedures for data collection, annotation, and validation. The dataset includes 10,318 images of 1,981 paid participants from 81 countries or regions, making it one of the most globally diverse and comprehensively annotated datasets in existence.
Consent and Compensation
A key feature of the FHIBE dataset is its emphasis on ethical data collection. Unlike many existing datasets, every image in FHIBE was contributed with the participant's explicit consent. Participants were fairly compensated for their contributions and may withdraw their images at any time, setting a new standard for transparency and respect in AI research.
Comprehensive Annotations
The images in the FHIBE dataset include comprehensive annotations of demographic and physical attributes. These annotations capture details such as age, pronoun category, ancestry, and hair and skin color. The dataset also includes information about environmental factors and camera settings, enabling nuanced assessments of fairness and bias across a wide range of demographic attributes and their intersections.
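To make the idea of intersectional annotation concrete, here is a minimal sketch of what a FHIBE-style record and grouping step might look like in Python. The field names and example values are illustrative assumptions for this article, not the dataset's actual schema.

```python
# Illustrative sketch only: field names are assumptions, not FHIBE's real schema.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class ImageAnnotation:
    image_id: str
    age_group: str      # e.g. "18-29"
    pronouns: str       # e.g. "She/Her/Hers"
    ancestry: str       # e.g. "African"
    hair_color: str
    skin_tone: str      # e.g. a value on a standard skin-tone scale
    lighting: str       # environmental factor, e.g. "indoor, dim"
    camera_model: str   # capture metadata

def group_by_intersection(records):
    """Bucket records by (pronouns, ancestry) pairs for intersectional analysis."""
    groups = defaultdict(list)
    for r in records:
        groups[(r.pronouns, r.ancestry)].append(r)
    return groups
```

Grouping on attribute pairs rather than single attributes is what allows fairness metrics to be computed at the intersections the article describes, not just along one demographic axis at a time.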
Revealing and Diagnosing AI Biases
Using FHIBE, the Sony AI research team confirmed previously documented biases in existing AI models and demonstrated that the dataset can support granular diagnoses of the factors behind them. The team used FHIBE to evaluate both narrow computer vision models and large-scale multimodal generative models.
Identifying Overlooked Factors
In one example, FHIBE confirmed that some models had lower accuracy for individuals using “She/Her/Hers” pronouns. The dataset’s detailed annotations also helped trace this disparity to greater hairstyle variability among those participants, a factor previously overlooked in fairness research.
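The following sketch shows the kind of disaggregated evaluation this implies: compute per-group accuracy, then condition on a candidate confounder (here a hypothetical "hairstyle" annotation) to see whether the gap persists within each hairstyle bucket. The helper and field names are assumptions for illustration, not the paper's actual code.

```python
# Hedged sketch of disaggregated accuracy analysis; not the paper's tooling.
from collections import defaultdict

def accuracy_by_group(results, group_key, condition_key=None):
    """results: list of dicts with a 'correct' bool plus annotation fields."""
    buckets = defaultdict(lambda: [0, 0])  # key -> [num_correct, num_total]
    for r in results:
        key = (r[group_key], r[condition_key]) if condition_key else r[group_key]
        buckets[key][0] += int(r["correct"])
        buckets[key][1] += 1
    return {k: correct / total for k, (correct, total) in buckets.items()}

results = [
    {"correct": True,  "pronouns": "She/Her/Hers", "hairstyle": "long"},
    {"correct": False, "pronouns": "She/Her/Hers", "hairstyle": "updo"},
    {"correct": True,  "pronouns": "He/Him/His",   "hairstyle": "short"},
]
print(accuracy_by_group(results, "pronouns"))                            # headline gap
print(accuracy_by_group(results, "pronouns", condition_key="hairstyle")) # gap per hairstyle
```

If the accuracy gap shrinks once results are conditioned on hairstyle, that points to hairstyle variability, rather than pronoun group itself, as the driver of the disparity.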
Uncovering Harmful Stereotypes
The research team also found that today’s AI models reinforce stereotypes when prompted with neutral questions about a subject’s occupation. The tested models were particularly skewed against specific pronoun and ancestry groups, describing subjects with harmful terms. When prompted about what crimes an individual might have committed, models produced toxic responses at higher rates for individuals of African or Asian ancestry, those with darker skin tones, and those identifying as male.
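Measuring this kind of disparity reduces to computing the rate of toxic responses per demographic group. Below is a minimal sketch under stated assumptions: `is_toxic` stands in for whatever toxicity classifier an evaluator chooses, and the field names are hypothetical.

```python
# Minimal sketch of per-group toxic-response rates; classifier and fields assumed.
from collections import defaultdict

def toxic_rate_by_group(responses, group_key, is_toxic):
    """responses: list of dicts with a 'text' field plus annotation fields."""
    counts = defaultdict(lambda: [0, 0])  # group -> [num_toxic, num_total]
    for r in responses:
        counts[r[group_key]][0] += int(is_toxic(r["text"]))
        counts[r[group_key]][1] += 1
    return {g: toxic / total for g, (toxic, total) in counts.items()}

# Toy stand-in for a real toxicity classifier, used here only so the
# sketch runs end to end.
BLOCKLIST = {"criminal", "dangerous"}
def naive_is_toxic(text):
    return any(word in text.lower() for word in BLOCKLIST)
```

In practice a trained toxicity classifier or human review would replace the blocklist; the per-group aggregation step is what surfaces the skew the researchers report.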
Implications for the Future of AI
The release of the FHIBE dataset is a significant step forward in the effort to develop more equitable and trustworthy AI. By providing a new global benchmark for fairness evaluation, Sony AI aims to catalyze industry-wide improvements in responsible and ethical data practices. The public availability of the dataset will allow researchers and developers worldwide to rigorously evaluate bias and accuracy across a variety of computer vision tasks, including facial recognition, object detection, and visual question answering.
Sony AI has stated that no existing dataset it examined fully met its benchmarks for fairness and consent. The company hopes that FHIBE will prove that ethical, diverse, and fair data collection is achievable, helping to build a foundation for more trustworthy AI from the ground up. The dataset will be updated over time so that it remains a valuable resource for the AI research community.