Debiasing Vision-Language Models for Vision Tasks: A Survey

25/01/2025 Frontiers Journals

In recent years, foundation Vision-Language Models (VLMs) such as CLIP [1], which enable zero-shot transfer to a wide variety of domains without fine-tuning, have led to a significant shift in machine learning systems. Their success can be credited primarily to three factors: Web-scale multimodal data, self-contrastive losses, and the rise of the Transformer architecture. Despite these impressive capabilities, VLMs are prone to inheriting biases from the uncurated datasets scraped from the Internet [4–8]. The survey examines these biases from three perspectives: (1) Label bias: certain classes (words) appear more frequently than others in the pre-training data. (2) Spurious correlation: non-target features, e.g., image background, are correlated with labels, resulting in poor group robustness. (3) Social bias: a special form of spurious correlation that focuses on societal harm; unaudited image-text pairs might encode human prejudice tied to attributes such as gender, ethnicity, and age that are correlated with targets. These biases are subsequently propagated to downstream tasks, leading to biased predictions.
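Group robustness, mentioned under spurious correlation above, is commonly quantified by worst-group accuracy: the test set is split into groups by (label, spurious attribute), and the lowest per-group accuracy is reported. The following is a minimal sketch of that metric, not code from the survey; the function names and the toy bird/background data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def group_accuracies(preds, labels, attrs):
    """Return accuracy for each (label, spurious attribute) group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label, attr in zip(preds, labels, attrs):
        group = (label, attr)
        total[group] += 1
        correct[group] += int(pred == label)
    return {group: correct[group] / total[group] for group in total}

def worst_group_accuracy(preds, labels, attrs):
    """Lowest accuracy over all groups; low values signal poor group robustness."""
    return min(group_accuracies(preds, labels, attrs).values())

# Hypothetical toy data: the label is the bird type, the spurious attribute
# is the image background. The "model" below predicts from background alone.
labels = ["waterbird"] * 4 + ["landbird"] * 4
attrs = ["water", "water", "water", "land", "land", "land", "land", "water"]
preds = ["waterbird" if a == "water" else "landbird" for a in attrs]

average = sum(p == y for p, y in zip(preds, labels)) / len(labels)
worst = worst_group_accuracy(preds, labels, attrs)
print(f"average accuracy: {average:.2f}, worst-group accuracy: {worst:.2f}")
```

In this toy case the average accuracy looks acceptable while the worst-group accuracy collapses, because every image whose background disagrees with its label is misclassified. That gap is the usual symptom of a model relying on a spurious feature.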
A research team provides an overview of these three biases as they arise in visual classification with VLMs, along with strategies to mitigate them. The survey appears in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
Currently, most VLM debiasing methods focus on discriminative tasks, such as image classification, while generative tasks such as image captioning and image generation have received little attention. Debiasing generative tasks could become a significant research direction in the future.
DOI: 10.1007/s11704-024-40051-3
Beier ZHU, Hanwang ZHANG. Debiasing vision-language models for vision tasks: a survey. Front. Comput. Sci., 2025, 19(1): 191321, https://doi.org/10.1007/s11704-024-40051-3
Regions: Asia, China
Keywords: Applied science, Artificial Intelligence, Computing
