Debiasing Vision-Language Models for Vision Tasks: A Survey

25.01.2025 Frontiers Journals

In recent years, foundation Vision-Language Models (VLMs) such as CLIP [1], which enable zero-shot transfer to a wide variety of domains without fine-tuning, have led to a significant shift in machine learning systems. Their success can be credited primarily to three factors: Web-scale multimodal data, self-contrastive losses, and the rise of the Transformer architecture. Despite these impressive capabilities, VLMs are prone to inheriting biases from the uncurated datasets scraped from the Internet [4–8]. The authors examine these biases from three perspectives: (1) Label bias: certain classes (words) appear more frequently than others in the pre-training data. (2) Spurious correlation: non-target features, e.g., the image background, are correlated with labels, resulting in poor group robustness. (3) Social bias: a special form of spurious correlation that focuses on societal harm. Unaudited image-text pairs might contain human prejudice tied to attributes such as gender, ethnicity, and age that are correlated with targets. These biases are subsequently propagated to downstream tasks, leading to biased predictions.
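To make "poor group robustness" concrete, the following minimal sketch compares average accuracy with worst-group accuracy, a commonly used measure of how much a classifier relies on a spurious feature. It is an illustration only and is not taken from the survey: the function names, the bird and background labels, and the toy predictions are all hypothetical.

```python
from collections import defaultdict


def group_accuracies(labels, groups, predictions):
    """Return the accuracy for each (label, group) pair."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, g, p in zip(labels, groups, predictions):
        key = (y, g)
        total[key] += 1
        correct[key] += int(p == y)
    return {key: correct[key] / total[key] for key in total}


def worst_group_accuracy(labels, groups, predictions):
    """Return the lowest accuracy over all (label, group) pairs."""
    return min(group_accuracies(labels, groups, predictions).values())


# Hypothetical toy data: the label is the bird type, the group is the
# image background (the spurious feature). The made-up classifier below
# effectively predicts the background instead of the bird.
labels = ["waterbird"] * 4 + ["landbird"] * 4
groups = ["water", "water", "water", "land",
          "land", "land", "land", "water"]
predictions = ["waterbird", "waterbird", "waterbird", "landbird",
               "landbird", "landbird", "landbird", "waterbird"]

average = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
worst = worst_group_accuracy(labels, groups, predictions)

print(f"average accuracy:     {average:.2f}")
print(f"worst-group accuracy: {worst:.2f}")
```

In this toy setup the average accuracy looks acceptable because most images pair a bird with its typical background, while the minority groups (a waterbird on land, a landbird on water) are always misclassified. That gap between the average and the worst group is what a spurious correlation tends to produce.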
A research team provides an overview of the three biases prevalent in visual classification with VLMs, along with strategies to mitigate them, in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
Currently, most VLM debiasing methods focus on discriminative tasks, such as image classification, while generative tasks, such as image captioning and image generation, have received little attention. Debiasing these generative tasks could become a significant research direction in the future.
DOI: 10.1007/s11704-024-40051-3

https://journal.hep.com.cn/fcs/EN/10.1007/s11704-024-40051-3

Beier ZHU, Hanwang ZHANG. Debiasing vision-language models for vision tasks: a survey. Front. Comput. Sci., 2025, 19(1): 191321, https://doi.org/10.1007/s11704-024-40051-3

Attached documents

Figure 1

Regions: Asia, China

Keywords: Applied science, Artificial Intelligence, Computing
