TV100: A TV Series Dataset that Pre-Trained CLIP Has Not Seen

13/12/2024 Frontiers Journals

In recent years, the field of deep learning has experienced remarkable growth, leading to the emergence of large, pre-trained models such as ChatGPT, which demonstrates significant capability in understanding and responding to human language inputs, and DALL-E, which creatively generates images from textual descriptions in a zero-shot manner. Another notable innovation in this domain is CLIP (Contrastive Language-Image Pre-Training), a model that excels at representation learning by bridging the image and text modalities, enabling it to perform classification, likewise in a zero-shot manner. CLIP, trained on a diverse array of images and natural language descriptions readily available on the internet, can interpret natural language instructions to execute a wide range of classification tasks without specific optimization for those tasks. These advanced models have shown remarkable effectiveness in various real-world applications, showcasing their potential even when not trained on task-specific data. Notably, CLIP achieved a zero-shot accuracy of 76.2% on the ImageNet dataset. However, a pressing question remains within the machine learning community: Does CLIP know everything?
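For readers unfamiliar with how CLIP performs zero-shot classification, the minimal sketch below illustrates the standard recipe: each candidate class name is wrapped in a text prompt, and the image is assigned to the prompt whose embedding it matches most closely. The model checkpoint, class names, and image path are illustrative placeholders, not details from the paper.

```python
# Minimal sketch of CLIP zero-shot classification (placeholders, not the paper's setup).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

class_names = ["Series A", "Series B", "Series C"]        # hypothetical class names
prompts = [f"a photo of the TV series {n}" for n in class_names]
image = Image.open("poster.jpg")                          # placeholder image path

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the image-text similarity for each prompt;
# the predicted class is the prompt with the highest similarity.
probs = outputs.logits_per_image.softmax(dim=-1)
print(class_names[probs.argmax().item()])
```

Because no task-specific training is involved, this procedure only works when the class names (and the visual concepts behind them) were already covered by CLIP's pre-training data, which is exactly what TV100 is designed to test.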

To address this question, a research team led by Da-Wei Zhou published their research on 15 October 2024 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

This question is pivotal. If a model could truly understand and react to all information, the exploration of alternative models might become redundant. Nevertheless, the reality is that no model, including CLIP, possesses complete knowledge. Our world is in constant flux, with new data, objects, categories, and information emerging regularly. For instance, ChatGPT’s knowledge of world events, such as political changes, is contingent upon its training data, and CLIP cannot recognize images of products released after its last update, such as the ‘Apple Vision Pro’ launched in 2023.

This paper focuses on identifying datasets unknown to CLIP, a task of considerable importance. Given CLIP’s training on the extensive LAION dataset, identifying such datasets not only facilitates the application of transfer learning to downstream tasks but also provides a means to evaluate CLIP’s ability to detect out-of-distribution or novel instances and to learn continually. This is particularly relevant in the context of addressing the hallucination issues prevalent in large models. To advance research in this area, we introduce a dataset of TV series released post-2021, named TV100, to explore CLIP’s performance further.

To investigate whether a pre-trained CLIP knows these images, we evaluate both its zero-shot performance and its fine-tuned performance. We find that pre-trained CLIP cannot recognize any of the classes in the dataset. By contrast, if we fine-tune the CLIP model on the images, performance drastically improves, indicating that the dataset is learnable and separable. This dataset holds significant potential for use in various research areas, including the evaluation of incremental learning, novel class discovery, and long-tailed learning, among others.
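The press release does not spell out the fine-tuning recipe. The sketch below shows one common way to fine-tune CLIP for such a classification task: attaching a linear classification head to the image encoder and training on the dataset's images. The directory layout, hyper-parameters, and the assumption of 100 classes (inferred from the dataset name) are illustrative, not the authors' exact protocol.

```python
# Sketch of fine-tuning CLIP's image encoder with a linear head (assumed setup).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from transformers import CLIPVisionModelWithProjection

device = "cuda" if torch.cuda.is_available() else "cpu"
encoder = CLIPVisionModelWithProjection.from_pretrained(
    "openai/clip-vit-base-patch16").to(device)
head = nn.Linear(encoder.config.projection_dim, 100).to(device)  # 100 classes assumed

# Preprocessing with CLIP's published normalization statistics.
tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.4815, 0.4578, 0.4082), (0.2686, 0.2613, 0.2758)),
])
train_set = datasets.ImageFolder("TV100/train", transform=tfm)  # hypothetical layout
loader = DataLoader(train_set, batch_size=64, shuffle=True)

optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-5)
loss_fn = nn.CrossEntropyLoss()

for images, labels in loader:                 # one epoch shown for brevity
    images, labels = images.to(device), labels.to(device)
    feats = encoder(pixel_values=images).image_embeds  # projected image features
    loss = loss_fn(head(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

A frozen-encoder variant (training only the linear head) is an equally common baseline; the contrast between zero-shot failure and strong fine-tuned accuracy is what indicates the classes are absent from CLIP's pre-training knowledge rather than inherently unlearnable.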

DOI: 10.1007/s11704-024-40217-z

Attached files
  • Detailed information about TV100, including the data collection process, the country distribution, and the class distribution. It also contains an empirical evaluation of zero-shot and fine-tuned performance.
Regions: Asia, China
Keywords: Applied science, Computing

