Machine learning has made significant progress when large amounts of clean labels and static data are available. In many real-world applications, however, the data change over time, the classes are imbalanced, and massive clean annotations are difficult to obtain; that is, noisy labels, class imbalance, and time series must be handled simultaneously.
To address these problems, a research team led by Yu-Feng LI published their research on 15 December 2024 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed a framework called RTS for robust learning from time series with noisy labels. The framework is designed for time series data and can simultaneously mitigate class imbalance and label noise without prior knowledge of the noise transition matrix. Extensive experiments on real-world applications and benchmarks, together with a theoretical analysis, clearly demonstrate the effectiveness of RTS.
In the research, they analyze two challenges of learning from time series data with noisy labels: (a) label noise in time series data misleads the learning of feature representations, making the representations unreliable; (b) samples of minority classes are scarce and may be assigned incorrect labels, leading to degraded performance on minority classes. To address these challenges, they propose a novel framework, RTS, which consists of two modules: Noise-tolerant Time Series Representation and Purified Oversampling Learning.
First, the noise-tolerant time series representation module fully exploits the noisy supervision so that the model obtains robust feature representations and can distinguish clean samples; it integrates label smoothing with the small-loss criterion to learn reliable representations and thereby accurately identify clean samples. The purified oversampling learning module then develops a novel oversampling scheme for time series, balancing and augmenting the purified dataset to train an unbiased model (a rough sketch of these two ingredients follows below). They also provide a theoretical analysis shedding light on the feasibility of the proposal. Empirical experiments on diverse tasks, such as a house-buyer evaluation task from a real-world application and various benchmark tasks, clearly demonstrate that RTS outperforms competitive methods.
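The article does not give implementation details, so the following is a minimal sketch, assuming PyTorch, of the two ingredients named above: a label-smoothed per-sample loss combined with small-loss selection of likely-clean samples, and a simple jitter-based oversampling of a minority class of time series. All function names, the smoothing factor, the keep ratio, and the jitter augmentation are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def label_smoothed_ce(logits, targets, num_classes, eps=0.1):
    """Per-sample cross-entropy with label smoothing (eps=0.1 is an assumed value)."""
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, eps / (num_classes - 1))
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - eps)
    return -(smooth * log_probs).sum(dim=-1)

def select_clean(per_sample_loss, keep_ratio=0.7):
    """Small-loss criterion: treat the lowest-loss fraction of a batch as likely clean."""
    k = max(1, int(keep_ratio * per_sample_loss.numel()))
    return torch.topk(per_sample_loss, k, largest=False).indices

def oversample_minority(x, y, target_count, sigma=0.01):
    """Naive time-series oversampling: replicate minority samples with Gaussian jitter.
    x: (n, length, channels) tensor of a single minority class; y: matching labels."""
    reps = []
    while x.shape[0] + sum(r.shape[0] for r in reps) < target_count:
        reps.append(x + sigma * torch.randn_like(x))
    if reps:
        x = torch.cat([x] + reps, dim=0)[:target_count]
        y = y[:1].repeat(x.shape[0])
    return x, y

if __name__ == "__main__":
    # Tiny synthetic demo with random data (shapes only; not the paper's setup).
    torch.manual_seed(0)
    logits = torch.randn(8, 3)                  # batch of 8, 3 classes
    targets = torch.randint(0, 3, (8,))
    losses = label_smoothed_ce(logits, targets, num_classes=3)
    clean = select_clean(losses, keep_ratio=0.5)
    print("likely-clean indices:", clean.tolist())

    minority_x = torch.randn(4, 50, 1)          # 4 series, length 50, 1 channel
    minority_y = torch.zeros(4, dtype=torch.long)
    bal_x, bal_y = oversample_minority(minority_x, minority_y, target_count=10)
    print("minority class oversampled to:", bal_x.shape[0])
```

In a pipeline of this kind, training would alternate between updating the representation on the small-loss subset and rebalancing the purified data before fitting the final classifier; the paper's actual selection and oversampling rules may differ.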
Future work can focus on designing robust time-series learning methods that handle more realistic noise and on introducing more advanced model architectures into RTS.
DOI: 10.1007/s11704-023-3200-z