SSA: Semantic Structure Aware Inference for Weakly Pixel-Wise Dense Predictions without Cost

Class activation mapping (CAM) was proposed to highlight the class-related activation regions of an image classification network: feature positions related to a specific object class are activated and receive higher scores, while other regions are suppressed and receive lower scores. In specific visual tasks, CAM can be used to infer object bounding boxes in weakly supervised object localization (WSOL) and to generate pseudo-masks for training images in weakly supervised semantic segmentation (WSSS). Obtaining high-quality CAMs is therefore important for improving the recognition performance of weakly supervised pixel-wise dense prediction tasks.
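
As a rough illustration of how such a map can be obtained, the sketch below follows the widely used CAM formulation, in which the classifier weights of one class are used to weight the channels of the last convolutional feature map. The function name, argument names, and array shapes are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Illustrative CAM for one class (names and shapes are assumptions).

    features:   (C, H, W) feature maps from the last convolutional layer
    fc_weights: (num_classes, C) weights of the linear classifier
    """
    # Weighted sum over channels, using the weights of the target class.
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)  # keep only positive evidence for the class
    peak = cam.max()
    # Scale to [0, 1] so that higher scores mark class-related positions.
    return cam / peak if peak > 0 else cam
```

Thresholding such a map is one common way to derive a bounding box (WSOL) or a pseudo-mask (WSSS) from it.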

To address this need, a research team led by Yanpeng SUN published their new research on 15 Feb 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

This work aims to design a simple yet efficient method to expand CAM. Rethinking the classification network, we assume that, to improve the probability of identifying objects, pixels belonging to the same category have similar representations in the feature map. To verify this assumption, as shown in Figure 1, we randomly select one pixel from the feature maps generated by different backbone stages and visualize its correlation with all other pixels. As the network deepens, the correlation between pixels of the same category on the feature map becomes stronger. These visualization results support our hypothesis, and we define the semantic correlation between pixels as semantic structure information. It is worth noting that we use feature points from various locations to compute and assess semantic correlations, which allows a more precise understanding and depiction of the semantic associations between objects. In the term semantic structure information, "structural information" refers to the description of relationships between objects. By analyzing and capturing these structural cues, we can better understand the semantic correlations and the general structure among the objects.
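
The visualization described above can be approximated with a few lines of code. The sketch below is a minimal example, assuming that the correlation is measured as cosine similarity between feature vectors; the text only speaks of "correlation", so the choice of measure, the function name, and the parameters are assumptions.

```python
import numpy as np

def pixel_correlation(features, row, col, eps=1e-8):
    """Cosine similarity between one pixel and every pixel of a (C, H, W) map.

    Cosine similarity is an assumed stand-in for the correlation
    mentioned in the text.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    # L2-normalize each pixel's feature vector.
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + eps)
    query = flat[:, row * w + col]        # feature vector of the selected pixel
    return (query @ flat).reshape(h, w)   # similarity map, values in [-1, 1]
```

Applying the function to feature maps from successive backbone stages and plotting the result as a heat map gives a picture comparable to the one described for Figure 1.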

In this research, the team proposes a semantic structure aware inference (SSA) model that leverages semantic structure information at different scales to generate high-quality CAMs and hence improve the recognition performance of downstream tasks. SSA is applied at model inference without any training cost. The overall network architecture is shown in Figure 2. Specifically, a seed CAM is first obtained from a standard image classification network. Then, the semantic structure modeling module (SSM) is deployed on different backbone stages to generate semantic relevance representations. After that, the resulting structured feature representations are used to polish the seed CAM via a dot product operation. Finally, the polished CAMs from the different backbone stages are fused into the final CAM. To the best of the authors' knowledge, this is the first work to improve the quality of CAM in the model inference step without additional parameters. Experimental results on both WSOL and WSSS demonstrate that SSA achieves new state-of-the-art performance.
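
The four steps above can be sketched roughly as follows. This is not the authors' implementation: the cosine affinity, the clipping of negative similarities, the row normalization, the nearest-neighbour resizing, and the averaging used for fusion are all assumptions chosen to keep the example short, and every function name is hypothetical.

```python
import numpy as np

def structure_affinity(features, eps=1e-8):
    """Pairwise pixel affinity for a (C, H, W) map (assumed: cosine similarity)."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + eps)
    affinity = np.maximum(flat.T @ flat, 0.0)           # (HW, HW), negatives clipped
    return affinity / (affinity.sum(axis=1, keepdims=True) + eps)  # rows sum to ~1

def resize_nearest(cam, h, w):
    """Nearest-neighbour resize of a 2-D map to (h, w)."""
    rows = np.arange(h) * cam.shape[0] // h
    cols = np.arange(w) * cam.shape[1] // w
    return cam[np.ix_(rows, cols)]

def polish_and_fuse(seed_cam, stage_features):
    """Polish a seed CAM with each stage's affinity, then fuse by averaging."""
    out_h, out_w = seed_cam.shape
    polished = []
    for features in stage_features:
        _, h, w = features.shape
        cam = resize_nearest(seed_cam, h, w).reshape(-1)
        # Dot product: each pixel gathers the scores of pixels similar to it.
        cam = (structure_affinity(features) @ cam).reshape(h, w)
        polished.append(resize_nearest(cam, out_h, out_w))
    fused = np.mean(polished, axis=0)                   # assumed fusion rule
    peak = fused.max()
    return fused / peak if peak > 0 else fused
```

Because the sketch only reuses feature maps that the classification network already produces, it adds no learnable parameters, which matches the "without cost" claim in spirit; the actual module design and fusion rule are described in the paper.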

Future work will prioritize enhancing the generalization ability of semantic structure information. This involves developing methods to refine and augment the representation of semantic structures within our model. By improving the model's capacity to generalize this crucial information, we aim to boost overall performance and robustness in diverse scenarios.

DOI: 10.1007/s11704-024-3571-9

https://journal.hep.com.cn/fcs/EN/10.1007/s11704-024-3571-9

Research Article, Published: 15 February 2025
Yanpeng SUN, Zechao LI. SSA: semantic structure aware inference on CNN networks for weakly pixel-wise dense predictions without cost. Front. Comput. Sci., 2025, 19(2): 192702, https://doi.org/10.1007/s11704-024-3571-9

Attached files

Fig.1 Visualizations of the semantic structure information in backbone stages. Pixels of the same class as the marked pixel are brightly colored. The brighter the color, the higher the similarity. Our motivation comes from this phenomenon.
Fig.2 The overall network architecture of the proposed semantic structure aware inference (SSA). Since SSA is only used in the inference CAM stage, it is suitable for all CNN-based models.

11/03/2025 Frontiers Journals

Regions: Asia, China, Extraterrestrial, Sun

Keywords: Applied science, Computing
