Transformer-CNN for small image object detection

Signal Processing: Image Communication, ISSN: 0923-5965, Vol: 129, Page: 117194
2024
  • Citations: 3
  • Usage: 0
  • Captures: 9
  • Mentions: 0
  • Social Media: 0

Article Description

Object recognition has been a popular research field in computer vision in recent years. Although detection of regular-sized objects has achieved impressive results, small object detection (SOD) remains challenging. On the Microsoft Common Objects in Context (MS COCO) public dataset, the detection rate for small objects is typically half that for regular-sized objects. The main reason is that repeated convolution and pooling layers strip small objects of the detail needed to distinguish them from the background or from similar objects, resulting in poor recognition rates or missed detections altogether. This paper presents a network architecture, Transformer-CNN, that combines a self-attention-based transformer with a convolutional neural network (CNN) to improve the recognition rate in SOD. It captures global information through the transformer and uses the translation invariance and translation equivariance of the CNN to retain as much global and local feature information as possible while improving the reliability and robustness of SOD. Our experiments show that the proposed model improves the small object recognition rate by 2 to 5% over general transformer architectures.
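
The loss of detail through repeated convolution and pooling can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes a backbone whose feature maps are downsampled by strides of 4, 8, 16, and 32 (common values, chosen here as an assumption), and it uses the COCO evaluation convention that a "small" object has an area below 32 × 32 pixels. The helper names `feature_footprint` and `footprint_table` are hypothetical.

```python
# Illustrative arithmetic only: how many feature-map cells a square object
# spans after a backbone has downsampled the image by a given stride.
# The stride values are assumed typical values, not figures from the paper.

ASSUMED_STRIDES = (4, 8, 16, 32)


def feature_footprint(object_side_px: float, stride: int) -> float:
    """Approximate side length of the object, measured in feature-map cells."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    return object_side_px / stride


def footprint_table(object_side_px: float, strides=ASSUMED_STRIDES) -> dict:
    """Map each stride to the object's approximate side length in cells."""
    return {s: feature_footprint(object_side_px, s) for s in strides}


if __name__ == "__main__":
    # 32 px is the upper bound on side length for a square COCO "small" object;
    # 96 px is the corresponding boundary between "medium" and "large".
    for side in (16, 32, 96):
        row = ", ".join(
            f"stride {s}: {cells:.2f} cells"
            for s, cells in footprint_table(side).items()
        )
        print(f"{side:>3} px object -> {row}")
```

Under these assumptions, a 32-pixel object covers about one cell at stride 32 and a 16-pixel object covers less than one, which is one way to see why the deepest feature maps carry too little information to separate small objects from the background.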