rtdetr_r101vd_coco_o365

Maintained by: PekingU

RT-DETR R101VD COCO-Objects365

Property         Value
Parameter Count  76.8M
License          Apache-2.0
Paper            arXiv:2304.08069
Performance      56.2% AP on COCO, 74 FPS on T4 GPU

What is rtdetr_r101vd_coco_o365?

RT-DETR R101VD is a real-time object detection model that pairs the end-to-end design of DETR (Detection Transformer) architectures with the inference speed typically associated with YOLO models. This variant uses a ResNet-101-vd backbone, was pre-trained on Objects365 and fine-tuned on COCO, and reaches 56.2% AP on COCO at 74 FPS on a T4 GPU.
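To put the throughput figure in per-frame terms, the short sketch below converts frames per second into a latency budget. The function name `latency_ms` and the 30 FPS camera rate are illustrative assumptions, and the calculation ignores batching, pre-processing, and data-transfer overhead.

```python
def latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Reported throughput for this model on a T4 GPU.
model_latency = latency_ms(74)

# Hypothetical 30 FPS camera: each frame must be handled within this budget.
camera_budget = latency_ms(30)

print(f"model: {model_latency:.1f} ms/frame, budget: {camera_budget:.1f} ms/frame")
print("fits budget:", model_latency <= camera_budget)
```

At 74 FPS the model spends roughly 13.5 ms per frame, which leaves headroom inside the 33.3 ms budget of a 30 FPS stream.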

Implementation Details

The model uses an efficient hybrid encoder that processes multi-scale features with two components: Attention-based Intra-scale Feature Interaction (AIFI) and CNN-based Cross-scale Feature Fusion (CCFF). An uncertainty-minimal query selection method then chooses the encoder features that serve as the initial object queries for the decoder.

  • Efficient hybrid encoder for multi-scale feature processing
  • Uncertainty-minimal query selection for improved accuracy
  • Flexible speed tuning through adjustable decoder layers
  • Pre-trained on Objects365 for enhanced performance
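The query selection step can be illustrated with a minimal sketch. It assumes that, at inference time, selection reduces to keeping the k encoder tokens with the highest classification score; the uncertainty term is assumed to act during training and is not modelled here. The function name `select_queries` and the toy logits are hypothetical, not taken from the RT-DETR code base.

```python
from typing import List, Sequence


def select_queries(class_logits: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k encoder tokens with the highest class score.

    class_logits[i] holds the per-class logits predicted for encoder token i.
    Each token is ranked by its best (maximum) class logit.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(
        range(len(class_logits)),
        key=lambda i: max(class_logits[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy example: four encoder tokens, two classes each.
toy_logits = [
    [0.1, 0.2],    # token 0: weak evidence
    [2.0, -1.0],   # token 1: strong evidence for class 0
    [0.5, 1.5],    # token 2: good evidence for class 1
    [-3.0, -2.0],  # token 3: background-like
]
selected = select_queries(toy_logits, k=2)
print(selected)  # indices of the tokens that would seed the decoder queries
```

The selected tokens would then be passed to the decoder as initial object queries, so better-scored encoder features give the decoder a better starting point.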

Core Capabilities

  • Real-time object detection at 74 FPS on T4 GPU
  • High accuracy with 56.2% AP on COCO dataset
  • Per-scale accuracy on COCO: 38.3% AP for small objects (AP-s), 60.5% for medium (AP-m), and 73.5% for large (AP-l)
  • Support for 80 COCO object categories
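A minimal inference sketch follows, assuming the checkpoint is published on the Hugging Face Hub as `PekingU/rtdetr_r101vd_coco_o365` (inferred from the model name and maintainer above) and that a recent version of the `transformers` library with RT-DETR support, `torch`, and `Pillow` are installed. The helper name `detect` and the 0.5 score threshold are illustrative choices.

```python
def detect(image_path, checkpoint="PekingU/rtdetr_r101vd_coco_o365", threshold=0.5):
    """Run object detection on one image and return (label, score, box) tuples.

    Boxes are [x_min, y_min, x_max, y_max] in pixels of the original image.
    """
    # Heavy dependencies are imported lazily so that defining the helper is cheap.
    import torch
    from PIL import Image
    from transformers import RTDetrForObjectDetection, RTDetrImageProcessor

    processor = RTDetrImageProcessor.from_pretrained(checkpoint)
    model = RTDetrForObjectDetection.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL reports (width, height).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, target_sizes=target_sizes, threshold=threshold
    )[0]

    return [
        (model.config.id2label[label.item()], score.item(), box.tolist())
        for score, label, box in zip(
            result["scores"], result["labels"], result["boxes"]
        )
    ]
```

Calling `detect("street.jpg")` on a local image (the file name is a placeholder) would download the weights on first use and return one tuple per detection above the threshold.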

Frequently Asked Questions

Q: What makes this model unique?

RT-DETR combines the end-to-end design of DETR with the real-time speed usually associated with YOLO models. Because each object query yields at most one prediction, the model needs no Non-Maximum Suppression (NMS) step, which removes a post-processing stage and its tuning parameters. The authors present it as the first real-time end-to-end object detector.
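What NMS-free post-processing means in practice can be sketched as follows. This is a simplified illustration, assuming the decoder emits per-query class logits and boxes in normalized (center x, center y, width, height) form; it keeps the best class per query and applies a score threshold, which is not necessarily the exact procedure of any particular implementation. The function name `postprocess` and the toy values are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def postprocess(logits, boxes, image_size, threshold=0.5):
    """Turn per-query predictions into detections without any NMS step.

    logits:     one list of class logits per query
    boxes:      one (cx, cy, w, h) tuple per query, normalized to [0, 1]
    image_size: (width, height) of the original image in pixels
    """
    width, height = image_size
    detections = []
    for query_logits, (cx, cy, w, h) in zip(logits, boxes):
        scores = [sigmoid(v) for v in query_logits]
        label = max(range(len(scores)), key=scores.__getitem__)
        score = scores[label]
        if score < threshold:
            continue  # low-confidence query: treated as "no object"
        # Convert center format to pixel corner coordinates.
        box = (
            (cx - w / 2) * width,
            (cy - h / 2) * height,
            (cx + w / 2) * width,
            (cy + h / 2) * height,
        )
        detections.append({"label": label, "score": score, "box": box})
    return detections


# Two toy queries on a 640x480 image: one confident, one background-like.
toy = postprocess(
    logits=[[-4.0, 3.0], [-5.0, -6.0]],
    boxes=[(0.5, 0.5, 0.2, 0.4), (0.1, 0.1, 0.1, 0.1)],
    image_size=(640, 480),
)
print(toy)
```

No pairwise overlap comparison between boxes appears anywhere: filtering is a single pass over the queries.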

Q: What are the recommended use cases?

The model suits real-time object detection applications that need both speed and accuracy, such as autonomous driving, surveillance systems, and industrial inspection. Because the number of decoder layers used at inference can be adjusted, speed can be traded against accuracy to match different hardware configurations.
