rtdetr_r101vd_coco_o365

Maintained By
PekingU

RT-DETR R101VD COCO-Objects365

Property          Value
Parameter Count   76.8M
License           Apache-2.0
Paper             arXiv:2304.08069
Performance       56.2% AP on COCO, 74 FPS on T4 GPU

What is rtdetr_r101vd_coco_o365?

RT-DETR R101VD is a real-time object detection model that combines the accuracy of DETR (Detection Transformer) architectures with the speed typically associated with YOLO models. This variant uses a ResNet-101-vd backbone, was pre-trained on Objects365 and fine-tuned on COCO, and reaches 56.2% AP on COCO while running at 74 FPS on a T4 GPU.
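To put the throughput figure in context, the short sketch below converts 74 FPS into an average per-frame time and compares it with the time budget of a video stream. It assumes frames are processed one at a time, so that latency is simply the inverse of throughput. The 30 FPS camera rate and the helper name `frame_latency_ms` are illustrative assumptions, not values from the model card.

```python
def frame_latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds, assuming one frame at a time."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


REPORTED_FPS = 74.0  # throughput quoted for a T4 GPU
CAMERA_FPS = 30.0    # assumed camera frame rate, for illustration only

latency = frame_latency_ms(REPORTED_FPS)
budget = frame_latency_ms(CAMERA_FPS)

print(f"per-frame latency: {latency:.1f} ms")        # 13.5 ms
print(f"budget at {CAMERA_FPS:.0f} FPS: {budget:.1f} ms")  # 33.3 ms
print(f"headroom: {budget - latency:.1f} ms")        # 19.8 ms
```

Under these assumptions the detector uses roughly 13.5 ms of a 33.3 ms frame budget, leaving the rest for decoding, tracking, or other downstream work.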

Implementation Details

The model uses an efficient hybrid encoder that processes multi-scale features with two components: Attention-based Intra-scale Feature Interaction (AIFI) and CNN-based Cross-scale Feature Fusion (CCFF). An uncertainty-minimal query selection method then chooses high-quality encoder features as the initial object queries for the decoder.

  • Efficient hybrid encoder for multi-scale feature processing
  • Uncertainty-minimal query selection for improved accuracy
  • Flexible speed tuning through adjustable decoder layers
  • Pre-trained on Objects365 for enhanced performance
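The query selection step can be pictured as choosing a fixed number of encoder features to seed the decoder. The sketch below is a simplified illustration in plain Python, not the authors' implementation: it assumes each encoder feature comes with per-class scores and ranks features by their highest class score. The function name `select_initial_queries` and the toy scores are hypothetical, and the training objective that makes those scores trustworthy is not shown.

```python
from typing import List, Sequence


def select_initial_queries(class_scores: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k features with the highest max class score."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Confidence of a feature = its best score over all classes.
    confidence = [max(row) for row in class_scores]
    # Sort feature indices by confidence, highest first (ties keep input order).
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    return ranked[:k]


# Toy example: 4 encoder features, 2 classes (made-up numbers).
scores = [
    [0.10, 0.20],
    [0.90, 0.05],
    [0.30, 0.60],
    [0.05, 0.02],
]
print(select_initial_queries(scores, k=2))  # [1, 2]
```

Features 1 and 2 are kept because their best class scores (0.90 and 0.60) are the two highest; the decoder would then refine only those selected features.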

Core Capabilities

  • Real-time object detection at 74 FPS on T4 GPU
  • High accuracy with 56.2% AP on COCO dataset
  • Per-scale COCO accuracy: AP-s 38.3% (small objects), AP-m 60.5% (medium), AP-l 73.5% (large)
  • Support for 80 COCO object categories
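A minimal inference sketch using the Hugging Face transformers library follows. It assumes the checkpoint is available on the Hub as `PekingU/rtdetr_r101vd_coco_o365`, and that a transformers release with RT-DETR support, PyTorch, and Pillow are installed. The helper name `detect` and the 0.3 score threshold are illustrative choices, not recommendations from the model authors.

```python
MODEL_ID = "PekingU/rtdetr_r101vd_coco_o365"  # assumed Hub repository id


def detect(image_path: str, threshold: float = 0.3):
    """Run the detector on one image; return (label, score, [x0, y0, x1, y1]) tuples."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import RTDetrForObjectDetection, RTDetrImageProcessor

    image = Image.open(image_path).convert("RGB")
    processor = RTDetrImageProcessor.from_pretrained(MODEL_ID)
    model = RTDetrForObjectDetection.from_pretrained(MODEL_ID)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes holds (height, width) so boxes come back in pixel coordinates.
    result = processor.post_process_object_detection(
        outputs,
        target_sizes=torch.tensor([(image.height, image.width)]),
        threshold=threshold,
    )[0]

    detections = []
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        label = model.config.id2label[label_id.item()]
        detections.append((label, round(score.item(), 3), [round(v, 1) for v in box.tolist()]))
    return detections


# Example call (downloads the weights on first use; "street.jpg" is a placeholder path):
# for label, score, box in detect("street.jpg"):
#     print(f"{label}: {score:.2f} {box}")
```

Loading the processor and model inside the function keeps the sketch short; a real application would load them once and reuse them across frames.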

Frequently Asked Questions

Q: What makes this model unique?

This model bridges the gap between DETR and YOLO architectures: it offers both high accuracy and real-time speed without Non-Maximum Suppression (NMS) post-processing. The RT-DETR paper presents it as the first real-time end-to-end object detector.
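To illustrate what NMS-free post-processing looks like, the sketch below keeps every query whose score clears a threshold and converts its box to pixel corners, with no overlap-based suppression pass. It is a simplified stand-in rather than the library's actual post-processing: it assumes one score per query and boxes in normalized (center x, center y, width, height) format, as is common for DETR-style detectors. All names and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_xyxy(box: Box, width: int, height: int) -> Box:
    """Convert a normalized center-format box to pixel corner coordinates."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    )


def filter_queries(
    scores: Sequence[float],
    boxes: Sequence[Box],
    width: int,
    height: int,
    threshold: float = 0.5,
) -> List[Tuple[float, Box]]:
    """Keep queries scoring at or above threshold; no NMS step is applied."""
    kept = []
    for score, box in zip(scores, boxes):
        if score >= threshold:
            kept.append((score, cxcywh_to_xyxy(box, width, height)))
    return kept


# Toy example: three decoder queries on a 640x480 image (made-up values).
scores = [0.92, 0.08, 0.71]
boxes = [(0.5, 0.5, 0.2, 0.4), (0.1, 0.1, 0.05, 0.05), (0.25, 0.75, 0.1, 0.1)]
for score, box in filter_queries(scores, boxes, width=640, height=480):
    print(f"{score:.2f}", [round(v, 1) for v in box])
```

Only the first and third queries survive the threshold. Because each object is expected to be claimed by a single query, the output needs no duplicate removal, which is the step NMS performs in many other detectors.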

Q: What are the recommended use cases?

The model suits real-time object detection applications that need both speed and accuracy, such as autonomous driving, surveillance systems, and industrial inspection. Because its speed can be tuned by adjusting the number of decoder layers, it can be adapted to different hardware configurations.
