detr-resnet-50

Maintained by: facebook

DETR ResNet-50

Property | Value
Parameter Count | 41.6M
License | Apache 2.0
Paper | End-to-End Object Detection with Transformers
Training Data | COCO 2017 (118k images)
Performance | 42.0 AP on COCO validation

What is detr-resnet-50?

DETR-ResNet-50 is an object detection model that pairs a ResNet-50 backbone with an encoder-decoder transformer and is trained end to end. Developed by Facebook Research, it treats detection as a direct set prediction problem, which removes hand-crafted components such as anchor boxes and non-maximum suppression from the pipeline.

Implementation Details

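The model combines a CNN backbone with an encoder-decoder transformer. The decoder takes 100 learned object queries; each query yields one prediction, either a class label with a bounding box or a "no object" label, so the model can report up to 100 objects per image. Training uses a bipartite matching loss: the Hungarian algorithm finds the one-to-one assignment between queries and ground-truth annotations with the lowest total cost, and the loss is then computed against that assignment.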

  • ResNet-50 backbone for feature extraction
  • Transformer encoder-decoder architecture
  • Linear layer for class prediction
  • MLP for bounding box regression
  • Bipartite matching loss function
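
The bipartite matching step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the cost used here is the negative probability of the true class plus an L1 box distance, whereas the actual DETR matching cost also includes a weighted generalized IoU term. The names `matching_cost`, `hungarian`, and the toy predictions and targets are hypothetical.

```python
def l1_distance(box_a, box_b):
    """L1 distance between two (cx, cy, w, h) boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def matching_cost(predictions, targets):
    """Cost matrix with one row per target and one column per query.

    Simplified cost (an assumption of this sketch): -p(true class) + L1 box distance.
    """
    return [
        [-pred["probs"][t["label"]] + l1_distance(pred["box"], t["box"])
         for pred in predictions]
        for t in targets
    ]


def hungarian(cost):
    """Minimum-cost assignment of every row to a distinct column.

    Hungarian algorithm with potentials, O(rows^2 * cols).
    Requires len(cost) <= len(cost[0]), i.e. targets <= queries.
    Returns a sorted list of (row, column) pairs.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (m + 1)   # column potentials (1-based)
    p = [0] * (m + 1)     # p[j] = row currently assigned to column j (0 = free)
    way = [0] * (m + 1)   # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the assignments along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


# Toy example: 3 queries, 2 real classes (0 and 1), index 2 = "no object".
NO_OBJECT = 2
predictions = [
    {"probs": [0.10, 0.80, 0.10], "box": (0.7, 0.7, 0.2, 0.2)},
    {"probs": [0.90, 0.05, 0.05], "box": (0.3, 0.3, 0.2, 0.2)},
    {"probs": [0.05, 0.05, 0.90], "box": (0.5, 0.5, 0.1, 0.1)},
]
targets = [
    {"label": 0, "box": (0.3, 0.3, 0.2, 0.2)},
    {"label": 1, "box": (0.7, 0.7, 0.2, 0.2)},
]

matches = hungarian(matching_cost(predictions, targets))  # (target, query) pairs

# Queries left unmatched are trained to predict "no object".
query_labels = [NO_OBJECT] * len(predictions)
for target_idx, query_idx in matches:
    query_labels[query_idx] = targets[target_idx]["label"]
```

In this toy case target 0 is assigned to query 1 and target 1 to query 0, while query 2 receives the "no object" label. Because the assignment is one-to-one, two queries are never rewarded for predicting the same ground-truth object, which is what lets the model do without non-maximum suppression.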

Core Capabilities

  • Object detection with 42.0 AP on the COCO 2017 validation set
  • Processing of images with multiple objects
  • End-to-end training capability
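  • Handling of a variable number of objects per image, with unused queries predicting "no object"
  • Direct set prediction with no non-maximum suppression step; detections are obtained by thresholding per-query confidence scores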
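
Even without non-maximum suppression, the raw per-query outputs still have to be decoded into detections. The sketch below shows one plausible way to do that, assuming each query produces class logits whose last entry is the "no object" class and a box in normalized (cx, cy, w, h) format. The function names, the toy values, and the 0.7 threshold are illustrative assumptions, not part of any library API.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cxcywh_to_xyxy(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


def decode(logits_per_query, boxes_per_query, img_w, img_h, threshold=0.7):
    """Keep queries whose best real-class probability exceeds the threshold."""
    detections = []
    for logits, box in zip(logits_per_query, boxes_per_query):
        probs = softmax(logits)[:-1]  # drop the trailing "no object" class
        score = max(probs)
        if score > threshold:
            detections.append({
                "label": probs.index(score),
                "score": score,
                "box": cxcywh_to_xyxy(box, img_w, img_h),
            })
    return detections


# Toy outputs for 3 queries: 2 real classes plus "no object" as the last logit.
logits_per_query = [
    [4.0, 0.0, 0.0],   # confident class 0
    [0.0, 0.0, 5.0],   # confident "no object"
    [0.5, 0.6, 0.4],   # uncertain, falls below the threshold
]
boxes_per_query = [
    (0.5, 0.5, 0.2, 0.4),
    (0.2, 0.2, 0.1, 0.1),
    (0.8, 0.8, 0.3, 0.3),
]

detections = decode(logits_per_query, boxes_per_query, img_w=640, img_h=480)
```

Only the first query survives here. The threshold trades precision against recall, so it is worth tuning for the application rather than treating 0.7 as a recommended value.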

Frequently Asked Questions

Q: What makes this model unique?

DETR frames object detection as end-to-end set prediction with a transformer, which removes hand-crafted components such as anchor boxes and non-maximum suppression while keeping competitive accuracy (42.0 AP on COCO validation). It is trained on COCO 2017 and predicts up to 100 objects per image in a single forward pass, one per object query.

Q: What are the recommended use cases?

The model suits general object detection tasks, particularly those covered by the COCO categories it was trained on. It fits pipelines that benefit from a simple architecture without anchor tuning or non-maximum suppression, and it supports batched inference over multiple images.
