yolos-small

Maintained By
hustvl

YOLOS-Small Object Detection Model

| Property | Value |
|----------|-------|
| Parameter Count | 30.7M |
| License | Apache 2.0 |
| Paper | View Paper |
| Performance | 36.1 AP on COCO |

What is yolos-small?

YOLOS-small (You Only Look at One Sequence) is a compact Vision Transformer (ViT) model designed specifically for object detection. Developed by hustvl, it represents a simplified approach to transformer-based detection, achieving strong results while maintaining a relatively small parameter count of 30.7M.

Implementation Details

The model employs a bipartite matching loss and processes images through a plain transformer architecture. It predicts 100 object queries in parallel and, during training, uses the Hungarian matching algorithm to assign predictions to ground-truth objects. The model was pre-trained on ImageNet-1k for 200 epochs and fine-tuned on COCO 2017 for 150 epochs.
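The bipartite matching step can be illustrated with a small, self-contained sketch. For readability this brute-forces all assignments over three queries; the real training loop uses the Hungarian algorithm (e.g. `scipy.optimize.linear_sum_assignment`) to handle 100 queries efficiently. The cost values below are made up for illustration.

```python
from itertools import permutations

def match_queries(cost):
    """Find the assignment of object queries to ground-truth objects
    that minimizes total matching cost. Brute force over permutations;
    the actual model uses the Hungarian algorithm for this."""
    n_queries = len(cost)
    n_targets = len(cost[0])
    best, best_cost = None, float("inf")
    for perm in permutations(range(n_queries), n_targets):
        total = sum(cost[q][t] for t, q in enumerate(perm))
        if total < best_cost:
            best, best_cost = perm, total
    # best[t] = index of the query matched to ground-truth object t;
    # unmatched queries are trained to predict "no object"
    return best, best_cost

# Hypothetical cost matrix: rows = queries, columns = ground-truth objects.
# Each entry would combine classification and box losses, DETR-style.
cost = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.6],
]
assignment, total = match_queries(cost)
print(assignment, total)  # queries 1 and 0 matched to objects 0 and 1
```

Query 2 is left unassigned, so in training it would be supervised toward the "no object" class.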

  • Utilizes PyTorch framework for implementation
  • Stores weights in F32 (float32) precision
  • Implements DETR-style loss function
  • Combines L1 and generalized IoU loss for bounding boxes
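The box regression term named above combines an L1 distance with a generalized IoU (GIoU) penalty. A minimal pure-Python sketch, with boxes as `(x0, y0, x1, y1)`; the 5:2 weighting used here is an illustrative DETR-style default, not necessarily this checkpoint's exact coefficients:

```python
def l1_loss(pred, target):
    """Mean absolute difference over the 4 box coordinates."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / 4.0

def giou_loss(pred, target):
    """1 - GIoU for axis-aligned boxes given as (x0, y0, x1, y1)."""
    def area(b):
        return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    # Intersection and union of the two boxes
    ix0, iy0 = max(pred[0], target[0]), max(pred[1], target[1])
    ix1, iy1 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = area((ix0, iy0, ix1, iy1))
    union = area(pred) + area(target) - inter
    iou = inter / union if union > 0 else 0.0
    # Smallest box enclosing both, used by the GIoU correction term
    ex0, ey0 = min(pred[0], target[0]), min(pred[1], target[1])
    ex1, ey1 = max(pred[2], target[2]), max(pred[3], target[3])
    enclose = area((ex0, ey0, ex1, ey1))
    giou = iou - (enclose - union) / enclose if enclose > 0 else iou
    return 1.0 - giou

def box_loss(pred, target, w_l1=5.0, w_giou=2.0):
    # Weighted sum of the two terms; weights here are illustrative.
    return w_l1 * l1_loss(pred, target) + w_giou * giou_loss(pred, target)

print(box_loss((0, 0, 1, 1), (0, 0, 1, 1)))       # perfect match -> 0.0
print(box_loss((0.5, 0, 1.5, 1), (0, 0, 1, 1)))   # offset box is penalized
```

Unlike plain IoU, GIoU still produces a gradient signal when the boxes do not overlap at all, which is why DETR-style losses prefer it.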

Core Capabilities

  • Object detection with competitive accuracy (36.1 AP on COCO validation)
  • Parallel processing of 100 object queries
  • Efficient feature extraction from images
  • Bounding box prediction with confidence scores
  • Classification over the COCO label set
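At inference time, each query emits class logits plus a "no object" score, and detections are kept by softmax-thresholding. The sketch below uses a made-up two-class toy label set and hypothetical logits; the real model scores COCO's full label set per query.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def postprocess(query_logits, query_boxes, threshold=0.9):
    """Keep queries whose best non-background class clears the threshold.
    The last logit in each row is the 'no object' (background) score."""
    detections = []
    for logits, box in zip(query_logits, query_boxes):
        probs = softmax(logits)
        cls_probs = probs[:-1]  # drop the background class
        best = max(range(len(cls_probs)), key=cls_probs.__getitem__)
        if cls_probs[best] >= threshold:
            detections.append((best, cls_probs[best], box))
    return detections

# Three hypothetical queries over a 2-class toy label set + background.
query_logits = [
    [8.0, 0.0, 0.0],   # confident in class 0 -> kept
    [0.0, 0.0, 6.0],   # confident background -> suppressed
    [1.0, 1.2, 0.9],   # uncertain -> below threshold
]
query_boxes = [(10, 10, 50, 60), (0, 0, 5, 5), (20, 20, 40, 40)]
for cls, score, box in postprocess(query_logits, query_boxes):
    print(cls, round(score, 3), box)
```

Only the first query survives: the second is claimed by the background class and the third never clears the confidence threshold.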

Frequently Asked Questions

Q: What makes this model unique?

YOLOS-small stands out for its simplicity and efficiency: it reaches 36.1 AP on COCO validation with a nearly unmodified, pure transformer architecture, without the region proposals and hand-crafted components of detectors such as Faster R-CNN.

Q: What are the recommended use cases?

The model is well suited to general object detection over the COCO label set of everyday object categories. It is especially appropriate for applications that need a good balance between model size and detection accuracy.
