YOLOS-tiny

Property	Value
Parameter Count	6.49M
License	Apache 2.0
Paper	You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection (Fang et al., 2021)
Performance	28.7 AP on COCO 2017 validation
Framework	PyTorch

What is YOLOS-tiny?

YOLOS-tiny is a compact Vision Transformer (ViT) model for object detection. Developed by hustvl, it is the lightweight variant of the YOLOS architecture, reaching 28.7 AP on COCO with only 6.49M parameters. Rather than building on a CNN backbone as traditional detectors do, it applies a Transformer directly to the sequence of image patches.

Implementation Details

The model is trained with a bipartite matching loss. Each of its 100 object queries predicts a class and a bounding box, and the Hungarian matching algorithm finds the one-to-one assignment between queries and ground-truth objects that minimizes the total matching cost; the loss is then computed on the matched pairs. The model was pre-trained on ImageNet-1k for 300 epochs and then fine-tuned on COCO 2017 for another 300 epochs.

Transformer-based architecture optimized for vision tasks
Bipartite matching loss with Hungarian algorithm
Pre-trained on ImageNet-1k and fine-tuned on COCO
Weights stored as F32 (32-bit floating-point) tensors
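
To make the matching step concrete, here is a minimal sketch of bipartite matching between predictions and ground-truth objects. It is not the model's training code: the cost function is a simplified assumption (negative class probability plus L1 box distance), the helper names `match_cost` and `bipartite_match` are hypothetical, and the optimum is found by brute force over permutations rather than by the Hungarian algorithm. For tiny inputs both give the same assignment; real implementations typically use a dedicated solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def match_cost(pred, target):
    """Simplified matching cost (an assumption for illustration):
    low when the predicted probability of the true class is high
    and the predicted box is close to the ground-truth box."""
    class_cost = -pred["probs"][target["label"]]
    box_cost = sum(abs(p - t) for p, t in zip(pred["box"], target["box"]))
    return class_cost + box_cost


def bipartite_match(preds, targets):
    """Return {target index: prediction index} with minimal total cost.

    Brute force over all one-to-one assignments; only suitable for toy sizes.
    """
    best_cost, best_assignment = float("inf"), ()
    for chosen in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[q], targets[t]) for t, q in enumerate(chosen))
        if cost < best_cost:
            best_cost, best_assignment = cost, chosen
    return dict(enumerate(best_assignment))


# Three queries (YOLOS-tiny uses 100) and two ground-truth objects.
# Boxes are normalized (center_x, center_y, width, height).
preds = [
    {"probs": [0.1, 0.9], "box": [0.50, 0.50, 0.20, 0.20]},
    {"probs": [0.8, 0.2], "box": [0.20, 0.30, 0.10, 0.10]},
    {"probs": [0.5, 0.5], "box": [0.90, 0.90, 0.05, 0.05]},
]
targets = [
    {"label": 0, "box": [0.22, 0.31, 0.10, 0.12]},
    {"label": 1, "box": [0.48, 0.52, 0.20, 0.18]},
]

matching = bipartite_match(preds, targets)
print(matching)  # {0: 1, 1: 0}
```

Query 2 stays unmatched here; in this family of detectors, unmatched queries are trained to predict a "no object" class.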

Core Capabilities

Object detection with 28.7 AP on COCO 2017 validation
Compact model size of 6.49M parameters
Detection of multiple objects in a single image
Integration with the Hugging Face Transformers library
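
As an illustration of the Transformers integration, the sketch below wraps inference in a single function. It assumes `transformers`, `torch`, and `Pillow` are installed and that the checkpoint is available under the `hustvl/yolos-tiny` identifier; the function name `detect_objects` and the default threshold of 0.9 are illustrative choices, not part of the library.

```python
def detect_objects(image_path, threshold=0.9):
    """Run YOLOS-tiny on one image and return a list of (label, score, box) tuples.

    Boxes are [x_min, y_min, x_max, y_max] in pixels of the original image.
    """
    # Imported lazily so this module loads even without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import YolosForObjectDetection, YolosImageProcessor

    processor = YolosImageProcessor.from_pretrained("hustvl/yolos-tiny")
    model = YolosForObjectDetection.from_pretrained("hustvl/yolos-tiny")

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]

    detections = []
    for score, label, box in zip(result["scores"], result["labels"], result["boxes"]):
        name = model.config.id2label[label.item()]
        detections.append((name, score.item(), box.tolist()))
    return detections


# Example call (the image path is hypothetical):
# for name, score, box in detect_objects("example.jpg"):
#     print(f"{name}: {score:.2f} at {box}")
```

Lowering `threshold` returns more, less confident detections; raising it keeps only the predictions the model is most sure about.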

Frequently Asked Questions

Q: What makes this model unique?

YOLOS-tiny stands out as a very small Vision Transformer for object detection: it reaches 28.7 AP on COCO with only 6.49M parameters. It demonstrates that Transformer-based detectors can be scaled down while retaining useful detection capability.

Q: What are the recommended use cases?

The model suits applications that need lightweight object detection where computational resources are limited, such as mobile applications and latency-sensitive detection, and scenarios where efficient deployment matters more than maximum accuracy. Whether it meets a real-time budget depends on the target hardware and input resolution.
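
For a rough sense of the footprint on a constrained device, the weight size can be estimated from the parameter count. The sketch below covers weights only; activations, framework overhead, and input size are not included. The `float16` row is a hypothetical comparison, since the card lists only F32 weights.

```python
# Bytes per parameter for two common floating-point formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_size_mb(num_params, dtype="float32"):
    """Approximate size of the weights alone, in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


YOLOS_TINY_PARAMS = 6_490_000  # 6.49M, as listed in the property table

fp32_mb = weight_size_mb(YOLOS_TINY_PARAMS, "float32")
fp16_mb = weight_size_mb(YOLOS_TINY_PARAMS, "float16")  # hypothetical half-precision copy
print(f"float32: {fp32_mb:.2f} MB, float16: {fp16_mb:.2f} MB")
# float32: 25.96 MB, float16: 12.98 MB
```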
