bagel-dpo-34b-v0.2

Property	Value
Parameter Count	34.4B
Model Type	Language Model (LLaMA Architecture)
License	Yi License
Tensor Type	BF16

What is bagel-dpo-34b-v0.2?

bagel-dpo-34b-v0.2 is an experimental fine-tuned version of the Yi-34B-200K model, developed using the bagel training framework. This model stands out for its implementation of Direct Preference Optimization (DPO) and its training on an extensive collection of 29 diverse datasets. The model is designed to be less restricted in its responses while maintaining high-quality output across various tasks.

Implementation Details

The model employs a unique multi-format prompting system that incorporates four different prompt styles: Vicuna, LLaMA-2, Alpaca, and ChatML. Each instruction is processed through all four formats during training, effectively quadrupling the exposure to training data. The model uses BF16 precision and has been trained with careful attention to decontamination using approximate nearest neighbor search.

Trained on 29 carefully curated datasets spanning multiple domains
Implements four different prompt formats for enhanced versatility
Uses Direct Preference Optimization for improved output quality
Incorporates both standard and toxic DPO datasets for reduced censorship

Core Capabilities

Advanced coding abilities through multiple programming datasets
Strong mathematical reasoning from specialized math instruction sets
Multi-lingual comprehension capabilities
Creative writing and roleplay abilities
SQL and database query handling
Uncensored response capability when appropriately prompted

Frequently Asked Questions

Q: What makes this model unique?

This model's uniqueness stems from its comprehensive training approach using multiple prompt formats and its integration of both standard and specialized DPO datasets. It offers reduced censorship compared to similar models while maintaining high-quality outputs across various tasks.

Q: What are the recommended use cases?

The model is suited for a wide range of applications including coding, mathematical problem-solving, creative writing, and general conversation. It's particularly useful in scenarios requiring detailed, unrestricted responses while maintaining output quality.

bagel-dpo-34b-v0.2

bagel-dpo-34b-v0.2

What is bagel-dpo-34b-v0.2?

Implementation Details

Core Capabilities

Frequently Asked Questions

Q: What makes this model unique?

Q: What are the recommended use cases?

Related Models