TowerInstruct-7B-v0.1

Maintained By
Unbabel

  • Parameter Count: 6.74B
  • License: CC-BY-NC-4.0
  • Paper: arXiv:2402.17733
  • Supported Languages: 10 (EN, DE, FR, ZH, PT, NL, RU, KO, IT, ES)
  • Base Model: TowerBase

What is TowerInstruct-7B-v0.1?

TowerInstruct-7B-v0.1 is a specialized language model developed by Unbabel in collaboration with Instituto Superior Técnico and CentraleSupélec (Université Paris-Saclay). It is designed specifically for translation-related tasks across ten languages, combining machine translation with adjacent language tasks in a single instruction-tuned model.

Implementation Details

The model is built on TowerBase and fine-tuned with the TowerBlocks supervised fine-tuning dataset. It uses the ChatML prompt template and stores weights in F32. Training used a cosine learning rate scheduler with the following key hyperparameters:

  • Total training batch size: 256
  • Learning rate: 7e-06
  • Maximum sequence length: 2048
  • Training epochs: 4
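Because the model was fine-tuned with the ChatML template, prompts at inference time should wrap each turn in `<|im_start|>`/`<|im_end|>` markers and leave the assistant turn open for the model to complete. A minimal sketch of building such a prompt for a sentence-level translation request (the helper function and the example sentence are illustrative, not part of the model card):

```python
def chatml_prompt(user_message: str) -> str:
    """Wrap a single user turn in the ChatML template and open the
    assistant turn so the model generates the response after it."""
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Example: an EN->PT translation request, one of the model's core tasks.
prompt = chatml_prompt(
    "Translate the following text from English into Portuguese.\n"
    "English: Hello, world!\n"
    "Portuguese:"
)
print(prompt)
```

In practice the same string is produced automatically by a tokenizer's chat-template support (e.g. `tokenizer.apply_chat_template` in Hugging Face transformers) when the model's chat template is configured; building it by hand as above just makes the format explicit.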

Core Capabilities

  • General machine translation (sentence and paragraph-level)
  • Terminology-aware translation
  • Context-aware translation
  • Automatic post-edition
  • Named-entity recognition
  • Grammatical error correction
  • Paraphrase generation

Frequently Asked Questions

Q: What makes this model unique?

TowerInstruct-7B-v0.1 stands out for its specialized focus on translation-related tasks across 10 major languages, combining various translation capabilities with additional language understanding tasks like named-entity recognition and grammatical error correction.

Q: What are the recommended use cases?

The model is best suited for translation tasks, automatic post-editing, and related language transformations. Note, however, that it is not intended for use as a conversational chatbot or code assistant, even though some conversational and code data appeared in its training mix.
