Sabia-7B
| Property | Value |
|---|---|
| Parameter Count | 6.74B |
| Model Type | Text Generation |
| Architecture | LLaMA-based |
| Paper | Sabiá: Portuguese Large Language Models |
| License | Same as LLaMA-1 (research-only) |
| Context Length | 2048 tokens |
What is Sabia-7B?
Sabia-7B is a Portuguese-specialized language model developed by Maritaca AI on top of the LLaMA-1-7B architecture. Starting from the LLaMA-1-7B weights, it was further pretrained on the Portuguese subset of ClueWeb22 (a corpus of roughly 7 billion tokens) for about 10 billion additional training tokens, i.e., approximately 1.4 epochs over that corpus.
Implementation Details
The model reuses the LLaMA-1-7B architecture and tokenizer unchanged; only the weights were adapted for Portuguese understanding and generation. It uses BF16 precision, supports a maximum sequence length of 2048 tokens, and has a training-data cutoff of mid-2022. A minimal loading sketch follows the list below.
- Pretrained on ClueWeb22 Portuguese subset
- Approximately 1.4 epochs of additional training
- Maintains LLaMA's architecture while specializing in Portuguese
- Supports few-shot learning tasks
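As a minimal sketch of loading and querying the model with Hugging Face Transformers, assuming the checkpoint is published on the Hub as `maritaca-ai/sabia-7b` (verify the identifier before use):

```python
# Minimal loading sketch; "maritaca-ai/sabia-7b" is an assumed Hub identifier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "maritaca-ai/sabia-7b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # the model uses BF16 precision
    device_map="auto",           # requires accelerate; or use .to("cuda")
)

prompt = "A capital do Brasil é"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus generated tokens within the 2048-token context window.
output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```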
Core Capabilities
- ENEM Challenge: 55.07% accuracy
- Portuguese hate speech detection (PT Hate Speech Binary): 64.13% F1-score
- Sentiment analysis (tweetSentBR): 46.64% F1-score
- Natural language inference (FaQuAD NLI): 58.34% F1-score; see the F1 illustration below
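For context, the F1-scores above combine precision and recall as their harmonic mean. A small illustration of the binary case, with invented labels rather than benchmark data:

```python
# Illustration of binary F1 computation; labels are invented, not from any benchmark.
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]  # gold labels (e.g., 1 = hate speech)
y_pred = [1, 0, 0, 1, 0, 1]  # labels parsed from model generations

p = precision_score(y_true, y_pred)  # 3/3 = 1.00
r = recall_score(y_true, y_pred)     # 3/4 = 0.75
print(f1_score(y_true, y_pred))      # 2*p*r/(p+r) ≈ 0.857
```

Multi-class tasks such as tweetSentBR typically report a macro-averaged variant of this score.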
Frequently Asked Questions
Q: What makes this model unique?
Sabia-7B is optimized specifically for Portuguese-language tasks while maintaining competitive performance in English. It achieves 48.5 NPM (Normalized Preferred Metric) on the Poeta benchmark, a suite of Portuguese evaluation tasks, surpassing both LLaMA-1 and LLaMA-2.
Q: What are the recommended use cases?
The model is designed for few-shot prompting rather than zero-shot use. It is particularly effective for Portuguese text generation, classification, and natural language understanding in academic and research contexts (note the research-only license). A few-shot prompt sketch follows.
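Continuing from the loading sketch above (reusing `model` and `tokenizer`), a hypothetical few-shot prompt for Portuguese sentiment classification might look like this; the example sentences are invented:

```python
# Few-shot sentiment prompt in Portuguese; reuses `model` and `tokenizer`
# from the loading sketch above. All example sentences are invented.
few_shot_prompt = """Classifique o sentimento da frase como Positivo ou Negativo.

Frase: Adorei o filme, recomendo a todos!
Sentimento: Positivo

Frase: O atendimento foi péssimo e demorado.
Sentimento: Negativo

Frase: O produto chegou antes do prazo e funciona perfeitamente.
Sentimento:"""

inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=3, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())  # expected: "Positivo"
```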