roberta-large-finnish

Maintained by Finnish-NLP

Model Type: RoBERTa Large
Training Data: 78GB of Finnish text
Tokenizer: Byte-level BPE (50,265-token vocabulary)
Developer: Finnish-NLP

What is roberta-large-finnish?

roberta-large-finnish is a Finnish language model based on the RoBERTa large architecture, trained on a diverse corpus of Finnish text including news archives, Wikipedia, and web crawl data. The model is pretrained with a masked language modeling (MLM) objective and is intended primarily as a base model for fine-tuning on downstream tasks.
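The snippet below is a minimal sketch of querying the MLM head through the Hugging Face transformers fill-mask pipeline; it assumes the checkpoint is published on the Hub as Finnish-NLP/roberta-large-finnish, and the example sentence is purely illustrative.

```python
from transformers import pipeline

# Hub id assumed to be "Finnish-NLP/roberta-large-finnish".
unmasker = pipeline("fill-mask", model="Finnish-NLP/roberta-large-finnish")

# RoBERTa-style tokenizers use "<mask>" as the mask token.
for prediction in unmasker("Helsinki on Suomen <mask>."):
    print(prediction["token_str"], round(prediction["score"], 4))
```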

Implementation Details

The model was trained on a TPUv3-8 VM using the Adafactor optimizer. Training consisted of 2 epochs at a sequence length of 128, followed by one additional epoch at a sequence length of 512. Like other RoBERTa models, it uses dynamic masking during pretraining: the masked positions are re-sampled every time a sequence is fed to the model, rather than fixed once during preprocessing as in the original BERT.

  • Trained on a combined 78GB of cleaned Finnish text data
  • Uses byte-level BPE tokenization with a 50,265-token vocabulary
  • Masks 15% of tokens with varied replacement strategies (sketched below)
  • Achieves 88.02% average accuracy on downstream tasks
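As a rough sketch of the masking setup described above, the transformers DataCollatorForLanguageModeling reproduces the same 15% dynamic-masking scheme, re-sampling the masked positions on every batch with the usual mask/random/keep replacement mix; the Hub id and example sentence are assumptions.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("Finnish-NLP/roberta-large-finnish")  # assumed Hub id

# 15% of tokens are selected for masking; the selection is re-sampled for
# every batch, which is what makes the masking "dynamic" compared to BERT's
# static, preprocessed masking.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

examples = [tokenizer("Tämä on esimerkkilause suomeksi.") for _ in range(4)]
batch = collator(examples)
print(batch["input_ids"].shape, batch["labels"].shape)
```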

Core Capabilities

  • Masked language modeling for Finnish text
  • Sequence classification tasks (see the fine-tuning sketch below)
  • Token classification
  • Question answering
  • Feature extraction for downstream tasks
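For example, using the checkpoint as a starting point for sequence classification might look like the sketch below; the Hub id, label count, and example text are assumptions, and the classification head is randomly initialized until the model is fine-tuned.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Finnish-NLP/roberta-large-finnish"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The MLM head is discarded and a fresh classification head is initialized,
# so the model needs fine-tuning (e.g. with the Trainer API) before its
# predictions are meaningful.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

inputs = tokenizer("Tämä elokuva oli erinomainen!", return_tensors="pt")
print(model(**inputs).logits.shape)  # torch.Size([1, 2])
```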

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically optimized for Finnish and trained on a comprehensive corpus of Finnish text. It improves on earlier Finnish RoBERTa models and approaches the performance of FinBERT while following the RoBERTa pretraining recipe rather than the original BERT one.

Q: What are the recommended use cases?

The model is primarily intended for tasks that use whole-sentence context, such as sequence classification, token classification, and question answering. It is not suited to text generation; for that, an autoregressive model such as GPT-2 is more appropriate.
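As an illustration of using whole-sentence context for feature extraction rather than generation, the sketch below mean-pools the final hidden states into one 1024-dimensional vector per sentence; the Hub id and sentences are assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "Finnish-NLP/roberta-large-finnish"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["Helsinki on Suomen pääkaupunki.", "Ulkona sataa lunta."]
inputs = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, 1024)

# Mean-pool over non-padding tokens to get one embedding per sentence.
mask = inputs["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 1024])
```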
