PLaMo-13B

Maintained By: pfnet

Parameter Count: 13.1B
License: Apache v2.0
Context Length: 4096 tokens
Training Tokens: 1.5T (1.32T English, 0.18T Japanese)
Paper: Research Paper

What is PLaMo-13B?

PLaMo-13B is a bilingual language model developed by Preferred Networks, Inc. and built on the LLaMA architecture. It is designed to handle both English and Japanese effectively, making it well suited to applications that need strong coverage of both languages.

Implementation Details

The model uses a causal decoder-only architecture and a custom SentencePiece tokenizer trained on a subset of the pre-training datasets. It is implemented with the Hugging Face Transformers library and supports both CPU and GPU inference; a loading sketch follows the list below.

  • Trained on diverse datasets including C4, Project Gutenberg, RedPajama, and Japanese Wikipedia
  • Uses BF16 tensor type for efficient computation
  • Supports a context window of 4096 tokens
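Here is a minimal loading sketch, assuming the weights are published on Hugging Face under the repo id pfnet/plamo-13b and that the model's custom code requires trust_remote_code=True; check the official model card for the exact identifier and loading options.

```python
import torch
import transformers

# Assumed Hugging Face repo id; verify against the official model card.
model_id = "pfnet/plamo-13b"

# PLaMo-13B ships custom modelling code, so trust_remote_code=True is needed.
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # BF16 tensors, as noted above
    device_map="auto",           # places weights on GPU if available, otherwise CPU
)
```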

Core Capabilities

  • Bilingual text generation in English and Japanese
  • High-quality language understanding and generation
  • Flexible integration through Hugging Face Transformers
  • Support for various text generation parameters (temperature, top-k, top-p)
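As a rough sketch of how those sampling parameters are passed through the Transformers API (again assuming the pfnet/plamo-13b repo id), a text-generation pipeline can wire temperature, top-k, and top-p into decoding:

```python
import transformers

# Assumed repo id; adjust if the published identifier differs.
generator = transformers.pipeline(
    "text-generation",
    model="pfnet/plamo-13b",
    trust_remote_code=True,
    device_map="auto",
)

# do_sample enables stochastic decoding; temperature sharpens or flattens the
# token distribution, while top_k and top_p bound the candidate pool.
result = generator(
    "これからの人工知能技術は",  # Japanese prompt: "The AI technology of the future is"
    max_new_tokens=64,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    top_p=0.95,
)
print(result[0]["generated_text"])
```

Lower temperature and tighter top-k/top-p values make the output more deterministic; higher values increase diversity.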

Frequently Asked Questions

Q: What makes this model unique?

PLaMo-13B stands out for its bilingual focus: it was trained on both English (1.32T tokens) and Japanese (0.18T tokens) data, which makes it particularly effective for applications that require both languages.

Q: What are the recommended use cases?

The model is well-suited for text generation tasks in both English and Japanese, including content creation, translation assistance, and general language understanding applications. However, safety testing is recommended before deployment in production environments.
