PLaMo-13B
| Property | Value |
|---|---|
| Parameter Count | 13.1B |
| License | Apache License 2.0 |
| Context Length | 4096 tokens |
| Training Tokens | 1.5T (1.32T English, 0.18T Japanese) |
| Paper | Research Paper |
What is PLaMo-13B?
PLaMo-13B is a bilingual language model developed by Preferred Networks, Inc., built on the LLaMA architecture. It is designed to handle both English and Japanese effectively, making it well suited to applications that require both languages.
Implementation Details
The model uses a causal decoder-only architecture and a custom SentencePiece tokenizer trained on a subset of the pre-training datasets. It is implemented with the Hugging Face Transformers library and supports both CPU and GPU inference (a loading sketch follows the list below).
- Trained on diverse datasets including C4, Project Gutenberg, RedPajama, and Japanese Wikipedia
- Uses BF16 tensor type for efficient computation
- Supports a context window of 4096 tokens
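The following is a minimal loading sketch, not an official snippet from this page: it assumes the checkpoint is published on Hugging Face under a repository id such as `pfnet/plamo-13b`, and that the custom architecture requires `trust_remote_code=True`; the `torch_dtype` and `device_map` settings are likewise illustrative choices.

```python
# Loading sketch -- repository id, trust_remote_code, and dtype/device settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pfnet/plamo-13b"  # hypothetical Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type noted above
    device_map="auto",           # place weights on GPU if available, otherwise CPU
    trust_remote_code=True,      # custom architectures often ship their own modeling code
)
model.eval()
```

Loading in BF16 keeps the weights at roughly 2 bytes per parameter (about 26 GB for 13.1B parameters), and `device_map="auto"` is one common way to spread that across available GPUs or fall back to CPU.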
Core Capabilities
- Bilingual text generation in English and Japanese
- High-quality language understanding and generation
- Flexible integration through Hugging Face Transformers
- Support for standard text generation parameters (temperature, top-k, top-p), as sketched below
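As a sketch of how those sampling parameters are typically passed, reusing the `tokenizer` and `model` objects from the loading example above; the prompt text and parameter values here are illustrative, not recommended settings:

```python
# Sampling sketch -- prompt and parameter values are illustrative only.
prompt = "これからの人工知能技術は"  # Japanese: "The future of AI technology is ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,    # sampling must be enabled for temperature/top-k/top-p to take effect
    temperature=0.8,   # <1.0 sharpens the token distribution, >1.0 flattens it
    top_k=50,          # restrict each step to the 50 most likely tokens
    top_p=0.95,        # nucleus sampling: smallest token set with cumulative prob >= 0.95
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```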
Frequently Asked Questions
Q: What makes this model unique?
PLaMo-13B stands out for its bilingual training: it was pre-trained on both English (1.32T tokens) and Japanese (0.18T tokens) data, making it particularly effective for applications that require both languages.
Q: What are the recommended use cases?
The model is well-suited for text generation tasks in both English and Japanese, including content creation, translation assistance, and general language understanding applications. However, safety testing is recommended before deployment in production environments.