FLAN-T5-Large Grammar Synthesis
Property | Value |
---|---|
Parameter Count | 783M |
License | Apache 2.0 |
Paper | Research Paper |
Tensor Type | F32 |
What is flan-t5-large-grammar-synthesis?
This model is a fine-tuned version of google/flan-t5-large specialized for grammar correction. Trained on an expanded version of the JFLEG dataset, it performs single-shot grammar correction while preserving the original semantic meaning of text that is already well formed.
Implementation Details
The model uses the T5 architecture with FLAN instruction tuning and has 783M parameters. Recommended generation settings are beam search with 8 beams, a repetition penalty of 1.21, and a length penalty of 1.0, which together produce high-quality corrections; a usage sketch follows the list below.
- Supports batch inference for processing multiple sentences
- Available in ONNX format for optimized runtime performance
- Includes safetensors and GGUF format support
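A minimal inference sketch, assuming the model is published on the Hugging Face Hub under a repo id like pszemraj/flan-t5-large-grammar-synthesis (the exact path may differ) and loaded through the transformers text2text-generation pipeline with the generation settings noted above:

```python
from transformers import pipeline

# Assumed hub repo id; substitute the actual model path if it differs.
corrector = pipeline(
    "text2text-generation",
    model="pszemraj/flan-t5-large-grammar-synthesis",
)

raw = "i can has cheezburger"  # illustrative error-prone input
result = corrector(
    raw,
    max_length=128,
    num_beams=8,              # beam search with 8 beams
    repetition_penalty=1.21,  # discourages repeated tokens
    length_penalty=1.0,       # neutral length preference
    early_stopping=True,
)
print(result[0]["generated_text"])
```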
Core Capabilities
- Grammar error correction without altering correct content
- Handling of heavily error-prone text from various sources
- Support for audio transcription cleanup
- Correction of LLM-generated content
- Fixing "tortured-phrases" in AI-generated text
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to correct severe grammatical errors while preserving the semantic meaning of already correct text sets it apart. It's particularly effective for cleaning up ASR outputs and LLM-generated content.
Q: What are the recommended use cases?
Primary use cases include correcting ASR transcriptions, improving LLM outputs, fixing OCR results, and enhancing chatbot responses. It is particularly valuable for batch processing of text with multiple grammatical issues, as in the sketch below.
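A batch-processing sketch under the same assumptions (the hub repo id and example sentences are illustrative, not taken from the model card), padding several sentences into one batch and correcting them in a single generate call:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "pszemraj/flan-t5-large-grammar-synthesis"  # assumed hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

sentences = [
    "she dont know weather to go or to stayed home.",
    "the quick brown fox jumps over the lazy dog.",  # already correct; should pass through unchanged
    "there results suggests the experiment were a success",
]

# Tokenize all sentences as one padded batch and generate corrections in a single pass.
inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=128,
        num_beams=8,
        repetition_penalty=1.21,
        length_penalty=1.0,
    )

for before, after in zip(sentences, tokenizer.batch_decode(output_ids, skip_special_tokens=True)):
    print(f"{before!r} -> {after!r}")
```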