TAPAS Base Model Fine-tuned on WikiTable Questions
Property | Value |
---|---|
License | Apache 2.0 |
Developer | Google |
Primary Paper | TAPAS: Weakly Supervised Table Parsing via Pre-training |
Dev Accuracy | 46.38% |
What is tapas-base-finetuned-wtq?
TAPAS base-finetuned-wtq is a BERT-like transformer model from Google for table question answering. It was pre-trained on Wikipedia data with masked language modeling and then fine-tuned in a chain on SQA, WikiSQL, and finally WikiTable Questions (WTQ), which makes it suited to answering natural language questions about tabular data.
Implementation Details
The model uses relative position embeddings, meaning the position index is reset at every table cell. It was fine-tuned on 32 Cloud TPU v3 cores for 50,000 steps with a maximum sequence length of 512 and a batch size of 512. Fine-tuning used the Adam optimizer with a learning rate of 1.93581e-5 and a warmup ratio of 0.128960.
- Pre-trained on Wikipedia data using masked language modeling
- Features intermediate pre-training for numerical reasoning
- Implements cell selection and aggregation heads for table parsing
- Uses WordPiece tokenization with a vocabulary size of 30,000
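To make the learning rate and warmup ratio quoted above concrete, here is a minimal sketch of a warmup schedule. Only the peak rate, the warmup ratio, and the step count come from this page; the linear warmup and the linear decay to zero afterwards are assumptions borrowed from common BERT-style setups, and the function names are hypothetical.

```python
LEARNING_RATE = 1.93581e-5  # peak learning rate quoted on this page
WARMUP_RATIO = 0.128960     # fraction of fine-tuning spent warming up
TOTAL_STEPS = 50_000        # fine-tuning steps quoted on this page


def warmup_steps(total=TOTAL_STEPS, ratio=WARMUP_RATIO):
    """Number of steps over which the learning rate ramps up."""
    return round(total * ratio)


def learning_rate_at(step, peak=LEARNING_RATE, total=TOTAL_STEPS, ratio=WARMUP_RATIO):
    """Linear warmup to `peak`, then linear decay to zero.

    The decay shape is an assumption; this page only gives the warmup ratio.
    """
    ramp = warmup_steps(total, ratio)
    if step < ramp:
        return peak * step / ramp
    remaining = max(total - step, 0)
    return peak * remaining / (total - ramp)
```

Under these assumptions the warmup phase covers the first 6,448 of the 50,000 steps, after which the rate falls back toward zero.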
Core Capabilities
- Table question answering with 46.38% dev accuracy on the WTQ dataset
- Processes both tabular data and natural language queries
- Supports numerical reasoning over selected cells through its aggregation head
- Uses relative position embeddings for improved table understanding
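The cell selection and aggregation heads mentioned above produce two things: a set of selected cells and an aggregation operator. A rough sketch of how those two outputs could be combined into a final answer follows. It assumes the four operators NONE, SUM, AVERAGE, and COUNT for the WTQ checkpoint, and `apply_aggregation` is a hypothetical helper, not part of any library.

```python
# Assumed operator set for the WTQ checkpoint (index -> label).
AGGREGATION_LABELS = {0: "NONE", 1: "SUM", 2: "AVERAGE", 3: "COUNT"}


def apply_aggregation(operator, cells):
    """Combine selected cell strings according to the predicted operator.

    `cells` holds the text of the cells picked by the cell selection head.
    """
    if operator == "NONE":
        return list(cells)   # the answer is the selected cells themselves
    if operator == "COUNT":
        return len(cells)    # counting does not need numeric cells
    # SUM and AVERAGE need numbers; strip thousands separators first.
    values = [float(cell.replace(",", "")) for cell in cells]
    if operator == "SUM":
        return sum(values)
    if operator == "AVERAGE":
        return sum(values) / len(values)
    raise ValueError(f"unknown aggregation operator: {operator}")
```

For example, `apply_aggregation("SUM", ["36542", "4512", "3934"])` adds the three selected cells, while `"NONE"` simply returns them unchanged.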
Frequently Asked Questions
Q: What makes this model unique?
The model combines masked language modeling with an intermediate pre-training step aimed at numerical reasoning over tables. It also uses relative position embeddings: the position index is reset at each table cell, so a token's position is counted within its cell rather than across the whole flattened table.
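The position reset described above can be illustrated with a small sketch. It assumes each token carries an identifier for the cell it belongs to (question tokens share one identifier); the function name and the identifier scheme are hypothetical and only meant to show the idea.

```python
def cell_relative_positions(cell_ids):
    """Return one position index per token, restarting at 0 in every new cell.

    `cell_ids` has one entry per token; consecutive tokens with the same
    entry belong to the same cell (or to the question).
    """
    positions = []
    previous = object()  # sentinel that never equals a real cell id
    offset = 0
    for cell_id in cell_ids:
        if cell_id != previous:
            offset = 0       # a new cell starts: reset the counter
            previous = cell_id
        positions.append(offset)
        offset += 1
    return positions


# Three question tokens, then cells (0, 0), (0, 1) and (1, 0).
example = ["q", "q", "q", (0, 0), (0, 0), (0, 1), (1, 0), (1, 0), (1, 0)]
print(cell_relative_positions(example))  # [0, 1, 2, 0, 1, 0, 0, 1, 2]
```

With absolute positions the last token would sit at index 8; with the reset it sits at index 2 of its own cell.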
Q: What are the recommended use cases?
This model suits applications that need natural language querying of tabular data, such as database interfaces, data analysis tools, and information retrieval systems, where users ask questions about tabulated information.
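As a starting point for such applications, the sketch below queries a small table through the Hugging Face `transformers` table question answering pipeline. It assumes `transformers`, `pandas`, and a backend such as PyTorch are installed and that the checkpoint is published as `google/tapas-base-finetuned-wtq`; `stringify_table`, `ask`, and `demo` are hypothetical helper names, and the table contents are made up for illustration.

```python
def stringify_table(columns):
    """Convert every cell to text, since the TAPAS tokenizer expects string cells."""
    return {name: [str(value) for value in values] for name, values in columns.items()}


def ask(columns, question, model_id="google/tapas-base-finetuned-wtq"):
    """Run one question against one table and return the pipeline's result dict."""
    # Imported here so the helper above works without transformers installed.
    from transformers import pipeline

    tqa = pipeline("table-question-answering", model=model_id)
    return tqa(table=stringify_table(columns), query=question)


def demo():
    """Illustrative call; downloads the checkpoint on first use."""
    table = {
        "City": ["Paris", "Lyon", "Nice"],
        "Museums": [130, 25, 20],  # made-up numbers for the example
    }
    result = ask(table, "How many museums are there in total?")
    # The result includes the selected cells and the predicted aggregation operator.
    print(result["answer"], result["cells"], result["aggregator"])
```

Calling `demo()` runs the example end to end. Building the pipeline once and reusing it is preferable when many questions are asked, since loading the checkpoint dominates the cost of a single query.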