bert-uncased-keyword-extractor

Maintained by: yanekyuk

Property         | Value
---------------- | ---------------------------------------------
License          | Apache 2.0
Framework        | PyTorch
Base Model       | BERT-base-uncased
Training Metrics | F1: 0.8684, Precision: 0.8547, Recall: 0.8825

What is bert-uncased-keyword-extractor?

This is a specialized token classification model built on the BERT-base-uncased architecture and designed for keyword extraction. It achieves an F1-score of 0.8684 on its evaluation set, making it effective for automated keyword identification in English text.

Implementation Details

The model was trained with the Adam optimizer, a linear learning-rate scheduler, and native AMP mixed-precision training. Training ran for 8 epochs with a learning rate of 2e-05 and a batch size of 16 for both training and evaluation.
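For reference, here is a minimal sketch of these hyperparameters expressed as Hugging Face TrainingArguments. This assumes a standard Trainer setup and is not the author's original training script; the training dataset is not specified on this card.

```python
from transformers import TrainingArguments

# Hypothetical configuration mirroring the hyperparameters reported above.
# The card reports Adam with a linear learning-rate scheduler and native AMP
# mixed precision; the Trainer's default optimizer is AdamW.
training_args = TrainingArguments(
    output_dir="bert-uncased-keyword-extractor",
    learning_rate=2e-5,              # reported learning rate
    per_device_train_batch_size=16,  # reported training batch size
    per_device_eval_batch_size=16,   # reported evaluation batch size
    num_train_epochs=8,              # reported number of epochs
    lr_scheduler_type="linear",      # linear learning-rate scheduler
    fp16=True,                       # native AMP mixed-precision training
)
```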

  • Achieved 97.41% accuracy on the evaluation set
  • Implements token classification architecture
  • Utilizes transformers framework version 4.19.2
  • Compatible with PyTorch 1.11.0+cu113
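Loading the model for token classification follows the usual transformers pattern. The Hub id below is assumed from the maintainer and model name shown above.

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Assumed Hugging Face Hub id (maintainer/model-name).
model_id = "yanekyuk/bert-uncased-keyword-extractor"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
```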

Core Capabilities

  • Automated keyword extraction from English text
  • High-precision token classification
  • Efficient processing with mixed precision support
  • Robust performance on various text types
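As a quick usage sketch, keyword extraction can be run through the token-classification pipeline. The model id is assumed as above, and aggregation_strategy="simple" is one way to merge subword tokens back into whole keywords.

```python
from transformers import pipeline

# Assumed model id; aggregation_strategy="simple" groups consecutive subword
# tokens predicted as part of the same keyword span.
extractor = pipeline(
    "token-classification",
    model="yanekyuk/bert-uncased-keyword-extractor",
    aggregation_strategy="simple",
)

text = (
    "The European Central Bank raised interest rates again on Thursday, "
    "citing persistent inflation across the euro area."
)
keywords = [entity["word"] for entity in extractor(text)]
print(keywords)  # a list of extracted keyword phrases
```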

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its high F1-score of 0.8684 and exceptional accuracy of 97.41% in keyword extraction tasks. It combines the power of BERT's contextual understanding with specialized training for keyword identification.

Q: What are the recommended use cases?

This model is ideal for applications requiring automated keyword extraction from English text, such as content tagging, document indexing, and automated metadata generation. It's particularly suitable for processing business documents, news articles, and technical content.
