miCSE: Mutual Information Contrastive Sentence Embedding
| Property | Value |
|---|---|
| Parameter Count | 109M |
| License | Apache 2.0 |
| Paper | arXiv:2211.04928 |
| Benchmark Score | 78.13% (STS Average) |
What is miCSE?
miCSE is a sentence embedding model that applies mutual information-based contrastive learning to produce high-quality sentence representations, and it performs particularly well in few-shot settings. During contrastive training it aligns the attention patterns of different dropout-augmented views of the same sentence, which makes it data-efficient when training examples are scarce.
Implementation Details
The model is a 109M-parameter transformer encoder that uses attention mutual information (AMI) to enforce syntactic consistency across dropout-augmented views during training. At inference it maps input text to vector embeddings that capture semantic meaning, with the sentence representation taken from the [CLS] token embedding (a minimal usage sketch follows the list below).
- Trained on English Wikipedia sentences
- Supports variable-length inputs up to a configurable maximum token length
- Implements cosine similarity for sentence comparison
- Optimized for both full-shot and few-shot scenarios
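The following is a minimal sketch of this inference path using the Hugging Face transformers library: encode two sentences, take the [CLS] token embedding as the sentence representation, and compare them with cosine similarity. The model ID `sap-ai-research/miCSE` and the max length of 128 are assumptions; adjust them to match the actual checkpoint and configuration.

```python
# Minimal sketch: encode sentences with miCSE and compare them by cosine
# similarity. The model ID "sap-ai-research/miCSE" is an assumption; point
# it at wherever the checkpoint is actually hosted.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "sap-ai-research/miCSE"  # assumed model ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

sentences = [
    "A group of kids is playing in a yard.",
    "Children are playing outside.",
]

# Tokenize with padding/truncation up to an assumed maximum token length.
batch = tokenizer(sentences, padding=True, truncation=True, max_length=128,
                  return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# Sentence representation = [CLS] token embedding (first position).
embeddings = outputs.last_hidden_state[:, 0, :]

# Cosine similarity between the two sentence embeddings.
similarity = torch.nn.functional.cosine_similarity(
    embeddings[0:1], embeddings[1:2]
).item()
print(f"cosine similarity: {similarity:.4f}")
```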
Core Capabilities
- Sentence similarity computation
- Text retrieval tasks
- Semantic clustering
- Few-shot learning applications
- Integration with SentenceTransformers framework
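A sketch of the SentenceTransformers integration is shown below, wrapping the checkpoint in a Transformer module plus a [CLS] pooling head and running a small semantic search. The model ID, max sequence length, and explicit CLS pooling are assumptions made for illustration, not a confirmed configuration.

```python
# Sketch of using miCSE through the SentenceTransformers API. The model ID
# and the CLS pooling choice below are assumptions chosen to match the
# [CLS]-based sentence representation described above.
from sentence_transformers import SentenceTransformer, models, util

word_embedding = models.Transformer("sap-ai-research/miCSE", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(),
                         pooling_mode="cls")
model = SentenceTransformer(modules=[word_embedding, pooling])

queries = ["How do I reset my password?"]
corpus = [
    "Steps to change your account password.",
    "Weather forecast for the weekend.",
]

query_emb = model.encode(queries, convert_to_tensor=True)
corpus_emb = model.encode(corpus, convert_to_tensor=True)

# Retrieve the most similar corpus sentence for each query.
hits = util.semantic_search(query_emb, corpus_emb, top_k=1)
print(hits)
```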
Frequently Asked Questions
Q: What makes this model unique?
miCSE's distinctive feature is its effectiveness in low-resource scenarios, which comes from its mutual information-based approach to contrastive learning. By enforcing structural (attention-level) consistency across dropout-augmented views of the same sentence, it can be trained effectively with only a small amount of data.
Q: What are the recommended use cases?
The model is ideal for applications requiring semantic text similarity, including document retrieval, sentence clustering, and semantic search. It's particularly valuable in scenarios where training data is limited, making it suitable for specialized domain applications.
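To illustrate the clustering use case, here is a small sketch that groups semantically similar sentences with k-means over miCSE embeddings. The model ID, the two-cluster choice, and the toy sentences are all assumptions; note also that loading a plain checkpoint this way gives SentenceTransformers' default pooling, which may differ from the [CLS] pooling described earlier.

```python
# Illustrative clustering sketch: group similar sentences using miCSE
# embeddings and k-means. Model ID and cluster count are assumptions for
# this toy example; default pooling may differ from [CLS] pooling.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("sap-ai-research/miCSE")  # assumed model ID

sentences = [
    "The invoice was paid on Friday.",
    "Payment for the bill went through last week.",
    "The hiking trail closes at sunset.",
    "Trails in the park are closed after dark.",
]

embeddings = model.encode(sentences)  # one embedding vector per sentence
labels = KMeans(n_clusters=2, n_init=10).fit_predict(embeddings)

for label, sentence in sorted(zip(labels, sentences)):
    print(label, sentence)
```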