Brief-details: Korean-optimized reranker model (560M params) fine-tuned from BGE-reranker-large, designed to improve RAG performance by scoring query-passage relevance for Korean text.
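A minimal sketch of the cross-encoder reranking pattern BGE-style rerankers follow; the checkpoint ID is a placeholder, since the entry does not name one.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "org/ko-reranker"  # placeholder; substitute the actual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Score (query, passage) pairs; higher logits mean higher relevance.
pairs = [
    ["대한민국의 수도는 어디인가?", "서울은 대한민국의 수도이다."],  # relevant passage
    ["대한민국의 수도는 어디인가?", "파리는 프랑스의 수도이다."],    # irrelevant passage
]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt")
    scores = model(**inputs).logits.squeeze(-1)
print(scores)  # sort retrieved passages by these scores before generation
```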
Brief-details: A high-quality dense prediction model for depth estimation, part of the Lotus visual foundation model family. Apache 2.0 licensed with 15K+ downloads.
Brief-details: TF-ID-large-no-caption is an 823M parameter model for detecting tables and figures in academic papers, achieving 97.32% accuracy without caption extraction.
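TF-ID models are fine-tuned from Florence-2, so detection presumably follows the Florence-2 remote-code interface; the checkpoint ID and image path below are assumptions, not taken from the entry.

```python
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "yifeihu/TF-ID-large-no-caption"  # assumed from the entry's model name
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("paper_page.png").convert("RGB")
inputs = processor(text="<OD>", images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3,
)
text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
parsed = processor.post_process_generation(
    text, task="<OD>", image_size=(image.width, image.height)
)
print(parsed)  # bounding boxes labeled "table" or "figure"
```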
Brief-details: Vision Transformer (ViT) model with 88.3M parameters, pre-trained on ImageNet-21k and fine-tuned on ImageNet-1k for image classification at 384x384 resolution.
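A minimal sketch using the transformers image-classification pipeline; the checkpoint is assumed to be google/vit-base-patch16-384, which matches this description but is not named in the entry.

```python
from transformers import pipeline

classifier = pipeline("image-classification", model="google/vit-base-patch16-384")
preds = classifier("cat.jpg", top_k=3)  # local path or URL to an image
for p in preds:
    print(f"{p['label']}: {p['score']:.3f}")
```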
Brief-details: A compact 504M parameter multimodal model optimized for CPU inference via GGUF format, capable of processing both text and images with efficient quantization options.
Brief-details: A 27B parameter Gemma-based model with various GGUF quantization options from full F16 (54GB) down to IQ2_XXS (7.63GB), optimized for different hardware configurations and RAM constraints.
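A minimal llama-cpp-python sketch for running one of these quantized files on CPU; the filename is a placeholder for whichever quant fits your RAM.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-27b-it-IQ2_XXS.gguf",  # placeholder; use the quant you downloaded
    n_ctx=4096,      # context window
    n_gpu_layers=0,  # 0 = pure CPU; raise to offload transformer layers to a GPU
)
out = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```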
Brief-details: A 3.09B parameter GGUF-formatted instruction model, quantized for efficient deployment with multiple precision options from 2-bit to 8-bit.
Brief-details: Vision Transformer model using SigLIP (sigmoid loss for language-image pre-training) for zero-shot image classification, trained on the WebLI dataset, offering robust image-text understanding.
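A minimal zero-shot classification sketch via the transformers pipeline; google/siglip-base-patch16-224 is an assumed checkpoint, since the entry names none.

```python
from transformers import pipeline

clf = pipeline("zero-shot-image-classification", model="google/siglip-base-patch16-224")
preds = clf("photo.jpg", candidate_labels=["a cat", "a dog", "a car"])
# SigLIP scores each label with an independent sigmoid, so scores need not sum to 1.
print(preds)
```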
Brief-details: Educational web-content classifier (109M params) trained on 450k Llama3-annotated samples; scores pages from 0 to 5 for educational value, reaching an 82% F1 score.
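The description matches HuggingFaceFW/fineweb-edu-classifier; treating that ID as an assumption, scoring a page looks roughly like this.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "HuggingFaceFW/fineweb-edu-classifier"  # assumed from the description
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Photosynthesis converts light energy into chemical energy..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    score = model(**inputs).logits.squeeze(-1).item()
print(round(score))  # 0-5 educational-value score
```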
Brief-details: Chinese GPT2 model trained on the CLUECorpusSmall dataset, optimized for Chinese text generation and available in multiple sizes from distil to xlarge.
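The description matches UER's uer/gpt2-chinese-cluecorpussmall, which pairs GPT2LMHeadModel with BertTokenizer; treating that ID as an assumption:

```python
from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-cluecorpussmall")
model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-cluecorpussmall")
generator = TextGenerationPipeline(model, tokenizer)
# Prompt means "This happened a long time ago"; sampling continues the story.
print(generator("这是很久之前的事情了", max_length=100, do_sample=True))
```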
Brief-details: OWL-ViT is a zero-shot text-conditioned object detection model that pairs a CLIP backbone with a ViT architecture, enabling open-vocabulary object detection through text queries.
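A minimal text-conditioned detection sketch; google/owlvit-base-patch32 is the canonical checkpoint and is assumed here, and the image path is illustrative.

```python
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("street.jpg").convert("RGB")
texts = [["a photo of a bicycle", "a photo of a traffic light"]]
inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits to boxes in pixel coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.1, target_sizes=target_sizes
)
for score, label, box in zip(results[0]["scores"], results[0]["labels"], results[0]["boxes"]):
    print(texts[0][label], round(score.item(), 2), [round(c) for c in box.tolist()])
```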
Brief-details: A compact 800M parameter vision-language model optimized for OCR and document processing, featuring state-of-the-art text recognition capabilities despite its small size.
Brief-details: LaMini-Flan-T5-783M is a fine-tuned text-to-text model with 783M parameters, optimized for instruction-following tasks and trained on 2.58M instruction samples.
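A minimal instruction-following sketch; the entry names the model, and the MBZUAI org prefix is an assumption.

```python
from transformers import pipeline

generator = pipeline("text2text-generation", model="MBZUAI/LaMini-Flan-T5-783M")
out = generator("Write a short email inviting a colleague to a design review.",
                max_length=256)
print(out[0]["generated_text"])
```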
Brief-details: BART-based dialogue summarization model (406M params) fine-tuned on the SAMSum dataset, achieving a 54.39 ROUGE-1 score for conversation summaries.
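A minimal dialogue-summarization sketch via the transformers pipeline; the checkpoint ID is a placeholder, since the entry names none.

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="org/bart-samsum")  # placeholder checkpoint
dialogue = """Anna: Are we still on for lunch?
Ben: Yes, 12:30 at the usual place.
Anna: Perfect, see you there."""
print(summarizer(dialogue)[0]["summary_text"])
```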
Brief-details: OCR-free document understanding transformer fine-tuned on the CORD dataset, combining a Swin Transformer vision encoder with a BART text decoder for document parsing tasks.
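The description matches the Donut family; assuming the naver-clova-ix/donut-base-finetuned-cord-v2 checkpoint, receipt parsing looks roughly like this.

```python
import re
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

model_id = "naver-clova-ix/donut-base-finetuned-cord-v2"  # assumed from the description
processor = DonutProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

image = Image.open("receipt.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
# Donut is prompted with a task-specific start token instead of OCR text.
decoder_input_ids = processor.tokenizer(
    "<s_cord-v2>", add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
    pad_token_id=processor.tokenizer.pad_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
)
sequence = processor.batch_decode(outputs)[0]
sequence = sequence.replace(processor.tokenizer.eos_token, "").replace(
    processor.tokenizer.pad_token, ""
)
sequence = re.sub(r"<.*?>", "", sequence, count=1)  # drop the task start token
print(processor.token2json(sequence))  # structured fields as JSON
```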
Brief-details: A T5-based paraphrasing model trained on ChatGPT-generated data, capable of producing diverse, high-quality paraphrases and reporting state-of-the-art results.
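A minimal sketch generating several paraphrase candidates at once; the checkpoint ID and the "paraphrase:" input prefix are both assumptions.

```python
from transformers import pipeline

paraphraser = pipeline("text2text-generation", model="org/t5-paraphraser")  # placeholder
outs = paraphraser(
    "paraphrase: The meeting was postponed until next week.",  # prefix is an assumption
    num_beams=5,
    num_return_sequences=3,  # return several distinct candidates
    max_length=64,
)
for o in outs:
    print(o["generated_text"])
```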
Brief-details: A 70B parameter uncensored LLaMA 3.1 variant distributed in GGUF format, with multiple quantization options spanning 26.5GB to 75.1GB file sizes.
Brief-details: 8B parameter LLaMA3-based model with multiple GGUF quantizations for efficient deployment, optimized for text generation in English.
Brief-details: Fine-tuned Stable Diffusion model for creating image variations from CLIP image embeddings. Supports the Diffusers pipeline and was trained on the LAION aesthetics dataset; 15K+ downloads.
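The description matches lambdalabs/sd-image-variations-diffusers; assuming that checkpoint (and its documented v2.0 revision), generating variations looks roughly like this.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImageVariationPipeline

# Assumed checkpoint and revision, inferred from the description above.
pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers", revision="v2.0"
)
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

init_image = Image.open("input.jpg").convert("RGB")
variations = pipe(init_image, guidance_scale=3.0, num_images_per_prompt=2).images
for i, img in enumerate(variations):
    img.save(f"variation_{i}.png")
```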
Brief-details: LLaMA2-based model fine-tuned for ESCI (Exact, Substitute, Complement, Irrelevant) query-product relevance ranking, with 15K+ downloads.
Brief-details: Microsoft's 13B parameter LLM focused on reasoning capabilities, built on LLaMA-2. Research-oriented model excelling at single-turn responses and complex tasks.