Brief-details: Meta's Llama-4 variant with 17B active parameters, instruction-tuned, using FP8 quantization and a 128-expert mixture-of-experts architecture.
Brief-details: SkyReels-A2 is a video diffusion transformer that composes video content using dual-branch encoding for spatial and semantic features, supporting high-resolution video generation.
Brief-details: Meta's Llama-4-Scout-17B-16E is a mixture-of-experts language model with 17B active parameters and 16 experts, part of the Llama 4 family, focused on enhanced dialogue and task-completion capabilities.
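For orientation, here is a minimal sketch of text-only chat with a Llama 4 checkpoint via transformers; the gated repo id, the prompt, and the availability of the Llama 4 classes depend on your transformers version, so treat all of them as assumptions rather than a tested recipe.

```python
# Minimal sketch, assuming a recent transformers release with Llama 4 support
# and access to the gated repo id below; not a tested recipe.
from transformers import AutoProcessor, Llama4ForConditionalGeneration

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed repo id
processor = AutoProcessor.from_pretrained(model_id)
model = Llama4ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": [
    {"type": "text", "text": "Explain mixture-of-experts routing in one sentence."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(processor.batch_decode(out[:, inputs["input_ids"].shape[-1]:], skip_special_tokens=True)[0])
```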
Brief-details: Gemma 3 27B - Google's quantized instruction-tuned large language model, optimized for efficiency using 4-bit quantization in GGUF format
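A minimal sketch of loading a 4-bit GGUF build with llama-cpp-python; the repo id and filename pattern below are assumptions, so check the actual model page for the exact quant files.

```python
# Sketch only: repo id and quant filename pattern below are assumptions.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="google/gemma-3-27b-it-qat-q4_0-gguf",  # assumed repo id
    filename="*q4_0.gguf",                          # assumed quant pattern
    n_ctx=8192,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "One-line summary of GGUF quantization?"}]
)
print(out["choices"][0]["message"]["content"])
```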
Brief-details: RolmOCR is a fast, efficient OCR model built on Qwen2.5-VL-7B, offering improved performance and lower memory usage compared to olmOCR for document text extraction
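Since RolmOCR is a Qwen2.5-VL-7B fine-tune, the standard Qwen2.5-VL inference path in transformers should apply; the repo id reducto/RolmOCR, the page.png input, and the prompt wording are assumptions.

```python
# Hedged sketch of the standard Qwen2.5-VL inference path applied to RolmOCR.
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info

model_id = "reducto/RolmOCR"  # assumed repo id
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [{"role": "user", "content": [
    {"type": "image", "image": "page.png"},  # assumed local document image
    {"type": "text", "text": "Return the plain text of this page."},
]}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
images, _ = process_vision_info(messages)
inputs = processor(text=[prompt], images=images, padding=True, return_tensors="pt").to(model.device)
ids = model.generate(**inputs, max_new_tokens=512)
# Decode only the generated continuation, not the prompt tokens.
print(processor.batch_decode(ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])
```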
Brief-details: Educational PyTorch implementation of Llama 3.2 with 1B/3B parameter variants, optimized for learning and research. Includes instruction-tuned versions.
Brief-details: Meta's Llama-4 variant with 17B active parameters, optimized for instruction-following tasks, featuring enhanced context handling and improved performance.
Brief-details: OpenHands LM 32B is an open-source coding model with a 37.2% resolve rate on SWE-Bench Verified, featuring a 128K context window and local deployment capability on consumer GPUs.
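Local deployment typically means serving the model behind an OpenAI-compatible endpoint (for example with vLLM or llama.cpp) and calling it like any hosted API; the base URL, port, and model id below are assumptions.

```python
# Sketch: query a locally served model through an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")  # assumed local server
resp = client.chat.completions.create(
    model="all-hands/openhands-lm-32b-v0.1",  # assumed model id
    messages=[{"role": "user", "content": "Sketch a fix for a flaky pytest test."}],
)
print(resp.choices[0].message.content)
```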
Brief-details: Meta's 17B parameter instruction-tuned Llama model optimized for conversational AI, featuring enhanced instruction-following capabilities
Brief-details: Qwen2.5-VL-32B-Instruct-8bit is an MLX-format conversion of the Qwen2.5-VL vision-language model, quantized to 8-bit for efficient multimodal inference
Brief-details: MLX-optimized Qwen2.5 vision-language model with 32B parameters, supporting multimodal tasks via the MLX framework in BF16 precision for efficient inference.
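For MLX conversions like the two entries above, inference usually goes through the mlx-vlm package; the sketch below follows that package's README pattern, with the repo id and image path as assumptions (Apple Silicon only).

```python
# Hedged sketch following the mlx-vlm README pattern.
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Qwen2.5-VL-32B-Instruct-8bit"  # assumed repo id
model, processor = load(model_path)
config = load_config(model_path)

images = ["photo.jpg"]  # assumed local image
prompt = apply_chat_template(processor, config, "Describe this image.", num_images=len(images))
print(generate(model, processor, prompt, images, verbose=False))
```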
Brief-details: GenHancer enhances CLIP models' fine-grained visual perception through a two-stage training scheme, improving vision-language performance by up to 6.0% for OpenAI CLIP.
Brief-details: DanbotNL-2408-260M is a 260M parameter language model specialized in translating natural language (Japanese/English) into Danbooru tags, built on the modernbert-ja-130m architecture.
Brief-details: A 3 billion parameter pretrained language model focused on German language processing, created by amuvarma and hosted on Hugging Face.
Brief-details: Multimodal Gemma-3 variant with speech capabilities - 4B parameters, handles text/audio/vision, specialized in automatic speech recognition (ASR) and speech translation (AST) tasks with a 128K context window.
Brief-details: A comprehensive collection of GGUF quantizations of the X-Ray_Alpha model, offering compression levels from 1.54GB to 7.77GB with different quality-size tradeoffs.
Brief-details: A quantized version of Fallen-Gemma3-27B-v1 with multiple GGUF variants, offering size-quality tradeoffs from 8GB to 54GB.
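For multi-quant GGUF repos like the two entries above, a common pattern is to list the available quant files and download only the one that fits your hardware; the repo id below is an assumption.

```python
# Sketch: enumerate and fetch a single GGUF quant from a multi-file repo.
from huggingface_hub import hf_hub_download, list_repo_files

repo_id = "mradermacher/X-Ray_Alpha-GGUF"  # assumed repo id
gguf_files = [f for f in list_repo_files(repo_id) if f.endswith(".gguf")]
print(gguf_files)  # quant names (e.g. Q2_K ... Q8_0) trade size for quality
path = hf_hub_download(repo_id, filename=gguf_files[0])  # pick the size you can fit
print("downloaded to", path)
```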
Brief-details: A personal collection of AI model clips shared on Hugging Face, designed for easy bulk access rather than individually curated uploads.
Brief-details: A 32B parameter Cantonese LLM built on Qwen 2.5, trained on 600M Hong Kong news articles and fine-tuned on 75K instruction pairs. Optimized for Hong Kong knowledge and Cantonese conversation.
Brief-details: LayerAnimate-Mix is a video diffusion framework enabling layer-level animation control, developed by Yuxue Yang et al., with a layer-aware architecture for precise manipulation.
Brief-details: Llama-3.1-Nemotron-Nano-8B is a derivative of Meta's Llama-3.1-8B-Instruct, optimized for reasoning and chat with a 128K context window, and is available in multiple quantization formats for various hardware configurations.
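NVIDIA's model card describes toggling the model's reasoning mode through the system prompt; the repo id and the exact toggle string below are assumptions, so verify both on the card.

```python
# Hedged sketch of chat usage with the reasoning toggle in the system prompt.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="nvidia/Llama-3.1-Nemotron-Nano-8B-v1",  # assumed repo id
    device_map="auto",
    torch_dtype="auto",
)
messages = [
    {"role": "system", "content": "detailed thinking on"},  # assumed toggle string
    {"role": "user", "content": "Is 1001 prime? Answer briefly."},
]
print(pipe(messages, max_new_tokens=512)[0]["generated_text"][-1]["content"])
```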