Models

Active filters: text-generation-inference

nvidia/Nemotron-Orchestrator-8B

Text Generation • 8B • Updated 7 days ago • 3.57k • 386

EssentialAI/rnj-1-instruct

Text Generation • 8B • Updated about 1 hour ago • 441k • 128

microsoft/Fara-7B

Image-Text-to-Text • 8B • Updated 7 days ago • 32.3k • 428

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 5.19M • 5.1k

EssentialAI/rnj-1

Text Generation • 8B • Updated about 1 hour ago • 121k • 44

open-thoughts/OpenThinker-Agent-v1

Text Generation • 8B • Updated 2 days ago • 113 • 43

maya-research/maya1

Text-to-Speech • 3B • Updated 27 days ago • 71.5k • 809

Qwen/Qwen3-0.6B

Text Generation • 0.8B • Updated Jul 26 • 7.59M • 854

google/gemma-3-4b-it

Image-Text-to-Text • 4B • Updated Mar 21 • 1.03M • 1.02k

Qwen/Qwen3-4B-Instruct-2507

Text Generation • 4B • Updated Sep 17 • 6.29M • 530

Qwen/Qwen2.5-7B-Instruct

Text Generation • 8B • Updated Jan 12 • 7.25M • 932

meta-llama/Llama-3.1-8B

Text Generation • 8B • Updated Oct 16, 2024 • 731k • 1.96k

dphn/Dolphin-Mistral-24B-Venice-Edition

Text Generation • 24B • Updated Sep 8 • 9.67k • 327

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6 • 3.34M • 1.38k

Qwen/Qwen3-Embedding-8B

Feature Extraction • 8B • Updated Jul 7 • 830k • 475

thu-pacman/PCMind-2.1-Kaiyuan-2B

Text Generation • 2B • Updated about 13 hours ago • 214 • 12

FutureMa/Qwen3-8B-Drama-Thinking

Text Generation • 308k • Updated about 13 hours ago • 12

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27 • 1.21M • 12.9k

google/gemma-3-27b-it

Image-Text-to-Text • 27B • Updated Mar 21 • 1.42M • 1.73k

Qwen/Qwen3-1.7B

Text Generation • 2B • Updated Jul 26 • 4.14M • 347

TinyLlama/TinyLlama-1.1B-Chat-v1.0

Text Generation • 1B • Updated Mar 17, 2024 • 1.85M • 1.47k

Qwen/Qwen3-8B

Text Generation • 8B • Updated Jul 26 • 4.76M • 792

Genius-Society/hoyoMusic

Updated Oct 30 • 28

meta-llama/Meta-Llama-3-8B-Instruct

Text Generation • 8B • Updated Jun 18 • 1.23M • 4.31k

google/gemma-2-2b-it

Text Generation • 3B • Updated Aug 27, 2024 • 759k • 1.24k

Qwen/Qwen2.5-0.5B-Instruct

Text Generation • 0.5B • Updated Sep 25, 2024 • 2.14M • 404

meta-llama/Llama-3.3-70B-Instruct

Text Generation • 71B • Updated Dec 21, 2024 • 412k • 2.59k

Qwen/Qwen3-Embedding-0.6B

Feature Extraction • 0.6B • Updated Jun 20 • 4.44M • 768

Qwen/Qwen3-4B-Thinking-2507

Text Generation • 4B • Updated Aug 6 • 715k • 480

WeiboAI/VibeThinker-1.5B

Text Generation • 2B • Updated 15 days ago • 28.3k • 499