Voxtral Realtime 4B
Speech-to-Text in the browser with transformers.js + WebGPU
A cutting-edge speech generation model with stereo support
Controllable TTS via instruction prompting (JPN / Anime)
FireRed-Image-Edit × Qwen-Image-Edit-Rapid (Transformers)
FireRed-OCR for Document Recognition
Generate speech audio from text with custom or cloned voices
Generate high-quality images from text prompts
Generate or edit images from text and optional photos
Based on 'Z-IMAGE TURBO'
Generate spoken audio from typed text
Multimodal OCR model for complex document understanding