As little as 1 min of voice data can be used to train a good TTS model
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
OpenVINO™ Toolkit repository
Qwen3-ASR is an open-source series of ASR models
Controllable & emotion-expressive zero-shot TTS
The official Python SDK for the ElevenLabs API
Real-time voice interactive digital human
Video translation and dubbing tool powered by LLMs
Converts text to speech in real time
Generate audiobooks from e-books, voice cloning & 1107+ languages
Offline inference engine for art, real-time voice conversations
Official PyTorch Implementation
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Speakr is a personal, self-hosted web application
Repo of Qwen2-Audio chat & pretrained large audio language model
The Python library for real-time communication
Open source AI VTuber platform with voice chat and Live2D avatars
Production ready toolkit to run AI locally
Framework for building neural networks
Interface for OuteTTS models
AI-powered tool for generating, optimizing, and translating subtitles
Workflow and speech recognition app
Instant voice cloning by MIT and MyShell. Audio foundation model
LLM for live-stream sales hosts
Multi-lingual large voice generation model, providing inference