VideoChat is a real-time, voice-interactive "digital human" system that combines automatic speech recognition (ASR), a large language model (LLM), text-to-speech (TTS), and talking-head generation (THG) into a single conversational pipeline. It supports two modes: an end-to-end voice pipeline in which the multimodal model GLM-4-Voice feeds directly into talking-head generation, and a more traditional cascaded pipeline of ASR → LLM → TTS → THG. It is built as a Gradio demo in Python and exposes a web interface where users talk to an animated avatar that lip-syncs to the synthesized speech of each reply. The system is customizable: you can define your own avatar appearance and voice, and voice cloning lets you generate a new voice from a short 3–10 second reference sample. The tech stack integrates FunASR for speech recognition, Qwen as the language model, a choice of TTS engines (GPT-SoVITS, CosyVoice, or edge-tts), and MuseTalk for talking-head generation.

Features

  • Real-time voice-interactive digital human combining ASR, LLM, TTS, and talking-head generation in one demo
  • Supports end-to-end GLM-4-Voice pipelines and cascaded ASR → LLM → TTS → THG pipelines
  • Customizable avatar appearance and voice, with optional voice cloning from short reference samples
  • Uses modular components such as FunASR, Qwen, GPT-SoVITS, CosyVoice, edge-tts, and MuseTalk for flexibility
  • Gradio-based web interface for easy local deployment, experimentation, and demonstration
  • Initial response latency of roughly 3 seconds, intended to keep conversations smooth and interactive
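
The cascaded ASR → LLM → TTS → THG flow described above can be pictured as four interchangeable stages chained together. The following is a minimal sketch under stated assumptions, not VideoChat's actual code: the `CascadedPipeline` class, the stage signatures, and the `fake_*` stubs are all hypothetical and do not reflect the real FunASR, Qwen, TTS, or MuseTalk APIs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stage signatures, chosen only for illustration.
ASRFn = Callable[[bytes], str]          # user audio -> transcript
LLMFn = Callable[[str], str]            # transcript -> reply text
TTSFn = Callable[[str], bytes]          # reply text -> synthesized audio
THGFn = Callable[[bytes], List[bytes]]  # synthesized audio -> video frames


@dataclass
class CascadedPipeline:
    """Chains four swappable stages: ASR -> LLM -> TTS -> THG."""
    asr: ASRFn
    llm: LLMFn
    tts: TTSFn
    thg: THGFn

    def respond(self, user_audio: bytes) -> Tuple[str, bytes, List[bytes]]:
        transcript = self.asr(user_audio)  # speech recognition
        reply = self.llm(transcript)       # language model reply
        speech = self.tts(reply)           # speech synthesis
        frames = self.thg(speech)          # talking-head frames driven by the audio
        return reply, speech, frames


# Stub stages standing in for real models (assumptions, not real engines).
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")

def fake_llm(text: str) -> str:
    return f"You said: {text}"

def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")

def fake_thg(audio: bytes) -> List[bytes]:
    # Pretend each 8-byte slice of audio drives one video frame.
    return [audio[i:i + 8] for i in range(0, len(audio), 8)]


pipeline = CascadedPipeline(fake_asr, fake_llm, fake_tts, fake_thg)
reply, speech, frames = pipeline.respond(b"hello")
```

Because each stage is just a callable, swapping one TTS engine for another in this sketch only means passing a different `tts` function. An end-to-end mode would replace the first three stages with a single audio-to-audio call before the talking-head stage.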

Categories

Text to Speech

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Text to Speech Software

Registered

2025-11-28