Search Results for "real voice text to speech"

Showing 19 open source projects for "real voice text to speech"

  • 1
    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    OpenAI.fm is an official interactive demo application built to showcase the OpenAI Speech API and its advanced text-to-speech capabilities, providing developers and creators with a hands-on web interface to convert text into high-quality, customizable audio using state-of-the-art TTS models. Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features like emotional range, tone, and real-time playback. ...
    Downloads: 16 This Week
  • 2
    Handy STT

    A free, open source, and extensible speech-to-text application

    Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. ...
    Downloads: 67 This Week
  • 3
    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the...
    Downloads: 68 This Week
  • 4
    ElatoAI

    Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP

    ElatoAI is a real-time AI voice agent platform built around IoT hardware (ESP32) that enables continuous speech-to-speech conversations using state-of-the-art multimodal voice models with minimal latency and global performance via edge computing. The system integrates voice synthesis and recognition by connecting an ESP32 device through secure WebSockets to edge server functions written in Deno, allowing users to speak naturally with AI agents hosted through cloud APIs including OpenAI’s Realtime API, Gemini’s Live API, xAI’s Grok Voice Agent API, and others. ...
    Downloads: 0 This Week
  • 5
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure parameters such as rate, pitch, and volume per role. ...
    Downloads: 2 This Week
  • 6
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all necessary dependencies, so users can focus on experimenting with voices instead of managing tooling. ...
    Downloads: 11 This Week
  • 7
    Vibe

    Transcribe on your own

    Vibe is an open-source project by thewh1teagle for transcribing audio and video on your own machine. It performs speech recognition locally with OpenAI's Whisper model, so recordings do not need to be sent to a cloud service.
    Downloads: 18 This Week
  • 8
    Operit AI

    Powerful Android AI agent with tools, automation, and Linux shell

    Operit is a full-featured AI assistant and agent platform designed specifically for Android devices, aiming to go far beyond traditional chat-based interfaces. It integrates deep system-level capabilities with a wide range of tools, allowing the AI to perform real tasks such as file management, automation, and system control directly on the device. A standout aspect of the project is its built-in Ubuntu 24 environment, which enables users to run Linux commands, scripts, and development tools...
    Downloads: 7 This Week
  • 9
    AI App Lab

    Implementing large models into scenario-based applications

    AI App Lab is an open-source platform developed by Volcengine that provides tools, SDKs, and example applications for building real-world AI applications powered by large language models. The project focuses on helping developers bridge the gap between AI models and practical business use cases by offering a structured environment for creating production-ready AI systems. It includes a high-level SDK called Arkitect, which provides workflows and tools for integrating models, plugins, and multimodal capabilities such as text, image, and voice processing. ...
    Downloads: 3 This Week
  • 10
    WebCord

    A Discord and Spacebar Electron-based client

    WebCord is now quite a complex project; it can be summarized as a pack of security and privacy hardenings, reimplementations of Discord features, workarounds for Electron, Chromium, and Discord bugs, stylesheets, internal pages, and a wrapped Discord page, designed to conform with the ToS as much as possible (or to hide the changes that might violate it from Discord's eyes). WebCord does a lot to improve the privacy of its users. It blocks known tracking and fingerprinting methods, but it does not end...
    Downloads: 54 This Week
  • 11
    ModelFusion

    The TypeScript library for building AI applications

    ...The framework allows developers to integrate large language models and other generative systems into JavaScript and TypeScript applications through a consistent and standardized API. Instead of writing separate integration logic for each provider, developers can use ModelFusion to handle common operations such as text generation, structured object generation, streaming responses, and tool calls. The library supports a wide range of model types, including text generation models, vision models, text-to-speech engines, speech-to-text systems, and embedding models. It also includes built-in production features such as observability hooks, logging, automatic retries, and error handling mechanisms that improve reliability when deploying AI systems in real-world environments.
    Downloads: 6 This Week
  • 12
    Amica

    Amica is an open source interface for interactive communication

    ...Under the hood, Amica leverages modern web and desktop technologies: three.js and three-vrm for 3D rendering, Transformers.js for running models in the browser, Whisper and Silero VAD for speech recognition and voice-activity detection, and a variety of LLM backends such as llama.cpp servers, ChatGPT-compatible APIs, Ollama, KoboldCpp, and others. It also integrates multiple text-to-speech providers, including ElevenLabs, OpenAI, Coqui, RVC, and AllTalkTTS.
    Downloads: 7 This Week
  • 13
    OpenMAIC

    Open Multi-Agent Interactive Classroom

    OpenMAIC is an open-source multi-agent learning platform built to turn a topic or uploaded material into a fully interactive classroom experience with minimal setup. It is designed around coordinated AI roles, including teacher-like and classmate-like agents that can present information, respond in real time, and participate in live educational dialogue. The platform generates multiple learning scenes rather than a single static output, including slides, quizzes, interactive simulations, and project-based activities, which makes it feel closer to a guided lesson than a simple content generator. It also supports whiteboard-style visual explanation and text-to-speech delivery, allowing agents to draw, explain formulas, and speak aloud during instruction. ...
    Downloads: 7 This Week
  • 14
    Gradient Bang

    Gradient Bang is an online multiplayer universe

    ...The project serves both as a prototype and a conceptual playground for testing how conversational AI systems behave when embedded into dynamic, game-like environments rather than static chat interfaces. It leverages the broader Pipecat architecture for multimodal and conversational AI orchestration, meaning that interactions can potentially extend beyond text into voice, events, and real-time systems.
    Downloads: 3 This Week
  • 15
    Gangio Desktop

    Gangio is a full-featured chat and communication platform.

    Gangio is a full-featured chat and communication platform. It provides real-time messaging, server-based communities, direct messaging, voice/video communication, and social features in a modern, responsive interface.
    Downloads: 0 This Week
  • 16
    NodeTool

    Visual AI Workflow Builder

    NodeTool is an open‑source, visual AI workflow builder that lets you connect nodes for text, images, audio, video, data, and automation—then run them locally or on the cloud. Build multi‑step agents, RAG systems, and creative media pipelines without coding, inspect execution in real time, and deploy anywhere: home server, private VPC, RunPod, or Cloud Run. With a local‑first design, NodeTool keeps models and data under your control while still supporting providers like OpenAI, Anthropic,...
    Downloads: 9 This Week
  • 17
    Ainee

    Ainee - AI Notetaking and Learning Companion

    Ainee is your ultimate AI-powered notetaking and learning companion. Capture lecture notes in real-time and effortlessly transform audio, text, files, and YouTube videos into formatted notes, mindmaps, quizzes, flashcards, podcasts, and more. Explore our AI meeting note taker, AI notes, video transcript generator, PDF to AI converter, and AI flashcard maker. Enhance your learning with our AI voice recorder, article summarizer AI, and AI quiz generator.
    Downloads: 0 This Week
  • 18
    TTS-Vue

    Microsoft speech synthesis tool, built with Electron

    TTS-Vue is a desktop text-to-speech application built with Electron, Vue, ElementPlus, and Vite, focused on using Microsoft’s official Speech API for high-quality neural synthesis. It wraps the Microsoft TTS WebSocket interface in a clean UI so users can paste or load text, choose voices, tweak parameters, and export audio without touching raw API calls. The app supports SSML (Speech Synthesis Markup Language), letting power users specify fine-grained control over pronunciation, pauses,...
    Downloads: 61 This Week
  • 19
    Chat with GPT

    An open-source ChatGPT app with a voice

    ...Users can review past chat sessions, modify system prompts, and adjust model parameters such as temperature to control response creativity. The platform also integrates speech capabilities by connecting to text-to-speech systems and speech recognition engines, enabling voice-based conversations with the AI assistant. Additional features include message editing, response regeneration, and the ability to share conversations through public links.
    Downloads: 0 This Week