audio linux free download

Showing 852 open source projects for "audio linux"

View related business solutions

Python Clear Filters & Widen Search

Earn up to 15% annual interest with Nexo.
More flexibility. More control.

Generate interest, access liquidity without selling, and execute trades seamlessly. All in one platform. Geographic restrictions, eligibility, and terms apply.

Get started with Nexo.
Earn up to 15% annual interest with Nexo.
Access competitive interest rates on your digital assets.

Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.

Get started with Nexo.
1

MLX-Audio

A text-to-speech, speech-to-text and speech-to-speech library

MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI...

Downloads: 13 This Week

Last Update: 2026-03-30
See Project
2

Step-Audio

Open-source framework for intelligent speech interaction

Step-Audio is a unified, open-source framework aimed at building intelligent speech systems that combine both comprehension and generation: it integrates large language models (LLMs) with speech input/output to handle not only semantic understanding but also rich vocal characteristics like tone, style, dialect, emotion, and prosody. The design moves beyond traditional separate-component pipelines (ASR → text model → TTS), instead offering a multimodal model that ingests speech or audio and...

Downloads: 3 This Week

Last Update: 2026-03-16
See Project
3

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one...

Downloads: 1 This Week

Last Update: 2026-01-27
See Project
4

Qwen2-Audio

Repo of Qwen2-Audio chat & pretrained large audio language model

Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
Get full visibility and control over your tasks and projects with Wrike.
A cloud-based collaboration, work management, and project management software

Wrike offers world-class features that empower cross-functional, distributed, or growing teams take their projects from the initial request stage all the way to tracking work progress and reporting results.

Learn More
5

Qwen-Audio

Chat & pretrained large audio language model proposed by Alibaba Cloud

Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and output text. There is also an instruction-tuned version called Qwen-Audio-Chat which supports conversational interaction (multi-round), audio + text input, creative tasks and reasoning over audio. It uses multi-task training over many different audio tasks (30+), and achieves strong multi-benchmarks performance...

Downloads: 1 This Week

Last Update: 2025-09-23
See Project
6

Step-Audio-EditX

LLM-based Reinforcement Learning audio edit model

Step-Audio-EditX is an open-source, 3 billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level...

Downloads: 2 This Week

Last Update: 2026-04-09
See Project
7

Fun Audio Chat

Large Audio Language Model built for natural interactions

Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. The system...

Downloads: 0 This Week

Last Update: 2026-02-27
See Project
8

Step-Audio 2

Multi-modal large language model designed for audio understanding

Step-Audio2 is an advanced, end-to-end multimodal large language model designed for high-fidelity audio understanding and natural speech conversation: unlike many pipelines that separate speech recognition, processing, and synthesis, Step-Audio2 processes raw audio, reasons about semantic and paralinguistic content (like emotion, speaker characteristics, non-verbal cues), and can generate contextually appropriate responses — including potentially generating or transforming audio output. It...

Downloads: 0 This Week

Last Update: 2026-03-16
See Project
9

Ultimate Vocal Remover (UVR5)

GUI for a Vocal Remover that uses Deep Neural Networks

This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).

Downloads: 767 This Week

Last Update: 2025-01-20
See Project
Outbound sales software
Unified cloud-based platform for dialing, emailing, appointment scheduling, lead management and much more.

Adversus is an outbound dialing solution that helps you streamline your call strategies, automate manual processes, and provide valuable insights to improve your outbound workflows and efficiency.

Learn More
10

LTX-2.3

Official Python inference and LoRA trainer package

LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...

Downloads: 167 This Week

Last Update: 7 days ago
See Project
11

spotDL

Download your Spotify playlists and songs along with album art

spotDL is a command-line tool that allows users to download songs and playlists from Spotify by sourcing the audio from YouTube. Built in Python, it automatically matches Spotify tracks with corresponding videos on YouTube and downloads them with embedded metadata. The tool retrieves important information such as album art, song titles, artist names, and lyrics to organize downloaded files. spotDL is designed to be fast, accurate, and easy to use through a simple command-line interface. It...

Downloads: 105 This Week

Last Update: 2025-10-08
See Project
12

Librosa

Python library for audio and music analysis

Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.

Downloads: 6 This Week

Last Update: 2025-07-03
See Project
13

Pedalboard

A Python library for audio

pedalboard is a Python library for working with audio: reading, writing, rendering, adding effects, and more. It supports the most popular audio file formats and a number of common audio effects out of the box and also allows the use of VST3® and Audio Unit formats for loading third-party software instruments and effects. pedalboard was built by Spotify’s Audio Intelligence Lab to enable using studio-quality audio effects from within Python and TensorFlow. Internally at Spotify, pedalboard...

Downloads: 7 This Week

Last Update: 2026-02-01
See Project
14

NovaSR

A lightning fast audio upsampler

NovaSR is an extremely lightweight and high-performance audio upsampling model that transforms low-quality 16 kHz audio into clearer, high-fidelity 48 kHz audio with remarkable speed and efficiency. At only about 50 KB in size, the model is orders of magnitude smaller than typical audio super-resolution networks, yet it achieves high quality and realtime performance thanks to its compact architecture and efficient convolutional design. NovaSR is especially valuable for post-processing tasks...

Downloads: 5 This Week

Last Update: 2026-02-26
See Project
15

yt-dlp-gui

A cross-platform GUI wrapper for yt-dlp written in PySide6

...The project supports preset definitions and global arguments through a config file, so users can customize their most common download workflows—like audio extraction, quality ranking, or embedding thumbnails—without retyping arguments each time. Downloads can be initiated from a portable app bundle or run manually with Python, making it flexible across platforms including Windows and Linux.

Downloads: 303 This Week

Last Update: 2026-01-20
See Project
16

Podcastfy.ai

Transforming Multimodal Content into Captivating Multilingual Audio

Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, youtube videos as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources enabling...

Downloads: 6 This Week

Last Update: 2024-11-16
See Project
17

OpenShot Video Editor

Award-Winning Open Source Video Editing Software

...OpenShot offers a myriad of features and capabilities, including powerful curve-based Key frame animations, 3D animated titles and effects, slow motion and time effects, audio mixing and editing, and so much more. It’s available for Linux, Mac and Windows, with a very simple and friendly interface. Start creating stunning videos quickly and easily with OpenShot!

6 Reviews

Downloads: 151 This Week

Last Update: 2026-04-08
See Project
18

AudioCraft

Audiocraft is a library for audio processing and generation

AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...

Downloads: 4 This Week

Last Update: 2025-10-13
See Project
19

AudioNotes

Extract audio and video content and organize it into a Markdown note

AudioNotes is an application (or proof-of-concept) that likely combines audio recording or playback with note-taking or annotation functionality — enabling users to record voice or audio and attach textual or timestamped notes, making it ideal for lectures, interviews, meetings, or personal memos. Such a tool offers a more expressive and flexible way to capture and revisit information: instead of just typed notes or raw audio, users get both audio context and structured notes. As an...

Downloads: 2 This Week

Last Update: 2025-12-04
See Project
20

VoxCPM2

Tokenizer-Free TTS for Multilingual Speech Generation

VoxCPM2 is an advanced open-source text-to-speech system that redefines speech synthesis by eliminating traditional tokenization and instead generating continuous speech representations through a diffusion-based autoregressive architecture. Built on top of the MiniCPM model family, it enables highly natural, expressive, and context-aware speech generation that adapts tone, emotion, and pacing directly from input text. The system is trained on massive multilingual datasets, enabling support...

Downloads: 21 This Week

Last Update: 6 days ago
See Project
21

HeartMuLa

A Family of Open Sourced Music Foundation Models

HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, designed to support both music generation and music-related understanding tasks in a cohesive stack. At the center is HeartMuLa, a music language model that generates music conditioned on inputs like lyrics and tags, with multilingual support that broadens the range of lyric-driven use cases. The project also includes HeartCodec, a music codec optimized for high...

Downloads: 15 This Week

Last Update: 2026-04-10
See Project
22

SpeechRecognition

Speech recognition module for Python

Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...

Downloads: 9 This Week

Last Update: 2026-04-05
See Project
23

LatentSync

Taming Stable Diffusion for Lip Sync

LatentSync is an open-source framework from ByteDance that produces high-quality lip-synchronization for video by using an audio-conditioned latent diffusion model, bypassing traditional intermediate motion representations. In effect, given a source video (with masked or reference frames) and an audio track, LatentSync directly generates frames whose lip motions and expressions align with the audio, producing convincing talking-head or animated lip-sync output. The system leverages a U-Net...

Downloads: 6 This Week

Last Update: 2025-12-02
See Project
24

Audiomentations

A Python library for audio data augmentation

A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio and multichannel audio. Can be integrated in training pipelines in e.g. Tensorflow/Keras or Pytorch. Has helped people get world-class results in Kaggle competitions. Is used by companies making next-generation audio products. Mix in another sound, e.g. a background noise. Useful if your original sound is clean and you want to simulate an environment where...

Downloads: 2 This Week

Last Update: 2025-09-13
See Project
25

SoulSync

Automated Music Discovery and Collection Manager

SoulSync is an intelligent music discovery and automation platform designed to bridge streaming services with self-hosted media libraries, enabling users to automatically grow and maintain curated music collections. The system continuously monitors selected artists and detects new releases, then generates personalized playlists such as Release Radar and Discovery Weekly using its built-in recommendation logic. It can automatically download missing tracks from multiple sources including...

Downloads: 15 This Week

Last Update: 2026-04-13
See Project