ai image video free download

Showing 38 open source projects for "ai image video"

Filter: Unix Shell

1

CogVideo

Text and image to video generation: CogVideoX and CogVideo

CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. ...

Downloads: 25 This Week

Last Update: 2025-10-04
2

SwarmUI

Modular AI image and video generation web UI with extensible tools

SwarmUI is a modular web-based user interface designed for AI-driven image generation, with a strong focus on usability, performance, and extensibility. It serves as a unified environment for working with multiple AI models, including Stable Diffusion and newer image and video generation systems, allowing users to create and manage outputs through a browser interface. SwarmUI is built to accommodate both beginners and advanced users by offering a simple “Generate” interface alongside more advanced workflow tools that expose deeper configuration options. ...

Downloads: 6 This Week

Last Update: 2026-03-18
3

RuoYi AI

Enterprise AI platform for building, deploying, and managing apps

RuoYi AI is a full-stack enterprise-oriented AI development platform designed to help developers rapidly build, deploy, and manage intelligent applications using modern large language models and AI ecosystems. It provides a unified framework for integrating multiple AI models from different providers, allowing teams to switch or combine models through a consistent interface without vendor lock-in. RuoYi AI includes built-in support for retrieval-augmented generation, enabling organizations...

Downloads: 6 This Week

Last Update: 6 days ago
4

Generative AI for Beginners (Version 3)

21 Lessons, Get Started Building with Generative AI

...The course covers everything from model selection, prompt engineering, and chat/text/image app patterns to secure development practices and UX for AI. It also walks through modern application techniques such as function calling, RAG with vector databases, working with open source models, agents, fine-tuning, and using SLMs. Each lesson includes a short video, a written guide, runnable samples for Azure OpenAI, the GitHub Marketplace Model Catalog, and the OpenAI API, plus a “Keep Learning” section for deeper study.

Downloads: 4 This Week

Last Update: 2 days ago
5

VMZ (Video Model Zoo)

VMZ: Model Zoo for Video Modeling

The codebase was designed to help researchers and practitioners quickly reproduce FAIR’s results and leverage robust pre-trained backbones for downstream tasks. It also integrates Gradient Blending, an audio-visual modeling method that fuses modalities effectively (available in the Caffe2 implementation). Although VMZ is now archived and no longer actively maintained, it remains a valuable reference for understanding early large-scale video model training, transfer learning, and multimodal...

Downloads: 1 This Week

Last Update: 5 days ago
6

Bg-remover App

Offline Image Background Remover

Bg-remover is an offline, AI-powered desktop app that removes the background from any image or photo with one click. It uses machine learning to produce accurate results within seconds, with no upload step.

Downloads: 8 This Week

Last Update: 2025-12-05
7

AIGCPanel

One-stop AI digital human system with video voice synthesis tools

AIGCPanel is an open source desktop application designed as a comprehensive, all-in-one platform for creating AI-powered digital humans and media content. It integrates multiple capabilities such as video synthesis, voice synthesis, and voice cloning into a unified interface, allowing users to generate realistic audiovisual outputs with minimal setup. AIGCPanel focuses heavily on simplifying the management of local AI models by providing streamlined workflows for importing, configuring, and running different models with minimal manual effort. ...

Downloads: 14 This Week

Last Update: 2026-03-18
8

Public Image Mirror

Mirror for container images hosted on overseas registries such as GCR

...It uses a lazy-loading mechanism, caching image layers in third-party object storage, ensuring that frequently used images are delivered faster. The system is simple to use, requiring only a prefix replacement to pull images from the mirror instead of the original registry. By reducing latency and improving reliability, the service supports developers in accelerating Kubernetes, Docker, Containerd, and AI model image downloads in production environments.

Downloads: 0 This Week

Last Update: 2 days ago
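
The prefix replacement mentioned in the Public Image Mirror description can be sketched roughly as follows. This is a minimal illustration, not the project's code: the mirror host `mirror.example.com` and the helper name `to_mirror` are hypothetical, and the sketch assumes the mirror is addressed by prepending its host to the full original reference, which the excerpt does not confirm.

```python
def to_mirror(image: str, mirror_prefix: str) -> str:
    """Return an image reference that points at a mirror.

    Assumption: the mirror is reached by prepending its host to the
    complete original reference (registry, repository, and tag).
    """
    prefix = mirror_prefix.strip("/")
    if image.startswith(prefix + "/"):
        # Already rewritten; leave it unchanged so the call is idempotent.
        return image
    return f"{prefix}/{image}"


# Hypothetical mirror host and image name, for illustration only.
print(to_mirror("gcr.io/example-project/app:1.0", "mirror.example.com"))
```

The rewritten reference could then be passed to whatever pull command the container runtime provides.
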
9

AI File Sorter

Local AI file organization with categorization and rename suggestions

...It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common text files. For supported audio and video files, AI File Sorter can read embedded metadata (such as ID3, Vorbis, and MP4 tags) to suggest normalized names like year_artist_album_title.ext. AI analysis runs read-only, and all suggestions must be reviewed before being applied. AI File Sorter can run fully offline using local models like Mistral or LLaMA, so files and metadata stay on your device unless you configure a remote endpoint.

Downloads: 239 This Week

Last Update: 2026-04-07
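
As a rough illustration of the year_artist_album_title.ext naming pattern described for AI File Sorter, the sketch below builds a suggested name from a dictionary of tags. It is not the application's code: the function names, the tag keys, and the choice to lowercase values and join words with hyphens are all assumptions made for this example. The function only returns a suggestion and never renames anything, in keeping with the review-before-apply behavior described above.

```python
import re
from pathlib import Path


def _slug(value: str) -> str:
    """Lowercase a tag value and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[\W_]+", "-", value.strip().lower()).strip("-")


def suggest_name(path: str, tags: dict) -> str:
    """Suggest a year_artist_album_title.ext name from metadata tags.

    Missing or empty tags are skipped. If no usable tag is present,
    the original file name is returned unchanged.
    """
    fields = ("year", "artist", "album", "title")  # hypothetical tag keys
    parts = [_slug(str(tags[f])) for f in fields if tags.get(f)]
    parts = [p for p in parts if p]
    if not parts:
        return Path(path).name
    return "_".join(parts) + Path(path).suffix.lower()


tags = {"year": 1959, "artist": "Miles Davis", "album": "Kind of Blue", "title": "So What"}
print(suggest_name("Track 01.MP3", tags))
```

Hyphens inside each field keep the underscore free to act as the field separator, so the suggested name can be split back into its parts.
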
10

ImageReward

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences

ImageReward is the first general-purpose human preference reward model (RM) designed for evaluating text-to-image generation, introduced alongside the NeurIPS 2023 paper ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. Trained on 137k expert-annotated image pairs, ImageReward significantly outperforms existing scoring methods like CLIP, Aesthetic, and BLIP in capturing human visual preferences. It is provided as a Python package (image-reward) that enables...

Downloads: 0 This Week

Last Update: 2 days ago
11

civitai

Open platform for sharing and discovering Stable Diffusion models

Civitai is an open source project that provides the codebase for a platform designed to share and manage generative AI models used for image generation. It focuses primarily on models compatible with Stable Diffusion and related technologies, allowing creators to upload, organize, and distribute custom AI models and related resources. These resources can include textual inversions, hypernetworks, aesthetic gradients, and variational autoencoders that modify or extend the capabilities of diffusion-based image generation systems. ...

Downloads: 13 This Week

Last Update: 20 hours ago
12

VGGSfM

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

...Version 2.0 adds support for dynamic scene handling, dense point cloud export, video-based reconstruction (1000+ frames), and integration with Gaussian Splatting pipelines. It leverages tools like PyCOLMAP, poselib, LightGlue, and PyTorch3D for feature matching, pose estimation, and visualization. With minimal configuration, users can process single scenes or full video sequences, apply motion masks to exclude moving objects, and train neural radiance or splatting models directly from reconstructed outputs.

Downloads: 0 This Week

Last Update: 2 days ago
13

Rig

Rust framework for building modular and scalable LLM-powered apps

...It also supports capabilities such as text generation, embeddings, transcription, image generation, and audio generation depending on the provider used. Developers can integrate language models into their software with minimal boilerplate while maintaining flexibility for complex AI workflows.

Downloads: 8 This Week

Last Update: 6 days ago
14

VisualGLM-6B

Chinese and English multimodal conversational language model

VisualGLM-6B is an open-source multimodal conversational language model developed by ZhipuAI that supports both images and text in Chinese and English. It builds on the ChatGLM-6B backbone, with 6.2 billion language parameters, and incorporates a BLIP2-Qformer visual module to connect vision and language. In total, the model has 7.8 billion parameters. Trained on a large bilingual dataset — including 30 million high-quality Chinese image-text pairs from CogView and 300 million English pairs...

Downloads: 0 This Week

Last Update: 2 days ago
15

Harbor LLM

Run a full local LLM stack with one command using Docker

...Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. It is intended for local development and experimentation rather than production deployment, giving developers a flexible way to explore AI systems, test configurations, and manage complex LLM stacks without manual wiring or setup overhead.

Downloads: 9 This Week

Last Update: 2 days ago
16

Aria2 AriaNg Docker

The Docker image for Aria2 + AriaNg + File Browser + Rclone

One Docker image for downloading, managing, and sharing files, as well as playing video and even synchronizing with cloud storage. It is also small and ARM-compatible, which means you can run it on a Raspberry Pi. Last but not least, automatic HTTPS is easy to set up.

Downloads: 1 This Week

Last Update: 2025-11-28
17

CogVLM

A state-of-the-art open visual language model

CogVLM is an open-source visual–language model suite—and its GUI-oriented sibling CogAgent—aimed at image understanding, grounding, and multi-turn dialogue, with optional agent actions on real UI screenshots. The flagship CogVLM-17B combines ~10B visual parameters with ~7B language parameters and supports 490×490 inputs; CogAgent-18B extends this to 1120×1120 and adds plan/next-action outputs plus grounded operation coordinates for GUI tasks. The repo provides multiple ways to run models...

Downloads: 0 This Week

Last Update: 2 days ago
18

Docker-Android

Android in Docker with noVNC support and video recording

...Ability to connect to Selenium Grid. Ability to control the emulator from outside the container using adb connect. Supports real devices with screen mirroring. Ability to record video during test execution for debugging. Integrates with other cloud solutions, e.g. Genymotion Cloud. Open source, with more features coming.

1 Review

Downloads: 8 This Week

Last Update: 2026-03-19
19

Armbian Linux Build Framework

Armbian Linux Build Framework

...Bash shell and lightweight XFCE-based desktop. Standard boot, config, and update methods with minimal user-space footprint. Special config utilities are optional. A distributed image is compressed to its real data size, which starts below 1 GB. Login is possible via serial, HDMI/VGA, or SSH. Boot loader and kernel optimizations, memory caching, ZRAM swap, and video acceleration where applicable. Images are built from sources in a fully automated process. Releases are PGP-signed, and the code is regularly inspected by the community. ...

Downloads: 6 This Week

Last Update: 6 days ago
20

Mesh R-CNN

Code for Mesh R-CNN, ICCV 2019

Mesh R-CNN is a 3D reconstruction and object understanding framework developed by Facebook Research that extends Mask R-CNN into the 3D domain. Built on top of Detectron2 and PyTorch3D, Mesh R-CNN enables end-to-end 3D mesh prediction directly from single RGB images. The model learns to detect, segment, and reconstruct detailed 3D mesh representations of objects in natural images, bridging the gap between 2D perception and 3D understanding. Unlike voxel-based or point-based approaches, Mesh...

Downloads: 0 This Week

Last Update: 7 hours ago
21

KDE-Services

Full-featured right-click menu for Dolphin on KDE Plasma 6.

A full set of features for the right-click context menu of Dolphin (the file manager) on KDE Plasma 6. Designed specifically for Red Hat-based operating systems.

6 Reviews

Downloads: 11 This Week

Last Update: 6 days ago
22

Snowmix

Video mixer for mixing live and recorded video and audio feeds

Version 0.5.2.1 was released on December 29, 2025. Snowmix is a Swiss Army knife tool for mixing live and recorded video and audio feeds. It supports 2D and 3D clipping, scaling, and transparent overlay of video, PNG graphics, and text. It supports animation of video, images, and text through native commands that change scale, placement, transparency, and rotation. Animation and actions can also be controlled through native scripting and an embedded Tcl and/or Python interpreter. Snowmix is...

10 Reviews

Downloads: 7 This Week

Last Update: 2025-12-29
23

CC2.TV / CC2 - Audio- und TV-Datenbank

Meta-database application for the audio and TV broadcasts of CC2.TV

This program provides a meta-database application for the audio and video broadcasts of CC2.TV on GNU/Linux systems. It lets you search, manage, and play the extensive content of the CC2.TV audiocast and videocast. The goal is to make the more than 3000 audiocast topics and more than 1000 videocast topics, which focus on computing, technology, and social issues, conveniently accessible. For full functionality,...

Downloads: 1 This Week

Last Update: 2025-11-17
24

TurboVNC

High-speed, 3D-friendly, TightVNC-compatible remote desktop software

TurboVNC is a high-performance, enterprise-quality version of VNC based on TightVNC, TigerVNC, and X.org. It contains a variant of Tight encoding that is tuned for maximum performance and compression with 3D applications (VirtualGL), video, and other image-intensive workloads. TurboVNC, in combination with VirtualGL, provides a complete solution for remotely displaying 3D applications with interactive performance. TurboVNC's high-speed encoding methods have been adopted by TigerVNC and libvncserver, and TurboVNC is also compatible with any other TightVNC derivative. ...

15 Reviews

Downloads: 127,125 This Week

Last Update: 2024-01-13
25

CogView

Text-to-Image generation. The repo for NeurIPS 2021 paper

CogView is a large-scale pretrained text-to-image transformer model, introduced in the NeurIPS 2021 paper CogView: Mastering Text-to-Image Generation via Transformers. With 4 billion parameters, it was one of the earliest transformer-based models to successfully generate high-quality images from natural language descriptions in Chinese, with partial support for English via translation. The model incorporates innovations such as PB-relax and Sandwich-LN to enable stable training of very deep...

Downloads: 0 This Week

Last Update: 7 hours ago