Easily compute CLIP embeddings and build a CLIP retrieval system
Cosmos-RL is a flexible and scalable Reinforcement Learning framework
AI tool for automating desktop tasks via natural language input
Open Source Speech Language Model
Build multimodal AI applications with cloud-native stack
Parallax is a distributed model serving framework
Open-source evaluation toolkit of large multi-modality models (LMMs)
Open-Source Dual-Arm Mobile Robot with Motorized Lift
An autonomous AI researcher
A simple but powerful self-hosted finance tracker
End-to-end pipeline converting generative videos
Let agents classify your bank transactions
Collection of Gemma 3 variants that are trained for performance
An anomaly detection library comprising state-of-the-art algorithms
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA
LLM-based agent for general purpose software engineering tasks
Large Multimodal Models for Video Understanding and Editing
Spanish-language course repository that teaches fundamentals of SQL
High-quality implementations of standard and SOTA methods
Implementation of "MobileCLIP" (CVPR 2024)
High-accuracy RAG for answering questions from scientific documents
Superfast AI decision-making and processing of multimodal data
Generic templated configuration management for Kubernetes
Integrate GraphQL into your Django project
Usable Implementation of "Bootstrap Your Own Latent" self-supervised