Showing 1058 open source projects for "python data analysis"

View related business solutions
  • Earn up to 15% annual interest with Nexo. Icon
    Earn up to 15% annual interest with Nexo.

    Let your crypto work for you

    Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • Earn up to 15% annual interest with Nexo. Icon
    Earn up to 15% annual interest with Nexo.

    Access competitive interest rates on your digital assets.

    Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    Arize Phoenix

    Arize Phoenix

    Uncover insights, surface problems, monitor, and fine tune your LLM

    Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an Open Source ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve. Deep Learning Models (CV, LLM, and Generative)...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 2
    AudioMuse-AI

    AudioMuse-AI

    AudioMuse-AI is an Open Source Dockerized environment

    AudioMuse-AI is an open-source system designed to automatically generate playlists and analyze music libraries using artificial intelligence and audio signal processing techniques. The platform runs locally in a Dockerized environment and performs detailed sonic analysis on audio files to understand characteristics such as tempo, mood, and acoustic similarity. By analyzing the underlying audio content rather than relying on external metadata services, the system can organize large personal...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    TorchRL

    TorchRL

    A modular, primitive-first, python-first PyTorch library

    TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. TorchRL provides PyTorch and python-first, low and high-level abstractions for RL that are intended to be efficient, modular, documented, and properly tested. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them, or write new ones with little effort.
    Downloads: 54 This Week
    Last Update:
    See Project
  • 4
    LangKit

    LangKit

    An open-source toolkit for monitoring Language Learning Models (LLMs)

    LangKit is an open-source text metrics toolkit for monitoring language models. It offers an array of methods for extracting relevant signals from the input and/or output text, which are compatible with the open-source data logging library whylogs. Productionizing language models, including LLMs, comes with a range of risks due to the infinite amount of input combinations, which can elicit an infinite amount of outputs. The unstructured nature of text poses a challenge in the ML observability...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Get full visibility and control over your tasks and projects with Wrike. Icon
    Get full visibility and control over your tasks and projects with Wrike.

    A cloud-based collaboration, work management, and project management software

    Wrike offers world-class features that empower cross-functional, distributed, or growing teams take their projects from the initial request stage all the way to tracking work progress and reporting results.
    Learn More
  • 5
    OpenRecall

    OpenRecall

    OpenRecall is a fully open-source, privacy-first alternative

    OpenRecall is an open-source, privacy-first system designed to capture, index, and make searchable a user’s entire digital activity history, effectively acting as a personal memory layer for computing environments. It works by taking periodic screenshots of a user’s screen and applying local AI processing, including OCR and semantic analysis, to extract and structure information from both text and images. This data is then indexed into a searchable database, allowing users to retrieve past information quickly using natural language queries. Unlike proprietary alternatives, OpenRecall operates entirely locally, ensuring that all captured data remains on the user’s device and is never transmitted to external servers. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    DoWhy

    DoWhy

    DoWhy is a Python library for causal inference

    DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. Much like machine learning libraries have done for prediction, DoWhy is a Python library that aims to spark causal thinking and analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Claude Scientific Skills

    Claude Scientific Skills

    A set of ready to use Agent Skills for research, science, engineering

    Claude Scientific Skills is a large open source collection of ready-to-use scientific capabilities that extend AI coding agents into full research assistants. The project provides more than 170 curated skills covering domains such as genomics, drug discovery, medical imaging, physics, and advanced data analysis. Each skill bundles documentation, examples, and tool integrations so agents can reliably execute complex multi-step scientific workflows. The framework follows the open Agent Skills standard and works with multiple AI development environments including Claude Code, Cursor, and Codex. Its primary goal is to reduce the friction of scientific computing by giving AI agents structured access to specialized libraries, databases, and research pipelines. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 8
    Magentic UI

    Magentic UI

    A research prototype of a human-centered web agent

    Magentic-UI is a research prototype developed by Microsoft that serves as a human-centered interface powered by a multi-agent system. It enables users to automate complex web tasks, such as browsing, form filling, and data analysis, while maintaining control over the process. The system emphasizes transparency and user involvement, making it suitable for tasks requiring both automation and human oversight.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    chatd

    chatd

    Chat with your documents using local AI

    chatd is an open-source desktop application that allows users to interact with their documents through a locally running large language model. The software focuses on privacy and security by ensuring that all document processing and inference occur entirely on the user’s computer without sending data to external cloud services. It includes a built-in integration with the Ollama runtime, which provides a cross-platform environment for running large language models locally. The application...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Securden Privileged Account Manager Icon
    Securden Privileged Account Manager

    Unified Privileged Access Management

    Discover and manage administrator, service, and web app passwords, keys, and identities. Automate management with approval workflows. Centrally control, audit, monitor, and record all access to critical IT assets.
    Learn More
  • 10
    Preswald

    Preswald

    Python tool for browser-based interactive data apps in one file

    Preswald is an open source Python-based framework and static-site generator designed for building interactive data applications that run entirely in the browser. It packages application logic, data processing, and user interface components into a single self-contained output, enabling easy sharing and deployment without requiring local dependencies. Preswald leverages a WebAssembly runtime along with technologies like Pyodide and DuckDB to execute Python code directly in the browser environment. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    machine learning tutorials

    machine learning tutorials

    machine learning tutorials (mainly in Python3)

    machine-learning is a continuously updated repository documenting the author’s learning journey through data science and machine learning topics using practical tutorials and experiments. The project presents educational notebooks that combine mathematical explanations with code implementations using Python’s scientific computing ecosystem. Topics covered include classical machine learning algorithms, deep learning models, reinforcement learning, model deployment, and time-series analysis. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Excel MCP Server

    Excel MCP Server

    A Model Context Protocol server for Excel file manipulation

    The Excel MCP Server is a Python-based implementation of the Model Context Protocol that provides Excel file manipulation capabilities without requiring Microsoft Excel installation. It enables workbook creation, data manipulation, formatting, and advanced Excel features.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Vulnhuntr

    Vulnhuntr

    AI tool for detecting complex vulnerabilities in Python codebases

    Vulnhuntr is an open source security tool that uses large language models to analyze codebases and identify remotely exploitable vulnerabilities. It focuses on Python projects and applies static code analysis combined with LLM reasoning to trace how user input flows through an application. Instead of scanning entire repositories at once, it builds call chains step by step, allowing deeper inspection of complex, multi-stage issues that traditional tools may miss. Vulnhuntr can generate detailed findings, including vulnerability explanations and potential exploit paths, helping developers and security teams understand risks faster. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    Local File Organizer

    Local File Organizer

    An AI-powered file management tool that ensures privacy

    ...Through AI-driven analysis, the software can detect themes, topics, and metadata in files, allowing it to organize information in ways that traditional rule-based file managers cannot achieve. The tool supports multiple sorting strategies that allow users to categorize files by content, date, or type depending on their workflow preferences.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    HDBSCAN

    HDBSCAN

    A high performance implementation of HDBSCAN clustering

    ...In practice this means that HDBSCAN returns a good clustering straight away with little or no parameter tuning -- and the primary parameter, minimum cluster size, is intuitive and easy to select. HDBSCAN is ideal for exploratory data analysis; it's a fast and robust algorithm that you can trust to return meaningful clusters (if there are any).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Evidently

    Evidently

    Evaluate and monitor ML models from validation to production

    Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor ML models from validation to production. It works with tabular, text data and embeddings.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    omegaml

    omegaml

    MLOps simplified. From ML Pipeline ⇨ Data Product without the hassle

    omega|ml is the innovative Python-native MLOps platform that provides a scalable development and runtime environment for your Data Products. Works from laptop to cloud.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Superlinked

    Superlinked

    Superlinked is a Python framework for AI Engineers

    Superlinked is a Python framework designed for AI engineers to build high-performance search and recommendation applications that combine structured and unstructured data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    h2oGPT

    h2oGPT

    Private chat with local GPT with document, images, video, etc.

    ...It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including oLLaMa and Mixtral, making it a flexible tool for anyone needing advanced document analysis and AI-driven conversation in a secure, local setup.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    MySQL MCP Server

    MySQL MCP Server

    A Model Context Protocol (MCP) server that enables secure interaction

    The MySQL MCP Server enables secure interaction with MySQL databases, allowing AI assistants to list tables, read data, and execute SQL queries through a controlled interface. It is designed for integration with AI applications like Claude Desktop and should not be run as a standalone Python program. ​
    Downloads: 4 This Week
    Last Update:
    See Project
  • 21
    NannyML

    NannyML

    Detecting silent model failure. NannyML estimates performance

    NannyML is an open-source python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface, and interactive visualizations, is completely model-agnostic, and currently supports all tabular classification use cases.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    fklearn

    fklearn

    Functional Machine Learning

    fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    sktime

    sktime

    A unified framework for machine learning with time series

    sktime is a library for time series analysis in Python. It provides a unified interface for multiple time series learning tasks. Currently, this includes time series classification, regression, clustering, annotation, and forecasting. It comes with time series algorithms and scikit-learn compatible tools to build, tune and validate time series models. Our objective is to enhance the interoperability and usability of the time series analysis ecosystem in its entirety. sktime provides a unified interface for distinct but related time series learning tasks. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Diffgram

    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    From ingesting data to exploring it, annotating it, and managing workflows. Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is world’s first truly open source training data platform that focuses on giving its users an unlimited experience. This is aimed to reduce your data labeling bills and increase your Training Data Quality. Training Data is the art of supervising machines through data. This...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB