Search Results for "python data analysis" - Page 16

Showing 4115 open source projects for "python data analysis"

  • 1
    Rapid LaTeX OCR

    Formula recognition based on LaTeX-OCR and ONNXRuntime

    Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool that converts formula images to LaTeX. The inference code in the repo is adapted from LaTeX-OCR; the models have all been converted to ONNX format and the inference code has been simplified, so inference is faster and easier to deploy. The repo contains only ONNX-format inference code based on ONNXRuntime or OpenVINO and does not include model training code. If you want to train your own model, please move to...
    Downloads: 6 This Week
  • 2
    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages.
    Downloads: 4 This Week
  • 3
    AI Hedge Fund

    An AI Hedge Fund Team

    This repository demonstrates how to build a simplified, automated hedge fund strategy powered by AI/ML. It integrates financial data collection, preprocessing, feature engineering, and predictive modeling to simulate decision-making in trading. The code shows workflows for pulling stock or market data, applying machine learning algorithms to forecast trends, and generating buy/sell/hold signals based on the predictions. Its structure is educational: intended more as a proof-of-concept than a...
    Downloads: 4 This Week
  • 4
    AWS SDK for pandas

    Easy integration with Athena, Glue, Redshift, Timestream, Neptune

    aws-sdk-pandas (formerly AWS Data Wrangler) bridges pandas with the AWS analytics stack so DataFrames flow seamlessly to and from cloud services. With a few lines of code, you can read from and write to Amazon S3 in Parquet/CSV/JSON/ORC, register tables in the AWS Glue Data Catalog, and query with Amazon Athena directly into pandas. The library abstracts efficient patterns like partitioning, compression, and vectorized I/O so you get performant data lake operations without hand-rolling boilerplate. ...
    Downloads: 0 This Week
  • 5
    Kedro

    A Python framework for creating reproducible, maintainable code

    Kedro is an open-source Python framework for creating maintainable and modular data science code. It provides the scaffolding to build more complex data and machine-learning pipelines. It also focuses on reducing the time spent on the tedious "plumbing" required to maintain data science code, which leaves more time to solve new problems. It standardises team workflows; the modular structure of Kedro facilitates a higher level of collaboration when teams solve problems together. ...
    Downloads: 0 This Week
  • 6
    Open Wearables

    Self-hosted platform to unify wearable health data

    Open Wearables is an open-source initiative that aims to provide a community-driven ecosystem for wearable device software and interoperability by connecting sensor data, activity tracking, and health insights across multiple platforms and devices. Instead of relying on closed vendor ecosystems, the project provides standardized data models and APIs that let developers and hobbyists collect, sync, and analyze biometric and environmental data from wearables, DIY sensors, and open hardware...
    Downloads: 6 This Week
  • 7
    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an...
    Downloads: 51 This Week
  • 8
    Great Expectations

    Always know what to expect from your data

    Great Expectations helps data teams eliminate pipeline debt, through data testing, documentation, and profiling. Software developers have long known that testing and documentation are essential for managing complex codebases. Great Expectations brings the same confidence, integrity, and acceleration to data science and data engineering teams. Expectations are assertions for data. They are the workhorse abstraction in Great Expectations, covering all kinds of common data issues. Expectations...
    Downloads: 1 This Week
  • 9
    INTERCEPT

    Unites the best signal intelligence tools

    iNTERCEPT is a web-based interface that brings multiple software-defined radio and signal-intelligence style tools under one consistent dashboard, making complex workflows more approachable. Rather than requiring you to learn a different UI and setup process for each underlying utility, it provides a single place to start modes, view results, and monitor activity from a browser. The project’s goal is accessibility: lowering the skill and setup barrier so learners and authorized testers can...
    Downloads: 6 This Week
  • 10
    pgsync

    Postgres to Elasticsearch/OpenSearch sync

    pgsync is a middleware for syncing data from Postgres to Elasticsearch/OpenSearch. It keeps Postgres as the source of truth and exposes structured, denormalized documents in the search index.
    Downloads: 0 This Week
  • 11
    Flyte
    Build production-grade data and ML workflows, hassle-free

    The infinitely scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. Don’t let friction between development and production slow down the deployment of new data/ML workflows and cause an increase in production bugs. Flyte enables rapid experimentation with production-grade software. Debug in the cloud by iterating on the workflows locally to achieve tighter feedback loops. As your...
    Downloads: 2 This Week
  • 12
    srsly

    Modern high-performance serialization utilities for Python

    This package bundles some of the best Python serialization libraries into one standalone package, with a high-level API that makes it easy to write code that's correct across platforms and Pythons. This allows us to provide all the serialization utilities we need in a single binary wheel. Currently supports JSON, JSONL, MessagePack, Pickle, and YAML. Serialization is hard, especially across Python versions and multiple platforms. After dealing with many subtle bugs over the years (encodings,...
    Downloads: 1 This Week
  • 13
    SDGym

    Benchmarking synthetic data generation methods

    The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You can also customize the process to include your own work. Select any of the publicly available datasets from the...
    Downloads: 4 This Week
  • 14
    AIOHTTP

    Asynchronous HTTP client/server framework for asyncio and Python

    ...The main change is dropping yield from support and using async/await everywhere. Farewell, Python 3.4. You often want to send some sort of data in the URL’s query string. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, e.g. httpbin.org/get?key=val. aiohttp allows you to provide these arguments as a dict, using the params keyword argument. aiohttp internally performs URL canonicalization before sending the request.
    Downloads: 13 This Week
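
    The key/value query string described in the AIOHTTP entry can be illustrated with the standard library alone. This is a minimal sketch, not aiohttp's own code: build_url is a hypothetical helper, and the second parameter ("q") is made up for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base, params):
    """Append a dict of parameters to `base` as a query string.

    Hypothetical stdlib stand-in for what passing a `params` dict
    achieves; it is not how aiohttp implements it internally.
    """
    query = urlencode(params)  # percent-encodes values, joins pairs with '&'
    return f"{base}?{query}" if query else base


url = build_url("http://httpbin.org/get", {"key": "val", "q": "python data"})
print(url)

# Parsing the query string back recovers the original key/value pairs.
print(parse_qs(urlsplit(url).query))
```

    Building the query from a dict rather than by hand keeps the encoding of spaces and reserved characters in one place.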
  • 15
    BruteForceAI

    Advanced LLM-powered brute-force tool combining AI intelligence

    BruteForceAI is an open-source security testing tool that applies large language models to the analysis of login forms and authentication flows in web applications. At a high level, the project uses AI to inspect HTML content, identify the relevant form elements, and automate selector discovery so that a tester does not need to hand-map every field before evaluation. It combines that analysis layer with automated credential testing workflows, framing itself as a more adaptive alternative to...
    Downloads: 0 This Week
  • 16
    Dataclasses JSON

    Easily serialize Data Classes to and from JSON

    This library provides a simple API for encoding and decoding dataclasses to and from JSON.
    Downloads: 0 This Week
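
    To make the idea of encoding and decoding dataclasses to and from JSON concrete, here is a small round trip using only the standard library. It is a sketch of the concept and does not use or reproduce the Dataclasses JSON API; the Project class and the to_json/from_json helpers are hypothetical names chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Project:
    # Hypothetical record type used only for this illustration.
    name: str
    downloads: int


def to_json(obj):
    """Encode a dataclass instance as a JSON string."""
    return json.dumps(asdict(obj))


def from_json(cls, text):
    """Decode a JSON object into an instance of a flat dataclass."""
    return cls(**json.loads(text))


encoded = to_json(Project("Kedro", 0))
decoded = from_json(Project, encoded)
print(encoded)
print(decoded)
```

    This hand-rolled version only handles flat dataclasses with JSON-native field types; nested or custom types are where a dedicated library earns its keep.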
  • 17
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    ...These tasks can be anything, but are typically long-running things like Hadoop jobs, dumping data to/from databases, running machine learning algorithms, or anything else. You can build pretty much any task you want, but Luigi also comes with a toolbox of several common task templates that you can use. It includes support for running Python mapreduce jobs in Hadoop, as well as Hive and Pig jobs. It also comes with file system abstractions for HDFS and local files that ensure all file system operations are atomic.
    Downloads: 0 This Week
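
    The atomic file operations mentioned in the Luigi entry are commonly achieved on a local file system by writing to a temporary file and renaming it into place. The sketch below shows that general pattern with the standard library; atomic_write is a hypothetical helper and is not taken from Luigi's code.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write `data` to `path` so readers never observe a partial file.

    Hypothetical helper: the temporary file lives in the target directory
    so the final rename stays on one file system.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(data)
        os.replace(tmp_path, path)  # atomic rename over any existing file
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "out.txt")
    atomic_write(target, "done")
    with open(target, encoding="utf-8") as handle:
        print(handle.read())
```

    A downstream task that checks for the output file then sees either the complete file or nothing at all, which is what makes "output exists" a safe completion signal.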
  • 18
    X-osint

    Open source OSINT tool for gathering data on emails, phones, and IPs

    ...In addition to network and domain intelligence, it includes features for extracting metadata from files or images and analyzing text content to uncover hidden details. X-osint is written primarily in Python and is designed to run in terminal environments, particularly on Linux systems and Termux setups.
    Downloads: 47 This Week
  • 19
    openTSNE

    Extensible, parallel implementations of t-SNE

    openTSNE is a modular Python implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE) [1], a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings [2], massive speed improvements [3] [4] [5], enabling t-SNE to scale to millions of data points, and various tricks to improve the global alignment of the resulting visualizations.
    Downloads: 0 This Week
  • 20
    JupyterLab LaTeX

    JupyterLab extension for live editing of LaTeX documents

    An extension for JupyterLab which allows for live-editing of LaTeX documents. To use, right-click on an open .tex document within JupyterLab, and select Show LaTeX Preview. This extension includes both a notebook server extension (which interfaces with the LaTeX compiler) and a lab extension (which provides the UI for the LaTeX preview). The Python package named jupyterlab_latex provides both of them as a prebuilt extension.
    Downloads: 0 This Week
  • 21
    jsonschema

    An implementation of the JSON Schema specification for Python

    jsonschema is an implementation of the JSON Schema specification for Python. Full support for Draft 2020-12, Draft 2019-09, Draft 7, Draft 6, Draft 4 and Draft 3. Lazy validation that can iteratively report all validation errors. Programmatic querying of which properties or items failed validation.
    Downloads: 0 This Week
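
    "Lazy validation that can iteratively report all validation errors" means errors are produced one at a time instead of stopping at the first failure. The generator below sketches that idea in plain Python. It is not the jsonschema API and does not implement JSON Schema; iter_problems and its simplified field-to-type mapping are hypothetical.

```python
def iter_problems(instance, required):
    """Lazily yield one message per problem found in `instance`.

    `required` maps field names to expected Python types; this is a
    simplified stand-in for a real schema, used only for illustration.
    """
    for field, expected in required.items():
        if field not in instance:
            yield f"{field}: missing"
        elif not isinstance(instance[field], expected):
            yield f"{field}: expected {expected.__name__}"


rules = {"name": str, "downloads": int}

# Consuming the generator fully reports every problem, not just the first.
errors = list(iter_problems({"name": 3}, rules))
print(errors)
```

    Because the checker is a generator, a caller can also stop after the first message when a quick pass/fail answer is all it needs.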
  • 22
    JC

    CLI tool and python library

    ...The JC parsers can also be used as Python modules. In this case, the output will be a Python dictionary, or a list of dictionaries, instead of JSON. Two representations of the data are available. The default representation uses a strict schema per parser and converts known numbers to int/float JSON values. Certain known values of None are converted to JSON null, known boolean values are converted, and, in some cases, additional semantic context fields are added.
    Downloads: 0 This Week
  • 23
    Autograd

    Efficiently computes derivatives of numpy code

    Autograd can automatically differentiate native Python and Numpy code. It can handle a large subset of Python's features, including loops, ifs, recursion and closures, and it can even take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation), which means it can efficiently take gradients of scalar-valued functions with respect to array-valued arguments, as well as forward-mode differentiation, and the two can be composed arbitrarily....
    Downloads: 0 This Week
  • 24
    Awesome-Quant

    A curated list of insanely awesome libraries, packages and resources

    awesome-quant is a curated list (“awesome list”) of libraries, packages, articles, and resources for quantitative finance (“quants”). It includes tools, frameworks, research papers, blogs, datasets, etc. It aims to help people working in algorithmic trading, quant investing, financial engineering, etc., find useful open source or educational resources. Licensed under typical “awesome” list standards.
    Downloads: 2 This Week
  • 25
    mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python

    mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest CommonMark-compliant Markdown parser implementation in pure Python, mistletoe also supports easy definitions of custom tokens. Parsing Markdown into an abstract syntax tree also allows us to swap out renderers for different output formats, without touching any of the core components.
    Downloads: 0 This Week