Stars
📄 A curated list of awesome .cursorrules files
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Implementation of various JWx (JavaScript Object Signing and Encryption/JOSE) technologies
Lightweight, idiomatic and composable router for building Go HTTP services
Golang library for managing configuration data from environment variables
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Converts most PDF files into text-only PDF files. The script strips a PDF document of all images, designs, etc. and keeps only the text.
A Python library for fast, interactive geospatial vector data visualization in Jupyter.
STUMPY is a powerful and scalable Python library for modern time series analysis
Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖
An extended CommonMark-compliant parser, with bridges to docutils/sphinx
NumPy-aware dynamic Python compiler using LLVM
Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Modin: Scale your Pandas workflows by changing a single line of code
Plugin for Obsidian, providing search/replace functionality which supports regular expressions and selections.
Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library.
Interactive Data Visualization in the browser, from Python
Examples of PyMC models, including a library of Jupyter notebooks.
dbt macros to stage external sources
State-of-the-Art Text Embeddings
A proof of concept tool for using ChatGPT to transform messy text documents into structured JSON