Stars
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
Security scanner detecting Python Pickle files performing suspicious actions
Code for paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment"
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
An authorization library that supports access control models like ACL, RBAC, ABAC in Golang: https://discord.gg/S5UjpzGZjN
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations
Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
TypeChat is a library that makes it easy to build natural language interfaces using types.
A framework for few-shot evaluation of language models.
10x easier, 140x lower storage cost, high performance, petabyte scale - Elasticsearch/Splunk/Datadog alternative for logs, metrics, traces, RUM, Error tracking, Session replay.
A library that allows you to easily mock out tests based on AWS infrastructure.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Code and documentation to train Stanford's Alpaca models, and generate the data.
Guides, papers, lecture, notebooks and resources for prompt engineering
The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the measurement of fairness in large scale machine learning workflows.
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable,…
A set of modules for (mis)using python features
Algorithms for outlier, adversarial and drift detection
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many mo…
Backup and migrate Kubernetes applications and their persistent volumes
This repository aims to map the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and beyond.
Fast, easy and reliable testing for anything that runs in a browser.
Terratest is a Go library that makes it easier to write automated tests for your infrastructure code.