TuanNguyen27

Class notes for the course "Long Term Memory in AI - Vector Search and Databases" COS 597A @ Princeton Fall 2023

TeX 317 34 Updated Jan 17, 2025

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Python 37,237 3,263 Updated Aug 17, 2024

A hyper-fast local vector database for use with LLM Agents. Now accepting SAFEs at $135M cap.

Python 1,389 86 Updated Feb 7, 2025

Fast SHAP value computation for interpreting tree-based models

Python 534 34 Updated Jun 26, 2023

The release of the Twitter algorithm, annotated for recsys

488 27 Updated Apr 15, 2023

Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask without any rewrites.

Jupyter Notebook 113 19 Updated Mar 29, 2024

Statistical Rethinking Course for Jan-Mar 2023

R 2,246 253 Updated Nov 28, 2023

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Python 10,217 799 Updated Mar 6, 2025

Flyte Documentation 📖

Python 80 124 Updated Feb 12, 2025

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

60,775 6,169 Updated Feb 16, 2025

Neural Networks: Zero to Hero

Jupyter Notebook 13,358 1,849 Updated Aug 18, 2024

by ex-googlers, for ex-googlers - a lookup table of similar tech & services

14,784 1,054 Updated Feb 11, 2025

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 7,448 818 Updated Mar 6, 2025

Approximate Nearest Neighbor Search for Sparse Data in Python!

Python 918 145 Updated Oct 2, 2020

It is my belief that you, the postgraduate students and job-seekers for whom the book is primarily meant will benefit from reading it; however, it is my hope that even the most experienced research…

4,560 299 Updated Jan 21, 2022

Coarse-grained lineage and tracing for machine learning pipelines.

Python 467 29 Updated Nov 11, 2022

A collection of (mostly) technical things every software developer should know about

86,480 7,951 Updated Aug 6, 2024

A light-weight, flexible, and expressive statistical data testing library

Python 3,656 328 Updated Mar 7, 2025

A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

Python 2,045 93 Updated Sep 21, 2024

Source code accompanying O'Reilly book: Machine Learning Design Patterns

Jupyter Notebook 1,944 540 Updated Apr 28, 2021

Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/

Python 2,251 154 Updated Feb 25, 2025

📌 Papers, guides, and mentor interviews on applying machine learning for ApplyingML.com—the ghost knowledge of machine learning.

MDX 198 34 Updated Jun 5, 2024

Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rath…

C++ 810 46 Updated Mar 5, 2025

Automatically visualize your pandas dataframe via a single print! 📊 💡

Python 5,256 370 Updated Mar 20, 2024

Preparation links and resources for system design questions

8,958 2,486 Updated May 10, 2024

📝 Design doc template & examples for machine learning systems (requirements, methodology, implementation, etc.)

581 101 Updated Mar 16, 2023

State of the Art Natural Language Processing

Scala 3,935 722 Updated Mar 7, 2025

MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.

Python 368 72 Updated Mar 5, 2025

A C++ standalone library for machine learning

C++ 5,340 501 Updated Jan 27, 2025

PyTorch implementation of TabNet paper: https://arxiv.org/pdf/1908.07442.pdf

Python 2,710 495 Updated Oct 23, 2024