Tuan Nguyen TuanNguyen27

☀️

Data scientist @salesforce

29 followers · 60 following

Salesforce Einstein
San Francisco, CA
@tnguyen277

Stars

edoliberty / vector-search-class-notes

Class notes for the course "Long Term Memory in AI - Vector Search and Databases" COS 597A @ Princeton Fall 2023

TeX 317 34 Updated Jan 17, 2025

LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Python 37,237 3,263 Updated Aug 17, 2024

jdagdelen / hyperDB

A hyper-fast local vector database for use with LLM Agents. Now accepting SAFEs at $135M cap.

Python 1,389 86 Updated Feb 7, 2025

linkedin / FastTreeSHAP

Fast SHAP value computation for interpreting tree-based models

Python 534 34 Updated Jun 26, 2023

igorbrigadir / awesome-twitter-algo

The release of the Twitter algorithm, annotated for recsys

488 27 Updated Apr 15, 2023

fugue-project / tutorials

Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask without any rewrites.

Jupyter Notebook 113 19 Updated Mar 29, 2024

rmcelreath / stat_rethinking_2023

Statistical Rethinking Course for Jan-Mar 2023

R 2,246 253 Updated Nov 28, 2023

cleanlab / cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Python 10,217 799 Updated Mar 6, 2025

flyteorg / flytesnacks

Flyte Documentation 📖

Python 80 124 Updated Feb 12, 2025

binhnguyennus / awesome-scalability

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

60,775 6,169 Updated Feb 16, 2025

karpathy / nn-zero-to-hero

Neural Networks: Zero to Hero

Jupyter Notebook 13,358 1,849 Updated Aug 18, 2024

jhuangtw / xg2xg

by ex-googlers, for ex-googlers - a lookup table of similar tech & services

14,784 1,054 Updated Feb 11, 2025

bentoml / BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 7,448 818 Updated Mar 6, 2025

facebookresearch / pysparnn

Approximate Nearest Neighbor Search for Sparse Data in Python!

Python 918 145 Updated Oct 2, 2020

BoltzmannEntropy / interviews.ai

It is my belief that you, the postgraduate students and job-seekers for whom the book is primarily meant will benefit from reading it; however, it is my hope that even the most experienced research…

4,560 299 Updated Jan 21, 2022

loglabs / mltrace

Coarse-grained lineage and tracing for machine learning pipelines.

Python 467 29 Updated Nov 11, 2022

mtdvio / every-programmer-should-know

A collection of (mostly) technical things every software developer should know about

86,480 7,951 Updated Aug 6, 2024

unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library

Python 3,656 328 Updated Mar 7, 2025

fugue-project / fugue

A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

Python 2,045 93 Updated Sep 21, 2024

GoogleCloudPlatform / ml-design-patterns

Source code accompanying O'Reilly book: Machine Learning Design Patterns

Jupyter Notebook 1,944 540 Updated Apr 28, 2021

patrick-kidger / equinox

Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/

Python 2,251 154 Updated Feb 25, 2025

eugeneyan / applyingml

📌 Papers, guides, and mentor interviews on applying machine learning for ApplyingML.com—the ghost knowledge of machine learning.

MDX 198 34 Updated Jun 5, 2024

tuplex / tuplex

Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rath…

C++ 810 46 Updated Mar 5, 2025

lux-org / lux

Automatically visualize your pandas dataframe via a single print! 📊 💡

Python 5,256 370 Updated Mar 20, 2024

shashank88 / system_design

Preparation links and resources for system design questions

8,958 2,486 Updated May 10, 2024

eugeneyan / ml-design-docs

📝 Design doc template & examples for machine learning systems (requirements, methodology, implementation, etc.)

581 101 Updated Mar 16, 2023

JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing

Scala 3,935 722 Updated Mar 7, 2025

mlcommons / algorithmic-efficiency

MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.

Python 368 72 Updated Mar 5, 2025

flashlight / flashlight

A C++ standalone library for machine learning

C++ 5,340 501 Updated Jan 27, 2025

dreamquark-ai / tabnet

PyTorch implementation of TabNet paper : https://arxiv.org/pdf/1908.07442.pdf

Python 2,710 495 Updated Oct 23, 2024
