Stars
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
OpenMMLab Pre-training Toolbox and Benchmark
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Distributed SQL Query Engine in Python using Ray
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
This is the official repository for M2UGen
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
Curated list of project-based tutorials
Tools to download and clean up Common Crawl data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Apache Spark - A unified analytics engine for large-scale data processing
ClickHouse® is a real-time analytics database management system
Go HTTP framework with high performance and strong extensibility for building microservices.
astaxie / beego
Forked from beego/beego
beego is an open-source, high-performance web framework for the Go programming language.
A Go ebook introducing how to build a web application with Go
Leetcode algorithm solutions together with self-made teaching videos
Mining synonyms from unstructured and semi-structured data
A programmer who loves cooking, determined to use animations to make algorithms easy to understand. My interview site: www.chengxuchu.com
Provide all my solutions and explanations in Chinese for all the Leetcode coding problems.