Starred repositories
A curated list of OpenTelemetry resources
Ongoing research training transformer models at scale
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Work with remote image registries: retrieving information, images, signing content
Fast and Lightweight Observability Data Collector
Architected for speed. Automated for easy. Monitoring and troubleshooting, transformed!
Cloud Native DataOps & AIOps Platform
Lsyncd (Live Syncing Daemon) synchronizes local directories with remote targets
Kubernetes-like control planes for form-factors and use-cases beyond Kubernetes and container workloads.
Kubernetes IN Docker - local clusters for testing Kubernetes
Simple, safe way to store and distribute tensors
DeepDiff: Deep Difference and search of any Python object/data. DeepHash: Hash of any object based on its contents. Delta: Use deltas to reconstruct objects by adding deltas together.
Infrastructures™ for Machine Learning Training/Inference in Production.
Development repository for the Triton language and compiler
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
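Hangzhou home-buying guide: a walkthrough compiled from personal home-buying experience, covering the new-build purchase lottery and choosing a second-hand home, with extensive Hangzhou urban planning material.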
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
Profiling and inspecting memory in pytorch
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
eBPF-based Cloud Native Monitoring Tool
An open-source, cloud-native, unified time series database for metrics, logs and events with SQL/PromQL supported. Available on GreptimeCloud.
Clean up Kubernetes yaml and json output to make it readable
High-performance LLM inference based on our optimized version of FasterTransformer
A command-line productivity tool powered by AI large language models like GPT-4 that helps you accomplish your tasks faster and more efficiently.
Distributed ML Training and Fine-Tuning on Kubernetes