- Melbourne
- www.xinliang.co
Stars
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Stable Diffusion web UI
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
[WIP] Layer Diffusion for WebUI (via Forge)
ControlNet++: All-in-one ControlNet for image generations and editing!
Google Research
llama3 implementation one matrix multiplication at a time
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with comma…
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight) / When Does Perceptual Alignment Benefit Vision Representations? (NeurIPS 2024)
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
A high-throughput and memory-efficient inference and serving engine for LLMs
Large Language Model Text Generation Inference
🚀 Free NextJS Landing Page Template written in Tailwind CSS 3 and TypeScript ⚡️ Made with developer experience first: Next.js 14 + TypeScript + ESLint + Prettier + Husky + Lint-Staged + VSCode + Ne…
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Zero-shot Image-to-Image Translation [SIGGRAPH 2023]
LAVIS - A One-stop Library for Language-Vision Intelligence
🦜🔗 Build context-aware reasoning applications
An open source implementation of CLIP.
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
FinRL: Financial Reinforcement Learning. 🔥
For trading. Please star.
A latent text-to-image diffusion model
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch