Stars
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Tools for merging pretrained large language models.
A concise but complete full-attention transformer with a set of promising experimental features from various papers
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
Codebase for Merging Language Models (ICML 2024)
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
[NeurIPS'23] Emergent Correspondence from Image Diffusion
A framework for merging models solving different tasks with different initializations into one multi-task model without any additional training
This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral, Best Student Paper].
Official code repository for NeurIPS 2022 paper "SatMAE: Pretraining Transformers for Temporal and Multi-Spectral Satellite Imagery"
The original implementation of Min et al., "Nonparametric Masked Language Modeling" (paper: https://arxiv.org/abs/2212.01349)
Official code repository for ICLR 2024 paper "DiffusionSat: A Generative Foundation Model for Satellite Imagery"
FusionBench: A Comprehensive Benchmark/Toolkit of Deep Model Fusion
AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR 2024.
Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]
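Several of the repositories above (mergekit-style tools, FusionBench, AdaMerging, and the ICML 2024 merging papers) revolve around model merging. As a rough illustration of the simplest merging baseline only, uniform weight averaging of checkpoints that share an architecture, and not the specific method of any repo listed, here is a minimal PyTorch sketch; `average_state_dicts` is a hypothetical helper name, not an API from these projects.

```python
import torch

def average_state_dicts(state_dicts):
    """Uniformly average parameters across checkpoints with identical architectures.

    Assumes every state dict has the same keys and tensor shapes. Casting to
    float is a simplification: integer buffers (e.g. BatchNorm counters) come
    back as floats and may need special handling in practice.
    """
    merged = {}
    for key in state_dicts[0]:
        merged[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return merged

# Usage: load N fine-tuned checkpoints of the same base model, then
# model.load_state_dict(average_state_dicts([sd1, sd2, sd3])).
```

The starred projects go well beyond this baseline (task vectors, adaptive merging coefficients, localization of task information); the sketch only fixes the basic operation they build on.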