Stars
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Build multimodal language agents for fast prototype and production
Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas
OpenBMB / mlc-MiniCPM
Forked from mlc-ai/mlc-llm
MiniCPM on Android platform.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
[TMLR 2024] Efficient Large Language Models: A Survey
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (…
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML 2023)
Everything about the SmolLM & SmolLM2 family of models
On-device AI across mobile, embedded and edge for PyTorch
Retrieval and Retrieval-augmented LLMs
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Kubernetes-native Deep Learning Framework
VPTQ, A Flexible and Extreme low-bit quantization algorithm
PyTorch native quantization and sparsity for training and inference
An app that brings language models directly to your phone.