Stars
Official PyTorch implementation of Extract Free Dense Misalignment from CLIP (AAAI'25)
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Implementation of Nougat: Neural Optical Understanding for Academic Documents
[ECCV 2024] Official implementation of the paper "DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs".
The Universe of Data. All about data, data science, and data engineering
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A natural language interface for computers
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models, EMNLP 2023
extract text from any document. no muss. no fuss.
Open source Python library for converting PDF to DOCX.
Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
A user-friendly plug-in that makes it easy to generate Stable Diffusion images inside Photoshop using either Automatic or ComfyUI as a backend.
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023
Learn how to design systems at scale and prepare for system design interviews
Robust Speech Recognition via Large-Scale Weak Supervision
The official code of CornerTransformer (ECCV 2022, Oral) on top of MMOCR.
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes