Stars
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
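For context, a minimal sketch of how OCRmyPDF is typically invoked through its Python API (it is primarily a CLI: `ocrmypdf input.pdf output.pdf`); the filenames and language code here are placeholders, not from the entry above:

```python
import ocrmypdf

# Adds a searchable OCR text layer on top of the scanned pages;
# the original page images are preserved. Filenames are placeholders.
ocrmypdf.ocr("scanned.pdf", "searchable.pdf", language="eng")
```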
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Official Implementation for NeurIPS'23 paper Contrast, Attend and Diffuse to Decode High-Resolution Images from Brain Activities
[TAC 2024] SVFAP: Self-supervised Video Facial Affect Perceiver
Official PyTorch implementation for "Large Language Diffusion Models"
FremyCompany / fast_align
Forked from clab/fast_align: Simple, fast unsupervised word aligner
Official Repo for the Paper "AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text"
A dual-stream transformer-based turn-taking model
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
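As a quick illustration, a minimal zero-shot matching sketch with the openai/CLIP package (the image path and candidate captions are placeholders):

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Placeholder image and candidate text snippets.
image = preprocess(Image.open("example.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    # CLIP scores every (image, text) pair; a softmax over the texts
    # gives the most relevant snippet for the image.
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1)

print(probs)
```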
TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models
Code for the submission https://www.arxiv.org/abs/2502.12317
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
Code and Results for "Universals of word order reflect optimization of grammars for efficient communication"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A high-throughput and memory-efficient inference and serving engine for LLMs
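A minimal offline-generation sketch with vLLM (the model name and prompt are placeholders):

```python
from vllm import LLM, SamplingParams

# Placeholder model; any supported Hugging Face causal LM works here.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# vLLM batches prompts and manages KV-cache memory for
# high-throughput generation.
outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```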
Overview of German Text Simplification Resources, e.g., DEplain corpus, web harvester, alignment methods.
Repository to accompany the SIGDIAL 2024 paper: Elaborative Simplification for German-language Texts
A* CCG Parser with a Supertag and Dependency Factored Model
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Code for Arehalli, Dillon & Linzen (2022) "Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities"
Unsupervised Sentence Simplification via Dependency Parsing
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
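This description matches the wtpsplit toolkit; assuming that is the repo, a minimal sentence-segmentation sketch (the `sat-3l` checkpoint name is an assumption, one of the published model sizes):

```python
from wtpsplit import SaT

# "sat-3l" is an assumed published checkpoint name (3-layer model).
sat = SaT("sat-3l")

# Splits raw text into sentences without relying on punctuation alone.
sentences = sat.split("This is a test This is another test.")
print(sentences)
```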