
Organizations

@TakeLab


A benchmark with locally sourced multilingual questions for 31 languages.

Python 7 2 Updated Apr 15, 2025

Toolkit for linearizing PDFs for LLM datasets/training

Python 11,847 807 Updated Apr 23, 2025

Jupyter Notebook 7 Updated Mar 11, 2025

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…

TypeScript 15,711 1,612 Updated Apr 12, 2025

Simple RL training for reasoning

Python 3,490 260 Updated Apr 10, 2025

A bibliography and survey of the papers surrounding o1

TeX 1,188 50 Updated Nov 16, 2024

A unified interface for computing surprisal (log probabilities) from language models! Supports neural, symbolic, and black-box API models.

Python 39 10 Updated Dec 17, 2024

A professionally curated list of awesome Conformal Prediction videos, tutorials, books, papers, PhD and MSc theses, articles and open-source libraries.

691 56 Updated Apr 21, 2025

The Benchmark of Linguistic Minimal Pairs

Python 150 13 Updated Dec 13, 2022

The n-gram Language Model

C 1,416 100 Updated Aug 5, 2024

PyTorch native post-training library

Python 5,116 578 Updated Apr 23, 2025

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Python 3,343 417 Updated Apr 7, 2025

Dermatology ddx dataset, Jax implementations of Monte Carlo conformal prediction, plausibility regions and statistical annotation aggregation from our recent work on uncertain ground truth (TMLR'23…

Python 647 44 Updated Mar 28, 2024

Grok open release

Python 50,235 8,346 Updated Aug 30, 2024

The official PyTorch implementation of Google's Gemma models

Python 5,424 534 Updated Mar 21, 2025

Run code inference-only benchmarks quickly using vLLM

Python 11 Updated Mar 20, 2025

Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024

Python 20 5 Updated Nov 13, 2024

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 15,985 2,691 Updated Dec 18, 2024

Large-scale multi-document summarization dataset and code

Python 281 51 Updated May 8, 2023

Enhancing small language models with LLM generated counterfactuals.

Python 5 Updated Jan 8, 2025

Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning

Python 33 2 Updated Jan 9, 2025

The first real AI developer

Python 32,610 3,305 Updated Mar 4, 2025

Code associated with NLPeer: A unified resource for the study of peer review

Python 15 2 Updated Feb 17, 2025

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,708 354 Updated Dec 7, 2024

INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.

Java 626 159 Updated Apr 21, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,379 663 Updated Apr 23, 2025

Python 37 1 Updated Feb 13, 2023

Generate automated tests for your Node.js app via LLMs without developers having to write a single line of code.

JavaScript 1,770 98 Updated Apr 22, 2025