gentaiscool
:octocat:
Writing interesting code...

Highlights

  • Pro

Organizations

@HLTCHKUST @audioku @indobenchmark


Fully open reproduction of DeepSeek-R1

Python 18,144 1,515 Updated Feb 9, 2025

Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya

Python 106 7 Updated Feb 2, 2025

Grassroots Science Website

CSS 1 Updated Dec 9, 2024

potato: portable text annotation tool

Jupyter Notebook 316 53 Updated Jan 16, 2025

Processed / Cleaned Data for Paper Copilot

262 9 Updated Feb 10, 2025
1 Updated Nov 16, 2024

GitHub repo for “Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey”

4 Updated Nov 10, 2024

EZswitch is a framework designed to generate code-switched text, blending two languages within a single sentence or discourse. This tool incorporates Equivalence Constraint Theory (ECT) with and (L…

OCaml 3 1 Updated Oct 31, 2024
JavaScript 1 Updated Oct 30, 2024

WorldCuisines is an extensive multilingual and multicultural benchmark that spans 30 languages, covering a wide array of global cuisines.

Jupyter Notebook 16 3 Updated Oct 27, 2024

RewardBench: the first evaluation tool for reward models.

Python 500 57 Updated Feb 8, 2025

MetaMetrics is a calibrated meta-metric designed to evaluate generation tasks across different modalities in alignment with human preferences.

Python 12 1 Updated Dec 30, 2024

A curated list of research papers and resources on Cultural LLM.

35 2 Updated Sep 26, 2024

Resources for cultural NLP research

85 12 Updated Jan 20, 2025
Jupyter Notebook 3 1 Updated May 31, 2024

A Python module for getting the GPU status from NVIDIA GPUs programmatically using nvidia-smi

Python 1,162 122 Updated Apr 13, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 6 11 Updated Dec 23, 2024

ANSI color formatting for output in terminal

Python 237 27 Updated Jan 6, 2025

Mexican NLP 2024 Summerschool Tutorial on Knowledge Distillation and Parameter Efficient Finetuning

8 Updated Jun 17, 2024

A Python implementation of global optimization with Gaussian processes.

Python 8,063 1,556 Updated Jan 2, 2025

A library to calculate similarity scores between two collections of text sequences encoded using transformer models for bitext mining, dense retrieval, retrieval-based classification, and retrieval…

Python 5 3 Updated Jun 22, 2024

A library of translation-based text similarity measures

Python 25 6 Updated Dec 11, 2023

BERT score for text generation

Jupyter Notebook 1,668 225 Updated Jul 30, 2024

MTEB: Massive Text Embedding Benchmark

Jupyter Notebook 2,156 314 Updated Feb 10, 2025

Implementation of ProxyLM, a scalable and efficient LM performance prediction framework for NLP tasks using proxy models

Python 6 1 Updated Dec 6, 2024

Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.

Python 15 4 Updated Oct 3, 2024

MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models. (EMNLP 2024 Findings)

Python 13 6 Updated Oct 3, 2024

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.

Python 75 57 Updated Jan 24, 2025
Jupyter Notebook 6 Updated Jun 24, 2024

IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems

1 Updated Jun 10, 2024