IndicNLP library now on pip!

Chat Templates for 🤗 HuggingFace Large Language Models

Jinja 603 55 Updated Dec 13, 2024

Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.

68,369 7,244 Updated Aug 16, 2024

Access a database of word frequencies, in various natural languages.

Python 1,433 104 Updated Jan 4, 2025

Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13

Python 166 20 Updated Nov 5, 2024

Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

Python 550 59 Updated May 9, 2024

This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs

Jupyter Notebook 174 14 Updated Oct 12, 2023

A list of publicly available audio data that anyone can download for ASR or other speech activities

Shell 203 22 Updated Aug 6, 2021

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

1,822 232 Updated Jun 6, 2024

🕷️ The pipeline for the OSCAR corpus

Rust 166 14 Updated Dec 18, 2023

Yet Another Neural Machine Translation Toolkit

Python 176 32 Updated Jul 3, 2024

OpenNyAI is a mission aimed at developing open source software and datasets to catalyze the creation of AI-powered solutions to improve access to justice in India. BUILD is the first benchmark data…

Python 37 19 Updated Apr 17, 2024

Custom work around the Universal Declaration of Human Rights in Unicode

Python 1 Updated Aug 11, 2021

Aksharamukha Python Library

Python 44 15 Updated Feb 2, 2025
Jupyter Notebook 3 Updated Jul 5, 2021

A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)

Python 1,107 130 Updated Aug 28, 2024

These are lists for a variety of languages containing words that are distinctive to each language.

35 4 Updated Apr 5, 2022

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

27,689 3,725 Updated Jul 18, 2024

A machine translation reading list maintained by Tsinghua Natural Language Processing Group

TeX 2,437 449 Updated Aug 9, 2024

Appraise code used as part of WMT21 human evaluation campaign

Python 23 12 Updated Jan 10, 2025

Exploring representations for word similarity in Hindi

3 Updated Apr 19, 2021

Some useful tips for faiss

Shell 615 47 Updated Nov 2, 2023

A neural word aligner based on multilingual BERT

Python 338 49 Updated Mar 10, 2022

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Python 3,920 279 Updated Feb 8, 2025

Neural Machine Translation Toolkit by Natlang Laboratory at SFU

Python 8 6 Updated Mar 16, 2024

A collection of links and notes on forced alignment tools

Python 887 86 Updated Nov 10, 2021

Article extraction benchmark: dataset and evaluation scripts

Python 300 30 Updated Apr 24, 2024

Aksharamukha

Vue 168 43 Updated Feb 2, 2025

Fast and robust date extraction from web pages, with Python or on the command-line

Python 121 26 Updated Dec 30, 2024