Stars
Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
Various utilities regarding Levenshtein transducers. (Java)
Sharing the lessons we have been gathering along the way to enable Azure OpenAI at enterprise scale in a secure manner. GPT-RAG core is a Retrieval-Augmented Generation pattern running in Azure, using …
A Self-Organizing Map (SOM) ML model can be used to conduct semantic search that populates the context required by Retrieval-Augmented Generation (RAG) LLMs. This repo contains an example to demonst…
KoalaNLP = Korean + Scala + NLP. A collection of Korean morphological and syntactic analyzers.
Open Korean Text Processor - An Open-source Korean Text Processor
Near duplicate detection in a large collection of text using Locality Sensitive Hashing (LSH).
Sharing the experience and know-how from my Spark projects, covering everything from installation to architecture, source files, and data.
Instruction Tuning with GPT-4
This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.
🦜🔗 Build context-aware reasoning applications
Header-only C++/Python library for fast approximate nearest neighbors
KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)
Polyglot: Large Language Models of Well-balanced Competence in Multi-languages
CKEditor 5 demo with image upload and Sparkjava backend
Powerful rich text editor framework with a modular architecture, modern integrations, and features like collaborative editing.
freeCodeCamp.org's open-source codebase and curriculum. Learn to code for free.
Source code for Twitter's Recommendation Algorithm
This project is deprecated. Check my new project ChatHub: