
chinese-word-segmentation

Here are 94 public repositories matching this topic...

Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.

  • Updated Sep 18, 2023
  • Java

Python port of SymSpell: 1 million times faster spelling correction and fuzzy search through the Symmetric Delete spelling correction algorithm

  • Updated Dec 21, 2024
  • Python

MicroTokenizer: A lightweight, full-featured Chinese tokenizer designed for educational and research purposes, built to help students understand how tokenizers work. Provides a practical, hands-on approach to understanding NLP concepts, featuring multiple tokenization algorithms and customizable models. Ideal for students, researchers, and NLP enthusiasts.

  • Updated Oct 18, 2024
  • Python
