-
pdfplumber Public
Forked from jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera, and easily extract text and tables.
Python MIT License Updated Feb 5, 2020 -
pdf2htmlEX-1 Public
Forked from pdf2htmlEX/pdf2htmlEX
Convert PDF to HTML without losing text or format.
HTML Other Updated Feb 2, 2020 -
Kashgari Public
Forked from BrikerMan/Kashgari
Kashgari is a Production-ready NLP Transfer learning framework for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Python Apache License 2.0 Updated Jan 28, 2020 -
CRAFT-pytorch Public
Forked from clovaai/CRAFT-pytorch
Official implementation of Character Region Awareness for Text Detection (CRAFT)
Python MIT License Updated Jan 27, 2020 -
Chinese-BERT-wwm Public
Forked from ymcui/Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
Python Apache License 2.0 Updated Jan 21, 2020 -
models-1 Public
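Forked from PaddlePaddle/models
Pre-trained and Reproduced Deep Learning Models (official PaddlePaddle model zoo, including a variety of deep learning models from the academic frontier and validated in industrial scenarios)
Python Apache License 2.0 Updated Jan 18, 2020 -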
kaldi Public
Forked from kaldi-asr/kaldi
This is the official location of the Kaldi project.
Shell Other Updated Jan 18, 2020 -
funNLP Public
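Forked from fighting41love/funNLP
Chinese and English sensitive words, language detection, Chinese and international mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese personal name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, continuous English text segmentation, various Chinese word vectors, company name list, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …
Python Updated Jan 14, 2020 -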
HanLP Public
Forked from hankcs/HanLP
Natural Language Processing for the next decade
Python Apache License 2.0 Updated Jan 10, 2020 -
ALBERT Public
Forked from google-research/albert
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Python Updated Jan 9, 2020 -
albert_zh Public
Forked from brightmart/albert_zh
A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pre-trained ALBERT models
Python Updated Jan 7, 2020 -
speech_recognition Public
Forked from Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
Python BSD 3-Clause "New" or "Revised" License Updated Jan 4, 2020 -
bert Public
Forked from google-research/bert
TensorFlow code and pre-trained models for BERT
Python Apache License 2.0 Updated Jan 3, 2020 -
models Public
Forked from tensorflow/models
Models and examples built with TensorFlow
Python Apache License 2.0 Updated Jan 2, 2020 -
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
Python Apache License 2.0 Updated Jan 2, 2020 -
stanfordnlp Public
Forked from stanfordnlp/stanza
Official Stanford NLP Python Library for Many Human Languages
Python Other Updated Dec 25, 2019 -
nlp-journey Public
Forked from msgi/nlp-journey
Documents, papers, and code related to NLP, including topic models, word embeddings, named entity recognition, text classification, text generation, text similarity computation, machine translation, and more, covering various …
Python Updated Dec 23, 2019 -
bert-as-service Public
Forked from jina-ai/clip-as-service
Mapping a variable-length sentence to a fixed-length vector using BERT model
Python MIT License Updated Dec 23, 2019 -
document-ocr Public
Forked from rockyzhengwu/document-ocr
A relatively complete document analysis and recognition project
Python Apache License 2.0 Updated Dec 11, 2019 -
pyecharts Public
Forked from pyecharts/pyecharts
🎨 Python Echarts Plotting Library
Python MIT License Updated Nov 29, 2019 -
Latex_OCR-wotanghaole Public
Forked from Hackathon2019EastChina/Latex_OCR-wotanghaole
Latex_OCR
Jupyter Notebook Updated Nov 24, 2019 -
ASRT_SpeechRecognition Public
Forked from nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
Python GNU General Public License v3.0 Updated Nov 15, 2019 -
dialogue-utterance-rewriter Public
Forked from chin-gyou/dialogue-utterance-rewriter
Python Updated Nov 12, 2019 -
image2text Public
Forked from prabhakar267/image2text
📋 Python wrapper to grab text from images and save as text files using Tesseract Engine
Python Updated Oct 25, 2019 -
docker-pdf2htmlex Public
Forked from ukwa/docker-pdf2htmlex
Run pdf2htmlEX in a Docker container.
Python Apache License 2.0 Updated Oct 21, 2019 -
chinese_ocr Public
Forked from YCG09/chinese_ocr
CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras
Python Apache License 2.0 Updated Oct 8, 2019 -
python_speech_features Public
Forked from jameslyons/python_speech_features
This library provides common speech features for ASR including MFCCs and filterbank energies.
Python MIT License Updated Sep 13, 2019 -
Data-for-LaTeX_OCR Public
Forked from LinXueyuanStdio/Data-for-LaTeX_OCR
Data repository for LaTeX OCR
Updated Aug 26, 2019