DocParseNet is an innovative multi-modal deep learning architecture specifically designed for parsing and annotating document images, particularly those derived from Right of Way (ROW) agreement PD…

Python 7 1 Updated May 30, 2024

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Python 4,211 469 Updated Jan 30, 2025

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

Python 24,562 1,417 Updated Feb 1, 2025

gRPC clients and servers in R

C++ 75 25 Updated Feb 23, 2023

Extension to {sparklyr} that allows you to interact with Spark & Databricks Connect

R 14 3 Updated Jan 29, 2025

All kinds of neural text classifiers implemented by Keras

Python 64 21 Updated Sep 20, 2019

This repository contains the code and data download links to reproduce the experiments of the PVLDB paper "Dual-Objective Fine-Tuning of BERT for Entity Matching" by Ralph Peeters and Christian Bizer.

Python 14 6 Updated Jun 7, 2021

Tidy interface to 'data.table'

R 458 33 Updated Jan 21, 2025

Tidy interface to polars

Python 351 11 Updated Nov 2, 2024

Predict Race and Ethnicity Based on the Sequence of Characters in a Name

Jupyter Notebook 237 66 Updated Jun 13, 2024

Code and data used in named entity transliteration experiments

Python 57 8 Updated Jun 4, 2018

Transliteration data and models

55 30 Updated Nov 19, 2016

Fast, flexible name matching for large datasets

Python 70 9 Updated Dec 15, 2023

Fuzzy document finding in Ruby

Ruby 23 8 Updated Oct 18, 2017

Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm

Python 811 121 Updated Jan 31, 2025

Rapid fuzzy string matching in Python using various string metrics

Python 2,857 120 Updated Jan 30, 2025

SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm

C# 3,194 300 Updated Feb 1, 2025

πŸ“ Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

Python 3,435 251 Updated Sep 9, 2024

🎯 String metrics and phonetic algorithms for Scala (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gram, NYSIIS, Overlap, Ratcliff/Obershelp, Refined NYSIIS, Re…

Scala 484 80 Updated Jul 28, 2017

A pytorch implementation of the ACL2019 paper "Simple and Effective Text Matching with Richer Alignment Features".

Python 307 54 Updated Aug 24, 2022

A curated list of resources on document similarity measures (papers, tutorials, code, ...)

240 24 Updated Jul 13, 2022

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

16,917 2,594 Updated Nov 13, 2023

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

1 Updated Dec 7, 2021

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 138,415 27,780 Updated Jan 31, 2025

State of the Art Natural Language Processing

Scala 3,908 718 Updated Feb 1, 2025

An implementation of DBSCAN running on top of Apache Spark

Scala 183 58 Updated Jan 10, 2018

Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.

Python 2,348 336 Updated Dec 29, 2024

PyTorch original implementation of Cross-lingual Language Model Pretraining.

Python 2,898 497 Updated Feb 14, 2023

Streamlit - A faster way to build and share data apps.

Python 37,018 3,191 Updated Feb 1, 2025

Data Apps & Dashboards for Python. No JavaScript Required.

Python 21,867 2,099 Updated Jan 31, 2025