The deslanting algorithm sets text upright in images. Python, C++ and OpenCL implementations provided.

C++ 149 38 Updated Nov 21, 2021

Collection of different NLP recipes :)

Jupyter Notebook 1 Updated Jan 26, 2021

An awesome list of (large-scale) public datasets on the Internet. (Ongoing collection)

8 17 Updated Dec 4, 2014

This is a temporary repository for working on improvements for my book 'Text Analytics with Python'.

3 2 Updated Nov 15, 2017

Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Te…

Jupyter Notebook 1,661 844 Updated Dec 24, 2020

Contains relevant notebooks for the hands-on NLP workshop at the GIDS AIML Conference, 2020 Edition

Jupyter Notebook 23 19 Updated May 3, 2021

A TensorFlow project to detect sarcasm in tweets using the power of GloVe embeddings and Convolution layers.

Python 7 3 Updated Jun 13, 2019

Google Colab notebooks - tutorials, guides on ML topics

Jupyter Notebook 7 7 Updated Apr 18, 2024

🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based

HTML 308 40 Updated Oct 13, 2023

📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF

Python 38 4 Updated Mar 8, 2022

GROBID extension for identifying and normalizing physical quantities.

JavaScript 77 24 Updated Sep 14, 2024

A Deep Learning Framework for Text: https://delft.readthedocs.io/

Python 390 64 Updated Jan 8, 2025

A high performance bibliographic information service: https://biblio-glutton.readthedocs.io

Java 132 16 Updated Sep 14, 2024

PDF to XML ALTO file converter

C 222 73 Updated Jan 11, 2025

A machine learning software for extracting information from scholarly documents

Java 3,736 462 Updated Jan 24, 2025

An opinionated guide to common Jekyll design patterns and anti-patterns.

Shell 66 18 Updated Jan 12, 2023

Scrapy, a fast high-level web crawling & scraping framework for Python.

Python 53,896 10,614 Updated Jan 23, 2025

A Jekyll plugin that provides users with a traditional CMS-style graphical interface to author content and administer Jekyll sites.

JavaScript 2,860 366 Updated Dec 16, 2024

A site to provide non-judgmental guidance on choosing a license for your open source project

Ruby 3,761 1,363 Updated Jan 10, 2025