Stars

Scraper

32 repositories

A solution accelerator built on top of Microsoft Fabric, Azure OpenAI Service, and Azure AI Speech that enables customers with large amounts of conversational data to use generative AI to surface …

Jupyter Notebook 191 117 Updated Dec 20, 2024

List of libraries, tools and APIs for web scraping and data processing.

Makefile 6,791 790 Updated Nov 25, 2024

The All in One Framework to build Awesome Scrapers.

Python 1,538 146 Updated Dec 7, 2024

Web data extraction tool implemented as chrome extension

JavaScript 227 69 Updated Dec 20, 2024

🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖

TypeScript 23 7 Updated Aug 15, 2024

🚀 THIS WEB SCRAPING TEMPLATE PROVIDES YOU WITH A GREAT STARTING POINT WHEN CREATING WEB SCRAPING BOTS. 🤖

Python 7 3 Updated Jul 16, 2023

An API wrapper for Scrappey.com written in Node.js (cloudflare bypass & solver)

JavaScript 12 4 Updated Jan 10, 2024

ralger makes it easy to scrape a website. Built on the shoulders of titans: rvest, xml2.

R 156 14 Updated Jul 16, 2024

A next-generation crawling and spidering framework.

Go 12,724 660 Updated Dec 23, 2024

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…

Python 2,381 264 Updated Jun 24, 2024

An AI-powered arXiv paper summarization website with a virtual assistant for answering questions.

Python 266 33 Updated Apr 23, 2023

Fetch user's data across social media

Python 453 76 Updated Jul 11, 2024

Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, review count, latitude and longitude, reviews, email, and more for each place.

Go 1,035 142 Updated Dec 22, 2024

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Python 6,548 677 Updated Oct 12, 2024

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

TypeScript 16,321 712 Updated Dec 26, 2024

Specify a github or local repo, github pull request, arXiv or Sci-Hub paper, Youtube transcript or documentation URL on the web and scrape into a text file and clipboard for easier LLM ingestion

Python 655 64 Updated Dec 19, 2024

A helper script collecting contents of a repo and placing it in one text file.

Python 76 12 Updated Aug 3, 2024

Turn any webpage into structured data using LLMs

TypeScript 2,548 150 Updated Aug 30, 2024

Python scraper based on AI

Python 16,536 1,351 Updated Dec 23, 2024

🔄 CLI to convert Webpages to PDFs 🚀

Python 1,091 45 Updated Feb 11, 2024

Streamlit demo of Scrapegraph-ai for GPT4-hackathon

Python 90 127 Updated Dec 25, 2024

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

TypeScript 20,656 1,625 Updated Dec 24, 2024

A very simple news crawler with a funny name

Python 305 75 Updated Dec 23, 2024

A web crawler that prints a website to .pdf format

Python 5 1 Updated Jun 26, 2023

App that converts a website to a PDF

JavaScript 40 6 Updated May 21, 2023

A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.

JavaScript 4,336 166 Updated Nov 6, 2024

Check links in web documents or full websites

Python 900 151 Updated Oct 12, 2024

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

Python 4,905 327 Updated Dec 25, 2024

Get your documents ready for gen AI

Python 16,615 845 Updated Dec 19, 2024