Scraper
A solution accelerator built on top of Microsoft Fabric, Azure OpenAI Service, and Azure AI Speech that enables customers with large amounts of conversational data to use generative AI to surface …
List of libraries, tools and APIs for web scraping and data processing.
The All in One Framework to build Awesome Scrapers.
Web data extraction tool implemented as a Chrome extension
🚀 Official starter template for the Botasaurus scraping framework 🤖
🚀 This web scraping template provides you with a great starting point when creating web scraping bots. 🤖
An API wrapper for Scrappey.com written in Node.js (Cloudflare bypass & solver)
ralger makes it easy to scrape a website. Built on the shoulders of titans: rvest, xml2.
A next-generation crawling and spidering framework.
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…
An AI-powered arXiv paper summarization website with a virtual assistant for answering questions.
Fetch a user's data across social media
Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email, and more for each place
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…
Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript, or documentation URL on the web and scrape it into a text file and the clipboard for easier LLM ingestion
A helper script that collects the contents of a repo and places them in one text file.
Turn any webpage into structured data using LLMs
Streamlit demo of Scrapegraph-ai for GPT4-hackaton
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
A web crawler that prints a website to .pdf format
App that converts a website to a PDF
A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.
Check links in web documents or full websites
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…