Scrapper

Extraction Tool

The Extraction Tool, called Scrapper, is a Python-based application that lets users extract and process web content from a given URL. It is written in an object-oriented style, follows the Model-View-Controller (MVC) architecture, and provides a command-line interface for interaction.
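
To illustrate how an MVC split can look in a command-line extraction tool, here is a minimal sketch. It is not the project's actual code: the names PageModel, ConsoleView, ExtractionController and fake_fetch are hypothetical, and the network request is replaced by a stand-in function so the example runs offline.

```python
from dataclasses import dataclass


@dataclass
class PageModel:
    """Model: holds the data for one extraction (hypothetical fields)."""
    url: str
    content: str = ""


class ConsoleView:
    """View: the only layer that formats output for the user."""

    def render(self, page: PageModel) -> str:
        return f"{page.url}: {len(page.content)} characters extracted"


class ExtractionController:
    """Controller: coordinates the model and the view."""

    def __init__(self, view: ConsoleView, fetch):
        self.view = view
        self.fetch = fetch  # injected callable: url -> page source as str

    def run(self, url: str) -> str:
        page = PageModel(url=url, content=self.fetch(url))
        return self.view.render(page)


def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP request so the sketch runs offline.
    return "<html><title>Demo</title></html>"


controller = ExtractionController(ConsoleView(), fake_fetch)
print(controller.run("https://example.com"))
```

Passing the fetch function into the controller keeps the three layers independent, so each one can be tested or swapped without touching the others.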

Usage

Easy way: run the .exe file in the dist directory.
Manual way:
- Fork the project; only main.py and the mvc folder are required to run it
- Open your terminal and make sure Python 3.10 or above is installed
- Install the required packages from requirements.txt using pip: pip install -r requirements.txt
- Run main.py and follow the instructions: python main.py

Features

- Extracts source code or specific content (title, text, link, image) from a web page
- Supports saving the extracted content in various formats (txt, html, xml)
- Validates user input and handles errors gracefully
- Provides options for retrying or quitting the extraction process
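
As a rough illustration of the content extraction listed above, the sketch below pulls the title, link targets and image sources out of an HTML string. It uses only Python's standard html.parser module, which is an assumption made for this example; the project may rely on other packages from requirements.txt. The names ContentExtractor, extract and SAMPLE are hypothetical.

```python
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collects the title, link targets, and image sources of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.images = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title> is kept here.
        if self._in_title:
            self.title += data


def extract(html: str) -> dict:
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "links": parser.links,
        "images": parser.images,
    }


SAMPLE = """
<html><head><title>Demo page</title></head>
<body><a href="https://example.com/a">A</a><img src="logo.png" alt=""></body></html>
"""

result = extract(SAMPLE)
print(result)
```

The returned dictionary could then be written to a txt, html or xml file, depending on the format the user chooses.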

Future features

- An algorithm for checking extraction eligibility and authorization before processing
- A cross-platform GUI app (possibly built with PyQt) for easier user interaction and accessibility
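
One possible starting point for an eligibility check is a site's robots.txt file. The sketch below is only an assumption about how such a check could work, not the algorithm planned for this project. It uses the standard urllib.robotparser module and parses rules from a string so it runs offline; is_allowed and RULES are hypothetical names.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules; a real check would download them from <site>/robots.txt.
RULES = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(RULES, "https://example.com/index.html"))
print(is_allowed(RULES, "https://example.com/private/data.html"))
```

A robots.txt check covers only crawler rules; a site's terms of use or login requirements would need separate handling.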

License

This project is licensed under the MIT License.
