Web Scraping with Scrapy and Splash

This project scrapes the Yescapa website to collect data about motorhomes.

Because the website is dynamic, with content rendered by JavaScript, plain HTTP scraping wasn't enough. A headless browser was needed to execute the JavaScript code. For this I used Splash, which integrates easily with Scrapy.
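To illustrate why a plain HTTP fetch falls short on a dynamic page, here is a minimal sketch using only the Python standard library. The HTML, the `listings` id, and the `loadListings` function are hypothetical and not taken from the Yescapa site. They stand in for any page whose results are filled in by JavaScript after load.

```python
from html.parser import HTMLParser

# Hypothetical static HTML, as a plain HTTP client would receive it.
# The container is empty because the script has not been executed.
STATIC_HTML = """
<html><body>
  <div id="listings"></div>
  <script>
    // In a browser, this call would fill #listings with results.
    loadListings();
  </script>
</body></html>
"""


class ListingText(HTMLParser):
    """Collects the text found inside <div id="listings">.

    Simplified on purpose: it assumes every tag inside the container
    has a matching closing tag (no void elements such as <img>).
    """

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while inside the listings container
        self.chunks = []  # text fragments found inside the container

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif tag == "div" and dict(attrs).get("id") == "listings":
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def listing_text(html):
    """Return the text fragments inside the listings container."""
    parser = ListingText()
    parser.feed(html)
    return parser.chunks


print(listing_text(STATIC_HTML))
```

Without a JavaScript engine the container stays empty, so there is nothing to extract. A headless browser such as Splash runs the script first and returns the rendered HTML instead.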

Scrapy

Scrapy is a fast, open-source web crawling and web scraping framework for Python. It has built-in support for common scraping tasks such as handling cookies, setting user agents, and following pagination links. It also provides a built-in mechanism for parsing web pages with XPath and CSS selectors.
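Cookie and user-agent handling are controlled from the project's `settings.py`. The fragment below is a sketch of what such settings could look like. The setting names are standard Scrapy settings, but every value, including the bot name and the user-agent string, is an illustrative assumption and not copied from this repository.

```python
# settings.py (illustrative values only)

BOT_NAME = "vanscrap"  # assumed to match the project folder name

# Identify the crawler to the server; the string itself is hypothetical.
USER_AGENT = "vanscrap (+https://example.com/contact)"

# Keep Scrapy's built-in cookie middleware switched on.
COOKIES_ENABLED = True

# Respect robots.txt and wait between requests to limit server load.
ROBOTSTXT_OBEY = True
DOWNLOAD_DELAY = 1.0  # seconds
```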

Splash

Splash is a lightweight, open-source headless browser designed to render web pages and execute JavaScript code. It can be launched with Docker. To use it with Scrapy, you need scrapy-splash, which talks to Splash through its HTTP API.
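As a rough sketch of what a call to the Splash HTTP API looks like, the helper below builds a request URL for the `render.html` endpoint, which returns the page's HTML after JavaScript has run. It assumes a local Splash instance on the default port 8050. The helper name `splash_render_url` and the target URL are hypothetical.

```python
from urllib.parse import urlencode

# Splash is commonly started with Docker, for example:
#   docker run -p 8050:8050 scrapinghub/splash
SPLASH_URL = "http://localhost:8050"  # assumed local instance


def splash_render_url(target_url, wait=2.0):
    """Build a render.html URL asking Splash to render target_url.

    `wait` is the number of seconds Splash waits after the page loads,
    which gives the page's JavaScript time to run.
    """
    query = urlencode({"url": target_url, "wait": wait})
    return f"{SPLASH_URL}/render.html?{query}"


print(splash_render_url("https://example.com/"))
```

Inside a spider, scrapy-splash wraps this kind of call for you: yielding `SplashRequest(url, callback, args={"wait": 2.0})` instead of a plain `scrapy.Request` sends the page through Splash before the callback receives the response.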

Get to know more about the flow of the project in my Medium article:

