Web scrape Yescapa website using scrapy and splash service

macrodrigues/vanscrap

Web Scraping with Scrapy and Splash

The project scrapes the Yescapa website to collect data about motorhomes.

Because the website is dynamic, plain HTTP scraping was not enough: a headless browser was needed to execute the JavaScript that renders the content. For this I used Splash, which integrates easily with Scrapy.

Scrapy

Scrapy is a fast, open-source web crawling and scraping framework for Python. It has built-in support for common scraping tasks such as managing cookies, setting user agents, and following pagination. It also parses web pages with XPath and CSS selectors.

Splash

Splash is a lightweight, open-source headless browser designed to render web pages and execute JavaScript. It can be launched with Docker. To use it from Scrapy, you need the scrapy-splash package, which talks to Splash through its HTTP API.
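To give a feel for the Splash HTTP API mentioned above, the sketch below builds a request URL for Splash's `render.html` endpoint, which returns the page HTML after JavaScript has run. It assumes a Splash instance listening on `http://localhost:8050` (the port commonly exposed when running the Splash Docker image). The `splash_render_url` helper, the target URL, and the wait time are illustrative assumptions, not code from this repository.

```python
from urllib.parse import urlencode

# Assumption: Splash is running locally on its usual port.
SPLASH_URL = "http://localhost:8050"


def splash_render_url(target_url: str, wait: float = 2.0,
                      splash_url: str = SPLASH_URL) -> str:
    """Build a render.html URL asking Splash to load target_url,
    wait `wait` seconds for scripts to run, and return the HTML."""
    query = urlencode({"url": target_url, "wait": wait})
    return f"{splash_url}/render.html?{query}"


# Placeholder target page, used only to show the URL encoding.
render_url = splash_render_url("https://example.com/vans?page=2", wait=2.0)
print(render_url)
```

With scrapy-splash installed and configured, a spider would normally yield `SplashRequest(url, callback, args={"wait": 2.0})` rather than building this URL by hand. The middleware then forwards the request to Splash.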

Get to know more about the flow of the project in my Medium article:
