This project scrapes the Yescapa website to collect data about motorhomes.
Because the website is dynamic and builds its content with JavaScript, scraping the raw HTML alone wasn't possible. A headless browser was needed to execute the JavaScript code first. For this I used Splash, which integrates easily with Scrapy.
Scrapy is a fast, open-source web crawling and web scraping framework for Python. It includes built-in support for common web scraping tasks such as handling cookies, user agents, and pagination. It also provides a built-in mechanism for parsing web pages with XPath and CSS selectors.
Splash is a lightweight, open-source headless browser designed to render web pages and execute JavaScript code. You can launch it with Docker. To use it with Scrapy, we need Scrapy-Splash, which talks to Splash through its HTTP API.
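To make the Splash HTTP API more concrete, here is a minimal sketch that calls its `render.html` endpoint using only the Python standard library. This is not the project's code: the helper names `build_render_url` and `fetch_rendered_html` are hypothetical, and it assumes Splash is running locally on its default port 8050 (for example in a Docker container). In the project itself, Scrapy-Splash handles this for you.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Splash runs locally on its default port (e.g. in a Docker container).
SPLASH_URL = "http://localhost:8050"


def build_render_url(page_url, wait=2.0, splash_url=SPLASH_URL):
    """Build the Splash render.html URL for page_url.

    `wait` is the number of seconds Splash waits after the page loads,
    which gives the page's JavaScript time to run.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_url}/render.html?{query}"


def fetch_rendered_html(page_url, wait=2.0, splash_url=SPLASH_URL):
    """Ask Splash to render the page and return the resulting HTML.

    Requires a running Splash instance; raises URLError otherwise.
    """
    with urlopen(build_render_url(page_url, wait, splash_url), timeout=60) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_rendered_html("https://www.yescapa.com")` with Splash running should return the HTML after JavaScript execution, which can then be parsed with XPath or CSS selectors as usual.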
Get to know more about the flow of the project in my Medium article: