Starting with Puppeteer

This is an example repository meant as a companion to a series of beginner-friendly posts I plan to write about doing magic stuff with Puppeteer.

Awesome post links will go here, eventually.

Reading the articles is optional (but I'll be very happy if you do), and this repo can be read on its own. Many different approaches are used here as examples and may inspire your own implementations.

Content

All the code here is organized into context modules, each with its own set of awesome features.

Util

Browser: Shows how to launch your Puppeteer instance and give it superpowers using everything puppeteer-extra provides.
Page: Provides useful functions for taking screenshots, scraping a full page's HTML code, and uploading both somewhere. Maybe an S3 bucket or something?
Stealth: Shows how to run a scraper stealth test using the puppeteer-extra stealth module and display the results.
Time: Functions for measuring how long scraping operations take.
Upload: Shows one way to upload all your screenshots and HTML data into a local bucket, customizable to work with S3 as well.
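
As a rough illustration of how the Browser, Page, and Time helpers could fit together, the sketch below launches Puppeteer through puppeteer-extra with the stealth plugin, captures a screenshot plus the page's HTML, and times the operation. The function names (launchBrowser, capturePage, startTimer), the launch options, and the wait condition are assumptions made for this example, not the repository's actual API.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: starts a timer and returns an object whose stop()
// reports the elapsed milliseconds. The clock is injectable for testing.
function startTimer(now = () => performance.now()) {
  const startedAt = now();
  return { stop: () => now() - startedAt };
}

// Hypothetical helper: launches a browser through puppeteer-extra with the
// stealth plugin registered. The requires are lazy, so this file can be
// loaded even when the packages are not installed.
async function launchBrowser(options = {}) {
  const puppeteer = require('puppeteer-extra');
  const StealthPlugin = require('puppeteer-extra-plugin-stealth');
  puppeteer.use(StealthPlugin());
  // --no-sandbox is a common choice inside Docker; it is an assumption here.
  return puppeteer.launch({ headless: true, args: ['--no-sandbox'], ...options });
}

// Hypothetical helper: opens a page, captures a full-page screenshot and the
// serialized HTML, and reports how long the whole capture took.
async function capturePage(browser, url) {
  const timer = startTimer();
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: 'networkidle2' });
    const screenshot = await page.screenshot({ fullPage: true }); // binary image data
    const html = await page.content();
    return { screenshot, html, elapsedMs: timer.stop() };
  } finally {
    await page.close();
  }
}

// Usage (needs puppeteer, puppeteer-extra and puppeteer-extra-plugin-stealth):
//   const browser = await launchBrowser();
//   const { elapsedMs } = await capturePage(browser, 'https://example.com');
//   await browser.close();
```

Keeping the timer separate from the browser code means the same helper can wrap any scraping step, and the injectable clock makes it easy to test without launching a browser.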

Core

Google: A really simple example of scraping Google's first page of results for a keyword search.
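
A minimal sketch of what such a scraper might look like is shown below. The names buildSearchUrl and scrapeFirstPage are hypothetical, and the h3 selector used to pick result titles is an assumption: Google's markup changes often, so the repository's real selectors may differ.

```javascript
// Builds the Google search URL for a keyword, URL-encoding the query.
function buildSearchUrl(keyword) {
  const url = new URL('https://www.google.com/search');
  url.searchParams.set('q', keyword);
  return url.toString();
}

// Hypothetical scraper: returns the result titles found on the first page.
// The 'h3' selector is an assumption about Google's current markup.
async function scrapeFirstPage(keyword) {
  const puppeteer = require('puppeteer'); // lazy require, needs puppeteer installed
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(buildSearchUrl(keyword), { waitUntil: 'domcontentloaded' });
    // Runs in the browser context and collects the text of every matching node.
    return await page.$$eval('h3', (nodes) =>
      nodes.map((node) => node.textContent.trim()).filter(Boolean)
    );
  } finally {
    await browser.close();
  }
}

// Usage:
//   const titles = await scrapeFirstPage('puppeteer stealth');
```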

Config

Logger: Custom logger configured using Winston. I quite like it; feel free to use it as well.
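
For readers who have not used Winston before, a logger along these lines could be configured as in the sketch below. The line format, the default level, and the names formatLine and createLogger are assumptions for illustration; the repository's own configuration may look different.

```javascript
// Formats one log entry as "<timestamp> [<level>] <message>".
function formatLine({ timestamp, level, message }) {
  return `${timestamp} [${level}] ${message}`;
}

// Hypothetical factory: builds a console logger with timestamps.
// The require is lazy, so this file loads even without winston installed.
function createLogger(level = 'info') {
  const winston = require('winston');
  return winston.createLogger({
    level,
    format: winston.format.combine(
      winston.format.timestamp(),       // adds info.timestamp
      winston.format.printf(formatLine) // renders the final line
    ),
    transports: [new winston.transports.Console()],
  });
}

// Usage:
//   const logger = createLogger('debug');
//   logger.info('Browser launched');
```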

Building the image

Run docker build -t starting-with-puppeteer:latest .

Running the scraping example

Install all necessary dependencies with npm install
Create your own .env following the variables defined in .env.example
Run docker-compose up scrapper
Profit

Running the stealth checking example

Install all necessary dependencies with npm install
Create your own .env following the variables defined in .env.example
Run docker-compose up stealth-check
Profit

You can check all the snapshots taken on MinIO, accessible at localhost:1111/minio/bucket/ on your local machine.
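
To give an idea of how snapshots might end up in that bucket, here is a minimal sketch of an upload step. It assumes the minio npm client, which may not be what the repository uses, and the names buildObjectKey and uploadSnapshot, the key layout, and the config fields are all hypothetical.

```javascript
// Builds an object key such as "screenshots/2024-01-31/home.png".
// The date part uses UTC so keys are stable across time zones.
function buildObjectKey(kind, name, extension, date = new Date()) {
  const day = date.toISOString().slice(0, 10);
  return `${kind}/${day}/${name}.${extension}`;
}

// Hypothetical uploader: stores a Buffer or string under the given key.
// The config fields are assumptions; in practice they would come from .env.
async function uploadSnapshot(config, key, body) {
  const Minio = require('minio'); // lazy require, needs the minio package
  const client = new Minio.Client({
    endPoint: config.endPoint,
    port: config.port,
    useSSL: false, // assumption: plain HTTP for a local bucket
    accessKey: config.accessKey,
    secretKey: config.secretKey,
  });
  await client.putObject(config.bucket, key, body);
  return key;
}

// Usage:
//   const key = buildObjectKey('screenshots', 'home', 'png');
//   await uploadSnapshot(config, key, screenshotBuffer);
```

Because MinIO speaks the S3 protocol, pointing a client like this at S3 is mostly a matter of changing the endpoint and credentials.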

Emethium/starting-with-puppeteer

Folders and files

Latest commit

History

Repository files navigation

Starting with Puppeteer

Content

Util

Core

Config

Building the image

Running the scrapping example

Running the stealth checking example

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages