This scraper collects Facebook comments, including all comment replies, and saves them as a JSON file. A single API endpoint controls how the scraper does its job.
Built with a minimal Django setup and a combination of Selenium and BeautifulSoup.
To get started, you need to install the following applications:
- Python 3
- Virtualenv
- A supported browser, you can check here
- Webdriver: download the driver built for your OS, and make sure the driver version matches your browser version. You can download the driver here. This project currently uses the Chrome driver on macOS.
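As a rough aid for the version-matching requirement above, the sketch below compares the major version numbers found in two version strings. The helper names `major_version` and `versions_match` are hypothetical, the sample strings are only illustrative, and it assumes that agreeing on the major version is a good enough check for your driver.

```python
import re


def major_version(version_output: str) -> int:
    """Return the first number of the first dotted version found in the text."""
    match = re.search(r"(\d+)(?:\.\d+)+", version_output)
    if match is None:
        raise ValueError(f"no version number found in: {version_output!r}")
    return int(match.group(1))


def versions_match(browser_output: str, driver_output: str) -> bool:
    """Assumption: comparing only the major version is sufficient."""
    return major_version(browser_output) == major_version(driver_output)


if __name__ == "__main__":
    # Illustrative strings; paste the real output of your browser and driver.
    browser = "Google Chrome 114.0.5735.198"
    driver = "ChromeDriver 114.0.5735.90"
    print(versions_match(browser, driver))
```

You would typically paste in whatever your browser's "About" page and your driver's version flag report.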
Let's begin installing the package:
- Clone this repository:
git clone https://github.com/maulanaahmadarif/facebook-comment-scraper.git
cd facebook-comment-scraper
- Create virtualenv
virtualenv env
- Activate virtualenv
source env/bin/activate
- Install the packages
pip install -r requirements.txt
With all the packages installed, let's run the app:
cd scrapapi
- Run the server
python manage.py runserver
It should be running on port 8000 on your localhost.
Now, open http://localhost:8000/api?url={url} in your browser, where {url} is the encoded Facebook post URL. A new browser window should open, depending on your webdriver selection, and the scraper will start.
Don't interfere with the click events in that window, or the scraper will break.
The API has 4 parameters, listed below:
query | type | defaultValue | description |
---|---|---|---|
limit | Int (optional) | null | |
offset | Int (optional) | 0 | |
reply | Boolean (optional) | false | If true, the response will display the comment replies |
url | String (required) | null | Facebook post URL (must be encoded), you can encode here |
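
Because the `url` parameter must be encoded, it can help to build the request URL in code rather than by hand. The following is a minimal sketch using only the Python standard library. The helper name `build_scraper_url` is hypothetical, and sending `reply` as lowercase `true`/`false` is an assumption based on the default value shown in the table.

```python
from urllib.parse import urlencode

# Endpoint from the run instructions above.
BASE_URL = "http://localhost:8000/api"


def build_scraper_url(post_url, limit=None, offset=None, reply=None):
    """Build the API URL, percent-encoding the Facebook post URL."""
    params = {"url": post_url}
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    if reply is not None:
        # Assumption: the API accepts lowercase "true"/"false".
        params["reply"] = "true" if reply else "false"
    # urlencode percent-encodes reserved characters such as ":" and "/".
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder post URL, for illustration only.
    print(build_scraper_url("https://www.facebook.com/example/posts/123",
                            limit=10, reply=True))
```

Optional parameters left as `None` are omitted from the query string, so the server's defaults apply.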