
Commit a4381f0

Merge pull request avinashkranjan#839 from Ayushjain2205/MonsterJobs-scraper
Monster jobs scraper
2 parents 096ac16 + 728ca44 commit a4381f0

File tree

3 files changed: +127 -0 lines changed


MonsterJobs Scraper/README.md

+41
@@ -0,0 +1,41 @@
# Monster Jobs Scraper

Running this script allows the user to scrape job openings from [Monster jobs](https://www.monsterindia.com), filtered by their choice of location, job role, company, or designation.

## Setup instructions

To run this script, you need Python and pip installed on your system. Once both are installed, run the following command from the project folder (directory) to install the requirements:

```
pip install -r requirements.txt
```

Because this script uses Selenium, you will also need to install the Chrome WebDriver from [this link](https://sites.google.com/a/chromium.org/chromedriver/downloads).

After satisfying all the requirements for the project, open a terminal in the project folder and run

```
python scraper.py
```

or

```
python3 scraper.py
```

depending on your Python version. Make sure that you run the command from the same virtual environment in which the required modules are installed.

## Output

The user is prompted to enter input describing the desired job search:

![User is asked for input](https://i.postimg.cc/tg270Zjs/monster-scraper-input.png)

The scraped jobs are stored in a CSV file named job_records.csv:

![Jobs saved in csv file](https://i.postimg.cc/x1gbQFGj/monster-scraper-output.png)

## Author

[Ayush Jain](https://github.com/Ayushjain2205)
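
As a quick sanity check after a run, the job_records.csv file described above can be read back with Python's standard csv module. This is an illustrative sketch only, not part of the scraper; it assumes the column names the script writes (Job Title, Company, Location, Job Description, URL), and `summarize` is a hypothetical helper name:

```python
import csv

def summarize(path="job_records.csv"):
    """Print a short preview of the scraped jobs and return the row count."""
    with open(path, newline="", encoding="utf8") as f:
        rows = list(csv.DictReader(f))  # one dict per scraped job
    for row in rows[:5]:
        print(f"{row['Job Title']} @ {row['Company']} ({row['Location']})")
    return len(rows)
```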

MonsterJobs Scraper/requirements.txt

+3
@@ -0,0 +1,3 @@
requests
beautifulsoup4
selenium

MonsterJobs Scraper/scraper.py

+83
@@ -0,0 +1,83 @@
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
import csv

# Get chrome driver path
driver_path = input("Enter chrome driver path: ")

# Set up the CSV file to write data into
filename = "job_records.csv"
fields = ['Job Title', 'Company', 'Location', 'Job Description', 'URL']

# Prompt for the search mode until a valid choice is entered
while True:
    search_option = int(input(
        "Enter 1 - to search by location \nEnter 2 - to search by role, skill or company \nEnter 3 for both : "))
    if search_option == 1:
        location = input("Enter location : ")
        url = 'https://www.monsterindia.com/srp/results?locations={}'.format(
            location)
        break
    elif search_option == 2:
        job_type = input("Enter role, skill or company : ")
        url = 'https://www.monsterindia.com/srp/results?query={}'.format(
            job_type)
        break
    elif search_option == 3:
        location = input("Enter location : ")
        job_type = input("Enter role, skill or company : ")
        url = 'https://www.monsterindia.com/srp/results?query={}&locations={}'.format(
            job_type, location)
        break
    else:
        continue
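
One caveat with the URL construction in the loop above: input such as "software engineer" contains spaces and other characters that are not valid in a query string. A hedged improvement (not part of the original script) is to percent-encode the input with the standard library before formatting the URL; `build_search_url` is a hypothetical helper introduced here for illustration:

```python
from urllib.parse import quote_plus

def build_search_url(job_type=None, location=None):
    """Build a Monster search URL, percent-encoding user input.

    quote_plus turns spaces into '+' and escapes other characters
    that are unsafe in a query string.
    """
    base = "https://www.monsterindia.com/srp/results"
    params = []
    if job_type:
        params.append("query=" + quote_plus(job_type))
    if location:
        params.append("locations=" + quote_plus(location))
    return base + ("?" + "&".join(params) if params else "")
```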
# Initiating the webdriver. Parameter is the path of the webdriver.
driver = webdriver.Chrome(driver_path)
driver.get(url)

# This is just to ensure that the page is loaded
time.sleep(5)
html = driver.page_source

# Now apply bs4 to the rendered HTML
soup = BeautifulSoup(html, "html.parser")
job_divs = soup.find_all("div", {"class": "card-apply-content"})

with open(filename, 'w', newline='', encoding='utf8') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(fields)
    for job in job_divs:
        job_title_div = job.find('div', {"class": "job-tittle"})

        # Get job title
        job_title_holder = job_title_div.find('h3')
        job_title = job_title_holder.find('a').text.strip()

        # Get company name (some listings are posted anonymously)
        company_name_tag = job_title_div.find(
            'span', {"class": "company-name"})
        company_name = company_name_tag.find('a', {"class": "under-link"})
        if company_name is None:
            company_name = 'confidential'
        else:
            company_name = company_name.text

        # Get location
        company_location_tag = job_title_div.find('span', {"class": "loc"})
        company_location = company_location_tag.find('small').text.strip()

        # Get job description
        job_description = job.find('p', {"class": "job-descrip"}).text.strip()

        # Get job URL (hrefs are protocol-relative, so prepend "https:")
        job_url = "https:" + job_title_holder.find('a')['href']

        # Add the data as a row in the CSV file
        csvwriter.writerow(
            [job_title, company_name, company_location, job_description, job_url])

print("Job data successfully saved in job_records.csv")
driver.close()  # closing the webdriver
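
Once the script has produced job_records.csv, a short stdlib-only snippet can group the scraped rows, for example counting openings per company. This is an illustrative sketch assuming the column names written by the script; `jobs_per_company` is a hypothetical helper, not part of the scraper:

```python
import csv
from collections import Counter

def jobs_per_company(path="job_records.csv"):
    """Count scraped job openings per company."""
    with open(path, newline="", encoding="utf8") as f:
        return Counter(row["Company"] for row in csv.DictReader(f))
```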
