Technion - Israel Institute of Technology - Faculty of Data Science and Decisions
- Final Project Report
- Model Interpretability
- Model Interpretability Notebook
- Model Interpretability Results
- On Target tool
- On Target Notebook
- Example generated instructions
- Scraping
- Scraping "Comparably"
- Scraping companies' websites
- Requirements
Making a good first impression and emphasizing the right qualities are crucial in the business world, especially when applying for a job. That's why we developed On Target, our big data tool. Using statistical and machine learning methods, the system generates tailored guidelines that help candidates highlight the skills and values that matter most to the specific company they are targeting.
- Data Preprocessing
- Feature Engineering
- Pre-trained models from Hugging Face
- Statistical tests for significance inference
- Machine Learning model for model interpretability
- NLP keyword extraction techniques
- LLM to generate instructions
The model interpretability notebook estimates how important each feature is in the recruitment process. For each company, it trains a Random Forest model on a binary classification task and uses that model to measure the importance of each feature.
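The idea can be illustrated with a minimal sketch. It assumes scikit-learn rather than the library used in the actual notebook, and the data, the feature names, and the definition of the binary label are all hypothetical placeholders, since the notebook's real inputs are not described here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rank_feature_importances(X, y, feature_names, seed=0):
    """Fit a Random Forest on one company's data and rank its features.

    Returns (feature_name, importance) pairs, most important first.
    """
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X, y)
    # feature_importances_ holds one impurity-based score per column.
    pairs = zip(feature_names, model.feature_importances_)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Synthetic stand-in data (assumption): the label depends only on column 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

# Hypothetical feature names, for illustration only.
feature_names = ["years_experience", "num_skills", "num_languages"]

ranking = rank_feature_importances(X, y, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Repeating this fit once per company would give one ranked list per company, which matches the shape of the results described in this section.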
Before you begin, make sure the following prerequisites are met:
- Databricks Account: A Databricks account is required to run this project.
- Databricks Cluster: A cluster must be configured and started before running the code.
Then start the cluster and run the code. The file "Model Interpretability Results" lists the features in decreasing order of importance.
The main notebook, containing all our work, is available here.
In this notebook, we learn the key values of each company and the inherent values of each profile, and we assess the level of importance of each feature. Finally, the system generates the instructions a candidate should follow to tailor their profile to the specific company they are targeting.
Follow the same steps as in the "Running the Code" part of the "Model Interpretability" section.
Then, each user enters their employee ID and the name of the company they are targeting, and the system outputs the instructions. The file "Example generated instructions" shows an example of the instruction format.
This file describes the methods we used to scrape the relevant data from the companies' websites and from Comparably. For scraping, we used BrightData, which allows large-scale scraping without being blocked.
Before you begin, make sure you have a BrightData account. Install the dependencies from the requirements file that matches the chosen scraping task. Then copy the following lines of code, replacing "username" and "password" with your own BrightData username and password.
AUTH = 'username:password'
SBR_WEBDRIVER = f'https://{AUTH}@brd.superproxy.io:9515'
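If you prefer not to hard-code credentials in the script, one option is to read them from environment variables and build the same URL from them. The sketch below is only a suggestion: the helper `build_sbr_webdriver_url` and the variable names `BRIGHTDATA_USERNAME` and `BRIGHTDATA_PASSWORD` are hypothetical and not part of this project or of BrightData's tooling. The host and port are the ones shown in the two lines above.

```python
import os


def build_sbr_webdriver_url(username, password,
                            host="brd.superproxy.io", port=9515):
    """Return the WebDriver URL in the same format as SBR_WEBDRIVER above."""
    if not username or not password:
        raise ValueError("both username and password are required")
    auth = f"{username}:{password}"
    return f"https://{auth}@{host}:{port}"


# Hypothetical environment variable names; the placeholder defaults are
# used only so that this sketch runs when the variables are not set.
username = os.environ.get("BRIGHTDATA_USERNAME", "username")
password = os.environ.get("BRIGHTDATA_PASSWORD", "password")

SBR_WEBDRIVER = build_sbr_webdriver_url(username, password)
```

The resulting `SBR_WEBDRIVER` string can then be used wherever the hard-coded value was used, and the credentials stay out of version control.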
Inception alert 🚨: Check out our LinkedIn post about On Target, a tool that is itself based on LinkedIn big data!