On Target project for Data Management and Collection Lab

tombijaoui/lab940290

Data Management & Collection Lab - On Target

Not to lie, but to emphasize

Jornet Jeremy - Sasson Eden - Bijaoui Tom

Technion - Israel Institute of Technology - Faculty of Data Science and Decisions

Project's Files

  • Final Project Report
  • Model Interpretability
    • Model Interpretability Notebook
    • Model Interpretability Results
  • On Target tool
    • On Target Notebook
    • Example generated instructions
  • Scraping
    • Scraping "Comparably"
    • Scraping companies' websites
  • Requirements

Overview

Making a good first impression and emphasizing the right qualities are crucial in the business world, especially when applying for a job. That is why we developed On Target, a big data tool that uses statistical and machine learning methods to generate tailored guidelines. These guidelines help candidates highlight the skills and values that matter most to the specific company they are targeting.

Implemented Methods

  • Data preprocessing
  • Feature engineering
  • Pre-trained models from Hugging Face
  • Statistical tests for significance inference
  • Machine learning model for model interpretability
  • NLP keyword extraction techniques
  • LLM to generate instructions
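
To make the significance-testing step concrete, here is a minimal sketch of one possible test. The README does not say which tests were used, so the chi-squared test of independence on a 2x2 table is an assumption, and the function name and counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Chi-squared test of independence for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom,
    without continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, for illustration only:
# rows    = employees of the target company / everyone else
# columns = profile mentions a given keyword / does not mention it
stat, p_value = chi_square_2x2(40, 60, 20, 80)
significant = p_value < 0.05
```

A small p-value here would suggest that the keyword is associated with the company rather than appearing by chance, which is the kind of signal such a step could feed into the later stages.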

Model Interpretability

The model interpretability notebook estimates how important each feature is in the recruitment process by training, for each company, a Random Forest model on a binary classification task.
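
As an illustration of this idea, the sketch below trains a Random Forest on synthetic data for a single hypothetical company and ranks the features by importance. It assumes scikit-learn, which may differ from the library used in the notebook. The feature names, the data and the binary target are all invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rank_feature_importances(X, y, feature_names, seed=0):
    """Fit a Random Forest classifier and return (feature, importance)
    pairs sorted by decreasing importance."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X, y)
    pairs = zip(feature_names, model.feature_importances_)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Synthetic data for one hypothetical company (not the project's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
# The binary label depends almost entirely on the first feature.
y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)
feature_names = ["years_experience", "num_skills", "num_languages"]  # hypothetical

ranking = rank_feature_importances(X, y, feature_names)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

In a per-company setup, the same function would be called once per company on that company's own rows, giving one ranked list per company.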

Running the code

Before you begin, make sure the following prerequisites are met:

  • Databricks Account: A Databricks account is required to run this project.
  • Databricks Cluster: A cluster must be configured and started before running the code.

Then start the cluster and run the notebook. The file "Model Interpretability Results" lists the features in decreasing order of importance.

On Target Tool

The main notebook, "On Target Notebook", contains all our work.

In it, we learn the key values of each company and the inherent values of each profile, and estimate the importance of each feature. Finally, the system generates the instructions a candidate should follow to tailor their profile to the specific company they are targeting.

Running the code

The prerequisites are the same as in the "Running the code" subsection of "Model Interpretability".

Then the user enters their employee ID and the name of the company they are targeting, and the system outputs the instructions. The file "Example generated instructions" shows the format of the instructions.
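
The sketch below shows one way the final step could turn a company's values and a candidate's profile into a prompt for an LLM. It is an illustration only: the function, its parameters, the example values and the prompt wording are all hypothetical, and the notebook may build its instructions differently.

```python
def build_instruction_prompt(company, company_values, profile_values, top_k=3):
    """Build an LLM prompt from a company's values and a candidate's profile.

    company_values: dict mapping a value name to its importance score.
    profile_values: set of value names already visible in the profile.
    """
    ranked = sorted(company_values.items(), key=lambda kv: kv[1], reverse=True)
    missing = [name for name, _ in ranked if name not in profile_values][:top_k]
    present = [name for name, _ in ranked if name in profile_values][:top_k]
    lines = [
        f"Target company: {company}",
        "Values to add or emphasize: " + (", ".join(missing) or "none"),
        "Values already present to keep highlighting: "
        + (", ".join(present) or "none"),
        "Write short, concrete instructions for improving this profile.",
        "Do not invent experience; only emphasize what is true.",
    ]
    return "\n".join(lines)


# Hypothetical inputs, for illustration only.
prompt = build_instruction_prompt(
    "ExampleCorp",
    {"teamwork": 0.5, "innovation": 0.3, "integrity": 0.2},
    {"innovation"},
)
print(prompt)
```

Ranking the company's values by importance before comparing them with the profile keeps the prompt focused on the few gaps that matter most.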

Scraping

This part covers the methods we used to scrape the relevant data from the companies' websites and from Comparably. For scraping, we used BrightData, which allowed us to scrape at scale without being blocked.

Running the code

Before you begin, make sure you have a BrightData account. Install the dependencies listed in the requirements file for the scraping task you chose. Then copy the following lines of code, replacing "username" and "password" with your own BrightData username and password.

AUTH = 'username:password'
SBR_WEBDRIVER = f'https://{AUTH}@brd.superproxy.io:9515'
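
As a hedged sketch of how such a URL might be used, the example below builds it from credentials and opens a remote browser session. It assumes Selenium 4 as the client, which the variable name suggests but the README does not confirm. The helper names and environment variable names are hypothetical; only the host and port come from the snippet above.

```python
import os
from urllib.parse import quote

BRIGHTDATA_HOST = "brd.superproxy.io:9515"  # host and port from the snippet above


def build_webdriver_url(username, password):
    """Build the remote WebDriver URL, percent-encoding the credentials
    so that characters such as '@' or ':' cannot break the URL."""
    auth = f"{quote(username, safe='')}:{quote(password, safe='')}"
    return f"https://{auth}@{BRIGHTDATA_HOST}"


def webdriver_url_from_env():
    """Read credentials from environment variables (hypothetical names)
    instead of hard-coding them in the source."""
    return build_webdriver_url(
        os.environ["BRIGHTDATA_USERNAME"],
        os.environ["BRIGHTDATA_PASSWORD"],
    )


def fetch_page_source(url, webdriver_url):
    """Open `url` in a remote browser session and return its HTML."""
    # Imported lazily so the URL helpers work without Selenium installed.
    from selenium.webdriver import ChromeOptions, Remote
    from selenium.webdriver.chromium.remote_connection import (
        ChromiumRemoteConnection,
    )

    connection = ChromiumRemoteConnection(webdriver_url, "goog", "chrome")
    with Remote(connection, options=ChromeOptions()) as driver:
        driver.get(url)
        return driver.page_source
```

With the two environment variables set, `fetch_page_source("https://example.com", webdriver_url_from_env())` would return the page's HTML. Keeping credentials out of the notebook also makes it safer to share.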

Links

Inception alert 🚨: check out our LinkedIn post about On Target, a tool that is itself based on LinkedIn big data!
