Job Catcher

Job Catcher is a web application that ranks job postings against the text of your CV. At its core, it is a retrieval and reranking pipeline: it first retrieves a pool of job postings, then ranks them by combining several semantic and classical approaches to find the best match with the user profile.

See the live website: https://job-catcher.onrender.com/

Features

Job Aggregator: Uses Google’s job search API to gather listings from major job boards like LinkedIn, Glassdoor, Indeed, ZipRecruiter, Monster, and more, as well as from company career pages.
Advanced Matching: Reranks the results using several techniques to optimize the match with the user profile:
- Semantic similarity with sentence-transformers.
- Semantic elite keyword matching.
- TF-IDF scoring.
- Keyword scoring.
Web Interface: Use the application through a simple web interface, served by a local server running on your own machine.
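
To illustrate how the Advanced Matching signals might be combined, here is a minimal sketch that blends a TF-IDF cosine score with a keyword score into one ranking. It is not the code in src/ranking.py: the function names, the weights (0.7 and 0.3), and the tokenizer are assumptions made for this example, and the semantic components (sentence-transformers and elite keyword matching) are left out so that the sketch runs with the standard library only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using smoothed TF-IDF."""
    counts_per_doc = [Counter(tokenize(doc)) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for counts in counts_per_doc for term in counts)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in counts.items()} for counts in counts_per_doc]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(text, keywords):
    """Fraction of the preferred keywords that appear in the text."""
    if not keywords:
        return 0.0
    lowered = text.lower()
    return sum(1 for k in keywords if k.lower() in lowered) / len(keywords)


def rank_jobs(cv_text, job_texts, keywords, w_tfidf=0.7, w_keyword=0.3):
    """Return (score, job_text) pairs sorted from best to worst match."""
    vectors = tfidf_vectors([cv_text] + job_texts)
    cv_vec, job_vecs = vectors[0], vectors[1:]
    scored = [
        (w_tfidf * cosine(cv_vec, vec) + w_keyword * keyword_score(job, keywords), job)
        for job, vec in zip(job_texts, job_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    cv = "python data analysis machine learning"
    jobs = [
        "Senior chef for italian restaurant",
        "Data analyst with python and machine learning",
    ]
    for score, job in rank_jobs(cv, jobs, ["python"]):
        print(f"{score:.3f}  {job}")
```

A weighted sum keeps each signal easy to inspect and tune on its own; a semantic similarity score could be added as one more weighted term in the same way.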

Installation

Clone the Repository:

git clone https://github.com/Woolball/job-matcher.git job-catcher

Navigate to the Project Directory:
```
cd job-catcher
```

Set Up a Virtual Environment (Optional but recommended):

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install the Required Python Packages:
```
pip install -r requirements.txt
```
Install and Run Redis: Redis is used to manage rate limiting for API requests. Install Redis (on Debian/Ubuntu Linux):
```
sudo apt update
sudo apt install redis-server
sudo systemctl start redis
```
Verify Redis is running by typing:
```
redis-cli ping
```
If Redis is running, it should return PONG.
Configure the Environment Variables: Create a .env file in the root directory with the following content:
```
JSEARCH_API_URL=https://jsearch.p.rapidapi.com/search
JSEARCH_API_KEY=
JSEARCH_API_HOST=jsearch.p.rapidapi.com
JSEARCH_API_RATE_LIMIT_CALLS=5
JSEARCH_API_RATE_LIMIT_PERIOD=1

REDIS_HOST=localhost
REDIS_PORT=6379

FETCHER=scraper  # or 'jsearch' if you want to use the JSearch API
```
Set JSEARCH_API_KEY to your JSearch API key if you want to use the jsearch fetcher. The JSearch fetcher is more efficient than the scraper, but it requires registration and an API key.
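
The variables above could be read and validated along these lines. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical names, not part of the project, and the defaults simply mirror the sample .env file. It also strips a trailing `#` comment from FETCHER, since not every .env loader handles inline comments.

```python
import os
from dataclasses import dataclass

VALID_FETCHERS = {"scraper", "jsearch"}


@dataclass(frozen=True)
class Settings:
    fetcher: str
    jsearch_api_key: str
    rate_limit_calls: int
    rate_limit_period: int
    redis_host: str
    redis_port: int


def load_settings(env=None):
    """Build a Settings object from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Drop an inline comment such as "scraper  # or 'jsearch' ...".
    fetcher = env.get("FETCHER", "scraper").split("#")[0].strip()
    if fetcher not in VALID_FETCHERS:
        raise ValueError(f"FETCHER must be one of {sorted(VALID_FETCHERS)}, got {fetcher!r}")

    api_key = env.get("JSEARCH_API_KEY", "").strip()
    if fetcher == "jsearch" and not api_key:
        raise ValueError("JSEARCH_API_KEY is required when FETCHER=jsearch")

    return Settings(
        fetcher=fetcher,
        jsearch_api_key=api_key,
        rate_limit_calls=int(env.get("JSEARCH_API_RATE_LIMIT_CALLS", "5")),
        rate_limit_period=int(env.get("JSEARCH_API_RATE_LIMIT_PERIOD", "1")),
        redis_host=env.get("REDIS_HOST", "localhost"),
        redis_port=int(env.get("REDIS_PORT", "6379")),
    )


if __name__ == "__main__":
    print(load_settings({"FETCHER": "scraper"}))
```

Failing fast on a missing key surfaces a misconfigured jsearch setup at startup rather than on the first search.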

Usage

Starting the Application

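Run the Redis Server: Redis must be running to handle rate limiting for API requests. Skip this step if Redis is already running as a service (for example, started with systemctl during installation).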
```
redis-server
```
Run the Flask Application:
```
python app.py
```
Access the Application: Open your web browser and go to http://127.0.0.1:5000.
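
Redis is needed at startup because it backs the rate limiting of API requests. The idea behind the JSEARCH_API_RATE_LIMIT_CALLS and JSEARCH_API_RATE_LIMIT_PERIOD settings can be sketched with a sliding-window limiter. Note the assumptions: this version keeps its state in process memory so that it runs without a Redis server, the class name is hypothetical, and the project's actual Redis-backed implementation may use a different algorithm.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `calls` requests within any window of `period` seconds."""

    def __init__(self, calls, period, clock=time.monotonic):
        self.calls = calls
        self.period = period
        self.clock = clock          # injectable for testing
        self._stamps = deque()      # timestamps of the accepted requests

    def allow(self):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Forget requests that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.calls:
            self._stamps.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(calls=5, period=1)
    print([limiter.allow() for _ in range(6)])
```

Keeping the counters in Redis instead of in memory would let several worker processes share one limit, which an in-process deque cannot do.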

Providing Input

You can provide input directly in the web interface:

Search terms: Enter comma-separated job titles (e.g., "product manager, financial advisor").
CV: Upload your CV file for semantic comparison.
Preferred and exclusion keywords: Provide keywords you'd like to include or exclude from your results (e.g., "data analysis, project management, remote, senior").
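
As a rough illustration of how the comma-separated fields might be handled, the sketch below normalizes a field into a list of terms and checks a posting against the exclusion keywords. The function names are hypothetical, and plain substring matching is an assumption; the application may parse and match these fields differently.

```python
def parse_terms(raw):
    """Split a comma-separated field into trimmed, lower-cased, unique terms."""
    terms = []
    for part in raw.split(","):
        term = part.strip().lower()
        if term and term not in terms:   # skip blanks and duplicates
            terms.append(term)
    return terms


def is_excluded(posting_text, exclusion_keywords):
    """Return True if any exclusion keyword occurs in the posting text."""
    lowered = posting_text.lower()
    return any(keyword in lowered for keyword in exclusion_keywords)


if __name__ == "__main__":
    print(parse_terms("product manager, financial advisor"))
    print(is_excluded("Senior Data Analyst", parse_terms("senior, intern")))
```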

Viewing Results

After submitting the form, the application will display a ranked list of job postings based on their relevance to your CV. Each job listing includes the job title, company name, date posted, and a tag indicating its level of relevance.

Note:

By default, at most 50 results are displayed in the interface. This limit can be configured in /config.py.
After each search, the full results, including detailed scores, are saved to a CSV file located in the data/ directory.
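
The split between the capped on-screen list and the full CSV dump could look roughly like this. Everything specific here is an assumption for the example: the column names, the `MAX_DISPLAYED` constant, and the tag thresholds are hypothetical and are not taken from config.py or from the real CSV layout.

```python
import csv
import io

MAX_DISPLAYED = 50  # hypothetical name; mirrors the default display limit

FIELDS = ["title", "company", "date_posted", "score", "tag"]


def relevance_tag(score):
    """Map a score in [0, 1] to a coarse label (thresholds are illustrative)."""
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


def displayed_results(ranked, limit=MAX_DISPLAYED):
    """Return only the top `limit` rows for the web interface."""
    return ranked[:limit]


def write_csv(ranked, handle):
    """Write every ranked row, plus its relevance tag, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in ranked:
        writer.writerow({**row, "tag": relevance_tag(row["score"])})


if __name__ == "__main__":
    rows = [
        {"title": f"Job {i}", "company": "Acme", "date_posted": "2024-01-01", "score": 1 - i / 100}
        for i in range(60)
    ]
    buffer = io.StringIO()
    write_csv(rows, buffer)
    print(len(displayed_results(rows)), "shown,", len(rows), "saved")
```

When writing to a real file rather than a `StringIO` buffer, open it with `newline=""` so that the `csv` module controls the line endings.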

Project Structure

├── app.py
├── config.py
├── src/
│   ├── fetchers
│   │   ├── scraper.py
│   │   └── jsearch.py
│   ├── ranking.py
│   └── utils.py
├── static/
│   ├── js
│   │   ├── crs.min.js
│   │   └── main.js
│   ├── avatar.png
│   └── styles.css
├── templates/
│   └── index.html
├── data/
│   └── dump_search.csv (created at runtime)
├── uploads/ (created at runtime)
├── README.md
├── requirements.txt
├── .gitignore
└── LICENSE

Configuration

App Parameters: Modify the default parameters in /config.py and in your /.env file. Notably, you can select which of the two fetchers to use for retrieving job ads. You can also implement your own fetcher based on another scraper or API; follow the code conventions in src/fetchers/scraper.py.
Results Storage: By default, results of the latest search are stored in a CSV file under the data/ directory (refreshed after each search).
Customizing Matching Logic: You are welcome to modify or replace the job-matching logic implemented in src/ranking.py to experiment with different matching strategies.
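
One possible shape for a pluggable fetcher is sketched below: fetchers register under a name, and the name from the FETCHER setting selects one at runtime. This is an illustration only. The registry, the `register` decorator, the `fetch_jobs` function, and the job fields are all hypothetical; the actual conventions are the ones in src/fetchers/scraper.py, which may differ.

```python
from typing import Callable, Dict, List

Job = Dict[str, str]
Fetcher = Callable[[List[str]], List[Job]]

FETCHERS: Dict[str, Fetcher] = {}   # hypothetical registry: name -> fetcher


def register(name):
    """Decorator that adds a fetcher function to the registry."""
    def decorator(func):
        FETCHERS[name] = func
        return func
    return decorator


@register("static_demo")
def fetch_static(search_terms):
    """Toy fetcher that filters a canned catalog instead of calling an API."""
    catalog = [
        {"title": "Product Manager", "company": "Acme", "description": "Own the roadmap."},
        {"title": "Financial Advisor", "company": "Globex", "description": "Advise clients."},
    ]
    wanted = [term.lower() for term in search_terms]
    return [job for job in catalog if any(term in job["title"].lower() for term in wanted)]


def fetch_jobs(fetcher_name, search_terms):
    """Look up the configured fetcher and run it."""
    try:
        fetcher = FETCHERS[fetcher_name]
    except KeyError:
        raise ValueError(f"Unknown fetcher: {fetcher_name!r}") from None
    return fetcher(search_terms)


if __name__ == "__main__":
    print(fetch_jobs("static_demo", ["product manager"]))
```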

Contributing

Contributions are welcome! Please fork this repository and submit a pull request for review.

License

This project is licensed under the terms of the GNU Affero General Public License (AGPL) v3.0.

Commercial Use

For commercial use or licensing inquiries, please contact [[email protected]].

Acknowledgements

This project uses the JobSpy package for job scraping.
SentenceTransformers provides the core semantic-matching mechanism. In particular, the mini models (e.g., all-MiniLM-L6-v2) provide a great balance between performance and efficiency.
The Elite Keyword Matching mechanism is inspired by: Susan, S., Sharma, M., & Choudhary, G. (2024). Uniqueness meets Semantics: A Novel Semantically Meaningful Bag-of-Words Approach for Matching Resumes to Job Profiles. Inteligencia Artificial, 27(74), 117–132.
