job4j_grabber is a project for automating the search and collection of job vacancies from websites based on specified criteria.
The application runs on a schedule and gathers all relevant vacancies for Java developers from a specified website (career.habr.com, section /vacancies/java_developer).
The collected data is stored in a PostgreSQL database and exposed through a REST API.
The execution schedule is configured in the app.properties settings file.
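
As a rough illustration of how a schedule setting could be read from app.properties, here is a minimal sketch using only `java.util.Properties`. The key name `grabber.interval.seconds` and the `ScheduleConfig` class are assumptions made for this example; the project's actual key and loading code may differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: reads the run interval from app.properties-style settings. */
final class ScheduleConfig {

    // Assumed key name; the real key in app.properties may differ.
    static final String INTERVAL_KEY = "grabber.interval.seconds";

    /** Loads key=value pairs from any reader (a file reader in real use). */
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the interval in seconds, failing fast on a missing or invalid value. */
    static int intervalSeconds(Properties props) {
        String raw = props.getProperty(INTERVAL_KEY);
        if (raw == null) {
            throw new IllegalStateException("Missing property: " + INTERVAL_KEY);
        }
        int seconds = Integer.parseInt(raw.trim());
        if (seconds <= 0) {
            throw new IllegalStateException("Interval must be positive: " + seconds);
        }
        return seconds;
    }

    public static void main(String[] args) {
        // Inline text stands in for the contents of app.properties.
        Properties props = load(new StringReader("grabber.interval.seconds=3600\n"));
        System.out.println(intervalSeconds(props)); // prints 3600
    }
}
```

Validating the value at startup means a misconfigured interval surfaces immediately rather than when the scheduler first fires.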
Main Features
- Vacancy Parsing:
  - Uses the jsoup library for HTML parsing and extracting job vacancy data.
  - The application automatically navigates through job listing pages, identifies Java-related positions, and saves them to the database.
- Task Scheduling:
  - The application runs on a schedule using the Quartz Scheduler library, with periodicity defined in the app.properties file.
- Data Storage:
  - Job vacancy data is stored in a relational PostgreSQL database using JDBC.
- Data Access:
  - A REST API is implemented to provide access to the collected data via HTTP requests.
- CI/CD:
  - GitHub Actions is used for automatic project builds and testing.
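
The page navigation and filtering described under Vacancy Parsing can be sketched roughly as follows. This example leaves out the jsoup fetching and HTML extraction entirely and shows only the surrounding logic. The `VacancyPages` class, the `?page=N` query parameter, and the title-matching rule are assumptions made for illustration, not the project's confirmed implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for building listing-page URLs and filtering titles. */
final class VacancyPages {

    // Base URL taken from the README; the "?page=N" parameter is an assumption.
    static final String BASE = "https://career.habr.com/vacancies/java_developer";

    /** Builds the URLs for listing pages 1..pages. */
    static List<String> pageUrls(int pages) {
        List<String> urls = new ArrayList<>();
        for (int page = 1; page <= pages; page++) {
            urls.add(BASE + "?page=" + page);
        }
        return urls;
    }

    /** Assumed rule: the title still mentions "java" after "javascript" is ignored. */
    static boolean isJavaTitle(String title) {
        String lower = title.toLowerCase(Locale.ROOT);
        return lower.replace("javascript", "").contains("java");
    }

    public static void main(String[] args) {
        for (String url : pageUrls(3)) {
            System.out.println(url);
        }
        System.out.println(isJavaTitle("Senior Java Developer"));
        System.out.println(isJavaTitle("JavaScript Developer"));
    }
}
```

In a real run, each URL would be handed to the HTML parser, and only the vacancies whose titles pass the filter would be saved.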
- `mvn install` - to build the project
- `java -jar target/job4j_grabber-1.0.jar` - to run the project
Before running, make sure that a database named job4j exists and that it contains a table named post.
Script to create the **post** table:

`create table post (id serial primary key, name text, text text, link text unique, created timestamp)`

The application is used to search for Java vacancies on the website https://career.habr.com.
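
To show how the columns of the post table might map to Java code, here is a minimal JDBC sketch. The `Post` and `PostStore` classes are hypothetical names chosen for this example. The `on conflict (link) do nothing` clause is PostgreSQL syntax used here as one possible way to respect the unique constraint on `link`; the project itself may handle duplicates differently.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDateTime;

/** Hypothetical model class; its fields follow the columns of the post table. */
final class Post {
    final String name;
    final String text;
    final String link;
    final LocalDateTime created;

    Post(String name, String text, String link, LocalDateTime created) {
        this.name = name;
        this.text = text;
        this.link = link;
        this.created = created;
    }
}

/** Hypothetical store: inserts posts and skips links that are already saved. */
final class PostStore {

    // ON CONFLICT is PostgreSQL-specific; it skips rows whose link already exists.
    static final String INSERT_SQL =
            "insert into post (name, text, link, created) values (?, ?, ?, ?) "
            + "on conflict (link) do nothing";

    /** Returns true if a new row was inserted, false if the link was already stored. */
    static boolean save(Connection connection, Post post) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(INSERT_SQL)) {
            ps.setString(1, post.name);
            ps.setString(2, post.text);
            ps.setString(3, post.link);
            ps.setTimestamp(4, Timestamp.valueOf(post.created));
            return ps.executeUpdate() > 0;
        }
    }
}
```

The `id` column is omitted from the insert because `serial` lets the database assign it. Calling `save` requires an open `Connection` to the job4j database and the PostgreSQL JDBC driver on the classpath.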
- Java Core
- Jsoup (HTML parsing)
- Quartz Scheduler (task scheduling)
- JDBC
- PostgreSQL (data storage)
- Maven (dependency management and project build)
- GitHub Actions (CI/CD integration)
- REST API