Demo Video 📽️: https://www.youtube.com/watch?v=O_P305Kb26g
The Mars Weather Data Project is a Java application that retrieves and stores weather data from NASA's InSight Mars weather API into a MongoDB database. The application fetches weather data, processes it, and allows retrieval of stored documents for further analysis. It is designed to facilitate the exploration of Martian weather conditions.
- Connects to a MongoDB database to store Mars weather data.
- Fetches the latest available weather data from the NASA InSight API.
- Parses JSON data and inserts it into the MongoDB collection.
- Retrieves stored weather data from MongoDB.
- Uses environment variables to manage sensitive information securely.
To run this project, ensure you have the following installed:
- Java 11 or higher
- Maven
- MongoDB
- An active internet connection to access the NASA API
git clone https://github.com/capybara-brain346/Mars-Data-ETL-Pipeline.git
cd Mars-Data-ETL-Pipeline
Create a .env file in the root directory of your project with the following variables:
DB_USERNAME=your_mongodb_username
DB_PASSWORD=your_mongodb_password
DB_URI=your_mongodb_uri
NASA_API_KEY=your_nasa_api_key
Replace the placeholder values with your actual MongoDB credentials and NASA API key.
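Before starting the application, it can help to confirm that all four variables are set. The following is a minimal sketch, not the project's own loader (the dependency list names dotenv-java for that job); it uses only the JDK, and the class name `EnvFileSketch` and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: parses .env-style lines and reports missing keys.
public class EnvFileSketch {

    // The four keys named in the .env example above.
    static final List<String> REQUIRED_KEYS =
            List.of("DB_USERNAME", "DB_PASSWORD", "DB_URI", "NASA_API_KEY");

    // Parse KEY=value lines; blank lines and lines starting with '#' are skipped.
    // Only the first '=' splits the line, so values such as URIs may contain '='.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // not a KEY=value line
            }
            values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return values;
    }

    // Return the required keys that are absent or have an empty value.
    static List<String> missingKeys(Map<String, String> values) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = values.get(key);
            if (value == null || value.isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample content only; real values come from your own .env file.
        List<String> sample = List.of(
                "# local settings",
                "DB_USERNAME=demo_user",
                "DB_PASSWORD=demo_password",
                "DB_URI=mongodb://localhost:27017",
                "NASA_API_KEY=");
        System.out.println("Missing keys: " + missingKeys(parse(sample)));
    }
}
```

To check a real file, the lines could be read with `Files.readAllLines(Path.of(".env"))` and passed to `parse`.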
Use Maven to build the project:
mvn clean install
You can run the application using the following command:
java -cp target/mars-weather-data-1.0-SNAPSHOT.jar org.mars.Load
The Main class can be modified to call the sendDataToMongoDB method, which fetches data from the NASA API and inserts it into MongoDB. Uncomment the relevant line in the main method:
// sendDataToMongoDB(getAPI, mongoDBClient);
The application retrieves stored weather data from MongoDB using the retrieveDataFromMongoDB method in the Main class. This method prints the retrieved documents to the console.
- MongoDBClient: Manages connections to the MongoDB database and handles data insertion and retrieval.
- Main: Entry point for the application; orchestrates data fetching and processing.
- GetAPI: Handles HTTP requests to the NASA API to retrieve weather data.
- DataRepository: Responsible for inserting weather data into a relational database (not fully defined in the provided code).
- DataModel: Represents the data structure for storing Mars weather data (not fully defined in the provided code).
- DateUtil: Utility class for date formatting (not fully defined in the provided code).
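To make the GetAPI role more concrete, here is a minimal sketch of how a request to the InSight weather endpoint might be built. It is an illustration under stated assumptions, not the project's GetAPI class: it uses the JDK's `java.net.http` types instead of OkHttp so that it has no external dependencies, the class name `InsightRequestSketch` is hypothetical, and the endpoint URL and query parameters should be checked against NASA's API documentation.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical sketch: builds (but does not send) an InSight weather request.
public class InsightRequestSketch {

    // Assumed public endpoint for the InSight weather feed; verify before relying on it.
    static final String BASE_URL = "https://api.nasa.gov/insight_weather/";

    // Build the request URI, URL-encoding the API key.
    static URI buildUri(String apiKey) {
        String key = URLEncoder.encode(apiKey, StandardCharsets.UTF_8);
        return URI.create(BASE_URL + "?api_key=" + key + "&feedtype=json&ver=1.0");
    }

    // Build a GET request that asks for JSON and gives up after 30 seconds.
    static HttpRequest buildRequest(String apiKey) {
        return HttpRequest.newBuilder(buildUri(apiKey))
                .timeout(Duration.ofSeconds(30))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Falls back to NASA's shared demo key when NASA_API_KEY is not set.
        String apiKey = System.getenv().getOrDefault("NASA_API_KEY", "DEMO_KEY");
        HttpRequest request = buildRequest(apiKey);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, and the response body handed to whatever JSON parsing step the application uses.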
The project uses the libraries listed below. Add them as dependencies in your pom.xml:
<dependencies>
    <!-- MongoDB Java Driver -->
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongodb-driver-sync</artifactId>
        <version>4.6.0</version>
    </dependency>
    <!-- OkHttp -->
    <dependency>
        <groupId>com.squareup.okhttp3</groupId>
        <artifactId>okhttp</artifactId>
        <version>4.9.2</version>
    </dependency>
    <!-- Jackson Databind -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.13.1</version>
    </dependency>
    <!-- dotenv-java -->
    <dependency>
        <groupId>io.github.cdimascio</groupId>
        <artifactId>dotenv-java</artifactId>
        <version>3.1.0</version>
    </dependency>
</dependencies>
An interactive dashboard for visualizing weather conditions on Mars. The project uses data from a MySQL database, processes JSON-formatted weather attributes, and presents insights using a Dash web application.
- Visualize Trends: Analyze pressure, temperature, and wind speed variations over time.
- Interactive Visualization: Explore relationships like temperature vs. pressure and wind direction distribution.
- Custom Styling: Dark-themed dashboard with vibrant color schemes for better readability.
This DAG runs a Java JAR file as part of an automated workflow.
from airflow.operators.bash import BashOperator
from datetime import datetime
from airflow import DAG
default_args = {
    'owner': 'airflow',
    'start_date': datetime(2024, 10, 31),
    'retries': 1,
}
dag = DAG(
    'run_java_jar',
    default_args=default_args,
    schedule_interval='@daily',
)
run_java_jar = BashOperator(
    task_id='run_java_jar_task',
    bash_command='java -jar /app/Extract-1.0-SNAPSHOT.jar',
    dag=dag,
)
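The Docker Compose file sets up an Airflow environment with PostgreSQL as the metadata backend and provisions the services required for execution: the database, a one-off initialization job, the webserver, and the scheduler.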
version: "3"
x-airflow-common: &airflow-common
  image: apache/airflow:2.0.0
  environment:
    - AIRFLOW__CORE__EXECUTOR=LocalExecutor
    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
    - AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
    - AIRFLOW__CORE__LOAD_EXAMPLES=False
    - AIRFLOW__CORE__LOGGING_LEVEL=INFO
  volumes:
    - ./dags:/opt/airflow/dags
    - ./airflow-data/logs:/opt/airflow/logs
    - ./airflow-data/plugins:/opt/airflow/plugins
  depends_on:
    - postgres
services:
  postgres:
    image: postgres:12
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
  airflow-init:
    <<: *airflow-common
    container_name: airflow_init
    entrypoint: /bin/bash
    command:
      - -c
      - |
        airflow db init &&
        airflow users create \
          --role Admin \
          --username airflow \
          --password airflow \
          --email [email protected] \
          --firstname airflow \
          --lastname airflow
    restart: "no"
    depends_on:
      - postgres
  airflow-webserver:
    <<: *airflow-common
    command: airflow webserver
    ports:
      - "8081:8080"
    container_name: airflow_webserver
    restart: always
    depends_on:
      - postgres
      - airflow-init
  airflow-scheduler:
    <<: *airflow-common
    command: airflow scheduler
    container_name: airflow_scheduler
    restart: always
    depends_on:
      - postgres
      - airflow-init
volumes:
  postgres-data:
The Dash app processes data from the MySQL database, extracts JSON attributes, and displays the processed data in various interactive graphs.
- Pressure Variation: Average, min, and max pressure over time.
- Temperature Profiles: Trends in Mars temperature with highlighted extremes.
- Wind Speed & Direction: Analyze wind speed changes and directional distributions.
Contributions are welcome! If you have suggestions for improvements or want to report a bug, please create an issue or submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Thanks to NASA for providing the InSight weather data API.
- Special thanks to the open-source community for the libraries used in this project.