If 'docker ps' is not accessible, the Docker daemon is probably not running; start it with:
sudo service docker start
-
Enter the username 'ubuntu' and the password 'ubuntu'.
-
To open the Ubuntu home directory in Windows Explorer, run the command below:
explorer.exe .
-
Run the next step:
sudo -i
-
Enter the password 'ubuntu'.
-
Run the next step:
cd /home/ubuntu
mkdir airflow
sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get install python3.9
-
Run this and copy the output:
which python3.9
-
Run the next step:
nano ~/.bashrc
Add the following at the end of the file, using the path from the 'which python3.9' output above (add one new line at the very bottom):
alias python=/usr/bin/python3.9
-
Run the next step:
source ~/.bashrc
apt-get install python3-pip
-
Run the next step:
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
-
Run the next step:
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
-
Run the next step:
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo service docker start
sudo docker run hello-world
- Run this and copy the output (your IP address):
hostname -I | awk '{print $1}'
- Run the next step:
docker run --name postgresql-container -p 5433:5432 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=secret -d postgres
- Run the next step:
docker run --name pgadmin -p 80:80 \
    -e PGADMIN_DEFAULT_EMAIL="[email protected]" \
    -e PGADMIN_DEFAULT_PASSWORD="secret" \
    -d dpage/pgadmin4
- Open http://127.0.0.1:80 in a browser and log in with the email "[email protected]" and the password "secret".
- Create a connection with the hostname set to your IP address from the 'hostname -I' step above, port = 5433 (as mapped above), Database = postgres, Username = postgres, Password = secret.
- Run the CREATE TABLE and INSERT queries from table_postgres.txt in this repository; if you prefer to check the connection from Python, see the sketch below.
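A minimal sketch for verifying the connection from Python, assuming psycopg2-binary is installed (it is installed in a later step); the query is only a round-trip check, since the real schema lives in table_postgres.txt:

# check_postgres.py - a minimal sketch; assumes the postgresql-container above is running
import psycopg2

# Host port 5433 is mapped to the container's 5432 in the docker run command above.
conn = psycopg2.connect(
    host="127.0.0.1",
    port=5433,
    dbname="postgres",
    user="postgres",
    password="secret",
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")  # simple connectivity check
    print(cur.fetchone()[0])
conn.close()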
-
Run the next step:
nano ~/.bashrc
Add the following at the end of the file (add one new line at the very bottom):
alias pip='python -m pip'
-
Run the next step:
source ~/.bashrc
pip install streamlit
pip install psycopg2-binary
pip install scikit-learn
pip install matplotlib
-
Copy the code and directory structure from airflow/plugins/project_streamlit in this repository.
-
Run the next step:
streamlit run Hello.py
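If you want to smoke-test Streamlit before wiring up the real project files, a minimal placeholder Hello.py (an illustration, not the repository's actual app) could look like this:

# Hello.py - a minimal placeholder sketch, not the repository's actual app
import streamlit as st

st.title("Hello, Streamlit!")        # page title
name = st.text_input("Your name")    # simple input widget
if name:
    st.write(f"Welcome, {name}!")    # reactive output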
-
Change directory to where Hello.py is located, then build and run the Streamlit image:
docker build -t streamlit-app .
docker run -p 8501:8501 streamlit-app
-
Run the next step:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.6.3/docker-compose.yaml'
-
Run the next step:
nano Dockerfile
Enter the following, or copy it from airflow/Dockerfile in this repository:
FROM apache/airflow:2.6.3
USER root
USER airflow
COPY requirements.txt /
RUN pip install -r /requirements.txt
-
Copy the requirements.txt from airflow/requirements.txt in this repository.
-
Run the next step:
docker build -t airflow_custom .
-
Run the next step:
nano docker-compose.yaml
Find and replace the image line, or copy the whole file from airflow/docker-compose.yml in this repository:
version: '3.8'
x-airflow-common:
  &airflow-common
  # In order to add custom dependencies or upgrade provider packages you can use your extended image.
  # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
  # and uncomment the "build" line below, then run `docker-compose build` to build the images.
  image: ${AIRFLOW_IMAGE_NAME:-airflow_custom}   # <------ REPLACE THIS !!!
-
Run the next step:
mkdir -p ./dags ./logs ./plugins ./config
echo -e "AIRFLOW_UID=50000\nAIRFLOW_GID=0" > .env
-
Run the next step:
docker compose up
-
Open in a browser: http://127.0.0.1:8080/
-
Copy all the files from airflow/dags in this repository into your airflow dags directory (a minimal example DAG is sketched below).
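For orientation, a minimal DAG for Airflow 2.6 looks like the sketch below; the DAG id and task are illustrative only, the real retraining DAGs come from airflow/dags in this repository:

# example_dag.py - a minimal sketch; ids and logic are illustrative only
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def retrain_model():
    # Placeholder for the retraining logic in the repository's DAGs.
    print("retraining model...")

with DAG(
    dag_id="example_retrain",        # hypothetical id, not from the repository
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="retrain", python_callable=retrain_model)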
-
Log in with the username "airflow" and the password "airflow".
- Change directory to airflow/plugins/model_api and copy all the files and folders there.
- Run the next step:
docker build -t fast_api_model .
docker compose up
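For orientation, a credit-scoring prediction endpoint in FastAPI typically looks like the sketch below; the field names and model path are assumptions, the actual service code lives in airflow/plugins/model_api:

# main.py - a minimal sketch; field names and model path are hypothetical
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

with open("model.pkl", "rb") as f:   # hypothetical path to the trained model
    model = pickle.load(f)

class Applicant(BaseModel):
    income: float                    # illustrative features, not the real schema
    loan_amount: float

@app.post("/predict")
def predict(applicant: Applicant):
    features = [[applicant.income, applicant.loan_amount]]
    return {"score": int(model.predict(features)[0])}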
You can visit the deployed project here: https://dataengineer-azhar-test.streamlit.app/
Note:
Docker: wraps each tool/application in the project as a container, making it possible to run many applications on a Linux OS.
Postgres: the database whose data we pull for visualization in Streamlit (it can also back the ETL process and supply data for model retraining).
PgAdmin: a UI for Postgres that makes data modeling on Postgres easier.
Streamlit: a framework/tool for visualizing data, integrating machine learning, and publishing the project results to the public via Streamlit share.
Airflow: an orchestrator/scheduler tool for running jobs, in this case the model retraining process.
FastAPI: a framework used to expose the ML model as an API for credit-scoring prediction.
ML Model: machine learning using scikit-learn, with RandomizedSearchCV to find the best hyperparameters (a minimal sketch follows below).
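A minimal sketch of how RandomizedSearchCV is typically used with scikit-learn (the estimator and parameter ranges are illustrative, not the project's actual setup):

# a minimal RandomizedSearchCV sketch; estimator and ranges are illustrative only
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, random_state=42)  # toy data

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        "n_estimators": [100, 200, 400],
        "max_depth": [None, 5, 10],
    },
    n_iter=5,        # sample 5 random parameter combinations
    cv=3,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_)   # best hyperparameters found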
My portfolio: https://azharizz.my.canva.site/