Commit

first commit

samirsaci committed Nov 1, 2021
0 parents commit 4bd027b

Showing 40 changed files with 6,016 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
.dist
venv/*
App/*
.notes.txt
Binary file added 1000lines_35m_3mpng.png
5,001 changes: 5,001 additions & 0 deletions In/df_lines.csv

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021 Samir Saci

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
174 changes: 174 additions & 0 deletions README.md
@@ -0,0 +1,174 @@
# Automate ABC Analysis & Product Segmentation with Streamlit 📈
*A statistical methodology to segment your products based on turnover and demand variability, delivered as an automated web application built with the Streamlit framework*

<p align="center">
<img align="center" src="images/streamlit_capture.PNG" width=75%>
</p>

Product segmentation refers to the activity of grouping products that have similar characteristics and serve a similar market. It is usually related to marketing _(Sales Categories)_ or manufacturing _(Production Processes)_. However, as a **Supply Chain Engineer**, your focus is not on the product itself but on the complexity of managing its flow.

You want to understand the distribution of sales volumes (fast/slow movers) and the demand variability to optimize your production, storage and delivery operations and ensure the best service level, considering:
- The highest contribution to your total volume: ABC Analysis
- The most unstable demand: Demand Variability

I have designed this **Streamlit App** to give **Supply Chain Engineers** a tool for the product segmentation of their portfolio, with a focus on retail products, considering the complexity of the demand and the volume contribution of each item.

### Understand the theory behind 📜
In this [Medium Article](https://towardsdatascience.com/product-segmentation-for-retail-with-python-c85cc0930f9a), you can find details about the theory used to build this tool.

# Access the application 🖥️
> Access it here: [Product Segmentation for Retail](https://share.streamlit.io/samirsaci/segmentation/main/segmentation.py)
## **Step 0: Why should you use it?**
This Streamlit Web Application has been designed for Supply Chain Engineers to support them in their Inventory Management. It will help you to automate product segmentation using statistics.

## **Step 1: What do you want to do?**
You have two ways to use this application:
- 🖥️ Look at the results computed by the model on the pre-loaded dataset: in that case, you just need to scroll to see the visuals and the analyses
OR
- 💾 Upload your dataset of sales records that includes columns related to:
- **Item master data**
_For example: SKU ID, Category, Sub-Category, Store ID_
- **Date of the sales**:
_For example: Day, Week, Month, Year_
- **Quantity or value**: this measure will be used for the ABC analysis (a minimal input example is sketched after this list)
_For example: units, cartons, pallets or euros/dollars/your local currency_
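
As an illustration only, a minimal input file could be built as follows (all column names — `SKU_ID`, `FAMILY`, `STORE_ID`, `WEEK`, `QTY` — are hypothetical; your own schema may differ):

```python
import pandas as pd

# Hypothetical sales records combining item master data, a date column and a quantity
df = pd.DataFrame({
    "SKU_ID":   ["SKU-001", "SKU-001", "SKU-002", "SKU-002"],
    "FAMILY":   ["DRY", "DRY", "FRESH", "FRESH"],
    "STORE_ID": ["ST-01", "ST-02", "ST-01", "ST-01"],
    "WEEK":     [1, 1, 1, 2],
    "QTY":      [120, 80, 45, 60],
})
df.to_csv("sales_records.csv", index=False)  # CSV file ready to upload in the app
```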

## **Step 2: Prepare the analysis**

### **1. 💾 Upload your dataset of sales records**
<p align="center">
<img align="center" src="images/step_1.PNG" width=40%>
</p>

💡 _Please make sure that your dataset is a CSV file smaller than 200 MB. If you need to increase this limit, you should clone this repository and deploy the app locally following the instructions below._

### **2. 📅 [Parameters] select the columns for the date (day, week, year) and the values (quantity, $)**
<p align="center">
<img align="center" src="images/step_2.PNG" width=75%>
</p>

💡 _If you have several columns for the date (day, week, month) and for the values (quantity, amount), you can use only one column per category for each calculation run._

### **3. 📉 [Parameters] select all the columns you want to keep in the analysis**
<p align="center">
<img align="center" src="images/step_3.PNG" width=75%>
</p>

💡 _This step helps you remove the columns you do not need for your analysis, to speed up computation and reduce resource usage._

### **4. 🏬 [Parameters] select all the columns related to product master data (SKU ID, FAMILY, CATEGORY, STORE LOCATION)**
<p align="center">
<img align="center" src="images/step_4.PNG" width=75%>
</p>

💡 _In this step you choose at what granularity you want to run your analysis (see the sketch below). For example, it can be at:_
- _(Item, Store) level: the same item in two stores will represent two SKUs_
- _Item ID level: the sales of an item are grouped across all stores_
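
As a sketch of the difference, with pandas the two granularities only differ by the grouping keys (reusing the hypothetical `df` and column names from the Step 1 example):

```python
# SKU at (Item, Store) level: the same item in two stores counts as two SKUs
sku_store_level = df.groupby(["SKU_ID", "STORE_ID", "WEEK"])["QTY"].sum()

# SKU at Item ID level: the sales of an item are aggregated across all stores
sku_item_level = df.groupby(["SKU_ID", "WEEK"])["QTY"].sum()
```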

### **5. 🛍️ [Parameters] select one feature you want to use for analysis by family**
<p align="center">
<img align="center" src="images/step_5.PNG" width=75%>
</p>

💡 _This feature will be used to plot the distribution of (A, B, C) products by family._

### **6. 🖱️ Click on 'Start Calculation?' to launch the analysis**
<p align="center">
<img align="center" src="images/step_6.PNG" width=75%>
</p>

💡 _Once the box is checked, the application runs the calculations and displays the visuals and analyses below._

# Get insights about your sales records 💡

### **Pareto Analysis**

<p align="center">
<img align="center" src="images/pareto.PNG" width=75%>
</p>

**INSIGHTS:**
1. How many SKUs represent 80% of your total sales?
2. What share of your total sales do the top 20% of SKUs represent?

_For more information about the theory behind the Pareto law and its application in Supply Chain Management: [Pareto Principle for Warehouse Layout Optimization](https://towardsdatascience.com/reduce-warehouse-space-with-the-pareto-principle-using-python-e722a6babe0e)_
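
As a rough sketch of the computation behind this chart (not the application's exact code), the Pareto curve is obtained by sorting SKUs by sales and accumulating their share; `df` reuses the hypothetical example from Step 1:

```python
import numpy as np
import pandas as pd

# Total sales per SKU, sorted from fastest to slowest mover
sales = df.groupby("SKU_ID")["QTY"].sum().sort_values(ascending=False)

# Cumulative share of sales and cumulative share of SKU count
cum_sales = sales.cumsum() / sales.sum()
cum_sku = pd.Series(np.arange(1, len(sales) + 1) / len(sales), index=sales.index)

# 1. Share of SKUs needed to reach 80% of total sales
sku_share_for_80pct = cum_sku[cum_sales >= 0.80].iloc[0]
# 2. Share of sales covered by the top 20% of SKUs
sales_share_top20pct = cum_sales[cum_sku >= 0.20].iloc[0]
```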

### **ABC Analysis with Demand Variability**

<p align="center">
<img align="center" src="images/abc_analysis.PNG" width=75%>
</p>

**QUESTIONS: WHAT IS THE PROPORTION OF:**
1. **LOW IMPORTANCE SKUs**: C references
2. **STABLE DEMAND SKUs**: A and B SKUs with a coefficient of variation below 1
3. **HIGH IMPORTANCE SKUs**: A and B SKUs with a high coefficient of variation

Your inventory management strategies will be impacted by this split (a minimal computation sketch follows this list):
- Minimum effort should be put into **LOW IMPORTANCE SKUs**
- Automated rules with moderate attention for **STABLE SKUs**
- Complex replenishment rules and careful attention for **HIGH IMPORTANCE SKUs**
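
A minimal sketch of this segmentation, assuming weekly sales per SKU and the usual 80%/95% ABC cut-offs (the thresholds and column names are assumptions, not the app's exact parameters):

```python
import pandas as pd

# Weekly sales per SKU: one row per SKU, one column per week (df from the Step 1 sketch)
weekly = df.groupby(["SKU_ID", "WEEK"])["QTY"].sum().unstack(fill_value=0)

# ABC class from the cumulative share of total sales (A: first 80%, B: next 15%, C: rest)
total = weekly.sum(axis=1).sort_values(ascending=False)
cum_share = total.cumsum() / total.sum()
abc = pd.cut(cum_share, bins=[0, 0.80, 0.95, 1.0], labels=["A", "B", "C"])

# Demand variability: coefficient of variation of the weekly sales
cv = weekly.std(axis=1) / weekly.mean(axis=1)

low_importance = (abc == "C")                       # low importance SKUs
stable_demand = abc.isin(["A", "B"]) & (cv < 1)     # stable demand SKUs
high_importance = abc.isin(["A", "B"]) & (cv >= 1)  # high importance, unstable SKUs
```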


_For more information: [Medium Article](https://towardsdatascience.com/product-segmentation-for-retail-with-python-c85cc0930f9a)_

<p align="center">
<img align="center" src="images/split_category.PNG" width=75%>
</p>

**QUESTIONS:**
1. What is the split of SKUs by FAMILY?
2. What is the split of SKUs by ABC class within each FAMILY?


### **Normality Test**

<p align="center">
<img align="center" src="images/normality.PNG" width=75%>
</p>

**QUESTION:**
- Which SKUs have a sales distribution that follows a normal distribution?

Many inventory rules and safety stock formulas can be used only if the sales distribution of your item follows a normal distribution. Therefore, it is better to know the percentage of your portfolio that can be managed easily.

_For more information: [Inventory Management for Retail — Stochastic Demand](https://towardsdatascience.com/inventory-management-for-retail-stochastic-demand-3020a43d1c14)_
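
As a sketch, one common way to run such a check per SKU is the Shapiro-Wilk test from SciPy (the application may rely on a different test; the 5% significance level, the `weekly` table from the sketch above and at least three periods of history per SKU are assumptions):

```python
from scipy import stats

# p-value > 0.05: we cannot reject the hypothesis that weekly sales are normal
is_normal = weekly.apply(lambda row: stats.shapiro(row)[1] > 0.05, axis=1)
print(f"{is_normal.mean():.0%} of the SKUs can be managed with normal-based rules")
```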


# Build the application locally 🏗️

## **Build a local Python environment (recommended)**

### Install **virtualenv** using pip3

    sudo pip3 install virtualenv

### Create a virtual environment

    virtualenv venv

### Activate your virtual environment

    source venv/bin/activate

## Launch Streamlit 🚀

### Install all the dependencies needed using requirements.txt

    pip install -r requirements.txt

### Run the application

    streamlit run segmentation.py

### Click on the Network URL in the shell
<p align="center">
<img align="center" src="images/network.PNG" width=50%>
</p>

> Enjoy!
# About me 🤓
Senior Supply Chain Engineer with international experience working on Logistics and Transportation operations. \
Have a look at my portfolio: [Data Science for Supply Chain Portfolio](https://samirsaci.com) \
Data Science for Warehousing📦, Transportation 🚚 and Demand Forecasting 📈
124 changes: 124 additions & 0 deletions app.py
@@ -0,0 +1,124 @@
import pandas as pd
import numpy as np
import plotly.express as px
from utils.routing.distances import (
    distance_picking,
    next_location
)
from utils.routing.routes import (
    create_picking_route
)
from utils.batch.mapping_batch import (
    orderlines_mapping,
    locations_listing
)
from utils.cluster.mapping_cluster import (
    df_mapping
)
from utils.batch.simulation_batch import (
    simulation_wave,
    simulate_batch
)
from utils.cluster.simulation_cluster import (
    loop_wave,
    simulation_cluster,
    create_dataframe,
    process_methods
)
from utils.results.plot import (
    plot_simulation1,
    plot_simulation2
)
import streamlit as st
from streamlit import caching

# Set page configuration
st.set_page_config(page_title="Improve Warehouse Productivity using Order Batching",
                   initial_sidebar_state="expanded",
                   layout='wide',
                   page_icon="🛒")

# Set up the page
@st.cache(persist=False,
          allow_output_mutation=True,
          suppress_st_warning=True,
          show_spinner=True)
def load(filename, n):
    # Preparation of data: load the first n order lines from the input folder
    df_orderlines = pd.read_csv(IN + filename).head(n)
    return df_orderlines


# Alley Coordinates on y-axis
y_low, y_high = 5.5, 50
# Origin Location
origin_loc = [0, y_low]
# Distance Threshold (m)
distance_threshold = 35
distance_list = [1] + [i for i in range(5, 100, 5)]
IN = 'In/'
# Store Results by WaveID
list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult = [], [], [], [], [], [], []
list_results = [list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult] # Group in list
# Store Results by Simulation (Order_number)
list_ordnum, list_dstw = [], []

# Simulation 1: Order Batch
# SCOPE SIZE
st.header("**🥇 Impact of the wave size in orders (Orders/Wave)**")
st.subheader('''
🛠️ HOW MANY ORDER LINES DO YOU WANT TO INCLUDE IN YOUR ANALYSIS?
''')
col1, col2 = st.beta_columns(2)
with col1:
    n = st.slider(
        'SIMULATION 1 SCOPE (THOUSAND ORDERS)', 1, 200, value=5)
with col2:
    lines_number = 1000 * n
    st.write('''🛠️{:,} \
order lines'''.format(lines_number))
# SIMULATION PARAMETERS
st.subheader('''
🛠️ SIMULATE ORDER PICKING BY WAVE OF N ORDERS PER WAVE WITH N IN [N_MIN, N_MAX] ''')
col_11, col_22 = st.beta_columns(2)
with col_11:
    n1 = st.slider(
        'SIMULATION 1: N_MIN (ORDERS/WAVE)', 0, 20, value=1)
    n2 = st.slider(
        'SIMULATION 1: N_MAX (ORDERS/WAVE)', n1 + 1, 20, value=int(np.max([n1 + 1, 10])))
with col_22:
    st.write('''[N_MIN, N_MAX] = [{:,}, {:,}]'''.format(n1, n2))
# START CALCULATION
start_1 = False
if st.checkbox('SIMULATION 1: START CALCULATION', key='show', value=False):
    start_1 = True
# Calculation
if start_1:
    df_orderlines = load('df_lines.csv', lines_number)
    df_waves, df_results = simulate_batch(n1, n2, y_low, y_high, origin_loc, lines_number, df_orderlines)
    plot_simulation1(df_results, lines_number)

# Simulation 2: Order Batch using Spatial Clustering
# SCOPE SIZE
st.header("**🥈 Impact of the wave size in orders (Orders/Wave) using spatial clustering**")
st.subheader('''
🛠️ HOW MANY ORDER LINES DO YOU WANT TO INCLUDE IN YOUR ANALYSIS?
''')
col1, col2 = st.beta_columns(2)
with col1:
    n_ = st.slider(
        'SIMULATION 2 SCOPE (THOUSAND ORDERS)', 1, 200, value=5)
with col2:
    lines_2 = 1000 * n_
    st.write('''🛠️{:,} \
order lines'''.format(lines_2))
# START CALCULATION
start_2 = False
if st.checkbox('SIMULATION 2: START CALCULATION', key='show_2', value=False):
    start_2 = True
# Calculation
if start_2:
    df_orderlines = load('df_lines.csv', lines_2)
    df_reswave, df_results = simulation_cluster(y_low, y_high, df_orderlines, list_results, n1, n2,
                                                distance_threshold)
    plot_simulation2(df_reswave, lines_2, distance_threshold)
41 changes: 41 additions & 0 deletions notes.txt
@@ -0,0 +1,41 @@
# Example Artefact
https://github.com/MaximeLutel/streamlit_prophet
https://streamlit.io/gallery?category=model-building-training
- The Math of the Prophet
https://medium.com/future-vision/the-math-of-prophet-46864fa9c55a

# INSTALL NODE
https://docs.microsoft.com/fr-fr/windows/dev-environment/javascript/nodejs-on-wsl


# Ubuntu WSL VS Code
https://code.visualstudio.com/docs/remote/wsl
- Grant admin rights to write and to download libraries
sudo chown -R samirs streamlit_prophet

# Move local directory of Windows to Local Linux
mkdir app
cp -R /mnt/c/Data/62-\ Projects/24-\ Articles/25-\ Improve\ Warehouse\ Productivity/App ~/app
cd ~/App
code .

# Github
git config --global user.email "[email protected]"
git config --global user.name "Samir Saci"
git remote add origin 'https://github.com/samirsaci/segmentation.git'
git push -u origin main

# Install virtualenv
pip install virtualenv
python3.8 -m virtualenv venv
source venv/bin/activate

# Activate Streamlit
streamlit run segmentation.py --server.address 0.0.0.0
streamlit run app.py --server.address 0.0.0.0

# SEGMENTATION TO DO
1) FAMILY = F(SKU SCOPE)
2) ITEM = ITEM LIST - FAMILY

C:\Data\62- Projects\24- Articles\25- Improve Warehouse Productivity\App