Commit

first commit

samirsaci committed Nov 1, 2021
0 parents commit 4bd027b

Showing 40 changed files with 6,016 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
.dist
venv/*
App/*
.notes.txt
Binary file added 1000lines_35m_3mpng.png
5,001 changes: 5,001 additions & 0 deletions In/df_lines.csv

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021 Samir Saci

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
174 changes: 174 additions & 0 deletions README.md
@@ -0,0 +1,174 @@
# Automate ABC Analysis & Product Segmentation with Streamlit 📈
*A statistical methodology to segment your products based on turnover and demand variability, delivered as an automated web application built with the Streamlit framework*

<p align="center">
<img align="center" src="images/streamlit_capture.PNG" width=75%>
</p>

Product segmentation refers to the activity of grouping products that have similar characteristics and serve a similar market. It is usually related to marketing _(Sales Categories)_ or manufacturing _(Production Processes)_. However, as a **Supply Chain Engineer**, your focus is not on the product itself but on the complexity of managing its flow.

You want to understand the distribution of sales volumes (fast/slow movers) and the demand variability to optimize your production, storage and delivery operations and ensure the best service level, considering:
- The highest contribution to your total volume: ABC Analysis
- The most unstable demand: Demand Variability

I have designed this **Streamlit App** to give **Supply Chain Engineers** a tool for the product segmentation of their portfolio, with a focus on retail products, considering the complexity of the demand and the volume contribution of each item.

### Understand the theory behind 📜
In this [Medium Article](https://towardsdatascience.com/product-segmentation-for-retail-with-python-c85cc0930f9a), you can find details about the theory used to build this tool.

# Access the application 🖥️
> Access it here: [Product Segmentation for Retail](https://share.streamlit.io/samirsaci/segmentation/main/segmentation.py)
## **Step 0: Why should you use it?**
This Streamlit Web Application has been designed for Supply Chain Engineers to support them in their Inventory Management. It will help you to automate product segmentation using statistics.

## **Step 1: What do you want to do?**
You have two ways to use this application:
- 🖥️ Look at the results computed by the model on the pre-loaded dataset: in that case, you just need to scroll to see the visuals and the analyses
OR
- 💾 Upload your dataset of sales records that includes columns related to:
- **Item master data**
_For example: SKU ID, Category, Sub-Category, Store ID_
- **Date of the sales**:
_For example: Day, Week, Month, Year_
- **Quantity or value**: this measure will be used for the ABC analysis (a minimal input example is sketched after this list)
_For example: units, cartons, pallets or euros/dollars/your local currency_
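
As an illustration only, a minimal input file could be built as follows (all column names — `SKU_ID`, `FAMILY`, `STORE_ID`, `WEEK`, `QTY` — are hypothetical; your own schema may differ):

```python
import pandas as pd

# Hypothetical sales records combining item master data, a date column and a quantity
df = pd.DataFrame({
    "SKU_ID":   ["SKU-001", "SKU-001", "SKU-002", "SKU-002"],
    "FAMILY":   ["DRY", "DRY", "FRESH", "FRESH"],
    "STORE_ID": ["ST-01", "ST-02", "ST-01", "ST-01"],
    "WEEK":     [1, 1, 1, 2],
    "QTY":      [120, 80, 45, 60],
})
df.to_csv("sales_records.csv", index=False)  # CSV file ready to upload in the app
```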

## **Step 2: Prepare the analysis**

### **1. 💾 Upload your dataset of sales records**
<p align="center">
<img align="center" src="images/step_1.PNG" width=40%>
</p>

💡 _Please make sure that your dataset is a CSV file smaller than 200 MB. If you need to increase this limit, you should clone this repository and deploy the app locally following the instructions below._

### **2. 📅 [Parameters] select the columns for the date (day, week, year) and the values (quantity, $)**
<p align="center">
<img align="center" src="images/step_2.PNG" width=75%>
</p>

💡 _If you have several columns for the date (day, week, month) and for the values (quantity, amount), you can use only one column per category for each calculation run._

### **3. 📉 [Parameters] select all the columns you want to keep in the analysis**
<p align="center">
<img align="center" src="images/step_3.PNG" width=75%>
</p>

💡 _This step helps you remove the columns you do not need for your analysis, to speed up computation and reduce resource usage._

### **4. 🏬 [Parameters] select all the columns related to product master data (SKU ID, FAMILY, CATEGORY, STORE LOCATION)**
<p align="center">
<img align="center" src="images/step_4.PNG" width=75%>
</p>

💡 _In this step you choose at what granularity you want to run your analysis (see the sketch below). For example, it can be at:_
- _(Item, Store) level: the same item in two stores will represent two SKUs_
- _Item ID level: the sales of an item are grouped across all stores_
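
As a sketch of the difference, with pandas the two granularities only differ by the grouping keys (reusing the hypothetical `df` and column names from the Step 1 example):

```python
# SKU at (Item, Store) level: the same item in two stores counts as two SKUs
sku_store_level = df.groupby(["SKU_ID", "STORE_ID", "WEEK"])["QTY"].sum()

# SKU at Item ID level: the sales of an item are aggregated across all stores
sku_item_level = df.groupby(["SKU_ID", "WEEK"])["QTY"].sum()
```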

### **5. 🛍️ [Parameters] select one feature you want to use for analysis by family**
<p align="center">
<img align="center" src="images/step_5.PNG" width=75%>
</p>

💡 _This feature will be used to plot the distribution of (A, B, C) products by family._

### **6. 🖱️ Click on 'Start Calculation?' to launch the analysis**
<p align="center">
<img align="center" src="images/step_6.PNG" width=75%>
</p>

💡 _Once the box is checked, the application runs the calculations and displays the visuals and analyses below._

# Get insights about your sales records 💡

### **Pareto Analysis**

<p align="center">
<img align="center" src="images/pareto.PNG" width=75%>
</p>

**INSIGHTS:**
1. How many SKUs represent 80% of your total sales?
2. What share of your total sales do the top 20% of SKUs represent?

_For more information about the theory behind the Pareto law and its application in Supply Chain Management: [Pareto Principle for Warehouse Layout Optimization](https://towardsdatascience.com/reduce-warehouse-space-with-the-pareto-principle-using-python-e722a6babe0e)_
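
As a rough sketch of the computation behind this chart (not the application's exact code), the Pareto curve is obtained by sorting SKUs by sales and accumulating their share; `df` reuses the hypothetical example from Step 1:

```python
import numpy as np
import pandas as pd

# Total sales per SKU, sorted from fastest to slowest mover
sales = df.groupby("SKU_ID")["QTY"].sum().sort_values(ascending=False)

# Cumulative share of sales and cumulative share of SKU count
cum_sales = sales.cumsum() / sales.sum()
cum_sku = pd.Series(np.arange(1, len(sales) + 1) / len(sales), index=sales.index)

# 1. Share of SKUs needed to reach 80% of total sales
sku_share_for_80pct = cum_sku[cum_sales >= 0.80].iloc[0]
# 2. Share of sales covered by the top 20% of SKUs
sales_share_top20pct = cum_sales[cum_sku >= 0.20].iloc[0]
```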

### **ABC Analysis with Demand Variability**

<p align="center">
<img align="center" src="images/abc_analysis.PNG" width=75%>
</p>

**QUESTIONS: WHAT IS THE PROPORTION OF:**
1. **LOW IMPORTANCE SKUs**: C references
2. **STABLE DEMAND SKUs**: A and B SKUs with a coefficient of variation below 1
3. **HIGH IMPORTANCE SKUs**: A and B SKUs with a high coefficient of variation

Your inventory management strategies will be impacted by this split (a minimal computation sketch follows this list):
- Minimum effort should be put into **LOW IMPORTANCE SKUs**
- Automated rules with moderate attention for **STABLE SKUs**
- Complex replenishment rules and careful attention for **HIGH IMPORTANCE SKUs**
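
A minimal sketch of this segmentation, assuming weekly sales per SKU and the usual 80%/95% ABC cut-offs (the thresholds and column names are assumptions, not the app's exact parameters):

```python
import pandas as pd

# Weekly sales per SKU: one row per SKU, one column per week (df from the Step 1 sketch)
weekly = df.groupby(["SKU_ID", "WEEK"])["QTY"].sum().unstack(fill_value=0)

# ABC class from the cumulative share of total sales (A: first 80%, B: next 15%, C: rest)
total = weekly.sum(axis=1).sort_values(ascending=False)
cum_share = total.cumsum() / total.sum()
abc = pd.cut(cum_share, bins=[0, 0.80, 0.95, 1.0], labels=["A", "B", "C"])

# Demand variability: coefficient of variation of the weekly sales
cv = weekly.std(axis=1) / weekly.mean(axis=1)

low_importance = (abc == "C")                       # low importance SKUs
stable_demand = abc.isin(["A", "B"]) & (cv < 1)     # stable demand SKUs
high_importance = abc.isin(["A", "B"]) & (cv >= 1)  # high importance, unstable SKUs
```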


_For more information: [Medium Article](https://towardsdatascience.com/product-segmentation-for-retail-with-python-c85cc0930f9a)_

<p align="center">
<img align="center" src="images/split_category.PNG" width=75%>
</p>

**QUESTIONS:**
1. What is the split of SKUs by FAMILY?
2. What is the split of SKUs by ABC class within each FAMILY?


### **Normality Test**

<p align="center">
<img align="center" src="images/normality.PNG" width=75%>
</p>

**QUESTION:**
- Which SKUs have a sales distribution that follows a normal distribution?

Many inventory rules and safety stock formulas can be used only if the sales distribution of your item follows a normal distribution. Therefore, it is better to know the percentage of your portfolio that can be managed easily.

_For more information: [Inventory Management for Retail — Stochastic Demand](https://towardsdatascience.com/inventory-management-for-retail-stochastic-demand-3020a43d1c14)_
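
As a sketch, one common way to run such a check per SKU is the Shapiro-Wilk test from SciPy (the application may rely on a different test; the 5% significance level, the `weekly` table from the sketch above and at least three periods of history per SKU are assumptions):

```python
from scipy import stats

# p-value > 0.05: we cannot reject the hypothesis that weekly sales are normal
is_normal = weekly.apply(lambda row: stats.shapiro(row)[1] > 0.05, axis=1)
print(f"{is_normal.mean():.0%} of the SKUs can be managed with normal-based rules")
```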


# Build the application locally 🏗️

## **Build a local Python environment (recommended)**

### Install **virtualenv** using pip3

    sudo pip3 install virtualenv

### Create a virtual environment

    virtualenv venv

### Activate your virtual environment

    source venv/bin/activate

## Launch Streamlit 🚀

### Install all the dependencies needed using requirements.txt

    pip install -r requirements.txt

### Run the application

    streamlit run segmentation.py

### Click on the Network URL in the shell
<p align="center">
<img align="center" src="images/network.PNG" width=50%>
</p>

> Enjoy!
# About me 🤓
Senior Supply Chain Engineer with international experience working on Logistics and Transportation operations. \
Have a look at my portfolio: [Data Science for Supply Chain Portfolio](https://samirsaci.com) \
Data Science for Warehousing📦, Transportation 🚚 and Demand Forecasting 📈
124 changes: 124 additions & 0 deletions app.py
@@ -0,0 +1,124 @@
import pandas as pd
import numpy as np
import plotly.express as px
from utils.routing.distances import (
    distance_picking,
    next_location
)
from utils.routing.routes import (
    create_picking_route
)
from utils.batch.mapping_batch import (
    orderlines_mapping,
    locations_listing
)
from utils.cluster.mapping_cluster import (
    df_mapping
)
from utils.batch.simulation_batch import (
    simulation_wave,
    simulate_batch
)
from utils.cluster.simulation_cluster import (
    loop_wave,
    simulation_cluster,
    create_dataframe,
    process_methods
)
from utils.results.plot import (
    plot_simulation1,
    plot_simulation2
)
import streamlit as st
from streamlit import caching

# Set page configuration
st.set_page_config(page_title="Improve Warehouse Productivity using Order Batching",
                   initial_sidebar_state="expanded",
                   layout='wide',
                   page_icon="🛒")

# Set up the page
@st.cache(persist=False,
          allow_output_mutation=True,
          suppress_st_warning=True,
          show_spinner=True)
def load(filename, n):
    # Preparation of data: load the first n order lines from the input folder
    df_orderlines = pd.read_csv(IN + filename).head(n)
    return df_orderlines


# Alley Coordinates on y-axis
y_low, y_high = 5.5, 50
# Origin Location
origin_loc = [0, y_low]
# Distance Threshold (m)
distance_threshold = 35
distance_list = [1] + [i for i in range(5, 100, 5)]
IN = 'In/'
# Store Results by WaveID
list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult = [], [], [], [], [], [], []
list_results = [list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult] # Group in list
# Store Results by Simulation (Order_number)
list_ordnum, list_dstw = [], []

# Simulation 1: Order Batch
# SCOPE SIZE
st.header("**🥇 Impact of the wave size in orders (Orders/Wave)**")
st.subheader('''
🛠️ HOW MANY ORDER LINES DO YOU WANT TO INCLUDE IN YOUR ANALYSIS?
''')
col1, col2 = st.beta_columns(2)
with col1:
    n = st.slider(
        'SIMULATION 1 SCOPE (THOUSAND ORDERS)', 1, 200, value=5)
with col2:
    lines_number = 1000 * n
    st.write('''🛠️{:,} \
order lines'''.format(lines_number))
# SIMULATION PARAMETERS
st.subheader('''
🛠️ SIMULATE ORDER PICKING BY WAVE OF N ORDERS PER WAVE WITH N IN [N_MIN, N_MAX] ''')
col_11, col_22 = st.beta_columns(2)
with col_11:
    n1 = st.slider(
        'SIMULATION 1: N_MIN (ORDERS/WAVE)', 0, 20, value=1)
    n2 = st.slider(
        'SIMULATION 1: N_MAX (ORDERS/WAVE)', n1 + 1, 20, value=int(np.max([n1 + 1, 10])))
with col_22:
    st.write('''[N_MIN, N_MAX] = [{:,}, {:,}]'''.format(n1, n2))
# START CALCULATION
start_1 = False
if st.checkbox('SIMULATION 1: START CALCULATION', key='show', value=False):
    start_1 = True
# Calculation
if start_1:
    df_orderlines = load('df_lines.csv', lines_number)
    df_waves, df_results = simulate_batch(n1, n2, y_low, y_high, origin_loc, lines_number, df_orderlines)
    plot_simulation1(df_results, lines_number)

# Simulation 2: Order Batch using Spatial Clustering
# SCOPE SIZE
st.header("**🥈 Impact of the wave size in orders (Orders/Wave) using spatial clustering**")
st.subheader('''
🛠️ HOW MANY ORDER LINES DO YOU WANT TO INCLUDE IN YOUR ANALYSIS?
''')
col1, col2 = st.beta_columns(2)
with col1:
    n_ = st.slider(
        'SIMULATION 2 SCOPE (THOUSAND ORDERS)', 1, 200, value=5)
with col2:
    lines_2 = 1000 * n_
    st.write('''🛠️{:,} \
order lines'''.format(lines_2))
# START CALCULATION
start_2 = False
if st.checkbox('SIMULATION 2: START CALCULATION', key='show_2', value=False):
    start_2 = True
# Calculation
if start_2:
    df_orderlines = load('df_lines.csv', lines_2)
    df_reswave, df_results = simulation_cluster(y_low, y_high, df_orderlines, list_results, n1, n2,
                                                distance_threshold)
    plot_simulation2(df_reswave, lines_2, distance_threshold)
41 changes: 41 additions & 0 deletions notes.txt
@@ -0,0 +1,41 @@
# Example Artefact
https://github.com/MaximeLutel/streamlit_prophet
https://streamlit.io/gallery?category=model-building-training
- The Math of the Prophet
https://medium.com/future-vision/the-math-of-prophet-46864fa9c55a

# INSTALL NODE
https://docs.microsoft.com/fr-fr/windows/dev-environment/javascript/nodejs-on-wsl


# Ubuntu WSL VS Code
https://code.visualstudio.com/docs/remote/wsl
- Grant admin rights to write and to download libraries
sudo chown -R samirs streamlit_prophet

# Move local directory of Windows to Local Linux
mkdir app
cp -R /mnt/c/Data/62-\ Projects/24-\ Articles/25-\ Improve\ Warehouse\ Productivity/App ~/app
cd ~/App
code .

# Github
git config --global user.email "[email protected]"
git config --global user.name "Samir Saci"
git remote add origin 'https://github.com/samirsaci/segmentation.git'
git push -u origin main

# Install virtualenv
pip install virtualenv
python3.8 -m virtualenv venv
source venv/bin/activate

# Activate Streamlit
streamlit run segmentation.py --server.address 0.0.0.0
streamlit run app.py --server.address 0.0.0.0

# SEGMENTATION TO DO
1) FAMILY = F(SKU SCOPE)
2) ITEM = ITEM LIST - FAMILY

C:\Data\62- Projects\24- Articles\25- Improve Warehouse Productivity\App