
Snowpark ML Demo

Uses Snowpark Python Stored Procedures (SPROCs), User-Defined Functions (UDFs), and DataFrame operations for end-to-end machine learning in Snowflake.

Introduction

In this notebook we fit/train a scikit-learn ML pipeline that includes common feature-engineering steps such as imputation, scaling, and one-hot encoding. The pipeline ends with a RandomForestRegressor model that predicts median house values in California.

We will fit/train the pipeline using a Snowpark Python Stored Procedure (SPROC) and then save the pipeline to a Snowflake stage. The example concludes by showing how a saved model/pipeline can be loaded and run in a scalable fashion on a Snowflake warehouse using Snowpark Python User-Defined Functions (UDFs).
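
For orientation, here is a minimal sketch of the kind of pipeline the notebooks build; the column names and hyperparameters below are illustrative, not the notebooks' exact code:

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature lists -- the notebooks define their own.
numeric_features = ["MEDINC", "HOUSEAGE", "AVEROOMS"]
categorical_features = ["OCEAN_PROXIMITY"]

preprocessor = ColumnTransformer(transformers=[
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

model_pipeline = Pipeline([
    ("preprocess", preprocessor),
    ("model", RandomForestRegressor(n_estimators=100, random_state=42)),
])
# model_pipeline.fit(X_train, y_train) runs inside the Snowpark stored procedure.
```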

Snowpark ML

1. Create the Conda Snowpark Environment

1.1 Clone the repository and switch into the directory
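
For example:

```sh
git clone https://github.com/sfc-gh-jgriffith/snowpark-end-to-end-ML-with-hyperparameter-tuning.git
cd snowpark-end-to-end-ML-with-hyperparameter-tuning
```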

1.2 Open environment.yml and paste in the following config:

name: snowpark_ml_test
channels:
  - snowflake
dependencies:
  - python=3.8
  - snowflake-snowpark-python
  - ipykernel
  - pyarrow
  - numpy
  - scikit-learn
  - pandas
  - joblib
  - cachetools

1.3 Create the conda environment

conda env create -f environment.yml

1.4 Activate the conda environment

conda activate snowpark_ml_test

1.5 Download and install VS Code

1.6 Create a creds.json file with your Snowflake account and credentials in the following format:

{
   "account": "yoursnowflake-account",
   "user": "snowflake_user",
   "role": "snowflake_role",
   "password": "snowflake_role",
   "database": "snowpark_demo_db",
   "schema": "MEMBERSHIP_MODELING_DEMO",
   "warehouse": "snowpark_demo_wh"
}

Use your own database, schema, warehouse, and role or create the ones used above in your own environment.
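
The notebooks read this file to build a Snowpark session. A minimal sketch of that step (the exact helper code lives in the notebooks themselves):

```python
import json
from snowflake.snowpark import Session

# Load the connection parameters created in step 1.6.
with open("creds.json") as f:
    connection_parameters = json.load(f)

session = Session.builder.configs(connection_parameters).create()
print(session.sql("SELECT CURRENT_WAREHOUSE(), CURRENT_DATABASE()").collect())
```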

1.7 Configure the conda environment in VS Code

In the terminal, run this command and note the path of the newly created conda environment:

`conda env list`

Open the notebook named 01_prepare_environment.ipynb in VS Code and, in the top-right corner, click Select Kernel.


Paste in the path to the conda environment you noted earlier.

2. Create the Sample Dataset in Snowflake

Run the rest of the cells in 01_prepare_environment.ipynb to generate a modeling dataset.
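
For reference, a hedged sketch of how such a dataset can be written to Snowflake with Snowpark; the notebook's own code may differ, and the table name here is illustrative:

```python
from sklearn.datasets import fetch_california_housing

# Features plus the MedHouseVal target as a single pandas DataFrame.
df = fetch_california_housing(as_frame=True).frame

# Upper-case the column names so Snowflake stores them as unquoted identifiers.
df.columns = [c.upper() for c in df.columns]

# Write the DataFrame to a table, creating (or replacing) it automatically.
session.write_pandas(
    df,
    table_name="CALIFORNIA_HOUSING",
    auto_create_table=True,
    overwrite=True,
)
```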

3. Feature Engineering, ML Training and Model Deployment in Snowflake

Open and run 02_snowpark_end_to_end_ml.ipynb notebook.
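
The core pattern in this notebook is training inside a stored procedure that runs on a Snowflake warehouse and saving the fitted model to a stage. A minimal sketch of that pattern, assuming the session from step 1.6 and illustrative table, stage, and column names (the notebook fits the full preprocessing pipeline and tunes hyperparameters; a bare RandomForestRegressor keeps the sketch short):

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import sproc

# Packages the stored procedure needs on the server side.
session.add_packages("snowflake-snowpark-python", "scikit-learn", "pandas", "joblib")

@sproc(name="train_house_value_model", replace=True, session=session)
def train_house_value_model(session: Session, training_table: str, model_stage: str) -> str:
    import joblib
    from sklearn.ensemble import RandomForestRegressor

    df = session.table(training_table).to_pandas()
    X, y = df.drop(columns=["MEDHOUSEVAL"]), df["MEDHOUSEVAL"]

    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X, y)

    # Serialize inside the SPROC sandbox, then upload the file to the stage.
    local_path = "/tmp/model.joblib"
    joblib.dump(model, local_path)
    session.file.put(local_path, model_stage, overwrite=True, auto_compress=False)
    return "model saved to " + model_stage

# Invoke the stored procedure on the warehouse.
print(train_house_value_model("CALIFORNIA_HOUSING", "@MODEL_STAGE"))
```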

4. Using scikit-learn Pipelines to Simplify our Scoring Pipeline

Perform all feature encoding and scaling using a scikit-learn pipeline. This does not make use of the distributed processing we get from UDFs or the preprocessing module, but it simplifies our scoring pipeline.

Open and run 03_deploy_sklearn_pipeline_as_udf.ipynb notebook.
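
A hedged sketch of the scoring side, assuming a model file saved to a stage as in the previous step: a vectorized (pandas) UDF ships the staged model as an import, loads it once per process with cachetools (which is why cachetools is in the environment), and scores batches of rows on the warehouse. The names and the eight-feature signature are illustrative:

```python
import cachetools
import pandas as pd
from snowflake.snowpark.functions import pandas_udf
from snowflake.snowpark.types import FloatType, PandasDataFrameType, PandasSeriesType

# Ship the staged model file alongside the UDF and declare server-side packages.
session.add_import("@MODEL_STAGE/model.joblib")
session.add_packages("scikit-learn", "pandas", "joblib", "cachetools")

@cachetools.cached(cache={})
def load_model(filename: str):
    # Imports land in a per-query directory exposed through this option.
    import sys
    import joblib
    import_dir = sys._xoptions.get("snowflake_import_directory")
    return joblib.load(import_dir + filename)

@pandas_udf(
    name="predict_house_value",
    replace=True,
    session=session,
    input_types=[PandasDataFrameType([FloatType()] * 8)],  # 8 numeric feature columns
    return_type=PandasSeriesType(FloatType()),
)
def predict_house_value(features: pd.DataFrame) -> pd.Series:
    model = load_model("model.joblib")
    return pd.Series(model.predict(features))
```

The registered UDF can then be called from SQL or from a Snowpark DataFrame over the feature columns, letting the warehouse scale out the scoring.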
