nguyentrieuanduong/data-loader

Load data with Spark

Installation

  1. Create a Python virtual environment

    python3 -m venv venv
  2. Activate the virtual environment

    source venv/bin/activate
  3. Upgrade pip, setuptools and wheel

    pip install --upgrade pip setuptools wheel
  4. Install the dependencies listed in requirements.txt

    python3 -m pip install -r requirements.txt

    Alternatively, install only PySpark for local development

    python3 -m pip install pyspark
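After the installation steps, a quick check can confirm that the virtual environment is active and that PySpark is importable. The script below is a minimal sketch using only the Python standard library; the helper names `in_virtualenv` and `is_installed` are hypothetical and not part of this repository.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


def is_installed(package: str) -> bool:
    """Return True when `package` can be found by this interpreter."""
    # find_spec locates a top-level package without importing it.
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print(f"virtual environment active: {in_virtualenv()}")
    print(f"pyspark importable: {is_installed('pyspark')}")
```

Run it with the same `python3` used for the install commands. If `pyspark importable` prints `False`, the package was probably installed into a different interpreter than the one running the script.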
Spark documentation

  1. Programming Guides

    1. SQL, DataFrames and Datasets

    2. Structured Streaming

    3. PySpark (Python on Spark)

  2. API Docs

    1. SQL, Built-in Functions
  3. Deploying

    1. Cluster Mode Overview

    2. Submitting Applications

    3. YARN

    4. Kubernetes

  4. Others

    1. Configuration

    2. Monitoring

    3. Tuning

    4. Job Scheduling

    5. Security

    6. Migration Guide: SQL, Datasets and DataFrame

    7. Spark Connect

    8. Integration with Cloud Infrastructures

    9. Code Examples
