Introduction

Apache Spark is an open-source parallel processing framework for large-scale data processing and analytics. Spark has become popular in "big data" processing scenarios and is available in multiple platform implementations, including Azure HDInsight, Azure Synapse Analytics, and Microsoft Fabric.

This lab introduces basic Spark concepts, including loading data into a DataFrame, querying data, and applying simple transformations.

Contents