Introduction

Apache Spark is an open-source parallel processing framework for large-scale data processing and analytics. Spark has become popular in "big data" processing scenarios and is available in multiple platform implementations, including Azure HDInsight, Azure Synapse Analytics, and Microsoft Fabric.

This lab introduces basic Spark concepts, including loading data into a DataFrame, querying data, and applying simple transformations.

Contents