Merge pull request MicrosoftLearning#129 from afelix-95/main
Fixed typos in labs 02, 04, and 10
afelix-95 authored Apr 26, 2024
2 parents 5203517 + 95de2b8 commit 83dfe2f
Showing 3 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions Instructions/Labs/02-analyze-spark.md
@@ -270,11 +270,11 @@ A common task for data engineers is to ingest data in a particular format or str
> **Note**: Commonly, *Parquet* format is preferred for data files that you will use for further analysis or ingestion into an analytical store. Parquet is a very efficient format that is supported by most large scale data analytics systems. In fact, sometimes your data transformation requirement may simply be to convert data from another format (such as CSV) to Parquet!
- 2. Run the cell and wait for the message that the data has been saved. Then, in the **Lakehouses** pane on the left, in the **...** menu for the **Files** node, select **Refresh**; and select the **transformed_orders** folder to verify that it contains a new folder named **orders**, which in turn contains one or more Parquet files.
+ 2. Run the cell and wait for the message that the data has been saved. Then, in the **Lakehouses** pane on the left, in the **...** menu for the **Files** node, select **Refresh**; and select the **transformed_data** folder to verify that it contains a new folder named **orders**, which in turn contains one or more Parquet files.
![Screenshot of a folder containing parquet files.](./Images/saved-parquet.png)
- 3. Add a new cell with the following code to load a new dataframe from the parquet files in the **transformed_orders/orders** folder:
+ 3. Add a new cell with the following code to load a new dataframe from the parquet files in the **transformed_data/orders** folder:
```Python
orders_df = spark.read.format("parquet").load("Files/transformed_data/orders")
```
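If it helps to see the full round trip, a minimal sketch is shown below. It assumes a dataframe named `transformed_df` produced by the lab's earlier transformation cells (that variable name is illustrative, not taken from the lab):

```Python
# Write the transformed data out as Parquet (overwriting any previous run),
# then load it back into a fresh dataframe to confirm the round trip.
transformed_df.write.mode("overwrite").format("parquet").save("Files/transformed_data/orders")

orders_df = spark.read.format("parquet").load("Files/transformed_data/orders")
display(orders_df)
```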
2 changes: 1 addition & 1 deletion Instructions/Labs/04-ingest-pipeline.md
@@ -39,7 +39,7 @@ Now that you have a workspace, it's time to create a data lakehouse into which y

A simple way to ingest data is to use a **Copy Data** activity in a pipeline to extract the data from a source and copy it to a file in the lakehouse.

- 1. On the **Home** page for your lakehouse, select **New data pipeline**, and create a new data pipeline named **Ingest Sales Data**.
+ 1. On the **Home** page for your lakehouse, select **Get data** and then select **New data pipeline**, and create a new data pipeline named **Ingest Sales Data**.
2. If the **Copy Data** wizard doesn't open automatically, select **Copy Data** in the pipeline editor page.
3. In the **Copy Data** wizard, on the **Choose a data source** page, in the **data sources** section, select the **Generic protocol** tab and then select **HTTP**.

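For context, the sketch below shows roughly what the **Copy Data** activity does: fetch a file from an HTTP source and land it in the lakehouse **Files** area. It is not part of the lab's pipeline; the URL and destination path are placeholders, and the `/lakehouse/default/Files/...` path assumes a default lakehouse is attached to a notebook:

```Python
import os
import requests

# Hypothetical source and destination; the lab's Copy Data wizard supplies the real values.
source_url = "https://example.com/data/sales.csv"
destination = "/lakehouse/default/Files/new_data/sales.csv"

# Fetch the file over HTTP and write it into the lakehouse Files area.
response = requests.get(source_url, timeout=60)
response.raise_for_status()

os.makedirs(os.path.dirname(destination), exist_ok=True)
with open(destination, "wb") as f:
    f.write(response.content)
```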
2 changes: 1 addition & 1 deletion Instructions/Labs/10-ingest-notebooks.md
@@ -39,7 +39,7 @@ Start by creating a new lakehouse, and a destination folder in the lakehouse.

1. From **Files**, select the **[...]** to create **New subfolder** named **RawData**.

- 1. From the Lakehouse Explorer within the lakehouse, select **Files > ... > Properties**.
+ 1. From the Lakehouse Explorer within the lakehouse, select **RawData > ... > Properties**.

1. Copy the **ABFS path** for the **RawData** folder to an empty notepad for later use, which should look something like:
`abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/{lakehouse_name}.Lakehouse/Files/{folder_name}/{file_name}`
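As a hedged sketch of how that ABFS path is typically used later in a Spark notebook, the snippet below reads files from the **RawData** folder into a dataframe. The URI is a placeholder matching the copied-path pattern, and the CSV format and header option are assumptions about the sample data:

```Python
# Placeholder: replace with the actual ABFS path copied from the Properties pane.
raw_data_path = "abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/{lakehouse_name}.Lakehouse/Files/RawData"

# Read any CSV files that have been ingested into the RawData folder.
df = spark.read.format("csv").option("header", "true").load(raw_data_path + "/*.csv")
display(df)
```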
