- An Azure account with an active subscription.
- A SQL Virtual Machine (see module 00).
- An Azure Purview account (see module 01).
To populate Azure Purview with assets from your on-premises data sources, we must use a self-hosted integration runtime agent, which acts as the gateway for data discovery and exploration. In this module, we will walk through how to install a self-hosted integration runtime, register the on-premises SQL Server, and scan the data source.
- Connect to an on-premises data source using a self-hosted integration runtime.
- Connect to SQL Virtual Machine
- Install Self-Hosted Integration Runtime
- Authenticate to Azure Purview
To install the self-hosted integration runtime, we must first log in to our SQL virtual machine. For this example, we'll use an RDP connection to complete this step. If you would prefer to connect using Bastion, follow the instructions here to set it up.
📖 Note: Once the environment setup is complete, your VM should already be in the 'Running' state. If it is not, you will need to start your VM.
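If you prefer to check the VM state from a script rather than the portal, the sketch below shows one way to read it. It assumes you already have the VM's instance view as a dictionary (for example, parsed from JSON returned by the Azure CLI or REST API), where the power state appears as a status code such as PowerState/running. The function name and the sample data are made up for illustration.

```python
from typing import Optional


def power_state(instance_view: dict) -> Optional[str]:
    """Return the power state (e.g. 'running') from a VM instance view, or None."""
    # Assumption: statuses is a list of objects with a "code" field,
    # such as "ProvisioningState/succeeded" or "PowerState/running".
    for status in instance_view.get("statuses", []):
        code = status.get("code", "")
        if code.startswith("PowerState/"):
            return code.split("/", 1)[1]
    return None


# Hypothetical sample data, not taken from a real deployment.
sample_view = {
    "statuses": [
        {"code": "ProvisioningState/succeeded"},
        {"code": "PowerState/deallocated"},
    ]
}

if power_state(sample_view) != "running":
    print("VM is not running; start it before connecting.")
```

A result other than 'running' means the VM must be started before you can connect over RDP.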
-
Navigate to your Virtual Machine resource in the Azure portal. In the Overview section (left blade), click 'Connect' and select 'RDP' from the drop-down menu.
-
On the next page, click 'Download RDP File'. Once the file has downloaded, click 'Open'.
-
You will need the SQL username and password that were generated when you deployed the lab environment in module 00. To find these details, navigate to the resource group in the Azure portal. Under 'Settings > Deployments', click 'SQLVMDeployment'.
-
Navigate to the 'Outputs' blade within the SQLVMDeployment area to find your SQL Admin username and password.
-
In the Remote Desktop Connection pop-up window, click 'Connect'.
-
Log in to the virtual machine using the credentials shown in the 'Outputs' blade of the deployment in the resource group you created in module 00. You may need to select 'More Choices' and then 'Use a different account' in the login window.
📖 Note: Log in using the format username = <VM name>\<SQL admin username> and password = <SQL admin password>.
-
You'll see a warning message; click Yes to continue.
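The login format used in the steps above can be sketched in Python. This is a minimal illustration, assuming the deployment outputs follow the usual ARM shape in which each output is an object with a "value" field. The output names vmName and adminUsername, the helper names, and the sample values are hypothetical, so check the 'Outputs' blade for the real ones.

```python
def output_values(outputs: dict) -> dict:
    """Flatten ARM-style deployment outputs into a simple name -> value mapping."""
    return {name: entry["value"] for name, entry in outputs.items()}


def rdp_username(vm_name: str, admin_username: str) -> str:
    """Build the RDP login name in the form <VM name>\\<SQL admin username>."""
    return f"{vm_name}\\{admin_username}"


# Hypothetical outputs; the real names and values come from 'SQLVMDeployment'.
sample_outputs = {
    "vmName": {"type": "String", "value": "sqlvm01"},
    "adminUsername": {"type": "String", "value": "sqladmin"},
}

values = output_values(sample_outputs)
login = rdp_username(values["vmName"], values["adminUsername"])
print(login)  # prints sqlvm01\sqladmin
```

Prefixing the username with the VM name tells Windows to authenticate against the local account on the VM rather than a domain or Microsoft account.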
💡 Did you know?
Integration Runtime (IR) is the secure compute infrastructure that provides data integration capabilities across different network environments and ensures that these activities run in the region closest to the data store.
Self-hosted Integration Runtime (SHIR) is an implementation of IR that is installed on an on-premises machine or virtual machine within a virtual network.
-
In the virtual machine, open the browser and navigate to the integration runtime download page. If the download doesn't start automatically, download the latest version of the integration runtime from the list presented. Click 'Run' when the download begins.
-
Follow the on-screen instructions to complete the installation process, then click Finish to proceed to the next step.
-
If the integration runtime manager doesn't open automatically, navigate to the Start Menu and click 'Microsoft Integration Runtime'. Once the IR Manager window opens, we can move on to the next step to authenticate to Azure Purview.
💡 Did you know?
The Purview Integration Runtime cannot share a machine with an Azure Synapse Analytics or Azure Data Factory Integration Runtime. It needs to be installed on a separate machine.
-
Within the Azure Purview Studio, navigate to the Data Map in the left blade, click Integration Runtime and click + New.
-
Ensure the Self-Hosted option is selected, then click Continue.
-
Give your integration runtime a name (mandatory) and a description (optional), then click Create.
-
Copy one of the keys to your clipboard, then open your virtual machine window and paste the key into the integration runtime manager window. Click Register when the button becomes active, then click Finish on the next screen.
-
Once successfully registered, you should see a green tick ✔️ within the integration runtime manager window and the Azure Purview Studio integration runtime manager area.
💡 Did you know?
The Purview Integration Runtime can also be used to scan and ingest metadata assets from Azure cloud services that are hidden behind private endpoints, such as Azure Data Lake, Azure SQL Database, Azure Cosmos DB and more.
-
What is a Self-Hosted Integration Runtime used for?
A) It's used for copying data from or to an on-premises data store or networks with access control
B) It's used for copying data between cloud based data stores or networks with public endpoints
C) It's used for copying data between managed environments
-
Self-Hosted Integration Runtime can be shared across multiple services when installed on one machine/VM.
A) True
B) False
-
Which Azure services can the self-hosted integration runtime scan and ingest metadata assets from?
A) Azure Blob Storage
B) Azure SQL Database
C) Azure Synapse Analytics
D) All of these
E) None of these
In this module, you learned how to install the self-hosted integration runtime on your virtual machine and connect it to Azure Purview. If you'd like to continue with further tasks, feel free to complete the tutorials linked below:
- Setting up authentication for a scan
- Register SQL Server on VM as a data source in Purview
- Upload sample data to the SQL Server on the VM
- Trigger a scan of the on-premises data source