---
title: Manage Azure Data Lake Analytics using Azure PowerShell | Microsoft Docs
description: Learn how to manage Data Lake Analytics jobs, data sources, users.
services: data-lake-analytics
documentationcenter: ''
author: edmacauley
manager: jhubbard
editor: cgronlun

ms.assetid: ad14d53c-fed4-478d-ab4b-6d2e14ff2097
ms.service: data-lake-analytics
ms.devlang: na
ms.topic: article
ms.tgt_pltfrm: na
ms.workload: big-data
ms.date: 12/05/2016
ms.author: edmaca
---
# Manage Azure Data Lake Analytics using Azure PowerShell

[!INCLUDE manage-selector]

Learn how to manage Azure Data Lake Analytics accounts, data sources, users, and jobs using Azure PowerShell. To see management topics for other tools, click the tab selector above.
## Prerequisites
Before you begin this tutorial, you must have the following:

- An Azure subscription. See Get Azure free trial.
- Azure PowerShell. See the Prerequisite section of Using Azure PowerShell with Azure Resource Manager.
## Manage Data Lake Analytics accounts

### Create a Data Lake Analytics account

Before running any Data Lake Analytics jobs, you must have a Data Lake Analytics account. Unlike Azure HDInsight, you don't pay for an Analytics account when it is not running a job; you pay only for the time when it is running a job. For more information, see Azure Data Lake Analytics Overview.
```powershell
$resourceGroupName = "<ResourceGroupName>"
$dataLakeStoreName = "<DataLakeAccountName>"
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"
$location = "<Microsoft Data Center>"

Write-Host "Create a resource group ..." -ForegroundColor Green
New-AzureRmResourceGroup `
    -Name $resourceGroupName `
    -Location $location

Write-Host "Create a Data Lake account ..." -ForegroundColor Green
New-AzureRmDataLakeStoreAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $dataLakeStoreName `
    -Location $location

Write-Host "Create a Data Lake Analytics account ..." -ForegroundColor Green
New-AzureRmDataLakeAnalyticsAccount `
    -Name $dataLakeAnalyticsAccountName `
    -ResourceGroupName $resourceGroupName `
    -Location $location `
    -DefaultDataLake $dataLakeStoreName

Write-Host "The newly created Data Lake Analytics account ..." -ForegroundColor Green
Get-AzureRmDataLakeAnalyticsAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $dataLakeAnalyticsAccountName
```
You can also use an Azure Resource Manager template. A template for creating a Data Lake Analytics account and the dependent Data Lake Store account is in Appendix A. Save the template to a file with the .json extension, and then use the following PowerShell script to call it:
```powershell
$AzureSubscriptionID = "<Your Azure Subscription ID>"
$ResourceGroupName = "<New Azure Resource Group Name>"
$Location = "EAST US 2"
$DefaultDataLakeStoreAccountName = "<New Data Lake Store Account Name>"
$DataLakeAnalyticsAccountName = "<New Data Lake Analytics Account Name>"
$DeploymentName = "MyDataLakeAnalyticsDeployment"
$ARMTemplateFile = "E:\Tutorials\ADL\ARMTemplate\azuredeploy.json" # update the JSON template path

Login-AzureRmAccount
Select-AzureRmSubscription -SubscriptionId $AzureSubscriptionID

# Create the resource group
New-AzureRmResourceGroup -Name $ResourceGroupName -Location $Location

# Create the Data Lake Analytics account with the default Data Lake Store account.
$parameters = @{"adlAnalyticsName"=$DataLakeAnalyticsAccountName; "adlStoreName"=$DefaultDataLakeStoreAccountName}
New-AzureRmResourceGroupDeployment -Name $DeploymentName -ResourceGroupName $ResourceGroupName -TemplateFile $ARMTemplateFile -TemplateParameterObject $parameters
```
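To verify what the deployment produced, you can read back the outputs declared by the template (the template in Appendix A declares `adlAnalyticsAccount` and `adlStoreAccount` outputs). A minimal sketch, reusing the variables above:

```powershell
# Retrieve the completed deployment and show its template outputs.
$deployment = Get-AzureRmResourceGroupDeployment `
    -ResourceGroupName $ResourceGroupName `
    -Name $DeploymentName
$deployment.Outputs
```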
### List Data Lake Analytics accounts within the current subscription

```powershell
Get-AzureRmDataLakeAnalyticsAccount
```
The output:

```
Id         : /subscriptions/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/resourceGroups/learn1021rg/providers/Microsoft.DataLakeAnalytics/accounts/learn1021adla
Location   : eastus2
Name       : learn1021adla
Properties : Microsoft.Azure.Management.DataLake.Analytics.Models.DataLakeAnalyticsAccountProperties
Tags       : {}
Type       : Microsoft.DataLakeAnalytics/accounts
```
### List Data Lake Analytics accounts within a specific resource group

```powershell
Get-AzureRmDataLakeAnalyticsAccount -ResourceGroupName $resourceGroupName
```
### Get details of a specific Data Lake Analytics account

```powershell
Get-AzureRmDataLakeAnalyticsAccount -Name $adlAnalyticsAccountName
```
### Test existence of a specific Data Lake Analytics account

```powershell
Test-AzureRmDataLakeAnalyticsAccount -Name $adlAnalyticsAccountName
```

The cmdlet returns either True or False.
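You can use that Boolean to make an account-creation script idempotent. A minimal sketch, assuming the variables from the account-creation script earlier in this article are still in scope:

```powershell
# Create the Analytics account only if it does not already exist.
if (-not (Test-AzureRmDataLakeAnalyticsAccount -Name $dataLakeAnalyticsAccountName)) {
    New-AzureRmDataLakeAnalyticsAccount `
        -Name $dataLakeAnalyticsAccountName `
        -ResourceGroupName $resourceGroupName `
        -Location $location `
        -DefaultDataLake $dataLakeStoreName
}
```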
### Delete a Data Lake Analytics account

```powershell
$resourceGroupName = "<ResourceGroupName>"
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"

Remove-AzureRmDataLakeAnalyticsAccount -Name $dataLakeAnalyticsAccountName
```
Deleting a Data Lake Analytics account does not delete the dependent Data Lake Store account. The following example deletes the Data Lake Analytics account and its default Data Lake Store account:
```powershell
$resourceGroupName = "<ResourceGroupName>"
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"

$dataLakeStoreName = (Get-AzureRmDataLakeAnalyticsAccount -ResourceGroupName $resourceGroupName -Name $dataLakeAnalyticsAccountName).Properties.DefaultDataLakeAccount
Remove-AzureRmDataLakeAnalyticsAccount -ResourceGroupName $resourceGroupName -Name $dataLakeAnalyticsAccountName
Remove-AzureRmDataLakeStoreAccount -ResourceGroupName $resourceGroupName -Name $dataLakeStoreName
```
## Manage data sources

Data Lake Analytics currently supports the following data sources:

- Azure Data Lake Store
- Azure Storage

When you create an Analytics account, you must designate an Azure Data Lake Store account to be the default storage account. The default Data Lake Store account is used to store job metadata and job audit logs. After you have created an Analytics account, you can add additional Data Lake Store accounts and/or Azure Storage accounts.
### Find the default Data Lake Store account

```powershell
$resourceGroupName = "<ResourceGroupName>"
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"

$dataLakeStoreName = (Get-AzureRmDataLakeAnalyticsAccount -ResourceGroupName $resourceGroupName -Name $dataLakeAnalyticsAccountName).Properties.DefaultDataLakeAccount
```
### Add a data source

To add an Azure Storage account:

```powershell
$resourceGroupName = "<ResourceGroupName>"
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"
$AzureStorageAccountName = "<AzureStorageAccountName>"
$AzureStorageAccountKey = "<AzureStorageAccountKey>"

Add-AzureRmDataLakeAnalyticsDataSource -ResourceGroupName $resourceGroupName -Account $dataLakeAnalyticsAccountName -AzureBlob $AzureStorageAccountName -AccessKey $AzureStorageAccountKey
```
To add an additional Data Lake Store account:

```powershell
$resourceGroupName = "<ResourceGroupName>"
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"
$AzureDataLakeName = "<DataLakeStoreName>"

Add-AzureRmDataLakeAnalyticsDataSource -ResourceGroupName $resourceGroupName -Account $dataLakeAnalyticsAccountName -DataLake $AzureDataLakeName
```
### List data sources

```powershell
$resourceGroupName = "<ResourceGroupName>"
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"

# List the Data Lake Store accounts
(Get-AzureRmDataLakeAnalyticsAccount -ResourceGroupName $resourceGroupName -Name $dataLakeAnalyticsAccountName).Properties.DataLakeStoreAccounts

# List the Azure Storage accounts
(Get-AzureRmDataLakeAnalyticsAccount -ResourceGroupName $resourceGroupName -Name $dataLakeAnalyticsAccountName).Properties.StorageAccounts
```
## Manage jobs

You must have a Data Lake Analytics account before you can create a job. For more information, see Manage Data Lake Analytics accounts.
### List jobs

```powershell
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"

# List all jobs
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName

# List jobs by state
# States: Accepted, Compiling, Ended, New, Paused, Queued, Running, Scheduling, Starting
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName -State Running, Queued

# List jobs by result
# Results: Cancelled, Failed, None, Succeeded
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName -Result Cancelled

# List jobs by name
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName -Name <Job Name>

# List jobs by submitter
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName -Submitter <Job submitter>
```
```powershell
# List all jobs submitted on January 1 (local time)
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName `
    -SubmittedAfter "2015/01/01" `
    -SubmittedBefore "2015/01/02"

# List all jobs that succeeded on January 1 after 2 PM (UTC)
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName `
    -State Ended `
    -Result Succeeded `
    -SubmittedAfter "2015/01/01 2:00 PM -0" `
    -SubmittedBefore "2015/01/02 -0"

# List all jobs submitted in the past hour
Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName `
    -SubmittedAfter (Get-Date).AddHours(-1)
```
### Get job details

```powershell
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"

Get-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName -JobId <Job ID>
```
### Submit jobs

```powershell
$dataLakeAnalyticsAccountName = "<DataLakeAnalyticsAccountName>"

# Pass the script via a file path
Submit-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName `
    -Name $jobName `
    -ScriptPath $scriptPath

# Pass the script contents directly
Submit-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName `
    -Name $jobName `
    -Script $scriptContents
```
> [!NOTE]
> The default priority of a job is 1000, and the default degree of parallelism for a job is 1.
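If the defaults don't fit your workload, both values can be overridden at submission time via the cmdlet's `-Priority` and `-DegreeOfParallelism` parameters. A sketch reusing the variables above:

```powershell
# Submit with an explicit priority (a lower number means higher priority)
# and a higher degree of parallelism.
Submit-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName `
    -Name $jobName `
    -ScriptPath $scriptPath `
    -Priority 100 `
    -DegreeOfParallelism 10
```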
### Cancel a job

```powershell
Stop-AzureRmDataLakeAnalyticsJob -Account $dataLakeAnalyticsAccountName `
    -JobId $jobID
```
## Manage catalog items

The U-SQL catalog is used to structure data and code so that they can be shared by U-SQL scripts. The catalog enables the highest possible performance with data in Azure Data Lake. For more information, see Use U-SQL catalog.
### List and get catalog items

```powershell
# List databases
Get-AzureRmDataLakeAnalyticsCatalogItem `
    -Account $adlAnalyticsAccountName `
    -ItemType Database

# List tables
Get-AzureRmDataLakeAnalyticsCatalogItem `
    -Account $adlAnalyticsAccountName `
    -ItemType Table `
    -Path "master.dbo"

# Get a database
Get-AzureRmDataLakeAnalyticsCatalogItem `
    -Account $adlAnalyticsAccountName `
    -ItemType Database `
    -Path "master"

# Get a table
Get-AzureRmDataLakeAnalyticsCatalogItem `
    -Account $adlAnalyticsAccountName `
    -ItemType Table `
    -Path "master.dbo.mytable"
```
### Test existence of a catalog item

```powershell
Test-AzureRmDataLakeAnalyticsCatalogItem `
    -Account $adlAnalyticsAccountName `
    -ItemType Database `
    -Path "master"
```
### Manage catalog secrets

```powershell
# Create a secret
New-AzureRmDataLakeAnalyticsCatalogSecret `
    -Account $adlAnalyticsAccountName `
    -DatabaseName "master" `
    -Secret (Get-Credential -UserName "username" -Message "Enter the password")

# Modify a secret
Set-AzureRmDataLakeAnalyticsCatalogSecret `
    -Account $adlAnalyticsAccountName `
    -DatabaseName "master" `
    -Secret (Get-Credential -UserName "username" -Message "Enter the password")

# Delete a secret
Remove-AzureRmDataLakeAnalyticsCatalogSecret `
    -Account $adlAnalyticsAccountName `
    -DatabaseName "master"
```
## Use Azure Resource Groups

Applications are typically made up of many components, for example a web app, database, database server, storage, and third-party services. Azure Resource Manager (ARM) enables you to work with the resources in your application as a group, referred to as an Azure Resource Group. You can deploy, update, monitor, or delete all of the resources for your application in a single, coordinated operation. You use a template for deployment, and that template can work for different environments such as testing, staging, and production. You can clarify billing for your organization by viewing the rolled-up costs for the entire group. For more information, see Azure Resource Manager Overview.
A Data Lake Analytics service can include the following components:
- Azure Data Lake Analytics account
- Required default Azure Data Lake Storage account
- Additional Azure Data Lake Storage accounts
- Additional Azure Storage accounts
You can create all these components under one Azure Resource Group to make them easier to manage. A Data Lake Analytics account and its dependent storage accounts must be placed in the same Azure data center; the resource group, however, can be located in a different data center.
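Because the components live in one resource group, you can also tear them all down in a single call. A minimal sketch, assuming `$resourceGroupName` holds the group created earlier:

```powershell
# Delete the resource group and every resource in it
# (the Analytics account, the Store account, and any added storage accounts).
Remove-AzureRmResourceGroup -Name $resourceGroupName -Force
```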
## See also

- Overview of Microsoft Azure Data Lake Analytics
- Get started with Data Lake Analytics using Azure Portal
- Manage Azure Data Lake Analytics using Azure Portal
- Monitor and troubleshoot Azure Data Lake Analytics jobs using Azure Portal
## Appendix A

The following Azure Resource Manager template can be used to deploy a Data Lake Analytics account and its dependent Data Lake Store account. Save it as a .json file, and then use the PowerShell script shown earlier to call the template. For more information, see Deploy an application with Azure Resource Manager template and Authoring Azure Resource Manager templates.
```json
{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "adlAnalyticsName": {
      "type": "string",
      "metadata": {
        "description": "The name of the Data Lake Analytics account to create."
      }
    },
    "adlStoreName": {
      "type": "string",
      "metadata": {
        "description": "The name of the Data Lake Store account to create."
      }
    }
  },
  "resources": [
    {
      "name": "[parameters('adlStoreName')]",
      "type": "Microsoft.DataLakeStore/accounts",
      "location": "East US 2",
      "apiVersion": "2015-10-01-preview",
      "dependsOn": [ ],
      "tags": { }
    },
    {
      "name": "[parameters('adlAnalyticsName')]",
      "type": "Microsoft.DataLakeAnalytics/accounts",
      "location": "East US 2",
      "apiVersion": "2015-10-01-preview",
      "dependsOn": [ "[concat('Microsoft.DataLakeStore/accounts/',parameters('adlStoreName'))]" ],
      "tags": { },
      "properties": {
        "defaultDataLakeStoreAccount": "[parameters('adlStoreName')]",
        "dataLakeStoreAccounts": [
          { "name": "[parameters('adlStoreName')]" }
        ]
      }
    }
  ],
  "outputs": {
    "adlAnalyticsAccount": {
      "type": "object",
      "value": "[reference(resourceId('Microsoft.DataLakeAnalytics/accounts',parameters('adlAnalyticsName')))]"
    },
    "adlStoreAccount": {
      "type": "object",
      "value": "[reference(resourceId('Microsoft.DataLakeStore/accounts',parameters('adlStoreName')))]"
    }
  }
}
```