title | description | services | documentationcenter | author | manager | editor | ms.assetid | ms.service | ms.devlang | ms.topic | ms.tgt_pltfrm | ms.workload | ms.date | ms.author |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Develop U-SQL user-defined operators for Azure Data Lake Analytics jobs \| Microsoft Docs | Learn how to develop user-defined operators to be used and reused in Data Lake Analytics jobs. | data-lake-analytics | | edmacauley | jhubbard | cgronlun | e5189e4e-9438-46d1-8686-ed4836bf3356 | data-lake-analytics | na | article | na | big-data | 12/05/2016 | edmaca |
Learn how to develop user-defined operators to be used and reused in Data Lake Analytics jobs. You will develop a custom operator to convert country names.
For instructions on developing general-purpose assemblies for U-SQL, see Develop U-SQL assemblies for Azure Data Lake Analytics jobs.
- Visual Studio 2015, Visual Studio 2013 Update 4, or Visual Studio 2012 with Visual C++ installed.
- Microsoft Azure SDK for .NET version 2.5 or later. Install it by using the Web Platform Installer.
- A Data Lake Analytics account. See Get Started with Azure Data Lake Analytics using Azure portal.
- Go through the Get started with Azure Data Lake Analytics U-SQL Studio tutorial.
- Connect to Azure.
- Upload the source data. See Get started with Azure Data Lake Analytics U-SQL Studio.
To create and submit a U-SQL job
- From the File menu, click New, and then click Project.
- Select the U-SQL Project type.
- Click OK. Visual Studio creates a solution with a Script.usql file.
- From Solution Explorer, expand Script.usql, and then double-click Script.usql.cs.
- Paste the following code into the file:

      using Microsoft.Analytics.Interfaces;
      using System.Collections.Generic;

      namespace USQL_UDO
      {
          public class CountryName : IProcessor
          {
              // Maps local or abbreviated country names to their English names.
              private static IDictionary<string, string> CountryTranslation = new Dictionary<string, string>
              {
                  { "Deutschland", "Germany" },
                  { "Schwiiz", "Switzerland" },
                  { "UK", "United Kingdom" },
                  { "USA", "United States of America" },
                  { "中国", "PR China" }
              };

              public override IRow Process(IRow input, IUpdatableRow output)
              {
                  // Read every column of the input row by name.
                  string UserID = input.Get<string>("UserID");
                  string Name = input.Get<string>("Name");
                  string Address = input.Get<string>("Address");
                  string City = input.Get<string>("City");
                  string State = input.Get<string>("State");
                  string PostalCode = input.Get<string>("PostalCode");
                  string Country = input.Get<string>("Country");
                  string Phone = input.Get<string>("Phone");

                  // Replace the country name if a translation exists; otherwise keep it.
                  if (CountryTranslation.ContainsKey(Country))
                  {
                      Country = CountryTranslation[Country];
                  }

                  // Write the columns in the order declared in the PRODUCE clause.
                  output.Set<string>(0, UserID);
                  output.Set<string>(1, Name);
                  output.Set<string>(2, Address);
                  output.Set<string>(3, City);
                  output.Set<string>(4, State);
                  output.Set<string>(5, PostalCode);
                  output.Set<string>(6, Country);
                  output.Set<string>(7, Phone);

                  return output.AsReadOnly();
              }
          }
      }
- Open Script.usql, and paste the following U-SQL script:

      @drivers =
          EXTRACT UserID string,
                  Name string,
                  Address string,
                  City string,
                  State string,
                  PostalCode string,
                  Country string,
                  Phone string
          FROM "/Samples/Data/AmbulanceData/Drivers.txt"
          USING Extractors.Tsv(Encoding.Unicode);

      @drivers_CountryName =
          PROCESS @drivers
          PRODUCE UserID string,
                  Name string,
                  Address string,
                  City string,
                  State string,
                  PostalCode string,
                  Country string,
                  Phone string
          USING new USQL_UDO.CountryName();

      OUTPUT @drivers_CountryName
      TO "/Samples/Outputs/Drivers.csv"
      USING Outputters.Csv(Encoding.Unicode);
- From Solution Explorer, right-click Script.usql, and then click Build Script.
- From Solution Explorer, right-click Script.usql, and then click Submit Script.
- If you haven't connected to your Azure subscription, you are prompted to enter your Azure account credentials.
- Click Submit. Submission results and a job link are available in the Results window when the submission is completed.
- You must click the Refresh button to see the latest job status and refresh the screen.
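
The translation step inside `CountryName.Process` is an ordinary dictionary lookup, so it can be tried out without submitting a job. The following is a minimal sketch in plain C#, not part of the tutorial project: the `USQL_UDO_Sketch` namespace, the `CountryTranslator` class, and its `Translate` method are hypothetical names introduced here, and the dictionary simply repeats the entries from the code-behind file. It assumes that unmapped (or null) country names should pass through unchanged, and it uses `TryGetValue` so that each name needs only one lookup.

```csharp
using System.Collections.Generic;

namespace USQL_UDO_Sketch
{
    // Hypothetical helper for local experiments; not part of the U-SQL SDK.
    public static class CountryTranslator
    {
        // Same entries as the CountryTranslation field in the code-behind file.
        private static readonly IDictionary<string, string> CountryTranslation =
            new Dictionary<string, string>
            {
                { "Deutschland", "Germany" },
                { "Schwiiz", "Switzerland" },
                { "UK", "United Kingdom" },
                { "USA", "United States of America" },
                { "中国", "PR China" }
            };

        // Returns the English name if one is mapped; otherwise returns the input as-is.
        public static string Translate(string country)
        {
            if (country == null)
            {
                return null; // Dictionary keys cannot be null, so skip the lookup.
            }

            string translated;
            return CountryTranslation.TryGetValue(country, out translated)
                ? translated
                : country;
        }
    }
}
```

For example, `CountryTranslator.Translate("Schwiiz")` returns `"Switzerland"`, while `CountryTranslator.Translate("France")` returns `"France"` because the dictionary has no entry for it.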
To see the job output
- From Server Explorer, expand Azure, expand Data Lake Analytics, expand your Data Lake Analytics account, expand Storage Accounts, right-click the Default Storage, and then click Explorer.
- Expand Samples, expand Outputs, and then double-click Drivers.csv.
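
To get a rough idea of what a row in Drivers.csv should look like before opening the file, the per-row effect of the script can be approximated locally. The sketch below is plain C# and rests on several assumptions: `DriverRowPreview` and `ToCsvLine` are hypothetical names, the input is a single tab-separated line holding the eight columns from the EXTRACT statement, and every output field is wrapped in double quotes. The exact quoting and escaping applied by `Outputters.Csv` may differ, so treat the result as an approximation rather than a byte-for-byte prediction.

```csharp
using System;
using System.Collections.Generic;

namespace USQL_UDO_Sketch
{
    // Hypothetical local preview of one row; not part of the U-SQL SDK.
    public static class DriverRowPreview
    {
        private const int ColumnCount = 8;   // UserID ... Phone
        private const int CountryColumn = 6; // zero-based position of Country

        // Same entries as the CountryTranslation field in the code-behind file.
        private static readonly IDictionary<string, string> CountryTranslation =
            new Dictionary<string, string>
            {
                { "Deutschland", "Germany" },
                { "Schwiiz", "Switzerland" },
                { "UK", "United Kingdom" },
                { "USA", "United States of America" },
                { "中国", "PR China" }
            };

        // Converts one tab-separated input line to a comma-separated output line,
        // translating the Country column when a mapping exists.
        public static string ToCsvLine(string tsvLine)
        {
            string[] fields = tsvLine.Split('\t');
            if (fields.Length != ColumnCount)
            {
                throw new ArgumentException("Expected 8 tab-separated columns.", "tsvLine");
            }

            string translated;
            if (CountryTranslation.TryGetValue(fields[CountryColumn], out translated))
            {
                fields[CountryColumn] = translated;
            }

            // Assumption: quote every field and double any embedded quotes.
            for (int i = 0; i < fields.Length; i++)
            {
                fields[i] = "\"" + fields[i].Replace("\"", "\"\"") + "\"";
            }

            return string.Join(",", fields);
        }
    }
}
```

Passing a line whose seventh column is `Deutschland` yields a line whose seventh field is `"Germany"`; all other fields are copied through unchanged apart from the added quotes.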