Commit 72e3491

Bringing even with master.
2 parents 7fb7fcc + b5dbd86 commit 72e3491

7 files changed

+368
-348
lines changed

docs/advanced-analytics/r/how-to-do-realtime-scoring.md

Lines changed: 58 additions & 199 deletions
Large diffs are not rendered by default.

docs/advanced-analytics/real-time-scoring.md

Lines changed: 120 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
title: Real-time scoring in SQL Server machine learning | Microsoft Docs
3-
description: Generate predictions using sp_rxPredict, scoring dta inputs against a pre-trained model written in R on SQL Server.
3+
description: Generate predictions using sp_rxPredict, scoring data inputs against a pre-trained model written in R on SQL Server.
44
ms.prod: sql
55
ms.technology: machine-learning
66

@@ -14,10 +14,7 @@ manager: cgronlun
1414
# Real-time scoring with sp_rxPredict in SQL Server machine learning
1515
[!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md-winonly](../includes/appliesto-ss-xxxx-xxxx-xxx-md-winonly.md)]
1616

17-
This article explains how scoring in near-real-time works for SQL Server relational data, using machine learning models written in R.
18-
19-
> [!Note]
20-
> Native scoring is a special implementation of real-time scoring that uses the native T-SQL PREDICT function for very fast scoring. For more information and availability, see [Native scoring](sql-native-scoring.md).
17+
Real-time scoring uses the CLR extension capabilities in SQL Server for high-performance predictions or scores in forecasting workloads. Because real-time scoring is language-agnostic, it executes with no dependencies on the R or Python runtimes. Given a model that was created with Microsoft functions, trained, and serialized to a binary format in SQL Server, you can use real-time scoring to generate predicted outcomes for new data inputs, even on SQL Server instances that do not have the R or Python add-on features installed.
2118

2219
## How real-time scoring works
2320

@@ -32,41 +29,58 @@ Real-time scoring is a multi-step process:
3229
3. You provide new input data, either tabular or single rows, as input to the model.
3330
4. To generate scores, call the sp_rxPredict stored procedure.
3431

35-
## Get started
32+
> [!TIP]
33+
> For an example of real-time scoring in action, see [End to End Loan ChargeOff Prediction Built Using Azure HDInsight Spark Clusters and SQL Server 2016 R Service](https://blogs.msdn.microsoft.com/rserver/2017/06/29/end-to-end-loan-chargeoff-prediction-built-using-azure-hdinsight-spark-clusters-and-sql-server-2016-r-service/).
3634
37-
For code examples and instructions, see [How to perform native scoring or real-time scoring](r/how-to-do-realtime-scoring.md).
35+
## Prerequisites
3836

39-
For an example of how rxPredict can be used for scoring, see [End to End Loan ChargeOff Prediction Built Using Azure HDInsight Spark Clusters and SQL Server 2016 R Service](https://blogs.msdn.microsoft.com/rserver/2017/06/29/end-to-end-loan-chargeoff-prediction-built-using-azure-hdinsight-spark-clusters-and-sql-server-2016-r-service/)
37+
+ [Enable SQL Server CLR integration](https://docs.microsoft.com/dotnet/framework/data/adonet/sql/introduction-to-sql-server-clr-integration).
4038

41-
> [!TIP]
42-
> If you are working exclusively in R code, you can also use the [rxPredict](https://docs.microsoft.com/r-server/r-reference/revoscaler/rxpredict) function for fast scoring.
39+
+ [Enable real-time scoring](#bkmk_enableRtScoring).
40+
41+
+ The model must be trained in advance using one of the supported **rx** algorithms. For R, real-time scoring with `sp_rxPredict` works with [RevoScaleR and MicrosoftML supported algorithms](#bkmk_rt_supported_algos). For Python, see [revoscalepy and microsoftml supported algorithms](#bkmk_py_supported_algos).
4342

44-
## Requirements
43+
+ Serialize the model using [rxSerializeModel](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel) for R, or [rx_serialize_model](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-serialize-model) for Python. These serialization functions have been optimized to support fast scoring.
4544

46-
Real-time scoring is supported on these platforms:
45+
> [!Note]
46+
> Real-time scoring is currently optimized for fast predictions on smaller data sets, ranging from a few rows to hundreds of thousands of rows. On big datasets, using [rxPredict](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxpredict) might be faster.
4747
48-
+ SQL Server 2017 Machine Learning Services
49-
+ SQL Server R Services 2016, with an upgrade of R components to 9.1.0 or later
48+
<a name="bkmk_py_supported_algos"></a>
5049

51-
On SQL Server, you must enable the real-time scoring feature in advance to add the CLR-based libraries to SQL Server.
50+
## Supported algorithms
5251

53-
For information regarding real-time scoring in a distributed environment based on Microsoft R Server, refer to the [publishService](https://docs.microsoft.com/machine-learning-server/r-reference/mrsdeploy/publishservice) function available in the [mrsDeploy package](https://docs.microsoft.com/machine-learning-server/r-reference/mrsdeploy/mrsdeploy-package), which supports publishing models for real-time scoring as a new a web service running on R Server.
52+
### Python algorithms using real-time scoring
5453

55-
### Restrictions
54+
+ revoscalepy models
5655

57-
+ The model must be trained in advance using one of the supported **rx** algorithms. For details, see [Supported algorithms](#bkmk_rt_supported_algos). Real-time scoring with `sp_rxPredict` supports both RevoScaleR and MicrosoftML algorithms.
56+
+ [rx_lin_mod](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-lin-mod) \*
57+
+ [rx_logit](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-logit) \*
58+
+ [rx_btrees](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-btrees) \*
59+
+ [rx_dtree](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-dtree) \*
60+
+ [rx_dforest](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-dforest) \*
61+
62+
Models marked with \* also support native scoring with the PREDICT function.
5863

59-
+ The model must be saved using the new serialization functions: [rxSerialize](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel) for R, and [rx_serialize_model](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-serialize-model) for Python. These serialization functions have been optimized to support fast scoring.
64+
+ microsoftml models
6065

61-
+ Real-time scoring does not use an interpreter; therefore, any functionality that might require an interpreter is not supported during the scoring step. These might include:
66+
+ [rx_fast_trees](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-fast-trees)
67+
+ [rx_fast_forest](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-fast-forest)
68+
+ [rx_logistic_regression](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-logistic-regression)
69+
+ [rx_oneclass_svm](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-oneclass-svm)
70+
+ [rx_neural_net](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-neural-network)
71+
+ [rx_fast_linear](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-fast-linear)
6272

63-
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms are not currently supported
73+
+ Transformations supplied by microsoftml
6474

65-
+ RevoScaleR models that use an R transformation function, or a formula that contains a transformation, such as <code>A ~ log(B)</code> are not supported in real-time scoring. To use a model of this type, we recommend that you perform the transformation on the to input data before passing the data to real-time scoring.
75+
+ [featurize_text](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/featurize-text)
76+
+ [concat](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/concat)
77+
+ [categorical](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/categorical)
78+
+ [categorical_hash](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/categorical-hash)
6679

67-
+ Real-time scoring is currently optimized for fast predictions on smaller data sets, ranging from a few rows to hundreds of thousands of rows. On big datasets, using [rxPredict](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxpredict) might be faster.
6880

69-
### <a name="bkmk_rt_supported_algos">Algorithms that support real-time scoring
81+
<a name="bkmk_rt_supported_algos"></a>
82+
83+
### R algorithms using real-time scoring
7084

7185
+ RevoScaleR models
7286

@@ -97,15 +111,91 @@ For information regarding real-time scoring in a distributed environment based o
97111

98112
### Unsupported model types
99113

100-
Real-time scoring is not supported for R transformations other than those explicitly listed in the previous section.
114+
Real-time scoring does not use an interpreter; therefore, any functionality that requires an interpreter is not supported during the scoring step. Known restrictions include:
115+
116+
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms are not supported.
117+
118+
+ Models using a transformation function, or a formula containing a transformation such as <code>A ~ log(B)</code>, are not supported in real-time scoring. To use a model of this type, we recommend that you perform the transformation on the input data before passing the data to real-time scoring.
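
The recommendation above (transform the input data before scoring) can be sketched in T-SQL. This is only an illustration under stated assumptions: the model is assumed to have been trained on a precomputed column (a formula such as `A ~ log_B` rather than `A ~ log(B)`), and the model name `a.linmod`, the table `new_observations`, and the columns `B` and `log_B` are hypothetical names that do not come from this article. The `ml_models` table is the one queried in the `sp_rxPredict` example.

```SQL
-- Sketch only: apply the transformation in the input query instead of in the model formula.
-- 'a.linmod', new_observations, B, and log_B are hypothetical names.
DECLARE @model varbinary(max);

SELECT @model = native_model_object
FROM ml_models
WHERE model_name = 'a.linmod'
  AND model_version = 'v1';

-- T-SQL LOG() returns the natural logarithm, which matches log() in R.
EXEC sp_rxPredict
    @model = @model,
    @inputData = N'SELECT LOG(B) AS log_B FROM new_observations WHERE B > 0';
```

Because the transformation happens in the query, the scoring step sees only plain numeric columns and never needs an interpreter.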
119+
120+
121+
## Example: sp_rxPredict
122+
123+
This section describes the steps required to set up **real-time** prediction, and provides an example in R of how to call the function from T-SQL.
124+
125+
<a name ="bkmk_enableRtScoring"></a>
126+
127+
### Step 1. Enable the real-time scoring procedure
128+
129+
You must enable this feature for each database that you want to use for scoring. The server administrator should run the command-line utility, RegisterRExt.exe, which is included with the RevoScaleR package.
130+
131+
> [!NOTE]
132+
> In order for real-time scoring to work, SQL CLR functionality needs to be enabled in the instance; additionally, the database needs to be marked trustworthy. When you run the script, these actions are performed for you. However, consider the additional security implications before doing this!
133+
134+
1. Open an elevated command prompt, and navigate to the folder where RegisterRExt.exe is located. The following path can be used in a default installation:
135+
136+
`<SQLInstancePath>\R_SERVICES\library\RevoScaleR\rxLibs\x64\`
101137

102-
For developers accustomed to working with RevoScaleR and other Microsoft R-specific libraries, unsupported functions include
103-
`rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR, PMML models, and other models created using other R libraries from CRAN or other repositories.
138+
2. Run the following command, substituting the name of your instance and the target database where you want to enable the extended stored procedures:
104139

105-
### Known issues
140+
`RegisterRExt.exe /installRts [/instance:name] /database:databasename`
106141

107-
+ `sp_rxPredict` returns an inaccurate message when a NULL value is passed as the model: "System.Data.SqlTypes.SqlNullValueException:Data in Null".
142+
For example, to add the extended stored procedure to the CLRPredict database on the default instance, type:
143+
144+
`RegisterRExt.exe /installRts /database:CLRPredict`
145+
146+
The instance name is optional if the database is on the default instance. If you are using a named instance, you must specify the instance name.
147+
148+
3. RegisterRExt.exe creates the following objects:
149+
150+
+ Trusted assemblies
151+
+ The stored procedure `sp_rxPredict`
152+
+ A new database role, `rxpredict_users`. The database administrator can use this role to grant permission to users who use the real-time scoring functionality.
153+
154+
4. Add any users who need to run `sp_rxPredict` to the new role.
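
After RegisterRExt.exe completes, the resulting configuration can be checked and the role membership granted from T-SQL. The following is a minimal sketch, assuming the `CLRPredict` database from the example above; the user name `scoring_user` is hypothetical and must already exist as a database user.

```SQL
-- Confirm that CLR integration is enabled on the instance (value_in_use = 1).
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'clr enabled';

-- Confirm that the target database was marked trustworthy.
SELECT name, is_trustworthy_on
FROM sys.databases
WHERE name = N'CLRPredict';

USE CLRPredict;
GO

-- Add a hypothetical database user to the role created by RegisterRExt.exe.
ALTER ROLE rxpredict_users ADD MEMBER [scoring_user];
```

Granting access through the role, rather than granting permissions on `sp_rxPredict` to individual users, keeps the set of scoring users easy to audit.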
155+
156+
> [!NOTE]
157+
>
158+
> In SQL Server 2017, additional security measures are in place to prevent problems with CLR integration. These measures impose additional restrictions on the use of this stored procedure as well.
159+
160+
### Step 2. Prepare and save the model
161+
162+
The binary format required by sp\_rxPredict is the same as the format required to use the PREDICT function. Therefore, in your R code, include a call to [rxSerializeModel](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel), and be sure to specify `realtimeScoringOnly = TRUE`, as in this example:
163+
164+
```R
165+
model <- rxSerializeModel(model.name, realtimeScoringOnly = TRUE)
166+
```
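
To make the serialized model available to `sp_rxPredict`, it has to be stored somewhere T-SQL can read it. The sketch below shows one possible way to do that, under several assumptions: the layout of the `ml_models` table is inferred from the query used in the scoring step, the model is trained on R's built-in `iris` data set with `rxDTree`, and R integration is available on the instance where the model is trained.

```SQL
-- Assumed table layout, inferred from the columns used in the scoring query.
CREATE TABLE ml_models (
    model_name nvarchar(100) NOT NULL,
    model_version nvarchar(100) NOT NULL,
    native_model_object varbinary(max) NOT NULL,
    PRIMARY KEY (model_name, model_version)
);
GO

DECLARE @model varbinary(max);

-- Train a decision tree in R and return the serialized model as an output parameter.
EXECUTE sp_execute_external_script
    @language = N'R',
    @script = N'
        iris.dtree <- rxDTree(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = iris)
        model <- rxSerializeModel(iris.dtree, realtimeScoringOnly = TRUE)
    ',
    @params = N'@model varbinary(max) OUTPUT',
    @model = @model OUTPUT;

-- Store the binary model under the name and version used in the scoring step.
INSERT INTO ml_models (model_name, model_version, native_model_object)
VALUES ('iris.dtree', 'v1', @model);
```

Only the training step needs R; once the binary model is in the table, scoring can run on an instance without the R add-on.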
167+
168+
### Step 3. Call sp_rxPredict
169+
170+
You call sp\_rxPredict as you would any other stored procedure. In the current release, the stored procedure takes only two parameters: _\@model_ for the model in binary format, and _\@inputData_ for the data to use in scoring, defined as a valid SQL query.
171+
172+
Because the binary format is the same as the one used by the PREDICT function, you can use the models and data table from the preceding example.
173+
174+
```SQL
175+
DECLARE @irismodel varbinary(max)
176+
SELECT @irismodel = [native_model_object] from [ml_models]
177+
WHERE model_name = 'iris.dtree'
178+
AND model_version = 'v1'
179+
180+
EXEC sp_rxPredict
181+
@model = @irismodel,
182+
@inputData = N'SELECT * FROM iris_rx_data'
183+
```
184+
185+
> [!NOTE]
186+
>
187+
> The call to sp\_rxPredict fails if the input data for scoring does not include columns that match the requirements of the model. Currently, only the following .NET data types are supported: double, float, short, ushort, long, ulong and string.
188+
>
189+
> Therefore, you might need to filter out unsupported types in your input data before using it for real-time scoring.
190+
>
191+
> For information about corresponding SQL types, see [SQL-CLR Type Mapping](/dotnet/framework/data/adonet/sql/linq/sql-clr-type-mapping) or [Mapping CLR Parameter Data](https://docs.microsoft.com/sql/relational-databases/clr-integration-database-objects-types-net-framework/mapping-clr-parameter-data).
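
One way to avoid unsupported types is to cast columns explicitly in the input query. The following sketch assumes that `iris_rx_data` has the four numeric columns shown; those column names are an assumption, not something this article specifies. A SQL `float` maps to a .NET `double`, which is in the supported list, whereas a SQL `int` maps to a 32-bit integer, which is not.

```SQL
DECLARE @irismodel varbinary(max);

SELECT @irismodel = native_model_object
FROM ml_models
WHERE model_name = 'iris.dtree'
  AND model_version = 'v1';

-- Cast each predictor to float so that it reaches sp_rxPredict as a .NET double.
-- The column names are assumed and must match the names the model was trained on.
EXEC sp_rxPredict
    @model = @irismodel,
    @inputData = N'SELECT
        CAST([Sepal.Length] AS float) AS [Sepal.Length],
        CAST([Sepal.Width] AS float) AS [Sepal.Width],
        CAST([Petal.Length] AS float) AS [Petal.Length],
        CAST([Petal.Width] AS float) AS [Petal.Width]
    FROM iris_rx_data';
```

Listing columns explicitly instead of using `SELECT *` also keeps columns of unsupported types out of the input.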
192+
193+
## Disable real-time scoring
194+
195+
To disable real-time scoring functionality, open an elevated command prompt, and run the following command: `RegisterRExt.exe /uninstallrts /database:<database_name> [/instance:name]`
108196
109197
## Next steps
110198
111-
[How to do real-time scoring](r/how-to-do-realtime-scoring.md)
199+
For an example of how rxPredict can be used for scoring, see [End to End Loan ChargeOff Prediction Built Using Azure HDInsight Spark Clusters and SQL Server 2016 R Service](https://blogs.msdn.microsoft.com/rserver/2017/06/29/end-to-end-loan-chargeoff-prediction-built-using-azure-hdinsight-spark-clusters-and-sql-server-2016-r-service/).
200+
201+
For more background on scoring in SQL Server, see [How to generate predictions in SQL Server machine learning](r/how-to-do-realtime-scoring.md).
