docs/advanced-analytics/r/how-to-do-realtime-scoring.md
Lines changed: 20 additions & 11 deletions
Original file line number
Diff line number
Diff line change
@@ -1,5 +1,6 @@
1
1
---
2
-
title: How to perform real-time scoring or native scoring in SQL Server Machine Learning | Microsoft Docs
2
+
title: How to generate forecasts and predictions using machine learning models in SQL Server | Microsoft Docs
3
+
description: Use rxPredict, or sp_rxPredict for real-time scoring, or PREDICT T-SQL for native scoring for predictions and forecasting in R and Python in SQL Server Machine Learning.
3
4
ms.prod: sql
4
5
ms.technology: machine-learning
5
6
@@ -9,20 +10,20 @@ author: HeidiSteen
9
10
ms.author: heidist
10
11
manager: cgronlun
11
12
---
12
-
# How to perform real-time scoring or native scoring in SQL Server
13
+
# How to generate forecasts and predictions using machine learning models in SQL Server
Using an existing model to forecast or predict outcomes for new data inputs is a core task in machine learning. This article enumerates the approaches for generating predictions in SQL Server. Among the approaches are internal processing methodologies for highspeed predictions, where speed is based on incremental reductions of run time dependencies. Fewer dependencies means faster predictions.
16
+
Using an existing model to forecast or predict outcomes for new data inputs is a core task in machine learning. This article enumerates the approaches for generating predictions in SQL Server. Among the approaches are internal processing methodologies for high-speed predictions, where speed is based on incremental reductions of run time dependencies. Fewer dependencies mean faster predictions.
16
17
17
18
Using the internal processing infrastructure (real-time or native scoring) comes with library requirements. Functions must be from the Microsoft libraries. R or Python code calling open-source or third-party functions is not supported in CLR or C++ extensions.
18
19
19
20
The following table summarizes the scoring frameworks for forecasting and predictions.
| Extensibility framework | R: rxpredict <br/>Python: rx_predict | None. Models can be based on any R or Python function | Hundreds of milliseconds. <br/>Loading a runtime environment has a fixed cost, averaging three to six hundred milliseconds, before any new data is scored. |
24
-
| Real-time scoring CLR extension |[sp_rxPredict](../../sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql.md) on a binary model | R: RevoScaleR, MicrosoftML <br/>Python: revoscalepy, microsoftml | Tens of milliseconds, on average. |
25
-
| Native scoring C++ extension|[PREDICT T-SQL function](../../sql/t-sql/queries/predict-transact-sql.md) on a binary model | R: RevoScaleR <br/>Python: revoscalepy | Less than 20 milliseconds, on average. |
24
+
| Extensibility framework | R: rxPredict <br/>Python: rx_predict | None. Models can be based on any R or Python function | Hundreds of milliseconds. <br/>Loading a runtime environment has a fixed cost, averaging three to six hundred milliseconds, before any new data is scored. |
25
+
| Real-time scoring CLR extension |[sp_rxPredict](https://docs.microsoft.com/sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql) on a binary model | R: RevoScaleR, MicrosoftML <br/>Python: revoscalepy, microsoftml | Tens of milliseconds, on average. |
26
+
| Native scoring C++ extension|[PREDICT T-SQL function](https://docs.microsoft.com/sql/t-sql/queries/predict-transact-sql) on a binary model | R: RevoScaleR <br/>Python: revoscalepy | Less than 20 milliseconds, on average. |
26
27
27
28
Speed of processing, not the substance of the output, is the differentiating feature. Assuming the same functions and inputs, the scored output should not vary based on the approach you use.
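To make the latency figures in the table concrete, the following Python sketch estimates the total time needed to serve a stream of independent single-row scoring requests under each approach. The per-call values are illustrative assumptions picked from within the ranges quoted in the table (450 ms for the extensibility framework, 50 ms for real-time scoring, 20 ms as an upper bound for native scoring). They are not measurements, and the function names are hypothetical.

```python
# Illustrative per-call latencies in milliseconds. These are assumed values
# chosen from the ranges in the table above, not benchmark results.
PER_CALL_MS = {
    "extensibility framework (rxPredict / rx_predict)": 450,
    "real-time scoring (sp_rxPredict)": 50,
    "native scoring (PREDICT)": 20,
}


def total_seconds(per_call_ms: float, requests: int) -> float:
    """Total time in seconds for `requests` independent single-row calls."""
    return per_call_ms * requests / 1000.0


def estimate(requests: int) -> dict:
    """Map each scoring approach to its estimated total time in seconds."""
    return {name: total_seconds(ms, requests) for name, ms in PER_CALL_MS.items()}


if __name__ == "__main__":
    for name, seconds in estimate(1000).items():
        print(f"{name}: {seconds:.1f} s for 1,000 requests")
```

Under these assumptions the fixed cost of loading a runtime dominates when requests arrive one at a time, which is why the request-response scenarios described in this article favor the CLR and C++ extensions.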
28
29
@@ -45,17 +46,17 @@ Taking a step back, the overall process of preparing the model and then generati
45
46
46
47
When the input includes many rows of data, it is usually faster to insert the prediction values into a table as part of the scoring process. Generating a single score is more typical in a scenario where you get input values from a form or user request, and return the score to a client application. To improve performance when generating successive scores, SQL Server might cache the model so that it can be reloaded into memory.
47
48
48
-
## Native and real-time scoring compared
49
+
## Compare methods
49
50
50
51
To preserve the integrity of core database engine processes, support for R and Python is enabled in a dual architecture that isolates language processing from RDBMS processing. Starting in SQL Server 2016, Microsoft added an extensibility framework that allows R scripts to be executed from T-SQL. In SQL Server 2017, Python integration was added.
51
52
52
-
The extensibility framework supports any operation you might perform in R or Python, ranging from simple functions to training complex machine learning models. However, the dual-process architecture requires invoking an external R or Python process for every call, regardless of the complexity of the operation. When the workload entails loading a pre-trained model from a table and scoring against it on data already in SQL Server, the overhead of calling the external processes adds latency that can be unacceptable in certain circumstances. For example, in a request-response pattern such as fraud detection, scores must be generated very quickly in order to be relevant.
53
+
The extensibility framework supports any operation you might perform in R or Python, ranging from simple functions to training complex machine learning models. However, the dual-process architecture requires invoking an external R or Python process for every call, regardless of the complexity of the operation. When the workload entails loading a pre-trained model from a table and scoring against it on data already in SQL Server, the overhead of calling the external processes adds latency that can be unacceptable in certain circumstances. For example, in a request-response pattern such as fraud detection, scores must be generated quickly in order to be relevant.
53
54
54
55
To support fast scoring, SQL Server added built-in scoring libraries as C++ and CLR extensions that eliminate the processing overhead of R and Python run times.
55
56
56
57
**Real-time scoring** was the first solution for high-performance scoring. Introduced in early versions of SQL Server 2017 and later updates to SQL Server 2016, real-time scoring relies on CLR libraries that stand in for R and Python processing over Microsoft-controlled functions in RevoScaleR, MicrosoftML (R), revoscalepy, and microsoftml (Python). CLR libraries are invoked using the **sp_rxPredict** stored procedure to generate scores from any supported model type, without calling the R or Python runtime.
57
58
58
-
**Native scoring** is a SQL Server 2017 feature, implemented as a native C++ library, but only for RevoScaleR and revoscalepy ,models. It is the fastest and more secure approach, but supports a smaller set of functions relative to other methodologies.
59
+
**Native scoring** is a SQL Server 2017 feature, implemented as a native C++ library, but only for RevoScaleR and revoscalepy models. It is the fastest and most secure approach, but supports a smaller set of functions relative to other methodologies.
59
60
60
61
## Choose a scoring method
61
62
@@ -85,9 +86,9 @@ From R code, call the [rxWriteObject](https://docs.microsoft.com/machine-learnin
85
86
86
87
If you use this function, be sure to serialize the model using [rxSerializeModel](https://docs.microsoft.com/r-server/r-reference/revoscaler/rxserializemodel) first. Then, set the *serialize* argument in `rxWriteObject` to FALSE, to avoid repeating the serialization step.
87
88
88
-
Serialing a model to a binary format is useful, but not required if you are scoring predictions using R and Python run time environment in the extensibility framework. You can save a model in raw byte format to a file and then read from the file into SQL Server. This option might be useful if you are moving or copying models between environments.
89
+
Serializing a model to a binary format is useful, but not required if you are scoring predictions using the R or Python runtime environment in the extensibility framework. You can save a model in raw byte format to a file and then read from the file into SQL Server. This option might be useful if you are moving or copying models between environments.
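As a rough illustration of the file-based option, the sketch below saves a model as raw bytes and reads the bytes back so they can be loaded into a `varbinary(max)` column. It rests on several assumptions: Python's standard `pickle` module stands in for the model serializer (for real-time or native scoring you would use `rx_serialize_model` or `rxSerializeModel` instead), the dictionary of coefficients is a hypothetical placeholder for a trained model, and the helper names and file name are arbitrary.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model object.
model = {"intercept": 0.5, "coefficients": {"age": 0.02, "income": 0.0001}}


def save_model(obj, path: str) -> int:
    """Serialize `obj` to raw bytes, write them to `path`, return the byte count."""
    raw = pickle.dumps(obj)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)


def load_model_bytes(path: str) -> bytes:
    """Read the raw byte stream back from disk."""
    with open(path, "rb") as f:
        return f.read()


def to_varbinary_literal(raw: bytes) -> str:
    """Format bytes as a T-SQL binary constant (0x followed by hex digits)."""
    return "0x" + raw.hex().upper()


if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "model.bin")
    save_model(model, path)
    raw = load_model_bytes(path)
    # Round trip: the bytes read from disk restore the same object.
    assert pickle.loads(raw) == model
    print(to_varbinary_literal(raw)[:18] + "...")
```

A client application would more commonly pass the bytes to SQL Server as a query parameter than as a literal; the literal form is shown here only because it makes the raw byte format visible.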
89
90
90
-
## Scoring in related Microsoft products
91
+
## Scoring in related products
91
92
92
93
If you are using the [standalone server](r-server-standalone.md) or a [Microsoft Machine Learning Server](https://docs.microsoft.com/machine-learning-server/what-is-machine-learning-server), you have other options besides stored procedures and T-SQL functions for generating predictions quickly. Both the standalone server and Machine Learning Server support the concept of a *web service* for code deployment. You can bundle an R or Python pre-trained model as a web service, called at run time to evaluate new data inputs. For more information, see these articles:
93
94
@@ -96,3 +97,11 @@ If you are using the [standalone server](r-server-standalone.md) or a [Microsoft
96
97
+[Deploy a Python model as a web service with azureml-model-management-sdk](https://docs.microsoft.com/machine-learning-server/operationalize/python/quickstart-deploy-python-web-service)
97
98
+[Publish an R code block or a real-time model as a new web service](https://docs.microsoft.com/machine-learning-server/r-reference/mrsdeploy/publishservice)
98
99
+[mrsdeploy package for R](https://docs.microsoft.com/machine-learning-server/r-reference/mrsdeploy/mrsdeploy-package)
docs/advanced-analytics/real-time-scoring.md
Lines changed: 15 additions & 16 deletions
@@ -1,6 +1,6 @@
1
1
---
2
2
title: Real-time scoring in SQL Server machine learning | Microsoft Docs
3
-
description: Generate predictions using sp_rxPredict, scoring dta inputs against a pre-trained model written in R on SQL Server.
3
+
description: Generate predictions using sp_rxPredict, scoring data inputs against a pre-trained model written in R on SQL Server.
4
4
ms.prod: sql
5
5
ms.technology: machine-learning
6
6
@@ -42,18 +42,14 @@ Real-time scoring is a multi-step process:
42
42
43
43
+ Serialize the model using [rxSerializeModel](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel) for R, and [rx_serialize_model](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-serialize-model) for Python. These serialization functions have been optimized to support fast scoring.
44
44
45
-
Real-time scoring does not use an interpreter; therefore, any functionality that might require an interpreter is not supported during the scoring step. These might include:
46
-
47
-
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms are not currently supported
48
-
49
-
+ RevoScaleR models that use an R transformation function, or a formula that contains a transformation, such as <code>A ~ log(B)</code> are not supported in real-time scoring. To use a model of this type, we recommend that you perform the transformation on the to input data before passing the data to real-time scoring.
50
-
51
45
> [!Note]
52
46
> Real-time scoring is currently optimized for fast predictions on smaller data sets, ranging from a few rows to hundreds of thousands of rows. On big datasets, using [rxPredict](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxpredict) might be faster.
53
47
54
48
<a name="bkmk_py_supported_algos"></a>
55
49
56
-
## Python algorithms using real-time scoring
50
+
## Supported algorithms
51
+
52
+
### Python algorithms using real-time scoring
57
53
58
54
+ revoscalepy models
59
55
@@ -84,7 +80,7 @@ Real-time scoring does not use an interpreter; therefore, any functionality that
84
80
85
81
<a name="bkmk_rt_supported_algos"></a>
86
82
87
-
## R algorithms using real-time scoring
83
+
### R algorithms using real-time scoring
88
84
89
85
+ RevoScaleR models
90
86
@@ -113,19 +109,22 @@ Real-time scoring does not use an interpreter; therefore, any functionality that
Real-time scoring does not use an interpreter; therefore, any functionality that might require an interpreter is not supported during the scoring step. These might include:
115
+
116
+
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms are not supported.
117
117
118
-
Real-time scoring is not supported for R transformations other than those explicitly listed in the previous section.
118
+
+ Models using a transformation function or formula containing a transformation, such as <code>A ~ log(B)</code>, are not supported in real-time scoring. To use a model of this type, we recommend that you perform the transformation on input data before passing the data to real-time scoring.
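
The recommended workaround can be sketched in Python as follows. Instead of relying on a formula such as `A ~ log(B)`, compute the transformed column up front and score against that column. The function name, the column name `log_B`, and the sample rows are hypothetical, and the same preprocessing would have to be applied to the training data so that the model is fit on the transformed column.

```python
import math


def add_log_column(rows, source="B", target="log_B"):
    """Return copies of `rows` with `target` set to the natural log of `source`.

    math.log with one argument is the natural logarithm, matching R's log().
    The input rows are left unmodified.
    """
    transformed = []
    for row in rows:
        new_row = dict(row)
        new_row[target] = math.log(row[source])
        transformed.append(new_row)
    return transformed


if __name__ == "__main__":
    new_data = [{"B": 1.0}, {"B": 10.0}]
    for row in add_log_column(new_data):
        print(row)
```

The transformed rows, not the raw ones, are then what you would write to the input table or query that the scoring call reads.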
119
119
120
-
For developers accustomed to working with RevoScaleR and other Microsoft R-specific libraries, unsupported functions include
121
-
`rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR, PMML models, and other models created using other R libraries from CRAN or other repositories.
122
120
121
+
## Example: sp_rxPredict
123
122
124
-
## Example (R): Real-time scoring with sp_rxPredict
123
+
This section describes the steps required to set up **real-time** prediction, and provides an example in R of how to call the function from T-SQL.
125
124
126
-
This section describes the steps required to set up **real-time** prediction, and provides an example of how to call the function from T-SQL.
125
+
<a name ="bkmk_enableRtScoring"></a>
127
126
128
-
### <a name ="bkmk_enableRtScoring"></a> Step 1. Enable the real-time scoring procedure
127
+
### Step 1. Enable the real-time scoring procedure
129
128
130
129
You must enable this feature for each database that you want to use for scoring. The server administrator should run the command-line utility, RegisterRExt.exe, which is included with the RevoScaleR package.
docs/advanced-analytics/sql-native-scoring.md
Lines changed: 9 additions & 16 deletions
@@ -20,7 +20,7 @@ Native scoring requires that you have an already trained model. In SQL Server 20
20
20
21
21
## How native scoring works
22
22
23
-
Native scoring uses native C++ libraries from Microsoft that can read an already trained model, previosuly stored in a special binary format or saved to disk as raw byte stream, and generate scores for new data inputs that you provide. Because the model is trained, published, and stored, it can be used for scoring without having to call the R or Python interpreter. As such, the overhead of multiple process interactions is reduced, resulting in much faster prediction performance in enterprise production scenarios.
23
+
Native scoring uses native C++ libraries from Microsoft that can read an already trained model, previously stored in a special binary format or saved to disk as a raw byte stream, and generate scores for new data inputs that you provide. Because the model is trained, published, and stored, it can be used for scoring without having to call the R or Python interpreter. As such, the overhead of multiple process interactions is reduced, resulting in much faster prediction performance in enterprise production scenarios.
24
24
25
25
To use native scoring, call the PREDICT T-SQL function and pass the following required inputs:
26
26
@@ -31,13 +31,11 @@ The function returns predictions for the input data, together with any columns o
31
31
32
32
## Prerequisites
33
33
34
-
PREDICT is available on all editions of SQL Server 2017 database engine and enabled by default, including SQL Server 2017 Machine Learning Services on Windows, SQL Server 2017 (Windows), SQL Server 2017 (Linux) or Azure SQL Database. You do not need to install R, Python, or enable additional features.
35
-
34
+
PREDICT is available on all editions of SQL Server 2017 database engine and enabled by default, including SQL Server 2017 Machine Learning Services on Windows, SQL Server 2017 (Windows), SQL Server 2017 (Linux), or Azure SQL Database. You do not need to install R, Python, or enable additional features.
36
35
37
-
## Model preparation
36
+
+ The model must be trained in advance using one of the supported **rx** algorithms listed below.
38
37
39
-
+ The model must be trained in advance using one of the supported **rx** algorithms. For details, see [Supported algorithms](#bkmk_native_supported_algos).
40
-
+ The model must be saved using the new serialization function provided in Microsoft R Server 9.1.0. The serialization function is optimized to support fast scoring.
38
+
+ Serialize the model using [rxSerializeModel](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel) for R, and [rx_serialize_model](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-serialize-model) for Python. These serialization functions have been optimized to support fast scoring.
41
39
42
40
<a name="bkmk_native_supported_algos"></a>
43
41
@@ -63,13 +61,12 @@ If you need to use models from MicrosoftML or microsoftml, use [real-time scorin
63
61
64
62
Unsupported model types include the following types:
65
63
66
-
+ Models containing other, unsupported types of R transformations
67
-
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR
64
+
+ Models containing other transformations
65
+
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR or revoscalepy equivalents
68
66
+ PMML models
69
-
+ Models created using other R libraries from CRAN or other repositories
70
-
+ Models containing any other R transformation
67
+
+ Models created using other open-source or third-party libraries
71
68
72
-
## Example: Native scoring with PREDICT
69
+
## Example: PREDICT (T-SQL)
73
70
74
71
In this example, you create a model, and then call the native scoring function, PREDICT, from T-SQL.
75
72
@@ -168,8 +165,4 @@ If you get the error, "Error occurred during execution of the function PREDICT.
168
165
For a complete solution that includes native scoring, see these samples from the SQL Server development team:
169
166
170
167
+ Deploy your ML script: [Using a Python model](https://microsoft.github.io/sql-ml-tutorials/python/rentalprediction/step/3.html)
171
-
+ Deploy your ML script: [Using an R model](https://microsoft.github.io/sql-ml-tutorials/R/rentalprediction/step/3.html)
172
-
173
-
## See also
174
-
175
-
[Real-time scoring in SQL Server machine learning ](real-time-scoring.md)
168
+
+ Deploy your ML script: [Using an R model](https://microsoft.github.io/sql-ml-tutorials/R/rentalprediction/step/3.html)