
Commit 47af3b9

committed
acrolinx errors and warnings, links, H2
1 parent 2ef4aff commit 47af3b9

3 files changed

Lines changed: 44 additions & 43 deletions


docs/advanced-analytics/r/how-to-do-realtime-scoring.md

Lines changed: 20 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
---
2-
title: How to perform real-time scoring or native scoring in SQL Server Machine Learning | Microsoft Docs
2+
title: How to generate forecasts and predictions using machine learning models in SQL Server | Microsoft Docs
3+
description: Use rxPredict, or sp_rxPredict for real-time scoring, or PREDICT T-SQL for native scoring for predictions and forecasting in R and Python in SQL Server Machine Learning.
34
ms.prod: sql
45
ms.technology: machine-learning
56

@@ -9,20 +10,20 @@ author: HeidiSteen
910
ms.author: heidist
1011
manager: cgronlun
1112
---
12-
# How to perform real-time scoring or native scoring in SQL Server
13+
# How to generate forecasts and predictions using machine learning models in SQL Server
1314
[!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md-winonly](../../includes/appliesto-ss-xxxx-xxxx-xxx-md-winonly.md)]
1415

15-
Using an existing model to forecast or predict outcomes for new data inputs is a core task in machine learning. This article enumerates the approaches for generating predictions in SQL Server. Among the approaches are internal processing methodologies for high speed predictions, where speed is based on incremental reductions of run time dependencies. Fewer dependencies means faster predictions.
16+
Using an existing model to forecast or predict outcomes for new data inputs is a core task in machine learning. This article enumerates the approaches for generating predictions in SQL Server. Among the approaches are internal processing methodologies for high-speed predictions, where speed is based on incremental reductions of run time dependencies. Fewer dependencies mean faster predictions.
1617

1718
Using the internal processing infrastructure (real-time or native scoring) comes with library requirements. Functions must be from the Microsoft libraries. R or Python code calling open-source or third-party functions is not supported in CLR or C++ extensions.
1819

1920
The following table summarizes the scoring frameworks for forecasting and predictions.
2021

2122
| Methodology | Interface | Library requirements | Processing speeds |
2223
|-----------------------|-------------------|----------------------|----------------------|
23-
| Extensibility framework | R: rxpredict <br/>Python: rx_predict | None. Models can be based on any R or Python function | Hundreds of milliseconds. <br/>Loading a runtime environment has a fixed cost, averaging three to six hundred milliseconds, before any new data is scored. |
24-
| Real-time scoring CLR extension | [sp_rxPredict](../../sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql.md) on a binary model | R: RevoScaleR, MicrosoftML <br/>Python: revoscalepy, microsoftml | Tens of milliseconds, on average. |
25-
| Native scoring C++ extension| [PREDICT T-SQL function](../../sql/t-sql/queries/predict-transact-sql.md) on a binary model | R: RevoScaleR <br/>Python: revoscalepy | Less than 20 milliseconds, on average. |
24+
| Extensibility framework | R: rxPredict <br/>Python: rx_predict | None. Models can be based on any R or Python function | Hundreds of milliseconds. <br/>Loading a runtime environment has a fixed cost, averaging three to six hundred milliseconds, before any new data is scored. |
25+
| Real-time scoring CLR extension | [sp_rxPredict](https://docs.microsoft.com//sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql) on a binary model | R: RevoScaleR, MicrosoftML <br/>Python: revoscalepy, microsoftml | Tens of milliseconds, on average. |
26+
| Native scoring C++ extension| [PREDICT T-SQL function](https://docs.microsoft.com/sql/t-sql/queries/predict-transact-sql) on a binary model | R: RevoScaleR <br/>Python: revoscalepy | Less than 20 milliseconds, on average. |
2627

2728
Speed of processing and not substance of the output is the differentiating feature. Assuming the same functions and inputs, the scored output should not vary based on the approach you use.
2829

@@ -45,17 +46,17 @@ Taking a step back, the overall process of preparing the model and then generati
4546

4647
When the input includes many rows of data, it is usually faster to insert the prediction values into a table as part of the scoring process. Generating a single score is more typical in a scenario where you get input values from a form or user request, and return the score to a client application. To improve performance when generating successive scores, SQL Server might cache the model so that it can be reloaded into memory.
4748

48-
## Native and real-time scoring compared
49+
## Compare methods
4950

5051
To preserve the integrity of core database engine processes, support for R and Python is enabled in a dual architecture that isolates language processing from RDBMS processing. Starting in SQL Server 2016, Microsoft added an extensibility framework that allows R scripts to be executed from T-SQL. In SQL Server 2017, Python integration was added.
5152

52-
The extensibility framework supports any operation you might perform in R or Python, ranging from simple functions to training complex machine learning models. However, the dual-process architecture requires invoking an external R or Python process for every call, regardless of the complexity of the operation. When the workload entails loading a pre-trained model from a table and scoring against it on data already in SQL Server, the overhead of calling the external processes adds latency that can be unacceptable in certain circumstances. For example, in a request-response pattern such as fraud detection, scores must be generated very quickly in order to be relevant.
53+
The extensibility framework supports any operation you might perform in R or Python, ranging from simple functions to training complex machine learning models. However, the dual-process architecture requires invoking an external R or Python process for every call, regardless of the complexity of the operation. When the workload entails loading a pre-trained model from a table and scoring against it on data already in SQL Server, the overhead of calling the external processes adds latency that can be unacceptable in certain circumstances. For example, in a request-response pattern such as fraud detection, scores must be generated quickly in order to be relevant.
5354

5455
To support fast scoring, SQL Server added built-in scoring libraries as C++ and CLR extensions that eliminate the processing overhead of R and Python run times.
5556

5657
**Real-time scoring** was the first solution for high-performance scoring. Introduced in early versions of SQL Server 2017 and later updates to SQL Server 2016, real-time scoring relies on CLR libraries that stand in for R and Python processing over Microsoft-controlled functions in RevoScaleR, MicrosoftML (R), revoscalepy, and microsoftml (Python). CLR libraries are invoked using the **sp_rxPredict** stored procedure to generate scores from any supported model type, without calling the R or Python runtime.
5758

58-
**Native scoring** is a SQL Server 2017 feature, implemented as a native C++ library, but only for RevoScaleR and revoscalepy ,models. It is the fastest and more secure approach, but supports a smaller set of functions relative to other methodologies.
59+
**Native scoring** is a SQL Server 2017 feature, implemented as a native C++ library, but only for RevoScaleR and revoscalepy models. It is the fastest and more secure approach, but supports a smaller set of functions relative to other methodologies.
5960

6061
## Choose a scoring method
6162

@@ -85,9 +86,9 @@ From R code, call the [rxWriteObject](https://docs.microsoft.com/machine-learnin
8586

8687
If you use this function, be sure to serialize the model using [rxSerializeModel](https://docs.microsoft.com/r-server/r-reference/revoscaler/rxserializemodel) first. Then, set the *serialize* argument in `rxWriteObject` to FALSE, to avoid repeating the serialization step.
8788

88-
Serialing a model to a binary format is useful, but not required if you are scoring predictions using R and Python run time environment in the extensibility framework. You can save a model in raw byte format to a file and then read from the file into SQL Server. This option might be useful if you are moving or copying models between environments.
89+
Serializing a model to a binary format is useful, but not required if you are scoring predictions using R and Python run time environment in the extensibility framework. You can save a model in raw byte format to a file and then read from the file into SQL Server. This option might be useful if you are moving or copying models between environments.
8990

90-
## Scoring in related Microsoft products
91+
## Scoring in related products
9192

9293
If you are using the [standalone server](r-server-standalone.md) or a [Microsoft Machine Learning Server](https://docs.microsoft.com/machine-learning-server/what-is-machine-learning-server), you have other options besides stored procedures and T-SQL functions for generating predictions quickly. Both the standalone server and Machine Learning Server support the concept of a *web service* for code deployment. You can bundle an R or Python pre-trained model as a web service, called at run time to evaluate new data inputs. For more information, see these articles:
9394

@@ -96,3 +97,11 @@ If you are using the [standalone server](r-server-standalone.md) or a [Microsoft
9697
+ [Deploy a Python model as a web service with azureml-model-management-sdk](https://docs.microsoft.com/machine-learning-server/operationalize/python/quickstart-deploy-python-web-service)
9798
+ [Publish an R code block or a real-time model as a new web service](https://docs.microsoft.com/machine-learning-server/r-reference/mrsdeploy/publishservice)
9899
+ [mrsdeploy package for R](https://docs.microsoft.com/machine-learning-server/r-reference/mrsdeploy/mrsdeploy-package)
100+
101+
102+
## See also
103+
104+
+ [rxSerializeModel](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel)
105+
+ [rxRealTimeScoring](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxrealtimescoring)
106+
+ [sp_rxPredict](https://docs.microsoft.com/sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql)
107+
+ [PREDICT T-SQL](https://docs.microsoft.com/sql/t-sql/queries/predict-transact-sql)

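For orientation, the extensibility framework row in the comparison table above can be sketched in T-SQL roughly as follows. This is a minimal sketch, not the article's own example: the table `dbo.models`, its columns `model_name` and `model`, and the input table `dbo.new_data` are hypothetical names, and the sketch assumes the model was stored with R's `serialize()` and that external scripts are enabled on the instance.

```sql
-- Hypothetical names: dbo.models(model_name, model) and dbo.new_data.
-- Assumes the model was stored with R's serialize(), so unserialize() can restore it.
DECLARE @model varbinary(max) =
    (SELECT model FROM dbo.models WHERE model_name = N'my_model');

-- Each call starts an external R process, which is the fixed cost noted in the table.
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        mod <- unserialize(model)
        OutputDataSet <- rxPredict(mod, data = InputDataSet)',
    @input_data_1 = N'SELECT * FROM dbo.new_data',
    @params = N'@model varbinary(max)',
    @model = @model;
```

Because the R runtime is loaded on every call, this path accepts any model type but carries the startup latency that real-time and native scoring are designed to avoid.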
docs/advanced-analytics/real-time-scoring.md

Lines changed: 15 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
title: Real-time scoring in SQL Server machine learning | Microsoft Docs
3-
description: Generate predictions using sp_rxPredict, scoring dta inputs against a pre-trained model written in R on SQL Server.
3+
description: Generate predictions using sp_rxPredict, scoring data inputs against a pre-trained model written in R on SQL Server.
44
ms.prod: sql
55
ms.technology: machine-learning
66

@@ -42,18 +42,14 @@ Real-time scoring is a multi-step process:
4242

4343
+ Serialize the model using [rxSerializeModel](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel) for R, and [rx_serialize_model](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-serialize-model) for Python. These serialization functions have been optimized to support fast scoring.
4444

45-
Real-time scoring does not use an interpreter; therefore, any functionality that might require an interpreter is not supported during the scoring step. These might include:
46-
47-
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms are not currently supported
48-
49-
+ RevoScaleR models that use an R transformation function, or a formula that contains a transformation, such as <code>A ~ log(B)</code> are not supported in real-time scoring. To use a model of this type, we recommend that you perform the transformation on the to input data before passing the data to real-time scoring.
50-
5145
> [!Note]
5246
> Real-time scoring is currently optimized for fast predictions on smaller data sets, ranging from a few rows to hundreds of thousands of rows. On big datasets, using [rxPredict](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxpredict) might be faster.
5347
5448
<a name="bkmk_py_supported_algos"></a>
5549

56-
## Python algorithms using real-time scoring
50+
## Supported algorithms
51+
52+
### Python algorithms using real-time scoring
5753

5854
+ revoscalepy models
5955

@@ -84,7 +80,7 @@ Real-time scoring does not use an interpreter; therefore, any functionality that
8480

8581
<a name="bkmk_rt_supported_algos"></a>
8682

87-
## R algorithms using real-time scoring
83+
### R algorithms using real-time scoring
8884

8985
+ RevoScaleR models
9086

@@ -113,19 +109,22 @@ Real-time scoring does not use an interpreter; therefore, any functionality that
113109
+ [categoricalHash](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/categoricalHash)
114110
+ [selectFeatures](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/selectFeatures)
115111

116-
## Unsupported model types
112+
### Unsupported model types
113+
114+
Real-time scoring does not use an interpreter; therefore, any functionality that might require an interpreter is not supported during the scoring step. These might include:
115+
116+
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms are not supported.
117117

118-
Real-time scoring is not supported for R transformations other than those explicitly listed in the previous section.
118+
+ Models using a transformation function or formula containing a transformation, such as <code>A ~ log(B)</code> are not supported in real-time scoring. To use a model of this type, we recommend that you perform the transformation on input data before passing the data to real-time scoring.
119119

120-
For developers accustomed to working with RevoScaleR and other Microsoft R-specific libraries, unsupported functions include
121-
`rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR, PMML models, and other models created using other R libraries from CRAN or other repositories.
122120

121+
## Example: sp_rxPredict
123122

124-
## Example (R): Real-time scoring with sp_rxPredict
123+
This section describes the steps required to set up **real-time** prediction, and provides an example in R of how to call the function from T-SQL.
125124

126-
This section describes the steps required to set up **real-time** prediction, and provides an example of how to call the function from T-SQL.
125+
<a name ="bkmk_enableRtScoring"></a>
127126

128-
### <a name ="bkmk_enableRtScoring"></a> Step 1. Enable the real-time scoring procedure
127+
### Step 1. Enable the real-time scoring procedure
129128

130129
You must enable this feature for each database that you want to use for scoring. The server administrator should run the command-line utility, RegisterRExt.exe, which is included with the RevoScaleR package.
131130

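Once real-time scoring is enabled for a database and a serialized model is stored in a table, the scoring call itself might look like the sketch below. The table and column names (`dbo.models`, `model_name`, `model`, `dbo.new_data`) are hypothetical placeholders, and the sketch assumes the model was serialized with `rxSerializeModel` (R) or `rx_serialize_model` (Python).

```sql
-- Hypothetical names: dbo.models(model_name, model) and dbo.new_data.
-- Assumes the model was serialized for real-time scoring and that
-- real-time scoring has been enabled for this database (Step 1).
DECLARE @model varbinary(max) =
    (SELECT model FROM dbo.models WHERE model_name = N'my_model');

-- No R or Python runtime is started; the CLR libraries score the rows directly.
EXEC sp_rxPredict
    @model = @model,
    @inputData = N'SELECT * FROM dbo.new_data';
```

The input query should return only columns whose names and types match the variables the model was trained on.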
docs/advanced-analytics/sql-native-scoring.md

Lines changed: 9 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ Native scoring requires that you have an already trained model. In SQL Server 20
2020

2121
## How native scoring works
2222

23-
Native scoring uses native C++ libraries from Microsoft that can read an already trained model, previosuly stored in a special binary format or saved to disk as raw byte stream, and generate scores for new data inputs that you provide. Because the model is trained, published, and stored, it can be used for scoring without having to call the R or Python interpreter. As such, the overhead of multiple process interactions is reduced, resulting in much faster prediction performance in enterprise production scenarios.
23+
Native scoring uses native C++ libraries from Microsoft that can read an already trained model, previously stored in a special binary format or saved to disk as raw byte stream, and generate scores for new data inputs that you provide. Because the model is trained, published, and stored, it can be used for scoring without having to call the R or Python interpreter. As such, the overhead of multiple process interactions is reduced, resulting in much faster prediction performance in enterprise production scenarios.
2424

2525
To use native scoring, call the PREDICT T-SQL function and pass the following required inputs:
2626

@@ -31,13 +31,11 @@ The function returns predictions for the input data, together with any columns o
3131

3232
## Prerequisites
3333

34-
PREDICT is available on all editions of SQL Server 2017 database engine and enabled by default, including SQL Server 2017 Machine Learning Services on Windows, SQL Server 2017 (Windows), SQL Server 2017 (Linux) or Azure SQL Database. You do not need to install R, Python, or enable additional features.
35-
34+
PREDICT is available on all editions of SQL Server 2017 database engine and enabled by default, including SQL Server 2017 Machine Learning Services on Windows, SQL Server 2017 (Windows), SQL Server 2017 (Linux), or Azure SQL Database. You do not need to install R, Python, or enable additional features.
3635

37-
## Model preparation
36+
+ The model must be trained in advance using one of the supported **rx** algorithms listed below.
3837

39-
+ The model must be trained in advance using one of the supported **rx** algorithms. For details, see [Supported algorithms](#bkmk_native_supported_algos).
40-
+ The model must be saved using the new serialization function provided in Microsoft R Server 9.1.0. The serialization function is optimized to support fast scoring.
38+
+ Serialize the model using [rxSerializeModel](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel) for R, and [rx_serialize_model](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-serialize-model) for Python. These serialization functions have been optimized to support fast scoring.
4139

4240
<a name="bkmk_native_supported_algos"></a>
4341

@@ -63,13 +61,12 @@ If you need to use models from MicrosoftML or microsoftml, use [real-time scorin
6361

6462
Unsupported model types include the following types:
6563

66-
+ Models containing other, unsupported types of R transformations
67-
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR
64+
+ Models containing other transformations
65+
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR or revoscalepy equivalents
6866
+ PMML models
69-
+ Models created using other R libraries from CRAN or other repositories
70-
+ Models containing any other R transformation
67+
+ Models created using other open-source or third-party libraries
7168

72-
## Example: Native scoring with PREDICT
69+
## Example: PREDICT (T-SQL)
7370

7471
In this example, you create a model, and then call the native scoring function, PREDICT, from T-SQL.
7572

@@ -168,8 +165,4 @@ If you get the error, "Error occurred during execution of the function PREDICT.
168165
For a complete solution that includes native scoring, see these samples from the SQL Server development team:
169166

170167
+ Deploy your ML script: [Using a Python model](https://microsoft.github.io/sql-ml-tutorials/python/rentalprediction/step/3.html)
171-
+ Deploy your ML script: [Using an R model](https://microsoft.github.io/sql-ml-tutorials/R/rentalprediction/step/3.html)
172-
173-
## See also
174-
175-
[Real-time scoring in SQL Server machine learning ](real-time-scoring.md)
168+
+ Deploy your ML script: [Using an R model](https://microsoft.github.io/sql-ml-tutorials/R/rentalprediction/step/3.html)
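
As a compact illustration of the PREDICT call described under "How native scoring works", the following sketch passes a stored model and a data source to the function. The names `dbo.models`, `model_name`, `model`, and `dbo.new_data` are hypothetical, and the output column `Score` is an assumption: the actual output column name and type depend on the algorithm used to train the model.

```sql
-- Hypothetical names: dbo.models(model_name, model) and dbo.new_data.
-- Assumes the model was serialized with rxSerializeModel or rx_serialize_model.
DECLARE @model varbinary(max) =
    (SELECT model FROM dbo.models WHERE model_name = N'my_model');

-- "Score float" is an assumed output schema; adjust it to match the model type.
SELECT d.*, p.*
FROM PREDICT(MODEL = @model, DATA = dbo.new_data AS d)
WITH (Score float) AS p;
```

Because PREDICT is a table-valued construct in the FROM clause, its output can be joined, filtered, or inserted into a table like any other row source.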
