# Convert R code for execution in SQL Server (In-Database) instances
[!INCLUDE [SQL Server 2016 and later](../../includes/applies-to-version/sqlserver2016.md)]
However, your code might require substantial changes if any of the following apply:
+ The code makes separate calls to data sources outside SQL Server, such as Excel worksheets, files on shares, and other databases.
+ You want to run the code in the *\@script* parameter of [sp_execute_external_script](../../relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql.md) and also parameterize the stored procedure.
+ Your original solution includes multiple steps that might be more efficient in a production environment if executed independently, such as data preparation or feature engineering vs. model training, scoring, or reporting.
+ You want to optimize performance by changing libraries, using parallel execution, or offloading some processing to SQL Server.
## Step 1. Plan requirements and resources
### Packages
+ Determine which packages are needed and ensure that they work on SQL Server.
+ Install packages in advance, in the default package library used by Machine Learning Services. User libraries are not supported.
### Data sources
+ If you intend to embed your R code in [sp_execute_external_script](../../relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql.md), identify primary and secondary data sources.
**Primary** data sources are large datasets, such as model training data, or input data for predictions. Plan to map your largest dataset to the input parameter of [sp_execute_external_script](../../relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql.md).
**Secondary** data sources are typically smaller data sets, such as lists of factors, or additional grouping variables.

Currently, sp_execute_external_script supports only a single dataset as input to the stored procedure. However, you can add multiple scalar or binary inputs.
Stored procedure calls preceded by EXECUTE cannot be used as an input to [sp_execute_external_script](../../relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql.md). You can use queries, views, or any other valid SELECT statement.
+ Determine the outputs you need. If you run R code using sp_execute_external_script, the stored procedure can output just one data frame as a result. However, you can also output multiple scalar outputs, including plots and models in binary format, as well as other scalar values derived from R code or SQL parameters.
### Data types
+ Make a checklist of possible data type issues.
All R data types are supported by SQL Server Machine Learning Services. However, [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] supports a greater variety of data types than R does. Therefore, some implicit data type conversions are performed when sending [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] data to R, and vice versa. You might need to explicitly cast or convert some data.
NULL values are supported. However, R uses the `NA` construct to represent a missing value, which is similar to a null.
+ Consider eliminating dependency on data that cannot be used by R: for example, rowid and GUID data types from SQL Server cannot be consumed by R and generate errors.
For more information, see [R Libraries and Data Types](../r/r-libraries-and-data-types.md).
## Step 2. Convert or repackage code
+ When running R in a stored procedure, you can pass through multiple **scalar** inputs. For any parameters that you want to use in the output, add the **OUTPUT** keyword.
For example, the following scalar input `@model_name` contains the model name, which is also output in its own column in the results:
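A minimal sketch of this pattern might look like the following; the model name, score value, and column names are illustrative, not part of the original example:

```sql
-- Hypothetical sketch: pass @model_name in as a scalar input, emit it as a
-- column of the result set, and return an updated value through OUTPUT.
DECLARE @model_name nvarchar(100) = N'default model';

EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        # model_name is mapped by name from the @model_name SQL parameter.
        OutputDataSet <- data.frame(model_name = model_name, score = 0.5);
        # A new value assigned here is returned to T-SQL via OUTPUT.
        model_name <- paste(model_name, "(scored)");
    ',
    @params = N'@model_name nvarchar(100) OUTPUT',
    @model_name = @model_name OUTPUT
WITH RESULT SETS ((model_name nvarchar(100), score float));

SELECT @model_name AS updated_model_name;
```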
+ Any variables that you pass in as parameters of the stored procedure [sp_execute_external_script](../../relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql.md) must be mapped to variables in the R code. By default, variables are mapped by name.
All columns in the input dataset must also be mapped to variables in the R script. For example, assume your R script contains a formula like this one:
```R
formula <- ArrDelay ~ CRSDepTime + DayOfWeek + CRSDepHour:DayOfWeek
```

An error is raised if the input dataset does not contain columns with the matching names ArrDelay, CRSDepTime, DayOfWeek, and CRSDepHour.
+ In some cases, an output schema must be defined in advance for the results.
For example, to insert the data into a table, you must use the **WITH RESULT SETS** clause to specify the schema.
The output schema is also required if the R script uses the argument `@parallel=1`. The reason is that multiple processes might be created by SQL Server to run the query in parallel, with the results collected at the end. Therefore, the output schema must be prepared before the parallel processes can be created.

In other cases, you can omit the result schema by using the option **WITH RESULT SETS UNDEFINED**. This statement returns the dataset from the R script without naming the columns or specifying the SQL data types.
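As a sketch (the source table and columns are assumptions), a parallel pass-through call with the required explicit schema might look like:

```sql
-- Hypothetical sketch: @parallel = 1 requires the output schema up front,
-- because SQL Server may split the input query across several R processes
-- and combine their results at the end.
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- InputDataSet  # placeholder per-partition work',
    @input_data_1 = N'SELECT CustomerID, Amount FROM dbo.Sales',  -- assumed table
    @parallel = 1
WITH RESULT SETS ((CustomerID int, Amount money));
```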
+ Consider generating timing or tracking data using T-SQL rather than R.
For example, you could pass the system time or other information used for auditing and storage by adding a T-SQL call that is passed through to the results, rather than generating similar data in the R script.
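A sketch of this idea (the returned values are placeholders) captures the timestamp in T-SQL and keeps it alongside the R results, instead of calling `Sys.time()` inside the script:

```sql
-- Hypothetical sketch: generate audit data in T-SQL, not in R.
DECLARE @run_started datetime2 = SYSDATETIME();

EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- data.frame(result = c(1.0, 2.0))'
WITH RESULT SETS ((result float));

-- The tracking value is produced entirely on the T-SQL side.
SELECT @run_started AS run_started;
```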
+ Avoid writing predictions or intermediate results to file. Write predictions to a table instead, to avoid data movement.
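One way to follow this guidance (a sketch; the temp table and prediction values are illustrative) is to capture the procedure's result set directly into a table with `INSERT ... EXEC`:

```sql
-- Hypothetical sketch: land predictions in a table so nothing is written
-- to the file system or pulled across the network to the client.
CREATE TABLE #Predictions (CustomerID int, PredictedValue float);

INSERT INTO #Predictions (CustomerID, PredictedValue)
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- data.frame(CustomerID = 1L, PredictedValue = 0.5)';
```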
::: moniker-end
+ Run all queries in advance, and review the SQL Server query plans to identify tasks that can be performed in parallel.
If the input query can be parallelized, set `@parallel=1` as part of your arguments to [sp_execute_external_script](../../relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql.md).
Parallel processing with this flag is typically possible any time that SQL Server can work with partitioned tables or distribute a query among multiple processes and aggregate the results at the end. Parallel processing with this flag is typically not possible if you are training models using algorithms that require all data to be read, or if you need to create aggregates.
+ Review your R code to determine if there are steps that can be performed independently, or performed more efficiently, by using a separate stored procedure call. For example, you might get better performance by doing feature engineering or feature extraction separately, and saving the values to a table.
+ Look for ways to use T-SQL rather than R code for set-based computations.
For example, this R solution shows how user-defined T-SQL functions and R can perform the same feature engineering task: [Data Science End-to-End Walkthrough](../tutorials/walkthrough-data-science-end-to-end-walkthrough.md).
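As an illustration (the table name is assumed; the column names follow the formula shown earlier), a per-group aggregate that might be computed row-by-row in R is a single set-based statement in T-SQL:

```sql
-- Hypothetical sketch: let the database engine do the set-based work.
SELECT DayOfWeek, AVG(ArrDelay) AS AvgArrDelay
FROM dbo.Flights          -- assumed source table
GROUP BY DayOfWeek;
```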
::: moniker-end
+ If possible, replace conventional R functions with **ScaleR** functions that support distributed execution. For more information, see [Comparison of Base R and ScaleR Functions](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/revoscaler-compared-to-base-r).
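For example (a sketch, reusing the formula columns shown earlier; `flightData` is an assumed data source), a base R model fit can often be swapped for its ScaleR counterpart, which processes data in chunks and can run in the SQL Server compute context:

```R
# Base R: single-process, entirely in memory.
fit <- lm(ArrDelay ~ CRSDepTime + DayOfWeek, data = flightData)

# ScaleR equivalent: chunked, distributable across compute contexts.
fit <- rxLinMod(ArrDelay ~ CRSDepTime + DayOfWeek, data = flightData)
```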
+ Consult with a database developer to determine ways to improve performance by using SQL Server features such as [memory-optimized tables](https://docs.microsoft.com/sql/relational-databases/in-memory-oltp/introduction-to-memory-optimized-tables) or, if you have Enterprise Edition, [Resource Governor](https://docs.microsoft.com/sql/relational-databases/resource-governor/resource-governor).
## Step 3. Prepare for deployment
+ Notify the administrator so that packages can be installed and tested in advance of deploying your code.
In a development environment, it might be okay to install packages as part of your code, but this is a bad practice in a production environment.
User libraries are not supported, regardless of whether you are using a stored procedure or running R code in the SQL Server compute context.
### Package your R code in a stored procedure
+ If your code is relatively simple, you can embed it in a T-SQL user-defined function without modification, as described in this sample:
[Feature engineering using T-SQL and R](../tutorials/r-taxi-classification-create-features.md)
+ If the code is more complex, use the R package **sqlrutils** to convert your code. This package is designed to help experienced R users write good stored procedure code.
The first step is to rewrite your R code as a single function with clearly defined inputs and outputs.
Then, use the **sqlrutils** package to generate the input and outputs in the correct format. The **sqlrutils** package generates the complete stored procedure code for you, and can also register the stored procedure in the database.
For more information and examples, see [sqlrutils (SQL)](ref-r-sqlrutils.md).
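A rough sketch of the workflow might look like the following; the function, query, and connection string are all assumptions for illustration, so check the exact argument names against the **sqlrutils** reference:

```R
library(sqlrutils)

# 1. One function with clearly defined inputs (a data frame) and outputs.
#    scoreBatch and its formula are hypothetical.
scoreBatch <- function(inData) {
    inData$score <- predict(lm(ArrDelay ~ CRSDepTime, data = inData))
    return(list(scored = inData))
}

# 2. Describe the dataset in and out, then generate and register the procedure.
indata  <- InputData("inData",
                     defaultQuery = "SELECT ArrDelay, CRSDepTime FROM dbo.Flights")
outdata <- OutputData("scored")
sp <- StoredProcedure(scoreBatch, "spScoreBatch", indata, outdata,
                      connectionString = connStr)  # connStr defined elsewhere
registerStoredProcedure(sp)
```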
### Integrate with other workflows
+ Leverage T-SQL tools and ETL processes. Perform feature engineering, feature extraction, and data cleansing in advance as part of data workflows.
When you are working in a dedicated R development environment such as [!INCLUDE[rsql_rtvs_md](../../includes/rsql-rtvs-md.md)] or RStudio, you might pull data to your computer, analyze the data iteratively, and then write out or display the results.

However, when standalone R code is migrated to SQL Server, much of this process can be simplified or delegated to other SQL Server tools.
+ Use secure, asynchronous visualization strategies.
Users of SQL Server often cannot access files on the server, and SQL client tools typically do not support the R graphics device. If you generate plots or other graphics as part of the solution, consider exporting the plots as binary data and saving them to a table.
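One approach (a sketch; the table name and placeholder plot are assumptions) is to render the plot to a temporary file inside the script, read it back as raw bytes, and return it as a varbinary column:

```sql
-- Hypothetical sketch: serialize an R plot to binary and store it in a table.
CREATE TABLE #Plots (plot varbinary(max));

INSERT INTO #Plots (plot)
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        image_file <- tempfile();
        jpeg(filename = image_file);
        plot(1:10);   # placeholder plot
        dev.off();
        # Read the rendered image back as raw bytes for the result set.
        OutputDataSet <- data.frame(
            plot = readBin(file(image_file, "rb"), what = raw(), n = 1e6));
    ';
```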
+ Wrap prediction and scoring functions in stored procedures for direct access by applications.
## Next steps
To view examples of how an R solution can be deployed in SQL Server, see these samples:
+ [Tutorial: Develop a predictive model in R with SQL machine learning](../tutorials/r-predictive-model-introduction.md)

+ [R tutorial: Predict NYC taxi fares with binary classification](../tutorials/r-taxi-classification-introduction.md)