---
title: "Non-R Models"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Non-R Models}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
A model that is trained in any language are able to integrate with `tidypredict`, and thus with `broom`. The requirement is that the model in that language is exported using the parse model spec. The easiest file format would be YAML.
## python example
A model that was fitted using `sklearn`'s `linear_model`. The model is based on diabetes data. Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442 diabetes patients, as well as the response of interest, a quantitative measure of disease progression one year after baseline. The model's results were converted to YAML by the same python script, I copied and pasted the top part here:
```
general:
is_glm: 0
model: lm
residual: 0
sigma2: 0
type: regression
version: 2.0
terms:
- coef: 152.76430691633442
fields:
- col: (Intercept)
type: ordinary
is_intercept: 1
label: (Intercept)
```
## Read in R
The YAML data can be read in R by using the `yaml` package. In this example, we have copy-pasted most of the models inside a variable called `sklearn_model`. Because `yaml` requires local YAML variables to be split by line, we use `strsplit()`.
```{r}
library(yaml)
sklearn_model <- strsplit("general:
is_glm: 0
model: lm
residual: 0
sigma2: 0
type: regression
version: 2.0
terms:
- coef: 152.76430691633442
fields:
- col: (Intercept)
type: ordinary
is_intercept: 1
label: (Intercept)
- coef: 0.3034995490660432
fields:
- col: age
type: ordinary
is_intercept: 0
label: age
- coef: -237.63931533353403
fields:
- col: sex
type: ordinary
is_intercept: 0
label: sex
- coef: 510.5306054362253
fields:
- col: bmi
type: ordinary
is_intercept: 0
label: bmi
- coef: 327.7369804093466
fields:
- col: bp
type: ordinary
is_intercept: 0
label: bp
- coef: -814.1317093725387
fields:
- col: s1
type: ordinary
is_intercept: 0
label: s1
", split = "\n")[[1]]
```
Now the model is converted to an R `list` using `yaml.load`.
```{r}
sklearn_model <- yaml.load(sklearn_model)
str(sklearn_model, 2)
```
## `tidypredict`
The `list` object needs to be recognized as a `tidypredict` parsed model. To do that, we use `as_parsed_model()`
```{r}
library(tidypredict)
spm <- as_parsed_model(sklearn_model)
class(spm)
```
The `spm` variable now works just as any parsed model inside R. Use `tidypredict_fit()` to view the resulting formula.
```{r}
tidypredict_fit(spm)
```
Now, the model can run **inside a database**
```{r}
tidypredict_sql(spm, dbplyr::simulate_mssql())
```
## `broom`
Now that we have a `parsed_model` object, it is possible to use `broom`'s `tidy()` function. This means that we are able to integrate a totally external model, with `broom`.
```{r}
tidy(spm)
```