---
title: "policy_data"
output:
  rmarkdown::html_vignette:
    fig_caption: true
    toc: true
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{policy_data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: ref.bib
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
```{r setup, message = FALSE}
library(polle)
```
This vignette is a guide to `policy_data()`. As the name suggests, the function creates a `policy_data` object with
a specific data structure making it easy to use in combination with `policy_def()`, `policy_learn()`, and `policy_eval()`.
The vignette is also a guide to some of the associated S3 functions
which transform or access parts of the data, see `?policy_data` and `methods(class="policy_data")`.
We will start by looking at a simple single-stage example, then consider a fixed two-stage example with varying action sets and data in wide format, and
finally we will look at an example with a stochastic number of stages and data in long format.
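As a quick orientation, the S3 methods for the `policy_data` class can be listed directly:
```{r methods}
methods(class = "policy_data")
```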
# Single-stage: wide data
Consider a simple single-stage problem with covariates/state variables $(Z, L, B)$, binary action variable $A$, and
utility outcome $U$. We use `sim_single_stage()` to simulate data:
```{r single stage data}
(d <- sim_single_stage(n = 5e2, seed=1)) |> head()
```
We give instructions to `policy_data()` which variables define the `action`, the state `covariates`, and the `utility` variable:
```{r pdss}
pd <- policy_data(d, action="A", covariates=list("Z", "B", "L"), utility="U")
pd
```
In the single-stage case the history $H$ is just $(B, Z, L)$. We access the history and actions using
`get_history()`:
```{r gethistoryss}
get_history(pd)$H |> head()
get_history(pd)$A |> head()
```
Similarly, we access the utility outcomes $U$:
```{r get}
get_utility(pd) |> head()
```
```{r cleanup, include=FALSE}
rm(list = ls())
```
# Two-stage: wide data
Consider a two-stage problem with observations $O = (B, BB, L_1, C_1, U_1, A_1, L_2, C_2, U_2, A_2, U_3)$. Following the general notation introduced
in Section 3.1 of [@nordland2023policy], $(B, BB)$ are the baseline covariates, $S_k = (L_k, C_k)$ are the
state covariates at stage $k$, $A_k$ is the action at stage $k$, and $U_k$ is the reward at stage $k$.
The utility is the sum of the rewards, $U = U_1 + U_2 + U_3$.
We use `sim_two_stage_multi_actions()` to simulate data:
```{r simtwostage}
d <- sim_two_stage_multi_actions(n=2e3, seed = 1)
colnames(d)
```
Note that the data is in wide format.
The data is transformed using `policy_data()` with instructions on which
variables define the actions, baseline covariates, state covariates, and the rewards:
```{r pdtwostage}
pd <- policy_data(d,
                  action = c("A_1", "A_2"),
                  baseline = c("B", "BB"),
                  covariates = list(L = c("L_1", "L_2"),
                                    C = c("C_1", "C_2")),
                  utility = c("U_1", "U_2", "U_3"))
pd
```
The length of the character vector `action` determines the number of stages `K` (in this case 2).
If the number of stages is 2 or more, the `covariates` argument must be a named list, and each element must be
a character vector with length equal to the number of stages. If a covariate is not available at a
given stage, we insert an `NA` value, e.g., `L = c(NA, "L_2")`.
Finally, the `utility` argument must be either a single character string (the name of the utility observed after stage `K`)
or a character vector of length `K+1` with the names of the rewards.
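As a small illustration of the `NA` convention, the following (not evaluated) call treats `C` as if it were unavailable at stage 1:
```{r pdna, eval = FALSE}
# Hypothetical sketch: C is marked as unobserved at stage 1 via an NA entry.
policy_data(d,
            action = c("A_1", "A_2"),
            baseline = c("B", "BB"),
            covariates = list(L = c("L_1", "L_2"),
                              C = c(NA, "C_2")),
            utility = c("U_1", "U_2", "U_3"))
```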
In this example, the observed action sets vary for each stage. `get_action_set()` returns the
global action set and `get_stage_action_sets()` returns the action set for each stage:
```{r getactionsets}
get_action_set(pd)
get_stage_action_sets(pd)
```
The full histories $H_1 = (B, BB, L_{1}, C_{1})$ and $H_2=(B, BB, L_{1}, C_{1}, A_{1}, L_{2}, C_{2})$ are available using `get_history()` and `full_history = TRUE`:
```{r gethistwostage}
get_history(pd, stage = 1, full_history = TRUE)$H |> head()
get_history(pd, stage = 2, full_history = TRUE)$H |> head()
```
Similarly, we access the associated actions at each stage via list element `A`:
```{r}
get_history(pd, stage = 1, full_history = TRUE)$A |> head()
get_history(pd, stage = 2, full_history = TRUE)$A |> head()
```
Alternatively, the state/Markov type history and actions are available using `full_history = FALSE`:
```{r gethisstate}
get_history(pd, full_history = FALSE)$H |> head()
get_history(pd, full_history = FALSE)$A |> head()
```
Note that `policy_data()` renames the action variables to `A_1`, `A_2`, ... in the full history case and
to `A` in the state/Markov history case.
As in the single-stage case, we access the utility, i.e., the sum of the rewards, using
`get_utility()`:
```{r getutiltwo}
get_utility(pd) |> head()
```
# Multi-stage: long data
In this example we illustrate how `polle` handles decision
processes with a stochastic number of stages, see Section 3.5 in [@nordland2023policy].
The data is simulated using `sim_multi_stage()`.
Detailed information on the simulation is available in `?sim_multi_stage`.
We simulate data from 2000 iid subjects:
```{r sim_data}
d <- sim_multi_stage(2e3, seed = 1)
```
As described, the stage data is in long format:
```{r view_data}
d$stage_data[, -(9:10)] |> head()
```
The `id` variable is important for identifying which rows belong
to each subject. The baseline data uses the same `id` variable:
```{r view_b_data}
d$baseline_data |> head()
```
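As a quick illustrative check, the two data sets should refer to the same set of subjects:
```{r check_ids}
# every id in the stage data should also appear in the baseline data
all(unique(d$stage_data$id) %in% d$baseline_data$id)
```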
The data is transformed using `policy_data()` with `type = "long"`.
The names of the `id`, `stage`, `event`, `action`,
and `utility` variables must be specified. The event variable, inspired by
the event variable in `survival::Surv()`, is `0` whenever an
action occurs and `1` for a terminal event.
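As a quick illustration, we can tabulate the event variable in the simulated stage data before passing it on:
```{r event_table}
# 0: an action is taken at the given stage; 1: the trajectory terminates
table(d$stage_data$event)
```
With these variable names in place, we create the `policy_data` object: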
```{r pd}
pd <- policy_data(data = d$stage_data,
                  baseline_data = d$baseline_data,
                  type = "long",
                  id = "id",
                  stage = "stage",
                  event = "event",
                  action = "A",
                  utility = "U")
pd
```
In some cases we are only interested in analyzing a subset of the decision stages.
`partial()` trims the maximum number of decision stages:
```{r partial}
pd3 <- partial(pd, K = 3)
pd3
```
# SessionInfo
```{r sessionInfo}
sessionInfo()
```
# References