---
title: "make_cate"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{make_cate}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(OPL)
```

#make_cate#
Predicting conditional average treatment effect (CATE)
                    on a new policy based on the training over an old policy

##Description##
make_cate is a function generating conditional average treatment effect
    (CATE) for both a training dataset and a testing (or new) dataset
    related to a binary (treated vs. untreated) policy program. It provides
    the main input for running opl_tb (optimal policy learning of a
    threshold-based policy), opl_tb_c (optimal policy learning of a
    threshold-based policy at specific thresholds), opl_lc (optimal policy
    learning of a linear-combination policy), opl_lc_c (optimal policy
    learning of a linear-combination policy at specific parameters), opl_dt
    (optimal policy learning of a decision-tree policy), opl_dt_c (optimal
    policy learning of a decision-tree policy at specific thresholds and
    selection variables). Based on Kitagawa and Tetenov (2018), the main
    econometrics supported by these commands can be found in Cerulli (2022).

##Function Syntax##

make_cate(model, train_data, test_data, w, x, y, family = gaussian(), ntree = 100, mtry = 2)


##Arguments:##
- model: A string indicating the model to use. Valid options are "glm" and "rf".
- train_data: The training dataset used for estimating the treatment effect on the old policy.
- test_data: The test dataset used for estimating the treatment effect on the new policy.
- w: The treatment variable.
- x: Independent variables for the model.
- y: The outcome variable.
- family: The family type for the model (e.g., "binomial", "gaussian").
- ntree: Number of trees for the Random Forest model (default is 100).

##Return Value##
An object containing the estimated causal treatment effect results, including:
- Average Treatment Effect (ATE)
- Average Treatment Effect on Treated (ATET)
- Average Treatment Effect on Non-Treated (ATENT)

##Example Usage##
```{r}
set.seed(42)
train_data <- data.frame(
  y = rnorm(100),    # Outcome
  x1 = rnorm(100),   # Covariate
  x2 = rnorm(100),
  w = sample(0:1, 100, replace = TRUE)  # Trattamento
)

test_data <- data.frame(
  y = rnorm(100),    # Outcome
  x1 = rnorm(100),   # Covariate
  x2 = rnorm(100),
  w = sample(0:1, 100, replace = TRUE)  # Trattamento
)

  x <- c("x1", "x2")  # le covariate
  y <- "y"            # la variabile dipendente
  w <- "w"            # la variabile di trattamento
  family <- "gaussian"  # Famiglia per glm
  ntree <- 100          # Numero di alberi per random forest
  mtry <- 2             # Numero di variabili da considerare in ogni split
  
result <- make_cate(model = "glm", train_data = train_data, test_data = test_data, w = w, x = x, y = y)
```
##Detailed Steps:##
- Train and Test Data: The function separates the data into treated and untreated groups for both the training and test datasets.
- Model Estimation: The function estimates the treatment effect using either a GLM or Random Forest model. It then calculates the predicted outcome for both the treated and untreated groups.
- Causal Effect Calculation: The CATE is calculated as the difference in predicted outcomes between the treated and untreated groups.
- Output: The function returns the estimated treatment effects (ATE, ATET, ATENT) and the treatment effects for both the training and test datasets.

##Results##
The following results are output for both the old (training) and new (test) policy:

--------------------------------------------------
- Treatment-effects estimation: OLD POLICY       -
--------------------------------------------------
Number of obs = [n]
Estimator = regression adjustment
Outcome model = linear
DIM = [DIM_value]
ATE = [ATE_value]
ATET = [ATET_value]
ATENT = [ATENT_value]

--------------------------------------------------
- Treatment-effects estimation: NEW POLICY       -
--------------------------------------------------
Number of obs = [n]
Estimator = regression adjustment
Outcome model = linear
DIM = [DIM_value]
ATE = [ATE_value]
ATET = [ATET_value]
ATENT = [ATENT_value]

##Conclusion##
The make_cate function is a powerful tool for estimating the causal treatment effect of a policy using either GLM or Random Forest models. It provides insights into the treatment effects both for the old and new policies, making it a useful method for causal inference in policy analysis.

## Acknowledgment
The development of this software was supported by FOSSR (Fostering Open Science in Social Science Research), a project funded by the European Union - NextGenerationEU under the NPRR Grant agreement n. MURIR0000008.