---
title: "Getting Started with PFCI"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with PFCI}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Overview

The `PFCI` package implements **Penalized Fast Causal Inference**, a two-stage procedure for learning causal structure in high-dimensional settings with potential latent variables and selection bias. The method first screens candidate edges with the graphical lasso and then runs the FCI algorithm on the screened graph to produce a Partial Ancestral Graph (PAG). The procedure is substantially faster than standard FCI/RFCI while maintaining accuracy under sparsity.

## Installation

`PFCI` is available on CRAN. Its core functionality depends on `pcalg` (CRAN) and `graph` (Bioconductor); the chunk below also installs the Bioconductor packages `RBGL` and `Rgraphviz`:

```{r install, eval = FALSE}
install.packages("PFCI")

# Dependencies: pcalg is on CRAN; graph, RBGL, and Rgraphviz are on Bioconductor.
# BiocManager::install() can install from both repositories.
install.packages("BiocManager")
BiocManager::install(c("pcalg", "graph", "RBGL", "Rgraphviz"))
```

## Basic workflow

The standard workflow has three steps: simulate, fit, and evaluate.

```{r basic, eval = FALSE}
library(PFCI)

# Step 1: simulate a sparse DAG with p = 100 nodes
sim <- simulate_pfci_toy(p = 100, n = 100, edge_prob = 0.02, seed = 1)

# Step 2: fit PFCI
fit <- pfci_fit(sim$X, alpha = 0.05)
print(fit)

# Step 3: evaluate against ground truth
met <- pfci_metrics(sim, fit)
met
```

The `print(fit)` call reports runtime and tuning parameters. The `met` list contains SHD, F1, MCC, Precision, Recall, and Time.

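
The metric names listed above are standard edge-level classification scores. As a rough illustration of how such scores can be computed, the chunk below compares two graph skeletons given as symmetric 0/1 adjacency matrices. This is only a sketch in base R: `skeleton_metrics()` is a hypothetical helper, not a `PFCI` function, and it ignores edge marks, so its SHD is simply the number of adjacencies that differ. `pfci_metrics()` may define these quantities differently.

```{r metrics-sketch}
# Hypothetical helper (not part of PFCI): skeleton-level metrics from two
# symmetric 0/1 adjacency matrices of the same dimension.
skeleton_metrics <- function(truth, est) {
  stopifnot(all(dim(truth) == dim(est)))

  ut <- upper.tri(truth)        # count each undirected edge once
  t_edge <- truth[ut] != 0
  e_edge <- est[ut] != 0

  tp <- sum(t_edge & e_edge)    # edges present in both graphs
  fp <- sum(!t_edge & e_edge)   # edges only in the estimate
  fn <- sum(t_edge & !e_edge)   # edges only in the truth
  tn <- sum(!t_edge & !e_edge)  # edges absent from both graphs

  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall    <- if (tp + fn > 0) tp / (tp + fn) else NA_real_
  f1        <- if (2 * tp + fp + fn > 0) 2 * tp / (2 * tp + fp + fn) else NA_real_

  # Matthews correlation coefficient; undefined when a margin is empty
  denom <- sqrt(as.numeric(tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  mcc   <- if (denom > 0) (tp * tn - fp * fn) / denom else NA_real_

  c(SHD_skeleton = fp + fn, Precision = precision, Recall = recall,
    F1 = f1, MCC = mcc)
}

# Toy example on 4 nodes: the truth is the chain 1-2-3-4; the estimate
# recovers 1-2 and 2-3 but replaces 3-4 with a spurious 1-4 edge.
truth <- matrix(0, 4, 4)
truth[cbind(c(1, 2, 3), c(2, 3, 4))] <- 1
truth <- truth + t(truth)

est <- matrix(0, 4, 4)
est[cbind(c(1, 2, 1), c(2, 3, 4))] <- 1
est <- est + t(est)

m <- skeleton_metrics(truth, est)
m
```

Only the upper triangle is compared, so each undirected adjacency is counted once rather than twice.
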
## Plotting the PAG

```{r plot, eval = FALSE}
plot_pag(fit)
```

## Latent confounders

To simulate and evaluate under latent confounding, use the `simulate_with_latent` and `metrics_with_latent` functions:

```{r latent, eval = FALSE}
sim_lat <- simulate_with_latent(
  p_obs = 100, gamma = 0.05, n = 100,
  seed_graph = 1, seed_data = 2
)
fit_lat <- pfci_fit(sim_lat$X, alpha = 0.05)
metrics_with_latent(sim_lat, fit_lat)
```

## Scaling behaviour

PFCI is approximately 3x faster than RFCI at `p = 1000` while maintaining equal or better F1 and MCC. See Table 1 of Pal, Ghosh, and Yang (2025) for full simulation results across `p = 100` to `p = 1000`.

## Reference

Pal, S., Ghosh, D., and Yang, S. (2025). Penalized FCI for Causal Structure Learning in a Sparse DAG for Biomarker Discovery in Parkinson's Disease. *Annals of Applied Statistics*.