
toolero is an R package designed to help researchers
implement best practices for their coding projects. It provides a small
set of opinionated, practical functions that reduce friction at the
start of a project and during day-to-day data work.

You can install toolero from CRAN:

install.packages("toolero")

Or install the development version from GitHub:

# install.packages("pak")
pak::pak("erwinlares/toolero")

init_project()

Creates a new R project with a standard folder structure suited for
research workflows. Optionally initializes renv for package
management and git for version control.

library(toolero)

# Create a project with the standard folder structure
init_project(path = "~/Documents/my-project")

# Add extra folders
init_project(path = "~/Documents/my-project",
             extra_folders = c("notebooks", "presentations"))

# Skip renv and git
init_project(path = "~/Documents/my-project", use_renv = FALSE, use_git = FALSE)

The default folder structure includes: data/, data-raw/, R/, scripts/,
plots/, images/, results/, and docs/.
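
To make the layout concrete, the base R sketch below creates the same eight folders in a temporary directory. It only illustrates the resulting structure and is not how init_project() is implemented: the make_folders() helper is hypothetical, and it does not set up renv or git.

```r
# Hypothetical helper: create the default folders listed above under `path`.
# Illustration only; this is not toolero's implementation.
make_folders <- function(path, extra_folders = character()) {
  folders <- c("data", "data-raw", "R", "scripts",
               "plots", "images", "results", "docs", extra_folders)
  for (folder in folders) {
    # recursive = TRUE also creates `path` itself if it does not exist yet
    dir.create(file.path(path, folder), recursive = TRUE, showWarnings = FALSE)
  }
  invisible(folders)
}

project_dir <- file.path(tempdir(), "my-project")
created <- make_folders(project_dir, extra_folders = "notebooks")

# List the top-level folders that now exist in the project directory
list.dirs(project_dir, full.names = FALSE, recursive = FALSE)
```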

create_qmd()

Scaffolds a new Quarto document from a reproducible template,
including a sample dataset, UW-Madison branded assets, and an optional
post-render purl hook that extracts R code from the rendered document
into a companion .R file. Optionally pre-populates the YAML
header from a user-supplied YAML config file.

library(toolero)

# Create a document with placeholder YAML
create_qmd(path = "~/Documents/my-project", filename = "analysis.qmd")

# Create without the purl hook
create_qmd(path = "~/Documents/my-project", filename = "report.qmd",
           use_purl = FALSE)

# Pre-populate YAML from a personal config file
create_qmd(path = "~/Documents/my-project", filename = "analysis.qmd",
           yaml_data = "~/my_config.yml")

read_clean_csv()

Reads a CSV file and cleans the column names in one step, producing a tidyverse-friendly tibble.
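
As a rough idea of what cleaning column names involves, the base R sketch below converts messy headers to snake_case after reading a small CSV. The clean_names_sketch() helper and the sample data are hypothetical, and the rules read_clean_csv() actually applies may differ; the sketch also returns a plain data frame rather than a tibble.

```r
# Hypothetical helper: a simplified snake_case cleaner for column names.
# The exact rules used by read_clean_csv() may differ.
clean_names_sketch <- function(x) {
  x <- trimws(x)
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # split camelCase boundaries
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)               # collapse other characters
  gsub("^_+|_+$", "", x)                        # drop leading/trailing "_"
}

# Write a tiny made-up CSV with awkward headers, then read and clean it
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Species,Bill Length (mm),bodyMass",
             "Adelie,39.1,3750"), csv_path)

raw <- read.csv(csv_path, check.names = FALSE)
names(raw) <- clean_names_sketch(names(raw))
names(raw)
```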

library(toolero)

data <- read_clean_csv("path/to/file.csv")

# Show column type messages
data <- read_clean_csv("path/to/file.csv", verbose = TRUE)

detect_execution_context()

Identifies which of three execution environments the code is
currently running in: an interactive R session, a quarto render call,
or a plain Rscript invocation. Returns one of "interactive",
"quarto", or "rscript".
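
One way such a check could work is sketched below in base R. It is an assumption about the general approach, not toolero's actual logic: it treats the knitr.in.progress option (set by knitr while a document is being knit) as a stand-in for "quarto", so it cannot tell a Quarto render apart from an R Markdown render. The detect_context_sketch() name is hypothetical.

```r
# Hypothetical sketch: classify the current execution context.
# Assumption: a document being knit implies a Quarto render.
detect_context_sketch <- function() {
  if (isTRUE(getOption("knitr.in.progress"))) {
    "quarto"
  } else if (interactive()) {
    "interactive"
  } else {
    "rscript"
  }
}

context <- detect_context_sketch()
context
```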

library(toolero)

context <- detect_execution_context()

input_file <- switch(context,
  interactive = "data/sample.csv",
  quarto      = params$input_file,
  rscript     = commandArgs(trailingOnly = TRUE)[1]
)

write_by_group()

Splits a data frame by a single grouping column and writes each group
to a separate CSV file. Filenames are derived from sanitized group
values: each value is converted to lowercase, and spaces and special
characters are replaced by dashes. Optionally writes a manifest.csv
listing output files, group values, and row counts.
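
The split-and-sanitize idea described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than toolero's implementation: the sanitize_group() helper, the sample data, and the manifest column names are hypothetical, and trimming leading or trailing dashes is an assumption of this sketch.

```r
# Hypothetical helper: lowercase a group value and replace runs of
# non-alphanumeric characters with a single dash.
sanitize_group <- function(x) {
  x <- tolower(as.character(x))
  x <- gsub("[^a-z0-9]+", "-", x)
  gsub("^-+|-+$", "", x)  # assumption: trim dashes at either end
}

# Made-up sample data
df <- data.frame(
  species = c("Adelie", "Gentoo", "Chinstrap (juv.)", "Adelie"),
  mass    = c(3750, 5000, 3700, 3800)
)

out_dir <- file.path(tempdir(), "by-group")
dir.create(out_dir, showWarnings = FALSE)

# One CSV per group value, named after the sanitized value
groups <- split(df, df$species)
for (value in names(groups)) {
  out_file <- file.path(out_dir, paste0(sanitize_group(value), ".csv"))
  write.csv(groups[[value]], out_file, row.names = FALSE)
}

# A manifest listing each output file, its group value, and its row count
manifest <- data.frame(
  file   = paste0(sanitize_group(names(groups)), ".csv"),
  group  = names(groups),
  n_rows = unname(vapply(groups, nrow, integer(1)))
)
manifest
```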

library(toolero)

# Load the bundled sample dataset
sample_path <- system.file("templates", "sample.csv", package = "toolero")
penguins <- read_clean_csv(sample_path)

# Split by species
write_by_group(penguins, group_col = "species", output_dir = tempdir())

# Also write a manifest
write_by_group(penguins, group_col = "species",
               output_dir = tempdir(), manifest = TRUE)

generate_kb_xml()

Produces a UW-Madison Knowledge Base importable XML file from a
rendered Quarto document. Re-renders the source .qmd with
all assets embedded, extracts the HTML body, and wraps it in the KB XML
structure along with metadata drawn from the document’s YAML header:
title → kb_title, description → kb_summary, and
categories → kb_keywords.
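
The field mapping above can be pictured as a simple rename of YAML header entries. The base R sketch below uses a hand-written list in place of a parsed YAML header; the values are made up, and the sketch covers only the renaming step, not the KB XML structure or how generate_kb_xml() builds it.

```r
# Hand-written stand-in for a parsed YAML header (values are made up)
yaml_header <- list(
  title       = "Analysis report",
  description = "Summary of the sample data.",
  categories  = c("r", "quarto")
)

# YAML field -> KB field, as described above
field_map <- c(
  title       = "kb_title",
  description = "kb_summary",
  categories  = "kb_keywords"
)

# Pick the mapped fields and rename them to their KB counterparts
kb_meta <- yaml_header[names(field_map)]
names(kb_meta) <- unname(field_map)
str(kb_meta)
```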

library(toolero)

generate_kb_xml(
  html_path = "docs/analysis.html",
  output_dir = "exports"
)

When importing the resulting XML into the KB, check the "Decode HTML entity in body content" option.

To cite toolero in publications:

citation("toolero")

MIT © Erwin Lares