Spline-Based Nonlinear Modeling for Multilevel and Longitudinal Data
Author: Subir Hait · Michigan State University · haitsubi@msu.edu
MultiSpline provides a unified interface for fitting, interpreting, and visualising nonlinear relationships in single-level and multilevel regression models using natural cubic splines, B-splines, or GAM smooths.
Version 2 is a major upgrade. Every item raised in the Moran correspondence has been addressed, and the package now offers a complete estimation → interpretation → diagnostics framework:
| Layer | Functions |
|---|---|
| Estimation | nl_fit() · nl_knots() |
| Prediction | nl_predict() |
| Interpretation | nl_derivatives() · nl_turning_points() ·
nl_plot() |
| Model selection | nl_compare() |
| Variance explained | nl_r2() · nl_icc() |
| Cluster heterogeneity | nl_het() |
# Cross-classified (e.g. students nested in both schools and districts)
fit <- nl_fit(data = df, y = "score", x = "SES",
cluster = c("student_id", "school_id"),
nested = FALSE)
# Nested (students within schools)
fit <- nl_fit(data = df, y = "score", x = "SES",
cluster = c("student_id", "school_id"),
nested = TRUE)Internally this generates
(1 | student_id) + (1 | school_id) for cross-classified
models and (1 | student_id/school_id) for nested models,
passed to lme4::lmer() / lme4::glmer().
All model types now return proper CI bands:
| Model | CI method |
|---|---|
lm |
Analytical (standard predict.lm) |
gam |
Analytical (mgcv) |
lmerMod |
Delta method via fixed-effects variance–covariance matrix |
glmerMod |
Delta method on link scale + back-transformation (default) or
parametric bootstrap (glmer_ci = "boot") |
pred <- nl_predict(fit, se = TRUE, level = 0.95)
nl_plot(pred_df = pred, x = "age", show_ci = TRUE)# Explore and select df explicitly
ks <- nl_knots(data = df, y = "score", x = "age",
df_range = 2:10, criterion = "AIC")
ks$best_df # → e.g. 5
# Or let nl_fit do it automatically
fit <- nl_fit(data = df, y = "score", x = "age",
df = "auto", df_criterion = "AIC")A diagnostic plot of AIC/BIC vs df is produced automatically.
nl_r2(fit)Output (LMM example):
=== MultiSpline R² Decomposition ===
Model type: LMM
Marginal R²m = 0.1823 (fixed effects only)
Conditional R²c = 0.4761 (fixed + all random effects)
Variance Partition (r2_mlm-style):
Component Variance Proportion
Fixed effects 1.4221 0.1823
student_id 1.8340 0.2350
school_id 0.8762 0.1124
Residual 3.6590 0.4690
Follows the Nakagawa-Schielzeth (2013) and
Nakagawa-Johnson-Schielzeth (2017) formulas for marginal and conditional
R². The level-specific variance partition mirrors the
r2_mlm / Rights-Sterba (2019) approach.
pred <- nl_predict(fit, x_seq = seq(18, 80, length.out = 200))
deriv <- nl_derivatives(pred, x = "age")
# Visualise slope (first derivative) with CI band
nl_plot(deriv_df = deriv, x = "age", type = "slope")
# Visualise curvature (second derivative)
nl_plot(deriv_df = deriv, x = "age", type = "curvature")
# Trajectory + slope side by side
nl_plot(pred_df = pred, deriv_df = deriv, x = "age", type = "combo")Turning points and inflection regions:
tp <- nl_turning_points(deriv, x = "age")
tp$turning_points # local maxima and minima with x location and type
tp$inflection_regions # where concavity changes
tp$slope_regions # contiguous increasing / decreasing stretches
# Overlay turning points on trajectory plot
nl_plot(pred_df = pred, x = "age",
show_turning_points = TRUE,
turning_points = tp$turning_points)nl_compare(fit) # default: linear + poly(2) + poly(3) + spline
nl_compare(fit, polynomial_degrees = 2:5) # custom comparatorsOutput:
=== MultiSpline Model Comparison ===
Model AIC BIC LogLik npar Deviance LRT_vs_linear LRT_p
Linear 1842.1 1863.4 -916.1 5 1020.4 NA NA
Poly(2) 1830.5 1855.6 -908.2 6 992.3 15.74 <0.001
Poly(3) 1825.3 1854.3 -903.6 7 978.1 25.00 <0.001
Spline 1818.7 1851.5 -898.3 9 961.2 35.68 <0.001
Best model by AIC: Spline
nl_het(fit, n_clusters_plot = 50)Plots cluster-specific predicted trajectories (BLUPs) against the population-mean curve, and performs a likelihood-ratio test comparing random-slope vs random-intercept models to test whether the nonlinear effect genuinely varies across clusters.
fit_bs <- nl_fit(data = df, y = "score", x = "age",
method = "bs", df = 6, bs_degree = 3)Allows the nonlinear trajectory to vary across clusters:
fit_rs <- nl_fit(data = df, y = "score", x = "age",
cluster = "id", random_slope = TRUE)# From source
install.packages("path/to/MultiSpline_0.2.0.tar.gz", repos = NULL, type = "source")
# Dependencies (install first if needed)
install.packages(c("lme4", "mgcv", "dplyr", "ggplot2", "rlang", "splines"))
# Optional (for p-values in LMM)
install.packages("lmerTest")library(MultiSpline)
# 1. Select df automatically
ks <- nl_knots(data = nlme::Orthodont, y = "distance",
x = "age", df_range = 2:8)
# 2. Fit multilevel model with two-way clustering and auto df
fit <- nl_fit(
data = nlme::Orthodont,
y = "distance",
x = "age",
cluster = "Subject",
df = "auto",
controls = NULL
)
fit # prints postestimation menu
# 3. Coefficient table
nl_summary(fit)
# 4. Multilevel R²
nl_r2(fit)
# 5. Model comparison
nl_compare(fit)
# 6. Predictions with CI
pred <- nl_predict(fit, se = TRUE)
nl_plot(pred_df = pred, x = "age", show_ci = TRUE, type = "trajectory")
# 7. Derivatives and turning points
deriv <- nl_derivatives(pred, x = "age")
tp <- nl_turning_points(deriv, x = "age")
tp$turning_points
nl_plot(deriv_df = deriv, x = "age", type = "slope")
nl_plot(deriv_df = deriv, x = "age", type = "curvature")
# 8. Cluster heterogeneity
nl_het(fit)nested argument)glmerMod via delta method
and parametric bootstrapdf = "auto", nl_knots())nl_r2())nl_derivatives())nl_turning_points())nl_compare())nl_het())method = "bs",
bs_degree)random_slope = TRUE)nl_plot() now supports
type = "slope", "curvature",
"combo"print.nl_fit lists all v2
postestimation functionsnl_fit, nl_predict,
nl_plot, nl_summary, nl_iccHait, S. (2026). MultiSpline: Spline-Based Nonlinear Modeling for Multilevel and Longitudinal Data (R package version 0.2.0). https://github.com/haitsubi/MultiSpline