Computes summary statistics for observed and change values in longitudinal data, supporting both continuous and categorical x-axis variables.
Usage
compute_stats(
df,
x_var,
y_var,
group_var,
cluster_var,
baseline_value,
confidence_interval = NULL,
summary_statistic = "mean",
statistical_tests = FALSE,
facet_vars = NULL,
test_method = "parametric",
p_adjust_method = "BH",
cov_struct = "auto"
)Arguments
- df
A data frame containing the data to be plotted.
- x_var
The independent variable (x-axis) name.
- y_var
The dependent variable (y-axis) name.
- group_var
Grouping variable for data (optional).
- cluster_var
Cluster variable for within-subject grouping (subject ID).
- baseline_value
Baseline value for calculating changes.
- confidence_interval
Numeric. Confidence level (e.g., 0.95 for 95% CI). If specified, calculates confidence intervals instead of standard error.
- summary_statistic
Character. Type of summary statistic: "mean" (mean ± CI/SE), "mean_se" (mean ± SE), "median" (median + IQR), or "boxplot" (quartiles + whiskers).
- statistical_tests
Logical. If TRUE, performs statistical comparisons.
- facet_vars
Character vector. Names of variables to use for faceting (optional).
- test_method
Character. Testing approach for group comparisons: "parametric" (t-test / ANOVA, the default), "nonparametric" (Wilcoxon rank-sum / Kruskal-Wallis), or "mmrm" (mixed model for repeated measures with emmeans contrasts; requires the mmrm and emmeans packages).
- p_adjust_method
Character. Multiple comparison correction passed to
stats::p.adjust(). Default is "BH" (Benjamini-Hochberg). Use "none" to disable adjustment.- cov_struct
Character. Covariance structure for MMRM. See
lplot()for available options. Default is "auto".
Value
A data frame containing the computed statistics with columns:
Original x and group variables
mean_value: Mean/median of y values (depending on summary_statistic)
change_mean: Mean/median of change from baseline
sample_size: Number of observations
standard_deviation: SD of y values (for mean) or IQR (for median)
change_sd: SD of change values (for mean) or IQR (for median)
standard_error: Standard error of mean/median
change_se: Standard error of change mean/median
bound_lower/bound_upper: Lower/upper bounds (CI/SE for mean, Q1/Q3 for median)
bound_lower_change/bound_upper_change: Bounds for change values
group: Factor combining all grouping variables
is_continuous: Boolean indicating if x is continuous
Examples
df <- data.frame(
subject_id = rep(1:10, each = 3),
visit = rep(c(0, 1, 2), times = 10),
measure = rnorm(30, mean = 50, sd = 10),
group = rep(c("A", "B"), length.out = 30)
)
# Compute statistics with visit as x variable, measure as y variable,
# grouped by treatment group, with subject_id as the cluster variable
stats <- compute_stats(df, "visit", "measure", "group", "subject_id", 0)
head(stats)
#> # A tibble: 6 × 15
#> group visit mean_value change_mean sample_size standard_deviation change_sd
#> <fct> <dbl> <dbl> <dbl> <int> <dbl> <dbl>
#> 1 A 0 42.8 0 5 4.55 0
#> 2 A 1 45.8 -0.375 5 7.01 17.2
#> 3 A 2 60.1 17.3 5 7.16 10.8
#> 4 B 0 46.2 0 5 10.5 0
#> 5 B 1 49.7 6.90 5 14.9 14.5
#> 6 B 2 50.9 4.65 5 7.85 7.78
#> # ℹ 8 more variables: standard_error <dbl>, change_se <dbl>, bound_lower <dbl>,
#> # bound_upper <dbl>, bound_lower_change <dbl>, bound_upper_change <dbl>,
#> # ci_level <lgl>, is_continuous <lgl>
