Mean CI from Data — ci_mean

ci_mean_t() computes a confidence interval (CI) for the mean of a data frame column, using the classic formula based on Student's t distribution. It is an enhanced version of DescTools::MeanCI() that respects dplyr::group_by(), so the interval can be computed separately for each subgroup. The result is a data frame.
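
To make the "classic formula" concrete, here is a minimal base-R sketch of a t-based CI for a mean: the sample mean plus or minus a Student's t quantile times the standard error. The helper name manual_ci_mean() is hypothetical and not part of the package; it only illustrates the arithmetic and is not necessarily how ci_mean_t() is implemented.

```r
# Hypothetical helper (not part of the package): classic t-based CI for a mean.
manual_ci_mean <- function(x, conf.level = 0.95) {
  n      <- length(x)
  m      <- mean(x)
  se     <- stats::sd(x) / sqrt(n)                            # standard error of the mean
  t_crit <- stats::qt(1 - (1 - conf.level) / 2, df = n - 1)   # two-sided t quantile
  c(mean = m, lwr.ci = m - t_crit * se, upr.ci = m + t_crit * se)
}

data(npk, package = "datasets")
manual_ci_mean(npk$yield)

# Cross-check: base R's one-sample t-test reports the same kind of interval
stats::t.test(npk$yield, conf.level = 0.95)$conf.int
```

If ci_mean_t() follows this formula, the bounds should agree with the npk example in the Examples section, up to rounding.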

ci_mean_t(.data, x, conf.level = 0.95, ...)

Arguments

.data: Data frame.
x: Column name (unquoted).
conf.level: Confidence level of the interval, a number between 0 and 1. Default: 0.95.
...: Additional arguments passed to DescTools::MeanCI(). See that function's documentation.

Value

A data frame with columns:

grouping variables (only if .data is grouped);
mean (<dbl>) – mean estimate;
lwr.ci, upr.ci (<dbl>) – lower and upper CI bounds.
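
As a rough illustration of the shape of this result, the sketch below builds a similar data frame with dplyr::summarise(), which also respects dplyr::group_by(). The function ci_mean_sketch() and its intermediate columns n_obs and se are hypothetical names introduced here; the sketch assumes dplyr is installed and R 4.1 or later (for the native pipe), and it may differ from the package's actual implementation, for example in row order.

```r
# Hypothetical re-implementation for illustration only (not the package's code).
ci_mean_sketch <- function(.data, x, conf.level = 0.95) {
  t_prob <- 1 - (1 - conf.level) / 2          # upper-tail probability for qt()
  .data |>
    dplyr::summarise(
      n_obs  = dplyr::n(),                    # group size
      se     = stats::sd({{ x }}) / sqrt(n_obs),
      mean   = mean({{ x }}),
      lwr.ci = mean - stats::qt(t_prob, df = n_obs - 1) * se,
      upr.ci = mean + stats::qt(t_prob, df = n_obs - 1) * se,
      .groups = "drop"
    ) |>
    dplyr::select(!c(n_obs, se))              # keep only the documented columns
}

data(npk, package = "datasets")
npk |>
  dplyr::group_by(N) |>
  ci_mean_sketch(yield)
```

The grouping columns come first, followed by mean, lwr.ci, and upr.ci, matching the layout described above.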

Examples

# Example with built-in dataset
data(npk, package = "datasets")
head(npk)
#>   block N P K yield
#> 1     1 0 1 1  49.5
#> 2     1 1 1 0  62.8
#> 3     1 0 0 0  46.8
#> 4     1 1 0 1  57.0
#> 5     2 1 0 0  59.8
#> 6     2 1 1 1  58.5

# Basic CI calculation for crop yield
ci_mean_t(npk, yield)
#> # A tibble: 1 × 3
#>    mean lwr.ci upr.ci
#>   <dbl>  <dbl>  <dbl>
#> 1  54.9   52.3   57.5
# Interpretation: we are 95% confident that the true mean yield
# lies between lwr.ci (52.3) and upr.ci (57.5)

# Using pipe operator (tidyverse style)
npk |> ci_mean_t(yield)
#> # A tibble: 1 × 3
#>    mean lwr.ci upr.ci
#>   <dbl>  <dbl>  <dbl>
#> 1  54.9   52.3   57.5

# Compare yields with nitrogen (N) treatment vs. without
npk |>
  dplyr::group_by(N) |>
  ci_mean_t(yield)
#> # A tibble: 2 × 4
#>   N      mean lwr.ci upr.ci
#>   <fct> <dbl>  <dbl>  <dbl>
#> 1 0      52.1   48.6   55.5
#> 2 1      57.7   54.0   61.4
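# Compare the CIs: non-overlapping CIs suggest a difference between groups.
# Here they overlap slightly (54.0 to 55.5); overlap alone does not rule out
# a difference, so a formal test is needed to decide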

# More complex grouping: Three factors at once
npk |>
  dplyr::group_by(N, P, K) |>
  ci_mean_t(yield)
#> # A tibble: 8 × 6
#>   N     P     K      mean lwr.ci upr.ci
#>   <fct> <fct> <fct> <dbl>  <dbl>  <dbl>
#> 1 0     1     1      50.5   44.6   56.4
#> 2 1     1     0      57.9   44.3   71.5
#> 3 0     0     0      51.4   40.0   62.9
#> 4 1     0     1      54.7   44.2   65.1
#> 5 1     0     0      63.8   51.1   76.4
#> 6 1     1     1      54.4   41.9   66.8
#> 7 0     0     1      52     38.0   66.0
#> 8 0     1     0      54.3   31.0   77.7

# Example with iris dataset: Petal length by species
data(iris, package = "datasets")
iris |>
  dplyr::group_by(Species) |>
  ci_mean_t(Petal.Length)
#> # A tibble: 3 × 4
#>   Species     mean lwr.ci upr.ci
#>   <fct>      <dbl>  <dbl>  <dbl>
#> 1 setosa      1.46   1.41   1.51
#> 2 versicolor  4.26   4.13   4.39
#> 3 virginica   5.55   5.40   5.71
# Note that the intervals of the three species do not overlap

# Example with mtcars: MPG by number of cylinders
data(mtcars, package = "datasets")
mtcars |>
  dplyr::group_by(cyl) |>
  ci_mean_t(mpg)
#> # A tibble: 3 × 4
#>     cyl  mean lwr.ci upr.ci
#>   <dbl> <dbl>  <dbl>  <dbl>
#> 1     6  19.7   18.4   21.1
#> 2     4  26.7   23.6   29.7
#> 3     8  15.1   13.6   16.6

# 90% confidence interval (less confident, narrower interval)
npk |> ci_mean_t(yield, conf.level = 0.90)
#> # A tibble: 1 × 3
#>    mean lwr.ci upr.ci
#>   <dbl>  <dbl>  <dbl>
#> 1  54.9   52.7   57.0

# 99% confidence interval (more confident, wider interval)
npk |> ci_mean_t(yield, conf.level = 0.99)
#> # A tibble: 1 × 3
#>    mean lwr.ci upr.ci
#>   <dbl>  <dbl>  <dbl>
#> 1  54.9   51.3   58.4