Compute numerical summaries by group using a formula interface. Missing values are removed by default (na.rm = TRUE).
do_summary(
  y,
  data = NULL,
  stat = c("n", "missing", "mean", "trimmed", "sd", "variance", "min", "Q1",
           "median", "Q3", "max", "mad", "IQR", "range", "cv", "se",
           "skewness", "kurtosis"),
  trim = 0.1,
  type = 3,
  na.rm = TRUE
)

# S3 method for num_summaries
print(x, ..., digits = NA, format = "f", digits_sk = 2)
Argument | Description |
---|---|
y | Formula with the names of the variables to summarize. See the examples for details. |
data | Data set (data frame) containing the variables named in the formula. |
stat | (character) Descriptive statistics to compute. Currently supported statistics: |
trim | The fraction (0 to 0.5) of observations to be trimmed from each end of sorted variable before the mean is computed. Values of trim outside that range are taken as the nearest endpoint. |
type | (integer: 1, 2, 3) The type of skewness and kurtosis estimate. See |
na.rm | (logical) Flag to remove missing values. Default is TRUE. |
x | Object to print. |
... | Further arguments passed to methods. |
digits | Number of digits for descriptive statistics. |
format | (character) |
digits_sk | Number of digits for skewness and kurtosis. |
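
The wording of the trim argument follows the trim argument of base R's mean(). The sketch below uses only base R. It assumes, without confirmation from this page, that the "trimmed" statistic behaves like mean(x, trim = trim). The vector x is made up for illustration.

```r
# Hypothetical data with one large outlier
x <- c(1, 2, 3, 4, 100)

# Ordinary mean is pulled up by the outlier
plain <- mean(x)

# trim = 0.2 drops floor(5 * 0.2) = 1 observation from each end,
# leaving 2, 3, 4
trimmed <- mean(x, trim = 0.2)

# Values of trim at or above 0.5 are treated as 0.5, which gives the median
clamped <- mean(x, trim = 0.7)

print(c(plain = plain, trimmed = trimmed, clamped = clamped))
```

With the default trim = 0.1 and a group of 30 observations, three values would be dropped from each end under the same assumption.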
Data frame with summary statistics.
library(biostat)
data(cabbages, package = "MASS")

do_summary(~VitC, data = cabbages) %>%
    print(digits = 2)
#> Warning: `funs()` is deprecated as of dplyr 0.8.0.
#> Please use a list of either functions or lambdas:
#>
#>   # Simple named list:
#>   list(mean = mean, median = median)
#>
#>   # Auto named with `tibble::lst()`:
#>   tibble::lst(mean, median)
#>
#>   # Using lambdas
#>   list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_warnings()` to see where this warning was generated.
#>   .summary_of  n missing  mean trimmed    sd variance   min    Q1 median    Q3
#> 1        VitC 60       0 57.95   57.67 10.12   102.39 41.00 50.75  56.00 66.25
#>     max   mad   IQR range   cv   se skewness kurtosis
#> 1 84.00 10.38 15.50 43.00 0.17 1.31     0.32    -0.68

#>   .summary_of Cult  n missing  mean trimmed   sd variance   min    Q1 median
#> 1        VitC  c39 30       0 51.50   50.92 7.12    50.74 41.00 46.00  51.00
#> 2        VitC  c52 30       0 64.40   64.25 8.46    71.49 47.00 58.00  64.50
#>      Q3   max  mad   IQR range   cv   se skewness kurtosis
#> 1 54.75 68.00 5.93  8.75 27.00 0.14 1.30     0.58    -0.26
#> 2 70.75 84.00 9.64 12.75 37.00 0.13 1.54     0.12    -0.62

#>   .summary_of Cult Date  mean
#> 1        VitC  c39  d16 50.30
#> 2        VitC  c39  d20 49.40
#> 3        VitC  c39  d21 54.80
#> 4        VitC  c52  d16 62.50
#> 5        VitC  c52  d20 58.90
#> 6        VitC  c52  d21 71.80

do_summary(HeadWt + VitC ~ Cult + Date,
           data = cabbages,
           stat = c("n", "mean", "sd")) %>%
    print(digits = 1)
#>    .summary_of Cult Date  n mean  sd
#> 1       HeadWt  c39  d16 10  3.2 1.0
#> 2       HeadWt  c39  d20 10  2.8 0.3
#> 3       HeadWt  c39  d21 10  2.7 1.0
#> 4       HeadWt  c52  d16 10  2.3 0.4
#> 5       HeadWt  c52  d20 10  3.1 0.8
#> 6       HeadWt  c52  d21 10  1.5 0.2
#> 7         VitC  c39  d16 10 50.3 4.3
#> 8         VitC  c39  d20 10 49.4 8.3
#> 9         VitC  c39  d21 10 54.8 7.6
#> 10        VitC  c52  d16 10 62.5 5.8
#> 11        VitC  c52  d20 10 58.9 7.7
#> 12        VitC  c52  d21 10 71.8 6.2

# TODO:
# 1. First argument should be a data frame
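
The grouped results above can be cross-checked without biostat. The sketch below uses base R's aggregate(), which accepts a similar formula (response on the left, grouping variables on the right). It is an independent check, not a description of how do_summary() works internally. The object names vitc_mean, vitc_sd and check are chosen here for illustration.

```r
# Base-R cross-check of the group means and SDs of VitC (not part of biostat)
data(cabbages, package = "MASS")

# One row per Cult x Date combination, one statistic per call
vitc_mean <- aggregate(VitC ~ Cult + Date, data = cabbages, FUN = mean)
vitc_sd   <- aggregate(VitC ~ Cult + Date, data = cabbages, FUN = sd)

# Join the two results; the shared VitC column gets the suffixes below
check <- merge(vitc_mean, vitc_sd,
               by = c("Cult", "Date"),
               suffixes = c("_mean", "_sd"))

print(check, digits = 3)
```

The VitC_mean and VitC_sd columns should agree, after rounding, with the mean and sd columns of the VitC rows in the last example.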