DEPRECATED!
Create folds with blocking and stratification (BS) for (repeated) k-fold cross-validation

This function randomly divides observations into folds that are used for (repeated) k-fold cross-validation. In these folds, observations are:

  1. blocked by values in variable ID (i.e. observations with the same "ID" are treated as one unit (a block) and are always in the same fold);

  2. stratified by levels of factor variable gr (the proportions of these blocks of observations in each group (level) are kept approximately constant across all folds).
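
The two properties above can be illustrated with a minimal base R sketch. It is not the implementation used by createFoldsBS; the toy data, the variable names (toy, blocks, obs_fold) and the "shuffle, then deal out in turn" strategy are assumptions made only for illustration.

```r
set.seed(1)
k <- 3

# Hypothetical toy data: 12 subjects (blocks) with 2 or 3 observations each,
# subjects 1-6 in group "A", subjects 7-12 in group "B".
ids   <- sprintf("S%02d", 1:12)
n_obs <- rep(c(2, 3), times = 6)
toy <- data.frame(
  ID = rep(ids, times = n_obs),
  gr = rep(rep(c("A", "B"), each = 6), times = n_obs),
  stringsAsFactors = FALSE
)

# One row per block: each ID with its group (assumes one group per ID).
blocks <- unique(toy[, c("ID", "gr")])

# Stratification: within each group, shuffle the blocks and deal them
# out to folds 1..k in turn, so every fold gets a similar share.
blocks$fold <- NA_integer_
for (g in unique(blocks$gr)) {
  rows <- which(blocks$gr == g)
  rows <- rows[sample.int(length(rows))]
  blocks$fold[rows] <- rep_len(seq_len(k), length(rows))
}

# Blocking: every observation inherits the fold of its ID.
obs_fold <- blocks$fold[match(toy$ID, blocks$ID)]

# Test-set indices per fold; training-set indices are the complement.
test_folds <- split(seq_len(nrow(toy)), obs_fold)
names(test_folds) <- paste0("Fold", names(test_folds))
train_folds <- lapply(test_folds, function(i) setdiff(seq_len(nrow(toy)), i))

# Each ID appears in exactly one fold, and blocks per group are balanced.
table(blocks$gr, blocks$fold)
```

In this sketch the proportions are balanced in terms of blocks, not individual observations, which matches the description of stratification given above.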

createFolds2(data = NULL, gr = NULL, ID = NULL, k = 5, returnTrain = TRUE, ...)

stratifiedFolds(
  data = NULL,
  gr = NULL,
  ID = NULL,
  k = 5,
  returnTrain = TRUE,
  ...
)

createFoldsBS(
  data = NULL,
  stratify_by = NULL,
  block_by = NULL,
  k = 5,
  returnTrain = TRUE,
  times = 1,
  seeds = NULL,
  gr = stratify_by,
  ID = block_by
)

Arguments

data

A data frame that contains the variables whose names are given by the arguments ID and gr.

k

(integer) The number of folds. Default: k = 5.

returnTrain

(logical) If TRUE, returns indices of observations in the training set. If FALSE, returns indices of observations in the test set.

stratify_by, gr

A vector or the name of a factor variable in data whose levels will be used for stratification, e.g., a vector with medical groups.

block_by, ID

A vector or the name of a variable in data that contains identification codes/numbers (ID). These codes will be used for blocking.

times

(integer) The number of repetitions for repeated cross-validation.

seeds

(vector of integers | NULL) Seeds for the random number generator, one for each repetition.
If seeds = NULL, random seeds are generated.
If the number of repetitions is greater than the number of provided seeds, additional random seeds are generated and appended to the provided ones. The first provided seed is used to make the randomly generated seeds reproducible.

(See set.seed for more information about random number generation).
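
The seed-completion rule described above could look roughly like the following sketch. The helper complete_seeds is hypothetical and not part of the package; how createFoldsBS actually draws the extra seeds, and what it does when more seeds than repetitions are supplied, is not stated in this documentation, so both are assumptions here.

```r
# Hypothetical helper: return exactly `times` seeds.
complete_seeds <- function(seeds = NULL, times = 1) {
  if (is.null(seeds)) {
    # No seeds supplied: draw all of them from the current RNG state.
    return(sample.int(.Machine$integer.max, times))
  }
  n_missing <- times - length(seeds)
  if (n_missing > 0) {
    # Seed the RNG with the first provided seed so that the extra seeds
    # are reproducible (note: this changes the global RNG state).
    set.seed(seeds[1])
    seeds <- c(seeds, sample.int(.Machine$integer.max, n_missing))
  }
  # Assumption: surplus seeds are silently dropped.
  seeds[seq_len(times)]
}

# Two seeds given, four repetitions requested: two more are generated,
# and repeating the call yields the same four seeds.
s1 <- complete_seeds(c(10, 20), times = 4)
s2 <- complete_seeds(c(10, 20), times = 4)
identical(s1, s2)
```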

Value

A list of folds. Each fold contains indices of observations. The structure of the output is similar to the one created with createFolds().
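
As a small illustration of how such a list can be consumed, the sketch below uses a hand-written list of training indices (the shape expected with returnTrain = TRUE) and derives the matching test sets. The fold names and index values are made up for the example and are not output of createFoldsBS.

```r
# Hypothetical folds for a data set with 6 observations:
# each element holds the row indices of the training set.
n <- 6
folds <- list(
  Fold1 = c(3L, 4L, 5L, 6L),
  Fold2 = c(1L, 2L, 5L, 6L),
  Fold3 = c(1L, 2L, 3L, 4L)
)

# The test set of each fold is the complement of its training set.
test_sets <- lapply(folds, function(train) setdiff(seq_len(n), train))

# Typical cross-validation loop over the folds.
for (fold_name in names(folds)) {
  train_idx <- folds[[fold_name]]
  test_idx  <- test_sets[[fold_name]]
  # Fit a model on rows `train_idx` and evaluate it on rows `test_idx` here.
  cat(fold_name, ": train =", length(train_idx),
      ", test =", length(test_idx), "\n")
}
```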

Note

If k is so large that some folds have no observations of a certain group (i.e. level in gr), an error is returned. In that case, a smaller value of k is recommended.
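
A quick way to pick a safe k beforehand is to count the blocks in the smallest group: every fold needs at least one block from each group, so k cannot exceed that count. The helper max_k below is hypothetical and only checks this necessary condition; the exact check performed by the function is not documented here.

```r
# Hypothetical helper: largest k for which every fold can still
# receive at least one block (unique ID) from every group.
max_k <- function(gr, ID) {
  blocks <- unique(data.frame(gr = gr, ID = ID, stringsAsFactors = FALSE))
  min(table(blocks$gr))
}

# Group "A" has blocks 1 and 2, group "B" has blocks 3 and 4,
# so at most 2 folds can contain both groups.
gr <- c("A", "A", "A", "B", "B")
ID <- c(1, 1, 2, 3, 4)
max_k(gr, ID)
```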

createFolds2 and stratifiedFolds are wrappers around createFoldsBS.

See also

createFolds
Test if folds are blocked and stratified: cvo_test_bs

Author

Vilmantas Gegzna