tabensemb.data.datasplitter.RandomSplitter.split

method

RandomSplitter.split(df: DataFrame, cont_feature_names: List[str], cat_feature_names: List[str], label_name: List[str], cv: int | None = None) → Tuple[ndarray, ndarray, ndarray]

Split the dataset into training, validation, and testing parts. This method calls _split() and checks its results.

Parameters:
df:

The input tabular dataset.

cont_feature_names:

Names of continuous features.

cat_feature_names:

Names of categorical features.

label_name:

The name(s) of the label, given as a list of strings.

cv:

The total number of cross-validation runs. Optional; defaults to None.

Returns:
Tuple[np.ndarray, np.ndarray, np.ndarray]

Three arrays holding the indices of the training, validation, and testing datasets, in that order.
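
The following is a minimal sketch of the general pattern described here: produce three index arrays at random, then check them before returning. It does not reproduce tabensemb's implementation. The function names `random_split_indices` and `check_split`, the 60/20/20 ratio, and the seed are all hypothetical and chosen only for illustration.

```python
import numpy as np


def random_split_indices(n_samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Hypothetical stand-in for a random splitter.

    Returns (train, val, test) index arrays. The default ratios are an
    assumption for this sketch, not values taken from tabensemb.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)  # random ordering of 0..n_samples-1
    n_train = int(round(ratios[0] * n_samples))
    n_val = int(round(ratios[1] * n_samples))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to testing
    return train, val, test


def check_split(train, val, test, n_samples):
    """Illustrative sanity checks on a split (hypothetical helper).

    Verifies that every index is valid and that the three parts are
    disjoint and together cover each row exactly once.
    """
    combined = np.concatenate([train, val, test])
    if combined.size != n_samples:
        raise ValueError("Split sizes do not add up to the dataset size.")
    if combined.min() < 0 or combined.max() >= n_samples:
        raise ValueError("Split contains out-of-range indices.")
    if np.unique(combined).size != n_samples:
        raise ValueError("Split parts overlap.")


train_idx, val_idx, test_idx = random_split_indices(100)
check_split(train_idx, val_idx, test_idx, n_samples=100)
```

Because the result is a tuple of positional index arrays, a caller would typically select rows with something like `df.iloc[train_idx]`, assuming the indices are positional rather than label-based.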