tabensemb.data.datasplitter.RandomSplitter.split

method

RandomSplitter.split(df: DataFrame, cont_feature_names: List[str], cat_feature_names: List[str], label_name: List[str], cv: int | None = None) → Tuple[ndarray, ndarray, ndarray]

Split the dataset into training, validation, and testing parts. This method calls _split() and checks its results before returning them.

Parameters:

df: The input tabular dataset.
cont_feature_names: Names of continuous features.
cat_feature_names: Names of categorical features.
label_name: The name(s) of the label, given as a list.
cv: The total number of cross-validation runs. Defaults to None.

Returns:

Tuple[np.ndarray, np.ndarray, np.ndarray]: Indices of the training, validation, and testing datasets, in that order.
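
Example:

The following is a minimal, self-contained sketch of the split-then-check pattern described above. It does not use tabensemb and is not its implementation: random_split_indices, check_split, the 60/20/20 ratio, and the specific checks (no overlap, full coverage) are all hypothetical stand-ins chosen for illustration. It only shows what three index arrays of this kind look like and one plausible way to check them.

```python
import numpy as np


def random_split_indices(n_samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Hypothetical stand-in for a random splitter (not tabensemb code).

    Shuffles the row positions 0..n_samples-1 and cuts them into
    training, validation, and testing index arrays.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_train = int(round(ratios[0] * n_samples))
    n_val = int(round(ratios[1] * n_samples))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to testing
    return train, val, test


def check_split(train, val, test, n_samples):
    """Assumed sanity checks: indices are valid, disjoint, and cover all rows."""
    combined = np.concatenate([train, val, test])
    if combined.min() < 0 or combined.max() >= n_samples:
        raise ValueError("An index is out of range.")
    if len(np.unique(combined)) != len(combined):
        raise ValueError("The three parts overlap.")
    if len(combined) != n_samples:
        raise ValueError("The three parts do not cover the dataset.")


train_idx, val_idx, test_idx = random_split_indices(10)
check_split(train_idx, val_idx, test_idx, 10)
print(len(train_idx), len(val_idx), len(test_idx))
```

With index arrays like these, the corresponding rows of a DataFrame can be selected positionally, for example with df.iloc[train_idx].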