tabensemb.data.datamodule.DataModule.prepare_new_data

method

DataModule.prepare_new_data(df: DataFrame, derived_data: Dict | None = None, ignore_absence=False) → Tuple[DataFrame, dict]

Prepare a new tabular dataset for prediction with _predict(). Stacked and unstacked features are derived, missing values are imputed, and the transform method of AbstractProcessor is called. Users usually do not need to call this method directly because predict() calls it.

Parameters:
df:

A new tabular dataset.

derived_data:

Data derived from derive_unstacked(). If None, unstacked data will be re-derived.

ignore_absence:

Whether to ignore absent keys in derived_data. Use True only when the model does not use derived_data.

Returns:
df:

The dataset after derivation, imputation, and processing. It has the same structure as self.X_train.

derived_data:

Data derived from derive_unstacked(). It has the same structure as self.D_train.

Notes

The returned df is not scaled, so that further processing can be applied to it first. To scale it, run: df = datamodule.data_transform(df, scaler_only=True)
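Examples

The two steps described in the note above can be combined as in the following minimal sketch. The helper name prepare_and_scale is hypothetical and not part of tabensemb. It assumes that datamodule is an already configured DataModule whose data has been loaded, and that new_df contains the features the module expects.

```python
def prepare_and_scale(datamodule, new_df, derived_data=None):
    """Hypothetical helper: prepare a new dataset, then scale the result.

    `datamodule` is assumed to be a configured DataModule; this function is
    an illustration and is not provided by tabensemb.
    """
    # Derivation, imputation, and processing. The returned frame is not scaled.
    df, derived = datamodule.prepare_new_data(new_df, derived_data=derived_data)
    # Apply only the scaler to the processed frame, as described in the note.
    df_scaled = datamodule.data_transform(df, scaler_only=True)
    return df_scaled, derived
```

Keeping the scaling step separate leaves room to inspect or modify the unscaled df between the two calls.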