tabensemb.data.AbstractImputer

class tabensemb.data.AbstractImputer(**kwargs)

Bases: AbstractDataStep

The base class for all data-imputers. Data-imputers replace NaNs in the input tabular dataset. A categorical feature whose values are all numerical (integers or np.nan) is cast to the dtype “int” after its NaNs are filled with tabensemb.data.utils.number_unknown_value. Any other categorical feature is cast to the dtype “str” after its NaNs are filled with tabensemb.data.utils.object_unknown_value.
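The dtype handling described above can be sketched with plain pandas. This is only an illustration of the rule, not the library's implementation: `NUMBER_UNKNOWN` and `OBJECT_UNKNOWN` are hypothetical stand-ins for `tabensemb.data.utils.number_unknown_value` and `tabensemb.data.utils.object_unknown_value` (their real values are not given here), and `fill_categorical` is an illustrative helper name.

```python
import numpy as np
import pandas as pd

# Hypothetical placeholders; the real sentinels live in tabensemb.data.utils.
NUMBER_UNKNOWN = -1
OBJECT_UNKNOWN = "UNK"


def fill_categorical(column: pd.Series) -> pd.Series:
    """Fill NaNs in one categorical column, then cast it to int or str."""
    present = column.dropna()
    # "All numerical" is read here as: every observed value is an
    # integer-valued number (possibly stored as a float because of NaNs).
    all_integers = all(
        isinstance(v, (int, float, np.integer, np.floating))
        and not isinstance(v, bool)
        and float(v).is_integer()
        for v in present
    )
    if all_integers:
        return column.fillna(NUMBER_UNKNOWN).astype(int)
    return column.fillna(OBJECT_UNKNOWN).astype(str)


df = pd.DataFrame(
    {
        "n_layers": [1.0, np.nan, 3.0],          # integers stored as floats
        "material": ["steel", np.nan, "glass"],  # strings with a missing entry
    }
)
filled = df.apply(fill_categorical)
```

In this sketch `n_layers` ends up as an integer column with the numeric placeholder in the second row, while `material` ends up as a string column with the string placeholder.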

Methods

__init__(**kwargs)

fit_transform(input_data, datamodule)

Record feature names in the datamodule, fit the imputer and transform the input dataframe.

transform(input_data, datamodule)

Restore feature names in the datamodule using recorded features, and transform the input tabular data using the fitted imputer.

_fit_transform(input_data, datamodule)

Fit the imputer and transform the input dataframe.

_get_impute_features(cont_feature_names, data)

Get the names of continuous features that can be imputed, i.e. those that are not entirely missing.
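The selection rule can be sketched as a standalone function. This is an assumption about the behavior based on the description alone, not the library's code; the column names are made up for the example.

```python
import numpy as np
import pandas as pd


def get_impute_features(cont_feature_names, data):
    """Return the continuous features that have at least one observed value."""
    # A column that is NaN everywhere gives an imputer nothing to estimate from.
    all_missing = data[cont_feature_names].isna().all(axis=0)
    return [name for name in cont_feature_names if not all_missing[name]]


data = pd.DataFrame(
    {
        "thickness": [0.5, np.nan, 0.7],
        "humidity": [np.nan, np.nan, np.nan],  # totally missing
        "load": [10.0, 12.0, np.nan],
    }
)
imputable = get_impute_features(["thickness", "humidity", "load"], data)
print(imputable)  # ['thickness', 'load']
```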

_transform(input_data, datamodule)

Transform the input tabular data using the fitted imputer.
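The method summaries suggest a split in which the public `fit_transform` and `transform` handle the feature-name bookkeeping and delegate the actual imputation to `_fit_transform` and `_transform`. A minimal sketch of that pattern follows. It does not use tabensemb itself: `ToyDataModule`, `ToyImputerBase`, `MeanImputer`, and the attribute names `cont_feature_names` and `record_cont_features` are all hypothetical.

```python
import numpy as np
import pandas as pd


class ToyDataModule:
    """Hypothetical stand-in for a datamodule; it only holds feature names."""

    def __init__(self, cont_feature_names):
        self.cont_feature_names = list(cont_feature_names)


class ToyImputerBase:
    """Public methods do the bookkeeping; subclasses supply the underscored ones."""

    def fit_transform(self, input_data, datamodule):
        # Record the feature names seen at fit time.
        self.record_cont_features = list(datamodule.cont_feature_names)
        return self._fit_transform(input_data.copy(), datamodule)

    def transform(self, input_data, datamodule):
        # Restore the recorded names before applying the fitted imputer.
        datamodule.cont_feature_names = list(self.record_cont_features)
        return self._transform(input_data.copy(), datamodule)

    def _fit_transform(self, input_data, datamodule):
        raise NotImplementedError

    def _transform(self, input_data, datamodule):
        raise NotImplementedError


class MeanImputer(ToyImputerBase):
    """Example subclass: fill NaNs with the column means learned during fitting."""

    def _fit_transform(self, input_data, datamodule):
        names = datamodule.cont_feature_names
        self.means = input_data[names].mean()
        return self._transform(input_data, datamodule)

    def _transform(self, input_data, datamodule):
        names = datamodule.cont_feature_names
        input_data[names] = input_data[names].fillna(self.means)
        return input_data


dm = ToyDataModule(["x"])
imputer = MeanImputer()
train = imputer.fit_transform(pd.DataFrame({"x": [1.0, np.nan, 3.0]}), dm)
test = imputer.transform(pd.DataFrame({"x": [np.nan, 10.0]}), dm)
```

Here the mean learned from the training frame (2.0) is reused when transforming the second frame, which is the point of separating fitting from transforming.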