tabensemb.data.utils.fill_cat_nan

tabensemb.data.utils.fill_cat_nan(df: DataFrame, cat_dtypes: Dict[str, dtype]) → DataFrame

Impute missing values in categorical features.

Parameters:
df

The dataframe to be imputed.

cat_dtypes

The dtype of each categorical feature, keyed by feature name. If a feature's dtype is numerical, number_unknown_value (defaults to -1) is used for imputation; otherwise object_unknown_value (defaults to "UNK") is used. Change these two values to use different placeholders for missing or unknown values.

Returns:
pd.DataFrame
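
Example:
The rule described for cat_dtypes can be illustrated with a minimal standalone sketch. This is not the tabensemb source: fill_cat_nan_sketch, NUMBER_UNKNOWN_VALUE, and OBJECT_UNKNOWN_VALUE are hypothetical names that only mirror the documented defaults, and details such as working on a copy and skipping absent columns are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the documented defaults (-1 and "UNK").
NUMBER_UNKNOWN_VALUE = -1
OBJECT_UNKNOWN_VALUE = "UNK"


def fill_cat_nan_sketch(df, cat_dtypes):
    """Illustrative re-implementation of the documented rule (an assumption,
    not the library's code): numerical dtypes get the numeric placeholder,
    everything else gets the object placeholder."""
    out = df.copy()  # assumption: leave the caller's dataframe untouched
    for col, dtype in cat_dtypes.items():
        if col not in out.columns:
            continue  # assumption: silently skip features absent from df
        if np.issubdtype(np.dtype(dtype), np.number):
            fill = NUMBER_UNKNOWN_VALUE
        else:
            fill = OBJECT_UNKNOWN_VALUE
        out[col] = out[col].fillna(fill)
    return out


df = pd.DataFrame(
    {
        "grade": [1.0, np.nan, 3.0],
        "material": ["steel", None, "aluminum"],
    }
)
cat_dtypes = {"grade": np.dtype("int64"), "material": np.dtype("O")}
filled = fill_cat_nan_sketch(df, cat_dtypes)
```

With the real function, the equivalent call would be fill_cat_nan(df, cat_dtypes). Whether it also casts columns to the given dtypes or modifies df in place is not stated above, so the sketch does neither.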