tabensemb.model.TorchModel

class tabensemb.model.TorchModel(*args, lightning_trainer_kwargs: Dict | None = None, **kwargs)

Bases: AbstractModel

The base class for PyTorch-like models. It implements some of the abstract methods declared in AbstractModel.

Methods

__init__(*args, lightning_trainer_kwargs: Dict | None = None, **kwargs)
Parameters:
trainer:

A Trainer instance that contains all information and datasets and will be linked to the model base. The trainer has loaded configs and data.

program:

The name of the model base. If None, the name from _get_program_name() is used.

model_subset:

The names of models selected to be trained in the model base.

exclude_models:

The names of models that should not be trained. Only one of model_subset and exclude_models can be specified.

store_in_harddisk:

Whether to save models on the hard disk. If the global setting tabensemb.setting["low_memory"] is True, this argument is treated as True.

optimizers:

A dictionary of optimizer names (chosen from those in torch.optim) and their hyperparameters for each model. To optimize these hyperparameters, remember to update _initial_values() and _space() accordingly.

lr_schedulers:

A dictionary of learning-rate scheduler names (chosen from those in torch.optim.lr_scheduler) and their hyperparameters for each model. To optimize these hyperparameters, remember to update _initial_values() and _space() accordingly.

**kwargs:

Ignored.
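The rule that only one of model_subset and exclude_models can be specified can be illustrated with a small sketch. The helper select_models and the model names below are hypothetical and are not part of tabensemb; the sketch only shows one plausible way such a rule could be resolved, assuming that an unspecified pair means "train everything".

```python
from typing import List, Optional, Sequence


def select_models(
    available: Sequence[str],
    model_subset: Optional[Sequence[str]] = None,
    exclude_models: Optional[Sequence[str]] = None,
) -> List[str]:
    """Hypothetical helper: decide which models of a model base are trained."""
    if model_subset is not None and exclude_models is not None:
        # Mirrors the documented rule: the two arguments are mutually exclusive.
        raise ValueError("Specify only one of model_subset and exclude_models.")
    if model_subset is not None:
        # Keep only the requested models, in the order the model base lists them.
        return [name for name in available if name in model_subset]
    if exclude_models is not None:
        # Keep everything except the excluded models.
        return [name for name in available if name not in exclude_models]
    # Assumption: with neither argument given, every available model is trained.
    return list(available)


available = ["MLP", "Transformer", "ResNet"]  # placeholder model names
print(select_models(available, model_subset=["MLP"]))       # ['MLP']
print(select_models(available, exclude_models=["ResNet"]))  # ['MLP', 'Transformer']
```

Passing both arguments raises a ValueError in this sketch; how TorchModel itself reports the conflict is not stated here.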

cal_feature_importance(model_name, method[, ...])

Calculate feature importance using a specified model.

cal_shap(model_name[, call_general_method, ...])

Calculate SHAP values using a specified model.

count_params(model_name[, trainable_only])

Count the number of parameters in a torch.nn.Module.

get_full_name_from_required_model(required_model)

Get the name of a required model to store or access data in derived_tensors passed to AbstractNN._forward().

_data_preprocess(df, derived_data, model_name)

Perform the same preprocessing as in _train_data_preprocess() on a new dataset.

_generate_dataset(datamodule, model_name)

Generate torch.utils.data.Dataset for training.

_generate_dataset_for_required_models(df, ...)

Call AbstractModel._data_preprocess() to generate the dataset, output, and hidden representations for the required model.

_generate_dataset_from_tensors(tensors, df, ...)

Perform the same preprocessing as in _generate_dataset() on a new dataset.

_initial_values(model_name)

Initial values of hyperparameters to be optimized.

_pred_single_model(model, X_test, verbose, ...)

Predict using the model trained in _train_single_model().

_prepare_custom_datamodule(model_name[, ...])

Change this method if a customized preprocessing stage is needed.

_prepare_tensors(df, derived_data, model_name)

Transform the upcoming dataset into tensors that have the same structure as those stored in a tabensemb.data.datamodule.DataModule and obtained by tabensemb.data.datamodule.DataModule.update_dataset().

_run_custom_data_module(df, derived_data, ...)

Change this method if a customized preprocessing stage is implemented in _prepare_custom_datamodule().

_space(model_name)

A list of scikit-optimize search spaces for the selected model.

_train_data_preprocess(model_name[, warm_start])

Process the data from self.trainer.datamodule for training.

_train_single_model(model, model_name, ...)

Train a PyTorch model using pytorch_lightning.
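Because _initial_values() and _space() describe the same hyperparameters, it can help to check that they agree after editing either one. The following is a minimal, dependency-free sketch: the tuple-based Dimension type, the check_initial_values helper, and the hyperparameter names are all hypothetical stand-ins. The real _space() returns scikit-optimize search spaces, and the assumption that each initial value should lie within its search range is ours, not something this page states.

```python
from typing import Dict, List, Tuple, Union

Number = Union[int, float]

# Hypothetical stand-in for one search-space dimension: (name, low, high).
Dimension = Tuple[str, Number, Number]


def check_initial_values(
    space: List[Dimension], initial_values: Dict[str, Number]
) -> List[str]:
    """Return a list of inconsistencies; an empty list means the two agree."""
    problems = []
    for name, low, high in space:
        if name not in initial_values:
            # A searched hyperparameter has no starting point.
            problems.append(f"{name}: no initial value")
        elif not (low <= initial_values[name] <= high):
            # Assumption: the starting point should lie inside the search range.
            problems.append(
                f"{name}: initial value {initial_values[name]} outside [{low}, {high}]"
            )
    return problems


# Placeholder hyperparameters for an optimizer such as torch.optim.Adam.
space = [("lr", 1e-5, 1e-1), ("weight_decay", 1e-9, 1e-2)]
initial_values = {"lr": 1e-3, "batch_size": 256}

print(check_initial_values(space, initial_values))
# ['weight_decay: no initial value']
```

Entries that appear only in the initial values (batch_size above) are ignored by this check, on the assumption that they are fixed rather than searched.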