Build your own model upon others#

Models can be built based on other trained models in the current model base or in other model bases. Both AbstractModel and TorchModel support this feature.

For AbstractModel#

[1]:
import tabensemb
import numpy as np
import torch
import os
from tempfile import TemporaryDirectory
from tabensemb.model import WideDeep, AbstractModel

temp_path = TemporaryDirectory()
tabensemb.setting["default_output_path"] = os.path.join(temp_path.name, "output")
tabensemb.setting["default_config_path"] = os.path.join(temp_path.name, "configs")
tabensemb.setting["default_data_path"] = os.path.join(temp_path.name, "data")

device = "cuda" if torch.cuda.is_available() else "cpu"

Suppose we want to call the TabMlp model of WideDeep from another model base, CallTabMlp.

class CallTabMlp(AbstractModel):
    def _get_program_name(self):
        return "CallTabMlp"

    def _get_model_names(self):
        return ["TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}

To extract another model, return its name from required_models in a specific format. In the following code, “EXTERN” means that the model comes from another model base, “WideDeep” is the name of that model base, and “TabMlp” is the name of the wanted model within it. If the model comes from the current model base, only its name is needed (return ["TabMlp"]). The returned list may contain multiple required models.

def required_models(self, model_name: str):
    return ["EXTERN_WideDeep_TabMlp"]
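The naming format described above can be illustrated with a small parser. This is only a sketch: parse_required_model and RequiredModelRef are hypothetical names that are not part of tabensemb, and the sketch assumes that the model base name contains no underscore.

```python
from typing import NamedTuple, Optional


class RequiredModelRef(NamedTuple):
    """Parsed form of one entry returned by required_models (illustrative only)."""

    external: bool
    model_base: Optional[str]  # None when the model is in the current model base
    model_name: str


def parse_required_model(name: str) -> RequiredModelRef:
    # Hypothetical helper: splits "EXTERN_<model base>_<model>".
    # Assumes the model base name itself contains no underscore.
    if name.startswith("EXTERN_"):
        _, model_base, model_name = name.split("_", 2)
        return RequiredModelRef(True, model_base, model_name)
    return RequiredModelRef(False, None, name)


external_ref = parse_required_model("EXTERN_WideDeep_TabMlp")
internal_ref = parse_required_model("TabMlp")
print(external_ref)
print(internal_ref)
```

The first call describes a model taken from the WideDeep model base, while the second describes a model from the current model base.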

As usual, _train_data_preprocess, _data_preprocess, _new_model, _train_single_model, and _pred_single_model must be implemented. _train_data_preprocess is called first, and it uses _get_required_models to extract the external model. In this case, a WideDeep instance containing the trained TabMlp model is returned. If the model is from the current model base, calling self._get_required_models("TabMlp") is equivalent to calling self.model["TabMlp"].

Then the _train_data_preprocess method from WideDeep is directly used to process the dataset to get compatible processed data.

def _train_data_preprocess(self, model_name):
    if not hasattr(self, "net"):
        self.net = self._get_required_models("TabMlp")["EXTERN_WideDeep_TabMlp"]
        self.net.trainer = self.trainer
    return self.net._train_data_preprocess("TabMlp")

Likewise, _data_preprocess delegates to the same method of WideDeep to get compatible processed data.

def _data_preprocess(self, df, derived_data, model_name):
    return self.net._data_preprocess(df, derived_data, "TabMlp")

In _new_model, the extracted model is directly returned.

def _new_model(self, model_name, verbose, **kwargs):
    return self.net

_pred_single_model calls the same method from WideDeep to make predictions based on the extracted model.

def _pred_single_model(self, model, X_test, verbose, **kwargs):
    return model._pred_single_model(model.model["TabMlp"], X_test, verbose, **kwargs)

In this example, we do not train the extracted model any further, so _train_single_model does nothing. It is nevertheless straightforward to apply other operations to the predictions that model._pred_single_model returns from the extracted model, as shown above.

def _train_single_model(self, *args, **kwargs):
    pass
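As a rough illustration of post-processing the predictions of the extracted model, the sketch below clips negative predictions to zero. StubExtractedBase and PostProcessingCaller are stand-ins written for this example only. They mimic the _pred_single_model hook shown above and are not the real WideDeep or AbstractModel classes.

```python
class StubExtractedBase:
    """Stand-in for the extracted model base; not the real WideDeep class."""

    def __init__(self):
        self.model = {"TabMlp": "trained-model-placeholder"}

    def _pred_single_model(self, model, X_test, verbose, **kwargs):
        # Pretend prediction: one value per input row.
        return [2.0 * x - 1.0 for x in X_test]


class PostProcessingCaller:
    """Mimics only the _pred_single_model hook of the CallTabMlp example."""

    def _pred_single_model(self, model, X_test, verbose, **kwargs):
        raw = model._pred_single_model(model.model["TabMlp"], X_test, verbose, **kwargs)
        # Example post-processing: clip negative predictions to zero.
        return [max(0.0, y) for y in raw]


preds = PostProcessingCaller()._pred_single_model(
    StubExtractedBase(), [0.0, 1.0, 2.0], verbose=False
)
print(preds)  # [0.0, 1.0, 3.0]
```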
[2]:
class CallTabMlp(AbstractModel):
    def _get_program_name(self):
        return "CallTabMlp"

    def _get_model_names(self):
        return ["TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp"]

    def _train_data_preprocess(self, model_name):
        if not hasattr(self, "net"):
            self.net = self._get_required_models("TabMlp")["EXTERN_WideDeep_TabMlp"]
            self.net.trainer = self.trainer
        return self.net._train_data_preprocess("TabMlp")

    def _data_preprocess(self, df, derived_data, model_name):
        return self.net._data_preprocess(df, derived_data, "TabMlp")

    def _new_model(self, model_name, verbose, **kwargs):
        return self.net

    def _train_single_model(self, *args, **kwargs):
        pass

    def _pred_single_model(self, model, X_test, verbose, **kwargs):
        return model._pred_single_model(model.model["TabMlp"], X_test, verbose, **kwargs)

For TorchModel#

It is easier to build a model based on others in TorchModel because we have already implemented complex dataset-building operations internally.

As in the implementation above, we define the same methods, except for _train_data_preprocess and _data_preprocess.

class CallTabMlpTorch(TorchModel):
    def _get_program_name(self):
        return "CallTabMlpTorch"

    def _get_model_names(self):
        return ["TabMlp"]

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}

We build our model CallTabMlpNN on top of TabMlp from WideDeep. In this tutorial, we will not train anything.

def _new_model(self, model_name, verbose, **kwargs):
    return CallTabMlpNN(datamodule=self.trainer.datamodule, **kwargs)

def _train_single_model(self, *args, **kwargs):
    pass

Now comes CallTabMlpNN. Its __init__ receives an argument required_models, a dictionary containing all extracted models specified in CallTabMlpTorch.required_models. In the _new_model shown above, it arrives by keyword through **kwargs rather than by position.

class CallTabMlpNN(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNN, self).__init__(datamodule, **kwargs)
        self.net = required_models["EXTERN_WideDeep_TabMlp"]
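How required_models reaches __init__ can be sketched with plain Python keyword forwarding. StubNN and new_model_sketch are hypothetical stand-ins, and the layers hyperparameter is made up. The sketch only assumes, as the _new_model code above suggests, that the framework supplies required_models among the keyword arguments.

```python
class StubNN:
    """Stand-in for an AbstractNN subclass; only the constructor signature matters."""

    def __init__(self, datamodule, required_models, **kwargs):
        self.datamodule = datamodule
        self.net = required_models["EXTERN_WideDeep_TabMlp"]
        self.extra = kwargs  # remaining keyword arguments, e.g. hyperparameters


def new_model_sketch(**kwargs):
    # Mirrors `CallTabMlpNN(datamodule=..., **kwargs)`: every keyword argument,
    # including required_models, is forwarded to the constructor by name.
    return StubNN(datamodule="datamodule-placeholder", **kwargs)


nn_instance = new_model_sketch(
    required_models={"EXTERN_WideDeep_TabMlp": "extracted-model-placeholder"},
    layers=[16, 8],  # hypothetical extra hyperparameter
)
print(nn_instance.net)
print(nn_instance.extra)
```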

To get results from the extracted model, use self.call_required_model.

def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
    return self.call_required_model(self.net, x, derived_tensors)

Remark: The output of the required model is already calculated when the dataset is prepared and is stored in derived_tensors["data_required_models"]["MODELNAME_pred"]. self.call_required_model first looks for this pre-calculated output. If it is not found, the output is calculated from the dataset for the model base, which is stored in derived_tensors["data_required_models"]["MODELNAME"]. Therefore, to actually calculate the output during forward, remove the stored predictions from derived_tensors.
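The lookup order described in this remark can be sketched with plain dictionaries. call_required_model_sketch is a hypothetical function, not tabensemb's implementation, and the lists stand in for tensors.

```python
def call_required_model_sketch(name, derived_tensors, compute):
    """Illustrative lookup order only; not tabensemb's implementation."""
    stored = derived_tensors["data_required_models"]
    pred_key = f"{name}_pred"
    if pred_key in stored:
        # 1) Use the prediction that was pre-calculated with the dataset.
        return stored[pred_key], "precomputed"
    # 2) Otherwise compute it from the dataset stored for that model base.
    return compute(stored[name]), "computed"


def double(data):
    # Placeholder "model" that doubles every value.
    return [2.0 * v for v in data]


derived_tensors = {
    "data_required_models": {
        "MODELNAME": [1.0, 2.0, 3.0],  # placeholder dataset
        "MODELNAME_pred": [10.0, 20.0, 30.0],  # placeholder stored output
    }
}

first = call_required_model_sketch("MODELNAME", derived_tensors, double)
# Dropping the stored predictions forces the computation path.
derived_tensors["data_required_models"].pop("MODELNAME_pred")
second = call_required_model_sketch("MODELNAME", derived_tensors, double)
print(first)
print(second)
```

The first call returns the stored values, and the second call, made after the stored predictions are removed, goes through the placeholder model.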

[3]:
from tabensemb.model import TorchModel, AbstractNN

class CallTabMlpNN(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNN, self).__init__(datamodule, **kwargs)
        self.net = required_models["EXTERN_WideDeep_TabMlp"]

    def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
        return self.call_required_model(self.net, x, derived_tensors)

class CallTabMlpTorch(TorchModel):
    def _new_model(self, model_name, verbose, **kwargs):
        return CallTabMlpNN(datamodule=self.trainer.datamodule, **kwargs)

    def _get_program_name(self):
        return "CallTabMlpTorch"

    def _get_model_names(self):
        return ["TabMlp"]

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}

    def _train_single_model(self, *args, **kwargs):
        pass

We can now compare the results of the original model with those of the two models built on it. All three produce exactly the same metrics.

[4]:
from tabensemb.trainer import Trainer
from tabensemb.config import UserConfig

trainer = Trainer(device=device)
mpg_columns = [
    "mpg",
    "cylinders",
    "displacement",
    "horsepower",
    "weight",
    "acceleration",
    "model_year",
    "origin",
    "car_name",
]
cfg = UserConfig.from_uci("Auto MPG", column_names=mpg_columns, sep=r"\s+")
trainer.load_config(cfg)
trainer.load_data()
trainer.add_modelbases(
    [
        WideDeep(trainer, model_subset=["TabMlp"]),
        CallTabMlp(trainer),
        CallTabMlpTorch(trainer),
    ]
)
trainer.train(stderr_to_stdout=True)
trainer.get_leaderboard()
Downloading https://archive.ics.uci.edu/static/public/9/auto+mpg.zip to /tmp/tmpvlx3s8em/data/Auto MPG.zip
cylinders is Integer and will be treated as a continuous feature.
model_year is Integer and will be treated as a continuous feature.
origin is Integer and will be treated as a continuous feature.
Unknown values are detected in ['horsepower']. They will be treated as np.nan.
The project will be saved to /tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig
Dataset size: 238 80 80
Data saved to /tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig (data.csv and tabular_data.csv).

-------------Run WideDeep-------------

Training TabMlp
Epoch: 1/300, Train loss: 635.5330, Val loss: 555.4755, Min val loss: 555.4755
Epoch: 21/300, Train loss: 441.6902, Val loss: 375.7337, Min val loss: 375.7337
Epoch: 41/300, Train loss: 145.8623, Val loss: 119.9598, Min val loss: 119.9598
Epoch: 61/300, Train loss: 45.9133, Val loss: 34.0160, Min val loss: 34.0160
Epoch: 81/300, Train loss: 27.6878, Val loss: 24.1525, Min val loss: 24.1525
Epoch: 101/300, Train loss: 23.0877, Val loss: 18.2096, Min val loss: 18.2096
Epoch: 121/300, Train loss: 21.4056, Val loss: 17.2203, Min val loss: 17.1303
Epoch: 141/300, Train loss: 21.2559, Val loss: 16.0746, Min val loss: 16.0746
Epoch: 161/300, Train loss: 19.2337, Val loss: 15.3027, Min val loss: 15.3027
Epoch: 181/300, Train loss: 16.1232, Val loss: 14.5777, Min val loss: 14.5777
Epoch: 201/300, Train loss: 16.7095, Val loss: 14.2274, Min val loss: 14.2274
Epoch: 221/300, Train loss: 15.7366, Val loss: 13.5223, Min val loss: 13.5223
Epoch: 241/300, Train loss: 16.9825, Val loss: 12.9892, Min val loss: 12.9892
Epoch: 261/300, Train loss: 15.3358, Val loss: 12.4278, Min val loss: 12.4278
Epoch: 281/300, Train loss: 13.3989, Val loss: 12.1155, Min val loss: 12.1155
Restoring model weights from the end of the best epoch
Training mse loss: 10.17037
Validation mse loss: 11.66271
Testing mse loss: 6.43856
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig/trainer.pkl')

-------------WideDeep End-------------


-------------Run CallTabMlp-------------

Training TabMlp
Training mse loss: 10.17037
Validation mse loss: 11.66271
Testing mse loss: 6.43856
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig/trainer.pkl')

-------------CallTabMlp End-------------


-------------Run CallTabMlpTorch-------------

Training TabMlp
Training mse loss: 10.17037
Validation mse loss: 11.66271
Testing mse loss: 6.43856
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig/trainer.pkl')

-------------CallTabMlpTorch End-------------

WideDeep metrics
TabMlp 1/1
CallTabMlp metrics
TabMlp 1/1
CallTabMlpTorch metrics
TabMlp 1/1
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig/trainer.pkl')
[4]:
Program Model Training RMSE Training MSE Training MAE Training MAPE Training R2 Training MEDIAN_ABSOLUTE_ERROR Training EXPLAINED_VARIANCE_SCORE Testing RMSE ... Testing R2 Testing MEDIAN_ABSOLUTE_ERROR Testing EXPLAINED_VARIANCE_SCORE Validation RMSE Validation MSE Validation MAE Validation MAPE Validation R2 Validation MEDIAN_ABSOLUTE_ERROR Validation EXPLAINED_VARIANCE_SCORE
0 WideDeep TabMlp 3.189102 10.170372 2.318564 0.096454 0.842218 1.669983 0.859805 2.537431 ... 0.88025 1.767459 0.900587 3.415071 11.662707 2.539188 0.116035 0.791657 1.90416 0.806152
1 CallTabMlp TabMlp 3.189102 10.170372 2.318564 0.096454 0.842218 1.669983 0.859805 2.537431 ... 0.88025 1.767459 0.900587 3.415071 11.662707 2.539188 0.116035 0.791657 1.90416 0.806152
2 CallTabMlpTorch TabMlp 3.189102 10.170372 2.318564 0.096454 0.842218 1.669983 0.859805 2.537431 ... 0.88025 1.767459 0.900587 3.415071 11.662707 2.539188 0.116035 0.791657 1.90416 0.806152

3 rows × 23 columns
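As a quick sanity check on the numbers above, the RMSE columns of the leaderboard should be the square roots of the MSE losses printed in the training log. The sketch below uses only values copied from the output above. The tolerance is an assumption that accounts for the log rounding MSE to five decimals.

```python
import math

# Values copied from the log and the leaderboard above.
logged_mse = {"Training": 10.17037, "Validation": 11.66271, "Testing": 6.43856}
leaderboard_rmse = {"Training": 3.189102, "Validation": 3.415071, "Testing": 2.537431}

for split, mse in logged_mse.items():
    rmse = math.sqrt(mse)
    # The log rounds MSE to five decimals, so allow a small tolerance.
    assert math.isclose(rmse, leaderboard_rmse[split], abs_tol=1e-5), split
    print(f"{split}: sqrt({mse}) = {rmse:.6f}")
```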

Extract learned hidden representation from models#

The original relationship between input features and targets can be complex, especially for high-dimensional and multimodal inputs, which is why we want deep learning models to extract the internal relations and reduce the dimension. For most deep learning models, whatever the backbone structure, the output of the backbone is a low-dimensional tensor (for instance, of shape (batch_size, 16)) that contains the information learned by the model, so we call it the “hidden representation” of the model. The hidden representation is then projected to the output dimension through a linear layer, an MLP, etc.
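The split between a backbone and an output layer can be pictured with a toy example in plain Python. All weights and sizes below are made up for illustration: four input features are reduced to a 2-dimensional hidden representation, which is then projected to a single output.

```python
def backbone(x, weights):
    """Toy backbone: one linear map followed by ReLU, giving the hidden representation."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def head(hidden, weights, bias):
    """Toy output layer: projects the hidden representation to one target."""
    return sum(w * h for w, h in zip(weights, hidden)) + bias


# Hypothetical numbers: 4 input features, 2 hidden units.
W_backbone = [
    [0.5, -1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
]
x = [1.0, 2.0, 3.0, 4.0]

hidden = backbone(x, W_backbone)  # the "hidden representation"
output = head(hidden, weights=[1.0, 0.5], bias=0.1)
print(hidden, output)
```

A model built on another model can consume either output or hidden. The rest of this section is about obtaining the latter.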

Most models in two model bases, pytorch_widedeep (WideDeep) and pytorch_tabular (PyTorchTabular), support extracting hidden representations in an AbstractNN.

To use this functionality, first change the name in required_models by appending the suffix “_WRAP”.

class CallTabMlpTorchWrapped(CallTabMlpTorch):
    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp_WRAP"]

    def _get_program_name(self):
        return "CallTabMlpTorchWrapped"

    def _new_model(self, model_name, verbose, **kwargs):
        return CallTabMlpNNWrapped(datamodule=self.trainer.datamodule, **kwargs)

More can now be done in the AbstractNN. In __init__, _test_required_model can be used to check whether the hidden representation is available and to get its dimension, which can then be used to build nn.Modules such as a linear layer or an MLP.

class CallTabMlpNNWrapped(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNNWrapped, self).__init__(datamodule, **kwargs)
        print(required_models)
        self.net = required_models["EXTERN_WideDeep_TabMlp_WRAP"]
        self.use_hidden_rep, hidden_rep_dim = self._test_required_model(
            self.n_inputs, self.net
        )
        print(f"Does the model support extracting hidden representation?: {self.use_hidden_rep}")
        print(f"The dimension of the hidden representation: {hidden_rep_dim}")

When doing forward propagation, the hidden representation can be extracted using get_hidden_state.

def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
    print(derived_tensors["data_required_models"].keys())
    output = self.call_required_model(self.net, x, derived_tensors)
    hidden = self.get_hidden_state(self.net, x, derived_tensors)
    print(f"The dimensions of the batched hidden representation: {hidden.shape}")
    return output

Remark: If the model does not support extracting hidden representations, self.use_hidden_rep shown above will be False and hidden_rep_dim will be self.n_inputs + 1, which suggests concatenating the continuous features with the output of the model.
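The fallback suggested by the remark on unsupported models can be sketched as follows. downstream_features is a hypothetical helper working on plain lists for a single sample, not a tabensemb function.

```python
def downstream_features(use_hidden_rep, hidden, x_cont, output):
    """Pick the per-sample features a later layer would consume (illustrative)."""
    if use_hidden_rep:
        return list(hidden)
    # Fallback: continuous features plus the model output,
    # which gives n_inputs + 1 values.
    return list(x_cont) + [output]


x_cont = [0.2, 1.5, -0.7]  # n_inputs = 3 (hypothetical sample)
features = downstream_features(False, hidden=None, x_cont=x_cont, output=23.4)
print(len(features))  # 4 == n_inputs + 1
```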

Remark: As with the output, the hidden representation is calculated when the dataset is prepared and is stored in derived_tensors["data_required_models"]["MODELNAME_hidden"].
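To make MODELNAME concrete, the sketch below derives the three key names for the model required in this section. stored_keys is a hypothetical helper. It assumes, based only on the keys printed in the output of cell [6], that the “_WRAP” suffix is dropped from the stored key names.

```python
def stored_keys(required_name: str) -> dict:
    # Hypothetical helper. Assumes (from this tutorial's printed keys) that
    # the "_WRAP" suffix does not appear in the stored key names.
    suffix = "_WRAP"
    base = required_name[: -len(suffix)] if required_name.endswith(suffix) else required_name
    return {"data": base, "pred": f"{base}_pred", "hidden": f"{base}_hidden"}


keys = stored_keys("EXTERN_WideDeep_TabMlp_WRAP")
print(keys)
```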

[5]:
from tabensemb.model import TorchModel, AbstractNN

class CallTabMlpNNWrapped(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNNWrapped, self).__init__(datamodule, **kwargs)
        print(required_models)
        self.net = required_models["EXTERN_WideDeep_TabMlp_WRAP"]
        self.use_hidden_rep, hidden_rep_dim = self._test_required_model(
            self.n_inputs, self.net
        )
        print(f"Does the model support extracting hidden representation?: {self.use_hidden_rep}")
        print(f"The dimension of the hidden representation: {hidden_rep_dim}")

    def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
        print(derived_tensors["data_required_models"].keys())
        output = self.call_required_model(self.net, x, derived_tensors)
        hidden = self.get_hidden_state(self.net, x, derived_tensors)
        print(f"The dimensions of the batched hidden representation: {hidden.shape}")
        return output

class CallTabMlpTorchWrapped(CallTabMlpTorch):
    def _get_program_name(self):
        return "CallTabMlpTorchWrapped"

    def _new_model(self, model_name, verbose, **kwargs):
        return CallTabMlpNNWrapped(datamodule=self.trainer.datamodule, **kwargs)

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp_WRAP"]

The following cell prints information about the extracted model, the hidden representation, and the stored data and predictions.

[6]:
trainer.add_modelbases([CallTabMlpTorchWrapped(trainer)])
trainer.get_modelbase("CallTabMlpTorchWrapped").train(stderr_to_stdout=True)
trainer.get_leaderboard()

-------------Run CallTabMlpTorchWrapped-------------

Training TabMlp
{'EXTERN_WideDeep_TabMlp_WRAP': <tabensemb.model.widedeep.WideDeepWrapper object at 0x7f8f782182b0>}
Does the model support extracting hidden representation?: True
The dimension of the hidden representation: 100
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([238, 100])
Training mse loss: 10.17037
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([80, 100])
Validation mse loss: 11.66271
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([80, 100])
Testing mse loss: 6.43856
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig/trainer.pkl')

-------------CallTabMlpTorchWrapped End-------------

WideDeep metrics
TabMlp 1/1
CallTabMlp metrics
TabMlp 1/1
CallTabMlpTorch metrics
TabMlp 1/1
CallTabMlpTorchWrapped metrics
TabMlp 1/1
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([238, 100])
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([80, 100])
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([80, 100])
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpvlx3s8em/output/auto-mpg/2023-09-23-20-41-06-0_UserInputConfig/trainer.pkl')
[6]:
Program Model Training RMSE Training MSE Training MAE Training MAPE Training R2 Training MEDIAN_ABSOLUTE_ERROR Training EXPLAINED_VARIANCE_SCORE Testing RMSE ... Testing R2 Testing MEDIAN_ABSOLUTE_ERROR Testing EXPLAINED_VARIANCE_SCORE Validation RMSE Validation MSE Validation MAE Validation MAPE Validation R2 Validation MEDIAN_ABSOLUTE_ERROR Validation EXPLAINED_VARIANCE_SCORE
0 WideDeep TabMlp 3.189102 10.170372 2.318564 0.096454 0.842218 1.669983 0.859805 2.537431 ... 0.88025 1.767459 0.900587 3.415071 11.662707 2.539188 0.116035 0.791657 1.90416 0.806152
1 CallTabMlp TabMlp 3.189102 10.170372 2.318564 0.096454 0.842218 1.669983 0.859805 2.537431 ... 0.88025 1.767459 0.900587 3.415071 11.662707 2.539188 0.116035 0.791657 1.90416 0.806152
2 CallTabMlpTorch TabMlp 3.189102 10.170372 2.318564 0.096454 0.842218 1.669983 0.859805 2.537431 ... 0.88025 1.767459 0.900587 3.415071 11.662707 2.539188 0.116035 0.791657 1.90416 0.806152
3 CallTabMlpTorchWrapped TabMlp 3.189102 10.170372 2.318564 0.096454 0.842218 1.669983 0.859805 2.537431 ... 0.88025 1.767459 0.900587 3.415071 11.662707 2.539188 0.116035 0.791657 1.90416 0.806152

4 rows × 23 columns

Remark: If a model from the same TorchModel is required, its AbstractNN is extracted and passed in required_models. In that case, you must pass the model_name argument when calling call_required_model and get_hidden_state.