Customized model base

Researchers and model base developers typically need to compare their own models with the existing benchmarks in tabensemb. In this part, we build a model base within the framework, assuming that we want to integrate TabNet (from the dreamquark-ai team) into tabensemb for regression and classification tasks (pytorch_tabular and pytorch_widedeep have already done so).

Remark: For PyTorch-based models, we have implemented most requirements of the framework so that users can integrate torch.nn.Modules more conveniently.

Example: Implement TabNet as a model base from scratch

[1]:
import tabensemb
import numpy as np
import torch
import os
from tempfile import TemporaryDirectory

temp_path = TemporaryDirectory()
tabensemb.setting["default_output_path"] = os.path.join(temp_path.name, "output")
tabensemb.setting["default_config_path"] = os.path.join(temp_path.name, "configs")
tabensemb.setting["default_data_path"] = os.path.join(temp_path.name, "data")

device = "cuda" if torch.cuda.is_available() else "cpu"

All model bases inherit AbstractModel and implement its methods. If a required method is not implemented, NotImplementedError is raised when that method is used.

[2]:
from tabensemb.model import AbstractModel
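To illustrate the NotImplementedError behavior mentioned above, here is a minimal sketch of the general pattern. SketchBase, Incomplete, Complete, and try_describe are hypothetical stand-ins written for this illustration; they are not part of tabensemb, and AbstractModel itself may be organized differently.

```python
class SketchBase:
    """Hypothetical stand-in for an abstract model base (not tabensemb's AbstractModel)."""

    def _get_program_name(self):
        raise NotImplementedError

    def _get_model_names(self):
        raise NotImplementedError

    def describe(self):
        # The base class calls the hooks, so a missing hook only fails here, at usage time.
        return f"{self._get_program_name()}: {self._get_model_names()}"


class Incomplete(SketchBase):
    """Implements only one of the two required hooks."""

    def _get_program_name(self):
        return "Incomplete"


class Complete(SketchBase):
    """Implements both required hooks."""

    def _get_program_name(self):
        return "Complete"

    def _get_model_names(self):
        return ["TabNet"]


def try_describe(model_base):
    """Return the description, or a short message if a required hook is missing."""
    try:
        return model_base.describe()
    except NotImplementedError:
        return "a required method is missing"
```

Creating Incomplete() succeeds; the error only appears once describe() reaches the missing hook, which is why such problems tend to show up during usage rather than at definition time.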

We use scikit-optimize (scikit-optimize/scikit-optimize) to do Bayesian hyperparameter optimization, so space classes are imported.

[3]:
from skopt.space import Integer, Real, Categorical

First, we define the initialization of the model base. Always remember to pass all args and kwargs to __init__ of AbstractModel. You can do other things in __init__. All *args and **kwargs (including arguments like the some_param shown below) are recorded in self.init_params.

class TabNetFromAbstract(AbstractModel):
    def __init__(self, *args, some_param=1.1, **kwargs):
        super(TabNetFromAbstract, self).__init__(*args, **kwargs)
        # Do something else here
        self.some_param = some_param
        print(self.init_params)
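The reason for forwarding all args and kwargs can be sketched with a simplified base class. OptionsBase below is hypothetical and only mimics a few option names (program and model_subset); the real AbstractModel accepts more options and also records them in self.init_params, which this sketch does not attempt to reproduce.

```python
class OptionsBase:
    """Hypothetical base class holding a few options; not tabensemb's AbstractModel."""

    def __init__(self, trainer, program=None, model_subset=None):
        self.trainer = trainer
        self.program = program
        self.model_subset = model_subset


class ForwardingModelBase(OptionsBase):
    def __init__(self, *args, some_param=1.1, **kwargs):
        # Forward everything we do not consume ourselves to the base class,
        # so that base-class options keep working for the subclass.
        super().__init__(*args, **kwargs)
        self.some_param = some_param


model_base = ForwardingModelBase(
    "trainer-placeholder", model_subset=["TabNet"], some_param=2.0
)
```

Because some_param is declared as a keyword-only argument, it is consumed by the subclass, while everything else reaches the base class unchanged.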

We should define the name of the model base and all available models in the model base.

def _get_program_name(self):
    return "TabNetFromAbstract"

def _get_model_names(self):
    return ["TabNet"]

For each model in the model base, the program will request the initial hyperparameters of the model and their search spaces. They are defined as follows:

def _space(self, model_name):
    return [
        Integer(low=4, high=16, prior="uniform", name="n_d", dtype=int),
        Integer(low=4, high=16, prior="uniform", name="n_a", dtype=int),
        Integer(low=1, high=6, prior="uniform", name="n_steps", dtype=int),
        Real(low=1.0, high=1.5, prior="uniform", name="gamma"),
        Integer(
            low=1, high=4, prior="uniform", name="n_independent", dtype=int
        ),
        Integer(low=1, high=4, prior="uniform", name="n_shared", dtype=int),
    ] + self.trainer.SPACE

def _initial_values(self, model_name):
    return {
        "n_d": 8,
        "n_a": 8,
        "n_steps": 3,
        "gamma": 1.3,
        "n_independent": 2,
        "n_shared": 2,
        "lr": self.trainer.args["lr"],
        "weight_decay": self.trainer.args["weight_decay"],
        "batch_size": self.trainer.args["batch_size"],
    }
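A simple sanity check for the two methods above is to confirm that every model-specific initial value lies inside its search range. The sketch below uses plain (low, high) tuples in place of the skopt dimensions so that it runs without extra dependencies. The helper out_of_bounds is hypothetical, and treating this as a requirement is our assumption rather than something the framework states.

```python
# Plain (low, high) tuples stand in for the skopt Integer/Real dimensions used above.
bounds = {
    "n_d": (4, 16),
    "n_a": (4, 16),
    "n_steps": (1, 6),
    "gamma": (1.0, 1.5),
    "n_independent": (1, 4),
    "n_shared": (1, 4),
}

# Model-specific initial values copied from _initial_values above.
initial_values = {
    "n_d": 8,
    "n_a": 8,
    "n_steps": 3,
    "gamma": 1.3,
    "n_independent": 2,
    "n_shared": 2,
}


def out_of_bounds(values, bounds):
    """Return the names whose initial value is missing or outside [low, high]."""
    bad = []
    for name, (low, high) in bounds.items():
        if name not in values or not (low <= values[name] <= high):
            bad.append(name)
    return bad


problems = out_of_bounds(initial_values, bounds)
```

An empty list means the initial point is a valid member of the search space; any name in the list points to a value that should be revisited.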

Before training, each model base has its own way of processing the dataset.

_train_data_preprocess will return the processed dataset according to a given Trainer which provides all training information and data required. In this example, X_train/X_val/X_test represent training/validation/testing sets, and y_train/y_val/y_test represent corresponding labels.

Remark: The tabular dataset has gone through all processing stages defined in the DataModule inside the trainer except scaling. Call self.trainer.datamodule.data_transform(df, scaler_only=True) to scale it using the trained scaler if no scaling stage is defined internally in the model.

def _train_data_preprocess(self, model_name):
    data = self.trainer.datamodule
    all_feature_names = data.all_feature_names

    X_train = data.data_transform(data.X_train, scaler_only=True)[
        all_feature_names
    ].values.astype(np.float64)
    X_val = data.data_transform(data.X_val, scaler_only=True)[
        all_feature_names
    ].values.astype(np.float64)
    X_test = data.data_transform(data.X_test, scaler_only=True)[
        all_feature_names
    ].values.astype(np.float64)
    y_train = data.y_train.astype(np.float64)
    y_val = data.y_val.astype(np.float64)
    y_test = data.y_test.astype(np.float64)

    return {
        "X_train": X_train,
        "y_train": y_train,
        "X_val": X_val,
        "y_val": y_val,
        "X_test": X_test,
        "y_test": y_test,
    }
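The idea behind scaling with the trained scaler, as mentioned in the remark above, is that statistics are learned from the training set only and then reused for every other split. The following sketch assumes a standard (mean and standard deviation) scaler purely for illustration; the scaler actually used depends on the configuration, and fit_standard_scaler and apply_scaler are hypothetical helpers.

```python
from statistics import mean, pstdev


def fit_standard_scaler(column):
    """Learn the mean and standard deviation from training values only."""
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # fall back to 1.0 for constant columns
    return mu, sigma


def apply_scaler(column, mu, sigma):
    """Scale any split with statistics learned from the training split."""
    return [(value - mu) / sigma for value in column]


train = [1.0, 2.0, 3.0]
val = [2.0, 4.0]

mu, sigma = fit_standard_scaler(train)
train_scaled = apply_scaler(train, mu, sigma)
val_scaled = apply_scaler(val, mu, sigma)  # reuses the training statistics
```

Note that val_scaled is not centered around zero: the validation data is expressed relative to the training distribution, which is the behavior we want for new data as well.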

Correspondingly, _data_preprocess will process an upcoming new dataset, including the tabular data df containing continuous features and categorical features, and unstacked derived data derived_data (multi-modal data or something else depending on the configuration introduced in “Using data functionalities”). The returned value should have the same structure as the X_test returned in _train_data_preprocess.

def _data_preprocess(self, df, derived_data, model_name):
    return self.trainer.datamodule.data_transform(df, scaler_only=True)[
        self.trainer.all_feature_names
    ].values.astype(np.float64)

The program will pass a selected set of hyperparameters as kwargs to initialize a model, train a model, and predict using the model. The returned model will be stored locally and reloaded for evaluation and inference, so make sure it contains all the information needed to make predictions.

Here we initialize the model using information contained in the DataModule instance, including the indices of categorical features cat_idxs, the number of categories of each categorical feature cat_dims, the current task task (possible values are “regression”, “binary”, and “multiclass”), the device to train the model self.trainer.device, and the hyperparameters kwargs. model_name is ignored because we only have one model in the model base. All model bases should at least respect self.trainer.device, self.trainer.datamodule.task, model_name, and kwargs so that all models are trained in a consistent way within the framework.

Remark: In DataModule.cat_num_unique and DataModule.cat_feature_mapping, the category of unknown or missing values is already included as -1 for integer-like categorical features and UNK for string-like categorical features.

def _new_model(self, model_name, verbose, **kwargs):
    from pytorch_tabnet.tab_model import TabNetRegressor, TabNetClassifier

    datamodule = self.trainer.datamodule
    cat_idxs = list(range(len(datamodule.cont_feature_names), len(datamodule.all_feature_names)))
    cat_dims = datamodule.cat_num_unique
    self.task = datamodule.task
    init_kwargs = dict(
        verbose=tabensemb.setting["verbose_per_epoch"] if verbose else 0,
        optimizer_params={
            "lr": kwargs["lr"],
            "weight_decay": kwargs["weight_decay"],
        },
        cat_idxs=cat_idxs,
        cat_dims=cat_dims,
        cat_emb_dim=3,
        device_name=self.trainer.device,
    )
    if self.trainer.datamodule.task == "regression":
        model = TabNetRegressor(**init_kwargs)
    else:
        model = TabNetClassifier(**init_kwargs)

    model.set_params(
        **{
            "n_d": kwargs["n_d"],
            "n_a": kwargs["n_a"],
            "n_steps": kwargs["n_steps"],
            "gamma": kwargs["gamma"],
            "n_independent": kwargs["n_independent"],
            "n_shared": kwargs["n_shared"],
        }
    )
    return model
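The remark about the unknown category matters for cat_dims: the embedding of each categorical feature needs one slot per known category plus one for unknown values. A minimal sketch of such a mapping for a string-like feature is shown below. build_mapping and encode are hypothetical helpers, and the index assigned to UNK here is our choice for illustration; the layout of DataModule.cat_feature_mapping may differ.

```python
def build_mapping(train_values, unknown="UNK"):
    """Map each category seen in training to an index, reserving one slot for unknowns."""
    categories = sorted(set(train_values))
    if unknown not in categories:
        categories.append(unknown)
    return {category: index for index, category in enumerate(categories)}


def encode(values, mapping, unknown="UNK"):
    """Encode values, sending anything unseen during training to the unknown slot."""
    return [mapping.get(value, mapping[unknown]) for value in values]


mapping = build_mapping(["ford", "audi", "ford"])
num_unique = len(mapping)  # two seen categories plus the unknown bucket
codes = encode(["audi", "tesla"], mapping)  # "tesla" was never seen in training
```

Because the unknown bucket is counted in num_unique, a model sized with this number can embed unseen categories at inference time without an out-of-range index.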

Remark: kwargs has all keys defined in _initial_values. If a parameter named batch_size is included, a new key named original_batch_size also exists in kwargs. The values of batch_size and original_batch_size may differ if the program finds that the batch size would produce tiny mini-batches. The threshold is defined by self.limit_batch_size (defaults to 6). A tiny batch might break some models, so it is better to use the modified batch_size value.
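To make the relation between batch_size and original_batch_size more concrete, the sketch below shows one plausible adjustment rule. The exact rule used by the framework is not described here, so adjust_batch_size is a hypothetical helper based on our own assumption: increase the batch size until the last mini-batch is either empty or at least limit_batch_size samples long.

```python
def adjust_batch_size(batch_size, n_samples, limit_batch_size=6):
    """Hypothetical rule: grow the batch size until the last mini-batch is not tiny."""
    if batch_size < 1 or n_samples < 1:
        raise ValueError("batch_size and n_samples must be positive")
    adjusted = min(batch_size, n_samples)
    # n_samples % adjusted is the size of the last, incomplete mini-batch.
    while adjusted < n_samples and 0 < n_samples % adjusted < limit_batch_size:
        adjusted += 1
    return adjusted


original_batch_size = 59
kwargs = {
    "batch_size": adjust_batch_size(original_batch_size, n_samples=238),
    "original_batch_size": original_batch_size,
}
```

With 238 training samples, a batch size of 59 would leave a last mini-batch of only 2 samples, so this rule moves to 60, while a batch size of 32 (last mini-batch of 14) is kept as it is.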

The framework will pass X_train, y_train, X_val, and y_val from _train_data_preprocess to the following _train_single_model method, along with some other arguments describing the current training stage. epoch is the number of epochs to train the model. warm_start=True means the passed model is already trained and should be fine-tuned on a new dataset. in_bayes_opt=True means that the passed kwargs were selected by a Bayesian hyperparameter optimization step and that a simplified training routine is needed to reduce optimization time, so we set max_epochs to the “bayes_epoch” value in the configuration.

Remark: epoch will be self.trainer.args["bayes_epoch"] if in_bayes_opt=True, and self.trainer.args["epoch"] otherwise.

Remark: If you want to plot the training/validation loss curves using the Trainer.plot_loss method for your own model base, you should record the losses as lists in self.train_losses and self.val_losses, and record self.earlystopping_epoch, after training in _train_single_model. See the source code of PytorchTabular, WideDeep, or TorchModel for details.
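The bookkeeping described in the remark above can be sketched with a small training loop over precomputed losses. Everything here is hypothetical: train_with_history is not a tabensemb function, and we assume that the early-stopping epoch means the epoch with the lowest validation loss. Check the source code of the built-in model bases for the exact attribute formats.

```python
def train_with_history(train_losses_per_epoch, val_losses_per_epoch, patience):
    """Record per-epoch losses and stop after `patience` epochs without improvement."""
    train_losses, val_losses = [], []
    best_val, best_epoch = float("inf"), -1
    for epoch, (train_loss, val_loss) in enumerate(
        zip(train_losses_per_epoch, val_losses_per_epoch)
    ):
        train_losses.append(train_loss)
        val_losses.append(val_loss)
        if val_loss < best_val:
            best_val, best_epoch = val_loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return train_losses, val_losses, best_epoch


# Simulated losses: validation loss stops improving after epoch 1.
train_losses, val_losses, earlystopping_epoch = train_with_history(
    [5.0, 4.0, 3.0, 2.0, 1.0, 0.5],
    [5.0, 4.0, 4.5, 4.6, 4.7, 4.8],
    patience=2,
)
```

In a real model base, the three returned values would be stored on self at the end of _train_single_model so that the plotting utility can find them.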

def _train_single_model(
    self,
    model,
    model_name,
    epoch,
    X_train,
    y_train,
    X_val,
    y_val,
    verbose,
    warm_start,
    in_bayes_opt,
    **kwargs,
):
    eval_set = [(X_val, y_val if self.task == "regression" else y_val.flatten())]

    model.fit(
        X_train,
        y_train if self.task == "regression" else y_train.flatten(),
        eval_set=eval_set,
        max_epochs=epoch if not in_bayes_opt else self.trainer.args["bayes_epoch"],
        patience=self.trainer.args["patience"],
        loss_fn=torch.nn.MSELoss()
        if self.task == "regression"
        else torch.nn.CrossEntropyLoss(),
        eval_metric=["mse" if self.task == "regression" else "logloss"],
        batch_size=int(kwargs["batch_size"]),
        warm_start=warm_start,
        drop_last=False,
    )

To evaluate the model or make use of the model, _pred_single_model is defined, and X_test processed in _train_data_preprocess or _data_preprocess is passed as an argument. The returned value should always be a two-dimensional np.ndarray. For binary classification tasks, the output is the probability of the positive (1) class, and for multiclass classification, the output is the probability of each class. AbstractModel automatically deals with the probabilities for metrics and final outputs.

def _pred_single_model(self, model, X_test, verbose, **kwargs):
    if self.task == "regression":
        return model.predict(X_test).reshape(-1, 1)
    elif self.task == "binary":
        return model.predict_proba(X_test)[:, 1].reshape(-1, 1)
    else:
        return model.predict_proba(X_test)

The full code is as follows:

[4]:
class TabNetFromAbstract(AbstractModel):
    def __init__(self, *args, some_param=1.1, **kwargs):
        super(TabNetFromAbstract, self).__init__(*args, **kwargs)
        # Do something else here
        self.some_param = some_param
        print(self.init_params)

    def _get_program_name(self):
        return "TabNetFromAbstract"

    def _get_model_names(self):
        return ["TabNet"]

    def _space(self, model_name):
        return [
            Integer(low=4, high=16, prior="uniform", name="n_d", dtype=int),
            Integer(low=4, high=16, prior="uniform", name="n_a", dtype=int),
            Integer(low=1, high=6, prior="uniform", name="n_steps", dtype=int),
            Real(low=1.0, high=1.5, prior="uniform", name="gamma"),
            Integer(
                low=1, high=4, prior="uniform", name="n_independent", dtype=int
            ),
            Integer(low=1, high=4, prior="uniform", name="n_shared", dtype=int),
        ] + self.trainer.SPACE

    def _initial_values(self, model_name):
        return {
            "n_d": 8,
            "n_a": 8,
            "n_steps": 3,
            "gamma": 1.3,
            "n_independent": 2,
            "n_shared": 2,
            "lr": self.trainer.args["lr"],
            "weight_decay": self.trainer.args["weight_decay"],
            "batch_size": self.trainer.args["batch_size"],
        }

    def _train_data_preprocess(self, model_name):
        data = self.trainer.datamodule
        all_feature_names = data.all_feature_names

        X_train = data.data_transform(data.X_train, scaler_only=True)[
            all_feature_names
        ].values.astype(np.float64)
        X_val = data.data_transform(data.X_val, scaler_only=True)[
            all_feature_names
        ].values.astype(np.float64)
        X_test = data.data_transform(data.X_test, scaler_only=True)[
            all_feature_names
        ].values.astype(np.float64)
        y_train = data.y_train.astype(np.float64)
        y_val = data.y_val.astype(np.float64)
        y_test = data.y_test.astype(np.float64)

        return {
            "X_train": X_train,
            "y_train": y_train,
            "X_val": X_val,
            "y_val": y_val,
            "X_test": X_test,
            "y_test": y_test,
        }

    def _data_preprocess(self, df, derived_data, model_name):
        return self.trainer.datamodule.data_transform(df, scaler_only=True)[
            self.trainer.all_feature_names
        ].values.astype(np.float64)

    def _new_model(self, model_name, verbose, **kwargs):
        from pytorch_tabnet.tab_model import TabNetRegressor, TabNetClassifier

        datamodule = self.trainer.datamodule
        cat_idxs = list(range(len(datamodule.cont_feature_names), len(datamodule.all_feature_names)))
        cat_dims = datamodule.cat_num_unique
        self.task = datamodule.task
        init_kwargs = dict(
            verbose=tabensemb.setting["verbose_per_epoch"] if verbose else 0,
            optimizer_params={
                "lr": kwargs["lr"],
                "weight_decay": kwargs["weight_decay"],
            },
            cat_idxs=cat_idxs,
            cat_dims=cat_dims,
            cat_emb_dim=3,
            device_name=self.trainer.device,
        )
        if self.trainer.datamodule.task == "regression":
            model = TabNetRegressor(**init_kwargs)
        else:
            model = TabNetClassifier(**init_kwargs)

        model.set_params(
            **{
                "n_d": kwargs["n_d"],
                "n_a": kwargs["n_a"],
                "n_steps": kwargs["n_steps"],
                "gamma": kwargs["gamma"],
                "n_independent": kwargs["n_independent"],
                "n_shared": kwargs["n_shared"],
            }
        )
        return model

    def _train_single_model(
        self,
        model,
        model_name,
        epoch,
        X_train,
        y_train,
        X_val,
        y_val,
        verbose,
        warm_start,
        in_bayes_opt,
        **kwargs,
    ):
        eval_set = [(X_val, y_val if self.task == "regression" else y_val.flatten())]

        model.fit(
            X_train,
            y_train if self.task == "regression" else y_train.flatten(),
            eval_set=eval_set,
            max_epochs=epoch if not in_bayes_opt else self.trainer.args["bayes_epoch"],
            patience=self.trainer.args["patience"],
            loss_fn=torch.nn.MSELoss()
            if self.task == "regression"
            else torch.nn.CrossEntropyLoss(),
            eval_metric=["mse" if self.task == "regression" else "logloss"],
            batch_size=int(kwargs["batch_size"]),
            warm_start=warm_start,
            drop_last=False,
        )

    def _pred_single_model(self, model, X_test, verbose, **kwargs):
        if self.task == "regression":
            return model.predict(X_test).reshape(-1, 1)
        elif self.task == "binary":
            return model.predict_proba(X_test)[:, 1].reshape(-1, 1)
        else:
            return model.predict_proba(X_test)

Example: Implement TabNet as a PyTorch-based model

Indeed, the example above uses TabNetRegressor and TabNetClassifier from pytorch_tabnet, which already implement the training and evaluation procedures around the torch.nn.Module subclass called TabNet. We can also build a model base directly for nn.Modules with less effort. Such model bases inherit TorchModel, and the nn.Modules should inherit AbstractNN (only a few lines need to change to migrate existing code into this framework).

[5]:
from tabensemb.model import TorchModel, AbstractNN
from pytorch_tabnet.tab_network import TabNet
from typing import Dict

First, we implement an AbstractNN (which inherits pytorch_lightning.LightningModule that further inherits torch.nn.Module).

We initialize the model in __init__. kwargs depends on the arguments passed from _new_model, which will be implemented later, but it should at least contain all keys defined in _initial_values, as introduced in a remark above.

Remember to call super().__init__. Beyond that, it is no more difficult than initializing a LightningModule.

If you call super().__init__(datamodule, **kwargs) instead of super().__init__(datamodule), you can use self.hparams.some_param to get a hyperparameter (equivalent to kwargs["some_param"]), because AbstractNN uses the LightningModule.save_hyperparameters utility (which you should not call in your own __init__).

Remark: To migrate existing nn.Module code (Part 1)

  • Change class SomeModel(nn.Module) to class SomeModel(AbstractNN).

  • Change the indices of categorical features to [0, 1, ..., self.n_cat-1] and the numbers of unique categories of categorical features to self.cat_num_unique.

  • Change the number of input dimensions to self.n_cont+self.n_cat and the number of output dimensions to self.n_outputs.

class TabNetNN(AbstractNN):
    def __init__(
        self,
        datamodule,
        **kwargs,
    ):
        super(TabNetNN, self).__init__(datamodule, **kwargs)
        self.network = TabNet(
            input_dim=self.n_cont+self.n_cat,
            output_dim=self.n_outputs,
            n_d=self.hparams.n_d,
            n_a=self.hparams.n_a,
            n_steps=self.hparams.n_steps,
            gamma=self.hparams.gamma,
            cat_idxs=list(range(self.n_cat)),
            cat_dims=self.cat_num_unique,
            cat_emb_dim=[3] * self.n_cat,
            n_independent=self.hparams.n_independent,
            n_shared=self.hparams.n_shared,
        )

Then we implement the computation step of the model. We should implement _forward instead of forward, which is already implemented by AbstractNN and is used to automatically process the inputs and outputs of _forward.

There are two input arguments for _forward: x and derived_tensors. x is a tensor of continuous features. derived_tensors is a dictionary containing contents in datamodule.derived_data (which is introduced in the last two sections of the “Using data functionalities” part), including categorical data (with the key “categorical” if there is any categorical feature), the signal for each data point representing whether it is an augmented one (with the key “augmented” if there is any augmented data point), and derived unstacked data (with the key derived_name specified in the configuration). This is how multimodal data is passed to a deep learning model in our framework.

In the following lines, we build the input of the neural network from the continuous features x and the categorical features derived_tensors["categorical"] by concatenation (that’s why the indices of categorical features are set to [0, 1, ..., self.n_cat-1]), calculate the output of the network, and return the output.

Remark: The default loss function is torch.nn.MSELoss for regression, torch.nn.BCEWithLogitsLoss for binary classification, and torch.nn.CrossEntropyLoss for multiclass classification. To change this behavior, implement self.loss_fn. See the “Advanced customized model base” part for details.

Remark: For binary classification tasks, self.n_outputs=1 so we expect the logits of the positive class (instead of a normalized probability). The output is then used to calculate torch.nn.BCEWithLogitsLoss by default. For multiclass classification tasks, self.n_outputs is the number of classes, so we expect the logits of these classes (instead of probabilities from Softmax or something else). The output is then used to calculate torch.nn.CrossEntropyLoss by default.
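For readers less familiar with the distinction between logits and probabilities, the following sketch shows the standard conversions: a sigmoid for a single binary logit and a softmax for multiclass logits. It uses only the Python standard library, and the function names are ours. Since the default losses expect logits, these conversions should not be applied inside _forward.

```python
import math


def sigmoid(logit):
    """Convert a single binary logit into the probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-logit))


def softmax(logits):
    """Convert multiclass logits into probabilities that sum to one."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


p_positive = sigmoid(0.0)             # a logit of zero means "undecided"
p_classes = softmax([2.0, 1.0, 0.1])  # larger logits receive larger probabilities
```

A network that returned p_positive or p_classes instead of the raw logits would have the sigmoid or softmax applied a second time by torch.nn.BCEWithLogitsLoss or torch.nn.CrossEntropyLoss, which distorts the loss.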

Remark: To migrate existing nn.Module code (Part 2)

  • Change forward to _forward

  • Get categorical features from derived_tensors

  • Get multimodal features from derived_tensors (and load multimodal features using data derivers)

  • Return logits instead of probabilities

def _forward(
    self, x: torch.Tensor, derived_tensors: Dict[str, torch.Tensor]
) -> torch.Tensor:
    x_cont = x
    if "categorical" in derived_tensors.keys():
        x_cat = derived_tensors["categorical"]
        x_in = torch.concat([x_cat, x_cont], dim=-1)
    else:
        x_in = x_cont
    output, _ = self.network(x_in)
    return output
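The column layout produced by the concatenation in _forward can be verified on small arrays. The sketch below uses NumPy instead of torch so that it runs without a deep learning backend; the array values and the variable names n_cat and n_cont are illustrative only.

```python
import numpy as np

n_cat, n_cont = 2, 3

# Two samples: encoded categorical features and continuous features.
x_cat = np.array([[0.0, 1.0], [2.0, 0.0]])
x_cont = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])

# Same order as in _forward: categorical columns first, continuous columns after.
x_in = np.concatenate([x_cat, x_cont], axis=-1)

# With this order, the categorical columns occupy indices 0 .. n_cat-1.
cat_idxs = list(range(n_cat))
```

This differs from TabNetFromAbstract, where the continuous features come first and the categorical indices therefore start at len(cont_feature_names).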

The code is as follows:

[6]:
class TabNetNN(AbstractNN):
    def __init__(
        self,
        datamodule,
        **kwargs,
    ):
        super(TabNetNN, self).__init__(datamodule, **kwargs)
        self.network = TabNet(
            input_dim=self.n_cont+self.n_cat,
            output_dim=self.n_outputs,
            n_d=self.hparams.n_d,
            n_a=self.hparams.n_a,
            n_steps=self.hparams.n_steps,
            gamma=self.hparams.gamma,
            cat_idxs=list(range(self.n_cat)),
            cat_dims=self.cat_num_unique,
            cat_emb_dim=[3] * self.n_cat,
            n_independent=self.hparams.n_independent,
            n_shared=self.hparams.n_shared,
        )

    def _forward(
        self, x: torch.Tensor, derived_tensors: Dict[str, torch.Tensor]
    ) -> torch.Tensor:
        x_cont = x
        if "categorical" in derived_tensors.keys():
            x_cat = derived_tensors["categorical"]
            x_in = torch.concat([x_cat, x_cont], dim=-1)
        else:
            x_in = x_cont
        output, _ = self.network(x_in)
        return output

Finally, we build the model base for the neural network. It inherits TorchModel, which has implemented most of the required methods. The remaining methods for TorchModel can be written in the same way as for TabNetFromAbstract.

Remark: PyTorch-based models are trained using pytorch_lightning.Trainer, whose arguments can be specified by passing them to TorchModel.__init__ as a dictionary using the key lightning_trainer_kwargs.

In the following implementation, _new_model passes the datamodule and hyperparameters to the neural network, which is what you saw above in __init__. You can also pass other arguments as you want.

[7]:
class TabNetFromTorch(TorchModel):
    def _new_model(self, model_name, verbose, **kwargs):
        return TabNetNN(datamodule=self.trainer.datamodule, **kwargs)

    def _get_program_name(self):
        return "TabNetFromTorch"

    def _get_model_names(self):
        return ["TabNet"]

    def _space(self, model_name):
        return [
            Integer(low=4, high=16, prior="uniform", name="n_d", dtype=int),
            Integer(low=4, high=16, prior="uniform", name="n_a", dtype=int),
            Integer(low=1, high=6, prior="uniform", name="n_steps", dtype=int),
            Real(low=1.0, high=1.5, prior="uniform", name="gamma"),
            Integer(
                low=1, high=4, prior="uniform", name="n_independent", dtype=int
            ),
            Integer(low=1, high=4, prior="uniform", name="n_shared", dtype=int),
        ] + self.trainer.SPACE

    def _initial_values(self, model_name):
        return {
            "n_d": 8,
            "n_a": 8,
            "n_steps": 3,
            "gamma": 1.3,
            "n_independent": 2,
            "n_shared": 2,
            "lr": self.trainer.args["lr"],
            "weight_decay": self.trainer.args["weight_decay"],
            "batch_size": self.trainer.args["batch_size"],
        }

Comparison of different implementations in other model bases

We can compare our models with TabNet implemented in the other two model bases. Note that because of different training routines and randomization, they perform differently. Let’s try the models on a regression task first.

[8]:
from tabensemb.trainer import Trainer
from tabensemb.config import UserConfig
from tabensemb.model import PytorchTabular, WideDeep

trainer = Trainer(device=device)
mpg_columns = [
    "mpg",
    "cylinders",
    "displacement",
    "horsepower",
    "weight",
    "acceleration",
    "model_year",
    "origin",
    "car_name",
]
cfg = UserConfig.from_uci("Auto MPG", column_names=mpg_columns, sep=r"\s+")
trainer.load_config(cfg)
trainer.load_data()
trainer.add_modelbases(
    [
        PytorchTabular(trainer, model_subset=["TabNet"]),
        WideDeep(trainer, model_subset=["TabNet"]),
        TabNetFromAbstract(trainer),
        TabNetFromTorch(trainer),
    ]
)
trainer.train(stderr_to_stdout=True)
trainer.get_leaderboard()
Downloading https://archive.ics.uci.edu/static/public/9/auto+mpg.zip to /tmp/tmpdriivjp7/data/Auto MPG.zip
cylinders is Integer and will be treated as a continuous feature.
model_year is Integer and will be treated as a continuous feature.
origin is Integer and will be treated as a continuous feature.
Unknown values are detected in ['horsepower']. They will be treated as np.nan.
The project will be saved to /tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig
Dataset size: 238 80 80
Data saved to /tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig (data.csv and tabular_data.csv).
{'some_param': 1.1, 'program': None, 'model_subset': None, 'exclude_models': None, 'store_in_harddisk': True}

-------------Run PytorchTabular-------------

Training TabNet
Global seed set to 42
2023-09-23 20:34:52,661 - {pytorch_tabular.tabular_model:473} - INFO - Preparing the DataLoaders
2023-09-23 20:34:52,661 - {pytorch_tabular.tabular_datamodule:290} - INFO - Setting up the datamodule for regression task
2023-09-23 20:34:52,670 - {pytorch_tabular.tabular_model:521} - INFO - Preparing the Model: TabNetModel
2023-09-23 20:34:52,684 - {pytorch_tabular.tabular_model:268} - INFO - Preparing the Trainer
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:589: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
  rank_zero_deprecation(
Auto select gpus: [0]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
2023-09-23 20:34:53,558 - {pytorch_tabular.tabular_model:582} - INFO - Training Started
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name             | Type           | Params
----------------------------------------------------
0 | _embedding_layer | Identity       | 0
1 | _backbone        | TabNetBackbone | 6.1 K
2 | _head            | Identity       | 0
3 | loss             | MSELoss        | 0
----------------------------------------------------
6.1 K     Trainable params
0         Non-trainable params
6.1 K     Total params
0.024     Total estimated model params size (MB)
Epoch: 1/300, Train loss: 631.3760, Val loss: 566.2643, Min val loss: 566.2643, Epoch time: 0.029s.
Epoch: 20/300, Train loss: 601.5834, Val loss: 530.7571, Min val loss: 530.7571, Epoch time: 0.023s.
Epoch: 40/300, Train loss: 573.0287, Val loss: 512.0469, Min val loss: 512.0469, Epoch time: 0.024s.
Epoch: 60/300, Train loss: 547.0676, Val loss: 488.9417, Min val loss: 488.9417, Epoch time: 0.026s.
Epoch: 80/300, Train loss: 521.3848, Val loss: 460.8999, Min val loss: 460.8999, Epoch time: 0.024s.
Epoch: 100/300, Train loss: 492.7377, Val loss: 429.2286, Min val loss: 429.2286, Epoch time: 0.023s.
Epoch: 120/300, Train loss: 461.0415, Val loss: 398.5600, Min val loss: 398.5600, Epoch time: 0.025s.
Epoch: 140/300, Train loss: 430.3234, Val loss: 374.5228, Min val loss: 374.5228, Epoch time: 0.035s.
Epoch: 160/300, Train loss: 397.2393, Val loss: 348.5988, Min val loss: 348.5988, Epoch time: 0.028s.
Epoch: 180/300, Train loss: 370.6253, Val loss: 322.0574, Min val loss: 322.0574, Epoch time: 0.026s.
Epoch: 200/300, Train loss: 340.3246, Val loss: 301.0881, Min val loss: 301.0881, Epoch time: 0.028s.
Epoch: 220/300, Train loss: 315.3022, Val loss: 277.9825, Min val loss: 277.9825, Epoch time: 0.024s.
Epoch: 240/300, Train loss: 287.3188, Val loss: 257.2012, Min val loss: 257.2012, Epoch time: 0.024s.
Epoch: 260/300, Train loss: 260.1859, Val loss: 233.5729, Min val loss: 233.5729, Epoch time: 0.024s.
Epoch: 280/300, Train loss: 235.5418, Val loss: 206.5197, Min val loss: 206.5197, Epoch time: 0.025s.
Epoch: 300/300, Train loss: 211.4229, Val loss: 186.0890, Min val loss: 186.0890, Epoch time: 0.046s.
`Trainer.fit` stopped: `max_epochs=300` reached.
2023-09-23 20:35:05,998 - {pytorch_tabular.tabular_model:584} - INFO - Training the model completed
2023-09-23 20:35:05,998 - {pytorch_tabular.tabular_model:1258} - INFO - Loading the best model
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v2.0.0. Please use `lightning_fabric.utilities.cloud_io.get_filesystem` instead.
  rank_zero_deprecation(
Training mse loss: 204.72249
Validation mse loss: 186.08899
Testing mse loss: 211.05500
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')

-------------PytorchTabular End-------------


-------------Run WideDeep-------------

Training TabNet
Epoch: 1/300, Train loss: 632.3985, Val loss: 567.4960, Min val loss: 567.4960
Epoch: 21/300, Train loss: 598.0990, Val loss: 533.1615, Min val loss: 533.1615
Epoch: 41/300, Train loss: 569.2331, Val loss: 512.1868, Min val loss: 512.1868
Epoch: 61/300, Train loss: 540.4758, Val loss: 484.7060, Min val loss: 484.7060
Epoch: 81/300, Train loss: 511.0865, Val loss: 455.2033, Min val loss: 455.2033
Epoch: 101/300, Train loss: 480.2253, Val loss: 424.2426, Min val loss: 424.2426
Epoch: 121/300, Train loss: 450.1095, Val loss: 398.7469, Min val loss: 398.7469
Epoch: 141/300, Train loss: 419.1121, Val loss: 373.4113, Min val loss: 373.4113
Epoch: 161/300, Train loss: 389.0500, Val loss: 343.9605, Min val loss: 343.9605
Epoch: 181/300, Train loss: 359.7761, Val loss: 317.2437, Min val loss: 317.2437
Epoch: 201/300, Train loss: 332.5761, Val loss: 289.8560, Min val loss: 289.8560
Epoch: 221/300, Train loss: 304.8683, Val loss: 268.5120, Min val loss: 268.5120
Epoch: 241/300, Train loss: 278.2647, Val loss: 245.0433, Min val loss: 245.0433
Epoch: 261/300, Train loss: 252.8031, Val loss: 220.4438, Min val loss: 220.4438
Epoch: 281/300, Train loss: 228.1485, Val loss: 196.3897, Min val loss: 196.3897
Restoring model weights from the end of the best epoch
Training mse loss: 206.63809
Validation mse loss: 173.36779
Testing mse loss: 212.69775
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')

-------------WideDeep End-------------


-------------Run TabNetFromAbstract-------------

Training TabNet
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/multiclass_utils.py:13: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.
  from scipy.sparse.base import spmatrix
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/abstract_model.py:75: UserWarning: Device used : cuda
  warnings.warn(f"Device used : {self.device}")
epoch 0  | loss: 587.08069| val_0_mse: 521.36511|  0:00:00s
epoch 20 | loss: 543.03503| val_0_mse: 509.49057|  0:00:00s
epoch 40 | loss: 503.94666| val_0_mse: 474.56982|  0:00:01s
epoch 60 | loss: 464.24173| val_0_mse: 433.25231|  0:00:01s
epoch 80 | loss: 428.79398| val_0_mse: 394.9299|  0:00:02s
epoch 100| loss: 398.57919| val_0_mse: 363.43954|  0:00:02s
epoch 120| loss: 368.87878| val_0_mse: 332.14944|  0:00:03s
epoch 140| loss: 339.44043| val_0_mse: 301.91438|  0:00:03s
epoch 160| loss: 313.07431| val_0_mse: 274.16858|  0:00:04s
epoch 180| loss: 287.4639| val_0_mse: 248.59148|  0:00:04s
epoch 200| loss: 255.01363| val_0_mse: 223.61368|  0:00:05s
epoch 220| loss: 229.25694| val_0_mse: 198.16758|  0:00:06s
epoch 240| loss: 200.16383| val_0_mse: 172.72145|  0:00:06s
epoch 260| loss: 176.21191| val_0_mse: 150.20621|  0:00:07s
epoch 280| loss: 150.32635| val_0_mse: 132.21739|  0:00:07s
Stop training because you reached max_epochs = 300 with best_epoch = 299 and best_val_0_mse = 110.85251
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/callbacks.py:172: UserWarning: Best weights from best epoch are automatically used!
  warnings.warn(wrn_msg)
Training mse loss: 122.27976
Validation mse loss: 110.85251
Testing mse loss: 112.14110
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')

-------------TabNetFromAbstract End-------------


-------------Run TabNetFromTorch-------------

Training TabNet
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name                | Type     | Params
-------------------------------------------------
0 | default_loss_fn     | MSELoss  | 0
1 | default_output_norm | Identity | 0
2 | network             | TabNet   | 6.1 K
-------------------------------------------------
6.1 K     Trainable params
0         Non-trainable params
6.1 K     Total params
0.024     Total estimated model params size (MB)
Epoch: 1/300, Train loss: 631.3761, Val loss: 568.1518, Min val loss: 568.1518, Min ES val loss: 568.1518, Epoch time: 0.021s.
Epoch: 20/300, Train loss: 601.5826, Val loss: 531.1477, Min val loss: 531.1477, Min ES val loss: 531.1477, Epoch time: 0.020s.
Epoch: 40/300, Train loss: 572.3008, Val loss: 512.4727, Min val loss: 512.4727, Min ES val loss: 512.4727, Epoch time: 0.020s.
Epoch: 60/300, Train loss: 547.1713, Val loss: 487.6359, Min val loss: 487.6359, Min ES val loss: 487.6359, Epoch time: 0.021s.
Epoch: 80/300, Train loss: 521.3299, Val loss: 458.8514, Min val loss: 458.8514, Min ES val loss: 458.8514, Epoch time: 0.023s.
Epoch: 100/300, Train loss: 493.5303, Val loss: 430.2360, Min val loss: 430.2360, Min ES val loss: 430.2360, Epoch time: 0.024s.
Epoch: 120/300, Train loss: 460.1804, Val loss: 404.0699, Min val loss: 404.0699, Min ES val loss: 404.0699, Epoch time: 0.022s.
Epoch: 140/300, Train loss: 430.0944, Val loss: 377.3679, Min val loss: 377.3679, Min ES val loss: 377.3679, Epoch time: 0.022s.
Epoch: 160/300, Train loss: 399.9077, Val loss: 349.6179, Min val loss: 349.6179, Min ES val loss: 349.6179, Epoch time: 0.026s.
Epoch: 180/300, Train loss: 373.4857, Val loss: 330.2073, Min val loss: 330.2073, Min ES val loss: 330.2073, Epoch time: 0.022s.
Epoch: 200/300, Train loss: 342.6961, Val loss: 307.0838, Min val loss: 307.0838, Min ES val loss: 307.0838, Epoch time: 0.024s.
Epoch: 220/300, Train loss: 318.1428, Val loss: 283.2042, Min val loss: 283.2042, Min ES val loss: 283.2042, Epoch time: 0.023s.
Epoch: 240/300, Train loss: 292.0168, Val loss: 260.7692, Min val loss: 260.7692, Min ES val loss: 260.7692, Epoch time: 0.035s.
Epoch: 260/300, Train loss: 263.9176, Val loss: 236.7290, Min val loss: 236.7290, Min ES val loss: 236.7290, Epoch time: 0.037s.
Epoch: 280/300, Train loss: 241.0646, Val loss: 214.5846, Min val loss: 214.5846, Min ES val loss: 214.5846, Epoch time: 0.024s.
Epoch: 300/300, Train loss: 216.0625, Val loss: 191.7081, Min val loss: 191.7081, Min ES val loss: 191.7081, Epoch time: 0.031s.
`Trainer.fit` stopped: `max_epochs=300` reached.
Training mse loss: 208.48247
Validation mse loss: 191.70814
Testing mse loss: 204.47160
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')

-------------TabNetFromTorch End-------------

PytorchTabular metrics
TabNet 1/1
WideDeep metrics
TabNet 1/1
TabNetFromAbstract metrics
TabNet 1/1
TabNetFromTorch metrics
TabNet 1/1
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')
[8]:
Program Model Training RMSE Training MSE Training MAE Training MAPE Training R2 Training MEDIAN_ABSOLUTE_ERROR Training EXPLAINED_VARIANCE_SCORE Testing RMSE ... Testing R2 Testing MEDIAN_ABSOLUTE_ERROR Testing EXPLAINED_VARIANCE_SCORE Validation RMSE Validation MSE Validation MAE Validation MAPE Validation R2 Validation MEDIAN_ABSOLUTE_ERROR Validation EXPLAINED_VARIANCE_SCORE
0 TabNetFromAbstract TabNet 11.058018 122.279759 10.120124 0.425556 -0.897031 9.511139 0.688216 10.589669 ... -1.085708 9.735932 0.711417 10.528652 110.852511 9.544754 0.417929 -0.980271 9.028342 0.575915
1 TabNetFromTorch TabNet 14.438922 208.482471 13.648272 0.578209 -2.234367 12.755719 0.655482 14.299357 ... -2.802959 13.421765 0.606030 13.845871 191.708137 12.779896 0.562416 -2.424678 11.474345 0.490396
2 PytorchTabular TabNet 14.308127 204.722492 13.486490 0.570209 -2.176035 12.484611 0.645709 14.527732 ... -2.925404 13.481362 0.597145 13.641444 186.088988 12.641969 0.558155 -2.324297 11.443497 0.530719
3 WideDeep TabNet 14.374912 206.638091 13.423684 0.562884 -2.205754 12.262027 0.560759 14.584161 ... -2.955957 13.293868 0.569623 13.166920 173.367793 12.166225 0.534651 -2.097046 11.464049 0.411240

4 rows × 23 columns

The leaderboard shows that TabNet performs poorly with the current hyperparameters: every R2 score is negative, and the testing RMSE is between roughly 10.6 and 14.6 for all four model bases. We can set trainer.args["bayes_opt"] = True to activate Bayesian hyperparameter optimization and improve its performance. Alternatively, we can directly provide a set of hyperparameters (here, one found by Bayesian optimization) in AbstractModel.model_params. As shown below, the performance improves significantly, with the testing RMSE dropping to about 2.6.

[9]:
trainer = Trainer(device=device)
trainer.load_config(cfg)
trainer.load_data()
modelbase = PytorchTabular(trainer, model_subset=["TabNet"])
trainer.add_modelbases([modelbase])
modelbase.model_params["TabNet"] = {
    "n_d": 8,
    "n_a": 15,
    "n_steps": 1,
    "gamma": 1.0,
    "n_independent": 3,
    "n_shared": 4,
    "lr": 0.026917811078469658,
    "weight_decay": 1e-09,
    "batch_size": 64,
}
trainer.train(stderr_to_stdout=True)
trainer.get_leaderboard()
The project will be saved to /tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-35-35-0_UserInputConfig
Dataset size: 238 80 80
Data saved to /tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-35-35-0_UserInputConfig (data.csv and tabular_data.csv).

-------------Run PytorchTabular-------------

Training TabNet
Previous params loaded: {'n_d': 8, 'n_a': 15, 'n_steps': 1, 'gamma': 1.0, 'n_independent': 3, 'n_shared': 4, 'lr': 0.026917811078469658, 'weight_decay': 1e-09, 'batch_size': 64}
Global seed set to 42
2023-09-23 20:35:35,743 - {pytorch_tabular.tabular_model:473} - INFO - Preparing the DataLoaders
2023-09-23 20:35:35,744 - {pytorch_tabular.tabular_datamodule:290} - INFO - Setting up the datamodule for regression task
2023-09-23 20:35:35,754 - {pytorch_tabular.tabular_model:521} - INFO - Preparing the Model: TabNetModel
2023-09-23 20:35:35,766 - {pytorch_tabular.tabular_model:268} - INFO - Preparing the Trainer
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:589: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
  rank_zero_deprecation(
Auto select gpus: [0]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
2023-09-23 20:35:35,784 - {pytorch_tabular.tabular_model:582} - INFO - Training Started
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name             | Type           | Params
----------------------------------------------------
0 | _embedding_layer | Identity       | 0
1 | _backbone        | TabNetBackbone | 11.3 K
2 | _head            | Identity       | 0
3 | loss             | MSELoss        | 0
----------------------------------------------------
11.3 K    Trainable params
0         Non-trainable params
11.3 K    Total params
0.045     Total estimated model params size (MB)
Epoch: 1/300, Train loss: 608.1827, Val loss: 442.9725, Min val loss: 442.9725, Epoch time: 0.066s.
Epoch: 20/300, Train loss: 8.9815, Val loss: 23.6918, Min val loss: 23.6918, Epoch time: 0.053s.
Epoch: 40/300, Train loss: 6.7731, Val loss: 13.2956, Min val loss: 9.6972, Epoch time: 0.061s.
Epoch: 60/300, Train loss: 6.0017, Val loss: 11.2620, Min val loss: 9.4719, Epoch time: 0.059s.
Epoch: 80/300, Train loss: 4.7520, Val loss: 10.6944, Min val loss: 9.4719, Epoch time: 0.082s.
Epoch: 100/300, Train loss: 6.2060, Val loss: 8.9962, Min val loss: 8.5929, Epoch time: 0.065s.
Epoch: 120/300, Train loss: 5.4083, Val loss: 11.0474, Min val loss: 8.5929, Epoch time: 0.056s.
Epoch: 140/300, Train loss: 4.6970, Val loss: 10.8864, Min val loss: 8.4415, Epoch time: 0.052s.
Epoch: 160/300, Train loss: 5.2870, Val loss: 10.9812, Min val loss: 8.4415, Epoch time: 0.053s.
Epoch: 180/300, Train loss: 4.4553, Val loss: 10.5833, Min val loss: 8.4415, Epoch time: 0.052s.
Epoch: 200/300, Train loss: 4.3323, Val loss: 12.8693, Min val loss: 8.4415, Epoch time: 0.058s.
Epoch: 220/300, Train loss: 4.1741, Val loss: 12.9377, Min val loss: 8.4415, Epoch time: 0.057s.
2023-09-23 20:35:49,765 - {pytorch_tabular.tabular_model:584} - INFO - Training the model completed
2023-09-23 20:35:49,766 - {pytorch_tabular.tabular_model:1258} - INFO - Loading the best model
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v2.0.0. Please use `lightning_fabric.utilities.cloud_io.get_filesystem` instead.
  rank_zero_deprecation(
Training mse loss: 4.88793
Validation mse loss: 8.44147
Testing mse loss: 6.77917
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-35-35-0_UserInputConfig/trainer.pkl')

-------------PytorchTabular End-------------

PytorchTabular metrics
TabNet 1/1
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-35-35-0_UserInputConfig/trainer.pkl')
[9]:
Program Model Training RMSE Training MSE Training MAE Training MAPE Training R2 Training MEDIAN_ABSOLUTE_ERROR Training EXPLAINED_VARIANCE_SCORE Testing RMSE ... Testing R2 Testing MEDIAN_ABSOLUTE_ERROR Testing EXPLAINED_VARIANCE_SCORE Validation RMSE Validation MSE Validation MAE Validation MAPE Validation R2 Validation MEDIAN_ABSOLUTE_ERROR Validation EXPLAINED_VARIANCE_SCORE
0 PytorchTabular TabNet 2.210867 4.887931 1.642482 0.069627 0.924169 1.215234 0.924173 2.603684 ... 0.873915 1.517266 0.873926 2.90542 8.441465 2.17954 0.097734 0.849201 1.557569 0.849222

1 rows × 23 columns
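The two regression leaderboards above differ sharply in R2: it is negative before tuning (for example, a training R2 of -0.897 for TabNetFromAbstract) and about 0.87 on the testing set after tuning. The untuned rows also pair a negative R2 with a clearly positive EXPLAINED_VARIANCE_SCORE, which suggests that those predictions are mostly shifted by a systematic offset. The following is a minimal sketch of how these metrics relate, written in plain Python with hypothetical toy values (not auto-mpg data); the helper names are illustrative and are not part of tabensemb.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def r2_score(y_true, y_pred):
    """1 - MSE / Var(y): negative when the model is worse than predicting the mean."""
    return 1.0 - mse(y_true, y_pred) / variance(y_true)


def explained_variance(y_true, y_pred):
    """1 - Var(residual) / Var(y): ignores any constant offset in the predictions."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)


# Hypothetical toy targets and two sets of predictions.
y_true = [10.0, 20.0, 30.0, 40.0]
biased = [t - 15.0 for t in y_true]  # correct trend, constant offset of 15
close = [11.0, 19.0, 31.0, 39.0]     # small errors, no systematic offset

rmse_biased = math.sqrt(mse(y_true, biased))
r2_biased = r2_score(y_true, biased)
ev_biased = explained_variance(y_true, biased)
r2_close = r2_score(y_true, close)

print(f"biased: RMSE={rmse_biased:.3f}, R2={r2_biased:.3f}, EV={ev_biased:.3f}")
print(f"close:  R2={r2_close:.3f}")
```

In this sketch the offset predictions get a negative R2 even though their explained variance is perfect, because R2 penalizes the mean error while the explained variance score does not. RMSE is simply the square root of MSE, which matches the Training RMSE and Training MSE columns in the leaderboards.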

Next, we run the binary classification task on the Adult dataset:

[10]:
trainer = Trainer(device=device)
adult_columns = [
    "age",
    "workclass",
    "fnlwgt",
    "education",
    "education-num",
    "marital-status",
    "occupation",
    "relationship",
    "race",
    "sex",
    "capital-gain",
    "capital-loss",
    "hours-per-week",
    "native-country",
    "income",
]
cfg = UserConfig.from_uci("Adult", column_names=adult_columns, sep=", ")
trainer.load_config(cfg)
trainer.load_data()
trainer.add_modelbases(
    [
        PytorchTabular(trainer, model_subset=["TabNet"]),
        WideDeep(trainer, model_subset=["TabNet"]),
        TabNetFromAbstract(trainer),
        TabNetFromTorch(trainer),
    ]
)
trainer.train(stderr_to_stdout=True)
trainer.get_leaderboard()
Downloading https://archive.ics.uci.edu/static/public/2/adult.zip to /tmp/tmpdriivjp7/data/Adult.zip
/home/xlluo/hdd/tabular_ensemble/tabensemb/config/user_config.py:292: UserWarning: There exists .test file(s) ['adult.test'] which should be used for final metrics. The .zip file is left for the user to process.
  warnings.warn(
/home/xlluo/hdd/tabular_ensemble/tabensemb/utils/utils.py:464: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
  df = pd.read_csv(StringIO(s), names=names, sep=sep)
age is Integer and will be treated as a continuous feature.
fnlwgt is Integer and will be treated as a continuous feature.
education-num is Integer and will be treated as a continuous feature.
capital-gain is Integer and will be treated as a continuous feature.
capital-loss is Integer and will be treated as a continuous feature.
hours-per-week is Integer and will be treated as a continuous feature.
The project will be saved to /tmp/tmpdriivjp7/output/adult/2023-09-23-20-35-54-0_UserInputConfig
Dataset size: 19536 6512 6513
Data saved to /tmp/tmpdriivjp7/output/adult/2023-09-23-20-35-54-0_UserInputConfig (data.csv and tabular_data.csv).
{'some_param': 1.1, 'program': None, 'model_subset': None, 'exclude_models': None, 'store_in_harddisk': True}

-------------Run PytorchTabular-------------

Training TabNet
Global seed set to 42
2023-09-23 20:35:56,645 - {pytorch_tabular.tabular_model:473} - INFO - Preparing the DataLoaders
2023-09-23 20:35:56,648 - {pytorch_tabular.tabular_datamodule:290} - INFO - Setting up the datamodule for classification task
2023-09-23 20:35:56,719 - {pytorch_tabular.tabular_model:521} - INFO - Preparing the Model: TabNetModel
2023-09-23 20:35:56,741 - {pytorch_tabular.tabular_model:268} - INFO - Preparing the Trainer
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:589: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
  rank_zero_deprecation(
Auto select gpus: [0]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
2023-09-23 20:35:56,759 - {pytorch_tabular.tabular_model:582} - INFO - Training Started
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name             | Type             | Params
------------------------------------------------------
0 | _embedding_layer | Identity         | 0
1 | _backbone        | TabNetBackbone   | 11.2 K
2 | _head            | Identity         | 0
3 | loss             | CrossEntropyLoss | 0
------------------------------------------------------
11.2 K    Trainable params
0         Non-trainable params
11.2 K    Total params
0.045     Total estimated model params size (MB)
Epoch: 1/300, Train loss: 1.1484, Val loss: 0.7944, Min val loss: 0.7944, Epoch time: 1.112s.
Epoch: 20/300, Train loss: 0.4990, Val loss: 0.4745, Min val loss: 0.4745, Epoch time: 0.782s.
Epoch: 40/300, Train loss: 0.4284, Val loss: 0.4167, Min val loss: 0.4167, Epoch time: 0.783s.
Epoch: 60/300, Train loss: 0.4016, Val loss: 0.3994, Min val loss: 0.3978, Epoch time: 0.880s.
Epoch: 80/300, Train loss: 0.3866, Val loss: 0.3913, Min val loss: 0.3896, Epoch time: 0.786s.
Epoch: 100/300, Train loss: 0.3802, Val loss: 0.3806, Min val loss: 0.3806, Epoch time: 0.912s.
Epoch: 120/300, Train loss: 0.3637, Val loss: 0.3661, Min val loss: 0.3661, Epoch time: 1.019s.
Epoch: 140/300, Train loss: 0.3539, Val loss: 0.3600, Min val loss: 0.3600, Epoch time: 0.922s.
Epoch: 160/300, Train loss: 0.3467, Val loss: 0.3549, Min val loss: 0.3541, Epoch time: 0.925s.
Epoch: 180/300, Train loss: 0.3428, Val loss: 0.3497, Min val loss: 0.3490, Epoch time: 0.859s.
Epoch: 200/300, Train loss: 0.3375, Val loss: 0.3451, Min val loss: 0.3451, Epoch time: 0.645s.
Epoch: 220/300, Train loss: 0.3343, Val loss: 0.3423, Min val loss: 0.3423, Epoch time: 0.672s.
Epoch: 240/300, Train loss: 0.3297, Val loss: 0.3415, Min val loss: 0.3398, Epoch time: 0.717s.
Epoch: 260/300, Train loss: 0.3241, Val loss: 0.3431, Min val loss: 0.3391, Epoch time: 0.924s.
Epoch: 280/300, Train loss: 0.3212, Val loss: 0.3373, Min val loss: 0.3373, Epoch time: 0.800s.
Epoch: 300/300, Train loss: 0.3201, Val loss: 0.3380, Min val loss: 0.3362, Epoch time: 0.754s.
`Trainer.fit` stopped: `max_epochs=300` reached.
2023-09-23 20:40:10,281 - {pytorch_tabular.tabular_model:584} - INFO - Training the model completed
2023-09-23 20:40:10,281 - {pytorch_tabular.tabular_model:1258} - INFO - Loading the best model
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v2.0.0. Please use `lightning_fabric.utilities.cloud_io.get_filesystem` instead.
  rank_zero_deprecation(
Training log_loss loss: 0.31245
Validation log_loss loss: 0.33618
Testing log_loss loss: 0.33183
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/adult/2023-09-23-20-35-54-0_UserInputConfig/trainer.pkl')

-------------PytorchTabular End-------------


-------------Run WideDeep-------------

Training TabNet
Epoch: 1/300, Train loss: 1.1847, Val loss: 0.9394, Min val loss: 0.9394
Epoch: 21/300, Train loss: 0.4870, Val loss: 0.4681, Min val loss: 0.4681
Epoch: 41/300, Train loss: 0.4204, Val loss: 0.4097, Min val loss: 0.4097
Epoch: 61/300, Train loss: 0.3905, Val loss: 0.3855, Min val loss: 0.3852
Epoch: 81/300, Train loss: 0.3784, Val loss: 0.3750, Min val loss: 0.3750
Epoch: 101/300, Train loss: 0.3657, Val loss: 0.3704, Min val loss: 0.3704
Epoch: 121/300, Train loss: 0.3642, Val loss: 0.3670, Min val loss: 0.3659
Epoch: 141/300, Train loss: 0.3569, Val loss: 0.3619, Min val loss: 0.3619
Epoch: 161/300, Train loss: 0.3502, Val loss: 0.3583, Min val loss: 0.3576
Epoch: 181/300, Train loss: 0.3462, Val loss: 0.3555, Min val loss: 0.3555
Epoch: 201/300, Train loss: 0.3392, Val loss: 0.3508, Min val loss: 0.3505
Epoch: 221/300, Train loss: 0.3366, Val loss: 0.3490, Min val loss: 0.3481
Epoch: 241/300, Train loss: 0.3314, Val loss: 0.3454, Min val loss: 0.3446
Epoch: 261/300, Train loss: 0.3282, Val loss: 0.3423, Min val loss: 0.3419
Epoch: 281/300, Train loss: 0.3239, Val loss: 0.3412, Min val loss: 0.3412
Restoring model weights from the end of the best epoch
Training log_loss loss: 0.31902
Validation log_loss loss: 0.33853
Testing log_loss loss: 0.33015
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/adult/2023-09-23-20-35-54-0_UserInputConfig/trainer.pkl')

-------------WideDeep End-------------


-------------Run TabNetFromAbstract-------------

Training TabNet
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/abstract_model.py:75: UserWarning: Device used : cuda
  warnings.warn(f"Device used : {self.device}")
epoch 0  | loss: 0.60097 | val_0_logloss: 0.60262 |  0:00:00s
epoch 20 | loss: 0.40801 | val_0_logloss: 0.40382 |  0:00:09s
epoch 40 | loss: 0.37704 | val_0_logloss: 0.38881 |  0:00:18s
epoch 60 | loss: 0.36342 | val_0_logloss: 0.37497 |  0:00:28s
epoch 80 | loss: 0.34997 | val_0_logloss: 0.36238 |  0:00:37s
epoch 100| loss: 0.34382 | val_0_logloss: 0.35627 |  0:00:46s
epoch 120| loss: 0.33903 | val_0_logloss: 0.35555 |  0:00:55s
epoch 140| loss: 0.33296 | val_0_logloss: 0.3552  |  0:01:04s
epoch 160| loss: 0.32897 | val_0_logloss: 0.35173 |  0:01:13s
epoch 180| loss: 0.32246 | val_0_logloss: 0.34939 |  0:01:23s
epoch 200| loss: 0.31779 | val_0_logloss: 0.34762 |  0:01:32s
epoch 220| loss: 0.31467 | val_0_logloss: 0.34598 |  0:01:41s
epoch 240| loss: 0.31257 | val_0_logloss: 0.34283 |  0:01:50s
epoch 260| loss: 0.31309 | val_0_logloss: 0.34439 |  0:01:59s
epoch 280| loss: 0.30807 | val_0_logloss: 0.33702 |  0:02:08s
Stop training because you reached max_epochs = 300 with best_epoch = 296 and best_val_0_logloss = 0.33691
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/callbacks.py:172: UserWarning: Best weights from best epoch are automatically used!
  warnings.warn(wrn_msg)
Training log_loss loss: 0.29280
Validation log_loss loss: 0.33691
Testing log_loss loss: 0.33218
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/adult/2023-09-23-20-35-54-0_UserInputConfig/trainer.pkl')

-------------TabNetFromAbstract End-------------


-------------Run TabNetFromTorch-------------

Training TabNet
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name                | Type              | Params
----------------------------------------------------------
0 | default_loss_fn     | BCEWithLogitsLoss | 0
1 | default_output_norm | Sigmoid           | 0
2 | network             | TabNet            | 8.0 K
----------------------------------------------------------
8.0 K     Trainable params
0         Non-trainable params
8.0 K     Total params
0.032     Total estimated model params size (MB)
Epoch: 1/300, Train loss: 0.6781, Val loss: 0.6334, Min val loss: 0.6334, Min ES val loss: 0.6334, Epoch time: 0.487s.
Epoch: 20/300, Train loss: 0.4223, Val loss: 0.4170, Min val loss: 0.4170, Min ES val loss: 0.4170, Epoch time: 0.479s.
Epoch: 40/300, Train loss: 0.3829, Val loss: 0.3858, Min val loss: 0.3844, Min ES val loss: 0.3844, Epoch time: 0.567s.
Epoch: 60/300, Train loss: 0.3665, Val loss: 0.3691, Min val loss: 0.3691, Min ES val loss: 0.3691, Epoch time: 0.477s.
Epoch: 80/300, Train loss: 0.3556, Val loss: 0.3600, Min val loss: 0.3600, Min ES val loss: 0.3600, Epoch time: 0.480s.
Epoch: 100/300, Train loss: 0.3466, Val loss: 0.3559, Min val loss: 0.3547, Min ES val loss: 0.3547, Epoch time: 0.481s.
Epoch: 120/300, Train loss: 0.3414, Val loss: 0.3542, Min val loss: 0.3533, Min ES val loss: 0.3533, Epoch time: 0.478s.
Epoch: 140/300, Train loss: 0.3382, Val loss: 0.3500, Min val loss: 0.3498, Min ES val loss: 0.3498, Epoch time: 0.556s.
Epoch: 160/300, Train loss: 0.3307, Val loss: 0.3470, Min val loss: 0.3470, Min ES val loss: 0.3470, Epoch time: 0.481s.
Epoch: 180/300, Train loss: 0.3280, Val loss: 0.3439, Min val loss: 0.3439, Min ES val loss: 0.3439, Epoch time: 0.480s.
Epoch: 200/300, Train loss: 0.3249, Val loss: 0.3408, Min val loss: 0.3402, Min ES val loss: 0.3402, Epoch time: 0.484s.
Epoch: 220/300, Train loss: 0.3206, Val loss: 0.3378, Min val loss: 0.3375, Min ES val loss: 0.3375, Epoch time: 0.480s.
Epoch: 240/300, Train loss: 0.3164, Val loss: 0.3386, Min val loss: 0.3370, Min ES val loss: 0.3370, Epoch time: 0.561s.
Epoch: 260/300, Train loss: 0.3152, Val loss: 0.3374, Min val loss: 0.3342, Min ES val loss: 0.3342, Epoch time: 0.487s.
Epoch: 280/300, Train loss: 0.3144, Val loss: 0.3327, Min val loss: 0.3327, Min ES val loss: 0.3327, Epoch time: 0.558s.
Epoch: 300/300, Train loss: 0.3124, Val loss: 0.3332, Min val loss: 0.3327, Min ES val loss: 0.3327, Epoch time: 0.483s.
`Trainer.fit` stopped: `max_epochs=300` reached.
Training log_loss loss: 0.30412
Validation log_loss loss: 0.33268
Testing log_loss loss: 0.33141
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/adult/2023-09-23-20-35-54-0_UserInputConfig/trainer.pkl')

-------------TabNetFromTorch End-------------

PytorchTabular metrics
TabNet 1/1
WideDeep metrics
TabNet 1/1
TabNetFromAbstract metrics
TabNet 1/1
TabNetFromTorch metrics
TabNet 1/1
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/adult/2023-09-23-20-35-54-0_UserInputConfig/trainer.pkl')
[10]:
Program Model Training F1_SCORE Training PRECISION_SCORE Training RECALL_SCORE Training JACCARD_SCORE Training ACCURACY_SCORE Training BALANCED_ACCURACY_SCORE Training COHEN_KAPPA_SCORE Training HAMMING_LOSS ... Validation ACCURACY_SCORE Validation BALANCED_ACCURACY_SCORE Validation COHEN_KAPPA_SCORE Validation HAMMING_LOSS Validation MATTHEWS_CORRCOEF Validation ZERO_ONE_LOSS Validation ROC_AUC_SCORE Validation LOG_LOSS Validation BRIER_SCORE_LOSS Validation AVERAGE_PRECISION_SCORE
0 PytorchTabular TabNet 0.663081 0.757096 0.589836 0.495977 0.855702 0.764917 0.573066 0.144298 ... 0.847359 0.753237 0.548046 0.152641 0.555038 0.152641 0.895756 0.336181 0.106143 0.855486
1 TabNetFromTorch TabNet 0.672948 0.754224 0.607485 0.507100 0.857852 0.772360 0.583483 0.142148 ... 0.846744 0.759372 0.552975 0.153256 0.557504 0.153256 0.896746 0.332678 0.105144 0.860627
2 WideDeep TabNet 0.664649 0.734311 0.607059 0.497734 0.852529 0.768709 0.571219 0.147471 ... 0.845055 0.762183 0.552930 0.154945 0.556004 0.154945 0.894150 0.338528 0.107826 0.852035
3 TabNetFromAbstract TabNet 0.692785 0.753562 0.641080 0.529970 0.863124 0.787303 0.605467 0.136876 ... 0.845670 0.766947 0.558357 0.154330 0.560548 0.154330 0.896387 0.336909 0.106059 0.857159

4 rows × 44 columns
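In the binary leaderboard above, HAMMING_LOSS and ZERO_ONE_LOSS are identical and add up to 1 with ACCURACY_SCORE (for PytorchTabular on the validation set, 0.152641 and 0.847359), while LOG_LOSS is the quantity reported as "log_loss loss" in the training logs. The following is a minimal sketch of these relations in plain Python, using hypothetical labels and probabilities rather than Adult data; the function names are illustrative and are not tabensemb APIs.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Average negative log-likelihood of the true labels."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == t for t, p in zip(y_true, p_pred))
    return hits / len(y_true)


# Hypothetical labels and predicted probabilities of the positive class.
y_true = [1, 0, 1, 0]
p_pred = [0.9, 0.2, 0.4, 0.1]

acc = accuracy(y_true, p_pred)  # the third sample is misclassified, so 0.75
zero_one = 1.0 - acc            # equals the Hamming loss for single-label tasks
ll = binary_log_loss(y_true, p_pred)

print(f"accuracy={acc:.2f}, zero-one loss={zero_one:.2f}, log loss={ll:.4f}")
```

Log loss depends on the predicted probabilities rather than only on the thresholded labels, so two models with nearly the same accuracy can still differ in LOG_LOSS.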

Finally, we run the multiclass classification task on the Iris dataset:

[11]:
trainer = Trainer(device=device)
iris_columns = [
    "sepal length",
    "sepal width",
    "petal length",
    "petal width",
    "class",
]
cfg = UserConfig.from_uci("Iris", column_names=iris_columns, datafile_name="iris")
trainer.load_config(cfg)
trainer.load_data()
trainer.add_modelbases(
    [
        PytorchTabular(trainer, model_subset=["TabNet"]),
        WideDeep(trainer, model_subset=["TabNet"]),
        TabNetFromAbstract(trainer),
        TabNetFromTorch(trainer),
    ]
)
trainer.train(stderr_to_stdout=True)
trainer.get_leaderboard()
Downloading https://archive.ics.uci.edu/static/public/53/iris.zip to /tmp/tmpdriivjp7/data/Iris.zip
The project will be saved to /tmp/tmpdriivjp7/output/iris/2023-09-23-20-47-59-0_UserInputConfig
Dataset size: 90 30 30
Data saved to /tmp/tmpdriivjp7/output/iris/2023-09-23-20-47-59-0_UserInputConfig (data.csv and tabular_data.csv).
{'some_param': 1.1, 'program': None, 'model_subset': None, 'exclude_models': None, 'store_in_harddisk': True}

-------------Run PytorchTabular-------------

Training TabNet
Global seed set to 42
2023-09-23 20:47:59,284 - {pytorch_tabular.tabular_model:473} - INFO - Preparing the DataLoaders
2023-09-23 20:47:59,284 - {pytorch_tabular.tabular_datamodule:290} - INFO - Setting up the datamodule for classification task
2023-09-23 20:47:59,291 - {pytorch_tabular.tabular_model:521} - INFO - Preparing the Model: TabNetModel
2023-09-23 20:47:59,303 - {pytorch_tabular.tabular_model:268} - INFO - Preparing the Trainer
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:589: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
  rank_zero_deprecation(
Auto select gpus: [0]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
2023-09-23 20:47:59,315 - {pytorch_tabular.tabular_model:582} - INFO - Training Started
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name             | Type             | Params
------------------------------------------------------
0 | _embedding_layer | Identity         | 0
1 | _backbone        | TabNetBackbone   | 5.9 K
2 | _head            | Identity         | 0
3 | loss             | CrossEntropyLoss | 0
------------------------------------------------------
5.9 K     Trainable params
0         Non-trainable params
5.9 K     Total params
0.024     Total estimated model params size (MB)
Epoch: 1/300, Train loss: 1.3527, Val loss: 3.6434, Min val loss: 3.6434, Epoch time: 0.022s.
Epoch: 20/300, Train loss: 0.8093, Val loss: 2.1175, Min val loss: 2.1175, Epoch time: 0.019s.
Epoch: 40/300, Train loss: 0.5522, Val loss: 1.1805, Min val loss: 1.1805, Epoch time: 0.018s.
Epoch: 60/300, Train loss: 0.3874, Val loss: 0.8075, Min val loss: 0.8075, Epoch time: 0.018s.
Epoch: 80/300, Train loss: 0.2738, Val loss: 0.6783, Min val loss: 0.6783, Epoch time: 0.019s.
Epoch: 100/300, Train loss: 0.2028, Val loss: 0.6126, Min val loss: 0.6126, Epoch time: 0.019s.
Epoch: 120/300, Train loss: 0.1537, Val loss: 0.5558, Min val loss: 0.5558, Epoch time: 0.019s.
Epoch: 140/300, Train loss: 0.1162, Val loss: 0.4934, Min val loss: 0.4934, Epoch time: 0.019s.
Epoch: 160/300, Train loss: 0.0899, Val loss: 0.4366, Min val loss: 0.4366, Epoch time: 0.019s.
Epoch: 180/300, Train loss: 0.0703, Val loss: 0.3954, Min val loss: 0.3954, Epoch time: 0.019s.
Epoch: 200/300, Train loss: 0.0560, Val loss: 0.3592, Min val loss: 0.3592, Epoch time: 0.019s.
Epoch: 220/300, Train loss: 0.0453, Val loss: 0.3398, Min val loss: 0.3398, Epoch time: 0.020s.
Epoch: 240/300, Train loss: 0.0373, Val loss: 0.3234, Min val loss: 0.3234, Epoch time: 0.019s.
Epoch: 260/300, Train loss: 0.0309, Val loss: 0.3124, Min val loss: 0.3124, Epoch time: 0.019s.
Epoch: 280/300, Train loss: 0.0260, Val loss: 0.3075, Min val loss: 0.3075, Epoch time: 0.019s.
Epoch: 300/300, Train loss: 0.0222, Val loss: 0.3039, Min val loss: 0.3033, Epoch time: 0.019s.
`Trainer.fit` stopped: `max_epochs=300` reached.
2023-09-23 20:48:07,783 - {pytorch_tabular.tabular_model:584} - INFO - Training the model completed
2023-09-23 20:48:07,783 - {pytorch_tabular.tabular_model:1258} - INFO - Loading the best model
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v2.0.0. Please use `lightning_fabric.utilities.cloud_io.get_filesystem` instead.
  rank_zero_deprecation(
Training log_loss loss: 0.02973
Validation log_loss loss: 0.30334
Testing log_loss loss: 0.08998
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/iris/2023-09-23-20-47-59-0_UserInputConfig/trainer.pkl')

-------------PytorchTabular End-------------


-------------Run WideDeep-------------

Training TabNet
Epoch: 1/300, Train loss: 1.3533, Val loss: 3.6438, Min val loss: 3.6438
Epoch: 21/300, Train loss: 0.7912, Val loss: 2.0427, Min val loss: 2.0427
Epoch: 41/300, Train loss: 0.5492, Val loss: 1.1557, Min val loss: 1.1557
Epoch: 61/300, Train loss: 0.3919, Val loss: 0.8098, Min val loss: 0.8098
Epoch: 81/300, Train loss: 0.2776, Val loss: 0.6744, Min val loss: 0.6744
Epoch: 101/300, Train loss: 0.2061, Val loss: 0.6021, Min val loss: 0.6021
Epoch: 121/300, Train loss: 0.1560, Val loss: 0.5320, Min val loss: 0.5320
Epoch: 141/300, Train loss: 0.1184, Val loss: 0.4719, Min val loss: 0.4719
Epoch: 161/300, Train loss: 0.0917, Val loss: 0.4163, Min val loss: 0.4163
Epoch: 181/300, Train loss: 0.0720, Val loss: 0.3811, Min val loss: 0.3811
Epoch: 201/300, Train loss: 0.0574, Val loss: 0.3475, Min val loss: 0.3475
Epoch: 221/300, Train loss: 0.0466, Val loss: 0.3298, Min val loss: 0.3298
Epoch: 241/300, Train loss: 0.0386, Val loss: 0.3141, Min val loss: 0.3141
Epoch: 261/300, Train loss: 0.0324, Val loss: 0.3057, Min val loss: 0.3057
Epoch: 281/300, Train loss: 0.0276, Val loss: 0.3003, Min val loss: 0.3003
Restoring model weights from the end of the best epoch
Training log_loss loss: 0.03166
Validation log_loss loss: 0.29784
Testing log_loss loss: 0.08862
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/iris/2023-09-23-20-47-59-0_UserInputConfig/trainer.pkl')

-------------WideDeep End-------------


-------------Run TabNetFromAbstract-------------

Training TabNet
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/abstract_model.py:75: UserWarning: Device used : cuda
  warnings.warn(f"Device used : {self.device}")
epoch 0  | loss: 1.56647 | val_0_logloss: 2.77843 |  0:00:00s
epoch 20 | loss: 0.84741 | val_0_logloss: 1.56011 |  0:00:00s
epoch 40 | loss: 0.57801 | val_0_logloss: 1.30489 |  0:00:00s
epoch 60 | loss: 0.42103 | val_0_logloss: 0.94241 |  0:00:01s
epoch 80 | loss: 0.30948 | val_0_logloss: 0.82352 |  0:00:01s
epoch 100| loss: 0.22945 | val_0_logloss: 0.67582 |  0:00:01s
epoch 120| loss: 0.17005 | val_0_logloss: 0.56843 |  0:00:02s
epoch 140| loss: 0.12258 | val_0_logloss: 0.43708 |  0:00:02s
epoch 160| loss: 0.08765 | val_0_logloss: 0.37523 |  0:00:03s
epoch 180| loss: 0.06136 | val_0_logloss: 0.32675 |  0:00:03s
epoch 200| loss: 0.04456 | val_0_logloss: 0.29909 |  0:00:03s
epoch 220| loss: 0.03383 | val_0_logloss: 0.28244 |  0:00:04s
epoch 240| loss: 0.02584 | val_0_logloss: 0.26869 |  0:00:04s
epoch 260| loss: 0.02049 | val_0_logloss: 0.2337  |  0:00:05s
epoch 280| loss: 0.01678 | val_0_logloss: 0.20078 |  0:00:05s
Stop training because you reached max_epochs = 300 with best_epoch = 299 and best_val_0_logloss = 0.17915
/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/callbacks.py:172: UserWarning: Best weights from best epoch are automatically used!
  warnings.warn(wrn_msg)
Training log_loss loss: 0.02356
Validation log_loss loss: 0.17915
Testing log_loss loss: 0.06823
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/iris/2023-09-23-20-47-59-0_UserInputConfig/trainer.pkl')

-------------TabNetFromAbstract End-------------


-------------Run TabNetFromTorch-------------

Training TabNet
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name                | Type             | Params
---------------------------------------------------------
0 | default_loss_fn     | CrossEntropyLoss | 0
1 | default_output_norm | Softmax          | 0
2 | network             | TabNet           | 5.9 K
---------------------------------------------------------
5.9 K     Trainable params
0         Non-trainable params
5.9 K     Total params
0.024     Total estimated model params size (MB)
Epoch: 1/300, Train loss: 1.3527, Val loss: 3.6726, Min val loss: 3.6726, Min ES val loss: 3.6726, Epoch time: 0.020s.
Epoch: 20/300, Train loss: 0.8086, Val loss: 2.0715, Min val loss: 2.0715, Min ES val loss: 2.0715, Epoch time: 0.017s.
Epoch: 40/300, Train loss: 0.5588, Val loss: 1.1552, Min val loss: 1.1552, Min ES val loss: 1.1552, Epoch time: 0.017s.
Epoch: 60/300, Train loss: 0.3996, Val loss: 0.8152, Min val loss: 0.8152, Min ES val loss: 0.8152, Epoch time: 0.017s.
Epoch: 80/300, Train loss: 0.2837, Val loss: 0.6745, Min val loss: 0.6745, Min ES val loss: 0.6745, Epoch time: 0.017s.
Epoch: 100/300, Train loss: 0.2098, Val loss: 0.6031, Min val loss: 0.6031, Min ES val loss: 0.6031, Epoch time: 0.018s.
Epoch: 120/300, Train loss: 0.1586, Val loss: 0.5270, Min val loss: 0.5270, Min ES val loss: 0.5270, Epoch time: 0.017s.
Epoch: 140/300, Train loss: 0.1203, Val loss: 0.4580, Min val loss: 0.4580, Min ES val loss: 0.4580, Epoch time: 0.020s.
Epoch: 160/300, Train loss: 0.0926, Val loss: 0.4144, Min val loss: 0.4144, Min ES val loss: 0.4144, Epoch time: 0.017s.
Epoch: 180/300, Train loss: 0.0724, Val loss: 0.3771, Min val loss: 0.3771, Min ES val loss: 0.3771, Epoch time: 0.017s.
Epoch: 200/300, Train loss: 0.0578, Val loss: 0.3446, Min val loss: 0.3446, Min ES val loss: 0.3446, Epoch time: 0.017s.
Epoch: 220/300, Train loss: 0.0467, Val loss: 0.3253, Min val loss: 0.3253, Min ES val loss: 0.3253, Epoch time: 0.016s.
Epoch: 240/300, Train loss: 0.0385, Val loss: 0.3145, Min val loss: 0.3145, Min ES val loss: 0.3145, Epoch time: 0.017s.
Epoch: 260/300, Train loss: 0.0322, Val loss: 0.3053, Min val loss: 0.3053, Min ES val loss: 0.3053, Epoch time: 0.016s.
Epoch: 280/300, Train loss: 0.0273, Val loss: 0.2999, Min val loss: 0.2998, Min ES val loss: 0.2998, Epoch time: 0.016s.
Epoch: 300/300, Train loss: 0.0235, Val loss: 0.2978, Min val loss: 0.2964, Min ES val loss: 0.2964, Epoch time: 0.016s.
`Trainer.fit` stopped: `max_epochs=300` reached.
Training log_loss loss: 0.03299
Validation log_loss loss: 0.29635
Testing log_loss loss: 0.10517
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/iris/2023-09-23-20-47-59-0_UserInputConfig/trainer.pkl')

-------------TabNetFromTorch End-------------

PytorchTabular metrics
TabNet 1/1
WideDeep metrics
TabNet 1/1
TabNetFromAbstract metrics
TabNet 1/1
TabNetFromTorch metrics
TabNet 1/1
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/iris/2023-09-23-20-47-59-0_UserInputConfig/trainer.pkl')
[11]:
  | Program | Model | Training ACCURACY_SCORE | Training BALANCED_ACCURACY_SCORE | Training COHEN_KAPPA_SCORE | Training HAMMING_LOSS | Training MATTHEWS_CORRCOEF | Training ZERO_ONE_LOSS | Training PRECISION_SCORE_MACRO | Training PRECISION_SCORE_MICRO | ... | Validation F1_SCORE_MICRO | Validation F1_SCORE_WEIGHTED | Validation JACCARD_SCORE_MACRO | Validation JACCARD_SCORE_MICRO | Validation JACCARD_SCORE_WEIGHTED | Validation TOP_K_ACCURACY_SCORE | Validation LOG_LOSS | Validation ROC_AUC_SCORE_OVR_MACRO | Validation ROC_AUC_SCORE_OVR_WEIGHTED | Validation ROC_AUC_SCORE_OVO
0 | PytorchTabular | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.933333 | 0.934656 | 0.888889 | 0.875000 | 0.88000 | 1.0 | 0.303345 | 0.975059 | 0.966566 | 0.972956
1 | WideDeep | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.933333 | 0.934656 | 0.888889 | 0.875000 | 0.88000 | 1.0 | 0.297838 | 0.980985 | 0.975455 | 0.979940
2 | TabNetFromAbstract | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.900000 | 0.901217 | 0.837500 | 0.818182 | 0.82625 | 1.0 | 0.179150 | 0.989874 | 0.988788 | 0.990417
3 | TabNetFromTorch | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.933333 | 0.934656 | 0.888889 | 0.875000 | 0.88000 | 1.0 | 0.296352 | 0.980985 | 0.975455 | 0.979940

4 rows × 71 columns
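
With 71 columns, the leaderboard is hard to compare at a glance. Assuming the returned object is a pandas DataFrame (the `4 rows × 71 columns` footer suggests so, but this is an assumption), a few columns can be selected and sorted. The sketch below does not call tabensemb; it rebuilds a small frame from the values printed above so that it runs on its own, and the names `leaderboard`, `key_columns`, and `summary` are only illustrative.

```python
import pandas as pd

# Values copied from the leaderboard printed above; only 4 of the 71 columns are kept.
leaderboard = pd.DataFrame(
    {
        "Program": ["PytorchTabular", "WideDeep", "TabNetFromAbstract", "TabNetFromTorch"],
        "Model": ["TabNet"] * 4,
        "Validation LOG_LOSS": [0.303345, 0.297838, 0.179150, 0.296352],
        "Validation ROC_AUC_SCORE_OVO": [0.972956, 0.979940, 0.990417, 0.979940],
    }
)

# Keep the identifying columns plus the metrics of interest, lowest log loss first.
key_columns = ["Program", "Model", "Validation LOG_LOSS", "Validation ROC_AUC_SCORE_OVO"]
summary = (
    leaderboard[key_columns]
    .sort_values("Validation LOG_LOSS")
    .reset_index(drop=True)
)
print(summary.to_string(index=False))
```

The same selection should apply to the real leaderboard if it exposes these column names, as the printed header indicates.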

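Results show that the models perform much worse on the validation set than on the testing set: the validation log loss is about 0.18 to 0.30, whereas the testing log loss is about 0.07 to 0.11. A gap of this size likely reflects the noise of a single random split on a dataset as small as iris, so we recommend using cross-validation to get a more reliable leaderboard: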

# trainer.train(stderr_to_stdout=True)  # No need to run `train`
trainer.get_leaderboard(cross_validation=5, split_type="cv")
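
To illustrate why cross-validation gives steadier estimates than a single split, here is a minimal stand-alone sketch of k-fold evaluation. It is not tabensemb's implementation: the helpers `k_fold_indices` and `cross_validate` are hypothetical, the data are random toy values, and the "model" is a baseline that predicts the training mean.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k interleaved, disjoint folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def cross_validate(values, k=5, seed=0):
    """Score a mean-predictor baseline on each fold and return the per-fold MSEs."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(values), k, seed):
        prediction = statistics.mean([values[i] for i in train_idx])
        mse = statistics.mean([(values[i] - prediction) ** 2 for i in val_idx])
        scores.append(mse)
    return scores


# Toy data with 150 samples (an assumed size, chosen only for illustration).
rng = random.Random(0)
toy_values = [rng.gauss(0.0, 1.0) for _ in range(150)]

fold_scores = cross_validate(toy_values, k=5)
print("per-fold MSE:", [round(s, 3) for s in fold_scores])
print("mean:", round(statistics.mean(fold_scores), 3))
print("std: ", round(statistics.stdev(fold_scores), 3))
```

The per-fold scores differ from one another, which is the variation a single split hides; reporting their mean together with the spread gives a fairer basis for comparing model bases.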