{ "cells": [ { "cell_type": "markdown", "source": [ "# Customized model base\n", "\n", "For researchers or model base developers, a basic need is to compare their own models with the existing benchmarks in `tabensemb`. In this part, we build a model base within the framework, assuming that we want to integrate `TabNet` ([from the dreamquark-ai team](https://github.com/dreamquark-ai/tabnet)) into `tabensemb` for regression and classification tasks (`pytorch_tabular` and `pytorch_widedeep` have already done so).\n", "\n", "**Remark**: For `PyTorch`-based models, we have implemented most requirements of the framework so that users can integrate `torch.nn.Module`s more conveniently." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "## Example: Implement TabNet as a model base from scratch" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 1, "outputs": [], "source": [ "import tabensemb\n", "import numpy as np\n", "import torch\n", "import os\n", "from tempfile import TemporaryDirectory\n", "\n", "temp_path = TemporaryDirectory()\n", "tabensemb.setting[\"default_output_path\"] = os.path.join(temp_path.name, \"output\")\n", "tabensemb.setting[\"default_config_path\"] = os.path.join(temp_path.name, \"configs\")\n", "tabensemb.setting[\"default_data_path\"] = os.path.join(temp_path.name, \"data\")\n", "\n", "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "All model bases inherit from `AbstractModel` and implement its methods. If a required method is not implemented, a `NotImplementedError` is raised when it is called." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 2, "outputs": [], "source": [ "from tabensemb.model import AbstractModel" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "We use `scikit-optimize` (https://github.com/scikit-optimize/scikit-optimize) for Bayesian hyperparameter optimization, so we import its search-space classes." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 3, "outputs": [], "source": [ "from skopt.space import Integer, Real, Categorical" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "First, we define the initialization of the model base. Always remember to pass all `*args` and `**kwargs` to `__init__` of `AbstractModel`. You can also do other things in `__init__`. All `*args` and `**kwargs` (including arguments like the `some_param` shown below) are recorded in `self.init_params`.\n", "\n", "```python\n", "class TabNetFromAbstract(AbstractModel):\n", " def __init__(self, *args, some_param=1.1, **kwargs):\n", " super(TabNetFromAbstract, self).__init__(*args, **kwargs)\n", " # Do something else here\n", " self.some_param = some_param\n", " print(self.init_params)\n", "```\n", "\n", "Next, we define the name of the model base and the names of all models available in it.\n", "\n", "```python\n", " def _get_program_name(self):\n", " return \"TabNetFromAbstract\"\n", "\n", " def _get_model_names(self):\n", " return [\"TabNet\"]\n", "```\n", "\n", "For each model in the model base, the program requests the initial hyperparameters of the model and their search spaces. 
They are defined as\n", "\n", "```python\n", " def _space(self, model_name):\n", " return [\n", " Integer(low=4, high=16, prior=\"uniform\", name=\"n_d\", dtype=int),\n", " Integer(low=4, high=16, prior=\"uniform\", name=\"n_a\", dtype=int),\n", " Integer(low=1, high=6, prior=\"uniform\", name=\"n_steps\", dtype=int),\n", " Real(low=1.0, high=1.5, prior=\"uniform\", name=\"gamma\"),\n", " Integer(\n", " low=1, high=4, prior=\"uniform\", name=\"n_independent\", dtype=int\n", " ),\n", " Integer(low=1, high=4, prior=\"uniform\", name=\"n_shared\", dtype=int),\n", " ] + self.trainer.SPACE\n", "\n", " def _initial_values(self, model_name):\n", " return {\n", " \"n_d\": 8,\n", " \"n_a\": 8,\n", " \"n_steps\": 3,\n", " \"gamma\": 1.3,\n", " \"n_independent\": 2,\n", " \"n_shared\": 2,\n", " \"lr\": self.trainer.args[\"lr\"],\n", " \"weight_decay\": self.trainer.args[\"weight_decay\"],\n", " \"batch_size\": self.trainer.args[\"batch_size\"],\n", " }\n", "```\n", "\n", "Before training, each model base has its own way of processing the dataset.\n", "\n", "`_train_data_preprocess` will return the processed dataset according to a given `Trainer` which provides all training information and data required. In this example, `X_train/X_val/X_test` represent training/validation/testing sets, and `y_train/y_val/y_test` represent corresponding labels.\n", "\n", "**Remark**: The tabular dataset has gone through all processing stages defined in the `DataModule` inside the trainer **except scaling**. 
Call `self.trainer.datamodule.data_transform(df, scaler_only=True)` to scale it using the trained scaler if no scaling stage is defined internally in the model.\n", "\n", "```python\n", " def _train_data_preprocess(self, model_name):\n", " data = self.trainer.datamodule\n", " all_feature_names = data.all_feature_names\n", "\n", " X_train = data.data_transform(data.X_train, scaler_only=True)[\n", " all_feature_names\n", " ].values.astype(np.float64)\n", " X_val = data.data_transform(data.X_val, scaler_only=True)[\n", " all_feature_names\n", " ].values.astype(np.float64)\n", " X_test = data.data_transform(data.X_test, scaler_only=True)[\n", " all_feature_names\n", " ].values.astype(np.float64)\n", " y_train = data.y_train.astype(np.float64)\n", " y_val = data.y_val.astype(np.float64)\n", " y_test = data.y_test.astype(np.float64)\n", "\n", " return {\n", " \"X_train\": X_train,\n", " \"y_train\": y_train,\n", " \"X_val\": X_val,\n", " \"y_val\": y_val,\n", " \"X_test\": X_test,\n", " \"y_test\": y_test,\n", " }\n", "```\n", "\n", "Correspondingly, `_data_preprocess` will process an upcoming new dataset, including the tabular data `df` containing continuous features and categorical features, and unstacked derived data `derived_data` (multi-modal data or something else depending on the configuration introduced in \"Using data functionalities\"). The returned value should have the same structure as the `X_test` returned in `_train_data_preprocess`.\n", "\n", "```python\n", " def _data_preprocess(self, df, derived_data, model_name):\n", " return self.trainer.datamodule.data_transform(df, scaler_only=True)[\n", " self.trainer.all_feature_names\n", " ].values.astype(np.float64)\n", "```\n", "\n", "The program will pass a selected set of hyperparameters as `kwargs` to initialize a model, train a model, and predict using the model. 
The returned `model` will be stored locally and reloaded for evaluation and inference, so make sure it contains all the information needed to make predictions.\n", "\n", "Here we initialize the model using information contained in the `DataModule` instance, including the indices of categorical features `cat_idxs`, the number of categories of each categorical feature `cat_dims`, the current task `task` (possible values are \"regression\", \"binary\", and \"multiclass\"), the device on which the model is trained `self.trainer.device`, and the hyperparameters `kwargs`. `model_name` is ignored because we only have one model in the model base. All model bases should at least follow the guidance of `self.trainer.device`, `self.trainer.datamodule.task`, `model_name`, and `kwargs` so that all models are trained in a consistent way within the framework.\n", "\n", "**Remark**: In `DataModule.cat_num_unique` and `DataModule.cat_feature_mapping`, the category of unknown or missing values is already included as `-1` for integer-like categorical features and `UNK` for string-like categorical features.\n", "\n", "```python\n", " def _new_model(self, model_name, verbose, **kwargs):\n", " from pytorch_tabnet.tab_model import TabNetRegressor, TabNetClassifier\n", "\n", " datamodule = self.trainer.datamodule\n", " cat_idxs = list(range(len(datamodule.cont_feature_names), len(datamodule.all_feature_names)))\n", " cat_dims = datamodule.cat_num_unique\n", " self.task = datamodule.task\n", " init_kwargs = dict(\n", " verbose=tabensemb.setting[\"verbose_per_epoch\"] if verbose else 0,\n", " optimizer_params={\n", " \"lr\": kwargs[\"lr\"],\n", " \"weight_decay\": kwargs[\"weight_decay\"],\n", " },\n", " cat_idxs=cat_idxs,\n", " cat_dims=cat_dims,\n", " cat_emb_dim=3,\n", " device_name=self.trainer.device,\n", " )\n", " if self.trainer.datamodule.task == \"regression\":\n", " model = TabNetRegressor(**init_kwargs)\n", " else:\n", " model = TabNetClassifier(**init_kwargs)\n", "\n", " 
model.set_params(\n", " **{\n", " \"n_d\": kwargs[\"n_d\"],\n", " \"n_a\": kwargs[\"n_a\"],\n", " \"n_steps\": kwargs[\"n_steps\"],\n", " \"gamma\": kwargs[\"gamma\"],\n", " \"n_independent\": kwargs[\"n_independent\"],\n", " \"n_shared\": kwargs[\"n_shared\"],\n", " }\n", " )\n", " return model\n", "```\n", "\n", "**Remark**: `kwargs` contains all keys defined in `_initial_values`. If a parameter named `batch_size` is included, `kwargs` also contains a key named `original_batch_size`. The values of `batch_size` and `original_batch_size` may differ if the program finds that the requested batch size would make the mini-batches tiny. The threshold is defined by `self.limit_batch_size` (defaults to 6). A tiny batch might cause errors in some models, so it is better to use the modified `batch_size` value.\n", "\n", "The framework will pass `X_train`, `y_train`, `X_val`, and `y_val` from `_train_data_preprocess` to the following `_train_single_model` method, along with some other arguments describing the current training stage. `epoch` is the number of epochs to train the model. `warm_start=True` means the passed model is already trained and should be fine-tuned on a new dataset. `in_bayes_opt=True` means that the passed `kwargs` is selected by a Bayesian hyperparameter optimization step, and a simplified training routine is needed to reduce optimization time, so we set `max_epochs` to the \"bayes_epoch\" value in the configuration.\n", "\n", "**Remark**: `epoch` will be `self.trainer.args[\"bayes_epoch\"]` if `in_bayes_opt=True`, and `self.trainer.args[\"epoch\"]` otherwise.\n", "\n", "**Remark**: If you want to plot the training/validation loss curves using the `Trainer.plot_loss` method for your own model base, you should record the losses as lists in `self.train_losses`, `self.val_losses`, and `self.earlystopping_epoch` after training in `_train_single_model`. 
See the source code of `PytorchTabular`, `WideDeep`, or `TorchModel` for details.\n", "\n", "```python\n", " def _train_single_model(\n", " self,\n", " model,\n", " model_name,\n", " epoch,\n", " X_train,\n", " y_train,\n", " X_val,\n", " y_val,\n", " verbose,\n", " warm_start,\n", " in_bayes_opt,\n", " **kwargs,\n", " ):\n", " eval_set = [(X_val, y_val if self.task == \"regression\" else y_val.flatten())]\n", "\n", " model.fit(\n", " X_train,\n", " y_train if self.task == \"regression\" else y_train.flatten(),\n", " eval_set=eval_set,\n", " max_epochs=epoch if not in_bayes_opt else self.trainer.args[\"bayes_epoch\"],\n", " patience=self.trainer.args[\"patience\"],\n", " loss_fn=torch.nn.MSELoss()\n", " if self.task == \"regression\"\n", " else torch.nn.CrossEntropyLoss(),\n", " eval_metric=[\"mse\" if self.task == \"regression\" else \"logloss\"],\n", " batch_size=int(kwargs[\"batch_size\"]),\n", " warm_start=warm_start,\n", " drop_last=False,\n", " )\n", "```\n", "\n", "To evaluate or use the model, we define `_pred_single_model`, which receives the `X_test` processed by `_train_data_preprocess` or `_data_preprocess` as an argument. The returned value should always be a two-dimensional `np.ndarray`. For binary classification tasks, the output is the probability of the positive (1) class, and for multiclass classification, the output is the probability of each class. 
`AbstractModel` automatically deals with the probabilities for metrics and final outputs.\n", "\n", "```python\n", " def _pred_single_model(self, model, X_test, verbose, **kwargs):\n", " if self.task == \"regression\":\n", " return model.predict(X_test).reshape(-1, 1)\n", " elif self.task == \"binary\":\n", " return model.predict_proba(X_test)[:, 1].reshape(-1, 1)\n", " else:\n", " return model.predict_proba(X_test)\n", "```\n", "\n", "The full code is as follows:" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 4, "outputs": [], "source": [ "class TabNetFromAbstract(AbstractModel):\n", " def __init__(self, *args, some_param=1.1, **kwargs):\n", " super(TabNetFromAbstract, self).__init__(*args, **kwargs)\n", " # Do something else here\n", " self.some_param = some_param\n", " print(self.init_params)\n", "\n", " def _get_program_name(self):\n", " return \"TabNetFromAbstract\"\n", "\n", " def _get_model_names(self):\n", " return [\"TabNet\"]\n", "\n", " def _space(self, model_name):\n", " return [\n", " Integer(low=4, high=16, prior=\"uniform\", name=\"n_d\", dtype=int),\n", " Integer(low=4, high=16, prior=\"uniform\", name=\"n_a\", dtype=int),\n", " Integer(low=1, high=6, prior=\"uniform\", name=\"n_steps\", dtype=int),\n", " Real(low=1.0, high=1.5, prior=\"uniform\", name=\"gamma\"),\n", " Integer(\n", " low=1, high=4, prior=\"uniform\", name=\"n_independent\", dtype=int\n", " ),\n", " Integer(low=1, high=4, prior=\"uniform\", name=\"n_shared\", dtype=int),\n", " ] + self.trainer.SPACE\n", "\n", " def _initial_values(self, model_name):\n", " return {\n", " \"n_d\": 8,\n", " \"n_a\": 8,\n", " \"n_steps\": 3,\n", " \"gamma\": 1.3,\n", " \"n_independent\": 2,\n", " \"n_shared\": 2,\n", " \"lr\": self.trainer.args[\"lr\"],\n", " \"weight_decay\": self.trainer.args[\"weight_decay\"],\n", " \"batch_size\": self.trainer.args[\"batch_size\"],\n", " }\n", "\n", " def _train_data_preprocess(self, 
model_name):\n", " data = self.trainer.datamodule\n", " all_feature_names = data.all_feature_names\n", "\n", " X_train = data.data_transform(data.X_train, scaler_only=True)[\n", " all_feature_names\n", " ].values.astype(np.float64)\n", " X_val = data.data_transform(data.X_val, scaler_only=True)[\n", " all_feature_names\n", " ].values.astype(np.float64)\n", " X_test = data.data_transform(data.X_test, scaler_only=True)[\n", " all_feature_names\n", " ].values.astype(np.float64)\n", " y_train = data.y_train.astype(np.float64)\n", " y_val = data.y_val.astype(np.float64)\n", " y_test = data.y_test.astype(np.float64)\n", "\n", " return {\n", " \"X_train\": X_train,\n", " \"y_train\": y_train,\n", " \"X_val\": X_val,\n", " \"y_val\": y_val,\n", " \"X_test\": X_test,\n", " \"y_test\": y_test,\n", " }\n", "\n", " def _data_preprocess(self, df, derived_data, model_name):\n", " return self.trainer.datamodule.data_transform(df, scaler_only=True)[\n", " self.trainer.all_feature_names\n", " ].values.astype(np.float64)\n", "\n", " def _new_model(self, model_name, verbose, **kwargs):\n", " from pytorch_tabnet.tab_model import TabNetRegressor, TabNetClassifier\n", "\n", " datamodule = self.trainer.datamodule\n", " cat_idxs = list(range(len(datamodule.cont_feature_names), len(datamodule.all_feature_names)))\n", " cat_dims = datamodule.cat_num_unique\n", " self.task = datamodule.task\n", " init_kwargs = dict(\n", " verbose=tabensemb.setting[\"verbose_per_epoch\"] if verbose else 0,\n", " optimizer_params={\n", " \"lr\": kwargs[\"lr\"],\n", " \"weight_decay\": kwargs[\"weight_decay\"],\n", " },\n", " cat_idxs=cat_idxs,\n", " cat_dims=cat_dims,\n", " cat_emb_dim=3,\n", " device_name=self.trainer.device,\n", " )\n", " if self.trainer.datamodule.task == \"regression\":\n", " model = TabNetRegressor(**init_kwargs)\n", " else:\n", " model = TabNetClassifier(**init_kwargs)\n", "\n", " model.set_params(\n", " **{\n", " \"n_d\": kwargs[\"n_d\"],\n", " \"n_a\": kwargs[\"n_a\"],\n", " 
\"n_steps\": kwargs[\"n_steps\"],\n", " \"gamma\": kwargs[\"gamma\"],\n", " \"n_independent\": kwargs[\"n_independent\"],\n", " \"n_shared\": kwargs[\"n_shared\"],\n", " }\n", " )\n", " return model\n", "\n", " def _train_single_model(\n", " self,\n", " model,\n", " model_name,\n", " epoch,\n", " X_train,\n", " y_train,\n", " X_val,\n", " y_val,\n", " verbose,\n", " warm_start,\n", " in_bayes_opt,\n", " **kwargs,\n", " ):\n", " eval_set = [(X_val, y_val if self.task == \"regression\" else y_val.flatten())]\n", "\n", " model.fit(\n", " X_train,\n", " y_train if self.task == \"regression\" else y_train.flatten(),\n", " eval_set=eval_set,\n", " max_epochs=epoch if not in_bayes_opt else self.trainer.args[\"bayes_epoch\"],\n", " patience=self.trainer.args[\"patience\"],\n", " loss_fn=torch.nn.MSELoss()\n", " if self.task == \"regression\"\n", " else torch.nn.CrossEntropyLoss(),\n", " eval_metric=[\"mse\" if self.task == \"regression\" else \"logloss\"],\n", " batch_size=int(kwargs[\"batch_size\"]),\n", " warm_start=warm_start,\n", " drop_last=False,\n", " )\n", "\n", " def _pred_single_model(self, model, X_test, verbose, **kwargs):\n", " if self.task == \"regression\":\n", " return model.predict(X_test).reshape(-1, 1)\n", " elif self.task == \"binary\":\n", " return model.predict_proba(X_test)[:, 1].reshape(-1, 1)\n", " else:\n", " return model.predict_proba(X_test)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## Example: Implement TabNet as a `PyTorch`-based model\n", "\n", "Indeed, the example shown above uses `TabNetRegressor` and `TabNetClassifier` from `pytorch_tabnet` that have already implemented the training and evaluation procedures over the `torch.nn.Module` subclass called `TabNet`. We can also directly build a model base for `nn.Module`s with less effort. 
These model bases inherit `TorchModel`, and `nn.Module`s should inherit `AbstractNN` (migrating existing code into this framework only requires changing a few lines)." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 5, "outputs": [], "source": [ "from tabensemb.model import TorchModel, AbstractNN\n", "from pytorch_tabnet.tab_network import TabNet\n", "from typing import Dict" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "First, we implement an `AbstractNN` (which inherits `pytorch_lightning.LightningModule`, which in turn inherits `torch.nn.Module`).\n", "\n", "We initialize the model in `__init__`. `kwargs` depends on the arguments passed from `_new_model`, which will be implemented later, but it should at least contain all keys defined in `_initial_values`, as introduced in a remark above.\n", "\n", "Remember to call `super().__init__`. This is no more difficult than initializing a `LightningModule`.\n", "\n", "You can use `self.hparams.some_param` to get a hyperparameter (equivalent to `kwargs[\"some_param\"]`) if you call `super().__init__(datamodule, **kwargs)` instead of `super().__init__(datamodule)`, because `AbstractNN` uses the `LightningModule.save_hyperparameters` utility (which you should **not** call in your own `__init__`).\n", "\n", "**Remark**: To migrate existing `nn.Module` code (Part 1)\n", "\n", "* Change `class SomeModel(nn.Module)` to `class SomeModel(AbstractNN)`.\n", "* Change the indices of categorical features to `[0, 1, ..., self.n_cat-1]` and the numbers of unique categories of categorical features to `self.cat_num_unique`.\n", "* Change the number of input dimensions to `self.n_cont+self.n_cat` and the number of output dimensions to `self.n_outputs`.\n", "\n", "```python\n", "class TabNetNN(AbstractNN):\n", " def __init__(\n", " self,\n", " datamodule,\n", " **kwargs,\n", " ):\n", " super(TabNetNN, 
self).__init__(datamodule, **kwargs)\n", " self.network = TabNet(\n", " input_dim=self.n_cont+self.n_cat,\n", " output_dim=self.n_outputs,\n", " n_d=self.hparams.n_d,\n", " n_a=self.hparams.n_a,\n", " n_steps=self.hparams.n_steps,\n", " gamma=self.hparams.gamma,\n", " cat_idxs=list(range(self.n_cat)),\n", " cat_dims=self.cat_num_unique,\n", " cat_emb_dim=[3] * self.n_cat,\n", " n_independent=self.hparams.n_independent,\n", " n_shared=self.hparams.n_shared,\n", " )\n", "```\n", "\n", "Then we implement the computation step of the model. We implement `_forward` instead of `forward`, because `forward` is already implemented by `AbstractNN` and is used to automatically process the inputs and outputs of `_forward`.\n", "\n", "There are two input arguments for `_forward`: `x` and `derived_tensors`. `x` is a tensor of continuous features. `derived_tensors` is a dictionary containing the contents of `datamodule.derived_data` (which is introduced in the last two sections of the \"Using data functionalities\" part), including categorical data (with the key \"categorical\" if there is any categorical feature), a flag for each data point indicating whether it is an augmented one (with the key \"augmented\" if there is any augmented data point), and derived unstacked data (with the key `derived_name` specified in the configuration). This is how multimodal data is passed to a deep learning model in our framework.\n", "\n", "In the following lines, we build the input of the neural network by concatenating the categorical features `derived_tensors[\"categorical\"]` and the continuous features `x`, with the categorical features placed first (that's why the indices of categorical features are set to `[0, 1, ..., self.n_cat-1]`), calculate the output of the network, and return the output.\n", "\n", "**Remark**: The default loss function is `torch.nn.MSELoss` for regression, `torch.nn.BCEWithLogitsLoss` for binary classification, and `torch.nn.CrossEntropyLoss` for multiclass classification. 
To change this behavior, implement `self.loss_fn`. See the \"Advanced customized model base\" part for details.\n", "\n", "**Remark**: For binary classification tasks, `self.n_outputs=1` so we expect the logits of the positive class (instead of a normalized probability). The output is then used to calculate `torch.nn.BCEWithLogitsLoss` by default. For multiclass classification tasks, `self.n_outputs` is the number of classes, so we expect the logits of these classes (instead of probabilities from `Softmax` or something else). The output is then used to calculate `torch.nn.CrossEntropyLoss` by default.\n", "\n", "**Remark**: To migrate existing `nn.Module` code (Part 2)\n", "\n", "* Change `forward` to `_forward`\n", "* Get categorical features from `derived_tensors`\n", "* Get multimodal features from `derived_tensors` (and load multimodal features using data derivers)\n", "* Return logits instead of probabilities\n", "\n", "```python\n", " def _forward(\n", " self, x: torch.Tensor, derived_tensors: Dict[str, torch.Tensor]\n", " ) -> torch.Tensor:\n", " x_cont = x\n", " if \"categorical\" in derived_tensors.keys():\n", " x_cat = derived_tensors[\"categorical\"]\n", " x_in = torch.concat([x_cat, x_cont], dim=-1)\n", " else:\n", " x_in = x_cont\n", " output, _ = self.network(x_in)\n", " return output\n", "```\n", "\n", "The code is as follows:" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 6, "outputs": [], "source": [ "class TabNetNN(AbstractNN):\n", " def __init__(\n", " self,\n", " datamodule,\n", " **kwargs,\n", " ):\n", " super(TabNetNN, self).__init__(datamodule, **kwargs)\n", " self.network = TabNet(\n", " input_dim=self.n_cont+self.n_cat,\n", " output_dim=self.n_outputs,\n", " n_d=self.hparams.n_d,\n", " n_a=self.hparams.n_a,\n", " n_steps=self.hparams.n_steps,\n", " gamma=self.hparams.gamma,\n", " cat_idxs=list(range(self.n_cat)),\n", " cat_dims=self.cat_num_unique,\n", " cat_emb_dim=[3] 
* self.n_cat,\n", " n_independent=self.hparams.n_independent,\n", " n_shared=self.hparams.n_shared,\n", " )\n", "\n", " def _forward(\n", " self, x: torch.Tensor, derived_tensors: Dict[str, torch.Tensor]\n", " ) -> torch.Tensor:\n", " x_cont = x\n", " if \"categorical\" in derived_tensors.keys():\n", " x_cat = derived_tensors[\"categorical\"]\n", " x_in = torch.concat([x_cat, x_cont], dim=-1)\n", " else:\n", " x_in = x_cont\n", " output, _ = self.network(x_in)\n", " return output" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "Finally, we build the model base for the neural network. It inherits `TorchModel`, which already implements most required methods. The remaining methods can be written in the same way as in `TabNetFromAbstract`.\n", "\n", "**Remark**: PyTorch-based models are trained using `pytorch_lightning.Trainer`, whose arguments can be specified by passing them to `TorchModel.__init__` as a dictionary using the key `lightning_trainer_kwargs`.\n", "\n", "In the following implementation, `_new_model` passes the datamodule and hyperparameters to the neural network, which receives them in the `__init__` shown above. You can also pass other arguments if needed." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 7, "outputs": [], "source": [ "class TabNetFromTorch(TorchModel):\n", " def _new_model(self, model_name, verbose, **kwargs):\n", " return TabNetNN(datamodule=self.trainer.datamodule, **kwargs)\n", "\n", " def _get_program_name(self):\n", " return \"TabNetFromTorch\"\n", "\n", " def _get_model_names(self):\n", " return [\"TabNet\"]\n", "\n", " def _space(self, model_name):\n", " return [\n", " Integer(low=4, high=16, prior=\"uniform\", name=\"n_d\", dtype=int),\n", " Integer(low=4, high=16, prior=\"uniform\", name=\"n_a\", dtype=int),\n", " Integer(low=1, high=6, prior=\"uniform\", name=\"n_steps\", dtype=int),\n", " Real(low=1.0, high=1.5, prior=\"uniform\", name=\"gamma\"),\n", " Integer(\n", " low=1, high=4, prior=\"uniform\", name=\"n_independent\", dtype=int\n", " ),\n", " Integer(low=1, high=4, prior=\"uniform\", name=\"n_shared\", dtype=int),\n", " ] + self.trainer.SPACE\n", "\n", " def _initial_values(self, model_name):\n", " return {\n", " \"n_d\": 8,\n", " \"n_a\": 8,\n", " \"n_steps\": 3,\n", " \"gamma\": 1.3,\n", " \"n_independent\": 2,\n", " \"n_shared\": 2,\n", " \"lr\": self.trainer.args[\"lr\"],\n", " \"weight_decay\": self.trainer.args[\"weight_decay\"],\n", " \"batch_size\": self.trainer.args[\"batch_size\"],\n", " }" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## Comparison of different implementations in other model bases\n", "\n", "We can compare our models with TabNet implemented in the other two model bases. Note that because of different training routines and randomization, they perform differently. Let's try the models on a regression task first." 
], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 8, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading https://archive.ics.uci.edu/static/public/9/auto+mpg.zip to /tmp/tmpdriivjp7/data/Auto MPG.zip\n", "cylinders is Integer and will be treated as a continuous feature.\n", "model_year is Integer and will be treated as a continuous feature.\n", "origin is Integer and will be treated as a continuous feature.\n", "Unknown values are detected in ['horsepower']. They will be treated as np.nan.\n", "The project will be saved to /tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig\n", "Dataset size: 238 80 80\n", "Data saved to /tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig (data.csv and tabular_data.csv).\n", "{'some_param': 1.1, 'program': None, 'model_subset': None, 'exclude_models': None, 'store_in_harddisk': True}\n", "\n", "-------------Run PytorchTabular-------------\n", "\n", "Training TabNet\n", "Global seed set to 42\n", "2023-09-23 20:34:52,661 - {pytorch_tabular.tabular_model:473} - INFO - Preparing the DataLoaders\n", "2023-09-23 20:34:52,661 - {pytorch_tabular.tabular_datamodule:290} - INFO - Setting up the datamodule for regression task\n", "2023-09-23 20:34:52,670 - {pytorch_tabular.tabular_model:521} - INFO - Preparing the Model: TabNetModel\n", "2023-09-23 20:34:52,684 - {pytorch_tabular.tabular_model:268} - INFO - Preparing the Trainer\n", "/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:589: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. 
Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.\n", " rank_zero_deprecation(\n", "Auto select gpus: [0]\n", "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n", "2023-09-23 20:34:53,558 - {pytorch_tabular.tabular_model:582} - INFO - Training Started\n", "You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n", "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n", "\n", " | Name | Type | Params\n", "----------------------------------------------------\n", "0 | _embedding_layer | Identity | 0 \n", "1 | _backbone | TabNetBackbone | 6.1 K \n", "2 | _head | Identity | 0 \n", "3 | loss | MSELoss | 0 \n", "----------------------------------------------------\n", "6.1 K Trainable params\n", "0 Non-trainable params\n", "6.1 K Total params\n", "0.024 Total estimated model params size (MB)\n", "Epoch: 1/300, Train loss: 631.3760, Val loss: 566.2643, Min val loss: 566.2643, Epoch time: 0.029s.\n", "Epoch: 20/300, Train loss: 601.5834, Val loss: 530.7571, Min val loss: 530.7571, Epoch time: 0.023s.\n", "Epoch: 40/300, Train loss: 573.0287, Val loss: 512.0469, Min val loss: 512.0469, Epoch time: 0.024s.\n", "Epoch: 60/300, Train loss: 547.0676, Val loss: 488.9417, Min val loss: 488.9417, Epoch time: 0.026s.\n", "Epoch: 80/300, Train loss: 521.3848, Val loss: 460.8999, Min val loss: 460.8999, Epoch time: 0.024s.\n", "Epoch: 100/300, Train loss: 492.7377, Val loss: 429.2286, Min val loss: 429.2286, Epoch time: 0.023s.\n", "Epoch: 120/300, Train loss: 461.0415, Val loss: 398.5600, Min val loss: 398.5600, Epoch time: 
0.025s.\n", "Epoch: 140/300, Train loss: 430.3234, Val loss: 374.5228, Min val loss: 374.5228, Epoch time: 0.035s.\n", "Epoch: 160/300, Train loss: 397.2393, Val loss: 348.5988, Min val loss: 348.5988, Epoch time: 0.028s.\n", "Epoch: 180/300, Train loss: 370.6253, Val loss: 322.0574, Min val loss: 322.0574, Epoch time: 0.026s.\n", "Epoch: 200/300, Train loss: 340.3246, Val loss: 301.0881, Min val loss: 301.0881, Epoch time: 0.028s.\n", "Epoch: 220/300, Train loss: 315.3022, Val loss: 277.9825, Min val loss: 277.9825, Epoch time: 0.024s.\n", "Epoch: 240/300, Train loss: 287.3188, Val loss: 257.2012, Min val loss: 257.2012, Epoch time: 0.024s.\n", "Epoch: 260/300, Train loss: 260.1859, Val loss: 233.5729, Min val loss: 233.5729, Epoch time: 0.024s.\n", "Epoch: 280/300, Train loss: 235.5418, Val loss: 206.5197, Min val loss: 206.5197, Epoch time: 0.025s.\n", "Epoch: 300/300, Train loss: 211.4229, Val loss: 186.0890, Min val loss: 186.0890, Epoch time: 0.046s.\n", "`Trainer.fit` stopped: `max_epochs=300` reached.\n", "2023-09-23 20:35:05,998 - {pytorch_tabular.tabular_model:584} - INFO - Training the model completed\n", "2023-09-23 20:35:05,998 - {pytorch_tabular.tabular_model:1258} - INFO - Loading the best model\n", "/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_lightning/utilities/cloud_io.py:33: LightningDeprecationWarning: `pytorch_lightning.utilities.cloud_io.get_filesystem` has been deprecated in v1.8.0 and will be removed in v2.0.0. Please use `lightning_fabric.utilities.cloud_io.get_filesystem` instead.\n", " rank_zero_deprecation(\n", "Training mse loss: 204.72249\n", "Validation mse loss: 186.08899\n", "Testing mse loss: 211.05500\n", "Trainer saved. 
To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')\n", "\n", "-------------PytorchTabular End-------------\n", "\n", "\n", "-------------Run WideDeep-------------\n", "\n", "Training TabNet\n", "Epoch: 1/300, Train loss: 632.3985, Val loss: 567.4960, Min val loss: 567.4960\n", "Epoch: 21/300, Train loss: 598.0990, Val loss: 533.1615, Min val loss: 533.1615\n", "Epoch: 41/300, Train loss: 569.2331, Val loss: 512.1868, Min val loss: 512.1868\n", "Epoch: 61/300, Train loss: 540.4758, Val loss: 484.7060, Min val loss: 484.7060\n", "Epoch: 81/300, Train loss: 511.0865, Val loss: 455.2033, Min val loss: 455.2033\n", "Epoch: 101/300, Train loss: 480.2253, Val loss: 424.2426, Min val loss: 424.2426\n", "Epoch: 121/300, Train loss: 450.1095, Val loss: 398.7469, Min val loss: 398.7469\n", "Epoch: 141/300, Train loss: 419.1121, Val loss: 373.4113, Min val loss: 373.4113\n", "Epoch: 161/300, Train loss: 389.0500, Val loss: 343.9605, Min val loss: 343.9605\n", "Epoch: 181/300, Train loss: 359.7761, Val loss: 317.2437, Min val loss: 317.2437\n", "Epoch: 201/300, Train loss: 332.5761, Val loss: 289.8560, Min val loss: 289.8560\n", "Epoch: 221/300, Train loss: 304.8683, Val loss: 268.5120, Min val loss: 268.5120\n", "Epoch: 241/300, Train loss: 278.2647, Val loss: 245.0433, Min val loss: 245.0433\n", "Epoch: 261/300, Train loss: 252.8031, Val loss: 220.4438, Min val loss: 220.4438\n", "Epoch: 281/300, Train loss: 228.1485, Val loss: 196.3897, Min val loss: 196.3897\n", "Restoring model weights from the end of the best epoch\n", "Training mse loss: 206.63809\n", "Validation mse loss: 173.36779\n", "Testing mse loss: 212.69775\n", "Trainer saved. 
To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')\n", "\n", "-------------WideDeep End-------------\n", "\n", "\n", "-------------Run TabNetFromAbstract-------------\n", "\n", "Training TabNet\n", "/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/multiclass_utils.py:13: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.\n", " from scipy.sparse.base import spmatrix\n", "/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/abstract_model.py:75: UserWarning: Device used : cuda\n", " warnings.warn(f\"Device used : {self.device}\")\n", "epoch 0 | loss: 587.08069| val_0_mse: 521.36511| 0:00:00s\n", "epoch 20 | loss: 543.03503| val_0_mse: 509.49057| 0:00:00s\n", "epoch 40 | loss: 503.94666| val_0_mse: 474.56982| 0:00:01s\n", "epoch 60 | loss: 464.24173| val_0_mse: 433.25231| 0:00:01s\n", "epoch 80 | loss: 428.79398| val_0_mse: 394.9299| 0:00:02s\n", "epoch 100| loss: 398.57919| val_0_mse: 363.43954| 0:00:02s\n", "epoch 120| loss: 368.87878| val_0_mse: 332.14944| 0:00:03s\n", "epoch 140| loss: 339.44043| val_0_mse: 301.91438| 0:00:03s\n", "epoch 160| loss: 313.07431| val_0_mse: 274.16858| 0:00:04s\n", "epoch 180| loss: 287.4639| val_0_mse: 248.59148| 0:00:04s\n", "epoch 200| loss: 255.01363| val_0_mse: 223.61368| 0:00:05s\n", "epoch 220| loss: 229.25694| val_0_mse: 198.16758| 0:00:06s\n", "epoch 240| loss: 200.16383| val_0_mse: 172.72145| 0:00:06s\n", "epoch 260| loss: 176.21191| val_0_mse: 150.20621| 0:00:07s\n", "epoch 280| loss: 150.32635| val_0_mse: 132.21739| 0:00:07s\n", "Stop training because you reached max_epochs = 300 with best_epoch = 299 and best_val_0_mse = 110.85251\n", "/home/xlluo/anaconda3/envs/tabular_ensemble/lib/python3.10/site-packages/pytorch_tabnet/callbacks.py:172: UserWarning: Best weights from best epoch are 
automatically used!\n", " warnings.warn(wrn_msg)\n", "Training mse loss: 122.27976\n", "Validation mse loss: 110.85251\n", "Testing mse loss: 112.14110\n", "Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')\n", "\n", "-------------TabNetFromAbstract End-------------\n", "\n", "\n", "-------------Run TabNetFromTorch-------------\n", "\n", "Training TabNet\n", "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n", "You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n", "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n", "\n", " | Name | Type | Params\n", "-------------------------------------------------\n", "0 | default_loss_fn | MSELoss | 0 \n", "1 | default_output_norm | Identity | 0 \n", "2 | network | TabNet | 6.1 K \n", "-------------------------------------------------\n", "6.1 K Trainable params\n", "0 Non-trainable params\n", "6.1 K Total params\n", "0.024 Total estimated model params size (MB)\n", "Epoch: 1/300, Train loss: 631.3761, Val loss: 568.1518, Min val loss: 568.1518, Min ES val loss: 568.1518, Epoch time: 0.021s.\n", "Epoch: 20/300, Train loss: 601.5826, Val loss: 531.1477, Min val loss: 531.1477, Min ES val loss: 531.1477, Epoch time: 0.020s.\n", "Epoch: 40/300, Train loss: 572.3008, Val loss: 512.4727, Min val loss: 512.4727, Min ES val loss: 512.4727, Epoch time: 0.020s.\n", "Epoch: 60/300, Train loss: 547.1713, Val loss: 487.6359, Min val loss: 487.6359, Min ES val loss: 487.6359, Epoch time: 0.021s.\n", "Epoch: 80/300, 
Train loss: 521.3299, Val loss: 458.8514, Min val loss: 458.8514, Min ES val loss: 458.8514, Epoch time: 0.023s.\n", "Epoch: 100/300, Train loss: 493.5303, Val loss: 430.2360, Min val loss: 430.2360, Min ES val loss: 430.2360, Epoch time: 0.024s.\n", "Epoch: 120/300, Train loss: 460.1804, Val loss: 404.0699, Min val loss: 404.0699, Min ES val loss: 404.0699, Epoch time: 0.022s.\n", "Epoch: 140/300, Train loss: 430.0944, Val loss: 377.3679, Min val loss: 377.3679, Min ES val loss: 377.3679, Epoch time: 0.022s.\n", "Epoch: 160/300, Train loss: 399.9077, Val loss: 349.6179, Min val loss: 349.6179, Min ES val loss: 349.6179, Epoch time: 0.026s.\n", "Epoch: 180/300, Train loss: 373.4857, Val loss: 330.2073, Min val loss: 330.2073, Min ES val loss: 330.2073, Epoch time: 0.022s.\n", "Epoch: 200/300, Train loss: 342.6961, Val loss: 307.0838, Min val loss: 307.0838, Min ES val loss: 307.0838, Epoch time: 0.024s.\n", "Epoch: 220/300, Train loss: 318.1428, Val loss: 283.2042, Min val loss: 283.2042, Min ES val loss: 283.2042, Epoch time: 0.023s.\n", "Epoch: 240/300, Train loss: 292.0168, Val loss: 260.7692, Min val loss: 260.7692, Min ES val loss: 260.7692, Epoch time: 0.035s.\n", "Epoch: 260/300, Train loss: 263.9176, Val loss: 236.7290, Min val loss: 236.7290, Min ES val loss: 236.7290, Epoch time: 0.037s.\n", "Epoch: 280/300, Train loss: 241.0646, Val loss: 214.5846, Min val loss: 214.5846, Min ES val loss: 214.5846, Epoch time: 0.024s.\n", "Epoch: 300/300, Train loss: 216.0625, Val loss: 191.7081, Min val loss: 191.7081, Min ES val loss: 191.7081, Epoch time: 0.031s.\n", "`Trainer.fit` stopped: `max_epochs=300` reached.\n", "Training mse loss: 208.48247\n", "Validation mse loss: 191.70814\n", "Testing mse loss: 204.47160\n", "Trainer saved. 
To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')\n", "\n", "-------------TabNetFromTorch End-------------\n", "\n", "PytorchTabular metrics\n", "TabNet 1/1\n", "WideDeep metrics\n", "TabNet 1/1\n", "TabNetFromAbstract metrics\n", "TabNet 1/1\n", "TabNetFromTorch metrics\n", "TabNet 1/1\n", "Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpdriivjp7/output/auto-mpg/2023-09-23-20-34-51-0_UserInputConfig/trainer.pkl')\n" ] }, { "data": { "text/plain": " Program Model Training RMSE Training MSE Training MAE \\\n0 TabNetFromAbstract TabNet 11.058018 122.279759 10.120124 \n1 TabNetFromTorch TabNet 14.438922 208.482471 13.648272 \n2 PytorchTabular TabNet 14.308127 204.722492 13.486490 \n3 WideDeep TabNet 14.374912 206.638091 13.423684 \n\n Training MAPE Training R2 Training MEDIAN_ABSOLUTE_ERROR \\\n0 0.425556 -0.897031 9.511139 \n1 0.578209 -2.234367 12.755719 \n2 0.570209 -2.176035 12.484611 \n3 0.562884 -2.205754 12.262027 \n\n Training EXPLAINED_VARIANCE_SCORE Testing RMSE ... Testing R2 \\\n0 0.688216 10.589669 ... -1.085708 \n1 0.655482 14.299357 ... -2.802959 \n2 0.645709 14.527732 ... -2.925404 \n3 0.560759 14.584161 ... -2.955957 \n\n Testing MEDIAN_ABSOLUTE_ERROR Testing EXPLAINED_VARIANCE_SCORE \\\n0 9.735932 0.711417 \n1 13.421765 0.606030 \n2 13.481362 0.597145 \n3 13.293868 0.569623 \n\n Validation RMSE Validation MSE Validation MAE Validation MAPE \\\n0 10.528652 110.852511 9.544754 0.417929 \n1 13.845871 191.708137 12.779896 0.562416 \n2 13.641444 186.088988 12.641969 0.558155 \n3 13.166920 173.367793 12.166225 0.534651 \n\n Validation R2 Validation MEDIAN_ABSOLUTE_ERROR \\\n0 -0.980271 9.028342 \n1 -2.424678 11.474345 \n2 -2.324297 11.443497 \n3 -2.097046 11.464049 \n\n Validation EXPLAINED_VARIANCE_SCORE \n0 0.575915 \n1 0.490396 \n2 0.530719 \n3 0.411240 \n\n[4 rows x 23 columns]", "text/html": "
|  | Program | Model | Training RMSE | Training MSE | Training MAE | Training MAPE | Training R2 | Training MEDIAN_ABSOLUTE_ERROR | Training EXPLAINED_VARIANCE_SCORE | Testing RMSE | ... | Testing R2 | Testing MEDIAN_ABSOLUTE_ERROR | Testing EXPLAINED_VARIANCE_SCORE | Validation RMSE | Validation MSE | Validation MAE | Validation MAPE | Validation R2 | Validation MEDIAN_ABSOLUTE_ERROR | Validation EXPLAINED_VARIANCE_SCORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TabNetFromAbstract | TabNet | 11.058018 | 122.279759 | 10.120124 | 0.425556 | -0.897031 | 9.511139 | 0.688216 | 10.589669 | ... | -1.085708 | 9.735932 | 0.711417 | 10.528652 | 110.852511 | 9.544754 | 0.417929 | -0.980271 | 9.028342 | 0.575915 |
| 1 | TabNetFromTorch | TabNet | 14.438922 | 208.482471 | 13.648272 | 0.578209 | -2.234367 | 12.755719 | 0.655482 | 14.299357 | ... | -2.802959 | 13.421765 | 0.606030 | 13.845871 | 191.708137 | 12.779896 | 0.562416 | -2.424678 | 11.474345 | 0.490396 |
| 2 | PytorchTabular | TabNet | 14.308127 | 204.722492 | 13.486490 | 0.570209 | -2.176035 | 12.484611 | 0.645709 | 14.527732 | ... | -2.925404 | 13.481362 | 0.597145 | 13.641444 | 186.088988 | 12.641969 | 0.558155 | -2.324297 | 11.443497 | 0.530719 |
| 3 | WideDeep | TabNet | 14.374912 | 206.638091 | 13.423684 | 0.562884 | -2.205754 | 12.262027 | 0.560759 | 14.584161 | ... | -2.955957 | 13.293868 | 0.569623 | 13.166920 | 173.367793 | 12.166225 | 0.534651 | -2.097046 | 11.464049 | 0.411240 |
4 rows × 23 columns
|  | Program | Model | Training RMSE | Training MSE | Training MAE | Training MAPE | Training R2 | Training MEDIAN_ABSOLUTE_ERROR | Training EXPLAINED_VARIANCE_SCORE | Testing RMSE | ... | Testing R2 | Testing MEDIAN_ABSOLUTE_ERROR | Testing EXPLAINED_VARIANCE_SCORE | Validation RMSE | Validation MSE | Validation MAE | Validation MAPE | Validation R2 | Validation MEDIAN_ABSOLUTE_ERROR | Validation EXPLAINED_VARIANCE_SCORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PytorchTabular | TabNet | 2.210867 | 4.887931 | 1.642482 | 0.069627 | 0.924169 | 1.215234 | 0.924173 | 2.603684 | ... | 0.873915 | 1.517266 | 0.873926 | 2.90542 | 8.441465 | 2.17954 | 0.097734 | 0.849201 | 1.557569 | 0.849222 |
1 rows × 23 columns
|  | Program | Model | Training F1_SCORE | Training PRECISION_SCORE | Training RECALL_SCORE | Training JACCARD_SCORE | Training ACCURACY_SCORE | Training BALANCED_ACCURACY_SCORE | Training COHEN_KAPPA_SCORE | Training HAMMING_LOSS | ... | Validation ACCURACY_SCORE | Validation BALANCED_ACCURACY_SCORE | Validation COHEN_KAPPA_SCORE | Validation HAMMING_LOSS | Validation MATTHEWS_CORRCOEF | Validation ZERO_ONE_LOSS | Validation ROC_AUC_SCORE | Validation LOG_LOSS | Validation BRIER_SCORE_LOSS | Validation AVERAGE_PRECISION_SCORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PytorchTabular | TabNet | 0.663081 | 0.757096 | 0.589836 | 0.495977 | 0.855702 | 0.764917 | 0.573066 | 0.144298 | ... | 0.847359 | 0.753237 | 0.548046 | 0.152641 | 0.555038 | 0.152641 | 0.895756 | 0.336181 | 0.106143 | 0.855486 |
| 1 | TabNetFromTorch | TabNet | 0.672948 | 0.754224 | 0.607485 | 0.507100 | 0.857852 | 0.772360 | 0.583483 | 0.142148 | ... | 0.846744 | 0.759372 | 0.552975 | 0.153256 | 0.557504 | 0.153256 | 0.896746 | 0.332678 | 0.105144 | 0.860627 |
| 2 | WideDeep | TabNet | 0.664649 | 0.734311 | 0.607059 | 0.497734 | 0.852529 | 0.768709 | 0.571219 | 0.147471 | ... | 0.845055 | 0.762183 | 0.552930 | 0.154945 | 0.556004 | 0.154945 | 0.894150 | 0.338528 | 0.107826 | 0.852035 |
| 3 | TabNetFromAbstract | TabNet | 0.692785 | 0.753562 | 0.641080 | 0.529970 | 0.863124 | 0.787303 | 0.605467 | 0.136876 | ... | 0.845670 | 0.766947 | 0.558357 | 0.154330 | 0.560548 | 0.154330 | 0.896387 | 0.336909 | 0.106059 | 0.857159 |
4 rows × 44 columns
|  | Program | Model | Training ACCURACY_SCORE | Training BALANCED_ACCURACY_SCORE | Training COHEN_KAPPA_SCORE | Training HAMMING_LOSS | Training MATTHEWS_CORRCOEF | Training ZERO_ONE_LOSS | Training PRECISION_SCORE_MACRO | Training PRECISION_SCORE_MICRO | ... | Validation F1_SCORE_MICRO | Validation F1_SCORE_WEIGHTED | Validation JACCARD_SCORE_MACRO | Validation JACCARD_SCORE_MICRO | Validation JACCARD_SCORE_WEIGHTED | Validation TOP_K_ACCURACY_SCORE | Validation LOG_LOSS | Validation ROC_AUC_SCORE_OVR_MACRO | Validation ROC_AUC_SCORE_OVR_WEIGHTED | Validation ROC_AUC_SCORE_OVO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PytorchTabular | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.933333 | 0.934656 | 0.888889 | 0.875000 | 0.88000 | 1.0 | 0.303345 | 0.975059 | 0.966566 | 0.972956 |
| 1 | WideDeep | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.933333 | 0.934656 | 0.888889 | 0.875000 | 0.88000 | 1.0 | 0.297838 | 0.980985 | 0.975455 | 0.979940 |
| 2 | TabNetFromAbstract | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.900000 | 0.901217 | 0.837500 | 0.818182 | 0.82625 | 1.0 | 0.179150 | 0.989874 | 0.988788 | 0.990417 |
| 3 | TabNetFromTorch | TabNet | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | ... | 0.933333 | 0.934656 | 0.888889 | 0.875000 | 0.88000 | 1.0 | 0.296352 | 0.980985 | 0.975455 | 0.979940 |
4 rows × 71 columns