tabensemb.data.datamodule.DataModule.select_by_value

method

DataModule.select_by_value(selection: Dict[str, str | int | float | List | Tuple] | None = None, df: DataFrame | None = None, partition: str | None = None, eps: float | None = None, left_closed: bool = True, right_closed: bool = False) → ndarray

Select data points with the given value(s) in the given column(s).

Parameters:
selection

A dictionary that maps each column to be investigated to what should be selected in it: a single value (int/float/str), a list of values, or a range of values (given as a tuple with two components).

df

A dataframe to be filtered. If not given, the df attribute of the DataModule is used.

partition

The data partition to select from: “train”, “val”, “test”, or “all”.

eps

A tolerance value if the value to be selected is a float. If None, only values “equal” to the float will be selected.

left_closed

When the feature is filtered by a range, whether the left boundary is closed.

right_closed

When the feature is filtered by a range, whether the right boundary is closed.

Returns:
np.ndarray

Indices of the selected data points in the dataframe.
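
Example:

The selection rules described above can be illustrated with a minimal pandas sketch. This is not the tabensemb implementation: select_by_value_sketch is a hypothetical stand-in, and it assumes that conditions on several columns are combined with a logical AND, that eps is an inclusive absolute tolerance, and that the returned indices are the index labels of the dataframe. The partition argument is omitted.

```python
import numpy as np
import pandas as pd


def select_by_value_sketch(df, selection, eps=None, left_closed=True, right_closed=False):
    """Hypothetical stand-in mirroring the documented semantics (not tabensemb code)."""
    mask = np.ones(len(df), dtype=bool)
    for col, value in selection.items():
        series = df[col]
        if isinstance(value, tuple):
            # Two-component tuple: select a range with configurable boundaries.
            low, high = value
            lower = series >= low if left_closed else series > low
            upper = series <= high if right_closed else series < high
            col_mask = lower & upper
        elif isinstance(value, list):
            # List: select rows whose value is any member of the list.
            col_mask = series.isin(value)
        elif isinstance(value, float) and eps is not None:
            # Float with tolerance (assumed inclusive, absolute difference).
            col_mask = (series - value).abs() <= eps
        else:
            # Single int/str, or float without tolerance: plain equality.
            col_mask = series == value
        # Assumption: conditions on different columns are combined with AND.
        mask &= col_mask.to_numpy()
    return df.index[mask].to_numpy()


# Illustrative data; the column names are made up for this example.
df = pd.DataFrame(
    {
        "material": ["A", "B", "A", "C"],
        "temperature": [20.0, 25.0, 30.0, 35.0],
        "cycles": [100, 200, 300, 400],
    }
)

by_list = select_by_value_sketch(df, {"material": ["A", "C"]})
by_range = select_by_value_sketch(df, {"temperature": (20.0, 30.0)})
by_float = select_by_value_sketch(df, {"temperature": 25.2}, eps=0.5)
combined = select_by_value_sketch(
    df, {"material": "A", "cycles": (100, 300)}, right_closed=True
)
```

With the default left_closed=True and right_closed=False, the range (20.0, 30.0) behaves as the half-open interval [20.0, 30.0), so the row with temperature 30.0 is excluded unless right_closed=True is passed.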