
API Reference: Torch Choice

data special

choice_dataset

The dataset object for managing large-scale consumer choice datasets. Please refer to the documentation and tutorials for more details on using ChoiceDataset.

Author: Tianyu Du Update: Apr. 27, 2022

ChoiceDataset (Dataset)

Source code in torch_choice/data/choice_dataset.py
class ChoiceDataset(torch.utils.data.Dataset):
    def __init__(self,
                 item_index: torch.LongTensor,
                 label: Optional[torch.LongTensor] = None,
                 user_index: Optional[torch.LongTensor] = None,
                 session_index: Optional[torch.LongTensor] = None,
                 item_availability: Optional[torch.BoolTensor] = None,
                 **kwargs) -> None:
        """
        Initialization method for the dataset object. Researchers should supply all information about the dataset
        using this initialization method.

        The number of choice instances is called `batch_size` in the documentation. The `batch_size` corresponds to the
        file length in a wide-format dataset, and is often denoted by `N`. We call it `batch_size` to follow the convention
        in the machine learning literature.
        A `choice instance` is a row of the dataset, so there are `batch_size` choice instances in each `ChoiceDataset`.

        The dataset consists of:
        (1) a collection of `batch_size` tuples (item_id, user_id, session_id, label), where each tuple is a choice instance.
        (2) a collection of `observables` associated with item, user, session, etc.

        Args:
            item_index (torch.LongTensor): a tensor of shape (batch_size) indicating the relevant item in each row
                of the dataset, the relevant item can be:
                (1) the item bought in this choice instance,
                (2) or the item reviewed by the user. In the latter case, we need the `label` tensor to specify the rating score.
                NOTE: Support for the second case is under development; currently, only binary labels are supported.

            label (Optional[torch.LongTensor], optional): a tensor of shape (batch_size) indicating the label for prediction in
                each choice instance. If you want to predict the item bought, you can leave the `label` argument
                as `None` in the initialization method, and the model will use `item_index` as the object to be predicted.
                But if you are, for example, predicting the rating a user gave an item, `label` must be provided.
                Defaults to None.

            user_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
                the ID of the user who was involved in each choice instance. If no user index is provided, it's assumed
                that the choice instances are from the same user.
                `user_index` is required if and only if there are multiple users in the dataset, for example:
                    (1) user-observables are involved in the utility form,
                    (2) and/or the coefficient is user-specific.
                This tensor is used to select the corresponding user observables and coefficients assigned to the
                user (like theta_user) for making prediction for that purchase.
                Defaults to None.

            session_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
                the ID of the session when that choice instance occurred. This tensor is used to select the correct
                session observables or price observables for making prediction for that choice instance. Therefore, if
                there are no session/price observables, you can leave this argument as `None`. In this case, the `ChoiceDataset`
                object will assume each choice instance to be in its own session.
                Defaults to None.

            item_availability (Optional[torch.BoolTensor], optional): A boolean tensor of shape (num_sessions, num_items)
                indicating the availability of each item in each session. Utilities of unavailable items are set to -infinity,
                and hence the predicted probabilities of these unavailable items are 0.
                We assume all items are available if set to None.
                Defaults to None.

        Other Kwargs (Observables):
            One can specify the following types of observables, where * in shape denotes any positive
                integer. Typically * represents the number of observables.
            Please refer to the documentation for a detailed guide to using observables.
            1. user observables must start with 'user_' and have shape (num_users, *)
            2. item observables must start with 'item_' and have shape (num_items, *)
            3. session observables must start with 'session_' and have shape (num_sessions, *)
            4. taste observables (those that vary by user and item) must start with `taste_` and have shape
                (num_users, num_items, *).
            NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
            5. price observables (those that vary by session and item) must start with `price_` and have
                shape (num_sessions, num_items, *)
        """
        # ENHANCEMENT(Tianyu): add item_names for summary.
        super(ChoiceDataset, self).__init__()
        self.label = label
        self.item_index = item_index
        self.user_index = user_index
        self.session_index = session_index

        if self.session_index is None:
            # if any([x.startswith('session_') or x.startswith('price_') for x in kwargs.keys()]):
            # if any session sensitive observable is provided, but session index is not,
            # infer each row in the dataset to be a session.
            # TODO: (design choice) should we assign unique session index to each choice instance or the same session index.
            print('No `session_index` is provided, assume each choice instance is in its own session.')
            self.session_index = torch.arange(len(self.item_index)).long()

        self.item_availability = item_availability

        for key, item in kwargs.items():
            setattr(self, key, item)

        # TODO: add a validation procedure to check the consistency of the dataset.

    def __getitem__(self, indices: Union[int, torch.LongTensor]) -> "ChoiceDataset":
        """Retrieves samples corresponding to the provided index or list of indices.

        Args:
            indices (Union[int, torch.LongTensor]): a single integer index or a tensor of indices.

        Returns:
            ChoiceDataset: a subset of the dataset.
        """
        if isinstance(indices, int):
            # convert single integer index to an array of indices.
            indices = torch.LongTensor([indices])
        new_dict = dict()
        new_dict['item_index'] = self.item_index[indices].clone()

        # copy optional attributes.
        new_dict['label'] = self.label[indices].clone() if self.label is not None else None
        new_dict['user_index'] = self.user_index[indices].clone() if self.user_index is not None else None
        new_dict['session_index'] = self.session_index[indices].clone() if self.session_index is not None else None
        # item_availability has shape (num_sessions, num_items), no need to re-index it.
        new_dict['item_availability'] = self.item_availability

        # copy other attributes.
        for key, val in self.__dict__.items():
            if key not in new_dict.keys():
                if torch.is_tensor(val):
                    new_dict[key] = val.clone()
                else:
                    new_dict[key] = copy.deepcopy(val)
        return self._from_dict(new_dict)

    def __len__(self) -> int:
        """Returns number of samples in this dataset.

        Returns:
            int: length of the dataset.
        """
        return len(self.item_index)

    def __contains__(self, key: str) -> bool:
        # check whether `key` names an attribute stored on this dataset.
        return key in self.__dict__

    def __eq__(self, other: "ChoiceDataset") -> bool:
        """Returns whether all tensor attributes of both ChoiceDatasets are equal."""
        if not isinstance(other, ChoiceDataset):
            raise TypeError('You can only compare with ChoiceDataset objects.')
        else:
            flag = True
            for key, val in self.__dict__.items():
                if torch.is_tensor(val):
                    # ignore NaNs while comparing.
                    if not torch.equal(torch.nan_to_num(val), torch.nan_to_num(other.__dict__[key])):
                        print('Attribute {} is not equal.'.format(key))
                        flag = False
            return flag

    @property
    def device(self) -> str:
        """Returns the device of the dataset.

        Returns:
            str: the device of the dataset.
        """
        for attr in self.__dict__.values():
            if torch.is_tensor(attr):
                return attr.device

    @property
    def num_users(self) -> int:
        """Returns number of users involved in this dataset, returns 1 if there is no user identity.

        Returns:
            int: the number of users involved in this dataset.
        """
        # query from user_index
        if self.user_index is not None:
            return len(torch.unique(self.user_index))
        else:
            return 1

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_user_attribute(key) or self._is_taste_attribute(key):
        #             return val.shape[0]
        # return 1

    @property
    def num_items(self) -> int:
        """Returns the number of items involved in this dataset.

        Returns:
            int: the number of items involved in this dataset.
        """
        return len(torch.unique(self.item_index))

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_item_attribute(key):
        #             return val.shape[0]
        #         elif self._is_taste_attribute(key) or self._is_price_attribute(key):
        #             return val.shape[1]
        # return 1

    @property
    def num_sessions(self) -> int:
        """Returns the number of sessions involved in this dataset.

        Returns:
            int: the number of sessions involved in this dataset.
        """
        return len(torch.unique(self.session_index))

        # if self.session_index is None:
        #     return 1

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_session_attribute(key) or self._is_price_attribute(key):
        #             return val.shape[0]
        # return 1

    @property
    def x_dict(self) -> Dict[object, torch.Tensor]:
        """Formats attributes in this dataset into shape (num_sessions, num_items, num_params) and returns them in a dictionary format.
        Models in this package are expecting this dictionary based data format.

        Returns:
            Dict[object, torch.Tensor]: a dictionary with attribute names in the dataset as keys, and reshaped attribute
                tensors as values.
        """
        out = dict()
        for key, val in self.__dict__.items():
            if self._is_attribute(key):  # only include attributes.
                out[key] = self._expand_tensor(key, val)  # reshape to (num_sessions, num_items, num_params).
        return out

    @classmethod
    def _from_dict(cls, dictionary: Dict[str, torch.tensor]) -> "ChoiceDataset":
        """Creates an instance of ChoiceDataset from a dictionary of arguments.

        Args:
            dictionary (Dict[str, torch.tensor]): a dictionary with keys as argument names and values as arguments.

        Returns:
            ChoiceDataset: the created copy of dataset.
        """
        dataset = cls(**dictionary)
        for key, item in dictionary.items():
            setattr(dataset, key, item)
        return dataset

    def apply_tensor(self, func: callable) -> "ChoiceDataset":
        """This is a helper method to apply the provided function to all tensors and tensor values of all dictionaries.

        Args:
            func (callable): a callable function to be applied on tensors and tensor-values of dictionaries.

        Returns:
            ChoiceDataset: the modified dataset.
        """
        for key, item in self.__dict__.items():
            if torch.is_tensor(item):
                setattr(self, key, func(item))
            # broadcast func to dictionaries of tensors as well.
            elif isinstance(getattr(self, key), dict):
                for obj_key, obj_item in getattr(self, key).items():
                    if torch.is_tensor(obj_item):
                        # assign through the mapping; dictionary entries cannot be set with setattr.
                        getattr(self, key)[obj_key] = func(obj_item)
        return self

    def to(self, device: Union[str, torch.device]) -> "ChoiceDataset":
        """Moves all tensors in this dataset to the specified PyTorch device.

        Args:
            device (Union[str, torch.device]): the destination device.

        Returns:
            ChoiceDataset: the modified dataset on the new device.
        """
        return self.apply_tensor(lambda x: x.to(device))

    def clone(self) -> "ChoiceDataset":
        """Creates a copy of self.

        Returns:
            ChoiceDataset: a copy of self.
        """
        dictionary = {}
        for k, v in self.__dict__.items():
            if torch.is_tensor(v):
                dictionary[k] = v.clone()
            else:
                dictionary[k] = copy.deepcopy(v)
        return self.__class__._from_dict(dictionary)

    def _check_device_consistency(self) -> None:
        """Checks if all tensors in this dataset are on the same device.

        Raises:
            Exception: an exception is raised if not all tensors are on the same device.
        """
        # assert all tensors are on the same device.
        devices = list()
        for val in self.__dict__.values():
            if torch.is_tensor(val):
                devices.append(val.device)
        if len(set(devices)) > 1:
            raise Exception(f'Found tensors on different devices: {set(devices)}. '
                            'Use dataset.to() method to align devices.')

    def _size_repr(self, value: object) -> List[int]:
        """A helper method to get the string-representation of object sizes, this is helpful while constructing the
        string representation of the dataset.

        Args:
            value (object): an object to examine its size.

        Returns:
            List[int]: list of integers representing the size of the object, length of the list is equal to dimension of `value`.
        """
        if torch.is_tensor(value):
            return list(value.size())
        elif isinstance(value, int) or isinstance(value, float):
            return [1]
        elif isinstance(value, list) or isinstance(value, tuple):
            return [len(value)]
        else:
            return []

    def __repr__(self) -> str:
        """A method to get a string representation of the dataset.

        Returns:
            str: the string representation of the dataset.
        """
        info = [
            f'{key}={self._size_repr(item)}' for key, item in self.__dict__.items()]
        return f"{self.__class__.__name__}({', '.join(info)}, device={self.device})"

    # ==================================================================================================================
    # methods for checking attribute categories.
    # ==================================================================================================================
    @staticmethod
    def _is_item_attribute(key: str) -> bool:
        return key.startswith('item_') and (key != 'item_availability') and (key != 'item_index')

    @staticmethod
    def _is_user_attribute(key: str) -> bool:
        return key.startswith('user_') and (key != 'user_index')

    @staticmethod
    def _is_session_attribute(key: str) -> bool:
        return key.startswith('session_') and (key != 'session_index')

    @staticmethod
    def _is_taste_attribute(key: str) -> bool:
        return key.startswith('taste_')

    @staticmethod
    def _is_price_attribute(key: str) -> bool:
        return key.startswith('price_')

    def _is_attribute(self, key: str) -> bool:
        return self._is_item_attribute(key) \
            or self._is_user_attribute(key) \
            or self._is_session_attribute(key) \
            or self._is_taste_attribute(key) \
            or self._is_price_attribute(key)

    def _expand_tensor(self, key: str, val: torch.Tensor) -> torch.Tensor:
        """Expands attribute tensor to (num_sessions, num_items, num_params) shape for prediction tasks, this method
        won't reshape the tensor at all if the `key` (i.e., name of the tensor) suggests it's not an attribute of any kind.

        Args:
            key (str): name of the attribute used to determine the raw shape of the tensor. For example, 'item_obs' means
                the raw tensor is in shape (num_items, num_params).
            val (torch.Tensor): the attribute tensor to be reshaped.

        Returns:
            torch.Tensor: the reshaped tensor with shape (num_sessions, num_items, num_params).
        """
        if not self._is_attribute(key):
            print(f'Warning: the input key {key} is not an attribute of the dataset, will NOT modify the provided tensor.')
            # don't expand non-attribute tensors, if any.
            return val

        num_params = val.shape[-1]
        if self._is_user_attribute(key):
            # user_attribute (num_users, *)
            out = val[self.user_index, :].view(
                len(self), 1, num_params).expand(-1, self.num_items, -1)
        elif self._is_item_attribute(key):
            # item_attribute (num_items, *)
            out = val.view(1, self.num_items, num_params).expand(
                len(self), -1, -1)
        elif self._is_session_attribute(key):
            # session_attribute (num_sessions, *)
            out = val[self.session_index, :].view(
                len(self), 1, num_params).expand(-1, self.num_items, -1)
        elif self._is_taste_attribute(key):
            # taste_attribute (num_users, num_items, *)
            out = val[self.user_index, :, :]
        elif self._is_price_attribute(key):
            # price_attribute (num_sessions, num_items, *)
            out = val[self.session_index, :, :]

        assert out.shape == (len(self), self.num_items, num_params)
        return out
device: str property readonly

Returns the device of the dataset.

Returns:

Type Description
str

the device of the dataset.

num_items: int property readonly

Returns the number of items involved in this dataset.

Returns:

Type Description
int

the number of items involved in this dataset.
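
As the source listing shows, `num_items` counts the distinct values in `item_index` rather than reading a dimension of an observable tensor. The sketch below uses made-up toy tensors (none of the values come from the library) to illustrate one consequence that may be worth checking in your own data: an item that never appears in `item_index` is not counted.

```python
import torch

# Hypothetical toy data: three items exist, but item 1 is never chosen.
item_index = torch.LongTensor([0, 2, 2, 0])
item_obs = torch.randn(3, 4)  # item observables with shape (num_items, num_params)

# Same computation as the num_items property in the source listing.
num_items = len(torch.unique(item_index))

print(num_items)          # 2
print(item_obs.shape[0])  # 3
```

When the two numbers disagree like this, the `view(1, self.num_items, num_params)` call in `_expand_tensor` would not match the size of the observable tensor, so comparing them before building `x_dict` is probably a cheap sanity check.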

num_sessions: int property readonly

Returns the number of sessions involved in this dataset.

Returns:

Type Description
int

the number of sessions involved in this dataset.

num_users: int property readonly

Returns number of users involved in this dataset, returns 1 if there is no user identity.

Returns:

Type Description
int

the number of users involved in this dataset.

x_dict: Dict[object, torch.Tensor] property readonly

Formats attributes in this dataset into shape (num_sessions, num_items, num_params) and returns them in a dictionary format. Models in this package expect this dictionary-based data format.

Returns:

Type Description
Dict[object, torch.Tensor]

a dictionary with attribute names in the dataset as keys, and reshaped attribute tensors as values.
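
The reshaping behind `x_dict` can be sketched with plain tensor operations. The example below mirrors the user and item branches of `_expand_tensor` from the source listing on hypothetical toy tensors; it does not import the library. Note that, going by the assertion at the end of `_expand_tensor`, the leading dimension of each expanded tensor equals the number of choice instances, `len(dataset)`.

```python
import torch

# Hypothetical toy data: 4 choice instances, 3 items, 2 users.
item_index = torch.LongTensor([0, 1, 2, 1])
user_index = torch.LongTensor([0, 0, 1, 1])
batch_size = len(item_index)
num_items = len(torch.unique(item_index))

item_obs = torch.randn(num_items, 5)  # (num_items, num_params)
user_obs = torch.randn(2, 3)          # (num_users, num_params)

# Item observables are shared by every row: add a leading axis, then expand it.
item_expanded = item_obs.view(1, num_items, 5).expand(batch_size, -1, -1)

# User observables are gathered per row first, then repeated across items.
user_expanded = user_obs[user_index, :].view(batch_size, 1, 3).expand(-1, num_items, -1)

print(item_expanded.shape)  # torch.Size([4, 3, 5])
print(user_expanded.shape)  # torch.Size([4, 3, 3])
```

Because `expand` returns a view without copying data, this layout costs little extra memory even when `batch_size` is large.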

__eq__(self, other) special

Returns whether all tensor attributes of both ChoiceDatasets are equal.

Source code in torch_choice/data/choice_dataset.py
def __eq__(self, other: "ChoiceDataset") -> bool:
    """Returns whether all tensor attributes of both ChoiceDatasets are equal."""
    if not isinstance(other, ChoiceDataset):
        raise TypeError('You can only compare with ChoiceDataset objects.')
    else:
        flag = True
        for key, val in self.__dict__.items():
            if torch.is_tensor(val):
                # ignore NaNs while comparing.
                if not torch.equal(torch.nan_to_num(val), torch.nan_to_num(other.__dict__[key])):
                    print('Attribute {} is not equal.'.format(key))
                    flag = False
        return flag
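
The comparison above passes every tensor through `torch.nan_to_num` before calling `torch.equal`. A small sketch with made-up values shows what that implies: with the default arguments, NaN entries are replaced by 0.0, so a NaN in one dataset compares equal to a genuine zero in the other.

```python
import torch

a = torch.tensor([1.0, float('nan'), 3.0])
b = torch.tensor([1.0, 0.0, 3.0])

# A direct comparison fails on NaN because NaN != NaN.
plain = torch.equal(a, a.clone())

# With NaNs replaced (by 0.0 under the defaults), a NaN entry matches a true zero.
relaxed = torch.equal(torch.nan_to_num(a), torch.nan_to_num(b))

print(plain)    # False
print(relaxed)  # True
```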
__getitem__(self, indices) special

Retrieves samples corresponding to the provided index or list of indices.

Parameters:

Name Type Description Default
indices Union[int, torch.LongTensor]

a single integer index or a tensor of indices.

required

Returns:

Type Description
ChoiceDataset

a subset of the dataset.

Source code in torch_choice/data/choice_dataset.py
def __getitem__(self, indices: Union[int, torch.LongTensor]) -> "ChoiceDataset":
    """Retrieves samples corresponding to the provided index or list of indices.

    Args:
        indices (Union[int, torch.LongTensor]): a single integer index or a tensor of indices.

    Returns:
        ChoiceDataset: a subset of the dataset.
    """
    if isinstance(indices, int):
        # convert single integer index to an array of indices.
        indices = torch.LongTensor([indices])
    new_dict = dict()
    new_dict['item_index'] = self.item_index[indices].clone()

    # copy optional attributes.
    new_dict['label'] = self.label[indices].clone() if self.label is not None else None
    new_dict['user_index'] = self.user_index[indices].clone() if self.user_index is not None else None
    new_dict['session_index'] = self.session_index[indices].clone() if self.session_index is not None else None
    # item_availability has shape (num_sessions, num_items), no need to re-index it.
    new_dict['item_availability'] = self.item_availability

    # copy other attributes.
    for key, val in self.__dict__.items():
        if key not in new_dict.keys():
            if torch.is_tensor(val):
                new_dict[key] = val.clone()
            else:
                new_dict[key] = copy.deepcopy(val)
    return self._from_dict(new_dict)
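
Two details of the indexing logic above can be illustrated without the class itself. The sketch below uses hypothetical toy tensors: it shows why an integer index is wrapped in a `LongTensor` (the result keeps its batch axis), and why session-level tensors can be carried over whole (the subset's `session_index` still points at the right rows).

```python
import torch

# Hypothetical toy data: 4 choice instances spread over 3 sessions, 2 items.
item_index = torch.LongTensor([1, 0, 1, 0])
session_index = torch.LongTensor([0, 1, 1, 2])
price_obs = torch.arange(12.0).view(3, 2, 2)  # (num_sessions, num_items, num_params)

# An integer index is wrapped in a LongTensor, so the result keeps its batch axis.
indices = torch.LongTensor([3])
subset_item_index = item_index[indices].clone()
subset_session_index = session_index[indices].clone()

# price_obs is carried over whole; the subset's session_index still selects the right row.
subset_prices = price_obs[subset_session_index, :, :]

print(subset_item_index.shape)  # torch.Size([1])
print(subset_prices.shape)      # torch.Size([1, 2, 2])
```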
__init__(self, item_index, label=None, user_index=None, session_index=None, item_availability=None, **kwargs) special

Initialization method for the dataset object. Researchers should supply all information about the dataset using this initialization method.

The number of choice instances is called batch_size in the documentation. The batch_size corresponds to the file length in a wide-format dataset, and is often denoted by N. We call it batch_size to follow the convention in the machine learning literature. A choice instance is a row of the dataset, so there are batch_size choice instances in each ChoiceDataset.

The dataset consists of: (1) a collection of batch_size tuples (item_id, user_id, session_id, label), where each tuple is a choice instance. (2) a collection of observables associated with item, user, session, etc.

Parameters:

Name Type Description Default
item_index torch.LongTensor

a tensor of shape (batch_size) indicating the relevant item in each row of the dataset. The relevant item can be: (1) the item bought in this choice instance, (2) or the item reviewed by the user. In the latter case, we need the label tensor to specify the rating score. NOTE: Support for the second case is under development; currently, only binary labels are supported.

required
label Optional[torch.LongTensor]

a tensor of shape (batch_size) indicating the label for prediction in each choice instance. If you want to predict the item bought, you can leave the label argument as None in the initialization method, and the model will use item_index as the object to be predicted. But if you are, for example, predicting the rating a user gave an item, label must be provided. Defaults to None.

None
user_index Optional[torch.LongTensor]

a tensor of shape num_purchases (batch_size) indicating the ID of the user who was involved in each choice instance. If no user index is provided, it's assumed that the choice instances are from the same user. user_index is required if and only if there are multiple users in the dataset, for example: (1) user-observables are involved in the utility form, (2) and/or the coefficient is user-specific. This tensor is used to select the corresponding user observables and coefficients assigned to the user (like theta_user) for making prediction for that purchase. Defaults to None.

None
session_index Optional[torch.LongTensor]

a tensor of shape num_purchases (batch_size) indicating the ID of the session when that choice instance occurred. This tensor is used to select the correct session observables or price observables for making prediction for that choice instance. Therefore, if there are no session/price observables, you can leave this argument as None. In this case, the ChoiceDataset object will assume each choice instance to be in its own session. Defaults to None.

None
item_availability Optional[torch.BoolTensor]

A boolean tensor of shape (num_sessions, num_items) indicating the availability of each item in each session. Utilities of unavailable items are set to -infinity, and hence the predicted probabilities of these unavailable items are 0. We assume all items are available if set to None. Defaults to None.

None
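
The effect of `item_availability` described above can be sketched as follows. This is only an illustration under stated assumptions: the utilities are made-up numbers, and the use of `masked_fill` followed by a softmax is an assumption about how a model might consume the mask, since the models themselves are not shown in this section.

```python
import torch

# Hypothetical utilities for 2 choice instances over 3 items.
utility = torch.tensor([[1.0, 2.0, 0.5],
                        [0.3, 0.1, 0.7]])

# Availability has shape (num_sessions, num_items); item 1 is unavailable in session 0.
item_availability = torch.BoolTensor([[True, False, True],
                                      [True, True, True]])
session_index = torch.LongTensor([0, 1])

# Look up each row's session, then push unavailable utilities to -infinity.
mask = item_availability[session_index]
masked_utility = utility.masked_fill(~mask, float('-inf'))

# Under a softmax, exp(-inf) = 0, so unavailable items receive probability 0.
prob = torch.softmax(masked_utility, dim=-1)
print(prob[0, 1].item())  # 0.0
```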

Other Kwargs (Observables): One can specify the following types of observables, where * in shape denotes any positive integer. Typically * represents the number of observables. Please refer to the documentation for a detailed guide to using observables.

1. user observables must start with 'user_' and have shape (num_users, *)
2. item observables must start with 'item_' and have shape (num_items, *)
3. session observables must start with 'session_' and have shape (num_sessions, *)
4. taste observables (those that vary by user and item) must start with taste_ and have shape (num_users, num_items, *). NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
5. price observables (those that vary by session and item) must start with price_ and have shape (num_sessions, num_items, *)
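
The naming rules above are enforced purely by key prefix. The following standalone sketch restates the `_is_*_attribute` checks from the source listing as a single function; `observable_kind` and the keyword names in the loop are hypothetical and exist only for illustration.

```python
from typing import Optional

# Names that share a prefix with observables but are excluded by the source checks.
RESERVED = {'item_index', 'item_availability', 'user_index', 'session_index'}
PREFIXES = ('user_', 'item_', 'session_', 'taste_', 'price_')


def observable_kind(key: str) -> Optional[str]:
    """Return the observable type implied by `key`, or None if it is not an observable."""
    if key in RESERVED:
        return None
    for prefix in PREFIXES:
        if key.startswith(prefix):
            return prefix.rstrip('_')
    return None


# Hypothetical keyword names, for illustration only.
for name in ['user_income', 'item_obs', 'price_obs', 'item_index', 'label']:
    print(name, '->', observable_kind(name))
```

A keyword that matches none of the prefixes is still stored on the dataset by `__init__`, but it is left out of `x_dict`.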

Source code in torch_choice/data/choice_dataset.py
def __init__(self,
             item_index: torch.LongTensor,
             label: Optional[torch.LongTensor] = None,
             user_index: Optional[torch.LongTensor] = None,
             session_index: Optional[torch.LongTensor] = None,
             item_availability: Optional[torch.BoolTensor] = None,
             **kwargs) -> None:
    """
    Initialization method for the dataset object. Researchers should supply all information about the dataset
    using this initialization method.

    The number of choice instances is called `batch_size` in the documentation. The `batch_size` corresponds to the
    file length in a wide-format dataset, and is often denoted by `N`. We call it `batch_size` to follow the convention
    in the machine learning literature.
    A `choice instance` is a row of the dataset, so there are `batch_size` choice instances in each `ChoiceDataset`.

    The dataset consists of:
    (1) a collection of `batch_size` tuples (item_id, user_id, session_id, label), where each tuple is a choice instance.
    (2) a collection of `observables` associated with item, user, session, etc.

    Args:
        item_index (torch.LongTensor): a tensor of shape (batch_size) indicating the relevant item in each row
            of the dataset, the relevant item can be:
            (1) the item bought in this choice instance,
            (2) or the item reviewed by the user. In the latter case, we need the `label` tensor to specify the rating score.
            NOTE: Support for the second case is under development; currently, only binary labels are supported.

        label (Optional[torch.LongTensor], optional): a tensor of shape (batch_size) indicating the label for prediction in
            each choice instance. If you want to predict the item bought, you can leave the `label` argument
            as `None` in the initialization method, and the model will use `item_index` as the object to be predicted.
            But if you are, for example, predicting the rating a user gave an item, `label` must be provided.
            Defaults to None.

        user_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
            the ID of the user who was involved in each choice instance. If no user index is provided, it's assumed
            that the choice instances are from the same user.
            `user_index` is required if and only if there are multiple users in the dataset, for example:
                (1) user-observables are involved in the utility form,
                (2) and/or the coefficient is user-specific.
            This tensor is used to select the corresponding user observables and coefficients assigned to the
            user (like theta_user) for making prediction for that purchase.
            Defaults to None.

        session_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
            the ID of the session when that choice instance occurred. This tensor is used to select the correct
            session observables or price observables for making prediction for that choice instance. Therefore, if
            there are no session/price observables, you can leave this argument as `None`. In this case, the `ChoiceDataset`
            object will assume each choice instance to be in its own session.
            Defaults to None.

        item_availability (Optional[torch.BoolTensor], optional): A boolean tensor of shape (num_sessions, num_items)
            indicating the availability of each item in each session. Utilities of unavailable items are set to -infinity,
            and hence the predicted probabilities of these unavailable items are 0.
            We assume all items are available if set to None.
            Defaults to None.

    Other Kwargs (Observables):
        One can specify the following types of observables, where * in shape denotes any positive
            integer. Typically * represents the number of observables.
        Please refer to the documentation for a detailed guide to using observables.
        1. user observables must start with 'user_' and have shape (num_users, *)
        2. item observables must start with 'item_' and have shape (num_items, *)
        3. session observables must start with 'session_' and have shape (num_sessions, *)
        4. taste observables (those that vary by user and item) must start with `taste_` and have shape
            (num_users, num_items, *).
        NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
        5. price observables (those that vary by session and item) must start with `price_` and have
            shape (num_sessions, num_items, *)
    """
    # ENHANCEMENT(Tianyu): add item_names for summary.
    super(ChoiceDataset, self).__init__()
    self.label = label
    self.item_index = item_index
    self.user_index = user_index
    self.session_index = session_index

    if self.session_index is None:
        # if any([x.startswith('session_') or x.startswith('price_') for x in kwargs.keys()]):
        # if any session sensitive observable is provided, but session index is not,
        # infer each row in the dataset to be a session.
        # TODO: (design choice) should we assign unique session index to each choice instance or the same session index.
        print('No `session_index` is provided, assume each choice instance is in its own session.')
        self.session_index = torch.arange(len(self.item_index)).long()

    self.item_availability = item_availability

    for key, item in kwargs.items():
        setattr(self, key, item)

    # TODO: add a validation procedure to check the consistency of the dataset.
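
The default `session_index` and the effect of `item_availability` described in the docstring can be sketched with plain PyTorch. This is only an illustration under stated assumptions, not the library's own code path: the helper `masked_choice_probabilities` and the utility values are hypothetical, and the models in this package may apply the mask differently in detail.

```python
import torch


def masked_choice_probabilities(utility: torch.Tensor,
                                item_availability: torch.BoolTensor,
                                session_index: torch.LongTensor) -> torch.Tensor:
    """Hypothetical helper: softmax over items with unavailable items masked out.

    utility has shape (batch_size, num_items); item_availability has shape
    (num_sessions, num_items); session_index has shape (batch_size,).
    """
    # look up the availability row of the session each choice instance belongs to.
    available = item_availability[session_index, :]  # (batch_size, num_items)
    # unavailable items get utility -inf, so they receive probability 0.
    masked = utility.masked_fill(~available, float('-inf'))
    return torch.softmax(masked, dim=-1)


item_index = torch.LongTensor([0, 2, 1])
# mirrors the default above: each choice instance is its own session.
session_index = torch.arange(len(item_index)).long()

# 3 sessions, 3 items; item 1 is unavailable in session 0.
item_availability = torch.ones(3, 3, dtype=torch.bool)
item_availability[0, 1] = False

utility = torch.zeros(3, 3)  # made-up utilities, all items equally attractive.
prob = masked_choice_probabilities(utility, item_availability, session_index)
```

In session 0 the probability mass is split between the two available items only, while the other sessions spread it over all three items.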
__len__(self) special

Returns number of samples in this dataset.

Returns:

Type Description
int

length of the dataset.

Source code in torch_choice/data/choice_dataset.py
def __len__(self) -> int:
    """Returns number of samples in this dataset.

    Returns:
        int: length of the dataset.
    """
    return len(self.item_index)
__repr__(self) special

A method to get a string representation of the dataset.

Returns:

Type Description
str

the string representation of the dataset.

Source code in torch_choice/data/choice_dataset.py
def __repr__(self) -> str:
    """A method to get a string representation of the dataset.

    Returns:
        str: the string representation of the dataset.
    """
    info = [
        f'{key}={self._size_repr(item)}' for key, item in self.__dict__.items()]
    return f"{self.__class__.__name__}({', '.join(info)}, device={self.device})"
apply_tensor(self, func)

This is a helper method that applies the provided function to all tensors and to the tensor values of all dictionaries.

Parameters:

Name Type Description Default
func callable

a callable function to be applied on tensors and tensor-values of dictionaries.

required

Returns:

Type Description
ChoiceDataset

the modified dataset.

Source code in torch_choice/data/choice_dataset.py
def apply_tensor(self, func: callable) -> "ChoiceDataset":
    """This is a helper method that applies the provided function to all tensors and to the tensor values of all dictionaries.

    Args:
        func (callable): a callable function to be applied on tensors and tensor-values of dictionaries.

    Returns:
        ChoiceDataset: the modified dataset.
    """
    for key, item in self.__dict__.items():
        if torch.is_tensor(item):
            setattr(self, key, func(item))
        # broadcast func to dictionaries of tensors as well.
        elif isinstance(item, dict):
            for obj_key, obj_item in item.items():
                if torch.is_tensor(obj_item):
                    # dictionary entries must be assigned by key; setattr does not work on a dict.
                    item[obj_key] = func(obj_item)
    return self
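
A minimal sketch of the same traversal, assuming a hypothetical stand-in class `TinyDataset` rather than the real `ChoiceDataset`: tensors stored as attributes are replaced directly, tensors stored inside dictionaries are replaced by key, and everything else is left alone.

```python
import torch


class TinyDataset:
    """Hypothetical stand-in for ChoiceDataset holding tensors and a dict of tensors."""

    def __init__(self) -> None:
        self.item_index = torch.LongTensor([0, 1, 1])
        self.extra = {'price_obs': torch.ones(3, 2), 'note': 'not a tensor'}

    def apply_tensor(self, func):
        for key, item in self.__dict__.items():
            if torch.is_tensor(item):
                # re-assigning an existing attribute keeps the dict size fixed,
                # so it is safe while iterating.
                setattr(self, key, func(item))
            elif isinstance(item, dict):
                for obj_key, obj_item in item.items():
                    if torch.is_tensor(obj_item):
                        # dictionary entries are updated by key.
                        item[obj_key] = func(obj_item)
        return self


dataset = TinyDataset().apply_tensor(lambda x: x.double())
```

Because the method returns the dataset itself, calls can be chained, which is how a device move can be expressed as a single `apply_tensor` call.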
clone(self)

Creates a copy of self.

Returns:

Type Description
ChoiceDataset

a copy of self.

Source code in torch_choice/data/choice_dataset.py
def clone(self) -> "ChoiceDataset":
    """Creates a copy of self.

    Returns:
        ChoiceDataset: a copy of self.
    """
    dictionary = {}
    for k, v in self.__dict__.items():
        if torch.is_tensor(v):
            dictionary[k] = v.clone()
        else:
            dictionary[k] = copy.deepcopy(v)
    return self.__class__._from_dict(dictionary)
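
The copy semantics above can be illustrated on a plain dictionary, assuming made-up contents: tensors are copied with `.clone()` and everything else with `copy.deepcopy`, so in-place edits on the copy should not leak back into the original.

```python
import copy

import torch

original = {'item_index': torch.LongTensor([0, 1, 2]), 'meta': {'name': 'demo'}}

# mirror the branching above: tensors via .clone(), everything else via deepcopy.
copied = {k: v.clone() if torch.is_tensor(v) else copy.deepcopy(v)
          for k, v in original.items()}

# in-place edits on the copy leave the original untouched.
copied['item_index'][0] = 99
copied['meta']['name'] = 'changed'
```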
to(self, device)

Moves all tensors in this dataset to the specified PyTorch device.

Parameters:

Name Type Description Default
device Union[str, torch.device]

the destination device.

required

Returns:

Type Description
ChoiceDataset

the modified dataset on the new device.

Source code in torch_choice/data/choice_dataset.py
def to(self, device: Union[str, torch.device]) -> "ChoiceDataset":
    """Moves all tensors in this dataset to the specified PyTorch device.

    Args:
        device (Union[str, torch.device]): the destination device.

    Returns:
        ChoiceDataset: the modified dataset on the new device.
    """
    return self.apply_tensor(lambda x: x.to(device))

joint_dataset

The JointDataset class is a wrapper around several ChoiceDataset objects. It is particularly useful when we need to make predictions from multiple datasets. For example, suppose you have data on consumer purchase records in a fast food store, and every customer purchases exactly one main food and one drink. In this case, you have two separate datasets: FoodDataset and DrinkDataset. You may want to use a PyTorch sampler to sample them in a dependent manner: taking the i-th sample from both datasets tells you which (food, drink) combo the i-th customer purchased. You can do this by using the JointDataset class.

Author: Tianyu Du Update: Apr. 28, 2022

JointDataset (Dataset)

A helper class for joining several PyTorch datasets. Using JointDataset with a PyTorch data loader allows sampling the same batch index from several datasets.
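
The idea of sampling the same indices from several datasets can be sketched as follows. This is a simplified illustration, not the class documented here: `JointSketch` is a hypothetical reduction of JointDataset, plain tensors stand in for ChoiceDataset objects, and the purchase records are made up.

```python
import torch


class JointSketch(torch.utils.data.Dataset):
    """Simplified stand-in for JointDataset that joins any indexable datasets."""

    def __init__(self, **datasets) -> None:
        super().__init__()
        lengths = {len(d) for d in datasets.values()}
        assert len(lengths) == 1, 'all datasets must have the same length.'
        self.datasets = datasets

    def __len__(self) -> int:
        return len(next(iter(self.datasets.values())))

    def __getitem__(self, indices):
        # the same indices are applied to every dataset.
        return {name: d[indices] for name, d in self.datasets.items()}


# made-up purchase records: entry i of each tensor belongs to customer i.
food = torch.LongTensor([2, 0, 1, 1])
drink = torch.LongTensor([0, 0, 3, 2])
joint = JointSketch(food=food, drink=drink)

batch = joint[torch.LongTensor([1, 3])]
```

Here `batch['food']` and `batch['drink']` both describe customers 1 and 3, so each (food, drink) pair stays aligned.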

Source code in torch_choice/data/joint_dataset.py
class JointDataset(torch.utils.data.Dataset):
    """A helper class for joining several PyTorch datasets. Using JointDataset
    with a PyTorch data loader allows sampling the same batch index from several
    datasets.

    The JointDataset class is a wrapper around several ChoiceDataset objects. It is particularly useful when we
    need to make predictions from multiple datasets. For example, suppose you have data on consumer purchase records in
    a fast food store, and every customer purchases exactly one main food and one drink. In this case, you have
    two separate datasets: FoodDataset and DrinkDataset. You may want to use a PyTorch sampler to sample them in a dependent
    manner: taking the i-th sample from both datasets tells you which (food, drink) combo the i-th customer
    purchased. You can do this by using the JointDataset class.
    """
    def __init__(self, **datasets) -> None:
        """The initialization method.

        Args:
            Arbitrarily many datasets with arbitrary names as keys. In the example above, you can construct
            ```
            dataset = JointDataset(food=FoodDataset, drink=DrinkDataset)
            ```
            All datasets should have the same length.

        """
        super(JointDataset, self).__init__()
        self.datasets = datasets
        # check the length of sub-datasets are the same.
        assert len(set([len(d) for d in self.datasets.values()])) == 1

    def __len__(self) -> int:
        """Get the number of samples in the joint dataset.

        Returns:
            int: the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.
        """
        for d in self.datasets.values():
            return len(d)

    def __getitem__(self, indices: Union[int, torch.LongTensor]) -> Dict[str, ChoiceDataset]:
        """Queries samples from the dataset by index.

        Args:
            indices (Union[int, torch.LongTensor]): an integer or a 1D tensor of multiple indices.

        Returns:
            Dict[str, ChoiceDataset]: the subset of the dataset. Keys of the dictionary will be names of each dataset
                contained (the same as the keys of the ``datasets`` argument in the constructor). Values will be subsets
                of contained datasets, sliced using the provided indices.
        """
        return dict((name, d[indices]) for (name, d) in self.datasets.items())

    def __repr__(self) -> str:
        """A method to get a string representation of the dataset.

        Returns:
            str: the string representation of the dataset.
        """
        out = [f'JointDataset with {len(self.datasets)} sub-datasets: (']
        for name, dataset in self.datasets.items():
            out.append(f'\t{name}: {str(dataset)}')
        out.append(')')
        return '\n'.join(out)

    @property
    def device(self) -> str:
        """Returns the device of datasets contained in the joint dataset.

        Returns:
            str: the device of the dataset.
        """
        for d in self.datasets.values():
            return d.device

    def to(self, device: Union[str, torch.device]) -> "JointDataset":
        """Moves all datasets in this dataset to the specified PyTorch device.

        Args:
            device (Union[str, torch.device]): the destination device.

        Returns:
            JointDataset: the modified dataset on the new device.
        """
        for d in self.datasets.values():
            d = d.to(device)
        return self
device: str property readonly

Returns the device of datasets contained in the joint dataset.

Returns:

Type Description
str

the device of the dataset.

__getitem__(self, indices) special

Queries samples from the dataset by index.

Parameters:

Name Type Description Default
indices Union[int, torch.LongTensor]

an integer or a 1D tensor of multiple indices.

required

Returns:

Type Description
Dict[str, ChoiceDataset]

the subset of the dataset. Keys of the dictionary will be names of each dataset contained (the same as the keys of the datasets argument in the constructor). Values will be subsets of contained datasets, sliced using the provided indices.

Source code in torch_choice/data/joint_dataset.py
def __getitem__(self, indices: Union[int, torch.LongTensor]) -> Dict[str, ChoiceDataset]:
    """Queries samples from the dataset by index.

    Args:
        indices (Union[int, torch.LongTensor]): an integer or a 1D tensor of multiple indices.

    Returns:
        Dict[str, ChoiceDataset]: the subset of the dataset. Keys of the dictionary will be names of each dataset
            contained (the same as the keys of the ``datasets`` argument in the constructor). Values will be subsets
            of contained datasets, sliced using the provided indices.
    """
    return dict((name, d[indices]) for (name, d) in self.datasets.items())
__init__(self, **datasets) special

The initialization method.

Source code in torch_choice/data/joint_dataset.py
def __init__(self, **datasets) -> None:
    """The initialization method.

    Args:
        Arbitrarily many datasets with arbitrary names as keys. In the example above, you can construct
        ```
        dataset = JointDataset(food=FoodDataset, drink=DrinkDataset)
        ```
        All datasets should have the same length.

    """
    super(JointDataset, self).__init__()
    self.datasets = datasets
    # check the length of sub-datasets are the same.
    assert len(set([len(d) for d in self.datasets.values()])) == 1
__len__(self) special

Get the number of samples in the joint dataset.

Returns:

Type Description
int

the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.

Source code in torch_choice/data/joint_dataset.py
def __len__(self) -> int:
    """Get the number of samples in the joint dataset.

    Returns:
        int: the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.
    """
    for d in self.datasets.values():
        return len(d)
__repr__(self) special

A method to get a string representation of the dataset.

Returns:

Type Description
str

the string representation of the dataset.

Source code in torch_choice/data/joint_dataset.py
def __repr__(self) -> str:
    """A method to get a string representation of the dataset.

    Returns:
        str: the string representation of the dataset.
    """
    out = [f'JointDataset with {len(self.datasets)} sub-datasets: (']
    for name, dataset in self.datasets.items():
        out.append(f'\t{name}: {str(dataset)}')
    out.append(')')
    return '\n'.join(out)
to(self, device)

Moves all datasets in this dataset to the specified PyTorch device.

Parameters:

Name Type Description Default
device Union[str, torch.device]

the destination device.

required

Returns:

Type Description
JointDataset

the modified dataset on the new device.

Source code in torch_choice/data/joint_dataset.py
def to(self, device: Union[str, torch.device]) -> "JointDataset":
    """Moves all datasets in this dataset to the specified PyTorch device.

    Args:
        device (Union[str, torch.device]): the destination device.

    Returns:
        JointDataset: the modified dataset on the new device.
    """
    for d in self.datasets.values():
        d = d.to(device)
    return self

utils

pivot3d(df, dim0, dim1, values)

Creates a tensor of shape (df[dim0].nunique(), df[dim1].nunique(), len(values)) from the provided data frame.

For example, if dim0 is the column of session IDs and dim1 is the column of alternative names, then out[t, i, k] is the feature values[k] of item i in session t. The returned tensor has shape (num_sessions, num_items, num_params), which fits the purpose of conditional logit models.

Source code in torch_choice/data/utils.py
def pivot3d(df: pd.DataFrame, dim0: str, dim1: str, values: Union[str, List[str]]) -> torch.Tensor:
    """
    Creates a tensor of shape (df[dim0].nunique(), df[dim1].nunique(), len(values)) from the
    provided data frame.

    For example, if dim0 is the column of session IDs and dim1 is the column of alternative names, then
        out[t, i, k] is the feature values[k] of item i in session t. The returned tensor
        has shape (num_sessions, num_items, num_params), which fits the purpose of conditional
        logit models.
    """
    if not isinstance(values, list):
        values = [values]

    dim1_list = sorted(df[dim1].unique())

    tensor_slice = list()
    for value in values:
        layer = df.pivot(index=dim0, columns=dim1, values=value)
        tensor_slice.append(torch.Tensor(layer[dim1_list].values))

    tensor = torch.stack(tensor_slice, dim=-1)
    assert tensor.shape == (df[dim0].nunique(), df[dim1].nunique(), len(values))
    return tensor
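
The pivoting steps can be traced on a tiny long-format data frame. This sketch repeats the logic inline rather than importing the function, and the column names and numbers are made up for illustration.

```python
import pandas as pd
import torch

# long-format toy data: one row per (session, item) pair, values are made up.
df = pd.DataFrame({
    'session': [0, 0, 1, 1],
    'item': ['bus', 'car', 'bus', 'car'],
    'price': [1.0, 5.0, 2.0, 6.0],
    'time': [30.0, 10.0, 35.0, 12.0],
})

values = ['price', 'time']
items = sorted(df['item'].unique())

layers = []
for value in values:
    # (num_sessions, num_items) table of a single feature.
    layer = df.pivot(index='session', columns='item', values=value)
    layers.append(torch.Tensor(layer[items].values))

# stack features along the last dimension: (num_sessions, num_items, num_features).
out = torch.stack(layers, dim=-1)
```

With items sorted as `['bus', 'car']`, `out[1, 1, 0]` is the price of the car in session 1 and `out[0, 0, 1]` is the travel time of the bus in session 0.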

model special

coefficient

The general class of learnable coefficients used by the models in this package; it serves as their building block. The weights (i.e., learnable parameters) in the Coefficient class are implemented using PyTorch and can be trained directly using optimizers from PyTorch.

NOTE: users of the torch-choice package don't interact with classes in this file directly; please use conditional_logit_model.py and nested_logit_model.py instead.

Author: Tianyu Du Update: Apr. 28, 2022

Coefficient (Module)

Source code in torch_choice/model/coefficient.py
class Coefficient(nn.Module):
    def __init__(self,
                 variation: str,
                 num_params: int,
                 num_items: Optional[int]=None,
                 num_users: Optional[int]=None
                 ) -> None:
        """A generic coefficient object storing trainable parameters. This class corresponds to those variables typically
        in Greek letters in the model's utility representation.

        Args:
            variation (str): the degree of variation of this coefficient. For example, the coefficient can vary by users or items.
                Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'.
                For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel.
            num_params (int): number of parameters in this coefficient. Note that this is the number of parameters
                per class, not the total number of parameters. For example, suppose we have U users and you want to initialize
                a user-specific coefficient called `theta_user`. The coefficient enters the utility by being multiplied
                with some K-dimensional observables, so each user has K parameters to be multiplied with the K-dimensional
                observable. The total number of parameters is therefore K * U (K for each of U users). In this case,
                `num_params` should be set to `K`, NOT `K*U`.
            num_items (Optional[int], optional): the number of items in the prediction problem; this is required to
                reshape the parameter correctly when the coefficient varies by items. Defaults to None.
            num_users (Optional[int], optional): number of users; this is only necessary if the coefficient varies by users.
                Defaults to None.
        """
        super(Coefficient, self).__init__()
        self.variation = variation
        self.num_items = num_items
        self.num_users = num_users
        self.num_params = num_params

        # construct the trainable.
        if self.variation == 'constant':
            # constant for all users and items.
            self.coef = nn.Parameter(torch.randn(num_params), requires_grad=True)
        elif self.variation == 'item':
            # coef depends on item j but not on user i.
            # force coefficients for the first item class to be zero.
            self.coef = nn.Parameter(torch.zeros(num_items - 1, num_params), requires_grad=True)
        elif self.variation == 'item-full':
            # coef depends on item j but not on user i.
            # model coefficient for every item.
            self.coef = nn.Parameter(torch.zeros(num_items, num_params), requires_grad=True)
        elif self.variation == 'user':
            # coef depends on the user.
            # we always model coefficient for all users.
            self.coef = nn.Parameter(torch.zeros(num_users, num_params), requires_grad=True)
        elif self.variation == 'user-item':
            # coefficients of the first item are forced to be zero, model coefficients for N - 1 items only.
            self.coef = nn.Parameter(torch.zeros(num_users, num_items - 1, num_params), requires_grad=True)
        elif self.variation == 'user-item-full':
            # construct coefficients for every item.
            self.coef = nn.Parameter(torch.zeros(num_users, num_items, num_params), requires_grad=True)
        else:
            raise ValueError(f'Unsupported type of variation: {self.variation}.')

    def __repr__(self) -> str:
        """Returns a string representation of the coefficient.

        Returns:
            str: the string representation of the coefficient.
        """
        return f'Coefficient(variation={self.variation}, num_items={self.num_items},' \
               + f' num_users={self.num_users}, num_params={self.num_params},' \
               + f' {self.coef.numel()} trainable parameters in total).'

    def forward(self,
                x: torch.Tensor,
                user_index: Optional[torch.Tensor]=None,
                manual_coef_value: Optional[torch.Tensor]=None
                ) -> torch.Tensor:
        """
        The forward function of the coefficient, which computes the utility from purchasing each item in each session.
        The output shape will be (num_sessions, num_items).

        Args:
            x (torch.Tensor): a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient
                class will NOT reshape input tensors itself, this reshaping needs to be done in the model class.
            user_index (Optional[torch.Tensor], optional): a tensor of shape (num_sessions,)
                containing the ID of the user involved in each session. If set to None, assume the same
                user is making all decisions.
                Defaults to None.
            manual_coef_value (Optional[torch.Tensor], optional): a tensor with the same number of
                entries as self.coef. If provided, the forward function uses the provided values
                as coefficients and returns the predicted utility. This is useful when
                the researcher wishes to manually specify coefficient values and examine the
                resulting predictions. If not provided, the forward function uses
                the values from self.coef.
                Defaults to None.

        Returns:
            torch.Tensor: a tensor of shape (num_sessions, num_items) whose (t, i) entry represents
                the utility of purchasing item i in session t.
        """
        if manual_coef_value is not None:
            assert manual_coef_value.numel() == self.coef.numel()
            # plugin the provided coefficient values, coef is a tensor.
            coef = manual_coef_value.reshape(*self.coef.shape)
        else:
            # use the learned coefficient values, coef is a nn.Parameter.
            coef = self.coef

        num_trips, num_items, num_feats = x.shape
        assert self.num_params == num_feats

        # cast coefficient tensor to (num_trips, num_items, self.num_params).
        if self.variation == 'constant':
            coef = coef.view(1, 1, self.num_params).expand(num_trips, num_items, -1)

        elif self.variation == 'item':
            # coef has shape (num_items-1, num_params)
            # force coefficient for the first item to be zero.
            zeros = torch.zeros(1, self.num_params).to(coef.device)
            coef = torch.cat((zeros, coef), dim=0)  # (num_items, num_params)
            coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)

        elif self.variation == 'item-full':
            # coef has shape (num_items, num_params)
            coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)

        elif self.variation == 'user':
            # coef has shape (num_users, num_params)
            coef = coef[user_index, :]  # (num_trips, num_params) user-specific coefficients.
            coef = coef.view(num_trips, 1, self.num_params).expand(-1, num_items, -1)

        elif self.variation == 'user-item':
            # (num_trips,) long tensor of user ID.
            # originally, coef has shape (num_users, num_items-1, num_params)
            # transform to (num_trips, num_items - 1, num_params), user-specific.
            coef = coef[user_index, :, :]
            # coefs for the first item for all users are enforced to 0.
            zeros = torch.zeros(num_trips, 1, self.num_params).to(coef.device)
            coef = torch.cat((zeros, coef), dim=1)  # (num_trips, num_items, num_params)

        elif self.variation == 'user-item-full':
            # originally, coef has shape (num_users, num_items, num_params)
            coef = coef[user_index, :, :]  # (num_trips, num_items, num_params)

        else:
            raise ValueError(f'Unsupported type of variation: {self.variation}.')

        assert coef.shape == (num_trips, num_items, num_feats) == x.shape

        # compute the utility of each item in each trip, take summation along the feature dimension, the same as taking
        # the inner product.
        return (x * coef).sum(dim=-1)
__init__(self, variation, num_params, num_items=None, num_users=None) special

A generic coefficient object storing trainable parameters. This class corresponds to those variables typically in Greek letters in the model's utility representation.

Parameters:

Name Type Description Default
variation str

the degree of variation of this coefficient. For example, the coefficient can vary by users or items. Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'. For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel.

required
num_params int

number of parameters in this coefficient. Note that this is the number of parameters per class, not the total number of parameters. For example, suppose we have U users and you want to initialize a user-specific coefficient called theta_user. The coefficient enters the utility by being multiplied with some K-dimensional observables, so each user has K parameters to be multiplied with the K-dimensional observable. The total number of parameters is therefore K * U (K for each of U users). In this case, num_params should be set to K, NOT K*U.

required
num_items Optional[int]

the number of items in the prediction problem; this is required to reshape the parameter correctly when the coefficient varies by items. Defaults to None.

None
num_users Optional[int]

number of users, this is only necessary if the coefficient varies by users. Defaults to None.

None
Source code in torch_choice/model/coefficient.py
def __init__(self,
             variation: str,
             num_params: int,
             num_items: Optional[int]=None,
             num_users: Optional[int]=None
             ) -> None:
    """A generic coefficient object storing trainable parameters. This class corresponds to those variables typically
    in Greek letters in the model's utility representation.

    Args:
        variation (str): the degree of variation of this coefficient. For example, the coefficient can vary by users or items.
            Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'.
            For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel.
        num_params (int): number of parameters in this coefficient. Note that this is the number of parameters
            per class, not the total number of parameters. For example, suppose we have U users and you want to initialize
            a user-specific coefficient called `theta_user`. The coefficient enters the utility by being multiplied
            with some K-dimensional observables, so each user has K parameters to be multiplied with the K-dimensional
            observable. The total number of parameters is therefore K * U (K for each of U users). In this case,
            `num_params` should be set to `K`, NOT `K*U`.
        num_items (Optional[int], optional): the number of items in the prediction problem; this is required to
            reshape the parameter correctly when the coefficient varies by items. Defaults to None.
        num_users (Optional[int], optional): number of users; this is only necessary if the coefficient varies by users.
            Defaults to None.
    """
    super(Coefficient, self).__init__()
    self.variation = variation
    self.num_items = num_items
    self.num_users = num_users
    self.num_params = num_params

    # construct the trainable.
    if self.variation == 'constant':
        # constant for all users and items.
        self.coef = nn.Parameter(torch.randn(num_params), requires_grad=True)
    elif self.variation == 'item':
        # coef depends on item j but not on user i.
        # force coefficients for the first item class to be zero.
        self.coef = nn.Parameter(torch.zeros(num_items - 1, num_params), requires_grad=True)
    elif self.variation == 'item-full':
        # coef depends on item j but not on user i.
        # model coefficient for every item.
        self.coef = nn.Parameter(torch.zeros(num_items, num_params), requires_grad=True)
    elif self.variation == 'user':
        # coef depends on the user.
        # we always model coefficient for all users.
        self.coef = nn.Parameter(torch.zeros(num_users, num_params), requires_grad=True)
    elif self.variation == 'user-item':
        # coefficients of the first item are forced to be zero, model coefficients for N - 1 items only.
        self.coef = nn.Parameter(torch.zeros(num_users, num_items - 1, num_params), requires_grad=True)
    elif self.variation == 'user-item-full':
        # construct coefficients for every item.
        self.coef = nn.Parameter(torch.zeros(num_users, num_items, num_params), requires_grad=True)
    else:
        raise ValueError(f'Unsupported type of variation: {self.variation}.')
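
The distinction between `num_params` and the total parameter count can be made concrete by tabulating the parameter shape for each variation, as read off the constructor above. The helper `coef_shape` is hypothetical and exists only for this illustration; the sizes are arbitrary.

```python
import torch


def coef_shape(variation: str, num_params: int, num_items: int, num_users: int):
    """Hypothetical helper returning the parameter shape for each variation."""
    shapes = {
        'constant': (num_params,),
        'item': (num_items - 1, num_params),
        'item-full': (num_items, num_params),
        'user': (num_users, num_params),
        'user-item': (num_users, num_items - 1, num_params),
        'user-item-full': (num_users, num_items, num_params),
    }
    return shapes[variation]


K, U, I = 3, 5, 4  # features per class, users, items (arbitrary values).
user_coef = torch.zeros(coef_shape('user', num_params=K, num_items=I, num_users=U))
item_coef = torch.zeros(coef_shape('item', num_params=K, num_items=I, num_users=U))
```

A user-specific coefficient with `num_params=K` therefore holds `K * U` trainable values, while the `'item'` variation holds one row fewer than the number of items because the first item is the reference.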
__repr__(self) special

Returns a string representation of the coefficient.

Returns:

Type Description
str

the string representation of the coefficient.

Source code in torch_choice/model/coefficient.py
def __repr__(self) -> str:
    """Returns a string representation of the coefficient.

    Returns:
        str: the string representation of the coefficient.
    """
    return f'Coefficient(variation={self.variation}, num_items={self.num_items},' \
           + f' num_users={self.num_users}, num_params={self.num_params},' \
           + f' {self.coef.numel()} trainable parameters in total).'
forward(self, x, user_index=None, manual_coef_value=None)

The forward function of the coefficient, which computes the utility from purchasing each item in each session. The output shape will be (num_sessions, num_items).

Parameters:

Name Type Description Default
x torch.Tensor

a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient class will NOT reshape input tensors itself, this reshaping needs to be done in the model class.

required
user_index Optional[torch.Tensor]

a tensor of shape (num_sessions,) containing the ID of the user involved in each session. If set to None, assume the same user is making all decisions. Defaults to None.

None
manual_coef_value Optional[torch.Tensor]

a tensor with the same number of entries as self.coef. If provided, the forward function uses the provided values as coefficients and returns the predicted utility. This is useful when the researcher wishes to manually specify coefficient values and examine the resulting predictions. If not provided, the forward function uses the values from self.coef. Defaults to None.

None

Returns:

Type Description
torch.Tensor

a tensor of shape (num_sessions, num_items) whose (t, i) entry represents the utility of purchasing item i in session t.

Source code in torch_choice/model/coefficient.py
def forward(self,
            x: torch.Tensor,
            user_index: Optional[torch.Tensor]=None,
            manual_coef_value: Optional[torch.Tensor]=None
            ) -> torch.Tensor:
    """
    The forward function of the coefficient, which computes the utility from purchasing each item in each session.
    The output shape will be (num_sessions, num_items).

    Args:
        x (torch.Tensor): a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient
            class will NOT reshape input tensors itself, this reshaping needs to be done in the model class.
        user_index (Optional[torch.Tensor], optional): a tensor of shape (num_sessions,)
            containing the ID of the user involved in each session. If set to None, assume the same
            user is making all decisions.
            Defaults to None.
        manual_coef_value (Optional[torch.Tensor], optional): a tensor with the same number of
            entries as self.coef. If provided, the forward function uses the provided values
            as coefficients and returns the predicted utility. This is useful when
            the researcher wishes to manually specify coefficient values and examine the
            resulting predictions. If not provided, the forward function uses
            the values from self.coef.
            Defaults to None.

    Returns:
        torch.Tensor: a tensor of shape (num_sessions, num_items) whose (t, i) entry represents
            the utility of purchasing item i in session t.
    """
    if manual_coef_value is not None:
        assert manual_coef_value.numel() == self.coef.numel()
        # plugin the provided coefficient values, coef is a tensor.
        coef = manual_coef_value.reshape(*self.coef.shape)
    else:
        # use the learned coefficient values, coef is a nn.Parameter.
        coef = self.coef

    num_trips, num_items, num_feats = x.shape
    assert self.num_params == num_feats

    # cast coefficient tensor to (num_trips, num_items, self.num_params).
    if self.variation == 'constant':
        coef = coef.view(1, 1, self.num_params).expand(num_trips, num_items, -1)

    elif self.variation == 'item':
        # coef has shape (num_items-1, num_params)
        # force coefficient for the first item to be zero.
        zeros = torch.zeros(1, self.num_params).to(coef.device)
        coef = torch.cat((zeros, coef), dim=0)  # (num_items, num_params)
        coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)

    elif self.variation == 'item-full':
        # coef has shape (num_items, num_params)
        coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)

    elif self.variation == 'user':
        # coef has shape (num_users, num_params)
        coef = coef[user_index, :]  # (num_trips, num_params) user-specific coefficients.
        coef = coef.view(num_trips, 1, self.num_params).expand(-1, num_items, -1)

    elif self.variation == 'user-item':
        # (num_trips,) long tensor of user ID.
        # originally, coef has shape (num_users, num_items-1, num_params)
        # transform to (num_trips, num_items - 1, num_params), user-specific.
        coef = coef[user_index, :, :]
        # coefs for the first item for all users are enforced to 0.
        zeros = torch.zeros(num_trips, 1, self.num_params).to(coef.device)
        coef = torch.cat((zeros, coef), dim=1)  # (num_trips, num_items, num_params)

    elif self.variation == 'user-item-full':
        # originally, coef has shape (num_users, num_items, num_params)
        coef = coef[user_index, :, :]  # (num_trips, num_items, num_params)

    else:
        raise ValueError(f'Unsupported type of variation: {self.variation}.')

    assert coef.shape == (num_trips, num_items, num_feats) == x.shape

    # compute the utility of each item in each trip, take summation along the feature dimension, the same as taking
    # the inner product.
    return (x * coef).sum(dim=-1)
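
To make the `item` branch above concrete, here is a minimal, dependency-free sketch that uses nested Python lists in place of tensors. The helper name `item_variation_utility` and the toy numbers are illustrative assumptions and are not part of torch_choice; the sketch only mirrors the zero-padding and inner-product steps shown in the source.

```python
from typing import List


def item_variation_utility(x: List[List[List[float]]],
                           coef: List[List[float]]) -> List[List[float]]:
    """Hypothetical helper mirroring the 'item' branch with plain lists.

    x has shape (num_trips, num_items, num_params);
    coef has shape (num_items - 1, num_params).
    """
    num_params = len(x[0][0])
    # Prepend a row of zeros so the first item's coefficients are fixed at zero.
    full_coef = [[0.0] * num_params] + [list(row) for row in coef]
    assert len(full_coef) == len(x[0]), "coef must have num_items - 1 rows"
    # Inner product over the feature dimension for every (trip, item) pair.
    return [
        [sum(f * c for f, c in zip(features, full_coef[i]))
         for i, features in enumerate(trip)]
        for trip in x
    ]


# Two trips, three items, two features per item (toy values).
x = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]],
]
coef = [[0.5, -1.0], [2.0, 0.0]]  # rows for items 1 and 2; item 0 is the baseline
utility = item_variation_utility(x, coef)
print(utility)
```

The utility of the first item is zero in every trip because its coefficients are pinned to zero, so the remaining items are measured relative to that baseline.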

conditional_logit_model

Conditional Logit Model.

Author: Tianyu Du Date: Aug. 8, 2021 Update: Apr. 28, 2022

ConditionalLogitModel (Module)

A generalized version of the conditional logit model. It allows for research-specific variable types (groups) and different levels of variation for the coefficients.

Note: unless the -full flag is specified (which means coefficients are modeled explicitly for all items), all variation levels related to items (item-specific and user-item-specific) force the coefficients for the first item to be zero. This design follows standard econometric practice.

The model allows for the following levels of variation:

  • constant: constant over all users and items,

  • user: user-specific parameters but constant across all items,

  • item: item-specific parameters but constant across all users; parameters for the first item are forced to be zero.

  • item-full: item-specific parameters but constant across all users, modeled explicitly for all items.

  • user-item: parameters that are specific to both user and item; parameters for the first item are forced to be zero for all users.

  • user-item-full: parameters that are specific to both user and item, modeled explicitly for all items.
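
As a rough illustration of how the variation level drives the number of free parameters, the sketch below tabulates the coefficient shape for each level. The shapes for the item and user levels follow the comments in the Coefficient forward pass shown earlier; the `(num_params,)` shape for `constant` and the helper name `coef_shape` are assumptions made for this example.

```python
from math import prod
from typing import Tuple


def coef_shape(variation: str, num_items: int, num_users: int, num_params: int) -> Tuple[int, ...]:
    """Hypothetical helper: shape of the learnable coefficient for each variation level."""
    shapes = {
        'constant': (num_params,),  # assumed layout; only the element count is implied
        'user': (num_users, num_params),
        'item': (num_items - 1, num_params),  # first item fixed at zero
        'item-full': (num_items, num_params),
        'user-item': (num_users, num_items - 1, num_params),  # first item fixed at zero
        'user-item-full': (num_users, num_items, num_params),
    }
    if variation not in shapes:
        raise ValueError(f'Unsupported type of variation: {variation}.')
    return shapes[variation]


for variation in ['constant', 'user', 'item', 'item-full', 'user-item', 'user-item-full']:
    shape = coef_shape(variation, num_items=4, num_users=10, num_params=3)
    print(f'{variation:15s} shape={shape} free parameters={prod(shape)}')
```

With 4 items, 10 users and 3 features, the `user-item` level already needs 90 free parameters, which is why the user-item levels grow quickly with the size of the dataset.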
Source code in torch_choice/model/conditional_logit_model.py
class ConditionalLogitModel(nn.Module):
    """The more generalized version of conditional logit model, the model allows for research specific
    variable types (groups) and different levels of variation for the coefficients.

    The model allows for the following levels of variation:
    NOTE: unless the `-full` flag is specified (which means we want to explicitly model coefficients
        for all items), for all variation levels related to item (item-specific and user-item-specific),
        the model forces the coefficients for the first item to be zero. This design follows standard
        econometric practice.

    - constant: constant over all users and items,

    - user: user-specific parameters but constant across all items,

    - item: item-specific parameters but constant across all users, parameters for the first item are
        forced to be zero.
    - item-full: item-specific parameters but constant across all users, explicitly model for all items.

    - user-item: parameters that are specific to both user and item, parameter for the first item
        for all users are forced to be zero.
    - user-item-full: parameters that are specific to both user and item, explicitly model for all items.
    """

    def __init__(self,
                 coef_variation_dict: Dict[str, str],
                 num_param_dict: Optional[Dict[str, int]]=None,
                 num_items: Optional[int]=None,
                 num_users: Optional[int]=None
                 ) -> None:
        r"""
        Args:
            num_items (int): number of items in the dataset.
            num_users (int): number of users in the dataset.
            coef_variation_dict (Dict[str, str]): variable type to variation level dictionary. Keys of this dictionary
                should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc.), or `intercept`
                if the researcher requires an intercept term.
                For each variable name X_var (e.g., `user_income`) or `intercept`, the corresponding dictionary value should
                be one of the following; this value specifies the "level of variation" of the coefficient.

                - `constant`: the coefficient constant over all users and items: $X \beta$.

                - `user`: user-specific parameters but constant across all items: $X \beta_{u}$.

                - `item`: item-specific parameters but constant across all users, $X \beta_{i}$.
                    Note that the coefficients for the first item are forced to be zero following the standard practice
                    in econometrics.

                - `item-full`: the same configuration as `item`, but does not force the coefficients of the first item to
                    be zeros.

                The following configurations are supported by the package, but we don't recommend using them due to the
                    large number of parameters.
                - `user-item`: parameters that are specific to both user and item, parameter for the first item
                    for all users are forced to be zero.

                - `user-item-full`: parameters that are specific to both user and item, explicitly model for all items.

            num_param_dict (Optional[Dict[str, int]]): variable type to number of parameters dictionary with keys exactly the same
                as the `coef_variation_dict`. Values of `num_param_dict` records numbers of features in each kind of variable.
                If None is supplied, num_param_dict will be a dictionary with the same keys as the `coef_variation_dict` dictionary
                and values of all ones. Default to be None.
        """
        super(ConditionalLogitModel, self).__init__()

        if num_param_dict is None:
            num_param_dict = {key:1 for key in coef_variation_dict.keys()}

        assert coef_variation_dict.keys() == num_param_dict.keys()

        self.variable_types = list(deepcopy(num_param_dict).keys())

        self.coef_variation_dict = deepcopy(coef_variation_dict)
        self.num_param_dict = deepcopy(num_param_dict)

        self.num_items = num_items
        self.num_users = num_users

        # check number of parameters specified are all positive.
        for var_type, num_params in self.num_param_dict.items():
            assert num_params > 0, f'num_params needs to be positive, got: {num_params}.'

        # infer the number of parameters for intercept if the researcher forgets.
        if 'intercept' in self.coef_variation_dict.keys() and 'intercept' not in self.num_param_dict.keys():
            warnings.warn("'intercept' key found in coef_variation_dict but not in num_param_dict, num_param_dict['intercept'] has been set to 1.")
            self.num_param_dict['intercept'] = 1

        # construct trainable parameters.
        coef_dict = dict()
        for var_type, variation in self.coef_variation_dict.items():
            coef_dict[var_type] = Coefficient(variation=variation,
                                              num_items=self.num_items,
                                              num_users=self.num_users,
                                              num_params=self.num_param_dict[var_type])
        # A ModuleDict is required to properly register all trainable parameters.
        # self.parameters() will not include them if a plain Python dictionary is used instead.
        self.coef_dict = nn.ModuleDict(coef_dict)

    def __repr__(self) -> str:
        """Return a string representation of the model.

        Returns:
            str: the string representation of the model.
        """
        out_str_lst = ['Conditional logistic discrete choice model, expects input features:\n']
        for var_type, num_params in self.num_param_dict.items():
            out_str_lst.append(f'X[{var_type}] with {num_params} parameters, with {self.coef_variation_dict[var_type]} level variation.')
        return super().__repr__() + '\n' + '\n'.join(out_str_lst)

    @property
    def num_params(self) -> int:
        """Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied
        with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no
        intercept is involved.

        Returns:
            int: the total number of learnable parameters.
        """
        return sum(w.numel() for w in self.parameters())

    def summary(self):
        """Print out the current model parameters."""
        for var_type, coefficient in self.coef_dict.items():
            if coefficient is not None:
                print('Variable Type: ', var_type)
                print(coefficient.coef)

    def forward(self,
                batch: ChoiceDataset,
                manual_coef_value_dict: Optional[Dict[str, torch.Tensor]] = None
                ) -> torch.Tensor:
        """
        Forward pass of the model.

        Args:
            batch: a `ChoiceDataset` object.

            manual_coef_value_dict (Optional[Dict[str, torch.Tensor]], optional): a dictionary whose
                keys are the variable types in self.coef_dict and whose values are tensors. If provided,
                the model will force the coefficients to be the provided values and compute utility
                conditioned on them. This feature is useful when the researcher wishes to plug in particular
                values of coefficients and examine the utility values. If not provided, the model will
                use the learned coefficient values in self.coef_dict.
                Defaults to None.

        Returns:
            torch.Tensor: a tensor of shape (num_trips, num_items) whose (t, i) entry represents
                the utility from item i in trip t for the user involved in that trip.
        """
        x_dict = batch.x_dict

        if 'intercept' in self.coef_variation_dict.keys():
            # the intercept has no input tensor; use a constant feature of ones (1 feature).
            x_dict['intercept'] = torch.ones((len(batch), self.num_items, 1), device=batch.device)

        # compute the utility from each item in each choice session.
        total_utility = torch.zeros((len(batch), self.num_items), device=batch.device)
        # for each type of variables, apply the corresponding coefficient to input x.

        for var_type, coef in self.coef_dict.items():
            total_utility += coef(
                x_dict[var_type], batch.user_index,
                manual_coef_value=None if manual_coef_value_dict is None else manual_coef_value_dict[var_type])

        assert total_utility.shape == (len(batch), self.num_items)

        if batch.item_availability is not None:
            # mask out unavailable items.
            total_utility[~batch.item_availability[batch.session_index, :]] = torch.finfo(total_utility.dtype).min / 2
        return total_utility


    def negative_log_likelihood(self, batch: ChoiceDataset, y: torch.Tensor, is_train: bool=True) -> torch.Tensor:
        """Computes the negative log-likelihood for the batch and label.
        TODO: consider removing y, changing to label.
        TODO: consider moving this method outside the model; the role of the model is to compute the utility.

        Args:
            batch (ChoiceDataset): a ChoiceDataset object containing the data.
            y (torch.Tensor): the label.
            is_train (bool, optional): whether to put the model in training mode (`self.train()`) rather than
                evaluation mode (`self.eval()`). Defaults to True.

        Returns:
            torch.Tensor: the negative log-likelihood.
        """
        if is_train:
            self.train()
        else:
            self.eval()
        # (num_trips, num_items)
        total_utility = self.forward(batch)
        logP = torch.log_softmax(total_utility, dim=1)
        nll = - logP[torch.arange(len(y)), y].sum()
        return nll


    # NOTE: the method for computing Hessian and standard deviation has been moved to std.py.
    # @staticmethod
    # def flatten_coef_dict(coef_dict: Dict[str, Union[torch.Tensor, torch.nn.Parameter]]) -> Tuple[torch.Tensor, dict]:
    #     """Flattens the coef_dict into a 1-dimension tensor, used for hessian computation.

    #     Args:
    #         coef_dict (Dict[str, Union[torch.Tensor, torch.nn.Parameter]]): a dictionary holding learnable parameters.

    #     Returns:
    #         Tuple[torch.Tensor, dict]: 1. the flattened tensors with shape (num_params,), 2. an indexing dictionary
    #             used for reconstructing the original coef_dict from the flatten tensor.
    #     """
    #     type2idx = dict()
    #     param_list = list()
    #     start = 0

    #     for var_type in coef_dict.keys():
    #         num_params = coef_dict[var_type].coef.numel()
    #         # track which portion of all_param tensor belongs to this variable type.
    #         type2idx[var_type] = (start, start + num_params)
    #         start += num_params
    #         # use reshape instead of view to make a copy.
    #         param_list.append(coef_dict[var_type].coef.clone().reshape(-1,))

    #     all_param = torch.cat(param_list)  # (self.num_params(), )
    #     return all_param, type2idx

    # @staticmethod
    # def unwrap_coef_dict(param: torch.Tensor, type2idx: Dict[str, Tuple[int, int]]) -> Dict[str, torch.Tensor]:
    #     """Rebuilds coef_dict from output of self.flatten_coef_dict method.

    #     Args:
    #         param (torch.Tensor): the flattened coef_dict from self.flatten_coef_dict.
    #         type2idx (Dict[str, Tuple[int, int]]): the indexing dictionary from self.flatten_coef_dict.

    #     Returns:
    #         Dict[str, torch.Tensor]: the re-constructed coefficient dictionary.
    #     """
    #     coef_dict = dict()
    #     for var_type in type2idx.keys():
    #         start, end = type2idx[var_type]
    #         # no need to reshape here, Coefficient handles it.
    #         coef_dict[var_type] = param[start:end]
    #     return coef_dict

    # def compute_hessian(self, x_dict, availability, user_index, y) -> torch.Tensor:
    #     """Computes the Hessian of negative log-likelihood (total cross-entropy loss) with respect
    #     to all parameters in this model. The Hessian can be later used for constructing the standard deviation of
    #     parameters.

    #     Args:
    #         x_dict ,availability, user_index: see definitions in self.forward method.
    #         y (torch.LongTensor): a tensor with shape (num_trips,) of IDs of items actually purchased.

    #     Returns:
    #         torch.Tensor: a (self.num_params, self.num_params) tensor of the Hessian matrix.
    #     """
    #     all_coefs, type2idx = self.flatten_coef_dict(self.coef_dict)

    #     def compute_nll(P: torch.Tensor) -> float:
    #         coef_dict = self.unwrap_coef_dict(P, type2idx)
    #         y_pred = self._forward(x_dict=x_dict,
    #                                availability=availability,
    #                                user_index=user_index,
    #                                manual_coef_value_dict=coef_dict)
    #         # the reduction needs to be 'sum' to obtain NLL.
    #         loss = F.cross_entropy(y_pred, y, reduction='sum')
    #         return loss

    #     H = torch.autograd.functional.hessian(compute_nll, all_coefs)
    #     assert H.shape == (self.num_params, self.num_params)
    #     return H

    # def compute_std(self, x_dict, availability, user_index, y) -> Dict[str, torch.Tensor]:
    #     """Computes the standard errors of the coefficients.

    #     Args:
    #         See definitions in self.compute_hessian.

    #     Returns:
    #         Dict[str, torch.Tensor]: a dictionary whose keys are the same as self.coef_dict.keys()
    #         the values are standard errors of coefficients in each coefficient group.
    #     """
    #     _, type2idx = self.flatten_coef_dict(self.coef_dict)
    #     H = self.compute_hessian(x_dict, availability, user_index, y)
    #     std_all = torch.sqrt(torch.diag(torch.inverse(H)))
    #     std_dict = dict()
    #     for var_type in type2idx.keys():
    #         # get std of variables belonging to each type.
    #         start, end = type2idx[var_type]
    #         std_dict[var_type] = std_all[start:end]
    #     return std_dict
num_params: int property readonly

Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no intercept is involved.

Returns:

| Type | Description |
| --- | --- |
| int | the total number of learnable parameters. |

__init__(self, coef_variation_dict, num_param_dict=None, num_items=None, num_users=None) special

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| num_items | int | number of items in the dataset. | None |
| num_users | int | number of users in the dataset. | None |
| coef_variation_dict | Dict[str, str] | variable type to variation level dictionary; the accepted keys and values are described after this table. | required |
| num_param_dict | Optional[Dict[str, int]] | variable type to number of parameters dictionary with exactly the same keys as coef_variation_dict. Each value records the number of features in that kind of variable. If None is supplied, every key of coef_variation_dict is mapped to 1. | None |

Keys of coef_variation_dict should be variable names in the dataset (i.e., those starting with price_, user_, etc.), or intercept if the researcher requires an intercept term. For each variable name X_var (e.g., user_income) or intercept, the corresponding dictionary value should be one of the following; this value specifies the "level of variation" of the coefficient.

  • constant: the coefficient is constant over all users and items: \(X \beta\).

  • user: user-specific parameters but constant across all items: \(X \beta_{u}\).

  • item: item-specific parameters but constant across all users: \(X \beta_{i}\). Note that the coefficients for the first item are forced to be zero following the standard practice in econometrics.

  • item-full: the same configuration as item, but does not force the coefficients of the first item to be zeros.

The following configurations are supported by the package, but we don't recommend using them due to the large number of parameters.

  • user-item: parameters that are specific to both user and item; parameters for the first item are forced to be zero for all users.

  • user-item-full: parameters that are specific to both user and item, modeled explicitly for all items.
Source code in torch_choice/model/conditional_logit_model.py
def __init__(self,
             coef_variation_dict: Dict[str, str],
             num_param_dict: Optional[Dict[str, int]]=None,
             num_items: Optional[int]=None,
             num_users: Optional[int]=None
             ) -> None:
    r"""
    Args:
        num_items (int): number of items in the dataset.
        num_users (int): number of users in the dataset.
        coef_variation_dict (Dict[str, str]): variable type to variation level dictionary. Keys of this dictionary
            should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc.), or `intercept`
            if the researcher requires an intercept term.
            For each variable name X_var (e.g., `user_income`) or `intercept`, the corresponding dictionary value should
            be one of the following; this value specifies the "level of variation" of the coefficient.

            - `constant`: the coefficient constant over all users and items: $X \beta$.

            - `user`: user-specific parameters but constant across all items: $X \beta_{u}$.

            - `item`: item-specific parameters but constant across all users, $X \beta_{i}$.
                Note that the coefficients for the first item are forced to be zero following the standard practice
                in econometrics.

            - `item-full`: the same configuration as `item`, but does not force the coefficients of the first item to
                be zeros.

            The following configurations are supported by the package, but we don't recommend using them due to the
                large number of parameters.
            - `user-item`: parameters that are specific to both user and item, parameter for the first item
                for all users are forced to be zero.

            - `user-item-full`: parameters that are specific to both user and item, explicitly model for all items.

        num_param_dict (Optional[Dict[str, int]]): variable type to number of parameters dictionary with keys exactly the same
            as the `coef_variation_dict`. Values of `num_param_dict` records numbers of features in each kind of variable.
            If None is supplied, num_param_dict will be a dictionary with the same keys as the `coef_variation_dict` dictionary
            and values of all ones. Default to be None.
    """
    super(ConditionalLogitModel, self).__init__()

    if num_param_dict is None:
        num_param_dict = {key:1 for key in coef_variation_dict.keys()}

    assert coef_variation_dict.keys() == num_param_dict.keys()

    self.variable_types = list(deepcopy(num_param_dict).keys())

    self.coef_variation_dict = deepcopy(coef_variation_dict)
    self.num_param_dict = deepcopy(num_param_dict)

    self.num_items = num_items
    self.num_users = num_users

    # check number of parameters specified are all positive.
    for var_type, num_params in self.num_param_dict.items():
        assert num_params > 0, f'num_params needs to be positive, got: {num_params}.'

    # infer the number of parameters for intercept if the researcher forgets.
    if 'intercept' in self.coef_variation_dict.keys() and 'intercept' not in self.num_param_dict.keys():
        warnings.warn("'intercept' key found in coef_variation_dict but not in num_param_dict, num_param_dict['intercept'] has been set to 1.")
        self.num_param_dict['intercept'] = 1

    # construct trainable parameters.
    coef_dict = dict()
    for var_type, variation in self.coef_variation_dict.items():
        coef_dict[var_type] = Coefficient(variation=variation,
                                          num_items=self.num_items,
                                          num_users=self.num_users,
                                          num_params=self.num_param_dict[var_type])
    # A ModuleDict is required to properly register all trainable parameters.
    # self.parameters() will not include them if a plain Python dictionary is used instead.
    self.coef_dict = nn.ModuleDict(coef_dict)
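
The argument handling at the top of the constructor can be sketched without PyTorch as follows. The helper name `resolve_num_param_dict` and the variable names `price_obs` and `user_obs` are hypothetical, and the sketch raises `ValueError` where the source uses `assert`; that substitution is a choice made for this example, not the library's behavior.

```python
from copy import deepcopy
from typing import Dict, Optional


def resolve_num_param_dict(coef_variation_dict: Dict[str, str],
                           num_param_dict: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Hypothetical helper reproducing the constructor's argument checks."""
    if num_param_dict is None:
        # Default: one parameter for every variable type.
        num_param_dict = {key: 1 for key in coef_variation_dict}
    if coef_variation_dict.keys() != num_param_dict.keys():
        raise ValueError('coef_variation_dict and num_param_dict must have identical keys.')
    for var_type, num_params in num_param_dict.items():
        if num_params <= 0:
            raise ValueError(f'num_params for {var_type!r} needs to be positive, got: {num_params}.')
    # Return a copy so later changes by the caller do not affect the model.
    return deepcopy(num_param_dict)


# Omitting num_param_dict maps every variable type to a single parameter.
defaults = resolve_num_param_dict({'price_obs': 'constant', 'intercept': 'item'})
print(defaults)

# Supplying it explicitly requires exactly the same keys.
explicit = resolve_num_param_dict({'price_obs': 'constant', 'user_obs': 'item'},
                                  {'price_obs': 2, 'user_obs': 3})
print(explicit)
```

Because the two key sets must match, an `intercept` entry has to appear in both dictionaries whenever `num_param_dict` is supplied.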
__repr__(self) special

Return a string representation of the model.

Returns:

| Type | Description |
| --- | --- |
| str | the string representation of the model. |

Source code in torch_choice/model/conditional_logit_model.py
def __repr__(self) -> str:
    """Return a string representation of the model.

    Returns:
        str: the string representation of the model.
    """
    out_str_lst = ['Conditional logistic discrete choice model, expects input features:\n']
    for var_type, num_params in self.num_param_dict.items():
        out_str_lst.append(f'X[{var_type}] with {num_params} parameters, with {self.coef_variation_dict[var_type]} level variation.')
    return super().__repr__() + '\n' + '\n'.join(out_str_lst)
forward(self, batch, manual_coef_value_dict=None)

Forward pass of the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| batch | ChoiceDataset | a ChoiceDataset object. | required |
| manual_coef_value_dict | Optional[Dict[str, torch.Tensor]] | a dictionary whose keys are the variable types in self.coef_dict and whose values are tensors. If provided, the model will force the coefficients to be the provided values and compute utility conditioned on them. This feature is useful when the researcher wishes to plug in particular values of coefficients and examine the utility values. If not provided, the model will use the learned coefficient values in self.coef_dict. | None |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | a tensor of shape (num_trips, num_items) whose (t, i) entry represents the utility from item i in trip t for the user involved in that trip. |

Source code in torch_choice/model/conditional_logit_model.py
def forward(self,
            batch: ChoiceDataset,
            manual_coef_value_dict: Optional[Dict[str, torch.Tensor]] = None
            ) -> torch.Tensor:
    """
    Forward pass of the model.

    Args:
        batch: a `ChoiceDataset` object.

        manual_coef_value_dict (Optional[Dict[str, torch.Tensor]], optional): a dictionary whose
            keys are the variable types in self.coef_dict and whose values are tensors. If provided,
            the model will force the coefficients to be the provided values and compute utility
            conditioned on them. This feature is useful when the researcher wishes to plug in particular
            values of coefficients and examine the utility values. If not provided, the model will
            use the learned coefficient values in self.coef_dict.
            Defaults to None.

    Returns:
        torch.Tensor: a tensor of shape (num_trips, num_items) whose (t, i) entry represents
            the utility from item i in trip t for the user involved in that trip.
    """
    x_dict = batch.x_dict

    if 'intercept' in self.coef_variation_dict.keys():
        # the intercept has no input tensor; use a constant feature of ones (1 feature).
        x_dict['intercept'] = torch.ones((len(batch), self.num_items, 1), device=batch.device)

    # compute the utility from each item in each choice session.
    total_utility = torch.zeros((len(batch), self.num_items), device=batch.device)
    # for each type of variables, apply the corresponding coefficient to input x.

    for var_type, coef in self.coef_dict.items():
        total_utility += coef(
            x_dict[var_type], batch.user_index,
            manual_coef_value=None if manual_coef_value_dict is None else manual_coef_value_dict[var_type])

    assert total_utility.shape == (len(batch), self.num_items)

    if batch.item_availability is not None:
        # mask out unavailable items.
        total_utility[~batch.item_availability[batch.session_index, :]] = torch.finfo(total_utility.dtype).min / 2
    return total_utility
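
The availability mask at the end of the forward pass replaces the utility of unavailable items with a very large negative number, so that a subsequent softmax assigns them essentially zero probability. Below is a minimal pure-Python sketch of that effect for a single session; the helper name `masked_choice_probabilities` is hypothetical, and the constant stands in for `torch.finfo(torch.float32).min / 2` under the assumption that utilities are float32.

```python
import math
from typing import List

# Stand-in for torch.finfo(torch.float32).min / 2 (assumes float32 utilities).
MASK_VALUE = -3.4028234663852886e38 / 2


def masked_choice_probabilities(utility: List[float], available: List[bool]) -> List[float]:
    """Hypothetical helper: softmax over one session after masking unavailable items."""
    masked = [u if a else MASK_VALUE for u, a in zip(utility, available)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(masked)
    weights = [math.exp(u - top) for u in masked]
    total = sum(weights)
    return [w / total for w in weights]


# The second item is unavailable in this session.
probs = masked_choice_probabilities([1.0, 2.0, 0.5], [True, False, True])
print(probs)
```

Although the masked item has the highest raw utility here, its probability underflows to zero and the remaining probability mass is shared by the available items.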
negative_log_likelihood(self, batch, y, is_train=True)

Computes the negative log-likelihood for the batch and label. TODO: consider removing y, changing to label. TODO: consider moving this method outside the model; the role of the model is to compute the utility.

Parameters:

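| Name | Type | Description | Default |
| --- | --- | --- | --- |
| batch | ChoiceDataset | a ChoiceDataset object containing the data. | required |
| y | torch.Tensor | the label. | required |
| is_train | bool | whether to put the model in training mode (self.train()) rather than evaluation mode (self.eval()). | True |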

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | the negative log-likelihood. |

Source code in torch_choice/model/conditional_logit_model.py
def negative_log_likelihood(self, batch: ChoiceDataset, y: torch.Tensor, is_train: bool=True) -> torch.Tensor:
    """Computes the negative log-likelihood for the batch and label.
    TODO: consider removing y, changing to label.
    TODO: consider moving this method outside the model; the role of the model is to compute the utility.

    Args:
        batch (ChoiceDataset): a ChoiceDataset object containing the data.
        y (torch.Tensor): the label.
        is_train (bool, optional): whether to put the model in training mode (`self.train()`) rather than
            evaluation mode (`self.eval()`). Defaults to True.

    Returns:
        torch.Tensor: the negative log-likelihood.
    """
    if is_train:
        self.train()
    else:
        self.eval()
    # (num_trips, num_items)
    total_utility = self.forward(batch)
    logP = torch.log_softmax(total_utility, dim=1)
    nll = - logP[torch.arange(len(y)), y].sum()
    return nll
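
The quantity returned above is the sum over trips of minus the log-softmax probability of the chosen item. The following dependency-free sketch reproduces that computation with plain Python lists so the arithmetic can be checked by hand; the function names are local to this example and the utilities are toy values.

```python
import math
from typing import List


def log_softmax(row: List[float]) -> List[float]:
    """Numerically stable log-softmax of one row of utilities."""
    top = max(row)
    log_norm = top + math.log(sum(math.exp(u - top) for u in row))
    return [u - log_norm for u in row]


def negative_log_likelihood(total_utility: List[List[float]], y: List[int]) -> float:
    """Sum of -log P(chosen item) over all trips; y[t] is the item chosen in trip t."""
    return -sum(log_softmax(row)[chosen] for row, chosen in zip(total_utility, y))


# Two trips with two equally attractive items each: every choice has probability 1/2.
total_utility = [[0.0, 0.0], [1.0, 1.0]]
y = [0, 1]
nll = negative_log_likelihood(total_utility, y)
print(nll)  # equals 2 * log(2) up to floating-point rounding
```

Note that the result is a sum rather than a mean, so its magnitude scales with the number of choice instances in the batch.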
summary(self)

Print out the current model parameters.

Source code in torch_choice/model/conditional_logit_model.py
def summary(self):
    """Print out the current model parameters."""
    for var_type, coefficient in self.coef_dict.items():
        if coefficient is not None:
            print('Variable Type: ', var_type)
            print(coefficient.coef)

nested_logit_model

Implementation of the nested logit model; see page 86 of the book "Discrete Choice Methods with Simulation" by Train for more details.

Author: Tianyu Du Update: Apr. 28, 2022
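
For orientation, the standard textbook form of the nested logit choice probability can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not the package's implementation: the inclusive value is taken to be the log-sum-exp of item utilities scaled by the category's lambda, and the helper name `nested_logit_probabilities` and all toy numbers are hypothetical.

```python
import math
from typing import Dict, List


def nested_logit_probabilities(item_utility: Dict[int, float],
                               category_to_item: Dict[str, List[int]],
                               category_utility: Dict[str, float],
                               lambdas: Dict[str, float]) -> Dict[int, float]:
    """Hypothetical helper: P(item) = P(category) * P(item | category) for one choice instance."""
    inclusive_value = {}
    for k, items in category_to_item.items():
        # Inclusive value: log-sum-exp of the scaled item utilities within category k.
        inclusive_value[k] = math.log(sum(math.exp(item_utility[j] / lambdas[k]) for j in items))
    # Category-level utility = lambda * inclusive value + other category-level terms.
    score = {k: category_utility[k] + lambdas[k] * inclusive_value[k] for k in category_to_item}
    norm = sum(math.exp(s) for s in score.values())
    probs = {}
    for k, items in category_to_item.items():
        p_category = math.exp(score[k]) / norm
        for j in items:
            # P(item | category) = exp(Y_j / lambda_k) / exp(inclusive value of k).
            p_item_given_category = math.exp(item_utility[j] / lambdas[k] - inclusive_value[k])
            probs[j] = p_category * p_item_given_category
    return probs


category_to_item = {'a': [0, 1], 'b': [2]}
item_utility = {0: 0.2, 1: -0.1, 2: 0.4}
category_utility = {'a': 0.0, 'b': 0.0}
lambdas = {'a': 0.5, 'b': 0.5}
probs = nested_logit_probabilities(item_utility, category_to_item, category_utility, lambdas)
print(probs)
```

A useful sanity check is that setting every lambda to 1 and every category-level utility to 0 collapses the model to the ordinary multinomial logit over all items.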

NestedLogitModel (Module)

Source code in torch_choice/model/nested_logit_model.py
class NestedLogitModel(nn.Module):
    def __init__(self,
                 category_to_item: Dict[object, List[int]],
                 category_coef_variation_dict: Dict[str, str],
                 category_num_param_dict: Dict[str, int],
                 item_coef_variation_dict: Dict[str, str],
                 item_num_param_dict: Dict[str, int],
                 num_users: Optional[int]=None,
                 shared_lambda: bool=False
                 ) -> None:
        """Initialization method of the nested logit model.

        Args:
            category_to_item (Dict[object, List[int]]): a dictionary that maps a category ID to the list
                of item IDs belonging to that category.

            category_coef_variation_dict (Dict[str, str]): a dictionary that maps a variable type
                (i.e., variable group) to the level of variation for the coefficient of this type
                of variables.
            category_num_param_dict (Dict[str, int]): a dictionary that maps a variable type name to
                the number of parameters in this variable group.

            item_coef_variation_dict (Dict[str, str]): the same as category_coef_variation_dict but
                for item features.
            item_num_param_dict (Dict[str, int]): the same as category_num_param_dict but for item
                features.

            num_users (Optional[int], optional): number of users to be modelled, this is only
                required if any variable type requires user-specific variations.
                Defaults to None.

            shared_lambda (bool): a boolean indicating whether to enforce the elasticity lambda, which
                is the coefficient for inclusive values, to be constant for all categories.
                The lambda enters the category-level selection as follows:
                Utility of choosing category k = lambda * inclusive value of category k
                                               + linear combination of some other category level features
                If set to True, a single lambda will be learned for all categories, otherwise, the
                model learns an individual lambda for each category.
                Defaults to False.
        """
        super(NestedLogitModel, self).__init__()
        self.category_to_item = category_to_item
        self.category_coef_variation_dict = category_coef_variation_dict
        self.category_num_param_dict = category_num_param_dict
        self.item_coef_variation_dict = item_coef_variation_dict
        self.item_num_param_dict = item_num_param_dict
        self.num_users = num_users

        self.categories = list(category_to_item.keys())
        self.num_categories = len(self.categories)
        self.num_items = sum(len(items) for items in category_to_item.values())

        # category coefficients.
        self.category_coef_dict = self._build_coef_dict(self.category_coef_variation_dict,
                                                        self.category_num_param_dict,
                                                        self.num_categories)

        # item coefficients.
        self.item_coef_dict = self._build_coef_dict(self.item_coef_variation_dict,
                                                    self.item_num_param_dict,
                                                    self.num_items)

        self.shared_lambda = shared_lambda
        if self.shared_lambda:
            self.lambda_weight = nn.Parameter(torch.ones(1), requires_grad=True)
        else:
            self.lambda_weight = nn.Parameter(torch.ones(self.num_categories) / 2, requires_grad=True)
        # breakpoint()
        # self.iv_weights = nn.Parameter(torch.ones(1), requires_grad=True)
        # used to warn users if forgot to call clamp.
        self._clamp_called_flag = True

    @property
    def num_params(self) -> int:
        """Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied
        with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no
        intercept is involved.

        Returns:
            int: the total number of learnable parameters.
        """
        return sum(w.numel() for w in self.parameters())

    def _build_coef_dict(self,
                         coef_variation_dict: Dict[str, str],
                         num_param_dict: Dict[str, int],
                         num_items: int) -> nn.ModuleDict:
        """Builds a coefficient dictionary containing all trainable components of the model, mapping coefficient names
            to the corresponding Coefficient Module.
            num_items could be the actual number of items or the number of categories, depending on the use case.
            NOTE: torch-choice users don't directly interact with this method.

        Args:
            coef_variation_dict (Dict[str, str]): a dictionary mapping coefficient names (e.g., theta_user) to the level
                of variation (e.g., 'user').
            num_param_dict (Dict[str, int]): a dictionary mapping coefficient names to the number of parameters in this
                coefficient. Be aware that, for example, if there is one K-dimensional coefficient for every user, then
                the `num_param` should be K instead of K x number of users.
            num_items (int): the total number of items in the prediction problem. `num_items` should be the number of
                categories if _build_coef_dict() is used for category-level prediction.

        Returns:
            nn.ModuleDict: a PyTorch ModuleDict object mapping from coefficient names to trainable Coefficient modules.
        """
        coef_dict = dict()
        for var_type, variation in coef_variation_dict.items():
            num_params = num_param_dict[var_type]
            coef_dict[var_type] = Coefficient(variation=variation,
                                              num_items=num_items,
                                              num_users=self.num_users,
                                              num_params=num_params)
        return nn.ModuleDict(coef_dict)

    # def _check_input_shapes(self, category_x_dict, item_x_dict, user_index, item_availability) -> None:
    #     T = list(category_x_dict.values())[0].shape[0]  # batch size.
    #     for var_type, x_category in category_x_dict.items():
    #         x_item = item_x_dict[var_type]
    #         assert len(x_category.shape) == len(x_item.shape) == 3
    #         assert x_category.shape[0] == x_item.shape[0]
    #         assert x_category.shape == (T, self.num_categories, self.category_num_param_dict[var_type])
    #         assert x_item.shape == (T, self.num_items, self.item_num_param_dict[var_type])

    #     if (user_index is not None) and (self.num_users is not None):
    #         assert user_index.shape == (T,)

    #     if item_availability is not None:
    #         assert item_availability.shape == (T, self.num_items)

    def forward(self, batch: ChoiceDataset) -> torch.Tensor:
        """A standard forward method for the model: the user feeds a ChoiceDataset batch and the model returns the
            tensor of predicted log-probabilities. The main forward pass happens in the _forward() method, but we provide
            this wrapper forward() method for a cleaner API, as forward() only requires a single batch argument.
            For more details about the forward pass, please refer to the _forward() method.

        # TODO: the ConditionalLogitModel returns predicted utility, does the NestedLogitModel behave the same?

        Args:
            batch (ChoiceDataset): a ChoiceDataset object containing the data batch.

        Returns:
            torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
            of choosing item i in trip t.
        """
        return self._forward(batch['category'].x_dict,
                             batch['item'].x_dict,
                             batch['item'].user_index,
                             batch['item'].item_availability)

    def _forward(self,
                 category_x_dict: Dict[str, torch.Tensor],
                 item_x_dict: Dict[str, torch.Tensor],
                 user_index: Optional[torch.LongTensor] = None,
                 item_availability: Optional[torch.BoolTensor] = None
                 ) -> torch.Tensor:
        """Computes log P[t, i] = the log probability for the user involved in trip t to choose item i.
        Let n denote the ID of the user involved in trip t, then P[t, i] = P_{ni} on page 86 of the
        book "Discrete Choice Methods with Simulation" by Train.

        Args:
            category_x_dict (Dict[str, torch.Tensor]): a dictionary mapping variable type names to tensors
                with shape (num_trips, num_categories, *) including features of all categories in each trip.
            item_x_dict (Dict[str, torch.Tensor]): a dictionary mapping variable type names to tensors
                with shape (num_trips, num_items, *) including features of all items in each trip.
            user_index (torch.LongTensor): a tensor of shape (num_trips,) indicating which user is
                making the decision in each trip. Setting user_index = None assumes the same user is
                making decisions in all trips.
            item_availability (torch.BoolTensor): a boolean tensor with shape (num_trips, num_items)
                indicating the availability of items in each trip. If item_availability[t, i] = False,
                the utility of choosing item i in trip t, V[t, i], is effectively set to -inf.
                Given the decomposition V[t, i] = W[t, k(i)] + Y[t, i] + eps, this is done by setting
                Y[t, i] to a very large negative value (torch.finfo(Y.dtype).min / 2) for unavailable items.

        Returns:
            torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
            of choosing item i in trip t.
        """
        if self.shared_lambda:
            self.lambdas = self.lambda_weight.expand(self.num_categories)
        else:
            self.lambdas = self.lambda_weight

        # if not self._clamp_called_flag:
        #     warnings.warn('Did you forget to call clamp_lambdas() after optimizer.step()?')

        # The overall utility of item can be decomposed into V[item] = W[category] + Y[item] + eps.
        T = list(item_x_dict.values())[0].shape[0]
        device = list(item_x_dict.values())[0].device
        # compute category-specific utility with shape (T, num_categories).
        W = torch.zeros(T, self.num_categories).to(device)

        if 'intercept' in self.category_coef_variation_dict.keys():
            category_x_dict['intercept'] = torch.ones((T, self.num_categories, 1)).to(device)

        for var_type, coef in self.category_coef_dict.items():
            W += coef(category_x_dict[var_type], user_index)

        # compute item-specific utility (T, num_items).
        Y = torch.zeros(T, self.num_items).to(device)
        for var_type, coef in self.item_coef_dict.items():
            Y += coef(item_x_dict[var_type], user_index)

        if item_availability is not None:
            Y[~item_availability] = torch.finfo(Y.dtype).min / 2

        # =============================================================================
        # compute the inclusive value of each category.
        inclusive_value = dict()
        for k, Bk in self.category_to_item.items():
            # for nest k, divide the Y of all items in Bk by lambda_k.
            Y[:, Bk] /= self.lambdas[k]
            # compute inclusive value for category k.
            # unavailable items were already masked out in Y above.
            inclusive_value[k] = torch.logsumexp(Y[:, Bk], dim=1, keepdim=False)  # (T,)
        # broadcast inclusive value from (T, num_categories) to (T, num_items).
        # for trip t, I[t, i] is the inclusive value of the category item i belongs to.
        I = torch.zeros(T, self.num_items).to(device)
        for k, Bk in self.category_to_item.items():
            I[:, Bk] = inclusive_value[k].view(-1, 1)  # (T, |Bk|)

        # logP_item[t, i] = log P(ni|Bk), where Bk is the category item i is in, n is the user in trip t.
        logP_item = Y - I  # (T, num_items)

        # =============================================================================
        # logP_category[t, i] = log P(Bk), for item i in trip t, the probability of choosing the nest/bucket
        # item i belongs to. logP_category has shape (T, num_items)
        # logit[t, i] = W[n, k] + lambda[k] I[n, k], where n is the user involved in trip t, k is
        # the category item i belongs to.
        logit = torch.zeros(T, self.num_items).to(device)
        for k, Bk in self.category_to_item.items():
            logit[:, Bk] = (W[:, k] + self.lambdas[k] * inclusive_value[k]).view(-1, 1)  # (T, |Bk|)
        # only count each category once in the logsumexp within the category level model.
        cols = [x[0] for x in self.category_to_item.values()]
        logP_category = logit - torch.logsumexp(logit[:, cols], dim=1, keepdim=True)

        # =============================================================================
        # compute the joint log P_{ni} as in the textbook.
        logP = logP_item + logP_category
        self._clamp_called_flag = False
        return logP

    def log_likelihood(self, *args):
        """Computes the log likelihood of the model; please refer to the negative_log_likelihood() method.

        Returns:
            torch.Tensor: the log likelihood of the model.
        """
        return - self.negative_log_likelihood(*args)

    def negative_log_likelihood(self,
                                batch: ChoiceDataset,
                                y: torch.LongTensor,
                                is_train: bool=True) -> torch.Tensor:
        """Computes the negative log likelihood of the model. Please note the log-likelihood is summed over all samples
            in the batch rather than averaged.

        Args:
            batch (ChoiceDataset): the ChoiceDataset object containing the data.
            y (torch.LongTensor): a tensor of shape (num_trips,) holding the index of the chosen item in each trip.
            is_train (bool, optional): which mode of the model to use for the forward pass. If we need the Hessian
                of the NLL through auto-grad, `is_train` should be set to True. If we merely need a performance metric,
                then `is_train` can be set to False for better performance.
                Defaults to True.

        Returns:
            torch.Tensor: a scalar tensor holding the negative log likelihood of the model.
        """
        # compute the negative log-likelihood loss directly.
        if is_train:
            self.train()
        else:
            self.eval()
        # (num_trips, num_items)
        logP = self.forward(batch)
        nll = - logP[torch.arange(len(y)), y].sum()
        return nll

    # def clamp_lambdas(self):
    #     """
    #     Restrict values of lambdas to 0 < lambda <= 1 to guarantee the utility maximization property
    #     of the model.
    #     This method should be called every time after optimizer.step().
    #     We add a self._clamp_called_flag to remind researchers if this method is not called.
    #     """
    #     for k in range(len(self.lambdas)):
    #         self.lambdas[k] = torch.clamp(self.lambdas[k], 1e-5, 1)
    #     self._clamp_called_flag = True

    # @staticmethod
    # def add_constant(x: torch.Tensor, where: str='prepend') -> torch.Tensor:
    #     """A helper function used to add constant to feature tensor,
    #     x has shape (batch_size, num_classes, num_parameters),
    #     returns a tensor of shape (*, num_parameters+1).
    #     """
    #     batch_size, num_classes, num_parameters = x.shape
    #     ones = torch.ones((batch_size, num_classes, 1))
    #     if where == 'prepend':
    #         new = torch.cat((ones, x), dim=-1)
    #     elif where == 'append':
    #         new = torch.cat((x, ones), dim=-1)
    #     else:
    #         raise Exception
    #     return new
num_params: int property readonly

Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no intercept is involved.

Returns:

| Type | Description |
| --- | --- |
| int | the total number of learnable parameters. |
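
The count described above can be checked by hand. The following is a minimal sketch, not part of torch-choice: `count_params` is a hypothetical helper that assumes a model with a single user-specific coefficient of dimension K. Because `num_params` sums over `self.parameters()`, the total also includes `lambda_weight`, which holds one value when `shared_lambda=True` and one value per category otherwise.

```python
def count_params(k: int, num_users: int, num_categories: int, shared_lambda: bool) -> int:
    """Hypothetical helper: count parameters for one user-specific coefficient of dimension k."""
    # One K-dimensional coefficient vector per user.
    coef_params = k * num_users
    # lambda_weight has a single entry if shared, otherwise one entry per category.
    lambda_params = 1 if shared_lambda else num_categories
    return coef_params + lambda_params


# 3 features, 10 users, 2 categories, one lambda per category.
total = count_params(k=3, num_users=10, num_categories=2, shared_lambda=False)
print(total)
```

Under these assumptions the coefficient contributes K x number of users parameters, and the lambdas add either 1 or `num_categories` on top.
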

__init__(self, category_to_item, category_coef_variation_dict, category_num_param_dict, item_coef_variation_dict, item_num_param_dict, num_users=None, shared_lambda=False) special

Initialization method of the nested logit model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| category_to_item | Dict[object, List[int]] | a dictionary mapping a category ID to the list of item IDs in that category. | required |
| category_coef_variation_dict | Dict[str, str] | a dictionary mapping a variable type (i.e., variable group) to the level of variation for the coefficients of that variable type. | required |
| category_num_param_dict | Dict[str, int] | a dictionary mapping a variable type name to the number of parameters in that variable group. | required |
| item_coef_variation_dict | Dict[str, str] | the same as category_coef_variation_dict but for item features. | required |
| item_num_param_dict | Dict[str, int] | the same as category_num_param_dict but for item features. | required |
| num_users | Optional[int] | number of users to be modelled; this is only required if any variable type requires user-specific variation. Defaults to None. | None |
| shared_lambda | bool | a boolean indicating whether to constrain the elasticity lambda, which is the coefficient for inclusive values, to be the same for all categories. The lambda enters the category-level selection as follows: Utility of choosing category k = lambda * inclusive value of category k + linear combination of some other category level features. If set to True, a single lambda is learned for all categories; otherwise, the model learns an individual lambda for each category. Defaults to False. | False |
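
To make the arguments concrete, here is a small sketch of the nesting structure and the quantities the constructor derives from it. The category and item IDs, the variable name `price_obs`, and the helper `initial_lambdas` are hypothetical; only the derivations of `num_categories` and `num_items` and the initial lambda values (1 when shared, 0.5 per category otherwise) mirror the source listed on this page.

```python
# Hypothetical nesting structure: category ID -> list of item IDs in that category.
category_to_item = {0: [0, 1, 2], 1: [3, 4]}

# Hypothetical variable group with user-level variation and 2 parameters per user.
item_coef_variation_dict = {"price_obs": "user"}
item_num_param_dict = {"price_obs": 2}

# Derived the same way as in __init__.
num_categories = len(category_to_item)
num_items = sum(len(items) for items in category_to_item.values())


def initial_lambdas(num_categories: int, shared_lambda: bool) -> list:
    """Hypothetical helper mirroring how lambda_weight is initialised."""
    # Shared: a single lambda initialised to 1; otherwise one lambda per category at 0.5.
    return [1.0] if shared_lambda else [0.5] * num_categories


print(num_categories, num_items, initial_lambdas(num_categories, shared_lambda=False))
```

Every key in a variation dictionary needs a matching key in the corresponding `num_param` dictionary, since `_build_coef_dict` looks up `num_param_dict[var_type]` for each entry.
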
Source code in torch_choice/model/nested_logit_model.py
def __init__(self,
             category_to_item: Dict[object, List[int]],
             category_coef_variation_dict: Dict[str, str],
             category_num_param_dict: Dict[str, int],
             item_coef_variation_dict: Dict[str, str],
             item_num_param_dict: Dict[str, int],
             num_users: Optional[int]=None,
             shared_lambda: bool=False
             ) -> None:
    """Initialization method of the nested logit model.

    Args:
        category_to_item (Dict[object, List[int]]): a dictionary mapping a category ID to the list
            of item IDs in that category.

        category_coef_variation_dict (Dict[str, str]): a dictionary mapping a variable type
            (i.e., variable group) to the level of variation for the coefficients of that
            variable type.
        category_num_param_dict (Dict[str, int]): a dictionary mapping a variable type name to
            the number of parameters in that variable group.

        item_coef_variation_dict (Dict[str, str]): the same as category_coef_variation_dict but
            for item features.
        item_num_param_dict (Dict[str, int]): the same as category_num_param_dict but for item
            features.

        num_users (Optional[int], optional): number of users to be modelled; this is only
            required if any variable type requires user-specific variation.
            Defaults to None.

        shared_lambda (bool): a boolean indicating whether to constrain the elasticity lambda, which
            is the coefficient for inclusive values, to be the same for all categories.
            The lambda enters the category-level selection as follows:
            Utility of choosing category k = lambda * inclusive value of category k
                                           + linear combination of some other category level features
            If set to True, a single lambda is learned for all categories; otherwise, the
            model learns an individual lambda for each category.
            Defaults to False.
    """
    super(NestedLogitModel, self).__init__()
    self.category_to_item = category_to_item
    self.category_coef_variation_dict = category_coef_variation_dict
    self.category_num_param_dict = category_num_param_dict
    self.item_coef_variation_dict = item_coef_variation_dict
    self.item_num_param_dict = item_num_param_dict
    self.num_users = num_users

    self.categories = list(category_to_item.keys())
    self.num_categories = len(self.categories)
    self.num_items = sum(len(items) for items in category_to_item.values())

    # category coefficients.
    self.category_coef_dict = self._build_coef_dict(self.category_coef_variation_dict,
                                                    self.category_num_param_dict,
                                                    self.num_categories)

    # item coefficients.
    self.item_coef_dict = self._build_coef_dict(self.item_coef_variation_dict,
                                                self.item_num_param_dict,
                                                self.num_items)

    self.shared_lambda = shared_lambda
    if self.shared_lambda:
        self.lambda_weight = nn.Parameter(torch.ones(1), requires_grad=True)
    else:
        self.lambda_weight = nn.Parameter(torch.ones(self.num_categories) / 2, requires_grad=True)
    # breakpoint()
    # self.iv_weights = nn.Parameter(torch.ones(1), requires_grad=True)
    # used to warn users if forgot to call clamp.
    self._clamp_called_flag = True
forward(self, batch)

A standard forward method for the model: the user feeds a ChoiceDataset batch and the model returns the tensor of predicted log-probabilities. The main forward pass happens in the _forward() method, but we provide this wrapper forward() method for a cleaner API, as forward() only requires a single batch argument. For more details about the forward pass, please refer to the _forward() method.

TODO: the ConditionalLogitModel returns predicted utility, does the NestedLogitModel behave the same?

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| batch | ChoiceDataset | a ChoiceDataset object containing the data batch. | required |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | a tensor of shape (num_trips, num_items) including the log probability of choosing item i in trip t. |
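
The returned log probabilities combine a within-category term and a category-level term. The sketch below reproduces that decomposition for a single trip using only the standard library, as an illustration rather than the library's implementation. The function names, the list-based inputs, and the integer category keys are assumptions made for this example; the arithmetic follows the steps of the `_forward()` listing on this page (divide item utilities by lambda, take a log-sum-exp per category, then add the category-level log probability).

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def nested_logit_log_prob(W, Y, lambdas, category_to_item):
    """Single-trip sketch. W[k]: category utility, Y[i]: item utility, lambdas[k]: nest coefficient."""
    scaled = list(Y)
    inclusive = {}
    for k, items in category_to_item.items():
        # Divide the utility of every item in category k by lambda_k.
        for i in items:
            scaled[i] = Y[i] / lambdas[k]
        # Inclusive value of category k.
        inclusive[k] = logsumexp([scaled[i] for i in items])
    # Category-level logit: W[k] + lambda_k * inclusive value, counted once per category.
    logit = {k: W[k] + lambdas[k] * inclusive[k] for k in category_to_item}
    denom = logsumexp(list(logit.values()))
    log_p = [0.0] * len(Y)
    for k, items in category_to_item.items():
        for i in items:
            # log P(item | category) + log P(category)
            log_p[i] = (scaled[i] - inclusive[k]) + (logit[k] - denom)
    return log_p


category_to_item = {0: [0, 1], 1: [2]}
log_p = nested_logit_log_prob(W=[0.0, 0.0], Y=[1.0, 0.5, 0.2],
                              lambdas=[0.5, 0.5], category_to_item=category_to_item)
total = sum(math.exp(v) for v in log_p)
print(log_p, total)
```

In this sketch the probabilities over all items sum to one, and with every lambda equal to 1 and all category utilities equal to 0 the result reduces to an ordinary softmax over the item utilities.
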

Source code in torch_choice/model/nested_logit_model.py
def forward(self, batch: ChoiceDataset) -> torch.Tensor:
    """A standard forward method for the model: the user feeds a ChoiceDataset batch and the model returns the
        tensor of predicted log-probabilities. The main forward pass happens in the _forward() method, but we provide
        this wrapper forward() method for a cleaner API, as forward() only requires a single batch argument.
        For more details about the forward pass, please refer to the _forward() method.

    # TODO: the ConditionalLogitModel returns predicted utility, does the NestedLogitModel behave the same?

    Args:
        batch (ChoiceDataset): a ChoiceDataset object containing the data batch.

    Returns:
        torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
        of choosing item i in trip t.
    """
    return self._forward(batch['category'].x_dict,
                         batch['item'].x_dict,
                         batch['item'].user_index,
                         batch['item'].item_availability)
log_likelihood(self, *args)

Computes the log likelihood of the model; please refer to the negative_log_likelihood() method.

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | the log likelihood of the model. |

Source code in torch_choice/model/nested_logit_model.py
def log_likelihood(self, *args):
    """Computes the log likelihood of the model; please refer to the negative_log_likelihood() method.

    Returns:
        torch.Tensor: the log likelihood of the model.
    """
    return - self.negative_log_likelihood(*args)
negative_log_likelihood(self, batch, y, is_train=True)

Computes the negative log likelihood of the model. Please note the log-likelihood is summed over all samples in the batch rather than averaged.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| batch | ChoiceDataset | the ChoiceDataset object containing the data. | required |
| y | torch.LongTensor | a tensor of shape (num_trips,) holding the index of the chosen item in each trip. | required |
| is_train | bool | which mode of the model to use for the forward pass. If we need the Hessian of the NLL through auto-grad, is_train should be set to True. If we merely need a performance metric, then is_train can be set to False for better performance. Defaults to True. | True |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | a scalar tensor holding the negative log likelihood of the model. |
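
The summation can be illustrated without tensors. The sketch below is a plain-Python analogue of `- logP[torch.arange(len(y)), y].sum()` from the source listing: it picks the log probability of the chosen item in each trip and sums over trips. The nested-list inputs and the example probabilities are made up for illustration.

```python
import math


def negative_log_likelihood(log_p, y):
    """log_p[t][i]: log probability of item i in trip t; y[t]: index of the chosen item in trip t."""
    # Sum (not average) the log probability of the chosen item over all trips, then negate.
    return -sum(log_p[t][y[t]] for t in range(len(y)))


# Two trips with two items each; item 0 is chosen in trip 0 and item 1 in trip 1.
log_p = [[math.log(0.7), math.log(0.3)],
         [math.log(0.2), math.log(0.8)]]
y = [0, 1]
nll = negative_log_likelihood(log_p, y)
print(nll)
```

Because the loss is a sum, its magnitude grows with the batch size; dividing by `len(y)` would give a per-trip average if batches of different sizes need to be compared.
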

Source code in torch_choice/model/nested_logit_model.py
def negative_log_likelihood(self,
                            batch: ChoiceDataset,
                            y: torch.LongTensor,
                            is_train: bool=True) -> torch.Tensor:
    """Computes the negative log likelihood of the model. Please note the log-likelihood is summed over all samples
        in the batch rather than averaged.

    Args:
        batch (ChoiceDataset): the ChoiceDataset object containing the data.
        y (torch.LongTensor): a tensor of shape (num_trips,) holding the index of the chosen item in each trip.
        is_train (bool, optional): which mode of the model to use for the forward pass. If we need the Hessian
            of the NLL through auto-grad, `is_train` should be set to True. If we merely need a performance metric,
            then `is_train` can be set to False for better performance.
            Defaults to True.

    Returns:
        torch.Tensor: a scalar tensor holding the negative log likelihood of the model.
    """
    # compute the negative log-likelihood loss directly.
    if is_train:
        self.train()
    else:
        self.eval()
    # (num_trips, num_items)
    logP = self.forward(batch)
    nll = - logP[torch.arange(len(y)), y].sum()
    return nll