API Reference: Torch Choice
data
special
choice_dataset
The dataset object for managing large-scale consumer choice datasets.
Please refer to the documentation and tutorials for more details on using ChoiceDataset.
Author: Tianyu Du Update: Apr. 27, 2022
ChoiceDataset (Dataset)
Source code in torch_choice/data/choice_dataset.py
class ChoiceDataset(torch.utils.data.Dataset):
    def __init__(self,
                 item_index: torch.LongTensor,
                 label: Optional[torch.LongTensor] = None,
                 user_index: Optional[torch.LongTensor] = None,
                 session_index: Optional[torch.LongTensor] = None,
                 item_availability: Optional[torch.BoolTensor] = None,
                 **kwargs) -> None:
        """
        Initialization method for the dataset object. Researchers should supply all information about the dataset
        using this initialization method.

        The number of choice instances is called `batch_size` in the documentation. The `batch_size` corresponds to the
        file length in a wide-format dataset and is often denoted by `N`. We call it `batch_size` to follow the convention
        in the machine learning literature.
        A `choice instance` is a row of the dataset, so there are `batch_size` choice instances in each `ChoiceDataset`.

        The dataset consists of:
        (1) a collection of `batch_size` tuples (item_id, user_id, session_id, label), where each tuple is a choice instance.
        (2) a collection of `observables` associated with item, user, session, etc.

        Args:
            item_index (torch.LongTensor): a tensor of shape (batch_size) indicating the relevant item in each row
                of the dataset, the relevant item can be:
                (1) the item bought in this choice instance,
                (2) or the item reviewed by the user. In the latter case, we need the `label` tensor to specify the rating score.
                NOTE: support for the second case is under development; currently, only binary labels are supported.

            label (Optional[torch.LongTensor], optional): a tensor of shape (batch_size) indicating the label for prediction in
                each choice instance. When you want to predict the item bought, you can leave the `label` argument
                as `None` in the initialization method, and the model will use `item_index` as the object to be predicted.
                But if you are, for example, predicting the rating a user gave an item, label must be provided.
                Defaults to None.

            user_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
                the ID of the user who was involved in each choice instance. If no user index is provided, it's assumed
                that the choice instances are from the same user.
                `user_index` is required if and only if there are multiple users in the dataset, for example:
                    (1) user observables are involved in the utility form,
                    (2) and/or the coefficient is user-specific.
                This tensor is used to select the corresponding user observables and coefficients assigned to the
                user (like theta_user) for making prediction for that purchase.
                Defaults to None.

            session_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
                the ID of the session when that choice instance occurred. This tensor is used to select the correct
                session observables or price observables for making prediction for that choice instance. Therefore, if
                there are no session/price observables, you can leave this argument as `None`. In this case, the `ChoiceDataset`
                object will assume each choice instance to be in its own session.
                Defaults to None.

            item_availability (Optional[torch.BoolTensor], optional): A boolean tensor of shape (num_sessions, num_items)
                indicating the availability of each item in each session. Utilities of unavailable items would be set to -infinity,
                and hence the choice probabilities of these unavailable items will be 0 while making prediction.
                We assume all items are available if set to None.
                Defaults to None.

        Other Kwargs (Observables):
            One can specify the following types of observables, where * in shape denotes any positive
                integer. Typically * represents the number of observables.
            Please refer to the documentation for a detailed guide to using observables.
            1. user observables must start with 'user_' and have shape (num_users, *)
            2. item observables must start with 'item_' and have shape (num_items, *)
            3. session observables must start with 'session_' and have shape (num_sessions, *)
            4. taste observables (those that vary by user and item) must start with `taste_` and have shape
                (num_users, num_items, *).
            NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
            5. price observables (those that vary by session and item) must start with `price_` and have
                shape (num_sessions, num_items, *)
        """
        # ENHANCEMENT(Tianyu): add item_names for summary.
        super(ChoiceDataset, self).__init__()
        self.label = label
        self.item_index = item_index
        self.user_index = user_index
        self.session_index = session_index

        if self.session_index is None:
            # if any([x.startswith('session_') or x.startswith('price_') for x in kwargs.keys()]):
            # if any session sensitive observable is provided, but session index is not,
            # infer each row in the dataset to be a session.
            # TODO: (design choice) should we assign unique session index to each choice instance or the same session index.
            print('No `session_index` is provided, assume each choice instance is in its own session.')
            self.session_index = torch.arange(len(self.item_index)).long()

        self.item_availability = item_availability

        for key, item in kwargs.items():
            setattr(self, key, item)

        # TODO: add a validation procedure to check the consistency of the dataset.
    def __getitem__(self, indices: Union[int, torch.LongTensor]) -> "ChoiceDataset":
        """Retrieves samples corresponding to the provided index or list of indices.

        Args:
            indices (Union[int, torch.LongTensor]): a single integer index or a tensor of indices.

        Returns:
            ChoiceDataset: a subset of the dataset.
        """
        if isinstance(indices, int):
            # convert single integer index to an array of indices.
            indices = torch.LongTensor([indices])
        new_dict = dict()
        new_dict['item_index'] = self.item_index[indices].clone()

        # copy optional attributes.
        new_dict['label'] = self.label[indices].clone() if self.label is not None else None
        new_dict['user_index'] = self.user_index[indices].clone() if self.user_index is not None else None
        new_dict['session_index'] = self.session_index[indices].clone() if self.session_index is not None else None
        # item_availability has shape (num_sessions, num_items), no need to re-index it.
        new_dict['item_availability'] = self.item_availability

        # copy other attributes.
        for key, val in self.__dict__.items():
            if key not in new_dict.keys():
                if torch.is_tensor(val):
                    new_dict[key] = val.clone()
                else:
                    new_dict[key] = copy.deepcopy(val)
        return self._from_dict(new_dict)
    def __len__(self) -> int:
        """Returns number of samples in this dataset.

        Returns:
            int: length of the dataset.
        """
        return len(self.item_index)

    def __contains__(self, key: str) -> bool:
        return key in self.keys

    def __eq__(self, other: "ChoiceDataset") -> bool:
        """Returns whether all tensor attributes of both ChoiceDatasets are equal."""
        if not isinstance(other, ChoiceDataset):
            raise TypeError('You can only compare with ChoiceDataset objects.')
        else:
            flag = True
            for key, val in self.__dict__.items():
                if torch.is_tensor(val):
                    # ignore NaNs while comparing.
                    if not torch.equal(torch.nan_to_num(val), torch.nan_to_num(other.__dict__[key])):
                        print('Attribute {} is not equal.'.format(key))
                        flag = False
            return flag
    @property
    def device(self) -> str:
        """Returns the device of the dataset.

        Returns:
            str: the device of the dataset.
        """
        for attr in self.__dict__.values():
            if torch.is_tensor(attr):
                return attr.device

    @property
    def num_users(self) -> int:
        """Returns number of users involved in this dataset, returns 1 if there is no user identity.

        Returns:
            int: the number of users involved in this dataset.
        """
        # query from user_index
        if self.user_index is not None:
            return len(torch.unique(self.user_index))
        else:
            return 1

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_user_attribute(key) or self._is_taste_attribute(key):
        #             return val.shape[0]
        # return 1

    @property
    def num_items(self) -> int:
        """Returns the number of items involved in this dataset.

        Returns:
            int: the number of items involved in this dataset.
        """
        return len(torch.unique(self.item_index))

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_item_attribute(key):
        #             return val.shape[0]
        #         elif self._is_taste_attribute(key) or self._is_price_attribute(key):
        #             return val.shape[1]
        # return 1

    @property
    def num_sessions(self) -> int:
        """Returns the number of sessions involved in this dataset.

        Returns:
            int: the number of sessions involved in this dataset.
        """
        return len(torch.unique(self.session_index))

        # if self.session_index is None:
        #     return 1

        # for key, val in self.__dict__.items():
        #     if torch.is_tensor(val):
        #         if self._is_session_attribute(key) or self._is_price_attribute(key):
        #             return val.shape[0]
        # return 1
    @property
    def x_dict(self) -> Dict[object, torch.Tensor]:
        """Formats attributes in this dataset into shape (num_sessions, num_items, num_params) and returns them as a dictionary.
        Models in this package expect this dictionary-based data format.

        Returns:
            Dict[object, torch.Tensor]: a dictionary with attribute names in the dataset as keys, and reshaped attribute
                tensors as values.
        """
        out = dict()
        for key, val in self.__dict__.items():
            if self._is_attribute(key):  # only include attributes.
                out[key] = self._expand_tensor(key, val)  # reshape to (num_sessions, num_items, num_params).
        return out

    @classmethod
    def _from_dict(cls, dictionary: Dict[str, torch.tensor]) -> "ChoiceDataset":
        """Creates an instance of ChoiceDataset from a dictionary of arguments.

        Args:
            dictionary (Dict[str, torch.tensor]): a dictionary with keys as argument names and values as arguments.

        Returns:
            ChoiceDataset: the created copy of dataset.
        """
        dataset = cls(**dictionary)
        for key, item in dictionary.items():
            setattr(dataset, key, item)
        return dataset
    def apply_tensor(self, func: callable) -> "ChoiceDataset":
        """This is a helper method to apply the provided function to all tensors and tensor values of all dictionaries.

        Args:
            func (callable): a callable function to be applied on tensors and tensor-values of dictionaries.

        Returns:
            ChoiceDataset: the modified dataset.
        """
        for key, item in self.__dict__.items():
            if torch.is_tensor(item):
                setattr(self, key, func(item))
            # broadcast func to dictionary of tensors as well.
            elif isinstance(getattr(self, key), dict):
                for obj_key, obj_item in getattr(self, key).items():
                    if torch.is_tensor(obj_item):
                        setattr(getattr(self, key), obj_key, func(obj_item))
        return self

    def to(self, device: Union[str, torch.device]) -> "ChoiceDataset":
        """Moves all tensors in this dataset to the specified PyTorch device.

        Args:
            device (Union[str, torch.device]): the destination device.

        Returns:
            ChoiceDataset: the modified dataset on the new device.
        """
        return self.apply_tensor(lambda x: x.to(device))

    def clone(self) -> "ChoiceDataset":
        """Creates a copy of self.

        Returns:
            ChoiceDataset: a copy of self.
        """
        dictionary = {}
        for k, v in self.__dict__.items():
            if torch.is_tensor(v):
                dictionary[k] = v.clone()
            else:
                dictionary[k] = copy.deepcopy(v)
        return self.__class__._from_dict(dictionary)
    def _check_device_consistency(self) -> None:
        """Checks if all tensors in this dataset are on the same device.

        Raises:
            Exception: an exception is raised if not all tensors are on the same device.
        """
        # assert all tensors are on the same device.
        devices = list()
        for val in self.__dict__.values():
            if torch.is_tensor(val):
                devices.append(val.device)
        if len(set(devices)) > 1:
            raise Exception(f'Found tensors on different devices: {set(devices)}.',
                            'Use dataset.to() method to align devices.')

    def _size_repr(self, value: object) -> List[int]:
        """A helper method to get the string-representation of object sizes, this is helpful while constructing the
        string representation of the dataset.

        Args:
            value (object): an object to examine its size.

        Returns:
            List[int]: list of integers representing the size of the object, length of the list is equal to dimension of `value`.
        """
        if torch.is_tensor(value):
            return list(value.size())
        elif isinstance(value, int) or isinstance(value, float):
            return [1]
        elif isinstance(value, list) or isinstance(value, tuple):
            return [len(value)]
        else:
            return []

    def __repr__(self) -> str:
        """A method to get a string representation of the dataset.

        Returns:
            str: the string representation of the dataset.
        """
        info = [
            f'{key}={self._size_repr(item)}' for key, item in self.__dict__.items()]
        return f"{self.__class__.__name__}({', '.join(info)}, device={self.device})"
    # ==================================================================================================================
    # methods for checking attribute categories.
    # ==================================================================================================================
    @staticmethod
    def _is_item_attribute(key: str) -> bool:
        return key.startswith('item_') and (key != 'item_availability') and (key != 'item_index')

    @staticmethod
    def _is_user_attribute(key: str) -> bool:
        return key.startswith('user_') and (key != 'user_index')

    @staticmethod
    def _is_session_attribute(key: str) -> bool:
        return key.startswith('session_') and (key != 'session_index')

    @staticmethod
    def _is_taste_attribute(key: str) -> bool:
        return key.startswith('taste_')

    @staticmethod
    def _is_price_attribute(key: str) -> bool:
        return key.startswith('price_')

    def _is_attribute(self, key: str) -> bool:
        return self._is_item_attribute(key) \
            or self._is_user_attribute(key) \
            or self._is_session_attribute(key) \
            or self._is_taste_attribute(key) \
            or self._is_price_attribute(key)

    def _expand_tensor(self, key: str, val: torch.Tensor) -> torch.Tensor:
        """Expands attribute tensor to (num_sessions, num_items, num_params) shape for prediction tasks, this method
        won't reshape the tensor at all if the `key` (i.e., name of the tensor) suggests it's not an attribute of any kind.

        Args:
            key (str): name of the attribute used to determine the raw shape of the tensor. For example, 'item_obs' means
                the raw tensor is in shape (num_items, num_params).
            val (torch.Tensor): the attribute tensor to be reshaped.

        Returns:
            torch.Tensor: the reshaped tensor with shape (num_sessions, num_items, num_params).
        """
        if not self._is_attribute(key):
            print(f'Warning: the input key {key} is not an attribute of the dataset, will NOT modify the provided tensor.')
            # don't expand non-attribute tensors, if any.
            return val

        num_params = val.shape[-1]
        if self._is_user_attribute(key):
            # user_attribute (num_users, *)
            out = val[self.user_index, :].view(
                len(self), 1, num_params).expand(-1, self.num_items, -1)
        elif self._is_item_attribute(key):
            # item_attribute (num_items, *)
            out = val.view(1, self.num_items, num_params).expand(
                len(self), -1, -1)
        elif self._is_session_attribute(key):
            # session_attribute (num_sessions, *)
            out = val[self.session_index, :].view(
                len(self), 1, num_params).expand(-1, self.num_items, -1)
        elif self._is_taste_attribute(key):
            # taste_attribute (num_users, num_items, *)
            out = val[self.user_index, :, :]
        elif self._is_price_attribute(key):
            # price_attribute (num_sessions, num_items, *)
            out = val[self.session_index, :, :]

        assert out.shape == (len(self), self.num_items, num_params)
        return out
device: str
property
readonly
Returns the device of the dataset.
Returns:
Type | Description |
---|---|
str | the device of the dataset. |
num_items: int
property
readonly
Returns the number of items involved in this dataset.
Returns:
Type | Description |
---|---|
int | the number of items involved in this dataset. |
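According to the source listing, `num_items` is computed as the number of distinct values in `item_index`. The following is a minimal torch-free sketch of that counting rule, using a plain Python list as a stand-in for the tensor. It suggests that an item which never appears in `item_index` would not be counted.

```python
# Stand-in for the item_index tensor: item 1 is never chosen in these rows.
item_index = [0, 2, 2, 0]

# Mirrors len(torch.unique(item_index)) with a plain Python set.
num_items = len(set(item_index))
print(num_items)
```

If the catalog contains items that never appear in the data, item-level observables with one row per catalog item may therefore disagree with this count.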
num_sessions: int
property
readonly
Returns the number of sessions involved in this dataset.
Returns:
Type | Description |
---|---|
int | the number of sessions involved in this dataset. |
num_users: int
property
readonly
Returns number of users involved in this dataset, returns 1 if there is no user identity.
Returns:
Type | Description |
---|---|
int | the number of users involved in this dataset. |
x_dict: Dict[object, torch.Tensor]
property
readonly
Formats attributes in this dataset into shape (num_sessions, num_items, num_params) and returns them as a dictionary. Models in this package expect this dictionary-based data format.
Returns:
Type | Description |
---|---|
Dict[object, torch.Tensor] | a dictionary with attribute names in the dataset as keys, and reshaped attribute tensors as values. |
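The reshaping can be sketched without PyTorch. The following is a minimal illustration that uses nested Python lists in place of tensors; the helper names `expand_item_obs` and `expand_user_obs` are hypothetical and not part of the package. Note that, per the source listing of `_expand_tensor`, the leading dimension of each expanded tensor is the number of choice instances, `len(dataset)`.

```python
from typing import List

Matrix = List[List[float]]  # shape (rows, num_params)


def expand_item_obs(item_obs: Matrix, batch_size: int) -> List[Matrix]:
    """Repeat a (num_items, num_params) table once per choice instance."""
    return [[row[:] for row in item_obs] for _ in range(batch_size)]


def expand_user_obs(user_obs: Matrix, user_index: List[int], num_items: int) -> List[Matrix]:
    """Look up each instance's user row and repeat it across all items."""
    return [[user_obs[u][:] for _ in range(num_items)] for u in user_index]


# Two items with 3 observables each; two users with 1 observable each.
item_obs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
user_obs = [[10.0], [20.0]]
user_index = [0, 1, 1]  # three choice instances

item_x = expand_item_obs(item_obs, batch_size=len(user_index))
user_x = expand_user_obs(user_obs, user_index, num_items=len(item_obs))

# Both results have shape (batch_size, num_items, num_params).
print(len(item_x), len(item_x[0]), len(item_x[0][0]))
print(len(user_x), len(user_x[0]), len(user_x[0][0]))
```

Session, taste, and price observables follow the same pattern, indexed by `session_index` or `user_index` as shown in the source listing.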
__eq__(self, other)
special
Returns whether all tensor attributes of both ChoiceDatasets are equal.
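The comparison in the source listing passes both tensors through `torch.nan_to_num` before calling `torch.equal`. The sketch below imitates that idea on plain Python lists, assuming the default NaN replacement value of 0.0; the helper names are hypothetical, and the handling of infinities by `torch.nan_to_num` is not reproduced here.

```python
import math
from typing import List


def nan_to_zero(values: List[float]) -> List[float]:
    """Replace NaN entries with 0.0, the default NaN fill of torch.nan_to_num."""
    return [0.0 if math.isnan(v) else v for v in values]


def equal_ignoring_nan(a: List[float], b: List[float]) -> bool:
    """Compare two sequences after replacing NaNs with 0.0."""
    return nan_to_zero(a) == nan_to_zero(b)


nan = float('nan')
same_nans = equal_ignoring_nan([1.0, nan], [1.0, nan])
nan_vs_zero = equal_ignoring_nan([1.0, nan], [1.0, 0.0])
different = equal_ignoring_nan([1.0, nan], [2.0, nan])
print(same_nans, nan_vs_zero, different)
```

One consequence of this approach is that a NaN entry and a 0.0 entry at the same position compare as equal.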
Source code in torch_choice/data/choice_dataset.py
def __eq__(self, other: "ChoiceDataset") -> bool:
    """Returns whether all tensor attributes of both ChoiceDatasets are equal."""
    if not isinstance(other, ChoiceDataset):
        raise TypeError('You can only compare with ChoiceDataset objects.')
    else:
        flag = True
        for key, val in self.__dict__.items():
            if torch.is_tensor(val):
                # ignore NaNs while comparing.
                if not torch.equal(torch.nan_to_num(val), torch.nan_to_num(other.__dict__[key])):
                    print('Attribute {} is not equal.'.format(key))
                    flag = False
        return flag
__getitem__(self, indices)
special
Retrieves samples corresponding to the provided index or list of indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
indices | Union[int, torch.LongTensor] | a single integer index or a tensor of indices. | required |
Returns:
Type | Description |
---|---|
ChoiceDataset | a subset of the dataset. |
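The indexing behavior can be illustrated with a torch-free sketch. It assumes a dictionary of plain lists as a stand-in for the dataset, and the function name `subset` is hypothetical. As in the source listing, only the per-instance tensors (`item_index`, `label`, `user_index`, `session_index`) are sliced by `indices`, while every other attribute is copied in full.

```python
import copy
from typing import Dict, List, Optional

PER_INSTANCE_KEYS = {'item_index', 'label', 'user_index', 'session_index'}


def subset(dataset: Dict[str, Optional[list]], indices: List[int]) -> Dict[str, Optional[list]]:
    """Slice per-instance lists by `indices`; deep-copy everything else unchanged."""
    out = {}
    for key, val in dataset.items():
        if key in PER_INSTANCE_KEYS and val is not None:
            out[key] = [val[i] for i in indices]
        else:
            out[key] = copy.deepcopy(val)
    return out


data = {
    'item_index': [0, 1, 1, 0],
    'user_index': [0, 0, 1, 1],
    'session_index': [0, 1, 2, 3],
    'label': None,
    'item_obs': [[1.0], [2.0]],  # shape (num_items, 1); not indexed by row
}

sub = subset(data, [1, 3])
print(sub['item_index'], sub['session_index'], sub['item_obs'])
```

Because observables are kept whole, the retained `session_index` and `user_index` values still point at the correct rows of the observable tables.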
Source code in torch_choice/data/choice_dataset.py
def __getitem__(self, indices: Union[int, torch.LongTensor]) -> "ChoiceDataset":
    """Retrieves samples corresponding to the provided index or list of indices.

    Args:
        indices (Union[int, torch.LongTensor]): a single integer index or a tensor of indices.

    Returns:
        ChoiceDataset: a subset of the dataset.
    """
    if isinstance(indices, int):
        # convert single integer index to an array of indices.
        indices = torch.LongTensor([indices])
    new_dict = dict()
    new_dict['item_index'] = self.item_index[indices].clone()

    # copy optional attributes.
    new_dict['label'] = self.label[indices].clone() if self.label is not None else None
    new_dict['user_index'] = self.user_index[indices].clone() if self.user_index is not None else None
    new_dict['session_index'] = self.session_index[indices].clone() if self.session_index is not None else None
    # item_availability has shape (num_sessions, num_items), no need to re-index it.
    new_dict['item_availability'] = self.item_availability

    # copy other attributes.
    for key, val in self.__dict__.items():
        if key not in new_dict.keys():
            if torch.is_tensor(val):
                new_dict[key] = val.clone()
            else:
                new_dict[key] = copy.deepcopy(val)
    return self._from_dict(new_dict)
__init__(self, item_index, label=None, user_index=None, session_index=None, item_availability=None, **kwargs)
special
Initialization method for the dataset object. Researchers should supply all information about the dataset using this initialization method.
The number of choice instances is called `batch_size` in the documentation. The `batch_size` corresponds to the file length in a wide-format dataset and is often denoted by `N`. We call it `batch_size` to follow the convention in the machine learning literature.
A `choice instance` is a row of the dataset, so there are `batch_size` choice instances in each `ChoiceDataset`.
The dataset consists of:
(1) a collection of `batch_size` tuples (item_id, user_id, session_id, label), where each tuple is a choice instance.
(2) a collection of `observables` associated with item, user, session, etc.
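Parameters:
Name | Type | Description | Default |
---|---|---|---|
item_index | torch.LongTensor | a tensor of shape (batch_size) indicating the relevant item in each row of the dataset. The relevant item can be: (1) the item bought in this choice instance, or (2) the item reviewed by the user. In the latter case, the `label` tensor is needed to specify the rating score. NOTE: support for the second case is under development; currently, only binary labels are supported. | required |
label | Optional[torch.LongTensor] | a tensor of shape (batch_size) indicating the label for prediction in each choice instance. When you want to predict the item bought, you can leave the `label` argument as `None` in the initialization method, and the model will use `item_index` as the object to be predicted. If you are, for example, predicting the rating a user gave an item, `label` must be provided. Defaults to None. | None |
user_index | Optional[torch.LongTensor] | a tensor of shape num_purchases (batch_size) indicating the ID of the user who was involved in each choice instance. If no user index is provided, it is assumed that the choice instances are from the same user. `user_index` is required if and only if there are multiple users in the dataset, for example: (1) user observables are involved in the utility form, and/or (2) the coefficient is user-specific. This tensor is used to select the corresponding user observables and coefficients assigned to the user (like theta_user) for making prediction for that purchase. Defaults to None. | None |
session_index | Optional[torch.LongTensor] | a tensor of shape num_purchases (batch_size) indicating the ID of the session when that choice instance occurred. This tensor is used to select the correct session observables or price observables for making prediction for that choice instance. Therefore, if there are no session/price observables, you can leave this argument as `None`. In this case, the `ChoiceDataset` object will assume each choice instance to be in its own session. Defaults to None. | None |
item_availability | Optional[torch.BoolTensor] | A boolean tensor of shape (num_sessions, num_items) indicating the availability of each item in each session. Utilities of unavailable items would be set to -infinity, and hence the choice probabilities of these unavailable items will be 0 while making prediction. We assume all items are available if set to None. Defaults to None. | None |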
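The effect of `item_availability` on predictions can be illustrated with a small worked example. This is a sketch only: it assumes choice probabilities come from a softmax over utilities, which this page does not state, and the function name `choice_probabilities` is hypothetical. It also assumes at least one item is available.

```python
import math
from typing import List


def choice_probabilities(utilities: List[float], available: List[bool]) -> List[float]:
    """Softmax over utilities, with unavailable items masked to -infinity."""
    masked = [u if a else -math.inf for u, a in zip(utilities, available)]
    peak = max(masked)  # subtract the max for numerical stability
    weights = [math.exp(u - peak) for u in masked]  # exp(-inf) evaluates to 0.0
    total = sum(weights)
    return [w / total for w in weights]


# Three items in one session; the second item is unavailable.
probs = choice_probabilities([1.0, 2.0, 0.5], [True, False, True])
print(probs)
```

The unavailable item receives probability exactly 0, and the remaining probabilities are renormalized over the available items.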
Other Kwargs (Observables):
One can specify the following types of observables, where * in a shape denotes any positive integer. Typically * represents the number of observables.
Please refer to the documentation for a detailed guide to using observables.
1. user observables must start with `user_` and have shape (num_users, *)
2. item observables must start with `item_` and have shape (num_items, *)
3. session observables must start with `session_` and have shape (num_sessions, *)
4. taste observables (those that vary by user and item) must start with `taste_` and have shape (num_users, num_items, *).
NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
5. price observables (those that vary by session and item) must start with `price_` and have shape (num_sessions, num_items, *)
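The naming rules above can be summarized as a prefix lookup. The sketch below is a hypothetical helper, `observable_kind`, modeled on the `_is_*_attribute` checks in the source listing. It treats `item_index`, `user_index`, `session_index`, and `item_availability` as reserved names rather than observables.

```python
RESERVED = {'item_index', 'user_index', 'session_index', 'item_availability'}
PREFIXES = ('user_', 'item_', 'session_', 'taste_', 'price_')


def observable_kind(key: str) -> str:
    """Return the observable category implied by a keyword name."""
    if key in RESERVED:
        return 'not an observable'
    for prefix in PREFIXES:
        if key.startswith(prefix):
            return prefix.rstrip('_')
    return 'not an observable'


kinds = {key: observable_kind(key)
         for key in ('user_income', 'item_obs', 'price_obs', 'item_availability', 'label')}
print(kinds)
```

A keyword argument that matches none of the prefixes is still stored on the dataset, but per the source listing it is not included in `x_dict`.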
Source code in torch_choice/data/choice_dataset.py
def __init__(self,
             item_index: torch.LongTensor,
             label: Optional[torch.LongTensor] = None,
             user_index: Optional[torch.LongTensor] = None,
             session_index: Optional[torch.LongTensor] = None,
             item_availability: Optional[torch.BoolTensor] = None,
             **kwargs) -> None:
    """
    Initialization method for the dataset object. Researchers should supply all information about the dataset
    using this initialization method.

    The number of choice instances is called `batch_size` in the documentation. The `batch_size` corresponds to the
    file length in a wide-format dataset and is often denoted by `N`. We call it `batch_size` to follow the convention
    in the machine learning literature.
    A `choice instance` is a row of the dataset, so there are `batch_size` choice instances in each `ChoiceDataset`.

    The dataset consists of:
    (1) a collection of `batch_size` tuples (item_id, user_id, session_id, label), where each tuple is a choice instance.
    (2) a collection of `observables` associated with item, user, session, etc.

    Args:
        item_index (torch.LongTensor): a tensor of shape (batch_size) indicating the relevant item in each row
            of the dataset, the relevant item can be:
            (1) the item bought in this choice instance,
            (2) or the item reviewed by the user. In the latter case, we need the `label` tensor to specify the rating score.
            NOTE: support for the second case is under development; currently, only binary labels are supported.

        label (Optional[torch.LongTensor], optional): a tensor of shape (batch_size) indicating the label for prediction in
            each choice instance. When you want to predict the item bought, you can leave the `label` argument
            as `None` in the initialization method, and the model will use `item_index` as the object to be predicted.
            But if you are, for example, predicting the rating a user gave an item, label must be provided.
            Defaults to None.

        user_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
            the ID of the user who was involved in each choice instance. If no user index is provided, it's assumed
            that the choice instances are from the same user.
            `user_index` is required if and only if there are multiple users in the dataset, for example:
                (1) user observables are involved in the utility form,
                (2) and/or the coefficient is user-specific.
            This tensor is used to select the corresponding user observables and coefficients assigned to the
            user (like theta_user) for making prediction for that purchase.
            Defaults to None.

        session_index (Optional[torch.LongTensor], optional): a tensor of shape num_purchases (batch_size) indicating
            the ID of the session when that choice instance occurred. This tensor is used to select the correct
            session observables or price observables for making prediction for that choice instance. Therefore, if
            there are no session/price observables, you can leave this argument as `None`. In this case, the `ChoiceDataset`
            object will assume each choice instance to be in its own session.
            Defaults to None.

        item_availability (Optional[torch.BoolTensor], optional): A boolean tensor of shape (num_sessions, num_items)
            indicating the availability of each item in each session. Utilities of unavailable items would be set to -infinity,
            and hence the choice probabilities of these unavailable items will be 0 while making prediction.
            We assume all items are available if set to None.
            Defaults to None.

    Other Kwargs (Observables):
        One can specify the following types of observables, where * in shape denotes any positive
            integer. Typically * represents the number of observables.
        Please refer to the documentation for a detailed guide to using observables.
        1. user observables must start with 'user_' and have shape (num_users, *)
        2. item observables must start with 'item_' and have shape (num_items, *)
        3. session observables must start with 'session_' and have shape (num_sessions, *)
        4. taste observables (those that vary by user and item) must start with `taste_` and have shape
            (num_users, num_items, *).
        NOTE: we don't recommend using taste observables, because num_users * num_items is potentially large.
        5. price observables (those that vary by session and item) must start with `price_` and have
            shape (num_sessions, num_items, *)
    """
    # ENHANCEMENT(Tianyu): add item_names for summary.
    super(ChoiceDataset, self).__init__()
    self.label = label
    self.item_index = item_index
    self.user_index = user_index
    self.session_index = session_index

    if self.session_index is None:
        # if any([x.startswith('session_') or x.startswith('price_') for x in kwargs.keys()]):
        # if any session sensitive observable is provided, but session index is not,
        # infer each row in the dataset to be a session.
        # TODO: (design choice) should we assign unique session index to each choice instance or the same session index.
        print('No `session_index` is provided, assume each choice instance is in its own session.')
        self.session_index = torch.arange(len(self.item_index)).long()

    self.item_availability = item_availability

    for key, item in kwargs.items():
        setattr(self, key, item)

    # TODO: add a validation procedure to check the consistency of the dataset.
__len__(self)
special
__repr__(self)
special
A method to get a string representation of the dataset.
Returns:
Type | Description |
---|---|
str | the string representation of the dataset. |
Source code in torch_choice/data/choice_dataset.py
def __repr__(self) -> str:
    """A method to get a string representation of the dataset.

    Returns:
        str: the string representation of the dataset.
    """
    info = [
        f'{key}={self._size_repr(item)}' for key, item in self.__dict__.items()]
    return f"{self.__class__.__name__}({', '.join(info)}, device={self.device})"
apply_tensor(self, func)
This is a helper method to apply the provided function to all tensors and tensor values of all dictionaries.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | callable | a callable function to be applied on tensors and tensor-values of dictionaries. | required |
Returns:
Type | Description |
---|---|
ChoiceDataset | the modified dataset. |
Source code in torch_choice/data/choice_dataset.py
def apply_tensor(self, func: callable) -> "ChoiceDataset":
    """This is a helper method to apply the provided function to all tensors and tensor values of all dictionaries.

    Args:
        func (callable): a callable function to be applied on tensors and tensor-values of dictionaries.

    Returns:
        ChoiceDataset: the modified dataset.
    """
    for key, item in self.__dict__.items():
        if torch.is_tensor(item):
            setattr(self, key, func(item))
        # broadcast func to dictionary of tensors as well.
        elif isinstance(getattr(self, key), dict):
            for obj_key, obj_item in getattr(self, key).items():
                if torch.is_tensor(obj_item):
                    setattr(getattr(self, key), obj_key, func(obj_item))
    return self
clone(self)
Creates a copy of self.
Returns:
Type | Description |
---|---|
ChoiceDataset | a copy of self. |
Source code in torch_choice/data/choice_dataset.py
to(self, device)
Moves all tensors in this dataset to the specified PyTorch device.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
device | Union[str, torch.device] | the destination device. | required |
Returns:
Type | Description |
---|---|
ChoiceDataset | the modified dataset on the new device. |
Source code in torch_choice/data/choice_dataset.py
def to(self, device: Union[str, torch.device]) -> "ChoiceDataset":
    """Moves all tensors in this dataset to the specified PyTorch device.

    Args:
        device (Union[str, torch.device]): the destination device.

    Returns:
        ChoiceDataset: the modified dataset on the new device.
    """
    return self.apply_tensor(lambda x: x.to(device))
joint_dataset
The JointDataset class is a wrapper around several ChoiceDataset objects. It is particularly useful when we need to make predictions from multiple datasets. For example, suppose you have data on consumer purchase records in a fast food store, and every customer purchases exactly one main food and one drink. In this case, you have two separate datasets: FoodDataset and DrinkDataset. You may want to use a PyTorch sampler to sample them in a dependent manner: you want to take the i-th sample from both datasets, so that you know which (food, drink) combo the i-th customer purchased. You can do this by using the JointDataset class.
Author: Tianyu Du Update: Apr. 28, 2022
JointDataset (Dataset)
A helper class for joining several PyTorch datasets. Using JointDataset with a PyTorch data loader allows for sampling the same batch index from several datasets.
The JointDataset class is a wrapper around several ChoiceDataset objects. It is particularly useful when we need to make predictions from multiple datasets. For example, suppose you have data on consumer purchase records in a fast food store, and every customer purchases exactly one main food and one drink. In this case, you have two separate datasets: FoodDataset and DrinkDataset. You may want to use a PyTorch sampler to sample them in a dependent manner: you want to take the i-th sample from both datasets, so that you know which (food, drink) combo the i-th customer purchased. You can do this by using the JointDataset class.
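The joint indexing idea can be sketched without PyTorch. The class below, `JointSequences`, is a hypothetical stand-in that joins equal-length Python sequences instead of ChoiceDataset objects. It only illustrates how one set of indices selects matching rows from every named dataset.

```python
from typing import Dict, List, Sequence


class JointSequences:
    """Torch-free stand-in: index several equal-length sequences together."""

    def __init__(self, **datasets: Sequence) -> None:
        lengths = {len(d) for d in datasets.values()}
        if len(lengths) != 1:
            raise ValueError('All datasets must have the same length.')
        self.datasets = datasets

    def __len__(self) -> int:
        # Every contained dataset has the same length, so use the first one.
        return len(next(iter(self.datasets.values())))

    def __getitem__(self, indices: List[int]) -> Dict[str, list]:
        # Apply the same indices to every dataset and key the result by name.
        return {name: [d[i] for i in indices] for name, d in self.datasets.items()}


joint = JointSequences(food=['burger', 'wrap', 'salad'], drink=['cola', 'tea', 'water'])
batch = joint[[0, 2]]
print(len(joint), batch)
```

Each entry of `batch` holds the rows for the same customers, so the i-th food and the i-th drink stay paired within a batch.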
Source code in torch_choice/data/joint_dataset.py
class JointDataset(torch.utils.data.Dataset):
    """A helper class for joining several pytorch datasets, using JointDataset
    and pytorch data loader allows for sampling the same batch index from several
    datasets.

    The JointDataset class is a wrapper around several ChoiceDataset objects, it is particularly useful when we
    need to make prediction from multiple datasets. For example, you have data on consumer purchase records in a fast food
    store, and suppose every customer will purchase exactly a single main food and a single drink. In this case, you have
    two separate datasets: FoodDataset and DrinkDataset. You may want to use PyTorch sampler to sample them in a dependent
    manner: you want to take the i-th sample from both datasets, so that you know what (food, drink) combo the i-th customer
    purchased. You can do this by using the JointDataset class.
    """
    def __init__(self, **datasets) -> None:
        """The initialization method.

        Args:
            Arbitrarily many datasets with arbitrary names as keys. In the example above, you can construct
            ```
            dataset = JointDataset(food=FoodDataset, drink=DrinkDataset)
            ```
            All datasets should have the same length.
        """
        super(JointDataset, self).__init__()
        self.datasets = datasets
        # check the length of sub-datasets are the same.
        assert len(set([len(d) for d in self.datasets.values()])) == 1

    def __len__(self) -> int:
        """Get the number of samples in the joint dataset.

        Returns:
            int: the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.
        """
        for d in self.datasets.values():
            return len(d)

    def __getitem__(self, indices: Union[int, torch.LongTensor]) -> Dict[str, ChoiceDataset]:
        """Queries samples from the dataset by index.

        Args:
            indices (Union[int, torch.LongTensor]): an integer or a 1D tensor of multiple indices.

        Returns:
            Dict[str, ChoiceDataset]: the subset of the dataset. Keys of the dictionary will be names of each dataset
                contained (the same as the keys of the ``datasets`` argument in the constructor). Values will be subsets
                of contained datasets, sliced using the provided indices.
        """
        return dict((name, d[indices]) for (name, d) in self.datasets.items())

    def __repr__(self) -> str:
        """A method to get a string representation of the dataset.

        Returns:
            str: the string representation of the dataset.
        """
        out = [f'JointDataset with {len(self.datasets)} sub-datasets: (']
        for name, dataset in self.datasets.items():
            out.append(f'\t{name}: {str(dataset)}')
        out.append(')')
        return '\n'.join(out)

    @property
    def device(self) -> str:
        """Returns the device of datasets contained in the joint dataset.

        Returns:
            str: the device of the dataset.
        """
        for d in self.datasets.values():
            return d.device

    def to(self, device: Union[str, torch.device]) -> "JointDataset":
        """Moves all datasets in this dataset to the specified PyTorch device.

        Args:
            device (Union[str, torch.device]): the destination device.

        Returns:
            JointDataset: the modified dataset on the new device.
        """
        for d in self.datasets.values():
            d = d.to(device)
        return self
device: str
property
readonly
Returns the device of datasets contained in the joint dataset.
Returns:
Type | Description |
---|---|
str | the device of the dataset. |
__getitem__(self, indices)
special
Queries samples from the dataset by index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
indices | Union[int, torch.LongTensor] | an integer or a 1D tensor of multiple indices. | required |
Returns:
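Type | Description |
---|---|
Dict[str, ChoiceDataset] | the subset of the dataset. Keys of the dictionary will be names of each dataset contained (the same as the keys of the datasets argument in the constructor). Values will be subsets of contained datasets, sliced using the provided indices. |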
Source code in torch_choice/data/joint_dataset.py
def __getitem__(self, indices: Union[int, torch.LongTensor]) -> Dict[str, ChoiceDataset]:
    """Queries samples from the dataset by index.
    Args:
        indices (Union[int, torch.LongTensor]): an integer or a 1D tensor of multiple indices.
    Returns:
        Dict[str, ChoiceDataset]: the subset of the dataset. Keys of the dictionary will be names of each dataset
            contained (the same as the keys of the ``datasets`` argument in the constructor). Values will be subsets
            of contained datasets, sliced using the provided indices.
    """
    return dict((name, d[indices]) for (name, d) in self.datasets.items())
__init__(self, **datasets)
special
Initializes the joint dataset.
Source code in torch_choice/data/joint_dataset.py
def __init__(self, **datasets) -> None:
    """Initializes the joint dataset.
    Args:
        Arbitrarily many datasets with arbitrary names as keys. In the example above, you can construct
        ```
        dataset = JointDataset(food=FoodDataset, drink=DrinkDataset)
        ```
        All datasets should have the same length.
    """
    super(JointDataset, self).__init__()
    self.datasets = datasets
    # check that all sub-datasets have the same length.
    assert len(set([len(d) for d in self.datasets.values()])) == 1
__len__(self)
special
Get the number of samples in the joint dataset.
Returns:
Type | Description |
---|---|
int | the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained. |
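Source code in torch_choice/data/joint_dataset.py
def __len__(self) -> int:
    """Get the number of samples in the joint dataset.
    Returns:
        int: the number of samples in the joint dataset, which is the same as the number of samples in each dataset contained.
    """
    for d in self.datasets.values():
        return len(d)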
__repr__(self)
special
A method to get a string representation of the dataset.
Returns:
Type | Description |
---|---|
str | the string representation of the dataset. |
Source code in torch_choice/data/joint_dataset.py
def __repr__(self) -> str:
    """A method to get a string representation of the dataset.
    Returns:
        str: the string representation of the dataset.
    """
    out = [f'JointDataset with {len(self.datasets)} sub-datasets: (']
    for name, dataset in self.datasets.items():
        out.append(f'\t{name}: {str(dataset)}')
    out.append(')')
    return '\n'.join(out)
to(self, device)
Moves all datasets in this dataset to the specified PyTorch device.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
device | Union[str, torch.device] | the destination device. | required |
Returns:
Type | Description |
---|---|
JointDataset | the joint dataset, with all sub-datasets on the new device. |
Source code in torch_choice/data/joint_dataset.py
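def to(self, device: Union[str, torch.device]) -> "JointDataset":
    """Moves all datasets in this dataset to the specified PyTorch device.
    Args:
        device (Union[str, torch.device]): the destination device.
    Returns:
        JointDataset: the joint dataset, with all sub-datasets on the new device.
    """
    for name, d in self.datasets.items():
        # write the result back in case `to` returns a new object rather than moving tensors in place;
        # rebinding the loop variable alone would not update the dictionary.
        self.datasets[name] = d.to(device)
    return self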
utils
pivot3d(df, dim0, dim1, values)
Creates a tensor of shape (df[dim0].nunique(), df[dim1].nunique(), len(values)) from the provided data frame.
For example, if dim0 is the column of session IDs and dim1 is the column of alternative names, then out[t, i, k] is the feature values[k] of item i in session t. The returned tensor has shape (num_sessions, num_items, num_params), which fits the purpose of conditional logit models.
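The layout described above can be illustrated with a dependency-free sketch that uses nested lists in place of a tensor and a list of dictionaries in place of a data frame. The function name `pivot3d_lists`, the column names, and the numbers are all hypothetical; the sketch assumes every (session, item) pair appears exactly once and orders sessions and items by sorted value, which is a choice made here for illustration.

```python
from typing import Any, Dict, List


def pivot3d_lists(records: List[Dict[str, Any]], dim0: str, dim1: str,
                  values: List[str]) -> List[List[List[float]]]:
    """Hypothetical helper: returns out[t][i][k] = values[k] of item i in session t."""
    rows = sorted({r[dim0] for r in records})  # sessions, in sorted order
    cols = sorted({r[dim1] for r in records})  # items, in sorted order
    # Index each long-format record by its (session, item) pair.
    lookup = {(r[dim0], r[dim1]): r for r in records}
    return [[[float(lookup[(t, i)][v]) for v in values] for i in cols] for t in rows]


# Long-format toy data: one row per (session, item) pair.
records = [
    {"session": 0, "item": "bus", "price": 2.0, "time": 30.0},
    {"session": 0, "item": "car", "price": 5.0, "time": 15.0},
    {"session": 1, "item": "bus", "price": 2.5, "time": 35.0},
    {"session": 1, "item": "car", "price": 6.0, "time": 20.0},
]
out = pivot3d_lists(records, "session", "item", ["price", "time"])
shape = (len(out), len(out[0]), len(out[0][0]))  # (num_sessions, num_items, num_params)
print(shape, out[1][0])
```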
Source code in torch_choice/data/utils.py
def pivot3d(df: pd.DataFrame, dim0: str, dim1: str, values: Union[str, List[str]]) -> torch.Tensor:
    """
    Creates a tensor of shape (df[dim0].nunique(), df[dim1].nunique(), len(values)) from the
    provided data frame.
    For example, if dim0 is the column of session IDs and dim1 is the column of alternative names, then
    out[t, i, k] is the feature values[k] of item i in session t. The returned tensor
    has shape (num_sessions, num_items, num_params), which fits the purpose of conditional
    logit models.
    """
    if not isinstance(values, list):
        values = [values]
    # fix the order of items along the second dimension.
    dim1_list = sorted(df[dim1].unique())
    tensor_slice = list()
    for value in values:
        # one (num_sessions, num_items) layer per feature.
        layer = df.pivot(index=dim0, columns=dim1, values=value)
        tensor_slice.append(torch.Tensor(layer[dim1_list].values))
    # stack the layers along a new last dimension.
    tensor = torch.stack(tensor_slice, dim=-1)
    assert tensor.shape == (df[dim0].nunique(), df[dim1].nunique(), len(values))
    return tensor
model
special
coefficient
The general class of learnable coefficients used in various models; this class serves as the building block for models in this package. The weights (i.e., learnable parameters) in the Coefficient class are implemented in PyTorch and can be trained directly with PyTorch optimizers.
NOTE: users of the torch-choice package do not interact with classes in this file directly; please use conditional_logit_model.py and nested_logit_model.py instead.
Author: Tianyu Du Update: Apr. 28, 2022
Coefficient (Module)
Source code in torch_choice/model/coefficient.py
class Coefficient(nn.Module):
    def __init__(self,
                 variation: str,
                 num_params: int,
                 num_items: Optional[int]=None,
                 num_users: Optional[int]=None
                 ) -> None:
        """A generic coefficient object storing trainable parameters. This class corresponds to those variables typically
        written in Greek letters in the model's utility representation.
        Args:
            variation (str): the degree of variation of this coefficient. For example, the coefficient can vary by users or items.
                Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'.
                For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel.
            num_params (int): number of parameters in this coefficient. Note that this number is the number of parameters
                per class, not the total number of parameters. For example, suppose we have U users and you want to initialize
                a user-specific coefficient called `theta_user`. The coefficient enters the utility form while being multiplied
                with some K-dimensional observables. Then, for each user, there are K parameters to be multiplied with the K-dimensional
                observable. However, the total number of parameters is K * U (K for each of U users). In this case, `num_params` should
                be set to `K`, NOT `K*U`.
            num_items (Optional[int], optional): the number of items in the prediction problem; this is required to reshape
                the parameter correctly. Defaults to None.
            num_users (Optional[int], optional): number of users; this is only necessary if the coefficient varies by users.
                Defaults to None.
        """
        super(Coefficient, self).__init__()
        self.variation = variation
        self.num_items = num_items
        self.num_users = num_users
        self.num_params = num_params
        # construct the trainable parameters.
        if self.variation == 'constant':
            # constant for all users and items.
            self.coef = nn.Parameter(torch.randn(num_params), requires_grad=True)
        elif self.variation == 'item':
            # coef depends on item j but not on user i.
            # force coefficients for the first item class to be zero.
            self.coef = nn.Parameter(torch.zeros(num_items - 1, num_params), requires_grad=True)
        elif self.variation == 'item-full':
            # coef depends on item j but not on user i.
            # model coefficient for every item.
            self.coef = nn.Parameter(torch.zeros(num_items, num_params), requires_grad=True)
        elif self.variation == 'user':
            # coef depends on the user.
            # we always model coefficient for all users.
            self.coef = nn.Parameter(torch.zeros(num_users, num_params), requires_grad=True)
        elif self.variation == 'user-item':
            # coefficients of the first item are forced to be zero, model coefficients for N - 1 items only.
            self.coef = nn.Parameter(torch.zeros(num_users, num_items - 1, num_params), requires_grad=True)
        elif self.variation == 'user-item-full':
            # construct coefficients for every item.
            self.coef = nn.Parameter(torch.zeros(num_users, num_items, num_params), requires_grad=True)
        else:
            raise ValueError(f'Unsupported type of variation: {self.variation}.')
    def __repr__(self) -> str:
        """Returns a string representation of the coefficient.
        Returns:
            str: the string representation of the coefficient.
        """
        return f'Coefficient(variation={self.variation}, num_items={self.num_items},' \
            + f' num_users={self.num_users}, num_params={self.num_params},' \
            + f' {self.coef.numel()} trainable parameters in total).'
    def forward(self,
                x: torch.Tensor,
                user_index: Optional[torch.Tensor]=None,
                manual_coef_value: Optional[torch.Tensor]=None
                ) -> torch.Tensor:
        """
        The forward function of the coefficient, which computes the utility from purchasing each item in each session.
        The output shape will be (num_sessions, num_items).
        Args:
            x (torch.Tensor): a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient
                class will NOT reshape input tensors itself; this reshaping needs to be done in the model class.
            user_index (Optional[torch.Tensor], optional): a tensor of shape (num_sessions,)
                containing the ID of the user involved in each session. If set to None, assume the same
                user is making all decisions.
                Defaults to None.
            manual_coef_value (Optional[torch.Tensor], optional): a tensor with the same number of
                entries as self.coef. If provided, the forward function uses the provided values
                as coefficients and returns the predicted utility. This feature is useful when
                the researcher wishes to manually specify values for coefficients and examine predictions
                with the specified coefficient values. If not provided, the forward function is executed
                using values from self.coef.
                Defaults to None.
        Returns:
            torch.Tensor: a tensor of shape (num_sessions, num_items) whose (t, i) entry represents
                the utility of purchasing item i in session t.
        """
        if manual_coef_value is not None:
            assert manual_coef_value.numel() == self.coef.numel()
            # plug in the provided coefficient values, coef is a tensor.
            coef = manual_coef_value.reshape(*self.coef.shape)
        else:
            # use the learned coefficient values, coef is a nn.Parameter.
            coef = self.coef
        num_trips, num_items, num_feats = x.shape
        assert self.num_params == num_feats
        # cast coefficient tensor to (num_trips, num_items, self.num_params).
        if self.variation == 'constant':
            coef = coef.view(1, 1, self.num_params).expand(num_trips, num_items, -1)
        elif self.variation == 'item':
            # coef has shape (num_items-1, num_params)
            # force coefficient for the first item to be zero.
            zeros = torch.zeros(1, self.num_params).to(coef.device)
            coef = torch.cat((zeros, coef), dim=0)  # (num_items, num_params)
            coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
        elif self.variation == 'item-full':
            # coef has shape (num_items, num_params)
            coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
        elif self.variation == 'user':
            # coef has shape (num_users, num_params)
            coef = coef[user_index, :]  # (num_trips, num_params) user-specific coefficients.
            coef = coef.view(num_trips, 1, self.num_params).expand(-1, num_items, -1)
        elif self.variation == 'user-item':
            # user_index is a (num_trips,) long tensor of user IDs.
            # originally, coef has shape (num_users, num_items-1, num_params)
            # transform to (num_trips, num_items - 1, num_params), user-specific.
            coef = coef[user_index, :, :]
            # coefs for the first item for all users are enforced to 0.
            zeros = torch.zeros(num_trips, 1, self.num_params).to(coef.device)
            coef = torch.cat((zeros, coef), dim=1)  # (num_trips, num_items, num_params)
        elif self.variation == 'user-item-full':
            # originally, coef has shape (num_users, num_items, num_params)
            coef = coef[user_index, :, :]  # (num_trips, num_items, num_params)
        else:
            raise ValueError(f'Unsupported type of variation: {self.variation}.')
        assert coef.shape == (num_trips, num_items, num_feats) == x.shape
        # compute the utility of each item in each trip, take summation along the feature dimension, the same as taking
        # the inner product.
        return (x * coef).sum(dim=-1)
__init__(self, variation, num_params, num_items=None, num_users=None)
special
A generic coefficient object storing trainable parameters. This class corresponds to those variables typically written in Greek letters in the model's utility representation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
variation | str | the degree of variation of this coefficient. For example, the coefficient can vary by users or items. Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'. For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel. | required |
num_params | int | number of parameters in this coefficient. Note that this number is the number of parameters per class, not the total number of parameters. For example, suppose we have U users and you want to initialize a user-specific coefficient called theta_user. The coefficient enters the utility form while being multiplied with some K-dimensional observables. Then, for each user, there are K parameters to be multiplied with the K-dimensional observable. However, the total number of parameters is K * U (K for each of U users). In this case, num_params should be set to K, NOT K*U. | required |
num_items | Optional[int] | the number of items in the prediction problem; this is required to reshape the parameter correctly. | None |
num_users | Optional[int] | number of users; this is only necessary if the coefficient varies by users. Defaults to None. | None |
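To make the distinction between `num_params` and the total parameter count concrete, the sketch below tabulates the parameter shape for each supported variation, following the shapes used in the constructor source. The helper `coef_shape` is hypothetical and written only for illustration; it is not part of torch-choice, and the values of K, the item count, and the user count are made up.

```python
from math import prod
from typing import Optional, Tuple


def coef_shape(variation: str, num_params: int,
               num_items: Optional[int] = None,
               num_users: Optional[int] = None) -> Tuple[int, ...]:
    """Hypothetical helper: shape of the trainable tensor for each variation."""
    if variation == 'constant':
        return (num_params,)
    if variation == 'item':
        # the first item is the reference level, so only num_items - 1 rows are trainable.
        return (num_items - 1, num_params)
    if variation == 'item-full':
        return (num_items, num_params)
    if variation == 'user':
        return (num_users, num_params)
    if variation == 'user-item':
        return (num_users, num_items - 1, num_params)
    if variation == 'user-item-full':
        return (num_users, num_items, num_params)
    raise ValueError(f'Unsupported type of variation: {variation}.')


K, n_items, n_users = 3, 4, 10
variations = ['constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full']
# total number of trainable parameters per variation.
counts = {v: prod(coef_shape(v, K, n_items, n_users)) for v in variations}
print(counts)
```

In the `user` case the total is K * U = 30 while `num_params` stays at K = 3, matching the description of `num_params` above.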
Source code in torch_choice/model/coefficient.py
def __init__(self,
             variation: str,
             num_params: int,
             num_items: Optional[int]=None,
             num_users: Optional[int]=None
             ) -> None:
    """A generic coefficient object storing trainable parameters. This class corresponds to those variables typically
    written in Greek letters in the model's utility representation.
    Args:
        variation (str): the degree of variation of this coefficient. For example, the coefficient can vary by users or items.
            Currently, we support variations 'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'.
            For detailed explanation of these variations, please refer to the documentation of ConditionalLogitModel.
        num_params (int): number of parameters in this coefficient. Note that this number is the number of parameters
            per class, not the total number of parameters. For example, suppose we have U users and you want to initialize
            a user-specific coefficient called `theta_user`. The coefficient enters the utility form while being multiplied
            with some K-dimensional observables. Then, for each user, there are K parameters to be multiplied with the K-dimensional
            observable. However, the total number of parameters is K * U (K for each of U users). In this case, `num_params` should
            be set to `K`, NOT `K*U`.
        num_items (Optional[int], optional): the number of items in the prediction problem; this is required to reshape
            the parameter correctly. Defaults to None.
        num_users (Optional[int], optional): number of users; this is only necessary if the coefficient varies by users.
            Defaults to None.
    """
    super(Coefficient, self).__init__()
    self.variation = variation
    self.num_items = num_items
    self.num_users = num_users
    self.num_params = num_params
    # construct the trainable parameters.
    if self.variation == 'constant':
        # constant for all users and items.
        self.coef = nn.Parameter(torch.randn(num_params), requires_grad=True)
    elif self.variation == 'item':
        # coef depends on item j but not on user i.
        # force coefficients for the first item class to be zero.
        self.coef = nn.Parameter(torch.zeros(num_items - 1, num_params), requires_grad=True)
    elif self.variation == 'item-full':
        # coef depends on item j but not on user i.
        # model coefficient for every item.
        self.coef = nn.Parameter(torch.zeros(num_items, num_params), requires_grad=True)
    elif self.variation == 'user':
        # coef depends on the user.
        # we always model coefficient for all users.
        self.coef = nn.Parameter(torch.zeros(num_users, num_params), requires_grad=True)
    elif self.variation == 'user-item':
        # coefficients of the first item are forced to be zero, model coefficients for N - 1 items only.
        self.coef = nn.Parameter(torch.zeros(num_users, num_items - 1, num_params), requires_grad=True)
    elif self.variation == 'user-item-full':
        # construct coefficients for every item.
        self.coef = nn.Parameter(torch.zeros(num_users, num_items, num_params), requires_grad=True)
    else:
        raise ValueError(f'Unsupported type of variation: {self.variation}.')
__repr__(self)
special
Returns a string representation of the coefficient.
Returns:
Type | Description |
---|---|
str | the string representation of the coefficient. |
Source code in torch_choice/model/coefficient.py
def __repr__(self) -> str:
    """Returns a string representation of the coefficient.
    Returns:
        str: the string representation of the coefficient.
    """
    return f'Coefficient(variation={self.variation}, num_items={self.num_items},' \
        + f' num_users={self.num_users}, num_params={self.num_params},' \
        + f' {self.coef.numel()} trainable parameters in total).'
forward(self, x, user_index=None, manual_coef_value=None)
The forward function of the coefficient, which computes the utility from purchasing each item in each session. The output shape will be (num_sessions, num_items).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | torch.Tensor | a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient class will NOT reshape input tensors itself; this reshaping needs to be done in the model class. | required |
user_index | Optional[torch.Tensor] | a tensor of shape (num_sessions,) containing the ID of the user involved in each session. If set to None, assume the same user is making all decisions. Defaults to None. | None |
manual_coef_value | Optional[torch.Tensor] | a tensor with the same number of entries as self.coef. If provided, the forward function uses the provided values as coefficients and returns the predicted utility. This feature is useful when the researcher wishes to manually specify values for coefficients and examine predictions with the specified coefficient values. If not provided, the forward function is executed using values from self.coef. Defaults to None. | None |
Returns:
Type | Description |
---|---|
torch.Tensor | a tensor of shape (num_sessions, num_items) whose (t, i) entry represents the utility of purchasing item i in session t. |
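As a worked illustration of the (t, i) utility entries for the 'item' variation, the sketch below uses nested lists in place of tensors: a zero row is prepended for the first item, then each utility is the inner product of an item's features with that item's coefficients. The function `item_utility` and all numbers are hypothetical; this mirrors the logic described here rather than reproducing the library code.

```python
from typing import List


def item_utility(x: List[List[List[float]]], coef: List[List[float]]) -> List[List[float]]:
    """Hypothetical helper.

    x has shape [num_sessions][num_items][num_params];
    coef has shape [num_items - 1][num_params] (the first item is the reference level).
    """
    num_params = len(x[0][0])
    # prepend a zero row so that the first item's coefficients are fixed at zero.
    full_coef = [[0.0] * num_params] + [list(c) for c in coef]
    # utility[t][i] = sum_k x[t][i][k] * full_coef[i][k]
    return [[sum(f * b for f, b in zip(x_ti, full_coef[i]))
             for i, x_ti in enumerate(x_t)]
            for x_t in x]


x = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # session 0: three items, two features each
    [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]],  # session 1
]
coef = [[0.5, -1.0], [1.0, 0.25]]          # coefficients for items 1 and 2 only
utility = item_utility(x, coef)
print(utility)
```

The first column of the result is always zero, which reflects the normalization of the first item under the 'item' variation.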
Source code in torch_choice/model/coefficient.py
def forward(self,
            x: torch.Tensor,
            user_index: Optional[torch.Tensor]=None,
            manual_coef_value: Optional[torch.Tensor]=None
            ) -> torch.Tensor:
    """
    The forward function of the coefficient, which computes the utility from purchasing each item in each session.
    The output shape will be (num_sessions, num_items).
    Args:
        x (torch.Tensor): a tensor of shape (num_sessions, num_items, num_params). Please note that the Coefficient
            class will NOT reshape input tensors itself; this reshaping needs to be done in the model class.
        user_index (Optional[torch.Tensor], optional): a tensor of shape (num_sessions,)
            containing the ID of the user involved in each session. If set to None, assume the same
            user is making all decisions.
            Defaults to None.
        manual_coef_value (Optional[torch.Tensor], optional): a tensor with the same number of
            entries as self.coef. If provided, the forward function uses the provided values
            as coefficients and returns the predicted utility. This feature is useful when
            the researcher wishes to manually specify values for coefficients and examine predictions
            with the specified coefficient values. If not provided, the forward function is executed
            using values from self.coef.
            Defaults to None.
    Returns:
        torch.Tensor: a tensor of shape (num_sessions, num_items) whose (t, i) entry represents
            the utility of purchasing item i in session t.
    """
    if manual_coef_value is not None:
        assert manual_coef_value.numel() == self.coef.numel()
        # plug in the provided coefficient values, coef is a tensor.
        coef = manual_coef_value.reshape(*self.coef.shape)
    else:
        # use the learned coefficient values, coef is a nn.Parameter.
        coef = self.coef
    num_trips, num_items, num_feats = x.shape
    assert self.num_params == num_feats
    # cast coefficient tensor to (num_trips, num_items, self.num_params).
    if self.variation == 'constant':
        coef = coef.view(1, 1, self.num_params).expand(num_trips, num_items, -1)
    elif self.variation == 'item':
        # coef has shape (num_items-1, num_params)
        # force coefficient for the first item to be zero.
        zeros = torch.zeros(1, self.num_params).to(coef.device)
        coef = torch.cat((zeros, coef), dim=0)  # (num_items, num_params)
        coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
    elif self.variation == 'item-full':
        # coef has shape (num_items, num_params)
        coef = coef.view(1, self.num_items, self.num_params).expand(num_trips, -1, -1)
    elif self.variation == 'user':
        # coef has shape (num_users, num_params)
        coef = coef[user_index, :]  # (num_trips, num_params) user-specific coefficients.
        coef = coef.view(num_trips, 1, self.num_params).expand(-1, num_items, -1)
    elif self.variation == 'user-item':
        # user_index is a (num_trips,) long tensor of user IDs.
        # originally, coef has shape (num_users, num_items-1, num_params)
        # transform to (num_trips, num_items - 1, num_params), user-specific.
        coef = coef[user_index, :, :]
        # coefs for the first item for all users are enforced to 0.
        zeros = torch.zeros(num_trips, 1, self.num_params).to(coef.device)
        coef = torch.cat((zeros, coef), dim=1)  # (num_trips, num_items, num_params)
    elif self.variation == 'user-item-full':
        # originally, coef has shape (num_users, num_items, num_params)
        coef = coef[user_index, :, :]  # (num_trips, num_items, num_params)
    else:
        raise ValueError(f'Unsupported type of variation: {self.variation}.')
    assert coef.shape == (num_trips, num_items, num_feats) == x.shape
    # compute the utility of each item in each trip, take summation along the feature dimension, the same as taking
    # the inner product.
    return (x * coef).sum(dim=-1)
conditional_logit_model
Conditional Logit Model.
Author: Tianyu Du Date: Aug. 8, 2021 Update: Apr. 28, 2022
ConditionalLogitModel (Module)
A generalized version of the conditional logit model. The model allows for research-specific variable types (groups) and different levels of variation for coefficients.
The model allows for the following levels of variation:
- constant: constant over all users and items.
- user: user-specific parameters but constant across all items.
- item: item-specific parameters but constant across all users; parameters for the first item are forced to be zero.
- item-full: item-specific parameters but constant across all users; explicitly modeled for all items.
- user-item: parameters that are specific to both user and item; parameters for the first item for all users are forced to be zero.
- user-item-full: parameters that are specific to both user and item; explicitly modeled for all items.
NOTE: unless the -full flag is specified (which means we want to explicitly model coefficients for all items), for all variation levels related to item (item-specific and user-item-specific), the model forces coefficients for the first item to be zero. This design follows standard econometric practice.
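A configuration for these variation levels is expressed as two dictionaries keyed by variable type. The dependency-free sketch below shows one way to validate such a configuration and to apply the default of one parameter per variable type when no parameter counts are supplied, as the constructor docstring describes. The helper `resolve_num_params` and the variable names `price_cost` and `user_income` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional

SUPPORTED_VARIATIONS = {'constant', 'item', 'item-full', 'user', 'user-item', 'user-item-full'}


def resolve_num_params(coef_variation_dict: Dict[str, str],
                       num_param_dict: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Hypothetical helper: validate a configuration and fill in default parameter counts."""
    unknown = {v for v in coef_variation_dict.values() if v not in SUPPORTED_VARIATIONS}
    if unknown:
        raise ValueError(f'Unsupported variation level(s): {sorted(unknown)}.')
    if num_param_dict is None:
        # default: one parameter for every variable type.
        num_param_dict = {key: 1 for key in coef_variation_dict}
    if coef_variation_dict.keys() != num_param_dict.keys():
        raise ValueError('Both dictionaries must have exactly the same keys.')
    if any(n <= 0 for n in num_param_dict.values()):
        raise ValueError('Every number of parameters must be positive.')
    return dict(num_param_dict)


coef_variation_dict = {'intercept': 'item', 'price_cost': 'constant', 'user_income': 'item'}
defaults = resolve_num_params(coef_variation_dict)
explicit = resolve_num_params(coef_variation_dict,
                              {'intercept': 1, 'price_cost': 2, 'user_income': 3})
print(defaults, explicit)
```

Dictionaries of this form correspond to the `coef_variation_dict` and `num_param_dict` arguments of the ConditionalLogitModel constructor shown in the source listing.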
Source code in torch_choice/model/conditional_logit_model.py
class ConditionalLogitModel(nn.Module):
    """A generalized version of the conditional logit model. The model allows for research-specific
    variable types (groups) and different levels of variation for coefficients.
    The model allows for the following levels of variation:
    NOTE: unless the `-full` flag is specified (which means we want to explicitly model coefficients
        for all items), for all variation levels related to item (item-specific and user-item-specific),
        the model forces coefficients for the first item to be zero. This design follows standard
        econometric practice.
    - constant: constant over all users and items,
    - user: user-specific parameters but constant across all items,
    - item: item-specific parameters but constant across all users, parameters for the first item are
        forced to be zero.
    - item-full: item-specific parameters but constant across all users, explicitly modeled for all items.
    - user-item: parameters that are specific to both user and item, parameters for the first item
        for all users are forced to be zero.
    - user-item-full: parameters that are specific to both user and item, explicitly modeled for all items.
    """
    def __init__(self,
                 coef_variation_dict: Dict[str, str],
                 num_param_dict: Optional[Dict[str, int]]=None,
                 num_items: Optional[int]=None,
                 num_users: Optional[int]=None
                 ) -> None:
        r"""
        Args:
            num_items (int): number of items in the dataset.
            num_users (int): number of users in the dataset.
            coef_variation_dict (Dict[str, str]): variable type to variation level dictionary. Keys of this dictionary
                should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc.), or `intercept`
                if the researcher requires an intercept term.
                For each variable name X_var (e.g., `user_income`) or `intercept`, the corresponding dictionary value should
                be one of the following; this value specifies the "level of variation" of the coefficient.
                - `constant`: the coefficient is constant over all users and items: $X \beta$.
                - `user`: user-specific parameters but constant across all items: $X \beta_{u}$.
                - `item`: item-specific parameters but constant across all users, $X \beta_{i}$.
                    Note that the coefficients for the first item are forced to be zero following the standard practice
                    in econometrics.
                - `item-full`: the same configuration as `item`, but does not force the coefficients of the first item to
                    be zeros.
                The following configurations are supported by the package, but we don't recommend using them due to the
                    large number of parameters.
                - `user-item`: parameters that are specific to both user and item, parameters for the first item
                    for all users are forced to be zero.
                - `user-item-full`: parameters that are specific to both user and item, explicitly modeled for all items.
            num_param_dict (Optional[Dict[str, int]]): variable type to number of parameters dictionary with keys exactly the same
                as the `coef_variation_dict`. Values of `num_param_dict` record the number of features in each kind of variable.
                If None is supplied, num_param_dict will be a dictionary with the same keys as the `coef_variation_dict` dictionary
                and values of all ones. Defaults to None.
        """
        super(ConditionalLogitModel, self).__init__()
        if num_param_dict is None:
            num_param_dict = {key: 1 for key in coef_variation_dict.keys()}
        self.coef_variation_dict = deepcopy(coef_variation_dict)
        self.num_param_dict = deepcopy(num_param_dict)
        # infer the number of parameters for intercept if the researcher forgets.
        # this runs before the key check below; otherwise the check would fail first and the warning could never fire.
        if 'intercept' in self.coef_variation_dict.keys() and 'intercept' not in self.num_param_dict.keys():
            warnings.warn("'intercept' key found in coef_variation_dict but not in num_param_dict, num_param_dict['intercept'] has been set to 1.")
            self.num_param_dict['intercept'] = 1
        assert self.coef_variation_dict.keys() == self.num_param_dict.keys()
        self.variable_types = list(self.num_param_dict.keys())
        self.num_items = num_items
        self.num_users = num_users
        # check that the numbers of parameters specified are all positive.
        for var_type, num_params in self.num_param_dict.items():
            assert num_params > 0, f'num_params needs to be positive, got: {num_params}.'
        # construct trainable parameters.
        coef_dict = dict()
        for var_type, variation in self.coef_variation_dict.items():
            coef_dict[var_type] = Coefficient(variation=variation,
                                              num_items=self.num_items,
                                              num_users=self.num_users,
                                              num_params=self.num_param_dict[var_type])
        # A ModuleDict is required to properly register all trainable parameters.
        # self.parameters() will fail if a python dictionary is used instead.
        self.coef_dict = nn.ModuleDict(coef_dict)
    def __repr__(self) -> str:
        """Return a string representation of the model.
        Returns:
            str: the string representation of the model.
        """
        out_str_lst = ['Conditional logistic discrete choice model, expects input features:\n']
        for var_type, num_params in self.num_param_dict.items():
            out_str_lst.append(f'X[{var_type}] with {num_params} parameters, with {self.coef_variation_dict[var_type]} level variation.')
        return super().__repr__() + '\n' + '\n'.join(out_str_lst)
    @property
    def num_params(self) -> int:
        """Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied
        with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no
        intercept is involved.
        Returns:
            int: the total number of learnable parameters.
        """
        return sum(w.numel() for w in self.parameters())
    def summary(self):
        """Prints the current model parameters."""
        for var_type, coefficient in self.coef_dict.items():
            if coefficient is not None:
                print('Variable Type: ', var_type)
                print(coefficient.coef)
def forward(self,
batch: ChoiceDataset,
manual_coef_value_dict: Optional[Dict[str, torch.Tensor]] = None
) -> torch.Tensor:
"""
Forward pass of the model.
Args:
batch: a `ChoiceDataset` object.
manual_coef_value_dict (Optional[Dict[str, torch.Tensor]], optional): a dictionary with
keys in {'u', 'i'} etc and tensors as values. If provided, the model will force
coefficient to be the provided values and compute utility conditioned on the provided
coefficient values. This feature is useful when the research wishes to plug in particular
values of coefficients and examine the utility values. If not provided, the model will
use the learned coefficient values in self.coef_dict.
Defaults to None.
Returns:
torch.Tensor: a tensor of shape (num_trips, num_items) whose (t, i) entry represents
the utility from item i in trip t for the user involved in that trip.
"""
x_dict = batch.x_dict
if 'intercept' in self.coef_variation_dict.keys():
# intercept term has no input tensor, which has only 1 feature.
x_dict['intercept'] = torch.ones((len(batch), self.num_items, 1), device=batch.device)
# compute the utility from each item in each choice session.
total_utility = torch.zeros((len(batch), self.num_items), device=batch.device)
# for each type of variables, apply the corresponding coefficient to input x.
for var_type, coef in self.coef_dict.items():
total_utility += coef(
x_dict[var_type], batch.user_index,
manual_coef_value=None if manual_coef_value_dict is None else manual_coef_value_dict[var_type])
assert total_utility.shape == (len(batch), self.num_items)
if batch.item_availability is not None:
# mask out unavailable items.
total_utility[~batch.item_availability[batch.session_index, :]] = torch.finfo(total_utility.dtype).min / 2
return total_utility
def negative_log_likelihood(self, batch: ChoiceDataset, y: torch.Tensor, is_train: bool=True) -> torch.Tensor:
"""Computes the negative log-likelihood for the batch and label.
TODO: consider remove y, change to label.
TODO: consider move this method outside the model, the role of the model is to compute the utility.
Args:
batch (ChoiceDataset): a ChoiceDataset object containing the data.
y (torch.Tensor): the label.
is_train (bool, optional): whether to put the model in training mode (self.train()) rather than evaluation mode (self.eval()). Defaults to True.
Returns:
torch.Tensor: the negative log-likelihood.
"""
if is_train:
self.train()
else:
self.eval()
# (num_trips, num_items)
total_utility = self.forward(batch)
logP = torch.log_softmax(total_utility, dim=1)
nll = - logP[torch.arange(len(y)), y].sum()
return nll
# NOTE: the method for computing Hessian and standard deviation has been moved to std.py.
# @staticmethod
# def flatten_coef_dict(coef_dict: Dict[str, Union[torch.Tensor, torch.nn.Parameter]]) -> Tuple[torch.Tensor, dict]:
# """Flattens the coef_dict into a 1-dimension tensor, used for hessian computation.
# Args:
# coef_dict (Dict[str, Union[torch.Tensor, torch.nn.Parameter]]): a dictionary holding learnable parameters.
# Returns:
# Tuple[torch.Tensor, dict]: 1. the flattened tensors with shape (num_params,), 2. an indexing dictionary
# used for reconstructing the original coef_dict from the flatten tensor.
# """
# type2idx = dict()
# param_list = list()
# start = 0
# for var_type in coef_dict.keys():
# num_params = coef_dict[var_type].coef.numel()
# # track which portion of all_param tensor belongs to this variable type.
# type2idx[var_type] = (start, start + num_params)
# start += num_params
# # use reshape instead of view to make a copy.
# param_list.append(coef_dict[var_type].coef.clone().reshape(-1,))
# all_param = torch.cat(param_list) # (self.num_params(), )
# return all_param, type2idx
# @staticmethod
# def unwrap_coef_dict(param: torch.Tensor, type2idx: Dict[str, Tuple[int, int]]) -> Dict[str, torch.Tensor]:
# """Rebuilds coef_dict from output of self.flatten_coef_dict method.
# Args:
# param (torch.Tensor): the flattened coef_dict from self.flatten_coef_dict.
# type2idx (Dict[str, Tuple[int, int]]): the indexing dictionary from self.flatten_coef_dict.
# Returns:
# Dict[str, torch.Tensor]: the re-constructed coefficient dictionary.
# """
# coef_dict = dict()
# for var_type in type2idx.keys():
# start, end = type2idx[var_type]
# # no need to reshape here, Coefficient handles it.
# coef_dict[var_type] = param[start:end]
# return coef_dict
# def compute_hessian(self, x_dict, availability, user_index, y) -> torch.Tensor:
# """Computes the Hessian of negative log-likelihood (total cross-entropy loss) with respect
# to all parameters in this model. The Hessian can be later used for constructing the standard deviation of
# parameters.
# Args:
# x_dict ,availability, user_index: see definitions in self.forward method.
# y (torch.LongTensor): a tensor with shape (num_trips,) of IDs of items actually purchased.
# Returns:
# torch.Tensor: a (self.num_params, self.num_params) tensor of the Hessian matrix.
# """
# all_coefs, type2idx = self.flatten_coef_dict(self.coef_dict)
# def compute_nll(P: torch.Tensor) -> float:
# coef_dict = self.unwrap_coef_dict(P, type2idx)
# y_pred = self._forward(x_dict=x_dict,
# availability=availability,
# user_index=user_index,
# manual_coef_value_dict=coef_dict)
# # the reduction needs to be 'sum' to obtain NLL.
# loss = F.cross_entropy(y_pred, y, reduction='sum')
# return loss
# H = torch.autograd.functional.hessian(compute_nll, all_coefs)
# assert H.shape == (self.num_params, self.num_params)
# return H
# def compute_std(self, x_dict, availability, user_index, y) -> Dict[str, torch.Tensor]:
# """Computes the standard errors of all coefficients from the inverse of the Hessian.
# Args:
# See definitions in self.compute_hessian.
# Returns:
# Dict[str, torch.Tensor]: a dictionary whose keys are the same as self.coef_dict.keys()
# the values are standard errors of coefficients in each coefficient group.
# """
# _, type2idx = self.flatten_coef_dict(self.coef_dict)
# H = self.compute_hessian(x_dict, availability, user_index, y)
# std_all = torch.sqrt(torch.diag(torch.inverse(H)))
# std_dict = dict()
# for var_type in type2idx.keys():
# # get std of variables belonging to each type.
# start, end = type2idx[var_type]
# std_dict[var_type] = std_all[start:end]
# return std_dict
num_params: int
property
readonly
Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no intercept is involved.
Returns:
Type | Description |
---|---|
int | the total number of learnable parameters. |
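The "K x number of users" rule above generalizes to the other variation levels. The sketch below is plain Python and not part of torch_choice: `count_coefficients` and `total_params` are hypothetical helpers. It assumes that the first item's coefficients in the `item` and `user-item` levels are normalized to zero and not stored as learnable values, which the source shown here does not confirm.

```python
from typing import Dict


def count_coefficients(variation: str, num_params: int,
                       num_items: int, num_users: int) -> int:
    """Hypothetical helper: free scalars in one coefficient group."""
    # Assumption: the zero-normalized first item is not counted.
    free_items = num_items - 1
    counts = {
        'constant': num_params,
        'user': num_params * num_users,
        'item': num_params * free_items,
        'item-full': num_params * num_items,
        'user-item': num_params * num_users * free_items,
        'user-item-full': num_params * num_users * num_items,
    }
    return counts[variation]


def total_params(coef_variation_dict: Dict[str, str],
                 num_param_dict: Dict[str, int],
                 num_items: int, num_users: int) -> int:
    """Sum the free scalars over all variable types."""
    return sum(
        count_coefficients(level, num_param_dict[name], num_items, num_users)
        for name, level in coef_variation_dict.items()
    )


# A single user-specific coefficient on a 3-dimensional observable, 10 users.
n = total_params({'user_obs': 'user'}, {'user_obs': 3}, num_items=4, num_users=10)
```

In the model itself, `num_params` simply sums `numel()` over the registered parameters, so the true count depends on how `Coefficient` stores its tensors.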
__init__(self, coef_variation_dict, num_param_dict=None, num_items=None, num_users=None)
special
Parameters:
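Name | Type | Description | Default |
---|---|---|---|
num_items | int | number of items in the dataset. | None |
num_users | int | number of users in the dataset. | None |
coef_variation_dict | Dict[str, str] | variable type to variation level dictionary. Keys of this dictionary should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc.), or `intercept` if the researcher requires an intercept term. Each value specifies the level of variation of the coefficient and should be one of `constant`, `user`, `item`, or `item-full`. The levels `user-item` and `user-item-full` are supported by the package, but we don't recommend using them due to the large number of parameters. | required |
num_param_dict | Optional[Dict[str, int]] | variable type to number of parameters dictionary with keys exactly the same as the `coef_variation_dict`. Values record the number of features in each kind of variable. If None is supplied, every value defaults to one. | None |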
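The relationship between the two dictionaries can be illustrated with a small stand-alone sketch. `resolve_num_param_dict` is a hypothetical helper, not a torch_choice function; it mirrors the default filling and key check performed in `__init__`, and additionally validates the variation level, which is an assumption about where that check belongs. The variable names `price_obs` and `user_obs` are made up for illustration.

```python
from typing import Dict, Optional

# Variation levels listed in the docstring of ConditionalLogitModel.__init__.
ALLOWED_LEVELS = {'constant', 'user', 'item', 'item-full',
                  'user-item', 'user-item-full'}


def resolve_num_param_dict(coef_variation_dict: Dict[str, str],
                           num_param_dict: Optional[Dict[str, int]] = None
                           ) -> Dict[str, int]:
    """Hypothetical helper: fill defaults and validate the configuration."""
    if num_param_dict is None:
        # Same default as the constructor: one feature per variable type.
        num_param_dict = {key: 1 for key in coef_variation_dict}
    if coef_variation_dict.keys() != num_param_dict.keys():
        raise ValueError('both dictionaries must have exactly the same keys')
    for name, level in coef_variation_dict.items():
        if level not in ALLOWED_LEVELS:
            raise ValueError(f'unknown variation level {level!r} for {name!r}')
        if num_param_dict[name] <= 0:
            raise ValueError(f'num_params must be positive for {name!r}')
    return dict(num_param_dict)


variation = {'price_obs': 'constant', 'user_obs': 'item', 'intercept': 'item'}
defaults = resolve_num_param_dict(variation)
explicit = resolve_num_param_dict(
    variation, {'price_obs': 2, 'user_obs': 3, 'intercept': 1})
```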
Source code in torch_choice/model/conditional_logit_model.py
def __init__(self,
coef_variation_dict: Dict[str, str],
num_param_dict: Optional[Dict[str, int]]=None,
num_items: Optional[int]=None,
num_users: Optional[int]=None
) -> None:
"""
Args:
num_items (int): number of items in the dataset.
num_users (int): number of users in the dataset.
coef_variation_dict (Dict[str, str]): variable type to variation level dictionary. Keys of this dictionary
should be variable names in the dataset (i.e., those starting with `price_`, `user_`, etc.), or `intercept`
if the researcher requires an intercept term.
For each variable name X_var (e.g., `user_income`) or `intercept`, the corresponding dictionary value should
be one of the following; this value specifies the "level of variation" of the coefficient.
- `constant`: the coefficient is constant over all users and items: $X \beta$.
- `user`: user-specific parameters but constant across all items: $X \beta_{u}$.
- `item`: item-specific parameters but constant across all users, $X \beta_{i}$.
Note that the coefficients for the first item are forced to be zero following the standard practice
in econometrics.
- `item-full`: the same configuration as `item`, but does not force the coefficients of the first item to
be zeros.
The following configurations are supported by the package, but we don't recommend using them due to the
large number of parameters.
- `user-item`: parameters that are specific to both user and item, parameter for the first item
for all users are forced to be zero.
- `user-item-full`: parameters that are specific to both user and item, explicitly model for all items.
num_param_dict (Optional[Dict[str, int]]): variable type to number of parameters dictionary with keys exactly the same
as the `coef_variation_dict`. Values of `num_param_dict` records numbers of features in each kind of variable.
If None is supplied, num_param_dict will be a dictionary with the same keys as the `coef_variation_dict` dictionary
and values of all ones. Default to be None.
"""
super(ConditionalLogitModel, self).__init__()
if num_param_dict is None:
num_param_dict = {key:1 for key in coef_variation_dict.keys()}
assert coef_variation_dict.keys() == num_param_dict.keys()
self.variable_types = list(deepcopy(num_param_dict).keys())
self.coef_variation_dict = deepcopy(coef_variation_dict)
self.num_param_dict = deepcopy(num_param_dict)
self.num_items = num_items
self.num_users = num_users
# check number of parameters specified are all positive.
for var_type, num_params in self.num_param_dict.items():
assert num_params > 0, f'num_params needs to be positive, got: {num_params}.'
# infer the number of parameters for intercept if the researcher forgets.
if 'intercept' in self.coef_variation_dict.keys() and 'intercept' not in self.num_param_dict.keys():
warnings.warn("'intercept' key found in coef_variation_dict but not in num_param_dict, num_param_dict['intercept'] has been set to 1.")
self.num_param_dict['intercept'] = 1
# construct trainable parameters.
coef_dict = dict()
for var_type, variation in self.coef_variation_dict.items():
coef_dict[var_type] = Coefficient(variation=variation,
num_items=self.num_items,
num_users=self.num_users,
num_params=self.num_param_dict[var_type])
# A ModuleDict is required to properly register all trainable parameters.
# self.parameters() will not find them if a plain Python dictionary is used instead.
self.coef_dict = nn.ModuleDict(coef_dict)
__repr__(self)
special
Return a string representation of the model.
Returns:
Type | Description |
---|---|
str | the string representation of the model. |
Source code in torch_choice/model/conditional_logit_model.py
def __repr__(self) -> str:
"""Return a string representation of the model.
Returns:
str: the string representation of the model.
"""
out_str_lst = ['Conditional logistic discrete choice model, expects input features:\n']
for var_type, num_params in self.num_param_dict.items():
out_str_lst.append(f'X[{var_type}] with {num_params} parameters, with {self.coef_variation_dict[var_type]} level variation.')
return super().__repr__() + '\n' + '\n'.join(out_str_lst)
forward(self, batch, manual_coef_value_dict=None)
Forward pass of the model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | ChoiceDataset | a `ChoiceDataset` object. | required |
manual_coef_value_dict | Optional[Dict[str, torch.Tensor]] | a dictionary with variable types as keys (the same keys as self.coef_dict) and tensors as values. If provided, the model will force the coefficients to be the provided values and compute utility conditioned on them. This feature is useful when the researcher wishes to plug in particular values of coefficients and examine the utility values. If not provided, the model will use the learned coefficient values in self.coef_dict. Defaults to None. | None |
Returns:
Type | Description |
---|---|
torch.Tensor | a tensor of shape (num_trips, num_items) whose (t, i) entry represents the utility from item i in trip t for the user involved in that trip. |
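To make the utility and masking logic concrete, here is a dependency-free sketch for the simplest case of one variable type with a `constant` coefficient. `masked_utilities` and `softmax` are hypothetical helpers written in plain Python rather than PyTorch, and the floor value `-1e30` is a stand-in for `torch.finfo(dtype).min / 2`. In the model itself, availability is looked up per session via `batch.session_index`; here it is passed per trip for brevity.

```python
import math
from typing import List

FLOOR = -1e30  # stand-in for torch.finfo(dtype).min / 2


def masked_utilities(x: List[List[List[float]]], beta: List[float],
                     available: List[List[bool]]) -> List[List[float]]:
    """x[t][i] holds the features of item i in trip t; beta is shared by all."""
    utilities = []
    for x_t, avail_t in zip(x, available):
        row = []
        for x_ti, is_available in zip(x_t, avail_t):
            u = sum(x_k * b_k for x_k, b_k in zip(x_ti, beta))
            # Unavailable items get a huge negative utility.
            row.append(u if is_available else FLOOR)
        utilities.append(row)
    return utilities


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one trip."""
    m = max(row)
    exps = [math.exp(u - m) for u in row]
    total = sum(exps)
    return [e / total for e in exps]


x = [[[1.0, 2.0], [0.5, 0.0], [3.0, 1.0]]]   # one trip, three items
beta = [0.2, -0.1]
available = [[True, True, False]]            # third item is unavailable
u = masked_utilities(x, beta, available)
p = softmax(u[0])
```

Because the masked utility is finite rather than `-inf`, the softmax stays well defined while the unavailable item still receives a probability that underflows to zero.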
Source code in torch_choice/model/conditional_logit_model.py
def forward(self,
batch: ChoiceDataset,
manual_coef_value_dict: Optional[Dict[str, torch.Tensor]] = None
) -> torch.Tensor:
"""
Forward pass of the model.
Args:
batch: a `ChoiceDataset` object.
manual_coef_value_dict (Optional[Dict[str, torch.Tensor]], optional): a dictionary with
variable types as keys (the same keys as self.coef_dict) and tensors as values. If provided,
the model will force the coefficients to be the provided values and compute utility conditioned
on them. This feature is useful when the researcher wishes to plug in particular
values of coefficients and examine the utility values. If not provided, the model will
use the learned coefficient values in self.coef_dict.
Defaults to None.
Returns:
torch.Tensor: a tensor of shape (num_trips, num_items) whose (t, i) entry represents
the utility from item i in trip t for the user involved in that trip.
"""
x_dict = batch.x_dict
if 'intercept' in self.coef_variation_dict.keys():
# the intercept term has no input tensor; use a constant tensor of ones with 1 feature.
x_dict['intercept'] = torch.ones((len(batch), self.num_items, 1), device=batch.device)
# compute the utility from each item in each choice session.
total_utility = torch.zeros((len(batch), self.num_items), device=batch.device)
# for each type of variables, apply the corresponding coefficient to input x.
for var_type, coef in self.coef_dict.items():
total_utility += coef(
x_dict[var_type], batch.user_index,
manual_coef_value=None if manual_coef_value_dict is None else manual_coef_value_dict[var_type])
assert total_utility.shape == (len(batch), self.num_items)
if batch.item_availability is not None:
# mask out unavailable items.
total_utility[~batch.item_availability[batch.session_index, :]] = torch.finfo(total_utility.dtype).min / 2
return total_utility
negative_log_likelihood(self, batch, y, is_train=True)
Computes the negative log-likelihood for the batch and label. TODO: consider removing y and renaming it to label. TODO: consider moving this method outside the model, since the role of the model is to compute the utility.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | ChoiceDataset | a ChoiceDataset object containing the data. | required |
y | torch.Tensor | the label. | required |
is_train | bool | whether to put the model in training mode (self.train()) rather than evaluation mode (self.eval()). Defaults to True. | True |
Returns:
Type | Description |
---|---|
torch.Tensor | the negative log-likelihood. |
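The computation behind this method is a log-softmax over items followed by a sum of the chosen items' log probabilities. The following plain-Python sketch reproduces that arithmetic without PyTorch; `log_softmax` and `nll_from_utilities` are hypothetical names, and `y` is assumed to hold the index of the chosen item in each trip.

```python
import math
from typing import List


def log_softmax(row: List[float]) -> List[float]:
    """Stable log-softmax over the items of one trip."""
    m = max(row)
    log_norm = m + math.log(sum(math.exp(u - m) for u in row))
    return [u - log_norm for u in row]


def nll_from_utilities(utilities: List[List[float]], y: List[int]) -> float:
    """Negative log-likelihood, summed (not averaged) over trips."""
    return -sum(log_softmax(row)[label] for row, label in zip(utilities, y))


# Two trips with two equally attractive items each: every choice has
# probability 1/2, so the summed NLL is 2 * log(2).
nll = nll_from_utilities([[0.0, 0.0], [0.0, 0.0]], [0, 1])
```

Note that the result is a sum over the batch, so its magnitude grows with the number of choice instances.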
Source code in torch_choice/model/conditional_logit_model.py
def negative_log_likelihood(self, batch: ChoiceDataset, y: torch.Tensor, is_train: bool=True) -> torch.Tensor:
"""Computes the negative log-likelihood for the batch and label.
TODO: consider remove y, change to label.
TODO: consider move this method outside the model, the role of the model is to compute the utility.
Args:
batch (ChoiceDataset): a ChoiceDataset object containing the data.
y (torch.Tensor): the label.
is_train (bool, optional): whether to put the model in training mode (self.train()) rather than evaluation mode (self.eval()). Defaults to True.
Returns:
torch.Tensor: the negative log-likelihood.
"""
if is_train:
self.train()
else:
self.eval()
# (num_trips, num_items)
total_utility = self.forward(batch)
logP = torch.log_softmax(total_utility, dim=1)
nll = - logP[torch.arange(len(y)), y].sum()
return nll
summary(self)
Print out the current model parameters.
nested_logit_model
Implementation of the nested logit model; see page 86 of the book "Discrete Choice Methods with Simulation" by Train for more details.
Author: Tianyu Du Update: Apr. 28, 2022
NestedLogitModel (Module)
Source code in torch_choice/model/nested_logit_model.py
class NestedLogitModel(nn.Module):
def __init__(self,
category_to_item: Dict[object, List[int]],
category_coef_variation_dict: Dict[str, str],
category_num_param_dict: Dict[str, int],
item_coef_variation_dict: Dict[str, str],
item_num_param_dict: Dict[str, int],
num_users: Optional[int]=None,
shared_lambda: bool=False
) -> None:
"""Initialization method of the nested logit model.
Args:
category_to_item (Dict[object, List[int]]): a dictionary mapping a category ID to the list
of item IDs in that category.
category_coef_variation_dict (Dict[str, str]): a dictionary mapping a variable type
(i.e., variable group) to the level of variation for the coefficient of this type
of variables.
category_num_param_dict (Dict[str, int]): a dictionary mapping a variable type name to
the number of parameters in this variable group.
item_coef_variation_dict (Dict[str, str]): the same as category_coef_variation_dict but
for item features.
item_num_param_dict (Dict[str, int]): the same as category_num_param_dict but for item
features.
num_users (Optional[int], optional): number of users to be modelled; this is only
required if any variable type requires user-specific variation.
Defaults to None.
shared_lambda (bool): a boolean indicating whether to enforce the elasticity lambda, which
is the coefficient for inclusive values, to be constant for all categories.
The lambda enters the category-level selection as the following
Utility of choosing category k = lambda * inclusive value of category k
+ linear combination of some other category level features
If set to True, a single lambda will be learned for all categories, otherwise, the
model learns an individual lambda for each category.
Defaults to False.
"""
super(NestedLogitModel, self).__init__()
self.category_to_item = category_to_item
self.category_coef_variation_dict = category_coef_variation_dict
self.category_num_param_dict = category_num_param_dict
self.item_coef_variation_dict = item_coef_variation_dict
self.item_num_param_dict = item_num_param_dict
self.num_users = num_users
self.categories = list(category_to_item.keys())
self.num_categories = len(self.categories)
self.num_items = sum(len(items) for items in category_to_item.values())
# category coefficients.
self.category_coef_dict = self._build_coef_dict(self.category_coef_variation_dict,
self.category_num_param_dict,
self.num_categories)
# item coefficients.
self.item_coef_dict = self._build_coef_dict(self.item_coef_variation_dict,
self.item_num_param_dict,
self.num_items)
self.shared_lambda = shared_lambda
if self.shared_lambda:
self.lambda_weight = nn.Parameter(torch.ones(1), requires_grad=True)
else:
self.lambda_weight = nn.Parameter(torch.ones(self.num_categories) / 2, requires_grad=True)
# breakpoint()
# self.iv_weights = nn.Parameter(torch.ones(1), requires_grad=True)
# used to warn users if forgot to call clamp.
self._clamp_called_flag = True
@property
def num_params(self) -> int:
"""Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied
with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no
intercept is involved.
Returns:
int: the total number of learnable parameters.
"""
return sum(w.numel() for w in self.parameters())
def _build_coef_dict(self,
coef_variation_dict: Dict[str, str],
num_param_dict: Dict[str, int],
num_items: int) -> nn.ModuleDict:
"""Builds a coefficient dictionary containing all trainable components of the model, mapping coefficient names
to the corresponding Coefficient Module.
num_items could be the actual number of items or the number of categories, depending on the use case.
NOTE: torch-choice users don't directly interact with this method.
Args:
coef_variation_dict (Dict[str, str]): a dictionary mapping coefficient names (e.g., theta_user) to the level
of variation (e.g., 'user').
num_param_dict (Dict[str, int]): a dictionary mapping coefficient names to the number of parameters in this
coefficient. Be aware that, for example, if there is one K-dimensional coefficient for every user, then
the `num_param` should be K instead of K x number of users.
num_items (int): the total number of items in the prediction problem. `num_items` should be the number of
categories if _build_coef_dict() is used for category-level prediction.
Returns:
nn.ModuleDict: a PyTorch ModuleDict object mapping from coefficient names to training Coefficient.
"""
coef_dict = dict()
for var_type, variation in coef_variation_dict.items():
num_params = num_param_dict[var_type]
coef_dict[var_type] = Coefficient(variation=variation,
num_items=num_items,
num_users=self.num_users,
num_params=num_params)
return nn.ModuleDict(coef_dict)
# def _check_input_shapes(self, category_x_dict, item_x_dict, user_index, item_availability) -> None:
# T = list(category_x_dict.values())[0].shape[0] # batch size.
# for var_type, x_category in category_x_dict.items():
# x_item = item_x_dict[var_type]
# assert len(x_category.shape) == len(x_item.shape) == 3
# assert x_category.shape[0] == x_item.shape[0]
# assert x_category.shape == (T, self.num_categories, self.category_num_param_dict[var_type])
# assert x_item.shape == (T, self.num_items, self.item_num_param_dict[var_type])
# if (user_index is not None) and (self.num_users is not None):
# assert user_index.shape == (T,)
# if item_availability is not None:
# assert item_availability.shape == (T, self.num_items)
def forward(self, batch: ChoiceDataset) -> torch.Tensor:
"""A standard forward method for the model: the user feeds a ChoiceDataset batch and the model returns the
predicted log-probability tensor. The main forward pass happens in the _forward() method, but we provide
this wrapper forward() method for a cleaner API, as forward() only requires a single batch argument.
For more details about the forward pass, please refer to the _forward() method.
# TODO: the ConditionalLogitModel returns predicted utility, the NestedLogitModel behaves the same?
Args:
batch (ChoiceDataset): a ChoiceDataset object containing the data batch.
Returns:
torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
of choosing item i in trip t.
"""
return self._forward(batch['category'].x_dict,
batch['item'].x_dict,
batch['item'].user_index,
batch['item'].item_availability)
def _forward(self,
category_x_dict: Dict[str, torch.Tensor],
item_x_dict: Dict[str, torch.Tensor],
user_index: Optional[torch.LongTensor] = None,
item_availability: Optional[torch.BoolTensor] = None
) -> torch.Tensor:
"""Computes log P[t, i] = the log probability for the user involved in trip t to choose item i.
Let n denote the ID of the user involved in trip t, then P[t, i] = P_{ni} on page 86 of the
book "discrete choice methods with simulation" by Train.
Args:
category_x_dict (Dict[str, torch.Tensor]): a dictionary mapping each variable type to a tensor with
shape (num_trips, num_categories, *) including features of all categories in each trip.
item_x_dict (Dict[str, torch.Tensor]): a dictionary mapping each variable type to a tensor with
shape (num_trips, num_items, *) including features of all items in each trip.
user_index (torch.LongTensor): a tensor of shape (num_trips,) indicating which user is
making decision in each trip. Setting user_index = None assumes the same user is
making decisions in all trips.
item_availability (torch.BoolTensor): a boolean tensor with shape (num_trips, num_items)
indicating the availability of items in each trip. If item_availability[t, i] = False,
the utility of choosing item i in trip t, V[t, i], will be set to a very large negative value
(torch.finfo(dtype).min / 2), which acts as -inf.
Given the decomposition V[t, i] = W[t, k(i)] + Y[t, i] + eps, this is done
by setting Y[t, i] to that value for unavailable items.
Returns:
torch.Tensor: a tensor of shape (num_trips, num_items) including the log probability
of choosing item i in trip t.
"""
if self.shared_lambda:
self.lambdas = self.lambda_weight.expand(self.num_categories)
else:
self.lambdas = self.lambda_weight
# if not self._clamp_called_flag:
# warnings.warn('Did you forget to call clamp_lambdas() after optimizer.step()?')
# The overall utility of item can be decomposed into V[item] = W[category] + Y[item] + eps.
T = list(item_x_dict.values())[0].shape[0]
device = list(item_x_dict.values())[0].device
# compute category-specific utility with shape (T, num_categories).
W = torch.zeros(T, self.num_categories).to(device)
if 'intercept' in self.category_coef_variation_dict.keys():
category_x_dict['intercept'] = torch.ones((T, self.num_categories, 1)).to(device)
for var_type, coef in self.category_coef_dict.items():
W += coef(category_x_dict[var_type], user_index)
# compute item-specific utility (T, num_items).
Y = torch.zeros(T, self.num_items).to(device)
for var_type, coef in self.item_coef_dict.items():
Y += coef(item_x_dict[var_type], user_index)
if item_availability is not None:
Y[~item_availability] = torch.finfo(Y.dtype).min / 2
# =============================================================================
# compute the inclusive value of each category.
inclusive_value = dict()
for k, Bk in self.category_to_item.items():
# for nest k, divide the Y of all items in Bk by lambda_k.
Y[:, Bk] /= self.lambdas[k]
# compute inclusive value for category k.
# unavailable items were masked in Y above, so they contribute (almost) nothing here.
inclusive_value[k] = torch.logsumexp(Y[:, Bk], dim=1, keepdim=False) # (T,)
# broadcast inclusive value from (T, num_categories) to (T, num_items).
# for trip t, I[t, i] is the inclusive value of the category item i belongs to.
I = torch.zeros(T, self.num_items).to(device)
for k, Bk in self.category_to_item.items():
I[:, Bk] = inclusive_value[k].view(-1, 1) # (T, |Bk|)
# logP_item[t, i] = log P(ni|Bk), where Bk is the category item i is in, n is the user in trip t.
logP_item = Y - I # (T, num_items)
# =============================================================================
# logP_category[t, i] = log P(Bk), for item i in trip t, the probability of choosing the nest/bucket
# item i belongs to. logP_category has shape (T, num_items)
# logit[t, i] = W[n, k] + lambda[k] I[n, k], where n is the user involved in trip t, k is
# the category item i belongs to.
logit = torch.zeros(T, self.num_items).to(device)
for k, Bk in self.category_to_item.items():
logit[:, Bk] = (W[:, k] + self.lambdas[k] * inclusive_value[k]).view(-1, 1) # (T, |Bk|)
# only count each category once in the logsumexp within the category level model.
cols = [x[0] for x in self.category_to_item.values()]
logP_category = logit - torch.logsumexp(logit[:, cols], dim=1, keepdim=True)
# =============================================================================
# compute the joint log P_{ni} as in the textbook.
logP = logP_item + logP_category
self._clamp_called_flag = False
return logP
def log_likelihood(self, *args):
"""Computes the log likelihood of the model, please refer to the negative_log_likelihood() method.
Returns:
torch.Tensor: the log likelihood of the model.
"""
return - self.negative_log_likelihood(*args)
def negative_log_likelihood(self,
batch: ChoiceDataset,
y: torch.LongTensor,
is_train: bool=True) -> torch.Tensor:
"""Computes the negative log likelihood of the model. Please note the log-likelihood is summed over all samples
in batch instead of the average.
Args:
batch (ChoiceDataset): the ChoiceDataset object containing the data.
y (torch.LongTensor): the label.
is_train (bool, optional): which mode of the model to be used for the forward passing, if we need Hessian
of the NLL through auto-grad, `is_train` should be set to True. If we merely need a performance metric,
then `is_train` can be set to False for better performance.
Defaults to True.
Returns:
torch.Tensor: a scalar tensor holding the negative log likelihood of the model.
"""
# compute the negative log-likelihood loss directly.
if is_train:
self.train()
else:
self.eval()
# (num_trips, num_items)
logP = self.forward(batch)
nll = - logP[torch.arange(len(y)), y].sum()
return nll
# def clamp_lambdas(self):
# """
# Restrict values of lambdas to 0 < lambda <= 1 to guarantee the utility maximization property
# of the model.
# This method should be called every time after optimizer.step().
# We add a self._clamp_called_flag to remind researchers if this method is not called.
# """
# for k in range(len(self.lambdas)):
# self.lambdas[k] = torch.clamp(self.lambdas[k], 1e-5, 1)
# self._clamp_called_flag = True
# @staticmethod
# def add_constant(x: torch.Tensor, where: str='prepend') -> torch.Tensor:
# """A helper function used to add constant to feature tensor,
# x has shape (batch_size, num_classes, num_parameters),
# returns a tensor of shape (*, num_parameters+1).
# """
# batch_size, num_classes, num_parameters = x.shape
# ones = torch.ones((batch_size, num_classes, 1))
# if where == 'prepend':
# new = torch.cat((ones, x), dim=-1)
# elif where == 'append':
# new = torch.cat((x, ones), dim=-1)
# else:
# raise Exception
# return new
num_params: int
property
readonly
Get the total number of parameters. For example, if there is only a user-specific coefficient to be multiplied with the K-dimensional observable, then the total number of parameters would be K x number of users, assuming no intercept is involved.
Returns:
Type | Description |
---|---|
int | the total number of learnable parameters. |
__init__(self, category_to_item, category_coef_variation_dict, category_num_param_dict, item_coef_variation_dict, item_num_param_dict, num_users=None, shared_lambda=False)
special
Initialization method of the nested logit model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
category_to_item | Dict[object, List[int]] | a dictionary mapping a category ID to the list of item IDs in that category. | required |
category_coef_variation_dict | Dict[str, str] | a dictionary mapping a variable type (i.e., variable group) to the level of variation for the coefficient of this type of variables. | required |
category_num_param_dict | Dict[str, int] | a dictionary mapping a variable type name to the number of parameters in this variable group. | required |
item_coef_variation_dict | Dict[str, str] | the same as category_coef_variation_dict but for item features. | required |
item_num_param_dict | Dict[str, int] | the same as category_num_param_dict but for item features. | required |
num_users | Optional[int] | number of users to be modelled; this is only required if any variable type requires user-specific variation. Defaults to None. | None |
shared_lambda | bool | a boolean indicating whether to enforce the elasticity lambda, which is the coefficient for inclusive values, to be constant for all categories. The lambda enters the category-level selection as follows: utility of choosing category k = lambda * inclusive value of category k + linear combination of some other category-level features. If set to True, a single lambda will be learned for all categories; otherwise, the model learns an individual lambda for each category. Defaults to False. | False |
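The effect of `shared_lambda` can be sketched without PyTorch. In the snippet below, `expand_lambdas` and `category_utility` are hypothetical helpers that mimic how a single shared lambda is expanded to one value per category and how lambda enters the category-level utility. The toy `category_to_item` mapping is made up, and integer category IDs 0, 1, ... are assumed.

```python
from typing import Dict, List

# Toy nesting structure (an assumption for illustration): two categories.
category_to_item: Dict[int, List[int]] = {0: [0, 1], 1: [2, 3, 4]}
num_categories = len(category_to_item)
num_items = sum(len(items) for items in category_to_item.values())


def expand_lambdas(lambda_weight: List[float], num_categories: int,
                   shared_lambda: bool) -> List[float]:
    """Hypothetical helper: one lambda per category."""
    if shared_lambda:
        (value,) = lambda_weight          # exactly one learnable scalar
        return [value] * num_categories
    if len(lambda_weight) != num_categories:
        raise ValueError('need one lambda per category')
    return list(lambda_weight)


def category_utility(w_k: float, lambda_k: float, inclusive_value_k: float) -> float:
    """Utility of category k = lambda_k * inclusive value + other category terms."""
    return w_k + lambda_k * inclusive_value_k


shared = expand_lambdas([1.0], num_categories, shared_lambda=True)
separate = expand_lambdas([0.5, 0.5], num_categories, shared_lambda=False)
```

The starting values used above match the constructor shown in the source: a single 1.0 when lambda is shared, and 0.5 per category otherwise.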
Source code in torch_choice/model/nested_logit_model.py
def __init__(self,
category_to_item: Dict[object, List[int]],
category_coef_variation_dict: Dict[str, str],
category_num_param_dict: Dict[str, int],
item_coef_variation_dict: Dict[str, str],
item_num_param_dict: Dict[str, int],
num_users: Optional[int]=None,
shared_lambda: bool=False
) -> None:
"""Initialization method of the nested logit model.
Args:
category_to_item (Dict[object, List[int]]): a dictionary mapping a category ID to the list
of item IDs in that category.
category_coef_variation_dict (Dict[str, str]): a dictionary mapping a variable type
(i.e., variable group) to the level of variation for the coefficient of this type
of variables.
category_num_param_dict (Dict[str, int]): a dictionary mapping a variable type name to
the number of parameters in this variable group.
item_coef_variation_dict (Dict[str, str]): the same as category_coef_variation_dict but
for item features.
item_num_param_dict (Dict[str, int]): the same as category_num_param_dict but for item
features.
num_users (Optional[int], optional): number of users to be modelled; this is only
required if any variable type requires user-specific variation.
Defaults to None.
shared_lambda (bool): a boolean indicating whether to enforce the elasticity lambda, which
is the coefficient for inclusive values, to be constant for all categories.
The lambda enters the category-level selection as the following
Utility of choosing category k = lambda * inclusive value of category k
+ linear combination of some other category level features
If set to True, a single lambda will be learned for all categories, otherwise, the
model learns an individual lambda for each category.
Defaults to False.
"""
super(NestedLogitModel, self).__init__()
self.category_to_item = category_to_item
self.category_coef_variation_dict = category_coef_variation_dict
self.category_num_param_dict = category_num_param_dict
self.item_coef_variation_dict = item_coef_variation_dict
self.item_num_param_dict = item_num_param_dict
self.num_users = num_users
self.categories = list(category_to_item.keys())
self.num_categories = len(self.categories)
self.num_items = sum(len(items) for items in category_to_item.values())
# category coefficients.
self.category_coef_dict = self._build_coef_dict(self.category_coef_variation_dict,
self.category_num_param_dict,
self.num_categories)
# item coefficients.
self.item_coef_dict = self._build_coef_dict(self.item_coef_variation_dict,
self.item_num_param_dict,
self.num_items)
self.shared_lambda = shared_lambda
if self.shared_lambda:
self.lambda_weight = nn.Parameter(torch.ones(1), requires_grad=True)
else:
self.lambda_weight = nn.Parameter(torch.ones(self.num_categories) / 2, requires_grad=True)
# breakpoint()
# self.iv_weights = nn.Parameter(torch.ones(1), requires_grad=True)
# used to warn users if forgot to call clamp.
self._clamp_called_flag = True
forward(self, batch)
A standard forward method for the model: the user feeds in a ChoiceDataset batch and the model returns the predicted log-likelihood tensor. The main forward pass happens in the _forward() method; this wrapper forward() method provides a cleaner API, since it requires only a single batch argument. For more details about the forward pass, please refer to the _forward() method.
TODO: the ConditionalLogitModel returns predicted utility; does the NestedLogitModel behave the same?
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | ChoiceDataset | a ChoiceDataset object containing the data batch. | required |
Returns:
Type | Description |
---|---|
torch.Tensor | a tensor of shape (num_trips, num_items) containing the log probability of choosing item i in trip t. |
Source code in torch_choice/model/nested_logit_model.py
def forward(self, batch: ChoiceDataset) -> torch.Tensor:
    """A standard forward method for the model: the user feeds in a ChoiceDataset batch and the model returns
    the predicted log-likelihood tensor. The main forward pass happens in the _forward() method; this wrapper
    forward() method provides a cleaner API, since it requires only a single batch argument.
    For more details about the forward pass, please refer to the _forward() method.

    # TODO: the ConditionalLogitModel returns predicted utility; does the NestedLogitModel behave the same?

    Args:
        batch (ChoiceDataset): a ChoiceDataset object containing the data batch.

    Returns:
        torch.Tensor: a tensor of shape (num_trips, num_items) containing the log probability
            of choosing item i in trip t.
    """
    return self._forward(batch['category'].x_dict,
                         batch['item'].x_dict,
                         batch['item'].user_index,
                         batch['item'].item_availability)
log_likelihood(self, *args)
Computes the log likelihood of the model. Please refer to the negative_log_likelihood() method.
Returns:
Type | Description |
---|---|
_type_ | the log likelihood of the model. |
negative_log_likelihood(self, batch, y, is_train=True)
Computes the negative log likelihood of the model. Note that the log-likelihood is summed over all samples in the batch rather than averaged.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | ChoiceDataset | the ChoiceDataset object containing the data. | required |
y | torch.LongTensor | the label. | required |
is_train | bool | which mode of the model to use for the forward pass. If the Hessian of the NLL is needed through auto-grad, is_train should be set to True; if only a performance metric is needed, is_train can be set to False for better performance. | True |
Returns:
Type | Description |
---|---|
torch.scalar_tensor | the negative log likelihood of the model. |
Source code in torch_choice/model/nested_logit_model.py
def negative_log_likelihood(self,
                            batch: ChoiceDataset,
                            y: torch.LongTensor,
                            is_train: bool=True) -> torch.scalar_tensor:
    """Computes the negative log likelihood of the model. Note that the log-likelihood is summed over all
    samples in the batch rather than averaged.

    Args:
        batch (ChoiceDataset): the ChoiceDataset object containing the data.
        y (torch.LongTensor): the label.
        is_train (bool, optional): which mode of the model to use for the forward pass. If the Hessian
            of the NLL is needed through auto-grad, `is_train` should be set to True. If only a performance
            metric is needed, `is_train` can be set to False for better performance.
            Defaults to True.

    Returns:
        torch.scalar_tensor: the negative log likelihood of the model.
    """
    # compute the negative log-likelihood loss directly.
    if is_train:
        self.train()
    else:
        self.eval()
    # (num_trips, num_items)
    logP = self.forward(batch)
    # select the log-probability of the chosen item in each row, then sum over the batch.
    nll = - logP[torch.arange(len(y)), y].sum()
    return nll
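The indexing step in negative_log_likelihood() can be tried in isolation. The following is a minimal sketch, not the library's code: it uses a hypothetical (num_trips, num_items) tensor of log-probabilities, built with log_softmax purely for illustration, in place of the output of forward(), and it checks the summed NLL against PyTorch's built-in nll_loss with reduction='sum'.

```python
import torch
import torch.nn.functional as F

# Hypothetical scores for 3 choice instances over 4 items; log_softmax turns each
# row into valid log-probabilities, standing in for the output of forward().
scores = torch.tensor([[2.0, 0.5, 0.1, -1.0],
                       [0.0, 0.0, 0.0, 0.0],
                       [1.0, 3.0, 0.2, 0.3]])
log_p = torch.log_softmax(scores, dim=1)  # shape (num_trips, num_items)

# Index of the chosen item in each choice instance.
y = torch.tensor([0, 2, 1])

# Row i contributes log_p[i, y[i]]; the contributions are summed, not averaged.
nll_sum = -log_p[torch.arange(len(y)), y].sum()

# Dividing by the batch size recovers the per-instance average if one is needed.
nll_mean = nll_sum / len(y)

# Cross-check against PyTorch's built-in loss with a sum reduction.
reference = F.nll_loss(log_p, y, reduction='sum')

print(nll_sum.item(), nll_mean.item(), reference.item())
```

Because the loss is a sum, its magnitude grows with the batch size, which matters when comparing values across batches of different lengths.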