recpkg package¶

Submodules¶

recpkg.evaluation module¶

recpkg.evaluation.evaluate_model(ModelConstructor, model_name, X_train, X_test, y_train, y_test, seed_val, configs, plot=False, nb=False)¶

Evaluate multiple model configs.

Parameters

ModelConstructor (KerasRecommender) – The constructor of the model which is being evaluated.
model_name (String) – The name of the model which is being evaluated.
X_train (ndarray of shape (n_samples, 2)) – This is the train set. An array where each row consists of a user and an item.
X_test (ndarray of shape (n_samples, 2)) – This is the test set. An array where each row consists of a user and an item.
y_train (ndarray of shape (n_samples,)) – This is the train set. An array where each entry denotes interactions between the corresponding user and item.
y_test (ndarray of shape (n_samples,)) – This is the test set. An array where each entry denotes interactions between the corresponding user and item.
seed_val (int) – A random seed.
configs (List[Dict[String, Object]]) – A list of dictionaries of keyword arguments to be applied in the model’s constructor.
plot (bool) – Whether to make training plots.
nb (bool) – Whether or not the model is running in a Jupyter notebook.

Returns

The configs, trained models, history dataframes, training plots, and groupwise evaluations.

Return type

Dict[String, Dict]
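As a rough illustration of how configs is described above, each dictionary can be unpacked as keyword arguments into the model constructor. This is a minimal sketch: DummyModel and build_models are hypothetical names introduced here and are not part of recpkg, and the real function also trains and evaluates each model.

```python
class DummyModel:
    """Hypothetical stand-in for a recommender constructor (not part of recpkg)."""

    def __init__(self, n_factors=8, epochs=10):
        self.n_factors = n_factors
        self.epochs = epochs


def build_models(model_constructor, configs):
    """Instantiate one model per config by unpacking each dict as keyword arguments."""
    return [model_constructor(**config) for config in configs]


# Each dict overrides only the constructor defaults it names.
configs = [{"n_factors": 8}, {"n_factors": 16, "epochs": 5}]
models = build_models(DummyModel, configs)
```

Keys missing from a config fall back to the constructor defaults, so each config only needs to list the settings that vary.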

recpkg.evaluation.plot_metric_history(history_df, title='')¶

Plot each metric versus epochs.

Parameters

history_df (pandas.DataFrame) – A tidy dataframe with epoch, metric, and value columns.
title (String) – Text which will be prepended to the title of each graph.

Returns

The metric plots.

Return type

List[FacetGrid, ..]
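The tidy layout expected for history_df has one row per (epoch, metric) pair. The sketch below shows one way to reshape a per-metric history dictionary into such rows; to_tidy_history is a hypothetical helper, and numbering epochs from 1 is an assumption.

```python
def to_tidy_history(history):
    """Convert {metric: [value per epoch]} into tidy (epoch, metric, value) rows."""
    rows = []
    for metric, values in history.items():
        # Assumption: epochs are numbered starting at 1.
        for epoch, value in enumerate(values, start=1):
            rows.append({"epoch": epoch, "metric": metric, "value": value})
    return rows


wide_history = {"loss": [0.9, 0.7], "rmse": [1.2, 1.0]}
tidy_rows = to_tidy_history(wide_history)
```

A list of dictionaries like tidy_rows can be passed to pandas.DataFrame to obtain a dataframe with epoch, metric, and value columns.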

recpkg.explicit module¶

class recpkg.explicit.FunkSVD(users, items, latent_factors=100, epochs=10, learning_rate=0.005, regularization_term=0.02, verbose=False, nb=False)¶

Bases: recpkg.recommenders.Recommender

Recommender implementing Funk SVD.

Funk SVD without global baselines.

Parameters

users (ndarray) – An array of the users.
items (ndarray) – An array of the items.
latent_factors (int) – The number of latent factors.
epochs (int) – The number of epochs to train the model.
learning_rate (float) – The learning rate of the model.
regularization_term (float) – The regularization term of the model.
verbose (bool) – Whether or not to print verbose output.
nb (bool) – Whether or not the model is running in a Jupyter notebook.

create_latent_factor_matrices()¶

Create matrices for the latent factors of the users and items.

Creates the matrices which represent the factorization of the user-item matrix. In the user latent factor matrix, the rows are the users and the columns are the latent factors. In the item latent factor matrix, the rows are latent factors and the columns are the items.

Returns: The latent factor matrices for users and items respectively.
Return type: Tuple[ndarray, ndarray]
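The matrix layout described above can be sketched with plain Python lists. This is not the library's implementation: the function names and the small Gaussian initialization are assumptions. A predicted rating is the dot product of a user's row with an item's column.

```python
import random


def make_latent_factor_matrices(n_users, n_items, n_factors, seed=0):
    """Return (user matrix, item matrix) filled with small random values."""
    rng = random.Random(seed)
    # User matrix: one row per user, one column per latent factor.
    user_factors = [
        [rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)
    ]
    # Item matrix: one row per latent factor, one column per item.
    item_factors = [
        [rng.gauss(0, 0.1) for _ in range(n_items)] for _ in range(n_factors)
    ]
    return user_factors, item_factors


def predict_from_factors(user_factors, item_factors, user_i, item_i):
    """Dot product of the user's row with the item's column."""
    return sum(
        user_factors[user_i][f] * item_factors[f][item_i]
        for f in range(len(item_factors))
    )


P, Q = make_latent_factor_matrices(n_users=3, n_items=4, n_factors=2)
rating = predict_from_factors(P, Q, user_i=1, item_i=2)
```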

fit(X=None, y=None)¶

Fit the recommender from the training dataset.

Parameters

X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.

predict(X=None)¶

Predict the scores for the provided data.

Parameters: X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
Returns: The predicted scores for each data sample.
Return type: ndarray of shape (n_samples,)

predict_rating(user_i, item_i)¶

Predict the rating for an item by the given user.

Parameters

user_i (int) – The user index.
item_i (int) – The item index.

Returns

The predicted rating of the item by the user.

Return type

float

process_users_items()¶

Create dictionaries mapping user and item ids to indexes.

Replicates the functionality provided by Keras’s IntegerLookup.

train_pair(user, item, actual_rating)¶

Train the model on a single user-item pair.

Parameters

user (int) – The user id.
item (int) – The item id.
actual_rating (float) – The rating of the item by the user.

Returns

The difference between the true and predicted ratings.

Return type

float
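A single training step of this kind is usually a stochastic gradient descent update on the user and item factor vectors. The sketch below assumes the standard regularized Funk SVD update rule and takes the vectors directly rather than user and item ids; sgd_step is a hypothetical name, and the defaults mirror the constructor defaults shown above.

```python
def sgd_step(p_u, q_i, actual_rating, learning_rate=0.005, regularization_term=0.02):
    """One SGD step on user vector p_u and item vector q_i, updated in place."""
    predicted = sum(p * q for p, q in zip(p_u, q_i))
    error = actual_rating - predicted
    for f in range(len(p_u)):
        # Keep the old values so both updates use the pre-step factors.
        p_old, q_old = p_u[f], q_i[f]
        p_u[f] += learning_rate * (error * q_old - regularization_term * p_old)
        q_i[f] += learning_rate * (error * p_old - regularization_term * q_old)
    # Difference between the true and predicted ratings, before the update.
    return error


p_u = [0.1, 0.2]
q_i = [0.3, 0.4]
first_error = sgd_step(p_u, q_i, actual_rating=4.0)
second_error = sgd_step(p_u, q_i, actual_rating=4.0)
```

Repeating the step on the same pair should shrink the error, which is a quick sanity check on the sign of the update.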

class recpkg.explicit.MatrixFactorization(n_factors=100, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.gradient_descent.SGD object>, loss=<tensorflow.python.keras.losses.MeanSquaredError object>, metrics=[<tensorflow.python.keras.metrics.RootMeanSquaredError object>, <tensorflow.python.keras.metrics.MeanAbsoluteError object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶

Bases: recpkg.recommenders.KerasRecommender

Recommender implementing Funk SVD with a NN.

Parameters

n_factors (int) – The number of latent factors.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.

static create_core_layers(n_factors, user_layers, item_layers)¶

Creates the core layers of the MF model.

Returns the hidden layers of the model, that is, the ones between the inputs and the visible output layer.

Parameters

n_factors (int) – The number of latent factors.
user_layers (keras.layers.Layer) – The input or preprocessing layers for the users.
item_layers (keras.layers.Layer) – The input or preprocessing layers for the items.

Returns

The core layers of the model.

Return type

keras.layers.Layer

create_model()¶: Creates a new MF model.

recpkg.implicit module¶

class recpkg.implicit.GeneralizedMatrixFactorization(n_factors=8, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.adam.Adam object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶

Bases: recpkg.recommenders.KerasRecommender

Recommender implementing the GMF architecture.

Parameters

n_factors (int) – The number of latent factors.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.

static create_core_layers(n_factors, user_layers, item_layers, user_dense_kwdargs={}, item_dense_kwdargs={})¶

Creates the core layers of the GMF model.

Returns the hidden layers of the model, that is, the ones between the inputs and the visible output layer.

Parameters

n_factors (int) – The number of latent factors.
user_layers (keras.layers.Layer) – The input or preprocessing layers for the users.
item_layers (keras.layers.Layer) – The input or preprocessing layers for the items.
user_dense_kwdargs (Dict) – The keyword arguments for the user dense layer.
item_dense_kwdargs (Dict) – The keyword arguments for the item dense layer.

Returns

The core layers of the model.

Return type

keras.layers.Layer
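GMF is commonly described as an element-wise product of the user and item latent vectors followed by a weighted output layer. The sketch below illustrates that idea in plain Python; it is an assumption about the architecture rather than this class's code, and gmf_score, its weights, and the sigmoid output are illustrative choices.

```python
import math


def gmf_score(user_vector, item_vector, output_weights, bias=0.0):
    """Element-wise product of the latent vectors, then a sigmoid output unit."""
    interaction = [u * v for u, v in zip(user_vector, item_vector)]
    logit = sum(w * x for w, x in zip(output_weights, interaction)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


score = gmf_score([0.5, -1.0], [2.0, 0.5], output_weights=[1.0, 1.0])
```

With all output weights fixed to 1 and no activation, the same computation reduces to the dot product used by plain matrix factorization.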

create_model()¶: Creates a new GMF model.

get_core_layers_kwdargs()¶

Returns the appropriate kwdargs for pretraining core layers.

Returns: The keyword arguments for the user and item dense layers.
Return type: Tuple[Dict, Dict]

get_output_weights()¶

Returns the kernel and bias for the output layer of this model.

Returns: The kernel and bias.
Return type: List[ndarray, Optional[ndarray]]

class recpkg.implicit.ItemPopularity¶

Bases: sklearn.base.BaseEstimator

Recommender based solely on interactions per item.

fit(X=None, y=None)¶

Fit the recommender from the training dataset.

Parameters

X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.

predict(X=None)¶

Predict the scores for the provided data.

Parameters: X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
Returns: The predicted scores for each data sample.
Return type: ndarray of shape (n_samples,)
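A popularity baseline of this kind can be sketched by counting interactions per item and scoring every (user, item) pair by the item's count. The helper names below are hypothetical, and whether the real class normalizes the counts is not stated here, so raw counts are an assumption.

```python
from collections import Counter


def fit_item_popularity(X, y):
    """Count positive interactions per item; the user column is ignored."""
    counts = Counter()
    for (_user, item), interaction in zip(X, y):
        if interaction:
            counts[item] += 1
    return counts


def predict_item_popularity(counts, X):
    """Score each (user, item) row by the item's interaction count."""
    return [counts.get(item, 0) for _user, item in X]


X_train = [(1, 10), (2, 10), (1, 20), (3, 30)]
y_train = [1, 1, 1, 0]
popularity = fit_item_popularity(X_train, y_train)
scores = predict_item_popularity(popularity, [(9, 10), (9, 30), (9, 99)])
```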

class recpkg.implicit.MultiLayerPerceptron(n_factors=8, n_hidden_layers=4, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.adam.Adam object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶

Bases: recpkg.recommenders.KerasRecommender

Recommender implementing the MLP architecture.

Parameters

n_factors (int) – The number of latent factors.
n_hidden_layers (int) – The number of hidden layers.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.

static create_core_layers(n_factors, n_hidden_layers, user_layers, item_layers, hidden_layers_kwdargs=[])¶

Creates the core layers of the MLP model.

Returns the hidden layers of the model, that is, the ones between the inputs and the visible output layer.

Parameters

n_factors (int) – The number of latent factors.
n_hidden_layers (int) – The number of hidden layers.
user_layers (keras.layers.Layer) – The input or preprocessing layers for the users.
item_layers (keras.layers.Layer) – The input or preprocessing layers for the items.
hidden_layers_kwdargs (List[Dict, ..]) – The keyword arguments for each hidden layer.

Returns

The core layers of the model.

Return type

keras.layers.Layer

create_model()¶: Creates a new MLP model.

get_core_layers_kwdargs()¶

Returns the appropriate kwdargs for pretraining core layers.

Returns: The keyword arguments for the hidden layers.
Return type: Dict[String, Object]

get_output_weights()¶

Returns the kernel and bias for the output layer of this model.

Returns: The kernel and bias.
Return type: List[ndarray, Optional[ndarray]]

class recpkg.implicit.NeuralMatrixFactorization(gmf_n_factors=8, mlp_n_factors=8, mlp_n_hidden_layers=4, gmf_trained=None, mlp_trained=None, alpha=0.5, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.gradient_descent.SGD object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶

Bases: recpkg.recommenders.KerasRecommender

Recommender implementing the NeuMF architecture, an ensemble of GMF/MLP.

Parameters

gmf_n_factors (int) – The number of latent factors for GMF.
mlp_n_factors (int) – The number of latent factors for MLP.
mlp_n_hidden_layers (int) – The number of hidden layers.
gmf_trained (GeneralizedMatrixFactorization) – A trained GMF model of the same number of factors.
mlp_trained (MultiLayerPerceptron) – A trained MLP model of the same number of factors and hidden layers.
alpha (float) – The tradeoff between MLP and GMF.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.

create_model()¶

Creates a new NeuMF model.

Returns: The NeuMF model. It will be pretrained if trained models are provided in the constructor.
Return type: keras.Model
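When pretrained GMF and MLP models are supplied, one common way to initialize the combined output layer is to concatenate their output kernels, weighted by alpha. The sketch below assumes alpha scales the GMF part and (1 - alpha) the MLP part, and that the biases are blended the same way; combine_output_weights is a hypothetical helper, and the class may differ on each of these points.

```python
def combine_output_weights(gmf_kernel, mlp_kernel, gmf_bias, mlp_bias, alpha=0.5):
    """Concatenate alpha-weighted GMF and (1 - alpha)-weighted MLP output weights."""
    kernel = [alpha * w for w in gmf_kernel] + [(1 - alpha) * w for w in mlp_kernel]
    # Assumption: the bias is blended with the same tradeoff as the kernel.
    bias = alpha * gmf_bias + (1 - alpha) * mlp_bias
    return kernel, bias


kernel, bias = combine_output_weights(
    gmf_kernel=[1.0, 2.0], mlp_kernel=[4.0], gmf_bias=0.2, mlp_bias=0.4, alpha=0.25
)
```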

recpkg.metrics module¶

recpkg.metrics.dcg_score(items)¶

Calculate the discounted cumulative gain.

Parameters: items (List[float, ..]) – The list of ranked items.
Returns: The DCG score.
Return type: float

recpkg.metrics.ndcg_score(items)¶

Calculate the normalized discounted cumulative gain.

Parameters: items (List[float, ..]) – The list of ranked items.
Returns: The NDCG score.
Return type: float
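For reference, a common definition of DCG discounts the relevance at rank i by log2(i + 1), and NDCG divides by the DCG of the ideal (descending) ordering. The sketch below uses that definition; the functions in this module may use a different gain or discount, so treat dcg and ndcg here as illustrative.

```python
import math


def dcg(items):
    """Sum of relevance / log2(rank + 1), with ranks starting at 1."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(items, start=1))


def ndcg(items):
    """DCG divided by the DCG of the ideal ordering; 0.0 when nothing is relevant."""
    ideal = dcg(sorted(items, reverse=True))
    return dcg(items) / ideal if ideal > 0 else 0.0


# One relevant item, ranked second out of three.
ranked = [0.0, 1.0, 0.0]
dcg_value = dcg(ranked)
ndcg_value = ndcg(ranked)
```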

recpkg.metrics.perform_groupwise_evaluation(X_test, y_test, y_pred)¶

Calculate HR@10 and NDCG@10 by user.

Parameters

X_test (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y_test (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
y_pred (ndarray of shape (n_samples,)) – An array where each entry denotes the predicted interaction between the corresponding user and item.

Returns

The HR@10 and NDCG@10.

Return type

Dict[str, float]
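A per-user hit rate can be sketched as follows: group the test rows by user, rank each user's items by predicted score, and count the user as a hit if any true interaction lands in the top k. This is a simplified illustration under those assumptions, not the module's code; hit_rate_at_k is a hypothetical name.

```python
from collections import defaultdict


def hit_rate_at_k(X_test, y_test, y_pred, k=10):
    """Fraction of users with at least one true interaction in their top-k items."""
    by_user = defaultdict(list)
    for (user, _item), actual, predicted in zip(X_test, y_test, y_pred):
        by_user[user].append((predicted, actual))
    hits = 0
    for scored in by_user.values():
        # Rank this user's items by predicted score, highest first.
        top_k = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
        hits += any(actual > 0 for _predicted, actual in top_k)
    return hits / len(by_user) if by_user else 0.0


X_test = [(1, 10), (1, 11), (1, 12), (2, 10), (2, 13)]
y_test = [0, 1, 0, 0, 1]
y_pred = [0.9, 0.8, 0.1, 0.3, 0.7]
hr_at_1 = hit_rate_at_k(X_test, y_test, y_pred, k=1)
hr_at_2 = hit_rate_at_k(X_test, y_test, y_pred, k=2)
```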

recpkg.model_selection module¶

recpkg.model_selection.LeaveMembersOut(*lists, groups=None, n_val=1, n_test=1, seed=None)¶

Returns indices to split data into train, val, and test sets.

Returns indices of the train, validation, and test sets based on the given number of validation and test members per group. The function accepts a variable number of lists for consistency with similar scikit-learn functions; the length of the lists determines the number of indices. If a list of groups is specified, the specified number of members of each group is placed in the validation and test sets.

Parameters

*lists (List[List, ..]) – One or more lists from which to leave members out. They should be the same length.
groups (List) – A list by which the indices may be grouped (e.g. user ids). This should be the same length as the provided lists.
n_val (int) – The number of members to be left out for the val set.
n_test (int) – The number of members to be left out for the test set.
seed (int) – A random seed.

Returns

Lists of indices for the train, val, and test sets, respectively.

Return type

Tuple[List, List, List]
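The grouped split can be sketched as below: collect the indices belonging to each group, then hold out n_val and n_test members per group. Choosing the held-out members at random is an assumption, and leave_members_out is a hypothetical simplification that takes only the groups list.

```python
import random
from collections import defaultdict


def leave_members_out(groups, n_val=1, n_test=1, seed=None):
    """Return (train, val, test) index lists with held-out members per group."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for index, group in enumerate(groups):
        members[group].append(index)
    train, val, test = [], [], []
    for indices in members.values():
        # Assumption: held-out members are picked at random within each group.
        rng.shuffle(indices)
        val.extend(indices[:n_val])
        test.extend(indices[n_val:n_val + n_test])
        train.extend(indices[n_val + n_test:])
    return sorted(train), sorted(val), sorted(test)


groups = ["a", "a", "a", "b", "b", "b", "b"]
train_idx, val_idx, test_idx = leave_members_out(groups, seed=0)
```

Every index lands in exactly one of the three lists, and each group contributes one member to the validation set and one to the test set.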

recpkg.model_selection.sample_n_non_interactions(X, y, user_id, n=100, seed=None)¶

Samples non-interactions for a given user.

Returns X (user, item) and y (zeros) NumPy arrays for n (100 by default) items with which the user has not interacted.

Parameters

X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
user_id (int) – The unique identifier of a user.
n (int) – The number of non-interactions to sample.
seed (int) – A random seed.

Returns

The X and y arrays of non-interactions.

Return type

Tuple[ndarray of shape (n_samples, 2), ndarray of shape (n_samples,)]
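Negative sampling of this kind can be sketched with plain lists: find the items the user has not interacted with, sample up to n of them, and label each with zero. The helper below is a hypothetical simplification that treats every row of X as an interaction and caps n at the number of available items; both are assumptions.

```python
import random


def sample_non_interactions(X, user_id, n=100, seed=None):
    """Return (X_neg, y_neg) for up to n items the user has not interacted with."""
    rng = random.Random(seed)
    all_items = {item for _user, item in X}
    seen = {item for user, item in X if user == user_id}
    # Sort so that sampling is reproducible for a given seed.
    candidates = sorted(all_items - seen)
    sampled = rng.sample(candidates, min(n, len(candidates)))
    X_neg = [(user_id, item) for item in sampled]
    y_neg = [0] * len(sampled)
    return X_neg, y_neg


X = [(1, 10), (1, 11), (2, 12), (2, 13), (3, 14)]
X_neg, y_neg = sample_non_interactions(X, user_id=1, n=2, seed=0)
```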

recpkg.preprocessing module¶

recpkg.preprocessing.get_standard_layers(values, name=None)¶

Returns input layer and standard preprocessing layers for given values.

Returns the input and preprocessing layers for the given integer values. The preprocessing consists of IntegerLookup and one-hot encoding via CategoryEncoding.

Parameters

values (ndarray) – The integer values of the desired input.
name (String) – The name of the values.

Returns

The input and preprocessing layers.

Return type

Tuple[Layer, Layer]
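The effect of an integer lookup followed by one-hot encoding can be sketched without Keras. The helpers below are hypothetical; reserving index 0 for values not seen during fitting is an assumption about the layer configuration, and the real layers operate on tensors rather than single values.

```python
def fit_lookup(values):
    """Map each distinct value to an index starting at 1; 0 is kept for unseen values."""
    vocabulary = sorted(set(values))
    return {value: index for index, value in enumerate(vocabulary, start=1)}


def one_hot(value, lookup):
    """One-hot encode a value using the lookup; unseen values activate slot 0."""
    encoded = [0] * (len(lookup) + 1)
    encoded[lookup.get(value, 0)] = 1
    return encoded


lookup = fit_lookup([30, 10, 20, 10])
known = one_hot(20, lookup)
unknown = one_hot(99, lookup)
```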

recpkg.recommenders module¶

class recpkg.recommenders.KerasRecommender(epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.adam.Adam object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶

Bases: recpkg.recommenders.Recommender

Abstract class for recommenders built with Keras models.

Parameters

epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.

create_model()¶: Creates a new Keras model.

fit(X=None, y=None)¶

Fit the recommender from the training dataset.

Parameters

X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.

predict(X=None)¶

Predict the scores for the provided data.

Parameters: X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
Returns: The predicted scores for each data sample.
Return type: ndarray of shape (n_samples,)

class recpkg.recommenders.Recommender¶

Bases: sklearn.base.BaseEstimator

Abstract class for recommenders.
