recpkg package
Submodules
recpkg.evaluation module
-
recpkg.evaluation.evaluate_model(ModelConstructor, model_name, X_train, X_test, y_train, y_test, seed_val, configs, plot=False, nb=False)¶ Evaluate multiple model configs.
- Parameters
ModelConstructor (KerasRecommender) – The constructor of the model which is being evaluated.
model_name (String) – The name of the model which is being evaluated.
X_train (ndarray of shape (n_samples, 2)) – This is the train set. An array where each row consists of a user and an item.
X_test (ndarray of shape (n_samples, 2)) – This is the test set. An array where each row consists of a user and an item.
y_train (ndarray of shape (n_samples,)) – This is the train set. An array where each entry denotes interactions between the corresponding user and item.
y_test (ndarray of shape (n_samples,)) – This is the test set. An array where each entry denotes interactions between the corresponding user and item.
seed_val (int) – A random seed.
configs (List[Dict[String, Object]]) – A list of dictionaries of keyword arguments to be applied in the model’s constructor.
plot (bool) – Should training plots be made?
nb (bool) – Whether or not model is running in a Jupyter notebook.
- Returns
The configs, trained models, history dataframes, training plots, and groupwise evaluations.
- Return type
Dict[String, Dict]
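As a rough illustration of the configs parameter described above, each dictionary can be read as a set of keyword arguments unpacked into the constructor. The sketch below is not recpkg's implementation: DummyRecommender, run_configs, and the result keys are hypothetical names used only to show the pattern.

```python
class DummyRecommender:
    """Hypothetical stand-in for a KerasRecommender subclass."""

    def __init__(self, n_factors=8, epochs=10):
        self.n_factors = n_factors
        self.epochs = epochs


def run_configs(model_constructor, model_name, configs):
    """Build one model per config by unpacking its keyword arguments."""
    results = {}
    for i, config in enumerate(configs):
        model = model_constructor(**config)
        # The key format below is an assumption made for this sketch.
        results[f"{model_name}_{i}"] = {"config": config, "model": model}
    return results


configs = [{"n_factors": 8}, {"n_factors": 16, "epochs": 5}]
results = run_configs(DummyRecommender, "dummy", configs)
```

Arguments omitted from a config fall back to the constructor's defaults, so each dictionary only needs to list the settings that differ.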
-
recpkg.evaluation.plot_metric_history(history_df, title='')¶ Plot each metric versus epochs.
- Parameters
history_df (pandas.DataFrame) – A tidy dataframe with epoch, metric, and value columns.
title (String) – Text which will be prepended to the title of each graph.
- Returns
The metric plots.
- Return type
List[FacetGrid, ..]
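The tidy layout expected for history_df has one row per (epoch, metric) pair. A minimal sketch of that reshaping, assuming the history starts out as a dictionary mapping each metric name to a list of per-epoch values (the shape of Keras's History.history); to_tidy_rows is a hypothetical helper and numbering epochs from 1 is an assumption.

```python
def to_tidy_rows(history):
    """Flatten {metric: [value per epoch]} into epoch/metric/value rows."""
    rows = []
    for metric, values in history.items():
        for epoch, value in enumerate(values, start=1):
            rows.append({"epoch": epoch, "metric": metric, "value": value})
    return rows


history = {"loss": [0.9, 0.7], "binary_accuracy": [0.6, 0.8]}
rows = to_tidy_rows(history)
```

A list of such rows can be passed to pandas.DataFrame to obtain a dataframe with epoch, metric, and value columns.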
recpkg.explicit module
-
class
recpkg.explicit.FunkSVD(users, items, latent_factors=100, epochs=10, learning_rate=0.005, regularization_term=0.02, verbose=False, nb=False)¶ Bases:
recpkg.recommenders.Recommender
Recommender implementing Funk SVD.
Funk SVD without global baselines.
- Parameters
users (ndarray) – An array of the users.
items (ndarray) – An array of the items.
latent_factors (int) – The number of latent factors.
epochs (int) – The number of epochs to train the model.
learning_rate (float) – The learning rate of the model.
regularization_term (float) – The regularization term of the model.
verbose (bool) – Whether or not to print verbose output.
nb (bool) – Whether or not model is running in a Jupyter notebook.
-
create_latent_factor_matrices()¶ Create matrices for the latent factors of the users and items.
Creates the matrices which represent the factorization of the user-item matrix. In the user latent factor matrix, the rows are the users and the columns are the latent factors. In the item latent factor matrix, the rows are latent factors and the columns are the items.
- Returns
The latent factor matrices for users and items respectively.
- Return type
Tuple[ndarray, ndarray]
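The shapes described above can be sketched with plain nested lists. This is an illustration under assumptions, not the package's code: the function names carry a _sketch suffix to mark them as hypothetical, and the small Gaussian initialisation is a guess.

```python
import random


def create_latent_factor_matrices_sketch(n_users, n_items, latent_factors, seed=None):
    """Return (user_factors, item_factors) as nested lists.

    user_factors has shape (n_users, latent_factors);
    item_factors has shape (latent_factors, n_items).
    """
    rng = random.Random(seed)
    # Small random starting values; the initialisation scheme is an assumption.
    user_factors = [[rng.gauss(0.0, 0.1) for _ in range(latent_factors)]
                    for _ in range(n_users)]
    item_factors = [[rng.gauss(0.0, 0.1) for _ in range(n_items)]
                    for _ in range(latent_factors)]
    return user_factors, item_factors


def predict_rating_sketch(user_factors, item_factors, user_i, item_i):
    """Dot product of one user row with one item column."""
    return sum(user_factors[user_i][f] * item_factors[f][item_i]
               for f in range(len(item_factors)))


P, Q = create_latent_factor_matrices_sketch(n_users=3, n_items=4, latent_factors=2, seed=0)
```

With this layout, the product of the two matrices has one row per user and one column per item, which matches the shape of the user-item matrix being factorized.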
-
fit(X=None, y=None)¶ Fit the recommender from the training dataset.
- Parameters
X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
-
predict(X=None)¶ Predict the scores for the provided data.
- Parameters
X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
- Returns
The predicted score for each data sample.
- Return type
ndarray of shape (n_samples,)
-
predict_rating(user_i, item_i)¶ Predict the rating for an item by the given user.
- Parameters
user_i (int) – The user index.
item_i (int) – The item index.
- Returns
The predicted rating of the item by the user.
- Return type
float
-
process_users_items()¶ Create dictionaries mapping user and item ids to indexes.
Replicates the functionality provided by Keras’s IntegerLookup.
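A minimal sketch of an id-to-index mapping, assuming ids are indexed in first-seen order. build_index is a hypothetical helper; it does not reproduce IntegerLookup's out-of-vocabulary handling or vocabulary ordering.

```python
def build_index(ids):
    """Map each distinct id to a contiguous index, in first-seen order."""
    index = {}
    for raw_id in ids:
        if raw_id not in index:
            index[raw_id] = len(index)
    return index


user_index = build_index([42, 7, 42, 19])
```

The resulting indexes can be used directly as row or column positions in the latent factor matrices.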
-
train_pair(user, item, actual_rating)¶ Train the model on a single user-item pair.
- Parameters
user (int) – The user id.
item (int) – The item id.
actual_rating (float) – The rating of the item by the user.
- Returns
The difference between the true and predicted ratings.
- Return type
float
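A single training step can be sketched as the standard regularized stochastic gradient update for Funk SVD. This is an assumption about what train_pair does, not a copy of it: train_pair_sketch is a hypothetical name, it takes matrix indexes rather than ids, and it reuses the constructor's default learning_rate and regularization_term.

```python
def train_pair_sketch(user_factors, item_factors, user_i, item_i, actual_rating,
                      learning_rate=0.005, regularization_term=0.02):
    """Update one user row and one item column in place; return the error."""
    k = len(item_factors)
    predicted = sum(user_factors[user_i][f] * item_factors[f][item_i]
                    for f in range(k))
    error = actual_rating - predicted
    for f in range(k):
        # Read both values first so each update uses the pre-update factors.
        p = user_factors[user_i][f]
        q = item_factors[f][item_i]
        user_factors[user_i][f] += learning_rate * (error * q - regularization_term * p)
        item_factors[f][item_i] += learning_rate * (error * p - regularization_term * q)
    return error


P = [[0.1, 0.1]]
Q = [[0.1], [0.1]]
first_error = train_pair_sketch(P, Q, 0, 0, actual_rating=1.0)
second_error = train_pair_sketch(P, Q, 0, 0, actual_rating=1.0)
```

Repeating the step on the same pair shrinks the error slightly each time, which is the behaviour one epoch relies on when it loops over all observed ratings.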
-
class
recpkg.explicit.MatrixFactorization(n_factors=100, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.gradient_descent.SGD object>, loss=<tensorflow.python.keras.losses.MeanSquaredError object>, metrics=[<tensorflow.python.keras.metrics.RootMeanSquaredError object>, <tensorflow.python.keras.metrics.MeanAbsoluteError object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶ Bases:
recpkg.recommenders.KerasRecommender
Recommender implementing Funk SVD with a NN.
- Parameters
n_factors (int) – The number of latent factors.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.
-
static
create_core_layers(n_factors, user_layers, item_layers)¶ Creates the core layers of the MF model.
Returns the hidden layers of the model, specifically those between the inputs and the visible output layer.
- Parameters
n_factors (int) – The number of latent factors.
user_layers (keras.layers.Layer) – The input or preprocessing layers for the users.
item_layers (keras.layers.Layer) – The input or preprocessing layers for the items.
- Returns
The core layers of the model.
- Return type
keras.layers.Layer
-
create_model()¶ Creates a new MF model.
recpkg.implicit module
-
class
recpkg.implicit.GeneralizedMatrixFactorization(n_factors=8, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.adam.Adam object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶ Bases:
recpkg.recommenders.KerasRecommender
Recommender implementing the GMF architecture.
- Parameters
n_factors (int) – The number of latent factors.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.
-
static
create_core_layers(n_factors, user_layers, item_layers, user_dense_kwdargs={}, item_dense_kwdargs={})¶ Creates the core layers of the GMF model.
Returns the hidden layers of the model, specifically those between the inputs and the visible output layer.
- Parameters
n_factors (int) – The number of latent factors.
user_layers (keras.layers.Layer) – The input or preprocessing layers for the users.
item_layers (keras.layers.Layer) – The input or preprocessing layers for the items.
user_dense_kwdargs (Dict) – The keyword arguments for the user dense layer.
item_dense_kwdargs (Dict) – The keyword arguments for the item dense layer.
- Returns
The core layers of the model.
- Return type
keras.layers.Layer
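For orientation, the GMF computation from the neural collaborative filtering literature multiplies the user and item latent vectors element by element and feeds the result to a sigmoid output unit. The sketch below shows that arithmetic in plain Python; gmf_score is a hypothetical function, and whether recpkg's layers match it exactly is an assumption.

```python
import math


def gmf_score(user_vec, item_vec, output_kernel, output_bias=0.0):
    """Sigmoid of a weighted sum over the element-wise product of two vectors."""
    product = [u * v for u, v in zip(user_vec, item_vec)]
    logit = sum(w * x for w, x in zip(output_kernel, product)) + output_bias
    return 1.0 / (1.0 + math.exp(-logit))


neutral = gmf_score([1.0, 2.0], [3.0, 4.0], output_kernel=[0.0, 0.0])
confident = gmf_score([1.0, 2.0], [3.0, 4.0], output_kernel=[1.0, 1.0])
```

With an output kernel of all ones and no bias, the logit reduces to the plain dot product used by matrix factorization.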
-
create_model()¶ Creates a new GMF model.
-
get_core_layers_kwdargs()¶ Returns the appropriate kwdargs for pretraining core layers.
- Returns
The keyword arguments for the user and item dense layers.
- Return type
Tuple[Dict, Dict]
-
get_output_weights()¶ Returns the kernel and bias for the output layer of this model.
- Returns
The kernel and bias.
- Return type
List[ndarray, Optional[ndarray]]
-
class
recpkg.implicit.ItemPopularity¶ Bases:
sklearn.base.BaseEstimator
Recommender based solely on interactions per item.
-
fit(X=None, y=None)¶ Fit the recommender from the training dataset.
- Parameters
X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
-
predict(X=None)¶ Predict the scores for the provided data.
- Parameters
X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
- Returns
The predicted score for each data sample.
- Return type
ndarray of shape (n_samples,)
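A popularity baseline can be sketched by counting interactions per item and returning that count as the score, regardless of the user. ItemPopularitySketch is a hypothetical class; using the raw interaction count as the score is an assumption.

```python
from collections import Counter


class ItemPopularitySketch:
    """Scores each (user, item) row by how often the item was interacted with."""

    def fit(self, X, y):
        self.counts_ = Counter()
        for (_, item), interaction in zip(X, y):
            self.counts_[item] += interaction
        return self

    def predict(self, X):
        # Items never seen during fit get a score of 0.
        return [self.counts_.get(item, 0) for _, item in X]


model = ItemPopularitySketch().fit([(1, 10), (2, 10), (1, 20)], [1, 1, 1])
scores = model.predict([(3, 10), (3, 20), (3, 30)])
```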
-
class
recpkg.implicit.MultiLayerPerceptron(n_factors=8, n_hidden_layers=4, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.adam.Adam object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶ Bases:
recpkg.recommenders.KerasRecommender
Recommender implementing the MLP architecture.
- Parameters
n_factors (int) – The number of latent factors.
n_hidden_layers (int) – The number of hidden layers.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.
-
static
create_core_layers(n_factors, n_hidden_layers, user_layers, item_layers, hidden_layers_kwdargs=[])¶ Creates the core layers of the MLP model.
Returns the hidden layers of the model, specifically those between the inputs and the visible output layer.
- Parameters
n_factors (int) – The number of latent factors.
n_hidden_layers (int) – The number of hidden layers.
user_layers (keras.layers.Layer) – The input or preprocessing layers for the users.
item_layers (keras.layers.Layer) – The input or preprocessing layers for the items.
hidden_layers_kwdargs (List[Dict, ..]) – The keyword arguments for each hidden layer.
- Returns
The core layers of the model.
- Return type
keras.layers.Layer
-
create_model()¶ Creates a new MLP model.
-
get_core_layers_kwdargs()¶ Returns the appropriate kwdargs for pretraining core layers.
- Returns
The keyword arguments for the hidden layers.
- Return type
Dict[String, Object]
-
get_output_weights()¶ Returns the kernel and bias for the output layer of this model.
- Returns
The kernel and bias.
- Return type
List[ndarray, Optional[ndarray]]
-
class
recpkg.implicit.NeuralMatrixFactorization(gmf_n_factors=8, mlp_n_factors=8, mlp_n_hidden_layers=4, gmf_trained=None, mlp_trained=None, alpha=0.5, epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.gradient_descent.SGD object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶ Bases:
recpkg.recommenders.KerasRecommender
Recommender implementing the NeuMF architecture, an ensemble of GMF/MLP.
- Parameters
gmf_n_factors (int) – The number of latent factors for GMF.
mlp_n_factors (int) – The number of latent factors for MLP.
mlp_n_hidden_layers (int) – The number of hidden layers.
gmf_trained (GeneralizedMatrixFactorization) – A trained GMF model with the same number of factors.
mlp_trained (MultiLayerPerceptron) – A trained MLP model with the same number of factors and hidden layers.
alpha (float) – The tradeoff between MLP and GMF.
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.
-
create_model()¶ Creates a new NeuMF model.
- Returns
The NeuMF model. It will be pretrained if trained models are provided in the constructor.
- Return type
keras.Model
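One common way to use alpha during pretraining is to concatenate the two pretrained output kernels, scaling the GMF part by alpha and the MLP part by 1 - alpha. The sketch below shows only that arithmetic. combine_output_kernels is a hypothetical helper, and both the weighting direction and the omission of the bias are assumptions rather than documented behaviour.

```python
def combine_output_kernels(gmf_kernel, mlp_kernel, alpha=0.5):
    """Concatenate two output kernels, weighted by alpha and 1 - alpha."""
    return ([alpha * w for w in gmf_kernel]
            + [(1 - alpha) * w for w in mlp_kernel])


combined = combine_output_kernels([1.0, 2.0], [4.0], alpha=0.25)
```

At alpha=0.5 both pretrained models contribute equally to the initial output layer.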
recpkg.metrics module
-
recpkg.metrics.dcg_score(items)¶ Calculate the discounted cumulative gain.
- Parameters
items (List[float, ..]) – The list of ranked items.
- Returns
The DCG score.
- Return type
float
-
recpkg.metrics.ndcg_score(items)¶ Calculate the normalized discounted cumulative gain.
- Parameters
items (List[float, ..]) – The list of ranked items.
- Returns
The NDCG score.
- Return type
float
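For reference, a minimal sketch of both scores, assuming items holds relevance values in ranked order and the gain at rank r is rel / log2(r + 1). Other DCG variants exist (for example with an exponential gain), so this formula is an assumption; the _sketch suffix marks the functions as hypothetical.

```python
import math


def dcg_score_sketch(items):
    """Sum of relevance values discounted by log2(rank + 1), ranks from 1."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(items, start=1))


def ndcg_score_sketch(items):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_score_sketch(sorted(items, reverse=True))
    return dcg_score_sketch(items) / ideal if ideal > 0 else 0.0


perfect = ndcg_score_sketch([1, 0, 0])
second_place = ndcg_score_sketch([0, 1, 0])
```

A relevant item in first position scores 1.0, while the same item in second position scores 1 / log2(3).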
-
recpkg.metrics.perform_groupwise_evaluation(X_test, y_test, y_pred)¶ Calculate HR@10 and NDCG@10 by user.
- Parameters
X_test (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y_test (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
y_pred (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
- Returns
The HR@10 and NDCG@10 scores.
- Return type
Dict[str, float]
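A per-user evaluation can be sketched as: group the rows by user, rank each user's rows by predicted score, keep the top k, then average the per-user hit and gain values. The function name, the "hr" and "ndcg" keys, and the assumption of exactly one positive item per user (which makes the ideal DCG equal to 1) are all hypothetical choices made for this sketch.

```python
import math
from collections import defaultdict


def groupwise_hr_ndcg(X_test, y_test, y_pred, k=10):
    """Average hit ratio and NDCG at k over users."""
    by_user = defaultdict(list)
    for (user, _), truth, score in zip(X_test, y_test, y_pred):
        by_user[user].append((score, truth))

    hits, gains = [], []
    for rows in by_user.values():
        top_k = sorted(rows, key=lambda r: r[0], reverse=True)[:k]
        hits.append(1.0 if any(truth > 0 for _, truth in top_k) else 0.0)
        # With one positive per user the ideal DCG is 1, so DCG equals NDCG.
        gains.append(sum(truth / math.log2(rank + 1)
                         for rank, (_, truth) in enumerate(top_k, start=1)))
    n_users = len(by_user)
    return {"hr": sum(hits) / n_users, "ndcg": sum(gains) / n_users}


X_test = [(1, 10), (1, 11), (2, 10)]
y_test = [0, 1, 1]
y_pred = [0.9, 0.8, 0.5]
at_10 = groupwise_hr_ndcg(X_test, y_test, y_pred, k=10)
at_1 = groupwise_hr_ndcg(X_test, y_test, y_pred, k=1)
```

In this toy data, user 1's positive item is ranked second, so it counts as a hit at k=10 but not at k=1.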
recpkg.model_selection module
-
recpkg.model_selection.LeaveMembersOut(*lists, groups=None, n_val=1, n_test=1, seed=None)¶ Returns indices to split data into train, val, and test sets.
Returns indices of the train, validation, and test sets based on the given number of validation and test items per group. The function accepts a variable number of lists for consistency with similar scikit-learn functions. The length of the lists is used to determine the number of indices. If a list of groups is specified, the specified number of members of each group will be placed in the validation and test sets.
- Parameters
*lists (List[List, ..]) – One or more lists from which to leave members out. They should be the same length.
groups (List) – A list by which the indices may be grouped (e.g. user ids). This should be the same length as the provided lists.
n_val (int) – The number of members to be left out for the val set.
n_test (int) – The number of members to be left out for the test set.
seed (int) – A random seed.
- Returns
Lists of indices for the train, val, and test sets, respectively.
- Return type
Tuple[List, List, List]
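The grouped case can be sketched as follows: collect the indices of each group, shuffle them, and peel off n_val and n_test members before assigning the rest to train. leave_members_out_sketch is a hypothetical function that takes the number of samples directly and always requires groups; how the real function behaves when groups is None is not covered here.

```python
import random
from collections import defaultdict


def leave_members_out_sketch(n_samples, groups, n_val=1, n_test=1, seed=None):
    """Return (train, val, test) index lists, holding out members per group."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for index in range(n_samples):
        by_group[groups[index]].append(index)

    train, val, test = [], [], []
    for members in by_group.values():
        rng.shuffle(members)
        val.extend(members[:n_val])
        test.extend(members[n_val:n_val + n_test])
        train.extend(members[n_val + n_test:])
    return sorted(train), sorted(val), sorted(test)


groups = ["a", "a", "a", "b", "b", "b", "b"]
train, val, test = leave_members_out_sketch(len(groups), groups, seed=0)
```

The returned indices can then be used to slice each of the parallel lists (for example X and y) in the same way.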
-
recpkg.model_selection.sample_n_non_interactions(X, y, user_id, n=100, seed=None)¶ Samples non-interactions for a given user.
Returns NumPy arrays X (user, item pairs) and y (zeros) for n items (100 by default) with which the user has not interacted.
- Parameters
X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
user_id (int) – The unique identifier of a user.
n (int) – The number of non-interactions to sample.
seed (int) – A random seed.
- Returns
The X and y arrays of non-interactions.
- Return type
Tuple[ndarray of shape (n_samples, 2), ndarray of shape (n_samples,)]
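Negative sampling of this kind can be sketched with plain Python collections. The sketch assumes every row of X is an observed interaction and that the candidate items are those appearing anywhere in X; sample_non_interactions_sketch is a hypothetical function and it returns lists rather than ndarrays.

```python
import random


def sample_non_interactions_sketch(X, user_id, n=100, seed=None):
    """Sample up to n items the user has no row for; labels are all zero."""
    all_items = {item for _, item in X}
    seen = {item for user, item in X if user == user_id}
    # Sort so that a given seed always draws from the same candidate order.
    candidates = sorted(all_items - seen)
    rng = random.Random(seed)
    sampled = rng.sample(candidates, min(n, len(candidates)))
    X_neg = [(user_id, item) for item in sampled]
    y_neg = [0] * len(X_neg)
    return X_neg, y_neg


X = [(1, 10), (1, 11), (2, 12), (2, 13), (3, 14)]
X_neg, y_neg = sample_non_interactions_sketch(X, user_id=1, n=2, seed=0)
```

If fewer than n unseen items exist, this sketch returns all of them instead of raising an error.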
recpkg.preprocessing module
-
recpkg.preprocessing.get_standard_layers(values, name=None)¶ Returns input layer and standard preprocessing layers for given values.
Returns the input and preprocessing layers for the given integer values. The preprocessing consists of IntegerLookup and one-hot encoding via CategoryEncoding.
- Parameters
values (ndarray) – The integer values of the desired input.
name (String) – The name of the values.
- Returns
The input and preprocessing layers.
- Return type
Tuple[Layer, Layer]
recpkg.recommenders module
-
class
recpkg.recommenders.KerasRecommender(epochs=10, optimizer=<tensorflow.python.keras.optimizer_v2.adam.Adam object>, loss=<tensorflow.python.keras.losses.BinaryCrossentropy object>, metrics=[<tensorflow.python.keras.metrics.BinaryAccuracy object>], seed=None, user_input=None, item_input=None, user_preprocessing_layers=None, item_preprocessing_layers=None)¶ Bases:
recpkg.recommenders.Recommender
Abstract class for recommenders built with Keras models.
- Parameters
epochs (int) – The number of epochs to train the NN.
optimizer (keras.optimizers.Optimizer) – The model’s optimizer.
loss (keras.losses.Loss) – The loss function.
metrics (List[keras.metrics.Metric, ..]) – The metric functions.
seed (int) – A random seed.
user_input (keras.Input) – An input for the users.
item_input (keras.Input) – An input for the items.
user_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the users.
item_preprocessing_layers (keras.layers.Layer) – Preprocessing layers for the items.
-
create_model()¶ Creates a new Keras model.
-
fit(X=None, y=None)¶ Fit the recommender from the training dataset.
- Parameters
X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
y (ndarray of shape (n_samples,)) – An array where each entry denotes interactions between the corresponding user and item.
-
predict(X=None)¶ Predict the scores for the provided data.
- Parameters
X (ndarray of shape (n_samples, 2)) – An array where each row consists of a user and an item.
- Returns
The predicted score for each data sample.
- Return type
ndarray of shape (n_samples,)
-
class
recpkg.recommenders.Recommender¶ Bases:
sklearn.base.BaseEstimator
Abstract class for recommenders.