near_recommender.src.models package

Submodules

near_recommender.src.models.friends_friends.get_friends_of_friends(spark_df_path)

Reads a CSV file as a Spark DataFrame and trains an XGBoost model to predict user connections.

Parameters: spark_df_path (str) – The path to the CSV file containing the input data for the Spark DataFrame.
Returns: A dictionary containing the predicted users as a NumPy array.
Return type: Dict
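
The features fed to the XGBoost model are not documented here. As a rough illustration of the friends-of-friends idea the function name suggests, the sketch below counts mutual friends between a user and each second-degree connection, a signal commonly used in link prediction. It is not the package's implementation: the helper `friends_of_friends`, the edge list, and the account names are all hypothetical, and no Spark or XGBoost is involved.

```python
from collections import Counter, defaultdict


def friends_of_friends(user, edges):
    """Count mutual friends between `user` and each second-degree connection."""
    # Build an undirected adjacency map from the (hypothetical) edge list.
    friends = defaultdict(set)
    for a, b in edges:
        friends[a].add(b)
        friends[b].add(a)

    direct = friends.get(user, set())
    mutual = Counter()
    for friend in direct:
        for candidate in friends[friend]:
            # Skip the user themselves and people they already follow.
            if candidate != user and candidate not in direct:
                mutual[candidate] += 1
    return mutual


# Toy graph; the account names are made up for illustration.
edges = [
    ("alice.near", "bob.near"),
    ("alice.near", "carol.near"),
    ("bob.near", "dave.near"),
    ("carol.near", "dave.near"),
    ("carol.near", "erin.near"),
]
candidates = friends_of_friends("alice.near", edges)
```

Here `candidates.most_common()` ranks second-degree connections by mutual-friend count; a trained classifier could consume such counts as one input feature among others.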

near_recommender.src.models.similar_posts.get_similar_post_users(query, top_k=5)

Returns the top-k sentences in the corpus that are most similar to a given query sentence.

Parameters:

query (str) – The query sentence to find similar sentences for.
top_k (int, optional) – The number of top similar sentences to return. Defaults to 5.

Returns:

A dictionary containing the top-k most similar sentences to the query.

Return type:

dict
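
A top-k similarity lookup of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming cosine similarity over precomputed embedding vectors; it uses toy 2-D vectors in place of real sentence embeddings, and the helper `top_k_similar` and the result keys (`similar_posts`, `sentence`, `score`) are hypothetical rather than the package's actual output format.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, corpus_vecs, sentences, top_k=5):
    """Rank corpus sentences by cosine similarity to the query vector."""
    scored = [
        (cosine(query_vec, vec), sent)
        for vec, sent in zip(corpus_vecs, sentences)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Hypothetical result shape; the real function's keys are not documented.
    return {
        "similar_posts": [
            {"sentence": sent, "score": round(score, 4)}
            for score, sent in scored[:top_k]
        ]
    }


# Toy 2-D "embeddings" standing in for real sentence embeddings.
sentences = ["staking rewards", "nft drop", "validator setup"]
corpus_vecs = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]
result = top_k_similar([1.0, 0.0], corpus_vecs, sentences, top_k=2)
```

With real data, the query vector would come from encoding the query string with the same model that produced the corpus embeddings.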

near_recommender.src.models.similar_posts.load_corpus_embeddings(filename)

Loads the corpus embeddings from a given filename using a SentenceTransformer model.

Parameters: filename (str) – The filename of the pretrained model to load the corpus embeddings from.
Returns: A tuple containing the loaded corpus embeddings, the list of sentences, the DataFrame, and the SentenceTransformer model.
Return type: Tuple[object, list[str], object, SentenceTransformer]

near_recommender.src.models.similar_posts.update_corpus()

Updates the sentence-transformer model with new data. The updated model is saved to the location specified in the path variable.

near_recommender.src.models.similar_tags.get_similar_tags_users(user, top_k=5)

Returns the top-k users whose tags are most similar to those of the specified user.

Parameters:

user (str) – The name of the user for whom similar users are to be found.
top_k (int, optional) – The number of similar users to be returned. Defaults to 5.

Returns:

A dictionary containing the top-k similar users and their similarity score.

Return type:

Dict[str, List[Dict[str, str]]]

Raises:
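
As an illustration of the top-k lookup described for get_similar_tags_users, the sketch below ranks users by Jaccard similarity between tag sets. The similarity measure is an assumption (the package's actual measure is not stated), and the helper `similar_tag_users`, the key names, and the sample data are hypothetical. Scores are formatted as strings only to match the documented Dict[str, List[Dict[str, str]]] return type.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_tag_users(user, user_tags, top_k=5):
    """Rank other users by tag overlap with `user`."""
    target = user_tags.get(user, set())
    scored = [
        (jaccard(target, tags), other)
        for other, tags in user_tags.items()
        if other != user
    ]
    # Sort by score descending, then by name for a stable order on ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {
        "similar_users": [
            {"user": other, "score": f"{score:.2f}"}
            for score, other in scored[:top_k]
        ]
    }


# Made-up users and tags for illustration only.
user_tags = {
    "alice.near": {"defi", "nft", "dao"},
    "bob.near": {"defi", "nft"},
    "carol.near": {"gaming"},
    "dave.near": {"dao", "defi", "gaming"},
}
result = similar_tag_users("alice.near", user_tags, top_k=2)
```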

near_recommender.src.models.trending_users.get_trending_users()

Retrieves trending users based on specified metrics and community detection algorithms.

Returns: A JSON object containing the usernames and community IDs of the top 20 trending users.
Return type: Dict
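
The metrics and community detection algorithms used are not specified here. The sketch below shows only the general shape of such a pipeline under simple stand-in assumptions: degree (number of distinct interaction partners) as the ranking metric, and connected components as a crude substitute for a real community detection algorithm. The helpers, key names, and sample edges are hypothetical.

```python
from collections import defaultdict


def connected_components(edges):
    """Build an adjacency map and label each user with a component id."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    community = {}
    next_id = 0
    for start in sorted(graph):
        if start in community:
            continue
        # Depth-first traversal labels every user reachable from `start`.
        stack = [start]
        while stack:
            node = stack.pop()
            if node in community:
                continue
            community[node] = next_id
            for neighbor in graph[node]:
                if neighbor not in community:
                    stack.append(neighbor)
        next_id += 1
    return graph, community


def trending_users(edges, top_n=20):
    """Return the top_n users by degree, each with a community id."""
    graph, community = connected_components(edges)
    # Rank by degree descending, then by name for a stable order on ties.
    ranked = sorted(graph, key=lambda u: (-len(graph[u]), u))
    return {
        "trending_users": [
            {"user": u, "community_id": community[u]} for u in ranked[:top_n]
        ]
    }


# Made-up interaction edges for illustration only.
edges = [
    ("alice.near", "bob.near"),
    ("alice.near", "carol.near"),
    ("bob.near", "carol.near"),
    ("alice.near", "dave.near"),
    ("erin.near", "frank.near"),
]
result = trending_users(edges, top_n=3)
```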