near_recommender.src.models package

Submodules

near_recommender.src.models.friends_friends module

near_recommender.src.models.friends_friends.get_friends_of_friends(spark_df_path)

Reads a CSV file as a Spark DataFrame and trains an XGBoost model to predict user connections.

Parameters:

spark_df_path (str) – The path to the CSV file containing the input data for the Spark DataFrame.

Returns:

A dictionary containing the predicted users as a NumPy array.

Return type:

Dict

near_recommender.src.models.similar_posts module

near_recommender.src.models.similar_posts.get_similar_post_users(query, top_k=5)

Returns the top-k sentences in a corpus that are most similar to a given query sentence.

Parameters:
  • query (str) – The query sentence to find similar sentences for.

  • top_k (int, optional) – The number of top similar sentences to return. Defaults to 5.

Returns:

A dictionary containing the top-k most similar sentences to the query.

Return type:

dict
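
The top-k retrieval described above can be sketched with plain cosine similarity. This is a minimal illustration and not the package's implementation: the helper names `cosine` and `top_k_similar`, the toy three-dimensional vectors, and the layout of the returned dictionary are all assumptions. The real function presumably scores SentenceTransformer embeddings instead.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query_vec, corpus, top_k=5):
    """Rank corpus sentences by cosine similarity to query_vec and keep the best top_k.

    `corpus` maps each sentence to its embedding vector (hypothetical layout).
    """
    scored = ((cosine(query_vec, vec), sentence) for sentence, vec in corpus.items())
    best = heapq.nlargest(top_k, scored)  # highest score first
    return {
        "similar_posts": [
            {"sentence": sentence, "score": round(score, 4)} for score, sentence in best
        ]
    }


# Toy embeddings standing in for real sentence embeddings (assumed data).
toy_corpus = {
    "staking rewards explained": [0.9, 0.1, 0.0],
    "how to mint an NFT": [0.1, 0.9, 0.2],
    "validator staking guide": [0.8, 0.2, 0.1],
}
result = top_k_similar([1.0, 0.0, 0.0], toy_corpus, top_k=2)
```

With these toy vectors, the two staking sentences rank ahead of the NFT sentence because their vectors point in nearly the same direction as the query.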

near_recommender.src.models.similar_posts.load_corpus_embeddings(filename)

Loads the corpus embeddings from a given filename using a SentenceTransformer model.

Parameters:

filename (str) – The filename of the pretrained model to load the corpus embeddings from.

Returns:

A tuple containing the loaded corpus embeddings, the list of sentences, the DataFrame, and the SentenceTransformer model.

Return type:

Tuple[object, list[str], object, SentenceTransformer]

near_recommender.src.models.similar_posts.update_corpus()

Updates the sentence-transformer NLP model with new data. The updated model is saved to the location specified in the path variable.

Returns:

None

Return type:

None

near_recommender.src.models.similar_tags module

near_recommender.src.models.similar_tags.get_similar_tags_users(user, top_k=5)

Returns the top-k users whose tags are most similar to those of the specified user.

Parameters:
  • user (str) – The name of the user for whom similar users are to be found.

  • top_k (int, optional) – The number of similar users to be returned. Defaults to 5.

Returns:

A dictionary containing the top-k similar users and their similarity scores.

Return type:

Dict[str, List[Dict[str, str]]]

Raises:
  • ValueError – If the input DataFrame is empty or contains NaN values.

  • TypeError – If the input top_k value is not an integer.
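
The behaviour documented above (rank other users by tag similarity, validate the inputs, return a dictionary of user/score pairs) can be sketched as follows. Everything here is an assumption made for illustration: the function name `similar_tag_users`, the use of Jaccard similarity, the `user_tags` mapping in place of the DataFrame the real function reads, and the `"similar_tags"` key. Scores are formatted as strings only to mirror the documented return type.

```python
def similar_tag_users(user, user_tags, top_k=5):
    """Return up to top_k users whose tag sets best overlap with `user`'s tags.

    `user_tags` maps a user name to a collection of tags (hypothetical layout).
    """
    # bool is a subclass of int, so reject it explicitly.
    if not isinstance(top_k, int) or isinstance(top_k, bool):
        raise TypeError("top_k must be an integer")
    if not user_tags or any(tags is None for tags in user_tags.values()):
        raise ValueError("tag data is empty or contains missing values")

    target = set(user_tags[user])
    scores = []
    for other, tags in user_tags.items():
        if other == user:
            continue  # never recommend the user to themselves
        tags = set(tags)
        union = target | tags
        # Jaccard similarity: shared tags divided by all distinct tags.
        score = len(target & tags) / len(union) if union else 0.0
        scores.append((score, other))

    # Highest score first; ties broken alphabetically for a stable result.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return {
        "similar_tags": [
            {"user": name, "score": f"{score:.2f}"} for score, name in scores[:top_k]
        ]
    }


# Toy data (assumed); real tags would come from the project's data source.
toy_tags = {
    "alice.near": {"defi", "nft", "dao"},
    "bob.near": {"defi", "nft"},
    "carol.near": {"gaming"},
    "dave.near": {"dao", "defi", "nft", "gaming"},
}
result = similar_tag_users("alice.near", toy_tags, top_k=2)
```

In this toy data, `dave.near` shares three of four distinct tags with `alice.near` and `bob.near` shares two of three, so they are returned in that order.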

near_recommender.src.models.trending_users module

Retrieves trending users based on specified metrics and community detection algorithms.

Returns:

A JSON object containing the usernames and community IDs of the top 20 trending users.

Return type:

Dict

Module contents