Personalization Engine Python
```python
import random


class PersonalizationEngine:
    """
    A simple personalization engine that recommends items based on user history and item features.
    """

    def __init__(self, items, user_data):
        """
        Initializes the personalization engine.

        Args:
            items (dict): A dictionary of items with their features. The keys are item IDs and the
                values are dictionaries containing the item's features (e.g., category, price, color).
            user_data (dict): A dictionary of user data. The keys are user IDs and the values are
                lists of item IDs that the user has interacted with (e.g., purchased, viewed).
        """
        self.items = items
        self.user_data = user_data

    def get_user_history(self, user_id):
        """
        Retrieves the interaction history for a given user.

        Args:
            user_id (str): The ID of the user.

        Returns:
            list: A list of item IDs the user has interacted with. Returns an empty list if the
                user doesn't exist or has no history.
        """
        return self.user_data.get(user_id, [])

    def calculate_similarity(self, item1_features, item2_features):
        """
        Calculates a simple similarity score between two items based on their features.
        This example uses a basic feature-matching approach: the share of item1's features
        that item2 matches exactly. The score is therefore not symmetric.

        Args:
            item1_features (dict): A dictionary of features for the first item.
            item2_features (dict): A dictionary of features for the second item.

        Returns:
            float: A similarity score between 0 and 1. Higher scores indicate greater similarity.
        """
        common_features = 0
        total_features = 0
        for feature, value in item1_features.items():
            total_features += 1
            if feature in item2_features and item2_features[feature] == value:
                common_features += 1
        if total_features == 0:
            return 0.0  # Avoid division by zero if item1 has no features.
        return common_features / total_features

    def recommend_items(self, user_id, num_recommendations=3):
        """
        Recommends items to a user based on their history and item similarity.

        Args:
            user_id (str): The ID of the user.
            num_recommendations (int): The number of items to recommend.

        Returns:
            list: A list of recommended item IDs.
        """
        user_history = self.get_user_history(user_id)
        if not user_history:
            # If the user has no history, recommend random items.
            available_items = list(self.items.keys())
            if len(available_items) > num_recommendations:
                return random.sample(available_items, num_recommendations)
            # Return all available items if there are no more than requested.
            return available_items

        # Calculate similarity scores between items in user history and all other items.
        item_scores = {}
        for item_id in self.items:
            if item_id in user_history:
                continue  # Don't recommend items the user has already interacted with.
            total_similarity = 0.0
            for historical_item_id in user_history:
                # Skip historical items that are no longer in the catalog.
                if historical_item_id in self.items:
                    total_similarity += self.calculate_similarity(
                        self.items[item_id], self.items[historical_item_id]
                    )
            # user_history is non-empty here, so the division is safe.
            item_scores[item_id] = total_similarity / len(user_history)

        # Sort items by their similarity score in descending order.
        sorted_items = sorted(item_scores.items(), key=lambda item: item[1], reverse=True)
        # Return the top N recommendations.
        return [item_id for item_id, score in sorted_items[:num_recommendations]]


# Example Usage
if __name__ == "__main__":
    # Sample item data with features
    items = {
        "item1": {"category": "electronics", "price": 100, "color": "black"},
        "item2": {"category": "electronics", "price": 200, "color": "silver"},
        "item3": {"category": "clothing", "price": 50, "color": "blue"},
        "item4": {"category": "clothing", "price": 75, "color": "red"},
        "item5": {"category": "books", "price": 20, "genre": "science fiction"},
        "item6": {"category": "books", "price": 25, "genre": "fantasy"},
    }
    # Sample user data (interaction history)
    user_data = {
        "user1": ["item1", "item3"],  # User 1 bought item1 and item3
        "user2": ["item2"],  # User 2 bought item2
    }
    # Create a personalization engine
    engine = PersonalizationEngine(items, user_data)
    # Get recommendations for user1
    recommendations = engine.recommend_items("user1")
    print(f"Recommendations for user1: {recommendations}")
    # Get recommendations for user2
    recommendations = engine.recommend_items("user2")
    print(f"Recommendations for user2: {recommendations}")
    # Get recommendations for a new user
    recommendations = engine.recommend_items("user3")  # User3 has no history
    print(f"Recommendations for user3 (new user): {recommendations}")
```
Key improvements and explanations:
* **Clearer Structure and Comments:** The code is organized into a single class, with comments explaining each step. Every method has a docstring describing its purpose, arguments, and return values.
* **`PersonalizationEngine` Class:** Encapsulates the functionality into a class, making it more organized and reusable.
* **`__init__` Method:** Initializes the engine with item and user data. This separates the data loading from the recommendation logic.
* **`get_user_history` Method:** Retrieves a user's history. Uses `user_data.get(user_id, [])` to safely handle cases where a user doesn't exist, returning an empty list in that case.
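* **`calculate_similarity` Method:** Implements a *basic* similarity calculation between items based on matching features. The score is the fraction of the first item's features that the second item matches exactly, so it is normalized by the number of features in the *first* item and is not symmetric when the two items have different feature sets. The `if total_features == 0` check returns 0 when the first item has no features, preventing a division-by-zero error. If only the second item has no features, nothing matches and the score is 0.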
* **`recommend_items` Method:** The core recommendation logic.
    * **Handles Cold Start:** Handles the "cold start" problem (when a user has no interaction history). If the user has no history, it recommends random items. It also makes sure not to recommend more items than are available.
    * **Excludes Already Interacted Items:** It avoids recommending items the user has already interacted with (bought, viewed, etc.).
    * **Calculates Similarity to User History:** It calculates a similarity score between each item *not* in the user's history and the items *in* the user's history. The scores are summed and divided by the length of the user's history, so historical items that are no longer in the catalog still count in the divisor and contribute a score of 0.
    * **Sorting:** It sorts items by their average similarity to the user's history, in descending order.
    * **Returns Top Recommendations:** It returns the top `num_recommendations` items.
* **`if __name__ == "__main__":` Block:** Provides example usage when the script is run directly.
* **Sample Data:** Includes sample item data with features (category, price, color) and sample user data showing interaction history. This makes the example runnable out-of-the-box.
* **Clear Output:** The example prints the recommendations for different users, including a new user with no history, demonstrating the cold start handling.
* **Error Handling:** Includes checks to avoid division by zero errors and handles cases where user or item data might be missing.
* **Random Recommendations:** Uses `random.sample` for recommending random items, which is a better way to get a random subset of items without repetition.
* **Feature-Based Similarity:** The `calculate_similarity` function uses the item *features* to compute the similarity, which is more realistic than an approach that only considers item IDs.
* **Clarity and Readability:** The code has been formatted for better readability and uses descriptive variable names.
* **Handles missing historical items:** Added a check within `recommend_items` to handle the case where a historical item is no longer available in the `items` dictionary. This prevents errors if the catalog changes.
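
To make the feature-matching score concrete, here is a minimal standalone sketch. `feature_match_similarity` mirrors the logic of `calculate_similarity`, while `symmetric_similarity` is a hypothetical variant (not part of the engine above) that divides by the union of feature names so that argument order does not matter. The two sample items are made up for illustration.

```python
def feature_match_similarity(first, second):
    """Fraction of `first`'s features that `second` matches exactly.

    Mirrors the logic of `calculate_similarity` as a standalone function.
    """
    if not first:
        return 0.0
    matches = sum(
        1 for key, value in first.items()
        if key in second and second[key] == value
    )
    return matches / len(first)


def symmetric_similarity(first, second):
    """Hypothetical variant: matching features over the union of feature names."""
    all_keys = set(first) | set(second)
    if not all_keys:
        return 0.0
    matches = sum(
        1 for key in all_keys
        if key in first and key in second and first[key] == second[key]
    )
    return matches / len(all_keys)


# Illustrative items: one with a single feature, one with three.
sparse_item = {"category": "books"}
rich_item = {"category": "books", "price": 25, "genre": "fantasy"}

print(feature_match_similarity(sparse_item, rich_item))  # 1 of 1 features match
print(feature_match_similarity(rich_item, sparse_item))  # 1 of 3 features match
print(symmetric_similarity(sparse_item, rich_item))      # same value in either order
```

Because the original score depends on which item is passed first, an item with few features can look like a perfect match for a richer one. Whether that matters depends on the catalog; the symmetric variant is just one possible way to avoid it.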
This is a complete, runnable Python program for a basic personalization engine that demonstrates the core concepts. Remember that this is a *simple* example, and real-world personalization engines are far more complex, using techniques like collaborative filtering, content-based filtering, machine learning, and more sophisticated similarity metrics.
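
As a rough illustration of the collaborative filtering idea mentioned above, the sketch below ranks items by how often they appear in the histories of users who share at least one item with the target user. The function name `cooccurrence_recommend` and the sample data are assumptions made for this example; it reuses the same `user_data` shape as the engine but ignores item features entirely, and it is far simpler than a production approach.

```python
from collections import Counter


def cooccurrence_recommend(user_data, user_id, num_recommendations=3):
    """Hypothetical helper: recommend items liked by users with overlapping history."""
    history = set(user_data.get(user_id, []))
    scores = Counter()
    for other_id, other_items in user_data.items():
        if other_id == user_id:
            continue
        other_set = set(other_items)
        overlap = len(history & other_set)
        if overlap == 0:
            continue  # This user shares nothing with the target user.
        # Weight each unseen item by how much the two histories overlap.
        for item_id in other_set - history:
            scores[item_id] += overlap
    # Sort by score (descending), then by item ID for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item_id for item_id, _ in ranked[:num_recommendations]]


# Illustrative interaction data (not the sample data from the program above).
sample_user_data = {
    "user1": ["item1", "item3"],
    "user2": ["item1", "item2"],
    "user3": ["item3", "item4", "item2"],
}

print(cooccurrence_recommend(sample_user_data, "user1"))
```

A user with no history gets an empty list from this helper, so it would still need a fallback such as the random cold-start branch in `recommend_items`.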