A Movie Recommendation System is an information filtering system that predicts a user's preferences for movies and suggests films the user is likely to enjoy but has not yet seen. Helping users discover such films improves user experience and engagement on streaming platforms, movie databases, and entertainment websites.
There are several main approaches to building recommendation systems:
1. Content-Based Filtering: This method recommends items similar to those the user liked in the past. For movies, this could mean analyzing features like genre, actors, directors, plot keywords, and release year. If a user has watched and liked several sci-fi thrillers starring specific actors, the system would recommend other sci-fi thrillers with similar actors or thematic elements.
2. Collaborative Filtering: This is one of the most widely used approaches. It works on the principle that users who had similar tastes in the past will likely have similar tastes in the future. It has two main sub-types:
* User-Based Collaborative Filtering: Identifies users who have similar movie preferences (e.g., rated the same movies similarly) and recommends movies liked by those 'similar users' but not yet seen by the target user.
* Item-Based Collaborative Filtering: Identifies relationships between movies from rating patterns. If users who liked Movie A also tended to like Movie B, then Movie B is recommended to another user who likes Movie A.
3. Hybrid Recommendation Systems: These systems combine two or more recommendation techniques (e.g., content-based and collaborative filtering) to leverage the strengths of each and overcome their limitations (like the 'cold start' problem for new users or items, or data sparsity).
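The item-based variant described above can be sketched in a few lines of PHP. This is a minimal illustration, not a production design: the `$ratings` data and the helper names `ratingsByMovie`, `cosineSimilarity`, and `similarMovies` are hypothetical, and cosine similarity with missing ratings treated as 0 is only one of several reasonable similarity measures.

```php
<?php
// Hypothetical ratings: user ID => [movie ID => rating (1-5)].
$ratings = [
    'userA' => [1 => 5, 2 => 4, 5 => 5],
    'userB' => [1 => 4, 5 => 5, 7 => 2],
    'userC' => [2 => 5, 7 => 4],
];

/** Invert user => movie ratings into movie => user ratings. */
function ratingsByMovie(array $ratings): array {
    $byMovie = [];
    foreach ($ratings as $userId => $userRatings) {
        foreach ($userRatings as $movieId => $rating) {
            $byMovie[$movieId][$userId] = $rating;
        }
    }
    return $byMovie;
}

/** Cosine similarity between two movies' rating vectors (missing rating = 0). */
function cosineSimilarity(array $a, array $b): float {
    $dot = 0.0;
    foreach ($a as $userId => $rating) {
        $dot += $rating * ($b[$userId] ?? 0);
    }
    $normA = sqrt(array_sum(array_map(fn($r) => $r * $r, $a)));
    $normB = sqrt(array_sum(array_map(fn($r) => $r * $r, $b)));
    return ($normA > 0 && $normB > 0) ? $dot / ($normA * $normB) : 0.0;
}

/** Movies most similar to $movieId as [movie ID => similarity], best first. */
function similarMovies(array $ratings, int $movieId, int $limit = 3): array {
    $byMovie = ratingsByMovie($ratings);
    if (!isset($byMovie[$movieId])) {
        return []; // Nobody has rated this movie yet
    }
    $scores = [];
    foreach ($byMovie as $otherId => $otherRatings) {
        if ($otherId !== $movieId) {
            $scores[$otherId] = cosineSimilarity($byMovie[$movieId], $otherRatings);
        }
    }
    arsort($scores);
    return array_slice($scores, 0, $limit, true); // Keep movie IDs as keys
}

print_r(similarMovies($ratings, 1));
```

With this sample data, movie 5 ranks as the closest match to movie 1 because the same two users rated both highly. Because the similarities depend only on the ratings, they can be precomputed per movie rather than per user, which is one reason item-based filtering is often chosen for large catalogues.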
Key Components of a Recommendation System:
* Data Collection: Gathering user data (ratings, watch history, explicit preferences) and item data (movie attributes, metadata).
* Data Storage: Databases to store user and movie information efficiently.
* Algorithm: The core logic that processes the data to generate recommendations.
* Presentation Layer: How recommendations are displayed to the user.
Challenges:
* Cold Start Problem: How to make recommendations for new users with no past data, or for new movies with no ratings.
* Sparsity: When most users have rated only a small fraction of all available movies.
* Scalability: Handling millions of users and movies efficiently.
* Serendipity: Recommending surprising but relevant items, not just obvious ones.
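Two of these challenges are easy to make concrete. The sketch below measures how sparse a rating matrix is and provides a simple cold-start fallback that ranks movies by how many users liked them. The function names `ratingSparsity` and `popularMovies`, the sample ratings, and the assumed catalogue size of 10 are illustrative choices, not part of any particular system.

```php
<?php
// Hypothetical data: user ID => [movie ID => rating (1-5)].
$ratings = [
    'userA' => [1 => 5, 2 => 4, 5 => 5],
    'userB' => [3 => 5, 4 => 4, 6 => 5],
    'userC' => [1 => 4, 7 => 5, 9 => 4],
];
$movieCount = 10; // Assumed catalogue size

/** Fraction of the user-by-movie rating matrix that is empty (0.0 to 1.0). */
function ratingSparsity(array $ratings, int $movieCount): float {
    $cells = count($ratings) * $movieCount;
    if ($cells === 0) {
        return 1.0; // No users or no movies: treat the matrix as fully empty
    }
    $filled = array_sum(array_map('count', $ratings));
    return 1.0 - $filled / $cells;
}

/** Cold-start fallback: movie IDs ranked by the number of ratings >= $likedThreshold. */
function popularMovies(array $ratings, int $limit = 3, int $likedThreshold = 4): array {
    $likes = [];
    foreach ($ratings as $userRatings) {
        foreach ($userRatings as $movieId => $rating) {
            if ($rating >= $likedThreshold) {
                $likes[$movieId] = ($likes[$movieId] ?? 0) + 1;
            }
        }
    }
    arsort($likes); // Most-liked movies first
    return array_slice(array_keys($likes), 0, $limit);
}

// 9 of the 30 cells are filled, so the sparsity is 0.70.
printf("Sparsity: %.2f\n", ratingSparsity($ratings, $movieCount));
print_r(popularMovies($ratings, 1));
```

A recommender could call `popularMovies` whenever the target user has no ratings, then switch to a personalised method once enough ratings exist. New movies with no ratings need a different fallback, such as matching on genre metadata.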
Example Code
<?php
class MovieRecommendationSystem {
    private $movies;
    private $user_ratings;

    public function __construct() {
        // Simulate a database of movies with their attributes (genres, year)
        $this->movies = [
            1 => ['title' => 'Inception', 'genres' => ['Sci-Fi', 'Action', 'Thriller'], 'year' => 2010],
            2 => ['title' => 'The Dark Knight', 'genres' => ['Action', 'Crime', 'Drama'], 'year' => 2008],
            3 => ['title' => 'Pulp Fiction', 'genres' => ['Crime', 'Drama'], 'year' => 1994],
            4 => ['title' => 'Forrest Gump', 'genres' => ['Drama', 'Romance'], 'year' => 1994],
            5 => ['title' => 'Interstellar', 'genres' => ['Sci-Fi', 'Drama', 'Adventure'], 'year' => 2014],
            6 => ['title' => 'Django Unchained', 'genres' => ['Western', 'Drama'], 'year' => 2012],
            7 => ['title' => 'The Matrix', 'genres' => ['Sci-Fi', 'Action'], 'year' => 1999],
            8 => ['title' => 'La La Land', 'genres' => ['Musical', 'Drama', 'Romance'], 'year' => 2016],
            9 => ['title' => 'Arrival', 'genres' => ['Sci-Fi', 'Drama', 'Mystery'], 'year' => 2016],
            10 => ['title' => 'Parasite', 'genres' => ['Comedy', 'Drama', 'Thriller'], 'year' => 2019]
        ];

        // Simulate user ratings (User ID => [Movie ID => Rating (1-5)])
        $this->user_ratings = [
            'userA' => [
                1 => 5, // Inception
                2 => 4, // The Dark Knight
                5 => 5  // Interstellar
            ],
            'userB' => [
                3 => 5, // Pulp Fiction
                4 => 4, // Forrest Gump
                6 => 5  // Django Unchained
            ],
            'userC' => [
                1 => 4, // Inception
                7 => 5, // The Matrix
                9 => 4  // Arrival
            ]
        ];
    }

    /**
     * Get a movie by its ID.
     * @param int $movieId
     * @return array|null The movie's attributes, or null if the ID is unknown.
     */
    public function getMovieById(int $movieId): ?array {
        return $this->movies[$movieId] ?? null;
    }

    /**
     * A very basic content-based recommendation system.
     * Recommends unwatched movies that share genres with movies the user rated highly.
     * @param string $userId The ID of the user.
     * @param int $limit The maximum number of recommendations to return.
     * @return array An array of recommended movie titles.
     */
    public function getContentBasedRecommendations(string $userId, int $limit = 5): array {
        if ($limit <= 0 || !isset($this->user_ratings[$userId])) {
            return []; // Nothing requested, or no ratings for this user
        }

        $userPreferredGenres = [];
        $watchedMovieIds = array_keys($this->user_ratings[$userId]);

        // 1. Identify preferred genres from highly rated movies
        foreach ($this->user_ratings[$userId] as $movieId => $rating) {
            // Consider movies with a rating of 4 or higher as 'liked'
            if ($rating >= 4 && isset($this->movies[$movieId])) {
                foreach ($this->movies[$movieId]['genres'] as $genre) {
                    $userPreferredGenres[$genre] = ($userPreferredGenres[$genre] ?? 0) + 1;
                }
            }
        }

        // Sort preferred genres by frequency (most frequent first)
        arsort($userPreferredGenres);

        $recommendations = [];
        $recommendedCount = 0;

        // 2. Find movies that match preferred genres and haven't been watched
        foreach ($userPreferredGenres as $preferredGenre => $count) {
            foreach ($this->movies as $movieId => $movie) {
                if (!in_array($movieId, $watchedMovieIds, true) && in_array($preferredGenre, $movie['genres'], true)) {
                    // Avoid duplicate recommendations and respect the limit
                    if (!isset($recommendations[$movieId])) {
                        $recommendations[$movieId] = ['title' => $movie['title'], 'score' => 0]; // Placeholder for score
                        $recommendedCount++;
                        if ($recommendedCount >= $limit) {
                            break 2; // Exit both loops once limit is reached
                        }
                    }
                }
            }
        }

        // A fuller content-based system would score each movie by how many preferred genres it matches.
        // This simple example just takes the first matches, starting with the most frequent genre.
        return array_values(array_map(fn($rec) => $rec['title'], $recommendations));
    }

    /**
     * A very basic collaborative filtering (user-based) recommendation system.
     * Finds users similar to the target user and recommends movies they liked.
     * Similarity is calculated based on shared highly-rated movies.
     * @param string $targetUserId The ID of the target user.
     * @param int $limit The maximum number of recommendations to return.
     * @return array An array of recommended movie titles.
     */
    public function getCollaborativeRecommendations(string $targetUserId, int $limit = 5): array {
        if ($limit <= 0 || !isset($this->user_ratings[$targetUserId])) {
            return []; // Nothing requested, or no ratings for the target user
        }

        $targetUserRatings = $this->user_ratings[$targetUserId];
        $similarities = [];

        // 1. Calculate similarity between target user and all other users
        foreach ($this->user_ratings as $otherUserId => $otherUserRatings) {
            if ($targetUserId === $otherUserId) continue;

            $sharedMovies = 0;
            $sumOfDifferences = 0;
            foreach ($targetUserRatings as $movieId => $targetRating) {
                if (isset($otherUserRatings[$movieId])) {
                    // Only movies both users rated highly (4 or 5) count as shared;
                    // the rating gap is accumulated to penalize disagreement.
                    if ($targetRating >= 4 && $otherUserRatings[$movieId] >= 4) {
                        $sharedMovies++;
                        $sumOfDifferences += abs($targetRating - $otherUserRatings[$movieId]);
                    }
                }
            }

            // A very simple similarity score: more shared highly-rated movies = more similar.
            // Real systems typically use Pearson correlation, cosine similarity, etc.
            if ($sharedMovies > 0) {
                // Subtract half the mean rating difference as a penalty
                $similarities[$otherUserId] = $sharedMovies - ($sumOfDifferences / ($sharedMovies * 2));
            } else {
                $similarities[$otherUserId] = 0;
            }
        }

        // Sort users by similarity (most similar first)
        arsort($similarities);

        $recommendations = [];
        $recommendedCount = 0;
        $targetUserWatched = array_keys($targetUserRatings);

        // 2. Recommend movies from most similar users that target user hasn't seen
        foreach ($similarities as $similarUserId => $score) {
            if ($score <= 0) continue; // Skip users with no positive similarity

            foreach ($this->user_ratings[$similarUserId] as $movieId => $rating) {
                // Recommend known, highly-rated movies (4 or 5) not yet seen by the target user
                if ($rating >= 4 && isset($this->movies[$movieId]) && !in_array($movieId, $targetUserWatched, true)) {
                    if (!isset($recommendations[$movieId])) {
                        $recommendations[$movieId] = ['title' => $this->movies[$movieId]['title']];
                        $recommendedCount++;
                        if ($recommendedCount >= $limit) {
                            break 2; // Exit both loops once limit is reached
                        }
                    }
                }
            }
        }

        return array_values(array_map(fn($rec) => $rec['title'], $recommendations));
    }
}

// --- Usage Example ---
$recommender = new MovieRecommendationSystem();

echo "\n--- Content-Based Recommendations for userA ---\n";
$recommendationsForUserA = $recommender->getContentBasedRecommendations('userA');
if (empty($recommendationsForUserA)) {
    echo "No content-based recommendations found for userA.\n";
} else {
    foreach ($recommendationsForUserA as $movieTitle) {
        echo "- " . $movieTitle . "\n";
    }
}

echo "\n--- Collaborative-Based Recommendations for userA ---\n";
$collaborativeRecommendationsForUserA = $recommender->getCollaborativeRecommendations('userA');
if (empty($collaborativeRecommendationsForUserA)) {
    echo "No collaborative recommendations found for userA.\n";
} else {
    foreach ($collaborativeRecommendationsForUserA as $movieTitle) {
        echo "- " . $movieTitle . "\n";
    }
}

echo "\n--- Content-Based Recommendations for userB ---\n";
$recommendationsForUserB = $recommender->getContentBasedRecommendations('userB', 3);
if (empty($recommendationsForUserB)) {
    echo "No content-based recommendations found for userB.\n";
} else {
    foreach ($recommendationsForUserB as $movieTitle) {
        echo "- " . $movieTitle . "\n";
    }
}

echo "\n--- Collaborative-Based Recommendations for userB ---\n";
$collaborativeRecommendationsForUserB = $recommender->getCollaborativeRecommendations('userB', 3);
if (empty($collaborativeRecommendationsForUserB)) {
    echo "No collaborative recommendations found for userB.\n";
} else {
    foreach ($collaborativeRecommendationsForUserB as $movieTitle) {
        echo "- " . $movieTitle . "\n";
    }
}
?>
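
The comments in the collaborative method mention Pearson correlation as a more robust similarity measure. The following is a minimal sketch of it, assuming each argument is one user's ratings as `[movie ID => rating]`. The function name `pearsonSimilarity` and the sample users are hypothetical, and returning 0.0 when the correlation is undefined is a simplifying choice.

```php
<?php
/**
 * Pearson correlation between two users, computed over co-rated movies only.
 * Returns 0.0 when fewer than two movies are shared or either user's
 * shared ratings have no variance (the correlation is undefined there).
 */
function pearsonSimilarity(array $a, array $b): float {
    $shared = array_keys(array_intersect_key($a, $b));
    $n = count($shared);
    if ($n < 2) {
        return 0.0;
    }

    $meanA = array_sum(array_map(fn($id) => $a[$id], $shared)) / $n;
    $meanB = array_sum(array_map(fn($id) => $b[$id], $shared)) / $n;

    $cov = 0.0;
    $varA = 0.0;
    $varB = 0.0;
    foreach ($shared as $id) {
        $da = $a[$id] - $meanA;
        $db = $b[$id] - $meanB;
        $cov += $da * $db;
        $varA += $da * $da;
        $varB += $db * $db;
    }

    if ($varA == 0.0 || $varB == 0.0) {
        return 0.0;
    }
    return $cov / sqrt($varA * $varB);
}

// Hypothetical users: Bob rates one point below Alice everywhere, Carol disagrees with her.
$alice = [1 => 5, 2 => 3, 5 => 4];
$bob   = [1 => 4, 2 => 2, 5 => 3, 7 => 5];
$carol = [1 => 1, 2 => 5, 5 => 2];

printf("alice vs bob:   %.2f\n", pearsonSimilarity($alice, $bob));
printf("alice vs carol: %.2f\n", pearsonSimilarity($alice, $carol));
```

Unlike the shared-movie count used in the class, Pearson correlation compares how ratings move relative to each user's own average, so a consistently harsh rater can still match a generous one. It needs at least two co-rated movies, though. In the sample data of the class, no pair of users shares more than one movie, so this measure would return 0.0 for every pair there.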







