Movie Recommendation System Project
The main goal of this machine learning project is to build a recommendation engine that recommends movies to users. This project is designed to help you understand how a recommendation system works. We will develop an item-based collaborative filter. By the end of this tutorial, you will have experience applying your data science and machine learning skills to a real-life project.
Before moving ahead with this movie recommendation system project, you need to know what a recommendation system is.
What is a Recommendation System?
A recommendation system provides suggestions to users through a filtering process based on their preferences and browsing history. It takes information about the user as input, typically in the form of browsing data that reflects prior usage of the product as well as any ratings the user assigned. Based on these preferences, the system presents content the user is likely to enjoy. Under the hood, a recommendation system is an application of machine learning algorithms.
A recommendation system also finds similarities between different products. For example, Netflix’s recommendation system suggests movies similar to the ones you have watched in the past. Collaborative filtering, in turn, provides recommendations based on other users who have a similar viewing history or similar preferences. There are two main types of recommendation systems: content-based and collaborative filtering. In this project, we will build a collaborative filtering recommendation system, more specifically an item-based one.
Let’s start coding:
In this implementation, when the user searches for a movie we will recommend the top 10 similar movies using our movie recommendation system. We will be using an item-based collaborative filtering algorithm for our purpose.
- Getting the data up and running:
First, we import the libraries we’ll be using in our movie recommendation system.
Let’s have a look at the movies dataset:
The movies dataset has:
- movieId — once the recommendation is done, we get a list of similar movieIds and look up the title of each movie in this dataset.
- genres — which is not required for this filtering approach.
The ratings dataset has:
- userId — unique for each user.
- movieId — using this feature, we take the title of the movie from the movies dataset.
- rating — the rating a user gave to a movie; using these ratings, we are going to find the top 10 similar movies.
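To make the two tables concrete, here is a minimal sketch in Python with pandas. It assumes the datasets are pandas DataFrames with the columns listed above; the rows are toy data, and the titles and the name `final_dataset` are made up for illustration. It reshapes the ratings into one row per movie and one column per user, which is the layout an item-based filter compares.

```python
import pandas as pd

# Toy stand-ins for the two datasets (hypothetical rows, not real data).
movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "genres": ["Comedy", "Drama", "Action"],
})
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3],
    "movieId": [1, 3, 1, 2, 3],
    "rating":  [4.0, 5.0, 3.5, 2.0, 4.5],
})

# One row per movie, one column per user, ratings as the cell values.
# Movies a user has not rated show up as NaN.
final_dataset = ratings.pivot(index="movieId", columns="userId", values="rating")
print(final_dataset)
```

Each row is now a movie described by the ratings it received, so two movies can be compared by comparing their rows.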
Wherever a user has not rated a movie, the value is NaN. Let’s fix this by imputing NaN with 0, which makes the data usable by the algorithm and also easier on the eyes.
Let’s visualize what these filters look like.
Aggregating the number of users who voted for each movie and the number of movies each user voted on.
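A minimal sketch of the imputation step, assuming the movie-by-user table is a pandas DataFrame (the name `final_dataset` and the values are illustrative):

```python
import numpy as np
import pandas as pd

# Toy movie-by-user rating table with missing ratings (hypothetical values).
final_dataset = pd.DataFrame(
    [[4.0, 3.5, np.nan],
     [np.nan, 2.0, np.nan],
     [5.0, np.nan, 4.5]],
    index=pd.Index([1, 2, 3], name="movieId"),
    columns=pd.Index([1, 2, 3], name="userId"),
)

# Replace every missing rating with 0 so the table is fully numeric.
final_dataset = final_dataset.fillna(0)
print(final_dataset)
```

Note that 0 here means "no rating", not "rated zero"; that is a simplification this approach accepts.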
Making the necessary modifications as per the threshold set. Let’s visualize the number of votes by each user with our threshold of 50.
Making the necessary modifications as per the threshold set.
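The aggregation and threshold steps could look roughly like this. It is a sketch on toy data: the variable names are assumptions, and the thresholds are shrunk to 2 so that the tiny example has something to filter (the text above uses 50 for users; the movie threshold here is purely illustrative).

```python
import pandas as pd

ratings = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2, 3],
    "movieId": [1, 2, 3, 1, 2, 1],
    "rating":  [4.0, 3.0, 5.0, 3.5, 2.0, 4.5],
})
final_dataset = ratings.pivot(
    index="movieId", columns="userId", values="rating"
).fillna(0)

# Number of users who voted for each movie, and number of movies each user voted on.
no_user_voted = ratings.groupby("movieId")["rating"].count()
no_movies_voted = ratings.groupby("userId")["rating"].count()

# Illustrative thresholds for the toy data (assumed values).
MIN_USERS_PER_MOVIE = 2
MIN_MOVIES_PER_USER = 2

# Keep only the movies (rows) and users (columns) that meet their thresholds.
keep_movies = no_user_voted[no_user_voted >= MIN_USERS_PER_MOVIE].index
keep_users = no_movies_voted[no_movies_voted >= MIN_MOVIES_PER_USER].index
final_dataset = final_dataset.loc[keep_movies, keep_users]
print(final_dataset)
```

Movie 3 and user 3 each have a single vote, so both are dropped from the table.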
Removing sparsity
Applying the csr_matrix method to the dataset:
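A minimal sketch of this step, assuming `csr_matrix` refers to `scipy.sparse.csr_matrix` and that the rating table is a pandas DataFrame (the names `final_dataset` and `csr_data` are illustrative):

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Toy movie-by-user table after NaN has been replaced with 0.
final_dataset = pd.DataFrame(
    [[4.0, 3.5, 0.0],
     [0.0, 2.0, 0.0],
     [5.0, 0.0, 4.5]],
    index=pd.Index([1, 2, 3], name="movieId"),
    columns=pd.Index([1, 2, 3], name="userId"),
)

# Compressed Sparse Row format stores only the non-zero ratings.
csr_data = csr_matrix(final_dataset.values)

# Turn movieId back into a regular column so that row positions in csr_data
# can be mapped back to movie ids later.
final_dataset = final_dataset.reset_index()

print(csr_data.shape, csr_data.nnz)
```

On a real rating table most cells are 0, so storing only the non-zero entries saves a lot of memory.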
Making the movie recommendation system model
We will use the KNN algorithm to compute similarity with the cosine distance metric, which is fast to compute and, for our purpose, preferable to the Pearson coefficient.
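One way to set this up is with scikit-learn's `NearestNeighbors`. This is a sketch under the assumption that scikit-learn is the library in use; the matrix is toy data. Cosine distance is 1 minus the cosine of the angle between two rating vectors, so movies rated in the same pattern by the same users get a distance close to 0.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Toy movie-by-user matrix: rows 0 and 1 are rated alike, row 2 is different.
csr_data = csr_matrix(np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 4.0],
]))

# Brute-force search works directly on sparse input with the cosine metric.
knn = NearestNeighbors(metric="cosine", algorithm="brute", n_neighbors=2)
knn.fit(csr_data)

# Query with row 0: the closest row is row 0 itself, followed by row 1.
distances, indices = knn.kneighbors(csr_data[0], n_neighbors=2)
print(indices[0], distances[0])
```

Because the query row is part of the fitted data, it comes back as its own nearest neighbour and has to be skipped when recommending.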
Making the recommendation function
The working principle is very simple. We first check whether the input movie name is in the database. If it is, we use our recommendation system to find similar movies, sort them by their similarity distance, and output only the top 10 movies along with their distances from the input movie.
Now let’s recommend some movies!
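Putting the pieces together, the function could be sketched as follows. Everything here is illustrative: the toy data, the titles, the function name `get_movie_recommendation`, and the substring match on the title are assumptions, not necessarily how the original code does it.

```python
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Toy data with hypothetical titles.
movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
})
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3, 4, 4],
    "movieId": [1, 2, 1, 2, 3, 4, 3, 4],
    "rating":  [5.0, 4.5, 4.0, 5.0, 5.0, 4.0, 4.5, 5.0],
})

final_dataset = ratings.pivot(
    index="movieId", columns="userId", values="rating"
).fillna(0)
csr_data = csr_matrix(final_dataset.values)
final_dataset = final_dataset.reset_index()

knn = NearestNeighbors(metric="cosine", algorithm="brute")
knn.fit(csr_data)


def get_movie_recommendation(movie_name, n_movies=10):
    """Return up to n_movies similar titles with their cosine distances."""
    not_found = "No movies found. Please check your input"
    matches = movies[movies["title"].str.contains(movie_name, regex=False)]
    if matches.empty:
        return not_found
    movie_id = matches.iloc[0]["movieId"]
    rows = final_dataset.index[final_dataset["movieId"] == movie_id]
    if len(rows) == 0:
        return not_found
    row = rows[0]

    # Ask for one extra neighbour because the query movie matches itself.
    k = min(n_movies + 1, csr_data.shape[0])
    distances, indices = knn.kneighbors(csr_data[row], n_neighbors=k)
    neighbours = sorted(
        zip(indices[0].tolist(), distances[0].tolist()), key=lambda pair: pair[1]
    )

    results = []
    for idx, dist in neighbours:
        if idx == row:
            continue  # skip the input movie itself
        neighbour_id = final_dataset["movieId"].iloc[idx]
        title = movies.loc[movies["movieId"] == neighbour_id, "title"].iloc[0]
        results.append((title, dist))
    return pd.DataFrame(results[:n_movies], columns=["Title", "Distance"])


print(get_movie_recommendation("Movie A"))
```

With only four toy movies the function returns three neighbours; on a full dataset it would return the top 10, closest first.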
Hence, I conclude my collaborative filtering here.