Collaborative filtering

15koolneha 5,157 views 32 slides Mar 11, 2017


About This Presentation

Types of recommender systems in information retrieval. Collaborative filtering is a very widely used method in recommendation systems. Content based filtering and collaborative filtering are two major approaches. Hybrid systems are now being employed to get better recommendations. One such method is...

Size: 1.17 MB

Language: en

Added: Mar 11, 2017

Slides: 32 pages

Slide Content

IR Presentation on Collaborative Filtering
By: Neha Kulkarni (5202), ME Computer, Pune Institute of Computer Technology

Overview
- Recommender systems
- Types of recommender systems
- Content-based filtering
- Collaborative filtering
- Hybrid systems
- Content-boosted collaborative filtering
- Evaluation of the CBCF
- Advantages
- Conclusion

Recommender system
A recommender system predicts the “rating” or “preference” that a user would give to an item. Recommendations are produced in two ways:
- Content-based filtering
- Collaborative filtering

Content-based filtering
Content-based filtering selects items based on the correlation between the content of the items and the user's preferences. Keywords are used to describe both the items and the user profile.

Collaborative filtering

Collaborative filtering is based on collecting and analyzing a large amount of information on users' behavior, activities, or preferences, and predicting what users will like based on their similarity to other users. Several methods are used to measure similarity, including:
- K-nearest neighbor
- Pearson correlation
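As an illustration of the Pearson correlation measure, here is a minimal sketch. The function name, the dict-based rating format, and the choice to compute each user's mean over the shared items only are assumptions made for this example, not details taken from the slides.

```python
from math import sqrt

def pearson_similarity(ratings_a, ratings_u):
    """Pearson correlation between two users over the items both have rated.

    ratings_a, ratings_u: dicts mapping item id -> numeric rating (assumed format).
    Returns 0.0 when fewer than two items are shared, or when either user's
    shared ratings have no variance (the correlation is undefined there).
    """
    common = sorted(set(ratings_a) & set(ratings_u))
    if len(common) < 2:
        return 0.0
    # Assumption: means are taken over the co-rated items only.
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in common)
    den_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    if den_a == 0 or den_u == 0:
        return 0.0
    return num / (den_a * den_u)

# Hypothetical users and ratings on a 1-5 scale.
alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3, "m4": 5}
carol = {"m1": 1, "m2": 5, "m3": 2}
```

Here `alice` and `bob` rank their shared items in the same order, so their correlation is high, while `alice` and `carol` disagree and correlate negatively.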

Difference
- Collaborative filtering recommends items that are relevant to the user, based on what similar users liked.
- Content-based recommendation recommends items that match the content of the user profile.
- Because of this, collaborative filtering is the more widely used of the two.

Disadvantages
- Cold start: the system must have enough data to find a match.
- Sparsity: most users do not rate most items, so the user-item rating matrix is “sparse” and the probability of finding a set of users with significantly similar ratings is usually low.
- First rater: an item that has not been previously rated cannot be recommended.

Hybrid approach
The hybrid approach uses content-based prediction to convert a sparse user rating matrix into a full user rating matrix, and then uses collaborative filtering to provide recommendations. Example: the hybrid approach has been applied in the domain of movie recommendation.

Neighborhood-based algorithm
In neighborhood-based algorithms, a subset of users is chosen based on their similarity to the active user, and a weighted combination of their ratings is used to produce predictions for the active user. Steps:
1. Weight all users with respect to their similarity with the active user.
2. Select the n users that have the highest similarity with the active user.
3. Compute a prediction from a weighted combination of the selected neighbors' ratings.
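The neighborhood-based steps can be sketched roughly as follows. This is one common formulation (a mean-centred weighted average), not necessarily the exact one the slides had in mind; the function name, the tuple layout, and the decision to consider only neighbours who have rated the target item are assumptions for illustration.

```python
def predict_rating(active_mean, neighbors, item, n=2):
    """Predict the active user's rating for `item`.

    active_mean: the active user's mean rating.
    neighbors: list of (similarity, mean_rating, ratings_dict) tuples,
               one per candidate user (hypothetical layout).
    """
    # Step 1 is assumed done: each candidate already carries its similarity weight.
    # Step 2: keep the n most similar users (assumption: only those who rated the item).
    rated = [nb for nb in neighbors if item in nb[2]]
    top = sorted(rated, key=lambda nb: nb[0], reverse=True)[:n]
    # Step 3: weighted combination of the neighbours' mean-centred ratings.
    norm = sum(abs(w) for w, _, _ in top)
    if norm == 0:
        return active_mean  # no usable neighbours: fall back to the user's mean
    offset = sum(w * (ratings[item] - mean) for w, mean, ratings in top)
    return active_mean + offset / norm

# Hypothetical candidates: (similarity to active user, their mean, their ratings).
candidates = [
    (0.9, 4.0, {"m4": 5}),
    (0.5, 2.0, {"m4": 3}),
    (0.1, 3.0, {"m4": 1}),
]
prediction = predict_rating(3.0, candidates, "m4", n=2)
```

With `n=2` the weakest neighbour is ignored; raising `n` to 3 lets its below-average rating pull the prediction down slightly.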

Hybrid Models
1. Implementing collaborative and content-based methods separately and combining their predictions.
2. Incorporating some content-based characteristics into a collaborative approach.
3. Incorporating some collaborative characteristics into a content-based approach.
4. Constructing a general unifying model that incorporates both content-based and collaborative characteristics.

Netflix Example
Netflix is a good example of a hybrid system using content-boosted collaborative filtering. Recommendations are made by comparing the watching and searching habits of similar users (CF) and by offering movies that share characteristics with films that the user has rated highly (content-based).

Amazon Example
Amazon is another good example of a hybrid recommendation system. It stores the click stream and usage patterns of the user and of other users with similar preferences (CF), and also offers products that share characteristics with products that the user has rated highly (content-based).

Content-Boosted Collaborative Filtering
Use a content-based predictor to enhance the existing user data, and then provide personalized predictions using collaborative filtering.
[Diagram labels: Input, Input, Content-based recommender, CF-based recommender, Combiner, Recommendations]

Content-Boosted Collaborative Filtering
Create a pseudo user-ratings vector v_u for each user u in the database. Its entry v_{u,i} is r_{u,i} if user u has rated item i, and C_{u,i} otherwise, where:
- r_{u,i}: actual rating given by user u to item i
- C_{u,i}: rating predicted by the pure content-based system
Put together, the pseudo user-ratings vectors of all users give the dense pseudo-ratings matrix V.
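A minimal sketch of building the dense pseudo-ratings matrix V follows: keep the actual rating where the user supplied one, and fall back to the content-based prediction everywhere else. The nested-dict layout and the function name are assumptions for this example, and the content-based predictions are taken as given rather than computed.

```python
def pseudo_ratings(actual, content_pred, users, items):
    """Build the dense matrix V as nested dicts: V[user][item].

    actual: sparse ratings, actual[user][item] for rated items only.
    content_pred: dense content-based predictions, content_pred[user][item]
                  (assumed to be available for every user-item pair).
    """
    return {
        u: {
            # Actual rating when present, content-based prediction otherwise.
            i: actual.get(u, {}).get(i, content_pred[u][i])
            for i in items
        }
        for u in users
    }

# Hypothetical data: each user has rated only one of the two items.
actual = {"u1": {"i1": 5}, "u2": {"i2": 2}}
content_pred = {
    "u1": {"i1": 4.2, "i2": 3.1},
    "u2": {"i1": 1.5, "i2": 2.5},
}
V = pseudo_ratings(actual, content_pred, ["u1", "u2"], ["i1", "i2"])
```

Every cell of `V` is now filled, so similarity can be computed between any pair of users even if they have no co-rated items in the original data.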

The similarity between the active user a and another user u is computed using the Pearson correlation coefficient. Instead of the original user votes, we substitute the values provided by the pseudo user-ratings vectors v_a and v_u.

Harmonic Mean Weighting
Inaccuracies in the pseudo user-ratings vectors often yield misleadingly high correlations between the active user and other users. Hence, to incorporate confidence (or the lack thereof) in our correlations, we weight them using the Harmonic Mean weighting factor (HM weighting).

where n_i is the number of items rated by user i. The harmonic mean tends to bias the weight towards the lower of the two values. The threshold of 50 ratings was chosen based on 10-fold cross-validation.
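The formula on this slide did not survive extraction. The sketch below follows the definition in the cited Melville et al. paper as best it can be reconstructed: each user's confidence m_i is n_i/50 when n_i is below 50 and 1 otherwise, and the weight is the harmonic mean 2·m_i·m_j / (m_i + m_j). The function and parameter names are illustrative.

```python
def harmonic_mean_weight(n_i, n_j, threshold=50):
    """Harmonic mean weighting factor for users i and j.

    n_i, n_j: number of items each user has actually rated.
    threshold: rating count at which a user's confidence saturates at 1
               (50 in the slides).
    """
    m_i = min(n_i / threshold, 1.0)
    m_j = min(n_j / threshold, 1.0)
    if m_i + m_j == 0:
        return 0.0  # neither user has rated anything
    return 2 * m_i * m_j / (m_i + m_j)

# A user with 10 ratings paired with a user with 50 ratings.
w = harmonic_mean_weight(10, 50)
```

For confidences 0.2 and 1.0 the harmonic mean is about 0.33, noticeably closer to the lower value than the arithmetic mean of 0.6 would be.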

To the harmonic mean weight we add the significance weighting factor to obtain the hybrid correlation weight. If two users have fewer than 50 co-rated items, the significance weighting factor is n/50, where n is the number of co-rated items; otherwise it is 1.
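A small sketch of the hybrid correlation weight, assuming the harmonic mean weight has already been computed and that n counts the items both users have rated. The function names are hypothetical.

```python
def significance_weight(n_corated, threshold=50):
    """n/50 below the threshold, 1 at or above it."""
    return min(n_corated / threshold, 1.0)

def hybrid_correlation_weight(hm_weight, n_corated, threshold=50):
    """Harmonic mean weight plus the significance weighting factor."""
    return hm_weight + significance_weight(n_corated, threshold)

# Hypothetical pair of users: harmonic mean weight 0.4 and 25 co-rated items.
hw = hybrid_correlation_weight(0.4, 25)
```

Because both terms lie between 0 and 1, the hybrid correlation weight ranges from 0 to 2.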

Self-Weighting Factor
To give the pseudo-active user more importance than the neighbours (that is, to increase confidence in the pure content-based predictions for the active user), we incorporate a self-weighting factor in the final prediction. Here, max is the overall confidence in the content-based predictor.
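The self-weighting formula is missing from the extracted slide. The sketch below assumes the form used in the cited Melville et al. paper, where the factor grows linearly with the number of items the active user has rated and saturates at max once that count reaches 50. The default of 2 for `max_conf` is the value that paper reports, as far as can be recalled, and should be treated as a tunable assumption.

```python
def self_weight(n_a, max_conf=2.0, threshold=50):
    """Self-weighting factor for the active user.

    n_a: number of items the active user has actually rated.
    max_conf: overall confidence in the content-based predictor ("max" in the
              slides); the default of 2.0 is an assumption.
    """
    if n_a < threshold:
        return (n_a / threshold) * max_conf
    return max_conf

sw_new_user = self_weight(5)      # few ratings: low weight on own content predictions
sw_heavy_user = self_weight(120)  # many ratings: weight saturates at max_conf
```

The idea is that a content-based predictor trained on more of the user's own ratings deserves more trust.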

Producing predictions
where:
- P_{a,i}: final CBCF prediction for user a and item i
- C_{a,i}: pure content-based prediction for user a and item i
- n: size of the neighbourhood
The denominator is a normalizing factor that ensures all weights sum to 1.
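The prediction formula itself was an image and is not preserved. The sketch below reconstructs it from the cited Melville et al. paper: the active user's mean pseudo-rating, plus a normalized sum of the self-weighted content-based deviation and each neighbour's deviation weighted by the hybrid correlation weight times the Pearson correlation. All names and the tuple layout are illustrative, and every weight is assumed to be precomputed.

```python
def cbcf_predict(v_bar_a, c_ai, sw_a, neighbors):
    """Final CBCF prediction P_{a,i}.

    v_bar_a: mean of the active user's pseudo-ratings.
    c_ai: pure content-based prediction for the active user and item i.
    sw_a: self-weighting factor of the active user.
    neighbors: list of (hw_au, pearson_au, v_ui, v_bar_u) tuples, one per
               neighbour u (hybrid weight, Pearson correlation, u's
               pseudo-rating for item i, u's mean pseudo-rating).
    """
    numerator = sw_a * (c_ai - v_bar_a)
    denominator = sw_a  # normalizing factor: sum of all weights
    for hw_au, pearson_au, v_ui, v_bar_u in neighbors:
        weight = hw_au * pearson_au
        numerator += weight * (v_ui - v_bar_u)
        denominator += weight
    if denominator == 0:
        return v_bar_a  # no usable evidence: fall back to the user's mean
    return v_bar_a + numerator / denominator

# Hypothetical values for one active user and two neighbours.
p = cbcf_predict(
    v_bar_a=3.0, c_ai=4.0, sw_a=2.0,
    neighbors=[(1.5, 0.8, 5.0, 4.0), (1.0, 0.5, 2.0, 3.0)],
)
```

With an empty neighbourhood the expression collapses to the content-based prediction, which is the behaviour one would expect from a content-boosted scheme.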

Evaluation
- Mean Absolute Error (statistical accuracy): the average absolute difference between predicted ratings and actual ratings.
- ROC curve (decision support):
  - sensitivity: the probability that a good item is accepted by the filter
  - specificity: the probability that a bad item is rejected by the filter
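Both evaluation measures are straightforward to compute. In the sketch below, the threshold that separates "good" from "bad" items (a rating of 4 on a 5-point scale) and the function names are assumptions for illustration; a full ROC curve would repeat the sensitivity and specificity calculation over a range of acceptance thresholds.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def sensitivity_specificity(predicted, actual, good_threshold=4):
    """Treat the recommender as a filter that accepts items predicted at or
    above good_threshold; an item is 'good' if its actual rating meets it."""
    tp = fn = tn = fp = 0
    for p, a in zip(predicted, actual):
        is_good = a >= good_threshold
        accepted = p >= good_threshold
        if is_good and accepted:
            tp += 1
        elif is_good:
            fn += 1
        elif accepted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity

# Hypothetical predictions and held-out actual ratings.
predicted = [4.5, 3.0, 2.0, 4.2]
actual = [5, 4, 1, 3]
mae = mean_absolute_error(predicted, actual)
sens, spec = sensitivity_specificity(predicted, actual)
```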

Why is this system better?
- Overcomes the first-rater problem
- Tackles sparsity
- Finds better neighbours
- Overcomes the cold-start problem

Conclusion
- CBCF elegantly exploits content within a collaborative framework.
- It overcomes problems faced by pure content-based or pure collaborative systems.
- Incorporating content information into a collaborative framework can improve recommender systems.

References
- Data Mining: Concepts and Techniques, 3rd edition
- Mining the Web, Chakrabarti
- Web Data Mining, Springer
- Prem Melville, Raymond J. Mooney, Ramadass Nagarajan, “Content-Boosted Collaborative Filtering for Improved Recommendations”, AAAI-02 Proceedings, 2002
