A Fuzzy Based Approach For Modelling Preferences Of Users In Multi-Criteria Recommender Systems
ABSTRACT
Recommender systems (RSs) are web-based platforms or software tools that use machine learning methods to propose useful items to users. Several techniques have been used to build such systems and generate lists of recommendations. The multi-criteria technique is a more recent approach that recommends items based on ratings of multiple characteristics or attributes of the items. It has been applied to many recommendation problems, and its predictive performance has been shown to exceed that of the traditional single-rating approach. However, current research shows that machine learning techniques are still needed to model the criteria ratings in multi-criteria recommendation. This project presents a model based on the architecture and main features of fuzzy sets and systems. Fuzzy logic (FL) is a method of reasoning that resembles human reasoning and has been applied effectively in many fields of study. Its main advantages are that it does not need large amounts of training data and that it can incorporate human heuristics into computer-assisted decision making, which makes it well suited to recommender systems. The project tests the predictive performance of the fuzzy-based multi-criteria technique and compares it with some existing methods. The main focus of the research is a model that improves the prediction accuracy of an RS, increases ranking accuracy, and yields a high correlation between predicted and actual values. Experiments on a real-world dataset (Yahoo! Movies) showed that the proposed technique, the Fuzzy Multi-criteria Recommender System, markedly improved prediction accuracy in multi-criteria collaborative filtering (CF) RSs. The system was implemented in the Java programming language.
Keywords: Recommender System, Multi-Criteria, Fuzzy Logic, Membership function, rating, linguistic variables, overall rating.
CHAPTER ONE
INTRODUCTION
This chapter presents a general introduction to the Recommender System, the proposed system frameworks, problem statement, research aim and objectives, research questions, and structure of the thesis.
1.1 Background of the Study
The rapid growth of the Internet of Things (IoT) and the fast development of e-commerce websites have created a pressing need for recommender systems. Users find it difficult to arrive at the most appropriate choice given the immense variety of items (products and services) that these websites offer. The explosive growth and variety of information available on the Web, together with the rapid introduction of new e-business services (selling products, product comparison, auctions, etc.), frequently overwhelms users and leads them to make poor decisions. Consequently, the availability of choices, instead of producing a benefit, started to decrease users’ well-being. While choice is good, more choice is not always better: it leads to information overload, which leaves users unsure of the right choice among the growing number of options. Hence the need for a recommender system (RS). Recently, RSs have proven to be a valuable means of coping with the information overload problem. An RS addresses this problem by pointing a user towards new, not-yet-experienced items that may be relevant to the user’s current task.
1.2 Recommender Systems (RS)
Recommender Systems (RS) are software tools and techniques that provide users with suggestions for items that are most likely to interest them. Recommendation is about predicting patterns of taste and using them to discover new and desirable things that the user did not already know about. A recommender system is a specific type of information filtering technique that tries to present users with information about items (movies, music, books, news, web pages, among others) in which they are interested (Meier, Pedrycz & Portmann, 2013). RS emerged as an independent research area in the mid-1990s. It is mostly used on e-commerce websites to provide suggestions to people who struggle to select a few items out of the overwhelming number on offer. Examples of sites that use recommender systems include Amazon, Netflix, YouTube, Spotify, LinkedIn and Facebook. RSs are mostly directed towards people who lack the personal experience or skill to evaluate the possibly overwhelming number of alternative items that a website may offer. A good example is a movie recommender system that assists users in selecting a movie to watch: Netflix, which is well known for its movie recommendations, employs an RS to personalize its catalogue for each customer. RSs can be personalized or non-personalized. Non-personalized recommendations are mostly featured in magazines or newspapers and are typically not addressed in RS research. Personalized recommendations, by contrast, give different users or user groups diverse, tailored suggestions, usually offered as ranked lists of items. To perform this ranking, an RS tries to predict the most suitable products and services based on the user’s preferences and constraints. To complete this computational task, the RS collects information about the user’s preferences, which may be expressed explicitly, for example through ratings, or inferred implicitly from behaviour such as browsing history and site navigation.
The development of RSs began from a fairly simple observation: individuals often rely on recommendations provided by others when making routine daily decisions. For example, it is common to rely on what one’s peers recommend when selecting a book to read; similarly, school administrators count on recommendation letters during the admission decision process. Furthermore, when selecting a movie to watch, individuals tend to read and rely on reviews by movie critics. To mimic this real-life behaviour, the first RSs were implemented using the Collaborative Filtering technique, which assumes that if an active user agreed in the past with certain users, then other recommendations coming from these similar users should also be relevant to the active user. The data used in an RS refer to three main kinds of objects: items, users and transactions (Tobergte & Curtis, 2013).
1. Items: Items are the objects (movies, books, music, places of interest, services) that an RS recommends to a user.
2. Users: Users are the people to whom the items are recommended. They are described by their interactions with the RS.
3. Transactions: A transaction is a recorded interaction between a user and the RS. Transactions are log-like data that store important information generated during the human-computer interaction and that are useful to the recommendation algorithm the system is using. A transaction may take the form of explicit or implicit feedback provided by the user, such as the rating for a selected item. Ratings are the most popular form of transaction data that an RS collects. Accordingly, RSs are generally classified by rating scheme into two forms: traditional (single-rating) and multi-criteria RS.
1.2.1 Traditional Single Rating
Most existing recommender systems on the market are based on a single numerical rating that represents a user’s opinion about an item. The traditional RS operates in the two-dimensional space of users and items. The utility of items to users is generally represented by a totally ordered set of ratings $R_0$. Ratings can take a variety of forms (Tobergte & Curtis, 2013), such as:
Numerical ratings: a number on a fixed scale, for example from 1 to 5.
Ordinal ratings: a set of ordered terms, such as strongly agree to strongly disagree, from which the user selects the term that best indicates their opinion about an item.
Binary ratings: the user simply decides whether an item is good or bad.
Unary ratings: a single value indicating that a user has purchased an item or otherwise rated it positively.
The utility function R for a single-criterion RS can be formally written as follows: $R: \mathit{Users} \times \mathit{Items} \rightarrow R_0$
The utility function is determined from user inputs, such as numeric ratings that users explicitly assign to items and/or transaction data that implicitly show users’ preferences (e.g., purchase history). The majority of traditional recommender systems use single-criterion ratings that indicate how much a given user liked a certain item overall (i.e., the overall utility of the item to the user). For instance, consider a traditional collaborative movie recommender system with single ratings between 1 and 10, where user u provides a single rating, denoted by $R(u, i)$, for each movie i that they have watched, and assume that the users provide their ratings as shown in Figure 1.1.
The system estimates the rating that user u would give to a yet-unseen movie i according to how users u′ who are similar to the target user u rated movie i. As illustrated in Figure 1.1, assume that there are five users, u1, …, u5, and five movies, i1, …, i5, with the ratings shown in the figure. The RS finds the users who are closest to the active user u1 and who have also watched movie i5. Figure 1.1 shows that users u2 and u3 have the same taste as user u1, so the RS predicts the rating of user u1 for movie i5, R(u1, i5), as 9. Thus, the ability to correctly determine the users who are most similar to the target user is crucial for accurate predictions. A single-criterion rating, however, hides vital information about exactly what the user liked in the movie, and can therefore mislead an RS into making a wrong prediction, because the particular features the user liked are not represented. This limitation of the single rating gave rise to the need for multi-criteria RS.
Figure 1.1: Collaborative Filtering in a Single Criteria RS (Adopted from New Recommendation Technique for Multi-criteria Recommender System)
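The neighbourhood-based prediction described above can be sketched in Java as follows. This is a minimal illustration, not the thesis implementation: the rating matrix is hypothetical (chosen only to be consistent with the description of Figure 1.1, with 0 marking an unrated movie), and the class name `SingleRatingCF`, the distance-based similarity and the neighbourhood size k = 2 are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class SingleRatingCF {
    // Similarity in (0, 1]: 1 / (1 + mean absolute difference over co-rated items).
    // A rating of 0 means "not rated". This measure is an assumption of the sketch.
    static double similarity(int[] a, int[] b) {
        double sum = 0;
        int n = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > 0 && b[i] > 0) {
                sum += Math.abs(a[i] - b[i]);
                n++;
            }
        }
        return n == 0 ? 0.0 : 1.0 / (1.0 + sum / n);
    }

    // Predicts ratings[user][item] as the similarity-weighted mean of the ratings
    // given to the item by the k users most similar to the target user.
    static double predict(int[][] ratings, int user, int item, int k) {
        List<double[]> neighbours = new ArrayList<>(); // each entry: {similarity, rating}
        for (int v = 0; v < ratings.length; v++) {
            if (v != user && ratings[v][item] > 0) {
                neighbours.add(new double[] {similarity(ratings[user], ratings[v]), ratings[v][item]});
            }
        }
        neighbours.sort((p, q) -> Double.compare(q[0], p[0])); // most similar first
        double num = 0, den = 0;
        for (int j = 0; j < Math.min(k, neighbours.size()); j++) {
            num += neighbours.get(j)[0] * neighbours.get(j)[1];
            den += neighbours.get(j)[0];
        }
        return den == 0 ? Double.NaN : num / den;
    }

    public static void main(String[] args) {
        // Hypothetical ratings (rows u1..u5, columns i1..i5); 0 = unknown R(u1, i5).
        int[][] ratings = {
            {5, 7, 5, 7, 0},
            {5, 7, 5, 7, 9},
            {5, 7, 5, 7, 9},
            {6, 6, 6, 6, 5},
            {6, 6, 6, 6, 5}
        };
        System.out.println("Predicted R(u1, i5) = " + predict(ratings, 0, 4, 2));
    }
}
```

With these illustrative ratings, u2 and u3 are the two nearest neighbours of u1, so the sketch predicts 9 for R(u1, i5), in line with the example.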
1.3 Multi-Criteria Recommender System
Multi-criteria ratings describe the different attributes of an item that together characterize its quality. For instance, in a music RS, the criteria or attributes might be the lyrics, visuals, audio, sound, beat, genre, etc. The utility function is represented as $R: \mathit{Users} \times \mathit{Items} \rightarrow R_0 \times R_1 \times \dots \times R_k$, where $R_0$ is the overall rating and $R_1, \dots, R_k$ are the criteria ratings. In multi-criteria rating, users provide subjective preference ratings on multiple attributes of an item. This additional information can help to improve the quality of recommendations, because it captures more complex preferences of each user and models user preferences more accurately: users might have different reasons for liking an item (Tobergte & Curtis, 2013). Research on multi-criteria problems is extensive in both operations research and decision science (Adomavicius & Kwon, 2007).
Now consider the same scenario in a multi-criteria RS, with five users, u1, …, u5, five movies, i1, …, i5, an unknown rating R(u1, i5) that must be predicted, and known overall ratings that are exactly the same as in Figure 1.1. Assume that the system asks each user to provide feedback for each movie on four explicit criteria (story, acting, direction, and visuals) and that the overall rating is a simple average of the four individual criteria ratings. Using the same CF method, the additional information in the multi-criteria ratings shown in Figure 1.2 makes it clear that u2 and u3 are rather different in their tastes from u1, even though their overall ratings for each movie match perfectly: u1 disliked the movie aspects (story and acting) that u2 and u3 liked, and liked the aspects (direction and visuals) that they disliked. Recommender systems based on single-criterion ratings hide this information in the aggregated rating. As the example shows, the aggregation can lead to inaccurate conclusions about the true similarity between user preferences.
Users u4 and u5 are better matches for user u1 in this example, because both their overall ratings and their preferences for the different movie features are similar to those of u1. Both u4 and u5 rate movie i5 as 5, so the system would predict a value of 5 for the target rating R(u1, i5). This result differs from the one obtained in the single-rating scenario in Section 1.2.1. The two scenarios show that the overall rating a user gives to an item describes how much they like the item, while the multi-criteria ratings provide some insight into why they like it. Therefore, multi-criteria ratings enable more accurate estimates of the similarity between two users.
Figure 1.2: Collaborative Multi-criteria Rating. (Adopted from New Recommendation Technique for Multi-criteria Recommender System)
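The effect of the criteria ratings on neighbour selection can be sketched in Java as follows. This is an illustrative example under stated assumptions, not the method proposed in the thesis: the criteria ratings are hypothetical values chosen to match the description of Figure 1.2, the overall rating is taken as the simple average of the four criteria (as assumed in the text), and the class name `MultiCriteriaCF`, the distance-based similarity and k = 2 are choices made only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class MultiCriteriaCF {
    // Hypothetical ratings[user][movie] = {story, acting, direction, visuals};
    // null marks an unrated movie. Rows are u1..u5, columns i1..i5.
    static double[][][] sampleRatings() {
        return new double[][][] {
            {{2, 2, 8, 8}, {5, 5, 9, 9}, {2, 2, 8, 8}, {5, 5, 9, 9}, null},
            {{8, 8, 2, 2}, {9, 9, 5, 5}, {8, 8, 2, 2}, {9, 9, 5, 5}, {10, 10, 8, 8}},
            {{8, 8, 2, 2}, {9, 9, 5, 5}, {8, 8, 2, 2}, {9, 9, 5, 5}, {10, 10, 8, 8}},
            {{3, 3, 9, 9}, {4, 4, 8, 8}, {3, 3, 9, 9}, {4, 4, 8, 8}, {2, 2, 8, 8}},
            {{3, 3, 9, 9}, {4, 4, 8, 8}, {3, 3, 9, 9}, {4, 4, 8, 8}, {2, 2, 8, 8}}
        };
    }

    // Overall rating R0, assumed here to be the simple average of R1..Rk.
    static double overall(double[] criteria) {
        double sum = 0;
        for (double c : criteria) sum += c;
        return sum / criteria.length;
    }

    // Similarity from the mean per-criterion absolute difference over co-rated movies.
    static double similarity(double[][] a, double[][] b) {
        double total = 0;
        int n = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == null || b[i] == null) continue;
            double d = 0;
            for (int c = 0; c < a[i].length; c++) d += Math.abs(a[i][c] - b[i][c]);
            total += d / a[i].length;
            n++;
        }
        return n == 0 ? 0.0 : 1.0 / (1.0 + total / n);
    }

    // Predicts the overall rating of (user, item) from the k most similar users.
    static double predictOverall(double[][][] r, int user, int item, int k) {
        List<double[]> neighbours = new ArrayList<>(); // each entry: {similarity, overall rating}
        for (int v = 0; v < r.length; v++) {
            if (v != user && r[v][item] != null) {
                neighbours.add(new double[] {similarity(r[user], r[v]), overall(r[v][item])});
            }
        }
        neighbours.sort((p, q) -> Double.compare(q[0], p[0])); // most similar first
        double num = 0, den = 0;
        for (int j = 0; j < Math.min(k, neighbours.size()); j++) {
            num += neighbours.get(j)[0] * neighbours.get(j)[1];
            den += neighbours.get(j)[0];
        }
        return den == 0 ? Double.NaN : num / den;
    }

    public static void main(String[] args) {
        System.out.println("Predicted R(u1, i5) = " + predictOverall(sampleRatings(), 0, 4, 2));
    }
}
```

With these values the criteria-based similarity ranks u4 and u5 above u2 and u3, so the predicted overall rating for R(u1, i5) becomes 5 rather than 9.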
1.4 Recommendation Techniques
Recommender systems are classified according to the taxonomy given by Burke (2007):
1. Content-Based: This describes a system designed to recommend items that are similar to the ones the user liked in the past. The technique learns the user’s preferences by extracting keywords that describe what the user has liked before, and recommends similar items to the user in their next transaction.
2. Collaborative Filtering (CF): This is the most popular and widely used approach for RS. It makes recommendations to the active user based on the items that other users with similar tastes liked in the past. It does this by collecting the explicit ratings that users give to items and computing similarities between users or items to provide a recommendation. Collaborative filtering is a people-to-people correlation (Owen, Anil, Dunning, & Friedman, 2011). CF takes two basic forms, memory-based and model-based techniques, which are discussed further in Chapter 3.
3. Hybrid: This technique is based on a combination of content-based and collaborative filtering. “A hybrid system combining techniques A and B tries to use the advantages of A to fix the disadvantages of B” (Tobergte & Curtis, 2013).
1.5 Reasons for Using Recommender Systems
The following are reasons why a recommender system is needed:
To increase sales and productivity: when users are given the best suggestions, purchases increase because the suggested items match their preferences.
To increase sales of diverse items: an RS does not offer users only popular items or services; it also gives them the opportunity to discover items that are not widely popular and that they would be unlikely to find without a recommender system.
To maximize user satisfaction: an RS matches users with items that fit their preferences. This increases their satisfaction and sustains their interest in the system, which leads to more frequent visits to the site and more purchases.
To increase user fidelity: when a system recognizes returning users and treats them as valued customers, their loyalty to the system grows.
1.6 Summary of Tasks Performed by the RS
Helps users find the best items that can satisfy their needs.
Provides a list of all available items.
Generates suggestions that match the user’s preferences without the user explicitly requesting them.
1.7 Fuzzy Set and Logic
The word “fuzz” or “fuzzy” means “nap” or “pile” and is used of textiles. From “fuzz” comes the idea of a hazy outline, something that is not clearly seen; “fuzzy” therefore means “unclear” or “ambiguous”. In the real world, things are not always at the extremes of yes or no, and there are many questions to which we may not wish to give an exact response. For example, asked “Do you like chicken pie?”, one person might respond “I like it, but it is not really tasty”, while another might reply “I don’t really like it, but if that is the only option, let me have it”. It is difficult to determine from these responses the degree to which each person likes chicken pie. For this reason, Prof. L. A. Zadeh proposed fuzzy set theory in 1965, extending the two-valued evaluation of 0 or 1 to the infinitely many values in the interval from 0 to 1. In fuzzy set notation, curly brackets {} indicate sets, while square brackets [] denote closed intervals of real numbers.
A fuzzy set A on a universe of discourse X is characterized by a membership function (MF) $\mu_A: X \rightarrow [0, 1]$, which generalizes the characteristic function of a classical set. The value $\mu_A(x)$ at an element $x \in X$ gives the degree of membership of that element in the fuzzy set A.
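As a concrete illustration of a membership function, the following Java sketch evaluates a triangular MF. The triangular shape, the class name `TriangularMembership`, and the breakpoints of the fuzzy set “medium” on a 0 to 10 rating scale are all assumptions made for this example; this section does not specify which membership functions the proposed system uses.

```java
class TriangularMembership {
    // Degree of membership of x in a triangular fuzzy set with feet a and c and peak b,
    // assuming a < b < c. The result always lies in [0, 1].
    static double triangular(double x, double a, double b, double c) {
        if (x <= a || x >= c) return 0.0;      // outside the support of the set
        if (x == b) return 1.0;                // full membership at the peak
        return x < b ? (x - a) / (b - a)       // rising edge
                     : (c - x) / (c - b);      // falling edge
    }

    public static void main(String[] args) {
        // Hypothetical linguistic term "medium" on a 0..10 rating scale.
        double a = 2, b = 5, c = 8;
        for (double rating : new double[] {1, 3.5, 5, 6.5, 9}) {
            System.out.println("rating " + rating + " -> degree " + triangular(rating, a, b, c));
        }
    }
}
```

Under these assumed breakpoints, a rating of 5 belongs to “medium” with degree 1, a rating of 3.5 with degree 0.5, and ratings of 1 or 9 with degree 0, which shows how a crisp rating is mapped to a graded truth value rather than a yes/no answer.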
Fuzzy logic describes a set of truth values in the interval [0, 1] and is derived from fuzzy set theory. One might ask why fuzzy theory has such wide applications. The answer is that fuzzy theory is designed to deal mathematically with the vagueness inherent in the meaning of human language.
Furthermore, there has been a movement toward trying to use fuzzy theory in the humanities and social sciences recently. In the near future, starting with models of human activity, thinking, psychology, reliability, and economics, it will probably be used actively in education, law, and analysis and evaluation of things such as public opinion. In summary, fuzzy methods will probably serve all fields that relate to control and information.
1.8 Aim and Objectives
This thesis aims to implement a fuzzy-based algorithm that models the preferences of users in a multi-criteria recommender system. The predictive performance of the fuzzy-based multi-criteria technique is evaluated and compared with that of some existing methods (traditional RS). The thesis is aimed at achieving the following objectives:
1. To decrease prediction errors and increase ranking accuracy.
2. To obtain a high correlation between the predicted and actual values.
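The objectives above can be made measurable with standard evaluation metrics. The Java sketch below computes the mean absolute error (MAE), the root mean squared error (RMSE) and the Pearson correlation between predicted and actual ratings. The choice of these particular metrics, the class name `PredictionMetrics` and the sample values are assumptions for illustration; this section does not state which metrics the thesis uses, and ranking accuracy is not covered by the sketch.

```java
class PredictionMetrics {
    // Mean absolute error: average of |predicted - actual|.
    static double mae(double[] predicted, double[] actual) {
        double sum = 0;
        for (int i = 0; i < predicted.length; i++) sum += Math.abs(predicted[i] - actual[i]);
        return sum / predicted.length;
    }

    // Root mean squared error: penalizes large errors more heavily than MAE.
    static double rmse(double[] predicted, double[] actual) {
        double sum = 0;
        for (int i = 0; i < predicted.length; i++) {
            double e = predicted[i] - actual[i];
            sum += e * e;
        }
        return Math.sqrt(sum / predicted.length);
    }

    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) sum += v;
        return sum / values.length;
    }

    // Pearson correlation in [-1, 1]; returns NaN if either array has zero variance.
    static double pearson(double[] predicted, double[] actual) {
        double mp = mean(predicted), ma = mean(actual);
        double cov = 0, vp = 0, va = 0;
        for (int i = 0; i < predicted.length; i++) {
            double dp = predicted[i] - mp, da = actual[i] - ma;
            cov += dp * da;
            vp += dp * dp;
            va += da * da;
        }
        return cov / Math.sqrt(vp * va);
    }

    public static void main(String[] args) {
        // Hypothetical predicted and actual ratings, for illustration only.
        double[] predicted = {4, 3, 5};
        double[] actual = {5, 3, 4};
        System.out.println("MAE = " + mae(predicted, actual));
        System.out.println("RMSE = " + rmse(predicted, actual));
        System.out.println("Pearson = " + pearson(predicted, actual));
    }
}
```

Lower MAE and RMSE correspond to the first objective (smaller prediction errors), and a Pearson value closer to 1 corresponds to the second (higher correlation between predicted and actual values).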
1.9 Research Questions
This section presents the key questions that define the goals and scope of this research. The following questions are addressed during the implementation stage:
1. How can a multi-criteria RS improve on the traditional recommender system approach?
2. How can fuzzy logic be applied in a multi-criteria recommender system?
3. Which model is best suited to the development of a fuzzy-based RS?
1.10 Thesis Structure
This thesis is organized into five chapters:
Chapter 1: This chapter introduces recommender systems, traditional RS, multi-criteria RS, and fuzzy sets and logic, and presents the aim and objectives, the research questions and the methodology.
Chapter 2: This chapter reviews existing literature and previous work on recommender systems, multi-criteria RS and fuzzy logic systems.
Chapter 3: This chapter describes the frameworks and techniques used: collaborative filtering (CF) techniques, the model-based approach, the aggregation function, Asymmetric Singular Value Decomposition, the general architecture of a fuzzy-based RS and the architectural framework of the proposed system.
Chapter 4: This chapter focuses on the implementation of the proposed system. It presents the experiments and evaluations carried out and the results achieved.
Chapter 5: This chapter presents the conclusions drawn from the findings, the contributions of the research, the challenges encountered, and directions for possible future research.