Recommendation engine integrated with Mana project RFC

From Koha Wiki

Recommendation system for OPAC which integrates with the Mana database

Status: unknown
Sponsored by: Catalyst IT
Developed by: Alex Buckley
Expected for: 2017-05-22
Bug number: Bug 18646
Work in progress repository: link(s) to currently published work on this RFC
Description: At present the Koha OPAC has no recommendation system that suggests items to patrons in the way that, for example, the Amazon recommendation engine does.

This enhancement will add a recommendation system to Koha.

At this stage of the project I plan to have the recommendation script use content-based filtering rather than collaborative filtering. The former stores specific details about each item and recommends other items with similar characteristics, whilst the latter uses users' circulation history to recommend items based on what other users have checked out.
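To make "recommends other items with similar characteristics" concrete, here is a minimal sketch that ranks items by how many attributes they share. It is only an illustration under stated assumptions: the catalogue, the item identifiers, the attribute labels and the function names `jaccard` and `similar_items` are all hypothetical, and Jaccard overlap is just one simple similarity measure, not necessarily the one this RFC would end up using.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of attributes two items have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_items(
    target: str, catalogue: dict[str, set[str]], top_n: int = 3
) -> list[tuple[str, float]]:
    """Rank every other item by attribute overlap with the target item."""
    scores = [
        (item_id, jaccard(catalogue[target], attrs))
        for item_id, attrs in catalogue.items()
        if item_id != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical catalogue: identifiers and attribute labels are invented.
catalogue = {
    "item-1": {"fantasy", "dragons", "young-adult"},
    "item-2": {"fantasy", "dragons", "adult"},
    "item-3": {"cookery", "baking"},
}

ranked = similar_items("item-1", catalogue)
```

Note that nothing here depends on who borrowed what: only the item attributes are consulted, which is the property that distinguishes this approach from collaborative filtering.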

The reason I do not want to build a recommendation system based on users' circulation history is that the US Patriot Act and French law make it problematic to retain users' data for long periods. Using content-based filtering appears, at this stage, to avoid this issue. In content-based filtering, each item's details are extracted, and a user profile is generated from the key attributes of the items. The item details will be represented using an item representation algorithm such as tf-idf.
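As an illustration of the tf-idf step, the sketch below turns short item descriptions into weighted term vectors using only the Python standard library. The sample records, the whitespace tokenisation and the function name `tf_idf` are assumptions made for the example; a real implementation would draw on the bibliographic records and would probably need proper tokenisation and stop-word handling.

```python
import math
from collections import Counter


def tf_idf(documents: dict[str, str]) -> dict[str, dict[str, float]]:
    """Return a term -> weight vector for each document.

    tf  = term count / total terms in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenised = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    n_docs = len(tokenised)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenised.values():
        doc_freq.update(set(tokens))

    vectors = {}
    for doc_id, tokens in tokenised.items():
        counts = Counter(tokens)
        total = len(tokens)
        vectors[doc_id] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


# Hypothetical item descriptions, invented for this example.
records = {
    "item-1": "dragons fantasy quest",
    "item-2": "dragons fantasy war",
    "item-3": "baking bread recipes",
}
vectors = tf_idf(records)
```

Terms that appear in only one record (such as "quest") receive a higher weight than terms shared by several records (such as "dragons"), so the vector emphasises what is distinctive about each item.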

The user profile, meanwhile, is a combination of “1. A model of the user’s preference. 2. A history of the user’s interaction with the recommender system” (https://en.wikipedia.org/wiki/Recommender_system#Content-based_filtering), which together describe the kinds of attributes that the user likes. These attributes are weighted according to how important each one is to the user. The weighting is generated using machine learning algorithms such as “Bayesian Classifiers, cluster analysis, decision trees, and artificial neural networks” (https://en.wikipedia.org/wiki/Recommender_system#Content-based_filtering). Because I hope to build a machine learning algorithm into the content-based filtering, and a larger data set makes the recommendation system more accurate, the aim is to integrate the recommendation system with the Mana global database.
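To show what one of the algorithm families named above might look like in this setting, here is a small naive Bayes sketch that estimates whether a user would like an item from its attributes. It is not a chosen design: the training examples, the liked/disliked labels and the function names `train` and `prob_liked` are hypothetical, and the sketch assumes at least one liked and one disliked example exist.

```python
from __future__ import annotations

import math
from collections import Counter


def train(examples: list[tuple[set[str], bool]]):
    """Count attribute occurrences separately for liked and disliked items."""
    attr_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    vocabulary: set[str] = set()
    for attrs, liked in examples:
        class_counts[liked] += 1
        attr_counts[liked].update(attrs)
        vocabulary |= attrs
    return attr_counts, class_counts, vocabulary


def prob_liked(attrs: set[str], attr_counts, class_counts, vocabulary) -> float:
    """Estimate P(liked | attributes) with add-one (Laplace) smoothing.

    Assumes both classes have at least one training example.
    """
    total_examples = sum(class_counts.values())
    log_scores = {}
    for label in (True, False):
        total_attrs = sum(attr_counts[label].values())
        score = math.log(class_counts[label] / total_examples)
        # Attributes never seen during training are ignored.
        for attr in attrs & vocabulary:
            score += math.log(
                (attr_counts[label][attr] + 1) / (total_attrs + len(vocabulary))
            )
        log_scores[label] = score
    # Convert the two log scores back into a probability for "liked".
    liked = math.exp(log_scores[True])
    disliked = math.exp(log_scores[False])
    return liked / (liked + disliked)


# Hypothetical interaction history: (item attributes, did the user like it?)
history = [
    ({"fantasy", "dragons"}, True),
    ({"fantasy", "quest"}, True),
    ({"cookery", "baking"}, False),
]
model = train(history)
p_fantasy = prob_liked({"fantasy", "dragons"}, *model)
p_cookery = prob_liked({"cookery", "baking"}, *model)
```

With more training examples the attribute counts become more reliable, which is the practical reason a larger shared data set would help.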

I would also like to take user ratings into account in the content-based filtering, as they can influence the attribute weightings in the user profiles. The user profiles and item attributes will be stored persistently in the database.
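One possible way for ratings to influence the attribute weightings is sketched below: each rated item's attribute vector is scaled by the user's rating and summed into the profile, and unseen items are then ranked by cosine similarity to that profile. The rating scale, the item vectors and the function names `build_profile`, `cosine` and `recommend` are all assumptions for illustration, and persistence to the database is left out.

```python
from __future__ import annotations

import math


def build_profile(
    ratings: dict[str, float], item_vectors: dict[str, dict[str, float]]
) -> dict[str, float]:
    """Sum each rated item's attribute weights, scaled by the user's rating."""
    profile: dict[str, float] = {}
    for item_id, rating in ratings.items():
        for attr, weight in item_vectors[item_id].items():
            profile[attr] = profile.get(attr, 0.0) + rating * weight
    return profile


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse attribute vectors."""
    dot = sum(weight * b.get(attr, 0.0) for attr, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(
    profile: dict[str, float],
    item_vectors: dict[str, dict[str, float]],
    seen: set[str],
    top_n: int = 3,
) -> list[tuple[str, float]]:
    """Rank items the user has not rated by similarity to the profile."""
    scores = [
        (item_id, cosine(profile, vector))
        for item_id, vector in item_vectors.items()
        if item_id not in seen
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical item vectors and a single rating on an assumed 1-5 scale.
item_vectors = {
    "item-1": {"fantasy": 1.0, "dragons": 1.0},
    "item-2": {"fantasy": 1.0, "war": 1.0},
    "item-3": {"cookery": 1.0, "baking": 1.0},
}
ratings = {"item-1": 5.0}

profile = build_profile(ratings, item_vectors)
recommendations = recommend(profile, item_vectors, seen=set(ratings))
```

Because the profile is a plain mapping from attribute to weight, it could be stored as rows of (user, attribute, weight) and recomputed whenever a new rating arrives.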