Recommender systems (RS) have been successfully explored in a vast number of domains, e.g. movies and TV shows, music, or e-commerce. In these domains, a large number of datasets are freely available for testing and evaluating new recommender algorithms, for example the MovieLens and Netflix datasets for movies, Spotify for music, and Amazon for e-commerce. This availability translates into a large number of algorithms applied to these fields.
In scientific fields, such as Health and Chemistry, standard open-access datasets with information about users' preferences are scarce.
Several aspects must be considered when building recommender systems for these fields. First, it is important to understand the application domain, i.e. “what the recommended item is”. Second, who the end users are: researchers, pharmacists, clinicians or policy makers. Third, the availability of data: if we wish to develop an algorithm for recommending scientific items, we do not have access to datasets with information about the past preferences of a group of users. Given this limitation, we developed a methodology called LIBRETTI (LIterature Based RecommEndaTion of scienTific Items), whose goal is the creation of <user, item, rating> datasets related to scientific fields.
The datasets are created from the main knowledge resource available to science: the scientific literature.
We consider the users as the authors of the publications, the items as the scientific entities (for example chemical compounds or diseases), and the ratings as the number of publications an author wrote about an entity.
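The mapping described above can be illustrated with a minimal Python sketch. It is not the LIBRETTI implementation: the function name `build_ratings`, the input format, and the author and entity labels are hypothetical, and the sketch assumes the entities of each article have already been extracted (for example by NER and NEL).

```python
from collections import Counter


def build_ratings(articles):
    """Count, for each (author, entity) pair, the articles that link them.

    `articles` is an assumed input format: a list of dicts with the keys
    "authors" and "entities".
    """
    ratings = Counter()
    for article in articles:
        # Sets ensure a repeated author or entity in one article counts once.
        for author in set(article["authors"]):
            for entity in set(article["entities"]):
                ratings[(author, entity)] += 1
    return ratings


# Hypothetical toy data; real entities would come from the literature.
articles = [
    {"authors": ["Author 1", "Author 2"], "entities": ["compound_A", "disease_B"]},
    {"authors": ["Author 1"], "entities": ["compound_A"]},
]

ratings = build_ratings(articles)

# Each line is one <user, item, rating> triple.
for (author, entity), count in sorted(ratings.items()):
    print(author, entity, count)
```

Here the rating is a raw publication count, so authors with many publications receive larger values; any normalization would be an additional design choice not covered by this sketch.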
4:00 - 5:10 - Part I: Introduction
5:10 - 5:20 - Coffee Break
5:20 - 6:50 - Part II: Creating recommendation dataset through scientific literature
6:50 - 7:00 - Part III: Discussion
Introduction to recommender systems
Scientific recommender systems
Introduction to Named Entity Recognition (NER) and Named Entity Linking (NEL)
LIBRETTI pipeline
Retrieve the research articles related to COVID-19
Named Entity Recognition (NER) and Linking (NEL)
Creating the recommendation dataset
Open discussion