Sturgeon and the Cool Kids: Problems with Top-N Recommender Evaluation
2017. Sturgeon and the Cool Kids: Problems with Top-N Recommender Evaluation. In Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference. AAAI, 639–644.
Top-N evaluation of recommender systems, typically carried out using metrics from information retrieval or machine learning, has several challenges. Two of these challenges are popularity bias, where the evaluation intrinsically favors algorithms that recommend popular items, and misclassified decoys, where items for which no user relevance is known are actually relevant to the user, but the evaluation is unaware and penalizes the recommender for suggesting them. One strategy for mitigating the misclassified decoy problem is the one-plus-random evaluation strategy and its generalization, which we call random decoys. In this work, we explore the random decoy strategy through both a theoretical treatment and an empirical study, but find little evidence to guide its tuning and show that it has complex and deleterious interactions with popularity bias.
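As a rough illustration of the random decoy idea described in the abstract, the following minimal sketch builds a candidate set from one held-out test item plus N randomly sampled items the user has not rated, then checks whether a scorer ranks the test item in the top k. The function names, the toy item catalog, and the popularity-like scorer are all assumptions made for this example; it is not the code used in the paper.

```python
import random


def random_decoy_candidates(test_item, all_items, rated_items, n_decoys, rng):
    """Return the held-out test item plus n_decoys randomly chosen unrated items."""
    # Decoys are drawn only from items with no known rating for this user.
    pool = sorted(set(all_items) - set(rated_items) - {test_item})
    decoys = rng.sample(pool, n_decoys)
    return [test_item] + decoys


def hit_at_k(scores, test_item, candidates, k):
    """Return True if test_item is among the k highest-scored candidates."""
    ranked = sorted(candidates, key=lambda item: scores.get(item, 0.0), reverse=True)
    return test_item in ranked[:k]


# Toy data (hypothetical): 100 items, a user who rated three of them in training.
rng = random.Random(42)
all_items = list(range(100))
rated = {1, 2, 3}
test_item = 7

candidates = random_decoy_candidates(test_item, all_items, rated, n_decoys=20, rng=rng)

# Hypothetical scorer that favors low item IDs, standing in for a popularity-driven recommender.
scores = {item: 1.0 / (item + 1) for item in all_items}
hit = hit_at_k(scores, test_item, candidates, k=5)
```

The number of decoys (`n_decoys` here) is the tuning parameter the abstract refers to. Because unrated decoys may in fact be relevant to the user, and because the scorer may favor popular items, the outcome of `hit_at_k` depends on both effects at once.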
- Copy in the FLAIRS Proceedings
- Author version PDF (provided in compliance with the AAAI Author Agreement)
- Scripts and code to reproduce
- Slides from FLAIRS 2017 presentation
The citation for Theodore Sturgeon's essay (Sturgeon 1958) is incorrect. The correct citation is:
Sturgeon, T. 1958. “ON HAND: A Book.” Venture Science Fiction, 2 (2): 66. Concord, NH: Fantasy House, Inc.