Measuring Fairness in Ranked Results
2022. Measuring Fairness in Ranked Results: An Analytical and Empirical Comparison. To appear in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22).
Information access systems, such as search and recommender systems, often use ranked lists to present results believed to be relevant to the user’s information need. Evaluating these lists for fairness, alongside traditional metrics, provides a more complete understanding of an information access system’s behavior beyond accuracy or utility constructs. To measure the (un)fairness of rankings, particularly with respect to protected group(s) of producers or providers, several metrics have been proposed in the last several years. However, an empirical and comparative analysis of these metrics, showing their applicability to specific scenarios or real data and their conceptual similarities and differences, is still lacking. We aim to bridge the gap between the theory and the practical application of these metrics. In this paper we describe several fair ranking metrics from the existing literature in a common notation, enabling direct comparison of their approaches and assumptions, and empirically compare them on the same experimental setup and data sets in the context of three information access tasks. We also provide a sensitivity analysis to assess the impact of the design choices and parameter settings that go into these metrics, and point to additional work needed to improve fairness measurement.