I spend most of my time bridging between the recommender systems and fairness, accountability, & transparency research spaces. While a chunk of that energy is devoted to figuring out fair recommendation and bringing the concerns of the FAT* agenda to recsys, I also think the recommender systems field has a lot to offer to fairness research.
Data Sets
RecSys has a long history of public data sets driving research forward. Famously, the Netflix prize attracted a lot of fresh energy to the recommender systems research space and spurred critical new algorithmic developments (although that data set is no longer generally available). The MovieLens data set has driven hundreds of research papers. Julian McAuley and his students have been amassing a substantial collection of data sets for studying various recommendation-related problems.
Some of these data sets are also useful for fairness research. Occasionally, they will contain some user demographic information (e.g. the MovieLens 1M and Last.fm sets have gender and age). Users’ rating or reviewing activity can also be used to identify subsets of users of particular interest; in particular, there is a long-standing line of research on how to help new users, known as the cold start problem, but there are other conceivable user groups who may experience disparate quality of service or other impacts as well.
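As a rough illustration, here is a minimal sketch of pulling demographic and activity-based user groups out of MovieLens 1M with pandas. It assumes the data set’s standard `users.dat` and `ratings.dat` layout; the column names and the low-activity threshold are illustrative choices, not part of the data set.

```python
import pandas as pd

# MovieLens 1M layout (per its README): '::'-separated fields.
users = pd.read_csv("ml-1m/users.dat", sep="::", engine="python",
                    names=["user", "gender", "age", "occupation", "zip"],
                    encoding="latin-1")
ratings = pd.read_csv("ml-1m/ratings.dat", sep="::", engine="python",
                      names=["user", "item", "rating", "timestamp"],
                      encoding="latin-1")

# Demographic groups available directly from the data.
print(users.groupby("gender")["user"].count())

# Activity-based groups: flag low-activity users with an (illustrative) threshold.
profile_size = ratings.groupby("user").size().rename("n_ratings")
users = users.join(profile_size, on="user")
users["low_activity"] = users["n_ratings"] < 50

print(users.groupby(["gender", "low_activity"]).size())
```

Splitting evaluation metrics over groups like these is often the first step in checking for disparate quality of service.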
We can also look at item or content creator attributes. We’ve been connecting book rating data to other public data, such as OpenLibrary and the Virtual International Authority File, to study author gender and book recommendation.
We’re just beginning to explore the possibilities in the fair recommendation research space. I am sure that there is far more that we can do with the data sets used for recsys research. Of greater interest to the FAT* community at large, however, I expect that there are modeling, evaluation, and algorithmic methods that will translate out of recommendation into other applications and problem settings. I hope that recommendation can be a test bed for work that is intended for settings where data is less readily available. We have to be careful of [abstraction traps][trap], of course; but those traps don’t mean we can never abstract, just that we need to be explicit about our abstractions and test their translatability.
Explanations
RecSys also has a long line of research in explaining recommendations. This work has laid foundations that are relevant for transparency, explainability, and interpretability work in broader artificial intelligence applications.
In the 90s, Judy Kay and collaborators were working on scrutable user models: the idea is to enable the user to inspect and correct the system’s model of their preferences. This has significant connections both to transparency and to some versions of accountability, and anticipates some of the specific goals and concerns of the GDPR.
Recommender explanations also have different purposes, calling for different approaches, and I think these distinctions carry over to other applications as well. Very broadly, we often divide explanations into two categories:
- **Explanations** explain why the recommender suggested the things it did. These are tied to the recommender’s actual operation; for example, if we use a nearest-neighbor collaborative filter, we can explain a recommendation by listing the top 2–3 neighbors that produced it (sketched in the code below).
- **Justifications** explain why the recommendation might be a good fit for the user’s preference or information need, and are not necessarily tied to the algorithm’s operating principles. They can be produced by reverse-engineering possible reasons for a recommendation, e.g. using item tags, for a recommendation list provided by an arbitrary (and arbitrarily complex) mechanism.
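To make the first category concrete, here is a toy sketch of the raw material for a neighbor-based explanation: given a user-item rating matrix, it finds the target user’s most similar users who rated the candidate item. The dense matrix, cosine similarity, and function name are assumptions for illustration, not how any particular deployed recommender works.

```python
import numpy as np

def neighbor_explanation(ratings, user, item, k=3):
    """Top-k most similar users who rated `item`, as explanation material.

    ratings: dense user x item matrix, 0 means unrated (toy setting).
    Returns (neighbor id, similarity, neighbor's rating of the item) tuples.
    """
    # Cosine similarity between the target user and every other user.
    norms = np.linalg.norm(ratings, axis=1)
    norms[norms == 0] = 1.0
    sims = ratings @ ratings[user] / (norms * np.linalg.norm(ratings[user]))
    sims[user] = -np.inf  # never use the user as their own neighbor

    # Only neighbors who actually rated the item can support a prediction.
    candidates = np.where(ratings[:, item] > 0)[0]
    candidates = candidates[np.argsort(-sims[candidates])][:k]
    return [(int(u), round(float(sims[u]), 3), float(ratings[u, item]))
            for u in candidates]

# Toy matrix: 4 users x 3 items; explain a recommendation of item 1 to user 0.
R = np.array([[5, 0, 4],
              [4, 2, 5],
              [1, 5, 0],
              [5, 1, 4]], dtype=float)
print(neighbor_explanation(R, user=0, item=1))
```

An interface would then render those neighbors (or summary statistics about them) as the explanation; a justification, by contrast, could be built from item metadata without touching this computation at all.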
Explainable AI is also approaching these (and other) distinctions. Crucially, though, the appropriate nature of explanation depends on the purpose of the explanation. If your goal is to help the user make an informed decision about the relevance of an item to their context, the precise computational reasons behind the recommendations are not necessarily important. What the user needs is information about the recommendation and its relationship to their needs that helps them evaluate it. But if the purpose is to reveal the user model and its implications — for example, in Facebook’s ad targeting explanations — honesty requires that the explanations be meaningfully connected to the algorithm’s operation.
Complex Problem Spaces
Fair recommendation is also a complex problem space. There’s far more to say about this than I can say in a blog post, but for starters, it is often a [multisided problem][burke]: users and providers — at least — have fairness concerns. Sometimes the same people have concerns on both sides of a recommendation problem: in a platform like LinkedIn, for example, users need equitable opportunity to appear in candidate search results on the recruiter product, and they also should not experience discrimination in the job listings the platform recommends to them.
This setting, along with other complexities such as the ranking nature of most recommendation problems, introduces technical difficulties beyond those of classification problems. I want to be clear: justice demands that technical complexity be a subordinate concern when judging a problem’s importance. But varying problem characteristics are useful for a number of things, including characterizing the boundary conditions of our methods, metrics, and concepts.
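As one small example of the kind of machinery rankings call for, here is a sketch that measures how position-discounted exposure in a single recommendation list is split across provider groups, using a DCG-style logarithmic discount. The discount, the group labels, and the function name are assumptions for illustration; provider-side fairness can be formalized in many other ways.

```python
import math
from collections import defaultdict

def group_exposure(ranking, item_groups):
    """Share of position-discounted exposure each provider group receives.

    ranking: list of item ids, best first.
    item_groups: dict mapping item id -> provider group label.
    Uses a 1/log2(rank + 1) discount; other discounts are equally plausible.
    """
    exposure = defaultdict(float)
    for rank, item in enumerate(ranking, start=1):
        exposure[item_groups[item]] += 1.0 / math.log2(rank + 1)
    total = sum(exposure.values())
    return {group: value / total for group, value in exposure.items()}

# Toy example: a 5-item recommendation list with providers from groups A and B.
ranking = ["i1", "i2", "i3", "i4", "i5"]
groups = {"i1": "A", "i2": "A", "i3": "B", "i4": "A", "i5": "B"}
print(group_exposure(ranking, groups))
```

Aggregating such per-list shares over many users, and deciding what distribution to compare them against, is where the harder modeling choices come in.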
What Next?
These opportunities are part of why I am so excited about fair recommendation and about building a community to work on it.
There’s also a lot more to say and write about this crossover. I welcome feedback; I’m also interested in presenting this argument in some longer-form venue, but haven’t yet figured out the right one.