Blog Articles 81–85

Why Microsoft?

This is a joint post by Michael and Jennifer.

We each started using Linux more than a decade ago, and for our entire married life, we have been a primarily Linux-based household.

This spring, we decided to finally get smartphones. In the course of making this decision, and selecting our phones, we reevaluated many aspects of our technology use. This has resulted in a number of changes that many may find surprising:

  • We carry Nokia phones running Windows Phone 8.1.
  • E-mail service for elehack.net is now hosted by Microsoft, via their hosted Exchange service as part of an Office 365 business subscription.
  • We are running mainly Windows on our personal laptops.
  • We use Outlook for our e-mail, contacts, and calendars.
  • We use OneDrive for Business and SharePoint to ferry data between our devices and coordinate shared data for our household.

Old Papers on Recommender Systems

There’s a lot of research on recommender systems. There’s a lot of other research that, while not directly mentioning recommenders, is very relevant, including research from decades ago.

A few of my favorite old papers that I think recommender systems researchers would do well to read (and perhaps cite):

  • Back to Bentham? Explorations of experienced utility (Kahneman et al., 1997) — how people experience and remember pain and pleasure. Strong implications for what ratings mean and what kind of utility our recommenders should optimize for (a toy illustration follows this list).

  • User Modeling via Stereotypes (Rich, 1979) — the first computer-based recommender system that I know about.

  • A searching procedure for information retrieval (Goffman, 1964) — this early IR paper has the crucial insight that the relevance of an item in a search result list (or recommendation list) is not independent of the items that appear before or after it. Rather, an item may be less relevant if it is (partially) redundant with a previous item (see the sketch just below).
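
Goffman’s insight anticipates later diversification techniques such as maximal marginal relevance (Carbonell & Goldstein, 1998), which greedily trade an item’s relevance off against its redundancy with the items already chosen. A minimal sketch of the idea in Python, where the relevance scores and the similarity function are stand-ins a real system would have to supply:

    def rerank_mmr(candidates, relevance, similarity, k, trade_off=0.7):
        """Greedily pick k items, trading each item's relevance off
        against its redundancy with the items already selected.

        candidates: iterable of item ids
        relevance:  dict mapping item id -> relevance score (stand-in)
        similarity: function (item, item) -> value in [0, 1] (stand-in)
        """
        selected = []
        pool = set(candidates)
        while pool and len(selected) < k:
            def score(item):
                # An item loses value if it resembles something already shown.
                redundancy = max((similarity(item, s) for s in selected),
                                 default=0.0)
                return trade_off * relevance[item] - (1 - trade_off) * redundancy
            best = max(pool, key=score)
            selected.append(best)
            pool.remove(best)
        return selected

With trade_off = 1 this degenerates to plain relevance ranking; lower values push the list toward covering distinct interests rather than repeating one.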

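To make the Kahneman paper’s point concrete: one well-known finding from that line of work is the peak-end pattern, in which the remembered quality of an experience tracks its best moment and its final moment more closely than its average. A toy illustration with made-up numbers:

    # Momentary enjoyment over two hypothetical experiences (made-up numbers).
    steady = [6, 6, 6, 6, 6, 6]   # consistently decent
    uneven = [3, 4, 9, 5, 3, 8]   # weaker on average, strong peak and ending

    def average_utility(moments):
        return sum(moments) / len(moments)

    def peak_end_utility(moments):
        # Remembered utility approximated by the mean of the peak and the end.
        return (max(moments) + moments[-1]) / 2

    for name, m in [("steady", steady), ("uneven", uneven)]:
        print(name, round(average_utility(m), 2), peak_end_utility(m))
    # Average utility favors 'steady' (6.0 vs 5.33); peak-end favors
    # 'uneven' (8.5 vs 6.0). A post-hoc star rating may behave like the latter.

If ratings behave more like remembered utility than experienced utility, then optimizing mean predicted rating may not be optimizing what we think it is.
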
Comparing Recommendation Lists

In my research, I am trying to understand how different recommender algorithms behave in different situations. We’ve known for a while that ‘different recommenders are different’, to paraphrase Sean McNee. However, we lack thorough data on how they differ across a variety of contexts. Our RecSys 2014 paper, User Perception of Differences in Recommender Algorithms (written with Max Harper, Martijn Willemsen, and Joseph Konstan), reports on an experiment that we ran to collect some of this data.

I have done some work on this subject in offline contexts already; my When Recommenders Fail paper looked at contexts in which different algorithms make different mistakes. LensKit makes it easy to test many different algorithms in the same experimental setup. This experiment brings my research back into the realm of user studies: directly measuring the ways in which users experience the output of different algorithms as being different.
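
The paper describes the actual study design; as a rough sketch of what running several algorithms through one shared offline setup looks like, here is a hypothetical harness in Python. The Recommender interface, the overlap measure, and all the names are illustrative stand-ins, not LensKit’s API:

    from typing import Protocol

    class Recommender(Protocol):
        # Stand-in interface, not LensKit's actual API.
        def fit(self, train): ...
        def recommend(self, user, n): ...

    def overlap(xs, ys):
        """Jaccard overlap between two top-N lists."""
        a, b = set(xs), set(ys)
        return len(a & b) / len(a | b) if a | b else 1.0

    def compare(algorithms, train, users, n=10):
        """Train each algorithm on identical data, then measure how much
        their top-N lists for the same users actually differ."""
        lists = {}
        for name, algo in algorithms.items():
            algo.fit(train)
            lists[name] = {u: algo.recommend(u, n) for u in users}
        names = sorted(lists)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                mean = sum(overlap(lists[a][u], lists[b][u])
                           for u in users) / len(users)
                print(f"{a} vs {b}: mean top-{n} overlap = {mean:.2f}")

Measuring how much the lists differ is, of course, only the offline half; the experiment in the paper asks users directly how they perceive those differences.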

Vegan Biscuits and Gravy

I love biscuits and gravy. Their traditional form doesn’t work for our family, however, so over the last few months I’ve adapted and refined a recipe for vegan biscuits and gravy that is now our standard Sunday morning breakfast.

This recipe makes enough biscuits for 4 and gravy for 2.

  • Updated November 1: more gravy improvements.
  • Updated May 11: improved gravy recipe.
  • Updated September 29: more gravy improvements.

To filter or not to filter?

Rumors have been afloat that Twitter may be making a significant change to its service: moving away from the reverse-chronological timeline in favor of an algorithmically tuned news feed. And Zeynep Tufekci’s critique of this prospect made the rounds, in waves, through my Twitter stream.

I must confess, my initial reading of Tufekci’s article (as a recommender systems researcher and developer) was somewhat knee-jerk. I latched on to this statement:

An algorithm can perhaps surface guaranteed content, but it cannot surface unexpected, diverse and sometimes weird content exactly because of how algorithms work: they know what they already know.

This statement strikes me as overreaching. ‘Cannot’ is a strong claim to make, with a high evidentiary bar, and I don’t think we yet know enough about the capabilities and limits of algorithms for capturing user interest to say what they cannot do.
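
Part of my hesitation is that nothing in principle stops an algorithmic feed from deliberately reserving slots for items outside a user’s inferred interests, much as exploration strategies in bandit-style recommenders already do. A minimal epsilon-greedy sketch, where every function name is a hypothetical stand-in:

    import random

    def build_feed(user, personalized_ranking, sample_outside_profile,
                   k=20, epsilon=0.15):
        """Fill most slots from the personalized ranking, but hand a
        random fraction of slots to items the model would not normally
        show this user. Both function parameters are hypothetical:

        personalized_ranking(user, n)  -> top-n personalized items
        sample_outside_profile(user)   -> one item outside the user's
                                          inferred interests
        """
        picks = iter(personalized_ranking(user, k))
        feed = []
        for _ in range(k):
            if random.random() < epsilon:
                feed.append(sample_outside_profile(user))
            else:
                item = next(picks, None)
                feed.append(item if item is not None
                            else sample_outside_profile(user))
        return feed

Whether that kind of injected novelty yields the serendipity Tufekci values is an empirical question, and that is exactly the point: it is something to test, not something to rule out by definition.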