Michael Ekstrand
GroupLens Research
Dept. of Computer Science and Engineering
University of Minnesota
Michael Ekstrand
B.S. Computer Engineering, Iowa State University (2007)
Finishing Ph.D. in Computer Science, University of Minnesota (2014)
GroupLens Research (HCI & social computing)
Advised by Joseph Konstan and John Riedl
http://elehack.net
Research Overview
Helping users find, filter, navigate, and understand large-scale information spaces.
Topics
Recommender systems
Help search
Wiki history browsing
HPC network topology visualization
Methods
Offline data analysis & experiments
User studies
System building
Current Research Objective
Recommender research should be
reproducible
generalizable
grounded in user needs
so we can engineer information solutions and understand recommender-assisted decision-making
Recent & Ongoing Work
Infrastructure
LensKit toolkit
Reproducible research
Reproduce algorithms
Deploy results
Experiments
Offline w/ public data
User studies
Overview
Background
Tools
Experiment
Going Forward
Recommender Systems
recommending items to users
Recommender Research
Defined more by application than technology
Existed as early as 1979 (Rich's Grundy)
Modern form introduced in 1994 (Resnick et al.)
RecSys conference in its 8th year, ~300 attendees
Mix of HCI, ML/IR, business
Strong industrial presence
Common Approaches
Non-personalized
Content-based [Balabanović, 1997; others]
Collaborative filtering
User-based [Resnick et al., 1994] (see formula below)
Item-based [Sarwar et al., 2001]
Matrix factorization [Sarwar et al., 2000; Funk, 2006]
Hybrid approaches [Burke, 2002]
Learning to Rank
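For reference, the classic user-based prediction rule [Resnick et al., 1994] is the user's mean rating plus a similarity-weighted average of neighbors' mean-centered ratings (standard formulation, shown here for concreteness):

\hat{r}_{ui} = \bar{r}_u + \frac{\sum_{v \in N(u)} \mathrm{sim}(u,v)\,(r_{vi} - \bar{r}_v)}{\sum_{v \in N(u)} \left|\mathrm{sim}(u,v)\right|}

where N(u) is u's neighborhood and \bar{r}_u is u's mean rating.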
Evaluating Recommenders
Many measurements:
ML/IR-style data set experiments
User studies
A/B testing
Engagement metrics
Business metrics
Common R&D Practice
Develop recommender tech (algorithm, UI, etc.)
Test on particular data/use case
See if it is better than baseline
Publish or deploy
Learned: is new tech better for target application?
Not learned: for what applications is new tech better? why?
Algorithms are Different
Algorithms perform differently
No reason to believe one size fits all
Quantitatively similar algorithms can have qualitatively different results [McNee, 2006]
Different algorithms make different errors [Ekstrand, 2012]
Opportunity to tailor system to task
Doesn't even count other system differences!
Building a Boat Shed
An Analogy
Current practice:
build shed
A/B test: roof with 2 different materials
measure winter survival
buy a new boat
Building a Boat Shed
An Analogy
Better practice
compute span, snow load, etc.
determine adequate roofing structure
build boat shed
enjoy the boat next summer
Recommender Engineering
Recommender engineering is
designing and building a recommender system
for a particular application
from well-understood principles of algorithm behavior, application needs, and domain properties.
What Do We Need?
To enable recommender engineering, we need to understand:
Algorithm behaviors and characteristics
Relevant properties of domains, use cases, etc. (applications)
How algorithms and applications interact to determine suitability
All of this must be reproducible, validated, and generalizable.
My work
LensKit
enables reproducible research on a wide variety of algorithms
Offline experiments
validate LensKit
demonstrate algorithm differences
improve engineering
User study
obtain user judgements of algorithm differences
currently ongoing
Overview
Background
Tools
Experiment
Going Forward
LensKit is an open-source toolkit for building, researching, and studying recommender systems.
LensKit
build
prototype and study recommender applications
research algorithms with users
deploy research results in live systems
research
reproduce and validate results
new experiments with old algorithms
make research easier
provide good baselines
study
learn from production-grade implementations
LensKit Features
Common APIs for recommender use cases
Infrastructure for building and using algorithms
Implementations of well-known algorithms
Evaluation framework for offline data analysis
Tools for working with algorithms, models, and configurations
LensKit Project
Started in 2010
Supported by undergraduate & graduate students, staff, etc.
~43K lines of Java (with some Groovy)
Open source / free software under LGPL v2.1+
Developed in public (GitHub, Travis CI)
Competitors: Mahout, MyMediaLite, others
Design Challenges
We want:
Flexibility (reconfigure to match and try many algorithm setups)
Performance (so it works on real data with reasonable resources)
Readability (so it can be understood)
Component Architecture
For flexibility and ease-of-use, we use:
Modular algorithm designs
Automatic tooling for recommender components
Practical configuration
Efficient experimentation
Easy web and database integration
Modular Algorithms
Break algorithms into separate components where feasible
Neighborhood finders
Similarity functions
Normalizers
Anything else!
Full recommender consists of many interoperating components
Components specified with interfaces, implementations can be swapped out
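A minimal Java sketch of this idea (illustrative names, not the actual LensKit interfaces):

interface SimilarityFunction {
    double similarity(long a, long b);
}

class CosineSimilarity implements SimilarityFunction {
    public double similarity(long a, long b) {
        return 0.0; // would compute cosine between the two rating vectors
    }
}

class PearsonSimilarity implements SimilarityFunction {
    public double similarity(long a, long b) {
        return 0.0; // would compute Pearson correlation instead
    }
}

// Anything that depends on SimilarityFunction accepts either implementation.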
Dependency Injection
Components receive their dependencies from whatever creates them.
// hard-coded dependency: the similarity function cannot be swapped
public UserUserCF() {
    similarity = new CosineSimilarity();
}

// injected dependency: whoever constructs this chooses the similarity
public UserUserCF(SimilarityFunction sim) {
    similarity = sim;
}
A Little Problem
How do we instantiate this mess?
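By hand, it looks something like this sketch (hypothetical constructors, reusing component names from these slides); every configuration change means editing the wiring code:

// manual wiring: each dependency constructed and threaded explicitly
SimilarityFunction sim = new CosineSimilarity();
UserVectorNormalizer norm = new BaselineSubtractingUserVectorNormalizer(
        new UserMeanItemScorer(new ItemMeanRatingItemScorer()));
NeighborhoodFinder neighbors = new NeighborhoodFinder(sim, norm);
UserUserCF scorer = new UserUserCF(neighbors);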
Dependency Injectors
Extract dependencies from class definitions
Instantiate required components automatically
Several of them in wide use:
Spring
Guice
PicoContainer
JSR 330 specifies common behavior.
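Under JSR 330, a component just annotates its constructor and the injector supplies the arguments (a minimal sketch continuing the UserUserCF example above, not the exact LensKit class):

import javax.inject.Inject;

public class UserUserCF {
    private final SimilarityFunction similarity;

    @Inject // the injector resolves a SimilarityFunction from the configuration
    public UserUserCF(SimilarityFunction sim) {
        similarity = sim;
    }
}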
DI Configuration
// use item-item CF to score items
bind ItemScorer to ItemItemScorer
// subtract baseline score from user ratings
bind UserVectorNormalizer
to BaselineSubtractingUserVectorNormalizer
// use user-item mean rating as baseline
bind (BaselineScorer, ItemScorer) to UserMeanItemScorer
bind (UserMeanBaseline, ItemScorer)
to ItemMeanRatingItemScorer
// the rest configured with defaults
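The same bindings can also be made from Java. This sketch follows my recollection of the LensKit 2.x API (treat package and method names as assumptions; a real configuration also needs a data source binding before build() will succeed):

import org.grouplens.lenskit.ItemRecommender;
import org.grouplens.lenskit.ItemScorer;
import org.grouplens.lenskit.core.LenskitConfiguration;
import org.grouplens.lenskit.core.LenskitRecommender;
import org.grouplens.lenskit.knn.item.ItemItemScorer;

public class ConfigExample {
    public static void main(String[] args) throws Exception {
        LenskitConfiguration config = new LenskitConfiguration();
        // same binding as the first line of the Groovy config above
        config.bind(ItemScorer.class).to(ItemItemScorer.class);
        // assemble the component graph and expose the recommender API
        LenskitRecommender rec = LenskitRecommender.build(config);
        ItemRecommender items = rec.getItemRecommender();
        System.out.println(items.recommend(42, 10)); // top 10 items for user 42
    }
}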