Blog Articles 6–10

Surface Go First Impressions

Surface Go with burgundy type cover

Microsoft has repeatedly tried to make strides into the entry-level market with its Surface devices, and so far none of those attempts have stuck. There was the Surface RT, which used an incompatible processor and couldn't run normal Windows software. The Surface 3 used an Atom CPU and didn't last long. And now there is the Surface Go, a 10" Surface sporting a Pentium processor and full Windows 10.

I have been using Surface Pros for a few years now. I love them, but have also had some reliability issues: my work SP4 has been glitchy as long as I have had it (display freezes), and my personal device ceased to boot about a year and a half after I bought it. They are on the large side for a lot of tablet use cases — a Surface Pro is hard to use as a reading device — but it is fantastic for marking up PDFs and drawing, and I have made significant use of its drawing capabilities in class. The Windows Ink Workspace is very helpful: I can take a screenshot and start drawing on it to mark up different parts of the query we just ran against the database.

But when the Surface Go came out, and I was increasingly frustrated with the display glitch on my SP4, it seemed like a great potential fit. And so far, so good.

What I Need

I work on a combination of my portable device and my desktop workstation. The primary cases where I need my portable device, however, are teaching, meetings, and travel. For that, I want:

  • Small enough to use in tight spaces
  • Lightweight (moving from the 3lb Zenbook Prime to the 1.85lb Surface Pro 4 was a noticeable improvement)
  • Solid battery performance
  • Good performance for basic remote work (web browsing, the Google suite, Office, some programming)
  • Ability to read and mark up PDFs, tablet-style, for review, grading, and student collaborations
  • Ability to run software needed for teaching (DataGrip, sometimes IntelliJ)

The SP4 did these quite well, although its battery (especially in the i7 version with the standard university software load) was underwhelming.

But the SP4 is still a little large for an airport tray table, and I can go only about half a day at a conference before the battery is done. Also, since I am moving my primary software from Java to Python, I no longer need heavy JetBrains IDEs for programming and can instead do almost everything in VS Code.

Surface Go Benefits

Looking at the Surface Go, I saw a number of benefits:

  • Smaller size will work better in airplanes
  • Even less weight (1.15lbs or so)
  • Decent battery (but rated for less life than the 2017 Surface Pro)
  • USB-C, including power delivery support, opening up a wider range of secondary batteries
  • Surface connector, so I can continue to leverage my investment in Surface docks and chargers

The processor is significantly less powerful. I don't fully understand the Pentium line, but the Go's CPU appears to be Core-based rather than an Atom; still, it's no Core i5. However, since my local client processing needs have decreased, that isn't a big deal if it buys me decent battery life.

The USB-C benefit is one of the things that finally sold me. I had looked at battery packs that could charge a Surface Pro, but they were big, heavy, and hard to find. There are quite a few options for USB-C, including several that can provide enough power to charge the Go. The Anker PowerCore+ 26800 has 3x the capacity of the Go's internal battery and produces sufficient wattage to charge it. This opens the door to using my tablet for an entire day of conferencing without needing to find one of the scarce power outlets.

Initial Impressions

Now that I have the device (8GB model w/ 128GB SSD), what do I think?

I think it's going to work out pretty well. Battery life seems pretty good for what I've done so far: a few hours of general usage. I've been using the Edge browser to help keep battery life up.

My hand on the Surface Go keyboard

The keyboard is small. Uncomfortably so, sometimes, but I am writing this post on it. I think this may actually be a benefit: it discourages me from trying to do everything while I am traveling or at home, and nudges me toward my desktop (with better ergonomics) when I am in my office.

The CPU is fast enough for most of what I do. GMail is a little sluggish but usable. General web browsing in Edge is pretty snappy. TweetDeck is slow (typing is surprisingly slow), but it works. Some software installations were very slow (Anaconda and VS Code extensions); the Windows anti-malware scanner was working overtime while they dropped their various files on the SSD. Compiling my web site is also pretty slow. But now that things are installed, everything works pretty well in general (and there's no noticeable lag editing in VS Code).

The display is small, and not quite as dense (it runs at 1.5x scaling instead of the 2x on a Surface Pro), but it is clear and smooth.

The build doesn't feel quite as solid as the Pro's (the kickstand hinge feels a little weaker, and the physical buttons aren't as refined). There's still a magnet to hold the pen on the left side of the display, but the pen tip extends almost all the way to the bottom of the screen, so I'm concerned about damaging the tip if I keep it there most of the time.

But overall, I think it's going to be a good device for my needs.

Spending Startup

If you are starting a tenure-track research-oriented position at a US university, you should have a startup package to help you get started. When I began as a faculty member, I did not have a clear idea of how to use it effectively; 4 years in, here are some thoughts about good use of startup funds based on my experience and reflection, as well as things I've read and heard from others along the way.

This is written from the perspective of a computer science tenure-track position at a mid-tier research-oriented US university. Startup levels, existence, and structure vary between universities and disciplines, so keep that in mind.

The Purpose of Startup

The purpose of your startup funds is to enable you to establish your research program with the expectation that you will obtain grant funding to continue it.

That is, your startup fund is there to give you the starting point to get grants. Really, that's it. It isn't enough money to fund a research program that will earn you tenure — it's to land the funding that will fund your research program that will earn you tenure.

Structure and Negotiation

Startup funds vary in their structure and amounts. Some are all-cash, others are a combination of cash and specific resources such as a department-funded research assistant.

There will usually be a time limit for spending them. At both of my universities, the limit was 2 years. In the negotiation process, you can ask for an extension of this. Do so. Giving you 3 years to spend your startup is quite possibly one of the easiest concessions for the university to make and gives you more time to figure things out. Your first year will probably suck, and more runway to respond to the lessons you learn will only help.

Try to find out as much as possible about the resources and their spending power during the interview and negotiation processes.

If your startup is all-cash, how much does a graduate assistant cost? Do you need to pay their benefits as well? Some universities don't bill you for benefits if you are using startup to fund a student.

If your startup includes a GA, are they 20 hours dedicated to your research? 10 hours? Ph.D or M.S.?

Beyond that, I don't have a lot to say about the negotiation process, in large part because I am not very good at it. I didn't even negotiate an extension, but was able to obtain one after my first year.

Existing Resources

Find out what existing resources are available before spending startup. For example, if your university has access to a computing cluster, you may not need to purchase your own statistical or scientific computing hardware.

Other resources may also be available. I talked to our department's sysadmin about my need for a database server, and it turned out we had some donated used hardware no one had put to use yet. $5–10K saved.

Looking Ahead

In order to develop a successful research career — and earn tenure — you will need to develop an independent career. It should naturally draw from your Ph.D work and, if applicable, your postdoc, but starting a faculty position is the time to begin a new line of work that is meaningfully different from that of your mentors.

Startup is to fund that. Anything else is probably a distraction.

If you have the opportunity, getting a head start on a first project in that new line as a side project while you finish writing your Ph.D dissertation is a good idea, but it isn't necessary. I had started thinking about the ideas that are now my main direction in my last year, but didn't have an opportunity to start doing anything with them.

It's common to have some loose ends to tie up from the dissertation; last pieces to get published, or immediate follow-on work. Do that work — it's a great way to keep your publication pipeline going while you wait to get results on the next thing — but try not to spend much money on it.

Targeting Grants

You should probably plan on submitting to NSF CAREER. Do be careful not to put too much hope in it; it is a complicated and difficult grant to write, and many successful researchers do not win the CAREER. However, if you are at a research-intensive or research-growing institution, your chair and dean probably expect you to go for it. In computer science, its funding rate is also higher than many other NSF programs (as of 2017, 20–25% vs. less than 10% for some core programs).

Think about when best to apply. You have 3 shots before tenure. Unless you're coming off of a postdoc or similar experience and already have a clear, strong direction, your first summer is probably not the best time to make your first attempt. It takes time to develop the research direction and the education and outreach integrations, and to build the connections that make it all credible.

Computer science also has CRII, the CISE Research Initiation Initiative. This is a small ($175K over 2 years) program meant as a starter grant for junior faculty in computer science, and it should be on any new faculty member's radar. You can apply twice in your first three years; receiving any other NSF grant as PI also disqualifies you, so apply early.

These are the two major programs that most junior faculty in CS should definitely target, along with relevant general programs from federal agencies, state and local governments, and companies. But the key point, for the purposes of this article, is that your goal is to get one or more of these to hit by the time your startup runs out. Assuming you negotiate an extension, by the end of year 3 you want funding lined up to keep paying your Ph.D student.

Doing the work needed to secure that is the purpose of your startup fund.

Preliminary Results

So how does startup funding help you get grants?

By funding the research that produces the preliminary results you will use as evidence that your grant proposals are worth funding.

In my successful grant proposal, I had three main pieces of evidence from my prior research. One was my body of work from my Ph.D, demonstrating that I can do the software development and methodological work needed to carry out my research, because I've done it before. The other two were more proper preliminary results: showing existing techniques don't solve one of my research questions, and a set of early results on a first-order approach to another research question. Both of these results came from M.S. students' theses, with follow-on work by additional students I employed.

If you have student lines, or employ students directly out of your startup funds, preliminary results for the next thing should be your priority. Equipment you need to carry out this work is also near the top of the list.

Building your Network

Another useful purpose for startup is to work on building the network of collaborators you will need to carry out your research, either by maintaining existing collaborations or building new ones.

Will your next line of work engage with a research community that you haven't been part of yet? Go to a relevant conference.

Is there a more senior researcher in your topic you can bring in for a seminar talk? Besides being fun, this is a good opportunity to exchange ideas, introduce your students to someone from your community, and give your department leadership another perspective on the importance and impact of your work.

Get Training

If your NSF directorate has a workshop for CAREER applicants, go. It's a very good use of startup funds.

There may be other grant-writing or research cohort-building activities that are worth attending as well.

Your department or college may have a separate pool of professional development funds that can partially support one of these trips, enabling you to stretch your startup funds further.

Start Slow

This wasn't entirely deliberate — I failed to hire a post-doc, for which I am grateful — but I spent my startup slowly at first.

This was a good idea, I think. Especially if you negotiate a spending extension, burning slowly the first year while you get your feet wet, tie up some loose ends, and work on building and maintaining your network frees you up to spend the money after you've spent a year thinking about what you want to do next.

Things Not To Do

You may have loose ends to publish from your dissertation, or immediate follow-up work. This work is good to pursue; the dissertation should be the beginning of your career, not its conclusion, and those papers help you keep your publication pipeline going while you start the next project.

But they do not, directly, establish you as an independent researcher, and so they're good things to pursue on your own or with existing collaborators; I don't think it's wise to spend much startup on them, unless you have surplus after funding work on the Next Thing or they are a clear bridge from your prior work to the Next Thing.

Don't automatically hire the first student that comes your way.

Wrapping Up

To reiterate, your startup is a launchpad for your career as an independent PI with a robust, externally-funded research program.

Focus your spending on that.

Author Gender in Book Recommendations

I’m very pleased that we will be able to present a piece of research we have been working on for some time now at RecSys this year.

In my work on fair recommendation, one of the key questions I want to unravel is how recommender systems interact with issues of representation among content creators. As we work, as a society, to improve representation of historically underrepresented groups — women, racial minorities, indigenous peoples, gender minorities, etc. — will recommender systems hinder those efforts? Will ‘get recommended to potential audiences’ be yet another roadblock in the path of authors from disadvantaged groups, or might the recommender aid in the process of exposing new creators to the audiences that will appreciate their work and make them thrive?

In this paper, we (myself, my students Mucun Tian and Imran Kazi, and my colleagues Hoda Mehrpouyan and Daniel Kluver) present our first results on this problem. This work, along with our work on recommender evaluation errors, formed the key preliminary results for my NSF CAREER proposal.

This paper has a few firsts for me. It’s my first fully-Bayesian paper, and is also the first time I have been able to provide complete code to reproduce the experiments and analysis with the manuscript submission.

Michael D. Ekstrand, Mucun Tian, Mohammed R. Imran Kazi, Hoda Mehrpouyan, and Daniel Kluver. 2018. Exploring Author Gender in Book Rating and Recommendation. In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys ’18). ACM, pp. 242–250. DOI: 10.1145/3240323.3240373. arXiv:1808.07586v1 [cs.IR]. Acceptance rate: 17.5%. Cited 10 times.

Goal

In this project, we sought to understand how author genders are distributed in book ratings and in the resulting recommendations when those ratings are fed to a collaborative filter. We had three main questions:

  • How prevalent are books by women in users’ book ratings? Effectively, what is the input data bias¹ with respect to author gender? We also looked at gender distribution in the Library of Congress catalog as a baseline of ‘books published’ to provide context for interpreting user profiles.
  • How prevalent are books by women in the recommendations users receive? In other words, what is the overall bias of different recommender algorithms?
  • How do the gender distributions of individual users’ recommendations relate to their rating profiles? Here, we are looking for the algorithms’ personalized bias: for an individual user, how do the recommendations respond to the input?

Data and Methods

We combined book rating data (from BookCrossing and Amazon) with library catalog data from OpenLibrary and the Library of Congress and author data from the Virtual International Authority File (VIAF) to build a book rating data set with attached author demographic information. Linking all the data together was an entertaining problem, but solvable with a few hundred gigabytes of SSD and PostgreSQL.

We then trained collaborative filters on the rating data and generated recommendation lists for a sample of users. We used a hierarchical Bayesian model to infer distributions of user rating behavior with respect to gender and to estimate the parameters of a linear model relating each recommender algorithm’s output to users’ input profiles. Our key variable of interest was ‘% Female’: of the books whose author’s gender we could identify, what percent are by women?
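To make the shape of that analysis concrete, here is a minimal sketch of the response-model idea on synthetic data. It uses an ordinary least-squares fit rather than the hierarchical Bayesian model we actually fit in Stan, and all names and numbers are illustrative:

```python
import numpy as np
from scipy.special import logit, expit

# Illustrative stand-ins: for each sampled user, the proportion of
# gender-identified books by women in their rating profile, and the
# corresponding proportion in their recommendation list.
rng = np.random.default_rng(42)
profile_prop = rng.beta(4, 6, size=500)
rec_prop = expit(0.8 * logit(profile_prop) - 0.2
                 + rng.normal(0, 0.3, size=500))

# Fit the recommender's response in log-odds space, mirroring the
# linear-model framing (output log odds as a function of input log odds).
slope, intercept = np.polyfit(logit(profile_prop), logit(rec_prop), 1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
# A slope near 1 with intercept near 0 suggests a recommender that roughly
# mirrors each user's own gender balance; a slope near 0, one that ignores it.
```

In the actual study this relationship is estimated inside the hierarchical Bayesian model described above, not with a separate least-squares fit; the sketch only shows the core input/output idea.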

Recommender response to user profile balance.

Results

Here are our key findings:

  • Users are highly variable in their rating tendencies, with an overall trend favoring male authors but less strongly than the distribution in the Library of Congress catalog. If women are underrepresented in our set of published books, they are less underrepresented in users’ ratings.
  • Algorithms differ in the gender distribution of their recommendations. Nearest-neighbor approaches were the most personalized in our data set, and their recommendations were comparable to the input data, though there was less variance between users’ recommendation lists than between their rating profiles.
  • Nearest-neighbor algorithms were relatively responsive to their users’ profiles, with solid linear trends particularly in implicit feedback mode.

This is just the beginning — we have many more things planned in the coming years to build on these results and more thoroughly understand what recommender algorithms do in response to content creator representation.

Fun Statistical Tricks

This is the first paper I’ve published with a fully Bayesian analysis, a thing that pleases me greatly. We fit and sampled our distributions with Stan, letting us choose vague priors without concern for conjugacy and giving us good first-pass diagnostics for model fitting problems. It took some time to get it all working, but the end result was a pretty good experience.

We also used a logit-normal model for our proportions, regressing on log odds instead of raw proportions. I’ve generally found odds and odds ratios harder to build intuition for than probabilities and proportions, but this project has helped me become more fluent in them.
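For readers who share that difficulty, here is a tiny illustration (with made-up numbers) of why log odds are a convenient scale to regress on: an additive shift in log odds is a multiplicative change in odds, and the transform keeps fitted values inside (0, 1) even near the boundaries, which a regression on raw proportions does not guarantee.

```python
from scipy.special import logit, expit

p = 0.30
print(logit(p))                  # log(0.3/0.7) ≈ -0.85
print(expit(logit(p) + 1.0))     # ≈ 0.54: a +1 shift in log odds
                                 # multiplies the odds by e ≈ 2.72
print(expit(logit(0.90) + 1.0))  # ≈ 0.96: the same shift near the
                                 # boundary still stays below 1
```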

If one of the outcomes of this line of work is learning to think natively in log-odds, I will be amused and also likely somewhat twisted.

Limitations

Our study has some important limitations. The algorithms are a small sample of the available algorithm families, and they were trained on very sparse data, so their behavior here may not be representative of their behavior in the wild.

The version of the work in this paper also uses a binary operationalization of gender. This comes up in two places. First, the underlying data is binary: all VIAF assertions of author gender are ‘male’, ‘female’, or ‘unknown’. This is a major problem, and it is a key limitation of this data set. The Library of Congress itself seems to do a better job in its authority records, but its coverage is smaller and we have not yet integrated its authority data into our pipeline (libraries tend to publish author records [authority data] and book records [bibliographic data] separately). The MARC Authority Format is flexible in its ability to encode author gender: the gender field is defined as gender identity, it uses open vocabularies, and it supports begin and end dates for the validity of a gender identity. The data we currently have available, however, does not make use of this flexibility.

Second, a proportion-based model reifies gender as a binary construct (although we could compute other proportions, such as ‘% Non-Binary’, if we had data that records such identities). We are currently working on addressing this problem by reframing the statistical model to perform what we are calling an author-perspective analysis: rather than looking at the proportion of a user’s profile that is of a particular gender (\(P(\mathrm{gender} \mid \mathrm{rated\ by\ } u)\)), we look at an author’s likelihood of being rated given their gender (\(P(\mathrm{rated} \mid \mathrm{gender})\)). So far we have successfully fit a basic model for the overall book collection, and we see comparable results. Once we have extended this model through the rest of our analysis, it will be more readily extensible to non-binary gender identities, as well as to other non-binary identity frameworks such as ethnicity, than the proportion-based model.
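For the curious, the two framings are related by Bayes’ rule (treating ‘rated’ as a binary event per author; this is a sketch of the relationship, not our model specification):

\[ P(\mathrm{rated} \mid \mathrm{gender}) = \frac{P(\mathrm{gender} \mid \mathrm{rated})\, P(\mathrm{rated})}{P(\mathrm{gender})} \]

Conditioning on gender rather than on the profile is what makes the author-perspective framing extensible: each identity group gets its own rating likelihood, with no requirement that the groups form a binary whose proportions must sum to one.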

Our approach in this work is to learn what we can with the data we have, while being forthright about the limitations and weaknesses of our data and methods and working to improve them for the next round of research.


  1. We use bias in the statistical sense, not in a moral sense. We do not know what ‘neutral’ is or should be; we are concerned with what the distribution is, and how the recommender responds to that distribution.

Five Years

You don’t know when the sad will fall. You can sometimes see it coming; like the tide, it has some predictability. Unlike the water, it does not afford much opportunity for control, and you never know quite what to expect. When you see it coming, you can brace for impact; with practice, put on the happy face and soldier on.

That’s the idea, anyway.

It hit like a tsunami a little after 7 pm. The date was August 30, 2017; the place a cafe on a side street in Como, Italy.

John should be here.

3 hours earlier, I stood on the stage in the final session of RecSys 2017 with my friend Sole and a committee of amazing people. We announced the 2018 conference and invited everyone to join us in Vancouver. The conference itself until that point was a blur; one colleague described me as a ‘man hunted’. Dozens of people to talk to, a more focused list of contact objectives than usual as we tried to make sure we connected with all our co-chairs, preparation for a post-conference workshop, and a failed recovery from inbound jetlag. But I found the emotional energy somewhere to think of John from time to time during the conference.

I don’t remember how all that manifested. I know I wished he could have been there when we announced the upcoming conference.

I definitely remember 7 pm, going to my first RecSys Steering Committee meeting.

John should be here.

He should be in this room. It should have been John who talked with me about what to expect, told me what happens in these meetings.

August 14, 2017, Halifax, Nova Scotia. I’m tagging along for drinks with a group working on data science in journalism. Also joining the group is Mark Riedl (no relation), one of my Twitter heroes.

He tells me about meeting John at a conference banquet some ten years previously. It’s one of those stories that is so typically John, filled with his care for younger researchers.

I had no idea how much I needed that.

If only John were here.

It’s late December. Still 2017. In a conference room, trying to figure out a grant proposal.

We're struggling with it. The project is great, but the grant won’t write itself.

If only John were here. If I had even had one more year with him, I might have been able to learn what I needed to know to write our way out. We’re having the kind of problems he was good at solving.

My colleagues here are good people. Safe, so I don’t have to keep it together. I can lower my guard (at least a little; we do still have work to do).

If only John were here.

But I don’t have the presence of mind to embrace the moment, to take the time I need. I don’t say what I’m really thinking, and my friends don’t know why I’m sad.

February 23, New York City. Another steering committee meeting, also coming off an exhausting travel experience. The pain isn’t acute this time, though. It isn’t John’s space.

If John could see me now.

This is what he trained me for. We’re figuring out how to build and nurture this new community. How to develop and support an intergenerational and interdisciplinary band of researchers dedicated to making computing systems good for people. To measure when, how, and why they go wrong.

Taking big questions of human flourishing in the information age and subjecting them to the cleansing, confounding light of science.

It’s what John lived for, and I get to do it. With an incredibly thoughtful, rigorous, and compassionate group of people.

February 26, 2018. I’m in my morning routine, making coffee and smoothies. In one of the waits I turn on my phone and open Gmail.

‘Tentative good news on your CHS CAREER proposal’

I can’t call John.

I knew that day would come. I’d prepared for it, run mental drills. I knew that if I landed the grant, I would want to call him and be unable to do so. I brace for impact; the rails hold.

I call Joe and give him the good news. I WhatsApp some friends, and they join Jennifer and me that night. I spring for the 18-year Macallan and raise a glass.

I can’t call John.

John Riedl (CC-BY by U:Rummey)

It’s now been five years. The night John died, I wrote ‘I feel a bit like Rocky, training for the big fight, but Apollo’s dead.’

I didn’t run off to Siberia and retrain as an oncologist.

Somehow Jennifer and I made it through; we didn’t know it at the time, but John’s death was only the halfway point in the losses that defined the eighteen months we now call the Year of Hell. Eighteen months that forever changed our life together.

Some people say the hardest times, the most trying times, are when you learn and grow and mature the most. I have no idea what they’re talking about. We survived, scarred and bruised. @mermatriarch put it well when she said ‘what doesn't kill you gives you a lot of unhealthy coping mechanisms and a really dark sense of humour’.

Joe is fantastic and has guided me well through finishing the Ph.D and launching my own academic career, and he was very supportive through the remaining losses of that year. He mentored me into co-organizing RecSys. I had always seen GroupLens as a family, but the family really came through in the wake of losing John.

The best way I know to honor John’s memory is to do good science with good people.

So that’s what I’m trying to do. Trying to figure out what on earth fair recommendation looks like. Working with great people. Trying to help my students achieve their dreams.

I have some safe places now. People with whom I can drop shields, be me, be sad or happy or whatever. Maybe to finish grieving, if that’s a thing that is ever completed. I'm doing better now than I have been since before he passed.

I didn’t see anything make John prouder than to see his students succeed. I can’t think of a better tribute to give him than to do that, boldly and kindly.

I don’t usually drink Hefeweizens, but I had one today. John’s beer. Wish you were here.

Nazi and Alt-Right Imagery in GIF Search

Online platforms take different approaches to moderating — or not — the content that can be published or discovered through their platforms. I discovered today that some of the GIF search engines are censoring certain search terms. So I decided to poke a little more and see what is happening.

Nazi

When you search for ‘nazi’ on Giphy, you get this:

Giphy search for 'nazi', resulting in 'No GIFs found for nazi'
No Nazis!

In WhatsApp's gif search, powered by Tenor, you see:

WhatsApp GIF search for 'nazi', resulting in 'No Results'
Also no nazis.

It seems, though, that this filter is implemented by WhatsApp. Tenor's web site will happily return Nazi GIFs. No, I'm not going to provide a screenshot.

Pepe

Curiosity killed the frog, so I went farther. What do these search engines do with ‘pepe’, the green frog mascot of certain corners of the alt-right?

Giphy doesn't like the frog, but instead of no results, it returns results related to soccer and a skunk I don't recognize:

Giphy search for 'pepe', with no green frog results
Relevant, but not the frog.

Neither WhatsApp nor Tenor does this filtering, though: a search for ‘pepe’ returns a plethora of images of the frog.

Interestingly, it looks like Giphy is employing image recognition for this filter instead of relying solely on tags to keep Pepe away. When I search for ‘frog’, I find a number of frogs, including everyone's favorite green frog, but no Pepe in the first several screens:

Giphy search for 'frog'
Frogs!

A Tenor search for ‘frog’ includes Pepe:

Tenor search for 'frog'
Frogs on Tenor

Conclusion

It looks to me like Giphy has decided that it does not want to be a source of GIFs for alt-right and neo-Nazi communication, and has invested nontrivial effort to that end; their curation and filtering appear to go beyond keyword matching. I expect this has minimal impact on actual online communication — people who want to spread Pepe GIFs will find them — but it removes one source of media resources and reduces the likelihood of Giphy being linked to such communication.

Tenor has not made this decision. WhatsApp is unclear: they may simply be applying, worldwide, measures meant to comply with German and French laws against dissemination of Nazi imagery. Twitter's GIF search seems to behave the same as WhatsApp's.