Student Skills
Research work in any discipline requires a variety of skills. Some of these you should have before you begin work; others you will learn along the way. This page catalogs a number of the skills that are useful or necessary to successfully work on the kinds of research that our research group does.
This page is not a list of prerequisites. I do not expect you to have all, or even most, of these skills before you start. Depending on the specific project you are working on, you may not have acquired all of them even by the time you finish, although I am happy to help you learn any of them that you wish.
Rather, this is intended as a starting point for conversations about your studies and a place to collect resources for learning new skills that are useful in research. It’s my hope that students work on developing these skills, maybe focusing on one or two each semester, as they work in our research group.
I have divided this guide into four sections:
- Prerequisite skills that really are needed before we can do much.
- General skills that will help you understand, conduct, and communicate research.
- Technical skills pertaining to technologies that we regularly use in our research.
- Auxiliary skills that you may find fun or useful to have.
Thanks to Jennifer Ekstrand, Sole Pera, and Cathie Olschanowsky for their valuable feedback in preparing this list.
Prerequisite Skills
If you can write some Python code, and we can communicate, that’s usually enough to get started and we can work on the rest, particularly for undergraduate students. I have some projects with roles that do not require programming knowledge, but they are less common.
General Skills
Research is not just, or even primarily in our group, about technology or technical skills. It is about expanding our knowledge and solving problems, which requires a wide array of work.
Reading
Reading research is a skill in its own right. There are different types of reading for different types of papers and information needs. I think of a spectrum of reading levels for research papers, including the following levels as well as others in between:
There are other types of reading as well, such as reading a paper not for its content but to understand how the writing itself works.
It’s generally not a good idea to just read a paper from front to back. Some papers make this easy, but many papers do not. A common approach is to read the abstract, introduction, and conclusion, and then to read more if it is relevant.
Here are some more resources for reading papers:
Writing
It is not enough to conduct research; we need to write up the results and get them published.
Abstracts
Speaking
The bulk of computer and information science research is published in conferences, which means that we must present it and be able to talk about it effectively if we want people to pay attention to it.
Learning
While you are in school to learn generally, it’s important to be able to go out and learn a skill or a technology that we don’t specify in the curriculum. This is doubly important when working on research, as the point is to identify new knowledge: if we knew what we were doing, or could just look it up in a book, it would not be scientific research. Many times, I will not know it myself!
Statistics
A lot of our research involves statistical analysis of some kind: analyzing experiment results, mining data sets, or conducting simulations. Some of our work specifically requires Bayesian statistical methods. Some resources:
Planning and Executing Work
This skill really comprises two skills: high-level project planning, where you determine the scope and desired outcomes of a project, and day-to-day task and work management to actually get your work done. Both are important to various degrees, depending on your role and educational stage. For general tips on productivity, see my blog series and resources.
Day-to-Day Work
There are many different day-to-day work management techniques and tools and no silver bullets. It is often not productive to attempt to religiously follow a particular methodology, or particularly to spend a great deal of time churning through different tools. No tool is perfect; each will let you down somehow, sometime. And they cannot solve everything.
Also, different roles require different workflows and task management systems. Many systems and books are primarily oriented towards white-collar knowledge and executive workers, particularly those with management responsibilities. They have some overlap with academic work but often need adaptation.
A grab bag of techniques may work best for you. The important thing is to be able to record the things that you need to do, in a reliable medium, and work on them. Many people are very productive with a spiral notebook.
I have written a series of articles about my own approach to planning and managing work, and include there a number of links to other resources. If you want to read one productivity book, I recommend The One Minute To-Do List. There are also a million software tools that may or may not help, such as Logseq, Todoist, Toodledo, Microsoft To Do, OmniFocus, TaskPaper, Emacs org-mode, and TaskWarrior.
Planning Projects
It’s really easy to spin your wheels in research. It’s even easier if you don’t have a clear idea of what it is that you are trying to do.
Therefore, it’s important to be able to plan a project (or subproject) to give it a clear direction and to help structure your work. Even in the fairly open-ended world of academic research, it is important to have clear direction. Early on, your adviser will help a lot with this. But as you progress through your research career, you need to be able to do an increasing amount of project planning yourself. By the time you complete a Ph.D., you should be able to plan out a research project that is at least one paper’s worth of work, and ideally more.
One of the key things to do in planning a project is to determine its intended outcome. If the project is successful, what will you have at the end? It does not matter much what you call this (‘Definition of Done’ from SCRUM, a ‘Desired Outcome’, or whatever); the important thing is to define success. The desired outcome may just be ‘we have an evidence-based answer to research question $FOO’. Then you can work on determining how to move towards success.
Once you have a concrete goal, or an idea that you are considering developing into a concrete goal, it is useful to be able to iterate quickly on early ideas to filter out infeasible solutions and identify a promising path forward. Minimum Viable Research is one way of thinking about this.
Technical Skills
With these skills, you will develop a deeper understanding of the various technologies we use in our research.
Python Programming
The single most useful programming language for much of the work we do is Python. The LensKit software that we maintain is written in Python. Even if your research does not directly contribute to LensKit, there is a good chance that you will need to work with it, and we also use Python packages for much of our other work. We also do a lot of data analysis and general utility programming in Python.
Data Structures and Algorithms
Data structures and algorithms are foundational programming topics that will help you learn better how to structure and reason about computations.
Git
We generally use Git to manage our source code; the LensKit project uses Git, and we usually use it for managing our experiment scripts and sometimes our papers. Source control in general is a useful tool, and Git is the dominant tool in version control for open-source software.
Unix Command Line
While we use a variety of operating systems for our local computing environments (I myself use Windows), basic familiarity with the Unix command line is useful as we generally run Linux on our servers for data analysis and deploying live applications, and other infrastructure we work with is built on Unix-like platforms (such as the Travis continuous integration server).
STAN Programming
We usually use STAN for Bayesian statistical inference. Fortunately, it has very high-quality documentation; we’re also developing an increasing amount of lab expertise in it, so your labmates can probably help too!
Web Development
Whenever we build a user-facing experiment, it’s usually deployed as a web application. Therefore, web development can be a very valuable skill for our research, depending on the exact project you are working on.
Database Design and Programming
Some projects may require use of an SQL database, such as PostgreSQL.
Rust Programming
Rust is turning out to be an excellent language for high-throughput data processing. It allows us to write extremely fast code without the hassle of C or C++. For an example of how we’ve integrated Python, Rust, and advanced PostgreSQL, see the book data tools.
Auxiliary & Recreational Skills
These skills are fun, perhaps useful, but aren’t in the direct path to most of our research outcomes.
Functional Programming
You’ll pick up some functional programming along the way in a lot of other work, because functional concepts have worked their way pretty heavily into modern JavaScript, Python, and even Java. A serious study of functional programming can improve your ability to make use of those language features (sometimes to your collaborators’ chagrin).
Extemporaneous Speaking
While some of our oral communication is in prepared form (conference talks, seminars, lectures, and the like), we also need to be able to communicate about it without preparation: to answer questions, pitch our work in the hallway, ask good questions about others’ talks, etc.
Building Bicycle Wheels
While in grad school I learned to build bicycle wheels, and get a great deal of satisfaction out of a well-made wheel that I built with my own hands.
Parodies of Musical Lyrics
This is a very important skill. The best way I know to obtain it is to read a lot so you have an extensive repertoire of words and sentence structures to make the lyrics work out, and then to practice. For examples, see Les Bicyclebles.