What Is a Dissertation?
If we can aggressively simplify for a moment, earning a Ph.D. has three primary components:
- Do research.
- Write it up in a dissertation.
- Convince a committee of faculty that what you’ve done and presented is worthy of a research-based terminal academic degree.
There are some other things in each program, such as courses and qualifiers, but this is the heart of what earns the degree.
But what is that mysterious “dissertation”?
It’s a full report on a body of research sufficient to demonstrate competence as an independent researcher and earn a Ph.D.: an original contribution to knowledge in your field. Matt Might has a good illustrated guide to what it looks like to create new knowledge, and how that relates to earlier academic training.
But that still leaves the question about the dissertation itself: what should such a document contain, and how should it be organized? When you do your dissertation proposal, what are you actually proposing to do?
That’s what I hope to address in this post.
Scope
Before we get to the organization of the dissertation document itself, I want to spend just a little time on the scope of a dissertation. A dissertation should contain the research content of approximately three good papers (full conference or journal papers). There are variations on this (sometimes there are four papers, sometimes two short papers replace a full paper), but that’s the basic idea. Not all papers need to be published: at least one should be, but the publication process is sometimes fickle. The ideal is probably to have one published, another published or accepted, and a third under review (or in preparation for a deadline shortly after defense).
These three papers will also often not be the only papers you write in the course of your Ph.D.; while additional papers are not needed to earn the degree, they’re usually necessary to have a competitive application when pursuing research-oriented jobs.
The 3–4 papers should also be on a theme so you can tell a coherent story of your dissertation work. Three disconnected papers on different topics are hard to sell, and make it difficult for you to make a clear pitch of research directions when you’re on the job market.
There isn’t one story for a dissertation, but there are a few general shapes that tend to work well:
There are likely other workable designs as well, but most coherent stories for 3–4 papers will probably fit one of these patterns, more or less.
Dissertation Outline
So you have some papers, and an overall narrative to show how they form a connected and coherent body of work. What does the actual document look like? There are some variations, but I expect most dissertations I advise to have an outline approximately like this:
Introduction. The first chapter casts your overall vision: it defines your topic and the terms needed to understand it, presents your story, and previews your contributions. In particular, it sets up your organizing theme (either the hammer or the problem you’re solving). By the end of it, the reader should know (1) what you’re trying to do (including your organizing principle), (2) why it matters, and (3) your core contributions. The rest of the dissertation then has to convince them that you actually make the contributions you claim.
Background & Related Work. The second chapter is your primary literature survey. This serves two distinct but related roles [1]: first, it covers the necessary background for a reader who is competent in computer science broadly, but not in your specific specialty, to understand the rest of your work. Second, it positions your dissertation work in the broader research space, in particular relative to other work on your problem and on related or precursor problems. This is the literature survey for your whole dissertation. Some later chapters may also contain small background and/or related work sections that survey work specifically supporting that chapter’s unique work, but the common elements should usually be factored out into Chapter 2. In some dissertations, you may present most of the background in Chapter 1, so Chapter 2 is just related work, but in my experience there’s still background that’s needed in Chapter 2.
Common Infrastructure (optional). In some dissertations, you’ll have some resource, such as software or a data set, that you use throughout the entire dissertation. If it doesn’t make sense to describe it in Chapter 2, it can be useful to spend Chapter 3 describing this resource in some detail. In some cases, if it is an original resource, it may also be a paper, particularly if your discipline has venues for publishing resources such as the SIGIR Resource Track or the NeurIPS Datasets & Benchmarks track. Many dissertations won’t have a dedicated chapter for this, though.
Research Content. The next 3–4 chapters present your primary research content. Each of your component papers usually becomes a chapter; much of the content can be reused from the paper, but you usually need to make a few changes:
Conclusion. Your last chapter ties it all together: given the vision outlined in Chapter 1, and the work presented in the research content chapters, what do we know about your topic now that we didn’t know before you started the Ph.D.? What are the next steps to advance knowledge beyond what you’ve accomplished in the dissertation?
Appendices. Some dissertations have appendices; their use varies. I’ve seen them used for additional research content outside the main narrative flow, such as another paper the student wrote. They’re also useful for additional supporting evidence that would break the flow too much if you included it in the chapter, but that you want to make available to readers who wish to check your work more thoroughly. This can include documentation for software you developed, more complete output from statistical models, supplementary charts, etc. If anything is needed to understand one of your research results, however, it should go in the main chapter, not an appendix.
There are some variations on these themes (I didn’t have a 1:1 relationship between papers and chapters in my own dissertation), but for a typical computer science dissertation, this outline will usually work pretty well, and in my opinion it strikes a useful balance between a pure staple dissertation that does no integration and a complete rewrite of all the material.
Planning
When you’re planning out your dissertation work, particularly around the proposal stage, that’s what you’re planning to write. Make sure you leave plenty of time for the writing: it can take longer than you expect, and while the dissertation doesn’t need to be your best writing, it should be reasonably good and definitely needs to be clear and readable.
[1] Thanks to Sole Pera for frequently reminding me not to blur these roles.