Moving Away from a Manual Workflow
One of the strangest challenges I faced as a PhD student in the humanities was that, despite the volume and intensity of the academic work I was expected to produce, I was never given any training that focussed on managing the workflow. Although we had copious training on some aspects of academic work, like research skills and citation practices, we never really discussed techniques for managing how we write. This was a problem throughout my academic career, and it came to a head when I had to produce a work of the scale and rigour of a PhD thesis. There seems to be an assumption that, when it comes to writing or deadlines, people just know how to manage these tasks. The reality is not as straightforward. For the most part, I was just making things up as I went along, and most of what I learnt, I learnt from discussions with my peers, often researchers in more technical fields. I thought it worthwhile to write a series of blog posts on this topic to share my experiences, record what I have learnt, and begin a conversation that will hopefully be illuminating to other scholars.
In this post, I will start with some reflections on how I changed my workflow when I started my PhD. I will talk about the specific challenges I faced because of the scale of the project, and how, had I known what the end of the PhD would be like, I would have done some things differently. In future posts, I will talk about how I am changing things now that I have finished my PhD. Or, more realistically, about the things I wish I had done during my PhD and am only now putting into practice. Finally, if I have the clarity to say something concise on the matter, I might also write another post about what the assumptions we make about people’s understanding of workflows reveal about structural inequalities in how we do research, particularly concerning disability and prior access to these resources.
A Problem of Scale
Until my Master’s, most of my workflow centred on using a word processor for everything. Not only did I do my core writing in Microsoft Word, directly in the main document, I also maintained separate files for notes and references. I compiled my references, tracked revisions, and managed file versions manually. This worked just fine when I had to write short, self-contained assignments for different courses. I did not need a central reference library from which to pull references for each project. My undergraduate and Master’s dissertations pushed the boundaries of that workflow, but were still within what I could conceivably manage manually. Moreover, my revisions always came in large enough chunks that major version numbers alone were enough for me to track them.
Given that a PhD thesis is almost two orders of magnitude bigger than what I was previously used to, it is probably obvious that this solution did not scale well. In my first year, I tried to stay organised by maintaining a central reference document that I updated as I went along. I broke the thesis into separate files for each chapter, and selectively copied the references from the central document into each of them. The sheer volume of files this generated created a lot of clutter. It also made it that much harder to keep different versions of different files in sync whenever I updated any of the references or content. On top of that, version control was an absolute mess. And none of this accounts for the fact that Word just cannot handle rendering, navigating, or editing large documents.
I eventually switched to Scrivener, a proprietary, cross-platform writing suite developed by Literature & Latte and designed to handle large documents like books or theses. As of writing this, version 3 has finally been released for Windows. It costs around £50 to buy a licence, and it was probably the best investment I made during the course of the PhD. Scrivener has a number of features that are explicitly designed for working with large documents, especially the ability to split a file into different parts, edit different sections independently, and seamlessly re-organise sections of the document. It also integrated well with reference managers to begin with, although this caused me more trouble later on. All of this really streamlined a substantial part of my workflow. I could edit sections out of sequence and write in a way that was more fluid.
Because it was still mostly like using a word processor, there was no real learning curve to this, just a few simple adaptations. Rather than write and format the document at once, I effectively wrote the text of the thesis first and then fiddled with the formatting afterwards, when compiling it into a finalised document. This also worked well for splitting my thesis into smaller projects, such as compiling different sections for articles or for annual review submissions. It made a massive improvement to my writing workflow for a very small cost (and I am grateful to my desk buddy Ian for persuading me to make this change).
The Loose Ends
The difficulties with Scrivener became clearer later on, especially when I had to compile or export the document to submit. The main issue was that I had still not completely removed Microsoft Word from the equation. I still needed to format the document and adjust its layout using a word processor at the end. This led to several issues caused by bugs or conflicts between different theme plug-ins. Either the bibliography would print with the wrong fonts and spacing, or the changes I made to the fonts would not sync, or, worst of all, the formatting removed italics and indents from some paragraphs. It took me a lot of investigating, and eventually acting as my own second-line tech support by troubleshooting on Microsoft’s help forums, before I realised that the issue was caused by the Body style getting overwritten because of a conflict between the theme plug-ins for EndNote and the way they interacted with the default themes in Scrivener’s compiled document. To make matters worse, even after I had corrected all of these errors once, changes to my formatting kept getting overwritten and would not save until I sorted out this bug. Tech support aside, this meant that when I was doing my thesis corrections, I had to correct these minute formatting mistakes, re-italicising every book title and re-indenting every paragraph.
Moreover, version control was still a pain in Scrivener. It allowed me to take snapshots of individual files or sections, but not of the whole project. These snapshots were tied to a main file rather than to any parts I split from it. Very often, I found myself unable to revert changes to a file because the snapshot I was looking for actually belonged to a different file, one that the section I was working on had been part of until I split it off. I also could not see different snapshots in the context of the whole work, which made it harder to know when a section had been moved elsewhere rather than merely deleted. Nor did I have any way of comparing different versions of a file. This made it considerably harder to apply changes to my thesis after, say, edits made to an article version of one of the chapters, as I had to search through the text to find the relevant passages and then update them. Word’s compare feature did help with the actual comparisons, but I then had to go back and apply these changes to the different branching versions of the file.
An Elegant Solution with a Steep Learning Curve
A friend of mine suggested I write my thesis and manage my bibliographies in LaTeX. This would mean I would essentially be writing source code that could then be version controlled via Git. I declined at the time because this would have meant having to learn LaTeX, and I was not willing to deal with the steep learning curve that entails. I would also have had to transfer my entire thesis from a WYSIWYG word processor to source code, and to make sure I preserved the formatting and styles accordingly. As averse as I was to the prospect of spending hours on StackExchange trying to figure out how to indent a blockquote, I ended up doing the same with Microsoft Word anyway. On top of that, I had to manually edit and re-do the formatting of my thesis line by line because of the software bugs I encountered with Word. One way or another, what my friend suggested resembled the workflow I would eventually teach myself, and in my next post I’ll go into more detail about how I found making this transition.