A digital preservation workflow for academic research

As part of the Data Lifeboat meeting I attended in November 2024, I'm jotting down some rough, high-level thoughts on what a good digital preservation workflow might be.

I am writing this as a stream of consciousness from my experience as an academic researcher. There are certainly things I missed or that I will think of later.

The workflow is organised below into three stages: pre research, during research, and post research. Within each one, I'll write down what should ideally happen at that stage.

Pre research

Start with a research “data” management plan. I'm using the term data very broadly here to mean the artefacts that result from a research project, which could include (but are not limited to) general notes, numerical data, interview transcripts, audio/video recordings, artwork, and lab notebooks.

When writing the plan, think about:

From experience, I know that a big challenge is not just coming up with such a plan, but budgeting the time, resources, and labour to implement it. In academic research, I think this is an underappreciated point. At least in my scientific field, many scientists scramble at the last minute to prepare and publish data (usually because an academic journal requires it), and end up doing a poor job of digital preservation.

During research

During the course of a research project, remember to keep good documentation. In my view, it is especially important to write down spontaneous learnings (“what are we learning along the way?”) and to note deviations from the research plan.

Documentation could also be informal, like rehearsal notes for performing arts or daily lab notebooks for an experimental scientist. Blog posts are also good.

Regularly check in with the original data management plan to see if it is being followed or if changes are needed.

Post research

In my view, a post-mortem is a critical exercise in any research project. That includes reflecting on how well the project's digital preservation plan/data management plan worked. Some questions to ask:

Another meta issue I see in academic research is the lack of appreciation for, and highlighting of, the reuse of digitally preserved material. At least from what I've seen, there's lots of talk in #openresearch circles about sharing and how to do it well, but far less about using what others have shared!

I think if we do a good job of telling stories about the use of shared stuff, then we can more effectively make a case for digitally preserving said stuff and reducing #intellectualpoverty.


Unless otherwise stated, all original content in this post is shared under the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0).