Being clear about the nature of the lifecycle of the information that is being curated or preserved is important when assessing economic sustainability. One of the great ideas from the Blue Ribbon Task Force was “the case for preservation is the case for use”[i]. Use brings value, and value justifies preservation. It works the other way, too; if you separate use too far from the preserved content, then the value is reduced and the argument for preservation is diminished.
Quite often in digital preservation circles the model is one I call “post-use”. The functional model of OAIS[ii] takes a resource, ingests it safely into the digital preservation box, where it is looked after until someone asks for it, when it is disseminated out to the consumer. It is just about possible to interpret this as part of a normal digital information service, but it’s clear from the text that this is not the OAIS intent, and to do so requires trivialising OAIS to the extent that it ceases to be very meaningful.
We looked at a few lifecycle models, including the Digital Curation Centre’s lifecycle[iii] (described in Higgins 2008[iv]). Generally they seemed too detailed for our purposes, and none expressed the economic realities we were keen to capture. So we have had a go at drafting our own economic lifecycle model (see figure below; it needs work from a better graphic artist, but I hope you can get the idea).
We have tried to make this model work for an individual data asset, for a data or information service (ie a service that makes data assets available), and for a data archive.
What is this diagram supposed to mean? It’s an economic view of the data lifecycle (time goes roughly clockwise, or anti-widdershins)… I’m using “data” here to include all forms of digital information.
Data are created (somehow, somewhere). Perhaps the data have been handed off from another service. Some of those created or handed off data are selected for this service, archive, whatever. The selected data have to be prepared for use; this is the “ingest” phase in OAIS, the editing phase in publishing, etc. It includes adding all relevant metadata required for use. This has been identified as one of the highest cost areas in the archiving world (Beagrie et al. 2008[v]), and it’s coloured red to indicate that it requires the addition of resources. Once usable, the data have to be made available in a service of some kind; this phase (also coloured red) continues on, as there are continuing costs associated with it for the whole time the data are made available through the system. Note that “costs” here do not necessarily mean money; they could represent other forms of support such as volunteer effort, etc.
Once made available, the data can be used. It’s only through use (or perhaps potential use) that the data create value, so we show the use phase in green. Value might be in monetary terms, but it might be in other forms. The economic case is that the aggregated value over many resources and significant time exceeds the aggregated costs (even if the two are expressed differently). It is of course easier to make the sustainability case if both cost and value are monetary. Aggregation is needed because of the long-tail effect: many data resources may not justify retention on their own (think how many journal articles are little-read, or how many library books are never borrowed), but these are supported in aggregate by the real stars of high-use, high-value resources.
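As a rough illustration of the aggregation argument, here is a minimal Python sketch that compares total value with total cost across a small collection. Everything in it is hypothetical: the `Resource` class, the two helper functions and the numbers are invented for illustration and are not part of the model. It also assumes cost and value are expressed in the same arbitrary unit, which, as noted above, is often not the case in practice.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cost: float   # cumulative cost of preparing it and keeping it available
    value: float  # cumulative value generated through use


def is_sustainable(resources):
    """Return True when aggregated value exceeds aggregated cost."""
    total_cost = sum(r.cost for r in resources)
    total_value = sum(r.value for r in resources)
    return total_value > total_cost


def self_supporting(resources):
    """Names of the resources whose own value covers their own cost."""
    return [r.name for r in resources if r.value > r.cost]


# Hypothetical long-tail collection: one star and several little-used items.
collection = [
    Resource("star dataset", cost=10.0, value=100.0),
    Resource("rarely used A", cost=10.0, value=2.0),
    Resource("rarely used B", cost=10.0, value=1.0),
    Resource("never used C", cost=10.0, value=0.0),
]

print(self_supporting(collection))     # which items pay their own way
print(is_sustainable(collection))      # does the whole collection pay its way?
print(is_sustainable(collection[1:]))  # the same check without the star
```

With these made-up figures only the star resource justifies itself, yet the collection as a whole comes out ahead; take the star away and the tail no longer does.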
If there is a reasonable perception of value, this situation can continue semi-indefinitely. But sooner or later, some sort of significant problem will arise. This could be a technical problem to do with the data (eg obsolescence); it could be a technical problem to do with the service (needs some kind of significant upgrade); it could be a business problem to do with the service (bankruptcy, change of ownership or focus, etc).
At around that point further decisions have to be made, as further significant investment may be required. So we have another selection process. Some (or all) data will be retained, perhaps transformed to make them usable in the new environment. Some (or all) data will be disposed of, de-accessioned, handed off to another service or archive, etc.
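For readers who prefer code to diagrams, the phases described above can be written down as a small state table. This is only one reading of the text, not a definitive encoding of the figure: the phase names, the `NEXT` transitions and the `can_move` helper are assumptions made for illustration, and the cost and value sets simply mirror the red and green colouring.

```python
from enum import Enum


class Phase(Enum):
    CREATE = "create or receive"
    SELECT = "select"
    PREPARE = "prepare for use"
    MAKE_AVAILABLE = "make available"
    USE = "use"
    RESELECT = "re-select after a significant problem"
    DISPOSE = "dispose or hand off"


# Assumed transitions. Retained data loop back to preparation so that
# they can be transformed for the new environment.
NEXT = {
    Phase.CREATE: [Phase.SELECT],
    Phase.SELECT: [Phase.PREPARE],
    Phase.PREPARE: [Phase.MAKE_AVAILABLE],
    Phase.MAKE_AVAILABLE: [Phase.USE],
    Phase.USE: [Phase.USE, Phase.RESELECT],
    Phase.RESELECT: [Phase.PREPARE, Phase.DISPOSE],
    Phase.DISPOSE: [],
}

COST_PHASES = {Phase.PREPARE, Phase.MAKE_AVAILABLE}  # red in the figure
VALUE_PHASES = {Phase.USE}                           # green in the figure


def can_move(current, target):
    """True if the assumed transition table allows current -> target."""
    return target in NEXT[current]


print(can_move(Phase.RESELECT, Phase.PREPARE))
print(can_move(Phase.DISPOSE, Phase.USE))
```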
OK, so that’s our draft economic lifecycle (currently in version 3). Any comments? Support? Major flaws? Minor flaws?
[i] BRTF-SDPA. (2010). Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information. P1. (A. Smith Rumsey, Ed.). San Diego. Retrieved from http://brtf.sdsc.edu/biblio/BRTF_Final_Report.pdf
[ii] CCSDS. (2002). Reference Model for an Open Archival Information System (OAIS). NASA. Retrieved from http://public.ccsds.org/publications/archive/650x0b1.pdf
[iii] DCC. (2008). The DCC Curation Lifecycle Model. Edinburgh, UK: DCC. Retrieved from http://www.dcc.ac.uk/docs/publications/DCCLifecycle.pdf
[iv] Higgins, S. (2008). The DCC Curation Lifecycle Model. International Journal of Digital Curation, 3(1), 134-140. Bath: UKOLN. Retrieved from http://www.ijdc.net/index.php/ijdc/article/viewFile/69/48
[v] Beagrie, N., Chruszcz, J., & Lavoie, B. F. (2008). Keeping Research Data Safe: A cost model and guidance for UK Universities. Bristol. Retrieved from http://www.jisc.ac.uk/media/documents/publications/keepingresearchdatasafe0408.pdf