PRONOM: what would success look like?

28 Nov

Today there is a meeting (I think run by the Digital Preservation Coalition) at The National Archives as part of their consultation on PRONOM and DROID. I was invited but unfortunately couldn’t make it (Kew is not the easiest place to get to by 10 am).

I’m personally much more interested in PRONOM than in DROID. I guess that’s because I’m not engaged in running tools to identify file types in ingested material. I know what my file types are; what I want is more information that would be useful in thinking about their preservation. Nevertheless, I suspect my question on PRONOM could be applied equally well to DROID: what would success look like?

Part of the answer must depend on who PRONOM is for. To the extent that PRONOM is for The National Archives, it is successful if it meets their needs. If they don’t really need PRONOM (e.g. compared to DROID) other than as a place to keep file type signatures, then the fact that it consists almost entirely of nearly empty entries does not matter to its success (for them).

However, TNA advertise PRONOM as:

“The online registry of technical information. PRONOM is a resource for anyone requiring impartial and definitive information about the file formats, software products and other technical components required to support long-term access to electronic records and other digital objects of cultural, historical or business value.”

That sounds like it’s a resource for the rest of us. So success would have to mean there are completed entries for most file types (since collectively we will be exposed to almost all file types).

I saw a tweet from Kevin Ashley recently reminding someone that there are around 5,000 types of graphic files alone. There are probably another dozen genres of file types, although most are not so richly populated. So it’s certainly a big job. I think it’s too big for TNA alone to undertake. Indeed, I think it’s too big for any coalition of digital preservation archives to do alone (although this may be a bit more controversial). My belief is that you can only achieve this sort of completeness if you can find a way of crowd-sourcing the information.

I’ve already noted some of the deficiencies in PRONOM, so on 5 August I supplied some information about the .xlsx and .docx formats. I got a nice email from someone at TNA on 25 August (a bit outside their 10-day target) that said (amongst other things):

“Thanks for this information. We’ll look at including a link to these specifications in the next release. I hope we can provide a better model for including this information in Pronom with the new development.”

As of today, 28 November, the entry for fmt/214 still doesn’t show the information I provided, and the last update is shown as 28 October 2009. I believe there has indeed been a PRONOM release since then.

Personally, I think completeness should be a PRONOM success factor, that completeness is not achievable by TNA alone or even in coalition, that completeness requires the participation of the public, and that the architecture of PRONOM must therefore include a mechanism for crowd-sourced input that works effectively.

That said, I do wish TNA and PRONOM very well.


One Response to “PRONOM: what would success look like?”


  1. Response to the “call to arms” post « Unsustainable Ideas - 4 July, 2012

    […] But have we really done so well? Yes, quite a bit of work has been done. Later yesterday there was the announcement of the UDFR service, successor to GDFR and supposedly a munging together with PRONOM. For the purposes of this post, I have deliberately not looked at UDFR; it will take more thought and analysis than I have time for right now. Before UDFR, the shining light in this area was PRONOM. I wrote a few posts about PRONOM last year, including “What would success look like?”. […]

Comments always welcome, will be treated as CC-BY
