In the spirit of the current consultation on Open Data (if not within the letter of any of its questions), I thought some history might be useful, based on my arm’s-length engagement with government data, in the form of the Ordnance Survey (OS) mapping data used in Digimap. This is both a positive tale (as OS learned to trust and respect both EDINA and HE in general) and a sorry one, a story of successive licence restrictions preventing full use of a great resource.
(Yes, I know that the Open Government Data war has theoretically been won, but various rearguard actions are still being fought, and it seems worth thinking about the damage that closed data has caused.)
It may be a surprise, but the original Digimap project was agreed way back in February 1996. OS was persuaded to make available some selections of data for a few areas of the country for the project. It was, I think, a remarkably successful project, particularly for the way it showed mapping data could be used by a wide spectrum of disciplines, rather than just those related to geography. I seem to remember an archaeology project looking at the locations of Roman forts, based around sight-lines determined from the mapping data, for instance. By the end of the project, over 80% of the 500+ users were non-geographers.
By the end of 1997 it was clear the project was a success, and JISC was negotiating with OS for a wider licence for a service. But the datasets to be licensed were different, and concerns were emerging that the project licences might be withdrawn. And of course, some people were already using data that was not likely to be in the final data selected.
The original project was followed by others, exploring different aspects, and as I look through the emails I can see licensing issues emerging in all sorts of ways. An interesting aspect of the later projects was the idea of investigating the “changing landscape”, something I shall return to.
It’s also worth pointing out that EDINA was making something very different from anything that had gone before. There was no Google Earth in those days! From an email where JISC was concerned at the cost:
“We are all beginning to realise that this is a completely different proposition from adding another bibliographic data service. This stuff is completely different in nature from bibliographic data; it is much more complex to deal with. We cannot just drop the data into place as there is no existing support infrastructure.”
Eventually, the full service was launched in 2000, with national coverage but with different datasets. But, there was a problem…
OS had been worried that HE might somehow use “too much” of their data, given the discounts that had been negotiated. So they insisted that there be a form of rationing. The country was divided into many “tiles” and we were only allowed to use a maximum of 30% of the tiles in any one year. This was small enough that tiles had to be selected very carefully. EDINA also had to develop a mechanism for monitoring and enforcing the tile selections. This, coupled with extra authentication requirements that OS also insisted on, raised barriers to participation in Digimap and effectively reduced its value and uptake. All of this was on top of the magnificent development work EDINA was doing to make Digimap an effective but simple-to-use service.
By the way, as far as I can remember these difficulties were not raised by the HE representatives within OS, but by their commercial sales colleagues, who felt that their targets might be affected.
At this stage, there were concerns about some of the licence restrictions, which might make it impossible to reproduce in articles and the like the very maps on which the research was based. In 2001 some of this was overcome; I have an email from EDINA which says in part:
“- change to permitted use of data to include “Limited Internal Business Use”, which allows use of maps in free University brochures, leaflets and flyers
- the use of Digimap data and maps in public lectures is now allowed”
Then later that year, success! OS dropped the tile rationing requirement. This was great, but it didn’t compensate for the months of wasted development effort and the days of fruitless negotiating among potential users at each of the various user sites.
Meanwhile OS had problems of their own, completely changing the nature of and technology underlying their primary database. And of course, those problems affected us; the products available changed, the data changed (completely), and most of the delivery systems had to be re-written. But when the new contract came up, a condition was that all the old data had to be deleted.
Remember my comment about the “changing landscape”? Researchers are naturally interested in change. We originally had pockets of data from 1996. Then we had several complete datasets with all the changes, from 2000. If EDINA had been able to carry that data forward, and find a way of reconciling it with the new data, we would have an ever-building resource. But in around 2005, EDINA was required to delete all that and start again with new current data. It was impossible under such circumstances to build up a picture of the changing landscape. But why should OS care? Their business was current data, not past data.
JISC did manage to negotiate a deal with a company that was interested in older maps, which eventually became Historic Digimap. But these were images, not data, and the latest version was prior to the introduction of the digital products; we had lost the opportunity to build our own time series of the data we had licensed for the UK.
Now all along there were reasons (some bad, some not so bad) for the various restrictions. My point here is that at every stage these restrictions exercised a chilling effect on higher education’s ability to use government-based mapping data in imaginative ways. They restricted research, they restricted teaching, they restricted administrative use. It’s not just the wasted development effort, nor the prized datasets we could not afford, nor the licence fees paid to one government organisation after another, nor the authentication restrictions, nor the inability to publish.
Really, the restriction from closed data is on “generativity”. It’s the cumulative effect of these restrictions clamping like a vice around the creativity of the sector, and the ability of our nation to make real use of what is possibly the best mapping data in the world. I’m sure similar chilling effects have applied to government data in many other areas.
The system that requires this data to be supported by licensing deals is fundamentally flawed. Yes, let such bodies earn some of their revenue from “value-added products”. But make the underlying data freely and openly available. This may mean paying for more of the data collection from our tax base. But it will open up such possibilities that I am confident more jobs would be created, paying tax to compensate.
Let’s make these data freely available and unleash our potential!
The aim of the very first Digimap project was to “examine the problems and opportunities facing map libraries within Higher Education Institutions in providing access to Ordnance Survey Digital Map Data”; see http://www.ukoln.ac.uk/services/elib/projects/digimap/