The place of data papers: Producing data for geography and the geography of data production

Geo: Geography and Environment recently published two papers on data practices in geography. It is an accident that they were published on the same day, but it presents an occasion for us, as editors, to reflect on two related issues in academic writing and publishing: the growing role of the data paper and the spatial (and often unequal) distribution of value between data production and theory generation.

It is also a good opportunity to remind people that we have been promoting debate on open science and data in Geo since our 2015 launch, with the opening commentary from Sabina Leonelli et al. and blog responses from George Adamson and James Porter. These pieces are all worth returning to now. The issues that emerge as practices around data sharing meet academic incentives, interdisciplinary research and public outreach continue to gather pace, with data papers offering a new venue for exploring them.

The TEMPEST team have produced the first data paper published in Geo on Dealing with the deluge of historical weather data. This paper explores the practices of assembling a digital resource of historic weather data from documentary archives. It positions their work in relation to other online datasets and offers this resource as an opportunity for future research and public engagement. It also demonstrates an exciting alternative to the drive that Adamson suggests follows the “pressure to analyse, interpret and publish before any data is shared”.

We are keen to encourage more data papers in Geo, both before and after analysis. At its simplest, a data paper describes how a dataset is assembled; reflects on its context and value; and invites others to use it by linking to the data. As the TEMPEST paper indicates, these issues are always complex in practice. Data papers are thus increasingly important in enabling connections in data-rich parts of the discipline and filling gaps in data-poor areas. These topologies of data availability are often complex and increasingly political.

Academics, and others, are facing renewed questions over the value of data and access to it. Access to government data may be removed – as in the disappearance of climate change pages and animal welfare data from US websites. Access to data may be restricted by commercial interests – by excluding data points to shape narratives or introducing high costs for data use.

Conversely, for academics, providing access to data is increasingly mandated. We are being asked by funders to invest our time and resources in curating and archiving the data produced in projects. We are also asked to justify why we can’t use existing datasets as we search for new funding. Currently these processes are not well linked, meaning time spent curating data and resources spent collecting new data may both be wasted.

The second paper, by Margath Walker and Emmanuel Frimpong Boamah on Alternative visualisations of geographic knowledge production, indicates the political issues in other aspects of data generation and use. Their work maps the relation between data production and concept work in critical urban studies and critical GIS, prompting some critical geopolitical questions.

These include: who is involved in generating our data, who is included in generating theory, and where are they both located? How are credit and value distributed across these practices, how do these reproduce existing global inequalities in knowledge production, and how might we enact these relations differently?

We are used to thinking about the responsibilities we have to the stories our respondents contribute to research. But we are perhaps less used to considering our responsibilities to data and related questions around the ownership, interpretation and openness of, and access to, this data. The value attributed to ‘progressing’ theory over advancing data reinforces the unequal geographies that Walker and Boamah map out.

Data papers can play a part in exploring and addressing both issues. The detailed explanation of data production in a data paper is a way of reflecting on challenges within a project and of facilitating the recontextualization of data required for reuse beyond it. They also provide a further avenue for redistributing the value from data, by acknowledging the multiple geographies and authors that underpin data production, transforming these into a published research output, and opening space for different interpretations.

We are grateful for the questions that Veale et al and Walker and Boamah have prompted about the aggregation of weather data and the geographical distribution of value through theory and data. We hope they will inspire others to explore the challenges of putting data together and the responsibilities we share in authoring data and facilitating access to it.

Gail Davies, co-editor in chief, Geo: Geography and Environment

Joining Up Divided Data: The TEMPEST Database

We were very pleased to launch TEMPEST – our database of historical weather events – at this year’s RGS-IBG Annual Conference. With the support of the Geo team we organised a panel discussion and a small display of original and facsimile archive materials. Both were connected to a recent paper in Geo, ‘Dealing with the deluge of historical weather data: the example of the TEMPEST database’ – the journal’s first ‘data paper’.

‘The great frost’: Frontispiece for The cold yeare 1614: A deepe snow: in which men and cattell have perished…or of strange accidents in this great snow, attributed to Thomas Dekker

Following an introductory post by the journal’s editors, in this contribution we wanted to reflect on our motivations for writing the paper, and creating TEMPEST, particularly in designing it as a freely accessible online resource.

Historical weather is far from a new area of investigation. A number of well-known chronologies of British weather have been published, and over the past 20-30 years attempts have been made to produce searchable databases of historical weather information (instrumental data, proxy data and narrative descriptions of particular phenomena). It is widely recognised that these compilations of data or datasets have utility for the scientific study of climate, as well as satisfying the simple desire that many people have to know more about past meteorological events and their impacts on particular people and in specific places. However, in spite of rapid advances in technology, the growing amount of data (generated by labour-intensive means), the popularity of such resources, and the definite benefit that could come from uniting them, efforts largely remain separate. They are divided because they are technologically incompatible (the relevant data comes in many different formats, from instrumental observations to lengthy descriptive accounts in different languages, and database systems are constantly changing), or because they are funded only for finite periods. They can quickly become forgotten when new projects take priority, or face obsolescence and lie in need of maintenance. They may also remain little known or largely undiscoverable, and can be difficult to get to grips with or inaccessible to the general user.

As a research team we had some difficult conversations regarding the format, availability and deposit of our research data. It was a significant time investment to input the data into TEMPEST, time that could have been spent writing papers or our currently unfinished project book. However, we persevered and it now contains c. 18,000 event records – and we have already experienced the rewards. TEMPEST makes it possible to quickly see where we have gathered multiple narratives detailing the same event (creating a picture of the geographical extent of impact), and to piece together particular seasons or the weather of particular years or groups of years. Without TEMPEST these tasks would have required another significant time investment, and would have been reliant on the quality of our memory of the research data. Full recollection would have been an impossible challenge given the sheer quantity of data we have collected.

Although the creation of a freely available online resource was detailed in our original funding application to AHRC, as the project progressed and the volume (and quality) of our research data surpassed our expectations, team members were understandably reluctant to have our research data freely available before we had completed writing it up. However, the desire for others to use it, and our belief in its utility and popularity, won out. Yet, even with an obligation to the AHRC to make our data available, there is no dedicated arts and humanities data repository in the UK, so it took some time to explore the various options that existed for depositing our dataset. We have just completed depositing our research data with CEDA (Centre for Environmental Data Analysis), where it is available for registered users to download as .csv files and analyse within Excel or other statistical software. A reference and DOI are provided for the dataset, alongside guidance notes relating to the data format, collection method and quality.

The database is also now ‘live’, though we may still change the URL as a result of institutional moves and the conclusion of the funded period of the project.

Putting our own research data ‘out there’ is not enough. Few people are likely to find it unless we engage in targeted publicity and promotion, and it remains the case that significant time investment is required to properly come to ‘know’ the data, and use it to its potential – it is quite difficult to just ‘dip in’. We hope to use some of the time and finances allocated by an AHRC ‘Follow on Funding’ project to produce some sample ‘database stories’, promote the resource, and embed and reconnect it with the archival repositories from which we have drawn data. We will also circulate our Geo paper to researchers involved in connected initiatives throughout Europe and further explore how these might be informally ‘joined up’. We also hope that we’ll be able to trace usage of our research data, whether it be by other academics wanting to contextualise their own research, by climate scientists developing computer models, by members of the public interested in the weather history of the place where they live, or by archive professionals interested in linking with other archives through documentary connections. As publications relating to the project are completed, we are publishing them through the gold Open Access route where funds can be secured, and we have definitely received wider readership and more interest in our work as a result – we can now also include reference to our research data and encourage its use.

Lucy Veale is a Research Associate in the Department of History, University of Liverpool, Georgina Endfield is Professor of Environmental History at the University of Liverpool, and Sarah Davies is a Reader in the Department of Geography and Earth Sciences at Aberystwyth University. 

Digital Data: Opening up the Weather Archive – Geo at #RGSIBG17

Join us on Wednesday 30 August at the RGS-IBG Annual International Conference for our Geo sponsored session ‘Digital Data: Opening up the Weather Archive’ (Education Centre, session 3, 14.40-16.20), convened by Georgina Endfield (The University of Liverpool), Lucy Veale (The University of Liverpool), and Sarah Davies (Aberystwyth University).

This session brings together researchers working on weather and climate history, existing or potential end users of research databases, and custodians of manuscript weather data, to critically evaluate the construction, management, application, and implications of digital weather data. Emphasis will be placed on thinking about the future of these tools and how we can improve connections between them, both technical and geographical.

The session will also include a live demonstration of the TEMPEST database (Tracking Extremes of Meteorological Phenomena Experienced in Space and Time). TEMPEST’s c. 20,000 records are drawn from primary research into original documentary sources held in archives around the UK and offer personalised and geo-referenced insights into the relationship between society and extreme weather in the UK spanning a period of over 400 years.

Audience members are encouraged to send in live queries relating to historical extreme weather events via Twitter (using the conference hashtag, #RGSIBG17); the discussion will also be of interest to researchers working on databases of other kinds.

Read the associated data paper: Dealing with the Deluge of Historical Weather Data: The example of the TEMPEST (Tracking Extremes of Meteorological Phenomena Experienced in Space and Time) Database.

Veale L., Endfield G., Davies S., Macdonald N., Naylor S., Royer M.-J., Bowen J., Tyler-Jones R., and Jones C. Dealing with the deluge of historical weather data: the example of the TEMPEST database. Geo: Geography and Environment. 2017, 4 (2), e00039

Visit the associated display in the Ambulatory: A Deluge of Documentary Weather Data, curated by Lucy Veale, Georgina Endfield and Sarah Davies.

This display explores extreme weather events in the UK, drawing on primary archival materials used in the AHRC-funded project ‘Spaces of Experience and Horizons of Expectation: The Implications of Extreme Weather, Past, Present and Future’. It also features primary archival materials from the RGS-IBG archives, including resources relating to the meteorological investigations of the Terra Nova expedition 1910-13, led by Captain Robert Falcon Scott.

The database and project have an audience beyond academia. The project team has worked with the RGS-IBG Schools team to produce a range of resources for teachers and students.

Watch the online lecture: Extreme Weather – The history of human-environmental interactions and our climatic past
Georgina Endfield explores the weather histories of unusual and extreme weather events, weather memories, and human responses linked to these events in an RGS-IBG School Member Lecture. (Free to access for a limited time).

Listen to Georgina Endfield on the RGS-IBG ‘Ask the Expert’ Podcast Series
In this podcast Laura Price (RGS-IBG) speaks to Georgina about TEMPEST. The podcast explores how and why extreme weather events have been inscribed into our cultural fabric. (Free to access).

Primary teacher guide *coming soon*
The guide aims to promote the use of the TEMPEST weather archive in schools, and pupils’ understanding of historical weather and climate extremes more broadly.

Animation *coming soon*
The animation, for KS4 students, introduces the historical diversity of weather experiences in the UK. Using examples from the RGS-IBG archives, it explores the environmental and cultural implications of the events.

Interested in finding out more about extreme weather geographies? Take a look at Georgina Endfield and Lucy Veale’s Discovering Britain trail, the Great Dun Fell walk. It explores the Helm wind (Britain’s only named wind), the landscape of the North Pennines, and the work of Gordon Manley, a geographer who pioneered the collection of meteorological data.