Things to do in Denver while still alive
Denver is the place to go for conferences. In May 2015, Andreas van Cranenburgh attended a conference on computational linguistics to present work he and Corina Koolen had done on predicting the literariness of novels. In November 2015, I went to the annual 4S conference (Society for Social Studies of Science). As always, it was a full and varied programme. I was there primarily because of my other job (as director of WTMC, the Netherlands Graduate Research School for Science, Technology & Modern Culture), but there was lots on offer that was also relevant for digital humanities, broadly defined.

The standard method in STS (Science and Technology Studies) is the ethnographic case study, but at this meeting many sessions were devoted to alternative methods, including a series of sessions on digital STS and one session straightforwardly called ‘moving beyond the case study’. In the latter, several people presented analyses that would be familiar to digital humanists, for example applying big data analytics to relatively large corpora of climate change negotiations and of abstracts from conservation biology articles, in order to look for patterns. One important issue to emerge from these multi-scalar readings was how easy it is to lose sight of the exceptions, of marginal topics. In the same session, Misha Teplitskiy from the KnowledgeLab, University of Chicago, picked up on the recent report about reproducibility (or the lack thereof) in psychology to see whether anything similar was happening in sociology, or at least in quantitative sociological analyses of the US General Social Survey (GSS). The GSS is public data, there is a curated database of publications based on the GSS, and many of those publications rest on linear regression analyses, so in principle they should be easy to reproduce.
Teplitskiy and his colleagues had started from the assumption that there are three main reasons for lack of reproducibility within sociology – fraud, selective reporting, and the fact that the social world changes. He concluded that most of the lack of reproducibility arises from the last of these. In the discussion, all of the contributors agreed that quantitative, big data analyses can supplement qualitative research in interesting ways, through the identification of patterns and exceptions, but they also highlighted the serious problem of reifying visualisations based on big data.
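A minimal sketch may help make the distinction concrete. This is not Teplitskiy's actual analysis, and the data below are synthetic rather than drawn from the GSS: it simply illustrates why a published linear regression on public survey data re-runs exactly, yet a ‘replication’ on a later wave can still fail if the underlying social relationship has shifted.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ols(x, y):
    """Ordinary least squares fit; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical earlier survey wave: outcome depends on predictor with slope 0.8
x_old = rng.normal(size=2000)
y_old = 1.0 + 0.8 * x_old + rng.normal(scale=0.5, size=2000)

# Hypothetical later wave: the same relationship has weakened (slope 0.3)
x_new = rng.normal(size=2000)
y_new = 1.0 + 0.3 * x_new + rng.normal(scale=0.5, size=2000)

beta_published = fit_ols(x_old, y_old)
beta_rerun = fit_ols(x_old, y_old)        # same public data -> identical estimates
beta_replication = fit_ols(x_new, y_new)  # newer wave -> different estimates

# Re-running on the original data is trivially reproducible...
print(np.allclose(beta_published, beta_rerun))
# ...but the slope estimated from the later wave diverges, not through
# fraud or selective reporting, but because the world changed.
print(abs(beta_published[1] - beta_replication[1]))
```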
I also attended interesting sessions about data sharing and re-use in different disciplines. A presentation by Alison Cool and Marianne de Laet picked up the big data questions, looking at how shifts from little to big data are related to questions not only of privacy but also of value. Peter Darch talked about the design of data infrastructures in astronomy, a project that also involves Christine Borgman, a regular visitor to the eHumanities group. Their colleagues – Ashley Sands and Milena Golshan – will present the project at the ‘new trends in eHumanities’ meeting planned for 18 February.

Eric Meyer gave a great presentation about work he and his Oxford Internet Institute colleagues have done for the OECD about data sharing amongst Alzheimer researchers. This disease is high on the policy agenda, given the ageing population in some parts of the world. He pointed out that data sharing is very difficult amongst Alzheimer researchers when they have not only heterogeneous data (brain scans, genetics, behavioural, etc.) but also very different understandings of the cause of the disease, confirming the STS insight that data are only meaningful in particular theoretical, epistemic and social contexts. Their report includes a discussion of how the structural challenges to data sharing are only partly technical. The real challenges are those facing organisations, disciplines and individual researchers, around skills, incentives and funding. One challenge, bordering the technical and the organisational, is that Alzheimer researchers have tended to build their own databases without collaborating with computer and information scientists, with consequences for the quality and sustainability of the data and the metadata. This is not unique to Alzheimer researchers, and it reinforces the need for interdisciplinary collaboration.
Richard Arias-Hernandez from the University of British Columbia gave a very moving presentation about the design of a digital archive for the Canadian National Centre for Truth and Reconciliation, devoted to collecting and making accessible material related to a shameful part of Canada’s history. Starting in the 19th century, but continuing into the 20th, 150,000 indigenous children were placed into residential schools, as a means of forcing First Peoples to adopt the norms of the settler state and churches. Much of the material related to this practice is scattered across the country, and much of the material is highly sensitive for survivors and their families, and for perpetrators and the state. Building a digital archive that meets very different needs is a challenging and important task.
In light of the award of the 2015 Erasmus Prize to the Wikipedia community, there was a very interesting presentation by Amanda Menking and Jonathan Rosenberg, both from the University of Washington, offering a feminist critique of the Wikipedia rules. Drawing on feminist epistemology, including the work of Helen Longino (awarded an honorary doctorate by the VU in 2014), they put forward a new set of principles for Wikipedia, principles that would allow and display polyvocality, partly through a visualization of the talk pages. This resonates with an article I wrote with Anna Harris (Maastricht University) and Susan Kelly (Exeter University) about how controversy gets handled by Wikipedians, through an analysis of the talk behind the schizophrenia genetics page. This article will be published in February 2016, in the journal Science and Technology Studies.
For the first time, the 4S conference devoted an afternoon to alternative forms of representation and engagement, the ‘Making and Doing Program’. Yanni Loukissas, Georgia Tech, presented his ‘Life and death of data’ project (warning – site can take a while to load) based around the collection of the Arnold Arboretum at Harvard University. By combining data visualisation with interviews with Arboretum staff, this interactive project illuminates what data can reveal about their own social and material histories.
As always at big conferences, much of the most interesting discussion took place over coffee, drinks & food. Apart from the usual academic gossip, the most interesting ‘fact’ I heard over coffee was about the NASA chimpanzees brought from Africa in the 1950s. Their descendants were sold by NASA as ‘surplus equipment’, raising questions not only about the ethics of animal experimentation but also about NASA’s attitude to alien intelligence.