CEDAR Project

One of the main practical bottlenecks in our efforts to provide better access to, and use of, the Dutch historical censuses is the conversion of the census tables (Excel) to RDF. As of October the CEDAR project, together with DANS, has scaled up its scope from samples to the entire dataset and converted around 2300 tables (1795-1971) into the RDF Linked Data format. For the first time we are now able to query the entire dataset and answer specific questions such as: how did the structure of the census evolve throughout its life cycle, which variables do we have per census, how many variables do we have in total, and which classification systems are similar?

On 24-25 October 2013, the International UDC Seminar “Classification & Visualization: Interfaces to Knowledge” took place at the Koninklijke Bibliotheek in The Hague, where CEDAR was represented with a poster and took part in many interactions during the conference itself. The programme comprised 19 talks and a poster exhibition, in which the CEDAR poster (conference posters) was shown next to a selection of posters from “Places & Spaces: Mapping Science”, and closed with a demo of UDC Online. The conference was preceded by the workshop “Knowledge order and Science”, where CEDAR also presented a poster (KNOWeSCAPE project). Looking ahead, CEDAR will present a paper on “Harmonization of Historical Dutch Census Data” at the Tenth European Social Science History Conference in Vienna (2014), in the Spatial and Digital History session. This session is specifically concerned with using digital technologies to study the past.

CEDAR, Intersect, and eHumanities research in Australia

Another appointment for CEDAR in Sydney was a meeting with Ingrid Mason and Anne Cregan at Intersect Australia. Intersect is Australia’s largest full-service eResearch support agency.

Albert and Anne. Photo: Ingrid Mason

One project at Intersect, run by Anne Cregan among others, is the Humanities Networked Infrastructure (HuNI). The project aims to bring together multiple humanities datasets, extract structured information from them, and increase their linkage and accessibility. It has many similarities with the Computational Humanities Programme of the KNAW. All of us put our research on the table. Ingrid gave a very nice description of the work she did with librarians and archivists, and of the issues they faced in classifying information (especially with automatically extracted classifications). Anne went into depth on the fine-grained semantics that such classifications always require. Albert explained the challenges we are facing in CEDAR and the eHumanities group, and how nicely these relate to the ones at Intersect and HuNI.

The discussion went deeper, until the topic returned to data comparability (as it did in the SemStats workshop) and how semantic technologies can help in the worlds of the humanities, libraries and archives. At some point we remembered that “there’s no true modelling” for such classifications: biologists, who are perhaps the most ancient classifiers in history, have been unable to reach agreement between the ontology of current species (that is, the species as they are today) and the ontology of evolution (that is, the species as they have been over time). We also discussed library standards and research methodology, identifying the extremes of “just” doing science, on the one hand, and “just” providing services to end users, on the other. We agreed, in any case, that research needs to solve real-world problems. The discussion kept flowing towards old authors who are always good to read, and as we said goodbye we realised we had made some new friends at Intersect Australia.