|DAY 1 | 2015-11-23 PRECONFERENCE|
|09:00 - 12:00||COLLOCATED EVENTS
Treffen der DINI AG KIM (Meeting of the DINI AG KIM, Germany)
Stefanie Rühle / Jana Hentschke
DINI AG KIM
|Metafacture "Get Together"
|13:00 - 19:00||WORKSHOPS AND TUTORIALS
Introduction to Linked Open Data
Felix Ostrowski / Adrian Pohl
graphthinking GmbH, Germany /
North Rhine-Westphalian Library Service Center (hbz), Germany
This introductory workshop covers the fundamentals of linked data technologies on the one hand and the basic legal issues of open data on the other. The RDF data model will be discussed, along with the concepts of dereferenceable URIs and common vocabularies. Throughout the workshop, participants will create and refine RDF documents about themselves, including links to other participants, to consolidate their knowledge of the topic. Based on the data created, the advantages of publishing linked data will be shown. On a side track, Open Data principles will be introduced, discussed and applied to the content created during the workshop.
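The RDF triple model at the heart of the workshop can be sketched in a few lines of Python. This is a toy illustration with invented URIs and names, not a substitute for a proper RDF library:

```python
# Minimal illustration of the RDF triple model using plain Python tuples.
# All URIs and names are invented for this example.
FOAF = "http://xmlns.com/foaf/0.1/"

alice = "http://example.org/people/alice"
bob = "http://example.org/people/bob"

graph = {
    (alice, FOAF + "name", '"Alice"'),
    (alice, FOAF + "knows", bob),  # a link to another participant
    (bob, FOAF + "name", '"Bob"'),
}

# Merging two descriptions is just a set union -- one reason
# linked data aggregates so easily across sources.
more = {(bob, FOAF + "knows", alice)}
merged = graph | more

def serialize_ntriples(g):
    """Render the graph in a simplified N-Triples style."""
    lines = []
    for s, p, o in sorted(g):
        o_str = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {o_str} .")
    return "\n".join(lines)

print(serialize_ntriples(merged))
```

Because subjects and objects are dereferenceable URIs, the mutual `foaf:knows` links above are exactly the kind of cross-participant links the workshop has attendees create.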
Schema.org Question Time Workshop
OCLC, United Kingdom
Schema.org is basically a simple vocabulary for describing stuff on the web. Since its launch by the major search engines (Google, Bing, Yahoo!, Yandex) in 2011 it has had a meteoric rise to become a de facto vocabulary on the web. How does it work; how do you use it; can it only be embedded in HTML; what happens if the search engines drop it; how do I mark up my pages; can I use it for my data like I would any other vocabulary; how is it managed; how applicable is it for bibliographic data; how does it compare with BIBFRAME; can it be extended; how can I influence its development; how widely used is it; what are the benefits of using it? These are all questions often asked about Schema.org. Join this session to hear answers to these questions, ask your own questions, and more. Amongst other things, you will walk through some simple examples of using Schema.org; learn how you can participate in the communities surrounding the development and extension of the vocabulary; and discuss how and why it is applicable to libraries and their data. The format of this workshop will mostly be driven by the participants raising questions and topics of concern for discussion in the group, introduced and facilitated by Richard Wallis, Chair of the Schema Bib Extend W3C Community Group. Come along and find out everything* you always wanted to know about Schema.org but never had the chance to ask. (* Richard will do his best to cover everything that can be answered in the session.)
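To give a flavour of the markup discussed in the session, here is a hypothetical Schema.org description of a book serialized as JSON-LD, one of the syntaxes Schema.org supports alongside Microdata and RDFa. All values are invented:

```python
import json

# A hypothetical Schema.org description of a book, serialized as JSON-LD
# and wrapped in the <script> element used to embed it in an HTML page.
book = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "Example Title",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "datePublished": "2015",
}

markup = '<script type="application/ld+json">\n{}\n</script>'.format(
    json.dumps(book, indent=2)
)
print(markup)
```

Search engines parse such blocks out of the page; the same JSON-LD can also be served on its own, which is one answer to the "can it only be embedded in HTML" question above.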
Bringing Your Content to the User, not the User to Your Content – a Lightweight Approach towards Integrating External Content via the EEXCESS Framework
Werner Bailer / Martin Höffernig
Joanneum Research, Austria
This workshop will look at the steps and tools needed to adapt scientific and cultural heritage assets for services that make them available to a wide community of users on the platforms they already use, e.g. as a plugin in their web browser or in their blogging environment. The EEXCESS project has developed such a service and welcomes GLAM institutions to make their data available.
Catmandu - a (Meta)Data Toolkit
Johann Rolschewski / Vitali Peil / Patrick Hochstenbach
Berlin State Library / Bielefeld University Library / Ghent University Library
Catmandu (http://librecat.org/Catmandu/) provides a suite of software modules to ease the import, storage, retrieval, export and transformation of (meta)data records. After a short introduction to Catmandu and its features, we will present the command line interface (CLI) and the domain-specific language (DSL). Participants will be guided through getting data from different sources via APIs, transforming data records to a common data model, storing/indexing them in Elasticsearch or MongoDB, querying data from the stores and exporting them to different formats. The intended audience is systems librarians, metadata librarians and data managers. Participants should be familiar with command line interfaces (CLI); programming experience is not required. A laptop with VirtualBox installed is required. The organisers will provide a VirtualBox image (Linux guest system) beforehand; participants can also install their own environment, see here. Participants may bring their own data (CSV, JSON, MAB2, MARC, PICA+, XLS, YAML).
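Catmandu itself is a Perl toolkit with its own Fix language, so the following Python sketch only mirrors the import, transform and export pattern the workshop walks through; the field names and sample data are invented:

```python
import csv
import io
import json

# Invented sample input with source-specific field names.
raw_csv = "id,ttl,yr\n1,Example Record,2015\n2,Another Record,2014\n"

def import_csv(text):
    """Import step: read raw records from a source format."""
    return list(csv.DictReader(io.StringIO(text)))

def to_common_model(rec):
    """Transform step: map source-specific fields to a shared data model."""
    return {"identifier": rec["id"], "title": rec["ttl"], "year": int(rec["yr"])}

# Export step: serialize the common model (here as JSON; a real pipeline
# might index the records into Elasticsearch or MongoDB instead).
records = [to_common_model(r) for r in import_csv(raw_csv)]
print(json.dumps(records, indent=2))
```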
RDF.rb & ActiveTriples: Working with RDF in Ruby
Digital Public Library of America, United States of America
This workshop covers the current state of RDF support in the dynamic object-oriented Ruby language. We will cover the following tools: Ruby RDF (RDF.rb) is a fully public domain suite of libraries implementing the full RDF model. The core library provides interfaces for working with resources, statements, graphs, and datasets. An extensive network of satellite libraries offers support for many serialization formats, basic reasoning, graph normalization, SPARQL, and persistence to a wide variety of triplestores. ActiveTriples is an object-graph-modeling interface built on top of Ruby RDF. It supports the ActiveModel interface for integration with Ruby on Rails and similar frameworks. The workshop is recommended for anyone looking for an expressive, accessible toolkit for working with RDF data. Some experience with programming and a basic knowledge of object-oriented concepts are assumed; experience with Ruby is not expected. Participants should come prepared with Ruby installed on their laptops, and may benefit from working through Ruby in 20 Minutes in advance of the session.
|DAY 2 | 2015-11-24 CONFERENCE|
|09:15 - 10:15||WELCOME / OPENING
Thorsten Meyer / Silke Schomburg
ZBW - Leibniz Information Centre for Economics, Germany / North Rhine-Westphalian Library Service Center (hbz)
Keynote: Maximising (Re)Usability of Library Metadata Using Linked Data
Asunción Gómez Pérez
Technical University of Madrid, Spain
Linked Data (LD) and related technologies are providing the means to connect high volumes of disconnected data at Web-scale and producing a huge global knowledge graph. The key benefits of applying LD principles to datasets are
|10:15 - 10:45||COFFEE BREAK|
|10:45 - 12:15||APPLICATIONS
Linked Data for Libraries: Experiments between Cornell, Harvard and Stanford
Cornell University, United States of America
The Linked Data for Libraries (LD4L) project aims to create a Linked Open Data (LOD) model that works both within individual institutions and across libraries to capture and leverage the intellectual value that librarians and other domain experts add to information resources when they describe, annotate, organize, and use those resources. First we developed a set of use cases illustrating the benefits of LOD in a library context. These served as a reference for the development of an LD4L ontology which includes bibliographic, person, curation, and usage information. It largely draws from existing ontologies, including the evolving BIBFRAME ontology. We have prioritized the ability to identify entities within library metadata records, reducing reliance on lexical forms of identity. Whenever possible we seek out persistent global identifiers for the entities being represented, for example identifiers from established efforts such as ORCID, VIAF, and ISNI for people, and OCLC identifiers for works. One group of LD4L use cases explores circulation and other usage data as sources that could improve discovery and inform collection building. We are exploring the use of an anonymized and normalized metric that may be shared and compared across institutions. Ontology work and software from the LD4L project are available from our GitHub repository.
|LOD for Applications – Using the Lobid API
Pascal Christoph / Fabian Steeg
It has often been correctly noted that many datasets have been published in the library world with little or no account of their actual use. In this talk we want to highlight some of this usage in the context of the hbz linked open data service lobid (which stands for "linking open bibliographic data"). The hbz has been experimenting with linked data technology since 2009. In November 2013 the hbz launched a linked open data API via its service lobid. This API provides access to different kinds of data:
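A client of a web API like this typically just builds a query URL and parses the JSON(-LD) response. The sketch below is illustrative only: the endpoint path and parameter names are assumptions, so consult the lobid documentation for the actual API.

```python
from urllib.parse import urlencode

# Sketch of constructing a request to a lobid-style HTTP API.
# The base URL and parameter names here are hypothetical.
def build_query_url(base, **params):
    """Assemble a GET request URL with percent-encoded parameters."""
    return base + "?" + urlencode(params)

url = build_query_url("http://lobid.org/resource", name="Semantic Web", format="json")
print(url)
# A real client would then fetch this with urllib.request.urlopen(url)
# and decode the JSON-LD response body.
```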
|HTTP-PATCH for Read-write Linked Data
Rurik Thomas Greenall
Computas AS, Norway
It can be argued that HTTP-PATCH is essential to read-write linked data; this being the case, there seems to be no definitive specification of how it should be implemented. In this talk, I present different alternatives for HTTP-PATCH and an implementation based on practical considerations from feature-driven development of a linked-data-based library platform at Oslo Public Library. Grounded in that work, I show how HTTP-PATCH can be implemented and used in everyday workflows, while considering several aspects of specifications such as LD-PATCH and RDF-PATCH, particularly in light of existing efforts such as JSON-PATCH. In describing the implementation, I pay particular attention to the practical issues of using linked data in a REST architecture and the widespread use of formats that do not support hypermedia and blank nodes. The talk examines the cognitive constraints imposed by the dominance of the traditional library technology stack and how these colour the development of new workflows and interfaces. Further, I offer some thoughts on how specifications like the Linked Data Platform can be reconciled with modern development techniques that largely shun such specifications, and how we can create read-write interfaces for linked data.
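At its core, a PATCH against an RDF resource boils down to a set of triples to delete and a set to add, applied atomically. The following toy Python sketch illustrates that idea only; it implements none of the named specifications, and all URIs are invented:

```python
# Toy model of a PATCH applied to an RDF graph: a pair of triple sets
# to delete and to add. Rejecting the whole patch when a triple to
# delete is missing mirrors the atomic semantics expected of PATCH.
s = "http://example.org/book/1"
DCT = "http://purl.org/dc/terms/"

graph = {
    (s, DCT + "title", '"Old title"'),
    (s, DCT + "creator", "http://example.org/person/1"),
}

patch = {
    "delete": {(s, DCT + "title", '"Old title"')},
    "add": {(s, DCT + "title", '"New title"')},
}

def apply_patch(g, p):
    """Apply delete/add sets atomically; fail if the precondition is unmet."""
    if not p["delete"] <= g:
        raise ValueError("triples to delete not present; patch rejected")
    return (g - p["delete"]) | p["add"]

graph = apply_patch(graph, patch)
```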
|12:15 - 13:45||LUNCH|
|13:45 - 15:30||RESEARCH
Call for Linked Research
University of Bonn, Germany
Linked Research sets out to socially and technically enable researchers to take full control, ownership, and responsibility of their own knowledge, so that research contributions are accessible to society at maximum capacity, dismantling archaic and artificial barriers. It is intended to drive a paradigm shift in all aspects of scholarly communication by fostering the use of the native Web stack. Linked Research proposes an acid test to the research community in order to verify, approve, or test the openness, accessibility and flexibility of approaches to enhanced scholarly communication. Dokieli is a decentralized authoring, annotation, and social interaction tool complying with this initiative. This talk will discuss and demonstrate what works!
A RESTful JSON-LD Architecture for Unraveling Hidden References to Research Data
Konstantin Baierer / Philipp Zumstein
Mannheim University Library, Germany
Data citations are more common today, but more often than not references to research data do not follow any formalism the way references to publications do. The InFoLiS project makes these "hidden" references explicit using text mining techniques. They are made available for integration by software agents (e.g. retrieval systems). In the second phase of the project we aim to build a flexible and long-term sustainable infrastructure housing the algorithms as well as APIs for embedding them into existing systems. The infrastructure's primary directive is to provide lightweight read/write access to the resources that define the InFoLiS data model (algorithms, metadata, patterns, publications, etc.). The InFoLiS data model is implemented as a JSON schema and provides full forward compatibility with RDF through JSON-LD, using a JSON-to-RDF schema-ontology mapping and reusing established vocabularies whenever possible. We use neither a triplestore nor an RDBMS, but a document database (MongoDB). This allows us to adhere to the Linked Data principles while minimizing the complexity of mappings between different resource representations. Consequently, our web services are lightweight, making it easy to integrate InFoLiS data into information retrieval systems, publication management systems or reference management software. Linked Data agents expecting RDF, on the other hand, can consume the API responses as triples; they can query the SPARQL endpoint or download a full RDF dump of the database. We will demonstrate a lightweight tool that uses the InFoLiS web services to augment the web browsing experience for data scientists and librarians.
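The JSON-to-RDF bridge works roughly like JSON-LD expansion: a `@context` maps plain JSON keys to predicate URIs, so the same document serves both plain-JSON consumers and Linked Data agents. The sketch below covers only a tiny subset of real JSON-LD processing, and the keys and URIs are invented:

```python
# Simplified sketch of JSON-LD: a @context maps JSON keys to predicate
# URIs, making a plain JSON document readable as RDF triples.
context = {
    "title": "http://purl.org/dc/terms/title",
    "citesDataset": {
        "@id": "http://purl.org/spar/cito/citesAsEvidence",
        "@type": "@id",
    },
}

doc = {
    "@context": context,
    "@id": "http://example.org/publication/42",
    "title": "A Study Using Survey Data",
    "citesDataset": "http://example.org/dataset/7",
}

def expand(d):
    """Naive expansion: map terms to full predicate URIs.

    Real JSON-LD expansion handles nesting, lists, datatypes etc.;
    this covers only flat string-valued keys.
    """
    ctx, subj = d["@context"], d["@id"]
    triples = set()
    for key, value in d.items():
        if key.startswith("@"):
            continue
        term = ctx[key]
        pred = term["@id"] if isinstance(term, dict) else term
        triples.add((subj, pred, value))
    return triples

print(expand(doc))
```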
|Researchers’ Identity Management in the 21st Century Networked World: A Case Study of AUC Faculty Publications
American University in Cairo, Egypt
This project will explore how American University in Cairo (AUC) faculty members distributed their scholarly and creative works, and how their names are identified in author identifier systems and/or on the Web. The goal is to explore how best to present their data as linked data. The project will use the AUC faculty names listed in AUC Faculty Publications: 2012 Calendar Year. These names will be used to search author identifier systems to answer;
|15:30 - 16:00||COFFEE BREAK|
|16:00 - 17:30||LIGHTNING / BREAKOUT
As an experiment, this year's SWIB will provide space for groups of participants to meet, discuss and exchange ideas or results of their work on specific topics. Please introduce your topic and the main question to discuss with a short statement in the preceding lightning talks.
|DAY 3 | 2015-11-25 CONFERENCE|
|09:00 - 10:15||OPENING
Keynote: The Digital Cavemen of Linked Lascaux
Ghent University – iMinds, Belgium
Some 17,000 years ago, cavemen, cavewomen and cavekids picked up their cavebrushes to paint caveanimals on their cavewalls in a place that eventually would become known as the Lascaux complex. Their cavehands eternalized cavehorses and cavedeer in shady corners, an art form which continues to inspire contemporary artists such as Banksy. Despite the millennia-long deprecation of cave technology (X-caveML 2.0 never really caught on), we can still admire Lascauxian cave art, even though we will probably remain eternally oblivious of its purpose if there ever was any. This sharply contrasts with an Excel 97 sheet named mybooks.xls.bak I tried to open yesterday: perfectly remembering its purpose (my dad was maintaining a list of books he had read), I'm unable to revive the splendid tabular chaos undoubtedly typeset in Times New Roman or worse. 17 years ago somebody made a simple spreadsheet and it's literally less accessible than a 17,000 year old scribble by an unknown caveartist. Not to mention the philistines who are blacking out Banksy's recent works, which date back to last year or so. And certainly don't get me started about sustainable Linked Data. I mean, is there really such a thing? We'll be lucky if any triple at all survives 17 years. Or 17 months, for that matter. Some even have trouble keeping a SPARQL endpoint up for 17 hours. Or minutes. We might not be very good cavemen. This talk combines lessons learned from the Semantic Web, the REST principles, and the Web in general to think about what sustainability for Linked Data could really mean and how we just might achieve it.
|Data-Transformation on Historical Data Using the RDF Data Cube Vocabulary
Sebastian Bayerl / Michael Granitzer
University of Passau, Germany
This work describes how XML-based TEI documents containing statistical data can be normalized, converted and enriched using the RDF Data Cube Vocabulary. In particular we focus on a real-world statistical data set, namely the statistics of the German Reich around the year 1880, which are available in the TEI format. The data is embedded in complex structured tables which are relatively easy for humans to understand but, due to their varying structural properties and differing table layouts, unsuitable for automated processing and data analysis without heavy pre-processing. Therefore, the complex structured tables must be validated, modified and transformed until they fit the standardized multi-dimensional data structure, the data cube. This work focuses especially on the transformations necessary to normalize the structure of the tables. Validation and cleaning steps, resolving row and column spans, and reordering slices are among the available transformations. By combining existing transformations, compound operators are implemented which can handle specific and complex problems. The identification of structural similarities or properties can be used to automatically suggest sequences of transformations. A second focus is on the advantages of using the RDF Data Cube Vocabulary. A research prototype was also implemented to execute the workflow and convert the statistical data into data cubes.
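For a flavour of the target structure, here is a sketch of how one row of a normalized table might become an RDF Data Cube observation. The `qb:` terms come from the W3C Data Cube Vocabulary, while the dataset, dimension and measure URIs and the sample values are invented:

```python
# Sketch: turning one row of a normalized statistical table into an
# RDF Data Cube observation, as a set of (subject, predicate, object)
# triples. Sample values and example.org URIs are invented.
QB = "http://purl.org/linked-data/cube#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"

row = {"region": "Bavaria", "year": "1880", "population": "5284778"}

obs = EX + "obs/bavaria-1880"
triples = {
    (obs, RDF + "type", QB + "Observation"),
    (obs, QB + "dataSet", EX + "dataset/population"),
    # dimensions locate the observation in the cube ...
    (obs, EX + "dimension/region", EX + "region/" + row["region"].lower()),
    (obs, EX + "dimension/year", row["year"]),
    # ... and the measure carries the actual value
    (obs, EX + "measure/population", row["population"]),
}
```

Resolving row and column spans first, as the talk describes, is what guarantees each table cell can be addressed by exactly one dimension combination like this.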
|10:15 - 10:45||COFFEE BREAK|
|10:45 - 12:15||LODLAM
Linking Data about the Past Through Geography: Pelagios, Recogito & Peripleo
Rainer Simon / Elton Barker / Leif Isaksen / Pau de Soto Cañamares
AIT Austrian Institute of Technology / The Open University / University of Southampton, United Kingdom
Pelagios is a community-driven initiative that facilitates better linkages between online resources documenting the past, based on the places they refer to. Our member projects are connected by a shared vision of a world in which the geography of the past is every bit as interconnected, interactive and interesting as the present. Pelagios has been working towards establishing conventions, best practices and tools in several areas of "Linked Ancient World Data":
|Modeling and Exchanging Annotations for Europeana Projects
Hugo Manguinhas / Antoine Isaac / Valentine Charles / Sergiu Gordea / Maarten Brinkerink
The Europeana Foundation, Netherlands / Austrian Institute of Technology / Netherlands Institute for Sound and Vision
Cultural heritage institutions are looking at crowdsourcing as a new opportunity to improve the overall quality of their data and to contribute better semantic descriptions and links to the web of data. This is also the case for Europeana, where crowdsourcing in the form of annotations is envisioned and being worked on in several projects. As part of the Europeana Sounds project, we have identified user stories and requirements that cover the following annotation scenarios: open and controlled tagging; enrichment of metadata; annotation of media resources; linking to other objects; moderation and general discussion. The first success in bringing annotations to Europeana is the integration of annotations to Europeana objects made on the HistoryPin.org platform, covering both the tagging and object-linking scenarios. The next step will be to help data providers support annotation on their side, for which we are working with the Pundit annotation tool. Central to all these efforts is an agreement on how annotations should be modelled uniformly across scenarios, as this is essential for bringing such information to Europeana in a way that can also be easily exploited and shared beyond our portal. For this, we are using the recent Web Annotation Data Model supported by the Open Annotation community, as it is the most promising model at the moment. Building on its flexible design, we have made recommendations on how it should be applied to these scenarios, and we are looking for discussion and feedback from the community in the hope that it will help cultural heritage institutions better understand how annotations can be modelled.
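For illustration, a minimal tagging annotation in the Web Annotation Data Model might look like the JSON-LD below; the target URI and tag value are invented:

```python
import json

# A simple tagging annotation following the W3C Web Annotation Data
# Model. The target URI and the tag text are invented examples.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "motivation": "tagging",
    "body": {"type": "TextualBody", "value": "field recording"},
    "target": "http://data.europeana.eu/item/example/123",
}
print(json.dumps(annotation, indent=2))
```

The other scenarios listed above (metadata enrichment, object linking, moderation) reuse the same body/target structure with different motivations and body types, which is what makes a single model workable across all of them.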
|ALIADA, an Open Source Solution to Easily Publish Linked Data of Libraries and Museums
Aliada Consortium, Spain
ALIADA is an open source solution designed by art libraries and museums, ILS vendors and Semantic Web experts to help cultural heritage institutions automatically convert, link and publish their library and museum data as Linked Open Data. Institutions that can export their metadata as MARCXML or LIDO XML can choose ALIADA as their ally in the challenge of liberating cultural institutions from their current data silos and integrating library and museum data into the Semantic Web. ALIADA uses its own ontology based on FRBRoo, SKOS, FOAF and WGS84, the ontologies most used by the linked open datasets analyzed during the design of the tool. This ontology is expected to be updated with new emerging models and vocabularies, such as RDA or BIBFRAME, according to the demands of the ALIADA community. ALIADA can be integrated with the current management system of a library or museum, allowing non-expert staff to easily select and import metadata into ALIADA. Once the file is validated, the user can start the "RDFizer" to create the triples using the existing mapping templates. Owing to the complexity of the format, not all MARC mappings could be carried over into RDF using the FRBRoo ontology. Along with the RDF conversion, ALIADA provides a set of predefined SPARQL queries to check the URIs. The next step in the workflow is linking to other datasets: ALIADA offers a list of external datasets that can be linked to, including Europeana, DBpedia and VIAF. Finally, ALIADA shows the dataset before publishing it on the DataHub.
|12:15 - 13:45||LUNCH|
|13:45 - 15:30||METADATA
Mistakes Have Been Made
The cultural heritage data communities are racing forward into the future with FRBR, BIBFRAME, RDA, and other bibliographic models. Unfortunately, these models are weighted down with the long history of bibliographic description, like stones in our pockets. As someone who worked on the cusp between card catalogs and machine-readable data, Coyle looks back on the moments in our recent history when we should have emptied our pockets and moved forward. As one who was there, she offers her own mea culpas. Coyle will also surprise you with the truth about FRBR and some radical thinking about what to do with the past that is holding us back from achieving the future we should be pursuing.
|Metadata Records & RDF: Validation, Record Scope, State, and the Statement-centric Model
Digital Public Library of America, United States of America
Advocates for the use of RDF as a model for metadata in the cultural heritage sector have frequently spoken of the death of the “record”. Indeed, the shift from a document-centric approach to one based on identified resources and atomic statements is an important one. Yet current work on validation as well as requirements for day-to-day metadata management and attribution point back to aspects of a record-driven worldview. This session will address some historical views of records, contrasting them with the formal model adopted by RDF 1.1 and commonly accepted best practices for Linked Data. Practical implications of the RDF model will be explored, with questions raised regarding the management of state, mutability, and “record” workflows. A provisional approach for managing RDF resources and graphs in record-like contexts is proposed, with connections to RDF Shapes, DC Application Profiles, and Linked Data Platform. Use cases from the Digital Public Library of America will be presented as illustrative examples.
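One way to picture the kind of provisional approach described above: treat a "record" as a named, versioned snapshot of all statements about a resource, so it can be validated and managed as a unit while the underlying data stays statement-centric. A toy Python sketch, with invented names and URIs:

```python
from copy import deepcopy

# Toy sketch: a "record" as a named, versioned snapshot of the
# statements about one resource. URIs and values are invented.
s = "http://example.org/item/9"
statements = {
    (s, "http://purl.org/dc/terms/title", '"A Title"'),
    (s, "http://purl.org/dc/terms/language", '"en"'),
}

record = {
    "graph_name": s + "#record",  # the named graph holding the snapshot
    "statements": deepcopy(statements),
    "version": 1,
}

# Mutation produces a new state; the previous version can be retained,
# giving record-like workflows (attribution, rollback) over atomic triples.
new_record = deepcopy(record)
new_record["statements"].add(
    (s, "http://purl.org/dc/terms/creator", '"Anonymous"')
)
new_record["version"] += 1
```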
|Evaluation of Metadata Enrichment Practices in Digital Libraries: Steps towards Better Data Enrichments
Valentine Charles / Juliane Stiller
Europeana Foundation, The Netherlands / Humboldt-Universität zu Berlin, Germany
In large cultural heritage data aggregation systems such as Europeana, automatic and manual metadata enrichments are used to overcome the issues raised by multilingual and heterogeneous data. Enrichments are based on linked open datasets and can be very beneficial for enabling retrieval across languages, adding context to cultural heritage objects and improving the overall quality of the metadata. However, if not done correctly, enrichments may turn into errors which propagate across several languages, impacting retrieval performance and user experience. To identify the different processes that impact the quality of enrichments, representatives of Europeana and affiliated projects have organised a series of experiments applying several enrichment techniques to a particular dataset composed of random metadata samples from several data providers and several domains, mainly library-held cultural heritage digital objects. Comparing and analysing the results shows that selecting appropriate target vocabularies and fine-tuning enrichment rules are as important as defining evaluation methods. The development of flexible workflows will contribute to better interoperability between enrichment services and data, but might make individual enrichment processes more ambivalent. Efforts in which users evaluate and correct enrichments, as well as the enrichments' impact on retrieval and user experience, also need to be considered. The presentation will show how a better understanding of enrichment methodologies will help cultural heritage institutions, and specifically libraries, to get the semantics right.
|15:30||END OF CONFERENCE|