DAY 1 | 2015-11-23 PRECONFERENCE |
09:00 - 12:00 | COLLOCATED EVENTS Treffen der DINI AG KIM (Meeting of the DINI AG KIM, Germany) Stefanie Rühle / Jana Hentschke DINI AG KIM Abstract: Agenda (the meeting is held in German) |
Metafacture "Get Together" Pascal Christoph hbz, Germany AbstractIn this informal get together of Metafacture users we will discuss questions like: How are we using Metafacture? What are the enhancements we have done and why? What problems remain unresolved? |
|
13:00 - 19:00 | WORKSHOPS AND TUTORIALS Introduction to Linked Open Data Felix Ostrowski / Adrian Pohl graphthinking GmbH, Germany / North Rhine-Westphalian Library Service Center (hbz), Germany Abstract: This introductory workshop covers the fundamentals of linked data technologies on the one hand and the basic legal issues of open data on the other. The RDF data model will be discussed, along with the concepts of dereferenceable URIs and common vocabularies. To consolidate their knowledge of the topic, participants will continuously create and refine RDF documents about themselves, including links to other participants. Based on the data created, the advantages of publishing linked data will be demonstrated. On a side track, Open Data principles will be introduced, discussed, and applied to the content created during the workshop. |
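The kind of document participants will build can be sketched as follows — a minimal example using Python's rdflib, with all names and URIs hypothetical:

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF, RDF

# Hypothetical dereferenceable URIs for two workshop participants
alice = URIRef("http://example.org/people/alice#me")
bob = URIRef("http://example.org/people/bob#me")

g = Graph()
g.add((alice, RDF.type, FOAF.Person))        # typed with a common vocabulary
g.add((alice, FOAF.name, Literal("Alice")))
g.add((alice, FOAF.knows, bob))              # a link to another participant

print(g.serialize(format="turtle"))
```

Serializing and then dereferencing such documents is what turns a pile of per-person files into linked data.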
Schema.org Question Time Workshop Richard Wallis OCLC, United Kingdom Abstract: Schema.org is, at heart, a simple vocabulary for describing things on the web. Since its launch by the major search engines (Google, Bing, Yahoo!, Yandex) in 2011, it has risen meteorically to become a de facto vocabulary on the web. How does it work; how do you use it; can it only be embedded in HTML; what happens if the search engines drop it; how do I mark up my pages; can I use it for my data like any other vocabulary; how is it managed; how applicable is it to bibliographic data; how does it compare with BIBFRAME; can it be extended; how can I influence its development; how widely used is it; what are the benefits of using it — all questions that are often asked about Schema.org. Join this session to hear answers to these questions, ask your own, and more. Amongst other things, you will walk through some simple examples of using Schema.org, see how you can participate in the communities surrounding the development and extension of the vocabulary, and discuss how and why it is applicable to libraries and their data. The format of this workshop will mostly be driven by the participants raising questions and topics of concern for discussion in the group, introduced and facilitated by Richard Wallis, Chair of the Schema Bib Extend W3C Community Group. Come along and find out everything* you always wanted to know about Schema.org but never had the chance to ask. (* Richard will do his best to cover everything that can be answered in the session.) |
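To give a flavour of the "simple examples" mentioned above, here is a sketch of a Schema.org description serialized as JSON-LD, one of several syntaxes in which the vocabulary can be embedded; the bibliographic details are invented:

```python
import json

# A hypothetical book description using the schema.org vocabulary as JSON-LD
book = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "An Invented Title",
    "author": {"@type": "Person", "name": "A. N. Author"},
    "datePublished": "2015",
    "isbn": "000-0-00-000000-0",  # placeholder ISBN
}
print(json.dumps(book, indent=2))
```

Embedded in a web page (e.g. in a script tag), such a description is what the search engines consume.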
|
Bringing Your Content to the User, not the User to Your Content – a Lightweight Approach towards Integrating External Content via the EEXCESS Framework Werner Bailer / Martin Höffernig Joanneum Research, Austria Abstract: This workshop will look at the steps and tools needed to adapt scientific and cultural heritage assets for services that make them available to a wide community of users on the platforms they already use, e.g. as a plugin in their web browser or in their blogging environment. The EEXCESS project has developed such a service and welcomes GLAM institutions to make their data available. |
|
Catmandu - a (Meta)Data Toolkit Johann Rolschewski / Vitali Peil / Patrick Hochstenbach Berlin State Library / Bielefeld University Library / Ghent University Library Abstract: Catmandu http://librecat.org/Catmandu/ provides a suite of software modules to ease the import, storage, retrieval, export, and transformation of (meta)data records. After a short introduction to Catmandu and its features, we will present the command line interface (CLI) and the domain specific language (DSL). Participants will be guided to get data from different sources via APIs, transform data records to a common data model, store/index them in Elasticsearch or MongoDB, query data from stores, and export them to different formats. The intended audience is systems librarians, metadata librarians, and data managers. Participants should be familiar with command line interfaces (CLIs); programming experience is not required. A laptop with VirtualBox installed is required. Organisers will provide a VirtualBox image (Linux guest system) beforehand. Participants can also install their own environment, see here. Participants may bring their own data (CSV, JSON, MAB2, MARC, PICA+, XLS, YAML). |
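As a taste of the CLI and Fix language the workshop introduces — a minimal sketch following the patterns in the Catmandu documentation, assuming a MARC file records.mrc and a hypothetical Fix script title.fix:

```sh
# Convert MARC records to JSON on the command line
catmandu convert MARC to JSON < records.mrc

# Apply a Fix script while converting; title.fix might contain:
#   marc_map('245a', 'title')
#   remove_field('record')
catmandu convert MARC to JSON --fix title.fix < records.mrc
```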
|
RDF.rb & ActiveTriples: Working with RDF in Ruby Thomas Johnson Digital Public Library of America, United States of America AbstractThis workshop covers the current state of RDF support in the dynamic Object Oriented Ruby language. We will cover the following tools: Ruby RDF (RDF.rb) is a fully public domain suite of libraries implementing the full RDF model. The core library provides interfaces for working with resources, statements, graphs, and datasets. An extensive network of satellite libraries offer support for many serialization formats, basic reasoning, graph normalization, SPARQL, and persistence to a wide variety of triplestores. ActiveTriples is an Object-Graph-Modeling interface built over Ruby RDF. It supports the ActiveModel interface for integration with Ruby on Rails and similar frameworks. The workshop is recommended for anyone looking for an expressive, accessible toolkit to work with RDF data. Some experience with programming and a basic knowledge of Object Oriented concepts is assumed; experience with Ruby is not expected. Participants should come prepared with Ruby installed on their laptops, and may benefit from working through Ruby in 20 Minutes in advance of the session. |
DAY 2 | 2015-11-24 CONFERENCE |
09:15 - 10:15 | WELCOME / OPENING Welcome Thorsten Meyer / Silke Schomburg ZBW - Leibniz Information Centre for Economics, Germany / North Rhine-Westphalian Library Service Center (hbz) |
Keynote: Maximising (Re)Usability of Library Metadata Using Linked Data Asunción Gómez Pérez Technical University of Madrid, Spain Abstract: Linked Data (LD) and related technologies are providing the means to connect high volumes of disconnected data at Web scale, producing a huge global knowledge graph. The key benefits of applying LD principles to datasets are
|
|
10:15 - 10:45 | COFFEE BREAK |
10:45 - 12:15 | APPLICATIONS Linked Data for Libraries: Experiments between Cornell, Harvard and Stanford Simeon Warner Cornell University, United States of America Abstract: The Linked Data for Libraries (LD4L) project aims to create a Linked Open Data (LOD) model that works both within individual institutions and across libraries to capture and leverage the intellectual value that librarians and other domain experts add to information resources when they describe, annotate, organize, and use those resources. First we developed a set of use cases illustrating the benefits of LOD in a library context. These served as a reference for the development of an LD4L ontology, which includes bibliographic, person, curation, and usage information and draws largely from existing ontologies, including the evolving BIBFRAME ontology. We have prioritized the ability to identify entities within library metadata records, reducing reliance on lexical forms of identity. Whenever possible we seek out persistent global identifiers for the entities being represented — for example, identifiers from established efforts such as ORCID, VIAF, and ISNI for people, and OCLC identifiers for works. One group of LD4L use cases explores circulation and other usage data as sources that could improve discovery and inform collection building. We are exploring the use of an anonymized and normalized metric that may be shared and compared across institutions. Ontology work and software from the LD4L project are available from our GitHub repository. |
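The identifier practice described above can be sketched as follows — a hypothetical local person entity linked to established identifiers, using owl:sameAs as one common way to record such links (Python/rdflib; the local URI is invented and the VIAF/ORCID identifiers are placeholders, not real records):

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF, OWL

g = Graph()
# Hypothetical local entity URI for a person described in catalogue records
person = URIRef("http://example.edu/entities/person/1")
g.add((person, FOAF.name, Literal("Jane Scholar")))

# Links to persistent global identifiers (placeholder IDs, real namespaces),
# reducing reliance on the lexical form of the name for identity
g.add((person, OWL.sameAs, URIRef("http://viaf.org/viaf/00000000")))
g.add((person, OWL.sameAs, URIRef("http://orcid.org/0000-0000-0000-0000")))
```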
LOD for Applications – Using the Lobid API Pascal Christoph / Fabian Steeg hbz, Germany Abstract: It has often been noted, correctly, that many datasets in the library world have been published with little or no account of their actual use. In this talk we want to highlight some of this usage in the context of the hbz linked open data service lobid (which stands for "linking open bibliographic data"). The hbz has been experimenting with linked data technology since 2009, and in November 2013 launched a linked open data API via its service lobid. This API provides access to different kinds of data:
|
|
HTTP-PATCH for Read-write Linked Data Rurik Thomas Greenall Computas AS, Norway Abstract: It can be argued that HTTP PATCH is essential to read-write linked data; this being the case, there seems to be no definitive specification of how it should be implemented. In this talk, I present different alternatives for HTTP PATCH and an implementation based on practical considerations from feature-driven development of a linked-data-based library platform at Oslo Public Library. Grounded in that work, I show how HTTP PATCH can be implemented and used in everyday workflows, considering aspects of specifications such as LD Patch and RDF Patch, particularly in light of existing efforts such as JSON Patch. In describing the implementation, I pay particular attention to the practical issues of using linked data in a REST architecture: the widespread use of formats that do not support hypermedia, and blank nodes. The talk examines the cognitive constraints imposed by the dominance of the traditional library technology stack and how these colour the development of new workflows and interfaces. Further, I offer some thoughts on how specifications like the Linked Data Platform can be reconciled with modern development techniques that largely shun such specifications, and on how we can create read-write interfaces for linked data. |
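As an illustration of the kind of request under discussion — a minimal sketch of an HTTP PATCH carrying an LD Patch body, sent with Python's requests library; the resource URI, triple, and ETag are hypothetical, and this reflects the LD Patch draft rather than the talk's own implementation:

```python
import requests

# An LD Patch document: delete one triple, add its replacement
patch = """
@prefix dct: <http://purl.org/dc/terms/> .
Delete { <http://example.org/work/1> dct:title "Old title" } .
Add    { <http://example.org/work/1> dct:title "New title" } .
"""

response = requests.patch(
    "http://example.org/work/1",
    data=patch.encode("utf-8"),
    headers={
        "Content-Type": "text/ldpatch",  # media type used by the LD Patch draft
        "If-Match": '"abc123"',          # guard against lost updates
    },
)
print(response.status_code)
```

The If-Match header is where REST and read-write linked data meet: without it, concurrent patches can silently overwrite each other.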
|
12:15 - 13:45 | LUNCH |
13:45 - 15:30 | RESEARCH Call for Linked Research Sarven Capadisli University of Bonn, Germany Abstract: Linked Research sets out to socially and technically enable researchers to take full control, ownership, and responsibility of their own knowledge, so that research contributions are accessible to society at maximum capacity, dismantling archaic and artificial barriers. It is intended to influence a (paradigm) shift in all aspects of scholarly communication by fostering the use of the native Web stack. Linked Research proposes an acid test to the research community to verify, approve, or test the openness, accessibility, and flexibility of approaches to enhanced scholarly communication. Dokieli is a decentralized authoring, annotation, and social interaction tool complying with this initiative. This talk will discuss and demonstrate what works! |
A RESTful JSON-LD Architecture for Unraveling Hidden References to Research Data Konstantin Baierer / Philipp Zumstein Mannheim University Library, Germany Abstract: Data citations are more common today, but more often than not, references to research data do not follow the formal conventions that references to publications do. The InFoLiS project makes those "hidden" references explicit using text mining techniques and makes them available for integration by software agents (e.g. for retrieval systems). In the second phase of the project we aim to build a flexible and long-term sustainable infrastructure to house the algorithms, as well as APIs for embedding them into existing systems. The infrastructure's primary directive is to provide lightweight read/write access to the resources that define the InFoLiS data model (algorithms, metadata, patterns, publications, etc.). The data model is implemented as a JSON schema and provides full forward compatibility with RDF through JSON-LD, using a JSON-to-RDF schema-ontology mapping and reusing established vocabularies whenever possible. We use neither a triplestore nor an RDBMS, but a document database (MongoDB). This allows us to adhere to the Linked Data principles while minimizing the complexity of mappings between different resource representations. Consequently, our web services are lightweight, making it easy to integrate InFoLiS data into information retrieval systems, publication management systems, or reference management software. At the same time, Linked Data agents expecting RDF can consume the API responses as triples, query the SPARQL endpoint, or download a full RDF dump of the database. We will demonstrate a lightweight tool that uses the InFoLiS web services to augment the web browsing experience for data scientists and librarians. |
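The JSON-to-RDF bridge described here can be sketched as follows — a hypothetical record stored as plain JSON, made readable as RDF by mapping its keys through an @context (Python/rdflib; the property URIs and identifiers are invented for illustration and are not the actual InFoLiS mapping):

```python
import json
from rdflib import Graph

# A plain JSON document as it might sit in a document database; the @context
# maps its keys to RDF terms so Linked Data agents can read it as JSON-LD.
record = {
    "@context": {
        "title": "http://purl.org/dc/terms/title",
        "referencesDataset": {"@id": "http://example.org/vocab/references",
                              "@type": "@id"},
    },
    "@id": "http://example.org/publication/42",
    "title": "A study using survey data",
    "referencesDataset": "http://example.org/dataset/1",
}

# rdflib 6+ parses JSON-LD natively (earlier versions via rdflib-jsonld)
g = Graph().parse(data=json.dumps(record), format="json-ld")
for triple in g:
    print(triple)
```

Dropping the @context leaves an ordinary document for the web services; keeping it yields triples for RDF agents — the two views stay in sync by construction.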
|
Researchers’ Identity Management in the 21st Century Networked World: A Case Study of AUC Faculty Publications Anchalee Panigabutra-Roberts American University in Cairo, Egypt Abstract: This project explores how American University in Cairo (AUC) faculty members distributed their scholarly and creative works, and how their names are identified in author identifier systems and/or on the Web. The goal is to explore how best to present their data as linked data. The project will use the faculty names listed in AUC Faculty Publications: 2012 Calendar Year; these names will be searched in author identifier systems to answer:
|
|
15:30 - 16:00 | COFFEE BREAK |
16:00 - 17:30 | LIGHTNING / BREAKOUT Lightning Talks |
Breakout Sessions Abstract: As an experiment, this year's SWIB will provide space for groups of participants to meet, discuss, and exchange ideas or results of their work on specific topics. Please introduce your topic and the main question to discuss with a short statement in the preceding Lightning Talks. |
DAY 3 | 2015-11-25 CONFERENCE |
09:00 - 10:15 | OPENING Keynote: The Digital Cavemen of Linked Lascaux Ruben Verborgh Ghent University – iMinds, Belgium Abstract: Some 17,000 years ago, cavemen, cavewomen and cavekids picked up their cavebrushes to paint caveanimals on their cavewalls in a place that eventually would become known as the Lascaux complex. Their cavehands eternalized cavehorses and cavedeer in shady corners, an art form which continues to inspire contemporary artists such as Banksy. Despite the millennia-long deprecation of cave technology (X-caveML 2.0 never really caught on), we can still admire Lascauxian cave art, even though we will probably remain eternally oblivious of its purpose, if there ever was any. This sharply contrasts with an Excel 97 sheet named mybooks.xls.bak I tried to open yesterday: perfectly remembering its purpose (my dad was maintaining a list of books he had read), I'm unable to revive the splendid tabular chaos undoubtedly typeset in Times New Roman or worse. 17 years ago somebody made a simple spreadsheet and it's literally less accessible than a 17,000 year old scribble by an unknown caveartist. Not to mention the philistines who are blacking out Banksy's recent works, which date back to last year or so. And certainly don't get me started about sustainable Linked Data. I mean, is there really such a thing? We'll be lucky if any triple at all survives 17 years. Or 17 months, for that matter. Some even have trouble keeping a SPARQL endpoint up for 17 hours. Or minutes. We might not be very good cavemen. This talk combines lessons learned from the Semantic Web, the REST principles, and the Web in general to think about what sustainability for Linked Data could really mean and how we just might achieve it. |
Data-Transformation on Historical Data Using the RDF Data Cube Vocabulary Sebastian Bayerl / Michael Granitzer University of Passau, Germany Abstract: This work describes how XML-based TEI documents containing statistical data can be normalized, converted, and enriched using the RDF Data Cube Vocabulary. In particular we focus on a real-world statistical dataset, the statistics of the German Reich around the year 1880, which is available in TEI format. The data is embedded in complex structured tables that are relatively easy for humans to understand but, owing to their varying structural properties and differing table layouts, unsuitable for automated processing and data analysis without heavy pre-processing. The complex structured tables must therefore be validated, modified, and transformed until they fit the standardized multi-dimensional data structure: the data cube. This work focuses especially on the transformations necessary to normalize the structure of the tables. Validation and cleaning steps, resolving row and column spans, and reordering slices are among the available transformations. By combining existing transformations, compound operators are implemented that can handle specific and complex problems. The identification of structural similarities or properties can be used to automatically suggest sequences of transformations. A second focus is on the advantages of using the RDF Data Cube Vocabulary. A research prototype was implemented to execute the workflow and convert the statistical data into data cubes. |
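To make the target structure concrete — a hedged sketch of a single observation in the RDF Data Cube Vocabulary, as one cell of a normalized table might end up (Python/rdflib; the dimension and measure properties and all values are invented placeholders, and a real cube would declare them in a Data Structure Definition):

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

QB = Namespace("http://purl.org/linked-data/cube#")
EX = Namespace("http://example.org/statistik/")  # hypothetical namespace

g = Graph()
obs = EX["obs1"]
g.add((obs, RDF.type, QB.Observation))
g.add((obs, QB.dataSet, EX["dataset"]))  # links the cell to its cube
g.add((obs, EX.refPeriod, Literal("1880", datatype=XSD.gYear)))   # dimension
g.add((obs, EX.refArea, Literal("Example region")))               # dimension
g.add((obs, EX.population, Literal(0, datatype=XSD.integer)))     # measure (placeholder)
```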
|
10:15 - 10:45 | COFFEE BREAK |
10:45 - 12:15 | LODLAM Linking Data about the Past Through Geography: Pelagios, Recogito & Peripleo Rainer Simon / Elton Barker / Leif Isaksen / Pau de Soto Cañamares AIT Austrian Institute of Technology / The Open University / University of Southampton, United Kingdom Abstract: Pelagios is a community-driven initiative that facilitates better linkages between online resources documenting the past, based on the places they refer to. Our member projects are connected by a shared vision of a world in which the geography of the past is every bit as interconnected, interactive, and interesting as that of the present. Pelagios has been working towards establishing conventions, best practices, and tools in several areas of "Linked Ancient World Data":
|
Modeling and Exchanging Annotations for Europeana Projects Hugo Manguinhas / Antoine Isaac / Valentine Charles / Sergiu Gordea / Maarten Brinkerink The Europeana Foundation, Netherlands / Austrian Institute of Technology / Netherlands Institute for Sound and Vision Abstract: Cultural heritage institutions are looking to crowdsourcing as an opportunity to improve the overall quality of their data, enrich its semantic description, and link it to the web of data. This is also the case for Europeana, where crowdsourcing in the form of annotations is envisioned and being worked on in several projects. As part of the Europeana Sounds project, we have identified user stories and requirements that cover the following annotation scenarios: open and controlled tagging, enrichment of metadata, annotation of media resources, linking to other objects, and moderation and general discussion. The first success in bringing annotations to Europeana is the integration of annotations on Europeana objects made on the HistoryPin.org platform, covering both the tagging and object-linking scenarios. The next step will be to help data providers support annotation on their side, for which we are working with the Pundit annotation tool. Central to all these efforts is agreement on how annotations should be modelled uniformly across scenarios: this is essential for bringing such information to Europeana in a way that can also be easily exploited and shared beyond our portal. For this, we are using the recent Web Annotation Data Model supported by the Open Annotation community, as it is the most promising model at the moment. Building on its flexible design, we have made recommendations on how it should be applied to these scenarios, and we are looking for discussion and feedback from the community, in the hope that this will help cultural heritage institutions better understand how annotations can be modelled. |
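For readers unfamiliar with the model, a minimal sketch of the tagging scenario expressed in the Web Annotation Data Model, following the model's documentation (Python; the target URI and tag value are placeholders, and this is not one of the project's actual recommendations):

```python
import json

# A simple "open tagging" annotation in the Web Annotation Data Model
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "motivation": "tagging",
    "body": {"type": "TextualBody", "value": "accordion", "format": "text/plain"},
    "target": "http://example.org/object/123",  # the annotated object (placeholder)
}
print(json.dumps(annotation, indent=2))
```

The same envelope — body, target, motivation — stretches to the other scenarios (metadata enrichment, object linking, moderation) by varying the body type and motivation, which is why a single uniform model is feasible.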
|
ALIADA, an Open Source Solution to Easily Publish Linked Data of Libraries and Museums Cristina Gareta Aliada Consortium, Spain Abstract: ALIADA is an open source solution designed by art libraries and museums, ILS vendors, and Semantic Web experts to help cultural heritage institutions automatically convert, link, and publish their library and museum data as Linked Open Data. Institutions that can export their metadata as MARCXML or LIDO XML can choose ALIADA as their ally in the challenge of liberating cultural institutions from their current data silos and integrating library and museum data into the Semantic Web. ALIADA uses its own ontology based on FRBRoo, SKOS, FOAF, and WGS84, the ontologies most used by the linked open datasets analyzed during the design of the tool. This ontology is expected to be updated with newly emerging models and vocabularies, such as RDA or BIBFRAME, according to demand from the ALIADA community. ALIADA can be integrated with the current management system of a library or museum, allowing non-expert staff to easily select and import metadata into ALIADA. Once the file is validated, the user can start the "RDFizer" to create the triples using the existing mapping templates. Owing to the complexity of the format, not all MARC mappings could be carried over into RDF using the FRBRoo ontology. Along with the RDF conversion, ALIADA provides a set of predefined SPARQL queries to check the URIs. The next step in the workflow is linking to other datasets; ALIADA offers a list of external datasets that can be linked to, including Europeana, DBpedia, and VIAF. Finally, ALIADA will show the dataset before publishing it on the DataHub. |
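The kind of sanity-check query mentioned above might look like the following — a hedged sketch, not one of ALIADA's actual predefined queries (Python with SPARQLWrapper; the endpoint URL is a placeholder):

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoint for a freshly converted dataset
sparql = SPARQLWrapper("http://example.org/aliada/sparql")
sparql.setQuery("""
    SELECT ?class (COUNT(?s) AS ?count)
    WHERE { ?s a ?class }
    GROUP BY ?class
    ORDER BY DESC(?count)
""")
sparql.setReturnFormat(JSON)

# Counting instances per class is a quick check that the RDFizer's
# mapping templates produced the expected entity types
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["class"]["value"], row["count"]["value"])
```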
|
12:15 - 13:45 | LUNCH |
13:45 - 15:30 | METADATA Mistakes Have Been Made Karen Coyle kcoyle.net Abstract: The cultural heritage data communities are racing forward into the future with FRBR, BIBFRAME, RDA, and other bibliographic models. Unfortunately, these models are weighted down with the long history of bibliographic description, like stones in our pockets. As someone who worked on the cusp between card catalogs and machine-readable data, Coyle looks back on the moments in our recent history when we should have emptied our pockets and moved forward. As one who was there, she has some mea culpas to offer. Coyle will also surprise you with the truth about FRBR, and some radical thinking about what to do with the past that is holding us back from achieving the future we should be pursuing. |
Metadata Records & RDF: Validation, Record Scope, State, and the Statement-centric Model Thomas Johnson Digital Public Library of Amercia, United States of America AbstractAdvocates for the use of RDF as a model for metadata in the cultural heritage sector have frequently spoken of the death of the “record”. Indeed, the shift from a document-centric approach to one based on identified resources and atomic statements is an important one. Yet current work on validation as well as requirements for day-to-day metadata management and attribution point back to aspects of a record-driven worldview. This session will address some historical views of records, contrasting them with the formal model adopted by RDF 1.1 and commonly accepted best practices for Linked Data. Practical implications of the RDF model will be explored, with questions raised regarding the management of state, mutability, and “record” workflows. A provisional approach for managing RDF resources and graphs in record-like contexts is proposed, with connections to RDF Shapes, DC Application Profiles, and Linked Data Platform. Use cases from the Digital Public Library of America will be presented as illustrative examples. |
|
Evaluation of Metadata Enrichment Practices in Digital Libraries: Steps towards Better Data Enrichments Valentine Charles / Juliane Stiller Europeana Foundation, The Netherlands / Humboldt-Universität zu Berlin, Germany Abstract: In large cultural heritage data aggregation systems such as Europeana, automatic and manual metadata enrichments are used to overcome the issues raised by multilingual and heterogeneous data. Enrichments are based on linked open datasets and can be very beneficial for enabling retrieval across languages, adding context to cultural heritage objects, and improving the overall quality of the metadata. However, if not done correctly, enrichments may introduce errors, which propagate across languages and degrade retrieval performance and user experience. To identify the different processes that impact the quality of enrichments, Europeana and representatives of affiliated projects have organised a series of experiments applying several enrichment techniques to a dataset of random metadata samples from several data providers across several domains, mainly library-held cultural heritage digital objects. Comparing and analysing the results shows that selecting appropriate target vocabularies and fine-tuning enrichment rules are as important as defining evaluation methods. The development of flexible workflows will contribute to better interoperability between enrichment services and data, but might make individual enrichment processes more ambivalent. Efforts in which users evaluate and correct enrichments, as well as the enrichments' impact on retrieval and user experience, also need to be considered. The presentation will show how a better understanding of enrichment methodologies will help cultural heritage institutions, and specifically libraries, to get the semantics right. |
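To make the failure mode concrete — a toy sketch of a place-name enrichment rule (Python; the lookup table stands in for a real linked-data vocabulary service, and the GeoNames URIs are used purely as illustration):

```python
from typing import Optional

# A stand-in for an enrichment rule backed by a linked open dataset.
# If "Paris" is matched to the wrong resource, every multilingual label
# attached to that resource propagates the error across languages.
PLACE_LOOKUP = {
    "Wien": "http://sws.geonames.org/2761369/",
    "Paris": "http://sws.geonames.org/2988507/",
}

def enrich_place(value: str) -> Optional[str]:
    """Return a candidate URI for a place string, or None if no rule fires."""
    return PLACE_LOOKUP.get(value.strip())

print(enrich_place("Wien"))
```

Choosing the target vocabulary and tuning rules like this one is exactly where, per the abstract, enrichment quality is won or lost.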
|
15:30 | END OF CONFERENCE |
ZBW
Joachim Neubert
T. +49-(0)-40-42834462
E-mail j.neubert(at)zbw.eu
hbz
Adrian Pohl
T. +49-(0)-221-40075235
E-mail
swib(at)hbz-nrw.de
Twitter: #swib15