Programme

Note that all times are displayed in UTC.


DAY 1   |   2020-11-23   CONFERENCE
13:00-14:00h UTC COLLOCATED EVENT: DINI-AG KIM MEETING
Jana Hentschke / Alexander Jahnke
DINI-AG Kompetenzzentrum Interoperable Metadaten (KIM)
Virtual public meeting of the DINI-AG KIM. KIM is a forum for German-speaking metadata experts from LAM institutions.
The meeting will be held in German.
Agenda

14:00-15:00h UTC OPENING / KEYNOTE
Opening
Silke Schomburg
North Rhine-Westphalian Library Service Center (hbz), Germany
KEYNOTE: Open Data & Social Innovation: Experiences from Taiwan
Audrey Tang
Digital Minister, Taiwan
Abstract

When we see “internet of things”, let’s make it an internet of beings.
When we see “virtual reality”, let’s make it a shared reality.
When we see “machine learning”, let’s make it collaborative learning.
When we see “user experience”, let’s make it about human experience.
When we hear “the singularity is near”, let us remember: the Plurality is here.

Video

15:00-15:30h UTC Coffee break
15:30-16:30h UTC AUTOMATED SUBJECT INDEXING
Automatic indexing of institutional repository content using SKOS
Ricardo Eito-Brun
Universidad Carlos III de Madrid, Spain

Abstract

The lack of well-defined indexing practices is a common problem in most institutional repositories. Researchers typically assign keywords to their submissions; these terms, however, are not taken from a controlled vocabulary or thesaurus. This leads to ambiguity and a lack of specificity in the terms used to describe the content of their contributions.
This presentation describes an approach to automatically assigning descriptors and keywords from a thesaurus (the UNESCO Thesaurus) to contributions that have already been published. The process is run with the help of an existing commercial tool, PoolParty, and automatically identifies the thesaurus descriptors that describe the content of the documents published in the institutional repository of a Spanish university.
Once the thesaurus descriptors are assigned to the existing documents, it becomes feasible to use the thesaurus to expand queries and to assist end users in the selection of search terms. This brings about improved search capabilities, as users are in a position to identify both additional related terms and more general or more specific terms in order to improve their search queries.
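As a rough illustration of the query-expansion step, the following Python sketch uses rdflib to look up related, broader and narrower terms for a given preferred label in a local SKOS copy of the UNESCO Thesaurus; the file name and example term are assumptions, and this is not the PoolParty workflow described in the talk.

```python
# Minimal sketch of SKOS-based query expansion with rdflib.
# Assumes a local SKOS dump of the UNESCO Thesaurus (file name is illustrative).
from rdflib import Graph, Namespace

SKOS = Namespace("http://www.w3.org/2004/02/skos/core#")

g = Graph()
g.parse("unesco-thesaurus.ttl", format="turtle")  # hypothetical local copy

def expand_query(term, lang="en"):
    """Return related, broader and narrower labels for a preferred label."""
    expansions = set()
    for concept in g.subjects(SKOS.prefLabel, None):
        labels = {str(l) for l in g.objects(concept, SKOS.prefLabel)}
        if term in labels:
            for prop in (SKOS.related, SKOS.broader, SKOS.narrower):
                for other in g.objects(concept, prop):
                    for label in g.objects(other, SKOS.prefLabel):
                        if label.language == lang:
                            expansions.add(str(label))
    return expansions

print(expand_query("Information retrieval"))
```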

Slides

Video

Annif and Finto AI: DIY automated subject indexing from prototype to production
Osma Suominen / Mona Lehtinen / Juho Inkinen
The National Library of Finland, Finland
Abstract

The first prototype of Annif (annif.org), the multilingual automated subject indexing tool, was created at the National Library of Finland in early 2017. Since then, the open source tool has grown from an experiment into a production system. Through its REST API it has been integrated into the document repositories of several university libraries, into the metadata workflows of the book distributor Kirjavälitys Oy, which serves publishers, bookshops, libraries and schools, into the Dissemin service for publishing academic papers in open repositories, and into the automated subject indexing service Finto AI (ai.finto.fi), which was launched in May 2020 as a companion to the Finto thesaurus and ontology service. In the meantime, we have organized workshops and tutorials around Annif and automated indexing, and have grown an international community of users and developers.
This presentation looks at the current state of Annif, the lessons learned during its development, how it has been received by different communities and what the next steps will look like.
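As an illustration of the kind of REST integration mentioned above, here is a minimal Python sketch that requests subject suggestions from an Annif instance; the base URL and project identifier are assumptions and would need to be adapted to a concrete installation.

```python
# Sketch of calling an Annif instance via its REST API with requests.
# Base URL and project ID are assumptions; adapt them to your installation.
import requests

ANNIF_API = "https://ai.finto.fi/v1"    # or a local instance, e.g. http://localhost:5000/v1
PROJECT = "yso-en"                      # assumed project identifier

def suggest(text, limit=5):
    resp = requests.post(
        f"{ANNIF_API}/projects/{PROJECT}/suggest",
        data={"text": text, "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["results"]       # list of suggestions with uri, label, score

for hit in suggest("Automated subject indexing of repository documents"):
    print(f'{hit["score"]:.3f}  {hit["label"]}  {hit["uri"]}')
```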

Slides

Video

AutoSE@ZBW: Building a productive system for automated subject indexing at a scientific library
Anna Kasprzik / Moritz Fürneisen / Christopher Bartz
ZBW - Leibniz Information Centre for Economics, Germany
Abstract

At ZBW we have been developing prototype machine learning solutions for automated subject indexing in the context of applied research for several years now. However, as of 2019 these solutions had yet to be integrated into the metadata management system and the subject indexing workflows at ZBW. It turns out that building a corresponding software architecture is a challenge on another level, requiring additional resources on top of those for academic research as well as additional expertise. In order to create a productive system that makes these machine learning solutions usable in practice and allows continuous development, we need to look at aspects such as user and data interfaces, suitable development and test environments, system stability, modularity and continuous integration.
After a strategic reorientation and preparation phase in 2019, in 2020 we created a first proof-of-concept version of a productive system that suggests subject terms for resources from our holdings and makes these suggestions available via an API, for example to the DA-3, a tool for assisted subject indexing based on suggestions from external sources that is currently being evaluated in our library network.
In parallel, we have been experimenting with more advanced statistical algorithms as well as with neural networks. Our software architecture is designed to allow the integration of new methods as smoothly as possible, without interruptions to the service.
This presentation sums up our first steps towards a productive system and the first lessons learned, and projects some milestones for the way ahead.
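Purely as an illustration of what a suggestion API consumed by a tool like the DA-3 could look like, the following Flask sketch exposes a single endpoint; it is a hypothetical example, not the actual AutoSE architecture, and the backend call is a placeholder.

```python
# Hypothetical sketch of a minimal subject-suggestion endpoint;
# not the actual AutoSE system.
from flask import Flask, request, jsonify

app = Flask(__name__)

def suggest_subjects(text, limit=5):
    # Placeholder for the machine learning backend (e.g. a trained model).
    return [{"uri": "https://example.org/descriptor/1", "score": 0.9}][:limit]

@app.route("/suggest", methods=["POST"])
def suggest():
    payload = request.get_json(force=True)
    results = suggest_subjects(payload.get("text", ""), payload.get("limit", 5))
    return jsonify({"results": results})

if __name__ == "__main__":
    app.run(port=8080)
```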

Slides

Video

DAY 2   |   2020-11-24   CONFERENCE
14:00-16:30h UTC WORKSHOPS
Automated subject indexing with Annif
Osma Suominen / Mona Lehtinen / Juho Inkinen / Anna Kasprzik / Moritz Fürneisen
National Library of Finland, Finland / ZBW - Leibniz Information Centre for Economics, Germany
Abstract

Due to the proliferation of digital publications, intellectual subject indexing of every single literature resource in institutions such as libraries is no longer possible. For the task of providing subject-based access to information resources of different kinds and with varying amounts of available metadata, it has become necessary to explore possibilities of automation.
In this hands-on tutorial, participants will be introduced to the multilingual automated subject indexing tool Annif (annif.org) as a potential component of a library’s metadata generation system. By completing exercises, participants will get practical experience of setting up Annif, training its algorithms on example data, and using Annif to produce subject suggestions for new documents via the command line interface, the web user interface and the REST API provided by the tool. The tutorial will also introduce the corpus formats supported by Annif, so that participants will be able to apply the tool to their own vocabularies and documents.
The tutorial will be organized using the flipped classroom approach: participants are provided with a set of instructional videos and written exercises, and are expected to attempt to complete them on their own time before the tutorial event, starting at least a week in advance. The actual event will be dedicated to solving problems, asking questions and getting a feeling of the community around Annif.
Participants are instructed to use a computer with at least 8 GB of RAM and at least 20 GB of free disk space to complete the exercises. The organizers will provide the software as a preconfigured VirtualBox virtual machine. Alternatively, Docker images and a native Linux install option are provided for users familiar with those environments. No prior experience with the Annif tool is required, but participants are expected to be familiar with subject vocabularies (e.g. thesauri, subject headings or classification systems) and subject metadata that references those vocabularies.
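For orientation, the following Python sketch writes training data in TSV corpus formats of the kind Annif accepts (documents as text plus space-separated subject URIs in angle brackets, vocabulary as URI plus label); the example texts and URIs are illustrative, and the Annif documentation remains the authoritative reference for the formats.

```python
# Sketch of preparing Annif-style TSV training data; URIs and texts are illustrative.
docs = [
    ("An introduction to library linked data", ["https://example.org/subjects/linked-data"]),
    ("Machine learning for subject indexing", ["https://example.org/subjects/machine-learning"]),
]

with open("train.tsv", "w", encoding="utf-8") as f:
    for text, uris in docs:
        f.write(text + "\t" + " ".join(f"<{u}>" for u in uris) + "\n")

vocab = [("https://example.org/subjects/linked-data", "linked data")]
with open("vocab.tsv", "w", encoding="utf-8") as f:
    for uri, label in vocab:
        f.write(f"<{uri}>\t{label}\n")
```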

The workshop Automated subject indexing with Annif is not recorded. Workshop material with step-by-step walkthroughs and video recordings for self-learning is available online. For more information on Annif, see also the Annif homepage and wiki. Last but not least, don't hesitate to contact us, e.g. via the Annif user group.

Making use of the coli-conc infrastructure for controlled vocabularies
Jakob Voß / Stefan Peters
Verbundzentrale des GBV, Germany
Abstract

Project coli-conc has created an infrastructure to facilitate the management and exchange of concordances between library knowledge organization systems. The most visible outcome of the project is Cocoda, a web application that simplifies the creation and evaluation of mappings between concepts from different classifications, thesauri, and other controlled vocabularies. This tutorial will give an introduction to the infrastructure, which makes it possible to work with controlled vocabularies from diverse sources. After a brief introduction to the architecture, the JSKOS data format, its API, and utility Node.js packages, we will live-code a small cataloging application for semantic tagging of resources with concepts from controlled vocabularies. Active participation requires basic knowledge of JavaScript and HTML.
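For readers unfamiliar with JSKOS, a concept mapping in this format is plain JSON-LD; the following Python sketch prints a minimal, purely illustrative mapping of roughly the shape exchanged in the coli-conc infrastructure (consult the JSKOS specification for the normative fields).

```python
# Rough sketch of a JSKOS concept mapping as a Python dict; URIs are illustrative.
import json

mapping = {
    "from": {"memberSet": [{"uri": "http://dewey.info/class/612.112/e23/"}]},   # illustrative
    "to": {"memberSet": [{"uri": "http://www.wikidata.org/entity/Q42395"}]},    # illustrative
    "type": ["http://www.w3.org/2004/02/skos/core#closeMatch"],
}

print(json.dumps(mapping, indent=2))
```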

The workshop Making use of the coli-conc infrastructure for controlled vocabularies is not recorded. Workshop material and slides for self-learning are available. The coli-conc homepage also provides pointers to screencasts, documents, and source code. Last but not least, don't hesitate to contact us!

Managing and Preserving Linked Data with Fedora
David Wilcox
LYRASIS, Canada
Abstract

Fedora is a flexible, extensible, open source repository platform for managing, preserving, and providing access to digital content. Fedora is used in a wide variety of institutions including libraries, museums, archives, and government organizations. For the past several years the Fedora community has prioritized alignment with linked data best practices and modern web standards. We are now shifting our attention back to Fedora's digital preservation roots with a focus on durability and the Oxford Common File Layout (OCFL). This workshop will provide an introduction to the latest version of Fedora with a focus on both the linked data and digital preservation capabilities. Both new and existing Fedora users will be interested in learning about and experiencing Fedora features first-hand.
Attendees will be given access to individual cloud-hosted Fedora instances to use during the workshop. These instances will be used to participate in hands-on exercises that will give attendees a chance to experience Fedora by following step-by-step instructions. The workshop will include two modules, each of which can be delivered in 1 hour or less:
Introduction and Feature Tour
This module will feature an introduction to Fedora generally, with a focus on the latest version, followed by an overview of the core Fedora features. It will include hands-on demonstrations of the linked data features (resource management, using RDF to create and update metadata for description and access).
Fedora 6 and the Oxford Common File Layout
Fedora 6.0, the next major version of Fedora, will focus on digital preservation by aligning with the Oxford Common File Layout (OCFL). The OCFL is an application-independent approach to the storage of digital objects in a structured, transparent, and predictable manner. It is designed to promote long-term access and management of digital objects within digital repositories. This module will provide an overview of the OCFL and how it is used in Fedora. Participants will be able to see how resources created in the first half of the workshop are represented as OCFL Objects on the file system.
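To give an idea of what this looks like on disk, the sketch below prints a minimal OCFL-style object inventory; the identifier, timestamp and digest are illustrative placeholders, and the OCFL specification remains the normative reference.

```python
# Sketch of a minimal OCFL object inventory; values are illustrative.
import json

inventory = {
    "id": "info:example/object-1",                # illustrative object identifier
    "type": "https://ocfl.io/1.0/spec/#inventory",
    "digestAlgorithm": "sha512",
    "head": "v1",
    "manifest": {
        "ab12…": ["v1/content/metadata.ttl"]      # digest shortened for readability
    },
    "versions": {
        "v1": {
            "created": "2020-11-24T14:00:00Z",
            "message": "Initial ingest",
            "state": {"ab12…": ["metadata.ttl"]},
        }
    },
}

print(json.dumps(inventory, indent=2, ensure_ascii=False))
```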

The SWIB20 workshop modules Introduction to Fedora and Fedora 6.0 and the Oxford Common File Layout are not recorded. Workshop slides for Introduction to Fedora are available online. Workshop slides for Fedora 6.0 and the Oxford Common File Layout are also available online. For more information on Fedora, please see the Fedora homepage and follow our progress on the Road to Fedora 6.0. You can also follow along on our blog. Please feel free to contact us at any time through our Fedora Community communication channel.

Using SkoHub for web-based metadata management & content syndication
Adrian Pohl / Steffen Rörtgen
hbz, Germany / GWDG, Germany
Abstract

Authority files, thesauri and other controlled vocabulary systems have long been a key element of knowledge management. Frequently a controlled vocabulary is used in cataloguing by different institutions, thus indirectly connecting resources about one topic. In order to find all resources about one topic, one has to query different databases or create and maintain a discovery index. This approach is error-prone and requires considerable maintenance.
SkoHub consistently moves the cataloguing of web-based material to the web and builds new, powerful discovery tools on top. It is based on web standards like the Simple Knowledge Organization System (SKOS), ActivityPub and Linked Data Notifications. With the SkoHub infrastructure, knowledge management systems act as topic-based communication channels between content publishers and people looking for relevant resources. In effect, SkoHub makes it possible to follow specific subjects (such as descriptors of a classification, thesaurus etc.) in order to be notified when someone publishes new content about that subject on the web.
After presenting SkoHub at SWIB19, we now offer a workshop so people can try it out. The workshop participants will learn about the different components by using them in different stages of metadata management:
1. publication of a controlled vocabulary as SKOS scheme with a git-based editing process
2. configuration of a web form for creating structured metadata in the browser
3. subscription to a topic as well as publication of a resource to interested users who have subscribed to a specific topic
For details on SkoHub, see the blog posts.
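As a small illustration of step 1 above, the following Python sketch builds a tiny SKOS concept scheme with rdflib of the kind that could be maintained in a git repository and published via SkoHub; all URIs and labels are illustrative.

```python
# Minimal sketch of a SKOS concept scheme; URIs and labels are illustrative.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

BASE = Namespace("https://example.org/vocab/")

g = Graph()
g.bind("skos", SKOS)

scheme = BASE["scheme"]
concept = BASE["openEducationalResources"]

g.add((scheme, RDF.type, SKOS.ConceptScheme))
g.add((scheme, SKOS.prefLabel, Literal("Example vocabulary", lang="en")))
g.add((concept, RDF.type, SKOS.Concept))
g.add((concept, SKOS.prefLabel, Literal("Open Educational Resources", lang="en")))
g.add((concept, SKOS.inScheme, scheme))
g.add((scheme, SKOS.hasTopConcept, concept))

print(g.serialize(format="turtle"))
```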

The SWIB20 workshop Using SkoHub for web-based metadata management & content syndication is not recorded. Workshop material with step-by-step walkthroughs and video recordings for self-learning is available online. For more information on SkoHub, see also the SkoHub homepage and blog posts. Last but not least, don't hesitate to contact us: skohub[at]hbz-nrw[dot]de

DAY 3   |   2020-11-25   CONFERENCE
14:00-15:00h UTC BIBFRAME
Developing BIBFRAME application profiles for a cataloging community
Paloma Graciani-Picardo / Nancy Lorimer / Christine DeZelar-Tiedman / Nancy Fallgren / Steven Folsom / Jodi Williamschen
Harry Ransom Center, University of Texas at Austin / Stanford University / University of Minnesota / National Library of Medicine / Cornell University / Library of Congress
Abstract

As libraries experiment with integrating BIBFRAME (BF) data into library workflows and applications, it is increasingly clear that there is little to no formal agreement on what a baseline BF description might be, or even on how specific properties are modeled in what is a very flexible ontology. This basic agreement is imperative, at least in these early days, for data producers and developers building out and implementing practical workflows and viable interactions among disparate data sources; the more flavors that need to be dealt with, the more difficult initial implementation will be. Additionally, there is little consensus on how to integrate BF and RDA, our primary cataloging standard, and it is difficult to move ahead without a basic mapping. The Program for Cooperative Cataloging (PCC), with its close connection to the Library of Congress and the Linked Data for Production grants, and its focus on standards-building in the MARC cataloging community, is well set up to develop standards and become a steward of well-formed BF. To that end, the PCC Sinopia Application Profiles Task Group is working on developing BF application profiles by creating PCC templates in Sinopia, the linked data editor developed in the Linked Data for Production 2 grant, to serve as the basis for metadata creation by the PCC community. In this talk, we discuss the community process and the challenges encountered in creating application profiles through template development, including modeling questions, technical challenges, and reconciling BF with RDA and PCC standards.
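For context, the sketch below uses rdflib to produce a minimal BIBFRAME Work/Instance description of the kind such application profiles constrain; the URIs and property choices are illustrative and do not represent a PCC template.

```python
# Minimal BIBFRAME Work/Instance sketch with rdflib; URIs are illustrative.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

BF = Namespace("http://id.loc.gov/ontologies/bibframe/")
EX = Namespace("https://example.org/resources/")

g = Graph()
g.bind("bf", BF)

work = EX["work1"]
instance = EX["instance1"]

g.add((work, RDF.type, BF.Work))
g.add((work, RDFS.label, Literal("Example work")))
g.add((instance, RDF.type, BF.Instance))
g.add((instance, BF.instanceOf, work))
g.add((instance, BF.provisionActivityStatement, Literal("Example City : Example Press, 2020")))

print(g.serialize(format="turtle"))
```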

Slides

Video

Sinopia Linked Data Editor
Jeremy Nelson
Stanford University Libraries, United States of America
Abstract

The Sinopia Linked Data environment is a Mellon Foundation-funded project that provides catalogers with a native linked data editor focused on the BIBFRAME ontology for describing resources. Following an iterative Agile development process, Sinopia is currently in its third version, with an improved user interface and third-party integrations based on continuous feedback from an international cohort of users. This presentation will start with a high-level introduction to Sinopia, followed by cataloger data workflows using external authority sources like the Library of Congress and ShareVDE and their supporting technologies, and finish with the new and planned features for the upcoming year, including machine learning RDF classification.

Video

Cataloging rare books as linked data: a use case
Paloma Graciani-Picardo / Brittney Washington
Harry Ransom Center - University of Texas at Austin
Abstract

Linked Data for Production phase 2 (LD4P2), a two-year pilot supported by the Andrew W. Mellon Foundation, wrapped up in May 2020. As a member of the LD4P2 cohort, the Harry Ransom Center is eager to share with the community some of our activities within the project and the lessons learnt. Application profiles have been at the core of LD4P2 activities, and the Ransom Center has supported this effort with the evaluation of ontologies, models, vocabularies and best practices for item-level description of rare and special collections materials in a collaborative linked data environment. In this presentation, we will discuss our work analyzing MARC to BIBFRAME conversion, defining local workflows for linked data cataloging and self-training strategies, and developing an application profile for special collections materials. We will give a quick review of the existing ontologies and controlled vocabularies relevant to the project, and present data modeling approaches and challenges. Finally, but no less importantly, we will emphasize the value of special collections community engagement in these types of projects and the need for continued collaboration beyond the grant.

Slides

Video

15:00-15:30h UTC Coffee break
15:30-16:30h UTC AUTHORITIES
BIBFRAME instance mining: Toward authoritative publisher entities using association rules
Jim Hahn
University of Pennsylvania, United States of America
Abstract

The catalyst for this talk stems from work within the Share-VDE initiative, a shared discovery environment based on linked data. The project encompasses enrichment with linked open data, subsequent conversion from MARC to BIBFRAME/RDF, and the creation of a cluster knowledge base made up of over 400 million triples. The resulting BIBFRAME network comprises the BIBFRAME entities Work and Instance, among other Share-VDE-specific entities.
With the transition of a shared catalog to BIBFRAME linked data, there is now a pressing need to identify the canonical Instance for clustering in BIBFRAME. A fundamental component of Instance identification is authoritative publisher entities. Previous work in this area by OCLC Research (Connaway & Dickey, 2011) proposed a data mining approach for developing an experimental Publisher Name Authority File (PNAF). The OCLC research was able to create profiles for "high-incidence" publishers after data mining and clustering of publishers. As a component of PNAF, Connaway & Dickey were also able to provide a detailed subject analysis of publishers. This presentation will detail a case study of machine learning methods over a corpus of subjects, main entries, and added entries used as antecedents in association rules to derive consequent publisher entities. The departure point for the present research into the identification of authoritative publisher entities is to focus on clustering, reconciliation and re-use of the ISBN and subfield b of MARC 260, along with the subjects (650 - Subject Added Entry), main entries (1XX - Main Entries) and added entries (710 - Added Entry-Corporate Name), as signals informing a training corpus for association rule mining, among other machine learning algorithms, libraries, and methods.
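As a rough illustration of the association-rule idea (not the Share-VDE pipeline itself), the following Python sketch mines toy transactions that combine subject and main-entry antecedents with publisher consequents, using the mlxtend library; all values are invented examples.

```python
# Toy association rule mining for publisher consequents with mlxtend.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [
    ["subject:Economics", "main:Smith, J.", "publisher:Example Univ. Press"],
    ["subject:Economics", "main:Doe, A.", "publisher:Example Univ. Press"],
    ["subject:Chemistry", "main:Roe, B.", "publisher:Other Press"],
]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

frequent = apriori(onehot, min_support=0.3, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)

# Keep only rules whose consequent is a publisher entity.
publisher_rules = rules[rules["consequents"].apply(
    lambda items: all(i.startswith("publisher:") for i in items))]
print(publisher_rules[["antecedents", "consequents", "support", "confidence"]])
```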

Slides

Video

Generating metadata subject labels with Doc2Vec and DBPedia
Charlie Harper
Case Western Reserve University, United States of America
Abstract

Previous approaches to metadata creation using unsupervised learning have often centered on generating document clusters, which then require manual labeling. Common approaches, such as topic modelling with Latent Dirichlet Allocation, are also limited by the need to determine the number of clusters prior to training. While this is useful for finding underlying relationships in corpora, unlabeled clustering does not provide an ideal way to generate metadata. In this presentation, I examine one way that unsupervised machine learning and linked data can be employed to generate rich metadata labels for textual resources and thereby improve resource discovery.
To generate document-specific metadata, I build high-dimensional vectors using the doc2vec algorithm on DBpedia entries. DBpedia regularly collects millions of pages from Wikipedia, including a unique label, an abstract, and extensive information on the semantic links between pages. While abstracts and labels can be extremely specific and are unlikely to provide broadly usable metadata tags, the linked nature of this dataset provides a valuable way to improve on this. Using predicates like dct:subject and skos:broader, page vectors can be averaged together to encode higher conceptual levels of information that are labelled by a unique subject or idea. Once an unseen document’s abstract has been vectorized, a k-d search tree can quickly locate the nearest subjects in vector space and suggest which labels should be assigned to the document. To explore its efficacy, this tagging approach is applied to a corpus of dissertations and theses published at Ohio universities and colleges. Finally, methods for visualizing the tagged corpus are explored to determine whether the linked nature of the subject tags may allow users to visually discover related texts in a more natural, less constrained way.
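A compact sketch of the overall approach, with toy data standing in for the DBpedia corpus: train doc2vec, average page vectors per subject, and query a k-d tree with an unseen abstract. gensim and SciPy are used here; the actual corpus, parameters and subject hierarchy differ.

```python
# Toy sketch: doc2vec vectors per page, averaged per subject, queried via a k-d tree.
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from scipy.spatial import cKDTree

pages = {  # DBpedia-like pages grouped under subjects (illustrative)
    "Machine_learning": ["machine learning studies algorithms that improve with data"],
    "Libraries": ["a library is a collection of information resources"],
}

corpus = [TaggedDocument(text.split(), [f"{subj}-{i}"])
          for subj, texts in pages.items() for i, text in enumerate(texts)]
model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)

# Average page vectors per subject to get subject-level vectors.
subjects = list(pages)
subject_vecs = np.array([
    np.mean([model.infer_vector(t.split()) for t in texts], axis=0)
    for texts in pages.values()])

tree = cKDTree(subject_vecs)
query = model.infer_vector("neural networks learn from examples".split())
dist, idx = tree.query(query, k=1)
print("Suggested subject label:", subjects[idx])
```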

Slides

Video

Automated tools for propagating a common hierarchy from a set of vocabularies
Joeli Takala
National Library of Finland, Finland
Abstract

Through finto.fi, the National Library of Finland publishes controlled vocabularies for subject indexing and for linking data. Linked Open Data formats such as SKOS also enable us to combine several vocabularies into a common repository of concepts from various sources. The purpose is to expand one general-purpose vocabulary with others from more specific fields of knowledge in a way that enables us to cover a wider context with a single vocabulary. The problem lies in assessing and ensuring the interoperability of each vocabulary when used in this manner.
The combined data set consists of the Finnish General Upper Ontology (YSO) and fifteen domain-specific controlled vocabularies published in SKOS format. Each vocabulary broadly follows the same data model, with each sharing the upper hierarchy of YSO. In total this amounts to 2.1M triples, which are combined into 57k unique concepts and 246k labels in a single common vocabulary, KOKO. The tool used for creating the combined vocabulary is a twelve-step algorithm, which is combined with tools for change tracking and automated quality assessment in order to publish a common vocabulary of consistent quality.
The difficulties arise from recognising common error types in the data structure of a single vocabulary which would not look alarming on their own, but which create complex dynamics when the vocabularies are combined and different error types cascade on top of each other. The end result may seem confusing to a user and should be avoided whenever possible. Apart from assessing whether such mistakes exist in the data, we also need to address data synchronisation problems when concepts from one vocabulary are shifted in the hierarchy, removed altogether or split into several new concepts of similar meaning. To achieve this, the update process of each of the vocabularies is synchronised with updates to the YSO hierarchy.
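As one example of the kind of automated quality check involved (an assumption for illustration, not the actual twelve-step algorithm), the sketch below flags skos:broader links that point outside the combined graph, a typical source of broken hierarchies; the file names are illustrative.

```python
# Sketch of a merge-time quality check: find dangling skos:broader targets.
from rdflib import Graph
from rdflib.namespace import RDF, SKOS

merged = Graph()
for path in ["yso.ttl", "domain-vocab-1.ttl"]:   # illustrative file names
    merged.parse(path, format="turtle")

concepts = set(merged.subjects(RDF.type, SKOS.Concept))
dangling = [(c, b) for c in concepts
            for b in merged.objects(c, SKOS.broader)
            if b not in concepts]

for concept, broader in dangling:
    print(f"Dangling broader link: {concept} -> {broader}")
```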

Slides

Video

DAY 4   |   2020-11-26   CONFERENCE
14:00-15:00h UTC IDENTIFIERS
Integration and organization of knowledge in a Current Research Information System (CRIS) based on semantic technologies
Ana Maria Fermoso García / Maria Isabel Manzano García / Julian Porras Reyes / Juan Blanco Castro
Pontifical University of Salamanca, Spain
Abstract

We present OpenUPSA, a system based, inter alia, on semantic technologies. It is a project developed in collaboration with the university library whose goal is to share information about research at the university, its agents and its scientific production with society. The result is a software system that can be regarded as a Current Research Information System (CRIS).
This system, however, can be considered an advanced CRIS. It not only visualizes information about university research, including its researchers, research groups and scientific production (projects, publications, academic works or patents), but also provides further advanced features.
The first feature makes it possible to integrate information from the variety of sources in which information about research is provided nowadays, and even to enrich it from external sources, for instance by linking a publication with its record in a database like Scopus. This is mainly achieved by implementing analyzers for the different formats that have to be integrated into a common relational database format. After integration, the information can also be managed.
The second feature is the possibility to perform system-specific queries and to view and download the results in different formats. This is possible thanks to the use of semantic technologies, particularly an ontology as the semantic data model and a SPARQL endpoint service. The ontology allows research authorities and knowledge to be organized and shared with the community, and even to be enriched from external sources as a Linked Open Data system. The ontology is new, but based on the CERIF (Common European Research Information Format) ontology, the European standard for CRIS systems. It thus facilitates the internationalization of our proposal and the sharing of our data with other systems.
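To illustrate what querying such a SPARQL endpoint could look like, here is a hedged Python sketch using SPARQLWrapper; the endpoint URL, namespace and property names are invented placeholders rather than the actual OpenUPSA ontology.

```python
# Sketch of querying a CRIS SPARQL endpoint; URL and ontology terms are illustrative.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://example.org/openupsa/sparql")  # illustrative URL
endpoint.setQuery("""
    PREFIX cris: <https://example.org/cris#>   # illustrative namespace
    SELECT ?publication ?title WHERE {
        ?publication a cris:Publication ;
                     cris:title ?title .
    } LIMIT 10
""")
endpoint.setReturnFormat(JSON)

for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["publication"]["value"], "-", row["title"]["value"])
```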

Slides

Video

ORCID for Wikidata: A workflow for matching author and publication items in Wikidata
Eva Seidlmayer
ZB MED - Information Centre for Life Sciences, Germany
Abstract

In the context of a bibliometric project, we retrieved social context information on authors of scientific publications from Wikidata in order to import it into the metadata of our dataset. While we were able to capture about 95% of the requested scholarly publications in Wikidata, only 3% of the authors could be assigned and used for the retrieval of social context information. One likely reason is that authors in general are rarely curated in Wikidata: while research papers account for 31.5% of the items, only 8.9% represent humans in general, let alone researchers in particular (according to Wikidata statistics, January 2020). Another reason we observed is the frequent absence of relations between the Wikidata item of a publication and the Wikidata items of its author(s), even when the authors are already listed.
To fill the gap and in order to improve the foundation for bibliographic analysis in general, we established a workflow for matching authors and scholarly articles by making use of the ORCID (Open Researcher and Contributor ID) database. The presentation will demonstrate how we harvest information on publications and their authors from ORCID, how we query Wikidata for existing items that are also listed in ORCID, and how we perform the matching. Finally, it will be illustrated how the initial research project benefits from the presented enrichment of bibliometric details in Wikidata.
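As a small illustration of the Wikidata side of the matching, the sketch below queries the public SPARQL endpoint for items carrying a given ORCID iD (property P496); the ORCID value shown is illustrative, and the real workflow involves harvesting from ORCID and further matching steps.

```python
# Sketch: look up Wikidata items by ORCID iD (P496) via the public SPARQL endpoint.
import requests

SPARQL = "https://query.wikidata.org/sparql"
query = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P496 "0000-0002-1825-0097" .   # illustrative ORCID iD
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

resp = requests.get(SPARQL, params={"query": query, "format": "json"},
                    headers={"User-Agent": "orcid-matching-sketch/0.1"})
for row in resp.json()["results"]["bindings"]:
    print(row["person"]["value"], row["personLabel"]["value"])
```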

Slides

Video

id.loc.gov and Wikidata, one year later.
Matt Miller
Library of Congress, United States of America
Abstract

The id.loc.gov linked data platform at the Library of Congress has been ingesting Wikidata identifiers since mid-2019. This process has enabled over 1.2 million links between the two systems. These links are powered by Wikimedians adding a NACO or LCSH LCCN identifier to a Wikidata entity, which then flows into the ID platform. Due to the scale and nature of Wikidata, there is a high velocity of change in this data: new connections are made and broken every day, in addition to ancillary data changes such as Wikidata labels. This talk will present an analysis of this ingest process for 2019 and 2020. We will take a detailed look at trends that emerged from this analysis as well as a holistic look at linked records in both systems. Topics around vandalism, comparing record completeness in the two systems, and change frequency will be explored. As more data from the Wikimedia ecosystem is leveraged in our bibliographic systems, it is important to understand the dynamics and differences between the two worlds.

Slides

Video

15:00-15:30h UTC Coffee break
15:30-16:30h UTC PROJECTS
Changing the tires while driving the car: A pragmatic approach to implementing linked data
David Seubert / Shawn Averkamp / Michael Lashutka
American Discography Project, University of California, Santa Barbara, USA / AVP, USA / PropertlySorted Database Solutions, Beacon, NY, USA
Abstract

The Discography of American Historical Recordings (DAHR) is an online database of sound recordings made by American record companies during the 78rpm era. Based at the University of California, Santa Barbara, DAHR now includes authoritative information on over 300,000 master recordings by over 60,000 artists and has 40,000 streaming audio files online. To provide even more context for researchers using the database, DAHR editors chose to use linked data to enrich the database with information from other open data sources. With funding from the Library of Congress National Recording Preservation Board, UCSB engaged consultants at AVP to develop a strategy for enriching DAHR by mining public data. After the harvesting and integration of data for over 18,000 names from the Library of Congress Name Authority File, Wikidata, and MusicBrainz, users can now find Wikipedia biographies, photographs, and links to additional content in many other databases, such as LP reissues in Discogs, record reviews on AllMusic, or streaming audio on Spotify, as well as links to names in other authority files like VIAF. In this presentation, we will share our process of harvesting data, retrofitting DAHR’s underlying FileMaker Pro data model and workflows to accommodate the addition of this new data and the minting of URIs, and leveraging the unique circumstances of the COVID-19 outbreak to redirect staff time towards quality control of this new data. We will also discuss current efforts to populate Wikidata and MusicBrainz with our newly minted URIs to provide broader entry points and visibility for the DAHR database.
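As an illustration of one possible reconciliation step (an assumption about tooling, not necessarily the AVP workflow), the following sketch looks up a performer name with Wikidata's wbsearchentities API; the name is an example from the 78rpm era, and real matching would need further disambiguation against dates and roles.

```python
# Sketch of name lookup against Wikidata's wbsearchentities API.
import requests

def search_wikidata(name, limit=5):
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbsearchentities", "search": name,
                "language": "en", "format": "json", "limit": limit},
        headers={"User-Agent": "dahr-enrichment-sketch/0.1"},
    )
    resp.raise_for_status()
    return [(hit["id"], hit.get("description", "")) for hit in resp.json()["search"]]

for qid, description in search_wikidata("Enrico Caruso"):
    print(qid, description)
```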

Slides

Video

Linked data for opening up discovery avenues in library catalogs
Huda Khan
Cornell University, United States of America
Abstract

Exploring the integration of linked data sources into library discovery interfaces was an important goal for the recently concluded Linked Data for Production: Pathway to Implementation (LD4P2) grant. We conducted a series of focused experiments involving user studies and the implementation of Blacklight-based prototypes. In this presentation, we will provide an overview of lessons learned through these experiments as well as subsequent discovery research as part of the ongoing Linked Data for Production: Closing the Loop (LD4P3) grant. Examples of areas we investigated for the integration of linked data include: knowledge panels bringing in contextual information and relationships from knowledge graphs like Wikidata to describe people and subjects related to library resources in the catalog; suggested searches based on user-entered queries using results from Wikidata and DBpedia; browsing experiences for subjects and authors bringing in relationships and data from Wikidata and library authorities; and autosuggest for entities represented in the catalog using supplementary information from FAST, the Library of Congress authorities, and Wikidata. Grant work also supported the development of Blacklight functionality for embedding Schema.org JSON-LD representations of some catalog metadata. We will also review opportunities for the larger community to engage in discussions around use cases and implementation techniques for using linked data in discovery systems.
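For readers curious what such an embedded representation might look like, below is a hedged sketch of Schema.org JSON-LD for a catalog record; the field values are illustrative and not taken from the Blacklight implementation.

```python
# Sketch of Schema.org JSON-LD for a bibliographic record; values are illustrative.
import json

record = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "Example title",
    "author": {"@type": "Person", "name": "Example Author"},
    "datePublished": "2020",
    "inLanguage": "en",
    "sameAs": "https://example.org/authority/example-author",  # illustrative link
}

# This string could be embedded in a <script type="application/ld+json"> element.
print(json.dumps(record, indent=2))
```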

Slides

Video

Using IIIF and Wikibase to syndicate and share cultural heritage material on the Web
Jeff Keith Mixter
OCLC, United States of America
Abstract

Digitized cultural heritage material is ubiquitous across the library, archive, and museum landscape, but the material descriptions can vary based on domain, institutional best practices, and the amount of effort dedicated to digitization programs. OCLC Research has spent the past few years exploring two primary functions of digital material management: syndication of the material for research use and best metadata management practices for discoverability. This work is closely tied to OCLC’s participation in the IIIF Community.
A key benefit of IIIF is aggregation. In 2019, OCLC started a research project that created an index of all CONTENTdm metadata and a discovery interface, providing access to 11 million digitized materials represented by 30 million images, all with IIIF standard support. This project showed the benefits of building an aggregated index and the challenges of working with highly heterogeneous metadata.
Based on the findings of that project, OCLC launched a linked data pilot project to explore how we could help CONTENTdm users create and manage linked data for cultural materials. We used the MediaWiki Wikibase platform and, working with the pilot participants, designed a new data model. This effort reinforced the value of applying decentralized domain expertise to converting record metadata into linked data and identified transformation workflows.
This presentation will discuss the project findings and demonstrate the services and applications we developed.
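As a small illustration of working with IIIF manifests of the kind aggregated in such an index, the following sketch reads a hypothetical Presentation API 2.x manifest and lists canvas labels and image URLs; the manifest URL is illustrative.

```python
# Sketch of listing canvases and image URLs from a IIIF Presentation v2 manifest.
import requests

manifest_url = "https://example.org/iiif/manifest.json"  # illustrative
manifest = requests.get(manifest_url, timeout=30).json()

for canvas in manifest["sequences"][0]["canvases"]:
    label = canvas.get("label", "")
    image_id = canvas["images"][0]["resource"]["@id"]
    print(label, image_id)
```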

Slides

Video

DAY 5   |   2020-11-27   CONFERENCE
14:00-16:30h UTC OPEN SPACE
Lightning Talks

Semantic MediaWiki
Bernhard Krabina
KDZ - Zentrum für Verwaltungsforschung, Vienna, Austria

Slides

Video

Share VDE: A facilitator for the library community
Anna Lionetti
Casalini Libri, Fiesole, Italy

Slides

Video

Building Wikidata one Branch of Knowledge at a Time
Anchalee (Joy) Panigabutra-Roberts
University of Tennessee Libraries, Knoxville, USA

Slides

Video

W3C Entity Reconciliation CG
Fabian Steeg
hbz, Germany

Slides

Video

Building the SWIB20 participants map
Joachim Neubert
ZBW - Leibniz Information Centre for Economics, Germany

Slides

Video

Linking K10plus library union catalog with Wikidata
Jakob Voss
Verbundzentrale des GBV, Germany

Video

   
Breakout sessions

CONTACT (PROGRAMME)

hbz

Adrian Pohl
T. +49-(0)-221-40075235
E-mail swib(at)hbz-nrw.de

 

ZBW

Joachim Neubert
T. +49-(0)-40-42834462
E-mail j.neubert(at)zbw.eu

 

Twitter: #swib20