Facilitating reusable third-party annotations in the digital edition

MarijnKoolen 204 views 60 slides Feb 22, 2019

About This Presentation

We argue the need for support of annotations on an edition made by researchers unaffiliated to the edition project, as a contribution to the explanatory material already present on the site, for purposes of private study or for publication in conjunction with a scholarly article. We demonstrate our ...


Slide Content

Facilitating Reusable Third-Party
Annotations in the Digital Edition
Marijn Koolen
(Royal Netherlands Academy of Arts and Sciences - Humanities Cluster)
Peter Boot
(Huygens ING)

Annotation in Scholarly Editions and Research, 22-02-2019, Wuppertal, Germany

Annotation as Scholarly Activity
●Annotations in Digital Editions
○Tend to be restricted to critical notes by creators of the edition
○Users rarely have support from editions for making their own annotations
●Annotation is a scholarly primitive (Unsworth 2000, Palmer et al. 2009)
○All scholars make annotations, use them to structure thoughts, gather data
○Either visible only in private copies, or invisible in shared source materials
○Add interpretations, explanations and perspectives
●Annotation is a broad but vaguely defined concept
○“nearly every type of digital research activity in the Humanities today is referred to or connected to annotating” (Niels-Oliver Walkowski on DARIAH Annotation WG survey, 2017)

Sticking to the unwritten rule

Third-Party Annotations
●Facilitate third-party annotations in the digital edition:
○Annotations made by researchers unaffiliated with the edition project
○Contribute to the explanatory material already present on the site
○Purpose: private study or publication alongside a scholarly article
●Making annotations a more visible part of scholarly communication
○“Visions of the scholarly web”
●Goal: approach with low threshold for participation
○For resource providers: tool that is easy to integrate into an existing edition
○For scholars: tool that supports different annotation tasks, allows rich querying/analysing
○Implementation:
■W3C Web Annotation data model and protocol (interoperability)
■JavaScript client talking with WA server
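
To make the W3C Web Annotation data model mentioned in the implementation bullet more concrete, the following is a minimal sketch in Python of what a single third-party annotation could look like when serialised as JSON-LD. The target URN, the creator URL and the comment text are invented placeholders, and the helper name make_annotation is hypothetical; nothing here is taken from the authors' client or server.

```python
import json


def make_annotation(target_id, body_text, creator, motivation="commenting"):
    """Build a minimal W3C Web Annotation as a plain dictionary."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": motivation,
        "creator": creator,
        "body": {
            "type": "TextualBody",
            "value": body_text,
            "format": "text/plain",
        },
        # The target is an identifier of the annotated resource,
        # not the URL of the web page that happens to display it.
        "target": target_id,
    }


annotation = make_annotation(
    target_id="urn:example:letter:001:paragraph:2",    # hypothetical identifier
    body_text="This passage refers to an earlier letter.",
    creator="https://example.org/users/researcher-1",  # hypothetical user
)

print(json.dumps(annotation, indent=2))
```

Because the annotation is ordinary JSON-LD, any tool that understands the Web Annotation model should in principle be able to read it, which is the interoperability argument made on the slide.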

Overview
1.Annotating Digital Editions on the Web
a.The problem of anchoring
b.The problem of semantics
2.Making Web Editions Annotatable
a.Anchors and semantics via RDFa
b.The problem of representation
3.Facilitating Third-Party Annotations
a.The consequences
b.Beyond Digital Editions

Annotating Digital Editions on the Web
The problem of anchoring

Annotating Digital Editions on the Web
●How to anchor annotation to specific location in the edition
●Ensure the annotation addresses a component in the logical information structure that defines the edition
○and not a location in an HTML page which is merely one representation of an edited text

State of the Art in Web-based Annotation
●Many open, browser-based tools for social annotation tasks
○Annotator.js
○Hypothes.is
○Dokie.li
○Pund.it
○Apache Annotator
●Advantages
○Annotate online materials
○Open formats: sharing, collaborating
●Disadvantages
○Limited knowledge of the structure of the annotated object
○Limited support for using/analysing annotations outside of annotated web page
○Limited support for annotating multimedia objects

Annotating Digital Editions on the Web
The problem of semantics

Understanding Annotated Object
●We argue an annotation tool should understand structure of object itself
a.Browser uses HTML representation
i.HTML is layout oriented, no meaningful connection with annotated object
ii.Annotation not robust against changes in HTML representation
b.Multiple websites may have (different) online versions/editions of the same object
i.Annotations all target same object but different URLs
c.Object may have multiple representations
i.Digital edition can have different transcriptions, translations, audio versions
ii.Annotations made on one representation should be accessible for others
d.Resource providers should be able to suggest suitable annotation types for different object components
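
The point about different URLs for the same object can be illustrated with a small sketch. It assumes that each web page declares which resource identifiers it displays and that annotations target those identifiers rather than page URLs. The page URLs, the URN and the function names are all hypothetical.

```python
from collections import defaultdict

# Two hypothetical web pages that present the same letter under different URLs.
pages = {
    "https://edition-a.example.org/letters/001.html": ["urn:example:letter:001"],
    "https://edition-b.example.org/view?id=42": ["urn:example:letter:001"],
}

# The annotation targets the resource identifier, with a selector inside it.
annotations = [
    {
        "id": "anno-1",
        "target": {
            "type": "SpecificResource",
            "source": "urn:example:letter:001",
            "selector": {"type": "TextPositionSelector", "start": 0, "end": 10},
        },
    }
]


def index_by_resource(annotations):
    """Group annotation ids by the identifier of the resource they target."""
    index = defaultdict(list)
    for annotation in annotations:
        index[annotation["target"]["source"]].append(annotation["id"])
    return index


def annotations_for_page(page_url, pages, index):
    """Return ids of all annotations on resources displayed in the given page."""
    found = []
    for resource_id in pages.get(page_url, []):
        found.extend(index.get(resource_id, []))
    return found


index = index_by_resource(annotations)
for url in pages:
    print(url, annotations_for_page(url, pages, index))
```

Under these assumptions the annotation shows up on both pages, and a redesign of either page's HTML leaves the annotation target untouched.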

Annotating Digital Editions as Web Pages
●Edition provider has:
○Resources + metadata (e.g. as TEI/XML)
●Transformed to HTML presentation format for web browser
○Browser (and annotation plugin) only sees presentation information
○Compare rich semantics of TEI file with poor semantics of HTML representation

TEI Header

TEI Body

HTML Version

Making Web Editions Annotatable
Anchors and Semantics via RDFa

Semantic Anchoring via RDFa
●Use RDFa to describe resources in web page
○Enrich HTML presentation of resource with semantic info on resource
●Develop annotation client that understands RDFa
○Parse RDFa information in web page to know annotatable components
○Capture structural semantic information in annotation
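
As a rough sketch of what "parse RDFa information in web page to know annotatable components" might involve, the Python example below walks an HTML fragment and collects every element that carries an RDFa resource attribute, together with its typeof value and its nearest enclosing resource. The vocabulary URL, the type names (Letter, ParagraphInLetter) and the URNs are made up for illustration, and the parser is a simplification that assumes well-formed, properly closed HTML. The actual client is written in JavaScript and may work quite differently.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}

SAMPLE_HTML = """
<div vocab="http://example.org/vocab#" typeof="Letter" resource="urn:example:letter:001">
  <p typeof="ParagraphInLetter" property="hasPart" resource="urn:example:letter:001:para:1">Dear Theo,</p>
  <p typeof="ParagraphInLetter" property="hasPart" resource="urn:example:letter:001:para:2">Thanks for your letter.</p>
</div>
"""


class RDFaResourceParser(HTMLParser):
    """Collect RDFa resources and the resource each one is nested in."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one entry per open element: its resource id or None
        self.resources = []  # tuples of (resource, typeof, parent resource)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        resource = attributes.get("resource")
        if resource is not None:
            # The parent is the closest enclosing element that is itself a resource.
            parent = next((r for r in reversed(self.stack) if r is not None), None)
            self.resources.append((resource, attributes.get("typeof"), parent))
        if tag not in VOID_TAGS:
            self.stack.append(resource)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


parser = RDFaResourceParser()
parser.feed(SAMPLE_HTML)
for resource, rdf_type, parent in parser.resources:
    print(resource, rdf_type, parent)
```

With such a hierarchy in hand, an annotation client could offer the letter and each paragraph as annotatable components and record the structural context (paragraph within letter) in the annotation target.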

Adding Semantics Through RDFa

Adding Semantics Through RDFa

Demo 1
Annotating RDFa-enriched editions on the Web

Making Web Editions Annotatable
The problem of representation

Annotating Digital Editions on the Web
●How to anchor an annotation to specific representation in the edition
●Ensure the information structure is described in sufficient detail to distinguish
○the edited text or document (the object of editing)
○its (multiple) representation(s) in the edition

Creative Works and Representation
●Digital Editions can have multiple representations of the same creative work
○E.g. image scan, transcript, translation
○Annotations may relate to a specific representation…
■E.g. a correction or comment on a word in the transcript or translation
○… or to the abstract creative work...
■E.g. background information for something referenced in the text
■Or a code to assign a phrase to a category of interest
○… or to a combination of representations
■E.g. linking a phrase in the transcript to a drawing in the page image
●Different structures may be leading in the HTML view
○E.g. document-centric (pages) and text-centric (sections, paragraphs) structures
○Annotations made on one structure should be translated to match alternative structure
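
One way to picture the translation between a document-centric and a text-centric structure is to treat both as segmentations of the same underlying text. The sketch below assumes exactly that: pages and paragraphs are hypothetical (id, start, end) ranges over one base string, and an annotation made relative to a page is re-expressed relative to a paragraph. The sample text, segment boundaries and function names are invented for illustration.

```python
BASE_TEXT = "Dear Theo, thanks for your letter. I am well."

# Two alternative structures over the same text, as (id, start, end) offsets.
PAGES = [("page-1", 0, 22), ("page-2", 22, 45)]
PARAGRAPHS = [("para-1", 0, 10), ("para-2", 11, 45)]


def to_global(segments, segment_id, start, end):
    """Convert offsets relative to one segment into offsets in the base text."""
    for seg_id, seg_start, _ in segments:
        if seg_id == segment_id:
            return seg_start + start, seg_start + end
    raise KeyError(segment_id)


def to_local(segments, global_start, global_end):
    """Find the segment containing a base-text range and return local offsets."""
    for seg_id, seg_start, seg_end in segments:
        if seg_start <= global_start and global_end <= seg_end:
            return seg_id, global_start - seg_start, global_end - seg_start
    raise ValueError("range is not contained in a single segment")


def translate(source, target, segment_id, start, end):
    """Re-express a range given in the source structure in the target structure."""
    global_start, global_end = to_global(source, segment_id, start, end)
    return to_local(target, global_start, global_end)


# An annotation on characters 5 to 11 of page 2, which is the word "letter".
paragraph_target = translate(PAGES, PARAGRAPHS, "page-2", 5, 11)
print(paragraph_target)
```

A range that crosses a paragraph boundary raises an error in this sketch; a fuller version would have to split it into several targets.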

Annotations on Different Levels
●How can we distinguish between abstract work and representation?
●How can we target annotations at these different levels?
●Which annotations should be shown in which context?

Editable and Edition Domains
●We created an FRBR-based ontology to distinguish between
○Editable objects (creative works, parts of works)
○Edition objects (representations, parts of representations)
●FRBR
○Functional Requirements for Bibliographic Records
○Distinguish Work - Expression - Manifestation - Item
○Van Gogh’s letter is a creative Work
○Diplomatic transcription is an expression of this work
■(and a creative work in itself)
○Translation is an expression of this work
■(and a creative work in itself)
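
The distinction between editable objects and edition objects can be sketched with two small Python classes. This is only an illustration of the idea, not the authors' ontology: the class names, fields and identifiers are hypothetical, and the selection rule (show annotations on the displayed representation plus annotations on the abstract work) is one possible answer to the question of which annotations to show in which context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    """Editable object: the abstract creative work (or a part of it)."""
    id: str
    label: str


@dataclass(frozen=True)
class Representation:
    """Edition object: one representation of a work in the edition."""
    id: str
    kind: str          # e.g. "transcript", "translation", "image scan"
    represents: Work


def annotations_in_context(annotations, shown):
    """Select annotations relevant when the representation `shown` is displayed.

    Annotations on the representation itself and on the abstract work are kept;
    annotations on other representations of the same work are left out.
    """
    relevant = {shown.id, shown.represents.id}
    return [a for a in annotations if a["target"] in relevant]


letter = Work("urn:example:letter:001", "Letter 1")
transcript = Representation("urn:example:letter:001:transcript", "transcript", letter)
translation = Representation("urn:example:letter:001:translation", "translation", letter)

annotations = [
    {"id": "anno-1", "target": transcript.id, "note": "correction of a transcribed word"},
    {"id": "anno-2", "target": letter.id, "note": "background on a person mentioned"},
]

shown_with_translation = [a["id"] for a in annotations_in_context(annotations, translation)]
shown_with_transcript = [a["id"] for a in annotations_in_context(annotations, transcript)]
print(shown_with_translation, shown_with_transcript)
```

In this sketch a correction of the transcript stays with the transcript, while background information attached to the work travels with every representation of it.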

Editable Domain

Editable Domain + FRBRoo

Edition Domain + FRBRoo

Editorial Domain

Representing Work and Two Text Versions

Adding Client - Linking External Resource RDF

External Resources in RDF

Demo 2
Annotations in the editable and edition domains

Facilitating Third-Party Annotations
And its consequences

Private, Shared, Public
●Annotations have permissions
○Private by default, can be shared (once implemented) or made public
○Importance of private annotations (Bradley, 2012): the role of personal reflection
■Also, McCarty’s point on the act of making an annotation (“knowing in doing”)
■Annotations are mainly for structuring your thoughts?
●Annotations for writing vs. annotations for reading
○Transition from ‘for writing’ (knowing in doing) to ‘for reading’ (knowing in using)
■I.e. from private/shared to public
○When does annotator consider annotation of interest for others?
■E.g. when they’re published alongside article to support arguments made
○Edit annotations to make them comprehensible for others
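
The private, shared and public levels can be expressed as a simple visibility check. The sketch below is an assumption about how such a rule might be written, with hypothetical field names (permission, creator, shared_with); it is not the permission model of the authors' server.

```python
def can_view(annotation, user):
    """Decide whether `user` may see `annotation`.

    Annotations are private unless stated otherwise. The creator always sees
    their own annotations, shared ones are visible to the listed collaborators,
    and public ones are visible to everyone.
    """
    permission = annotation.get("permission", "private")  # private by default
    if permission == "public":
        return True
    if user == annotation["creator"]:
        return True
    if permission == "shared":
        return user in annotation.get("shared_with", [])
    return False


annotations = [
    {"id": "anno-1", "creator": "alice"},
    {"id": "anno-2", "creator": "alice", "permission": "shared", "shared_with": ["bob"]},
    {"id": "anno-3", "creator": "alice", "permission": "public"},
]

visible_to_bob = [a["id"] for a in annotations if can_view(a, "bob")]
visible_to_carol = [a["id"] for a in annotations if can_view(a, "carol")]
print(visible_to_bob, visible_to_carol)
```

Moving an annotation from "for writing" to "for reading" then amounts to changing its permission value, ideally after editing it so that others can follow it.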

Impact
●What are consequences of third-party annotation for scholarship?
○Publish annotations alongside scholarly arguments
○Edition could become living document with ongoing visible communication
■Esp. within a collaborative project
■But also more publicly (how to avoid this becoming an impenetrable mess?)
●Feedback
○Edition owners/maintainers may want to incorporate certain annotations into the edition
○Third-party annotation to curated annotation/markup
●Editions of famous works or authors may attract much attention
○Open model: anyone can share anything with everyone
○Editorial model: public annotations need to be approved (by whom?)
○Private/shared model: only share with specific collaborators, enable limited conversations, can’t openly cite annotations

Low Threshold To Participate?
●We want our annotation approach to be easy to adopt by other editions
○Semantics can be embedded via RDFa without changing the layout
○The JavaScript client can be loaded in any RDFa-enriched web page
■Configurable to suit editor’s/annotator’s needs
○A Python REST server running Elasticsearch in the background for indexing and retrieval
■With access permissions per annotation (private, shared, public)
■Support for AnnotationCollections
●Both available on GitHub
○Server: https://github.com/marijnkoolen/scholarly-web-annotation-server
○Client: https://github.com/CLARIAH/scholarly-web-annotation-client
○Documentation is minimal and somewhat out-of-date
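
For readers unfamiliar with the W3C Web Annotation Protocol, the sketch below shows how a client might prepare the HTTP request that creates an annotation in an annotation container. The container URL is a hypothetical placeholder, and the exact routes and authentication of the server linked above are not documented in these slides, so this should be read as a generic protocol example rather than a description of that server's API. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Hypothetical container URL; the real server's routes may differ.
ANNOTATION_CONTAINER = "http://localhost:3000/api/annotations"

# Media type defined by the Web Annotation Protocol for annotation JSON-LD.
ANNO_MEDIA_TYPE = 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'


def build_create_request(annotation, container_url=ANNOTATION_CONTAINER):
    """Prepare a POST request that would create an annotation in a container."""
    payload = json.dumps(annotation).encode("utf-8")
    return urllib.request.Request(
        container_url,
        data=payload,
        method="POST",
        headers={"Content-Type": ANNO_MEDIA_TYPE, "Accept": ANNO_MEDIA_TYPE},
    )


annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": {"type": "TextualBody", "value": "Example comment"},
    "target": "urn:example:letter:001",  # hypothetical resource identifier
}

request = build_create_request(annotation)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen once a server is running and any required credentials have been added.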

Adding Client

Facilitating Third-Party Annotations
Beyond Digital Editions

Wrap Up
●We think support for third-party annotation in digital editions is valuable
○Several difficulties:
■Changing objects, unstable identifiers
■Openness comes at a price
○Our approach has pros and cons
■Pro: flexible, supports many tasks and multiple modalities, interoperable
■Cons: complex structure, esp. when using FRBR layers, easy to make mistakes
○Suggestions for improvement/simplification are welcome
●Plans
○Set up across CLARIAH infrastructure (funded 2019-2023)
○Experiment with pilots in different disciplines (historical science, media studies, literary studies, linguistics, ...)

References
Anderson, S., Blanke, T., & Dunn, S. (2010). Methodological commons: arts and humanities e-Science fundamentals. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 368(1925), 3779-3796.
Boot, P., Haentjens Dekker, R., Koolen, M., & Melgar, L. (2017). Facilitating Fine-grained Open Annotations of Scholarly Sources. In: Conference abstracts Digital Humanities 2017, Montreal.
Boot, P., & Koolen, M. (2018). A FRBRoo-based annotation ontology for digital editing. In: Conference abstracts European Association for Digital Humanities 2018, Galway.
Bradley, J. (2012). Towards a richer sense of digital annotation: Moving beyond a “media” orientation of the annotation of digital objects. Digital Humanities Quarterly, 6(2).
Palmer, C. L., Teffeau, L. C., & Pirmann, C. M. (2009). Scholarly Information Practices in the Online Environment. Report commissioned by OCLC Research.
Unsworth, J. (2000). Scholarly Primitives: what methods do humanities researchers have in common, and how might our tools reflect this? In: Humanities Computing: formal methods, experimental practice symposium, King’s College, London.
Walkowski, N.-O. (2016). The Landscape of Digital Annotation and Its Meaning. Conference on Language Technologies & Digital Humanities, Ljubljana, 2016.

Thank you!
Questions?

EARMARK
●Extremely Annotational RDF Markup
●Goals:
○Allow multiple annotators to annotate the same object (overlapping annotations)
○Refer to external entities
●Solution
○Java application
○Works on XML/TEI files
○Derives identifier from XML structure, uses XPath and character offsets and range to identify text elements
○Allows both standoff annotation and embedding as markup
○RDF for references to anything in the world

Open Web Annotation Using Dokie.li
Source: http://csarven.ca/dokieli-rww

Conclusions
●Approach to enable third-party annotation in digital editions
○Technical approach is only first step!
●Annotation approach to support fluid nature of annotations
○Support need for critical distinctions in targeting
●All code on GitHub
○Server: https://github.com/marijnkoolen/scholarly-web-annotation-server
○Client: https://github.com/CLARIAH/scholarly-web-annotation-client
○Documentation is minimal and somewhat out-of-date