Knowledge engineering: from people to machines and back

elenasimperl 147 views 44 slides May 28, 2024

About This Presentation

Keynote at the 21st European Semantic Web Conference


Slide Content

Knowledge engineering: from people to machines and back
Elena Simperl, @esimperl

Knowledge engineering
•Building and maintaining knowledge-based systems
•Sits between
•Software engineering: how to build and maintain software systems
•AI: how to represent, learn, and reason over knowledge computationally

KNOWLEDGE ENGINEERING: BEFORE
•Gathering curated knowledge from experts and encoding it into knowledge bases.
•Mostly manual process, supported by methodologies, methods, tools.
•Not scalable, considerable up-front investment.
Source: NeOn project, 2010

KNOWLEDGE ENGINEERING: TODAY
•Large knowledge bases, drawn from heterogeneous data, automatic process with human-in-the-loop.
•Mainstream adoption in search, intelligent assistants, digital twins, supply chain management, legal compliance etc.
•With access to data and (off-the-shelf) AI capabilities, costs are a fraction of what they were decades ago.
•New knowledge engineering experiences: prompting, conversational tools, co-auditing automatically generated content
Source: Fathallah et al., 2024

My KE journey

Standard knowledge engineering process during my PhD
Source: me, 2007

Case studies, guidance, templates to reuse ontologies effectively

Predicting costs, understanding the most resource-intensive part of the process

Two ways forward
Automation
Involve more people

Two challenges
Participation
Data quality

Participation: aligning tasks to people’s motivations, designing incentives

People implicitly contribute to the knowledge base while going about their regular work. Benefits are highlighted.
Challenges: not suitable for every use case, shallow semantics

People play games. Game data is turned into computational knowledge.
Challenges: design, attracting and retaining players, costs

People are part of online communities that build knowledge bases.
Challenges: uneven participation, data quality

What if automation were an option?
Human computation
Large language models

Knowledge integrity in Wikidata
Wikidata is a widely used secondary source of knowledge
Data is trusted when it is backed by its primary source
How references are encoded in Wikidata
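
As a rough illustration of how a reference hangs off a statement, the sketch below walks a statement in the shape of Wikidata's JSON export and collects its reference URLs. P854 (reference URL) and P1082 (population) are real Wikidata property IDs; the statement itself, its values and the example URL are hand-written assumptions for this example, not data taken from Wikidata or from the talk.

```python
from typing import Any

REFERENCE_URL = "P854"  # Wikidata property: reference URL

# Illustrative statement following the layout of Wikidata's JSON export.
# All values here are made up for the example.
statement: dict[str, Any] = {
    "mainsnak": {
        "snaktype": "value",
        "property": "P1082",  # population
        "datavalue": {"value": {"amount": "+240000", "unit": "1"}, "type": "quantity"},
    },
    "type": "statement",
    "rank": "normal",
    "references": [
        {
            "snaks": {
                "P854": [
                    {
                        "snaktype": "value",
                        "property": "P854",
                        "datavalue": {"value": "https://example.org/census", "type": "string"},
                    }
                ]
            },
            "snaks-order": ["P854"],
        }
    ],
}


def reference_urls(stmt: dict[str, Any]) -> list[str]:
    """Collect every reference URL attached to one statement."""
    urls = []
    for ref in stmt.get("references", []):
        for snak in ref.get("snaks", {}).get(REFERENCE_URL, []):
            if snak.get("snaktype") == "value":
                urls.append(snak["datavalue"]["value"])
    return urls


def is_referenced(stmt: dict[str, Any]) -> bool:
    """True if the statement carries at least one reference block."""
    return bool(stmt.get("references"))


print(reference_urls(statement))
```

A statement without a `references` key yields an empty list, which is the case a knowledge-integrity check would want to flag first.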

Are references easy to navigate, authoritative, relevant?
Yes, but some variations across languages
English, Swedish, Japanese, Spanish, Dutch, Portuguese

Does the source support the claim?
Yes, but it is a challenging task to automate because of the heterogeneity and multimodality of evidence

(Chandler Fashion Center, directions, Highway 101 & Highway 202)
Source page: Wikivoyage, Chandler (Arizona), old revision as edited by SelfieCity at 20:58, 13 October 2019.
Evidence: Chandler Fashion Center (Chandler Mall), 3111 W Chandler Blvd (Highway 101 & Highway 202), +1 480-812-8488. M-Sa 10AM-9PM, Su 11AM-6PM; restaurant and dept store hours vary. An upscale shopping mall with various department stores. (updated Sep 2018)

Datasets
WDV: verbalisation
WTR: textual references
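
Verbalisation here means turning a triple into a natural-language sentence that can be compared with the text of a reference. The sketch below is a naive template-based stand-in to show the idea; it is not how WDV was built, and the template wording and the `verbalise` helper are assumptions made for this example.

```python
# Hypothetical per-predicate templates; anything not listed uses the default.
TEMPLATES = {
    "directions": "{s} can be reached via {o}.",
}
DEFAULT_TEMPLATE = "The {p} of {s} is {o}."


def verbalise(subject: str, predicate: str, obj: str) -> str:
    """Render a (subject, predicate, object) triple as one sentence."""
    template = TEMPLATES.get(predicate, DEFAULT_TEMPLATE)
    return template.format(s=subject, p=predicate, o=obj)


print(verbalise("Chandler Fashion Center", "directions", "Highway 101 & Highway 202"))
```

Fixed templates break down quickly across thousands of properties, which is one reason verbalisation is treated as a dataset and modelling problem rather than a lookup table.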
WTR
416 triples, 3 classes (supports, refutes, not-enough-information), free-text rationale to manage diversity of reference layouts
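
One way to picture a WTR-style record is sketched below: a triple, its reference, one of the three classes, and a free-text rationale, together with a plain accuracy function for comparing predicted labels against annotations. The three label values come from the slide; the field names, the `AnnotatedTriple` class and the toy labels are assumptions for illustration and do not reflect the dataset's actual schema or results.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    SUPPORTS = "supports"
    REFUTES = "refutes"
    NOT_ENOUGH_INFO = "not-enough-information"


@dataclass(frozen=True)
class AnnotatedTriple:
    """Hypothetical record layout: a triple, its reference, and the annotation."""
    subject: str
    predicate: str
    obj: str
    reference_url: str
    label: Label
    rationale: str  # free text, since reference layouts vary widely


def accuracy(gold: list[Label], predicted: list[Label]) -> float:
    """Share of predictions that match the gold labels."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Toy labels only, to show the calculation.
gold = [Label.SUPPORTS, Label.REFUTES, Label.NOT_ENOUGH_INFO, Label.SUPPORTS]
predicted = [Label.SUPPORTS, Label.REFUTES, Label.SUPPORTS, Label.SUPPORTS]
print(accuracy(gold, predicted))
```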

Complex task: 87.5% accuracy when explicit textual evidence is present

What’s next


Dagstuhl seminar 09/22

Hackathon 08/23


How do knowledge engineers use LLMs
Ethnographic observations of the hackathon
Documentary analysis of documents collected during the hackathon
Semi-structured interviews with hackathon participants

Findings
Challenges around lack of datasets and evaluation protocols
Guidance and best practices needed on knowledge prompting and auditing LLM outputs
Limited awareness of responsible AI concerns like biases, fairness

LLMs and requirements engineering
AI assistants that automate tasks, generate documentation, remove technical jargon
84.1% of knowledge engineers lack access to specialised tools
66.1% lack access to guidelines
Manual process, using text editors and spreadsheets
[1] Monfardini GK, Salamon JS, Barcellos MP. Use of Competency Questions in Ontology Engineering: A Survey. Int. Conf. on Conceptual Modeling, 2023

Example: OntoChat
https://huggingface.co/spaces/b289zhan/OntoChat

User research: mock-ups

Using LLMs and knowledge graphs to reverse engineer competency questions

Explanations

Cards for KGs and KG embeddings

Final thoughts
•Knowledge engineering is transforming. We need:
•Methodologies with stronger focus on hybrid workflows, new roles (prompting), new skills (co-auditing, responsible AI)
•Viable AI copilots, multimodal support, scalable to real knowledge graphs
•Datasets, benchmarks, and evaluation protocols to improve the technical performance of LLMs in knowledge engineering tasks (e.g. knowledge prompts, multi-turn dialog, safety, user stories, CQs, documentation etc.)
•Campaigns to share and reuse data where possible to reduce data-related costs (prompting collectives)
•Guidance, best practices to use AI effectively
•Mechanisms to demonstrate AI-co-generated knowledge graphs can be trusted

Where is the web in all this?

This talk presented work by: Gabriel Amaral, Jacopo de Berardinis, Fiorela Ciroku, Jonathan Hare, Lucie Kaffee, Jongmo Kim, Elisavet Koutsiana, Albert Meroño-Peñuela, Michelle Nwachukwu, Alessandro Piscopo, Valentina Presutti, Odinaldo Rodrigues, Johanna Walker, Bohui Zhang, Yihang Zhao
Thank you also to participants of the Dagstuhl seminar, hackathon, and ESWC special track.
Elena Simperl
@esimperl