Knowledge engineering: from people to machines and back
elenasimperl
44 slides
May 28, 2024
About This Presentation
Keynote at the 21st European Semantic Web Conference
Size: 3.57 MB
Language: en
Added: May 28, 2024
Slides: 44 pages
Slide Content
Knowledge engineering: from people to machines and back
Elena Simperl, @esimperl
Knowledge engineering
•Building and maintaining knowledge-based systems
•Sits between
•Software engineering: how to build and maintain software systems;
•AI: how to represent, learn, and reason over knowledge computationally
KNOWLEDGE ENGINEERING: BEFORE
•Gathering curated knowledge from experts and encoding it into knowledge bases.
•Mostly manual process, supported by methodologies, methods, tools.
•Not scalable, considerable up-front investment.
3
Source: NeOn project, 2010
KNOWLEDGE ENGINEERING: TODAY
•Large knowledge bases, drawn from heterogeneous data, automatic process w/ human-in-the-loop.
•Mainstream adoption in search, intelligent assistants, digital twins, supply chain management, legal compliance etc.
•W/ access to data and (off-the-shelf) AI capabilities, costs are a fraction of what they were decades ago.
•New knowledge engineering experiences: prompting, conversational tools, co-auditing automatically generated content
4
Source: Fathallah et al., 2024
My KE journey
5
Standard knowledge engineering process during my PhD
6
Source: me, 2007
Case studies, guidance, templates to reuse ontologies effectively
7
Predicting costs, understanding the most resource-intensive part of the process
8
Two ways forward
Automation
Involve more people
9
Two challenges
Participation
Data quality
10
Participation: aligning tasks to people’s motivations, designing incentives
11
People implicitly contribute to the knowledge base while going about their regular work. Benefits are highlighted
Challenges: not suitable for every use case, shallow semantics
12
People play games. Game data is turned into computational knowledge
Challenges: design, attracting and retaining players, costs
13
People are part of online communities that build knowledge bases
Challenges: uneven participation, data quality
14
What if automation were an option?
15
Human computation
Large language models
Knowledge integrity in Wikidata
Wikidata is a widely used secondary source of knowledge
Data is trusted when it is backed by its primary source
How references are encoded in Wikidata
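As a rough illustration of the point above, the sketch below walks one statement in the shape used by Wikidata's JSON export and collects its reference URLs. P854 ("reference URL") and P1082 ("population") are real Wikidata properties; the statement itself, its value, and the example.org URL are invented for this sketch, and the helper name reference_urls is hypothetical.

```python
# Hand-written statement in the shape of Wikidata's JSON export.
# The value and the URL are made up for illustration.
statement = {
    "mainsnak": {
        "snaktype": "value",
        "property": "P1082",  # population
        "datavalue": {"value": {"amount": "+240000", "unit": "1"}, "type": "quantity"},
    },
    "references": [
        {
            "snaks": {
                "P854": [  # reference URL
                    {
                        "snaktype": "value",
                        "property": "P854",
                        "datavalue": {"value": "https://example.org/census", "type": "string"},
                    }
                ]
            }
        }
    ],
}


def reference_urls(stmt):
    """Return every reference URL (P854) attached to a statement."""
    urls = []
    for ref in stmt.get("references", []):
        for snak in ref.get("snaks", {}).get("P854", []):
            # Skip "novalue"/"somevalue" snaks, which carry no datavalue.
            if snak.get("snaktype") == "value":
                urls.append(snak["datavalue"]["value"])
    return urls


print(reference_urls(statement))
```

A statement with no "references" key yields an empty list, which is one simple way to flag claims that are not backed by a primary source.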
Are references easy to navigate, authoritative, relevant?
Yes, but some variations across languages
17
English, Swedish, Japanese, Spanish, Dutch, Portuguese
Does the source support the claim?
Yes, but challenging task to automate because of heterogeneity and multimodality of evidence
18
(Chandler Fashion Center, directions, Highway 101 & Highway 202)
[Screenshot: Wikivoyage page for Chandler (Arizona), old revision of 13 October 2019. The evidence passage lists Chandler Fashion Center at 3111 W Chandler Blvd (Highway 101 & Highway 202).]
Complex task, 87.5% accuracy when explicit textual evidence present
27
What’s next
28
30
Dagstuhl seminar 09/22
Hackathon 08/23
33
How do knowledge engineers use LLMs?
Ethnographic observations of the hackathon
Documentary analysis of documents collected during the hackathon
Semi-structured interviews with hackathon participants
Findings
Challenges around lack of datasets and evaluation protocols
Guidance and best practices needed on knowledge prompting and auditing LLM outputs
Limited awareness of responsible AI concerns like biases, fairness
35
LLMs and requirements engineering
AI assistants that automate tasks, generate documentation, remove technical jargon
84.1% of knowledge engineers lack access to specialised tools
66.1% lack access to guidelines
Manual process, using text editors and spreadsheets
[1] Monfardini GK, Salamon JS, Barcellos MP. Use of Competency Questions in Ontology Engineering: A Survey. Int. Conf. on Conceptual Modeling, 2023
Using LLMs and knowledge graphs to reverse engineer competency questions
39
Explanations
40
Cards for KGs and KG embeddings
Final thoughts
•Knowledge engineering is transforming. We need
•Methodologies w/ stronger focus on hybrid workflows, new roles (prompting), new skills (co-auditing, responsible AI)
•Viable AI copilots, multimodal support, scalable to real knowledge graphs
•Datasets, benchmarks, and evaluation protocols to improve the technical performance of LLMs in knowledge engineering tasks (e.g. knowledge prompts, multi-turn dialog, safety, user stories, CQs, documentation etc.)
•Campaigns to share and reuse data where possible to reduce data-related costs (prompting collectives)
•Guidance, best practices to use AI effectively
•Mechanisms to demonstrate AI-co-generated knowledge graphs can be trusted
42
Where is the web in all this?
43
This talk presented work by: Gabriel Amaral, Jacopo de Berardinis, Fiorela Ciroku, Jonathan Hare, Lucie Kaffee, Jongmo Kim, Elisavet Koutsiana, Albert Meroño-Peñuela, Michelle Nwachukwu, Alessandro Piscopo, Valentina Presutti, Odinaldo Rodrigues, Johanna Walker, Bohui Zhang, Yihang Zhao
Thank you also to participants of the Dagstuhl seminar, hackathon, and ESWC special track.
Elena Simperl
@esimperl
44