CyKG-RAG: Towards knowledge-graph enhanced retrieval augmented generation for cybersecurity

kabulkurniawan 10 views 16 slides Mar 11, 2025

About This Presentation

The ICS-SEC KG: An Integrated Cybersecurity
Resource for Industrial Control Systems


Slide Content

CyKG-RAG: Towards knowledge-graph
enhanced retrieval augmented generation for
cybersecurity
Kabul Kurniawan, Elmar Kiesling, Andreas Ekelhart

Outline
Motivation & Challenges
LLMs for Cybersecurity
CyKG-RAG Framework
Experiment & Use-Case Application
Conclusion & Future Work

Motivation and Challenges
▪Knowledge-driven approaches have gained prominence in cybersecurity.
▪A discrepancy persists between the competencies of cybersecurity professionals and
their capacity to use semantic technologies effectively in security analysis.
▪Competency gap among cybersecurity professionals:
▪Enabling security professionals to access, query, and leverage knowledge graphs
without requiring in-depth expertise in the semantic domain has long been a challenge.
▪Critical issues:
▪Global shortage of skilled personnel.
▪Steep learning curve.
▪Analyst fatigue and overload.
▪Rapidly evolving threat landscape.

LLMs for Cybersecurity
▪Large Language Models (LLMs) have potential in cybersecurity tasks;
however:
▪They struggle with complex reasoning tasks.
▪They have difficulty maintaining consistency across multiple queries.
▪They are unreliable when dealing with factual data.
▪“Hallucination”: they generate content without supporting
evidence.
▪Addressing LLM limitations with Retrieval Augmented Generation
(RAG):
▪Works well when answers are contained in specific regions of text.
▪Enables answering user queries without additional model training.
▪Limitation: struggles to synthesize insights from disparate but
related information.
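
The retrieve-then-generate idea behind RAG can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the passages, the Jaccard word-overlap score, and the function names `retrieve` and `build_prompt` are all assumptions chosen for brevity, and a real system would use a learned retriever and pass the prompt to an LLM.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, k=2):
    """Return the k passages with the highest Jaccard overlap with the question."""
    q = tokenize(question)

    def score(passage):
        t = tokenize(passage)
        return len(q & t) / len(q | t) if q | t else 0.0

    return sorted(passages, key=score, reverse=True)[:k]

def build_prompt(question, passages, k=2):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical mini-corpus, for illustration only.
PASSAGES = [
    "Phishing is a technique used to gain initial access.",
    "SPARQL is a query language for RDF graphs.",
    "Brute force is a technique used for credential access.",
]
```

Because each passage is scored on its own, nothing links related facts that live in different passages, which is the limitation noted in the last bullet.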
https://www.enriquedans.com/2024/05/rankeando-los-asistentes-generativos.html
https://medium.com/@binhle.researcher/a-story-about-retrieval-augmented-generation-rag-d9a9f6e355ba

CyKG-RAG: KG based RAG System for Cybersecurity
▪Integrates cybersecurity knowledge represented as graphs.
▪Leverages the rich semantic relationships and structured data within KGs.
▪Provides contextually relevant information.
Images taken from:
https://www.linkedin.com/posts/tonyseale_data-leaders-are-adapting-to-the-profound-activity-7250414921702600704-29I7
https://www.pdq.com/sysadmin-glossary/cybersecurity/

SEPSES Continuously Updated
Cybersecurity Knowledge Graphs [1]
Interface 1: SPARQL Endpoint
Interface 2: Linked Data Interface
Interface 3: Dump Files
Interface 4: TPF Interface
▪Integrated set of resources,
including vocabularies derived
from well-established standards in
the cybersecurity domain.
▪Continuously updated
Cybersecurity Knowledge Graph
generated from major cybersecurity
information sources such as CVE, CWE,
CAPEC, CVSS, and MITRE ATT&CK.
▪Provided through various interfaces:
SPARQL endpoint, dump files,
Linked Data interface, and Triple
Pattern Fragments (TPF).
[1] Kiesling, Elmar, et al. "The SEPSES knowledge graph: an integrated resource for cybersecurity." International
Semantic Web Conference. Cham: Springer International Publishing, 2019.
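
As a rough sketch of how a client might use the SPARQL endpoint interface, the snippet below builds a query for one CVE entry and sends it using the standard SPARQL 1.1 Protocol. The endpoint URL, the `cve:` namespace, and the `cve:id` property are placeholders assumed for illustration; they are not taken from the SEPSES documentation and would need to be replaced with the actual service address and vocabulary.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumptions for illustration: placeholder endpoint and vocabulary,
# not the real SEPSES service address or schema.
ENDPOINT = "https://example.org/sepses/sparql"
CVE_NS = "http://example.org/vocab/cve#"

def build_cve_query(cve_id):
    """Build a SPARQL query that lists the properties of one CVE entry."""
    if not re.fullmatch(r"CVE-\d{4}-\d{4,}", cve_id):
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return (
        f"PREFIX cve: <{CVE_NS}>\n"
        "SELECT ?p ?o WHERE {\n"
        f'  ?s cve:id "{cve_id}" ;\n'
        "     ?p ?o .\n"
        "} LIMIT 50"
    )

def run_query(query, endpoint=ENDPOINT):
    """Send the query via HTTP GET and return the JSON result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]
```

Validating the identifier before it is placed in the query string keeps arbitrary user input out of the SPARQL text.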

CyKG-RAG Framework

CyKG-RAG Framework – Example Approach

Experiment and Use Case Application
(i) General Questions on Cyber Threat Intelligence
•We leverage our existing SEPSES CSKG as a knowledge source for our RAG system.
•Answering questions related to cyber threat intelligence (e.g., MITRE ATT&CK)
including queries about attack techniques, tactics, etc.
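
One way to picture a knowledge graph serving as the knowledge source is to collect the one-hop facts around every entity named in a question and hand them to the LLM as context. The sketch below is an assumption-laden toy, not the CyKG-RAG pipeline: the four triples are a hand-written stand-in for the graph, the predicate names are hypothetical, and entity matching is a plain substring test.

```python
# Toy triples for illustration; a real system would query the knowledge graph.
TRIPLES = [
    ("T1566", "name", "Phishing"),
    ("T1566", "tactic", "Initial Access"),
    ("T1110", "name", "Brute Force"),
    ("T1110", "tactic", "Credential Access"),
]

def find_entities(question, triples):
    """Return subjects whose ID or name literal appears in the question."""
    q = question.lower()
    found = []
    for s, p, o in triples:
        mentioned = s.lower() in q or (p == "name" and o.lower() in q)
        if mentioned and s not in found:
            found.append(s)
    return found

def neighborhood(entity, triples):
    """Collect the one-hop facts about an entity as short statements."""
    return [f"{s} {p} {o}" for s, p, o in triples if s == entity]

def kg_context(question, triples=TRIPLES):
    """Gather graph facts for every entity mentioned in the question."""
    facts = []
    for entity in find_entities(question, triples):
        facts.extend(neighborhood(entity, triples))
    return facts
```

For the question "Which tactic does phishing belong to?", `kg_context` returns the name and tactic statements for T1566, which could then be placed in the prompt ahead of the question.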

Experiment and Use Case Application
(i) General Questions on Cyber Threat Intelligence
Results:

Experiment and Use Case Application
(ii) Vulnerability Assessment
•Our RAG system can handle the dynamic retrieval of rapidly evolving cybersecurity information.
•For example, the system can assess the current vulnerability of a specific component or product
identified by its CVE ID.
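
Before any dynamic retrieval can happen, the CVE IDs in a user question have to be recognized. A minimal sketch of that step, assuming the standard `CVE-YYYY-NNNN` identifier syntax (four-digit year, four or more sequence digits); the function name `extract_cve_ids` is hypothetical and not taken from the paper.

```python
import re

# CVE identifiers: "CVE-", a four-digit year, then four or more digits.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)

def extract_cve_ids(text):
    """Return the distinct CVE IDs in text, upper-cased, in order of appearance."""
    seen = []
    for match in CVE_PATTERN.findall(text):
        cve_id = match.upper()
        if cve_id not in seen:
            seen.append(cve_id)
    return seen
```

Each extracted ID can then drive a lookup against the continuously updated graph, so the answer reflects the current record and not what the model memorized during training.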

Experiment and Use Case Application
(ii) Vulnerability Assessment
Results:

Experiment and Use Case Application
(iii) Security Log Analysis
•Demonstrates the use of our RAG system to facilitate question answering against private or local
information (e.g., log sources).
•For this experiment, we used a dataset derived from the AIT dataset and constructed a KG from
it using an LLM.
•Additionally, we generated vector embeddings to enable full-text search against the constructed
KG.
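
The two steps above (building a KG from logs, then searching it through embeddings) can be sketched with simple stand-ins. Everything here is an assumption for illustration: a regular expression replaces the LLM extraction step, a term-count vector replaces a learned embedding, and the `host process: message` log format and predicate names are hypothetical, not the AIT dataset schema.

```python
import math
import re
from collections import Counter

# Hypothetical log format: "<host> <process>: <message>".
LOG_LINE = re.compile(r"(?P<host>\S+) (?P<proc>\w+): (?P<event>.+)")

def log_to_triples(line):
    """Rule-based stand-in for the LLM extraction step: one log line -> triples."""
    m = LOG_LINE.match(line)
    if not m:
        return []
    return [(m["host"], "runs", m["proc"]), (m["proc"], "logged", m["event"])]

def embed(text):
    """Sparse term-count vector, a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def search(query, triples, k=1):
    """Rank triples by similarity between the query and the verbalized triple."""
    q = embed(query)
    return sorted(triples, key=lambda t: cosine(q, embed(" ".join(t))), reverse=True)[:k]
```

Verbalizing each triple as text before embedding it is what lets a free-text question such as "failed password" reach graph content without the analyst writing a graph query.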

Experiment and Use Case Application
(iii) Security Log Analysis
Results:

Summary and Future Work
•Cybersecurity knowledge evolves quickly and environments are dynamic.
•Information arrives in various forms (logs, cybersecurity resources).
•LLMs used for cybersecurity analysis are difficult to retrain with updated
information.
•They struggle with factual accuracy and are prone to hallucination.
•Combining retrieval-augmented generation (RAG) techniques with symbolic
knowledge graphs and semantic search for advanced cybersecurity analysis:
•This enables retrieval and synthesis of continuously updated, relevant
information.
Future Work:
•Comprehensive evaluation in broader experimental settings.
•Investigate effectiveness of different RAG techniques.
•Explore optimal combinations of RAG approaches for various scenarios.

Thank you!
This work has been partially supported and funded by the Austrian Research Promotion Agency (FFG) via the Austrian Competence
Center for Digital Production (CDP) under the contract number 881843. SBA Research (SBA-K1) is a COMET Centre within the
COMET – Competence Centers for Excellent Technologies Programme and funded by BMK, BMAW, and the federal state of
Vienna. COMET is managed by FFG. This work is also part of the TEAMING.AI project which receives funding in the European
Commission’s Horizon 2020 Research Programme under Grant Agreement Number 957402 (www.teamingai-project.eu).
Kurniawan, Kabul, Elmar Kiesling, and Andreas Ekelhart. "CyKG-RAG:
Towards knowledge-graph enhanced retrieval augmented generation for
cybersecurity." RAGE-KG Workshop, ISWC 2024.