Agenda
•Stakeholder and Requirements
•Proceeding
•Project plan
•High Level Architecture
•Data
•Foundation Model
Stakeholder
●Cyber Command of Swiss Army
●Characteristics: The army must be able to independently carry out tasks and have an impact in
cyber and electromagnetic space. For example, it must be able to detect and thwart a cyber
attack on its IT systems.
●Job: Detecting and mitigating cybersecurity risks. Making strategic decisions based on
received information. Implementing security measures for Switzerland (people, residents,
government, public infrastructure).
●Pains: Data/information overload, foreign adversary risk, fast-evolving technologies
Requirements
●An anticipation engine should be developed that allows the Swiss Armed Forces Cyber Command
to navigate future trends in accordance with predefined business rules
●A user interface should be created for interacting with the findings
Proceeding
Anticipation Engine for Swiss Armed Forces Cyber Command
●Technique Selection: Retrieval-Augmented Generation (RAG) Model
○Why RAG Model?
■Bias Mitigation: Effectively mitigates biases by combining retrieval-based and
generative approaches.
■Accuracy and Reliability: Provides accurate and reliable insights tailored to specific
business rules and objectives.
■Customization: Can be customized to generate relevant and actionable insights.
■Time Efficiency: Requires less time and fewer computational resources than
fine-tuning large language models.
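The retrieve-then-generate flow behind the RAG choice can be sketched as follows. This is a minimal illustration, not the project's implementation: the sample documents, the bag-of-words cosine scoring, and the helper names `retrieve` and `build_prompt` are assumptions made for this sketch, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents that share vocabulary with the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, passages, rules):
    """Combine business rules, retrieved context and the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        f"Rules: {rules}\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )


# Hypothetical sample corpus, invented for illustration only.
DOCUMENTS = [
    "Quantum computing research threatens current encryption standards.",
    "Drone swarms are being tested for electronic warfare.",
    "New football stadium opens in Bern.",
]

query = "Which technologies threaten encryption?"
passages = retrieve(query, DOCUMENTS, k=2)
prompt = build_prompt(query, passages, rules="Cite only retrieved sources.")
print(prompt)
```

In a full pipeline the prompt would be passed to the selected foundation model; keeping retrieval separate from generation is what allows the sources to be updated without retraining.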
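Project plan
[Gantt chart, March to May. Phases: CONCEPT, REALISATION. Legend: milestone, % completion.]
●Work items: Research, Sources, System Architecture, Data Architecture, Data Source, LLM, Orchestrator (Search in Database), Orchestrator (Prompt and Context with LLM), User Interface.
●% completion labels on the bars: 65%, 70%, 70%, 50%, 0%, 0%, 0%.
●Date ranges on the bars: March 3 – April 23, March 3 – April 23, April 1 – April 23, April 9 – April 23, April 23 – May 28, April 23 – May 7, April 23 – May 7.
●Orchestrator (Search in Database): 0%, April 30 – May 7
●Orchestrator (Prompt and Context with LLM): 0%, April 30 – May 7
●User Interface: 0%, May 7 – May 28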
High Level Architecture
High Level Architecture with Technologies
Data Collection Tools and Sources
•Types of sources: News sites, academic articles, think-tank publications.
•Initially planned to use Python for scraping.
•Switched to Watson Discovery after Lab 5. Advantages:
■User-Friendly Interface
■AI Capabilities, e.g., tech domain concepts
■Integrated Database Functionality
Challenges in Data Collection
●Data quality issues across all sources.
●Structural differences in scraped websites.
●Some high-quality sources are resistant to scraping.
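One way to cope with structural differences between scraped sites is to reduce every page to the same small record. The sketch below uses only Python's standard `html.parser`; the record fields (`source`, `title`, `text`), the `normalize` helper, and the sample page are assumptions for illustration and do not describe how Watson Discovery ingests documents.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect the <title> text and the text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._current = None  # tag currently being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        # Text outside <title>/<p> (menus, footers) is ignored.
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            elif text:
                self.paragraphs.append(text)
            self._current = None


def normalize(html, source):
    """Turn one HTML page into a uniform record (hypothetical schema)."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    return {
        "source": source,
        "title": parser.title,
        "text": "\n".join(parser.paragraphs),
    }


# Invented sample page, for illustration only.
SAMPLE = (
    "<html><head><title>Cyber  Trends</title></head><body>"
    "<nav>Menu</nav><p>First <b>key</b> point.</p><p>  </p>"
    "<p>Second point.</p></body></html>"
)
record = normalize(SAMPLE, source="example-news")
print(record)
```

Real pages usually need site-specific rules on top of this, which is part of why data quality differs so much between sources.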
Analysis and Modelling/Next steps
●Enhancing data pre-processing.
●Analysis ideas:
■Frequency
■Sentiment
■Correlation
●Topic Modelling: Latent Dirichlet Allocation (LDA) or Non-negative Matrix Factorization (NMF).
●Time-Series Analysis: Using timestamped data to analyze how certain topics or terms trend over time.
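The frequency and time-series ideas above can be combined by counting tracked terms per month. The following is a minimal sketch under stated assumptions: the records, their dates, and the tracked terms are invented sample data, and `monthly_term_counts` is a hypothetical helper name.

```python
import re
from collections import Counter, defaultdict
from datetime import date


def monthly_term_counts(records, terms):
    """Count occurrences of the tracked terms per calendar month.

    records: iterable of (publication_date, text) pairs.
    terms:   words whose trend should be followed.
    """
    wanted = {t.lower() for t in terms}
    counts = defaultdict(Counter)
    for published, text in records:
        month = published.strftime("%Y-%m")
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            if token in wanted:
                counts[month][token] += 1
    # Sort by month so the result reads as a time series.
    return {month: dict(c) for month, c in sorted(counts.items())}


# Invented sample records, for illustration only.
RECORDS = [
    (date(2024, 3, 5), "Quantum sensors and drone swarms."),
    (date(2024, 3, 20), "Drone defence budgets rise."),
    (date(2024, 4, 2), "Quantum networks; quantum key distribution."),
]

trend = monthly_term_counts(RECORDS, ["quantum", "drone"])
print(trend)
```

Exact-token matching keeps the sketch short; it does not count "swarms" or "drones" as "drone", so a real analysis would likely add stemming or lemmatization during pre-processing.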
Foundation Model "Granite-13B-Chat-v2"
●Model Overview
■Large language model designed for conversational interactions, capable of understanding and generating
human-like text.
●Applicability to Use Case
■Provides the ability to aggregate, analyze, and disseminate complex and rapidly changing information
related to military technology advancements, use cases, capabilities, and global defense trends.
●Strengths:
■Conversational capabilities facilitate user interaction and engagement.
■Versatile and well-suited for understanding and generating text in the military domain.
Foundation Model "Granite-7B-Lab"
●Model Overview
■Large language model aligned with IBM's LAB (Large-scale Alignment for chatBots) method; considered here for
labelling tasks, i.e. processing and classifying large amounts of data.
●Applicability to Use Case
■Useful for data labeling tasks such as identifying trends, patterns, and actionable insights from diverse and
reliable sources.
●Strengths
■Efficiently processes and classifies large datasets, aiding in trend identification and predictive analysis.
■Helps address the pain point of data overload and the need for accuracy and reliability.
Foundation Model "Flan-T5-XXL-11B"
●Model Overview
■11-billion-parameter instruction-tuned language model based on the T5 encoder-decoder architecture, capable of
performing a wide range of natural language processing tasks.
●Applicability to Use Case
■Provides a comprehensive solution for developing the anticipation engine, offering accurate insights and
predictions.
●Strengths
■Versatile and capable of performing various NLP tasks, including text analysis, summarization, and
generation.
■Well-suited for handling complex and rapidly changing information in the military domain.
"Why Flan-T5-XXL-11B is the Best Choice"
●Comprehensive Solution: The Flan-T5-XXL-11B model is an 11-billion-parameter language model based on the
T5 architecture, capable of performing a wide range of natural language processing tasks. It provides a
comprehensive solution for developing the anticipation engine, offering accurate insights and predictions.
●Versatility: The model is versatile and capable of performing various NLP tasks, including text analysis,
summarization, and generation. It is well-suited for handling complex and rapidly changing information in
the military domain.