Slides from a talk I gave at Gosec 25 in Montréal on how machine learning can help cybersecurity professionals find interesting patterns in a ransomware dataset and attempt to predict the next possible attacks.
Disclaimer
This tool was developed independently on personal time and is not part of any official project or directive from my employer. The ideas, implementation, and code are solely my own.
Why
CCCS (Canadian Centre for Cyber Security) didn’t have such a mapping (ransomware to TTPs) available
SOC teams need TTPs to feed their SIEM and to run Purple Teaming exercises that ensure good coverage
Machine Learning can help anticipate the threats of tomorrow
Why not build a dataset for the community that anyone from any domain can use?
How
Gather and analyze as much data as possible about each ransomware group
Identify reliable sources for those ransomware analyses
Manually and programmatically parse the blogs in search of TTPs
Perform the mapping manually when none exists
Compile everything in a database
Perform Frequency Analysis and Predictions via Machine Learning
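The frequency analysis step can be illustrated with a minimal sketch. It assumes the compiled database can be reduced to a mapping from each ransomware group to a set of MITRE ATT&CK technique IDs. The group names (`group_a` to `group_d`), the `MAPPINGS` variable, and the `technique_frequency` helper are hypothetical, and the assignments are invented for illustration; they are not taken from the actual dataset.

```python
from collections import Counter

# Hypothetical mapping: group name -> set of MITRE ATT&CK technique IDs.
# Group names and assignments are invented for illustration only.
MAPPINGS = {
    "group_a": {"T1566", "T1059", "T1486", "T1490"},
    "group_b": {"T1078", "T1021", "T1486"},
    "group_c": {"T1566", "T1486", "T1490"},
    "group_d": {"T1078", "T1059", "T1486", "T1490"},
}


def technique_frequency(mappings):
    """Return (technique, count, share of groups), most frequent first.

    Ties are broken by technique ID so the ordering is deterministic.
    """
    counts = Counter(t for techniques in mappings.values() for t in techniques)
    total = len(mappings)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(tech, n, n / total) for tech, n in ranked]


if __name__ == "__main__":
    for tech, n, share in technique_frequency(MAPPINGS):
        print(f"{tech}: {n} groups ({share:.0%})")
```

A ranking like this is the kind of output a SOC team could use to decide which detections to prioritize, under the assumption that the mapped groups are representative.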
Which algorithm
Identify the right algorithms for the data we have:
Linear Regression? ❌ ransomware amounts fluctuate too much
Logistic Regression? ❌ not applicable here, classification algorithm
SVM? ❌ not applicable here, classification algorithm
Random Forests? ❌ with RaaS we have a giant spider web, not very easy to visualize
K-Means? ❌ not applicable here, clustering of numerical values
Gradient Boosting? ❌ not applicable here, useful for ranking features but ours are on the same scale
Markov? ❌ not applicable here, useful when the data is sequential, which is not the case here
Try, Fail, Try harder
Apriori: ✅ good candidate
Monte Carlo: ✅ good candidate
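To show why Apriori fits this kind of data, here is a minimal pure-Python sketch of frequent itemset mining, treating each ransomware group's set of techniques as one transaction. This is a generic textbook version of the algorithm, not the implementation used for the talk. The `TRANSACTIONS` data, the `apriori` and `confidence` helpers, and the `min_support` threshold of 0.5 are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical transactions: one set of ATT&CK technique IDs per group.
# The assignments are invented for illustration only.
TRANSACTIONS = [
    {"T1566", "T1059", "T1486", "T1490"},
    {"T1078", "T1021", "T1486"},
    {"T1566", "T1486", "T1490"},
    {"T1078", "T1059", "T1486", "T1490"},
]


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, from precomputed supports."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


if __name__ == "__main__":
    freq = apriori(TRANSACTIONS, min_support=0.5)
    for itemset, s in sorted(freq.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
        print(sorted(itemset), s)
```

On real data, a rule such as "groups that use technique X also tend to use technique Y" would be read from `confidence`, and the support threshold would need tuning to the size of the dataset.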
Potential Architecture
Meet the dataset
Outputs
218 Mapped Ransomwares so far! (Next version: about 70-80 more)
└ Frequency Analysis
└ Predictions
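One possible way to read the Predictions output in Monte Carlo terms is sketched below. It assumes future attacks resemble the mapped ones, so each simulated attack is drawn uniformly, with replacement, from the observed technique profiles. The `PROFILES` data and the `estimate_probability` helper are hypothetical, and the sampling model is an assumption for illustration, not the method used for the talk.

```python
import random

# Hypothetical observed profiles: one set of ATT&CK technique IDs per group.
# The assignments are invented for illustration only.
PROFILES = [
    {"T1566", "T1059", "T1486", "T1490"},
    {"T1078", "T1021", "T1486"},
    {"T1566", "T1486", "T1490"},
    {"T1078", "T1059", "T1486", "T1490"},
]


def estimate_probability(profiles, technique, n_attacks, at_least,
                         trials=20_000, seed=0):
    """Estimate P(technique appears in >= at_least of the next n_attacks attacks)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        # Simulate n_attacks by drawing observed profiles with replacement.
        seen = sum(technique in rng.choice(profiles) for _ in range(n_attacks))
        if seen >= at_least:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    p = estimate_probability(PROFILES, "T1490", n_attacks=3, at_least=1)
    print(f"Estimated P(T1490 in at least 1 of the next 3 attacks): {p:.3f}")
```

With this simple sampling model the answer could also be computed in closed form; simulation becomes more useful once the model includes weights per group, time trends, or dependencies between techniques.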