Mahout is an open source machine learning library from the Apache Software Foundation. It is written in Java, and is therefore platform independent, and it provides a framework, a collection of patterns, and ready-made components for testing and deploying new large-scale algorithms.
These slides aim to provide a deeper understanding of its architecture.
Size: 5.13 MB
Language: en
Added: Nov 12, 2018
Slides: 51 pages
Slide Content
MAHOUT
Dalla Palma Stefano
Open source Machine Learning Java library
APACHE
Part I: Overview
Search
Text mining
Information Retrieval
Collaborative filtering
Mahout Machine Learning main techniques and architecture overview
Recommender
[Diagram: recommender architecture. Labels: User, Recommender, Neighborhood, Correlation, Preference Inferrer, DataModel, Data Store, Preference, Item.]
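The components named on the recommender slide (DataModel, Correlation, Neighborhood, Recommender) can be illustrated with a minimal user-based collaborative filtering sketch. This is plain Java written for illustration, not Mahout's own API: the class `UserBasedSketch`, its method names, and the sample preferences are all assumptions, and Pearson correlation is assumed as the correlation measure.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Mahout code: user-based collaborative filtering.
public class UserBasedSketch {

    // DataModel stand-in: userId -> (itemId -> preference value). Sample data is made up.
    static Map<Integer, Map<Integer, Double>> sampleData() {
        Map<Integer, Map<Integer, Double>> data = new HashMap<>();
        setPreference(data, 1, 101, 5.0);
        setPreference(data, 1, 102, 1.0);
        setPreference(data, 2, 101, 4.0);
        setPreference(data, 2, 102, 2.0);
        setPreference(data, 2, 103, 5.0);
        setPreference(data, 3, 101, 1.0);
        setPreference(data, 3, 102, 5.0);
        setPreference(data, 3, 103, 1.0);
        return data;
    }

    static void setPreference(Map<Integer, Map<Integer, Double>> data,
                              int user, int item, double value) {
        data.computeIfAbsent(user, u -> new HashMap<>()).put(item, value);
    }

    // Correlation: Pearson correlation over the items both users have rated.
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        List<Integer> common = new ArrayList<>();
        for (Integer item : a.keySet()) {
            if (b.containsKey(item)) common.add(item);
        }
        int n = common.size();
        if (n < 2) return 0.0;
        double meanA = 0, meanB = 0;
        for (Integer item : common) { meanA += a.get(item); meanB += b.get(item); }
        meanA /= n;
        meanB /= n;
        double cov = 0, varA = 0, varB = 0;
        for (Integer item : common) {
            double da = a.get(item) - meanA, db = b.get(item) - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return (varA == 0 || varB == 0) ? 0.0 : cov / Math.sqrt(varA * varB);
    }

    // Neighborhood: the k users most correlated with the target user.
    static List<Integer> neighborhood(Map<Integer, Map<Integer, Double>> data, int user, int k) {
        List<Integer> others = new ArrayList<>(data.keySet());
        others.remove(Integer.valueOf(user));
        others.sort(Comparator.comparingDouble(
                (Integer o) -> -pearson(data.get(user), data.get(o))));
        return new ArrayList<>(others.subList(0, Math.min(k, others.size())));
    }

    // Recommender: similarity-weighted average of the neighbors' preferences
    // for items the target user has not rated yet.
    static Map<Integer, Double> recommend(Map<Integer, Map<Integer, Double>> data, int user, int k) {
        Map<Integer, Double> weighted = new HashMap<>();
        Map<Integer, Double> weights = new HashMap<>();
        for (Integer other : neighborhood(data, user, k)) {
            double sim = pearson(data.get(user), data.get(other));
            if (sim <= 0) continue; // ignore uncorrelated or opposite users
            for (Map.Entry<Integer, Double> e : data.get(other).entrySet()) {
                if (data.get(user).containsKey(e.getKey())) continue;
                weighted.merge(e.getKey(), sim * e.getValue(), Double::sum);
                weights.merge(e.getKey(), sim, Double::sum);
            }
        }
        Map<Integer, Double> estimates = new HashMap<>();
        for (Integer item : weighted.keySet()) {
            estimates.put(item, weighted.get(item) / weights.get(item));
        }
        return estimates;
    }

    public static void main(String[] args) {
        // User 1 agrees with user 2 and disagrees with user 3,
        // so item 103 is estimated from user 2's preference alone.
        System.out.println(recommend(sampleData(), 1, 2)); // prints {103=5.0}
    }
}
```

In this sketch the neighborhood is recomputed on every call, which is fine for a handful of users; a library aimed at large data sets would typically cache or precompute similarities.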
Classification (Naïve Bayes)
[Diagram: classification system. Training examples (predictors and target variables) feed the training algorithm, which produces a model. A copy of the model takes new examples (predictor variables only) and outputs the estimated target variable, that is, the decisions.]
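The training and prediction flow on the classification slide can be illustrated with a small multinomial Naïve Bayes text classifier. This is a hedged, self-contained sketch rather than Mahout's implementation: the class `NaiveBayesSketch`, whitespace tokenization, add-one (Laplace) smoothing, and the example sentences are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Mahout code: multinomial Naive Bayes over words.
public class NaiveBayesSketch {
    private final Map<String, Integer> docsPerLabel = new HashMap<>();
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> wordsPerLabel = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Training algorithm: one training example = predictors (words) + target variable (label).
    public void train(String label, String text) {
        docsPerLabel.merge(label, 1, Integer::sum);
        totalDocs++;
        for (String word : text.toLowerCase().split("\\s+")) {
            wordCounts.computeIfAbsent(label, l -> new HashMap<>()).merge(word, 1, Integer::sum);
            wordsPerLabel.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    // Model applied to a new example (predictors only): returns the estimated target variable,
    // i.e. the label with the highest log posterior, or null if nothing was trained.
    public String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerLabel.keySet()) {
            // log prior: fraction of training documents with this label
            double score = Math.log((double) docsPerLabel.get(label) / totalDocs);
            Map<String, Integer> counts = wordCounts.getOrDefault(label, Collections.emptyMap());
            int total = wordsPerLabel.getOrDefault(label, 0);
            for (String word : text.toLowerCase().split("\\s+")) {
                int count = counts.getOrDefault(word, 0);
                // add-one smoothing keeps unseen words from zeroing the probability
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch model = new NaiveBayesSketch();
        model.train("spam", "buy cheap pills now");
        model.train("spam", "cheap pills cheap offer");
        model.train("ham", "meeting agenda for monday");
        model.train("ham", "lunch on monday");
        System.out.println(model.classify("cheap pills"));    // prints spam
        System.out.println(model.classify("monday meeting")); // prints ham
    }
}
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.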
Clustering (k-means)
[Diagram: one k-means iteration as a MapReduce job. Map phase: the input is divided into Split 1 … Split n, processed by mappers M1 … Mn, which read the centroids file on the distributed cache. Map output = <centroid_id, data_point>. S = shuffle and sorting. Reduce phase: reducers R1 … Rk write the new centroids file on the distributed cache.]
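The map and reduce phases of the k-means slide can be simulated in a single process to show what one iteration computes. This is an illustrative sketch only: it uses no Hadoop or Mahout classes, the centroids are passed as an array instead of being read from a file on the distributed cache, and the class `KMeansIterationSketch`, its method names, and the sample points are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Hadoop or Mahout code: one k-means iteration in map/reduce style.
public class KMeansIterationSketch {

    // Map step for one data point: find the id of the nearest centroid
    // (squared Euclidean distance), so the mapper can emit <centroid_id, data_point>.
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // One full iteration: map, shuffle and sort, reduce.
    static double[][] iterate(double[][] points, double[][] centroids) {
        // Map phase plus shuffle and sort: group the points by centroid id.
        // The TreeMap keeps the keys sorted, as the shuffle step would.
        Map<Integer, List<double[]>> groups = new TreeMap<>();
        for (double[] p : points) {
            groups.computeIfAbsent(nearestCentroid(p, centroids), id -> new ArrayList<>()).add(p);
        }
        // Reduce phase: each new centroid is the mean of the points assigned to it.
        double[][] next = new double[centroids.length][];
        for (int c = 0; c < centroids.length; c++) {
            List<double[]> members = groups.get(c);
            if (members == null) {
                next[c] = centroids[c].clone(); // empty cluster: keep the old centroid
                continue;
            }
            double[] mean = new double[centroids[c].length];
            for (double[] p : members) {
                for (int d = 0; d < mean.length; d++) {
                    mean[d] += p[d] / members.size();
                }
            }
            next[c] = mean;
        }
        return next;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1, 3}, {9, 9}, {11, 9}};
        double[][] centroids = {{0, 0}, {10, 10}};
        double[][] updated = iterate(points, centroids);
        System.out.println(Arrays.deepToString(updated)); // prints [[1.0, 2.0], [10.0, 9.0]]
    }
}
```

A driver would call `iterate` repeatedly, feeding each result back in as the next centroids, until the centroids stop moving or an iteration limit is reached.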
Samsara
Apache Flink
Flink
Part II: Architecture
Part III: Conclusions
Most stable components
Most unstable components
Questions?
Dalla Palma Stefano
APACHE MAHOUT