Apache Mahout

PuneetGupta130 711 views 11 slides Mar 08, 2016

About This Presentation

An introduction to Apache Mahout, the algorithms it provides, and a comparison between Mahout and R.


Slide Content

Apache Mahout
By: Puneet Gupta, M.Tech (Future Studies and Planning)

Apache Mahout?
• A mahout is one who drives an elephant as its master.
• The name reflects the project's close association with Apache Hadoop, which uses an elephant as its logo.
• Apache Mahout started as a sub-project of Apache Lucene in 2008.
• In 2010, Mahout became a top-level Apache project.

Apache Mahout?
• Apache Mahout is an open source project.
• Mahout is a Java library implementing machine learning techniques:
  – Recommendation
  – Clustering
  – Classification

What can we do?
• Currently Mahout mainly supports three use cases:
  – Recommendation: takes users' behavior and from that tries to find items users might like.
  – Clustering: takes, for example, text documents and groups them into sets of topically related documents.
  – Classification: learns from existing categorized documents what documents of a specific category look like, and is able to assign unlabeled documents to the (hopefully) correct category.

Why Mahout?
• Mahout is not the only machine learning framework:
  – Weka
  – R
• Why do we prefer Mahout?
  – Apache License
  – Good community
  – Good documentation
  – Scalable
    • Based on Hadoop (not mandatory!)

Big Data
Why do we need a scalable framework?

Algorithms
• Recommendation
  – User-based Collaborative Filtering
  – Item-based Collaborative Filtering
  – Slope One Recommenders
  – Singular Value Decomposition
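To make the first entry in this list concrete, here is a minimal plain-Java sketch of user-based collaborative filtering. It does not use Mahout's API. The class name `UserBasedCfSketch`, the sample users and the ratings are all made up for illustration, and the choice of cosine similarity with a similarity-weighted average is an assumption rather than something the slides specify.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not Mahout code. */
final class UserBasedCfSketch {

    /** Euclidean length of a user's rating vector. */
    static double norm(Map<String, Double> ratings) {
        double sum = 0.0;
        for (double r : ratings.values()) {
            sum += r * r;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity between two users; items one user has not rated count as 0. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    /**
     * Predicts the rating of `item` for `user` as the similarity-weighted
     * average of the ratings given by other users who rated that item.
     * Returns NaN when no similar user has rated the item.
     */
    static double predict(Map<String, Map<String, Double>> ratings, String user, String item) {
        Map<String, Double> mine = ratings.get(user);
        double weighted = 0.0;
        double totalSimilarity = 0.0;
        for (Map.Entry<String, Map<String, Double>> e : ratings.entrySet()) {
            if (e.getKey().equals(user)) {
                continue; // skip the user we are predicting for
            }
            Double theirRating = e.getValue().get(item);
            if (theirRating == null) {
                continue; // this neighbour has not rated the item
            }
            double sim = cosine(mine, e.getValue());
            if (sim <= 0.0) {
                continue; // ignore dissimilar users
            }
            weighted += sim * theirRating;
            totalSimilarity += sim;
        }
        return totalSimilarity == 0.0 ? Double.NaN : weighted / totalSimilarity;
    }

    public static void main(String[] args) {
        // Hypothetical ratings: user -> (item -> rating).
        Map<String, Map<String, Double>> ratings = new HashMap<>();
        ratings.put("alice", Map.of("book", 5.0, "film", 1.0));
        ratings.put("bob", Map.of("book", 5.0, "film", 1.0, "game", 4.0));
        ratings.put("carol", Map.of("book", 1.0, "film", 5.0, "game", 2.0));
        System.out.println("Predicted rating for alice/game: " + predict(ratings, "alice", "game"));
    }
}
```

In this sample data, bob's tastes match alice's more closely than carol's do, so the prediction is pulled towards bob's rating of the game. A scalable library would additionally restrict the computation to a neighbourhood of the most similar users instead of scanning everyone.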

Algorithms
• Clustering
  – Canopy
  – K-Means
  – Fuzzy K-Means
  – Latent Dirichlet Allocation (LDA)
  – MinHash Clustering
  – Hierarchical Clustering
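As an illustration of K-Means from this list, the following plain-Java sketch runs the usual assign-then-update loop (Lloyd's algorithm) on a small in-memory data set. It is not Mahout's implementation. The class name `KMeansSketch`, the caller-supplied starting centroids and the iteration cap are assumptions chosen to keep the example short.

```java
import java.util.Arrays;

/** Illustrative sketch only; not Mahout code. */
final class KMeansSketch {

    /** Returns the index of the centroid closest to the point (squared Euclidean distance). */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double distance = 0.0;
            for (int j = 0; j < point.length; j++) {
                double diff = point[j] - centroids[c][j];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    /**
     * Alternates between assigning points to their nearest centroid and moving
     * each centroid to the mean of its points. The centroids array is updated
     * in place; the returned array holds the cluster index of every point.
     */
    static int[] cluster(double[][] points, double[][] centroids, int maxIterations) {
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);
        int dim = points[0].length;
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int c = nearest(points[i], centroids);
                if (c != assignment[i]) {
                    assignment[i] = c;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step.
            double[][] sums = new double[centroids.length][dim];
            int[] counts = new int[centroids.length];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int j = 0; j < dim; j++) {
                    sums[assignment[i]][j] += points[i][j];
                }
            }
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) {
                    continue; // leave the centroid of an empty cluster where it is
                }
                for (int j = 0; j < dim; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical 2-D points forming two obvious groups.
        double[][] points = {{1.0, 1.0}, {1.5, 2.0}, {8.0, 8.0}, {9.0, 9.5}};
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        System.out.println(Arrays.toString(cluster(points, centroids, 10)));
    }
}
```

The result depends on the starting centroids, which is one reason Canopy clustering is sometimes run first to pick them.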

Algorithms
• Classification
  – Logistic Regression
  – Bayes
  – Random Forests
  – Hidden Markov Models
  – Support Vector Machines
  – Neural Networks
  – Restricted Boltzmann Machines
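For the Bayes entry, a rough idea of how a classifier learns from categorized documents can be given with a small multinomial naive Bayes sketch in plain Java. This is not Mahout's implementation. The class name `NaiveBayesSketch`, the whitespace tokenizer, the add-one (Laplace) smoothing and the training sentences are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; not Mahout code. */
final class NaiveBayesSketch {
    private final Map<String, Integer> docsPerLabel = new HashMap<>();
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> wordsPerLabel = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Lower-cases the text and splits it on whitespace. */
    private static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\s+");
    }

    /** Records one labelled document by counting its words under the label. */
    void train(String label, String text) {
        totalDocs++;
        docsPerLabel.merge(label, 1, Integer::sum);
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            counts.merge(word, 1, Integer::sum);
            wordsPerLabel.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    /**
     * Returns the label with the highest log-probability for the text,
     * or null if nothing has been trained yet.
     */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerLabel.keySet()) {
            // Log prior: fraction of training documents carrying this label.
            double score = Math.log(docsPerLabel.get(label) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = wordsPerLabel.getOrDefault(label, 0);
            for (String word : tokenize(text)) {
                int count = counts.getOrDefault(word, 0);
                // Add-one smoothing so unseen words do not zero out the score.
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical training documents.
        NaiveBayesSketch model = new NaiveBayesSketch();
        model.train("sports", "the team won the match");
        model.train("sports", "great goal in the match");
        model.train("tech", "new phone with fast processor");
        model.train("tech", "the laptop has a fast chip");
        System.out.println(model.classify("fast new laptop"));
    }
}
```

Working in log space avoids numeric underflow when documents contain many words, which matters once the inputs are longer than these toy sentences.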

Mahout vs. R
• Mahout is a Java library, whereas R is an expression language with a very simple syntax.
• Mahout is used for big data, while R is used for prototyping on smaller data sets.