Apache Mahout - Introduction
Jackson Oliveira @cyber_jso
Main Problem solving Areas - Collaborative Filtering
Algorithm Single machine MR Spark
Item based
User Based
Matrix Factorization
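To make the item-based approach concrete, here is a minimal plain-Java sketch of the idea: compute cosine similarity between item columns of a user-by-item rating matrix, then score an unrated item as a similarity-weighted average of the user's own ratings. It does not use Mahout's API; the class name `ItemBasedSketch`, the method names, and the convention that a rating of 0 means "not rated" are all assumptions made for illustration.

```java
public class ItemBasedSketch {

    // Cosine similarity between two item columns of a user-by-item matrix.
    // Assumption of this sketch: a rating of 0 means "not rated".
    static double cosine(double[][] ratings, int itemA, int itemB) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (double[] userRow : ratings) {
            dot += userRow[itemA] * userRow[itemB];
            normA += userRow[itemA] * userRow[itemA];
            normB += userRow[itemB] * userRow[itemB];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an item nobody rated is similar to nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Predicted score of `item` for `user`: the average of the user's
    // existing ratings, weighted by each rated item's similarity to `item`.
    static double predict(double[][] ratings, int user, int item) {
        double weighted = 0.0, totalSimilarity = 0.0;
        for (int other = 0; other < ratings[user].length; other++) {
            double rating = ratings[user][other];
            if (other == item || rating == 0.0) {
                continue; // skip the target item and unrated items
            }
            double similarity = cosine(ratings, item, other);
            weighted += similarity * rating;
            totalSimilarity += Math.abs(similarity);
        }
        return totalSimilarity == 0.0 ? 0.0 : weighted / totalSimilarity;
    }

    public static void main(String[] args) {
        // Rows are users, columns are items (hypothetical toy data).
        double[][] ratings = {
            {5, 5, 0},
            {3, 3, 0},
            {0, 0, 4},
        };
        System.out.println("similarity(item0, item1) = " + cosine(ratings, 0, 1));
        System.out.println("similarity(item0, item2) = " + cosine(ratings, 0, 2));
    }
}
```

A user-based variant would follow the same pattern with rows and columns swapped: compare user rows instead of item columns.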
Main Problem solving Areas - Clustering
Algorithm Single machine MR Spark
K-Means
Fuzzy K-Means
Streaming K-Means
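As a rough illustration of what K-Means does, the following plain-Java sketch runs Lloyd's algorithm: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points, and repeat. It is not Mahout's implementation; the class name `KMeansSketch`, the fixed iteration count, and the caller-supplied starting centroids are assumptions chosen to keep the example deterministic.

```java
public class KMeansSketch {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Index of the centroid closest to the given point.
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Runs a fixed number of Lloyd iterations from the given starting
    // centroids and returns the final centroids. The inputs are not modified.
    static double[][] fit(double[][] points, double[][] initial, int iterations) {
        int k = initial.length;
        int dim = initial[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = initial[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            // Assignment step: accumulate each point into its nearest cluster.
            for (double[] point : points) {
                int c = nearest(point, centroids);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += point[d];
                }
            }
            // Update step: move each non-empty cluster's centroid to its mean.
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: two well-separated groups in 2D.
        double[][] points = {{1, 1}, {1, 2}, {9, 9}, {10, 9}};
        double[][] start = {{0, 0}, {10, 10}};
        for (double[] centroid : fit(points, start, 5)) {
            System.out.println(java.util.Arrays.toString(centroid));
        }
    }
}
```

Fuzzy K-Means differs in that each point belongs to every cluster with a degree of membership rather than to exactly one, so the hard assignment step above becomes a weighted one.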
Main Problem solving Areas - Classification
Algorithm Single machine MR Spark
Naive Bayes / Complementary Naive Bayes
Random Forest
Multilayer Perceptron
Pros and cons
● Several algorithm implementations ready to use
● Well-documented Java API
● More robust than Weka
● Higher startup overhead than Spark MLlib
● API targeted at programmers rather than data scientists
● Extensible API