Ahmed Kareem

Cloud computing: Recommender data, Mahout
Student: Ahmed Kareem Oleiwi, 2015220080


An algorithm library for scalable machine learning on Hadoop.

Apache Mahout is a library of scalable machine-learning algorithms, implemented on top of Apache Hadoop and using the MapReduce paradigm. Machine learning is a discipline of artificial intelligence focused on enabling machines to learn without being explicitly programmed, and it is commonly used to improve future performance based on previous outcomes.

Once big data is stored on the Hadoop Distributed File System (HDFS), Mahout provides the data science tools to automatically find meaningful patterns in that data.

What is Mahout?


The problem and solution

I successfully installed a Hadoop cluster with 3 machines, and the cluster is running fine. I then installed Mahout on the main name node for "testing purposes", followed the installation instructions, and set JAVA_HOME. However, when I try to run classify-20newsgroups.sh, it downloads the dataset, but after that I get the following error:

As a first step toward a solution, I revised .bashrc and confirmed that JAVA_HOME is set correctly.

Note that .bashrc is only read by a non-login shell; a login shell reads .bash_profile instead.

There are several other ways to set JAVA_HOME: 1) set it in .bashrc from the terminal
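Setting JAVA_HOME from the terminal can be sketched as follows. This is a minimal sketch, not the exact commands used in the report: the helper name add_java_home and the JDK path in the usage note are hypothetical, so substitute the directory where your own JDK is installed. The helper appends the export line to a startup file only if that line is not already present, so it can safely be run against both .bashrc and .bash_profile.

```shell
# Hypothetical helper: append an "export JAVA_HOME=..." line to a shell
# startup file, unless that exact line is already there.
#   $1 = startup file (for example ~/.bashrc or ~/.bash_profile)
#   $2 = JDK installation directory (an assumption; use your own path)
add_java_home() {
    rc_file="$1"
    jdk_dir="$2"
    line="export JAVA_HOME=\"$jdk_dir\""
    # grep -x matches whole lines only, -F treats the pattern as plain text
    if ! grep -qxF -- "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

For example, assuming the JDK lives in /usr/lib/jvm/java-8-openjdk-amd64 (a placeholder path), you could run add_java_home ~/.bashrc /usr/lib/jvm/java-8-openjdk-amd64 and the same for ~/.bash_profile, then open a new shell and check the result with echo "$JAVA_HOME". Writing the line to both files covers non-login and login shells alike.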
