NNMF IN GOOGLE TENSORFLOW AND APACHE SPARK: A COMPARISON STUDY

dc.contributor.advisor: Cho, Hyuk
dc.creator: Li, Qizhao
dc.creator.orcid: 0000-0003-4124-3503
dc.date.accessioned: 2019-08-15T14:59:45Z
dc.date.available: 2019-08-15T14:59:45Z
dc.date.created: 2019-08
dc.date.issued: 2019-07-12
dc.date.submitted: August 2019
dc.date.updated: 2019-08-15T14:59:45Z
dc.description.abstract: Data mining is no longer a new term; it is already pervasive in all aspects of our lives. New computing platforms for specific uses are proposed continuously. Awareness of the characteristics and capacity of existing and newly proposed platforms is therefore critical for researchers and practitioners who want to use existing algorithms, and develop new ones, on recent platforms. This thesis implements and compares a set of popular matrix factorization algorithms on recent computing platforms. Specifically, three matrix factorization algorithms, classic Non-negative Matrix Factorization (NNMF), CUR Matrix Decomposition, and Compact Matrix Decomposition (CMD), are implemented on two computing platforms, Apache Spark and Google TensorFlow. Because the rank-k approximation obtained with Singular Value Decomposition (SVD) is the optimal baseline, both the CUR and CMD approximations are less accurate than the SVD approximation. The experimental results show that CMD in TensorFlow achieves better matrix approximation than the other two non-negative matrix factorization algorithms (NNMF and CUR) in the same experimental setup. Also, as the number of rows or columns selected for CUR and CMD increases, the approximation error decreases.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/20.500.11875/2688
dc.language.iso: en
dc.subject: TensorFlow
dc.subject: Apache Spark
dc.subject: Non-Negative Matrix Factorization
dc.subject: CUR Matrix Decomposition
dc.subject: Compact Matrix Decomposition
dc.subject: Approximation Performance
dc.title: NNMF IN GOOGLE TENSORFLOW AND APACHE SPARK: A COMPARISON STUDY
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Computer Science
thesis.degree.grantor: Sam Houston State University
thesis.degree.level: Masters
thesis.degree.name: Master of Science

Files

Original bundle

Name: LI-THESIS-2019.pdf
Size: 600.16 KB
Format: Adobe Portable Document Format

License bundle

Name: LICENSE.txt
Size: 1.85 KB
Format: Plain Text

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text