SPHERICAL AND STOCHASTIC CO-CLUSTERING ALGORITHMS

dc.contributor.advisor: Cho, Hyuk
dc.creator: Sariboz, Emrah
dc.creator.orcid: 0000-0002-5250-9779
dc.date.accessioned: 2020-05-18T13:26:24Z
dc.date.available: 2020-05-18T13:26:24Z
dc.date.created: 2019-05
dc.date.issued: 2019-04-17
dc.date.submitted: May 2019
dc.date.updated: 2020-05-18T13:26:25Z
dc.description.abstract: Clustering is a central area of data mining and machine learning. Because clustering algorithms are needed in so many settings, they have many real-life applications, ranging from bioinformatics to personalized information delivery. The feature characteristics of newly generated data call for new approaches to exploring its nature. General single-sided (i.e., one-way) clustering algorithms, such as the K-means algorithm, cluster either the rows or the columns of the data matrix. A Coclustering algorithm clusters both the instances and the features of the data matrix simultaneously and is therefore better suited to discovering patterns hidden in both the row and column dimensions. Most existing Coclustering algorithms include separate, implicit clustering steps for each dimension. In this study, we developed two novel Coclustering algorithms, Spherical Coclustering and Stochastic Coclustering, which build on the existing K-means framework; a specific data construction and two specific data normalizations are included as pre-processing steps. The Coclustering framework resembles an existing Coclustering algorithm, Spectral Coclustering, in that it first applies feature selection using singular value decomposition and then uses one-way clustering to achieve Coclustering. Furthermore, we partially address several well-known practical problems in clustering algorithms: cluster initialization, the degeneracy problem, local minima, and a NaN (not-a-number) condition in the Kullback-Leibler divergence. The correctness and efficiency of the two algorithms were validated on publicly available benchmark data in terms of the monotonic change of the objective function value and clustering accuracy. Specifically, we compared the accuracy of the Euclidean K-means, Stochastic K-means, Spherical K-means, Stochastic Coclustering, and Spherical Coclustering algorithms.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/20.500.11875/2780
dc.language.iso: en
dc.subject: Coclustering algorithm
dc.subject: K-means algorithm
dc.subject: bi-normalization
dc.subject: Stochastic Coclustering
dc.subject: Spherical Coclustering
dc.subject: Sinkhorn-Knopp Normalization
dc.subject: Kullback-Leibler Divergence
dc.title: SPHERICAL AND STOCHASTIC CO-CLUSTERING ALGORITHMS
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Computer Science
thesis.degree.grantor: Sam Houston State University
thesis.degree.level: Masters
thesis.degree.name: Master of Science
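
The abstract mentions a NaN (not-a-number) condition in the Kullback-Leibler divergence, and the subject terms list Sinkhorn-Knopp normalization. The sketch below is not taken from the thesis; it is a minimal illustration of the two textbook ideas, under stated assumptions. The function names `kl_divergence` and `sinkhorn_knopp`, the parameters `n_iter` and `tol`, and the restriction to a square, strictly positive matrix are all assumptions made here for illustration. The NaN arises because a naive `p * log(p / q)` evaluates `0 * log(0)` when an entry of `p` is zero; the usual convention treats that term as zero.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions, using the convention 0 * log(0 / q) = 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = p > 0                       # entries with p_i == 0 contribute nothing
    if np.any(q[support] == 0):           # p_i > 0 while q_i == 0: divergence is infinite
        return np.inf
    return float(np.sum(p[support] * np.log(p[support] / q[support])))


def sinkhorn_knopp(a, n_iter=500, tol=1e-10):
    """Alternately normalize rows and columns of a square, strictly positive matrix.

    Illustrative assumption: the input is square with all entries > 0, in which
    case the iteration converges to a doubly stochastic matrix.
    """
    a = np.asarray(a, dtype=float).copy()
    for _ in range(n_iter):
        a /= a.sum(axis=1, keepdims=True)     # make every row sum to 1
        a /= a.sum(axis=0, keepdims=True)     # make every column sum to 1
        # Columns are exact after the last step; stop once rows also agree.
        if np.allclose(a.sum(axis=1), 1.0, rtol=0.0, atol=tol):
            break
    return a
```

For example, `kl_divergence([0.5, 0.5, 0.0], [0.25, 0.25, 0.5])` evaluates to log 2 rather than NaN, because the zero entry of the first distribution is skipped instead of being passed to the logarithm. How the thesis itself handles the NaN condition, or applies the normalization to rectangular data matrices, is not stated in this record.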

Files

Original bundle

Name: SARIBOZ-THESIS-2019.pdf
Size: 735.49 KB
Format: Adobe Portable Document Format
