SPHERICAL AND STOCHASTIC CO-CLUSTERING ALGORITHMS

Sariboz, Emrah

dc.contributor.advisor	Cho, Hyuk
dc.creator	Sariboz, Emrah
dc.creator.orcid	0000-0002-5250-9779
dc.date.accessioned	2020-05-18T13:26:24Z
dc.date.available	2020-05-18T13:26:24Z
dc.date.created	2019-05
dc.date.issued	2019-04-17
dc.date.submitted	May 2019
dc.date.updated	2020-05-18T13:26:25Z
dc.description.abstract	Clustering is a central area of data mining and machine learning. Because clustering algorithms are needed in so many settings, they are applied to real-life problems ranging from bioinformatics to personalized information delivery. The feature characteristics of newly generated data call for new approaches to exploring its nature. General single-sided (i.e., one-way) clustering algorithms, such as the K-means algorithm, cluster either the rows or the columns of the data matrix. A coclustering algorithm clusters both the instances and the features of the data matrix simultaneously and is therefore better suited to discovering patterns hidden in both the row and column dimensions. Most existing coclustering algorithms implicitly include a separate clustering step for each dimension. In this study, we developed two novel coclustering algorithms, named Spherical Coclustering and Stochastic Coclustering, which utilize the existing K-means framework; in addition, a specific data construction and two specific data normalizations are included as pre-processing steps. The coclustering framework resembles one existing coclustering algorithm, Spectral Coclustering, in that it first applies feature selection using singular value decomposition and then utilizes one-way clustering to achieve coclustering. Furthermore, we partially address several well-known practical problems in clustering algorithms, including cluster initialization, the degeneracy problem, local minima, and a NaN (not-a-number) condition in the Kullback-Leibler divergence. The correctness and efficiency of the two algorithms were validated on publicly available benchmark data in terms of the monotonicity of the change in objective function value and clustering accuracy. Specifically, we compared the accuracy of the Euclidean K-means, Stochastic K-means, Spherical K-means, Stochastic Coclustering, and Spherical Coclustering algorithms.
dc.format.mimetype	application/pdf
dc.identifier.uri	https://hdl.handle.net/20.500.11875/2780
dc.language.iso	en
dc.subject	Coclustering algorithm
dc.subject	K-means algorithm
dc.subject	bi-normalization
dc.subject	Stochastic Coclustering
dc.subject	Spherical Coclustering
dc.subject	Sinkhorn-Knopp Normalization
dc.subject	Kullback-Leibler Divergence
dc.title	SPHERICAL AND STOCHASTIC CO-CLUSTERING ALGORITHMS
dc.type	Thesis
dc.type.material	text
thesis.degree.department	Computer Science
thesis.degree.grantor	Sam Houston State University
thesis.degree.level	Masters
thesis.degree.name	Master of Science
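
The abstract and subject terms name Sinkhorn-Knopp normalization and a NaN condition in the Kullback-Leibler divergence without showing either. The following is a minimal Python sketch for illustration only; it is not taken from the thesis. It assumes a square, strictly positive input matrix for the normalization, and the function names `sinkhorn_knopp` and `kl_divergence`, the iteration limit, and the tolerance are all hypothetical choices. The thesis's own data construction and normalizations may differ.

```python
import numpy as np


def sinkhorn_knopp(A, n_iter=1000, tol=1e-9):
    """Alternately rescale rows and columns of a square, strictly positive
    matrix until it is approximately doubly stochastic.

    Assumption (not from the thesis): square input with all entries > 0.
    """
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        raise ValueError("this sketch expects a square matrix")
    if np.any(A <= 0):
        raise ValueError("this sketch expects strictly positive entries")
    P = A.copy()
    for _ in range(n_iter):
        P /= P.sum(axis=1, keepdims=True)  # make every row sum to 1
        P /= P.sum(axis=0, keepdims=True)  # make every column sum to 1
        # Columns sum to 1 after the step above; stop once rows do as well.
        if np.abs(P.sum(axis=1) - 1.0).max() < tol:
            break
    return P


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions, guarded against NaN.

    Terms with p[i] == 0 contribute 0 (the usual 0 * log 0 = 0 convention),
    so log(0) is never evaluated. If p[i] > 0 where q[i] == 0, the
    divergence is infinite, and inf is returned instead of NaN.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = p > 0
    if np.any(q[support] == 0):
        return float("inf")
    return float(np.sum(p[support] * np.log(p[support] / q[support])))


if __name__ == "__main__":
    P = sinkhorn_knopp([[1.0, 2.0], [3.0, 4.0]])
    print(P.sum(axis=0), P.sum(axis=1))  # both close to [1, 1]
    print(kl_divergence([0.5, 0.5, 0.0], [0.25, 0.25, 0.5]))  # finite
```

Masking on the support of `p` is one common way to avoid the `0 * log(0)` case that otherwise evaluates to NaN in floating point; whether the thesis handles the condition this way is not stated in the abstract.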

Files

Original bundle

Name: SARIBOZ-THESIS-2019.pdf
Size: 735.49 KB
Format: Adobe Portable Document Format

License bundle

Name: LICENSE.txt
Size: 1.85 KB
Format: Plain Text

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text

Collections

Electronic Theses and Dissertations