College of Science and Technology
Permanent URI for this community: https://hdl.handle.net/20.500.11875/4538
Browsing College of Science and Technology by Title
Item A Distribution-free Convolution Model for Background Correction of Oligonucleotide Microarray Data (BMC Genomics, 2009-07-07) Chen, Zhongxue; McGee, Monnie; Liu, Qingzhong; Kong, Megan; Deng, Youping; Scheuermann, Richard H.
Introduction: Affymetrix GeneChip® high-density oligonucleotide arrays are widely used in biological and medical research because of production reproducibility, which facilitates the comparison of results between experiment runs. In order to obtain high-level classification and cluster analysis that can be trusted, it is important to perform various pre-processing steps on the probe-level data to control for variability in sample processing and array hybridization. Many proposed preprocessing methods are parametric, in that they assume that the background noise generated by microarray data is a random sample from a statistical distribution, typically a normal distribution. The quality of the final results depends on the validity of such assumptions. Results: We propose a Distribution Free Convolution Model (DFCM) to circumvent observed deficiencies in meeting and validating distribution assumptions of parametric methods. Knowledge of array structure and the biological function of the probes indicate that the intensities of mismatched (MM) probes that correspond to the smallest perfect match (PM) intensities can be used to estimate the background noise. Specifically, we obtain the smallest q2 percent of the MM intensities that are associated with the lowest q1 percent PM intensities, and use these intensities to estimate background. Conclusion: Using the Affymetrix Latin Square spike-in experiments, we show that the background noise generated by microarray experiments typically is not well modeled by a single overall normal distribution. We further show that the signal is not exponentially distributed, as is also commonly assumed.
Therefore, DFCM has better sensitivity and specificity, as measured by ROC curves and area under the curve (AUC) than MAS 5.0, RMA, RMA with no background correction (RMA-noBG), GCRMA, PLIER, and dChip (MBEI) for preprocessing of Affymetrix microarray data. These results hold for two spike-in data sets and one real data set that were analyzed. Comparisons with other methods on two spike-in data sets and one real data set show that our nonparametric methods are a superior alternative for background correction of Affymetrix data.
Item A gene selection method for GeneChip array data with small sample sizes (2010-07) Chen, Zhongxue; Liu, Qingzhong; McGee, Monnie; Kong, Megan; Huang, Xudong; Deng, Youping; Scheuermann, Richard H.
In microarray experiments with small sample sizes, it is a challenge to estimate p-values accurately and decide cutoff p-values for gene selection appropriately. Although permutation-based methods have proved to have greater sensitivity and specificity than the regular t-test, their p-values are highly discrete due to the limited number of permutations available in very small sample sizes. Furthermore, estimated permutation-based p-values for true nulls are highly correlated and not uniformly distributed between zero and one, making it difficult to use current false discovery rate (FDR)-controlling methods. Results: We propose a model-based information sharing method (MBIS) that, after an appropriate data transformation, utilizes information shared among genes. We use a normal distribution to model the mean differences of true nulls across two experimental conditions. The parameters of the model are then estimated using all data in hand. Based on this model, p-values, which are uniformly distributed from true nulls, are calculated. Then, since FDR-controlling methods are generally not well suited to microarray data with very small sample sizes, we select genes for a given cutoff p-value and then estimate the false discovery rate.
Conclusion: Simulation studies and analysis using real microarray data show that the proposed method, MBIS, is more powerful and reliable than current methods. It has wide application to a variety of situations.
Item A Logic Approach to Granular computing (International Journal of Cognitive Informatics and Natural Intelligence, 2008-04) Zhou, Bing; Yiyu, Yao
Granular computing is an emerging field of study that attempts to formalize and explore methods and heuristics of human problem solving with multiple levels of granularity and abstraction. A fundamental issue of granular computing is the representation and utilization of granular structures. The main objective of this article is to examine a logic approach to address this issue. Following the classical interpretation of a concept as a pair of intension and extension, we interpret a granule as a pair of a set of objects and a logic formula describing the granule. The building blocks of granular structures are basic granules representing an elementary concept or a piece of knowledge. They are treated as atomic formulas of a logic language. Different types of granular structures can be constructed by using logic connectives. Within this logic framework, we show that rough set analysis (RSA) and formal concept analysis (FCA) can be interpreted uniformly. The two theories use multilevel granular structures but differ in their choices of definable granules and granular structures.
Item A new statistical approach to combining p-values using gamma distribution and its application to genome-wide association study (BMC Bioinformatics, 2014-12-16) Liu, Qingzhong; Chen, Zhongxue; Yang, William; Yang, Jack Y.; Li, Jing; Yang, Mary Qu
Background: Combining information from different studies is an important and useful practice in bioinformatics, including genome-wide association study, rare variant data analysis and other set-based analyses. Many statistical methods have been proposed to combine p-values from independent studies.
However, it is known that there is no uniformly most powerful test under all conditions; therefore, finding a powerful test in specific situations is important and desirable. Results: In this paper, we propose a new statistical approach to combining p-values based on gamma distribution, which uses the inverse of the p-value as the shape parameter in the gamma distribution. Conclusions: Simulation study and real data application demonstrate that the proposed method has good performance under some situations.
Item Android System Partition to Traffic Data? (International Journal of Knowledge Engineering, 2017-12) Zhou, Bing; Liu, Qingzhong; Byrd, Brittany
The familiarity and prevalence of mobile devices inflates their use as instruments of crime. Law enforcement personnel and mobile forensics investigators are constantly battling to gain the upper hand at developing a standardized system able to comprehensively identify and resolve the vulnerabilities present within the mobile device platform. The Android mobile platform can be perceived as an antagonist to this objective, as its open nature provides attackers direct insight into the internalization and security features of the most popular platform presently in the consumer market. This paper identifies and demonstrates the system partition in an Android smartphone as a viable attack vector for covert data trafficking. An implementation strategy (comprised of four experimental phases) is developed to exploit the internal memory of a non-activated rooted Android HTC Desire 510 4G smartphone. A set of mobile forensics tools, AccessData Mobile Phone Examiner Plus (MPE+ v5.5.6), Oxygen Forensic Suite 2015 Standard, and Google Android Debug Bridge (adb), were used for the extraction and analysis process.
The data analysis found the proposed approach to be a persistent and minimally detectable method to exchange data.
Item Article Showcases Pitcairn Tapa with Women of the Bounty & The Art of Pitcairn (Pitcairn Log, 2023) Albert, Donald Patrick
Authors Donald Patrick Albert and Matthew Purifoy encourage PISG members to download and print (free) their study titled “Repositioning Pitcairn’s Tapa: Detecting the Voices of the Forgotten Women of Bounty.”
Item Assessment of gene order computing methods for Alzheimer’s disease (BMC Medical Genomics, 2013-01-23) Liu, Qingzhong; Hu, Benqiong; Pang, Chaoyang; Wang, Shipend; Chen, Zhongxue; Vanderburg, Charles R; Rogers, Jack T.; Deng, Youping; Huang, Xudong; Jiang, Gang
Background: Computational genomics of Alzheimer disease (AD), the most common form of senile dementia, is a nascent field in AD research. The field includes AD gene clustering by computing gene order which generates higher quality gene clustering patterns than most other clustering methods. However, there are few available gene order computing methods such as Genetic Algorithm (GA) and Ant Colony Optimization (ACO). Further, their performance in gene order computation using AD microarray data is not known. We thus set forth to evaluate the performances of current gene order computing methods with different distance formulas, and to identify additional features associated with gene order computation. Methods: Using different distance formulas (Pearson distance, Euclidean distance, and the squared Euclidean distance) and other conditions, gene orders were calculated by ACO and GA (including standard GA and improved GA) methods, respectively. The qualities of the gene orders were compared, and new features from the calculated gene orders were identified. Results: Compared to the GA methods tested in this study, ACO fits the AD microarray data the best when calculating gene order.
In addition, the following features were revealed: different distance formulas generated a different quality of gene order, and the commonly used Pearson distance was not the best distance formula when used with both GA and ACO methods for AD microarray data. Conclusion: Compared with Pearson distance and Euclidean distance, the squared Euclidean distance generated the best quality gene order computed by GA and ACO methods.
Item Attack Modeling and Mitigation Strategies for Risk-Based Analysis of Networked Medical Devices (Proceedings of the 53rd Hawaii International Conference on System Sciences, 2020-01) Hodges, Bronwyn J.; McDonald, J. Todd; Glisson, William Bradley; Jacobs, Michael; Van Devender, Maureen; Pardue, J. Harold
The escalating integration of network-enabled medical devices raises concerns for both practitioners and academics in terms of introducing new vulnerabilities and attack vectors. This prompts the idea that combining medical device data, security vulnerability enumerations, and attack-modeling data into a single database could enable security analysts to proactively identify potential security weaknesses in medical devices and formulate appropriate mitigation and remediation plans. This study introduces a novel extension to a relational database risk assessment framework by using the open-source tool OVAL to capture device states and compare them to security advisories that warn of threats and vulnerabilities, and where threats and vulnerabilities exist provide mitigation recommendations. The contribution of this research is a proof of concept evaluation that demonstrates the integration of OVAL and CAPEC attack patterns for analysis using a database-driven risk assessment framework.
Item Attack-Graph Threat Modeling Assessment of Ambulatory Medical Devices (Proceedings of the 50th Hawaii International Conference on System Sciences, 2017-01) Luckett, Patrick; McDonald, J. Todd; Glisson, William Bradley
The continued integration of technology into all aspects of society stresses the need to identify and understand the risk associated with assimilating new technologies. This necessity is heightened when technology is used for medical purposes like ambulatory devices that monitor a patient’s vital signs. This integration creates environments that are conducive to malicious activities. The potential impact presents new challenges for the medical community. Hence, this research presents attack graph modeling as a viable solution to identifying vulnerabilities, assessing risk, and forming mitigation strategies to defend ambulatory medical devices from attackers. Common and frequent vulnerabilities and attack strategies related to the various aspects of ambulatory devices, including Bluetooth enabled sensors and Android applications are identified in the literature. Based on this analysis, this research presents an attack graph modeling example on a theoretical device that highlights vulnerabilities and mitigation strategies to consider when designing ambulatory devices with similar components.
Item A Bleeding Digital Heart: Identifying Residual Data Generation from Smartphone Applications Interacting with Medical Devices (Proceedings of the 52nd Hawaii International Conference on System Sciences, 2019-01) Grispos, George; Glisson, William Bradley; Cooper, Peter
The integration of medical devices in everyday life prompts the idea that these devices will increasingly have evidential value in civil and criminal proceedings. However, the investigation of these devices presents new challenges for the digital forensics community. Previous research has shown that mobile devices provide investigators with a wealth of information. Hence, mobile devices that are used within medical environments potentially provide an avenue for investigating and analyzing digital evidence from such devices. The research contribution of this paper is twofold.
First, it provides an empirical analysis of the viability of using information from smartphone applications developed to complement a medical device, as digital evidence. Second, it includes documentation on the artifacts that are potentially useful in a digital forensics investigation of smartphone applications that interact with medical devices.
Item The Bounty's Primogeniture and the Thursday-Friday Conundrum (Athens Institute for Education and Research (Athens Journal of Humanities & Arts), 2020-04) Albert, Donald Patrick
This is a biography of an obscure individual born of the ashes of the H.M.A.S. Bounty on the remote, inaccessible, and uninhabited Pitcairn Island in 1790. Thursday October Christian is best known to amateur and professional historians, philatelists, and others interested in the romance and adventure of the South Seas. He was eighteen years old when he first had contact with the outside world with the arrival of the American sealer Mayhew Folger of the Topaz in 1808. In the forty years of his life he would meet, greet, and otherwise interact with sealers, whalers, naval officers, traders, and others calling on Pitcairn.
This article synthesizes these disparate encounters while exploring a name change conundrum revolving around the protagonist. Thursday October Christian was an ordinary person whose life story now lingers in disparate reports, notices, and accounts of archived and otherwise rare documents.
Item The Bounty's Primogeniture and the Thursday-Friday Conundrum Brief (The Pitcairn Log, 2020-04) Albert, Donald Patrick
One-page summary of the research article "The Bounty's Primogeniture and the Thursday-Friday Conundrum" published in Athens Journal of Humanities & Arts in April 2020.
Item Calm Before the Storm: The Challenges of Cloud Computing in Digital Forensics (International Journal of Digital Crime and Forensics, 2012-04) Grispos, George; Storer, Tim; Glisson, William Bradley
Cloud computing is a rapidly evolving information technology (IT) phenomenon. Rather than procure, deploy, and manage a physical IT infrastructure to host their software applications, organizations are increasingly deploying their infrastructure into remote, virtualized environments, often hosted and managed by third parties. This development has significant implications for digital forensic investigators, equipment vendors, law enforcement, as well as corporate compliance and audit departments, amongst other organizations. Much of digital forensic practice assumes careful control and management of IT assets (particularly data storage) during the conduct of an investigation.
This paper summarises the key aspects of cloud computing and analyses how established digital forensic procedures will be invalidated in this new environment, as well as discussing and identifying several new research challenges addressing this changing context.
Item CDBFIP: Common Database Forensic Investigation Processes for Internet of Things (IEEE Access, 2017-10) Al-Dhaqm, Arafat; Razak, Shukor; Othman, Siti Hajar; Choo, Kim-Kwang Raymond; Glisson, William Bradley; Ali, Abulalem; Abrar, Mohammad
Database forensics is a domain that uses database content and metadata to reveal malicious activities on database systems in an Internet of Things environment. Although the concept of database forensics has been around for a while, the investigation of cybercrime activities and cyber breaches in an Internet of Things environment would benefit from the development of a common investigative standard that unifies the knowledge in the domain. Therefore, this paper proposes common database forensic investigation processes using a design science research approach. The proposed process comprises four phases, namely: 1) identification; 2) artefact collection; 3) artefact analysis; and 4) the documentation and presentation process. It allows the reconciliation of the concepts and terminologies of all common database forensic investigation processes; hence, it facilitates the sharing of knowledge on database forensic investigation among domain newcomers, users, and practitioners.
Item Charles Christian and His Contributions to Pitcairn History (The Pitcairn Log, 2019-04) Albert, Donald Patrick
While Fletcher Christian has become widely known as the chief mutineer of the H.M.A.S. Bounty and subsequent leader of a nascent community on the remote and isolated Pitcairn Island, his progenies no less have enjoyed their 15 minutes, give or take, of fame.
Thursday October Christian (1790-1831) appeared often in the diaries, journals, and reports greeting and entertaining sea captains visiting Pitcairn Island. He is the focus of an amusing anecdote involving a name change from Thursday to Friday or Friday to Thursday, depending on the arguments one way or another. Mary Ann Christian (1793) attained worldwide fame as heroine of Mary Russell Mitford’s (1811) Christina: The Maid of the South Seas. She gifted Levi Hayden a Bible from the Bounty during the visit of the Cyrus in 1839 (Ford, 1996, 21-22). Known as the Pitcairn Bible, it resides at the Brooke Russell Astor Reading Room for Rare Books and Manuscripts of the New York Public Library. Charles Christian (1791 or 1792-1842), the middle child, has received less attention, but careful review of the historic record finds that he too, like his older brother and younger sister, distinguished himself. He became an antagonist of Joshua Hill, the dictator or per Nechtman (2018) the pretender of Pitcairn Island who resided there from 1832-1837. Hill exerted a harsh, cruel, and brutal control over the political, social and religious affairs of the Pitcairners. His treatment toward three “outsiders,” Nobbs, Buffett, and Evans, reached unforgiving proportions, even though he was a foreigner himself.
Item Chronicling Female Agency with Satellite Images and Photographs from Google Earth (2023) Albert, Donald Patrick
Abstract. Teehuteatuaonoa, aka Jenny, was one of twelve Polynesian women accompanying HMAV Bounty mutineers to Pitcairn Island on January 15, 1790. Her accounts increased our knowledge of Bounty’s sailing track post-mutiny and island life during her nearly three decades (1790-1817) on Pitcairn Island (Albert 2021a). Jenny is the most traveled of Bounty’s women, and first to return to Tahiti after almost 30 years. Jenny’s journey is chronicled with satellite images and photographs from Google Earth.
Her journey encompassed 15 links for a total of 24,090 km or 60% of the Earth’s circumference. The longest link was 7,400 km on the American Sultan from Coquimbo, Chile, to The Marquesas. Jenny’s life provides an example of strong female agency during a male-dominated era (late 1700s – early 1800s) when women’s voices were socially and institutionally repressed (Albert, 2021b).
Item Circumstantial response to "Who shot Bounty mast with this small lead ball?" (The Pitcairn Log, 2022-10) Albert, Donald Patrick
A circumstantial response to Herb Ford's question "Who shot the Bounty mast with this small lead ball?" Using primary and secondary sources, the author examines different situations for plausibility.
Item Cloud Forecasting: Legal Visibility Issues in Saturated Environments (Computer Law & Security Review, 2018) Brown, Adam J.; Glisson, William Bradley; Andel, Todd R.; Choo, Kim-Kwang Raymond
The advent of cloud computing has brought the computing power of corporate data processing and storage centers to lightweight devices. Software-as-a-service cloud subscribers enjoy the convenience of personal devices along with the power and capability of a service. Using logical as opposed to physical partitions across cloud servers, providers supply flexible and scalable resources. Furthermore, the possibility for multitenant accounts promises considerable freedom when establishing access controls for cloud content. For forensic analysts conducting data acquisition, cloud resources present unique challenges. Inherent properties such as dynamic content, multiple sources, and nonlocal content make it difficult for a standard to be developed for evidence gathering in satisfaction of United States federal evidentiary standards in criminal litigation. Development of such standards, while essential for reliable production of evidence at trial, may not be entirely possible given the guarantees to privacy granted by the Fourth Amendment and the Electronic Communications Privacy Act.
Privacy of information on a cloud is complicated because the data is stored on resources owned by a third-party provider, accessible by users of an account group, and monitored according to a service level agreement. This research constructs a balancing test for competing considerations of a forensic investigator acquiring information from a cloud.
Item Comparison of feature selection and classification for MALDI-MS data (BMC Genomics, 2009-07-07) Liu, Qingzhong; Sung, Andrew H.; Chen, Zhongxue; Yang, Jack Y.; Qiao, Mengyu; Yang, Mary Qu; Huang, Xudong; Deng, Youping
Introduction: In the classification of Mass Spectrometry (MS) proteomics data, peak detection, feature selection, and learning classifiers are critical to classification accuracy. To better understand which methods are more accurate when classifying data, some publicly available peak detection algorithms for Matrix assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS) data were recently compared; however, the issue of different feature selection methods and different classification models as they relate to classification performance has not been addressed. With the application of intelligent computing, much progress has been made in the development of feature selection methods and learning classifiers for the analysis of high-throughput biological data. The main objective of this paper is to compare the methods of feature selection and different learning classifiers when applied to MALDI-MS data and to provide a subsequent reference for the analysis of MS proteomics data. Results: We compared a well-known method of feature selection, Support Vector Machine Recursive Feature Elimination (SVMRFE), and a recently developed method, Gradient based Leave-one-out Gene Selection (GLGS) that effectively performs microarray data analysis.
We also compared several learning classifiers including K-Nearest Neighbor Classifier (KNNC), Naive Bayes Classifier (NBC), Nearest Mean Scaled Classifier (NMSC), uncorrelated normal based quadratic Bayes Classifier recorded as UDC, Support Vector Machines, and a distance metric learning for Large Margin Nearest Neighbor classifier (LMNN) based on Mahalanobis distance. To compare, we conducted a comprehensive experimental study using three types of MALDI-MS data. Conclusion: Regarding feature selection, SVMRFE outperformed GLGS in classification. As for the learning classifiers, when classification models derived from the best training were compared, SVMs performed the best with respect to the expected testing accuracy. However, the distance metric learning LMNN outperformed SVMs and other classifiers on evaluating the best testing. In such cases, the optimum classification model based on LMNN is worth investigating for future study.
Item A Comparison Study using Stegexpose for Steganalysis. (International Journal of Knowledge Engineering, 2017-06) Olson, Eric; Carter, Larry; Liu, Qingzhong
Steganography is the art of hiding secret messages in innocent digital data files. Steganalysis aims to expose the existence of steganograms. While internet applications and social media have grown tremendously in recent years, social media is increasingly being used by cybercriminals as well as terrorists as a means of command and control communication, including taking advantage of steganography for covert communication. In this paper, we investigate open source steganography/steganalysis software and test StegExpose for steganalysis. Our experimental results show that the capability of StegExpose is very limited.