정보이론 척도 기반 다중 오믹스 데이터 통합 네트워크 분석 프레임워크

Integrative network analysis framework for multiple omics data using information-theoretic measure

Abstract/Summary

Recent advances in sequencing technologies and collaborative projects have made high-throughput, multi-level omics data available, from the genomic level to the metabolomic level. Such data cannot be handled manually because the underlying mechanisms are complex and the scale of the data is large and still growing. Computational approaches have therefore become indispensable for analyzing it. Integrative network analysis is widely used in bioinformatics to integrate multi-level omics data, and it helps in understanding the biological system. Previous studies have proposed several computational methods for constructing interaction networks. However, most of them construct the network using only the strength of the interaction between any two features, so they cannot reflect the association between an interaction and the clinical outcome. This thesis presents a simple but powerful method for constructing an integrative network from multiple omics levels. The connected gene pairs in the network are associated with the clinical outcome, and these associations are detected with an extended mutual information measure. The results also show that the network-based approach can provide better insight into the underlying gene-gene interaction mechanisms that affect clinical outcome, not only in cancer but also in other diseases.
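The abstract does not reproduce the thesis's definition of the extended mutual information measure (it is given in Chapter III), so the following is only a minimal sketch of the general idea, under stated assumptions: a gene pair is scored by the mutual information between the joint discretized pair (X, Y) and the outcome C, i.e. I((X, Y); C), and significance is assessed by permuting the outcome labels. Equal-frequency binning, the plug-in estimator, and every function name and parameter below are illustrative assumptions, not the thesis's own implementation.

```python
import numpy as np

def discretize(values, n_bins=3):
    """Equal-frequency binning into integer labels 0..n_bins-1 (assumed scheme)."""
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.searchsorted(edges, values, side="right")

def mutual_information(a, b):
    """Plug-in estimate of I(A; B) in bits for two discrete label arrays."""
    _, a_idx = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, b_idx = np.unique(np.asarray(b).ravel(), return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)      # contingency table of counts
    joint /= joint.sum()                       # joint probabilities
    pa = joint.sum(axis=1, keepdims=True)      # marginal of A
    pb = joint.sum(axis=0, keepdims=True)      # marginal of B
    nz = joint > 0                             # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])))

def pair_outcome_mi(x, y, outcome, n_bins=3):
    """I((X, Y); C): the discretized pair is treated as one joint variable."""
    pair = discretize(x, n_bins) * n_bins + discretize(y, n_bins)
    return mutual_information(pair, outcome)

def permutation_p_value(x, y, outcome, n_perm=200, n_bins=3, seed=0):
    """Observed score and a permutation p-value from shuffled outcome labels."""
    rng = np.random.default_rng(seed)
    observed = pair_outcome_mi(x, y, outcome, n_bins)
    null = [pair_outcome_mi(x, y, rng.permutation(outcome), n_bins)
            for _ in range(n_perm)]
    exceed = sum(v >= observed for v in null)
    return observed, (1 + exceed) / (n_perm + 1)
```

A pair-level score of this kind can pick up interactions that single-gene scores miss: if the outcome depends on an XOR-like combination of two genes, each gene alone carries almost no information about the outcome, while the joint pair carries nearly all of it.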

Table of contents

Chapter I. Introduction
I.1 Motivation
I.2 Research objectives
I.3 Thesis contributions
I.4 Thesis organizations
Chapter II. Backgrounds
II.1 Overviews of previous studies
II.2 Interaction network construction methods
II.3 Gene-gene interaction measures
II.3.1 Correlation coefficient measure
II.3.2 Mutual information measure
II.4 Data integration methods
Chapter III. Methods
III.1 Definition of association measure
III.1.1 Definition of mutual information to detect association between pair-wise feature and clinical outcome
III.1.2 Discretization and calculation of mutual information
III.2 Association network construction
III.2.1 Extraction of outcome associated gene-gene interactions with permutation strategy
III.2.2 Construction of single profile gene networks
III.2.3 Integrative network construction
III.3 MINA: Mutual information based Network Analysis framework
Chapter IV. Validation and application
IV.1 Performance comparison using simulated data
IV.2 Survival analysis of identified gene pairs
IV.3 Investigation of network topologies
IV.4 Enrichment analysis of network
IV.5 Application to network-based machine learning technique
IV.5.1 Survivability prediction with network-based Cox regression
IV.5.2 Spectral Clustering
Chapter V. Results
V.1 Disease associated dataset
V.1.1 TCGA Ovarian Cancer dataset
V.1.2 KARE Gastritis dataset
V.2 Performance comparison with previous methods
V.3 Distribution of mutual information for real dataset
V.3.1 TCGA Ovarian Cancer dataset
V.3.2 KARE Gastritis dataset
V.4 Survival analysis for TCGA dataset
V.5 Investigation of network topologies
V.5.1 Integration network of TCGA Ovarian cancer dataset
V.5.2 KARE Gastritis dataset
V.6 Enrichment analysis of the networks
V.6.1 TCGA Ovarian cancer dataset
V.6.2 KARE Gastritis dataset
V.7 Prediction performance assessment
V.7.1 Optimal parameter selection using cross-validation
V.7.2 Performance comparison on validation set
V.7.3 Biological functionalities of signature genes
V.7.4 Analysis of networks of the significant genes
Chapter VI. Conclusion
