
Discovery of mutation markers in liver cancer through genetic mutation profiling analysis

Discovery of actionable targets in liver cancer by mutation profile analysis

Abstract

Recent high-throughput genomic profiling studies have demonstrated the huge diversity of mutation profiles in cancers. Although mutations are thought to play driver roles in cancer development and progression, it is not easy to distinguish driver mutations from the huge number of mutations in genomic data. Large-scale public databases such as The Cancer Genome Atlas (TCGA) have been released, providing genomic landscapes of sequence variations in numerous cancer types. Such large-scale data collections inevitably contain batch effects introduced by differences in processing at various stages from sample collection to data generation. However, batch effects on sequence variation and their characteristics have not been studied extensively. Here, in Part 1, I evaluated batch effects on somatic sequence variation in pan-cancer TCGA data. In Part 2, to delineate the driver mutations in liver cancer, I analyzed RNA-Seq data from liver cancer patients. By comparing the mutations and transcriptomes between primary and recurrent tumors, I sought to identify driver mutations that might be responsible for the recurrence of liver cancer.

Part 1. I systematically evaluated batch effects on somatic sequence variations in pan-cancer TCGA data, revealing 999 somatic variants that were batch-biased with statistical significance (P < 0.00001, Fisher’s exact test; false discovery rate ≤ 0.0027). Most of the batch-biased variants were associated with specific sample plates. The batch-biased variants, which had a unique mutational spectrum with frequent indel-type mutations, preferentially occurred at sites prone to sequencing errors, e.g., in long homopolymer runs. Non-indel-type batch-biased variants were frequent at splicing sites with the unique consensus motif sequence ‘TTDTTTAGTT’. Furthermore, some batch-biased variants occurred in known cancer genes, potentially causing misinterpretation of mutation profiles.

Part 2. Recurrence of hepatocellular carcinoma (HCC), even after curative resection, leads to dismal patient outcomes. To delineate the driver events of genomic and transcriptomic alteration during HCC recurrence, I performed RNA-Seq profiling of paired primary and recurrent tumors from two patients with intrahepatic HCC. By comparing the mutational and transcriptomic profiles, I identified somatic mutations acquired during HCC recurrence, including novel mutants of GOLGB1 (E2721V) and SF3B3 (H804Y). Through experimental evaluation using siRNA-mediated knockdown and overexpression constructs, I demonstrated that the mutants of GOLGB1 and SF3B3 can promote cell proliferation, colony formation, migration, and invasion of liver cancer cells. Transcriptome analysis also revealed that the recurrent HCCs reprogram their transcriptomes to acquire aggressive phenotypes. Network analysis revealed CXCL8 (IL-8) and SOX4 as common downstream targets of the mutants. These results indicate that the mutations of GOLGB1 and SF3B3 are potential key drivers for the acquisition of an aggressive phenotype in recurrent HCC.

In summary, from the two studies above, I suggest that mutation analysis with careful consideration of systematic biases is needed for correct interpretation of large-scale genomic data, and that establishing appropriate study designs and analysis strategies is important for identifying driver mutations from cancer genome data.
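
To make the Part 1 test concrete, the following is a minimal sketch of how a batch-biased variant could be flagged with Fisher's exact test followed by a false discovery rate estimate. It is not the thesis's actual pipeline. The one-sided alternative, the Benjamini-Hochberg procedure, the function names, and the toy plate counts are all assumptions made for illustration. Only the test name and the P < 0.00001 cutoff come from the abstract.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    a: carriers in the batch,      b: non-carriers in the batch
    c: carriers in other batches,  d: non-carriers in other batches
    Returns P(X >= a) under the hypergeometric null (assumed alternative).
    """
    batch_size = a + b
    carriers = a + c
    total = a + b + c + d
    upper = min(batch_size, carriers)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, batch_size - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, batch_size)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def find_batch_biased(carriers, batch_sizes, p_cutoff=1e-5):
    """Test every (variant, batch) pair; return those with p < p_cutoff.

    carriers: {variant: {batch: number of samples carrying the variant}}
    batch_sizes: {batch: number of samples in the batch}
    """
    total = sum(batch_sizes.values())
    tests = []
    for variant, per_batch in carriers.items():
        all_carriers = sum(per_batch.values())
        for batch, size in batch_sizes.items():
            a = per_batch.get(batch, 0)
            b = size - a
            c = all_carriers - a
            d = (total - size) - c
            tests.append((variant, batch, fisher_exact_greater(a, b, c, d)))
    qvalues = benjamini_hochberg([p for _, _, p in tests])
    return [
        (variant, batch, p, q)
        for (variant, batch, p), q in zip(tests, qvalues)
        if p < p_cutoff
    ]


# Hypothetical toy data: variant "v1" is seen only on plate "A".
toy_batch_sizes = {"A": 20, "B": 200}
toy_carriers = {"v1": {"A": 20}, "v2": {"A": 2, "B": 20}}
for variant, batch, p, q in find_batch_biased(toy_carriers, toy_batch_sizes):
    print(variant, batch, p, q)
```

In this toy input, "v1" is confined to a single plate and is flagged, while "v2" appears at the same rate on both plates and is not. A real analysis would also need to handle cancer type and sample size differences between plates, which this sketch ignores.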


Table of Contents

I. INTRODUCTION
A. Part 1
B. Part 2
II. MATERIALS AND METHODS
A. Part 1
1. Data collection and processing
2. Estimation of batch-biased variant calls
3. Consensus sequence and motif analysis
B. Part 2
1. Patients and specimens
2. RNA-Seq profiling and data processing
3. Variant calling from RNA-Seq data
4. Microarray gene expression profiling
5. Gene ontology and gene set enrichment analyses
6. Cell culture and siRNA-mediated knockdown experiments
7. Construction of expression vectors
8. Cell proliferation and colony formation assays
9. Cell migration and invasion assays
10. Quantitative real-time PCR (qRT-PCR)
11. Western blotting
12. Protein structure analysis
III. RESULTS
A. Part 1
1. Identification of batch-biased sequence variants in TCGA data
2. Comparison of mutation spectrum of the batch-biased and the unbiased variants
3. Homopolymer runs are associated with batch-biased variants
4. Batch-biased variants occur frequently at splicing sites
5. Batch-biased variants in the significantly mutated genes (SMGs)
B. Part 2
1. Profiling of RNA-Seq identifies the mutants acquired by recurrence of HCC
2. Transcriptomic reprogramming of the recurrent HCC
3. The mutants of GOLGB1 and SF3B3 give rise to an aggressive phenotype
4. CXCL8 and SOX4 are potential common downstream targets of GOLGB1 and SF3B3
IV. DISCUSSION
V. CONCLUSION
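
The contents above include sections on homopolymer runs and on splice-site variants, for which the abstract reports the consensus motif ‘TTDTTTAGTT’. As a rough illustration only, the sketch below scans a sequence for that motif and reports the longest homopolymer run. In the IUPAC nucleotide code, D stands for A, G, or T. The function names and example sequences are hypothetical and are not taken from the thesis.

```python
import re
from itertools import groupby

# IUPAC nucleotide codes expressed as regular-expression fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[ch] for ch in motif.upper())
    # A lookahead lets finditer report overlapping occurrences.
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif="TTDTTTAGTT"):
    """Return 0-based start positions of every motif occurrence."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]


def longest_homopolymer(sequence):
    """Return (base, length) of the longest run of one repeated base."""
    best_base, best_len = "", 0
    for base, run in groupby(sequence.upper()):
        length = sum(1 for _ in run)
        if length > best_len:
            best_base, best_len = base, length
    return best_base, best_len


# Hypothetical flanking sequence around a variant site.
example = "CCTTGTTTAGTTCC"
print(find_motif(example))
print(longest_homopolymer(example))
```

A variant caller's output could be annotated this way by extracting the reference sequence around each variant and recording whether the motif or a long run is present. The window size and the run-length threshold would be choices for the analyst and are not specified here.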
