High-Fidelity Exploration of Conformational Space and Structural Identification for a Biomolecular System Using a Neural Network Potential Model
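- Subject (keywords) Neural Network Potential, Conformational Search, Deep Learning
- Subject (DDC) 621.042
- Publisher Ajou University Graduate School
- Advisor Hyuk Kang
- Publication year 2025
- Degree conferral date August 2025
- Degree Doctorate
- Department and major Graduate School, Department of Energy Systems
- Actual URI http://www.dcollection.net/handler/ajou/000000035117
- Language of text English
- Copyright Ajou University theses are protected by copyright.
Abstract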
Neural network potential (NNP) models trained at the M06-2X/6-311+G(d,p) level of theory were applied to a comprehensive conformational search of the singly protonated hexapeptide DYYVVR. As a proof of concept, a fragmentation approach, in which capped fragment data are incorporated into the training set, was first tested on models trained on Gly4-6 systems and subsequently applied to the hexapeptide system. The training data for the parent structures were further refined by constructing a diverse set of parent structures that included nearly all types of hydrogen bonds formed by different donor–acceptor combinations. A model trained on this refined dataset showed substantially better prediction performance than one trained on the unrefined data. The model was further improved through active learning, during which new low-energy minimum structures were identified that closely matched those optimized at the DFT level. In total, 48,726 geometry optimizations were performed with NNP models at an accuracy comparable to that of DFT calculations and at significantly lower computational cost. Low-lying NNP minima newly identified in the present study were used for structural assignment after a few steps of DFT optimization and frequency analysis, yielding minimum structures that explain the experimental IR spectra in the 3,350–3,650 cm⁻¹ region. The procedures validated on the hexapeptide system are expected to be applicable to other systems with diverse hydrogen-bonding interactions, offering a substantial reduction in computational cost.
Table of Contents
I. NNPs for predicting energy and gradients
A. Introduction
B. ANN methods and their applications
1. Overview of ANN methods
2. Requirements for suitable model architecture for the present study
C. SchNet architecture
1. Suitability of SchNet for the present study
2. Working principles of SchNet
D. Determination of optimal training scheme
1. Hyperparameter set
2. Training schedule
3. Atomic energy references
II. Dataset generation and fragmentation approach for NNP models
A. Introduction
B. Dataset generation methods
1. Conformational search
2. Geometry optimization and structure extraction
3. Duplicate screening
C. Fragmentation approach
1. Motivation
2. Application to Gly4-6 system
3. Application to DYYVVR system
4. Summary of datasets used for the present work
D. Conclusions
III. Hydrogen bond patterning for training dataset generation
A. Introduction
B. H-bond patterning
C. Structure selection for training/validation sets
D. Validation of the structure selection
E. Conclusions
IV. Application to singly protonated DYYVVR: Practical case study
A. Introduction
B. Fragmentation approach for DYYVVR
C. Active learning
D. New low-lying structures found from the active learning
E. Performance of the latest model
F. Conclusions
References
Appendix
A. Cryogenic ion spectroscopy of the singly protonated DYYVVR
B. Computational results performed in the previous research
C. Theoretical results of Conf_0 and Conf_2

