Evidential Neural Networks for Uncertainty-Based Document Re-ranking in QA Systems
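- Subject (keywords): Evidential deep learning, language models, question-and-answering, text ranking, uncertainty estimation
- Subject (DDC): 006.31
- Publisher: Ajou University Graduate School
- Advisor: Sael Lee
- Year of publication: 2025
- Degree conferral date: February 2025
- Degree: Master's
- Department and major: Graduate School, Department of Artificial Intelligence
- URI: http://www.dcollection.net/handler/ajou/000000034673
- Language of text: English
- Copyright: Ajou University theses are protected by copyright.
Abstract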
The effectiveness of the two-step Retrieval-Augmented Generation (RAG) process in question-answering (QA) depends significantly on the accuracy of the re-ranking step, which identifies the most relevant context for generating answers. Traditional text re-ranking methods often rely on classification models that interpret predicted probabilities as relevance scores. However, standard deep learning classification models, which are trained to minimize prediction loss for point estimates, often suffer from poor calibration. To address this, we introduce the Evidential Document Re-Ranking (EDRR) model, which leverages Evidential Deep Learning (EDL) to improve the calibration of predicted probabilities and to quantify uncertainty in model predictions. The EDRR framework employs these calibrated probabilities and uncertainty measures to establish more dependable relevance scores for the re-ranking phase. Additionally, the uncertainty estimates can serve as a criterion for active learning, enabling the selection of diverse and informative training samples. Evaluations conducted on the Wikipedia-NQ dataset demonstrate that EDRR outperforms standard cross-encoder models, achieving up to a 10% improvement in mean average precision over the top 10 results (mAP@10).
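As a rough illustration of how an evidential classifier can yield both a probability and an uncertainty value, the sketch below follows the commonly used EDL formulation (non-negative evidence, Dirichlet parameters alpha = evidence + 1, uncertainty = K / sum(alpha)). This is an assumption about the general technique, not the thesis's implementation: the function names, the softplus evidence function, and the `hypothetical_relevance_score` combination are illustrative only, and the actual Evidential Relevance Score is defined in the thesis itself, not in this record.

```python
import math


def softplus(z):
    # Numerically stable softplus; maps any real logit to non-negative evidence.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def evidential_outputs(logits):
    """Turn raw class logits into expected probabilities and an uncertainty mass.

    Assumes the standard EDL parameterisation: alpha_k = evidence_k + 1.
    """
    evidence = [softplus(z) for z in logits]
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)                   # Dirichlet strength S
    probs = [a / strength for a in alpha]   # expected class probabilities
    uncertainty = len(alpha) / strength     # u = K / S, in (0, 1]
    return probs, uncertainty


def hypothetical_relevance_score(logits, relevant_index=1):
    # Illustrative only: discount the "relevant" probability by the uncertainty.
    # This is NOT the scoring rule from the thesis.
    probs, uncertainty = evidential_outputs(logits)
    return probs[relevant_index] * (1.0 - uncertainty)


if __name__ == "__main__":
    # Made-up logits [not relevant, relevant] for three candidate passages.
    candidates = {"doc_a": [0.2, 4.0], "doc_b": [-3.0, -3.0], "doc_c": [2.5, 0.1]}
    ranked = sorted(candidates,
                    key=lambda d: hypothetical_relevance_score(candidates[d]),
                    reverse=True)
    print(ranked)
```

With little evidence for either class, the uncertainty approaches 1, so such a passage can be ranked lower or flagged as a candidate for labeling in an active learning loop.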
Table of Contents
Introduction
Method
Evidential Re-rank Model Framework
Evidential-Bert Model
Evidential Relevance Score
Active Learning with Uncertainty Sampling
Experiments
Datasets
Hyperparameter Optimization
Comparison with Cross-Encoder
Performance of Uncertainty Sampling in Active Learning
Ranking with Uncertainty Filters
Related Work
Discussion
Conclusion

