Rubric-based ESC Analysis on Context-size and External Knowledge in LLM
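- Subject (keywords): LLM, rubrics-based, ESC, tinyllama, context-size, external knowledge, dialogue generation
- Subject (DDC): 006.31
- Publisher: Ajou University Graduate School
- Advisor: Hyunsouk Cho
- Year of publication: 2024
- Degree conferred: August 2024
- Degree: Master's
- Department and major: Graduate School, Department of Artificial Intelligence
- URI: http://www.dcollection.net/handler/ajou/000000033944
- Language of text: English
- Copyright: Ajou University theses are protected by copyright.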
Abstract
Emotional support conversation (ESC) systems aim to provide comforting and helpful responses that reduce the emotional intensity of users in distress. However, developing ESC systems is challenging because it requires both contextual understanding and nuanced evaluation. Existing approaches either compromise dialog context-size by splitting whole dialogs into small blocks of a few utterances, or add excessive external knowledge for emotional reasoning, which hinders nuances such as fluency. Finding the right balance between context-size and external knowledge for emotional response generation is therefore an important task. We experiment with TinyLlama-1B [1], controlling the dialog context-size and varying the external knowledge. We also pioneer the use of rubric-based evaluation on ESC tasks with Prometheus [2], an evaluator that is on par with GPT-4 [3], can assess long-form text based on user-defined scoring rubrics, and is more cost-effective than human evaluation. We observe that including the whole dialog context is more efficient, and that adding more external knowledge decreases model performance on several evaluation metrics. In conclusion, this paper presents an analysis of emotional support conversation under varying context-size and external knowledge, and provides a pipeline that both generates responses and evaluates them against customized rubrics.
Table of Contents
1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Thesis Outline
2 Related Work
2.1 Emotional Support Conversation
2.2 Knowledge-aware Response Generation
2.3 Rubrics-based Prometheus Evaluation
3 Emotional Support Dialog System
3.1 Dialog Systems
3.1.1 Problem Formulation
3.2 Context Length in Dialog Systems (LLMs)
3.2.1 Importance of Context Length
3.3 External Knowledge in Dialog Systems
3.3.1 Stages of Emotional Support
3.3.2 Support Strategies
3.3.3 COMET: Common-sense Transformers
3.3.4 HEAL: A Mental Health Knowledge Graph
3.3.4.1 Structure and Components of Mental Health
3.3.4.2 Relevance to Emotional Support Conversations
3.4 Context-size in Previous Approaches
3.5 External Knowledge in Previous Approaches
4 Proposed Framework
4.1 TinyLlama Standard
4.2 TinyLlama with Knowledge
4.3 TinyLlama with Role
5 Results
5.1 Experimental Setup
5.1.1 Dataset
5.1.2 Implementation Details
5.1.3 Evaluation Metrics
5.2 Experimental Results
5.2.1 Can Varying Context-size Affect ESC Response Generation
5.2.2 Can Varying External Knowledge Affect ESC Response Generation
5.3 Analysis of Context-size and External Knowledge
6 Conclusion
Bibliography

