Semiconductor Defect Detection with Swin Transformer-Based Hybrid RCNN under Limited Labeled Data

Abstract

Automated visual inspection is essential for maintaining quality control in semiconductor manufacturing, where even minor defects can lead to device failures and significant financial losses. This thesis presents a hybrid deep learning approach for semiconductor defect detection that combines the Swin Transformer backbone with the Cascade R-CNN detection framework. The proposed method addresses the challenge of detecting fine-grained structural defects in transistor components under limited labeled data conditions. The proposed Swin-T + Cascade R-CNN architecture leverages the hierarchical attention mechanism of the Swin Transformer to capture both local defect patterns and global structural context, while the multi-stage Cascade R-CNN detection head provides progressive bounding box refinement for precise defect localization. Transfer learning from ImageNet-pretrained weights and targeted data augmentation strategies enable effective training with only 172 labeled images. Experimental evaluation on the MVTec-AD transistor dataset demonstrates that the proposed method achieves a mAP@0.5:0.95 of 0.994, outperforming Faster R-CNN (0.975) and Cascade R-CNN with ResNet-50 backbone (0.989). The proposed method attains the highest recall (99.7%) and F1-score (97.2%) among evaluated models, with a defect classification accuracy of 96% on 100 test images. Per-class analysis shows significant improvements in detecting the defect class obj_head_ng (AP: 0.979) and challenging structural components. Robustness evaluation under Gaussian blur conditions indicates that the model maintains F1 scores above 0.96 even under moderate image degradation. These results validate the effectiveness of combining transformer-based feature extraction with cascade detection for industrial defect detection tasks. The proposed approach offers a practical solution for automated visual inspection in semiconductor manufacturing environments where detection accuracy and reliability are paramount.

Table of Contents

Chapter 1. Introduction
A. Background and Motivation
B. Problem Statement
C. Research Objectives
D. Contributions
E. Thesis Organization
Chapter 2. Literature Review
A. Convolution-Based Detectors
B. Vision Transformers
(1) Vision Transformer (ViT)
(2) Swin Transformer
C. Industrial Defect Detection
(1) MVTec Anomaly Detection Dataset
(2) Approaches to Industrial Defect Detection
(3) Relevance to the Proposed Method
Chapter 3. Methodology
A. Dataset Description
B. Model Architectures
(1) Baseline Models
(2) Proposed Method: Swin Transformer + Cascade R-CNN
C. Training Pipeline
(1) Hardware and Training Configuration
(2) Optimization Configuration
(3) Training Times
(4) Transfer Learning
(5) Data Preprocessing
D. Evaluation Metrics
(1) Object Detection Metrics
(2) Classification Metrics
(3) Component Completeness Classification
(4) Robustness Evaluation
Chapter 4. Experiments and Results
A. Experimental Setup
B. Qualitative Results
C. Quantitative Results
Chapter 5. Discussion and Conclusion
A. Practical Implications
B. Summary
Chapter 6. Limitations and Future Work
A. Limitations
B. Future Work
References