MESIP: A low power and high performance motion estimation specific instruction-set processor

Abstract/Summary

This dissertation proposes a new Motion Estimation Specific Instruction-set Processor, called MESIP, for implementing low-power, high-performance Motion Estimation (ME) algorithms. ME is widely used for inter prediction based on temporal similarity in multimedia codecs such as MPEG-2/4, H.263, H.264/AVC, and HEVC, so the proposed processor targets a key block common to these codecs. MESIP has two major advantages over existing ME processors. First, MESIP can handle multiple candidate points with a single dedicated Sum of Absolute Differences (SAD) instruction. Existing SAD instructions calculate the SAD result for only a single candidate point, so the number of required SAD instructions is proportional to the complexity of the search pattern. This is a weakness compared with ME Application Specific Integrated Circuit (ASIC) architectures, because each individual SAD instruction requires extra setup operations. With the proposed SAD instructions, MESIP achieves performance comparable to that of ME ASICs. Second, MESIP supports the proposed new search scan orders, which improve data reusability. The smart snake scan method with a Reconfigurable Register Array (RRA) shows the best data reusability for the Full Search (FS) algorithm, but the size of the RRA is an obstacle to implementation. The simplified snake scan with the Optimized Sub-region Partition (OSP) method reduces the size of the RRA while keeping the same data reusability as the smart snake scan. Fast search algorithms require a different scan order from that of FS, since the search is performed only at selected candidate points. The proposed Center Biased Search Scan (CBSS) order offers an efficient RRA update strategy and reduces redundant data loading compared with existing search scan orders such as raster and snake scan.
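The data-reuse argument for scan orders can be illustrated with a small software model. The sketch below is not the RRA, OSP, or CBSS design from the dissertation; it compares only the generic raster and snake orders mentioned above, under an assumed register array that holds a single reference window. All function names and the cost model are hypothetical.

```python
def raster_order(width, height):
    """Visit candidate offsets row by row, always left to right."""
    return [(x, y) for y in range(height) for x in range(width)]


def snake_order(width, height):
    """Visit candidate offsets row by row, reversing direction on every other row."""
    order = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        order.extend((x, y) for x in xs)
    return order


def pixels_loaded(order, block=16):
    """Count reference pixels fetched under a simplified register-array model.

    Assumption (not from the source): the array holds exactly one
    block x block reference window. Moving to a 4-neighbour candidate shifts
    the window and loads one new row or column (block pixels); any other
    move reloads the whole window (block * block pixels).
    """
    total = block * block  # initial fill of the window
    for (x0, y0), (x1, y1) in zip(order, order[1:]):
        if abs(x1 - x0) + abs(y1 - y0) == 1:
            total += block
        else:
            total += block * block
    return total
```

For an 8 x 8 grid of candidates and a 16 x 16 block, `pixels_loaded(raster_order(8, 8))` gives 2944 and `pixels_loaded(snake_order(8, 8))` gives 1264 in this model, because the snake order never jumps back to the start of a row.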
In addition, MESIP has efficient program control schemes, such as dynamic pipeline control and Hardware (HW) loop acceleration, for complicated ME algorithms. Existing ME processors focus on efficient parallel operation architectures. To support complex ME algorithms efficiently, it is necessary to investigate not only the parallel operations but also the program control schemes. The proposed dynamic pipeline control scheme reduces the pipeline stalls caused by HW accelerators. Four loop-specific instructions and their dedicated architecture support efficient HW loop operations. In particular, early-termination conditions can be implemented with a single dedicated instruction. To implement these features of MESIP, a flexible and reconfigurable Processing Element (PE) architecture and a data arrangement scheme are also proposed. The flexible and reconfigurable PE architecture can shift the reference pixel data left, right, up, and down in the PE array. In particular, the left shift amount can be 4, 2, or 1 pixels in one clock cycle. Moreover, through a special data path for data reversing, the proposed PE architecture supports not only the left-side search but also the right-side search with the same architecture. The proposed data arrangement scheme handles the increased data bandwidth required by the new PE architecture and, at the same time, simplifies the address calculation. The MESIP architecture, implemented using the IBM 90nm library, consists of 192k gates. At a clock frequency of 200MHz, MESIP achieves real-time 1920 x 1080 ME at 30 frames/s. The simulation results show that the proposed MESIP reduces the number of required instructions by up to 18.9% compared with existing ME processors. Moreover, MESIP shows performance comparable to that of ME ASICs in terms of size, processing ability, and power consumption.
Hence, MESIP is well suited to low-power, high-performance programmable ME implementations.
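As a rough software analogy for the multi-point SAD and early-termination ideas summarized in the abstract, the sketch below evaluates several candidate motion vectors in one call and abandons a candidate once its partial SAD can no longer beat the best result so far. The function names, the row-level termination check, and the frame layout (lists of rows) are assumptions made for illustration; the actual MESIP instruction semantics are not given in this record.

```python
def sad(cur, ref, cx, cy, rx, ry, n, limit=None):
    """SAD between the n x n block of `cur` at (cx, cy) and of `ref` at (rx, ry).

    If `limit` is given, stop after any row once the partial sum reaches it
    (early termination); the returned value is then only a lower bound.
    The caller must keep both blocks inside their frames.
    """
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        if limit is not None and total >= limit:
            break
    return total


def multi_point_search(cur, ref, cx, cy, candidates, n):
    """Evaluate several candidate motion vectors in one call.

    Returns (best_mv, best_sad). The best SAD so far is passed as the
    early-termination limit for every later candidate.
    """
    best_mv, best_sad = None, None
    for dx, dy in candidates:
        cost = sad(cur, ref, cx, cy, cx + dx, cy + dy, n, limit=best_sad)
        if best_sad is None or cost < best_sad:
            best_mv, best_sad = (dx, dy), cost
    return best_mv, best_sad
```

A caller would pass the current and reference frames, the block position, and a search pattern such as `[(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]`. On a conventional processor each candidate would cost a separate SAD call plus its setup; grouping them behind one entry point mirrors, loosely, the motivation stated in the abstract.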

Table of Contents
Abstract
List of Figures
List of Tables
List of Abbreviations
1 Introduction 1
1.1 Introduction 1
1.2 Existing ME architectures 3
1.3 Introduction of ASIP 4
1.4 Summary of contributions 6
1.5 Outline of dissertation 9
2 Algorithm and Architecture Exploration of Motion Estimation 10
2.1 ME algorithm 10
2.2 Exploration of ME hardware architecture 14
2.3 ME Processor architectures 17
2.3.1 Adaptive motion estimation processor 17
2.3.2 Configurable and programmable motion estimation processor 20
2.3.3 ASIP for multi resolution motion estimation 22
2.4 Related Works 26
2.4.1 Multimedia ASIP design 26
2.4.2 Frame Selector for Multi Reference ME 34
2.4.3 Residual Prediction 37
3 Multi-point Search Instructions for MESIP 42
3.1 Motivation 42
3.2 Special SAD instructions of MESIP 43
3.3 Efficient Program Control Schemes for MESIP 50
3.3.1 Dynamic pipeline control 51
3.3.2 Hardware loop acceleration 54
4 Data Reusable Search Scan Order for MESIP 59
4.1 Motivation 59
4.2 Data reusable search scan order for SADF instruction 61
4.3 Data reusable search scan order for SADR instruction 66
5 Implementation 71
5.1 Architecture exploration of MESIP 71
5.2 Implementation of MESIP 78
5.2.1 Overall architecture 78
5.2.2 Reconfigurable PE architecture 80
5.2.3 Extended ladder shape data arrangement scheme 82
5.3 Performance Comparisons 84
6 Conclusions and Future works 90
6.1 Conclusions 90
6.2 Future works 92
6.2.1 Configurable PEG architecture 92
6.2.2 Flexible MESIP architecture 93
Bibliography 94