OIL PRICE PREDICTION BASED ON DATA MINING TECHNIQUES: SEMI-SUPERVISED LEARNING WITH PCA AND NLPCA

Abstract

Oil price prediction is an important issue for government regulators and related industries. Prediction with time series techniques is difficult, however, because the behavior of oil prices is dominated by irregular external factors that are hard to quantify, e.g., supply- or demand-side shocks, political conflicts tied to events in the Middle East, and direct or indirect influences from other global economic indices. Identifying and quantifying the relationship between the oil price and those external factors may yield more relevant predictions than attempting to uncover the underlying structure of the series itself. Technically, this implies that the prediction should be based on vector data describing the degree of those relationships rather than on the raw series data. This paper proposes a novel method for time series prediction using Semi-Supervised Learning, which was originally designed only for vector-type data. First, several time series of oil prices and other economic indices are transformed into multi-dimensional vectors using various types of technical indicators and diverse combinations of the indicator-specific hyper-parameters. Then, to avoid the curse of dimensionality and redundancy among the dimensions, two well-known feature extraction techniques, PCA and NLPCA, are employed. With the extracted features, a timepoint-specific similarity matrix of oil prices and other economic indices is built, and finally Semi-Supervised Learning generates a one-timepoint-ahead prediction. The proposed method was validated on one artificial and five real-world problems. It was then applied to the series of West Texas Intermediate (WTI) crude oil prices, where the experiments showed promising results: an average AUC of 0.86 and an average classification accuracy of 88%.
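The pipeline described in the abstract (indicator transform, feature extraction, similarity matrix, semi-supervised prediction) can be illustrated with a minimal NumPy sketch. The abstract does not name the specific indicators, the similarity function, or the Semi-Supervised Learning formulation, so everything concrete here is an assumption made for illustration: the moving-average and momentum features, the RBF similarity, the graph-Laplacian solve `f = (I + mu L)^{-1} y`, the up/down labeling of the next move, and all function names and parameters. Only linear PCA is shown; NLPCA is omitted.

```python
import numpy as np

def indicator_vectors(series, windows=(3, 5, 10)):
    """Map each timepoint to a vector of simple technical indicators.

    Assumed indicators (not from the source): relative deviation from a
    trailing moving average and momentum, one pair per window length.
    """
    s = np.asarray(series, dtype=float)
    start = max(windows)
    rows = []
    for t in range(start, len(s)):
        feats = []
        for w in windows:
            ma = s[t - w + 1:t + 1].mean()
            feats.append(s[t] / ma - 1.0)        # deviation from moving average
            feats.append(s[t] / s[t - w] - 1.0)  # momentum over w steps
        rows.append(feats)
    return np.array(rows), start

def pca(X, n_components):
    """Project centered data onto its leading principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def rbf_similarity(Z):
    """Timepoint-by-timepoint similarity matrix (assumed Gaussian kernel)."""
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    sigma2 = np.median(d2[d2 > 0])  # median heuristic for the kernel width
    W = np.exp(-d2 / (2.0 * sigma2))
    np.fill_diagonal(W, 0.0)
    return W

def graph_ssl(W, y, mu=1.0):
    """Graph-based SSL: solve (I + mu * L) f = y with L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(len(y)) + mu * L, y)

def predict_next_move(series, n_components=2, mu=1.0):
    """Score for the next move: positive suggests up, negative suggests down."""
    s = np.asarray(series, dtype=float)
    X, start = indicator_vectors(s)
    W = rbf_similarity(pca(X, n_components))
    # Label of timepoint t is the direction of the move from t to t+1;
    # the last timepoint is unlabeled (0) and is the one being predicted.
    y = np.zeros(len(X))
    y[:-1] = np.where(s[start + 1:] > s[start:-1], 1.0, -1.0)
    return graph_ssl(W, y, mu)[-1]

# Synthetic positive price series standing in for real data.
rng = np.random.default_rng(0)
prices = 50.0 * np.exp(np.cumsum(rng.normal(0.0, 0.02, 120)))
score = predict_next_move(prices)
print("next-move score:", score)
```

In this sketch the score for the unlabeled last timepoint is a weighted average of the known up/down labels of similar past timepoints, so it always lies between -1 and 1. Features from other economic indices could be appended as additional columns of the indicator matrix before the PCA step.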

Table of Contents

I INTRODUCTION

II METHODS
II-1 Semi-Supervised Learning
II-2 Technical Indicators Transform
II-3 Feature Extraction (PCA/NLPCA)

III EXPERIMENTS
III-1 Artificial Data
III-2 Benchmark Data
III-3 Oil Price Data

IV CONCLUSION

REFERENCES
