Abstract—In this paper, we propose optimizing a data transformation matrix and study its impact on binary classification. Building on minimization of the area above the receiver operating characteristics curve (AAC) with data transformation, we alternate between optimizing the data transformation matrix and optimizing the weighting parameter vector. Experimental results on 16 binary data sets from the UCI machine learning repository are reported and discussed. Classification accuracy and ranking value, each averaged over 10 runs of stratified 10-fold cross-validation, are adopted as performance indicators. The proposed method shows encouraging results on both indicators. In addition, most of the performance comparisons are shown to be statistically significant.
Index Terms—Data transformation, machine learning, pattern classification, receiver operating characteristics curve.
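As a rough illustration of two notions named in the abstract, the sketch below computes an empirical AAC (one minus the area under the ROC curve, counted over positive/negative score pairs) and builds stratified folds for cross-validation. It is a minimal sketch under stated assumptions, not the paper's method: the function names `empirical_aac` and `stratified_folds` are hypothetical, labels are assumed to be 0/1, and the alternating optimization of the transformation matrix and weight vector is not reproduced here.

```python
import random


def empirical_aac(scores, labels):
    """Empirical area above the ROC curve (1 - AUC) for 0/1 labels.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample is not scored above the negative one; ties count half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    errors = 0.0
    for p in pos:
        for n in neg:
            if p < n:
                errors += 1.0
            elif p == n:
                errors += 0.5
    return errors / (len(pos) * len(neg))


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds, dealing each class out round-robin
    so that every fold keeps roughly the overall class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds
```

For example, `empirical_aac([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0])` gives 0.25, since one of the four positive/negative pairs is mis-ranked. Repeating `stratified_folds` with ten different seeds would mirror the "10 runs of stratified 10-fold cross-validation" protocol, assuming each run reshuffles the data.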
Kangrok Oh and Kar-Ann Toh are with the School of Electrical and Electronic Engineering, Yonsei University, Seoul, Korea (e-mail: kangrokoh@yonsei.ac.kr, katoh@yonsei.ac.kr).
Zhengguo Li is with the Institute for Infocomm Research, Singapore 119613, Singapore (e-mail: ezgli@i2r.a-star.edu).
Cite: Kangrok Oh, Kar-Ann Toh, and Zhengguo Li, "Optimizing Data Transformation for Binary Classification," International Journal of Computer Theory and Engineering, vol. 9, no. 1, pp. 11-15, 2017.