Variable Selection Based on Information Tree for Spectroscopy Quantitative Analysis

Abstract

Spectroscopy is a fast and efficient method for component analysis, but a prediction model built on the full spectrum may be redundant and inaccurate. This paper proposes a variable selection method based on an information tree for quantitative spectroscopic analysis. First, a feature training set that records which variables are selected is generated. Then, partial least squares (PLS) regression is performed on the spectral calibration set, and the root-mean-square error of cross-validation (RMSECV) is used to evaluate each entry of the feature training set. From these evaluation results, the information gain of each wavelength is calculated. The wavelength with the maximum information gain is defined as the root node, and an information tree is built from the information gain, with each leaf node representing a wavelength. The final selection result is the conjunction path of the leaf nodes with the larger information gain. Full-spectrum PLS, uninformative variable elimination with PLS, the genetic algorithm with PLS, and the proposed method are applied to a real spectral dataset of flue gas, and the effectiveness of the methods is compared and discussed. The experimental results verify that the prediction precision and the compression ability of the proposed method are higher than those of the compared methods.
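The abstract does not state how the information gain is computed, so the following is only a minimal sketch under stated assumptions: each entry of the feature training set is taken to be a binary mask over wavelengths, each mask is labelled "good" when its RMSECV falls below the median of the set, and the gain of a wavelength is the usual Shannon-entropy reduction obtained by splitting the masks on that wavelength. The names `masks`, `rmsecv`, `entropy`, and `information_gain`, the toy RMSECV values, and the median threshold are all hypothetical and are not taken from the paper; in practice the RMSECV values would come from cross-validated PLS models.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(masks, labels, j):
    """Entropy reduction from splitting the masks on wavelength j."""
    included = [y for m, y in zip(masks, labels) if m[j] == 1]
    excluded = [y for m, y in zip(masks, labels) if m[j] == 0]
    n = len(labels)
    conditional = (len(included) / n) * entropy(included) \
        + (len(excluded) / n) * entropy(excluded)
    return entropy(labels) - conditional

# Toy feature training set: 4 candidate subsets over 3 wavelengths
# (1 = wavelength selected). Values are illustrative, not measured.
masks = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
# Hypothetical RMSECV of a PLS model fitted on each subset.
rmsecv = [0.10, 0.12, 0.30, 0.35]

# Assumption: a subset is "good" (label 1) if its RMSECV is below the median.
threshold = sorted(rmsecv)[len(rmsecv) // 2]
labels = [1 if e < threshold else 0 for e in rmsecv]

gains = [information_gain(masks, labels, j) for j in range(len(masks[0]))]
# The wavelength with the maximum gain would become the root node.
root = max(range(len(gains)), key=gains.__getitem__)
```

In this toy set the first wavelength separates the good subsets from the bad ones perfectly, so it would be chosen as the root; how the remaining wavelengths are arranged into the tree is not specified in the abstract and is left out of the sketch.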
