An ensemble-based method for the selection of instances in the multi-target regression problem

Reyes, Oscar; Fardoun, Habib M.; Ventura, Sebastian

INTEGRATED COMPUTER-AIDED ENGINEERING
2018
Volume 25, pages 305-320
Abstract
The multi-target regression problem consists of predicting multiple continuous variables simultaneously from a common set of input variables. In the last few years, this problem has gained increasing attention owing to the broad range of real-world applications that can be analyzed under this framework. Multi-target regression is more complex than single-target regression because target variables often have statistical dependencies, and these dependencies should be correctly exploited in order to solve the problem effectively. Consequently, additional difficulties appear when the aim is to select instances from this type of data. In this work, an ensemble-based method to perform the instance selection task in multi-target regression problems is proposed. First, a well-known instance selection method is adapted to work directly with multi-target data. Second, the proposed ensemble-based approach uses a set of these adapted methods to select the final subset of instances. The members of the ensemble select partial data subsets, where each member operates on a different input space that is expanded with target variables, thereby exploiting the underlying inter-target dependencies. Finally, the ensemble-based method aggregates all the selected partial data subsets into a final subset of relevant instances by solving an optimization problem with a simple greedy heuristic. The experimental study carried out on 18 datasets shows the effectiveness of our proposal for selecting instances in the multi-target regression problem. Results demonstrate that the size of the datasets is considerably reduced, whilst the predictive performance of the multi-target regressors is maintained or even improved. It is also observed that the proposed method is robust to the presence of noise in the data.
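
The pipeline outlined in the abstract (adapted selectors, one per expanded input space, followed by a greedy aggregation) can be illustrated with a rough sketch. The abstract does not name the base selector, the exact construction of the expanded input spaces, or the optimization problem, so everything below is an assumption made for illustration: `knn_filter` is a hypothetical nearest-neighbour filter standing in for the adapted method, each ensemble member is assumed to predict one target from the inputs plus the remaining targets, and the aggregation is a simple vote-coverage heuristic with a hypothetical `coverage` parameter. It is not the authors' algorithm.

```python
import math


def knn_filter(inputs, target, k=2, tol=2.0):
    """Hypothetical stand-in selector (not the method used in the paper).

    Keeps instance i when its target value lies within `tol` of the mean
    target of its k nearest neighbours (Euclidean distance on `inputs`).
    """
    kept = set()
    n = len(inputs)
    if n < 2:
        return set(range(n))  # nothing to compare against
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(inputs[i], inputs[j]))
        neighbours = others[:k]
        mean = sum(target[j] for j in neighbours) / len(neighbours)
        if abs(target[i] - mean) <= tol:
            kept.add(i)
    return kept


def ensemble_select(X, Y, k=2, tol=2.0, coverage=0.9):
    """Sketch of an ensemble-based instance selection for multi-target data.

    Assumption: member t treats target t as the output and expands the
    input space with all the other targets. Each member votes for the
    instances it keeps; a greedy pass then adds instances in descending
    vote order until `coverage` of all votes is accounted for.
    """
    n, m = len(X), len(Y[0])
    votes = [0] * n
    for t in range(m):
        # Input space expanded with the remaining target variables.
        inputs = [list(X[i]) + [Y[i][u] for u in range(m) if u != t]
                  for i in range(n)]
        target = [Y[i][t] for i in range(n)]
        for i in knn_filter(inputs, target, k, tol):
            votes[i] += 1

    total = sum(votes)
    order = sorted(range(n), key=lambda i: -votes[i])
    selected, covered = [], 0
    for i in order:
        # Stop once enough votes are covered or no member kept instance i.
        if covered >= coverage * total or votes[i] == 0:
            break
        selected.append(i)
        covered += votes[i]
    return sorted(selected)
```

On a toy dataset where five instances follow a linear trend in both targets and a sixth has wildly different target values, calling `ensemble_select(X, Y)` drops the sixth instance, because no ensemble member votes for it. A real implementation would replace both the stand-in selector and the aggregation step with the ones defined in the paper.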
