ABSTENTION IN REGRESSION

Cristina Garcia-Cardona, Jamaludin Mohd-Yusof, Tanmoy Bhattacharya

Research output: Contribution to journal › Conference article › peer-review

Abstract

Identifying noisy samples and outliers at training time is crucial for achieving good performance and improving the robustness of deep learning models. However, this process can be difficult and excessively time-consuming if attempted by hand, especially for large data sets. Moreover, in realistic situations, there are gradations in our judgement about which samples are noisy. In this work, we extend the Deep Abstention Classifier (DAC) framework to regression problems and devise a deep abstaining regressor (DAR) model. This is a forgiving strategy that limits the detrimental effect of noisy samples on the performance of the predictive model while, at the same time, retaining information in ambiguous cases. We apply the new DAR formulation to synthetic and real data sets and demonstrate that the model is able to identify noisy samples, reducing their influence during training, while learning to quantify the uncertainty in the predictions via a heteroscedastic criterion. DAR constitutes a new tool for the machine learning practitioner, producing predictions with actionable uncertainty quantification and helping to automate data curation. This may be an efficient way to reduce the burden of manual labelling and evaluation by a human-in-the-loop.
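
As a rough illustration of the heteroscedastic mechanism the abstract refers to, the sketch below trains a small regressor that predicts both a mean and a per-sample log-variance, using a Gaussian negative log-likelihood. Samples the model treats as noisy can be assigned large predicted variance, which shrinks their gradient contribution, an abstention-like, forgiving behaviour. This is a minimal sketch under assumed choices (PyTorch, the MLP architecture, the Gaussian NLL criterion, and the names HeteroscedasticRegressor and heteroscedastic_nll are all illustrative); it is not the paper's exact DAR loss or architecture.

import torch
import torch.nn as nn

class HeteroscedasticRegressor(nn.Module):
    # Small MLP with two heads: one for the predicted mean, one for the
    # predicted log-variance. Layer sizes are assumptions for illustration.
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.logvar_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

def heteroscedastic_nll(mean, logvar, target):
    # Gaussian negative log-likelihood with learned per-sample variance.
    # Inflating the predicted variance down-weights a sample's squared error,
    # which limits the influence of noisy targets during training.
    inv_var = torch.exp(-logvar)
    return 0.5 * (inv_var * (target - mean) ** 2 + logvar).mean()

if __name__ == "__main__":
    # Synthetic data with a few deliberately corrupted targets.
    torch.manual_seed(0)
    x = torch.rand(256, 1)
    y = torch.sin(4 * x) + 0.05 * torch.randn_like(x)
    y[:16] += 3.0 * torch.randn(16, 1)  # inject noisy samples

    model = HeteroscedasticRegressor(in_dim=1)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(500):
        mean, logvar = model(x)
        loss = heteroscedastic_nll(mean, logvar, y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # High predicted variance flags candidates for curation / human review.
    with torch.no_grad():
        _, logvar = model(x)
    print("top-5 suspected noisy indices:",
          torch.topk(logvar.squeeze(), 5).indices.tolist())

In this sketch, the learned log-variance serves a double purpose: it provides the per-prediction uncertainty estimate and, when sorted, gives a ranking of training samples most likely to be noisy, which is the kind of output that could support automatic data curation.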

