Beam Search Optimized Batch Bayesian Active Learning
DOI:
https://doi.org/10.1609/aaai.v37i5.25751
Keywords:
HAI: Human-in-the-Loop Machine Learning, HAI: Applications, ML: Active Learning, ML: Applications, ML: Deep Neural Architectures, ML: Evaluation and Analysis (Machine Learning)
Abstract
Active learning is an essential method for label-efficient deep learning. As a Bayesian active learning method, Bayesian Active Learning by Disagreement (BALD) successfully selects the most representative samples by maximizing the mutual information between the model predictions and the model parameters. However, when applied in a batch acquisition mode, such as batch construction with greedy search, BALD suffers from poor performance, especially in the presence of noisy, near-duplicate data. To address this shortcoming, we propose a diverse beam search optimized batch active learning method, which explores a graph for every batch construction by expanding a predetermined number of the highest-scored samples. To avoid near-duplicate beam branches (very similar beams generated from the same root and similar samples), which are undesirable because they lack diverse representations in the feature space, we design a self-adapted constraint among candidate beams. The proposed method acquires data that better represent the distribution of the unlabeled pool while remaining significantly different from the existing beams. We observe that the proposed method achieves higher batch performance than the baseline methods on three benchmark datasets.
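For context, the BALD acquisition score that the abstract refers to is the mutual information between the prediction and the model parameters, typically estimated from Monte Carlo (e.g. MC-dropout) forward passes. The sketch below is not the paper's beam search method; it is a minimal illustration of the standard BALD score and the greedy top-k batch mode that the paper identifies as the weak baseline. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def bald_scores(probs):
    """BALD mutual-information scores from Monte Carlo predictions.

    probs: array of shape (T, N, C) -- T stochastic forward passes
    (e.g. MC dropout), N pool samples, C classes.
    Returns N scores; higher means the model's parameters disagree
    more about that sample, i.e. it is more informative to label.
    """
    eps = 1e-12
    mean_p = probs.mean(axis=0)                               # (N, C)
    # Entropy of the mean prediction: H[E_theta p(y|x, theta)]
    entropy_of_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=1)
    # Expected entropy of each prediction: E_theta H[p(y|x, theta)]
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=2).mean(axis=0)
    return entropy_of_mean - mean_entropy                     # mutual information

# Greedy top-k batch acquisition (the baseline mode that can pick
# near-duplicates, since each sample is scored independently):
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(10), size=(20, 100))            # T=20, N=100, C=10
batch = np.argsort(bald_scores(probs))[-5:]                   # top-5 sample indices
```

Because greedy top-k ranks samples independently, several near-identical high-scoring samples can all enter one batch; the paper's diverse beam search with a self-adapted constraint is designed to prevent exactly that.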
Published
2023-06-26
How to Cite
Sun, J., Zhai, H., Saisho, O., & Takeuchi, S. (2023). Beam Search Optimized Batch Bayesian Active Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(5), 6084-6091. https://doi.org/10.1609/aaai.v37i5.25751
Issue
Section
AAAI Technical Track on Humans and AI