Computer Science ›› 2019, Vol. 46 ›› Issue (1): 57-63. doi: 10.11896/j.issn.1002-137X.2019.01.009

• CCDM2018 •

Adversarial Multi-armed Bandit Model with Online Kernel Selection

LI Jun-fan, LIAO Shi-zhong   

  (College of Intelligence and Computing, Tianjin University, Tianjin 300350, China)
  • Received: 2018-06-18  Online: 2019-01-15  Published: 2019-02-25

Abstract: Online kernel selection is an important component of online kernel methods, and it can be classified into three categories: the filter, the wrapper and the embedder. Existing online kernel selection research has explored the wrapper and the embedder categories and empirically adopts the filter approach, but there has been no unified framework for comparing, analyzing and investigating online kernel selection problems. This paper proposed a unified framework for online kernel selection research via multi-armed bandits, which can model the wrapper and the embedder of online kernel selection simultaneously. Given a set of candidate kernels, this paper associated each kernel with an arm in an adversarial bandit model. At each round of online kernel selection, multiple kernels were chosen randomly according to a probability distribution, and the probability distribution was updated via the exponentially weighted average method. In this way, an online kernel selection problem was reduced to an adversarial bandit problem in a non-oblivious adversary setting, and a unified framework was developed for online kernel selection research, which models the wrapper and the embedder uniformly. This paper further defined a new regret concept for online kernel selection, and proved that the wrapper within the framework enjoys a sub-linear weak expected regret bound and the embedder enjoys a sub-linear expected regret bound. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed unified framework.
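The reduction described in the abstract can be pictured with a short sketch. The following Python code is an illustrative, hedged example rather than the authors' exact algorithm: each candidate Gaussian kernel is treated as an arm, one kernel is drawn per round from an exponentially weighted distribution, the instantaneous hinge loss of that kernel's online predictor is observed, and the distribution is updated with an EXP3-style importance-weighted exponential update. The kernel widths, learning rates, exploration rate and the toy data stream are all assumptions made for illustration.

```python
# Illustrative sketch of online kernel selection as an adversarial bandit
# (assumed parameters and data; not the paper's exact algorithm).
import numpy as np

rng = np.random.default_rng(0)
widths = [0.1, 1.0, 10.0]            # candidate Gaussian kernel widths (assumed)
K = len(widths)                      # number of arms = number of candidate kernels
weights = np.ones(K)                 # exponential weights over the candidate kernels
gamma = 0.1                          # uniform exploration rate (assumed)
eta_bandit = 0.1                     # bandit learning rate (assumed)
eta_model = 0.2                      # online gradient step for each kernel predictor (assumed)
support = [[] for _ in range(K)]     # per-kernel support points
coeffs = [[] for _ in range(K)]      # per-kernel expansion coefficients

def gaussian_kernel(x, z, sigma):
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def predict(i, x):
    # kernel expansion f_i(x) = sum_j alpha_j * k_i(z_j, x)
    return sum(a * gaussian_kernel(z, x, widths[i]) for z, a in zip(support[i], coeffs[i]))

T = 200
for t in range(T):
    # toy binary classification stream (assumed data model)
    x = rng.normal(size=2)
    y = 1.0 if x.sum() > 0 else -1.0

    # sample one kernel (arm) from the exploration-mixed exponentially weighted distribution
    probs = (1.0 - gamma) * weights / weights.sum() + gamma / K
    i = rng.choice(K, p=probs)

    # suffer the instantaneous hinge loss of the selected kernel's predictor
    loss = max(0.0, 1.0 - y * predict(i, x))

    # update the selected kernel's predictor by online kernelized gradient descent
    if loss > 0.0:
        support[i].append(x)
        coeffs[i].append(eta_model * y)

    # EXP3-style update: importance-weighted loss estimate for the played arm only
    est_loss = loss / probs[i]
    weights[i] *= np.exp(-eta_bandit * est_loss)

print("final kernel selection distribution:",
      (1.0 - gamma) * weights / weights.sum() + gamma / K)
```

This sketch only illustrates a wrapper-style reduction, where a single sampled kernel suffers the loss at each round; an embedder-style variant would instead combine the sampled kernels' predictions before suffering the loss.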

Key words: Adversarial multi-armed bandit, Non-oblivious adversary, Online kernel selection, Unified framework

CLC Number: TP181