Audio-Visual Fusion And Conditioning With Neural Networks For Event Recognition | IEEE Conference Publication | IEEE Xplore