Weakly-Supervised Visual Grounding of Phrases with Linguistic Structures | IEEE Conference Publication | IEEE Xplore