This paper investigates the optimal robust output containment problem for general linear heterogeneous multiagent systems (MAS) with completely unknown dynamics. A model-based algorithm using offline policy iteration (PI) is first developed, in which the p-copy internal model principle is employed to handle system parameter variations. This offline PI algorithm requires the nominal model of each agent, which may not be available in most real-world applications. To address this issue, a discounted performance function is introduced that recasts the optimal robust output containment problem as an optimal output-feedback design problem with bounded L2-gain. To solve this problem online in real time, a Bellman equation is first developed that simultaneously evaluates a given control policy and computes the updated control policy, using only the state/output information measured online. Building on this Bellman equation, a model-free off-policy integral reinforcement learning algorithm is then proposed that solves the optimal robust output containment problem of heterogeneous MAS in real time, without requiring any knowledge of the system dynamics. Simulation results are provided to verify the effectiveness of the proposed method.
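For concreteness, a discounted performance index with a bounded L2-gain condition of the kind described above can be sketched as follows; the notation here (containment error e_i, control input u_i, disturbance w_i, weights Q_i and R_i, discount rate gamma, and attenuation level beta) is a generic placeholder and not necessarily the paper's own.

\[
J_i \;=\; \int_{0}^{\infty} e^{-\gamma \tau}
\left( e_i^{\top} Q_i\, e_i + u_i^{\top} R_i\, u_i \right) \mathrm{d}\tau,
\qquad Q_i \succeq 0,\; R_i \succ 0,\; \gamma > 0,
\]
\[
\int_{0}^{\infty} e^{-\gamma \tau}
\left( e_i^{\top} Q_i\, e_i + u_i^{\top} R_i\, u_i \right) \mathrm{d}\tau
\;\le\; \beta^{2} \int_{0}^{\infty} e^{-\gamma \tau}\, w_i^{\top} w_i \,\mathrm{d}\tau .
\]

In such formulations, the discount factor e^{-gamma tau} keeps the index finite under persistently acting uncertainties, which is what allows the robust containment objective to be treated as a discounted optimal output-feedback design with a prescribed L2-gain bound.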