An evaluation metric that measures how well an AI system responds to prompts, based on the relevance, coherence, and accuracy of its responses.