Compares outputs from multiple LLMs to score their agreement, identify dissenting responses, and recommend the best response.
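
As a rough illustration of the comparison described above, here is a minimal Python sketch. It is not the tool's actual implementation: the function name `compare_responses`, the `dissent_threshold` parameter, and the use of lexical similarity from the standard library's `difflib` as a stand-in for a semantic metric are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def compare_responses(responses, dissent_threshold=0.5):
    """Score agreement between model outputs (hypothetical helper).

    responses: dict mapping a model name to its output text (at least two entries).
    dissent_threshold: assumed cutoff below which a model counts as dissenting.
    """
    if len(responses) < 2:
        raise ValueError("need at least two responses to compare")

    # Pairwise lexical similarity in [0, 1]; a stand-in for a semantic metric.
    totals = {name: 0.0 for name in responses}
    pair_scores = []
    for a, b in combinations(responses, 2):
        score = SequenceMatcher(None, responses[a].lower(), responses[b].lower()).ratio()
        pair_scores.append(score)
        totals[a] += score
        totals[b] += score

    # Each model's mean similarity to every other model.
    per_model = {name: total / (len(responses) - 1) for name, total in totals.items()}

    return {
        "agreement": sum(pair_scores) / len(pair_scores),  # overall agreement score
        "dissent": [n for n, s in per_model.items() if s < dissent_threshold],
        "recommended": max(per_model, key=per_model.get),  # closest to the consensus
        "per_model": per_model,
    }
```

In this sketch the "best" response is simply the one most similar to all the others, which rewards consensus and not correctness; a real comparison would likely swap in embeddings or a judge model for the similarity step.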