By lechmazur
A multiplayer tournament benchmark that tests LLMs on social reasoning, strategy, and deception. Players engage in public and private conversations, form alliances, and vote to eliminate each other.
An open-source project: browse the code and self-host from GitHub.