Alexandre Combessie

Phare LLM Benchmark

Support extending Phare LLM Benchmark to one new language with culturally grounded prompts and native annotators.


<div>The dominance of English in LLM evaluation limits the societal applicability of its results. With this funding, we will <strong>add new languages</strong> such as <strong>German, Portuguese, Arabic, and Italian</strong>, with culturally contextualized prompts and translated test sets reviewed by native experts. This ensures that safety benchmarks reflect <strong>local norms, idioms, and risk perceptions</strong> rather than relying on direct translation from English.</div><div>Expanding to new languages is not just a translation effort: it requires adapting the evaluation logic to linguistic and cultural specificities, recruiting diverse annotators, and rethinking what constitutes fairness or harmful content across contexts. Your support allows us to broaden the benchmark’s global inclusivity and reduce Western-centric bias in AI evaluations.<br /><br />Please let us know which language you'd like to see included in the Phare benchmark by emailing <a href="https://opencollective.com/redirect?url=mailto%3Aphare%40giskard.ai">phare@giskard.ai</a>.</div>

🌍 Language extension
