A Digital Archive of Namibian Languages & Culture
PROJECT
Fiscal Host: Masakhane
We want to source historical and cultural text and speech data in Namibia to preserve history and build representative ML tools.
About
Namibia is home to 2.5 million people, with a rich cultural and colonial history spanning
over 100 years. Yet the stories of the Namibian people, including their cultural practices,
knowledge, and history, have not been told from the perspectives of Namibians themselves.
As Göring said at the Nuremberg trials, “The victor will always be the judge, and the
vanquished the accused.”
As such, this project aims to capture this knowledge in its historical and cultural context
for two languages: Khoekhoegowab, one of the most critically endangered languages, and
Oshiwambo, the most widely spoken language in Namibia. In doing so, it will provide data
for NLP tasks. The project builds on prior efforts to create cultural and historical texts
in the Khoekhoegowab language by crowdsourcing a speech dataset from two groups: 300 war
veterans, drawn from a potential 10,000 Namibian war veterans who are mostly
Oshiwambo-speaking, and a community of Khoekhoegowab elders, whose traditional methods
are still used in wildlife conservation for monitoring and tracking. The project will
consider various data-gathering methods, such as interviews, focus groups, and web apps,
to capture the data. The speech data will be annotated and translated into English.
Our team
Contribute
Become a financial contributor.