Daniel Herrera Carbajal
ICT

Artificial intelligence in Indigenous communities was at the forefront at North America’s largest Indigenous tech conference.

In a world where generative AI is used to write better emails or generate funny photos, the Indigenous Tech Conference gathered in Vancouver, British Columbia, Canada, to discuss how AI can be used to create true impact in Indigenous communities.

“We don’t need more widgets in the world,” said Ryan St. Germaine, Metis, a founder and CEO of Indigenous Tech Conference. “Technology needs to be pointed towards the challenges of our time.”

Some of those challenges are data sovereignty and language revitalization.

According to the United Nations Educational, Scientific and Cultural Organization, at least 40 percent of the 7,000 languages estimated to be spoken in the world are in danger, and a language disappears every two weeks on average.

UNESCO is a UN agency that promotes global peace through international cooperation.

Michael Running Wolf, Northern Cheyenne and Lakota, is a co-founder and architect of First Nations Languages AI Reality (FLAIR), an initiative that uses AI to support Indigenous communities in language revitalization and preservation efforts.

While many Indigenous languages have ongoing revitalization efforts, there are languages that have few to no speakers left, making large-scale data collection unrealistic.

The organization’s aim is to reduce the amount of data required to build automatic speech recognition for various Indigenous languages.

Running Wolf told ICT that AI can be used to create an immersive environment to enhance language learning.

“You have this dynamic that even if there is funding, there are very few speakers in which to practice and so while every tribe is committed to revitalizing and reclaiming their languages, often there’s some practical barriers,” Running Wolf said. “For instance, you go to class or you go to the movies and camp, but what happens is you go home, watch Wheel of Fortune, watch TV or YouTube and that’s all in English. So what I can do is give you a tool where you can go home and practice saying ‘Turn your lights off’ in Lakota or Diné, and that’s where AI can be useful because it can make your language ambient to your home.”

While FLAIR works to reduce the data required to build automatic speech recognition for languages, other ASR systems and large language models still require vast amounts of data, which Running Wolf said was “unethically gathered.”

“These large language models have been built using stolen data,” he told ICT. “All our data, and the entirety of the internet has been scraped to create these large language models and now these large language model developers have a problem in that the internet is poison. It’s hard to tell if content is actually created by humans, which is the best kind of data. And so now this puts a premium upon natural organic human data and what is the largest treasure trove of uninfected data and poisoned by AI? Indigenous data.”

Running Wolf equated data to land, noting that people would not give away their grandmother’s land for free.

“We are now in the era where our data is one of the last few reservoirs where we should obviously treat it like land,” he said. “You wouldn’t give away an acre of your grandmother’s land to someone for free. Similarly, with our data, if we treat it as a policy framework, as the equivalent of land, then we need to be very careful about it and guarded with it.”

Michael Running Wolf working remotely

In comes the conversation about creating effective policy that protects and guarantees the sovereignty of Indigenous peoples’ data.

“I think there needs to be more of an accurate depiction of who Indigenous people are and have our own digital sovereignty,” St. Germaine told ICT.

There are currently no federal or tribal policies that protect and guarantee the data sovereignty of Indigenous peoples. 

“We don’t have strong intellectual property rights protection or data sovereignty protection currently. But what if tribes got together and created a co-op, like a data trust, a legal entity whose duty was to protect the data?” Running Wolf told ICT. “If we have strong tribal policy and strong agency over our data, and the strongest thing we can do is actually be the researchers ourselves, tribal research groups working on their own data, creating their own research.”

Daniel Herrera Carbajal is a Multimedia Journalist for the ICT Newscast and ictnews.org. Carbajal is based out of ICT Southwest headquarters in Phoenix, Arizona.