SCB 10X Unveils “Typhoon Isan”, The First Systematic Isan ASR Model and Open-Source Language Data

SCB 10X, the disruptive technology investment arm of SCBX Group, continues to advance Thailand’s AI ecosystem and developer community under its vision to build “AI for Thai People” that is powerful, reliable, and deeply human-centric.

Reinforcing its commitment to inclusive AI that reflects Thailand’s linguistic and cultural identity, SCB 10X hosted “TYPHOON: Hed Hai AI Jai Isan”, an event dedicated to launching Typhoon Isan, the first Automatic Speech Recognition (ASR) model capable of transcribing the Isan language with systematic spelling standards, together with open-source linguistic datasets designed to elevate Thailand’s AI capabilities.

Today’s speech recognition models typically struggle with regional or low-resource languages, especially those with limited digital data. When users speak in local dialects, AI systems often fail to accurately interpret or transcribe speech. This challenge inspired the development of Typhoon Isan, beginning with the Isan language due to its significant cultural, economic, and demographic importance.

Statistics show that over 20 million people speak Isan, representing nearly one-third of Thailand’s population. The Northeastern region generates more than 180 billion baht in GDP, accounting for approximately 10% of the national economy, with Isan speakers contributing across diverse industries. Yet Isan remains primarily a spoken language with no standardized writing system, making systematic documentation essential for preserving cultural heritage while enabling future digital and economic development.

Driven by the belief that AI must understand the voices of all Thai people, SCB 10X’s research and development team created Typhoon Isan, an open-source AI initiative dedicated to building research-driven models that truly understand Thai linguistic and cultural contexts. This project is the result of collaboration between SCB 10X researchers, linguists, cultural experts, teachers, students, and local communities, and marks a milestone toward establishing a modern digital standard for the Isan language.

During the “TYPHOON: Hed Hai AI Jai Isan” event, SCB 10X introduced several key research outputs:

  • Typhoon Isan ASR – an open-source Automatic Speech Recognition model capable of accurately transcribing spoken Isan into text
  • Typhoon Isan TTS (Demo) – a text-to-speech model that produces natural-sounding Isan speech
  • Isan Speech Transcription Convention – guidelines for transcribing spoken Isan for AI training
  • Isan Spelling Standard – a Thai-script–based orthographic system for writing Isan
  • Isan Speech Corpus – an open dataset of spoken Isan collected from multiple provinces in the Northeastern region of Thailand
  • Isan Phonetic Dictionary – a phonetic lexicon of words and their pronunciations in Isan

The Typhoon Isan initiative marks a significant milestone for SCB 10X in developing technologies that reflect the identity of Thai people, and reinforces its mission to build AI that is inclusive, accessible, and representative of every voice across the nation.