Code-switching · Production-grade
"Zan zo gobe, but can you confirm the price?"
The way Nigerians actually speak. Maraba is the only commercial voice AI that handles mid-sentence Hausa-English, Yoruba-English and Igbo-English code-switching — with per-token language labels and no awkward pauses.
12.6%
WER on Hausa-EN code-switch (Whisper-large-v3: 41.7%)
Per-token
Language label granularity
5
Languages, any combination
Why code-switching is the hardest problem in Nigerian voice AI
In Lagos boardrooms, Kano markets, and Enugu pharmacies, no real conversation stays in one language for long. Code-switching isn't a bug — it's the dominant mode of Nigerian business communication. Yet every off-the-shelf STT model treats it as noise.
🧠
One-language-per-utterance assumption
Whisper, Google Cloud STT, Wit.ai — all commit to a single language label for the whole audio file. They have no way to express "the first three words were Hausa, the rest English."
🔇
Phonology collisions
English vowels are forced into Hausa phonotactics, or vice versa — producing nonsense transcripts. The downstream LLM never sees the actual words.
⏸️
Awkward voice-agent pauses
When the language switches, the agent's TTS pipeline often hesitates — the dead air signals "this AI doesn't understand me" and callers hang up.
📊
Analytics fail silently
Intent extraction, sentiment analysis, keyword search — all read the broken transcript and produce confidently wrong results. You don't even know it's broken.
How Maraba handles it
Per-token language labels, dual-vocab decoder, voice-pair switching. One model, five Nigerian languages, any combination.
1 · Per-token language tagging
Orinode STT's decoder emits a (token, language, confidence) triple at every step. The labels are part of the output, not metadata bolted on after.
"Zan" ha 0.98
"zo" ha 0.98
"gobe" ha 0.97
"," —
"but" en 0.99
"can" en 0.99
"you" en 0.99
"confirm" en 0.99
"the" en 0.99
"price?" en 0.99
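As a rough illustration of how a downstream consumer might use these triples, the sketch below merges consecutive tokens that share a language into spans. The `Token` class, the `language_spans` helper, and the choice to attach unlabelled punctuation to the preceding span are assumptions made for this example, not part of the Orinode STT API.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional, Tuple


@dataclass
class Token:
    """Hypothetical container for one (token, language, confidence) triple."""
    text: str
    lang: Optional[str]    # None for punctuation that carries no label
    conf: Optional[float]  # None when no confidence is reported


def join_tokens(texts: List[str]) -> str:
    """Join tokens with spaces, but glue punctuation-only tokens to the left."""
    out = ""
    for t in texts:
        if out and any(ch.isalnum() for ch in t):
            out += " "
        out += t
    return out


def language_spans(tokens: List[Token]) -> List[Tuple[Optional[str], str]]:
    """Merge consecutive same-language tokens into (lang, text) spans.

    Assumption: an unlabelled token inherits the language of the token before it.
    """
    filled = []
    current = None
    for tok in tokens:
        if tok.lang is not None:
            current = tok.lang
        filled.append((current, tok.text))
    return [
        (lang, join_tokens([text for _, text in group]))
        for lang, group in groupby(filled, key=lambda pair: pair[0])
    ]


# The example utterance from the listing above.
TOKENS = [
    Token("Zan", "ha", 0.98), Token("zo", "ha", 0.98), Token("gobe", "ha", 0.97),
    Token(",", None, None),
    Token("but", "en", 0.99), Token("can", "en", 0.99), Token("you", "en", 0.99),
    Token("confirm", "en", 0.99), Token("the", "en", 0.99), Token("price?", "en", 0.99),
]

SPANS = language_spans(TOKENS)
# SPANS == [("ha", "Zan zo gobe,"), ("en", "but can you confirm the price?")]
```

Each span can then be routed to a language-specific step, for example Hausa keyword search on the first span and English intent extraction on the second.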
2 · Voice-pair switching in TTS
Maraba's response pipeline reads the per-token labels and selects the matching voice variant. The output audio switches phonology cleanly mid-sentence — no acoustic glitches.
# Mixed-language response synthesis
text = "Sannu, your delivery is on the way."
#       ^^^^^  ^
#       ha     en-NG
audio = orinode_tts.synthesise(
    text,
    voice_pair=("ha-aisha", "en-ng-tola"),
    detect_per_token=True,
)
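To make the voice-selection step more concrete, here is a minimal sketch of how labelled words could be turned into per-voice segments before synthesis. It is not Maraba's implementation: `plan_segments`, the `fallback` argument, and the rule that unlabelled punctuation stays with the current voice are all hypothetical. Only the two voice IDs and the `ha` / `en-NG` labels come from the example above.

```python
from typing import Dict, List, Optional, Tuple

# Voice IDs taken from the voice_pair in the example; the dict layout is an assumption.
VOICE_PAIR: Dict[str, str] = {"ha": "ha-aisha", "en-NG": "en-ng-tola"}


def plan_segments(
    labelled: List[Tuple[str, Optional[str]]],
    voices: Dict[str, str],
    fallback: str,
) -> List[Tuple[str, str]]:
    """Group labelled words into (voice_id, text) segments.

    Words whose label is missing or unknown keep the language currently in use,
    starting from `fallback`.
    """
    segments: List[Tuple[str, str]] = []
    lang = fallback
    for word, label in labelled:
        if label in voices:
            lang = label
        voice = voices[lang]
        if segments and segments[-1][0] == voice:
            # Punctuation-only tokens are glued on without a leading space.
            sep = " " if any(ch.isalnum() for ch in word) else ""
            segments[-1] = (voice, segments[-1][1] + sep + word)
        else:
            segments.append((voice, word))
    return segments


LABELLED = [
    ("Sannu", "ha"), (",", None),
    ("your", "en-NG"), ("delivery", "en-NG"), ("is", "en-NG"),
    ("on", "en-NG"), ("the", "en-NG"), ("way", "en-NG"), (".", None),
]

PLAN = plan_segments(LABELLED, VOICE_PAIR, fallback="en-NG")
# PLAN == [("ha-aisha", "Sannu,"), ("en-ng-tola", "your delivery is on the way.")]
```

Keeping trailing punctuation with the preceding voice means the pause after "Sannu," is rendered by the Hausa voice, which is one plausible way to avoid a switch in the middle of a prosodic boundary.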
Public benchmark — code-switch WER
500-utterance Hausa-English code-switch test set, transcribed by 3 native speakers. WER computed against the consensus reference.
| System | WER (overall) | WER (Hausa tokens) | WER (EN tokens) | Per-token lang acc. |
| --- | --- | --- | --- | --- |
| Whisper-large-v3 | 41.7% | 58.9% | 22.3% | — |
| Google Cloud STT (ha) | 52.4% | 46.1% | 61.0% | — |
| Meta MMS-1B-all | 34.2% | 39.5% | 27.8% | — |
| Orinode STT | 12.6% | 13.4% | 11.8% | 97.2% |
The full benchmark write-up across all three Nigerian language pairs (Hausa-EN, Yoruba-EN, Igbo-EN) is in our trilingual code-switching benchmark post. The eval notebook and reference transcripts are coming to GitHub. Raw scores are at /benchmarks.json.
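For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and a system transcript, divided by the number of reference words. The sketch below is a generic textbook implementation, not the benchmark's own scoring script. It assumes whitespace tokenisation and applies no case or punctuation normalisation, and the sample sentences are invented for illustration.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative only: one substitution ("zo" -> "so") and one deletion ("the").
REFERENCE = "zan zo gobe but can you confirm the price"
HYPOTHESIS = "zan so gobe but can you confirm price"
SCORE = wer(REFERENCE, HYPOTHESIS)  # 2 errors over 9 reference words
```

Per-language columns such as "WER (Hausa tokens)" would additionally need the reference words to carry language labels so that errors can be attributed to each language. How the benchmark does that attribution is not specified here.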
Supported language pairs
All combinations of the five core languages are handled by the same model — no separate fine-tunes per pair.
ha ↔ en: Hausa-English
yo ↔ en: Yoruba-English
ig ↔ en: Igbo-English
pcm ↔ en: Pidgin-English
ha ↔ yo ↔ en: Three-language mix
Frequently asked
What is code-switching in voice AI?
Code-switching is when a speaker alternates between two or more languages within a single conversation — often mid-sentence. In Nigeria, this is the default mode of business communication.
Why does code-switching break most STT systems?
Most speech-to-text models commit to one language label per utterance. When the speaker switches mid-sentence, the model either transcribes the non-dominant words as gibberish or forces them into the wrong language's phonology.
How does Maraba handle code-switching differently?
Orinode STT tags every word independently. A Hausa-English call gets a per-token sequence like ['ha','ha','ha','en','en','en'] with confidence scores. Downstream tasks read the right language for the right span.
Which language pairs does Maraba support?
Hausa-English, Yoruba-English, Igbo-English, Pidgin-English, and any three-way combination. All five languages share one model.
Can I access code-switch STT via API?
Yes. See /products/orinode-stt/. Pass detect_code_switch: true in your request and you'll receive per-token language labels.
Stop forcing customers to pick one language.
Maraba speaks the way Nigerians actually speak. Request a beta key today.
Request beta