Code-switching · Production-grade

"Zan zo gobe, but can you confirm the price?"

The way Nigerians actually speak. Maraba is the only commercial voice AI that handles mid-sentence Hausa-English, Yoruba-English and Igbo-English code-switching — with per-token language labels and no awkward pauses.

WER on Hausa-EN code-switch: 12.6% (Whisper: 41.7%)
Language label granularity: per-token
Languages, any combination: 5
Awkward pauses: 0

Why code-switching is the hardest problem in Nigerian voice AI

In Lagos boardrooms, Kano markets, and Enugu pharmacies, no real conversation stays in one language for long. Code-switching isn't a bug — it's the dominant mode of Nigerian business communication. Yet every off-the-shelf STT model treats it as noise.

🧠
One-language-per-utterance assumption
Whisper, Google Cloud STT, Wit.ai — all commit to a single language label for the whole audio file. They have no way to express "the first three words were Hausa, the rest English."
🔇
Phonology collisions
English vowels are forced into Hausa phonotactics, or vice versa — producing nonsense transcripts. The downstream LLM never sees the actual words.
⏸️
Awkward voice-agent pauses
When the language switches, the agent's TTS pipeline often hesitates — the dead air signals "this AI doesn't understand me" and callers hang up.
📊
Analytics fail silently
Intent extraction, sentiment analysis, keyword search — all read the broken transcript and produce confidently wrong results. You don't even know it's broken.

How Maraba handles it

Per-token language labels, dual-vocab decoder, voice-pair switching. One model, five Nigerian languages, any combination.

1 · Per-token language tagging

Orinode STT's decoder emits a (token, language, confidence) triple at every step. The labels are part of the output, not metadata bolted on after.

"Zan"     ha 0.98
"zo"      ha 0.98
"gobe,"   ha 0.97
"but"     en 0.99
"can"     en 0.99
"you"     en 0.99
"confirm" en 0.99
"the"     en 0.99
"price?"  en 0.99

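One way a downstream task might consume these triples is sketched below. It is an illustration only: the LabeledToken class and both helper names are hypothetical and not part of the Orinode SDK, and the token values are copied from the example above.

```python
from collections import defaultdict
from typing import NamedTuple


class LabeledToken(NamedTuple):
    """Hypothetical container for one (token, language, confidence) triple."""
    text: str
    lang: str
    confidence: float


# Values copied from the example transcript; punctuation handling is simplified.
tokens = [
    LabeledToken("Zan", "ha", 0.98),
    LabeledToken("zo", "ha", 0.98),
    LabeledToken("gobe", "ha", 0.97),
    LabeledToken("but", "en", 0.99),
    LabeledToken("can", "en", 0.99),
    LabeledToken("you", "en", 0.99),
    LabeledToken("confirm", "en", 0.99),
    LabeledToken("the", "en", 0.99),
    LabeledToken("price?", "en", 0.99),
]


def switch_points(tokens):
    """Return the indices where the language label differs from the previous token."""
    return [i for i in range(1, len(tokens)) if tokens[i].lang != tokens[i - 1].lang]


def words_by_language(tokens, min_confidence=0.0):
    """Bucket token texts by language label, skipping tokens below min_confidence."""
    buckets = defaultdict(list)
    for tok in tokens:
        if tok.confidence >= min_confidence:
            buckets[tok.lang].append(tok.text)
    return dict(buckets)


print(switch_points(tokens))  # [3]
print(words_by_language(tokens))
```

A keyword search or intent model could then be pointed at the Hausa bucket and the English bucket separately, instead of at one mixed string.
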
2 · Voice-pair switching in TTS

Maraba's response pipeline reads the per-token labels and selects the matching voice variant. The output audio switches phonology cleanly mid-sentence — no acoustic glitches.

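# Mixed-language response synthesis
# (orinode_tts is assumed to be an already-initialised Orinode TTS client)
text = "Sannu, your delivery is on the way."
#       ^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#       ha     en-NG

audio = orinode_tts.synthesise(
    text,
    voice_pair=("ha-aisha", "en-ng-tola"),  # Hausa voice, Nigerian English voice
    detect_per_token=True,
)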
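
To make the idea of label-driven voice selection concrete, here is a minimal sketch of how a synthesis plan could be built from labelled words. It is not Maraba's actual pipeline: the plan_segments helper and the VOICE_FOR_LANG mapping are hypothetical, and the labels are written by hand rather than detected.

```python
from itertools import groupby

# Hypothetical lookup from language label to voice id, using the voice pair above.
VOICE_FOR_LANG = {"ha": "ha-aisha", "en": "en-ng-tola"}


def plan_segments(labeled_words, voice_for_lang, default_voice):
    """Merge consecutive same-language words into runs and attach a voice to each run."""
    plan = []
    for lang, run in groupby(labeled_words, key=lambda pair: pair[1]):
        text = " ".join(word for word, _ in run)
        plan.append((voice_for_lang.get(lang, default_voice), text))
    return plan


# Hand-labelled version of the example sentence (labels assumed, not detected).
labeled = [
    ("Sannu,", "ha"),
    ("your", "en"),
    ("delivery", "en"),
    ("is", "en"),
    ("on", "en"),
    ("the", "en"),
    ("way.", "en"),
]

plan = plan_segments(labeled, VOICE_FOR_LANG, default_voice="en-ng-tola")
for voice, text in plan:
    print(f"{voice}: {text}")
```

Grouping into runs first means the voice changes only at a real language boundary, not on every word.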

Public benchmark — code-switch WER

500-utterance Hausa-English code-switch test set, transcribed by 3 native speakers. WER computed against the consensus reference.

System                 WER (overall)  WER (Hausa tokens)  WER (EN tokens)  Per-token lang acc.
Whisper-large-v3       41.7%          58.9%               22.3%            -
Google Cloud STT (ha)  52.4%          46.1%               61.0%            -
Meta MMS-1B-all        34.2%          39.5%               27.8%            -
Orinode STT            12.6%          13.4%               11.8%            97.2%
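
For readers who want to run a similar check on their own transcripts, the sketch below uses the standard definition of WER: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is not the benchmark's scoring script; the lowercasing and whitespace tokenisation are assumptions, and the page does not say how the per-language WER columns are split.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(ref_text, hyp_text):
    """WER = edit distance / number of reference words (assumed normalisation)."""
    ref = ref_text.lower().split()
    hyp = hyp_text.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


def label_accuracy(ref_labels, hyp_labels):
    """Fraction of tokens whose language label matches the reference label."""
    if not ref_labels or len(ref_labels) != len(hyp_labels):
        raise ValueError("label sequences must be non-empty and aligned")
    correct = sum(r == h for r, h in zip(ref_labels, hyp_labels))
    return correct / len(ref_labels)


reference = "zan zo gobe but can you confirm the price"
hypothesis = "zan so gobe but can you confirm price"
print(f"WER: {wer(reference, hypothesis):.3f}")  # one substitution + one deletion over 9 words
print(label_accuracy(["ha", "ha", "ha", "en"], ["ha", "ha", "en", "en"]))
```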

The full benchmark write-up across all three Nigerian language pairs (Hausa-EN, Yoruba-EN, Igbo-EN) is in our trilingual code-switching benchmark post. The eval notebook and reference transcripts will be published on GitHub. Raw scores are at /benchmarks.json.

Supported language pairs

All combinations of the five core languages are handled by the same model — no separate fine-tunes per pair.

ha ↔ en: Hausa-English
yo ↔ en: Yoruba-English
ig ↔ en: Igbo-English
pcm ↔ en: Pidgin-English
ha ↔ yo ↔ en: Three-language mix

Frequently asked

What is code-switching in voice AI?
Code-switching is when a speaker alternates between two or more languages within a single conversation — often mid-sentence. In Nigeria, this is the default mode of business communication.
Why does code-switching break most STT systems?
Most speech-to-text models commit to one language label per utterance. When the speaker switches mid-sentence, the model either transcribes the non-dominant words as gibberish or forces them into the wrong language's phonology.
How does Maraba handle code-switching differently?
Orinode STT tags every word independently. A Hausa-English call gets a per-token sequence like ['ha','ha','ha','en','en','en'] with confidence scores. Downstream tasks read the right language for the right span.
Which language pairs does Maraba support?
Hausa-English, Yoruba-English, Igbo-English, Pidgin-English, and any three-way combination. All five languages share one model.
Can I access code-switch STT via API?
Yes — see /products/orinode-stt/. Pass detect_code_switch: true in your request and you'll receive per-token language labels.
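
As a rough illustration of the request described in the last answer, the sketch below builds a JSON body with detect_code_switch set to true and reads language labels back from a response. Only the detect_code_switch flag comes from this page; the audio_url field, the tokens and language response keys, and both helper names are placeholders, so check /products/orinode-stt/ for the real schema.

```python
import json


def build_stt_request(audio_url, detect_code_switch=True):
    """Serialise a request body. 'audio_url' is a placeholder field name."""
    return json.dumps({
        "audio_url": audio_url,
        "detect_code_switch": detect_code_switch,  # flag named on this page
    })


def labels_from_response(body):
    """Extract the per-token language labels from an assumed response shape."""
    data = json.loads(body)
    return [token["language"] for token in data["tokens"]]


# Hand-written stand-in for a response; the real shape may differ.
sample_response = json.dumps({
    "tokens": [
        {"text": "Zan", "language": "ha", "confidence": 0.98},
        {"text": "zo", "language": "ha", "confidence": 0.98},
        {"text": "gobe", "language": "ha", "confidence": 0.97},
        {"text": "but", "language": "en", "confidence": 0.99},
    ]
})

print(build_stt_request("https://example.com/call.wav"))
print(labels_from_response(sample_response))  # ['ha', 'ha', 'ha', 'en']
```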

Stop forcing customers to pick one language.

Maraba speaks the way Nigerians actually speak. Request a beta key today.
