Colloquial phonology · conjunction
ובדרך
Prescriptive nikud: uvadˈeʁeχ
Spoken norm: vebadˈeʁeχ
Prescriptive vocalization and spoken pronunciation do not always match.
Standard Hebrew G2P predicts vowel diacritics (nikud) and follows prescriptive rules. Everyday speech shifts vowels, simplifies conjunctions, and pronounces loanwords by ear.
ReNikud uses weak supervision from speech to train a character-aligned pseudo-vocalization model for spoken Hebrew G2P.
Colloquial pronunciation
בירושלים
Prescriptive nikud: biʁuʃalˈajim
Spoken norm: bejeʁuʃalˈajim
Weak labels from audio, then per-character pseudo-vocalization.
Parallel ASR on unlabeled speech produces a Hebrew transcript and an IPA transcript for each clip. An FST filter keeps only the pairs that align: 1.52M sentences from ~1.7k hours.
Same clip · two ASR transcripts
ʃalˈomʃalˈom
ʃagˈom
A character-level encoder with three parallel heads per letter: consonant, vowel, and stress. It is trained on the FST-aligned pseudo-labels from step 1; constrained decoding yields spoken IPA.
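Per character · שלום shalom → ʃalˈom
ש → consonant /ʃ/, vowel a
ל → consonant /l/, vowel o
ו → consonant ∅, vowel ∅
ם → consonant /m/, vowel ∅
Read each line as: Hebrew letter → consonant, vowel. The stress head adds a third label per letter.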
abdefhijklmnopstuvwzɡʁʃʒʔχ ∅
All non-vowel phoneme symbols, or no consonant.
Vowel · 7 classes
Vowel targets
a e i o u ∅
Hebrew vowel output is one vowel or none.
Stress · binary
Stress target
ˈ / ∅
Binary head: stress mark or no stress.
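To make the three-head output concrete, here is a minimal Python sketch of how per-letter labels could be joined into an IPA string. The `LetterLabel` type and `compose_ipa` function are hypothetical names, not part of ReNikud. Two things are assumptions: the stress mark is written immediately before the stressed vowel, matching the transcriptions on this page, and the stress label for שלום sits on ל, which is inferred from `ʃalˈom`.

```python
from typing import List, NamedTuple, Optional


class LetterLabel(NamedTuple):
    """One prediction per Hebrew letter; None stands for the ∅ class."""
    consonant: Optional[str]
    vowel: Optional[str]
    stressed: bool


def compose_ipa(labels: List[LetterLabel]) -> str:
    """Concatenate consonant, optional stress mark, and vowel per letter."""
    out = []
    for label in labels:
        if label.consonant:
            out.append(label.consonant)
        if label.vowel:
            # Assumption: the stress mark directly precedes the vowel.
            if label.stressed:
                out.append("ˈ")
            out.append(label.vowel)
        # A stress flag on a letter without a vowel is ignored in this sketch.
    return "".join(out)


# שלום, one label per letter (ש, ל, ו, ם).
shalom = [
    LetterLabel("ʃ", "a", False),
    LetterLabel("l", "o", True),    # stress position assumed from ʃalˈom
    LetterLabel(None, None, False), # vav carries no consonant and no vowel
    LetterLabel("m", None, False),
]

print(compose_ipa(shalom))  # ʃalˈom
```

Because every letter receives exactly one label per head, the output length is tied to the input spelling, which is what makes the prediction character-aligned.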
Spoken Hebrew pronunciation cases from the MILIM Benchmark.
Lexical slang
פאדיחה
Prescriptive nikud: padiχˈa
Slang norm: fadˈiχa
Colloquial phonology · conjunction
ומשפחה
Normative /u/: umiʃpaχˈa
Colloquial /ve/: vemiʃpaχˈa
Penultimate stress · loanword
קונספט
Mil’ra bias: konsˈept
Mil’el target: kˈonsept
Rare phoneme · /w/
וויסקי
Vav · /v/: vˈiski
Loanword · /w/: wˈiski
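The stress contrasts in these cases can be checked mechanically. The following Python sketch classifies a transcription as ultimate (mil’ra) or penultimate (mil’el) stress. The function name `stress_pattern` is hypothetical, and the sketch assumes that each vowel symbol is one syllable nucleus and that the stress mark directly precedes the stressed vowel, as in the transcriptions above.

```python
VOWELS = set("aeiou")
STRESS = "ˈ"


def stress_pattern(ipa: str) -> str:
    """Classify stress by counting vowels from the stress mark to the end."""
    if STRESS not in ipa:
        return "unmarked"
    # Everything after the stress mark starts with the stressed vowel.
    after = ipa.split(STRESS, 1)[1]
    remaining = sum(ch in VOWELS for ch in after)
    if remaining == 1:
        return "ultimate"      # mil'ra: stress on the last syllable
    if remaining == 2:
        return "penultimate"   # mil'el: stress on the second-to-last syllable
    return "other"


for form in ["konsˈept", "kˈonsept", "padiχˈa", "fadˈiχa"]:
    print(form, stress_pattern(form))
```

A check like this only inspects the stress mark; consonant contrasts such as /v/ versus /w/ in וויסקי need a symbol-level comparison instead.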
Modern Hebrew grapheme-to-phoneme conversion is hard because Hebrew orthography is an abjad, with most vowels omitted in writing. Common pipelines infer pronunciation via nikud, but this depends on scarce annotated vocalization data, does not fully represent lexical stress, and reflects prescriptive norms more than everyday spoken usage.
ReNikud is an audio-supervised Hebrew G2P method. It builds phonemic pseudo-labels from large unlabeled Hebrew speech corpora using ASR, then trains a character-aligned pseudo-vocalization model that predicts IPA realizations at each grapheme position. On established Hebrew G2P benchmarks and targeted spoken-Hebrew evaluations, ReNikud improves over prior baselines.
@misc{melichov2026renikud,
title={ReNikud: Audio-Supervised Hebrew Grapheme-to-Phoneme Conversion},
author={Maxim Melichov and Yakov Kolani and Morris Alper},
year={2026},
url={https://arxiv.org/pdf/2606.20179},
}