Module:bg-pronunciation
This module automatically converts Bulgarian orthography to a phonetic transcription in the International Phonetic Alphabet. It also generates hyphenations and syllabifications. The show_all entry point generates all parts of a pronunciation section together: the transcription, hyphenation and syllabification, plus audio and rhymes.
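As a rough usage sketch (not taken from the module's own documentation), the two exported functions visible in the source below, toIPA(term, endschwa) and remove_pron_notations(text, remove_grave), could be called from another Lua module roughly as follows. The helper name transcribe_samples is hypothetical, and the sketch assumes it is handed the table returned by require("Module:bg-pronunciation") inside Scribunto. The IPA values quoted in the comments are copied from the test tables below, not independently verified.

```lua
-- Hypothetical helper: `bg` is assumed to be the table returned by
-- require("Module:bg-pronunciation") when running inside Scribunto.
local function transcribe_samples(bg)
    return {
        -- An acute accent marks primary stress; the test table lists ˈkɤʃtɐ.
        ipa = bg.toIPA("къ́ща"),
        -- The second argument is `endschwa`; the test table lists zɡɐˈstʲɤ̟ sɛ.
        ipa_endschwa = bg.toIPA("сгъстя́ се", true),
        -- Strips "." break markers and dot-below marks for display;
        -- grave accents are kept because remove_grave is false.
        display = bg.remove_pron_notations("над.живея", false),
    }
end
```

Inside Scribunto this would be invoked as transcribe_samples(require("Module:bg-pronunciation")). For the display field, the source of remove_pron_notations implies the result надживея, since only the "." marker is present in the input.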
Testcases
All tests passed.
| Text | Expected | Actual | |
|---|---|---|---|
| височина (visočina) | ви‧со‧чи‧на | ви‧со‧чи‧на | |
| сестра (sestra) | сес‧тра | сес‧тра | |
| пленник (plennik) | плен‧ник | плен‧ник | |
| преодолея (preodoleja) | пре‧одо‧лея | пре‧одо‧лея | |
| маоизъм (maoizǎm) | мао‧изъм | мао‧изъм | |
| майка (majka) | май‧ка | май‧ка | |
| айс.берг (ajs.berg) | айс‧берг | айс‧берг | |
| майор (major) | ма‧йор | ма‧йор | |
| фризьор (frizjor) | фри‧зьор | фри‧зьор | |
| суджук (sudžuk) | су‧джук | су‧джук | |
| над.живея (nad.živeja) | над‧жи‧вея | над‧жи‧вея | |
| сестра (sestra) | сес‧тра | сес‧тра | |
| потури (poturi) | по‧ту‧ри | по‧ту‧ри | |
| сланина (slanina) | сла‧ни‧на | сла‧ни‧на | |
| пража (praža) | пра‧жа | пра‧жа | |
| спринцовка (sprincovka) | сприн‧цов‧ка | сприн‧цов‧ка | |
| пържа (pǎrža) | пър‧жа | пър‧жа | |
| яркост (jarkost) | яр‧кост | яр‧кост | |
| рало (ralo) | ра‧ло | ра‧ло | |
| белило (belilo) | бе‧ли‧ло | бе‧ли‧ло | |
| шевица (ševica) | ше‧ви‧ца | ше‧ви‧ца | |
| доило (doilo) | до‧ило | до‧ило | |
| начало (načalo) | на‧ча‧ло | на‧ча‧ло | |
| хитрост (hitrost) | хит‧рост | хит‧рост | |
| хитър (hitǎr) | хи‧тър | хи‧тър | |
| шевица (ševica) | ше‧ви‧ца | ше‧ви‧ца | |
| вдлъбна (vdlǎbna) | вдлъб‧на | вдлъб‧на | |
| размахам (razmaham) | раз‧ма‧хам | раз‧ма‧хам | |
| укор (ukor) | укор | укор | |
| упорит (uporit) | упо‧рит | упо‧рит | |
| осем (osem) | осем | осем | |
| оценка (ocenka) | оцен‧ка | оцен‧ка | |
| лея (leja) | лея | лея | |
| аз (az) | аз | аз | |
| тя (tja) | тя | тя | |
| е (e) | е | е | |
| мен (men) | мен | мен | |
| страст (strast) | страст | страст | |
| пръст (prǎst) | пръст | пръст | |
| шофьор (šofjor) | шо‧фьор | шо‧фьор | |
| фотьойл (fotjojl) | фо‧тьойл | фо‧тьойл | |
| бельо (beljo) | бе‧льо | бе‧льо | |
| шедьовър (šedjovǎr) | ше‧дьо‧вър | ше‧дьо‧вър | |
| мениджър (menidžǎr) | ме‧ни‧джър | ме‧ни‧джър | |
| джудже (džudže) | джу‧дже | джу‧дже | |
| жар-птица (žar-ptica) | жар-пти‧ца | жар-пти‧ца | |
| морално-нравствен (moralno-nravstven) | мо‧рал‧но-нрав‧ствен | мо‧рал‧но-нрав‧ствен | |
| кандидат-студент (kandidat-student) | кан‧ди‧дат-сту‧дент | кан‧ди‧дат-сту‧дент | |
| министър-председател (ministǎr-predsedatel) | ми‧нис‧тър-пред‧се‧да‧тел | ми‧нис‧тър-пред‧се‧да‧тел | |
| член-кореспондент (člen-korespondent) | член-ко‧рес‧пон‧дент | член-ко‧рес‧пон‧дент | |
| бизнес администрация (biznes administracija) | биз‧нес ад‧ми‧нис‧тра‧ция | биз‧нес ад‧ми‧нис‧тра‧ция | |
| екшън герой (ekšǎn geroj) | ек‧шън ге‧рой | ек‧шън ге‧рой | |
| тенис корт (tenis kort) | те‧нис корт | те‧нис корт | |
| заместник министър-председател (zamestnik ministǎr-predsedatel) | за‧мес‧тник ми‧нис‧тър-пред‧се‧да‧тел | за‧мес‧тник ми‧нис‧тър-пред‧се‧да‧тел | |
| заместник началник-управление (zamestnik načalnik-upravlenie) | за‧мес‧тник на‧чал‧ник-уп‧рав‧ле‧ние | за‧мес‧тник на‧чал‧ник-уп‧рав‧ле‧ние | |
| SIM карта (SIM karta) | SIM кар‧та | SIM кар‧та | |
| VIP зона (VIP zona) | VIP зо‧на | VIP зо‧на | |

| Text | Expected | Actual | |
|---|---|---|---|
| къ́ща (kǎ́šta) | ˈkɤʃtɐ | ˈkɤʃtɐ | |
| сгъстя́ се (sgǎstjá se), endschwa=true | zɡɐˈstʲɤ̟ sɛ | zɡɐˈstʲɤ̟ sɛ | |
| сгъстя́ се (sgǎstjá se) (respelled сгъстя̣́ се) | zɡɐˈstʲɤ̟ sɛ | zɡɐˈstʲɤ̟ sɛ | |
| а̀бдики́ращ (àbdikírašt) | ˌabdiˈkirɐʃt | ˌabdiˈkirɐʃt | |
| безшу́мен (bezšúmen) | bɛʃˈʃu̟mɛn | bɛʃˈʃu̟mɛn | |
| щастли́в (štastlív) | ʃtɐˈslif | ʃtɐˈslif | |
| народността́ (narodnosttá) | nɐrodnoˈsta | nɐrodnoˈsta | |
| я (ja) | ja̟ | ja̟ | |
| юг (jug) | ju̟k | ju̟k | |
| яйце́ (jajcé) | jɐjˈt͡sɛ | jɐjˈt͡sɛ | |
| изя́м (izjám) | iˈzʲa̟m | iˈzʲa̟m | |
| учи́лище (učílište) | oˈt͡ʃiliʃtɛ | oˈt͡ʃiliʃtɛ | |
| чорбаджи́я (čorbadžíja) | t͡ʃo̟rbɐˈdʒijɐ | t͡ʃo̟rbɐˈdʒijɐ | |
| уби́йца (ubíjca) | oˈbijt͡sɐ | oˈbijt͡sɐ | |
| безбра́чие (bezbráčie) | bɛzˈbrat͡ʃiɛ | bɛzˈbrat͡ʃiɛ | |
| измра́ (izmrá) (respelled из.мра́) | izˈmra | izˈmra | |
| сала́та (saláta) | sɐˈɫatɐ | sɐˈɫatɐ | |
| шега́ (šegá) | ʃɛˈɡa | ʃɛˈɡa | |
| жена́ (žená) | ʒɛˈna | ʒɛˈna | |
| инти́мен (intímen) | inˈtimɛn | inˈtimɛn | |
| посо́лство (posólstvo) | poˈsɔɫstvo | poˈsɔɫstvo | |
| ъ́гъл (ǎ́gǎl) | ˈɤɡɐɫ | ˈɤɡɐɫ | |
| усу́квам (usúkvam) | oˈsukvɐm | oˈsukvɐm | |
| ле́ща (léšta) | ˈlɛʃtɐ | ˈlɛʃtɐ | |
| липа́ (lipá) | liˈpa | liˈpa | |
| океа́н (okeán) | okɛˈan | okɛˈan | |
| меки́ца (mekíca) | mɛˈkit͡sɐ | mɛˈkit͡sɐ | |
| ла́гер (láger) | ˈɫaɡɛr | ˈɫaɡɛr | |
| маги́я (magíja) | mɐˈɡijɐ | mɐˈɡijɐ | |
| хем (hem) | xɛm | xɛm | |
| химн (himn) | ximn | ximn | |
| тулу́п (tulúp) | toˈɫup | toˈɫup | |
| жа̀р-пти́ца (žàr-ptíca) | ˌʒa̟r-pˈtit͡sɐ | ˌʒa̟r-pˈtit͡sɐ | |
| в о́фис (v ófis) | f ˈɔfis | f ˈɔfis | |
| във Фра́нция (vǎv Fráncija) | vɐf ˈfrant͡sijɐ | vɐf ˈfrant͡sijɐ | |
| ня́колко (njákolko) | ˈnʲa̟koɫko | ˈnʲa̟koɫko | |
| в Япо́ния (v Japónija) | f jɐˈpɔnijɐ | f jɐˈpɔnijɐ | |
| автоплу́г (avtoplúg) | ɐftoˈpɫuk | ɐftoˈpɫuk | |
| уе́бса́йт (uébsájt) (respelled ўе́бса́йт) | ˈwɛpˈsajt | ˈwɛpˈsajt | |
| уе́лски (uélski) (respelled ўе́лски) | ˈwɛɫski | ˈwɛɫski | |
| уе́стърн (uéstǎrn) (respelled ўе́стърн) | ˈwɛstɐrn | ˈwɛstɐrn | |
| О́уен (Óuen) (respelled О́ўен) | ˈɔwɛn | ˈɔwɛn | |
| но́ухау (nóuhau) (respelled но́ўхаў) | ˈnɔwxɐw | ˈnɔwxɐw | |
| Джо́узеф (Džóuzef) (respelled Джо́ўзеф) | ˈdʒɔwzɛf | ˈdʒɔwzɛf | |
| бо́улинг (bóuling) (respelled бо́ўлинг) | ˈbɔwliŋk | ˈbɔwliŋk | |
| даунло́уд (daunlóud) (respelled даўнло́ўд) | dɐwnˈɫɔwt | dɐwnˈɫɔwt | |
| уи́ски (uíski) (respelled ўи́ски) | ˈwiski | ˈwiski | |
| уи́кенд (uíkend) (respelled ўи́кенд) | ˈwikɛnt | ˈwikɛnt | |
| Уо́руик (Uóruik) (respelled Ўо́рўик) | ˈwɔrwik | ˈwɔrwik | |
| Хе́лоуин (Hélouin) (respelled Хе́лоўин) | ˈxɛɫowin | ˈxɛɫowin | |

| Text | Expected | Actual | |
|---|---|---|---|
| а (a) | а | а | |
| в (v) | в | в | |
| е (e) | е | е | |
| и (i) | и | и | |
| ѝ (ì) | ѝ | ѝ | |
| о (o) | о | о | |
| с (s) | с | с | |
| у (u) | у | у | |
| аз (az) | аз | аз | |
| ти (ti) | ти | ти | |
| той (toj) | той | той | |
| тя (tja) | тя | тя | |
| във (vǎv) | във | във | |
| със (sǎs) | със | със | |
| принц (princ) | принц | принц | |
| спринт (sprint) | спринт | спринт | |
| глист (glist) | глист | глист | |
| скункс (skunks) | скункс | скункс | |
| ами (ami) | а‧ми | а‧ми | |
| ала (ala) | а‧ла | а‧ла | |
| ако (ako) | а‧ко | а‧ко | |
| уви (uvi) | у‧ви | у‧ви | |
| или (ili) | и‧ли | и‧ли | |
| саламура (salamura) | са‧ла‧му‧ра | са‧ла‧му‧ра | |
| барабан (baraban) | ба‧ра‧бан | ба‧ра‧бан | |
| сполука (spoluka) | спо‧лу‧ка | спо‧лу‧ка | |
| щавя (štavja) | ща‧вя | ща‧вя | |
| стрина (strina) | стри‧на | стри‧на | |
| когато (kogato) | ко‧га‧то | ко‧га‧то | |
| изям (izjam) | и‧зям | и‧зям | |
| старицата (staricata) | ста‧ри‧ца‧та | ста‧ри‧ца‧та | |
| получените (polučenite) | по‧лу‧че‧ни‧те | по‧лу‧че‧ни‧те | |
| подобаващите (podobavaštite) | по‧до‧ба‧ва‧щи‧те | по‧до‧ба‧ва‧щи‧те | |
| обучаващите (obučavaštite) | о‧бу‧ча‧ва‧щи‧те | о‧бу‧ча‧ва‧щи‧те | |
| джудже (džudže) | джу‧дже | джу‧дже | |
| суджук (sudžuk) | су‧джук | су‧джук | |
| дамаджана (damadžana) | да‧ма‧джа‧на | да‧ма‧джа‧на | |
| джаджите (džadžite) | джа‧джи‧те | джа‧джи‧те | |
| койот (kojot) | ко‧йот | ко‧йот | |
| майонеза (majoneza) | ма‧йо‧не‧за | ма‧йо‧не‧за | |
| пейоративен (pejorativen) | пе‧йо‧ра‧ти‧вен | пе‧йо‧ра‧ти‧вен | |
| майор (major) | ма‧йор | ма‧йор | |
| безименен (bezimenen) | бе‧зи‧ме‧нен | бе‧зи‧ме‧нен | |
| изопачавам (izopačavam) | и‧зо‧па‧ча‧вам | и‧зо‧па‧ча‧вам | |
| отивам (otivam) | о‧ти‧вам | о‧ти‧вам | |
| разоран (razoran) | ра‧зо‧ран | ра‧зо‧ран | |
| бульон (buljon) | бу‧льон | бу‧льон | |
| фризьор (frizjor) | фри‧зьор | фри‧зьор | |
| шедьовър (šedjovǎr) | ше‧дьо‧вър | ше‧дьо‧вър | |
| гьозум (gjozum) | гьо‧зум | гьо‧зум | |
| ликьор (likjor) | ли‧кьор | ли‧кьор | |
| воал (voal) | во‧ал | во‧ал | |
| маоизъм (maoizǎm) | ма‧о‧и‧зъм | ма‧о‧и‧зъм | |
| феерия (feerija) | фе‧е‧ри‧я | фе‧е‧ри‧я | |
| воайор (voajor) | во‧а‧йор | во‧а‧йор | |
| миокард (miokard) | ми‧о‧кард | ми‧о‧кард | |
| кьопоолу (kjopoolu) | кьо‧по‧о‧лу | кьо‧по‧о‧лу | |
| аятолах (ajatolah) | а‧я‧то‧лах | а‧я‧то‧лах | |
| авария (avarija) | а‧ва‧ри‧я | а‧ва‧ри‧я | |
| позиции (pozicii) | по‧зи‧ци‧и | по‧зи‧ци‧и | |
| хазяи (hazjai) | ха‧зя‧и | ха‧зя‧и | |
| дерибеи (deribei) | де‧ри‧бе‧и | де‧ри‧бе‧и | |
| преодолея (preodoleja) | пре‧о‧до‧ле‧я | пре‧о‧до‧ле‧я | |
| нащрек (naštrek) | на‧щрек | на‧щрек | |
| поощрявам (pooštrjavam) | по‧о‧щря‧вам | по‧о‧щря‧вам | |
| защриховам (zaštrihovam) | за‧щри‧хо‧вам | за‧щри‧хо‧вам | |
| поощрителен (pooštritelen) | по‧о‧щри‧те‧лен | по‧о‧щри‧те‧лен | |
| изщракване (izštrakvane) | из‧щрак‧ва‧не | из‧щрак‧ва‧не | |
| Вайерщрас (Vajerštras) | Ва‧йер‧щрас | Ва‧йер‧щрас | |
| Кьонигщрасе (Kjonigštrase) | Кьо‧ниг‧щра‧се | Кьо‧ниг‧щра‧се | |
| общност (obštnost) | общ‧ност | общ‧ност | |
| всъщност (vsǎštnost) | всъщ‧ност | всъщ‧ност | |
| помощник (pomoštnik) | по‧мощ‧ник | по‧мощ‧ник | |
| чорапогащник (čorapogaštnik) | чо‧ра‧по‧гащ‧ник | чо‧ра‧по‧гащ‧ник | |
| нощница (noštnica) | нощ‧ни‧ца | нощ‧ни‧ца | |
| чудовищност (čudovištnost) | чу‧до‧вищ‧ност | чу‧до‧вищ‧ност | |
| немощливо (nemoštlivo) | не‧мощ‧ли‧во | не‧мощ‧ли‧во | |
| съобщавам (sǎobštavam) | съ‧об‧ща‧вам | съ‧об‧ща‧вам | |
| въобще (vǎobšte) | въ‧об‧ще | въ‧об‧ще | |
| манджа (mandža) | ман‧джа | ман‧джа | |
| калайджия (kalajdžija) | ка‧лай‧джи‧я | ка‧лай‧джи‧я | |
| авджия (avdžija) | ав‧джи‧я | ав‧джи‧я | |
| изджвака (izdžvaka) | из‧джва‧ка | из‧джва‧ка | |
| пленник (plennik) | плен‧ник | плен‧ник | |
| майка (majka) | май‧ка | май‧ка | |
| профашистки (profašistki) | про‧фа‧шист‧ки | про‧фа‧шист‧ки | |
| гледка (gledka) | глед‧ка | глед‧ка | |
| крачка (kračka) | крач‧ка | крач‧ка | |
| цедка (cedka) | цед‧ка | цед‧ка | |
| звезда (zvezda) | звез‧да | звез‧да | |
| спринцовка (sprincovka) | сприн‧цов‧ка | сприн‧цов‧ка | |
| бързо (bǎrzo) | бър‧зо | бър‧зо | |
| малко (malko) | мал‧ко | мал‧ко | |
| после (posle) | по‧сле | по‧сле | |
| партия (partija) | пар‧ти‧я | пар‧ти‧я | |
| гланцов (glancov) | глан‧цов | глан‧цов | |
| пепелник (pepelnik) | пе‧пел‧ник | пе‧пел‧ник | |
| пилци (pilci) | пил‧ци | пил‧ци | |
| аншоа (anšoa) | ан‧шо‧а | ан‧шо‧а | |
| ядро (jadro) | я‧дро | я‧дро | |
| ироничност (ironičnost) | и‧ро‧нич‧ност | и‧ро‧нич‧ност | |
| профилактична (profilaktična) | про‧фи‧лак‧тич‧на | про‧фи‧лак‧тич‧на | |
| боцна (bocna) | боц‧на | боц‧на | |
| спецна (specna) | спец‧на | спец‧на | |
| бичме (bičme) | бич‧ме | бич‧ме | |
| кръчма (krǎčma) | кръч‧ма | кръч‧ма | |
| боцман (bocman) | боц‧ман | боц‧ман | |
| сачма (sačma) | сач‧ма | сач‧ма | |
| Ричмънд (Ričmǎnd) | Рич‧мънд | Рич‧мънд | |
| мичман (mičman) | мич‧ман | мич‧ман | |
| разчеша (razčeša) | раз‧че‧ша | раз‧че‧ша | |
| пецма (pecma) | пец‧ма | пец‧ма | |
| сестра (sestra) | се‧стра | се‧стра | |
| царство (carstvo) | цар‧ство | цар‧ство | |
| нравствен (nravstven) | нрав‧ствен | нрав‧ствен | |
| мандраджия (mandradžija) | ман‧дра‧джи‧я | ман‧дра‧джи‧я | |
| мизансцен (mizanscen) | ми‧зан‧сцен | ми‧зан‧сцен | |
| странство (stranstvo) | стран‧ство | стран‧ство | |
| пространство (prostranstvo) | про‧стран‧ство | про‧стран‧ство | |
| робство (robstvo) | роб‧ство | роб‧ство | |
| транспорт (transport) | тран‧спорт | тран‧спорт | |
| посвикна (posvikna) | по‧свик‧на | по‧свик‧на | |
| скръндза (skrǎndza) | скрън‧дза | скрън‧дза | |
| годзила (godzila) | год‧зи‧ла | год‧зи‧ла | |
| камикадзе (kamikadze) | ка‧ми‧кад‧зе | ка‧ми‧кад‧зе | |
| надживея (nadživeja) | на‧джи‧ве‧я | на‧джи‧ве‧я | |
| скрън.дза (skrǎn.dza) | скрън‧дза | скрън‧дза | |
| го.дзила (go.dzila) | го‧дзи‧ла | го‧дзи‧ла | |
| камика.дзе (kamika.dze) | ка‧ми‧ка‧дзе | ка‧ми‧ка‧дзе | |
| над.живея (nad.živeja) | над‧жи‧ве‧я | над‧жи‧ве‧я | |
| безсилен (bezsilen) | без‧си‧лен | без‧си‧лен | |
| безшумен (bezšumen) | без‧шу‧мен | без‧шу‧мен | |
| безвъзвратен (bezvǎzvraten) | без‧въз‧вра‧тен | без‧въз‧вра‧тен | |
| безхаберен (bezhaberen) | без‧ха‧бе‧рен | без‧ха‧бе‧рен | |
| безстрашен (bezstrašen) | без‧стра‧шен | без‧стра‧шен | |
| безхлебна (bezhlebna) | без‧хле‧бна | без‧хле‧бна | |
| безвремие (bezvremie) | без‧вре‧ми‧е | без‧вре‧ми‧е | |
| безмерен (bezmeren) | без‧ме‧рен | без‧ме‧рен | |
| безличен (bezličen) | без‧ли‧чен | без‧ли‧чен | |
| безнаказан (beznakazan) | без‧на‧ка‧зан | без‧на‧ка‧зан | |
| безразборен (bezrazboren) | без‧раз‧бо‧рен | без‧раз‧бо‧рен | |
| бездетен (bezdeten) | без‧де‧тен | без‧де‧тен | |
| безпардонен (bezpardonen) | без‧пар‧до‧нен | без‧пар‧до‧нен | |
| безтелесен (beztelesen) | без‧те‧ле‧сен | без‧те‧ле‧сен | |
| безглав (bezglav) | без‧глав | без‧глав | |
| безчестен (bezčesten) | без‧че‧стен | без‧че‧стен | |
| безпризорен (bezprizoren) | без‧при‧зо‧рен | без‧при‧зо‧рен | |
| безгрешен (bezgrešen) | без‧гре‧шен | без‧гре‧шен | |
| безкраен (bezkraen) | без‧кра‧ен | без‧кра‧ен | |
| безбрежен (bezbrežen) | без‧бре‧жен | без‧бре‧жен | |
| бездна (bezdna) | безд‧на | безд‧на | |
| изхвърлям (izhvǎrljam) | из‧хвър‧лям | из‧хвър‧лям | |
| изстена (izstena) | из‧сте‧на | из‧сте‧на | |
| извор (izvor) | из‧вор | из‧вор | |
| извозвам (izvozvam) | из‧воз‧вам | из‧воз‧вам | |
| извлача (izvlača) | из‧вла‧ча | из‧вла‧ча | |
| изхрачване (izhračvane) | из‧храч‧ва‧не | из‧храч‧ва‧не | |
| изшмугна (izšmugna) | из‧шмуг‧на | из‧шмуг‧на | |
| изживяното (izživjanoto) | из‧жи‧вя‧но‧то | из‧жи‧вя‧но‧то | |
| изненада (iznenada) | из‧не‧на‧да | из‧не‧на‧да | |
| излъгах (izlǎgah) | из‧лъ‧гах | из‧лъ‧гах | |
| измяна (izmjana) | из‧мя‧на | из‧мя‧на | |
| изрод (izrod) | из‧род | из‧род | |
| изтрезвително (iztrezvitelno) | из‧трез‧ви‧тел‧но | из‧трез‧ви‧тел‧но | |
| изпроставял (izprostavjal) | из‧про‧ста‧вял | из‧про‧ста‧вял | |
| изключвам (izključvam) | из‧ключ‧вам | из‧ключ‧вам | |
| изблиза (izbliza) | из‧бли‧за | из‧бли‧за | |
| надслов (nadslov) | над‧слов | над‧слов | |
| надхвърлен (nadhvǎrlen) | над‧хвър‧лен | над‧хвър‧лен | |
| надвиквам (nadvikvam) | над‧вик‧вам | над‧вик‧вам | |
| надве (nadve) | над‧ве | над‧ве | |
| надгробен (nadgroben) | над‧гро‧бен | над‧гро‧бен | |
| надпис (nadpis) | над‧пис | над‧пис | |
| надценявам (nadcenjavam) | над‧це‧ня‧вам | над‧це‧ня‧вам | |
| надделея (naddeleja) | над‧де‧ле‧я | над‧де‧ле‧я | |
| над.раствам (nad.rastvam) | над‧ра‧ствам | над‧ра‧ствам | |
| надмощие (nadmoštie) | над‧мо‧щи‧е | над‧мо‧щи‧е | |
| ненадминат (nenadminat) | не‧над‧ми‧нат | не‧над‧ми‧нат | |
| безнадзорен (beznadzoren) | без‧над‧зо‧рен | без‧над‧зо‧рен | |
| надница (nadnica) | над‧ни‧ца | над‧ни‧ца | |
| надменност (nadmennost) | над‧мен‧ност | над‧мен‧ност | |
| на.длъж (na.dlǎž) | на‧длъж | на‧длъж | |
| надробен (nadroben) | на‧дро‧бен | на‧дро‧бен | |
| надрънкам (nadrǎnkam) | на‧дрън‧кам | на‧дрън‧кам | |
| надраскам (nadraskam) | на‧дра‧скам | на‧дра‧скам | |
| надрусам (nadrusam) | на‧дру‧сам | на‧дру‧сам | |
| надран (nadran) | на‧дран | на‧дран | |
| подстрекател (podstrekatel) | под‧стре‧ка‧тел | под‧стре‧ка‧тел | |
| подход (podhod) | под‧ход | под‧ход | |
| подвижен (podvižen) | под‧ви‧жен | под‧ви‧жен | |
| подзаглавие (podzaglavie) | под‧за‧гла‧ви‧е | под‧за‧гла‧ви‧е | |
| подклаждам (podklaždam) | под‧клаж‧дам | под‧клаж‧дам | |
| подбор (podbor) | под‧бор | под‧бор | |
| подпирам (podpiram) | под‧пи‧рам | под‧пи‧рам | |
| подценявам (podcenjavam) | под‧це‧ня‧вам | под‧це‧ня‧вам | |
| подновявам (podnovjavam) | под‧но‧вя‧вам | под‧но‧вя‧вам | |
| подмамвам (podmamvam) | под‧мам‧вам | под‧мам‧вам | |
| подлост (podlost) | под‧лост | под‧лост | |
| под.разделение (pod.razdelenie) | под‧раз‧де‧ле‧ни‧е | под‧раз‧де‧ле‧ни‧е | |
| подробен (podroben) | по‧дро‧бен | по‧дро‧бен | |
| подражавам (podražavam) | по‧дра‧жа‧вам | по‧дра‧жа‧вам | |
| подремя (podremja) | по‧дре‧мя | по‧дре‧мя | |
| подрусам (podrusam) | по‧дру‧сам | по‧дру‧сам | |
| безизразен (bezizrazen) | бе‧зиз‧ра‧зен | бе‧зиз‧ра‧зен | |
| безизразност (bezizraznost) | бе‧зиз‧ра‧зност | бе‧зиз‧ра‧зност | |
| безвъзмезден (bezvǎzmezden) | без‧въз‧мез‧ден | без‧въз‧мез‧ден | |
| безвъздушен (bezvǎzdušen) | без‧въз‧ду‧шен | без‧въз‧ду‧шен | |
| безразличен (bezrazličen) | без‧раз‧ли‧чен | без‧раз‧ли‧чен | |
| безразборност (bezrazbornost) | без‧раз‧бор‧ност | без‧раз‧бор‧ност | |
| безпредметен (bezpredmeten) | без‧пред‧ме‧тен | без‧пред‧ме‧тен | |
| поизправя (poizpravja) | по‧из‧пра‧вя | по‧из‧пра‧вя | |
| поизмъча (poizmǎča) | по‧из‧мъ‧ча | по‧из‧мъ‧ча | |
| поизгладя (poizgladja) | по‧из‧гла‧дя | по‧из‧гла‧дя | |
| произношение (proiznošenie) | про‧из‧но‧ше‧ни‧е | про‧из‧но‧ше‧ни‧е | |
| произтича (proiztiča) | про‧из‧ти‧ча | про‧из‧ти‧ча | |
| наизмислил (naizmislil) | на‧из‧ми‧слил | на‧из‧ми‧слил | |
| наизлезлите (naizlezlite) | на‧из‧ле‧зли‧те | на‧из‧ле‧зли‧те | |
| предразположение (predrazpoloženie) | пред‧раз‧по‧ло‧же‧ни‧е | пред‧раз‧по‧ло‧же‧ни‧е | |
| преразглеждане (prerazgleždane) | пре‧раз‧глеж‧да‧не | пре‧раз‧глеж‧да‧не | |
| преразпределение (prerazpredelenie) | пре‧раз‧пре‧де‧ле‧ни‧е | пре‧раз‧пре‧де‧ле‧ни‧е | |
| преразказ (prerazkaz) | пре‧раз‧каз | пре‧раз‧каз | |
| превъзмогна (prevǎzmogna) | пре‧въз‧мог‧на | пре‧въз‧мог‧на | |
| превъзпитание (prevǎzpitanie) | пре‧въз‧пи‧та‧ни‧е | пре‧въз‧пи‧та‧ни‧е | |
| преиздавам (preizdavam) | пре‧из‧да‧вам | пре‧из‧да‧вам | |
| преизбирам (preizbiram) | пре‧из‧би‧рам | пре‧из‧би‧рам | |
| невъзможен (nevǎzmožen) | не‧въз‧мо‧жен | не‧въз‧мо‧жен | |
| невъзпитан (nevǎzpitan) | не‧въз‧пи‧тан | не‧въз‧пи‧тан | |
| неизбежен (neizbežen) | не‧из‧бе‧жен | не‧из‧бе‧жен | |
| неизменност (neizmennost) | не‧из‧мен‧ност | не‧из‧мен‧ност | |
| неразделен (nerazdelen) | не‧раз‧де‧лен | не‧раз‧де‧лен | |
| неразположение (nerazpoloženie) | не‧раз‧по‧ло‧же‧ни‧е | не‧раз‧по‧ло‧же‧ни‧е | |
| поразмисля (porazmislja) | по‧раз‧ми‧сля | по‧раз‧ми‧сля | |
| пораздрусам (porazdrusam) | по‧раз‧дру‧сам | по‧раз‧дру‧сам | |
| наразказах (narazkazah) | на‧раз‧ка‧зах | на‧раз‧ка‧зах | |
| наразлепил (narazlepil) | на‧раз‧ле‧пил | на‧раз‧ле‧пил | |
| неотложен (neotložen) | не‧от‧ло‧жен | не‧от‧ло‧жен | |
| неотменим (neotmenim) | не‧от‧ме‧ним | не‧от‧ме‧ним | |
| поотложа (pootloža) | по‧от‧ло‧жа | по‧от‧ло‧жа | |
| поотмина (pootmina) | по‧от‧ми‧на | по‧от‧ми‧на | |
| уелски (uelski) | у‧ел‧ски | у‧ел‧ски | |
| уебсайт (uebsajt) | у‧еб‧сайт | у‧еб‧сайт | |
| уестърн (uestǎrn) | у‧е‧стърн | у‧е‧стърн | |
| Оуен (Ouen) | О‧у‧ен | О‧у‧ен | |
| ноухау (nouhau) | но‧у‧ха‧у | но‧у‧ха‧у | |
| Джоузеф (Džouzef) | Джо‧у‧зеф | Джо‧у‧зеф | |
| боулинг (bouling) | бо‧у‧линг | бо‧у‧линг | |
| даунлоуд (daunloud) | да‧ун‧ло‧уд | да‧ун‧ло‧уд | |
| уиски (uiski) | у‧и‧ски | у‧и‧ски | |
| уикенд (uikend) | у‧и‧кенд | у‧и‧кенд | |
| Уоруик (Uoruik) | У‧о‧ру‧ик | У‧о‧ру‧ик | |
| Хелоуин (Helouin) | Хе‧ло‧у‧ин | Хе‧ло‧у‧ин | |
| ўелски | уел‧ски | уел‧ски | |
| ўебсайт | уеб‧сайт | уеб‧сайт | |
| ўестърн | уе‧стърн | уе‧стърн | |
| Оўен | О‧уен | О‧уен | |
| ноўхаў | ноу‧хау | ноу‧хау | |
| Джоўзеф | Джоу‧зеф | Джоу‧зеф | |
| боўлинг | боу‧линг | боу‧линг | |
| даўн.лоўд | даун‧лоуд | даун‧лоуд | |
| ўиски | уи‧ски | уи‧ски | |
| ўикенд | уи‧кенд | уи‧кенд | |
| Ўорўик | Уор‧уик | Уор‧уик | |
| Хелоўин | Хе‧ло‧уин | Хе‧ло‧уин | |
| ўинд.сърфинг | уинд‧сър‧финг | уинд‧сър‧финг | |
| разни хора-разни вкусове (razni hora-razni vkusove) | раз‧ни хо‧ра-раз‧ни вку‧со‧ве | раз‧ни хо‧ра-раз‧ни вку‧со‧ве | |
| акушер-гинеколог (akušer-ginekolog) | а‧ку‧шер-ги‧не‧ко‧лог | а‧ку‧шер-ги‧не‧ко‧лог | |
| най-напред (naj-napred) | най-на‧пред | най-на‧пред | |
| ампер-час (amper-čas) | ам‧пер-час | ам‧пер-час | |
| га-га (ga-ga) | га-га | га-га | |
| пи-пи (pi-pi) | пи-пи | пи-пи | |
| Гвинея-Бисау (Gvineja-Bisau) | Гви‧не‧я-Би‧са‧у | Гви‧не‧я-Би‧са‧у | |
| шам-фъстък (šam-fǎstǎk) | шам-фъ‧стък | шам-фъ‧стък | |
| вълна-убиец (vǎlna-ubiec) | въл‧на-у‧би‧ец | въл‧на-у‧би‧ец | |
| акушер-гинеколог (akušer-ginekolog) | а‧ку‧шер-ги‧не‧ко‧лог | а‧ку‧шер-ги‧не‧ко‧лог | |
| по-добре късно, отколкото никога (po-dobre kǎsno, otkolkoto nikoga) | по-до‧бре къ‧сно, от‧кол‧ко‧то ни‧ко‧га | по-до‧бре къ‧сно, от‧кол‧ко‧то ни‧ко‧га | |
| зенитно-ракетен (zenitno-raketen) | зе‧нит‧но-ра‧ке‧тен | зе‧нит‧но-ра‧ке‧тен | |
| горе-долу (gore-dolu) | го‧ре-до‧лу | го‧ре-до‧лу | |
| най-после (naj-posle) | най-по‧сле | най-по‧сле | |
| чик-чирик (čik-čirik) | чик-чи‧рик | чик-чи‧рик | |
| среден род (sreden rod) | сре‧ден род | сре‧ден род | |
| божа кравичка (boža kravička) | бо‧жа кра‧вич‧ка | бо‧жа кра‧вич‧ка | |
| Съединени американски щати (Sǎedineni amerikanski štati) | Съ‧е‧ди‧не‧ни а‧ме‧ри‧кан‧ски ща‧ти | Съ‧е‧ди‧не‧ни а‧ме‧ри‧кан‧ски ща‧ти | |
| от младих до старих (ot mladih do starih) | от мла‧дих до ста‧рих | от мла‧дих до ста‧рих | |
| со кротце, со благо и со малко кютек (so krotce, so blago i so malko kjutek) | со крот‧це, со бла‧го и со мал‧ко кю‧тек | со крот‧це, со бла‧го и со мал‧ко кю‧тек |
References
- Тилков, Димитър; Бояджиев, Тодор; Георгиева, Елена; Пенчев, Йордан; Станков, Валентин (1998), Граматика на съвременния български книжовен език (in Bulgarian), 3rd edition, volume 1, Sofia: ABAGAR
local export = {}
local substring = mw.ustring.sub
local rsubn = mw.ustring.gsub
local rmatch = mw.ustring.match
local rsplit = mw.text.split
local rlen = mw.ustring.len
local U = require("Module:string/char")
local lang = require("Module:languages").getByCode("bg")
local script = require("Module:scripts").getByCode("Cyrl")
local ipa_module = "Module:IPA"
local audio_module = "Module:audio"
local headword_data_module = "Module:headword/data"
local homophones_module = "Module:homophones"
local hyphenation_module = "Module:hyphenation"
local parameters_module = "Module:parameters"
local rhymes_module = "Module:rhymes"
local table_module = "Module:table"
local tracking_module = "Module:debug/track"
local GRAVE = U(0x300)
local ACUTE = U(0x301)
local BREVE = U(0x306)
local PRIMARY = U(0x2C8)
local SECONDARY = U(0x2CC)
local TIE = U(0x361)
local FRONTED = U(0x31F)
local DOTUNDER = U(0x323)
local HYPH = U(0x2027)
local BREAK_MARKER = "."
local vowels = "aɤɔuɛiɐo"
local vowels_c = "[" .. vowels .. "]"
local cons = "bvɡdʒzjklɫwmnprstfxʃɣʲ" .. TIE
local cons_c = "[" .. cons .. "]"
local hcons_c = "[бвгджзйклмнпрстфхшщьчц#БВГДЖЗЙКЛМНПРСТФХШЩЬЧЦ=]"
local hvowels_c = "[аъоуеияѝюАЪОУЕИЯЍЮ]"
local capital_letters_c = "[БВГДЖЗЙКЛМНПРСТФХШЩЬЧЦАЪОУЕИЯЍЮ]"
local accents = PRIMARY .. SECONDARY
local accents_c = "[" .. accents .. "]"
-- single characters that map to IPA sounds
local phonetic_chars_map = {
["а"] = "a",
["б"] = "b",
["в"] = "v",
["г"] = "ɡ",
["д"] = "d",
["е"] = "ɛ",
["ж"] = "ʒ",
["з"] = "z",
["и"] = "i",
["й"] = "j",
["к"] = "k",
["л"] = "l",
["м"] = "m",
["н"] = "n",
["о"] = "ɔ",
["п"] = "p",
["р"] = "r",
["с"] = "s",
["т"] = "t",
["у"] = "u",
["ў"] = "w",
["ф"] = "f",
["х"] = "x",
["ц"] = "t" .. TIE .. "s",
["ч"] = "t" .. TIE .. "ʃ",
["ш"] = "ʃ",
["щ"] = "ʃt",
["ъ"] = "ɤ",
["ь"] = "ʲ",
["ю"] = "ʲu",
["я"] = "ʲa",
[GRAVE] = SECONDARY,
[ACUTE] = PRIMARY
}
local devoicing = {
["b"] = "p", ["d"] = "t", ["ɡ"] = "k",
["z"] = "s", ["ʒ"] = "ʃ",
["v"] = "f"
}
local voicing = {
["p"] = "b", ["t"] = "d", ["k"] = "ɡ",
["s"] = "z", ["ʃ"] = "ʒ", ["x"] = "ɣ",
["f"] = "v"
}
-- Prefixes where, if they occur at the beginning of the word and the stress is on the next syllable, we place the
-- syllable division directly after the prefix. For example, the default syllable-breaking algorithm would convert
-- безбра́чие to беˈзбрачие; but because it begins with без-, we convert it to безˈбрачие. Note that we don't (yet?)
-- convert измра́ to изˈмра instead of default измˈра, although we probably should.
--
-- Think twice before putting prefixes like на-, пре- and от- here, because of the existence of над-, пред-, and о-,
-- which are also prefixes.
local IPA_prefixes = {"bɛz", "vɤz", "vɤzproiz", "iz", "naiz", "poiz", "prɛvɤz", "proiz", "raz"}
-- version of rsubn() that discards all but the first return value
local function rsub(term, foo, bar)
local retval = rsubn(term, foo, bar)
return retval
end
-- version of rsubn() that discards all but the count value
local function count_matches(term, pattern)
local _, match_count = rsubn(term, pattern, "")
return match_count
end
-- apply rsub() repeatedly until no change
local function rsub_repeatedly(term, foo, bar)
while true do
local new_term = rsub(term, foo, bar)
if new_term == term then
return term
end
term = new_term
end
end
local function char_at(str, index)
return substring(str, index, index)
end
local function starts_with(str, substr)
return substring(str, 1, rlen(substr)) == substr
end
local function count_vowels(word)
return count_matches(word, hvowels_c)
end
local function count_capital_letters(word)
return count_matches(word, capital_letters_c)
end
local function count_accents(ipa)
return count_matches(ipa, accents_c)
end
local function count_ipa_vowels(ipa)
return count_matches(ipa, vowels_c)
end
function export.remove_pron_notations(text, remove_grave)
text = rsub(text, "[." .. DOTUNDER .. "]", "")
text = rsub(text, "ў", "у")
text = rsub(text, "Ў", "У")
-- Remove grave accents from annotations but maybe not from phonetic respelling
if remove_grave then
text = mw.ustring.toNFC(rsub(mw.ustring.toNFD(text), GRAVE, ""))
end
return text
end
function export.toIPA(term, endschwa)
if type(term) == "table" then -- called from a template or a bot
endschwa = term.args.endschwa
term = term.args[1]
end
local origterm = term
term = mw.ustring.toNFD(mw.ustring.lower(term))
term = rsub(term, "у" .. BREVE, "ў") -- recompose ў
term = rsub(term, "и" .. BREVE, "й") -- recompose й
if term:find(GRAVE) and not term:find(ACUTE) then
error("Use acute accent, not grave accent, for primary stress: " .. origterm)
end
-- allow DOTUNDER to signal same as endschwa=1
term = rsub(term, "а(" .. accents_c .. "?)" .. DOTUNDER, "ъ%1")
term = rsub(term, "я(" .. accents_c .. "?)" .. DOTUNDER, "ʲɤ%1")
term = rsub(term, ".", phonetic_chars_map)
-- Mark word boundaries
term = rsub(term, "(%s+)", "#%1#")
term = "#" .. term .. "#"
-- Convert verbal and definite endings
if endschwa then
term = rsub(term, "a(" .. PRIMARY .. "?t?#)", "ɤ%1")
end
-- Change ʲ to j after vowels or word-initially
term = rsub(term, "([" .. vowels .. "#]" .. accents_c .. "?)ʲ", "%1j")
-------------------- Move stress ---------------
-- First, move leftwards over the vowel.
term = rsub(term, "(" .. vowels_c .. ")(" .. accents_c .. ")", "%2%1")
-- Then, move leftwards over j or soft sign.
term = rsub(term, "([jʲ])(" .. accents_c .. ")", "%2%1")
-- Then, move leftwards over a single consonant.
term = rsub(term, "(" .. cons_c .. ")(" .. accents_c .. ")", "%2%1")
-- Then, move leftwards over Cl/Cr combinations where C is an obstruent (NOTE: IPA ɡ).
term = rsub(term, "([bdɡptkxfv]" .. ")(" .. accents_c .. ")([rl])", "%2%1%3")
-- Then, move leftwards over kv/gv (NOTE: IPA ɡ).
term = rsub(term, "([kɡ]" .. ")(" .. accents_c .. ")(v)", "%2%1%3")
-- Then, move leftwards over sC combinations, where C is a stop or resonant (NOTE: IPA ɡ).
term = rsub(term, "([sz]" .. ")(" .. accents_c .. ")([bdɡptkvlrmn])", "%2%1%3")
-- Then, move leftwards over affricates not followed by a consonant.
term = rsub(term, "([td]" .. TIE .. "?)(" .. accents_c .. ")([szʃʒ][" .. vowels .. "ʲ])", "%2%1%3")
-- If we ended up in the middle of a tied affricate, move to its right.
term = rsub(term, "(" .. TIE .. ")(" .. accents_c .. ")(" .. cons_c .. ")", "%1%3%2")
-- Then, move leftwards over any remaining consonants at the beginning of a word.
term = rsub(term, "#(" .. cons_c .. "*)(" .. accents_c .. ")", "#%2%1")
-- Then correct for known prefixes.
for _, prefix in ipairs(IPA_prefixes) do
local prefix_prefix, prefix_final_cons = rmatch(prefix, "^(.-)(" .. cons_c .. "*)$")
if prefix_final_cons then
-- Check for accent moved too far to the left into a prefix, e.g. безбрачие accented as беˈзбрачие instead
-- of безˈбрачие
term = rsub(term, "#(" .. prefix_prefix .. ")(" .. accents_c .. ")(" .. prefix_final_cons .. ")", "#%1%3%2")
end
end
-- If the previous substitution resulted in a stress occurring immediately after a consonant
-- but before a palatalizer mark, then put the stress before the consonant.
term = rsub(term, "(" .. cons_c .. ")" .. "(" .. accents_c .. ")" .. "ʲ", "%2%1ʲ")
-- Finally, if there is an explicit syllable boundary in the cluster of consonants where the stress is, put it there.
-- First check for accent to the right of the explicit syllable boundary.
term = rsub(term, "(" .. cons_c .. "*)%.(" .. cons_c .. "*)(" .. accents_c .. ")(" .. cons_c .. "*)", "%1%3%2%4")
-- Then check for accent to the left of the explicit syllable boundary.
term = rsub(term, "(" .. cons_c .. "*)(" .. accents_c .. ")(" .. cons_c .. "*)%.(" .. cons_c .. "*)", "%1%3%2%4")
-- Finally, remove any remaining syllable boundaries.
term = rsub(term, "%.", "")
-------------------- Vowel reduction (in unstressed syllables) ---------------
local function reduce_vowel(vowel)
return rsub(vowel, "[aɔɤu]", { ["a"] = "ɐ", ["ɔ"] = "o", ["ɤ"] = "ɐ", ["u"] = "o" })
end
-- Reduce all vowels before the stress, except if the word has no accent at all. (FIXME: This is presumably
-- intended for single-syllable words without accents, but if the word is multisyllabic without accents,
-- presumably all vowels should be reduced.)
term = rsub(term, "(#[^#" .. accents .. "]*)(.-#)", function(a, b)
if count_vowels(origterm) <= 1 then
return a .. b
else
return reduce_vowel(a) .. b
end
end)
-- Reduce all vowels after the accent except the first vowel after the accent mark (which is stressed).
term = rsub(term, "(" .. accents_c .. "[^aɛiɔuɤ#]*[aɛiɔuɤ])([^#" .. accents .. "]*)", function(a, b)
return a .. reduce_vowel(b)
end)
-------------------- Vowel assimilation to adjacent consonants (fronting/raising) ---------------
term = rsub(term, "([ʃʒʲj])([aouɤ])", "%1%2" .. FRONTED)
-- Hard l
term = rsub_repeatedly(term, "l([^ʲɛi])", "ɫ%1")
-- Voicing assimilation
term = rsub(term, "([bdɡzʒv" .. TIE .. "]*)(" .. accents_c .. "?[ptksʃfx#])", function(a, b)
return rsub(a, ".", devoicing) .. b end)
term = rsub(term, "([ptksʃfx" .. TIE .. "]*)(" .. accents_c .. "?[bdɡzʒ])", function(a, b)
return rsub(a, ".", voicing) .. b end)
term = rsub(term, "n(" .. accents_c .. "?[ɡk]+)", "ŋ%1")
term = rsub(term, "m(" .. accents_c .. "?[fv]+)", "ɱ%1")
-- Sibilant assimilation
term = rsub(term, "[sz](" .. accents_c .. "?[td]?" .. TIE .. "?)([ʃʒ])", "%2%1%2")
-- Reduce consonant clusters
term = rsub(term, "([szʃʒ])[td](" .. accents_c .. "?)([tdknml])", "%2%1%3")
-- Strip hashes
term = rsub(term, "#", "")
return term
end
----Syllabification code----
-- Authorship: Chernorizets
-- Lua port: Kiril Kovachev
local function set_of(t)
local out = {}
for _, v in pairs(t) do
out[v] = true
end
return out
end
local function in_set(set, value)
return set[value] == true
end
-- Classification of letters by phonetic category
local vowels_syllab = set_of {"а", "ъ", "о", "у", "е", "и", "ю", "я"}
local sonorants = set_of { "л", "м", "н", "р", "й", "ў"}
local stops = set_of {"б", "п", "г", "к", "д", "т"}
local fricatives = set_of {"в", "ф", "ж", "ш", "з", "с", "х"}
local affricates = set_of {"ч", "ц"}
local function is_vowel(ch)
return in_set(vowels_syllab, ch)
end
local function is_palatalizer(ch)
return ch == "ь"
end
local function is_sonorant(ch)
return in_set(sonorants, ch)
end
local function is_stop(ch)
return in_set(stops, ch)
end
local function is_fricative(ch)
return in_set(fricatives, ch)
end
local function is_affricate(ch)
return in_set(affricates, ch)
end
--[[
Sonority objects:
Each sonority object is a table with the following fields:
{
rank (int): the numerical value representing the position of the sound in the sonority hierarchy;
first_index (int): the index, within the word, of the first letter that makes up the sound.
The affricate "дж" is written with two letters but functions as a single
unit for sonority purposes, so only the index of its first letter is stored.
(get_sonority_model below merges only "дж"; "дз" is kept as two separate letters.)
}
--]]
local function new_sonority(rank, first_index)
return {
["rank"] = rank,
["first_index"] = first_index
}
end
local function get_sonority_rank(ch)
if is_fricative(ch) then
return 1
end
if is_stop(ch) or is_affricate(ch) then
return 2
end
if is_sonorant(ch) then
return 3
end
if is_vowel(ch) then
return 4
end
return 0
end
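-- Illustrative values (a sketch derived from the letter sets above, not an
-- exhaustive table):
--   get_sonority_rank("с") --> 1 (fricative)
--   get_sonority_rank("т") --> 2 (stop)
--   get_sonority_rank("ч") --> 2 (affricate)
--   get_sonority_rank("р") --> 3 (sonorant)
--   get_sonority_rank("а") --> 4 (vowel)
--   get_sonority_rank("ь") --> 0 (not in any of the sets)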
-- Get the representation of a word as a list of sequential sonority objects, stored in a table.
-- Their representation is just {[1] = (sonority object #1), [2] = (sonority object #2)} etc.
-- Please see above for description of sonority objects' layout.
local function get_sonority_model(word, start_idx, end_idx)
local sonorities = {}
word = mw.ustring.lower(word)
local i = start_idx
while i < end_idx do
local curr = char_at(word, i)
if curr == "щ" then
-- One letter representing 2 sounds - decompose it.
table.insert(sonorities, new_sonority(get_sonority_rank("ш"), i))
table.insert(sonorities, new_sonority(get_sonority_rank("т"), i));
elseif curr == "д" then
-- Handle affricates with 'д' - only 'дж' here for illustration.
local next_char = (i == end_idx - 1 and " ") or char_at(word, i+1)
local should_skip = false
if next_char == "ж" then
table.insert(sonorities, new_sonority(2, i)) -- 2 = affricate sonority rank
i = i + 1 -- Skip over the 'ж'
should_skip = true
end
if not should_skip then table.insert(sonorities, new_sonority(get_sonority_rank("д"), i)) end
elseif not is_palatalizer(curr) then
-- Skip over 'ь' since it doesn't change the sonority.
table.insert(sonorities, new_sonority(get_sonority_rank(curr), i))
end
i = i + 1
end
return sonorities
end
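-- Finds a forced break, i.e. a break marker that the user typed into the input string.
-- word: string; range_start and range_end are integer indices into the string (range_end is exclusive).
-- Returns the position of the marker, or -1 if there is none in the range.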
local function find_forced_break(word, range_start, range_end)
if range_start >= range_end then return -1 end
local marker_pos = mw.ustring.find(word, BREAK_MARKER, range_start, true) or -1
return marker_pos >= range_end and -1 or marker_pos
end
local function strip_forced_breaks(segment)
return rsub(segment, "[.]", "");
end
---- Morphological prefix handling
--[[
This code brings morphological prefix awareness to syllabification.
This is necessary, because following the principle of rising sonority
alone fails to determine syllable boundaries correctly in some cases
— that is, when certain prefixes should be kept together as a first syllable.
]]
--[[
Affected prefixes. Each of them ends in a consonant that can be followed
by another consonant of a higher sonority in some words. In such cases,
naive syllable breaking would chop off the prefix's last consonant, and
glue it to the onset of the next syllable.
]]
local prefixes = {
-- без- family
"без",
-- из- family
"безиз", "наиз", "поиз", "произ", "преиз", "неиз", "из",
-- въз- family
"безвъз", "превъз", "невъз", "въз",
-- раз- family
"безраз", "предраз", "пораз", "нараз", "прераз", "нераз", "раз",
-- от- family
"неот", "поот", "от",
-- ending in fricatives
"екс", "таз", "дис",
-- ending in stops
"пред"
}
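--[[
Finds the separation points between morphological prefixes and the rest
of the word (see find_separation_points below).
By convention, a separation point is the (one-based) index of the first
character after the prefix.
word: the word to check for prefixes
Returns a list with one separation point per matching prefix. The list is
empty if no prefix matches, or if every matching prefix is followed by a
vowel or by a consonant of equal or lower sonority, in which case the
sonority model already handles the break.
]]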
local function followed_by_higher_sonority_cons(prefix, word) -- prefix, word are both strings
prefix = mw.ustring.lower(prefix)
word = mw.ustring.lower(word)
local prefix_last_char = char_at(prefix, rlen(prefix))
local first_char_after_prefix = char_at(word, rlen(prefix) + 1)
-- Prefixes followed by vowels do, in fact, get broken up.
if is_vowel(first_char_after_prefix) then return false end
return get_sonority_rank(prefix_last_char) < get_sonority_rank(first_char_after_prefix)
end
local function find_separation_points(word)
local matching_prefixes = {}
word = mw.ustring.lower(word)
for _, prefix in pairs(prefixes) do
if starts_with(word, prefix) and followed_by_higher_sonority_cons(prefix, word) then
table.insert(matching_prefixes, rlen(prefix) + 1)
end
end
return matching_prefixes
end
---- Main syllabification code
---Context objects:
--[[ encoded as a table like
{
word (string),
prefix_separation_points (table[int])
}
]]
local function new_context(word, pos)
return {
["word"] = word,
["prefix_separation_points"] = pos
}
end
--[[
Consonant clusters that exhibit rising sonority, but should be
broken up regardless to produce natural-sounding syllables.
The breakpoint for clusters of 3 or more consonants can vary –
here we provide a zero-based offset within the cluster for each.
]]
local sonority_exception_break = {
["км"] = 1, ["гм"] = 1, ["дм"] = 1, ["вм"] = 1,
["зм"] = 1, ["цм"] = 1, ["чм"] = 1,
["дн"] = 1, ["вн"] = 1, ["тн"] = 1, ["чн"] = 1,
["кн"] = 1, ["гн"] = 1, ["цн"] = 1,
["зд"] = 1, ["зч"] = 1, ["зц"] = 1,
["вк"] = 1, ["вг"] = 1, ["дл"] = 1, ["жд"] = 1,
["згн"] = 1, ["здн"] = 2, ["вдж"] = 1
}
local sonority_exception_keep = {
"ств", "св", "вс"
}
local function normalize_word(word)
if word == nil then return "" end
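-- NOTE: intended to strip surrounding whitespace, but "\\s" is not a Lua
-- pattern class (the class is "%s") and the second pattern is anchored with
-- "^" rather than "$", so for ordinary input this leaves the word unchanged.
-- export.syllabify appears to rely on that: tokenize_words keeps each trailing
-- separator in its token so that it survives into the concatenated output.
word = rsub(rsub(word, "^\\s+", ""), "\\s+^", "")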
return word
end
local function normalize_syllable(syllable)
local normalized = strip_forced_breaks(syllable)
normalized = rsub(normalized, "ў", "у")
normalized = rsub(normalized, "Ў", "У")
return normalized
end
local function find_rising_sonority_break(sonorities)
local prev_rank = -1;
for _, curr in ipairs(sonorities) do -- ipairs: the ranks must be visited in order
if curr.rank <= prev_rank then
-- Found a break.
return curr.first_index
end
prev_rank = curr.rank
end
-- There was no rising sonority break. Start syllable at first index.
return sonorities[1].first_index
end
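-- Worked example (a sketch; the indices correspond to the clusters "стр" in
-- "сестра" and "нн" in "пленник"):
--   find_rising_sonority_break({ new_sonority(1, 3), new_sonority(2, 4), new_sonority(3, 5) })
--     --> 3  (ranks 1 < 2 < 3 rise throughout, so the whole cluster is the onset)
--   find_rising_sonority_break({ new_sonority(3, 4), new_sonority(3, 5) })
--     --> 5  (the second rank does not exceed the first, so the break falls there)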
local function matches(str, substr, start_idx, end_idx)
local strlen = end_idx - start_idx
if strlen ~= rlen(substr) then return false end
str = mw.ustring.lower(str)
substr = mw.ustring.lower(substr)
local i = start_idx
local j = 1
while i < end_idx do
if char_at(str, i) ~= char_at(substr, j) then return false end
i = i + 1
j = j + 1
end
return true
end
-- ctx: context object
-- left and right vowels: integers
-- sonority break: integer
local function fixup_syllable_onset(ctx, left_vowel, sonority_break, right_vowel)
local word = mw.ustring.lower(ctx.word)
-- 'щр' is a syllable onset when in front of a vowel.
-- Although 'щ' + sonorant technically follows rising sonority, syllables
-- like щнV, щлV etc. are unnatural and incorrect. In such cases, we treat
-- the sonorant as the onset of the next syllable.
if char_at(word, right_vowel - 2) == "щ" then
local penult = char_at(word, right_vowel - 1)
if penult == "р" then return (right_vowel - 2) end
if is_sonorant(penult) then return (right_vowel - 1) end
end
-- Check for situations where we shouldn't break the cluster.
local match_found = false
for _, cluster in pairs(sonority_exception_keep) do
if matches(word, cluster, left_vowel + 1, right_vowel) then
match_found = true
break
end
end
if (match_found) then return left_vowel + 1 end -- syllable onset == beginning of cluster
-- Check for situations where we should break the cluster even if
-- it obeys the principle of rising sonority.
local maybe_cluster = nil
for cluster, _ in pairs(sonority_exception_break) do
if matches(word, cluster, left_vowel + 1, right_vowel) then
maybe_cluster = cluster
break
end
end
if maybe_cluster ~= nil then
local offset = sonority_exception_break[maybe_cluster]
return left_vowel + 1 + offset
end
local separation_points = ctx.prefix_separation_points
local separation_match = nil
for _, pos in pairs(separation_points) do
if pos > left_vowel and pos < right_vowel then
separation_match = pos
break
end
end
if separation_match ~= nil then return separation_match else return sonority_break end
end
-- ctx: context object
-- left/right vowels: integers
local function find_next_syllable_onset(ctx, left_vowel, right_vowel)
local n_cons = right_vowel - left_vowel - 1
-- No consonants - syllable starts on right_vowel
if n_cons == 0 then return right_vowel end
-- Check for forced breaks
local break_pos = find_forced_break(ctx.word, left_vowel + 1, right_vowel)
if break_pos ~= -1 then return break_pos + 1 end
-- Single consonant between two vowels - starts a syllable
if n_cons == 1 then return left_vowel + 1 end
-- Two or more consonants between the vowels. Find the point (if any)
-- where we break from rising sonority, and treat it as the tentative
-- onset of a new syllable.
local sonorities = get_sonority_model(ctx.word, left_vowel + 1, right_vowel)
local sonority_break = find_rising_sonority_break(sonorities)
-- Apply exceptions to the rising sonority principle to avoid
-- unnatural-sounding syllables.
return fixup_syllable_onset(ctx, left_vowel, sonority_break, right_vowel)
end
local function deaccent(term)
return rsub(term, "[" .. ACUTE .. GRAVE .. DOTUNDER .. "]", "")
end
local function deaccent_all(term)
return deaccent(rsub(mw.ustring.toNFD(term), BREVE, ""))
end
-- Returns a table of strings (list)
local function syllabify_poly(word)
local syllables = {}
local ctx = new_context(word, find_separation_points(word))
local prev_vowel = -1
local prev_onset = 1;
for i = 1, rlen(word) do
if is_vowel(mw.ustring.lower(char_at(word, i))) then
-- A vowel, yay!
local should_skip = false
if prev_vowel == -1 then
prev_vowel = i
should_skip = true;
end
-- This is not the first vowel we've seen. In-between
-- the previous vowel and this one, there is a syllable
-- break, and the first character after the break starts
-- a new syllable.
if not should_skip then
local next_onset = find_next_syllable_onset(ctx, prev_vowel, i)
table.insert(syllables, substring(word, prev_onset, next_onset - 1))
prev_vowel = i
prev_onset = next_onset
end
end
end
-- Add the last syllable
table.insert(syllables, substring(word, prev_onset))
return syllables
end
function export.syllabify_word(word)
local norm = normalize_word(word)
if rlen(norm) == 0 then return "" end;
local n_vowels = count_vowels(norm)
local syllables = n_vowels <= 1 and {norm} or syllabify_poly(norm)
local out = {}
for k, v in pairs(syllables) do
out[k] = normalize_syllable(v)
end
return table.concat(out, HYPH)
end
local function tokenize_words(term)
local out = {}
local prev_index = 1
local len = rlen(term)
for i = 1, len do
local current_char = char_at(term, i)
if current_char == "-" or current_char == " " then
table.insert(out, substring(term, prev_index, i))
prev_index = i + 1
end
end
table.insert(out, substring(term, prev_index, len))
return out
end
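-- Illustrative trace (a sketch, assuming substring(s, i, j) is inclusive and
-- one-based like mw.ustring.sub):
--   tokenize_words("а-б в") --> { "а-", "б ", "в" }
-- Each separator stays attached to the end of the token that precedes it.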
function export.syllabify(term)
term = deaccent(term)
local words = tokenize_words(term)
local out = {}
for _, word in ipairs(words) do -- ipairs: the words must be joined in their original order
table.insert(out, export.syllabify_word(word))
end
return table.concat(out, "")
end
---Hyphenation
-- Hyphenate a word from its existing syllabification
function export.hyphenate(syllabification)
-- Source: http://logic.fmi.uni-sofia.bg/hyphenation/hyph-bg.html#hyphenation-rules-between-1983-and-2012
-- Also note: the rules from 2012 onward, which encode the modern standard, are entirely
-- backwards-compatible with the previous standard. Thus our code can generate valid 2012
-- hyphenations despite only explicitly implementing the older (1983) rules.
---Pre-processing----
local word = deaccent(syllabification)
word = rsub_repeatedly(word, HYPH .. "дж", HYPH .. "#")
word = rsub_repeatedly(word, "дж$", "#")
word = rsub_repeatedly(word, "^дж", "#")
word = rsub_repeatedly(word, "(" .. hvowels_c .. ")" .. HYPH .. "(" .. hcons_c .. ")(" .. rsub(hcons_c, "[ьЬ]", "") .. "+)", "%1%2" .. HYPH .. "%3")
word = rsub_repeatedly(word, "(" .. rsub(hcons_c, "[йЙ]", "") .. ")(" .. hcons_c .. "+)" .. HYPH, "%1" .. HYPH .. "%2")
word = rsub_repeatedly(word, "^(" .. hvowels_c .. ")" .. HYPH, "%1")
word = rsub_repeatedly(word, HYPH .. "(" .. hvowels_c .. ")$", "%1")
word = rsub_repeatedly(word, "(" .. hvowels_c .. ")" .. HYPH .. "(" .. hvowels_c .. ")" .. HYPH .. "(" .. hvowels_c .. ")", "%1%2" .. HYPH .. "%3")
word = rsub_repeatedly(word, HYPH .. "(" .. hvowels_c .. ")" .. HYPH .. "(" .. hcons_c .. ")", HYPH .. "%1%2")
word = rsub_repeatedly(word, "#", "дж")
return word
end
-- Hyphenate a word directly, no need to calculate its syllabification beforehand (used in test suite)
function export.hyphenate_total(word)
local syllabification = export.syllabify(word)
return export.hyphenate(syllabification)
end
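-- Usage sketch (assumes the module is loaded as Module:bg-pronunciation; the
-- expected values are copied from the module's testcases table, on the
-- assumption that those tests exercise this function):
--   local m = require("Module:bg-pronunciation")
--   m.hyphenate_total("сестра")    --> "сес‧тра"
--   m.hyphenate_total("преодолея") --> "пре‧одо‧лея"
--   m.hyphenate_total("над.живея") --> "над‧жи‧вея"  (forced break after "над")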
local function get_anntext(term, ann)
if ann == "1" or ann == "y" then
-- remove secondary stress annotations
return "'''" .. export.remove_pron_notations(term, true) .. "''': "
elseif ann then
return "'''" .. ann .. "''': "
else
return ""
end
end
local HYPHENATION_LABEL = "Hyphenation<sup>([[Appendix:Bulgarian hyphenation#Hyphenation|key]])</sup>"
local SYLLABIFICATION_LABEL = "Syllabification<sup>([[Appendix:Bulgarian hyphenation#Syllabification|key]])</sup>"
local function format_hyphenation(hyphenation, label)
hyphenation = deaccent(hyphenation) -- remove grave/acute accent
local syllables = rsplit(hyphenation, HYPH)
label = label or HYPHENATION_LABEL
return require(hyphenation_module).format_hyphenations {
lang = lang,
hyphs = { { hyph = syllables } },
sc = script,
caption = label,
}
end
local function format_syllabification(syllabification)
return format_hyphenation(syllabification, SYLLABIFICATION_LABEL)
end
-- Display syllabification and hyphenation, together if the same, and on separate lines if not;
-- allows specifying an indentation level, if the hyphenation must be indented to more than one level.
local function render_bg_hyph(term, indentation, syllabification, hyphenation)
syllabification = syllabification or export.syllabify(term)
hyphenation = hyphenation or export.hyphenate(syllabification)
local out = ""
-- Users must put a * (or **) before the template usage
if syllabification == hyphenation then
if syllabification ~= "-" then
out = format_syllabification(syllabification)
end
else
local syllabification_text = format_syllabification(syllabification)
local hyphenation_text = format_hyphenation(hyphenation)
if syllabification ~= "-" then
out = syllabification_text
end
if hyphenation ~= "-" then
if syllabification == "-" then
out = hyphenation_text
else
out = out .. "\n" .. mw.ustring.rep("*", indentation) .. " " .. hyphenation_text
end
end
end
return out
end
-- Entry point to {{bg-hyph}}
function export.show_hyphenation(frame)
local params = {
[1] = {},
["indent"] = { type = "number" }
}
local title = mw.title.getCurrentTitle()
local args = require(parameters_module).process(frame:getParent().args, params)
local term = args[1] or title.nsText == "Template" and "при́мер" or title.text
local indent = args["indent"] or 1
return render_bg_hyph(term, indent)
end
function export.get_rhymes(ipa)
ipa = rsub(ipa, FRONTED, "")
local length = rlen(ipa)
local i = length
local vowels_seen = 0
local final_consonant_found = false
-- March until accent is found
while i > 0 do
-- Track vowel or final consonant if found
if vowels_seen == 0 and rmatch(char_at(ipa, i), cons_c) then
final_consonant_found = true
end
if rmatch(char_at(ipa, i), vowels_c) then
vowels_seen = vowels_seen + 1
end
if rmatch(char_at(ipa, i), accents_c) then
-- Note whether letter before the accent was vowel
local final_vowel_cluster = rmatch(ipa, vowels_c .. accents_c .. vowels_c .. "$")
-- March until the vowel first following the accent is found
while i <= length and not rmatch(char_at(ipa, i), vowels_c) do
i = i + 1
end
-- March back if only a single word-final vowel was previously spotted –
-- this corresponds to final-syllable-stressed words, whose rhyme
-- needs to include a consonant according to Bulgarian rhyming rules.
if vowels_seen <= 1 and not final_consonant_found and not final_vowel_cluster then
while i > 1 and not rmatch(char_at(ipa, i), rsub(cons_c, "ʲ", "")) do
i = i - 1
end
-- Account for affricates (note: this can only occur in
-- consonant-rhyme, i.e. final-stressed words)
if i > 1 and char_at(ipa, i-1) == TIE then
-- If a tie is present, there must be a letter before it as well.
i = i - 2
elseif i > 1 and char_at(ipa, i) == "ʒ" and char_at(ipa, i-1) == "d" then
-- Treat [dʒ] sequence as an affricate – this can have some edge cases.
-- In future, the module should distinguish [d.ʒ] and [d͡ʒ].
i = i - 1
end
end
return substring(ipa, i)
end
i = i - 1
end
local n_vowels = count_ipa_vowels(ipa)
if n_vowels == 1 then
i = length
if rmatch(char_at(ipa, i), cons_c) then
while i > 1 and not rmatch(char_at(ipa, i), vowels_c) do
i = i - 1
end
else
while i > 1 and not rmatch(char_at(ipa, i), rsub(cons_c, "ʲ", "")) do
i = i - 1
end
end
if i > 1 and char_at(ipa, i - 1) == TIE then
i = i - 2
end
return substring(ipa, i)
end
return nil
end
-- Render a single IPA transcription as wikitext (with optional qualifiers + accent labels)
local function format_ipa(ipa, q, qq, a, aa)
-- Introduce narrow transcription brackets
ipa = "[" .. ipa .. "]"
local ipa_data = {
lang = lang,
items = {{ pron = ipa }},
q = q,
qq = qq,
a = a,
aa = aa,
}
return require(ipa_module).format_IPA_full(ipa_data)
end
function export.show(frame)
local params = {
[1] = {},
["endschwa"] = { type = "boolean" },
["ann"] = {},
["q"] = { type = "qualifier" },
["qq"] = { type = "qualifier" },
["a"] = { type = "labels" },
["aa"] = { type = "labels" },
["pagename"] = {},
}
local args = require(parameters_module).process(frame:getParent().args, params)
local term = args[1] or args.pagename or mw.title.getCurrentTitle().nsText == "Template" and "при́мер" or
mw.loadData(headword_data_module).pagename
local ipa = export.toIPA(term, args.endschwa)
local ipa_text = format_ipa(ipa, args.q, args.qq, args.a, args.aa)
local anntext = get_anntext(term, args.ann)
-- Track multisyllabic terms whose IPA carries no stress mark
if count_vowels(term) > 1 and count_accents(ipa) == 0 then
require(tracking_module)("bg-IPA/no stress")
end
return anntext .. ipa_text
end
-- Convert rhyme suffix and optional syllable count to wikitext
local function format_rhymes(rhyme_suffix, syllable_counts, q, qq, l, ll)
return require(rhymes_module).format_rhymes({
lang = lang,
rhymes = {
{rhyme = rhyme_suffix, num_syl = syllable_counts, q = q, qq = qq, a = l, aa = ll},
},
})
end
-- Entry point for {{bg-rhymes}}
function export.show_rhymes(frame)
-- TODO: add qualifiers / labels
local params = {
[1] = {},
["pagename"] = {},
["s"] = { sublist = true, type = "number", },
["q"] = { type = "qualifier", },
["qq"] = { type = "qualifier", },
["a"] = { type = "labels", },
["aa"] = { type = "labels", },
}
local args = require(parameters_module).process(frame:getParent().args, params)
local term = args[1] or args.pagename or mw.title.getCurrentTitle().nsText == "Template" and "при́мер" or
mw.loadData(headword_data_module).pagename
-- NOTE: `endschwa` is not declared in `params` above, so args.endschwa is nil here.
local ipa = export.toIPA(term, args.endschwa)
local rhymes = export.get_rhymes(ipa)
local syllable_counts = args.s or {count_vowels(term)}
return format_rhymes(rhymes, syllable_counts, args.q, args.qq, args.a, args.aa)
end
-- Matches patterns such as: Bg-къща.ogg<Sofia>,
-- with capture groups [1]="Bg-къща.ogg", [2]="Sofia"
local accent_qualifier_pattern = "(.-)<(.+)>"
-- Matches speaker profiles, i.e. a shorthand for representing a given
-- speaker and the properties of their audio
local speaker_profile_pattern = "%[([^:~]-)([:~]?)([^:~]*)%]"
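-- Illustrative captures (a sketch, assuming rmatch behaves like
-- mw.ustring.match; "ABC" is a made-up profile name):
--   "[ABC:човек-2]" --> "ABC", ":", "човек-2"
--   "[ABC~2]"       --> "ABC", "~", "2"
--   "[ABC]"         --> "", "", "ABC"  (the lazy first capture stays empty)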
local speaker_profiles = require("Module:bg-pronunciation/speaker profiles").speaker_profiles
local audio_functions = require("Module:bg-pronunciation/speaker profiles").audio_functions
-- Return a list of audio formatted as wikitext
local function format_audio_list(list, ipa, pagename, corresponding_respelling)
local request_rfap = false
if list == nil then
return {}, request_rfap
end
-- Audio list is separated by # symbols.
local file_specs = rsplit(list, "#")
local audios = {}
for _, spec in pairs(file_specs) do
-- Try to match speaker profile
local speaker_profile_name, separator, term_override = rmatch(spec, speaker_profile_pattern)
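-- FIXME: for a bare "[ABC]" the lazy first capture is empty and the profile name
-- ends up in the third capture, so swap them here.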
if speaker_profile_name == "" and term_override ~= "" then
speaker_profile_name = term_override
term_override = nil
end
if speaker_profile_name then
local speaker_profile = speaker_profiles[speaker_profile_name]
if not speaker_profile then
error("Speaker profile named '" .. speaker_profile_name .. "' does not exist")
end
-- E.g. [ABC~2] on page="човек" is equivalent to: [ABC:човек-2].
if separator == "~" then
term_override = mw.loadData(headword_data_module).pagename .. "-" .. term_override
end
if term_override == "" then
term_override = corresponding_respelling
end
local renderer_name = speaker_profile["renderer"]
local filename = term_override or pagename
local rendered = audio_functions[renderer_name].display(speaker_profile, filename, {ipa=ipa})
if rendered then
table.insert(audios, rendered)
else
request_rfap = true
end
else
-- Attempt to match qualifiers; if none given, then the entire spec is simply interpreted as a file name.
local filename, accents_string = rmatch(spec, accent_qualifier_pattern)
filename = filename or spec
local accents_list = accents_string and rsplit(accents_string, ",") or nil
table.insert(audios,
require(audio_module).format_audio({
lang = lang,
file = filename,
a = accents_list,
})
)
end
end
return audios, request_rfap
end
local function format_homophones(homophones_list)
local homophones_data = {}
for _, hmp in pairs(homophones_list) do
table.insert(homophones_data, {
term = hmp
})
end
return require(homophones_module).format_homophones({
lang = lang,
homophones = homophones_data,
})
end
-- Entry point for {{bg-pr}}
function export.show_all(frame)
local params = {
[1] = { list = true, disallow_holes = true},
["q"] = { list = true, type = "qualifier", allow_holes=true },
["qq"] = { list = true, type = "qualifier", allow_holes=true },
["l"] = { list = true, type = "labels", allow_holes=true },
["ll"] = { list = true, type = "labels", allow_holes=true },
["ann"] = { list = true, allow_holes=true, separate_no_index = true },
["audio"] = { list = true, allow_holes=true },
["a"] = { alias_of = "audio", list = true, allow_holes=true },
["rhymes"] = { list = true, separate_no_index = true, allow_holes=true },
["s"] = { list = true, type = "number", separate_no_index = true, allow_holes=true },
["syllabification"] = { list = true, separate_no_index = true, allow_holes=true },
["syl"] = { alias_of = "syllabification", list = true, separate_no_index = true, allow_holes=true },
["hyphenation"] = { list = true, separate_no_index = true, allow_holes=true },
["hyph"] = { alias_of = "hyphenation", list = true, separate_no_index = true, allow_holes=true },
["homophones"] = { list = true, allow_holes=true },
["hs"] = { list = true, separate_no_index = true, allow_holes=true}, -- Hyphenation and syllabification override at once
["hmp"] = { alias_of = "homophones", list = true, allow_holes=true },
["endreduce"] = { list = true, allow_holes=true },
["endschwa"] = { alias_of = "endreduce", list = true, allow_holes=true },
["raw"] = { list = true, allow_holes=true },
["pagename"] = {},
}
local args = require(parameters_module).process(frame:getParent().args, params)
local pagename = args.pagename or mw.title.getCurrentTitle().nsText == "Template" and "при́мер" or
mw.loadData(headword_data_module).pagename
-- Ensure at least one pronunciation line is present, falling back to the page name
-- (though this is unlikely to be desired for multisyllabic words)
if #args[1] == 0 then
args[1] = {pagename}
end
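-- Returns the hyphenation (or syllabification) for a respelling. A value given by the user takes priority:
-- "+" generates one from the respelling, "#" generates one from the page name, and any other
-- value is used as written, with "." converted to the syllable separator.
-- If nothing is given, one is generated automatically from the respelling if it meets the
-- criteria below; otherwise "-" is returned, which suppresses the output: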
--[[
- must have more than one vowel
- the respelling given must be the same as the page title (when accents
are removed)
--]]
local syllabify = export.syllabify
local function hyphenate(x) return export.hyphenate(export.syllabify(x)) end
local function deal_with_hyph_syl_respellings(given, respelling, hyphenate_or_syllabify)
if given == "+" then
return hyphenate_or_syllabify(respelling)
elseif given == "#" then
return hyphenate_or_syllabify(pagename)
elseif given then
return rsub(given, "%.", HYPH)
elseif count_vowels(respelling) == 1 or rsub(deaccent_all(respelling), "%.", "") ~= pagename then
return "-"
else
return hyphenate_or_syllabify(respelling)
end
end
-- Return nil if a rhyme should not be added based on the term.
--[[ The current rules are:
- if a manual override is given by the user, that will be the rhyme regardless
- if no override is given and the term contains a space or hyphen, return nil
- if there is no space and hyphen, but there is more than 1 capital letter,
then the term will be diagnosed as an abbreviation, and nil will be returned
(this logic will help to keep rhymes for names, e.g. Петър)
- if there is more than 1 accent mark (primary or secondary combined)
in the IPA, then nil is returned
- if all the above guards fail, the module will generate a rhyme
(despite appearances, this will actually be most words anyway)
--]]
local function deal_with_rhymes(override, ipa)
if override then
return override
end
if count_capital_letters(pagename) > 1 then
return nil
end
if count_accents(ipa) > 1 then
return nil
end
if rmatch(ipa, "[ %-]") or rmatch(pagename, "[ %-]") then
return nil
end
return export.get_rhymes(ipa)
end
-- Track whether an audio file has been included using a speaker profile but doesn't yet exist; automatically {{rfap}} if so.
local request_rfap = false
-- Build up wikitext output for each pronunciation line
local pronunciation_lines = {}
for i, respelling in ipairs(args[1]) do -- ipairs: pronunciation_lines[i] below relies on in-order insertion
local q = args.q[i]
local qq = args.qq[i]
local l = args.l[i]
local ll = args.ll[i]
local num_syllables = args.s.default or args.s[i] or count_vowels(respelling)
local ipa = args.raw[i] or export.toIPA(respelling, args.endreduce[i])
local audio, request_rfap_local = format_audio_list(args.audio.default or args.audio[i], ipa, pagename, respelling)
local rhymes = deal_with_rhymes(args.rhymes.default or args.rhymes[i], ipa) or "-"
local homophones = args.homophones[i] and rsplit(args.homophones[i], "#")
local syllabification = deal_with_hyph_syl_respellings(args.hs.default or args.hs[i] or args.syllabification.default or args.syllabification[i], respelling, syllabify)
local hyphenation = deal_with_hyph_syl_respellings(args.hs.default or args.hs[i] or args.hyphenation.default or args.hyphenation[i], respelling, hyphenate)
local ann = args.ann.default or args.ann[i]
table.insert(pronunciation_lines, {
term = respelling,
q = q,
qq = qq,
l = l,
ll = ll,
num_syllables = num_syllables,
ipa = ipa,
audio = audio,
rhymes = rhymes,
homophones = homophones,
syllabification = syllabification,
hyphenation = hyphenation,
ann = ann,
})
if args.rhymes[i] then
require(tracking_module)("bg-pr/manual rhyme")
if export.get_rhymes(pronunciation_lines[i]["ipa"]) ~= args.rhymes[i] then
-- Manual rhyme actually changes the displayed rhyme
require(tracking_module)("bg-pr/rhyme override discrepancy")
end
end
if request_rfap_local then
request_rfap = true
end
end
-- This function is used to check whether all pronunciation lines have the same value for
-- a particular property. If they do, then that property should only be rendered once,
-- after all the IPA lines.
-- If the properties are different, then each IPA should have that property value specified
-- indented underneath it, e.g. multiple IPAs with different audio files should have
-- their audios indented beneath each IPA.
local function all_the_same(property, eq) -- `eq` allows the notion of "the_same" to be overridden
eq = eq or function(a, b) return a == b end -- Use "==" as default definition of equality
local first = pronunciation_lines[1][property]
for _, pronunciation_line in pairs(pronunciation_lines) do
if not eq(pronunciation_line[property], first) then
return first, false
end
end
return first, true
end
-- Render overall output as text
local output_lines = {}
-- Check whether values for each property are all the same,
-- in which case they can all be merged at the end of the template,
-- instead of being duplicated per-pronunciation-line.
local first_hyphenation, all_hyphenations_the_same = all_the_same("hyphenation")
local first_syllabification, all_syllabifications_the_same = all_the_same("syllabification")
local first_rhyme, all_rhymes_the_same = all_the_same("rhymes")
local first_homophones, all_homophones_the_same = all_the_same("homophones", require(table_module).deepEquals)
local first_audio, all_audio_the_same = all_the_same("audio", require(table_module).deepEquals)
-- Generate text for each pronunciation line.
for _, pronunciation_line in pairs(pronunciation_lines) do
-- Unpack data
local ipa = pronunciation_line["ipa"]
local term = pronunciation_line["term"]
local ann = pronunciation_line["ann"]
local q = pronunciation_line["q"]
local qq = pronunciation_line["qq"]
local l = pronunciation_line["l"]
local ll = pronunciation_line["ll"]
local audios = pronunciation_line["audio"]
local rhymes = pronunciation_line["rhymes"]
local homophones = pronunciation_line["homophones"]
local hyphenation = pronunciation_line["hyphenation"]
local syllabification = pronunciation_line["syllabification"]
local ipa_text = format_ipa(ipa, q, qq, l, ll)
local ann_text = get_anntext(term, ann)
local out = {"* " .. ann_text .. ipa_text}
if audios and not all_audio_the_same then
for _, audio in pairs(audios) do
table.insert(out, "** " .. audio)
end
end
if rhymes ~= "-" and not all_rhymes_the_same then
local num_syllables = pronunciation_line["num_syllables"]
local rhymes_text = format_rhymes(rhymes, {num_syllables})
table.insert(out, "** " .. rhymes_text)
end
if homophones and not all_homophones_the_same then
local homophones_text = format_homophones(homophones)
table.insert(out, "** " .. homophones_text)
end
local syllabification_text = format_syllabification(syllabification)
if syllabification == hyphenation and syllabification ~= "-" then
if not (all_hyphenations_the_same and all_syllabifications_the_same) then
table.insert(out, "** " .. syllabification_text)
end
else
local hyphenation_text = format_hyphenation(hyphenation)
if not all_syllabifications_the_same and syllabification ~= "-" then
table.insert(out, "** " .. syllabification_text)
end
if not all_hyphenations_the_same and hyphenation ~= "-" then
table.insert(out, "** " .. hyphenation_text)
end
end
-- Produce output for one pronunciation line at a time
table.insert(output_lines, table.concat(out, "\n"))
end
-- Group audios all together if they are all identical
if all_audio_the_same and first_audio then
for _, audio in pairs(first_audio) do
table.insert(output_lines, "* " .. audio)
end
end
-- Group rhymes all together if they are all identical
if all_rhymes_the_same and first_rhyme ~= "-" then
-- Take all syllable counts for all pronunciation lines
-- to inform the syllable count of the rhyme.
local num_syllables = {}
for _, line in pairs(pronunciation_lines) do
table.insert(num_syllables, line["num_syllables"])
end
local rhymes_text = format_rhymes(first_rhyme, num_syllables)
table.insert(output_lines, "* " .. rhymes_text)
end
-- Group homophones all together if they are all identical
if all_homophones_the_same and first_homophones then
local homophones_text = format_homophones(first_homophones)
table.insert(output_lines, "* " .. homophones_text)
end
-- If some hyphenations or syllabifications are absent, then the outermost text should not include these;
-- rather, they should have been appended to the relevant pronunciation lines above.
if all_hyphenations_the_same and all_syllabifications_the_same and first_hyphenation == first_syllabification and first_hyphenation ~= "-" then
local syllabification_text = format_syllabification(first_hyphenation)
table.insert(output_lines, "* " .. syllabification_text)
elseif first_hyphenation ~= first_syllabification then
if all_syllabifications_the_same and first_syllabification ~= "-" then
local syllabification_text = format_syllabification(first_syllabification)
table.insert(output_lines, "* " .. syllabification_text)
end
if all_hyphenations_the_same and first_hyphenation ~= "-" then
local hyphenation_text = format_hyphenation(first_hyphenation)
table.insert(output_lines, "* " .. hyphenation_text)
end
end
if request_rfap then
table.insert(output_lines, frame:preprocess("{{rfap|bg}}"))
end
return table.concat(output_lines, "\n")
end
return export