
dr hab. Łukasz Dębowski

Phonetic transcription and computational poetry

In this section, the following scripts are provided freely:


These Perl modules define functions that produce an approximate phonetic transcription of texts in Polish and Czech. The package was intended as a library for TURPIS, a generator of rhymed poems in these languages. The transcription is meant to be good enough for judging meter and rhyme equivalence, but it is far from accurate enough for speech synthesis or recognition.

The downloadable ZIP archive (4.3K) contains the following files:

I wrote the phonetic rules for Polish, and Alexandr Rosen adapted them for Czech. If you write a compatible module for another language and ask me to publish it here, I will certainly agree.


Już konie w stajnię wzięto, już im hojnie dano.

Polish pronunciation:
[juš końe fstajńe vźento juž jym xojńe dano]

Vowels and clusters:
[j(u)š k(o)ń(e) fst(a)jń(e) vź(e)nt(o) j(u)ž j(y)m x(o)jń(e) d(a)n(o)]

Stressed Polish pronunciation:
[j?(u)š k!(o)ń.(e) fst!(a)jń.(e) vź!(e)nt.(o) j?(u)ž j?(y)m x!(o)jń.(e) d!(a)n.(o)]

Meter and rhyme:
[?!.!.!.??!.!.] [ano]
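
The last step of the example above can be reproduced from the stressed transcription alone. What follows is a minimal sketch in Python, not the code of the Perl modules. It assumes that every vowel is written as a mark followed by the vowel in parentheses, that "!" marks a stressed syllable, and that the rhyme is everything from the last "!"-marked vowel onward. The function name meter_and_rhyme is hypothetical.

```python
import re

def meter_and_rhyme(stressed):
    """Derive (meter, rhyme) from a stressed transcription string."""
    # Each vowel appears as MARK(vowel); the marks in order give the meter.
    meter = "".join(re.findall(r"([?!.])\(", stressed))
    # Assumption: the rhyme starts at the last vowel marked with "!".
    last = stressed.rfind("!")
    tail = stressed[last + 1:] if last >= 0 else stressed
    # Drop marks, parentheses and spaces, keeping only the sounds.
    rhyme = re.sub(r"[?!.()\s]", "", tail)
    return meter, rhyme

line = ("j?(u)š k!(o)ń.(e) fst!(a)jń.(e) vź!(e)nt.(o) "
        "j?(u)ž j?(y)m x!(o)jń.(e) d!(a)n.(o)")
print(meter_and_rhyme(line))  # ('?!.!.!.??!.!.', 'ano')
```

Two verse lines could then be compared for meter and rhyme equivalence by comparing the returned pairs.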

TURPIS (The Ultimate Retriever of Poetry Ignoring Sense)

TURPIS is a generator of random poems originally intended for Polish. It generates rhymed and rhythmical poetry according to several predefined patterns of versification (limerick, Sapphic stanza, sonnet, trzynastozgłoskowiec, haiku, etc.). Thanks to a contribution by Alexandr Rosen, TURPIS can versify in Czech as well.

TURPIS does not write from scratch. It needs training data, namely a sufficiently large file (>100KB) containing plain text in the given language. The text is scanned to extract all contiguous substrings that match the predefined types of verse lines. The extracted substrings are stored in a database indexed by each substring's meter, rhyme, and last word. To compose a poem, substrings are retrieved at random to form its consecutive lines. Since the extraction takes some time, one can save the database of extracted substrings and open it in the next TURPIS session instead of processing the same text again.
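
The extract-and-retrieve scheme described above can be illustrated with a small Python sketch. It is not the TURPIS code. The functions toy_meter and toy_rhyme are crude hypothetical stand-ins for the phonetic analysis (vowel letters are counted as syllables and the last two letters are taken as the rhyme), and all other names are likewise assumptions. The sketch also assumes that the last word is stored so that a line is never rhymed with another line ending in the same word.

```python
import random
from collections import defaultdict

VOWELS = set("aeiouyąęó")

def toy_meter(words):
    # Stand-in for the real meter: count vowel letters as syllables.
    return sum(ch in VOWELS for w in words for ch in w.lower())

def toy_rhyme(words):
    # Stand-in for the real rhyme: the last two letters of the last word.
    return words[-1].lower()[-2:]

def build_index(text, wanted_syllables, max_words=8):
    """Collect all contiguous word sequences with a wanted syllable count."""
    words = text.split()
    index = defaultdict(list)
    for start in range(len(words)):
        stop = min(start + max_words, len(words))
        for end in range(start + 1, stop + 1):
            chunk = words[start:end]
            n = toy_meter(chunk)
            if n in wanted_syllables:
                key = (n, toy_rhyme(chunk))
                index[key].append((" ".join(chunk), chunk[-1].lower()))
    return index

def rhymed_couplet(index, rng=random):
    """Pick two lines with the same meter and rhyme but different last words."""
    keys = [k for k, lines in index.items()
            if len({last for _, last in lines}) >= 2]
    if not keys:
        return None
    key = rng.choice(keys)
    first = rng.choice(index[key])
    second = rng.choice([l for l in index[key] if l[1] != first[1]])
    return first[0], second[0]

index = build_index("kot ma dom lis ma tom", wanted_syllables={3})
print(rhymed_couplet(index))
```

With this toy input the only usable key is (3, "om"), so the couplet consists of "kot ma dom" and "lis ma tom" in random order.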

A more detailed description of the program and several examples of its poetry in Polish may be found in my presentation at the ROJN seminar. Czech examples are presented in Alexandr Rosen's contribution to the festschrift for Vladimír Petkevič.

This downloadable ZIP file (288K) contains TURPIS and some examples. In more detail, the archive contains Speak::{Polish,Czech} plus the following files:

Michał Rudolf's Poeta

It is a very popular pastime to compose probabilistic context-free grammars (PCFGs) for generating grammatically correct but funnily nonsensical texts. It is by no means an entertainment peculiar to our times. In the communist countries, numerous PCFGs were devised on paper to model the official newspeak. Times have changed, technology has progressed, and nowadays some nonsense gets accepted at scientific conferences, cf. SCIgen's paper on "Rooter".
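
For readers unfamiliar with the technique, here is a minimal Python sketch of sampling a sentence from a PCFG. The grammar below is a made-up English toy, unrelated to any grammar published on this page, and the names GRAMMAR, expand and generate are hypothetical.

```python
import random

# Each nonterminal maps to a list of (probability, right-hand side) rules.
# Symbols that are not keys of GRAMMAR are treated as terminal words.
GRAMMAR = {
    "S": [(1.0, ["NP", "VP"])],
    "NP": [(0.6, ["the", "N"]), (0.4, ["the", "ADJ", "N"])],
    "VP": [(0.5, ["V", "NP"]), (0.5, ["sleeps"])],
    "N": [(0.5, ["fire"]), (0.5, ["wine"])],
    "ADJ": [(1.0, ["flat"])],
    "V": [(1.0, ["awaits"])],
}

def expand(symbol, rng):
    """Recursively rewrite a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]
    weights = [w for w, _ in GRAMMAR[symbol]]
    bodies = [b for _, b in GRAMMAR[symbol]]
    # Choose one rule at random according to the rule probabilities.
    body = rng.choices(bodies, weights=weights, k=1)[0]
    words = []
    for part in body:
        words.extend(expand(part, rng))
    return words

def generate(seed=None):
    rng = random.Random(seed)
    return " ".join(expand("S", rng))

print(generate())
```

Every output is grammatical with respect to the toy grammar, while the choice of words is left to chance, which is where the nonsense comes from.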

A brilliant contribution to PCFG text generation was made by dr Michał Rudolf in the subdomain of rhymeless and deeply profound Polish poetry. He no longer maintains his private webpage, but I have his permission to publish his PCFG here.

The downloadable ZIP archive (6.1K) contains:

Just a poem:

ogień śpi jak kałuża nicości
kiedy wino czeka na krzyk

sprawiedliwa fala rozmyła się w rozpaczy
płaskie serce nieba nie wróci
to ten kto pożąda ziemi