Parsing Latin poetry using constraint satisfaction

Written by

Write lines of dactylic hexameter on the left; the program will attempt to scan each line, printing the results on the right. You don't need to indicate which vowels are lōng or shŏrt; the program will figure it out. An example from the Aeneid is shown below.


Arma virumque cano Troiae qui primus ab oris
Italiam fato profugus Lavinjaque venit
litora multum ille et terris iactatus et alto
vi superum saevae memorem Iunonis ob iram;
multa quoque et bello passus dum conderet urbem,
inferretque deos Latio genus unde Latinum,
Albanique patres atque altae moenia Romae.
Musa mihi causas memora quo numine laeso,
quidve dolens regina deum tot volvere casus
insignem pietate virum tot adire labores
impulerit. Tantaene animis caelestibus irae?

How it works

Here is the JavaScript source code for this page. The process for scanning text works as follows:
  1. Split the text into lines; each will be scanned separately.
  2. Certain letter combinations are treated as indivisible units; mark them as such.
  3. Look at the neighbors of vowels to see which vowels are long by position or elided.
  4. Guess which vowels are long and short by nature, using the constraint that the meter is dactylic hexameter.

Scansion is entirely deterministic if you use accents to mark the so-called natural length of each vowel. But this program trades deterministic computing for an easy-to-use interface: it doesn't require you to mark your vowels in this way. Instead, it guesses the appropriate accents (i.e. the natural length of each vowel) using a search procedure, described below.
This simple search heuristic is not flawless, but it works surprisingly well. I suspect part of the success is due to the fact that the penultimate foot is almost always a dactyl (this is why I begin the search with the penultimate foot), and part to the fact that there are usually enough diphthongs and long-by-nature vowels around to make the problem highly constrained.

The program is interesting because it has to search and make guesses; it doesn't scan the way you would if you knew the lengths of the vowels by nature. Instead, it works backwards: starting from the assumption that the meter is perfect dactylic hexameter, it determines the natural length of each vowel by guessing which feet are dactyls and which are spondees.
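The backwards search can be sketched in JavaScript roughly as follows. This is a minimal illustration under stated assumptions, not the page's actual source: the function names `scan` and `compatible`, the 'L' / 'S' / '?' encoding of syllables, and the 'D' / 'S' / 'X' foot labels are all invented for the example. It assumes the line has already been split into syllables, each marked long, short, or unknown (for instance by the long-by-position step), and it enumerates the foot patterns consistent with dactylic hexameter, trying a dactyl in the penultimate foot first.

```javascript
// Hypothetical sketch: names and encodings are assumptions, not the page's code.
// Syllable marks: 'L' = known long, 'S' = known short, '?' = unknown.
// Required marks: 'L', 'S', or 'X' (anceps: either length is acceptable).
const FEET = [
  ['D', ['L', 'S', 'S']], // dactyl: long short short
  ['S', ['L', 'L']],      // spondee: long long
];

// A known length is compatible with a required length if either side is
// unconstrained or the two agree.
function compatible(known, required) {
  return known === '?' || required === 'X' || known === required;
}

// Return every foot pattern consistent with the known syllable lengths,
// e.g. 'DDSSDX' (four free feet, the fifth foot, then the final foot 'X').
// Patterns with a dactyl in the fifth foot are listed first.
function scan(syllables) {
  const results = [];
  for (const [fifthName, fifth] of FEET) {
    // Fix the end of the line first: fifth foot, then long + anceps.
    const tail = [...fifth, 'L', 'X'];
    const headLength = syllables.length - tail.length;
    // Four feet of two or three syllables each span 8 to 12 syllables.
    if (headLength < 8 || headLength > 12) continue;
    const tailFits = tail.every((required, i) =>
      compatible(syllables[headLength + i], required));
    if (!tailFits) continue;

    // Depth-first search over the first four feet.
    const search = (pos, names) => {
      if (names.length === 4) {
        if (pos === headLength) {
          results.push(names.join('') + fifthName + 'X');
        }
        return;
      }
      for (const [name, foot] of FEET) {
        if (pos + foot.length > headLength) continue;
        const fits = foot.every((required, i) =>
          compatible(syllables[pos + i], required));
        if (fits) search(pos + foot.length, [...names, name]);
      }
    };
    search(0, []);
  }
  return results;
}

// "Arma virumque cano Troiae qui primus ab oris", with hand-assigned marks:
// Ar- and -rum- long by position, Troi- and -ae long, everything else unknown.
const firstLine = ['L', '?', '?', 'L', '?', '?', '?', 'L', 'L',
                   '?', '?', '?', '?', '?', '?'];
console.log(scan(firstLine)); // → [ 'DDSSDX', 'DDSDSX' ]
```

With only four syllables known in advance, two patterns survive. The first, which has a dactyl in the fifth foot, is the conventional scansion of this line, which illustrates why preferring a dactyl in the penultimate foot is a useful tie-breaker. Once a pattern is chosen, every syllable marked '?' receives the length that its foot requires.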