A quick intro for 590DPL...

I created this project for my husband's research; he's a PhD student in Slavic and Balkan Linguistics at the University of Chicago.

There's no perfect parallel in English for the data I have here, but here's the best explanation I've come up with: Let's say you have the words park, car, bar, far, jar, par, SARS, and you go around to different English-speaking cities and ask how people pronounce them. On the West Coast, you'd probably get the "standard" pronunciation: park, car, bar, etc. In the South, you might get pahk, cahr, bahr. In Boston, you might get paek, cae, bae.

Now, the thing is, in these Bulgarian villages, you'll often see two or more variants, as if you were to find a town where they say pahk, but they also say cae. There might also be villages where they say "ah" for everything except the word "car". That'd be something really interesting for a phonologist like my husband. Also interesting for him are villages that have (roughly) the same distribution of the different pronunciations.

To translate the terminology I use on the site into the English example...

Sites = villages

Reflexes = the ways the words are pronounced that differ from place to place. So, in this example, the "reflexes" would be "ar", "ah", and "ae".

Lexemes = the fundamental "word", regardless of how it's pronounced. I use the letter R (which looks like a P) to represent any of the reflexes. So the "lexemes" for the words I've used in the English example would be pRk, cR, bR, etc.

Tokens = the words that people say: a specific lexeme with a specific reflex
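
To make the four terms concrete, here's a minimal Python sketch of how they fit together, using the English example. This is only an illustration, not the site's actual code or database layout: the names Token, surface, reflex_distribution, and exceptions, the town names, and the sample data are all made up for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One recorded word: a specific lexeme with a specific reflex at a site."""
    site: str    # the village (hypothetical names below)
    lexeme: str  # the abstract word, with R standing in for the reflex, e.g. "pRk"
    reflex: str  # the variant actually heard, e.g. "ar", "ah", or "ae"

    def surface(self) -> str:
        # Assumes the placeholder R appears exactly once in the lexeme.
        return self.lexeme.replace("R", self.reflex)


def reflex_distribution(tokens, site):
    """Share of each reflex among the tokens recorded at one site."""
    counts = Counter(t.reflex for t in tokens if t.site == site)
    total = sum(counts.values())
    return {reflex: n / total for reflex, n in counts.items()}


def exceptions(tokens, site):
    """Tokens at a site whose reflex differs from that site's most common one."""
    site_tokens = [t for t in tokens if t.site == site]
    if not site_tokens:
        return []
    majority, _ = Counter(t.reflex for t in site_tokens).most_common(1)[0]
    return [t for t in site_tokens if t.reflex != majority]


# Made-up sample data: Town A says "ah" for everything except the word "car".
tokens = [
    Token("Town A", "pRk", "ah"),
    Token("Town A", "bR", "ah"),
    Token("Town A", "jR", "ah"),
    Token("Town A", "cR", "ae"),
    Token("Town B", "pRk", "ar"),
    Token("Town B", "cR", "ar"),
]

print(reflex_distribution(tokens, "Town A"))
print([t.surface() for t in exceptions(tokens, "Town A")])
```

Comparing the output of reflex_distribution for two sites is one simple way to spot villages with roughly the same mix of pronunciations, and exceptions picks out the odd words, like "car" in Town A.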

A font that supports Unicode 5.1 (such as Doulos SIL) is critical for the text to show up right.

If you have any questions, just ask! I'm happy to write up how-tos for anything I've done here.