Tuesday, June 16, 2015

How to Simulate Infixation in Hunspell

Infixation is where the affix is “inserted” into the stem, usually between the first consonant (if there is one) and the vowel of the first syllable. In Iloko and other Philippine languages, this type of affix is very productive and occurs in many paradigms across many of the lexical categories.

The only affix types that Hunspell recognizes, however, are “prefix” and “suffix”; in other words, affixing to the left and to the right of the stem. Nevertheless, rules can be written to simulate the process.

In Iloko it is rather simple: if the first syllable begins with a consonant, insert the infix between it and the vowel; otherwise, treat it like a prefix.

Example
root: sarita – talk, speech
s<um>arita

root: andar – to run (of machines), to function, to operate
um-andar

The maximal syllable in Iloko is CVC, which makes infixation simple. But with the adoption of Spanish and English loans, there are syllables that begin with two or more consonants. For example, prito (from the past participle of Spanish freir, lit. “fried”) is a commonly used word in Iloko and Tagalog. Iloko’s strategy for infixation is to insert before the vowel, i.e. prinito. Initial clusters thus become another consideration.
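The placement rule described so far (insert before the first vowel, whether the root starts with a vowel, a single consonant, or a cluster) can be sketched in a few lines of Python. This is only an illustration of the linguistic rule, not of Hunspell itself; the function name `infixate` and the five-letter vowel set are assumptions made for the example.

```python
VOWELS = "aeiou"  # assumed vowel letters for this sketch


def infixate(root: str, infix: str) -> str:
    """Insert `infix` immediately before the first vowel of `root`."""
    for i, ch in enumerate(root):
        if ch in VOWELS:
            # Everything before the first vowel is the onset (possibly empty).
            return root[:i] + infix + root[i:]
    # Assumption: a root without any vowel letter simply takes the affix as a prefix.
    return infix + root


print(infixate("sarita", "um"))  # single-consonant onset
print(infixate("andar", "um"))   # vowel-initial root, behaves like a prefix
print(infixate("prito", "in"))   # cluster onset
```

With an empty onset the result is plain prefixation, which is why the vowel-initial case needs no special treatment here.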

Hunspell rules have to be written in such a way that the first consonant (if the first syllable has one) and the infix together appear as if they were a prefix.

PFX I  Y  25
PFX I  0  um   [aeiou]
PFX I  b  bum  b
PFX I  d  dum  d
PFX I  g  gum  g
. . .
PFX I  t  tum  t

The first rule is straightforward:

If the root begins with a vowel, treat the infix as a prefix and attach it to the left side of the root.

So, with uli “to ascend, go up” the result is umuli. The remaining rules show how to deal with roots and stems that begin with consonants.

The value in the third “column” is the string of characters to remove from the beginning of the root: “b” in the second rule and “d” in the third. We remove it because we replace the initial letter with what is in the fourth column, e.g. “bum”, which is the initial consonant with the infix in place. The fifth column specifies the condition under which the rule applies. As expected, for the “b” rule it is “b”.

Infixation

root: takder – to stand

1) akder (remove the ‘t’)
2) tum (assemble the pseudo-prefix)
3) tumakder (add to root’s left-word edge, the beginning)
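The strip, add, and condition steps can be mimicked outside Hunspell to check a rule by hand. The following is a minimal sketch, assuming the condition can be treated as a Python regular expression anchored at the start of the word (true for the simple conditions used here); `apply_pfx` is a hypothetical helper, not part of any Hunspell API.

```python
import re


def apply_pfx(word, strip, add, condition):
    """Apply one Hunspell-style PFX rule; return None if the rule does not apply."""
    # In affix files "0" stands for "nothing to strip" (or an empty affix).
    strip = "" if strip == "0" else strip
    add = "" if add == "0" else add
    # The condition is checked against the beginning of the word.
    if not re.match(condition, word):
        return None
    if not word.startswith(strip):
        return None
    # Remove the stripped characters, then attach the pseudo-prefix.
    return add + word[len(strip):]


print(apply_pfx("takder", "t", "tum", "t"))      # consonant rule applies
print(apply_pfx("uli", "0", "um", "[aeiou]"))    # vowel rule applies
print(apply_pfx("takder", "0", "um", "[aeiou]")) # vowel rule does not apply
```

The three arguments after the word correspond to the third, fourth, and fifth columns of a PFX line.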

With this strategy, rules simulating the process of infixation can be written. The number of rules needed, however, is determined by the number of possible onsets in Iloko: 14 single consonants and 10 clusters, so 24 distinct rules, plus the one for vowel-initial roots, which gives the 25 declared in the rule header.
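Since every consonant rule has the same shape, the block can be generated rather than typed by hand. The sketch below assumes a helper named `pfx_rules` (hypothetical) and uses a deliberately incomplete onset list for illustration; the full inventory of Iloko onsets is not reproduced here.

```python
def pfx_rules(flag, infix, onsets, vowels="aeiou"):
    """Build the lines of a PFX block that simulates one infix."""
    # Header: flag, cross-product marker, and rule count (onsets + the vowel rule).
    lines = [f"PFX {flag}  Y  {len(onsets) + 1}"]
    # Vowel-initial roots: strip nothing, prefix the infix.
    lines.append(f"PFX {flag}  0  {infix}  [{vowels}]")
    # Consonant-initial roots: strip the onset, add onset + infix.
    for onset in onsets:
        lines.append(f"PFX {flag}  {onset}  {onset}{infix}  {onset}")
    return lines


# Illustrative subset only, not the full set of Iloko onsets.
SAMPLE_ONSETS = ["b", "d", "g", "t", "pr"]

print("\n".join(pfx_rules("I", "um", SAMPLE_ONSETS)))
```

One point to watch if clusters are listed alongside single consonants: a condition such as “p” would also match a root beginning with “pr”. Tightening the single-consonant conditions (for example to `p[aeiou]`) is one possible way around that, though it goes beyond the rules shown above.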
