Some thoughts on classic optimality theory (Prince & Smolensky 1993/2004)
[This is a term paper of mine. I am not that familiar with OT, so I cannot guarantee that no one has made similar proposals before; comments are welcome.]
Basic structure of classic optimality theory
Generation function: G(input) = {C1, C2, ..., Cn}
Evaluation function: E({C1, C2, ..., Cn}, Con) = Ck (1 <= k <= n)
Ck = <In_k, G(In_k)>, where In_k is the k-th input and 1 <= k <= n
To be more precise, the G-function can be taken as a function of the form P(g(x), x), in which x is the input, g is a generation function that generates an infinite number of outputs, and P is a pairing function that forms an ordered pair, called a candidate, in which an input is paired with one of its corresponding outputs. The G-function thus generates an infinite number of candidates, which are the inputs to the E-function. The E-function, in turn, can be decomposed into a composite function of the form S·O·H, in which H is the function that calculates harmony values, O is an ordering function that imposes an order on the candidate set, and S is a selection function that picks out the most harmonic candidate and returns that candidate's G-output as the output of the E-function.
Thus an OT grammar could be written as S·O·H·P(g(x), x).
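As an illustration, the decomposition S·O·H·P(g(x), x) can be sketched in Python. The generator g and the harmony function H below are placeholder toys of my own, not actual OT components (a real candidate set is infinite, and real harmony is computed over a constraint hierarchy):

```python
def g(x):
    # Toy generator: a finite sample of outputs for input x.
    return [x, x + x, x.upper()]

def pair(x):
    # P: pair the input with each generated output to form candidates.
    return [(x, out) for out in g(x)]

def H(candidate):
    # Toy harmony function: shorter outputs count as more harmonic.
    return -len(candidate[1])

def O(candidates):
    # O: order the candidate set by harmony, most harmonic first.
    return sorted(candidates, key=H, reverse=True)

def S(ordered):
    # S: select the most harmonic candidate and return its G-output.
    return ordered[0][1]

def grammar(x):
    # The full OT grammar as the composite S·O·H·P(g(x), x).
    return S(O(pair(x)))

print(grammar("abc"))  # abc
```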
A fine-grained H-function
The H-function is calculated over an ordered set of constraints (called the constraint hierarchy). The constraints themselves are said to be substantially universal; their order, on the other hand, is said to be language-specific.
The computational-efficiency-driven minimal-constraint condition (CED-MCM):
Minimize the number of universal constraints.
(Why such a condition? In classic OT it is often said that the E-function does everything in one pass, yet this should not be confused with computational efficiency: how many passes a task takes is not the same as how much time it requires. Note that, theoretically, in the worst case we must go through the whole constraint hierarchy to find the most harmonic candidate, owing to the freedom of the G-function.)
There are 7000+ languages in the world. Suppose they are all distinct (in a rather loose sense); then we need at least 8 universal constraints to account for them, based on factorial typology, which says that n constraints yield n! ("n factorial") possible rankings: 7! = 5040 is too few, while 8! = 40320 suffices. And by CED-MCM we need at most 8 constraints to account for the typological facts. (I speculate that the set of universal constraints could be smaller, but here I take 8 as the null hypothesis.)
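The factorial-typology arithmetic can be checked directly; a minimal sketch:

```python
import math

# n constraints yield n! possible rankings; find the smallest n
# whose n! covers 7000+ distinct languages.
rankings = {n: math.factorial(n) for n in range(6, 10)}
# 7! = 5040 is too few for 7000+ languages, while 8! = 40320 suffices.
smallest_n = min(n for n in rankings if rankings[n] >= 7000)
print(smallest_n)  # 8
```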
The Sum-of-Order Method (SoO)
Suppose we have a constraint hierarchy (CH) {A, B, C, D, E, F, G, H}, in which each constraint is assigned a weight based on its position in the current (language-specific) CH, from 1 to 8: {1, 2, ..., 8}. If a candidate violates a constraint, the candidate's value under that constraint is 1 × weight; otherwise, 0 × weight.
(The basic idea is to sum the weights of the violated constraints in the CH, which amounts to summing the orders of the violated constraints; hence Sum-of-Order, SoO.)
We can then calculate the harmony value on this basis.
Now suppose we have an E-input (the input to an E-function) whose G-output violates C, D, and F, and another one violating C, F, G, and H. We can then calculate the harmony values (Vh) of the two: the former is 3 + 4 + 6 = 13, while the latter is 3 + 6 + 7 + 8 = 24.
Obviously, the latter is better, since its Vh is relatively higher (meaning its violations fall on lower-ranked constraints, not that it violates fewer constraints). This apparently also holds when put in an OT tableau.
Case 1
|    | A | B | C | D | E | F | G | H |
|----|---|---|---|---|---|---|---|---|
| C1 |   |   | * | * |   | * |   |   |
| C2 |   |   | * |   |   | * | * | * |
The E-function thus would choose the G-output of C2 as the final output.
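The SoO calculation for Case 1 can be sketched as follows (I represent a candidate simply as the set of constraint names it violates; this representation is my own):

```python
CH = ["A", "B", "C", "D", "E", "F", "G", "H"]
WEIGHT = {c: i + 1 for i, c in enumerate(CH)}  # A=1, ..., H=8

def soo(violations):
    # Sum the weights (hierarchy positions) of the violated constraints.
    return sum(WEIGHT[c] for c in violations)

c1 = {"C", "D", "F"}
c2 = {"C", "F", "G", "H"}
print(soo(c1), soo(c2))  # 13 24
```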
A serious problem: if a candidate C3 violates A through F while a candidate C4 violates F through H, the two have the same Vh = 21, so the H-function cannot tell which one is more harmonic, even though C3 is not supposed to be the winner according to the tableau below.
Case 2
|    | A | B | C | D | E | F | G | H |
|----|---|---|---|---|---|---|---|---|
| C3 | * | * | * | * | * | * |   |   |
| C4 |   |   |   |   |   | * | * | * |
Even worse,
Case 3
|    | A | B | C | D | E | F | G | H |
|----|---|---|---|---|---|---|---|---|
| C5 | * | * | * | * |   |   |   |   |
| C6 |   |   |   |   |   | * |   |   |

Here SoO assigns C5 a Vh of 1 + 2 + 3 + 4 = 10 against C6's 6, so SoO selects C5 outright, even though the tableau clearly favors C6: no longer a mere tie, but a wrong winner.
Thus the SoO method just proposed is invalid.
Any working modifications?
One might think of a "Number of Violations (NoV) × SoO" method, yet it won't work either: in Case 2, the former ends up with the value 6 × 21 = 126 and the latter with 3 × 21 = 63, which is exactly the opposite of the actual result.
Another method involving these two variables would be SoO/NoV (V-SN), but consider the following case:
Case 4
|    | A | B | C | D | E | F | G | H |
|----|---|---|---|---|---|---|---|---|
| C7 |   | * |   |   |   |   |   | * |
| C8 |   |   |   | * |   | * |   |   |
Here, according to the tableau, the optimal candidate should be C8, yet the V-SN values of C7 and C8 are both 5.
Even worse:
Case 5
|     | A | B | C | D | E | F | G | H |
|-----|---|---|---|---|---|---|---|---|
| C9  | * | * |   |   |   | * | * | * |
| C10 |   | * |   | * |   | * |   |   |
In this case, the V-SN value of C9 is 24/5 = 4.8, while that of C10 is 12/3 = 4. Since 4.8 > 4, C9 would appear to be the optimal candidate, which is exactly the opposite of the tableau result, where C10 wins.
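The failed variants above can be checked mechanically (candidates are again sets of violated constraints, with the violation profiles following the tableaux as I read them):

```python
WEIGHT = {c: i + 1 for i, c in enumerate("ABCDEFGH")}  # A=1, ..., H=8

def soo(v):
    # Sum-of-Order: sum of the weights of the violated constraints.
    return sum(WEIGHT[c] for c in v)

def nov_soo(v):
    # Number of Violations times SoO.
    return len(v) * soo(v)

def v_sn(v):
    # SoO divided by Number of Violations.
    return soo(v) / len(v)

c3, c4 = set("ABCDEF"), set("FGH")   # Case 2
c9, c10 = set("ABFGH"), set("BDF")   # Case 5
print(soo(c3), soo(c4))          # 21 21  -> a tie; SoO cannot decide
print(nov_soo(c3), nov_soo(c4))  # 126 63 -> picks C3, the wrong winner
print(v_sn(c9), v_sn(c10))       # 4.8 4.0 -> picks C9, the wrong winner
```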
It is on the basis of these results that I suspect no method based on global evaluation would work.
Local evaluation method
A plausible method thus seems to be to restrict the evaluation procedure to a local domain (how the domain is to be defined is still left to explore; here I take 1 as the null hypothesis). That is, one takes the set of candidates and goes through every constraint in the CH in a top-down manner. At each step, that is, under a specific constraint, the E-function takes the candidate set and returns the optimal candidates, which serve as the input at the next step. The whole procedure stops when the E-function returns a singleton, whose element is the final optimal candidate. Under this formulation, the problem in Case 2 immediately dissolves.
This formulation can be written as follows (C_k denotes the k-th candidate set, E_k the E-function under the k-th constraint, and G(In) the output of the G-function):
- E_1(C_1) = E_1(G(In))
- E_n(C_n) = E_n(E_{n-1}(C_{n-1}))
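The stepwise procedure can be sketched as follows (the dict-based candidate representation and the tie rule "keep the candidates with the fewest violations of the current constraint" are my own implementation choices):

```python
def local_eval(candidates, hierarchy):
    # Filter the candidate set one constraint at a time, top-down,
    # stopping as soon as a single survivor remains.
    survivors = list(candidates)
    for con in hierarchy:
        best = min(c["violations"].get(con, 0) for c in survivors)
        survivors = [c for c in survivors if c["violations"].get(con, 0) == best]
        if len(survivors) == 1:
            break
    return survivors[0]

# Case 2 from above: C4 now wins, since C3 is already eliminated at A.
c3 = {"name": "C3", "violations": {c: 1 for c in "ABCDEF"}}
c4 = {"name": "C4", "violations": {c: 1 for c in "FGH"}}
print(local_eval([c3, c4], "ABCDEFGH")["name"])  # C4
```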
Other issues
In practice, the choice of candidates seems to me to be cherry-picked, and the constraint hierarchies actually formulated seem troublesome: owing to the freedom of the G-function, one can always find candidates that the same CH selects as optimal but that are not the actually attested optimal candidates.
Besides, there are too many constraints around; even in P&S (1993/2004: 281-282) there are more than 30 of them.
In brief, Con (the set of constraints) and the G-function both need to be restricted.
[won't elaborate here, excuse my laziness]
An alternative proposal of OT
I would therefore like to propose a different approach. It is usually said that in rule-based phonology, rule ordering is imposed (extrinsic) rather than intrinsic.
I thus propose the following architecture:
The G-function is an unordered set of language-specific/universal [to be explored] rewriting rules.
The E-function evaluates the possibilities generated by these rules against a universal set of constraints whose order is language-specific.
In this sense, the architecture could be called an order-evaluating architecture. It is, however, in my eyes, a remedy for rule-based phonology.
However, whether this model works still needs empirical evidence.