Linguistics Olympiad Training Material Edited
Table of Contents
2. Introduction
3. Phonetics
Figure 1
Figure 2
Figure 3
4. Morphology
5. Syntax
6. Bibliography
6.1 Books
6.2 Websites
1. A sample Linguistics Problem...
Read the following sentences in Slovenian and their translations in English. Then
answer the questions that follow:
8. On je nesmiseln. He is pointless.
1. Ubil si ga nemiselno.
3. Duhovnik te je pobral.
6. Fant se je ubil.
-By Sagar Sarda
1.1 So how to solve it?
Try solving it yourself first, then go through the material on morphology and syntax and
attempt it again. After that, compare your answers here.
So, let’s begin with the verbs, since they’re often the most susceptible to change in any
language (depending on the subject, object, tense, mood, etc.). Here, eating seems to
be the most repeated verb, so note down all the cases of its appearance:
Common words? Pojed + el/la. Any other appearances of pojed? No, so let’s move to the
next verb - to kill:
On je nesmiseln = He is pointless.
Je = to be. However, there are other appearances of this word, which need to be kept in
mind. The verb ‘to be’ often serves the purpose of an auxiliary verb (used to conjugate
verbs in different tenses; to be and to have are the most common of such verbs, for
example: I was going home)
The other verbs can be deduced similarly: udaril=to hit; spustil=to drop; pobral=to
pick up.
That leaves “si”, “se”, “ga”, “me”, “on”, “sem”, “in”, and the scary big-looking words.
Common features? That’s right, they all have ‘you’ as the subject of the sentence.
Next, “se”. Where does it occur? Only one instance:
We already know that pojedel is the verb, meaning “to eat”, and that “si” provides the
subject “you”, so what’s left? The object. So, either “se” means the object is 2nd person
(you) or that the subject and object are identical (in either case the meaning of the
sentence would not change). Since we do not have any additional information, we will have
to leave it at that (there are no more reflexive sentences, that is, sentences with the
same subject and object, or sentences with “you” as the object).
Next, “ga”:
They both have “him” as an object in the sentence, but the evidence is still not entirely
conclusive, given that the second sentence has so many unknown words.
Here, “me” is the common object, so me=me (though both are pronounced differently!).
Knowing this word enables us to conclusively say that “ga” = 3rd person object, as earlier
we did not know the meaning of “me”, but since that is now resolved, it makes sense to
conclude that ga=him.
On je nesmiseln is the only example of “on”. It means “He is pointless”, and is also the
only occasion where “he” is the subject of the sentence in the corpus (set of predefined
information).
“Sem”, too, has only one appearance, in the 9th sentence, where it signifies that the
subject is the first person (I...). So sem=I.
You do not need to provide elaborate explanations, like those above, of the meaning of
each word and how it was deduced. In an actual problem it will suffice to list these
meanings, and then list the rules of syntax and morphology.
However, if you feel more comfortable with it, you can answer the problem in
narrative style, showing exactly how you arrived at your answer (saying I did this, then
this was apparent to me, so I deduced this and conclude this...), though the narrative
approach is usually lengthier. All that you really need to provide is enough information
about the language for any layman (like a computer!) to reach the exact same answers
you did with just your rule set.
Lastly, to translate English into Slovenian, we will need to know rules of syntax and the
application of “je”. NOTE: Many people think that since they have obtained the meaning
of all individual words, the problem is mostly over and that they will score highly even if
translations are not entirely accurate. This is not at all true. If the rules of syntax and
morphology are not elaborately explained, even completely correct solutions will not
score highly. However, if rules of syntax are mostly explained, even if one translation is
not entirely correct, or the explanation contains an inaccurate rule, it is still possible to
obtain a high number of points. Syntax and morphology form the main part of the
problem, and the rules MUST satisfy EVERY sentence WITHOUT exception (this is the
only way to ensure that the rules are correct; they must also make intuitive sense, as
the unknown language is still a real one, spoken by real people, who would favor
choosing the simplest explanation).
As you might have noticed, sentences 2, 5, 9 and 10 begin with verbs and have the
syntactic structure VSO (verb-subject-object), unlike the others, which are
SOV (subject-object-verb). So what’s common to all these sentences?
Three of them have “you” as the subject of the sentence. The last has “I”. But that is
the only incidence of “I” as a subject in the entire corpus, and every sentence in the
corpus not listed here has a 3rd person subject (he, or some noun such as animal, boy,
priest or police officer). That is their common link, which means it is safe to assume
that the sentences that are thus structured (VSO) are the sentences that use the personal
pronouns for first and second person subjects (i.e. the subject is either first or second
person).
So what else is left unexplained? The use of “je” in sentences with other verbs as well
(sentences 1, 3, 4, 6; sentences 7 and 8 have “je” as the primary verb-there is no other
verb in the sentence):
So, we have a rule to identify a first or second person subject (using VSO). All of the
cases listed here have a third person subject, while every sentence not listed here either
has a first or second person subject or uses “je” in its ordinary sense, to mean “is”.
Clearly, then, “je” appears specially in cases where the subject is third person. With
this knowledge, it may even be concluded that “je” is not the word for “is” at all, and
that “to be” is implied when there is no verb present.
So, you should provide both possible answers in such cases, with explanations. The
answers are given on the following page.
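The rule set deduced above is small enough to write down mechanically, which is a handy check that it really covers every sentence. The sketch below is only an illustration of the rules as stated in this walkthrough, not a description of real Slovenian grammar: the function name build_sentence and the dictionary names are invented for the example, the -el/-la verb endings are ignored, and only the words discussed above are included.

```python
# Word lists taken from the deductions above (names are hypothetical).
SUBJECT_MARKERS = {"I": "sem", "you": "si"}          # 1st / 2nd person subjects
OBJECT_PRONOUNS = {"me": "me", "you": "te", "him": "ga", "self": "se"}
VERBS = {"hit": "udaril", "kill": "ubil", "drop": "spustil", "pick up": "pobral"}


def build_sentence(subject, verb, obj):
    """Apply the two word-order rules deduced in the walkthrough."""
    v = VERBS[verb]
    o = OBJECT_PRONOUNS[obj]
    if subject in SUBJECT_MARKERS:
        # Rule 1: first or second person subject -> verb, subject marker, object.
        words = [v, SUBJECT_MARKERS[subject], o]
    else:
        # Rule 2: third person subject -> subject, object, "je", verb.
        words = [subject, o, "je", v]
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


print(build_sentence("on", "pick up", "you"))        # On te je pobral.
print(build_sentence("duhovnik", "pick up", "you"))  # Duhovnik te je pobral.
print(build_sentence("you", "kill", "him"))          # Ubil si ga.
```

If a rule set written this way cannot reproduce a sentence from the corpus, that is a sign that a rule is missing or wrong.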
1.3 Answers to the Sample Linguistics Problem:
1.3.1 Answers to Slovenian-English:
3. In this sentence we encounter a word we have not seen before, “te”. Given its
placement and its morphological similarity to “me”, it makes sense that it signifies
a second person object, or “you”, which is not otherwise encountered in the corpus,
except when the reflexive “se” is present. So: A priest picked you up.
5. I dropped myself.
10. On te je pobral.
2. Introduction:
A linguist, contrary to popular belief, is not someone who speaks many languages and acts as a
translator, though this is often a by-product of the job. A linguist is someone who studies how
language works, and formulates rules for the operation of languages that are as universal as possible.
In a sense, a linguist “deciphers” languages. As a result, knowledge of one language is sufficient to be a
linguist, though knowing more does give a more intuitive feel for the subject.
The study of such rules is a part of linguistics. However, linguistics is far broader than just this. Any
word that a person “knows” consists of 5 pieces of information 1:
- Morphological structure: details all the smaller bits it can be broken into. For example,
“disassembled” can be broken into 3 smaller parts, “dis-assemble-d”, each part carrying a
specific meaning: “dis-” means “undo” or “opposite”; “-assemble-” means “to put together”; “-d”
indicates that the verb is in the past tense;
1
Akmajian, A.A., Demers, R.A., Farmer, A.K., Harnish, R.M. (2010) Linguistics: An
Introduction to Language and Communication, 6th Edition, Cambridge, Massachusetts: The
MIT Press. Page 14.
- Syntactic structure: where and how the word would appear in a sentence-for example, in Hindi,
main paani pita hoon would translate into English as “I water drink”; the languages follow
different sentence structures - Hindi follows Subject-Object-Verb (SOV), while English follows
Subject-Verb-Object (SVO), and instinctively, a bilingual person would translate main paani pita
hoon as “I drink water”, not “I water drink”, based on the syntactic information he or she knows
about nouns and verbs in each language.
- Semantic information: words often have specific, distinct meanings (or denotations), but often
they are used to refer to the connotations of the word itself, rather than its denotations. For
example, the denotation of “blacksmith” is someone who works with metals like iron and
produces armor, weapons, etc, and repairs such objects. However, the connotations of
“blacksmith” include whatever you would normally associate with a blacksmith-for example, a
blacksmith is commonly huge, robust and unhygienic, with black hands from beating metal. So, if
someone were to say that “John has a blacksmith’s hands”, it would mean that John’s hands are
black and dirty, even though blacksmith’s hands need not necessarily be black and dirty; a
particular blacksmith’s hands may be cleaner than most, but what is commonly associated with
“blacksmith” is “dirty hands”. Similarly, the connotations of “brother” include “comrade”,
“friend”, “helpful”, and, more recently, “irritant” & “mean”; “mother” would connote
“protection”, “affection”, “care”, etc., while it denotes “female biological parent”. Meaning of
words is divided into 2 categories - denotations and connotations, and semantics is the study of
this extended meaning and the nature of meaning itself.
- Pragmatic information: many words we know have several meanings. For example, “bat” could
refer to the mammal that flies and hangs upside down during the day in dark caves, or to the
instrument used in cricket to hit balls, or even to the act of “batting”. However, we can tell
the difference intuitively. When we say “the bat is emitting ultrasonic waves”, we understand
that the “bat” being referred to here is probably the mammal. Syntactically, there is nothing
wrong with a cricket bat producing ultrasonic waves, but logically we assume that the bat must
be the mammal, which is known to produce such sounds.
3. Phonetics:
Phonetics of vowels and consonants are very different. Since there are more consonants, and
consonants interchange with clearer patterns, we’ll concentrate on them first.
The key ways in which consonants differ from each other are in:
a) Place of articulation
b) Manner of articulation
c) Voicing
d) Use of nose
Figure 1. English Consonants Chart (see footnote 2). NOT based on the International Phonetic Alphabet (IPA).
2
Acosta, F.F. (2008) 26 September, English Consonants Chart, [Online], Available:
http://fajardo-acosta.com/worldlit/language/phonology.htm [Accessed 1 September 2011]
The voiceless affricate, “č”, is similar to “ch” in “church” in English; the voiced affricate, “j” is the
“j” sound in “judge” in English.
The palatal voiced fricative, “ž”, is a sound similar to the “s” sound in “measure” or “fusion”,
while the voiceless palatal fricative, “š” (also represented by “∫”, which resembles the integral
sign), is the “sh” sound in “shoulder” in English.
The “θ” is the “th” sound in “thing” or “thin” or “through”, whereas “ð” is the “th” sound in “this”
or “that” or “the”.
The retroflex liquid “r” is the “r” in general American or British English, while the liquid flap “r” is
the “r” in other European languages, like Spanish. The velar nasal, “ŋ”, is pronounced as “ng”, as
in “sing”. It is more prominent in African languages, and some Eastern European languages.
The nasals, liquids and semivowels are all voiced, and together, are known as the sonorants, due to
their resonating quality. They are not true consonants, as airflow, though somewhat restricted,
remains largely open (consonants allow a significantly smaller volume of air through
unobstructed).
The glottal stop is something that appears primarily in British English, in place of “t”, for example,
“Bri’ish” (British).
While it is not necessary to know the names of the various places of articulation and various types
of consonants, it is important to be able to identify similarities between different sounds-like the
fact that “p” and “t” are stops/plosives; “p”, “b”, “m” and “w” are all articulated at the same place
in the mouth. For instance, a particular problem listed the names of Burmese children and their
dates of birth. Then, another set of birth dates was given, along with another set of children’s
names, and the task was to match the birth dates to the names. The solution entailed discovering
that the place of articulation of the first sound of the name was decided by the day of the week on
which the child was born. As is evident, there are 6 distinct sites for 6 different days, and the
seventh, Sunday, was a vowel.
So, to summarize:
a. Stops/plosives: Entail a complete halt in airflow for a fraction of a second. For this
reason, they cannot be extended the way “s” or “z” can (those entail continuous exhalation).
{p, b, t, d, k, g}
b. Fricatives: entail a partial block in airflow for the duration of pronunciation (air still
continues to leave the mouth, but the tongue/teeth/lips partially obstruct it to produce
friction, and hence sound). {f, v, s, z, θ, ð, ∫, ž}
c. Affricates: the sound begins as a stop, but ends as a fricative. Only 2 in English. {č, j}
d. Sonorants: a pseudo-consonant class of sounds, further divided into nasals, liquids, and
semi-vowels.
i. Nasals: stop sounds elongated by exhaling through the nose. As a result, when
people have a “blocked nose”, nasals begin sounding like their corresponding
voiced stop (m --> b; n --> d; ŋ --> g). Also, it is difficult to make the transition from
a nasal to consonants other than the corresponding voiced stop, so many
languages transition out of nasals by either placing a vowel after the nasal, or the
corresponding stop consonant. Interestingly, many languages do not distinguish
between the nasals, especially African ones (for example, “m” and “n” are
perceived to be the same). For instance, they will have conjugations where the
first sound changes accordingly: _ _ xxxxxxx, where the Xs are permanent and the
dashes are the variable letters. So, if the second variable is “d”, then the first is “n”
by default; if the second is “g”, then again the first is “n”, and if the second is “b”,
then the first is “m”. In a sense, “m” is not a different character in its own right,
rather, “mb”, “nd”, and “ng” are the 3 distinct letters.
For a specific example, Malagasy provides an ideal case of such
nasal-morphing. Many words are formed by adding prefixes to existing words, and
the prefix zafi+(nasal) is one example of such a prefix in Malagasy. It is important
to note that the prefix itself differs in spelling based on the following consonant.
For instance, “hafaladia” is a word that needs prefixing, and the beginning two
letters of this word “ha-” are just placeholders because in this language, the word
needs a prefix in any form, so “ha-” provides an empty, or meaningless prefix.
When this is eliminated, the morphemes “zafi+(nasal)” and “faladia” need to be
combined. This is done by altering the last character of the first morpheme and
first character of the second morpheme, and accordingly, (nasal) --> m and f --> p,
f’s corresponding stop sound, since it is easier to transition from a nasal to a stop,
than nasal to fricative while enunciating. So, the word becomes zafim-paladia, or
zafimpaladia. In another case, when the same prefix is added to “kitrokely”, the
new word is zafin-kitrokely, since it is easier to transition from “ŋ” to “k” than
from any other nasal. The nasal is written as “n” in the Latin transcription
only because this script does not distinguish between “n” and “ŋ”; the word is
in fact pronounced with “ŋ”.
Another example is found in English itself. The prefix “in-” stands for opposite, or
contrary, as in “decent” and “indecent”. Yet this prefix is sometimes spelt
differently based on the next letter, as in “proper” and “improper”, where it is
spelt “im-” because “m” is easier to pronounce before “p” than “n” is. “M” and “p”
are pronounced in the same place, just as “n” and “d” are pronounced in the same
place.
ii. Liquids: these are an important class, as many languages differentiate between
this class and others. Often, the two sounds in the class, “l” and “r”, sound
absolutely identical to speakers of languages that contain only one liquid, such as
Japanese and many African languages, and these speakers interpret any liquid sound
they hear as the liquid they know. As a consequence, they often confuse “l” and “r”;
they would say “liver” instead of “river” or “berry” instead of “belly”. What
exacerbates the problem is that in half of the cases they will preserve the correct
liquid, while in the other half they will change it.
problem 5) concerning two dialects of Romansch with different vowel
sounds in certain cases, for which rules needed to be deciphered. The solution
entailed discovering that “u” in the first dialect remained “u” in the second dialect
if the next letter was a liquid and the liquid was not followed by another
consonant. If either of these conditions was violated, then the “u” in the first
dialect would be pronounced as “uo” in the other dialect.
iii. Semi-vowels: also known as glides, these are mainly a class that do not really
perform functions of vowels or consonants. All that is needed to be known about
them is that they are voiced, and “w” is pronounced at the lips, while “y” is
pronounced at the back of the mouth, at the palate. Again, while it is not
necessary to know the technical names of these places, it is important to know
that they are different, and it is important to be able to identify consonants with
similar places or manners of articulation.
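The nasal examples above (English in-/im- and Malagasy zafi+(nasal)) follow one small rule that can be sketched in code. This is a minimal illustration covering only the cases mentioned in the text: the function name attach_nasal_prefix and the harden flag are invented for the example, “n” is used for both “n” and “ŋ” as in the Latin spelling, and the only fricative-to-stop change included is the f --> p change described for Malagasy.

```python
# Sounds made at the lips; a nasal in front of them is written "m".
LIP_SOUNDS = {"p", "b", "m"}
# Fricative -> corresponding stop. Only f -> p is attested in the text.
HARDENING = {"f": "p"}


def attach_nasal_prefix(prefix, stem, harden=False):
    """Join prefix + nasal + stem, choosing the nasal by the stem's first sound."""
    if harden and stem[0] in HARDENING:
        # Malagasy-style change: the fricative becomes its corresponding stop.
        stem = HARDENING[stem[0]] + stem[1:]
    nasal = "m" if stem[0] in LIP_SOUNDS else "n"
    return prefix + nasal + stem


print(attach_nasal_prefix("i", "proper"))                      # improper
print(attach_nasal_prefix("i", "decent"))                      # indecent
print(attach_nasal_prefix("zafi", "faladia", harden=True))     # zafimpaladia
print(attach_nasal_prefix("zafi", "kitrokely", harden=True))   # zafinkitrokely
```

The point of the sketch is that the nasal is not stored with the prefix at all; it is chosen by looking at the place of articulation of the sound that follows.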
One last piece of information concerning consonants is about consonant clusters. There is a lot to
know about consonant clusters, but what is important to know is that languages have predefined
possibilities of consonant clusters, and native speakers of a language that does not contain a certain
consonant cluster will experience great difficulty in pronouncing it. For example, Punjabi does
not contain the consonant cluster “st”, and native Punjabi speakers will insert a vowel between
the two consonants in order to pronounce the cluster. When they try to say
“station” (IPA: steɪ∫ɪn), they would instead say “s-uh-tation” (səteɪ∫ɪn).
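The vowel insertion described above can be written as a one-line rule over a transcription. This is a rough sketch only: the function name break_cluster and its default arguments are invented for the example, and it simply inserts the vowel into every occurrence of the cluster, which real speakers need not do.

```python
def break_cluster(transcription, cluster="st", vowel="ə"):
    """Insert a vowel after the first sound of a cluster the speaker's language lacks."""
    repaired = cluster[0] + vowel + cluster[1:]
    return transcription.replace(cluster, repaired)


print(break_cluster("steɪ∫ɪn"))  # səteɪ∫ɪn
```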
Moving on, the phonetics of vowels is not integral to the IOL as such, so this section is mainly for
additional interesting information.
Vowels are classified by the position of the tongue when pronouncing the vowel, and the shape of
the mouth (rounded or unrounded), and length (which can be changed for all vowels).
This is a simplified chart with examples:
Figure 2. Simplified English Vowels Chart (see footnote 3)
Figure 3. Complete English Vowels Chart (see footnote 4)
3
Simplified English Vowels Chart, [Online], Available:
http://people.umass.edu/neb/VowelChart.GIF [Accessed 2 September 2011]
4
Complete English Vowels Chart, [Online], Available:
http://home.cc.umanitoba.ca/~krussll/138/sec5/vow-ipa.gif [Accessed 2 September 2011]
These charts only display the set of simple vowels. Vowels can often be combined to give
rise to new vowel sounds, known as diphthongs. The “i” in “fight” is an example of such a
diphthong, which combines the low (open) back vowel, the “a” in “palm”, and the high (closed)
front vowel, “i”, the vowel sound in “sheep”. Other examples of diphthongs include the vowel
sounds in “spout” and “boy”.
A particular property of diphthongs is that they are always long. Languages sometimes
differentiate between long and short vowels, as Faroese does (Problem 2, IOL at Pittsburgh,
2011). Most vowels have both long and short versions, and commonly the colon symbol (:)
indicates a single elongated vowel; for example, “cream” would be transcribed as “kri:m”.
Wherever a word is transcribed with either the colon or two vowel sounds (a diphthong), the
vowel sound there is long.
4. Morphology
Morphology refers to the study of how words change, or are “morphed”, to add meaning, to
compound meanings or to create entirely new words. Morphemes are the smallest recognizable
or meaningful parts of words 5.
Most morphemes carry some sort of meaning in themselves, even if they can’t exist as words,
like “in-” or “dis-”, which mean “opposite”, or “-ed” (past tense), “-s” (plural), etc. When a word
contains more than one morpheme, the meaning of the word is, at an intuitive level, a combination
of the meanings of the morphemes themselves 6.
1. Free morphemes: Most words in English would come under this category; this category contains
morphemes that can exist independently as words, without any morpheme attached to them.
All free morphemes in English have specific meanings; they do not modify other nouns or verbs
by making them plural or changing their tense or subject or object. However, this is not true of
all languages. Take, for instance, Hindi, which has the word “hai” (है), which does not mean
anything but “present tense”. All it does is indicate that the sentence is in the present tense.
English does not have such words that “mark” the tense or case or plurality; in English such
markers form a part of the word to be modified itself (like -ing).
2. Bound morphemes: Obviously, these are the opposite of free morphemes, and cannot form
words on their own. Often, they contain tense markers, mood markers, singular/plural markers,
or any such modifier that only modifies the meaning of the base morpheme, which is usually a
free morpheme, but can also sometimes be a bound morpheme itself. So, bound morphemes
are also divided into more categories:
a. Affixes: these are the most common type of bound morphemes, and they include prefixes,
infixes and suffixes (depending on where they are inserted in the base morpheme). Affixes
ordinarily comprise all the various markers that exist - like tense markers, mood markers
5
Akmajian, A.A., Demers, R.A., Farmer, A.K., Harnish, R.M. (2010) Linguistics: An
Introduction to Language and Communication, 6th Edition, Cambridge, Massachusetts: The
MIT Press. Page 19
6
Akmajian, A.A., Demers, R.A., Farmer, A.K., Harnish, R.M. (2010) Linguistics: An
Introduction to Language and Communication, 6th Edition, Cambridge, Massachusetts: The
MIT Press. Page 19
(especially in Romance languages), plural markers, singular markers, subject markers (again,
very prominent in Romance languages, where verbs are conjugated and suffixed depending
on the subject of the sentence - depending on whether it is first/second/third person
singular/plural, male/female, etc.) and all other types of markers. Examples include, as
previously stated, “in-”, “dis-”, “mis-”, “-ed”, “-s”, “-ity”, “-ate”, “-ant”, etc., with the “-”
indicating the place where the base morpheme would be attached in the word. There are, in
certain languages, affixes attached in the middle of the word as well, known as infixes. This
happens very frequently in Native American languages, and also in some Austronesian
languages (Indonesia and its surrounding islands, and the Australian-Indonesian islands in
general). For example, in Bonto Igorot, a language of the Philippines, the infix -in- signifies
that the noun being described is the product of a complete action 7. For example, the word
kayu means wood, and with the infix -in- added, kinayu refers to “gathered wood”. This
action, inserting a morpheme after the first letter, is the most common way infixes are
added.
b. Bound base morphemes: there are some base morphemes that cannot exist individually as
well. Such morphemes are extremely uncommon in English, though not entirely absent.
Examples include cran-, a base morpheme that never exists on its own, referring to a specific
fruit only if it is attached to -berry and to an amalgamation of fruits if attached to another
fruit, like -apple or -grape8. Other examples include malle- and feas-, derived from malleable
and feasible and malfeasance. Feas- and malle- themselves cannot form words, but with
suffixes like -able and -ance, they do form words.
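The two operations described in this list, joining morphemes end to end and inserting an infix after the first letter, can be sketched as follows. The function names join_morphemes and add_infix are invented for the example, and the infix rule is only the simple “after the first letter” pattern mentioned above, not a general account of infixation.

```python
def join_morphemes(parts):
    """Join morphemes written with attachment hyphens, e.g. "dis-", "-d"."""
    return "".join(part.strip("-") for part in parts)


def add_infix(base, infix):
    """Insert an infix after the first letter of the base morpheme."""
    return base[0] + infix.strip("-") + base[1:]


print(join_morphemes(["dis-", "-assemble-", "-d"]))  # disassembled
print(join_morphemes(["cran-", "-berry"]))           # cranberry
print(add_infix("kayu", "-in-"))                     # kinayu
```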
It is necessary to understand the role of morphemes in altering words and also to develop the
ability to recognize morphemes in various forms in different languages. The same morpheme
could appear in multiple ways in the same language, like im- and in- in English, which are identical
in meaning but vary depending on the first letter of the base morpheme. It is advisable to attempt
a few problems concerning complex morphology to better understand how to deal with morphology.
Morphology is key in almost half of the questions in the IOL. Past years’ questions include:
2011: 1, 3, 4 2010: 1 2009: 5 2008: 3, 4, 5
7
Akmajian, A.A., Demers, R.A., Farmer, A.K., Harnish, R.M. (2010) Linguistics: An
Introduction to Language and Communication, 6th Edition, Cambridge, Massachusetts: The
MIT Press. Page 20.
8
Akmajian, A.A., Demers, R.A., Farmer, A.K., Harnish, R.M. (2010) Linguistics: An
Introduction to Language and Communication, 6th Edition, Cambridge, Massachusetts: The
MIT Press. Page 20
2007: 2, 3 2006: 1, 4 2005: 1 2004: 1, 5. 9
9
All problems published by The International Olympiad in Linguistics, [Online], Available:
http://www.ioling.org/problems/ [Accessed 8 September 2011]
5. Syntax
Syntax is the part of linguistics that deals not with the microscopic level of language (words), but
with the macroscopic level (sentences). Syntax is probably the place where non-native speakers of
a language make the most mistakes, through word for word or literal translations. All languages
are governed by an arbitrary set of grammatical rules that are almost never identical for any two
languages. Take an example of a simple English sentence: “John hit the big ball”. If you were to
translate this sentence correctly (not word for word) into another language and then translate
wrongly (word for word) back into English, it is very likely that the words would be very jumbled. If
the other language were:
Spanish and most Romance languages would follow the same structure as French, while most
languages evolved from Sanskrit would follow the same pattern as Hindi, and most Germanic
languages would yield the same result as the original English sentence (without any error!).
These deviations from the original are the main problem that computerized translators
face - language is so vast that it is nearly impossible to produce rules that govern a language in
its entirety, especially when so many words, particularly in English, could represent multiple parts
of speech (they could be verbs or nouns or adjectives, etc.; the word “that” falls under 4 separate
parts of speech!).
Computers “parse” information by breaking the sentence down into smaller phrases, making them
agree with regular sentences. An ordinary, simple sentence would contain 3 parts - a subject
phrase, verb phrase and object phrase, in certain orders. Note, this is just a standard sentence, in
many cases, it is possible to make sentences without a subject or object as well. Each phrase
within itself could also contain various modifiers and sub-clauses and sub-phrases. For example, in
the sentence: “A panicked John hurriedly ran from the enormous labrador”, the subject phrase, “A
panicked John”, contains an article and an adjective besides the principal noun, which forms the
head of this noun phrase. The verb phrase, “hurriedly ran” contains an adverb as well as a verb,
and, then there is a preposition, “from”, followed by the object phrase which is another noun
phrase, “the enormous labrador”. This sentence itself could be handled by a computer, though
with some difficulty, but if you were to add to this sentence, the phrase “that was giving chase
furiously”, now you have a very ambiguous word, “that” (which belongs to at least 4 different
parts of speech: it is a pronoun, adverb, conjunction and determiner, and each one might be
translated differently in another language) coupled with a 3-word long verb in the past continuous
tense, and an adverb. If the computer were meant to decipher all of this accurately, and translate it
correctly into another language, the rules describing each language would have to be extremely
detailed and specific. This is why syntax is such a big problem, and why translation requires a brain
fluent in both languages, with only a subconscious understanding of syntactic rules.
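The breakdown of the example sentence into phrases can be written out as a small data structure. This is just one illustrative way to record the analysis given above, not how any particular parser works; the class name Phrase and the helper names flatten and heads are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    role: str    # e.g. "subject phrase"
    words: list  # the words of the phrase, in order
    head: str    # the word the rest of the phrase modifies


sentence = [
    Phrase("subject phrase", ["A", "panicked", "John"], head="John"),
    Phrase("verb phrase", ["hurriedly", "ran"], head="ran"),
    Phrase("preposition", ["from"], head="from"),
    Phrase("object phrase", ["the", "enormous", "labrador"], head="labrador"),
]


def flatten(phrases):
    """Rebuild the sentence from its phrases."""
    return " ".join(word for phrase in phrases for word in phrase.words)


def heads(phrases):
    """Strip every phrase down to its head word."""
    return [phrase.head for phrase in phrases]


print(flatten(sentence))  # A panicked John hurriedly ran from the enormous labrador
print(heads(sentence))    # ['John', 'ran', 'from', 'labrador']
```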
This list of abbreviations indicates all the possible orders of subject, verb and object
phrases. English is an SVO language (its order is Subject-Verb-Object), while Hindi is SOV (woh
paani pita hai = he (Subject) water (Object) drinks (Verb) hai (tense marker; tense = present)).
Examples:
SVO: Most Romance and Germanic languages, Mandarin, Russian and other Slavic languages.
SOV: Sanskrit-derived languages, Japanese, Latin etc. Over 75% of all languages fall under SVO or
SOV.
VSO: Welsh & Irish (Gaelic)10, Hebrew, some Austronesian languages.
VOS: Malagasy (from Madagascar)11 is the best documented example. Others include Tzotzil and
Fijian. Mainly Austronesian languages (including Malagasy).
OVS: Hixkaryana 12 (a Carib language spoken in Brazil), and Tamil, in some forms and cases.
OSV: This is one of the rarest forms, and occurs mainly in special cases of other languages. For
example, in English, “I hate tennis, but cricket I like”. Rare examples include Nadëb13, another
language spoken by a small minority in Brazil.
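To see what each of the six orders does to a sentence, the three phrases can be shuffled mechanically. The sketch below is purely illustrative: the function name reorder is invented for the example, and it reproduces the kind of word-for-word rendering described earlier, not a real translation.

```python
def reorder(parts, order):
    """Arrange the subject, verb and object phrases in the given order."""
    return " ".join(parts[slot] for slot in order)


parts = {"S": "John", "V": "hit", "O": "the big ball"}

for order in ("SVO", "SOV", "VSO", "VOS", "OVS", "OSV"):
    print(order, "->", reorder(parts, order))
```

For instance, reorder(parts, "SOV") gives “John the big ball hit”, the shape a word-for-word rendering from an SOV language such as Hindi would take.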
Lastly, it is important to know that there are many languages with multiple structures in different
tenses or moods, or with free form structures, where two or more of the subject, verb and object
can be freely interchanged. Examples of the cases with multiple structures include English in passive
voice (the lion was shot by the hunter = OVS) or French and most Romance languages with personal
10
VSO and Master Yoda, [Online], Available:
http://www.akerbeltz.org/beagangaidhlig/gramar/grammar_VSO.htm, [Accessed 9
September 2011]
11
VSO and Master Yoda, [Online], Available:
http://www.akerbeltz.org/beagangaidhlig/gramar/grammar_VSO.htm, [Accessed 9
September 2011]
12
Dryer, M.S., Order of Subject, Object and Verb, [Online], Available:
http://wals.info/chapter/81, [Accessed 9 September 2011]
13
Dryer, M.S., Order of Subject, Object and Verb, [Online], Available:
http://wals.info/chapter/81, [Accessed 9 September 2011]
pronouns as objects (in which case the order is SOV, not SVO), Spanish, which often omits the subject
altogether, and many more. Examples of a free structure (also called No Dominant Order, NDO) include
many Austronesian, African and American languages. Some linguists classify German, Greek and Dutch,
among others, as having a free structure, though this is still debated. 14
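One way to picture the difference between a dominant order and No Dominant Order is to count the orders observed across many sentences. The sketch below is an illustration only: the function name `dominant_order` and the threshold `ratio=2.0` (the most common order must be at least twice as frequent as the runner-up) are assumptions chosen for the example, not a rule stated in this material.

```python
from collections import Counter


def dominant_order(orders, ratio=2.0):
    """Return the dominant word order in a sample, or "NDO" if none stands out.

    orders is a list of strings such as "SVO" or "SOV", one per sentence.
    ratio is an assumed threshold: the most common order counts as dominant
    only if it occurs at least `ratio` times as often as the second one.
    """
    ranked = Counter(orders).most_common(2)
    if not ranked:
        return "NDO"          # no data, so no dominant order can be claimed
    if len(ranked) == 1:
        return ranked[0][0]   # only one order was ever observed
    (top, top_count), (_, second_count) = ranked
    return top if top_count >= ratio * second_count else "NDO"


# Eight ordinary sentences and one special case such as "cricket I like".
print(dominant_order(["SVO"] * 8 + ["OSV"]))                  # SVO
# A sample with no clear winner.
print(dominant_order(["SOV", "SVO", "SOV", "SVO", "VSO"]))    # NDO
```

The counts make the point of the paragraph concrete: an occasional special construction does not change a language's basic order, while a language whose sentences are split evenly between several orders has no dominant one.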
Discussions about word order are not limited to the order of the Subject, Verb and Object. The
grammar of a language also specifies where prepositions and postpositions go in a sentence (the name
depends on whether the word comes before or after the word it governs). For example,
English widely uses prepositions, while Hindi primarily uses postpositions. It is important to
understand that prepositional/postpositional morphemes can come in many forms: besides standing as
separate words before or after the word they govern, they can be prefixed or suffixed to it. For
example, English, a predominantly prepositional language with dedicated prepositional
words (like from, above, under, about, with, over), does contain some bound postpositions, like
-ward in homeward.
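The distinction can be sketched as a simple position check. This is a minimal illustration under stated assumptions: the function name `adposition_type` and the tags "ADP" (the pre/postpositional morpheme) and "N" (the word it governs) are invented for the example, and the Hindi phrase ghar se ("from home") is supplied here as an illustration rather than taken from the text above.

```python
def adposition_type(phrase):
    """Classify the governing morpheme in a two-part phrase.

    phrase is a list of (morpheme, tag) pairs containing exactly one "ADP"
    (the pre/postpositional morpheme) and one "N" (the word it governs).
    """
    tags = [tag for _, tag in phrase]
    # Whatever comes before the governed word is a preposition;
    # whatever comes after it is a postposition.
    if tags.index("ADP") < tags.index("N"):
        return "preposition"
    return "postposition"


print(adposition_type([("from", "ADP"), ("home", "N")]))    # preposition
print(adposition_type([("ghar", "N"), ("se", "ADP")]))      # postposition
# A bound morpheme is treated the same way once the word is split up.
print(adposition_type([("home", "N"), ("-ward", "ADP")]))   # postposition
```

Note that the check works on morphemes, not on written words, which is why homeward has to be split into home and -ward before it can be classified.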
Even the positioning of adjectives can vary from language to language. In English, adjectives come before
the noun they describe, and an adverb is not normally placed between a verb and its object, so it would be
very unnatural to say “I ate hungrily the fish salty”. However, in Spanish and other Romance languages, this
would be precisely the word order, and this is what led to the word order being jumbled above, when
translating word for word from French to English. Additionally, some words are even allowed to “break the
rules”. For example, in French, some qualitative adjectives are allowed to precede the noun, whereas other
qualitative adjectives and all quantitative adjectives must come after the noun. So, if the adjective above
had been “beautiful” instead of “big”, the sentence would have been translated properly into English,
without any error. Another example is evident in English, where adverbs can come on either side of a verb,
but an adverb that modifies an adjective must precede that adjective (you would say “extraordinarily red
rose”, not “red extraordinarily rose”, but “I frequently eat Chinese” and “I eat Chinese frequently” are
both accepted). In contrast, adjectives must almost always precede the noun in English.
This is probably the most important topic within syntax as far as the Olympiad is concerned.
Knowing the difference between transitive and intransitive verbs is crucial, especially where Aztec
and Mayan languages are concerned. The rules for both types of verbs in such languages are very
14
Pagel, M., (June 2009), Evolution of Word Order Changes, [Online], Available:
http://www.nature.com/nrg/journal/v10/n6/fig_tab/nrg2560_F5.html, [Accessed 10
September 2011]
different, and it is essential to learn how to distinguish the two types of verbs.
An intransitive verb is a verb that does not have an object. In English, it is often grammatically
correct, and in some cases even expected, for a sentence to have no object. For example,
the sentence John went to the pizzeria does not have a direct object. A preposition (“to”) is
placed before the place that John visited, and it would have sufficed to say “John
went”, where “went” is an intransitive verb. Other examples of sentences with intransitive verbs
include “I fall”, “you sleep”, “we dream”, “they drink”, “he eats”, “I joke”, and many more.
However, many of these verbs also have a transitive use, for example, “they drink water” and
“he eats biscuits”. It is essential to pay close attention to the type of verb when dealing with
Mayan and Aztec languages, and indeed most indigenous languages of the Americas. In Problem 5 of
the IOL 2009 15, on Nahuatl, the language of the ancient Aztec Empire, intransitive and transitive
verbs took different suffixes to express the same meaning (to “make” someone do the action instead
of simply “doing” it, as in The postman makes him drink as opposed to The postman drinks). The
suffix –tia indicated this for intransitive verbs, while the suffix –ltia indicated this for
transitive verbs, and hence it was very easy to mistake the extra “-l” for a part of the verb
itself, or to view it as a separate morpheme.
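The trap described above can be made concrete with a small sketch. It assumes only the rule as stated here (–tia for intransitive verbs, –ltia for transitive verbs); the function names are hypothetical, and the stem "dako" is an invented placeholder, not a real Nahuatl word.

```python
# Suffixes as described in the text: the choice depends on the verb type.
CAUSATIVE_SUFFIX = {"intransitive": "tia", "transitive": "ltia"}


def causative(stem, verb_type):
    """Build the 'make someone do it' form of a verb stem."""
    return stem + CAUSATIVE_SUFFIX[verb_type]


def possible_segmentations(form):
    """Return every (stem, suffix, verb_type) split the two suffixes allow."""
    splits = []
    for verb_type, suffix in CAUSATIVE_SUFFIX.items():
        if form.endswith(suffix) and len(form) > len(suffix):
            splits.append((form[: -len(suffix)], suffix, verb_type))
    return splits


# "dako" is an invented placeholder stem, not a real Nahuatl word.
print(causative("dako", "intransitive"))  # dakotia
print(causative("dako", "transitive"))    # dakoltia

# A form ending in -ltia can be cut in two ways, which is exactly the trap:
print(possible_segmentations("dakoltia"))
# [('dakol', 'tia', 'intransitive'), ('dako', 'ltia', 'transitive')]
```

Looking at the surface form alone cannot settle which split is right; the solver has to check whether the verb takes an object elsewhere in the data, which is why identifying transitivity comes first.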
15
The International Olympiad in Linguistics, Poland 2009, Problem 5 - Nahuatl [Online],
Available: http://www.ioling.org/booklets/iol-2009-indiv-prob.en-gb.pdf [Accessed 8
September 2011].