Week 9: The lexicon

English Morphology

Fernanda Barrientos

2024-12-17

The lexicon

What is the content of the lexicon?

The lexicon is the linguist’s term for the language user’s mental dictionary.

  • Something is listed in the lexicon = it is stored in the speaker’s long-term memory

  • Lexical items are the fundamental building blocks of morphological structure

    • This has implications for the concept of rule seen in Week 6:
      • Is the lexicon based on words?
      • Is it based on morphemes?

Three views

  • Morpheme lexicon: only bases and affixes are listed, and derived words are formed by rules
  • Strict word-form lexicon: every word form is listed
  • Moderate word-form lexicon: somewhere in between, but mostly word-based
    • Whether a complex word form is listed or not depends on several factors

View 1: Morpheme-based lexicon

  • Since we don’t memorize each sentence that we use, it is also possible that we don’t need to memorize complex words
  • This theory is particularly attractive when we think of languages with lots of inflection
                Singular   Plural
    Nominative  filos      fili
    Accusative  filo       filus
    Genitive    filu       filon
  • Granted: for this single noun, a morpheme-based approach would require the speaker to store 7 items (the stem fil- plus six suffixes), versus 6 word forms under a word-based approach
    • But since many nouns would require the same affixes for inflecting, it is still a win
    • This is also valid for derivation: English -er, -able, etc. may be attached to many bases
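
The storage argument can be made concrete with a small count. The sketch below assumes a hypothetical set of nouns that all inflect like filos, i.e. one stem per noun and one shared set of six case/number suffixes. The function names are illustrative only and do not come from the lecture.

```python
def morpheme_lexicon_size(n_nouns: int, n_suffixes: int = 6) -> int:
    """Items stored if only stems and shared suffixes are listed."""
    return n_nouns + n_suffixes


def word_form_lexicon_size(n_nouns: int, n_cells: int = 6) -> int:
    """Items stored if every inflected word form is listed separately."""
    return n_nouns * n_cells


# Compare both lexicon types for a growing number of nouns.
for n in (1, 2, 100):
    print(n, morpheme_lexicon_size(n), word_form_lexicon_size(n))
```

With a single noun the word-form lexicon is smaller (6 vs. 7 items), but from two nouns onward the morpheme lexicon stores fewer items (8 vs. 12), and the gap widens with every additional noun.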

View 1: Morpheme-based lexicon

  • However, things are not so simple
  • Think of the meaning of the word reader:
    • Person who reads
    • Academic title
    • Computer software
  • Sometimes, meanings may be non-compositional: that is, the meaning is not equal to the sum of its parts
  • This makes meanings unpredictable

View 1: Morpheme-based lexicon

  • Another issue is that morphemes are not always clearly delimited in the word (that is, lack of segment-abil-ity)
    • Example 1: Base modification (like non-concatenative patterns in general) is not easy to segment
      • Ablaut/Umlaut: German Mutter - Mütter (where is the plural morpheme?)
    • Example 2: Cumulative expression
      • Consider the Spanish verb cantar (to sing), where the root is cant- and the first person singular of the simple preterite is cant-é
      • Here -é = preterite, first person, singular, and indicative mood: a lot of information in one single morpheme!
      • Suppletive forms also express meaning cumulatively: bad - worse (where worse = bad + the comparative “more than”).

View 1: Morpheme-based lexicon

  • Further cases of lack of segment-abil-ity:
    • Example 3: zero expression
      • Consider the past tense conjugation of the Finnish verb olla (“to be”). What is the morpheme for the third person singular?
oli-n ‘I was’
oli-t ‘You were’
oli ‘He/she/it was’
oli-mme ‘We were’
oli-tte ‘You (pl.) were’
oli-vat ‘They were’
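
As a rough illustration of zero expression, the sketch below strips the stem oli from each form of the paradigm above and looks at what remains. The cell labels (1sg, 2sg, ...) and the helper name are assumptions added for the example; the forms themselves are the ones listed above.

```python
STEM = "oli"  # past-tense stem, as segmented in the paradigm above

paradigm = {
    "1sg": "olin",
    "2sg": "olit",
    "3sg": "oli",
    "1pl": "olimme",
    "2pl": "olitte",
    "3pl": "olivat",
}


def suffix_of(form: str, stem: str = STEM) -> str:
    """Return whatever follows the stem in a word form."""
    if not form.startswith(stem):
        raise ValueError(f"{form!r} does not start with stem {stem!r}")
    return form[len(stem):]


suffixes = {cell: suffix_of(form) for cell, form in paradigm.items()}
# Cells whose suffix is the empty string have no overt person/number marker.
zero_cells = [cell for cell, suffix in suffixes.items() if suffix == ""]

print(suffixes)
print(zero_cells)
```

Only the third person singular comes out with an empty suffix: the meaning is there, but there is no segment to assign it to.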

View 1: Morpheme-based lexicon

  • Further cases of lack of segment-abil-ity:
    • Example 4: empty morphs. This is the opposite of zero expression: a form with no meaning.
    • Consider the following declension in Lezgian:

  • Where -re-, -di-, and -a- do not carry meaning.

View 1: interim summary

  • Pros 👍
    • The theory would fit with the general idea that language is formed by building blocks - convenient!
    • It’s an elegant theory: very demure, very mindful
  • Cons 👎
    • It doesn’t deal well with
      • Base modification
      • Cumulative expression
      • Zero expression
      • Empty morphs
  • This means that we would need to extend our definition of morpheme to something that can be (a) a change in the base, (b) a bit where many meanings are mashed together, (c) nothing, or (d) a form without meaning 😱

View 2: a word-based lexicon

  • This approach would spare us the issues seen in the previous theory
    • Remember that the word-based rules in week 6 were able to deal with all patterns, concatenative and non-concatenative
    • It deals very well with productive and unproductive processes
      • Word forms resulting from unproductive processes are just listed in the lexicon

View 2: a word-based lexicon

  • But: what do we do with agglutinative languages, such as Turkish?

  • The amount of inflection seen in Turkish makes it unlikely for speakers to store all possible words in their memory

View 2: a word-based lexicon

  • Also: there is evidence that speakers are able to notice that words are made of morphemes
    • Take Dutch past participles with ge-
      • We don’t attach ge- to verbs starting with be-: spreken → gesproken, but bespreken → besproken and not *gebesproken
      • (This also sort of applies to German)
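
A minimal sketch of why this rule needs access to morphological structure: the toy function below takes the prefix as a separate argument instead of inspecting the letters of the word. Only the prefix be- from the example above is modelled; the function name and the extra verb bellen ('to call', participle gebeld), where be is part of the root and not a prefix, are additions for illustration.

```python
def past_participle(participle_base: str, prefix: str = "") -> str:
    """Toy rule: add ge- unless the verb carries the prefix be-."""
    if prefix == "be":
        # Prefixed verb: ge- is not attached.
        return "be" + participle_base
    if prefix:
        raise ValueError("only the prefix be- is covered in this sketch")
    return "ge" + participle_base


print(past_participle("sproken"))        # spreken
print(past_participle("sproken", "be"))  # bespreken
print(past_participle("beld"))           # bellen: 'be' belongs to the root
```

A rule that only checked whether the word begins with the letters be would wrongly block ge- in gebeld, so the speaker's rule has to refer to the morpheme be-, not to the string.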

View 2: a word-based lexicon

  • Furthermore, many languages have phonological morpheme structure conditions – i.e. restrictions on the sounds that may go together within a morpheme.
    • English allows [tθ] and [dθ] in complex words like eighth and width, but not within a single morpheme (in eighth, there is a morpheme boundary between [t] and [θ]: eight-th)
    • German has a maximum of four consonants at the end of a single morpheme, but it allows five, [rpsts], in Herbst-s (‘autumn-GEN’), where we have a morpheme boundary between [t] and [s].
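
The English condition can be sketched as a check that only looks inside morphemes and never across a boundary. The transcriptions below are simplified, and the data structure (a word as a list of morphemes, each a list of segments), the function name, and the made-up monomorphemic form are assumptions for illustration.

```python
# Clusters that, per the condition above, may not occur inside one morpheme.
BANNED_WITHIN_MORPHEME = {("t", "θ"), ("d", "θ")}


def violates_condition(morphemes: list[list[str]]) -> bool:
    """True if a banned cluster occurs inside any single morpheme."""
    for morpheme in morphemes:
        for first, second in zip(morpheme, morpheme[1:]):
            if (first, second) in BANNED_WITHIN_MORPHEME:
                return True
    return False


eighth = [["eɪ", "t"], ["θ"]]       # eight-th: cluster spans a boundary
width = [["w", "ɪ", "d"], ["θ"]]    # wid-th: cluster spans a boundary
made_up = [["eɪ", "t", "θ"]]        # hypothetical single morpheme

print(violates_condition(eighth), violates_condition(width), violates_condition(made_up))
```

The check cannot even be stated without morpheme boundaries in the representation, which is the point of the argument against a strictly word-based lexicon.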

View 2: interim summary

  • Pros 👍
    • It has none of the issues that we discussed for the morpheme-based approach
    • It can deal very well with unpredictable meanings and unproductive processes
  • Cons 👎
    • In languages with rich inflection, it is unlikely that speakers have every word form of the language listed in the lexicon
    • Some morphological patterns refer to morphological structure
    • Some phonological restrictions seem to be “bypassed” when a morpheme boundary is in between, which suggests that morphemes are somehow relevant to other linguistic structures

View 3: A moderate word-based approach

  • If the lexicon consists of both word-forms and morphemes, the immediate question is:
    • Which words are directly listed in the lexicon, and which ones are composed “on the fly” from morphemes?
      • A language has actual words, but also possible words (e.g. “bagelize”)
      • Some of these newly created forms may become popular (what we call neologisms) or not (occasionalisms).
      • Thus, a word-based approach (moderate or not) would force us to choose where to draw the line

A dual approach to lexical access

What is lexical access?

Lexical access is the process of looking up words in the lexicon. Whenever we hear a word, we check in our lexicon for an entry with the meaning for that word.

  • We have two possible routes for lexical access:
    • The decomposition route: we look at the morphemes and then for the meaning, e.g. in the word “insane” we look up “in-”, then “sane”, and then we put both meanings together.
    • The direct route: we have the word stored in the lexicon as a whole.

A dual approach to lexical access

  • The dotted arrow is the direct route: straight to the meaning of the word form
  • The solid arrows are the decompositional route: we look up the morphemes, which then take us to the meaning of the word form
  • Both routes are taken simultaneously: the fastest is the winner
    • But, can we predict which one will win?

A dual approach to lexical access

  • Factors that play a role in lexical access
    • Token frequency: the more a given word is used, the more likely it is that we take the direct route, e.g. “insane” is more frequent than “sane”, so taking the decompositional route would take more time.
    • Segmentability: the more difficult it is to split the word form into morphemes (as with non-concatenative morphology), the more likely it is that we store and retrieve whole word forms.
    • Effect on base: some affixes change the phonological structure of the base: electri[k] + ity = electri[s]ity. In such cases, it is more likely that we store the full word form.
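
The race between the two routes can be sketched as a toy model that covers token frequency only. Everything numerical here is invented for illustration: the frequencies, the fixed cost for combining morphemes, and the assumption that lookup time is inversely proportional to frequency. Segmentability and effect on the base are not modelled.

```python
def winning_route(word_freq: float, part_freqs: list[float],
                  combine_cost: float = 0.01) -> str:
    """Return the faster route under a toy 'latency = 1 / frequency' assumption."""
    # Direct route: look up the whole word form.
    direct = 1.0 / word_freq
    # Decomposition route: wait for the slowest morpheme, then combine meanings.
    decomposed = max(1.0 / freq for freq in part_freqs) + combine_cost
    return "direct" if direct <= decomposed else "decomposition"


# Hypothetical frequencies: 'insane' is frequent, its base 'sane' is rare.
print(winning_route(word_freq=50, part_freqs=[500, 10]))
# Hypothetical frequencies: 'bagelize' is very rare, 'bagel' and '-ize' are not.
print(winning_route(word_freq=0.1, part_freqs=[20, 300]))
```

Under these made-up numbers the frequent word insane is retrieved directly, while the possible word bagelize is only reachable fast enough through its parts.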

One more thing: decomposing

  • If we were to follow the decompositional route for the word insanely, would we decompose it entirely (i.e. in + sane + ly), or just partially (e.g. insane + ly, or in + sanely)?
    • Here it is possible that we parse it as insane + ly, since insane is a more frequent word form than insanely or sane.
    • In the end, the point is that only by taking into account the three factors affecting lexical access mentioned above can we arrive at a more psychologically realistic account of the lexicon.
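
The full and partial decompositions mentioned above can be listed mechanically. The sketch below uses plain string matching with one prefix and one suffix, which is an assumption made for brevity; it says nothing about which parse a speaker actually chooses.

```python
def candidate_parses(word: str, prefix: str = "in", suffix: str = "ly") -> list[tuple[str, ...]]:
    """List partial and full decompositions based on one prefix and one suffix."""
    parses = []
    has_prefix = word.startswith(prefix)
    has_suffix = word.endswith(suffix)
    if has_prefix:
        # Partial: strip only the prefix.
        parses.append((prefix, word[len(prefix):]))
    if has_suffix:
        # Partial: strip only the suffix.
        parses.append((word[:-len(suffix)], suffix))
    if has_prefix and has_suffix:
        # Full: strip both affixes.
        parses.append((prefix, word[len(prefix):-len(suffix)], suffix))
    return parses


for parse in candidate_parses("insanely"):
    print(" + ".join(parse))
```

Choosing among these candidates is where frequency, segmentability, and effect on the base come in.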

Summary

  • A language user’s mental dictionary is the lexicon
  • There are three main theories about what the lexicon consists of
    • The morpheme lexicon is an elegant theory but doesn’t deal well with non-compositional meaning and non-concatenative patterns
    • The word-form lexicon does not have the issues found in the morpheme model, but:
      • some types of morphological rules seem to rely on the concept of the morpheme, and morphemes may have a real status for speakers
    • The third view, of an essentially word-form lexicon but with morphemes and some derivations also stored, is a more realistic approach
      • A dual approach to lexical access offers the possibility of retrieving meaning via decomposing or direct access
      • The choice between the decompositional and the direct route is mediated by factors such as frequency, segmentability, and effect on the base.

Next week

  • IT’S CHRISTMAS!!!
  • But for the week immediately after the holidays (7th of January), we will meet on Zoom at the usual time (1:30 pm). A link will be provided via ILIAS.
  • There is no tutorial because it’s already the 23rd.
  • Keep working on the mock exam
  • Complete the exercises (exercises_w9.pdf - will be uploaded by tomorrow)