Week 9: The lexicon

English Morphology

Fernanda Barrientos

2024-12-17

The lexicon

What is the content of the lexicon?

The lexicon is the linguist’s term for the language user’s mental dictionary.

Something is listed in the lexicon = it is stored in the speaker’s long-term memory
Lexical items are the fundamental building blocks of morphological structure
- This has implications for the concept of rule seen in Week 6:
  - Is the lexicon based on words?
  - Is it based on morphemes?

Three views

Morpheme lexicon: only bases and affixes are listed, and derived words are formed by rules
Strict word-form lexicon: Every word form is listed
- Somewhere in between, but mostly word-based, is the moderate word-form lexicon
  - Whether a complex word form is listed or not will depend on several factors

View 1: Morpheme-based lexicon

Since we don’t memorize each sentence that we use, it is also possible that we don’t need to memorize complex words
This theory is particularly attractive when we think of languages with lots of inflection

	Singular	Plural
Nominative	filos	fili
Accusative	filo	filus
Genitive	filu	filon

Granted: a morpheme-based approach would require the speaker to store 7 items, versus 6 under a word-based approach
- But since many nouns would require the same affixes for inflecting, it is still a win
- This is also valid for derivation: English -er, -able, etc. may be attached to many bases
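The storage argument can be made concrete with a small count. The sketch below is only an illustration: it assumes (hypothetically) that every noun in the class takes the same six endings as the fil- paradigm above, and the function names are invented for this example.

```python
# The six case/number endings read off the paradigm above (filos, fili, filo, filus, filu, filon).
SUFFIXES = ["os", "i", "o", "us", "u", "on"]

def morpheme_lexicon_size(n_nouns, suffixes=SUFFIXES):
    # One stem per noun, plus the suffixes, which are stored once and shared.
    return n_nouns + len(suffixes)

def word_form_lexicon_size(n_nouns, suffixes=SUFFIXES):
    # Every inflected form of every noun is stored separately.
    return n_nouns * len(suffixes)

for n in (1, 10, 1000):
    print(n, morpheme_lexicon_size(n), word_form_lexicon_size(n))
```

For a single noun the morpheme lexicon is larger (7 items versus 6), but under this assumption it grows by one item per additional noun, while the word-form lexicon grows by six.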

View 1: Morpheme-based lexicon

However, things are not so simple
Think of the meaning of the word reader:
- Person who reads
- Academic title
- Computer software
Sometimes, meanings may be non-compositional: that is, the meaning is not equal to the sum of its parts
This makes meanings unpredictable

View 1: Morpheme-based lexicon

Another issue is that morphemes are not always clearly delimited in the word (that is, lack of segment-abil-ity)
- Example 1: Base modification (like non-concatenative patterns in general) is not easy to segment
  - Ablaut/Umlaut: Mutter - Mütter (where is the plural morpheme?)
- Example 2: Cumulative expression
  - Consider the Spanish verb cantar (to sing), where the root is cant- and the first person singular of the simple preterite is canté
  - Where -é = preterite, first person, singular, and indicative mood: a lot of information in one single morpheme!
  - Suppletive forms also express meaning cumulatively: bad - worse (where worse = bad + the comparative “more than”).

View 1: Morpheme-based lexicon

Further cases of lack of segment-abil-ity:
- Example 3: zero expression
  - Consider the past tense conjugation of the Finnish verb olla (“to be”). What is the morpheme for the third person singular?

oli-n	‘I was’
oli-t	‘You were’
oli	‘He/she/it was’
oli-mme	‘We were’
oli-tte	‘You (pl.) were’
oli-vat	‘They were’
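The segmentation problem can be seen by mechanically splitting each form of the table at its hyphen. This is a minimal sketch that takes the segmentation shown above as given; the cell labels (`1sg`, `3sg`, ...) and the helper name `suffix_of` are invented for the illustration.

```python
# Forms as segmented in the table above; the hyphen marks the stem/suffix boundary.
paradigm = {
    "1sg": "oli-n",
    "2sg": "oli-t",
    "3sg": "oli",
    "1pl": "oli-mme",
    "2pl": "oli-tte",
    "3pl": "oli-vat",
}

def suffix_of(segmented_form):
    # str.partition returns (before, separator, after);
    # 'after' is the empty string when the form contains no hyphen.
    return segmented_form.partition("-")[2]

suffixes = {cell: suffix_of(form) for cell, form in paradigm.items()}
print(suffixes)
```

Every cell gets a non-empty suffix except the third person singular, whose "suffix" is the empty string: a morpheme-based lexicon would have to list an entry with a meaning but no form.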

View 1: Morpheme-based lexicon

Further cases of lack of segment-abil-ity:
- Example 4: empty morphs: the opposite of zero expression: a morpheme with no meaning.
- Consider the following declension in Lezgian:

Where -re-, -di-, and -a- do not carry meaning.

View 1: interim summary

Pros 👍
- The theory would fit with the general idea that language is formed by building blocks - convenient!
- It’s an elegant theory: very demure, very mindful
Cons 👎
- It doesn’t deal well with
  - Base modification
  - Cumulative expression
  - Zero expression
  - Empty morphs
This means that we would need to extend our definition of morpheme to something that can be (a) a change in the base, (b) a bit where many meanings are mashed together, (c) nothing, or (d) a form without meaning 😱

View 2: a word-based lexicon

This approach would spare us the issues seen in the previous theory
- Remember that the word-based rules in Week 6 were able to deal with all patterns, concatenative and non-concatenative
- It deals very well with productive and unproductive processes
  - Word forms resulting from unproductive processes are just listed in the lexicon

View 2: a word-based lexicon

But: what do we do with agglutinative languages, such as Turkish?

The amount of inflection seen in Turkish makes it unlikely for speakers to store all possible words in their memory

View 2: a word-based lexicon

Also: there is evidence that speakers are able to notice that words are made of morphemes
- Take Dutch past participles with ge-
  - We don’t attach ge- to verbs starting with be-: spreken → gesproken, but bespreken → besproken and not *gebesproken
  - (This also sort of applies to German)
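One way to see why this pattern needs morphemes is to compare a rule that only looks at the letters of the word with a rule that looks at its structure. The sketch below is a toy illustration, not a full account of Dutch participles: it covers only unprefixed verbs and verbs with the prefix be-, both function names are invented, and the verb bellen ("to ring", participle gebeld) is an example added here rather than one from the slides.

```python
def participle_letters_only(bare_participle):
    # Word-based guess: skip ge- whenever the form begins with the letters "be".
    if bare_participle.startswith("be"):
        return bare_participle
    return "ge" + bare_participle

def participle_with_structure(prefix, bare_participle):
    # Morpheme-aware rule: skip ge- only when a prefix be- is actually present.
    if prefix is None:
        return "ge" + bare_participle
    if prefix == "be":
        return prefix + bare_participle
    raise ValueError("this sketch only models unprefixed verbs and be- verbs")

print(participle_with_structure(None, "sproken"))  # spreken
print(participle_with_structure("be", "sproken"))  # bespreken
print(participle_letters_only("beld"))             # bellen: the letter-based rule wrongly skips ge-
print(participle_with_structure(None, "beld"))     # bellen: be is part of the stem, so ge- is added
```

The letter-based rule cannot tell the prefix be- in bespreken from the stem-initial be of bellen, whereas the structure-based rule gives gesproken, besproken, and gebeld.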

View 2: a word-based lexicon

Furthermore, many languages have phonological morpheme structure conditions – i.e. restrictions on the sounds that may go together within a morpheme.
- English allows [tθ] and [dθ] in complex words like eighth and width, but not within a single morpheme (in eighth, there is a morpheme boundary between [t] and [θ]: eight-th)
- German allows at most four consonants at the end of a single morpheme (as in Herbst), but it allows the five-consonant cluster [rpsts] in Herbst-s (‘autumn-GEN’), where we have a morpheme boundary between [t] and [s].

View 2: interim summary

Pros 👍
- It has none of the issues that we discussed for the morpheme-based approach
- It can deal very well with unpredictable meanings and unproductive processes
Cons 👎
- In languages with rich inflection, it is unlikely that speakers have all the word forms of the language listed in the lexicon
- Some morphological patterns refer to morphological structure
- Some phonological restrictions seem to be “bypassed” when a morpheme boundary is in between, which suggests that morphemes are somehow relevant to other linguistic structures

View 3: A moderate word-based approach

If the lexicon consists of both word-forms and morphemes, the immediate question is:
- Which words are directly listed in the lexicon, and which ones are composed “on the fly” from morphemes?
  - A language has actual words, but also possible words (e.g. “bagelize”)
  - Some of these newly created forms may become popular (what we call neologisms), while others do not (occasionalisms).
  - Thus, a word-based approach (moderate or not) would force us to choose where to draw the line

A dual approach to lexical access

What is lexical access?

Lexical access is the process of looking up words in the lexicon. Whenever we hear a word, we check in our lexicon for an entry with the meaning for that word.

We have two possible routes for lexical access:
- The decomposition route: we look at the morphemes and then for the meaning, e.g. in the word “insane” we look up “in-”, then “sane”, and then we put both meanings together.
- The direct route: we have the word stored in the lexicon as a whole.

A dual approach to lexical access

The dotted arrow is the direct route: straight to the meaning of the word form
The solid arrows are the decompositional route: we look up the morphemes, which then take us to the meaning of the word form
Both routes are taken simultaneously: the fastest is the winner
- But, can we predict which one will win?

A dual approach to lexical access

Factors that play a role in lexical access
- Token frequency: the more a given word is used, the more likely it is that we take the direct route, e.g. “insane” is more frequent than “sane”, so taking the decompositional route would take more time.
- Segmentability: The more difficult it is to split the word form into morphemes (as with non-concatenative morphology), the more likely it is that we store and retrieve whole word forms.
- Effect on base: Some affixes change the phonological structure of the base: electri[k] + ity = electri[s]ity. In such cases, it is more likely that we store the full word form.
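The race between the two routes can be sketched as a toy model. Everything numerical here is an assumption made for illustration: the inverse-frequency lookup time, the `parsing_cost` parameter (standing in for poor segmentability or a modified base), and the frequency counts are all invented, not measured values.

```python
def lookup_time(frequency):
    # Toy assumption: the more frequent an item is, the faster it is found.
    return 1.0 / frequency

def winning_route(word_freq, base_freq, parsing_cost=0.0):
    # Direct route: look up the whole word form.
    direct = lookup_time(word_freq)
    # Decomposition route: look up the base, plus any extra cost of segmenting.
    decomposition = lookup_time(base_freq) + parsing_cost
    # Both routes run in parallel; the faster one wins.
    return "direct" if direct <= decomposition else "decomposition"

# Invented counts: whole word more frequent than its base (like "insane" vs. "sane").
print(winning_route(word_freq=50, base_freq=10))
# Invented counts: rare complex word built on a frequent base.
print(winning_route(word_freq=2, base_freq=100))
# Same counts, but the word is hard to segment.
print(winning_route(word_freq=2, base_freq=100, parsing_cost=1.0))
```

With these made-up numbers the first and third calls pick the direct route and the second picks decomposition, which mirrors the three factors listed above without claiming anything about real processing times.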

One more thing: decomposing

If we were to follow the decompositional route to decompose the word insanely, would we do that entirely? (i.e. in + sane + ly), or just partially (e.g. insane + ly, or in + sanely)?
- Here it is possible that we parse it as insane + ly, since insane is a more frequent word form than insanely or sane.
- In the end, the point is that only by taking into account the three factors affecting lexical access mentioned above can we arrive at a more psychologically realistic account of the lexicon.

Summary

A language user’s mental dictionary is the lexicon
There are three main theories about what the lexicon consists of
- The morpheme lexicon is an elegant theory but doesn’t deal well with non-compositional meaning and non-concatenative patterns
- The word-form lexicon does not have the issues found in the morpheme model, but:
  - some types of morphological rules seem to rely on the concept of the morpheme, and morphemes may have a real status for speakers
- The third view, of an essentially word-form lexicon but with morphemes and some derivations also stored, is a more realistic approach
  - A dual approach to lexical access offers the possibility of retrieving meaning via decomposing or direct access
  - The choice between the decompositional and the direct route is mediated by factors such as frequency, segmentability, and effect on the base.

Next week

IT’S CHRISTMAS!!!
But for the week immediately after the holidays (7th of January), we will meet on Zoom at the usual time (1:30 pm). A link will be provided via ILIAS.
There is no tutorial because it’s already the 23rd.
Keep working on the mock exam
Complete the exercises (exercises_w9.pdf - will be uploaded by tomorrow)