Ch1: Basic notions
The lexicon is the set of words in a language, an abstract object stored in our minds; a dictionary is a concrete object (a printed book or electronic resource) that describes the lexicon. Dictionaries do not always cover everything in the lexicon, whether intentionally or unintentionally. A vocabulary can refer to either a lexicon or a dictionary. Lexicography is the practice of compiling dictionaries, either for human use or for computational use (eg: WordNet).
The mapping from concepts (mental categories) to lexical forms is called lexicalization or lexical encoding. Lexicalization (also called lexification) also refers to the process by which a new word is introduced into the language.
The mapping from concepts to words is rarely one-to-one. Synthetic lexicalization is when one word simultaneously encodes multiple concepts, for example “walk” means to move, by foot; “run” means to move, by foot, at high speed. Analytic lexicalization is when multiple words are combined to express one concept, eg: “fall asleep”. However, it’s not always clear whether a concept is simple or complex, and the relativist hypothesis states that conceptual categories are determined by culture and language.
Descriptive lexicalization is when the form of the word describes its meaning, eg: “worker” = “work + er” is someone who works. Labeling lexicalization is when no such analysis is possible, eg: “doctor”. A lexical gap is a gap in a language where there’s no single word for a concept. Languages differ in how they subdivide the concept space into words.
Content words are nouns, verbs, adjectives, and adverbs; they carry semantic meaning and form an open set of items, whereas function words carry grammatical meaning and form a closed set. This rule is imperfect, as some verbs like “have” behave more like function words. Types of grammatical meaning include number, gender, tense, aspect, and voice.
It’s difficult to define what a word is, so counting the words in a language is tricky. The citation form is the form declared, by convention, to be the root form. The lexeme is the abstract unit of the lexicon, and the lemma is the form of the lexeme shown in the dictionary. Compounds can be multi-word expressions, as long as they’re salient as a whole beyond the sum of their parts; such a unit is also called a semantic constituent (eg: “round table”).
Words are expected to be cohesive, so you can’t stick an adverb in the middle. Another test is whether you can reorder the parts; a third test is whether you can replace a part with a synonym.
Another problem with counting words is how to deal with homonymy (two unrelated words accidentally sharing the same form) and polysemy (one word with multiple related meanings). It is difficult for native speakers to tell them apart; the best way is through diachronic etymology.
Simple words consist of a single morpheme (inflectional endings are allowed); complex words have multiple morphemes. Some complex words look like phrases but pass the tests for wordhood; when their parts can be substituted yet their meaning is not completely predictable, the result is called a template, eg: “bird of prey”. It’s hard to distinguish between a compound, a fixed phrase, and an ordinary phrase, so wordhood should be treated as a scalar property, not a binary one.
Ch2: Lexical information
Lexical meaning comes in several types. Denotational meaning (also descriptive, referential, or logical meaning) is a mapping from a form to the set of entities that it refers to. It can be many entities (eg: “fish”) or a single one (eg: “Toronto”). Connotative meaning reflects the emotional position of the speaker, or register: for example, “mom” and “mother” have the same referents, but “mother” is more formal.
Collocational meaning is meaning that arises only in combination with another word, eg: the “warm” in “warm welcome” means affectionate, not high in temperature. It’s difficult to tell which meanings are denotational and which are collocational.
Word class flexibility is most common in isolating languages. There are several ways to analyze it: (1) words contain information about their possible word classes, (2) the lexicon has a different entry for each use, (3) the word undergoes conversion to another word class, (4) the word class is underspecified / precategorical.
Predicate words have additional information attached in the lexicon. Argument structure specifies the number of arguments a predicate takes and what kind of relation it establishes between them. Aktionsart (also actionality or lexical aspect) contains information about temporal structure, eg: “explode” is an instantaneous event, while “sleep” extends over a period of time.
It’s hard to decide what knowledge is lexical and what is encyclopedic knowledge (also called commonsense or world knowledge). For example, “bread” is a mass noun; you can “eat bread” but it’s not grammatical to “build bread”; is this stored in the lexicon, or is it just common sense? One way to define lexical information is as the knowledge that all speakers share.
Ch3: The Meaning of words
Lexical semantics is the study of the meaning of individual words, whereas compositional semantics is the study of how word meanings combine to form the meaning of larger expressions. Sentence meaning depends on syntactic, semantic, and pragmatic context. An example of semantic context is “I broke a glass” (= solid object) vs “I drank a glass” (= a glass of some liquid). An example of pragmatic context is “my friend is cool”, which could mean “fashionable” or “unemotional”.
Lexical ambiguity is when a word has more than one meaning. Homonymy (or contrastive ambiguity) is when the meanings are unrelated; polysemy (or complementary ambiguity) is when they are related. Verbs tend to have a higher degree of polysemy than nouns. Regular polysemy refers to classes of polysemy that follow predictable patterns, like institution / people in that institution (eg: “university”, “city”). Inherent polysemy is when multiple aspects of a complex referent are activated simultaneously, eg: “the book on the shelf is boring”, where “book” refers to the physical object as well as the contents within the book. Some polysemy comes from metaphor, eg: “devour a book”.
There are several ways to look at meaning. The referential theory is that words refer to objects and events in the real world, and is the basis for formal semantics. The mentalist / conceptual theory is that the connection from words to meaning is mediated by a concept, a construction of reality within the mind. Concepts can be lexicalized by attaching a word to them, but not all concepts are lexicalized. The structuralist theory says that the meaning of a word is defined by its relations to other words.
The classical theory defines categories by necessary and sufficient conditions, eg: a woman is an adult and female. Unlike the classical theory of meaning, the prototype theory allows fuzziness and degrees of belonging to a category. It still has problems, as categories that should have sharp boundaries (eg: odd numbers) still exhibit prototype effects. Best to assume all the theories co-exist. Lastly, the distributional hypothesis is that words with similar distributional properties have similar meanings, and is the basis for word vectors.
When words each with multiple senses are combined together, how do you decide the meaning of the whole phrase? Sense enumeration models say that all the senses are stored in the lexicon, but have restrictions on when each sense can be activated. Dynamic lexical models argue that words in isolation only have abstract meanings, and larger linguistic units have concrete meanings.
Pustejovsky’s generative lexicon explains how words can take on different meanings in context: for example, “I bought a new book” refers to the physical object, whereas “I started a new book” refers to the activity of reading the book. Each word has a sub-structure of many possible meanings, and when words are combined, each one selects the interpretation of the other that makes the whole unit semantically coherent. This principle of strong compositionality says that words mutually adjust in context.
Several different formalisms have been proposed to represent the meaning of words. The feature-based approach (or componential analysis) breaks a word down into a list of features; the problem is that it gets unwieldy when applied to the whole lexicon, and there is a lot of residual meaning that’s difficult to break down into features.
Another approach is decomposing words into semantic primitives, eg: “x broke y” can be rewritten as “[x CAUSE [BECOME [y broken]]]”. Some drawbacks: (1) it’s hard to decide on a list of primitives, (2) hard to capture all meaning in terms of primitives, (3) psycholinguistically implausible as children don’t learn supposed primitives like “cause” until much later, (4) hard to explain compositionality.
Pustejovsky (1995) proposes a theory of Qualia structure to explain how meanings change in composition. Each word has four Qualia roles: Formal (what type of entity it is: bread is a type of food), Constitutive (what it’s made of: bread is made of flour), Telic (its purpose: bread is for eating), and Agentive (its origin: bread is baked). Different parts of the Qualia structure are then selected in composition with other words, by finding a mutually matching interpretation. However, it’s not clear what values should go into these Qualia slots.
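A Qualia structure can be pictured as a small record with four slots. The sketch below is purely illustrative (the slot values and the `interpret` rule are my own toy assumptions, not Pustejovsky’s formalism): a verb like “finish” selects the Telic role of its object, so “finish the bread” is read as “finish eating (the bread)”.

```python
from dataclasses import dataclass

@dataclass
class Qualia:
    """Toy encoding of the four Qualia roles of a noun."""
    formal: str        # what type of entity it is
    constitutive: str  # what it is made of
    telic: str         # its purpose (stored here as a verb stem)
    agentive: str      # how it comes into being

# Hypothetical entry for "bread"
bread = Qualia(formal="food", constitutive="flour", telic="eat", agentive="bake")

def interpret(verb: str, obj: Qualia) -> str:
    """Toy composition rule: aspectual verbs like 'finish' coerce their
    object into the event named by its Telic role."""
    if verb == "finish":
        return f"finish {obj.telic}ing"
    return verb

print(interpret("finish", bread))  # → "finish eating"
```

A fuller model would also let the Agentive role fire (“finish the book” said by an author means finish *writing* it), which is exactly the slot-filling ambiguity the notes mention.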
Another formalism is meaning postulates: encode meaning as links between words, without semantic primitives like “cause”, eg: shatter = break (into small pieces). This seems to agree with how humans define words. Finally, the word-space model defines meaning as a point in a high-dimensional vector space, where similarity is determined by angle or distance.
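The word-space idea can be made concrete with tiny co-occurrence vectors and cosine similarity (the “angle” measure mentioned above). The counts below are invented for illustration, not taken from any real corpus:

```python
import math

# Toy co-occurrence counts: how often each word appears near the
# context words (drink, road, sweet). Invented numbers.
vectors = {
    "tea":    [8, 0, 5],
    "coffee": [7, 1, 4],
    "car":    [0, 9, 0],
}

def cosine(u, v):
    """Cosine of the angle between two vectors: 1 = same direction,
    0 = orthogonal (no shared contexts)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(vectors["tea"], vectors["coffee"]))  # close to 1: similar meaning
print(cosine(vectors["tea"], vectors["car"]))     # → 0.0: unrelated
```

This is the distributional hypothesis in miniature: “tea” and “coffee” end up close because they occur in the same contexts, which is also the basis of modern word vectors.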
Ch4: The global structure of the lexicon
Word classes are sets of words that behave in similar ways. They can be grouped on a morphological, syntactic, or semantic basis, and there is a high degree of agreement between morphosyntactic behavior and semantic meaning.
Lyons (1977) proposed a three-level categorization of entities: those that are constant in time (first-order), those that occur in time (second-order), and those that are outside of time, like beliefs (third-order). Nouns tend to be first-order and verbs second-order, but there are exceptions like “sunset”, which is second-order. Givón (1979) proposes a continuum of temporal stability, where some words like “own” are more time-stable than “kick”, and verbs are generally less time-stable than nouns. Two universal modes of human discourse are reference (introducing a new entity) and predication (introducing a relationship between existing entities).
Verbs can be divided into subclasses by their valency structure (intransitive, transitive, etc). Intransitive verbs can be further divided into unaccusative (the subject is not an agent, and the verb often alternates with a transitive use, eg: break) and unergative (the subject is an agent, eg: laugh). Syntactic valency is not always the same as semantic valency (which assigns semantic roles to the participants in a predicate). Some verbs have selectional restrictions, which constrain the semantic class of their arguments, eg: the object of “drink” must be a liquid.
The verb’s Aktionsart encodes how it occurs through time. Examples of different lexical aspects are: stative (eg: own), indefinite (eg: walk), definite (eg: fix), and punctual (eg: find). You can use syntactic tests to determine these properties: you can “fix a car for one hour” but you can’t “find a car for one hour”. Some verbs like “break” have event structure, and different substructures can be emphasized depending on context: eg, “Mary breaks the key” focuses on the start of the action and “the key is broken” focuses on the outcome.
There have been attempts to classify verbs into semantic classes, but this is difficult. First, a verb may contain many semantic features with different prominence: eg, “sit” involves a motion component but is primarily a state. Second, there’s the problem of polysemy: “take my hand”, “take the train”, and “take the medicine” would fall into semantically very different categories.
Nouns can be categorized as entity nouns or event nouns. Entities can be classified in an ontology by concrete vs abstract, animate vs inanimate, etc. However, making arbitrary divisions like this isn’t really interesting to linguists, unless there are morphological or syntactic properties that correlate with the type of entity. A lot of languages differentiate between mass nouns (eg: water) and count nouns (eg: boy).
Event nouns can often be made from verbs via a nominalization process, although the result often behaves differently from regular nouns: eg, “dancing” is an event noun, but it can’t be pluralized. Languages have processes to shift between word classes because it’s efficient to be able to switch between referential and predicative mode. One way to test whether a noun is an event noun is to ask whether it can happen / take place / occur.
Nouns can also be said to have argument structure. For example, “Dave’s dive” has a slot for the agent of the dive; “transfer” has three slots: the sender, the recipient, and the thing being transferred. Unlike with verbs, the arguments of nouns are much more often omitted, so it’s difficult to identify how many arguments a noun has. Still, they have to be semantically implied for the sentence to be interpretable.
Event nouns also have Aktionsart, similar to verbs, with the categories state-denoting, indefinite, definite, and punctual. Another class is semelfactive: a punctual event that can be repeated as an instance of a larger event, eg: a “drink” is made up of many “sips”. The Aktionsart of a derived noun isn’t necessarily the same as that of the verb, and depends on the derivation strategy: the “-ing” suffix tends to produce an indefinite process, while conversion more often creates a bounded event.
Hengeveld (1992) gave a cross-linguistic definition of word classes: nouns and verbs are heads of referential and predicate phrases, while adjectives and adverbs are their respective modifiers. Differentiated systems (like English) have four separate categories of lexical items; flexible systems have fewer than four, with some items serving multiple functions. Rigid systems lack some categories and express those meanings differently (eg: in Iroquoian, “boy” is literally “he is young”). The implicational hierarchy is Verb > Noun > Adj > Adv, so if a language lacks word classes it loses them from right to left; nouns and verbs are the last to be lacking, and a language without verbs has not been attested.
Ch5: Paradigmatic Structures in the Lexicon
Saussure introduced the idea of associative relations, which are any ways that two words can be connected. Eg: “book” can be associated with “look” (phonetic similarity), “chapter” (constituent idea), “writer”, “library”, etc. A paradigmatic relation holds among a set of words that could fill the same slot, like “his X got published”.
Relations can be vertical (if one is subordinate in some way) or horizontal (if they’re on equal level). Two vertical relationships are hypernym/hyponym (X is a type of Y), and holonym/meronym (X is a part of Y). Hypernymy is transitive, but meronymy is not always.
Synonymy is a type of horizontal relation, but due to polysemy, two words are usually synonyms only in some contexts and not others. Near-synonyms are close in meaning but differ in degree, connotation, domain, etc. There are several types of antonyms, depending on whether the opposition is a polar or scalar property, reverses a relationship (teacher-student), reverses an action (build-destroy), etc. Some miscellaneous relationships include causation (kill = cause to die), purpose (the goal of teaching is learning), etc.
Ch6: Syntagmatic Relations in the Lexicon
A syntagmatic relation is a relationship between words used together in context, for example, the thematic role between a verb and its subject. There are several ways a combination might be invalid. Conceptual / ontological restrictions make some combinations fundamentally incompatible, like “the table shouted at me”. Lexical restrictions are when words select for semantic criteria, so “the boy is high” is incorrect (it should be “tall”). Some restrictions are purely conventional: you say “heavy rain” but not “strong rain”.
For words that can be combined, a free combination is a combination of compatible words, without any fixed phrases, collocations, metaphor, etc. Collocations are a type of restricted combination. The statistical definition of a collocation is two words appearing together more often than expected by chance; the linguistic definition requires that one word cannot be guessed from the other. For example, if you know the verb is “park X”, you know X must be a vehicle, but if you have “pay X”, you cannot guess that X = “attention”.
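The statistical definition (“together more often than by chance”) is usually operationalized as pointwise mutual information (PMI). The counts below are invented for illustration; only the formula is standard:

```python
import math

# Hypothetical corpus statistics (invented numbers).
N = 100_000                                   # total bigrams in the corpus
count = {"pay": 500, "attention": 400, "the": 6000, "cat": 300}
pair_count = {("pay", "attention"): 120, ("the", "cat"): 250}

def pmi(w1, w2):
    """Pointwise mutual information: log2 of how much more often the pair
    occurs together than expected if the two words were independent."""
    p_pair = pair_count[(w1, w2)] / N
    p1, p2 = count[w1] / N, count[w2] / N
    return math.log2(p_pair / (p1 * p2))

print(pmi("pay", "attention"))  # high: a collocation
print(pmi("the", "cat"))        # lower: closer to chance co-occurrence
```

A PMI near 0 means the pair co-occurs about as often as chance predicts; strongly positive PMI flags collocation candidates like “pay attention”.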
More modern linguists propose that a language is made not of a lexicon plus syntax, but of a set of constructions, each of which can’t be fully decomposed into simpler constructions. A prototypical construction is “the X-er …, the Y-er …”, but there are more abstract ones, like “Subject V Object Oblique”, which means “X causes Y to move to Z”. Idiomatic expressions like “take a stab” differ from collocations in that there’s very little ability to rearrange their components. When groups of words often occur together, they usually become more closely tied and eventually lexicalize into a single word.