Juan Chattah - Film Music - Cognition To Interpretation-Routledge (2024)
Film Music: Cognition to Interpretation explores the dynamic counterpoint between a film’s
soundtrack, its visuals and narrative, and the audience’s perception and construction of
meaning.
Adopting a holistic approach covering both the humanities and the sciences—blending
cognitive psychology, musical analysis, behavioral neuroscience, semiotics, linguistics,
and other related fields—the author examines the perceptual and cognitive processes that
elicit musical meaning in film and breathe life into our cinematic experiences. A clear and
engaging writing style distills complex concepts, theories, and analytical methodologies
into explanations accessible to readers from diverse disciplinary backgrounds, making it an
indispensable companion for scholars and students of music, film studies, and cognition.
Across ten chapters, extensive appendices, and hundreds of film references, Film Music:
Cognition to Interpretation offers a new mode of analysis, inviting readers to unlock a deeper
understanding of the expressive power of film music.
Juan Chattah is Associate Professor of Music at the Frost School of Music, University of
Miami, USA.
FILM MUSIC
Cognition to Interpretation
Juan Chattah
Designed cover image: © Juan Chattah
First published 2024
by Routledge
605 Third Avenue, New York, NY 10158
and by Routledge
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2024 Juan Chattah
The right of Juan Chattah to be identified as author of this work has been asserted in
accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in
any form or by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying and recording, or in any information storage or
retrieval system, without permission in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks or registered
trademarks, and are used only for identification and explanation without intent to
infringe.
Library of Congress Cataloging-in-Publication Data
Names: Chattah, Juan, author.
Title: Film music: cognition to interpretation/Juan Chattah.
Description: [1.] | New York: Routledge, 2023. | Includes bibliographical references and
index.
Identifiers: LCCN 2023029274 (print) | LCCN 2023029275 (ebook) | ISBN
9781138586703 (hardback) | ISBN 9781138586710 (paperback) | ISBN
9780429504457 (ebook)
Subjects: LCSH: Motion picture music—Analysis, appreciation. | Motion picture music—
Psychological aspects.
Classification: LCC ML2075. C468 2023 (print) | LCC ML2075 (ebook) | DDC
781.5/42—dc23/eng/20230621
LC record available at https://lccn.loc.gov/2023029274
LC ebook record available at https://lccn.loc.gov/2023029275
ISBN: 978-1-138-58670-3 (hbk)
ISBN: 978-1-138-58671-0 (pbk)
ISBN: 978-0-429-50445-7 (ebk)
DOI: 10.4324/9780429504457
Typeset in Optima
by Apex CoVantage, LLC
CONTENTS
Acknowledgments
Introduction
1 Empathy
2 CONTAINER Schema
3 LINEARITY Schema
5 Affordances
7 Archetypes
8 Associations
9 Categorization
ACKNOWLEDGMENTS
Behind the scenes of crafting this work, a cast of extraordinary individuals and institutions
graciously extended invaluable support, guidance, and expertise, transforming the writing
process into an awe-inspiring journey.
April Mann, your boundless dedication to the craft of writing guided me through the
labyrinth of words, your discerning questions became sparks that ignited a kaleidoscope
of ideas, and your sincere friendship enriched this journey in ways that words cannot fully
express.
I am deeply indebted to a vibrant community of colleagues and friends working in film
music and cognition, as their inspiring work and invaluable feedback stimulated illumi-
nating conversations and opened new avenues of exploration. My heartfelt thanks go to
Josh Albrecht, Mihailo Antović, Michael Austin, William Ayers, José Luis Besada, Daniel
Bishop, Janet Bourne, Brent Yorgason, Warren Buckland, James Buhler, Poundie Burstein,
Ewan Alexander Clark, David Clem, Maarten Coëgnarts, John Covach, Jeanine Cowen,
Mark Cross, James Deaville, Janice Dickensheets, Kevin Donnelly, Rebecca Eaton, Zohar
Eitan, Robert Fink, Lucio Godoy, Julianne Grasso, Michele Guerra, Erik Heine, Guido Heldt,
Hubert Ho, Julie Hubbert, Bryn Hughes, Deniz Hughes, David Huron, Jesse Kinne, Patrick
Kirst, Violetta Kostka, Mariusz Kozak, Peter Kravanja, Danijela Kulezic-Wilson, Gui Hwan
Lee, Frank Lehman, Scott Lipscomb, Charity Lofthouse, Justin London, Catherine Losada,
Sean McMahon, Kate McQuiston, Miguel Mera, Táhirih Motazedian, Scott Murphy, Chelsea
Oden, Mitch Ohriner, Steven Rahn, Ron Sadoff, Janna Saslaw, Tom Schneller, Matthew Shaftel,
Dan Shanahan, David Shire, Robynn Stilwell, Siu-Lan Tan, Renee Timmers, Yayoi Uno
Everett, Petros Vouvaris, Elsie Walker, Emile Wennekes, and Lawrence Zbikowski.
In the fervor of expressing gratitude, I may have unintentionally left out individuals who
have contributed significantly to bringing this book to fruition. To those inadvertently omit-
ted, please know that your support, guidance, and contribution are deeply appreciated.
I am indebted to the University of Miami, in particular to Charles Mason, Chair of Music
Theory and Composition, Shelly Berg, Dean of the Frost School of Music, and Guillermo
(Willy) Prado, Executive Vice President for Academic Affairs and Provost. Their steadfast
support and commitment to fostering intellectual growth allowed me to expand my
horizons and develop ideas beyond the confines of traditional disciplines.
INTRODUCTION
At long last! The opening night of Star Wars: Episode VI – Return of the Jedi. The lights inside
the movie theater begin to dim. A majestic fanfare accompanying the emblematic yellow
rollup of narrative crawling into the infinite darkness of space invites us into the film’s fan-
tastical realm. The camera pans to reveal an Imperial Star Destroyer moving toward an
intimidatingly massive Death Star. As a smaller shuttle hurtles from the Destroyer toward the
Star, the captain instructs, “Command station: This is ST 321. Code clearance: Blue. We’re
starting our approach. Deactivate the security shield.” In the music, a powerful brass ensem-
ble announces, “BOM BOM BOM, BUM BI-BOM”, accents of impending menace and
wrath, repeated lower and lower. At the Star, a commander rushes through a formation of
stormtroopers over to the gate, uneasy, anticipating the shuttle’s arrival. The shuttle lands. As
the shuttle’s hatch whooshes open, revealing its inner darkness, the music strikes its lowest,
most menacing notes, extending and completing the initial statement, “BOM BOM BOM,
BUM BI-BOM, BUM BI-BOM”. The commander shivers, turning pale as he hears heavy
footsteps and mechanical breathing emerging from the depths of the shuttle—Darth Vader,
Lord of the Sith, has arrived.
From the opening scene, the cinematography immerses us in a unique realm wherein our
psychophysiological responses subliminally circumscribe our interpretations. The music is
vital in shaping these interpretations: through its melody, we intuit Darth Vader’s presence
long before he appears on-screen; through its meter, we anticipate an encounter of martial
rivalry; through its register and loudness, we sense Darth Vader’s ominous and overpowering
temperament and resonate with other characters’ fears, feeling their anxiety firsthand. While
the music affects us subliminally, below the level of conscious attention, we sometimes
externalize our responses, grabbing a friend’s arm or gasping overtly. Most surprisingly,
despite the seemingly negative valence of our responses to the Star Wars scene discussed
here, we genuinely enjoy the sinisterly thrilling experience.
This remarkable phenomenon is not unique to a single film or scene. In all films, music
operates almost subliminally, activating sensorimotor reflexes and conveying complex messages
that motivate, support, highlight, complement, or even negate other facets of the cinematic
experience. Through the music, we suffer the protagonists’ pain, share their laughter,
shriek along with their fear, and bond with them through their cinematic journeys. We all
recognize that film music is designed to elicit a particular response, but how exactly does
the music influence our interpretations of scenes or entire films? What is the source of the
music’s expressive power? Are we hardwired to respond to it? To what extent do cultural
practices and social norms govern our responses?
DOI: 10.4324/9780429504457-1
To answer these questions, we must trace the logic underlying the perceptual and cogni-
tive processes that elicit musical meaning in film.1 This book’s primary goal is to construct
a comprehensive framework to explore those processes. Constructing such an ambitious
framework requires a holistic approach, one that navigates through the humanities and sci-
ences alike, one that draws on converging thoughts and disciplinary lines—music analysis,
psychology, behavioral neuroscience, semiotics, cognitive linguistics, and related fields—
foregrounding their implications for understanding the mechanisms whereby film music
contributes to our ability to construct meaning.2
I dub this framework ESMAMAPA—Empathy, Schema, Metaphor, Affordance, Memory,
Archetype, Personal Association. ESMAMAPA is ambitious in its scope, attempting to explain
how interpretations emerge in the listeners’ minds (and bodies) by accounting for the pri-
mary cognitive and semiotic processes related to music-based meaning construction. Figure
i.1 presents these processes as a taxonomy broadly structured on the relationship between
the music and how it generates meaning. While these mechanisms intersect and interact
when we construct interpretations, in this book, I consider them one at a time, system-
atically navigating through this taxonomy, disentangling the complexities of these multiply
interrelating mechanisms.
Following this introductory chapter, the book echoes the taxonomy presented in Figure i.1,
proceeding from (primarily) phenomenological to (primarily) intellectual perspectives. Indi-
vidual chapters present film vignettes that exemplify and probe the limits of the ESMAMAPA
framework, foraying into the extensions and implications of the various mechanisms of
meaning construction.
Chapters 1 through 5 draw on sub-disciplinary strands of embodied cognition and neu-
ropsychology to explore mechanisms based on analogy, discussing how material character-
istics of the music govern (or demand) a specific response from a listener and investigating
interactions between the music’s structure and the film’s visual or narrative dimensions.
Chapter 1 introduces a musical empathy model that builds on the discovery of the Mirror
Neuron System (MNS). Although this phenomenon was first identified in macaques, MNS
research reveals neurophysiological mechanisms that might underpin social cognition in
humans; and while findings were originally tied to visual stimuli, I draw here an analogy to
empathy triggered by aural stimuli. To distill the intricacies of music-based empathy, I further
trace our embodied responses to three submechanisms: entrainment, subvocalization, and
contagion. Chapters 2, 3, and 4 form a unit delving into Image Schema Theory and George
Lakoff and Mark Johnson’s Conceptual Metaphor Theory. These chapters elucidate how film
music acts as one agent in a multidimensional mapping process involving the visuals and
the narrative. Each chapter centers on a different image schema—CONTAINER, LINEARITY,
and SOURCE-PATH-GOAL. Chapter 5 takes James Gibson’s notion of ‘affordances’ as a point
of departure, examining musical parameters (meter and tonal centricity) not as properties of the
music itself but as emerging in the embodied conceptualization of musical experiences—
experiences in which the nature of our bodies determines both what and how music can
be meaningful.
While Chapters 1 through 5 engage with the phenomenology of music and investigate
interactions between the music’s structure and the film’s visual or narrative dimensions,
Chapters 6 through 8 begin to extricate the network of associations that emerge from
the music, framed by both the audience’s prior experiences and the film’s narrative.
These chapters continue to draw on cognitive psychology, but explanations dovetail with
theoretical models from cognitive linguistics and speculative notions from semiotics.
Chapter 6 discusses the impact of gestalt laws of perception on our recognition and
memorization of leitmotifs, drawing a parallel with Pavlov’s classical conditioning and
situating leitmotifs as ecologically relevant acoustic cues. Chapter 7 draws on the music-
semiotic theory of ‘musical topics’ and on Charles Sanders Peirce’s and Ferdinand de
Saussure’s sign typologies to clarify the nature and function of film music archetypes.
Because these archetypes exist outside a single film (inter-opus), becoming cultural units
the listener identifies via the music’s stylistic characteristics, this chapter investigates
musical topics from both a synchronic and a diachronic perspective. Chapter 8 borrows
and adapts the Conceptual Integration model developed by Gilles Fauconnier and Mark
Turner to reveal and help reconstruct the thought processes that induce hermeneutic
readings of scenes in which the music carries prominent cultural baggage or in which the
lyrics explicitly (or implicitly) contain clues for understanding the film.
Chapters 9 and 10 expand on the ESMAMAPA framework by exploring two supplemen-
tary mechanisms: categorization and transformation. Chapter 9, inspired by Eleanor Rosch’s
cognitive models of categorization, disentangles the notion of musical irony and concepts
commonly linked to irony, including parody, satire, sarcasm, paradox, hyperbole, and the
grotesque. Chapter 10 resumes the discussions on leitmotifs, topics, and conceptual blends,
identifying instances where, once established, these undergo transformations that signal
new dramatic conditions or plot developments.
As a prelude, a film example at the beginning of each chapter sets the stage for the mechanisms
to be explored. Subsequent examples within each chapter exemplify the mechanism
at hand and probe its boundaries. Although these examples represent a wide diversity of
approaches (in their musical soundtrack and artistic direction) and a broad range of genres
(from silent films to present-day productions), my central goal in choosing them has been to
illustrate the various cognitive and interpretative mechanisms of the ESMAMAPA framework
with utmost clarity. Given music’s shifting role(s) throughout a film, with multiple mecha-
nisms overlapping and interacting to construct meaning, some chapters, particularly later
ones, include holistic analyses integrating the theoretical paradigms and critical insights
from preceding chapters. Therefore, while advancing through the book and exploring differ-
ent meaning construction strategies applicable to film music, readers will increasingly find
themselves on familiar ground.
A fundamental presupposition underlying ESMAMAPA is that humankind’s cognitive
capacities and responses to music’s expressive forces are relatively uniform and universal—
but the interpretations that we construct stemming from such responses to the music are not.
Therefore, when unpacking the film vignettes interspersed throughout the chapters, I do not
claim interpretational certainty. Just as my interpretations inevitably reflect my own bodily
and intellectual engagement with the music, there are innumerable ways in which audience
members may traverse the expressive trajectories suggested by the musical underscoring.
Furthermore, some of my interpretations, mainly those in later chapters, assume an audience
with a certain familiarity with Western musical codes—a familiarity necessary, for example,
to identify Darth Vader’s signature melody. Nevertheless, while the creative interpretative act
that listeners bring to a film is inevitably informed by their own subjective (personal) and
intersubjective (shared) projections and inferences, the ESMAMAPA framework I develop
here can be attuned to investigate alternative interpretations.
Exploring the mechanisms involved in constructing meaning from film music requires
unpacking intricate and sophisticated scholarship. To make this scholarship accessible, I tai-
lor my writing style to a broad audience, using clear prose that includes only the necessary
technical language to ensure the depth of treatment and theoretical rigor expected by prac-
titioners and scholars. In terms of music and music-theoretical knowledge, I only assume
a basic familiarity with notation and theory fundamentals; this will allow readers to follow
along with my transcriptions of selected musical cues. In keeping with the book’s broader
goals, I forgo the temptation to delve all too deeply into any theory or cognitive mechanism
within the main chapters. Instead, I invite readers to consult the appendices, which offer a
more in-depth treatment of scholarship and furnish references to relevant models, methods,
and procedures in neuropsychology, cognitive psychology, and semiotics. Nevertheless,
even in the appendices, I synthesize these insights with broadly accessible language and
examples, optimistic that such prose will also appeal to those with even a passing interest
in this scholarship.3
As I completed the manuscript, colleagues teaching film music courses expressed an
interest in assigning this book (or selected chapters) as course materials. Their suggestions
prompted me to make two additions. First, although all film music can be explored through
the ESMAMAPA framework, finding suitable and clear examples for students is no small feat;
therefore, I include one appendix featuring a wealth of examples paired with questions and
discussion points. Second, although the prose may seem sufficient to grasp the intricacies
and subtleties of the arguments put forth in the various chapters, it is no replacement for
experiencing the film examples firsthand. Therefore, I invite readers to explore the online
media for all examples that include the play icon by visiting the book’s YouTube site,
accessible via the QR code in Figure i.2 or at www.youtube.com/channel/UCzNQHM7-ysgHrkrB4XYyqbA.
This dedicated online portal also contains additional examples and
expanded explorations of film music.
FIGURE i.2 Scannable QR code to access the book’s dedicated online portal.
Notes
1. Readers familiar with my earlier work will recognize traces of ideas and thoughts, yet all proposed
frameworks and examples illustrating these frameworks have been significantly revised, refined,
and expanded. My earlier work thus offers a spectrum of concepts and analytical perspectives that
runs from my initial impulse to understand the mechanisms of meaning construction in film music
to its crystallization in the proposed framework.
2. In seeking to construct a viable and useful framework, I draw on different kinds of inquiry, each
yielding different kinds of evidence, including introspective accounts, composers’ insights, theoreti-
cal speculation, and experimental results. I hope that each of these lines of inquiry and evidence
will carry its own significance so that readers who do not resonate with my interpretations of a
scene, for instance, might nonetheless find merit in the proposed framework or my application of
theoretical and empirical scholarship.
3. Much experimental research on metaphor and image schema stems from cognitive psychology or
neuropsychology. While research in cognitive psychology investigates behaviors, mental processes,
or psychological states, research in cognitive neuropsychology focuses on the corresponding neural
substrates. And while cognitive psychology draws on behavioral paradigms (such as reaction time,
priming, motion imagery, preferential looking, forced-choice matching, goodness-of-fit) to arrive
at results or test hypotheses, cognitive neuropsychology draws on computational modeling and
techniques such as functional magnetic resonance imaging (fMRI), electroencephalography (EEG),
magnetoencephalography (MEG), positron emission tomography (PET), and event-related potentials
(ERP).
1
EMPATHY
In Psycho’s iconic shower scene, the music assaults the listener. Marion stands in the shim-
mering white tub, placidly relaxing, as the water cascades down her body. The soothing
white noise drenches the soundtrack, concealing all other sounds. With her back to the semi-
transparent shower curtain, Marion is unaware of a dark, undefined, looming silhouette. The
shadow gets closer, taking the shape of an older woman. A hand with a menacing knife
reaches up and rips the curtain. Marion screams in horror as the blade ruthlessly slashes her,
again and again, as if tearing the very film apart—the comforting shower noise now turns to
contorted musical madness, with violent violin gestures echoing Marion’s piercing cries for
help. Through the music, we resonate with her frantic struggles—just as Marion, on-screen,
cannot avoid her frightful destiny, we, the captive audience, cannot avoid sharing her agony,
feeling her pain.
Film music makes you feel what the characters feel. We feel scared when the character is
scared, and brave when the character is brave. We feel angry, blue, joyous, lonely. We feel
like we are inside the story, and part of the action. We feel what we hear.
Listeners often ascribe such strong embodied responses to the music’s expressive power,
but this brings about more questions than it answers: What musical forces make us empa-
thize with the characters’ emotions? Are we genuinely experiencing these emotions or
merely inferring their emotional states? What happens in our brains and bodies when we
empathize with film characters? This first chapter begins to answer these questions by estab-
lishing a foundational framework for meaning construction: a model for empathy through
film music. In fleshing out this model, we address relevant scholarship on the neural and
bodily mechanisms underpinning the communication and interpretation of emotion through
music, and present readings of various film scenes to probe the model and highlight film
music’s extraordinary potential for meaning construction.
DOI: 10.4324/9780429504457-2
and bodily gestures; (2) when listening to the music, audience members mirror these musi-
cal gestures via three sub-mechanisms (entrainment, subvocalization, and contagion); and
(3) the physiological responses stemming from this internal simulation elicit in the audience
the composer’s intended affective state.1
Within the context of this model, musical gestures are correlates of vocal and bodily
gestures and postures: they emerge when those gestures and postures are mapped onto
the musical space.2 Through this mapping, the associated expressive content of physical
gestures or postures is not lost but translated.3 Whereas bodily gestures often translate to
dynamic musical figures—e.g., melodic or harmonic movement—bodily postures translate
to (relatively) static musical figures—e.g., single chords or notes.4
Because we share similar bodies and minds, our experience of living in our own bodies
allows us to attune ourselves to the actions, sensations, intentions, and emotions of others.
Therefore, central to this model is the presence of a music-based human mirror neuron sys-
tem (MNS), which activates critical brain regions dedicated to perception and action while
listening to music.5 Engaging the MNS in response to music, however, is not contingent
upon musical knowledge or instrumental proficiency. While listening to music, musicians
and nonmusicians alike sing along, tap the beat, mimic performative gestures (like air-guitar
performances), and even interpret the emotional intentions behind the music.6 A music-
based MNS, therefore, allows for a powerful means of communicating emotions—music
becomes a universal language.
To capture music’s wide-ranging parametric variability, this model integrates three mirror-
ing mechanisms: entrainment, subvocalization, and contagion. These mechanisms are analo-
gous yet complementary, each responding to different musical parameters—subvocalization
responds primarily to pitch, timbre, and contour; contagion to texture; entrainment to rhythm
and tempo. Each also furnishes different empathic functions—while entrainment allows us to
empathize with individual and group behavior, subvocalization works primarily at the indi-
vidual level, and contagion at the group level.
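The pairing of mechanisms, musical parameters, and empathic levels described above can be laid out as a small lookup structure. The following Python sketch is only an illustrative summary of that paragraph; the class name, field names, and helper functions are hypothetical and are not part of the model as published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirroringMechanism:
    """One mirroring mechanism of the empathy model (hypothetical structure)."""
    name: str
    parameters: tuple  # musical parameters the mechanism primarily responds to
    levels: tuple      # empathic level(s) it serves: "individual" and/or "group"


# Values transcribed from the paragraph above; the layout is an assumption.
MECHANISMS = (
    MirroringMechanism("entrainment", ("rhythm", "tempo"), ("individual", "group")),
    MirroringMechanism("subvocalization", ("pitch", "timbre", "contour"), ("individual",)),
    MirroringMechanism("contagion", ("texture",), ("group",)),
)


def mechanisms_for(parameter):
    """Names of the mechanisms that primarily respond to a musical parameter."""
    return [m.name for m in MECHANISMS if parameter in m.parameters]


def mechanisms_at(level):
    """Names of the mechanisms that support empathy at the given level."""
    return [m.name for m in MECHANISMS if level in m.levels]
```

Calling `mechanisms_for("texture")` or `mechanisms_at("group")` simply reads the taxonomy back; the sketch makes no claim about how the mechanisms interact in actual listening.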
Lastly, this model rests on the counterintuitive premise that musically elicited affective
states stem from our subjective experience of physiological changes in our bodies—simply
put, bodily states cause emotions, rather than the other way around.7 As a result, by modu-
lating our physiology—heartbeat, respiration, motor patterns—film music becomes a direct
channel through which we empathize with film characters.
Entrainment
By means of entrainment, we synchronize our bodies to temporally cyclic stimuli, to the
regularities and periodicities of kinetic events in our environments. Research in neuropsy-
chology provides ample evidence of premotor cortex activity during both production and
perception of temporal stimuli, firmly situating entrainment as a subcomponent of the
human MNS.8 And, as in all mechanisms related to the human MNS, we entrain to stim-
uli spontaneously and subconsciously, with motor responses manifested either overtly or
covertly—for example, overtly when toes tap along, covertly when heartbeats match an
external stimulus.9
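As a rough illustration of what synchronizing to a periodic stimulus can mean, the sketch below uses a simple linear error-correction scheme, a generic simplification and not the model proposed in this chapter. An internal beat with its own period gradually aligns with a steady external pulse. The function name `entrain` and the correction gains `alpha` (phase) and `beta` (period) are assumptions chosen for illustration.

```python
def entrain(stimulus_times, initial_period, alpha=0.5, beta=0.2):
    """Simulate a listener's internal beat aligning with external onsets.

    stimulus_times: onset times of the external pulse, in seconds.
    initial_period: the listener's starting internal beat period, in seconds.
    alpha, beta: hypothetical phase- and period-correction gains.
    Returns (predicted onset times, final internal period).
    """
    period = initial_period
    # The first heard onset anchors the first prediction.
    predicted = stimulus_times[0] + period
    predictions = []
    for onset in stimulus_times[1:]:
        predictions.append(predicted)
        error = onset - predicted             # negative: the beat came early
        period += beta * error                # adapt the internal tempo
        predicted += period + alpha * error   # shift the next expected beat
    return predictions, period


# A steady 120 bpm pulse heard by a listener whose internal beat starts near 75 bpm.
onsets = [i * 0.5 for i in range(60)]
predictions, final_period = entrain(onsets, initial_period=0.8)
```

With these gains the prediction error shrinks from beat to beat, so the internal period drifts toward the stimulus period. A heartbeat or breathing pattern foregrounded in a soundtrack would play the role of `onsets` in this toy setting.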
Film composers draw on our capacity for entrainment to modulate our affective states.10
Because bodily changes trigger affective ones, and because cardiac and respiration muscle
groups synchronize to a dominant sound, film soundtracks often foreground a character’s
increased heart- or breathing-rate to modulate our physiology and subliminally heighten
our alertness and stress response. Two examples, one from Midnight Express and one from
Gravity, illustrate this common technique deployed within extramusical components of the
soundtrack.
In an early scene from Midnight Express, Billy, an American college student, is about to
board the return flight from a vacation in Istanbul. He fears being caught smuggling two
kilos of hashish strapped to his body. As he enters the security checkpoint, we hear his rac-
ing heartbeat. The guard studies Billy and sneers, “Bag!” Billy tenses and unhurriedly opens
his shoulder bag, a bead of cold sweat rolling down his forehead. A second guard catches
sight of the sweat and commands Billy to take off his sunglasses. Billy’s audible heartbeat
increases further, as does ours. He knows his nerves are visible through his eyes, yet he dar-
ingly takes off his sunglasses and stares at the guards, trying not to look away. Throughout
this nerve-wracking scene, the thumping of his heart comes into the foreground, pushing
other soundtrack elements to the background, prompting us to entrain to Billy’s heartbeat
and experience his apprehension firsthand.
In Gravity, the soundtrack often foregrounds the protagonist’s breathing, prompting us
to entrain to it, thereby modulating our affective states. Outside the Explorer Space Shuttle,
Dr. Ryan Stone carries out her first repair mission. Her bulky white spacesuit is strapped to
a robotic arm, preventing her from drifting away into space. She safely floats close to the
shuttle, her measured breathing allowing her to focus on her work. Something flickers in her
eyes—a cloud of debris hurtling toward the shuttle, far too fast. A bulky piece of debris hits
the right wing, causing the robotic arm to flail uncontrollably. Another piece hits the robotic
arm, causing it to spin away from the Explorer at an incredible speed, Ryan still strapped to
it. Stars and objects orbit wildly in her field of vision. She must detach from the robotic arm
before it carries her too far from the shuttle. Ryan panics—we hear her breathing quicken.
Trembling, she unhooks the clasp, finally thrusting herself away from the robotic arm. Her
eyes sift through the black and incalculable void, glimpsing the shuttle, now shriveled into
a tiny dot in the distance. Her breathing speeds up further, as does ours. As the darkness
swallows her, she calls out for help:
Explorer, do you . . . do you copy? Houston, do you copy? Houston, this is Mission
Specialist Ryan Stone. I am off structure, and I am drifting. Do you copy? Anyone . . . ?
Anybody . . . ? Do you copy? Please copy. Please.
At the climax of this angst-ridden scene, the soundtrack ties her speech’s syntactic disin-
tegration to the increase in her breathing rate, compelling us to embody Ryan’s terrifying
experience by physically affecting us through entrainment.11
Entrainment is not limited to synchronizing to others’ bodily movements or audible vital
signs—cyclic patterns in the music also engage listeners through entrainment.12 Entraining
to the music’s tempo triggers physiological changes that, in turn, elicit affective ones.13 Film
composers skillfully draw on this phenomenon, prompting us to empathize with the charac-
ters, especially while they experience moments of high arousal.
The music in The Lord of the Rings: The Return of the King grips us and makes us feel what the
characters feel. The opening scene flashes back to Sméagol’s birthday, when he goes fishing with
his cousin Déagol at the idyllic River Anduin. Déagol hooks a mighty fish, which hauls him overboard.
As he lets go of the line, a golden ring lying in the silt catches his eye. Déagol climbs out
of the water, holding the glittering ring in his palm. The ring now reflects in the eyes of
Sméagol, who entreats, “Give us that, Déagol, my love! Because it is my birthday, and I wants it.”
Déagol ignores him, a smirk on his face. Driven by the ring’s destructive forces, they turn on
each other, fighting for possession of the ring. Sméagol tries to snatch the ring from Déagol’s
hand, again and again. Sensing that Déagol will not yield, Sméagol goes for his neck. In the
soundtrack, the ring’s hum blends with a musical rendition of an elevated heartbeat. As Sméagol
chokes Déagol to death, the thumping in the music decelerates and ultimately ceases.
Sméagol hisses, “My precious”, takes the ring as his own, and disappears. The underscoring
in this spellbinding scene prompts us to subliminally entrain to Déagol’s heartbeat, allowing
us to empathize with his distressing final moments and experience the power of the ring to
turn beloved cousins against each other in cold-blooded conflict.
FIGURE 1.4 The Lord of the Rings: The Return of the King. Déagol’s thumping heartbeat. [00:02:30]
In a scene from American Animals, the initial heartbeat-mimicking sounds become a
stylized rendition of a heartbeat in the underscoring. Four college students plot to steal
valuable books from their university’s Special Collections Library. On the day of the heist,
Spencer and Warren, unconvincingly disguised as older men, wait in a living room for Chad
to arrive. Spencer sits at a table, expressionless, tapping a heartbeat using two porcelain
animals. Warren paces back and forth, constantly checking his wristwatch. Spencer’s unre-
mitting tapping puts Warren on edge, causing him to shout at Spencer to make him stop.
Although the diegetic thumping ceases, a heartbeat rhythm emerges as non-diegetic music.
Chad arrives, also disguised as an older man. The three drive at a snail’s pace through a resi-
dential area, yet the musical rendition of an unrelenting heartbeat persists in the soundtrack.
Eric, the fourth man, stands waiting at the side of the road. Their van pulls up, and he gets in.
The four men sit in tense silence, while the musical heartbeat resonates in the soundtrack.
They arrive at the university campus and step out of the van. As they stroll through the open
quad toward an imposing mock-colonial building, the Special Collections Library, the styl-
ized musical rendition of the heartbeat becomes buried in other musical layers. The initial
transference of the heartbeat rhythm in the sound design—from diegetic (Foley) sound to
non-diegetic music—gives us an extended window of time to subliminally entrain to it,
thereby ensuring we will empathize with the characters’ apprehension and experience their
elevated tension in anticipation of the heist.
Our tendency to entrain to musical stimuli’s temporal regularities also allows directors
to establish a consistent pace in scenes with irregularly paced visuals. The ‘sensual haircut
scene’ from Edward Scissorhands exemplifies this phenomenon. All the housewives line up
with their dogs in a neighbor’s backyard. Joyce gapes as she holds Kisses on top of a table
and Edward grooms the toy poodle into a tiny, unique work of art. Fascinated, she flirts, “Is
there anything you can’t do, Eddy? I swear, you take my very breath away. Have you ever
cut a woman’s hair? Will you cut mine?” As she offers herself on a chair in front of Edward,
a sensual habanera sets in the music. He confidently swivels her head, studies her hair. She
flushes in anticipation. With calculated precision, he unleashes his skills—snipping here,
layering there. She luxuriates under his unwavering blades, eyes closed, a sensual smile on
her lips. The exhilarating music follows Edward’s swift and rousing movements, brushing
by the visually slow moments depicting Joyce’s ecstatic response, moments that threaten to
undermine the scene’s whirling relentlessness. He continues, cutting dangerously close. She
twirls her feet in pleasure. As he makes a final cut and steps back, satisfied, the music eases
back into the habanera. Glowing, Joyce purrs, “That was the single most thrilling experience
of my life.” Via entrainment, the relentless pace of the music overrides the pace of the visuals,
allowing us to experience the rousing intensity of the moment, bringing us along with
Joyce to a riveting climax; subsequently, the habanera takes us to a pleasurable repose, helping
us recognize how Edward’s sharp and strong hands helped transform Joyce’s persona as
much as her appearance, revealing the latent seductress within.
FIGURE 1.6 Edward Scissorhands. Joyce indulges under Edward’s blades. [00:49:30]
Subvocalization
Laughter, screams, cries, shouts, gasps, sighs—all are part of an extensive repertoire of vocal
gestures expressing innumerable affective states, including happiness, anger, sorrow, tri-
umph, awe, and frustration. These vocal gestures subliminally activate the listener’s vocal
musculature and neural substrates associated with producing those gestures via a mecha-
nism known as subvocalization.14 However, subvocalization is not limited to mirroring iso-
lated vocal gestures. Recent research shows that singing—which straddles the boundaries
between verbal and nonverbal gestures—also triggers subvocalization.15 As a result, because
nonverbal vocalizations and singing share the same production mechanisms and expressive
purposes, these modes of communication also share the associated physiological and affec-
tive states resulting from subvocalization—when we mimic vocal sounds, we feel firsthand
the emotion these sounds project.
When subvocalizing, we channel a signal’s acoustic signatures—subtle yet salient
nuances in a signal’s sonic parameters—through our vocal apparatus, (re)enacting the
associated physiological processes that trigger them and thus educing the associated emo-
tions. For instance, vocalizations expressive of anger feature anger’s characteristic acoustic
signatures—harsh timbre, angular contour, loud dynamic level, dissonant overtones. Simi-
larly, vocalizations expressive of tenderness feature that emotion’s contrasting characteristic
acoustic signatures—mellow timbre, smooth contour, moderate dynamic level, consonant
overtones.16 Acoustic signatures are therefore foundational to the mechanism of musical
empathy, transforming meaningless sonic signals into gestures with expressive power. Film
music exploits the expressive power of acoustic signatures embedded in vocal music to
great effect, especially to construct an empathic bond between film characters and audience
members.
FIGURE 1.7 Gladiator. Maximus mourns the death of his family. [00:42:50]
The human voice plays a pivotal role in Gladiator’s musical score, prompting us to empathize
with the protagonist’s affective states by engaging our MNS via subvocalization. Maximus
returns home from combat. At the sight of thick black smoke rising over the horizon, he
gallops homeward in a frenzy. He arrives at his worst nightmare—vineyards destroyed, the
earth scorched, houses smoldering—and collapses off the horse. The music emerges with
a rendition of a solo vocal lament over a low drone. Pulling himself up, he staggers past
the rubble, anticipating his greatest fear. As he catches sight of two crucified and charred
bodies—his wife, his son—he howls, tormented, sinking into despair. The music penetrat-
ingly depicts Maximus’s mourning throughout the scene using musical representations of
vocal and nonverbal sounds. The solo voice texture reflects the intimate grief of an inner
psychological space—the timbre of closed-mouth vocalizations conveys a suppressed emo-
tion, and the melodic contour suggests the visceral representation of an animal lament.17
While listening to the music, we subliminally engage our MNS through subvocalization,
which compels us to resonate with Maximus’s suffering.
Subvocalization extends even further. Research in auditory neuroscience broadens the
notion of subvocalization to include responses to a wide range of auditory stimuli, includ-
ing instrumental music.18 When we hear instrumental music, we mimic it internally, emu-
lating its sonic qualities within our vocal capabilities—that is, we resonate strongly with
vocal and instrumental gestures, provided we can reproduce them vocally. Instrumental
music’s expressive capacity is thus scaffolded by acoustic signatures stemming from vocal
gestures.19 However, since our auditory perception attunes mainly to human vocalizations,
vocal music elicits more powerful reactions than instrumental music; therefore, instrumental
gestures approximating voice-like characteristics will trigger a more pronounced subvocal
response.20
The music in 15 Minutes maps the characters’ bodily and vocal gestures onto instrumen-
tal ones. The film chronicles the chaotic journey of Emil and Oleg, two Russian ex-convicts
traveling to the U.S. to collect their share of a heist. The musical underscoring reacts in
tandem with their experiences, subliminally guiding our emotions, constructing an affective
bond that allows us to empathize, yet not sympathize, with these utterly unlikable charac-
ters. Two scenes exemplify this phenomenon by bringing back a motivic idea used at the
beginning of the film during the main titles sequence for character construction, albeit radi-
cally transformed to communicate two widely different affective states.
Early in the film, Emil and Oleg pay an unexpected visit to their old partner, Milos, who
has settled for a modest yet peaceful life in New York with his wife. Oleg, obsessed with
becoming a Hollywood filmmaker, captures these moments with a handheld camera. As
Emil demands his rightful share of the heist, he learns that Milos has spent all the money.
Emil furiously erupts, stabbing Milos and strangling his wife. Confused by his own murder-
ous impulse, Emil zigzags through the modest apartment, sobs, hunkers down, gesticulates
as if trying to explain what has just happened. In the background, wild musical gestures per-
formed on a violin echo these unsettling events—brusque attacks, wide and unpredictable
leaps, noisy and coarse timbres. Emil ransacks the old kitchen cabinets, tossing empty cans
and bottles, looking for something. Finding a metal container with acetone, he turns to the
bodies. Oleg, who has been filming the havoc, grunts, “Aw . . . Bohemian barbecue.” Emil
closes the curtains, shrouding the room in darkness. Transformed beyond recognition, the
playful motivic idea heard at the beginning of the film now allows us to experience Emil’s
bewilderment while accentuating his wrath.21 The underscoring’s musical gestures, featur-
ing acoustic signatures characteristic of anger, prompt us to empathize with the character’s
affective state—Emil’s fury—via subvocalization while being horrified by his actions.
Toward the end of the film, Oleg is fatally shot. Sprawled on the ground, video camera
in hand, he seizes the opportunity to capture a realistic performance of his own death. He
slowly pans from the Statue of Liberty toward a close-up shot of himself dying. A violin
enters the musical texture, but this time gently following Oleg’s quiet closing credits, antici-
pating the imminent mourning. Faithfully shadowing Oleg’s death, the motivic idea now
features the acoustic signatures of sorrow—smooth contour, narrow range, regular rhythm,
predictable harmonic flow. Once again, the underscoring’s musical gestures prompt us to
empathize with the character’s affective state—now Oleg’s suffering—via subvocalization.
Among the many acoustic signatures, timbre is arguably the most efficient in triggering
an affective response.22 Our perception of timbre is shaped by a tone’s harmonics, their
interaction affecting the composite sound’s degree of sensory consonance and dissonance.23
For instance, the sounds of a flute, ocarina, or French horn lack inharmonic partials (those
that deviate from whole-number multiples of the fundamental frequency), resulting in the instruments’
characteristic ‘mellow’, ‘warm’, ‘smooth’ timbre. In contrast, the sounds of a trumpet,
oboe, or most percussion instruments feature copious inharmonic partials, resulting in their
characteristic ‘harsh’, ‘rough’, ‘noisy’ timbre.24 Consequently, the presence and intensity
of inharmonic partials influence the associations we form between timbres and affective
states—dull timbres with sadness, brash timbres with anger.25
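The definition of inharmonic partials given above lends itself to a small numerical illustration. What follows is a minimal sketch in Python, not a method drawn from the research cited in this chapter: it measures how far each partial lies from the nearest whole-number multiple of the fundamental, expressed in cents (hundredths of an equal-tempered semitone). The function names, the 10-cent tolerance, and both example spectra are assumptions invented for illustration; they do not describe any measured instrument.

```python
import math


def deviation_in_cents(partial_hz: float, fundamental_hz: float) -> float:
    """Signed distance, in cents, between a partial and the nearest
    whole-number multiple of the fundamental frequency."""
    # Nearest harmonic number (never below 1, i.e. the fundamental itself).
    harmonic_number = max(1, round(partial_hz / fundamental_hz))
    nearest_harmonic_hz = harmonic_number * fundamental_hz
    return 1200.0 * math.log2(partial_hz / nearest_harmonic_hz)


def is_inharmonic(partial_hz: float, fundamental_hz: float,
                  tolerance_cents: float = 10.0) -> bool:
    """True when a partial deviates from the harmonic series by more than
    the tolerance. The 10-cent default is an illustrative assumption."""
    return abs(deviation_in_cents(partial_hz, fundamental_hz)) > tolerance_cents


# Hypothetical spectra on a 220 Hz fundamental (invented values).
harmonic_tone = [220.0, 440.0, 660.0, 880.0]
inharmonic_tone = [220.0, 606.0, 1188.0, 1964.0]

if __name__ == "__main__":
    for label, spectrum in (("harmonic", harmonic_tone),
                            ("inharmonic", inharmonic_tone)):
        flagged = [p for p in spectrum if is_inharmonic(p, 220.0)]
        print(label, "tone, partials flagged as inharmonic:", flagged)
```

A perceptually grounded tolerance would depend on register and listening context; the fixed threshold here serves only to make the distinction between the two kinds of partials visible.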
Film composers are keenly aware of timbre’s effectiveness in triggering a rapid emotional
response in the listener, and hence choose specific timbres deliberately when orchestrating
musical gestures. In the two examples from 15 Minutes, the widely different violin timbres,
resulting from different performance techniques, convey radically different emotions.
However, while some instruments, such as the violin, can produce a broad range of timbres,
others are more consistent in timbre, making them film music staples for
certain affective states.
In Jurassic World: Fallen Kingdom, the music projects honor and loyalty onto a dino-
saur. It introduces a French horn, whose natural timbre meets the necessary acoustic
signatures—high relative amplitude of the fundamental, low-frequency formants, low
perturbation, few inharmonic partials.26 Toward the film’s end, Owen attempts to per-
suade Blue, the last surviving female velociraptor, to be caged in a ‘sanctuary’. Out-
side the Lockwood Mansion, Blue’s birthplace, Owen treads down the steps toward her.
Having trained Blue from birth, he recognizes her noble nature and assures others, “It’s
okay. She won’t hurt us.” The underscoring to Owen’s words features a calming melody
on a French horn performing in its most comfortable register. Owen extends his hand to
caress Blue and mutters, “Hey, girl, come with me. We’ll take you to a safe place, okay?”
Blue chitters as she follows Owen’s gaze to a large, barred enclosure. Breathing heav-
ily, she backs away from his hand and locks eyes with him as though saying goodbye,
and, rejecting a life in captivity, she sprints into the forest to join the other dinosaurs.
The music contributes to anthropomorphizing Blue, prompting us to empathize with
her while reminding us that she is a loyal and honorable being. Moreover, by bestowing
Blue with human traits (such as nobility), the music foreshadows the film’s final message:
humans must learn to live with their environments and must embrace the “humanity” of
other animals.
FIGURE 1.10 Jurassic World: Fallen Kingdom. Owen recognizes Blue’s noble traits. [01:55:00]
Contagion
Musical contagion is rooted in ecological and evolutionary functions of prosocial interac-
tion and cooperation, such as ensuring task efficiency or the welfare of ingroup members.30
Synchronizing with others through music helps develop ingroup bonding and allows us to
collectively attain signal amplification by combining our individual sounds into a louder
and stronger whole.31 For example, singing or performing in groups engages both producers
and listeners, leading to the social cohesiveness, bonding, and affiliation typical of sport-
ing events, religious ceremonies, and political rallies. Within these environments, and like
behavioral contagion, musical contagion is automatic and subliminal, stemming from our
evolutionary tendencies toward developing social networks and ingroup cohesion.32
Film composers harness these powerful evolutionary and ecological tendencies. By intro-
ducing voluminous choirs or sizable instrumental ensembles where all members contribute
to a unified and cohesive whole, composers prompt us to engage in musical contagion,
subliminally leading us to align with the collective and become co-participants of the film’s
events.
As illustrated in the earlier example from Gladiator, the solo voice engages our MNS
through subvocalization, allowing us to empathize with the protagonist’s affective states. In
other moments in the film, the choir in the underscoring imbues Maximus with the power
he will exert as a leader of gladiators, engaging our MNS through contagion. The film thus
weaves through the various dualities of the protagonist’s personae—the intimate versus the
communal, the emotional versus the impassive, the desire for peace versus the inevitability
of war—and maps these dualities onto the music. At a vital moment in the film, Maximus
leads three untrained gladiators to a decisive victory in a brutal fight. As they stride into the
magnificent Colosseum, a massive wall of sound overtakes them—rowdy roars of ferocious
lions, vicious grunts of their opponents, swathing ovations from the agitated arena. The
battle is on, unleashing a chaotic flurry of dust and blood. Maximus takes control, leading
his gladiators, joining forces, battling together. The musical underscoring in these bonding
moments features the confident sound of a male choir in unison with the orchestra, with
an arrangement foregrounding the muscular sound of double basses and brass instruments.
The bustling energy at the arena mounts as Maximus wades through his opponents, slashing
through them heroically. A staggering spectacle. Maximus glances around, all his opponents
defeated. The arena is a graveyard of human and animal body parts. As the triumphant gladiators
exit the arena, a crowd of thousands stomps and cheers, “Maximus! Maximus!” By means of
contagion, the music places us alongside the crowd, partaking in the gladiators’ victory—the
great number of voices constructs a colossal virtual performing space, and the homorhythmic
movement of these voices projects (and instills in us) a sense of vigorous coalition.
FIGURE 1.14 Africa: The Serengeti. The newborn calf joins the herd. [00:35:10]
Using a choir to propel a sense of collective identity is not limited to stories about
people. In animal documentaries, for example, filmmakers deploy various cinematic
techniques and tropes that contribute to constructing an anthropomorphic narrative. In
a sequence from the documentary Africa: The Serengeti, we learn from an off-screen
narrator that “To survive in the midst of predators, calves must be able to stand within
minutes of birth.” The sequence begins with a struggling newborn wildebeest, surrounded
by an apprehensive herd, prompting the calf to stand—in the music, a distressed solo
female vocalist in the high register sounds in counterpoint to a ceremonial humming of
a male choir in the low register. While we further learn that “Those unable to stand must
be abandoned”, the calf is shown near ferocious predators waiting for the herd to leave.
In a last impulse of life, the calf finally manages to stand. The herd jauntily ‘cheers’ and
once more surrounds the calf, depicting a triumphal, exultant moment of ingroup cohe-
sion. The music maps this event by anthropomorphizing the herd’s sentiment of jubila-
tion—female and male voices now join forces, aligning their singing homorhythmically,
joyfully spanning a wider register, swelling with active melodic contours. Along with the
cinematography, the music in the scene engages our MNS—first via subvocalization and
then contagion—prompting an empathic response in the listeners, a response grounded
in their own human phenomenological experience yet funneled into a sympathetic com-
munal affiliation with the herd.
Although introducing music featuring a choir seems the most effective means to elicit
contagion, many excellent examples feature large orchestras instead. In Star Wars: The
Rise of Skywalker, Poe and the Resistance battle the First Order over Exegol. Outnumbered
and overpowered, Poe despairingly concedes defeat, calling out, “My friends. I’m
sorry. I thought we had a shot. But there’s just too many of them.” The music underscoring
Poe’s words features a single French horn, prompting us to empathize with his (and the
Resistance’s) noble and honorable goals. The voice of Lando Calrissian suddenly breaks
in, signaling a massive fleet of ships arriving from across the galaxy, spurring on the Resistance:
“But there are more of us, Poe. There are more of us!” With an outburst of orchestral
forces underscoring Lando’s words, the music also says, “There are more of us!”, and via
contagion, it brings us, the audience, into the film’s action as participants in the battle
alongside the Resistance, subliminally assuring us that ‘we’ now have the power in numbers
to defeat the First Order.
FIGURE 1.15 Star Wars: The Rise of Skywalker. Massive fleet of ships arrives. [01:53:30]
Coda
Film music puts us right there with the characters on the screen. It makes us feel what they
feel. By drawing on our innate capacity to empathize with others, it places us in a shared,
intersubjective, intercorporeal space that allows us to attune ourselves to the characters’
emotions, sensations, and intentions.33
In this chapter, I introduced a model for musical empathy in film. This model, like its
social empathy counterpart (outlined in Appendix I), unfolds as a mosaic of subcomponents
mediated by the mirror neuron system: (1) film composers map the essential features
of vocal and bodily gestures onto the musical space, (2) we subliminally mirror the
original vocal and bodily gestures via subvocalization, entrainment, and contagion, and (3)
this mirroring mechanism elicits in us the intended physiological and affective responses.
Ultimately, we redirect this empathic response toward constructing an interpretation of the
film’s narrative.34
Understanding musical empathy in terms of its constituent mechanisms—subvocalization,
entrainment, contagion—allows us to recognize that the music in Psycho’s iconic shower
scene was carefully designed to assault the listener. Whereas the stylish visuals, with extreme
close-ups and swift edits, only imply violence, the shrieking violin gestures ratchet up the
tension and subliminally force us to mirror the character’s pain. While we do not see the
knife stabbing Marion, we feel the psychotic violence through the piercing music.
Notes
1. This musical empathy model substitutes several components of the social empathy model I pro-
pose in Appendix I in this volume. As with the proposed model for social empathy, the physical
and psychological attunement this musical empathy model purports has not been empirically
verified as a whole; nevertheless, much recent research in auditory neuroscience and allied fields
supports the individual mechanisms contributing to the music- or sound-borne intersubjective
mirroring of physiological and affective states.
2. Philosophers and musicologists have long recognized such seamless cross-domain mapping of
musical gestures and their associated affective states: Barthes (1985) alludes to the expressive
interplay between the music and the body, and characterizes musical gestures as “figures of the
body, whose texture forms musical signifying” (p. 306); Hatten (2004) points us to the intermodal
(or rather amodal) nature of musical gestures and their communicative function, defining them as
“emergent gestalts that convey affective motion, emotion, and agency” and noting that “the basic
shape of a gesture is isomorphic and intermodal across all systems of production and interpreta-
tion” (p. 109); and similarly, using language that traces musical gestures as emerging from neural
and physiological states, Lidov (2005) notes that “the notion of expressive gesture interpreted as
the surface form of an underlying neurological function gives us a logical connection between
somatic and musical experience, one which shows a basis for cause and resemblance in the rela-
tion of music to feeling” (p. 152).
3. While vocal and bodily gestures and postures may have a social or cultural origin, this chapter
only considers musical gestures and postures stemming from biological or physiological states.
Later chapters consider gestures that emerge from social milieus and those grounded in symboli-
cally motivated conventionalized movements, which result in arbitrarily established and conven-
tionalized figures (accompaniment patterns, metrical parameters, rhythms, etc.) and which rely on
the listener’s cultural and gestural competency to be understood.
4. Godøy (2003) examines the connection between gestural imagery and musical imagery, arguing
that gestural images are essential in triggering and sustaining mental images of musical sound.
5. Much research draws on experimental evidence to extrapolate a specialized music-based mecha-
nism from a more generalized MNS. For instance, Koelsch et al. (2006) and Menon and Levitin
(2005) show we engage critical regions of the MNS (the premotor cortex and insula) during music
listening. Extending similar research to a potential mechanism of emotional communication
through music, Overy and Molnar-Szakacs (2009) explain that, while listening to music, “interactions
between the [MNS] and the limbic system may allow the human brain to ‘understand’
complex patterns of musical signals and provide a neural substrate for the subsequent emotional
response” (p. 490). For further insights on a music-based MNS see Gridley and Hoff (2006);
Molnar-Szakacs and Overy (2006).
6. Overy and Molnar-Szakacs (2009) show that nonmusicians “activate the MNS during music listen-
ing when mapping [musical stimuli] onto basic, nonexpert musical behaviors that they are able to
perform such as singing, clapping and tapping” (p. 494).
7. Much research supports this proposition. For instance, Molnar-Szakacs and Overy (2006) cite
research showing that music induces facial expressions in listeners which in turn elicit affective
states, leading them to conclude that “the perception of emotion in music may arise in part from
its relation to physical posture and gesture” (p. 238).
8. Large and Snyder (2009), for example, argue that the “perception of pulse and meter result from
rhythmic bursts of high-frequency neural activity . . . [which] enable communication between
neural areas, such as auditory and motor cortices” (p. 46). Given the temporal accuracy
required when investigating entrainment, most studies draw on MEG or EEG. For further insights
on the neuropsychology of entrainment see Doelling and Poeppel (2015); Fries (2005); Grahn and
Brett (2007); Janata and Grafton (2003); Nozaradan et al. (2011); Will and Berg (2007).
9. Seeking to understand the covert response to entrainment, Fujioka et al. (2012) used magnetoen-
cephalography (MEG) to measure the participants’ neural oscillations while they passively listened
to rhythms; their results show that, although participants remained motionless, their neural oscilla-
tions (Beta waves) synchronized to the rhythmic stimuli. In addition, neuroimaging research shows
that entrainment to rhythmic periodicities (including biological rhythms) recruits the perceptual,
motor, and sensorimotor cortical areas (Grahn & Rowe, 2009; Zatorre et al., 2007). For further
insights on entraining to heartbeats see Anishchenko et al. (2000); Lorenzi-Filho et al. (1999). To
date, however, there is no conclusive evidence regarding the relationship between the music’s
tempo and the listeners’ heartbeat rate. A thrilling scene in Barbarian offers a compelling exam-
ple of how the sound design can potentially elicit increased heart and breathing rates—as Tess
descends a dimly lit staircase into the underground tunnels of a rental house, her heart and breath-
ing rates audibly intensify. [00:39:00]
10. Although there is no conclusive evidence that individuals spontaneously entrain precisely to the
music’s tempo—that is, below the level of conscious attention—most studies suggest that the music’s
tempo subliminally modulates our behavior. For instance, Leman et al. (2013) suggest that
“some music is activating in the sense that it increases the [walking] speed, and some music is
relaxing in the sense that it decreases the [walking] speed” (p. 1).
11. Much like listening to (and mirroring) musical gestures by others causes in us physiological
changes that induce affective states, producing our own musical gestures also causes physiological
changes that modulate our own affective states. We can (and often do) utilize this mechanism for
self-regulation. The beginning moments in the scene from Gravity illustrate the use of entrainment
and subvocalization as a means for self-regulation. Debris dangerously crashes at high speeds,
shattering parts of the station, but Ryan has no option but to continue untangling the parachute
ropes. In an attempt to calm herself, she begins humming a peaceful melody. Her humming serves
her well as an affective self-regulatory mechanism, keeping her thriving in a dangerous situation.
However, her voice and the music vie for control of the narrative, and ultimately, the relentless
underscoring, with its ever-increasing intensity and agitation, overpowers the audience’s senses
and forces us not to be soothed by her tranquil lullaby.
12. Sato et al. (2012), for example, provide evidence that listeners “unconsciously changed their respiration
timing to coincide with the music track” (p. 255). Analogously, Juslin et al. (2010) argue
that the “rhythm of the music interacts with an internal body rhythm of the listener such as heart
rate, such that the latter rhythm adjusts towards and eventually ‘locks in’ to a common periodicity”
(p. 621).
13. For example, Husain et al. (2002) show that exposure to fast tempi increases arousal and tension.
Similarly, in a related study on brain stem reflex—which controls processes of the autonomic
nervous system such as pulse, respiration, heart rate, and skin conductance—Juslin and Västfjäll
(2008) explored the processes whereby music induces emotions. They identified tempo as the
most significant parameter in a modulating effect, and as the parameter triggering the broadest
range of emotional responses. For an in-depth exploration of the psychophysiological responses to
musical tempo and entrainment see Etzel et al. (2006); Khalfa et al. (2008); Levitin (2006); Scherer
and Coutinho (2013); Trost and Vuilleumier (2013); Van der Zwaag et al. (2011).
14. Numerous studies record covert subvocalizations at the musculoskeletal level using laryngeal
electromyography (e.g., Brodsky et al., 2008) and at the neural level using functional magnetic
resonance imaging and single-cell recordings (e.g., Harris & De Jong, 2014; Mukamel et al.,
2010). Additionally, researchers use transcranial magnetic stimulation to disrupt brain activity in
motor areas associated with subvocalization; Lima et al. (2016), for example, note that impairing
subvocalization also impairs speech discrimination. For further insights on subvocalization see
Bestelmeyer et al. (2014); McGettigan et al. (2015); Warren et al. (2006).
15. Subvocal responses to human singing have been widely observed at both the musculoskeletal and
the neural levels. Lévêque et al. (2013) note that “the perception of a human-produced sound
like the singing voice [induces] motor resonance via interactions between the auditory and vocal
systems” (p. 1), and Callan et al. (2006) found that “neural processes underlying both perception
and covert production of singing and speech activate overlapping brain regions” (p. 1334).
16. Di Stefano et al. (2022) examine the perception and aesthetics of musical consonance and dissonance
and review relevant scholarship on the topic. They present three hypotheses (vocal
similarity, psychocultural, and sensorimotor) while exploring the biological underpinnings of the
attraction-aversion mechanisms triggered by consonant and dissonant stimuli.
17. Sachs (1962) speculates that particular melodic contours derive from animal howls or wails; he
identifies examples in Western classical, Russian, Australian aboriginal, and Lakota (Sioux) music.
18. Much research explores subvocalization triggered by nonvocal and nonmusical sounds. In a sys-
tematic review, Lima et al. (2016) argue that “because of the multifaceted and flexible nature of
vocal production [responders] generate sensorimotor estimations of different properties of sound”
(pp. 538–539).
19. Using fMRI, Koelsch et al. (2006) observe that instrumental music activates the brain region responsible
for both perceiving and executing vocalizations; they note this activation in the premotor
cortex area responsible for movements in the larynx, even when participants did not exhibit overt
movements. From a speculative, albeit empirically informed perspective, Cox (2001) conflates vocal
and instrumental subvocalization, remarking that “when others speak or sing, we understand
these sounds partly in terms of our own experience of making the same or similar sounds. When
others make sounds on musical instruments . . . we understand these human-made sounds . . . in
terms of our own vocal experience, by way of subvocalization” (p. 201). Other scholars acknowl-
edge a mirroring mechanism without directly addressing a mirror neuron system; Heidemann
(2016), for instance, notes that “In listening to vocal music, we may involuntarily mirror the actions
we imagine the performer undertaking” (par. 1.1). Upon surveying existing scholarship, Lima
et al. (2016) conclude that passive music listening also activates neural substrates responsible for
subvocalization and that, more generally, the “supplementary and pre-supplementary motor areas
play a role in facilitating spontaneous motor responses to sound” (p. 527).
20. This phenomenon, which is consistent with findings related to the MNS, has been explored in
relation to vocal versus instrumental melodies (Watts & Hall, 2008) and in relation to the speed
of auditory processing of vocal versus nonvocal sounds (Agus et al., 2010). Lévêque and Schon
(2015), for example, found increased activity in the auditory cortex to vocal melodies; they note
that “hearing a (human) singing-voice involves more strongly the sensorimotor system than hear-
ing the same melody played with a non-human timbre”, concluding that this increased activation
is due to “a facilitated matching between the perceived sound and the participants motor repre-
sentations” (p. 58). For further insights on the subvocalization of instrumental versus vocal timbres
see Wilson et al. (2004).
21. This example supports Hurley’s (2008) assertion that “hearing anger expressed increases the acti-
vation of muscles used to express anger” (p. 10).
22. Timbre requires the least amount of time to unfold, while other acoustic signatures, such as con-
tour, harmonic progression, or rhythm, necessitate much longer timeframes. Gjerdingen and
Perrott (2008), for example, identified that participants could classify a piece’s genre (which is
primarily defined by its timbre) within a 250- to 475-millisecond timeframe.
23. While sensory dissonance—the physiological phenomenon of ‘beating’ resulting from exposure
to two pitches—can be accurately quantified as manifested in the cochlea, musical dissonance is
an evaluative notion contingent upon compositional style and musical vocabulary.
24. For the acoustic signatures of various instruments see Juslin and Laukka (2003).
25. For research on the relationship between timbre and emotion in music see Balkwill and Thompson
(1999); Gabrielsson and Juslin (1996).
26. Originally, the French horn did not feature valves and thus was only capable of producing the
notes of the harmonic series.
27. In an experiment measuring degrees of skin conductance in response to consonant and dissonant
harmonic constructs, Winold (1963) found that dissonance triggers more significant sweating than
consonance. Studies attempting to trace the ontogenetic origin of our response suggest that our
aversion to dissonances and our preference for consonance is innate. When presented with a con-
sonant and a dissonant stimulus, for example, infants attend to consonant stimuli for longer times,
suggesting that they are preferentially attuned to consonances rather than to dissonances (Crowder
et al., 1991; Masataka, 2006; Trainor et al., 2002; Zentner & Kagan, 1998). However, Plantinga
and Trehub (2014) question this research and note that their findings are “inconsistent with innate
preferences for consonant stimuli” (p. 40), suggesting that our preference related to consonance/
dissonance must be a learned response.
28. Numerous hypotheses contribute to this line of thought, some grounded in perception—e.g., the
presence of beating (or roughness) when hearing two dissonant pitches triggers spectral masking,
hindering our capacity to perceive the individual stimuli and in turn causing a feeling of irritation
(Huron, 2006)—some grounded in ecological or evolutionary theories—e.g., sensory dissonance
is characteristic of warning calls and intimidation vocalizations in many species, and hence as-
sociated with potential danger (Ploog, 1992)—and some grounded in neuroscientific observations
of functional covariations of brain regions—e.g., sensory dissonance activates the parahippocam-
pal gyrus, an area implicated in the processing of stimuli with negative emotional valence (Blood
et al., 1999; Koelsch et al., 2006).
29. In the example from Psycho, the initial shrieking gesture expands toward the lower register,
introducing major sevenths and minor seconds that form harsh harmonic dissonances. The
music thus unfolds from dissonance within a complex tone toward dissonance within harmonic
constructs.
30. Lima et al. (2016), for example, note that “some sounds, owing to their rhythmical patterns
[i.e., entrainment] or their social and motivational salience [i.e., contagion], elicit motor re-
sponses, such as singing, tapping, dancing, or vocal alignment . . . Such propensity to respond to
rhythmic, social, and emotional auditory information might promote social convergence, learn-
ing, coordination, and affiliation” (p. 537). Within the context of the BRECVEMA model, Juslin
(2013) describes ‘emotional contagion’ in music as “a process whereby an emotion is induced
by a piece of music because the listener perceives the emotional expression of the music, and
then ‘mimics’ this expression internally” (p. 241). While the main thrust of BRECVEMA draws
on an evolutionary perspective, the model presented in this chapter draws on embodiment and
social cognition.
31. Overy and Molnar-Szakacs (2009) note that “whether making entirely different musical contribu-
tions to weave a musical texture, or all producing exactly the same sounds, the whole is much
greater than the individual parts, from a choir to a drum circle to the stadium bleachers. The
emerging sound is a group sound, almost ‘larger than life’, created by a sense of shared purpose”
(p. 495). They also warn us, “when group music-making reaches a certain level of cooperation and
coordination, the sense of shared purpose and togetherness can be extraordinarily powerful and
even threatening” (p. 495). See also Merker et al. (2009); Phillips-Silver et al. (2010).
32. In a systematic review of the literature on embodied simulation, Juslin and Västfjäll (2008) identi-
fied ecological and evolutionary functions of a music-based MNS, including enhancing group
cohesion and social interaction, learning, and emotional contagion.
33. Hoeckner et al. (2011) introduced the idea of empathy to film musicology, yet they do not propose
a model for musical empathy or distill the various mechanisms involved.
34. Musical empathy represents a compelling resource for composers and proves foundational to the-
orizing more broadly about meaning construction in film music; however, as the following chap-
ters will discern, it is but one of many sources we draw upon when constructing interpretations.
2
CONTAINER Schema
In The Truman Show, the music (de)constructs a fabricated reality, the ‘show’ that is Tru-
man’s life. It is nighttime. Truman is fast asleep. A gentle and melancholic piano piece
underscores a close-up shot of his peaceful face. Christof, the show’s creator, approaches
a giant ON-AIR monitor broadcasting Truman to the world and reaches out with his arm,
virtually caressing Truman’s face. Truman twitches, almost as if feeling Christof’s presence.
As the camera pans away from this intimate moment to show the entire production studio,
a pianist comes into sight—to our surprise, he is playing the very music we hear. By disrupt-
ing the storyworld’s sonic boundaries and transferring sounds within them, the soundtrack
reveals Christof’s power and control over the fictional narrative Truman inhabits and over
our interpretation of that narrative.
Films immerse us in their storyworlds. They surround us with music, sound effects, and
dialogue—all distinct components of their fictional reality. Every so often, these components
become rearranged, freed and unbound from normative paradigms and sound design con-
ventions, prompting us to construct metaphorical interpretations of the film’s narrative. In
this chapter, we first delve into traditional sound design practices by observing the most rep-
resentative arrangements of sonic ‘containers’ and then investigate examples that break the
mold, that disrupt pre-established mental constructs to add a layer of thematic complexity.
DOI: 10.4324/9780429504457-3
all three diegetic containers, as shown in Figure 2.2. The diegetic music in the scene—
emphasized through the characteristic ‘source-identifying’ shot—contributes toward con-
structing and further defining the characters’ identities.
The sound design of a scene in Star Wars: The Return of the Jedi, diagrammed in Figure 2.3,
offers a different, yet still typical, configuration of sonic containers. While in the example
from Titanic several source-identifying shots indicate that the music belongs to the diegesis,
the absence of such shots in this scene from Star Wars gives the music a different purpose.
Here, the music acts as an off-screen narrator informing us about plot developments, with
Darth Vader’s leitmotif signaling his imminent presence.5
FIGURE 2.3 Star Wars: The Return of the Jedi. Normative arrangement of sonic containers.
[01:59:45]
Un-Contained Soundtracks
In the context of a clear diegetic and non-diegetic polarity, and a clear distinction between
the voice, music, and sound effects, any anomaly or disruption of such normative sound
design becomes a marked event, one that prompts us to construct metaphorical interpreta-
tions. The following sections unpack unusual interactions among sonic containers, interac-
tions realized in the sound design via three techniques that I call overlap, replacement, and
transference.6
Overlap
Overlap entails the simultaneous layering of various sonic containers. The overlap of
music, voice, and sound effects, each in a single diegetic form, corresponds to normative
sound design. However, the overlap of the diegetic and non-diegetic forms of any single
dimension—e.g., the overlap of diegetic and non-diegetic music—denotes a structural dis-
ruption that indicates a marked event in the narrative.
FIGURE 2.4 Farewell, My Lovely. Overlap during Philip’s psychedelic experience. [00:50:15]
FIGURE 2.5 The Conversation. Overlap highlighting the tension between Harry’s two worlds.
[01:52:00]
In Farewell, My Lovely, a gang captures Philip and takes him to a clandestine brothel
for interrogation. He is tortured in the kitchen but resists providing answers to the
madam; as a last resort to get information, she injects him full of hallucinatory drugs.
The soundtrack begins to blur the boundaries between sonic containers: diegetically
ambiguous laughter and clashing of kitchen pots and pans overlap with non-diegetic
synthesizer drones and extended instrumental techniques. This disorienting overlap, dia-
grammed in Figure 2.4, impairs our embodiment of clear sonic containers by dissolving
their boundaries and intermingling their contents. As a result, this overlap metaphori-
cally depicts Philip’s mental state and prompts us to embody his psychedelic experi-
ence—although trying to maintain his sanity, he is unable to navigate his way through
the sonic environment.
The soundtrack to The Conversation uses instrumentation to delineate the protagonist’s
inner worlds: the diegetic saxophone represents Harry’s need for social interaction, while
the non-diegetic piano represents his solitary existence. In a scene toward the end of the
film, the overlap between the diegetic saxophone and the non-diegetic piano reflects Har-
ry’s emotional state; as shown in Figure 2.5, these two distinct forms of the music dimension
overlap but resist blending. Additionally, by omitting the voice and the sound effects (in
either form of diegesis), the sound design directs our attention to the tension emerging from
this conflicting overlap in the music, a tension that reflects the dissociation of Harry’s private
reality and social life.
In Punch-Drunk Love, distinct sonic environments delineate the worlds of each protagonist:
a non-diegetic waltz represents Lena’s steady and graceful personality, while random
and chaotic sounds of a portable harmonium, presented at times diegetically and at times
non-diegetically, represent Barry’s unstable and volatile personality. Throughout the film, the
soundtrack keeps these incompatible sonic environments separate until the film’s last scene.
Lena serenely walks into Barry’s workshop, her gait pairing with the beat of her gentle non-diegetic
waltz; Barry sits at the harmonium playing a tranquil melody, learning to master
the instrument just as he learns to contain his scattered thoughts. As Lena approaches Barry
and lovingly folds her arms around him, the non-diegetic waltz and the diegetic harmonium
melody overlap, complementing and embracing each other. Figure 2.6 illustrates this har-
monious overlap.
FIGURE 2.6 Punch-Drunk Love. Overlap suggesting the characters complementing each other.
[01:27:30]
Replacement
Replacement entails omitting sounds we expect in a sonic container and presenting them,
in a substitute form, in another container. When the original sound and its substitute share
acoustic characteristics, replacement goes almost unnoticed—for instance, omitting the
diegetic sound of thunder while including a non-diegetic musical rendition of thunder using
percussion instruments. On the other hand, when introducing substitute sounds that deviate
significantly from our expectations, replacement calls attention to itself. In such replace-
ments, the substitute sounds’ connotations come to the foreground, effectively imbuing the
narrative with subtextual commentary.7
By the 1930s, films had introduced sound. Alexander Nevsky (1938), Eisenstein’s first
sound film, circumvented some of sound design’s practical challenges by shooting scenes to
music. This resulted in one of the earliest instances of replacement.8 At the climax of the Bat-
tle on Ice, Alexander challenges the Teutonic Grand Master to a duel. The two knights duel on
horseback while fighters from both camps gather around and watch the brutal exchange—
clashing swords, skidding horses. The music maps the action, introducing a furious contest
of their respective themes and violent anvil strikes. As diagrammed in Figure 2.7, Prokofiev’s
non-diegetic music effectively stands in for the diegetic sound effects; yet, because of the
similarity between the sound of clashing swords and a metallic idiophone, this replacement
may go unnoticed. Nevertheless, embedding the diegetic sound effects within the non-diegetic
music delivers an artful sonic construction that contributes to the viewers’ perception
of a stylized battle.9
FIGURE 2.8 City Lights. Replacement highlighting the absence of content. [00:01:30]
City Lights (1931), another early film from the sound era, contains music and sound
effects, but no voice. Throughout the film, the marked differences between the non-diegetic
music and the diegetic voice it often replaces enable the director to embed narrative over-
tones.10 The film begins with the unveiling of a monument, where dignitaries address the
gathered citizens. As the City Mayor begins his speech, a quacking kazoo sound replaces
his voice. Moments later, a civic leader approaches the microphone and begins her speech,
sounding out a similar garble and quacking, only within a higher, more feminine-sounding
register.11 Figure 2.8 presents a diagram of this politically charged replacement. The narrative
indeterminacy of the scene—at the beginning of the film and with no setup—leaves ample
room for interpretation: since the rhythm and intonation of the kazoo sounds so faithfully
reflect the rhythm and intonation of a political speech, this replacement suggests that
political speech tends to lack intelligible and meaningful content.12
FIGURE 2.9 The Errand Boy. Replacement during Morty’s impersonating the Chairman. [01:25:20]
While some listeners may argue that the kazoo sounds in City Lights reside at the boundary
between music and sound effects, there is absolutely no such ambiguity in The Errand
Boy. When Morty, the errand boy at an entertainment company, enters the empty office of
the Chairman of the Board, he seizes on the opportunity to sit at the Chairman’s desk and
fantasize about being the Chairman. As he addresses imaginary board members, the non-
diegetic version of “Blues in Hoss’ Flat” from Count Basie’s Chairman of the Board album
replaces Morty’s diegetic voice. This amusing replacement, shown in Figure 2.9, maximizes
the comedic power of Morty’s impression by enhancing his already hyperbolic appropria-
tion of the Chairman’s physical gestures.
An example from The Hours illustrates the opposite replacement: diegetic voice
replacing non-diegetic music. The film explores multiple facets of one personality, a personality
that cannot be portrayed by or contained within just one character; instead, all charac-
ters in the film contribute to defining a single, abstract persona. At the Hogarth House in
1920s England, Virginia stands alone in the hall as Vanessa, obviously upset, rushes with her
daughter, Angelica, to the waiting taxi. In 2001 New York, Clarissa stands in the middle of
her living room as Louis leaves, relieved to be on his way. Virginia and Clarissa, each within
their own time and place, rest on chairs, still absorbed in the unpleasant departures. The
non-diegetic minimalist figures in the underscoring create an almost arithmetical anticipa-
tion of a musical cadence. The final cadence, however, is not provided by the music, but by
a diegetic sound that crosses through the narratives: the characters’ exhalations. Figure 2.10
illustrates this breathtaking replacement, where the diegetic voice interacts with and ulti-
mately replaces the non-diegetic music, helping construct a shared narrative that crosses
time and space.
In a scene from Mission Impossible II, a skillfully concealed replacement contributes to
character construction. The sound of castanets and animal-like cries of Flamenco dancers
resonate through the Andalusian night. It is a large private party, with dancers performing
on a raised wooden platform. Through the swirling skirts and pounding heels, Ethan, leader
of the Impossible Missions Force, catches a glimpse of Nyah, a highly capable professional
thief. She suddenly vanishes from sight, rushing up the stairs, masking her footsteps with the
sound of the Flamenco dancers’ steps. In the soundtrack, the diegetic music also functions
as diegetic sound effects.13 This replacement, diagrammed in Figure 2.11, contributes to
defining Nyah’s quick-witted persona while establishing one of the primary metaphors in the
film: the equation of intricate fighting and theft with dance.
FIGURE 2.11 Mission Impossible II. Replacement to cover Nyah’s sounds. [00:12:30]
In a scene from Titus, a moving replacement delivers a paralyzing subtext. Lavinia stands
amidst a burned field. Her hands and tongue have been cut off to keep her from revealing
what she saw. Marcus arrives at the horrific scene and implores:
Speak, gentle niece. What stern ungentle hands have lopp’d and hew’d and made thy
body bare of her two branches, those sweet ornaments, whose circling shadows kings
have sought to sleep in, and might not gain so great a happiness as have thy love? Why
dost not speak to me?
As Lavinia opens her bloodied mouth and screams, her diegetic voice is omitted and
replaced by non-diegetic music.14 This mesmerizing replacement, shown in Figure 2.12,
allows the music to take control of the narrative by using a muted flute to portray Lavinia’s
inability to speak—her voice, silenced.
Transference
Transference entails presenting a sonic event within a specific container and subsequently
shifting it to another container.15 Transferences between diegetic music and non-diegetic
music containers are quite common and often go unnoticed. Their very presence, however,
suggests potential narrative entailments.
FIGURE 2.13 The Milk of Sorrow. Transference depicting a transactional moment. [00:56:00]
The Milk of Sorrow paints a troubling picture of modern Peruvian society, vividly depict-
ing the stark differences between social sectors and their means for negotiating material and
spiritual wealth. Fausta, a Peruvian-indigenous woman who possesses a transcendent and
spiritual gift for song, works as a maid for Aída, a white upper-class pianist and composer
searching for fresh musical materials. Fausta is reluctant to give away her songs, yet Aída
promises the financial remuneration that Fausta direly needs. Ultimately, Fausta engages in
the painful exchange. In one scene, as Fausta walks silently toward Aída, her non-diegetic
singing sounds in the background; when Fausta faces Aída, she begins to sing the song
aloud, transferring the music to the diegesis. Aída has succeeded in drawing Fausta’s most
precious song out of her. This transference, shown in Figure 2.13, symbolizes
the painful passage that Fausta resisted, from spiritual song to material property.
Transferences in Inception reveal hidden narrative processes at work, helping the audi-
ence navigate their way through a labyrinth of dream stages. In the film, characters enter
and leave other characters’ dreams as part of a plan to steal information. The seemingly non-
diegetic music heard during characters’ dreams directly relates to the diegetic music heard
when characters awaken. According to the film’s narrative, “In a dream, the mind functions
more quickly; therefore, time seems to feel more slowly {sic}.” Hence, music in the waking
world enters a character’s dream, albeit transformed relative to the character’s temporal
perception within their dream stage. On a crowded Parisian street, Ariadne and Cobb sit
at a café. Ariadne, unsuspecting, looks at the passersby. Cobb suggests, “Our dreams feel
real while we’re in them. It’s only when we wake up we realize things were strange,” and
asks, “How did we end up here?” Ariadne looks around, confused, unable to remember,
and begins to realize she is dreaming. As the cityscape and the dream itself disintegrate,
the soundtrack introduces a faint rumble that quickly grows into non-diegetic, elongated,
low-sounding ‘BRAAAM . . . BRAAAM’. As her dream collapses, she wakes up in the workshop,
“Non, Je Ne Regrette Rien” playing diegetically in the background. A sensitive listener
will retrospectively recognize that the initial non-diegetic music contains a version of the
song slowed four-fold; this is particularly noticeable because of the corresponding changes
in timbre and register—the song’s original lively accompaniment in the trumpet and viola
emerges as elongated ‘BRAAAMs’ in a low brass-like sounding instrument within her dream.
Ariadne’s first dream thus functions as a key moment in the film—via this awakening transference
from what seems to be non-diegetic music to diegetic music, shown in Figure 2.14,
we learn (perhaps subliminally) to find our way through the protagonists’ dreams.16
FIGURE 2.15 The Truman Show. Transference revealing a fabricated reality. [01:09:00]
A transference from non-diegetic music to diegetic music frequently denotes a character’s
power and control over the narrative. In The Truman Show, Christof, the manipulative television
producer and creator of the show, comes across as a God-like, omnipresent agent,
continually observing and controlling Truman’s entire existence. In the scene that opens
this chapter, Philip Glass—composer for both the film and the show—appears at the piano,
triggering a transference from non-diegetic to diegetic music. This enacted transference
between sonic containers, with the Glass cameo, opens a gateway to transcend or even
escape the storyworld—it establishes a parallel between the calculated control exercised by
Christof in directing the show and the control exercised by all film directors in constructing
an immersive yet fabricated reality.
The Father reveals Anthony’s puzzlement as he copes with advanced dementia. All cin-
ematic elements come together to force viewers to experience Anthony’s subjective perspec-
tive, embodying his confusion by transgressing aural and (chrono)logical boundaries—actors
are not bound to one character and characters are played by multiple actors, dialogues
and scenes repeat within an unbroken continuum, décor and lighting morph within single
scenes and across scenes. The film opens with Anne rushing through one of London’s
affluent neighborhoods. Henry Purcell’s opera King Arthur, or the British Worthy, imposes
its pace onto the soundtrack, with credit lines and visual cuts metronomically marking the
pulse. Impatient, she fetches the keys, opens the door, and calls out, “Dad?” Her anxiety
mounts as she checks every room. Anthony is in the studio, peacefully sitting in his armchair,
wearing headphones. Surprised to see her, he takes off his headphones—the music stops
as if it had been coming through his headphones, transferring the (initially) non-diegetic
musical sounds to the diegesis. Via this subtle sound-design manipulation, the soundtrack
immerses us within Anthony’s unstable, unreliable reality. As the film further unfolds, subse-
quent sound-design manipulations will represent his storyworld, while reminding us of the
inevitable impermanence of our own reality.
Transferences between sonic elements belonging to different forms of diegesis and dif-
ferent layers of a soundtrack (i.e., music, voice, or sound effects) are drastic but effective. In
the musical Dancer in the Dark, this sonic manipulation sheds light on the narrative. Selma
heads home. She has a degenerative disease that makes her blind, so she uses the railroad
tracks as a guiding path. Jeff, a coworker who is romantically interested in Selma, catches up
to her and offers her a lift. She smiles and continues walking. Selma hears a freight train in
the distance and urges Jeff to stay off the tracks. As the train goes by, Jeff notices that Selma
is unsure where to look for him—she cannot hear him in the noise. He asks, “You can’t
see, can you?”, to which she counters, “What is there to see?” Her rhetorical question triggers
a transference in the sound design, from the diegetic sound effects to the diegetic and
non-diegetic music containers. Selma begins to sing “I Have Seen It All.” Here, the music
coexists in both the diegetic and non-diegetic dimensions, illustrating an overlap of sonic
containers characteristic of musicals—the singing and the percussive accompaniment of the
train sounds are part of the elements seen on screen, but the orchestral accompaniment is
not. This wandering transference, shown in Figure 2.17, is vital to our understanding of the
narrative: Selma, victimized by her abusive friends and coworkers because of her blindness,
is aware of every sound surrounding her, which, in her mind, becomes music. This dreamlike
drifting allows Selma to add a veneer of beautiful sonic fantasy onto the tragic reality
of her life.17
FIGURE 2.17 Dancer in the Dark. Transference during Selma’s dreamlike drifting. [00:55:10]
FIGURE 2.18 The Conversation. Transference suggesting a sound continues in Harry’s mind. [01:35:50]
In another scene from The Conversation, a deafening transference intrudes on the protag-
onist’s (and our) senses. Harry checks in at a motel, in the room next to where Ann and her
coworker are staying. He must listen to their conversations to gain some clarity of their cir-
cumstances. Pressing his ear against the wall, he only hears the faintest suggestion of voices
coming from their room. In a state of auditory confusion, Harry steps out to the balcony and
suddenly sees a bloodied hand on a semi-transparent glass and hears Ann’s scream. In the
soundtrack, her scream (diegetic voice container) instantaneously becomes a synthesized
sound that resembles a scream (non-diegetic music container). Harry tries to escape reality
by turning on the TV and covering his ears with both hands, but the scream (or rather, the
impression of a scream) continues to resonate in his thoughts as the non-diegetic sound ele-
ment persists. As a result, while the film’s plot portrays the transgression of private (sonic)
space, the soundtrack maps this transgression onto the cognitive boundaries that define the
music, the dialogue, and even the diegesis.
Coda
A soundtrack is filled with sounds. During a film, we draw on the CONTAINER schema to
organize these sounds into well-compartmentalized streams of information, each revealing
different psycho-cognitive dimensions. While most films use easily identifiable, genre-based
conventions with clear-cut distinctions between sonic containers, filmmakers occasionally
disrupt these norms to supply a metaphorical subtext. In this chapter, we took a deeper look
into these sonic containers and their interactions, and constructed a model for broadening
our understanding of film music conventions and music’s potential as a narrative resource.
Breaking free from standard sound design formulas, un-contained soundtracks open win-
dows of interpretation that offer audiences a glimpse into previously hidden layers of narra-
tive meaning. However, when stepping outside of normative sound design conventions, we
recognize a newly emerging set of conventions used to break those rules: the techniques of
overlap, replacement, and transference. For filmmakers, these are effective techniques for a
broad range of purposes, from infusing comedic undertones to conveying poignant develop-
ments in the story. For analysts, these techniques provide us with a sophisticated framework
to interpret a soundtrack’s narrative agency.
In The Truman Show, the soundtrack complicates our perception of a fictional narrative
by disrupting the storyworld’s sonic boundaries with a sound design transference. This struc-
tural transgression not only reveals one character’s power and control over that narrative, but
it also reveals the title character’s inability to escape the superimposed narrative, inducing
us, the viewers, to also experience a kind of sonic abuse, one that metaphorically resonates
with the emotional and psychological abuse the film condemns.
Notes
1. The terms ‘diegetic’ and ‘non-diegetic’ denote sound sources that are ‘on’ or ‘off’ the screen, re-
spectively. Although numerous scholars discuss the subtleties of diegesis, here I leave the notion
of diegesis open, thus avoiding the tendency to subdivide containers ad infinitum.
2. Superimposing the CONTAINER schema’s spatial logic onto our conceptualization of a film’s
soundtrack allows us to recognize the metaphorical potential of dynamic interactions among
sonic containers. The sonic containers within this framework are fluid constructs, defined by the
contained sounds in relation to each other and to the narrative diegesis. For an in-depth discussion
of the notion of image schemas, see Appendix II in this volume.
3. To establish a clear distinction between these sonic containers, sound designers monitor and ma-
nipulate the sound’s characteristics such as reverb, equalization, loudness, and compression lev-
els. For example, diegetic music features a reverb that matches the physical space where the scene
is taking place; if heard from a distance, sounds feature stronger low frequencies; if the sound
source moves relative to the camera, the sound will pan or change loudness; and when portray-
ing live music performances, sound designers generally include ‘mistakes’ or other idiosyncrasies
characteristic of live settings.
4. Heldt (2013) explores different possible levels of narration stemming from the distinction between
diegetic and non-diegetic music.
5. Tan et al. (2008) examine the extent to which pairing a scene with the same music either in its
diegetic or non-diegetic forms has an effect on the viewers’ perception of the music and on their
interpretations of the film’s narrative.
6. While these interactions may potentially involve any sonic container, this chapter focuses
exclusively on interactions between the music (diegetic or non-diegetic) and other sonic
containers.
7. Omitting all sound elements but the non-diegetic music is not an instance of replacement. This
sound design strategy has become ubiquitous in film, particularly as a means to reveal an emo-
tionally charged angle in the narrative, and therefore is not included in this chapter. In Unfaithful,
for instance, a woman becomes enthralled in a passionate extramarital affair; as she leaves the
apartment of a handsome stranger and takes a taxi, all sounds but the non-diegetic music are
omitted. The soundtrack is thus bestowed with a powerful agency, taking complete control of the
narrative; in particular, the absence of diegetic sounds removes all traces of an objective perspec-
tive, forcing the listener into a subjective one.
8. Although by the 1930s sound in film was commonplace, the inclusion of multiple tracks (dialogue,
music, sound effects) and the synchronization of sound to picture remained challenging. In fact,
throughout the history of cinema, sound design manipulations that have pushed the boundaries of
normative conventions have been shaped by aesthetic intentions but have remained constrained
by the available technology.
9. Like all image schemas, the CONTAINER schema is a dynamic construct, one that changes over
time and across communities of viewers, one that is shaped by past as well as new experiences.
While its structure is relatively stable, its contents are largely defined by cultural practices. For
instance, a 1920s audience’s perception of a film’s soundtrack through the lens of the CONTAINER
schema would result in significantly different contents of sonic containers than a contemporary
audience’s perception would. Furthermore, current sound design practices are fashioning sonic containers with
increasingly permeable boundaries, including sounds of ambiguous diegesis and sounds undefined
as belonging to the music, voice, or sound effects. The present conceptualization of a soundtrack
framed within the CONTAINER schema may therefore deviate from future audiences’ perception of
a soundtrack.
10. The score for the film was composed by Charlie Chaplin himself.
11. This is the only reference to the actual voice of characters. Considering Chaplin’s aesthetic rejec-
tion of talkies during these years, he possibly embedded this replacement to poke fun at talking
films.
12. This replacement illustrates that a sound’s source, rather than its characteristics, defines the con-
tainer to which a sound belongs.
13. This example may be understood as an instance of “duplex perception” (Bregman, 1990), where a
sound potentially corresponds to two sources at once, like rain and applause, or crackling of fire
or paper.
14. For a similar effect, see Hitchcock’s The 39 Steps, where the sound of a train’s whistle replaces a
woman’s scream.
15. Unlike overlap and replacement, transference entails a syntagmatic (rather than paradigmatic)
process.
16. The scene’s sonic environment is foreshadowed in the primary musical gesture of the Main Title
soundtrack, becoming a leitmotif to signal the characters’ dream stage. Doll (2018) explores how
filmmakers can shape our perception of a scene by obscuring the distinction between diegetic and
non-diegetic music, particularly within Inception.
17. This shift to a world wherein everything is controlled by the sounds that overwhelm her senses is
visually emphasized by subtle changes in the coloration and dance-like movements of the charac-
ters on-screen.
3
LINEARITY Schema
In The Stepford Wives (2004), the music replicates and reveals the protagonist’s reactions.
Joanna is a wildly successful, highly paid reality television producer, but her dicey new
season presentation goes awry. Immediately after the presentation, a TV network executive
summons Joanna to discuss what went wrong. Leaning over, delicately yet assertively, the
executive says, “We have shareholders. We can’t let you sink the network. But we wish you only the
best.” Joanna begins to process the meaning behind those words while exchanging looks
with her supervisor. As her confident smile slowly turns to mystified terror, a female choir
in the soundtrack steadily builds from a wordless murmur to a chilling shriek. The music in
the scene duplicates Joanna’s emotional reaction and carries us along with her rising psy-
chological turmoil.
Film music moves us. It goes up or down; we feel we go up or down. It gets faster or slower;
we feel we get faster or slower. It changes; we change. By traversing a one-dimensional
continuum, such as pitch, tempo, or loudness, music engages the LINEARITY schema. Engaging
this schema during a film prompts us to construct metaphorical mappings between the
music and other cinematic domains, like the visuals or the characters’ affective states, elicit-
ing psychophysiological responses that inform our interpretations. In this chapter, we move
along a continuum between structural and semantic mappings grounded in the LINEARITY
schema.1 While in structural mappings the LINEARITY schema emerges by observing corre-
spondences between the music and the visuals, in semantic mappings the schema emerges
from correspondences between the music and the characters’ affective states or the film’s
narrative.2
Structural Mappings
Film music often elicits the [PITCH FREQUENCY] IS [MOTION IN VERTICAL SPACE] conceptual
metaphor, where upward motion in the visuals correlates with increasing pitch frequencies
in the music, and downward motion with decreasing pitch frequencies.3 In a scene from
a Tom and Jerry cartoon, for example, the music maps Jerry’s fall from a toy airplane with
a descending chromatic figure (and a momentary pause as a brassiere-parachute briefly
opens). This mapping of physical movement onto the musical space reinforces the cartoon’s
extensive use of ‘Mickey-Mousing’, a technique denoting the synchronization of visual and
aural information.
FIGURE 3.1 Tom and Jerry Greatest Chases, “Yankee Doodle Mouse”. Jerry’s fall from a toy
airplane. [00:05:20]
DOI: 10.4324/9780429504457-4
Many composers adopted the Mickey-Mousing technique, initially used for cartoons, to
underscore non-animated films. The music for King Kong (1933), and much of Max Steiner’s
output, is a prime example.4 Kong has captured Ann and carried her off to its lair, a cave high
above a subterranean pool. Jack is on a mission to rescue her. He approaches the cave,
hiding, flattening himself into the crevices of the rocks. While Kong is busy fighting a giant
meat-eating bird, Jack seizes on the opportunity to free Ann. Kong roars angrily and turns to
them. As they take the only escape route, rappelling down the inner edge of the lair using a
sturdy vine, Ann clutching tightly onto Jack, a descending musical gesture in the soundtrack
mimics their downward motion. As Kong grabs the vine and begins to pull them back up,
the music turns to ascending musical gestures, mapping their upward motion. With little
recourse, Jack releases the dangling vine, falling alongside Ann to the subterranean pool.
The music once more maps their fall by returning to descending musical gestures.
In a scene from The Hindenburg, the music suggests the movement of characters or
objects not shown on the screen.5 The passengers are boarding the zeppelin, gazing around,
entering a world of luxury and refinement. They fan out, some to their cabins, others to
go exploring. At his cabin, looking through the window, Boreth regards his wife on the
ground—they exchange a long, loving look. At his own cabin, Kessler also gazes at his wife
on the ground—she is at the center of the crowd, waving halfheartedly. Both men remain at
their windows and begin to see the crowd, and the world, slowly receding. At this moment
in the soundtrack, the music introduces intermittently rising figures that culminate in the
Hindenburg theme as the ship flies across the night sky. During this portion of the scene,
however, we do not see a character or object moving in the vertical plane; instead, the
camera ascends, creating a subjective point of view akin to a passenger’s perspective in
the ascending ship. By drawing on the [PITCH FREQUENCY] IS [MOTION IN VERTICAL SPACE]
conceptual metaphor, the music helps seal a cognitive gap: an ingenious compositional
device, rather than intricate visual effects, portrays the rising of the ship.
The main title sequence of Interview with the Vampire offers a haunting, out-of-body per-
spective. High above the street level, we circle one of San Francisco’s Golden Gate Bridge
towers. A hollow open fifth on the strings’ high register rings in the soundtrack as Christmas
lights on the Embarcadero Center illuminate the moonless night. As we fly over the Ferry
Building and gently descend on the busy Market Street, the music supplies echoes of the
Catholic hymn “Libera Me”, its descending lines first materializing in the children’s voices’
high tessitura and gradually slinking into the strings’ low register. The underscoring in this
opening scene prompts us to construct evocative interpretations based on kinetic imagery.
While at a surface level the overarching downward musical contour maps the perspective of
an unseen vampire descending onto San Francisco, at a deeper level and within the context
of the film, the music’s contour arguably suggests a theological and spiritual descent.
The LINEARITY schema is not limited to mappings of vertical movement onto the musical
space. Often, pitch frequency and loudness metaphorically indicate the size of objects or
characters—that is, loud musical figures in the low-frequency range represent large objects
or characters, and soft musical figures in the high-frequency range represent small objects
and characters.
FIGURE 3.4 Interview with the Vampire. Descent during the main title sequence.
In various crucial scenes, Jurassic Park exploits LINEARITY-based correlations for character
construction. Early in the film, we visit the park’s nursery: long white tables covered with
dinosaur eggs bathed in infrared light. One egg moves, and the shell begins to crack. In the
soundtrack, tender musical figures in the high register, performed softly by a gentle choir
and celesta, underscore this wondrous moment, the hatching of a tiny baby raptor. Later in
the film, the now-grown raptors become ferocious predators as they break free and go on
the hunt. Tim and Lex, the two kids in the film, hide in the kitchen. The raptors get closer.
One raptor stands in the doorway, drawing itself up to its full height; the other stomps into
the kitchen, its tail knocking pots and pans from the counter. In the soundtrack, menacing
musical figures in the low register performed loudly by brass and timpani underscore this
frightening moment, the realization that size does matter. Taken together, the music in these
two scenes effectively blends two conceptual metaphors based on the LINEARITY schema:
[PITCH FREQUENCY] IS [SIZE] and [LOUDNESS] IS [SIZE].6
Mappings of dynamic parameters are not always consistent with mappings of static
parameters. For example, while musical figures in the high register represent small objects
and those in the low register represent large objects, music representing dynamic trans-
formations (growing or shrinking) exhibits the opposite tendencies. As the next examples
suggest, different principles govern mappings of dynamic parameters—when characters or
objects change in size, the music for depicting their growth ascends, and the music for
depicting their shrinkage descends.7
In Alice in Wonderland, shrinking in size allows Alice to figure out the meaning of grow-
ing up. Alice slithers through the rabbit hole and lands in a round hall with many doors and
a three-legged glass table at the center, a small key sitting on top. She grabs the key and
opens a small door, about two feet high. There is a lovely garden with a fountain on the
other side. She tries to fit through the door, but her shoulders get stuck, so she pulls back.
Suddenly, a small bottle with the label “DRINK ME” appears on the glass table. She sniffs
the contents and recoils, but shrugs and takes a sip. As Alice begins to shrink, disappearing
within her now-oversized clothes, the music introduces downward string glissandi. She
becomes a few inches tall, the right size to pass through the little door and into the magical
garden.
FIGURE 3.6 Alice in Wonderland. Alice shrinks after drinking a potion. [00:15:30]
In Charlie and the Chocolate Factory, we get a taste of the opposite transformation.
Violet is determined to excel in all competitive tasks. As the world record holder in gum
chewing, she snatches an experimental stick of gum that delivers all daily meals, includ-
ing tomato soup, roast beef, and blueberry pie. She munches and grinds relentlessly.
Uncomfortably concerned, Wonka suggests, “Spit it out!”, but Violet’s competitive mother
flouts Wonka’s advice and cheers, “Keep chewing, kiddo! My little girl’s gonna be the
first person in the world to have a chewing gum meal!” She begins to taste the different
meals. “Tomato soup, I can feel it running down my throat! It’s changing! Roast beef with
baked potato! Crispy skin and butter!” As she gets to the dessert, blueberry pie, Violet
turns violet and begins to inflate into a giant blueberry. In the soundtrack, musical figures
become higher and higher. Wonka, fascinated, steps back and observes Violet’s bizarre
transformation.
Such paradoxical relationships between dynamic and static parameters are not pervasive;
in fact, the reversal in the pitch-size case is potentially a unique phenomenon. For instance,
dynamic and static mappings via the [TEMPO] IS [SPEED OF PHYSICAL MOVEMENT] concep-
tual metaphor do not exhibit such reversal—that is, fast speeds consistently correspond to
fast tempi and slow speeds to slow tempi, regardless of whether these mappings depict static
or dynamic parameters.8
FIGURE 3.7 Charlie and the Chocolate Factory. Violet turns into a giant blueberry. [01:05:15]
A scene in The Incredibles nicely illustrates the mapping of (dynamic) physical movement
onto the musical space. A bomb on the elevated train tracks explodes and blows away part
of the tracks. As a train approaches, heading straight for the chasm, the non-diegetic action
music immerses us in the thrill. Mr. Incredible runs toward the oncoming train, hoping to
intercept it before it derails. He plants himself on the tracks and braces for full impact, knowing
the hit will hurt badly. As he miraculously slows the train to a stop, the music’s tempo
also slows down to a grinding halt.
FIGURE 3.8 The Incredibles. Mr. Incredible slows the train to a halt. [00:08:00]
Crouching Tiger, Hidden Dragon offers an elegiac spectacle of the mystical facets of
martial arts. Two female warriors, Jen and Yu, challenge each other at the center of an inte-
rior courtyard. Swords and weapons cover the walls. They begin their fight. Yu scoops up
every weapon available, but none is any match for Jen’s Green Destiny sword. The battle
intensifies with every new weapon Yu draws on, climaxing as she holds a broken blade at
Jen’s neck. Throughout the scene, the increasing speed of drumming (from about 112 to 186
BPM) and the resulting temporal compression of rhythmic figures map the relentless visuals
via the [TEMPO] IS [SPEED OF PHYSICAL MOVEMENT] conceptual metaphor, fiercely thrusting
the scene toward the battle’s climax.
In the last example, the music, along with the camera movement, editing, and other
visual parameters, parallels the increasing aggressiveness of the fighters’ movements. This
suggests an additional conceptual metaphor, one that extends beyond musical and
spatial-kinetic correspondences to capture the characters’ psychological states:
[PSYCHOLOGICAL TENSION] IS [TEMPO], where slow tempi correlate with calm states and
fast tempi with agitated states. Note that the mapping direction reverses here, no longer
originating in the (concrete) visual domain and shaping the
(abstract) musical domain; instead, the music takes on the more concrete role, now sug-
gesting developments in the narrative or inner changes in the characters’ psyches. This
shift, from music acting as a ‘target’ domain to music acting as a ‘source’ domain, argu-
ably defines the boundary of the Mickey-Mousing technique and takes us into film music’s
semantic sphere.
FIGURE 3.9 Crouching Tiger, Hidden Dragon. Jen and Yu challenge each other. [01:34:10]
FIGURE 3.10 Charlie and the Chocolate Factory. Charlie opens his birthday present. [00:21:00]
Semantic Mappings
By drawing on conceptual metaphors that embed semantic mappings, film music becomes
an unseen narrator, revealing the characters’ moods or psychological states and providing
glimpses into the narrative’s otherwise hidden facets. PSYCHOLOGICAL TENSION is a fre-
quent target domain in such conceptual metaphors, possibly because it manifests itself more
clearly within the narrative than other, more elusive, psychological states.
In another scene from Charlie and the Chocolate Factory, Charlie is about to open his birth-
day present. It is a Wonka bar. Charlie’s parents and grandparents gather around him, hopeful he
will find one of the precious golden tickets to visit Wonka’s factory inside the candy. He wants to
wait to open the candy, but Grandpa Joe complains, “If you add our ages together, we’re three
hundred and eighty-one years old. We don’t wait!” Charlie is anxious in anticipation, but his
parents comfort him, “Charlie, you mustn’t be too disappointed if you don’t get one . . . What-
ever happens, you’ll still have the candy.” He slowly unwraps the Wonka Whipple-Scrumptious
Fudgemallow Delight. In the soundtrack, a crescendo in the strings maps the mounting tension
via the [PSYCHOLOGICAL TENSION] IS [LOUDNESS] conceptual metaphor.9 With a brisk move,
Charlie rips off the last of the wrapping—no golden ticket.10
Multiple musical parameters may function as a source domain to map the characters’
psychological tension, thus projecting multi-pronged conceptual metaphors. In The Verdict,
Frank sees one last shot at salvaging his law career by taking on a medical malpractice case.
He is not mentally strong enough, so he relies on Laura, an enigmatic romantic companion,
for encouragement and reassurance. Frank’s case experiences a setback. Outside the cham-
bers, he tiredly looks at Laura and admits, “We’re going to lose.”
You want me to tell you it’s your fault? It probably is . . . You’re like a kid. You’re coming
here like it’s Sunday night, and you want me to say that you’ve got a fever, so you
don’t have to go to school . . . Listen! The damned case doesn’t start until tomorrow, and
already it’s over for you . . . If you want to be a failure, then do it someplace else.
Frank hurries out of the room and shuts the door. In this tense scene, the music begins with
a single pitch in the middle register, gradually adding all twelve notes of the chromatic scale
and expanding onto a higher register, effectively mapping the increasing tension unfolding
from the conversation via two conceptual metaphors: [PSYCHOLOGICAL TENSION] IS
[DISSONANCE] and [PSYCHOLOGICAL TENSION] IS [PITCH FREQUENCY].
In the scene from The Stepford Wives that opens this chapter, the music illustrates a different
combination of conceptual metaphors mapping the protagonist’s psychological tension.
As Joanna realizes she is being fired, a sprawling vocal gesture in the music (performed
by a choir) portrays her short but intense psychological turmoil: the increase in loudness
maps (and heightens) a comparable rise in pitch frequency, combining the effect of the
[PSYCHOLOGICAL TENSION] IS [LOUDNESS] and the [PSYCHOLOGICAL TENSION] IS [PITCH
FREQUENCY] conceptual metaphors.11
FIGURE 3.12 The Stepford Wives. Joanna is terminated from the TV network. [00:09:40]
The potential for combining different musical parameters is vast.12 A scene from Sound-
less illustrates concomitant relationships between (at least) four musical parameters, all
functioning as source domains and reinforcing related conceptual metaphors. Victor, a
methodical hitman with a reputation for killing without making a sound, takes one last job.
He stares steadily through the scope, aiming at his target through a window in a nearby
building. Security workers notice Victor and attempt to alert the target to move away from
the window. While static visuals portray Victor’s deep state of concentration and focus as
he prepares to shoot, the underscoring draws on many linear parameters (including loud-
ness, pitch frequency, dissonance, and timbral density) to map the anxiety unfolding among
the security workers. As a result, in the thrilling moments anticipating Victor’s shooting, the
music establishes a concomitant relationship between musical parameters, reinforcing the
related [PSYCHOLOGICAL TENSION] IS [LOUDNESS], [PSYCHOLOGICAL TENSION] IS [PITCH
FREQUENCY], [PSYCHOLOGICAL TENSION] IS [DISSONANCE], and [PSYCHOLOGICAL TEN-
SION] IS [TIMBRAL DENSITY] conceptual metaphors.
Metaphorical mappings between the music and other facets of a film extend beyond
depictions of the characters’ psychological tension. By drawing on our tendencies for musi-
cal empathy, metaphorical mappings may help create an aura of collective (and ideological)
cohesiveness. In Avatar, humanity discovers the distant moon Pandora, where the indig-
enous humanoids, the Na’vi, live in harmony with nature. Jake Sully, a wounded former
Marine, takes on a mission to infiltrate and exploit the indigenous race in exchange for
regaining his mobility. As Jake falls in love with a female Na’vi, he changes allegiances
and leads the indigenous humanoids to fight for survival. At a ceremony, Jake rallies the
Na’vi. In a subdued and defeated tone, he laments, “The Sky People have sent a message
that they can take whatever they want, and no one can stop them.” Gradually, his voice fills
with passion and fury to assert, “We will send them a message”, encouraging the crowd to
defend itself. In a final note of defiance, he prompts everyone to join forces: “Fly now with
me, brothers and sisters! Fly! And we will show the Sky People that this is our land!” The
entire tribe responds to Jake’s fervent call to arms, their shouts echoing across the forest.
The music in the scene maps the speech’s progression and tone, beginning with a subdued
melody in the low register of the celli but steadily intensifying, adding instrumental forces,
constructing a coalescing sound mass that climaxes by introducing energetic percussion and
a vigorous choir. Here, the [IN-GROUP COHESION] IS [TIMBRAL DENSITY] and [AROUSAL] IS
[LOUDNESS] metaphors allow the music to reflect the progression in Jake’s speech, inviting
us to empathize with the crowd and the ideology of resistance via musical contagion and
subvocalization.13
In arguably Cabaret’s most iconic scene, a young blond boy in a rural beer garden sings
“Tomorrow Belongs to Me”, an idyllic song in the traditional German folk style about
the beauties of nature, youth, and family life. Moments into the song, the camera reveals
the young boy’s Nazi uniform and cuts to other individuals rising and joining in the singing.
The build from a soothing solo voice to a strident collective choir accompanies a progres-
sion in the characters’ facial expressions, from tender and innocent to angry and threatening.
The unfolding collective entrainment depicted in the scene, combined with contextual cues
and editing, gradually transforms an idyllic ballad into a militant Nazi anthem.14 This scene
illustrates how metaphorical mappings that rely on musical contagion, entrainment, and
subvocalization may lead individuals to identify with the collective, eliciting an initial aesthetic
response to the music that invites individuals into the underlying ideology.15
FIGURE 3.15 Cabaret. The crowd joins in singing “Tomorrow Belongs to Me”. [01:18:30]
Metaphorical mappings extend beyond events that unfold within single scenes. Via
nuanced, longer-range conceptual metaphors, the soundtrack may help unify entire films
by depicting gradual changes in the characters or narrative settings. Short Circuit is a film
about Number 5, a robot, coming to life. The music supports this narrative, delineating
a semantic opposition between the human and the mechanical. Throughout the film, the
[DEGREE OF HUMANITY] IS [TIMBRE] conceptual metaphor unfolds, where electronic timbres
denote mechanical beings and acoustic timbres denote human beings. While at the start of
the film the music features a synthetic timbre lacking natural overtones, toward the middle,
as the robot becomes human-like, the music gradually introduces acoustic timbres, imbuing
natural overtones within the musical texture, and by the end, a sixty-piece orchestra swells
through the soundtrack as Number 5 starts a new life.
In The Conversation, the music also establishes a metaphorical correlation that frames
the entire film. Harry Caul is a surveillance expert whose job is to listen in on other people’s
private lives. Throughout the film, the non-diegetic piano score captures Harry’s private
reality through changes in timbre. We thus learn to recognize Harry’s psychological state
via the [PSYCHOLOGICAL TENSION] IS [TIMBRAL DISTORTION] conceptual metaphor—ena-
bled by the LINEARITY image schema inherent in both domains—where increases in timbral
distortion represent increased levels of psychological distress. Early in the film, as Harry
calmly walks home after successfully recording the conversation of a young couple in a
park, the main theme unfolds through smooth and undistorted piano sounds. The conversation
Harry recorded hints at a murder. As he begins to worry about becoming involved in
a labyrinth of secrecy and murder, the smooth piano sounds become increasingly distorted.
Toward the film’s end, Harry realizes his apartment has been tapped: he himself has been
subject to audio surveillance. As he frantically searches for the recording device, a highly
distorted piano timbre signals his nervous tension. Moments later, as he gives up hope of
finding the recording device, the music briefly stops and returns to the undistorted piano
sound, reflecting his surrender.
FIGURE 3.17 The Conversation. Harry’s psychological journey. [00:09:30] [01:18:00] [01:51:20]
Coda
Film music makes us feel like we are moving along with the characters. By engaging
the LINEARITY schema, the music acts as one agent in a multidimensional mapping pro-
cess, prompting us to construct metaphors that highlight localized surface-level events or
deeply hidden narrative layers—but even when localized, such mappings may carry great
significance.
In The Stepford Wives, the husbands pursue a misogynistic ideal of femininity, turning
their wives into gorgeous, subservient, complacent female automatons who lack agency,
humanity, or the capacity to feel or express emotions. Joanna’s firing marks the beginning
of the transformation that others attempt to impose on her. However, the localized musical
event revealing her psychological trajectory via the LINEARITY schema, from circumspect to
enraged, allows us to recognize in her the very characteristic absent in the soulless android
she must resist becoming.
Notes
1. These mappings can be expressed via the conceptual metaphor structure, [A] IS [B], wherein both
the source [B] and the target [A] exhibit a one-dimensional continuum. Clustering examples into
a conceptual metaphor also clarifies the structure of metaphors and foregrounds the embedded
directionality: from [B] to [A]. For an in-depth discussion of the notion of image schemas and con-
ceptual metaphors, see Appendix II.
2. Nevertheless, these are not exclusive categories, and examples may be situated along a broader
linear dimension, between purely structural and purely semantic.
3. Many experimental studies on the [PITCH] IS [HEIGHT] conceptual metaphor observe a great
degree of consistency within (and beyond) Western cultures. Widmann et al. (2004) used a visual
priming paradigm and event-related potentials (ERPs) to establish the consistency of the
pitch-verticality correlation; they interpret their results from an ecological perspective, noting that
“expectations on forthcoming sounds can speed up responding to environmental changes and can, thus,
be a basis for successful adaptation” (p. 709). Much research, however, challenges the assumed
vertical orientation, with some favoring other spatial relationships such as lateral orientation (e.g.,
Rusconi et al., 2006; Wühr & Müsseler, 2002), and some highlighting the absence of both an
orientation and a one-dimensional structure. Abril (2001), for instance, explores the ability of
bilingual (Spanish and English) children to accurately recognize and label register shifts in music;
although not the object of his study, he points out that English terminology draws on the VERTICAL-
ITY schema (with terms such as ‘high’ or ‘low’), while Spanish terminology draws on two unre-
lated schemas extraneous to the VERTICALITY and LINEARITY schemas (with words such as agudo
[‘sharp’, ‘penetrating’] for high registers and grave [‘serious’, ‘severe’] for low registers). Similarly,
Zbikowski (2007) questions the (perceived) universality of such metaphorical mappings, pointing
out instances of cultural variability: “In Bali and Java pitches are conceived not as ‘high’ and ‘low’
but as ‘small’ and ‘large’. Here the conceptual metaphor is PITCH RELATIONSHIPS ARE RELATION-
SHIPS OF PHYSICAL SIZE . . . The Suyá of the Amazon basin do not have an extensive vocabulary for
describing pitch relationships. When they are described, however, it is in terms of age: pitches are
conceived not as ‘high’ and ‘low’ but as ‘young’ and ‘old’. The conceptual metaphor that guides
this mapping is PITCH RELATIONSHIPS ARE AGE RELATIONSHIPS” (pp. 67–68). Taken collectively,
these studies suggest that language and culture play “a surprisingly active role in the development
and organization of image schemas, contributing not only to cross-linguistic variation but also to
some universal similarities among image-schematic concepts” (Dewell, 2005, p. 371).
4. Friedmann (2017) investigates how the chromatic gestures in the “climbing motif” underscoring
Kong’s climbing the Empire State Building in King Kong shape the viewer’s perception of the
scene.
5. Music cognition scholars have observed a direct link between the sounds produced by physical
movement and those evoked by kinetic imagery. For instance, Clarke (2005) notes that “since
sounds in the everyday world specify (among other things) the motional characteristics of their
sources, it is inevitable that musical sounds will also specify the fictional movements and gestures
of the virtual environment which they conjure up” (p. 74).
6. Some scholars argue that the logic behind the pitch-size association is grounded in our evolution-
ary biology, stemming from “our life-long experience of correlation between an object’s size and
the pitch it would produce. In particular, pitch height is correlated across animal species with
body size, as larger species tend to produce lower-pitched sounds” (Eitan, 2013, p. 172). Similarly,
Huron et al. (2006) draw on an ethological perspective and suggest that composers often “place a
melody in a lower register in order to evoke threatening, dominant or aggressive associations” and
conversely, they may place a melody in the high register “to evoke more passive, vulnerable or
submissive associations” (p. 176). This phenomenon is further discussed in Chapter 8 in relation
to leitmotif construction.
7. Eitan (2017) suggests that such paradoxical relationships arise from a different set of embodied
experiences, ones that activate the [MORE] IS [UP] conceptual metaphor.
8. Many musical parameters, other than pitch and pitch relations, embed a one-dimensional struc-
ture but lack a spatial orientation; these parameters include tempo (slow to fast), loudness (soft to
loud), density (one layer to multiple layers), consonance (consonant to dissonant), and sometimes
timbre (undistorted to distorted). The film music metaphors explored in this volume draw on these
musical parameters, mapping their inherent LINEARITY onto spatial, kinetic, and affective domains
of human experience.
9. Via recurrent interactions with the environment, we establish a direct relationship between
distance and loudness. As a result, a sudden increase in loudness arguably triggers a visceral
response, alerting us to a potentially dangerous event such as an object rapidly approaching. In
tracing this correlation’s evolutionary origin, Granot and Eitan (2011) argue that it stems from
our ecological response to an increase in loudness characteristic of looming threats in natural
contexts.
10. In the horror and suspense genres, this technique, known as the ‘red herring’, is used to mislead
audiences.
11. Joanna contains her emotions and walks calmly toward the elevators. She steps into the eleva-
tor and the doors smoothly close. Alone in the elevator, she releases a blood-curdling scream,
one that harkens back to the non-diegetic chilling shriek, almost like enacting agency over the
soundtrack by effecting a transference from non-diegetic music to diegetic voice. This very agency
is a trademark that will set her apart from the Stepford wives.
12. Arguably, the heightened effect of concomitant relationships has an evolutionary origin. In par-
ticular, concomitant relationships in which various parameters intensify (e.g., faster, louder) are
environmentally significant for an organism and hence more salient, as these may “imply the
approach of a potentially harmful object, raising an organism’s attention and alertness” (Küssner
et al., 2014, p. 12). For instance, Eitan and Granot (2006) note that “the musical dimensions of
loudness, pitch, and tempo seem to interact via concomitant intensity levels or contours . . . a
pitch rise, a crescendo, and an accelerando are commonly considered intensifying” (p. 225).
13. Contagion, subvocalization, and entrainment are discussed in Chapter 1 as part of a broader
mechanism of musical empathy.
14. Belletto (2008) notes that the song “mask[s] a dissonance between image and content, between
surface and depth”, and “whatever beauty it may possess, the song’s real function is to consolidate
the crowd and marshal them into one uniform voice” (p. 613). In fact, this song is now pervasively
(and perversely) used within right-wing neo-Nazi circles for precisely this function.
15. Collective entrainment, as a mode of inducing a collective consciousness distanced from a sense
of the individual, manifests itself the most distinctly during religious congregations, patriotic as-
semblies, social rallies, and sport-related gatherings, which incite individuals into coordinated
affect and ideological affiliation.
4
SOURCE-PATH-GOAL & CONTAINER Schemas
In Citizen Kane, the music guides us through the protagonist’s deteriorating marriage. At the
Kane residence, Emily sits at the center of the table in the breakfast room. Charles enters, kisses
her on the forehead, and sits close. Charles avows, “You are beautiful”, and she flirts back,
“Oh, I can’t be.” In the next tableau, some years have passed. Emily, now sitting toward the end
of the table, inquires, “Charles, do you know how long you kept me waiting last night while
you went to the newspaper for ten minutes? What do you do in a newspaper in the middle of
the night?”, to which Charles retorts, “Emily, my dear, your only correspondent is the Inquirer.”
More time has passed, and Charles looks older. Emily fires a shot, “Sometimes I think I’d prefer
a rival of flesh and blood”, but Charles dismisses her by countering, “Oh, Emily, I don’t spend
that much time on the newspaper.” After numerous changes in time, Charles and Emily sit at
different ends of the breakfast table, retreating to unbearable silence. The music to the first
tableau presents a theme that lacks resolution, opening the space for subsequent tableaux,
each presenting a variation on the original theme but failing to provide a conclusive point of
repose. As a result, this ‘theme and variations’ design offers well-contained snapshots of their
relationship; the inconclusive quality of the music for each tableau, however, drives the narra-
tive forward in time and reveals their relationship’s trajectory, from love to indifference.
Film music takes us along the characters’ journeys, through indefinite narrative times and
places. Sometimes it takes us to the end of the road, and sometimes it stops just short of the
destination, leaving us hanging in suspense and eager for what comes next. In this chapter,
we embark on a journey to a metaphorical space where two schemas merge—SOURCE-
PATH-GOAL and CONTAINER—a space wherein film music reaches a syntactically nuanced
narrative potential. Before reaching our ultimate goal, however, we take a detour with two
exploratory excursions that offer us a glimpse into how these schemas work independently
to guide our responses and interpretations.
In our everyday lives, we often resort to metaphors grounded in the SOURCE-PATH-GOAL (SPG)
schema to conceptualize the abstract notion of ‘time’ in terms of the more concrete notion
of ‘space’ (“Saturday seems so far away”, “The past lies behind us”). Similarly, filmmakers
DOI: 10.4324/9780429504457-5
FIGURE 4.1 Film Stars Don’t Die in Liverpool. Passage of time. [00:12:30]
frequently draw on the SPG schema’s spatial logic to suggest the passage and direction of nar-
rative time by carefully controlling camera movement and visual effects (e.g., zooming, track-
ing, panning), equating space-points (here, there) with narrative time-points (present, future).
By framing narrative events as journeys, the cinematography evokes the [PASSAGE OF TIME]
IS [MOTION IN SPACE] conceptual metaphor.1 In this first detour, I identify a complementary
metaphor that relies on the SPG schema to reveal the passage of time, but one that draws on
the music to do so: [DIRECTION OF TIME] IS [DIRECTION OF A SOUND’S ENVELOPE].2
Film Stars Don’t Die in Liverpool draws on both the visuals and the music to guide us
through a series of flashbacks. This biographical romance drama recounts the final days of Glo-
ria Grahame and her infatuation with Peter Turner. Peter, a young and up-and-coming actor, sits
at the bedside looking after Gloria, a once-glamorous but now ailing actress. As Peter leaves
the room, a tracking shot follows him. Soon, the camera breaks free and follows his gaze
toward the end of a hallway, leaving him behind and moving swiftly toward the depth-of-field.
As the camera halts its motion, it reveals a much younger Gloria, vigorously dancing and exer-
cising. We recognize that the object of Peter’s gaze was the room where he met Gloria years
ago and that the camera movement transported us from the present to that past via the [PAS-
SAGE OF TIME] IS [MOTION IN SPACE] conceptual metaphor. This visual metaphor is supported
and further clarified by a parallel musical metaphor, the [DIRECTION OF TIME] IS [DIRECTION
OF A SOUND’S ENVELOPE] conceptual metaphor. At the beginning of the scene, the music pre-
sents piano sounds with their characteristic envelope—a percussive onset or attack (SOURCE),
an early decay followed by a more sustained fade (PATH), ultimately reaching silence upon the
release of the piano key (GOAL). As the camera moves through the hallway, metaphorically
going back in time, the envelope of piano sounds plays backward—from its release (former
GOAL) to its attack (former SOURCE), traversing a steady increase of sound (sonic PATH). This
sonic manipulation, which results in the retrograde envelope of piano sounds, suggests the
backward flow of time by establishing the [DIRECTION OF TIME] IS [DIRECTION OF A SOUND’S
ENVELOPE] conceptual metaphor. With this metaphor, the music invites us to suspend our per-
ception of ‘natural time’ and navigate our way through a fictional, constructed ‘narrative time’.
Both the visual and the musical metaphors are grounded in the SPG schema, but while the
visual metaphor draws on our concrete experiences of moving in physical space, the musical
metaphor draws on our ideation of moving through a sound’s envelope.
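The retrograde envelope described above can be sketched numerically. The following Python sketch is only an illustration, not a reconstruction of how the film's soundtrack was produced: the function names (piano_envelope, reverse_envelope), the linear attack, the exponential decay, and every parameter value are assumptions chosen for simplicity.

```python
import math


def piano_envelope(n_samples=1000, attack_fraction=0.02, decay_rate=5.0):
    """Return amplitude values (0..1) for a percussive, piano-like envelope.

    All shapes and parameter values are illustrative assumptions.
    """
    attack_len = max(1, int(n_samples * attack_fraction))
    envelope = []
    for i in range(n_samples):
        if i < attack_len:
            # Attack (SOURCE): a fast linear rise to the peak.
            envelope.append((i + 1) / attack_len)
        else:
            # Decay and fade (PATH) heading toward silence (GOAL).
            t = (i - attack_len) / (n_samples - attack_len)
            envelope.append(math.exp(-decay_rate * t))
    return envelope


def reverse_envelope(envelope):
    """Play the envelope backward: former GOAL first, former SOURCE last."""
    return envelope[::-1]


forward = piano_envelope()
backward = reverse_envelope(forward)

# Position of the loudest sample in each version.
peak_forward = forward.index(max(forward))
peak_backward = backward.index(max(backward))
```

In the forward version the loudest sample sits near the start and the sound fades out; in the reversed version the sound swells steadily and the peak arrives near the end, which is the audible cue the chapter associates with time flowing backward.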
Moulin Rouge! presents a more audibly foregrounded example. The film tells the story
of Christian, a young English novelist, who falls for a beautiful singer at the Moulin Rouge.
Christian begins writing his memoir. As he reflects on the past, his typing “I first came to Paris
one year ago” triggers a tracking shot that briskly moves from his typewriter, through a dance
floor at the Moulin Rouge, to a panoramic view of Paris. In the soundtrack, piano sounds
featuring a reversed envelope underscore this extreme zoom-out, metaphorically portraying
the backward-flowing narrative time. Once the camera settles, Christian’s non-diegetic voice
reveals, “It was 1899, the Summer of Love.” At this moment, the past becomes the present, the
piano sounds return to their natural envelope, and the narrative unfolds again forward in time.
Our understanding of musical forms (as large-scale syntactic structures) is almost exclusively
grounded in the CONTAINER schema, where similarities and contrasts in musical materials
define the formal units, or ‘containers’.3 In this second detour, we explore instances where
the CONTAINER schema impinges upon our perception of the music’s formal design and, by
extension, shapes our interpretation of a scene.
In standard music analysis, the content of such formal units is labeled with letters that
represent similarity or contrast. For example, ‘A’ denotes a single formal unit containing
relatively uniform musical materials; ‘A – B’ denotes a segmentation into two formal units,
each containing contrasting materials; and ‘A – B – A – C – A’ denotes a segmentation into five
formal units, akin to a rondo form, where the recurring outer and middle units (A) alternate
with units containing contrasting musical materials (B and C).
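The letter-labeling convention can be expressed as a small procedure. The Python sketch below is a hypothetical illustration: the function name label_form and the idea of representing each formal unit by a short descriptive string are assumptions, and deciding whether two units share "the same" material remains an analytical judgment that the code simply takes as given.

```python
def label_form(sections):
    """Assign formal letters to a sequence of sections.

    Each section is described by a string naming its musical material
    (a hypothetical stand-in for an analyst's judgment of similarity).
    Material heard for the first time gets the next unused letter;
    material that returns reuses its earlier letter.
    Assumes no more than 26 distinct materials.
    """
    letters = {}
    labels = []
    for material in sections:
        if material not in letters:
            letters[material] = chr(ord("A") + len(letters))
        labels.append(letters[material])
    return " – ".join(labels)


one_part = label_form(["sustained strings"])
binary = label_form(["hard rock", "classical piano"])
rondo = label_form(["refrain", "episode 1", "refrain", "episode 2", "refrain"])
```

Here one_part evaluates to ‘A’, binary to ‘A – B’, and rondo to ‘A – B – A – C – A’, matching the three designs described above.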
Composers resort to a one-part formal design (‘A’) to unify lengthy film sequences that
depict events unfolding at various times and places. A montage in The Shawshank Redemption
illustrates this use of musical underscoring. After 50 years in prison, Brooks walks out
on parole. He is disoriented—tears stream down his face. Riding the bus for the first time,
he fearfully clutches the seat in front of him. The buzzing city terrifies him. He gets a small,
old, grim apartment with heavy wooden beams crossing the ceiling, a rickety bed, and a
timeworn desk. He also gets a job bagging groceries at the Foodway, which quickly becomes
overwhelming. Only feeding pigeons in the park comforts him, as that brings back memories
of his time in prison. His world is shattered. Dressed in his old suit, he finishes knotting his
tie, puts his hat on, and places a letter on the old desk. He takes one last look around and
steps up onto a chair. Smiling with inner peace, he shifts his weight on the wobbly chair until
it goes out from under him, leaving his feet swinging mid-air, freely. The music accompanying
this lengthy montage features an unremitting texture of strings and gentle electronics blended
with homorhythmic, solo-piano musical phrases leading to long-sustained sonorities that
undermine our perception of a metrical structure and project an introspective, contemplative
character. By sustaining this mood throughout, avoiding significant changes in any parameter,
the music’s one-part form underscores a lengthy narrative unit depicting Brooks’s journey of
despair and hopelessness, unifying that which may otherwise appear segmented or scattered.4
In The Hours, lengthy film sequences connect various narrative threads by weaving in a
musical fabric that draws on a minimalist compositional style—continuous accompaniment
figures, repeating harmonic frameworks, uninterrupted melodic gestures. For instance, in
the montage from this film discussed in Chapter 2, the monothematic music underscores
events unfolding at different times and places—the Hogarth House in 1920s England and
a New York apartment in 2001. By sustaining a single mood through brief musical inter-
ruptions, the underscoring unifies this fractured sequence and contributes to constructing
a shared narrative in which the yearnings and fears of the characters intertwine to define a
more abstract persona, a single identity transcending time and place.5
In montages depicting simultaneous events unfolding in multiple locations, the music
frequently outlines a quasi-rondo form—the music becomes associated with a particular
location, halts or changes when the narrative shifts elsewhere, and resumes or changes back
with the return to the initial location. In The Taking of Pelham One Two Three, a four-
minute sequence juxtaposes simultaneous events unfolding in multiple locations. An active,
FIGURE 4.5 The Hours. The characters’ yearnings and fears intertwine. [01:10:00]
FIGURE 4.6 The Taking of Pelham One Two Three. Simultaneous events in multiple locations.
[00:49:40]
fast-paced theme underscores employees at the Federal Reserve, opening gray canvas money
bags, sorting bills, and collating the money into packets. The music stops abruptly as the
narrative shifts to Gracie Mansion, the New York mayor’s official residence. The mayor is not
feeling well; he is getting a shot in the rear end. Deputy Mayor Warren walks into the bedroom
to prompt the mayor to make a public statement about the ongoing hijacking of a subway
train and insists, “The mayor of the City of New York, trailing by twenty-two points in all the
polls, [should care] enough about seventeen citizens in jeopardy to make a personal appear-
ance in their behalf!” The fast-paced music resumes, in medias res, as we return to the action
at the Federal Reserve, with tellers sorting bills by denomination and using machines to pack
them faster than the eye can see. Then, nearly all musical layers fade out as we trail through
the Subway Command Center, where Transit Police Lieutenant Garber awaits the delivery of
the ransom. He exchanges strong words with other officials and rhetorically asks, “How long
does it take to get that money together? We’ll never make it. The passengers are dead ducks.”
Once again, the fast-paced music returns as we see Federal Reserve clerks assembling bills,
selecting ten bundles of fifties and five bundles of hundreds, fastening packets with rubber
bands. Seconds later, tensely static music sets in as the narrative shifts to the subway car,
where hijackers are arguing about their next move. Blue mentions, “They’ve requested more
time . . . I didn’t give it to them.” Green asks, “Suppose they can’t make it?”, to which Blue
swears, “Then we do what we said we’d do. There’s no other way.” One last time, the fast-
paced music reappears as the narrative returns to the Federal Reserve, where the money is
now neatly piled together and loaded into canvas bags. Finally, two guards take the bags
and hurry down a corridor toward a gate leading to the security elevators, ultimately placing
the bags in a police truck. The ransom money is on its way. Throughout this lengthy scene,
the music’s formal design alternates units containing contrasting musical materials, map-
ping the alternation of environments and events occurring simultaneously in the storyworld,
yet coming together to form a unified quasi-rondo, ‘A – (silence) – A – B – (silence) – A –
C – A’. Such formal design with interruptions and interpolations is a quintessential composi-
tional strategy to suggest the simultaneous unfolding of various events at different locations.
While scenes featuring one-part or rondo forms exploit the music’s unifying potential,
scenes featuring a well-defined two-part (binary) form often set up a musical contrast that
resonates with the broad themes developed in the film. For example, in an early scene from
Vier Minuten [Four Minutes], the music illustrates an ‘A – B’ formal design that foreshadows
plot developments. Two rugged workmen and an older woman transport a grand piano on
a beat-up pickup truck. With its characteristic fast-paced tempo, rough electric timbres, and
hectic guitar riffs, the initial hard rock music brings us into the space of the men driving. The
older woman, however, appears irritated. Without hesitation, she reaches over and changes
the radio dial. A delicate classical piece brings us into her space with its gentle piano tim-
bre, slow unfolding melodies, and refined acoustic balance. Here, the music’s ‘A – B’ formal
design (hard rock-to-classical) sets up a dialectic that will play out during the film.6
The Fifth Element traverses multiple binaries—good and evil, light and dark, natural and
supernatural, human and non-human. The soundtrack to one of its most iconic scenes sup-
ports and further expands on these binaries. Cued by the line “It’s showtime” in the dialogue,
the music links the simultaneous (yet spatially disconnected) events and furthers the semantic
dichotomy of human and non-human embodied by both intergalactic quasi-human singer
Diva Plavalaguna and cyborg Leeloo. The curtain rises. Diva stands in the center of the stage,
a star-filled window behind her. She begins with an almost divine performance of “Il Dolce
Suono” [“The Sweet Sound”], a vocally challenging aria from Gaetano Donizetti’s opera
Lucia di Lammermoor.7 The piece displays Diva’s strong command of a human vocal range,
which she traverses with conjunct motion (i.e., using nearby tones) and arch-shaped melodic
contours.8 Meanwhile, still held hostage, Leeloo is bound by her human affordances, unable
to free herself and fight back an army of Mangalores. As Donizetti’s aria concludes, the
sonic environment changes radically, setting off a contrasting section. Here, Eric Serra’s “Diva
Dance” demands a departure from the human vocal affordances—the range of the vocal line
now more than doubles that of the preceding section and features angular melodic contours
and extreme shifts in register. As Diva engages with supernatural agility in an uncanny sonic
spectacle denoting alien dexterity and power, Leeloo miraculously regains her superhuman
powers to fight the Mangalores, engaging in a physical spectacle of martial skills.9 As Diva
concludes her performance, Leeloo takes down the last Mangalore. To a burst of applause,
both Diva and Leeloo take a bow. Through this mesmerizing and thrilling scene, we recog-
nize the human qualities of the ‘A’ section and the extra-human qualities of the ‘B’ section
via subvocalization. In turn, our embodiment of the music’s binary formal design guides us
through the film’s own conception of a human and non-human dichotomy.
FIGURE 4.8 The Fifth Element. Diva’s and Leeloo’s performances. [01:23:40]
While so far we have read scenes through the lens of a single schema, we now turn to
moments in which multiple schemas are at play, moments in which the music activates
complex conceptual structures that only a combination of schemas can capture.10 Here,
we explore the melodic, harmonic, and temporal traits of musical cadences, which elicit
in listeners strong responses of arrival, stability, and closure, evoking frameworks of space
grounded in both the SPG and CONTAINER schemas.11
Embedded within a film score, musical cadences are powerful rhetorical devices that
shape our perception and interpretation of a film’s narrative syntax.12 Because different
cadential gestures elicit relative degrees of arrival and closure within the music, when
embedded within a film, these musical gestures prompt us to construct interpretations and
judgments of relative degrees of arrival and closure within the narrative.13 While cadences
ending on the tonic pitch, on the tonic chord, or on a downbeat provide the most definite sense of
stability, becoming a ‘location’ that provides finality and ‘closure’, cadences ending with
other pitches, chords, or on weaker metrical positions confer lesser degrees of finality and
closure. As a result, the SPG and CONTAINER schemas’ spatial logics enable metaphorical
mappings between musical cadences and the film’s narrative, here gathered under the [NAR-
RATIVE CLOSURE] IS [MUSICAL CLOSURE] conceptual metaphor.
Two examples from 15 Minutes illustrate the use of melodic and temporal cadential
gestures to convey the [NARRATIVE CLOSURE] IS [MUSICAL CLOSURE] conceptual metaphor.
In both examples, the music presents two sub-segments featuring nearly identical musical
materials yet each ending with a different cadential gesture—the first inconclusive, the sec-
ond conclusive. Whereas in the first example cadences unfolding within the music’s tempo-
ral domain play a vital role in our judgment of arrival and narrative closure, in the second
example, cadences unfolding within the pitch domain illustrate the music’s power to shape
the narrative arc of a film sequence.
In an early scene, Eddie and Jordy engage in a relentless foot chase in a busy Manhattan
neighborhood to catch two criminals, Emil and Oleg. In the music, the thrilling drumming
on ethnic instruments closely follows the action. Eddie tries to stay in the lead, but he runs
out of breath and the criminals are getting away, so he must attempt to take a shot. He
crouches, takes aim at Emil, but suddenly lowers his gun, forfeiting his chance to shoot. The
drumming vanishes, swiftly fading away with a rhythmic gesture that thwarts the downbeat’s
FIGURE 4.9 15 Minutes. Street pursuit. (Music by Anthony Marinelli & J. Peter Robinson.)
[00:48:15] [00:48:30]
FIGURE 4.10 15 Minutes. Oleg dies. (Music by Anthony Marinelli & J. Peter Robinson.) [01:50:00]
[01:50:35]
arrival. Only seconds after, he resumes a shooting position, closes an eye, aims, and takes
the shot. Simultaneously, the drumming resumes, culminating in the much-expected hit on
the downbeat, slamming the narrative segment to an arresting close. As a result, the inter-
play between rhythm and meter shapes our experience of the scene, bestowing the music
with different degrees of resolution, which we map onto the narrative via the [NARRATIVE
CLOSURE] IS [MUSICAL CLOSURE] conceptual metaphor.14
Toward the film’s end, Oleg, whom we know as an aspiring film director, is acciden-
tally shot while recording his own movie. The underscoring to Oleg’s final words features a
mournful violin melody. Whereas the visuals depict his death by abruptly halting all physical
motion, the music hints at a lack of closure by interrupting the violin melody at the sub-
dominant instead of the tonic. As the music suggests, Oleg is merely simulating his death to
secure a dramatic, Hollywood-like ending for his film. Upon capturing a successful melo-
dramatic shot of his own death, Oleg resumes the dialogue, and the music returns. A few
seconds later, however, Oleg truly dies. A cadential tonic signals this event, providing the
musical closure that elicits a sense of narrative closure.
While single melodic or rhythmic lines can elicit strong phenomenological responses
of arrival and closure, most musical scores accomplish this effect via chordal or harmonic
structures. A scene from The Conversation exemplifies the use of harmonic cadences to
shape our perception of a lengthy scene. Harry sits absorbed in his thoughts while traveling
on an electric bus. In the soundtrack, a bluesy piano accompaniment instills the murky
FIGURE 4.11 The Conversation. Harry visits his mistress. (Music by David Shire.) [00:20:10]
[00:27:25]
FIGURE 4.12 Amadeus. Opening and concluding scenes. (Music by Wolfgang A. Mozart.)
[00:00:15] [02:52:25]
scene with a contemplative mood. Harry steps out of the bus, crosses a lonely street, enters
a building, and stops at the base of a staircase, hesitating. The music also halts, on the sub-
dominant chord, avoiding both musical and narrative closure. Harry approaches an apart-
ment door and enters quietly. Amy, his sweet and unsophisticated mistress, rises from bed in
a faded silk Oriental robe. No music on the soundtrack. They chitchat for a while, but Harry
becomes uneasy about personal questions and decides to leave. Amy, mystified, susurrates,
“I was happy you came tonight, Harry. My toes were dancing under the covers. But I don’t
think I’m going to wait for you anymore.” Harry steps outside and boards the electric bus,
now heading home. The bluesy piano accompaniment returns, but this time leads to an
authentic cadence, eliciting in us a sense of both musical and narrative closure.
In Amadeus, instead of shaping a single scene or sequence of scenes, a set of two cadences
shapes the entire film. The film opens with a grandiose orchestral D-minor chord, sounding
before the film introduces any visual elements. The frail voice of Salieri calling, “Mozart!” res-
onates through the nighttime streets of Vienna. Then another chord, an A-dominant-seventh,
creates a half cadence, to which Salieri’s frail voice again calls in distress, “Mozart!” Through
this visual and aural contrapuntal interplay, the film’s first scene punctuates the opening
with an unstable musical gesture halting at a dominant chord. By averting tonal conclu-
sion, the orchestral music launches a grandiose narrative space, announcing the beginning
of an extraordinary story.15 By the end of the film, as Salieri concludes his narration with
an account of Mozart’s funeral, the music presents a plagal cadence in the original key, not
only bestowing a sense of closure but aligning with the characteristic religious connotations
of plagal cadences.16
August Rush presents an example similar to Amadeus. Evan, a young kid, becomes
captivated by the surrounding sounds which, in his mind, become music. He arrives in
New York searching for his parents and immediately drifts into a world controlled by city
sounds: car honks, loud machinery noises, and other street sounds turn into an expanding
musical texture, climaxing with an inconclusive dominant chord at a weak hypermetrical
position announcing the beginning of the narrative.17 In the film’s concluding scene, Evan
conducts an orchestra in New York’s Central Park. As his voiceover conveys a final message,
“Music is all around us, all you have to do is listen”, the music provides the much-expected
conclusive arrival at the original key’s tonic.
FIGURE 4.13 August Rush. Evan gets lost in the city’s cacophony. He conducts a symphony
orchestra. (Music by Mark Mancina.) [00:28:40] [01:46:40]
Finding Neverland portrays the intimate relationship between writer J. M. Barrie and the
Llewelyn Davies family, who inspired him to write his well-known play Peter Pan, or The Boy Who
Wouldn’t Grow Up. The film presents a rare example, one in which none of the music cues,
even the final one, ever arrive at a final tonic, thus sustaining a sense of perpetual won-
der. As a result, although the film addresses the themes of Peter Pan only tangentially, the
underscoring provides a direct link to these themes: never-ending childhood and everlasting
innocence.
In the montage from Citizen Kane that opens this chapter, a theme and variations formal
design underscores the protagonist’s crumbling marriage. The first tableau introduces the
happy couple with a tender theme, a seductive waltz in E♭-major that soon departs toward
unstable tonal regions, cadencing in the unusual (and inconclusive) lowered submediant
(C♭-major), opening the space for the depiction of Charles and Emily’s volatile and disin-
tegrating marriage.18 In the next tableau, the theme morphs metrically to duple meter and
becomes spirited and playful; in E-major, this variation feels somewhat offset, removed from
the original key, and although it cadences in the temporary tonic, its hypermetric premature
ending does not project a sense of closure. The next tableau presents a nervous variation
FIGURE 4.14 Finding Neverland. Three moments in the film. (Music by Jan A. P. Kaczmarek.)
[00:08:00] [00:28:20] [01:36:20]
featuring urgent woodwind tremolos that embed martial resonances; harmonic instability
further sets in, as this modulating variation first introduces the distant key area of E-minor but
hurriedly leads us to D-major. A feisty variation in the B♭-Phrygian mode, which could be
heard as prolonging E♭-minor’s (unstable) dominant, underscores the subsequent tableau; a
defiant gesture presented at the beginning of this variation repeats obsessively, never resolv-
ing, leading us without interruption to the next tableau. The music now becomes somber,
with penetrating long notes, the registral break in the theme’s melody suggesting a dislo-
cated, broken marriage; a half cadence toward the end of this variation leads us to the last
tableau, but before we get there, echoes of the Gregorian chant known as Dies Irae (“Day of
Wrath”) connote the marriage’s dire condition. In the last tableau, the E♭-major tonal area
and triple meter harken back to the original theme: a feeble off-beat E♭-pedal tone in the
harp, a fragile trace of their once-happy marriage, pulsates through this last variation and
punctuates a tenuous final cadence in E♭-major, suggesting a restored yet elusive stability.
FIGURE 4.15A Citizen Kane. Theme. Breakfast montage’s first tableau. (Music by Bernard
Herrmann.) [00:51:50]
FIGURE 4.15B Citizen Kane. Variation. Breakfast montage’s second tableau. (Music by Bernard
Herrmann.) [00:53:10]
FIGURE 4.15C Citizen Kane. Variation. Breakfast montage’s third tableau. (Music by Bernard Her-
rmann.) [00:53:25]
FIGURE 4.15D Citizen Kane. Variation. Breakfast montage’s fourth tableau. (Music by Bernard
Herrmann.) [00:53:55]
FIGURE 4.15E Citizen Kane. Variation. Breakfast montage’s fifth tableau. (Music by Bernard Her-
rmann.) [00:54:10]
FIGURE 4.15F Citizen Kane. Variation. Breakfast montage’s sixth (and final) tableau. (Music by
Bernard Herrmann.) [00:54:15]
Coda
Film music transports us. By way of cross-domain metaphors, it takes us along with the char-
acters in their journeys. Whereas musical metaphors grounded in the SPG schema shape our
experiences of a narrative’s chronology, those grounded in the CONTAINER schema shape
our experiences of a narrative’s segmentation. When combined in the music, the SPG and
the CONTAINER schemas suggest syntactic events that set the space for exponentially more
nuanced and layered narrative interpretations.
Citizen Kane spans one man’s life, from his youth to the aftermath of his death. By fore-
grounding both the SPG and CONTAINER schemas, the music of the breakfast montage
amplifies other elements of the cinematography, allowing us to perceive both continuous
time and discrete events. Through fractured timelines and temporal dislocations, enhanced
through the musical score, Citizen Kane effectively explores the impermanence of time, the
inaccuracy of memory, and the impossibility of recovering the past.
Notes
1. Coëgnarts and Kravanja (2012) offer a well-designed framework to flesh out instances of the [PAS-
SAGE OF TIME] IS [MOTION IN SPACE] conceptual metaphor within the cinematography. See also
Coëgnarts (2019); Coëgnarts and Kravanja (2015); Forceville and Jeulink (2011).
2. The concept of ‘envelope’, itself a metaphor, denotes the contour of the sound’s intensity through
time. It often features four stages: (1) attack, the onset of the sound; (2) decay, the initial quick fade
after attack; (3) sustain, the continuation of the sound; and (4) release, the termination of sound,
generally caused by ceasing sound production.
3. Whereas conceptualizing sound design categories in terms of the CONTAINER schema, as outlined
in Chapter 2, presupposes a synchronic perception of the constituents of the schema, conceptu-
alizing musical form in terms of the CONTAINER schema, outlined in this chapter, presupposes a
diachronic perception of the constituents of the schema. For an in-depth discussion of the notion
of image schemas and conceptual metaphors, see Appendix II.
4. Thomas Newman’s musical language for the scene brings to mind Aaron Copland’s open voicings,
which feature a profusion of fourths, fifths, and ninths. Although the Dorian mode is most promi-
nent, sporadic chromaticism infuses a degree of tonal and modal ambiguity. Exploring a single
musical mood to unify a lengthy scene has become a trademark of composer Thomas Newman;
this approach is also prominent in his score for American Beauty.
5. See also the metaphorical use of sound design to achieve this effect, as discussed in Chapter 2.
6. In addition, two sound design manipulations help outline the narrative arc of the film: first, the
woman’s changing of the radio dial triggers a transference in the sound design, from non-diegetic
to diegetic music; and second, as the characters reach their destination and the narrative unfolds
in spaces far removed from the truck, the classical music continues to resonate, suggesting a
reversed transference, from diegetic back to non-diegetic. These two transferences in the sound
design establish a complementary metaphor that foreshadows the film’s overall plot: the older
woman is in control; she nurtures and appeases rebellious and aggressive individuals with the
kinds of music traditionally deemed sophisticated and elegant.
7. Donizetti intended this aria to be accompanied by a glass harmonica; instead, the soundtrack fea-
tures a flute. The eerie sound of a glass harmonica would have worked against the desire to ground
this aria in a human, rather than an alien sonic environment.
8. These features are characteristic of vocal writing, grounding the music in the affordances of a hu-
man (well-trained) soprano. Additionally, as customary in operatic passages, prosody (i.e., patterns
in the text) allows for a flexible rhythmic profile in the solo voice.
9. Here, the wordless vocals create associations with the sound of a Theremin. The use of a Theremin
(and more broadly, of electronically generated timbres) has permeated sci-fi films as a convention
to represent alien beings since the 1950s. In a survey of soundtracks to sci-fi films, Schmidt (2010)
notes, “there is some suggestion that our brains physically interpret electronic sounds as in some
way profoundly artificial in relation to the sounds produced by other instruments . . . Thus, no mat-
ter how pleasing it may be to the ear, the electronic may always signify both itself and an anxiety
about authenticity, and might have always been pre-destined to be alien” (p. 36).
10. In verbal communication, schema combinations may emerge in single words (such as in the verb
‘insert’, which combines the CONTAINER and SPG schemas to specify both boundary and direc-
tion) or in verbal metaphors (such as in “Our project hit a wall”, which combines the BLOCKAGE
and SPG schemas).
11. When describing musical events related to cadences, we commonly resort to prose—for example,
“After a detour [SPG] to the dominant key area, the cadential arrival [SPG] at the tonic closes
[CONTAINER] the recapitulation.” In such metaphorical constructions, the SPG and CONTAINER schemas
address different facets of our experience. Whereas the SPG schema evokes the syntagmatic
unfolding of musical events, the CONTAINER schema evokes the effect a cadence has on shaping a
musical (or narrative) structure—i.e., cadences outline the boundaries of a container defined by
music-syntactical structures.
12. Lehman (2013) offers an in-depth exploration of the punctuative function of cadences in Holly-
wood film scores.
13. When describing film events related to narrative syntax, we commonly resort to descriptors such
as ‘non-linear story’, ‘cliffhanger’, ‘circular narrative’, ‘plot twist’, or ‘satisfying close’. Thompson
et al. (1994) conducted various experiments to examine this phenomenon in relation to the under-
scoring. Their results suggest that “the final note and harmonic accompaniment in underscoring
[at cadential points] can significantly affect viewers’ sense of closure, or finality, of a filmed event”
(p. 23) and that “meter and rhythm play an important role in affecting [the] listener’s sense of musi-
cal closure” (p. 24).
14. Additionally, a subtle sound design manipulation—the diegetic sound of gunshots replacing the
non-diegetic music—suggests that Eddie is in control of the narrative.
15. Motazedian (2016) explores the viability of long-range tonal organization in film music and in-
cludes Amadeus as a case study.
16. A plagal cadence is commonly referred to as the ‘Amen’ cadence. Additionally, it is not coinciden-
tal that the opening musical gesture and the concluding cadence are in the same key.
17. A subtle sound design manipulation in the scene functions as a metaphor for character develop-
ment: a smooth transference from diegetic sound effects to non-diegetic music helps depict Fred-
die’s engagement with music, projecting him as exceptionally sensitive to sounds and music. In
the film’s concluding scene, the soundtrack once more extends the boundaries of the CONTAINER
schema in the sound design, featuring a transference from diegetic music to non-diegetic music to
convey Freddie’s final message.
18. In retrospect, this last sonority (C♭-major) can be reinterpreted (enharmonically) as the dominant
of E-major, the key of the next variation.
5
AFFORDANCES
In Tim Burton’s Batman, the music invites us into a ritualized dance, only to trap us in a
battle of mythical proportions. Jack and his band of criminals have been set up. They duck
into a chemical supply room—a two-story refinery floor accessible through a network
of steel ladders and catwalks, filled with huge containers stamped with “Danger! Highly
Toxic”. The police are also on the move, hunting for Jack and his men. Jack’s men open fire.
The cops shoot back. Their bullets puncture ducts and pipes that spew gas and chemical
sludge. A fast-paced waltz in the soundtrack prompts us to engage in the action, distorting
the ensuing clash into an unsettling and lethal dance. Shots resonate loudly as Jack scuttles
across the elevated walkways, firing at the police, puncturing more ducts and containers,
thereby releasing more poisonous chemicals. He spots an exit at the end of a catwalk,
but a caped black shadow descends into his path. Giant wings and a yellow-and-black
insignia—Batman. His leitmotif swells in the music, assuring us of his presence. As Batman
pulls Jack off the ground, one of Jack’s men points a gun at the commissioner’s head and
shouts, “Hold it! Let him go, or I’ll do Gordon.” Batman does not move. The cops do not
move. All action stops. The music halts its pulse and introduces static, suspenseful sonori-
ties. Batman slowly releases Jack and stands back. Jack straightens his clothes and smirks,
“Nice outfit.” Having spotted his gun a few feet away on the catwalk, Jack quickly grabs it
and aims at Batman, who has mysteriously vanished. Still poised, gun in hand, Jack aims
for one cop and takes him down with a single shot. The music reintroduces fragments of
the wicked waltz. As Batman reappears from the shadows, Jack fires point-blank, but Bat-
man swings his heavy cape, and the bullet ricochets toward Jack’s own face. Jack loses
balance, stumbles to the edge of the catwalk, and in an agonizing pirouette, topples over,
only managing to grab onto the lowest rung—beneath him is a large container full of dark
green sludge. Batman leaps to save Jack, reaching out with his black-gloved hand. In the
soundtrack, the violins stretch to a seemingly out-of-reach note and struggle to hold tight.
Their eyes meet for a moment. Jack is slipping. He cannot hold on any longer and plunges
two stories down into the bubbling toxic waste. Throughout this scene, the music thrusts
us into Gotham City’s underworld and allows us to embody the actions of its odd, deviant,
freakish characters.
DOI: 10.4324/9780429504457-6
Affordances of Meter
Understood through the lens of affordances, meter is not a property isolated within the
music itself; instead, meter arises from and mediates in our embodied conceptualization of
the music. As we entrain to the music’s temporalities, we perceive its meter not as cyclically
organized beats but as a feature that allows us to engage in a broad array of interactions, a
feature that affords us dancing, marching, or relaxing to the music.2
Because our bodies are symmetrical, most movements related to physical labor unfold
in cycles of duple organization (lifting and lowering, pushing and pulling), which trans-
late into onomatopoeic musical renditions of duple meter in work songs ([1—2]–[1—2]–
[1—2] . . .).3 Entraining to these songs’ duple meter affords us the kinds of engagement
with work that help ease physical effort and ensure greater efficiency and coordination,
particularly in communal tasks.4 In films, therefore, music depicting collective work is
often rendered as stylized versions of duple meter.5
In Snow White, the song “Whistle While You Work” uses duple meter for a light-hearted,
fanciful depiction of work. While wandering in the woods, Snow White and her animal
friends sneak into a cute little house. Surprised, Snow White remarks, “Must be seven little
children. And from the look of this table, seven untidy little children.” Dust everywhere, dirty
little dishes, cobwebs in the fireplace. The bluebirds whistle a military bugle call, and Snow
White begins to sing, “Just whistle while you work and cheerfully together, we can tidy up
the place.” To the song’s duple meter, she sweeps dust back and forth, birds lift and lower
rugs, and chipmunks scrub the tiny clothes from side to side on a turtle’s shell. The back-
ground music in duple meter continues as the camera fades to a mine somewhere in the
mountains. Four of the Seven Dwarfs are working in a cavern, chipping away at the walls,
their picks clinking rhythmically to the duple meter of the music. They break into song: “We
dig, dig, dig, dig, dig, dig, dig from early morn till night. We dig, dig, dig, dig, dig, dig, dig
up everything in sight.” In both vignettes, despite the differences in the characters and their
forms of work, the music makes us co-participants of the action—while the lyrics and the
visuals describe the very phenomenon under discussion here, our entraining to the music’s
duple meter affords us the opportunity to internally simulate the kinds of physical labor
depicted on-screen.6
FIGURE 5.1 Snow White. Snow White and her animal friends tidy the place while four Dwarfs
work at a mine. [00:18:20]
FIGURE 5.2 Pirates of the Caribbean: The Curse of the Black Pearl. Pirates march underwater.
[01:48:30]
Besides aurally depicting work or physical labor, music in duple meter draws our atten-
tion to other common cyclical bodily movements of characters, such as walking or march-
ing.7 In a scene from Pirates of the Caribbean: The Curse of the Black Pearl, the moonlight
shines down into the deep blue waters. Suddenly, fish scatter as the underwater currents
unveil distant figures. In the soundtrack, a plodding duple meter march emerges. The fig-
ures appear more clearly, stomping onto the shadows of a shipwreck—these are the pirates,
slowly marching across the ocean floor, turning into skeletons when splashed by moon-
light. In this captivating scene, the music’s slow duple meter affords us the opportunity to
entrain to a spellbinding sunken march and to subliminally trudge along with the ghostly
pirates.
Triple meter ([1—2—3]–[1—2—3]–[1—2—3] . . .) presents a stark opposition to duple
meter in terms of entrainment and affordances. Because triple meter defies our bod-
ies’ symmetrical nature, it commonly characterizes music associated with activities not
related—and even directly opposed—to physical labor (such as waltzes or lullabies). Film
music often draws on triple meter to depict such activities and make us co-participants of
the on-screen action.
Finding Neverland offers glimpses of Sir James Matthew Barrie’s escapes from the trou-
bles of adulthood into the imaginative world of childhood. It is a pleasant afternoon in
Kensington Gardens, but James senses the somber mood of Sylvia, a recently widowed
mother, and her boys. To lighten the atmosphere, James grabs Porthos, his shaggy Saint
Bernard, and begins dancing. A delightful non-diegetic waltz (in triple meter) transports us
to an alternative reality—Porthos becomes a great Russian bear courteously dancing and
prancing with James within a large circus ring filled with amusing and peculiar characters.
Sylvia and the boys clap in enthusiasm, enjoying the show. The music’s triple meter affords
us a journey into J. M. Barrie’s transformative, fantastical, surreal dance, and invites us to
“believe, believe, believe!”
As suggested in the scene from Finding Neverland, the odd number of impulses in
triple meter, which does not align well with our evenly shaped bodies, carries semantic
associations that characterize ‘odd’ settings or individuals. In the scene from Batman with
which I open the chapter, Jack falls into bubbling chemical waste that radically alters his
face, giving him a clown-like appearance. In a later scene, Jack pays a visit to Grissom,
the most powerful crime lord in all of Gotham City, the one who set him up. Grissom sur-
reptitiously reaches for a gun. Standing in the shadows, Jack warns, “Don’t bother. Jack’s
dead, my friend. You can call me Joker.” He steps into the light, flings away his hat, and
reveals his hideous glory. A non-diegetic carnivalesque waltz assaults the scene as Joker’s
Mickey-Moused murderous moves become an odd and deadly dance.3 Subliminally, we
entrain to the music’s triple meter, which affords us a glimpse into Joker’s bizarre, anoma-
lous, offbeat world.
Asymmetric meters are complex temporal structures that defy organization into cyclic
groups of two or three. By not affording us the organic entrainment characteristic of duple
or triple meters, asymmetric meters present a physical challenge to both performers and
listeners.8 Quintuple meter, for instance, although not impossible to entrain to, demands
considerable effort.
FIGURE 5.6 The Taking of Pelham One Two Three. Bankers struggle to collect the money.
[00:49:40]
The main theme for the Mission: Impossible TV series (later used for the films) captures
the near-impossibility of the teams’ quests through the complexity of the music’s quintuple
meter ([1—2—3—4—5]–[1—2—3—4—5] . . .). During the TV series and the films, the
main theme enters at critical moments, in which the protagonists are about to complete
their (seemingly) impossible missions. The music’s complex rhythmic structure (a subdivided
quintuple pulse organized as 3 + 3 + 2 + 2) impinges on the film’s kinesthetic dimension—by
breaking the divide between the film’s fictional world and our bodily reality, the music in
these scenes demands a complex physiological response, one that brings us along with the
characters to the limits of our shared affordances.
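The arithmetic behind the 3 + 3 + 2 + 2 grouping can be checked with a short sketch. It assumes, for illustration only, that the subdivision unit is an eighth note and that the five main beats are quarter notes; the function name and variable names are hypothetical and not drawn from any score or source.

```python
from fractions import Fraction

def accent_onsets(groups, unit=Fraction(1, 8)):
    """Return (onsets, total) for one bar whose subdivisions are grouped as `groups`.

    `groups` counts subdivisions (assumed here to be eighth notes).
    Onsets and total length are expressed as fractions of a whole note.
    """
    onsets = []
    position = Fraction(0)
    for size in groups:
        onsets.append(position)      # each group begins with an accent
        position += size * unit
    return onsets, position

# Grouping named in the text: 3 + 3 + 2 + 2 subdivisions.
onsets, bar_length = accent_onsets([3, 3, 2, 2])

# Ten eighth notes span the same time as five quarter-note beats.
quarter_beats = bar_length / Fraction(1, 4)

# Positions of the five plain quarter-note beats, for comparison.
beat_positions = [Fraction(i, 4) for i in range(5)]
shared = [o for o in onsets if o in beat_positions]
```

Under these assumptions the bar holds exactly five quarter-note beats, and the accents fall at 0, 3/8, 3/4, and 1 whole note. Three of the four accents land on a quarter-note beat, while the second falls between beats two and three, which is one way to picture why the pattern resists a simple five-beat count.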
In a scene from The Taking of Pelham One Two Three, also discussed in Chapter 4,
city officials struggle against time to collect and count the terrorists’ million-dollar ransom.
Moments before the cue begins, Police Lt. Zachary Garber exclaims, “Your instructions
were complicated. The money has to be counted, stacked, transported uptown. It just isn’t
physically possible!” To Garber’s list of bodily activities suggesting a duple work meter, the
music’s 7/4 asymmetric meter ([1—2—3—4—5—6—7]–[1—2—3—4—5—6—7] . . .) lends
the scene an unyielding and relentless feel, affording us a visceral understanding of the
bankers’ desperate attempt to comply with the unreasonable demands.
The more complex the temporal structures, the less we can entrain to them. For most of
us, asymmetric meters with a higher number of beats, or phrases with changing meters, are
at the boundary of entrainment. This resistance to entrainment instills in us a high degree of
uncertainty about the downbeat’s arrival, eliciting an intensely unsettling physical response
uniquely suitable for the horror and suspense genres.9 The main theme from the iconic horror
film The Exorcist features 15-tuple meters—shifted and subdivided as 3½ + 3½ + 3½ + 3½ + 1,
as shown in Figure 5.7. As a result, the asymmetric meter presented in the main titles supports
and advances the aesthetics behind this film, which exploit uncertainty to elicit fear,
anxiety, and distress. By not affording us the opportunity to establish a regular pattern of
entrainment, such complex asymmetric meters prime us for the terrifying events about to
unfold and trigger the adrenaline rush we crave. Nevertheless, although these responses are
usually regarded as negative emotions, within the safe environment of a movie’s storyworld,
these become, for some, highly pleasurable experiences.
FIGURE 5.7 Main theme for The Exorcist. (Music by Mike Oldfield.)
Entrainment is a vital component in constructing our temporal reality—our physical time
impinges on our psychological time. In the absence of a metric structure of cyclical beat
patterns, we experience a stop in motor activity that evokes a postural immobility akin to
a resting state, suggesting a chronological stretching of the ‘now’, a static timelessness, a
lengthening of our experience of the present.
In The Matrix, fleeting moments of suspended action are framed on either side by intense
pulsating activity in duple meter. Trinity and the Key Maker are atop a speeding cargo truck
carrying motorcycles. Trinity looks back and sees an explosion on the horizon. Link, over
the phone, urges her, “Keep moving.” The music prods us with its relentless duple meter. She
grabs the Key Maker, “Let’s go”, and they rush to the front of the truck. She straddles the first
bike, and the Key Maker climbs on behind her. As she shoots the chain and pops the clutch,
the back tire screeches and the motorbike dashes up the final wedge of cargo ramp, leaping
into the air, soaring over the truck’s cabin, virtually suspending time into infinity. The ametric
music underscoring this moment maps the slow-motion visuals and thwarts our entrainment
cycle, affording us the opportunity to soar with Trinity, defying time and space.
Entraining to the music’s metrical structure is not just an individualized psycho-physiological
phenomenon, but one that reflects collective practices.10 We are constantly immersed in
social and cultural environments defined by bodily activities that prompt us to construct
associations between motor patterns, the musical metrical structures that afford those pat-
terns, and the socio-cultural environment where such motor patterns unfold. Collective
entrainment is one of the primary mechanisms for increasing our individual capacities, ena-
bling critical human activities that rely on social interaction, such as courtship, hunting,
and building. Film music often draws on this phenomenon by extending the implications of
meter beyond the purely physical realm, using meter as a marker to connote characteristic
psycho-social activities—for instance, duple meter for competition or disparagement, triple
meter for seduction or adulation.
300 depicts the brutal reality of Spartan soldiers as they engage in the epically fierce
Battle of Thermopylae. The non-diegetic voice of Leonidas, the Spartan King, assures his
men, “We can hold the Hot Gates. We can win.” Spartans erupt, “Haawwooo!”, which sets
off a slow-motion montage underscored with duple meter music. An armored rhinoceros
menacingly approaches the Spartans. Firmly standing in front of his brave men, Leonidas
shoots his spear at the beast, his body straight, his muscles unquivering under the heavy
armor, only the hem of his tattered crimson cape pushed gently by the wind. Finally, the
creature collapses, its horns and thick folded skin burrowing into the muddy ground, growl-
ing its last growl at the Spartan King’s feet. Throughout the entire montage, the music does
not synchronize with the slow-motion visuals. Instead, the music’s duple meter elicits and
reinforces explicit psycho-social facets of the narrative, affording us bodily engagement with
the film’s martial associations.11
In Punch Drunk Love, the music colors our interpretations of the visuals and dialogue. On
a mission to repair his relationship with Lena, Barry rushes out of the elevator, carrying his
harmonium. He careens down the hallway, first going right, then twirling left at the corner,
swaying along with the heavy instrument. The music does not synchronize with his running;
instead, it introduces a waltzing meter, suggesting a nascent romance. He arrives, puts the
instrument on the floor, and rings the bell. As Lena answers, he rambles:
Lena, I’m so sorry. I’m so sorry that I left you at the hospital. I called a phone sex line.
I called a phone sex line before I met you, and four blond brothers came after me, and
they hurt you, and I’m sorry. And then I had to leave because I wanted to make sure you
never got hurt again. And I have a lot of pudding, and in six to eight weeks it can be
redeemed. So, if you can just give me that much time, I think I can get enough mileage to
fly with you wherever you have to go if you have to travel for your work, because I don’t
want to be anywhere without you. So, can you just let me redeem the mileage?
Throughout Barry’s lengthy and scattered monologue, the characters remain at oppo-
site sides of the visual frame, their contrasting dress colors—Barry in blue, Lena in red—
intensifying the dichotomy and polarization.12 The music’s triple meter, however, softens our
perception of the confrontation and foreshadows a harmonious resolution. Lena reminds
Barry, “You left me at the hospital. You can’t do that.” Barry reiterates, “If you just give me
six to eight weeks, I can redeem the mileage, and I can go with you wherever you have to
travel.” For a long moment, they gaze at each other in understanding, smile, and ultimately
kiss. As the couple embraces and gently moves in unison, we join them in entraining to the
triple meter of the romantic non-diegetic music.
FIGURE 5.10 Punch Drunk Love. Barry rushes to Lena’s apartment. [01:24:20]
The connotative sphere of musical meter extends even further. The music’s affordances,
and the kinds of bodily engagement these connote, serve as markers of socio-cultural norms
and values. As a result, via entrainment we shape our social identities. The duple versus tri-
ple dichotomy, for example, often connotes the poor versus the wealthy, or low versus high
social status. Because the music’s affordances are integral to delineating cultural constructs
and social boundaries, entraining to the music’s meter allows us a more nuanced under-
standing of a scene.13
In An Education, David attempts to convince teenage Jenny to accept the questionable
ethical acts that allow them to afford their extravagant lifestyle of wealth, refinement, and
leisure. David, Jenny, and a couple of friends pull up in a luxurious Bristol just outside their
Bedford Square flat. As everyone else heads for a drink, Jenny flees, but David catches up
with her on the street. He explains, “It was an old map cooped up in that miserable little
cottage, and she didn’t even know what it was. We liberated it.” Nodding reluctantly, Jenny
snorts, “Liberated! That’s one word for it.” David counters:
Oh, don’t be bourgeois. I know you have fun with us. You drink everything I put in front
of you down in one, every last drop, and then you slam your glass down on the bar and
ask for more. We’re not clever like you, so we have to be clever in other ways.
In the music, ametric string textures quietly underscore Jenny’s inner struggle. She nods
reluctantly, but David continues trying to persuade her:
If you don’t like it, then I will understand, and you can go back to Twickenham and listen
to the Home Service and do your Latin homework. But these weekends, and the restau-
rants and the concerts don’t grow on trees. This is who we are, Jenny.
As David teases her with the prospects of a lavish lifestyle, the melody teases us with echoes
of a triple meter. He holds out his hand, inviting her to join them as they are.14 After a few
wavering moments, Jenny smiles and takes his hand, consenting to David’s tempting offer.
As David pulls her toward him, the musical underscoring swiftly changes to a well-defined
triple meter, and both characters begin to dance a (non-diegetic) waltz. Here, the triple
meter in the music functions as a commentary on the narrative discourse, signaling Jenny’s
rejection of a work ethic and embrace of a life of leisure only possible at a higher social
status.
FIGURE 5.11 An Education. David persuades Jenny. (Music by Paul Englishby.) [00:47:50]
Affordances of Tonality
Tonality extends beyond the hierarchical relationship of tones in a musical system. The per-
ceived stability and instability of particular tones affect our (inter)active behavior with the
music—as we embody the music’s tonality, we subliminally attune our motor systems to a
broad array of potential interactions. Therefore, understood through the lens of affordances,
tonality is not a property isolated within the music itself, but one that mediates between
our embodied conceptualization of the music and our interaction with the environment.
In this second portion of the chapter, we explore tonal frameworks that allow (or hinder)
our perception of a musical gravitational center, tonal frameworks that afford us (or not) the
opportunity to feel physically and mentally grounded.
Insights on gravitational disturbances outside of music shed light on the logic behind
the affordances of gravity in music. For instance, aeronautics experiments that simulate
weightlessness conditions characteristic of space offer enthralling insights into the impact of
gravitational forces (and their absence) on human physiology, biochemistry, cognition, and
behavior.15 Much of this research reminds us that the feeling of disorientation associated
with gravitational disturbances stems from ‘cognitive dissonance’—conflicting inter-sensory
information reaching our brains, such as our vision telling us we are stationary while our
hearing tells us we are moving.16 While many inter-sensory conflicts are possible—emerging
from contradictory kinetic, proprioceptive, vestibular, visual, aural, tactile, olfactory, or other
sensory information—the visual-vestibular conflict is particularly salient. The literature
on cognitive dissonance suggests that the vestibular system’s strong connection to
our hearing apparatus (the vestibular system is located inside the inner ear) may explain
music’s unique ability to elicit disturbances akin to the absence of a gravitational center.17
In films, the music accompanying zero-gravity or extraterrestrial scenes typically draws
on symmetrical pitch collections.18 The inherent structure of these collections, wherein all
pitches may equally serve as a tonic, contributes to equalizing each pitch’s pulling power by
dissolving their primacy to function as a (musical) gravitational center.19 By dissolving tonal-
ity, and thereby not affording us the potential to experience tonal grounding, symmetrical
constructs contribute to suspending our embodiment of physical gravity, particularly when
coupled with equally disorienting visuals.20 The ensuing cognitive dissonance—the music
Affordances 73
telling us there is no gravitational center while our proprioceptive sense tells us otherwise—
has therefore implications for our embodied responses to symmetrical pitch collections.21
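One common way to make the notion of a symmetrical pitch collection concrete is transpositional symmetry: a collection is symmetrical in this sense if shifting every pitch by some interval smaller than an octave reproduces the same collection. The sketch below takes that reading as an assumption, since the passage does not give a formal definition. It encodes pitch classes as integers 0 to 11 (C = 0); the function name and the choice of a major scale as a contrasting case are illustrative only.

```python
def symmetric_transpositions(pitch_classes):
    """Return the nonzero transposition intervals (in semitones, mod 12)
    that map the pitch-class set onto itself."""
    pcs = frozenset(pc % 12 for pc in pitch_classes)
    return [t for t in range(1, 12)
            if frozenset((pc + t) % 12 for pc in pcs) == pcs]

COLLECTIONS = {
    "tritone": [0, 6],
    "whole-tone": [0, 2, 4, 6, 8, 10],
    "octatonic": [0, 1, 3, 4, 6, 7, 9, 10],
    "chromatic": list(range(12)),
    "major scale": [0, 2, 4, 5, 7, 9, 11],  # asymmetrical comparison case
}

for name, pcs in COLLECTIONS.items():
    print(f"{name:12s} {symmetric_transpositions(pcs)}")
```

Read this way, the tritone maps onto itself at 6 semitones, the whole-tone collection at every even interval, the octatonic at 3, 6, and 9, and the chromatic collection at every interval. The major scale returns an empty list: no transposition short of the octave reproduces it, which is consistent with its having one privileged tonic.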
Peter Hyams’s 2010: The Year We Made Contact, the sequel to Stanley Kubrick’s 2001:
A Space Odyssey, chronicles a space mission to one of Jupiter’s seemingly inhospitable
moons aboard the Discovery Two spacecraft. The underscoring for scenes showing the Dis-
covery Two floating in space presents symmetrical pitch collections of various cardinalities—
number of notes—to portend the kinds of bodily engagement that escape terrestrial forces.
The first time we see the spaceship, an ominous tritone—a two-note collection that divides the
octave exactly in half—saturates the musical fabric, envelops our senses, and contributes to
disturbing our proprioceptive sense of gravitation.22 The second time, a hovering whole-tone
collection—a set of six equally spaced notes—further lifts us and affords us an opportunity
to sense the weightlessness depicted in the visuals. As the film unfolds, the underscoring for
scenes depicting (or intended to elicit in us the sensation of) zero-gravity becomes denser,
drawing on collections of higher cardinalities. For instance, well into the film, the music
presents a dodecaphonic collection (gesturally segmented into two transpositions of a hexatonic
collection) during a countdown sequence. Here, the music affords us a sense of weightlessness
and infuses a sense of suspense and uncertainty that supports the events in the narrative.
FIGURE 5.12A First appearance of the Discovery Two spacecraft. (Music by David Shire.)
FIGURE 5.12B The Discovery Two approaches Europa. (Music by David Shire.)
FIGURE 5.12C The Discovery Two about to complete its mission. (Music by David Shire.)
FIGURE 5.12D 2010: The Year We Made Contact. Different moments in the Discovery Two’s journey.
[00:22:00] [00:32:50] [01:37:35]
FIGURE 5.13 Alice in Wonderland. Alice falls through the rabbit hole. [00:15:30]
Scenes unfolding in surreal worlds—not governed by the earthly gravitational forces we rec-
ognize—are also commonly underscored with symmetrical pitch collections. In Disney’s ani-
mated version of Alice in Wonderland (1951), as Alice falls through a rabbit hole and through
the center of the Earth, the objects she encounters behave strangely, not abiding by gravity’s
physical laws. A downward scale outlining a whole-tone collection [C, D, E, F#, G#, A#] under-
scores the initial fall, and as gravity further dissolves throughout her fall, the music introduces an
effect akin to a Shepard tone, simulating a continuous descending shift.23 Finally, as Alice enters
an unknown world transformed by altered perspectives, the music again features a whole-tone
collection, which further contributes to the suspense and uncertainty in the narrative.24
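The general construction of a Shepard tone can be sketched in a few lines. This is a generic illustration of the auditory illusion, not a description of how the film’s score was produced; the base frequency, the number of octaves, the raised-cosine loudness window, and the function name are all assumptions chosen for the example.

```python
import math

def shepard_components(step, steps_per_octave=12, base_hz=27.5, octaves=8):
    """Return (frequency_hz, amplitude) pairs for one step of a Shepard scale.

    Components sit an octave apart. A raised-cosine window over log-frequency
    fades them in at the bottom and out at the top of the range, so no single
    component is heard entering or leaving.
    """
    components = []
    # Position of this step within the octave, as a fraction from 0 up to 1.
    shift = (step % steps_per_octave) / steps_per_octave
    for k in range(octaves):
        position = (k + shift) / octaves          # 0..1 across the whole range
        frequency = base_hz * 2 ** (k + shift)
        amplitude = 0.5 * (1 - math.cos(2 * math.pi * position))
        components.append((frequency, amplitude))
    return components

# A descending scale: each step lowers every component by one semitone.
descending = [shepard_components(-n) for n in range(13)]
```

Because the step is taken modulo the octave, the thirteenth entry of `descending` is identical to the first: after twelve downward steps the sound is back where it started, even though every step was heard as a descent. That circularity is what allows the effect to suggest a fall without a floor.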
As these examples begin to suggest, the affordances of symmetrical constructs extend
beyond our bodily capacities and into our psychological faculties. Because these constructs
do not afford us the opportunity to define a clear tonal center, they elicit feelings of suspense,
uncertainty, confusion, irresolution, and lack of direction.25 Therefore, in scenes taking place
on Earth—where gravity is intact—symmetrical constructs generally serve as psychological
markers of instability, affording us a window into the characters in their dream states, hal-
lucinatory episodes, unusual passages of time, or trips to the unknown.
Dream worlds are often unfettered by the laws of gravity. Even when firmly grounded on
Earth, while dreaming, we may experience proprioceptive hallucinations, such as the sensa-
tions of floating, flying, or falling. Symmetrical pitch collections are therefore exceptionally
suited for underscoring dream sequences and have been a staple of filmic dream worlds. In
the iconic dream sequence from Alfred Hitchcock’s psychoanalytic thriller Spellbound, the
music captures John Ballantyne’s physical and psychological lack of grounding. John leans
on a comfortable chair and describes his nightmares to a psychiatrist. His recollections
depict forces of gravity and motion in space:
I can’t make out just what sort of a place it was. He was leaning over the sloping roof of a
high building, the man with the beard. I yelled for him to watch out. Then he went over,
slowly, with his feet in the air. Then I saw the proprietor again, the man in the mask. He
was hiding behind a tall chimney, and he had a small wheel in his hand. I saw him drop
the wheel on the roof (emphasis added).
Despite the physical descriptions, the character is unable to identify the place, perhaps because
of the spatial and gravitational oddities the dream projects. Salvador Dalí’s vignettes for the film,
featuring free-floating eyes, slanted surfaces, falling objects, and timeless clocks, submerge us
in a surreal world governed by contradictory conditions and irrational juxtapositions. The musical
fabric, saturated with figures using symmetrical constructs—whole-tone [C, D, E, F#, G#,
A#], octatonic [A, B♭, C, D♭, D#, E, F#, G], and dodecaphonic collections—has a dual function:
at face value, it contributes to immersing us in the suspended reality of a dream sequence,
but ultimately, it is vital in eliciting a pale shade of the protagonist’s mental instability.
When music based on such pitch collections does not accompany visuals that portray
altered realities or gravitational disturbances, we intuitively draw a parallel to the character’s
psyche or to the unfolding narrative instead. Two scenes from The Simpsons TV series illus-
trate this use of music. In the episode “Them, Robot”, a factory worker enters a supply closet
FIGURE 5.15A The Simpsons, “Them, Robot”. Worker procures paperclips. [00:01:20]
FIGURE 5.15B The Simpsons, “Beware My Cheating Bart”. Homer reads a fortune cookie message.
[00:01:10]
to procure one standard-size paper clip; as he exits the closet, a green rod falls to the ground
and prevents the door from closing. In another episode, “Beware My Cheating Bart”, Homer
finishes lunch at a mall, opens a fortune cookie that reads “Eat less, live longer”, and begins
to feel dizzy. Although the events of these scenes seem inconsequential, the musical fabric—
which draws exclusively on a whole-tone collection [C, D, E, F#, G#, A#]—infuses suspense-
ful undertones, commenting on these events, foreshadowing a destabilized narrative.
Since disorientating events in the narrative may stem from uncertainty about time, com-
posers often introduce symmetrical pitch collections to evoke the lack of chronological
direction. In Back to the Future, Marty is accidentally thrown back into the 1950s during a
failed experiment by his eccentric scientist friend Emmett “Doc” Brown. Toward the beginning
of the film, they are experimenting with time, attempting to send a remote-controlled
car to the future with Emmett’s dog in the driver’s seat. The DeLorean roars to life as Emmett
manipulates its remote control. It takes off. The speedometer passes 20, then 50, then 80 . . .
85 . . . 88 . . . then bam! A sharp blast of air and electricity hits Emmett and Marty. Marty
blinks in disbelief as the car vanishes, only leaving behind a trail of fire and the car’s vanity
plate, which reads “OUTATIME.” Marty asks, “Where the hell are they?”, to which Emmett
retorts, “The appropriate question is: When the hell are they?” The music throughout the
lengthy scene draws on an octatonic collection [G, A♭, A#, B, C#, D, E, F] affording us an
opportunity to partake in the characters’ new experience of time, which now lacks the linear
chronological forces we recognize.
the police, Jack’s gang, and Batman, projects a well-defined tonal gravitational center
via the C-minor pitch collection. Its polymetrical framework, a quick waltz within an
underlying duple hypermetric structure, both signals the current confrontation and foreshadows
Jack’s transformation into the Joker. Unexpectedly, as one of Jack’s men shouts,
“Hold it!” and threatens to take down the commissioner, the confrontation stops. Will
Batman make a move? Will Jack escape the scene? The music supports the stop in the
action and the suspense that emerges from it by halting its pulse and introducing a static
whole-tone collection [C, D, E, F#, G#, A#]. This drastic change in the music’s affor-
dances affects us viscerally, guiding our embodied engagement with the music and our
interpretations.
FIGURE 5.18A Atonal, ametric music in The Matrix. (Music by Don Davis.)
The Matrix is also set on Earth, but the film presents earthly life as an illusion, one that a
chosen few can escape by awakening from a dream state. Neo is a chosen one, but he does
not yet know it. He is on the run, attempting to escape from ‘agents’—sentient eradication
programs within the source code of the matrix, manifested as men in dark green suits and
square sunglasses. On his left, there is a window, which he can use to get to the roof. He
debates with himself, “It’s insane!”, yet takes his chances, climbs out the window, and begins
walking on the narrow ledge of the skyscraper. A close-up of Neo reveals his psychologi-
cal disorientation. A sudden gust of wind knocks Neo off-balance, forcing him to drop his
phone so that he can cling tightly to the building. Startled, he watches as the abyss swallows
the phone. The music reinforces this feeling of disorientation via its (a)tonal and (a)metrical
design: a near-symmetrical pitch collection does not afford us a firm gravitational point, and
seemingly random metrical displacements of a musical figure hinder our entraining to a pulse.
Here, the musical and visual devices capture the protagonist’s physical and psychological
disorientation, and do more: they foreshadow the revelations to come and evoke the idea that
the world Neo lives in may not be governed by our gravitational and chronological realities.
In another scene from The Taking of Pelham One Two Three, sixteen passengers and the
conductor have been held hostage inside a subway car for hours. Blue, a mercenary, mean-
ders into the subway car and ominously orders, “You. Stand up, please.” The passengers
warily turn to look at whomever Blue has indicated. The conductor slowly looks up and,
in a broken voice, asks, “You mean me?” Blue quietly leads the conductor by one arm to
the rear of the car—all passengers’ eyes are on the conductor as he passes by, his free hand
hesitantly clinging to one overhead strap after another. The music accompanying the scene
does not project a beat or pulse, and draws on a nine-note chromatic collection to under-
mine any sense of stability. Like the conductor, we seek a point of support, but the music
gives us nothing to grab onto. Instead, the ametric and atonal music maps the static visuals
FIGURE 5.19A Nine-note chromatic collection, ametric music in The Taking of Pelham One Two
Three. (Music by David Shire.)
FIGURE 5.19B The Taking of Pelham One Two Three. Hostages wait endlessly in the subway car.
[01:05:45]
FIGURE 5.20 2001: A Space Odyssey. Attendant brings food to astronauts. [00:35:20]
FIGURE 5.21 WALL-E. WALL-E and EVE dance around the Axiom. [01:00:05]
WALL-E also presents a scene in outer space featuring music that helps reconstruct an
earthbound experience. In the distant future, humankind has abandoned Earth. WALL-E,
a small garbage-collecting robot, is left behind by himself to clean up the rubble. As he
tinkers with the trash he collects, he inadvertently becomes fascinated with Earth’s history,
especially show tunes and musicals. EVE, a reconnaissance ‘female’ robot, arrives in search
of living organisms. WALL-E falls in love with EVE, and both embark on a fantastic jour-
ney through space, hitching a ride on the outside of Axiom, a spacecraft carrying humans
who evacuated from Earth over seven centuries earlier. Suddenly, a tiny spark of electricity
passes between them. WALL-E spins and pirouettes, giggling, flying around the spacecraft.
EVE matches him move for move, in perfect unison. As both engage in synchronized fly-
ing around the ship, weaving in and out of rocket flames, harmoniously twirling double
helixes, the gentle music in the soundtrack supports their dance-like movements by pro-
jecting a well-defined triple meter and tonality (D♭-Lydian). Humans aboard the Axiom
watch in amazement as WALL-E and EVE spiral gracefully around one another. However,
having lived for such a long time onboard the spacecraft, the humans no longer recog-
nize or are physically capable of the earthbound activities their ancestors engaged in.
Engrossed in Earth research, the captain asks the computer, “Define dancing”, to which it
responds, “A series of movements, involving two partners, where speed and rhythm match
harmoniously with music.” While the dialogue defines ‘dancing’, the audiovisual spectacle
invites us to experience ‘dancing’. The music is thus integral in constructing this embodied
experience—although underscoring zero-gravity visuals, the music affords us, and the pas-
sengers aboard the Axiom, an opportunity to feel grounded and engage in a virtual dance
along with WALL-E and EVE.28 Through the music, we recognize what members of this futur-
istic society have lost and what they must now learn from the machines.
Coda
Film music invites bodily engagement. As listeners, we subliminally react to the music’s
properties and structure, attuning our perceptual experiences to its affordances. In this chap-
ter, we explored the affordances of metrical and tonal frameworks, both in isolation and in
combination, elucidating how these facilitate our embodiment of a film character’s physical
actions and psychological states. In constructing a more ambitious argument, the chapter
contemplates how the music’s affordances present an additional associative layer of mean-
ing tied to the perceiver’s cultural practices and values.
In Batman, characters lack superpowers. They obsessively seek, but ultimately fail, to
transcend their human boundaries. Batman resorts to material objects to extend
his bodily capacities—armor to protect himself, gadgets to swiftly navigate his way through
the gothic cityscape, a searchlight to project his image onto the smoggy skies and assert his
presence—yet he is painfully aware of the limits of his human affordances. The music in
the film emerges as one additional gadget—this time designed by the filmmakers to reach
beyond the film’s storyworld—one that exists within an analogous mythical tension: the
music subliminally forces us to resonate with the film’s characters in their attempts to recon-
struct themselves yet remains trapped and constrained by their (and our) human affordances.
Notes
1. Gibson (1979) argues that our perception of the environment is not based on a passive observa-
tion of its features, but on the active detection of the opportunities it affords for action. Through
the notion of affordances, he acknowledges a continuum between perception and action, and a
reciprocal relationship between an organism and its environment. Affordances therefore rest on
the interplay between the environment’s structure, our capacities, and our intentions. A Gibsonian
account of perception contends that we ‘resonate’ with our environments, recognizing ecologi-
cally relevant information; such an account shifts the focus from the properties of objects to the
relationship between the observer and the environment. In turn, a broad understanding of the
notion of ‘environment’ ensures that we recognize affordances as extending beyond the physical
or material and onto the social and cultural realms. Music, for instance, when conceived of in
terms of organism-environment interaction, offers ecologically relevant information, which we
pick up from “the (sonic) environment and which affords perceptual significance” (Reybrouck,
2012, p. 394). For an in-depth discussion of the notion of affordances, see Appendix III in this
volume.
2. Entraining to music is a complex phenomenon based on an interplay between the music’s hi-
erarchical levels of patterning—rhythmic gestures, beats, (hyper)metrical units—and our bodily
capacities. The beat is generally the most accessible (entrainable) temporal level, generally recog-
nized by tapping to the music’s pulse, or ‘tactus’. Our perception of meter results from grouping
beats into larger periodic temporal units; two-beat (duple meter) and three-beat (triple meter)
groupings are the most typical. Because some listeners may hear slowly paced periodicities (e.g.,
at 30 BPM), while others may construct different perceptual impressions by entraining to nested
periodicities, either twice as fast (at 60 BPM) or even four times as fast (at 120 BPM), within the
context of this chapter I select periodicities that approximate 60 BPM. Furthermore, I presume
that, like me, readers do not have access to the scores or other music-related information corre-
sponding to the film examples, and therefore must rely upon their (innate and learned) capacities
for entrainment.
3. Karl Bücher’s ethnomusicological investigations on work and rhythm point us to the origin of folk
songs as the acoustic backdrop for the temporal structuring of physical labor. Meter in music that
accompanies physical labor “necessarily results from our inner bodily constitution and from the
technical preconditions of the work” (quoted in Meyer-Kalkus, 2007, p. 172).
4. In 1966, Pete and Toshi Seeger, alongside folklorist Bruce Jackson, produced a documentary about
enslaved African Americans. They observed that songs would somewhat alleviate the strenuous
physical effort and prevent individual workers from being singled out as performing more slowly
than the rest. The documentary captures a real-life instance of collective entrainment to song,
unfiltered by the aesthetics of a film director or composer.
5. Because quadruple meter contains a partially accented third beat ([1—2—3—4]–[1—2—3—4] . . .),
we often perceive it as a variant of duple meter, therefore carrying nearly identical extramusical
associations.
6. Duple meter is pervasive in filmic depictions of work and locomotion and in Western music writ
large. Huron’s (2006) statistical studies of many musical genres illustrate that “duple and quadru-
ple meters occur twice as often as triple and irregular meters (66% vs. 34%). In addition, simple
meters are roughly six times more common than compound meters. In other words, . . . there ex-
ists a preference for binary beat groupings and a marked preference for binary beat subdivisions”
(p. 195). He then builds upon the EEG-based neurophysiological studies of Brochard et al. (2003)
and on Jones and Boltz’s (1989) theory of rhythmic attention to conclude that “there is some innate
disposition toward binary temporal grouping” (p. 196).
7. Theories of embodied cognition suggest that tempo perception originates in human locomotion
(walking or running), where “the experience of rhythm is mediated by two complementary repre-
sentations: a sensory representation of the motional-rhythmical properties of an external source, on
the one hand, and a motor representation of the musculoskeletal system, on the other” (McAngus
Todd et al., 1999, p. 26).
8. In some unique cases, the music’s meter is employed as a form of word painting rather than to
evoke a visceral response. A case in point is the film Stardust, where the music for the character
Septimus incorporates a septuple meter ([1—2—3—4—5—6—7]–[1—2—3—4—5—6—7] . . .).
9. Much research suggests that our tendency toward entrainment is an evolutionary adaptation of our
sensory systems. For instance, Phillips-Silver et al. (2010) note that entrainment builds upon “pre-
existing adaptations that allow organisms to perceive stimuli as rhythmic, to produce periodic
stimuli, and to integrate the two using sensory feedback” (p. 3). Because entrainment allows for
“successful predictions [that] can enhance the speed of perceptual organization” (Grahn & Rowe,
2009, p. 7547), our successful entrainment to cyclic patterns in our environments offers an evolu-
tionary advantage. In turn, musical entrainment emerges as an extension of our broader tendency
and capacity for entrainment. London (2012), for instance, speaks of entrainment in music as “a
form of anticipatory behavior” (p. 25). Similarly, Large (2002) speculates that musical entrainment
stems from “temporal expectancies that adapt in response to temporally fluctuating input” (p. 1).
Therefore, in the context of adaptive cognition, temporal elements that undermine our capacity
for entrainment, such as complex asymmetric meters, will elicit intensely unsettling physical and
psychological responses.
10. London (2012) draws a connection between temporal entrainment and the expressive power
and social relevance of music, reminding us that we “learn to attune ourselves to the particular
rhythms of our native musical culture[s]” (p. 64).
11. Exploring entrainment as a “rhythmic dimension of human sociality”, for example, McNeill (1997)
points out that synchronized marching in combat serves a ‘bonding’ rather than a mere ‘coordi-
nating’ function. Furthermore, he suggests that entrainment as a means for social bonding and
identity formation may extend beyond marching, and remarks that for the Aztecs, the practice of
dancing ensured that “warriors asserted their corporate identity and prepared for battle” (p. 103).
Keyfitz (1996), in turn, suggests there is a lasting psycho-social impact to communal entrainment,
speculating that whether “ambushing a tiger in the hunt [or] attacking an enemy stockade, the
group that has drilled or danced together is more likely to come out successful through the mutual
trust among its members developed as bonding” (p. 408).
12. Similarly, in a scene from Farewell My Lovely [00:44:10] the film’s femme fatale and a private
investigator speak alone. The music enters as she says, “Why don’t you come over here and sit
beside me”, to which he responds, “I’ve been thinking about that for some time, ever since you
first crossed your legs.” In this scene, the visuals suggest dynamics between the characters analogous
to those in Punch-Drunk Love—by their position opposite to each other and by the color of their
clothes—while the music in triple meter underscores a similar dance of seduction and anticipates
their becoming intimate.
13. Even when asserting that associations triggered by physiological responses are ‘learned’, the
learning process is a biological (not a disembodied) phenomenon based on motor and neuro-
logical mechanisms that stem (in this case) from temporal patterns of rhythmically coordinated
experiences.
14. The visuals support and contribute to this interpretation. Initially framing Jenny and a luxury car on
opposite sides emphasizes Jenny’s distance from the lifestyle David offers. Subsequently, as Jenny
no longer resists the idea of engaging in unethical behavior, David swings her over to the side of
the frame where the luxury car is located, suggesting a transformation that aligns with a different
set of values. As David and Jenny continue dancing, the sudden appearance of two luxury cars
aligned on the visual frame (and matching the colors of David’s and Jenny’s attire) further supports
this highly suggestive visual metaphor.
15. When conducted on Earth, this research simulates weightlessness conditions; for a review see
Oman (2003).
16. In space, for instance, the conflict between proprioceptive information and visual or tactile cues
creates a “disparity between sensory input from various sources [that] may result in acute disori-
entation” (Tischler & Morey-Holton, 1992, p. 1345). For additional sensory conflict models see
Oman (1998); Reason and Brand (1975).
17. For instance, the “sensory conflict produced by the altered gravireceptor signals produces symp-
toms of motion sickness” (Oman, 2003, p. 202), mainly when “vestibular stimulation is in conflict
with all other sensory information” (Hecht et al., 2001, p. 116).
18. The perception of a tonal center is one of the most complex phenomena studied in music cogni-
tion and the subject of extensive experimental research. Browne’s (1981) “rare interval” hypothesis
holds that a scale’s unique intervallic profile contributes to defining a pitch collection’s qualia in
terms of its gravitational center. Symmetrical collections are characterized by cyclical intervallic
patterns, high frequency of a small number of intervals, and sub-segments that map onto them-
selves verbatim under various degrees of transposition or inversion. The absence of statistically
rare intervals and the absence of unequal distribution of intervallic structures in symmetrical col-
lections is foundational to our perception of their qualia. Nevertheless, various compositional
strategies—the archetypical gestures and style-bound conventions embedded in the musical tex-
ture—play an equally significant role in eliciting various degrees of tonal centricity; as Butler
(1989) notes, the “listeners’ judgments of tonal center are strongly influenced when rare intervals
are arranged differently across time” (p. 234).
19. In contrast, the ‘rare’ tritone and diminished triad in a major scale contribute to orienting us
toward the scale’s gravitational center (or tonic) by eliciting a set of tendencies in relation to the
tonic as the most stable point. Therefore, the absence of rare intervals or rare triads in symmetrical
collections does not afford us a landmark, a pointer, to orient us toward the collection’s gravita-
tional center.
20. The visuals of space expeditions often seek to elicit feelings akin to weightlessness by simulating
zero-gravity environments, triggering in us a cognitive dissonance between the optical perception
of floating and the proprioceptive sensation of gravity. For instance, in exploring such visual tech-
niques, D’Aloia (2012) argues that the spectator “loses (at least temporarily) the sensation of being
grounded to a surface of support . . . in the immersive darkness of the movie theatre, the spectator
senses an incongruity between what is seen and what is felt” (p. 232). In turn, the music is a pow-
erful device, one that operates in concert with other cinematographic techniques. In fact, research
shows that our motor behavior responds more strongly to aural than visual stimuli (see Patel et al.,
2005; Repp & Penel, 2004). In scenes depicting space expeditions, for example, music that draws
on symmetrical pitch collections may engender this inter-sensory conflict, thereby further enhanc-
ing the cognitive dissonance.
21. A tritone, for instance, embeds an unresolved sensory dissonance that causes a “physiologi-
cal interference along the basilar membrane of the cochlea . . . [rendering] the hearing organ
less able to discern the various spectral components present in the environment” (Huron, 2006,
pp. 324–325).
22. Murphy (2006) speaks to the use of the tritone, and specifically the major tritone progression, in
recent Hollywood films featuring scenes set in outer space.
23. In a Shepard tone, the superposition and loudness fluctuations of sine waves an octave apart cre-
ate the auditory illusion of a continually ascending or descending pitch.
24. In Cloudy with a Chance of Meatballs, a descending sequence of dominant-seventh harmonies
underscores the protagonist’s wandering in the air at the mercy of a strong tornado. [00:50:20]
Each dominant-seventh harmony is devoid of its resolution to the tonic, and instead, each leads
to another dominant-seventh harmony a half-step below it. This descending chromatic sequence
systematically thwarts the listener’s tonal expectations, not affording the listener a tonal center.
Only as the protagonist reaches the ground does the music resolve to a stable sonority that functions
as a tonic, denoting that the protagonist reached firm ground. Scenes depicting falling characters
or objects typically blend the LINEARITY image schema (see Chapter 3) with a musical dissolution
of gravitational forces.
25. The different ratios of intervallic constructs embedded within pitch collections prompt us to con-
struct mental representations (i.e., schemas) particular to each pitch collection—the distances
between adjacent tones in a collection give “rise to the emergence of the very different qualia”
(Shepard, 2009).
26. Film directors often draw on scientific research and new technologies to arrive at cinematographic
artifices that elicit in us specific responses. For instance, in 2001: A Space Odyssey, Kubrick
pioneered a visual effect to simulate a zero-gravity environment; to achieve this, he designed a
circular set that rotated vertically and instructed the actors to move at the rotating speed so as to
remain at the bottom of the set.
27. However, as the film unfolds, this normalcy fades away, and the music becomes dissonant and
ametric, unfamiliar and alien.
28. Oden (2023) explores how triple and compound meters generate a feeling of weightlessness,
including in this scene from WALL-E.
6
MEMORY & AUDITORY PERCEPTION
In Jaws, the music lurks from beneath our conscious attention and conditions us to respond
automatically, reflexively, instinctively. The film opens with a dozen young men and women
gathered around a beach bonfire, serenely trading songs. Chrissie and Tom break away
from the circle and head for a swim. Chrissie runs up a dune and pauses to look at the
quiet ocean, the sun setting on the horizon. She tosses her clothes on the sand and placidly
draws herself deeper and deeper into the soothing, silent waters. Tom remains on the shore.
The music sneaks into the soundtrack, with a two-note figure emerging from the depths of
the orchestra. A ripple in the water grabs Chrissie’s attention as she feels gently pushed up
and pulled down. No, it is not Tom playing around; he is still on the beach. Frightened, she
swims toward the shore, but the ripple moves with her. Now, with a tenacious drive, the two-
note musical figure intensifies, sucking Chrissie (and us) down to her horrific fate. A leitmotif
has emerged.
Leitmotifs are more than memory traces. They are to us what the sound of a bell is to
Pavlov’s dog: ecologically relevant acoustic cues that rely on evolutionary mechanisms to
condition us to respond in specific ways.1 During a film, leitmotifs manifest themselves
through the concurrent and consistent appearance of particular musical figures and their
extramusical counterpart; once established, they effectively bring to mind their extramusical
counterparts and trigger a visceral response. In this chapter, we first delve into the various
rationales behind leitmotif construction. Then, we attend to the cognitive mechanisms that
allow us to identify these figures, solidify them in our minds, and associate them with their
extramusical counterparts. Last, we reflect on the entailments of leitmotifs from the vantage
point of evolutionary psychology.
Leitmotif Construction
Leitmotifs straddle both the embodied and semiotic worlds, eliciting physiological responses
while also serving as musical signs. This dual nature of leitmotifs, which imbues them with
unmatched rhetorical and expressive power, manifests itself during the compositional
DOI: 10.4324/9780429504457-7
Hardwired Responses
Composers design leitmotifs that elicit hardwired responses by establishing analogies
between the visual and aural domains. In these cases, the extramusical counterpart’s physi-
ognomy (such as size, shape, density, reflectance, texture) helps determine the musical char-
acteristics (such as loudness, pitch range, timbre, contour). In tracing our affective responses
to these musical characteristics, we recognize that our evolutionary history has hardwired
us to respond in specific ways. When foregrounded within leitmotifs, these musical charac-
teristics elicit visceral responses analogous to those elicited by the leitmotif’s extramusical
counterpart. Leitmotifs can therefore indicate their extramusical counterpart’s temperament
and physical appearance—small and friendly characters or objects would feature a small
ensemble sound playing softly, whereas large and threatening characters or objects would
feature large symphonic forces playing loudly.
In Maleficent’s opening scene, we learn that in a magical kingdom lived a wonderful
creature, a fairy girl, whose name was Maleficent. A young girl with small wings rests hap-
pily on a tree branch, innocently playing with dolls and gently wrapping her tiny hands
around a broken tree branch to heal it. At this tender moment, the soundtrack introduces
young Maleficent’s leitmotif—a gentle melody on resonant instruments performing in their
high registers. Later in the film, a much older Maleficent learns of a betrayal that hardens
her heart. She screams with rage and marches toward the center of the kingdom. White fire
erupts from her, scorching everything in her path. Maleficent now has turned into a fierce
creature bent on revenge. During this threatening moment in the film, the soundtrack intro-
duces a different leitmotif—densely stacked chords in the low register performed by a large
ensemble that includes nearly the entire orchestra—to represent the now-unkind, fierce
Maleficent. In both instances, the leitmotifs’ musical characteristics directly tap into our
evolutionary, hardwired responses.
Music’s Affordances
Film composers often rely on the music’s affordances (and the embodied responses these
elicit) to establish a strong link between a musical figure and its extramusical counterpart.
Such affordances transpire through a leitmotif’s structural parameters—as we embody these
structural parameters, we subliminally engage in a broad array of interactions with the
music, interactions that directly relate to a leitmotif’s extramusical counterpart.
FIGURE 6.3B Vertigo. Scottie follows Madeleine up the bell tower. [01:16:00]
Vertigo illustrates a masterful interplay between various cinematic devices and the audi-
ence’s reactions to those devices. Scottie, who suffers from an intense fear of heights, attempts
to prevent Madeleine from committing suicide. Steps from a tall church, they exchange ten-
der words. Madeleine, however, seems distressed; she runs to the bell tower and quickly
goes up the stairs. Scottie follows her but cannot keep up—his vertigo paralyzes him. At this
moment in the soundtrack, and at similar moments throughout the film, the music signals
the protagonist’s outbursts of vertigo with a distinctive leitmotif that affords us no firm tonal
grounding—a chordal construct superimposing two unrelated triads (E♭m and D) and harp
sweeps in two different modes (E♭-Aeolian and D-Ionian), forming a near-dodecaphonic
collection—depicting (and eliciting in us) a feeling of bodily and psychological instability.2
In The Matrix, characters bend time and space. Trinity is on the run, with armed police
officers and two agents following her. She emerges as a black blur in the dark cityscape,
darting across rooftops as if she were part of the shadows themselves. The police chas-
ing her seem heavy and lumbering in comparison, but two agents take the lead, almost
matching her athleticism. With a sudden burst of speed, Trinity reaches the edge of a
rooftop and defies gravity, leaping in the air, her eyes fixed on the distant building. Time
seems to stretch out endlessly. As she hurtles through space, the wind whips past her face,
the sounds of the city fade away, and the ‘Trinity leap’ leitmotif soars in the soundtrack:
a set of oscillating triads that disrupt the firmly grounded musical gravitational forces and
an (a)metric profile that suspends musical temporality. Then, with a soft thud, she lands
on the opposite rooftop, her form perfect and controlled. The agents, left behind on the
first rooftop, stare in disbelief at Trinity’s audacious feat. Throughout the film, the ‘Trinity
leaps’ leitmotif capitalizes on our tendencies to attune to the music’s affordances with a
figure that evokes a suspended embodied response analogous to Trinity’s gravity-defying
moves.
Acoustic Resemblance
Via musical onomatopoeia—imitating the sound of an extramusical element—composers
construct leitmotifs that feature a close acoustic resemblance between the musical and
extramusical elements. By mimicking a character’s sounds and thereby drawing on our
pre-existing auditory associations, leitmotifs begin to extend into the terrain of musical
semiotics.3
Return to Oz presents numerous leitmotifs, most of them resting on the associative pow-
ers of onomatopoeia. The leitmotif for Billina, a chicken who travels with Dorothy, captures
the nasal timbre of a chicken’s cackle as well as the chicken-esque intervallic patterns and
articulation; while the two oboes and a bassoon help construct Billina’s timbric character-
istics, the angular-shaped (and somewhat disjointed) musical figures provide the contour
and articulation that contribute to onomatopoeically capturing her cackle. The leitmotif for
another character, Tik-Tok, a metallic mechanical character, draws primarily on timbre; it
features the metallic tone of an all-brass ensemble formed by one cornet, two baritone
horns, and one tuba.
Musical Archetypes
Musical ‘topics’ function as archetypes, providing composers a scaffolding framework to elicit,
through their music, specific social and cultural associations.4 When weaving musical top-
ics through the fabric of leitmotifs, composers are keen to foreground a topic’s characteristic
‘marker(s)’. Although any musical parameter may potentially amount to a defining topical
marker, pitch collections (in the form of modes or scales) are particularly effective and versatile.
Kung Fu Panda illustrates the use of a pitch collection, while also including culturally
specific instrumentation as an equally effective topical marker. This animated film unfolds
in the Valley of Peace, a land in ancient China inhabited by anthropomorphic animals. The
‘Hero’ leitmotif for the main character, Po, who secretly dreams of becoming a Kung Fu
legend, draws on a pentatonic scale [D, F, G, A, C] and uses strings and traditional Chinese
instruments to signify his origins.5 This leitmotif emerges for the first time as Po dreams of
saving his village from the evil Tai Lung.
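To make the pitch-collection marker concrete, the following sketch computes the step pattern of the collection [D, F, G, A, C]. It is a minimal illustration that assumes natural (unaltered) note names and the usual pitch-class numbering; the helper name `step_pattern` is hypothetical.

```python
# Pitch-class numbers for the natural notes (C = 0).
NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def step_pattern(notes):
    """Return the ascending semitone steps between successive notes,
    including the step from the last note back up to the first."""
    pcs = [NOTE_TO_PC[name] for name in notes]
    return [(pcs[(i + 1) % len(pcs)] - pcs[i]) % 12 for i in range(len(pcs))]


po_scale = ["D", "F", "G", "A", "C"]
steps = step_pattern(po_scale)

# An anhemitonic collection contains no semitone (1) steps.
is_anhemitonic = 1 not in steps

print(steps, is_anhemitonic)
```

The resulting pattern of minor thirds and whole tones, with no semitones, matches what is commonly called a minor pentatonic scale on D.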
Often, numerous markers combine to define a topic. In the Star Wars films, Darth Vader’s
leitmotif’s minor mode, dotted rhythm, and slow tempo evoke the ‘funeral march’ topic,
along with its socio-cultural associations.6 Here, however, multiple additional mechanisms
play a role in strengthening the link between the leitmotif and its extramusical counterpart:
its low register and relatively high loudness fill the entire aural space, instilling in us the
fear that stems from encountering an overpowering large force; its duple meter engages our
embodied affordances of marching, bringing about resonances of war and conflict;7 and,
its oscillating chromatic-third relationships [C-minor, A♭-minor] suggest a complementary
topic associated with the character’s sinister nature.8
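The chromatic-third relationship between the two triads cited above can also be sketched numerically. The example below assumes the usual pitch-class numbering and treats C natural minor as the diatonic reference; both choices, along with the function name `triad`, are illustrative assumptions rather than part of the analysis itself.

```python
def triad(root, quality):
    """Return the pitch classes of a major or minor triad on the given root."""
    third = 3 if quality == "minor" else 4
    return {root % 12, (root + third) % 12, (root + 7) % 12}


C, A_FLAT = 0, 8  # pitch-class numbers of the two roots

c_minor = triad(C, "minor")
ab_minor = triad(A_FLAT, "minor")

# Interval class between the roots: the smaller of the two distances around the circle.
root_interval_class = min((A_FLAT - C) % 12, (C - A_FLAT) % 12)

# Tones shared by the two triads.
common_tones = c_minor & ab_minor

# Tones of the second triad that fall outside C natural minor (C Aeolian).
c_aeolian = {0, 2, 3, 5, 7, 8, 10}
chromatic_tones = ab_minor - c_aeolian

print(root_interval_class, sorted(common_tones), sorted(chromatic_tones))
```

In this reading, the roots lie a major third apart, the triads share a single tone (E♭), and the A♭-minor triad introduces one tone (C♭) foreign to C natural minor, which is what makes the third relationship chromatic rather than diatonic.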
Leitmotif (Re)Cognition
For a musical figure to become a leitmotif, listeners must engage in numerous cognitive
mechanisms related to perception and memorization: we subliminally embody a musical
figure via subvocalization, organize and decode it according to gestalt principles, solidify
it in memory through subvocal rehearsal, and categorize it according to learned musical
schemas.9 Such perception and memorization mechanisms are vital in setting the space for
a subsequent associative mechanism, where we link a ‘proto’-leitmotif to an extramusical
element. Ultimately, once established, a leitmotif’s appearance within the fabric of a film’s
soundtrack will elicit the presence of its associated extramusical element, thereby reinforc-
ing a leitmotif’s semiotic nature as a musical sign. In this section, we draw exclusively on
Jaws’ famous leitmotif to illustrate how this multipronged mechanism subliminally leads us
from hearing a two-note growling musical figure to sensing the presence (virtual or realized)
of a menacing shark.
Perception
Jaws’ opening title sequence submerges us within an underwater point-of-view that sug-
gests the perspective of an animal lurking in the ocean’s depths. In the soundtrack, the
unnerving silence is gradually filled with a two-note figure in the low register. Our brains
and bodies actively (yet subliminally) respond to the low rumblings in the music: we
subvocalize the ascending semitone musical figure, entrain to the regularity of its underly-
ing pulse, and increase our heart rate and pupil dilation in response to its unpredictable
accents.
We also engage gestalt principles (italicized here) to decode this acoustic stimulus. The
initial temporal and registral proximity of two notes (E – F) and the silences between itera-
tions create the impression of a single melodic figure. A gradual unfolding of this figure via its
obsessive repetition suggests an embryonic musical process—the similarity of all repetitions,
emerging with an ever-increasing frequency and within a clear metrical structure, prompts
us to perceive a lengthier, fully formed musical theme. (See Figure 6.11.)
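The proximity principle described above can be illustrated with a toy grouping procedure. This is only a sketch: the onset times are invented for illustration (they are not measured from the film), and the gap threshold is an arbitrary assumption.

```python
def group_by_proximity(onsets, max_gap):
    """Group sorted note onsets (in seconds): a new group starts whenever
    the silence since the previous onset exceeds max_gap."""
    if not onsets:
        return []
    groups = [[onsets[0]]]
    for previous, current in zip(onsets, onsets[1:]):
        if current - previous <= max_gap:
            groups[-1].append(current)  # close in time: same figure
        else:
            groups.append([current])    # long silence: a new iteration begins
    return groups


# Hypothetical onsets: pairs of notes separated by progressively shorter silences.
onsets = [0.0, 0.5, 4.0, 4.5, 7.0, 7.5, 9.0, 9.5]
groups = group_by_proximity(onsets, max_gap=1.0)

# Silence between the end of one iteration and the start of the next.
gaps_between_groups = [b[0] - a[-1] for a, b in zip(groups, groups[1:])]

print(groups)
print(gaps_between_groups)
```

With these invented values, each pair of notes is grouped as one two-note figure, and the shrinking gaps between groups mimic the ever-increasing frequency of the repetitions.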
Its growth does not stop. This two-note figure further expands harmonically into two
semitone-oscillating dense harmonic constructs that, with their analogous contours and
simultaneous onsets, engage the common fate principle, prompting us to group the various
layers of this sonic construct.10 (See Figure 6.12.) While this two-note figure continues grow-
ing, the music presents a secondary melodic figure—it is precisely the proximity, similarity,
and common fate gestalt principles that allow us to identify this as a ‘secondary’ figure, one
that complements (yet is dissociated from) the primary one.
Toward the end of the opening title sequence, the initial two-note figure gains further
strength, adding octave doublings, capturing the entire register.11 By the end of the main title
sequence, listeners have subliminally processed the two-note musical figure to the extent
that it has become a proto-leitmotif—a sonic object with the potential to become a full-
fledged leitmotif that will hook us to the story. (See Figure 6.13.)
To make a musical idea recognizable (and potentially more memorable), composers
embed ‘salient’ features in one (or more) of its parameters—for instance, an unfamiliar
timbre, an unusual rhythmic gesture, or an uncommon contour.12 In doing so, composers
intuitively draw on our evolutionary history, on our hardwired tendencies to respond to
changes in an otherwise regular environment.13 In Jaws, the unusual combination of con-
trabassoon and double-bass performing a motivic idea in the instruments’ lowest registers
(rather than an accompaniment figure or a bass-line) cuts through the listeners’ perceptual
habituation.14
Memorization
Leitmotifs are fluid, idealized mental constructs, morphing with every new instance, always
containing within themselves the potential for multiple expressions.15 Our ability to solidify
a leitmotif in short-term memory is contingent upon the interplay between learned schemas
(entailing a top-down cognitive mechanism) and its repetition (both embodied repetitions
in the form of subvocal rehearsal, and actualized repetitions in the form of iterations in the
soundtrack).16
Learned schemas serve to offload cognitive processing and provide a scaffolding frame-
work for perceiving information.17 These mental constructs are vast and varied, spanning
from phonological to syntactical musical attributes, including tuning systems, pitch collec-
tions, timbres, metrical patterns, phrase structures, and even stylistic or performative traits.
When subliminally parsing new music stimuli through learned schemas, we further ‘chunk’
information and identify schematic correspondences.18
Music that draws on encultured listeners’ learned schemas—e.g., using a familiar tun-
ing system or drawing on archetypical gestures of a recognizable style—is more prone to
stick in the memory.19 For this film, anecdotal evidence suggests that the primary figure,
the two-note chromatic ascent, firmly remains in the collective consciousness as ‘the Jaws
theme’, while the secondary figure, the arpeggiated gestures in the horns, is seldom if ever
recognized in connection to the film. This difference results, in part, from schemas at work:
tonally, the primary figure easily relates to learned melodic and tonal schemas, whereas the
secondary figure does not; and temporally, the primary figure projects a clear metrical pat-
tern, becoming a driving ostinato that marks the pulse, thus nicely fitting within the temporal
scaffolds of learned schemas, whereas the secondary figure is not bound by the established
quadruple meter, freely introducing irregular durations (triplets, quintuplets, septuplets) that
further divorce it from the scaffolds of common metrical schemas.20 Therefore, the primary
figure easily sticks in our memories, while the secondary one quickly fades.
Mere exposure to an acoustic stimulus does not ensure its memorization. Initial ‘echoic
memory’ becomes extended through the ‘phonological loop’, a covert and subliminal rep-
etition mechanism that engages subvocal articulation and rehearsal and that helps us mem-
orize auditory stimuli. In the context of Jaws, although the initial two-note gesture ceases to
sound, we continue to subvocalize it, which subliminally contributes to its memorization.
The importance of the phonological loop notwithstanding, forming a robust memory trace
may only result through actualized repetitions in the soundtrack.
Just as in all types of cognition, the number of repetitions of a stimulus and the potentially
interfering informational ‘noise’ in between repetitions significantly affect the perception
and memorization processes.21 Certain auditory conditions set the space for effective per-
ception and memorization—to promote distinct memory traces, musical figures must appear
isolated from distracting auditory signals (dialogue, sound effects, or even busy accompani-
ment textures) and must repeat within short timespans.22 In Jaws, these two conditions are
met. At the onset of the main title sequence, the two-note musical figure emerges as the only
sound element in the film, unhindered by interfering dialogue or sound effects. Less than
three minutes into the film [00:02:30], a new music cue repeats identical materials, only
competing with sporadic cries for help from the victim. Shortly after [00:15:30], yet another
cue presents identical musical materials, but now completely isolated, not hampered by any
other soundtrack layer.
Association
Through the phases described up to this point, we have perceived acoustic information,
recognized units within it, and stored these in short-term memory. However, for these musi-
cal units to function akin to stimuli in classical conditioning, they must be associated with
an extramusical counterpart in the storyworld: during a film, a musical figure consistently
and systematically appears coupled with an extramusical element in the storyworld (an
object, a character, a situation), and after repeated exposure, the appearance of this musical
figure within the film’s soundtrack will suggest the presence of the associated counterpart
in the storyworld.23 By virtue of their association with a secondary stimulus, leitmotifs take
on a semantic nature, therefore residing along a continuum between implicit and explicit
memory.24
In Jaws, the concurrent presence of the characteristic two-note ascending chromatic figure
and the idea of a shark surfaces in various ways. During the main title sequence, the visuals
subliminally imply the presence of the extramusical counterpart, assuming a visual perspec-
tive from the ocean’s depths. Shortly after, in the second cue, Chrissie’s bizarre movements
suggest the extramusical counterpart’s presence as she is violently tossed from side to side
and ultimately pulled under the water. By the time we hear the third cue, the dialogue has
revealed the presence of a shark in the waters; while there has as yet been no visual confir-
mation of the existence of a shark, we witness the outcome—blood suddenly surges in the
water where children are playing and swimming. It is not until the fourth cue that the music
finally emerges coupled with images of the shark, this time circling a character submerged
in a seemingly flimsy cage. By this point in the film, the leitmotif is well-established. Now,
the expressive and signifying powers of the leitmotif emerge powerfully—by drawing on our
hardwired tendency to classical conditioning, the soundtrack introduces subsequent itera-
tions of this leitmotif to signal the dangerous shark’s imminent or latent presence.
Coda
Film music leitmotifs signal the presence of extramusical elements—characters, places,
events, or other narrative elements of a film’s storyworld. This chapter begins to reveal
the multi-level complexity of leitmotifs: while on the surface they appear as intra-
opus-established symbols, at a deeper level, they function as ecologically relevant acous-
tic cues that rely on the listener’s evolutionary, hardwired mechanisms.
Jaws navigates through a space of cultural significance while sinking its teeth into our
psyche. While the film’s surface is flooded with symbolism that grows more prophetic over
time, its lurking leitmotif operates beneath our conscious attention, ultimately altering our
behavior and relationship with nature, leaving a lasting, indelible yet unseen scar.
Notes
1. Leitmotifs have enjoyed much attention from film musicology; in fact, identifying and labeling
leitmotifs has long been a staple of film music scholarship. Although some recent musicological
studies on leitmotifs are informed by cognitive psychology, these do not engage in empirical
investigation and thus remain firmly rooted in the humanities. These include the exploration of
prototype formation (Bribitzer-Stull, 2015; Zbikowski, 2002), memory (Biancorosso, 2013; Cohen,
2014), evolutionary perspective (Biancorosso, 2010), and cue abstraction and imprint formation
(Cambouropoulos, 2001; Reybrouck, 2010; Wiggins, 2010). Additionally, scholars apply method-
ologies from cognitive sciences to understand leitmotifs from behavioral and psychological per-
spectives, with studies focusing on perception (Baker & Müllensiefen, 2017), memory (Boltz et al.,
1991, 2009), cognitive processing (Boltz, 2001, 2004; Hacohen & Wagner, 1997; Nosal et al.,
2016; Tan et al., 2010; Töpper & Schwan, 2008), cue abstraction and imprint formation (Deliège,
1992, 2001; Deliège & Mélen, 1997), salience (Deliège et al., 1996), and physiological responses
(Cantor, 2004). For in-depth discussions of memory and auditory perception, see Appendices IV
and V in this volume.
2. Murphy (2022) observes correspondences between the music and the visual artwork presented
during the main title sequence.
3. Since the relationship between signifier (the musical sound) and signified (the actual sound) is
based upon similarity, musical onomatopoeia amounts to a semiotic function—primarily an
iconic sign within the context of Peirce’s trivium.
4. Chapter 7 offers an in-depth treatment of film musical ‘topics’.
5. In the animated film Lady and the Tramp, the Siamese cats’ leitmotif outlines a pentatonic scale to
signify their Asian ancestry. Similarly, in Dr. No, the pentatonic scale is used for the protagonist’s
leitmotif to connote his Chinese descent.
6. Most funeral marches in the common practice repertoire feature analogous parameters. See for
instance, Chopin’s B♭-minor Piano Sonata, Beethoven’s Symphony #3, Grieg’s Funeral March,
Berlioz’s Marche Funèbre Pour la Dernière Scène, Mahler’s Symphony #5, Beethoven’s Piano Sonata
#12, and Alkan’s Funeral March on the Death of a Par.
7. By drawing on topical signification, some leitmotifs (tangentially) embed traces of embodied af-
fordances to suggest character traits. For instance, comparing the meter of leitmotifs represent-
ing two villains, Darth Vader from Star Wars and Joker from Batman, sheds light on the musical
characteristics that reflect their individualities—whereas the duple meter in Darth Vader’s leitmotif
represents his martial and warlike nature, the triple meter in Joker’s leitmotif represents his odd,
circus-like appearance.
8. Lehman (2018) defines this specific chromatic-third relationship as evoking “a feeling of harmonic
unnaturalness . . . the affective ‘dark side’ ” (p. 101).
9. Leitmotif cognition engages bottom-up and top-down mechanisms. These are functionally dis-
tinct: while bottom-up mechanisms are geared toward constructing perceptual entities, top-down
mechanisms are geared toward selecting information by activating learned schemas. Reybrouck
(2010) echoes this duality in perception and distinguishes between ‘extracting’ salient features
from the musical surface and ‘abstracting’ the musical surface; while the former relates to the
sensory experience of music according to gestalt principles, the latter relates to the mechanisms of
cue-abstraction and imprint-formation necessary to the construction of a learned schema.
10. For some listeners, this musical gesture is reminiscent of Stravinsky’s The Rite of Spring chord.
11. Here, the primary and secondary musical figures blend: the harmonic construct draws on a pitch
collection associated with the secondary theme (E, G, B♭, D♭, E♭), yet its voicing foregrounds
interval-class 1 between its outer notes, the primary interval of the primary two-note figure.
12. Deliège (1992) attends to “head structures” (the initial portion of a theme) and passages with “ac-
cents” (musically marked, salient events in terms of loudness, range, contour, temporal placement,
duration, or any musical parameter), and suggests that both become imprinted in memory to a
greater degree than the remaining portions of a melody.
13. There is, therefore, a delicate balance between abiding by common schemata and attaining sali-
ency; to construct memorable and recognizable musical figures, composers must do both: draw
on common musical schemas and include salient features.
14. The saliency of a proto-leitmotif’s initial presentation is vital to ensure its recognition, especially
because subsequent iterations may be obscured in the soundtrack, buried under dense accompa-
niment textures—successful recognition of subsequent iterations is contingent upon our forming a
strong imprint of the proto-leitmotif. Much of Dowling’s influential research program attends to the
interaction of contour (bottom-up mechanisms akin to gestalt) and scale/interval information (top-
down mechanisms grounded in schemas) in the memorization of melodies; in several studies, he
suggests that when exposed to short, atonal, or novel melodies, we are more prone to retain their
contour information, whereas when exposed to longer, tonal, or familiar melodies, we are more
prone to retain a melody’s interval information (see Dowling, 1978; Dowling & Bartlett, 1981;
Dowling & Fujitani, 1971). More recent neuropsychological studies also suggest the primacy of
24. Although most research on melody recall and discrimination examines the role of conscious atten-
tion, leitmotifs are embedded within the musical fabric, which often goes unnoticed and is seldom
attended consciously.
25. Our ability to create and recognize patterns in music may have been a precursor to our ability
to create and understand language, as both involve organizing and structuring sounds. Therefore,
our tendencies to recognize and memorize leitmotifs may rely on a pre-linguistic evolutionary
function.
26. Rötter (1994) identified marked physiological reactions, such as peaks in Galvanic skin response,
that coincide with the occurrences of leitmotifs, suggesting a phenomenon akin to classical condi-
tioning. Biancorosso (2010) speculates about Jaws’ leitmotif triggering such classical conditioning,
noting that “by the time the audience (or those viewers already in the know) sees the subsequent
attacks, the connection between the minor-second motive and the action becomes Pavlovian in its
reflexivity” (p. 307).
27. Often, salient musical gestures that do not function as leitmotifs become so intricately associated
with an element in the storyworld that they transcend the boundaries of their original film and
find their way into other films as a quotation or parodic commentary—the shrieking string gestures
from Psycho explored in Chapter 1 are a prime example.
28. Most scholars regard Jaws’ as the leitmotif par excellence because it uses “mood-congruent re-
lationships . . . [to] reflect both a joint encoding and a unified memory representation of music
and film information” (Boltz, 2004, p. 1196) and because it relies on “events associated with
strong emotions . . . [which] are better remembered than emotionally more neutral events” (Guen-
ther, 2002, p. 323). In fact, extra-musical elements such as the narrative, context, associations,
and other sensory information all contribute to our (re)cognizing and solidifying melodies in our
memory, particularly when there is structural, affective, or semantic congruency or alignment
between the stimuli across modalities. Tulving’s (1983) ‘encoding specificity’ hypothesis purports
that the presence of contextual similarity surrounding the initial encoding and subsequent retrieval
of memories helps consolidate them as long-term memory traces. This suggests that reinforcing
contextual associations during a set of initial presentations may promote encoding a leitmotif as a
long-term memory trace. Boltz et al. (1991) examine the effect of background music on
remembering filmed (visual) events, and observe that the music’s impact is dependent on its (affective)
congruency with the visuals: “in situations where the mood of the music corresponds to the affec-
tive meaning of the scene, memory should be quite high . . . depending on the mood congruency
and placement of music relative to a critical scene, music can enhance subsequent recall relative
to situations where no music occurs at all” (p. 600). In a subsequent study, Boltz (2004) explores
the inverse relationship, “whether visual information influences the perception and memory of
music” (p. 43), and observes that “the affect and format of visual information” (p. 54) influence
the cognition of music, particularly the cognition of melodies. The more emotionally gripping
the setting, therefore, the less we attend to generic framing elements; this sets up the conditions
for salient (affective) elements to make a more memorable impression.
29. Much research suggests that leitmotifs significantly influence our attitudes toward particular ‘real-
life’ scenarios outside of films, and that the impact of leitmotifs extends from in-theater to out-of-
theater behavioral responses, particularly after exposure to highly effective horror or frightening
films. Nosal et al. (2016) argue that “the music accompanying shark footage is nontrivial . . . many
people trace their fear of sharks to the 1975 blockbuster Jaws, whose redolent soundtrack has
become deeply rooted in popular culture . . . [evoking] haunting images of surfacing dorsal fins,
swimmers’ legs underwater, and the histrionic combination of blood and bubbles” (p. 2). Cantor
(2004) echoes these observations and expands the repertoire of films to explore the behavioral
changes and lingering effects of leitmotifs. These lingering effects include sleep disturbances or
anxiety in related waking life situations, such as “difficulty swimming after Jaws (in lakes and pools
as well as the ocean); uneasiness around clowns, televisions, and trees after Poltergeist; avoidance
of camping and the woods following The Blair Witch Project; and anxiety when home alone after
Scream” (p. 283). While leitmotifs embedded in horror films are highly effective and clearest to
unpack in terms of evolutionary psychology, leitmotifs embedded within westerns, romantic com-
edies, or sci-fi films may engender analogously strong psychophysiological responses.
7
ARCHETYPES
We have our tickets for the matinee, but we arrive a few minutes late. In our rush, we take no
notice of the auditorium number—this is a massive multiplex cinema, replaying dozens of
movie favorites from years past, all about to start. We trust, however, that our ears will guide
us to the right auditorium. Walking past the first, we hear Christmas music, with the arche-
typical cowbells yet infused with sinister undertones—it must be Home Alone. Steps later,
Spanish-sounding (Phrygian-mode) melodies on a flamenco guitar—most likely The Mask
of Zorro. Then, Baroque flourishes on a harpsichord supporting a chamber string quartet’s
contrapuntal gestures—Dangerous Liaisons plays here. Further ahead in the hallway, a short,
pungent motif on a reverberant honky-tonk piano blending with sporadic timpani hits, a
muted trumpet, and gunshots—likely a Spaghetti Western or a Tarantino movie borrowing the
distinctive sound. Finally, as we hear synthesizer drones engendering futuristic sonic land-
scapes, we know we have found our auditorium—Blade Runner’s opening credits are rolling.
Film music connotes distinct sociocultural spheres. It embeds musical archetypes that
function as signs, guiding us through a film’s storyworld by revealing the ethnic or social
background of characters, suggesting or supporting narrative developments, setting locale
and time period, and even indicating the genre of films themselves. These signs are called
musical ‘topics’—cultural constructs that transpire through a community of listeners’ musi-
cal discourse.1 Musical topics, however, are not permanent semiotic constructs. As cultural
units of meaning, they emerge, change, and vanish with the constant flow and transforma-
tion of compositional and film-scoring practices.2 Therefore, in this chapter, we explore film
music topics from two interdependent perspectives: first, from a synchronic perspective,
describing their most common functions within a film at a single point in time from our
vantage point today; and then, from a diachronic perspective, tracing a single topic’s devel-
opment and transformation within U.S. films over nearly a century.
DOI: 10.4324/9780429504457-8
of film music topics: to set locale and time period, to support character construction and
development, to add subtext to the story, and to indicate genre.
FIGURE 7.1A Dangerous Liaisons. Main title music. (Music by George Fenton.)
FIGURE 7.2A The Last Emperor. Main title music. (Music by David Byrne, Ryuichi Sakamoto,
Cong Su.)
In some cases, topics can be more suggestive of place than time. The main title theme
of The Last Emperor, for example, introduces an ‘Asian pastoral’ topic. It features an erhu
(a Chinese instrument akin to a violin) performing a pentatonic melody supported by taiko
drums (a percussion instrument indigenous to much of Asia). The instrumentation and pitch
configuration help place the narrative in the context of China. Nevertheless, because these
instruments have been used in Asia for centuries, embedding such topical markers in the
music helps foreground the place, rather than a specific time.
FIGURE 7.3A Home Alone. Main title music. (Music by John Williams.)
Some topics indicate time, but instead of pointing to a time period, they suggest a season.
The main title music in Home Alone, for instance, introduces a ‘Christmas’ topic to situate the
narrative around late December of some undetermined year. The music opens with an innocent
melody in the high register of a celesta outlining a major triad. However, as it adds the arche-
typical sleigh bells and triangle along with high-pitched woodwinds and strings, it swiftly turns
to the parallel minor mode and surreptitiously slinks mischievous string glissandi and descend-
ing chromatic lines reminiscent of Tchaikovsky’s “Dance of the Sugar Plum Fairy”, evoking a
‘wicked Christmas’ topic. Here, the time of the year is more relevant than the specific year.
FIGURE 7.5 15 Minutes. NYPD Detective Eddie Flemming sobering up. [00:04:30]
In contrast, a ‘suave noir detective’ topic introduces world-wise NYPD Detective Eddie
Flemming, first seen sobering up by submerging his head in ice water. The music, a smooth-
jazz quasi-improvised piece in Dorian mode reminiscent of Miles Davis’s “So What” and
featuring a small ensemble comprising Hammond organ, muted trumpet, electric bass, and
drums, pours a chilled, smooth, vintage veneer over Flemming’s character.
Later, a ‘Miami Vice’-like topic introduces the young New York City fire marshal, Jordy
Warsaw. The music features a metrically offset electric guitar riff over a looped beat-machine
backdrop, imbuing Jordy with youthful urgency and dynamism. Throughout the film,
although the music accompanying each character continues to infuse archetypical topical
markers, it evolves, reflecting the characters’ journeys as they adjust to their new realities.
FIGURE 7.7A ‘Funeral march’ topic in The Three Musketeers. (Music by Herbert Stothart.)
FIGURE 7.7B The Three Musketeers. Lady de Winter walks toward her execution. [01:58:30]
In comedies and parodies, the music often over-intensifies the underlying affect by infus-
ing a topic’s hyperbolic rendition, capturing a topic’s musical essence or most salient features,
effectively using the soundtrack to evoke extra-musical associations. A scene from Meet the
Parents, for example, portrays Greg Focker as a hero by introducing a version of the ‘heroic’
topic in the music—characterized by the use of trumpet and French horns, marching tempo,
duple meter, periodic phrase structure, harmonic progressions based on diatonic triads in
root position, profuse melodic intervallic motions in perfect 4ths and 5ths, and kettledrums or
cymbals on selected downbeats.5 The ‘heroic’ topic in this scene, particularly its hyperbolic
rendition, functions not as character construction but as character representation to support
Greg’s short-lived heroic narrative—an equally epic downfall will soon follow his grand farce.6
FIGURE 7.8A ‘Heroic’ topic in Meet the Parents. (Music by Randy Newman.)
FIGURE 7.8B Meet the Parents. Greg arrives victorious with the cat. [01:13:10]
Genre Indicator
Woven through the musical fabric of main title themes, topics help filmmakers (and viewers)
situate a film within a particular genre or sub-genre. Although main title themes are
often extended pieces of music, by drawing on topics, they instantly convey the type of film
about to unfold—sci-fi, horror, comedy, noir, or any other genre. This subsection explores
the relatively stable ‘Spaghetti Western’ topic, which has maintained its semiotic currency
for decades, outlasting the genre itself.
The origin of the ‘Spaghetti Western’ topic can be traced back to Ennio Morricone’s film
scores and his collaborations with director Sergio Leone. In particular, his main theme for The
Good, the Bad and the Ugly includes topic-defining markers: galloping rhythms rendered
in the percussion or strummed guitars, Aeolian mode, whistling or animal howls (e.g., the
‘coyote’ motif), short phrases with relatively lengthy pauses, ‘twangy’ guitar, harmonica, bass
ocarina, soprano recorder, and strings or choir background textures.7 In addition, sonic mark-
ers providing a stylized rendition of on-screen environments—such as long reverbs suggestive
of vast, empty, open spaces, and close microphone techniques matching the extreme close-
ups—helped define the archetypical sound of the ‘Spaghetti Western’ topic.
FIGURE 7.9A The Good, the Bad and the Ugly. Main title music. (Music by Ennio Morricone.)
FIGURE 7.9B The Good, the Bad and the Ugly. Main title sequence.
The main title theme from The Mandalorian borrows a great number of markers of the
‘Spaghetti Western’ topic: it opens with a solo ocarina ‘howling’ in its mid register and a
short melodic figure reminiscent of The Good, the Bad and the Ugly, all immersed within
a reverb and echo suggestive of a large space. Moments later, a synthesized animal cry and
a bell-like hit preface galloping rhythms in the drums, which serve as accompaniment to
an Aeolian-mode call-and-response melodic exchange between acoustic guitar and low
brass. Toward the middle of the theme, as the galloping accompaniment continues, a
brief interlude introduces a whistling timbre resembling a Theremin, supplying a futuristic
aura. Although this TV show is set in the Star Wars storyworld, ostensibly science fiction,
these topical markers narrow and (re)direct our expectations: we
will most likely expect a lone, possibly flawed (anti-)hero with an idiosyncratic moral code.
As a result, by weaving archetypical topical markers through the fabric of the main theme,
the filmmakers firmly situate The Mandalorian within the Spaghetti Western genre’s legacy.
Diachronic Perspective
While from a synchronic perspective topics seem stable and permanent units of meaning—
featuring little or no deviation in their archetypical markers—a diachronic perspective
reveals that they are fluid constructs: new topics emerge while existing ones change their
markers, fuse with other existing topics, or vanish altogether. In this second part of the
chapter, an abridged diachronic analysis of the ‘superhero’ topic permeating a wide
range of superhero films’ main title themes reveals significant changes in the topic’s markers
over time.8
FIGURE 7.12A Captain Blood (1935). Main title music. (Music by Erich W. Korngold.)
FIGURE 7.13A Superman (1941). Main title music. (Music by Winston Sharples & Sammy Timberg.)
gestures and figures gain cultural currency, becoming the new signifiers of the ‘superhero’
topic.
In the 1930s and 1940s, while searching for musical gestures that would support and
help construct the superhero persona, film composers drew on established musical tra-
ditions (opera, concert music). They infused superhero themes with the post-romantic
sound of epic orchestral forces, sweeping melodic gestures, and other archetypical fea-
tures of the previously established ‘military’ topic—duple meter, brisk tempo, major
mode, brass and percussion timbres, diatonic progressions, triadic chordal structures,
periodic phrases. The main theme for Captain Blood (1935), for example, emerges as
an early instance of a topically defining theme, featuring a major-mode trumpet fanfare
with profuse close-voiced triadic structures and stable periodic phrasing typical of the
‘military’ topic. (See Figure 7.12A.)
The theme for the Superman (1941) animated cartoon series adopts the trumpet fanfare
and triadic gestures associated with the ‘military’ and (at the time) ‘superhero’ topic but
detours into non-diatonic regions via a chromatic ♭VII harmony, which one may read as
alluding to the ‘superhuman’ powers of the character, a characteristic absent in ‘human’
heroes.11 The music thus absorbs and crystalizes the superpowers of the characters it repre-
sents so as to transcend the familiar diatonic boundaries. (See Figure 7.13A.)
In the 1960s, the Batman TV series theme illustrates a marked shift away from the epic
musical flair characteristic of earlier superhero screen renditions. Because the series fit-
ted seamlessly within a camp film noir narrative—with intricate plots that revolved around
assaults, thefts, murders, and moral corruption—it seems suitable that its main theme
FIGURE 7.14A Batman TV series (1960s). Main title music. (Music by Neal Hefti.)
borrowed elements from noir soundtracks, especially those by Henry Mancini (Peter Gunn,
The Pink Panther), Count Basie (M Squad), and Monty Norman (Dr. No, the James Bond
theme).12 After a fleeting ‘fanfare’ gesture in the trumpets, reminiscent of previous super-
hero themes, the Batman TV series theme outlines a twelve-bar blues progression (with its
distinctive extended harmonic language), features a small-band ensemble highlighting a
twangy guitar performing a swirling riff around #4̂, introduces vocals replicating the guitar
riff with the catchy ‘na na na na na na na na . . . Batman!’, and embeds sonic markers that
emulate the visual expletives emblematic of comic books—‘SOCK!’, ‘POW!’, ‘ZOK!’. Here
the quasi-noir sound and the sonic markers contribute to the TV show’s tongue-in-cheek
rendition of its comic book counterpart. (See Figure 7.14A.)
The shift away from the grand symphonic sound extended well into the 1970s, with the
main theme for the Wonder Woman TV film (and later the TV series) following suit—featuring
a Boogie-Woogie accompaniment, big-band instrumentation, and pop vocals.13 Neverthe-
less, a brief reference to the prior tradition of superhero themes remains in the form of a short
brass fanfare. In the film, a U.S. Air Force plane crashes on an uncharted island inhabited by
Amazon women, one of whom, Diana, brings the pilot back to the U.S. and helps battle the
Third Reich. Diana strives to adapt to the American culture, wearing ordinary clothes and
getting conventional jobs (e.g., nurse, secretary). In fact, when she takes on a superhero role
(as Wonder Woman), she reveals her skimpy, star-spangled armor and her special powers.
By appropriating a big-band, iconic American sound, Wonder Woman’s main theme helps
her fit in and become an American cultural icon. (See Figure 7.15A.)
FIGURE 7.15A Wonder Woman TV series (1970s). Main title music. (Music by Charles Fox.)
The release of Superman: The Movie in 1978 marked a return to orchestral superhero
themes infused with the ‘military’ topic, including markers such as major key, brass timbres,
duple meters, and fanfare-like gestures.14 Its harmonic stability notwithstanding, and just like
its 1941 counterpart, the theme for Superman (1978) detours into non-diatonic harmonic
regions. Here, a non-diatonic harmonic gesture now known as ‘Aeolian cadence’ (♭VI →
♭VII → I) becomes intricately linked to the character’s heroic actions. Since then, this musi-
cal gesture has been highly influential, emerging as a distinct marker of the ‘superhero’
topic, reliably found in the music of more recent films, such as in Captain America (2011).15
FIGURE 7.16A Superman (1978). Main title music. (Music by John Williams.)
Batman (1989) once more brought a new bent to the musical signatures of the superhero
topic. Although Batman’s main theme maintained the brass-heavy orchestration and the
militaristic gestures, it also introduced a darkly minor mode and a disquieting melodic
gesture [♭6̂ → 5̂ → #4̂] carrying devious and tragic resonances.16 Additionally, the theme
often interweaves the ‘uncanny’ topic—characterized by oscillating chromatic mediants,
particularly including minor chords—bestowing the character with an ominous aura.17 The
film begins on a dreary night in Gotham City as the Wayne family exits a movie theater
and ventures down a poorly lit alleyway. Emerging from the shadows, a man wielding a
gun demands their valuables, and the Waynes submit without resistance. A high-angle shot
captures a fleeting shadow. To confirm Batman’s looming presence, the music swells with
his signature theme, blending a chromatic mediant relationship (Cm → A♭m) to evoke dark
and uncanny resonances synonymous with Batman’s nocturnal persona.
FIGURE 7.19A Wonder Woman (2017). Main-on-end title music. (Music by Rupert Gregson-
Williams.)
Coda
Just as cultures rely on archetypes to define their identities and enable communication, films
strongly depend on musical topics, culturally shared musical symbols, to supply information
about the film, the characters, the setting, or the narrative. In this chapter, we explored
film music topics from two complementary perspectives: synchronic and diachronic. A syn-
chronic perspective allowed us to distill the phonetic and syntactical structures on the
music’s surface that denote specific extramusical meanings. In turn, a diachronic perspec-
tive allowed us to unearth old topics and trace their influence in (trans)forming new ones,
while reminding us of a topic’s (im)permanence as a unit of meaning permeating filmgoers’
cultures.19
As we walk out of the multiplex, more recent films are about to begin. We first hear
a catchy pentatonic motif on exotic zither-sounding instruments (yangqin and guzheng)
punctuated with emphatic percussion (tanggu, paigu, and bangu drums)—musical markers
evoking an ‘Asian’ topic. Then, symphonic forces join in, with proud strings on a powerfully
tenacious minor-mode melody and heroic brass outlining triadic progressions that project
an unyielding harmonic stability delicately decorated with exotic chromaticism—musical
markers denoting a contemporary ‘superhero’ topic. Such an exquisite fusion of topical
ethos suggests that Shang-Chi and the Legend of the Ten Rings plays in this auditorium. We
can hardly wait to come back for more!
Notes
1. As signs, musical topics are akin to leitmotifs. However, whereas a leitmotif’s signified remains
exclusively within a film’s storyworld, a topic’s signified resides outside a film’s storyworld, in the
collective subconscious shaped by soundtracks to multiple films.
2. Within a community of listeners sharing common musical codes, topics are effective means for
conveying meaning; outside of such a community of listeners, however, topics may lose their
semiotic currency and even go unnoticed. For an in-depth discussion of the notion of archetypes,
particularly musical topics, see Appendix VI.
3. Examples of the latter include Back to the Future (1985), Jumper (2008), Hot Tub Time Machine
(2010), Looper (2012), and About Time (2013).
4. Particularly relevant examples include Chopin’s Piano Sonata No. 2 (third movement), Beethov-
en’s Eroica Symphony (second movement), and Mendelssohn’s Song Without Words Op. 62,
No. 3.
5. Although the sociocultural associations corresponding to this topic seem grounded in a symbolic
relationship (i.e., based on convention), these rest on both iconic (i.e., through resemblance) and
indexical (i.e., through proximity) relationships—for instance, the duple meter functions iconi-
cally, mapping the physicality of marching, while the sound of kettledrums and trumpets functions
indexically, indicating early battlefields.
6. See Chapter 9 for an in-depth discussion of this scene through the lens of categorization.
7. Leinberger (2004) offers a fascinating, in-depth study of Morricone’s score for The Good, the Bad
and the Ugly.
8. Although we attend here to the diachronic fluidity of the ‘superhero’ topic by focusing on new
markers introduced over time, one may understand much about this and other topics by attending
to their invariant markers—that is, attending to the consistent musical features across diachroni-
cally different versions of a topic. Additionally, one may construct a hierarchy of topical stability
by observing and computing degrees of diachronic difference—such a hierarchy may convey
significant insights into the evolution of musical signs.
9. Buhler (2016) highlights the influential nature of music media in the realm of franchising, as it
expands its reach to various platforms such as television, video games, books, amusement park
rides, and websites.
10. Young (2013) identifies a blend of the ‘military’ and the ‘fantasia’ topics as the core sub-components
of a ‘superhero’ topic and traces the origin of this blend to early sound film: signifiers of the
‘military’ topic include march (duple) meters, major-mode (triadic) fanfares, and brass
instrumentation, while signifiers of the ‘fantasia’ topic include harp glissandi and pantriadic
chromaticism. In particular, pantriadic chromaticism seamlessly aligns with the elicited associations
of the supernatural (and superheroes) by fostering a musical environment devoid of tonal forces
via a “thorough negation of tonal norms of centricity, diatonicity, and functionality” (Cohn, 2012,
p. xiv).
11. Young (2013) notes that “heroes such as Captain Blood, Robin Hood, and Zorro possess no ex-
traordinary powers outside of prodigious skill with a sword—in short, they could exist in reality.
However, heroes like Superman and Captain Marvel have abilities that are superhuman or god-
like, demanding that they can only exist within a fantasy world” (p. 111).
12. The Batman character aligns with the three types of film noir ‘heroes’ identified in the literature:
the ‘seeker-hero’, whose “investigation takes the form of a quest into a dangerous and threaten-
ing world: the noir world” (Walker, 1992, p. 10); the ‘victim hero’, who is the “passive subject of
investigation tested by threats to his masculinity and individual autonomy” (Mason, 2011, p. 138);
and the ‘amnesiac-hero’, who “becomes a victim of a violent and hostile world and who lives in
fear” (Walker, 1992, p. 15).
13. While the story for the first season of this series unfolded during World War II, the stories for sub-
sequent seasons unfolded during the 1970s. Correspondingly, the lyrics of the TV series’ theme
song, with its World War II references, were omitted after the first season.
14. Halfyard (2013) argues that “what gives [this] formula its heroic character is largely the harmonic
stability created by the use of tonic-dominant harmony and corresponding melodic intervals such
as prominent open fifths, alongside the energetic character of the march rhythms . . . The punchy,
confident, militaristic scoring” (p. 172).
15. While in the Super Mario Bros. (1985) video game an Aeolian cadence indicates the successful
completion of a task, the subsequent film, Super Mario Bros. (1993), embeds this harmonic gesture in
its main title theme. The Aeolian cadence also often appears in songs associated with superhero
films, such as Nickelback’s “Hero” for Spider-Man (2002) and Seal’s “Kiss from a Rose” for Batman
Forever (1995).
16. Donnelly (1998) construes Batman’s theme as “pure Gothic melodrama, using a large, dark and
Wagnerian orchestral sound” (p. 148), and Young (2013) reads it as a “[shift] from a strictly heroic
style to one of the tragic heroic” (p. 105). Halfyard (2013) compares Batman’s and Superman’s
themes and notes that Batman’s “minor-key theme draws on horror-movie musical tropes as much
as superheroic ones, and the score substitutes constant harmonic slippage for Superman’s diatonic
stability” (p. 175).
17. Bribitzer-Stull (2015) notes this chromatic-mediant oscillation in Richard Wagner’s Der Ring des
Nibelungen, and suggests it denotes “mystery, dark magic, the eldritch, and the otherworldly”
(p. 144). In tracing the affective impact of such chromatic-mediant oscillations, Lehman (2018)
suggests that “the characteristic semitonal displacements of the pillars 1̂ and 5̂ of the home triad,
pitches that are flayed outward in opposite directions, as though being tugged by invisible tonal
tendrils of ill intent” (p. 101). Heine (2018) notes that this progression is so “strongly uncanny”
that “only a sorcerer could conjure up such unnatural harmonies” (p. 122). Moreover, Huron
(2006) conducted a series of experiments in which participants identified the qualia for chromatic
mediants; the reported phenomenal responses (or qualia) for this chromatic mediant relationship
include “mysterious, cheerless, somber, dark, tragic, despairing, death, depressed” (p. 273). Half-
yard (2004) offers a fascinating in-depth study of Elfman’s score for Batman.
18. Huron (2006) notes that scale degrees (including chromatic ones) elicit uniquely different phenome-
nal responses, which we conceptualize using descriptors known as ‘qualia’. He reports that the raised
subdominant (#4̂) evokes qualia that include “intentional, motivated . . . moderately anxious . . .
curious about possibilities” (p. 145).
19. Bourne (2021) suggests a shift in directionality between concert and film music, proposing that
contemporary listeners draw upon their familiarity with film music to understand Western art
music.
8
ASSOCIATIONS
In The Conversation, the diegetic music surreptitiously infiltrates the narrative and sublimi-
nally informs our interpretations. After a tense day at work, Harry, his male colleagues, and
their new female acquaintances drive to a warehouse for a spur-of-the-moment party. Not
comfortable in large gatherings, Harry seeks comfort in Meredith, a young woman he just
met. Meredith cuddles close to Harry as she dances him away from the crowd. As they drift
to a secluded area, an instrumental version of Duke Ellington’s “Sophisticated Lady” softly
reverberates through the warehouse. They dance intimately for a while. At ease, Harry and
Meredith open up about their lives as the song continues distantly in the background. Dur-
ing this moment of tender intimacy, the diegetic music reveals facets of her personality:
the song’s associations impinge upon our (and possibly Harry’s) perception of Meredith—a
sophisticated lady, “smoking, drinking, never thinking of tomorrow”.
Film music draws from deep within our personal associations to embed subtextual com-
mentary, eliciting uniquely individualized readings of a single scene or an entire film. Con-
ceptual Blending, a framework developed by Fauconnier and Turner (2002), is exceptionally
suitable for fleshing out potential projections between the music’s associations and a film’s
dramatic development that elicit those readings. In this chapter, we draw on this frame-
work’s multiple-space model to explore instances in which existing songs’ or concert pieces’
contextual associations (even if distinctly subjective) contribute to a film’s meaning with
subtexts that inform our interpretations.
Songs
Filmmakers recognize the potential of introducing songs to supply an additional layer of
meaning that supports and comments on the main narrative. Often, songs (even instrumen-
tal versions) from different styles, eras, and languages strategically align with the narrative
events. At these moments in a film, we draw on a song’s associations (elicited by its lyrics
or title) and project these onto the film’s narrative to construct an interpretation.
Lars and the Real Girl explores the nature and reality of love. Lars, a quirky young man
conflicted by feelings of love and loss, develops a romantic (non-sexual) bond with Bianca,
DOI: 10.4324/9780429504457-9
FIGURE 8.1A Lars and the Real Girl. Lars serenades his beloved. [00:42:40]
a realistic sex doll, forging an imaginary relationship out of love and loneliness that thrives
on pseudo-dialogue and a pretense of tolerance. In the Midwest woodlands, placidly rest-
ing in a treehouse, Lars sings Nat King Cole’s classic song “L-O-V-E” to Bianca: “L is for the
way you look at me, O is for the only one I see, V is very very extraordinary, E is even more
than anyone that you adore can . . .” He continues singing, “Love is all that I can give to you,
Love is more than just a game for two, Two in love can make it”, and abruptly stops to sug-
gest, “You can watch me chop wood, too. I’m really good at it.”1 As the conceptual blend in
Figure 8.1 outlines, the associations brought about by the song’s lyrics inform our interpreta-
tion of the film.2 The first stanza reveals Lars’s delusional version of love, one that flouts the
reality of Bianca’s inanimacy—by switching between chest voice and falsetto, he simulates
a feminine timbre, projecting onto Bianca a (false) voice with which she appears to join him
in a duet. In turn, Lars’s singing only the first three lines of the second stanza helps delineate
the film’s narrative trajectory: love is all that Lars can give to Bianca; his love is more than
just a game and one that extends beyond the two of them to involve Lars’s family and the
entire town; and these feelings, however delusional, help Lars make it, as everyone around
him grows more human by accepting Bianca, helping Lars overcome his trauma.
Compilation scores saturate the soundtrack with pre-existing popular songs. This practice
emerged as a marketing strategy, initially placing a catchy feature song at the film’s begin-
ning or end.3 Then, during the 1980s, the financially advantageous idea of selling a movie
and a soundtrack prompted filmmakers to introduce many popular songs as part of the
non-diegetic (rather than diegetic) soundtrack, songs that (often) bore little or no relation to
the events in the film.4 Nevertheless, this practice opened new dimensions in the general
conception of film music; in the hands of thoughtful music supervisors, it
enhances a film’s story through fitting and ingenious song placement.
Fools Rush In is a romantic comedy in which dozens of songs from different cultural
traditions heighten the marked differences in the protagonists’ backgrounds.
Alex, a New York nightclub designer, meets Isabel, a beautiful young Mexican photog-
rapher. Their one-night stand results in Isabel’s unplanned pregnancy. In a scene toward
the beginning, Isabel unexpectedly reappears at Alex’s doorstep, three months after
their romantic encounter. Alex arrives home and exclaims in surprise, “Isabel!” She
concedes, “You remembered. Well, I didn’t know what to do. I never did anything like
that before, going home with someone I don’t know.” In an effort to comfort her, Alex
adds in a scattered voice, “Hey, you and me, both. It was just one of those great, phe-
nomenal, spontaneous things.” An awkward silence sets in. To make conversation, Alex
asks, “So, how you been?” to which Isabel unhesitatingly responds, “Pregnant.”
Alex is stunned by the news. His reaction upsets Isabel, who drives away in a huff. As
Alex follows her, the song “Para Dónde Vas?” [“Where Are You Going?”] emerges in the
soundtrack. At a surface level, the lyrics speak Alex’s mind as he follows Isabel—we
may read “Para dónde vas, muchacha?” as “Where are you going, Isabel?” At a deeper
level, the conceptual blend in Figure 8.2 illustrates how this Spanish song can be heard
FIGURE 8.2A Fools Rush In. The Iguanas’ “Para Dónde Vas?” begins to play. [00:20:30]
as Alex attempting to speak to Isabel in her language, acknowledging her culture and
his willingness to engage with it, seeking to bridge the gap between them and trying to
repair the damage caused by his initial reaction.
Whereas in the preceding examples the soundtrack foregrounds the lyrics, films often
feature instrumental renditions of songs that exclude the lyrics. In such cases,
the potential projections we form between the music and the narrative rest on our recogni-
tion of the song and prior knowledge of its lyrics or title.
In the provocative revenge thriller Promising Young Woman, numerous songs, some
transformed beyond recognition, supply subtexts that (re)define the characters. Cassie,
a brilliant medical student, drops out of school as she struggles emotionally and psy-
chologically after her best friend’s rape and subsequent suicide. By day, she works at
a coffee shop; by night, she reclaims her agency, asserting her power as a woman by
adopting an alter ego to confront male predators. Toward the film’s end, Cassie plans to
infiltrate the bachelor party of the men who had assaulted her friend. It is early in the
evening. Cassie pulls up on a deserted dirt road in the woods and adds finishing touches
to her makeup. Heavy black mascara and eyeliner magnify her dark eyes against the
bleach-blonde wig, framed by large dangling silver earrings and a nurse’s cap. She is
barely recognizable. In the music, an exotic, tense, ill-fated melodic figure outlining a
tritone (F#, D, E♭, C) brings veiled yet lethal undertones. She steps out of the car, tosses
the plate into the shrubs, grabs a nurse’s bag and red high-heel shoes from the trunk, and
slowly strides down the road and uphill toward an isolated cabin in the woods. Now,
the music’s minor mode, sinking string sweeps, and heavy pace suggest a somber march
topic, escorting her to the danger zone, foreshadowing her fate. As we entrain to the
music’s meter and subvocalize its deviant glissandi, we realize this is Britney Spears’s
“Toxic” in disguise.5 Cassie arrives at the cabin, puts on her high heels, and rings the
bell—she is about to penetrate a dangerous space. On the downbeat, one guy opens the
door to the testosterone-filled cabin and announces, “Yes! The doctor is in the house!”6
This visual and musical foreplay reveals a cluster of discourses informed by the song’s
title and its arrangement, mapping a plethora of associations and attributes onto the
film’s narrative: the disguised song underscores Cassie’s transformation into her alter ego
to infiltrate the bachelor party disguised as a stripper, yet her reaction to the male threat
manifests itself through an augmented sexual agency that becomes just as deviant, just
as toxic.
FIGURE 8.3A Promising Young Woman. Cassie arrives at the cabin. [01:22:15]
Concert Pieces
Selective projections between the music and narrative spaces prompt us to construct infer-
ences, arguments, and interpretations. Such selective projections extend beyond those sug-
gested by the lyrics: a piece’s program (preconceived narrative) or compositional milieu
(place of composition, era, style) allows directors to convey vital information or supply a
subtext critical to our understanding of a film’s narrative.
The plot of the futuristic action thriller Minority Report revolves around a precrime unit in
the police force. With the aid of three precogs (i.e., psychics), the police gather visual infor-
mation and preempt future violent crimes. John Anderton, Chief of the Precrime Unit, enters
the analytical room and takes off his coat. Jad, the main dispatcher, briefs Anderton, “We got
a white male, about five-eight, approximately one-forty.” Anderton prepares to manipulate
the visuals transmitted by the precogs to find information that would reveal the time and
place of the crime and the offender’s identity. He puts on his finger gloves and inserts a disc
into a slot. Schubert’s Unfinished Symphony begins to play. Jad shares, “I love this part”, as
he watches Anderton’s conductor-like technique for parsing information, ordering the scat-
tered images by skillfully placing some in the foreground, some in the background. Anderton
zooms in on one image to get a clearer picture of the gunman: it is Anderton himself. He
abruptly stops the session; the music stops too. Introducing Schubert’s Unfinished Symphony
prompts us to draw a metaphorical parallel between the music and the narrative—as its name
implies, Schubert never completed this piece, which foreshadows a critical plot point later
in the film. The conceptual blend in Figure 8.4 reveals the link between the music’s connotations
and the plot: at the surface level, Anderton’s skillful ‘conducting’ will help the police force
prevent crimes from coming to fruition; and at a deeper level, within the broader context of
the film, Schubert’s Unfinished Symphony, a famously incomplete piece from the distant past,
may be heard as a reflection of Anderton’s grappling with unresolved trauma from his past,
particularly his failed attempts to prevent the abduction of his son.
Often, when a piece belongs to an opera or dance suite, the associated narrative or pro-
gram of the larger work impinges upon our interpretation. Lord of War offers a fictional yet
incredibly realistic glimpse into the end of the Cold War and the emergence of worldwide
terrorism. The film follows smuggler Yuri Orlov unscrupulously supplying illegal weapons to
emerging world powers. He titles himself a ‘Lord of War’. In an immorally sublime moment,
Yuri admires an AK-47. To the mesmerizing sounds of Tchaikovsky’s “Swan Lake”, the cam-
era zooms into and dances around the weapon—its chromed barrel, its elegant 30-round
curved magazine. The correlations between the narrative and the music extend beyond
the Russian origin of the machine gun and the musical piece, juxtaposing beauty and evil,
disguising nefariousness in refinement. In Tchaikovsky’s ballet, Prince Siegfried falls in love
with Odette, a young woman cast under the spell of an evil sorcerer who turned her into a
beautiful white swan. To break the spell, Siegfried must pledge his love for Odette. But one
night, at a social function at Siegfried’s castle, the magician appears in disguise with his
wicked daughter Odile, who looks much like Odette yet wears a black dress. Deceived by
the similarity, Siegfried mistakenly swears eternal love to Odile. The character associations
triggered by Tchaikovsky’s work allow us to understand the scene in a new light. The con-
ceptual blend in Figure 8.5 reveals the link between the music’s connotations and the plot:
‘Lord of War’ becomes mesmerized by the deadly black AK-47 machine gun.
In some unique cases, the associations that stem from a piece’s formal structure inform
our interpretations of an entire film’s narrative structure. In the political thriller The Lives
of Others, a piece titled “Sonata for a Good Man” becomes an important plot point. A
“sonata” form in music provides the composer with a large-scale design with dramatic
potential; it contains the three sections characteristic of rhetorical structures: exposition,
development, and (transformed) recapitulation.7 A scene early in the film shows Georg
and his fiancée Christa sharing some intimate moments after his birthday party. He briefly
looks at one present, a score of the “Sonata for a Good Man”. They are not aware, how-
ever, that Hauptmann, a GDR officer codenamed HGW XX/7, is keeping them under
close surveillance. As Hauptmann observes the day-to-day life of Georg and the anti-
GDR group, he begins to question his own ideology. As the story unfolds, he embarks
on a radical transformation, sympathizing more and more with the anti-GDR group. The
last scene of the film shows a profoundly changed Hauptmann. He no longer works as
a secret agent and no longer shares the GDR’s ideology. Moreover, the dedication of
Georg’s novel to Hauptmann suggests a complete disengagement from his previous GDR
persona and his embracing of a greater concern for the human condition. The plot of
The Lives of Others thus closely maps onto a sonata’s formal structure: it presents two main
characters with contrasting ideologies; subsequently, it sets up a confrontation and strug-
gle between these two ideologies; and ultimately, it presents once more the two main
characters, but one has undergone a radical transformation, assimilating the values and
perspectives of the other.
FIGURE 8.6A The Lives of Others. Georg unwraps the score for the “Sonata for a Good Man”.
[00:34:20]
FIGURE 8.6C The Lives of Others. Georg’s book, “Sonata for a Good Man”, catches Hauptmann’s
attention. [02:11:30]
Coda
During a film, a network of musical, (con)textual, and personal associations influences our
reading of a scene. Conceptual Blending offers a robust analytical framework to explore the
projections we construct between the music and other cinematic domains, revealing (and
reconstructing) the cognitive processes that prompt us to generate meanings and interpreta-
tions about music’s place within a film. In instances where the music does not offer new
information but rather echoes the other input spaces, blending allows us to focus on the
elements present in the input spaces by accentuating subtle differences.8
The Conversation comments on the emerging social tensions between the public and
the private, articulating a rendition of a collective reality by extrapolating the inner psycho-
logical journey of an individual. Befitting the film’s themes of secrecy and wiretapping, the
soundtrack echoes the film’s narrative without perceptible authorial intrusion: while the plot
portrays a transgression of the protagonist’s private space, the soundtrack maps this trans-
gression onto the aural boundaries, tapping into our personal, private musical associations.
Notes
1. Lars halts his singing halfway through the stanza. He omits the last two lines of the stanza, “Take
my heart and please don’t break it, Love was made for me and you”, which deviate from the film’s
primary plot points—these lines point to the very agency that Bianca lacks, suggesting that love was
not intended to include inanimate objects.
2. In this example, the vital relationship of ‘disanalogy’ links corresponding elements across input
spaces. Therefore, running the blend highlights not the similarities, but the differences. For an in-
depth discussion of Conceptual Blending, see Appendix VII.
3. Addressing that trend, Bazelon (1975) mentions that “it does not seem to matter that the theme
tunes have little relevance to the film’s dramatic context . . . . Usually placed at the beginning as a
title song but occasionally at the end . . . the songs cash in on today’s fast-changing market, osten-
sibly giving pictures a gilt-edged frame of catchiness” (p. 30).
4. Criticism has been directed not only at the obvious financial motivations of a compilation score but
also at its artistic function within the film. Bell (1994) notes that songs are “being purchased and
placed in films, not for artistic reasons, but because they might sell more soundtrack records/CDs”
(p. 66), and warns us that “the incorrect use of songs endangers the cohesiveness of film art. Instead
of a two-hour dramatic statement, motion pictures often become bits of plot interspersed between
MTV-like music videos” (p. 67).
5. Entraining to Cassie’s actions (by entraining to the music’s meter) affords us an opportunity to
embody her steady (yet ill-omened) march and further identify with her.
6. At the cadence point, the final chord (the unstable dominant-seventh of the minor mode) opens up
the space for the events about to unfold. Some listeners may hear the sound of the opening door as
a downbeat that brings a level of resolution to the music via a sound design transference, one that
transfers the agency from Cassie to the men in the house.
7. In the exposition, the music presents two contrasting themes—the first in the home key and the
second in a contrasting key. In the development, it musically explores and confronts these themes.
In the recapitulation, the music once more presents both themes, but the second one, now in the
home key, has undergone a transformation to conform to, and arguably assimilate, the qualities of
the first theme.
8. Elsewhere (Chattah, 2008), I propose a preliminary model for analyzing irony in film music based
on a modified version of Fauconnier and Turner’s framework. Within this model, interpretations of
irony emerge in a blended conceptual space and stem from cross-domain projections based on
vital relations of ‘incongruity’ or ‘opposition’ between input spaces. However, this model presents
a critical limitation: its inability to parse out various types of irony or to distinguish tropes closely
related to irony. Therefore, in Chapter 9, I apply models of categorization to disentangle the tropes
emerging from the deliberate placement of incongruous music in a film.
9
CATEGORIZATION
In Terminator 2: Judgment Day, the music often supplies contradictory subtexts. Strange
lightning forms a circular opening in the sky, and a flare of light materializes as a massive,
sculpted, naked body. The Terminator has arrived. It strides impassively into a diner, swive-
ling its head with its characteristic emotionless gaze. Customers freeze. It approaches a
rough-looking biker and orders, “I need your clothes, your boots, and your motorcycle.” As
it steps out of the diner, now fully clothed in black leather and heavy boots, George Thoro-
good and The Destroyers’ “Bad to the Bone” roars in the soundtrack. We recognize that the
song functions as an off-screen narrator revealing facets of the character, yet we have seen
the film already and know the Terminator is not human, has no bones, and is not the bad
one here! It is so ironic . . . But is it?
Film music conveys messages that support, complement, or negate the visuals or the dia-
logue. When confronted with conflicting, ambiguous, or incompatible meanings between
the music and other cinematic domains, we generally resort to a chain of inferences suggest-
ing ‘irony’. Indeed, a key component in irony is the presence of incongruity; but a detailed
examination of these film moments reveals that tropes closely related to irony (such as
parody, satire, or paradox) may instead be at play. In this chapter, we draw on probabilistic
models of categorization and the notion of second-order inferences to construct a frame-
work for reevaluating our interpretations of incongruity in film.
DOI: 10.4324/9780429504457-10
126 Categorization
such as incongruity (conflicting presentation between the music and any other concep-
tual domain), intertextuality (borrowing from another film or another genre), hyperbole
(out-of-context exaggeration), grotesque (distortion exposing peculiar characteristics), and
cinematic anomaly (divergence from established cinematic conventions).3 First-order inter-
pretations are also contingent upon the attributes’ values, which specify an attribute’s loca-
tion within the cinematic text—for instance, “hyperbole” may be located in the music, the
visuals, or the dialogue. Because we find traces of these attributes in the cinematic text,
first-order interpretations seem objective; but forming first-order interpretations is also con-
tingent upon extra-textual contexts, the discursive practices of interpretative communities,
and the knowledge of culturally established aesthetic and cinematic conventions.
In evaluating first-order interpretations (e.g., degree of hyperbole, nature of perceived
anomalies), an additional set of attributes emerges as second-order interpretations: rever-
sal (indicating a reassessment of semantic value that results in the opposite effect), critical
appraisal (suggesting a target of judgment), focal shift (highlighting a perspective), and hom-
age (celebrating an extra-opus text). These second-order interpretations are again cast as
attributes, each with their corresponding values—for instance, “focal shift” may highlight
the perspective of a character, the audience, or an unseen narrator. In contrast to attributes
related to first-order interpretations, attributes stemming from second-order interpretations
are not in the text, but in the audience’s mind.4
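The attribute-and-value scheme described above can be pictured as a small record. The following Python sketch is only an illustration under stated assumptions: the class name `Reading`, its field names, and the validation step are hypothetical and are not part of the chapter's framework; only the attribute names and the two sample values (hyperbole located in the music, a focal shift toward an unseen narrator) come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Attribute names taken from the chapter; everything else is illustrative.
FIRST_ORDER = ("incongruity", "intertextuality", "hyperbole",
               "grotesque", "cinematic anomaly")
SECOND_ORDER = ("reversal", "critical appraisal", "focal shift", "homage")


@dataclass
class Reading:
    """Hypothetical record of one viewer's reading of a film moment."""
    # First-order attribute -> where it is located in the cinematic text.
    first_order: Dict[str, str] = field(default_factory=dict)
    # Second-order attribute -> its value in the audience's mind.
    second_order: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject attribute names the chapter does not list.
        for name in self.first_order:
            if name not in FIRST_ORDER:
                raise ValueError(f"unknown first-order attribute: {name}")
        for name in self.second_order:
            if name not in SECOND_ORDER:
                raise ValueError(f"unknown second-order attribute: {name}")

    def attributes(self) -> List[str]:
        """Return every attribute present in this reading, sorted by name."""
        return sorted([*self.first_order, *self.second_order])


# Sample values drawn from the chapter's own illustrations.
reading = Reading(
    first_order={"hyperbole": "music"},
    second_order={"focal shift": "unseen narrator"},
)
print(reading.attributes())
```

Keeping the two dictionaries separate mirrors the distinction drawn in the text: first-order values name a location in the film, whereas second-order values describe something the audience supplies.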
Ultimately, both first- and second-order interpretations combine to form a list of attributes,
which are weighted when categorizing film examples.5 As we explore examples and identify
the attributes that contribute to forming interpretations of irony and related tropes,
near-prototypical exemplars begin to emerge.6 In assigning labels to categories, I resort here
to preconceived notions developed outside of cinema and include tropes that suggest irony
or that we commonly associate with (or mistake for) irony:7
• Irony proper
○ Structural (Situational) Irony: Expectations deviating from the state of affairs.
○ Dramatic (Tragic) Irony: Character’s fate revealed to the audience but unknown to the
character.
○ Socratic Irony: Message of praise to imply blame, or of blame to imply praise.
• Tropes related to irony
○ Sarcasm: Disguised contempt or criticism toward an individual.
○ Satire: Disguised contempt or criticism toward a contextual situation or social
dynamics.
○ Parody: Exaggeration or decontextualization (of a style or genre) for comic effect.8
• Distantly related tropes commonly mistaken for irony
○ Quotation: Inter-textual reference.
○ Paradox: Self-contradictory statement.
○ Lie: Untruthful and deceiving statement.
FIGURE 9.3A Con Air. Inmates dance and sing as they escape. [01:29:30]
indicator of ironic intent. In Con Air, several notoriously violent ex-convicts take control
of a plane and escape. Their celebration involves dancing and singing to “Sweet Home
Alabama”, but most of the characters are not aware of the tragic (rather than joyful) conno-
tations of the music. In a somewhat subdued voice, the more intellectual of the criminals,
a serial killer, quietly utters, “Define irony: Bunch of idiots dancing on a plane to a song
made famous by a band that died in a plane crash.” The plane ultimately crashes, and eve-
ryone but the serial killer dies in the accident. Irony thus plays on a disjunction between
the characters’ and the audience’s points of view, with a flagrant incongruity between
the song’s connotations and the convicts’ state of happiness. The reversal, however, takes
place in the audience’s realization of the convicts’ likely fate. As Figure 9.3B shows, this
is an example of ‘dramatic irony’ (often called ‘tragic irony’), where the audience knows
what the character has yet to find out.
Stereotyped characters commonly share their looks, general behavior, and even musi-
cal taste. White Chicks brings stereotypes to the foreground only to deconstruct them for
humorous effect. As a group of young white girls rides in a convertible through an upper-
class neighborhood, the radio commentator announces, “And now, the number one most
requested song at WQQR”, and Vanessa Carlton’s “A Thousand Miles” begins to play. The
girls react with great excitement, “This is our jam!”, and start singing along. In a subse-
quent scene, an undercover black cop disguised as a white girl, Tiffany Wilson, joins a
masculine black bodybuilder, Latrell Spencer, for an after-party ride. As ‘she’ recognizes
Latrell’s intimate intentions, she attempts to establish some distance by playing “A Thou-
sand Miles” on the car stereo, a song that would turn him off. To her surprise, Latrell
exclaims, “I love this song!” He starts singing along, shaking vigorously during the orches-
tral riffs and impersonating the lyrics. By exposing his musical taste, he finds his way into
a radically different tradition. The incongruity between Tiffany’s (and our) expectations
regarding Latrell’s musical taste and his (arguably exaggerated) joyful reaction to the song,
combined with his hyperbolic enactment of the lyrics, causes a reversal of character type.
Because this example embeds a fundamental incongruity that reveals a state of affairs dif-
ferent from expected, it belongs to the structural irony category, often called situational
irony. (See Figure 9.4B.)
Superstar portrays the school life of Mary Katherine Gallagher, a rather graceless and
mildly hyperactive uniform-wearing Catholic schoolgirl with dreams of superstardom.
During a Roman Catholic service set to Schubert’s peaceful “Ave Maria”, as Mary whim-
pers about her unsympathetic reality, her overenthusiastic friend, Helen, cues a daydream
sequence with the line: “That’s it. You are feeling sad, so you know what it’s time for? Super-
model documentary hour!” Both girls leap into a daydream sequence of ‘superstardom’
as supermodels in a photo-shoot session set to Imani Coppola’s “I’m a Tree”. This dazzling
moment frees them from the characteristic awkwardness that brands them as social outcasts
in their real lives. Yet, they also adopt hyperbolic and grotesque mannerisms that caricature
supermodels, such as the distorted ‘French’ accent Helen enacts when recalling, “I was just
walking down the street one day and a man come up to me and he said, ‘Do you like to be
a supermodel?’, and I said oui, and the next day I’m in New York, on the cover of Vogue.”
This dreamlike sequence is filled with cinematic anomalies: rapid camera movement and
constant white-screen bursts simulating a camera’s flash. The transition from their dreamlike
state to reality is effected by a visual element (a priest unexpectedly appears within the
dream sequence), prompting Mary and Helen to return to their nervous stillness while Mary
murmurs her characteristic ‘Sorry, sorry.’ These cinematic cues serve as indexical signs that
unmask the director’s intention: to target Mary and Helen via a hyperbolic and grotesque
statement of the opposite. As Figure 9.5B shows, the blatant ‘blame-by-praise’ and presence
of a target of ridicule or critique in this example suggest it is an instance of Socratic irony;
and because the target is a character in the film, rather than social norms or the film genre
itself, this example may be further read as an instance of sarcasm.
The drama Precious revolves around a young, black, overweight, sexually and emotion-
ally abused girl. Daydreaming about an alternative life and a different identity are her only
means of coping with the grim reality surrounding her. Leading to one of the many dreamlike
sequences, we see Precious walking through her neighborhood, past a group of harassing
bullies, while some indistinct R&B music plays. One guy pushes Precious to the ground. As
she falls unconscious, the film cuts to a dreamlike sequence: she wears a glamorous dress
and dances on a stage with her imaginary light-skinned boyfriend to Queen Latifah’s “Come
Into My House”. As her (imaginary) boyfriend gets behind her and licks her ear, the film cuts
to reality, and we see a dog licking her ear while she is on the ground regaining conscious-
ness. The cinematic anomalies in the scene—the split-screen montage typical of music vid-
eos, a shift to vivid colors, slow-motion and stop-motion editing—serve as indexical signs
that unmask the director’s intention: targeting Precious via a hyperbolic statement that plays
on a transmuted expression of reality, one of excess and glamour. This cinematic device calls
attention to Precious’s bleak life, which sadly lacks the glamor she craves. Such hyperbolic
and grotesque admiration in the form of ‘blame-by-praise’ is widespread in discursive irony,
tacitly implying the word ‘not’ as a semantic modifier. The presence of a target of critique or
ridicule suggests this is an instance of Socratic irony; and because the target is a character in
the film rather than the film genre itself, we may read this moment as an instance of sarcasm.
(See Figure 9.6B.) However, within the context of the entire film, this moment of sarcasm
serves to project a broader critical judgment on the social dynamics that caused Precious’s
reality, thus resulting in satire.10
Guess Who is a romantic comedy that touches upon racial issues. Simon, a young
white man, plans to marry the daughter of Percy, a protective African-American dad.
After a somewhat heated discussion, Percy decides Simon should stay in a hotel. Unfor-
tunately, the hotel is fully booked, so they drive back to Percy’s home. As they get in the
car, the ending of the song “Ebony and Ivory”, performed by Paul McCartney and Stevie
Wonder, is playing on the radio, and the lyrics, “Ebony, ivory, living in perfect harmony”,
make both Simon and Percy uneasy. The conflict between the associations brought about
by the lyrics of the song (itself a metaphor about the white and black keys of the piano
arranged in perfect harmony) and the events in the narrative (suggesting a fundamental
lack of harmony between Simon and Percy) elicits an ironic reversal of the meaning
of the song. Just as in the last two examples, this ironic reversal is characterized by a
‘blame-by-praise’, suggesting this too is an example of Socratic irony. (See Figure 9.7B.)
In this instance, however, the critique is not directed at a character in the film, but at
FIGURE 9.7A Guess Who. Percy and Simon drive back home. [00:29:45]
the macro-social dynamics and a contextual situation, further suggesting that this is an
instance of satire. Using this rhetorical device, the film shifts its narrative agency from
the characters to a tacit narrator, one that allows for a shift in perspective and a broader
critical appraisal of the social dynamics unfolding in the scene.
A comparable example, but from a very different genre, appears in Michael Moore’s
Bowling for Columbine. This documentary film is a multi-layered examination of the Col-
umbine tragedy and a stark critique of gun ownership in the United States. In a lengthy
montage, accompanied by Louis Armstrong’s “What a Wonderful World”, the film recounts
the United States’ involvement in foreign and military policies that may have led to 9/11.
The coupling of violent visuals with a song whose positive lyrics point to a trouble-free
world illustrates Moore’s signature sense of Socratic irony employed for satirical purposes to
advance a critical appraisal of the social norms and the political climate that led to a collective
state of fear and paranoia. (See Figure 9.8B.)
FIGURE 9.8A Bowling for Columbine. Montage accompanied by “What a Wonderful World”.
[00:24:30]
The comedy Repossessed alludes to an earlier horror film, The Exorcist, by presenting
many inter-textual relations—setting, iconography, character types, and even a common
actress in an identical role (Linda Blair). In a scene, Father Jedediah plans to rid Nancy of
the devil that possesses her. He quietly enters Nancy’s bedroom and slowly approaches
her as she is tied to the bed. As he gets closer and closer to Nancy, the music grows louder
and louder; but he suddenly presses a button on a handheld device and causes the music
to stop. Allowing a character to have diegetic control over the non-diegetic music is an
anomaly that functions as a marker, a punchline that deconstructs the cinematic illusion,
triggering a reversal of effect. As we saw in Chapter 3, an increase in loudness usually trig-
gers tension in the audience and points to an imminent dangerous event. Here, however,
the unusual transgression of sound design boundaries, particularly at a sonically climac-
tic moment, releases the tension and evokes laughter. This example aligns with parody,
combining hyperbole and the grotesque to critically appraise and caricaturize a film (The
Exorcist) and the horror genre writ large. (See Figure 9.9B.) Although there is a reversal of
effect (laughter rather than suspense), there is no Socratic blame-by-praise structure or sig-
nificant incongruity. Hence, this example should not be read as an instance of irony, but as
an instance of parody.
In the comedy Airplane!, a reconfiguration of the symbolic system generates a vacil-
lation between similarity and difference in relation to other films. It opens with a shot of
clouds seen from slightly above, almost resembling a turbulent sea. Jaws’ menacing theme
emerges as the plane’s ‘fin’ breaks through the clouds. As the music builds in intensity, the
plane’s fin gets closer. Ultimately, as the entire plane cuts through the clouds in a sharp
ascent, a shocking dissonant chord in the music functions as an anomaly that punctuates
the scene with a musical gag-line that intensifies the humorous effect, landing a biting
example of parody. (See Figure 9.10B.)
FIGURE 9.10A Airplane! A plane’s ‘fin’ emerges from the clouds. [00:00:05]
In the dark comedy Ted, there are copious references to 1980s films. In a scene filled
with intrepid allusions to Indiana Jones’ musical theme, just as Indy grabs his hat while
making his grand escape, Ted grabs his ‘cloth’ ear before his adventurous flight down the
stairs. Although intertextuality is the most salient feature here, there is no reversal of effect,
no anomaly, and no element of the grotesque; therefore, this moment in Ted engenders a
sense of homage to Indiana Jones and thus would be best categorized as quotation rather
than parody. (See Figure 9.11B.)
Face Off is an action/crime/sci-fi film in which an FBI agent undergoes facial transplant
surgery to assume the identity of a terrorist. In a scene, a child witnesses a violent event.
Attempting to distance the child from this violence, the FBI agent places a set of headphones
over the child’s ears playing a gentle version of “Over the Rainbow”. Colored by the song,
the violence turns into exquisitely choreographed dance-like movements. The incongruity
FIGURE 9.11A Ted. Ted channels his inner Indiana Jones. [01:21:00]
between the music and the suggestion of violence in the visuals and narrative does not trig-
ger an ironic reading of the scene, but instead, the scene portrays the child’s perspective.
As Figure 9.12B shows, this is an example of a paradox, which entails a contradiction that
makes sense and does not require resolution.
Meet the Parents is a comedy about Greg meeting his prospective in-laws. Jack, the
disapproving father-in-law, has one precious possession: Jinx, a white-tailed, Himalayan-
Persian cat. During Greg’s weekend with the in-laws, Jinx is nowhere to be found. To gain
Jack’s respect, Greg disguises a newly acquired cat by painting its tail white and pretends
to have found Jinx. As Greg arrives with the disguised cat, the majestic and heroic flair of
the music (via a ‘heroic’ musical topic) corresponds with the visuals of Greg as determined
and strong-minded (via the characteristic slow-motion and a half-smile). The film thus pre-
sents us with contradictory depictions of Greg: the plot presents him as hopeless and ill-
fated, while the hyperbolic music and visuals portray him as a hero. Here, however, the
FIGURE 9.12A Face Off. The child is caught in the midst of the violence. [01:34:10]
inter-textual reference to the epic genre does not establish parody, because the genre is not
critically appraised. Additionally, no second-order interpretation emerges—no reversal of
effect, critical appraisal, shift of focal point, or homage. Moreover, although Greg’s fate is
not a blissful one, no narrative elements in the film point to or suggest that outcome. There-
fore, as captured in Figure 9.13B, this example does not fit irony, satire, sarcasm, or parody;
instead, and although music is incapable of lying, this example suggests that the music helps
the character lie by staging a false heroic persona.
Terminator 2: Judgment Day builds on the character stereotypes developed in The Ter-
minator; but in a role reversal, a reprogrammed T-800 (the one we identify as ‘The Termina-
tor’) arrives to protect John Connor from a T-1000, a more advanced, shape-shifting android
assassin. In the scene from this film with which I open the chapter, the Terminator seizes a
rugged-looking biker’s clothes, steps outside the diner, and even grabs the biker’s Harley.
FIGURE 9.13A Meet the Parents. Greg arrives with a lookalike cat. [01:13:10]
“Bad to the Bone” rumbles loudly on the soundtrack. The diner’s owner comes out with a
10-gauge Winchester lever-action shotgun, fires a round, and threatens, “I can’t let you take
the man’s wheels, son. Now get off, or I’ll put you down.” Expressionless, the Terminator
eases the bike onto its kickstand and strides toward the guy. Staring impassively, the Termi-
nator snatches the shotgun—the guy gulps, thinking he is doomed. The Terminator calmly
reaches toward the man’s shirt pocket, grabs his sunglasses, and puts them on. Now looking
the part, the Terminator steps back on the Harley and roars off. Here, the non-diegetic song
helps develop the character’s tough-guy facade, constructing a lie on two levels—that the
Terminator is human and a “bad” guy. (See Figure 9.14B.)
Some combinations of first- and second-order interpretations do not reflect any of the
tropes discussed. In The Grand Budapest Hotel, ludicrous characters engage in un-natural-
istic conversations that match their preposterous behavior. M. Gustave clasps Madame D’s
hands while comforting her, “You’ve nothing to fear. You’re always anxious before you travel.
I admit you appear to be suffering a more acute attack on this occasion, but truly and hon-
estly [abrupt stop]”—he is suddenly taken aback by Madame D’s “diabolical” nail varnish.
At that very moment, the non-diegetic music also abruptly stops, and M. Gustave’s tone
shifts from affectionate and reassuring to critical and disapproving. Throughout the scene,
the film becomes Gustave’s accomplice, a supporting agent or unseen narrator that supplies
the appropriate non-diegetic music to bolster the romantic narrative Gustave is trying to
foist upon Madame D. The halt in the music, however, draws attention to the dialogue, and
farcically complicates the presence of the non-diegetic music as influenced by the dialogue
itself, foregrounding cinematic conventions only to deconstruct them for humorous effects.
Unfortunately, the categories here discussed do not fit a humorous trope that entails incon-
gruity based on a cinematic anomaly—a sudden stop in the non-diegetic music that brings
with it an equally sudden change of valence in the dialogue. (See Figure 9.15B.)
FIGURE 9.15A The Grand Budapest Hotel. M. Gustave and Madame D. talk before her departure.
[00:10:10]
Fluid Categories
Identifying the attributes (first- and second-order interpretations) that inform our categorization,
as presented in Figure 9.16, allows us to distill the differences among
shades of irony and between irony and related tropes. Related to first-order interpretations:
(1) incongruity is a necessary attribute in irony, but the location of such incongruity may
suggest a non-ironic trope; (2) intertextuality rules out ironic meaning; and (3) hyperbole,
anomaly, and the grotesque may be present in ironic as well as non-ironic tropes. Related to
second-order interpretations: (1) reversal of effect is pervasive in irony and parody, but is not
as common in tropes distant from irony; (2) critical appraisal requires further examination
to identify whether sarcasm, satire, or parody is at play;11 (3) a focal shift is not common
in irony, but it appears more often in distant tropes such as paradox; (4) homage emerges as
a distinct qualifier for quotation; (5) the absence of second-order interpretations may indicate
a lie; and (6) the presence of first- and second-order interpretations notwithstanding,
additional markers may suggest an undefined trope, one for which we may not yet have a
fitting label.
Since the framework for categorizing irony and related tropes I present in this chapter
is based on graded similarity to a prototype (or, often, to an exemplar) rather than on
explicitly specifying the boundaries between categories or abstracting the necessary and
sufficient conditions for category membership, knowledge of categories demands knowl-
edge of near-prototypical exemplars. Confined to the examples provided in this chapter,
knowledge of the category parody, for instance, comprises knowledge of near-prototypical
examples such as those from Airplane! and Repossessed. The prototype model thus offers
a suitable approach to distilling semantic categories featuring fluid boundaries and typi-
cality-based category membership.12 Nonetheless, the members of any semantic category
are rarely a perfect prototype, as these may deviate in the degree of membership and even
contain cues suggesting closely related or even distant categories. It would therefore be
theoretically advantageous to conceive of irony and other tropes within a multidimen-
sional continuum, a space that situates categories relative to each other and allows for in-
between-trope graded typicality or regions of overlap where non-prototypical exemplars
could be situated.
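The idea of graded similarity to a prototype can be made concrete with a toy calculation. The Python sketch below is a simplification offered under stated assumptions, not the chapter's model: the prototype attribute sets are one reading of the film analyses above (Con Air, Repossessed, Ted, Face Off, Meet the Parents), attributes are reduced to present or absent, the similarity measure (a weighted proportion of agreeing attributes) is chosen only for illustration, and the names `typicality` and `rank` are hypothetical. Weights default to equal values, although the chapter's notes indicate that attributes do not carry equal weight.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Attribute names from the chapter (first- and second-order interpretations).
ATTRIBUTES = (
    "incongruity", "intertextuality", "hyperbole", "grotesque",
    "cinematic anomaly", "reversal", "critical appraisal",
    "focal shift", "homage",
)

# Illustrative prototypes: an assumed reading of the chapter's exemplars.
PROTOTYPES: Dict[str, FrozenSet[str]] = {
    "dramatic irony": frozenset({"incongruity", "reversal"}),
    "parody": frozenset({"intertextuality", "hyperbole", "grotesque",
                         "cinematic anomaly", "reversal",
                         "critical appraisal"}),
    "quotation": frozenset({"intertextuality", "homage"}),
    "paradox": frozenset({"incongruity", "focal shift"}),
    "lie": frozenset({"incongruity", "intertextuality", "hyperbole"}),
}


def typicality(exemplar: Set[str], prototype: FrozenSet[str],
               weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted share of attributes on which exemplar and prototype agree."""
    weights = weights or {}
    total = sum(weights.get(a, 1.0) for a in ATTRIBUTES)
    agree = sum(weights.get(a, 1.0) for a in ATTRIBUTES
                if (a in exemplar) == (a in prototype))
    return agree / total


def rank(exemplar: Set[str],
         weights: Optional[Dict[str, float]] = None) -> List[Tuple[str, float]]:
    """Order the categories from most to least typical for this exemplar."""
    scores = [(name, typicality(exemplar, proto, weights))
              for name, proto in PROTOTYPES.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# The moment from Ted as described above: intertextuality plus homage only.
ted = {"intertextuality", "homage"}
for category, score in rank(ted):
    print(f"{category}: {score:.2f}")
```

Because every category receives a score rather than a yes-or-no verdict, an exemplar can sit between tropes, which is the kind of in-between typicality the paragraph above calls for.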
Figure 9.17 renders such multidimensional space, wherein distances between categories
are relative. Over time, new categories (along with their corresponding prototypes) may
emerge from within interpretive communities (for instance, to label the trope in The Grand
Budapest Hotel), forcing existing exemplars to shift closer to or away from a
prototype. This multidimensional mapping may allow for tracing diachronic fluctuations in
our categorization, continuously adjusting the location of various tropes to represent the
complexity of our experience and to better approximate new exemplars.
Coda
Music operates at an almost subliminal level during a film, conveying complex messages
that support, highlight, or complement other cinematic domains. Occasionally, however,
the music and other cinematic domains are at odds, supplying incongruous information that
violates our expectations, triggering a momentary discomfort we resolve by physiologically
resorting to laughter and rationally reducing the event to an instance of irony. This response
shows that our everyday tendencies for basic-level categorization permeate our cinematic
experiences.13 In this chapter, we probed such tendencies by cross-examining the boundaries
between musically induced irony and related tropes, and proposed that new cinematic
instances prompt us to recalibrate our basic-level categorization and even to contrive new
categories.
Terminator 2: Judgment Day also probes our innate tendency to categorize: in its
soundtrack and cinematography, the film blurs generic boundaries between sci-fi, tech-noir,
action, comedy, adventure, horror, fantasy, mystery, and thriller, becoming a blueprint for
many future films within these genres; and, in its themes, the film portends a unique vision
of a world, one that blurs the boundaries between humans and machines, protagonists and
antagonists, emotion and reason, familiar and unfamiliar, memories and dreams, fate and
future, being and becoming.
Notes
1. When acknowledging that ‘conflicting meanings’ is a common trait in all instances of irony, it
may be tempting to adopt a ‘family resemblance’ model—where category members share specific
features, yet no single feature is common to all instances.
2. Although affordances and function have commonly been included as attributes, including interpre-
tations as attributes renders a more robust model for categorizing irony. The attributes that inform
our categorizations are distributed across a wide network of perspectives, including not only how
items “look, but also they sound, smell and feel, how to operate them, and emotions they arouse”
(Barsalou et al., 2003, p. 88). In addition, because attributes are in themselves complex concepts
with both internal and external structures, the ability to establish inter-attribute relations allows us
to identify attributes that may co-occur. For example, having wings, flying, and building nests in
trees are closely interrelated, as having wings affords a bird flying and hence building a nest in a
tree. For an in-depth discussion of the notion of categorization, see Appendix VIII in this volume.
3. Cinematic anomalies often act as indexical markers that help unmask the director’s intention,
replacing typical communicative signals such as rolling eyes, intonation, or facial gestures.
4. In discussing irony and related tropes, Hutcheon (1985) suggests that parody is not located in the
text, but in the readers’ minds. Elleström (2002) further distills the objective and subjective com-
ponents in arriving at an interpretation, noting that “the material of irony is found in the text, but
it is formed by the reader” (p. 49).
5. This list of attributes arose organically from examining the film examples included in this chapter;
additional examples will no doubt contribute to expanding this preliminary list of attributes.
6. In this process of categorization, not all attributes enjoy equal weight. For instance, when con-
structing interpretations of parody or satire, viewers must selectively direct their attention toward
the presence of intertextuality and the values of critical appraisal. In contrast, to decide between
parody and homage, viewers must selectively direct their attention toward the presence of reversal
of effect and the values of critical appraisal. In addition, the weights given to both the attributes
and the values within them will contribute to identifying a prototype that serves as representative
both within the ‘vertical’ and ‘horizontal’ dimensions of a category. The horizontality of categori-
zation is manifested in the various tropes related to irony, all arising as variations of the incidence
of the one essential element: a conflict of messages.
7. Some readers may argue that these labels fail to reflect the true nature of the proposed prototypes.
I return to this issue toward the end of the chapter. Additionally, as an audience member who has
worked with these examples for extended periods, I commonly fall victim to learning, priming, and
plain habituation.
8. Hutcheon (1985) notes that parody is “related to burlesque, travesty, pastiche, plagiarism, quotation,
and allusion, but remains distinct from them. It shares with them a restriction of focus: its interpreta-
tion is always of another discursive text. The ethos of that act of repetition can vary, but its ‘target’ is
always [intra-opus] in this sense”, and in turn, satire is “[extra-opus] (social, moral) in its ameliorative
aim” (p. 43). Building on Hutcheon’s ideas, Cobo (2003) suggests that “the difference between
parody and satire resides on the distinct nature of their respective targets . . . the parodist target is
always another work of art or more generally, another form of coded discourse . . . satire, on the
other hand, is both moral and social in its focus and ameliorative in its intention” (p. 61).
9. Cherlin (2017) delves into the numerous forms of musical irony found in classical music, exam-
ining works by a range of composers including Mozart and Mahler. Kostka (2016) discusses the
concept of parody in modern and postmodern art, particularly as it stems from blending stylistic
traditions.
10. Socratic irony stems from a particular rhetorical pattern in Socratic dialogues, in which Socrates
pretends to need information and professes admiration for the wisdom of his companion. For instance,
saying “Congratulations! You’re so smart . . .” to someone who just received a failing grade
would be understood as ‘not so smart’.
11. Everett (2004) draws attention to this fact and defines “parody in musical discourse as a com-
poser’s appropriation of pre-existing music with intent to highlight it in a significant way . . .
The analyst then determines whether the accompanying ethos is deferential (neutral), ridiculing
(satirical), or contradictory (ironic) based on how the new context transforms and/or subverts the
topical/expressive meaning of the borrowed element” (par. 1).
12. As Rosch (1978) mentions, the prototype model seeks to “achieve separateness and clarity of actu-
ally continuous categories [by] conceiving of each category in terms of its clear cases rather than
its boundaries” (p. 259).
13. In contrast to natural categories, irony presents a unique case of basic-level categorization that
remains largely abstract and disembodied at subordinate levels—e.g., romantic irony, dramatic
irony, or cosmic irony, all retain an equal degree of abstraction compared to the superordinate
level.
10
INTERPRETIVE TRANSFORMATIONS
In The Red Violin, the music invites us on a transformative journey, one that traverses over
three centuries and several continents. The extraordinary, bewitching, desired red violin is
in line for auction. As the bidding unfolds, a series of flashbacks brings us along with the
violin from its creation in seventeenth-century Italy to its ever-changing milieu as it falls into
the hands of peculiar individuals: a child prodigy in an eighteenth-century Austrian monastery,
a Romani female improviser, a mischievous soloist in nineteenth-century Oxford,
a violin restorer in China during the Cultural Revolution, and an appraiser of antique
instruments in present-day Montreal. Sweeping transformations of the ‘red violin’ leitmotif
guide us through the transient owners’ sociocultural environments while illuminating their
inharmonious relationships with the instrument, with life, with music.
In this chapter, we build on the themes from previous chapters and fine-tune our
understanding of leitmotifs, topics, and musical associations, now attending to their trans-
formations and the impact of such transformations on our interpretations of the film’s dra-
matic trajectory. First, we explore instances where leitmotif variations indicate an analogous
affective or associative realignment in the storyworld. Then, we investigate musical troping,
a technique with remarkable expressive potential entailing a transmutation of unrelated top-
ics. Last, we examine how a reconfiguration of a song, or the context surrounding a song,
prompts us to reframe our interpretations of the music’s place within the narrative.
Leitmotif Transformation
Leitmotifs are fluid, changeable constructs whose transformations help outline and explain
narrative or character developments. Once established, leitmotifs cue our attention to spe-
cific elements in the storyworld, and, when transformed, they suggest a new affective state
or supply a new semantic layer that modulates our perception and characterization of those
elements.1 Films draw primarily on two types of leitmotif transformations to influence our
interpretation of the narrative: those grounded in embodied mechanisms and those driven
by semiotic associations.
DOI: 10.4324/9780429504457-11
FIGURE 10.2 The Adventures of Robin Hood. Much the Miller’s Son joins the band. [00:04:30]
from the harsh punishment. Much, in need of moral guidance, eagerly requests to join the
Merry Men, swearing, “From this day on, I’ll follow only you . . . Take me as your servant.”
On his horse, looking down on the short fellow, Robin takes a few moments to respond,
hesitatingly accepting him into the group. In the soundtrack, a disjointed, hesitant version
of the ‘Merry Men’ leitmotif denotes the addition of a character to the band whom Robin
Hood did not intend to include—short fragments of the leitmotif elicit in us a feeling of
incoherence and hesitation, while the wandering tonal center affords us only uncertain and
unconvincing grounds.
Moments later, Robin encounters Little John, a giant man in worn clothes, carrying a
quarterstaff. The two men stand at opposite ends of a bridge, sizing each other up.
Struck by ‘Little’ John’s intimidating appearance, Robin immediately contemplates
including him in his army, but first, he puts Little John to the test. The two men battle with
quarterstaffs on the bridge—striking, ducking, reeling, but neither giving ground. Robin loses
his footing and falls into the water. Little John roars with laughter and thrusts his quarterstaff
to Robin to help him to the bank on the side of the bridge. Both men laugh and introduce
themselves. In the soundtrack, the ‘Merry Men’ leitmotif emerges with lighthearted, come-
dic undertones—a duet of bassoons in their high registers—anticipating Robin’s invitation
to Little John to join the band by approvingly remarking, “If you can hold a breach like you
held the bridge, you’re one of us, and welcome.”
As Friar Tuck joins the band, the ‘Merry Men’ leitmotif sounds in a solo bassoon playing
in its high register and accompanied by pizzicato strings. Here, the instrument not only
feeds humorous undertones into the musical texture—as Robin promises to the large fellow
“a venison pastie and the biggest you ever ate . . . boar’s head, beef, casks of ale”—but also
onomatopoeically mimics the distinctly high nasal tone of Friar Tuck.
FIGURE 10.3 The Adventures of Robin Hood. Little John joins the band. [00:23:00]
FIGURE 10.4 The Adventures of Robin Hood. Friar Tuck joins the band. [00:32:00]
As additional members join the band, additional instruments join the leitmotif (wood-
winds, brass, strings, percussion), consolidating the orchestral ensemble, representing Robin
Hood’s strengthening forces. From this point in the film, the ‘Merry Men’ leitmotif under-
scores scenes where they take on the establishment. Now complete, the band of Merry Men
prepares to ambush Sir Guy’s party. Eagerly, the men climb through the many branches into
the treetops and prepare to strike. In the soundtrack, the ‘Merry Men’ leitmotif resonates fer-
vently, with many contrapuntal voices moving in all directions yet supporting the collective
texture with an orchestral tour de force.
After the successful ambush, the men celebrate with a banquet at Robin’s camp—long
tables, two whole steers and several pigs roasting over fire pits, outlaws celebrating and
deservedly enjoying themselves dancing. In the soundtrack, the entire orchestra gleefully
joins in, setting the ‘Merry Men’ leitmotif in a joyous triple meter, turning it into a delicious
waltz that distances us from the leitmotif’s march-like associations presented so far in the
film, affording us an engagement with the scene’s playful, leisurely, festive atmosphere.
FIGURE 10.5 The Adventures of Robin Hood. The band prepares to ambush Sir Guy’s party.
[00:33:30]
FIGURE 10.6 The Adventures of Robin Hood. The Merry Men celebrate their victory. [00:39:50]
FIGURE 10.7 The Adventures of Robin Hood. Disguised Merry Men marching toward
Nottingham Castle. [01:29:50]
FIGURE 10.8 The Adventures of Robin Hood. Ceremonial trumpets announce Prince John’s
arrival. [01:31:45]
Near the film’s end, the band plots to disguise themselves as members of the Bishop of
the Black Canon’s entourage to gain access to Nottingham Castle. As the men approach
the gate, a camouflaged version of the ‘Merry Men’ leitmotif seeps through the soundtrack,
supplying only a few structural notes, concealing its very identity. Its slow pace matches the
anticipatory yet measured speed of the procession, and the tolling bells, minor mode, and
dominant pedal tone (not affording us a resolution to tonic) elicit a suspenseful atmosphere
that foreshadows the potentially treacherous situation ahead for the Merry Men.
Robin and his men enter Nottingham Castle and sneak into the royal box. A fanfare of
ceremonial trumpets announces Prince John’s arrival, as he strolls through the chamber
and gallantly positions himself on the throne, unaware that the Merry Men surround him.
A close listen to the soundtrack in the scene, however, reveals a most astute transformation,
one that involves tinkering with the soundtrack’s diegesis via a transference—the ‘Merry
Men’ leitmotif infiltrates the film’s storyworld and even the royal fanfare of ceremonial trum-
pets. This canny transference suggests that the band of men is poised to dominate in the
clash about to unfold and even suggests a potential transformation of the prince’s forces, a
realignment to the ideals of justice championed by the Merry Men.
meaning. In such leitmotif transformations, the intra- and extra-opus associative powers
of leitmotifs and topics merge to guide us through the film’s dramatic trajectory. This sub-
section also draws on a single film, The Red Violin, whose soundtrack masterfully real-
izes a large-scale structural plan that maps the narrative’s events through topical leitmotif
transformations.4
The Red Violin chronicles the journey of a mysterious instrument through a series of
flashbacks, each meticulously crafted as a historical tableau that reveals the complex inter-
relationship between the instrument, its owners, and their cultural milieus. These flashbacks
are linked by a fortune teller reading the tarot cards to Anna, the violin maker’s wife. The
cards, however, do not predict Anna’s fate, but the violin’s. Along this journey, the instru-
ment gets buried, stolen, shot, and nearly burned, but when played, it resonates with its
current owner’s unique temperament to produce sublime sounds.
The film begins with a present-day violin auction in Montreal but soon flashes back to
the instrument’s creation in seventeenth-century Cremona. In this first flashback, Anna hums
a lullaby-like tune to her child in the womb, a soothing yet nostalgic tune beginning in the
Dorian mode and gently shifting to the relative Aeolian mode. Sadly, they both die during
the delivery. To keep their memory alive, Anna’s husband, master luthier Bussotti, varnishes
the instrument with their blood. In the soundtrack, Anna’s humming fades away and dove-
tails into the same tune performed on a violin—the ‘red violin’ leitmotif is born.
The tarot cards predict a long journey with blissful and tragic moments. In the second
flashback, the violin embarks on this journey, drifting toward an Austrian orphanage, falling
into the hands of Kaspar, an orphan who attempts to learn to play the instrument. During the
day, he practices tirelessly; during the night, while asleep, he holds his newly found treasure
close. In this first transformation, the leitmotif adopts a ‘Baroque style’ topic, characterized
by clear diatonic harmonic progressions and continuous rhythmic figurations. Despite Kas-
par’s attempt to control the musical time by rehearsing to the beat of a metronome, the beat
of his own heart betrays him, and he collapses while auditioning before the prince. Kaspar’s
untimely death, therefore, stems from his obsessive drive (but sheer inability) to control the
instrument, the music, or even himself. Kaspar is buried right outside the orphanage, holding
tightly to the violin.
FIGURE 10.9A The ‘Red Violin’ leitmotif, as hummed by Anna. (Music by John Corigliano.)
FIGURE 10.9B The Red Violin. The Red Violin appears for the first time. [00:06:30]
FIGURE 10.10A ‘Baroque style’ topical transformation of the ‘Red Violin’ leitmotif. (Music by John
Corigliano.)
In the third flashback, a few years later, a Romani group expropriates the violin from
Kaspar’s grave. They travel through Europe, wandering extemporaneously through many
countries. In the hands of a female Romani improviser, the leitmotif re-emerges with a free
spirit, infused with ad libitum gestures and exotic modes archetypical of a ‘Romani’ topic.
FIGURE 10.11A ‘Romani’ topical transformation of the ‘Red Violin’ leitmotif. (Music by John
Corigliano.)
FIGURE 10.11B The Red Violin. Romani woman improvises on the instrument. [00:51:20]
FIGURE 10.12A ‘Paganini-esque’ topical transformation of the ‘Red Violin’ leitmotif. (Music by
John Corigliano.)
FIGURE 10.12B The Red Violin. Lord Frederick Pope’s emotionally charged performance.
[00:56:15]
The Romani group travels through Oxford, through the lands of Lord Frederick Pope, a
nineteenth-century violin virtuoso, who takes possession of the instrument. In this fourth flash-
back, the leitmotif flirts with a ‘Paganini-esque’ topic to exploit Pope’s devilishly virtuosic tech-
nical capacities—ricochet bowing, multiple stopping at dazzling speeds, extended octave
playing—and to allow him to exude his intense emotional and artistic expression through the
music. He becomes aesthetically transfixed by the instrument and begins to include it not only
in his performances, but even in his lovemaking. This makes his mistress jealous. As she finds
Pope in a ménage à trois with another mistress and the violin, she fires a bullet at the instrument.
One of Pope’s servants travels to Shanghai and pawns the broken instrument to an anti-
quarian, who repairs the damage. Although physically restored, the violin cannot resist the
collective—in China, during Mao’s Cultural Revolution, the violin is portrayed as a symbol
of Western decadence. Within this tableau, the oppressive sociocultural ideology entirely
silences, rather than transforms, the ‘red violin’ leitmotif.
FIGURE 10.13 The Red Violin. The instrument amidst China’s Cultural Revolution. [01:16:30]
FIGURE 10.14 The Red Violin. The instrument realizes its destiny. [02:03:15]
Upon the Chinese antiquarian’s death, the government inherits the instrument and soon
ships it to Montreal for auction. In present-day Montreal, Charles Morritz, an expert violin
appraiser and a luthier himself, notices the much-coveted ‘red violin’, the last of legendary
luthier Bussotti. Several eager buyers obsessively compete for the instrument, but merely as
a trophy, rather than for its capacity to produce sublime music. Morritz, although unable to
afford the multimillion-dollar instrument, is determined to procure the violin for his young
daughter. In a brief flashback to the fortune teller, the final tarot card, Death, appears upside-
down, not predicting death but rebirth. Morritz exchanges a counterfeit for the real red violin
during the auction. As he secures the red violin and flees the scene, he calls home to speak
with his daughter and reveals, “Honey, I’m coming home, and I am bringing you something
very special.” The original leitmotif, unaltered, which the audience has not heard since the
film’s first moments, returns to the soundtrack, suggesting the instrument will finally realize
its destiny, reaching the hands of a luthier’s child.
Musical Troping
Musical troping entails fusing two distinct, unrelated topics.5 Through this fusion, the
music accentuates a plot based on some dichotomy, such as two distinct characters, two
different cultures, or two points of view. Whereas topics draw on conventionally estab-
lished meanings, troping establishes new and localized meanings, thereby constructing
metaphors that draw parallels between converging structural musical elements (source)
and converging extra-musical elements (target). The examples explored in this section
present a fusion of two sociocultural milieus and a corresponding musical troping in the
soundtrack, here gathered under the [CONVERGENCE OF NARRATIVE ELEMENTS] IS [CON-
VERGENCE OF TOPICS] conceptual metaphor. Within this metaphorical process, an asym-
metry between the elements entering the musical troping prompts us to map an analogous
asymmetry onto the extra-musical domain. Moreover, a parallel process of signification
tangential to, but enriching, the metaphorical troping is at play: identifiable pieces bring
unique associations and connotations that (in some listeners) may elicit uniquely subjective
readings of the scenes.6
Being There tells the story of Chauncey Gardiner, who lived in isolation, working as a
gardener in a millionaire’s house. When the wealthy man dies, Chauncey must leave the
house and submit himself to an unfamiliar, modern world. The film presents two contrast-
ing elements: the inner world of Chauncey Gardiner (an old, prudent, innocent, almost
sterile human being) and the outer world (a modern and exciting, yet contaminated and
corrupt environment). The soundtrack bestows Chauncey and the outer world with their
own distinct musical topics: ‘classical music’ for Chauncey and ‘disco-funk’ for the outer
world.
As Chauncey leaves the only home he has ever known and steps into the outer world, the
soundtrack fuses two unrelated topics. The musical troping in the scene blends salient mark-
ers characteristic of each topic. The music retains vital elements from a classical piece but
adjusts them to conform to the ‘disco-funk’ topic: the original (symphonic) instrumentation
integrates archetypical disco timbres (organ, drum set, cowbells); the melody, performed on
the orchestral instruments (trumpet with sporadic orchestral tutti), adopts rhythmic nuances
(anticipations and retardations) characteristic of the disco style; the harmonic progression
incorporates chordal extensions (7ths, 9ths, 11ths) representative of contemporary practices;
a jazzy organ breaks through the musical texture with funky riffs and counter-melodies;
and, a rhythm section punctuates the metrical structure but shifts the emphasis to off-beat
metrical positions and infuses a disco feel. Superimposing the resulting musical troping onto
the narrative triggers a metaphorical correlation—we project the structural convergence of
topics onto a parallel convergence in the narrative—placing the confluence of two widely
divergent sociocultural worlds at the center of attention.
A closer examination of this metaphorical correspondence, however, illustrates an asym-
metrical relationship in both the musical and narrative domains—one topic enters the
musical troping as dominant, subverting the other. Changes in timbre and in the sonic envi-
ronment amount to the most radical transformations of the ‘classical music’ topic, now
subverted and under the generic control of the ‘disco-funk’ topic—for instance, the presence
of non-musical sounds (such as the city noises) and the absence of a long reverb (archetypi-
cal of concert halls) relocate the classical piece into a non-classical context. As a result,
although the classical piece retains its melodic profile and other essential qualities, it surren-
ders its identity when entering the musical troping and becomes absorbed by the dominant
‘disco-funk’ topic. In turn, and stemming from this metaphorical mechanism, we map the
musical asymmetry onto our reading of the narrative: Chauncey surrenders his identity as
a gardener when leaving his old home and yields to the subduing, overpowering forces of
the reckless city.
In addition, a parallel meaning-making process is at play. The classical piece undergo-
ing transformation, Richard Strauss’s “Also Sprach Zarathustra”, is a cultural trope unto
itself, bringing about the “discovery of new, potentially dangerous worlds” connotations for
listeners familiar with the earlier film 2001: A Space Odyssey. We project these associations
onto this narrative, generating a conceptual blend that enriches our interpretation of this
moment in the film. (See Figure 10.16.)
FIGURE 10.15 Being There. Chauncey launches himself into a new world. [00:19:00]
Big Night is about two Italian brothers, Primo and Secondo, who emigrated from Italy
and opened a restaurant in America. The restaurant is almost bankrupt, its lack of success
due to the owners’ decision to keep the restaurant as authentic to the Italian tradition as
possible. Here, the film establishes a dichotomy between Italian culture and American
culture.
The restaurant must close its doors because of a lack of customers. Secondo seeks the
advice of an Italian friend who owns a prosperous restaurant called Pascal’s. The key to Pas-
cal’s success is the fusion of the Italian and American cultures in every aspect of the restau-
rant, from the menu (spaghetti with meatballs) to the live performances (Italian classics with
an easy-listening American twist) to its very name (an Italian name with the characteristically
American apostrophe plus ‘s’). In the film, as Secondo enters this very successful restaurant,
the music and the visuals portray this cultural fusion. In the soundtrack, the musical troping
transforms a traditional Italian song (here representing the ‘Italian classic’ topic) by infus-
ing archetypical markers of an ‘American easy listening’ topic: the delivery of the original
Italian lyrics takes on a pronounced American accent; an electric piano accompaniment
replaces the characteristic orchestral accompaniment; profuse syncopations (characteristic
of American popular and jazz styles) seep through the melody’s rhythmic profile; and, the
singer’s vocal delivery and performance inflections embrace an almost karaoke singing style.
A metaphor thus emerges, via which we project this topical fusion onto a parallel fusion in
the narrative.
FIGURE 10.17 Big Night. Secondo visits Pascal’s to learn their formula for success. [00:23:40]
In addition, by introducing a specific song, “’O Sole Mio”, the soundtrack concocts a
complementary meaning-making process. The association of this song with Italian culture
has been so emblematic that it replaced the Italian national anthem during the 1920 Sum-
mer Olympics when a recording of the anthem could not be found. Nevertheless, the asso-
ciations brought about by “’O Sole Mio” extend beyond the anecdotal. Broadly, the lyrics of
the song refer to the sun, which rises every day and brings a new beginning, which can be
interpreted as a metaphor for the potential rebirth of the failing restaurant, for the potential to
bring new life to their struggling business and begin anew. More specifically, we may project
the lyrics “sta nfronte a te” [“right in front of you”] onto the narrative and read this moment
either as emphasizing the physical location of the restaurants (one in front of the other) or as a
metaphorical message to Primo and Secondo: the potential for a radical transformation, one
that may salvage their ailing restaurant, lies right in front of them. However, taking that path
would require them to give in to the cultural fusion embraced by Pascal’s, a compromise
they are unwilling to make. (See Figure 10.18.)
Saturday Night Fever explores society’s cultural, economic, and ethnic divisions. Tony is a
Brooklyn native of Italian descent who aspires to cross social boundaries by entering a disco-
dance contest partnering with Stephanie, a classical ballerina. After a night at the disco, Tony
and his entourage drive Annette, another dancer, to the mythical Verrazano-Narrows Bridge,
where they ritually stop to lark about.7 Drunk yet full of energy, they dance and hang from
the bridge girders, daring one another to climb higher. The non-diegetic soundtrack presents
a musical fusion of the ‘classical music’ and ‘disco music’ topics. Startled, Annette watches
them from the car. As the music climaxes, Joey dangerously jumps, barely grabbing the rails,
and Tony and Double J. leap to the rescue. Annette screams out of the car and runs to the
edge of the bridge. It was a trick—the three boys, safely standing on a ledge, chant in unison,
“Can you dig it? I knew that you could.” Here, the amalgam of diverse musical sounds result-
ing from the troping metaphorically mirrors the pluralistic dance floor.8
In this case, however, an asymmetry in the metaphorical correlation suggests an alterna-
tive reading of the scene, one in which a celebration of diversity and plurality gives way
to transgression. At a structural level, the chaotic mingling of sounds hints at a breach of
aural boundaries that translates into interpersonal assault—the timbres and musical gestures
representing “lowbrow” disco styles (wah-wah guitar, unruly percussion, pounding electric
bass) forced into the refined symphonic orchestration characteristic of “highbrow” musical
styles foreshadow the sexual transgression in the gangbang scene.
FIGURE 10.19 Saturday Night Fever. Tricks of life and death on the Verrazano-Narrows Bridge.
[01:10:50]
FIGURE 10.21A Dance of the 41. Amada attempts to master Schubert’s “Ständchen”. [00:13:20]
different subtexts at the beginning and ending of the film. In an early scene, Amada waits for
Ignacio to come home, sitting at a piano, striving to master an excerpt of Schubert’s song. As
he arrives late at night, Amada continues to play. Here, the absent lyrics, “Softly my songs
plead, through the night to you; down into the silent grove, beloved, come to me!”, voice
Amada’s struggling plea for Ignacio’s withheld love. Despite her efforts, however, she cannot
find the right tone. (See Figure 10.21B.)
Toward the end of the film, Ignacio and Evaristo (and their amorous relationship) become
the talk of the town, tainting Amada’s reputation in society and turning their marriage into a
noxious arrangement solely intended to salvage their social status. Although smothered by
the suffocating circumstances, Amada regains her agency and deliberately turns Schubert’s
tender song into an inharmonious, disjointed, irritating musical pronouncement, one that
voices her despair and discontent by suggesting the opposite of the song’s original meaning.
Our original conceptual blend now introduces correspondences of ‘disanalogy’ between
input spaces. All viewers will recognize that her deliberately distorted playing reveals how
their marriage, which could have been a tender song, has become a grating, barely recog-
nizable version of its potential idealized self. Viewers familiar with the song’s lyrics, in turn,
will recognize that Amada has deliberately transformed her original soft plea into a harsh
message of rejection.
FIGURE 10.22 Dance of the 41. Amada plays a transformed piano version of Schubert’s song.
[01:15:20]
In the psychological drama He Loves Me, He Loves Me Not, Nat King Cole’s “L-O-V-E”
brings about radically different readings of the story. In this case, it is not the music’s trans-
formations but a perspective flip and the resulting reframing of the protagonists’ realities that
prompt us to reconfigure our conceptual blends. The story follows Angélique, a talented
art student madly in love with Loïc, a married man she believes will leave his pregnant
wife. The film first presents Angélique’s romantic and idyllic point of view. In the bloom of
love, she sends a delicate pastel-pink rose to Loïc, who smiles upon receiving it. The song
“L-O-V-E” accompanies the first montage, which depicts whimsical moments of a budding
romance—Loïc’s driving Angélique to his neighborhood, him playing in a park while she
pencil-draws him from a distance, her sending him her art projects and him hanging them
in his office. The lyrics narrate:
L is for the way you look at me; O is for the only one I see; V is very, very extraordinary;
E is even more than anyone that you adore can . . . Love is all that I can give to you; Love
is more than just a game for two; Two in love can make it; Take my heart and please don’t
break it; Love was made for me and you.
We project the associations elicited by the song’s lyrics—along with its major mode, lively
tempo, bouncy rhythms, and light texture—onto Angélique’s enthusiastic pursuit of love,
both through her art and with Loïc. (See Figure 10.23B.)
FIGURE 10.23A He Loves Me, He Loves Me Not. Angélique’s and Loïc’s idyllic budding romance.
[00:10:30]
However, as the film unfolds, we notice that Loïc does not return her affection. Angélique’s
friends note marks of probable physical abuse and warn her of a potentially unhealthy rela-
tionship. In a turning point, Loïc never arrives at the airport for a romantic getaway to Flor-
ence. Devastated, she spirals into depression. By this point, we, and Angélique’s friends, are
convinced that this kindhearted, bright, impressionable young woman fell in love with a
psychologically and physically abusive married man.
Partway through, the film rewinds to the beginning and presents many of the same
events in a new light, from Loïc’s perspective. In a heartbeat, the film’s tone shifts from
romantic comedy to psychological thriller. We learn that Angélique’s infatuation sprang
from a fleeting encounter with Loïc, one he barely recalls—a brief exchange of only a
handful of words and his offer of a ride to a neighboring house she was looking after.
As this new perspective begins to reveal her true, dark colors, he turns from suitor to
victim: an obsessed, unpredictable, violent stalker (Angélique) endeavors to tear Loïc’s
life apart by attempting to break up his marriage and even to murder his wife. As Loïc
accidentally finds out about Angélique’s obsession, she strikes him with a brass figurine,
and he falls down the stairs. The song “L-O-V-E” accompanies a second montage, which
depicts Angélique being arrested, diagnosed with erotomania, and remanded to a men-
tal institution that implements electroshock procedures, all interspersed with snapshots
of Loïc unconscious at the hospital’s intensive care unit, later regaining consciousness,
relearning to walk, and mending his marriage, yet hobbling around with a walker years
after the incident. Primed by our first hearing of the song within a radically different con-
text, we now notice the incongruity between the song’s lyrics and the narrative. As we
subliminally attempt to resolve this incongruity, we begin to notice obsessive undertones
in the song’s lyrics, ones that paint a one-sided portrait of love. Our original conceptual
blend now introduces correspondences of ‘disanalogy’ between input spaces. Within this
new context, the song takes on a sinister hue, creating a stark contrast to the rosy veneer
it brushed over the first montage, re-casting Angélique as an unreliable narrator: while at
the beginning of the film Angélique presents herself as madly in love, now we learn she is
just mentally deranged.
FIGURE 10.24 He Loves Me, He Loves Me Not. Consequences of Angélique’s delusional behavior.
[01:27:30]
Coda
In this chapter, we explored instances where musical or contextual transformations broaden
the soundtrack’s expressive potential for meaning construction: leitmotif transformations,
grounded in embodied mechanisms or driven by topical associations, suggest a new affec-
tive state or supply a new semantic layer; musical troping accentuates a dichotomy in the
plot by metaphorically foregrounding a fusion of two distinct elements in the storyworld;
and the associations brought about by musical or contextual transformations, particularly in
relation to a song and its lyrics, prompt us to reconfigure and reframe our interpretations of
the music’s place within the film.
In The Red Violin, the main leitmotif becomes a latent memory, reconstructed time and
time again with each transformation. As a result, the music strings together the film’s threads
by transcending temporal and spatial relations, blurring physical and spiritual boundaries,
and rebuilding a harmonious relationship with the past, the present, and the future.
Notes
1. Just as we recognize a film’s protagonist with different haircuts or wearing different clothes, we
recognize leitmotif transformations even when disguised within intricate textures. In fact, we are
evolutionarily hardwired to recognize leitmotif transformations, rather than to regard these as new
musical fragments. For additional insights on the ecological advantages of auditory perception, see
Appendix V in this volume.
2. Winters (2007) offers a fascinating, in-depth study of Korngold’s score for The Adventures of Robin
Hood.
3. Although most listeners will hear the ‘Merry Men’ leitmotif for the first time during the film, the music
is itself a transformation of a waltz theme, an earlier work by the film composer himself, Erich Korn-
gold. He composed this theme as part of an operetta, Rosen aus Florida, to fit a lighthearted scene in
which a beauty contest of nations occurs. To represent Austria, Korngold composed a waltz that brings
together characteristic musical figures suggesting his “homeland’s musical style” (Winters, 2007).
4. Composer John Corigliano mentions, “Because the movie was involved with the tarot, involved
with classical music and a violin, and involved with many different ages of music, I felt that one had
to tie everything together. If you just wrote baroque, classical, romantic, and so forth, they would
be detached, since the only thing that threaded through the 300 years was this violin. One needed
a thread that had to be thematic and harmonic . . . Unless you tie it together with some common
thread, you will not feel the organic quality of the movie” (cited in Schelle, 1999, p. 160). That com-
mon thread is the ‘red violin’ leitmotif.
5. Hatten (1994) notes that troping occurs when “two different, formally unrelated types are brought
together in the same functional location so as to spark an interpretation based on their interaction”
(p. 295).
attunement this model purports has not been empirically verified as a whole, much recent
research in social neuropsychology and cognitive neuroscience has validated the individual
mechanisms that contribute to an intersubjective resonance of physiological and affective
states.
Encoding Affect
We uncritically accept the idea that emotions trigger physiological states, that our mood
defines how we feel physically. However, recent research in cognitive psychology indicates
that this belief stems from intuitive yet misleading sensory information and suggests that the
reverse chronological relationship is at play: physiological states cause emotions. William
James was one of the earliest psychologists to advance this counter-intuitive proposition:
Common sense says we lose our fortune, are sorry and weep; we meet a bear, are fright-
ened and run; we are insulted by a rival, are angry and strike . . . this order of sequence
is incorrect . . . the bodily manifestations must first be interposed . . . the more rational
statement is that we feel sorry because we cry, angry because we strike, afraid because
we tremble.
(James, 1884, pp. 189–190)
Keysers (2011) shows that this sequence of events is particularly evident in emotional
contagion stemming from bodily mimicry; he points us to research suggesting that changes
in facial expressions in others—lowering of eyebrows, clenching of jaws, raising of mouth
corners—trigger in us a covert or overt mimicking of these expressions (Dimberg & Öhman,
1996), which in turn modulates our affective states (Murphy & Zajonc, 1993).1 Analogously,
Jabbi and Keysers (2008) found that observing emotions in others’ facial expressions
causes a sequential, two-part response in the brain: first, we activate the cortical motor activ-
ity responsible for acting out those facial expressions; subsequently, we engage the neural
substrate responsible for processing feelings and emotions (the insula).
Because our affective states result from physiological changes, we may self-regulate our
emotions by assuming postures or enacting gestures (Laird, 2007; Strack et al., 1988). For
instance, adopting a facial expression associated with a particular emotion (anger, disgust,
fear, happiness, sadness, or surprise) generates an autonomic nervous system response
(heart rate variability and skin conductance) analogous to that triggered by the spontaneous
facial expression (Levenson et al., 1990). Thus, “fake” and “genuine” smiles cause the same
brain reaction, but to different degrees. This research supports the idea that motor activity
mediates—and precedes—emotional contagion and regulation, and reminds us of the inte-
gral role of gestures and postures within a model of empathy.
Gallese (2017) notes that because our bodies “stag[e] subjectivity by means of a series of
postures, feelings, expressions, and behaviors . . . [their] expressive content is subjectively
experienced and recognized [by others]” (p. 181).
Bodily gestures and postures embed cues with which to interpret their expressive purport;
these cues emerge primarily from the properties of the movement itself, such as its direction,
contour, speed, or force. Subtle nuances in these properties endow a meaningless move-
ment or action with expressive power, transforming it into a meaningful gesture—a gaze
becomes a stare or a flirt; a simple touch becomes a caress or a slap.
Because our voices are vital instruments for expressing our affective states, we have devel-
oped a robust capacity to produce and perceive expressively nuanced vocal gestures. We
imbue these vocal gestures with salient sonic features, or acoustic signatures, that correlate
to particular affective states. These acoustic signatures are the very qualities that shape and
transform meaningless sounds into meaningful sonic gestures—a mere exhalation becomes
a lamenting sigh or a disgruntled moan.
Acoustic Signatures
Physiological changes not only affect our self-perception of emotions but also modulate our
vocal expressions. Respiration patterns, degrees of salivation, muscular tension, subglot-
tal pressure, transglottal airflow, vocal fold vibration, and many other physical phenomena
affect our phonation and articulation, embedding acoustic signatures within our vocaliza-
tions (Foulds-Elliott et al., 2000; Scherer et al., 2011; Sundberg et al., 2011). Our voice is
thus a unique instrument that alters its sound, reflecting our emotional state.
Studies on the acoustic signatures of speech and non-linguistic vocalizations identi-
fied correlations between acoustic parameters—e.g., intensity, intensity variation, spectral
energy, fundamental frequency, contour, speed of syllabic onset, vibrato—and affective
states—e.g., anger, fear, joy, pride, sadness, tenderness. Given the vast amount of empiri-
cal research on this topic, Scherer et al. (2017) engaged in a systematic review to distill the
acoustic signatures of particular emotions. Figure I.2 synthesizes their observations for anger
and sadness, illustrating a generalized phenomenon that entails the interplay between the
physiological correlates of emotions and their resulting acoustic signatures.3
In addition, acoustic signatures appear to have an ecological (albeit expressive) function;
these provide members of the same species with critical and “behaviorally relevant infor-
mation . . . [or content] correlated directly to physical characteristics of the caller” such as
sex, size, or affective state (Miller & Cohen, 2010). Larger animals, for instance, have longer
vocal tracts and thus produce louder sounds with stronger fundamentals, achieving greater
sound dispersion. To gain an ecological advantage, during physical conflict, “animals in
an aggressive stance would try to appear larger both visually (e.g., erecting their feathers,
raising their tail) and vocally, by producing lower-pitched sounds, associated with larger,
stronger animals” (Eitan, 2013).4
Because a sound’s volume (or loudness) and pitch range are reliable indicators of the
source’s size, our responses to these sonic parameters conform to adaptive mechanisms.5
For instance, when creating a threatening display, we produce sounds that make us appear
large; but when coming across as friendly or submissive, we produce sounds that help project
ourselves as small. According to this logic, and as depicted in Figure I.3, sounds representing
large or threatening characters, objects, or situations would lie in the low-frequency range
and have loud volumes, whereas sounds representing small or amiable characters, objects, or
situations would lie in the high-frequency range and have soft volumes (Huron et al., 2006).6
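The size-and-threat logic described above can be sketched as a toy classifier. This is only an illustrative sketch, not a model taken from Huron et al. (2006): the function name, the 200 Hz and 70 dB cut-offs, and the output labels are all hypothetical assumptions chosen for demonstration.

```python
def perceived_character(frequency_hz: float, level_db: float,
                        pitch_cutoff_hz: float = 200.0,
                        loudness_cutoff_db: float = 70.0) -> str:
    """Toy mapping from pitch register and loudness to a perceived character.

    The cut-off values are hypothetical and serve only to illustrate the
    low-and-loud versus high-and-soft contrast described in the text.
    """
    is_low = frequency_hz < pitch_cutoff_hz
    is_loud = level_db >= loudness_cutoff_db
    if is_low and is_loud:
        return "large/threatening"
    if not is_low and not is_loud:
        return "small/amiable"
    # Mixed cues (low and soft, or high and loud) are left undecided here.
    return "ambiguous"
```

Under these assumed cut-offs, a loud sound in the low register is labeled large or threatening, a soft sound in the high register is labeled small or amiable, and mixed cues are deliberately left unresolved, since the text makes no claim about them.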
Decoding Affect
Psychologists and philosophers have long intuited a functional mechanism of intersubjec-
tive empathy that rests on intercorporal resonance—i.e., mapping observed gestures, pos-
tures, actions, or vocalizations of others into our own bodies. James (1890) notes that “every
representation of a movement awakens in some degree the actual movement which is its
object” (p. 526); Lipps (1903) recalls his bodily and emotional attuning while watching an
acrobat walk on a suspended wire and mentions, “in inner imitation there is no separation
between the acrobat up above and me below. On the contrary, I identify myself with him.
I feel myself in him and in his place” (as cited in Agosta, 2014, p. 61); and Merleau-Ponty
(1962) notes that:
Nearly a century later, research in neuropsychology identified neural substrates that ena-
ble such intersubjective empathy: the mirror neuron system (MNS), a network of neurons that
respond to both perception and action. Mirror neurons were first identified in a series of experi-
ments on macaques;8 these neurons fired when the animal executed an action and when the
animal observed an analogous action performed by the experimenter. Subsequent investiga-
tions, also in macaques, found audiovisual mirror neurons that fired when the animal performed
an action and when the animal heard the related sound of that same action (Kohler et al., 2002).
While the discovery of mirror neurons in macaques took place via recordings of single
brain cells, non-invasive techniques have been used to confirm a homologous system in
humans—the methods most widely used include functional magnetic resonance imaging
(fMRI) and transcranial magnetic stimulation (TMS).9 Although these methods lack the spa-
tial and temporal accuracy of single-cell recordings, the results gathered by these studies
strongly support the existence of an action-perception link, or mirror system.10 These studies
indicate that humans engage a mirror system within a wide range of conditions, subliminally
activating these neural structures in response to: (1) a self-performed action or an observed/
heard action performed by someone else;11 (2) a self-experienced emotion or an observed/
heard emotion experienced by someone else; and (3) a self-experienced sensation or an
observed/heard sensation experienced by someone else.12
Advancing the findings related to the human mirror system, Gallese (2010) notes that:
others’ behavior is immediately meaningful because it enables a direct link to our own
situated lived experience of the same behaviors, by means of processing what we per-
ceive of others (their actions, emotions, sensations) onto the same neural assemblies pre-
siding over our own instantiations of the same actions, emotions, and sensations.
(p. 87)
The human mirror system thus mediates an intersubjective process, allowing the actions,
gestures, postures, sensations, and emotions of others to resonate within us, and supports a
neurally grounded understanding of human empathy.
While we subliminally engage the human mirror system in response to heard or seen
actions, gestures, and postures, we also engage a parallel mechanism that inhibits our
overtly enacting those actions, gestures, and postures (Hurley, 2008; Keysers, 2011). As Gal-
lese (2009) explains:
when the action is executed or imitated, the cortico-spinal pathway is activated, leading
to the excitation of muscles and the ensuing movements. When the action is observed or
This inhibitory mechanism prevents us from automatically mimicking others’ actions, ges-
tures, and postures, and succumbing to complete emotional contagion.
On the other hand, social phenomena that engage behavioral contagion (e.g., applauding,
following a crowd’s gaze, singing along) rely on the mirroring system to
build a collective consciousness, setting up the conditions and environments that prompt
us to relax those inhibitory mechanisms, allowing us to engage physically and emotion-
ally with a crowd of strangers.14 Arguably, therefore, relaxing the inhibitory mechanisms
in certain conditions has its roots in our evolutionary history, conditioning us to respond
to a crowd’s behavior to promote ingroup bonding and ensure survival (De Waal, 2007).15
Like other submechanisms of the human MNS, behavioral contagion may be elicited
through both visual and auditory stimuli, and it may reveal itself in overt actions, or it may remain
covert. Moreover, group size is a strong determinant of behavioral contagion’s effect—the
larger the crowd looking up, the stronger our impulse to mirror its behavior.16 In everyday
settings, behavioral contagion is manifested as the spontaneous spreading of group behav-
ior that serves to coordinate affect and affiliation, playing a vital role in establishing group
identity and co-regulation in settings such as sporting events, religious ceremonies, warfare,
and rock concerts.17
Notes
1. Keysers (2011) further notes that “just as people can activate their premotor cortex without moving
their own hands when seeing someone else grasp a ball, they can also activate these higher order
motor areas when viewing facial expressions without necessarily moving their face” (p. 115).
2. Just as premotor mirror neurons discharge to both the sight and the sound of an agent’s actions,
the insula responds both to the sight of facial expressions denoting affective states and to sounds
denoting affective states.
3. For additional research on acoustic signatures see Hammerschmidt and Jürgens (2007); Juslin and
Laukka (2003); Owren and Bachorowski (2007).
4. For a systematic review addressing the ecological and evolutionary bases for sound production see
Eitan (2013).
5. Huron et al. (2006) draw on an ethological perspective to explore this phenomenon. In their
study, listeners judged a melody in the low register with loud volume as denoting aggression or
threat, and the same melody in the high register and with soft volume as denoting friendliness.
Eitan and Granot (2006) also found a strong correlation between visuospatial features (size, shape,
and height) and musical parameters (dynamics, pitch contour, pitch intervals, attack rate, and
articulation).
6. Our human tendency to integrate visual and aural features appears to be innate (or at least
prelinguistic). For additional insights on the integration of visual and aural perception see Appendix V in
this volume.
7. Whereas proprioception is the conscious perception of our body image, the mechanism described
here circumvents any conscious processes. Proprioception, therefore, does not play a role in this
model of musical empathy. For insights on proprioception and music see Acitores (2011).
8. Two studies laid the groundwork for these discoveries: Gallese et al. (1996); Rizzolatti et al. (1996).
9. For experimental research using non-invasive techniques to identify a mirror neuron system in
humans see Aziz-Zadeh et al. (2004); Buccino et al. (2005); Iacoboni et al. (1999).
10. Given this shortcoming, the notion of a homologous mirror neuron system in humans has trig-
gered much criticism. A relatively recent study using single-cell recordings in humans, however,
purports to have found direct evidence of mirror neurons in humans (Mukamel et al., 2010).
11. Studies, both in primates and humans, suggest that the sight of actions (Buccino et al., 2004), the
sight of facial gestures (Dimberg et al., 2000), and the sound of vocalizations (Neumann & Strack,
2000) cause an automatic and subliminal activation of the neural structures involved in our own
execution of those actions, facial gestures, and vocalizations. Lima et al. (2016) offer supporting
evidence for the spontaneous activation of sound-related motor representations during auditory
perception.
12. Several studies have shown that we activate regions in the insula both when experiencing an
emotion or sensation and when viewing others experiencing an emotion or sensation; these find-
ings contribute to constructing a neural representation of shared affective experiences (Carr et al.,
2003). These mirroring mechanisms in the insula have been observed in relation to the perception
of disgust (Wicker et al., 2003), pain (Singer et al., 2004) and touch (Molenberghs et al., 2012).
13. These mechanisms, which we develop relatively early in life, take place below the level of
consciousness and are involuntary. At a musculoskeletal level, experiments using throat-audio and
larynx-electromyography (EMG) show that auditory stimuli (particularly voices and music)
activate subvocal articulation, even when listeners are not actively attending to or aware of the stimulus (Brodsky
et al., 2008). And at a more integrated (neural/embodied) level, experiments using behavioral
responses (e.g., finger tapping) as well as neuroimaging (fMRI) show that our brains and bodies
subliminally synchronize to periodic rhythmic stimuli (Mayville et al., 2002). Although actual
movement or emotional expressions remain generally covert, in cases where inhibitory
mechanisms are damaged, movement or vocalizations may manifest overtly. Numerous studies report on
individuals with prefrontal cortical lesions who are unable to engage this inhibitory mechanism,
resulting in unintentional and overt copying of others’ actions; this disorder is labeled
‘echopraxia’ in psychiatry. Kinsbourne (2005) notes, however, that only actions that attract attention and
provoke arousal trigger echopraxia; for instance, “[while] babies do imitate simple facial gestures
. . . [they] do not imitate every movement that happens around them” (p. 166).
14. While early research framed behavioral contagion in terms of deindividuation (Festinger et al.,
1952) and release restraint (Wheeler, 1966), more recent studies link behavioral contagion to the
human MNS (Lee & Tsai, 2010). To date, however, there is relatively little research on the neural
substrates of behavioral contagion.
15. This balance between subliminal mirroring and inhibition in empathy and social cognition is thus
critical for ingroup affiliation and communication between conspecifics (Juslin & Västfjäll, 2008;
Lakin et al., 2003).
16. Milgram et al. (1969) observed a correlation between the size of a crowd standing on a street look-
ing up and the number of passersby who adopted a crowd behavior and looked up.
17. For further insights on behavioral contagion related to applauding see Freedman et al. (1980); for
gaze-following see Milgram et al. (1969); and for singing along see Bargh and Chartrand (1999).
APPENDIX II
Conceptual Metaphor and Image Schema
Metaphors are ubiquitous in all types of discourse, from everyday expressions to scholarly
prose, from technical manuals to multimedia advertisements. They are persuasive rhetorical
devices with explanatory and expressive power—explanatory because they help us grasp
abstract ideas through concrete experiences, expressive because they capture and highlight
objective reality while leaving ample space for creative and subjective interpretations.
In this appendix, we first explore the notion of metaphor through the lens of Lakoff and
Johnson’s Conceptual Metaphor Theory. Then we examine the role of image schemas within
metaphorical mappings and attend in more detail to three selected schemas (CONTAINER,
LINEARITY, and SOURCE-PATH-GOAL), delving into their spatial logic, cultural variability, and
usage within musical discourse and scholarship. Insights from experimental research in cog-
nitive psychology and neuropsychology interspersed through the discussion validate these
ideas from a scientific perspective and support the framework for meaning construction
through film music I propose in Chapters 2, 3, and 4.
Metaphor
In its most basic form, a metaphor is a false categorization. We recognize the statement
“John is a tiger” as a metaphor because ‘John’ is not a member of the ‘tiger’ category.
Understanding the meaning of this metaphor, however, is contingent upon our ability to
recognize correspondences between its components—‘John’ and ‘tiger’—and map salient
features between them: aggressive, untamed, or ferocious behavior.1 Figure II.1 outlines this
essential cognitive mechanism of metaphor interpretation.
Metaphors are conceptual and socio-cultural phenomena not bound by language, and
they can, and often do, highlight correspondences between various sensory modalities.2
This led Lakoff and Johnson (1980) to suggest that metaphorical thinking is foundational to
human thought and to “a more inclusive theory of human representations of reality” (p. 26).
However, only when such correspondences extend to suggest semantic entailments may we
fully capture a metaphor’s expressive potential.3 In Köhler (1929), we find one of the earliest
references to a shape-sound correspondence and a discussion of its semantic implications.4
He presented participants with shapes similar to those in Figure II.2 and noted that they
systematically associated angular shapes with ‘takete’ and round shapes with ‘maluma’,
nonsense words projecting distinct sonic characteristics.5
Directionality of Mappings
Kövecses (2002) argues that the source-to-target ‘unidirectional’ mapping stems from une-
qual degrees of abstraction between conceptual domains: “Our experiences with the physi-
cal world serve as natural and logical foundations for the comprehension of more abstract
domains. This explains why in most cases of everyday metaphors the source and target are
not reversible” (p. 6). Some scholars, however, challenge such unidirectionality in meta-
phors, arguing that in metaphors comparing two concrete or two abstract domains, the order
of domains defines the directionality and hence the mapped features.7 For instance, the met-
aphors “That surgeon is a butcher” and “That butcher is a surgeon” compare equally con-
crete domains; the altered order (and implied directionality), however, drastically changes
the projected features from source to target—that is, the former metaphor generates map-
pings of ‘careless’ and ‘brutal’, while the latter generates mappings of ‘precise’ and ‘skillful’.
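The effect of word order on the projected features can be sketched with a small lookup. This is a hedged illustration rather than a model proposed by the scholars cited: the dictionary `SALIENT_FEATURES` and the function `interpret` are hypothetical names, and the feature lists simply restate the two examples given above.

```python
from typing import Dict, List

# Hypothetical store of salient features, restating the examples in the text.
SALIENT_FEATURES: Dict[str, List[str]] = {
    "butcher": ["careless", "brutal"],
    "surgeon": ["precise", "skillful"],
}


def interpret(target: str, source: str) -> List[str]:
    """Return the features projected onto `target` by "That <target> is a <source>".

    Only the source domain contributes features, so swapping the two
    arguments changes the result even though both domains are concrete.
    """
    return list(SALIENT_FEATURES[source])
```

Calling `interpret("surgeon", "butcher")` yields the features of the butcher, whereas `interpret("butcher", "surgeon")` yields those of the surgeon, so the order of the domains alone determines what is mapped.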
primed participants with physical experiences of warmth, distance, hardness, weight, and
roughness, and found that such priming strategies strongly influence social judgments, lead-
ing them to advance the notion of “a cognitive architecture in which social-psychological
concepts metaphorically related to physical-sensory concepts—such as a warm person, a
close relationship, and a hard negotiator—are grounded in those physical concepts, such
that activation of the physical version also activates (primes) the more abstract psychological
concept” (p. 268).11
Kövecses (2002) offers a broader perspective, tracing the origin of mappings to percep-
tual, biological, or cultural experiences: [MORE] IS [UP] (“Turn the radio down”) stems from
correlations of perceptual experiences, such as adding fluid to a container and observ-
ing its level rise; [ANGER] IS [HEAT] (“I am boiling with anger”), stems from correlations of
biological experience, such as feeling internal temperature changes resulting from mood
changes; and [LIFE] IS [A GAMBLING GAME] (“It is a roll of the dice”) stems from correlations
of cultural experiences, such as defining our actions as social activities that involve luck and
randomness. This broader perspective regarding the origin of metaphors foregrounds that
cultural and linguistic variations often highlight a tension between physically and cultur-
ally grounded metaphors. To resolve this tension, Ibarretxe-Antuñano (2013) proposes the
‘culture sieve’ model, a two-stage process in which “bodily-grounded experiences that con-
tribute to understanding and motivating the metaphorical mappings between the source and
target” are subsequently “purged, adapted, and modified by the cultural information avail-
able” (p. 324).12 Aware of this tension between universality and cultural variation, Dewell
(2005) warns us that although “semantic meanings are grounded in prelinguistic cognition
that is real and patterned and largely universal” we must avoid “the implication that the
meaning of language constructions can be reduced to pre-existing universal representa-
tions” (p. 385).
Image Schemas
Image schemas are integral to metaphorical constructions. These emerge from recurrent
embodied interactions with the environment, capturing regularities in our sensorimotor
experiences as spatial and bodily logics—placing objects within a CONTAINER, travers-
ing a LANDSCAPE from a SOURCE through a PATH and toward a GOAL. Schemas thus
become mental constructs that enable metaphorical mappings, thereby shaping our per-
ception, discourse, and even our abstract thoughts.13 For example, the conceptual meta-
phor [RELATIONSHIPS] ARE [JOURNEYS]—instantiated in “Marriage is a long and bumpy
road”, “Our relationship is moving along in the right direction”, and “We may have to
go our separate ways”—activates the SOURCE-PATH-GOAL image schema and serves
as a vehicle to understand the abstract notion of relationships in terms of our concrete
experiences of journeys.
Given the abstract nature of music, it is not surprising that we conceptualize it using met-
aphors that draw on image schemas.14 In fact, descriptions of musical phenomena activate
many image schemas, such as in “The clarinet enters [CONTAINER, PATH] the orchestral tex-
ture with a rising [VERTICALITY, LINEARITY] melody traversing [PATH] two octaves, signaling
the imminent arrival [PATH-GOAL] of a closing cadence in a distant [LANDSCAPE] key area
[LANDSCAPE]”. Such prose, filled with metaphors that activate image schemas, is arguably
necessary to communicate and share interpretations of music.15
CONTAINER Schema
The CONTAINER schema emerges from concrete embodied experiences with bounded
regions of space, such as interacting with their contents (placing cookies in a jar, filling a
cup) or crossing their boundaries (walking out of a building, diving into the ocean, getting in
the car). Figure II.4 presents a diagram of the CONTAINER schema, which includes a region
of space and a boundary delineating an interior and exterior.
Lakoff and Johnson (1980) regard our body as the primary ‘container’, as we are “bounded
and set off from the rest of the world by the surface of our skins . . . [projecting] our own
in-out orientation onto other physical objects that are bounded by surfaces” (p. 29). Once
we establish the CONTAINER schema via concrete embodied experiences, we overlay its
spatial logic onto abstract phenomena that objectively lack spatial properties. This allows
us to grasp a wide array of abstract thoughts and phenomena, from actions framed by cul-
tural constructs (such as crossing national borders or entering a new century) to metaphori-
cal expressions that extend beyond the empirically plausible (“My heart is full of love”,
“I passed out”, “We are opening an investigation”, “I’ll keep that in mind”).16
Although seemingly universal, the CONTAINER schema is not free from cultural variability or
unfettered by the influence of language. For instance, while English speakers use ‘in’ and ‘on’
to denote spatial relationships (which draw on the CONTAINER and SUPPORT schemas,
respectively), Korean speakers use the verb kkita to mark gradations of ‘tight’ versus ‘loose’ fit
when describing the same spatial relationships (Aurnague et al., 2007; Choi et al., 1999). Vandeloise
(2007) draws on the CONTAINER schema to illustrate a similar phenomenon, noting that in
English, blended mixtures are often (albeit erroneously) conceptualized in terms of CONTAIN-
MENT; when using sentences such as “There is milk in the coffee”, we reinforce a false cultural
frame in which the coffee, as the more important of the two liquids, is the container, and the
milk, as the lesser liquid, is the contained. These examples illustrate that, because schemas
ground linguistic and semantic meaning, they shape our perception of reality.
level rise).17 As diagrammed in Figure II.5, both exhibit a one-dimensional continuum; but
while VERTICALITY denotes a specific orientation in space (perpendicular to the horizon),
LINEARITY lacks a specific orientation.
Once established through concrete embodied experiences, the LINEARITY and VER-
TICALITY schemas permeate our understanding of abstract concepts by scaffolding their
continuous one-dimensional structures (with or without a predetermined vertical orien-
tation). This allows us to construct metaphors where the orientation is ambiguous or
altogether absent (“He has mood swings”, which draws on LINEARITY) or where the ver-
tical orientation plays a vital role (“The rising violence in the city”, which draws on
VERTICALITY).
The LINEARITY schema provides a more suitable alternative to the VERTICALITY schema
for conceptualizing a wide range of abstract phenomena, as it allows us to reveal their
one-dimensional linear quality while circumventing possible orientation constraints. By drawing
on concrete experiences framed by one-dimensional continua, the LINEARITY schema thus
shapes our perception and cognition of abstract notions, including socio-cultural ones such
as social class (low to high) and interpersonal affection (warm to cold), as well as musical
ones such as pitch (low to high), tempo (slow to fast), and volume (soft to loud).
SOURCE-PATH-GOAL Schema
The SOURCE-PATH-GOAL schema (SPG) emerges from concrete embodied experiences fea-
turing a start-point, a path, and an end-point, such as going from home to work via our usual
route. Its spatial logic, diagrammed in Figure II.6, comprises a starting-point (A), an ending-
point (B), a path connecting the two, and the implicit movement from A to B.18
Once established, the SPG schema provides robust explanatory power for abstract con-
structs featuring a start-point, a path, and an end-point. To better grasp the abstract notion
of ‘relationships’, for instance, we use expressions such as “Our relationship has hit a dead-
end”, “We may have to go our separate ways”, “It’s been a long and winding road”, all based
on the [RELATIONSHIPS] ARE [JOURNEYS] conceptual metaphor grounded in the SPG schema.
Analogously, we resort to the spatial logic of the SPG schema to account for the syntagmatic
unfolding of narrative events. Metaphors that describe narrative developments, for instance,
include expressions such as “The characters embarked on a journey”, “They encountered
obstacles along the way”, and “The protagonist became side-tracked”.
The SPG schema often interacts with other schemas that complement its structure, result-
ing in multiparametric blends that more truthfully capture the essence of a metaphor’s con-
crete and abstract domains. For instance, although not a metaphor in itself, the English word
‘into’ simultaneously activates the CONTAINER schema (the ‘in’ portion) and the SOURCE-
PATH-GOAL schema (the ‘to’ portion), as diagrammed in Figure II.7. Combining the spatial
logic of these two schemas allows us to describe concrete actions (walking into a building,
diving into the ocean) and grasp abstract ideas (fading into oblivion, signing into your email
account).
Notes
1. Reference to context is critical in identifying and understanding metaphors, as contextual clues
can transform a factual statement into a metaphor and vice versa. For example, although “John is
a tiger” may justifiably seem a metaphor, in the context of sports, this apparently false categoriza-
tion could be a factual one—that is, that John either belongs to or roots for the Detroit Tigers (or
one of the many tiger-themed sports teams).
2. Linguistic cross-modal metaphors can evoke various sensory modalities. For example, in “The
dark sound of thunder”, the semantic sphere of ‘dark’ activates our sight while ‘sound of thunder’
activates our hearing.
3. Our ecological and evolutionary biology drives our tendency to attend to congruent stimuli
across senses. Bregman (1990) identifies this tendency in newborns, who “spend more time look-
ing at a face that appears visually to be speaking the same words that it is hearing than one that
is not”, suggesting this is evidence that “the grouping of sounds can influence the grouping of
visual events with which they are synchronized and vice versa” (p. 653). Similarly, Welch’s (1999)
‘unity assumption’ describes the human tendency to attend to structurally congruent stimuli; he
speculates that humans are “vested in maintaining congruence in their perceptual world” (p. 373)
and hence in all modes of perception. Related to film music, Marshall and Cohen’s (1988) study,
particularly their Congruence-Association Model, supports the claim that “attention is directed to
the overlapping congruent meaning of the music and film” (p. 109); although this study has been
widely cited in the film music literature, much research exploring cross-modal and cross-domain
correspondences using innovative (and more robust) experimental paradigms has challenged its
findings (e.g., Lipscomb, 1995; Millet et al., 2021; Sirius & Clarke, 1994). Moreover, many studies
identify or corroborate cross-domain correspondences without delving into the potential seman-
tic entailments of these correspondences. Velasco et al. (2016) tackle taste-vision correspond-
ences and examine the extent to which a food package’s shape (straight, curvy, symmetrical,
asymmetrical) influences the consumers’ taste perceptions and expectations; their results broadly
suggest that packages with round contours induce expectations of sweet flavors, while sharp
edges induce expectations of bitter flavors. Sweeny et al. (2012) focus on visual-aural correspond-
ences that stem from systematically seeing mouth shapes (vertically versus horizontally stretched)
and hearing particular sounds (‘wee’ versus ‘woo’); their results complement the McGurk effect
(where what we hear is shaped by what we see) by suggesting the opposite phenomenon, that
“speech sounds influence visual perception of a basic geometric feature . . . what we see is
shaped by what we hear” (p. 199).
4. When identifying and corroborating cross-modal and intra-modal correspondences, empirical
and experimental studies in cognitive musicology echo the studies (and results) from the broader
cognitive sciences (e.g., Collier & Hubbard, 2001; Eitan & Granot, 2006; Gjerdingen, 1994;
Lipscomb & Kendall, 1994). Music cognition studies that venture into the neurosciences also
echo the corresponding claims and results, suggesting that conceptualizing music and musical
meaning recruits somatomotor and somatosensory cortical areas (e.g., Platel et al., 1997; Zatorre
et al., 1999). Most studies in music cognition, however, explore these phenomena by presenting
participants with stimuli that isolate a single musical or sonic parameter (such as single notes,
scales, beats), stimuli best characterized as ‘sounds’ rather than ‘music’; nevertheless, this body of
research has been the starting point of a journey leading to the theories I put forth in this volume.
5. Many studies extend Köhler’s work, tracing the origin of the cross-modal correspondences he
identified while speculating about their universality, ontogeny, and evolutionary implications.
Several studies argue that this cross-modal correspondence rests on the covert activation of vocal
articulatory movements (e.g., Deroy & Auvray, 2013; Galantucci et al., 2006). On the other hand,
Ozturk et al. (2013) replicated Köhler’s study with infants, using the words ‘kiki’ and ‘bubu’, and
their results suggest that such associations are preverbal. In exploring the cultural variability of this
cross-modal correspondence, Ramachandran and Hubbard (2001) also replicated Köhler’s experi-
ment, this time with American and Indian participants and using the words ‘kiki’ and ‘bouba’;
their results suggest that such shape-sound correspondences are to some degree cross-cultural.
Bremner et al. (2013) further elaborate on the cross-cultural pervasiveness of this correlation.
6. The use of capitals for conceptual metaphors and image schemas is standard practice within this
field of scholarship; the use of brackets for target and source domains is not, but it helps delineate
the components of a conceptual metaphor.
7. Ortony (1993), for instance, argues that the mapped features define the directionality, because
these are salient in the source, not the target.
8. Bundgaard (2009) notes that some abstract targets are consistently, and at times exclusively,
framed through single concrete sources—for instance, the target TIME is generally framed by the
source SPACE.
9. Johnson (2007) suggests that both domains entering a metaphor, the concrete and the abstract,
share neural substrates, that “the body must recruit neural structures central to sensory and motor
processing to carry out the inferences that make up our abstract patterns of thinking. Structures
of perceiving and doing must serve as structures of thinking and knowing” (p. 26, emphasis in
original). To support this argument, some scholars turn to computational models of neural networks
(Feldman & Narayanan, 2004); such models, however, redirect the conversation away from
embodiment and toward logical and computational paradigms typical of the first wave of cogni-
tive psychology.
10. Such cross-domain co-activations at the early stages of life arguably form the basis of conceptual
metaphors such as [DESIRE] IS [HUNGER] (“She is starving for attention”). Johnson and Lakoff (2002)
note that “we first acquire the bodily and spatial understanding of concepts and later understand
their metaphorical extensions in abstract concepts” (p. 254). Asch and Nerlove (1960) provide
evidence consistent with Johnson and Lakoff’s claim; in their study, children initially developed a
command of words like ‘warm’ or ‘cold’ within the context of objects, and only later applied these
concepts to individuals using qualifiers such as ‘warm’ or ‘cold’ to describe their personality.
11. Although insufficient to draw conclusive results, a growing body of research suggests that humans
recruit sensorimotor brain areas for abstract reasoning and semantic comprehension. For instance,
using event-related fMRI, Hauk et al. (2004) found that action words associated with particu-
lar body parts (‘lick’, ‘pick’, ‘kick’) activate areas in the premotor cortex directly adjacent to (or
overlapping) the areas activated by the movement of the corresponding body parts. More recent
work within cognitive neuropsychology merging theoretical, meta-analytical, and computational
accounts explains this phenomenon as a co-activation or re-activation of neural substrates dur-
ing metaphor or image schema processing. For instance, Anderson’s (2010) ‘neural reuse’ theory,
which draws on a database of thousands of fMRI studies, proposes that both domains in a meta-
phor (the abstract and the concrete) share neural substrates. He argues that “conceptual-linguistic
understanding might involve the reactivation of perceptuomotor experiences . . . [in metaphorical
mappings] understanding in one domain would involve the reactivation of neural structures used
for another” (p. 254)—in other words, when unpacking a metaphor to grasp abstract ideas, we
must reactivate neural structures previously developed for concrete experiences.
12. Abril (2001) studied how bilingual children discern musical registers, noting English terminol-
ogy uses ‘high/low’—i.e., drawing on the VERTICALITY schema—while Spanish terminology
uses ‘agudo’ (sharp) and ‘grave’ (serious)—i.e., deviating from both VERTICALITY and LINEARITY
schemas.
13. Talmy (1983) and Langacker (1987) were among the first to put forth the notion of image schema.
Johnson (1987) reformulated the notion of image schema as “a dynamic cognitive construct that
functions somewhat like an abstract structure of an image and thereby connects together a vast
range of different experiences that manifest this same recurring experience” (p. 2). Here, I use
the terms ‘schema’, ‘embodied schema’, and ‘image schema’ interchangeably, and use ‘schemas’
instead of ‘schemata’ to denote the plural.
14. Many musicologists and music theorists have drawn on the notions of conceptual metaphor and
image schema to formulate broad frameworks for musical thinking and musical meaning (e.g.,
Chattah, 2006, 2015; Cook, 1998; Cox, 1999, 2016; Johnson & Larson, 2003; Larson, 2012;
Saslaw, 1996; Spitzer, 2004; Zbikowski, 1998, 2002). Larson (2012), for instance, focuses on
musical ‘forces’ such as ‘gravity’, ‘magnetism’, and ‘inertia’, understood as recurrent bodily ex-
periences of motion. Cox’s (2016) ‘mimetic hypothesis’ takes conceptual metaphor as a point of
departure in recognizing the “role of physical imitation in everyday cognition, and on the various
forms of physical imitation in music cognition” (p. 3). And Zbikowski (2002), perhaps the most
significant contribution toward systematizing the music-theoretical understanding of metaphor,
extends cross-domain mappings to include Fauconnier and Turner’s (2002) Conceptual Blending
theory, which we further explore in Appendix VII in this volume.
15. Just like metaphors, schemas are not bound by language. For instance, the various diagrams in-
cluded in this and other chapters (single lines, arrows, cubes) activate the very schemas they
represent.
16. While schemas emerge from recurrent embodied experiences, cultural practices often frame these
experiences.
17. At the neural level, establishing the VERTICALITY schema may entail the simultaneous neural acti-
vation of areas dedicated to two unrelated phenomena (such as pouring liquid in a container and
observing its level rise) where “experiential conflation has no semantic motivation and is solely
identified as simultaneous activation of distinct parts of the brain. Frames or domains experienced
together are temporally neurally bound: they fire in synch” (Brandt, 2009, p. 66).
18. The SPG schema is analogous to the LINEARITY schema in its one-dimensional structure, while
additionally featuring a defined orientation, from ‘source’ to ‘goal’.
APPENDIX III
Affordances
James Gibson challenged the prevailing paradigms in psychology, which held that our minds
construct representations of the world around us and that perception is a passive and recep-
tive mechanism unaffected by our environments. He advanced an ecological approach to
psychology, arguing that perception does not rely on cognitive or symbolic constructions
and that our senses develop functional adaptations in response to changes in our environ-
ments. Gibson’s viewpoints and arguments, particularly his notion of ‘affordances’, rapidly
attracted the attention of researchers in the cognitive and social sciences, who embarked on
an empirical exploration of his novel conceptual frameworks. The findings from these studies
have in turn informed recent advances within widely diverse areas of knowledge, including
artificial intelligence, robotics, industrial design, human factors, architecture, and the arts.
In this appendix, I briefly expound on the notion of affordances, summarize relevant
insights from the cognitive sciences, and survey its recent application within music-related
scholarship. In Chapter 5, I further extend Gibson’s argument by claiming that film music
affordances not only allow but also invite a wide array of potential (inter)actions—during a
film, we ‘resonate’ with the sonic environment, attuning our perceptual experiences to the
music’s affordances, and in turn, these latent or actualized affordances influence our bodily
responses and guide our interpretations of the narrative.
Theory of Affordances
An ecological account of perception rejects the notion that our senses passively transform
information from the outside world into neural patterns; instead, it contends that our senses
actively detect and interpret ecologically relevant information—we do not simply see; we
recognize opportunities for (inter)action. Gibson (1979) coined the notion of ‘affordances’
to denote the potential (inter)actions the environment offers us—a chair affords us some-
thing to sit on, an enclosed space affords us shelter, a bridge affords us an opportunity to
traverse a distance between two points.1
Affordances combine perception and action into a unified and interdependent process.
Perceiving potential (inter)actions means perceiving affordances—the kick-ability of a ball,
the grasp-ability of a mug, the climb-ability of a hill.2 Because such potential (inter)actions
rest on a reciprocal relationship between an organism’s capacities and its environment’s
material properties, affordances are relative to and co-specified by both these variables. For
instance, a body of water affords a support surface to a water strider bug, but not to a human.
Furthermore, despite our environments’ relatively stable structural properties and our rela-
tively stable bodily capacities, new (and creative) affordances emerge based on our needs,
goals, or intentions.3 As a result, affordances are in constant flux—although a chair affords
us to sit on it, the same chair may afford us to hang our coats, fight an intruder, lock a door,
or step on it to reach a ceiling fixture.
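The claim that affordances are co-specified by organism and environment can be illustrated with a minimal sketch. It is a loose computational analogy, not a formalization proposed by Gibson. The class names, the function `affords_support`, and all numeric values are hypothetical and chosen only for illustration. The sketch also ignores needs, goals, and intentions, which the text identifies as a further source of new affordances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    name: str
    weight_newtons: float  # illustrative value, not a measurement


@dataclass(frozen=True)
class Surface:
    name: str
    max_load_newtons: float  # illustrative value, not a measurement


def affords_support(organism: Organism, surface: Surface) -> bool:
    """Return True when the surface can bear the organism's weight.

    The answer depends on both arguments, so the affordance belongs
    to neither the organism nor the surface alone.
    """
    return organism.weight_newtons <= surface.max_load_newtons


# Hypothetical organisms and surfaces (assumed values).
water_strider = Organism("water strider", 0.0001)
human = Organism("human", 700.0)
pond = Surface("pond surface", 0.001)
floor = Surface("wooden floor", 5000.0)

for organism in (water_strider, human):
    for surface in (pond, floor):
        result = affords_support(organism, surface)
        print(organism.name, "on", surface.name, "->", result)
```

Under these assumed values, the same pond surface yields a different answer for each organism, which is the sense in which an affordance is relative to both variables.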
In advancing and refining his ecological approach to cognition, Gibson (1979) argues
that because we attune our perception so strongly to our intended motor actions, “what we
perceive when we look at objects are their affordances, not their qualities” (p. 134). This
means that the environment’s affordances determine the attributes we perceive in it, that
functional significance (rather than material structure) shapes our perception. For instance,
we perceive a chair not as an object of a particular color or shape but as an object that
affords us something to sit on (or do something else with).4 Such interdependency of percep-
tion and action has inspired a wealth of empirical studies within cognitive psychology and
the neurosciences.5
many of the properties to which the brain is attuned are likely to be action-relevant
and relational . . . Interaction with an environment offering multiple affordances causes
regions of the brain to be differentially activated in accordance with their functional
biases.
(p. 8)
Moreover, Cisek’s (2007) ‘affordance competition hypothesis’ suggests that our brains simul-
taneously process sensory, motor, and cognitive information provided by regions in each
of the four lobes of the cortex and that “sensory information arriving from the world is
continuously used to specify several currently available potential actions” (p. 1586). Cisek’s
hypothesis thus supports the notion of affordances as ‘latent’ action potentials that “may or
may not be actualized on a given occasion, but it nonetheless represents a ‘real possibility’
of action” (Pinna, 2017).
Extended Affordances
Gibson (1979) reminds us that our environments afford us “a rich and complex set of inter-
actions, sexual, predatory, nurturing, fighting, playing, cooperating, and communicating”
(p. 128), indicating that affordances extend beyond the physical or material and onto the
social and intellectual realms. Even the material properties of our environments offer these
more abstract affordances. For instance, while we mainly use our hands and fingers to manip-
ulate objects, we also often use our fingers to perform basic mathematical operations.10
Our understanding of ‘environment’, however, should extend beyond material or physi-
cal attributes. Gibson (1979) remarks that:
it is a mistake to separate the natural from the artificial as if there were two environments
. . . to separate the cultural environment from the natural environment, as if there were a
world of mental products distinct from the world of material products.
(p. 130)
Music’s Affordances
Music’s affordances emerge from an interplay among three conditions: the music’s struc-
ture, our perceptual capacities, and our intentions. In terms of the music’s structure, the vast
number of musical parameters (e.g., timbre, dynamics, meter, harmony, articulation, density,
texture) bring about an almost limitless set of affordances. In terms of our perceptual capaci-
ties and intentions, Clarke (2005) reminds us of the wide variety of potential affordances of
music, which include “dancing, singing (and singing along), playing (and playing along),
working, persuading, drinking and eating, doing aerobics, taking drugs, playing air guitar,
traveling, protesting, seducing, waiting on the telephone, sleeping . . . the list is endless”
(p. 204). The last decade has seen an outpouring of scholarly explorations of music’s affor-
dances; these range widely in scope and perspective, accounting for our physical engage-
ment with music (performing, dancing, exercising) as well as our intellectual engagement
with it (analyzing, communicating, signifying).11
From a performative perspective, Huron and Berec (2009) explore how music affords
instrumentally idiomatic gestures; they note that:
From a similar performance practice perspective, Love (2017) explores how stylistic norms
define the musical environment and constrain the potential affordances; for example, when
improvising within the jazz style, the performer “perceives this environment in terms of its
affordances . . . [and] seeks opportunities for artistic display” (p. 31).
Krueger’s (2011) discussion of music’s affordances includes the potential for emotion
regulation, communicative expression, and identity construction. He suggests that music
affords us opportunities for:
creating and cultivating the self, as well as creating and cultivating a shared world that
we inhabit with others . . . Thinking of music as an affordance-laden structure thus reaf-
firms the crucial role that music plays in constructing and regulating emotional and social
experiences in everyday life.
(p. 1)
Clarke (2005), in turn, redirects the conversation toward an intellectual engagement with
music and fully frames the notion of affordances as the potential for analytical interpretation
and musical criticism. In exemplifying such an account of music’s affordances, he notes that:
Notes
1. The ‘material’ environment is here understood as consisting of all surrounding objects, spaces,
and surfaces.
2. Gibson’s ecological approach aligns with the notion of an ‘embedded mind’ not dissociated from
natural, social, and cultural environments.
3. Norman’s (2002) “physical versus learned” categorization of affordances acquires particular relevance
in light of the current trend of “life hacks”, which entail the novel use of familiar artifacts
to arrive at practical solutions to common problems. The emergence of life hacks further under-
scores that the object-perceiver relationship is centered on, and relative to, an organism’s intended
actions.
4. Implicit in the notion of affordances is a critique of the Cartesian mind-body dualism. Gibson
(1979) suggests that perception does not elicit or give rise to neural representations of the account
of perception—that is, perception entails ‘resonating’ with the environment’s affordances, and
further notes that “exteroception is accompanied by proprioception . . . to perceive the world is
to coperceive oneself . . . The awareness of the world and of one’s complementary relations to the
world are not separable” (Gibson, 1979, p. 141).
5. Research in cognitive psychology and neuroscience, however, faces methodological and ontologi-
cal challenges in collecting and interpreting data to support Gibson’s notion of affordances. In
fact, a neural account of perception and action seemingly counters Gibson’s ecological account of
perception; that is, locating affordances as representations inside the brain conflicts with an ecological
program by addressing the very subject-object framework Gibson attempted to disprove.
For example, many studies in cognitive psychology deviate from Gibson’s ecological approach
by reinstating the subject-object and mind-body dualist views, framing affordances as the neural
‘representations’ or dispositions of motor actions (Declerck, 2013; Sakreida et al., 2016).
6. For studies in cognitive psychology that explore the affordances of tools see Bach et al. (2014);
Fagg and Arbib (1998); Kühn et al. (2014); Makris et al. (2013); Proverbio et al. (2013); Sakreida
et al. (2016).
7. For further insights on canonical neurons (which are closely related to mirror neurons) see Iaco-
boni et al. (2005); Rizzolatti and Fadiga (1998); Rochat et al. (2010).
8. Another type of bimodal neuron, the mirror neuron, fires both when performing an action and when
seeing or hearing someone else performing that same action. These bimodal neurons are founda-
tional to the model of musical empathy proposed in Chapter 1.
9. Such co-activations extend to aural perception (Osiurak et al., 2017). For instance, Chao and
Martin (2000) found evidence that seeing or reading the names of tools or manipulable artifacts
activates brain areas involved in grasping.
10. This type of cognitive off-loading is just one instance in which we use the environment’s affordances
to engage in more abstract behavior. Such cognitive off-loading by using “the structure of the en-
vironment and their operations upon it as a convenient stand-in for the information-processing
operations” (Clark, 1989, p. 64) has been observed exclusively in humans.
11. Reybrouck (2005) devises a taxonomy of music’s affordances. He introduces five categories of
affordances: “[i] the sound-producing actions proper, [ii] the effects of these actions, [iii] the pos-
sibility of imagining the sonorous unfolding as a kind of movement through time, [iv] the mental
simulation of this movement in terms of bodily based image schemata and [v] the movements
which can be possibly induced by the sounds” (pp. 258–259). In a subsequent study (2012),
he adds three further categories of affordances related to the performative aspects of music: “(i)
the production of musical instruments out of sounding material, (ii) the development of playing
techniques to produce musical sounds, and (iii) the shaping of the sound by using modulatory
techniques” (p. 404).
12. For additional insights on instrumental affordances see De Souza (2017).
APPENDIX IV
Memory
Memory intersects with nearly every component of our human experience, from engaging
in physical activities (riding a bike) to communicating (language) and even developing a
personal or collective identity. And although it seems paradoxical, while our memories are
grounded in the past, one of their primary roles is to predict and anticipate the future. This
appendix first surveys well-established theories about memory systems and then explores
current theories explaining memory’s ecological and evolutionary functions.
Record-Keeping
The record-keeping theory is what most of us think of as memory. Via the ‘storage’ metaphor,
this theory likens human memory to that of machines or recording devices.1 It proposes a
memory architecture that features three types of storage—sensory, short-term, and long-
term—each defined in terms of its modality, capacity, and duration (Atkinson & Shiffrin,
1968).
Sensory memory is modality-specific, holding a finite amount of perceptual information
that we may retrieve for a few seconds after exposure. In the auditory domain, this is called
‘echoic memory’, where aural stimuli may be ‘played back’ in our minds with relatively
good fidelity for a short time (Winkler & Cowan, 2005).2 After a few seconds, our sensory
memories fade; this information decay stems from (1) an extended time-lapse to a repetition
of the stimuli or (2) interference of competing information (Berman et al., 2009).3
Two other types of information storage within the record-keeping theory are ‘short-term’
and ‘long-term’ memory, each defined in terms of its temporal reach and capacity. Such
a division between short-term and long-term memory as separate storage locations aligns
with the memory architecture of computers as having separate short-term and long-term
memory modules. While short-term memory holds a limited amount of information for a
limited time, long-term memory holds vast amounts of information that we can retrieve or
access as needed, even years later.4
Given the difficulty in adequately explaining the process whereby we consolidate sen-
sory memories into short-term ones, and short-term memories into long-term ones, current
theories reject the idea of such processes altogether. Instead, alternative memory-systems
models contend that when performing memory-related tasks, we do not draw on a single
memory system but combine and integrate several of them.5
Working Memory
Baddeley and Hitch (1974) challenged the (then-dominant) model of short-term memory
as temporary information storage, proposing a set of real-time reconstruction processes
instead. In their “working memory” model, information from sensory memory is actively
(and often subliminally) manipulated via complex cognitive tasks (e.g., retrieving long-
term memory frames) to consolidate long-term memories.6 Therefore, working memory
rests on a subliminal interaction between sensory and long-term memories through which
we assimilate new stimuli into existing knowledge and thereby consolidate long-term
memories.7 Several mechanisms are vital to working memory: repeating (e.g., phonologi-
cal loop), chunking (based on gestalt principles), recognizing (based on schemata), acti-
vating (via priming and classical conditioning), and associating (via semantic and episodic
memory systems).8
Repeating and chunking involve bottom-up processing, and hence are not driven by inter-
pretative mechanisms. Repeating is critical in consolidating memories (Jacoby, 1978). For
instance, the ‘phonological loop’ is an essential mechanism of auditory working memory;
it engages subvocal rehearsal to keep aural information from decaying. Chunking entails
integrating discrete pieces of information into single, larger units (Mathy & Feldman, 2012).
For instance, when exposed to raw information, we engage gestalt principles to group that
information, to ‘chunk’ it.
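As a rough illustration of chunking, the following sketch groups a stream of note onsets by temporal proximity, one of the gestalt grouping principles. It is a simplified analogy under stated assumptions, not the model of Mathy and Feldman (2012) or of any other cited study. The function name `chunk_by_proximity`, the `max_gap` threshold, and the onset times are hypothetical.

```python
def chunk_by_proximity(onsets, max_gap=0.5):
    """Group sorted onset times (in seconds) into chunks.

    A new chunk starts whenever the gap to the previous onset
    exceeds max_gap (an assumed threshold, in seconds).
    """
    chunks = []
    for t in onsets:
        if chunks and t - chunks[-1][-1] <= max_gap:
            chunks[-1].append(t)  # close in time: same chunk
        else:
            chunks.append([t])    # long gap (or first onset): new chunk
    return chunks


# Six hypothetical onsets separated by two longer pauses.
onsets = [0.0, 0.25, 0.5, 1.5, 1.75, 3.0]
groups = chunk_by_proximity(onsets)
print(groups)  # [[0.0, 0.25, 0.5], [1.5, 1.75], [3.0]]
print(len(onsets), "onsets grouped into", len(groups), "chunks")
```

Six discrete onsets become three larger units, so there are fewer items to hold at once. The grouping here uses only the timing of the stimuli, in keeping with the bottom-up character of chunking described above.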
Long-Term Memory
Schacter and Tulving (1994) complicate the traditional conception of long-term memory as
passive retention, identifying multiple long-term ‘memory systems’. Their primary subdivi-
sion of long-term memory systems includes implicit (or non-declarative) and explicit (or
declarative) memories, each subdivided further.11
We engage our implicit memory when performing tasks that rely on habit or skill—riding
a bicycle, tying shoelaces, driving home from work. Implicit memory is “encoded during a
particular episode [and] subsequently expressed without conscious or deliberate recollec-
tion” (Schacter, 1987, p. 501); subsequently, it “does not involve conscious recollection but
instead reveals itself through behavior” (Eysenck & Keane, 2015, p. 286). Implicit memory is
understood through various systems: priming, classical conditioning, schemata, and scripts.
This subdivision notwithstanding, brain research shows that all long-term memory systems
are interdependent and rely on one another for consolidation and subsequent retrieval (Buri-
anova et al., 2010).
Priming and classical conditioning outline cognitive processes influenced by a previously
presented stimulus. In priming, exposure to a stimulus influences our behavior, response, or
perception of a subsequent stimulus. For example, primed with the color yellow and subse-
quently asked about fruits, most of us think of a banana or a lemon, ‘activating’ that concept
because of the perceptual relationship between stimulus and response. Besides such per-
ceptual relationships, priming paradigms may include semantic or conceptual relationships.
While priming subtly influences our response or behavior, classical conditioning more
directly elicits an associated response or behavior. Classical conditioning stems from estab-
lishing an association between two contiguously presented stimuli so that one evokes the
presence of the other.12 In Pavlov’s familiar experiment, the dogs associated the sound of
a bell with food; hence, the sound caused them to salivate. Soon after Pavlov published
the results of his experiments, much empirical research explored classical conditioning in
humans and within a wide range of settings, from restaurant aromas to routine immuniza-
tions to everyday cellphone use.13
Schemata and scripts, two other systems of implicit long-term memory, are abstract struc-
tures characterized by patterns of thought or behavior. Schemata emerge through recurrent
experiences of interaction with our environments, shaping our perception and behavior
(i.e., recognizing people, objects, or sounds). An experiment by Bartlett (1933) illustrates the
scaffolding power of schemata; in the experiment, British participants were presented with
a Native American folk story. When asked to recall the story, the participants’ recollections
omitted elements that were culturally foreign to them—they transformed the story to be
consistent with their own cultural knowledge, such as changing the word ‘canoe’ to ‘boat’.
This means that our prior experiences and memories shape our perception and contribute to
constructing new memories. In fact, we perceive and solidify information as new memories
more easily when it is similar to existing mental representations than when it diverges from
them.
Like schemata, scripts are also abstract structures, but ones that include an “appro-
priate sequence of events in a particular context” (Schank & Abelson, 1975, p. 170).14
Familiar examples of scripts include ‘riding a bike’ or ‘tying shoelaces’, which involve a
well-delineated chronological sequence of events, but ones that are difficult to verbalize.
Because of their chronological organization, scripts are often characterized as procedural
memory. While classical conditioning guides future behavior by associating two stimuli,
scripts instead guide future behavior (and predict future events) by engendering expectations
that concern the appropriate or most likely sequence of actions.
In contrast to implicit memory, explicit memory can be consciously and intentionally
retrieved. Scholarship on explicit memory subdivides it into semantic memory (collec-
tively shared knowledge that includes facts, concepts, and names) and episodic memory
(autobiographical information only accessible by a single individual, including events
and experiences). Semantic memory is what we commonly think of as knowledge that
rests upon culturally shared categorizations—countries’ capitals, sports’ rules, musical
genres.15 Whereas semantic memory draws on socially constructed and shared categories,
episodic memories draw on personal experiences; hence, we commonly regard these as
recollections—the foods we ate this morning, our first day at school.
Ecological/Evolutionary Perspective
The ecological/evolutionary approach to memory has coexisted with the record-keeping,
working memory, and long-term memory theories. This approach moves further away from
the exploration of memory as ‘storage’ or ‘processes’ and toward an understanding of the
‘function’ of memory—instead of focusing on how memory works or where in the brain
memories reside, an ecological/evolutionary perspective focuses on why we have memories.
From this perspective, memory’s primary function is not to preserve past experiences but to
anticipate future events (Morris, 1988), providing vital ecological and evolutionary advan-
tages that allow us to predict future outcomes and adapt to novel environments. Because of
this anticipation mechanism, our memory constantly adapts by registering salient information
or deviations from previously consolidated patterns, reconfiguring our cognitive systems in
response to new experiences. Here, the saliency or distinctiveness of stimuli plays a critical
role in their memorization.
Structural saliency is a byproduct of chunking and recognition. Structurally salient events
are noticeable deviations from gestalt or schemata patterns; these attract attention and
thus are more easily consolidated in long-term memory (Alger et al., 2018; Holmes, 1972;
Waters & Leeper, 1936). Cohen and Carr (1975) explored this phenomenon by presenting
photographs of human faces to participants and requesting they judge the distinctiveness of
each face; in a subsequent stage of the experiment, participants more accurately recognized
the faces they rated as distinctive. As Guenther (1998) succinctly notes, “making information
distinctive or associating information with distinctive images and ideas can promote better
memory of that information” (p. 328). Semantic and affective saliency, on the other hand,
is a byproduct of activations and associations. Guenther (1998), for instance, notes that
“events associated with strong emotions . . . are better remembered than emotionally more
neutral events” (p. 323). Similarly, Alger et al. (2018) mention that memory of an “emotion-
ally salient [episode] is preferentially preserved, whereas memory for neutral, contextual
detail is forgotten or suppressed” (p. 34).
This facility to adjust to changes in our environments equips us to predict future events,
resulting in ecologically advantageous behavior. This suggests a close relationship between
memory, imagery, and expectancy. Eysenck and Keane (2015), for instance, highlight the
interplay between memory and imagery when noting that “imagining future events involves
the same (or similar) processes to those involved in remembering past events” (p. 274).
Huron and Margulis (2010), in turn, highlight a common goal of memory and expectancy
in noting that the evolutionary and ecological goal of expectancy is “to prepare an organ-
ism for the future, not the past” (p. 579) and that the “adaptive purpose of memory must
be prospective rather than retrospective” (p. 580).16 Therefore, like general memory, musical
memory functions to aid expectation and prediction. Chapter 6 fleshes out how these
ecological and evolutionary functions of memory operate within the context of film music
leitmotifs.
Notes
1. Although this theory gained traction in the 1970s with the development of computer technologies
and artificial memory systems (e.g., hard drives, tapes), recent paradigm shifts in cognition and
psychology have challenged the record-keeping conception of memory.
2. Note that such retrieval of sensory memories does not entail any complex processing.
3. Once sensory information reaches our brains, it can disintegrate or, if our brains register it as rel-
evant, it can become part of the memory system that has been traditionally parsed into long- or
short-term memory.
4. Although this distinction may seem arbitrary, much research in neuropsychology supports it,
particularly through the study of individuals with brain damage or amnesia—individuals with
amnesia generally present long-term memory impairments while maintaining a well-functioning
short-term memory (Spiers et al., 2001).
5. A ‘connectionist’ model of memory (McClelland, 2000) also rejects the traditional storage view,
favoring activity patterns between neuronal units that form neural networks. Memorization entails
changes in our bodies and brains. Although Bartlett introduced the connectionist approach to
memory in 1932, much recent research from cognitive neurophysiology supports a connectionist
approach by revealing that “memory reflects changes to neurons involved in perception, lan-
guage, feeling, movement, and so on” (Guenther, 1998), thus outright dismissing the division into
short- and long-term memory systems (Macken et al., 2003; Nairne, 2002).
6. As a result, working memory does not consist of a specific neural substrate or dedicated brain
module; instead, it consists of “distributed, overlapping and interactive cortical networks that in
the aggregate encode the long-term memory” (Fuster & Bressler, 2012, p. 207). Similarly, Craik
and Lockhart (1972) identified various “levels of processing” to consolidate information in long-
term memory, ranging from shallow recognition to deep semantic analysis, with “deeper levels
of analysis produc[ing] more elaborate, longer lasting and stronger memory traces than shallow
levels” (Eysenck & Keane, 2015, p. 230).
7. Imaging studies support the interplay between working memory and long-term memory, showing
interactions between large-scale brain networks during general memory tasks (Fuster & Bressler,
2012). Real-world scenarios illustrate a close interplay of various memory systems. Kvavilashvili
and Mandler (2004), for instance, explored the spontaneous (involuntary) recall of semantic memories;
they report on a diary study, retrospectively identifying cues that triggered those semantic memories.
Their findings suggest that involuntary semantic memories are triggered by our subliminal percep-
tion of related information—as an example, they mention the involuntary “mindpopping” recall
of the names Itchy and Scratchy (from The Simpsons) when one of the authors scratched her back.
Additionally, evidence from cognitive neuroscience suggests that working memory engages brain
areas involved in long-term memory (D’Esposito, 2007). For instance, imaging studies identified
simultaneous activity in brain areas critical for working memory and long-term memory encoding
(Wagner et al., 1998), suggesting that processes within working memory temporarily activate long-
term memory systems (Ward, 2015).
8. Structural congruency also promotes memorization and recall.
9. An informal experiment conducted by my graduate students illustrates this phenomenon of false
memories. After a feature presentation, filmgoers were asked about the music in the film, and most
responses addressed the music’s “beautiful melodies”, “nice bass”, “emotional impact”. Paradoxi-
cally, the film contained no music at all. This reflects that an experience is ‘reconstructed’ based
on existing schemata: since most films contain music, filmgoers reconstruct and recall their expe-
rience shaped by schemata based on such prior knowledge.
10. Tulving’s (1983) ‘encoding specificity hypothesis’ purports that the presence of contextual similar-
ity surrounding the initial encoding and subsequent retrieval of memories helps consolidate them
as long-term memory traces.
11. Some scholars, however, understand explicit and implicit memory as lying along a continuum, with
no boundaries between them (e.g., Challis et al., 1996). This line of thought suggests that explicit
memory engages in deeper levels of processing, as compared to implicit memory, and that im-
plicit memory serves as a precursor to explicit memory. Much empirical research supports this
theory; for instance, using eye-tracking and measuring gaze fixation, Ryan et al. (2000) explored
how participants form associations via implicit learning when exposed to images of real-world
scenes; they concluded that “memory representations of scenes contain information about rela-
tions among elements of the scenes, at least some of which is not accessible to verbal report”
(p. 454).
12. In a similar vein, Juslin and Västfjäll (2008) describe ‘evaluative conditioning’ as “a special kind of
classic conditioning that involves the pairing of an initially neutral conditioned stimulus (CS) with
an affectively valenced, unconditioned stimulus (US). After the pairing, the CS acquires the ability
to evoke the same affective state as the US in the perceiver” (p. 564).
13. In Pavlovian conditioning, once two stimuli are associated, either stimulus in isolation may trigger
the same physiological response.
14. Scripts are also called ‘procedural memory’.
15. Semantic memory seems to be amodal (i.e., not tied to any particular modality of perception),
possibly constructed through a process of “transduction” (Barsalou, 2008), where initial modal
representations turn into amodal representations. For instance, when encountering dogs, “modal
representations arise as dogs are seen, heard, and touched. In turn, these modal representations
are transduced into amodal symbols that stand for these experiences” (p. 92). There is, however,
no definitive empirical evidence for the presence of amodal brain representations; this suggests
that, during retrieval of semantic memories, the original modal representations may become ac-
tive. It is therefore currently unclear how semantic categories are represented in the brain, and
how we arrive at stable category representations. In fact, Barsalou (2008) offers support for the
involvement of perceptual and motor neural substrates when engaging semantic categories.
16. Huron and Margulis (2010) regard various forms of memory (including semantic, episodic, and
procedural) as “different forms of expectation” (p. 581).
APPENDIX V
Auditory Perception
We are constantly submerged within a complex sonic environment that floods our auditory
sense, yet we navigate this sonic environment seamlessly and with little cognitive effort. The
sounds of rustling leaves, birds, rain, wind, conversations, music, and traffic may all be present
at once, yet we have the capacity to distill overlapping sounds, to attend to one or more of
them, or even to ignore them altogether.
This appendix explores vital mechanisms of auditory perception that allow us to engage
meaningfully with our sonic environments, mechanisms that pertain to Auditory Stream
Analysis. Through this exploration, we will begin to recognize that auditory perception is
a critical component of an evolutionarily advantageous cognitive mechanism we will call
ecological resonance.
regularities, and are finite and shared by all humans. Top-down processing, on the other
hand, draws on ‘learned’ schemas—these are mental representations formed by repeated
exposure to stimuli that present some degree of regularity, allow us to attend to abstract
and conceptual-level patterning, and are potentially infinite and unique to individuals or
communities. Bottom-up and top-down processing are thus functionally distinct, the for-
mer geared toward constructing perceptual entities, the latter toward selecting information
by activating learned schemas. Although seemingly independent, bottom-up and top-down
processing work in tandem during auditory perception to enable us to establish (and later
derive) meaning from a sonic landscape.
Figure V.1, for instance, offers a visual analogy that distills the bottom-up and top-down
processing that allows us to go from perception to the generation of an action plan.4 First,
the interaction of top-down and bottom-up processing prompts us to identify a string of letters
in the left box; at this stage, however, this string of letters does not carry any meaning.
Then, with the support of innate gestalt principles, we parse out certain letters belonging
together because of their unique font. Subsequently, based on learned schemas, we recog-
nize two meaningful words. Ultimately, we draw on these words’ semantic content to derive
a useful message, one that prompts us to generate an action plan.
In spoken language, innate schemas allow us to discriminate amongst surface-level
regularities to identify speech sounds. Learned schemas, on the other hand, account for
deep- and abstract-level regularities—from the phonetic (the sound for a single letter) to
the syntactical (sentence structure), the stylistic (colloquial, poetic), and even the narra-
tive structure (linear narration, circular stories). Analogously, in music perception, innate
schemas attend to the surface-level regularities that allow us to identify sounds as music,
and learned schemas account for deep- and abstract-level regularities—from the phonetic
(sound of a violin, sound of a marimba) to the syntactical (harmonic progression, formal
design), and even to the stylistic (pop, classical).
Because gestalt theory attends to the principles that govern sensory perception, it is espe-
cially useful for conceptualizing bottom-up mechanisms. Gestalt psychology maintains that
our tendency to construct patterns is automatic, pre-attentive, and grounded in a basic set
of hardwired schemas that conform to fundamental grouping principles. These principles,
which include proximity, similarity, closure, good continuation, common fate, and good
form, allow us to process ‘chunks’ of information or patterns. In music, these grouping
principles operate on our perception of both contiguous and simultaneous sounds—broadly
understood as the music’s melodic and harmonic dimensions, respectively—and emerge
from multiple musical and sonic parameters, including pitch (frequency), rhythm (temporal
proximity), timbre and articulation (spectral information), dynamics (loudness), and source
(positioning in the sonic landscape).
Proximity
The gestalt principle of proximity accounts for our tendency to group objects that feature
relative nearness in space, time, or another parameter. Visually, we will perceive the dots on
the left of Figure V.2 as randomly scattered, while we will perceive the same number of dots
on the right as grouped into two distinct units because of their spatial proximity.
In music, proximity may manifest itself through the music’s various parameters—e.g.,
sounds within a particular frequency range, sounds in close temporal adjacency, sounds
emanating from a relatively defined location. For instance, by exploiting the principle of
proximity in range, composers may create the impression of multiple melodies emanating
from a single line that alternates between two relatively distant registers—musicians call this
a ‘compound melody’.5
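The grouping by proximity in range described above can be loosely illustrated in code. The following is only a minimal sketch under stated assumptions, not a model drawn from the perceptual literature: pitches are given as MIDI note numbers, the function name `split_into_streams` and the `max_leap` threshold of seven semitones are hypothetical choices made for illustration, and temporal factors such as the speed of alternation are ignored.

```python
def split_into_streams(pitches, max_leap=7):
    """Group a single melodic line into streams by pitch proximity.

    Each pitch joins the stream whose most recent pitch is nearest,
    provided the distance does not exceed max_leap semitones;
    otherwise the pitch starts a new stream.
    (max_leap is an illustrative assumption, not an empirical value.)
    """
    streams = []
    for pitch in pitches:
        # Find the stream whose last pitch is closest to the incoming pitch.
        nearest = min(streams, key=lambda s: abs(pitch - s[-1]), default=None)
        if nearest is not None and abs(pitch - nearest[-1]) <= max_leap:
            nearest.append(pitch)
        else:
            streams.append([pitch])
    return streams


# A single line alternating between two distant registers.
melody = [72, 48, 74, 50, 76, 52, 77, 53]
print(split_into_streams(melody))
# → [[72, 74, 76, 77], [48, 50, 52, 53]]
```

Under these assumptions, the alternating line separates into an upper and a lower strand, which loosely parallels the impression of two melodies emerging from one compound melody.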
Similarity
The gestalt principle of similarity accounts for our human tendency to form groups based
on resemblance. When observing the shapes in Figure V.3, the gestalt principle of proximity
will prompt us to perceive three distinct groups; within each of these groups, however, our
brains will subconsciously form two groups based on the shapes’ similarities in terms of their
size (left), contour (center), or color (right).
In music, multiple parameters may contribute to forming groups based on similarity,
including timbre, loudness, and articulation. For instance, in much orchestral music, com-
posers create the impression of a continuous, single, unified musical gesture via instru-
mental ‘dovetailing’—superimposing the endings and beginnings of segments performed by
instruments with a similar timbre to render musical figures that extend through a range that
exceeds that of any single instrument.
Common Fate
The gestalt principle of common fate is related to the principle of similarity, but instead
of forming groups based on static attributes such as shape, size, and color, we group
elements based on dynamic attributes such as spatial behavior, including movement,
direction, and onset. In Figure V.4, the lines depict the spatial behavior of four elements,
prompting us to group them as a single unit (left), as two distinct units (center), or as four
distinct units (right).
In music, our perception of groups based on the principle of common fate is influenced by
the contours, durations, and onsets of various musical layers. The common fate principle thus
allows for our perception and recognition of heterophonic, homophonic, and polyphonic
musical textures: heterophonic texture will result from all layers sharing analogous onsets,
durations, and contours; homophonic will result from our perceiving two layers, one func-
tioning as melody, the other as accompaniment; and polyphonic will result from perceiving
multiple independent layers that do not share analogous onsets, durations, or contours.
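One way to picture grouping by common fate is to treat shared onsets as the dynamic attribute that binds layers together. The sketch below is a deliberate simplification offered only as an illustration: it considers onsets alone (contours and durations are ignored), the function name `group_by_onsets` and the sample layers are hypothetical, and onsets are expressed as beat positions within a bar.

```python
from collections import defaultdict


def group_by_onsets(layers):
    """Group musical layers that share exactly the same onset pattern.

    layers maps a layer name to a list of onset positions (in beats).
    Layers with identical onset lists land in the same group.
    """
    groups = defaultdict(list)
    for name, onsets in layers.items():
        groups[tuple(onsets)].append(name)
    return list(groups.values())


# Hypothetical four-layer texture: two layers move in quarter notes,
# two layers move in half notes.
layers = {
    "violin": [0, 1, 2, 3],
    "viola": [0, 1, 2, 3],
    "cello": [0, 2],
    "bass": [0, 2],
}
print(group_by_onsets(layers))
# → [['violin', 'viola'], ['cello', 'bass']]
```

In this toy case, one resulting group would correspond to a texture in which all layers move together, two groups to a layered texture, and one group per layer to fully independent strands.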
The perception of masking and good continuation in music is contingent (primarily) upon
the loudness of various layers. For instance, a loud tam-tam hit will temporarily mask a
sustained pianissimo sonority in the strings—when masked by the louder sounds of the tam-
tam, the pianissimo’s sustained sonority is inaccessible to our perception, yet we assume its
continuity.6 Some composers use this principle to give the impression of a continuous layer,
particularly when arranging orchestral music for piano.
Figure-Ground
The gestalt principle of figure-ground grouping prompts us to engage our attention selec-
tively. In the famous profiles/vase figure, we will attend selectively to either the profiles or
the vase, but not to both simultaneously. (See Figure V.6.)
In auditory perception, this gestalt principle is most evident in the phenomenon known
as the ‘cocktail party effect’—even when embedded within a complex and dense acoustic
landscape, such as in a cocktail party, we can selectively direct our attention to a single
stimulus (e.g., a conversation or the music) and regard other aural stimuli as background.
For instance, in homophonic music, we perceive a melody as the figure and the accompani-
ment as the background. In music perception, this principle is most evident when focusing
our attention on different parts of a texture—in polyphonic music, for instance, we may
consider any voice or part as figure or (back)ground; yet, while we can attend to a single
melodic strand with relative ease, attending to multiple simultaneous strands will present a
cognitive challenge.7
Saliency
Although not a gestalt principle of perception, saliency interacts with innate and learned
schemas. In fact, saliency results from our “processing of difference in the context of simi-
larity” (Hunt, 2013, p. 10), from perceiving deviations from a trend or norm both within
bottom-up and top-down perception. In Figure V.7, when looking at the shape on the left,
our attention will be drawn to the small lump as a deviation from a smooth trend; and, when
reading the text on the right, attention will be drawn to the word “sexophone”, which devi-
ates from the expected ‘saxophone’ by embedding the word ‘sex’ and hence introducing
sensual resonances. Because we are most efficient in processing information that conforms
to innate or learned schemas, information that features some degree of saliency or distinc-
tiveness will add to the cognitive load when processing such information.8
In music, composers may draw on our tendency to react to bottom-up saliency when
deviating from an established pattern—a corner in a melodic contour, an unusual chord in
a repeating harmonic pattern, a syncopation in an otherwise non-syncopated metrical and
rhythmic framework.9 In turn, performers may draw on our tendency to react to top-down
saliency when deviating from a learned schema, such as when re-harmonizing a standard or
when inserting agogic accents for expressive purposes.10
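The idea of saliency as a deviation from an established pattern can be sketched for the simple case of a steady pulse. This is a rough illustration under stated assumptions rather than a model of any neural response: onsets are given in milliseconds, the median inter-onset interval stands in for the established pulse, and the function name `deviant_intervals` and the five percent tolerance are hypothetical choices.

```python
from statistics import median


def deviant_intervals(onsets_ms, tolerance=0.05):
    """Return the indices of inter-onset intervals that deviate from the pulse.

    The pulse is estimated as the median inter-onset interval; an interval
    is flagged when it differs from the pulse by more than the given
    fraction of the pulse. (The tolerance is an illustrative assumption.)
    """
    if len(onsets_ms) < 3:
        return []  # too few events to establish a pattern
    intervals = [b - a for a, b in zip(onsets_ms, onsets_ms[1:])]
    pulse = median(intervals)
    return [
        i for i, interval in enumerate(intervals)
        if abs(interval - pulse) > tolerance * pulse
    ]


# A 500 ms pulse in which the fifth event arrives 100 ms late.
onsets = [0, 500, 1000, 1500, 2100, 2500, 3000]
print(deviant_intervals(onsets))
# → [3, 4]
```

A single late event yields two flagged intervals, the lengthened one before it and the shortened one after it, while every interval that conforms to the pulse passes unnoticed.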
A system that did not take into account the sensory input [i.e., did not engage bottom-up
mechanisms] would be cut off from the outside world, whereas one that did not use pre-
viously acquired knowledge [i.e., did not engage top-down mechanisms] would have a
very unstable representation of the changing world.
(Bey & McAdams, 2002, p. 852)11
Such interactions between innate and learned schemas permeate our evolutionary history,
working below our conscious attention, allowing us to navigate the sonic environment,
where sounds are not mere patterns but meaningful cues—this interaction mechanism is
best described as ecological resonance.
Bregman (1990) addresses our tendency to ecologically resonate with everyday sounds,
noting that “our brains are not built to hear sounds in the abstract, but to form descriptions of
environmental events” (p. 679). Similarly, in music, Reybrouck (2010) anticipates the notion
of ecological resonance, recognizing the interaction between innate and learned schemas
in “the construction of new distinctions and observables as well as the recognition of knowl-
edge structures that are already acquired as the outcome of previous interactions [with
sounds]” (p. 189).12 This interdependency between bottom-up and top-down processes, as
highlighted by Bregman and Reybrouck, brings us nearer to understanding auditory percep-
tion’s evolutionary and ecological purposes.
Gibson (1966) addresses bottom-up and top-down perception as hardwired reflexes in the
visual domain that endow organisms with an ecological advantage.13 In essence, this mechanism
entails structuring information into objects we recognize. Please turn to Figure V.8 before
reading further. Although this image may initially appear to be just the surface of a tree bark,
we will identify the frog in it upon engaging our top-down schemas. Furthermore, once we
recognize the frog, we cannot ‘unsee’ it, and our attention will always be drawn to it. Although
animals try to take advantage of their camouflaging mechanisms, they gain an equal ecological
advantage by perceiving a predator or prey, which enables appropriate adaptive behavior.
In terms of everyday sounds, for instance, a mosquito’s buzz is not just meaningful
because we have formed a schema upon repeated encounters with the sound, but because
we associate the schema with the presence of a mosquito and the potential for a mosquito
bite. As a result, an organism that associates a learned sonic schema with an environmentally
threatening situation may initiate adaptive behavior when encountering such a sound.
Reybrouck (2010) argues that music functions as environmental sound; therefore, just as
organisms adapt in response to their sonic environments, we “show adaptive behavior in
[our] interactions with the music as environment” (p. 189). Likewise, Biancorosso (2010)
traces a mechanism suggestive of ecological resonance to our evolutionary history and men-
tions that
sound can signal the unknown or ominous, such as a distant menace or a fast-approaching
threat requiring immediate response . . . insofar as the music partakes of the creation of
such scenarios, it may be placed alongside memories of sounds heard in real-life situ-
ations and thus be treated, at least indirectly, as a natural sign. Our response to such a
sound would have to be, at least to some extent, wired into the brain.
(p. 317)
Chapter 6 expands on our hardwired tendency to ecologically resonate with a film’s sonic
and musical landscape.
Notes
1. Bregman (1990) labels his theory “Auditory Scene Analysis”. My changing ‘scene’ for ‘stream’ is
intended to avoid confusion with the use of the word ‘scene’ in the context of a film. Bregman’s
theories draw on Gibson’s (1966) ecological theories of visual perception and Gregory’s (1980)
top-down processing theory.
2. Bottom-up auditory perception begins with the stimulus itself; it is a data-driven process car-
ried in one direction, from auditory organs to the auditory cortex. Top-down auditory perception,
on the other hand, uses mental images and contextual information to process auditory stimuli.
Bregman (1990) addresses bottom-up and top-down processes respectively as ‘primitive’ and
‘schema-based’: “It seems reasonable to believe that the process of auditory scene analysis must
be governed by both innate and learned constraints . . . the effects of the unlearned constraints will
be called ‘primitive segregation’ and those of the learned ones will be called ‘schema-based segre-
gation’ ” (p. 242). Regardless of their labeling, all scholars agree that these interacting mechanisms
are hardwired and pre-attentive, meaning that we are born with the capacity to engage them and
that we do so below the level of consciousness.
3. Schemas are mental structures that emerge from regularities we perceive in our environment.
Within the context of this book, I define schemas quite broadly to include hard-wired conceptual
frameworks and gestalt principles of organization.
4. Since illustrating mechanisms of aural perception within a book is best achieved by drawing on
analogies with visual perception, all explanations in this appendix include a visual component
before considering its aural counterpart.
5. Bregman (1990) surveys various experiments and concludes that “if the alternation is fast enough
and the frequency separation great enough, listeners will not experience a single stream of tones
alternating in pitch, but will perceive two streams of tone . . . When two streams are heard, the
listener has the impression of two different sources of sound” (p. 642).
6. Bregman (1990) points out that, even if a sustained sound were to be “deleted and replaced by a
much louder sound, the actual existence of the background pattern is irrelevant to our perception
of good continuation” (p. 346).
7. Drawing selective attention to the figure does not mean, however, that the background has no ef-
fect on our perception. On the contrary, the background greatly influences our perception of the
figure, particularly in music.
8. At a neural level, experiments using event-related potentials (ERP) illustrate that our neurons fire
differently when exposed to a stimulus that deviates from a pattern; for example, even a small
deviation of a metronomic pulse will trigger a mismatch negativity response that can be observed
in the electroencephalography (EEG) signal (Zanto et al., 2006).
9. Dyson and Watkins (1984) conclude that “corners of melodic contours act as features and are
perceptually more salient than the intervening notes” (p. 483).
10. In contrast to innate schemas, learned schemas “[do] not perceptually segregate sounds, creating
perceptual units such as auditory streams, but allow us to select information from a mixture by
a matching process between schemas stored in memory and a sensory representation” (Bey &
McAdams, 2002, p. 851). Observing saliency through learned schemas, therefore, results not
from what we hear, but instead from a discrepancy between what we hear and what we expect
to hear.
11. Bey and McAdams (2002) identify both the ‘attentive’ nature of learned schemas, “when we ex-
plicitly try to hear a sound source or a sound sequence in a background mixture” (p. 845), as well
as their ‘pre-attentive’ nature, when “being in a room with many people talking and hearing one’s
name emerge from the mixture” (p. 845). In addition, empirical studies conducted by Boltz (2001)
show that schemas lessen the cognitive load by “reduc[ing] the amount of attentional effort that
is needed for perception and comprehension, and organiz[ing] one’s perceptual experience into
a coherent and intelligible whole” (p. 432). While Bregman (1990) claims that innate and learned
mechanisms are different because the former is “automatic” and the latter involves “attention”,
I argue here that both are automatic.
12. Reybrouck (2010) further highlights the ‘online’ confluence of the perceptual immediacy of ge-
stalt-based innate schemas and the necessary conceptual abstraction of learned schemas, noting
the “dynamic tension between ‘experience’ and ‘recognition’ with the former relying on a mo-
ment-to-moment scanning of sensory particulars, and the latter relying on processes of abstraction
and generalization” (pp. 187–88).
13. Visual and auditory perception rely on analogous grouping strategies to minimize the cognitive
and attention loads and to help us cope with vast amounts of information reaching our senses.
Furthermore, our human tendency to integrate visual and aural features appears to be innate (or
at least prelinguistic). Wagner et al. (1981) found that infants develop visual and auditory integra-
tion before learning to talk; in their experiments, infants attended to dotted lines when exposed
to a series of short sounds and to solid lines when exposed to long sounds. In a study that extends
Marshall and Cohen’s (1988) seminal work and revisits their Congruence-Association Model, Mil-
let et al. (2021) explore this phenomenon within film music with the use of eye-tracking technol-
ogy; they explain that, during moments of structural congruency, the “music led to quicker first
fixations on film objects . . . [accentuating] the saliency of film objects and [supplying] emotional
information altering the viewers’ sentiment towards the film” (p. 1).
APPENDIX VI
Archetypes
We are surrounded by archetypes, universal symbols that emerge from within a culture and
become units of meaning shared and reproduced by its members. In literary works, gardens
usually symbolize love and fertility, towers symbolize power and worship. In films, shadows
often symbolize dark forces or malicious tendencies, bridges symbolize transitions between
opposites. Music partakes in this universe of archetypes with surface-level figures, timbres,
and even styles, expressing our cultures’ settings, rituals, and structures.1 Within musico-
semiotic scholarship, these culturally wrought units of meaning are called musical ‘topics’.2
This appendix traces the origins of topics theory and surveys its application to a wide
range of repertoires, from eighteenth-century concert music to twenty-first-century popu-
lar and film music. It then explores musical topics through the lens of modern semiotics,
whose frameworks provide an elegant inference engine that allows us to glean the potential
mechanisms whereby we derive meaning from topics. Although such frameworks rest solely
on speculative insights, they have propelled a wealth of empirical research on musical top-
ics. Therefore, this appendix surveys relevant empirical research to expand our awareness of
musical topics’ expressive power.
Topics Theory
Composers and scholars have been keen to identify musical units that suggest extra-musical
meaning.3 Early music treatises (e.g., Heinichen, 1711; Mattheson, 1739) discuss composi-
tional practices using ‘loci topici’—surface-level musical textures and gestures with rhetori-
cal and affective potential.4 Ratner (1980) drew on these treatises to compile a thesaurus of
characteristic figures encultured listeners of the time would recognize, laying the ground-
work for numerous scholars to refine and apply topics theory to various repertoires.
Agawu (1991) distills a compendium of the most frequently used topics in
eighteenth-century Western concert music, a ‘Universe of Topics’.5 The topics in this
compendium reflect a broad range of settings, which we can further categorize as
affective states, musical figures and gestures, styles and genres, and social contexts
and functions. Figure VI.1 presents various topics from Agawu’s compendium arranged
within representative categories.
For most eighteenth-century listeners, the topics in Agawu’s compendium would have
been readily recognizable; but in the twenty-first century, because we are distant from the
musical conventions of the time, we need “to be schooled in the idiom of the eighteenth cen-
tury” to develop the necessary “listener-competence” (Agawu, 1991, p. 49).6 An encultured
listener, then and now, would recognize surface-level features in Mozart’s Piano Sonata K.
332 and hear a succession of musical topics rather than merely a series of musical themes
or figures. Figure VI.2 presents a portion of this work and depicts the topics suggested by the
embedded musical figures and gestures. The pace and range of the melody in bars 1–4 sug-
gest the ‘singing style’ topic, while the triple meter and the accompaniment featuring a tonic
pedal tone suggest the peaceful ‘pastoral’ topic. The emboldened, mezzo-forte contrapuntal
figures in bars 5–8, which contrast with the homophonic texture of the preceding bars, suggest
a sudden shift to the ‘learned style’ topic; yet toward the end of this figure, two staccato
chords seem to poke fun at the prior serious tone, suggesting a possible ‘opera buffa’
topic. Emulating the sound of hunting horns, featuring dotted rhythms and the characteristic
3rd-5th-6th intervallic gesture in a major mode, bars 12–22 suggest the ‘horn call’ and ‘hunt
style’ topics. Then, at bar 22, the sudden forte dynamics, the shift to the relative minor mode,
a more active rhythmic profile, and the diminished-seventh sonorities suggest the turmoil
and instability characteristic of the ‘Sturm und Drang’ [‘storm and stress’] topic.7
Resembling Agawu’s compendium for the common practice repertoire, Tagg’s (1999)
compendium of “feels”, which he defines as an “ethnocentric selection of possible connotative
spheres” (p. 12), includes topics listeners would recognize in more popular styles.8 Just
as an eighteenth-century audience member would subliminally recognize the ‘Sturm und
Drang’ or ‘hunt style’ topics in concert music, nearly every twenty-first-century audience
member would recognize the ‘Spaghetti Western’, ‘twinkling happy Christmas’, or most
other topics included in Tagg’s compendium. Figure VI.3 offers a selection of Tagg’s extensive
list of feels.
Because topics are conventionalized signs, “competence is assumed on the part of the
listener, enabling the composer to enter into a contract with [the] audience” (Agawu, 1991,
p. 33).9 Film composers, for instance, are exceptionally attuned to twenty-first-century top-
ics, particularly because most film directors communicate with their film composers by
employing words or phrases similar to those included in Tagg’s compendium; composers, in
turn, must recognize and translate these into music that evokes the desired response from
the audience.
FIGURE VI.3 Tagg’s “ethnocentric selection of possible connotative spheres”. (After Tagg, 1999)
Within the film music repertoire, such conventionalization of topics began during the
silent film era’s film-accompaniment practice. From the mid-1890s to the late 1920s,
several music anthologies or encyclopedias circulated among film musicians, primarily
accompanists. These volumes contain music classified according to ‘moods’ or ‘dramatic
settings’ and include several examples in each category from which accompanists would
select to support the narrative. Although modern listeners may hardly conceive of
some of the included categories as topics (such as ‘Alabama’, ‘Bees’, or ‘Chatter’), from the
perspective of a listener of the time, all these categories made up a comprehensive
universe of topics.
Ernö Rapée’s (1925) Encyclopedia of Music for Motion Pictures outlines fifty-two cat-
egories, capturing the most frequently used dramatic settings in films (at the time). In the
introduction to the volume, Rapée notes:
“In creating fifty-two divisions and classifications in this Manual, I tried to give the most
numbers to those classes of music which are most frequently called upon to synchronize
actions on the screen”. (iii)
“One-third of all film footage [used] to depict action; another third will show no physical
action, but will have, as a preponderance, psychologic [sic] situations; the remaining third
will neither show action nor suggest psychological situations, but will restrict itself to
showing or creating atmosphere or scenery”. (iii)
Rapée also offers suggestions for the suitable placement of these selections. The music in
the ‘sinister’ category, for instance, “is meant for situations like the presence of the captured
enemy, demolishing of a hostile aëroplane or battleship, or for the picturing of anything
unsympathetic” (iii); the selections under ‘parties’ may be “suitable also for the portrayal of
social gatherings in gardens” (iii).
Most music anthologies for motion pictures of the early twentieth century drew almost
exclusively from pre-existing Romantic and post-Romantic concert music. This practice per-
petuated and reinforced already-established concert-music topics, introducing them to the
film music repertoire.10 As a result, these topics, and the associated surface-level figures,
profoundly influenced film composers in the decades that immediately followed,
who continued to emulate these topics during the embryonic stages of
sound film. In this way, these encyclopedias directly established a set of musical topics that
became inextricably linked to particular settings or film genres.
exploration of topics can be framed within these semiotic traditions, each illuminating dif-
ferent facets of topics as signs. While a Peircean perspective would shed light on the episte-
mology of topics by situating them within a sign typology, a Barthesian perspective would
shed light on the process whereby topics emerge as units of meaning within the musical
discourse.
Peirce’s Trivium
Peirce established three categories of signs, defined by the relationship between a sign and what
it represents: icons, indexes, and symbols. Icons physically resemble what they represent (e.g.,
a drawing of a bicycle to indicate the presence of bicycles on the road). Indexes point to or
show evidence of what they represent (e.g., smoke to suggest there is, or has been, a fire).
Symbols emerge as arbitrarily established conventions (e.g., a red traffic light to mean drivers should stop).
Within Peirce’s taxonomy, topics are symbols because their signifying power rests on
an arbitrarily established and conventionalized relationship between a sign and its mean-
ing.11 Their pre-culturized meaning is particularly evident when suggesting social contexts
and functions, such as in the ‘Italian’, ‘Turkish’, ‘ecclesiastical’, and ‘military’ topics. Their
symbolic nature notwithstanding, topics often develop out of icons or indexes, acquiring
conventionality through time. For example, while resembling the acoustic rhythmic profile
of galloping horses, the ‘gallop’ topic has been further reinforced within culturally defined
repertoires by programmatic elements (e.g., the music’s title or lyrics). Fortunately, Peirce’s
trivium allows for signs to reflect more than one type of relationship; hence, besides their
primary typology as a symbol, topics can, at some level, also reflect a secondary typology.
Topical Markers
Within a topic’s chain of signification, the primary link rests on the listeners’ attunement to
surface-level musical figures, or topical ‘markers.’ These markers help:
identify a particular musical style and often, by connotative extension, the cultural genre
to which that musical style belongs . . . different combinations of different aspects of
duration, rhythm, timbre, tonality, spatiality, diataxis . . . [help listeners] instantaneously
know if they’re hearing 1970s disco rather than zouk, rococo chamber music rather than
death metal, glitch dub rather than Gregorian plainchant, mbaqanga rather than Muzak,
an Elizabethan madrigal rather than a low-church hymn, a rāga performance rather than
a romantic pop ballad, a national anthem rather than a TV detective theme.
(Tagg, 2012, p. 522)12
As the primary elements within the chain of signification, markers are thus vital for encul-
tured listeners to recognize musical topics effectively.
When connoting a style or genre, a topic can share an unlimited number of markers with
the style or genre from which it derives. Although this may seem to blur the lines between
an identity relationship and a semiotic one, contextual clues catalyze our recognition of top-
ics as signs. For example, a segment of film music might incorporate all markers of the ‘70s
Disco Music’ topic, and thus it might seem to become a disco hit on its own. However, con-
textual clues help us recognize it as a topic rather than as a disco hit; these contextual clues
may be internal to the film (e.g., narrative, visuals, or dialogue), external to the film (e.g., the
fact that we are in a movie theater and not in a discotheque in the ‘70s), or they can be both.
Although a single marker may connote a particular topic, in most instances, a combi-
nation of several markers is necessary to connote a topic with precision. For instance, the
minor mode marker is insufficient to suggest a specific topic; instead, only when combined
with other markers such as slow tempo, duple meter, and dotted figures, will it effectively
connote the ‘funeral march’ topic. Therefore, although the number of markers is relatively
small, they allow for a vastly larger number of topics when used in combination.
For instance, the triple meter marker may combine with other markers such as pedal tones,
harmonic progressions, tempo, rhythmic figures, and phrase structures to connote an array
of classical music topics, including ‘musette’, ‘minuet’, or ‘saraband.’
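The combinatorial logic of markers can be illustrated with a brief sketch. The marker set for the ‘funeral march’ topic follows the description above; the function names, the extra ‘brass timbre’ marker, and the treatment of a topic as a plain set of markers are illustrative assumptions rather than a formal model drawn from topics theory.

```python
from itertools import combinations

# Markers named in the text for the 'funeral march' topic; treating a topic
# as a plain set of markers is a simplifying assumption of this sketch.
FUNERAL_MARCH = frozenset({"minor mode", "slow tempo", "duple meter", "dotted figures"})


def count_combinations(markers):
    """Count every non-empty combination of the given markers by enumeration."""
    markers = list(markers)
    return sum(
        1
        for size in range(1, len(markers) + 1)
        for _ in combinations(markers, size)
    )


def connotes(observed, topic_markers):
    """Return True only when all of a topic's markers are present."""
    return topic_markers <= set(observed)


# A single marker is insufficient; the full combination connotes the topic.
# 'brass timbre' is a hypothetical additional marker.
single = connotes({"minor mode"}, FUNERAL_MARCH)
full = connotes(FUNERAL_MARCH | {"brass timbre"}, FUNERAL_MARCH)

# Four markers already yield 2**4 - 1 = 15 possible combinations.
n_combinations = count_combinations(FUNERAL_MARCH)
```

The count roughly doubles with each added marker, which suggests why a small stock of markers can support a very large repertoire of topics.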
film music, especially typical genre music, produces schemas which act with other visu-
ally induced schemas (typical plot) to produce the most plausible interpretation. If the
pictures are ambiguous or indefinite, the music takes on more importance in interpreting
the film.
(p. 102)
There is, however, no conclusive evidence of how we store such statistically constructed
meanings. Most theories in the cognitive sciences assume that semantic knowledge (e.g.,
memories, categories, symbols) resides in amodal systems, unbound from its original
modality-specific representation. Musical topics, under this assumption, reside outside of
the brain’s modal systems. However, this assumption entails a process of ‘transduction’
in which modal information becomes clustered and stored as amodal information. Barsa-
lou (2008) hypothesizes a potential transduction process, arguing that, as we experience
the world, “modal representations in the brain’s systems for perception, action, and affect
become active . . . In turn, the brain transduces these modal representations into amodal
representations that represent category knowledge in a modular semantic memory” (p. 92).
For instance, when listening to a ‘funeral march’ topic, modal representations related to all
senses arise, from the kinesthetic embodiment of the slow duple meter to the dark colors of
a procession and the negative affect triggered by the circumstances. In turn, modal repre-
sentations are clustered and ‘transduced’ into amodal representations associated with our
experience of the music.
Although such a transduction process would arguably be uniform across all humans, subtle
differences in our individual experiences with topics result in a wide range of extra-musical
connotations. For instance, Tagg and Clarida (2003) gathered hundreds of responses to ten
brief main title themes from film and TV, seeking to discern a link between musical struc-
tures and listener connotations. They presented musical excerpts without accompanying
visuals and found that music mediates wide-ranging extra-musical notions of “gender, love,
loneliness, injustice, nostalgia, sadness, exoticism, nature, crime, normality, urgency, fash-
ion, fun, [and] the military” (p. 17).13
Some studies seek to identify the source of such wide-ranging differences in connota-
tions.14 Tagg (1989) discerns specific topical markers corresponding to collectively shared,
gendered associations, presented in Figure VI.5, and reflects on the power of music to
advance and construct stereotypes, warning us that “we need to know how music can make
us think and feel about different sorts of people” (p. 17).
Regardless of individual differences in the associations, we recognize most topical mark-
ers within seconds, or even faster. For instance, Gjerdingen and Perrott (2008) suggest that
when listeners scan a radio dial, they recognize styles and genres in about a quarter of a
second.15 This rapid recognition of markers contributes to the expressive power of topics to
define filmic elements, such as genre, narrative setting, or character stereotypes, elements
that would take significantly more time to develop via other cinematic means.
Notes
1. We subliminally learn to recognize the surface-level markers (signifier) that define these musical
units of meaning (signified) through consistent exposure.
2. Whereas musical topics (also known as “topoi”) are specific musical gestures with extra-musical
meaning, musical archetypes may include elements such as musical forms, instrumentation, or
performance techniques, which may be used ‘topically’ to convey meaning. Due to the ambigu-
ous boundary between these two terms, within the context of this book, I use the more conven-
tionalized “musical topic”.
3. This section is not intended to summarize the state of the art of topics theory; instead, it offers the
background that allows us to recognize how this analytical and interpretational framework can
seamlessly be extended to explore the film music repertoire.
4. Most eighteenth-century treatises discuss topics as compositional devices and not as ‘signs’ within
the musical discourse.
5. Gjerdingen (1986) identifies several musical ‘schemata’ in eighteenth-century classical music that
manifest themselves as two-voice (melody-bass) paradigms. Although these schemata may elicit
extra-musical associations in some listeners, thus becoming topics in themselves, Gjerdingen pri-
marily addresses their normative use as stylistic constructs and as episodic markers within a formal
structure.
6. Some studies lie at the boundary between historical and empirical musicology. For instance,
Krumhansl (1998) observes that the function of topics differs significantly between listeners, even
within the common-practice repertoire, and notes that while “the topics in the Mozart piece ap-
pear to function as a way of establishing the musical form . . . the topics in the Beethoven piece
are more strongly associated with emotional content” (p. 119). Nevertheless, the methodological
framework grounds this study within cognitive musicology, as it accounts for the experiences of
present-day listeners instead of drawing on writings of contemporary composers and critics to
discern the function of musical topics.
7. Topical investigation extends beyond the mere labeling of topics and explores their interaction
with other musico-structural parameters.
8. Tagg refers to such musical topics as ‘paramusical symbols’ in other publications.
9. Other studies tangentially touch upon the potential of music to mediate socio-cultural conven-
tions (Lastinger, 2011; Shevy, 2008), broadly suggesting that “the more typical the music stands
for a certain film or (pop) music genre, the more clearly it triggers stereotypes and relatively sharp
supra-individual expectations” (Herget, 2021). From a quasi-experimental perspective, Clynes
(1977) proposes that music may embed “sentic types”, which he recognizes as surface musical
features associated with particular emotions; although sentic types seem to align seamlessly with
musical topics, they work at an embodied level, eliciting emotions in the listener, and thus are
removed from the culturally established associations characteristic of topics.
10. Most anthologies, however, were limited in the function of the musical accompaniment and did
not include music that may serve as episodic markers (opening, closing, connecting scenes) or
style quotations (classical, jazz, folk, etc.). Tagg (1989) offers a reduced number of categories rep-
resentative of most anthologies, which include “ ‘animals,’ ‘bright,’ ‘bucolic,’ ‘children,’ ‘comedy,’
‘danger,’ ‘disaster,’ ‘eerie,’ ‘exotic,’ ‘fashion,’ ‘foreign,’ ‘futuristic,’ ‘grandiose,’ ‘happy,’ ‘heavy in-
dustry,’ ‘humour,’ ‘impressive,’ ‘light action,’ ‘melancholy,’ ‘mysterious,’ ‘national,’ ‘nature,’ ‘open
air,’ ‘panoramic,’ ‘pastoral,’ ‘period,’ ‘prestigious,’ ‘religious,’ ‘romance,’ ‘sad,’ ‘scenic,’ ‘sea,’ ‘seri-
ous,’ ‘solitude,’ ‘space’ (cosmos), ‘sport,’ ‘suspense,’ ‘tenderness,’ ‘tension,’ ‘tragic,’ ‘travel,’ ‘water’
and ‘western’ ” (p. 24).
11. There is passionate disagreement among Peircean scholars regarding the true nature of signs. This
disagreement stems, in part, from Peirce’s embarking on a meticulous classification that includes
sixty-six types of signs.
12. Tagg (2012) defines diataxis as “narrative form patterns” such as twelve-bar blues or sonata form,
which may enjoy a connotative dimension within interpretative communities. Similarly, Huron
(2006) addresses formal structure, phrase construction, and cadences as “style-distinguishing”
patterns.
13. In a similar study, Tagg (1989) observed a statistically significant clustering of responses gath-
ered within cohesive semantic fields broadly defined in terms of categorizations stemming from
canned-music libraries and anthologies for silent film accompaniment.
14. Wingstedt et al. (2008) suggest that, to a great extent, these differences stem from musical training
and from (possibly) gendered “differences in habits of listening to music, watching movies and
playing computer games” (p. 211).
15. This leads Huron (2006) to conclude that “long-term structure is not a promising creative tool.
New genres or styles are better established by employing distinctive timbres, textures, or moment-
to-moment note transitions” (p. 208). Therefore, composers seeking to construct effective topics
should attend primarily to timbre and texture, rather than harmonic progressions or rhythmic
figures.
APPENDIX VII
Conceptual Integration
When hypothesizing, conjecturing, imagining, envisioning, deducing, and even when rea-
soning, we subconsciously blend multiple mental images to construct probable scenarios,
generating novel and highly subjective interpretations. Fauconnier and Turner’s (2002) Con-
ceptual Integration theory provides an elegant model that sheds light on how these cogni-
tive mechanisms unfold in our minds.1 This appendix reviews the foundations of conceptual
blending, considers relevant neuroscientific research that supports this theory, and presents
a cursory overview of music-related scholarship drawing on conceptual blending to support
hermeneutic interpretations.
A Buddhist Monk begins at dawn one day walking up a mountain, reaches the top at
sunset, meditates at the top for several days until one dawn when he begins to walk back
to the foot of the mountain, which he reaches at sunset. Make no assumptions about his
starting or stopping or about his pace during the trips. Riddle: Is there a place on the path
that the Monk occupies at the same hour of the day on the two separate journeys?4
FIGURE VII.1 Blended space to solve the riddle of the monk. (After Fauconnier & Turner, 2002)
The riddle of the Monk is not a metaphorical statement; instead, it prompts us to construct
a hypothetical structure that is impossible to materialize: a mental representation in which
the monk is traveling in both directions on the same day. In such a hypothetical scenario,
monk one (traveling in one direction) meets monk two (traveling in the opposite direction)—
the time and place of this event answer the riddle. Figure VII.1 illustrates the thought pro-
cess that solves the riddle.5 Within this four-space model, elements within each space have
their counterpart in another space. These elements are represented iconically or listed in an
abbreviated form inside the corresponding input space: the direction on the journey, the day
of the journey, the mountain with its peak and foot, and the monk on the mountain. In this
conceptual blend, the generic space posits the abstract structure that relates the elements of
the input spaces: a path, a moving individual, a position along the path, a day of travel, and
an unspecified travel direction. Each element in one input space maps onto a corresponding
element in the other input space. The blended space superimposes both input spaces according
to the structure proposed in the generic space: the mountain (along with its peak and foot) is
projected as a single element; the different days are projected into a single day; the moving
individuals, however, are not fused, preserving their direction and relative positioning.6
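The reasoning behind the blend can be sketched numerically. The sketch below superimposes both journeys on a single day and searches for the moment at which the two monks occupy the same position. It assumes that each monk moves continuously along the path, that time runs from 0 (dawn) to 1 (sunset), and that position runs from 0 (foot) to 1 (peak); the two pace functions are hypothetical and could be replaced by any continuous journey with the same endpoints.

```python
import math


def meeting_time(up, down, tol=1e-9):
    """Find t in [0, 1] where up(t) == down(t) by bisection.

    At dawn the ascending monk is below the descending one (up - down = -1);
    at sunset the order is reversed (up - down = +1), so the continuous
    difference must cross zero somewhere in between.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if up(mid) - down(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical paces: a slow start going up, a fast start coming down.
def up(t):
    return t ** 2


def down(t):
    return 1 - math.sqrt(t)


t_meet = meeting_time(up, down)
place = up(t_meet)
```

Whatever the paces, the blended scenario guarantees a meeting point, which is the place (and hour) that answers the riddle.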
and make sense of information by providing a basic set of expectations, assumptions, and
associations—for instance, frames may draw on embodied knowledge such as image sche-
mas, on interpersonal dynamics such as ‘competing’ or ‘flirting’, or on cultural constructs
such as ‘justice’ or ‘love’.
Input spaces emerge when activating the frames within the generic space. Unlike the
generic space, input spaces can draw on knowledge from multiple domains to retrieve spe-
cific information and details, including anything from words and concepts to images and
sensory experiences. The information across input spaces is linked via “vital relations”, the
connective tissue that holds input spaces together. Vital relations are numerous, including
‘analogy’, ‘disanalogy’, ‘cause-effect’, ‘part-whole’, and ‘change’. Like other components of
conceptual blends, vital relations are arbitrarily chosen and can dynamically change as our
thoughts unfold.
When we “run the blend”, attending to the generic space and its frames, and project-
ing elements within the input spaces according to specific vital relations, a blended space
emerges. Nevertheless, the blended space is not simply a combination of the generic and
input spaces, but a new structure that stems from their interaction—an emergent structure
with unique properties not present in the source spaces, including new concepts, relation-
ships, or thought patterns.
Running the blend is an a-chronological process that entails: composition (i.e., con-
structing the image, like placing the two monks on a mountain); completion (i.e., supplying
information or assumptions, like taking for granted that the monks are walking rather than
crawling); elaboration (i.e., fleshing out the frames as they apply to the input spaces, like
‘monastic life’ and its connotations of meditation, prayer, and rituals); and integration (i.e.,
combining elements into a unified event, like the two monks walking on the same mountain
at the same time).
Notes
1. Conceptual Integration theory emerged as a reaction to well-established paradigms that “assume
that natural language semantics can be adequately studied with the tools of formal logic” (Lakoff &
Sweetser, 1994, p. ix). Early formulations of the theory date back to Fauconnier (1985); subsequent
expansions on these early ideas include works by Dinsmore (1991) and Cutrer (1994).
2. Additionally, conceptual blending is akin to “multidimensional scaling”, which entails the coac-
tivation of semantic spaces while positioning their components within a Euclidean space (Bauer,
2021; De Silva & Tenenbaum, 2004).
3. In comparing Conceptual Integration and Conceptual Metaphor theories, Zbikowski (2009) ad-
dresses the notion of mental spaces and conceptual domains, and notes that “the former [is]
understood as ephemeral and pragmatic and the latter as relatively stable and abstract” (p. 370).
This is one of the most immediate differences between the two theories. In addition, while meta-
phors rely on projections from one conceptual domain to another, conceptual blending relies
on the fusion of two domains according to common abstract structures or patterns. Furthermore,
conceptual blending is not concerned with grouping specific examples into larger, more inclusive
categories; rather, it identifies unique instances of emergent patterns of similarity between cog-
nitive domains. One fundamental similarity between these two theories, however, rests in their
regarding metaphor as a conceptual rather than a purely linguistic phenomenon, one that relies
on projections among conceptual spaces. Antović (2018) combines Conceptual Integration and
Conceptual Metaphor theories to flesh out a theory of musical semantics.
4. From Koestler’s (1964) The Act of Creation. Cited in Fauconnier and Turner (2002, p. 39).
5. However, explaining or modeling the thought processes underlying the formation of metaphorical
or hypothetical scenarios does not address the primary motivation for resorting to metaphors or
hypothetical scenarios. Some research draws our attention to a potential motivation: the emotional
underpinnings of the “AHA!” phenomenon that results from finding connections, solving a riddle,
or ‘getting’ a metaphor (Bowden & Jung-Beeman, 2003; Thagard & Stewart, 2011). Thagard and
Aubie (2008), for instance, propose that the “AHA!” moment activates the autonomic nervous
system (i.e., heartbeat, sweating, etc.) and in turn an “emotional experience [arises] from a com-
plex neural process that integrates cognitive appraisal of a situation with perception of internal
physiological states” (p. 9).
6. The blended space may involve elements that are not present in either input space (most often this
is a replacement for a metonymic counterpart from one of the spaces).
7. The primary challenge in supporting conceptual blending from a neural vantage point entails veri-
fying potential mechanisms of “neuronal binding”, which would have further implications on sym-
bolic representations and consciousness (Gibbs, 2001). In light of this challenge, Ritchie (2004)
contends that “[it is] yet unknown how the information from these separate areas bound together . . .
[and] there is no reason to think that such binding requires the replication of conceptual structure
from each in a novel structure. Given the ability of most humans to spin out fanciful narratives,
and to construct complex logical arguments, the creation of an entirely new ‘blended space’ for
each conceptual integration would multiply conceptual representations in the brain to the point
that memory capacity would quickly be exhausted” (p. 33). Nevertheless, within the broader chal-
lenge of explaining consciousness, researchers within theoretical neuroscience have proposed
neurobiologically plausible models of neuronal binding based on synchronization of neural activ-
ity (Crick & Clark, 1994; Engel et al., 1999; Grandjean et al., 2008; van der Velde et al., 2017).
8. Thagard and Stewart (2011) provide a computational-simulation account of neuronal binding as
“the combination of previously unconnected mental representations constituted by patterns of
neural activity” to support creative thinking, broadly understood to include “scientific discovery,
technological invention, social innovation, and artistic imagination” (p. 3). To bridge the gap be-
tween theoretical neuroscience and embodiment, they propose a five-stage rationale that helps
circumvent the problem of neural representation: “1) Creativity results from novel combinations
of representations; 2) In humans, mental representations are patterns of neural activity; 3) Neural
representations are multimodal, encompassing information that can be visual, auditory, tactile, ol-
factory, gustatory, kinesthetic, and emotional, as well as verbal; 4) Neural representations are com-
bined by convolution, a kind of twisting together of existing representations; and 5) The causes of
creative activity reside not just in psychological and neural mechanisms” (p. 2).
9. Fauconnier and Turner (2002) account for such cases by including the “vital relations” of identity
and analogy, which “apply across spaces to the topology of scales, image-schemas, and force-
dynamic patterns inside mental spaces” (p. 105).
10. Kövecses (2002) believes that conceptual metaphors are a special case within the much larger
conceptual blending phenomenon proposed by Fauconnier and Turner. In fact, Fauconnier and
Turner claim to have developed their theory to explain cognitive processes writ large, and not only
those involving metaphorical statements—they refer to conceptual metaphors as “single-scope
networks”.
11. These advantages notwithstanding, Antović (2022) draws on both cognitive-linguistic theories,
Conceptual Metaphor and Conceptual Integration, to formulate a “multilevel-grounded
semantics”.
APPENDIX VIII
Categorization
Categorization Approaches
Categories are, by and large, socially and culturally defined—the labels, category mem-
berships, and associations brought to mind by categories are strongly influenced by our
cultural environments and past experiences.3 Categories of film music topics, for instance,
are defined by a large body of films and our experiences with those films. As this body of
work accumulates, additional defining characteristics emerge and become emblematic of
the category itself.
Numerous models that explain the cognitive underpinnings of category construction
have been developed within philosophy, psychology, the cognitive sciences, and related
fields. Here, I review the two most influential approaches: rule-based and probabilistic.
Rule-Based Approach
In his treatise Categories, Aristotle systematized an approach wherein category members
feature singly necessary and jointly sufficient features—this approach is now referred to as
“classical” or “rule-based”.4 For instance, each member of the category ‘pentagon’, such as
those shown in Figure VIII.1, must satisfy three singly necessary and jointly sufficient conditions:
(1) it must be a closed geometric form, (2) it must have five sides, and (3) its five internal angles
must add up to 540 degrees. The absence of any one of these defining features disqualifies
a shape from membership in the ‘pentagon’ category.
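A rule-based test of this kind translates directly into a logical conjunction. The following minimal sketch encodes the three conditions listed above; the `Shape` record and its field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Hypothetical feature record for a geometric figure."""
    is_closed: bool
    sides: int
    interior_angle_sum: float  # in degrees


# Each rule is singly necessary; together they are jointly sufficient.
PENTAGON_RULES = (
    lambda s: s.is_closed,
    lambda s: s.sides == 5,
    lambda s: abs(s.interior_angle_sum - 540.0) < 1e-9,
)


def is_pentagon(shape):
    """Membership is all-or-nothing: every rule must hold."""
    return all(rule(shape) for rule in PENTAGON_RULES)


regular_pentagon = Shape(is_closed=True, sides=5, interior_angle_sum=540.0)
open_figure = Shape(is_closed=False, sides=5, interior_angle_sum=540.0)
hexagon = Shape(is_closed=True, sides=6, interior_angle_sum=720.0)
```

Because the test returns only true or false, it cannot express degrees of membership.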
A rule-based model works well for geometric figures and countless other categories, par-
ticularly because it allows us to formulate proto-logical expressions that simplify complex
ideas, define the precise boundaries of categories, and apply uniformly to examples when
testing for category membership.5 However, this model crumbles when approaching more
fluid, context-based categorizations that reflect degrees of inclusion. Semantic (or artificial)
categories present such challenges; concrete ones such as ‘furniture’ or ‘musical instrument’,
and certainly more abstract ones such as ‘irony’, all defy attempts to define the necessary and
sufficient features that would delineate their boundaries. To address these shortcomings,
Rosch and Mervis (1975) and Rosch (1978) built upon Wittgenstein’s (1953) notion
of ‘family resemblance’ and further explored and systematized a probabilistic approach to
categorization.
Probabilistic Approach
In his Philosophische Untersuchungen [Philosophical Investigations] (1953), Wittgenstein
identifies artificial categories where members do not share necessary and sufficient fea-
tures, but where members are instead related by overlapping attributes not common to all
members—in the category ‘games’, for example, category members are related by overlap-
ping (but not necessary or sufficient) attributes such as boards, cards, players, balls, and
rules. Rosch and Mervis (1975) conducted a set of experiments that systematized a mode
of categorization in which members illustrate variable and graded similarity relationships,
setting up the foundation for the “exemplar” and the “prototype” models of categorization.
In both the exemplar and prototype models, statistically prominent attributes determine
category membership, and the degree of category membership of new examples depends on
the degree to which they exhibit those attributes; in both, also, new examples are compared
to existing representations in memory. But whereas in the exemplar model we make category
judgments by comparing new items to memory traces of individual category members (or
‘exemplars’), in the prototype model we make category judgments based on an abstract cate-
gory member (or ‘prototype’) which exhibits the central tendencies of all category members.6
For instance, the exemplar model holds that in categorizing an animal as a ‘bird’, we
compare it to existing memory traces of exemplars within the ‘bird’ category; in contrast, the
prototype model holds that in categorizing an animal as a ‘bird’, we compare it to a single
abstraction that subsumes the central characteristics of all members of the bird category.7
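The contrast between the two models can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the three numeric features, their values, the exponential similarity function, and the use of a simple average are not taken from Rosch and Mervis or from any specific published model. The only point is structural: the exemplar score compares a new item with every stored member, whereas the prototype score compares it with one averaged abstraction.

```python
from math import dist, exp
from statistics import fmean
from typing import Iterable, List, Tuple

# Hypothetical features: (can_fly, has_feathers, body_size), each scaled 0..1.
Features = Tuple[float, float, float]

BIRD_EXEMPLARS: List[Features] = [
    (1.0, 1.0, 0.1),  # wren
    (1.0, 1.0, 0.2),  # robin
    (0.0, 1.0, 0.7),  # penguin
]


def similarity(a: Features, b: Features) -> float:
    """Similarity decays with distance (an assumed, illustrative choice)."""
    return exp(-dist(a, b))


def exemplar_score(item: Features, exemplars: Iterable[Features]) -> float:
    """Exemplar model: compare the item with each stored memory trace."""
    return fmean(similarity(item, e) for e in exemplars)


def prototype(exemplars: List[Features]) -> Features:
    """Prototype: the central tendency (here, the mean) of all members."""
    return tuple(fmean(column) for column in zip(*exemplars))


def prototype_score(item: Features, exemplars: List[Features]) -> float:
    """Prototype model: compare the item with the single abstraction."""
    return similarity(item, prototype(exemplars))


sparrow: Features = (1.0, 1.0, 0.15)
hamster: Features = (0.0, 0.0, 0.1)
```

With these made-up values, both scores rank the sparrow as more bird-like than the hamster; the models differ in what is stored and consulted, not necessarily in the judgment they produce.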
In later work, Rosch (1978) further develops the prototype model. She observes that
graded similarity to a prototype depends on the weights we give to statistically prominent
attributes and to the values within attributes. Examples sharing a greater number of heavily
weighted attributes and values are more typical members of the category. For the ‘bird’ cat-
egory, for instance, the ‘mode of locomotion’ is one of the most heavily weighted attributes;
and within the ‘mode of locomotion’ attribute, ‘flying’ is more heavily weighted than ‘run-
ning’ or ‘swimming’. Based on this weighting of attributes and values, wrens feature a higher
degree of bird-like typicality than chickens, whereas penguins exhibit a lower degree of typi-
cality than either chickens or wrens. Figure VIII.2 maps the possible cognitive mechanism
220 Appendix VIII: Categorization
the prior example, for instance, individuals would thus name the object ‘piano’, even when
shown a picture of a grand piano. She argues that this tendency reflects a balance between
efficiency (by minimizing the number of categories considered) and informativeness (by max-
imizing the most relevant information): calling an object “instrument” is highly efficient yet
poorly informative; calling the same object a “fortepiano” is highly informative but not efficient;
calling it a “piano” strikes the optimal balance between efficiency and informativeness.11
In addition to a ‘vertical’ organizational principle based on levels of inclusiveness, Rosch
contemplates a ‘horizontal’ dimension, which accounts for the segmentation of categories
at a particular level, such as including ‘piano’, ‘timpani’, and ‘theremin’ within the ‘musical
instrument’ superordinate category. (See Figure VIII.4.) She notes that in defining a category,
individuals draw on prototypes that feature the most representative attributes of a category.
In our example, ‘piano’ is more representative of the musical instrument category than tim-
pani or theremin.
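The graded typicality described earlier for the ‘bird’ category, where attributes and the values within them carry different weights, can be sketched as a weighted sum. The attribute names, the numeric weights, and the `typicality` function below are hypothetical; the text specifies only that ‘mode of locomotion’ is heavily weighted and that ‘flying’ outweighs ‘running’ and ‘swimming’, so the relative weighting of ‘running’ over ‘swimming’ is an additional assumption chosen to reproduce the ordering of wrens, chickens, and penguins.

```python
# Assumed weight of each attribute within the 'bird' category.
ATTRIBUTE_WEIGHTS = {"locomotion": 0.6, "covering": 0.4}

# Assumed weight of each value within an attribute.
VALUE_WEIGHTS = {
    "locomotion": {"flying": 1.0, "running": 0.5, "swimming": 0.2},
    "covering": {"feathers": 1.0, "fur": 0.0},
}


def typicality(example: dict) -> float:
    """Sum attribute weight times value weight over all attributes."""
    total = 0.0
    for attribute, weight in ATTRIBUTE_WEIGHTS.items():
        value = example.get(attribute)
        # Unknown or missing values contribute nothing to typicality.
        total += weight * VALUE_WEIGHTS[attribute].get(value, 0.0)
    return total


wren = {"locomotion": "flying", "covering": "feathers"}
chicken = {"locomotion": "running", "covering": "feathers"}
penguin = {"locomotion": "swimming", "covering": "feathers"}
```

Given these illustrative weights, the wren scores highest, the chicken lower, and the penguin lowest, so all three remain members of the category while differing in how typical they are.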
Notes
1. An efficient categorization process provides an ecological and evolutionary advantage, as minor
errors in classification may have drastic results; for instance, we categorize animals as snakes or
hamsters so as to infer attributes such as dangerous or harmless.
2. The neurological underpinnings of categorization are, to a great extent, unknown. Neuroscien-
tific evidence suggests that we employ different cognitive mechanisms depending on our needs,
circumstances, and capabilities. Farah and McClelland (1992), for instance, argue that categories
are represented differently in the brain depending on whether they derive from items’ surface-level features
or from their function, particularly when the items categorized could be understood as either
‘living’ or ‘nonliving’—that is, categorization of living items is strongly dependent on surface-level
features, whereas categorization of nonliving items is strongly dependent on their function. Smith
et al. (2016) provide a different perspective, noting “the presence in human and primate brains
of multiple, dissociable processes within the overall category-learning system” (p. 13) and assert-
ing that we engage different cognitive strategies for categorization depending on the ‘density’ of
categories, resorting to the exemplar model for sparse categories and to the prototype model for
densely populated categories.
3. Lakoff’s (1987) principle of ‘domain-of-experience’ extends the notion of culturally defined categories
even further, to identify categories whose members are all related by a cultural or
collective experience; his most telling example is the category ‘balan’ in the Dyirbal language,
which includes “women”, “fire”, and “dangerous things”.
4. Aristotle argued that every object or concept belongs to a specific category or genus, and that
this category can be determined by examining the defining characteristics or essential properties
of the object or concept. This approach, also known as “essentialism”, became a cornerstone of
Aristotle’s and his followers’ philosophies.
5. Because of these benefits, this approach is optimal from an evolutionary perspective, as it allows
us to make rapid inferences or predictions based on partial information or a subset of features
(Medin & Coley, 1998).
6. Rosch (1978) characterizes prototypes as “those members of a category that most reflect the re-
dundancy structure of the category as a whole” (p. 12). However, because a prototype does not
correspond to an actual instance representative of a category, a prototype could be qualified as an
average or central tendency of all category members (Kruschke, 2008).
7. Recently, the exemplar model has fallen out of favor because of its weaknesses related to memory
and information retrieval, which may result in ecological and adaptive risks. The prototype model,
instead, offers the ecological advantage of allowing for category judgments based on a single ab-
stract prototype, rather than on a number of exemplars stored in memory. That said, Rosseel (2002)
considers the possibility of multiple prototypes per category.
8. These tendencies toward ecological and evolutionary cognition were pervasive within the schol-
arly community of the time, yet scholars seldom acknowledged each other’s work. Rosch (1978)
notes that “given an actor with the motor programs for sitting, it is a fact of the perceived world that
objects with the perceptual attributes of chairs are more likely to have functional sit-on-able-ness
than objects with the appearance of cats” (p. 4). Additionally, from an ecological perspective, she
notes that the perception of certain attributes is species-specific; for instance, because a “dog’s
sense of smell is more highly differentiated than a human’s, [the] structure of the world for a dog
must surely include attributes of smell that we, as a species, are incapable of perceiving . . . Fur-
thermore, because a dog’s body is constructed differently from a human’s, its motor interactions
with objects are necessarily differently structured” (p. 4). Although not directly cited within her
work, these passages are a distinct nod to Gibson’s notion of affordances. In turn, in further de-
veloping his notion of affordances, Gibson (1979) draws on Rosch’s notion of prototypes, albeit
without explicitly citing her. He notes, for instance, that “the fact that a stone is a missile does not
imply that it cannot be other things as well. It can be a paperweight, a bookend, a hammer, or a
pendulum bob. It can be piled on another rock to make a cairn or a stone wall. These affordances
are all consistent with one another. The differences between them are not clear-cut, and the ar-
bitrary names by which they are called [i.e., categories] do not count for perception” (p. 134).
Gibson then adds that “The theory of affordances rescues us from the philosophical muddle of
assuming fixed classes of objects, each defined by its common features and then given a name . . .
You do not have to classify and label things in order to perceive what they afford” (p. 134).
9. See The Triplets of Belleville. [00:55:10]
10. In addition, category members at the basic level may activate motor programs and modal repre-
sentations more vividly than at the more abstract and disembodied superordinate levels (Grodal,
2009). For instance, the basic level ‘piano’ elicits a more immediate response during retrieval than
the more abstract superordinate level ‘instrument’.
11. The balance between efficiency and informativeness notwithstanding, situational or contextual variables
prompt different levels of categorization. For instance, an orchestra conductor addressing
the woodwind performers would opt for the subordinate level ‘piccolo flute’ over the basic level
‘flute’; in such cases, although addressing the subordinate level demands more detailed process-
ing, it helps avoid grave misunderstandings.
APPENDIX IX
Additional Examples
In this last appendix, I offer additional examples that may be of interest to readers. To con-
textualize the music within the film’s action and storyline, a summary of the film and the
scene introduces each example.1 Following those brief introductions, I offer questions and
discussion points to draw the reader’s attention to salient musical facets. However, there is
no single or correct answer to any of the questions and discussion points; instead, these will
prompt further reflection and give readers an opportunity to apply the ESMAMAPA frame-
work in exploring their own sensitivity and intuitions about film music’s power to shape
our experiences and interpretations. Last, while the cognitive and semiotic mechanisms the
ESMAMAPA framework encompasses may prove sufficient when unpacking most interpreta-
tions, I include questions at the end of the appendix that will prompt the reader to test this
framework’s explanatory limits and underlying premises.
1917
Several devastating years into World War I, two British lance corporals—Schofield and
Blake—race against time, crossing over into enemy territory to deliver a message that could
save thousands, including Blake’s brother. In a scene early in the film, the two soldiers
embark on their journey through no-man’s-land, first reaching abandoned German trenches.
• How does the music reflect the psychological tension leading to the discovery that the
trenches have been abandoned?
• As the soldiers cross through the mud, they glimpse corpses trapped in barbed wire. How
does the music allow us access to the characters’ psychological apprehension?
• What effect does the final cadential gesture in this music cue have on our interpretation
of the storyline?
45 Years
Kate and Geoff plan to celebrate their 45th wedding anniversary with many friends. Weeks
before the celebration, Geoff receives a letter notifying him that his first love, Katya, was
discovered after many years, frozen in the glaciers in Switzerland. As the party nears, Kate
finds pictures that Geoff kept from his time with Katya in the Swiss Alps and learns that Katya
was pregnant when she died. In the film’s last scene, during the celebration, Kate and Geoff
dance to their wedding song: “Smoke Gets in Your Eyes” by The Platters.
• How do the lyrics of the song color our perception of their marriage and their future as a
couple?
• What other musical elements contribute to this perception?
• How do other music cues in the film help delineate Kate’s emotional journey?
A.I.
It is the twenty-second century. Natural disasters stemming from global climate change
have reduced the world’s population. Henry and Monica, whose child is in suspended ani-
mation because of a rare disease, adopt David, a robotic boy capable of experiencing love.
In an early scene, Monica struggles to accept David as her child and tests his (and her own)
boundaries of adaptive human behavior.
• Why does the music introduce musical gestures predominantly in the high register?
• How does the tension between David’s human appearance and his/its mechanical con-
struction play within the music’s timbres?
• The music introduces dissonant undertones within an otherwise innocent, childlike tex-
ture. What is the purpose of these dissonances?
• How does the cadential gesture at the end of the scene influence our perception of the
storyline?
Avengers: Endgame
After Thanos erased half of all life and decimated the universe, the Avengers plan to use the
Infinity Stones to reverse his actions. In the final scene, Captain America, Thor, and Iron Man
cannot take control of Thanos’s brutal forces. In defeat, Captain America seems to surrender.
At that moment, a portal opens. Okoye, T’Challa, and Shuri appear, and then Falcon flies
out. Other portals open, bringing Dr. Strange, Spider-Man, other Avengers, the Guardians
of the Galaxy, the Ravagers, and the armies of Wakanda and Asgard. At Captain America’s
command, “Avengers: assemble!”, they join forces to fight Thanos and his army.
• As Okoye, T’Challa, and Shuri appear, what instrumental choices bring about thoughts of
honor and courage?
• How does the music reflect the ever-strengthening coalescing forces as superheroes arrive?
• Although the music initially projects an ametric structure, it gradually defines a duple
meter. How does the music’s temporality influence our perception of the events to unfold?
Beast
In a remote community coping with a string of unsolved rapes and murders of young girls,
troubled 27-year-old Moll falls for a mysterious outsider. As their relationship blossoms, he
empowers her to escape her wealthy yet oppressive family. He comes under suspicion for
the rapes and murders, but she defends him. In the film’s opening scene, Moll sings in a
choir conducted by her mother; interspersed, we see flashes of some of the missing girls’
obituaries. In a later scene, the police detain Moll and bring her to their headquarters for
interrogation.
• In the first scene, the choir piece’s idyllic, perfectly in-tune setting becomes increasingly
distorted with emergent low, noisy, dissonant rumblings. What is the music conveying at
this moment in the film? How does the music prompt us to construct that interpretation?
• In the latter scene, the music blends with the sound of Moll’s heartbeat. Although her
heart’s pace remains steady, the loudness noticeably increases. How does the soundtrack
influence us viscerally at this moment in the film?
Cassandra’s Dream
The Blaine brothers, Ian and Terry, enjoy a life of good fortune. Their luck soon changes when
they purchase a luxury sailboat at an oddly low price and name it “Cassandra’s Dream”,
unaware of the mythological antecedents of the name.
• What effect does the minimalist-style score have on our perception of the characters and
the narrative?
• How does the inconclusive nature of the cadential gestures to (nearly) all cues resonate
with the film’s central message?
• What is the combined effect of the minimalist-style score and the inconclusive gestures
on either realizing or thwarting our musical (and dramatic) expectations?
Chicken Run
The lives of (anthropomorphic) egg-laying chickens are doomed when a chicken-pie-making
machine arrives at the chicken farm—they must organize and escape. The main title
sequence illustrates a series of unsuccessful attempts at breaking free—unable to fly, the
chickens try some bizarre strategies.
• What topic do the surface-level musical features suggest? What is the function of such a
topic during the main title sequence?
• Toward the end of the main title sequence, there is a sudden shift to a triple meter. How
does this shift modulate our perception of that moment in the film?
• What type of cadence concludes the main title sequence? To what effect?
Deadpool
Wade Wilson, a former Special Forces operative, becomes romantically involved with
Vanessa. Diagnosed with terminal cancer, Wade vanishes to save Vanessa from heartache.
He submits himself to Ajax, a mad scientist, who misleadingly promises to heal him and
injects him with a serum to awaken his mutant genes. The experiment disfigures and
transforms Wade into Deadpool, yet leaves him with enhanced healing powers, which
he uses to hunt down Ajax. In the film’s last scene, Deadpool saves Vanessa from a col-
lapsing ship. The self-healing but disfigured superhero apologizes to his former girlfriend
for abandoning her, and although she still has trouble with Wade, she is willing to move
forward with Deadpool.
• What is the role of the ‘sexophone’ topic in the scene? Does it function primarily to sup-
port the narrative, indicate the film’s genre, or define the character’s identity?
• As Vanessa forgives Wade and acknowledges that she will eventually get used to his hide-
ous face, how does the sound design reflect his taking control of the narrative?
Midsommar
Dani and Christian travel to Sweden to visit their friend’s rural hometown during its midsum-
mer festivities. Their idyllic retreat is soon transformed by a series of violent and uncanny
rituals led by a Scandinavian pagan cult. In an early scene, Dani awakens from a strange
psychedelic mushroom trip and joins others in their hike toward the village. Upon arrival,
the village’s members greet and welcome them to the celebrations with music, food, and
gifts.
• A repeating, harmonically static upward gesture on flutes underscores the long journey.
How does the repetition of this gesture impact us?
• How does the music’s metrical structure influence our perception of time?
• As the group arrives, a flute ensemble welcomes them, prompting a transference in the
diegesis. How does this transference subliminally prepare us for the events to unfold?
Mission Impossible II
Ethan Hunt leads the Impossible Mission Force team on a quest to seize a deadly virus
before it reaches the hands of terrorists. In an early scene, Ethan travels to Seville, Spain, to
recruit Nyah Nordoff-Hall, a skillful thief and the chief terrorist’s former girlfriend; he spots
her at a private gathering, stealing a necklace. In the film’s last scene, Nyah is cleared of her
criminal record and joins Ethan for a vacation in Sydney.
• What topic does the Flamenco guitar suggest in the first scene during the private gathering?
• The guitar melody will become their leitmotif. How do the meter, modality, and other
musical parameters reflect the characters’ relationship?
• How does the music map the slow-motion in the visuals as they first cross paths?
• What are the most significant leitmotif transformations in the ending scene? What
meaning(s) do these transformations convey, and how do these convey such meaning(s)?
Saving Private Ryan
In this film, Captain John Miller leads his squad behind enemy lines to rescue Private James
Ryan. In the film’s first scene, an aging veteran walks through a cemetery and becomes over-
whelmed with emotions as he recalls his time as a soldier. The film’s last scene reveals that
the elderly veteran is Ryan, filled with feelings of grief and gratitude while saluting Miller’s
gravestone.
• How do the music’s cadential gestures in the first and last scenes help us embark on the
characters’ journeys and subsequently bring that journey to a conclusion?
• What instrumental choices bring about thoughts of honor, decency, and courage?
The Hunger Games: Mockingjay, Part 1
• Although the song initially feels like an innocent nursery rhyme, its lyrics project a rebel
anthem. How does this tension play out within the music and the film’s context?
• How do the song’s Appalachian flavor, duple meter, and increasing loudness and number
of voices impact our understanding of the scene?
• Katniss’s diegetic singing smoothly unfolds into a non-diegetic orchestral rendition of the
song. What sound design technique is at play? What are the narrative and interpretational
entailments of using this technique at this moment in the film?
The Lobster
In a dystopian world, single people have a month and two weeks to find a partner, or they are
converted into a beast of their choice and freed into the wild. David, who has recently become
single after a 12-year relationship, is eager to find a new partner, but realizing he has no outstanding
physical traits, he proactively chooses to become a lobster. In one scene, the Greek song
“Ti einai afto’ pou to lene agape” underscores slow-motion visuals of a ritual in which single
individuals hunt each other in the woods. The lyrics to the first verse translate to “What is it
that is called love? What is that? What drives the heart secretly, and who feels that nostalgia?”
• How do the song and its lyrics work in tandem with (or against) the visuals to enhance the
tension between their and our conception of love?
Up
Carl and Ellie fantasize about escaping life’s hurdles by restoring an abandoned house and
moving it to a mountain peak overlooking Paradise Falls. The opening sequence, which
contains only music and no dialogue or sound effects, spans their entire life together, from
marriage to rebuilding the house, her suffering a miscarriage, saving for a trip to Paradise
Falls but repeatedly spending those savings on more pressing needs, and her illness and
untimely death.
• How do the piece’s ‘music box’ sound and other timbres influence our interpretation?
• Balloons are a vital element in the story. How is their movement (up and down) reflected
in the score?
• The montage spans a great number of moments in their married life. How does the music
help delimit these moments?
• Although much of the montage shows them working, how does the music’s meter infuse
a more playful tone?
• How are the events, such as Ellie’s suffering after her miscarriage or Carl’s mourning of
Ellie’s death, reflected in the music?
Note
1. Note to instructors: The Copyright Act allows for showing copyrighted films or clips during regular
classroom instruction, and only to students in the course (thus excluding ‘public’ viewings). Instruc-
tors must use a legal copy of the film and may not charge a fee to the students.
REFERENCES
Abril, C. (2001). The use of labels to describe pitch changes by bilingual children. Bulletin of the Coun-
cil for Research in Music Education, 151, 31–40.
Acitores, A. P. (2011). Towards a theory of proprioception as a bodily basis for consciousness in
music. Music and Consciousness: Philosophical, Psychological, and Cultural Perspectives, 1, 215.
Adolph, K. E., Eppler, M. A., & Gibson, E. J. (1993). Crawling versus walking infants’ perception of
affordances for locomotion over sloping surfaces. Child Development, 64(4), 1158–1174.
Agawu, V. K. (1991). Playing with signs: A semiotic interpretation of classic music. Princeton University
Press.
Agosta, L. (2014). A rumor of empathy: Rewriting empathy in the context of philosophy. Springer.
Agus, T. R., Thorpe, S. J., Suied, C., & Pressnitzer, D. (2010, May). Characteristics of human voice
processing. In Proceedings of 2010 IEEE international symposium on circuits and systems (pp. 509–
512). IEEE.
Alger, S. E., Kensinger, E. A., & Payne, J. D. (2018). Preferential consolidation of emotionally salient
information during a nap is preserved in middle age. Neurobiology of Aging, 68, 34–47.
Anderson, M. L. (2010). Neural reuse: A fundamental organizational principle of the brain. Behavioral
and Brain Sciences, 33(4), 245–266.
Anderson, M. L. (2016). Précis of after phrenology: Neural reuse and the interactive brain. Behavioral
and Brain Sciences, 39.
Anishchenko, V. S., Balanov, A. G., Janson, N. B., Igosheva, N. B., & Bordyugov, G. V. (2000). Entrain-
ment between heart rate and weak noninvasive forcing. International Journal of Bifurcation and
Chaos, 10(10), 2339–2348.
Antović, M. (2018). From expectation to concepts: Toward multilevel grounding in musical semantics.
Cognitive Semiotics, 9(2), 105–138.
Antović, M. (2022). Form, meaning and intentionality: The case of metaphor in music. In Metaphors
and analogies in sciences and humanities: Words and worlds (pp. 553–577). Springer International
Publishing.
Asch, S. E., & Nerlove, H. (1960). The development of double function terms in children. In H. Wapner &
B. Kaplan (Eds.), Perspectives in psychological theory (pp. 41–60). International Universities Press.
Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control pro-
cesses. In Psychology of learning and motivation (Vol. 2, pp. 89–195). Academic Press.
Atkinson, S. (2011). Canons, augmentations, and their meaning in two works by Steve Reich. Music
Theory Online, 17(1).
Aurnague, M., Hickmann, M., Vieu, L., & Shtalbi, H. (2007). The categorization of spatial entities in
language and cognition (Human Cognitive Processing). John Benjamins.
Aziz-Zadeh, L., Iacoboni, M., Zaidel, E., Wilson, S., & Mazziotta, J. (2004). Left hemisphere motor
facilitation in response to manual action sounds. European Journal of Neuroscience, 19(9),
2609–2612.
Bach, P., Nicholson, T., & Hudson, M. (2014). The affordance-matching hypothesis: How objects guide
action understanding and prediction. Frontiers in Human Neuroscience, 8, 254.
Baddeley, A. D., & Hitch, G. (1974). Working memory. In Psychology of learning and motivation (Vol.
8, pp. 47–89). Academic Press.
Baker, D. J., & Müllensiefen, D. (2017). Perception of leitmotives in Richard Wagner’s Der Ring des
Nibelungen. Frontiers in Psychology, 8, 662.
Balkwill, L. L., & Thompson, W. F. (1999). A cross-cultural investigation of the perception of emotion
in music: Psychophysical and cultural cues. Music Perception, 17(1), 43–64.
Bargh, J. A., & Chartrand, T. L. (1999). The unbearable automaticity of being. American Psychologist,
54(7), 462.
Bargh, J. A., Williams, L., Huang, J., Song, H., & Ackerman, J. (2010). From the physical to the psycho-
logical: Mundane experiences influence social judgment and interpersonal behavior. Behavioral
and Brain Sciences, 33(4), 267–268.
Barsalou, L. W. (2008). Cognitive and neural contributions to understanding the conceptual system.
Current Directions in Psychological Science, 17(2), 91–95.
Barsalou, L. W., Simmons, W. K., Barbey, A. K., & Wilson, C. D. (2003). Grounding conceptual knowl-
edge in modality-specific systems. Trends in Cognitive Sciences, 7(2), 84–91.
Barthes, R. (1985). The responsibility of forms: Critical essays on music, art, and representation. Uni-
versity of California.
Bartlett, F. C. (1933). Remembering: A study in experimental and social psychology. British Journal of
Educational Psychology, 3(2), 187–192.
Bauer, A. (2021). Tone-color, movement, changing harmonic planes: Cognition, constraints and con-
ceptual blends in modernist music (eScholarship). University of California.
Bazelon, I. (1975). Knowing the score: Notes on film music. Van Nostrand Reinhold Company.
Bell, D. A. (1994). Getting the best score for your film: A filmmakers’ guide to music scoring. Silman-
James Press.
Belletto, S. (2008). “Cabaret” and antifascist aesthetics. Criticism, 50(4), 609–630.
Berman, M. G., Jonides, J., & Lewis, R. L. (2009). In search of decay in verbal short-term memory. Jour-
nal of Experimental Psychology: Learning, Memory, and Cognition, 35(2), 317.
Bestelmeyer, P. E., Maurage, P., Rouger, J., Latinus, M., & Belin, P. (2014). Adaptation to vocal
expressions reveals multistep perception of auditory emotion. Journal of Neuroscience, 34(24),
8098–8105.
Bey, C., & McAdams, S. (2002). Schema-based processing in auditory scene analysis. Perception &
Psychophysics, 64(5), 844–854.
Biancorosso, G. (2010). The shark in the music. Music Analysis, 29(1–3), 306–333.
Biancorosso, G. (2013). Memory and the leitmotif in cinema. In Representation in western music.
Cambridge University Press.
Blood, A. J., Zatorre, R. J., Bermudez, P., & Evans, A. C. (1999). Emotional responses to pleasant and
unpleasant music correlate with activity in paralimbic brain regions. Nature Neuroscience, 2(4),
382–387.
Boltz, M., Schulkind, M., & Kantra, S. (1991). Effects of background music on the remembering of
filmed events. Memory & Cognition, 19(6), 593–606.
Boltz, M. G. (2001). Musical soundtracks as a schematic influence on the cognitive processing of
filmed events. Music Perception, 18(4), 427–454.
Boltz, M. G. (2004). The cognitive processing of film and musical soundtracks. Memory & Cognition,
32(7), 1194–1205.
Boltz, M. G., Ebendorf, B., & Field, B. (2009). Audiovisual interactions: The impact of visual informa-
tion on music perception and memory. Music Perception, 27(1), 43–59.
Bourne, J. (2021). Hearing film music topics outside the movie theatre: Listening cinematically to pas-
torals. In The Oxford handbook of cinematic listening. Oxford University Press.
Bowden, E. M., & Jung-Beeman, M. (2003). Aha! Insight experience correlates with solution activation
in the right hemisphere. Psychonomic Bulletin & Review, 10(3), 730–737.
Brandt, A. (2021). Defining creativity: A view from the arts. Creativity Research Journal, 33(2), 81–95.
Brandt, L. (2009). Metaphor and the communicative mind. Cognitive Semiotics, 5(1–2), 37–107.
Bregman, A. (1990). Auditory scene analysis: The perceptual organization of sound. MIT Press.
Bremner, A. J., Caparos, S., Davidoff, J., de Fockert, J., Linnell, K. J., & Spence, C. (2013). ‘Bouba’ and
‘Kiki’ in Namibia? A remote culture makes similar shape–sound matches, but different shape–taste
matches to westerners. Cognition, 126, 165–172.
Bribitzer-Stull, M. (2015). Understanding the leitmotif. Cambridge University Press.
Brochard, R., Abecasis, D., Potter, D., Ragot, R., & Drake, C. (2003). The “ticktock” of our internal
clock: Direct brain evidence of subjective accents in isochronous sequences. Psychological Sci-
ence, 14(4), 362–366.
Brodsky, W., Kessler, Y., Rubinstein, B. S., Ginsborg, J., & Henik, A. (2008). The mental representation
of music notation: Notational audiation. Journal of Experimental Psychology: Human Perception
and Performance, 34(2), 427.
Browne, R. (1981). Tonal implications of the diatonic set. In Theory Only, 5(6), 3–21.
Buccino, G., Riggio, L., Melli, G., Binkofski, F., Gallese, V., & Rizzolatti, G. (2005). Listening to action-
related sentences modulates the activity of the motor system: A combined TMS and behavioral
study. Cognitive Brain Research, 24(3), 355–363.
Buccino, G., Vogt, S., Ritzl, A., Fink, G. R., Zilles, K., Freund, H. J., & Rizzolatti, G. (2004). Neural circuits
underlying imitation learning of hand actions: An event-related fMRI study. Neuron, 42(2), 323–334.
Buhler, J. (2016). Branding the franchise: Music, opening credits, and the (corporate) myth of origin.
In Music in epic film (pp. 17–40). Routledge.
Bullerjahn, C., & Güldenring, M. (1994). An empirical investigation of effects of film music using
qualitative content analysis. Psychomusicology: A Journal of Research in Music Cognition, 13(1–2), 99.
Bundgaard, P. F. (2009). Are cross-domain mappings psychologically deep, but conceptually shallow?
What is still left to test for conceptual metaphor theory. Cognitive Semiotics, 5(1–2), 400–407.
Burianova, H., McIntosh, A. R., & Grady, C. L. (2010). A common functional brain network for
autobiographical, episodic, and semantic memory retrieval. NeuroImage, 49(1), 865–874.
Butler, D. (1989). Describing the perception of tonality in music: A critique of the tonal hierarchy
theory and a proposal for a theory of intervallic rivalry. Music Perception, 6(3), 219–241.
Byron, T. P. (2008). The processing of pitch and temporal information in relational memory for
melodies [PhD dissertation, University of Western Sydney].
Callan, D. E., Tsytsarev, V., Hanakawa, T., Callan, A. M., Katsuhara, M., Fukuyama, H., & Turner, R.
(2006). Song and speech: Brain regions involved with perception and covert production.
NeuroImage, 31(3), 1327–1342.
Calvin, W. H. (1996). The cerebral code: Thinking a thought in the mosaics of the mind. MIT Press.
Cambouropoulos, E. (2001). Melodic cue abstraction, similarity, and category formation: A formal
model. Music Perception, 18(3), 347–370.
Cantor, J. (2004). “I’ll never have a clown in my house”—why movie horror lives on. Poetics Today,
25(2), 283–304.
Carr, L., Iacoboni, M., Dubeau, M. C., Mazziotta, J. C., & Lenzi, G. L. (2003). Neural mechanisms of
empathy in humans: A relay from neural systems for imitation to limbic areas. Proceedings of the
National Academy of Sciences, 100(9), 5497–5502.
Challis, B. H., Velichkovsky, B. M., & Craik, F. I. (1996). Levels-of-processing effects on a variety of
memory tasks: New findings and theoretical implications. Consciousness and Cognition, 5(1–2), 142–164.
Chao, L. L., & Martin, A. (2000). Representation of manipulable man-made objects in the dorsal
stream. NeuroImage, 12(4), 478–484.
Chattah, J. (2006). Semiotics, pragmatics, and metaphor in film music analysis [Unpublished PhD
dissertation, Florida State University].
Chattah, J. (2008). Conceptual integration and film music analysis. Semiotics, 772–783.
Chattah, J. (2015). Film music as embodiment. In M. Coëgnarts & P. Kravanja (Eds.), Embodied
cognition and cinema. Leuven University Press.
Cherlin, M. (2017). Varieties of musical irony. Cambridge University Press.
Choi, S., McDonough, L., Bowerman, M., & Mandler, J. (1999). Early sensitivity to language-specific
spatial categories in English and Korean. Cognitive Development, 14(2), 241–268.
Cisek, P. (2007). Cortical mechanisms of action selection: The affordance competition hypothesis.
Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1485), 1585–1599.
Clark, A. (1989). Microcognition: Philosophy, cognitive science, and parallel distributed processing.
MIT Press.
Clarke, E. (2005). Ways of listening: An ecological approach to the perception of musical meaning.
Oxford University Press.
Clynes, M. (1977). Sentics: The touch of emotions. Anchor Press.
Cobo, R. M. D. (2003). Parody and satire in Burgess’ A Clockwork Orange and in Kubrick’s cinematic
adaptation. Estudios Humanísticos. Filología, (25), 57–69.
Coëgnarts, M. (2019). Film as embodied art: Bodily meaning in the cinema of Stanley Kubrick.
Academic Studies Press.
Coëgnarts, M., & Kravanja, P. (2012). Towards an embodied poetics of cinema: The metaphoric
construction of abstract meaning in film. Alphaville, 4, 1–18.
Coëgnarts, M., & Kravanja, P. (2015). With the past in front of the character: Evidence for
spatial-temporal metaphors in cinema. Metaphor and Symbol, 30(3), 218–239.
Cohen, A. J. (2014). Film music from the perspective of cognitive science. In The Oxford handbook of
film music studies (pp. 96–130). Oxford University Press.
Cohen, M. E., & Carr, W. J. (1975). Facial recognition and the von Restorff effect. Bulletin of the
Psychonomic Society, 6(4), 383–384.
Cohn, R. (2012). Audacious euphony: Chromatic harmony and the triad’s second nature. Oxford
University Press.
Collier, W. G., & Hubbard, T. L. (2001). Musical scales and evaluations of happiness and
awkwardness: Effects of pitch, direction, and scale mode. American Journal of Psychology, 114(3), 355–375.
Cook, N. (1998). Analysing musical multimedia. Clarendon Press.
Cook, N. (2017). Music, performance, meaning: Selected essays. Routledge.
Cox, A. (1999). The metaphoric logic of musical motion and space [Unpublished PhD dissertation,
University of Oregon].
Cox, A. (2001). The mimetic hypothesis and embodied musical meaning. Musicae Scientiae, 5(2),
195–212.
Cox, A. (2016). Music and embodied cognition: Listening, moving, feeling, and thinking. Indiana
University Press.
Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal
of Verbal Learning and Verbal Behavior, 11(6), 671–684.
Crick, F., & Clark, J. (1994). The astonishing hypothesis. Journal of Consciousness Studies, 1(1), 10–16.
Crowder, R. G., Reznick, J. S., & Rosenkrantz, S. L. (1991). Perception of the major/minor distinction:
V. Preferences among infants. Bulletin of the Psychonomic Society, 29(3), 187–188.
Cutrer, L. M. (1994). Time and tense in narrative and in everyday language [Doctoral dissertation,
University of California, San Diego].
D’Aloia, A. (2012). The intangible ground – A neurophenomenology of the film experience. NECSUS:
European Journal of Media Studies, 1(2), 219–239.
Deacon, T. (2006). The aesthetic faculty. In The artful mind: Cognitive science and the riddle of human
creativity (pp. 21–53). Oxford University Press.
Declerck, G. (2013). Why motor simulation cannot explain affordance perception. Adaptive
Behavior, 21(4), 286–298.
Deliège, I. (1992). Recognition of the Wagnerian leitmotiv: Experimental study based on an excerpt
from Das Rheingold. Musik Psychologie, 9, 25–54.
Deliège, I. (2001). Prototype effects in music listening: An empirical approach to the notion of imprint.
Music Perception, 18(3), 371–407.
Deliège, I., & Mélen, M. (1997). Cue abstraction in the representation of musical form. In Perception
and cognition of music (pp. 374–397). Psychology Press.
Deliège, I., Mélen, M., Stammers, D., & Cross, I. (1996). Musical schemata in real-time listening to a
piece of music. Music Perception, 14(2), 117–159.
Deroy, O., & Auvray, M. (2013). A new Molyneux’s problem: Sounds, shapes and arbitrary crossmodal
correspondences. In O. Kutz, M. Bhatt, S. Borgo, & P. Santos (Eds.), Second international workshop
the shape of things (pp. 61–70). International Association of Ontology and its Applications.
De Silva, V., & Tenenbaum, J. B. (2004). Sparse multidimensional scaling using landmark points (Vol.
120, Technical report). Stanford University.
D’Esposito, M. (2007). From cognitive to neural models of working memory. Philosophical
Transactions of the Royal Society B: Biological Sciences, 362(1481), 761–772.
De Souza, J. (2017). Music at hand: Instruments, bodies, and cognition. Oxford University Press.
De Waal, F. B. (2007). The ‘Russian doll’ model of empathy and imitation. In On being moved: From
mirror neurons to empathy (pp. 35–48). John Benjamins Publishing Company.
Dewell, R. (2005). Dynamic patterns of CONTAINMENT. In B. Hampe & J. Grady (Eds.), From
perception to meaning: Image schemas in cognitive linguistics (pp. 369–394). Walter de Gruyter GmbH.
de Wit, M. M., de Vries, S., van der Kamp, J., & Withagen, R. (2017). Affordances and neuroscience:
Steps towards a successful marriage. Neuroscience & Biobehavioral Reviews, 80, 622–629.
Dimberg, U., & Öhman, A. (1996). Behold the wrath: Psychophysiological responses to facial
stimuli. Motivation and Emotion, 20(2), 149–182.
Dimberg, U., Thunberg, M., & Elmehed, K. (2000). Unconscious facial reactions to emotional facial
expressions. Psychological Science, 11(1), 86–89.
Dinsmore, J. (1991). Partitioned representations. In Partitioned representations (pp. 45–91).
Springer.
Di Stefano, N., Vuust, P., & Brattico, E. (2022). Consonance and dissonance perception. A critical
review of the historical sources, multidisciplinary findings, and main hypotheses. Physics of Life
Reviews, 43, 273–304.
Doelling, K. B., & Poeppel, D. (2015). Cortical entrainment to music and its modulation by
expertise. Proceedings of the National Academy of Sciences, 112(45), E6233–E6242.
Doll, C. (2018). Was it diegetic, or just a dream? Music’s paradoxical place in the film INCEPTION.
Society for Music Theory: Videocast Journal (SMT-V), 4(1).
Donnelly, K. J. (1998). The classical film score forever? Batman, Batman Returns and post-classical film
music. Contemporary Hollywood Cinema, 142–155.
Dowling, W. J. (1972). Recognition of melodic transformations: Inversion, retrograde, and retrograde
inversion. Perception & Psychophysics, 12(5), 417–421.
Dowling, W. J. (1978). Scale and contour: Two components of a theory of memory for melodies.
Psychological Review, 85(4), 341.
Dowling, W. J., & Bartlett, J. C. (1981). The importance of interval information in long-term memory for
melodies. Psychomusicology: A Journal of Research in Music Cognition, 1(1), 30.
Dowling, W. J., & Fujitani, D. S. (1971). Contour, interval, and pitch recognition in memory for
melodies. The Journal of the Acoustical Society of America, 49(2B), 524–531.
Dyson, M. C., & Watkins, A. J. (1984). A figural approach to the role of melodic contour in melody
recognition. Perception & Psychophysics, 35(5), 477–488.
Eitan, Z. (2013). How pitch and loudness shape musical space and motion. In The psychology of
music in multimedia. Oxford University Press.
Eitan, Z. (2017). Cross-modal experience of musical pitch as space and motion: Current research
and future challenges. In Body, sound and space in music and beyond: Multimodal explorations
(pp. 49–68). Routledge.
Eitan, Z., & Granot, R. Y. (2006). How music moves: Musical parameters and listeners’ images of
motion. Music Perception, 23(3), 221–248.
Elleström, L. (2002). Divine madness: On interpreting literature, music, and the visual arts ironically.
Bucknell University Press.
Ellis, R., & Tucker, M. (2000). Micro-affordance: The potentiation of components of action by seen
objects. British Journal of Psychology, 91(4), 451–471.
Engel, A. K., Fries, P., König, P., Brecht, M., & Singer, W. (1999). Temporal binding, binocular rivalry,
and consciousness. Consciousness and Cognition, 8(2), 128–151.
Etzel, J. A., Johnsen, E. L., Dickerson, J., Tranel, D., & Adolphs, R. (2006). Cardiovascular and
respiratory responses during musical mood induction. International Journal of Psychophysiology, 61(1),
57–69.
Everett, Y. U. (2004). Parody with an ironic edge: Dramatic works by Kurt Weill, Peter Maxwell Davies,
and Louis Andriessen. Music Theory Online, 10(4).
Eysenck, M. W., & Keane, M. T. (2015). Cognitive psychology: A student’s handbook. Psychology
Press.
Fagg, A. H., & Arbib, M. A. (1998). Modeling parietal–premotor interactions in primate control of
grasping. Neural Networks, 11(7–8), 1277–1303.
Farah, M. J., & McClelland, J. L. (1992). Neural network models and cognitive neuropsychology.
Psychiatric Annals, 22(3), 148–153.
Fauconnier, G. (1985). Mental spaces: Aspects of meaning construction in natural language. MIT Press.
Fauconnier, G., & Turner, M. (2002). The way we think: Conceptual blending and the mind’s hidden
complexities. Basic Books.
Feldman, J., & Narayanan, S. (2004). Embodied meaning in a neural theory of language. Brain and
Language, 89(2), 385–392.
Festinger, L., Gerard, H. B., Hymovitch, B., Kelley, H. H., & Raven, B. (1952). The influence process in
the presence of extreme deviates. Human Relations, 5(4), 327–346.
Forceville, C., & Jeulink, M. (2011). The flesh and blood of embodied understanding: The
source-path-goal schema in animation film. Pragmatics & Cognition, 19(1), 37–59.
Fotheringhame, D. K., & Young, M. P. (1997). Neural coding schemes for sensory representation:
Theoretical proposals and empirical evidence. Cognitive Neuroscience, 47–76.
Foulds-Elliott, S. D., Thorpe, C. W., Cala, S. J., & Davis, P. J. (2000). Respiratory function in operatic
singing: Effects of emotional connection. Logopedics Phoniatrics Vocology, 25(4), 151–168.
Freedman, J. L., Birsky, J., & Cavoukian, A. (1980). Environmental determinants of behavioral
contagion: Density and number. Basic and Applied Social Psychology, 1(2), 155–161.
Friedmann, J. L. (2017). Music to climb by: Rising chromaticism in Max Steiner’s score for King Kong.
Journal of Film Music, 10(2), 153–161.
Fries, P. (2005). A mechanism for cognitive dynamics: Neuronal communication through neuronal
coherence. Trends in Cognitive Sciences, 9(10), 474–480.
Fujioka, T., Trainor, L. J., Large, E. W., & Ross, B. (2012). Internalized timing of isochronous sounds is
represented in neuromagnetic beta oscillations. Journal of Neuroscience, 32(5), 1791–1802.
Fuster, J. M., & Bressler, S. L. (2012). Cognit activation: A mechanism enabling temporal integration in
working memory. Trends in Cognitive Sciences, 16(4), 207–218.
Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the
performer’s intention and the listener’s experience. Psychology of Music, 24(1), 68–91.
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech reviewed.
Psychonomic Bulletin & Review, 13, 361–377.
Gallese, V. (2009). Mirror neurons, embodied simulation, and the neural basis of social identification.
Psychoanalytic Dialogues, 19(5), 519–536.
Gallese, V. (2010). Neuroscientific approach to intersubjectivity. In The embodied self: Dimensions,
coherence, and disorders (pp. 77–92). Schattauer.
Gallese, V. (2017). The empathic body in experimental aesthetics–embodied simulation and art. In
Empathy (pp. 181–199). Palgrave Macmillan.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor
cortex. Brain, 119(2), 593–609.
Garbarini, F., & Adenzato, M. (2004). At the root of embodied cognition: Cognitive science meets
neurophysiology. Brain and Cognition, 56(1), 100–106.
Gernsbacher, M. A., Keysar, B., Robertson, R. R., & Werner, N. K. (2001). The role of suppression and
enhancement in understanding metaphors. Journal of Memory and Language, 45(3), 433–450.
Gibbs Jr, R. W. (2001). Evaluating contemporary models of figurative language understanding.
Metaphor and Symbol, 16(3–4), 317–333.
Gibson, J. J. (1966). The senses considered as perceptual systems. Houghton-Mifflin.
Gibson, J. J. (1979). The ecological approach to visual perception. Houghton-Mifflin.
Gjerdingen, R. O. (1986). The formation and deformation of classic/romantic phrase schemata: A
theoretical model and historical study. Music Theory Spectrum, 8, 25–43.
Gjerdingen, R. O. (1994). Apparent motion in music? Music Perception, 11(4), 335–370.
Gjerdingen, R. O., & Perrott, D. (2008). Scanning the dial: The rapid recognition of music genres.
Journal of New Music Research, 37(2), 93–100.
Godøy, R. I. (2003). Gestural imagery in the service of musical imagery. In Gesture-based
communication in human-computer interaction: 5th international gesture workshop, GW 2003, Genova, Italy
(pp. 55–62). Springer.
Grady, J. (1997). Foundations of meaning: Primary metaphors and primary scenes. University of
California: EScholarship.
Grahn, J. A., & Brett, M. (2007). Rhythm and beat perception in motor areas of the brain. Journal of
Cognitive Neuroscience, 19(5), 893–906.
Grahn, J. A., & Rowe, J. B. (2009). Feeling the beat: Premotor and striatal interactions in musicians and
nonmusicians during beat perception. Journal of Neuroscience, 29(23), 7540–7548.
Grandjean, D., Sander, D., & Scherer, K. R. (2008). Conscious emotional experience emerges as a
function of multilevel, appraisal-driven response synchronization. Consciousness and
Cognition, 17(2), 484–495.
Granot, R. Y., & Eitan, Z. (2011). Musical tension and the interaction of dynamic auditory
parameters. Music Perception, 28(3), 219–246.
Gregory, R. L. (1980). Perceptions as hypotheses. Philosophical Transactions of the Royal Society B:
Biological Sciences, 290(1038), 181–197.
Gridley, M. C., & Hoff, R. (2006). Do mirror neurons explain misattribution of emotions in music?
Perceptual and Motor Skills, 102(2), 600–602.
Grodal, T. (2009). Embodied visions: Evolution, emotion, culture, and film. Oxford University Press.
Guenther, R. K. (1998). Human cognition. Prentice Hall.
Guenther, R. K. (2002). Memory. In Foundations of cognitive psychology: Core readings. MIT Press.
Hacohen, R., & Wagner, N. (1997). The communicative force of Wagner’s Leitmotifs: Complementary
relationships between their connotations and denotations. Music Perception, 14(4), 445–475.
Halfyard, J. K. (2004). Danny Elfman’s Batman: A film score guide. Scarecrow Press.
Halfyard, J. K. (2013). Cue the big theme? The sound of the superhero. In The Oxford handbook of new
audiovisual aesthetics (pp. 171–193). Oxford University Press.
Hammerschmidt, K., & Jürgens, U. (2007). Acoustical correlates of affective prosody. Journal of
Voice, 21(5), 531–540.
Harris, R., & De Jong, B. M. (2014). Cerebral activations related to audition-driven performance
imagery in professional musicians. PLoS One, 9(4), e93681.
Hatten, R. S. (1994). Musical meaning in Beethoven: Markedness, correlation, and interpretation.
Indiana University Press.
Hatten, R. S. (2004). Interpreting musical gestures, topics, and tropes: Mozart, Beethoven, Schubert.
Indiana University Press.
Hauk, O., Johnsrude, I., & Pulvermüller, F. (2004). Somatotopic representation of action words in
human motor and premotor cortex. Neuron, 41(2), 301–307.
Hecht, H., Kavelaars, J., Cheung, C. C., & Young, L. R. (2001). Orientation illusions and heart-rate
changes during short-radius centrifugation. Journal of Vestibular Research, 11(2), 115–127.
Heidemann, K. (2016). A system for describing vocal timbre in popular song. Music Theory
Online, 22(1).
Heine, E. (2018). Chromatic mediants and narrative context in film. Music Analysis, 37(1), 103–132.
Heinichen, J. D. (1711). Neu erfundene und gründliche Anweisung wie ein Musik-liebender auf
gewisse vortheilhafte Art könne zu vollkommener Erlernung des General-Basses. Schiller.
Heldt, G. (2013). Music and levels of narration in film. Intellect.
Herget, A. K. (2021). On music’s potential to convey meaning in film: A systematic review of empirical
evidence. Psychology of Music, 49(1), 21–49.
Hodhod, R. A., & Magerko, B. S. (2014). Computational creativity: Improv agents and conceptual
blending. International Journal of Cognitive Informatics & Natural Intelligence, 8(2), 1–14.
Hoeckner, B., Wyatt, E. W., Decety, J., & Nusbaum, H. (2011). Film music influences how viewers
relate to movie characters. Psychology of Aesthetics, Creativity, and the Arts, 5(2), 146.
Holmes, D. S. (1972). Repression or interference? A further investigation. Journal of Personality and
Social Psychology, 22(2), 163.
Hsu, H. C., & Su, L. I. (2014). Love in disguise: Incongruity between text and music in song. Journal
of Pragmatics, 62, 136–150.
Hulme, C., Maughan, S., & Brown, G. D. (1991). Memory for familiar and unfamiliar words: Evidence
for a long-term memory contribution to short-term memory span. Journal of Memory and
Language, 30(6), 685–701.
Hunt, R. R. (2013). Precision in memory through distinctive processing. Current Directions in
Psychological Science, 22(1), 10–15.
Hurley, S. (2008). The shared circuits model (SCM): How control, mirroring, and simulation can
enable imitation, deliberation, and mindreading. Behavioral and Brain Sciences, 31(1), 1–22.
Huron, D., & Berec, J. (2009). Characterizing idiomatic organization in music: A theory and case study
of musical affordances. Empirical Musicology Review, 4(3), 103–122.
Huron, D., Kinney, D., & Precoda, K. (2006). Influence of pitch height on the perception of
submissiveness and threat in musical passages. Empirical Musicology Review, 1(3), 170–177.
Huron, D., & Margulis, E. H. (2010). Musical expectancy and thrills. In Handbook of music and
emotion: Theory, research, applications. Oxford University Press.
Huron, D. B. (2006). Sweet anticipation: Music and the psychology of expectation. MIT Press.
Husain, G., Thompson, W. F., & Schellenberg, E. G. (2002). Effects of musical tempo and mode on
arousal, mood, and spatial abilities. Music Perception, 20(2), 151–171.
Hutcheon, L. (1985). A theory of parody: The teachings of twentieth-century art forms. Methuen.
Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C., & Rizzolatti, G. (2005).
Grasping the intentions of others with one’s own mirror neuron system. PLoS Biology, 3(3), e79.
Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G. (1999). Cortical
mechanisms of human imitation. Science, 286(5449), 2526–2528.
Ibarretxe-Antuñano, I. (2013). The relationship between conceptual metaphor and culture.
Intercultural Pragmatics, 10(2), 315–339.
Jabbi, M., & Keysers, C. (2008). Inferior frontal gyrus activity triggers anterior insula response to
emotional facial expressions. Emotion, 8(6), 775.
Jacoby, L. L. (1978). On interpreting the effects of repetition: Solving a problem versus remembering a
solution. Journal of Verbal Learning and Verbal Behavior, 17(6), 649–667.
James, W. (1884). What is an emotion? Mind, 9(34).
James, W. (1890). Principles of psychology (Vol. 2). Holt.
Janata, P., & Grafton, S. T. (2003). Swinging in the brain: Shared neural substrates for behaviors related
to sequencing and music. Nature Neuroscience, 6(7), 682–687.
Johnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination, and reason.
University of Chicago Press.
Johnson, M. (2007). The meaning of the body. In W. Overton, U. Mueller, & J. Newman (Eds.),
Developmental perspectives on embodiment and consciousness. Psychology Press.
Johnson, M., & Lakoff, G. (2002). Why cognitive linguistics requires embodied realism. Cognitive
Linguistics, 13(3), 245–264.
Johnson, M., & Larson, S. (2003). ‘Something in the way she moves’—metaphors of musical motion.
Metaphor and Symbol, 18, 63–84.
Jones, M. R. (1993). Dynamics of musical patterns: How do melody and rhythm fit together. In
Psychology and music: The understanding of melody and rhythm (pp. 67–92). Erlbaum Associates.
Jones, M. R., & Boltz, M. (1989). Dynamic attending and responses to time. Psychological Review, 96,
459–491.
Juslin, P. N. (2013). From everyday emotions to aesthetic emotions: Towards a unified theory of
musical emotions. Physics of Life Reviews, 10(3), 235–266.
Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music
performance: Different channels, same code? Psychological Bulletin, 129(5), 770.
Juslin, P. N., Liljeström, S., Västfjäll, D., & Lundqvist, L. O. (2010). How does music evoke emotions?
Exploring the underlying mechanisms. In P. N. Juslin & J. A. Sloboda (Eds.), Handbook of music and
emotion: Theory, research, applications (pp. 605–642). Oxford University Press.
Juslin, P. N., & Västfjäll, D. (2008). Emotional responses to music: The need to consider underlying
mechanisms. Behavioral and Brain Sciences, 31(5), 559–575.
Kendall, G. S. (2010). Meaning in electroacoustic music and the everyday mind. Organised Sound,
15(1), 63–74.
Keyfitz, N. (1996). Review of Keeping together in time: Dance and drill in human history by McNeill,
W. H. Contemporary Sociology, 25(3), 408–409.
Keysers, C. (2011). The empathic brain: How the discovery of mirror neurons changes our
understanding of human nature. Lulu.com.
Khalfa, S., Roy, M., Rainville, P., Dalla Bella, S., & Peretz, I. (2008). Role of tempo entrainment in
psychophysiological differentiation of happy and sad music? International Journal of
Psychophysiology, 68(1), 17–26.
Kinsbourne, M. (2005). Imitation as entrainment: Brain mechanisms and social consequences.
Perspectives on Imitation: From Neuroscience to Social Science, 2, 163–172.
Kintsch, W., Patel, V. L., & Ericsson, K. A. (1999). The role of long-term working memory in text
comprehension. Psychologia, 42(4), 186–198.
Koelsch, S., Fritz, T., Cramon, D. Y. V., Müller, K., & Friederici, A. D. (2006). Investigating emotion with
music: An fMRI study. Human Brain Mapping, 27(3), 239–250.
Koestler, A. (1964). The act of creation. Macmillan.
Kohler, E., Keysers, C., Umiltà, M. A., Fogassi, L., Gallese, V., & Rizzolatti, G. (2002). Hearing sounds,
understanding actions: Action representation in mirror neurons. Science, 297(5582), 846–848.
Köhler, W. (1929). Gestalt psychology. H. Liveright.
Kostka, V. (2016). Linda Hutcheon’s theory of parody and its application to postmodern music. Avant:
Pismo Awangardy Filozoficzno-Naukowej, (1), 67–73.
Kövecses, Z. (2002). Metaphor: A practical introduction. Oxford University Press.
Krueger, J. W. (2011). Doing things with music. Phenomenology and the Cognitive Sciences, 10(1),
1–22.
Krumhansl, C. L. (1991). Music psychology: Tonal structures in perception and memory. Annual
Review of Psychology, 42(1), 277–303.
Krumhansl, C. L. (1998). Topic in music: An empirical study of memorability, openness, and emotion
in Mozart’s String Quintet in C Major and Beethoven’s String Quartet in A Minor. Music
Perception, 16(1), 119–134.
Kruschke, J. K. (2008). Models of categorization. In The Cambridge handbook of computational
psychology (pp. 267–301). Cambridge University Press.
Kühn, S., Werner, A., Lindenberger, U., & Verrel, J. (2014). Acute immobilisation facilitates premotor
preparatory activity for the non-restrained hand when facing grasp affordances. NeuroImage, 92,
69–73.
Küssner, M. B., Tidhar, D., Prior, H. M., & Leech-Wilkinson, D. (2014). Musicians are more consistent:
Gestural cross-modal mappings of pitch, loudness and tempo in real-time. Frontiers in
Psychology, 5, 789.
Kvavilashvili, L., & Mandler, G. (2004). Out of one’s mind: A study of involuntary semantic memories.
Cognitive Psychology, 48(1), 47–94.
Laird, J. D. (2007). Feelings: The perception of self. Oxford University Press.
Lakin, J. L., Jefferis, V. E., Cheng, C. M., & Chartrand, T. L. (2003). The chameleon effect as social glue:
Evidence for the evolutionary significance of nonconscious mimicry. Journal of Nonverbal
Behavior, 27(3), 145–162.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind.
University of Chicago Press.
Lakoff, G. (2014). Mapping the brain’s metaphor circuitry: Metaphorical thought in everyday reason.
Frontiers in Human Neuroscience, 8, 958.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. University of Chicago Press.
Lakoff, G., & Sweetser, E. (1994). Foreword to Gilles Fauconnier, mental spaces. In Mental spaces
(pp. 1–7). Cambridge University Press.
Langacker, R. W. (1987). Foundations of cognitive grammar: Theoretical prerequisites (Vol. 1). Stanford
University Press.
Large, E. W., & Palmer, C. (2002). Perceiving temporal regularity in music. Cognitive Science, 26(1),
1–37.
Large, E. W., & Snyder, J. S. (2009). Pulse and meter as neural resonance. Annals of the New York
Academy of Sciences, 1169(1), 46–57.
Larson, S. (2012). Musical forces: Motion, metaphor, and meaning in music. Indiana University Press.
Lastinger, D. L. (2011). The effect of background music on the perception of personality and demo-
graphics. Journal of Music Therapy, 48(2), 208–225.
Lee, Y. T., & Tsai, S. J. (2010). The mirror neuron system may play a role in the pathogenesis of mass
hysteria. Medical Hypotheses, 74(2), 244–245.
Lehman, F. (2013). Hollywood cadences: Music and the structure of cinematic expectation. Music
Theory Online, 19(4).
Lehman, F. (2018). Hollywood harmony: Musical wonder and the sound of cinema. Oxford University
Press.
Leinberger, C. (2004). Ennio Morricone’s The Good, the Bad and the Ugly: A film score guide. Scarecrow
Press.
Leman, M., Moelants, D., Varewyck, M., Styns, F., van Noorden, L., & Martens, J. P. (2013).
Activating and relaxing music entrains the speed of beat synchronized walking. PLoS One, 8(7), e67932.
Levenson, R. W., Ekman, P., & Friesen, W. V. (1990). Voluntary facial action generates emotion-specific
autonomic nervous system activity. Psychophysiology, 27(4), 363–384.
Lévêque, Y., Muggleton, N., Stewart, L., & Schön, D. (2013). Involvement of the larynx motor area in
singing-voice perception: A TMS study. Frontiers in Psychology, 4, 418.
Lévêque, Y., & Schön, D. (2015). Modulation of the motor cortex during singing-voice perception.
Neuropsychologia, 70, 58–63.
Levitin, D. J. (2006). This is your brain on music: The science of a human obsession. Penguin.
Lidov, D. (2005). Is language a music?: Writings on musical form and signification. Indiana University
Press.
Lima, C. F., Krishnan, S., & Scott, S. K. (2016). Roles of supplementary motor areas in auditory
processing and auditory imagery. Trends in Neurosciences, 39(8), 527–542.
Lipps, T. (1903). Ästhetik. Leopold Voss Verlag.
Lipscomb, S. D. (1995). Cognition of musical and visual accent structure alignment in film and
animation [Unpublished PhD dissertation, University of California, Los Angeles].
Lipscomb, S. D., & Kendall, R. A. (1994). Perceptual judgement of the relationship between musical
and visual components in film. Psychomusicology: A Journal of Research in Music Cognition,
13(1–2), 60.
London, J. (2012). Hearing in time: Psychological aspects of musical meter. Oxford University Press.
Lorenzi-Filho, G., Dajani, H. R., Leung, R. S., Floras, J. S., & Bradley, T. D. (1999). Entrainment of
blood pressure and heart rate oscillations by periodic breathing. American Journal of Respiratory
and Critical Care Medicine, 159(4), 1147–1154.
Love, S. C. (2017). An ecological description of jazz improvisation. Psychomusicology: Music, Mind,
and Brain, 27(1), 31.
Macken, W. J., Tremblay, S., Houghton, R. J., Nicholls, A. P., & Jones, D. M. (2003). Does auditory
streaming require attention? Evidence from attentional selectivity in short-term memory. Journal of
Experimental Psychology: Human Perception and Performance, 29(1), 43.
Makris, S., Hadar, A. A., & Yarrow, K. (2013). Are object affordances fully automatic? A case of covert
attention. Behavioral Neuroscience, 127(5), 797.
Marshall, S. K., & Cohen, A. J. (1988). Effects of musical soundtracks on attitudes toward animated
geometric figures. Music Perception, 6(1), 95–112.
Masataka, N. (2006). Preference for consonance over dissonance by hearing newborns of deaf parents
and of hearing parents. Developmental Science, 9(1), 46–50.
Mason, F. (2011). Hollywood’s detectives: Crime series in the 1930s and 1940s from the Whodunnit
to hard-boiled Noir. Springer.
Mathy, F., & Feldman, J. (2012). What’s magic about magic numbers? Chunking and data compression
in short-term memory. Cognition, 122(3), 346–362.
Mattheson, J. (1739). Der vollkommene Capellmeister. Herold.
Mayville, J. M., Jantzen, K. J., Fuchs, A., Steinberg, F. L., & Kelso, J. S. (2002). Cortical and
subcortical networks underlying syncopated and synchronized coordination revealed using fMRI. Human
Brain Mapping, 17(4), 214–229.
McAngus Todd, N. P., O’Boyle, D. J., & Lee, C. S. (1999). A sensory-motor theory of rhythm, time
perception and beat induction. Journal of New Music Research, 28(1), 5–28.
McClelland, J. L. (2000). Connectionist models of memory. In The Oxford handbook of memory
(pp. 583–596). Oxford University Press.
McGettigan, C., Walsh, E., Jessop, R., Agnew, Z. K., Sauter, D. A., Warren, J. E., & Scott, S. K. (2015).
Individual differences in laughter perception reveal roles for mentalizing and sensorimotor systems
in the evaluation of emotional authenticity. Cerebral Cortex, 25(1), 246–257.
McLeod, K. (2006). “A fifth of Beethoven”: Disco, classical music, and the politics of inclusion. American
Music, 347–363.
McNeill, W. H. (1997). Keeping together in time: Dance and drill in human history. Harvard University
Press.
Medin, D. L., & Coley, J. D. (1998). Concepts and categorization. In Perception and cognition at century’s
end: Handbook of perception and cognition (pp. 403–439). Elsevier Science.
Menon, V., & Levitin, D. J. (2005). The rewards of music listening: Response and physiological connectivity
of the mesolimbic system. Neuroimage, 28(1), 175–184.
Merker, B. H., Madison, G. S., & Eckerdal, P. (2009). On the role and origin of isochrony in human
rhythmic entrainment. Cortex, 45(1), 4–17.
Merleau-Ponty, M., & Smith, C. (1962). Phenomenology of perception (Vol. 26). Routledge.
Meyer-Kalkus, R. (2007). Work, rhythm, dance. Embodiment in Cognition and Culture, 71, 165–181.
Milgram, S., Bickman, L., & Berkowitz, L. (1969). Note on the drawing power of crowds of different
size. Journal of Personality and Social Psychology, 13(2), 79.
Miller, C. T., & Cohen, Y. E. (2010). Vocalizations as auditory objects: Behavior and neurophysiology.
In Primate neuroethology. Oxford University Press.
Millet, B., Chattah, J., & Ahn, S. (2021). Soundtrack design: The impact of music on visual attention
and affective responses. Applied Ergonomics, 93(103301), 1–9.
Molenberghs, P., Cunnington, R., & Mattingley, J. B. (2012). Brain regions with mirror properties:
A meta-analysis of 125 human fMRI studies. Neuroscience & Biobehavioral Reviews, 36(1),
341–349.
Molnar-Szakacs, I., & Overy, K. (2006). Music and mirror neurons: From motion to ‘e’motion. Social
Cognitive and Affective Neuroscience, 1(3), 235–241.
Morris, P. (1988). Memory research: Past mistakes and future prospects. In Growth points in cognition
(pp. 91–110). Routledge.
Motazedian, T. (2016). To key or not to key: Tonal design in film music [Doctoral dissertation, Yale
University].
Mukamel, R., Ekstrom, A. D., Kaplan, J., Iacoboni, M., & Fried, I. (2010). Single-neuron responses in
humans during execution and observation of actions. Current Biology, 20(8), 750–756.
Murphy, S. (2006). The major tritone progression in recent Hollywood science fiction films. Music
Theory Online, 12(2).
Murphy, S. (2022). Three audiovisual correspondences in the main title for Vertigo. Music Theory
Online, 28(1).
Murphy, S. T., & Zajonc, R. B. (1993). Affect, cognition, and awareness: Affective priming with optimal
and suboptimal stimulus exposures. Journal of Personality and Social Psychology, 64(5), 723.
Nairne, J. S. (2002). The myth of the encoding-retrieval match. Memory, 10(5–6), 389–395.
Neumann, R., & Strack, F. (2000). “Mood contagion”: The automatic transfer of mood between persons.
Journal of Personality and Social Psychology, 79(2), 211.
Norman, J. (2002). Two visual systems and two theories of perception: An attempt to reconcile the
constructivist and ecological approaches. Behavioral and Brain Sciences, 25(1), 73–96.
Nosal, A. P., Keenan, E. A., Hastings, P. A., & Gneezy, A. (2016). The effect of background music in
shark documentaries on viewers’ perceptions of sharks. PLoS One, 11(8), e0159279.
Nozaradan, S., Peretz, I., Missal, M., & Mouraux, A. (2011). Tagging the neuronal entrainment to beat
and meter. Journal of Neuroscience, 31(28), 10234–10240.
Oden, C. (2023). “We are dancing, we are flying”: The feeling of flight in dance scenes from recent
popular film. Society for Music Theory: Videocast Journal (SMT-V), 9(1).
Oman, C. M. (1998). Sensory conflict theory and space sickness: Our changing perspective. Journal of
Vestibular Research, 8(1), 51–56.
Oman, C. M. (2003). Neurovestibular adaptation to spaceflight: Research progress. Journal of Vestibular
Research, 12(5–6), 201–203.
Ortony, A. (1993). Metaphor and thought (2nd ed.). Cambridge University Press.
Osiurak, F., Rossetti, Y., & Badets, A. (2017). What is an affordance? 40 years later. Neuroscience &
Biobehavioral Reviews, 77, 403–417.
Overy, K., & Molnar-Szakacs, I. (2009). Being together in time: Musical experience and the mirror
neuron system. Music Perception, 26(5), 489–504.
Owren, M. J., & Bachorowski, J. A. (2007). Measuring emotion-related vocal acoustics. In Handbook
of emotion elicitation and assessment (pp. 239–266). Oxford University Press.
Ozturk, O., Krehm, M., & Vouloumanos, A. (2013). Sound symbolism in infancy: Evidence for sound-shape
cross-modal correspondences in 4-month-olds. Journal of Experimental Child Psychology,
114(2), 173–186.
Patel, A. D., Iversen, J. R., Chen, Y., & Repp, B. H. (2005). The influence of metricality and modality on
synchronization with a beat. Experimental Brain Research, 163(2), 226–238.
Phillips, W. A., & Singer, W. (1997). In search of common foundations for cortical computation. The
Behavioral and Brain Sciences, 20(4), 657–683.
Phillips-Silver, J., Aktipis, C. A., & Bryant, G. (2010). The ecology of entrainment: Foundations of coordinated
rhythmic movement. Music Perception, 28(1), 3–14.
Pinna, S. (2017). Cognition as organism-environment interaction. In Extended cognition and the
dynamics of algorithmic skills (pp. 19–37). Springer.
Plantinga, J., & Trehub, S. E. (2014). Revisiting the innate preference for consonance. Journal of Experimental
Psychology: Human Perception and Performance, 40(1), 40–49.
Platel, H., Price, C., Baron, J. C., Wise, R., Lambert, J., Frackowiak, R. S., Lechevalier, B., & Eustache,
F. (1997). The structural components of music perception. A functional anatomical study. Brain:
A Journal of Neurology, 120(2), 229–243.
Ploog, D. W. (1992). The evolution of vocal communication. In H. Papoušek, U. Jürgens, & M. Papoušek
(Eds.), Nonverbal vocal communication: Comparative and developmental approaches (pp. 6–30).
Cambridge University Press; Editions de la Maison des Sciences de l’Homme.
Proverbio, A. M., Azzari, R., & Adorni, R. (2013). Is there a left hemispheric asymmetry for tool affordance
processing? Neuropsychologia, 51(13), 2690–2701.
Ramachandran, V. S., & Hubbard, E. M. (2001). Synaesthesia—a window into perception, thought and
language. Journal of Consciousness Studies, 8(12), 3–34.
Rapée, E. (1925). Encyclopaedia of music for pictures. Belwin.
Ratner, L. G. (1980). Classic music: Expression, form, and style. Schirmer Books; Collier Macmillan
Publishers.
Reason, J. T., & Brand, J. J. (1975). Motion sickness. Academic Press.
Repp, B. H., & Penel, A. (2004). Rhythmic movement is attracted more strongly to auditory than to
visual rhythms. Psychological Research, 68(4), 252–270.
Reybrouck, M. (2005). A biosemiotic and ecological approach to music cognition: Event perception
between auditory listening and cognitive economy. Axiomathes, 15(2), 229–266.
Reybrouck, M. (2010). Music cognition and real-time listening: Denotation, cue abstraction, route
description and cognitive maps. Musicae Scientiae, 14(2_suppl), 187–202.
Reybrouck, M. (2012). Musical sense-making and the concept of affordance: An ecosemiotic and
experiential approach. Biosemiotics, 5(3), 391–409.
Ritchie, L. D. (2004). Lost in “conceptual space”: Metaphors of conceptual integration. Metaphor and
Symbol, 19(1), 31–50.
Rizzolatti, G., & Fadiga, L. (1998). Grasping objects and grasping action meanings: The dual role of
monkey rostroventral premotor cortex (area F5). Sensory Guidance of Movement, 218, 81–103.
Rizzolatti, G., Fadiga, L., Gallese, V., & Fogassi, L. (1996). Premotor cortex and the recognition of
motor actions. Cognitive Brain Research, 3(2), 131–141.
Rochat, M. J., Caruana, F., Jezzini, A., Escola, L., Intskirveli, I., Grammont, F., Gallese, V., Rizzolatti,
G., & Umiltà, M. A. (2010). Responses of mirror neurons in area F5 to hand and tool grasping
observation. Experimental Brain Research, 204(4), 605–616.
Rosch, E. (1978). Principles of categorization. In Cognition and categorization (pp. 27–48). Erlbaum Associates.
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories.
Cognitive Psychology, 7(4), 573–605.
Rötter, G. (1994). The perception of form and its relation to emotional listening and physiological data.
In I. Deliège (Ed.), Proceedings of the 3rd international conference on music perception and cognition
(pp. 277–278). European Society for the Cognitive Sciences of Music.
Rusconi, E., Kwan, B., Giordano, B. L., Umiltà, C., & Butterworth, B. (2006). Spatial representation of
pitch height: The SMARC effect. Cognition, 99(2), 113–129.
Ryan, J. D., Althoff, R. R., Whitlow, S., & Cohen, N. J. (2000). Amnesia is a deficit in relational memory.
Psychological Science, 11(6), 454–461.
Sachs, C., & Kunst, J. (1962). The wellsprings of music. M. Nijhoff.
Sakreida, K., Effnert, I., Thill, S., Menz, M. M., Jirak, D., Eickhoff, C. R., Ziemke, T., Eickhoff, S. B.,
Borghi, A. M., & Binkofski, F. (2016). Affordance processing in segregated parieto-frontal dorsal
stream sub-pathways. Neuroscience & Biobehavioral Reviews, 69, 89–112.
Saslaw, J. (1996). Forces, containers, and paths: The role of body-derived image schemas in the conceptualization
of music. Journal of Music Theory, 40(2), 217–243.
Sato, T. G., Ohsuga, M., & Moriya, T. (2012). Increase in the timing coincidence of a respiration event
induced by listening repeatedly to the same music track. Acoustical Science and Technology, 33(4),
255–261.
Sayrs, E. (2003). Narrative, metaphor, and conceptual blending in ‘the hanging tree.’ Music Theory
Online, 9(1).
Schacter, D. L. (1987). Implicit memory: History and current status. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 13(3), 501.
Schacter, D. L., & Tulving, E. (1994). What are the memory systems of 1994? In D. L. Schacter & E.
Tulving (Eds.), Memory systems 1994 (pp. 1–38). The MIT Press.
Schank, R. C., & Abelson, R. P. (1975). Scripts, plans, and knowledge. IJCAI, 75, 151–157.
Schelle, M. (1999). The score: Interviews with film composers. Silman-James Press.
Scherer, K. R., Clark-Polner, E., & Mortillaro, M. (2011). In the eye of the beholder? Universality and
cultural specificity in the expression and perception of emotion. International Journal of Psychology,
46(6), 401–435.
Scherer, K. R., & Coutinho, E. (2013). How music creates emotion: A multifactorial process approach.
In The emotional power of music: Multidisciplinary perspectives on musical arousal, expression,
and social control (pp. 121–145). Oxford University Press.
Scherer, K. R., Trznadel, S., Fantini, B., & Sundberg, J. (2017). Recognizing emotions in the singing
voice. Psychomusicology: Music, Mind, and Brain, 27(4), 244.
Schiavetto, A., Cortese, F., & Alain, C. (1999). Global and local processing of musical sequences: An
event-related brain potential study. Neuroreport, 10(12), 2467–2472.
Schmidt, L. (2010). A popular avant-garde: The paradoxical tradition of electronic and atonal sounds in
sci-fi music scoring. In Sounds of the future: Essays on music in science fiction films. McFarland &
Co.
Shepard, R. N. (2009). One cognitive psychologist’s quest for the structural grounds of music cognition.
Psychomusicology: Music, Mind, and Brain, 20(1–2), 130.
Shevy, M. (2008). Music genre as cognitive schema: Extramusical associations with country and hip-hop
music. Psychology of Music, 36(4), 477–498.
Singer, T., Seymour, B., O’Doherty, J., Kaube, H., Dolan, R. J., & Frith, C. D. (2004). Empathy for pain
involves the affective but not sensory components of pain. Science, 303(5661), 1157–1162.
Sirius, G., & Clarke, E. F. (1994). The perception of audiovisual relationships: A preliminary study.
Psychomusicology, 13(1–2), 119.
Smith, J. D., Zakrzewski, A. C., Johnson, J. M., Valleau, J. C., & Church, B. A. (2016). Categorization:
The view from animal cognition. Behavioral Sciences, 6(2), 12.
Spiers, H. J., Maguire, E. A., & Burgess, N. (2001). Hippocampal amnesia. Neurocase, 7(5), 357–382.
Spitzer, M. (2004). Metaphor and musical thought. University of Chicago Press.
Spitzer, M. (2018). Conceptual blending and musical emotion. Musicae Scientiae, 22(1), 24–37.
Strack, F., Martin, L. L., & Stepper, S. (1988). Inhibiting and facilitating conditions of the human smile:
A nonobtrusive test of the facial feedback hypothesis. Journal of Personality and Social Psychology,
54(5), 768.
Sundberg, J., Patel, S., Bjorkner, E., & Scherer, K. R. (2011). Interdependencies among voice source
parameters in emotional speech. IEEE Transactions on Affective Computing, 2(3), 162–174.
Sweeny, T., Guzman-Martinez, E., Ortega, L., Grabowecky, M., & Suzuki, S. (2012). Sounds exaggerate
visual shape. Cognition, 124(2), 194–200.
Tagg, P. (1989). An anthropology of stereotypes in TV music? Swedish Musicological Journal, 71, 19–42.
Tagg, P. (1999). Introductory notes to the semiotics of music, version 3. Retrieved December 10, 2020,
from www.tagg.org/xpdfs/semiotug.pdf
Tagg, P. (2012). Music’s meanings: A modern musicology for non-musos. The Mass Media Music Scholars’
Press.
Tagg, P., & Clarida, B. (2003). Ten little title tunes. The Mass Media Music Scholars’ Press.
Talmy, L. (1983). How language structures space. In H. L. Pick, Jr. & L. P. Acredolo (Eds.), Spatial orientation:
Theory, research, and application. Plenum Press.
Tan, S. L., Baxa, J., & Spackman, M. P. (2010). Effects of built-in audio versus unrelated background
music on performance in an adventure role-playing game. International Journal of Gaming and
Computer-Mediated Simulations (IJGCMS), 2(3), 1–23.
Tan, S. L., Spackman, M. P., & Wakefield, E. M. (2008). Effects of diegetic and non-diegetic presentation
of film music on viewers’ interpretation of film narrative. In Conference proceedings for the
2008 international conference of music perception and cognition (pp. 588–593). Department of
Psychology, Hokkaido University.
Thagard, P., & Aubie, B. (2008). Emotional consciousness: A neural model of how cognitive appraisal
and somatic perception interact to produce qualitative experience. Consciousness and Cognition,
17(3), 811–834.
Thagard, P., & Stewart, T. C. (2011). The AHA! experience: Creativity through emergent binding in
neural networks. Cognitive Science, 35(1), 1–33.
Thompson, W. F., Russo, F. A., & Sinclair, D. (1994). Effects of underscoring on the perception of
closure in filmed events. Psychomusicology: A Journal of Research in Music Cognition, 13(1–2), 9.
Tischler, M., & Morey-Holton, E. (1992). Space research on organs and tissues. In Space programs and
technologies conference (p. 1345). American Institute of Aeronautics and Astronautics.
Töpper, J., & Schwan, S. (2008). James Bond in angst?: Inferences about protagonists’ emotional states
in films. Journal of Media Psychology: Theories, Methods, and Applications, 20(4), 131.
Trainor, L. J., Tsang, C. D., & Cheung, V. H. (2002). Preference for sensory consonance in 2- and
4-month-old infants. Music Perception, 20(2), 187–194.
Trost, W., & Vuilleumier, P. (2013). Rhythmic entrainment as a mechanism for emotion induction by
music: A neurophysiological perspective. In The emotional power of music: Multidisciplinary perspectives
on musical arousal, expression, and social control (pp. 213–225). Oxford University Press.
Tsougras, C., & Stefanou, D. (2018). Embedded blends and meaning construction in Modest Mussorgsky’s
Pictures at an Exhibition. Musicae Scientiae, 22(1), 38–56.
Tucker, M., & Ellis, R. (1998). On the relations between seen objects and components of potential
actions. Journal of Experimental Psychology: Human Perception and Performance, 24(3), 830–846.
Tulving, E. (1983). Elements of episodic memory. Oxford University Press.
Vandeloise, C. (2007). A taxonomy of basic natural entities. In M. Aurnague, M. Hickmann, & L.
Vieu (Eds.), Categorization of spatial entities in language and cognition. John Benjamins Publishing
Company.
Velasco, C., Woods, A., Petit, O., Cheok, A., & Spence, C. (2016). Crossmodal correspondences
between taste and shape, and their implications for product packaging: A review. Food Quality and
Preference, 52, 17–26.
van der Velde, F., Forth, J., Nazareth, D. S., & Wiggins, G. A. (2017). Linking neural and symbolic representation
and processing of conceptual structures. Frontiers in Psychology, 8, 1297.
Wagner, A. D., Stebbins, G. T., Masciari, F., Fleischman, D. A., & Gabrieli, J. D. (1998). Neuropsychological
dissociation between recognition familiarity and perceptual priming in visual long-term
memory. Cortex, 34(4), 493–511.
Wagner, S., Winner, E., Cicchetti, D., & Gardner, H. (1981). “Metaphorical” mapping in human
infants. Child Development, 728–731.
Walker, M. (1992). Film Noir: Introduction. In The movie book of film Noir (pp. 8–38). Studio Vista.
Ward, J. (2015). The student’s guide to cognitive neuroscience. Psychology Press.
Warren, J. E., Sauter, D. A., Eisner, F., Wiland, J., Dresner, M. A., Wise, R. J., Rosen, S., & Scott, S.
K. (2006). Positive emotions preferentially engage an auditory-motor “mirror” system. Journal of
Neuroscience, 26(50), 13067–13075.
Warren, W. H. (1984). Perceiving affordances: Visual guidance of stair climbing. Journal of Experimental
Psychology: Human Perception and Performance, 10(5), 683–703.
Waters, R. H., & Leeper, R. (1936). The relation of affective tone to the retention of experiences of daily
life. Journal of Experimental Psychology, 19(2), 203.
Watts, C. R., & Hall, M. D. (2008). Timbral influences on vocal pitch-matching accuracy. Logopedics
Phoniatrics Vocology, 33(2), 74–82.
Welch, R. (1999). Meaning, attention, and the “unity assumption” in the intersensory bias of spatial
and temporal perceptions. Advances in Psychology, 129, 371–387.
Wheeler, L. (1966). Toward a theory of behavioral contagion. Psychological Review, 73(2), 179.
Wicker, B., Keysers, C., Plailly, J., Royet, J., Gallese, V., & Rizzolatti, G. (2003). Both of us disgusted
in my insula: The common neural basis of seeing and feeling disgust. Neuron (Cambridge, Mass.),
40(3), 655–664.
Widmann, A., Kujala, T., Tervaniemi, M., Kujala, A., & Schröger, E. (2004). From symbols to sounds:
Visual symbolic information activates sound representations. Psychophysiology, 41(5), 709–715.
Wiggins, G. A. (2010). Cue abstraction, paradigmatic analysis and information dynamics: Towards
music analysis by cognitive model. Musicae Scientiae, 14(2_suppl), 307–331.
Will, U., & Berg, E. (2007). Brain wave synchronization and entrainment to periodic acoustic stimuli.
Neuroscience Letters, 424(1), 55–60.
Wilson, S. M., Saygin, A. P., Sereno, M. I., & Iacoboni, M. (2004). Listening to speech activates motor
areas involved in speech production. Nature Neuroscience, 7(7), 701–702.
Wingstedt, J., Brändström, S., & Berg, J. (2008). Young adolescents’ usage of narrative functions of
media music by manipulation of musical expression. Psychology of Music, 36(2), 193–214.
Winkler, I., & Cowan, N. (2005). From sensory to long-term memory: Evidence from auditory memory
reactivation studies. Experimental Psychology, 52(1), 3–20.
Winold, C. A. (1963). The effects of changes in harmonic tension upon listener response. Indiana
University.
Winters, B. (2007). Erich Wolfgang Korngold’s The Adventures of Robin Hood: A film score guide (Vol.
6). Scarecrow Press.
Wittgenstein, L. (1953). Philosophische Untersuchungen. Kegan Paul.
Wühr, P., & Müsseler, J. (2002). Blindness to response-compatible stimuli in the psychological refractory
period paradigm. Visual Cognition, 9(4–5), 421–457.
Young, M. D. (2013). Musical topics in the comic book superhero film genre [PhD dissertation, The
University of Texas at Austin].
Zanto, T. P., Snyder, J. S., & Large, E. W. (2006). Neural correlates of rhythmic expectancy. Advances
in Cognitive Psychology, 2(2), 221.
Zatorre, R. J., Chen, J. L., & Penhune, V. B. (2007). When the brain plays music: Auditory-motor interactions
in music perception and production. Nature Reviews Neuroscience, 8(7), 547–558.
Zatorre, R. J., Mondor, T. A., & Evans, A. C. (1999). Auditory attention to space and frequency activates
similar cerebral systems. Neuroimage, 10(5), 544–554.
Zbikowski, L. M. (1998). Metaphor and music theory: Reflections from cognitive science. Music Theory
Online, 4(1).
Zbikowski, L. M. (1999). The blossoms of ‘Trockne Blumen’: Music and text in the early nineteenth
century. Music Analysis, 18(3), 307–345.
Zbikowski, L. M. (2002). Conceptualizing music: Cognitive structure, theory, and analysis. Oxford
University Press.
Zbikowski, L. M. (2007). Aspects of meaning construction in music: Toward a cognitive grammar of
music. Almen Semiotik, 17, 43–72.
Zbikowski, L. M. (2009). Music, language, and multimodal metaphor. Multimodal Metaphor, 359–381.
Zentner, M. R., & Kagan, J. (1998). Infants’ perception of consonance and dissonance in music. Infant
Behavior and Development, 21(3), 483–492.
van der Zwaag, M. D., Westerink, J. H., & van den Broek, E. L. (2011). Emotional and psychophysiological
responses to tempo, mode, and percussiveness. Musicae Scientiae, 15(2), 250–269.
FILM, FILMMAKER, COMPOSER INDEX
The Iguanas: “Para Dónde Vas?” 118
Imani Coppola: “I’m a Tree” 129
Inception 31, 36n16
The Incredibles 41–42
Interview with the Vampire 39

Nat King Cole: “L-O-V-E” 117, 161
Newman, Randy 105
Newman, Thomas 62n4
Newton Howard, James 86–87
Nickelback: “Hero” 115n15

The Red Violin 146, 151–154, 163, 163n4
Repossessed 134, 135, 142
Return to Oz 89–90
Robinson, J. Peter 56–57

Sakamoto, Ryuichi 102
Saturday Night Fever 158–159
Saving Private Ryan 226–227
Schubert, Franz: “Ave Maria” 129; “Ständchen” 159–161; Unfinished Symphony 120
Scream 99n29
Seal: “Kiss from a Rose” 115n15
Shang-Chi and the Legend of the Ten Rings 114
Sharples, Winston 109
The Shawshank Redemption 53
Shire, David 57, 73, 79, 90
Short Circuit 46
The Simpsons 192n7; “Beware My Cheating Bart” 75; “Them, Robot” 75
Snow White 65
Soundless 44
Spellbound 74–75
Spider-Man 115n15
Stardust 82n8
Star Wars franchise 92, 97n7, 98n16, 107
Star Wars: Episode VI – Return of the Jedi 1
Star Wars: The Return of the Jedi 24–25
Star Wars: The Rise of Skywalker 17–18
The Stepford Wives (2004) 37, 44, 47, 48n11
Stevie Wonder: “Ebony and Ivory” 132
Stokowski, Leopold 164n9
Stothart, Herbert 105
Strauss, Johann: “Blue Danube” 79
Stravinsky, Igor: The Rite of Spring 97n10
Su, Cong 102
Superman (1941) 109
Superman (1978) 111–112, 115n11
Super Mario Bros. 115n15
Super Mario video game 115n15
Superstar 129, 131

The Taking of Pelham One Two Three 53–54, 68, 78–79
Tchaikovsky, Piotr I.: “Dance of the Sugar Plum Fairy” 103; “Swan Lake” 121
Ted 136, 137
Terminator 2: Judgment Day 125, 138–140, 143
The Terminator 91
The Three Musketeers (1948) 104–105
Timberg, Sammy 109
Titanic 23–24
Titus 30
Tom and Jerry Greatest Chases: “Yankee Doodle Mouse” 38
The Triplets of Belleville 220, 222n9
The Truman Show 23, 32, 35

Unfaithful 35n7
Up 227–228

Vanessa Carlton: “A Thousand Miles” 129
The Verdict 43
Vertigo 88
Vier Minuten 54–55

Wagner, Richard: Der Ring des Nibelungen 115n17
WALL-E 80, 84n28
White Chicks 129, 130
Who Framed Roger Rabbit? 107
Williams, John 92–94, 102, 111, 113
Wonder Woman franchise 113
Wonder Woman TV series (1970s) 110–111
SUBJECT INDEX
acoustic signatures 167–168, 170n3, 223–224; affective correlates of 167; definition 11;
    ecological/evolutionary function of 167, 170n4; ethological perspective on 167–168, 170n5;
    physiological correlates of 167; and timbre 14, 21n22, 21n24
acoustic signatures (film music) 12, 151; and consonance/dissonance (sensory) 14–15; and
    empathy 10; and subvocalization 11–15, 21n22; and timbre 12–15, 21n24, 148
adaptive cognition 82n9, 167, 200; and leitmotifs 96; and memory 191, 222n7
affordances 63n8, 81, 181–185; and bimodal neurons 183, 186n8; and canonical neurons 183,
    186nn7–8; and Cartesian dualism 185n4, 186n5; and cognitive off-loading 186n10;
    ecological/evolutionary perspectives on 81n1, 181; empirical exploration of 182–183; and the
    environment 81n1, 184; extended 184; and exteroception 185n4; and human behavior 182;
    micro-affordances 183; and mirror neurons 186n7; of musical instruments 184–185; neural
    correlates of 183, 186n5; and proprioception 185n4; role in categorization 220, 222n8; of
    surfaces 182; theory of 181
affordances (film music) 63n8, 64–84, 92, 149–150, 184–185, 228; in the absence of meter 69,
    71, 76–77; of alternating meters 68; of asymmetric meters 68, 82n8; of chromatic collection
    78–79; cultural entailments of 69, 71, 86; of dodecaphonic pitch collection 73–75; of duple
    meter 17–18, 32–33, 65–66, 97n7; of duple vs. triple meters 66, 69, 71, 97n7; and
    entrainment 124n5; and leitmotif 85, 87–89, 97n7; metrical 55, 65–71, 88–89, 124n5, 149;
    metrical and tonal combined 76–80; of octatonic pitch collection 74–76; of symmetrical pitch
    collections 73–75, 78, 83n19; taxonomy of 186n11; tonal 72–76, 88, 147–148, 150; of triple
    meter 66–67, 70–71, 77–80, 82n12, 97n7, 149; of whole-tone pitch collection 72–77
analogues see cross-modal correspondences
archetypes 3, 100–115, 150, 203–210, 228; in film music (see topics (film music)); literary
    203; in music (see topics (music))
associations (film music) see conceptual blending (film music)
attention 96, 171n13, 198; and cognitive load 201n11, 202n13; cross-modal 98n23; and empathy
    171n13; to film music 1, 146, 179n3; to music 19n10; rhythmic 81n6; and saliency 191,
    198–199, 201n7
Auditory Stream Analysis 194–199, 201nn1–2; and bottom-up/top-down processes 199; and
    cocktail party effect 198; and innate vs. learned schemas 199

cadences 29, 175, 223–226; Aeolian 111, 115n15; half 124n6; metrical 56–57; and musical
    syntax 63n11; and SOURCE-PATH-GOAL and CONTAINER 56–60, 124n6; and style 211n12; tonal
    56–62
categorization 217–221; and affordances 220, 222n8; and Aristotle’s essentialism 221–222n4;
    attributes in 219–220; basic level 220–221, 222n11; and CONTAINER 217;
    ecological/evolutionary perspectives on 219, 221n1, 222n5, 222nn7–8; exemplar model of
    218–219, 221n2, 222n7; and family resemblance 218; horizontal dimension of 221; neural
    correlates of 221n2, 222n10; probabilistic approach to 218–219; prototype model of 218–219,
    222nn6–7; rule-based approach to 218; and topics (music) 217; values in 219–220; vertical
    dimension of 220–221
categorization (film music) 125–146; and affordances 144n2; attributes in 125; basic level
    143–144, 146n13; and family resemblance 144n1; horizontal vs. vertical dimensions of 144n6;
    probabilistic approach to 125; prototype model of 142, 145n12; and saliency 125–126;
    weighted values in 125–126
classical conditioning 3, 85, 95–96, 99n26, 192n13; and memory 188–190
cognitive dissonance: in film music 72–73; and gravity 72, 83n20
conceptual blending 212–215; and the “AHA!” phenomenon 216n5; completion 214; composition
    214; and conceptual metaphor 214–215, 216n10; disanalogy 214; elaboration 214; and formal
    logic 215n1; frames 213–214; and image schemas 214, 216n10; integration 214; mental spaces
    212–213; and multidimensional scaling 215n2; music scholarship in 215; neural correlates of
    214, 216nn7–8; single-scope networks 216n10; vital relations 214, 216n9
conceptual blending (film music) 116–124, 154–156, 158, 159–163; and concert pieces 120–124;
    disanalogy 124n2, 160, 162; and formal design (music) 122; and incongruity 124n8; and irony
    124n8; and popular songs 116–120; transformations of 159–163; vital relations 124n1, 124n8,
    160, 162
Conceptual Integration see conceptual blending
conceptual metaphor 50, 154, 172–182; and [A] IS [B] structure 47n1, 173; and cinematography
    62n1; and conceptual blending 180n14; and the cultural sieve model 175; cultural variation
    in 175, 180n12; directionality of 47n1, 174, 179n7; and the mimetic hypothesis 180n14; and
    musical forces 180n14; neural correlates of 179n9, 180n11; origin 174–175; source and target
    domains of 173–174; theory 173–175; universality of 174–175
conceptual metaphor (film music) 37–63; and CONTAINER 52–62; cultural variability of 48n3;
    and LINEARITY 37–47; and SOURCE-PATH-GOAL 50–52, 56–62; and troping 154; see also
    cross-modal correspondences (film music)
consonance/dissonance 224–225; and acoustic signatures 10, 14; and behavior 20n16, 21n27;
    cognitive (see cognitive dissonance); in harmonic constructs vs. in complex tones 15, 21n29,
    138; and LINEARITY 43–45, 48n8, 84n27; and masking 21n28; musical vs. sensory 21n23; neural
    correlates of 21n28; physiological correlates of 21nn27–28; sensory 83n16, 83n21
contagion (behavioral) 166, 170, 171n14, 171nn16–17; and deindividuation 171n14; emotional
    21n30, 22n32
contagion (film music) 6–7, 16–18, 45–46, 149, 228; ecological/evolutionary perspectives on
    16; and a human MNS 17
CONTAINER 172–178, 180n17, 217; and SOURCE-PATH-GOAL 50, 177–178
CONTAINER (film music) 23–36, 62nn2–3; and narrative syntax 52; and non-normative sound
    design 25–35; and normative sound design 23–25; and SOURCE-PATH-GOAL 56–62; synchronic vs.
    diachronic perception of 62n3
cross-modal correspondences 202n13, 228; audio-visual 179n3; and the Congruence Association
    Model 179n3; cultural variability in 179n5; and developmental psychology 180n10; and the
    McGurk effect 179n3; and the unity assumption 179n3; vs. intra-modal correspondences 179n4
cross-modal correspondences (film music) 18n2, 37–38, 42, 92, 99n28; and attention 98n23;
    concomitant relationships 44, 48n12; and consonance/dissonance 43–44;
    ecological/evolutionary perspectives on 47n3, 48n6, 48n9, 48n12; and loudness 39, 43–45; and
    pitch frequency 37–41, 43–44, 83n24; semantic 43–47; and sound’s envelope 51–52; static vs.
    dynamic parameters 40–41; structural 37–42; tempo and 41–42; and timbral density 44–45; and
    timbre 46–47

diegesis 10, 23–34, 35nn1–7, 36n9, 36n16, 41, 46, 48n11, 52, 62, 63n6, 63n14, 67, 69–71,
    116–117, 134, 140, 150, 158, 226–289; and CONTAINER 23, 35n2; definition of 35n1

ecological resonance see perception (auditory)
embedded mind 185n2
empathy (film music) 6–22, 45, 86, 228; and consonance/dissonance 20n16; and contagion
    16–18, 149, 45–46 (see also contagion (behavioral)); ecological/evolutionary perspectives on
    22n32; entrainment and 8–11 (see also entrainment); and a human MNS 7, 18, 19nn5–6; neural
    correlates of 19n5; physiological correlates of 19n7; process model of 6–7, 18, 18n1; and
    self-regulation 19n11; and subvocalization 11–15, 45–46 (see also subvocalization)
empathy (social) 82n11, 165–170; and electromyography (EMG) 171n13; evolutionary function of
    170; and gestures (physical) 166–167; and gestures (vocal) 167; and a human MNS 169, 170n2;
    and inhibitory mechanisms 169–170, 171n13; and mimesis 166; neural correlates of 166, 169,
    170nn1–2, 171n12; and postures 165–170; process model of 18n1, 165–170
entrainment: ecological/evolutionary perspectives on 82nn9–11; neural correlates of 8,
    19nn8–9, 81n6; physiological correlates of 20n13, 82n13; and saliency 21n30; and speed
    thresholds 81n2
entrainment (film music) 6–11, 18, 45–46, 49n15, 81n2, 92, 124n5, 228; and affordances 65;
    and associations 69; and cardiorespiratory rates 20n12; ecological/evolutionary perspectives
    on 82n10; of meter 66–71; for physical labor 81n4; physiological correlates of 20nn12–13,
    81n6, 82n7
envelope (sound) 62n1
ESMAMAPA 2, 3, 4, 223, 228

formal design (film music) 52, 227–228; binary 52–55; one-part 52–53; rondo 52–54; sonata
    122, 124n7; variation 62

gestalt principles 182; common fate 197; figure-ground 198; good continuation 197; masking
    197; and memory 188; proximity 196; and saliency 94, 198; similarity 196
gestalt principles (film music) 97n9; common fate 93; proximity 92–93; similarity 92–93
gestalt principles (music): common fate 197; figure–ground 198; good continuation 198;
    masking 198, 201n6; proximity 196; and saliency 198; similarity 196–197
gestures (film music) 6–7; cultural origin of 19n3; physiological correlates of 6–7; and
    topics (music) 19n3
gravity: and cognitive dissonance 72; in film 72–73, 83n20, 84n26; in music 72–73; physical
    72; and proprioception 72–73

image schemas 172–182; BLOCKAGE 63n10; CONTAINER 175–176; LANDSCAPE 175; LINEARITY 175–177,
    180n17; neural correlates of 180n11, 180n17; SOURCE-PATH-GOAL 175, 177–178, 180n17;
    VERTICALITY 175–177, 180n17
instrumentation see timbre
irony see rhetorical devices

leitmotif 64, 226; and affordances (film music) 85, 87–89, 97n7; as classical conditioning
    85, 95–96, 99n26; construction of 85–92; and cue abstraction 97n9, 98n18; definition 85;
    ecological/evolutionary perspectives on 86, 94, 96, 99n29, 163n1; and embodied schemas
    147–150; in film musicology 96n1; and gestalt principles 92–94, 97n14; and imprint formation
    97n9, 98n18; and memory 85, 94–95; neural correlates of 97n14; and onomatopoeia 85, 89–91;
    in opera 98n23; perception of 85; and saliency 94, 97nn12–14, 99nn27–28; and topics (film
    music) 91–92, 147, 150–154; transformation of 146–154
LINEARITY (film music) 37–49, 83n24; see also cross-modal correspondences

memory 187–192; and amnesia 191n4; and bottom-up vs. top-down processing 188–189; classical
    conditioning 189, 192nn12–13; connectionist model of 191n5; echoic 187–188;
    ecological/evolutionary perspectives on 190–191; episodic 190; and expectation 193n16; and
    gestalt principles 188; implicit vs. explicit 189, 192n11; and language 99n24; long-term
    189–190, 192nn6–7; of melodies 98n14; modal vs. amodal representations of 192n15; neural
    substrates of 192n7; phonological loop 188; priming 189; procedural 190; record-keeping
    theory 187–188; and saliency 191; and schemata 190, 192n9; scripts 190; semantic 190;
    sensory 187–188; short- vs. long-term 188; and subvocalization 188; working 188–189,
    192nn6–7
memory (film music) 85–99, 98n15, 228; bottom-up vs. top-down processing 94–95, 97n9, 97n14;
    classical conditioning 95; and cross-modal attention 98n23; cultural variability of
SUBJECT INDEX 251
94; implicit vs. explicit 95; and leitmotifs 125, 127; lie 126–127; paradox 125,
95, 98n16; phonological loop 95; and 127; parody 125, 127, 144n6, 144n8;
tonality 98n20 quotation 126–127; sarcasm 126–127;
metaphor 173–175; and categorization 178n1; satire 125, 127, 144n6, 144n8
and cross-modal correspondences rhetorical devices (film music): irony 128–136,
172–173, 178n2, 179n3; see also 140–141, 144nn9–11; lie 138–141;
conceptual metaphor paradox 138, 142; parody 135–139,
Mickey Mousing 38, 42, 67 141–142, 144n11; quotation 137, 142;
mirror neuron system (MNS) 20n20, 165, sarcasm 121, 133; satire 131–134, 138,
169–170; audiovisual mirror neurons 141, 144n11
169, 171n11; and behavioral contagion
170; as bimodal neurons 186n8; vs. saliency: in conceptual metaphors 179n7; neural
canonical neurons 186n7; and emotion correlates of 201n8; and schemas 201n10
171n12; and empathy (social) 169; and saliency (film music) 97nn12–14, 98n18; and
film music 7, 18; in humans 170nn9–10; categorization 125; and the Congruence
music-based 7 Association Model 202n13; and
modes see pitch collections memorization 99n28
schema see image schema
onomatopoeia 27, 30, 34, 36n14, 65, 148–149; semiotics: Anglo-American 206–207; chain of
definition 89; semiotic perspective on 87, significations 207; continental 206–207;
97n3 denotation/connotation 206–207; icon
orchestration see timbre 207; index 207; Peircean 206–207;
signifier/signified 206–207; structuralist
perception (auditory) 194–201, 228; and 206–207; symbol 207
Auditory Scene Analysis 194–195 semiotics (music) 206–209; icons vs. indexes
(see also Auditory Stream Analysis); 114n5; and leitmotifs 88–89; and topics
bottom-up vs. top-down processing (music) 203, 207–209
194–195, 201n2; ecological/evolutionary Shepard tone 74, 83n23
perspectives on 194, 199–200; and signs see semiotics
gestalt principles 194–195; and innate sound design 62n5; and CONTAINER 23,
schemas 194–195, 199, 201n3; and 62n3, 63n17 (see also CONTAINER
learned schemas 194–195, 199, 201n3 (music)); and cultural variability 36n9;
perception (film music) 85–99 non-normative 25–35; normative 23,
perception (visual) 196–199; bottom-up vs. 35n3; overlap 25–26; replacement
top-down processing 200; ecological/ 27–30, 63n14; and technology 36n8;
evolutionary perspectives on 200 transference 10, 30–34, 48n11, 63n6,
pitch collections 91, 152; Aeolian 92, 103–104, 63n17, 124n6, 134, 150
106–107, 112, 114, 119, 150–151, SOURCE-PATH-GOAL see image schemas
207–208; chromatic collection 78–79; SOURCE-PATH-GOAL (film music) 50–63; and
dodecaphonic 73–75; Dorian 62n4, CONTAINER 56–62; and narrative syntax
151; Hungarian minor 103; Ionian 109, 50–52
115n10, 161, 204; and leitmotifs 91; subvocalization 92, 95; and memory 188
Lydian 80; octatonic 74–76; pentatonic subvocalization (film music) 6–7, 11–15, 16,
91, 97n5; Phrygian 60, 62, 100; qualia of 45–46, 151; and a human MNS 12, 17;
83n18, 83n25; symmetrical 73–75, 78, neural correlates of 11, 20n19, 20n20;
83n19; and tonal centricity 83nn18–19; physiological correlates of 11, 20nn14–
whole-tone 72–77 15, 20n18, 21n22
proprioception 72–75, 83n16, 83n20, 170n7;
and affordances 72, 185n4 timbre 63n9, 87; associations of 63n7, 113; and
distortion 46; and onomatopoeia 148 (see
qualia: of harmonic progressions 115n17; of also onomatopoeia); and saliency 94
pitch collections 83n18, 83n25; of scale tonal centricity: and pitch collections 72–73;
degrees 113, 115n18 and rare intervals 83n19
topics (film music) 92, 97n7, 100–115,
red herring 48n10 137–140, 147, 151–159, 225; diachronic
rhetorical devices 228; categorization of perspective 108–113; function 100; and
126–127, 144n2, 146n13; irony 124n8, leitmotifs 91–92, 114n1; and listeners’