Sonic Intimacy
Voice, Species, Technics (or, How To Listen to the World)
Dominic Pettman



The Aural Phase

How might we attend to the act of listening itself, rather than to a specific sound? Moreover, how might we do so in a way that does not presume anything essential about the listener? In suspending habitual assumptions, we can better appreciate the ways in which the sonic environment not only interpellates us, through ideology, but constitutes us, as ontological beings. We are born in and of sound. Our first prenatal experience is overwhelmingly aural: we become embodied and enfleshed within the squelches, rumbles, and pulsing thumps of the mother’s body. Even before we have ears, we can “hear” through our skin. (Indeed, this capacity continues into adulthood.) Then, after leaving the womb, we learn who we are by the sound of our name and the names of others.1 We respond to sonic stimuli, like good Pavlovian subjects. Gradually, we sort the friendly sounds from the unpleasant ones, absorbing a universe of aural materials, from the intimate caress of the lullaby to the impersonal Doppler effect of the ambulance. I hear, therefore I am (who I am)—a biological and biographical cogito that does not so much exclude the hearing-impaired as acknowledge the ways in which sound creates subjectivity through its own surplus as much as absence. (Vibrations are the interface between the experience of an ear that functions as designed and one that does not, since no one—not even the profoundly deaf—can escape the sonic “feeling” of sound waves.)

So we begin this little meditation with a human being, caught in the act of listening. This obliging heuristic figure may be straining to hear something, perhaps even through a curled hand, imitating a Victorian ear trumpet. Alternatively, this hypothetical person may be covering her ears in the hope of diminishing a din. (For it is an evolutionary blessing and curse that we cannot close our ears in the same way we can close our eyes.) Practitioners of sound studies never tire of lamenting the fact that the world has become a scopophilic place and that vision is the royal register of both understanding and action. (After all, Martin Heidegger named the decisive historical shift into modernity as “the age of the world picture” and not “the age of the world composition.”) They seek to redress the balance and reveal the ways in which sound continues to shape, influence, and punctuate our lives, both in a personal sense and in a wider social context. “In the beginning was the Word,” they will say, in order to demonstrate how the ear has been demoted over the centuries in order to make way for the less intimate, more distanced and critical eye. Oral culture is thus offered as the linguistic Eden from which we’ve been exiled, now obliged to navigate our complex environments primarily with visual cues and symbolic signs. (And yes, I’m simplifying things. Kindly indulge me, however, while I set the scene.)

In a work titled “Fantasy and the Origins of Sexuality,” Jean Laplanche and Jean-Bertrand Pontalis state that “the prototype of the signifier lies in the aural sphere” (49). In other words, the ear is the first organ to posit a generic referent within a geometry of relations from which the subject engenders him or herself, in a psychic sense.2 The abstract Other—usually assumed to be the mother—is registered aurally, and the rest of the universe is assembled from this mobile but steady sonic source, including the listening self. Thus, in hearing our infant self vocalize, we sing, click, hum, and shout ourselves into the world. (Tiny Walt Whitmans, all of us, singing the songs of ourselves.) Thus, our preverbal sounds are a series of increasingly confident vibrational speculations that there is an “I” whom we can hear (from the inside, as it were). This might be considered an aural prequel to Lacan’s famous mirror stage, establishing a foundational auto-affection with the self that is soon alienated by visual recognition of the same (thereby giving breech birth to our own split subjectivity—a redundant phrase, in Lacan’s scheme).3 But even before our traumatic fall into visual self-recognition, the infant soundscape is a challenging environment in which to orient ourselves. For every soothing piece of music there is a barking dog or the shrieking of one’s own hungry throat. (“Shut that baby up, for Pete’s sake! . . . Oh. Wait a minute. That’s me.”) A caricature to be sure—but, I hope, not one without conceptual utility.

Next comes the jagged, alphabetic internalization of language, along with the prioritization of becoming an eye-witness to our own lives. We are encouraged not to march to the beat of our own ear drums but rather to follow the bright banner of visual evidence. We are seduced by smiles, which have no sound. By clothing, which is mute. Our eyes begin to devour the world, and our souls are re-scrambled by the reprogramming of our sensoriums that this necessitates, so that the five senses fall into an obedient—and thus efficient—hierarchy and chain of command. Of course, the ear is still on the alert, perhaps as much as it always was (even as it is obliged to “screen out” all the undesirable noises that pollute the place). But we are spectators first and auditors second in most of the most important arenas of life. Only in specific media-cultural contexts is the ear pampered or brought to the fore as before (during a concert, for instance, or on the phone).4 Life is audio-visual—a term that is deceptive in putting sound before image.5 We can watch a movie with substandard sound, but we are unlikely to tolerate a film with a high-quality sound track and compromised visuals. After all, we “watch” a movie; we don’t listen to it. (Although sound-engineers might vociferously disagree.)

All of which is to say that we are sonic creatures to a large extent but historically have had trouble recognizing this fact, let alone acting upon it—aesthetically (making more beautiful sounds), politically (organizing ourselves around democratic sonic principles), or ethically (in listening to the Other rather than being captivated or repulsed by her visage).6 And yet, despite the heavy bias toward the eye, the overwhelming cultural tendency has been to fetishize “the voice” as the location and medium of expression for the human being. All our desires, frustrations, and confusions can seemingly be registered in “the grain of the voice”—as if unconsciously admitting that some kind of authenticity is to be found in neglected or taken-for-granted phenomena. We might argue—as many have—whether the voice is the sonic sign of singular human life (fleeting, unique, precious) or a sound that captures and unites the estranging, generic nature of existence (“I am legion”).7 We don’t know what kind of there is there, in the voice, just that something—or rather someone—makes a sound that emanates from an obscure source (life? spirit? Being?). If the eyes are the window to the soul, then the voice is the sound of that soul after the curtains have been drawn.8

Humans, as always, monopolize the metaphysical condition. We use our voices to sing the praises of our own voices. As such, the human voice is, on the whole, a sonic form of narcissism: a bio-cultural artifact in concert with what Giorgio Agamben calls “the anthropological machine” (that is, the all-encompassing apparatus designed to sort the human element from the animal, on one side, and the machine, on the other).9 Even as we ignore or disavow the voice of those we choose not to hear, for psychically or politically expedient reasons—perhaps not even granting the speaker the status of someone who has a voice10—we still celebrate the vocal vector of speech as one of the finest mediums of communication and connection available. Indeed, the voice is the ticket to entrance into the human community, as the laws concerning deaf-mute people made clear until relatively recently (likewise the Victorian custom of children being seen and not heard).11

But the voice also has the potential to create a glitch in the humanist machinery, when it surprises us with the intensity or force of an “aural punctum”—a sonic prick or wound, which unexpectedly troubles our own smooth assumptions or untested delusions. Beneath the words being spoken lies the grain of the voice, itself shaped by the multitextured materiality of the larynx as well as the sonic traces or index of experience.12 There are many nonlinguistic elements in any vocal form of human communication, and these blur the neat distinction between form (voice) and content (speech).13 “Tone,” for instance, is crucial when it comes to interpreting a spoken missive. Sarcasm, irony, gravity, levity, “vocal fry”—such modes show that the sonic “envelope” in which a message is delivered alters the message itself. The same words can be rendered into polar opposite meanings (“Great!”). The medium is the message. And yet the voice is often considered one of the prime instances of unmediated communication (not yet tainted by technologies such as the telephone). Speaking “face-to-face” is a model of intimacy, even presumed immediacy, which must, as a consequence, be formalized in more public settings. Thus, in meetings at the workplace, for instance, specific linguistic protocols come into play, designed to defuse the existential intensity of being exposed to the voice of the other (two “unstable” ontological elements, working in tandem).

The voice is ambiguous, ambivalent, and enigmatic. We don’t trust things we can’t seize with our eyes and hands. We might squeeze the beloved’s body in passion or fury, but we can never hold his or her voice hostage (and thus there is always a part of the other that will escape our will-to-possess—recording machines notwithstanding, as we shall see). The voice seems to be at once inherently human, but also potentially troubling to such a slippery category. The voice in joy, the voice in love, the voice in labor (both work and procreation), the voice in pain, the voice in misery—these intimate yet often impersonal sounds threaten to expose the human being as an animal, a monster, or even an alien. The voice has the disconcerting tendency to detach itself from the body and wander around the place causing mischief, attaching itself to strange entities like parrots or loudspeakers, or even to take a noisy kind of refuge inside our own heads, as with headphones or schizophrenia. In times of distress, we may have trouble syncing up a familiar face with a voice: one of the primal scenes of estrangement, prompting us to be suspicious of the entire world, which cannot reliably sync up the visuals with the soundtrack of our lives.14

Voices, in other words, are as seductive, nourishing, and necessary to our well-being as they are potentially alienating, confusing, or hurtful. What’s more, even as they help us demarcate important zones of orientation (such as age, gender, race, or even species), they can—by virtue of this border-mapping capacity—transgress the neat divisions we make between “us” and “them,” at all scales and junctures. Hence the key series of questions animating this book. What voices are we not hearing when attuned only, or at least primarily, to our own gender or species? What are the psychosocial, cultural, philosophical, and anthropological factors that are limiting our ability to even consider, let alone hear, the voices of nonhuman subjects? Can we even talk of “the voice” when we step out of the human world? And—even if we do so, for the sake of a sustained thought experiment—how can we widen the circle of the voice to include not only other animals but other natural and environmental elements, including machines, as well? What is lost if we treat all sounds as potential “voices” of discrete monads? Conversely, what new perspective might be gained from even a brief suspension of such habitual distinctions? What might justify clearing some conceptual space in order to listen for, if not precisely to, the collective, polyphonic “voice of the world”? Indeed, what is the ultimate purpose of trying such an auditory experiment?

One answer to the latter question lies in our own figurative deafness to a world in the midst of an extinction event of enormous magnitude. Alarmed scientists try to tell us on a daily basis that we are not listening to the earth, which is—elliptically perhaps, and in its own cryptic way—trying to tell us that it is in trouble. No doubt, such language will make many uncomfortable, as it seems to be positing a holistic or New Age subject—Mother Earth or simply Nature—in its romantic guise.15 But let’s not get too reactive too quickly. After all, scientists themselves, many of them proud positivists, are happy to speak figuratively of “the voice of the earth” (for the purposes of listening and responding, in an ethical, quasi-Hippocratic mode). This book looks for those fragile but important hinges between the figurative and the actual, linking the two artificially separated realms together in order to entertain the notion that the vox mundi—“the voice of the world”—exists, in some (extremely nonhuman) sense.16 But that doesn’t mean we know what it is saying. It doesn’t mean there is a metaphysical or transcendent source for such a voice. It doesn’t mean that there is one single holistic voice or message. This admittedly risky conceit merely acknowledges that there are sounds in the world, created by a cacophony of creatures and things, both “natural” and “artificial,” and that these sounds are often in a kind of dialogue with each other: a loose but significant form of call-and-response. This book is thus an appeal to listen to voices that we would normally never think of as such and in the process make something audible that previously wasn’t, through the very act of naming it so. (The saying “The squeaky wheel gets the grease” tacitly admits that nonsentient things can cry out for help or attention.) 

Of course, there are essential distinctions to be made between the voice of a person, a cat, a robot, an ocean, and a wheel, even if we are willing to grant such a designation for the sake of argument. And the purpose of this experiment isn’t to ignore such differences. However, an appreciation of alterity, in whatever degree, should not discourage us from looking for acoustic analogies, in Kaja Silverman’s sense, that reveal undiscovered affinities and shared fates among the most unlikely combinations.17 In being open to vocal solicitation from nonhuman sources, we might find ourselves being re-interpellated, and in ways that are empowering, not just for ourselves but also for the entity hailing us. Complicating Althusser, the one who calls us into identification need not always be a cop.

The following pages are thus divided into four chapters, each focusing on one aspect of post-human, nonhuman, or all-too-human vocalizations (while also demonstrating the ways in which these intersect and influence each other). The first is dedicated to the cybernetic voice, which speaks to us through our machines, whether traceable back eventually to another person or a software program. The second maps the gendered voice, together with the territorial boundary markers that have historically and culturally attempted to contain and control the voice of women, especially when uncannily untethered to female bodies. The third pays heed to the creaturely voice, in order to better attune ourselves to the muffled or latent screech within our own speech (as well as the vocal expressions of animals). And the fourth posits the existence of the ecological voice, in order to listen better to the diverse yet profoundly interconnected communicative events that largely comprise the natural (and not so natural) world.

Thanks in large part to the industrialization of the human ear (a history well described by Jonathan Sterne), we have lost the capacity to hear the vox mundi, which is not a coherent, organic, quasi-spiritual gestalt—the voice of Gaia, as it were—but the sum total of cacophonous, heterogeneous, incommensurate, and unsynthesizable sounds of the postnatural world. “The aural phase” is not one that we ever transcend, at least not while our signs are still vital. Our ears are vigilant, certainly, but only across a very selective bandwidth or frequency: the “ping” of our phones, the grizzle of our baby, the meow of our cat, the sigh of our lover. Repressed, unheard, difficult, and/or heretical voices whisper to us as “noises off,” attempting to remind us that our own tongues form merely one instance of the voice of the world. We would do well to listen more carefully to these sonic solicitations, just beyond the threshold of our acculturated sonic filters.


1. For a more developed and influential conceptual mapping of aural subjectivation, see Didier Anzieu’s notion of “the sonorous envelope,” as developed in The Skin Ego.

2. On this theme, Mladen Dolar quotes Freud’s discussion of “the sounds which betray parental intercourse” (The Voice and Nothing More, 133), in which the father of psychoanalysis notes that “children, in such circumstances, divine something sexual in the uncanny sounds that reach their ears. Indeed, the movements expressive of sexual excitement lie within them ready to hand, as innate pieces of mechanism” (134). While admitting some reservations concerning Freud’s assumptions about the child’s psychic technology, Dolar agrees that “a fantasy is a confabulation built around the sonorous kernel” (136).

3. It is also possible that the moment one hears one’s own recorded voice for the first time, and the uncanny (mis)recognition and denial that such an experience often inspires (“That’s not me!”), is another audio equivalent of the mirror stage.

4. In his intimidatingly comprehensive tome Making Noise: From Babel to the Big Bang and Beyond, Hillel Schwartz excavates an enormous archive of historical details pertaining to sounds, which, through their sheer volume (in both senses), collectively form an argument against the bias of the visual over the auditory—or at the very least, render this common claim even more mystifying, given the prevalence of noise, sound, and music in our lives.

5. The etymology of audience stresses the faculty of listening over and above seeing—something that has been reversed in the society of the spectacle. Likewise, no marketeer has ever talked of “ear drums” in the way they refer to “eyeballs.”

6. Levinas’s reverence for the face of the Other seemingly rests on this tradition that prioritizes the visual over the aural, so that one might wonder what an ethics would “sound” like that involves an active listening to the voice rather than a deferral to the injunction of the eye. However, on occasion, Levinas is indeed concerned with the relationship of ethics to the voice via the face of the Other, especially in terms of historical and metaphysical suppression (although it must be said that this interest is overly tied to speech and/as language rather than the underlying “grain” of the voice). See, for instance, Seán Hand’s discussion in “The Other Voice: Ethics and Expression in Emmanuel Levinas.”

7. See Jeff Dolven’s forthcoming book The Senses of Style on the notion of “voice” and/as style (and vice versa).

8. Aristotle famously writes in “On the Soul” that “voice is a kind of sound characteristic of what has soul in it; nothing that is without soul utters voice” (572). For Aristotle, all life shares in psukhē, or soul. But this does not mean that trees or caterpillars potentially have voice, defined by the natural philosopher as “a sound with a meaning.” This meaning derives from the coincidence of audible breath (powered by the sensitive soul) and “an act of imagination” (enabled by the rational soul). So Aristotle is unclear, and at times inconsistent, about the vocal capacity or potential of nonhuman animals. The braying of a donkey may indeed have some voice in it, since “such animals as are devoid of lung have no voice.” So by inverse logic, those with lungs do, at least potentially, have voice. But the sounds of these creatures will never rise to the level of language or speech, for Aristotle, since that requires nous. The barking of a dog can be considered a “sound with a meaning” (“There is an intruder in the house!”). The dog therefore enjoys a certain amount of vocal agency. But, for Aristotle, this is a case of nonlinguistic voice. One wonders, then, where strong and legible boundary lines can be drawn between meaningless sound, meaningful sound, and voice. The Venn diagram is very mobile, depending on the source—and indeed the listener. Consider, for instance, sounds seemingly without soul, such as a thunderclap or a church bell. Both are examples of “a sound with meaning” to a human subject (“A storm is brewing” and “Time to come worship,” respectively). Does this mean—according to both Aristotle’s rather vague definition and the principle of the vox mundi—that a thunderclap or a bell can have voice?
Victor Hugo, via his famous figure of Quasimodo, might answer in the affirmative, even going so far as to claim a form of communication in the case of the bells of Notre Dame Cathedral: “the one form of speech he [the hunchback] could hear.” (I thank Soyoung Yoon for alerting me to this reference.)

9. See Giorgio Agamben, The Open.

10. The now-classic reference here is Gayatri Spivak’s “Can the Subaltern Speak?”

11. As Barbara Glowczewski notes (“Assemblages”), there has been a strong historical equation between humanity, figured against and above the animal, and specifically spoken language: “Thus people have forbade children who grow up without speech to continue to express themselves with signs, including deaf people. For 100 years the Vatican forbade the use of sign language, even though it is a language par excellence.”

12. Shane Butler provides the important reminder that “the living voice . . . is itself a medium; like the wax of Edison’s later cylinders, or that of an ancient writing tablet, its ability to express depends in part on its ability to be impressed” (The Ancient Phonograph, 27). In the electric, and then digital, ages, we tend to forget that the voice itself is as much medium as message.

13. The Greek word phōnē could be used to mean both “voice” and “speech” and has thus contributed to a long legacy of ambiguity when it comes to attempted conceptual distinctions between these two intimately related phenomena (see Butler, The Ancient Phonograph).

14. The celebrated film editor Walter Murch notes: “Renoir in particular was extremely interested in realistic sound. He went so far in one direction that he almost came around the other side. There’s a wonderful quote by him where he says that dubbing—replacing the original sound with something else—is an invention of the devil and that if such a thing had been possible in the thirteenth century, the practitioners would have been burned at the stake for preaching the duality of the soul! Renoir felt that a person’s voice was an expression of that person’s soul, and that to fool around with it in any way was to do the devil’s work. The devil is frequently represented as having a voice at odds with what you see. In The Exorcist, the voice that the young girl speaks with is not her own voice. This idea of devilry and duality and dubbing, there’s something to be explored there” (qtd. in Ondaatje, The Conversations, 112–113).

15. Various mystical traditions find nothing new in the idea that the voice is far from exclusively human, as expressed in the Sufi belief that “there is nothing in this world that does not speak. Everything and every being is continually calling out its nature, its character and its secret” (Khan, The Sufi Message of Hazrat Inayat Khan, 148). Extrahuman voices are also sometimes evoked by modern artifacts in an attempt to connect to lost indigenous cosmologies, for instance, that of the Alacalufe and Yaghan peoples, featured in Patricio Guzmán’s The Pearl Button. “They say that water has memory,” notes the film’s narrator. “I believe it also has a voice. If we were to get very close to it, we’d be able to hear the voices of each of the Indians and the disappeared.”

16. Vox in Latin means both “voice” and “word.”

17. Silverman writes: “Analogy is the correspondence of two or more things with each other,” rather than to each other (The Acoustic Mirror, 40). Moreover, “an analogy is a very different thing from a metaphor. A metaphor entails the substitution of one thing for another. This is a profoundly undemocratic relationship, because the former is a temporary stand-in for the latter and because it only has provisional reality. In an analogy, on the other hand, both terms are on equal footing, ontologically and semiotically. They also belong to each other at the most profound level of their being” (173).