The Making of Early Literary Recordings
Jason Camlot



Audiotextual Criticism

The experience of listening to an old, spoken recording is nothing less than strange. Listen to the voice of Alfred Tennyson reading his poem “The Charge of the Light Brigade” recorded on an Edison phonograph cylinder in 1890. The sound recording of Tennyson’s voice is strange for many reasons:

(1) It does something strange. It’s weird to have sound emanating from someone thought dead and gone forever (except for the immortality of Tennyson’s poetry, of course) resonate, vibrate through the air, and trigger our ears with physical pressure, creating the “disembodiment” and “teleportation” phenomena that Jeffrey Sconce associates with “electronic presence” in what he calls “haunted media.”1

(2) It sounds strange. Aside from the strangeness of hearing the voice of a dead person, the recording itself does not sound normal to us. The audio signal we are getting is not clear or intelligible, according to our present standards of fidelity for audio playback, because we are listening to a digitized version of sound recorded on a late Victorian brown wax cylinder. The particular cylinder used in this recording was not preserved according to best archival practices either. It was stored at the back of a South African barn from 1890 until the 1960s, losing some of its shape over time and adding an additional curve to the sound, distorting the voice of the poet eerily.2 Some of the sound distortions include collateral noise, crackle, and a muffled hum. This noise of the wax cylinder is the voice of the medium itself, audible “media-archeological information (about the physically real event).”3 It communicates a past temporal presence, and the substances used to capture and preserve it. The strangeness of this effect is compounded because the actual material at the source of these sounds (the curved voice, the noise of wax) is communicated in the absence of the cylinder itself. That sound has long since migrated from the cylinder “into” a laptop computer. Without the material artifact before our eyes, without a metal stylus navigating the hills and dales of the wax surface of the cylinder’s body, we are not in a position to attribute what we are hearing to its material source, and there is a certain dissonance between the crackling signal, which suggests a tangible source for the sound, and the digital software and hardware by which we reproduce it in the present. This recording’s status as an “aural object”—a phrase reflecting our inclination to attribute the aural profile of a sound to a corresponding material object—is unclear, estranging (us from) the sound we hear.4

(3) It contains strange sounds. There are unidentifiable sounds on the recording, not attributable to voice, that make the audio artifact even weirder. Starting from about 1:33 into the recording, we hear a loud banging sound. Is this a defect in the cylinder evoking human action (a knock on the door?) from the time when the recording was made? Is it Tennyson getting carried away with his recitation, banging on the table as he performs (as one scholar proposes in his interpretation of the recording)?5 Or is it (more likely) the amplification of a crack in the cylinder that we have integrated into our listening narrative of the recording, just as listeners at early phonograph exhibitions are reported to have attributed thumps and roars arising from flaws in a cylinder to foot-tapping or mechanical sound effects?6 We can’t tell what we hear. Our inability to attribute aspects of the signal to a material source and our natural inclination to integrate such aspects of the signal into our narrative of listening reveal our inherent desire, our need, for a legitimate source.

(4) It is strangely disconnected or fugitive. In fact, we cannot tell much at all from listening to this sound recording about the context in which the recording was made. The knocking sound may make us ask what Tennyson was doing while he was making this recording, but the sound recording as a whole prompts a broader version of this question: What was he doing, making this recording? The voice stripped of its social and material contexts estranges the signal for us when we hear it again in our own immediate space. So many early sound recordings are strange because they come to us as “fugitive” signals from another place, increasingly via digital media, stripped of their own informing spaces, situations, and reasons, and ripped from their original media formats.7 The idea of the status of the fugitive, ephemeral entity preoccupied the inventors of early sound-recording technologies as they worked to develop a means of capturing previously fugitive sounds, voices, and events for the long term. As James Lastra has argued, those who first experimented with sound recording and preservation through phonographic methods were engaged in a process of inverting the frameworks that informed our attention to substances: “By preserving the purely contingent, these phonographic systems effectively reversed the rational hierarchy between the essential and inessential, between substance and accident.”8 As historically postliminary listeners and as researchers, we are presented with the challenge of restoring some of the most basic information that is needed to understand the meaning of a sound after the fact of its occurrence. The repeatable sound signal does not always explain itself.

(5) It is strangely real. For all this missing information, when we hear a historical voice recording, when we listen to Tennyson read “The Charge of the Light Brigade” again, over a century and a quarter after he recited it into a phonograph, there is something very real about it. The real-time quality of recorded sound, that it puts us into time that has already passed and opens a tunnel connection with the past, triggers what the philosopher Wolfgang Ernst has called “the drama of time-critical media.”9 An encounter with recorded sound develops as an experience of real-time processing. The listener gets the sense that the overheard time frame is somehow alive in the present, replicating the live sonic event of which it is apparently a real-time reproduction. Sound recording works on human perception itself, and on our perception of time, in particular. Ernst’s argument about the strange drama of sound recording is based on his idea that we are not cognitively equipped to process events from two temporal dimensions simultaneously. When we immerse ourselves in real-time sound, we perceive it as “live” and this jars our awareness of time.

The strange distortions in the signal I have been describing are offshoots of the fact that sound recording is a time-based medium, and we, as humans, are time-sensitive listening creatures. If the timing of a voice is off, we are pretty quick to notice. If your friend’s voice sharing a story with you at a café began to accelerate and rise in pitch in a manner that emulates an LP record (meant to be played at 33⅓ revolutions per minute [rpm]) being played at 45 or 78 rpm, you would begin to question your perception of reality. There were variable standards for recording and playback speeds from 1890, when Tennyson made his early literary recordings, to the 1920s. The speed at which a recording was made had an impact on the sound quality, volume, and, of course, the potential duration of a record.10 As sound recording became increasingly commercialized as a technology, with set record formats, firmer standards were developed, but as late as the 1920s, the playback device that controlled the speed of replay (sometimes referred to as “the governor”) might require regular adjustment. In an amusing recording entitled “Spoken English and Broken English” (1927), George Bernard Shaw illustrates the sensitivity that informs our perception of time in relation to the sound-recorded voice, noting that attention to “the screw, which regulates the speed” may be necessary to realize the true vocal presence of the speaker whose voice has been recorded.11 Technical knowledge and adjustments in the present are required to calibrate the signal for it to be perceived as a plausibly real emanation from the past.
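The relation between playback speed, pitch, and duration described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming an idealized turntable on which pitch scales in direct proportion to rotational speed; the function names are hypothetical and are offered only for illustration.

```python
import math

def pitch_shift_semitones(intended_rpm, actual_rpm):
    """Pitch change, in equal-tempered semitones, caused by playing a
    record at actual_rpm instead of the speed it was meant for."""
    return 12 * math.log2(actual_rpm / intended_rpm)

def playback_duration(recorded_seconds, intended_rpm, actual_rpm):
    """How long the same grooves take to play at the wrong speed."""
    return recorded_seconds * intended_rpm / actual_rpm

LP_RPM = 100 / 3  # 33 1/3 revolutions per minute

for actual in (45, 78):
    shift = pitch_shift_semitones(LP_RPM, actual)
    seconds = playback_duration(180, LP_RPM, actual)
    print(f"{actual} rpm: {shift:+.1f} semitones, "
          f"a 3-minute side lasts {seconds:.0f} s")
```

Under these assumptions, an LP played at 45 rpm sounds a little over five semitones sharp, and at 78 rpm more than an octave sharp, which is why a misadjusted governor so readily estranges a familiar voice.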

The present book is about how to engage critically with early sound recordings as literary works. The audible strangeness of such sound recordings, which I have begun to identify and explain, represents a safeguard against such critical analysis. I sympathize with the recordings. I want to keep them weird, to preserve their status as objects of wonder. But at the same time, I am interested in learning more about them, how and why they were made, circulated, categorized, used, preserved, written about, discarded, rediscovered, copied, then circulated, used and categorized again. To think critically about sound recordings as literary works, we need to explore the historically specific convergences between audio-recording technologies, media formats, and the institutions and practices of the literary context.

Phonopoetics as a concept refers to the emergence and making (poesis) of literary speech sounds (phono) as they can be heard in early spoken recordings. That such sounds were apprehended (captured) on replayable records allows us to apprehend (understand) their literary historical significance. It is one of my working assumptions that temporal specificity—the location of a sound recording within a specific historical context—is key to understanding the meaning of any literary sound recording. In order to make sense of particular sound recordings produced during the acoustic and electrical periods, many different kinds of them must be considered. My own area of interest in historical audiotexts relates especially to the meaning of “the literary” as an informing framework and ideology of audiotextual production and use. Considering the concept of the literary in relation to early spoken recordings demands a sociological approach, since the project necessarily challenges the sequence of restrictive criteria that have been placed on the concept of literature since the nineteenth century. As Raymond Williams pointed out long ago, literature as a professional concept was governed first by “a restriction to printed texts, then a narrowing to what are called ‘imaginative’ works, and then finally [by] a circumscription to a critically established minority of ‘canonical’ texts.”12 Early literary recordings in the sense in which I propose we use the term, on the other hand, are not printed, only occasionally imaginative in the high literary sense, and often would not have qualified as canonical when they first appeared, or have done so at any time since. The literary recording in the widest sense helped mobilize an ideology and practical engagement with sounds that were associated with ideas of literary performance, experience, and enculturation. 
These sounds might be of an elocutionist reciting Tennyson or an actor declaiming Shakespeare, but they might also be of a preacher reading a passage from the King James Bible, or a professional recording artist performing a sketch in dialect. The early literary recording is the result of the social and cultural forces that produced it and informed its meaning for the people who used it, and, as D. F. McKenzie has argued, the diverse “forms of record and communication” that we study are “not disparate but interdependent, whether at any one time or successively down through the years.”13

My approach to the literary historical study of the audiotext does not identify the literary nature of a sound recording exclusively with particular, extrinsically identifiable qualities of the sound signal under examination, and even less so with the imagined intentions of the performative reader whose voice and performance we may now study in the form of an audiotext, but rather with a diverse range of psychological, ideological, institutional, aesthetic, and social associations that informed that recorded signal’s production and subsequent use. Research concerning the informing theories and techniques of performance heard in a literary recording is indeed of significant interest to audiotextual criticism. However, these elements discerned in an audiotextual signal should not be approached wholly from the perspective of the performer’s authority over those informing techniques (i.e., his or her mastery of elocution). Rather, they are potentially important factors at work within a wider range of contextualizing elements, forces, and associations. This aligns the method of audiotextual criticism I am proposing here with the mode of textual criticism proposed by Jerome McGann and Donald McKenzie in the 1980s.14 Audiotextual criticism is an expansion of the sociology of texts, introduced by textual critics and book historians, into the realms of media history, sound studies, performance studies, format theory, and other related approaches to the production and circulation of audible literary works.

Speaking of the technology of the book, but really in reference to all artifacts of communication, McKenzie articulated an idea of a sociology of texts that is useful for imagining a sociology of the audiotext, asserting that “a book is never simply a remarkable object. Like every other technology it is invariably the product of human agency in complex and highly volatile contexts which a responsible scholarship must seek to recover if we are to understand better the creation and communication of meaning as the defining characteristic of human societies.” A sociology of the audiotext certainly attends to the formal structure of the signal under consideration, but only as one facet of the broader consideration of the social realities and functions of the media in which it has appeared, and, again in the words of McKenzie, of “the human motives and interactions which texts involve at every stage of their production, transmission and consumption,” including “the roles of institutions, and their own complex structures, in affecting the forms of social discourse, past and present.”15

To proceed with a critical project of this nature, it is necessary to understand the generic, formal, material, and ideological classifications that were used to categorize spoken records in the early periods of sustainable sound recording (1888–1925). To this end, I will briefly outline some key concepts and points of method useful for approaching these fascinating artifacts, critically, in the context of literary studies. These concepts include that of the sound signal as an object of critical analysis, the idea of a “literary” recording as a discernible category of recorded sound, definitions of audiotextual forms and genres, and the material history that has mediated and continues to mediate our engagement with these cultural artifacts. The chapters that follow offer classificatory analysis in a descriptive sense (what were the different classes of spoken recordings in the early era of sound recording); in a metacritical sense (what are the most useful categories of historical analysis available to us for interpreting early spoken recordings); and, finally, in an applied analytical and interpretive sense (what happens when we attempt to unpack the significance of the descriptive categories by applying our suggested methods of historical analysis).

By focusing on examples from the early period of sound recording, this book presents a historically located iteration of a phonopoetics, a poetics of the sound-recorded performances of the literary. This early period encompasses what are often referred to as the acoustic and electrical eras of audio-recording technology. These broad descriptive categories of technological history span the period that included Edison’s tinfoil cylinder phonograph (ca. 1877, although no recordings exist from this period), Edison’s “Perfected” cylinder phonograph (ca. 1888), Berliner’s flat disc gramophone (ca. 1894), the Columbia Records and Victor Talking Machine Co. experiments with “electrical” recording techniques that used electromagnetic microphones, amplifiers, and disc-cutting machines (ca. 1925), including Cecil Watts’s instantaneous disc recorder (ca. 1934), and the many technological variations and improvements on each of these devices that emerged prior to the introduction of analogue tape recording technology for widespread use in the 1960s. My historical narratives based on case-studies from these early periods of recorded sound illustrate a range of methods that we may employ to better understand the significance of how sound recording was deployed for literary purposes before the advent of tape (which marked the beginning of the “analogue” or “magnetic” era of recording, ca. 1950) and, more recently, of digital recording technologies (ca. 1992). Technological and media specifications do not determine the story of literary recordings but are part of larger networks of ideas, discourses, ideologies, institutions, and practices that must inform our interpretation of such artifacts from the perspective of the present. Audiotextual criticism begins with the conceptual conversion of a sound into a signal of interpretive significance.

Sound and Signal

A sound is produced when a source object vibrates in a manner that causes the surrounding air to move, and when those vibrations are of such a quality that they can be heard by a perceiving entity (for our purposes, a human being with the capacity to hear). The source object will, for example, pulsate in a manner that works to compress and rarefy the surrounding air molecules in a pattern that will in turn travel to an ear to be heard. The form in which the compressions and rarefactions travel is often described as a wave with particular characteristics that have implications for the nature of the sound that has been produced. The oscillation rate of the source determines what is called the frequency of the sound wave and is characterized in hearing as the pitch of the sound. If the vibrations are less frequent, the sound will be lower (more bassy) than if they are more frequent, when the sound will be higher (more trebly). The degree of compression and rarefaction created by the source’s motion determines what is called the amplitude of the sound wave and refers to the loudness of the sound when it is perceived.16 The ear hears these vibratory waves as sounds due to its capacity as a tympanic mechanism for transducing vibrations.17
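The two characteristics just described, frequency and amplitude, can be illustrated with the simplest possible wave, a pure sine tone. What follows is a minimal sketch rather than a model of any real voice or instrument; the function names, the sample rate of 8,000 values per second, and the two example tones are all assumptions chosen for illustration.

```python
import math

def sine_wave(frequency_hz, amplitude, duration_s, sample_rate=8000):
    """Return the values of a pure tone: amplitude * sin(2*pi*f*t)."""
    n = int(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * i / sample_rate)
            for i in range(n)]

def count_cycles(samples):
    """Count upward zero crossings, one per completed oscillation."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)

def peak(samples):
    """Largest excursion from zero, i.e. the amplitude of the wave."""
    return max(abs(s) for s in samples)

low_quiet = sine_wave(110, 0.2, 1.0)   # fewer vibrations, small excursions
high_loud = sine_wave(440, 0.8, 1.0)   # more vibrations, large excursions

print(count_cycles(low_quiet), round(peak(low_quiet), 3))
print(count_cycles(high_loud), round(peak(high_loud), 3))
```

The second tone oscillates four times as often as the first and would be heard as two octaves higher; its larger peak value corresponds to greater loudness.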

In this simple attempt to provide a description of the physics of sound and hearing, I have necessarily lapsed from the topic of sound into that of signal, from vibrational and auditory entity into representation. The physical characteristics of sound and hearing have an extensive and complex set of linguistic, numerical, and visual resources and methods for their representation. Audiotextual criticism necessarily draws on these representational resources. We measure the frequency of sounds according to their rate of oscillation in hertz (Hz) and kilohertz (kHz) or cycles per second. (Most humans can hear frequencies within the range of 20 Hz to 20,000 Hz [20 kHz], and are especially receptive to sounds ranging between 1 and 4 kHz).18 We measure the amplitude of sounds in decibels (dB), a method of representing the ratio of one sound in relation to another, of metering the sound as a signal. The understanding of hearing as tympanic transduction evolved in the nineteenth century and offers a rich and far-reaching story of cultural and technological representation. Jonathan Sterne devotes a whole chapter to the topic in his book The Audible Past (2003). All this to say that at times these terms function as useful shorthand for explaining how a media technology works or how the results of its functioning are heard, but they are always also historically rich and resonant metaphors, to which we should attend both for their interesting historicity and for the breadth of their explicative powers.
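Because the decibel expresses a ratio rather than an absolute quantity, a short calculation may help to show what the unit represents. This sketch assumes the common convention for amplitude (pressure or voltage) ratios, 20 times the base-10 logarithm of the ratio; power ratios use a factor of 10 instead. The function name is hypothetical.

```python
import math

def amplitude_ratio_to_db(amplitude, reference):
    """Express one amplitude relative to another in decibels,
    using the 20 * log10 convention for amplitude ratios."""
    if amplitude <= 0 or reference <= 0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(amplitude / reference)

print(round(amplitude_ratio_to_db(2.0, 1.0), 2))   # doubling the amplitude
print(round(amplitude_ratio_to_db(10.0, 1.0), 2))  # a tenfold increase
print(round(amplitude_ratio_to_db(0.5, 1.0), 2))   # halving the amplitude
```

A doubling of amplitude corresponds to a gain of roughly 6 dB and a halving to a loss of roughly 6 dB, which is why the scale is always a statement about one signal in relation to another.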

The term “phonography” (sound writing) held multiple historical meanings in the nineteenth and twentieth centuries, referring both to written systems of phonetic transcription (shorthand scripts) and to applications of the phonograph for the recording of sound. Lisa Gitelman’s Scripts, Grooves and Writing Machines (1999) is one compelling telling among many possible accounts of the meaning of this historical (and ongoing) metaphorical continuum between writing and sound, inscription and aurality.19 As Stefan Helmreich has observed, a historically resonant and metaphorically powerful term such as “transduction”—used to describe what happens to sound as it traverses media while traveling from source to ear, turning from one kind of energy into another20—has been “an appealing concept because it narrows the distance between cultural analysis and technical description, offering a conceptual language partially shared between scholars in the humanities and in engineering and science circles.”21 Many of the keywords deployed in discussing sound-oriented practices and sound media technologies possess meanings that are shared, or partially shared, across disciplines in the humanities, social sciences, sciences, and engineering. Furthermore, many of these keywords resonated historically in multiple frequencies simultaneously, and part of our work as critics of literary recordings and historical sounds of all kinds is to parse the conceptual overtones and undertones from the historically situated fundamental frequency.

When we refer to the sound in our critical accounts, we are often speaking metaphorically, in the terms of the signal. An audio signal is a representation of a sound. Engineers can refer to the “bumps and pits on a wax cylinder” as “the raw audio signal,” but in doing so they are splitting hairs between degrees of separation between sound and representation. Proceeding with a theory that separates sound from signal clarifies the status of the recorded audio signal as a representational and manipulable artifact of a sound event that once occurred, sometime, somewhere. The audio signal as a figurative entity that is conceptualized as variation over time may be approached as measurable for the purposes of analysis and transformation. The figurative nature of the signal may be qualified depending on the medium—an analogue signal represented as a continuous flow of fluctuating electrical voltage will differ from a digital signal understood as a dense series of discrete values—but its status as representation remains.
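The distinction between a continuous analogue signal and a digital signal made up of discrete values can be sketched in a few lines. The example below is illustrative only: the 5 Hz test tone, the sample rate of 100 values per second, the 8-bit depth, and the function names are all assumptions, and real digitization workflows for cylinders or discs involve far more than this.

```python
import math

def continuous_signal(t):
    """A stand-in for an analogue signal: defined at every instant t."""
    return math.sin(2 * math.pi * 5 * t)  # hypothetical 5 Hz tone

def digitize(signal, duration_s, sample_rate, bit_depth):
    """Sample the signal at regular intervals and round each value
    to the nearest of a fixed number of integer levels."""
    max_level = 2 ** (bit_depth - 1) - 1   # 127 for 8-bit audio
    n = int(duration_s * sample_rate)
    return [round(signal(i / sample_rate) * max_level) for i in range(n)]

samples = digitize(continuous_signal, 1.0, sample_rate=100, bit_depth=8)
print(len(samples))    # number of discrete values kept for one second
print(samples[:6])     # the first few values, now plain integers
```

Whatever happened between the sampled instants, and whatever fell between the available integer levels, is not retained, which is one concrete sense in which the digital signal remains a representation of the sound event.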

The idea of the wave as a metaphor that is useful for describing the formal characteristics of sound, introduced as an explanatory analogy by Hermann von Helmholtz in 1863, was the first step toward his goal of identifying the distinguishing characteristics of different musical tones. “The waves of air proceeding from a sounding body,” he writes, “transport the tremor to the human ear exactly in the same way as the water transports the tremor produced by the stone to the floating chip.”22 From this first analogy he could proceed to unpack far more granular metaphors concerning the characteristics of force, pitch, quality, and so forth, of musical sound. The signal moved from descriptive analogy to observable phenomenon when systems designed specifically for the analysis of signal content were developed. The psychologist Carl Seashore reported on his development of the voice tonoscope in 1902, “a device constructed on the principle of the stroboscope; that is, the vibrations of the voice are made visible upon a moving surface by the action of intermittent light.”23 The tonoscope, an early mechanical realization of the metaphor of the signal, was “intended to be a general measuring instrument” to be deployed “in a number of ways,”24 and, indeed, scores of studies analyzing musical and speech performance from the disciplinary perspectives of psychology, music, and speech education were published based on information gathered using Seashore’s device through the 1930s.25

The signal, in such historical examples, was still a specialized source of information generated by equipment that required regular calibration and improvement. Even the more familiar widespread examples of signal visualization, such as the analogue VU meters standardized in the early 1940s (those illuminated needle gauges with peak or overload areas marked with red on the right part of the numbered arc), while providing measured information about the audio signal, were mostly applied to the practical concerns of setting the levels of broadcast voice properly.26 In effect, it is the development of digital signal processing combined with a growing corpus of digital audio data and speedy processors that has made the audio signal such a potentially important working metaphor for the critical analysis of sound. As Alexander Lerch has observed, the combination of factors just mentioned “has significantly increased both the need and the possibilities of automatic systems for analyzing audio content, resulting in a lively and growing research field.”27 The implications for audiotextual criticism of this lively activity in the development of automatic techniques for the analysis and visualization of sound may be compelling, because the tools that are emerging on a regular basis suggest new questions we can ask our signals about historical media, social relations, prosodic performance, affect, and any number of other points of interest. Over time, digital audio signal processing can lead us to expect increasingly articulate answers from the signals we study. 
It will certainly continue to impact our understanding of the qualities of literary sounds, or audiotexts, and move us further down the path of Roland Barthes’s understanding of “The Text” as “a methodological field” (as opposed to “the work,” which is “a fragment of substance”),28 and most likely toward increasingly elaborate “rationales of audio text” realized in relation to information systems for the purposes of software analysis, content modeling, and cataloguing.29 The audiotext is an interpretive concept by which sound is conceptualized as a signal with ideational, aesthetic, social, cultural, and formal qualities of historical significance. The identification of an audiotextual signal with literature as an expressive art form entails explanation of how and why its sonic features can be understood to signify meaningfully in the context of the literary.

What Is a Literary Recording?

Poetry and the literary in general, when historically defined beneath the broader umbrella of belles lettres, and even to some extent in its narrower definitions that still fell within certain late eighteenth- and early nineteenth-century theories of rhetoric, did not exclude the oral performance of a work of literature. While these definitions predate sound recording, the idea of a “literary” sound recording is not an oxymoron if imagined counterfactually through the broader sensorium of eighteenth-century theories of rhetoric. Raymond Williams’s assertion in Keywords that nineteenth-century notions of literature tended to exclude “speaking” helps identify some of the historical and definitional complexity of a sound-recorded performance of a literary work when the technology finally arrived at the end of the nineteenth century to realize and deliver such a thing.30 By the end of the nineteenth century, literature was mostly identified with printed material artifacts: periodicals, pamphlets, and books. This identification of literature with print media explains why Thomas Edison, as early as 1878, imagined literary recordings as “phonographic books” and put forward a series of claims for “the advantages of such books over those printed.”31 The emergence of the possibility of a “literary” recording represents a challenge to this accrued assumption that the literary is constitutionally embedded in the visual and silent media of paper and ink. In challenging such a basic assumption about the material media formats in which “literature” can be found, the idea of a “literary” recording raises questions about the methodologies that have been developed over the past century for the purpose of literary analysis, and, indeed, about the institutions (educational and other) that have supported the literary as a theoretical concept.

Spoken recordings of different kinds may be discerned as literary in relation to a host of contextualizing circumstances that need to be unpacked, often on a case-by-case basis. Jonathan Sterne has noted that terms such as “mediality” and “literariness” are “mundane” for their lack of concrete definitional purpose and demand significant work in explicating “the general web of practice and reference” that informs their use if they are to function as meaningful, eventful, and remarkable terms.32 The discernibility of a spoken recording as “literary” may depend as much on when it is approached as such as on the context in which it first appeared as an artifact in the world. This is why it is important to proceed with a critical awareness of our present circumstances of interpretation when embarking on a historical account of the literary recording as an artifact of interest, and why it is equally important to qualify the significance of what might have been understood as “literary” about a recording in specific historical eras.

One of the most significant attempts to articulate the import of the literary reading as an object of critical engagement remains Charles Bernstein’s introduction to a volume of essays he edited, Close Listening: Poetry and the Performed Word (1998). The primary aims of Bernstein’s critical intervention (with its focus on poetry reading, in particular) are to ensure that the sounded work be approached as a primary source for critical consideration rather than as an auxiliary extension of the printed text, and to suggest the possibilities of a new kind of “aural” prosody that takes into account the sonic elements of the acoustic performance of a literary text.33 To this end he introduces the term “audiotext” to describe the artifact in question, that is, “the audible acoustic text of the poem.”34 The audiotext as an object of study demands our consideration of multifoliate (including multiphonic) versions as different performances of a text.

Bernstein defines the poetry reading “as its own medium” characterized by its “anti-performative” or “anti-rhetorical” qualities.35 He focuses on the monovalence of certain methods of poetry reading as definitive of the poetry reading as a medium. In doing so he is working to achieve a powerfully formalist approach to the aesthetic effect of monovalence, as opposed to dramatic or theatrical modes of reading, which interpret and perform literary works differently. The attribution of anti-performative monovalence to the poetry reading associates one historical method of delivering poetry out loud (a method typically used to read and interpret poetry since the 1950s) with poetry reading in general. This generic association of the poetry reading with an implicit lack of spectacle or drama, while relevant primarily to some modern and contemporary reading styles, raises important questions about how we should go about historicizing methods of poetry reading along a longer historical timeline, and how to explain why particular anti-expressivist methods of verse speaking are now so prevalent. One of the purposes of the present study is to provide a prehistory of the perceived anti-theatricality of modern and contemporary literary performance. The poetry readings we hear on early spoken recordings do not sound anti-theatrical.

When we listen to an early sound recording of a Victorian actor or elocutionist performing a poem, our first affective response will often be one of embarrassment or amusement. We are embarrassed by the seeming grandiosity of the vocal tactics deployed by the performer to communicate the meaning of the poem as dramatic scenario; we are amused by the excessive amplitude, the crushing pronunciation of consonants, and the stylized vibrato. These are historically conditioned responses resulting in a kind of affective immunity to, or culturally determined predilection against, certain conventions and styles of oral performance and, in some cases, the conditions of recording that demanded such oral features. We are trained to listen to reading and speech for particular kinds of affective cues that communicate qualities we are supposed to expect and appreciate according to our socio-historical location. Thus, in approaching early literary recordings, as critics, we are necessarily engaged in negotiating our historical neglect of certain styles of literary performance, and in attempting to reconstruct an understanding of the affective importance these reading styles may have had, despite such neglect.36

The sound of early literary recordings is informed by alternate conceptions and methods of how to vocalize the literary, but it was also informed by the media context in which those vocalizations were produced and captured. While no sound recording offers a transparent or unmediated record of a performance event, early sound recordings demanded greater accommodation of the affordances of the recording technology and preservation media than those made after the widespread use of tape recording. Even for an amateur home recordist working with an Edison phonograph at the turn of the century, there were many technical and performance considerations to take into account to achieve an audible sounding recording. In a chapter entitled “The Secret of Making Phonograph Records” from the Openeer Papers (1900),37 the author provides a litany of considerations, ranging from the adjustment and adaptation of the recorder, the thickness of the diaphragm to be used, the state of the rubber washers that support the diaphragm, the tightness of the diaphragm clamp, the shape and material qualities of the horn, the condition of the recording cylinder, the functioning of the phonograph motor works, the acoustical effects of the furniture and draperies in the room, and, of course, the techniques of speaking into the recording tube or horn. To make even a basic speech recording demanded significant technical experimentation and artistry from the recordist and performer.38

The story of the professional recording studio during the era of acoustic recording tells of a space that gradually transformed itself from inventor’s experimental laboratory to one designed for the implementation of acoustical expertise by professional “recordists” (the predecessors of latter-day sound engineers) who understood how best to place musicians and vocal performers, and how to select from a vast assortment of diaphragms and speaking horns “depending on the type of performance, the humidity of the air, or any of a host of other factors.”39 It also tells of the emergence of a new class of performers from whom the industry demanded recurrent perfectionism in vocal delivery as they faced the technically limited capabilities of the acoustical recording apparatus and spoke or sang. These studio recording artists were required to adjust their vocal style, physical posture, and movements in relation to such limitations. As Susan Schmidt Horning has noted, this necessary consideration of the phonograph’s technological affordances may have “inhibited spontaneity by forcing the performer to divide his or her concentration between artistic interpretation and recall of the ‘staging’ required before the recording horn.”40 The recording artist was performing for the machine at the same time as he or she was singing or declaiming to an unseen human audience, thus enacting a new and sophisticated form of acoustical staging in recorded performance. The emergence of the professional recording artist during the early period of recorded sound has been identified as primarily an American phenomenon,41 with stage performers doing most of the commercial recording in England and the rest of the world. In all cases, the early spoken recording offers itself as an audible representation of the spatial location of a speaker in several senses. 
There is the speaker before the acoustic recording horn or later (in the 1920s) a carbon or condenser microphone for electrical recording, in the first instance, and there is the speaker of the text whose speaking location and situation are depicted through performance in the recording, sometimes explicitly, as in recordings that present descriptive sketches, dramatic scenes, or character speeches set in imagined locations, and sometimes more subtly, as in recordings of lyric poems. Sound recordings are best understood not as reproductions but as representations of three-dimensional events, and consequently the role of the critic is to attend to and analyze “the representational capabilities of sound recording.”42 Early spoken recordings are immanently engaged in an artful representation of the spatially located scenarios they envoice.

Audiotextual Genres: Micro and Macro

The generic features of early audiotexts are largely discernible in their located sound, often as part of a process of excerption (or entextualization), recontextualization, and generic consolidation in performance. Richard Bauman usefully defines “entextualization” as a process of “bounding off a stretch of discourse” and “endowing it with cohesive formal properties” so that it becomes objectified and “extractable from its context of production.” Recontextualization, he continues, “amounts to a rekeying of the text, a shift in its illocutionary force and perlocutionary effect—what it counts as and what it does.” In the case of early spoken recordings, recontextualized speech is generically instantiated in recorded performances, acts of expression “framed as display” and open to “interpretive and evaluative scrutiny by an audience both in terms of its intrinsic qualities and its associational resonances.”43 The generic features of an audiotext thus become discernible in the located, contextualized sound displayed in a recorded speech performance.

There are two dominant, mutually informing scales by which we can effectively conceptualize genre for audiotexts. There is the granular scale of the elocutionary microgenres that informed particular moments and performative motifs within recorded readings; and there is the macro scale that encompasses broader generic categories of literary recordings as they were developed and organized for commercial markets and in cultural communities of use.44

To focus on the more granular scale first, audiotexts as artifacts of interpretation invite the development of a theory of elocutionary microgenres that focus less on the print-based generic categorization of the originating literary work (i.e., whether it is a poem, a play, or a novel, etc.) than on the affective forms developed as discernible speech genres for the purpose of communicating meaningful forms of character, thought, and emotion with the human voice. These audiotextual microgenres entail audibly discernible, affective forms that are untethered from the printed generic forms that may have been used to categorize and describe the performance text as it appeared on the page. For example, in elocution manuals that presented techniques for vocalizing Tennyson’s “The Charge of the Light Brigade,” the lineated poem and its metrical characteristics that might have been used to define the printed text generically, as, say, an example of dactylic occasional verse or a commemorative battle poem in a martial meter, were dissolved into prose paragraphs with instructions to consider it as a collection of expressive parts, affective enunciations defined by “contrasts, oppositions and changes in movement” that “have a certain relation to the spirit of the whole.”45 While the spirit of the whole is acknowledged, it is not always the primary point of formal focus in the context of oral interpretation. Audiotextual elocutionary microgenres disrupt the idea of literary genre as entities in a hierarchical structure of transcendent forms. The conception of prosody that informs their study is sonic rather than visual, and their generic forms can be excerpted as utterances and effusions without losing a coherent generic status in their own right. We can discuss oratorical microgeneric categories of emphasis, amplitude, force, and pitch as occurring in a particular recorded performance or across a range of recorded performances. 
Audiotextual microgenres are always located in a speaking context and are implicitly loquacious.

When thinking about speech in such microgeneric terms, it is useful to remember that even everyday speech acts can be said to have identifiable formal, generic features, and that these forms arise from formulaic scripts and the forces that inform contexts of utterance. Mikhail Bakhtin argues that the expression of an utterance “always responds to a greater or lesser degree, that is, it expresses the speaker’s attitude toward others’ utterances and not just his attitude toward the object of his utterance.”46 This granular conception of the genres of speech focuses as much on the situational positioning of the speaker as on the relationship between intonation and the thematic content of the speech itself. The numerous attitudinal changes that occur within a longer speech have been described by Erving Goffman as shifts in a speaker’s “footing,” wherein the speaker’s vocal stance or alignment can be discerned through analysis of “sound markers” such as “pitch, volume, rhythm, stress, tonal quality” and other features, which are often shorter than a grammatical sentence and so entail generic units that are “[p]rosodic, not syntactic.” Goffman suggests using the term “phonemic clause” to describe such microgeneric units of speech, observing that a change in “footing is another way of talking about a change in our frame for events” and such changes are a “persistent feature of natural talk.”47

Early spoken recordings are not, of course, spontaneously situated, responsive expressions of speech, but, on the contrary, were often heavily planned and highly formulaic in their structure and delivery. The formulaic and sometimes overdetermined nature of speech recordings can be understood as compensatory for the absence of an immediately identifiable situation justifying the speech heard, or, as Patrick Feaster has put it, “phonography [speech writing/recording] . . . implies the existence of techniques for overcoming that disorientation to render radically decontextualized sounds intelligible.”48 In the absence of quotation marks and other conventions of depicting speech and dialogue that had been developed as print conventions in the novel, without the visual aid of elocutionary gesture, and without the cues of costume, stage set and a cast of characters to situate a particular three-minute speech within a much broader dramatic context, new generic tactics and formulae were required to help the listener situate speeches heard emanating from machines in meaningful ways. Marjorie Garber has discussed how speaking a quotation before a live audience reminds us that “writing is displacement,” and that quoting in speech represents a kind of performance of the inherent displacement and ensuing anxieties about the relative authenticity, authority, attribution, and opacity of all speech acts.49 Discussion of the reading format or the genre of the early spoken recording entails consideration of the tactics deployed to efface displacement, compose anxiety, and defray the costs of phonographic speech.

While traditional literary methods of generic designation, whether based on fixed outer forms (e.g., the metrical and rhyming characteristics of the sonnet), subject matter (e.g., a funeral elegy or epithalamion), intended effect (e.g., horror) or inner tonal elements such as the piece’s attitude or valence (e.g., satire, irony),50 will certainly inform our discussion of the generic format of an early literary sound recording, such aspects of the critical idiom surrounding literary genre will always be supplemented, and sometimes drowned out by, the audiotextual codes and restrictions that work to inform the framing of a recorded speech with the aim of shaping the overarching expectations of a listener. Feaster usefully identifies a few clear examples of such tactics of broad or macrogeneric audiotextual formatting, ranging from explicit introductory announcements that introduce, name, or describe the sounds to follow to the regularization of formats informing more complex audiotexts that combined multiple sonic elements, such as “descriptive” and “minstrel” records, audiotextual formats that quickly developed holistic generic meaning in the early culture of sound recording.51

Early record catalogues can be informative for understanding this broader, macrogeneric scale by which early spoken recordings were delineated. An Edison-Bell Consolidated Phonograph Company record list from 1898 reveals that numerous phonographic genres—dialect pieces, comic monologues and dialogues, descriptive sketches, historical accounts, nursery rhymes, burlesques, political and topical speeches, memorial statements, dramatic scenes, prayers, talks, and many others—were all organized under the broad rubric “Recitations, Speeches and Dialogues.”52 This two-and-a-half-page list extracted from a twenty-one-page catalogue provides a fairly comprehensive sense of the range of genres of voice recordings that were being made and sold in Britain at the end of the nineteenth century, even though we will not likely hear many or any of these specific recordings, do not know whose voices were heard on the records, nor always what a title on the list might have referred to when realized as an actual sound recording.53 This 1898 record list shows an interesting combination of either duplicated or imitated American dialect character recordings (a “Negro Dialogue,” six Uncle Josh the rube, eight Casey the Irishman, and seven Schultz the German recordings in series that were originally popularized by the American recording artists Cal Stewart, Russell Hunting, and Frank Kennedy, respectively), and records aimed more specifically at an English audience, including a set of seven Prime Minister William Gladstone–themed cylinders, a “Hyde Park Socialist (Burlesque),” and a few records based on the work of the English music hall or Punch magazine humorists H. G. Snazelle, R. G. Knowles, and Douglas William Jerrold.

All the recordings mentioned thus far fall into the categories of “Speeches” and “Dialogues,” but “In Memoriam—Tennyson,” also on the list, would most likely have been seen as a “Recitation.” We have no information about the reader of the poem, or what sections comprised this recording, but the nature of the printed text, and its identification with a past poet laureate—it is the only item on the list that provides both a title and author’s name—suggest recorded performance of literary, elocutionary culture. Other records on the list that would have been identified with literary recitation are a set of six Shakespeare pieces, including “Marc Anthony (Julius Caesar),” “Hamlet’s Soliloquy,” and “The Seven Ages of Man” from As You Like It; a “Selection from ‘The School for Scandal’” by Sheridan; “The Lord’s Prayer and 23rd Psalm”; and “The Sermon on the Mount.”54

One important sense of the word “recitation” at the end of the nineteenth century was that of an elevated and artistic performance of a text recognized as a source of serious expressive value. There were many different kinds of recitation and recitation records in 1898, but what I am identifying here as literary recitation implied a certain combination of aesthetic authority, gravity, and elocutionary prowess that, together, would have an important cultivating effect on the listening subject, or, better, citizen. The qualities Raymond Williams identifies with an eighteenth-century conception of poetry—namely, “the high skills of writing and speaking in the special context of high imagination”—persisted in a nineteenth-century conception of literature and the literary in early literary recordings.55 Recitation as a broad audiotextual generic category might have covered a thematically diverse range of texts, but from an elocutionary perspective they would have been perceived as generically related.

Elocutionary records and literary recordings were not necessarily the same thing, but these terms often converged to define a particular generic category of early spoken recording. A slightly earlier example of record cataloguing illustrates this point. Emile Berliner’s United States Gramophone Co.’s earliest known American “List of Plates in Stock” (November 1, 1894), offering forty-nine recordings, identifies thirteen record categories: Band Music, Instrumental Quartette, Barytone, Clarionet (sic), Cornet, Drum and Fife, Trombone, Piano, Children’s Songs, Indian Songs, Soprano, Recitation, and Vocal Quartette.56 Two of the categories are generically specific, identifying the genre of song—“Children’s” and “Indian”—offered on the record. The one title listed under the category of “Recitation” is “Marc Anthony’s Curse: A Lesson in Elocution.”57 Speeches and selections from Shakespeare provided much material for the recitation category of early record catalogues (as the Edison-Bell Consolidated Phonograph Company list shows). Our focus at the moment is on the significance of the subtitle “A Lesson in Elocution” in relation to the recording of a speech from Shakespeare’s Julius Caesar, act 3, scene 1. What was a lesson in elocutionary performance, and what did it have to do with the performance of a speech from Shakespeare? In an expanded list of Berliner plates from a few months later (January 1895), another title was added to the Recitation category, with this note:

We have for this important department
secured the co-operation of the eminent ver-
satile elocutionist, Mr. David C. Bangs.
602 Mark Anthony’s Curse
A Lesson in Elocution
600 The Village Blacksmith
(Many others in preparation.)58

The addition of Henry Wadsworth Longfellow’s poem “The Village Blacksmith” to the list, the identification by name of the eminent elocutionist employed—the only name to appear on the expanded list of some eighty-five recordings—and the suggestion that this was a growing department of the company’s inventory suggest that The United States Gramophone Co. was beginning to discover the potential of the gramophone as a literary and pedagogical medium. As Catherine Robson has shown, in both America and Britain at the end of the nineteenth century, “recitation” meant a studied, refined literary performance. The end-of-term public “Examination” or “Exhibition Day” at many American schools was also referred to as a “Recitation” featuring different forms of scholarly performance—grammar, spelling, arithmetic, and penmanship—and culminating in “the individual recitation of poetic and oratorical selections.”59 A recorded recitation of a selection from Shakespeare or Longfellow by an “eminent, versatile elocutionist” was an artifact that captured a standard of excellence in oratory and literary interpretation, two areas that were still a significant part of the curriculum at the turn of the century.

The way the elocutionist, David C. Bangs, is described is worthy of note. The alleged eminence of Bangs echoes the power of that descriptor when used to define the significance of recordings by eminent figures, such as the eminent statesman Gladstone, and suggests that Bangs’s performances will function as models worth preserving in one’s personal library of voices. His advertised versatility also suggested that he was capable of delivering exemplary performances of spoken pieces of all genres and elevations, and not just those of the lofty, literary and elocutionary variety. Indeed, between 1894 and 1895, Bangs made the two recordings already mentioned, as well as recordings of a comic monologue, a children’s record, and a monologue in high elocutionary mode from the gramophone’s own personified first-person perspective,60 and in 1896, he recorded “The Lord’s Prayer and Psalm 100.”61 This is a modest repertoire compared to other recording artists of the period, but its generic range bears out the versatility advertised alongside his elocutionary eminence.

The distance between a recorded selection from Shakespeare, a poem by Longfellow, and a prayer or psalm was not great if understood within the context of elocutionary delivery and the moral and cultural purpose of generic literary recitation at this time. As Joan Shelley Rubin has shown, there was a strong “congruence of the form of the recitation in both church and public schools,” which rested on a “liberalized theology” that identified the moral power of poetry with that of prayer.62 This congruence can be heard in the techniques used by recording artists to perform poems and prayers and would explain why early recordings of poetry and biblical passages often share the grave sounds of prolonged vowels, recurrent, slow-falling intonation, occasional vibrato or trilling, and ubiquitous rolled r’s.63 That these same grave elocutionary techniques were parodied in comic records underscores the degree to which the “literary” recordings and elocutionary recitations were in constant play with numerous other genres of spoken performance that mimicked, challenged, reinforced, and defined the emerging sound of the literary. As we listen to the wide range of speech recordings discussed in this book, and consider how they inflected their mutual significance on the historically located continuum of speech sounds, the sound of the literary in early spoken recordings will become increasingly audible and critically discernible to us.

Phonopoetics tells the story of the early period of spoken recordings with the aim of explaining the emergence of what I have been calling the literary recording as a differential cultural artifact, that is, an artifact that represents one recognizable, material manifestation of literary expression, meaning, and practice among other related artifacts, ideologies and practices. In doing so, this book necessarily challenges certain textual and visual assumptions that inform contemporary literary criticism by taking the recorded text—the heard audiotext and the sounded “phonotext”—as a primary object of analysis. To explain what early literary recordings meant as aesthetic and cultural entities, I interpret such things as the early promotional fantasies about the phonograph as a new kind of speaker; early sound adaptations of novels, poems, and plays by a variety of actors, elocutionists and recording artists; initiatives to use the phonograph for teaching elocution and as a means of achieving a heightened literary experience; and the voice archive as a new form of cultural memory. Throughout the book I engage in audiotextual “interpretations” of spoken records representative of a range of genres, and especially of recordings that illustrate historical performances of literary interpretation, refined speech, and cultural fluency. Even as it maps and enacts methods of phonopoetic critical practice through the synchronic historical location of particular case studies, this book will also present a diachronic account of the changing techniques and styles of reading literature out loud as audible in literary recordings, from Tennyson to T. S. Eliot, and the great variety of voices that were recorded and heard in between.

Chapter 1 unpacks the early promotional discourse surrounding the phonograph as a medium of natural fidelity and then situates this idea of the phonograph as a medium without corrupting accent in the context of popular recitation anthologies in order to identify some of the elocutionary preconceptions that informed the vocal performances heard in early spoken recordings. In explaining the formal and cultural affinities between late Victorian short spoken recordings (testimonials, dialect monologues, and literary recitations) and the brief texts meant for speaking aloud that were collected in nineteenth-century recitation compilations, I provide an account of the preconceived notions surrounding the meaning of this new recording medium, in general, and the significance of sound-recording technology for the performance of literary texts, in particular.

Continuing the previous chapter’s discussion of the generic categories informing early spoken recordings, chapter 2 focuses on the development and production of some of the earliest sound recordings based on the novels of Charles Dickens. The Dickens recordings of Bransby Williams and William Sterling Battis stand as the earliest fiction-based audio adaptations produced specifically for pedagogical application, and thus represent an interesting bridge between earlier conceptions of the talking record as a novel form of popular entertainment, and the later, pedagogically motivated category of the literary recording. One key element of this historical transition from “talking record” to “literary recording” is the identification of the sound-recording material with the print book. The recordings examined in this chapter also serve as a useful focus for speculation about the particular kinds of literary adaptation, condensation, entextualization, and recital that resulted from the earliest recordings that were produced specifically for such pedagogical application. While the story of Battis’s recordings must, significantly, begin with Dickens himself, both as a novelist and as an adaptor, public reader, and performer of his own work, the overarching trajectory of the plot pursued in this chapter moves from the Lyceum Stage upon which Battis made his reputation as a Dickens impersonator, to a discussion of the context in which a certain kind of public, popular entertainment (with pedagogical motives) was redirected and condensed into a new argument for literary encounter in the classroom, and into an early form of what we now call educational technology.

My discussion of literary recordings in the context of pedagogy continues in chapter 3 with analysis and historical location of recordings made between 1890 and 1920 of Alfred Tennyson’s “The Charge of the Light Brigade.” While considering the methods of performance that would have informed the production of these acoustic records, including discussion of the audiotextual genre of the dramatic recitation with orchestral accompaniment, this chapter locates the modes of recitation heard in these recordings—ranging from my return to Tennyson’s own recording of 1890 to early twentieth-century recordings of the poem made by specific elocutionists and actors—within debates surrounding methods of elocution and verse speaking from the period. My discussion of late Victorian methods of “dramatic” interpretation, as elaborated by Samuel Silas Curry in his 1896 book Imagination and Dramatic Instinct: Some Practical Steps for Their Development, opens into a longer genealogy of such methods of oral interpretation as a legitimate approach to literary study in the 1940s and 1960s and considers the import of New Criticism as a method of literary interpretation that worked to silence oral performance and the study of literary recordings in the classroom. With this genealogy of oral interpretation and the impact that New Criticism had on it explained, the chapter then considers what might be lost in our understanding of Tennyson’s occasional poem when the voice is omitted from the process of interpretation, and explores the potential of digital speech-analysis tools to help us to fix and visualize specific elocutionary, prosodic features of these recordings of “The Charge.”

Chapter 4 offers a series of interpretive takes on T. S. Eliot’s 1930s electrically recorded voice experiments in rendering his poem The Waste Land. These discussions provide a means of comparing what Victorian elocutionary delivery sounds like in relation to Eliot’s attempts to invent a manner of reading that is appropriate for the delivery of modernist poetry. I first provide the context in which the 1933 recordings were produced and consider the significance of that context of production, and the media format he recorded with, for the nature of his experiments in reading. I then situate Eliot’s audible reading experiments within contemporary debates surrounding the English verse-speaking movement, and Eliot’s own report, written for the BBC, on how poetry should be read out loud for the purpose of recording. In its third and final interpretive take, the chapter moves into a formal analysis of Eliot’s reading experiments by focusing on Eliot’s attempts to discover a way to read The Waste Land through recorded experimentation with duration and amplitude, as well as a series of techniques of nonsemantic phrasing and intonation, the use of monotone, and the cultivation of a harmonically rich drone in speech. In my analysis of the sonic elements of Eliot’s early recording experiments in the context of historical theories of performance and literary interpretation, I argue that the abstract conception of “voice” that functions as an organizing principle in New Critical discourse is performed in Eliot’s recorded readings as a subtle alternating use of authoritative epic speech, lyrical modulation, and localized dramatic scenario, within an organizing method of incantation that evokes the possibility of an overarching oracular or otherworldly voice.

Segueing from this final chapter’s discussion of a poet’s experiments in performing and capturing the sound of the oracular voice, I conclude Phonopoetics with an exploration of conceptions of voice preservation and models of the voice archive, arguing that the stress placed in early ideas of the voice archive upon the materiality of the audible artifact, and the event-oriented scenario of its use, represent useful points of departure for a historically motivated theorization of the voice recording and voice archive at the present time, specifically in relation to the impact of digital media technologies on the status of the record and its archive. I conclude by thinking about how the analogue artifact of the sound archive has shaped our ideas and expectations about what a digital repository should be, reflecting on the status of our artifact of study as we move increasingly from the study of material media artifacts to virtual instantiations of the signals those media may once have held.


