Introduction

This project thinks about the ways in which computers speak – and we talk to them – today. In so doing, it examines the relationship between communicational forms, interactivity, and digital technologies through the lens of literary studies. The material is presented in ways that encourage you to think about your role as a user/reader – and reading as a communicative and interactive process.

Talking interfaces

My title has a double meaning. In its narrower meaning, stemming from computing, it refers to a means of structuring human-computer interaction (HCI). User interfaces that deploy “talk” or natural language as their primary communicational mode have become popular in recent years, as the prevalence of conversational agents like Siri, Google Now, and Alexa demonstrates. Rather than deploy a mouse in a graphical user interface, we are increasingly asked to use speech to operate the (social media) software.

The dominance of talk interfaces is not restricted to their technical prevalence. Their significance is cultural, social, economic, political, and environmental. Talk interfaces function today as the apparently successful instantiation of a much longer dream of frictionless communication, whether between humans across distances, between humans and computers, or between humans and artificial intelligence. Although they may not (yet) demonstrate consciousness, we now, so this fantasy goes, have the means of talking with a future Hadaly or HAL 9000. The social and cultural significance of talk interfaces is also seen in the prevalence of popular communication tools like WhatsApp, Slack, and WeChat. While based mainly on written communication, they seek to co-opt the qualities that a talk-based interface supposedly offers through their generic descriptor as “chat” applications.1,2 Why? What is it about talk and talk interfaces that speak so strongly to our digital moment? This query drives the second, broader sense of my title. By talking interfaces, I seek to articulate the reasons why talk is held up (often unconsciously) as the ideal form through which communication can occur – and what this recognition tells us about our Ego Media environment.

In talking about talk interfaces, I draw on digital scholar Lori Emerson’s capacious notion of “interface”:

While interface is a productively open-ended, cross-disciplinary term, generally speaking in computing it refers simply to the point of interaction between any combination of hardware/software components. . . . I settle on an even more expansive definition so that interface is a technology – whether it is a fascicle, a typewriter, a command line, or a GUI – that mediates between reader and the surface-level, human-authored writing, as well as, in the case of digital devices, the machine-based writing taking place below the gloss of the surface. The interface is, then, a threshold.

Lori Emerson, Reading Writing Interfaces, x.

Emerson’s definition usefully enables us to move across digital and analog technologies. In so doing, it allows us to place this analysis of the digital phenomenon of talk interfaces into a longer history of inscription, mediation, and communication practices, thus enabling us to identify the past correspondences and novelties of our current situation.

Emerson’s valuable definition also inadvertently indicates the paradox at the heart of my subject: the interface relies on rhetoric of vision – “gloss,” “surface,” and the viewing “face” of interface. Talk, however, is primarily an aural and oral medium. Historically Western cultures have tended to present speech and writing as opposites – what Walter Ong influentially, if restrictively, described as the “differences in ‘mentality’ between oral and writing cultures.”3,4 In societies that have long privileged writing, talk maintains an uneasy, often denigrated cultural position; nevertheless, current enthusiasm for talk interfaces suggests that this medium has continued and even increasing value. There is something about talk that we want to preserve for our digitally mediated interactions.

We can begin to identify some of talk’s appeal by returning to Emerson. She continues her discussion by noting that the interface is not simply a point of access: it conceals as often as it reveals. “The dream in which the boundary between human and information is eradicated is just that – a dream that the computer industry rides on as it attempts to convince us that the dream is now reality through sophisticated slights of hand that take place at the level of the interface.”5 Emerson gives the example of Google Glass as symptomatic of this dream – a supposedly interface-free device – but, in a nod back to my first meaning, it is talk interfaces that provide the most prevalent example of this trend today. These interfaces purport to offer the dream-become-reality of frictionless communication: talking to Siri seemingly involves no interface at all.

And yet. A talk interface, like any interface, can conceal (distort and muffle) as often as it reveals (isolates and amplifies). An interface that relies on natural language processing or voice recognition still mediates. It still, crucially, acts as a threshold between the user and an underlying program that in turn shapes the kinds of interactions that are operative and those that are not. The talk interface offers a promise that it does not and cannot deliver: talk unconstrained by the program. In exposing this dream for what it is, I am not attempting to criticize the notable advances in software and processing in this area, nor the significant affordances that such interfaces offer. Instead, I examine talk interfaces for what they can tell us about the broader processes – mediative, communicative – shaping our society today, whether that be collective dreams around perfect communication, or the overreliance of digital cultures criticism on a graphic and visual rhetoric. (What do we miss by relying on a vocabulary of layers, surfaces, and friction? What might attending to talk teach us about our disciplinary assumptions and blindspots (acoustic limits)?6)

It should be clear by now that my understanding of talk is, like my adopted conception of interface, expansive. I conceive of talk as a communicational technology with its own forms, affordances, and “imagined affordances.”7,8,9,10 In a basic sense and as I indicate above, talk is often defined by what speech is not: writing. Such a definition brings with it a number of associated characteristics: speech as authentic, ephemeral, informal, face-to-face, spontaneous, and unmediated. Such stereotypes rarely hold; however, they work collectively to position speech itself as retaining certain affordances – cultural and technological.

I define talk both more narrowly and as containing broader affordances than speech. Narrowly, talk is speech with an additional interactivity implied. This qualification enables talk to retain speech’s affordances, while also opening it up to wider cultural significance. Talk’s orientation towards interactivity encourages us to think about the work that it can do, whether as a form of communication; a process of subject constitution; performative, operative, or executable language; or a representational or creative process.11 Our cultural understandings of talk are therefore inextricable from foundational assumptions about the nature, expression, and function of the body politic, the public sphere, and the individual in Western cultures since the sixteenth century. My point then is that talk is a technology that historically has made things happen; it has been peculiarly executable.

Today talk is a technology that continues to do things; but what talk does and what a talking interface does are not equivalent. Talking to Siri is, in significant ways, not the same as talking to a person. As often as the process is irritating, surprising, amusing, or banal, interacting via a talking interface reveals the limitations of talk as it is currently conceived within digital technologies. We might point to technical limitations – trouble with accent, a query without a response, a conversational loop accidentally entered, or a metaphor taken literally. We might also point to contextual problems – dialect, idiom, deixis, or phatic speech misunderstood (or, sometimes more annoyingly, aped), cultural norms broken, a register or tone mismatch.12

But it is more than that. It is notably less satisfying talking with Siri or Alexa than with a person with limited language skills or a lack of shared cultural knowledge. Talking to a conversational agent distorts some of talk’s affordances just as it amplifies others.

Granted, talk’s affordances, perceived and actual, have shifted in changing media ecologies. Since the late nineteenth century, with the advent of recording and broadcast technologies, such as the gramophone, radio, film, and television, talk has not been limited to being a face-to-face, ephemeral communicative strategy. I have examined some of the implications of this through the lens of interviews – an inscription technology and form that arose in the late nineteenth century and fundamentally altered our understanding of talk’s affordances – in my book Literature and the Rise of the Interview.13 The import of this and other analog talking interfaces for our understandings of talk and, through it, subjectivity, interpersonal relations, the operations of the public sphere, and democracy has been extensively debated.

For scholars of literature and life writing such as myself, talk has particular methodological interest: long functioning as an important but difficult to archive paratext, talk is now newly capturable, with implications for those disciplines. With conversation no longer ephemeral or the sole remit of human actors, what are the cultural ramifications of this material shift? What is the import for literature’s status itself within this new media ecology? Moreover, given that scholars in these fields have tended to demonstrate discomfort in analyzing collaboratively authored or uttered texts, analyzing talk interfaces brings important questions to the fore: How do we think about the role of interactivity in the meaning-making process? How might we think about interactivity in relation to mediation? How might we productively utilize theories developed in (new) media studies to think about talk and its representation in literature? What are the affordances of conversation? The protocols of talk?

Today’s digital talk interfaces offer a contemporary instantiation of a longer historical trend, but they also seem to upend it in crucial ways that are vital to our understanding of cultural, political, and social lives today. While historically talk has been premised on human-to-human communication, today’s talking interfaces change this dynamic. They introduce the machine as the primary interlocutor. More than this, they have the potential to produce a non-agential subject – the human user or interlocutor – via interactivity that is preprogrammed or enables only what Mark Hansen describes as a “feed forward” subjectivity.14 Given the historical significance of talk as a technology, it is vital that we reflect on the forms of talk interfaces we are building (and who is able to build them and who not). We need to critically examine the effects such interfaces have on our interactions and via them, our ability to constitute subjects and publics, to express, create, execute, and communicate. We need to talk about talk interfaces.

But first, let's talk about Writing Talking Interfaces.

Endnotes

  1. Responding to the common and unhelpful hierarchization of writing and orality in Western thought, Ngũgĩ wa Thiong'o promotes the concept of “orature,” an oral system of aesthetics with no hierarchical relationship to literature and notes, “The language of cyberspace may borrow the language of orality, twitter, chat rooms, we-have-been-talking when they mean we-have-been-texting, or chatting through writing emails, but it is orality mediated by writing. It is neither one nor the other. It’s both. It’s cyborality.” Ngũgĩ wa Thiongʾo, Globalectics: Theory and the Politics of Knowing, Wellek Library Lecture Series at the University of California, Irvine (New York: Columbia University Press, 2012). 84.
  2. See also Édouard Glissant, Poetics of Relation, trans. Betsy Wing (Ann Arbor: University of Michigan Press, 1997).
  3. Or “secondary orality,” as in Ong, Orality and Literacy. 2. Ong sketches perhaps the most extreme version of a dialectic that was promoted by other Toronto School of Communications theorists such as Harold Innis, Eric Havelock, and Marshall McLuhan. Numerous scholars have pointed to the limits of dichotomizing speech and writing – whether by noting the interweaving of the oral and the written in cultures dominated by television, radio, and other media (what Ong called “secondary orality”), or by rejecting this written/oral hierarchy as a product of a limited worldview and pointing to the very different operations of the oral and written in non-Western cultures.
  4. For one thoughtful critique of Ong’s ideas, see Jonathan Sterne, “The Theology of Sound: A Critique of Orality,” Canadian Journal of Communication 35 (2011): 207–25.
  5. Lori Emerson, Reading Writing Interfaces: From the Digital to the Bookbound, Electronic Mediations 44 (Minneapolis: University of Minnesota Press, 2014). x–xi.
  6. The growth of sound studies has done significant work in attuning scholars to the complex relationship between sound, mediation, and its cultural meaning. In addition to Rob Gallagher’s contribution in this digital book on voice, see particularly Jonathan Sterne, The Audible Past: Cultural Origins of Sound Reproduction (Durham, N.C.; London: Duke University Press, 2003).
  7. My definition of form incorporates literary critic Caroline Levine’s notion of form as having matter and doing work. Caroline Levine, Forms: Whole, Rhythm, Hierarchy, Network (Princeton, N.J.: Princeton University Press, 2015). 14–23.
  8. My definition of form also draws on Matthew Kirschenbaum’s concept of formal (as distinct from and additional to forensic) materialism. Matthew G. Kirschenbaum, Mechanisms: New Media and the Forensic Imagination (Cambridge, Mass.; London: MIT Press, 2008).
  9. My use of the term affordance aligns with its deployment in media studies as a means of charting a middle path between technological determinism and social constructivism. Scarlett and Zeilinger provide a useful summary of multidisciplinary uses of the term and current challenges in its theorization for digital technologies. Ashley Scarlett and Martin Zeilinger, “Rethinking Affordance,” Media Theory 3, no. 1 (August 23, 2019): 01–48, http://journalcontent.mediatheoryjournal.org/index.php/mt/article/view/78.
  10. I draw here on Nagy and Neff's notion of the “imagined affordance,” while significantly augmenting it with a more humanistic notion of imaginative agency. Peter Nagy and Gina Neff, “Imagined Affordance: Reconstructing a Keyword for Communication Theory,” Social Media + Society, September 30, 2015, https://doi.org/10.1177/2056305115603385.
  11. In Forms of Talk the sociologist Erving Goffman usefully gestures towards an expansive definition of talk that encompasses its verbal and nonverbal aspects, its social and cultural affordances, and roles of mediating technologies and forms. Speech Act Theory’s notion of the “performative utterance” is also potentially useful here, although the relationship between it and executable code is culturally murky given (for example) differences in legal protections each can claim. Erving Goffman, Forms of Talk (Philadelphia: University of Pennsylvania Press, 1981).
  12. My analysis is indebted to the large bodies of work dedicated to natural language processing and conversational analysis, which I discuss at more length here.
  13. Rebecca Roach, Literature and the Rise of the Interview (Oxford: Oxford University Press, 2018).
  14. Mark B. N. Hansen, Feed-Forward: On The Future of Twenty-First-Century Media (Chicago: University of Chicago Press, 2015).

Bibliography

  • Emerson, Lori. Reading Writing Interfaces: From the Digital to the Bookbound. Electronic Mediations 44. Minneapolis: University of Minnesota Press, 2014.
  • Glissant, Édouard. Poetics of Relation. Translated by Betsy Wing. Ann Arbor: University of Michigan Press, 1997.
  • Goffman, Erving. Forms of Talk. Philadelphia: University of Pennsylvania Press, 1981.
  • Hansen, Mark B. N. Feed-Forward: On The Future of Twenty-First-Century Media. Chicago: University of Chicago Press, 2015.
  • Kirschenbaum, Matthew G. Mechanisms: New Media and the Forensic Imagination. Cambridge, Mass.; London: MIT Press, 2008.
  • Levine, Caroline. Forms: Whole, Rhythm, Hierarchy, Network. Princeton, N.J.: Princeton University Press, 2015.
  • Nagy, Peter, and Gina Neff. “Imagined Affordance: Reconstructing a Keyword for Communication Theory.” Social Media + Society, September 30, 2015. https://doi.org/10.1177/2056305115603385.
  • Ngũgĩ wa Thiongʾo. Globalectics: Theory and the Politics of Knowing. Wellek Library Lecture Series at the University of California, Irvine. New York: Columbia University Press, 2012.
  • Ong, Walter J. Orality and Literacy: The Technologizing of the Word. London; New York: Routledge, 2002.
  • Roach, Rebecca. Literature and the Rise of the Interview. Oxford: Oxford University Press, 2018.
  • Scarlett, Ashley, and Martin Zeilinger. “Rethinking Affordance.” Media Theory 3, no. 1 (August 23, 2019): 01–48. http://journalcontent.mediatheoryjournal.org/index.php/mt/article/view/78.
  • Sterne, Jonathan. “The Theology of Sound: A Critique of Orality.” Canadian Journal of Communication 35 (2011): 207–25.
  • Sterne, Jonathan. The Audible Past: Cultural Origins of Sound Reproduction. Durham, N.C.; London: Duke University Press, 2003.