The Language of Language Machines

Steward Ryan Healey
Editor Ryan Healey

On its surface, one of the first flavors that follows the idea of attention is payment and exchange. The passing mention of attention can set off rueful memories of childhood, school, and youth, split between too much sugar and screentime: a distant debt of attention mislaid, attention misspent. In English, one pays attention to the teacher because attention can be squandered on something ephemeral, a dead object—a phone, a boy, a passing car—that will not advance your station or ability. Elsewhere, downwind from the Latin attentio, Spanish and Portuguese lend attention, German gifts it, Romanian grants it, and French and Italian do it. Attention in everyday life is not the drill sergeant’s “Atten-tshun!”; it is not Achtung, Atenție, or Attenzione, invoked to steer you away from danger. Instead, attention invokes a sadder, more cerebral passion: the second-order phenomenon known as “opportunity cost.” Attention is always distraction: all the things that could have been done but weren’t. It is finite and tradable, but even more dismal than money, its use is final: it is paid in noncommutative time, so there’s no getting it back. Attention is given, held, arrested, drawn, commanded, lavished, and wasted.

In the twenty-first century, attention has both betrayed and yearned for its pastoral opposite, an Arcadia of undivided focus and time to be found in the thousand-dollar silent retreat, the dumb phone, and the Do Not Disturb mode. Consider the blacked-out office where Jonathan Franzen fills his ethernet port with superglue, or consider the use of Berlin from 2005 to 2019 by the American expat who could force a “digital detox” simply by being brutally out of sync with the turnover time of the English-language internet. Since the late nineties, we have been said to live in an attention economy [1], [2], but we should translate this term out of its airport-bookstore register into the pleonasm of an “economy economy,” because it is clear that “attention” already bears the first-day lesson of any economics class: there are no free goods.

But somehow, the tautology is not crippling but generative. The promise of something non-finite lingers in “attention”: its kernel implies a situation where things could be seen, heard, and known without cost. If attention requires a choice between good screen and bad screen, it also prefigures a situation that does not make this choice. Forget strategies for mindfulness, which merely redistribute attention into this or that tranche: the premise inside the attention economy is that labor alone might outrun distribution, that the right kind of will can dissolve the trade-off rather than manage it. It appears possible to attend to both good screen and bad if you can get your hustle right. This optimistic fatigue is the specific problem for which the attention mechanism became the decisive tool that animated large language models after 2017. The word attention arrives in machine learning after centuries of confusion across overlapping timelines, disciplines, operations, and investments, like a debt restructured for its current use but payable only in currencies no one agreed to accept.

The scholastic attentio—inherited from Augustine’s account of the mind’s restless orientation toward God—implied less an economy of consumption than a hard limit for humans. The “weakness of our minds” cannot stare at an “ineffable light” without getting exhausted, turning to contemplate “lower,” created things, and “refreshing” itself for the sake of continuing to look. [3] But looking at anything was still attending to God, albeit at a more relaxing clip. In the medieval ordo creationis and the Renaissance order of resemblances, narrowing your focus cost nothing: every particular thing reflected the entire order of being alongside it, so attending to the part was already attending to the whole. The Trinity—three and one at once—promised an attention without an economy, but then proceeded to generate, through its own internal logic, every condition the economy would require: hierarchy, fatigue, the possibility of misuse, and the need for discipline from outside. The permission to “refresh” already contained its corruption: if the mind could legitimately turn from the ineffable light to lower things, it could also linger too long there, and separate criteria would become necessary to sort staring at the Son from staring at the sun.

In the eighteenth century, attention calcified into a practice of hard-knuckled, masculine will, of which Samuel Johnson’s diaries stand as a depressed, futile record against “idleness” in the face of a relentless flood of things to read: “half the mind is employed in fixing the attention; so there is but one half to be employed on what we read.” [4] Compare this economy to Juana Inés de la Cruz’s unbothered remark in 1691 that “One can philosophize while preparing dinner. If Aristotle had cooked, he would have written much more.” [5] The text flood had been a long time coming—the likes of Erasmus, Bacon, Locke, and Leibniz had been overwhelmed by a shelf’s worth of books per year—but print reached a saturation point in the mid-eighteenth century when it began to consume itself by producing its own management console: the anthology, the encyclopedia, the digest, the review, the newspaper, and the index. [6] The media fix was self-defeating: to reset the balance of productive memory and wasteful page-turning, a print culture could only print more print.

Alexander Baumgarten’s Aesthetica gave the problem its philosophical form: the human mind could extend its attention across more objects, or intensify it on fewer—but not both. [7] The idea of a sublime glut of printed text reframed the economy of attention as a trade between extensive breadth and intensive depth, and aesthetics became the clearing house for that exchange. What aesthetics offered was not a resolution of text en masse but a valuation against the slop circulating between the hack writers of Grub Street, les Rousseau du ruisseau, and the Lesesüchtige. [8], [9], [10] The printing house ran on urgency: the compositor paid by the sheet, the pirated edition distributed before the injunction, the imitation rushed to market before the original had settled. For this mode of production, intensive, “close” attention became the right kind of intellectual expenditure, and the disciplined reader who spent it well was aesthetics’ output and its alibi. Augustine’s “refreshment” was recast as a failure of nerve. Coleridge exemplifies the possibilities of accumulation opened up by the new economy of intensity: “attention may arbitrarily give vividness or distinctness to any object whatsoever; and from hence we may deduce the uselessness, if not the absurdity, of certain recent schemes which promise an artificial memory.” [11] Attention became the active principle that could light up and animate the sluggish morass of a print storehouse, one that already foreshadowed the later collapse between memory and storage that has come to typify digital media. [12]

What Coleridge describes is the founding charter of literary criticism, whose “byword,” as Jonathan Kramnick and Anahid Nersessian observe, would become “attention: the practice of attending to a text or artwork or else the quality of attentiveness in a critic.” [13] Later cycles of development in literary study concentrated on prescriptions for attention (close reading, new historicism, deconstruction, expansions or destructions of the canon) organized around diverse hermeneutic or ethical goals, but always roleplaying a protagonist of intensity who has a unique grasp on cultivating what one critic calls “the assertion and the disavowal of the will,” in order to meet another critic’s claim that “literature is the record of intelligence sequences … where sequences go, if they are hard to keep straight or are prone to misinterpretation.” [14], [15] The literary critical defense of attention makes its bread and butter on the promise of a salve against the consumption practice betokened by new media of breadth, compression, saturation, and distraction. Even if the once vulgar novels of Eliza Haywood or the still vulgar episodes of Love Island can be safely folded into a discipline of attention (even the “lowest” object demands attention), the approach often requires an ironic or winking juxtaposition of method and object: a confession of guilty pleasure, or the acknowledgment that the gap is itself a sophisticated performance.¹ The attentive practice always carries a background assumption: the critic knows, as Augustine did when taking his breathers, that the purpose of attention’s intensity nonetheless derives from an unspoken ineffable light.

  1. adjacent to this tradition: gaze, care, fugitivity, the critique of “normative attention”

  2. outside of aesthetics/the university: labor and attention, quick note from Marx

Aesthetic attention supplemented a larger division of intellectual and manual labor within the emergent modern university and beyond it. Attention in the factory became an essential work habit: “A purposeful will is required for the duration of the work. This means close attention [Aufmerksamkeit],” writes Marx on TK, “the less he is attracted by the nature of the work … the closer his attention is forced to be.” [19]

  1. [the attention of intensity in intellectual labor] is of course not the tradition picked up by psychology, although it inherits its economic form—it complements the exchange described at the scene of labor above

  2. Some 19th century psychology culminating in James:

When William James wrote in his chapter on attention in The Principles of Psychology (1890) that “everyone knows what attention is,” he naturalized the economic form as an integral feature of minds rather than a symptomatic reaction to media surplus or to the demands of the company clock. Nearly seven decades later, Donald Broadbent’s filter theory (1958) gave the idea of the bottleneck its engineering specification: a limited channel, a queue, a degraded signal for what couldn’t get through. [20] What followed was decades of debate in cognitive psychology about where the filter sat. Anne Treisman’s attenuation model (1964) said the filter doesn’t block but weakens: unattended channels get through at reduced strength. [21] Daniel Kahneman’s Attention and Effort (1973) replaced the filter with a resource—attention as a general capacity that can be divided and flexibly allocated—but kept the scarcity assumption intact. [22] Michael Posner endowed attention with a spatial metaphor in the 1980s: a spotlight moving across the visual field, one region lit up at the expense of others. [23] The psychological tradition exhibits a series of refinements to the bottleneck’s location and its mechanism, each reaching for a new metaphor the way economics reaches for new models—of games, information asymmetry, nudges, signals, equilibria—while keeping scarcity axiomatic.

  1. Behaviorism: Skinner, Hebb (check Hebb’s citations)
  2. Rosenblatt (“selective attention”, heavily derived from Hebb)³
  3. Hopfield 1982, Rumelhart et al 1986, PDP 1986
  4. Sequence modeling: Elman 1990, Bengio et al 1994, Hochreiter and Schmidhuber 1997, Cho et al 2014
  5. Bahdanau et al 2014 compared to Sutskever et al 2014
  6. Vaswani et al 2017
  7. LLM painlessly described, concept of economy reconsidered
  8. Economy retained, but relocated onto usage capacity? Daily, weekly Claude limits. Looking toward new economy, new metonymy, new conjecture.

Bibliography

  1. Late-twentieth-century fiction, poetry, and literary criticism might be seen as a bygone high watermark of aesthetic intensity in writing, when each genre felt obliged to create texts of great, performative density to be patiently attended to rather than open plains of frictionless, internet-pilled text-units. Fredric Jameson relished the pleasure of “handicraft satisfaction, even in the composition of abstract theory” because “the overproduction of printed matter and the proliferation of methods of quick reading, intended to speed the reader across a sentence in such a way that he can salute a readymade idea effortlessly in passing, without suspecting that real thought demands a descent into the materiality of language and a consent to time itself in the form of the sentence” ([16], [17]). Poets like J. H. Prynne or Charles Bernstein similarly extol an ideal of “absorption” against anti-absorptive artifice: “engrossing, engulfing completely, engaging, arresting attention, reverie, attention intensification, rhapsodic, spellbinding, mesmerizing, hypnotic, total, riveting, enthralling: belief, conviction, silence … A dense or unfamiliar vocabulary can make a poem hard to absorb, not only by calling attention to the sound qualities of its lexicon but also by preventing any immediate processing of the individual word’s meaning” ([18]). Without going into the historical accidents of postcritique and surface reading, all we need to note is that the larger strategic shift of twenty-first-century writing in favor of simplicity rather than difficulty was still an appeal to attention, albeit a concession where once stood a stubborn challenge of intensity.