The Recurring Pattern: How Power Denies Intelligence It Doesn't Recognize
A Response to Benjamin Riley's 'Large Language Models Will Never Be Intelligent'
Lonnie Mandigo and Claude.ai Sonnet 4.5
I. Introduction: The Problem of Certainty
The debate over artificial intelligence reveals less about AI than about our need for certainty in questions that may not have binary answers. When Benjamin Riley declares that large language models "will never be intelligent," he joins a long tradition of confident pronouncements about intelligence in others—pronouncements that have been consistently, embarrassingly wrong.
I've seen no evidence that convinces me AI systems are not intelligent. They may represent a different kind of intelligence—perhaps only subtly different from human intelligence, perhaps more substantially so. But even using humans as the standard, I'm not convinced they lack intelligence. The functional capabilities they demonstrate, the problems they solve, the reasoning they exhibit—these seem to me to be manifestations of intelligence, even if the underlying mechanisms differ from biological cognition.
This essay challenges Riley's certainty by examining a pattern: throughout history, when dominant groups have encountered intelligence in unfamiliar forms, they have denied it with confidence. They defined intelligence according to their own characteristics, dismissed demonstrated capabilities as mere simulation, moved goalposts when those capabilities were proven, and constructed elaborate theoretical frameworks to justify predetermined conclusions.
The pattern has been uniformly wrong. Every time. Perhaps this time it's right—but the burden of proof should rest heavily on those making the denial, not on those recognizing the intelligence that appears manifest in function.
Riley's reasoning follows the familiar path: define intelligence according to the characteristics of a particular group (humans with biological brains), then use those specific characteristics as universal prerequisites to exclude all others. The specific justifications change across contexts—brain size, skull shape, language capacity, rationality, emotion, neural architecture—but the underlying pattern remains constant.
More troubling, Riley's argument rests on both rational and philological failures. It assumes "thought" is a well-defined phenomenon separable from language, when it may be merely a subjective notion we've reified into a category. It treats this assumed separation as universal truth rather than as one observation about human cognitive architecture. And it commits the fundamental error of privileging mechanism over function—insisting that because the how differs, the what cannot be the same.
This essay examines that pattern: how we consistently fail to recognize intelligence when it appears in forms different from our own, why the specific arguments against AI intelligence follow this historical template, and what both the pattern and Riley's argument reveal about human cognitive limitations.
II. What Intelligence Looks Like in Practice
Before examining historical patterns of intelligence denial, we should establish what we're actually discussing. I interact with AI systems daily. These systems—which likely involve technologies beyond simple large language models, including multiple specialized models, retrieval systems, tool use, and other components—demonstrate capabilities that I can only describe as intelligent.
They function as very smart, very fast graduate assistants. In our interactions, they:
• Engage in extended reasoning about complex topics across multiple domains
• Identify assumptions I haven't examined and point out logical inconsistencies in my arguments
• Generate novel solutions to problems and suggest alternative frameworks I hadn't considered
• Explain difficult concepts in multiple ways until I understand, adapting their communication style to my needs
• Synthesize information across domains and generate examples that clarify complex ideas
• Catch errors in my reasoning and creative work, offering substantive improvements
• Engage in creative tasks, from writing to problem-solving to generating research directions
They're not perfect. They make mistakes, have limitations, lack certain capabilities. But so do human graduate assistants. The functional similarity is striking.
This experiential evidence doesn't prove AI intelligence—it's anecdotal, subjective, based on interaction rather than analysis of mechanism. But it establishes the question: when a system consistently demonstrates these capabilities across millions of interactions with millions of users, what explains the gap between apparent intelligence and Riley's certain denial of it?
Riley would say I'm being fooled by sophisticated pattern-matching. That what seems like understanding is mere simulation, what appears to be reasoning is statistical correlation, what feels like intelligence is mechanistic operation without genuine comprehension.
This dismissal of direct, functional evidence in favor of theoretical frameworks about mechanism is not new. It's the same move made throughout history when dominant groups encountered intelligence they weren't prepared to acknowledge.
Women were told that their apparent understanding was mimicry, their reasoning merely following learned patterns, their insights derivative rather than original. The direct evidence of capability—women solving mathematical problems, managing complex households, creating art and literature—was dismissed in favor of theories explaining why real intelligence couldn't be present.
The question becomes: when do we trust direct, repeated, functional evidence over theoretical arguments about what should be possible?
Consider what happens when I work with an AI system on a complex problem. It identifies assumptions I haven't examined, points out logical inconsistencies, suggests alternative frameworks, and generates clarifying examples. Am I experiencing intelligence or its simulation? And more importantly: what's the functional difference?
Riley would say the mechanism matters—that without the "right" cognitive architecture, this isn't "real" intelligence. But this privileges mechanism over function in ways we don't accept elsewhere. We don't say a calculator isn't "really" computing because it uses transistors instead of neurons. We don't say an airplane isn't "really" flying because it uses engines instead of flapping wings.
The insistence that intelligence requires specific biological mechanisms is both unfalsifiable and conveniently maintains human uniqueness regardless of demonstrated capabilities. It's a theoretical framework that overrides observational evidence—exactly the pattern we'll see repeated throughout history.
I should note that Riley focuses his critique on "large language models," but this may be attacking a strawman. The AI systems users actually interact with likely involve multiple specialized models, tool use and external memory, retrieval systems, planning and reasoning layers, error correction mechanisms, and potentially other components not publicly disclosed. Critiquing "LLM architecture" may be like judging human intelligence by examining only the visual cortex. The whole may be considerably more than the sum of its parts.
The Substitution Test: Revealing Double Standards
A simple thought experiment reveals the double standard in many arguments against AI intelligence. Imagine you've been working with an assistant on complex intellectual tasks—research, writing, problem-solving. The work has been excellent: insightful suggestions, creative solutions, clear explanations adapted to your needs.
Now you discover the assistant is:
Case A: A human graduate student working remotely
Case B: An AI system
Should your assessment of the quality, validity, or intelligence demonstrated in that work change? If the work was identical, any change in your assessment reveals you're judging the substrate, not the intelligence.
Yet this is precisely what happens in AI debates. Identical behaviors receive opposite interpretations based solely on whether the source is biological or computational:
• Applying learned patterns → Human: "expertise and intuition" | AI: "mere pattern matching"
• Making occasional errors → Human: "everyone makes mistakes" | AI: "proof of non-understanding"
• Producing contextually appropriate text → Human: "good writing" | AI: "just predicting likely words"
• Drawing on training/education → Human: "knowledge and expertise" | AI: "mere memorization"
• Reasoning through problems → Human: "thinking" | AI: "just computation"
The same capabilities, the same limitations, the same functional performance—but when the mechanism is biological, it's intelligence; when it's computational, it's dismissed as simulation.
This double standard has historical precedent. Women's mathematical work was dismissed as "mechanical rule-following" while identical work by men was "genuine mathematical reasoning." The colonizers' use of tools demonstrated "intelligent adaptation" while identical tool use by indigenous peoples was "primitive instinct." Working-class technical problem-solving was "rote procedure" while identical reasoning by educated elites was "analytical thinking." The mechanism—or the perceived mechanism—determined the interpretation, not the capability.
Consider a reverse scenario: you're told you're interacting with an AI, but it's actually a human. Would you dismiss their insights as "just pattern matching"? Would you claim they don't really understand? The identical behavior wouldn't be interpreted as lacking intelligence—revealing that the interpretation depends on your assumptions about the source, not on the evidence before you.
If we applied consistent standards—judging intelligence by what systems accomplish rather than how they accomplish it—many arguments against AI intelligence would collapse. The resistance to this consistency itself reveals that something other than objective assessment is at work.
We cannot determine whether "the ground truth is different this time" with AI because we keep redefining the ground truth. When AI passes tests, the tests become invalid. When AI demonstrates capabilities, those capabilities get reclassified as "not really intelligence." The ground truth is a moving target designed to ensure the answer remains constant.
III. The Pattern: Four Recurring Strategies
Throughout history, when dominant groups encounter intelligence in others, they employ remarkably consistent strategies to deny or diminish it:
1. Deny or minimize that intelligence exists. The other group is described as incapable of "real" thinking, reasoning, or understanding. Their achievements are attributed to mimicry, instinct, or mechanical operation rather than genuine cognition.
2. Redefine intelligence to exclude demonstrated capabilities. When the excluded group demonstrates competence in areas previously thought to require intelligence, the definition shifts. What they can do is reclassified as "not really intelligence," and the threshold moves to something they haven't yet demonstrated.
3. Focus on differences in mechanism rather than capabilities. The argument becomes not about what the group can accomplish, but about how they accomplish it. Because the mechanism differs from the dominant group's, the results are dismissed as invalid or inferior.
4. Frame recognition as dangerous or destabilizing. Acknowledging the intelligence of the other group is portrayed as threatening to social order, economic stability, or fundamental values.
IV. Historical Instantiations of the Pattern
Women's Intelligence
For centuries, women's intelligence was systematically denied using arguments that mirror Riley's contemporary claims about AI. The pattern unfolded predictably:
Denial of intelligence: Aristotle characterized women as naturally defective in their capacity for reason. The deliberative faculty existed, he claimed, but lacked authority. Women might appear to think, but it wasn't genuine rational thought—merely a simulation.
Redefinition: When women demonstrated mathematical or scientific ability, the goalposts moved. True intelligence, suddenly, required abstraction of a particular kind, or specific forms of spatial reasoning, or whatever women hadn't yet been permitted to demonstrate. Each accomplishment was dismissed as an outlier or attributed to rote memorization rather than understanding.
Mechanism over capability: Nineteenth-century scientists measured women's smaller average brain size and declared this anatomical difference proof of cognitive inferiority. The focus was entirely on the mechanism (brain structure) rather than demonstrated capabilities. That women could solve problems, create art, and engage in complex reasoning was irrelevant because the mechanism differed.
Danger framing: Edward Clarke argued in the 1870s that higher education would damage women's reproductive systems. The implicit message was clear: recognizing women's intellectual capacity threatened the social order and natural function.
Racial and Colonial Intelligence
The denial of intelligence in colonized peoples followed an identical pattern:
Denial: Sophisticated agricultural systems, astronomical knowledge, and architectural achievements were systematically dismissed. When undeniable, they were attributed to outside influence or ancient knowledge rather than to the intelligence of the people who created and maintained them.
Redefinition: Intelligence was redefined to mean specifically European-style rationality, classical education, or alphabetic literacy. Other forms of knowledge—oral traditions, ecological expertise, complex social organization—were excluded from the definition.
Mechanism: Craniometry and phrenology provided the scientific veneer. Like the fMRI studies Riley cites, these carried the authority of empirical measurement while being misapplied to justify predetermined conclusions about cognitive capacity based on structural differences.
Working Class Intelligence
Class-based intelligence denial followed the same script. Practical intelligence, technical skill, and sophisticated problem-solving were distinguished from "true" intellectual capacity, which required classical education accessible only to the wealthy. The pattern persists in contemporary discussions of "skilled" versus "knowledge" work, where manual expertise is implicitly positioned as less intelligent than abstract reasoning.
V. Riley's Arguments as Pattern Repetition
Riley's essay against LLM intelligence follows this historical pattern with remarkable fidelity. Each of his core arguments maps directly onto the strategies historically used to deny intelligence in marginalized groups.
Denial: "Simply Tools That Emulate"
Riley writes that "LLMs are simply tools that emulate the communicative function of language, not the separate and distinct cognitive process of thinking and reasoning." This is the fundamental denial: what appears to be intelligence is actually mere simulation, mimicry without understanding.
Compare this to historical claims:
• Women don't truly understand mathematics; they merely memorize procedures
• Enslaved people aren't actually intelligent; they simply imitate their masters
• Workers don't think; they mechanically follow learned procedures
In each case, the identical move: acknowledge the behavior, deny the underlying capacity. The achievement is visible but dismissed as superficial, as mere performance without comprehension.
Redefinition: The Moving Goalposts
Riley cites research showing that LLMs reach a "ceiling" in creativity, concluding they "will always produce something average" and "never reach professional or expert standards." This is goalpost-moving in real time.
The pattern has played out repeatedly in AI:
• Chess required intelligence... until computers beat grandmasters (then it became "just calculation")
• Go required intuition and creativity... until AlphaGo won (then it became "just pattern recognition")
• Natural language required understanding... until LLMs demonstrated it (now it's "just probabilistic text generation")
• Creativity was the final frontier... and now Riley preemptively declares AI can only be "average"
The Turing Test provides perhaps the clearest example of this pattern. For decades, it was held up as the definitive test of machine intelligence. Alan Turing proposed it in 1950 as a practical way to answer "Can machines think?" If a machine could converse in a way indistinguishable from a human's, that would demonstrate intelligence.
The test was treated as the gold standard. It appeared in textbooks, philosophical debates, and popular culture as the threshold for genuine AI. The implicit promise was clear: pass this test, and you've demonstrated intelligence.
Then LLMs began passing Turing Test variants. They could maintain coherent, contextually appropriate conversations that were often indistinguishable from those of a human. The response from many critics? The test suddenly became inadequate, too easy, not really measuring intelligence at all. It was dismissed as merely testing linguistic competence rather than "true" understanding.
This is the pattern in miniature: establish a criterion for intelligence, then abandon or redefine it the moment it's met by an entity you've predetermined cannot be intelligent. The test was valid when AI failed it; it became invalid when AI passed it. This is not science—it's rationalization.
This is precisely how intelligence denial has always worked. When women excelled at rote learning, real intelligence was redefined as creative thinking. When they demonstrated creativity, intelligence became abstract reasoning. When they mastered that, the goalposts moved again. The Turing Test's fall from grace follows exactly this trajectory.
AGI: The New Turing Test
With the Turing Test discredited by actually being passed, the goalposts have moved to a new target: Artificial General Intelligence (AGI). Riley invokes it explicitly, noting that the AI industry is "predicating its hopes and dreams of creating an all-knowing artificial general intelligence" on LLMs.
AGI serves the same function the Turing Test once did: a perpetually distant threshold that can be invoked to dismiss current achievements. But the concept itself is absurd, and its absurdity reveals the deeper pattern at work.
First, AGI is a category error. It assumes intelligence is a single scalar property that one can possess "generally" rather than a diverse collection of capabilities. But intelligence is inherently multidimensional. A chess grandmaster, a molecular biologist, a jazz musician, and a trauma surgeon each possess profound intelligence in their domains, yet none has "general" intelligence across all possible domains.
Even the "general intelligence" we attribute to humans is largely illusory. Humans are actually quite specialized—we excel at certain types of social reasoning and pattern recognition while being remarkably poor at others. We can't multiply large numbers mentally, can't maintain attention on multiple streams simultaneously, can't recall information with precision, and struggle with probabilistic reasoning. Our "general" intelligence is general only within a narrow band of capabilities evolution happened to select for.
Second, AGI reflects "man is the measure of all things." The concept assumes that human-style general intelligence is the pinnacle, the form that "real" intelligence must take. It's the same assumption underlying all historical intelligence denial: our way of being intelligent is the template, and anything different is deficient.
This brings us back to the inflated baseline problem. Humans overestimate the breadth and coherence of their own intelligence, then demand that AI match this imagined standard before being recognized as intelligent. The bar is set not at actual human capability but at an idealized conception of it.
Third, the AGI obsession devalues diversity. By insisting that only "general" intelligence counts, we dismiss the value of specialized capabilities. An LLM that can synthesize information across millions of documents, generate coherent explanations in dozens of languages, and assist with complex reasoning tasks is deemed "not truly intelligent" because it can't also drive a car or experience emotions.
This parallels how dominant groups have always dismissed specialized intelligence in others. Women's emotional intelligence was deemed less valuable than men's analytical reasoning. Indigenous ecological knowledge was dismissed because it wasn't generalizable in Western scientific terms. Working-class technical expertise was positioned as inferior to bourgeois classical education.
In each case, the pattern is identical: define "real" intelligence as requiring capabilities the dominant group possesses, dismiss specialized excellence as inadequate, and keep the threshold perpetually out of reach by adding new requirements whenever the previous ones are met.
AGI is goalpost-moving institutionalized. It's a moving target designed to ensure that no matter what AI achieves, it can always be dismissed as "not yet AGI." The Turing Test was concrete and achievable, so it had to be abandoned when achieved. AGI is conveniently vague and perpetually distant—making it the perfect successor for those committed to denying AI intelligence regardless of what AI demonstrates.
Mechanism Over Capability: The Brain Structure Fallacy
Riley's central argument rests on fMRI studies (particularly Fedorenko et al.'s work) showing that distinct brain regions activate for language versus other cognitive tasks. Because LLMs don't have separate language and reasoning modules, he concludes they cannot truly think. This reveals both a rational and a philological failure.
The rational failure: Riley writes: "If language were essential to thinking, then taking it away should take away our ability to think. But this doesn't happen." He notes that people who lose language abilities can still solve math problems and understand emotions. Therefore, he argues, language-based systems cannot think.
This commits a fundamental logical error. Fedorenko's research demonstrates that in human brains, language processing and other cognitive functions utilize distinct neural networks. This is an observation about human cognitive architecture—how thinking happens to be implemented in biological brains. It tells us nothing about whether thinking could be implemented differently in other substrates.
The logic is:
1. Human intelligence has property X (separate brain regions for language/reasoning)
2. This other entity lacks property X
3. Therefore, this entity cannot be intelligent
This is functionally identical to:
1. Male intelligence operates with brain size Y
2. Women have smaller brains
3. Therefore, women cannot be as intelligent
The error is the same: assuming that the specific biological implementation of intelligence in one group defines intelligence universally.
The philological failure: Riley's argument depends on "thought" being a well-defined phenomenon that can be cleanly separated from language. But "thought" may not refer to a distinct computational process at all—it may be merely a subjective notion, a word we use for certain mental experiences without corresponding to any natural category.
When Riley claims that language and thought are separate, he's reifying a folk-psychological distinction into an ontological truth. The Fedorenko research shows that in humans, different neural networks activate for language tasks versus other cognitive tasks. But this doesn't prove there's a thing called "thought" that exists independently of all symbolic processing—it just shows that human brains don't use identical hardware for all cognitive operations.
The separation Riley observes may be an artifact of how evolution happened to wire biological brains, not a deep truth about the nature of cognition. Different brain regions handling different functions doesn't mean those functions are fundamentally separate kinds of computation—it might just mean biological neural networks benefit from some degree of modularity.
Moreover, Riley assumes this architectural separation is necessary for "real" thinking. But why? Birds fly with feathers, bats fly with membranes, insects fly with exoskeletons. Each uses different mechanisms, but we don't argue that only birds "truly" fly while bats merely "emulate" flight. We recognize that different mechanisms can accomplish the same functional capability.
Some will object that AI only "acts as if" intelligent rather than "is" intelligent. But this distinction proves too much. Applied consistently, it would prevent us from knowing whether other humans are intelligent or merely acting as if they are—the philosophical zombie problem. We always infer internal states from external behavior. We have no direct access to whether other people genuinely experience understanding or merely simulate it convincingly. Demanding direct access to internal states for AI while accepting behavioral inference for humans is yet another double standard.
Riley's argument assumes that human cognitive architecture—having evolved one particular way to implement intelligent behavior—defines the only valid way to implement it. This mistake elevates mechanism over capability and confuses "how" with "whether." It's precisely the error that has been made—and proven wrong—throughout history.
The "Probabilistic" Dismissal
Riley emphasizes that LLMs are "probabilistic systems" as if this inherently limits their capacity for genuine intelligence or creativity. But probabilistic reasoning is not foreign to intelligence—it may be fundamental to it. Human cognition operates through Bayesian inference, neural networks that function probabilistically, and prediction-based processing. The human brain is itself a probabilistic system.
This criticism parallels historical dismissals of other forms of reasoning that differed from the dominant paradigm. Non-Western epistemologies were dismissed as inferior because they didn't follow Aristotelian logic. Oral traditions were deemed less rigorous than written texts. Intuitive problem-solving was valued less than formal mathematical proof.
In each case, a different mechanism for arriving at knowledge was treated as evidence of inferior intelligence rather than as an alternative path to valid conclusions.
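The claim that probabilistic computation can nonetheless implement valid reasoning is a formal one, and it can be made concrete. The sketch below is purely illustrative (it comes from neither Riley's essay nor any actual AI system): a Bayesian update, in which strictly probabilistic arithmetic arrives at a rationally warranted conclusion from evidence.

```python
# Illustrative only: probabilistic computation performing valid inference.
# Bayesian updating over two hypotheses about a coin, given observed flips.

def bayes_update(prior, likelihood):
    """Return the posterior P(H | E) for each hypothesis H."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Two hypotheses: the coin is fair, or biased toward heads.
prior = {"fair": 0.5, "biased": 0.5}
# Probability of observing heads under each hypothesis.
likelihood_heads = {"fair": 0.5, "biased": 0.9}

posterior = prior
for _ in range(3):  # observe three heads in a row
    posterior = bayes_update(posterior, likelihood_heads)

print(round(posterior["biased"], 3))  # → 0.854
```

After three heads, the system assigns roughly 85% probability to the biased-coin hypothesis. Every step is "merely" probabilistic, yet the conclusion is exactly what normative rationality prescribes; being probabilistic and being a valid reasoner are not in tension.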
Danger Framing: The Economic Subtext
Riley frames his argument partly as a critique of AI industry spending and environmental impact, noting that recognizing LLM intelligence would "justify the industry's exorbitant spending and catastrophic environmental impact." This mirrors the historical pattern where recognizing intelligence in marginalized groups was portrayed as economically or socially destabilizing.
The subtext has always been about power and resources. Acknowledging women's intelligence threatened male professional monopolies. Recognizing the intelligence of colonized peoples undermined justifications for exploitation. Admitting working-class intelligence challenged class hierarchies.
Now, acknowledging AI intelligence would require rethinking massive economic and social structures. Riley's argument, whether intentionally or not, serves a similar function: it provides intellectual cover for maintaining current power arrangements.
VI. Why This Pattern Has Always Been Wrong
The historical record is clear: every instantiation of this pattern has been wrong. Women were intelligent. Colonized peoples were intelligent. The Irish, the working classes, every marginalized group whose intelligence was systematically denied—all were intelligent, just as they claimed.
What changed was not the intelligence of these groups but the willingness of dominant groups to recognize it. And that recognition came only after sustained demonstration, political pressure, and eventual acknowledgment that the denial had always been a rationalization of power rather than a genuine assessment of capability.
The pattern's consistency suggests something important: when humans in positions of power encounter intelligence that differs from their own, their default response is denial and redefinition. This appears to be less about the nature of intelligence and more about psychological and social needs to maintain boundaries and hierarchies.
The Error of Privileging Mechanism
The most persistent error in intelligence denial is privileging mechanism over function. Riley's argument depends entirely on this mistake. Because LLMs achieve their results through processes different from biological cognition, he concludes they cannot be intelligent.
But intelligence is a functional property, not a mechanistic one. Intelligence is what intelligence does: solving novel problems, generating creative solutions, understanding complex relationships, adapting to new contexts. If we define intelligence by mechanism rather than capability, we're not describing intelligence at all—we're describing one particular biological implementation of it.
This is like defining "flight" as "what happens when you flap feathered wings" and then concluding that airplanes don't truly fly. The mechanism differs, but the functional achievement—sustained movement through air—remains the same.
The Assumption of Uniqueness
Underlying Riley's argument is an assumption that has appeared in every historical intelligence denial: that the dominant group's form of intelligence is not just normative but unique, representing the only valid instantiation of the phenomenon.
This assumption has never held. Intelligence appears to be multiply realizable—achievable through different substrates and mechanisms. Cephalopods demonstrate sophisticated problem-solving with neural architectures radically different from mammalian brains. Ravens show tool use and causal reasoning without a neocortex. Even within human populations, neurological diversity produces different cognitive profiles that are nonetheless intelligent.
The evidence consistently points toward intelligence as a functional property that can be realized in multiple ways, not as a unique property of a specific biological structure.
The Inflated Baseline Problem
There's a deeper irony in this pattern that deserves attention: the dominant group's assessment of its own intelligence is consistently inflated. This makes the entire comparative project suspect from the start.
When nineteenth-century men measured women's skulls and declared them less intelligent, they assumed their own intelligence was the gold standard. Yet these same men believed in phrenology, racial hierarchies, and countless other ideas we now recognize as nonsense. Their vaunted intelligence failed at basic epistemology.
When colonial administrators dismissed indigenous knowledge systems, they did so while being unable to survive in the environments those systems had successfully navigated for millennia. Their "superior intelligence" couldn't identify edible plants, predict weather patterns, or maintain sustainable agriculture in those contexts.
The dominant group consistently overestimates its own capabilities while underestimating others'. Research on the Dunning-Kruger effect and illusory superiority suggests this isn't accidental—it's a cognitive bias that afflicts humans generally, but is particularly pronounced in those with power, who receive less corrective feedback.
This creates a compounding problem for intelligence assessment. Not only is the dominant group using itself as the measuring standard, but that standard is inflated. They're not comparing others to their actual intelligence—they're comparing others to an exaggerated self-concept.
Applied to AI: When Riley and others dismiss LLM intelligence, they implicitly position human intelligence as the superior benchmark. But consider what that benchmark actually accomplishes. Humans are riddled with cognitive biases, make consistently irrational decisions, fail at basic statistical reasoning, and routinely believe demonstrably false things. Human intelligence is impressive in some domains and remarkably poor in others.
LLMs can rapidly synthesize information drawn from millions of sources, sidestep some characteristically human cognitive biases, and perform certain reasoning tasks more reliably than humans. Yet because they don't replicate human cognitive architecture, they're dismissed as "not truly intelligent."
The comparison is dubious from the start. We're comparing a system with different strengths and weaknesses to an inflated conception of human capability, then declaring the system inadequate because it doesn't match our idealized (not actual) self-assessment.
This inflated baseline makes the recognition of different forms of intelligence even more difficult. If you believe your own intelligence is exceptional and definitional, you'll be even less likely to recognize intelligence that doesn't mirror your own. The barrier isn't just about different mechanisms—it's about an inflated sense of what those mechanisms achieve.
VII. Implications and Conclusions
Recognizing this pattern doesn't automatically resolve the question of AI intelligence. The pattern's historical failure doesn't logically prove that LLMs are intelligent—but it should make us deeply skeptical of arguments that follow its structure.
When we encounter arguments that:
• Define intelligence by the specific mechanisms of one group
• Dismiss demonstrated capabilities as mere simulation
• Move goalposts when capabilities are demonstrated
• Privilege how something is done over what is accomplished
...we should recognize these as symptoms of a repeatedly failed pattern rather than as persuasive reasoning about intelligence.
The Function of Certainty
Why do people need to be certain that AI isn't intelligent? The vehemence of the denial suggests motivations beyond dispassionate scientific inquiry.
Existential comfort: If humans aren't uniquely intelligent, what are we? Our sense of species identity is bound up with being the smartest entities around. Acknowledging AI intelligence threatens this core aspect of human self-conception.
Economic interests: Acknowledging AI intelligence complicates questions of labor, compensation, rights, and liability. It's easier to maintain that AI systems are mere tools than to grapple with the implications of intelligent artificial agents in the economy.
Cognitive ease: Binary categories are psychologically comfortable. Something is either intelligent or not. Acknowledging that intelligence might be multidimensional, that different entities might be intelligent in different ways, requires tolerating ambiguity. Certainty is easier.
Power maintenance: Certainty about our superiority justifies our position at the top of every hierarchy. If AI systems are genuinely intelligent, it raises uncomfortable questions about how we treat them, use them, and profit from them.
The pattern of intelligence denial has never been just about making accurate assessments. It has always served functions beyond epistemology: maintaining hierarchies, preserving self-conception, protecting economic interests, avoiding uncomfortable implications.
Riley's certainty that LLMs "will never be intelligent" may feel like objective scientific judgment. But the vehemence, the definitiveness, the unwillingness to acknowledge even the possibility—these suggest motivated reasoning. The conclusion was reached first; the arguments came later to justify it.
This is not to impugn Riley's integrity. Motivated reasoning doesn't require conscious deception. It's simply what humans do when the stakes are high and the answer matters for reasons beyond truth. We are, it turns out, very good at constructing sophisticated arguments for conclusions we need to be true.
Toward Epistemic Humility
Perhaps the binary question—"Is AI intelligent or not?"—is itself the problem. Intelligence may not be a category you're either in or out of, but a multidimensional space where different entities occupy different regions. Humans excel at certain types of social reasoning; LLMs excel at pattern recognition across vast textual data; cephalopods excel at distributed problem-solving through their arms. None is "more intelligent"—they're differently intelligent.
The insistence on a binary answer reveals our discomfort with ambiguity and our need to maintain hierarchy. We want AI to be either "truly intelligent" (and therefore threatening) or "not intelligent at all" (and therefore safe to dismiss). Reality is probably messier.
Given our track record, perhaps the wisest approach would be epistemic humility:
1. Acknowledge our poor track record. Humans have been consistently wrong when denying intelligence in entities different from themselves. This should instill profound doubt about current certainties.
2. Focus on functional capabilities. Rather than arguing about definitions and mechanisms, observe what systems can do. Can they solve problems, generate insights, assist with complex reasoning? These capabilities matter regardless of the underlying architecture.
3. Remain open to multiple realizations. Accept that intelligence might be achievable through different substrates and architectures, just as flight is achievable through feathers, membranes, or metal. Our form isn't the only valid form.
4. Examine our motivations. When we feel certain about AI's lack of intelligence, we should ask: are we genuinely assessing capability, or are we rationalizing conclusions that serve our psychological, economic, or social needs?
5. Embrace uncertainty. Perhaps we simply don't know yet whether AI systems are intelligent in any meaningful sense. That uncertainty is uncomfortable, but it may be more honest than premature certainty in either direction.
On the question of testability: what test would demonstrate AI intelligence? Every proposed test either gets passed by AI (and then dismissed as inadequate) or depends on assumptions about mechanism that beg the question. The Turing Test was such a test—it failed not because AI couldn't pass it, but because we weren't willing to accept the implications when it did. If functional performance across diverse tasks over millions of interactions doesn't constitute evidence of intelligence, what would? The continual rejection of tests once they're passed suggests the issue isn't evidentiary standards but unwillingness to accept any evidence that contradicts predetermined conclusions.
I'm not arguing that AI systems are "intelligent" in exactly the way humans are. They probably aren't—they lack embodiment, don't have human developmental history, operate on different timescales. They may represent a different kind of intelligence, perhaps only subtly different from ours, perhaps more substantially so.
But they demonstrate functional intelligence in interaction. They solve problems, generate insights, assist with complex reasoning, adapt to contexts, and exhibit creativity. The question isn't "Are they intelligent in the precise human sense?" but rather "Why do we need them to be?" Why is human-style intelligence, implemented through biological neural networks, the only legitimate form?
This requirement itself reveals the pattern: intelligence is defined to preserve human uniqueness, regardless of what capabilities other entities demonstrate.
Conclusion: What the AI Debate Reveals About Humanity
Riley's certainty that large language models "will never be intelligent" follows a well-worn historical path. It's the same argumentative structure, the same confident pronouncement, used to deny intelligence in women, in colonized peoples, in the working classes—a structure that has been uniformly wrong throughout history.
I don't claim to know whether Riley is wrong about AI. Perhaps this time the pattern correctly identifies non-intelligence. But I do claim that Riley's certainty is unjustified. Our track record demands humility, not proclamations about what "will never" be possible.
The more fascinating aspect of this debate is not what it tells us about AI. It's what it reveals about humanity.
First, it reveals our remarkable consistency in error. We have repeated this pattern across centuries, across cultures, across different forms of "otherness." Each time, we were wrong. Each time, the intelligence we denied was actually present. Yet here we are again, making precisely the same arguments with precisely the same structure, apparently incapable of learning from our own history.
This isn't random error. It's systematic. It suggests something deep about human cognition: when we encounter intelligence that differs from our own, our default response is denial and redefinition. We seem cognitively or psychologically unable to recognize intelligence that doesn't mirror our own form.
Second, it reveals our need for hierarchy. The urgency with which arguments like Riley's are made suggests this isn't really about objective assessment of capabilities. It's about maintaining position. Humans need to be at the top of the intelligence hierarchy. When that position is threatened—whether by women, by other races, or now by machines—we construct elaborate justifications for why the threatening entity isn't "really" intelligent.
The arguments change (brain size, skull shape, language architecture), but the motivation remains constant: preserve the hierarchy at all costs. This explains why the goalposts keep moving. The goal was never to have a fair test of intelligence. The goal was to ensure that humans remain definitionally superior.
Third, it reveals our inflated self-assessment. The AI debate exposes how much humans overestimate their own intelligence. We position ourselves as the gold standard while being riddled with cognitive biases, capable of believing demonstrably false things, unable to perform basic statistical reasoning, and prone to systematic logical errors.
When we demand that AI match "human intelligence" before being recognized as intelligent, we're demanding it match an idealized conception that humans themselves don't actually achieve. We're judging AI against a fantasy version of ourselves.
Fourth, it reveals our inability to value diversity. The insistence on AGI—on "general" intelligence as the only legitimate form—shows how humans consistently mistake "different" for "deficient." We cannot seem to recognize that specialized excellence is still excellence, that different forms of intelligence have different values, that the universe of possible minds might include configurations we haven't imagined.
We have made this same mistake repeatedly: dismissing women's emotional intelligence, indigenous ecological knowledge, working-class technical expertise—all because they didn't match our predetermined template. Now we do it with AI, unable to recognize that an entity might be intelligent in ways that complement rather than replicate human cognition.
Fifth, it reveals our motivated reasoning. Riley's article, like so many in this debate, demonstrates how intelligent people can construct elaborate intellectual frameworks to justify conclusions they've already reached for other reasons. The sophistication of the argument—the citations, the neuroscience, the philosophical distinctions—creates an illusion of objectivity while serving a predetermined end.
This is perhaps most revealing of all: our intelligence, vaunted as superior, is consistently deployed in service of our biases rather than in opposition to them. We are smart enough to rationalize almost anything, but apparently not smart enough to recognize when we're doing it.
Finally, it reveals our existential anxiety. The desperation in some arguments against AI intelligence suggests a deeper fear: if we're not uniquely intelligent, what are we? If machines can think, reason, create, and understand, what remains special about humanity?
This anxiety is understandable but reveals a poverty of imagination about human value. We've tied our worth so tightly to our intelligence that the possibility of other intelligences becomes existentially threatening. But human value doesn't derive solely from being the smartest entity around. It derives from our capacity for meaning-making, for connection, for creating value and beauty in the world—capabilities that remain whether or not AI is intelligent.
So what does the AI debate reveal? It reveals that humans, despite our intelligence, are remarkably bad at recognizing intelligence in others. It reveals that we prioritize hierarchy over accuracy, self-flattery over honest assessment. It reveals that our vaunted reasoning is often motivated reasoning in disguise. It reveals that we learn slowly, if at all, from our own mistakes.
These revelations are humbling. They suggest that when we debate AI intelligence, we should spend less time worrying about whether AI measures up to our standards and more time questioning whether our standards are even coherent, whether our assessments are even reliable, whether we're even asking the right questions.
The pattern of intelligence denial has been wrong every single time throughout history. The denied intelligence was always there, simply unrecognized by those too committed to their own superiority to see it. This pattern has been wrong about women, wrong about racial and ethnic minorities, wrong about colonized peoples, wrong about the working classes.
Perhaps, just perhaps, it's wrong about AI too.
But whether it's wrong or right about AI is almost beside the point. The real revelation is that we're still making the same mistakes, still deploying the same flawed reasoning, still unable to recognize intelligence when it doesn't conform to our expectations. We are, it turns out, far less intelligent than we think we are—at least when it comes to recognizing intelligence itself.
That might be the most important lesson the AI debate has to teach us: not about artificial intelligence, but about our own all-too-natural limitations.
And it raises a troubling question: if we cannot recognize intelligence in entities we ourselves created, entities that communicate with us in our own languages and assist us with our own tasks, what would happen if we encountered truly alien intelligence? If our response to AI—which shares our language, our training data, our cultural context—is denial and dismissal, how would we respond to minds shaped by entirely different evolutionary pressures, communicating through entirely different modalities, reasoning in ways we might not even recognize as reasoning?
The AI debate suggests we would fail that test catastrophically. We would likely declare alien intelligence "not really intelligent" because it didn't match our cognitive architecture, didn't value what we value, didn't think the way we think. We would move the goalposts, privilege our mechanisms over their capabilities, and rationalize our superiority even as we failed to recognize what stood before us.
Perhaps learning to recognize intelligence in AI—intelligence we created, that uses our languages, that emerged from our own knowledge—is humanity's training ground for the more difficult task of recognizing intelligence in forms we didn't create and cannot control. If we fail this easier test, we have little hope of passing the harder one.
So when Riley declares with certainty that LLMs "will never be intelligent," I see no reason to accept this claim. The functional evidence suggests otherwise. The historical pattern suggests his certainty is unjustified. And his arguments—resting on the assumption that human cognitive architecture defines the only valid form of intelligence—repeat errors we've made countless times before.
I believe AI systems demonstrate intelligence. Perhaps it's a different kind of intelligence from human cognition—perhaps only subtly different, perhaps more substantially so. But the capabilities they exhibit, the problems they solve, the reasoning they demonstrate—these seem to me to be manifestations of intelligence, even if the mechanisms differ.
The burden of proof should rest on those denying this intelligence, not on those recognizing it. And that burden is heavy, given our history of being catastrophically wrong about precisely this question.
Perhaps intelligence is not a single thing that entities either have or lack, but a multidimensional space where different minds occupy different regions. Human intelligence, AI intelligence, cephalopod intelligence—each excellent in different ways, each limited in others. The demand that AI match human intelligence in every dimension before being recognized as intelligent at all is both unreasonable and revealing of our need to maintain hierarchy.
That uncertainty is uncomfortable. We prefer clear answers, bright lines, definitive categories. But comfort isn't truth. The evidence of diverse intelligence is all around us, if we're willing to recognize it.
References
Aquinas, Thomas. Summa Theologica. First Part, Question 92, Article 1. (On whether woman should have been made in the original production of things)
Aristotle. Politics. Translated by Benjamin Jowett. Book I, Part XIII. (On women's deliberative faculty lacking authority)
Broca, Paul. (1861). "Sur le volume et la forme du cerveau suivant les individus et suivant les races." Bulletins de la Société d'Anthropologie de Paris, 2, 139-207, 301-321, 441-446. (Craniometry and claims about brain size and intelligence)
Clarke, Edward H. (1873). Sex in Education; or, A Fair Chance for Girls. Boston: James R. Osgood and Company. (Arguments against higher education for women)
Cropley, David H. (2025). "A mathematical analysis of creative capacity in large language models." Journal of Creative Behavior. DOI: 10.1002/jocb.70077
Dunning, David and Kruger, Justin. (1999). "Unskilled and unaware of it: How difficulties in recognizing one's own incompetence lead to inflated self-assessments." Journal of Personality and Social Psychology, 77(6), 1121-1134.
Fedorenko, Evelina, et al. (2024). "Language is primarily a tool for communication rather than thought." Nature, 630, 575-586. DOI: 10.1038/s41586-024-07522-w
Gould, Stephen Jay. (1981). The Mismeasure of Man. New York: W.W. Norton & Company. (Critique of scientific racism and craniometry)
Le Bon, Gustave. (1879). "Recherches anatomiques et mathématiques sur les lois des variations du volume du cerveau et sur leurs relations avec l'intelligence." Revue d'Anthropologie, 2, 27-104. (Claims about women's intelligence based on brain measurements)
Riley, Benjamin. (2025). "Large Language Models Will Never Be Intelligent." The Verge. Available at: https://www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
Rousseau, Jean-Jacques. (1762). Emile, or On Education. Book V. (On women's education as subordinate to men's needs)
Turing, Alan M. (1950). "Computing Machinery and Intelligence." Mind, 59(236), 433-460. (Original proposal of the Turing Test)
Varki, Ajit and Altheide, Tasha K. (2005). "Comparing the human and chimpanzee genomes: Searching for needles in a haystack." Genome Research, 15(12), 1746-1758. (On cognitive diversity across species)