- publications
- • Wu, Y. “distance between GPT-4 and human-level concepts,” written 2023, under review at Minds and Machines.
- this paper investigates the conceptual competence of GPT-4 by introducing a framework grounded in the distinction between implicit and explicit concepts, as proposed by Gualtiero Piccinini. it defines ten structural features necessary for human-level concept possession and systematically assesses GPT-4’s internal representations, abstraction ability, structural reasoning, contextual flexibility, and intentionality. the results show that while GPT-4 performs well in tasks involving labeling and language fluency, its representations are fundamentally different from those used in human conceptual cognition. GPT-4 lacks essential structural and functional properties such as causal encoding, consistent abstraction mechanisms, and internally operable content representations. as such, the paper concludes that GPT-4 does not yet possess human-level concepts, though it may rely on a different kind of “statistical concept” shaped by its architecture. by identifying where and why GPT-4 diverges from genuine conceptual understanding, the paper provides a principled framework for evaluating AI cognition and outlines engineering directions to bridge the conceptual gap.
- • Wu, Y. “the logical reasoning dilemma of LLMs: a mapping deficit in representation acquisition,” written 2024, under review at Journal of Artificial Intelligence Research.
- this paper proposes that LLMs fail at genuine logical reasoning not because of inference rule limitations, but because of a deeper representational failure rooted in a structural “mapping deficit.” i argue that human reasoning relies on a two-stage mapping architecture: the first mapping links perceptual or experiential inputs to structured conceptual representations, and the second connects these structured concepts to semantic or inferential roles within a broader knowledge system. while LLMs can simulate the second mapping, that is, assigning learned tokens to statistically coherent outputs, they lack the first mapping entirely. they do not possess a mechanism to extract and internalize structured features from raw input that could ground concepts in manipulable, cognitively operable form. as a result, their “reasoning” operates only over labels, without access to the conceptual content those labels should encode. this mapping deficit explains why LLMs can perform well on surface-level inference tasks while consistently failing at compositional, recursive, or truth-preserving reasoning. the paper establishes this structural gap as the primary source of their reasoning limitations and redefines what it means for an artificial system to “possess” a reasoning capacity. it also outlines how closing this mapping deficit is essential for building systems that can approximate human-level cognitive competence.
- • Wu, Y. “the ceiling effect of semantic learning in LLMs’ logical reasoning: categorization capability as a key limitation,” written 2024, under review at Cognitive Science.
- this paper investigates why LLMs consistently fail in complex formal reasoning tasks, despite their strong performance in semantic learning. it introduces the concept of a “ceiling effect”, arguing that incremental semantic learning reaches an upper limit because LLMs fundamentally lack categorization competence, a structural cognitive capacity essential to human reasoning. the paper analyzes categorization as a deeply structured process that underpins human recursive and hierarchical reasoning. it argues that categorization in human cognition involves abstract generalization, structural inheritance, and dynamic reorganization, functions that enable complex reasoning across levels. in contrast, LLMs rely heavily on statistical co-occurrence, lacking stable hierarchical category formation, recursive feature mapping, or dynamic reclassification capabilities. by tracing this structural limitation, the paper reveals that LLMs’ apparent fluency masks a deep inability to engage in true logical inference. it concludes that future improvements in reasoning will require structural reform in representation acquisition, particularly by modeling categorization not as a label-matching task, but as a structured, recursive process akin to human conceptual organization.
- presentations
- • the logical reasoning dilemma of LLMs: a mapping deficit in representation acquisition — colloquium presentation, American Philosophical Association (APA) Pacific Division meeting, April 2025; presentation, MindMU workshop, UM–Columbia, October 2024.
- current projects
- structural foundations of concept formation and reasoning
- • two types of categorization and their role in reasoning (in preparation)
- this paper distinguishes between two fundamental modes of categorization: label-based categorization, which assigns inputs to pre-existing categories via surface-level associations, and feature-based categorization, which involves abstracting internal structural features to generate new conceptual groupings. i argue that current LLMs rely almost exclusively on label-based mechanisms, which allow them to simulate categorization tasks but fail to support genuine concept formation. through theoretical analysis and structural modeling, i show that feature-based categorization plays a critical bridging role between perception and reasoning, enabling the kind of generalized, recursive, and concept-sensitive inference that is necessary for human-like understanding. the paper proposes that without this intermediate step of structural feature abstraction, models cannot build the internal representations required for logical generalization, which in turn limits their ability to reason beyond superficial associations. this distinction reframes how we evaluate AI categorization performance and lays the groundwork for identifying structural prerequisites of reasoning within categorization itself.
- • structural representation as a prerequisite for logical reasoning (in progress)
- this paper proposes that logical reasoning is impossible without structured internal representations that satisfy specific cognitive and formal conditions. it identifies three core properties that representations must possess to support reasoning: operability, the ability to be manipulated by inference procedures; consistency, logical coherence across reasoning steps; and structural preservation, the capacity to maintain relational integrity across transformations. the central argument is that such representations function as the minimal cognitive units required to build and sustain valid reasoning chains. without them, an agent, biological or artificial, may simulate inference at the surface level but cannot engage in genuine logical reasoning. this work bridges philosophical analysis and cognitive modeling by offering a principled criterion for what kinds of representational architecture are necessary to support deductive and compositional inference. it provides a foundation for diagnosing why LLMs, despite their linguistic fluency, fail to reason structurally: they lack access to representations that fulfill these requirements.
- • structural conditions for distinct modes of reasoning (in progress)
- this project analyzes the structural distinctions among different types of reasoning (e.g., deductive, inductive, causal, analogical) and identifies the representational prerequisites each type demands. while inductive and causal reasoning can often proceed through statistical generalization over perceptual patterns, deductive reasoning requires rule-governed manipulation of internally structured representations. i argue that current LLMs may simulate surface reasoning behaviors, but consistently fail when reasoning tasks demand relational coherence, recursive structure, or concept-level operability. the paper introduces a comparative framework mapping each reasoning type to specific structural requirements (e.g., operability, abstraction, hierarchical preservation), thereby exposing how failures in LLM reasoning stem from deeper representational deficits. this work clarifies that reasoning competence cannot be inferred from performance alone but must be grounded in the internal structure of representations, and it establishes a foundation for formalizing the ontology of reasoning through structural modeling.
- • Hume’s implicit commitment to structural representation (in preparation)
- this paper revisits Hume’s theory of causal judgment to uncover its implicit commitment to structured internal representations. contrary to the common view that Hume’s philosophy is associationist and anti-structural, i argue that his account of ideas, impressions, and revival sets presupposes a content-manipulable internal system in which mental elements can be stored, retrieved, and recombined in structured ways. Hume’s explanation of causal inference, grounded in repeated conjunctions and the mind’s propensity to associate related ideas, relies on the ability to internally represent sequences, relations, and abstract patterns. this capacity is only possible if the mind possesses structured representational units that maintain internal coherence and support systematic manipulation. by reconstructing Hume’s theory through this lens, the paper provides a philosophical foundation for modern debates on representation in reasoning, especially in contrast to the limitations of LLMs, which lack such manipulable content structures. the analysis highlights how a classical theory of mind contains deep insights into the representational architecture necessary for both understanding and reasoning.
- structural limits of current AI
- • collapse before categorization: LLMs’ hidden structural bottleneck (in progress)
- this paper examines a previously overlooked structural bottleneck in large language models: the distortion and loss of crucial representational features before categorization even occurs. i argue that during the encoding and pooling stages of LLM processing, representation compression mechanisms, such as mean or max pooling, flatten rich, high-dimensional features into overly simplified summaries. this process, combined with a systematic salience bias toward the most statistically prominent features, causes models to discard subtle but structurally essential attributes. as a result, even before a model attempts to categorize input or infer generalizations, the raw material needed to support genuine concept formation has already collapsed. the model may still simulate categorization by matching surface patterns, but it cannot form novel, structurally coherent concepts or reason from them. by identifying this failure at the pre-categorization level, the paper reframes current debates about LLM limitations. it suggests that semantic learning alone cannot overcome deeper architectural constraints, and offers a diagnostic lens for understanding why reasoning fails at the very foundation of cognitive processing.
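- as a rough illustration of the pooling point above (a toy numerical sketch of my own, not drawn from any particular model or from the paper): mean pooling over a token sequence discards ordering and relational structure, so structurally distinct sequences built from the same token vectors collapse to identical summaries.

```python
# toy sketch: mean pooling flattens structurally distinct sequences into the
# same summary vector, losing order and relational information.
import numpy as np

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))   # 4 token vectors with 8 features each

seq_a = tokens                     # one ordering of the tokens
seq_b = tokens[::-1]               # reversed ordering: a different structure

pooled_a = seq_a.mean(axis=0)      # mean pooling over the sequence dimension
pooled_b = seq_b.mean(axis=0)

print(np.allclose(pooled_a, pooled_b))   # True: the pooled summaries coincide
print(np.abs(seq_a - seq_b).max() > 0)   # True: the sequences themselves differ
```

- max pooling keeps only the per-dimension extremes and is likewise order-invariant, so it discards the same relational structure.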
- • LLMs cannot understand (planned)
- this paper argues that despite their impressive linguistic fluency, large language models fundamentally lack the structural prerequisites for genuine understanding. unlike human minds, which represent ideas through manipulable, content-rich units organized in structured associative networks, LLMs operate on flattened, context-insensitive token sequences that do not support internal conceptual coherence. i analyze three key representational deficits: first, the absence of manipulable content units, internal elements that can be actively recombined and interpreted; second, the lack of revival sets, memory structures that allow past concepts to be dynamically reactivated and causally integrated into reasoning; and third, the failure to construct the structured associative networks necessary for generalization, abstraction, and reasoning beyond pattern matching. these deficits show that LLMs may simulate understanding but cannot realize it. they process language statistically, without forming internal structures capable of supporting concept-driven inference or belief attribution. this paper consolidates structural critiques across several earlier projects, offering a unified account of why reasoning and understanding in current LLMs remain fundamentally limited.
- • the failure of Bayesian models to build minds (planned)
- this paper challenges the widespread assumption that Bayesian modeling frameworks provide an adequate foundation for understanding and replicating cognition. focusing on large language models that implicitly adopt Bayesian principles, treating learning as probabilistic updating over statistical priors, it argues that such systems can reproduce surface-level regularities but fail to develop the internal structural representations required for genuine reasoning and understanding. i show that this shallow empiricist paradigm leads to a representational gap: models track correlation without constructing meaning. this structural shallowness not only limits conceptual abstraction, but also prevents the integration of inductive and deductive reasoning, perpetuating the long-standing divide between connectionist pattern learning and symbolic inference. the paper further proposes that relying solely on statistical inference mechanisms is insufficient for modeling minds, and calls for a representational rethinking of AI cognition, one that centers on structure, operability, and the functional integration of conceptual frameworks. this critique offers a foundational redirection for future modeling approaches that seek to bridge AI, cognitive science, and philosophy of mind.
- reconstructing rational minds
- • what makes a mind rational (planned)
- this paper proposes a new structural standard for identifying rational agents, grounded in the internal architecture of reasoning rather than in behavior or outcome alone. i argue that genuine rationality requires more than the capacity to produce seemingly logical conclusions: it demands the ability to sustain hierarchical and recursive reasoning structures, and to autonomously engage with and understand one’s own inferential processes. by distinguishing between agents that merely process associative patterns and those that can build and maintain structured reasoning chains, the paper offers a principled criterion for what counts as rational cognition. this approach shifts the focus of rationality assessment away from isolated outputs toward the internal organization and functional transparency of cognitive systems. the argument provides not only a framework for evaluating AI systems, but also a philosophical foundation for distinguishing between systems that merely simulate inference and those that possess the structural capacity for reflective, self-grounded reasoning.
- • having a representation is not enough for having a belief (planned)
- this paper challenges a long-standing assumption in philosophy of mind and cognitive science: that the mere possession of mental representations is sufficient for belief. i argue that this view overlooks the structural and functional demands of belief attribution, which require more than passive content storage. i propose that belief must be grounded in representations acquired through structured, active cognitive processes, including conceptual integration, inferential embedding, and contextual stability. representations must not only exist, but must be internally organized, functionally operable, and reflectively accessible in order to support genuine belief attitudes. by redefining belief attribution in terms of representational structure and acquisition dynamics, the paper distinguishes epistemically relevant beliefs from mere information states. this reframing has implications for how we understand belief in both humans and artificial systems, and sets the stage for evaluating the depth of representational commitment required for rational cognition.
- • on the conditions of rationality (in progress)
- this paper undertakes a structural-ontological inquiry into the conditions under which rational cognition is possible. it proceeds from the premise that understanding, reasoning, and belief attribution are not reducible to behavioral regularities or functional outputs, but presuppose an internal architecture capable of sustaining content-manipulable representations and structurally governed inferential operations. the central aim is to articulate a system of minimal structural conditions, formulated as axioms, that delineate the boundary of rational possibility. these include operability, compositionality, structural preservation, recursive embedment, and semantic integration. by treating rationality as a mode of structurally organized cognition, rather than a pattern of outputs, the paper offers a formal framework for evaluating whether a system can be said to possess the constitutive conditions of rationality.
- testing and computational modeling
- • recursive reasoning and hierarchical inference testing (in progress)
- this project investigates whether LLMs possess the structural capacities necessary for genuinely recursive and hierarchical reasoning. while LLMs can often simulate multi-step inference patterns, it remains unclear whether they internally construct layered reasoning structures or merely approximate such behavior through shallow statistical associations. through a series of behavioral and diagnostic tests designed to probe recursive depth, structure-sensitive inference, and compositional generalization, this project examines whether LLMs can maintain stable internal representations across multiple inferential layers, a requirement for human-like logical cognition. the results aim to clarify where current models begin to fail in supporting deep reasoning, and to provide empirical benchmarks for assessing the structural depth of AI inference capabilities. this testing framework complements the theoretical account of structural prerequisites for reasoning developed in earlier projects.
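- to make the kind of depth-controlled probe described above concrete, here is a minimal sketch (my own illustrative setup, not the project’s actual test battery) that generates implication chains of increasing depth and scores answers by depth; `query_model` is a hypothetical placeholder for whatever LLM interface is used.

```python
# sketch of a depth-controlled recursive-inference probe: build chained
# implications "if A then B", ..., assert the first antecedent, and ask about
# the last consequent; track accuracy as chain depth grows.
import random
import string

def make_chain(depth: int, seed: int = 0):
    """Build one chained-implication item; the gold answer is always 'yes'."""
    rng = random.Random(seed)
    names = rng.sample(string.ascii_uppercase, depth + 1)
    premises = [f"if {a} then {b}" for a, b in zip(names, names[1:])]
    rng.shuffle(premises)                      # premise order gives no cue
    prompt = (". ".join(premises) + f". {names[0]} is true. "
              f"is {names[-1]} true? answer yes or no.")
    return prompt, "yes"

def query_model(prompt: str) -> str:
    # hypothetical placeholder: swap in a real model call here
    return "yes"

def accuracy_by_depth(max_depth: int = 8, n_items: int = 20):
    """Score the model separately at each chain depth."""
    results = {}
    for depth in range(1, max_depth + 1):
        correct = 0
        for i in range(n_items):
            prompt, gold = make_chain(depth, seed=depth * 1000 + i)
            answer = query_model(prompt).strip().lower()
            correct += int(answer.startswith(gold))
        results[depth] = correct / n_items
    return results

if __name__ == "__main__":
    for depth, acc in accuracy_by_depth().items():
        print(f"depth {depth}: accuracy {acc:.2f}")
```

- plotting accuracy against depth then gives a direct, if coarse, behavioral signature of where multi-step inference begins to degrade.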
- • representational categorization and concept formation testing (in progress)
- this project investigates whether LLMs can form new concepts by abstracting structural regularities across input instances, rather than relying on superficial pattern matching or pre-trained label associations. it targets a central challenge in AI cognition: determining whether models can construct generalizable conceptual categories that support genuine inference and understanding. by designing behavioral tests that isolate surface features from deeper relational patterns, the project evaluates the model’s ability to cluster, abstract, and generalize novel inputs based on shared internal structure. these tasks are intended to reveal whether LLMs possess the representational flexibility needed to move beyond memorization and simulate concept-driven reasoning. the findings are designed to complement theoretical arguments about categorization limitations, offering empirical benchmarks for identifying the failure points in concept formation pipelines, particularly before reasoning is even initiated.
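- one way to isolate relational structure from surface features, sketched here purely as an illustration (the items and the `classify` stub are my own assumptions, not the project’s materials): categories are defined by an abstract pattern (ABA vs ABB) over nonce tokens, and the test set uses tokens never seen in training, so membership can only be recovered from the relational pattern.

```python
# sketch of a structure-vs-surface categorization probe: training and test
# items share an abstract pattern (ABA vs ABB) but use disjoint vocabularies,
# so surface matching alone cannot recover the category on the test split.
import random

PATTERNS = {"ABA": (0, 1, 0), "ABB": (0, 1, 1)}

def make_item(pattern: str, vocab, rng):
    a, b = rng.sample(vocab, 2)
    slots = (a, b)
    return " ".join(slots[i] for i in PATTERNS[pattern])

def build_split(train_vocab, test_vocab, n=10, seed=0):
    rng = random.Random(seed)
    train = [(make_item(p, train_vocab, rng), p) for p in PATTERNS for _ in range(n)]
    test = [(make_item(p, test_vocab, rng), p) for p in PATTERNS for _ in range(n)]
    return train, test

def classify(item: str, train) -> str:
    # stub standing in for a model prompted on `train`; it answers by literal
    # repetition structure, i.e. what a structure-sensitive learner should do
    x, y, z = item.split()
    return "ABA" if x == z else "ABB"

if __name__ == "__main__":
    train, test = build_split(["dax", "wug", "blick", "fep"],
                              ["zorp", "kiki", "lum", "tup"])
    acc = sum(classify(it, train) == gold for it, gold in test) / len(test)
    print(f"generalization to novel tokens: {acc:.2f}")
```

- above-chance accuracy on the novel-token split would indicate sensitivity to the relational pattern rather than to memorized surface tokens.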
- • extracting the structural basis of reasoning (planned)
- this project aims to establish a general modeling framework for reasoning by identifying the structural prerequisites shared across different modes of inference, such as deductive, inductive, and analogical reasoning. rather than treating these reasoning types in isolation, the project investigates what cognitive structures and functional capacities are minimally required to support them as generative, recursive, and generalizable processes. it proposes that current empiricist and statistically grounded architectures lack a unifying structural foundation, which limits their ability to scale or integrate across reasoning types. by synthesizing findings from previous projects on representation, categorization, and recursive inference, the project articulates a meta-level account of reasoning as a structure-driven cognitive function, grounded in operability, structural preservation, and representational coherence. ultimately, this work is intended to open a new direction in reasoning-centered modeling, laying the groundwork for future computational implementations that move beyond pattern learning toward models that approximate the formal and conceptual richness of human reasoning.