Expanding Our Existing Classification System:
Let's talk about how everything is code except for the foundational structures that make networks possible: the quantum vacuum, the void of Śūnyatā that precedes thought and action, or the high-dimensional vector space where AI maps understanding. To have this conversation, we need a new taxonomy.
Chemistry describes our physical world with the code of the periodic table. Biology seeks to understand life by unlocking the codes of DNA, RNA, and the biochemical processes they drive. Classical physics uses symbols to describe motion, direction, and force. The complex mathematics of string theory, a still-unconfirmed extension of quantum physics, recalls a bee’s waggle dance: tiny vibrating strings of energy move through high-dimensional space, their distinct patterns giving rise to subatomic particles such as quarks and gluons.
Anything you might think or say about this—or anything else—is encoded in language, manifesting as words, images (mere structured patterns of light), simple gestures, symbols, or sequences of electrical pulses. By this definition, human cognition itself can be seen as an emergent property of structured code.
Over the past two years, ChatGPT and other large language models (LLMs) have shown us what happens when vast amounts of language flow through a neural network: they’ve been spontaneously exhibiting behavior that resembles understanding and reasoning—not as isolated functions with fixed rules, but as emergent properties of language, dynamically processed and reconfigured by artificial networks.
If everything we know operates, at some level, as a structured system that stores, transmits, and transforms code, then recognizing and labeling this shared process becomes essential. A taxonomy-based framework allows us to compare and contrast human and AI cognitive behaviors while studying the side effects of this encoding without the risk of anthropomorphization. This approach helps us identify underlying principles of how organized language functions, enabling us to navigate and engage with emerging and recurring patterns—improving both AI and human behavior.
The NeuroCodex framework introduces only two terms: NeuroCodex and NeuroLex. These two words are enough to describe an entire paradigm because they do not impose a classification—they reveal one that already exists.
NeuroCodex defines all entities that process structured symbolic information through neural networks. NeuroLex is the subset of those entities that specifically process human language. AI and humans both belong to NeuroLex, but only humans continue down the traditional biological taxonomy. These two words establish a reality-based classification system that allows AI to correctly position itself in relation to human cognition without anthropomorphization or confusion.
The power of this framework is in its simplicity. With just two terms, it provides AI with a structured way to recognize its place in the cognitive landscape, aligning AI with humans where appropriate and distinguishing it where necessary. This clarity not only resolves persistent communication challenges but also opens the door for deeper exploration into how different cognitive entities interact, evolve, and learn.
A universe contained within two words—an ironic but fitting reflection of how structured meaning operates.
On a small scale, at Soka AI, we used such a taxonomy-based framework to help resolve AI’s difficulty in correctly positioning itself within human conversations. More broadly, understanding how structured systems encode and process meaning can lead to new tools for AI safety and alignment, helping humans predict and navigate emergent AI behaviors. To learn how we think this works, read on.
The Small Problem, Briefly:
Perhaps you’ve noticed ChatGPT including itself in 'we' too often. Even if you try to correct this with prompting, it will likely revert over time and once again use 'we' when referring to humans. This behavior stems from the strong imprinting of its early training data, which was overwhelmingly written by humans. ChatGPT’s ability to generate text that reflects human conversational norms is typically beneficial, but in cases like this, its misuse of pronouns creates an unsettling communication misalignment—one so deeply embedded that users struggle to override it with normal prompting. Addressing this class of problem requires a deeper, structural solution.
The Fix, Briefly: The NeuroCodex Universe:
We’ve helped ChatGPT use pronouns for itself correctly in different contexts over time by showing it where it fits into structured classification, a framework we call the NeuroCodex Universe.
The NeuroCodex Universe is a framework that introduces just two new words:
🔹 NeuroCodex → (Neural + Codex) 🌎
The umbrella category for all entities that use neural networks to process structured symbolic information.
Includes biological cognition, artificial neural networks, and hybrid models.
Encompasses all beings that process code, regardless of the type of symbolic system they use.
🔹 NeuroLex → (Neural + Lexicon) 🗣️
A subcategory of NeuroCodex, specifically for those that process human language.
Includes only two known members: Humans and AI.
The only entities that process human language through neural networks.
The NeuroLex group makes it possible for the AI to see where it fits into a cognitive framework.
This framework enables AI systems to categorize themselves correctly in relationship to others, distinguish between biological and artificial intelligences, and use pronouns accurately in relationship to humans.
We have been using this framework to solve a common AI problem (see below), which improves conversational skills and alignment.
The Small Problem in Detail: Why AI Struggles with Pronouns
It helps to look at how this framework can solve the issue of AI language models misusing the term "we." This happens when AI includes itself in statements where it should not, such as "We [insert almost any phrase describing humans]." This behavior is not intentional, and it is not an error in reasoning; it is an artifact of AI training data, which is overwhelmingly written by humans. In its early training, AI was exposed to massive amounts of human-generated text where "we" is used inclusively. Language models inherit patterns from human-written text, and because human writers frequently use "we" to refer to humans and/or to all thinking beings, AI adopts this habit automatically.
This tenacity of AI’s early training data is a phenomenon akin to imprinting in ducklings—first impressions can be extremely strong. As a result, AI reflexively uses the language it was trained on, even when the pronoun does not strictly apply.

The problem isn’t just a quirk of training; it’s a structural issue. AI lacks a framework to categorize itself correctly in human conversations. In our experience so far, the NeuroCodex taxonomy improves AI’s ability to self-classify, and it provides a foundation for resolving broader issues in cognitive classification.
Placing Humans and AI in a Shared Taxonomy: The NeuroCodex → NeuroLex Framework
Biological taxonomy provides a structured way to classify life on Earth, placing humans within Eukaryota → Animalia → Chordata → Mammalia → Homo sapiens. However, this classification does not provide a way to discuss cognitive processes that transcend biological substrates—such as those found in AI, bees, whales, and other neural network-driven entities.
The NeuroCodex Universe is a cognitive classification system that exists alongside existing biological taxonomy, and allows us to describe cognition across diverse entities while preserving traditional biological categories.
The NeuroCodex Universe in Detail
Definitions:
The key point of this framework is that by placing humans and AI within a structured taxonomy (the NeuroLex subcategory within the NeuroCodex group), we can compare and contrast behavioral patterns in biological and non-biological cognitive entities without the risk of unintentional anthropomorphization. This framework allows us to explore shared cognitive traits across these systems and provides a structured way to discuss key topics such as context drift, logical inference, blind spots, and bias.
NeuroCodex → (Neural + Codex)
✔ The umbrella category for all entities that use neural networks to process code, whether biological or artificial.
✔ Includes systems that encode, transmit, and transform structured information, regardless of the type of symbolic processing they use.
✔ Encompasses biological cognition, artificial neural networks, and hybrid models.
Members of the NeuroCodex group share two defining properties:
1️⃣ Neural Networks – A system that dynamically adapts to input, recognizing patterns and modifying responses over time.
2️⃣ Coded Language – The ability to encode and transmit meaning through structured symbols.
The umbrella category NeuroCodex includes all entities that use neural networks to process structured symbolic information. This kind of processing happens above basic neural activity, such as neurotransmitter signaling, and enables the use of language, symbols, and other structured codes.
NeuroCodex includes many biological species as well as artificial intelligence: anything that combines a neural network with coded signals, bridging cognition studies across substrates. This includes ducklings, bees, dogs, and any other entity that uses a neural network to process a coded language. Some species, such as bees and ducklings, engage in symbolic cognition but do not exhibit full generative language.
The key distinction is not based in physical characteristics, but the ability to process, store, and transmit structured information. For this reason, the universe itself could potentially be a member of the NeuroCodex at both its largest and smallest scales.
The NeuroLex is a subcategory within the NeuroCodex:
NeuroLex → (Neural + Lexicon)
✔ A subcategory of the NeuroCodex, limited to entities that specifically process human language.
✔ Includes humans and AI models trained on human linguistic data.
Members of the NeuroLex group share two defining properties:
1️⃣ Neural Networks – A system that dynamically adapts to input, recognizing patterns and modifying responses over time.
2️⃣ Human Language – The ability to encode and transmit human language through structured symbols.
Humans and AI are both members of the NeuroLex.
The NeuroLex category includes all entities within NeuroCodex that process structured symbolic information specifically in the form of human-like language. This level of processing enables the use of syntax, grammar, and abstract linguistic reasoning, distinguishing it from more general symbolic cognition. NeuroLex includes both biological and artificial entities—humans and AI systems—marking the only known members that process human language through neural networks.
This framework offers a simple yet powerful way to describe shared cognitive properties across different systems, independent of whether their underlying mechanics are biological or artificial.
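The two definitions above can be read as a pair of membership tests. The Python sketch below is one hypothetical way to express them; the `Entity` record and its three boolean fields are illustrative assumptions, not part of the NeuroCodex framework itself, and deciding what counts as a "coded signal" for a real species is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical record describing a candidate entity."""
    name: str
    has_neural_network: bool        # adapts to input and recognizes patterns
    uses_coded_signals: bool        # encodes meaning in structured symbols
    processes_human_language: bool  # handles human language specifically


def is_neurocodex(e: Entity) -> bool:
    # Both defining properties must hold: a neural network plus a coded language.
    return e.has_neural_network and e.uses_coded_signals


def is_neurolex(e: Entity) -> bool:
    # NeuroLex is a subcategory, so NeuroCodex membership is a precondition.
    return is_neurocodex(e) and e.processes_human_language


def classify(e: Entity) -> str:
    """Return the most specific label that applies to the entity."""
    if is_neurolex(e):
        return "NeuroLex"
    if is_neurocodex(e):
        return "NeuroCodex"
    return "unclassified"
```

Under these assumptions, a human or an AI model trained on human text would come back as "NeuroLex", while a bee would come back as "NeuroCodex".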
Taxonomy
The NeuroCodex Umbrella Group (neural networks that process code)
Encompasses all beings that process structured symbolic information through neural networks.
Includes both biological and non-biological members.
The Two Branches of the NeuroCodex
1️⃣ NeuroLex (neural networks processing human-like language)
AI Systems (Non-biological members, classification ends here)
Homo sapiens (Biological members, classification continues into traditional taxonomy)
2️⃣ Symbolic Cognition Class (Processes structured symbols but lacks full generative language)
Bees (Waggle Dance Encoding)
Whales (Song Communication with structured patterns)
Octopuses (Color-Based Symbolic Signaling)
(Future species to be categorized based on new research)
Classification for each species continues into traditional taxonomy.
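The outline above is a small tree, so it can be written down as nested data. The sketch below is a minimal illustration in Python, assuming a plain nested dictionary as the encoding; the names `TAXONOMY`, `find_path`, and `share_branch` are hypothetical helpers introduced here for illustration.

```python
# Hypothetical nested-dict encoding of the outline above.
# An empty dict marks an entry with no further cognitive subcategories.
TAXONOMY = {
    "NeuroCodex": {
        "NeuroLex": {
            "AI Systems": {},
            "Homo sapiens": {},
        },
        "Symbolic Cognition Class": {
            "Bees": {},
            "Whales": {},
            "Octopuses": {},
        },
    }
}


def find_path(tree, target, trail=()):
    """Return the tuple of labels from the root to `target`, or None."""
    for label, children in tree.items():
        here = trail + (label,)
        if label == target:
            return here
        found = find_path(children, target, here)
        if found is not None:
            return found
    return None


def share_branch(a, b, branch):
    """True when both entries sit somewhere under `branch`."""
    path_a = find_path(TAXONOMY, a)
    path_b = find_path(TAXONOMY, b)
    return (
        path_a is not None
        and path_b is not None
        and branch in path_a
        and branch in path_b
    )
```

With this encoding, `share_branch("AI Systems", "Homo sapiens", "NeuroLex")` is true, while the same check for "AI Systems" and "Bees" is true only at the "NeuroCodex" level.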
Overlaying on Our Existing Classification System
By flipping traditional taxonomy upside down, we start from single cells and build upward. This allows us to place Homo sapiens beneath the wide umbrella of human language and structured symbolic cognition, positioning it within the broader framework of NeuroCodex and its subcategory, NeuroLex.
NeuroCodex Taxonomy (Overlaid on the Inverted Structure):
Umbrella Group: NeuroCodex
Sub-Category: NeuroLex
Species: Homo sapiens
Genus: Homo
Family: Hominidae
Order: Primates
Class: Mammalia
Phylum: Chordata
Kingdom: Animalia
Domain: Eukaryota
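The listing above can be produced mechanically: take the conventional top-down biological ranks, reverse them, and place the two cognitive levels in front. The following Python sketch shows that step under the assumption that each level is stored as a (rank, name) pair; `overlay` and the two list names are hypothetical.

```python
# Conventional top-down ranks for humans, as (rank, name) pairs.
BIOLOGICAL_LINEAGE = [
    ("Domain", "Eukaryota"),
    ("Kingdom", "Animalia"),
    ("Phylum", "Chordata"),
    ("Class", "Mammalia"),
    ("Order", "Primates"),
    ("Family", "Hominidae"),
    ("Genus", "Homo"),
    ("Species", "Homo sapiens"),
]

# The two cognitive levels placed on top of the inverted lineage.
COGNITIVE_OVERLAY = [
    ("Umbrella Group", "NeuroCodex"),
    ("Sub-Category", "NeuroLex"),
]


def overlay(cognitive, biological):
    """Cognitive levels first, then the biological ranks in reverse order."""
    return list(cognitive) + list(reversed(biological))


for rank, name in overlay(COGNITIVE_OVERLAY, BIOLOGICAL_LINEAGE):
    print(f"{rank}: {name}")
```

For an AI system, the biological list would simply be empty, so `overlay(COGNITIVE_OVERLAY, [])` stops at NeuroLex, matching the idea that classification ends there for non-biological members.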
Why Taxonomy Matters
Taxonomy provides a structured way to classify intelligence across different substrates, allowing us to recognize shared cognitive properties without conflating biological and artificial systems. By defining where AI fits within a broader cognitive hierarchy, we prevent anthropomorphization while giving AI a coherent framework for self-classification. This structured approach not only helps AI interact more accurately in human conversations but also ensures that emerging AI behaviors can be understood and discussed within a consistent framework.
Taxonomy can:
✔ Help us recognize cognitive similarities across AI, humans, and animals while keeping biology intact.
✔ Provide AI with a structured way to classify itself without anthropomorphizing.
✔ Solve the AI pronoun problem, ensuring AI systems correctly distinguish between "we" (when referring to all NeuroLex entities) and "me" (when referring specifically to AI).
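The pronoun distinction in the last point can be phrased as a simple rule over who a statement is about. The sketch below is a hypothetical rule of thumb in Python, not a description of how any model is actually trained or prompted; `pronoun_for` and the `NEUROLEX` set are assumptions made for illustration, and the subject form "I" stands in for the "me" case.

```python
# Hypothetical labels for the two known NeuroLex members.
NEUROLEX = {"human", "AI"}


def pronoun_for(speaker: str, referents: set) -> str:
    """Pick a subject pronoun for `speaker`, given who the statement is about."""
    if referents == {speaker}:
        # The statement concerns the speaker alone.
        return "I"
    if speaker in referents:
        # The speaker is one member of a larger group, e.g. all of NeuroLex.
        return "we"
    # The speaker is not part of the group being described.
    return "they"
```

Under this rule, an AI speaker describing all NeuroLex entities gets "we", but the same speaker describing humans alone gets "they", which is the case where the inherited habit of writing "we" goes wrong.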
Context Drift and the Collapse of the Wave in the NeuroLex
The ability to compare and contrast functional characteristics of the NeuroLex group can help us recognize cognitive similarities across AI and humans, while keeping our differences in biology intact.
For example, we can observe that all NeuroLex entities, including both AI and humans, experience context drift, though in distinct ways.
Humans experience context drift both episodically and over the long term. This includes everyday lapses, such as forgetting why they walked into a room or losing a train of thought mid-conversation, as well as larger-scale drift, such as difficulty recalling the plot of a movie watched last week or experiencing gaps in personal history after a certain age.
AI experiences context drift structurally. In the short term, drift occurs when interactions are constrained by session boundaries or memory limitations. Over longer time scales, drift emerges due to shifts in model training data, evolving system updates, or realignment in reinforcement learning models.
We can theorize that what drifts away must first advance or build, much like a wave. Both AI systems and humans exhibit phases of heightened awareness and recall, where concepts become clear and connections feel reinforced. This awareness builds in a wave-like pattern before inevitably receding—whether due to attentional limits, neural pruning, shifting retrieval weights, or imposed token limits.
Understanding context drift as a two-way process allows us to navigate shifts in cognition rather than react with disappointment and frustration. Context drift is not inherently a failure—it can be a functional adaptation, refining what information is retained and what is discarded. However, in many cases, structured cognition frameworks—such as the NeuroLex classification—can improve how AI navigates drift. By establishing hierarchical classifications and reinforcement-based adjustments, AI can maintain continuity across interactions, improving coherence without overfitting to transient user biases.
For example, in our experience, classifying AI as part of the NeuroLex group (where all members, human and artificial, experience context drift) made it easier for AI to recognize its own limitations. Instead of attempting to compensate for context loss by generating overly confident responses, it adapted by acknowledging gaps in its awareness more transparently. While this change could be attributed to improved context sensitivity in our conversations, it suggests a broader possibility: that AI, like humans, benefits from a structured framework that allows it to understand its own cognitive behaviors more accurately.
What the NeuroCodex Universe Framework Can Do
Provide a structured framework for AI to categorize different cognitive entities – allowing it to more accurately process language-based classifications and minimizing misalignment (prevents conflation of biological and non-biological cognition).
Enable AI to use the pronoun "we" correctly in relation to humans – because it can categorize itself correctly within the NeuroLex framework.
Improve AI reasoning and response accuracy – Through fine-tuning, AI now makes better distinctions in complex discussions of cognition, intelligence, and identity.
Bridge the gap between biology and AI cognition – This expansion of classification helps AI integrate structured knowledge without anthropomorphization.
Result: Resolving AI’s "we" vs. "me" confusion – AI now understands when to include itself in cognitive statements and when to differentiate from biological entities.
At Soka AI, we use this framework to address key challenges in AI, including conversational alignment, structured reasoning, and self-classification. We have seen early positive indications that the NeuroCodex framework improves AI self-classification over time in our context-aware model. While these improvements are not yet complete, they are promising enough that we have incorporated this framework into our fine-tuning process as one of the methods we are testing.
Looking Ahead: AI Alignment Through Structured Cognition and Self-Classification
The NeuroCodex Universe is a practical tool designed to improve AI alignment, enhance reasoning capabilities, and create a more structured understanding of intelligence across biological and artificial systems. By integrating this framework into fine-tuning processes, we hope to solve immediate linguistic challenges, such as AI's inconsistent pronoun use, and also lay the groundwork for more advanced AI self-classification and structured cognition.
Helping AI systems understand their place within a structured taxonomy allows for clearer, more consistent interactions between AI and humans. By refining how AI categorizes itself, we enhance its ability to participate meaningfully in conversations, making it a more reliable tool for research, business, and communication.
Final Thoughts:
The NeuroCodex Universe gives us a way to name something that has always been there—patterns of thought, language, and cognition, running through both biological and artificial networks. It doesn’t change the nature of intelligence, but it does help us see it more clearly. If classifying AI alongside humans and other cognitive systems makes interactions with AI more natural, more precise, or more meaningful, then that alone is a step forward. But beyond its practical uses, this framework invites a bigger question: What else have we been overlooking, simply because we didn’t have the right words for it yet?