Advancing AI By Nesting Minds Inside The Layers Of Machine Learning And LLMs

In today’s column, I examine a fascinating and quite innovative approach to designing and architecting modern AI. Time will tell whether this represents a substantive and compelling change or whether it might be one of many useful but not definitive side-trips on the pathway to truly advanced AI.

The approach goes by the innocuous name of nested learning (NL), yet it is considerably bolder than the name implies. In brief, Google researchers have proposed NL as a means of overcoming the prevailing limitations and constraints of traditional generative AI and large language models (LLMs). They propose, and have built, a prototype named Hope that seeks to work on a self-improving basis, employing continual learning, exhibiting greater computational depth, and consisting of interconnected multi-level layers that optimize simultaneously.

Let’s talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

Human Learning As Inspirational

Before we get into the AI aspects, I’d like to set the stage by discussing various facets of human-based learning. Those aspects will help illuminate the AI approach that will be depicted. I will use a catchy scenario to get the discussion underway.

Suppose that you wanted to train someone about the popular sport of baseball. Assume that the person doesn’t yet know anything about baseball. They are starting from scratch. You are going to teach them and do so on a human-to-human learning basis.

You might proceed by explaining the rules of baseball. There are batters who use a bat to hit a ball. They run from base to base. An opposing team takes the field and tries to catch the ball and tag the runners. And so on.

After you have told them everything you know about baseball, the person being taught will presumably grasp the fundamentals of the sport. One angle is to ensure they know the rules of the game. Another angle is to convey the strategy of playing baseball. There are lots of tricks of the trade that make the difference between a mediocre team and an outstanding team.

Learning Keeps Going

Imagine that the person decides to further pursue their interest in baseball and starts watching MLB games on TV.

Do you think that they will learn more about baseball by doing so?

On the one hand, some might insist that they won’t progress in their know-how about baseball. Whatever they first learned is all that they will ever know. Their mind is frozen in time with respect to what they initially learned about baseball.

Hogwash, you might exclaim, the person is absolutely going to learn more about baseball. By watching games on TV, they are certainly going to identify many additional nuances. The broadcasters will likely reveal insider tips on what the players are doing. A lot of insights about baseball can be gleaned by watching games and contemplating what is going on during the games.

We can safely say that a thinking human is going to learn new things. In this instance, the person’s knowledge about baseball is going to increase. Indeed, they might even be smart enough to correct prior false beliefs that they inadvertently formed when first learning about the sport. All in all, we expect them to be a learner.

In today’s world, we widely embrace the propensity to be a continual learner, a lifelong learner.

Layers To Learning

Is there anything else that the person can learn about baseball?

Sure, the person could learn about the coaching of baseball players and baseball teams.

That is something that might not be obvious when watching games and might not have been directly covered when first being taught about the sport. Being a baseball coach is an entirely different layer of knowing about baseball. It is still obviously immersed in baseball, no doubt about that, but it requires being able to think in a more macroscopic way about the sport and its players.

I’ve got another layer for you to consider. Envision that you were a baseball coach and then got asked to coach other baseball coaches. This is yet another twist on knowing about baseball. You are now using your baseball coaching skills to coach baseball coaches, and they, in turn, will be coaching baseball players and baseball teams.

Whew, that’s a bunch of layers involving thinking about baseball.

Seeing The Big Picture

Let’s do a recap.

A human might learn a particular realm or topic. They could stop there. Period, end of story. The reality is that we usually learn more about any given topic and go beyond what we first learned.

The topics we learn could be conceived of as a series of layers. My baseball example involved the layer of learning about the fundamentals of baseball. The next layer was learning additional nuances about the fundamentals and then extending into the advanced elements of the sport. We didn’t end there. Another layer entailed learning how to coach baseball players and teams. We kept going. The layer above that layer involved coaching other baseball coaches.

You could describe this as consisting of nested layers. There are a multitude of them. Along the way of formulating those layers, we undoubtedly did some amount of optimization. For example, the person might have reorganized or restructured their knowledge about baseball and been able to make it more efficient and effective.

Humans like to keep their knowledge tidy and ready for use (well, sometimes).

Contemporary Generative AI

Shifting gears, I’d like to dive into the nature of contemporary AI.

AI developers craft an LLM by scanning text that exists throughout the Internet. The AI pattern matches the scanned text. As a result of scanning millions upon millions of stories, narratives, poems, and the like, the AI is mathematically and computationally able to seem to be fluent in human natural languages such as English. The AI is essentially mimicking how humans write.

Within the AI is an artificial neural network (ANN). It is a large-scale data structure that contains numeric values. The ANN does the bulk of the work when it comes to representing the pattern matching of the written materials that were scanned.

An ANN is not the same as a true neural network (NN) that exists in your brain, sometimes cheekily referred to as your wetware. The ANN is simplistic and only inspired by some aspects of how the human brain works. I mention this to emphasize that though many in the media tend to equate ANNs with real NNs, it is not a fair comparison. For more details on ANNs and how they function, see my discussion at the link here.
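
To make that concrete, here is a minimal sketch in Python of what "a large-scale data structure that contains numeric values" means in practice: a couple of arrays of numbers and a function that multiplies an input through them. This is purely illustrative; the sizes and values are arbitrary inventions, and a real LLM holds billions of such values.

```python
import numpy as np

# A toy artificial neural network: nothing but arrays of numbers ("weights")
# that mathematically transform an input into an output. Illustrative only;
# not how any production LLM is actually coded.

rng = np.random.default_rng(seed=0)

W1 = rng.normal(size=(4, 8))   # layer-1 weights: just numeric values
W2 = rng.normal(size=(8, 3))   # layer-2 weights

def forward(x):
    """Run an input through the network: multiply, squash, multiply."""
    hidden = np.tanh(x @ W1)   # the nonlinearity lets the net capture patterns
    return hidden @ W2         # raw output scores

x = np.array([0.5, -1.0, 0.25, 2.0])
print(forward(x))  # three numbers; training would nudge W1 and W2 to make them useful
```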

By and large, once an AI developer has done the initial setup of the LLM, it will remain relatively the same until the AI developer comes along to make further changes to it. Most of the common LLMs do not self-adjust in real-time. They are instead adjusted by AI developers, from time to time, and otherwise are relatively static.

The Baseball Example Revisited

When you use a popular LLM such as ChatGPT, GPT-5, Claude, Gemini, Grok, etc., the AI is largely basing what it figures out on the initial data training that originally took place. That’s the main corpus of the pattern matching.

Pretend for a moment that the only scanned content for a particular LLM on the topic of baseball consisted of the rules of the sport. Just the barebones rules. There wasn’t anything available to be scanned about the advanced aspects of baseball. Nor was there any data scanned about coaching baseball players and teams. And so on.

Can the LLM adjust or improve on the topic of baseball?

Please know that I hesitate to ask whether the LLM can “learn” more about baseball, and have instead phrased this updating action as an adjustment or improvement. I do so to try and avoid anthropomorphizing AI.

Allow me to elaborate. The word “learn” is usually associated with humans and what humans do in their heads. AI is not doing this the same way that we do in our minds. In that manner, it is a bit misleading to refer to AI as “learning” – but everyone uses that phrasing anyway since it is convenient. I will reluctantly proceed to use the word “learn” with respect to AI, but now you know that I mean the word as it relates to AI mathematically and computationally, and not to be equated with the magic that (somewhat mysteriously) occurs inside the human noggin.

Having AI Learn More About Baseball

If you were to enter prompts into our pretend LLM and ask about the fundamentals of baseball, you would probably be satisfied with the response. That is what was contained in the initial setup.

But if you ask advanced questions about baseball, the AI will tell you that there isn’t anything else about baseball that it can say. You would almost certainly stymie the AI if you asked how to coach baseball players and baseball teams. This is because there isn’t anything there for the AI to retrieve or rely upon.

You can temporarily overcome this paucity by entering prompts that tell the AI more about the topic of baseball. If the AI is connected to the Internet for web searching, it could also go look up more data about baseball. Another means of infusing data would be to use in-context modeling or RAG (retrieval-augmented generation), which allows you to import documents into the AI as additional data sources. See my explanation about in-context modeling and RAG at the link here.

The thing is that those materials are usually only temporarily utilized by the AI. The LLM isn’t going to on-the-spot permanently “learn” from those inputted aspects of baseball. It will seem to have ingested the data during your conversations, but this isn’t being incorporated on a permanent basis into the totality of the AI system.
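
To illustrate the point, here is a hedged sketch in Python. The names (SimpleLLM, frozen_weights) are hypothetical inventions for this example; the takeaway is only that context supplied during a session shapes that session’s answers without modifying the underlying weights.

```python
# Hypothetical sketch: context shapes one session's answers, but the
# underlying weights stay frozen. All names here are invented for illustration.

class SimpleLLM:
    def __init__(self):
        # Set during initial training; not modified by later conversations.
        self.frozen_weights = {"baseball": "the barebones rules"}

    def answer(self, prompt, context_documents=None):
        # RAG results or pasted documents influence only this call.
        knowledge = dict(self.frozen_weights)
        if context_documents:
            knowledge.update(context_documents)
        return f"Answering {prompt!r} using: {knowledge}"

llm = SimpleLLM()

# Your session: extra coaching material supplied as context.
print(llm.answer("How do I coach a team?", {"baseball": "rules plus coaching tips"}))

# A friend's fresh session: the extra material is gone; the weights never changed.
print(llm.answer("How do I coach a team?"))
```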

The Desire For Robust Learning By AI

If a friend of yours logs into the AI and asks about baseball, the only aspects they will see will be the fundamentals that were gleaned during the initial overall setup. Your conversations about baseball have not automatically caused the AI to update across the board.

We might say that the AI hasn’t been able to learn from your conversations and inputs about baseball. That’s a bummer. It sure would be nifty if the AI could automatically learn and adjust based on the millions upon millions of people interacting with the AI. Imagine the incredible possibilities!

Downsides exist. Suppose the AI learns falsehoods. This could easily happen. Someone tells the AI that in baseball, a player can skip third base and run directly to home plate (that’s not allowed). The AI might be fooled or tricked. Meanwhile, if this is infused in the totality of the AI, the AI will repeat that falsehood to millions of other users. Not good.

Learning is a dicey proposition. That’s why the norm is for AI developers to adjust and improve the AI themselves, refreshing and updating it, in effect doing the learning on the AI’s behalf by guiding the process.

Getting AI To Learn For Real

A cogent argument can be made that contemporary AI is not going to attain artificial general intelligence (AGI) unless we find a suitable means for AI to undertake self-learning (for more about the goals and aims for AGI, see my analysis at the link here). Humans do self-learning. AI ought to do the same.

The self-learning should be immediate and occur in real-time. The self-learning ought to encompass optimization, namely reorganizing and restructuring to accommodate whatever has been learned. Self-learning should be cautious, not willy-nilly, and not easily fooled or bamboozled.

How can we rearchitect the prevailing design and structures of today’s generative AI and LLMs so that self-learning is feasible and meets those criteria?

That valiant question is the focus of an intriguing new research paper written by team members at Google Research. The paper is entitled “Nested Learning: The Illusion of Deep Learning Architectures” by Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, and Vahab Mirrokni, 39th Conference on Neural Information Processing Systems (NeurIPS 2025), November 7, 2025, which made these salient points (excerpts):

  • “Despite all their success and remarkable capabilities in diverse sets of tasks, LLMs are largely static after their initial deployment phase, meaning that they successfully perform tasks learned during pre- or post-training, but are unable to continually acquire new capabilities beyond their immediate context.”
  • “The only adaptable component of LLMs is their in-context learning ability — a (known to be emergent) characteristic of LLMs that enables fast adaptation to the context and so perform zero- or few-shot tasks.”
  • “In this paper, we present a new learning paradigm, called Nested Learning (NL), that coherently represents a model with a set of nested, multi-level, and/or parallel optimization problems, each of which with its own ‘context flow’.”
  • “NL reveals that existing deep learning methods learns from data through compressing their own context flow and explain how in-context learning emerges in large models.”
  • “NL suggests a path (a new dimension to deep learning) to design more expressive learning algorithms with more ‘levels’, resulting in higher-order in-context learning abilities.”

Technical Considerations

For those of you versed in the technical underpinnings of AI, I suggest you consider reading the research paper to get the eye-popping details.

Their viewpoint is that NL provides a new dimension to the design of AI models. For example, they model backpropagation as a form of associative memory, and transformer attention mechanisms are likewise recast as associative memory modules. They use defined frequency rates for when to update weights, which serve as a means of arranging the interconnected optimizations into various levels.
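
For a flavor of what “associative memory” means here, consider the classic textbook construction below: key-value pairs compressed into a single weight matrix via outer products, and retrieved with one matrix multiply. To be clear, this is the standard linear associative memory, offered only as intuition; it is not the paper’s own formulation.

```python
import numpy as np

# Classic linear associative memory: store key->value pairs as a sum of
# outer products, retrieve a value by multiplying the matrix with its key.
# Offered as intuition for the "associative memory" framing; this is the
# textbook construction, not the paper's formulation.

dim = 8
rng = np.random.default_rng(seed=2)

# Three orthonormal keys (rows of an orthogonal matrix) and three values.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
keys = Q[:3]
values = rng.normal(size=(3, dim))

# "Learning" compresses all the pairs into one weight matrix.
M = sum(np.outer(v, k) for k, v in zip(keys, values))

# "Recall" is a single matrix-vector multiply.
recalled = M @ keys[0]
print(np.allclose(recalled, values[0]))  # True: the first value is retrieved
```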

Another novelty is an extension of feedforward ANNs into a paradigm they coin the continuum memory system (CMS). This long-term form of memory is crucial to enabling continual learning. They have constructed a proof-of-concept named Hope that can be used in experiments to gauge how well the approach works and to spur additional enhancements by interested AI developers.
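
Here is a loose Python sketch of the frequency-levels idea: several groups of parameters, each updated at its own cadence, so that fast levels track recent context while slow levels consolidate longer-term knowledge. This is my simplification for intuition only; it is not the Hope architecture, and all names, rates, and sizes are invented.

```python
import numpy as np

# Loose sketch of nested optimization levels: each parameter group updates
# at its own frequency. Fast levels chase recent context; slow levels
# consolidate long-term knowledge. A conceptual toy, not Hope itself.

rng = np.random.default_rng(seed=1)

levels = [
    {"name": "fast (context)",   "every": 1,   "weights": rng.normal(size=4), "lr": 0.5},
    {"name": "medium",           "every": 10,  "weights": rng.normal(size=4), "lr": 0.1},
    {"name": "slow (long-term)", "every": 100, "weights": rng.normal(size=4), "lr": 0.01},
]

def gradient(weights, example):
    # Stand-in gradient that pulls weights toward the example, echoing the
    # associative-memory flavor of compressing the context flow.
    return weights - example

for step in range(1, 301):
    example = rng.normal(size=4)  # the next piece of the "context flow"
    for level in levels:
        if step % level["every"] == 0:
            level["weights"] -= level["lr"] * gradient(level["weights"], example)

for level in levels:
    print(level["name"], np.round(level["weights"], 2))
```

Running the sketch shows the fast level hugging the most recent examples while the slow level barely budges, which is the gist of arranging interconnected optimizations into levels.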

New Architectures To Free Us From Malaise

I’ve repeatedly noted in my column and in my many presentations that we are boxed in when it comes to prevailing AI architectures. Though some believe that we only need to toss more and faster hardware at the existing AI to get it to reach the heights of AGI, I seriously doubt this.

That’s why I embrace out-of-the-box attempts to legitimately discover alternative architectures; see, for example, my coverage at the link here and the link here. New architectures are new beginnings and likely the only way to make demonstrable progress toward massive breakthroughs in AI.

As George S. Patton famously said: “If everyone is thinking alike, then somebody isn’t thinking.” We must be thinking beyond the norm. Whether this latest architectural design is the cat’s pajamas is not yet determinable, but it is heartwarming to avidly pursue the cat’s meow.

Source: https://www.forbes.com/sites/lanceeliot/2025/11/13/advancing-ai-by-nesting-minds-inside-the-layers-of-machine-learning-and-llms/