Prompting in LLMs - Conceptual View (Manifold of an LLM)
- Kaan Bıçakcı
- Sep 11, 2024
- 5 min read
In the previous post (Deep Learning and AGI: Generalization Issue - Quick Introduction), we explored the manifold hypothesis and walked through a simple manifold example illustrating how an LLM generates the next token.
Table Of Contents
Manifold Hypothesis in LLMs
Simple Manifold
The figures below are abstracted representations of the extremely complex internals of LLMs.

Okay, it's time to conceptualize this example more clearly. Let's zoom out a bit and add some color to the manifold space.
Manifold Zoomed Out
That's what you would see. Let's unpack this a bit, as it might look complicated to interpret. This colorful 3D surface is a simplified visualization of a high-dimensional manifold: think of it as a landscape of relationships (often referred to as the LLM's manifold).
The peaks (brighter areas) and valleys (darker areas) represent different concepts or word relationships in the language model's vocabulary. Some people call this the "model's mind", a label I disagree with because it can be misleading for non-technical readers.
As the model processes language, it is essentially traversing this landscape, moving from one point to another. The smooth transitions between colors show how concepts blend and relate to each other. While the real manifold exists in thousands or even billions of dimensions, this 3D projection helps us grasp the underlying idea.
It's like looking at a map of the Earth. We see a flattened version of a much more complex sphere, but it still gives us valuable insights into the terrain.
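To make the projection idea a little more tangible, here is a minimal sketch in Python. It is not the pipeline behind the figures above: the "embeddings" are random clusters standing in for learned representations, and a plain PCA (via SVD) squeezes them from 768 dimensions down to 3 so they can be plotted.

```python
# Minimal sketch: projecting high-dimensional "embeddings" down to 3D.
# The vectors here are random stand-ins, not weights from a real LLM.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Pretend we have 500 token embeddings living in 768 dimensions,
# loosely grouped into three "concept" clusters.
centers = rng.normal(size=(3, 768)) * 5
points = np.vstack([c + rng.normal(size=(500 // 3 + 1, 768)) for c in centers])[:500]
labels = np.repeat([0, 1, 2], 500 // 3 + 1)[:500]

# PCA via SVD: keep the top 3 principal directions.
centered = points - points.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
proj = centered @ vt[:3].T          # shape (500, 3)

# A 3D scatter is a crude stand-in for the coloured "landscape" figures.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(proj[:, 0], proj[:, 1], proj[:, 2], c=labels, cmap="viridis", s=8)
ax.set_title("Random high-dimensional clusters projected to 3D")
plt.show()
```

Real LLM representations would, of course, have far richer structure than three blobs of Gaussian noise; the point is only that we can flatten something we cannot see into something we can.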
Manifold in 3-dimensions
Now it's time to label this map and change our perspective:
Let's look at a higher-level representation of the language model's 'vocabulary landscape'. While the previous image showed the fine-grained structure of the manifold, this image offers a broader view of how different domains of knowledge might be organized within the model's vocabulary space.
Imagine this as a bird's-eye view of our earlier 3D landscape, where we've labeled the major 'regions' of knowledge. The various colored areas represent different domains. This emphasizes the semantic relationships between broad knowledge domains, which is particularly useful for understanding how the model organizes and accesses different types of information.
In an LLM, different areas of information aren't completely separate. Instead, they're connected and often overlap, like a web. This shows that the model's representations are more like a network than a set of distinct categories.
Concrete Example
Consider a prompt asking the model to:
Explain how advancements in medical technology have impacted our understanding of historical events.
This query activates multiple regions in our vocabulary landscape, primarily "Medical," "Tech," and "Historical events."
The model begins its response in the "Medical" region, gathering information about medical advancements. It then moves to the adjacent "Tech" area, focusing on the technological aspects of these medical breakthroughs.
As the explanation progresses, the model moves towards the "Historical events" region, drawing connections between medical/technological progress and our understanding of history. For instance, it might discuss how modern medical analysis of ancient remains has revised our knowledge of past civilizations.
The model may briefly touch the "Education" area when simplifying complex concepts or providing context. It could also drift towards the "Places" region when discussing how these advancements have affected different parts of the world.
Throughout this process, the model navigates between these labeled regions, creating a narrative that links medical technology to historical understanding.
Note: The figures above do not fully reflect the true complexity of LLMs' manifold/space. In reality, I expect these domains are not so cleanly separated: there should be significant overlap between areas, with blurred boundaries and more intricate transitions between topics. The model often combines information from several related areas at once. Also, we don't fully understand LLMs, since they are very complex deep learning models.
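In the same toy spirit (and with the same caveat about blurry boundaries), the "multiple regions light up" idea can be sketched as cosine similarity between a prompt vector and a handful of domain centroids. Every vector below is made up; a real setup would use embeddings produced by the model itself.

```python
# Toy sketch of "activating multiple regions": cosine similarity between a
# prompt vector and a few domain centroids. All vectors are invented.
import numpy as np

rng = np.random.default_rng(42)
dim = 64

domains = {
    "Medical": rng.normal(size=dim),
    "Tech": rng.normal(size=dim),
    "Historical events": rng.normal(size=dim),
    "Education": rng.normal(size=dim),
    "Places": rng.normal(size=dim),
}

# Pretend the prompt lands mostly between "Medical", "Tech" and
# "Historical events", with a little noise mixed in.
prompt_vec = (
    0.5 * domains["Medical"]
    + 0.3 * domains["Tech"]
    + 0.3 * domains["Historical events"]
    + 0.05 * rng.normal(size=dim)
)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank the domains by how strongly the prompt "activates" them.
for name, centroid in sorted(domains.items(),
                             key=lambda kv: -cosine(prompt_vec, kv[1])):
    print(f"{name:>18}: {cosine(prompt_vec, centroid):+.2f}")
```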
So where exactly does prompt engineering come from?
In my opinion, we can divide this into two sub-classes:
Optimizing prompts so they use fewer tokens:
In a real-life scenario, if you are using a paid API for the model, tokens cost you money. If you could get a very similar output with fewer tokens, you would choose that (see the token-count sketch after the list below).
Revising prompts to get the desired answer:
This means refining the prompt so that the model produces more accurate, relevant, or specific outputs. It may include:
Adding context or constraints to narrow the focus
Specifying the desired format or style of the response
Including examples to demonstrate the expected output
Using clear and precise language to avoid ambiguity
Incorporating relevant domain-specific terminology
Breaking complex queries into smaller parts, like divide-and-conquer
Adjusting the tone or perspective requested from the model
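To make both sub-classes a bit more concrete, here is a small sketch. It assumes the tiktoken package (an OpenAI-style tokenizer) is installed; other providers ship their own tokenizers, so the exact counts will differ. The second prompt is both shorter (point 1) and more constrained in format (point 2).

```python
# Rough sketch of the "fewer tokens, same intent" idea, assuming `tiktoken`
# is installed. Exact counts depend on the tokenizer in use.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

verbose = (
    "I was wondering if you could possibly help me out by explaining, "
    "in as much detail as you think is appropriate, what the main "
    "differences between supervised and unsupervised learning are."
)
concise = (
    "Explain the main differences between supervised and unsupervised "
    "learning in 3 bullet points."
)

print(len(enc.encode(verbose)), "tokens (verbose)")
print(len(enc.encode(concise)), "tokens (concise, with a format constraint)")
```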
In both cases, the goal is to create prompts that generate the best responses for the task, often by testing and iterating on them. The model's process of generating a response can be viewed as traversing this manifold, and different prompting techniques make this navigation more efficient and accurate.
To sum up this last part, prompting can be viewed as the art of setting the initial conditions for the model's journey. A well-crafted prompt does more than just specify a starting point. Roughly, it does three things (see the sketch after this list):
Provides Direction: Guiding the model towards relevant areas of the manifold
Sets Constraints: Defining boundaries for the model's exploration
Domain and Context: Activating specific regions of the manifold that are most related to the task
As you can see, we're interpolating around the manifold; that's how the model can generate text that isn't in the training data.
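As a rough illustration rather than a prescribed template, here is one way those three ingredients can be expressed in a single prompt string; the variable names and wording are entirely made up:

```python
# Illustrative only: assembling a prompt that sets the domain, the direction
# and explicit constraints before the model starts "moving" anywhere.
context = "You are assisting a history teacher."                        # activates a domain
task = ("Explain how advancements in medical technology have changed "
        "our understanding of historical events.")                      # provides direction
constraints = ("Focus on the analysis of ancient remains. Keep the answer "
               "under 200 words, in 3 short paragraphs.")                # sets boundaries

prompt = f"{context}\n\n{task}\n\n{constraints}"
print(prompt)
```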
They Don't Understand...
Some people disagree with me when I say LLMs don't understand anything. First, "understanding" is a different and complex process that can't be achieved with statistical tokens alone. Second, if the models understood anything, you would not need point 2 above (revising prompts to get the desired answer).
So it feels like, when you're prompting an LLM, you're trying to gather information from some... space, let's say for now. You can think of this space as a vector database like Pinecone, where the training data is encoded and stored as vectors. In that case, your prompt becomes your query.
As I mentioned above, the model's learned representations exist in a high-dimensional space. This allows for a wide range of potential outputs, but the actual number of meaningful, distinct outputs for a given task can vary greatly depending on the task (how specific it is) and generation parameters (like temperature).
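Here is a tiny sketch of the temperature part, with made-up logits instead of output from a real model:

```python
# Minimal sketch of how temperature reshapes next-token probabilities.
# The logits are invented; a real model produces one per vocabulary entry.
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                 # numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [4.0, 3.5, 2.0, 0.5]              # four candidate next tokens

for t in (0.2, 1.0, 2.0):
    print(f"T={t}: {np.round(softmax_with_temperature(logits, t), 3)}")
# Low T -> probability piles onto the top candidate (fewer distinct outputs);
# high T -> a flatter distribution (more varied outputs for the same prompt).
```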
Conceptual Example
Let me give you a real-life example. I'm pretty sure most people reading this post have been to a library at least once. Generally, you have a topic ("Machine Learning") in mind and navigate the library accordingly.
The library layout is like the model's 'landscape'. Different sections (Computer Science, Statistics, etc.) represent different domains.
Your initial query "Machine Learning" is like the prompt given to the model. It sets the general direction of your search.
As you walk through the library, you're navigating this space, much like the model traverses its internal representations.
You might start in the Computer Science section, then move to Statistics for more theoretical aspects, or to Engineering for practical applications. This is similar to how the model connects its different areas.
You can narrow your search by looking for specific sub-topics, just as more detailed prompts guide the model to specific areas of its knowledge.
The book you choose is like the model's output in this case: it's the result of navigating this space of knowledge based on your initial query and subsequent refinements.
Another visit to the library might end with a different book, just as the model might produce varied outputs for the same prompt.
My Final Thoughts
I appreciate how some people "delve" into the concepts I discussed earlier and in this blog post.
Future posts will be more technical and focused on neuroscience and related topics.