AI models are possibility space explorers
Status: shelved
Z3 is a solo project that began as a way to scratch my own itch. I started writing fiction last year, and it being 2024, I naturally looked to partner with LLMs. It quickly became clear to me that the existing interfaces for writing with LLMs were poorly suited to writing stories. While the models themselves were helpful, what I really wanted was a deeply integrated assistant in my text editor.
I began with the insight that LLMs are possibility space explorers. The point is not to take the first output, or even the tenth output. The point is to explore the space of possible stories. That is, to explore all the directions a story could take, so that I can make decisions about which is most interesting. Imagination tends to be the limiting factor, and LLMs have imagination in abundance.
I began working on a few experiments and I looked around for users who shared my frustrations. After a few interviews, I developed a conviction that young hobbyists writing in their spare time were probably the best early adopters for a new interface. They would be the most amenable to trying something new, without having sworn allegiance to a dogmatic creative process yet. They were also the least worried about the quality of the writing—which is important, given that chat models continue to struggle with consistently producing high quality writing. From a business point of view, they were also the most numerous, and the most likely to share it with their friends if they liked it.
There were two key problems I discovered among this group. First of all, some form of writer’s block was very common. As I had found in my own writing, imagination was usually the limiting factor. But I also noticed that many writers experienced a lack of motivation arising from writing alone, without an editor or friend to continually give feedback. I could empathise with that—it’s helpful to have someone reflect your ideas back at you and refine your earlier drafts, and this tends to be unavailable to hobbyists without significant resources.
Luckily, LLMs are an excellent fit for both of these problems and they became the foundation of Z3: an exploration of how we may collaborate with our AI friends to write better stories. I believe that this is an important problem, because stories are not just stories. They are a lens through which we understand the present, discover the past and imagine the future. After all, it is not logic that will change the minds of men—so if you have an agenda in the world, you best have a good story to tell.
Generating a completion

The first challenge was to define the interactions with LLMs.
I think of writing stories as taking a path through the space of possible stories. When someone encounters “writer’s block”, it means that they are stuck at a location in that space, unable to continue forging a path. However, intellectualising doesn’t quite capture how it feels. If you haven’t experienced it, it’s a bit like English has become an arcane language and you have become a monkey. You remember fondly a time when words flowed from your fingertips like Jägermeister in Vegas, but your mind is now just banana.
What you need in this situation is something. It doesn’t have to be perfect or final, just a hint of an idea that can get the wheels turning once more. LLMs shine in this context. They can generate an infinite number of next steps forward. Many of them won’t work, but that doesn’t matter, because once you’ve seen enough of them, you’ll have a better intuition for where to go next. Then, by cutting, combining and refining the ideas you have explored, the momentary lapse of imagination will pass.
Completions are a natural fit for this problem. Asking an LLM to write the next 50-100 words is intuitive. But it does present a couple of new problems.
First of all, LLMs can be extremely random. Even if you’re writing a beautiful Elizabethan love story, Claude may decide to introduce a quantum anomaly that opens up a tear in the fabric of spacetime. While this is fun the first time, it quickly becomes tiring. To get around this, completions are activated using a CMD+K menu in which you can provide extra guidance. Guidance could mean plot ideas, character ideas or anything that aligns the model with your thinking.
But even when you provide great guidance, you will almost never take the first output. You probably won’t even take the tenth output. Realistically, you want to explore lots of variations and take pieces from each to construct something new. To make this easier, I opted to deliver completions inline and enable users to cycle through the variations.
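As a rough sketch in TypeScript of how this flow fits together (the names, the prompt wording and the model call are illustrative, not the real implementation):

```typescript
type CompletionRequest = {
  storySoFar: string;   // everything before the cursor
  guidance?: string;    // optional hints from the CMD+K menu
  variations: number;   // how many alternative continuations to generate
};

// `callModel` stands in for whatever request is actually sent to the LLM;
// its shape here is an assumption for the sake of the sketch.
async function generateCompletions(
  req: CompletionRequest,
  callModel: (prompt: string) => Promise<string>
): Promise<string[]> {
  const prompt = [
    "Continue the story below with the next 50-100 words.",
    req.guidance ? `Guidance from the author: ${req.guidance}` : "",
    "---",
    req.storySoFar,
  ].join("\n");

  // Ask for several independent continuations so the writer can cycle
  // through them inline instead of settling for the first output.
  return Promise.all(
    Array.from({ length: req.variations }, () => callModel(prompt))
  );
}

// The editor keeps an index into the variations and swaps the inline
// preview whenever the user presses the "next variation" shortcut.
function nextVariation(current: number, total: number): number {
  return (current + 1) % total;
}
```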
I received good external feedback on this feature. My intuitions that people wanted an easier way to explore seemed correct, and their feedback matched my experience. But it did create a new problem, because we now had many variations to navigate, and it felt like holding a complex data structure in your head.
Tree-based documents

At this point, my thinking began to diverge from the classical text editor, and from anything I’d seen in the wild. I wanted to continue down the rabbit hole of “exploring variation”, but that required a new data structure.
Trees struck me as a natural fit for the problem. Written documents are strictly linear, but each text position could branch in many directions. If I could find a way to make branching feel intuitive, then the user would have a way to hold many versions of the same story in one document. They would be able to go down the rabbit hole all the way into wonderland, and then reverse back to the entrance to check out the other rabbit holes.
Based on this idea, I structured Z3 documents as a tree, where each node represents the content between two points. What users see on screen looks like a normal text editor, but it is really a series of nodes stuck together. By navigating around the tree, users have access to many versions of the same story simultaneously. While I’m convinced I have some form of PTSD from implementing this, I do feel that the result was worth the trouble.
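Stripped down to its essentials, the data model looks something like the sketch below. The names are illustrative and the real nodes carry more metadata, but the shape is what matters: each node holds a span of text plus any number of alternative continuations, and the visible document is one path through the tree.

```typescript
// Each node holds the content between two points in the story, plus any
// number of alternative continuations branching from that point.
interface StoryNode {
  id: string;
  text: string;
  children: StoryNode[];
}

// The visible document is one path through the tree: starting at the
// root, follow the currently selected child at each node and concatenate
// the text along the way.
function renderActivePath(
  root: StoryNode,
  activeChild: Map<string, number>
): string {
  const parts: string[] = [];
  let node: StoryNode | undefined = root;
  while (node) {
    parts.push(node.text);
    node = node.children[activeChild.get(node.id) ?? 0];
  }
  return parts.join("");
}
```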
Of course, the problem is that trees are complicated to hold in your head. I felt that the user needed to be able to see and interact with it like a map if they were going to understand it. So, over a few iterations, I created a fully interactive branching map, complete with all the interactions you would expect: navigation, organisation, merges, creations and deletions. Through them, the user is empowered to explore the space of possible outputs, gradually refining their guidance without ever leaving the text editor.
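Behind the map, the interactions boil down to a handful of tree manipulations, sketched roughly below in the same illustrative terms (merging is more involved and omitted here).

```typescript
// Creating a branch adds an alternative continuation under a node.
function createBranch(parent: StoryNode, text: string): StoryNode {
  const branch: StoryNode = { id: crypto.randomUUID(), text, children: [] };
  parent.children.push(branch);
  return branch;
}

// Deleting a branch prunes a subtree the writer no longer wants to keep.
function deleteBranch(parent: StoryNode, branchId: string): void {
  parent.children = parent.children.filter((child) => child.id !== branchId);
}

// Navigation just changes which child is active at a node; the document
// view re-renders from the new path.
function switchBranch(
  activeChild: Map<string, number>,
  nodeId: string,
  childIndex: number
): void {
  activeChild.set(nodeId, childIndex);
}
```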
The tree has had a mixed reception. Users love the capacity for exploration, and I remain curious about a few examples of branching stories with multiple endings. But it can become complicated over time as outputs accumulate, and some users told me it didn’t suit them. Trees definitely add complexity and require a motivated user with patience for a new interface.
More work is needed to simplify and remove the burden of pruning from the user, so that long stories don’t become confusing.
LLM editing

I felt that I had established a foothold in the problem of expanding imagination. But the other problem remained: writers wanted a patient editor to give continuous feedback.
At this point, chat is hardly an innovation. Even so, it remains extremely useful for talking about ideas and reviewing drafts. Chat is available in Z3 via a sidebar, and can be quickly accessed through a keyboard shortcut that takes any highlighted text as context. This small interaction makes it very easy for the user to ask “what do you think about this specific part?”, while the model retains context of the whole story.
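In rough terms, the prompt is assembled from two layers: the whole story as background, and the highlighted passage as the focus of the question. A simplified sketch, with an illustrative message layout:

```typescript
type ChatMessage = { role: "system" | "user"; content: string };

// The full story gives the model context; the highlighted passage tells
// it what the question is about. The exact layout is an assumption.
function buildChatMessages(
  story: string,
  selection: string,
  question: string
): ChatMessage[] {
  return [
    {
      role: "system",
      content: `You are a patient editor. Here is the full story:\n\n${story}`,
    },
    {
      role: "user",
      content: `About this highlighted passage:\n"""\n${selection}\n"""\n\n${question}`,
    },
  ];
}
```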
Beyond simple chat, editing seemed to be low-hanging fruit. LLMs have a tendency to take your writing and say “Amazing work! But what if we changed every single word?”. What writers actually want is an editor who proposes targeted changes to wording and sentence structure, suggests other small improvements when necessary, but leaves the good bits intact.
When writing code, diffs are used to highlight the sections that changed between two iterations. The mechanism works extremely well, and I have been surprised that nobody has implemented it in a text editor for prose. It seems the obvious way to ask an LLM to edit a document, because it clearly shows what happened. It also comes with a built-in permission step, where the user can accept or reject each change, and it is already intuitive to the millions of people who use it.
Z3 enables LLMs to propose targeted edits to a story inside the chat window as part of a conversation. Edits can include anything from single words to whole sections, and if the user likes a suggestion, it can be visualised as an inline diff with one click. Users then have a secondary permission step through which they can accept or reject the change. This way, there are no mysterious changes and the user remains in full control. Meanwhile, the LLM is fed an updated version of the story behind the scenes so that it knows a change has been made.
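A simplified sketch of how such a proposal might be represented and resolved; the search-and-replace shape and the diff markers are illustrative only.

```typescript
// A targeted edit: the exact passage to change and its replacement.
interface EditProposal {
  search: string;
  replace: string;
}

// Visualise the proposal as an inline diff, marking the old text for
// deletion and the new text for insertion, pending the user's decision.
function previewEdit(story: string, edit: EditProposal): string {
  return story.replace(edit.search, `[-${edit.search}-][+${edit.replace}+]`);
}

// The secondary permission step: accepting applies the change, rejecting
// leaves the story untouched. Either way, the resulting text is what the
// model sees on its next turn, so its context stays in sync.
function resolveEdit(story: string, edit: EditProposal, accepted: boolean): string {
  return accepted ? story.replace(edit.search, edit.replace) : story;
}
```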