Backend: Down Into the Pipes
Five years from a blinking cursor to the command line and back into the director's chair. Welcome to a special mini "Backend" episode of Films Not Made.
HOW I’M MAKING THE TRAILERS NOW
Sometime in 2021, the future of my creative life showed up - not with a thunderclap, not with the heavenly host, but as a blinking cursor in a text box.
I’d been trying for a while to get early access to some AI models to see what I could do creatively with them. It turned out I had also signed up for a spot in the beta for Sudowrite, which ran on one of the early OpenAI GPT models. This was pre-ChatGPT, back when “large language model” still sounded like something unknown. And I got access. Basically, you fed it a paragraph; it handed you the next one. You could also rewrite, expand, and do a few more actions. And most of what came back was interesting…
Kind of like a charming stranger at a party describing in detail a book he has plainly never read.
But it was surprising and I could feel the energy in it, something with a little static of life in it, and sometimes I’d even feel the floor tilt a few degrees beneath me. It was also like watching the weather come across a plain: you could sense what it was going to become before it got there. I immediately used it to make a pilot episode of a series I have yet to release (more on that later).
The images came a couple of years after the text box. Then, more slowly and far more strangely, the video - those early clips that melted at the edges, where a face would reorganize itself between frames. A year ago Amy and I started conceiving Films Not Made, and back then the video models were just getting manageable. Now a new one lands every couple of months, each arriving like more weather and quietly retiring some elaborate workaround I’d spent hours dialing in.
The universe gives; the universe takes; and lately it has been giving at a rate that makes a person dizzy.
For a long time the work happened inside the platforms - long, swampy chat threads, prompts copy-pasted into a browser like a person throwing bullets at a moving target - a conversation I had to babysit and feed until it grew so heavy with its own history that the whole apparatus began to forget. I was a user. I was, God help me, clicking. Command-C, Command-V.
Now I’m mostly in the terminal, or even more so the native app (same diff), and the machine does the clicking. I describe a shot in plain English - out loud, more and more, because I’ve become an unembarrassed addict of Wispr Flow and would frankly rather talk than type - and the system writes the prompt, talks to the image and video engines through their APIs or calls another system via CLI, and sets the finished frame down on my drive. The browser, the copy-paste, the context limits: gone, or going. We’re climbing out of the platforms and down into the pipes.
Which puts me back in the chair I prefer to sit in. Not operator. Not power user. Director. Asking for a shot, looking hard at what comes back, and saying the oldest sentence in the business: again, but -
The thing doing the clicking, I should say, isn’t raw Claude Code. It’s Claude Code wearing a “harness” - Daniel Miessler’s PAI, Personal AI Infrastructure, a scaffolding of skills and agents and memory that takes a general-purpose coding tool and teaches it my projects, my taste, and the very specific way I want a hallway to feel the second a man realizes he is not alone in his own house. On top of it I’ve built my own floor: character-sheet workflows, trailer-outline skills, a private grammar of prompts for holding a period look steady. The model is the engine. PAI is the car I actually get to drive - and the car, it turns out, matters more than the horsepower.
Some of this runs on luck I try not to take for granted. I’m in the OpenAI Artist Program, which is how GPT Image 2 landed in my hands early, and it’s now become the spine of the whole image pipeline - particularly for the single hardest problem in this entire craft, which is keeping a character’s face the same face from one shot to the next, a problem that has humbled better filmmakers than me and will go on humbling all of us for a good while yet.
Here’s the stack as it stands today, fully aware it may be out of date by Labor Day:
Director’s chair / harness: Claude Code, wrapped in PAI
Input / dictation: Wispr Flow - so I’m mostly talking, not typing
Image generation: GPT Image 2, with OpenAI Codex in the loop
Video generation: Kling 3.0 (sometimes Seedance 2.0)
Music & score: Suno
API layer: kie.ai - serving both Kling 3.0 and GPT Image 2 (when needed), so it’s all command line, no browser
Assembly: Adobe Premiere Pro, still and forever
So. I’m going to start showing the backend. This is the first one - I filmed myself building the climactic sequence of the trailer for Joe Berlinger’s upcoming episode, wherein we reimagine The Little Fellow in the Attic: the man who’s been living in the attic for years, and the husband who finally senses him in the house. It’s unvarnished on purpose: the outline that isn’t good enough, the character who keeps drifting into a different woman, the shot that only works if I cut in late. This is what the work actually looks like with the cover off.
TRANSCRIPT
So I wanted to start documenting how things have changed in terms of how I’m creating both the deck and the trailer for Films Not Made. Right now I’m in the middle of making the trailer for Joe Berlinger’s The Little Fellow in the Attic, and the workflow is fundamentally different from when I started doing this - which was not even a year ago. The tools keep changing, the technology keeps changing.
As you can see, I have Adobe Premiere open on screen, because that’s still where everything gets assembled. But in terms of generating the individual shots, the workflow looks very different now. I’m working inside Claude Code, in a Warp terminal, and I’m in the middle of assembling the sequence.
I do have a trailer outline that was generated through a workflow - a set of skills we’ve built over time inside Claude. But honestly, what usually happens is the outline comes out and it’s just not good enough. It doesn’t follow the actual principles of filmmaking. Once I step into it, I’m changing most of it. So think of it as a really rough first draft - a starting point, not a finish line.
Right now I’m putting together a climactic sequence where the husband, Fred, comes home and finally senses that someone else has been in the house. Apparently this man has been living in the attic for years - and having an affair with his wife. Fred figures it out. So we have: her in bed, the doorknob turns, the attic ladder retracts, she turns. Fred says, “Who’s been in this house?” Then we see the attic door close, the light disappearing from the hatch, and then they’re running up the stairs.
The next shot I needed was the hallway with the attic hatch. I’d already generated a first-frame image for it. The workflow I’m using has been around for a while: you create a first frame, then send it to a video generation engine to animate from there.
For image generation, I’ve been using GPT Image 2 inside ChatGPT. For a while I was leaning heavily on Gemini - what we’ve been affectionately calling “Nano Banana Pro” - but I’m getting better results now from GPT Image 2. I had some early access to it through the OpenAI Artist Program.
Here’s how the pipeline works: from inside Claude Code, I send instructions - based on the trailer outline and the character sheets we built - through to OpenAI Codex via the command line interface. It generates the image inside GPT Image 2, then brings it back and lands it on my machine.
When I started doing these trailers, everything happened inside the chat interface. I’d have an incredibly long ChatGPT conversation, going back and forth, generating images right there in the thread. And sometimes I’d send them out to Gemini. But when a chat gets that long, it gets slow and boggy. You hit context limits. The platforms have gotten better at compaction - basically creating summaries to free up the context window - but it still adds friction. The code interface handles this better, and I’ll get into why at some point.
So - back to the shot. I go into the terminal and just describe in plain English what I want: the attic door closing, the light disappearing from above. Claude Code writes a JSON prompt file, sends it to Kling through their API via kie.ai, and fires it off. I don’t have to copy-paste anything into a browser. It reads the API documentation, the Reddit posts, the AI filmmaking blogs - I’ve pointed it at all of that - and aligns its approach with my own criteria for what I want out of a shot.
It reports back: image uploaded, JSON submitted, shot is generating. And it comes back fast.
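A minimal sketch of the "JSON prompt file" step described above, in Python. This is not the actual harness and not the Kling or kie.ai request schema: the ShotRequest class, its field names, and the output folder are all hypothetical, chosen only to show the shape of the idea. It writes the file and stops; the upload and submission calls are left out because their details depend on the provider's own documentation.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class ShotRequest:
    """One image-to-video job. Every field name here is hypothetical;
    a real video API defines its own schema."""
    shot_id: str
    first_frame: str            # path to the approved first-frame image
    prompt: str                 # plain-English description of the motion
    duration_seconds: int = 5   # assumed default, not a documented value
    negative_prompt: str = ""   # things the shot should avoid


def write_prompt_file(shot: ShotRequest, out_dir: Path) -> Path:
    """Serialize the shot to <out_dir>/<shot_id>.json and return the path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{shot.shot_id}.json"
    path.write_text(json.dumps(asdict(shot), indent=2), encoding="utf-8")
    return path
```

Calling write_prompt_file(ShotRequest("hatch_close", "frames/hatch.png", "The attic hatch closes and the light disappears from above"), Path("prompts")) would leave a prompts/hatch_close.json on disk, which a separate step could then submit. Keeping the prompt as a file rather than a chat message means each shot has a record that can be edited and re-sent.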
The hatch is closing. That’s usable. I’ll cut in late so we get the tension of it closing while Fred is already running up the stairs.
Over in Premiere, I import the shot, trim it down - just enough to land the moment - and we’re already looking at a cut. It needs a beat more to let the light disappear properly, but it’s close.
The next shot is Fred running up the stairs, with Walburga behind him saying, “No, Fred.” I generated both a low angle and a high angle. The problem I kept running into: character consistency. The woman in the shot wasn’t matching our character sheet - and character consistency from shot to shot has always been the hard part of AI filmmaking. It’s getting easier, but it’s still the thing you fight.
Here’s how I’m addressing it: when we build the deck, we create reference character sheets using GPT Image 2. We take the casting image - the actor our AI pipeline picked for the role - and run it through a prompt that generates multiple poses, expressions, costumes, and lighting variations. That sheet is what we feed back into the image engine for every subsequent shot. It gives the model something concrete to anchor on.
So when the high-angle staircase shot didn’t look right, I sent the instruction back: redo that shot, use the character sheet as the primary likeness anchor, do not generate a different woman, do not drift her facial features. It takes all the approved reference images - her by the window, looking up, Fred’s face, the staircase geometry - and bundles them back into GPT Image 2 through Codex. Then I wait.
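A rough Python sketch of how a redo instruction like that might be bundled, assuming the character sheet should always travel first in the reference list. The RedoRequest class, its fields, and the file paths are hypothetical and for illustration only; nothing here reflects how Codex or GPT Image 2 actually accept reference images.

```python
from dataclasses import dataclass, field


@dataclass
class RedoRequest:
    """A hypothetical bundle for regenerating one shot."""
    shot_id: str
    character_sheet: str                                        # primary likeness anchor
    supporting_refs: list[str] = field(default_factory=list)    # approved frames, set geometry
    constraints: list[str] = field(default_factory=list)        # phrased as things not to do

    def reference_order(self) -> list[str]:
        """Character sheet first, then supporting references, without duplicates."""
        ordered = [self.character_sheet]
        for ref in self.supporting_refs:
            if ref not in ordered:
                ordered.append(ref)
        return ordered

    def instruction_text(self) -> str:
        """Build the plain-English instruction that accompanies the images."""
        lines = [
            f"Redo shot {self.shot_id}.",
            "Use the first reference image as the primary likeness anchor.",
        ]
        lines += [f"Do not {c}." for c in self.constraints]
        return "\n".join(lines)
```

With constraints such as "generate a different woman" and "drift her facial features", instruction_text() produces one "Do not ..." line per constraint, and reference_order() guarantees the sheet leads the list even if it was also passed in among the supporting images.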
Meanwhile, I already approved the low-angle version of the staircase shot. Back in Premiere, I drop it in and watch it in context. There’s too much headroom, but once I cut to the high angle when it lands, it’ll work. Fred charges up the stairs. Then we’ll cut to Otto - already have a shot of him up in the attic room, standing there. Low angle to high angle, then Otto. Then cut to black.
And from there, three shots - probably a shot of the gun being thrown away, arcing toward the La Brea Tar Pits. Except it didn’t land in the tar pits. It’s a crazy story. You’ll have to watch the episode.
That’s how I’m making the trailers now. There’s a lot more I’d love to get into - the harness I’ve built around Claude Code, how the skills stack together, what changed between episode one and where we are now. That’ll come in time.
Till then - thanks for watching Films Not Made.
EPILOGUE & NOTES
If you want the why beneath all this how, its conscience lives in a longer essay I wrote called The Left-Handed Blow: Art in the Age of AI Reproduction - Walter Benjamin, the aura, the contracting industry, and the uncomfortable admission that what I really am, sitting here, is the meat between the chair and the keyboard.
As for that pilot episode I mentioned at the top - it’s called “Disaster On Deneb” and the series may or may not have a name. Anyway, it incorporates documentary audio with AI-assisted writing, which at the time was novel. Right now it’s all audio, but guess what - I’m adding video. Stay tuned.






