Building Ghost Narrator: Open-Source AI Blog Narration

A note: when you shop through links in this post, I earn a commission at no extra cost to you. It doesn't affect what I recommend.

Ghost Narrator is the first open-source project I've ever published. It wasn't supposed to be public.

I built it for Founder Reality. We have hundreds of blog posts, and I wanted a listening experience for people who'd rather hear an article than read it.

So we built a self-hosted narration pipeline: a local LLM rewrites articles into natural-sounding scripts, and a voice model reads them in my cloned voice.

No API keys. No monthly bill.
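Here's a minimal sketch of that two-stage shape in Python. It is not Ghost Narrator's actual code: the names `narrate`, `Rewriter`, and `Synthesizer` are made up for illustration, and both stages are stubs so the example runs without any model installed. In a real setup, the first stage would prompt a local LLM and the second would call a local TTS model.

```python
from dataclasses import dataclass
from typing import Callable

# Both stage types are hypothetical stand-ins, not Ghost Narrator's real API.
Rewriter = Callable[[str], str]       # article text -> spoken-style script
Synthesizer = Callable[[str], bytes]  # script -> encoded audio bytes


@dataclass
class Narration:
    script: str
    audio: bytes


def narrate(article: str, rewrite: Rewriter, synthesize: Synthesizer) -> Narration:
    """Run the two stages in order: rewrite for the ear, then synthesize."""
    script = rewrite(article)
    return Narration(script=script, audio=synthesize(script))


def stub_rewrite(article: str) -> str:
    # A real stage would prompt a local LLM; this stub only collapses whitespace.
    return " ".join(article.split())


def stub_synthesize(script: str) -> bytes:
    # A real stage would call a local voice model; this stub just encodes the text.
    return script.encode("utf-8")


result = narrate("Hello,\n\nworld.", stub_rewrite, stub_synthesize)
print(result.script)  # Hello, world.
```

Keeping the stages as plain callables means either one can be replaced without touching the other.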

Almost signed up for ElevenLabs to narrate my blog. $330/month.

Then I tried running an open-source model on my own laptop. Qwen 3.5 14B.

Sounds fine. 200 posts a month. Costs me electricity.

I almost paid $4,000 a year to rent a model I can run myself.

Most AI subscriptions…
— George Pu (@TheGeorgePu) March 27, 2026

That's probably how most good ideas start. Not from a brainstorm, but because you actually need the thing. You build it for yourself, and then you share it.

The reaction on X was stronger than I expected — over 2,000 likes on a tweet about almost signing up for ElevenLabs.

We were genuinely considering it. ElevenLabs was the standard. It was what companies used for authentic audio at scale.

Then we ran an open-source model on a laptop and it sounded fine. So we never signed up.

The Name

Ghost Narrator is named after Ghost, the open-source blogging framework we use for Founder Reality. They deserve credit.

They open-sourced their platform before open-source was popular, they've maintained it for years, and they make money through a premium hosted plan. That's a model worth respecting.

The narration system we built works with any blog, though. No restrictions.

The Angry Email

A few days after I tweeted about it, I got a long, detailed email from a reader. Semi-angry. His position: it's rude to show something cool and not share the code.

We were already planning to open-source it, so the timing worked out. Packaged it up, pushed it to GitHub, announced it as a reply on X.

Within a day or two — dozens of stars, some forks. People were actually using it.

That was a good feeling.

UPDATE: Open-sourced it.

Ghost Narrator - self-hosted AI narration.

LLM rewrites your articles into scripts, voice model reads them. Runs on a Mac.

No API keys. No monthly bill. MIT license. https://t.co/xOUzbaJp2c
— George Pu (@TheGeorgePu) March 31, 2026

Check Your Licenses

This is the biggest lesson from the whole process.

We originally used Fish Audio for the text-to-speech layer. Fish Audio is based in California, and their model works well.

But the license is clear: personal and research use is fine. The moment it's commercial, you need a paid license.

Fair enough. Companies have to make money, even if they open-source their projects. I just caught it late during due diligence.

Here's the thing — two days before we open-sourced Ghost Narrator, I got community-noted on X for tweeting that Mistral's new TTS model was "free for everyone."

The community note said: you didn't mention the license. They were absolutely right.

Mistral just open-sourced a text-to-speech model that beats ElevenLabs.

3 GB of RAM. Runs locally. Free.

The thing people were paying per-word for last year runs on your laptop now. pic.twitter.com/FPsuLMXqlG
— George Pu (@TheGeorgePu) March 28, 2026

After catching the Fish Audio license issue, we started replacing it with Qwen3-TTS.

Alibaba's Qwen team released it under Apache 2.0 — fully permissive, commercial use included. Alibaba doesn't need to monetize TTS.

They care about adoption of their large language model ecosystem. Makes sense strategically.

By the time you're reading this, Ghost Narrator should be running on Qwen3-TTS entirely. All Founder Reality narration should be generated with it.

But we came close to shipping a commercial product on a non-commercial license without realizing it. That's the kind of mistake that compounds quietly until it doesn't.

The lesson: check every dependency's license before you build on it. Not after.
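For the Python side of a project, a rough first pass can be automated. The sketch below uses the standard library's `importlib.metadata` to read the license each installed package declares, then flags anything that doesn't look permissive. The allow-list and the helper names (`declared_license`, `needs_review`, `audit`) are my own assumptions for illustration. It's a heuristic, not legal advice. It also says nothing about model weights, which carry their own license in the model card, and that is exactly where we got caught.

```python
import re
from importlib.metadata import distributions

# Assumed allow-list for illustration only; pick your own with proper advice.
PERMISSIVE = re.compile(r"\b(mit|apache|bsd|isc)\b", re.IGNORECASE)


def declared_license(meta) -> str:
    """Return the license a package declares, or 'UNKNOWN' if it declares none."""
    # Newer packages use License-Expression; older ones use License or classifiers.
    for field in ("License-Expression", "License"):
        value = meta.get(field)
        if value and value.strip().upper() != "UNKNOWN":
            return value.strip().splitlines()[0]
    classifiers = meta.get_all("Classifier") or []
    found = [c.split("::")[-1].strip() for c in classifiers if c.startswith("License ::")]
    return "; ".join(found) or "UNKNOWN"


def needs_review(license_text: str) -> bool:
    """Flag anything that does not mention a license on the allow-list."""
    return PERMISSIVE.search(license_text) is None


def audit() -> list:
    """List (package name, declared license) pairs that a human should read."""
    flagged = []
    for dist in distributions():
        meta = dist.metadata
        if meta is None:  # skip installs with missing metadata
            continue
        lic = declared_license(meta)
        if needs_review(lic):
            flagged.append((meta.get("Name", "?"), lic))
    return sorted(flagged)
```

Calling `audit()` in the project's virtual environment returns the packages worth reading by hand. Anything marked `UNKNOWN` means the package didn't declare a license in its metadata, not that it's safe.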

Fun Fact: Beehiiv

Quick aside. Beehiiv, the newsletter platform, used to offer narration for free. Then one day they changed the rules: only Max subscribers (their highest tier) get to keep the feature.

Another reminder of why you should self-host what matters to you. When someone else controls the infrastructure, they control the terms.

They can change them whenever they want, and the only thing you can do is accept it or leave.

The Diminishing Moat

Here's what I keep thinking about.

Imagine you're Mistral or Fish Audio. You spent millions developing a TTS model. You drop it in March 2026.

But two months earlier, in January, Qwen dropped their Apache 2.0 audio model. And it's good enough.

The question isn't even which model is better. It's that the gap between them is shrinking with every release. If Qwen3-TTS isn't quite as good today, what about Qwen 3.5? What about 4? What about the model from a completely different team six months from now?

This should terrify any company whose business model is selling access to a model. The rational response is still to sell SaaS, sell credits, sell bundles.

But you're charging for an asset with diminishing differentiation. Not zero value — just less of it with each passing quarter.

Switching Costs Are Collapsing

The switch from Fish Audio to Qwen3-TTS took us a day or two. Rewrite the API routes, update the configs, regenerate the audio files. Done.

That's it. A day or two to swap out the core technology in our pipeline.
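One way to keep a swap that cheap is to hide each voice model behind the same small interface and pick one by name in config. The sketch below is illustrative only and is not how Ghost Narrator is actually wired: the registry, the `tts_backend` config key, and both backend functions are hypothetical, and the backends return placeholder bytes instead of calling a real model.

```python
from typing import Callable, Dict

# Hypothetical interface: every backend turns a script into audio bytes.
TTSBackend = Callable[[str], bytes]
BACKENDS: Dict[str, TTSBackend] = {}


def register(name: str) -> Callable[[TTSBackend], TTSBackend]:
    """Decorator that adds a backend to the registry under a config-friendly name."""
    def wrap(fn: TTSBackend) -> TTSBackend:
        BACKENDS[name] = fn
        return fn
    return wrap


@register("fish-audio")
def fish_audio(script: str) -> bytes:
    return b"fish:" + script.encode("utf-8")  # placeholder, not a real model call


@register("qwen3-tts")
def qwen3_tts(script: str) -> bytes:
    return b"qwen:" + script.encode("utf-8")  # placeholder, not a real model call


def synthesize(script: str, config: dict) -> bytes:
    """Look up the configured backend and run it; fail loudly on a typo."""
    name = config["tts_backend"]
    if name not in BACKENDS:
        raise ValueError(f"unknown TTS backend {name!r}; choose from {sorted(BACKENDS)}")
    return BACKENDS[name](script)


config = {"tts_backend": "qwen3-tts"}  # switching models is a one-line change
print(synthesize("Hello", config))  # b'qwen:Hello'
```

With this shape, the rest of the pipeline never learns which model produced the audio. The remaining work in a swap is writing the new backend and regenerating old files.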

I understand brand moats still exist in theory. But when the switching cost is a day of engineering work, the brand has to carry a lot more weight than it used to.

When people hear about a new TTS model, nobody's first reaction is "I need to use this." It's: what's the benchmark? What's the license? Can I run it locally?

Those questions get easier to answer with every release. Eventually, there's nothing left to differentiate on.

Related: I stopped using ChatGPT in January. I'm fine. I use Claude, I use local models, I use whatever gets the task done.

I genuinely do not care whether GPT Pro is better than Claude Opus on some benchmark. My work gets done. That's the only metric that matters.

That tells you everything about where model loyalty is heading.

The Cost of Open-Sourcing

Near zero. And the upside is compounding.

Open-sourcing Ghost Narrator didn't cost us revenue — we were never going to sell it. It cost us a few hours of cleanup and a README.

In return: GitHub stars, forks, credibility, inbound interest, and a portfolio that grows with each release.

For us, open source isn't charity. It's infrastructure. Every project we release makes the next one more credible. People check the GitHub profile, see multiple projects, and take the whole thing more seriously.

We'll keep open-sourcing the tools we build internally. Ghost Narrator was the first. It won't be the last.