We Are No Longer Building a Canadian Legal AI Model

Yesterday I published the post-mortem: we asked flash-1-mini ten questions any Canadian lawyer would consider basic, and it invented seven citations.

That post was about what broke.

This one is about what it changed.

Because two weeks ago I was telling people we were building a Canadian legal AI model - and today we decided we're not.

I want to walk through why, because the answer changed how I think about what "building AI" actually means.

The original scope

The original scope made sense on paper.

Take a 9-billion-parameter foundation model.

Train it on a curated corpus of Canadian legal materials: statutes, regulations, case law, parliamentary debate, the Civil Code of Québec (CCQ).

Build a benchmark to evaluate it on the questions a junior lawyer or paralegal would actually ask.

Ship it open weights, Apache 2.0, in September.

The pitch wrote itself.

Canada doesn't have a sovereign foundation model trained on Canadian law.

American models hedge on section 8 of the Charter and invent CCQ provisions.

Quebec professionals operating in French get half-translated common-law reasoning back.

Federal compliance officers can't deploy frontier models without breaching data residency requirements.

There's a real gap. We were going to fill it.

Then three realities collided.

Reality 1 - models trained on cases still hallucinate

If you read yesterday's post, you saw the receipts.

The model reversed the holding in R. v. Oakes - the case every Canadian law student learns, famous precisely because the law failed the test Oakes created.

It invented a 2014 Hufsky that doesn't exist. It fabricated a Quebec statute.

This is not a flash-1-mini problem. It's a foundation model problem.

The architecture stores every fact as a probability distribution over tokens, not as a record it can look up.

Bigger models reduce the hallucination rate. They don't eliminate it.

For most uses, that's fine.

A model that drafts emails needs fluency, not verbatim Charter holdings.

Legal work is different.

The whole point of a citation is that the lawyer reading it can pull the decision and check.

If the model invents the citation (or gets the case right and the holding backwards, as it did with Oakes), the lawyer has built an argument on a paragraph that doesn't exist.

Discovered by opposing counsel, that's a sanctionable error.

Discovered in court, it's a career-ending one.

The integrity bar for legal AI is not "sounds plausible."

It's "every claim traces to a verifiable source."

That bar cannot be cleared by training cases into model weights. It can only be cleared by retrieval.
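To make that bar concrete, here is a minimal sketch of the check it implies: pull every citation out of a draft answer and refuse anything the index cannot resolve. The index contents, the function name, and the second citation in the draft are assumptions for illustration (the second citation is deliberately invented to show a failure); the regex only covers Supreme Court Reports citations.

```python
import re

# Hypothetical index: citation string -> decision record. A real system
# would use a curated, licensed case-law store rather than a dict.
INDEX = {
    "[1986] 1 SCR 103": {"style_of_cause": "R. v. Oakes"},
}

# Matches Supreme Court Reports citations such as "[1986] 1 SCR 103".
# Other reporters and neutral citations would need their own patterns.
SCR_CITATION = re.compile(r"\[\d{4}\] \d+ SCR \d+")


def unverified_citations(answer: str, index: dict) -> list[str]:
    """Return every citation in `answer` that the index cannot resolve."""
    cited = SCR_CITATION.findall(answer)
    return [c for c in cited if c not in index]


draft = (
    "The proportionality test comes from R. v. Oakes, [1986] 1 SCR 103. "
    # The next citation is deliberately invented for this example.
    "It was revisited in a decision reported at [2014] 9 SCR 999."
)

problems = unverified_citations(draft, INDEX)
if problems:
    print("Blocked, unverifiable citations:", problems)
```

A check like this only catches citations that don't exist. It does nothing about a real case with a reversed holding, which is why the answer itself has to be grounded in retrieved paragraphs.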

Reality 2 - even Harvey doesn't bet on trained-in case law

I said something on my engineering call this week that I keep coming back to.

"Harvey is just RAG with case law on top. Nothing crazy, nothing special."

Harvey is the most valuable AI legal tools company on the planet.

They serve some of the largest law firms in the world.

If anyone could solve the citation problem by training a model on the world's case law, you'd assume it was them.

And here's the thing - they tried.

Harvey built a custom case-law model with OpenAI back in 2023.

But by every public account of how their product works today, the citations don't come from model weights.

They come from retrieval: a frontier model on top of a curated index of case law, statutes, and firm documents.

The model contributes legal reasoning. The index contributes the facts.

I'd soften my "just" — but the architectural lesson stands.

The best-funded team in legal AI, with every option available to them, lands on retrieval for citation integrity.

That's not a workaround.

That's the right answer.

We accepted it this week.

Reality 3 - Canadian case law access is structurally broken

There's a second problem nobody talks about.

Even if you wanted to train on Canadian case law, the corpus is a procurement nightmare.

CanLII is the obvious source — the largest aggregator of Canadian case law, run by the Federation of Law Societies, free to read.

The terms of use prohibit bulk download for AI training.

Earlier this year CanLII settled its lawsuit against an AI startup called Caseway over alleged bulk scraping.

The settlement terms are confidential.

There is no public precedent — and more importantly, no public path to licensed bulk access for AI builders.

I want to be specific about our position.

We are not going to scrape CanLII.

We are not going to pressure them to license.

We are not going to look for a legal grey area around it.

If CanLII's position is that their database is not available for AI training, we respect that position completely.

The integrity of how we source training data matters more than the convenience of having the largest possible corpus.

SOQUIJ — Quebec's official legal information service — is paywalled.

Same answer.

Federal court archives are public but scattered, inconsistently formatted, and unindexed.

Provincial reporting services mostly charge, and the ones that don't rarely provide structured downloads.

The honest assessment: the Canadian case law corpus available to a builder operating in good faith is a fraction of what exists in published form.

You could spend eighteen months and a six-figure budget assembling it through legitimate channels and still have less than one CanLII bulk export would give you in a day.

So we'd be building a model trained on a partial corpus, hallucinating partial holdings, in a system where the right architecture for citation is retrieval anyway.

That's the moment we stopped trying.

What we are building now

Flash-1 is no longer a Canadian legal AI model.

It's a foundation model for Canadian work.

The distinction is architectural, not marketing.

The model gets trained on the parts of Canadian context that are stable, citable, and need to live in the weights: federal statutes, regulations, parliamentary debate (Hansard), the Canada Gazette, the CCQ, CRA folios, and OQLF style.

The connective tissue of how Canadian government and Canadian work actually function.

Case law does not get trained in.

It gets retrieved at inference, from a curated index built only from sources that explicitly license their material for this use: the Access to Justice in AI Lab's open Canadian case law dataset and federal court rulings, to start. We will be transparent about every source we add.

In practice: Flash-1 understands how Canadian law thinks.

The structure of Charter analysis. The difference between common-law and civil-law reasoning. The register of Quebec French legal writing.

When a user asks about a specific case, the system retrieves the actual decision and grounds the answer in real paragraphs with real citations.
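A rough sketch of that flow, under loud assumptions: the index is a small in-memory list, the paragraph text is placeholder wording (not quoted from any decision), scoring is naive word overlap instead of a real search engine, and every name here is hypothetical. The point is the shape: retrieve first, then hand the model only passages it can cite, and refuse when nothing comes back.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Paragraph:
    citation: str  # decision the paragraph belongs to
    number: int    # paragraph number within that decision
    text: str


# Placeholder records; a real index would hold licensed decisions.
INDEX = [
    Paragraph("Decision A", 12, "placeholder text about limitation test and proportionality"),
    Paragraph("Decision A", 13, "placeholder text about burden of proof"),
    Paragraph("Decision B", 4, "placeholder text about roadside stops and detention"),
]


def tokens(s: str) -> set[str]:
    """Lowercase word set used for naive overlap scoring."""
    return set(re.findall(r"[a-z]+", s.lower()))


def retrieve(query: str, index: list[Paragraph], k: int = 2) -> list[Paragraph]:
    """Return up to k paragraphs sharing at least one word with the query."""
    q = tokens(query)
    scored = [(len(q & tokens(p.text)), p) for p in index]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)  # stable: ties keep index order
    return [p for _, p in scored[:k]]


def build_prompt(question: str, passages: list[Paragraph]) -> Optional[str]:
    """Build a grounded prompt, or None if there is nothing to ground on."""
    if not passages:
        return None  # caller should decline rather than answer from weights
    sources = "\n".join(f"[{p.citation}, para {p.number}] {p.text}" for p in passages)
    return (
        "Answer using only the sources below. "
        "Cite the bracketed source for every claim.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


question = "Who carries the burden of proof?"
prompt = build_prompt(question, retrieve(question, INDEX))
```

The design choice that matters is the `None` branch: when retrieval finds nothing, the system says so instead of letting the model fill the gap from its weights.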

This is a smaller claim and a better model — one that can actually be deployed in legal practice, because every citation traces to a retrievable source.

It's also no longer narrowly legal. The same architecture serves tax work, public service work, regulatory compliance, professional services.

The model knows the Canadian context. Whatever specialized index sits next to it determines the vertical.

Yesterday's post described Flash-1 as a Canadian Business, Legal, and Regulatory specialization.

That's still the deployment story. What changed is where the specialization lives — not in the weights, in the index.

The thing I want to be honest about

I had built an internal narrative around being first to ship a Canadian legal AI model.

The legal angle was clean — clear customer, clear use case, clear pitch.

And yes — this demotes CBLRE.

Two weeks ago I launched that benchmark as the flagship evaluation for a legal model.

It's now one evaluation among several, measuring one vertical the architecture serves.

That's a real cost of this pivot, and I'd rather say it plainly than let you notice it quietly.

The methodology still stands; the model it was built to crown no longer exists.

The broader scope is harder to sell.

It requires explaining retrieval versus training. It doesn't fit on a slide.

But the discipline that matters at our scale is: what would a model that actually works look like, and are we building that — even when it's less marketable than the version that doesn't quite work?

The version that doesn't quite work is a legal model trained on a partial corpus, with citation hallucinations and a CanLII access fight in its near future.

That version pitches well and ships broken.

The version that works is a foundation model for Canadian work, with retrieval handling citation-grounded depth. Harder to explain. Easier to deploy.

We picked the version that works.

What this changes about September 30

The ship date doesn't move. Flash-1 goes on Hugging Face September 30 under Apache 2.0.

Free download, free fine-tune, free deployment, free fork.

What changes is what the model is.

Not a legal specialist — a foundation model, evaluated on Canadian context broadly, with vertical depth handled by retrieval indexes at deployment.

The Quebec H200 cluster runs the training.

AWS ca-central-1 hosts data collection. Canadian infrastructure end to end.

On September 30, the conversation about what a foundation model built for Canadian work can do stops being about a thesis and starts being about an actual artifact.

The model is the framework. The data is the world.

We accepted RAG.

George Pu is the founder of SimpleDirect®, a Toronto-based Canadian AI company building open-weight foundation models for Canadian work. Flash-1 ships September 30, 2026 on Hugging Face under Apache 2.0.