Filipe Névola

09/06/2026, 21:38

The Migration Was Not the Hard Part

tl;dr: AI works best when migration work is turned into metrics, tests, phased plans, and tools.

I recently migrated a large Meteor 2 application to Meteor 3 with AI.

This was not my first attempt. It was my fourth.

The first attempts were too optimistic. I tried to let AI do the migration more directly, but the work kept falling into the same traps: too many async cascades, too many silent missing-await bugs, too many changes that looked correct locally but were not safe enough for a production codebase.

The successful attempt was different.

The difference was not only that the AI wrote better code. The difference was that AI helped me build a migration system around the code.

It helped create:

  • metrics before touching the code
  • lint rules and warning budgets
  • a phased migration plan
  • test infrastructure
  • progress tracking inside the repo
  • helper scripts to find bugs that lint could not see
  • post-migration validation checklists

That is the part I think is worth writing about.

Without AI, I think this migration would have taken around six months. With AI, the code migration took basically two days, followed by two days of testing. I did it during a trip to Miami, not from my usual desk, and I spent around $600 on AI usage during the migration.

That number sounds high if you compare it to a normal software subscription.

It sounds tiny if you compare it to six months of engineering work.

The real problem was not replacing APIs

At a superficial level, the Meteor 2 to Meteor 3 migration looks like a mechanical async migration.

On the server, sync Mongo APIs have to become async APIs:

const app = AppsCollection.findOne(appId);

becomes:

const app = await AppsCollection.findOneAsync(appId);

That looks easy.

It is not easy in a real system.

The dangerous part is not the line you change. The dangerous part is every caller above that line.

If a helper becomes async, every caller must await it. If that caller becomes async, its callers may need changes too. When the helper is something like a permission check, a missing await is not a small bug. The call returns a pending Promise instead of the check's result, and a Promise is always truthy, so the code can continue running as if the check had passed.
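Here is a minimal sketch of that failure mode. The helper and function names (canEditApp, deleteAppUnsafe, deleteAppSafe) are hypothetical and the lookup is simulated in memory. It only illustrates how an un-awaited check lets the caller proceed.

```javascript
// Hypothetical helper: stands in for a permission check that became async
// during the migration (for example because it now calls findOneAsync).
async function canEditApp(userId, appId) {
  // Simulated lookup; a real helper would query the database here.
  const ownerByApp = { "app-1": "alice" };
  return ownerByApp[appId] === userId;
}

// BUG: the call is not awaited, so `allowed` is a pending Promise.
// A Promise is an object, and objects are always truthy.
function deleteAppUnsafe(userId, appId) {
  const allowed = canEditApp(userId, appId);
  if (!allowed) {
    return "denied";
  }
  return "deleted";
}

// FIX: the caller becomes async and awaits the check.
async function deleteAppSafe(userId, appId) {
  const allowed = await canEditApp(userId, appId);
  if (!allowed) {
    return "denied";
  }
  return "deleted";
}

// "mallory" does not own app-1, yet the unsafe version lets her through.
const unsafeResult = deleteAppUnsafe("mallory", "app-1"); // "deleted"
const safeResultPromise = deleteAppSafe("mallory", "app-1"); // resolves to "denied"
```

Note that the fix changes deleteAppSafe from a sync function into an async one, so its own callers now need the same treatment.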

That is why the migration could not be treated as a giant search-and-replace.

It needed a strategy.

Metrics first

The first useful thing was turning the migration into numbers.

The repo already had Meteor 3 compatibility lint rules. Instead of trying to fix everything at once, we enabled the relevant rules as warnings and created a baseline.

At one point the migration plan recorded:

  • 530 warnings
  • 104 affected files
  • warning distribution by layer
  • the top files by warning count

The top file alone had 44 warnings.

That mattered because it changed the problem from "migrate this repo" into "reduce this measurable warning budget without allowing regressions."
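To give a rough idea of what such a baseline can look like in code, here is a sketch that summarizes lint output. It assumes the input has the shape of ESLint's JSON formatter output (one entry per file with a messages array, where severity 1 means warning). The sample data, the rule IDs, and the idea of treating the first path segment as the "layer" are all made up for illustration.

```javascript
// Summarize lint results into the numbers a migration plan can track.
// `results` follows the shape of `eslint --format json` output.
function summarizeWarnings(results, topN = 3) {
  const perFile = results
    .map((result) => ({
      file: result.filePath,
      warnings: result.messages.filter((m) => m.severity === 1).length,
    }))
    .filter((entry) => entry.warnings > 0)
    .sort((a, b) => b.warnings - a.warnings);

  // Assumption: the first path segment names the layer (methods, collections...).
  const byLayer = {};
  for (const { file, warnings } of perFile) {
    const layer = file.split("/")[0];
    byLayer[layer] = (byLayer[layer] || 0) + warnings;
  }

  return {
    totalWarnings: perFile.reduce((sum, entry) => sum + entry.warnings, 0),
    affectedFiles: perFile.length,
    byLayer,
    topFiles: perFile.slice(0, topN),
  };
}

// Made-up sample input, not data from the real migration.
// Real ESLint output uses absolute paths; relative ones keep the sample short.
const sampleResults = [
  {
    filePath: "methods/apps.js",
    messages: [
      { ruleId: "example/no-sync-mongo", severity: 1 },
      { ruleId: "example/no-sync-mongo", severity: 1 },
      { ruleId: "no-undef", severity: 2 }, // an error, not counted as a warning
    ],
  },
  { filePath: "collections/apps.js", messages: [{ ruleId: "example/no-sync-mongo", severity: 1 }] },
  { filePath: "utils/date.js", messages: [] },
];

const baseline = summarizeWarnings(sampleResults);
console.log(baseline);
```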

AI is much better when the task has a visible scoreboard.

We also added a CI regression gate. New pull requests could not increase the warning count. That meant normal product work could continue while the migration moved forward, but the repo could not get worse.

This is one of the first lessons: before asking AI to fix a migration, ask it to help you measure the migration.
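The regression gate can be a very small piece of logic. The sketch below is one possible shape, not the actual CI script: the function name checkWarningBudget is hypothetical, and lowering the baseline whenever the count drops is an extra design choice I am assuming here.

```javascript
// Compare the current warning count with the recorded baseline.
// Returns a verdict object instead of exiting, so a CI wrapper decides what to do.
function checkWarningBudget(baselineCount, currentCount) {
  if (currentCount > baselineCount) {
    return {
      ok: false,
      newBaseline: baselineCount,
      message: `Warning count rose from ${baselineCount} to ${currentCount}.`,
    };
  }
  return {
    ok: true,
    // Assumption: the budget ratchets down so fixed warnings cannot return.
    newBaseline: currentCount,
    message: `Warning count is ${currentCount} (budget ${baselineCount}).`,
  };
}

const regression = checkWarningBudget(530, 531);
const progress = checkWarningBudget(530, 512);
console.log(regression.message);
console.log(progress.message);
```

A CI step could run the linter, pass the counts to this function, and fail the build when ok is false.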

The plan changed from bottom-up to top-down

The first shape of the plan was bottom-up.

Start with the shared collection layer. Convert the low-level helpers. Move upward.

That is tempting because collection methods are where many sync Mongo calls live.

But it is the wrong direction for an async migration.

If you convert a low-level helper first, you create a cascade. Every caller must change. Then callers of those callers may need to change. A small helper can touch dozens of files across methods, publications, REST endpoints, jobs, and background processes.

So the plan evolved.

The successful plan was top-down:

  1. Entry points: methods, publications, REST handlers, jobs, startup
  2. Middle layer: logic files, processes, API helpers
  3. Shared layer: utils, billing, payment, enums, metrics
  4. Bottom layer: collection helpers, hooks, composers

The reason is simple: entry points do not have application callers. Meteor, Express, or the scheduler invokes them. Making them async does not force another application layer above them to change.

Once the entry points are async, the layer below them can safely become async. Then the layer below that.

This was the core migration insight.

AI helped write it down, refine it, and keep following it.
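The sketch below illustrates why the top-down order avoids a cascade. The registerMethod and invokeMethod functions are a hypothetical stand-in for whatever framework invokes an entry point; the only assumption is that the invoker awaits the handler's result. The helper names are invented as well.

```javascript
// Hypothetical stand-in for the framework that calls entry points.
// It awaits the handler's result, so sync and async handlers look the same to it.
const methods = {};
function registerMethod(name, handler) {
  methods[name] = handler;
}
async function invokeMethod(name, ...args) {
  return methods[name](...args);
}

// The layer below the entry point, before and after its own phase.
function loadAppSync(appId) {
  return { _id: appId, name: "Demo" };
}
async function loadAppAsync(appId) {
  return { _id: appId, name: "Demo" };
}

// Step 1: the entry point becomes async while the helper is still sync.
// Nothing above the entry point has to change.
registerMethod("apps.getName.step1", async (appId) => {
  const app = loadAppSync(appId);
  return app.name;
});

// Step 2: a later phase converts the helper. The entry point is already
// async, so adding `await` is a one-line, local change.
registerMethod("apps.getName.step2", async (appId) => {
  const app = await loadAppAsync(appId);
  return app.name;
});

const step1Result = invokeMethod("apps.getName.step1", "app-1");
const step2Result = invokeMethod("apps.getName.step2", "app-1");
```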

The plan had memory

The migration plan was not a static document. It became a live progress tracker inside the repo.

It had phases like:

  • Phase 0: tooling setup
  • Phase 0B: migration test infrastructure
  • Phase 1: pinch-point async duplicates
  • Phases 2 to 16: entry points
  • Phases 17 to 21: middle layer
  • Phases 22 to 27: shared code
  • Phases 28 to 34: collections
  • Phase 35: cleanup
  • Phase 36: full sync code sweep
  • Phase 37: post-upgrade AST sweep

After each phase, the plan was updated.

This was important because no single AI chat had to remember everything. The repo remembered.

A new AI session could open the plan, see which phase was done, see the warning count, see the next files, and continue from there.

That is a different way to use AI. The chat is not the source of truth. The repository is.
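As an illustration of a repo-local tracker that a fresh session can read, here is a small sketch. The checklist format is an assumption made for this example; the real plan file's layout is not shown in this post. The function name readPlan is hypothetical too.

```javascript
// Hypothetical plan format: one checklist line per phase.
const planText = [
  "- [x] Phase 0: tooling setup",
  "- [x] Phase 0B: migration test infrastructure",
  "- [ ] Phase 1: pinch-point async duplicates",
].join("\n");

// Parse the checklist and report where the migration stands.
function readPlan(text) {
  const phases = [];
  for (const line of text.split("\n")) {
    const match = /^- \[( |x)\] (Phase \w+): (.+)$/.exec(line.trim());
    if (match) {
      phases.push({ done: match[1] === "x", id: match[2], title: match[3] });
    }
  }
  return {
    completed: phases.filter((phase) => phase.done).length,
    next: phases.find((phase) => !phase.done) || null,
  };
}

const planStatus = readPlan(planText);
console.log(planStatus.next); // the first phase that is not checked off
```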

Temporary duplication made the migration safer

Some functions were too central to convert directly.

For example, permission helpers had dozens of callers. Converting them immediately would have forced a giant cross-cutting change.

Instead, the plan created async duplicates:

checkUserPermission(...)
checkUserPermissionAsync(...)

The sync function stayed in place while each phase moved its own callers to the async version.
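A minimal sketch of what such a pair can look like follows. The parameter list, the RoleAssignments stand-in collection, and the permission rules are invented for illustration, since the real signatures are not shown here. Keeping the decision in one shared pure function is a design choice I am assuming so the two versions cannot drift apart.

```javascript
// In-memory stand-in for a collection with sync and async lookups.
const RoleAssignments = {
  docs: [
    { userId: "alice", appId: "app-1", role: "admin" },
    { userId: "bob", appId: "app-1", role: "viewer" },
  ],
  findOne({ userId, appId }) {
    return this.docs.find((d) => d.userId === userId && d.appId === appId);
  },
  async findOneAsync(query) {
    return this.findOne(query);
  },
};

// Shared pure decision logic (hypothetical rules).
function roleAllows(assignment, action) {
  if (!assignment) return false;
  if (assignment.role === "admin") return true;
  return action === "read";
}

// Existing sync version: stays until its last caller has moved.
function checkUserPermission(userId, appId, action) {
  return roleAllows(RoleAssignments.findOne({ userId, appId }), action);
}

// Temporary async duplicate: each phase moves its own callers here.
async function checkUserPermissionAsync(userId, appId, action) {
  const assignment = await RoleAssignments.findOneAsync({ userId, appId });
  return roleAllows(assignment, action);
}
```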

This is not beautiful final architecture.

It is good migration architecture.

The goal of a migration is not to make every intermediate commit elegant. The goal is to keep the system working while moving it from one stable state to another.

AI was very useful here because it could perform repetitive, phase-bounded work without losing the larger rule:

"Only touch this layer. Switch callers in this phase. Do not start a cascade outside the phase."

Tests were part of the migration, not an afterthought

The plan also added disposable migration tests.

The idea was not to build a perfect long-term test suite. The idea was to create characterization tests around the areas being migrated.

For each phase:

  1. Write tests first
  2. Run them before conversion
  3. Convert the code
  4. Run them again
  5. Only then update the phase progress

That gave the AI a repeatable loop.

It also gave me a way to review the migration without reading every changed line as if I were manually doing the whole migration myself.
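Here is a rough sketch of a characterization test in that spirit. The function under migration (getSeatLimit) and its async counterpart are hypothetical; the point is only that the same recorded inputs must produce the same results and the same errors before and after conversion.

```javascript
// Record what a function does for each input, including thrown errors.
// `await` accepts plain values too, so this works before and after conversion.
async function characterize(fn, inputs) {
  const snapshot = [];
  for (const args of inputs) {
    try {
      snapshot.push({ args, result: await fn(...args) });
    } catch (error) {
      snapshot.push({ args, threw: error.message });
    }
  }
  return JSON.stringify(snapshot);
}

// Hypothetical code under migration: sync before, async after.
const seatLimits = { free: 1, team: 10 };
function getSeatLimit(plan) {
  if (seatLimits[plan] === undefined) throw new Error(`Unknown plan: ${plan}`);
  return seatLimits[plan];
}
async function getSeatLimitAsync(plan) {
  if (seatLimits[plan] === undefined) throw new Error(`Unknown plan: ${plan}`);
  return seatLimits[plan];
}

const inputs = [["free"], ["team"], ["enterprise"]];
const comparison = (async () => {
  const before = await characterize(getSeatLimit, inputs);
  const after = await characterize(getSeatLimitAsync, inputs);
  return { before, after, same: before === after };
})();

comparison.then(({ same }) => console.log("behavior preserved:", same));
```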

The script was the breakthrough after the migration

The most important insight came near the end.

Even after the lint warnings were gone and the migration looked complete, manual testing revealed runtime errors caused by missing await.

The problem was that ESLint could catch many direct sync Mongo calls, but it could not catch every custom async function call that needed await.
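One well-known shape of this problem, sketched with hypothetical names: a custom async helper called inside an array callback. No sync Mongo call appears on these lines, so a rule that targets sync Mongo APIs has nothing to flag. I am not claiming this exact pattern appeared in the codebase; it is just a common instance of the category.

```javascript
// Hypothetical async writer standing in for an audit-log insert.
const auditLog = [];
async function writeAudit(entry) {
  await new Promise((resolve) => setTimeout(resolve, 5));
  auditLog.push(entry);
}

// TRAP: forEach ignores the Promises returned by the async callback,
// so this function returns before any entry has been written.
async function auditAllWithForEach(entries) {
  entries.forEach(async (entry) => {
    await writeAudit(entry);
  });
  return auditLog.length;
}

// FIX: a for...of loop awaits each write before moving on.
async function auditAllWithLoop(entries) {
  for (const entry of entries) {
    await writeAudit(entry);
  }
  return auditLog.length;
}

const trapDemo = (async () => {
  const seenByForEach = await auditAllWithForEach(["a", "b"]);
  // Let the stray writes from the forEach version finish, then reset.
  await new Promise((resolve) => setTimeout(resolve, 20));
  auditLog.length = 0;
  const seenByLoop = await auditAllWithLoop(["a", "b"]);
  return { seenByForEach, seenByLoop };
})();
```

In the forEach version the count is read while both writes are still pending, which is exactly the kind of bug that only shows up at runtime.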

So AI created a script.

The script was app/scripts/find-missing-awaits.js.

It used Babel AST analysis to scan the codebase for calls to async functions that were not awaited.

It did several passes:

  • build a map of exported async functions
  • resolve imports across files
  • find un-awaited calls to known async exports
  • find un-awaited calls to *Async() methods
  • catch Meteor 3 async APIs like findOneAsync, saveAsync, callAsync, and custom async helpers

This script found more than 180 issues after the main conversion.
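To show the general idea of those passes, here is a heavily simplified sketch. It is not the real script: it swaps Babel AST analysis for plain regular expressions on source text, skips import resolution entirely, and works on an in-memory map of made-up file contents. It misjudges chained calls, multi-line expressions, and intentionally un-awaited Promises, which is a good argument for using a real AST.

```javascript
// Pass 1: collect names declared as exported async functions.
function collectAsyncExports(files) {
  const names = new Set();
  const pattern = /export\s+(?:async\s+function\s+(\w+)|const\s+(\w+)\s*=\s*async\b)/g;
  for (const source of Object.values(files)) {
    for (const match of source.matchAll(pattern)) {
      names.add(match[1] || match[2]);
    }
  }
  return names;
}

// Pass 2: flag calls to those names, or to any *Async() method, that have
// no `await` or `return` directly in front of them.
function findMissingAwaits(files) {
  const asyncNames = collectAsyncExports(files);
  const findings = [];
  for (const [file, source] of Object.entries(files)) {
    source.split("\n").forEach((text, index) => {
      if (/\bfunction\b/.test(text)) return; // skip declarations
      for (const match of text.matchAll(/([\w.]+)\(/g)) {
        const callee = match[1];
        const shortName = callee.split(".").pop();
        if (!asyncNames.has(shortName) && !shortName.endsWith("Async")) continue;
        const before = text.slice(0, match.index);
        if (/\b(await|return)\s*$/.test(before)) continue;
        findings.push({ file, line: index + 1, callee });
      }
    });
  }
  return findings;
}

// Made-up sources for the demo.
const sampleFiles = {
  "permissions.js": [
    "export async function checkAccess(userId) {",
    "  const user = await Users.findOneAsync(userId);",
    "  return Boolean(user);",
    "}",
  ].join("\n"),
  "methods.js": [
    "import { checkAccess } from './permissions';",
    "export async function removeApp(userId, appId) {",
    "  if (!checkAccess(userId)) throw new Error('denied');",
    "  Apps.removeAsync(appId);",
    "  return await Logs.insertAsync({ userId, appId });",
    "}",
  ].join("\n"),
};

const findings = findMissingAwaits(sampleFiles);
console.log(findings);
```

On this sample it reports the un-awaited checkAccess call on line 3 and the un-awaited Apps.removeAsync call on line 4 of methods.js, and leaves the awaited calls alone.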

The migration plan recorded the impact:

  • 1 crash-level bug
  • 3 broken UI components
  • many fire-and-forget audit or logging calls
  • async callback traps that lint had missed

This is the strongest example from the whole migration.

AI did not only write migration code.

AI wrote migration tools.

That changed the economics of the project. A human could have reviewed these manually, but it would have been slow, boring, and unreliable. The AI-generated script gave us a repeatable validator that can be copied to another Meteor 3 migration.

The final migration was not one prompt

This is probably the most important point.

The migration was not:

"Hey AI, migrate my repo from Meteor 2 to Meteor 3."

That failed before.

The migration that worked looked more like this:

  1. Use lint to expose the migration surface.
  2. Turn the surface into metrics.
  3. Add a warning budget and regression gate.
  4. Write a top-down phased plan.
  5. Convert one bounded phase at a time.
  6. Update the repo-local progress tracker after each phase.
  7. Add tests around the risky surfaces.
  8. Use AI to generate additional tools when lint is insufficient.
  9. Run manual testing after the code migration.
  10. Feed every discovered issue back into the plan.

That is why the fourth try worked.

Not because AI suddenly became magic.

Because the work became shaped for AI.

What I learned

The first lesson is that AI migration work needs observability.

If you do not have a count, a failing rule, a test, a checklist, or a script, the AI is operating in fog.

The second lesson is that AI needs a migration architecture.

For async migrations, direction matters. Bottom-up sounds natural but creates cascades. Top-down gives you control.

The third lesson is that temporary code can be good engineering.

Async duplicates, warning baselines, migration-only tests, and phase trackers are not things you necessarily keep forever. They are scaffolding. They let the system move safely.

The fourth lesson is that AI can create leverage beyond code edits.

The script to find missing awaits may have been more valuable than many individual code changes. It turned a hidden class of bugs into a searchable, repeatable validation step.

The fifth lesson is that AI still needs human judgment.

I had to know that missing await on permission checks is dangerous. I had to decide that the plan should become top-down. I had to read the errors during testing and push the system toward better tooling.

AI accelerated the work.

It did not remove the need to understand the work.

The number that matters

This migration cost around $600 in AI usage.

It took about two days of code migration and two days of testing.

The alternative was months of careful human work, with a lot of repetitive edits and a high risk of missing small async bugs.

That is the calculation I think many teams are still not making correctly.

What makes AI cheap is not a low monthly subscription price.

AI is cheap when it compresses a six-month migration into a few focused days and leaves behind better tests, better docs, better rules, and better tools than the project had before.

That is what happened here.