
Migrating ~80 production n8n workflows to m9m: a field report

What actually happens when you port a production n8n install to m9m — which workflows crossed over unchanged, which needed work, and what it cost in engineer-days.

Neul Labs · #n8n #migration #case-study #m9m

This is a condensed field report on migrating a production n8n install — roughly 80 workflows across business ops, data sync, and support automation — to m9m. Names and specifics are anonymised; the numbers are real.

Setup before migration

  • n8n self-hosted on a single 4 vCPU / 8 GB VM, Postgres backing store.
  • ~80 active workflows across four tenants: marketing, data, support, and a small agent-experiments team.
  • Pain points, in order: (1) slow cold starts on webhook-triggered workflows, (2) memory spikes that OOM-killed the container twice a month, (3) community nodes pinning them to specific n8n versions, (4) no good story for sandboxing the agent-experiments team’s Claude Code runs.

Audit — day one

We ran a script over the n8n export that bucketed each workflow by node-type coverage. Bucketing rules:

  • Green — uses only core n8n nodes that m9m supports unchanged.
  • Yellow — uses one or two community nodes that can be re-expressed as HTTP calls.
  • Red — uses a community node that is structurally not expressible as HTTP (rare — typically UI nodes or tightly-integrated auth flows).

Result: 63 green, 14 yellow, 3 red.

Shadow run — week one

We mirrored all 63 green workflows to m9m, triggering them on the same cron schedules and webhooks, but routing outputs to staging destinations. The diff tooling compared m9m’s output JSON to n8n’s for every run.

Findings:

  • 59 of 63 produced byte-identical outputs.
  • 3 workflows had ordering differences in Merge nodes that turned out to be non-deterministic in n8n too — m9m’s output was more stable, but not identical. No action needed.
  • 1 workflow hit a timezone mismatch — an expression using {{ $now.format('YYYY-MM-DD') }} in a cron context. m9m defaults to UTC; n8n to the server's local time. Fixed by adding an explicit timezone.

Total engineer-time for the shadow run: 1.5 days including the diff tooling.

Re-expressing community nodes — week two

The 14 yellow workflows used a mix of a Stripe-specific node, a custom HubSpot connector, and a vendor-specific OCR node. Each was replaced with an HTTP Request node calling the vendor's REST API directly.

The one non-trivial case: the OCR node streamed a file upload as multipart/form-data with a specific field ordering the vendor silently required. We discovered this by comparing request payloads with mitmproxy. Two hours of debugging; one comment in the workflow JSON explaining the non-obvious field order.

Total engineer-time: 3 days.

Custom Go nodes — week three

Two of the three red workflows needed genuine custom logic: a proprietary file format parser and a rate-limited vendor SDK that doesn’t expose a clean HTTP surface. We wrote both as custom Go nodes.

Each node was ~200 lines of Go, with table-driven tests. m9m’s node interface is three methods — Execute, Describe, Validate — so the learning curve was short. The third red workflow we retired; it was a legacy report no one had opened in six months.

Total engineer-time: 4 days (2 nodes × 2 days).

Cut-over — week four

Per-tenant cut-over, one tenant a day:

  1. Pause n8n workflows for the tenant.
  2. Enable m9m equivalents.
  3. Watch logs for the day.
  4. Delete n8n equivalents one week later.

Rollback plan: re-enable n8n. We never used it.

Total cut-over engineer-time: 2 days of watching logs.

Outcome

  • Total migration time: ~11 engineer-days, spread across a calendar month.
  • Infra footprint: down from 4 vCPU / 8 GB + Postgres to 2 vCPU / 2 GB, no external database. Monthly infra bill dropped ~$180.
  • Cold-start pain: gone. Webhook latency p50 down from 2.8 s to 180 ms.
  • OOM kills: zero in the six months since cut-over.
  • Sandboxing: the agent-experiments team retired their ad-hoc Docker wrapper and moved to m9m’s CLI nodes with namespace sandboxing. A separate write-up covers that work.

Gotchas worth naming

  • Timezone defaults differed, as above. Audit any expression that uses $now or $today.
  • Credential re-entry was mechanical but tedious. We scripted it against a password manager export to save the afternoon.
  • Retry semantics around flaky HTTP nodes: n8n’s default was “retry on any error up to 3 times, 5 s back-off”; m9m’s default is “no retry, fail fast.” We explicitly set retry policies on HTTP nodes that needed them and documented the new default.
  • The community-node pin meant we had been stuck on an old n8n version for security reasons. Migrating away removed that pin entirely — an unquantified but real win.

What we would do differently

  • Build the diff tooling earlier. Knowing, concretely, that 59/63 workflows produced identical output gave us the confidence to move fast on cut-over. We wish we’d had that on day one, not day three.
  • Start the custom-node work in parallel with the shadow run, not after. The nodes had a longer tail than we expected.
  • Write the runbook for rollback explicitly, even though we never used it. The exercise surfaced two ordering assumptions that would have been painful under real incident pressure.

Need help shipping agents or migrating off n8n?

Neul Labs — the team behind m9m — takes on a limited number of consulting engagements each quarter. We help teams migrate n8n workflows, build custom Go nodes, sandbox AI agents in production, and design automation platforms that don't collapse under load.