A 13-Phase CRM Migration Postmortem: Zoho to HubSpot, 11.8k Records, 679 Merges

Published June 11, 2026 (Draft)

TL;DR: A CRM migration succeeds or fails on three properties, not on row count: dependency-ordered sequencing, idempotent loads, and per-phase checkpoints. This anonymized postmortem covers a Zoho-to-HubSpot migration of roughly 11,800 records executed in 13 phases, including 679 deliberate record merges. No client is named and no portal identifiers appear here. The decisive choices were to migrate object types in the order their associations require, to resolve duplicates as a planned phase before associations were written, and to checkpoint after every phase so a failure in phase nine resumed at phase nine rather than restarting the whole load. The merges were not cleanup left for later. They were a scheduled phase with a survivorship rule. Treated that way, a migration becomes a sequence of recoverable, re-runnable steps instead of a single overnight gamble.

Why migrations fail on structure, not volume

Eleven thousand eight hundred records is a small dataset by warehouse standards. The difficulty in this engagement was never throughput. It was the relationships between objects and the duplicates hiding inside them. Most migrations that go wrong do not run out of API capacity. They write deals before the companies those deals associate to, or they merge duplicates after associations are already in place, fragmenting the very history a merge is supposed to consolidate.

The cost of getting this wrong is well documented. Gartner estimates that poor data quality costs organizations an average of $12.9 million per year (Gartner, 2021). A migration is the rare moment when that recurring tax can be paid down once, deliberately, instead of bled out indefinitely.

The shape of the engagement

The work moved approximately 11,800 records from Zoho CRM into HubSpot across 13 discrete phases. Zoho and HubSpot are both general-purpose CRM products; naming the source and target systems reveals nothing about the client, and both are referenced here only as software, never as a customer identity.

Each phase had a single responsibility, a defined input state, and a verifiable output state. A phase loaded one object type, or resolved one class of duplicate, or wrote one association direction. Keeping phases narrow is what made checkpointing meaningful: a phase small enough to describe in one sentence is a phase small enough to re-run safely.

Sequencing in dependency order

Objects were migrated in the order their associations demand: companies first, then contacts, then deals and tickets, then activities and engagements, then associations as an explicit final pass. The reasoning is mechanical. A deal in HubSpot associates to a company; if the company record does not yet exist when the deal loads, you produce an orphan. Loading in dependency order eliminates the orphan defect class entirely rather than cleaning it up afterward.
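The ordering described above can be derived rather than hand-maintained. The following is a minimal sketch, assuming a hypothetical dependency map (DEPENDS_ON) and hypothetical record fields such as company_key; it uses Python's standard graphlib module and is not the tooling used on the engagement.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each object type lists the types that must
# already exist in the target before it can be loaded.
DEPENDS_ON = {
    "companies": set(),
    "contacts": {"companies"},
    "deals": {"companies", "contacts"},
    "tickets": {"companies", "contacts"},
    "activities": {"contacts", "deals", "tickets"},
    "associations": {"companies", "contacts", "deals", "tickets", "activities"},
}


def load_order(depends_on):
    """Return object types in an order where every dependency loads first."""
    # static_order() raises graphlib.CycleError if the map contains a cycle.
    return list(TopologicalSorter(depends_on).static_order())


def find_orphans(deals, loaded_company_keys):
    """Return deals whose company key is absent from the loaded companies."""
    return [d for d in deals if d["company_key"] not in loaded_company_keys]


order = load_order(DEPENDS_ON)
print(order)
```

Running find_orphans as a pre-load check on each dependent phase is one way to confirm that the ordering actually held before anything is written to the target.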

HubSpot’s own import documentation makes the same recommendation, advising teams to import companies and contacts before the objects that reference them so associations resolve against records that already exist (HubSpot Knowledge Base, Import objects). Sequencing is the cheapest defect prevention available on a migration, because it costs only planning.

“Half the cleanup teams budget for a migration is actually unwinding a load that ran in the wrong order. Order the load correctly and most of that work simply never appears.” — a principle this engagement treated as non-negotiable.

Deduplication as a planned phase, not a postscript

The 679 merges were not a tidy-up at the end. They were a scheduled phase positioned before associations were written. Duplicates were detected in the Zoho export, scored against a survivorship rule, and merged into a single surviving record so that downstream activity history consolidated onto one record instead of splitting across twins.

The survivorship rule was explicit: most-recent-activity wins for field-level conflicts, with mandatory manual review whenever two candidate records carried conflicting non-empty values on a defined set of critical fields. Automating the easy 90 percent and escalating the contested 10 percent is what kept 679 merges tractable rather than a multi-week judgment marathon. Crucially, merging before association meant engagement history attached to the survivor; merging after association would have re-fragmented exactly the history the merge existed to unify.
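A survivorship rule of this shape is small enough to express directly. The sketch below assumes hypothetical field names (last_activity, email, owner) and a hypothetical critical-field list, since the engagement's actual list is not published here. It merges automatically when no critical field conflicts and otherwise returns the conflicts for manual review.

```python
from datetime import datetime

# Hypothetical critical fields; the engagement's actual list is not given.
CRITICAL_FIELDS = ("email", "owner")


def merge_pair(a, b, critical_fields=CRITICAL_FIELDS):
    """Merge two duplicates under a most-recent-activity rule.

    Returns (survivor, conflicts). If both records hold different non-empty
    values on any critical field, survivor is None and the pair should be
    routed to manual review.
    """
    conflicts = [
        f for f in critical_fields
        if a.get(f) and b.get(f) and a[f] != b[f]
    ]
    if conflicts:
        return None, conflicts
    # The record with the later last_activity wins field-level conflicts.
    winner, loser = (a, b) if a["last_activity"] >= b["last_activity"] else (b, a)
    survivor = dict(loser)
    # Only non-empty winner values overwrite, so the loser still fills gaps.
    survivor.update({k: v for k, v in winner.items() if v not in (None, "")})
    return survivor, conflicts


newer = {"external_id": "Z-1", "email": "ana@example.com", "phone": "",
         "last_activity": datetime(2025, 3, 1)}
older = {"external_id": "Z-2", "email": "ana@example.com", "phone": "555-0100",
         "last_activity": datetime(2025, 1, 15)}
survivor, conflicts = merge_pair(newer, older)
```

In this example the newer record survives and inherits the phone number that only the older record carried. Had the two email values differed, the pair would have been escalated instead of merged.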

Why merge order matters mechanically

When you merge two contacts in HubSpot, their associated activities, deals, and tickets consolidate onto the surviving record. If associations are written first and the merge runs second, you risk duplicate associations, split timelines, and deals pointing at a record that is about to be retired. Running merges upstream of association writes means every association is written once, against the record that will survive. This is the same discipline covered in the HubSpot Data Foundation Audit methodology: resolve identity before you build relationships on top of it.
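One way to picture "written once, against the record that will survive" is to rewrite association endpoints through a survivor map before anything is written. This is an illustrative sketch: the key format and the survivor_of mapping are assumptions, and it operates on in-memory pairs rather than on any HubSpot API.

```python
def remap_associations(edges, survivor_of):
    """Rewrite association endpoints to surviving record keys and dedupe.

    edges is an iterable of (from_key, to_key) pairs; survivor_of maps each
    retired record key to the key of the record it was merged into.
    """
    seen = set()
    result = []
    for from_key, to_key in edges:
        edge = (survivor_of.get(from_key, from_key),
                survivor_of.get(to_key, to_key))
        if edge not in seen:  # twins pointing at the same deal collapse here
            seen.add(edge)
            result.append(edge)
    return result


survivor_of = {"contact:Z-2": "contact:Z-1"}
edges = [
    ("contact:Z-1", "deal:D-9"),
    ("contact:Z-2", "deal:D-9"),
    ("contact:Z-2", "company:C-3"),
]
clean_edges = remap_associations(edges, survivor_of)
```

Here three source associations become two, both attached to the surviving contact, which is the outcome merging upstream of association writes is meant to guarantee.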

Idempotency: every phase re-runnable

Every phase was idempotent, meaning re-running it produced the same end state rather than a second copy of everything. This was achieved with external IDs carried from the source system and upsert-by-key logic on load: if a record with a given external key already existed, it was updated, not duplicated. Idempotency is what converts a migration from a one-shot event into a process you can replay without fear.

This property matters most under failure. A non-idempotent load that dies at 70 percent leaves you with a partially populated target and no safe way to resume, because re-running duplicates the first 70 percent. An idempotent load can be re-run end to end at any time and still converges on the correct state. HubSpot’s import tooling supports this directly: re-importing a file mapped to a unique identifier such as Record ID or a custom unique-value property updates existing records instead of creating new ones (HubSpot Knowledge Base, Import objects).
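The upsert-by-key behavior can be demonstrated without any CRM at all. In this sketch a plain dictionary stands in for the target system and external_id is an assumed key name; it shows the property itself, not HubSpot's import or API behavior.

```python
def upsert(target, records, key="external_id"):
    """Insert or update records by external key; return (created, updated)."""
    created = updated = 0
    for record in records:
        k = record[key]
        if k in target:
            target[k].update(record)  # existing key: update in place
            updated += 1
        else:
            target[k] = dict(record)  # new key: create exactly once
            created += 1
    return created, updated


batch = [
    {"external_id": "Z-1", "name": "Acme"},
    {"external_id": "Z-2", "name": "Globex"},
]
target = {}
first_run = upsert(target, batch)
snapshot = {k: dict(v) for k, v in target.items()}
second_run = upsert(target, batch)  # replaying the same batch
```

The first run creates two records and the second run updates the same two, leaving the target identical to the snapshot. That convergence is what makes replay safe.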

Checkpoints and a real recovery

After each of the 13 phases, a checkpoint recorded what had completed: which object types were loaded, which merges were resolved, which associations were written. The checkpoint was the resume coordinate. When a transient failure interrupted a mid-sequence phase, the run resumed at that phase rather than restarting phase one. A failure in phase nine cost minutes, not an overnight re-run.

This single property — checkpoint, then resume — is the difference between a migration that recovers gracefully and one that forces a full restart every time anything hiccups. It is also what makes a migration safe to run during business hours, because the blast radius of any failure is one phase, not the entire dataset.
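Checkpoint-then-resume can be sketched in a few lines. The runner below is a minimal illustration under stated assumptions: phase names, the JSON checkpoint file, and the simulated transient failure are all hypothetical, and a production runner would likely write the checkpoint atomically (write to a temporary file, then rename).

```python
import json
import tempfile
from pathlib import Path


def run_phases(phases, checkpoint_path):
    """Run phases in order, skipping any already recorded as complete."""
    path = Path(checkpoint_path)
    done = json.loads(path.read_text()) if path.exists() else []
    for name, fn in phases:
        if name in done:
            continue  # completed in an earlier run
        fn()  # an exception here leaves the checkpoint at the last good phase
        done.append(name)
        path.write_text(json.dumps(done))
    return done


calls = []
state = {"fail_once": True}


def load_companies():
    calls.append("companies")


def load_contacts():
    if state["fail_once"]:  # simulate one transient failure
        state["fail_once"] = False
        raise RuntimeError("transient failure")
    calls.append("contacts")


phases = [("01_companies", load_companies), ("02_contacts", load_contacts)]
checkpoint = Path(tempfile.mkdtemp()) / "checkpoint.json"

try:
    run_phases(phases, checkpoint)
except RuntimeError:
    pass  # the first run dies in phase two

completed = run_phases(phases, checkpoint)  # resumes at phase two
```

The second call skips the companies phase because the checkpoint already records it, so the companies loader runs exactly once across both attempts.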

What the checkpoint log captured

The log was deliberately boring and complete: phase number, start and end timestamps, record counts in and out, merge decisions with their survivorship reasoning, and any records routed to manual review. A good migration log doubles as the postmortem document, because it already contains the phase order, the merge count, the recoveries, and any defect that escaped. Nothing had to be reconstructed after the fact.
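A log entry with those fields might be modeled as follows. The class name, field names, and sample values are illustrative assumptions, not the engagement's actual schema or data; one JSON object per line is simply a convenient append-only format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PhaseLogEntry:
    phase: int
    started_at: str  # ISO 8601 timestamp
    ended_at: str
    records_in: int
    records_out: int
    merge_decisions: list = field(default_factory=list)
    manual_review: list = field(default_factory=list)

    def to_json_line(self):
        """Serialize the entry as a single JSON line for an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)


# Illustrative values only.
entry = PhaseLogEntry(
    phase=4,
    started_at="2026-01-01T09:00:00Z",
    ended_at="2026-01-01T09:12:00Z",
    records_in=120,
    records_out=119,
    merge_decisions=[
        {"survivor": "Z-1", "retired": "Z-2", "reason": "most recent activity"}
    ],
    manual_review=["Z-7"],
)
line = entry.to_json_line()
```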

What a good migration postmortem records

The full phase order and the dependency reason each phase preceded the next.
The survivorship rule and the exact merge count it produced — here, 679.
The checkpoints, and at least one real recovery that used them.
Any defects that reached the target portal and how verification caught them.
The reconciliation counts proving source totals match target totals per object type.

A postmortem without reconciliation numbers is a story, not evidence. The reconciliation pass — counting records per object type in source and target and confirming they match — is what lets a team assert the migration is complete rather than merely finished running.
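The reconciliation pass reduces to a per-type count comparison. The sketch below uses illustrative counts, and it assumes that records retired by planned merges are subtracted from the source total before comparing (the merged_away argument); if the source export was already deduplicated, that argument can be omitted.

```python
def reconcile(source_counts, target_counts, merged_away=None):
    """Return {object_type: (expected, actual)} for every mismatch.

    merged_away maps object type to the number of source records retired by
    planned merges, which are not expected to appear in the target.
    """
    merged_away = merged_away or {}
    mismatches = {}
    for object_type, source_total in source_counts.items():
        expected = source_total - merged_away.get(object_type, 0)
        actual = target_counts.get(object_type, 0)
        if expected != actual:
            mismatches[object_type] = (expected, actual)
    return mismatches


# Illustrative counts only.
source = {"companies": 100, "contacts": 250}
target = {"companies": 100, "contacts": 245}
report = reconcile(source, target, merged_away={"contacts": 4})
```

In this example companies reconcile cleanly, while contacts come up one short of the 246 expected after merges. An empty report is the evidence that the load is complete.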

Where this connects

A migration lands data; an audit confirms it is trustworthy once landed. The two are bookends of one engagement. This postmortem describes the landing; the HubSpot Data Foundation Audit methodology describes the verification that follows. Lifecycle fields are a frequent casualty of cross-system moves, because the source and target rarely define stages identically — a problem examined in lifecycle stage drift. And if you are planning a move of your own, the HubSpot migration service is built around exactly this sequence: dependency order, planned merges, idempotent loads, and per-phase checkpoints.

The takeaway

The numbers in the title (13 phases, 11,800 records, 679 merges) are real figures from an anonymized engagement, but they are not the point. The point is the method they encode. Migrate in dependency order so orphans never form. Merge as a planned phase before associations so history consolidates instead of splitting. Make every phase idempotent so re-running is safe. Checkpoint after each phase so failure resumes instead of restarts. Reconcile counts at the end so completeness is provable. Apply those five practices and a migration stops being an overnight gamble and becomes what it should be: a sequence of small, recoverable, verifiable steps.

Sources

Gartner, “How to Improve Your Data Quality” (cost-of-poor-data-quality estimate, 2021): https://www.gartner.com/smarterwithgartner/how-to-create-a-business-case-for-data-quality-improvement
HubSpot Knowledge Base, “Import objects” (importing and associating records, sequencing, and unique identifiers): https://knowledge.hubspot.com/import-and-export/import-objects
Anonymized 13-phase Zoho-to-HubSpot migration engagement record (no client name, no portal identifiers).
