When we have thousands of inserts per second, migrating data between different services becomes a challenge.
It is very easy to miss records during the import or to end up with inconsistent data.
To address this, we have put together this short guide: a checklist you can follow to make sure all the data is migrated correctly.
1️⃣ Create the new messaging queue(s) and start publishing your events
We start by publishing every event needed to build the projection of our data. For example, for a user projection, we will listen to:
user_registered
user_renamed
user_bio_changed
Depending on how we have set up our system, we will create one or several queues for this.
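As a minimal sketch of what publishing these events could look like, the snippet below wraps each event in an envelope that carries a UTC timestamp, since that timestamp is what lets us compare an event against the Moment X™ later on. The `Event` class, the `publish` function, the `occurred_at` field and the in-memory `user_events` queue are all hypothetical names chosen for illustration; a real system would publish to its own message broker.

```python
import json
import queue
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

# In-memory stand-in for a real message broker (assumption for illustration).
user_events = queue.Queue()


@dataclass(frozen=True)
class Event:
    name: str          # e.g. "user_registered"
    aggregate_id: str  # id of the user the event refers to
    occurred_at: str   # ISO 8601 UTC timestamp, compared against Moment X later
    payload: dict


def publish(name: str, aggregate_id: str, payload: dict,
            now: Optional[datetime] = None) -> Event:
    """Serialize the event with its timestamp and push it onto the queue."""
    occurred_at = (now or datetime.now(timezone.utc)).isoformat()
    event = Event(name, aggregate_id, occurred_at, payload)
    user_events.put(json.dumps(asdict(event)))
    return event


publish("user_registered", "u-1", {"name": "Ada"})
publish("user_renamed", "u-1", {"name": "Ada L."})
```

Stamping the event at publish time, in UTC, avoids having to guess later whether a given event happened before or after the replica was disconnected.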
2️⃣ Create a replica of the database outside the load balancer
This gives us a "snapshot" of the data that we can read from without affecting the performance of the production system.
3️⃣ Establish the Moment X™
We disconnect the replica at a precise moment. This timestamp will be crucial; we call it the Moment X™. All data up to that moment will come from the replica, and everything afterward will come from the event queue.
4️⃣ Import the historical data from the replica
We transfer all the data from the replica to the new system. We can take our time with this, since new changes keep accumulating in the queue in the meantime.
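The import itself could be sketched as a batched copy like the one below. This is only an illustration under several assumptions: `sqlite3` stands in for both the replica and the new system, the `users` table with `id`, `name` and `bio` columns is hypothetical, and ids are assumed to be positive integers so that we can page through them in order.

```python
import sqlite3


def import_users(replica: sqlite3.Connection, target: sqlite3.Connection,
                 batch_size: int = 1000) -> int:
    """Copy every row of replica.users into target.users, batch by batch."""
    last_id = 0  # assumes positive integer ids
    copied = 0
    while True:
        # Keyset pagination: resume after the last id seen, not with OFFSET.
        rows = replica.execute(
            "SELECT id, name, bio FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return copied
        # INSERT OR REPLACE keeps a re-run after a crash from duplicating rows.
        target.executemany(
            "INSERT OR REPLACE INTO users (id, name, bio) VALUES (?, ?, ?)",
            rows,
        )
        target.commit()
        last_id = rows[-1][0]
        copied += len(rows)
```

Because the replica is frozen, nothing changes underneath the import, so it can be paused and resumed, or simply run again from the start, without losing or duplicating rows.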
5️⃣ Consume the event queue
Once the historical import is complete, we can start processing the accumulated events, discarding those prior to the Moment X™ to avoid duplicates. If we forgot to record the timestamp of that moment, we can approximate it from the most recent record that reached the replica, checking both created_at and updated_at.
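Both ideas in this step can be sketched in a few lines. The snippet assumes that each event is a dict with a hypothetical `occurred_at` field holding an ISO 8601 timestamp with an explicit UTC offset, and that replica rows expose `created_at` and `updated_at` in the same format. The function names are illustrative, not part of any library.

```python
from datetime import datetime
from typing import Iterable, Iterator, Optional


def events_after_moment_x(events: Iterable[dict],
                          moment_x: datetime) -> Iterator[dict]:
    """Yield only the events that happened strictly after Moment X."""
    for event in events:
        occurred_at = datetime.fromisoformat(event["occurred_at"])
        if occurred_at > moment_x:
            yield event


def estimate_moment_x(rows: Iterable[dict]) -> Optional[datetime]:
    """Fallback: latest created_at / updated_at found in the replica rows."""
    latest = None
    for row in rows:
        for column in ("created_at", "updated_at"):
            value = row.get(column)
            if value is None:
                continue
            timestamp = datetime.fromisoformat(value)
            if latest is None or timestamp > latest:
                latest = timestamp
    return latest
```

All timestamps here need to be timezone-aware, otherwise Python refuses to compare them. An event stamped exactly at the Moment X™ is ambiguous, so it helps if the event handlers are idempotent and can safely apply a change that the import already covered.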
This technique has allowed us to migrate systems with millions of records without service interruptions and with complete data integrity.
If you want to learn more about how to perform data migrations in systems with high concurrency, we recommend the course Data Migration: From Legacy to Event-Driven Architecture, where we explore how to migrate data between various systems safely and efficiently.