Change Shard Key and Migrate to Atlas Using MongoPush
Changing the shard key of a large collection is a time-consuming and painful task in MongoDB. Those who plan to migrate their clusters to Atlas may wish they could change the shard keys of collections as part of the same migration. MongoPush makes this possible in four simple steps. For an introduction to MongoPush, please see MongoPush - Push-Based MongoDB Atlas Migration Tool.
Use Case
The atlanta.vehicles collection is one of many collections in a 2-shard MongoDB cluster, and its shard key is {"year": 1, "brand": 1}. The collection has grown to over 5TB of data. The engineering team plans to migrate to Atlas with a new 3-shard cluster configuration. However, with the existing shard key, they have experienced a "hot shard" problem when importing inventory data for new vehicles, because of the "year" field of the shard key: new vehicles tend to share the same year, so their inserts land in the same key range. The marketing department advises that car buying is both a practical and a passionate decision, and that customers choose vehicles by style and color, for example, a red convertible. After reviewing all their use cases, the team decides on {"style": 1, "color": 1} as the new shard key.
How can we make this process less painful without using the mongodump and mongorestore commands? Let’s explore the MongoPush solution.
Copy All Configurations
Because of the shard key change, we can’t use the convenient -push all parameter of the mongopush command and have to divide the process into four steps. The first step is to copy all configurations and let mongopush automatically configure the target cluster. Let's use {source} to represent the source cluster connection string and {target} to represent the target cluster connection string. The command is as follows:
Copy and Review Indexes
Next, copy all indexes by using the command below:
Configure New Shard Key
The next step is to change the shard key. Let’s assume the primary shard is shard01, and the IDs of the other two shards are shard02 and shard03. Connect to the target cluster using the mongo shell and execute the commands below:
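A minimal sketch of these commands, written as a mongo shell function so the sequence is easy to follow. The function name configureNewShardKey and the split point values used with it are hypothetical; db and sh are the shell's built-in globals, passed in explicitly. The shard IDs follow the assumption stated above.

```javascript
// Sketch for the mongo shell, connected to the mongos of the target cluster.
// splitPoints is an array of two {style, color} documents (hypothetical values
// that must be chosen from your own data).
function configureNewShardKey(db, sh, splitPoints) {
  const ns = "atlanta.vehicles";
  const [first, second] = splitPoints;

  // Equivalent of typing "use admin" in an interactive session.
  const admin = db.getSiblingDB("admin");

  // Drop the collection on the target so it can be sharded with a different key.
  admin.getSiblingDB("atlanta").getCollection("vehicles").drop();

  // Shard the collection using the new shard key.
  sh.shardCollection(ns, { style: 1, color: 1 });

  // Split the single initial chunk into three, one chunk per shard.
  sh.splitAt(ns, first);
  sh.splitAt(ns, second);

  // The lowest chunk stays on the primary shard (shard01). Each split point is
  // the inclusive lower bound of the chunk that gets moved.
  sh.moveChunk(ns, first, "shard02");
  sh.moveChunk(ns, second, "shard03");
}
```

In the shell, a call might look like configureNewShardKey(db, sh, [{style: "hatchback", color: "black"}, {style: "sedan", color: "black"}]), where both split points are placeholders.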
The above commands do the following:
Change to the admin database
Drop the atlanta.vehicles collection
Shard the collection using a new shard key {"style": 1, "color": 1}
Split chunks
Move chunks
A couple of important notes on the above commands: we create only one chunk per shard and let mongod automatically split chunks as data is inserted. For the split point passed to the splitAt command, you’ll have to examine your data to come up with the best values.
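One possible way to come up with split points is to sort a sample of existing (style, color) values and take the entries at evenly spaced positions. The sketch below is only an illustration of that idea: the function names compareKeys and pickSplitPoints are hypothetical, and plain string comparison is assumed to approximate how the shard key values are ordered.

```javascript
// Order two {style, color} keys the way a compound key is ordered:
// by the first field, then by the second. Plain string comparison is an
// assumption; MongoDB's ordering may differ for non-ASCII or mixed-type values.
function compareKeys(a, b) {
  if (a.style !== b.style) return a.style < b.style ? -1 : 1;
  if (a.color !== b.color) return a.color < b.color ? -1 : 1;
  return 0;
}

// Return up to shardCount - 1 split points that divide the sampled keys
// into roughly equal parts.
function pickSplitPoints(sampleKeys, shardCount) {
  const sorted = sampleKeys.slice().sort(compareKeys);
  const points = [];
  if (sorted.length === 0) return points;

  for (let i = 1; i < shardCount; i++) {
    const candidate = sorted[Math.floor((i * sorted.length) / shardCount)];
    const last = points[points.length - 1];
    // Skip a candidate equal to the previous point; a chunk cannot be
    // split twice at the same key.
    if (!last || compareKeys(last, candidate) !== 0) {
      points.push(candidate);
    }
  }
  return points;
}
```

The sample itself could come from the source cluster, for example through an aggregation with a $sample stage that projects only the style and color fields. A heavily skewed distribution (one dominant style and color) would yield fewer usable split points, which is worth checking before relying on the result.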
After the indexes have been copied, review them and add new indexes or remove unwanted ones as needed.