
Sharded MongoDB config on Nutanix (1) : Deployment

So far I have posted on MongoDB deployments either as standalone or as part of a replica set. This is fine when you can size your VM memory to hold the entire database working set. However, if your VM’s RAM will not accommodate the working set in memory, you will need to shard to aggregate RAM from multiple replica sets and form a MongoDB cluster.
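
As a rough sanity check, you can compare the data and index sizes reported by db.stats() against the RAM you can give a single replica set; if the working set clearly will not fit, sharding is worth planning for. A minimal sketch, run here against the ycsb database used later in this post:

use ycsb
db.stats(1024*1024*1024)   // scale sizes to GB; compare dataSize + indexSize with available RAM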

Having already discussed using clones of gold image VMs to create the members of a replica set, the most basic of MongoDB sharded clusters requires at least two replica sets. On top of these we need a number of MongoDB “infrastructure” VMs that make cluster operation possible: a minimum of three (3) Configuration Databases (mongod --configsvr) per cluster and around one (1) Query Router (mongos) for every two shards. Here is the layout of a cluster deployment on my lab system:

[Figure: 2shard-system – layout of the two-shard cluster deployment on the lab system]

In the above lab deployment, for availability reasons, I avoid co-locating any primary replica VM on the same physical host, and likewise any of the Query Router or ConfigDB VMs. One thing to bear in mind is that sharding is done on a per-collection basis. Simply put, the idea behind sharding is that you split the collections across the replica sets, and then by connecting to a mongos process you are routed to the appropriate shard holding the part of the collection that can serve your query. The following commands show the syntax to create one of the three required config databases (run on three separate VMs, and these need to be started first) and a Query Router, or mongos, process (where we add the IP addresses of each configdb server VM):

Config DB Servers – each run as:
mongod --configsvr --dbpath /data/configdb --port 27019

Query Router – run as:
mongos --configdb 10.68.64.142:27019,10.68.64.143:27019,10.68.64.145:27019

- the above IP addresses in the mongos command line are those of the individual config DB servers.
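
For completeness, the shard-bearing replica sets themselves are started in the usual way, as covered in the earlier replica set posts. A minimal sketch, assuming a data path of /data/db and using the rs01 member addresses that appear in the sh.status() output further down:

Each rs01 member VM – run as:
mongod --replSet rs01 --dbpath /data/db --port 27017

Then, from the mongo shell on one member, initiate the set and add the other members:
rs.initiate()
rs.add("10.68.64.131:27017")
rs.add("10.68.64.144:27017")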

This brings up an issue if you are not cloning replica VMs from “blank” gold VMs. If you clone a new replica set from a currently working replica set, so that each replica set essentially holds a full copy of all your databases and their collections, then when you come to add that replica set as a shard you generate the error condition shown below.

Here’s an example of what can happen when you attempt to add a shard whose replica set (rs02) is simply cloned off a currently running replica set (rs01):

mongos> sh.addShard("rs02/192.168.1.52")
{
 "ok" : 0,
 "errmsg" : "can't add shard rs02/192.168.1.52:27017 because a local database 'ycsb' 
exists in another rs01:rs01/192.168.1.27:27017,192.168.1.32:27017,192.168.1.65:27017"
}
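
If you do hit this, one way out (a sketch only, and destructive, since it deletes the duplicated data on the clone) is to connect to the primary of the cloned replica set and drop the offending database(s) before retrying sh.addShard():

$ mongo --host 192.168.1.52 --port 27017
rs02:PRIMARY> use ycsb
rs02:PRIMARY> db.dropDatabase()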

Here is the successful workflow for adding both shards (specifying the primary of each replica set) via the mongos router VM:

$ mongo --host localhost --port 27017
MongoDB shell version: 3.0.3
connecting to: localhost:27017/test
mongos>
 
mongos> sh.addShard("rs01/10.68.64.111")
{ "shardAdded" : "rs01", "ok" : 1 }
mongos> sh.addShard("rs02/10.68.64.110")
{ "shardAdded" : "rs02", "ok" : 1 }

We next need to enable sharding on the database and subsequently shard the collection we want to distribute across the available replica sets. The choice of shard key is crucial to future MongoDB cluster performance. Issues such as read and write scaling, cardinality, etc. are covered here. For my test cluster I am using the _id field for demonstration purposes.

mongos> sh.enableSharding("ycsb")
{ "ok" : 1 }

mongos> sh.shardCollection("ycsb.usertable", { "_id": 1})
{ "collectionsharded" : "ycsb.usertable", "ok" : 1 }

The balancer process will run for the period of time needed to migrate data between the available shards. This can take anywhere from a number of hours to a number of days, depending on the size of the collection, the number of shards, the current workload, etc. Once complete, however, it results in the following sharding status output. Notice that the “chunks” of the usertable collection held in the ycsb database are now spread across both shards (522 chunks in each shard):

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("55f96e6c5dfc4a5c6490bea3")
  }
  shards:
    { "_id" : "rs01", "host" : "rs01/10.68.64.111:27017,10.68.64.131:27017,10.68.64.144:27017" }
    { "_id" : "rs02", "host" : "rs02/10.68.64.110:27017,10.68.64.114:27017,10.68.64.137:27017" }
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      No recent migrations
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "enron_mail", "partitioned" : false, "primary" : "rs01" }
    { "_id" : "mydocs", "partitioned" : false, "primary" : "rs01" }
    { "_id" : "sbtest", "partitioned" : false, "primary" : "rs01" }
    { "_id" : "ycsb", "partitioned" : true, "primary" : "rs01" }
      ycsb.usertable
        shard key: { "_id" : 1 }
        chunks:
          rs01  522
          rs02  522
        too many chunks to print, use verbose if you want to force print
    { "_id" : "test", "partitioned" : false, "primary" : "rs02" }

Additional Links: