Wednesday, August 13, 2014

How to Set Up MongoDB Replication Using Replica Set and Arbiters

If you are running MongoDB in a production environment, it is essential that you set up real-time replication of your primary instance.
Using a replica set, you can also scale horizontally and distribute the read load across multiple MongoDB nodes.
This tutorial explains in detail how to set up MongoDB replication.

You can set up MongoDB replication in several configurations, but we’ll discuss the minimum setup here.
MongoDB recommends a minimum of three nodes in a replica set. Of those three nodes, two store data, and the third can be just an arbiter node.
An arbiter node doesn’t hold any data, but it participates in the election when the primary goes down.
The following is the minimal setup that is explained in this tutorial.
[Diagram: MongoDB Replica Set]

1. Install MongoDB

First, install MongoDB on all three nodes.
yum install mongo-10gen mongo-10gen-server

2. Modify /etc/hosts file

On all three servers (mongodb1, mongodb2 and arbiter1), modify the /etc/hosts file and add the following.
192.168.100.1 mongodb1
192.168.100.2 mongodb2
192.168.100.3 arbiter1
Make sure all these nodes can reach each other using these hostnames. You can also use the FQDN here. For example, instead of mongodb1, you can use mongodb1.thegeekstuff.com
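As a quick sanity check, the following sketch verifies that each member has exactly one hosts entry. It runs against a temporary copy of the entries above, so the hostnames and IPs are this tutorial’s examples; on a real node, point HOSTS_FILE at /etc/hosts instead.

```shell
# Hedged sketch: verify each replica set member has exactly one hosts entry.
# HOSTS_FILE is a temporary copy here; use /etc/hosts on a real node.
HOSTS_FILE=$(mktemp)
printf '192.168.100.1 mongodb1\n192.168.100.2 mongodb2\n192.168.100.3 arbiter1\n' > "$HOSTS_FILE"

for h in mongodb1 mongodb2 arbiter1; do
  count=$(grep -cw "$h" "$HOSTS_FILE")
  if [ "$count" -eq 1 ]; then
    echo "$h: ok"
  else
    echo "$h: found $count entries"
  fi
done
```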

3. Enable Auth on all MongoDB Nodes

While this is not mandatory for the replica set functionality, don’t run your production MongoDB instance without authentication enabled.
By default, auth is not enabled on MongoDB. Add the following line to your mongod.conf file.
# vi /etc/mongod.conf
auth = true
Restart the mongodb instance after the above change.
service mongod restart
Also make sure you create an admin username and password in the admin database, so that you can run the replica set commands. (db.addUser is the 2.x-era command; newer MongoDB versions use db.createUser instead.)
> use admin
switched to db admin
> db.addUser("admin", "SecretPwd");
Note: Do the above on all the mongodb nodes.

4. On mongodb1: Restore Existing DB

If you already have a single-instance MongoDB server running and would like to migrate it to this new three-node replica set configuration, take a backup using mongodump, and restore it on the mongodb1 instance using the mongorestore command.
mongorestore --dbpath /var/lib/mongo --db ${db_destination} --drop dump/${db_source}
After the restore, if the file ownership under the /var/lib/mongo directory is different, fix it as shown below.
cd /var/lib/mongo
chown mongod:mongod *
service mongod start

5. On mongodb1: Add replSet to mongod.conf

Add the following line to the mongod.conf file. You can give any value here. I’ve given “prodRepl”, but it can be anything, as long as it is the same on all the nodes.
# vi /etc/mongod.conf
replSet = prodRepl
Restart the mongod instance after this change. At this stage, you’ll notice that there is no replica set configuration yet. rs.conf() will display “null” as shown below.
> use admin;
> db.auth("admin","SecretPwd");
> rs.conf();
null

6. Setup KeyFile for Replication Auth on all MongoDB Nodes

On all the MongoDB nodes, create a keyfile with some random password. The important thing is that this password must be identical across all the MongoDB nodes.
# mkdir -p /root/data
# vi /root/data/keyfile
SecretPwdReplicaSetMongoDB
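Instead of typing a password by hand, you can also generate a random keyfile. The following is a hedged sketch; it writes to /tmp/mongodb-keyfile for illustration, whereas this tutorial keeps the keyfile at /root/data/keyfile, so adjust the path accordingly.

```shell
# Generate 48 random bytes and base64-encode them as the keyfile contents.
head -c 48 /dev/urandom | base64 > /tmp/mongodb-keyfile

# mongod will not start if the keyfile is readable by group or others.
chmod 600 /tmp/mongodb-keyfile
```

Copy the same file to every node, including the arbiter, so the contents match everywhere.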
Add the following line to the mongod.conf file on all the nodes.
# vi /etc/mongod.conf
keyFile = /root/data/keyfile
Set appropriate permissions on the keyfile and restart the mongod instance as shown below. Note that mongod will refuse to start if the keyfile is readable by group or others.
chown mongod:mongod /root/data/keyfile
chmod 600 /root/data/keyfile
service mongod restart
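For reference, after steps 3, 5 and 6, the mongod.conf on mongodb1 and mongodb2 should contain all three of the settings added so far (2.x-era configuration syntax):

```
# /etc/mongod.conf (relevant lines only)
auth = true
replSet = prodRepl
keyFile = /root/data/keyfile
```

Note that specifying keyFile also implies auth, so keeping both lines is redundant but harmless.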

7. On mongodb1: Initiate the Replica Set

Now, it is time to initiate the replica set using the rs.initiate() command, as shown below.
> use admin
> db.auth("admin","SecretPwd");
> rs.initiate();
{
  "info2" : "no configuration explicitly specified -- making one",
  "me" : "mongodb1:27017",
  "info" : "Config now saved locally.  Should come online in about a minute.",
  "ok" : 1
}
Right after the initiate, you’ll notice that the configuration is not null anymore. Also, you’ll notice that the mongodb prompt changed from “>” to “prodRepl:PRIMARY>” as shown below.
> rs.config();
{
  "_id" : "prodRepl",
  "version" : 1,
  "members" : [
     {
       "_id" : 0,
       "host" : "mongodb1:27017"
     }
  ]
}

8. On mongodb1: View Log and Replica Set Status

At this stage, on mongodb1, mongod.log will display something similar to the following:
# tail /var/log/mongo/mongod.log
Sat Feb 22 18:11:30.995 [conn2] ******
Sat Feb 22 18:11:30.995 [conn2] replSet info saving a newer config version to local.system.replset
Sat Feb 22 18:11:30.996 [conn2] replSet saveConfigLocally done
Sat Feb 22 18:11:30.996 [conn2] replSet replSetInitiate config now saved locally.  Should come online in about a minute.
Sat Feb 22 18:11:34.568 [rsStart] replSet I am mongodb1:27017
Sat Feb 22 18:11:34.569 [rsStart] replSet STARTUP2
Sat Feb 22 18:11:35.570 [rsSync] replSet SECONDARY
Sat Feb 22 18:11:35.570 [rsMgr] replSet info electSelf 0
Sat Feb 22 18:11:36.570 [rsMgr] replSet PRIMARY
...
Also, the status will display the following, indicating that there is only one node added to the replica set so far.
prodRepl:PRIMARY> rs.status();
{
"set" : "prodRepl",
"date" : ISODate("2014-02-22T06:28:49Z"),
"myState" : 1,
"members" : [
  {
    "_id" : 0,
    "name" : "mongodb1:27017",
    "health" : 1,
    "state" : 1,
    "stateStr" : "PRIMARY",
    "uptime" : 645,
    "optime" : Timestamp(1438853170, 1),
    "optimeDate" : ISODate("2014-02-22T06:19:30Z"),
    "self" : true
  }
],
"ok" : 1
}

9. On mongodb1: Add the 2nd node

On the mongodb1 server, add the 2nd node using the rs.add() command as shown below.
prodRepl:PRIMARY> rs.add("mongodb2");
{ "ok" : 1 }
After you’ve added the node, you’ll notice that the rs.config() command shows both the nodes as shown below.
prodRepl:PRIMARY> rs.config();
{
  "_id" : "prodRepl",
  "version" : 2,
  "members" : [
     {
       "_id" : 0,
       "host" : "mongodb1:27017"
     },
     {
       "_id" : 1,
       "host" : "mongodb2:27017"
     }
  ]
}

10. Initial Sync Between the Nodes

As shown in the rs.status() output below, the mongodb2 node is in the STARTUP2 state while it performs the initial clone of the database from mongodb1.
Depending on the size of the database on mongodb1, this might take some time to complete.
prodRepl:PRIMARY> rs.status();
{
  "set" : "prodRepl",
  "date" : ISODate("2014-02-22T21:27:53Z"),
  "myState" : 1,
  "members" : [
   {
     "_id" : 0,
     "name" : "mongodb1:27017",
     "health" : 1,
     "state" : 1,
     "stateStr" : "PRIMARY",
     "uptime" : 225,
     "optime" : Timestamp(1343239634, 1),
     "optimeDate" : ISODate("2014-02-22T21:27:14Z"),
     "self" : true
   },
   {
     "_id" : 1,
     "name" : "mongodb2:27017",
     "health" : 1,
     "state" : 5,
     "stateStr" : "STARTUP2",
     "uptime" : 39,
     "optime" : Timestamp(0, 0),
     "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
     "lastHeartbeat" : ISODate("2014-02-22T21:27:52Z"),
     "lastHeartbeatRecv" : ISODate("2014-02-22T21:27:52Z"),
     "pingMs" : 2,
     "lastHeartbeatMessage" : "initial sync cloning db: mongoprod"
   }
  ],
  "ok" : 1
}
Once the sync is completed, the state of mongodb2 changes to “SECONDARY”, indicating that the data is all synced and the node is ready to go.
prodRepl:PRIMARY> rs.status();
{
  "set" : "prodRepl",
  "date" : ISODate("2014-02-22T22:03:21Z"),
  "myState" : 1,
  "members" : [
    {
      "_id" : 0,
      "name" : "mongodb1:27017",
      "health" : 1,
      "state" : 1,
      "stateStr" : "PRIMARY",
      "uptime" : 2353,
      "optime" : Timestamp(1394309634, 1),
      "optimeDate" : ISODate("2014-02-22T21:27:14Z"),
      "self" : true
    },
    {
      "_id" : 1,
      "name" : "mongodb2:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 2167,
      "optime" : Timestamp(1394309634, 1),
      "optimeDate" : ISODate("2014-02-22T21:27:14Z"),
      "lastHeartbeat" : ISODate("2014-02-22T22:03:21Z"),
      "lastHeartbeatRecv" : ISODate("2014-02-22T22:03:20Z"),
      "pingMs" : 0,
      "syncingTo" : "mongodb1:27017"
    }
],
"ok" : 1
}
Note: If you log in to the mongodb2 node and execute the above command, you’ll see exactly the same output, as both nodes are now part of the same replica set.

11. On arbiter1: Start the mongod instance

The arbiter node doesn’t need to be as powerful as the mongodb1 and mongodb2 nodes. You can pick an existing server that is running some other production application and run the arbiter on it, as it doesn’t consume a lot of resources.
Create an empty directory to use as the mongod instance’s dbpath. As we explained earlier, this directory will not contain any data from the mongodb1 or mongodb2 servers. It is needed just to start the mongod instance on the arbiter node.
The key thing is to specify the replica set name using --replSet as shown below. Since the replica set uses keyfile authentication, the arbiter must present the same keyfile via --keyFile. If you’d like to change the default MongoDB port, use the --port option as shown below. Run this in the background.
mkdir /var/lib/mongo/data

nohup mongod --port 30000 --dbpath /var/lib/mongo/data --replSet prodRepl --keyFile /root/data/keyfile &
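If you’d rather run the arbiter under the regular init script instead of a backgrounded nohup command, the equivalent settings can go into /etc/mongod.conf on arbiter1. This is a hedged sketch; the port and paths are this tutorial’s examples.

```
# /etc/mongod.conf on arbiter1 (relevant lines only)
port = 30000
dbpath = /var/lib/mongo/data
replSet = prodRepl
keyFile = /root/data/keyfile
```

Then start it with service mongod start.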

12. On mongodb1: Add arbiter1 node

Now, on the mongodb1 primary node, add this arbiter node using the rs.addArb() command as shown below.
prodRepl:PRIMARY> rs.addArb("arbiter1:30000");
{ "ok" : 1 }
Now if you view the configuration using rs.config command, you’ll see all three nodes.
prodRepl:PRIMARY> rs.config();
{
  "_id" : "prodRepl",
  "version" : 3,
  "members" : [
    {
      "_id" : 0,
      "host" : "mongodb1:27017"
    },
    {
      "_id" : 1,
      "host" : "mongodb2:27017"
    },
    {
      "_id" : 2,
      "host" : "arbiter1:30000",
      "arbiterOnly" : true
    }
  ]
}

13. Verify Final Replication Status

Execute the rs.status() command, which will show the current status of all three nodes. In this case, everything looks good and functional.
As indicated by the “stateStr” values, we now have a PRIMARY, a SECONDARY and an ARBITER. This is the minimum configuration required to get a working MongoDB replica set.
Please note that you can execute the rs.config() and rs.status() commands on any one of the three nodes, and they will display the same results.
prodRepl:PRIMARY> rs.status();
{
  "set" : "prodRepl",
  "date" : ISODate("2014-02-22T22:59:15Z"),
  "myState" : 1,
  "members" : [
     {
       "_id" : 0,
       "name" : "mongodb1:27017",
       "health" : 1,
       "state" : 1,
       "stateStr" : "PRIMARY",
       "uptime" : 5707,
       "optime" : Timestamp(1343042482, 1),
       "optimeDate" : ISODate("2014-02-22T22:14:42Z"),
       "self" : true
     },
     {
       "_id" : 1,
       "name" : "mongodb2:27017",
       "health" : 1,
       "state" : 2,
       "stateStr" : "SECONDARY",
       "uptime" : 5521,
       "optime" : Timestamp(1343042482, 1),
       "optimeDate" : ISODate("2014-02-22T22:14:42Z"),
       "lastHeartbeat" : ISODate("2014-02-22T22:59:14Z"),
       "lastHeartbeatRecv" : ISODate("2014-02-22T22:59:13Z"),
       "pingMs" : 0,
       "syncingTo" : "mongodb1:27017"
     },
     {
       "_id" : 2,
       "name" : "arbiter1:30000",
       "health" : 1,
       "state" : 7,
       "stateStr" : "ARBITER",
       "uptime" : 39,
       "lastHeartbeat" : ISODate("2014-02-22T22:59:14Z"),
       "lastHeartbeatRecv" : ISODate("2014-02-22T22:59:15Z"),
       "pingMs" : 0
     }
  ],
  "ok" : 1
}
