MongoDB Master-Slave Replication and Replica Set Practice


References:
http://www.cnblogs.com/huangxincheng/archive/2012/03/04/2379755.html
http://blog.sina.com.cn/s/blog_a34f10a4010113df.html

I. Deploying Master-Slave Replication

1. Both the master and the slave must start with security authentication enabled: --auth
2. The admin database on both the master and the slave must contain a global user.
3. The local database on the master and the local database on the slave must each contain a user named repl with the same password.

Master server setup:
dongguo@mongodb:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
> use admin
switched to db admin
> db.addUser('rootm','rootm')
{
	"user" : "rootm",
	"readOnly" : false,
	"pwd" : "aa6526e3b7cbcecc18b2bd822f7c3547",
	"_id" : ObjectId("50659e14d2fe6be605337c18")
}
> db.addUser('repl','repl')
{
	"user" : "repl",
	"readOnly" : false,
	"pwd" : "c9f242649c23670ff94c4ca00ea06fe7",
	"_id" : ObjectId("5065a85eccf77b17681365b7")
}
> use cloud
switched to db cloud
> db.addUser('repl','repl')
{
	"_id" : ObjectId("5065a7cbb70f43c4d157e8ec"),
	"user" : "repl",
	"readOnly" : false,
	"pwd" : "c9f242649c23670ff94c4ca00ea06fe7"
}
> use local
switched to db local
> db.addUser('repl','repl')
{
	"user" : "repl",
	"readOnly" : false,
	"pwd" : "c9f242649c23670ff94c4ca00ea06fe7",
	"_id" : ObjectId("50659e2cd2fe6be605337c19")
}
> exit
bye
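Notice that repl shows the exact same pwd hash in admin, cloud and local. In MongoDB 2.x, db.addUser() stores pwd as the hex MD5 digest of "<user>:mongo:<password>" (the MONGODB-CR credential scheme), so the hash depends only on the username and password, never on the database. A minimal sketch of that derivation:

```python
import hashlib

def mongodb_cr_pwd(user: str, password: str) -> str:
    """Reproduce the 2.x 'pwd' field: hex MD5 of 'user:mongo:password'."""
    return hashlib.md5(f"{user}:mongo:{password}".encode()).hexdigest()

# For repl/repl this reproduces the pwd value printed by db.addUser above;
# the digest depends only on (user, password), which is why 'repl' carries
# the same hash in every database.
print(mongodb_cr_pwd("repl", "repl"))
```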

dongguo@mongodb:~$ sudo /etc/init.d/mongod stop

dongguo@mongodb:~$ sudo vim /opt/mongodb/etc/mongod.conf
Change the following settings:

master = true
auth = true

dongguo@mongodb:~$ sudo /etc/init.d/mongod start

dongguo@mongodb:~$ vim /etc/init.d/mongod
Add the username and password to the shutdown command:

$CLIEXEC admin -u rootm -p rootm --eval "db.shutdownServer()"

The log shows that both the auth and master options have taken effect.
dongguo@mongodb:~$ tailf /opt/mongodb/log/mongodb.log

Fri Sep 28 20:58:33 [initandlisten] MongoDB starting : pid=15992 port=27017 dbpath=/opt/mongodb/data master=1 64-bit host=mongodb
Fri Sep 28 20:58:33 [initandlisten] db version v2.2.0, pdfile version 4.5
Fri Sep 28 20:58:33 [initandlisten] git version: f5e83eae9cfbec7fb7a071321928f00d1b0c5207
Fri Sep 28 20:58:33 [initandlisten] build info: Linux ip-10-2-29-40 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_49
Fri Sep 28 20:58:33 [initandlisten] options: { auth: "true", config: "/opt/mongodb/etc/mongod.conf", dbpath: "/opt/mongodb/data", fork: "true", logappend: "true", logpath: "/opt/mongodb/log/mongodb.log", master: "true", pidfilepath: "/opt/mongodb/run/mongod.pid", port: 27017 }
Fri Sep 28 20:58:33 [initandlisten] journal dir=/opt/mongodb/data/journal

Slave server setup:
dongguo@mongodb-node1:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
> use admin
switched to db admin
> db.addUser('roots','roots')
{
	"user" : "roots",
	"readOnly" : false,
	"pwd" : "430b42be2092c65349e29043412a9155",
	"_id" : ObjectId("50659fc1eb520e528f23454d")
}
> db.addUser('repl','repl')
{
	"user" : "repl",
	"readOnly" : false,
	"pwd" : "c9f242649c23670ff94c4ca00ea06fe7",
	"_id" : ObjectId("5065a9772485dc71fdfe1af5")
}
> use cloud
switched to db cloud
> db.addUser('repl','repl')
{
	"user" : "repl",
	"readOnly" : false,
	"pwd" : "c9f242649c23670ff94c4ca00ea06fe7",
	"_id" : ObjectId("5065a9a22485dc71fdfe1af6")
}
> use local
switched to db local
> db.addUser('repl','repl')
{
	"user" : "repl",
	"readOnly" : false,
	"pwd" : "c9f242649c23670ff94c4ca00ea06fe7",
	"_id" : ObjectId("50659fd1eb520e528f23454e")
}
> exit
bye

dongguo@mongodb-node1:~$ sudo /etc/init.d/mongod stop

dongguo@mongodb-node1:~$ sudo vim /opt/mongodb/etc/mongod.conf
Change the following settings:

slave = true
auth = true
source = 10.6.1.145:27017

dongguo@mongodb-node1:~$ sudo /etc/init.d/mongod start

dongguo@mongodb-node1:~$ vim /etc/init.d/mongod
Add the username and password to the shutdown command:

$CLIEXEC admin -u roots -p roots --eval "db.shutdownServer()"

The log shows that the auth and slave options have taken effect, and that the data has started replicating.
dongguo@mongodb-node1:~$ tailf /opt/mongodb/log/mongodb.log

Fri Sep 28 21:05:26 [initandlisten] options: { auth: "true", config: "/opt/mongodb/etc/mongod.conf", dbpath: "/opt/mongodb/data", fork: "true", logappend: "true", logpath: "/opt/mongodb/log/mongodb.log", pidfilepath: "/opt/mongodb/run/mongod.pid", port: 27017, slave: "true", source: "10.6.1.145:27017" }
Fri Sep 28 21:05:26 [initandlisten] journal dir=/opt/mongodb/data/journal
Fri Sep 28 21:05:26 [initandlisten] recover : no journal files present, no recovery needed
Fri Sep 28 21:05:26 [initandlisten] waiting for connections on port 27017
Fri Sep 28 21:05:26 [websvr] admin web console waiting for connections on port 28017
Fri Sep 28 21:05:27 [replslave] build index local.sources { _id: 1 }
Fri Sep 28 21:05:27 [replslave] build index done.  scanned 0 total records. 0 secs
Fri Sep 28 21:05:27 [replslave] repl: syncing from host:10.6.1.145:27017
Fri Sep 28 21:05:27 [replslave] build index local.me { _id: 1 }
Fri Sep 28 21:05:27 [replslave] build index done.  scanned 0 total records. 0 secs
Fri Sep 28 21:05:36 [replslave] repl:   applied 1 operations
Fri Sep 28 21:05:36 [replslave] repl:  end sync_pullOpLog syncedTo: Sep 28 21:05:38 5065a0a2:1
Fri Sep 28 21:05:36 [replslave] repl: syncing from host:10.6.1.145:27017
Fri Sep 28 21:05:36 [replslave] resync: dropping database cloud
Fri Sep 28 21:05:36 [replslave] removeJournalFiles
Fri Sep 28 21:05:36 [replslave] resync: cloning database cloud to get an initial copy
Fri Sep 28 21:05:36 [FileAllocator] allocating new datafile /opt/mongodb/data/cloud.ns, filling with zeroes...
Fri Sep 28 21:05:36 [FileAllocator] done allocating datafile /opt/mongodb/data/cloud.ns, size: 16MB,  took 0 secs
Fri Sep 28 21:05:36 [FileAllocator] allocating new datafile /opt/mongodb/data/cloud.0, filling with zeroes...
Fri Sep 28 21:05:36 [FileAllocator] done allocating datafile /opt/mongodb/data/cloud.0, size: 64MB,  took 0.001 secs
Fri Sep 28 21:05:36 [FileAllocator] allocating new datafile /opt/mongodb/data/cloud.1, filling with zeroes...
Fri Sep 28 21:05:36 [FileAllocator] done allocating datafile /opt/mongodb/data/cloud.1, size: 128MB,  took 0.004 secs
Fri Sep 28 21:05:36 [replslave] build index cloud.vm_instance { _id: 1 }
Fri Sep 28 21:05:36 [replslave] 	 fastBuildIndex dupsToDrop:0
Fri Sep 28 21:05:36 [replslave] build index done.  scanned 79 total records. 0.014 secs
Fri Sep 28 21:05:36 [replslave] build index cloud.system.users { _id: 1 }
Fri Sep 28 21:05:36 [replslave] 	 fastBuildIndex dupsToDrop:0
Fri Sep 28 21:05:36 [replslave] build index done.  scanned 1 total records. 0 secs
Fri Sep 28 21:05:36 [replslave] resync: done with initial clone for db: cloud
Fri Sep 28 21:05:42 [replslave] repl: syncing from host:10.6.1.145:2701
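The syncedTo value in this log ("Sep 28 21:05:38 5065a0a2:1") is an OpTime: UNIX seconds in hexadecimal, plus an ordinal counter within that second. Decoding it is handy when comparing the slave's progress with the master's oplog. A small sketch (the UTC+8 offset is inferred from this log's local timestamps):

```python
from datetime import datetime, timezone

def decode_optime(optime: str):
    """Split an OpTime like '5065a0a2:1' into (UTC datetime, increment)."""
    secs_hex, inc = optime.split(":")
    ts = int(secs_hex, 16)           # hex UNIX seconds
    return datetime.fromtimestamp(ts, tz=timezone.utc), int(inc)

when, inc = decode_optime("5065a0a2:1")
# 0x5065a0a2 = 1348837538 -> 2012-09-28 13:05:38 UTC,
# i.e. 21:05:38 in the log's local timezone (UTC+8).
print(when.isoformat(), inc)
```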

Next, log in to the database and verify by hand:
dongguo@mongodb-node1:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
> use cloud;
switched to db cloud
> db.auth('repl','repl')
1
> db.vm_instance
db.vm_instance
> db.vm_instance.find()
"uuid" : "1e5de227-547e-4ed0-957a-8884559d01cd", "created" : "2012-06-29 00:40:53" }
{ "_id" : ObjectId("50658bd04ff74b0e8f2f6488"), "id" : 5, "instance_name" : "i-2-5-VM", "private_ip_address" : "10.6.223.41", "uuid" : "123f8676-2fc1-413a-bdec-89e54835ad6a", "created" : "2012-06-29 01:20:37" }
{ "_id" : ObjectId("50658bd04ff74b0e8f2f6489"), "id" : 6, "instance_name" : "i-2-6-VM", "private_ip_address" : "10.6.223.139", "uuid" : "7d39898e-cc0d-46b2-a1d4-170201acf832", "created" : "2012-06-29 03:55:32" }
{ "_id" : ObjectId("50658bd04ff74b0e8f2f648a"), "id" : 7, "instance_name" : "i-2-7-VM", "private_ip_address" : "10.6.223.140", "uuid" : "ad0690bb-6738-42e3-a3ee-5a93b47fd3a4", "created" : "2012-06-29 03:55:52" }
{ "_id" : ObjectId("50658bd04ff74b0e8f2f648b"), "id" : 8, "instance_name" : "i-2-8-VM", "private_ip_address" : "10.6.223.142", "uuid" : "5beb13b5-8319-43a0-a65f-e7c7dde4334f", "created" : "2012-06-29 03:56:07" }
{ "_id" : ObjectId("50658bd04ff74b0e8f2f648c"), "id" : 9, "instance_name" : "i-2-9-VM", "private_ip_address" : "10.6.223.143",
Type "it" for more
> 
bye

The data in the cloud database has been replicated successfully; the master-slave setup works.

II. Deploying a Replica Set

Next we build a replica set, a more advanced setup. Unlike master-slave replication:
1. the cluster has no fixed master database;
2. if the current primary goes down, the cluster elects one of the secondaries to take over as primary, giving automatic failover.
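The election behind point 2 requires a strict majority of voting members, which is why replica sets are deployed with an odd number of voters. A minimal sketch of just the quorum arithmetic (not the real election protocol, which also weighs priorities and oplog freshness):

```python
def majority(voters: int) -> int:
    """Votes needed to elect a primary: a strict majority of all voters."""
    return voters // 2 + 1

def can_elect(voters: int, reachable: int) -> bool:
    """Can the reachable members still elect a primary?"""
    return reachable >= majority(voters)

# Three members, one down: the remaining two can still elect a primary.
print(can_elect(3, 2))   # True
# Two members, one down: the survivor alone cannot, so no failover.
print(can_elect(2, 1))   # False
```

This is the reason an arbiter (introduced later in this article) is useful: it adds a vote without storing data, keeping the voter count odd.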

First, prepare three servers: mongodb, mongodb-node1 and mongodb-node2.

Make sure that only mongodb holds data and that mongodb-node1 and mongodb-node2 are both empty; if they contain data, remove it with use dbname followed by db.dropDatabase().
dongguo@mongodb-node1:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
> show dbs
local	(empty)
> 
bye

dongguo@mongodb-node2:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
> show dbs
local	(empty)
> 
bye

dongguo@mongodb:~$ mongo admin

MongoDB shell version: 2.2.0
connecting to: admin
> show dbs
admin	0.203125GB
cloud	0.203125GB
local	1.203125GB

Also revert the earlier master-slave settings on mongodb and mongodb-node1:
dongguo@mongodb:~$ sudo vim /opt/mongodb/etc/mongod.conf
Change the following settings:

#master = true
#auth = true

dongguo@mongodb-node1:~$ sudo vim /opt/mongodb/etc/mongod.conf
Change the following settings:

#slave = true
#auth = true
#source = 10.6.1.145:27017

And update the service management script:
$ vim /etc/init.d/mongod

$CLIEXEC admin --eval "db.shutdownServer()"

Then add the --replSet startup parameter on all three servers.
The replica set needs a name, here MoboTap; --replSet tells each server that there are other members of MoboTap.
$ sudo vim /opt/mongodb/etc/mongod.conf

replSet = MoboTap

Then restart MongoDB on all three servers:
$ sudo /etc/init.d/mongod stop
$ sudo /etc/init.d/mongod start

Initialize the replica set on the mongodb server:
dongguo@mongodb:~$ mongo admin

MongoDB shell version: 2.2.0
connecting to: admin
> show dbs
admin	0.203125GB
cloud	0.203125GB
local	1.203125GB
> db.runCommand({"replSetInitiate":{
... "_id":"MoboTap",
... "members":[
... {
... "_id":1,
... "host":"10.6.1.145:27017" 
... }, 
... {
... "_id":2, 
... "host":"10.6.1.146:27017"
... },
... {
... "_id":3,
... "host":"10.6.1.147:27017"
... }
... ]}})
{
	"info" : "Config now saved locally.  Should come online in about a minute.",
	"ok" : 1
}
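The replSetInitiate document above has a simple shape: the set's _id plus a members array of {_id, host} entries. When automating deployments it can be generated from an ordered host list; a sketch (host addresses taken from this deployment):

```python
def replset_config(set_name, hosts):
    """Build a replSetInitiate config document from an ordered host list."""
    return {
        "_id": set_name,
        "members": [
            # Member _ids here start at 1, matching the document above.
            {"_id": i, "host": h} for i, h in enumerate(hosts, start=1)
        ],
    }

cfg = replset_config("MoboTap", [
    "10.6.1.145:27017",
    "10.6.1.146:27017",
    "10.6.1.147:27017",
])
print(cfg)
```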

Then check the primary/secondary roles across the replica set:
dongguo@mongodb:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:PRIMARY> 

dongguo@mongodb-node1:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:SECONDARY> 
MoboTap:SECONDARY> show dbs
admin	0.203125GB
cloud	0.203125GB
local	1.203125GB

dongguo@mongodb-node2:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:SECONDARY> 
MoboTap:SECONDARY> show dbs
admin	0.203125GB
cloud	0.203125GB
local	1.203125GB

Through the internal automatic election, mongodb took the PRIMARY role while node1 and node2 became SECONDARYs and successfully replicated the data from mongodb.

rs.status() gives a convenient view of every member's state.
dongguo@mongodb:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:PRIMARY> rs.status()
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-28T21:38:29Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 1,
			"name" : "10.6.1.145:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 8665,
			"optime" : Timestamp(1348866843000, 1),
			"optimeDate" : ISODate("2012-09-28T21:14:03Z"),
			"self" : true
		},
		{
			"_id" : 2,
			"name" : "10.6.1.146:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 1464,
			"optime" : Timestamp(1348866843000, 1),
			"optimeDate" : ISODate("2012-09-28T21:14:03Z"),
			"lastHeartbeat" : ISODate("2012-09-28T21:38:28Z"),
			"pingMs" : 0
		},
		{
			"_id" : 3,
			"name" : "10.6.1.147:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 1464,
			"optime" : Timestamp(1348866843000, 1),
			"optimeDate" : ISODate("2012-09-28T21:14:03Z"),
			"lastHeartbeat" : ISODate("2012-09-28T21:38:28Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}
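When scripting health checks, the members array in rs.status() is usually reduced to a host-to-state map plus a list of unhealthy hosts. A sketch, fed with the values from the output above:

```python
def summarize(members):
    """Map each member to its state string and collect unhealthy hosts."""
    states = {m["name"]: m["stateStr"] for m in members}
    unhealthy = [m["name"] for m in members if m.get("health") != 1]
    return states, unhealthy

# Trimmed-down members array from the rs.status() output above.
members = [
    {"name": "10.6.1.145:27017", "health": 1, "stateStr": "PRIMARY"},
    {"name": "10.6.1.146:27017", "health": 1, "stateStr": "SECONDARY"},
    {"name": "10.6.1.147:27017", "health": 1, "stateStr": "SECONDARY"},
]
states, unhealthy = summarize(members)
print(states["10.6.1.145:27017"], unhealthy)  # PRIMARY []
```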

A MongoDB replica set can also contain an arbiter, a member that only votes in elections and never replicates data.

We can add an arbiter, mongodb-node3, with the steps below.
Its MongoDB configuration is identical to mongodb-node2 above: it must likewise hold no data and start with the --replSet parameter.

After starting mongodb-node3, run the following on mongodb:
dongguo@mongodb:~$ mongo admin

MongoDB shell version: 2.2.0
connecting to: admin
MoboTap:PRIMARY> rs.addArb("10.6.1.148:27017")
{ "ok" : 1 }
MoboTap:PRIMARY> rs.status()
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-28T21:56:32Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 1,
			"name" : "10.6.1.145:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 9748,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"self" : true
		},
		{
			"_id" : 2,
			"name" : "10.6.1.146:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 2547,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"lastHeartbeat" : ISODate("2012-09-28T21:56:31Z"),
			"pingMs" : 0
		},
		{
			"_id" : 3,
			"name" : "10.6.1.147:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 464,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"lastHeartbeat" : ISODate("2012-09-28T21:56:30Z"),
			"pingMs" : 0
		},
		{
			"_id" : 4,
			"name" : "10.6.1.148:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 17,
			"lastHeartbeat" : ISODate("2012-09-28T21:56:31Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}

The arbiter has been added; its role is shown as ARBITER.

The replica set is now complete, so let's put its headline feature, automatic failover, to the test.
Kill the process of the server currently acting as PRIMARY:
dongguo@mongodb:~$ ps aux | grep mongod
root 17029 0.3 12.9 5480648 64836 ? Sl 03:14 0:31 /opt/mongodb/bin/mongod --config /opt/mongodb/etc/mongod.conf
dongguo 18010 0.0 0.1 6224 584 pts/0 S+ 06:01 0:00 grep --color=auto mongod
dongguo@mongodb:~$ sudo killall -9 mongod
dongguo@mongodb:~$ ps aux | grep mongod
dongguo 18013 0.0 0.1 6224 584 pts/0 S+ 06:01 0:00 grep --color=auto mongod

The process is gone. What does the topology look like now? Check from mongodb-node1:
dongguo@mongodb-node1:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:PRIMARY> rs.status()
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-28T22:02:23Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 1,
			"name" : "10.6.1.145:27017",
			"health" : 0,
			"state" : 8,
			"stateStr" : "(not reachable/healthy)",
			"uptime" : 0,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:01:12Z"),
			"pingMs" : 0,
			"errmsg" : "socket exception [CONNECT_ERROR] for 10.6.1.145:27017"
		},
		{
			"_id" : 2,
			"name" : "10.6.1.146:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 3111,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"self" : true
		},
		{
			"_id" : 3,
			"name" : "10.6.1.147:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 816,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:02:23Z"),
			"pingMs" : 0,
			"errmsg" : "syncing to: 10.6.1.146:27017"
		},
		{
			"_id" : 4,
			"name" : "10.6.1.148:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 369,
			"lastHeartbeat" : ISODate("2012-09-28T22:02:22Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}

Clearly mongodb-node1 has automatically taken over the PRIMARY role, and mongodb's state is now (not reachable/healthy).

Next, bring the mongod service on mongodb back up and see whether the roles change:
dongguo@mongodb:~$ sudo /etc/init.d/mongod start

dongguo@mongodb:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:SECONDARY> rs.status()
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-28T22:04:38Z"),
	"myState" : 2,
	"syncingTo" : "10.6.1.146:27017",
	"members" : [
		{
			"_id" : 1,
			"name" : "10.6.1.145:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 19,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"errmsg" : "syncing to: 10.6.1.146:27017",
			"self" : true
		},
		{
			"_id" : 2,
			"name" : "10.6.1.146:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 19,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:04:37Z"),
			"pingMs" : 0
		},
		{
			"_id" : 3,
			"name" : "10.6.1.147:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 19,
			"optime" : Timestamp(1348869269000, 1),
			"optimeDate" : ISODate("2012-09-28T21:54:29Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:04:37Z"),
			"pingMs" : 0
		},
		{
			"_id" : 4,
			"name" : "10.6.1.148:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 19,
			"lastHeartbeat" : ISODate("2012-09-28T22:04:37Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}

mongodb has automatically rejoined as a SECONDARY, while mongodb-node1 remains the PRIMARY.
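With the set healthy again, every member's optimeDate in the output above matches the primary's, i.e. replication lag is zero. A lag check is just a subtraction of optimeDate values; a sketch using the timestamp shown above:

```python
from datetime import datetime, timedelta

def lag(primary_optime: datetime, member_optime: datetime) -> timedelta:
    """How far a member's last applied op is behind the primary's."""
    return primary_optime - member_optime

# optimeDate shown for every member in the rs.status() output above.
opt = datetime(2012, 9, 28, 21, 54, 29)
print(lag(opt, opt))  # 0:00:00
```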

Now let's practice adding and removing members dynamically.
Bring up a new server, mongodb-node4, configured the same way as mongodb-node3.
Once mongodb-node4 is running, add it on the PRIMARY:

dongguo@mongodb-node1:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:PRIMARY> rs.add("10.6.1.149:27017");
{ "ok" : 1 }
MoboTap:PRIMARY> rs.status();
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-28T22:21:29Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 1,
			"name" : "10.6.1.145:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 1031,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:21:29Z"),
			"pingMs" : 0
		},
		{
			"_id" : 2,
			"name" : "10.6.1.146:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 4257,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"self" : true
		},
		{
			"_id" : 3,
			"name" : "10.6.1.147:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 1962,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:21:28Z"),
			"pingMs" : 1
		},
		{
			"_id" : 4,
			"name" : "10.6.1.148:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 222,
			"lastHeartbeat" : ISODate("2012-09-28T22:21:29Z"),
			"pingMs" : 0
		},
		{
			"_id" : 5,
			"name" : "10.6.1.149:27017",
			"health" : 1,
			"state" : 5,
			"stateStr" : "STARTUP2",
			"uptime" : 13,
			"optime" : Timestamp(0, 0),
			"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:21:28Z"),
			"pingMs" : 0,
			"errmsg" : "initial sync need a member to be primary or secondary to do our initial sync"
		}
	],
	"ok" : 1
}

The initial sync has not finished yet; check again after a while:
dongguo@mongodb-node4:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:SECONDARY> rs.status()
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-29T10:10:06Z"),
	"myState" : 2,
	"syncingTo" : "10.6.1.146:27017",
	"members" : [
		{
			"_id" : 1,
			"name" : "10.6.1.145:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 96,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"lastHeartbeat" : ISODate("2012-09-29T10:10:04Z"),
			"pingMs" : 0
		},
		{
			"_id" : 2,
			"name" : "10.6.1.146:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 96,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"lastHeartbeat" : ISODate("2012-09-29T10:10:04Z"),
			"pingMs" : 1
		},
		{
			"_id" : 3,
			"name" : "10.6.1.147:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 96,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"lastHeartbeat" : ISODate("2012-09-29T10:10:04Z"),
			"pingMs" : 0
		},
		{
			"_id" : 4,
			"name" : "10.6.1.148:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 96,
			"lastHeartbeat" : ISODate("2012-09-29T10:10:04Z"),
			"pingMs" : 0
		},
		{
			"_id" : 5,
			"name" : "10.6.1.149:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 228,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"self" : true
		}
	],
	"ok" : 1
}

The new member has joined the replica set successfully.

Next, test dynamic removal, using the just-added mongodb-node4:
dongguo@mongodb-node1:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:PRIMARY> rs.remove("10.6.1.149:27017");
{ "ok" : 1 }
MoboTap:PRIMARY> rs.status()
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-28T22:26:00Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 1,
			"name" : "10.6.1.145:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 62,
			"optime" : Timestamp(1348871096000, 1),
			"optimeDate" : ISODate("2012-09-28T22:24:56Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:26:00Z"),
			"pingMs" : 0,
			"errmsg" : "syncing to: 10.6.1.146:27017"
		},
		{
			"_id" : 2,
			"name" : "10.6.1.146:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 4528,
			"optime" : Timestamp(1348871096000, 1),
			"optimeDate" : ISODate("2012-09-28T22:24:56Z"),
			"self" : true
		},
		{
			"_id" : 3,
			"name" : "10.6.1.147:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 62,
			"optime" : Timestamp(1348871096000, 1),
			"optimeDate" : ISODate("2012-09-28T22:24:56Z"),
			"lastHeartbeat" : ISODate("2012-09-28T22:26:00Z"),
			"pingMs" : 0,
			"errmsg" : "syncing to: 10.6.1.146:27017"
		},
		{
			"_id" : 4,
			"name" : "10.6.1.148:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 62,
			"lastHeartbeat" : ISODate("2012-09-28T22:26:00Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}

The member mongodb-node4 has been removed from the replica set dynamically.

Log in to mongodb-node4 to confirm:
dongguo@mongodb-node4:~$ mongo

MongoDB shell version: 2.2.0
connecting to: test
MoboTap:REMOVED> rs.status()
{
	"set" : "MoboTap",
	"date" : ISODate("2012-09-29T10:14:03Z"),
	"myState" : 10,
	"members" : [
		{
			"_id" : 5,
			"name" : "10.6.1.149:27017",
			"health" : 1,
			"state" : 10,
			"stateStr" : "REMOVED",
			"uptime" : 465,
			"optime" : Timestamp(1348870876000, 1),
			"optimeDate" : ISODate("2012-09-28T22:21:16Z"),
			"self" : true
		}
	],
	"ok" : 1
}

Its own state, as expected, is now REMOVED.
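Across this article's rs.status() outputs we have now met most of the member states. The numeric state field maps onto the stateStr names as follows (the internal name for state 8, DOWN, is what the shell prints as "(not reachable/healthy)"):

```python
# Replica set member states observed in the rs.status() outputs above.
REPL_STATES = {
    1: "PRIMARY",
    2: "SECONDARY",
    5: "STARTUP2",   # performing its initial sync
    7: "ARBITER",    # votes only, holds no data
    8: "DOWN",       # shown as "(not reachable/healthy)"
    10: "REMOVED",   # no longer part of the set's config
}

print(REPL_STATES[7])  # ARBITER
```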

At this point we have successfully demonstrated the replica set's automatic failover and the dynamic addition and removal of members.
This approach is clearly more highly available than master-slave replication: it supports multiple members and lets them be added and removed on the fly for horizontal scale-out.
