MongoDBsharding分片-创新互联
背景
让客户满意是我们工作的目标,不断超越客户的期望值来自于我们对这个行业的热爱。我们立志把好的技术通过有效、简单的方式提供给客户,将通过不懈努力成为客户在信息化领域值得信任、有价值的长期合作伙伴,公司提供的服务项目有:国际域名空间、虚拟主机、营销软件、网站建设、汕城网站维护、网站推广。当MongoDB存储海量的数据时,一台机器可能不足以存储数据,也可能不足以提供可接受的读写吞吐量。这时,我们就可以通过在多台机器上分割数据,使得数据库系统能存储和处理更多的数据。
1、MongoDB sharding简介
三种角色:
配置服务器(config):是一个独立的mongod进程,保存集群和分片的元数据,即各分片包含了哪些数据的信息。
路由服务器(mongos):起到一个路由的功能,供程序连接。本身不保存数据,在启动时从配置服务器加载集群信息.
分片服务器(sharding):是一个独立mongod进程,保存数据信息。可以是一台服务器,如果想要高可用也可以配置成副本集。
2、实验环境
两台机器的IP:
172.16.101.54 sht-sgmhadoopcm-01
172.16.101.55 sht-sgmhadoopnn-01
config server:
172.16.101.55:27017
mongos:
172.16.101.55:27018
sharding:
172.16.101.54:27017
172.16.101.54:27018
172.16.101.54:27019
2、启动config服务
修改配置服务器的配置文件,主要是参数clusterRole指定角色为configsvr
[root@sht-sgmhadoopnn-01 mongodb]# cat /etc/mongod27017.conf systemLog: destination: file path: "/usr/local/mongodb/log/mongod27017.log" logAppend: true storage: dbPath: /usr/local/mongodb/data/db27017 journal: enabled: true processManagement: fork: true pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pid net: port: 27017 bindIp: 0.0.0.0 setParameter: enableLocalhostAuthBypass: false sharding: clusterRole: configsvr archiveMovedChunks: true
[root@sht-sgmhadoopnn-01 mongodb]# bin/mongod --config /etc/mongod27017.conf
warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default
about to fork child process, waiting until server is ready for connections.
forked process: 31033
child process started successfully, parent exiting
3、启动mongos服务
修改路由服务器的配置文件,主要是参数configDB指定config服务器的IP和port,不需要配置有关数据文件的信息,因为路由服务器不存储数据
[root@sht-sgmhadoopnn-01 mongodb]# cat /etc/mongod27018.conf systemLog: destination: file path: "/usr/local/mongodb/log/mongod27018.log" logAppend: true processManagement: fork: true pidFilePath: /usr/local/mongodb/data/db27018/mongod27018.pid net: port: 27018 bindIp: 0.0.0.0 setParameter: enableLocalhostAuthBypass: false sharding: autoSplit: true configDB: 172.16.101.55:27017 chunkSize: 64
[root@sht-sgmhadoopnn-01 mongodb]# bin/mongos --config /etc/mongod27018.conf
warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default
2018-11-10T18:57:13.705+0800 W SHARDING running with 1 config server should be done only for testing purposes and is not recommended for production
about to fork child process, waiting until server is ready for connections.
forked process: 31167
child process started successfully, parent exiting
4、启动sharding服务
就是一个普通的mongodb进程,普通的配置文件
[root@sht-sgmhadoopcm-01 mongodb]# cat /etc/mongod27017.conf systemLog: destination: file path: "/usr/local/mongodb/log/mongod27017.log" logAppend: true storage: dbPath: /usr/local/mongodb/data/db27017 journal: enabled: true processManagement: fork: true pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pid net: port: 27017 bindIp: 0.0.0.0 setParameter: enableLocalhostAuthBypass: false
[root@sht-sgmhadoopcm-01 mongodb]# bin/mongod --config /etc/mongod27017.conf
[root@sht-sgmhadoopcm-01 mongodb]# bin/mongod --config /etc/mongod27018.conf
[root@sht-sgmhadoopcm-01 mongodb]# bin/mongod --config /etc/mongod27019.conf
5、登陆mongos服务并添加sharding信息
[root@sht-sgmhadoopnn-01 mongodb]# bin/mongo --port=27018
mongos> sh.addShard("172.16.101.54:27017")
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard("172.16.101.54:27018")
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.addShard("172.16.101.54:27019")
{ "shardAdded" : "shard0002", "ok" : 1 }
查看集群分片信息
mongos> sh.status() --- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4") } shards: { "_id" : "shard0000", "host" : "172.16.101.54:27017" } { "_id" : "shard0001", "host" : "172.16.101.54:27018" } { "_id" : "shard0002", "host" : "172.16.101.54:27019" } balancer: Currently enabled: yes Currently running: no Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: No recent migrations databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" }
mongos> db.runCommand({listshards:1}) { "shards" : [ { "_id" : "shard0000", "host" : "172.16.101.54:27017" }, { "_id" : "shard0001", "host" : "172.16.101.54:27018" }, { "_id" : "shard0002", "host" : "172.16.101.54:27019" } ], "ok" : 1 }
6、开启分片
需要执行分片的库和集合,以及分片模式,分片模式分为两种hash和range
[root@sht-sgmhadoopnn-01 mongodb]# bin/mongo --port=27018
分片库是testdb
mongos> sh.enableSharding("testdb")
{ "ok" : 1 }
(1)hash分片模式测试
分片的集合是collection1,根据id进行hash分片
mongos> sh.shardCollection("testdb.collection1",{"_id":"hashed"})
{ "collectionsharded" : "testdb.collection1", "ok" : 1 }
共插入10个测试document
mongos> use testdb
switched to db testdb
mongos> for(var i=0;i<10;i++){db.collection1.insert({name:"jack"+i});}
WriteResult({ "nInserted" : 1 })
mongos> sh.status() --- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4") } shards: { "_id" : "shard0000", "host" : "172.16.101.54:27017" } { "_id" : "shard0001", "host" : "172.16.101.54:27018" } { "_id" : "shard0002", "host" : "172.16.101.54:27019" } balancer: Currently enabled: yes Currently running: no Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: 2 : Success databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" } { "_id" : "test", "partitioned" : false, "primary" : "shard0000" } { "_id" : "testdb", "partitioned" : true, "primary" : "shard0000" } testdb.collection1 shard key: { "_id" : "hashed" } chunks: shard0000 2 shard0001 2 shard0002 2 { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2) { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3) { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4) { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5) { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6) { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7)
查看每个sharding上的数据分布情况db.collection.stats()
mongos> db.collection1.stats() { "sharded" : true, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "ns" : "testdb.collection1", "count" : 10, #共十个document数据 "numExtents" : 3, "size" : 480, "storageSize" : 24576, "totalIndexSize" : 49056, "indexSizes" : { "_id_" : 24528, "_id_hashed" : 24528 }, "avgObjSize" : 48, "nindexes" : 2, "nchunks" : 6, "shards" : { "shard0000" : { "ns" : "testdb.collection1", "count" : 0, "size" : 0, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "_id_hashed" : 8176 }, "ok" : 1 }, "shard0001" : { "ns" : "testdb.collection1", "count" : 6, "size" : 288, "avgObjSize" : 48, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "_id_hashed" : 8176 }, "ok" : 1 }, "shard0002" : { "ns" : "testdb.collection1", "count" : 4, "size" : 192, "avgObjSize" : 48, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "_id_hashed" : 8176 }, "ok" : 1 } }, "ok" : 1 }
分别登陆sharding节点查看数据分布,和通过命令db.collection.stats()看到的结果一致
可以发现节点27017上没有数据,节点27018上有6个document,节点27019上有4个document,出现这种情况的原因可能是插入的数据量太小,没有分布均匀,数据量越大,分布越均匀。
[root@sht-sgmhadoopcm-01 mongodb]# bin/mongo 172.16.101.54:27017/testdb
> db.collection1.find()
[root@sht-sgmhadoopcm-01 mongodb]# bin/mongo 172.16.101.54:27018/testdb
> db.collection1.find()
{ "_id" : ObjectId("5be6c467e6467cc8077da816"), "name" : "jack1" }
{ "_id" : ObjectId("5be6c467e6467cc8077da817"), "name" : "jack2" }
{ "_id" : ObjectId("5be6c467e6467cc8077da818"), "name" : "jack3" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81a"), "name" : "jack5" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81c"), "name" : "jack7" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81e"), "name" : "jack9" }
[root@sht-sgmhadoopcm-01 mongodb]# bin/mongo 172.16.101.54:27019/testdb
> db.collection1.find()
{ "_id" : ObjectId("5be6c467e6467cc8077da815"), "name" : "jack0" }
{ "_id" : ObjectId("5be6c467e6467cc8077da819"), "name" : "jack4" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81b"), "name" : "jack6" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81d"), "name" : "jack8" }
(2)range分片模式测试
分片的集合是collection2,根据name进行range分片
mongos> sh.shardCollection("testdb.collection2",{"name":1})
{ "collectionsharded" : "testdb.collection2", "ok" : 1 }
mongos> for(var i=0;i<1000;i++){db.collection2.insert({name:"jack"+i});}
WriteResult({ "nInserted" : 1 })
mongos> sh.status() --- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4") } shards: { "_id" : "shard0000", "host" : "172.16.101.54:27017" } { "_id" : "shard0001", "host" : "172.16.101.54:27018" } { "_id" : "shard0002", "host" : "172.16.101.54:27019" } balancer: Currently enabled: yes Currently running: no Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: 4 : Success databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" } { "_id" : "test", "partitioned" : false, "primary" : "shard0000" } { "_id" : "testdb", "partitioned" : true, "primary" : "shard0000" } testdb.collection1 shard key: { "_id" : "hashed" } chunks: shard0000 2 shard0001 2 shard0002 2 { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2) { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3) { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4) { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5) { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6) { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7) testdb.collection2 shard key: { "name" : 1 } chunks: shard0000 1 shard0001 1 shard0002 1 { "name" : { "$minKey" : 1 } } -->> { "name" : "jack1" } on : shard0001 Timestamp(2, 0) { "name" : "jack1" } -->> { "name" : "jack5" } on : shard0002 Timestamp(3, 0) { "name" : "jack5" } -->> { "name" : { "$maxKey" : 1 } } on : shard0000 Timestamp(3, 1)
mongos> db.collection2.stats() { "sharded" : true, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "ns" : "testdb.collection2", "count" : 1000, "numExtents" : 6, "size" : 48032, "storageSize" : 221184, "totalIndexSize" : 130816, "indexSizes" : { "_id_" : 65408, "name_1" : 65408 }, "avgObjSize" : 48.032, "nindexes" : 2, "nchunks" : 3, "shards" : { "shard0000" : { "ns" : "testdb.collection2", "count" : 555, "size" : 26656, "avgObjSize" : 48, "numExtents" : 3, "storageSize" : 172032, "lastExtentSize" : 131072, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 65408, "indexSizes" : { "_id_" : 32704, "name_1" : 32704 }, "ok" : 1 }, "shard0001" : { "ns" : "testdb.collection2", "count" : 1, "size" : 48, "avgObjSize" : 48, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "name_1" : 8176 }, "ok" : 1 }, "shard0002" : { "ns" : "testdb.collection2", "count" : 444, "size" : 21328, "avgObjSize" : 48, "numExtents" : 2, "storageSize" : 40960, "lastExtentSize" : 32768, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 49056, "indexSizes" : { "_id_" : 24528, "name_1" : 24528 }, "ok" : 1 } }, "ok" : 1 }
参考链接
Sharding
另外有需要云服务器可以了解下创新互联scvps.cn,海内外云服务器15元起步,三天无理由+7*72小时售后在线,公司持有idc许可证,提供“云服务器、裸金属服务器、高防服务器、香港服务器、美国服务器、虚拟主机、免备案服务器”等云主机租用服务以及企业上云的综合解决方案,具有“安全稳定、简单易用、服务可用性高、性价比高”等特点与优势,专为企业上云打造定制,能够满足用户丰富、多元化的应用场景需求。
当前名称:MongoDBsharding分片-创新互联
网页网址:http://myzitong.com/article/cseich.html