High-Availability PXC
1. Setting up Percona XtraDB Cluster
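Installation environment:
Node 1 (A): 192.168.91.18
Node 2 (B): 192.168.91.20
Node 3 (C): 192.168.91.21
Replication is implemented at the InnoDB engine level.
server_id must be different on A, B, and C.
A, B, and C:
Download the software: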
wget http://www.percona.com/downloads/Percona-XtraDB-Cluster-56/Percona-XtraDB-Cluster-5.6.21-25.8/binary/tarball/Percona-XtraDB-Cluster-5.6.21-rel70.1-25.8.938.Linux.x86_64.tar.gz
Install the dependencies:
yum install -y socat
yum install -y perl-DBD-MySQL.x86_64 perl-IO-Socket-SSL.noarch socat.x86_64 nc
(nc is a general-purpose networking utility.)
yum install -y http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
# Install the xtrabackup backup tool:
yum list |grep percona-xtrabackup
yum install -y percona-xtrabackup.x86_64
#rpm -qa |grep percona
percona-release-0.1-3.noarch
percona-xtrabackup-2.3.7-2.el6.x86_64
A, B, and C:
Extract the PXC tarball:
tar xf Percona-XtraDB-Cluster-5.6.21-rel70.1-25.8.938.Linux.x86_64.tar.gz
Create a symlink:
ln -s /home/tools/Percona-XtraDB-Cluster-5.6.21-rel70.1-25.8.938.Linux.x86_64 /usr/local/mysql
Create the mysql user and group:
groupadd mysql
useradd -g mysql -s /sbin/nologin -d /usr/local/mysql mysql
Install the init script:
cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysqld
Create the base directories MySQL needs:
mkdir -p /data/mysql3306/{data,logs,tmp}
chown -R mysql:mysql /data/mysql3306
Configuration file on A:
vim /etc/my.cnf
#pxc
default_storage_engine=Innodb
#innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
wsrep_cluster_name=pxc_cluster # cluster name
wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21
wsrep_node_address=192.168.91.18
wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so
#wsrep_provider_options="gcache.size = 1G;debug = yes"
wsrep_provider_options="gcache.size = 1G;"
#wsrep_sst_method=rsync
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst:147258
Configuration file on B:
#pxc
default_storage_engine=Innodb
#innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
wsrep_cluster_name=pxc_cluster
wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21
wsrep_node_address=192.168.91.20
wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so
#wsrep_provider_options="gcache.size = 1G;debug = yes"
wsrep_provider_options="gcache.size = 1G;"
#wsrep_sst_method=rsync
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst:147258
Configuration file on C:
#pxc
default_storage_engine=Innodb
#innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
wsrep_cluster_name=pxc_cluster
wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21
wsrep_node_address=192.168.91.21
wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so
#wsrep_provider_options="gcache.size = 1G;debug = yes"
wsrep_provider_options="gcache.size = 1G;"
#wsrep_sst_method=rsync
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst:147258
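The three configuration blocks above differ only in wsrep_node_address, so they could be generated from one template. The following is a minimal sketch rather than part of the original setup: wsrep_block is a hypothetical helper name, the commented-out lines are left out, and the output is assumed to go under the [mysqld] section of /etc/my.cnf.

```shell
# wsrep_block NODE_ADDRESS
# Print the PXC settings for one node. Only wsrep_node_address varies;
# every other value is copied from the configuration shown above.
wsrep_block() {
    cat <<EOF
#pxc
default_storage_engine=Innodb
innodb_autoinc_lock_mode=2
wsrep_cluster_name=pxc_cluster
wsrep_cluster_address=gcomm://192.168.91.18,192.168.91.20,192.168.91.21
wsrep_node_address=$1
wsrep_provider=/usr/local/mysql/lib/libgalera_smm.so
wsrep_provider_options="gcache.size = 1G;"
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst:147258
EOF
}

# Example: print the block for node B.
wsrep_block 192.168.91.20
```

Generating the block this way keeps the cluster-wide values identical on all nodes, which is what the three hand-written copies are meant to guarantee.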
A, B, and C:
Initialize the data directory:
[root@Darren1 mysql]# ./scripts/mysql_install_db
A:
Start the first node:
/etc/init.d/mysqld bootstrap-pxc
Bootstrapping PXC (Percona XtraDB Cluster)Starting MySQL (Percona XtraDB Cluster)......... SUCCESS!
>mysql
delete from mysql.user where user!='root' or host!='localhost';
truncate mysql.db;
drop database test;
grant all on *.* to sst@localhost identified by '147258'; # create the sst user used by xtrabackup; the password must match wsrep_sst_auth in my.cnf
flush privileges;
B and C:
Start nodes 2 and 3:
/etc/init.d/iptables stop
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
[root@Darren2 data]# /etc/init.d/mysqld start
Starting MySQL (Percona XtraDB Cluster).........State transfer in progress, setting sleep higher
... SUCCESS!
[root@Darren3 data]# /etc/init.d/mysqld start
ERROR! MySQL (Percona XtraDB Cluster) is not running, but lock file (/var/lock/subsys/mysql) exists
Starting MySQL (Percona XtraDB Cluster)..................State transfer in progress, setting sleep higher
... SUCCESS!
Test:
A:
root@localhost [testdb]> create database testdb;
root@localhost [testdb]>create table t1(c1 int auto_increment not null,c2 timestamp,primary key(c1));
root@localhost [testdb]>insert into t1 select 1,now();
root@localhost [testdb]>select * from testdb.t1;
+----+---------------------+
| c1 | c2 |
+----+---------------------+
| 1 | 2017-03-06 12:29:56 |
+----+---------------------+
B:
root@localhost [testdb]>select * from testdb.t1;
+----+---------------------+
| c1 | c2 |
+----+---------------------+
| 1 | 2017-03-06 12:29:56 |
+----+---------------------+
C:
root@localhost [testdb]>select * from testdb.t1;
+----+---------------------+
| c1 | c2 |
+----+---------------------+
| 1 | 2017-03-06 12:29:56 |
+----+---------------------+
Shutting down:
Stop a node: /etc/init.d/mysqld stop
Restarting after all nodes have been shut down:
First node to start: /etc/init.d/mysqld bootstrap-pxc
Other nodes: /etc/init.d/mysqld start
SST and IST
State Snapshot Transfer (SST): full transfer
Occurs when a new node joins, or when a node in the cluster has been down (failed or shut down) for too long.
wsrep_sst_method = xtrabackup-v2
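This parameter has three commonly used values:
(1) xtrabackup-v2: transfers data with xtrabackup. A backup user must be created beforehand, and its user name and password set in wsrep_sst_auth=sst:147258
(2) rsync: the fastest method; wsrep_sst_auth is not needed. The donor is read-only while data is copied (FLUSH TABLES WITH READ LOCK).
(3) mysqldump: not recommended; it does not cope with large data sets. The donor is read-only while data is copied (FLUSH TABLES WITH READ LOCK).
Incremental State Transfer (IST): incremental transfer
Occurs when a node rejoins after missing some changes: only the missing increment is copied to it, served from a cache called the gcache. If the increment is larger than the gcache, a full transfer (SST) is used; an incremental transfer is chosen only when the increment is less than or equal to the gcache.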
wsrep_provider_options="gcache.size = 1G"
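The rule above can be sketched as a size comparison. This is a simplification under a stated assumption: the real decision depends on whether the write-sets the joiner needs are still present in the donor's gcache, not on sizes alone. transfer_type is a hypothetical helper name.

```shell
# transfer_type MISSING_MB GCACHE_MB
# Print IST when the missing changes fit in the gcache, SST otherwise.
transfer_type() {
    if [ "$1" -le "$2" ]; then
        echo IST
    else
        echo SST
    fi
}

transfer_type 300 1024     # node was away briefly
transfer_type 4096 1024    # node was away too long
```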
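How do you stop one node of a PXC cluster?
When wsrep_local_state_comment is Synced, the node's data is in sync with the other nodes. Only then should you stop the service on one node, restarting nodes one at a time (rolling restart);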
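A minimal sketch of that check, assuming the status is read from the mysql client on standard input. wsrep_state and safe_to_stop are hypothetical helper names; the parser accepts both tab-separated batch output and the "name | value" layout used elsewhere in this article.

```shell
# wsrep_state: read SHOW GLOBAL STATUS output on stdin and print the
# value of wsrep_local_state_comment.
wsrep_state() {
    tr -d '|' | awk '$1 == "wsrep_local_state_comment" { print $2 }'
}

# safe_to_stop: succeed only if the node reports Synced.
safe_to_stop() {
    [ "$(wsrep_state)" = "Synced" ]
}

# Example with a captured status line:
printf 'wsrep_local_state_comment | Synced\n' | wsrep_state
```

On a live node this could be combined with the stop command, for example: mysql -NBe "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'" | safe_to_stop && /etc/init.d/mysqld stop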
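How long can a node stay offline?
Suppose you want a node to be able to stay offline for 2 hours: work out how much binlog is generated in 2 hours and set gcache.size accordingly.
For example, a busy order system that produces 200 MB of binlog every 5 minutes generates 2.4 GB per hour and 4.8 GB in two hours, so set wsrep_provider_options="gcache.size = 6G" to leave some headroom. The gcache consumes real memory (it is a memory-mapped file), so it must not be set too large either, or the OOM killer may be triggered;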
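The arithmetic can be written down as a small helper. This is only a sketch: gcache_mb is a hypothetical name, and it uses binlog volume as a stand-in for write-set volume, as the text does.

```shell
# gcache_mb RATE_MB INTERVAL_MIN OFFLINE_HOURS
# Print the MB of binlog produced during OFFLINE_HOURS, given that
# RATE_MB of binlog is written every INTERVAL_MIN minutes.
gcache_mb() {
    echo $(( $1 * 60 * $3 / $2 ))
}

gcache_mb 200 5 2    # 200 MB every 5 minutes, 2 hours offline
```

For the order system above this prints 4800 (MB), which is then rounded up to 6G for headroom.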
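Rejoining the cluster after a failure:
(1) If the data set is not very large, reinitialize the node and let it do a full SST;
(2) If the data set is very large, transfer it with rsync;
PXC characteristics and caveats:
(1) PXC automatically configures the auto-increment offset and increment on every node, as in a dual-master setup, to prevent primary key conflicts;
node1:
auto_increment_offset=1
auto_increment_increment=3
node2:
auto_increment_offset=2
auto_increment_increment=3
node3:
auto_increment_offset=3
auto_increment_increment=3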
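To see why these settings avoid collisions, the values each node hands out can be enumerated. A minimal sketch, with node_ids as a hypothetical helper name:

```shell
# node_ids OFFSET INCREMENT COUNT
# Print the first COUNT auto-increment values generated with the given
# auto_increment_offset and auto_increment_increment.
node_ids() (
    id=$1
    n=0
    out=""
    while [ "$n" -lt "$3" ]; do
        out="$out${out:+ }$id"
        id=$((id + $2))
        n=$((n + 1))
    done
    echo "$out"
)

node_ids 1 3 4    # node1: 1 4 7 10
node_ids 2 3 4    # node2: 2 5 8 11
node_ids 3 3 4    # node3: 3 6 9 12
```

The three sequences never overlap, so concurrent inserts on different nodes cannot generate the same key.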
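(2) PXC uses optimistic concurrency control, so transaction conflicts may surface at commit time. When several nodes modify the same row, only one of them succeeds; the transaction on the losing node is aborted and returns a deadlock error code:
For example: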
A:
root@localhost [testdb]>begin;
root@localhost [testdb]>update t1 set c2=now() where c1=3;
B:
root@localhost [testdb]>begin;
root@localhost [testdb]>update t1 set c2=now() where c1=3;
root@localhost [testdb]>commit;
A:
The commit fails with a deadlock error:
root@localhost [testdb]>commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
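Because the error message itself suggests restarting the transaction, clients of a PXC cluster usually need a retry path. The following is a minimal sketch of one, not something from the original setup: retry_on_deadlock is a hypothetical helper that re-runs a command while its output contains "ERROR 1213".

```shell
# retry_on_deadlock MAX_TRIES COMMAND [ARGS...]
# Run COMMAND; if its output contains "ERROR 1213", run it again, up to
# MAX_TRIES attempts in total. Any other outcome is passed through.
retry_on_deadlock() {
    max=$1
    shift
    try=1
    while :; do
        out=$("$@" 2>&1) && status=0 || status=$?
        case $out in
            *"ERROR 1213"*)
                # Deadlock: give up once the attempt budget is spent.
                [ "$try" -ge "$max" ] && { printf '%s\n' "$out"; return 1; }
                try=$((try + 1))
                ;;
            *)
                [ -n "$out" ] && printf '%s\n' "$out"
                return "$status"
                ;;
        esac
    done
}
```

It could be used as: retry_on_deadlock 3 mysql testdb -e "update t1 set c2=now() where c1=3". Retrying is only safe when the whole transaction is re-executed, which is why the sketch re-runs the complete command.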
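(3) PXC only supports the InnoDB engine. So how do the tables in the mysql schema, most of which are MyISAM, get replicated? Although PXC does not support MyISAM tables, it does support DCL statements such as CREATE USER, DROP USER, GRANT, and REVOKE. MyISAM support can be turned on with the wsrep_replicate_myisam parameter. When data is inconsistent across PXC nodes, first check whether the table is MyISAM;
For example: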
node1:
root@localhost [testdb]>show create table t2\G
*************************** 1. row ***************************
Table: t2
Create Table: CREATE TABLE `t2` (
`c1` int(11) NOT NULL AUTO_INCREMENT,
`c2` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`c1`)
) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8
root@localhost [testdb]>select * from t2;
+----+---------------------+
| c1 | c2 |
+----+---------------------+
| 2 | 2017-03-08 11:41:31 |
+----+---------------------+
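The row does not appear on node2 or node3 because it was never replicated.
(4) Every table in PXC must have a primary key. Without one, the data pages on different nodes may hold the rows differently, so SELECT ... LIMIT may return different result sets on different nodes;
(5) Table-level locks (LOCK TABLE) are not supported. All DDL operations lock at the instance level, so use the pt-osc tool (pt-online-schema-change) for schema changes.
For example:
Example 1: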
node1:
root@localhost [testdb]>lock table t1 read;
root@localhost [testdb]>insert into t1 select 69,now();
ERROR 1099 (HY000): Table 't1' was locked with a READ lock and can't be updated
node2: node 2 can still insert, which shows the read lock has no effect there
root@localhost [testdb]>insert into t1 select 69,now();
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
Example 2:
node1:
root@localhost [testdb]>lock table t1 write;
root@localhost [testdb]>insert into t1 select 1,now();
Query OK, 1 row affected (0.03 sec)
Records: 1 Duplicates: 0 Warnings: 0
root@localhost [testdb]>select * from t1;
+----+---------------------+
| c1 | c2 |
+----+---------------------+
| 1 | 2017-03-08 14:59:46 |
+----+---------------------+
node2: node 2 is not affected by the write lock and can still read and write:
root@localhost [testdb]>insert into t1 select 2,now();
Query OK, 1 row affected (0.05 sec)
Records: 1 Duplicates: 0 Warnings: 0
root@localhost [testdb]>select * from t1;
+----+---------------------+
| c1 | c2 |
+----+---------------------+
| 1 | 2017-03-08 14:59:46 |
| 2 | 2017-03-08 14:59:57 |
+----+---------------------+
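(6) XA transactions are not supported;
(7) The query log must be written to a file, not to a table, i.e. set log_output=file;
(8) The performance/throughput of the whole cluster is determined by its slowest node (the weakest-link effect):
Master-slave replication, ignoring lag: 60,000 inserts per second
Master-slave replication, accounting for lag: 30,000 inserts per second
PXC: 10,000 inserts per second
(9) The number of nodes should satisfy 3 <= num <= 8;
(10) Split-brain: at least three nodes are needed, so that one node can act as the arbitrator and prevent split-brain;
Demonstrating split-brain:
Forcibly kill the mysql processes: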
node2:
[root@Darren1 mysql3306]# kill -9 10014
node3:
[root@Darren3 ~]# kill -9 10115
node1:
root@localhost [(none)]>use testdb;
ERROR 1047 (08S01): Unknown command
Values before the split-brain:
show global status like '%wsrep%';
wsrep_local_state_comment | Synced
wsrep_cluster_status | Primary
wsrep_ready | ON
Values after the split-brain:
wsrep_local_state_comment | Initialized
wsrep_cluster_status | non-Primary
wsrep_ready | OFF
Restarting node2 or node3 fails:
[root@Darren1 data]# /etc/init.d/mysqld start
ERROR! MySQL (Percona XtraDB Cluster) is not running, but PID file exists
Fix: restart node1 first, then restart node2 and node3.
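The status change in the demonstration follows from the majority rule: after two of three nodes are killed, the survivor can see only itself and drops to non-Primary. A minimal sketch of that rule, assuming every node carries equal weight (has_quorum is a hypothetical helper name, and real clusters also account for weights and graceful departures):

```shell
# has_quorum REACHABLE TOTAL
# Succeed if REACHABLE nodes form a strict majority of TOTAL nodes.
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

for reachable in 3 2 1; do
    if has_quorum "$reachable" 3; then
        echo "$reachable of 3 reachable: Primary"
    else
        echo "$reachable of 3 reachable: non-Primary"
    fi
done
```

With two nodes, losing either one leaves 1 of 2, which is not a majority; that is why at least three nodes are recommended.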