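For MySQL HA, we need to deploy DRBD, corosync, and pacemaker. DRBD works like a software RAID 1 over the network, synchronizing a partition between two machines. corosync carries messages and heartbeats between the cluster nodes, and pacemaker manages application failover (for example, which node runs MySQL while the other stays on standby).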
DRBD Installation
Installing DRBD only requires the drbd8-utils package:
# apt-get install drbd8-utils
DRBD Configuration
To add resources for MySQL and RabbitMQ, create the files /etc/drbd.d/mysql.res and /etc/drbd.d/rabbitmq.res on master:
resource mysql {
  on master {
    device /dev/drbd0;
    disk /dev/mapper/master-mysql; # partition on master to synchronize
    address 10.109.36.58:7788; # master's IP
    meta-disk internal;
  }
  on slave {
    device /dev/drbd0;
    disk /dev/mapper/master-mysql; # partition on slave to synchronize
    address 10.109.36.59:7788; # slave's IP
    meta-disk internal;
  }
}
resource rabbitmq {
  on master {
    device /dev/drbd1;
    disk /dev/mapper/master-rabbitmq; # partition on master to synchronize
    address 10.109.36.58:7789; # master's IP
    meta-disk internal;
  }
  on slave {
    device /dev/drbd1;
    disk /dev/mapper/master-rabbitmq; # partition on slave to synchronize
    address 10.109.36.59:7789; # slave's IP
    meta-disk internal;
  }
}
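The two resource files differ only in the resource name, the device minor number, the backing disk, and the port. As a rough sketch, a small shell function could generate them from those four values. The name drbd_res is a hypothetical helper, not part of drbd8-utils, and the host names and IP addresses are simply the ones used in this guide.

```shell
#!/bin/sh
# drbd_res NAME MINOR PORT DISK
# Hypothetical helper: prints a two-node DRBD resource definition to stdout.
# Host names (master, slave) and IPs are the ones assumed in this guide.
drbd_res() {
    name=$1 minor=$2 port=$3 disk=$4
    cat <<EOF
resource $name {
  on master {
    device /dev/drbd$minor;
    disk $disk;
    address 10.109.36.58:$port;
    meta-disk internal;
  }
  on slave {
    device /dev/drbd$minor;
    disk $disk;
    address 10.109.36.59:$port;
    meta-disk internal;
  }
}
EOF
}
```

For example, drbd_res mysql 0 7788 /dev/mapper/master-mysql > /etc/drbd.d/mysql.res would write the first file.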
#Copy the configuration files from master to slave
# scp /etc/drbd.d/mysql.res slave:/etc/drbd.d/
# scp /etc/drbd.d/rabbitmq.res slave:/etc/drbd.d/
#Start DRBD on both master and slave
# /etc/init.d/drbd start
#Initialize the metadata storage
master and slave (DRBD needs metadata on both nodes):
# drbdadm create-md mysql
# drbdadm create-md rabbitmq
#Set master as primary
master:
# drbdadm -- --overwrite-data-of-peer primary mysql
# drbdadm -- --overwrite-data-of-peer primary rabbitmq
#Verify the installation
Running # service drbd status should show:
master:
drbd driver loaded OK; device status:
version: 8.3.13 (api:88/proto:86-96)
srcversion: 697DE8B1973B1D8914F04DB
m:res cs ro ds p mounted fstype
0:mysql Connected Primary/Secondary UpToDate/UpToDate C
1:rabbitmq Connected Primary/Secondary UpToDate/UpToDate C
slave:
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: 2931F0123213F7DB1364EA7
m:res cs ro ds p mounted fstype
0:mysql Connected Secondary/Primary UpToDate/UpToDate C
1:rabbitmq Connected Secondary/Primary UpToDate/UpToDate C
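To check this status from a script rather than by eye, one option is to parse the resource lines. The sketch below assumes the column layout shown above (connection state in the second column, disk states in the fourth); drbd_all_synced is a hypothetical function name.

```shell
#!/bin/sh
# drbd_all_synced: reads "service drbd status" output on stdin.
# Succeeds only if at least one resource line is present and every
# resource is Connected with UpToDate/UpToDate disks.
drbd_all_synced() {
    awk '
        $1 ~ /^[0-9]+:/ {    # resource lines look like "0:mysql ..."
            seen++
            if ($2 != "Connected" || $4 != "UpToDate/UpToDate") bad++
        }
        END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
    '
}
```

Usage: service drbd status | drbd_all_synced && echo "DRBD in sync"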
DRBD Troubleshooting
- Running service drbd status shows '0:mysql StandAlone Secondary/Unknown UpToDate/DUnknown r-----'
If dmesg also shows
kernel: block drbd0: Split-Brain detected, dropping connection!
the nodes have hit a split-brain. To resolve it:
- Disconnect the resource (on the secondary node)
# drbdadm disconnect mysql
- Demote the node to secondary (on the secondary node)
# drbdadm secondary mysql
- Reconnect and force-discard all local changes (on the secondary node)
# drbdadm -- --discard-my-data connect mysql
- Reconnect (on the primary node)
# drbdadm connect mysql
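Because the recovery steps differ per node and --discard-my-data throws away local changes, it can help to print the commands for review before running anything. The following is a dry-run sketch only: split_brain_plan and its role names (victim for the node whose changes are discarded, survivor for the other) are hypothetical, and the function prints the drbdadm commands from the steps above without executing them.

```shell
#!/bin/sh
# split_brain_plan ROLE RESOURCE
# Prints (does not run) the recovery commands for one DRBD resource.
# ROLE is "victim" (node whose changes are discarded) or "survivor".
split_brain_plan() {
    role=$1 res=$2
    case $role in
        victim)
            echo "drbdadm disconnect $res"
            echo "drbdadm secondary $res"
            echo "drbdadm -- --discard-my-data connect $res"
            ;;
        survivor)
            echo "drbdadm connect $res"
            ;;
        *)
            echo "usage: split_brain_plan victim|survivor RESOURCE" >&2
            return 1
            ;;
    esac
}
```

Usage: run split_brain_plan victim mysql on the secondary node and split_brain_plan survivor mysql on the primary, then execute the printed commands once they look right.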
Corosync Installation
Installing Corosync
Installing Corosync only requires the corosync package.
On both master and master-1:
# apt-get install corosync
Corosync Configuration
#Edit /etc/corosync/corosync.conf
On both master and master-1:
# Please read the openais.conf.5 manual page
totem {
version: 2
# How long before declaring a token lost (ms)
token: 3000
# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10
# How long to wait for join messages in the membership protocol (ms)
join: 60
# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600
# Turn off the virtual synchrony filter
vsftype: none
# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20
# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes
# Disable encryption
secauth: off
# How many threads to use for encryption/decryption
threads: 0
# Optionally assign a fixed node id (integer)
# nodeid: 1234
# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: none
interface {
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 10.109.36.0
mcastaddr: 226.94.1.1
mcastport: 5405
}
}
amf {
mode: disabled
}
service {
# Load the Pacemaker Cluster Resource Manager
ver: 0
name: pacemaker
}
aisexec {
user: root
group: root
}
logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
}
}
#Copy the configuration to the slave
# scp -r /etc/corosync master-1:/etc/
Enable start at boot
- Edit /etc/default/corosync and change 'no' to 'yes'
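The same change can be scripted so it is applied identically on both nodes. This sketch assumes the file contains a line of the form START=no, as the Debian/Ubuntu corosync package ships it; check the file first, since the variable name is an assumption here. enable_corosync_start is a hypothetical function name, and sed -i is the GNU sed in-place option.

```shell
#!/bin/sh
# enable_corosync_start [FILE]
# Changes START=no to START=yes in FILE (default: /etc/default/corosync).
# Assumes the variable is named START; other lines are left untouched.
enable_corosync_start() {
    file=${1:-/etc/default/corosync}
    sed -i 's/^START=no$/START=yes/' "$file"
}
```

Usage: run enable_corosync_start as root on both master and master-1.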
Verify the installation
On master, running # corosync-objctl runtime.totem.pg.mrp.srp.members should produce output like the following:
runtime.totem.pg.mrp.srp.1763994890.ip=r(0) ip(10.109.36.105)
runtime.totem.pg.mrp.srp.1763994890.join_count=1
runtime.totem.pg.mrp.srp.1763994890.status=joined
runtime.totem.pg.mrp.srp.1797549322.ip=r(0) ip(10.109.36.107)
runtime.totem.pg.mrp.srp.1797549322.join_count=1
runtime.totem.pg.mrp.srp.1797549322.status=joined
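For a two-node cluster, both members should report status=joined. A minimal sketch for counting them from a script, assuming the key=value output format shown above (joined_members is a hypothetical function name):

```shell
#!/bin/sh
# joined_members: reads corosync-objctl member output on stdin and
# prints how many nodes report status=joined.
joined_members() {
    grep -c '\.status=joined$'
}
```

Usage: corosync-objctl runtime.totem.pg.mrp.srp.members | joined_members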
Pacemaker Installation
Verify the system
After rebooting master and slave, running # crm_mon on both nodes should show:
Master
============
Last updated: Wed Jun 19 16:52:11 2013
Last change: Wed Jun 19 13:14:15 2013 via crmd on master-1
Stack: openais
Current DC: master - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ master master-1 ]
Slave
============
Last updated: Wed Jun 19 16:52:11 2013
Last change: Wed Jun 19 13:14:15 2013 via crmd on master-1
Stack: openais
Current DC: master - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ master master-1 ]
Basic configuration
#On master, run # crm configure and then enter:
crm(live)configure# property no-quorum-policy="ignore" \
pe-warn-series-max="1000" \
pe-input-series-max="1000" \
pe-error-series-max="1000" \
cluster-recheck-interval="5min"
crm(live)configure# commit
HA for MySQL
#Edit the pacemaker configuration
Run # crm configure
crm(live)configure# edit
Then enter:
primitive p_drbd_mysql ocf:linbit:drbd \
params drbd_resource="mysql" \
op start interval="0" timeout="90s" \
op stop interval="0" timeout="180s" \
op promote interval="0" timeout="180s" \
op demote interval="0" timeout="180s" \
op monitor interval="30s" role="Slave" \
op monitor interval="29s" role="Master"
primitive p_drbd_rabbitmq ocf:linbit:drbd \
params drbd_resource="rabbitmq" \
op start interval="0" timeout="90s" \
op stop interval="0" timeout="180s" \
op promote interval="0" timeout="180s" \
op demote interval="0" timeout="180s" \
op monitor interval="30s" role="Slave" \
op monitor interval="29s" role="Master"
primitive p_fs_mysql ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/mysql" directory="/var/lib/mysql" fstype="xfs" options="relatime" \
op start interval="0" timeout="60s" \
op stop interval="0" timeout="180s" \
op monitor interval="60s" timeout="60s"
primitive p_fs_rabbitmq ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/rabbitmq" \
directory="/var/lib/rabbitmq" fstype="xfs"
primitive p_ip_mysql ocf:heartbeat:IPaddr2 \
params ip="10.109.36.198" cidr_netmask="24" \
op monitor interval="30s"
primitive p_ip_rabbitmq ocf:heartbeat:IPaddr2 \
params ip="10.109.36.198" cidr_netmask="24" \
op monitor interval="10s"
primitive p_rabbitmq ocf:rabbitmq:rabbitmq-server \
params nodename="rabbit@localhost" mnesia_base="/var/lib/rabbitmq" \
op monitor interval="20s" timeout="10s"
primitive p_mysql ocf:heartbeat:mysql \
params additional_parameters="--bind-address=0.0.0.0" config="/etc/mysql/my.cnf" pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" log="/var/log/mysql/mysqld.log" \
op monitor interval="20s" timeout="10s" \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="120s"
group g_rabbitmq p_ip_rabbitmq p_fs_rabbitmq p_rabbitmq \
meta target-role="Started"
group g_mysql p_ip_mysql p_fs_mysql p_mysql
ms ms_drbd_mysql p_drbd_mysql \
meta notify="true" clone-max="2"
ms ms_drbd_rabbitmq p_drbd_rabbitmq \
meta notify="true" master-max="1" clone-max="2" target-role="Started"
colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
colocation c_rabbitmq_on_drbd inf: g_rabbitmq ms_drbd_rabbitmq:Master
order o_drbd_before_rabbitmq inf: ms_drbd_rabbitmq:promote g_rabbitmq:start
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
order order1 inf: g_rabbitmq:start g_mysql
property $id="cib-bootstrap-options" \
dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
cluster-infrastructure="openais" \
expected-quorum-votes="3" \
no-quorum-policy="ignore" \
pe-warn-series-max="1000" \
pe-input-series-max="1000" \
pe-error-series-max="1000" \
cluster-recheck-interval="5min" \
stonith-enabled="false"
crm(live)configure# commit
Running # crm_mon again should now show MySQL and RabbitMQ started.
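To check a single resource from a script, the one-shot output of crm_mon -1 can be filtered. This is only a sketch: it assumes resource lines begin with the resource name and contain the word Started, which may differ between Pacemaker versions, and resource_started is a hypothetical function name.

```shell
#!/bin/sh
# resource_started NAME
# Reads one-shot crm_mon output on stdin; succeeds if a line whose first
# field is NAME also contains "Started" (assumed output format).
resource_started() {
    awk -v name="$1" '
        $1 == name && /Started/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage: crm_mon -1 | resource_started p_mysql && echo "MySQL is running"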