Install Docker
# echo deb https://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list
# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
# apt-get update
# apt-get install -y lxc-docker
Pull Ubuntu image
[with proxy]
# service docker stop
# HTTP_PROXY=[proxy] docker -d&
# docker pull ubuntu
[without proxy]
# docker pull ubuntu
Run a Docker Container
# docker run -i -t ubuntu /bin/bash
Frequently used commands
1. List all images
# docker images
2. List all running containers
# docker ps
3. List all containers, including paused and exited ones
# docker ps -a
4. Remove a container
# docker rm [container id]
5. Remove an image
# docker rmi [image id]
note: If a container is still using the image, remove that container first; otherwise the image cannot be removed.
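6. Build an image
# docker build .
note: Run this command in the directory that contains the Dockerfile.
6-1. Build an image and tag it
# docker build -t [tag] .
6-2. Build an image without using the cache
# docker build --no-cache .
note: If the previous build failed (a mistake in the Dockerfile, a problem in the environment, and so on), a build that reuses the cache can stay stuck in the failed state. Another way to avoid this is 6-3: do not keep the containers created during the build.
6-3. Build an image without keeping the intermediate containers
# docker build --rm .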
7. Start a container
# docker run [image id or tag]
7-1. Start a container with (I)nteractive mode and a (T)ty
# docker run -t -i [image id or tag] /bin/bash
note: This starts the container and drops you into a bash shell.
7-2. Start a container in the background
# docker run -i -t -d [image id or tag]
note: To get back into the container, run docker attach [container id].
7-3. Start a container and mount a local directory
# docker run -v /local/dir:/docker/dir [image id or tag]
7-4. Start a container and forward ports
# docker run -p host_port:container_port -p host_port:container_port/udp [image id or tag]
note: Each -p takes one host_port:container_port pair; append /udp for a UDP port.
7-5. Start a container that keeps its local time in sync with the host (the host's /etc/localtime is mounted read-only)
# docker run -v /etc/localtime:/etc/localtime:ro [image id or tag]
7-6. Start a container and pass in an environment variable
# docker run --env var_name=var_value [image id or tag]
7-7. Start a container and give it a name
# docker run --name [name] [image id or tag]
8. Detach from a container
To return to the host shell from inside a container without stopping it, press
Ctrl+P, then Ctrl+Q
9. Reattach to a container
# docker attach [container id]
10. Start a stopped container
# docker start [container id]
note: If the container has exited, start it before attaching.
11. Commit changes
# docker commit -m "commit msg" -a "author" [container id] [repository]
note: -a and the repository are optional.
12. Show image/container details
# docker inspect [image or container id]
note: Checking $? after docker inspect tells you whether the image or container already exists.
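The $? check from item 12 can be wrapped in a small helper. This is only a sketch: the function name docker_object_exists and the example name "web" are made up for illustration.

```shell
# Return success if `docker inspect` can find the given image or container.
# The inspect output is discarded; only the exit status ($?) matters.
docker_object_exists() {
    docker inspect "$1" >/dev/null 2>&1
}

# Usage (hypothetical container name "web"):
#   if docker_object_exists web; then
#       echo "web exists"
#   else
#       echo "web not found"
#   fi
```

This is handy in scripts that should create a container only when it is not there yet.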
13. Export a container
# docker export [container id] > image_name
note: To compress at the same time: # docker export [container id] | gzip -c > image_name
14. Import a container
# cat [image path] | docker import - [tag]
note: To import a compressed image: # gzip -dc [image path] | docker import - [tag]
15. Limit a container's memory usage
# docker run -m [memory size, e.g. 2g] [image id or tag]
note: If you see "WARNING: Your kernel does not support swap limit capabilities. Limitation discarded.", edit /etc/default/grub and change
GRUB_CMDLINE_LINUX=""
to
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
then run sudo update-grub and reboot.
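After the reboot, one way to confirm that the two kernel options took effect is to look at the running kernel's command line. A minimal sketch, assuming the options show up in /proc/cmdline; the function name has_swap_accounting is hypothetical.

```shell
# Succeed only if both options from the grub note are on the kernel command line.
# The file argument defaults to /proc/cmdline; another path can be passed for testing.
has_swap_accounting() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -qw 'cgroup_enable=memory' "$cmdline_file" &&
        grep -qw 'swapaccount=1' "$cmdline_file"
}

# Usage:
#   has_swap_accounting && echo "swap limit support is enabled"
```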
Tuesday, November 4, 2014
Thursday, October 30, 2014
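Tracing an "Illegal instruction" error
One day a program I ran died with "Illegal instruction". If you run
# dmesg
at that point, you should see a message like this:
XXXX[4444] trap invalid opcode rip:42a016 rsp:415104b8 error:0
This usually means the program used an instruction set that the CPU does not support (there are exceptions). So how do we find out which instruction (opcode) it was?
1. Enable core dumps
Ubuntu does not produce core dump files by default. To enable them for the current shell, run:
# ulimit -c unlimited
After that, when the Illegal instruction error occurs, a core.XXX file should appear in the directory the program was run from. That file is the core dump.
2. Trace with gdb
gdb can tell us where the error happened:
# gdb [program] [core dump file]
In my case the output was:
Program terminated with signal 4, Illegal instruction.
#0 0x000000000042a016 in FUNCTION_NAME ()
This shows that the fault is at 0x000000000042a016, the same address as the rip value in the dmesg line.
3. Inspect the assembly code
Still inside gdb, run:
(gdb) disassemble FUNCTION_NAME
0x000000000042a00d <blake2b_init_param_avx+45>: cmp %rax,%rdx
0x000000000042a010 <blake2b_init_param_avx+48>: jbe 0x42a0c0 <blake2b_init_param_avx+224>
0x000000000042a016 <blake2b_init_param_avx+54>: vmovdqu (%rsi),%xmm1
This shows which instruction caused the error. Here it is vmovdqu at 0x42a016, an AVX instruction, which suggests the CPU does not support AVX.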
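The faulting address can also be pulled out of the dmesg line with a script. This is a rough sketch that assumes the message looks like the one shown above (with rip:, or ip: as some kernels print it); the function name fault_address is made up.

```shell
# Read kernel log lines on stdin and print the instruction pointer of every
# "trap invalid opcode" message, without the leading "rip:" / "ip:".
fault_address() {
    sed -n 's/.*trap invalid opcode r\{0,1\}ip:\([0-9a-fA-F]*\).*/\1/p'
}

# Usage: show the instruction at the most recent faulting address.
#   addr=$(dmesg | fault_address | tail -n 1)
#   gdb -batch -ex "x/i 0x$addr" [program] [core dump file]
```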
Tuesday, May 27, 2014
OpenStack High Availability I-MySQL, RabbitMQ HA
There are two main approaches to OpenStack HA: Master-Master and Master-Slave. As the name suggests, Master-Master means two sets of Control/Network nodes serve requests at the same time. This post covers the Master-Slave deployment.
For MySQL HA we need to deploy DRBD, corosync and pacemaker. DRBD works like a software RAID 1 that keeps a partition synchronized between two machines. corosync carries messages and heartbeats between the cluster nodes, and pacemaker manages application failover (for example, which node runs MySQL while the other stands by).
DRBD installation
Installing DRBD only requires the drbd8-utils package.
# apt-get install drbd8-utils
DRBD configuration
Suppose we want to add one resource each for mysql and rabbitmq. On master, create the files /etc/drbd.d/mysql.res and /etc/drbd.d/rabbitmq.res:
resource mysql {
  on master {
    device /dev/drbd0;
    disk /dev/mapper/master-mysql; # partition to synchronize on master
    address 10.109.36.58:7788; # master's IP
    meta-disk internal;
  }
  on slave {
    device /dev/drbd0;
    disk /dev/mapper/master-mysql; # partition to synchronize on slave
    address 10.109.36.59:7788; # slave's IP
    meta-disk internal;
  }
}
resource rabbitmq {
  on master {
    device /dev/drbd1;
    disk /dev/mapper/master-rabbitmq; # partition to synchronize on master
    address 10.109.36.58:7789; # master's IP
    meta-disk internal;
  }
  on slave {
    device /dev/drbd1;
    disk /dev/mapper/master-rabbitmq; # partition to synchronize on slave
    address 10.109.36.59:7789; # slave's IP
    meta-disk internal;
  }
}
Copy the configuration files from master to slave:
# scp /etc/drbd.d/mysql.res slave:/etc/drbd.d/
# scp /etc/drbd.d/rabbitmq.res slave:/etc/drbd.d/
Start DRBD on both master and slave:
# /etc/init.d/drbd start
Initialize the metadata storage:
master:
# drbdadm create-md mysql
# drbdadm create-md rabbitmq
Make master the primary:
master:
# drbdadm -- --overwrite-data-of-peer primary mysql
# drbdadm -- --overwrite-data-of-peer primary rabbitmq
Check the installation status
Running # service drbd status should show:
master:
drbd driver loaded OK; device status:
version: 8.3.13 (api:88/proto:86-96)
srcversion: 697DE8B1973B1D8914F04DB
m:res cs ro ds p mounted fstype
0:mysql Connected Primary/Secondary UpToDate/UpToDate C
1:rabbitmq Connected Primary/Secondary UpToDate/UpToDate C
slave:
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: 2931F0123213F7DB1364EA7
m:res cs ro ds p mounted fstype
0:mysql Connected Secondary/Primary UpToDate/UpToDate C
1:rabbitmq Connected Secondary/Primary UpToDate/UpToDate C
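The status table can be checked from a script instead of by eye. A minimal sketch, assuming the column layout shown in the output above (resource, connection state, roles, disk states); the function name drbd_all_uptodate is hypothetical.

```shell
# Read `service drbd status` output on stdin. Succeed only if at least one
# resource row is present and every row is Connected and UpToDate/UpToDate.
drbd_all_uptodate() {
    awk '
        /^ *[0-9]+:/ {    # resource rows look like "0:mysql Connected ..."
            seen = 1
            if ($2 != "Connected" || $4 != "UpToDate/UpToDate") bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}

# Usage:
#   service drbd status | drbd_all_uptodate && echo "DRBD is in sync"
```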
DRBD troubleshooting
- service drbd status shows '0:mysql StandAlone Secondary/Unknown UpToDate/DUnknown r-----'
If dmesg also shows
kernel: block drbd0: Split-Brain detected, dropping connection!
then the resource has hit a split-brain. To recover:
- Disconnect the resource (on the secondary node)
# drbdadm disconnect mysql
- Demote the node to secondary (on the secondary node)
# drbdadm secondary mysql
- Force-discard all local changes and reconnect (on the secondary node)
# drbdadm -- --discard-my-data connect mysql
- Reconnect (on the primary node)
# drbdadm connect mysql
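The secondary-side steps above can be collected into one helper so they always run in the same order. This is only a sketch: the function names are made up, and it throws away the local changes on the node where it runs, so run it on the secondary only.

```shell
# Succeed if the kernel log text on stdin contains the split-brain message.
split_brain_logged() {
    grep -q 'Split-Brain detected'
}

# Run on the SECONDARY node: its changes to the resource are discarded.
# Stops at the first drbdadm command that fails.
discard_local_changes() {
    res="$1"
    drbdadm disconnect "$res" &&
        drbdadm secondary "$res" &&
        drbdadm -- --discard-my-data connect "$res"
}

# Usage on the secondary node:
#   dmesg | split_brain_logged && discard_local_changes mysql
# Then, on the primary node:
#   drbdadm connect mysql
```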
Corosync Installation
Installing Corosync only requires the corosync package.
On both master and master-1:
# apt-get install corosync
Corosync configuration
Edit /etc/corosync/corosync.conf on both master and master-1:
# Please read the openais.conf.5 manual page
totem {
version: 2
# How long before declaring a token lost (ms)
token: 3000
# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10
# How long to wait for join messages in the membership protocol (ms)
join: 60
# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600
# Turn off the virtual synchrony filter
vsftype: none
# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20
# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes
# Disable encryption
secauth: off
# How many threads to use for encryption/decryption
threads: 0
# Optionally assign a fixed node id (integer)
# nodeid: 1234
# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: none
interface {
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 10.109.36.0
mcastaddr: 226.94.1.1
mcastport: 5405
}
}
amf {
mode: disabled
}
service {
# Load the Pacemaker Cluster Resource Manager
ver: 0
name: pacemaker
}
aisexec {
user: root
group: root
}
logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
}
}
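Copy the configuration to the slave:
# scp -r /etc/corosync master-1:/etc/
Start automatically at boot
- Edit /etc/default/corosync and change 'no' to 'yes'.
Check the installation status
On master, running # corosync-objctl runtime.totem.pg.mrp.srp.members should show output like this: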
runtime.totem.pg.mrp.srp.1763994890.ip=r(0) ip(10.109.36.105)
runtime.totem.pg.mrp.srp.1763994890.join_count=1
runtime.totem.pg.mrp.srp.1763994890.status=joined
runtime.totem.pg.mrp.srp.1797549322.ip=r(0) ip(10.109.36.107)
runtime.totem.pg.mrp.srp.1797549322.join_count=1
runtime.totem.pg.mrp.srp.1797549322.status=joined
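To check membership from a script, count the status=joined lines. A small sketch based only on the output format shown above; the function name joined_members is hypothetical.

```shell
# Read corosync-objctl member output on stdin and print how many
# members report status=joined.
joined_members() {
    grep -c '\.status=joined$'
}

# Usage: expect 2 for this two-node cluster.
#   corosync-objctl runtime.totem.pg.mrp.srp.members | joined_members
```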
Pacemaker Installation
Verify the system
After restarting both master and slave, running # crm_mon on each should show:
Master
============
Last updated: Wed Jun 19 16:52:11 2013
Last change: Wed Jun 19 13:14:15 2013 via crmd on master-1
Stack: openais
Current DC: master - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ master master-1 ]
Slave
============
Last updated: Wed Jun 19 16:52:11 2013
Last change: Wed Jun 19 13:14:15 2013 via crmd on master-1
Stack: openais
Current DC: master - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ master master-1 ]
Basic configuration
On master, run # crm configure and enter:
crm(live)configure# property no-quorum-policy="ignore" \
pe-warn-series-max="1000" \
pe-input-series-max="1000" \
pe-error-series-max="1000" \
cluster-recheck-interval="5min"
crm(live)configure# commit
HA for MySQL
Edit the pacemaker configuration. Run # crm configure, then
crm(live)configure# edit
and enter:
primitive p_drbd_mysql ocf:linbit:drbd \
params drbd_resource="mysql" \
op start interval="0" timeout="90s" \
op stop interval="0" timeout="180s" \
op promote interval="0" timeout="180s" \
op demote interval="0" timeout="180s" \
op monitor interval="30s" role="Slave" \
op monitor interval="29s" role="Master"
primitive p_drbd_rabbitmq ocf:linbit:drbd \
params drbd_resource="rabbitmq" \
op start interval="0" timeout="90s" \
op stop interval="0" timeout="180s" \
op promote interval="0" timeout="180s" \
op demote interval="0" timeout="180s" \
op monitor interval="30s" role="Slave" \
op monitor interval="29s" role="Master"
primitive p_fs_mysql ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/mysql" directory="/var/lib/mysql" fstype="xfs" options="relatime" \
op start interval="0" timeout="60s" \
op stop interval="0" timeout="180s" \
op monitor interval="60s" timeout="60s"
primitive p_fs_rabbitmq ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/rabbitmq" \
directory="/var/lib/rabbitmq" fstype="xfs"
primitive p_ip_mysql ocf:heartbeat:IPaddr2 \
params ip="10.109.36.198" cidr_netmask="24" \
op monitor interval="30s"
primitive p_ip_rabbitmq ocf:heartbeat:IPaddr2 \
params ip="10.109.36.198" cidr_netmask="24" \
op monitor interval="10s"
primitive p_rabbitmq ocf:rabbitmq:rabbitmq-server \
params nodename="rabbit@localhost" mnesia_base="/var/lib/rabbitmq" \
op monitor interval="20s" timeout="10s"
primitive p_mysql ocf:heartbeat:mysql \
params additional_parameters="--bind-address=0.0.0.0" config="/etc/mysql/my.cnf" pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" log="/var/log/mysql/mysqld.log" \
op monitor interval="20s" timeout="10s" \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="120s"
group g_rabbitmq p_ip_rabbitmq p_fs_rabbitmq p_rabbitmq \
meta target-role="Started"
group g_mysql p_ip_mysql p_fs_mysql p_mysql
ms ms_drbd_mysql p_drbd_mysql \
meta notify="true" clone-max="2"
ms ms_drbd_rabbitmq p_drbd_rabbitmq \
meta notify="true" master-max="1" clone-max="2" target-role="Started"
colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
colocation c_rabbitmq_on_drbd inf: g_rabbitmq ms_drbd_rabbitmq:Master
order o_drbd_before_rabbitmq inf: ms_drbd_rabbitmq:promote g_rabbitmq:start
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
order order1 inf: g_rabbitmq:start g_mysql
property $id="cib-bootstrap-options" \
dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
cluster-infrastructure="openais" \
expected-quorum-votes="3" \
no-quorum-policy="ignore" \
pe-warn-series-max="1000" \
pe-input-series-max="1000" \
pe-error-series-max="1000" \
cluster-recheck-interval="5min" \
stonith-enabled="false"
crm(live)configure# commit
Then run # crm_mon again; you should see MySQL and RabbitMQ started.