原文地址:http://www.opvps.com/?p=331
一.目的:二台NFS服务器
互为冗余(系统
切换时间约为2x ms左右),保证NFS文件
共享服务的可用
二.系统为CentOS 5.3
二个节点 主节点node1(192.168.10.111) 备用节点node2(192.168.10.112) 虚拟IP:192.168.10.113对外提供服务
node1 /etc/hosts如下
[root@node1 ha.d]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
192.168.10.111 node1
192.168.10.112 node2
127.0.0.1 node1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
node2 /etc/hosts如下
[root@node2 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
192.168.10.111 node1
192.168.10.112 node2
127.0.0.1 node2 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
二台机将/dev/hda5互为镜相
二台机/etc/export相同
cat /etc/exports
/u1 192.168.10.0/255.255.255.0(rw,no_root_squash,no_all_squash,sync)
确定二台机的portmap服务为启动状态
三.所需软件
:使用DRBD作为网络
磁盘镜相,Heartbeart为管理
由提供NFS服务及服务失效节点转移
安装所需的软件DRBD及Heartbeat均从http://mirror.centos
.org/centos/5.3/extras/i386/RPMS
此目录
下载
,省去安装过程
drbd需要手动加载内核
模块
,安装好rpm执行如下指令
insmod /lib/modules/2.6.9-78.ELsmp/extra/drbd.ko
modprobe drbd
四.配置过程
1.DRBD配置 配置文件只有一个/etc/drbd.conf 二个内容都是一样
global { usage-count yes; }
common { syncer { rate 10M; } }
resource r0 {
protocol C;
startup {
}
disk {
on-io-error detach;
}
net {
}
on node1 {
device /dev/drbd0;
disk /dev/hda5;
address 192.168.10.111:7789;
meta-disk internal;
}
on node2 {
device /dev/drbd0;
disk /dev/hda5;
address 192.168.10.112:7789;
meta-disk internal;
}
}
在二台机执行
drbdadm create-md r0 #创建ro的资源
/etc/init.d/drbd start #启动drbd
cat /proc/drbd #查看状态正常显示为
version: 8.0.13 (api:86/proto:86)
GIT-hash: ee3ad77563d2e87171a3da17cc002ddfd1677dbe build by buildsvn@c5-i386-build, 2008-10-02 13:31:44
0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
ns:1848 nr:14357752 dw:14359600 dr:653 al:6 bm:901 lo:0 pe:0 ua:0 ap:0
resync: used:0/61 hits:928654 misses:1079 starving:0 dirty:0 changed:1079
act_log: used:0/127 hits:456 misses:6 starving:0 dirty:0 changed:6
在主服务器上执行
drbdsetup /dev/drbd0 primary -o #定义为主节点
mkfs.ext3 /dev/drbd0 #格式化
mount /dev/drbd0 /u1 #挂载
cat /proc/drbd #此时查看状态正常显示为
[root@node1 ha.d]# cat /proc/drbd
version: 8.0.13 (api:86/proto:86)
GIT-hash: ee3ad77563d2e87171a3da17cc002ddfd1677dbe build by buildsvn@c5-i386-build, 2008-10-02 13:31:44
0: cs:Connected st
rimary/Secondary ds:UpToDate/UpToDate C r---
ns:14357752 nr:1848 dw:361872 dr:13998555 al:264 bm:3581 lo:0 pe:0 ua:0 ap:0
resync: used:0/61 hits:928654 misses:1079 starving:0 dirty:0 changed:1079
act_log: used:0/127 hits:89742 misses:268 starving:0 dirty:4 changed:264
2.Heartbeat配置共涉及4个文件
/etc/ha.d/ha.cf
/etc/ha.d/haresources
/etc/ha.d/authkeys
/etc/ha.d/resource.d/killnfsd
二个节的配置的配置文件都是一样,文件内容如下
[root@node1 ha.d]# cat /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 20
bcast eth0
auto_failback off
node node1 node2
[root@node1 ha.d]# cat /etc/ha.d/haresources
node1 IPaddr::192.168.10.113/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/u1::ext3 killnfsd
[root@node1 ha.d]# cat /etc/ha.d/authkeys
auth 1
1 crc
#2 sha1 HI!
#3 md5 Hello!
[root@node1 ha.d]# cat /etc/ha.d/resource.d/killnfsd
killall -9 nfsd ; /etc/init.d/nfs restart ; exit 0
需要将 /etc/ha.d/authkeys设为600的权限 将cat /etc/ha.d/resource.d/killnfsd设为755的权限
chmod 600 /etc/ha.d/authkeys
chmod 755 /etc/ha.d/resource.d/killnfsd
为什么要使用这个killnfsd的原因,使用/etc/inin.d/nfs stop 不能停掉nfsd,所有我使用了killall -9 nfsd再加了一个/etc/inin.d/nfs restart确保万一
在二个节点启动Heartbeat即可,先在主节点启动
/etc/init.d/heartbeat
start
五.测试
将192.168.10.113:/u1挂到本地/mnt
mount 192.168.10.113:/u1 /mnt
创建测试shell
,二秒一个
cat /mnt/test.sh
while true
do
echo ---\> trying touch x : `date`
touch x
echo \<----- done touch x : `date`
echo
sleep 2
done
将主节点的heartbeat服务停止,则备节点node2接管服务
/etc/init.d/heartbeat stop
测试脚本
终端显示如下
---> trying touch x : ?t 6?? 30 15:17:16 CST 2009
<----- done touch x : ?t 6?? 30 15:17:16 CST 2009
---> trying touch x : ?t 6?? 30 15:17:19 CST 2009
<----- done touch x : ?t 6?? 30 15:17:19 CST 2009
---> trying touch x : ?t 6?? 30 15:17:21 CST 2009
<----- done touch x : ?t 6?? 30 15:17:21 CST 2009
---> trying touch x : ?t 6?? 30 15:17:23 CST 2009
<----- done touch x : ?t 6?? 30 15:17:23 CST 2009
---> trying touch x : ?t 6?? 30 15:17:25 CST 2009
<----- done touch x : ?t 6?? 30 15:17:25 CST 2009
---> trying touch x : ?t 6?? 30 15:17:27 CST 2009
touch: cannot touch ??x?ˉ: Stale NFS file handle
<----- done touch x : ?t 6?? 30 15:17:42 CST 2009
---> trying touch x : ?t 6?? 30 15:17:44 CST 2009
touch: cannot touch ??x?ˉ: Stale NFS file handle
<----- done touch x : ?t 6?? 30 15:17:44 CST 2009
---> trying touch x : ?t 6?? 30 15:17:46 CST 2009
touch: cannot touch ??x?ˉ: Stale NFS file handle
<----- done touch x : ?t 6?? 30 15:17:46 CST 2009
---> trying touch x : ?t 6?? 30 15:18:03 CST 2009
<----- done touch x : ?t 6?? 30 15:18:03 CST 2009
---> trying touch x : ?t 6?? 30 15:18:05 CST 2009
<----- done touch x : ?t 6?? 30 15:18:05 CST 2009
至此,测试已实现所需的功能