Linux ·

Oracle 12C RAC启动实例时报ORA-00206: error in writing (block 1, # blocks 1) of control file错误处理

Oracle 12C RAC启动实例时报ORA-00206: error in writing (block 1, # blocks 1) of control file错误处理

一.问题描述
sqlplus启动一节点实例,报如下错:
Total System Global Area 6.4425E+10 bytes
 Fixed Size 7651632 bytes
 Variable Size 9932119760 bytes
 Database Buffers 5.4358E+10 bytes
 Redo Buffers 126558208 bytes
 ORA-00221: error on write to control file
 ORA-00206: error in writing (block 1, # blocks 1) of control file
 ORA-00202: control file: '+MDCCSSDG1/CCSSVER/CONTROLFILE/control01.ctl'
 ORA-15081: failed to submit an I/O operation to a disk
 ORA-15081: failed to submit an I/O operation to a disk

Oracle版本:ORACLE 12.1.0.2 + psu 20160719

二.问题分析
1.检查alert.log
 cd /mdccssver/mdccssver/orabase/diag/rdbms/ccssver/ccssver2/trace
 rw-r----- 1 ccssver oinstall 2043 Sep 14 13:56 ccssver2_ora_2538.trc
 -rw-r----- 1 ccssver oinstall 1550 Sep 14 13:56 ccssver2_m000_2592.trc
 -rw-r----- 1 ccssver oinstall 233 Sep 14 13:56 ccssver2_m000_2592.trm
 -rw-r----- 1 ccssver asmadmin 40400051 Sep 14 13:57 alert_ccssver2.log
 -rw-r----- 1 ccssver oinstall 1661 Sep 14 13:57 ccssver2_lmon_2376.trm
 -rw-r----- 1 ccssver oinstall 10621 Sep 14 13:57 ccssver2_lmon_2376.trc
 -rw-r----- 1 ccssver oinstall 114 Sep 14 13:58 ccssver2_lmhb_2414.trm
 -rw-r----- 1 ccssver oinstall 1273 Sep 14 13:58 ccssver2_lmhb_2414.trc

tail -100 alert_ccssver2.log|more
 ARNING: Write Failed. group:1 disk:1 AU:221 offset:16384 size:16384
path:/dev/asmdisk/datadisk2
incarnation:0x12 asynchronous result:'I/O error'
 subsys:System krq:0xffffffff7cd27dd0 bufp:0xffffffff7c6e7e00 osderr1:0x69a1 osderr2:0x0
 IO elapsed time: 0 usec Time waited on I/O: 0 usec
 Wed Sep 14 13:56:24 2016
 Errors in file /mdccssver/mdccssver/orabase/diag/rdbms/ccssver/ccssver2/trace/ccssver2_ora_2538.trc:
 ORA-15080: synchronous I/O operation failed to write block 1 of disk 1 in disk group MDCCSSDG1
 ORA-27041: unable to open file
 SVR4 Error: 13: Permission denied
 Additional information: 3
 WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 264 in group 1 on disk 1 allocation unit 221
 WARNING: group 1 file 264 vxn 0 block 1 write I/O failed
 Wed Sep 14 13:56:24 2016
 Errors in file /mdccssver/mdccssver/orabase/diag/rdbms/ccssver/ccssver2/trace/ccssver2_ora_2538.trc:
 ORA-00206: error in writing (block 1, # blocks 1) of control file
 ORA-00202: control file: '+MDCCSSDG1/CCSSVER/CONTROLFILE/control01.ctl'
 ORA-15081: failed to submit an I/O operation to a disk
 ORA-15081: failed to submit an I/O operation to a disk
 ORA-221 signalled during: ALTER DATABASE MOUNT...
上面报错说明读写MDCCSSDG1磁盘组有异常。

2.查看ASM磁盘和磁盘组状态:

SQL> select name,MOUNT_STATUS,HEADER_STATUS,MODE_STATUS,STATE,PATH from v$asm_disk;

NAME MOUNT_S HEADER_STATU MODE_ST STATE PATH
 ------------------------------ ------- ------------ ------- -------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 OCRVOTDG_0000 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/crsdisk1
 OCRVOTDG_0001 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/crsdisk2
 OCRVOTDG_0002 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/crsdisk3
 OCRVOTDG_0003 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/crsdisk4
 OCRVOTDG_0004 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/crsdisk5
 MDCCSSDG1_0000 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/datadisk1
 MDCCSSDG1_0001 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/datadisk2
 MDCCSSDG2_0000 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/datadisk3
 MDCCSSDG2_0001 CACHED MEMBER ONLINE NORMAL /dev/asmdisk/datadisk4

9 rows selected.

SQL> select NAME,STATE,TYPE,FREE_MB,OFFLINE_DISKS from gv$asm_diskgroup;

NAME STATE TYPE FREE_MB OFFLINE_DISKS
 ------------------------------ ----------- ------ ---------- -------------
 OCRVOTDG MOUNTED HIGH 15418 0
 MDCCSSDG2 MOUNTED EXTERN 79428 0
 MDCCSSDG1 MOUNTED EXTERN 197212 0

上面都显示正常。

3.检查OS的日志dmesg也没有发现报错或警告信息。
ep 14 12:25:26 msuu435 oracleoks: [ID 679573 kern.notice] NOTICE: ACFSK-0039: Module unloaded.
 Sep 14 12:25:26 msuu435 oracleoks: [ID 742266 kern.notice] NOTICE: ADVMK-0003: Module unloaded.
 Sep 14 12:25:26 msuu435 oracleoks: [ID 290090 kern.notice] NOTICE: OKSK-00006: Module unloaded.
 Sep 14 12:25:29 msuu435 oracleoks: [ID 123267 kern.notice] NOTICE: OKSK-00028: In memory kernel log buffer address: 0xc40168a9a0c8, size: 10485760
 Sep 14 12:25:29 msuu435 oracleoks: [ID 863671 kern.notice] NOTICE: OKSK-00027: Oracle kernel distributed lock manager hash size is 31251
 Sep 14 12:25:29 msuu435 oracleoks: [ID 160659 kern.notice] NOTICE: OKSK-00004: Module load succeeded. Build information: (LOW DEBUG) USM_12.1.0.2.0ACFSPSU_SOLARIS.SPARC64_160211 2016/02/12 01:10:31
 Sep 14 12:25:29 msuu435 pseudo: [ID 129642 kern.info] pseudo-device: oracleadvm0
 Sep 14 12:25:29 msuu435 genunix: [ID 936769 kern.info] oracleadvm0 is /pseudo/oracleadvm@0
 Sep 14 12:25:29 msuu435 oracleoks: [ID 383825 kern.notice] NOTICE: ADVMK-0001: Module load succeeded. Build information: (LOW DEBUG) - USM_12.1.0.2.0ACFSPSU_SOLARIS.SPARC64_160211 built on 2016/02/12 01:13:59.
 Sep 14 12:25:30 msuu435 oracleoks: [ID 617314 kern.notice] NOTICE: ACFSK-0037: Module load succeeded. Build information: (LOW DEBUG) USM_12.1.0.2.0ACFSPSU_SOLARIS.SPARC64_160211 2016/02/12 01:18:55
 Sep 14 12:25:30 msuu435 pseudo: [ID 129642 kern.info] pseudo-device: oracleacfs0
 Sep 14 12:25:30 msuu435 genunix: [ID 936769 kern.info] oracleacfs0 is /pseudo/oracleacfs@0
 Sep 14 12:25:32 msuu435 root: [ID 702911 user.error] exec /mdccssver/oragrid/12.1.0/perl/bin/perl -I/mdccssver/oragrid/12.1.0/perl/lib /mdccssver/oragrid/12.1.0/bin/crswrapexece.pl /mdccssver/oragrid/12.1.0/crs/install/s_crsconfig_msuu435_env.txt /mdccssver/oragrid/12.1.0/bin/ohasd.bin "exclusive"
 Sep 14 12:25:52 msuu435 root: [ID 702911 user.error] exec /mdccssver/oragrid/12.1.0/perl/bin/perl -I/mdccssver/oragrid/12.1.0/perl/lib /mdccssver/oragrid/12.1.0/bin/crswrapexece.pl /mdccssver/oragrid/12.1.0/crs/install/s_crsconfig_msuu435_env.txt /mdccssver/oragrid/12.1.0/bin/ohasd.bin "reboot _ORA_BLOCKING_STACK_LOCALE=AMERICAN_AMERICA.AL32UTF8"
 Sep 14 12:26:52 msuu435 CLSD: [ID 770310 daemon.notice] The clock on host msuu435 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
 Sep 14 13:08:54 msuu435 last message repeated 1 time
 Sep 14 13:49:32 msuu435 root: [ID 702911 user.error] exec /mdccssver/oragrid/12.1.0/perl/bin/perl -I/mdccssver/oragrid/12.1.0/perl/lib /mdccssver/oragrid/12.1.0/bin/crswrapexece.pl /mdccssver/oragrid/12.1.0/crs/install/s_crsconfig_msuu435_env.txt /mdccssver/oragrid/12.1.0/bin/ohasd.bin "reboot"

4.检查两节点磁盘主次设备号与对应的磁盘信息正确:
mknod /dev/asmdisk/datadisk1 c 265 8 对应 /devices/virtual-devices@100/channel-devices@200/disk@1:a,raw
 mknod /dev/asmdisk/datadisk2 c 265 16 对应 /devices/virtual-devices@100/channel-devices@200/disk@2:a,raw
 mknod /dev/asmdisk/datadisk3 c 265 24 对应 /devices/virtual-devices@100/channel-devices@200/disk@3:a,raw
 mknod /dev/asmdisk/datadisk4 c 265 32 对应 /devices/virtual-devices@100/channel-devices@200/disk@4:a,raw

5.回过头检查alert.log日志如下片段:
ORA-27041: unable to open file
 SVR4 Error: 13: Permission denied
说明存在权限问题。

6.查看OS用户和组的情况:
cd /etc/passwd
ccssver:x:1001:106::/mdccssver/mdccssver/home:/bin/bash
oragrid:x:1002:106::/mdccssver/oragrid/home:/bin/bash
 cat /etc/group
 oinstall::106:
 7.查看mknod设备权限
-bash-4.1$ cd /dev/asmdisk
 -bash-4.1$ ls -ltr
 total 0
 crw-rw-r-- 1 oragrid asmadmin 265, 24 Sep 14 13:51 datadisk3
 crw-rw-r-- 1 oragrid asmadmin 265, 16 Sep 14 13:51 datadisk2
 crw-rw-r-- 1 oragrid asmadmin 265, 8 Sep 14 13:51 datadisk1
 crw-rw-r-- 1 oragrid asmadmin 265, 32 Sep 14 13:51 datadisk4
 crw-rw-r-- 1 oragrid asmadmin 265, 70 Sep 15 03:29 crsdisk4
 crw-rw-r-- 1 oragrid asmadmin 265, 62 Sep 15 03:29 crsdisk3
 crw-rw-r-- 1 oragrid asmadmin 265, 54 Sep 15 03:29 crsdisk2
 crw-rw-r-- 1 oragrid asmadmin 265, 78 Sep 15 03:29 crsdisk5
 crw-rw-r-- 1 oragrid asmadmin 265, 46 Sep 15 03:29 crsdisk1

从上面看,应该权限设置有问题,mknod设备的组应该为oinstall,所以直接手工启动库会有问题,但用srvctl启动库是正常的。如下:

-bash-4.1$ srvctl start instance -d ccssver -i ccssver2
 -bash-4.1$ ps -ef|grep pmon
 oragrid 25198 1 0 13:51:07 ? 0:09 asm_pmon_+ASM2
 ccssver 8819 1 0 03:39:40 ? 0:00 ora_pmon_ccssver2
 ccssver 9646 7843 0 03:40:27 pts/3 0:00 grep pmon
 oragrid 25935 1 0 13:51:44 ? 0:07 mdb_pmon_-MGMTDB
 -bash-4.1$ id
 uid=1001(ccssver) gid=106(oinstall)
 -bash-4.1$ sqlplus /nolog

SQL*Plus: Release 12.1.0.2.0 Production on Thu Sep 15 03:40:56 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

SQL> conn /as sysdba
 Connected.
 SQL> select open_mode from v$database;

OPEN_MODE
 --------------------
 READ WRITE

用srvctl启动库正常,而用sqlplus 启动报错,参考MOS中这段:
With SRVCTL, the root user OS setup is used since inherited from the crsd.bin that will start the instances as oracle user. With SQLPLUS, the oracle user OS setup is used. Having different user setups for root or oracle will make that the performance can be different (fragmented sga or not, another 'solaris' project setup, ...) Check Note 603051.1 , Note 369424.1 .
两者使用的用户权限不一样。

三.问题解决

更改下面权限,将数据盘的组改为oinstall

chown oragrid:oinstall /dev/asmdisk/datadisk1
 chown oragrid:oinstall /dev/asmdisk/datadisk2
 chown oragrid:oinstall /dev/asmdisk/datadisk3
 chown oragrid:oinstall /dev/asmdisk/datadisk4

root@msuu435:/dev/asmdisk# ls -ltr
 total 0
 crw-rw-r-- 1 oragrid oinstall 265, 24 Sep 15 03:39 datadisk3
 crw-rw-r-- 1 oragrid oinstall 265, 16 Sep 15 03:44 datadisk2
 crw-rw-r-- 1 oragrid oinstall 265, 32 Sep 15 04:08 datadisk4
 crw-rw-r-- 1 oragrid oinstall 265, 8 Sep 15 04:08 datadisk1
 crw-rw-r-- 1 oragrid asmadmin 265, 78 Sep 15 04:08 crsdisk5
 crw-rw-r-- 1 oragrid asmadmin 265, 70 Sep 15 04:08 crsdisk4
 crw-rw-r-- 1 oragrid asmadmin 265, 62 Sep 15 04:08 crsdisk3
 crw-rw-r-- 1 oragrid asmadmin 265, 54 Sep 15 04:08 crsdisk2
 crw-rw-r-- 1 oragrid asmadmin 265, 46 Sep 15 04:08 crsdisk1

更改权限后,用sqlplus启动数据库,不再报错,能正常打开。

root@msuu435:/dev/asmdisk# su - ccssver
 Oracle Corporation SunOS 5.11 11.1 March 2014
 -bash-4.1$ sqlplus /nolog

SQL*Plus: Release 12.1.0.2.0 Production on Thu Sep 15 05:06:20 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.

SQL> conn /as sysdba
 Connected to an idle instance.
 SQL> startup
 ORACLE instance started.

Total System Global Area 6.4425E+10 bytes
 Fixed Size 7651632 bytes
 Variable Size 9932119760 bytes
 Database Buffers 5.4358E+10 bytes
 Redo Buffers 126558208 bytes
 Database mounted.
 Database opened.

另一个节点作同样处理,用sqlplus和srvctl启动数据库都正常。

参与评论