RAC Startup
Basic Things in RAC
inittab
/etc/init/oracle-ohasd.conf -> respawn -> /etc/init.d/init.ohasd
/u01/app/11.2.0/grid/bin/crsctl lsmodules
Usage:
crsctl lsmodules {mdns|gpnp|css|crf|crs|ctss|evm|gipc}
where
mdns multicast Domain Name Server
gpnp Grid Plug-n-Play Service
css Cluster Synchronization Services
crf Cluster Health Monitor
crs Cluster Ready Services
ctss Cluster Time Synchronization Service
evm EventManager
gipc Grid Interprocess Communications
crsctl status resource -t -init
OHASD Oracle High Availability Services Daemon
Related commands
Enable Automatic start of Oracle High Availability services after reboot
crsctl enable has
Disable Automatic start of Oracle High Availability services after reboot
crsctl disable has
OHASD cannot be killed for good: as soon as it is killed, it is respawned immediately.
Ohasd spawns 3 types of services at cluster level
Level 1: Cssd Agent
Level 2: Oraroot Agent (respawns cssd, crsd, ctssd, diskmon, acfs)
Level 3: OraAgent (respawns mdnsd, gipcd, gpnpd, evmd, asm), CssdMonitor
#To start has services after reboot.
crsctl enable has
# has services should not start after reboot
crsctl disable has
# Check configuration whether autostart is enabled or not.
crsctl config has
# check whether it is enabled or not.
cat /etc/oracle/scls_scr/<Node_name>/root/ohasdstr
# whether restart enabled if node fails.
cat /etc/oracle/scls_scr/<Node_name>/root/ohasdrun
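The autostart flag file listed above can also be read from a script. The following is a minimal sketch, not an Oracle tool: the function name has_autostart_state is hypothetical, and it assumes the file holds a single word, enable or disable.

```shell
# Hypothetical helper: report the autostart state recorded in an ohasdstr-style file.
# Assumption: the file contains a single word, "enable" or "disable".
has_autostart_state() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 1
    fi
    # Strip whitespace so a trailing newline does not affect the comparison.
    state=$(tr -d ' \t\r\n' < "$file")
    case $state in
        enable)  echo "enabled" ;;
        disable) echo "disabled" ;;
        *)       echo "unknown" ;;
    esac
}

# Example call on a cluster node (path layout as shown above; adjust the node name):
# has_autostart_state "/etc/oracle/scls_scr/<Node_name>/root/ohasdstr"
```

crsctl config has remains the supported way to check this; the helper is only convenient when crsctl is not on the PATH.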
OCR & OLR Oracle Cluster Registry, Oracle Local Registry
OLR -> ASM -> OCR
#OCR file backup location
ocrconfig -showbackup
#OCR logical backup (export)
ocrconfig -export <File_Name_with_Full_Location.ocr>
#Restore OCR from a physical backup (one of those listed by ocrconfig -showbackup)
ocrconfig -restore <File_Name_with_Full_Location.ocr>
#Import an OCR logical export created with ocrconfig -export
ocrconfig -import <File_Name_With_Full_Location.dmp>
#Gives the OCR info in detail
ocrcheck -details
#Gives the OLR info in detail
ocrcheck -local
#Take the dump of OLR.
ocrdump -local <File_Name_with_Full_Location.olr>
#Take the dump of OCR.
ocrdump <File_Name_with_Full_Location.ocr>
Voting Disk
Two main roles:
- Dynamic: heartbeat information
- Static: information about the nodes in the cluster
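#Taking backup of voting disk (pre-11.2 only; from 11.2 on, dd is not supported and the voting disk data is backed up automatically with the OCR)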
dd if=Name_Of_Voting_Disk of=Name_Of_Voting_Disk_Backup
#Check voting disk details.
crsctl query css votedisk
#To add voting disk
crsctl add css votedisk path_to_voting_disk
#If the Cluster is down
crsctl add css votedisk path_to_voting_disk -force
#Delete Voting disk
crsctl delete css votedisk <File_Name_With_Full_Path>
#If the cluster is down
crsctl delete css votedisk <File_Name_With_Full_Path> -force
#Replace the voting disk.
crsctl replace votedisk <+ASM_Disk_Group>
CRSD stands for Cluster Ready Services Daemon
Starts and stops resources, and maintains the OCR.
#Check crs resources
crs_stat -t -v
#Check resources in a more detailed tabular view (the preferred command).
crsctl stat res -t
#Enable Automatic start of Services after reboot
crsctl enable crs
#Check crs Services.
crsctl check crs
#Disable Automatic start of CRS services after reboot
crsctl disable crs
#Stop the crs services on the node which we are executing
crsctl stop crs
#Stop the crs services forcefully
crsctl stop crs -f
#To start the crs services on respective node
crsctl start crs
#Start the crs services in exclusive mode when the voting disk has been lost.
#Replace the voting disk once css is up.
crsctl start crs -excl
#Stop the crs services on the cluster nodes
crsctl stop cluster -all
#Start the crs services on all the cluster nodes.
crsctl start cluster -all
#Find all the nodes relative to the cluster
olsnodes
#With this you will get master node information
oclumon manage -get master
#Find PID from which crs is running.
cat $CRS_HOME/crs/init/<node_name>.pid
CSSD Cluster Synchronization Service Daemon
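Suppose both nodes are up and running, and because one of the communication channels fails, the CSSD process is told that the other node is down. In that case no new transactions can be assigned to that node, so node eviction is carried out, and the node that keeps running takes ownership as the master node.
#Stop css
crsctl stop css
#Disable automatic start after reboot.
crsctl disable css
CTSSD Cluster Time Synchronization Service Daemon
#Check clock synchronization across all nodes
cluvfy comp clocksync -n all -verbose
#Check the service status and the time offset in msecs.
crsctl check ctss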
VIP
#To start VIP
srvctl start vip -n <node_name> -i <VIP_Name>
#To stop VIP
srvctl stop vip -n <node_name> -i <VIP_Name>
#Enable the VIP.
srvctl enable vip -i vip_name
#Disable the VIP.
srvctl disable vip -i vip_name
#status of nodeapps
srvctl status nodeapps -n <node_name>
#status of vip on a node
srvctl status vip -n <node_name>
SCAN IP & Listener Single Client Access Name
least_recently_loaded algorithm -> SCAN Listener
#Retrieve the SCAN configuration (SCAN name and SCAN VIPs)
srvctl config scan
#List of scan listeners with Port number
srvctl config scan_listener
#Add SCAN VIPs to the cluster for the given SCAN name
srvctl add scan -n <scan_name>
#to add scan listener on specific port
srvctl add scan_listener -p <Desired_port_number>
#Show the SCAN listener address the database registers with
SQL> SHOW PARAMETER REMOTE_LISTENER;
#Stops all SCAN VIPs when used without the -i option
srvctl stop scan
#Stops all SCAN listeners when used without the -i option
srvctl stop scan_listener
#To start the scan VIP
srvctl start scan
#Start the scan listener.
srvctl start scan_listener
#verify scan VIP status
srvctl status scan
#Verify scan listener status.
srvctl status scan_listener
#Modify the scan listener
srvctl modify scan_listener
#relocate the scan listener to another node.
srvctl relocate scan_listener -i <Ordinal_Number> -n <node_name>
Ologgerd cluster logger service Daemon
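Split-brain detection and master identification.
This process collects information on the local node. Information is gathered from every node and sent to the master loggerd: CPU, memory usage, OS-level information, disk information, processes, and file system information.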
#Find which is the master node
oclumon manage -get master
#Will get the path of the repository logs
oclumon manage -get reppath
#This will give you the limitations on repository size
oclumon manage -get repsize
#find which nodes are connected to loggerd
oclumon showobjects
#This will give a detail view including system, topconsumers, processes, devices, nics, filesystems status, protocol errors.
oclumon dumpnodeview
#View the same details for specific nodes, covering the last HH:MM:SS up to now.
oclumon dumpnodeview -n <node_1 node_2 node_3> -last "HH:MM:SS"
#If we need info from all the nodes.
oclumon dumpnodeview -allnodes -last "HH:MM:SS"
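The -last argument above is a duration written as HH:MM:SS. A small sketch of a converter from seconds to that form follows; the function name to_hms is hypothetical and is not part of oclumon.

```shell
# Hypothetical helper: format a number of seconds as HH:MM:SS
# for use as the -last argument of oclumon dumpnodeview.
to_hms() {
    s=$1
    printf '%02d:%02d:%02d\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

# Example: the last 15 minutes.
to_hms 900    # prints 00:15:00

# On a cluster node this could be combined as:
# oclumon dumpnodeview -allnodes -last "$(to_hms 900)"
```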
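Evmd Event Manager Daemon
It sends and receives notifications about resource state changes to and from all other nodes in the cluster. It does this with the help of ONS (Oracle Notification Services).
#Get the events generated in evmd.
evmwatch -A -t "@timestamp @@"
#Post a message to the evmd log of the specified node.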
evmpost -u "<message here>" -h <node_name>
mdnsd Multicast Domain Name Service
gpnpd uses this process to locate profiles in the cluster, and GNS uses it to perform name resolution. Mdnsd updates the pid file in the init directory.
ONS Oracle Notification Service
- SMTP, mail sent
- transfers information between nodes
#Status of nodeapps
srvctl status nodeapps
#Check ons configuration.
cat $ORACLE_HOME/opmn/conf/ons.config
#ONS logs will be in this location.
$ORACLE_HOME/opmn/logs
TAF Transparent Application Failover
When any RAC node goes down, SELECT statements need to fail over to a surviving node.
SELECT machine, failover_type, failover_method, failed_over, COUNT(1) FROM gv$session GROUP BY machine, failover_type, failover_method, failed_over;
GPNPD Grid Plug aNd Play Daemon
The profile consists of the cluster name, the hostname, the network profile with IP addresses, and the OCR. Any change to the voting disk updates the profile.
#Check the version of tool.
gpnptool ver
# get local gpnpd server.
gpnptool lfind
#read the profile
gpnptool get
#check daemon is running on local node.
gpnptool lfind
#Check whether configuration is valid.
gpnptool check -p=$CRS_HOME/gpnp/<node_name>/profiles/peer/profile.xml
FCF Fast Connection Failover
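This is an application-level failover mechanism. It subscribes to FAN events automatically, which lets the application react immediately to up and down events in the database cluster.
GCS (LMSn) Global Cache Service
It transfers blocks from one instance to another when needed, so the data does not have to be read from the datafiles.
GES(LMD) Global Enqueue Service
GES controls the library and dictionary caches on all nodes. GES manages transaction locks, table locks, library cache locks, dictionary cache locks, and database mount locks.
GRD Global Resource Directory
This records information about resources and enqueues. As the name suggests, it stores information about all the information: data block identifiers, the data block mode (shared, exclusive, null), which buffer caches have access, and so on.
Diskmon
The disk monitor daemon runs continuously once ocssd has started. It monitors Exadata storage servers and performs ...
OPROCD Oracle Process Monitor Daemon
In practice this is cssd.
FAN Fast Application Notification
In practice this is ONS.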
Commands
- Node layer: olsnodes
- Network layer: oifcfg
- Cluster layer: crsctl, ocrcheck, ocrdump, ocrconfig
- Application layer: srvctl, onsctl, crs_stat
Node layer
grid@stb1 ~ $ $ORACLE_HOME/bin/olsnodes -n
stb2 1
stb1 2
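The olsnodes -n listing above is easy to consume from a script. The sketch below looks up the node number for a given node name; the function name node_number is hypothetical, and it assumes each line has the form "node_name node_number" as shown.

```shell
# Hypothetical helper: look up a node's number in `olsnodes -n` output read from stdin.
# Assumption: each line is "<node_name> <node_number>".
# Exits non-zero when the node name is not found.
node_number() {
    awk -v name="$1" '$1 == name { print $2; found = 1 } END { exit !found }'
}

# Demo on the listing shown above; on a live node, pipe `olsnodes -n` instead.
printf 'stb2 1\nstb1 2\n' | node_number stb1    # prints 2
```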
Network layer
grid@stb1 ~ $ $ORACLE_HOME/bin/oifcfg getif
eth1 10.255.255.0 global public
eth2 192.168.255.0 global cluster_interconnect
[root@rac1 bin]# ./oifcfg setif -global eth0/192.168.1.119:public
[root@rac1 bin]# ./oifcfg setif -global eth1/10.85.10.119:cluster_interconnect
[root@rac1 bin]# ./oifcfg getif -type public
[root@rac1 bin]# ./oifcfg delif -global
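To pick out which interface carries which role, the oifcfg getif output shown earlier in this section can be filtered on its last column. This is a minimal sketch: the function name interfaces_for_role is hypothetical, and it assumes the four-column layout "interface subnet scope role".

```shell
# Hypothetical helper: list "interface subnet" pairs for one role
# (public or cluster_interconnect) from `oifcfg getif` output on stdin.
# Assumption: the role is the last whitespace-separated field of each line.
interfaces_for_role() {
    awk -v role="$1" '$NF == role { print $1 " " $2 }'
}

# Demo on the sample output from this section; on a live node, pipe `oifcfg getif` instead.
interfaces_for_role cluster_interconnect <<'EOF'
eth1 10.255.255.0 global public
eth2 192.168.255.0 global cluster_interconnect
EOF
```

For the sample above this prints eth2 192.168.255.0.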
Cluster layer
grid@stb1 ~ $ crsctl status resource -t
grid@stb1 ~ $ crsctl status resource -t -init
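The CRS stack starts automatically with the operating system by default. When this needs to be switched off for maintenance, run the following commands as root.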
[root@rac1 bin]# ./crsctl disable crs
[root@rac1 bin]# ./crsctl enable crs
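This command actually changes the contents of the following two files. After disable, ohasd.bin reboot does not start, so none of the downstream processes start either. /bin/sh /etc/init.d/init.tfa run and /bin/sh /etc/init.d/init.ohasd run are still started by init, however.
/etc/oracle/scls_scr/$(hostname -s)/root/ohasdstr
/etc/oracle/scls_scr/$(hostname -s)/root/ohasdrun
Start CRS manually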
[root@stb1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
/u01/app/11.2.0/grid/bin/ohasd.bin reboot
/u01/app/11.2.0/grid/bin/oraagent.bin
/u01/app/11.2.0/grid/bin/mdnsd.bin
/u01/app/11.2.0/grid/bin/gpnpd.bin
/u01/app/11.2.0/grid/bin/gipcd.bin
/u01/app/11.2.0/grid/bin/cssdmonitor
/u01/app/11.2.0/grid/bin/cssdagent
/u01/app/11.2.0/grid/bin/ocssd.bin
/u01/app/11.2.0/grid/bin/orarootagent.bin
/u01/app/11.2.0/grid/bin/octssd.bin reboot
/u01/app/11.2.0/grid/bin/evmd.bin
\_ /u01/app/11.2.0/grid/bin/evmlogger.bin -o /u01/app/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/11.2.0/grid/evm/log/evmlogge
asm_pmon_+ASM2
asm_psp0_+ASM2
asm_vktm_+ASM2
asm_gen0_+ASM2
asm_diag_+ASM2
asm_ping_+ASM2
asm_dia0_+ASM2
asm_lmon_+ASM2
asm_lmd0_+ASM2
asm_lms0_+ASM2
asm_lmhb_+ASM2
asm_mman_+ASM2
asm_dbw0_+ASM2
asm_lgwr_+ASM2
asm_ckpt_+ASM2
asm_smon_+ASM2
asm_rbal_+ASM2
asm_gmon_+ASM2
asm_mmon_+ASM2
asm_mmnl_+ASM2
asm_lck0_+ASM2
oracle+ASM2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
/u01/app/11.2.0/grid/bin/crsd.bin reboot
oracle+ASM2_ocr (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
asm_asmb_+ASM2
oracle+ASM2_asmb_+asm2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
asm_o000_+ASM2
oracle+ASM2_o000_+asm2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
/u01/app/11.2.0/grid/bin/oraagent.bin
/u01/app/11.2.0/grid/bin/orarootagent.bin
oracle+ASM2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
/u01/app/11.2.0/grid/opmn/bin/ons -d
\_ /u01/app/11.2.0/grid/opmn/bin/ons -d
/u01/app/11.2.0/grid/bin/tnslsnr LISTENER
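The process listing above can serve as a checklist after a manual start. The sketch below reads a process listing on stdin and prints the core daemons that are missing. The function name missing_daemons and the choice of which daemons count as "core" are assumptions made for illustration, taken from the binaries visible in the listing.

```shell
# Hypothetical helper: print the core clusterware daemons that do NOT appear
# in a process listing supplied on stdin.
# Assumption: the daemon set below (taken from the listing above) is the one of interest.
missing_daemons() {
    listing=$(cat)
    for d in ohasd.bin ocssd.bin crsd.bin evmd.bin gpnpd.bin mdnsd.bin gipcd.bin octssd.bin; do
        case $listing in
            *"/$d"*) ;;           # found: binary path ends with /<daemon>
            *) echo "$d" ;;       # not found: report it
        esac
    done
}

# On a cluster node:
# ps -eo args | missing_daemons
```

An empty result means every daemon in the list was found; it says nothing about whether the resources they manage are online, for which crsctl status resource -t -init is still the tool.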
Affects /etc/oracle/scls_scr/stb1/root/ohasdstr
[root@stb1 bin]# ./crsctl disable has
CRS-4621: Oracle High Availability Services autostart is disabled.
The effect is similar to ./crsctl disable crs: the stack does not start.
crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 67fbe885438c4f4cbf85a68076a2aa68 (/dev/oracleasm/disks/DISK1) [DATA]
Located 1 voting disk(s).
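A quick health check is to count how many voting disks are reported ONLINE. The sketch below parses the output format shown above; the function name count_online_votedisks is hypothetical, and it assumes data rows start with an ordinal such as "1." followed by the state.

```shell
# Hypothetical helper: count ONLINE voting disks in `crsctl query css votedisk` output.
# Assumption: data rows look like " 1. ONLINE <id> (<path>) [<diskgroup>]".
count_online_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Demo on the sample output shown above; on a live node use:
#   crsctl query css votedisk | count_online_votedisks
count_online_votedisks <<'EOF'
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   67fbe885438c4f4cbf85a68076a2aa68 (/dev/oracleasm/disks/DISK1) [DATA]
Located 1 voting disk(s).
EOF
```

For the sample this prints 1. Header lines and the "Located" summary are ignored because their first field is not an ordinal.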
OCR backups
grid@stb2 ~ $ ls -lh /u01/app/11.2.0/grid/cdata/stb-cluster/
total 20M
-rw------- 1 root root 6.6M Aug 17 21:41 backup00.ocr
-rw------- 1 root root 6.6M Aug 17 21:41 day.ocr
-rw------- 1 root root 6.6M Aug 17 21:41 week.ocr
grid@stb2 /u01/app/11.2.0/grid/cdata/stb-cluster $ sudo md5sum *
4dc00b7f877a0f0b7aa8fbaa2a91a6aa backup00.ocr
4dc00b7f877a0f0b7aa8fbaa2a91a6aa day.ocr
4dc00b7f877a0f0b7aa8fbaa2a91a6aa week.ocr
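In the listing above all three backup files share one checksum, so they are byte-identical copies. The same check can be scripted by counting distinct checksums. This is a sketch only; the function name distinct_checksums is hypothetical, and it assumes md5sum-style input where the checksum is the first field.

```shell
# Hypothetical helper: count distinct checksums in md5sum-style output on stdin.
# A result of 1 means every listed file has identical content.
distinct_checksums() {
    awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Demo on the output shown above; on a live node, pipe `md5sum *.ocr` instead.
distinct_checksums <<'EOF'
4dc00b7f877a0f0b7aa8fbaa2a91a6aa backup00.ocr
4dc00b7f877a0f0b7aa8fbaa2a91a6aa day.ocr
4dc00b7f877a0f0b7aa8fbaa2a91a6aa week.ocr
EOF
```

For the sample this prints 1.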
Application layer
crsctl, srvctl: omitted
onsctl
srvctl add database
standalone db
sudo $GRID_HOME/bin/crsctl delete resource ora.oradb.db
rac
srvctl add database -d db_unique_name -r PRIMARY -n db_name -o $ORACLE_HOME
srvctl add instance -d db_unique_name -i $ORACLE_SID -n $HOSTNAME
srvctl add instance -d db_unique_name -i $ORACLE_SID -n $HOSTNAME
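Registering a RAC database follows the pattern above: one srvctl add database, then one srvctl add instance per node. The sketch below only prints those commands for review and executes nothing. The function name print_register_commands is hypothetical, and naming instances as db_name plus an ordinal is an assumed convention; substitute the real $ORACLE_SID values if they differ.

```shell
# Hypothetical dry-run generator: print the srvctl commands that register a RAC
# database and one instance per node. Nothing is executed.
# Assumption: instance names are <db_name><ordinal> (orcl1, orcl2, ...).
print_register_commands() {
    db_unique_name=$1
    db_name=$2
    oracle_home=$3
    shift 3
    echo "srvctl add database -d $db_unique_name -r PRIMARY -n $db_name -o $oracle_home"
    i=1
    for node in "$@"; do
        echo "srvctl add instance -d $db_unique_name -i ${db_name}${i} -n $node"
        i=$((i + 1))
    done
}

# Example with made-up names; review the output before running any of it.
print_register_commands orcl_prod orcl /u01/app/oracle/product/11.2.0/dbhome_1 stb1 stb2
```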
srvctl add database -d db_unique_name -o ORACLE_HOME [-x node_name] [-m domain_name] [-p spfile] [-r {PRIMARY|PHYSICAL_STANDBY|LOGICAL_STANDBY|SNAPSHOT_STANDBY}] [-s start_options] [-t stop_options] [-n db_name] [-y {AUTOMATIC|MANUAL}] [-g server_pool_list] [-a "diskgroup_list"]
srvctl modify database -d db_unique_name [-n db_name] [-o ORACLE_HOME] [-u oracle_user] [-m domain] [-p spfile] [-r {PRIMARY|PHYSICAL_STANDBY|LOGICAL_STANDBY|SNAPSHOT_STANDBY}] [-s start_options] [-t stop_options] [-y {AUTOMATIC|MANUAL}] [-g "server_pool_list"] [-a "diskgroup_list"|-z]
srvctl add service -d db_unique_name -s service_name -r preferred_list [-a available_list] [-P {BASIC|NONE|PRECONNECT}]
[-l [PRIMARY|PHYSICAL_STANDBY|LOGICAL_STANDBY|SNAPSHOT_STANDBY]]
[-y {AUTOMATIC|MANUAL}] [-q {TRUE|FALSE}] [-j {SHORT|LONG}]
[-B {NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}]
[-m {NONE|BASIC}] [-x {TRUE|FALSE}] [-z failover_retries] [-w failover_delay]