rac 启动

Basic Things in RAC

inittab

/etc/init/oracle-ohasd.conf -> respawn -> /etc/init.d/init.ohasd

/u01/app/11.2.0/grid/bin/crsctl lsmodules 
Usage:
  crsctl lsmodules {mdns|gpnp|css|crf|crs|ctss|evm|gipc}
 where
   mdns  multicast Domain Name Server
   gpnp  Grid Plug-n-Play Service
   css   Cluster Synchronization Services
   crf   Cluster Health Monitor
   crs   Cluster Ready Services
   ctss  Cluster Time Synchronization Service
   evm   EventManager
   gipc  Grid Interprocess Communications

crsctl status resource -t -init

OHASD Oracle High Availability Services Daemon

OCR & OLR Oracle Cluster Registry, Oracle Local Registry

OLR -> ASM -> OCR

#OCR file backup location
ocrconfig -showbackup 

#OCR Backup
ocrconfig -export < File_Name_with_Full_Location.ocr > 

#Restore OCR
ocrconfig -restore <File_Name_with_Full_Location.ocr> 

#Import metadata specifically for OCR.
ocrconfig -import <File_Name_With_Full_Location.dmp> 

#Gives the OCR info in detail
Ocrcheck -details 

#Gives the OLR info in detail
ocrcheck -local 

#Take the dump of OLR.
ocrdump -local <File_Name_with_Full_Location.olr> 

#Take the dump of OCR.
ocrdump <File_Name_with_Full_Location.ocr> 

Voting Disk

两个主要作用：

动态 - 心跳信息
静态 - 群集中的节点信息

#Taking backup of voting disk
dd if=Name_Of_Voting_Disk of=Name_Of_Voting_Disk_Backup 

#Check voting disk details.
crsctl query css votedisk 

#To add voting disk
crsctl add css votedisk path_to_voting_disk 

#If the Cluster is down
crsctl add css votedisk -force 

#Delete Voting disk
crsctl delete css votedisk <File_Name_With_Password_With_file_name> 

#If the cluster is down
crsctl delete css votedisk -force 

#Replace the voting disk.
crsctl replace votedisk <+ASM_Disk_Group> 

CRSD stands for Cluster Resource Service Daemon

起停服务，及维护OCR

#Check crs resources
crs_stat -t -v 

#Check in a bit detail view. BEST ONE.
crsctl stat res -t 

#Enable Automatic start of Services after reboot
crsctl enable crs 

#Check crs Services.
crsctl check crs 

#Disable Automatic start of CRS services after reboot
crsctl disable crs 

#Stop the crs services on the node which we are executing
crsctl stop crs 

#Stop the crs services forcefully
crsctl stop crs -f 

#To start the crs services on respective node
crsctl start crs 

#To start the crs services in exclusive mode when u lost voting disk.
#You need to replace the voting disk after you start the css.
crsctl start crs -excl 

#Stop the crs services on the cluster nodes
crsctl stop cluster -all 

#Start the crs services on all the cluster nodes.
crsctl start cluster -all 

#Find all the nodes relative to the cluster
olsnodes 

#With this you will get master node information
oclumon manage -get master 

#Find PID from which crs is running.
cat $CRS_HOME/crs/init/<node_name>.pid 

CSSD Cluster Synchronization Service Daemon

如果两个节点都已启动并正在运行。并且由于其中一个通信通道，CSSD进程获得了另一个节点关闭的信息。因此，在这种情况下，无法将新事务分配给该节点。节点驱逐将完成。现在运行的节点将所有权作为主节点。

#用于停止css
crsctl stop css 

#重启后禁用自动启动。
crsctl disable css 

CTTSD Cluster Time Synchronization Service Daemon

#检查所有节点的时钟同步
cluvfy comp clocksync -n all -verbose 
#检查msecs中的服务状态和时间偏移。
crsctl check ctts 

VIP

#To start VIP
srvctl start vip -n <node_name> -i <VIP_Name> 

#To stop VIP
srvctl stop vip -n <node_name> -i <VIP_Name> 

#Enable the VIP.
srvctl enable vip -i vip_name 

#Disable the VIP.
srvctl disable vip -i vip_name 

#status of nodeapps
srvctl status nodeapps -n <node_name> 

#status of vip on a node
srvctl status vip -n <node_name> 

SCAN IP & Listener Single Client Access Name

least_recently_loaded算法 -> SCAN Listener

#retrieves scan listener configuration
srvctl config scan 

#List of scan listeners with Port number
srvctl config scan_listener 

#Add a scan listener to the cluster
srvctl add scan -n <node_name> 

#to add scan listener on specific port
srvctl add scan_listener -p <Desired_port_number> 

#find the list of scan listeners
SQL> SHOW PARAMETER REMOTE_LISTENER; 

#stops all scan listeners when used without -i option
srvctl stop scan 

#Stops one or more services in the cluster
srvctl stop scan_listener 

#To start the scan VIP
srvctl start scan 

#Start the scan listener.
srvctl start scan_listener 

#verify scan VIP status
srvctl status scan 

#Verify scan listener status.
srvctl status scan_listener 

#Modify the scan listener
srvctl modify scan_listener 

#relocate the scan listener to another node.
srvctl relocate scan_listener -i <Ordinal_Number> -n <node_name> 

Ologgerd cluster logger service Daemon

脑裂分辨，主控制器辨别

此过程负责收集本地节点中的信息。这将从每个节点收集信息，并将数据发送到master loggerd。这将发送信息，如CPU，内存使用情况，Os级别信息，磁盘信息，磁盘信息，进程，文件系统信息。

#Find which is the master node
oclumon manage -get master 

#Will get the path of the repository logs
oclumon manage -get reppath 

#This will give you the limitations on repository size
oclumon manage -get repsize 

#find which nodes are connected to loggerd
oclumon showobjects

#This will give a detail view including system, topconsumers, processes, devices, nics, filesystems status, protocol errors.
oclumon dumpnodeview 

#you can view all the details in c. column from a specific time you mentioned.
oclumon dumpnodeview -n <node_1 node_2 node_3> -last "HH:MM:SS" 

#If we need info from all the nodes.11.What is sysmon?
oclumon dumpnodeview allnodes -last "HH:MM:SS" 

Evmd Event Volume Manager Daemon

它向集群中的所有其他节点发送和接收有关资源状态更改的操作。这将借助ONS Oracle Notification Services

#获取evmd中生成的事件。
evmwatch -A -t "@timestamp @@" 
#这将在上述节点的evmd日志中发布消息。
evmpost -u "<message here>" -h <node_name> 

mdnsd Multicast Domain Name Service

gpndp使用此过程来定位群集中的配置文件以及GNS以执行名称解析。Mdnsd更新init目录中的pid文件。

ONS Oracle Notification Service

SMTP, mail sent
nodes之间传输信息

#Status of nodeapps
srvctl status nodeapps 

#Check ons configuration.
cat $ORACLE_HOME/opmn/conf/ons.config 

#ONS logs will be in this location.
$ORACLE_HOME/opmn/logs 

TAF Trasparent Application Failover

当任何rac节点关闭时，select语句需要故障转移到活动节点。

SELECT machine, failover_type, failover_method, failed_over, COUNT(1) FROM gv$session GROUP BY machine, failover_type, failover_method, failed_over;

GPNPD Grid Plug aNd Play Daemon

此配置文件由群集名称，主机名，具有IP地址的ntwork配置文件，OCR组成。如果我们对表决磁盘进行任何修改，将更新配置文件。

#Check the version of tool.
gpnptool ver 

# get local gpnpd server.
gpnptool lfind 

#read the profile
gpnptool get 

#check daemon is running on local node.
gpnptool lfind 

#Check whether configuration is valid.
gpnptool check -p= CRS_HOME/gpnp/<node_name>/profile/peer/profile.xml 

FCF Fast Connection Failover

它是一个应用程序级故障转移过程。这将自动订阅FAN事件，这将有助于对数据库集群的上下事件做出即时反应。

GCS(LMSn) Global Cache Service

它负责在需要时将块从实例传输到另一个实例, 无需从数据文件中选择数据

GES(LMD) Global Enqueue Service

GES控制所有节点上的库和字典缓存。GES管理事务锁，表锁，库缓存锁，字典缓存锁，数据库安装锁。

GRD Global Resource Directory

这是为了记录资源和入队的信息。就这个词而言，它存储了所有信息的信息。数据块标识符，数据块模式（共享，独占，空），缓冲区高速缓存等信息将具有访问权限

Diskmon

当ocssd启动时，磁盘监视器守护程序会持续运行。它监视并执行Exadata存储服务器的

OPROCD Oracle Process Monitor Daemon

就是cssd

FAN Fast Application Notification

就是ONS

命令

节点层：osnodes
网络层：oifcfg
集群层：crsctl, ocrcheck,ocrdump,ocrconfig
应用层：srvctl,onsctl,crs_stat

节点层

grid@stb1 ~ $ $ORACLE_HOME/bin/!!
$ORACLE_HOME/bin/olsnodes -n
stb2    1
stb1    2

网络层

grid@stb1 ~ $ $ORACLE_HOME/bin/oifcfg getif
eth1  10.255.255.0  global  public
eth2  192.168.255.0  global  cluster_interconnect

[root@rac1 bin]# ./oifcfg setif -global eth0/192.168.1.119:public 
[root@rac1 bin]# ./oifcfg setif -globaleth1/10.85.10.119:cluster_interconnect
[root@rac1 bin]# ./oifcfg getif -type public 
[root@rac1 bin]# ./oifcfg delif -global 

集群层

grid@stb1 ~ $ crsctl status resource -t
grid@stb1 ~ $ crsctl status resource -t -init

CRS进程栈默认随着操作系统的启动而自启动，有时出于维护目的需要关闭这个特性，可以用root用户执行下面命令。

[root@rac1 bin]# ./crsctl disable crs 
[root@rac1 bin]# ./crsctl enable crs 

这个命令实际是修改了以下两个文件内容,disable后ohasd.bin reboot不能启动，也就没有后面的一系列进程。但/bin/sh /etc/init.d/init.tfa run 和 /bin/sh /etc/init.d/init.ohasd run由init启动。

/etc/oracle/scls_scr/$(HOSTNAME)/root/ohasdstr
/etc/oracle/scls_scr/$(HOSTNAME)/root/ohasdrun

手动启动crs

[root@stb1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
 /u01/app/11.2.0/grid/bin/oraagent.bin
 /u01/app/11.2.0/grid/bin/mdnsd.bin
 /u01/app/11.2.0/grid/bin/gpnpd.bin
 /u01/app/11.2.0/grid/bin/gipcd.bin
 /u01/app/11.2.0/grid/bin/cssdmonitor
 /u01/app/11.2.0/grid/bin/cssdagent
 /u01/app/11.2.0/grid/bin/ocssd.bin 
 /u01/app/11.2.0/grid/bin/orarootagent.bin
 /u01/app/11.2.0/grid/bin/octssd.bin reboot
 /u01/app/11.2.0/grid/bin/evmd.bin
  \_ /u01/app/11.2.0/grid/bin/evmlogger.bin -o /u01/app/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/11.2.0/grid/evm/log/evmlogge
 asm_pmon_+ASM2
 asm_psp0_+ASM2
 asm_vktm_+ASM2
 asm_gen0_+ASM2
 asm_diag_+ASM2
 asm_ping_+ASM2
 asm_dia0_+ASM2
 asm_lmon_+ASM2
 asm_lmd0_+ASM2
 asm_lms0_+ASM2
 asm_lmhb_+ASM2
 asm_mman_+ASM2
 asm_dbw0_+ASM2
 asm_lgwr_+ASM2
 asm_ckpt_+ASM2
 asm_smon_+ASM2
 asm_rbal_+ASM2
 asm_gmon_+ASM2
 asm_mmon_+ASM2
 asm_mmnl_+ASM2
 asm_lck0_+ASM2
 oracle+ASM2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 /u01/app/11.2.0/grid/bin/crsd.bin reboot
 oracle+ASM2_ocr (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 asm_asmb_+ASM2
 oracle+ASM2_asmb_+asm2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 asm_o000_+ASM2
 oracle+ASM2_o000_+asm2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 /u01/app/11.2.0/grid/bin/oraagent.bin
 /u01/app/11.2.0/grid/bin/orarootagent.bin
 oracle+ASM2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 /u01/app/11.2.0/grid/opmn/bin/ons -d
  \_ /u01/app/11.2.0/grid/opmn/bin/ons -d
 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER 

影响 /etc/oracle/scls_scr/stb1/root/ohasdstr

[root@stb1 bin]# ./crsctl disable has
CRS-4621: Oracle High Availability Services autostart is disabled.

效果和 ./crsctl disable crs相似，就是启动不了。

crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   67fbe885438c4f4cbf85a68076a2aa68 (/dev/oracleasm/disks/DISK1) [DATA]
 Located 1 voting disk(s).

OCR 的备份

grid@stb2 ~ $ ls -lh /u01/app/11.2.0/grid/cdata/stb-cluster/
total 20M
-rw------- 1 root root 6.6M Aug 17 21:41 backup00.ocr
-rw------- 1 root root 6.6M Aug 17 21:41 day.ocr
-rw------- 1 root root 6.6M Aug 17 21:41 week.ocr

grid@stb2 /u01/app/11.2.0/grid/cdata/stb-cluster $ sudo md5sum *
4dc00b7f877a0f0b7aa8fbaa2a91a6aa  backup00.ocr
4dc00b7f877a0f0b7aa8fbaa2a91a6aa  day.ocr
4dc00b7f877a0f0b7aa8fbaa2a91a6aa  week.ocr

应用层

crsctl, srvctl 略

onsctl

srvctl add database

standalone db

sudo $GRID_HOME/bin/crsctl delete resource  ora.oradb.db

rac

srvctl add database -d db_unique_name -r PRIMARY -n db_name -o $ORACLE_HOME 
srvctl add instance -d db_unique_name -i $ORACLE_SID -n $HOSTNAME
srvctl add instance -d db_unique_name -i $ORACLE_SID -n $HOSTNAME

srvctl add database -d db_unique_name -o ORACLE_HOME [-x node_name] [-m domain_name] [-p spfile] [-r {PRIMARY|PHYSICAL_STANDBY|LOGICAL_STANDBY|SNAPSHOT_STANDBY}] [-s start_options] [-t stop_options] [-n db_name] [-y {AUTOMATIC|MANUAL}] [-g server_pool_list] [-a "diskgroup_list"]

srvctl modify database -d db_unique_name [-n db_name] [-o ORACLE_HOME] [-u oracle_user] [-m domain] [-p spfile] [-r {PRIMARY|PHYSICAL_STANDBY|LOGICAL_STANDBY|SNAPSHOT_STANDBY}] [-s start_options] [-t stop_options] [-y {AUTOMATIC|MANUAL}] [-g "server_pool_list"] [-a "diskgroup_list"|-z]

srvctl add service -d db_unique_name -s service_name -r preferred_list [-a available_list] [-P {BASIC|NONE|PRECONNECT}]
[-l [PRIMARY|PHYSICAL_STANDBY|LOGICAL_STANDBY|SNAPSHOT_STANDBY]
[-y {AUTOMATIC|MANUAL}] [-q {TRUE|FALSE}] [-j {SHORT|LONG}]
[-B {NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}]
[-m {NONE|BASIC}] [-x {TRUE|FALSE}] [-z failover_retries] [-w failover_delay]