For RAC, checking for hung sessions is simplified by the RAC Hang Manager. For non-RAC, I personally use v$sess_io or enable session tracing.
In 12.1.0.1, Hang Manager can detect hangs between the database and ASM.
2. Deadlock or Closed Chain
A deadlock is a closed wait chain. The only way to break a deadlock chain is to let some of the sessions in it complete their work or be terminated.
3. Hang or Open Chain
In the Oracle database, a hang is the waiting state that a process enters when it cannot obtain a resource it has requested; the wait ends only once the resource is obtained. Hang Manager (HM) manages hangs: it monitors, analyzes, records and resolves them.
A wait chain is made up of blocking processes and waiting processes. Among the blocking processes there are one or more root blocking processes, which block all the others. If the root blocking process is busy with some operation, the wait chain may be normal; if it is idle, the wait chain is probably not normal, and the way to break it is to terminate the root blocking process. HM proactively discovers wait chains in the database and analyzes them. If it finds a hang that really affects database performance, it decides, depending on the circumstances, whether to resolve it; even when it does not resolve the hang directly, it records the corresponding diagnostic information and keeps monitoring.
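The difference between an open chain (hang) and a closed chain (deadlock) can be illustrated with a small Python sketch. This is not how HM is implemented; the function name classify_wait_chain and the blocked_by mapping (waiting session id to blocking session id) are assumptions made only for illustration.

```python
def classify_wait_chain(blocked_by, start):
    """Follow blocker links from `start` through a hypothetical wait graph.

    blocked_by maps a waiting session id to the session id that blocks it.
    Returns ("open", root) when the chain ends at a session that waits on
    nobody (the root blocker), or ("closed", cycle) when the chain loops
    back on itself (a deadlock).
    """
    path = []
    seen = set()
    current = start
    while current in blocked_by:
        if current in seen:
            # We returned to a session already on the path: closed chain.
            return "closed", path[path.index(current):]
        seen.add(current)
        path.append(current)
        current = blocked_by[current]
    # `current` waits on nobody, so it is the root blocker of an open chain.
    return "open", current


# Open chain: 101 waits on 102, which waits on 103 (the root blocker).
hang = classify_wait_chain({101: 102, 102: 103}, 101)

# Closed chain: 201 -> 202 -> 203 -> 201.
deadlock = classify_wait_chain({201: 202, 202: 203, 203: 201}, 201)
```

In the open case the returned root is the candidate to inspect (busy or idle); in the closed case every session in the returned cycle is part of the deadlock.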
V$HANG_INFO: This view contains details of the hangs found by HM.
V$HANG_SESSION_INFO: This view contains information about the sessions involved in a hang.
V$HANG_STATISTICS: This view contains statistics related to hangs.
The work of HM is composed of seven phases:
Phase 1 (Collection phase): The DIA0 process of each instance collects hang analyze information at regular intervals.
Phase 2 (Discovery phase): The DIA0 process of each instance analyzes the collected hang analyze information, locates the sessions that are hung, and sends the result to the DIA0 process on the master node.
Phase 3 (Drawing phase): The DIA0 process on the master node uses the information received from the DIA0 process of each instance to draw the wait chain.
Phase 4 (Analysis phase): The DIA0 process on the master node analyzes the drawn wait chain to determine whether a hang is indeed present.
Phase 5 (Validation phase): The DIA0 process on the master node executes phases 1-4 again, compares the new result with the earlier phase 4 analysis, and verifies that the hang is really happening.
Phase 6 (Positioning phase): Based on the results of the validation phase, the DIA0 process on the master node locates the root blocking process of the wait chain.
Phase 7 (Resolution phase): The DIA0 process on the master node determines whether the hang can be resolved based on the value of the parameter _hang_resolution_scope.
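The idea behind the validation phase (run the analysis twice and keep only what shows up both times) can be sketched as follows. The data representation is an assumption: each analysis result is modeled as a dict mapping a root blocker session id to the set of session ids waiting behind it, and verify_hangs is a hypothetical helper, not an Oracle API.

```python
def verify_hangs(first_result, second_result):
    """Keep only the chains that appear in both analysis passes.

    Each result is a hypothetical mapping: root blocker session id ->
    set of session ids waiting behind it. A waiter counts as verified
    only if it is behind the same root blocker in both passes.
    """
    verified = {}
    for root, waiters in first_result.items():
        still_waiting = waiters & second_result.get(root, set())
        if still_waiting:
            verified[root] = still_waiting
    return verified


first_pass = {10: {11, 12}, 20: {21}}
second_pass = {10: {11, 13}}  # session 12 and the chain under 20 cleared up
confirmed = verify_hangs(first_pass, second_pass)
```

Waits that clear between the two passes drop out, which is why a second pass helps avoid acting on short-lived blocking.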
Trace log files for the DIA0 process
Main trace file (<sid>_dia0_<pid>.trc): This file records the details of the DIA0 process's work, including how each hang was discovered, analyzed, and handled.
History trace file (<sid>_dia0_<pid>_n.trc): Because the DIA0 process keeps writing to its trace file as the database runs, that file can grow very large. The DIA0 process therefore periodically writes log information to its history trace files, where n is a positive integer that increases over time.
Incident log file: If HM resolves a hang by terminating a process, the ORA-32701 error is first recorded in the alert log, and because of the ADR, the DIA0 process also produces an incident log file that records the details of the problem.
Parameters of HM
_hang_detection_enabled: This parameter determines whether HM is enabled in the database; the default value is TRUE.
_hang_detection_interval: This parameter specifies the interval at which HM collects hang analyze information; the default value is 32 seconds.
_hang_verification_interval: This parameter specifies the interval HM uses to verify a hang; the default value is 46 seconds.
_hang_resolution_scope: This parameter specifies how far HM may go when resolving a hang. The default value is PROCESS, and the allowed values are as follows:
OFF: HM only continues to monitor hangs and does nothing to resolve them.
PROCESS: HM can resolve a hang by terminating the root blocking process, provided that process is not a critical database background process, because terminating such a process would crash the instance.
INSTANCE: HM can resolve a hang by terminating the instance.
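The three scope values can be read as a small decision rule. The Python sketch below only restates the descriptions above; resolution_action and its return strings are hypothetical names, and the behavior under INSTANCE for a non-critical root blocker (terminate just the process) is an assumption, not something the text confirms.

```python
def resolution_action(scope, root_is_critical_background):
    """Map a _hang_resolution_scope value to an illustrative action.

    Return values are hypothetical labels, not Oracle messages.
    """
    scope = scope.upper()
    if scope == "OFF":
        # HM keeps monitoring but never terminates anything.
        return "monitor_only"
    if scope == "PROCESS":
        # A critical background process is never killed under this scope,
        # because that would crash the instance.
        if root_is_critical_background:
            return "monitor_only"
        return "terminate_process"
    if scope == "INSTANCE":
        # Assumption: the instance is terminated only when killing the
        # root blocker alone is not an option.
        if root_is_critical_background:
            return "terminate_instance"
        return "terminate_process"
    raise ValueError(f"unknown scope: {scope!r}")
```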
We can get the complete list from DBA_HANG_MANAGER_PARAMETERS.
Related parameters:
NAME                                               VALUE      ISDEFAULT ISMOD      ISADJ
-------------------------------------------------- ---------- --------- ---------- -----
_hang_analysis_num_call_stacks                     3          TRUE      FALSE      FALSE
_hang_base_file_count                              5          TRUE      FALSE      FALSE
_hang_base_file_space_limit                        10000000   TRUE      FALSE      FALSE
_hang_bool_spare1                                  TRUE       TRUE      FALSE      FALSE
_hang_delay_resolution_for_libcache                TRUE       TRUE      FALSE      FALSE
_hang_detection_enabled                            TRUE       TRUE      FALSE      FALSE
_hang_detection_interval                           32         TRUE      FALSE      FALSE
_hang_hang_analyze_output_hang_chains              TRUE       TRUE      FALSE      FALSE
_hang_hiload_promoted_ignored_hang_count           2          TRUE      FALSE      FALSE
_hang_hiprior_session_attribute_list                          TRUE      FALSE      FALSE
_hang_ignored_hang_count                           1          TRUE      FALSE      FALSE
_hang_ignored_hangs_interval                       300        TRUE      FALSE      FALSE
_hang_int_spare2                                   FALSE      TRUE      FALSE      FALSE
_hang_log_verified_hangs_to_alert                  FALSE      TRUE      FALSE      FALSE
_hang_long_wait_time_threshold                     0          TRUE      FALSE      FALSE
_hang_lws_file_count                               5          TRUE      FALSE      FALSE
_hang_lws_file_space_limit                         10000000   TRUE      FALSE      FALSE
_hang_monitor_archiving_related_hang_interval      300        TRUE      FALSE      FALSE
_hang_msg_checksum_enabled                         TRUE       TRUE      FALSE      FALSE
_hang_resolution_allow_archiving_issue_termination TRUE       TRUE      FALSE      FALSE
_hang_resolution_confidence_promotion              FALSE      TRUE      FALSE      FALSE
_hang_resolution_global_hang_confidence_promotion  FALSE      TRUE      FALSE      FALSE
_hang_resolution_policy                            HIGH       TRUE      FALSE      FALSE
_hang_resolution_promote_process_termination       FALSE      TRUE      FALSE      FALSE
_hang_resolution_scope                             PROCESS    TRUE      FALSE      FALSE
_hang_short_stacks_output_enabled                  TRUE       TRUE      FALSE      FALSE
_hang_signature_list_match_output_frequency        10         TRUE      FALSE      FALSE
_hang_statistics_collection_interval               15         TRUE      FALSE      FALSE
_hang_statistics_collection_ma_alpha               30         TRUE      FALSE      FALSE
_hang_statistics_high_io_percentage_threshold      15         TRUE      FALSE      FALSE
_hang_verification_interval                        46         TRUE      FALSE      FALSE
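When comparing such a listing between instances, it can help to turn each line into a record. The following Python sketch is one way to do that, assuming the five-column layout shown above (NAME, VALUE, ISDEFAULT, ISMOD, ISADJ) and that VALUE may be empty, as it is for _hang_hiprior_session_attribute_list. The function name parse_param_line is hypothetical.

```python
def parse_param_line(line):
    """Parse one whitespace-separated row of the parameter listing.

    The last three tokens are taken as ISDEFAULT, ISMOD and ISADJ, the
    first token as NAME, and whatever lies in between (possibly nothing)
    as VALUE.
    """
    tokens = line.split()
    if len(tokens) < 4:
        raise ValueError(f"not a parameter row: {line!r}")
    name = tokens[0]
    is_default, is_modified, is_adjusted = (t == "TRUE" for t in tokens[-3:])
    value = " ".join(tokens[1:-3])
    return {
        "name": name,
        "value": value,
        "is_default": is_default,
        "is_modified": is_modified,
        "is_adjusted": is_adjusted,
    }


rows = [
    "_hang_detection_interval 32 TRUE FALSE FALSE",
    "_hang_hiprior_session_attribute_list TRUE FALSE FALSE",
    "_hang_bool_spare1 TRUE TRUE FALSE FALSE",
]
params = {r["name"]: r for r in map(parse_param_line, rows)}
```

Reading the flags from the right-hand side is what keeps a row with an empty VALUE from being misparsed.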