Use ASH first. That is the cleanest way to see whether this morning’s load hit library cache or row cache contention.
1) This morning's summary from historical ASH
This gives you the contention picture for today from midnight onward.
COL event FORMAT A35
COL wait_class FORMAT A15
COL first_seen FORMAT A20
COL last_seen FORMAT A20
SELECT
ash.event,
ash.wait_class,
COUNT(*) AS ash_samples,
COUNT(DISTINCT ash.session_id || ':' || ash.session_serial# || ':' || ash.instance_number) AS distinct_sessions,
COUNT(DISTINCT ash.sql_id) AS distinct_sqls,
MIN(ash.sample_time) AS first_seen,
MAX(ash.sample_time) AS last_seen
FROM dba_hist_active_sess_history ash
WHERE ash.sample_time >= TRUNC(SYSDATE)
AND ash.session_type = 'FOREGROUND'
AND ash.event IN (
'library cache lock',
'library cache pin',
'row cache lock',
'cursor: pin S wait on X',
'cursor: pin X',
'library cache: mutex X',
'library cache: mutex S'
)
GROUP BY
ash.event,
ash.wait_class
ORDER BY ash_samples DESC;
2) Top SQLs causing those waits
This is usually the most useful report.
COL event FORMAT A35
COL username FORMAT A20
COL module FORMAT A25
COL program FORMAT A30
SELECT
ash.event,
u.username,
ash.sql_id,
ash.sql_plan_hash_value AS plan_hash_value,
ash.module,
ash.program,
ash.machine,
COUNT(*) AS ash_samples,
MIN(ash.sample_time) AS first_seen,
MAX(ash.sample_time) AS last_seen
FROM dba_hist_active_sess_history ash
LEFT JOIN dba_users u
ON u.user_id = ash.user_id
WHERE ash.sample_time >= TRUNC(SYSDATE)
AND ash.session_type = 'FOREGROUND'
AND ash.event IN (
'library cache lock',
'library cache pin',
'row cache lock',
'cursor: pin S wait on X',
'cursor: pin X',
'library cache: mutex X',
'library cache: mutex S'
)
GROUP BY
ash.event,
u.username,
ash.sql_id,
ash.sql_plan_hash_value,
ash.module,
ash.program,
ash.machine
ORDER BY ash_samples DESC;
3) Timeline by 5-minute bucket
This shows when the spike happened.
COL time_bucket FORMAT A20
COL event FORMAT A35
SELECT
TO_CHAR(
TRUNC(ash.sample_time, 'HH24')
+ FLOOR(TO_NUMBER(TO_CHAR(ash.sample_time, 'MI')) / 5) * 5 / 1440,
'YYYY-MM-DD HH24:MI'
) AS time_bucket,
ash.event,
COUNT(*) AS ash_samples
FROM dba_hist_active_sess_history ash
WHERE ash.sample_time >= TRUNC(SYSDATE)
AND ash.session_type = 'FOREGROUND'
AND ash.event IN (
'library cache lock',
'library cache pin',
'row cache lock',
'cursor: pin S wait on X',
'cursor: pin X',
'library cache: mutex X',
'library cache: mutex S'
)
GROUP BY
TRUNC(ash.sample_time, 'HH24')
+ FLOOR(TO_NUMBER(TO_CHAR(ash.sample_time, 'MI')) / 5) * 5 / 1440,
ash.event
ORDER BY 1, 2;
4) If you know the load module or program
If the load was a known app/job, filter it directly. For example:
AND (ash.module LIKE '%GTJ%'
OR ash.program LIKE '%JarLauncher%')
Full example:
SELECT
ash.event,
ash.sql_id,
ash.module,
ash.program,
COUNT(*) AS ash_samples,
MIN(ash.sample_time) AS first_seen,
MAX(ash.sample_time) AS last_seen
FROM dba_hist_active_sess_history ash
WHERE ash.sample_time >= TRUNC(SYSDATE)
AND ash.session_type = 'FOREGROUND'
AND ash.event IN (
'library cache lock',
'library cache pin',
'row cache lock',
'cursor: pin S wait on X',
'cursor: pin X',
'library cache: mutex X',
'library cache: mutex S'
)
AND (ash.module LIKE '%GTJ%' OR ash.program LIKE '%JarLauncher%')
GROUP BY
ash.event,
ash.sql_id,
ash.module,
ash.program
ORDER BY ash_samples DESC;
5) Current in-memory view if it is still happening
If the issue is still active now, use V$ACTIVE_SESSION_HISTORY:
SELECT
ash.sample_time,
ash.session_id,
ash.session_serial#,
ash.sql_id,
ash.sql_plan_hash_value AS plan_hash_value,
ash.event,
ash.wait_class,
ash.module,
ash.program,
ash.machine,
ash.blocking_session
FROM v$active_session_history ash
WHERE ash.sample_time >= SYSTIMESTAMP - INTERVAL '2' HOUR
AND ash.event IN (
'library cache lock',
'library cache pin',
'row cache lock',
'cursor: pin S wait on X',
'cursor: pin X',
'library cache: mutex X',
'library cache: mutex S'
)
ORDER BY ash.sample_time;
6) If you want the broad wait-family summary from system views
This is not load-specific, but it shows whether the instance has accumulated those waits since startup. Note that time_waited and average_wait are reported in centiseconds.
COL event FORMAT A35
SELECT
event,
total_waits,
time_waited,
average_wait
FROM v$system_event
WHERE event IN (
'library cache lock',
'library cache pin',
'row cache lock',
'cursor: pin S wait on X',
'cursor: pin X',
'library cache: mutex X',
'library cache: mutex S'
)
ORDER BY time_waited DESC;
Best practical approach
Run these in this order:
Query 1 → confirm whether library/row cache contention happened this morning
Query 2 → identify user / SQL_ID / module / program
Query 3 → show when the spike occurred
Query 4 → isolate your load job if module/program is known
One important note
For DBA_HIST_ACTIVE_SESS_HISTORY, the sample count is best treated as intensity / evidence, not exact execution count. It is very good for proving that this morning’s load hit:
library cache lock, cursor: pin S wait on X, row cache lock
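If you want to turn those samples into approximate time, a common rule of thumb is that each DBA_HIST_ACTIVE_SESS_HISTORY row represents roughly 10 seconds of DB time (AWR persists one in ten of the one-second in-memory samples). A hedged sketch:

```sql
-- Rough conversion: each DBA_HIST ASH row ~ 10 seconds of DB time.
SELECT event,
       COUNT(*)      AS ash_samples,
       COUNT(*) * 10 AS approx_db_time_sec
FROM   dba_hist_active_sess_history
WHERE  sample_time >= TRUNC(SYSDATE)
  AND  session_type = 'FOREGROUND'
GROUP  BY event
ORDER  BY ash_samples DESC;
```

Treat the result as an order-of-magnitude estimate, not an exact accounting.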
If you want, I can give you one single combined script that prints:
top events
top SQL IDs
top modules/programs
5-minute timeline
for this morning in one shot.
=========================
This output is very useful. It confirms the issue is a SQL-family / session fan-out problem, not one single SQL_ID.
What your result is saying
The top offender appears to be:
Module : JDBC Thin Client
Plan hash : 725220778
Event : resmgr:cpu quantum + ON CPU
Distinct SQLs : 482 to 516 SQL_IDs
ASH samples : 7607 resmgr + 6313 ON CPU
That is the biggest red flag.
It means hundreds of different SQL_IDs are behaving like the same SQL family, probably because the application is generating dynamic SQL variations. Oracle sees many different SQL_IDs, but they collapse into the same force_matching_signature and plan hash.
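You can confirm that collapse directly from the shared pool. This sketch counts SQL_IDs per signature for the suspect module; the HAVING threshold of 10 is an arbitrary illustration:

```sql
-- Confirm many SQL_IDs collapsing into few force-matching signatures.
SELECT force_matching_signature,
       COUNT(DISTINCT sql_id) AS distinct_sql_ids,
       MIN(sql_id)            AS sample_sql_id
FROM   gv$sql
WHERE  module = 'JDBC Thin Client'
  AND  force_matching_signature <> 0
GROUP  BY force_matching_signature
HAVING COUNT(DISTINCT sql_id) > 10
ORDER  BY distinct_sql_ids DESC;
```

A handful of signatures owning hundreds of SQL_IDs is the proof you need.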
So the main troubleshooting target is not one SQL_ID. It is this combination:
MODULE = JDBC Thin Client
FORCE_MATCHING_SIGNATURE = top signature from your first row
PLAN_HASH_VALUE = 725220778
JarLauncher is also contributing, but the first two rows show the JDBC Thin Client SQL family is the biggest pressure point right now.
1. Run this cleaner rollup query
Your current query splits by event. This new one combines all events into one row per SQL family.
SET LINES 280
SET PAGES 200
COLUMN module FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sample_sql_ids FORMAT A140
COLUMN main_events FORMAT A160
WITH ash_base AS (
SELECT ash.module,
NVL(q.force_matching_signature, 0) AS force_matching_signature,
ash.sql_plan_hash_value,
NVL(ash.event, 'ON CPU') AS event,
ash.session_state,
ash.sql_id
FROM gv$active_session_history ash
LEFT JOIN gv$sql q
ON q.inst_id = ash.inst_id
AND q.sql_id = ash.sql_id
AND q.child_number = ash.sql_child_number
WHERE ash.sample_time >= SYSTIMESTAMP - INTERVAL '15' MINUTE
AND ash.session_type = 'FOREGROUND'
AND ash.module IN ('JarLauncher','JDBC Thin Client')
),
grp AS (
SELECT module,
force_matching_signature,
sql_plan_hash_value,
COUNT(*) AS ash_samples,
SUM(CASE WHEN session_state = 'ON CPU' THEN 1 ELSE 0 END) AS on_cpu_samples,
SUM(CASE WHEN event = 'resmgr:cpu quantum' THEN 1 ELSE 0 END) AS resmgr_samples,
SUM(CASE WHEN event = 'row cache mutex' THEN 1 ELSE 0 END) AS row_cache_mutex_samples,
SUM(CASE WHEN event LIKE 'db file%read%' OR event LIKE 'direct path read%' THEN 1 ELSE 0 END) AS io_read_samples,
COUNT(DISTINCT sql_id) AS distinct_sql_ids
FROM ash_base
GROUP BY module,
force_matching_signature,
sql_plan_hash_value
),
sqlids AS (
SELECT module,
force_matching_signature,
sql_plan_hash_value,
LISTAGG(sql_id, ', ' ON OVERFLOW TRUNCATE)
WITHIN GROUP (ORDER BY sql_id) AS sample_sql_ids
FROM (
SELECT DISTINCT module,
force_matching_signature,
sql_plan_hash_value,
sql_id
FROM ash_base
WHERE sql_id IS NOT NULL
)
GROUP BY module,
force_matching_signature,
sql_plan_hash_value
),
events AS (
SELECT module,
force_matching_signature,
sql_plan_hash_value,
LISTAGG(event || '=' || event_samples, ', ' ON OVERFLOW TRUNCATE)
WITHIN GROUP (ORDER BY event_samples DESC) AS main_events
FROM (
SELECT module,
force_matching_signature,
sql_plan_hash_value,
event,
COUNT(*) AS event_samples
FROM ash_base
GROUP BY module,
force_matching_signature,
sql_plan_hash_value,
event
)
GROUP BY module,
force_matching_signature,
sql_plan_hash_value
)
SELECT g.module,
g.force_matching_signature,
g.sql_plan_hash_value,
g.ash_samples,
g.on_cpu_samples,
g.resmgr_samples,
g.row_cache_mutex_samples,
g.io_read_samples,
g.distinct_sql_ids,
e.main_events,
s.sample_sql_ids
FROM grp g
LEFT JOIN sqlids s
ON s.module = g.module
AND s.force_matching_signature = g.force_matching_signature
AND s.sql_plan_hash_value = g.sql_plan_hash_value
LEFT JOIN events e
ON e.module = g.module
AND e.force_matching_signature = g.force_matching_signature
AND e.sql_plan_hash_value = g.sql_plan_hash_value
ORDER BY g.ash_samples DESC;
The top row from this query is your real troubleshooting target.
2. Pull the exact SQL text for the top SQL family
Use the top force_matching_signature and plan_hash_value from the query above.
SET LINES 280
SET PAGES 100
SET LONG 1000000
SET LONGCHUNKSIZE 1000000
COLUMN sql_id FORMAT A15
COLUMN sql_text FORMAT A220
COLUMN module FORMAT A35
SELECT inst_id,
sql_id,
child_number,
plan_hash_value,
force_matching_signature,
module,
parsing_schema_name,
executions,
ROUND(elapsed_time/1e6,2) AS elapsed_sec,
ROUND(cpu_time/1e6,2) AS cpu_sec,
buffer_gets,
disk_reads,
parse_calls,
px_servers_executions,
SUBSTR(sql_text,1,220) AS sql_text
FROM gv$sql
WHERE force_matching_signature = <TOP_FORCE_MATCHING_SIGNATURE>
AND plan_hash_value = <TOP_PLAN_HASH_VALUE>
ORDER BY cpu_time DESC, elapsed_time DESC
FETCH FIRST 20 ROWS ONLY;
For your screenshot, the first target is likely:
MODULE = JDBC Thin Client
PLAN_HASH_VALUE = 725220778
Take the top SQL_ID from this query and display the plan:
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
sql_id => '<TOP_SQL_ID>',
cursor_child_no => NULL,
format => 'ALLSTATS LAST +PEEKED_BINDS +OUTLINE +PARTITION +ALIAS +NOTE'
));
3. Find which host/application is opening these sessions
This is critical because the top module is JDBC Thin Client, not only JarLauncher.
SET LINES 260
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A45
COLUMN machine FORMAT A45
COLUMN service_name FORMAT A30
SELECT s.username,
s.module,
s.program,
s.machine,
s.service_name,
s.status,
COUNT(*) AS session_count
FROM gv$session s
WHERE s.type = 'USER'
AND s.username IN ('FN2GTJSV1_ACPT','FN2GNL2Q1_ACPT','GRBACPT1')
GROUP BY s.username,
s.module,
s.program,
s.machine,
s.service_name,
s.status
ORDER BY session_count DESC;
This tells you which app server / launcher / JDBC pool is flooding the DB.
4. Interpret what you have now
Your current output strongly indicates:
1. The database is CPU saturated.
2. Resource Manager is parking sessions on resmgr:cpu quantum.
3. The top problem is a SQL family, not one SQL_ID.
4. JDBC Thin Client has the biggest ASH footprint.
5. Many SQL_IDs are generated from the same SQL pattern.
6. JarLauncher is also contributing, especially plan hash 3623520496.
So the fix is two-layered:
Layer 1: Reduce application/session concurrency immediately.
Layer 2: Tune or baseline the top SQL family/signature.
5. Immediate control recommendation
For the ACPT simulation, ask the app/team to reduce the fan-out now:
Reduce JDBC/JarLauncher active workers to 25–50.
Do not allow 1000+ connections.
Disable retry storm / reconnect loop.
Run batches in waves.
Keep CPU below 70–80% before increasing concurrency.
Database-side, you can watch with:
SELECT module,
NVL(event, 'ON CPU') AS event,
COUNT(*) AS active_sessions
FROM gv$session
WHERE type = 'USER'
AND status = 'ACTIVE'
GROUP BY module,
NVL(event, 'ON CPU')
ORDER BY active_sessions DESC;
If resmgr:cpu quantum continues to dominate, the system is still overdriven.
Bottom line
Your top troubleshooting target is currently:
JDBC Thin Client
Force matching signature from the top row
Plan hash value 725220778
Hundreds of distinct SQL_IDs
resmgr:cpu quantum + ON CPU
That is the SQL family to extract, plan, and tune. But operationally, the immediate control is still to reduce the number of concurrent JDBC/JarLauncher sessions. Even a good plan will struggle if hundreds of sessions execute the same SQL family at once.
-----------------------------------------------
Now that we know many sessions are causing the CPU/resource-manager pressure, troubleshoot it as a concurrency/fan-out incident, not as a single SQL_ID issue.
The goal is to answer four questions:
1. Who is opening the sessions?
2. Are they new sessions continuously spawning or old sessions stuck?
3. Are the SQLs different SQL_IDs but the same SQL pattern?
4. Is Resource Manager throttling because CPU is exhausted, or is Resource Manager itself misconfigured?
1. First: prove the session fan-out source
Run this to identify the source host/module/program/service.
SET LINES 260
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A45
COLUMN machine FORMAT A45
COLUMN service_name FORMAT A30
COLUMN status FORMAT A12
SELECT username,
module,
program,
machine,
service_name,
status,
COUNT(*) AS session_count
FROM gv$session
WHERE type = 'USER'
GROUP BY username,
module,
program,
machine,
service_name,
status
ORDER BY session_count DESC;
This tells you whether the storm is coming from:
JarLauncher
JDBC Thin Client
Informatica/pmdtm
one app host
one DB service
one schema/user
If one module/host has hundreds of sessions, that is the first root-cause path.
2. Check whether sessions are continuously spawning
This tells you whether the launcher is still opening sessions.
SET LINES 240
SET PAGES 200
COLUMN logon_minute FORMAT A20
COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A45
SELECT TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI') AS logon_minute,
username,
module,
program,
COUNT(*) AS sessions_opened
FROM gv$session
WHERE type = 'USER'
AND username IN ('FN2GTJSV1_ACPT','FN2GNL2Q1_ACPT','GRBACPT1')
GROUP BY TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI'),
username,
module,
program
ORDER BY logon_minute DESC,
sessions_opened DESC;
Interpretation:
| Result | Meaning |
|---|---|
| Many sessions opened in the last few minutes | Active fan-out / launcher still spawning |
| Sessions are old and not growing | Existing workload is stuck or CPU-starved |
| Same host/module keeps appearing | Upstream job/concurrency setting is the culprit |
3. Group active sessions by SQL family, not SQL_ID
Your screenshot shows each SQL_ID has active_sessions = 1. That means SQL_ID alone is misleading. Group by force matching signature + plan hash.
SET LINES 280
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sample_sql_ids FORMAT A140
COLUMN sql_text_sample FORMAT A140
SELECT s.username,
s.module,
s.program,
NVL(q.force_matching_signature, 0) AS force_matching_signature,
s.sql_plan_hash_value,
COUNT(*) AS active_sessions,
COUNT(DISTINCT s.sql_id) AS distinct_sql_ids,
SUM(CASE WHEN s.event = 'resmgr:cpu quantum' THEN 1 ELSE 0 END) AS resmgr_sessions,
SUM(CASE WHEN s.event = 'row cache mutex' THEN 1 ELSE 0 END) AS row_cache_mutex_sessions,
LISTAGG(DISTINCT s.sql_id, ', ' ON OVERFLOW TRUNCATE)
WITHIN GROUP (ORDER BY s.sql_id) AS sample_sql_ids,  -- LISTAGG DISTINCT needs 19c+
SUBSTR(MIN(q.sql_text), 1, 140) AS sql_text_sample
FROM gv$session s
LEFT JOIN gv$sql q
ON q.inst_id = s.inst_id
AND q.sql_id = s.sql_id
AND q.child_number = s.sql_child_number
WHERE s.type = 'USER'
AND s.status = 'ACTIVE'
AND s.username IN ('FN2GTJSV1_ACPT','FN2GNL2Q1_ACPT','GRBACPT1')
GROUP BY s.username,
s.module,
s.program,
NVL(q.force_matching_signature, 0),
s.sql_plan_hash_value
ORDER BY active_sessions DESC;
The top row is your real troubleshooting target.
If you see:
active_sessions = 150
distinct_sql_ids = 150
same plan hash = 3623520496
same module = JarLauncher
then you have one SQL pattern being generated many different ways.
4. Use ASH to find the top offender over time
Current sessions are only a snapshot. ASH shows who consumed database time over the last 15 minutes.
SET LINES 280
SET PAGES 200
COLUMN module FORMAT A35
COLUMN event FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sample_sql_ids FORMAT A140
SELECT ash.module,
NVL(q.force_matching_signature, 0) AS force_matching_signature,
ash.sql_plan_hash_value,
NVL(ash.event, 'ON CPU') AS event,
COUNT(*) AS ash_samples,
SUM(CASE WHEN ash.session_state = 'ON CPU' THEN 1 ELSE 0 END) AS on_cpu_samples,
SUM(CASE WHEN ash.event = 'resmgr:cpu quantum' THEN 1 ELSE 0 END) AS resmgr_samples,
SUM(CASE WHEN ash.event = 'row cache mutex' THEN 1 ELSE 0 END) AS row_cache_mutex_samples,
COUNT(DISTINCT ash.sql_id) AS distinct_sql_ids,
LISTAGG(DISTINCT ash.sql_id, ', ' ON OVERFLOW TRUNCATE)
WITHIN GROUP (ORDER BY ash.sql_id) AS sample_sql_ids
FROM gv$active_session_history ash
LEFT JOIN gv$sql q
ON q.inst_id = ash.inst_id
AND q.sql_id = ash.sql_id
AND q.child_number = ash.sql_child_number
WHERE ash.sample_time >= SYSTIMESTAMP - INTERVAL '15' MINUTE
AND ash.session_type = 'FOREGROUND'
AND ash.module IN ('JarLauncher','JDBC Thin Client')
GROUP BY ash.module,
NVL(q.force_matching_signature, 0),
ash.sql_plan_hash_value,
NVL(ash.event, 'ON CPU')
ORDER BY ash_samples DESC;
This query gives you the real top consumer.
5. Check Resource Manager impact
resmgr:cpu quantum means sessions are waiting because CPU is being rationed. It is usually a symptom of CPU saturation, not the original root cause.
SET LINES 260
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A40
COLUMN current_consumer_group FORMAT A35
COLUMN state FORMAT A25
SELECT s.username,
s.module,
s.program,
r.current_consumer_group,
r.state,
COUNT(*) AS sessions,
ROUND(SUM(r.cpu_wait_time)/1000,2) AS cpu_wait_sec,
ROUND(SUM(r.consumed_cpu_time)/1000,2) AS consumed_cpu_sec,
SUM(r.yields) AS yields
FROM gv$session s
JOIN gv$rsrc_session_info r
ON r.inst_id = s.inst_id
AND r.sid = s.sid
WHERE s.type = 'USER'
AND s.status = 'ACTIVE'
GROUP BY s.username,
s.module,
s.program,
r.current_consumer_group,
r.state
ORDER BY cpu_wait_sec DESC;
If JarLauncher has most of the cpu_wait_sec, the workload is overdriving the CPU allocation.
6. Check whether this is a parse / row-cache storm
Your screenshots also show row cache mutex. With hundreds of generated SQLs, this can happen from hard parse / metadata / dictionary contention.
SET LINES 260
SET PAGES 200
COLUMN module FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sql_text_sample FORMAT A140
SELECT module,
force_matching_signature,
plan_hash_value,
COUNT(DISTINCT sql_id) AS distinct_sql_ids,
COUNT(*) AS cursor_count,
SUM(parse_calls) AS parse_calls,
SUM(executions) AS executions,
ROUND(SUM(elapsed_time)/1e6,2) AS elapsed_sec,
ROUND(SUM(cpu_time)/1e6,2) AS cpu_sec,
SUM(buffer_gets) AS buffer_gets,
SUBSTR(MIN(sql_text),1,140) AS sql_text_sample
FROM gv$sql
WHERE module = 'JarLauncher'
GROUP BY module,
force_matching_signature,
plan_hash_value
ORDER BY parse_calls DESC, cpu_sec DESC
FETCH FIRST 30 ROWS ONLY;
Red flags:
distinct_sql_ids high
parse_calls high
executions low
row cache mutex / library cache waits
That means the application is generating too many distinct SQL statements rather than reusing stable bind SQL.
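To quantify that system-wide, compare total versus hard parses and check the CURSOR_SHARING setting. CURSOR_SHARING=FORCE can be a stopgap for literal-heavy SQL, but bind reuse in the application is the proper fix; this is only a diagnostic sketch:

```sql
-- Hard-parse pressure at the instance level.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('parse count (total)', 'parse count (hard)');

-- Current literal-replacement setting.
SELECT value AS cursor_sharing
FROM   v$parameter
WHERE  name = 'cursor_sharing';
```

A hard-parse count that tracks close to the total parse count, combined with the row cache / library cache waits above, confirms a parse storm.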
7. Use SQL Monitor to find the heavy active plan family
SET LINES 280
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN sample_sql_ids FORMAT A140
SELECT m.username,
m.module,
m.sql_plan_hash_value,
COUNT(*) AS monitored_execs,
COUNT(DISTINCT m.sql_id) AS distinct_sql_ids,
ROUND(SUM(m.elapsed_time)/1e6,2) AS total_elapsed_sec,
ROUND(SUM(m.cpu_time)/1e6,2) AS total_cpu_sec,
SUM(m.buffer_gets) AS total_buffer_gets,
ROUND(SUM(m.physical_read_bytes)/1024/1024/1024,2) AS physical_read_gb,
MAX(m.px_servers_requested) AS max_px_requested,
MAX(m.px_servers_allocated) AS max_px_allocated,
LISTAGG(DISTINCT m.sql_id, ', ' ON OVERFLOW TRUNCATE)
WITHIN GROUP (ORDER BY m.sql_id) AS sample_sql_ids
FROM gv$sql_monitor m
WHERE m.status = 'EXECUTING'
AND m.module = 'JarLauncher'
GROUP BY m.username,
m.module,
m.sql_plan_hash_value
ORDER BY total_cpu_sec DESC;
If one plan hash dominates, pick one SQL_ID from that row and get the live plan:
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
sql_id => '<TOP_SQL_ID>',
cursor_child_no => NULL,
format => 'ALLSTATS LAST +PEEKED_BINDS +OUTLINE +PARTITION +ALIAS +NOTE'
));
8. Determine the actual root cause category
Use this decision table.
| Evidence | Root cause direction | Fix |
|---|---|---|
| Hundreds of normal JarLauncher sessions | App/job fan-out | Reduce launcher threads / connection pool |
| Each SQL_ID has 1 session but same signature/plan | Dynamic SQL variants | Force matching, bind reuse, SQL family baseline/patch |
| resmgr:cpu quantum dominates | CPU saturated / Resource Manager throttling | Reduce concurrency; adjust consumer group only after review |
| row cache mutex / library cache lock | Hard parse / dictionary contention | Reduce SQL variants, avoid DDL/stats during run, bind properly |
| PX slaves high | Parallelism storm | Disable/cap parallel query |
| Same plan hash but bad under load | Concurrency issue, not plan issue | Throttle sessions |
| Different bad/good plan hash | Plan regression | SQL Plan Baseline / SQL Patch / stats |
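If the decision table points at plan regression, the good plan can be pinned with a SQL Plan Baseline. A minimal sketch, using the same placeholder convention as above (`<TOP_SQL_ID>` and `<GOOD_PLAN_HASH_VALUE>` are the values you identified, not literals):

```sql
-- Capture the known-good plan from the cursor cache as a baseline.
SET SERVEROUTPUT ON
DECLARE
  l_plans PLS_INTEGER;
BEGIN
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
               sql_id          => '<TOP_SQL_ID>',
               plan_hash_value => <GOOD_PLAN_HASH_VALUE>
             );
  DBMS_OUTPUT.PUT_LINE('Baselines captured: ' || l_plans);
END;
/
```

For a SQL family with hundreds of SQL_IDs, a baseline on one SQL_ID only protects that statement; the family-wide fix remains bind reuse or force matching.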
9. What to collect for developers/app team
You need this evidence package:
1. Session count by module/program/machine.
2. Logon-time histogram showing session ramp-up.
3. Top force_matching_signature + plan_hash_value.
4. Top SQL Monitor plan hash and CPU/elapsed time.
5. Resource Manager CPU wait by consumer group.
6. Row cache/library cache wait evidence if present.
7. PX slave count showing whether parallelism is or is not the issue.
That tells the app team exactly whether the issue is:
connection pool max too high
launcher thread count too high
batch fan-out too aggressive
retry/reconnect storm
dynamic SQL generation
bad SQL plan
parallelism
10. Bottom line
Since you already know many sessions are causing it, the next troubleshooting step is not picking one SQL_ID from the 266 rows.
The correct method is:
Group sessions by module + force_matching_signature + plan_hash_value.
Find the SQL family with the highest active sessions / ASH samples.
Confirm whether sessions are continuously spawning.
Then decide whether the fix is app concurrency, SQL baseline, bind reuse, Resource Manager, or parallelism.
For the current ACPT symptom, based on your screenshots, the leading diagnosis is:
JarLauncher is generating too many concurrent normal sessions.
Those sessions are CPU-starved and parked on resmgr:cpu quantum.
The SQL_IDs are fragmented, so troubleshoot by SQL family/signature, not individual SQL_ID.
===============================
It is opening ~1400 sessions because the application/job layer is fanning out connections, not because Oracle independently decides to create 1400 sessions.
From your screenshots, the important proof is:
Session type : mostly NORMAL_SESSION
Module : JarLauncher / JDBC Thin Client
Wait : resmgr:cpu quantum
PX evidence : only one PX SQL with DOP 8 / 16 PX slaves
CPU : ~99.9%
Connections : ~1448
So today’s ACPT issue is not mainly “one SQL opened 1400 PX slaves.” It is more like:
JarLauncher started hundreds/thousands of worker executions
→ each worker opened or held a JDBC database session
→ CPU hit 99–100%
→ Resource Manager parked many sessions on resmgr:cpu quantum
→ sessions stayed open longer
→ more work kept arriving
→ connection count piled up to ~1400
Most likely causes
1. JarLauncher job concurrency is too high
The launcher may be splitting the workload into hundreds/thousands of parallel tasks. Each task gets its own JDBC session.
Typical causes:
parallel thread count too high
batch split count too high
partition/chunk count too high
multiple job instances launched at same time
multiple environments/users launching same workload
This is the most likely based on your screenshots.
2. Connection pool maximum is too high or uncontrolled
If the app uses JDBC pooling, the pool may allow hundreds or thousands of open connections.
Examples:
Hikari maxPoolSize too high
Tomcat JDBC maxActive too high
WebLogic/JBoss datasource max capacity too high
Informatica session/grid concurrency too high
If the pool max is 1500, Oracle will accept 1400 sessions until CPU collapses.
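As a reference point for the app team, a sanely sized pool is far smaller than the observed 1400 connections. This is an illustrative HikariCP fragment; the property names are real HikariCP settings, the values are examples only and should be sized to CPU count, not to workload fan-out:

```properties
# Illustrative HikariCP sizing (values are examples, not recommendations):
maximumPoolSize=40
minimumIdle=10
connectionTimeout=30000
maxLifetime=1800000
```

The equivalent caps exist in Tomcat JDBC (maxActive), WebLogic/JBoss datasources, and Informatica grid settings.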
3. Retry storm / timeout amplification
Once CPU hits 100%, each SQL runs slower. The app may think calls are stuck or timed out and launches more sessions/retries.
Bad pattern:
SQL slows down
app retry starts
old DB session still running
new DB session starts same work
CPU gets worse
more retries happen
This is how 200 sessions become 1400 very quickly.
4. Sessions are not closing fast enough
Even if the app is not intentionally launching 1400 at once, sessions may be retained because they are waiting on CPU:
resmgr:cpu quantum
That wait means Resource Manager is parking sessions because CPU is over capacity. The sessions remain connected while waiting, so the connection count builds up.
Resource Manager is not opening the sessions. It is only exposing the symptom that too many sessions are competing for CPU.
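To see the Resource Manager side of this minute by minute, you can query the metric history, assuming `v$rsrcmgrmetric_history` is available on your version (CPU_WAIT_TIME is in milliseconds):

```sql
-- Per-minute Resource Manager pressure by consumer group.
SELECT begin_time,
       consumer_group_name,
       avg_running_sessions,
       avg_waiting_sessions,
       ROUND(cpu_wait_time/1000, 2) AS cpu_wait_sec
FROM   v$rsrcmgrmetric_history
ORDER  BY begin_time DESC, cpu_wait_sec DESC;
```

A large and growing avg_waiting_sessions against a flat avg_running_sessions is the signature of CPU rationing, not of Resource Manager misconfiguration.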
Prove exactly who is opening them
Run this first.
SET LINES 240
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN program FORMAT A45
COLUMN machine FORMAT A45
COLUMN status FORMAT A12
SELECT username,
module,
program,
machine,
status,
COUNT(*) AS session_count
FROM gv$session
WHERE type = 'USER'
GROUP BY username,
module,
program,
machine,
status
ORDER BY session_count DESC;
This will show the source host/module responsible for the 1400 connections.
Show when the sessions were opened
SET LINES 240
SET PAGES 200
COLUMN logon_minute FORMAT A20
COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN program FORMAT A45
SELECT TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI') AS logon_minute,
username,
module,
program,
COUNT(*) AS sessions_opened
FROM gv$session
WHERE type = 'USER'
GROUP BY TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI'),
username,
module,
program
ORDER BY logon_minute DESC,
sessions_opened DESC;
If you see hundreds opened in the same few minutes, that is a launcher/pool burst.
Show active vs inactive sessions
SET LINES 240
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN program FORMAT A45
SELECT username,
module,
program,
status,
COUNT(*) AS session_count
FROM gv$session
WHERE type = 'USER'
GROUP BY username,
module,
program,
status
ORDER BY session_count DESC;
Interpretation:
Mostly ACTIVE = too many concurrent running workers
Mostly INACTIVE = connection pool/leak/not closing sessions
Show top SQLs being executed by those sessions
SET LINES 240
SET PAGES 200
COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN event FORMAT A35
COLUMN sql_id FORMAT A15
SELECT username,
module,
sql_id,
NVL(event, 'ON CPU') AS event,
COUNT(*) AS active_sessions
FROM gv$session
WHERE type = 'USER'
AND status = 'ACTIVE'
GROUP BY username,
module,
sql_id,
NVL(event, 'ON CPU')
ORDER BY active_sessions DESC;
This tells you whether 1400 sessions are all executing the same SQL family or many different SQLs.
Check Oracle session/process limits
SET LINES 200
SET PAGES 100
SELECT resource_name,
current_utilization,
max_utilization,
initial_allocation,
limit_value
FROM v$resource_limit
WHERE resource_name IN ('sessions','processes')
ORDER BY resource_name;
This does not explain why they opened, but it proves how close the DB is to hard limits.
What to ask the app/job team
Ask them these exact questions:
1. What is the JarLauncher max thread count for this run?
2. How many job chunks/partitions are being launched?
3. How many job instances are running at the same time?
4. What is the JDBC connection pool max size?
5. Are retries enabled when SQL execution or connection acquisition slows down?
6. Does each worker open a separate DB connection?
7. Are connections closed after each chunk or held until the full job completes?
The likely answer will be one of these:
thread count is too high
pool max is too high
job was launched multiple times
retry logic is multiplying sessions
connections are held for too long
Bottom line
The database is not “creating” 1400 sessions by itself. The application side is opening them. Oracle is only showing the result.
Based on today’s evidence, the primary root cause is:
JarLauncher/application concurrency is too high, causing a normal-session flood.
CPU reaches 100%, Resource Manager throttles sessions, and the backlog causes the connection count to remain around 1400.
For tomorrow/prod, the fix is to cap JarLauncher/job concurrency and JDBC pool size, then keep the SQL baseline/plan fix as a secondary safeguard.
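If the app team cannot cap quickly, a database-side backstop is a profile limit on the batch schemas. This is a sketch only: the profile name and the 100-session cap are illustrative, and it should be tested in ACPT before prod:

```sql
-- Backstop: cap sessions per batch user while the app fix lands.
CREATE PROFILE batch_session_cap LIMIT SESSIONS_PER_USER 100;
ALTER USER FN2GTJSV1_ACPT PROFILE batch_session_cap;
-- SESSIONS_PER_USER is only enforced when resource limits are enabled:
ALTER SYSTEM SET resource_limit = TRUE;
```

Note this makes excess connection attempts fail with ORA-02391 rather than queue, so the application must handle the rejection gracefully.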
=====================
Yes. Use one common STS loaded from all three AWR windows, then run one SQL Tuning Advisor task per SQL_ID. Do not run one giant Advisor task for all SQLs together; it will likely time out and become hard to interpret.
Also, important correction from your screenshot: remove this condition:
OR s.elapsed_time_delta > 0
That will pull almost every SQL from those AWR windows. For STS, keep the filter only on your watchlist SQL IDs.
The script below excludes 9kfty957z5y3 because you marked it NOT NEEDED. Add it back if required.
1. Create one STS for all selected SQL IDs
SET SERVEROUTPUT ON SIZE UNLIMITED
DECLARE
l_sts_name VARCHAR2(30) := 'STS_JOB_WATCH_AWR';
BEGIN
BEGIN
DBMS_SQLTUNE.DROP_SQLSET(
sqlset_name => l_sts_name,
sqlset_owner => USER
);
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
DBMS_SQLTUNE.CREATE_SQLSET(
sqlset_name => l_sts_name,
description => 'Watchlist SQLs from BAD1, BAD2, and GOOD AWR windows'
);
DBMS_OUTPUT.PUT_LINE('Created STS: ' || l_sts_name);
END;
/
2. Load BAD1, BAD2, and GOOD windows into the STS
This loads all selected SQLs and all available plan hashes from those windows.
SET SERVEROUTPUT ON SIZE UNLIMITED
DECLARE
l_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
l_sts_name VARCHAR2(30) := 'STS_JOB_WATCH_AWR';
l_filter VARCHAR2(4000) :=
q'[sql_id IN (
'1mufpt4a9awyx',
'f2y77p2gtdg41',
'7r1gn0tpbwjjj',
'fsgs2mtg3tfpx',
'6wa1frsf52nsy'
)]';
PROCEDURE load_window(
p_label IN VARCHAR2,
p_begin_snap IN NUMBER,
p_end_snap IN NUMBER
)
IS
BEGIN
DBMS_OUTPUT.PUT_LINE('Loading ' || p_label || ' snaps ' || p_begin_snap || ' to ' || p_end_snap);
OPEN l_cursor FOR
SELECT VALUE(p)
FROM TABLE(
DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
begin_snap => p_begin_snap,
end_snap => p_end_snap,
basic_filter => l_filter,
ranking_measure1 => 'elapsed_time',
result_limit => 100,
attribute_list => 'ALL',
dbid => 352148541
)
) p;
DBMS_SQLTUNE.LOAD_SQLSET(
sqlset_name => l_sts_name,
populate_cursor => l_cursor,
load_option => 'MERGE'
);
DBMS_OUTPUT.PUT_LINE('Loaded ' || p_label);
END;
BEGIN
load_window('BAD1_APR6_23_TO_APR7_03', 49153, 49160);
load_window('BAD2_APR7_15_TO_21', 49184, 49196);
load_window('GOOD_APR8_00_TO_01', 49202, 49204);
DBMS_OUTPUT.PUT_LINE('Completed STS load: ' || l_sts_name);
END;
/
3. Validate what loaded into the STS
SET LINES 240
SET PAGES 200
COLUMN sqlset_name FORMAT A25
COLUMN sql_id FORMAT A15
COLUMN sql_text_sample FORMAT A90
SELECT sqlset_name,
sql_id,
plan_hash_value,
SUM(executions) AS executions,
ROUND(SUM(elapsed_time)/1e6, 2) AS elapsed_sec,
ROUND(SUM(cpu_time)/1e6, 2) AS cpu_sec,
SUM(buffer_gets) AS buffer_gets,
SUM(disk_reads) AS disk_reads,
MAX(DBMS_LOB.SUBSTR(sql_text, 90, 1)) AS sql_text_sample  -- sql_text is a CLOB here; aggregate a VARCHAR2 slice
FROM dba_sqlset_statements
WHERE sqlset_name = 'STS_JOB_WATCH_AWR'
GROUP BY sqlset_name,
sql_id,
plan_hash_value
ORDER BY sql_id,
elapsed_sec DESC;
For 7r1gn0tpbwjjj, you should see both:
2340129260 -- bad plan
3257229599 -- good/recommended plan
4. Create SQL Tuning Advisor tasks for each SQL_ID
This creates one Advisor task per SQL; it does not execute anything yet.
SET SERVEROUTPUT ON SIZE UNLIMITED
DECLARE
l_task_name VARCHAR2(128);
BEGIN
FOR r IN (
SELECT DISTINCT sql_id
FROM dba_sqlset_statements
WHERE sqlset_name = 'STS_JOB_WATCH_AWR'
ORDER BY sql_id
)
LOOP
l_task_name := 'STA_' || UPPER(r.sql_id) || '_AWR';
BEGIN
DBMS_SQLTUNE.DROP_TUNING_TASK(task_name => l_task_name);
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
l_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
sqlset_name => 'STS_JOB_WATCH_AWR',
sqlset_owner => USER,
basic_filter => 'sql_id = ''' || r.sql_id || '''',
scope => DBMS_SQLTUNE.SCOPE_COMPREHENSIVE,
time_limit => 7200,
task_name => l_task_name,
description => 'SQL Tuning Advisor from AWR STS for SQL_ID ' || r.sql_id
);
DBMS_OUTPUT.PUT_LINE('Created task: ' || l_task_name);
END LOOP;
END;
/
5. Execute the Advisor tasks one by one
Do not execute everything blindly if you are short on time. Start with the most important SQLs.
Execute only the critical SQL first
BEGIN
DBMS_SQLTUNE.EXECUTE_TUNING_TASK(
task_name => 'STA_7R1GN0TPBWJJJ_AWR'
);
END;
/
Then run the others one at a time
BEGIN
DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_1MUFPT4A9AWYX_AWR');
END;
/
BEGIN
DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_F2Y77P2GTDG41_AWR');
END;
/
BEGIN
DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_FSGS2MTG3TFPX_AWR');
END;
/
BEGIN
DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_6WA1FRSF52NSY_AWR');
END;
/
If you want to include 9kfty957z5y3, create a separate task for it after adding it to the STS filter and reloading the STS.
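A sketch of that extra load, assuming the same DBID and that the existing snapshot range covers the SQL (adjust begin_snap/end_snap if it does not):
DECLARE
l_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
BEGIN
OPEN l_cursor FOR
SELECT VALUE(p)
FROM TABLE(
DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
begin_snap => 49153,
end_snap => 49196,
basic_filter => q'[sql_id = '9kfty957z5y3']',
attribute_list => 'ALL',
dbid => 352148541
)
) p;
DBMS_SQLTUNE.LOAD_SQLSET(
sqlset_name => 'STS_JOB_WATCH_AWR',
populate_cursor => l_cursor,
load_option => 'MERGE'
);
END;
/
MERGE keeps the statements already loaded; then create the per-SQL task exactly as in step 4.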
6. Check Advisor task status
SET LINES 240
SET PAGES 200
COLUMN task_name FORMAT A35
COLUMN status FORMAT A15
COLUMN execution_start FORMAT A30
COLUMN execution_end FORMAT A30
COLUMN status_message FORMAT A100
SELECT task_name,
status,
execution_start,
execution_end,
status_message
FROM dba_advisor_log
WHERE task_name LIKE 'STA_%_AWR'
ORDER BY execution_start DESC;
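While a comprehensive task is still running, progress can also be watched live. This is a hedged sketch using V$ADVISOR_PROGRESS (column availability can vary by version):
SELECT t.task_name,
p.sofar,
p.totalwork,
p.time_remaining
FROM v$advisor_progress p
JOIN dba_advisor_tasks t
ON t.task_id = p.task_id
WHERE t.task_name LIKE 'STA_%_AWR';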
7. Generate Advisor report for each SQL
Example for 7r1gn0tpbwjjj:
SET LONG 10000000
SET LONGCHUNKSIZE 10000000
SET LINES 240
SET PAGES 0
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK(
task_name => 'STA_7R1GN0TPBWJJJ_AWR',
type => 'TEXT',
level => 'ALL'
) AS report
FROM dual;
For all reports, change the task name:
STA_1MUFPT4A9AWYX_AWR
STA_F2Y77P2GTDG41_AWR
STA_7R1GN0TPBWJJJ_AWR
STA_FSGS2MTG3TFPX_AWR
STA_6WA1FRSF52NSY_AWR
8. Pull high-level Advisor findings for all tasks
SET LINES 240
SET PAGES 200
COLUMN task_name FORMAT A35
COLUMN status FORMAT A15
COLUMN finding_name FORMAT A55
COLUMN rec_type FORMAT A35
COLUMN benefit FORMAT 999999999999
COLUMN command FORMAT A120
SELECT t.task_name,
t.status,
f.finding_name,
r.type AS rec_type,
r.benefit,
SUBSTR(a.command, 1, 120) AS command
FROM dba_advisor_tasks t
LEFT JOIN dba_advisor_findings f
ON f.task_id = t.task_id
LEFT JOIN dba_advisor_recommendations r
ON r.task_id = t.task_id
AND r.finding_id = f.finding_id
LEFT JOIN dba_advisor_actions a
ON a.task_id = t.task_id
AND a.rec_id = r.rec_id
WHERE t.task_name LIKE 'STA_%_AWR'
ORDER BY t.task_name,
r.benefit DESC;
9. For tomorrow, do not wait for all Advisor tasks
For immediate production protection, prioritize:
1. SQL_ID 7r1gn0tpbwjjj
Good plan hash: 3257229599
Bad plan hash : 2340129260
2. SQL_ID 1mufpt4a9awyx
High reads/gets and AppFirewallCore impact
3. SQL_ID f2y77p2gtdg41
Very high elapsed time
4. SQL_ID fsgs2mtg3tfpx
Repeated long elapsed SQL
5. SQL_ID 6wa1frsf52nsy
High CPU in good window
For 7r1gn0tpbwjjj, the Advisor has already confirmed the best action: create a SQL Plan Baseline for plan hash 3257229599 and control PX/concurrency. The STS/Advisor work for the other SQLs is for deeper tuning, not for tomorrow’s first-line protection.
------------------------------------------------------
This result is expected for a heavy historical AWR SQL. The task did execute, but SQL Tuning Advisor timed out:
Completion Status : INTERRUPTED
Error : ORA-13639: The current operation was interrupted because it timed out
Number of SQLs with Timeouts : 1
This is not an Oracle outage or bug. It means the Advisor could not finish its deep analysis within the 1500-second time limit.
For tomorrow’s job, do not wait on SQL Tuning Advisor as the primary fix. You already found the immediate control point:
Bad plan hash : 2340129260
Good plan hash: 3257229599
Risk driver : PX / DOP / session explosion
So the immediate fix remains:
1. Load fixed baseline for good plan hash 3257229599.
2. Control/disable parallelism for this SQL/job.
3. Reduce job concurrency.
4. Monitor PX sessions live.
SQL Tuning Advisor can continue in parallel, but it is not the outage-prevention mechanism for tomorrow.
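For point 4 (monitor PX sessions live), a simple per-coordinator count from GV$PX_SESSION is enough to spot a session explosion early:
SELECT inst_id,
qcinst_id,
qcsid,
COUNT(*) AS px_slaves
FROM gv$px_session
GROUP BY inst_id, qcinst_id, qcsid
ORDER BY px_slaves DESC;
If one QCSID owns far more slaves than expected for this job, that is the session to throttle first.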
1. First check whether Advisor produced partial findings
Even with timeout, sometimes Oracle still captures useful findings.
SET LINES 240
SET PAGES 200
COLUMN task_name FORMAT A35
COLUMN status FORMAT A15
COLUMN finding_name FORMAT A50
COLUMN type FORMAT A30
COLUMN benefit FORMAT 999999999999
COLUMN command FORMAT A120
SELECT t.task_name,
t.status,
f.finding_name,
r.type,
r.benefit,
SUBSTR(a.command, 1, 120) AS command
FROM dba_advisor_tasks t
LEFT JOIN dba_advisor_findings f
ON f.task_id = t.task_id
LEFT JOIN dba_advisor_recommendations r
ON r.task_id = t.task_id
AND r.finding_id = f.finding_id
LEFT JOIN dba_advisor_actions a
ON a.task_id = t.task_id
AND a.rec_id = r.rec_id
WHERE t.task_name = 'STA_7R1GN_BAD1_STS'
ORDER BY r.benefit DESC;
Also check the task log:
SET LINES 240
SET PAGES 200
COLUMN task_name FORMAT A35
COLUMN status FORMAT A20
COLUMN status_message FORMAT A100
SELECT task_name,
status,
execution_start,
execution_end,
status_message
FROM dba_advisor_log
WHERE task_name = 'STA_7R1GN_BAD1_STS'
ORDER BY execution_start DESC;
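Alternatively, because the task was created from an STS, the interrupted run can usually be resumed rather than recreated. A hedged sketch (RESUME_TUNING_TASK applies only to STS-based tasks):
BEGIN
DBMS_SQLTUNE.SET_TUNING_TASK_PARAMETER(
task_name => 'STA_7R1GN_BAD1_STS',
parameter => 'TIME_LIMIT',
value => 7200
);
DBMS_SQLTUNE.RESUME_TUNING_TASK(task_name => 'STA_7R1GN_BAD1_STS');
END;
/
This picks up the existing task with a larger time budget instead of starting a brand-new task.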
2. Re-run Advisor with higher time limit
Since the 25-minute (1500-second) run timed out, create a clean second task with a higher limit. For this SQL, I would use 2 hours if running it off-window.
SET SERVEROUTPUT ON SIZE UNLIMITED
DECLARE
l_task_name VARCHAR2(128);
BEGIN
l_task_name := 'STA_7R1GN_BAD1_STS_2H';
BEGIN
DBMS_SQLTUNE.DROP_TUNING_TASK(task_name => l_task_name);
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
l_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
sqlset_name => 'STS_7R1GN_BAD1',
sqlset_owner => 'RDSADM',
basic_filter => q'[sql_id = '7r1gn0tpbwjjj' and plan_hash_value = 2340129260]',
scope => DBMS_SQLTUNE.SCOPE_COMPREHENSIVE,
time_limit => 7200,
task_name => l_task_name,
description => '2-hour SQL Tuning Advisor for bad plan 2340129260'
);
DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task_name);
DBMS_OUTPUT.PUT_LINE('Executed task: ' || l_task_name);
END;
/
Report:
SET LONG 10000000
SET LONGCHUNKSIZE 10000000
SET LINES 240
SET PAGES 0
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK(
task_name => 'STA_7R1GN_BAD1_STS_2H',
type => 'TEXT',
level => 'ALL'
) AS report
FROM dual;
3. Quick Advisor run for tomorrow readiness
For a fast pre-check, run the Advisor with limited scope. This will not do the full deep SQL Profile analysis, but it can quickly catch stats/index/access-path recommendations.
SET SERVEROUTPUT ON SIZE UNLIMITED
DECLARE
l_task_name VARCHAR2(128);
BEGIN
l_task_name := 'STA_7R1GN_BAD1_LIMITED';
BEGIN
DBMS_SQLTUNE.DROP_TUNING_TASK(task_name => l_task_name);
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
l_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
sqlset_name => 'STS_7R1GN_BAD1',
sqlset_owner => 'RDSADM',
basic_filter => q'[sql_id = '7r1gn0tpbwjjj' and plan_hash_value = 2340129260]',
scope => DBMS_SQLTUNE.SCOPE_LIMITED,
time_limit => 600,
task_name => l_task_name,
description => 'Quick limited SQL Tuning Advisor for 7r1gn0tpbwjjj'
);
DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task_name);
DBMS_OUTPUT.PUT_LINE('Executed task: ' || l_task_name);
END;
/
Report:
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK(
task_name => 'STA_7R1GN_BAD1_LIMITED',
type => 'TEXT',
level => 'ALL'
) AS report
FROM dual;
4. Do not make tomorrow dependent on Advisor
For tomorrow, the hard control should be this:
-- Good historical plan hash
3257229599
-- Bad historical plan hash
2340129260
Load the good plan as fixed baseline:
SET SERVEROUTPUT ON SIZE UNLIMITED
DECLARE
l_loaded PLS_INTEGER;
BEGIN
l_loaded := DBMS_SPM.LOAD_PLANS_FROM_AWR(
begin_snap => 49184,
end_snap => 49196,
basic_filter => q'[sql_id = '7r1gn0tpbwjjj' and plan_hash_value = 3257229599]',
fixed => 'YES',
enabled => 'YES'
);
DBMS_OUTPUT.PUT_LINE('Plans loaded: ' || l_loaded);
END;
/
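Immediately confirm that the baseline landed and is ENABLED/ACCEPTED/FIXED. DBA_SQL_PLAN_BASELINES has no sql_id column, so filter by creation time (or a sql_text fragment):
SET LINES 240
COLUMN sql_handle FORMAT A25
COLUMN plan_name FORMAT A35
SELECT sql_handle,
plan_name,
origin,
enabled,
accepted,
fixed,
created
FROM dba_sql_plan_baselines
WHERE created > SYSDATE - 1
ORDER BY created DESC;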
Then verify live tomorrow:
SET LINES 240
SET PAGES 200
COLUMN sql_profile FORMAT A40
COLUMN sql_patch FORMAT A40
COLUMN sql_plan_baseline FORMAT A40
COLUMN module FORMAT A45
SELECT inst_id,
sql_id,
child_number,
plan_hash_value,
sql_profile,
sql_patch,
sql_plan_baseline,
module,
executions,
elapsed_time/1000000 AS elapsed_sec,
cpu_time/1000000 AS cpu_sec,
buffer_gets,
disk_reads,
px_servers_executions
FROM gv$sql
WHERE sql_id = '7r1gn0tpbwjjj'
ORDER BY inst_id, child_number;
Expected tomorrow:
plan_hash_value should be 3257229599
plan_hash_value should NOT be 2340129260
PX_SERVERS_EXECUTIONS should not explode
5. My recommendation now
Use this decision path:
For tomorrow outage prevention:
Baseline good plan hash 3257229599.
Cap/disable parallelism for this job SQL.
Reduce job concurrency.
For root-cause tuning:
Run SQL Tuning Advisor with 7200 seconds off-window.
Review recommendations.
Do not blindly accept new SQL Profile if it increases parallelism.
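If the job SQL itself cannot be changed, one hedged option for the parallelism cap is a NO_PARALLEL SQL patch. This assumes 12.2 or later, where DBMS_SQLDIAG.CREATE_SQL_PATCH accepts a sql_id and returns the patch name; the patch name below is illustrative. Test it together with the baseline before the run:
SET SERVEROUTPUT ON SIZE UNLIMITED
DECLARE
l_patch VARCHAR2(128);
BEGIN
l_patch := DBMS_SQLDIAG.CREATE_SQL_PATCH(
sql_id => '7r1gn0tpbwjjj',
hint_text => 'NO_PARALLEL',
name => 'PATCH_7R1GN_NOPX',
description => 'Force serial execution for the job SQL'
);
DBMS_OUTPUT.PUT_LINE('Created patch: ' || l_patch);
END;
/
Drop it with DBMS_SQLDIAG.DROP_SQL_PATCH('PATCH_7R1GN_NOPX') if it causes a regression.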
Meeting-ready wording:
SQL Tuning Advisor was executed successfully from the historical AWR SQL Tuning Set, but the comprehensive analysis timed out after 1500 seconds. That confirms this is a heavy SQL optimization case, but it does not block tomorrow’s mitigation. We already identified the actionable risk: bad plan hash 2340129260 and high PX usage. For tomorrow, we should stabilize with the good plan hash 3257229599, control parallelism/concurrency, and continue the extended Advisor run off-window for permanent tuning recommendations.