Sunday, May 3, 2026

AWR Analysis



Use ASH first. That is the cleanest way to see whether this morning’s load hit library cache or row cache contention.

1) This morning summary from historical ASH

This gives you the contention picture for today from midnight onward.

COL event           FORMAT A35
COL wait_class      FORMAT A15
COL first_seen      FORMAT A20
COL last_seen       FORMAT A20

SELECT
    ash.event,
    ash.wait_class,
    COUNT(*) AS ash_samples,
    COUNT(DISTINCT ash.session_id || ':' || ash.session_serial# || ':' || ash.instance_number) AS distinct_sessions,
    COUNT(DISTINCT ash.sql_id) AS distinct_sqls,
    MIN(ash.sample_time) AS first_seen,
    MAX(ash.sample_time) AS last_seen
FROM dba_hist_active_sess_history ash
WHERE ash.sample_time >= TRUNC(SYSDATE)
  AND ash.session_type = 'FOREGROUND'
  AND ash.event IN (
        'library cache lock',
        'library cache pin',
        'row cache lock',
        'cursor: pin S wait on X',
        'cursor: pin X',
        'library cache: mutex X',
        'library cache: mutex S'
      )
GROUP BY
    ash.event,
    ash.wait_class
ORDER BY ash_samples DESC;

2) Top SQLs causing those waits

This is usually the most useful report.

COL event           FORMAT A35
COL username        FORMAT A20
COL module          FORMAT A25
COL program         FORMAT A30

SELECT
    ash.event,
    u.username,
    ash.sql_id,
    ash.sql_plan_hash_value AS plan_hash_value,
    ash.module,
    ash.program,
    ash.machine,
    COUNT(*) AS ash_samples,
    MIN(ash.sample_time) AS first_seen,
    MAX(ash.sample_time) AS last_seen
FROM dba_hist_active_sess_history ash
LEFT JOIN dba_users u
       ON u.user_id = ash.user_id
WHERE ash.sample_time >= TRUNC(SYSDATE)
  AND ash.session_type = 'FOREGROUND'
  AND ash.event IN (
        'library cache lock',
        'library cache pin',
        'row cache lock',
        'cursor: pin S wait on X',
        'cursor: pin X',
        'library cache: mutex X',
        'library cache: mutex S'
      )
GROUP BY
    ash.event,
    u.username,
    ash.sql_id,
    ash.sql_plan_hash_value,
    ash.module,
    ash.program,
    ash.machine
ORDER BY ash_samples DESC;

3) Timeline by 5-minute bucket

This shows when the spike happened.

COL time_bucket FORMAT A20
COL event       FORMAT A35

SELECT
    TO_CHAR(
        TRUNC(ash.sample_time, 'HH24')
        + FLOOR(TO_NUMBER(TO_CHAR(ash.sample_time, 'MI')) / 5) * 5 / 1440,
        'YYYY-MM-DD HH24:MI'
    ) AS time_bucket,
    ash.event,
    COUNT(*) AS ash_samples
FROM dba_hist_active_sess_history ash
WHERE ash.sample_time >= TRUNC(SYSDATE)
  AND ash.session_type = 'FOREGROUND'
  AND ash.event IN (
        'library cache lock',
        'library cache pin',
        'row cache lock',
        'cursor: pin S wait on X',
        'cursor: pin X',
        'library cache: mutex X',
        'library cache: mutex S'
      )
GROUP BY
    TRUNC(ash.sample_time, 'HH24')
      + FLOOR(TO_NUMBER(TO_CHAR(ash.sample_time, 'MI')) / 5) * 5 / 1440,
    ash.event
ORDER BY 1, 2;

4) If you know the load module or program

If the load was a known app/job, filter it directly. For example:

AND (ash.module  LIKE '%GTJ%'
 OR  ash.program LIKE '%JarLauncher%')

Full example:

SELECT
    ash.event,
    ash.sql_id,
    ash.module,
    ash.program,
    COUNT(*) AS ash_samples,
    MIN(ash.sample_time) AS first_seen,
    MAX(ash.sample_time) AS last_seen
FROM dba_hist_active_sess_history ash
WHERE ash.sample_time >= TRUNC(SYSDATE)
  AND ash.session_type = 'FOREGROUND'
  AND ash.event IN (
        'library cache lock',
        'library cache pin',
        'row cache lock',
        'cursor: pin S wait on X',
        'cursor: pin X',
        'library cache: mutex X',
        'library cache: mutex S'
      )
  AND (ash.module LIKE '%GTJ%' OR ash.program LIKE '%JarLauncher%')
GROUP BY
    ash.event,
    ash.sql_id,
    ash.module,
    ash.program
ORDER BY ash_samples DESC;

5) Current in-memory view if it is still happening

If the issue is still active now, use V$ACTIVE_SESSION_HISTORY:

SELECT
    ash.sample_time,
    ash.session_id,
    ash.session_serial#,
    ash.sql_id,
    ash.sql_plan_hash_value AS plan_hash_value,
    ash.event,
    ash.wait_class,
    ash.module,
    ash.program,
    ash.machine,
    ash.blocking_session
FROM v$active_session_history ash
WHERE ash.sample_time >= SYSDATE - (2/24)
  AND ash.event IN (
        'library cache lock',
        'library cache pin',
        'row cache lock',
        'cursor: pin S wait on X',
        'cursor: pin X',
        'library cache: mutex X',
        'library cache: mutex S'
      )
ORDER BY ash.sample_time;

6) If you want the broad wait-family summary from system views

This is not load-specific; it shows cumulative totals since instance startup (time_waited and average_wait are in centiseconds).

COL event FORMAT A35

SELECT
    event,
    total_waits,
    time_waited,
    average_wait
FROM v$system_event
WHERE event IN (
      'library cache lock',
      'library cache pin',
      'row cache lock',
      'cursor: pin S wait on X',
      'cursor: pin X',
      'library cache: mutex X',
      'library cache: mutex S'
)
ORDER BY time_waited DESC;

Best practical approach

Run these in this order:

  1. Query 1 → confirm whether library/row cache contention happened this morning

  2. Query 2 → identify user / SQL_ID / module / program

  3. Query 3 → show when the spike occurred

  4. Query 4 → isolate your load job if module/program is known

One important note

For DBA_HIST_ACTIVE_SESS_HISTORY, the sample count is best treated as intensity / evidence, not an exact execution count. It is very good for proving that this morning’s load hit:

  • library cache lock

  • cursor: pin S wait on X

  • row cache lock
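
A useful rule of thumb: AWR retains roughly one ASH sample every 10 seconds, so each DBA_HIST_ACTIVE_SESS_HISTORY row represents about 10 seconds of session activity. A minimal sketch that converts this morning's sample counts into approximate DB minutes (the factor of 10 assumes the default sampling interval has not been changed):

```sql
-- Approximate DB time per event: each retained DBA_HIST ASH row
-- represents ~10 seconds of session activity (default AWR interval).
SELECT ash.event,
       COUNT(*)                      AS ash_samples,
       ROUND(COUNT(*) * 10 / 60, 1) AS approx_db_minutes
FROM   dba_hist_active_sess_history ash
WHERE  ash.sample_time >= TRUNC(SYSDATE)
AND    ash.session_type = 'FOREGROUND'
AND    ash.event IN ('library cache lock',
                     'row cache lock',
                     'cursor: pin S wait on X')
GROUP BY ash.event
ORDER BY ash_samples DESC;
```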

If you want, I can give you one single combined script that prints, for this morning in one shot:

  • top events

  • top SQL IDs

  • top modules/programs

  • 5-minute timeline

=========================

This output is very useful. It confirms the issue is a SQL-family / session fan-out problem, not one single SQL_ID.

What your result is saying

The top offender appears to be:

Module        : JDBC Thin Client
Plan hash     : 725220778
Event         : resmgr:cpu quantum + ON CPU
Distinct SQLs : 482 to 516 SQL_IDs
ASH samples   : 7607 resmgr + 6313 ON CPU

That is the biggest red flag.

It means hundreds of different SQL_IDs are behaving like the same SQL family, probably because the application is generating dynamic SQL variations. Oracle sees many different SQL_IDs, but they collapse into the same force_matching_signature and plan hash.
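To see the collapse directly, compare a few of the variant SQL_IDs in GV$SQL; statements that differ only in literals share one FORCE_MATCHING_SIGNATURE. A sketch with placeholder SQL_IDs (substitute two of the ~500 variants from query 2):

```sql
-- <SQL_ID_1>, <SQL_ID_2> are placeholders. If the statements differ only
-- in literals, they will show the same force_matching_signature
-- (and here presumably the same plan hash 725220778).
SELECT sql_id,
       force_matching_signature,
       plan_hash_value,
       SUBSTR(sql_text, 1, 120) AS sql_text
FROM   gv$sql
WHERE  sql_id IN ('<SQL_ID_1>', '<SQL_ID_2>');
```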

So the main troubleshooting target is not one SQL_ID. It is this combination:

MODULE                   = JDBC Thin Client
FORCE_MATCHING_SIGNATURE = top signature from your first row
PLAN_HASH_VALUE          = 725220778

JarLauncher is also contributing, but the first two rows show the JDBC Thin Client SQL family is the biggest pressure point right now.


1. Run this cleaner rollup query

Your current query splits by event. This new one combines all events into one row per SQL family.

SET LINES 280
SET PAGES 200

COLUMN module FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sample_sql_ids FORMAT A140
COLUMN main_events FORMAT A160

WITH ash_base AS (
    SELECT ash.module,
           NVL(q.force_matching_signature, 0) AS force_matching_signature,
           ash.sql_plan_hash_value,
           NVL(ash.event, 'ON CPU') AS event,
           ash.session_state,
           ash.sql_id
    FROM   gv$active_session_history ash
    LEFT JOIN gv$sql q
           ON q.inst_id      = ash.inst_id
          AND q.sql_id       = ash.sql_id
          AND q.child_number = ash.sql_child_number
    WHERE  ash.sample_time >= SYSTIMESTAMP - INTERVAL '15' MINUTE
    AND    ash.session_type = 'FOREGROUND'
    AND    ash.module IN ('JarLauncher','JDBC Thin Client')
),
grp AS (
    SELECT module,
           force_matching_signature,
           sql_plan_hash_value,
           COUNT(*) AS ash_samples,
           SUM(CASE WHEN session_state = 'ON CPU' THEN 1 ELSE 0 END) AS on_cpu_samples,
           SUM(CASE WHEN event = 'resmgr:cpu quantum' THEN 1 ELSE 0 END) AS resmgr_samples,
           SUM(CASE WHEN event = 'row cache mutex' THEN 1 ELSE 0 END) AS row_cache_mutex_samples,
           SUM(CASE WHEN event LIKE 'db file%read%' OR event LIKE 'direct path read%' THEN 1 ELSE 0 END) AS io_read_samples,
           COUNT(DISTINCT sql_id) AS distinct_sql_ids
    FROM   ash_base
    GROUP BY module,
             force_matching_signature,
             sql_plan_hash_value
),
sqlids AS (
    SELECT module,
           force_matching_signature,
           sql_plan_hash_value,
           LISTAGG(sql_id, ', ' ON OVERFLOW TRUNCATE)
               WITHIN GROUP (ORDER BY sql_id) AS sample_sql_ids
    FROM (
        SELECT DISTINCT module,
                        force_matching_signature,
                        sql_plan_hash_value,
                        sql_id
        FROM   ash_base
        WHERE  sql_id IS NOT NULL
    )
    GROUP BY module,
             force_matching_signature,
             sql_plan_hash_value
),
events AS (
    SELECT module,
           force_matching_signature,
           sql_plan_hash_value,
           LISTAGG(event || '=' || event_samples, ', ' ON OVERFLOW TRUNCATE)
               WITHIN GROUP (ORDER BY event_samples DESC) AS main_events
    FROM (
        SELECT module,
               force_matching_signature,
               sql_plan_hash_value,
               event,
               COUNT(*) AS event_samples
        FROM   ash_base
        GROUP BY module,
                 force_matching_signature,
                 sql_plan_hash_value,
                 event
    )
    GROUP BY module,
             force_matching_signature,
             sql_plan_hash_value
)
SELECT g.module,
       g.force_matching_signature,
       g.sql_plan_hash_value,
       g.ash_samples,
       g.on_cpu_samples,
       g.resmgr_samples,
       g.row_cache_mutex_samples,
       g.io_read_samples,
       g.distinct_sql_ids,
       e.main_events,
       s.sample_sql_ids
FROM   grp g
LEFT JOIN sqlids s
       ON s.module = g.module
      AND s.force_matching_signature = g.force_matching_signature
      AND s.sql_plan_hash_value = g.sql_plan_hash_value
LEFT JOIN events e
       ON e.module = g.module
      AND e.force_matching_signature = g.force_matching_signature
      AND e.sql_plan_hash_value = g.sql_plan_hash_value
ORDER BY g.ash_samples DESC;

The top row from this query is your real troubleshooting target.


2. Pull the exact SQL text for the top SQL family

Use the top force_matching_signature and plan_hash_value from the query above.

SET LINES 280
SET PAGES 100
SET LONG 1000000
SET LONGCHUNKSIZE 1000000

COLUMN sql_id FORMAT A15
COLUMN sql_text FORMAT A220
COLUMN module FORMAT A35

SELECT inst_id,
       sql_id,
       child_number,
       plan_hash_value,
       force_matching_signature,
       module,
       parsing_schema_name,
       executions,
       ROUND(elapsed_time/1e6,2) AS elapsed_sec,
       ROUND(cpu_time/1e6,2) AS cpu_sec,
       buffer_gets,
       disk_reads,
       parse_calls,
       px_servers_executions,
       SUBSTR(sql_text,1,220) AS sql_text
FROM   gv$sql
WHERE  force_matching_signature = <TOP_FORCE_MATCHING_SIGNATURE>
AND    plan_hash_value          = <TOP_PLAN_HASH_VALUE>
ORDER BY cpu_time DESC, elapsed_time DESC
FETCH FIRST 20 ROWS ONLY;

For your screenshot, the first target is likely:

MODULE          = JDBC Thin Client
PLAN_HASH_VALUE = 725220778

Take the top SQL_ID from this query and display the plan:

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
           sql_id => '<TOP_SQL_ID>',
           cursor_child_no => NULL,
           format => 'ALLSTATS LAST +PEEKED_BINDS +OUTLINE +PARTITION +ALIAS +NOTE'
       ));

3. Find which host/application is opening these sessions

This is critical because the top module is JDBC Thin Client, not only JarLauncher.

SET LINES 260
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A45
COLUMN machine FORMAT A45
COLUMN service_name FORMAT A30

SELECT s.username,
       s.module,
       s.program,
       s.machine,
       s.service_name,
       s.status,
       COUNT(*) AS session_count
FROM   gv$session s
WHERE  s.type = 'USER'
AND    s.username IN ('FN2GTJSV1_ACPT','FN2GNL2Q1_ACPT','GRBACPT1')
GROUP BY s.username,
         s.module,
         s.program,
         s.machine,
         s.service_name,
         s.status
ORDER BY session_count DESC;

This tells you which app server / launcher / JDBC pool is flooding the DB.


4. Interpret what you have now

Your current output strongly indicates:

1. The database is CPU saturated.
2. Resource Manager is parking sessions on resmgr:cpu quantum.
3. The top problem is a SQL family, not one SQL_ID.
4. JDBC Thin Client has the biggest ASH footprint.
5. Many SQL_IDs are generated from the same SQL pattern.
6. JarLauncher is also contributing, especially plan hash 3623520496.

So the fix is two-layered:

Layer 1: Reduce application/session concurrency immediately.
Layer 2: Tune or baseline the top SQL family/signature.
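
For Layer 2, one option is to pin the good plan with a SQL Plan Baseline. A sketch using DBMS_SPM (<TOP_SQL_ID> is a placeholder from the earlier queries); note that baselines match on SQL text, so with hundreds of literal variants this only helps once the application binds properly or CURSOR_SHARING is set to FORCE:

```sql
-- Load the current plan of one representative cursor as an accepted baseline.
SET SERVEROUTPUT ON
DECLARE
  plans_loaded PLS_INTEGER;
BEGIN
  plans_loaded := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
                    sql_id          => '<TOP_SQL_ID>',
                    plan_hash_value => 725220778);
  DBMS_OUTPUT.PUT_LINE('Plans loaded: ' || plans_loaded);
END;
/
```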

5. Immediate control recommendation

For the ACPT simulation, ask the app/team to reduce the fan-out now:

Reduce JDBC/JarLauncher active workers to 25–50.
Do not allow 1000+ connections.
Disable retry storm / reconnect loop.
Run batches in waves.
Keep CPU below 70–80% before increasing concurrency.

Database-side, you can watch with:

SELECT module,
       NVL(event, 'ON CPU') AS event,
       COUNT(*) AS active_sessions
FROM   gv$session
WHERE  type = 'USER'
AND    status = 'ACTIVE'
GROUP BY module,
         NVL(event, 'ON CPU')
ORDER BY active_sessions DESC;

If resmgr:cpu quantum continues to dominate, the system is still overdriven.


Bottom line

Your top troubleshooting target is currently:

JDBC Thin Client
Force matching signature from the top row
Plan hash value 725220778
Hundreds of distinct SQL_IDs
resmgr:cpu quantum + ON CPU

That is the SQL family to extract, plan, and tune. But operationally, the immediate control is still to reduce the number of concurrent JDBC/JarLauncher sessions. Even a good plan will struggle if hundreds of sessions execute the same SQL family at once.


==============================
SET LINES 280
SET PAGES 200

COLUMN module FORMAT A35
COLUMN event FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sample_sql_ids FORMAT A140

WITH ash_base AS (
    SELECT ash.module,
           NVL(q.force_matching_signature, 0) AS force_matching_signature,
           ash.sql_plan_hash_value,
           NVL(ash.event, 'ON CPU') AS event,
           ash.session_state,
           ash.sql_id
    FROM   gv$active_session_history ash
    LEFT JOIN gv$sql q
           ON q.inst_id      = ash.inst_id
          AND q.sql_id       = ash.sql_id
          AND q.child_number = ash.sql_child_number
    WHERE  ash.sample_time >= SYSTIMESTAMP - INTERVAL '15' MINUTE
    AND    ash.session_type = 'FOREGROUND'
    AND    ash.module IN ('JarLauncher','JDBC Thin Client')
),
grp AS (
    SELECT module,
           force_matching_signature,
           sql_plan_hash_value,
           event,
           COUNT(*) AS ash_samples,
           SUM(CASE WHEN session_state = 'ON CPU' THEN 1 ELSE 0 END) AS on_cpu_samples,
           SUM(CASE WHEN event = 'resmgr:cpu quantum' THEN 1 ELSE 0 END) AS resmgr_samples,
           SUM(CASE WHEN event = 'row cache mutex' THEN 1 ELSE 0 END) AS row_cache_mutex_samples,
           COUNT(DISTINCT sql_id) AS distinct_sql_ids
    FROM   ash_base
    GROUP BY module,
             force_matching_signature,
             sql_plan_hash_value,
             event
),
sqlids AS (
    SELECT module,
           force_matching_signature,
           sql_plan_hash_value,
           event,
           LISTAGG(sql_id, ', ' ON OVERFLOW TRUNCATE)
               WITHIN GROUP (ORDER BY sql_id) AS sample_sql_ids
    FROM (
        SELECT DISTINCT module,
                        force_matching_signature,
                        sql_plan_hash_value,
                        event,
                        sql_id
        FROM   ash_base
        WHERE  sql_id IS NOT NULL
    )
    GROUP BY module,
             force_matching_signature,
             sql_plan_hash_value,
             event
)
SELECT g.module,
       g.force_matching_signature,
       g.sql_plan_hash_value,
       g.event,
       g.ash_samples,
       g.on_cpu_samples,
       g.resmgr_samples,
       g.row_cache_mutex_samples,
       g.distinct_sql_ids,
       s.sample_sql_ids
FROM   grp g
LEFT JOIN sqlids s
       ON s.module = g.module
      AND s.force_matching_signature = g.force_matching_signature
      AND s.sql_plan_hash_value = g.sql_plan_hash_value
      AND s.event = g.event
ORDER BY g.ash_samples DESC;


-----------------------------------------------

Now that we know many sessions are causing the CPU/resource-manager pressure, troubleshoot it as a concurrency/fan-out incident, not as a single SQL_ID issue.

The goal is to answer four questions:

1. Who is opening the sessions?
2. Are they new sessions continuously spawning or old sessions stuck?
3. Are the SQLs different SQL_IDs but the same SQL pattern?
4. Is Resource Manager throttling because CPU is exhausted, or is Resource Manager itself misconfigured?

1. First: prove the session fan-out source

Run this to identify the source host/module/program/service.

SET LINES 260
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A45
COLUMN machine FORMAT A45
COLUMN service_name FORMAT A30
COLUMN status FORMAT A12

SELECT username,
       module,
       program,
       machine,
       service_name,
       status,
       COUNT(*) AS session_count
FROM   gv$session
WHERE  type = 'USER'
GROUP BY username,
         module,
         program,
         machine,
         service_name,
         status
ORDER BY session_count DESC;

This tells you whether the storm is coming from:

JarLauncher
JDBC Thin Client
Informatica/pmdtm
one app host
one DB service
one schema/user

If one module/host has hundreds of sessions, that is the first root-cause path.


2. Check whether sessions are continuously spawning

This tells you whether the launcher is still opening sessions.

SET LINES 240
SET PAGES 200

COLUMN logon_minute FORMAT A20
COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A45

SELECT TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI') AS logon_minute,
       username,
       module,
       program,
       COUNT(*) AS sessions_opened
FROM   gv$session
WHERE  type = 'USER'
AND    username IN ('FN2GTJSV1_ACPT','FN2GNL2Q1_ACPT','GRBACPT1')
GROUP BY TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI'),
         username,
         module,
         program
ORDER BY logon_minute DESC,
         sessions_opened DESC;

Interpretation:

Result                                         Meaning
---------------------------------------------  -----------------------------------------------
Many sessions opened in the last few minutes   Active fan-out / launcher still spawning
Sessions are old and not growing               Existing workload is stuck or CPU-starved
Same host/module keeps appearing               Upstream job/concurrency setting is the culprit

3. Group active sessions by SQL family, not SQL_ID

Your screenshot shows each SQL_ID has active_sessions = 1. That means SQL_ID alone is misleading. Group by force matching signature + plan hash.

SET LINES 280
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sample_sql_ids FORMAT A140
COLUMN sql_text_sample FORMAT A140

SELECT s.username,
       s.module,
       s.program,
       NVL(q.force_matching_signature, 0) AS force_matching_signature,
       s.sql_plan_hash_value,
       COUNT(*) AS active_sessions,
       COUNT(DISTINCT s.sql_id) AS distinct_sql_ids,
       SUM(CASE WHEN s.event = 'resmgr:cpu quantum' THEN 1 ELSE 0 END) AS resmgr_sessions,
       SUM(CASE WHEN s.event = 'row cache mutex' THEN 1 ELSE 0 END) AS row_cache_mutex_sessions,
       LISTAGG(DISTINCT s.sql_id, ', ' ON OVERFLOW TRUNCATE)  -- DISTINCT in LISTAGG needs 19c+
           WITHIN GROUP (ORDER BY s.sql_id) AS sample_sql_ids,
       SUBSTR(MIN(q.sql_text), 1, 140) AS sql_text_sample
FROM   gv$session s
LEFT JOIN gv$sql q
       ON q.inst_id      = s.inst_id
      AND q.sql_id       = s.sql_id
      AND q.child_number = s.sql_child_number
WHERE  s.type = 'USER'
AND    s.status = 'ACTIVE'
AND    s.username IN ('FN2GTJSV1_ACPT','FN2GNL2Q1_ACPT','GRBACPT1')
GROUP BY s.username,
         s.module,
         s.program,
         NVL(q.force_matching_signature, 0),
         s.sql_plan_hash_value
ORDER BY active_sessions DESC;

The top row is your real troubleshooting target.

If you see:

active_sessions    = 150
distinct_sql_ids   = 150
same plan hash     = 3623520496
same module        = JarLauncher

then you have one SQL pattern being generated many different ways.


4. Use ASH to find the top offender over time

Current sessions are only a snapshot. ASH shows who consumed database time over the last 15 minutes.

SET LINES 280
SET PAGES 200

COLUMN module FORMAT A35
COLUMN event FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sample_sql_ids FORMAT A140

SELECT ash.module,
       NVL(q.force_matching_signature, 0) AS force_matching_signature,
       ash.sql_plan_hash_value,
       NVL(ash.event, 'ON CPU') AS event,
       COUNT(*) AS ash_samples,
       SUM(CASE WHEN ash.session_state = 'ON CPU' THEN 1 ELSE 0 END) AS on_cpu_samples,
       SUM(CASE WHEN ash.event = 'resmgr:cpu quantum' THEN 1 ELSE 0 END) AS resmgr_samples,
       SUM(CASE WHEN ash.event = 'row cache mutex' THEN 1 ELSE 0 END) AS row_cache_mutex_samples,
       COUNT(DISTINCT ash.sql_id) AS distinct_sql_ids,
       LISTAGG(DISTINCT ash.sql_id, ', ' ON OVERFLOW TRUNCATE)  -- DISTINCT in LISTAGG needs 19c+
           WITHIN GROUP (ORDER BY ash.sql_id) AS sample_sql_ids
FROM   gv$active_session_history ash
LEFT JOIN gv$sql q
       ON q.inst_id      = ash.inst_id
      AND q.sql_id       = ash.sql_id
      AND q.child_number = ash.sql_child_number
WHERE  ash.sample_time >= SYSTIMESTAMP - INTERVAL '15' MINUTE
AND    ash.session_type = 'FOREGROUND'
AND    ash.module IN ('JarLauncher','JDBC Thin Client')
GROUP BY ash.module,
         NVL(q.force_matching_signature, 0),
         ash.sql_plan_hash_value,
         NVL(ash.event, 'ON CPU')
ORDER BY ash_samples DESC;

This query gives you the real top consumer.


5. Check Resource Manager impact

resmgr:cpu quantum means sessions are waiting because CPU is being rationed. It is usually a symptom of CPU saturation, not the original root cause.

SET LINES 260
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN program FORMAT A40
COLUMN current_consumer_group FORMAT A35
COLUMN state FORMAT A25

SELECT s.username,
       s.module,
       s.program,
       r.current_consumer_group,
       r.state,
       COUNT(*) AS sessions,
       ROUND(SUM(r.cpu_wait_time)/1000,2) AS cpu_wait_sec,
       ROUND(SUM(r.consumed_cpu_time)/1000,2) AS consumed_cpu_sec,
       SUM(r.yields) AS yields
FROM   gv$session s
JOIN   gv$rsrc_session_info r
       ON r.inst_id = s.inst_id
      AND r.sid     = s.sid
WHERE  s.type = 'USER'
AND    s.status = 'ACTIVE'
GROUP BY s.username,
         s.module,
         s.program,
         r.current_consumer_group,
         r.state
ORDER BY cpu_wait_sec DESC;

If JarLauncher has most of the cpu_wait_sec, the workload is overdriving the CPU allocation.


6. Check whether this is a parse / row-cache storm

Your screenshots also show row cache mutex. With hundreds of generated SQLs, this can happen from hard parse / metadata / dictionary contention.

SET LINES 260
SET PAGES 200

COLUMN module FORMAT A35
COLUMN force_matching_signature FORMAT 999999999999999999999999
COLUMN sql_text_sample FORMAT A140

SELECT module,
       force_matching_signature,
       plan_hash_value,
       COUNT(DISTINCT sql_id) AS distinct_sql_ids,
       COUNT(*) AS cursor_count,
       SUM(parse_calls) AS parse_calls,
       SUM(executions) AS executions,
       ROUND(SUM(elapsed_time)/1e6,2) AS elapsed_sec,
       ROUND(SUM(cpu_time)/1e6,2) AS cpu_sec,
       SUM(buffer_gets) AS buffer_gets,
       SUBSTR(MIN(sql_text),1,140) AS sql_text_sample
FROM   gv$sql
WHERE  module = 'JarLauncher'
GROUP BY module,
         force_matching_signature,
         plan_hash_value
ORDER BY parse_calls DESC, cpu_sec DESC
FETCH FIRST 30 ROWS ONLY;

Red flags:

distinct_sql_ids high
parse_calls high
executions low
row cache mutex / library cache waits

That means the application is generating too many distinct SQL statements instead of reusing one stable bind-variable statement.
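
If the application cannot be fixed quickly, one commonly used stopgap for literal-heavy SQL is CURSOR_SHARING=FORCE. This is not a substitute for proper bind reuse and can change execution plans, so test it in ACPT first:

```sql
-- Stopgap only: Oracle replaces literals with system-generated binds,
-- collapsing the generated variants into shared cursors and cutting
-- hard-parse / row cache pressure. Test before using in production.
ALTER SYSTEM SET cursor_sharing = FORCE SCOPE = MEMORY;

-- Revert once the application binds properly:
ALTER SYSTEM SET cursor_sharing = EXACT SCOPE = MEMORY;
```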


7. Use SQL Monitor to find the heavy active plan family

SET LINES 280
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A35
COLUMN sample_sql_ids FORMAT A140

SELECT m.username,
       m.module,
       m.sql_plan_hash_value,
       COUNT(*) AS monitored_execs,
       COUNT(DISTINCT m.sql_id) AS distinct_sql_ids,
       ROUND(SUM(m.elapsed_time)/1e6,2) AS total_elapsed_sec,
       ROUND(SUM(m.cpu_time)/1e6,2) AS total_cpu_sec,
       SUM(m.buffer_gets) AS total_buffer_gets,
       ROUND(SUM(m.physical_read_bytes)/1024/1024/1024,2) AS physical_read_gb,
       MAX(m.px_servers_requested) AS max_px_requested,
       MAX(m.px_servers_allocated) AS max_px_allocated,
       LISTAGG(DISTINCT m.sql_id, ', ' ON OVERFLOW TRUNCATE)  -- DISTINCT in LISTAGG needs 19c+
           WITHIN GROUP (ORDER BY m.sql_id) AS sample_sql_ids
FROM   gv$sql_monitor m
WHERE  m.status = 'EXECUTING'
AND    m.module = 'JarLauncher'
GROUP BY m.username,
         m.module,
         m.sql_plan_hash_value
ORDER BY total_cpu_sec DESC;

If one plan hash dominates, pick one SQL_ID from that row and get the live plan:

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
           sql_id => '<TOP_SQL_ID>',
           cursor_child_no => NULL,
           format => 'ALLSTATS LAST +PEEKED_BINDS +OUTLINE +PARTITION +ALIAS +NOTE'
       ));

8. Determine the actual root cause category

Use this decision table.

Evidence                                         Root cause direction                  Fix
-----------------------------------------------  ------------------------------------  ------------------------------------------------------
Hundreds of normal JarLauncher sessions          App/job fan-out                       Reduce launcher threads / connection pool
Each SQL_ID has 1 session, same signature/plan   Dynamic SQL variants                  Force matching, bind reuse, SQL family baseline/patch
resmgr:cpu quantum dominates                     CPU saturated / RM throttling         Reduce concurrency; adjust consumer group after review
row cache mutex / library cache lock             Hard parse / dictionary contention    Reduce SQL variants, bind properly, avoid DDL/stats mid-run
PX slaves high                                   Parallelism storm                     Disable/cap parallel query
Same plan hash but bad under load                Concurrency issue, not plan issue     Throttle sessions
Different good/bad plan hash                     Plan regression                       SQL Plan Baseline / SQL Patch / stats

9. What to collect for developers/app team

You need this evidence package:

1. Session count by module/program/machine.
2. Logon-time histogram showing session ramp-up.
3. Top force_matching_signature + plan_hash_value.
4. Top SQL Monitor plan hash and CPU/elapsed time.
5. Resource Manager CPU wait by consumer group.
6. Row cache/library cache wait evidence if present.
7. PX slave count showing whether parallelism is or is not the issue.

That tells the app team exactly whether the issue is:

connection pool max too high
launcher thread count too high
batch fan-out too aggressive
retry/reconnect storm
dynamic SQL generation
bad SQL plan
parallelism

10. Bottom line

Since you already know many sessions are causing it, the next troubleshooting step is not picking one SQL_ID from the 266 rows.

The correct method is:

Group sessions by module + force_matching_signature + plan_hash_value.
Find the SQL family with the highest active sessions / ASH samples.
Confirm whether sessions are continuously spawning.
Then decide whether the fix is app concurrency, SQL baseline, bind reuse, Resource Manager, or parallelism.

For the current ACPT symptom, based on your screenshots, the leading diagnosis is:

JarLauncher is generating too many concurrent normal sessions.
Those sessions are CPU-starved and parked on resmgr:cpu quantum.
The SQL_IDs are fragmented, so troubleshoot by SQL family/signature, not individual SQL_ID.

===============================

It is opening ~1400 sessions because the application/job layer is fanning out connections, not because Oracle independently decides to create 1400 sessions.

From your screenshots, the important proof is:

Session type : mostly NORMAL_SESSION
Module       : JarLauncher / JDBC Thin Client
Wait         : resmgr:cpu quantum
PX evidence  : only one PX SQL with DOP 8 / 16 PX slaves
CPU          : ~99.9%
Connections  : ~1448

So today’s ACPT issue is not mainly “one SQL opened 1400 PX slaves.” It is more like:

JarLauncher started hundreds/thousands of worker executions
→ each worker opened or held a JDBC database session
→ CPU hit 99–100%
→ Resource Manager parked many sessions on resmgr:cpu quantum
→ sessions stayed open longer
→ more work kept arriving
→ connection count piled up to ~1400

Most likely causes

1. JarLauncher job concurrency is too high

The launcher may be splitting the workload into hundreds/thousands of parallel tasks. Each task gets its own JDBC session.

Typical causes:

parallel thread count too high
batch split count too high
partition/chunk count too high
multiple job instances launched at same time
multiple environments/users launching same workload

This is the most likely based on your screenshots.


2. Connection pool maximum is too high or uncontrolled

If the app uses JDBC pooling, the pool may allow hundreds or thousands of open connections.

Examples:

Hikari maxPoolSize too high
Tomcat JDBC maxActive too high
WebLogic/JBoss datasource max capacity too high
Informatica session/grid concurrency too high

If the pool max is 1500, Oracle will accept 1400 sessions until CPU collapses.


3. Retry storm / timeout amplification

Once CPU hits 100%, each SQL runs slower. The app may think calls are stuck or timed out and launches more sessions/retries.

Bad pattern:

SQL slows down
app retry starts
old DB session still running
new DB session starts same work
CPU gets worse
more retries happen

This is how 200 sessions become 1400 very quickly.


4. Sessions are not closing fast enough

Even if the app is not intentionally launching 1400 at once, sessions may be retained because they are waiting on CPU:

resmgr:cpu quantum

That wait means Resource Manager is parking sessions because CPU is over capacity. The sessions remain connected while waiting, so the connection count builds up.

Resource Manager is not opening the sessions. It is only exposing the symptom that too many sessions are competing for CPU.
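
Resource Manager's own view shows the rationing per consumer group. A quick check, assuming a resource plan is active (times in this view are in milliseconds):

```sql
-- Per consumer group: CPU consumed vs. time spent waiting for a quantum.
-- cpu_wait_sec large relative to consumed_cpu_sec confirms the instance
-- is overdriven rather than Resource Manager being misconfigured.
SELECT inst_id,
       name                             AS consumer_group,
       active_sessions,
       execution_waiters,
       ROUND(consumed_cpu_time/1000, 1) AS consumed_cpu_sec,
       ROUND(cpu_wait_time/1000, 1)     AS cpu_wait_sec,
       yields
FROM   gv$rsrc_consumer_group
ORDER BY cpu_wait_sec DESC;
```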


Prove exactly who is opening them

Run this first.

SET LINES 240
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN program FORMAT A45
COLUMN machine FORMAT A45
COLUMN status FORMAT A12

SELECT username,
       module,
       program,
       machine,
       status,
       COUNT(*) AS session_count
FROM   gv$session
WHERE  type = 'USER'
GROUP BY username,
         module,
         program,
         machine,
         status
ORDER BY session_count DESC;

This will show the source host/module responsible for the 1400 connections.


Show when the sessions were opened

SET LINES 240
SET PAGES 200

COLUMN logon_minute FORMAT A20
COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN program FORMAT A45

SELECT TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI') AS logon_minute,
       username,
       module,
       program,
       COUNT(*) AS sessions_opened
FROM   gv$session
WHERE  type = 'USER'
GROUP BY TO_CHAR(logon_time, 'YYYY-MM-DD HH24:MI'),
         username,
         module,
         program
ORDER BY logon_minute DESC,
         sessions_opened DESC;

If you see hundreds opened in the same few minutes, that is a launcher/pool burst.


Show active vs inactive sessions

SET LINES 240
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN program FORMAT A45

SELECT username,
       module,
       program,
       status,
       COUNT(*) AS session_count
FROM   gv$session
WHERE  type = 'USER'
GROUP BY username,
         module,
         program,
         status
ORDER BY session_count DESC;

Interpretation:

Mostly ACTIVE   = too many concurrent running workers
Mostly INACTIVE = connection pool leak or sessions not being closed
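
If most sessions are INACTIVE, also check how long they have been idle. LAST_CALL_ET (seconds since the session's last call) separates a pool that is simply large from one that is leaking. A sketch; the 300-second bucket boundary is an arbitrary choice, adjust it to your pool's idle timeout:

SELECT username,
       module,
       CASE WHEN last_call_et > 300 THEN 'IDLE > 5 MIN'
            ELSE 'RECENTLY USED'
       END AS idle_bucket,
       COUNT(*) AS session_count
FROM   gv$session
WHERE  type   = 'USER'
AND    status = 'INACTIVE'
GROUP BY username,
         module,
         CASE WHEN last_call_et > 300 THEN 'IDLE > 5 MIN'
              ELSE 'RECENTLY USED'
         END
ORDER BY session_count DESC;

Hundreds of sessions idle for longer than the pool's idle timeout point to connections being held or leaked rather than a busy workload.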

Show top SQLs being executed by those sessions

SET LINES 240
SET PAGES 200

COLUMN username FORMAT A25
COLUMN module FORMAT A45
COLUMN event FORMAT A35
COLUMN sql_id FORMAT A15

SELECT username,
       module,
       sql_id,
       NVL(event, 'ON CPU') AS event,
       COUNT(*) AS active_sessions
FROM   gv$session
WHERE  type = 'USER'
AND    status = 'ACTIVE'
GROUP BY username,
         module,
         sql_id,
         NVL(event, 'ON CPU')
ORDER BY active_sessions DESC;

This tells you whether 1400 sessions are all executing the same SQL family or many different SQLs.


Check Oracle session/process limits

SET LINES 200
SET PAGES 100

SELECT resource_name,
       current_utilization,
       max_utilization,
       initial_allocation,
       limit_value
FROM   v$resource_limit
WHERE  resource_name IN ('sessions','processes')
ORDER BY resource_name;

This does not explain why they opened, but it proves how close the DB is to hard limits.


What to ask the app/job team

Ask them these exact questions:

1. What is the JarLauncher max thread count for this run?
2. How many job chunks/partitions are being launched?
3. How many job instances are running at the same time?
4. What is the JDBC connection pool max size?
5. Are retries enabled when SQL execution or connection acquisition slows down?
6. Does each worker open a separate DB connection?
7. Are connections closed after each chunk or held until the full job completes?

The likely answer will be one of these:

thread count is too high
pool max is too high
job was launched multiple times
retry logic is multiplying sessions
connections are held for too long

Bottom line

The database is not “creating” 1400 sessions by itself. The application side is opening them. Oracle is only showing the result.

Based on today’s evidence, the primary root cause is:

JarLauncher/application concurrency is too high, causing a normal-session flood.
CPU reaches 100%, Resource Manager throttles sessions, and the backlog causes the connection count to remain around 1400.

For tomorrow/prod, the fix is to cap JarLauncher/job concurrency and JDBC pool size, then keep the SQL baseline/plan fix as a secondary safeguard.
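
If the app team cannot cap the pool in time, the database can also enforce a hard cap itself via a profile. This is a sketch only: the profile name, the JOBUSER account, and the 200-session limit are placeholders — size the limit slightly above the pool maximum you agree with the app team:

-- Hard cap at the DB side (placeholders: profile name, user, limit)
CREATE PROFILE job_cap_profile LIMIT
    SESSIONS_PER_USER 200;

ALTER USER jobuser PROFILE job_cap_profile;

-- SESSIONS_PER_USER is only enforced when RESOURCE_LIMIT is TRUE
ALTER SYSTEM SET resource_limit = TRUE;

Note the trade-off: once the cap is hit, new connection attempts fail with ORA-02391, so the application must treat connect failures as a signal to back off, not as a reason to retry harder.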


=====================

Yes. Use one common STS loaded from all three AWR windows, then run one SQL Tuning Advisor task per SQL_ID. Do not run one giant Advisor task for all SQLs together; it will likely time out and become hard to interpret.

Also, important correction from your screenshot: remove this condition:

OR s.elapsed_time_delta > 0

That will pull almost every SQL from those AWR windows. For STS, keep the filter only on your watchlist SQL IDs.

The script below excludes 9kfty957z5y3 because you marked it NOT NEEDED. Add it back if required.


1. Create one STS for all selected SQL IDs

SET SERVEROUTPUT ON SIZE UNLIMITED

DECLARE
    l_sts_name VARCHAR2(30) := 'STS_JOB_WATCH_AWR';
BEGIN
    BEGIN
        DBMS_SQLTUNE.DROP_SQLSET(
            sqlset_name  => l_sts_name,
            sqlset_owner => USER
        );
    EXCEPTION
        WHEN OTHERS THEN
            NULL;
    END;

    DBMS_SQLTUNE.CREATE_SQLSET(
        sqlset_name => l_sts_name,
        description => 'Watchlist SQLs from BAD1, BAD2, and GOOD AWR windows'
    );

    DBMS_OUTPUT.PUT_LINE('Created STS: ' || l_sts_name);
END;
/

2. Load BAD1, BAD2, and GOOD windows into the STS

This loads all selected SQLs and all available plan hashes from those windows.

SET SERVEROUTPUT ON SIZE UNLIMITED

DECLARE
    l_cursor   DBMS_SQLTUNE.SQLSET_CURSOR;
    l_sts_name VARCHAR2(30) := 'STS_JOB_WATCH_AWR';

    l_filter VARCHAR2(4000) :=
        q'[sql_id IN (
              '1mufpt4a9awyx',
              'f2y77p2gtdg41',
              '7r1gn0tpbwjjj',
              'fsgs2mtg3tfpx',
              '6wa1frsf52nsy'
           )]';

    PROCEDURE load_window(
        p_label      IN VARCHAR2,
        p_begin_snap IN NUMBER,
        p_end_snap   IN NUMBER
    )
    IS
    BEGIN
        DBMS_OUTPUT.PUT_LINE('Loading ' || p_label || ' snaps ' || p_begin_snap || ' to ' || p_end_snap);

        OPEN l_cursor FOR
            SELECT VALUE(p)
            FROM   TABLE(
                       DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
                           begin_snap       => p_begin_snap,
                           end_snap         => p_end_snap,
                           basic_filter     => l_filter,
                           ranking_measure1 => 'elapsed_time',
                           result_limit     => 100,
                           attribute_list   => 'ALL',
                           dbid             => 352148541
                       )
                   ) p;

        DBMS_SQLTUNE.LOAD_SQLSET(
            sqlset_name     => l_sts_name,
            populate_cursor => l_cursor,
            load_option     => 'MERGE'
        );

        DBMS_OUTPUT.PUT_LINE('Loaded ' || p_label);
    END;
BEGIN
    load_window('BAD1_APR6_23_TO_APR7_03', 49153, 49160);
    load_window('BAD2_APR7_15_TO_21',      49184, 49196);
    load_window('GOOD_APR8_00_TO_01',      49202, 49204);

    DBMS_OUTPUT.PUT_LINE('Completed STS load: ' || l_sts_name);
END;
/

3. Validate what loaded into the STS

SET LINES 240
SET PAGES 200

COLUMN sqlset_name FORMAT A25
COLUMN sql_id FORMAT A15
COLUMN sql_text_sample FORMAT A90

SELECT sqlset_name,
       sql_id,
       plan_hash_value,
       SUM(executions) AS executions,
       ROUND(SUM(elapsed_time)/1e6, 2) AS elapsed_sec,
       ROUND(SUM(cpu_time)/1e6, 2) AS cpu_sec,
       SUM(buffer_gets) AS buffer_gets,
       SUM(disk_reads) AS disk_reads,
       SUBSTR(MAX(sql_text), 1, 90) AS sql_text_sample
FROM   dba_sqlset_statements
WHERE  sqlset_name = 'STS_JOB_WATCH_AWR'
GROUP BY sqlset_name,
         sql_id,
         plan_hash_value
ORDER BY sql_id,
         elapsed_sec DESC;

For 7r1gn0tpbwjjj, you should see both:

2340129260  -- bad plan
3257229599  -- good/recommended plan

4. Create SQL Tuning Advisor tasks for each SQL_ID

This creates one Advisor task per SQL. It does not execute them yet.

SET SERVEROUTPUT ON SIZE UNLIMITED

DECLARE
    l_task_name VARCHAR2(128);
BEGIN
    FOR r IN (
        SELECT DISTINCT sql_id
        FROM   dba_sqlset_statements
        WHERE  sqlset_name = 'STS_JOB_WATCH_AWR'
        ORDER BY sql_id
    )
    LOOP
        l_task_name := 'STA_' || UPPER(r.sql_id) || '_AWR';

        BEGIN
            DBMS_SQLTUNE.DROP_TUNING_TASK(task_name => l_task_name);
        EXCEPTION
            WHEN OTHERS THEN
                NULL;
        END;

        l_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
                           sqlset_name  => 'STS_JOB_WATCH_AWR',
                           sqlset_owner => USER,
                           basic_filter => 'sql_id = ''' || r.sql_id || '''',
                           scope        => DBMS_SQLTUNE.SCOPE_COMPREHENSIVE,
                           time_limit   => 7200,
                           task_name    => l_task_name,
                           description  => 'SQL Tuning Advisor from AWR STS for SQL_ID ' || r.sql_id
                       );

        DBMS_OUTPUT.PUT_LINE('Created task: ' || l_task_name);
    END LOOP;
END;
/

5. Execute the Advisor tasks one by one

Do not execute everything blindly if you are short on time. Start with the most important SQLs.

Execute only the critical SQL first

BEGIN
    DBMS_SQLTUNE.EXECUTE_TUNING_TASK(
        task_name => 'STA_7R1GN0TPBWJJJ_AWR'
    );
END;
/

Then run the others one at a time

BEGIN
    DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_1MUFPT4A9AWYX_AWR');
END;
/

BEGIN
    DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_F2Y77P2GTDG41_AWR');
END;
/

BEGIN
    DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_FSGS2MTG3TFPX_AWR');
END;
/

BEGIN
    DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'STA_6WA1FRSF52NSY_AWR');
END;
/

If you want to include 9kfty957z5y3, create a separate task for it after adding it into the STS filter.


6. Check Advisor task status

SET LINES 240
SET PAGES 200

COLUMN task_name FORMAT A35
COLUMN status FORMAT A15
COLUMN execution_start FORMAT A30
COLUMN execution_end FORMAT A30
COLUMN status_message FORMAT A100

SELECT task_name,
       status,
       execution_start,
       execution_end,
       status_message
FROM   dba_advisor_log
WHERE  task_name LIKE 'STA_%_AWR'
ORDER BY execution_start DESC;

7. Generate Advisor report for each SQL

Example for 7r1gn0tpbwjjj:

SET LONG 10000000
SET LONGCHUNKSIZE 10000000
SET LINES 240
SET PAGES 0

SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK(
           task_name => 'STA_7R1GN0TPBWJJJ_AWR',
           type      => 'TEXT',
           level     => 'ALL'
       ) AS report
FROM dual;

For all reports, change the task name:

STA_1MUFPT4A9AWYX_AWR
STA_F2Y77P2GTDG41_AWR
STA_7R1GN0TPBWJJJ_AWR
STA_FSGS2MTG3TFPX_AWR
STA_6WA1FRSF52NSY_AWR

8. Pull high-level Advisor findings for all tasks

SET LINES 240
SET PAGES 200

COLUMN task_name FORMAT A35
COLUMN status FORMAT A15
COLUMN finding_name FORMAT A55
COLUMN rec_type FORMAT A35
COLUMN benefit FORMAT 999999999999
COLUMN command FORMAT A120

SELECT t.task_name,
       t.status,
       f.finding_name,
       r.type AS rec_type,
       r.benefit,
       SUBSTR(a.command, 1, 120) AS command
FROM   dba_advisor_tasks t
LEFT JOIN dba_advisor_findings f
       ON f.task_id = t.task_id
LEFT JOIN dba_advisor_recommendations r
       ON r.task_id = t.task_id
      AND r.finding_id = f.finding_id
LEFT JOIN dba_advisor_actions a
       ON a.task_id = t.task_id
      AND a.rec_id  = r.rec_id
WHERE  t.task_name LIKE 'STA_%_AWR'
ORDER BY t.task_name,
         r.benefit DESC;

9. For tomorrow, do not wait for all Advisor tasks

For immediate production protection, prioritize:

1. SQL_ID 7r1gn0tpbwjjj
   Good plan hash: 3257229599
   Bad plan hash : 2340129260

2. SQL_ID 1mufpt4a9awyx
   High reads/gets and AppFirewallCore impact

3. SQL_ID f2y77p2gtdg41
   Very high elapsed time

4. SQL_ID fsgs2mtg3tfpx
   Repeated long elapsed SQL

5. SQL_ID 6wa1frsf52nsy
   High CPU in good window

For 7r1gn0tpbwjjj, Advisor already confirmed the best action: create SQL Plan Baseline for plan hash 3257229599 and control PX/concurrency. The STS/Advisor work for the other SQLs is for deeper tuning, not tomorrow’s first-line protection.

------------------------------------------------------

This result is expected for a heavy historical AWR SQL. The task did execute, but SQL Tuning Advisor timed out:

Completion Status : INTERRUPTED
Error             : ORA-13639: The current operation was interrupted because it timed out
Number of SQLs with Timeouts : 1

This is not an Oracle outage or error. It means the advisor could not finish its deep analysis within your 1500-second time limit.

For tomorrow’s job, do not wait on SQL Tuning Advisor as the primary fix. You already found the immediate control point:

Bad plan hash : 2340129260
Good plan hash: 3257229599
Risk driver   : PX / DOP / session explosion

So the immediate fix remains:

1. Load fixed baseline for good plan hash 3257229599.
2. Control/disable parallelism for this SQL/job.
3. Reduce job concurrency.
4. Monitor PX sessions live.

SQL Tuning Advisor can continue in parallel, but it is not the outage-prevention mechanism for tomorrow.
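
For point 4 above (monitor PX sessions live), gv$px_session lists every PX slave; joining back to gv$session attributes the slaves to the SQL they serve. A sketch:

SELECT s.sql_id,
       COUNT(*) AS px_sessions
FROM   gv$px_session p
JOIN   gv$session s
       ON  s.inst_id = p.inst_id
       AND s.sid     = p.sid
       AND s.serial# = p.serial#
GROUP BY s.sql_id
ORDER BY px_sessions DESC;

If 7r1gn0tpbwjjj shows up here with a large count tomorrow, the PX/concurrency control did not take effect.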


1. First check whether Advisor produced partial findings

Even with timeout, sometimes Oracle still captures useful findings.

SET LINES 240
SET PAGES 200

COLUMN task_name FORMAT A35
COLUMN status FORMAT A15
COLUMN finding_name FORMAT A50
COLUMN type FORMAT A30
COLUMN benefit FORMAT 999999999999
COLUMN command FORMAT A120

SELECT t.task_name,
       t.status,
       f.finding_name,
       r.type,
       r.benefit,
       SUBSTR(a.command, 1, 120) AS command
FROM   dba_advisor_tasks t
LEFT JOIN dba_advisor_findings f
       ON f.task_id = t.task_id
LEFT JOIN dba_advisor_recommendations r
       ON r.task_id = t.task_id
      AND r.finding_id = f.finding_id
LEFT JOIN dba_advisor_actions a
       ON a.task_id = t.task_id
      AND a.rec_id  = r.rec_id
WHERE  t.task_name = 'STA_7R1GN_BAD1_STS'
ORDER BY r.benefit DESC;

Also check the task log:

SET LINES 240
SET PAGES 200

COLUMN task_name FORMAT A35
COLUMN status FORMAT A20
COLUMN status_message FORMAT A100

SELECT task_name,
       status,
       execution_start,
       execution_end,
       status_message
FROM   dba_advisor_log
WHERE  task_name = 'STA_7R1GN_BAD1_STS'
ORDER BY execution_start DESC;

2. Re-run Advisor with higher time limit

Since the 25-minute run timed out, create a clean second task with a higher limit. For this SQL, I would use 2 hours if running off-window.

SET SERVEROUTPUT ON SIZE UNLIMITED

DECLARE
    l_task_name VARCHAR2(128);
BEGIN
    l_task_name := 'STA_7R1GN_BAD1_STS_2H';

    BEGIN
        DBMS_SQLTUNE.DROP_TUNING_TASK(task_name => l_task_name);
    EXCEPTION
        WHEN OTHERS THEN
            NULL;
    END;

    l_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
                       sqlset_name  => 'STS_7R1GN_BAD1',
                       sqlset_owner => 'RDSADM',
                       basic_filter => q'[sql_id = '7r1gn0tpbwjjj' and plan_hash_value = 2340129260]',
                       scope        => DBMS_SQLTUNE.SCOPE_COMPREHENSIVE,
                       time_limit   => 7200,
                       task_name    => l_task_name,
                       description  => '2-hour SQL Tuning Advisor for bad plan 2340129260'
                   );

    DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task_name);

    DBMS_OUTPUT.PUT_LINE('Executed task: ' || l_task_name);
END;
/

Report:

SET LONG 10000000
SET LONGCHUNKSIZE 10000000
SET LINES 240
SET PAGES 0

SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK(
           task_name => 'STA_7R1GN_BAD1_STS_2H',
           type      => 'TEXT',
           level     => 'ALL'
       ) AS report
FROM dual;

3. Quick Advisor run for tomorrow readiness

For a fast pre-check, run limited scope. This will not do the full deep SQL Profile analysis, but it can quickly catch stats/index/access-path recommendations.

SET SERVEROUTPUT ON SIZE UNLIMITED

DECLARE
    l_task_name VARCHAR2(128);
BEGIN
    l_task_name := 'STA_7R1GN_BAD1_LIMITED';

    BEGIN
        DBMS_SQLTUNE.DROP_TUNING_TASK(task_name => l_task_name);
    EXCEPTION
        WHEN OTHERS THEN
            NULL;
    END;

    l_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(
                       sqlset_name  => 'STS_7R1GN_BAD1',
                       sqlset_owner => 'RDSADM',
                       basic_filter => q'[sql_id = '7r1gn0tpbwjjj' and plan_hash_value = 2340129260]',
                       scope        => DBMS_SQLTUNE.SCOPE_LIMITED,
                       time_limit   => 600,
                       task_name    => l_task_name,
                       description  => 'Quick limited SQL Tuning Advisor for 7r1gn0tpbwjjj'
                   );

    DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => l_task_name);

    DBMS_OUTPUT.PUT_LINE('Executed task: ' || l_task_name);
END;
/

Report:

SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK(
           task_name => 'STA_7R1GN_BAD1_LIMITED',
           type      => 'TEXT',
           level     => 'ALL'
       ) AS report
FROM dual;

4. Do not make tomorrow dependent on Advisor

For tomorrow, the hard control should be this:

-- Good historical plan hash
3257229599

-- Bad historical plan hash
2340129260

Load the good plan as fixed baseline:

SET SERVEROUTPUT ON SIZE UNLIMITED

DECLARE
    l_loaded PLS_INTEGER;
BEGIN
    l_loaded := DBMS_SPM.LOAD_PLANS_FROM_AWR(
                    begin_snap   => 49184,
                    end_snap     => 49196,
                    basic_filter => q'[sql_id = '7r1gn0tpbwjjj' and plan_hash_value = 3257229599]',
                    fixed        => 'YES',
                    enabled      => 'YES'
                );

    DBMS_OUTPUT.PUT_LINE('Plans loaded: ' || l_loaded);
END;
/

Then verify live tomorrow:

SET LINES 240
SET PAGES 200

COLUMN sql_profile FORMAT A40
COLUMN sql_patch FORMAT A40
COLUMN sql_plan_baseline FORMAT A40
COLUMN module FORMAT A45

SELECT inst_id,
       sql_id,
       child_number,
       plan_hash_value,
       sql_profile,
       sql_patch,
       sql_plan_baseline,
       module,
       executions,
       elapsed_time/1000000 AS elapsed_sec,
       cpu_time/1000000     AS cpu_sec,
       buffer_gets,
       disk_reads,
       px_servers_executions
FROM   gv$sql
WHERE  sql_id = '7r1gn0tpbwjjj'
ORDER BY inst_id, child_number;

Expected tomorrow:

plan_hash_value should be 3257229599
plan_hash_value should NOT be 2340129260
PX_SERVERS_EXECUTIONS should not explode
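
Before the run, also confirm the fixed baseline actually exists. DBA_SQL_PLAN_BASELINES has no SQL_ID column, so filter on creation time (or on SQL text) — a sketch:

SET LINES 240
SET PAGES 200

COLUMN sql_handle FORMAT A30
COLUMN plan_name  FORMAT A35

SELECT sql_handle,
       plan_name,
       enabled,
       accepted,
       fixed,
       created
FROM   dba_sql_plan_baselines
WHERE  created > SYSDATE - 1
ORDER BY created DESC;

You want ENABLED = YES, ACCEPTED = YES, FIXED = YES on the plan loaded for 7r1gn0tpbwjjj.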

5. My recommendation now

Use this decision path:

For tomorrow outage prevention:
    Baseline good plan hash 3257229599.
    Cap/disable parallelism for this job SQL.
    Reduce job concurrency.

For root-cause tuning:
    Run SQL Tuning Advisor with 7200 seconds off-window.
    Review recommendations.
    Do not blindly accept new SQL Profile if it increases parallelism.

Meeting-ready wording:

SQL Tuning Advisor was executed successfully from the historical AWR SQL Tuning Set, but the comprehensive analysis timed out after 1500 seconds. That confirms this is a heavy SQL optimization case, but it does not block tomorrow’s mitigation. We already identified the actionable risk: bad plan hash 2340129260 and high PX usage. For tomorrow, we should stabilize with the good plan hash 3257229599, control parallelism/concurrency, and continue the extended Advisor run off-window for permanent tuning recommendations.


Monday, January 19, 2026

Drop

 


The scripts below are explicitly designed for heavy parallelism using DBMS_SCHEDULER.

  • Tablespace Drops: All 8 tablespaces are dropped simultaneously (8 concurrent threads).

  • Object Cleanup: All 7 schemas are scrubbed simultaneously (7 concurrent threads).

Here is your Consolidated, Parallel Destruction Runbook.

Parallel Destruction Runbook

Step  Script                 Action                                                                      Parallelism
1     01_parallel_init.sql   Config & Safety. Sets up control tables and defines the batches.            N/A
2     02_parallel_drop.sql   The Nuke. Launches 8 background jobs to drop tablespaces at the same time.  8x Threads
3     03_wait_blocker.sql    Traffic Cop. Pauses your terminal until all Drop jobs are 100% finished.    N/A
4     04_parallel_sweep.sql  The Sweeper. Launches 7 background jobs to clean schemas at the same time.  7x Threads
5     05_monitor.sql         Verification. Checks status and confirms 0 objects remain.                  N/A

SCRIPT 01: Config, Safety & Initialization

Filename: 01_parallel_init.sql

SET SERVEROUTPUT ON;

DECLARE
    -- [SAFETY VALVE] SET TO TRUE TO ENABLE DESTRUCTION
    v_i_have_backups BOOLEAN := FALSE; 

    v_fallback_ts VARCHAR2(30);
BEGIN
    DBMS_OUTPUT.PUT_LINE('=== STEP 1: PARALLEL INIT & SAFETY ===');

    IF NOT v_i_have_backups THEN
        RAISE_APPLICATION_ERROR(-20000, 'STOP! You must edit Script 01 and set v_i_have_backups := TRUE.');
    END IF;

    -- Detect Fallback TS
    BEGIN
        SELECT property_value INTO v_fallback_ts 
        FROM database_properties WHERE property_name = 'DEFAULT_PERMANENT_TABLESPACE';
    EXCEPTION WHEN OTHERS THEN v_fallback_ts := 'USERS'; END;

    -- Cleanup Old Controls
    BEGIN EXECUTE IMMEDIATE 'DROP TABLE tablespace_control PURGE'; EXCEPTION WHEN OTHERS THEN NULL; END;
    BEGIN EXECUTE IMMEDIATE 'DROP TABLE tablespace_log PURGE';     EXCEPTION WHEN OTHERS THEN NULL; END;
    BEGIN EXECUTE IMMEDIATE 'DROP TABLE tablespace_config PURGE';  EXCEPTION WHEN OTHERS THEN NULL; END;

    -- Config & Control Tables
    EXECUTE IMMEDIATE 'CREATE TABLE tablespace_config (config_key VARCHAR2(50) PRIMARY KEY, config_value VARCHAR2(500))';
    EXECUTE IMMEDIATE 'INSERT INTO tablespace_config VALUES (''FALLBACK_TS'', :1)' USING v_fallback_ts;

    EXECUTE IMMEDIATE q'[
        CREATE TABLE tablespace_control (
            item_name     VARCHAR2(50) PRIMARY KEY,
            item_type     VARCHAR2(20),
            batch_id      NUMBER,
            status        VARCHAR2(20) DEFAULT 'PENDING',
            error_msg     VARCHAR2(4000),
            CONSTRAINT chk_safe_items CHECK (
                UPPER(item_name) NOT IN ('SYSTEM','SYSAUX','UNDOTBS1','TEMP','USERS') 
            )
        )
    ]';
    
    EXECUTE IMMEDIATE q'[
        CREATE TABLE tablespace_log (
            log_id    NUMBER GENERATED ALWAYS AS IDENTITY,
            log_time  TIMESTAMP DEFAULT SYSTIMESTAMP,
            item_name VARCHAR2(50), 
            action    VARCHAR2(30), 
            status    VARCHAR2(20), 
            message   VARCHAR2(4000)
        )
    ]';

    -- LOAD PARALLEL BATCHES
    INSERT ALL
        -- Batch A: 8 Tablespaces (Will run on 8 threads)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_CORE',      'TABLESPACE', 1)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_INDEX',     'TABLESPACE', 2)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_AUDIT',     'TABLESPACE', 3)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_HISTORY',   'TABLESPACE', 4)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_REPORTING', 'TABLESPACE', 5)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_ARCHIVE',   'TABLESPACE', 6)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_LARGE',     'TABLESPACE', 7)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('TS_REVANTH_RESERVE',   'TABLESPACE', 8)
        
        -- Batch B: 7 Schemas (Will run on 7 threads)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('SCHEMA_APP1',        'SCHEMA', 101)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('SCHEMA_APP2',        'SCHEMA', 102)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('SCHEMA_DATA',        'SCHEMA', 103)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('SCHEMA_REPORT',      'SCHEMA', 104)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('SCHEMA_AUDIT',       'SCHEMA', 105)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('SCHEMA_INTEGRATION', 'SCHEMA', 106)
        INTO tablespace_control (item_name, item_type, batch_id) VALUES ('SCHEMA_ARCHIVE',     'SCHEMA', 107)
    SELECT * FROM dual;
    COMMIT;

    -- Prepare Environment (Kill Sessions)
    FOR r IN (SELECT item_name FROM tablespace_control WHERE item_type = 'SCHEMA') LOOP
        BEGIN
            EXECUTE IMMEDIATE 'ALTER USER ' || r.item_name || ' ACCOUNT LOCK';
            FOR s IN (SELECT sid, serial# FROM v$session WHERE username = r.item_name) LOOP
                EXECUTE IMMEDIATE 'ALTER SYSTEM KILL SESSION ''' || s.sid || ',' || s.serial# || ''' IMMEDIATE';
            END LOOP;
        EXCEPTION WHEN OTHERS THEN NULL; END;
    END LOOP;

    DBMS_OUTPUT.PUT_LINE('>> Parallel Batches Configured. Ready to Drop.');
END;
/

SCRIPT 02: Launch Parallel Tablespace Drops

Filename: 02_parallel_drop.sql

SET SERVEROUTPUT ON;

DECLARE
    v_fallback_ts VARCHAR2(50);
    v_jobs_count  NUMBER := 0;
BEGIN
    DBMS_OUTPUT.PUT_LINE('=== STEP 2: LAUNCHING PARALLEL DROPS ===');

    SELECT config_value INTO v_fallback_ts FROM tablespace_config WHERE config_key = 'FALLBACK_TS';

    -- Loop through control table and spawn a job for EACH tablespace immediately
    FOR rec IN (SELECT * FROM tablespace_control WHERE item_type = 'TABLESPACE' ORDER BY batch_id) LOOP
        DBMS_SCHEDULER.CREATE_JOB(
            job_name   => 'JOB_DROP_' || rec.batch_id,
            job_type   => 'PLSQL_BLOCK',
            job_action => q'[
                DECLARE
                    v_ts   VARCHAR2(50) := ']' || rec.item_name || q'[';
                    v_safe VARCHAR2(50) := ']' || v_fallback_ts || q'[';
                    v_cnt  NUMBER;
                BEGIN
                    -- 1. Evacuate Users
                    FOR u IN (SELECT username FROM dba_users WHERE default_tablespace = v_ts) LOOP
                        EXECUTE IMMEDIATE 'ALTER USER ' || u.username || ' DEFAULT TABLESPACE ' || v_safe;
                    END LOOP;

                    -- 2. Drop (The Heavy IO Operation)
                    SELECT count(*) INTO v_cnt FROM dba_tablespaces WHERE tablespace_name = v_ts;
                    IF v_cnt > 0 THEN
                        EXECUTE IMMEDIATE 'DROP TABLESPACE ' || v_ts || ' INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS';
                        INSERT INTO tablespace_log (item_name, action, status, message) VALUES (v_ts, 'DROP_TS', 'SUCCESS', 'Dropped via Parallel Job');
                    ELSE
                        INSERT INTO tablespace_log (item_name, action, status, message) VALUES (v_ts, 'DROP_TS', 'SKIPPED', 'Not Found');
                    END IF;
                    
                    UPDATE tablespace_control SET status = 'COMPLETED' WHERE item_name = v_ts;
                    COMMIT;
                EXCEPTION WHEN OTHERS THEN
                    INSERT INTO tablespace_log (item_name, action, status, message) VALUES (v_ts, 'DROP_TS', 'FAILED', SQLERRM);
                    UPDATE tablespace_control SET status = 'FAILED', error_msg = SQLERRM WHERE item_name = v_ts;
                    COMMIT;
                END;
            ]',
            enabled    => TRUE,
            auto_drop  => TRUE
        );
        v_jobs_count := v_jobs_count + 1;
        DBMS_OUTPUT.PUT_LINE('>> Spawned Thread #' || v_jobs_count || ' for: ' || rec.item_name);
    END LOOP;
    
    DBMS_OUTPUT.PUT_LINE('>> All ' || v_jobs_count || ' threads running. Proceed to Wait Script.');
END;
/

SCRIPT 03: The Blocker (Wait for Completion)

Filename: 03_wait_blocker.sql

SET SERVEROUTPUT ON;

DECLARE
    v_active_jobs NUMBER;
BEGIN
    DBMS_OUTPUT.PUT_LINE('=== WAITING FOR PARALLEL JOBS TO FINISH ===');
    
    LOOP
        SELECT COUNT(*) INTO v_active_jobs 
        FROM dba_scheduler_jobs 
        WHERE job_name LIKE 'JOB_%';
        
        IF v_active_jobs = 0 THEN
            EXIT;
        END IF;
        
        DBMS_OUTPUT.PUT_LINE('... ' || v_active_jobs || ' threads still active ...');
        
        -- Sleep 10s: use DBMS_SESSION.SLEEP (18c+), fall back to DBMS_LOCK.SLEEP on older versions
        BEGIN
            EXECUTE IMMEDIATE 'BEGIN DBMS_SESSION.SLEEP(10); END;';
        EXCEPTION WHEN OTHERS THEN
            BEGIN
                EXECUTE IMMEDIATE 'BEGIN DBMS_LOCK.SLEEP(10); END;';
            EXCEPTION WHEN OTHERS THEN
                NULL; -- last resort: spin without sleeping
            END;
        END;
    END LOOP;
    
    DBMS_OUTPUT.PUT_LINE('>> ALL THREADS FINISHED.');
END;
/

SCRIPT 04: Launch Parallel Object Sweepers

Filename: 04_parallel_sweep.sql

SET SERVEROUTPUT ON;

DECLARE
    v_jobs_count NUMBER := 0;
BEGIN
    DBMS_OUTPUT.PUT_LINE('=== STEP 4: LAUNCHING PARALLEL SWEEPERS ===');

    FOR rec IN (SELECT * FROM tablespace_control WHERE item_type = 'SCHEMA') LOOP
        DBMS_SCHEDULER.CREATE_JOB(
            job_name   => 'JOB_SWEEP_' || rec.batch_id,
            job_type   => 'PLSQL_BLOCK',
            job_action => q'[
                DECLARE
                    v_owner VARCHAR2(50) := ']' || rec.item_name || q'[';
                BEGIN
                    EXECUTE IMMEDIATE 'ALTER SESSION SET RECYCLEBIN = OFF';
                    
                    -- A. Parallel Schema Cleanup
                    FOR obj IN (
                        SELECT object_name, object_type FROM dba_objects WHERE owner = v_owner
                        AND object_type NOT LIKE 'SYSTEM%' AND object_type NOT LIKE 'LOB%' AND object_name NOT LIKE 'BIN$%'
                        ORDER BY CASE object_type 
                            WHEN 'SYNONYM' THEN 1 WHEN 'VIEW' THEN 2 WHEN 'SEQUENCE' THEN 3
                            WHEN 'PROCEDURE' THEN 4 WHEN 'PACKAGE' THEN 5 WHEN 'TABLE' THEN 6 
                            ELSE 7 END ASC
                    ) LOOP
                        BEGIN
                            IF obj.object_type = 'TABLE' THEN
                                EXECUTE IMMEDIATE 'DROP TABLE "'||v_owner||'"."'||obj.object_name||'" CASCADE CONSTRAINTS PURGE';
                            ELSE
                                EXECUTE IMMEDIATE 'DROP '||obj.object_type||' "'||v_owner||'"."'||obj.object_name||'"' || 
                                                  CASE WHEN obj.object_type = 'TYPE' THEN ' FORCE' ELSE '' END;
                            END IF;
                        EXCEPTION WHEN OTHERS THEN NULL; END;
                    END LOOP;

                    -- B. Parallel Public Synonym Cleanup
                    FOR p IN (SELECT synonym_name FROM dba_synonyms WHERE owner = 'PUBLIC' AND table_owner = v_owner) LOOP
                        BEGIN
                            EXECUTE IMMEDIATE 'DROP PUBLIC SYNONYM "' || p.synonym_name || '"';
                        EXCEPTION WHEN OTHERS THEN NULL; END;
                    END LOOP;

                    INSERT INTO tablespace_log (item_name, action, status, message) VALUES (v_owner, 'SWEEP', 'SUCCESS', 'Cleaned via Parallel Job');
                    UPDATE tablespace_control SET status = 'COMPLETED' WHERE item_name = v_owner;
                    COMMIT;
                EXCEPTION WHEN OTHERS THEN
                    INSERT INTO tablespace_log (item_name, action, status, message) VALUES (v_owner, 'SWEEP', 'FAILED', SQLERRM);
                    UPDATE tablespace_control SET status = 'FAILED', error_msg = SQLERRM WHERE item_name = v_owner;
                    COMMIT;
                END;
            ]',
            enabled    => TRUE,
            auto_drop  => TRUE
        );
        v_jobs_count := v_jobs_count + 1;
        DBMS_OUTPUT.PUT_LINE('>> Spawned Thread #' || v_jobs_count || ' for: ' || rec.item_name);
    END LOOP;
END;
/

SCRIPT 05: Status Monitor

Filename: 05_monitor.sql

SET LINESIZE 200 PAGESIZE 100;
COL item_name FORMAT A30
COL action FORMAT A15
COL status FORMAT A15
COL message FORMAT A60
COL log_time FORMAT A20

PROMPT === FINAL JOB STATUS ===
SELECT job_name, state, last_run_duration FROM dba_scheduler_jobs WHERE job_name LIKE 'JOB_%';

PROMPT === CONTROL STATUS ===
SELECT item_name, item_type, status, error_msg FROM tablespace_control ORDER BY batch_id;

PROMPT === OBJECT COUNT (Target: 0) ===
SELECT owner, count(*) as remaining_objects 
FROM dba_objects 
WHERE owner IN (SELECT item_name FROM tablespace_control WHERE item_type = 'SCHEMA')
GROUP BY owner;
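
One caveat on the job-status check: the jobs were created with auto_drop => TRUE, so finished jobs disappear from DBA_SCHEDULER_JOBS. Their outcome survives in the run history — a sketch:

PROMPT === JOB RUN HISTORY ===
SELECT job_name,
       status,
       error#,
       actual_start_date,
       run_duration
FROM   dba_scheduler_job_run_details
WHERE  job_name LIKE 'JOB_%'
ORDER BY actual_start_date DESC;

Any row with STATUS = 'FAILED' should be cross-checked against tablespace_log for the matching error message.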