Monday, September 8, 2025

AWS RDS EXPDP



Export Scripts Using DBMS_DATAPUMP (Run on Source RDS)

Because you cannot log in to the RDS host to run the expdp binary, we'll use the DBMS_DATAPUMP PL/SQL API instead. These scripts perform metadata-only exports (the API equivalent of expdp's CONTENT=METADATA_ONLY), excluding indexes and statistics. I've assumed the schemas are SCHEMA_01 to SCHEMA_07 based on your notation (small: 01, 02, 03, 07; big: 04, 05, 06). Adjust schema names if needed.

  • Small Schemas Export: Groups SCHEMA_01, SCHEMA_02, SCHEMA_03, SCHEMA_07 into a single dump file (small_schemas.dmp), no parallel (default 1).
  • Big Schemas Export: Exports SCHEMA_04, SCHEMA_05, SCHEMA_06 with parallel=24, multi-file dump (big_schemas_%U.dmp).
  • RDS Handling: Dump files are written to DATA_PUMP_DIR and uploaded to S3 for transfer to the target.
  • Robust Features: Job monitoring every 10 seconds, error handling with detach, logging, and status checks. Requires DATAPUMP_EXP_FULL_DATABASE role.
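
Before running either script, it is worth confirming that the connected user actually holds the required role; a quick check against the standard dictionary view:

sql
-- Confirm the session user holds the Data Pump roles
SELECT granted_role
FROM user_role_privs
WHERE granted_role IN ('DATAPUMP_EXP_FULL_DATABASE', 'DATAPUMP_IMP_FULL_DATABASE');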

Small Schemas Export Script

sql
DECLARE
    hdl NUMBER;                  -- Job handle
    job_state VARCHAR2(30);      -- Job status
    log_file VARCHAR2(100) := 'small_export.log';
    dump_file VARCHAR2(100) := 'small_schemas.dmp';
BEGIN
    -- Open export job for small schemas
    hdl := DBMS_DATAPUMP.OPEN(
        operation => 'EXPORT',
        job_mode => 'SCHEMA',
        job_name => 'SMALL_SCHEMAS_EXPORT_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MISS')
    );

    -- Add dump and log files
    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => dump_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE
    );
    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => log_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE
    );

    -- Specify schemas to export
    DBMS_DATAPUMP.METADATA_FILTER(
        handle => hdl,
        name => 'SCHEMA_LIST',
        value => '''SCHEMA_01'',''SCHEMA_02'',''SCHEMA_03'',''SCHEMA_07'''
    );

    -- Metadata only (the DBMS_DATAPUMP equivalent of expdp CONTENT=METADATA_ONLY)
    DBMS_DATAPUMP.DATA_FILTER(
        handle => hdl,
        name => 'INCLUDE_ROWS',
        value => 0
    );

    -- Exclude INDEX and STATISTICS
    DBMS_DATAPUMP.METADATA_FILTER(
        handle => hdl,
        name => 'EXCLUDE_PATH_EXPR',
        value => 'IN (''INDEX'', ''STATISTICS'')'
    );

    -- Start the job
    DBMS_DATAPUMP.START_JOB(handle => hdl);

    -- Monitor job progress (GET_STATUS needs a ku$_Status OUT argument;
    -- timeout => 10 makes each call wait up to 10 seconds for a status change)
    DECLARE
        status ku$_Status;
    BEGIN
        job_state := 'UNDEFINED';
        WHILE job_state NOT IN ('COMPLETED', 'STOPPED') LOOP
            DBMS_DATAPUMP.GET_STATUS(
                handle => hdl,
                mask => DBMS_DATAPUMP.KU$_STATUS_JOB_ERROR + DBMS_DATAPUMP.KU$_STATUS_JOB_STATUS,
                timeout => 10,
                job_state => job_state,
                status => status
            );
            DBMS_OUTPUT.PUT_LINE('Job State: ' || job_state);
        END LOOP;
    END;

    DBMS_DATAPUMP.DETACH(handle => hdl);
    DBMS_OUTPUT.PUT_LINE('Export completed successfully.');

EXCEPTION
    WHEN OTHERS THEN
        DBMS_DATAPUMP.DETACH(handle => hdl);
        DBMS_OUTPUT.PUT_LINE('Error: ' || SQLERRM);
        RAISE;
END;
/

-- Upload dump and log files to S3 (run as the RDS master user, e.g., rdsadm).
-- upload_to_s3 is a function returning a task ID; the transfer runs as a background task.
SELECT rdsadmin.rdsadmin_s3_tasks.upload_to_s3(
           p_bucket_name    => 'your-rds-s3-bucket',
           p_prefix         => 'small_schemas',     -- only upload files matching this name prefix
           p_s3_prefix      => 'dmp_files/small/',  -- destination folder in the bucket
           p_directory_name => 'DATA_PUMP_DIR'
       ) AS task_id
FROM DUAL;
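
The S3 upload runs as a background task. AWS documents that each task writes a log named dbtask-&lt;task ID&gt;.log to the BDUMP directory, so you can confirm completion without host access (&task_id below is a SQL*Plus substitution placeholder for the ID returned by the upload call):

sql
-- Check the result of an S3 transfer task
SELECT text
FROM TABLE(rdsadmin.rds_file_util.read_text_file(
    p_directory => 'BDUMP',
    p_filename  => 'dbtask-' || '&task_id' || '.log'));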

Big Schemas Export Script

sql
DECLARE
    hdl NUMBER;
    job_state VARCHAR2(30);
    log_file VARCHAR2(100) := 'big_export.log';
    dump_file VARCHAR2(100) := 'big_schemas_%U.dmp';
BEGIN
    hdl := DBMS_DATAPUMP.OPEN(
        operation => 'EXPORT',
        job_mode => 'SCHEMA',
        job_name => 'BIG_SCHEMAS_EXPORT_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MISS')
    );

    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => dump_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE
    );
    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => log_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE
    );

    DBMS_DATAPUMP.METADATA_FILTER(
        handle => hdl,
        name => 'SCHEMA_LIST',
        value => '''SCHEMA_04'',''SCHEMA_05'',''SCHEMA_06'''
    );

    -- Metadata only (the DBMS_DATAPUMP equivalent of expdp CONTENT=METADATA_ONLY)
    DBMS_DATAPUMP.DATA_FILTER(
        handle => hdl,
        name => 'INCLUDE_ROWS',
        value => 0
    );

    DBMS_DATAPUMP.METADATA_FILTER(
        handle => hdl,
        name => 'EXCLUDE_PATH_EXPR',
        value => 'IN (''INDEX'', ''STATISTICS'')'
    );

    -- Enable parallel=24
    DBMS_DATAPUMP.SET_PARALLEL(
        handle => hdl,
        degree => 24
    );

    DBMS_DATAPUMP.START_JOB(handle => hdl);

    -- Monitor job progress (GET_STATUS needs a ku$_Status OUT argument)
    DECLARE
        status ku$_Status;
    BEGIN
        job_state := 'UNDEFINED';
        WHILE job_state NOT IN ('COMPLETED', 'STOPPED') LOOP
            DBMS_DATAPUMP.GET_STATUS(
                handle => hdl,
                mask => DBMS_DATAPUMP.KU$_STATUS_JOB_ERROR + DBMS_DATAPUMP.KU$_STATUS_JOB_STATUS,
                timeout => 10,
                job_state => job_state,
                status => status
            );
            DBMS_OUTPUT.PUT_LINE('Job State: ' || job_state);
        END LOOP;
    END;

    DBMS_DATAPUMP.DETACH(handle => hdl);
    DBMS_OUTPUT.PUT_LINE('Export completed successfully.');

EXCEPTION
    WHEN OTHERS THEN
        DBMS_DATAPUMP.DETACH(handle => hdl);
        DBMS_OUTPUT.PUT_LINE('Error: ' || SQLERRM);
        RAISE;
END;
/

-- Upload dump files to S3 (upload_to_s3 is a function returning a task ID)
SELECT rdsadmin.rdsadmin_s3_tasks.upload_to_s3(
           p_bucket_name    => 'your-rds-s3-bucket',
           p_prefix         => 'big_schemas',     -- only upload files matching this name prefix
           p_s3_prefix      => 'dmp_files/big/',  -- destination folder in the bucket
           p_directory_name => 'DATA_PUMP_DIR'
       ) AS task_id
FROM DUAL;

Import Scripts Using DBMS_DATAPUMP (Run on Target RDS)

Use DBMS_DATAPUMP for import, with REMAP_TABLESPACE applied dynamically from a ts_mappings table (create/populate it on the target or use a dblink to source). Assumes new tablespaces exist on target. Imports are separate for small and big dump sets.

  • REMAP_TABLESPACE: Loops over mappings (e.g., old to new as per previous context).
  • RDS Handling: Downloads from S3 to DATA_PUMP_DIR.
  • Robust Features: Same as export, plus LOB transformation to SecureFile (RDS best practice).

First, create/populate ts_mappings on target (copy from source if needed):

sql
CREATE TABLE ts_mappings (
    old_tablespace VARCHAR2(30) NOT NULL PRIMARY KEY,
    new_tablespace VARCHAR2(30) NOT NULL
);

-- Populate with your 40 old -> 8 new mappings (example)
INSERT INTO ts_mappings VALUES ('PLO_DATA_01', 'TRU_DATA_01');
INSERT INTO ts_mappings VALUES ('PLO_DATA_02', 'TRU_DATA_01');
-- Add all mappings
COMMIT;
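
Before importing, a sanity check on the source helps: any tablespace that holds segments for these schemas but has no row in ts_mappings would be left unmapped. The owner pattern below assumes the SCHEMA_01..SCHEMA_07 naming used throughout:

sql
-- Source tablespaces with segments but no mapping (run on source, with ts_mappings available)
SELECT DISTINCT s.tablespace_name
FROM dba_segments s
WHERE s.owner LIKE 'SCHEMA\_0%' ESCAPE '\'
  AND s.tablespace_name NOT IN (SELECT old_tablespace FROM ts_mappings);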

Small Schemas Import Script

sql
DECLARE
    hdl NUMBER;
    job_state VARCHAR2(30);
    log_file VARCHAR2(100) := 'small_import.log';
    dump_file VARCHAR2(100) := 'small_schemas.dmp';
    task_id VARCHAR2(1000);
BEGIN
    -- Download from S3 (download_from_s3 is a function returning a task ID;
    -- the copy is asynchronous, so confirm in the task log that it has finished
    -- before the import starts reading the dump file)
    task_id := rdsadmin.rdsadmin_s3_tasks.download_from_s3(
        p_bucket_name => 'your-rds-s3-bucket',
        p_s3_prefix => 'dmp_files/small/',
        p_directory_name => 'DATA_PUMP_DIR'
    );
    DBMS_OUTPUT.PUT_LINE('S3 download task: ' || task_id);

    hdl := DBMS_DATAPUMP.OPEN(
        operation => 'IMPORT',
        job_mode => 'SCHEMA',
        job_name => 'SMALL_SCHEMAS_IMPORT_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MISS')
    );

    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => dump_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE
    );
    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => log_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE
    );

    -- Apply REMAP_TABLESPACE for all mappings
    FOR rec IN (SELECT old_tablespace, new_tablespace FROM ts_mappings) LOOP
        DBMS_DATAPUMP.METADATA_REMAP(
            handle => hdl,
            name => 'REMAP_TABLESPACE',
            old_value => rec.old_tablespace,
            value => rec.new_tablespace
        );
    END LOOP;

    -- Transform LOBs to SecureFile
    DBMS_DATAPUMP.METADATA_TRANSFORM(
        handle => hdl,
        name => 'LOB_STORAGE',
        value => 'SECUREFILE'
    );

    DBMS_DATAPUMP.START_JOB(handle => hdl);

    -- Monitor job progress (GET_STATUS needs a ku$_Status OUT argument)
    DECLARE
        status ku$_Status;
    BEGIN
        job_state := 'UNDEFINED';
        WHILE job_state NOT IN ('COMPLETED', 'STOPPED') LOOP
            DBMS_DATAPUMP.GET_STATUS(
                handle => hdl,
                mask => DBMS_DATAPUMP.KU$_STATUS_JOB_ERROR + DBMS_DATAPUMP.KU$_STATUS_JOB_STATUS,
                timeout => 10,
                job_state => job_state,
                status => status
            );
            DBMS_OUTPUT.PUT_LINE('Job State: ' || job_state);
        END LOOP;
    END;

    DBMS_DATAPUMP.DETACH(handle => hdl);
    DBMS_OUTPUT.PUT_LINE('Import completed successfully.');

EXCEPTION
    WHEN OTHERS THEN
        DBMS_DATAPUMP.DETACH(handle => hdl);
        DBMS_OUTPUT.PUT_LINE('Error: ' || SQLERRM);
        RAISE;
END;
/

Big Schemas Import Script

sql
DECLARE
    hdl NUMBER;
    job_state VARCHAR2(30);
    log_file VARCHAR2(100) := 'big_import.log';
    dump_file VARCHAR2(100) := 'big_schemas_%U.dmp';
    task_id VARCHAR2(1000);
BEGIN
    -- Download from S3 (a function returning a task ID; the copy is asynchronous,
    -- so confirm the task finished before the import starts reading the dump files)
    task_id := rdsadmin.rdsadmin_s3_tasks.download_from_s3(
        p_bucket_name => 'your-rds-s3-bucket',
        p_s3_prefix => 'dmp_files/big/',
        p_directory_name => 'DATA_PUMP_DIR'
    );
    DBMS_OUTPUT.PUT_LINE('S3 download task: ' || task_id);

    hdl := DBMS_DATAPUMP.OPEN(
        operation => 'IMPORT',
        job_mode => 'SCHEMA',
        job_name => 'BIG_SCHEMAS_IMPORT_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MISS')
    );

    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => dump_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE
    );
    DBMS_DATAPUMP.ADD_FILE(
        handle => hdl,
        filename => log_file,
        directory => 'DATA_PUMP_DIR',
        filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE
    );

    -- Apply REMAP_TABLESPACE
    FOR rec IN (SELECT old_tablespace, new_tablespace FROM ts_mappings) LOOP
        DBMS_DATAPUMP.METADATA_REMAP(
            handle => hdl,
            name => 'REMAP_TABLESPACE',
            old_value => rec.old_tablespace,
            value => rec.new_tablespace
        );
    END LOOP;

    -- LOB transform
    DBMS_DATAPUMP.METADATA_TRANSFORM(
        handle => hdl,
        name => 'LOB_STORAGE',
        value => 'SECUREFILE'
    );

    -- Parallel=24
    DBMS_DATAPUMP.SET_PARALLEL(
        handle => hdl,
        degree => 24
    );

    DBMS_DATAPUMP.START_JOB(handle => hdl);

    -- Monitor job progress (GET_STATUS needs a ku$_Status OUT argument)
    DECLARE
        status ku$_Status;
    BEGIN
        job_state := 'UNDEFINED';
        WHILE job_state NOT IN ('COMPLETED', 'STOPPED') LOOP
            DBMS_DATAPUMP.GET_STATUS(
                handle => hdl,
                mask => DBMS_DATAPUMP.KU$_STATUS_JOB_ERROR + DBMS_DATAPUMP.KU$_STATUS_JOB_STATUS,
                timeout => 10,
                job_state => job_state,
                status => status
            );
            DBMS_OUTPUT.PUT_LINE('Job State: ' || job_state);
        END LOOP;
    END;

    DBMS_DATAPUMP.DETACH(handle => hdl);
    DBMS_OUTPUT.PUT_LINE('Import completed successfully.');

EXCEPTION
    WHEN OTHERS THEN
        DBMS_DATAPUMP.DETACH(handle => hdl);
        DBMS_OUTPUT.PUT_LINE('Error: ' || SQLERRM);
        RAISE;
END;
/

Post-Import Validation and Cleanup

  • Validate Objects:
    sql
    SELECT object_type, COUNT(*) FROM dba_objects WHERE status = 'INVALID' GROUP BY object_type;
  • Gather Statistics (since excluded in export):
    sql
    EXEC DBMS_STATS.GATHER_DATABASE_STATS(cascade => TRUE, degree => 4);
  • Clean Up: Delete old dump files from DATA_PUMP_DIR if needed.
  • Notes: Replace placeholders (e.g., S3 bucket, schema names, mappings). If full data export is needed for big schemas, change CONTENT => 'METADATA' to 'ALL'. Test on a snapshot. If temporary tablespaces need remapping, add separate ALTER statements.
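
For the cleanup step, RDS offers no OS access, but UTL_FILE can delete files from DATA_PUMP_DIR; a minimal example using one of the dump files from this guide:

sql
-- Remove a dump file once it is safely in S3
EXEC UTL_FILE.FREMOVE('DATA_PUMP_DIR', 'small_schemas.dmp');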
=====================================================================

Step 2

I understand you're confused about monitoring Data Pump jobs in Oracle RDS using the DBMS_DATAPUMP API, especially in the context of your tablespace consolidation (40 old tablespaces to 8 new ones) and schema migrations (SCHEMA_01 to SCHEMA_07, with metadata-only exports for small schemas and parallel=24 for large schemas). Below, I’ll break down the steps to monitor these jobs clearly and concisely, using the robust scripts provided earlier. These steps assume you’re using the DBMS_DATAPUMP export/import scripts from previous responses and need to monitor their progress, errors, and resource usage in an RDS environment where expdp/impdp binaries are unavailable.


Overview

You’ll monitor Data Pump jobs (export/import) for:

  • Small schemas (SCHEMA_01, SCHEMA_02, SCHEMA_03, SCHEMA_07): Single dump file (small_schemas.dmp), metadata-only, no parallel.
  • Large schemas (SCHEMA_04, SCHEMA_05, SCHEMA_06): Multi-file dump (big_schemas_%U.dmp), metadata-only, parallel=24.
  • REMAP_TABLESPACE: Applied during import to map old tablespaces (e.g., PLO_DATA_01) to new ones (e.g., TRU_DATA_01).

Monitoring involves tracking job status, progress, errors, and RDS resource usage using DBMS_DATAPUMP.GET_STATUS, Data Pump views, and CloudWatch.


Steps to Monitor Data Pump Jobs

Step 1: Prepare the Environment

  1. Ensure Privileges: Log in as a user with DATAPUMP_EXP_FULL_DATABASE and DATAPUMP_IMP_FULL_DATABASE roles (e.g., rdsadm/frusi@db_name).
  2. Verify ts_mappings Table: Ensure the ts_mappings table exists on the source or target (for REMAP_TABLESPACE). Example:
    sql
    SELECT old_tablespace, new_tablespace FROM ts_mappings ORDER BY new_tablespace;
    • If missing, create/populate it (see previous responses for script).
  3. Create Monitoring Log Table: For historical tracking, create a table to store job status (optional but recommended).
    sql
    CREATE TABLE datapump_monitor_log (
        log_id NUMBER GENERATED ALWAYS AS IDENTITY,
        job_name VARCHAR2(100),
        operation VARCHAR2(30),
        job_mode VARCHAR2(30),
        state VARCHAR2(30),
        percent_done NUMBER,
        objects_processed NUMBER,
        total_work NUMBER,
        error_message VARCHAR2(4000),
        log_time DATE DEFAULT SYSDATE,
        CONSTRAINT pk_datapump_monitor_log PRIMARY KEY (log_id)
    );
  4. Check S3 Setup: Ensure your S3 bucket (your-rds-s3-bucket) is reachable from the instance (the S3_INTEGRATION option plus an IAM role on the instance). A quick sanity check is to list what is currently in DATA_PUMP_DIR:
    sql
    SELECT filename, filesize, mtime
    FROM TABLE(rdsadmin.rds_file_util.listdir(p_directory => 'DATA_PUMP_DIR'));

Step 2: Start Data Pump Jobs

Run the export/import scripts provided earlier (or modified versions). Each script obtains a job handle (hdl) from DBMS_DATAPUMP.OPEN; note that a handle is only valid in the session that created it, so a second session must attach to the job by name instead. If you haven’t run them yet, here’s a summary:

  • Export Small Schemas:
    • File: small_schemas.dmp
    • Schemas: SCHEMA_01, SCHEMA_02, SCHEMA_03, SCHEMA_07
    • Metadata-only, excludes INDEX, STATISTICS
    • Script: Use the “Small Schemas Export Script” from the previous response.
    • Note the job handle (hdl) from DBMS_DATAPUMP.OPEN.
  • Export Large Schemas:
    • File: big_schemas_%U.dmp
    • Schemas: SCHEMA_04, SCHEMA_05, SCHEMA_06
    • Metadata-only, parallel=24
    • Script: Use the “Big Schemas Export Script” from the previous response.
    • Note the job handle.
  • Import Small Schemas:
    • File: small_schemas.dmp
    • Applies REMAP_TABLESPACE from ts_mappings
    • Script: Use the “Small Schemas Import Script” from the previous response.
    • Note the job handle.
  • Import Large Schemas:
    • File: big_schemas_%U.dmp
    • Parallel=24, REMAP_TABLESPACE
    • Script: Use the “Big Schemas Import Script” from the previous response.
    • Note the job handle.
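
The generated job names (with their timestamps) can always be recovered from the dictionary if you lose track of them:

sql
-- List Data Pump jobs and their states
SELECT owner_name, job_name, operation, job_mode, state
FROM dba_datapump_jobs
ORDER BY job_name;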

Tip: A job handle is private to the session that created it, so you cannot pass hdl between sessions. To monitor from a second session, attach to the job by name (from DBA_DATAPUMP_JOBS) to obtain a fresh handle:

sql
DECLARE
    hdl NUMBER;
BEGIN
    hdl := DBMS_DATAPUMP.ATTACH(
        job_name  => 'SMALL_SCHEMAS_EXPORT_20250908',  -- replace with the actual job name
        job_owner => USER
    );
    DBMS_OUTPUT.PUT_LINE('Job Handle: ' || hdl);
    DBMS_DATAPUMP.DETACH(handle => hdl);
END;
/

Run each job in its own SQL*Plus/SQL Developer session so its DBMS_OUTPUT is easy to follow.

Step 3: Monitor Jobs in Real-Time (Using Job Handle)

Use the real-time monitoring script (monitor_datapump_job.sql) to track a specific job’s progress, errors, and completion status. This requires a handle for the job: either run the monitor in the session that opened the job, or attach to the job by name from a second session.

  1. Run the Monitoring Script:
    • Copy the script from the previous response (“Script 1: Real-Time Monitoring with DBMS_DATAPUMP.GET_STATUS”).
    • Obtain the handle by attaching to the job by name with DBMS_DATAPUMP.ATTACH (a handle created in another session cannot be reused).
    • Example:
      sql
      DECLARE
          hdl NUMBER;
          job_state VARCHAR2(30);
          status ku$_Status;   -- SQL object type populated by GET_STATUS
          done BOOLEAN := FALSE;
      BEGIN
          -- Handles are session-private: attach to the running job by name
          hdl := DBMS_DATAPUMP.ATTACH(
              job_name  => 'BIG_SCHEMAS_EXPORT_20250908',  -- replace with the actual job name
              job_owner => USER
          );
          WHILE NOT done LOOP
              DBMS_DATAPUMP.GET_STATUS(
                  handle => hdl,
                  mask => DBMS_DATAPUMP.KU$_STATUS_JOB_ERROR + DBMS_DATAPUMP.KU$_STATUS_JOB_STATUS,
                  timeout => 10,
                  job_state => job_state,
                  status => status
              );
              DBMS_OUTPUT.PUT_LINE('Job State: ' || job_state ||
                                   ' | Percent Done: ' || status.job_status.percent_done);
              IF status.error IS NOT NULL AND status.error.COUNT > 0 THEN
                  FOR i IN 1 .. status.error.COUNT LOOP
                      DBMS_OUTPUT.PUT_LINE('Error ' || status.error(i).errorNumber ||
                                           ': ' || status.error(i).LogText);
                  END LOOP;
              END IF;
              IF job_state IN ('COMPLETED', 'STOPPED') THEN
                  done := TRUE;
              END IF;
          END LOOP;
          DBMS_DATAPUMP.DETACH(handle => hdl);
      EXCEPTION
          WHEN OTHERS THEN
              DBMS_OUTPUT.PUT_LINE('Monitoring Error: ' || SQLERRM);
              RAISE;
      END;
      /
  2. What to Expect:
    • Output roughly every 10 seconds: job state (EXECUTING, COMPLETED, etc.), percent done, and any errors (e.g., ORA-00959 if a tablespace is missing).
    • Example output:
      text
      Job State: EXECUTING | Percent Done: 50
      Job State: COMPLETED | Percent Done: 100
    • Run in a separate session for each job (e.g., one for small schemas export, one for large schemas import).
  3. When to Run:
    • Start immediately after launching an export/import job.
    • Use a new session to avoid blocking the job.

Step 4: Monitor All Jobs via Database Views

Use the view-based monitoring script (monitor_datapump_views.sql) to check all running Data Pump jobs, including parallel sessions, without needing the job handle. This is useful for an overview or if you lose the handle.

  1. Run the Script:
    • Copy from the previous response (“Script 2: Monitor All Data Pump Jobs via Views”).
    • Example:
      sql
      SET SERVEROUTPUT ON;
      BEGIN
          DBMS_OUTPUT.PUT_LINE('--- Data Pump Jobs Status ---');
          FOR job_rec IN (
              SELECT job_name, owner_name, operation, job_mode, state, degree, attached_sessions
              FROM dba_datapump_jobs
              WHERE state NOT IN ('NOT RUNNING')
          ) LOOP
              DBMS_OUTPUT.PUT_LINE('Job: ' || job_rec.job_name || ' | State: ' || job_rec.state || 
                                   ' | Parallel: ' || job_rec.degree);
          END LOOP;
          DBMS_OUTPUT.PUT_LINE('--- Active Sessions ---');
          FOR sess_rec IN (
              SELECT dj.job_name, ds.sid, ds.serial#
              FROM dba_datapump_jobs dj
              JOIN dba_datapump_sessions ds ON dj.job_name = ds.job_name
              WHERE dj.state NOT IN ('NOT RUNNING')
          ) LOOP
              DBMS_OUTPUT.PUT_LINE('Job: ' || sess_rec.job_name || ' | SID: ' || sess_rec.sid);
          END LOOP;
          DBMS_OUTPUT.PUT_LINE('--- Progress ---');
          FOR longop_rec IN (
              SELECT opname, target, ROUND(sofar / totalwork * 100, 2) AS percent_complete
              FROM v$session_longops
              WHERE opname IN (SELECT job_name FROM dba_datapump_jobs) AND totalwork > 0
          ) LOOP
              DBMS_OUTPUT.PUT_LINE('Operation: ' || longop_rec.opname || ' | Progress: ' || longop_rec.percent_complete || '%');
          END LOOP;
      END;
      /
  2. What to Expect:
    • Lists all active jobs (e.g., SMALL_SCHEMAS_EXPORT_202509082312, BIG_SCHEMAS_IMPORT_202509082313).
    • Shows parallel sessions (up to 24 for large schemas).
    • Displays progress from V$SESSION_LONGOPS (e.g., 75% complete).
    • Example output:
      text
      --- Data Pump Jobs Status ---
      Job: BIG_SCHEMAS_EXPORT_202509082312 | State: EXECUTING | Parallel: 24
      Job: SMALL_SCHEMAS_IMPORT_202509082313 | State: COMPLETED | Parallel: 1
      --- Active Sessions ---
      Job: BIG_SCHEMAS_EXPORT_202509082312 | SID: 123
       --- Progress ---
       Operation: BIG_SCHEMAS_EXPORT_202509082312 | Progress: 60%
  3. When to Run:
    • Run periodically to check all jobs.
    • Useful for large schemas with parallel=24 to confirm all sessions are active.

Step 5: Log Monitoring Data for Auditing

Use the comprehensive monitoring script (monitor_datapump_log.sql) to log job details to a table for historical analysis or auditing.

  1. Run the Script:
    • Copy from the previous response (“Script 3: Comprehensive Monitoring with Logging to Table”).
    • Obtain the job’s handle via DBMS_DATAPUMP.ATTACH (by job name); handles do not carry across sessions.
    • Example:
      sql
      DECLARE
          hdl NUMBER;
          job_state VARCHAR2(30);
          status ku$_Status;
          err_msg VARCHAR2(4000);
          done BOOLEAN := FALSE;
      BEGIN
          -- Attach by name: a handle created in another session cannot be reused
          hdl := DBMS_DATAPUMP.ATTACH(
              job_name  => 'BIG_SCHEMAS_EXPORT_20250908',  -- replace with the actual job name
              job_owner => USER
          );
          WHILE NOT done LOOP
              DBMS_DATAPUMP.GET_STATUS(
                  handle => hdl,
                  mask => DBMS_DATAPUMP.KU$_STATUS_JOB_ERROR + DBMS_DATAPUMP.KU$_STATUS_JOB_STATUS,
                  timeout => 10,
                  job_state => job_state,
                  status => status
              );
              -- Collapse any error lines into one message for the log row
              err_msg := NULL;
              IF status.error IS NOT NULL AND status.error.COUNT > 0 THEN
                  FOR i IN 1 .. status.error.COUNT LOOP
                      err_msg := SUBSTR(err_msg || status.error(i).errorNumber || ': ' ||
                                        status.error(i).LogText || '; ', 1, 4000);
                  END LOOP;
              END IF;
              INSERT INTO datapump_monitor_log (
                  job_name, operation, job_mode, state, percent_done, error_message
              ) VALUES (
                  status.job_status.job_name, status.job_status.operation,
                  status.job_status.job_mode, job_state,
                  status.job_status.percent_done, err_msg
              );
              COMMIT;
              IF job_state IN ('COMPLETED', 'STOPPED') THEN
                  done := TRUE;
              END IF;
          END LOOP;
          DBMS_DATAPUMP.DETACH(handle => hdl);
      EXCEPTION
          WHEN OTHERS THEN
              INSERT INTO datapump_monitor_log (job_name, state, error_message)
              VALUES ('UNKNOWN', 'ERROR', 'Monitoring Error: ' || SQLERRM);
              COMMIT;
              RAISE;
      END;
      /
  2. Query the Log Table:
    sql
    SELECT job_name, state, percent_done, objects_processed, error_message, 
           TO_CHAR(log_time, 'YYYY-MM-DD HH24:MI:SS') AS log_time
    FROM datapump_monitor_log
    WHERE log_time > SYSDATE - 1/24
    ORDER BY log_time DESC;
  3. What to Expect:
    • Logs job status, progress, and errors to datapump_monitor_log.
    • Useful for auditing or debugging failed jobs.
    • Example output:
      text
      JOB_NAME                    STATE      PERCENT_DONE OBJECTS_PROCESSED ERROR_MESSAGE          LOG_TIME
      BIG_SCHEMAS_EXPORT_20250908 EXECUTING  50           100               NULL                   2025-09-08 23:12:45
      SMALL_SCHEMAS_IMPORT_20250908 COMPLETED 100         50                NULL                   2025-09-08 23:12:30
  4. When to Run:
    • Run alongside each job for persistent logging.
    • Query the table post-job to analyze failures or performance.

Step 6: Monitor RDS Resources

Use the resource monitoring script (monitor_rds_resources.sql) to track CPU, I/O, and storage usage during Data Pump jobs.

  1. Run the Script:
    • Copy from the previous response (“Script 4: RDS-Specific Resource Monitoring”).
    • Example:
      sql
      SET SERVEROUTPUT ON;
      BEGIN
          DBMS_OUTPUT.PUT_LINE('--- RDS Resource Usage ---');
          FOR sess_rec IN (
              SELECT s.sid, s.program, ROUND(ss.value / 100, 2) AS cpu_seconds
              FROM v$session s
              JOIN v$sesstat ss ON s.sid = ss.sid
              JOIN v$statname sn ON ss.statistic# = sn.statistic#
              WHERE (s.program LIKE '%(DM%' OR s.program LIKE '%(DW%')  -- Data Pump master/worker processes
              AND sn.name = 'CPU used by this session'
          ) LOOP
              DBMS_OUTPUT.PUT_LINE('SID: ' || sess_rec.sid || ' | CPU: ' || sess_rec.cpu_seconds || ' s');
          END LOOP;
          -- Note: V$ACTIVE_SESSION_HISTORY requires the Diagnostics Pack license
          FOR io_rec IN (
              SELECT event, COUNT(*) AS event_count
              FROM v$active_session_history
              WHERE (program LIKE '%(DM%' OR program LIKE '%(DW%')
                AND sample_time > SYSDATE - 1/24
              GROUP BY event
          ) LOOP
              DBMS_OUTPUT.PUT_LINE('Event: ' || io_rec.event || ' | Count: ' || io_rec.event_count);
          END LOOP;
      END;
      /
  2. Check CloudWatch:
    • Log in to AWS Console > RDS > Your instance > Monitoring tab.
    • Monitor metrics: CPUUtilization, ReadIOPS, WriteIOPS, FreeStorageSpace, DatabaseConnections.
    • Set alerts (e.g., CPUUtilization > 80%).
  3. What to Expect:
    • Identifies Data Pump sessions and their CPU/I/O usage.
    • Example output:
      text
      --- RDS Resource Usage ---
       SID: 123 | CPU: 150.25 s
      Event: db file sequential read | Count: 500
  4. When to Run:
    • Run during large schema imports/exports (parallel=24) to detect resource bottlenecks.

Step 7: Access Log Files

Data Pump log files (small_export.log, big_import.log, etc.) in DATA_PUMP_DIR provide detailed job information.

  1. List Files (DBA_DIRECTORIES holds no filenames; use the RDS directory-listing utility):
    sql
    SELECT filename, filesize, mtime
    FROM TABLE(rdsadmin.rds_file_util.listdir(p_directory => 'DATA_PUMP_DIR'));
  2. Download from S3 (download_from_s3 is a function returning a task ID; the copy runs asynchronously):
    sql
    SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
               p_bucket_name => 'your-rds-s3-bucket',
               p_s3_prefix => 'dmp_files/',
               p_directory_name => 'DATA_PUMP_DIR'
           ) AS task_id
    FROM DUAL;
  3. Read Logs:
    • Read a log directly from DATA_PUMP_DIR to check for errors or object counts:
    sql
    SELECT text
    FROM TABLE(rdsadmin.rds_file_util.read_text_file('DATA_PUMP_DIR', 'small_export.log'));

Step 8: Post-Job Validation

After jobs complete, validate the import and REMAP_TABLESPACE:

  1. Check Object Placement (DBA_OBJECTS has no tablespace column; use DBA_SEGMENTS):
    sql
    SELECT owner, segment_type, tablespace_name, COUNT(*)
    FROM dba_segments
    WHERE tablespace_name IN (SELECT new_tablespace FROM ts_mappings)
    GROUP BY owner, segment_type, tablespace_name;
  2. Check Invalid Objects:
    sql
    SELECT object_type, owner, object_name
    FROM dba_objects
    WHERE status = 'INVALID';
  3. Verify User Defaults (post-ALTER USER):
    sql
    SELECT username, default_tablespace, temporary_tablespace
    FROM dba_users
    WHERE default_tablespace IN (SELECT new_tablespace FROM ts_mappings);
  4. Gather Statistics (since excluded in export):
    sql
    EXEC DBMS_STATS.GATHER_DATABASE_STATS(cascade => TRUE, degree => 4);
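
If the invalid-object check in step 2 turns anything up, a bulk recompile usually clears it:

sql
EXEC UTL_RECOMP.RECOMP_SERIAL();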

Execution Order

  1. Before Jobs:
    • Create ts_mappings and datapump_monitor_log tables (Step 1).
    • Ensure S3 bucket is set up.
  2. During Jobs:
    • Start export/import scripts (Step 2).
    • Run Script 1 or 3 for real-time monitoring with the job handle (Step 3 or 5).
    • Run Script 2 periodically for an overview of all jobs (Step 4).
    • Run Script 4 to check RDS resources (Step 6).
  3. After Jobs:
    • Download log files (Step 7).
    • Validate objects and user defaults (Step 8).
    • Query datapump_monitor_log for audit trail.

Troubleshooting Tips

  • Job Stuck in EXECUTING: Check DBA_DATAPUMP_SESSIONS for active sessions. If fewer than expected (e.g., <24 for large schemas), investigate locks or resource limits.
  • Errors (e.g., ORA-00959): Ensure all old tablespaces in ts_mappings match source objects. Add missing ones with fallback (NEW_DATA8).
  • High CPU/IOPS: Scale RDS instance or reduce parallelism (e.g., from 24 to 16).
  • Log Access: If S3 download fails, verify IAM roles and bucket permissions.
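
For a job that is genuinely stuck, you can attach by name and stop it; the job name below is a placeholder (take the real one from DBA_DATAPUMP_JOBS):

sql
DECLARE
    hdl NUMBER;
BEGIN
    hdl := DBMS_DATAPUMP.ATTACH(
        job_name  => 'BIG_SCHEMAS_EXPORT_20250908',  -- placeholder
        job_owner => USER
    );
    -- immediate => 1 aborts worker processes; keep_master => 0 drops the master table
    DBMS_DATAPUMP.STOP_JOB(handle => hdl, immediate => 1, keep_master => 0);
END;
/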

Customization

If you provide:

  • Specific job names or schema names.
  • Full ts_mappings table data (40 old to 8 new tablespaces).
  • Oracle version (e.g., 19c).

I can refine the scripts further with those details. For now, these steps and scripts are RDS-compatible and cover the core monitoring needs for your Data Pump jobs. Run them in SQL*Plus or SQL Developer, and let me know if you need clarification or additional features!
