Feed aggregator

RMAN: Synchronize standby database using production archivelog

Michael Dinh - Tue, 2018-08-21 18:39

I know what you are thinking: "Why is this nut of a DBA writing a shell script to synchronize a standby with archivelogs?!"

It just happens that the environment is very restrictive and NO changes can be made without going through change control.

In one week there's a planned switchover to a RAC standby, and it would be nice to have the standby duplicated and ready for the switchover.

How is this going to work when the standby will lag for days until the switchover?

Have no fear, there’s a script for that.

# Primary archivelog resides on shared folder with Standby.
# MANAGED REAL TIME APPLY is running.
PRI: /shared/prod/DB01/arch/
SBY: /shared/backup/arch/DB01/

#!/bin/sh 
# rman_cat_arc.sh
# Michael Dinh Aug 21, 2018
#
# Don't forget to set environment here.
#
set -x
# list the 5 most recent archivelogs
ls -lrt /shared/prod/DB01/arch/|tail -5
# copy archivelogs created in the last hour (cron runs this script every hour).
/bin/find /shared/prod/DB01/arch/ -type f -mmin -60 -exec cp -ufv {} /shared/backup/arch/DB01/ \;
# 
rman msglog /tmp/rman_cat_arc.log > /dev/null << EOF
set echo on;
connect target;
# delete archivelogs older than 3 hours
delete force noprompt archivelog until time 'sysdate-3/24';
catalog start with '/shared/backup/arch/DB01/' noprompt;
EOF
# Review the last lines of the alert log (no -f here, or the cron job would never exit)
tail -20 $ORACLE_BASE/diag/rdbms/$ORACLE_UNQNAME/$ORACLE_SID/trace/alert_$ORACLE_SID.log
exit
 
# crontab
26 * * * * /home/oracle/rman_cat_arc.sh > /tmp/rman_cat_arc.sh.out 2>&1

# logs
$ ll /tmp/rman*
-rw-r--r--. 1 oracle oinstall 2664 Aug 21 11:26 /tmp/rman_cat_arc.log
-rw-r--r--. 1 oracle oinstall 1852 Aug 21 11:26 /tmp/rman_cat_arc.sh.out

# alert log 
Tue Aug 21 08:26:09 2018
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86391.arc
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86392.arc
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86393.arc
Media Recovery Waiting for thread 1 sequence 86394
Tue Aug 21 09:26:06 2018
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86394.arc
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86395.arc
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86396.arc
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86397.arc
Media Recovery Waiting for thread 1 sequence 86398
Tue Aug 21 10:26:08 2018
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86398.arc
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86399.arc
Media Recovery Waiting for thread 1 sequence 86400
Tue Aug 21 11:26:06 2018
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86400.arc
Media Recovery Log /shared/backup/arch/DB01/DB01_1_717897269_86401.arc
Media Recovery Waiting for thread 1 sequence 86402
Tue Aug 21 11:49:03 2018

select PID,inst_id inst,thread#,client_process,process,status,sequence#,block#,DELAY_MINS
from gv$managed_standby
where 1=1
and status not in ('CLOSING','IDLE','CONNECTED')
order by status desc, thread#, sequence#
;

# Manual recovery: WAIT_FOR_LOG and BLOCK#=0, which never increments.
*** gv$managed_standby ***
                        CLIENT                                               DELAY
     PID  INST  THREAD# PROCESS    PROCESS  STATUS       SEQUENCE#   BLOCK#   MINS
-------- ----- -------- ---------- -------- ------------ --------- -------- ------
  411795     4        1 N/A        MRP0     WAIT_FOR_LOG     86178        0      0
  
# MANAGED REAL TIME APPLY: APPLYING_LOG and BLOCK#>0 increments
*** gv$managed_standby ***
                        CLIENT                                               DELAY
     PID  INST  THREAD# PROCESS    PROCESS  STATUS       SEQUENCE#   BLOCK#   MINS
-------- ----- -------- ---------- -------- ------------ --------- -------- ------
  245652     4        1 N/A        MRP0     APPLYING_LOG     86410      472      0  
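
For reference, if the standby shows the manual-recovery pattern above (WAIT_FOR_LOG with BLOCK#=0 never moving), switching MRP to managed recovery with real-time apply takes only two statements. This is a minimal sketch, assuming a physical standby with standby redo logs; without redo transport MRP simply waits for the next cataloged archivelog:

-- run on the standby as SYSDBA
alter database recover managed standby database cancel;
alter database recover managed standby database using current logfile disconnect from session;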

Duplicate columns in Oracle query using row limiting clause

Tom Kyte - Tue, 2018-08-21 16:06
Since Oracle 12c, we can finally use the SQL standard row limiting clause like this: <code>SELECT * FROM t FETCH FIRST 10 ROWS ONLY</code> Now, in Oracle 12.1, there was a limitation that is quite annoying when joining tables. It's not possible...
Categories: DBA Blogs

Blocks - 2 or 4 or 8k?

Tom Kyte - Tue, 2018-08-21 16:06
What benefits can be gained from changing from 2k blocks to 4k blocks or even 8k blocks
Categories: DBA Blogs

How to get session id (without using V$ tables )

Tom Kyte - Tue, 2018-08-21 16:06
How to get session identifier (without using V$ tables). -- Normally DBA deny access to V$tables to developers. To get the os_user if we query the following we will get os_user. SELECT SYS_CONTEXT ('USERENV', 'OS_USER') F...
Categories: DBA Blogs

PUBLIC SYNONYM is not compiling using execute immediate

Tom Kyte - Tue, 2018-08-21 16:06
Hi There, I am having a problem trying to compile public synonyms using a block. The block runs fine - however, it doesn't compile the public synonym. When I run the query select * from all_objects where status = 'INVALID', the public synonyms are...
Categories: DBA Blogs

Restoring a database without having any controlfile backup

Yann Neuhaus - Tue, 2018-08-21 09:00

It should never happen, but sometimes it does. You just lost your datafiles as well as your fast recovery area (probably because, despite the recommendations, these areas often sit on the same disks).

Normal restore operations with RMAN are quite easy and safe as long as you have backupsets for the database, the archivelogs, and the spfile/controlfile (see the RMAN sketch after the steps):

Step 1 – restore the spfile and start the instance
Step 2 – restore the controlfile and mount the database
Step 3 – restore the database (meaning the datafiles)
Step 4 – recover the database as far as possible (by applying archivelogs)
Step 5 – open the database in (no)resetlogs
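
In RMAN terms, that standard flow looks roughly like the sketch below, assuming controlfile/spfile autobackups are available in the default location (you may need SET DBID first):

# step 1: restore the spfile and start the instance (RMAN starts a dummy instance if no parameter file exists)
RMAN> startup force nomount;
RMAN> restore spfile from autobackup;
RMAN> startup force nomount;
# step 2: restore the controlfile and mount the database
RMAN> restore controlfile from autobackup;
RMAN> alter database mount;
# steps 3 and 4: restore the datafiles and apply the archivelogs
RMAN> restore database;
RMAN> recover database;
# step 5: open the database
RMAN> alter database open resetlogs;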

If you cannot get through step 2 because you don't have any controlfile backup, you can't go any further with RMAN, that's it. But there is another way to get a working controlfile back.

Not having the spfile is annoying, but it only holds instance parameters, nothing really important for your data. You can recreate a pfile (you will probably convert it to an spfile later) by picking up the non-default parameters from the alert_SID.log; they are listed just after the last startup of the instance. Or you can create a very basic pfile with very few parameters: at least db_name, and for this example I also need the compatible parameter and a temporary fast recovery area for an easy restore of the archivelogs.

vi /u01/oradata/DBTEST1/initDBTEST1.ora
*.db_name='DBTEST1'
control_files='/u01/oradata/DBTEST1/control01.dbf'
compatible=12.1.0.2
db_recovery_file_dest='/u01/oradata/fast_recovery_area/'
db_recovery_file_dest_size=10G

 

Fortunately, you remember where you put the backups, and you find this:

oracle@vmoratest1:/oracle/backup/ [DBTEST1] ls -lrt
total 189828
-rw-r-----. 1 oracle oinstall   4333568 Aug 20 14:47 DB_34tb1lfb_1_1
-rw-r-----. 1 oracle oinstall     98304 Aug 20 14:47 DB_36tb1lfd_1_1
-rw-r-----. 1 oracle oinstall  54304768 Aug 20 14:47 DB_33tb1lfb_1_1
-rw-r-----. 1 oracle oinstall 121438208 Aug 20 14:47 DB_32tb1lfb_1_1
-rw-r-----. 1 oracle oinstall     92672 Aug 20 14:49 ARC_3atb1lj7_1_1
-rw-r-----. 1 oracle oinstall   1730560 Aug 20 14:49 ARC_39tb1lj7_1_1
-rw-r-----. 1 oracle oinstall   5758464 Aug 20 14:49 ARC_38tb1lj7_1_1
-rw-r-----. 1 oracle oinstall   6619648 Aug 20 14:49 ARC_37tb1lj7_1_1

 

First of all, start the instance.

sqlplus / as sysdba
SQL> startup nomount pfile='/u01/oradata/DBTEST1/initDBTEST1.ora';

 

After trying to restore the controlfile from the backuppieces inside the backup directory, you find that none of them contains a controlfile:

rman target /
RMAN> restore controlfile from '/oracle/backup/DB_36tb1lfb_1_1';

Starting restore at 20-AUG-2018 15:53:08
using channel ORA_DISK_1

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 08/20/2018 15:53:08
RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece

RMAN> restore controlfile from '/oracle/backup/DB_32tb1lfb_1_1';

Starting restore at 20-AUG-2018 15:53:21
using channel ORA_DISK_1

channel ORA_DISK_1: restoring control file
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 08/20/2018 15:53:21
ORA-19697: standby control file not found in backup set

RMAN> restore controlfile from '/oracle/backup/ARC_3atb1lj7_1_1';

Starting restore at 20-AUG-2018 15:53:56
using channel ORA_DISK_1

channel ORA_DISK_1: restoring control file
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 08/20/2018 15:53:56
ORA-19870: error while restoring backup piece /oracle/backup/ARC_3atb1lj7_1_1
ORA-19626: backup set type is archived log - can not be processed by this conversation

...

 

Having a started instance is always better than nothing: through this instance you can access many things without actually having a real database. For example, you can use the dbms_backup_restore package, which is able to restore datafiles without any controlfile. Very useful for us now. You can easily restore a datafile from a backuppiece, but you have to provide the datafile number. A few lines of PL/SQL can help you restore all the datafiles from all the available backuppieces.

cd /u01/oradata/DBTEST1/
vi resto.sql

set serveroutput on
declare
        v_dev           varchar2(30) ;
        v_rest_ok       boolean;
        v_df_num        number := 1;
        v_df_max        number := 30;
        v_bck_piece     varchar2(256) := '&1';
        v_rest_folder   varchar2(226) := '/u01/oradata/DBTEST1/';
        v_rest_df       varchar2(256);
begin
       v_dev := dbms_backup_restore.deviceallocate;
       while v_df_num <= v_df_max loop
                v_rest_df := v_rest_folder||'DF_'||lpad(v_df_num,4,'0')||'.dbf';
                dbms_backup_restore.restoreSetDatafile;
                dbms_backup_restore.restoreDataFileTo(dfnumber=>v_df_num,toname=>v_rest_df);
                BEGIN
                        dbms_backup_restore.restoreBackupPiece(done=>v_rest_ok,handle=>v_bck_piece);
                EXCEPTION
                        WHEN OTHERS
                        THEN
                                v_rest_ok := FALSE;
                                -- dbms_output.put_line('Datafile '||v_df_num||' is not in this piece');
                END;
                if v_rest_ok THEN
                        dbms_output.put_line('Datafile '||v_df_num||' is restored : '||v_rest_df);
                end if;
                v_df_num := v_df_num + 1;
        end loop;
        dbms_backup_restore.deviceDeallocate;
end;
/
exit;

 

Let’s iterate this anonymous PL/SQL block for each backuppiece in your backup folder:

for a in `find /oracle/backup/ -name 'DB*'`; do sqlplus -s / as sysdba @resto $a; done;

old   6:     v_bck_piece    varchar2(256) := '&1';
new   6:     v_bck_piece    varchar2(256) := '/oracle/backup/DB_32tb1lfb_1_1';
Datafile 1 is restored : /u01/oradata/DBTEST1/DF_0001.dbf
Datafile 4 is restored : /u01/oradata/DBTEST1/DF_0004.dbf
Datafile 9 is restored : /u01/oradata/DBTEST1/DF_0009.dbf

PL/SQL procedure successfully completed.

old   6:     v_bck_piece    varchar2(256) := '&1';
new   6:     v_bck_piece    varchar2(256) := '/oracle/backup/DB_33tb1lfb_1_1';
Datafile 2 is restored : /u01/oradata/DBTEST1/DF_0002.dbf
Datafile 7 is restored : /u01/oradata/DBTEST1/DF_0007.dbf
Datafile 8 is restored : /u01/oradata/DBTEST1/DF_0008.dbf

PL/SQL procedure successfully completed.

old   6:     v_bck_piece    varchar2(256) := '&1';
new   6:     v_bck_piece    varchar2(256) := '/oracle/backup/DB_36tb1lfd_1_1';

PL/SQL procedure successfully completed.

old   6:     v_bck_piece    varchar2(256) := '&1';
new   6:     v_bck_piece    varchar2(256) := '/oracle/backup/DB_34tb1lfb_1_1';
Datafile 3 is restored : /u01/oradata/DBTEST1/DF_0003.dbf
Datafile 5 is restored : /u01/oradata/DBTEST1/DF_0005.dbf
Datafile 6 is restored : /u01/oradata/DBTEST1/DF_0006.dbf

PL/SQL procedure successfully completed.

 

Well done! 9 datafiles were restored. Now look at your folder: you'll find the 9 datafiles, in fact your whole database if your backup is complete:

ls -lrt /u01/oradata/DBTEST1/

total 2017372
-rw-r--r--. 1 oracle oinstall      1035 Aug 20 23:05 resto.sql
-rw-r--r--. 1 oracle oinstall        91 Aug 20 23:12 initDBTEST1.ora
-rw-r-----. 1 oracle oinstall 734011392 Aug 20 23:15 DF_0001.dbf
-rw-r-----. 1 oracle oinstall   5251072 Aug 20 23:15 DF_0004.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0009.dbf
-rw-r-----. 1 oracle oinstall 576724992 Aug 20 23:15 DF_0002.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0007.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0008.dbf
-rw-r-----. 1 oracle oinstall 487596032 Aug 20 23:15 DF_0003.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0005.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0006.dbf

 

You can now manually create the controlfile with these datafiles (you just have to remember the characterset of your database):

sqlplus / as sysdba

CREATE CONTROLFILE REUSE DATABASE "DBTEST1" RESETLOGS  ARCHIVELOG
      MAXLOGFILES 16
      MAXLOGMEMBERS 3
      MAXDATAFILES 100
      MAXINSTANCES 8
      MAXLOGHISTORY 2073
LOGFILE
    GROUP 1 '/u01/oradata/DBTEST1/redo01.rdo'  SIZE 100M BLOCKSIZE 512,
    GROUP 2 '/u01/oradata/DBTEST1/redo02.rdo'  SIZE 100M BLOCKSIZE 512,
    GROUP 3 '/u01/oradata/DBTEST1/redo03.rdo'  SIZE 100M BLOCKSIZE 512
DATAFILE
    '/u01/oradata/DBTEST1/DF_0001.dbf',
    '/u01/oradata/DBTEST1/DF_0002.dbf',
    '/u01/oradata/DBTEST1/DF_0003.dbf',
    '/u01/oradata/DBTEST1/DF_0004.dbf',
    '/u01/oradata/DBTEST1/DF_0005.dbf',
    '/u01/oradata/DBTEST1/DF_0006.dbf',
    '/u01/oradata/DBTEST1/DF_0007.dbf',
    '/u01/oradata/DBTEST1/DF_0008.dbf',
    '/u01/oradata/DBTEST1/DF_0009.dbf'
CHARACTER SET AL32UTF8 ;


Control file created.

 

ls -lrt /u01/oradata/DBTEST1/

total 2029804
-rw-r--r--. 1 oracle oinstall      1035 Aug 20 23:05 resto.sql
-rw-r--r--. 1 oracle oinstall        91 Aug 20 23:12 initDBTEST1.ora
-rw-r-----. 1 oracle oinstall 734011392 Aug 20 23:15 DF_0001.dbf
-rw-r-----. 1 oracle oinstall   5251072 Aug 20 23:15 DF_0004.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0009.dbf
-rw-r-----. 1 oracle oinstall 576724992 Aug 20 23:15 DF_0002.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0007.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0008.dbf
-rw-r-----. 1 oracle oinstall 487596032 Aug 20 23:15 DF_0003.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0005.dbf
-rw-r-----. 1 oracle oinstall  52436992 Aug 20 23:15 DF_0006.dbf
-rw-r-----. 1 oracle oinstall  12730368 Aug 20 23:24 control01.dbf

 

What a relief to see pfile, controlfile and datafiles all together again!

Work is not yet finished because the datafiles are probably inconsistent. There is no need to mount the database as the CREATE CONTROLFILE already did that, and it's now possible to catalog all your backuppieces so that RMAN knows about them:

rman target /
catalog start with '/oracle/backup/';

using target database control file instead of recovery catalog
searching for all files that match the pattern /oracle/backup/

List of Files Unknown to the Database
=====================================
File Name: /oracle/backup/DB_32tb1lfb_1_1
File Name: /oracle/backup/ARC_37tb1lj7_1_1
File Name: /oracle/backup/ARC_39tb1lj7_1_1
File Name: /oracle/backup/DB_33tb1lfb_1_1
File Name: /oracle/backup/DB_36tb1lfd_1_1
File Name: /oracle/backup/ARC_3atb1lj7_1_1
File Name: /oracle/backup/DB_34tb1lfb_1_1
File Name: /oracle/backup/ARC_38tb1lj7_1_1

Do you really want to catalog the above files (enter YES or NO)? YES

cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /oracle/backup/DB_32tb1lfb_1_1
File Name: /oracle/backup/ARC_37tb1lj7_1_1
File Name: /oracle/backup/ARC_39tb1lj7_1_1
File Name: /oracle/backup/DB_33tb1lfb_1_1
File Name: /oracle/backup/DB_36tb1lfd_1_1
File Name: /oracle/backup/ARC_3atb1lj7_1_1
File Name: /oracle/backup/DB_34tb1lfb_1_1
File Name: /oracle/backup/ARC_38tb1lj7_1_1

 

You now need to restore the archivelogs:

RMAN> restore archivelog all;

Starting restore at 20-AUG-2018 23:43:47
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=23 device type=DISK

channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=44
channel ORA_DISK_1: reading from backup piece /oracle/backup/ARC_39tb1lj7_1_1
channel ORA_DISK_1: piece handle=/oracle/backup/ARC_39tb1lj7_1_1 tag=TAG20180820T144911
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=45
channel ORA_DISK_1: reading from backup piece /oracle/backup/ARC_38tb1lj7_1_1
channel ORA_DISK_1: piece handle=/oracle/backup/ARC_38tb1lj7_1_1 tag=TAG20180820T144911
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=46
channel ORA_DISK_1: reading from backup piece /oracle/backup/ARC_37tb1lj7_1_1
channel ORA_DISK_1: piece handle=/oracle/backup/ARC_37tb1lj7_1_1 tag=TAG20180820T144911
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=47
channel ORA_DISK_1: reading from backup piece /oracle/backup/ARC_3atb1lj7_1_1
channel ORA_DISK_1: piece handle=/oracle/backup/ARC_3atb1lj7_1_1 tag=TAG20180820T144911
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 20-AUG-2018 23:43:52

 

Now it’s probably possible to recover the database:

sqlplus / as sysdba

SQL> recover database until cancel using backup controlfile;
ORA-00279: change 1386561 generated at 08/20/2018 14:47:07 needed for thread 1
ORA-00289: suggestion :
/u01/oradata/fast_recovery_area/DBTEST1/archivelog/2018_08_21/o1_mf_1_47_fqplx2l
n_.arc
ORA-00280: change 1386561 for thread 1 is in sequence #47


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
AUTO
ORA-00279: change 1386635 generated at 08/20/2018 14:49:10 needed for thread 1
ORA-00289: suggestion :
/u01/oradata/fast_recovery_area/DBTEST1/archivelog/2018_08_21/o1_mf_1_48_%u_.arc
ORA-00280: change 1386635 for thread 1 is in sequence #48
ORA-00278: log file
'/u01/oradata/fast_recovery_area/DBTEST1/archivelog/2018_08_21/o1_mf_1_47_fqplx2
ln_.arc' no longer needed for this recovery


ORA-00308: cannot open archived log
'/u01/oradata/fast_recovery_area/DBTEST1/archivelog/2018_08_21/o1_mf_1_48_%u_.ar
c'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

 

The last error is expected: Oracle simply doesn't know that sequence 48 never existed.
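
If you want to double-check which logs the recovery could work with, a quick look at the archivelog metadata now registered in the new controlfile does the trick (just an optional sanity check):

select sequence#, first_change#, next_change#, name
from v$archived_log
order by sequence#;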

Now all the archivelogs are applied; fingers crossed for the last operation, which is supposed to bring the database back to life:

SQL> alter database open resetlogs;

Database altered.

SQL> select instance_name, status from v$instance;

INSTANCE_NAME     STATUS
---------------- ------------
DBTEST1      OPEN

SQL> select file_name from dba_data_files;

FILE_NAME
--------------------------------------------------------------------------------
/u01/oradata/DBTEST1/DF_0001.dbf
/u01/oradata/DBTEST1/DF_0002.dbf
/u01/oradata/DBTEST1/DF_0003.dbf
/u01/oradata/DBTEST1/DF_0004.dbf
/u01/oradata/DBTEST1/DF_0005.dbf
/u01/oradata/DBTEST1/DF_0007.dbf
/u01/oradata/DBTEST1/DF_0006.dbf
/u01/oradata/DBTEST1/DF_0009.dbf
/u01/oradata/DBTEST1/DF_0008.dbf

9 rows selected.

Yes, everything is OK!!! Apart from generic names for your datafiles, a single controlfile, no spfile, default-configured redologs and probably no temporary tablespace. But the database is up and running, and you feel like a hero. Or you just managed to keep your job ;-)
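
Some of those loose ends are quickly tied up. As a hedged example (file name and size are assumptions to adapt), you can re-add a tempfile, since the recreated controlfile has no tempfiles registered:

-- hypothetical path and size: adapt to your layout
alter tablespace temp add tempfile '/u01/oradata/DBTEST1/temp01.dbf'
  size 512m autoextend on next 64m maxsize 4g;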

 

This article, Restoring a database without having any controlfile backup, appeared first on Blog dbi services.

When we do a pg_dump and right afterwards truncate a table which is in the dump, what happens?

Yann Neuhaus - Tue, 2018-08-21 01:16

Being at customer sites is always the best way to learn. Today, while discussing that pg_dump will always produce a consistent dump because it uses the "repeatable read" isolation level, this question came up: what happens when we dump a database and, while the dump is running, truncate a table in that database? Does that block? Well, the answer is in the documentation: "pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers)."

What is not in the documentation is that pg_dump uses the “repeatable read” isolation level, but it is documented in the source code:

postgres@pgbox:/home/postgres/postgresql-10.4/ [PG10] vi src/bin/pg_dump/pg_dump.c
...
 *      Note that pg_dump runs in a transaction-snapshot mode transaction,
 *      so it sees a consistent snapshot of the database including system
 *      catalogs. However, it relies in part on various specialized backend
 *      functions like pg_get_indexdef(), and those things tend to look at
 *      the currently committed state.  So it is possible to get 'cache
 *      lookup failed' error if someone performs DDL changes while a dump is
 *      happening. The window for this sort of thing is from the acquisition
 *      of the transaction snapshot to getSchemaData() (when pg_dump acquires
 *      AccessShareLock on every table it intends to dump). It isn't very large,
 *      but it can happen.
...

For the moment, let's ignore the rest of that paragraph and focus on the original question. For that, let's create some sample data we can dump:

postgres=# create database dump;
CREATE DATABASE
postgres=# \c dump
You are now connected to database "dump" as user "postgres".
dump=# create table t_dump as 
       select a.*, md5(a::text) 
         from generate_series ( 1, 3000000 ) a;
SELECT 3000000

As we need two sessions for this demo, we increase the time the dump takes by compressing at the highest level:

postgres@pgbox:/home/postgres/ [PG10] pg_dump --compress=9 dump > test.dump

In a second session, while the dump is running, we truncate the table:

dump=# truncate table t_dump;
TRUNCATE TABLE
Time: 9411.574 ms (00:09.412)

And surprise: yes, the pg_dump operation is blocking the truncate (you can see that from the time it took; usually a truncate is instant). So the documentation is not quite accurate. Before going further: does the same happen when we modify the table while the dump is running? Same test as above for the dump, and in the second session:

dump=# alter table t_dump add c text;
ALTER TABLE
Time: 11093.535 ms (00:11.094)

Same here, blocking (otherwise the addition of the column would have been instant). So when you run DDL against a table while a dump is running, that DDL has to wait until the dump completes.

Coming back to the remaining sentences of the paragraph from the source code: pg_dump acquires an AccessShareLock on the tables it dumps, and we can verify this in the second session while the dump is running:

dump=# select database, relation::regclass, mode from pg_locks where relation = 't_dump'::regclass;
 database | relation |      mode       
----------+----------+-----------------
    33985 | t_dump   | AccessShareLock
(1 row)

This does not lock the table for reading or writing, but it does lock the table against (most) DDL. We can confirm that as well by doing a select and an insert in the second session while the dump is running in the first session:

dump=# insert into t_dump (a,md5,c) values (-1,'aaa','bbb');
INSERT 0 1
Time: 8.131 ms
dump=# select * from t_dump where a = -1;
 a  | md5 |  c  
----+-----+-----
 -1 | aaa | bbb
(1 row)

No issues here. When we manually lock the table in ACCESS SHARE mode in the first session, we will not be able to alter it in the second session.
Session 1:

dump=# begin;
BEGIN
dump=# lock table t_dump IN ACCESS SHARE MODE;
LOCK TABLE
dump=# 

… and in the second session try some DDL:

dump=# alter table t_dump alter COLUMN c set default 'a';
-- blocks

Creating an index on that table while it is locked in that mode works:

dump=# create index i1 on t_dump (c);
CREATE INDEX

… whereas dropping an index while the table is locked in that mode blocks as well:

dump=# drop index i1;
-- blocks

So the final advice: Plan to do your dumps when there is no DDL activity.
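
And if you suspect that a DDL is currently stuck behind a running dump, a quick check like the following helps (a sketch, assuming PostgreSQL 9.6 or later for pg_blocking_pids):

-- sessions that are waiting on a lock, and the PIDs holding them up
select pid,
       pg_blocking_pids(pid) as blocked_by,
       state,
       left(query, 60) as query
  from pg_stat_activity
 where cardinality(pg_blocking_pids(pid)) > 0;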

 

This article, When we do a pg_dump and right afterwards truncate a table which is in the dump, what happens?, appeared first on Blog dbi services.

[BLOG] Introduction to Performance Tuning In Oracle Database

Online Apps DBA - Mon, 2018-08-20 22:42

Are you an Oracle database Admin? Are you worried about your users complaining about the slow online performance of the system? [BLOG] Introduction to Performance Tuning In Oracle Database Visit: https://k21academy.com/tuning11 & learn about Performance Tuning, which is one of the best ways to identify bottlenecks and ensure your infrastructure is equipped to handle increased […]

The post [BLOG] Introduction to Performance Tuning In Oracle Database appeared first on Oracle Trainings for Apps & Fusion DBA.

Categories: APPS Blogs

[BLOG] Oracle Apps DBA (R12): 10 Must Remember Points while Cloning

Online Apps DBA - Mon, 2018-08-20 22:26

Check out 10 key points of Oracle Apps DBA (R12) cloning. A must-read if you are new to cloning! [BLOG] Oracle Apps DBA (R12): 10 Must Remember Points while Cloning Visit: https://k21academy.com/appsdba27 & know about ✓ Cloning ✓ What is the process of creating a copy of an existing Oracle E-Business Suite system & stated […]

The post [BLOG] Oracle Apps DBA (R12): 10 Must Remember Points while Cloning appeared first on Oracle Trainings for Apps & Fusion DBA.

Categories: APPS Blogs

A tribute to Natural Join

Yann Neuhaus - Mon, 2018-08-20 10:23
By Franck Pachot

I know that a lot of people are against the ANSI join syntax in Oracle. And this goes beyond all limits when talking about NATURAL JOIN. But I like it and use it quite often.

Why is Natural Join bad?

Natural join is bad because it relies on column names, and, at the time of writing the query, you don’t know which columns will be added or removed later. Here is an example on the SCOTT schema, joining on DEPTNO which has the same name in DEPT and EMP:

SQL> select * from EMP natural join DEPT where DNAME='SALES';
 
DEPTNO EMPNO ENAME JOB MGR HIREDATE SAL COMM DNAME LOC
---------- ---------- ---------- --------- ---------- --------- ---------- ---------- -------------- -------------
30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 SALES CHICAGO
30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 SALES CHICAGO
30 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 SALES CHICAGO
30 7900 JAMES CLERK 7698 03-DEC-81 950 SALES CHICAGO
30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 SALES CHICAGO
30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 SALES CHICAGO

The DEPT table has a 'LOC' column for the location of the department. But the data model may evolve and you may add a location for each employee. And we may also call it LOC:

SQL> alter table EMP add (LOC varchar2(10));
Table altered.

But now our Natural Join adds this column to the join predicate and the result is wrong, because it shows only rows where the department location equals the employee location:

SQL> select * from EMP natural join DEPT where DNAME='SALES';
 
no rows selected

Projection

In my opinion, the problem is not the Natural Join. Column names have a meaning within their tables. But the tables have different roles in our queries. As soon as a table or view participates in our query, we should redefine the column names. If we don't, the result is completely wrong, as here:

SQL> select * from EMP join DEPT using(DEPTNO) where DNAME='SALES';
 
DEPTNO EMPNO ENAME JOB MGR HIREDATE SAL COMM LOC DNAME LOC
---------- ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- -------------- -------------
30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 SALES CHICAGO
30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 SALES CHICAGO
30 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 SALES CHICAGO
30 7900 JAMES CLERK 7698 03-DEC-81 950 SALES CHICAGO
30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 SALES CHICAGO
30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 SALES CHICAGO

Look: the result has two columns with the same name. This is completely wrong for a relational database and I don’t even understand why this parses without raising an error.

The projection is the most important relational operation, often overlooked as if it were just a rename for aesthetic purposes. You need to name the columns of your result set: they are the metadata for the interface between SQL and the host language. 'select *' is a shortcut when running an interactive query, to get a glance at the result rows, but a SQL query result is not complete without proper column names. And in most cases, at least when you query more than one table, the names of the query result columns should be different from the names of the underlying table columns. A department may have a location. And an employee may have a location. But the location of the employee's department is something completely different from the employee's location.

Then, as you need to name each column anyway, why not do it as soon as possible? Do it for each table involved in the query, so that you are sure that all column names are correct within the query. As soon as you introduce a new table in the FROM clause, you should actually name the columns according to their role in the query. Let's take an example with an airline data model. Each airport is linked to a city. This can be a CITY column in the AIRPORTS table. But as soon as you join FLIGHTS with AIRPORTS, this table has a different role. You join on the destination airport or the source airport. Then you alias the AIRPORTS table in the FROM clause, such as DST_AIRPORTS or SRC_AIRPORTS. Within the query, you can reference the columns with the table alias, such as DST_AIRPORTS.CITY or SRC_AIRPORTS.CITY, but this cannot be exposed as-is in the query result. You must name them in the SELECT clause with something like SELECT DST_AIRPORTS.CITY as DST_ARP_CITY, SRC_AIRPORTS.CITY as SRC_ARP_CITY.
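
To make that concrete, here is a hedged illustration of the airline example; the FLIGHTS table with its FLIGHT_NO, SRC_IATA and DST_IATA columns is hypothetical, only the AIRPORTS table with its IATA and CITY columns comes from the description above:

select f.flight_no,
       src_airports.city as src_arp_city,
       dst_airports.city as dst_arp_city
from flights f
join airports src_airports on src_airports.iata = f.src_iata
join airports dst_airports on dst_airports.iata = f.dst_iata;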

Then, as I’ll need to rename them anyway, I prefer to do it as soon as I join to a new table in the FROM clause. Instead of joining to AIRPORTS DST_AIRPORTS I can join to (SELECT IATA DST_ARP_IATA, CITY DST_ARP_CITY FROM AIRPORTS) and all column names will relate to the role without table aliases and without further renaming. And when I do that correctly, I can use natural join without risk.

Projection in the FROM clause

Let's take an example. Here is a query on DEPT where I explicitly state that LOC is the department location. This is implicit when the column belongs to the DEPT table, but it will not be implicit anymore once I join this table to another one. Here is the view, ready to be included in any query:


SQL> select DEPTNO,DNAME DEPT_DNAME,LOC DEPT_LOC from DEPT where DNAME='SALES';
 
DEPTNO DEPT_DNAME DEPT_LOC
---------- -------------- -------------
30 SALES CHICAGO

Now, I can join this to the EMP table. I prefix all columns from EMP with “EMP_” and all columns from DEPT with “EMP_DEPT_” because they belong to DEPT when in the role of employee department:

SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_EMPNO
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_EMPNO
---------- ---------- -------------- ------------- ---------- -------------
7521 WARD SALES CHICAGO 7698
7844 TURNER SALES CHICAGO 7698
7499 ALLEN SALES CHICAGO 7698
7900 JAMES SALES CHICAGO 7698
7698 BLAKE SALES CHICAGO 7839
7654 MARTIN SALES CHICAGO 7698

As you can see, when the names clearly indicate each column with its role in the join, and how it correlates with the other tables, there is no need to mention any join predicate. I used Natural Join because the join is on EMP_DEPTNO and I'm sure that it will always be the one and only column with that name, by query design.

And the column names in the result are correct, explicitly stating what is an Employee attribute and what is an Employee department attribute. That is easy to parse and map to an object graph in the host language. You can see that the MGR column of EMP was renamed EMP_MGR_EMPNO because that is what it actually is: the EMPNO of the employee's manager, a foreign key to the EMP table.

And then, adding more information about the manager is easy: join with EMP again but with the proper projection of columns: EMPNO will be EMP_MGR_EMPNO when in the role of the employee manager, ENAME will be EMP_MGR_ENAME, DEPTNO will be EMP_MGR_DEPTNO, and so on:


SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
natural join
(select DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_DEPTNO EMP_MGR_ENAME
---------- ---------- -------------- ------------- ---------- -------------- -------------
7900 JAMES SALES CHICAGO 30 BLAKE
7499 ALLEN SALES CHICAGO 30 BLAKE
7654 MARTIN SALES CHICAGO 30 BLAKE
7844 TURNER SALES CHICAGO 30 BLAKE
7521 WARD SALES CHICAGO 30 BLAKE
7698 BLAKE SALES CHICAGO 10 KING

No need to review the whole query when adding a new table. No need to resolve a new 'column ambiguously defined' error. We don't even need to alias the tables here.

Want to add the department name of the manager? That’s easy: join to DEPT with the right column projection (all prefixed by EMP_MGR_DEPT as the new columns are all about the employee manager’s department):

SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME,EMP_MGR_DEPT_DNAME
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
natural join
(select DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP)
natural join
(select DEPTNO EMP_MGR_DEPTNO,DNAME EMP_MGR_DEPT_DNAME,LOC EMP_MGR_DEPT_LOC from DEPT)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_DEPTNO EMP_MGR_EN EMP_MGR_DEPT_D
---------- ---------- -------------- ------------- ---------- -------------- ---------- --------------
7698 BLAKE SALES CHICAGO 10 KING ACCOUNTING
7900 JAMES SALES CHICAGO 30 BLAKE SALES
7499 ALLEN SALES CHICAGO 30 BLAKE SALES
7654 MARTIN SALES CHICAGO 30 BLAKE SALES
7844 TURNER SALES CHICAGO 30 BLAKE SALES
7521 WARD SALES CHICAGO 30 BLAKE SALES

This is even easier when you generate SQL queries. When adding a new table to join to, you just prefix all its columns with their role. Check the foreign keys so that the naming stays consistent with the referenced tables. Then, when parsing the result, the naming convention helps to split the flat row back into the object hierarchy.
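
As a hedged sketch of that generation step, the foreign-key metadata in the standard dictionary views gives you the column pairs to keep consistent (the owner filter is just an example):

select c.table_name, cc.column_name,
       r.table_name as ref_table, rcc.column_name as ref_column
from all_constraints c
join all_cons_columns cc on cc.owner = c.owner and cc.constraint_name = c.constraint_name
join all_constraints r on r.owner = c.r_owner and r.constraint_name = c.r_constraint_name
join all_cons_columns rcc on rcc.owner = r.owner and rcc.constraint_name = r.constraint_name and rcc.position = cc.position
where c.constraint_type = 'R'
and c.owner = 'SCOTT';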

Additional notes

I mentioned that aliasing the subquery is not mandatory because I do not have to prefix the column names. However, when looking at the predicates section of the execution plan, the columns may be prefixed with an internal alias:

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("from$_subquery$_006"."EMP_MGR_DEPTNO"="from$_subquery$_009"."EMP_MGR_DEPTNO")
2 - access("from$_subquery$_001"."EMP_MGR_EMPNO"="from$_subquery$_006"."EMP_MGR_EMPNO" AND "from$_subquery$_001"."EMP_DEPTNO"="from$_subquery$_003"."EMP_DEPTNO")

Then it is a good idea to add aliases, such as EMP, EMP_DEPT, EMP_MGR and EMP_MGR_DEPT in the query above, so that the predicates become:

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP_MGR"."EMP_MGR_DEPTNO"="EMP_MGR_DEPT"."EMP_MGR_DEPTNO")
2 - access("EMP"."EMP_MGR_EMPNO"="EMP_MGR"."EMP_MGR_EMPNO" AND "EMP"."EMP_DEPTNO"="EMP_DEPT"."EMP_DEPTNO")
5 - filter("DNAME"='SALES')

I also like to add a QB_NAME hint so that I can easily reference those subqueries if I have to add some hints there. Finally, this is what I can generate for this query:


SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME,EMP_MGR_DEPT_DNAME
from
(select /*+qb_name(EMP)*/ DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP) EMP
natural join
(select /*+qb_name(EMP_DEPT)*/ DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT) EMP_DEPT
natural join
(select /*+qb_name(EMP_MGR)*/ DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP) EMP_MGR
natural join
(select /*+qb_name(EMP_MGR_DEPT)*/ DEPTNO EMP_MGR_DEPTNO,DNAME EMP_MGR_DEPT_DNAME,LOC EMP_MGR_DEPT_LOC from DEPT) EMP_MGR_DEPT
where EMP_DEPT_DNAME='SALES';

So what?

My goal here is not to recommend always using natural joins. This depends on the context (ad-hoc queries, queries embedded in existing code with naming standards, …) and on whether you can control the column names exactly. There are also a few bugs with ANSI joins, and natural join is not widely used, so it is maybe not tested as much. But when I hear that Natural Join is bad, I want to explain the why/how/when. And one of its good sides is that it forces us to do the projection/rename as soon as possible, which makes the query easier to read/maintain/evolve. Of course, using natural join in this way requires that every table is added to the FROM clause through a subquery which carefully names all columns in its SELECT clause, so that the correlation with the other tables is clearly defined.

 

This article, A tribute to Natural Join, appeared first on Blog dbi services.

Running PLSQL as SYSDBA through DEFINER-rights?

Yann Neuhaus - Mon, 2018-08-20 10:07

Recently I got an interesting request: the customer wanted to allow the application installation routine to create a guaranteed restore point without giving it all the privileges required to do so. The idea was to encapsulate creating and dropping a guaranteed restore point in a PL/SQL package and grant the application owner permission to execute that package. The problem with this approach is that SYSDBA privileges are required to create a guaranteed restore point, and the question came up whether it is actually possible to have a PL/SQL package created with DEFINER rights where the DEFINER has the SYSDBA privilege. Actually, that is not possible, because you have to be connected "AS SYSDBA" to have the SYSDBA privilege. A package created by a user who connected as SYSDBA does not inherit the SYSDBA privilege, as the following example shows:


sqlplus / as sysdba
 
create user dbadmin identified by dbadmin;
grant sysdba to dbadmin;
create user appluser identified by appluser;
grant create session to appluser;
connect dbadmin/dbadmin as sysdba
 
create or replace package grp_handling as
procedure create_grp;
procedure drop_grp;
end;
/
 
create or replace package body grp_handling as
procedure create_grp
as
begin
begin
execute immediate 'drop restore point before_appl_installation';
exception
when others then null;
end;
execute immediate 'create restore point before_appl_installation guarantee flashback database';
end;
procedure drop_grp
as
begin
execute immediate 'drop restore point before_appl_installation';
end;
end;
/
 
exec grp_handling.create_grp;
 
PL/SQL procedure successfully completed.
 
select name from v$restore_point;
 
NAME
----------------------------------------------
BEFORE_APPL_INSTALLATION
 
select owner,object_type from dba_objects where object_name='GRP_HANDLING';
 
OWNER OBJECT_TYPE
------------------------------ -----------------------
SYS PACKAGE
SYS PACKAGE BODY
 
select user from dual;
 
USER
------------------------------
SYS
 
grant execute on grp_handling to appluser;
 
connect appluser/appluser
exec sys.grp_handling.create_grp;
 
*
ERROR at line 1:
ORA-01031: insufficient privileges
ORA-06512: at "SYS.GRP_HANDLING", line 10
ORA-06512: at line 1

So, first of all, a user who connects "AS SYSDBA" actually connects as SYS. Secondly, the SYSDBA privilege is not inherited as a DEFINER right in PL/SQL objects.
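
Both points are visible in the dictionary; a quick check (just a sketch) of who owns the package and with which AUTHID it was compiled:

select owner, object_name, object_type, authid
from dba_procedures
where object_name = 'GRP_HANDLING'
and procedure_name is null;   -- package-level row only, not the individual subprograms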

So how do we resolve the issue and create a guaranteed restore point from the appluser session then?
I suggested wrapping a dbms_scheduler external job callout as follows:

1.) Create the bash script /home/oracle/GRP/cre_grp.bash as OS user oracle


#!/bin/bash
. oraenv <<EOF
prem122
EOF
 
sqlplus / as sysdba <<EOF
begin
execute immediate 'drop restore point before_appl_installation';
exception
when others then null;
end;
/
create restore point before_appl_installation guarantee flashback database;
exit
EOF

-> Add execute permissions for the user: chmod u+x /home/oracle/GRP/cre_grp.bash

2.) Create credential and job


connect dbadmin as sysdba
BEGIN
DBMS_CREDENTIAL.create_credential(
credential_name => 'oracle_122',
username => 'oracle',
password => '<passwd OS-user oracle>'
);
END;
/
 
BEGIN
dbms_scheduler.CREATE_job
(
job_name => 'CRE_GUARANTEED_RP',
job_type => 'EXECUTABLE',
job_action => '/home/oracle/GRP/cre_grp.bash',
enabled => true,
auto_drop => false,
credential_name => 'oracle_122'
);
END;
/

3.) Create a procedure to run the Job


connect dbadmin as sysdba
create procedure run_my_GRP_job
as
begin
dbms_scheduler.run_job('CRE_GUARANTEED_RP');
end;
/
 
grant execute on run_my_GRP_job to appluser;

4.) Now the appluser can run the job:


connect appluser
exec sys.run_my_GRP_job;

Conclusion: you cannot provide SYSDBA privileges through DEFINER rights in PL/SQL. If you have to run PL/SQL "AS SYSDBA", then you have to connect "AS SYSDBA". Running SYSDBA commands as a non-SYSDBA user is possible with a workaround such as a procedure that runs an external job.

 

This article, Running PLSQL as SYSDBA through DEFINER-rights?, appeared first on Blog dbi services.

[BLOG] Oracle Identity Cloud Service: Configuring Multi-Factor Authentication (MFA)

Online Apps DBA - Mon, 2018-08-20 06:40

Multi-Factor Authentication (MFA) is a method of authentication that requires the use of more than one factor to verify a user’s identity. [BLOG] Oracle Identity Cloud Service: Configuring Multi-Factor Authentication (MFA) Visit: https://k21academy.com/idcs13 to know how to Configure Multi-Factor Authentication (MFA) with Oracle Identity Cloud Services (IDCS). Multi-Factor Authentication (MFA) is a method of authentication […]

The post [BLOG] Oracle Identity Cloud Service: Configuring Multi-Factor Authentication (MFA) appeared first on Oracle Trainings for Apps & Fusion DBA.

Categories: APPS Blogs

Parsing Badly Formatted JSON in Oracle DB with APEX_JSON

Rittman Mead Consulting - Mon, 2018-08-20 05:38

After some blogging silence due to project work and holidays, I thought it was a good idea to do a write-up about a problem I faced this week. One of the tasks I was assigned was to parse a set of JSON files stored in an Oracle 12.1 DB Table.

As probably all of you already know, JSON (JavaScript Object Notation) is a lightweight data-interchange format and is the format widely used in web services due to its flexibility. In JSON there is no header to define (think of CSV as a comparison): every field is defined in a "field name":"field value" format, and there is no "set of required columns" for a JSON object; when a new attribute needs to be defined, the related name and value can simply be added to the structure. On top of this "schema-free" definition, a field value can be

  • a single value
  • an array
  • a nested JSON object

Basically, when you start parsing JSON you can quickly feel a bit lost.


The Easy Part

The assigned task wasn't too difficult: after reading the proper documentation, I was able to parse a JSON file like

{
 "field1": "abc",
 "field2": "cde"
}

Using a simple SQL like

select * 
 from TBL_NAME d,
 JSON_TABLE(d.text, '$' COLUMNS (
   field1 VARCHAR2(10) PATH '$.field1',
   field2 VARCHAR2(10) PATH '$.field2'
   )
 )

Parsing arrays is not very complex either; a JSON file like

{
 "field1": "abc",
 "field2": "cde",
 "field3": ["fgh","ilm","nop"]
}

Can be easily parsed using the NESTED PATH call

select * 
 from TBL_NAME d, 
 JSON_TABLE(d.text, '$' COLUMNS (
   field1 VARCHAR2(10) PATH '$.field1',
   field2 VARCHAR2(10) PATH '$.field2',
   NESTED PATH '$.field3[*]' COLUMNS (
     field3 VARCHAR2(10) PATH '$'
   )
 )
)

In case the Array contains nested objects, those can be parsed using the same syntax as before, for example, field4 and field5 of the following JSON

{
 "field1": "abc",
 "field2": "cde",
 "field3": [
           {
            "field4":"fgh",
            "field5":"ilm"
           },
           {
            "field4":"nop",
            "field5":"qrs"
           }
           ] 
}

can be parsed with

NESTED PATH '$.field3[*]' COLUMNS ( 
   field4 VARCHAR2(10) PATH '$.field4',
   field5 VARCHAR2(10) PATH '$.field5' 
)
...Where things got complicated

All very easy with well-formatted JSON files, but then I faced the following:

{ 
"field1": "abc", 
"field2": "cde", 
"field3": [
    {
     "field4": "aaaa", 
     "field5":{ 
           "1234":"8881", 
           "5678":"8893" 
          }
     },
     {
      "field4": "bbbb",  
      "field5":{ 
            "9876":"8881", 
            "7654":"8945",
            "4356":"7777"
          }
      } 
      ] 
}

Basically, the JSON file started including fields whose names are themselves Ids, encoding an association like "Product Id (1234) is a member of Brand Id (8881)". This immediately triggered a rather puzzled reaction.


After checking the documentation again, I wasn't able to find anything that could help me parse that, since all the calls include a predefined PATH string which, in the case of the Ids, I couldn't know beforehand.

I then reached out to my network on Twitter

To all my @Oracle SQL friends out there: I need to parse a JSON object which has a strange format of {“name”:”abc”, “345678”:”123456”} with the 345678 being an Id I need to extract, any suggestions? none of the ones mentioned here seems to help https://t.co/DRWdGvCVfu pic.twitter.com/PfhtUnAeR4

— Francesco Tisiot (@FTisiot) 14 August 2018

That generated quite a lot of responses. Initially, the discussion was about the correctness of the JSON structure, which, from a purist point of view, should be mapped as

{ 
"field1": "abc",
"field2": "cde",
"field3": [ 
     {
      "field4": "aaaa", 
      "field5":
           { 
             "association": [
                  {"productId":"1234", "brandId":"8881"},
                  {"productId":"5678", "brandId":"8893"}
                  ]
           },
      },
      {
       "field4": "bbbb", 
       "field5":    
           {
             "association": [
                  {"productId":"9876", "brandId":"8881"},
                  {"productId":"7654", "brandId":"8945"},
                  {"productId":"4356", "brandId":"7777"}
                  ]
           }
      } 
      ]
}

basically going back to standard field names like productId and brandId that could be easily parsed. In my case this wasn't possible since the JSON format was already widely used at the client.

Possible Solutions

Since a change in the JSON format wasn't possible, I needed to find a way of parsing it. A few solutions were mentioned in the Twitter thread:

  • Regular Expressions
  • Bash external table preprocessor
  • Java Stored functions
  • External parsing before storing data into the database

All of the above were discarded since I wanted to achieve a solution based only on existing database functions. Other suggestions included JSON_DATAGUIDE and JSON_OBJECT.GET_KEYS, which unfortunately are only available from 12.2 onwards (I was on 12.1).

But, just a second before surrendering, Alan Arentsen suggested using the APEX_JSON.PARSE procedure!

The Chosen One: APEX_JSON

The APEX_JSON package offers a series of procedures and functions to parse JSON in PL/SQL, in particular:

  • PARSE: Parses a JSON formatted string contained in a VARCHAR2 or CLOB storing all the members.
  • GET_COUNT: Returns the number of array elements or object members
  • GET_MEMBERS: Returns the table of members of an object

You can already imagine how a combination of those calls can parse the JSON text defined above; let's have a look at the JSON again:

{ 
"field1": "abc", 
"field2": "cde", 
"field3": [
    {
     "field4": "aaaa", 
     "field5":{ 
           "1234":"8881", 
           "5678":"8893" 
          }
     },
     {
      "field4": "bbbb",  
      "field5":{ 
            "9876":"8881", 
            "7654":"8945",
            "4356":"7777"
          }
      } 
      ] 
}

The parsing process should iterate over the field3 entries (2 in this case) and, for each entry, iterate over the fields in field5 to get both the field name and the field value.
The number of field3 entries can be found with

APEX_JSON.GET_COUNT(p_path=>'field3',p_values=>j);

And the list of members of field5 with

APEX_JSON.GET_MEMBERS(p_path=>'field3[%d].field5',p_values=>j,p0=>i);

Note the p_path parameter set to field3[%d].field5, meaning that we want to extract field5 from the nth entry in field3. The entry number is supplied by p0=>i, with i being the variable we use in our FOR loop.

The complete code is the following

DECLARE 
   j APEX_JSON.t_values; 
   r_count number;
   field5members   WWV_FLOW_T_VARCHAR2;
   p0 number;
   BrandId VARCHAR2(10);
BEGIN
APEX_JSON.parse(j,'<INSERT_JSON_STRING>');
-- Getting the number of field3 elements
r_count := APEX_JSON.GET_COUNT(p_path=>'field3',p_values=>j);
dbms_output.put_line('Nr Records: ' || r_count);

-- Looping over each element in field3
FOR i IN 1 .. r_count LOOP
-- Getting the field5 members for the ith member of field3
 field5members := APEX_JSON.GET_MEMBERS(p_path=>'field3[%d].field5',p_values=>j,p0=>i);
-- Looping over all field5 members
 FOR q in 1 .. field5members.COUNT LOOP
-- Extracting the BrandId
   BrandId := APEX_JSON.GET_VARCHAR2(p_path=>'field3[%d].field5.'||field5members(q) ,p_values=>j,p0=>i);
-- Printing BrandId and Product Id
   dbms_output.put_line('Product Id ="'||field5members(q)||'" BrandId="'||BrandId ||'"');
 END LOOP;
END LOOP;
   
END;

Note that, in order to extract the BrandId we used

APEX_JSON.GET_VARCHAR2(p_path=>'field3[%d].field5.'||field5members(q) ,p_values=>j,p0=>i);

Specifically, the PATH is field3[%d].field5.'||field5members(q). As you can imagine, we are appending the member name (field5members(q)) to the path described previously, forming a string like field3[1].field5.1234 that correctly extracts the associated value.
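
As a final hedged sketch, the hard-coded '<INSERT_JSON_STRING>' can of course be replaced by the documents stored in the table used in the JSON_TABLE examples above (TBL_NAME with its text column), feeding each row to APEX_JSON.parse:

DECLARE
   j APEX_JSON.t_values;
BEGIN
   -- parse every stored JSON document instead of a literal string
   FOR r IN (SELECT text FROM TBL_NAME) LOOP
      APEX_JSON.parse(j, r.text);
      dbms_output.put_line('field1 = ' || APEX_JSON.get_varchar2(p_path => 'field1', p_values => j));
   END LOOP;
END;
/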

Conclusion

Three things to take away from this experience. The first is the usefulness of JSON_TABLE: with JSON_TABLE you can parse well-constructed JSON documents, and it's very easy and powerful.
The second: APEX_JSON is a useful package for parsing "not very well" constructed JSON documents, since it allows iterating across the elements of JSON arrays and object members.
The last, which is becoming more relevant in my career every day, is the importance of networking and knowledge sharing: blogging, speaking at conferences, and helping others in various channels lets you get to know other people and be known, with the nice side effect of sometimes being able to get help solving a problem you face with a single tweet!

Categories: BI & Warehousing

Masterclass – 1

Jonathan Lewis - Mon, 2018-08-20 04:42

A recent thread on the Oracle developer community database forum raised a fairly typical question with a little twist. The basic question is "why is this (very simple) query slow on one system when it's much faster on another?" The little twist was that the original posting told us that "Streams Replication" was in place to replicate the data between the two systems.

To make life easy for remote trouble-shooters the poster had supplied (for each system) the output from SQL Monitor when running the query, the autotrace output (which shows the predicate section that SQL Monitor doesn’t report), and the session statistics for the query run, plus some statistics about the single table in the query, the index used in the plan, and the column on which that index was based.

Here, with a little cosmetic editing (and a query that has clearly been camouflaged by the OP), is the information supplied for the faster database, where the query took about 30 seconds to complete.


SELECT c1, c2, c3, c4, c5, c6, c7, c8..  
FROM TAB1  
WHERE STS IN ( 'A', 'B')  
AND cnt < '4'  
AND dt < sysdate  
and rownum <=1;  
  
Sql_monitor and stats from DB1  
******************************  
  
Global Information  
------------------------------  
 STS              :  DONE (ALL ROWS)             
 Instance ID         :  1                           
 Execution Started   :  08/17/2018 08:31:22         
 First Refresh Time  :  08/17/2018 08:31:22         
 Last Refresh Time   :  08/17/2018 08:31:53         
 Duration            :  31s                         
 Program             :  sqlplus.exe                 
 Fetch Calls         :  1                           
  
Global Stats  
===============================================================================  
| Elapsed |   Cpu   |    IO    | Concurrency | Fetch | Buffer | Read  | Read  |  
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes |  
===============================================================================  
|      33 |    3.00 |       30 |        0.08 |     1 |   102K | 38571 | 301MB |  
===============================================================================  
  
SQL Plan Monitoring Details (Plan Hash Value=715774357)  
======================================================================================================================================================================================  
| Id |            Operation            |          Name           |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Activity |       Activity Detail        |  
|    |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes |   (%)    |         (# samples)          |  
======================================================================================================================================================================================  
|  0 | SELECT STATEMENT                |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  1 |   COUNT STOPKEY                 |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  2 |    INLIST ITERATOR              |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  3 |     TABLE ACCESS BY INDEX ROWID | TAB1                    |    114K | 33399 |        32 |     +0 |     2 |        1 | 38377 | 300MB |    96.77 | Cpu (1)                      |  
|    |                                 |                         |         |       |           |        |       |          |       |       |          | db file sequential read (16) |  
|    |                                 |                         |         |       |           |        |       |          |       |       |          | read by other session (13)   |  
|  4 |      INDEX RANGE SCAN           | TAB1_STS_IDX            |    115K |   723 |        30 |     +2 |     2 |     118K |   194 |   2MB |     3.23 | read by other session (1)    |  
======================================================================================================================================================================================  
  
Execution Plan (autotrace)
---------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 33399   (1)| 00:03:14 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   114K|   201M| 33399   (1)| 00:03:14 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   114K|       |   723   (1)| 00:00:05 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"<SYSDATE@!)  
   4 - access("STS"='A' OR "STS"='B')  
  
  
Table stats 
-----------
table_name    num_rows        blocks  empty_blocks  chain_cnt  avg_row_len             
TAB1        79,654,925    22,416,917             0          0        1,847          
  
column_stats(STS)
------------------
table_name  column_name  num_distinct  num_nulls    density  avg_col_len   
TAB1        STS                     5          0  6.2049E-9            2         
  
Index_stats(on STS)
-------------------
index_name    leaf_blocks  distinct_keys  avg_leaf_blocks_per_key  avg_data_blocks_per_key  clustering_factor    num_rows
TAB1_STS_IDX      487,939              5                   97,587                4,458,874         22,294,372  78,308,939   
 
Session stats
-------------
process last non-idle time              1,534,508,966
session connect time                    1,534,508,966
logical read bytes from cache             839,663,616
cell physical IO interconnect bytes       316,055,552
physical read bytes                       316,055,552
physical read total bytes                 316,055,552
file io wait time                          17,044,083
session pga memory                          8,643,880
session pga memory max                      8,643,880
temp space allocated (bytes)                4,194,304
session uga memory                          1,755,696
session uga memory max                      1,755,696
buffer is pinned count                        135,743
table fetch by rowid                          117,519
non-idle wait count                           107,301
session logical reads                         102,500
consistent gets                               102,450
consistent gets from cache                    102,448
no work - consistent read gets                102,368
buffer is not pinned count                    101,741
free buffer inspected                          43,458
free buffer requested                          38,592
physical read total IO requests                38,581
physical read IO requests                      38,581
physical reads                                 38,581
physical reads cache                           38,579
hot buffers moved to head of LRU               37,258
bytes sent via SQL*Net to client                7,370
bytes received via SQL*Net from client          6,869
redo size                                       5,536
undo change vector size                         4,432
DB time                                         3,166
non-idle wait time                              2,962
user I/O wait time                              2,954
table fetch continued row                       2,423


And here’s the equivalent information from the slower database where the query took more than 9 times as long (4 minutes 42 seconds) to complete.


Global Information  
------------------------------  
 STS              :  DONE (ALL ROWS)           
 Instance ID         :  1                         
 Execution Started   :  08/17/2018 08:21:47       
 First Refresh Time  :  08/17/2018 08:21:47       
 Last Refresh Time   :  08/17/2018 08:26:29       
 Duration            :  282s                      
 Module/Action       :  SQL*Plus/-                
 Program             :  sqlplus.exe               
 Fetch Calls         :  1                         
  
Global Stats  
================================================================  
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  |  
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |  
================================================================  
|     287 |    8.76 |      278 |     1 |   110K | 110K | 858MB |  
================================================================  
  
SQL Plan Monitoring Details (Plan Hash Value=715774357)  
======================================================================================================================================================================================  
| Id |            Operation            |          Name           |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |        Activity Detail        |  
|    |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |          (# samples)          |  
======================================================================================================================================================================================  
|  0 | SELECT STATEMENT                |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  1 |   COUNT STOPKEY                 |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  2 |    INLIST ITERATOR              |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  3 |     TABLE ACCESS BY INDEX ROWID | TAB1                    |    142K | 40211 |       282 |     +1 |     2 |        1 | 109K | 854MB |   100.00 | db file sequential read (277) |  
|  4 |      INDEX RANGE SCAN           | TAB1_STS_IDX            |    142K |   892 |       280 |     +3 |     2 |     118K |  491 |   4MB |          |                               |  
======================================================================================================================================================================================  
  
Execution Plan (autotrace) 
---------------------------------------------------------------------------------------------------------  
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 40211   (1)| 00:08:03 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   141K|   249M| 40211   (1)| 00:08:03 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   141K|       |   892   (1)| 00:00:11 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"<SYSDATE@!)  
   4 - access("STS"='A' OR "STS"='B')  
            
Table stats
-----------
table_name    num_rows        blocks  empty_blocks  chain_cnt  avg_row_len             
TAB1        79,447,350   22,318,667            710    537,597        1,847  
  
column_stats(STS)
-----------------
table_name  column_name  num_distinct  num_nulls    density  avg_col_len  
TAB1        STS                     5          0  6.1789E-9            2  
  
Index_stats(on STS)
-------------------
index_name    leaf_blocks  distinct_keys  avg_leaf_blocks_per_key  avg_data_blocks_per_key  clustering_factor    num_rows
TAB1_STS_IDX      493,152              5                   98,630                4,382,625         21,913,127  79,106,263   


Session Stats
-------------
process last non-idle time              1,534,508,200
session connect time                    1,534,508,200
logical read bytes from cache             903,790,592
physical read total bytes                 899,629,056
cell physical IO interconnect bytes       899,629,056
physical read bytes                       899,629,056
file io wait time                         277,881,742
session pga memory                          8,586,744
session pga memory max                      8,586,744
temp space allocated (bytes)                4,194,304
session uga memory max                      1,690,184
session uga memory                          1,690,184
buffer is pinned count                        129,148
table fetch by rowid                          117,521
session logical reads                         110,326
consistent gets                               110,276
consistent gets from cache                    110,276
no work - consistent read gets                110,071
non-idle wait count                           109,879
free buffer requested                         109,830
physical read IO requests                     109,818
physical reads cache                          109,818
physical reads                                109,818
physical read total IO requests               109,818
buffer is not pinned count                    109,577
free buffer inspected                          70,740
hot buffers moved to head of LRU               31,910
DB time                                        28,203
non-idle wait time                             27,788
user I/O wait time                             27,788
dirty buffers inspected                        19,067
bytes sent via SQL*Net to client               14,927
bytes received via SQL*Net from client         10,607
redo size                                       5,440
undo change vector size                         4,432
table fetch continued row                       3,660

There are all sorts of noteworthy details in these two sets of information – some of the “how to see what’s in front of you” type, some of the “be careful, Oracle can deceive you” type. So I’m going to walk though the output picking up a number of background thoughts before commenting on the answer to the basic question.

We’ll start with the object statistics, then we’ll look at the SQL Monitor plan to see if we can determine where the extra time was spent, then we’ll try to work out what else the plan might be telling us about the code and data, and finally I’ll summarise my observations and make a claim about the difference in behaviour.

Object statistics

The table has 79M rows with an average row length of 1,847 bytes, using 22M blocks. With an 8KB block size and that average row size we would expect to see about 3 rows per block, and that’s fairly consistent with the value of rows / blocks.  We don’t know what the sample size was for this stats collection, but it might have been a “small” sample size rather than the 100% you would get from using auto_sample_size, and that might also explain some of the discrepancy between the figures from the two systems.
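As a quick back-of-the-envelope check on that claim (assuming an 8KB block with roughly 10% lost to pctfree and block overhead, and using only the figures quoted above):

select
        round(79654925 / 22416917, 2)   rows_per_block_from_stats,
        trunc((8192 * 0.9) / 1847)      rows_per_block_from_avg_row_len
from    dual
;

Both expressions come out at 3 to 4 rows per block, which is why the two figures look consistent.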

We note that the secondary system reports a chain_cnt in excess of 500,000 rows. The only (unhacked) way that this figure could be set would be through a call to analyze statistics, and once the figure is there it won’t go away unless you use the analyze command again to delete statistics.  We don’t know the history of how and when the figure got there so it doesn’t guarantee that there are any chained or migrated rows, nor does the zero in the table stats on the primary system guarantee that it doesn’t have any chained or migrated rows – all it tells us is that at some time someone used the wrong command to gather stats and there were some (less than 1%) migrated or chained rows in the table at the time. (The optimizer will use this figure in its arithmetic if it is set, by the way, so it may affect some of the cost calculations – but not by a huge amount.)

The column sts reports 5 distinct values, no nulls, and a density of 6.2e-9 which is roughly half of 1/79M: so we have a frequency histogram on the column (in the absence of a histogram the density would be 1/5, and it’s reasonable to assume that the number of buckets was either the default or set to something larger than 5).  We were told that the system was running 11.2.0.4 – so we have to be a little suspicious about the accuracy of this histogram since it will have been sampled with a very small sample if the stats collection had used auto_sample_size. (12c will use a specially optimized 100% sample for frequency and top-N histograms when using auto_sample_size)
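If you had access to the system itself you wouldn’t need to infer the histogram from the density; a quick dictionary query (this is just a sketch, not something the OP supplied) would report the histogram type and the sample size it was based on:

select
        column_name, num_distinct, num_nulls, density,
        num_buckets, histogram, sample_size, last_analyzed
from    user_tab_col_statistics
where   table_name  = 'TAB1'
and     column_name = 'STS'
;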

The index on sts has a clustering_factor of around 22M, which is similar to the number of blocks in the table – and that’s not too surprising if there are only a very small number of distinct values in the column – especially when the presence of the histogram suggests that there’s a skew in the data distribution. (There’s more to come on that point.) The number of leaf blocks is about 500,000 (being lazy about arithmetic) – just as a side note this suggests the index is running in a fairly inefficient state (and probably hasn’t been created with the compress keyword).

Doing a rough estimate of the index arithmetic :  the avg_col_len for sts is 2, so the space required for each index entry will be 13 bytes (2 for the column, 7 for the rowid content, 2 for the row header, 2 for the row directory entry).  Take off the block overhead, and assume the index is running at a “typical” 70% space usage per leaf block and you might expect 5,600 bytes used per leaf block for current index data and that works out to about 430 index entries per leaf block.  With 79M rows in the table that should lead to 79M/430 leaf blocks – i.e. roughly 184,000 leaf blocks, not 493,000 leaf blocks.  However it’s not unusual to see an index with extremely repetitive values operating at something like 50% utilisation, which would bring our estimate to about 310 rows per leaf block and 255,000 leaf blocks – which is still off by a factor of nearly 2 compared to what we’ve actually got. Again, of course, we have to be a little bit cautious about these statistics – we don’t know the sample size, and Oracle uses a surprisingly small number of blocks to sample the stats for an index.
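Reproducing the main part of that arithmetic as a query (5,600 bytes is the assumed usable space per leaf block at roughly 70% utilisation, 13 bytes the estimated entry size, and 487,939 the reported leaf_blocks on the primary system):

select
        trunc(5600 / 13)                        entries_per_leaf_at_70pct,
        round(79000000 / trunc(5600 / 13))      expected_leaf_blocks_at_70pct,
        round(79000000 / 487939)                actual_entries_per_leaf_block
from    dual
;

The last column works out at roughly 160 index entries per leaf block, which is about half of even the 50% utilisation estimate – hence the comment about the index running in an inefficient state.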

Where’s the time?

The SQL Monitor gives us a very clear report of where most of the time went – almost all of it was spent in I/O waits, and almost all of the wait time was in the “table access by index rowid” operation in both cases; but the primary system did 38,377 read requests on that line of the plan while the secondary did 109,000. It is significant, though, that quite a lot (40%) of the ASH samples for that operation on the primary system were for “read by other session” rather than “db file sequential read”:  in other words some other session(s) were doing a lot of work to pull the data we wanted into the buffer cache at the same time. Apart from the fact that a wait for “read by other session” often means we spend less time waiting than if we’d had to do the read ourselves, the presence of this wait suggests that other sessions may be “pre-caching” data for us so that we end up having to read far fewer blocks than would otherwise be the case.
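As an aside: if you have access to the systems (and the appropriate diagnostic licensing) you don’t have to eyeball the Activity Detail column to get this breakdown; a sketch like the following (the sql_id is something you would supply, it isn’t part of the OP’s output) reports recent ASH samples by plan line and wait event:

select
        sql_plan_line_id,
        nvl(event, 'ON CPU')    event,
        count(*)                samples
from    v$active_session_history
where   sql_id      = '&m_sql_id'
and     sample_time > sysdate - 1/24
group by
        sql_plan_line_id,
        nvl(event, 'ON CPU')
order by
        count(*) desc
;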

It’s important to note at the same time that the difference in Buffer Gets for the two systems was small – 102K vs. 110K – and the “Rows (actual)” was the same in both cases – 118K entries returned by the index range scan.  Both systems did similar amounts of “logical” work, to process similar amounts of data; the difference was the fraction of the work that required a buffer get to turn into a disc read or a “wait for other read”.

We might want to pick up a few more numbers to corroborate the view that the only significant difference was in the volume of data cached and not some more esoteric reason.  Some of the session statistics should help.


DB1:  table fetch by rowid                          117,519
DB2:  table fetch by rowid                          117,521

DB1:  undo change vector size                         4,432
DB2:  undo change vector size                         4,432

DB1:  redo size                                       5,536
DB2:  redo size                                       5,440

DB1:  session logical reads                         102,500
DB2:  session logical reads                         110,326

DB1:  no work - consistent read gets                102,368
DB2:  no work - consistent read gets                110,071

DB1:  table fetch continued row                       2,423
DB2:  table fetch continued row                       3,660
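For reference, the figures above look like the non-zero session statistics for the session running the query; a sketch like the following would produce a similar list, though strictly you would take a snapshot before and after the query and report the differences, since v$mystat values are cumulative for the session:

select
        sn.name, ms.value
from
        v$mystat        ms,
        v$statname      sn
where
        sn.statistic#   = ms.statistic#
and     ms.value        != 0
order by
        ms.value desc
;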

The number of rows fetched by rowid is virtually identical and we have done (virtually) no work that generates undo or redo – such as delayed block cleanout; there are no statistics shown for “undo records applied” so we probably haven’t done very much work to get a read-consistent view of the data, though we can’t be sure that the OP simply didn’t copy that stat into the list supplied – but the similarity of “session logical reads” to “no work – consistent read gets” confirms the hypothesis that we didn’t do any (significant) work visiting undo blocks.

We do see a few percent increase in the number of buffer gets (“session logical reads”) – but this may reflect the fact that the actual pattern of data in one table is slightly different from the pattern in the other – thanks to ASSM the process id of the process that inserts a row into a table can affect (within a small range, usually) the block into which the row is inserted; but at 102,000 / 110,000 buffer gets to visit 117,500 rows in the table we can see that there must be some table blocks that hold two (or more) rows that are identified as consecutive in the index – leading to some row visits being achieved through a buffer pin and without a fresh buffer get. You’ll note that this argument is consistent with the small variation in clustering_factor (not that we entirely trust those figures) for the two indexes – the system with the lower clustering_factor for the index has done fewer buffer gets to acquire the same number of rows from the table – by definition that means (assuming default setup) that there are more cases where “the next table row” is in the same block as the current row.

The final figure I’ve shown is the “table fetch continued row”: according to the table stats (which we don’t necessarily trust completely) 500K out of 79M rows are chained/migrated, which is roughly 0.6%. We know that we’re visiting about 117K table rows so we might expect (on average) roughly the same percentage migrated/chained, viz: 0.6% of 117K = 743, so there’s a little anomaly there (or an error in our assumption about “average” behaviour). It’s worth noting, though, that a “continued fetch” has to do an extra buffer visit (and maybe an extra physical read).  You might wonder, of course, how there could be any chained or migrated rows when the average row length is 1,847 bytes, but in a follow-up post the OP did say there were 3 BLOB columns in the table, which can cause havoc with interpreting stats for all sorts of reasons. We don’t have any information about the table structure – particularly whether the columns in the query appear before or after the BLOB columns in the table definition – and we don’t know what processing takes place (for example, maybe the 3rd BLOB is only updated after the sts column has been changed to a value other than A or B, which would help to explain why we shouldn’t be using the 0.6% calculation above as a table-wide average), so we’re not in a position to say why any of the continued fetches appear – but there are several guesses we could make and they’re all easy to check.
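One of the easy checks, for example, is the classic chained-rows listing – bearing in mind that the analyze command has to scan the whole 79M row table, so it’s not something to run casually on a busy production system (the chained_rows table is created by the utlchain.sql script):

-- create the target table first:  @?/rdbms/admin/utlchain.sql
analyze table tab1 list chained rows into chained_rows;

select  count(*)
from    chained_rows
where   table_name = 'TAB1'
;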

Plan observations

If we examine the row estimates we see 114K for the faster plan and 141K for the slower plan (with a closely corresponding variation in cost). The difference in estimates simply tells us that the histogram was probably gathered with a small sample size and is therefore subject to a lot of variation. The scale of the estimates tells us that the A and B rows are probably rare – call it 125K out of 79M rows, about 0.16% of the total rows in the table – so it would not be surprising to see consecutive samples for the histogram producing significant variations in the estimates.

The other interesting thing we can note in the SQL Monitor plan is that the Starts column for the index range scan / table access operations in both plans shows the value 2: this means that there are no “A” rows that match the other predicates:  Oracle has run the “A” iteration to completion, then started the “B” iteration and found a row on the second iteration. Is this a coincidence, should it always happen, or is it only fairly likely to happen? Is it possible to find times when there are no suitable “B” rows but plenty of suitable “A” rows? The final predicate in the query is “rownum <= 1” – so the query is picking one row with no explicit strategy for choosing a specific row when there are multiple appropriate rows – does this mean that we could optimize the query by rewriting it as a “union all” that searched for B rows first and A rows second (see the sketch below)? We just don’t know enough about the processing.
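A sketch of that rewrite (using the camouflaged column names, and something you would want to test carefully since it depends on Oracle fetching from the branches of the union all in order under the count stopkey):

select  *
from    (
        select  c1, c2, c3 /* ... */
        from    tab1
        where   sts = 'B'
        and     cnt < '4'
        and     dt  < sysdate
        union all
        select  c1, c2, c3 /* ... */
        from    tab1
        where   sts = 'A'
        and     cnt < '4'
        and     dt  < sysdate
        )
where   rownum <= 1
/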

In passing, we can’t get Oracle to search the B rows first by changing the order of the in-list.  If you have a predicate like “where sts in ({list of literal values})” the optimizer will sort the list to eliminate duplicates before rewriting the predicate as a list of disjuncts, and then (if the path uses an iterator) iterate through the list in the resulting order.

In the absence of information about the way in which the data is processed we can only say that we need to avoid visiting the table so frequently. To do this we will need to add one or both of the columns from the other predicates to the index – this might double the size of the index, but eliminate 95% of the potential I/O.  For example, if we discover that A and B rows are initially created “into the future” and this query is looking for a row whose “time has come” so that it can be processed and changed to an X row (say), then there may only ever be a tiny number of rows where sts = ‘A’ and dt < sysdate, and an index on (sts, dt) would be a perfect solution (especially if it were compressed on at least the first column).
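A minimal sketch of that option (the index name is, of course, invented):

create index tab1_sts_dt_idx on tab1(sts, dt)
        compress 1      -- compress on the leading (sts) column only
;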

The OP has declared a reluctance to add an index to the table – but there are two points to go with this indexing strategy. Since we know there’s a frequency histogram and the A and B rows appear to be rare values, what benefit is there in having an index that covers the other values (unless 2 of the remaining 3 are also rare)? How about creating a function-based index that represents only the rare values and modifying this code to use that index – e.g.

create index t1_id on t1 (
        case sts when 'A' then sts when 'B' then sts end,
        case sts when 'A' then dt  when 'B' then dt  end
) compress 1
;

select  *
from    t1
where   case sts when 'A' then sts when 'B' then sts end in ('A','B')
and     case sts when 'A' then dt  when 'B' then dt  end < sysdate
and     cnt < '4'
and     rownum <= 1
/


You might be able to replace a huge index (79M rows worth) with this small one (120K rows worth) unless there’s too much other code in the system that has to be adjusted or the sts column is actually the target of a referential integrity constraint; at worst you could add this index knowing that it’s generally not going to consume much in the way of extra space or processing resources and is going to save you a lot of work for this query.

Summary

The execution plan from SQL Monitor points very strongly to the fast system benefiting from having a lot of the relevant data cached and constantly being reloaded into the cache by other sessions, while the slow system has to acquire almost all of its data through real physical reads. Most of the reads address the table, so engineering an index that is low-cost and (fairly) high-precision is the most significant strategy for reducing the workload and time on the slow system.

The fact that all the potential A rows fail to match the full predicate set suggests that there MAY be some aspect of the processing that means it would be more efficient to check for B rows before checking for A rows.

Given the massive skew in the data distribution a function-based index that hides all the non-popular values (or even all the values that are not of interest to this query) may be the most cost-effective way of adding a very effective index to the table with minimal resource requirements.

And finally

It’s taken me more than 4 hours to write this note after spending about 10 minutes reading through the information supplied by the OP and identifying and cross-checking details. A small fraction of the 4+ hours was spent creating a little model to check something I had to say about in-lists, the rest of it was trying to write up a coherent description covering all the details.

That’s it for today, but I may have missed a couple of points that I noticed as I read the OP’s posting; and I will want to do a little cosmetic work on this article and check grammar and spelling over the next couple of days.

Update (already)

Shortly after I posted this blog note the owner of the question reported the following as the distribution of values for the sts column:

 STS   COUNT(*)
---- ----------
   A          6
   E        126
   D        866
   C   80212368
   B     117631

Two things stand out about these figures – first, it’s an ideal example of a case where it would be nice to avoid having index entries for the 80 million ‘C’ rows. Depending on the coding and testing costs, the supportability of the application and the possible side effects, this could be done with a function-based index, by introducing a virtual column that hides the ‘C’s behind a NULL, or by changing the code to use NULL instead of ‘C’.
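A sketch of the virtual column variant (names invented for illustration): the column is NULL for every ‘C’ row, so an index on it contains entries only for the 120K or so interesting rows.

alter table tab1 add (
        sts_rare generated always as (
                case sts when 'C' then to_char(null) else sts end
        ) virtual
);

create index tab1_sts_rare_idx on tab1(sts_rare);

-- a second virtual column in the same style could carry dt for the
-- non-'C' rows if you wanted the date in the index as well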

Secondly – I made a comment about rewriting the code to query the B’s before the A’s. But we saw that Oracle worked through about 117,000 rows before returning a result: so the figures above tell us that it must have worked through almost all the B’s, and the handful of A’s was just a tiny little blip before it got to the B iteration – so there’s no point in making that change.

My suggestion for the function-based index above could be modified in two ways, of course – add two more “when”s to each “case” to capture the D and E rows, or take the opposite viewpoint and create an index on expressions like: “case sts when ‘C’ then to_char(null) else sts end”. The benefit of the latter approach is that you don’t have to modify the index definition (and rebuild the index) if a new legal value for sts appears.
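In SQL terms the latter approach is just (again with an invented index name):

create index tab1_sts_nonc_idx on tab1(
        case sts when 'C' then to_char(null) else sts end
);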

data guard drift

Tom Kyte - Mon, 2018-08-20 03:26
I have setup DR with maximum performance mode, at any time of instance, how to find the duration of time of data difference between DC and DR.
Categories: DBA Blogs

DATABASE STORAGE USAGE IN ASM

Tom Kyte - Mon, 2018-08-20 03:26
Hey Guys, I am working in a RAC environment and have been struggling to get a perfect query to view ASM Storage usage based on specific and each databases on the ASM. <b>Simply, i want a result that shows a database and how much size that databas...
Categories: DBA Blogs

Materialized View Add Columns

Tom Kyte - Mon, 2018-08-20 03:26
Hi Tom, I have added a new column in my master table and i want this column to be reflected in my Materialized view is this possible without Re creating it i have to fetch all columns from my master table in my view.please suggest.
Categories: DBA Blogs

Event SQL*Net break/reset to client in refresh of materialized view

Tom Kyte - Mon, 2018-08-20 03:26
Dear, I have had event of "SQL*Net break/reset to client" in session (job) with refresh of materialized view Same after kill it the session, kill the running job, the new session again remains with this event. The solution always is re-create th...
Categories: DBA Blogs
