Jonathan Lewis

Just another Oracle weblog

Column Stats

Thu, 2018-01-18 08:22

I’ve made several comments in the past about the need to be selective when gathering object statistics, with particular reference to the trade-offs when creating histograms. With Oracle 12c it’s now reasonably safe (as far as I’m concerned) to set a method_opt as a table preference that identifies columns where you expect to see Frequency or (pace the buggy behaviour described in a recent post) Top-N histograms. The biggest problem I have is that I keep forgetting the exact syntax I need – so I’ve written this note more as a reminder to myself than anything else.

Typically I might expect to use the standard 254 buckets when gathering histograms, with an occasional variation to increase the bucket count; but for the purposes of this note I’m going to demonstrate with a much lower value. So here’s a table creation statement (running 12.1.0.2 – so it will gather basic stats on the create) and two variations of a call to gather stats with a specific method_opt – followed by a question:

create table t1
as
select
        object_type o1,
        object_type o2,
        object_type o3,
        object_id,
        object_name
from
        all_objects
where
        rownum <= 50000 -- > comment to bypass wordpress format problem
;

select  column_name, num_distinct, histogram, num_buckets, to_char(last_analyzed,'hh24:mi:ss')
from    user_tab_cols where table_name = 'T1' order by column_id;

execute dbms_lock.sleep(2)

begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                method_opt=>'for all columns size 1 for columns o1 o2 o3 size 15'
        );
end;
/

select  column_name, num_distinct, histogram, num_buckets, to_char(last_analyzed,'hh24:mi:ss')
from    user_tab_cols where table_name = 'T1' order by column_id;

execute dbms_lock.sleep(2)

begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                method_opt=>'for all columns size 1 for columns size 15 o1 o2 o3'
        );
end;
/

select  column_name, num_distinct, histogram, num_buckets, to_char(last_analyzed,'hh24:mi:ss')
from    user_tab_cols where table_name = 'T1';


The big question is this: which columns will have histograms after each of the gather_table_stats() calls:

method_opt=>'for all columns size 1 for columns o1 o2 o3 size 15'
method_opt=>'for all columns size 1 for columns size 15 o1 o2 o3'

The problem I have is simple – to me both options look as if they will create histograms on all three named columns but the first option is the one that I type in “intuitively” if I don’t stop to think about it carefully. The first option, alas, will only gather a histogram on column o3 – the second option is the one that creates three histograms.

The manuals are a little unclear and ambiguous about how to construct a slightly complicated method_opt; there’s a fragment of text with the usual mix of square brackets, italics and ellipses to indicate optional and repeated clauses (interestingly the only clue about multiple columns is that comma separation seems to be required – despite one of the examples above working without commas), but there’s no explanation of when a “size” clause should go before a column name and when it should go after.

So here are a few more method_opt clauses – can you work out in advance which columns would have histograms if you used them, and how many buckets each histogram would have? There are a couple that may surprise you:


for columns o1 size 12, o2 size 13, o3 size 14

for columns o1 size 15 o2 size 16 o3 size 17

for columns size 18 o1 size 19 o2 size 20 o3

for columns size 21 o1 o2 size 22 o3

for columns o1 size 12, o2 size 12, o3 size 13, object_id size 13 object_name size 14

for columns size 22 o1 o2 for columns size 23 o3 object_id for columns size 24  object_name

Bottom line – to me – is to check very carefully that the method_opt is going to do what I want it to do; and for production systems I tend to use the final form that repeats the “for columns {size clause} {column list}”.
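If I want that final form captured as a table preference, so that subsequent gathers pick it up automatically, the call is dbms_stats.set_table_prefs() – the following is just a sketch using the demo table and columns from this note:

begin
        dbms_stats.set_table_prefs(
                ownname => user,
                tabname => 'T1',
                pname   => 'METHOD_OPT',
                pvalue  => 'for all columns size 1 for columns size 15 o1 o2 o3'
        );
end;
/

Any subsequent dbms_stats.gather_table_stats() call that doesn’t supply an explicit method_opt should then apply this preference.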

Histogram Hassle

Mon, 2018-01-15 07:01

I came across a simple performance problem recently that ended up highlighting a problem with the 12c hybrid histogram algorithm. It was a problem that I had mentioned in passing a few years ago, but only in the context of Top-N histograms and without paying attention to the consequences. In fact I should have noticed the same threat in a recent article by Maria Colgan that mentioned the problems introduced in 12c by the option “for all columns size repeat”.

So here’s the context (note – all numbers used in this example are approximations to make the arithmetic obvious).  The client had a query with a predicate like the following:

    t4.columnA = :b1
and t6.columnB = :b2

The optimizer was choosing to drive the query through an indexed access path into t6, which returned ca. 1,000,000 rows before joining (two tables later) to t4 at which point all but a couple of rows remained – typical execution time was in the order of tens of minutes. A /*+ leading(t4) */ hint to start on t4 with an index that returned two rows reduced the response time to the classic “sub-second”.

The problem had arisen because the optimizer had estimated a cardinality of 2 rows for the index on t6 and the reason for this was that, on average, that was the correct number. There were 2,000,000 rows in the table with 1,000,000 distinct values. It was just very unlucky that one of the values appeared 1,000,000 times and that was the value the users always wanted to query – and there was no histogram on the column to tell the optimizer that there was a massive skew in the data distribution.

Problem solved – all I had to do was set a table preference for this table to add a histogram to this column and gather stats. Since there were so many distinct values and so much “non-popular” data in the table the optimizer should end up with a hybrid histogram that would highlight this value. I left instructions for the required test and waited for the email telling me that my suggestion was brilliant and the results were fantastic… I got an email telling me it hadn’t worked.

Here’s a model of the situation – I’ve created a table with 2 million rows and a column where every other row contains the same value but otherwise contains the rownum. Because the client code was using a varchar2() column I’ve done the same here, converting the numbers to character strings left-padded with zeros. There are a few rows (about 20) where the column value is higher than the very popular value.


rem
rem     Script:         histogram_problem_12c.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2018
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1
segment creation immediate
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 2e4
)
select
        rownum  as id,
        case
                when mod(rownum,2) = 0
                        then '999960'
                        else lpad(rownum,6,'0')
        end     as bad_col
from
        generator       v1,
        generator       v2
where
        rownum <= 2e6
;

Having created the data I’m going to create a histogram on bad_col – specifying 254 buckets – then query user_tab_histograms for the resulting histogram (from which I’ll delete a huge chunk of boring rows in the middle):


begin

        dbms_stats.gather_table_stats(
                ownname         => 'TEST_USER',
                tabname         => 'T1',
                method_opt      => 'for columns bad_col size 254'
        );

end;
/

select
        column_name, histogram, sample_size
from
        user_tab_columns
where
        table_name = 'T1'
;

column end_av format a12

select
        endpoint_number         end_pt,
        to_char(endpoint_value,'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') end_val,
        endpoint_actual_value   end_av,
        endpoint_repeat_count   end_rpt
from
        user_tab_histograms
where
        table_name = 'T1'
and     column_name = 'BAD_COL'
order by
        endpoint_number
;


COLUMN_NAME          HISTOGRAM             Sample
-------------------- --------------- ------------
BAD_COL              HYBRID                 5,513
ID                   NONE               2,000,000

    END_PT END_VAL                         END_AV          END_RPT
---------- ------------------------------- ------------ ----------
         1  303030303031001f0fe211e0800000 000001                1
        12  3030383938311550648a5e3d200000 008981                1
        23  303135323034f8f5cbccd2b4a00000 015205                1
        33  3032333035311c91ae91eb54000000 023051                1
        44  303239373236f60586ef3a0ae00000 029727                1
...
      2685  3938343731391ba0f38234fde00000 984719                1
      2695  39393235303309023378c0a1400000 992503                1
      2704  3939373537370c2db4ae83e2000000 997577                1
      5513  393939393938f86f9b35437a800000 999999                1

254 rows selected.

So we have a hybrid histogram, we’ve sampled 5,513 rows to build the histogram, we have 254 buckets in the histogram report, and the final row in the histogram is end point 5513 (matching the sample size). The first row of the histogram shows us the (real) low value in the column and the last row of the histogram reports the (real) high value. But there’s something very odd about the histogram – we know that ‘999960’ is the one popular value, occurring 50% of the time in the data, but it doesn’t appear in the histogram at all.

Looking more closely we see that every bucket covers a range of about 11 (sometimes 9 or 10) rows from the sample, and the highest value in each bucket appears just once; but the last bucket covers 2,809 rows from the sample with the highest value in the bucket appearing just once. We expect a hybrid histogram to have buckets which (at least initially) are all roughly the same size – i.e. “sample size”/”number of buckets” – with some buckets being larger by something like the amount that appears in their repeat count, so it doesn’t seem right that we have an enormous bucket with a repeat count of just 1. Something is broken.

The problem is that the sample didn’t find the low and high values for the column – although the initial full tablescan did, of course – so Oracle has “injected” the low and high values into the histogram by fiddling with the contents of the first and last buckets. At the bottom end of the histogram this hasn’t really caused any problems (in our case), but at the top end it has taken the big bucket for our very popular ‘999960’ and apparently simply replaced the value with the high value of ‘999999’ and a repeat count of 1.
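A quick way to check whether a particular value has survived into a histogram is to look for its endpoint directly – a sketch, using this model’s table and the known popular value:

select
        endpoint_number, endpoint_actual_value, endpoint_repeat_count
from
        user_tab_histograms
where
        table_name            = 'T1'
and     column_name           = 'BAD_COL'
and     endpoint_actual_value = '999960'
;

If the query returns no rows the popular value has gone missing from the histogram, which is exactly what happened here.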

As an indication of the truth of this claim, here are the last few rows of the histogram if I repeat the experiment but, before gathering the histogram, delete the rows where bad_col is greater than ‘999960’. (Oracle’s sample is random, of course, and has changed slightly for this run.)

    END_PT END_VAL                         END_AV          END_RPT
---------- ------------------------------- ------------ ----------
...
      2641  3938373731371650183cf7a0a00000 987717                1
      2652  3939353032310e65c1acf984a00000 995021                1
      2661  393938393433125319cc9f5ba00000 998943                1
      5426  393939393630078c23b063cf600000 999960             2764

Similarly, if I inserted a few hundred rows with a higher value than my popular value (in this case I thought 500 rows would be a fairly safe bet as the sample was about one in 360 rows) I got a histogram which ended with a small extra bucket above the bucket for the popular value, so the problem of the popular bucket being hacked to the high value was much less significant:


    END_PT END_VAL                         END_AV          END_RPT
---------- ------------------------------- ------------ ----------
...
      2718  393736313130fe68d8cfd6e4000000 976111                1
      2729  393836373630ebfe9c2b7b94c00000 986761                1
      2740  39393330323515efa3c99771600000 993025                1
      5495  393939393630078c23b063cf600000 999960             2747
      5497  393939393938f86f9b35437a800000 999999                1

Bottom line, then: if you have an important popular value in a column and there aren’t very many rows with a higher value, you may find that Oracle loses sight of the popular value as it fudges the column’s high value into the final bucket.

Workaround

I did consider writing a bit of PL/SQL for the client to fake a realistic frequency histogram, but decided that that wouldn’t be particularly friendly to future DBAs who might have to cope with changes. Luckily the site doesn’t gather stats using the automatic scheduler job and only rarely updates stats anyway, so I suggested we create a histogram on the column using an estimate_percent of 100. This took about 8 minutes to run – for reasons that I will go into in a moment – after which I suggested we lock stats on the table and document the fact that when stats are collected on this table it’s got to be a two-pass job – the normal gather with its auto_sample_size to start with, then a 100% sample for this column to gather the histogram:


begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                method_opt       => 'for columns bad_col size 254',
                estimate_percent => 100,
                cascade          => false
        );
end;
/

    END_PT END_VAL                         END_AV          END_RPT
---------- ------------------------------- ------------ ----------
...
       125  39363839393911e01d15b75c600000 968999                0
       126  393834373530e98510b6f19a000000 984751                0
       253  393939393630078c23b063cf600000 999960                0
       254  393939393938f86f9b35437a800000 999999                0

129 rows selected.

This took a lot longer, of course, and produced an old-style height-balanced histogram. Part of the time came from the increased volume of data that had to be processed, part of it came from a surprise (which also appeared, in a different guise, in the code that created the original hybrid histogram).

I had specifically chosen the method_opt to gather for nothing but the single column. In fact whether I forced the “legacy” (height-balanced) code or the modern (hybrid) code, I got a full tablescan that did some processing of EVERY column in the table and then threw most of the results away. Here are fragments of the SQL – old version first:


select /*+  
            no_parallel(t) no_parallel_index(t) dbms_stats
            cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring 
            xmlindex_sel_idx_tbl no_substrb_pad  
       */
       count(*), 
       count("ID"), sum(sys_op_opnsize("ID")),      
       count("BAD_COL"), sum(sys_op_opnsize("BAD_COL"))    
       ...
from
       "TEST_USER"."T1" t


select /*+
           full(t)    no_parallel(t) no_parallel_index(t) dbms_stats
           cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
           xmlindex_sel_idx_tbl no_substrb_pad
       */
       to_char(count("ID")),
       to_char(count("BAD_COL")),
       substrb(dump(min("BAD_COL"),16,0,64),1,240),
       substrb(dump(max("BAD_COL"),16,0,64),1,240),
       ...
       count(rowidtochar(rowid)) 
from
       "TEST_USER"."T1" t  /* ACL,TOPN,NIL,NIL,RWID,U,U254U*/

  • The old code applied the count() and sys_op_opnsize() to every column in the table.
  • The new code used the substrb() functions only on bad_col, but all the other columns in the table were still subject to the to_char(count()).

This initial scan was a bit expensive – and disappointing – for the client since their table had 290 columns (which means intra-block chaining as a minimum) and had been updated so much that 45% of the rows in the table had to be “continued fetches”. I can’t think why every column had to be processed like this, but if they hadn’t been, that would have saved a lot of CPU and I/O since the client’s critical column was very near the start of the table.

Finally

This problem with the popular value going missing is a known issue, for which there is a bug number, but there is further work going on in the same area which means this particular detail is being rolled into another bug fix. More news when it becomes available.

 

 

ASSM tangle

Thu, 2018-01-11 11:35

Here’s a follow-on from Tuesday’s (serious) note about a bug in 12.1.0.2 that introduces random slowdown on large-scale inserts. The threat in this note, while truthful and potentially a nuisance, is much less likely to become visible because it depends on you doing something that you probably shouldn’t be doing.

There have always been problems with ASSM and large-scale deletes – in particular, when should Oracle mark a block as having free space on deletion? If your session does it immediately then other sessions will start trying to use free space that isn’t really there until you commit; if your session doesn’t do it immediately then when can it happen – you won’t want it done on commit, but that means the segment could “lose” a lot of free space if something doesn’t come along in a timely fashion to tidy up.

But here’s a quirky problem that takes things one step further. What happens if you try to delete a load of data and fail and your session rolls back? If we start with yesterday’s script (running on 11.2.0.4 or 12.2.0.1) we can create a table with 1M rows in it and the following space usage:


Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :       41,666 /      341,327,872

You will recall that each “Full” block actually had the basic 10% free space, plus a couple of hundred extra bytes which Oracle had to “forget about” because the incoming rows were always 290 bytes long. Let’s take this table and delete the first 100,000 rows, then emulate a session error and roll back, and then check the space usage:


delete from t1 where rownum <= 100000;
rollback;

-- generate space usage report

Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :        4,167 /       34,136,064
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :       37,499 /      307,191,808

We have 4,167 blocks which were full, and we know they are effectively full for the purposes of our data, but they’re now declared as having some free space. When Oracle rolled back the delete it wasn’t running code that would attempt to discover that the block was going to go over the limit; it simply calculated the byte change from re-inserting the row, added it to the total free space (tosp) and produced a number that hadn’t reached the limit set by pctfree – so it flagged the block accordingly. (Remember my comment in the earlier article that Oracle doesn’t generate undo for the state changes on the Level 1 bitmap blocks – this is, at least in part, a consequence of that strategy.)

Guideline

If you’re going to do large-scale deletes in an ASSM environment, make sure they don’t fail or subsequent inserts may take a long time.

 


ASSM Argh 2

Wed, 2018-01-10 07:24

After yesterday’s post one of the obvious follow-up questions was whether the problem I demonstrated was a side effect of my use of PL/SQL arrays and loops to load data. What would happen with a pure “insert select” statement? It’s easy enough to check:


rem
rem     Script:         assm_argh2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2018
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

drop table t2 purge;
drop table t1 purge;

create table t2
segment creation immediate
tablespace test_8k_assm
as
select * from all_objects where rownum <= 50000 -- >comment to avoid WordPress anomaly
;

create table t1
segment creation immediate
tablespace test_8k_assm
as
select * from all_objects where rownum = 0
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

insert into t1
select t2.*
from
        (
         select /*+ cardinality(40) */ rownum id
         from dual connect by level <= 40 -- > comment to avoid WordPress anomaly
        ) d,
        t2
;

commit;

declare
        m_unformatted_blocks    number;
        m_unformatted_bytes     number;
        m_fs1_blocks            number;
        m_fs1_bytes             number;
        m_fs2_blocks            number;
        m_fs2_bytes             number;

        m_fs3_blocks            number;
        m_fs3_bytes             number;
        m_fs4_blocks            number;
        m_fs4_bytes             number;
        m_full_blocks           number;
        m_full_bytes            number;

begin
        dbms_space.SPACE_USAGE(
                segment_owner           => 'TEST_USER',
                segment_name            => 'T1',
                segment_type            => 'TABLE',
                unformatted_blocks      => m_unformatted_blocks,
                unformatted_bytes       => m_unformatted_bytes,
                fs1_blocks              => m_fs1_blocks ,
                fs1_bytes               => m_fs1_bytes,
                fs2_blocks              => m_fs2_blocks,
                fs2_bytes               => m_fs2_bytes,
                fs3_blocks              => m_fs3_blocks,
                fs3_bytes               => m_fs3_bytes,
                fs4_blocks              => m_fs4_blocks,
                fs4_bytes               => m_fs4_bytes,
                full_blocks             => m_full_blocks,
                full_bytes              => m_full_bytes
        );

        dbms_output.put_line('Unformatted                   : ' || to_char(m_unformatted_blocks,'999,999,990') || ' / ' || to_char(m_unformatted_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 1 (  0 -  25% free) : ' || to_char(m_fs1_blocks,'999,999,990') || ' / ' || to_char(m_fs1_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 2 ( 25 -  50% free) : ' || to_char(m_fs2_blocks,'999,999,990') || ' / ' || to_char(m_fs2_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 3 ( 50 -  75% free) : ' || to_char(m_fs3_blocks,'999,999,990') || ' / ' || to_char(m_fs3_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 4 ( 75 - 100% free) : ' || to_char(m_fs4_blocks,'999,999,990') || ' / ' || to_char(m_fs4_bytes,'999,999,999,990'));
        dbms_output.put_line('Full                          : ' || to_char(m_full_blocks,'999,999,990') || ' / ' || to_char(m_full_bytes,'999,999,999,990'));
end;
/

I’ve copied the first 50,000 rows from all_objects as a way of generating data, then cloned it 40 times into the main table to give me a total of 2,000,000 rows.

A comment on yesterday’s blog reported that the behaviour I described has been fixed in the October bundle patch for 12.1.0.2, but I haven’t patched my copy yet. So here are the results (with a little cosmetic editing) from running the insert and reporting on space usage from 11.2.0.4, 12.1.0.2, and 12.2.0.1 in order:


11.2.0.4
========
2000000 rows created.

Unformatted                   :          764 /        6,258,688
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          133 /        1,089,536
Full                          :       28,579 /      234,119,168

12.1.0.2
========
2000000 rows created.

Unformatted                   :          256 /        2,097,152
Freespace 1 (  0 -  25% free) :       32,810 /      268,779,520
Freespace 2 ( 25 -  50% free) :            0 /                0
Freespace 3 ( 50 -  75% free) :            1 /            8,192
Freespace 4 ( 75 - 100% free) :           47 /          385,024
Full                          :          443 /        3,629,056

12.2.0.1
========
2000000 rows created.

Unformatted                   :          764 /        6,258,688
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          226 /        1,851,392
Full                          :       39,706 /      325,271,552

The total number of blocks involved changes from version to version, of course, thanks to the huge differences in the contents of all_objects, but the headline message is clear – 12.1.0.2 is broken for this basic requirement. On the plus side, though, this is what you get from 12.1.0.2 if you change that insert to include the /*+ append */ hint:
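For clarity, the only change is the hint on the insert – the statement is otherwise identical to the one above:

insert /*+ append */ into t1
select t2.*
from
        (
         select /*+ cardinality(40) */ rownum id
         from dual connect by level <= 40 -- > comment to avoid WordPress anomaly
        ) d,
        t2
;

commit;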


2000000 rows created.

Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            0 /                0
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :            0 /                0
Full                          :       33,380 /      273,448,960

Unsurprisingly, 11.2.0.4 and 12.2.0.1 also behave and report 100% Full. This is a slightly special case, of course, since there was no previous data in the table; but even when I started the big insert after inserting and committing a few rows, all three versions behaved.


ASSM argh!

Tue, 2018-01-09 11:53

Here’s a problem with ASSM that used to exist in older versions of Oracle, had disappeared by 11.2.0.4, and then re-appeared in 12.1.0.2 – disappearing again by 12.2.0.1. It showed up on MoS a few days ago under the heading: “Insert is running long with more waits on db file sequential read”.

The obvious response to this heading is to question the number of indexes on the table – because big tables with lots of indexes tend to give you lots of random I/O as Oracle maintains the indexes – but this table had no indexes. The owner of the problem supplied several bits of information in the initial post, with further material in response to follow-up questions, including the tkprof summary of the 10046/level 12 trace of the insert and two extracts from the trace file to show us some of the “db file sequential read” waits – the first extract made me wonder if there might be some issue involving 16KB blocks, but the second one dispelled that illusion.

There are several buggy things that can appear with ASSM and large-scale DML operations, and sometimes the problems can appear long after the original DML had done the dirty deed, so I thought I’d create a simple model based on the information supplied to date – and discovered what the problem (probably) was. Here’s how it starts – I’ve created a tablespace using ASSM, and in this tablespace I’ve created a table which has 48 columns with a row length of 290 bytes (roughly matching the OP’s table), and I’ve hacked out a simple PL/SQL block that loops around inserting arrays of 100 rows at a time into the table for a total of 1M rows before committing.


rem
rem     Script:         assm_cleanout.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2018
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2        Lots of blocks left "not full"
rem             11.2.0.4
rem

rem
rem     using OMF, so no file-name needed
rem     Ran this bit as SYS, and altered test user to have unlimited quota
rem

/*
create
        tablespace test_8k_assm
        datafile size 1G
        extent management local
        autoallocate
        segment space management auto
;
*/

rem
rem     Create the table, pre-allocate some space.
rem     This means we should get consistent 8M extents and not initial little ones
rem

create table t1 (
        v001 varchar2(5), v002 varchar2(5), v003 varchar2(5), v004 varchar2(5), v005 varchar2(5),
        v006 varchar2(5), v007 varchar2(5), v008 varchar2(5), v009 varchar2(5), v010 varchar2(5),
        v011 varchar2(5), v012 varchar2(5), v013 varchar2(5), v014 varchar2(5), v015 varchar2(5),
        v016 varchar2(5), v017 varchar2(5), v018 varchar2(5), v019 varchar2(5), v020 varchar2(5),
        v021 varchar2(5), v022 varchar2(5), v023 varchar2(5), v024 varchar2(5), v025 varchar2(5),
        v026 varchar2(5), v027 varchar2(5), v028 varchar2(5), v029 varchar2(5), v030 varchar2(5),
        v031 varchar2(5), v032 varchar2(5), v033 varchar2(5), v034 varchar2(5), v035 varchar2(5),
        v036 varchar2(5), v037 varchar2(5), v038 varchar2(5), v039 varchar2(5), v040 varchar2(5),
        v041 varchar2(5), v042 varchar2(5), v043 varchar2(5), v044 varchar2(5), v045 varchar2(5),
        v046 varchar2(5), v047 varchar2(5), v048 varchar2(5)
)
segment creation immediate
tablespace test_8k_assm
storage(initial 8M)
;

alter table t1 allocate extent (size 8M);
alter table t1 allocate extent (size 8M);

rem
rem     Simple anonymous pl/sql block
rem     Large insert, handled with array inserts
rem     Can modify loop count and array size very easily
rem

declare
        type tab_array is table of t1%rowtype;
        junk_array tab_array;
begin

        select
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx',
                'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx', 'xxxxx'
        bulk collect into
                junk_array
        from
                all_objects
        where
                rownum  <= 100 -- > comment to avoid WordPress format issue
        ;

        for i in 1..10000 loop
                forall j in 1..junk_array.count
                        insert into t1 values junk_array(j) ;
        end loop;

end;
/

commit;

The number of rows per block after this insert is 24, with 1038 bytes free space left (808 due to the pctfree = 10, then the bit that was too small to take a 25th row before breaching the pctfree barrier). This means we should report 1M/24 = 41,666 full blocks and one block with some free space. So we query the table using the dbms_space package:


declare
        m_unformatted_blocks    number;
        m_unformatted_bytes     number;
        m_fs1_blocks            number;
        m_fs1_bytes             number;
        m_fs2_blocks            number;
        m_fs2_bytes             number;

        m_fs3_blocks            number;
        m_fs3_bytes             number;
        m_fs4_blocks            number;
        m_fs4_bytes             number;
        m_full_blocks           number;
        m_full_bytes            number;

begin
        dbms_space.SPACE_USAGE(
                segment_owner           => 'TEST_USER',
                segment_name            => 'T1',
                segment_type            => 'TABLE',
                unformatted_blocks      => m_unformatted_blocks,
                unformatted_bytes       => m_unformatted_bytes,
                fs1_blocks              => m_fs1_blocks ,
                fs1_bytes               => m_fs1_bytes,
                fs2_blocks              => m_fs2_blocks,
                fs2_bytes               => m_fs2_bytes,
                fs3_blocks              => m_fs3_blocks,
                fs3_bytes               => m_fs3_bytes,
                fs4_blocks              => m_fs4_blocks,
                fs4_bytes               => m_fs4_bytes,
                full_blocks             => m_full_blocks,
                full_bytes              => m_full_bytes
        );

        dbms_output.new_line;
        dbms_output.put_line('Unformatted                   : ' || to_char(m_unformatted_blocks,'999,999,990') || ' / ' || to_char(m_unformatted_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 1 (  0 -  25% free) : ' || to_char(m_fs1_blocks,'999,999,990') || ' / ' || to_char(m_fs1_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 2 ( 25 -  50% free) : ' || to_char(m_fs2_blocks,'999,999,990') || ' / ' || to_char(m_fs2_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 3 ( 50 -  75% free) : ' || to_char(m_fs3_blocks,'999,999,990') || ' / ' || to_char(m_fs3_bytes,'999,999,999,990'));
        dbms_output.put_line('Freespace 4 ( 75 - 100% free) : ' || to_char(m_fs4_blocks,'999,999,990') || ' / ' || to_char(m_fs4_bytes,'999,999,999,990'));
        dbms_output.put_line('Full                          : ' || to_char(m_full_blocks,'999,999,990') || ' / ' || to_char(m_full_bytes,'999,999,999,990'));

end;
/

The results aren’t what we expect:


Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :       35,001 /      286,728,192
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :        6,665 /       54,599,680

We have one block marked as 25 – 50% free (that’s the one block with 16 rows in it, which means about 40% space currently free) but our 41,666 full blocks are actually reported as 6,665 full blocks and 35,001 blocks with some space available. That’s going to hurt eventually if some process wants to insert more rows and finds that it has to fail its way through 35,001 blocks before finding a block which has enough free space.
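If you want to cross-check the rows-per-block claim independently of dbms_space, a quick (if brute-force) sanity check is to group the rows by block – a sketch against the same table:

select
        rows_per_block, count(*) blocks
from    (
        select
                dbms_rowid.rowid_relative_fno(rowid)    file_no,
                dbms_rowid.rowid_block_number(rowid)    block_no,
                count(*)                                rows_per_block
        from    t1
        group by
                dbms_rowid.rowid_relative_fno(rowid),
                dbms_rowid.rowid_block_number(rowid)
        )
group by
        rows_per_block
order by
        rows_per_block
;

With the data above this should report 41,666 blocks holding 24 rows and one block holding 16 rows.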

So what happens when I repeat the PL/SQL block (and commit)? Here are the results from calls to dbms_space after the next two cycles:


Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :       70,002 /      573,456,384
Freespace 2 ( 25 -  50% free) :            2 /           16,384
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          220 /        1,802,240
Full                          :       13,330 /      109,199,360

Unformatted                   :          256 /        2,097,152
Freespace 1 (  0 -  25% free) :      105,003 /      860,184,576
Freespace 2 ( 25 -  50% free) :            3 /           24,576
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          117 /          958,464
Full                          :       19,995 /      163,799,040

Every time we execute the PL/SQL block we leave a trail of 35,001 more blocks which are flagged as “not quite full”.

Looking at the session stats while running the insert loop I can tell that Oracle isn’t checking to see whether or not it should be using those blocks. (A quick way of proving this is to flush the buffer cache before each execution of the PL/SQL and note that Oracle doesn’t read back the 105,000 blocks before inserting any data). So somehow, sometime, someone might get a nasty surprise – and here’s one way that it might happen:

Since I know my data fits 24 rows per block I’m going to modify my PL/SQL block to select one row into the array then loop round the insert 25 times – so I know I’m inserting a little bit more than one block’s worth of data. Starting from the state with 105,003 blocks marked as “Freespace 1” this is what I saw – first, the free space report after inserting 25 rows:


Unformatted                   :          240 /        1,966,080
Freespace 1 (  0 -  25% free) :        1,074 /        8,798,208
Freespace 2 ( 25 -  50% free) :            0 /                0
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :          133 /        1,089,536
Full                          :      123,927 /    1,015,209,984

Then a few wait events and session statistics for the insert:


---------------------------------------------------------
SID:    39:TEST_USER - jonathan
Session Events - 09-Jan 16:57:18
Interval:-      6 seconds
---------------------------------------------------------
Event                                             Waits   Time_outs        Csec    Avg Csec    Max Csec
-----                                             -----   ---------        ----    --------    --------
db file sequential read                          15,308           0         128        .008           3
db file scattered read                           20,086           0         271        .014           4

---------------------------------
Session stats - 09-Jan 16:57:18
Interval:-  6 seconds
---------------------------------
Name                                                                     Value
----                                                                     -----
session logical reads                                                  269,537
physical read total IO requests                                         35,401
db block gets                                                          229,522
consistent gets                                                         40,015
physical reads                                                         124,687
physical reads cache                                                   124,687
db block changes                                                       208,489
physical reads cache prefetch                                           89,293
redo entries                                                           207,892
redo size                                                           16,262,724
undo change vector size                                                  1,720
deferred (CURRENT) block cleanout applications                         103,932
table scan blocks gotten                                                20,797
HSC Heap Segment Block Changes                                              25

The session has read and updated almost all of the level 1 bitmap blocks. I don’t know exactly what triggered this re-read, but it seems to be related to the number of rows inserted (or, perhaps, the amount of space used rather than the row count) as an insert crosses the pctfree boundary and fails over to the next block. I’ve only done a couple of little tests to try and get a better idea of why an insert sometimes sweeps through the bitmap blocks – so I know that inserting 2 or 3 rows at a time will also trigger the cleanout – but there are probably several little details involved that need to be identified.

You might note a couple of details in the stats:

  • Because I had flushed the buffer cache before the insert Oracle did its “cache warmup” tablescanning trick – if this had not happened I would probably have done a single block read for every single bitmap block I touched.
  • There are 103,932 block cleanout applications – but 208,000 db block changes and redo entries. Roughly half the latter are for data block cleanouts (OP code 4.1) and half are the state changes on the level 1 bitmap blocks (OP code 13.22). You’ll notice that neither change produces any undo.
  • I’ve also included the HSC Heap Segment Block Changes statistics to show you that not all changes to Heap Segment Blocks show up where you might expect them.
And finally:

If you re-run the tests on 11.2.0.4 and 12.2.0.1 you get the following results after the initial script run – the problem doesn’t exist:


11.2.0.4
========
Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :       41,666 /      341,327,872

12.2.0.1
========
Unformatted                   :            0 /                0
Freespace 1 (  0 -  25% free) :            0 /                0
Freespace 2 ( 25 -  50% free) :            1 /            8,192
Freespace 3 ( 50 -  75% free) :            0 /                0
Freespace 4 ( 75 - 100% free) :           67 /          548,864
Full                          :       41,666 /      341,327,872

So how does this help the OP?
  • First, there may be a huge mess still waiting to be cleaned up in the table – but at 34M blocks I’m not too keen to suggest running the dbms_space routine to find out what it looks like; maybe that’s necessary all the same.
  • Secondly – an upgrade to 12.2 will probably avoid the problem in future.
  • Thirdly – if the number of rows per block is very close to uniform, write a little code to do a loop that inserts (say) 2 * expected number of rows per block as single-row inserts and rolls back (see the sketch after this list); the inserts will probably trigger a massive, though perhaps not complete, cleanout, so rinse and repeat until the cleanout is complete. Try to find a time when you don’t mind the extra load to get this job done.
  • Finally – on the big job that does the bulk insert – repeat the dummy insert/rollback at the end of the job to clean up the mess made by the job.
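As an illustration of the third bullet, here’s a minimal sketch of the sort of loop I have in mind – it assumes roughly 24 rows per block (as in this test) and a table that already contains at least one row:

declare
        m_rows_per_block        constant pls_integer := 24;     -- adjust to match your data
begin
        for i in 1 .. 2 * m_rows_per_block loop
                -- single-row conventional inserts so each one has to search the free space bitmaps
                insert into t1
                select * from t1 where rownum = 1;
        end loop;
        rollback;               -- we never wanted the rows, only the cleanout side effect
end;
/

Rinse and repeat, checking a dbms_space report between passes, until the backlog of “Freespace 1” blocks has been cleared.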
Addenda

Prompted by comment #2 below, I should add that if the problem has been fixed in 12.2 then possibly there’s a bug report and patch for it already. If there isn’t then the OP could raise an SR (referencing this blog note), and request a bug fix or back-port from 12.2.

And within 24 hours of publication, comment #4 (from Yury Pudovchenko) tells us that the bug is fixed by the Oct 2017 Bundle Patch.

 

 


Defaults

Tue, 2018-01-02 02:43

Following on from a Twitter reference and an update to an old posting about a side effect of constraints on the work done inserting data, I decided to have a closer look at the more general picture of default values and inserts. Here’s a script that I’ve tested against 11.2.0.4, 12.1.0.2, and 12.2.0.1 (original install, no patches applied in all cases):


rem
rem     Script:         defaults_cost.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2017
rem

create table t1 (
        column1  varchar2(10),
        column2  varchar2(10),
        column3  varchar2(10),
        column4  varchar2(10),
        column32 varchar2(32)   default 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
        column33 varchar2(33)   default 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
        virtual1      generated always as (
                column4 || column3 || column2 || column1
        ) virtual
)
segment creation immediate
;

execute dbms_output.put_line(dbms_stats.create_extended_stats(user,'t1','(column1 || column2 || column3 || column4)'))
execute dbms_output.put_line(dbms_stats.create_extended_stats(user,'t1','(column1,column2,column3)'))

create or replace function plsqlfunction_with_30char_name(
        i_in varchar2
)
return varchar
deterministic
is
begin
        return initcap(i_in);
end;
/

create index t1_i1 on t1(substr(plsqlfunction_with_30char_name(column1),1,10));

When you create a function-based index you get a hidden virtual column supporting the index expression; when you create extended stats (of either type) you get a hidden virtual column holding the extension definition; and when you create any type of virtual column, including a “real” virtual column, you get a data dictionary entry holding the column name and the expression definition. All these options use the “data_default” column from user_tab_cols to display the defining information – as we can see when we run the following query:


select  column_name, data_default
from    user_tab_cols
where   table_name = 'T1'
order by
         column_id
;

COLUMN_NAME                      DATA_DEFAULT
-------------------------------- --------------------------------------------------------------------------------
COLUMN1
COLUMN2
COLUMN3
COLUMN4
COLUMN32                         'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
COLUMN33                         'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
VIRTUAL1                         "COLUMN4"||"COLUMN3"||"COLUMN2"||"COLUMN1"
SYS_STUDAN97VB_XDKRTR_NPFAB80P   "COLUMN1"||"COLUMN2"||"COLUMN3"||"COLUMN4"
SYS_STUTCBJ8OFFSY1D9ZXRYZ0W3W#   SYS_OP_COMBINED_HASH("COLUMN1","COLUMN2","COLUMN3")
SYS_NC00010$                     SUBSTR("TEST_USER"."PLSQLFUNCTION_WITH_30CHAR_NAME"("COLUMN1"),1,10)

Apart from the special cases I’ve just listed, you’ll also see the “default values” I specified for column32 and column33 – you’ll notice that I’ve supplied a 30 character string as the default for column32, and a 31 character string as the default for column33 – this is a convenience that means the length of the text stored in data_default (which is a long column) corresponds to the number in the column’s name once you include the single quotes in the character count.

Having set my data up I’m going to emulate a bad application that uses lots of literal string SQL and leaves Oracle to fill in the default values (and, of course, derive the various virtual values it might need).


alter session set events '10046 trace name context forever, level 4';

begin
        for i in 1..10 loop
                execute immediate '
                        insert into t1 (column1, column2, column3, column4)
                        values( ' || i || ', ' || i || ', ' || i || ', ' || i || ')'
                ;
                commit;
        end loop;
end;
/

alter session set events '10046 trace name context off';

This code generates and executes 10 separate literal-string insert statements that populate column1 through to column4 only. But you’ll notice the call to enable SQL tracing – and here’s the interesting bit of the output from applying tkprof to the trace file:


  SQL ID: 47r1y8yn34jmj Plan Hash: 2191121161

select default$
from
 col$ where rowid=:1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       50      0.00       0.00          0          0          0           0
Execute     50      0.00       0.00          0          0          0           0
Fetch       50      0.00       0.00          0        100          0          50
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      150      0.00       0.00          0        100          0          50

The summary is the same for all three versions of Oracle I tested*** – we’ve queried for a default value from col$ 5 times for each row we insert! (Technically that’s 5 times for each insert statement we’ve had to (hard-)parse; this anomaly wouldn’t appear if we had been using a bind-variable method and reusing the insert statement.) *** There is one difference in 12.2 – the number of parse calls reported for this statement was 1 rather than 50 but, judging by the various cursor cache hit stats, that may be due to a change in accounting rather than a change in workload.

Check the table definition: there are two “real defaults” and 4 expressions due to the various virtual columns – so why 5 calls per insert and not 6 ? The answer lies in the length of the actual value involved – if the text that appears in the (long) data_default column is 32 characters or shorter it will be stored in the dictionary cache (rowcache), but only one of our 6 values is that short (the 30-character default for column32 comes to 32 characters once you include its quotes), so Oracle looks up the other five on each insert (hard parse) – five lookups for each of the ten hard-parsed statements, which accounts for the 50 executions in the tkprof output.

This is a little strange on two counts: first, given the enormous memories that have been available for years and the extreme usefulness of virtual columns and extended stats it seems something of an oversight to limit the memory available to the cache that holds such critical definitions; secondly, we don’t need to evaluate the virtual columns (unless they are indexed) or extended stats on inserts so why is Oracle querying the definitions anyway ? [Possible answer: perhaps Oracle is using generic code that allows for column check constraints – which might exist on users’ virtual columns – and hasn’t catered for bypassing system-generated virtual columns.]

A key point to remember before you start worrying too much about the impact of the execution count for this query against col$ is that it’s only relevant to “hard” parses – so you’re only likely to notice it if you have a large number of different “literal string” inserts that should be using bind variables; and that means you’ve probably got an important coding defect to address before you worry too much about the extra impact caused by this particular call. Nevertheless there are a couple of bug reports on MoS that have been raised against this query and, after writing this note, I did a quick Google search for the critical SQL_ID and found (among others) this production example from Franck Pachot.
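For comparison, this is what the bind-variable version of the test loop would look like – static SQL inside PL/SQL binds its variables automatically, so the insert should be hard-parsed just once and the col$ lookups should happen only for that single parse (a sketch, not a claim about your application code):

begin
        for i in 1..10 loop
                insert into t1 (column1, column2, column3, column4)
                values (to_char(i), to_char(i), to_char(i), to_char(i));
                commit;
        end loop;
end;
/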

 


Been There

Sun, 2017-12-31 04:51

It’s the end of the year and time for a retrospective of some sort so I was thinking of listing the top 10 most popular pages on my blog, but Connor McDonald beat me to it, so I decided to see if I could remember all the countries I’d visited since starting to work with the Oracle software, and here’s the list in alphabetical order:

Antigua
Australia
Austria
Belgium
Bosnia
Brunei
Bulgaria
Canada
China
Croatia
Czech Republic
Denmark
Dubai
Egypt
Estonia
Finland
France
Germany
Greece
Hungary
Iceland
India
Indonesia
Ireland
Israel
Italy
Japan
Latvia
Lithuania
Malaysia

A few of these were holidays rather than work, and I may have forgotten a couple, so if you’ve seen me in your country and it’s not on the list let me know.

The list can be a bit of a nuisance: I had to list “all the countries you’ve visited in the last 10 years” for both the US and Russian visas – the US form only allowed for 5 countries and the Russian one for 40, and the US expected me to list EVERY visit, with dates and city!

 


nvarchar2

Sat, 2017-12-30 06:08

Here’s an odd little quirk that appeared when I was playing around with default values just recently. I think it’s one I’ve seen before, I may even have written about it many years ago but I can’t find any reference to it at present. Let’s start with a script that I’ll run on 12.2.0.1 (the effect does appear on earlier versions):


rem
rem     Script:         nvarchar2_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2017
rem

create table t1 (
        column1  varchar2(10),
        column2  nvarchar2(10)
);

create table t2 (
        column1  varchar2(10)
);

alter table t2 add column2 nvarchar2(10);

create table t3 (
        column1  varchar2(10),
        column2  nvarchar2(10) default 'xxxxxxxx'
);

create table t4 (
        column1  varchar2(10)
);

alter table t4 add column2 nvarchar2(10) default 'xxxxxxxx';

insert into t1(column1) values('a');
insert into t2(column1) values('a');
insert into t3(column1) values('a');
insert into t4(column1) values('a');

All I’ve done is create 4 tables which, when described, will all look the same:


 Name                    Null?    Type
 ----------------------- -------- ----------------
 COLUMN1                          VARCHAR2(10)
 COLUMN2                          NVARCHAR2(10)

There is a significant difference between the first two and the last two, of course, thanks to the specification of a default value, which means that the inserts will produce two possible results: the first two tables will have nulls in column2; the last two will have the nvarchar2 equivalent of ‘xxxxxxxx’ which, in my instance, will be a string of 16 bytes: “0,78,0,78,0,78,0,78,0,78,0,78,0,78,0,78”.
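If you want to see that stored representation for yourself, a simple dump() will do it – a sketch, assuming the default AL16UTF16 national character set:

select  column2, dump(column2, 16) column2_dump
from    t3
;

Each ‘x’ should appear as the two bytes 0,78 in the hex dump, giving the 16-byte string quoted above.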

Surprisingly, though, there is a dramatic oddity between t3 and t4 which shows up when I query user_tab_cols:

select
        table_name, column_id, column_name,  segment_column_id, data_default
from    user_tab_cols
where   table_name like 'T_'
order by
        table_name, column_id
;

TABLE_NAME            COLUMN_ID COLUMN_NAME          SEGMENT_COLUMN_ID DATA_DEFAULT
-------------------- ---------- -------------------- ----------------- --------------------
T1                            1 COLUMN1                              1
                              2 COLUMN2                              2

T2                            1 COLUMN1                              1
                              2 COLUMN2                              2

T3                            1 COLUMN1                              1
                              2 COLUMN2                              2 'xxxxxxxx'

T4                            1 COLUMN1                              1
                              2 COLUMN2                              3 'xxxxxxxx'
                                SYS_NC00002$                         2

Table t4 has acquired two columns – a hidden column (which physically exists as the second column in the stored data and is declared as raw(126)) and the column which I had specified. You’ll note that the test shows two differences that may be significant: comparing t3/t4 we see that adding, rather than initially defining, the nvarchar2() column introduces the extra column; comparing t2/t4 we see that adding a varchar2() rather than an nvarchar2() doesn’t produce the same effect. Tentative assumption, therefore, is that there is something special about adding nvarchar2() columns.
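One way to spot these system-generated columns explicitly is to ask user_tab_cols about hidden columns – a sketch:

select
        table_name, column_id, segment_column_id, column_name,
        hidden_column, virtual_column, data_type
from
        user_tab_cols
where
        table_name like 'T_'
and     hidden_column = 'YES'
order by
        table_name, segment_column_id
;

For t4 this should report just the SYS_NC00002$ column – hidden, not virtual, and physically stored (which is why it has a segment_column_id).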

Casting my mind back to various customers who have gone through multiple upgrades of 3rd party applications that invariably seem to add columns to tables, I wondered whether this extra column appeared every time you added an nvarchar2(). I’d not noticed anything in the past that suggested this might be the case, but it’s obviously worth checking: and in my simple tests it looked as if Oracle created just one extra column and used it to capture a value that seemed to be determined by the number and location of columns that had been added.
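A quick way to test the “just one extra column” observation is to add another nvarchar2() column with a default and query user_tab_cols again – a sketch, where column3 is simply an arbitrary name for the demonstration:

alter table t4 add column3 nvarchar2(10) default 'yyyyyyyy';

select  column_name, column_id, segment_column_id, data_default
from    user_tab_cols
where   table_name = 'T4'
order by
        column_id
;

If only the original SYS_NC00002$ column shows up in the output then Oracle is re-using the one hidden column rather than creating a new one for each added nvarchar2().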

It’s a curiosity, and leaves room for further investigation – so if anyone has links to related articles please feel free to add them in the comments.

 


Tech 2017 Agenda

Tue, 2017-11-28 08:29

As usual it’s hard to pick a personal agenda from the wealth of content available for the UKOUG annual conference, but this is my starting list:

Sunday

13:40 – 14:25 Roger MacNicol: “My query plan says Table Access Full: what happens next ?”

14:40 – 15:30 Gert Poel: “Smart Database Development with PL/SQL and Oracle REST Data Services”

16:10 – 17:00 Kellyn Pot’Vin: “Oracle vs. SQL Server – the War of the Indexes”

17:10 – 18:00 Luiza Koziel: “How to improve your presentations AKA the Tool is Just a Tool… Learn how to use it for a Good Cause”

Monday

9:00 – 9:50  ME! I launch (one stream of) the conference with “Index Statistics and Column Groups”

11:35 – 13:25 Ivica Arsov: “Parallel Execution with Oracle 12c” … there may be some overlap with my 2nd presentation

14:25 – 15:15 ME again! At the CBO round table — we may answer a few of the more specific questions that came in for the CBO panel.

15:25 – 16:25 Community Keynote – how could one possibly miss Maria Colgan, Connor McDonald and Chris Saxon … (maybe with Tim-Tams ?)

16:55 – 17:45 Michael Salt – “An in-memory paradox – increased I/O”.  I think I know the answer, but this reveals another way to do it wrong. (Pity to miss Richard Foote on AWRs)

17:55 – 18:45 Franck Pachot – “From Transportable Tablespace to Pluggable Databases”

 

Tuesday

9:00 – 9:50 ME yet again, starting the day with Maria Colgan, Nigel Bayliss, Chris Antognini and Richard Foote on the CBO Panel, with Martin Widlake and Neil Chandler doing the MC bit and making sarky comments.

10:00 – 10:50 Marcin Przepierowski: “Rman – from Beginner to Advanced”, because you’ve always got to review what you thought you knew about recovery (and backup).  It means I have to miss Bryn Llewellyn and Kamil Stawiarski arguing about how to address performance in PL/SQL

11:25 – 12:15 Lucas Jellema: “Intro to Docker Containers & the Oracle Platform – Database, Weblogic and Cloud”.

12:25 – 13:15 Mark Rittman: “How Analytics is changing the World (again)”.

14:20 – 15:05 Stew Ashton: “Meet your Match: Advanced Row Pattern Matching”

15:40 – 16:25 Martin Berger: “Escape from Exadata”.

16:40 – 17:30 Roger MacNicol: “Using Oracle Columnar Technologies across the Information Lifecycle”

Wednesday

9:00 – 9:45 Simon Pane: “Modernizing your DBA Scripts and Backups with the Oracle Scheduler”

10:00 – 10:45 Allan McAleavy: “Moving to an All Flash Array – Dude Where’s my Bottleneck”

11:25 – 12:15 Jason Arneil: “An introduction to Sharding”

12:25 – 13:15 My final session: “Parallel Execution” – if you come to this it’s probably a good idea to see Ivica Arsov on Monday morning as well

14:15 – 15:05 Jaromir D.B. Nemec: “Anomaly Detection in Database Workload”.

 

Of course I may change my mind between now and the start of the event so if you’re feeling deprived (or relieved) that I’m not going to be in your audience – or if you’re feeling pressured that I am – never mind, my agenda isn’t cast in stone and I’ll probably end up at the wrong sessions anyway because I’ve been busy talking to someone without keeping an eye on the time.

 

 


Question Time

Thu, 2017-10-19 13:15

It’s that time of year again – the UKOUG Tech conference is approaching and I’ve organised a panel session on the Cost Based Optimizer.

This year I’ve got Christian Antognini, Nigel Bayliss, Maria Colgan and special guest star, all the way from Australia, Richard Foote on the panel, with Neil Chandler and Martin Widlake taking on their inimitable role of MCs.

If you’ve got any questions you’d like to put to the panel, you will have a chance to write them down on the day, but it would be nice to have a few supplied in advance in the comments below.  Tactical, strategic, technical, or just plain curious – this is a panel that can tell you what can be done, what shouldn’t be done, and how to do the things you shouldn’t do but sometimes have to.

If you prefer to email your questions then click this link.


In memoriam – 3

Wed, 2017-10-11 07:30

My father-in-law died a few weeks ago, aged 95. This is the story that he wrote for his children and grandchildren a few years ago describing his experiences as a Naval engineer on the aircraft carrier HMS Indefatigable during the second world war.

ROY‘S NAVAL CAREER

When war broke out on 3rd September 1939 I wanted to join the Navy, and a few days later I saw a  new recruiting office near Southend Pier so I went in and asked how I would be able to join. A Petty Officer looked at me and said “Well, sonny, you will have to wait until you are 18”. I was then only 17 so I continued with my plan to become an engineer. In those days parents either had to pay the full cost of university education or rely on their children gaining scholarships. In my case scholarships were essential. So, concentrating on mathematics, I took Higher School Certificate (A-Levels) in July 1938 and July 1939, but did not gain any scholarships. At that time I was Head Boy at Lindisfarne College and in late September the school was evacuated to North Wales from the Southend area because of fears of bombing and invasion but here the buildings were not well equipped and there was no laboratory. However, the Southend High School remained at Southend and arrangements were made to transfer me there.

In December 1939 I was awarded a Scholarship at Queens’ College, Cambridge. Then in May 1940 when the German blitzkrieg started the High School was evacuated to Mansfield in the Midlands, but there I took the HSC again and as a result gained a State Scholarship and a Southend Borough Major Scholarship, which in total was enough to see me through Cambridge. There I made friends with Denis Campbell, Stuart Glass and Edward Higham. In addition to lectures we went regularly to tutorials with a great character called (Professor) Archie Browne. He had additional duties as Steward of the College, and was responsible for obtaining food supplies and coal for heating, which was very difficult in wartime.

The course was completed in two years and, with blackouts, air raid precautions and other restrictions, social life was limited. I joined the Naval Section of the Cadet Corps and the Home Guard which took up one or two afternoons each week. I remember one exercise where we had to make a mock attack by night on an airfield some ten miles north of Cambridge. The defenders somehow knew that we would attack the SE corner and mustered there, but we made a mistake and went for the NE corner which was undefended, so we theoretically captured that bit of airfield! We had to march there and back, and the blisters lasted for weeks. On another exercise Cambridge was attacked by the Welch Fusiliers, I remember being knocked on the head and falling into a ditch half full of water. I was considered a casualty and allowed to return to college for a hot bath.

July – September 1942         I applied to join the Royal Navy as an engineer officer and had interviews at the Admiralty including medical examinations. As a result I was accepted and appointed a Probationary Temporary Acting Sub.Lieutenant (E) RNVR, and the next step was to purchase my uniform at Gieves in London, including the purple stripe denoting engineering.

October 1942         I reported to Portsmouth Barracks for four weeks training. I wore my uniform for the first time at Warminster in Wiltshire where we were living, and traveled to Portsmouth without any knowledge of how to make or receive naval salutes in public! This was soon rectified at Portsmouth where I joined twenty other trainees for the course which included instruction in naval customs and traditions, rules and regulations, security, and the all-embracing Kings Regulations and Admiralty Instructions. We also had training in small arms firing and endless square bashing under the eagle eye of Chief Petty Officer Sims, who was as tough as old boots.

November 1942 – November 1943         I was posted to John Brown’s Engineering Works at Clydebank with Donald Townend and Ian Richardson for practical marine engineering training. John Brown’s was a huge organisation which built engines as well as ships, and just after we arrived Indefatigable was launched. This was an amazing sight, seeing 30,000 tons of ship slide down the slipway into the river Clyde. Before the war the Queen Mary and the Queen Elizabeth were built on the same slipway.

The three of us were billeted with two or three other naval officers in lodgings at Glasgow where we three shared a room and were looked after by a homely landlady and her staff. Every morning we put on civilian clothes and caught a rickety old tram for a 30 minute journey to Clydebank. There we worked successively in the Pattern-shop (making wood moulds), Foundry, Boiler-shop (being deafened by riveting), Machine-shop, Fitting-shop, Pipe-shop, Drawing Office and Dockyard. We did actually work, scraping bearings, operating lathes, casting metal, always under the supervision of an experienced workman. During lunch hours we used to climb over Indefat, deafened again by riveting, but we got to know the ship. At that time the yard was completing R class destroyers at the rate of about one every fortnight, and we used to take part in their initial sea trials so gaining experience of firing up boilers and operating turbine plant.

During the summer of 1943 we got to know the permanent RN engineer officers appointed to supervise the fitting out of the ship, including Peter Sandison who looked after the flight deck gear. We were seconded to help the checking of the installation and testing of all kinds of machinery, and in November I was chosen to be officially appointed to Indefat, while the other two went off to other ships.

December 1943 – February 1944         The ship was commissioned on 8th December and taken over by the RN from the yard. After dock trials we steamed down the Clyde and carried out various trials including full power of the 148,000 HP engines, the measured mile speed test (32 knots) off the Isle of Arran, and steering and going astern trials from full ahead. I remember on one occasion the steering gear locked solid at hard-a-starboard while doing full speed. We went round in circles flying two black balls showing we were out of control! Several weeks were spent commissioning and training the crew, taking on stores and ammunition, gunnery practice, testing of radar and flight deck equipment, while some time was spent at sea.

March – June 1944         The first aircraft flew in on 23rd March, and thereafter the squadrons began to arrive. We spent days at sea practising aircraft landings and by the end of June we had a complement of some 75 aircraft including Seafires, Fireflies and Hellcats. When at sea engineer officers kept watch for four hours at a time, the middle watch (midnight to 4 am) and the morning watch (4 am to 8 am) being the worst. During a watch we had to visit each engine room and boiler room, and altogether a total of seventeen machinery spaces where each involved climbing up and down three sets of ladders, as the only passage was via the main deck. The best visit was always to a boiler room, where the Chief Stoker would provide a mug of ‘kai’, a chocolate slab heated in hot water and steam.

In addition each officer had responsibility for a department which included the operation and maintenance of all the equipment in it, and the men carrying out this work. Over the years mine included seven steam generators supplying electricity to the ship, three emergency diesel generators, motor boats, steering gear and auxiliary machinery including the big evaporators for making the ship’s fresh water from seawater. Also every six months each officer took it in turn to run the ship’s laundry for 2000 crew!

At Action Stations if not on watch each engineer officer had a Damage Control section of the ship to look after. Mine was the midships section above one of the engine rooms, and my team consisted of about ten stokers and technicians. We might be stationed there all day with only sandwiches and ‘Spotted Dick’ for lunch!

July – October 1944         Indefat joined the Fleet at Scapa Flow surrounded by battleships, cruisers and destroyers, and spent much time at sea on Russian convoy escort duty going beyond the Arctic Circle. In July we made an attack on the largest German battleship Tirpitz, which was moored in a Norwegian fiord and was always a potential menace to Russian convoys. This operation was called MASCOT and with two other carriers the aircraft carrying out the attack included 44 Barracudas, 18 Hellcats and 12 Fireflies, supported by many Seafires as fighter escorts. The weather was not good with cloud and fog around and although the Tirpitz was damaged it was decided to make another attack in mid-August. Prior to that strikes were made against some installations on the Norwegian coast and then on 18th August we sailed for the second Tirpitz attack called operation GOODWOOD. At this time a valuable convoy was en route to Russia and our job was to protect it from the Tirpitz and Uboats. The convoy did arrive safely.

Indefat aircraft included 12 Barracudas, 12 Fireflies, 12 Hellcats and 32 Seafires, and the ship was accompanied by Formidable, Furious and two small escort carriers, together with destroyers. On the first day one escort carrier was torpedoed and badly damaged, and had to return to Scapa escorted by the second small carrier. Some time later a destroyer was torpedoed and sank, with few survivors. The operation lasted for seven days with the ship at Action Stations most of the time. At one point Indefat seemed to be under serious attack by Uboats, with the ship taking evasive action and shaken by exploding depth charges from nearby destroyers, while it was reported that one torpedo passed under Indefat. GOODWOOD was successful as Tirpitz was hit several times and had to be moved to the port of Tromso for repairs, where she was later sunk by the RAF with their 10 ton Tallboy bombs. Had she remained in the narrow fiord in the lee of the mountains protected by smokescreens they might never have hit her.

Above the Arctic Circle the sun at this time only went below the horizon for a short time, which meant that our ships could be continually kept under observation by German aircraft and Uboats. There were however some fascinating panoramas of sea and sky, and I remember that one evening the ship had to steam into the wind straight for the coast and the spectacular black rugged mountains of Norway loomed up ahead. I vowed that one day I would revisit the area, and so I did with Joan during our Norwegian cruise of 1987.

Our base was Scapa Flow where we returned every few days. Occasionally we went ashore and the main treat was a visit to the NAAFI canteen which provided a large dish of bacon and baked beans. Otherwise we spent time in the wardroom eating, drinking and playing shove ha’penny or bar skittles. One day we played hockey against a team of large and ruthless Wrens, who beat us using their sticks with wild abandon.

In July more engineer officers joined the ship and I knew that one of them would occupy the vacant berth in my double cabin. I anxiously watched them come aboard and liked the look of Brian, and was very glad when he was allocated to my cabin. Then began a friendship which has lasted all our lives.

October – November 1944         We returned to Clydebank in October and made preparations for going to the Far East. Then we steamed down to Portsmouth and went into dry dock for maintenance and cleaning the ship’s bottom. After this we were ready for sailing but before doing so on 16th November the King and the Royal Family came aboard to wish us Good Luck. We were all mustered in our divisions on the flight deck, the King inspected us and then asked for a cup of tea. This caused a flap as all the cooks and stewards were mustered, and it took the duty officer nearly half an hour to find some tea and make it!

December 1944         After leaving Portsmouth we sailed to Ceylon, passing through Gibraltar, the Mediterranean, the Suez Canal, and then across the Indian Ocean arriving at Colombo on 10th December. We stopped off Algiers where our Mess Secretary went ashore and triumphantly came back with a large load of Algerian wine, which turned out to be the most awful plonk! We had Admiral Vian, the fighting Admiral, on board and at Colombo he demanded to be ferried ashore immediately in his Admiral’s Barge. This motor boat arrived on board at Portsmouth just before we sailed and was stowed in one of the hangars, where the engine could not be tested. I was in charge of boats and I insisted that the boat should have a trial run before an official trip. The Admiral was furious and came storming down the flight deck demanding an explanation, so I stood to attention quaking in my shoes and gave one. He looked me up and down and said “Right, I will give you ten minutes”. Luckily all went well. Strange how one remembers these things!

During the remainder of the month we spent time at Colombo or Trincomalee storing ship, or at sea exercising with other ships of the Fleet. Trincomalee was a beautiful harbour, and I remember Brian and I were thrilled to bring back a pineapple (which we hadn’t seen for years) to our cabin, but when with due ceremony we slit it open it was full of insects!

January 1945         On New Year’s Day we sailed in company with three other carriers, the battleship King George V, and several cruisers and destroyers for air strikes against the Japanese oil refineries at Palembang in Sumatra. The first strike took place on 4 January and about 100 aircraft took part plus 40 Seafires which provided fighter cover. The refineries were damaged but after returning to Trincomalee it was decided that further strikes would be carried out and they took place on 24 and 29 January. These were major strikes carried out by 144 aircraft for the first and 128 for the second, plus the usual fighter cover. This time the Japanese were well prepared and on several occasions the Fleet came under attack by enemy aircraft. These were fought off by our guns and aircraft, two being shot down close to Indefat. There were many air battles and we lost 41 aircraft together with many of the aircrews. This included several aircraft that were damaged by enemy action and then crashed on deck landing. The worst event was the fate of nine aircrew survivors who had to force land in Sumatra, were made prisoners, taken to Singapore and then later beheaded. The strikes were successful as the refineries produced some 50% of Japanese oil requirements and they were reduced to a standstill, only increasing back to one third capacity by the end of March. After this we steamed south for Australia and crossed the line (the Equator) with King Neptune and his cohorts “coming aboard” on 1st February. I was duly ducked and scrubbed in a makeshift swimming pool.

February 1945         We called in at Fremantle and six days later arrived in Sydney and moored at Wooloomooloo near the Harbour Bridge. The Australians were very hospitable and Brian, Colin and I were “adopted” by the Murray-Jones family with two daughters, Judy and Annabel. They would invite us home for a meal or arrange some tennis or swimming, not that there was much time as we were busy with maintenance and storing for the Pacific. Towards the end of the month we steamed north with the British Pacific Fleet under Admiral Rawlings.

March 1945         After 11 days at sea we arrived at the island of Manus and then went on to Ulithi, another island. This had an enormous harbour and was full of American ships, a total of about 1,400 preparing for the invasion of Japan. Our Fleet then became Task Force 57 operating with the American 3rd Fleet under Admiral Spruance, and consisted of three other Fleet carriers, eleven destroyers and a number of support ships including sloops, frigates, minesweepers, oil tankers and hospital ships. Sailing from Ulithi our first strike took place on 26th March against some of the Japanese islands south of Okinawa where it was estimated that the Japanese had 10,000 aircraft, of which about 4,000 were suicide bombers called Kamikazes.

Then began a series of strike days, each being a long day’s activity for the Fleet, particularly for the ships’ companies of the aircraft carriers. We would go to Action Stations at 0600 and return to Defence Stations at 2000, and periodically a “Flash Red” warning would be broadcast when enemy aircraft approached. Several air battles took place and, throughout the day, the Fleet wheeled and turned in and out of the wind for the carriers to land on and fly off strikes and fighter escorts. When the last aircraft landed on at dusk the air engineering department worked all night to repair, re-arm, and refuel aircraft ready for the next day.

April 1945         On the morning of 1 April we were hit by a Kamikaze which exploded into the flight deck and bridge structure. Because the flight deck had 3″ armour plate the damage was not catastrophic but fourteen of the crew, including the ship’s doctor, were killed and there was a lot of damage to the flight deck barrier gear and bridge communications. I was Damage Control Officer for the area and my team had to remove the casualties and start repairing the damage. I remember the whole area was flooded with hot steam, as the steam-to-ships siren pipework was broken, until I managed to telephone Y boiler room to shut off the master valve.

Peter Sandison’s team did a good job to repair arrester wires and barriers, and the ship was flying off aircraft an hour later, much to the amazement of the American ships and Admirals. The American carriers with light steel decks were very vulnerable and many of their carriers were sunk or badly damaged due to Kamikazes. On 6 and 7 April the Japanese made massive attacks on allied ships with most of them concentrated on American ships to the north of our Fleet. These attacks were made by 600 aircraft, including 355 Kamikazes, and some 380 were shot down but six American ships were sunk and twenty-one damaged. At this time the giant Japanese battleship Yamato came out on a suicide mission and was sunk by American torpedo bombers with a loss of 2,100 men.

Operations continued until the last week of April when our Fleet returned to Leyte island for refitting and oiling, having been at sea continuously for 32 days. By this time sixty support ships had arrived to provide repair and maintenance facilities. During the month I was promoted to Temporary Lieutenant (E) RN and wore my second stripe.

May 1945         On 1st May the Fleet including the carriers Indefatigable, Implacable, Indomitable, Formidable and Victorious left Leyte to resume operations against the Japanese shipping and shore installations, with Action Stations every day except for the odd day when we retired for refuelling by waiting tankers. British ships were essentially designed for Atlantic operations, and consequently there was very little air conditioning to deal with the hot climate of the Pacific. Some of the machinery spaces reached temperatures of 140°F and almost every day one’s boiler suit could be twice soaked with perspiration. After a few weeks one would suffer from prickly heat and would be painted with purple potassium permanganate, so looking like an Ancient Briton! Sleeping at night on the quarter-deck was the most comfortable time. Food was almost all dehydrated or tinned and a staple of the diet was dehydrated potato served in a variety of ways – mashed, cubed, boiled, roast or fried. There were also plenty of tins of egg powder and powdered milk!

On 4 May Formidable was hit by a Kamikaze which caused considerable damage and fires on the flight deck but the ship remained operational. Indomitable was nearly hit by another Kamikaze which was shot down and crashed some thirty feet off the starboard bow. A few days later Formidable was again hit and fires were started in the hangar, and nine aircraft were destroyed. All through this period the enemy pressed home their attacks with great skill and determination, making good use of cloud cover, decoys and variations of height. All five carriers were hit at least once by Kamikazes, but nevertheless our aircraft flew some 2500 sorties, dropped over 500 tons of bombs and destroyed about 60 aircraft, at a loss of 98 aircraft.

June 1945         At the beginning of June we returned to Sydney for vital boiler maintenance, aircraft repairs and other general refitting. This was a welcome relief after 100 days on the ship at sea and again the Murray Jones were very hospitable, so we enjoyed some tennis and swimming off Bondi Beach. Towards the end of June the Fleet sailed north again and resumed operations in co-operation with the American Third Fleet.

July – August 1945         We carried out strikes against the Japanese mainland for the first time, including airfields and installations in the Tokyo area. The routine developed of 4 or 5 days at Action Stations, then a day’s withdrawal for refuelling, and then back again for more strikes. It was a time of Action Stations, watch-keeping, eating and sleeping in a noisy, hot and tiring atmosphere, with some excitement when enemy aircraft appeared. The Flight Deck was again busy from dawn to dusk, sending off bombers and also fighters to protect the Fleet. Unfortunately many did not return, and several had accidents when landing back on. At this stage the whole of the Japanese mainland from north to south was under attack by allied ships, with the Americans concentrating on destroying the remnants of the Japanese navy. The British aircraft bombed industrial targets including shipping, oil storage tanks, railways and factories, and on two occasions the battleship King George V carried out extensive bombardment with her heavy guns.

On 4th August all ships were ordered to withdraw some 300 miles from Japan, and on the 6th the first atomic bomb was dropped on Hiroshima, and then the second on Nagasaki. Further strikes continued until the Japanese finally surrendered on 15th August. The Fleet remained at sea but on the 25th we were hit by a typhoon. The waves were awesome, I remember standing on the flight deck which was 70 ft above normal sea level, and watching waves much higher than this coming towards me. The ship was rolling 35° from one side to the other, but we survived. Three American destroyers capsized, and we saw one American carrier with a large part of its flight deck hanging over its bows, as though it had received a punch on the nose!

September 1945         The Japanese Surrender was signed on the USS Missouri in Tokyo Bay on 2nd September, much to our relief. We remained at sea, and with the American Fleet took part in an enormous “parade” of ships outside Tokyo Bay. Then we spent three days in the Bay, while some of our crew went ashore to find and collect prisoners of war and transport them to hospital ships. The famous Mount Fuji is usually covered in cloud but early one morning the tannoy broadcast that it was visible, and I remember a marvellous view of its snow-capped peak.

After this we steamed back to Sydney arriving towards the end of the month, ready for a respite after 73 days of sea time. It was time to reflect on past events, the worst being during July and August when the Fleet lost over 140 aircraft from all causes, by enemy action or deck landings. Since then there has been a lot of discussion about dropping the atomic bombs and their consequences, but to my mind the following reasons justified the decision.

  1. The Americans estimated that there would be around a half-million Allied casualties if the invasion of Japan had taken place later in the year. This did not happen.
  2. About 40,000 British and Allied prisoners of war were kept by the Japanese in horrendous conditions and most would probably not have survived another winter. They were rescued.
  3. The Japanese had some five thousand aircraft and pilots trained as Kamikazes to be used against an invasion fleet, and we would have been in the forefront of this.

October – December 1945         Indefat remained in Sydney and the crew were allowed a lot of shore leave. The Murray-Jones thoughtfully provided a flat where many of us could stay, including Brian, Colin, Peter Fanghanel and others. One highlight was when all the latter including me made up a party to go ski-ing for a week at Mount Kosciusco. We arrived at the snowline and were then told that the chalet was 12 miles away and could only be reached on skis. Some of us, including me, had not skied before but we were told “Oh, that’s OK, today is Tuesday, and there is a tractor going up on Thursday which could pick you up if you are stuck”!

This was a time of hard work and play as supplies were exhausted, the engines needed refitting, the ship needed cleaning and the typhoon had damaged part of the hull so the ship had to go into dry dock. Some of us were seconded to the dockyard to help out with various jobs and I enjoyed the use of a 500cc motor-bike.

We managed another five days on a sheep farm, again with Brian and Colin. The farm was enormous and the family relied on horses to get around. On the first day we were each provided with a horse, but I viewed this with trepidation. So evidently did the horse, as after 15 minutes he turned round and trotted home, and there was nothing I or anyone else could do to stop him! I decided that I would stick to something with a brake and throttle.

January – March 1946         On 20th January we left Sydney for the journey home. Three days later we arrived at Melbourne where we had a tremendous welcome, with a parade led by the Royal Marines band marching through the streets to the City Hall where the Governor took the salute accompanied by Admiral Vian. We stayed a week and were well entertained, then steamed across the Australian Bight, which was unpleasantly rough, to call in at Fremantle for a few hours before setting off for Capetown.

We arrived at Capetown after 17 days at sea. Again we were well looked after with a reception at the Governor’s Residence and an expedition to Table Mountain. This was the highlight of the visit: we took the cable car to the top with marvellous views all round and then came all the way down on foot. On 24th February we left Capetown, arriving at Gibraltar on 11th March. On the way we passed close to St. Helena and Ascension Island. The Duty Officer went ashore and paid his respects to the Governor, who presented him with a live turtle to make soup! The ship’s butcher did not think much of this so when we left the turtle was returned to the sea, and was last seen swimming happily to the shore.

This part of the trip was pleasant and not too hot, every day there were games of deck hockey on the flight deck using a rope grommet instead of a ball. At Gibraltar we stayed for one day and Peter Fanghanel was the only one of our group who managed to go ashore, he came back with a large case of Tio Pepe sherry.

Finally we arrived at Portsmouth on 16th March and berthed inside the harbour, with crowds lining the Southsea promenade and cheering as we went in. We engineers saw little of this, but we looked forward to a pint at the St. Enoch’s Hotel and then some leave. I think I had about ten days at Westcliff with Mother and Brenda; it was good to see them again after nearly 2½ years.

April – October 1946         The ship sailed again on 25th April with 130 “Bush Brides”, who were brides of Australian servicemen and were going to join their husbands to live in Australia. The voyage again was through the Mediterranean and then a brief stay at Aden. Brian, Colin and I went ashore and we asked some joker the way to the local Club for a drink. “Oh” he said “It’s that white building up on the hill”. So we trudged up the hill, knocked on the front door which was opened by a smart servant who asked what we wanted. We said we would like a drink, to which he replied that this was the Consul’s Residence. Anyway, the Consul was very decent, gave us more than one drink and we went happily back to the ship.

We arrived back in Sydney on 25th May and left again on the 9th June with over 1,000 service personnel due to be demobilised, including some RAF. We also carried 65 tons of food for Britain and about 18,000 gift parcels of food. From Fremantle the engines worked at full power and Indefat made a record-breaking non-stop trip of 21 days to Portsmouth. Then on 29th July we sailed again to Colombo and repatriated another large number of service personnel. The highlight I remember was a visit to Kandy and the Temple of the Sacred Tooth, where we were guided by Buddhist priests in their saffron yellow robes.

The last major event was a parade by the ship’s company on 19th September through Holborn in London, the borough that had “adopted” us during the war. As one of the officers with the longest-serving time in Indefat I was placed in the front rank, and there is a photo in our album. After the march we were inspected by the Mayor and then had a luncheon in the Town Hall, where our Battle Ensign flown by Indefat during Action Stations was presented to be hung in the Council Chamber. The demobilisation process was slow, but I finally left the ship and the Navy on 1st October 1946, after a wardroom party the night before! I well remember going down the gangway, walking through Portsmouth Dockyard and then out through the Main Gate, ready to face a different kind of life and world.


12c Parse

Fri, 2017-10-06 03:07

Following on from a comment to a recent posting of mine about “bad” SQL ending up in the shared pool and the specific detail that too much bad SQL could cause contention problems while staying virtually invisible, there’s a related note today on the ODC (formerly OTN) forum about a little change in 12.2 that alerts you to the problem.

Try executing the following anonymous block (on a non-production system):


declare m1 number;
begin
        for i in 1..10000 loop
        begin
                execute immediate 'select count(*) frm dual' into m1;
                dbms_output.put_line(m1);
        exception
                when others then null;
        end;
        end loop;
end;
/

Then check your alert log (if you want to be a little cautious, change the 10,000 in the loop to something like 200). If you’re running 12.2.0.1 you’ll find something like the following:


ORCL(3):WARNING: too many parse errors, count=100 SQL hash=0x19a22496
ORCL(3):PARSE ERROR: ospid=4577, error=923 for statement:
2017-10-06T03:46:15.842431-04:00
ORCL(3):select count(*) frm dual
ORCL(3):Additional information: hd=0x7673c258 phd=0x765151a8 flg=0x28 cisid=135 sid=135 ciuid=135 uid=135
2017-10-06T03:46:15.842577-04:00
ORCL(3):----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x76734f18         5  anonymous block
ORCL(3):WARNING: too many parse errors, count=200 SQL hash=0x19a22496
ORCL(3):PARSE ERROR: ospid=4577, error=923 for statement:
2017-10-06T03:46:15.909523-04:00
ORCL(3):select count(*) frm dual
ORCL(3):Additional information: hd=0x7673c258 phd=0x765151a8 flg=0x28 cisid=135 sid=135 ciuid=135 uid=135
2017-10-06T03:46:15.909955-04:00
ORCL(3):----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x76734f18         5  anonymous block

The warning will be repeated every hundred occurrences. As you can see the guilty (ORA-00923: missing FROM) SQL appears in the report so you know what you’re looking for. In my particular case, with the silly PL/SQL block, the address of the calling anonymous pl/sql block was also reported:


select sql_text from V$sql where child_address = '0000000076734F18';

SQL_TEXT
--------------------------------------------------------------------------------
declare m1 number; begin  for i in 1..10000 loop  begin   execute immediate 'sel
ect count(*) frm dual' into m1;   dbms_output.put_line(m1);  exception	 when ot
hers then null;  end;  end loop; end;

In the case of the OP on ODC the SQL reported in the alert log was simply: “SELECT 1”. As Billy Verreynne suggested in the thread, this looks like the sort of code that would be sent to the database by some of the connection pooling clients to check that the database is up. Unfortunately (apart from the waste of effort) this particular setup seems to think it’s talking to some database other than Oracle!

 

Footnote:

This is a feature of 12.2 – 11g and 12.1 don’t write such warnings to the alert log.

Lagniappe

A tweet from Mohamed Houri reminds me that parse failures like these, of course, show up in the instance activity stats, in particular:


Name                               Value
----                               -----
opened cursors cumulative         10,006
enqueue requests                  10,002
enqueue releases                  10,002
sql area purged                   10,000
sql area evicted                  10,000
parse count (total)               10,008
parse count (hard)                10,002
parse count (failures)            10,000

The enqueue requests are for the ‘CU’ (cursor) enqueue which, I think, appeared in 10g.

Most of the figures that my session reports here are likely to be highly camouflaged by the rest of the activity from a normal system, so the most important number is the “parse count (failures)” – and it’s useful to know that you can subtract that number from the other statistics to give you an idea of the impact that would be eliminated if you could locate and stop the thing generating the failing statements.
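
If you want to pick out those figures on your own system, a quick sketch against v$sysstat (or v$mystat joined to v$statname if you only want your own session) using the statistic names from the list above would be something like the following:

select
        name, value
from
        v$sysstat
where
        name in (
                'opened cursors cumulative',
                'enqueue requests',
                'enqueue releases',
                'sql area purged',
                'sql area evicted',
                'parse count (total)',
                'parse count (hard)',
                'parse count (failures)'
        )
order by
        name
;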

 


Parsing

Tue, 2017-10-03 10:15

Here’s a quick quiz.

According to the Oracle 12.1 Database SQL Tuning Guide the first stage of parsing a statement is the Syntax Check, which is followed by the Semantic Check, followed by the Shared Pool Check. So where do you think the statement text will be while the Syntax Check is going on ?

 

 

 

 

 

 

And the answer looks like ….

 

 

 

… the shared pool. Here’s a simple test, cut-n-paste from SQL*Plus running under 12.1.0.2 in the SYS schema (I’ve also done this in the past with older versions):

 

 

 


SQL> select user frrom dual;
select user frrom dual
                  *
ERROR at line 1:
ORA-00923: FROM keyword not found where expected

SQL> select kglhdpar, kglhdadr, kglobt03, kglnaobj from x$kglob where kglnaobj like '%frrom%' and kglnaobj not like '%kgl%';

KGLHDPAR         KGLHDADR         KGLOBT03      KGLNAOBJ
---------------- ---------------- ------------- --------------------------------------------------------------------------------
00000000D1EE3880 00000000AEF43F40 bshhvh0ypcz6x select user frrom du
00000000D1EE3880 00000000D1EE3880 bshhvh0ypcz6x select user frrom du

2 rows selected.

SQL> 

So the statement is garbage and fails (or ought to be failing) on a syntax test – but I can find it in x$kglob – the library cache objects – in the SGA with a parent and child entry.

I have to say that when I first read, many years ago, that there was a syntax check it struck me that the fastest way of doing a syntax check on a “good” system would be to start with a search for the text in the library cache. If it were there (and on a good system it probably would be within a few minutes of startup) then you could have a flag that avoided the CPU cost of doing the syntax check and move straight on to the semantic (basically object and permissions) check.

 


markhot

Mon, 2017-10-02 02:42

How can a single piece of SQL text – checked very carefully – end up with multiple SQL_IDs ? There are probably quite a lot of people who know the answer to this question but won’t think of it until they’re reminded and, thanks to a question that came up on the forum formerly known as OTN a couple of days ago, I was reminded about it recently and rediscovered an article I had drafted on the topic a few years ago.

The specific problem on the forum was about having a huge number of child cursors for a single parent thanks to a frequently executed update statement that updated all 50 columns of a table using bind variables to do so. The reason why a single statement could produce so many child cursors seemed to be due to the variation in the lengths of the bind variables supplied – which could well be the consequence of an internal library mechanism rather than an explicit design mechanism written into the client code. One of the comments to my 2007 article suggested event  10503 as a damage limitation mechanism, but there was some problem with it not working as expected at the time. A quick check on MoS now reports bug 10274265 : “EVENT 10503 NOT WORKING ON THE SESSION LEVEL” as fixed in 12.1 with lots of backport patches to various versions of 11.2

Moving on from the back-story, the case that prompted my draft note in 2014 was the simpler one of having lots of sessions constantly executing exactly the same SQL statement and child cursor, so rather than having a latch/mutex type of problem because of a large number of child cursors I was seeing a problem purely because of the level of concurrent access to the same child cursor.  The solution in the version of Oracle the client was using at the time was to tell Oracle to mark the SQL statement as “hot” by setting a hidden parameter; but the mechanism is now officially exposed in a procedure called dbms_shared_pool.markhot() that I learned about a few months ago when I was at a client who had a similar problem with highly concurrent execution of a small set of statements – with the extra twist that the table referenced in the critical statements was a partitioned table which suffered a fairly regular partition exchange.

When a statement (through its “full_hash_value”) is marked as hot, an extra value visible as the property column in v$db_object_cache is set to a value that seems to be dependent on the process id of the session attempting to execute the statement, and this value is used as an extra component in calculating a new full_hash_value (which leads to a new hash_value and sql_id). With a different full_hash_value the same text generates a new parent cursor which is (probably) going to be associated with a new library cache hash bucket and latch. The property value for the original parent cursor is set to “HOT”, and the extra copies become “HOTCOPY1”, “HOTCOPY2” and so on. Interestingly once an object is marked as HOT and the first HOTCOPYn has appeared the original version may disappear from v$sql while still existing in v$db_object_cache.

The number of “HOTCOPYn” versions of the statement is limited by the hidden parameter “_kgl_hot_object_copies” which (according to my notes) defaults to either cpu_count or cpu_count/2. On my most recent test on 11.2.0.4 it seemed to be the latter.
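
If you want to check the limit on your own system, the usual sketch for reading a hidden parameter (connected as SYS, and treated strictly as an informational query) would be something along these lines:

select
        nam.ksppinm             parameter_name,
        val.ksppstvl            parameter_value,
        val.ksppstdf            is_default
from
        x$ksppi                 nam,
        x$ksppsv                val
where
        nam.indx = val.indx
and     nam.ksppinm = '_kgl_hot_object_copies'
;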

Marking a cursor hot

There are three options:

  • set a hidden parameter in the startup file
  • execute an “alter system” command to set the hidden parameter
  • from 11.2.0.3 onwards (possibly earlier in 11.2) call dbms_shared_pool.markhot()

Examples:

Startup file:

_kgl_debug="hash='cc7d5ecdcc9e7c0767456468efe922ea' namespace=0 debug=33554432"

Alter system call with multiple targets:

alter system set "_kgl_debug" =
        "hash='cc7d5ecdcc9e7c0767456468efe922ea' namespace=0 debug=33554432",
        "hash='59a1f6575a5006600792ee802558305b' namespace=0 debug=33554432"
;

markhot() procedure:

begin
        dbms_shared_pool.markhot(
                hash            =>'71fc6ccf9a3265368492ec9fc05b785b',
                namespace       =>0,
                global          =>true
        );
end;
/

The namespace identifies the object as an SQL Cursor (you can mark other types of object as hot if you need to), and for those of a mathematical bent you’ll work out that the debug value is power(2,25).

The value supplied as the hash is the full_hash_value and you can find this in v$db_object_cache either by searching on some string that easily identifies your statement, or by searching v$sql on a string to get the (short) hash value of the statement and using that to search v$db_object_cache on the hash_value column.


select
        hash_value,
        full_hash_value,
        namespace,
        child_latch,
        property        hot_flag,
        executions,
        invalidations
from
        v$db_object_cache
where
        name like '{some part of your critical SQL statement}'
;

I ran into two problems using the markhot() approach. The first was not terribly serious – the second was fatal, except that I’m not going to do it again and I wouldn’t have run into it if I hadn’t been impatient working around the first.

First: if you’ve already got lots of sessions executing the statement and holding cursors open in some way before you call markhot() then it may be some time before all those sessions release the hot parent and child and acquire a “cool” parent and child and unfortunately you can’t call markhot() until at least one session has opened the relevant cursor – and that’s a problem that isn’t relevant if you’ve got the hidden parameter set.

Secondly: although eventually your hot cursor(s) will drop out of use, if you try to get rid of them early by a cunning call to dbms_shared_pool.purge() you may find that you don’t manage to purge them; if you decide to try again, and again (as I did) you may find that your session goes into an infinite CPU spin and no-one can get at the hot cursor.  Be patient, once you’ve marked a cursor as hot your application will (probably) end up spreading itself over the copies.

One last detail – if, for any reason, you decide that a cursor no longer needs to be marked hot there is a procedure dbms_shared_pool.unmarkhot() that takes the same three parameters to clear the property and allow the copies to disappear.
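
For completeness, the call mirrors the markhot() example above – the hash here is just the illustrative full_hash_value I used earlier:

begin
        dbms_shared_pool.unmarkhot(
                hash            =>'71fc6ccf9a3265368492ec9fc05b785b',
                namespace       =>0,
                global          =>true
        );
end;
/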

Footnote

The OTN problem that prompted me to write this note wasn’t about high concurrency levels, it was about mutex contention while searching for the right child cursor. The markhot() procedure doesn’t really look as if it’s designed to address this issue but, as a side-effect of having multiple parent cursors for the same statement text, there should be fewer sessions searching each child-cursor chain at any one moment and this may be enough to reduce the contention. Statistically, of course, every child chain is likely to end up the same length so the amount of shared pool memory used by the SQL statement will eventually grow by a factor matching the number of hot copies produced – but if the problem is contention it may be better (e.g.) to have 16 times the memory used so that 100 concurrent sessions can be spread across 16 different chains rather than having 100 sessions all trying to search the same chain at the same time.

 


With Subquery()

Tue, 2017-09-19 13:19

Here’s a little oddity that came up recently on the OTN database forum – an example where a “with” subquery (common table expression / factored subquery) produced a different execution plan from the equivalent statement with the subquery moved to an inline view; tested in 12.1.0.2 and 12.2.0.1. Here are the two variations:

with  tbl as (
          select 1 col1, 'a'  col2 from dual
union all select 2 , 'a' from dual
union all select 3 , 'b' from dual
union all select 4 , 'a' from dual
union all select 5 , 'a' from dual
union all select 6 , 'b' from dual
union all select 7 , 'b' from dual
),
lag_data as (
        select col1, col2, lag(col2) over (order by col1) col2a from tbl
)
select  col1, col2
from    lag_data
where   col2a is null or col2a <> col2
order by col1
;

with  tbl as (
          select 1 col1, 'a'  col2 from dual
union all select 2 , 'a' from dual
union all select 3 , 'b' from dual
union all select 4 , 'a' from dual
union all select 5 , 'a' from dual
union all select 6 , 'b' from dual
union all select 7 , 'b' from dual
)
select  col1, col2
from    (
        select col1, col2, lag(col2) over (order by col1) col2a from tbl
        )
where   col2a is null or col2a <> col2
order by col1
;

You’ll notice that there’s an explicit “order by” clause at the end of both queries. If you want the result set to appear in a specific order you should always specify the order and not assume that it will appear as a side effect; but in this case the ordering for the “order by” clause seems to match the ordering needed for the analytic function, so we might hope that the optimizer would take advantage of the analytic “window sort” and not bother with a “sort order by” clause. But here are the two plans – first with subquery factoring, then with the inline view:


-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    16 (100)|          |
|   1 |  SORT ORDER BY   |      |     7 |    56 |    16  (13)| 00:00:01 |
|*  2 |   VIEW           |      |     7 |    56 |    15   (7)| 00:00:01 |
|   3 |    WINDOW SORT   |      |     7 |    42 |    15   (7)| 00:00:01 |
|   4 |     VIEW         |      |     7 |    42 |    14   (0)| 00:00:01 |
|   5 |      UNION-ALL   |      |       |       |            |          |
|   6 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   9 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  10 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  11 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  12 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("COL2A" IS NULL OR "COL2A"<>"COL2"))



-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    15 (100)|          |
|*  1 |  VIEW            |      |     7 |    56 |    15   (7)| 00:00:01 |
|   2 |   WINDOW SORT    |      |     7 |    42 |    15   (7)| 00:00:01 |
|   3 |    VIEW          |      |     7 |    42 |    14   (0)| 00:00:01 |
|   4 |     UNION-ALL    |      |       |       |            |          |
|   5 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   6 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   9 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|  10 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|  11 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("COL2A" IS NULL OR "COL2A"<>"COL2"))

The two plans are different, and the difference is an extra “sort order by” operation even though the optimizer has moved the subquery with the analytic function inline, so that in principle both statements are technically the same and merely cosmetically different.

It’s been some time since I’ve noticed subquery factoring resulting in a change in plan when the expected effect is purely cosmetic. Interestingly, though, the “unparsed query” in the 10053 (CBO) trace file is the same for the two cases with the only difference being the name of a generated view:


SELECT  
        "LAG_DATA"."COL1" "COL1","LAG_DATA"."COL2" "COL2" 
FROM    (SELECT 
                "TBL"."COL1" "COL1","TBL"."COL2" "COL2",
                DECODE(
                        COUNT(*) OVER ( ORDER BY "TBL"."COL1" ROWS  BETWEEN 1 PRECEDING  AND 1 PRECEDING ),
                        1, FIRST_VALUE("TBL"."COL2") OVER ( ORDER BY "TBL"."COL1" ROWS  BETWEEN 1 PRECEDING  AND 1 PRECEDING ),
                           NULL
                ) "COL2A" 
        FROM    (
                            (SELECT 1 "COL1",'a' "COL2" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 2 "2",'a' "'A'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 3 "3",'b' "'B'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 4 "4",'a' "'A'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 5 "5",'a' "'A'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 6 "6",'b' "'B'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 7 "7",'b' "'B'" FROM "SYS"."DUAL" "DUAL")
                ) "TBL"
        ) "LAG_DATA" 
WHERE 
        "LAG_DATA"."COL2A" IS NULL OR "LAG_DATA"."COL2A"<>"LAG_DATA"."COL2" 
ORDER BY "LAG_DATA"."COL1"

The above is the unparsed query for the query with two factored subqueries; the only difference in the unparsed query when I moved the analytic subquery inline was that the view name in the above text changed from “LAG_DATA” to “from$_subquery$_008”.

Footnote:

When I used a real table (with the same data) instead of a “union all” factored subquery for the driving data this anomaly disappeared. The union all is a convenient dirty trick for generating very small test data sets on OTN – it remains to be seen whether a more realistic example of multiple factored subqueries would still result in the optimizer losing an opportunity for eliminating a “sort order by” operation.
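
If you want to repeat that check the variation is trivial – load the same seven rows into a real table (I’ve called it driver here, purely for the example) and point the factored subquery at it:

create table driver as
          select 1 col1, 'a'  col2 from dual
union all select 2 , 'a' from dual
union all select 3 , 'b' from dual
union all select 4 , 'a' from dual
union all select 5 , 'a' from dual
union all select 6 , 'b' from dual
union all select 7 , 'b' from dual
;

with lag_data as (
        select col1, col2, lag(col2) over (order by col1) col2a from driver
)
select  col1, col2
from    lag_data
where   col2a is null or col2a <> col2
order by col1
;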

In passing – did you notice how the optimizer had managed to rewrite a “lag()” analytic function as a form of “first_value()” function with decode ?
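
If you want to see the two forms side by side, here’s a little sketch (using the driver table created above) listing lag() next to the decode/first_value() expression lifted from the unparsed query – the two generated columns should match row for row:

select
        col1, col2,
        lag(col2) over (order by col1)          lag_value,
        decode(
                count(*) over (order by col1 rows between 1 preceding and 1 preceding),
                1, first_value(col2) over (order by col1 rows between 1 preceding and 1 preceding),
                   null
        )                                       rewrite_value
from
        driver
order by
        col1
;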


Join Elimination Bug

Mon, 2017-08-14 05:59

A few years ago a bug relating to join elimination showed up in a comment to a post I’d done about the need to keep on testing and learning. The bug was visible in version 11.2.0.2 and, with a script to replay it, I’d found that it had disappeared by 11.2.0.4.

Today I had a reason to rediscover the script, and decided to test it against 12.2.0.1 – and found that the bug was still present.

Here’s the model:


rem     Script:         join_eliminate_bug_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2012

drop table child purge;
drop table parent purge;

create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
        deferrable initially immediate
)
;

create table child(
        id_p    number(4)       
                constraint chi_fk_par
                references parent,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id)
)
;

insert into parent values (1,'Smith');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Jack');
insert into child values(2,2,'Jill');

commit;

begin
        dbms_stats.gather_table_stats(user,'child');
        dbms_stats.gather_table_stats(user,'parent');
end;
/

set serveroutput off

select
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);

The setup is just to show you the correct results with join elimination taking place. Here’s the output from the query and the actual execution plan:

      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally
         2          1 Jack
         2          2 Jill

4 rows selected.


PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  1whubydgj8w0s, child number 0
-------------------------------------
select  chi.* from  child chi,  parent par where  par.id = chi.id_p

Plan hash value: 2406669797

-----------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |    11 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    48 |    11 |
-----------------------------------------------------------

On a single column join, with referential integrity in place, and no columns other than the primary key involved, the optimizer eliminates table parent from the query. But if I now defer the primary key constraint on parent and duplicate every row (which ought to duplicate the query result), watch what happens with the query:


set constraint par_pk deferred;

insert into parent (id,name) values (1,'Smith');
insert into parent (id,name) values (2,'Jones');

alter system flush shared_pool;

select
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);


      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally
         2          1 Jack
         2          2 Jill

4 rows selected.


PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  1whubydgj8w0s, child number 0
-------------------------------------
select  chi.* from  child chi,  parent par where  par.id = chi.id_p

Plan hash value: 2406669797

-----------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |    11 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    48 |    11 |
-----------------------------------------------------------

I get the same plan, so I get the same results – and notice that I flushed the shared pool before repeating the query so I haven’t fooled Oracle into reusing the wrong plan by accident – it’s a whole new freshly optimized plan.

Just to show what ought to happen here’s the last bit of the test case:


select  /*+ no_eliminate_join(@sel$1 par@sel$1) */
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);


      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally
         1          1 Simon
         1          2 Sally
         2          1 Jack
         2          2 Jill
         2          1 Jack
         2          2 Jill

8 rows selected.


PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  5p8sp7k8b0fgq, child number 0
-------------------------------------
select /*+ no_eliminate_join(@sel$1 par@sel$1) */  chi.* from  child
chi,  parent par where  par.id = chi.id_p

Plan hash value: 65982890

-----------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |       |     5 |
|   1 |  NESTED LOOPS                |        |     4 |    60 |     5 |
|   2 |   NESTED LOOPS               |        |     4 |    60 |     5 |
|   3 |    INDEX FULL SCAN           | PAR_PK |     2 |     6 |     1 |
|*  4 |    INDEX RANGE SCAN          | CHI_PK |     2 |       |     1 |
|   5 |   TABLE ACCESS BY INDEX ROWID| CHILD  |     2 |    24 |     2 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("PAR"."ID"="CHI"."ID_P")


I ran this test on 11.2.0.4 – and then repeated it on 12.2.0.1: the bug is still present (although I thought I’d seen a MoS note saying it had been fixed in 12.1).

It’s always a little dangerous playing around with deferrable constraints – my view is that you should keep the interval of deferment as short as possible and don’t try to use it for doing anything other than correcting known data errors. At present if you have code that defers constraints and runs non-trivial queries afterwards you might want that code to start with an “alter session” to set the hidden parameter _optimizer_join_elimination_enabled to false (after checking with Oracle support, of course).
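
As a footnote, here's a minimal sketch of that workaround – the parameter is hidden, so treat this purely as an illustration and check with Oracle support before relying on it (the double quotes are needed for any parameter name starting with an underscore):


alter session set "_optimizer_join_elimination_enabled" = false;

-- do the deferred-constraint work and any dependent queries, then restore the default

alter session set "_optimizer_join_elimination_enabled" = true;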


Rebuilding Indexes

Thu, 2017-08-03 07:00

One of the special events that can make it necessary to rebuild an index is the case of the “massive DML”, typically a bulk delete that purges old data from a table. You may even find cases where it’s a good idea to mark a couple of your indexes as unusable before doing a massive delete and then rebuild them after the delete.
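
Purely as an illustrative sketch of that pattern – the index names and the purge predicate are hypothetical, it isn't an option for the index enforcing the primary key, and it assumes skip_unusable_indexes is left at its default of true so that other DML can carry on while the indexes are unusable:


alter index t1_purge_ix1 unusable;
alter index t1_purge_ix2 unusable;

delete from t1 where transaction_date < add_months(sysdate, -36);
commit;

alter index t1_purge_ix1 rebuild online;
alter index t1_purge_ix2 rebuild online;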

Despite the fact that a massive delete is an obvious special case it’s still not necessary in many cases to worry about a rebuild afterwards because the space made free by the delete will be smoothly reused over time with very little variation in performance. There is, however, one particular feature that increases the probability of a rebuild becoming necessary – global (or globally partitioned) indexes on partitioned tables. The problem (and the absence of problem in non-partitioned tables) is in the nature of the rowid.

For non-partitioned tables, and partitioned tables with local indexes, the rowid stored in an index is (assuming we’re thinking only of heap tables) stored as a sequence of 6 bytes consisting, in order, of: (tablespace relative file number, block number within file, row number within block). If the table is non-partitioned, or if this is an index segment from a locally partitioned index, all the index entries will be pointing to the same table segment and Oracle knows which segment that is from the data dictionary information – so Oracle can derive the data_object_id of the table segment and convert the tablespace relative file number into the absolute file number to navigate to the right row in the table.

When the index is global or globally partitioned any index entry may point to any of the table’s segments, so the rowid that is stored in the index is expanded by a further 4 bytes to hold the data_object_id of the table segment it is pointing to – and the data_object_id is the leading component: (data_object_id, tablespace relative file number, block number within file, row number within block). Think about what this means when you start to drop “old” partitions and add new partitions. Compare this with what happens when you simply delete a large volume of old data from a table and starting inserting new data. There’s an important difference to the way in which indexes will evolve.
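
If you want to see the components that Oracle has to record you can decompose a few rowids with the dbms_rowid package – a quick sketch against any heap table t1 of your choosing:


select
        dbms_rowid.rowid_object(rowid)           data_object_id,
        dbms_rowid.rowid_relative_fno(rowid)     rel_file_no,
        dbms_rowid.rowid_block_number(rowid)     block_no,
        dbms_rowid.rowid_row_number(rowid)       row_no
from
        t1
where
        rownum <= 5
;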

Purging data

When you delete a large volume of data from a (simple, heap) table you will create a lot of empty space in a lot of existing table blocks. If that delete is supposed to get rid of “old” data (and to keep the description simple we’ll assume it’s the first time you’ve done this) it’s very likely that the delete will result in lots of empty blocks near the start of the table – rows that were inserted at the same time will tend to be physically close to each other in the table. This means that future inserts will soon start to reuse those table blocks. Think about what this means for index entries – especially for non-unique keys.

Assume you have 100 rows with value ‘XXX’ for an indexed column. Breaking the rowid into its component parts the index entries will be (‘XXX’,{file_id, block_id, row_number}).  Now assume you delete the oldest 10 rows then, over time, insert 10 more rows with the same key value. You’ve deleted the 10 index entries with the lowest values for (file_id, block_id) but the space that’s become available in the table will be in and around exactly that range of blocks – so the new index entries will probably end up looking very similar to the deleted index entries and inserted in and around the existing index entries for value ‘XXX’, so over time the index is unlikely to allocate much new space.

Now think about what happens when your table is partitioned but the index is global; your index entries are (‘XXX’,{data_object_id, file_id, block_id, row_number}). When you drop the oldest partition you will probably[1] delete all the index entries with the lowest data_object_id. When you start inserting new rows for ‘XXX’ the new table partition will have a data_object_id that is going to be higher than any previous data_object_id – which means you’re going to be inserting rows into the right-hand (high-value) edge of this section of the index. In some cases – typically those where you have a couple of leaf blocks per key value – the index may end up growing significantly because the insertion point for rows in the new partition isn’t in the leaf block with the available space, and it won’t be until you’ve done a few more bulk deletes and the leaf blocks relating to the oldest table partitions become completely empty that the space can be reused.

An example of this type of behaviour probably appeared on the OTN database forum quite recently.  Of course, there are various reasons why indexes can become inefficient, and the degree of inefficiency may only become noticeable over a fairly long period of time; moreover there are various reasons why global indexes are a little problematic, and various reasons why a bulk delete (which is what executing “alter table drop partition” does to a global index) has unpleasant side effects dependent somewhat on the number (and size) of the partitions and on how many you try to drop in one go.

There’s not a lot you can do about this quirk of global indexes, but it’s always worth taking extra care with partitioned tables and focusing even more carefully on a strategic review of indexes:

  • Does this index really need to exist at all?
  • Could this index be replaced by a selective function-based index?
  • Does this index really need to be global / globally partitioned?
  • How big is this index compared to the size it ought to be?
  • Should this index be (basic) compressed?
  • Is this index likely to be disrupted by a historic purge – or is there another reason for its undesirable behaviour?

 

[1] “probably delete all the index entries with the lowest data_object_id” – I have to say “probably” because if you’ve executed a “move partition” at any time a new data_object_id will have been generated for the partition, so the oldest partition could, in principle, have the highest data_object_id. The issue of changing data_object_ids brings a whole new level of complexity to global indexes – but only in a few special cases, fortunately.

 

 


Redo OP Codes:

Tue, 2017-07-25 12:17

This posting was prompted by a tweet from Kamil Stawiarski in response to a question about how he’d discovered the meaning of Redo Op Codes 5.1 and 11.6 – and credited me and Julian Dyke with “the hardest part”.

Over the years I’ve accumulated (from Julian Dyke, or odd MoS notes, etc.) and let dribble out the occasional interpretation of a few op codes – typically in response to a question on the OTN database forum or the Oracle-L listserver, and sometimes as a throwaway comment in a blog post, but I’ve never published the full set of codes that I’ve acquired (or guessed) to date.

It’s been some time since I’ve looked closely at a redo stream, and there are many features of Oracle that I’ve never had to examine at the level of the redo, so there are plenty of gaps in the list – and maybe a few people will use the comments to help fill the gaps.

It’s possible that I may be able to add more op codes over the next few days – I know that somewhere I have some op codes relating to space management, and a batch relating to LOB handling, but it looks like I forgot to add them to the master list – so here’s what I can offer so far:


1	Transaction Control

2	Transaction read

3	Transaction update

4	Block cleanout
		4.1	Block cleanout record
		4.2	Physical cleanout
		4.3	Single array change
		4.4	Multiple array changes
		4.5	Format block
		4.6	ktbcc redo -  Commit Time Block Cleanout Change (?RAC, ?recursive, ?SYS objects)

5	Transaction undo management
		5.1	Update undo block
		5.2	Get undo header
		5.3	Rollout a transaction begin
		5.4	On a rollback or commit
		5.5	Create rollback segment
		5.6	On a rollback of an insert
		5.7	In the ktubl for 'dbms_transaction.local_transaction_id'
			(begin transaction) - also arrives for incoming distributed
			tx, no data change but TT slot acquired. Also for recursive
			transaction (e.g. truncate). txn start scn:  0xffff.ffffffff
		5.8	Mark transaction as dead
		5.9	Rollback extension of rollback seg
		5.10	Rollback segment header change for extension of rollback seg
		5.11	Mark undo as applied during rollback
		5.19	Transaction audit record - first
		5.20	Transaction audit record - subsequent
		5.23	ktudbr redo: disable block level recovery (reports XID)
		5.24	ktfbhundo - File Space Header Undo

6	Control file

10	Index
		10.1	SQL load index block
		10.2	Insert Leaf Row
		10.3	Purge Leaf Row
		10.4	Delete Leaf Row
		10.5	Restore Leaf during rollback
		10.6	(kdxlok) Lock block (pre-split?)
		10.7	(kdxulo) unlock block during undo
		10.8	(kdxlne) initialize leaf block being split
		10.9	(kdxair) apply XAT do to ITL 1	-- related to leaf block split 
		10.10	Set leaf block next pointer
		10.11	(kdxlpr) (UNDO) set kdxleprv (previous pointer)
		10.12 	Initialize root block after split
		10.13	index redo (kdxlem): (REDO) make leaf block empty,
		10.14	Restore block before image
		10.15	(kdxbin) Insert branch block row	
		10.16	Purge branch row
		10.17	Initialize new branch block
		10.18	Update key data in row -- index redo (kdxlup): update keydata
		10.19	Clear split flag
		10.20	Set split flag
		10.21	Undo branch operation
		10.22	Undo leaf operation
		10.23	restore block to tree
		10.24	Shrink ITL
		10.25	format root block
		10.26	format root block (undo)
		10.27	format root block (redo)
		10.28	Migrating block (undo)
		10.29	Migrating block (redo)
		10.30	Update nonkey value
		10.31	index root block redo (kdxdlr):  create/load index
		10.34 	make branch block empty
		10.35	index redo (kdxlcnu): update nonkey
		10.37	undo index change (kdxIndexlogicalNonkeyUpdate) -- bitmap index
		10.38	index change (kdxIndexlogicalNonkeyUpdate) -- bitmap index
		10.39	index redo (kdxbur) :  branch block update range
		10.40	index redo (kdxbdu) :  branch block DBA update,

11	Table
		11.1  undo row operation 
		11.2  insert row  piece
		11.3  delete row piece 
		11.4  lock row piece
		11.5  update row piece
		11.6  overwrite row piece
		11.7  manipulate first column
		11.8  change forwarding address - migration
		11.9  change cluster key index
		11.10 Set Cluster key pointers
		11.11 Insert multiple rows
		11.12 Delete multiple rows
		11.13 toggle block header flags
		11.17 Update multiple rows
		11.19 Array update ?
		11.20 SHK (mark as shrunk?)
		11.24 HCC update rowid map ?

12	Cluster

13	Segment management
		13.1	ktsfm redo: -- allocate space ??
		13.5	KTSFRBFMT (block format) redo
		13.6	(block link modify) (? index )  (options: lock clear, lock set)
		13.7	KTSFRGRP (fgb/shdr modify freelist) redo: (options unlink block, move HWM)
		13.13	ktsbbu undo - undo operation on bitmap block
		13.14	ktsbbu undo - undo operation on bitmap block
		13.17	ktsphfredo - Format Pagetable Segment Header
		13.18	ktspffredo - Format Level1 Bitmap Block
		13.19	ktspsfredo - Format Level2 Bitmap Block
		13.21	ktspbfredo - Format Pagetable Datablock
		13.22	State change on level 1 bitmap block
		13.23	Undo on level 1 bitmap block
		13.24	Bitmap block (BMB) state change (level 2 ?)
		13.25	Undo on level 2 bitmap block 
		13.26	?? Level 3 bitmap block state change ??
		13.27	?? Level 3 bitmap block undo ??
		13.28	Update LHWM and HHWM on segment header
		13.29	Undo on segment header
		13.31	Segment shrink redo for L1 bitmap block
		13.32	Segment shrink redo for segment header block

14	Extent management
		14.1	ktecush redo: clear extent control lock
		14.2	ktelk redo - lock extent (map)
		14.3	Extent de-allocate
		14.4	kteop redo - redo operation on extent map
		14.5	kteopu undo - undo operation on extent map
		14.8	kteoputrn - undo operation for flush for truncate

15	Tablespace

16	Row cache

17	Recovery management
		17.1	End backup mode marker
		17.3	Crash Recovery at scn:  0x0000.02429111
		17.28	STANDBY METADATA CACHE INVALIDATION
	
18	Block image (hot backups)
		18.1	Block image
		18.3	Reuse redo entry 
				   (Range reuse: tsn=1 base=8388753 nblks=8)
				or (Object reuse: tsn=2 objd=76515)

19	Direct loader
		19.1	Direct load block record
		19.2	Nologging invalidate block range
			Direct Loader invalidate block range redo entry

20	Compatibility segment

21	LOB segment 
		21.1	kdlop (Long Feild) redo:  [sic]
				(insert basicfile clob)

22	Locally managed tablespace
		22.2	ktfbhredo - File Space Header Redo:
		22.3	ktfbhundo - File Space Header Undo:
		22.5	ktfbbredo - File BitMap Block Redo:
		22.16	File Property Map Block (FPM)

23	Block writes
		23.1	Block written record
		23.2	Block read record (BRR) -- reference in Doc ID: 12423475.8

24	DDL statements
		24.1	DDL
		24.2	Direct load block end mark
		24.4	?? Media recovery marker
		24.10	??
		24.11	??

(E & O.E) – you’ll notice that some of the descriptions include question marks – those are guesses – and some are little more than the raw text extracted from a redo change vector with no interpretation of what they might mean.
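
If you want to check (or extend) any of these for yourself the usual approach is to generate a little activity and then dump the relevant redo, optionally restricting the dump to a single layer and opcode. The file name below is obviously made up, and the layer/opcode filter is unofficial (though widely known) diagnostic syntax, so treat this as a sketch – the output appears in the session’s trace file:


alter system dump logfile '/u01/app/oracle/arch/1_1234_987654321.arc' layer 11 opcode 6;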

 


Fast Now, Fast Later

Mon, 2017-07-24 07:39

The following is the text of an article I published in the UKOUG magazine several years ago (2010), but I came across it recently while writing up some notes for a presentation and thought it would be worth repeating here.

Fast Now, Fast Later

The title of this piece came from a presentation by Cary Millsap and captures an important point about trouble-shooting as a very memorable aphorism. Your solution to a problem may look good for you right now, but will it still be appropriate when the database has grown in volume and acquired more users?

I was actually prompted to write this article by a question on the OTN database forum that demonstrated the need for the basic combination of problem solving and forward planning. Someone had a problem with a fairly sudden change in performance of his system from November to December, and he had some samples from trace files and Statspack of a particular query that demonstrated the problem.

The query was very simple:

select  *
from    tph
where   pol_num = :b0
order by
        pm_dt, snum

When the query was operating fast enough the trace file from a sample run showed the following (edited) tkprof output, with the optimizer taking advantage of the primary key of (pol_num, pm_dt, snum) on table TPH to avoid a sort for the order by clause. (Note that the heading on the plan is “Row Source Operation” – which means it’s the execution plan that really was used)

call    count    cpu  elapsed  disk  query  current  rows
---------------------------------------------------------
Parse       1   0.01     0.13     0    106        0     0
Execute     1   0.03     0.03     0      0        0     0
Fetch       4   0.01     0.22    46     49        0    43
---------------------------------------------------------
total       6   0.06     0.39    46    155        0    43

Rows Row Source Operation
---- --------------------
  43 TABLE ACCESS BY INDEX ROWID TPH (cr=49 pr=46 pw=0 time=226115 us)
  43   INDEX RANGE SCAN TPH_PK (cr=6 pr=3 pw=0 time=20079 us)(object id 152978)

Elapsed times include waiting on following events:
Event waited on               Times  Max. Wait Total Waited
                             Waited 
-----------------------------------------------------------
db file sequential read          46       0.01         0.21

When the query was running less efficiently the change in the trace didn’t immediately suggest any fundamental problems:


call    count    cpu  elapsed  disk  query  current  rows
---------------------------------------------------------
Parse       1   0.00     0.00     0     51        0     0
Execute     1   0.01     0.01     0      0        0     0
Fetch       4   0.00     0.59    47     51        0    45
---------------------------------------------------------
total       6   0.01     0.61    47    102        0    45

Rows Row Source Operation
---- --------------------
  45 TABLE ACCESS BY INDEX ROWID TPH (cr=51 pr=47 pw=0 time=593441 us)
  45   INDEX RANGE SCAN TPH_PK (cr=6 pr=2 pw=0 time=33470 us)(object id 152978)

Elapsed times include waiting on following events:
Event waited on               Times  Max. Wait Total Waited
                             Waited 
-----------------------------------------------------------
db file sequential read          47       0.03         0.58

The plan is the same, the number of rows returned is roughly the same, and the number of disc reads and buffer gets has hardly changed. Clearly the overall change in performance comes from the slower average disk read times (a total of 0.21 seconds with a maximum of one hundredth of a second, compared to a total of 0.58 seconds with a maximum of 3 hundredths), but why has the disk I/O time changed?

The figures give us a couple of preliminary ideas. An average read time of 4.5 milliseconds ( 0.21 seconds / 46 reads) is pretty good for a “small” random read of a reasonably loaded disc subject to a degree of concurrent access [ed: bearing in mind this was 2010], so the waits for “db file sequential read” in the first tkprof output are probably getting some help from a cache somewhere – possibly a SAN cache at the end of a fibre link or maybe from a local file system buffer (we might get a better idea if we could see the complete list of individual read times).

In the second case an average of 12.3 milliseconds ( 0.58 seconds / 45 reads) looks much more like a reasonable amount of genuine disc I/O is taking place – and the maximum of 30 milliseconds suggests that the disc(s) in question are subject to an undesirable level of concurrent access: our session is spending some of its time in a disk queue. Again, it would be nice to see the wait times for all the reads, but at this point it’s not really necessary.

There are a couple more clues about what’s going on – one is the text of the query itself (and I’ll be coming back to that later) and the other is in the detail of the disk I/Os. If you check the “Row Source Operation” details you’ll see that in the first case the sample query selected 43 rows from the table and requested 43 (46 – 3) physical reads (pr) of the table to do so. In the second case it was 45 rows and 45 (47 – 2) physical reads. Is this simply a case of the same query needing a little more data and having to do a little more work as time passes?

So now we come to the Statspack data. Based on my observations (or guesses) about the nature of the query and the work going on, I asked if we could see some summary information for a couple of comparative intervals, and also to see if this particular query appeared in the “SQL ordered by reads” section of the Statspack reports. Here are the results, first for a snapshot taken in October:


Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
db file sequential read                      3,816,939      58,549     15   79.4
CPU time                                                     7,789          10.6
db file parallel write                         371,865       2,005      5    2.7
log file parallel write                         75,554       1,552     21    2.1
log file sync                                   17,198       1,228     71    1.7

                                                     CPU      Elapsd     Old
 Physical Reads  Executions  Reads per Exec %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
        775,166       43,228           17.9   24.3   212.58  12449.98  1505833981
Module: javaw.exe
SELECT * FROM TPH WHERE POL_NUM = :B1 ORDER BY PM_DT ,SNUM FOR UPDATE NOWAIT

You might notice that the critical query is actually a ‘select for update’ rather than the simple select that we had originally been told about; this doesn’t affect the execution plan, but is going to have some significance as far as undo and redo are concerned.

Now look at the corresponding figures for an interval in December:

Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
----------------------------------------- ------------ ----------- ------ ------
db file sequential read                      7,000,428      92,970     13   89.8
CPU time                                                     6,780           6.5
db file parallel write                         549,286       1,450      3    1.4
db file scattered read                          84,127         720      9     .7
log file parallel write                         41,197         439     11     .4


                                                     CPU      Elapsd     Old
 Physical Reads  Executions  Reads per Exec %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
      2,444,437       43,363           56.4   25.2   221.31  23376.07 1505833981
Module: javaw.exe
SELECT * FROM TPH WHERE POL_NUM = :B1 ORDER BY PM_DT ,SNUM FOR UPDATE NOWAIT

You’ll see in both cases that a huge fraction of the total database time is spent in single block reads (the “db file sequential read” time), but in December the number of reads has gone up by about 3.2 million. You can also see that about 1.7 million of the “extra” reads could be attributed to the critical query even though the number of executions of that query has hardly changed. The average number of reads per execution has gone up from 18 to 56. (I did ask if I could see the section of Statspack titled “SQL ordered by Executions” as this includes the average number of rows per execution and it would have been nice to know whether this average had gone up just a little bit, or whether it had gone up in line with the average reads per execution. Unfortunately the request was ignored, so I am going to proceed as if the change in the average result set was small.)

This, perhaps, tells us exactly what the problem is (and even if it doesn’t, the figures are symptomatic of one of the common examples of non-scalable queries).

Look at the query again – are we reporting all the rows for a “policy number”, ordered by “payment date”? If so, the number of payments recorded is bound to increase with time, and inevitably there will be lots of payments for policies belonging to other people between each pair of payments I make on my policy – and that would put each of my payments in a different table block (if I use a normal heap table).
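
You can get a feel for how scattered a policy’s rows have become with a quick check of the number of distinct table blocks each policy touches – a sketch against the tph table of the example (it will, of course, scan the whole table):


select
        pol_num,
        count(*)                                                rows_per_policy,
        count(distinct dbms_rowid.rowid_block_number(rowid))   blocks_per_policy
from
        tph
group by
        pol_num
order by
        blocks_per_policy desc
;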

Initially the payments table may be sufficiently small that a significant fraction of it stays in Oracle’s data cache or even in the file-system or SAN cache; but as time passes and the table grows the probability of me finding most of my blocks cached decreases – moreover, as time passes, I want increasing numbers of blocks which means that as I read my blocks I’m more likely to knock your blocks out of the cache. Given the constantly increasing numbers of competing reads, it is also no surprise that eventually the average single block read time also increases.

In scenarios like this it is inevitable that performance will degrade over time; in fact it is reasonably likely that the performance profile will degrade slowly to start with and then show an increasingly dramatic plunge. The only question really is how much damage limitation you can do.

One strategy, of course, is to increase the memory available for the critical object(s). This may mean assigning the table to a generously sized KEEP cache. (The cache need not be the same size as the table to improve things, but the bigger the better – for this query, at least). But such a strategy is only postponing the inevitable – you really need to find an approach which is less susceptible to the passage of time.
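
A minimal sketch of the KEEP cache approach – the 2GB figure is simply an assumption for illustration, and you would have to make sure the rest of the buffer cache can afford to give up that memory:


alter system set db_keep_cache_size = 2G;

alter table tph storage (buffer_pool keep);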

In this case, there are a few options to consider. First – note that we are selecting all the rows for a policy: do we really need to, or could we select the rows within a given date range, thus setting an upper limit on the average volume of data we need to acquire for any one policy? If we do that, we may want to think about strategies for summarizing and deleting older data, or using partitioning to isolate older data in separate segments.
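
For example the query might become something like the following – the two-year window is an arbitrary assumption for the sketch:


select  *
from    tph
where   pol_num = :b0
and     pm_dt  >= add_months(sysdate, -24)
order by
        pm_dt, snum
;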

If we can’t deal with the problem by changing the code (and, in this case, the apparent business requirement) can we avoid the need to visit so many data blocks for a single policy? There are two obvious options to consider here – we could create the table as an “index cluster” clustered on the policy number; in this way we pay a penalty each time we insert a new row for a policy because we have to find the correct block in the cluster but when we run a query against that policy we will only need to read one or two blocks (probably) to get all the data. Alternatively we could consider setting the table up as an index-organized table (IOT) – again we do more work inserting data into the correct leaf block in the index but again we reap the benefit as we query the data because all the rows we want are in the same two or three leaf blocks (and stored in the order we want them).
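
As a sketch of the IOT option – the column definitions are pure guesswork since we never saw the original table, but the important detail is that the primary key (pol_num, pm_dt, snum) becomes the physical organisation of the data:


create table tph_iot (
        pol_num         varchar2(20)    not null,
        pm_dt           date            not null,
        snum            number(6)       not null,
        payment_amount  number(10,2),
        constraint tph_iot_pk primary key (pol_num, pm_dt, snum)
)
organization index
;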

Of course we are still subject to the same basic problem of the result set increasing in size as time passes, but at least we have managed to reduce (dramatically) the number of blocks we have to visit and the rate of growth of the number of blocks per query, thereby improving the scalability of the queries significantly.

Introducing new structures to an existing system is difficult, of course, and we may have to work out variations on this theme (like creating an index that includes all the table columns if we can’t switch to an IOT!). The key point is this, though: if we can look at our data and the critical queries and recognize that the volume of data we have to process (even if we don’t return it, as we did in this example) is always going to increase over time, then we need to consider ways of minimizing the volume of data, or improving the packing of data, so that the work we do doesn’t change (much) over time. Don’t just think ‘fast now’, think ‘will it still be fast later’.
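
A sketch of that “index with all the columns” variation, again assuming the hypothetical payment_amount is the only non-key column; basic compression on the leading column is an optional extra:


create index tph_covering_idx on tph (pol_num, pm_dt, snum, payment_amount) compress 1;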

 


In Memoriam – 2

Sat, 2017-07-22 11:10

This is the second of two items that my mother typed out more than 25 years ago. I had very mixed emotions when reading it but ultimately I felt that it was a reminder that, despite all the nasty incidents and stupid behaviour hyped up by the press and news outlets, people and organisations are generally kinder, gentler and more understanding than they were 60 years ago.

This story is about the birth of my brother who was born with a genetic flaw now known as Trisomy 21 though formerly known as Down’s syndrome or (colloquially, and no longer acceptably) mongolism. It is the latter term that my mother uses as it was the common term at the time of birth and at the time she typed her story.

A child is born. The history written by Dorothy Kathleen Lewis (1925 – 2017) about the birth of her first son

My pregnancy was normal. The first indication I had that something was wrong was in the delivery room when the baby was born; there was “oh” and silence then whispers. I asked what was wrong but was told nothing. The baby was put in a cot and the doctor came into the room and then he was taken out. I had not seen the baby – just knew that it was a boy. Again when I asked I was told that this was routine. Eventually the baby was brought back and given to me. When I saw him I thought he looked very odd and was so floppy. When I held him upright I could see he was a mongol, but prayed that I was wrong and this would go away. I asked to be told what the matter was with the baby and was told to tell my husband to ask if I was worried – which made me more suspicious.

Visiting was restricted and I did not see my husband until the evening. Fathers were just shown the babies at the nursery door and were not allowed to hold them. My husband was delighted that we had a little boy and I didn’t have the heart to tell him what I feared.

David had difficulty feeding and was put on a bottle at three days, the teat of the bottle made with a big enough hole for the milk to drip into his mouth because he was not sucking. When we went home, still not having been told of his conditions, he was being fed 8 times a day taking just 1.5 ounces per feed. Each feed took an hour to get into him, then at night it was back to bed for 2 hours and a repeat performance.

I took David to the child welfare clinic and again the actions of the people there spoke volumes, the health visitor hurried into the doctor and I was shown in – jumping the queue. (The clinic was held in our church hall which was next to the vicarage and I was very embarrassed that I should be singled out, although it was obvious why.) I asked the doctor what she thought and she said he did look mongoloid, but perhaps I should see the paediatrician where he was born.

At 6 weeks [ed: see photo] I went for my post natal and there was great concern in the waiting room as to how the baby was getting on. None of the other mothers who were there were being asked. I said he still looks like a mongol. My husband was still not aware of David’s condition or my suspicions, I wanted to protect him from the hurt I was feeling, but now I know it was not the kindest thing to do.

I then took David back to the Middlesex hospital and saw the paediatrician, who took him away from me, and whilst I sat at one end of a very large room he had David on a table at the other end with a group of students. I could not hear what they were saying, but when David was brought back to me I was told: “You have a complete vegetable; he will never walk or talk – just lie in his pram and stare up at the sky. The best thing you can do is to put him in a home and forget you ever had a baby.” I was devastated; I couldn’t run away from it any more. He had an enlarged liver and spleen and his spine was curving outwards. When I held him in my arms it was a little like a floppy parcel and there was no buoyancy at all.

When I got home I couldn’t hold back the tears that had been stifled all those weeks and I had to tell my husband. It was dreadful, I think it would have been better had we been able to grieve together in the beginning.

From then on everything David did was a milestone and he brought us a lot of joy. Just before he was five I had to take him to County Hall in London for assessment. That was a nightmare because by this time I had two other children – little boys – it was necessary to take the older of the two with me, a very busy child. We went into a large room and an elderly fussy lady had a lot of questions for David to answer. He was shown pictures and asked what they were. He hid his face from her and was saying the words to me, many of which he already knew, but because he was not answering her they were crossed out. So his assessment was a very low one.

I don’t think it would have made a lot of difference whether he had answered her he so surely wasn’t school material. He had been going to a junior training centre from the age of 3½ because I was expecting Jonathan. A social worker who came to see me at that time asked what sex I would like my third child to be and I said I didn’t mind so long as I had a normal healthy child, and she said that was a funny answer to give – I didn’t think it funny.

People’s reactions were very different 37 years ago [ed: now 64 years]. Once it was made known that David was as he was people who had known me from childhood would cross the street [to avoid speaking to me], they didn’t know what to say. But we didn’t hide him away and when we went on holiday we just said three children and we sometimes got a reaction when we arrived, but David was always well-behaved and everybody loved him. He learned a great deal from his brothers and I thank God he was our first child.

[Dorothy Kathleen Lewis: Banbury 1990]

Footnote

While copying up this story I was prompted to look at a few statistics from the UK’s Office of National Statistics for 1953 (and 50 years later); in particular the stats about child mortality and measles caught my eye.

Infant mortality for England and Wales

Year   Births    Still-births   Died within 1 week   Died within 4 weeks   Died within 1 year
1953   684,372   15,681         10,127               12,088                18,324
2003   621,469    3,612          1,749                2,264                 3,306

Don’t forget when you read the mortality figures that the 2003 numbers will include births that could be anything up to 8 weeks premature. I think anything more than about 2 weeks premature would probably have ended up in the still-births column in 1953.

Measles (England and Wales).

Year   Cases reported   Deaths
1953   545,050          242
2003     2,048            0

 

