Oracle Performance and Engineering
Random notes on database features, code, practice, optimization and techniques.
Monday, February 28, 2005
Can an Uncommitted Transaction Commit inside it?
Well, stress does so many things to you.
In early 2004, I was consulted about a particular design for role-based, scoped data access in a web-based application (a complex role-based data access scheme where the Virtual Private Database model had been rejected, since the logic defining the rules was too complex and slow to be executed for every single access!).
To make it simple, let's say that as an employee of a design consultancy you can see all data belonging to your company, plus the data of other companies' projects that have been assigned to you. An outsourcer who has outsourced work to your company, though able to log in to the same application, can see only his or her project out of the thousands your company may be working on. The access rules (i.e., the SQL queries that define what data you can see or modify) were very complex because this web application was serving a multitude of suppliers, vendors, manufacturers, consultants, employees and more - all different "roles" accessing the SAME database!
However, the crux of the design I was consulted about (I was not invited to design it!) was this: every main table will have a temp table holding the (scoped) primary keys of the main table. Whenever a query is fired against the 'main' table, it should first check the 'temp' table's "staleness" (i.e., whether any row is no longer supposed to be seen, or whether new rows are now supposed to be seen by that particular role). Every single role will have its own "temp" table.
If temp table is stale then
    the 'temp' table is refreshed and
    the query is served
else
    the query is served;
I was, stupidly, trying to write single SQLs to achieve this! A SQL that takes the form of
with stale_n_refresh_check as
(select check_stale(:p_handle_id, 'MAIN_ITEM_MASTERS') col from dual)
select b.* from
TEMP_ITEM_MASTERS_<:p_handle_id> a,
MAIN_ITEM_MASTERS b,
stale_n_refresh_check d
where
a.object_id = b.ITEM_ID
and d.col = 1;
I wanted to make sure that the check_stale function is evaluated first since it's in the WITH clause. Since check_stale on execution *also* refreshes the TEMP_ITEM_MASTERS_XXXX -- the queries will *always* have fresh data.
Well, I was stupid. Very stupid.
I totally forgot the database principle that a query cannot modify data - and even if it somehow does, via a hack like the above, it won't be able to see the change! Of course, that would break the whole C of the ACID properties of transaction management.
But as I said, stress does so many things to you. I even tried ORDERED_PREDICATES - hoping the WITH sub-factored clause would be executed first and TEMP_ITEM_MASTERS_<handle> would only be read within the query after being modified by the function! Yeah, I was so stupid. This is a database, for God's sake.
But the questions are,
(a) can a transaction read committed data that's committed by itself during the transaction? If so, then
(b) can a query - by some means - read the data modified by itself?
So - more secondary questions.
-- How can a transaction go on and still commit data within the scope of itself?
-- How can a query modify data? (Well, it cannot. But what if you are asked to write SELECT to modify data? It's more like those trick interview questions!)
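(For the record, one standard answer to the secondary questions is an "autonomous transaction" - a PL/SQL block that commits independently of its caller. A minimal sketch, with a throwaway table and a made-up function name:)
SQL> create table t_log (ts date);
SQL> create or replace function log_and_count return number
as
pragma autonomous_transaction; -- this function runs as its own transaction
begin
insert into t_log values (sysdate);
commit; -- legal even when called from a query; commits only the autonomous transaction
return 1;
end;
/
SQL> select log_and_count from dual; -- no ORA-14551, and the T_LOG row really is committed
SQL> select count(*) from t_log; -- a *later* statement sees it; the calling query's own snapshot never would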
C.R Talk --
--------
Really liked this knife rack.
Thursday, February 24, 2005
A review of a certain import
I was asked to review a data import procedure document yesterday. The document is pretty elaborate and confidential, so I am skipping it here. But a few of my suggestions should work across the board.
Scope of the review: import, compile objects, and analyze statistics in a new schema.
The original import syntax was something like -
imp userid=system/manager@SID FILE=dumpfile.dmp FROMUSER=GOLD_60 TOUSER=GOLD_60 LOG=imp_clm.log BUFFER=5242880 COMMIT=Y COMPILE=Y CONSTRAINTS=Y FEEDBACK=1000000 GRANTS=Y IGNORE=Y INDEXES=Y ROWS=Y SHOW=N STATISTICS=SAFE
Some of my comments were --
(1) Since the statistics are calculated after the import anyway, why waste time calculating them during the import?
Thus, for both export and import, STATISTICS should be set to "none".
(2) BUFFER could be much higher provided nothing else runs during the import.
(3) " COMMIT =Y" is also a risky option.
For large tables which have millions of records, it will do a lot of commits and thus potentially slow things down.
For tables that have CLOB – commit = Y means the imp will commit after EACH ROW.Agreed,
COMMIT = N runs the risk of running out of rollback segment (it commits after inserting the whole table) – but, may be, for cases where speed is an issue, this could be set to N (only after making sure enough rollback space is available).
(4) Same for COMPILE=Y - the objects are compiled later anyway, so why compile them during the import?
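Putting suggestions (1) to (4) together, the import call might look something like this (the BUFFER value is purely illustrative, and COMMIT=N only after confirming enough rollback space) -
imp userid=system/manager@SID FILE=dumpfile.dmp FROMUSER=GOLD_60 TOUSER=GOLD_60 LOG=imp_clm.log BUFFER=52428800 COMMIT=N COMPILE=N CONSTRAINTS=Y FEEDBACK=1000000 GRANTS=Y IGNORE=Y INDEXES=Y ROWS=Y SHOW=N STATISTICS=NONE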
The analyze command to collect statistics was like -
begin
dbms_stats.gather_schema_stats (ownname => 'User',
cascade => TRUE, method_opt => 'FOR ALL COLUMNS SIZE 1');
end;
/
(5) When we specify method_opt => 'FOR ALL COLUMNS SIZE 1', Oracle calculates ONLY ONE histogram bucket per column. That's not very useful for columns that have skewed data. (Statistically, a total of 75 histogram buckets fits most data skew - and that's Oracle's default bucket count.) We should use 'SIZE AUTO' instead, where Oracle automatically determines the columns on which to collect histograms, based on data skew and column usage.
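For example, the same call with automatic histogram sizing would look like (schema name kept from the original call) -
begin
dbms_stats.gather_schema_stats (ownname => 'User',
cascade => TRUE, method_opt => 'FOR ALL COLUMNS SIZE AUTO');
end;
/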
(6) Also, to use the new 9i CPU-based "cost model" (search for "New Cost Model"), we should start collecting "system statistics" (the CPU speed, disk I/O speed etc. that Oracle collects from the system and stores in the database to help itself cost queries). The call to do so could be --
-- make sure there is at least one job_queue_process
begin
dbms_stats.gather_system_stats (gathering_mode => 'INTERVAL',interval => 360);
-- this will collect system statistics for 6 hours
end;
/
I've seen the cost model change according to the particular hardware and storage, and I would definitely recommend at least trying to gather system statistics to let the Optimizer use the new 9i CPU-based cost model rather than the old "I/O"-based cost model.
The export clauses were not within the scope of the review. However, I noted that "compress=y" could be the biggest offender on the export side. It should always be set to "compress=n" - to avoid tinkering with extent management (a solution to a problem that never existed!).
Wednesday, February 23, 2005
Why so many Consistent Gets on first execution?
Someone asked for an explanation of "too many" consistent gets on the first execution of a query against a simple table which spans only 4 blocks. On the first execution the Consistent Gets (or LIO) figure is 44; on later executions it is 4.
The test case was more like -
12:41:39 SQL> set autotrace on
12:42:02 SQL> set autotrace on statistics
12:42:23 SQL>
12:42:24 SQL> select * from dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Statistics
----------------------------------------------------------
238 recursive calls
0 db block gets
44 consistent gets
4 physical reads
0 redo size
600 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
4 rows processed
12:42:33 SQL>
12:42:37 SQL> /
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
600 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
4 rows processed
12:42:56 SQL> select dbms_rowid.rowid_block_number(rowid) from dept;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
50450
50450
50450
50450
Why consistent gets initially?
This is interesting.
My answer -
Notice the correlation between these -
First execution - Higher LIO
> 238 recursive calls
> 44 consistent gets
Second Execution - Lower LIO
> 0 recursive calls
> 4 consistent gets
i.e., the 'extra' 40 consistent gets were due to the 'hard parse' - which caused 238 'recursive calls'.
If you trace the session and tkprof it with "sys=yes" (the default), you can see those particular recursive calls. They are made to make sure the table exists, that you have the privilege to select from it, and so on.
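(For completeness, a rough sketch of how one might capture that; the trace and output file names below are made up.)
SQL> alter session set sql_trace = true;
SQL> select * from dept;
SQL> alter session set sql_trace = false;
-- then, on the database server, format the raw trace file:
tkprof ora_1234.trc first_parse.txt sys=yes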
Tuesday, February 22, 2005
INSERT APPEND (Direct Insert) - Panacea or Problem?
[Am finally back after a looong 5 day weekend :)
Snowboarding, movies, lots of sleep and catching up on the reading were on agenda. ]
Often I see APPEND hinted in an INSERT statement to speed it up. The Oracle manual says: "In direct-path INSERT, data is appended to the end of the table, rather than using existing space currently allocated to the table. As a result, direct-path INSERT can be considerably faster than conventional INSERT."
Thus an APPEND should do much less work, generate much less redo, and be faster. Since APPEND does not have to do the recursive checks to figure out where it should put the new record(s), an APPEND call also generates fewer recursive calls. True. But there are two issues here.
ONE.
For no redo to be generated, the table must be in "NOLOGGING" mode or the database must be in "NOARCHIVELOG" mode.
Plus, if the table has indexes, redo will be unavoidable.
TWO.
"Insert /*+ APPEND */" grabs a lock on the entire table in EXCLUSIVE mode. Thus any other DML from the application will be waiting. For longer transactions this could indeed be causing waiting locks.
A small test --
-- in SESSION 1
SQL> create table objtab as select * from user_objects where 1 = 2 ;
-- empty table
SQL> insert /*+ APPEND */ into objtab select * from user_objects;
SQL> select object_id from dba_objects where object_name like 'OBJTAB'
OBJECT_ID
------------
361912
SQL> select * from v$lock where sid = (select distinct(sid) from v$mystat);
OR,
SQL> select * from v$lock where sid = userenv('sid');
will show something like -
ADDR KADDR SID TYPE ID1 ID2 LMODE REQUEST CTIME BLOCK
A69B4CF4 A69B4E08 292 TX 196652 739623 6 0 163 0
A6921934 A692194C 292 TM 361912 0 6 0 163 0
Where --
LMODE = lock mode held: 1 (null), 2 (row share), 3 (row exclusive), 4 (share), 5 (share row exclusive), 6 (exclusive)
Thus we see a direct-path insert (APPEND) grabs an "EXCLUSIVE" "TM" lock on the table it
is inserting into. The other row (TYPE = TX, ID1 = 196652) is not a table lock at all - it is the
transaction lock itself, whose ID1/ID2 identify the undo (rollback) segment and slot used by the transaction.
Now, from another session if you try to do a DML (INSERT, UPDATE or DELETE) against
OBJTAB -- it will wait since the main session has not committed or rolled back the direct insert
against OBJTAB.
-- in SESSION 2
SQL> insert into objtab select * from user_objects;
...wait...
-- in SESSION 1
SQL> select SID,TYPE,ID1,ID2,LMODE,REQUEST,CTIME,BLOCK from v$lock
where id1 = 361912
SID TYPE ID1 ID2 LMODE REQUEST CTIME BLOCK
271 TM 361912 0 0 3 120 0
292 TM 361912 0 6 0 1458 1
So, as we see, the original session (SID 292), doing the INSERT /*+ APPEND */ and thus holding an
EXCLUSIVE "TM" lock, is blocking the new session (SID 271).
Note that session 2 (SID - 271) "requests" a "row-exclusive" lock and not an "exclusive" lock on
the whole table like a direct insert does.
If we repeat the experiment with SESSION 1 doing a conventional INSERT -- we will see INSERTS
are not blocking other INSERTS. i.e., locks are acquired in "row-exclusive" mode.
Thus, using direct-path insert could indeed be a case where insert blocks other inserts.
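(A quick way to see the redo side of point ONE for yourself - a sketch reusing the OBJTAB table above; run the statistic query before and after each flavour of insert and compare the deltas.)
SQL> select b.name, a.value
from v$mystat a, v$statname b
where a.statistic# = b.statistic#
and b.name = 'redo size';
SQL> insert /*+ APPEND */ into objtab select * from user_objects;
-- re-run the 'redo size' query, note the delta, roll back, then repeat with a conventional insert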
P.S. TM locks are issued during DML to make sure the table is not dropped or altered
(DDLed against). The total number of such locks can be controlled by setting the initialization
parameter DML_LOCKS. Thus to make a database read only, just set DML_LOCKS = 0 ;)
Well, it sometimes has to be done in a distributed environment (hint!).
Wednesday, February 16, 2005
Transitivity and a real life tuning test case
I often get questions like - "If a=2 and b= 2 should we also write 'a=b' in the query?"
The answer is, just like everything else, yes and no ;)
If
cola = constant and colb = constant, then Oracle optimizer evaluates cola = colb.
i.e., a query like -
SELECT * FROM TAB
WHERE
TAB.COL1 = 2 AND
TAB.COL2 = 2;
will have the rule of transitivity applied to it and be re-written by the Optimizer as -
SELECT * FROM TAB
WHERE
TAB.COL1 = 2 AND
TAB.COL2 = 2 AND
TAB.COL1 = TAB.COL2;
But (and I didn't know this till pretty recently), if
col1 = col3 AND
col2 = col3
the Optimizer does not automatically apply transitivity and assume "and col1 = col2".
In other words, transitivity works for literals and not for columns. And even where transitivity is applied for literals against columns, we should be careful while coding. Using different bind variables, or not binding at all (relying on a cursor_sharing=force type global replacement of unbound literals), when the data model does not specify a good entity relationship (lack of foreign keys in the schema), could let transitivity go for a six. I had one such issue just today --
SELECT bla_bla_bla
FROM
requests r
WHERE r.STATUS_ID in (:b0,:b1) AND
r.ASSIGNED_TO_USER_ID = :b2 AND
r.REQUEST_TYPE_ID in (:b3) AND
exists (SELECT '' FROM
CHECK_V pcv
WHERE
pcv.request_id = r.request_id and
pcv.user_id = :b4)
The query above was stalling all available resources in a load test. The statistics were as follows -
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 9 0.01 0.02 0 0 0 0
Fetch 18 45.37 205.56 0 3336639 0 102
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 28 45.38 205.59 0 3336639 0 102
BINDS #67:
bind 0: dty=1 mxl=32(05) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0010 size=256 offset=0 bfp=b69cf018 bln=32 avl=05 flg=05 value="30049"
bind 1: dty=1 mxl=32(05) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0010 size=0 offset=32 bfp=b69cf038 bln=32 avl=05 flg=01 value="30023"
bind 2: dty=1 mxl=32(05) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0010 size=0 offset=160 bfp=b69cf0b8 bln=32 avl=05 flg=01 value="30542"
bind 3: dty=1 mxl=32(05) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0010 size=0 offset=192 bfp=b69cf0d8 bln=32 avl=05 flg=01 value="30024"
bind 4: dty=1 mxl=32(05) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0010 size=0 offset=224 bfp=b69cf0f8 bln=32 avl=05 flg=01 value="30542"
Taking a quick look at the SQL and trying to figure out if there's any missing join (something I ALWAYS do first!), it seemed to me that "b2" and "b4" are both "USER_ID"s and both are passed the same value - though through different bind variables.
Hmmm... a quick call to the data modeler turned my suspicion into fact.
ASSIGNED_TO_USER_ID is indeed a "foreign key" to USER_ID, and in this particular business case they will always be the same for this query! (This schema does not have any foreign keys. Yeah, yeah. I know. But don't blame me!!)
So, I modified the query by supplying the missing join --
SELECT bla_bla_bla
FROM
requests r
WHERE r.STATUS_ID in (:b0,:b1) AND
r.ASSIGNED_TO_USER_ID = :b2 AND
r.REQUEST_TYPE_ID in (:b3) AND
exists (SELECT '' FROM
CHECK_V pcv
WHERE
pcv.request_id = r.request_id and
pcv.user_id = r.ASSIGNED_TO_USER_ID and pcv.user_id = :b4) ;
The whole world changed after that. LIO (Consistent Gets, Logical I/O - the "query" column in the trace data) was down to 72 from around 350K.
Thus even though we were supplying the same constant for both USER_ID and ASSIGNED_TO_USER_ID --- the transitivity was not applied.
Reasons -
(1) We were using different bind variables for the two constants. Using the same bind variable for both, however, lets the Optimizer apply the transitivity.
and / or
(2) There is no entity relationship between the two columns in the database. That is not the direct reason transitivity was not applied, but it is probably the reason the developer treated the two columns differently while writing the query.
CR talk :
Watched 'SHARK TALE' last night. Not as good as SHREK 2, but definitely worth a watch. I liked the physical similarity between the screen characters and the actors. Agreed, it is difficult to make a vegetarian and friendly shark look like Jack Black (the last movie I watched him in was Envy. That sucked!) - but they did a great job making Sykes the swordfish totally look like Martin Scorsese. Lola - the gold-digger lady fish - also had giant, droopy lips a la Angelina. Onscreen, De Niro was the most impactful with his "Don Leone", the Godfather of Sharks. The DVD has a pretty cool dancing lesson section.
However, I wish they could have shown the billboard of the pretty obvious "Coral Cola - Always Real" a bit less. I hate surrogate advertising. Especially in kids' movies.
Tuesday, February 15, 2005
Test Case: PL/SQL function in WHERE clause of SQL
Yesterday we saw how a PL/SQL function would be called once per row, rather than once per query, when the function is supplied in WHERE clause.
Here's a really simple test case to prove that in the scott/tiger schema.
-- code starts
SQL> create table test(n number);
SQL> insert into test values (20);
SQL> commit;
SQL> create or replace function fun return number
as
t number;
begin
dbms_application_info.set_client_info(sys_context('userenv','client_info')+1 );
-- to increment the value by 1 everytime the function
-- is accessed
select n into t from test;
return t;
-- always returns 20
end;
/
SQL> exec dbms_application_info.set_client_info(0);
-- set it to 0. Otherwise would be NULL
SQL> var test_var number
SQL> exec :test_var:=fun;
SQL> select sys_context( 'userenv', 'client_info') sys_ctx from dual;
SYS_CTX
----------
1
SQL> select * from emp where deptno=:test_var;
-- using a bind variable rather than the function
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
--------- ---------- --------- --------- --------- --------- --------- ---------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7566 JONES MANAGER 7839 02-APR-81 2975 20
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7902 FORD ANALYST 7566 03-DEC-81 3000 20
SQL> select sys_context( 'userenv', 'client_info') sys_ctx from dual;
SYS_CTX
----------
1
-- I am on 10g here. Still just the single call to the function - the one made when :test_var was assigned.
SQL > select * from emp where deptno=fun;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
--------- ---------- --------- --------- --------- --------- --------- ---------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7566 JONES MANAGER 7839 02-APR-81 2975 20
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7902 FORD ANALYST 7566 03-DEC-81 3000 20
SQL> select sys_context( 'userenv', 'client_info') sys_ctx from dual;
SYS_CTX
----------
15
-- So a total of 14 calls to the function?
SQL> select count(*) from emp;
COUNT(*)
----------
14
-- Thus the function IS called once per row!
SQL > select * from emp where deptno=(select fun from dual);
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
--------- ---------- --------- --------- --------- --------- --------- ---------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7566 JONES MANAGER 7839 02-APR-81 2975 20
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7902 FORD ANALYST 7566 03-DEC-81 3000 20
SQL> select sys_context( 'userenv', 'client_info') sys_ctx from dual;
SYS_CTX
----------
16
-- One single call to the function for the whole execution. The optimizer treats it as "data" and not as "code".
SQL> with temp as (select fun from dual)
select * from emp where deptno = (select * from temp);
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
--------- ---------- --------- --------- --------- --------- --------- ---------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7566 JONES MANAGER 7839 02-APR-81 2975 20
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7902 FORD ANALYST 7566 03-DEC-81 3000 20
SQL> select sys_context( 'userenv', 'client_info') sys_ctx from dual;
SYS_CTX
----------
17
-- With WITH clause - 1 call as well!
DBMS_APPLICATION_INFO is a very useful Oracle supplied package to set session-level context variables. We will also use it to create "parameterized views" to deal with complex views where predicates cannot be pushed inside.
Monday, February 14, 2005
PL/SQL function in WHERE clause - Very Dangerous
In my opinion, mixing PL/SQL into SQL is not a very good idea - especially when we have PL/SQL functions in the WHERE clause of a query. Often these functions are put there for code-reuse purposes; at other times, just to do some plain arithmetic on the values scanned.
In such cases I've seen the optimizer execute the PL/SQL function for each and every row of the dataset - even if the function returns the same value every time. Thus the function does not get executed "once per query" but "once per row". Yikes!
This is not restricted to just Oracle. Often the cure is to get rid of the function and rewrite the query using one of the following strategies -
(a) unnest the function by using (possibly!) extra joins with other tables in the main query
(b) if the function-ed column(s) is a candidate for a "function-based index" - create one (a minimal sketch follows this list)
(c) take the function out in an OUTER select thus applying it on a (possibly) reduced number of records. i.e., the inner-subquery does not contain the function in WHERE clause anymore and does most of the processing
(d) use analytic functions, especially when top-, lowest-, sum- type calculations are involved
(e) use "CASE" if the function returns only a handful of constant value(s) depending on the input data
(f) if the function returns constant, factorize it with a "WITH" clause (as shown in the following example!). Believe me, many people still use a PL/SQL function for a "constant"!
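For instance, a minimal sketch of (b), using scott's emp table and a made-up index name, assuming queries keep filtering on UPPER(ename) --
SQL> create index emp_upper_ename_idx on emp (upper(ename));
SQL> select * from emp where upper(ename) = 'SCOTT';
-- the predicate can now use the function-based index instead of calling a PL/SQL wrapper per row
-- (older releases may also need QUERY_REWRITE_ENABLED = TRUE and the QUERY REWRITE privilege)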
Let me give a first hand example -
select ru.user_id app_user_id
from change_map cm, item_versions old, item_versions new, state_transitions tr,
     cm_config cfg, cm_board_roles b, cm_roles r, cm_role_users ru, users u
where cm.change_id = :b2
  and cm.version_id = new.version_id
  and old.version_id(+) = new.predecessor
  and old.item_id (+) = new.item_id
  and old.state_id = tr.from_state or (old.state_id is null and
      tr.from_state = get_entry_state('item_masters'))   -- the suspect line
  and cfg.transition_id = tr.transition_id
  and cfg.cm_board_id = b.cm_board_id
  and b.cm_role_id = r.cm_role_id
  and b.cm_role_id = ru.cm_role_id
  and u.user_id = ru.user_id
  and ru.owner = :b1 and b.owner = :b1 and cm.owner = :b1
  and cfg.owner = :b1 and new.owner = :b1 and old.owner(+) = :b1;
The above query was part of the "workflow" processing. Even with an average of a few thousand rows in the tables mentioned, this query was taking over 4 minutes to execute.
From studying the data model, the query seemed to contain no missing join and seemed to be filtered well. Thus I suspected the line using the PL/SQL function "get_entry_state" (marked above) to be the bottleneck.
I took it out to a performance environment. Changed the code of "get_entry_state" to increment a fixed number by 1 every time the procedure is called. (Tip: also doable by using DBMS_APPLICATION_INFO.SET_CLIENT_INFO to increment the present value of itself by 1 at the beginning of the function).
It was found that the function was called some 14,000 times in a single-user run - exactly the number of records tr (state_transitions) contained! Since state_transitions gets fatter with time, the performance of this query (and the whole module) had been worsening until it became a serious "management issue".
As it turns out, get_entry_state is always passed 'item_masters' and was found to always return -101 (the entry state id for item workflow changes). In other words, it was merely a CONSTANT throughout the query processing - yet the developers still paid for a function call on every row!
The query was rewritten by factorizing the function out using the cool 9i "WITH clause" as follows -
WITH TEMP_DUMMY AS
  (select get_entry_state('item_masters') entry_state from xdual)  -- ONLY ONE execution!
select ru.user_id app_user_id
from temp_dummy, change_map cm, item_versions old, item_versions new, state_transitions tr,
     cm_config cfg, cm_board_roles b, cm_roles r, cm_role_users ru, users u
where cm.change_id = :b2
  and cm.version_id = new.version_id
  and old.version_id(+) = new.predecessor
  and old.item_id (+) = new.item_id
  and old.state_id = tr.from_state or (old.state_id is null and
      tr.from_state = temp_dummy.entry_state)
      -- changed part. Used WITH since the input parameter to the
      -- function remained a 'constant' throughout
  and cfg.transition_id = tr.transition_id
  and cfg.cm_board_id = b.cm_board_id
  and b.cm_role_id = r.cm_role_id
  and b.cm_role_id = ru.cm_role_id
  and u.user_id = ru.user_id
  and ru.owner = :b1 and b.owner = :b1 and cm.owner = :b1
  and cfg.owner = :b1 and new.owner = :b1 and old.owner(+) = :b1;
This time, as expected, the function was executed only once and the execution time was down to 0.32 second!! A 4375% improvement (that's how Management was notified ;) )!
Another way to execute the function ONCE PER QUERY, rather than ONCE PER ROW, was to rewrite the marked line as --
tr.from_state = (select get_entry_state('item_masters') from dual)
Notice that in both cases the function was 'made to look like' a row from some table. In the first solution it was the sub-factored temporary 'WITH' table; in the second it was the DUAL table. The optimizer then treated it as data, not as code, and executed it only once. The beauty of the "dual" solution is that it works across Oracle versions, even where "WITH" sub-factoring is unavailable.
Next, we will demonstrate a reproducible example of this from Oracle's emp-dept tables!
Thursday, February 10, 2005
So why you should NOT totally rely on EXPLAIN PLAN to tune!
Many times, when developers come to me with a particular SQL tuning issue, they are well armed with "explain plan" information. These days it's really easy to see the "explain plan" output (the plan the optimizer is supposed to take to execute the statement) thanks to client-end development tools like TOAD etc. You don't even have to execute the SQL statement to see the plan. The pain is, these tools often put scary red highlights on the likes of full table scans, cartesian joins, etc.
So often developers come and the conversation that follows is like -
Developer : "Hey! The SQL for rolling-up cost on the dashboard is superslow. Ummmm..I guess it's doing a full table scan. Plus,
I guess, there's some kind of weird
cartesian join going on there."
(Why do they say "I guess" so many times anyways?)
Well, well! This is one episode of "Friends" that would be called "Where they knew why there was a problem but were wrong (and say similar things again in future!)".
During the mid-90s people tuned by ratios. A ratio looked good and the system got a high grade of reliability. Imagine if a pilot's blood alcohol test were ratio-based. Say you legislate a rule that if the pilot's blood alcohol level is lower than 1/10th of the average of the passengers', he can be allowed to fly. Yeah, right! Good luck with your honeymoon flight to Hawaii!! Ratios, IMHO, are best used in presentations to people who are not actually doing the work. If you are a CEO presenting business facts to Wall Street types, use ratios. But try telling your employees something like "the average pay hike last year was 11% more than the industry median..". Good luck again! Ratios are a no-no in the field.
Then came the "symptom based" tuning. Run a bunch of scripts, figure out waits, resource crunches and accordingly, add resources. Then buy expensive monitoring software. Design changes? Oh yeah! let's go J2EE "
based". The underlying core database code got left out of the whole thing. "Full table scans should be fixed" is one of the symptoms that's very high on this group's target list.
I have only one word to them. Evolve!
However, as the heading says, EXPLAIN PLAN output - especially from TOOLS - can often be wrong. Not only can it be different from the real execution path of the statement, it can also be different from the execution plan derived from other tools (like SQL*Plus).
Don't believe me?
See this. Here explain plans differed between SQL*Plus and a monitoring tool because the tool was using character string by default for all bind variables!!
What I am trying to say is that, in some cases, none of these may actually be the real execution plan of the statement. The real execution plan of a statement can only be gotten by tkprof-ing the trace file (from 9i onwards, the execution path is shown by default after tkprof-ing a trace file). Note that tkprof also has an "explain" option - and that's really the same as EXPLAIN PLAN in SQL*Plus, i.e., it can be just as misleading and differ from the real one.
Here's what the Oracle Manual says about the EXPLAIN PLAN statement -
"With the query optimizer, execution plans can and do change as the underlying optimizer inputs change. EXPLAIN PLAN output shows how Oracle runs the SQL statement when the statement was explained. This can differ from the plan during actual execution for a SQL statement, because of differences in the execution environment and explain plan environment."
Three key phrases here -
(a) how Oracle runs (for clarity, should be "
would probably have run") the SQL
(b) This can differ from the plan during actual execution
(c) 'because of differences in the ..environments'
So what could change between the "real execution environment" and the "explain plan environment"?
(1) To start, let's say the 'sort area size' or 'hash area size'. After 9i, they are governed by 'workarea_size_policy'. If it's set to AUTO -- Oracle will allocate and de-allocate session memory for sorting, hashing etc to sessions dynamically. That will be a part of the afore-mentioned "environment" change.
(2) A higher value of the parameter "optimizer_dynamic_sampling", for example, samples the structures dynamically at run time. Thus the explain plan - which is derived without actually executing the SQL - could vary from the real plan in the trace file.
(3) The SQL could be aged out of the shared pool
(4) The object could have been re-analyzed or changed, or a session- or system-level parameter that affects the execution path may have been changed in between - and many, many other things.
So, EXPLAIN PLAN really is an indicator of how things are supposed to happen.
To see how things actually did happen one should always either
(a) Trace the SQL / Session and do a "tkprof" (without the EXPLAIN option)
or,
(b) (from Oracle 9i and above only) use the V$SQL_PLAN dynamic performance view, if the SQL is still in the shared pool (V$SQLAREA) - a sketch follows.
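A sketch of (b), assuming the statement can still be found in V$SQL (9i column names; DEPT query used only as an example) --
SQL> select hash_value, address from v$sql where sql_text like 'select * from dept%';
SQL> select id, parent_id, operation, options, object_name
from v$sql_plan
where hash_value = &hash_value
order by id;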
In a nutshell, beware of looking only at the EXPLAIN PLAN and taking an uninformed decision to fix your problem.
Wednesday, February 09, 2005
All about DB_FILE_MULTIBLOCK_READ_COUNT - Part TWO
Yesterday we did a little experiment on how tweaking DB_FILE_MULTIBLOCK_READ_COUNT value changes the execution plan of a SQL. Today we will see how it can be countered by re-adjusting OPTIMIZER_INDEX_COST_ADJ.
We saw how lowering the value of DB_FILE_MULTIBLOCK_READ_COUNT increases the I/O cost of full-table or index fast-full scans and vice versa. Now, we will tweak the parameter
OPTIMIZER_INDEX_COST_ADJ to counter that effect.
First we set it to a very high value; i.e., we tell Oracle that reading an index block is 3 times as expensive as reading a table block. So Oracle should now favour a full-table scan (or index fast-full scan) over an index scan.
alter session set optimizer_index_cost_adj = 300;
select count(*) total from users b
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.05 0.04 0 595 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.06 0.05 0 595 0 1
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=595 pr=0 pw=0 time=47842 us)
300634 INDEX FAST FULL SCAN USERS_N9 (cr=595 pr=0 pw=0 time=601336 us)(object id 323598)
-- Then we change it to a very low value, thus almost forcing index usage
-- (an index full scan is not a multi-block read operation)
alter session set optimizer_index_cost_adj = 3;
select count(*) total from users
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.07 0.07 0 590 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.08 0.07 0 590 0 1
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=590 pr=0 pw=0 time=74051 us)
300634 INDEX FULL SCAN OBJ#(323598) (cr=590 pr=0 pw=0 time=601309 us)(object id 323598)
There is a rough way to calculate the value of optimizer_index_cost_adj to fix such discrepancy in this
document (requires Metalink login).
Some Random (but useful) Notes :
(1) Thus it may appear that the cost of a full-table scan will be
(# of blocks read / db_file_multiblock_read_count)
But in reality the denominator is a scaled-down function of db_file_multiblock_read_count.
Wolfgang Breitling derived the scaling function to be approximately
scaled value = 1.6765 * db_file_multiblock_read_count ** 0.6581
Actual value used by Oracle for a set of popular values of this parameter -
DB_FILE_MULTIBLOCK_READ_COUNT    Value used to "cost" a full scan
-----------------------------    --------------------------------
4                                4.2
8                                6.6
16                               10.4
32                               16.4
64                               25.9
128                              40.9
Thus the "cost" for a full-table scan will be -
(Blocks under High-water mark / Scaled DB_FILE_MULTIBLOCK_READ_COUNT)
Running multiple tests with different sized tables on a 9i I/O costing model supports the cost_of_full_table_scan - DB_FILE_MULTIBLOCK_READ_COUNT correlation.
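To put numbers on it (a made-up illustration): for a table with 10,000 blocks below the high-water mark and db_file_multiblock_read_count = 16, the full-scan cost works out to roughly 10,000 / 10.4, i.e. about 962 - not the 10,000 / 16 = 625 the naive formula would suggest.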
However, from Oracle 9i onwards, if system statistics are collected or set manually, then the MBRC value (the actual multiblock read count observed on the system) from the system statistics table (SYS.AUX_STATS$) is used to compute the cost of a full-table scan. That value is not scaled down further.
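(You can see what MBRC and friends were collected as by querying the statistics table directly - a quick sketch; it needs access to the SYS schema:)
SQL> select pname, pval1 from sys.aux_stats$ where sname = 'SYSSTATS_MAIN';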
(2) Where the optimizer is running in RULE mode, it's beneficial to set this parameter to the maximum useful value. In RULE mode FULL scans are performed only when index scans are not feasible, so when a full-table scan is performed, a higher value of this parameter ensures less work by doing fewer I/Os.
Probably the same holds for FIRST_ROWS (though not always), since full table scans are generally rare in that mode too.
(3) Setting this value for a buffered I/O system is not very useful.
According to
Steve Adams,
"Unfortunately, if you use buffered I/O the operating system will service multiblock reads as a series of small I/O operations anyway and the only benefit from multiblock reads will be a small saving in CPU usage. In this case the multiblock read count should be approximately equal to the typical track length to take advantage of track buffering and no larger lest the optimizer be deceived.
You can tell whether your filesystem is servicing multiblock reads atomically using the sar -d statistics. Divide the blocks per second metric by the reads+writes per second metric to get the average I/O size expressed in 512 byte blocks. If that is consistently close to 16 blocks (8K) while doing "large" multiblock reads then the buffering is indeed breaking your reads up into 8K chunks and you'll need to switch to raw or direct I/O to get the real benefit of multiblock reads. Note that the 8K is not invariant. Many Unix operating systems can also work with 4K file system blocks, and some use 16K."
So, in a nutshell, this parameter affects the choice of a full-table scan (multi-block I/O, or an index fast-full scan where enabled) vs. an index scan (single-block I/O). But from Oracle 9i onwards, where system statistics are collected, this parameter is ignored for costing. In that case the cost is determined by the actual number of database blocks that can be read in a single read and the single- and multi-block read times.
Tuesday, February 08, 2005
All about DB_FILE_MULTIBLOCK_READ_COUNT - Part ONE
What is?
It is the initialization parameter that tells Oracle how many data blocks can be read in one I/O during full-table scans or index fast-full scans (reads which show up under the 'db file scattered read' wait event, though the I/Os are really sequential).
Querying v$parameter will give the present value, which can be changed dynamically in a session or for the whole database.
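For example (the value 32 is arbitrary) --
SQL> select value, isses_modifiable, issys_modifiable
from v$parameter
where name = 'db_file_multiblock_read_count';
SQL> alter session set db_file_multiblock_read_count = 32;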
What should it be set to?
Generally, it should be set to the highest value the operating system allows. Note that (DB_FILE_MULTIBLOCK_READ_COUNT * DB_BLOCK_SIZE) is how much I/O the database will issue in one single read, and this is restricted on two counts.
(a) "
Each version of Oracle on each port, is shipped with a preset maximum of how much data can be transferred in a single read (which of course is equivalent to the db_file_multiblock_read_count since the block size is fixed). For 8i and above (on most platforms) this is 1Mb and is referred to as SSTIOMAX"
Thus, for a block size of 8KB or 16KB, the maximum useful value of db_file_multiblock_read_count is 128 or 64, respectively. So if the change is not reflected in spite of you setting this parameter very high - now you know why.
(b) It also depends on the OS, since the OS has its own maximum allowable I/O size (for example, maxphys in /etc/system dictates the maximum physical I/O size on Solaris).
Steve Adams has a nifty script to find out the maximum working value of this parameter for your particular system.
The script does the following (a rough sketch of the same idea follows this list):
(a) sets the parameter to an exorbitantly high value,
(b) traces a 'forced' full scan (hint: FULL(tablename)) of a fairly large table to capture the wait events,
(c) reads the trace file for the maximum value of "p3" on the db file scattered read wait event.
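A rough sketch of the same idea (BIG_TABLE stands for any table large enough to force plenty of multiblock reads) --
SQL> alter session set db_file_multiblock_read_count = 1000;   -- absurdly high on purpose
SQL> alter session set events '10046 trace name context forever, level 8';  -- level 8 includes waits
SQL> select /*+ FULL(t) */ count(*) from big_table t;
SQL> alter session set events '10046 trace name context off';
-- then look for the largest p3 (blocks per read) on 'db file scattered read' in the trace file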
How optimizer decision is impacted by this
Well, this parameter tells the optimizer something about the 'cost' of a full-table scan or index fast-full scan (which does sequential multi-block I/O, unlike other types of index scan). If the optimizer sees a high value for this parameter, it implies less effort to read the entire table than a lower value would (since smaller I/O chunks == more trips to read the whole thing).
Thus varying this parameter could influence the way optimizer behaves.
Here follows a very simple example --
"USERS" is a fairly large table with over 300K records.
-- db_file_multiblock_read_count = 16 (the usual for the system)
select count(*) from users
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.04 0.02 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.06 0.28 592 595 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.10 0.31 592 595 0 1
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=595 pr=592 pw=0 time=288781 us)
300634
INDEX FAST FULL SCAN USERS_N9 (cr=595 pr=592 pw=0 time=26493966 us)(object id 323598)
-- setting the parameter extremely low
alter session set db_file_multiblock_read_count = 1;
-- same SQL now does a FULL SCAN (not a FAST-FULL scan) since
-- db_file_multiblock_read_count is too low. The optimizer decides a fast full scan or full table scan
-- would be very costly here
select count(*) total from users a;
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.08 0.07 0 590 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.08 0.08 0 590 0 1
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=590 pr=0 pw=0 time=73816 us)
300634
INDEX FULL SCAN USERS_N9 (cr=590 pr=0 pw=0 time=601310 us)(object id 323598)
-- set it back to very high. FAST FULL SCAN is back!
alter session set db_file_multiblock_read_count = 32768;
select count(*) total from users b;
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.07 0.06 0 595 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.07 0.06 0 595 0 1
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=595 pr=0 pw=0 time=63094 us)
300634
INDEX FAST FULL SCAN USERS_N9 (cr=595 pr=0 pw=0 time=601334 us)(object id 323598)
In next instalment we'll see how (possibly) to use "optimizer_index_cost_adj" to counter the effect of "db_file_multiblock_read_count" and some other advanced notes on this parameter.
Monday, February 07, 2005
Solved Test Case : Unnecessary PLSQL Proc. Single SQL will do
So from the
earlier instalment on the test case, it seems the PLSQL procedure does the following -
(1) It selects a bunch of records through a cursor.
(2) For each of the records, it determines the ratio of (EXCEPTION_COUNT / TASK_COUNT) -- both are present in cursor-fetched record
(3) If that ratio is > a threshold value (also present in cursor-fetched record) a "summary_condition" is assigned with each record
(4) The appropriate "rule_id" based on it's YELLOW or RED is also assigned to the record
(5) the EXCEPTION_RULE_ID, SUMMARY_CONDITION and other values are then inserted into a table.
The full sequence can easily be done via a single Nested SQL using "CASE...WHEN..THEN..END" conditional clause.
Here's how it will look -
INSERT INTO exceptions
(exception_rule_id,source_id,exception_message)
(select exception_rule_id,
project_id,
summary_condition
from
(select parent_project_id,
project_id,
(case when summary_condition = 'YELLOW' then yellow_rule_id
when summary_condition ='RED' then red_rule_id
end) exception_rule_id,
summary_condition
from
(select project_id,
parent_project_id,
red_rule_id,
yellow_rule_id,
(case when indicator >= red_rule_threshold then 'RED'
when indicator>= yellow_rule_threshold then 'YELLOW'
end) summary_condition
from
-- the original cursor to populate PLSQL
-- to find the EXCEPTION_RULE_ID and SUMMARY_CONDITION.
(select p.project_id, p.parent_project_id, (pd.EXCEPTION_COUNT / pd.TASK_COUNT) indicator,
pd.red_rule_id, pd.red_rule_threshold,
pd.yellow_rule_id, pd.yellow_rule_threshold
from project p,
project_details pd
where p.master_project_id = v_master_project_id
and p.project_type_code = 'PROJECT'
and p.project_id = pd.project_id
and task_count > 0)
)
);
The time difference was huge! The single SQL above, replacing the old, complex PLSQL procedure that (unnecessarily!) used two PLSQL tables and extra SQL iterations, finished within 1.5 seconds for the same data set that took 80 seconds through the PLSQL!
There are many benefits to replacing whole PLSQL blocks with SQL. The biggest benefit, however, is scalable code. The memory requirement, as well as the variance of processing time with data size, goes down a lot when the whole operation is done through a single set-based SQL operation.
Just as I used "CASE..WHEN..ELSE..END" in the above example, "analytic functions" (available from 8i onwards - *the* greatest new armour for database guys!) could also be used in nested SQL for the same purpose.
Connor McDonald in his great book
"Mastering Oracle PL/SQL - Practical Solutions" illustrates one such example in Chapter 1 - "Efficient PLSQL" (pp 32-47 for the edition I have) under "Don't use PL/SQL to do job of SQL" heading.
In future, I plan to give more examples of PL/SQL code that can be replaced by single SQLs.
Thursday, February 03, 2005
Test Case : Unnecessary PLSQL Proc. Single SQL will do
I've seen that most people doing application programming have a strong procedural language background. Most of them are not so strong on database technology. Thus when they code, procedural 'iterative' thinking is put to work, and database 'set' thinking is mostly absent from development. Databases are all about sets, relations, entities. Procedural languages are all about loops, iterations, logic, if-then.
Many times I've seen that a large PLSQL program unit can be replaced by a single, short, crisp and much more efficient SQL call. That is real tuning: make the code do less work. Remove redundant code. Do the same with less. Merge multiple actions, if possible, into a single call. Replace bulky, complicated code with lean, efficient code. In a phrase: "doing the same work with much less resource".
Here goes a case study which I'm going to analyze in two instalments. Today goes the first one.
In one application the following PLSQL procedure was traced as the bottleneck. This procedure was called 236 times and took around 400 seconds to execute, doing over 8 million I/Os. The code is as follows --
PROCEDURE Rollup
(p_source_entity_id IN NUMBER,
p_master_id IN NUMBER,
p_last_updated_by IN NUMBER)
IS
-- Project info record type
TYPE project_record IS RECORD
( parent_project_id NUMBER,
task_count NUMBER,
exception_count NUMBER,
summary_condition VARCHAR2(20)
);
-- Project info table type
TYPE project_table IS TABLE of project_record
INDEX BY BINARY_INTEGER;
-- Project ID table
TYPE projectID_table IS TABLE of NUMBER
INDEX BY BINARY_INTEGER;
l_project_table project_table;
l_projectID_table projectID_table;
l_index NUMBER DEFAULT 0;
l_work NUMBER;
l_project_id NUMBER;
l_parent_project_id NUMBER;
l_red_threshold NUMBER;
l_yellow_threshold NUMBER;
l_red_rule_id NUMBER;
l_yellow_rule_id NUMBER;
l_exception_rule_id NUMBER;
l_indicator NUMBER;
l_exists VARCHAR2(1);
l_task_count NUMBER;
l_exception_count NUMBER;
-- (1) It selects a bunch of records through a cursor.
CURSOR project_cursor (v_master_project_id NUMBER) IS
select p.project_id, p.parent_project_id, pd.TASK_COUNT,
pd.EXCEPTION_COUNT, pd.red_rule_id, pd.red_rule_threshold,
pd.yellow_rule_id, pd.yellow_rule_threshold
from project p,
project_details pd
where p.master_project_id = v_master_project_id
and p.project_type_code = 'PROJECT'
and p.project_id = pd.project_id
order by p.project_path_id_list desc;
BEGIN
-- It populates two PLSQL tables with the values.
FOR c_projects in project_cursor (p_master_id)
LOOP
l_project_id := c_projects.project_id;
l_project_table(l_project_id).parent_project_id := c_projects.parent_project_id;
l_project_table(l_project_id).task_count := c_projects.task_count;
l_project_table(l_project_id).exception_count := c_projects.exception_count;
l_projectID_table(l_index) := l_project_id;
l_index := l_index + 1;
END LOOP; -- load data
l_work := l_index - 1;
-- It then calculates some other values from the records. The math is totally done in PLSQL.
FOR l_pointer in 0..l_work LOOP
l_project_id := l_projectID_table(l_pointer);
l_parent_project_id:=l_project_table(l_project_id).parent_project_id;
IF (l_project_table(l_project_id).task_count != 0) THEN
l_indicator := l_project_table(l_project_id).exception_count/l_project_table(l_project_id).task_count * 100;
ELSE
l_indicator := 0;
END IF;
IF (l_indicator >= l_red_threshold) THEN
l_project_table(l_project_id).summary_condition := 'RED';
ELSE
l_project_table(l_project_id).summary_condition := 'YELLOW';
END IF;
END LOOP;
FOR l_pointer in 0..(l_index-1)
LOOP
l_project_id := l_projectID_table(l_pointer);
IF (l_project_table(l_project_id).summary_condition = 'RED') THEN
l_exception_rule_id := l_red_rule_id;
ELSE
l_exception_rule_id := l_yellow_rule_id;
END IF;
-- It then INSERTs the values into a table.
INSERT INTO errors(exception_rule_id,source_id,exception_message)
VALUES(l_exception_rule_id,l_project_id,l_project_table(l_project_id).summary_condition);
END LOOP;
END;
/
Here's what the procedure does -
(1) It selects a bunch of records through a cursor.
(2) It then calculates some other values from the records. The math is totally done in PLSQL.
(3) It populates two PLSQL tables with the values.
(4) It then INSERTs the values (both from fetched and computed) into a table.
So, at a high level: some records are fetched, some conditional arithmetic is performed on them, and the result is inserted into a table.
Then can't it be done in plain SQL? Of course it can. With conditional "CASE" expressions we can do that arithmetic in the SQL query itself.
However, there are other coding flaws in the code above. If you look closely -
(1) Why are two PLSQL tables needed to store values from one single cursor? One would have been sufficient.
(2) To match data between the two PLSQL tables, the element in the wider one is subscripted by the value held in the thinner table. Thus we get very stupid-looking, bloated PLSQL table references like -
l_project_table(l_project_id).summary_condition
Or, something like -
wide_array(element_*value*_of_thin_array).element
whereas it should look like -
wide_array(i).element
The subscript is not a plain, monotonically increasing integer. Some crappy coding there!
(3) A FOR-LOOP iteration is used to populate a PLSQL table from a cursor. A BULK COLLECT could easily have been used. Here's another detailed treatise on BULK COLLECT from O'Reilly's popular PL/SQL programming book.
(4) A FOR-LOOP iteration is used to INSERT multiple times into the errors table. A FORALL would have done it in a single operation. (A minimal sketch of (3) and (4) follows this list.)
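A minimal, hypothetical sketch of (3) and (4), using made-up tables src(id, val) and tgt(id, val) rather than the application's own, just to show the shape of the code --
declare
  type num_tab is table of number       index by binary_integer;
  type txt_tab is table of varchar2(20) index by binary_integer;
  l_ids  num_tab;
  l_vals txt_tab;
begin
  -- (3) BULK COLLECT: one fetch instead of a row-by-row FOR loop
  select id, val bulk collect into l_ids, l_vals from src;
  -- (4) FORALL: one bulk insert instead of one INSERT per iteration
  forall i in 1 .. l_ids.count
    insert into tgt (id, val) values (l_ids(i), l_vals(i));
  commit;
end;
/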
However, since the overall work here == inserting some 'SELECT'ed data into some table after fudging a bit; the whole procedure can be replaced by a single SQL. Tomorrow we will show how.
Wednesday, February 02, 2005
DYNAMIC SQL - don't just use it because it's there!!
The following write-up shows how a "high level" problem was broken down and resolved in step-by-step logical improvements.
The Problem -
Today I finally got to investigate a "critical performance issue". Functionally, there was some sort of "clean-up" operation running every 20 minutes and - depending on data volume - was taking more than 20 minutes to run. Thus the next runs were always in queue causing a perpetually spiked CPU.
Analysis -
I won't go into much detail now - tired after a long day - but to diagnose I ran the particular PLSQL code in (a) SQL Trace mode and (b) PLSQL Profiled mode (my favourite for PLSQL -- DBMS_PROFILER rocks. Wish there were a similar tool which would granularize timing data per line (or step) of a SQL query. Agreed it's a tough job!).
From trace data the total time (with our small data subset) for the batch job was 486 seconds - doing 16M IOs. Now, just doing the 16M IOs in our system would take somewhere like 45 seconds flat! So, more work was done in "context switches" - and probably being wasted.
The PLSQL Profile in HTML format (Oracle gives a fantastic script to HTML format PLSQL profile data - proftab.sql) showed more than 90% of time was spent in 4 places. Plus, there was a lot of latch related waits.
Main Culprit? -
When I was scanning the TKPROF output file, I saw a *lot* of statements, each like --
delete from exceptions ex
where source_id = 12234
and source_entity_id = 6
and last_update_date > to_date('...', 'YYYY-MM-DD HH24:MI:SS')
Each of these statements was doing around 10,000 Consistent Gets (a bit too much for the EXCEPTIONS table, which has a concatenated index on SOURCE_ID and SOURCE_ENTITY_ID). The trace plan showed the index was NOT being used!
Source of the problem - Where from? -
A quick glance at the database source
SELECT * FROM USER_SOURCE WHERE UPPER(TEXT) LIKE '%DELETE%EXCEPTIONS EX%';
gave the PLSQL package body name where the statement was coming from (the whole thing was a PLSQL batch job - so I just checked the database stored procs)
And boom!
The package body had -
--- code snippet begins
l_sql_stmt :=
'DELETE FROM exceptions ' ||
' WHERE source_id = ' || p_source_id ||
' AND source_entity_id = ' || p_source_entity_id ||
' AND last_update_date >to_date(''' || to_char(p_last_update_date, 'YYYY-MM-DD HH24:MI:SS') || ''', ''YYYY-MM-DD HH24:MI:SS'')';
l_cursor_id := dbms_sql.open_cursor;
dbms_sql.parse(l_cursor_id, l_sql_stmt, dbms_sql.v7);
l_row_count := dbms_sql.execute(l_cursor_id);
dbms_sql.close_cursor(l_cursor_id);
--- code snippet ends
Why Bad? -
This is one of the WORST pieces of PLSQL coding I've ever seen in my life. And yes, I am not exaggerating at all.
The developer used "DBMS_SQL" (dynamic SQL) and thought it was "cool". But why do we really need dynamic SQL here? We are just passing parameters (values!) to the SQL.
Not only does the statement prevent binding (they not only used dynamic SQL in a totally wrong context, they also made sure the statement does not use bind variables by skipping the DBMS_SQL.BIND_VARIABLE procedure), causing multiple occurrences with hard-coded literals in the trace file; but also - probably because of the dynamic comparison of the numeric source_id to a 'string' (source_id = ' || p_source_id ||') - the index on source_id was not being used!
Here's an example to show why "identifiers" (or, real variables) shall NEVER be bound.
This also explained the huge wait on latches in the trace file since we were not using bind variables in the most repeating SQL in the batch process.
To me, this is a criminal offence. Or well, almost.
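(And even in the rare case where the statement text genuinely has to be dynamic - which it doesn't here - it can still bind. A sketch using native dynamic SQL with the same parameters:)
execute immediate
  'delete from exceptions
    where source_id = :1
      and source_entity_id = :2
      and last_update_date > :3'
  using p_source_id, p_source_entity_id, p_last_update_date;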
Fix -
Just recompiling the package, commenting out the code snippet above and replacing it with -
-- new code snippet starts
delete /*+ index(ex exceptions_n1) */ from exceptions ex
where source_id = p_source_id
and source_entity_id = p_source_entity_id
and last_update_date > p_last_update_date;
-- new code snippet ends
This is STATIC SQL. It makes sure that every time this delete is issued it uses the same plan (static SQL turns the PL/SQL variables into bind variables automatically).
Guess what the time was after just this basic, silly change? Down to 267 seconds - from 486 seconds.
There's always more to do -
There were more changes made. As you can imagine, when there are silly errors like the above in the code, there are 'deeper' issues too (especially with the bulk operations). After fixing those (it would take longer to explain - I used FORALL and BULK COLLECT rather than populating PLSQL tables through FOR iterations, used CASE to compress several SQLs into one, etc.) the whole batch job was finishing in less than 100 seconds. I am still waiting to hear from the functional guy - there's a row-level trigger on the table just for this batch procedure, and I would rather do the trigger's work in one shot after the whole batch job is done. After implementing that change, it should really take around 75 seconds. A 650% improvement of the most annoying code within a few hours - not too bad.
Tuesday, February 01, 2005
CLUSTERING FACTOR - How important this stuff is?
Howard Rogers has neatly defined "Clustering Factor" as "a count of how many visits to the table would you have to make if each entry in the index was read in turn, with consecutive visits to the same table block not being counted".
In other words, the CLUSTERING FACTOR of an index (CF, hereafter ;) ) is really a measure of the total I/O needed if the whole table were read via that index. When the database reads the index, it gets the data block address of the actual row in the table and reads that block. The column in an ordinary index is always ordered (sorted), but the table does not necessarily hold the indexed column(s) in sorted order in its data blocks. CF is a measure of how "ordered" or "sorted" the indexed column(s) really are in the table.
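(You can see the CF Oracle has recorded for your own indexes straight from the dictionary - a quick sketch, table name made up; the column is only populated after ANALYZE / DBMS_STATS:)
SQL> select index_name, clustering_factor, leaf_blocks, num_rows
from user_indexes
where table_name = 'MY_TABLE';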
Observed Notes on CF -
(1) The maximum number of times you can visit a table block-by-block is once per row (provided, of course, that a row fits within a single block read). Thus the maximum value of the clustering factor is the number of rows in the table. At that point the indexed data is very badly ordered in the table, and index range scans suffer accordingly.
(2) The minimum number of table visits in such a block-by-block scheme is equal to the number of non-empty data blocks. If the table has many empty blocks, they will not be read during an index scan, since no index entry in the leaf blocks will have a ROWID that maps to an empty block. (Note: a deleted value has a NULL rowid in an index!)
Thus CF can actually be less than the total number of blocks in the table!
(3) Rebuilding an index, or coalescing it, will not impact the CF. This is normal, since the CF is really more a table property than an index property: it measures the "orderliness of the indexed data in the table". It does not measure anything about the index itself.
(4) However, rebuilding the table (for example, lowering PCTFREE to pack more rows into each block, or doing a "create table as select ... order by indexed_column(s)") will impact the Clustering Factor. In both cases the Clustering Factor should be lower after the table reorg.
(5) Someone equated CF with the number of "Consistent Gets" when the table is read through the index here. But that's only partially true. Let's see how the optimizer is affected by the CF.
According to Wolfgang Breitling (well, this is database basics, but he gives a nice perspective on it w.r.t. Oracle), the COST of a table access through an index range scan is derived from the formula -
_________________________________________________
COST = blevel + (FF * leaf_blocks) + (FF * Clustering_Factor)
_________________________________________________
Where
FF = "Filter Factor", or selectivity: the fraction of the rows being read.
For a UNIQUE index scan, the selectivity is ZERO.
When all rows are read (say, no filter is applied), the selectivity is 1.
Thus for a unique index scan the cost comes out to be independent of Clustering_Factor, since then the FF = 0. Logically this also makes sense, since if only one key of the unique index is read - the physical distribution of that value in the table (only one ROWID!) does not matter at all.
While reading *all* the rows (i.e., FF = 1), the COST of access through the index becomes LARGER (much larger - to be dealt with in a later article) than that of the full-table scan (which, at worst, will equal the CF!).
From the above equation it also appears that the larger the percentage of rows read, the more important CF becomes in calculating the COST. Thus we can say:
CF plays a big role in the costing of large index range scans.
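A made-up illustration of the formula: with blevel = 2, leaf_blocks = 1,000 and Clustering_Factor = 50,000, a 1% range scan (FF = 0.01) costs 2 + (0.01 * 1000) + (0.01 * 50000) = 2 + 10 + 500 = 512, while a 50% range scan costs 2 + 500 + 25,000 = 25,502 - which is exactly why large range scans over a badly clustered index get thrown out in favour of a full scan.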
(6) A quick (and provocative!) way to try to enable range scans of a large index is to rebuild the table as
CREATE TABLE <new_table> AS SELECT * FROM <old_table> ORDER BY <indexed_column(s)>
While this will ORDER the data in the table's data blocks for those indexed column(s), making indexed access to multiple rows cheaper, it will at the same time break whatever other ordering the table had (if there was one). Typically, if an ordinary table has a sequence-generated primary key, the table is more or less sorted by that.
(7) I found a more detailed formula to calculate cost from this article. According to this, the cost of a table access through index scan can be equated as -
_________________________________________________
COST =
(BLEVEL * 9742.88
+ CEIL(FF * Leaf_Blocks) * 9742.88 -- INDEX ACCESS COST
+ CEIL(FF * CF) * 9742.88 -- TABLE BLOCK ACCESS COST
+ FF * CF * 330 -- ROW PROCESSING COST
+ FF * CF * NCOL * 20 ) / 10,000 -- COLUMN PROCESSING COST
_________________________________________________
Where, FF = Filtering Factor or SELECTIVITY
CF = Clustering Factor
NCOL = Number of columns accessed
On the same note, in future articles I plan to get more hands-on with CF and other index parameters.
Here's another good article by Dan Hotka on the Clustering Factor of an index.