Oracle Performance and Engineering

Random notes on database features, code, practice, optimization and techniques.

Monday, February 14, 2005

 

PL/SQL function in WHERE clause - Very Dangerous

In my opinion, mixing up PL/SQL in SQL is not a very good idea. Specially when we have PL/SQL functions in the WHERE clause of a query. Often, these functions are put there for code reuse purpose. At other times, just to do some plain arithmetic on the values scanned.

In such cases I've seen the optimizer executing the PL/SQL function for each and every row of the dataset - even if every time the function returns the same value. Thus the function does not get executed "once per query" but is executed "once per row"..Yikes!

This is not restricted to just Oracle database.

Often the cure is to get rid of the function and rewrite the query using one of the following strategies -

(a) unnest the function by using (possibly!) extra joins with other tables in the main query

(b) if the function-ed column(s) is a candidate for "function-based index" - create one

(c) take the function out in an OUTER select thus applying it on a (possibly) reduced number of records. i.e., the inner-subquery does not contain the function in WHERE clause anymore and does most of the processing

(d) using analytical functions, specially when the top-, lowest-, sum- type functions are used

(e) use "CASE" if the function returns only a handful of constant value(s) depending on the input data

(f) if the function returns constant, factorize it with a "WITH" clause (as shown in the following example!). Believe me, many people still use a PL/SQL function for a "constant"!

Let me give a first hand example -

select ru.user_id app_user_id
from
change_map cm,
item_versions old,
item_versions new,
state_transitions tr,
cm_config cfg,
cm_board_roles b,
cm_roles r,
cm_role_users ru,
users u
where
cm.change_id = :b2
and cm.version_id = new.version_id
and old.version_id(+) = new.predecessor
and old.item_id (+) = new.item_id
and old.state_id = tr.from_state or
(old.state_id is null and
tr.from_state = get_entry_state('item_masters'))

and cfg.transition_id = tr.transition_id
and cfg.cm_board_id = b.cm_board_id
and b.cm_role_id = r.cm_role_id
and b.cm_role_id = ru.cm_role_id
and u.user_id = ru.user_id
and ru.owner = :b1
and b.owner = :b1
and cm.owner = :b1
and cfg.owner = :b1
and new.owner = :b1
and old.owner(+) = :b1;

The above query was a part of the "workflow" processing. Even for an average of a few thousands data in the mentioned tables this query was taking over 4 minutes to execute.

From studying the data model, the query seemed to contain no missing join and seemed to have been filtered well. Thus, I suspected the red line using PL/SQL function "get_entry_state" to be the bottleneck.

I took it out to a performance environment. Changed the code of "get_entry_state" to increment a fixed number by 1 every time the procedure is called. (Tip: also doable by using DBMS_APPLICATION_INFO.SET_CLIENT_INFO to increment the present value of itself by 1 at the beginning of the function).

It was found the function was called some 14,000 times in a single-user run - exactly the same number of records tr (state_transitions) contained! Since with time state_transitions table gets fatter -- the performance of this query (and the whole module) has been worsening till it became a serious "management issue".

As it appears, get_entry_state is always passed 'item_masters' and was found out to return -101 (the entry state id for item workflow changes). In other words, it was merely a CONSTANT throughout the query processing, but still the developers overdid the mere task by calling a function!

The query was rewritten by factorizing the function out using the cool 9i "WITH clause" as follows -

WITH TEMP_DUMMY AS
( select get_entry_state('item_masters') entry_state from xdual)
-- ONLY ONE execution!
select ru.user_id app_user_id
from
change_map cm,
item_versions old,
item_versions new,
state_transitions tr,
cm_config cfg,
cm_board_roles b,
cm_roles r,
cm_role_users ru,
users u
where
cm.change_id = :b2
and cm.version_id = new.version_id
and old.version_id(+) = new.predecessor
and old.item_id (+) = new.item_id
and old.state_id = tr.from_state or
(old.state_id is null and
tr.from_state = temp_dummy.entry_state)
-- changed part. Used WITH since the input parameter to the
-- function remained a 'constant' throughout

and cfg.transition_id = tr.transition_id
and cfg.cm_board_id = b.cm_board_id
and b.cm_role_id = r.cm_role_id
and b.cm_role_id = ru.cm_role_id
and u.user_id = ru.user_id
and ru.owner = :b1
and b.owner = :b1
and cm.owner = :b1
and cfg.owner = :b1
and new.owner = :b1
and old.owner(+) = :b1;

This time, as expected, the function was executed only once and the execution time was down to 0.32 second!! 4375% improvement (that's how Management was notified ;) )!

Another way to execute the function ONCE PER QUERY, rather than ONCE PER ROW was to rewrite the red line as --

tr.from_state = (select get_entry_state('item_masters') from dual)

Notice, in both cases, the function was 'made to look like' a row from some table! In the first solution, it was the sub-factorized temporary 'WITH' table. In the second, it was DUAL table. Then optimizer treated it as data, and not as code and thus executed it only once.

The beauty of the "dual" solution is - it will work across Oracle versions even when "WITH" sub-factoring is unavailable.

Next, we will demonstrate a reproducible example of this from Oracle's emp-dept tables!

Comments: Post a Comment



<< Home

Archives

January 2005   February 2005   March 2005   April 2005   May 2005   March 2006   April 2006   August 2006   September 2006   October 2006   January 2007   May 2008  

This page is powered by Blogger. Isn't yours?