Oracle Performance and Engineering

Random notes on database features, code, practice, optimization and techniques.

Tuesday, April 11, 2006

 

A query tuned today - bad positioning of IN-list

Nothing new - walked in to work this morning, got apprised of some "P1" customer performance issues. Thankfully, we now have a good metric collection in place. Whenever we hear performance issues we collect the following -

The first customer issue -

Fortunately, this customer had already pinned on one SQL that was a performance disaster. They'd uploaded the email trail discussing about the SQL and I used the email to get an idea about the SQL. After pruning the SELECTed fields, the query looked like -


SELECT
task_id, last_update_date,
local_currency_id
FROM
kdrv_my_tasks_v v
WHERE
state_id IN (9, 10, 11, 12)
AND parent_state_id IN (4, 5, 15, 16)
AND master_project_id = '30062'
AND TRUNC (scheduled_start_date) <= (CURRENT_DATE + 1 + 14)
AND 75 < (SELECT SUM (p.percent_complete)
FROM kdrv_projects p
WHERE p.project_id IN
(SELECT pr.predecessor_project_id
FROM kdrv_predecessors pr
WHERE pr.project_id = v.task_id))
AND assignment_user_id = '49745'
ORDER BY scheduled_start_date ASC, task_name;

The statistics and 'explain plan' from the TKPROF-ed trace file looked like -

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.07 0.09 4 54 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 147.20 164.09 141445 57966443 0 50
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6 147.28 164.18 141449 57966497 0 50

Rows Row Source Operation
------- ---------------------------------------------------
25 SORT ORDER BY
160 COUNT STOPKEY
160 NESTED LOOPS OUTER
160 NESTED LOOPS
160 NESTED LOOPS OUTER
160 NESTED LOOPS
285 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECTS
1 INDEX UNIQUE SCAN KDRV_PROJECTS_U1
1 TABLE ACCESS BY INDEX ROWID KNTA_REGIONS
1 INDEX UNIQUE SCAN KNTA_REGIONS_U1
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
1 TABLE ACCESS BY INDEX ROWID KDRV_PROJECT_FIELD_SETUPS
1 INDEX UNIQUE SCAN KDRV_PROJECT_FIELD_SETUPS_U2
285 TABLE ACCESS BY INDEX ROWID KDRV_ASSIGNMENTS
285 INDEX RANGE SCAN KDRV_ASSIGNMENTS_N2
160 TABLE ACCESS BY INDEX ROWID KDRV_PROJECTS
278 INDEX UNIQUE SCAN KDRV_PROJECTS_U1
285 SORT AGGREGATE
7 FILTER
14185875 TABLE ACCESS FULL KDRV_PROJECTS
7 INDEX UNIQUE SCAN KDRV_PREDECESSORS_U2
0 TABLE ACCESS BY INDEX ROWID KCRT_STATUSES
0 INDEX UNIQUE SCAN KCRT_STATUSES_U1
160 TABLE ACCESS BY INDEX ROWID KDRV_PROJECTS
160 INDEX UNIQUE SCAN KDRV_PROJECTS_U1
0 TABLE ACCESS BY INDEX ROWID KCST_CURRENCY_LINES
0 INDEX UNIQUE SCAN KCST_CURRENCY_LINES_U1

So, what do we learn from the query and the plan?

  1. kdrv_my_tasks_v is a view that has the table KDRV_PROJECTS in it's WHERE clause
  2. The blue highlighted part in the query maps with the blue highlighted part in the access plan - and is almost fully responsible for the CPU time (147 secs to full-scan 14 million rows as a result full-scan of kdrv_projects -- sounds logical).
  3. The blue highlighted part in the query has an unnecessary IN-list in the correlated sub-query. This could simply have been an equi-join.

i.e., we could rewrite -

(SELECT SUM (p.percent_complete)
FROM kdrv_projects p
WHERE p.project_id IN
(SELECT pr.predecessor_project_id
FROM kdrv_predecessors pr
WHERE pr.project_id = v.task_id))

to -

(SELECT SUM (p.percent_complete)
FROM kdrv_projects p,
kdrv_predecessors pr
WHERE p.project_id = pr.predecessor_project_id
AND pr.project_id = v.task_id)

Then the query becomes -

SELECT task_id,
last_update_date,
local_currency_id
FROM kdrv_my_tasks_v v
WHERE state_id IN (9, 10, 11, 12) AND
parent_state_id IN (4, 5, 15, 16) AND
master_project_id = '30062'AND
TRUNC (scheduled_start_date) <= (CURRENT_DATE + 1 + 14)AND
75 < (SELECT SUM (p.percent_complete)
FROM kdrv_projects p,
kdrv_predecessors pr
WHERE p.project_id = pr.predecessor_project_id
AND pr.project_id = v.task_id) A
ND
assignment_user_id = '49745'
ORDER BY scheduled_start_date ASC, task_name;

The new Logical IO falls down to just about 1500 (from 57 million) and time taken falls to about 08 secs! There's no more a "full table scan" on kdrv_projects table. On a finishing note, how many times have we seen this type of "redundant" IN-lists with a single table replacing an equi-join?


Comments: Post a Comment



<< Home

Archives

January 2005   February 2005   March 2005   April 2005   May 2005   March 2006   April 2006   August 2006   September 2006   October 2006   January 2007   May 2008  

This page is powered by Blogger. Isn't yours?