Random notes on database features, code, practice, optimization and techniques.
I've seen that most people doing application programming have a strong procedural language background. Most of them are not so strong on database technology. Thus when they code, the procedural 'iterative' thinking is put into work, making database 'set' thinking mostly absent in development. Database is all about sets, relations, entities. Procedural language is all about loops, iterations, logic, if-then.
Many times I've seen that a large PLSQL program unit can really be replaced by a single, short, crisp and much more efficient SQL call. That is real tuning. Make the code do less work. Removing redundant code. Doing the same with less. Merge multiple actions together in (if possible) a single call. Removal of bulk, complicated code. Introduction of mean, lean, efficient code. To phrase it - "doing same work with much less resource".
Here goes a case study which I'm going to analyze in two instalments. Today goes the first one.
In one application the following PLSQL procedure was traced to be the bottleneck. This procedure was called 236 times and took around 400 seconds to execute doing over 8 million IO. The code is as follows --
PROCEDURE Rollup
(p_source_entity_id IN NUMBER,
p_master_id IN NUMBER,
p_last_updated_by IN NUMBER)
IS
-- Project info record type
TYPE project_record IS RECORD
( parent_project_id NUMBER,
task_count NUMBER,
exception_count NUMBER,
summary_condition VARCHAR2(20)
);
-- Project info table type
TYPE project_table IS TABLE of project_record
INDEX BY BINARY_INTEGER;
-- Project ID table
TYPE projectID_table IS TABLE of NUMBER
INDEX BY BINARY_INTEGER;
l_project_table project_table;
l_projectID_table projectID_table;
l_index NUMBER DEFAULT 0;
l_work NUMBER;
l_project_id NUMBER;
l_parent_project_id NUMBER;
l_red_threshold NUMBER;
l_yellow_threshold NUMBER;
l_red_rule_id NUMBER;
l_yellow_rule_id NUMBER;
l_exception_rule_id NUMBER;
l_indicator NUMBER;
l_exists VARCHAR2(1);
l_task_count NUMBER;
l_exception_count NUMBER;
-- (1) It selects a bunch of records through a cursor.
CURSOR project_cursor (v_master_project_id NUMBER) IS
select p.project_id, p.parent_project_id, pd.TASK_COUNT,
pd.EXCEPTION_COUNT, pd.red_rule_id, pd.red_rule_threshold,
pd.yellow_rule_id, pd.yellow_rule_threshold
from project p,
project_details pd
where p.master_project_id = v_master_project_id
and p.project_type_code = 'PROJECT'
and p.project_id = pd.project_id
order by p.project_path_id_list desc;
BEGIN
-- It populates two PLSQL tables with the values.
FOR c_projects in project_cursor (p_master_id)
LOOP
l_project_id := c_projects.project_id;
l_project_table(l_project_id).parent_project_id := c_projects.parent_project_id;
l_project_table(l_project_id).task_count := c_projects.task_count;
l_project_table(l_project_id).exception_count := c_projects.exception_count;
l_projectID_table(l_index) := l_project_id;
l_index := l_index + 1;
END LOOP; -- load data
l_work := l_index - 1;
-- It then calculates some other values from the records. The math is totally done in PLSQL.
FOR l_pointer in 0..l_work LOOP
l_project_id := l_projectID_table(l_pointer);
l_parent_project_id:=l_project_table(l_project_id).parent_project_id;
IF (l_project_table(l_project_id).task_count != 0) THEN
l_indicator := l_project_table(l_project_id).exception_count/l_project_table(l_project_id).task_count * 100;
ELSE
l_indicator := 0;
END IF;
IF (l_indicator >= l_red_threshold) THEN
l_project_table(l_project_id).summary_condition := 'RED';
ELSE
l_project_table(l_project_id).summary_condition := 'YELLOW';
END IF;
END LOOP;
FOR l_pointer in 0..(l_index-1)
LOOP
l_project_id := l_projectID_table(l_pointer);
IF (l_project_table(l_project_id).summary_condition = 'RED') THEN
l_exception_rule_id := l_red_rule_id;
ELSE
l_exception_rule_id := l_yellow_rule_id;
END IF;
-- It then INSERTs the values into a table.
INSERT INTO errors(exception_rule_id,source_id,exception_message)
VALUES(l_exception_rule_id,l_project_id,l_project_table(l_project_id).summary_condition);
END LOOP;
END;
/
Here's what the procedure does -
(1) It selects a bunch of records through a cursor.
(2) It then calculates some other values from the records. The math is totally done in PLSQL.
(3) It populates two PLSQL tables with the values.
(4) It then INSERTs the values (both from fetched and computed) into a table.
So, in a high level - some records are fetched. Some conditional arithmetic is performed on it. It's inserted into a table.
Then, cannot it be done in plain SQL?? Of course, it can. With conditional "CASE" expressions we can do these arithmetic in a SQL query itself.
However, there're other flaws of coding in the code above. If you look minutely -
(1) Why two
PLSQL tables are needed to store values from one single cursor? One would have been sufficient.
(2) Two match data between two PLSQL tables, the element in the wider one is subscripted as the unique value in the thinner table. Thus we have very stupid looking bloated PLSQL table element like -
l_project_table(l_project_id).summary_condition)
Or, something like -
wide_array(element_*value*_of_thin_array).element
Whereas it should look like --
wide_array(i).element
The subscript is not plain monotonically increaseing integer. Some crappy coding there!
(3) Use of FOR-LOOP iteration to populate a PLSQL table from a cursor. A
BULK COLLECT could easily have been used. Here's
another detailed treatise on BULK COLLECT from O'Reilly's popular PL/SQL programming book.
(4) FOR-LOOP iteration to INSERT multiple times in the errors table. A
FORALL would have done it in single operation.
However, since the overall work here == inserting some 'SELECT'ed data into some table after fudging a bit; the whole procedure can be replaced by a single SQL. Tomorrow we will show how.