Oracle Performance and Engineering

Random notes on database features, code, practice, optimization and techniques.

Wednesday, March 30, 2005

 

Storage of (fixed or variable) Multi-byte data in varchar2

Took the 10g upgrade exam this Sunday. It was *long*. It took me about 2 hours 15 mins to crack all 161 questions. Was a good jump-start on 10g.

One of the applications I advise had a client issue yesterday. Apparently they are "UTF8 enabled", yet when customers tried to store 250 multi-byte characters in a VARCHAR2(250) column it failed with "ORA-12899: value too large for column <>".

I was asked if there's something to be done on the Oracle side. Of course there is. In fact, this is an issue that's supposed to be handled entirely on the Oracle side.

Here's what I wrote in different emails back and forth on the issue -

"Yes, there's an Oracle cure :)

Oracle lets you declare the data type in "characters" rather than "bytes" (the default). That is, you can declare a field "name" to be a varchar2 of 40 bytes or 40 characters.

In a single-byte character set the name will hold 40 characters (= 40 bytes). In a double-byte character set the name will still hold 40 characters (80 bytes). Oracle auto-adjusts the byte allowance when you declare the length semantics in 'characters'.

The syntax looks like -

varchar2(20 byte) -- byte semantics; if neither is specified, this is the default
varchar2(20 char) -- character semantics

This is doable from 9i onwards.
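Here's a minimal sketch of the difference (the table and column names are made up for illustration):

create table demo_semantics
( name_bytes varchar2(10 byte),  -- room for 10 bytes, however many characters that turns out to be
  name_chars varchar2(10 char)   -- room for 10 characters, whatever that costs in bytes
);

-- A 10-character multi-byte string would overflow name_bytes
-- (value too large for column) but fits comfortably in name_chars.
insert into demo_semantics (name_chars) values ('0123456789');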

We can change the "default way" Oracle stores strings by adjusting NLS_LENGTH_SEMANTICS (i.e., whether Oracle should follow "byte" or "character" semantics). The default length semantic, naturally, is byte. However, this parameter should not be flip-flopped in a database, since that leads to a mixed-semantics database.

Any table created *after* the NLS_LENGTH_SEMANTICS change will inherit the "char" semantics. However, old tables will continue to use the "byte" semantic.

I was trying with different use cases against a 9i database. Here're the notes on findings -

(1) On changing "byte" semantic to "char" -

If we change NLS_LENGTH_SEMANTICS to CHAR -- all the old tables still are maintained in "Byte Semantic" and new tables are maintained in "Char Semantic".

(2) Effect on "char" semantic on named PL/SQL -

If we change NLS_LENGTH_SEMANTICS to CHAR, all named PL/SQL code will use character semantics, i.e., VARCHAR2(20) inside a *named* PL/SQL unit (as opposed to an anonymous PL/SQL block) will behave as VARCHAR2(20 CHAR).

Anonymous PL/SQL blocks, however, have to be recompiled.

(3) Whether VARCHAR2 can store more than 4000 bytes with "char" semantics -

Even if we change NLS_LENGTH_SEMANTICS, we are still bound by the 4000-byte maximum of VARCHAR2.

Let's do some math here - the maximum number of bytes a character can take is 3 for UTF8 (Oracle 8i); 4 bytes for AL32UTF8 (Oracle 9i - this is what we'll use for the database character set, i.e., for VARCHAR2s); and 2 bytes for AL16UTF16 (the national character set, used for the NVARCHAR2 data type).

Thus, in the worst case (every character taking its maximum bytes), a VARCHAR2(4000) is only guaranteed to hold 1333 characters in UTF8 (8i) and 1000 in AL32UTF8 (9i).

Here's a nice table I found on Oracle metalink -


Data type    UTF8 (1-3 bytes)         AL32UTF8 (1-4 bytes)     AL16UTF16 (2 bytes)
             max bytes / min chars    max bytes / min chars    max bytes / min chars

CHAR         2000 / 666               2000 / 500               N/A
VARCHAR2     4000 / 1333              4000 / 1000              N/A
NCHAR        2000 / 666               N/A                      1000 / 1000
NVARCHAR2    4000 / 1333              N/A                      2000 / 2000

N/A means Not possible!

However, if the declared width of the column is well below the 4000-byte max, the full declared number of characters can be stored.

i.e., a VARCHAR2(100 char) will store 100 characters, taking at most 400 bytes with AL32UTF8 (Unicode 3.0 encoding).

Note - PL/SQL code can handle up to 32K bytes in a VARCHAR2 variable!

(4) What if we need to store more than 4000 bytes in a varchar2 field for multi-byte characters without changing the length semantics -

We need to resort to using NVARCHAR2.

NVARCHAR2 falls under "national character set" and *ALWAYS* stores data in "character" semantic, even if the NLS_LENGTH_SEMANTICS is set to be "byte".

(5) What's the easiest way to solve the problem "a varchar2(250) field cannot take 250 characters because it was declared with 'byte' semantics" -

Case 1 -- Easiest for an already up and running database --

ALTER TABLE <table_name> MODIFY (problem_column VARCHAR2(250 CHAR));

OR

ALTER TABLE <table_name> MODIFY (problem_column NVARCHAR2(250));

Change the database nls_length_semantics to CHAR to let PL/SQL handle these extra bytes for multi-byte characters.
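To verify which semantics a column actually uses after the change, a dictionary query like this should do (9i onwards; CHAR_USED is 'B' for byte and 'C' for character semantics - the table name is just a placeholder):

select column_name, data_length, char_length, char_used
from user_tab_columns
where table_name = 'PROBLEM_TABLE';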


Case 2 -- Easiest for a "fresh" Install --

Alter system set nls_length_semantics = char;

-- create all the tables or Import them fresh.

If the problem happens only with a few tables, I would say Case 1 is the better way to go. For a fresh install, or for instance-wide multi-byte data, Case 2 is the one.

(6) What's the performance implication of "char" semantic?

Well, when Oracle stores strings with "character" semantics, Oracle

(a) cannot do the simpler "byte" arithmetic for string computations. It then has to count characters, not bytes, to find start and end positions. So string-intensive operations - particularly SUBSTR(:string,1,40) types - are slower. I'll have to benchmark to find out just how much slower. I would guess the "LIKE" operator would also be slower.
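A quick way to see the character-versus-byte difference for yourself is to compare LENGTH with its byte-counting cousin LENGTHB (a trivial sketch; substitute any multi-byte string):

select length('<some multi-byte text>')  as character_count,
       lengthb('<some multi-byte text>') as byte_count
from dual;

-- For single-byte data the two are equal; with multi-byte data LENGTHB is larger,
-- which is exactly the extra bookkeeping that 'char' semantics forces on string functions.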

But then I received an interesting feedback from a co-worker -

"In regard to performance implications resulting from not being able to do byte-level arithmetic for "short-cutting" string manipulation and comparison calculations, the detrimental effect of this is probably marginal.

For example, all string-manipulation done internally by win32 (MS Windows NT and later) is multi-byte. This doesn't cause a major slow-down (not even in heavily multi-byte versions, such as Chinese, Japanese, Korean). Unless Oracle does something radically different, and wildly inefficient, this is probably not a concern for us."

(b) (obviously), increases the space requirements.


Saturday, March 26, 2005

 

More 10g New Feature Notes (Cont).

11) Oracle Database 10g: Maximize Availability - Backup and Recovery Enhancements

A) Simplified Recovery -

No backup needed after RESETLOGS operation. Old backups still work. Earlier, old backup files could not be used again in the new incarnation of the database after resetting logs.

Earlier, following a resetlogs, archived log files could be overwritten - not anymore. LOG_ARCHIVE_FORMAT has been changed to accommodate the resetlogs operation and avoid the overwrite. The new (%r) format looks like %t%s%r.arc

V$LOG_HISTORY and V$OFFLINE_RANGE are not cleared after a resetlogs.

V$DATABASE_INCARNATION

B) Full Database BEGIN BACKUP -

ALTER DATABASE (BEGIN / END) BACKUP; -- earlier was done tablespace wise.
A very nice operational change.

This is not needed for RMAN backups, since RMAN backups are self-consistent (no need to put files in backup 'mode' with RMAN).

C) FLASH BACKUP -

Oracle is banking on disk becoming widely accepted as the default backup medium because of (a) the ease of disk backup and, even more, (b) the decreasing price of disk. Thus the 'flash recovery area' has been designated as the default backup area for control files, archived logs, data files and everything else - particularly when using RMAN.

DB_RECOVERY_FILE_DEST ='' disables the flash recovery area.

Fast recovery - SWITCH DATABASE TO COPY;

No more copying back from the backup area - but the database then runs against the backup copy.

New DROP DATABASE command - how useful!

Incremental Backup -

Fast Incremental Backup -
If 100M of a 10G file has changed, Oracle will read and copy only that 100M. Oracle keeps track of all data block changes in a new 'change tracking file' using the new background process CTWR (aptly named, Change Tracking Writer).

ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE '...';

V$BLOCK_CHANGE_TRACKING

V$BACKUP_DATAFILE shows the effectiveness of block change tracking (% of blocks read / changed).
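A rough way to check that effectiveness yourself (my own query against the documented V$BACKUP_DATAFILE columns, not from the course notes):

select file#,
       datafile_blocks,
       blocks_read,
       round(blocks_read / datafile_blocks * 100, 2) as pct_of_file_read
from v$backup_datafile
where used_change_tracking = 'YES'
and incremental_level > 0;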

21) Oracle Database 10g: Maximize Availability - General Enhancements

Tables with columns LONG and LONG RAW can be redefined online.

CREATE OR REPLACE SYNONYM doesn't invalidate PL/SQL program units if they were already valid.

 

10g New Feature Notes (Cont).

Contd. from the earlier post -

10) Oracle Database 10g: Maximize Availability - Human Error Correction

Flashback Database - flashback logs == before images of data blocks. Created in the flash recovery area (DB_RECOVERY_FILE_DEST) by the (new) background process RVWR (Recovery Writer).

V$FLASHBACK_DATABASE_LOG

V$FLASHBACK_DATABASE_STAT

Following a FLASHBACK DATABASE (TO TIMESTAMP | TO SCN), db must be opened in RESETLOGS mode.

To exclude a tablespace - ALTER TABLESPACE .. FLASHBACK OFF;

FLASHBACK TABLE BEFORE DROP;

DROP TABLE ... PURGE; -- cannot be flashed back; the table is not kept in the recycle bin

DBA_RECYCLEBIN

PURGE RECYCLEBIN;

Flashback Versions -

VERSIONS BETWEEN (SCN | TIMESTAMP)

SCN_TO_TIMESTAMP
TIMESTAMP_TO_SCN

RETENTION GUARANTEE for UNDO tablespace.

New pseudo-columns - VERSIONS_STARTTIME, VERSIONS_ENDTIME
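A minimal sketch of a versions query (the table, column and one-hour window are made up for illustration):

select versions_starttime, versions_endtime, versions_xid, sal
from emp
versions between timestamp (systimestamp - interval '1' hour) and systimestamp
where empno = 7788;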

Flashback Transaction Query -

Must enable "Supplemental Logging" to use this feature. Additional columns are logged into undo segments and redo log files when this is enabled. SELECT ANY TRANSACTION privilege must be granted as well.

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

V$DATABASE - shows if this is on.

Query FLASHBACK_TRANSACTION_QUERY (operation, undo_sql, table_name). Find the XID of the transaction first, using a Flashback Versions query as above.
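Roughly, once you have the XID (the value below is a made-up example), the compensating SQL can be pulled out like this - the SELECT ANY TRANSACTION privilege and supplemental logging are assumed, as noted above:

select operation, table_name, undo_sql
from flashback_transaction_query
where xid = hextoraw('000200030000002D');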

Flashback Table (TO SCN | TO TIMESTAMP) - doesn't work for AQ, data dictionary tables. Much faster than traditional media recovery.

Must 'ENABLE ROW MOVEMENT' before using it.

Triggers are disabled by default during the flashback table operation.

Flashing back before a DDL operation is not possible.

Flashing back during a transaction is not allowed.
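Putting the pieces together, a flashback table run could look roughly like this (the table name and the 15-minute window are placeholders):

ALTER TABLE orders ENABLE ROW MOVEMENT;

FLASHBACK TABLE orders TO TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE);

-- or, for a table that was accidentally dropped and is still in the recycle bin:
FLASHBACK TABLE orders TO BEFORE DROP;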

Friday, March 25, 2005

 

My Learning Path for Oracle 10g - Notes on New Features

As said before, I decided to take the (beta) exam "Oracle 10g upgrade for OCP 8i candidates". Though I am both 8i and 9i certified, I chose to upgrade from 8i because -

(a) the exam was cheaper ($50 - compared to a $125 for upgrade from 9i)
(b) it's (still!) a beta exam and therefore many more questions are expected. Since the "cram guides" are not out on the market, it will be a bit of a challenge.

This is a nice, formal way of learning. I am in a regular job - so could not really assign more than 3-4 days to learn the topics I am expected to cover.

The exam is on Sunday and on this nice, sunny Friday afternoon I feel well-prepared. My preparation strategy was based on the fact that I actually wanted to learn the new product rather than just 'mug' some cramsheets and get a paper certificate.

So, I chose to follow the *excellent* hands-on courses by Oracle University. My present employer is an Oracle partner - so all these courses were free to me. They not only cover the real meat, but I could see some excellent 'demo' of the new features. I don't know what software they were using to create those excellent self-contained demos - but it rocked.

The courses I covered were - you can also search for them on http://ilearning.oracle.com -

(I've added my study notes along with each) ..

1) Oracle 10g - The Self-managing Database

A great introduction to Oracle 10g by Anuj Goel. Talks about the design decisions behind 10g (minimize the maintenance, administration, installation and troubleshooting time) and how they have actually been tackled.

Without going through the live presentation, you can also download the powerpoint presentation and go through it.

2) Oracle Database 10g Administration in 2 Days

Very basic material. Nothing really 'conceptual' - mostly 'how to do things using the various Oracle-provided tools'. Don't spend more than 15 mins here.

This is a totally SQL-free zone, as all administration and management tasks are illustrated with Enterprise Manager.

The 'Backup and Recovery' section talks about 'Oracle suggested Backup Strategy' - which is

"The Oracle-suggested backup strategy is based on creating an image copy of your database which is rolled forward using incrementally updated backups. Oracle Enterprise Manager schedules RMAN backups jobs for you to run during the overnight hours.

For each datafile, the strategy calls for backups to be made as follows:

* At the beginning of day 1 of the strategy (the time the first scheduled job actually runs), an incremental level 0 datafile copy backup. It contains the datafile contents at the beginning of day 1. In a restore-and-recovery scenario, the redo logs from day 1 can be used to recover to any point during day 1.

* At the beginning of day 2, an incremental level 1 backup is created, containing the blocks changed during day 1. In a restore-and-recovery scenario, this incremental level 1 can be applied to quickly recover the rolled-forward level 0 backup to the beginning of day 2, and redo logs can be used to recover to any point during day 2.

* At the beginning of each day n for days 3 and later, the level 1 backup from the beginning of day n-1 is applied to the level 0 backup. This brings the datafile copy to its state at the beginning of day n-1. Then, a new level 1 is created, containing the blocks changed during day n-1. In a restore-and-recovery scenario, this incremental level 1 can be applied to quickly recover a restored backup to the beginning of day n, and redo logs can be used to recover the database to any point during day n.

The datafile copies used in the Oracle-suggested backup strategy are tagged with the tag ORA$OEM_LEVEL_0. The level 1 incremental backups for use in this strategy are created for use with datafile copies that are so labeled. You can safely implement other backup strategies without concern for interference from the backups for the Oracle suggested strategy."

3) Oracle Database 10G: Manageability Overview

Mostly the framework for this 'self-managing' database. Same presentation as #1 - but much more detailed and has an informative question-answer session.

Download the presentation (click on "Reference") from link #1 above. This one doesn't work.

4) Oracle Database 10g: Self-Managing Database - Space Management

Deals with

(a) Proactive Tablespace Monitoring: done every 10 mins by the new background process MMON - supposed to be an in-memory check and therefore fast (Critical Alert: 97% full, Warning Alert: 85% full). Threshold violations can be emailed or paged. Works only on locally managed tablespaces though. DBMS_SERVER_ALERT is the PL/SQL API.
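For reference, setting those thresholds manually through the API looks roughly like this (the parameter names and constants are from my reading of the 10g docs, so treat the exact signature as an assumption; TS_DATA is a made-up tablespace):

begin
  dbms_server_alert.set_threshold(
    metrics_id              => dbms_server_alert.tablespace_pct_full,
    warning_operator        => dbms_server_alert.operator_ge,
    warning_value           => '85',
    critical_operator       => dbms_server_alert.operator_ge,
    critical_value          => '97',
    observation_period      => 1,
    consecutive_occurrences => 1,
    instance_name           => NULL,
    object_type             => dbms_server_alert.object_type_tablespace,
    object_name             => 'TS_DATA');
end;
/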

(b) Segment Management : has a very cool demo of the new 'online shrink' operation. Remember,
even though it's 'online', both shared locks (when compacting unused free space) and exclusive locks (when reclaiming the freed space) are obtained on the segment. It will block any DML running in parallel on the segment while the shrink is going on.

The shrink takes place through a delete-insert mechanism inside Oracle, and therefore chained rows may or may not be impacted. Though it's a DML operation, triggers are NOT fired by Oracle.

Earlier, the high water mark could only be brought down by rebuilding (or moving) the table. The table needs to have 'enable row movement' on. Indexes remain usable after a shrink (unlike after a rebuild / move).

Very useful feature if you remember 'online' doesn't mean 'can do anytime'.

However, the following structures cannot be shrunk -
clustered tables, tables with LONG or LOB columns, and tables with function-based indexes.

Temporary and UNDO segments, since they are handled differently, of course cannot be shrunk.

Syntax -

ALTER TABLE <table_name> SHRINK SPACE [COMPACT] [CASCADE];

If the COMPACT clause is not specified, Oracle resets the HWM and reclaims the freed-up space. If COMPACT is specified, the rows are only compacted; the HWM is not set back (and thus no exclusive lock on the segment is needed).

EM Advisor Central -> Segment Advisor can give an idea of which segments are ripe for SHRINK operation.
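A typical shrink sequence then looks something like this (the table name is a placeholder):

ALTER TABLE big_table ENABLE ROW MOVEMENT;

-- phase 1: compact the rows (shared locks only, can run during business hours)
ALTER TABLE big_table SHRINK SPACE COMPACT;

-- phase 2: reset the HWM and release the space (brief exclusive lock)
ALTER TABLE big_table SHRINK SPACE;

-- optionally shrink the dependent indexes as well
ALTER TABLE big_table SHRINK SPACE CASCADE;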

(c) Undo Management : now even undo_retention tuning is automatic and the default - provided statistics_level is set to typical (which collects 30 statistics on a frequent basis and dumps them into AWR).

AUM = Automatic Undo Management.

Init params like MAX_ROLLBACK_SEGMENTS, UNDO_SUPPRESS_ERRORS (always TRUE), ROW_LOCKING, SERIALIZABLE, and TRANSACTION_AUDITING (always TRUE) no longer work with 10g.

Resumable Space Allocation can be altered in session and system level.

(d) Redo Logfile Advisor : another advisor in the 10g advisor network. Mildly useful, but not a killer feature.

If FAST_START_MTTR_TARGET is set the smallest optimal size of REDO file is calculated.
OPTIMAL_LOGFILE_SIZE (in MB) is a new column in V$INSTANCE_RECOVERY.
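The advice itself is just a query away (assuming FAST_START_MTTR_TARGET is set):

select optimal_logfile_size
from v$instance_recovery;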

5) Oracle Database 10g: Self-Managing Database - Integrated Storage Services

Basically 'Space Management - part 2'. Talks about small but important things, like the obsolescence of the MAXTRANS parameter and the BIGFILE tablespace. I liked the easy description of big-endian and little-endian architectures.

SYSAUX - Now the SYSAUX tablespace stores everything that should be installed with the database ('auxiliary' database metadata) but does not belong to the CORE kernel (i.e., SYS objects). It has to be locally managed with ASSM (i.e., extent management local). The data it contains - Text, UltraSearch, interMedia, OLAP, Scheduler, AWR, XML DB, Workspace Manager, LogMiner, Statspack etc. Query V$SYSAUX_OCCUPANTS to find out what lives there.

Just like SYSTEM tablespace, SYSAUX tablespace cannot be dropped or renamed.

TABLESPACE -

(a) With 10g, a 'temporary tablespace group' - basically a bunch of TEMP tablespaces - can be configured and assigned as a user's default temporary tablespace. The benefit: a user can use different temp tablespaces in different sessions, or a single SQL can use multiple temp tablespaces for sorting / hashing.

A temporary tablespace group - just like a single temporary tablespace - can be made the 'default temporary tablespace' of the database. Then none of the tablespaces in that temp. tablespace group can be dropped.

New DD view - DBA_TABLESPACE_GROUPS.
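A sketch of how such a group might be set up (tablespace names, file paths and the user are made up):

CREATE TEMPORARY TABLESPACE temp1 TEMPFILE '/u01/oradata/temp1_01.dbf' SIZE 500M
  TABLESPACE GROUP temp_grp;

CREATE TEMPORARY TABLESPACE temp2 TEMPFILE '/u01/oradata/temp2_01.dbf' SIZE 500M
  TABLESPACE GROUP temp_grp;

ALTER USER app_user TEMPORARY TABLESPACE temp_grp;

-- or make the group the database-wide default
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_grp;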

(b) Now tablespaces can be renamed. Could be useful in transporting tablespaces between two databases. Remember, RENAME does not rename the data files.

(c) A simple but very useful feature - like the 'default temp tablespace', a 'default permanent tablespace' can be set at the database level. Thus, any unqualified object no longer ends up in SYSTEM. Can be specified during 'create database' or later.
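Both (b) and (c) are one-liners (the tablespace names are placeholders):

ALTER TABLESPACE app_data RENAME TO app_data_hist;

ALTER DATABASE DEFAULT TABLESPACE users;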

BIGFILE Tablespace -

A BFT has one really big datafile. For all practical purposes, the concept of a datafile can now be forgotten and the whole architecture can be looked at on a logical level (i.e., a tablespace stores tables). Only bitmaps can hold the huge extent information for a BFT - thus BFT is possible only with locally managed tablespaces. Oracle can now store up to 8 exabytes (8 million terabytes) of data thanks to BFT.

Max data in a database = D * F * B (D -> max data files, F -> max blocks per file, B -> max block size).

1 PB (petabyte) = 1024 TB (terabyte); 1EB (exabyte) = 1024 PB = 1,048,576 TB = 2**60 bytes!

"You should avoid creating a BFT on a system that does not support striping, because of negative implications for parallel query execution and RMAN backup parallelization."

There's something interesting on the 'it's useless to think how many extents a segment has' -

"
You should try to avoid having more than 100,000 extents (based on a 2 K block size). When more extents are created, there may be a performance penalty under concurrency and during DDL operations that involve high space management activity. However, the above number of extents is not an absolute limit and this number varies proportionally higher to the block size used. For example, with an 8 K block size, you can have 400,000 extents; with a 16 K block size, you can have 800,000 extents. This calculation is based on the number of bitmap blocks (and their size) that will be required to represent the file."

Exotic info - the ROWID for normal (smallfile) tablespaces is of the following format -

OOOOOOFFFBBBBBBRRR
- where
OOOOOO -> data object (segment) number
FFF -> relative file number
BBBBBB -> data block number (relative to the datafile, not the tablespace - thus two rows of the same table can have the same block number yet reside in two separate data files)
RRR -> row slot number inside the block

Since FFF becomes irrelevant with a BIGFILE tablespace (the relative file number is always 1024!), the FFF portion of the rowid is used to extend the data block address instead.

Thus, DBMS_ROWID now takes a new input parameter called TS_TYPE_IN (BIGFILE or SMALLFILE).
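For example, to decode the block number out of a rowid you now tell DBMS_ROWID what kind of tablespace it came from - a sketch only; the literal value ('BIGFILE') and the table name below are assumptions on my part, so check the DBMS_ROWID documentation before relying on it:

select dbms_rowid.rowid_block_number(rowid, 'BIGFILE') as block_no
from some_bft_table
where rownum = 1;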

Transportable Tablespace - 10g allows tablespaces to be transported across different platforms (not doable in 9i).

Steps ->
1. Make tablespace READ ONLY
2. Extract METADATA using data pump (expdp)
3. If the target has different 'endianness' - convert byte ordering in files using RMAN
4. Copy the datafiles in target, Load metadata using data pump (impdp)
5. Make tablespace READ WRITE

To determine endianness -

select endian_format
from v$transportable_platform a, v$database b
where a.platform_name = b.platform_name;
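For step 3 above (the byte-order conversion), the RMAN command on the source side looks roughly like this - the tablespace name, target platform string and output location are placeholders:

RMAN> CONVERT TABLESPACE app_data
      TO PLATFORM 'Linux IA (32-bit)'
      FORMAT '/tmp/transport/%U';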

ASM -

ASM is where Oracle treats the entire disk subsystem as a single RAW device. No more partitioning, I/O tuning or manual data movement - Oracle does it for you. It's an extension of the earlier-coined SAME (Stripe And Mirror Everything) concept. No more need to buy 'volume manager' or 'cluster manager' software.

A separate Oracle ASM instance is started when ASM is used. Both this instance and the working database have access to a common set of disks (DISK GROUPS). The ASM instance has two new background processes - (a) RBAL (rebalances DISK GROUPS) and (b) ARBn (does the data extent movements).

Databases using ASM also have two new background processes, namely (a) RBAL (handles global opens of the disks in DISK GROUPS) and (b) ASMB (talks to the ASM instance). ASM instances can be clustered as well.


6) Oracle Database 10g: Self-Managing Database - General Enhancements

(a) SYS statistics should be collected (Oracle recommends it). GATHER_DATABASE_STATS can collect stats on dictionary tables, and so can the new GATHER_DICTIONARY_STATS.

(b) FIXED tables should also be analyzed once in the instance lifetime using GATHER_FIXED_OBJECTS_STATS. However, fixed tables have no I/O cost since they reside in memory, and thus only CPU costing is used, if at all, by the CBO.

(c) Table MONITORING is no longer relevant since the monitoring is controlled by STATISTICS_LEVEL anyway.

(d) V$FAST_START_TRANSACTIONS keeps a record of recovering and recovered transactions.

(e) MAXTRANS is gone. No longer works now. Oracle does MAXTRANS management itself.

(f) MTS_ parameters are now SHARED_ parameters. Just setting SHARED_SERVERS parameter is sufficient for multi-threading.

(g) The TRCSESS command-line tool collates all trace files from a session. You still have to run TKPROF on the output though.

(h) ALTER SYSTEM FLUSH BUFFER_CACHE;

7) Oracle Database 10g: Self-Managing Database - Common Infrastructure

Talks about the whole ADDM-AWR-Server Alert paradigm. Good to know the framework of the new features. I am not yet done testing the claims.

AWR -
Manageability infrastructure -
  • Application and SQL
  • Storage
  • Space
  • System Resource
  • Backup & Recovery
These tasks have been automated with new 10g infrastructure where automated tasks, server alerts and advisory framework work upon AWR (Automatic Workload Repository).

Not all 'base statistics' (the ones in memory) are persisted into AWR (on disk). The ones that go to AWR are -
  • Access and usage stats for objects
  • time model stats (V$SYS_TIME_MODEL)
  • V$SESSTAT and V$SYSSTAT
  • Optimizer Stats (V$SQL)
  • ASH (active session history)
ASH - per-second data sampled from V$SESSION. MMON flushes this data to AWR every hour, and MMNL does so whenever the buffer is full. ASH == 2MB of memory per CPU, up to a max of 5% of shared_pool_size.

Base Statistics = raw data collected (e.g., total physical reads since startup).
Metrics = secondary stats derived from the above (e.g., average physical reads per hour), and these are what the management framework uses.

10g supports metrics for system, session, files and wait-events stats.

AWR == Base Stats, SQL Stats, Metrics, ASH, Advisor Results in DBA_HIST_* tables. DBMS_WORKLOAD_REPOSITORY is the PL/SQL API to control AWR data and baseline.
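A couple of typical DBMS_WORKLOAD_REPOSITORY calls (the interval and retention values below, in minutes, are just examples):

begin
  dbms_workload_repository.create_snapshot;
  dbms_workload_repository.modify_snapshot_settings(
    interval  => 30,       -- snapshot every 30 minutes
    retention => 10080);   -- keep 7 days of snapshots
end;
/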

$ORACLE_HOME/rdbms/admin/awrrpt creates a statspack type (much more detailed) report.
Database Feature Usage Stats - DBA_FEATURE_USAGE_STATISTICS & DBA_HIGH_WATER_MARK_STATISTICS

Server Generated Alerts --

V$ALERT_TYPES - DBA_THRESHOLDS

Threshold Alerts = Stateful Alerts (e.g., tablespace 97% full). They appear in DBA_OUTSTANDING_ALERTS and, once cleared, move to DBA_ALERT_HISTORY.

Stateless alerts (binary events, like 'snapshot too old') go directly to the history table.

V$SYSMETRIC & V$SYSMETRIC_HISTORY keep metrics for 1 hr.

PL/SQL API - DBMS_SERVER_ALERT

Automated Jobs -

By default, GATHER_STATS_JOB.

DBMS_STATS.LOCK_TABLE_STATS / LOCK_SCHEMA_STATS prevents a table / schema from being analyzed.
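For example (schema and object names are placeholders):

exec dbms_stats.lock_table_stats('APP_OWNER', 'VOLATILE_TABLE');

exec dbms_stats.lock_schema_stats('STAGING_OWNER');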

DBA_OPTSTAT_OPERATIONS - history of DBMS_STATS operations.

A very, very useful feature - when an index is created, its stats are generated automatically.

Advisory Framework -

Advisors -
  • ADDM
  • SQL tuning
  • SQL access
  • Memory (PGA, SGA, Buffer Cache, Library Cache)
  • Segment
  • Undo

Advisor runs can be LIMITED or COMPREHENSIVE. They can be given a max TIME LIMIT, can be interruptible, and can take user directives.

PL/SQL api - DBMS_ADVISOR.

DBA_ADVISOR_ tables.

8) Oracle Database 10g: Self-Managing Database - Performance Management

The new SGA advisor, Automated SGA, Redo advisor.

The optimizer can now analyze a statement for, say, 30 mins, come up with the best possible plan and store it internally for all future executions (only if you want it!). Thus tuning SQL in a packaged application becomes easier (no need to touch the code anymore).

Some issues not captured by Statspack, but by ADDM -
  • excessive logon / logoff
  • memory undersizing
  • hot blocks and objects
  • RAC service issues
  • Locks and ITL contention
  • Checkpointing issues
  • PL/SQL, Java time
Has an *excellent* demo viewlet on Proactive Monitoring -> ADDM recommendations. Really helpful to understand the new philosophy of holistic performance management.

For Reactive Tuning, EM provides
  • CPU & Wait classes (reports holistic CPU usage including that by extra-Oracle stuff)
  • Top SQL Statements
  • Top sessions hurting
Just set SGA_TARGET for automated SGA sizing. Setting it to 0 disables automatic SGA sizing.
It can be increased up to SGA_MAX_SIZE. New background process MMAN serves as SGA memory broker.
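In its simplest form (the size is just an example; it can be adjusted online up to SGA_MAX_SIZE):

alter system set sga_target = 1G;

-- see how Oracle has carved the total up
select component, current_size
from v$sga_dynamic_components;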

SQL Tuning Advisor -

Calls the CBO in a 'special internal mode' to make it do four analyses -
  • stale / missing stats analysis
  • Plan Tuning Analysis (SQL Profile) - stored in data dictionary and thus does not need a hard code change in packaged apps
  • Access Path Analysis
  • Restructure SQL
Things done during SQL Profiling that are not doable when the CBO 'normally' analyzes a SQL (usually in a fraction of a second) to find the access plan -
  • runs partial execution
  • dynamic sampling of statistics
  • advanced predicate selectivity - correlation between columns, join skews and complex predicates (like function)
PL/SQL API for SQL Advisor: DBMS_SQLTUNE
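The API flow, roughly (task name, SQL text and user are illustrative, and the parameter names are from my course notes, so verify against the documentation):

declare
  l_task varchar2(30);
begin
  l_task := dbms_sqltune.create_tuning_task(
              sql_text   => 'select * from orders where customer_id = 100',
              user_name  => 'APP_OWNER',
              scope      => 'COMPREHENSIVE',
              time_limit => 60,
              task_name  => 'demo_tuning_task');
  dbms_sqltune.execute_tuning_task(task_name => 'demo_tuning_task');
end;
/

select dbms_sqltune.report_tuning_task('demo_tuning_task') from dual;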

9) Oracle Database 10g: Maximize Performance - General Enhancements

Talks of the general performance enhancements (as opposed to the new performance management and infrastructure in the previous two lessons).


A) Partitioning Enhancements -
  • List-partitioned table operations are permissible on list-partitioned IOT
  • Index status = USABLE throughout partition management (DROP, TRUNCATE..)
  • Bitmap index on list-partitioned IOT
  • LOB columns supported on partitioned IOT
  • Partitioned tables can be DDLd with 'UPDATE INDEXES' so Oracle rebuilds all index segments
  • SKIP_UNUSABLE_INDEXES (true) can be set at session or system level
  • NEW hash-partitioned global indexes are introduced (hash function = f(partitioning key, no. of partitions))
  • Hash-partitioned indexes save you from right-hand contention (during inserts of increasing, sequence-generated values). The contention is then spread across the different partitions instead of piling up in the 'right hand' blocks (as happens with a non-partitioned or range-partitioned index)
  • According to Oracle -For queries involving equality or in-list predicates, performance of hash-partitioned indexes and non-partitioned indexes are comparable
  • ALTER INDEX ADD PARTITION..COALESCE PARTITION
  • Hash-partitioned indexes must be built partition wise (RENAME, SPLIT, MERGE are unsupported)
  • Hash-partitioned indexes work better with parallel queries
B) Resource Manager Enhancements -

A session can roam across different consumer groups according to the rules laid out in the resource mappings.

MAX_IDLE_TIME, MAX_IDLE_BLOCKER_TIME

Mapping for automatic assignment of sessions to consumer groups.

C) Wait Interface Improvements -

New columns in V$EVENT_NAME - wait_class#, wait_class.

V$SESSION now has wait event columns (event, p1, wait_time)

V$SESSION, V$SESSION_WAIT have new columns - BLOCKING_SESSION, BLOCKING_SESSION_STATUS.

V$ACTIVE_SESSION_HISTORY - current_obj#, objects that are accessed the most.

V$EVENT_HISTOGRAM, V$FILE_HISTOGRAM

D) Sorted Hash-cluster table -

Faster data retrieval than IOT since the overhead of index is absent and data is already sorted.

Response time is not proportional to the table size or to the number of entries (as it is with an IOT); it depends on the number of rows sharing the same hash key.

10) Oracle Database 10g: Maximize Availability - Human Error Correction

Talks in great details about the new FLASHBACK features.

11) Oracle Database 10g: Maximize Availability - Backup and Recovery Enhancements

Some RMAN stuff. It was good to know that Oracle now keeps the RESETLOGS timestamp and can apply files from earlier backups. Before 10g, RESETLOGS invalidated all earlier backups.

12) Oracle Database 10g: Maximize Availability - Data Guard Enhancements

13) Oracle Database 10g: Self-Managing Database - Tools and Utilities

Talks in great detail about DATA PUMP. Basically imp and exp are now impdp and expdp, much faster (especially impdp, which is about 20 times faster than imp!), and can be parallelized. Very nice when you actually do those things.

14) Oracle Database 10g: Self-Managing Database - Security Enhancements

DML can now be audited.
Virtual Private Database can now be implemented down to the column level.

15) Oracle Database 10g: Self-Managing Database - BI and Data Warehouse Enhancements

Partitioned outer join, among other things - this thing is super cool. Even Tom Kyte names it among his favourite 10g features.

16) Oracle Database 10g: BI and Data Warehousing - New Features

17) Oracle Database 10g: Develop Applications - SQL and PL/SQL Enhancements

Much faster PL/SQL compiler. Native interpreter. New datatypes in SQL.

18) Oracle Database 10g: Develop Applications - API Enhancements

Some changes in DBMS_ PL/SQL supplied packages.

19) Oracle Database 10g: Manage Jobs with the Scheduler

DBMS_JOB is dead. Long live the new scheduler. You can now schedule things outside the database (like an OS binary) from this API.

Very detailed. Nice work.

20) Oracle Database 10g: Clustering Enhancements

21) Oracle Database 10g: Maximize Availability - General Enhancements

22) Oracle Database 10g ASM - Admin and Concepts

I'm still trying to fully grasp this new ASM thing, but I think I'm making progress. DBAs who think only their job is getting 'automated' by this 'self-managing' database can finally relax - sysadmins will be obsolete pretty soon too.

OK - was kidding. But ASM claims to reduce an enormous system administration load. Yet to test. Sigh!

23) Oracle Database 10g Manageability: Diagnostic Packs

Mostly a good Demo of new-look Enterprise manager.

24) Oracle Database 10g Manageability: Tuning Packs

Mostly a good Demo of new-look Enterprise manager.

25) Oracle Database 10g: Automatic Storage Management and Clustering - New Features & Functions

ASM in-depth.

I've also gone to some other sources - but will talk about it later.

Plus, I am presently going through some cool RAC trainings (best practice et al). When done, I plan to release my notes and may be do a mini-series on new and effective 10g features.

C.R. talk -

Just read the Fortune issue where big people talk about the best ideas they received from another human. Peter Drucker was the only guy who was named twice, in the list of idea-givers. Amazing. Drucker apparently told Rick Warren the difference between 'efficient' and 'effective'. Efficient is 'doing things right'. Effective is 'doing the right thing'. What about 'not doing anything at all?'. I can vouch for a politician who went to the apex just because he could remain silent in 26 languages, or so.

Wednesday, March 23, 2005

 

The one with 10g - dealing with single quote(s) in SQL and PL/SQL

I've decided to upgrade my Oracle certifications to 10g. Though there've been debates regarding the worth of certification, to me its value is more inward-oriented (self-training) rather than outward (getting a paper to market yourself in the job market).

I usually give a good part of my daily spare time to learning. But in spite of that, I usually end up learning only the things I'm presently working on. For example, I would never have learned the RMAN improvements in 10g without a mandatory requirement to just learn them - which is exactly what such exams impose.

My training regimen is simple. Learn from the excellent online e-study guides at Oracle partner network and try it out with my own practice 10g database.

So far, for last three days, it's been a huge amount of learning. I will come out with a series of "New in 10g" or "Improved with 10g" articles later. But here's a neat trick I learned today -

Say you need to read 'Tom Jones' (single quotes included) from or feed into a SQL statement -
10g offers this very cool q operator.

i.e, the query will look like -

SQL> select q'('Tom Jones')'q from dual;
Q
-----------
'Tom Jones'

want to retrieve 'Today is Tom's birthday' ?

SQL> select q'['Today is Tom's birthday']' string from dual;

STRING
-------------------------
'Today is Tom's birthday'

Note:
----
(a) [ and ] are used as a pair. () or {} can also be used, or you can choose your own delimiter :)
"However, if the delimiter appears in the text literal itself, ensure that it is not immediately followed by a single quotation mark."

(b) Everything, including one or more ' (single quotes), is taken literally and accounted for automatically (no special escaping needed for those)

(c) q is the quotation delimiter

With 9i and before this query would have to be written as -


SQL> select '''Today is Tom''s birthday''' string from dual;

STRING
-------------------------
'Today is Tom's birthday'

i.e., for every occurrence of a single quote, you have to escape it by doubling it! Which was troublesome and needed more coding.

The new q operator is also much more readable!

q can also be used in PL/SQL as -

v_string := q'Z name like 'DBMS_%'Z';

Note: We've used Z as the delimiter here.

Isn't it cool? I like it. No more counting single quotes anymore..Whoo hoo..

Monday, March 21, 2005

 

Missing 'recursive call' on 9i!!!

I was trying to build a simple, SQL*Plus based test case for the cost of parsing (i.e., recursive calls coming back with each different literal). And I had some surprising findings on 9i - yet to explain them or call it a 'bug' - but here goes the test.

Test Bed -

create table t (a number,
b number,
c number,
d number,
e number,
f varchar2(200),
g date);

declare
j pls_integer;
begin
for i in 1..1000000 loop
j := dbms_random.random;
insert into t values (j,j+1, j-1, j+7, j/2,'this is a row which will be tested...', sysdate);
end loop;
commit;
end;
/


create index i1 on t(b);

create index i2 on t(g);

analyze table t compute statistics for table
for all indexes
for all columns;


Now on 10g --

SQL> select * from t where b = 233221323 and g > (sysdate - 1);

no rows selected

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=79)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=1 Card=
1 Bytes=79)

2 1 INDEX (RANGE SCAN) OF 'I1' (INDEX) (Cost=3 Card=1)




Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
1 physical reads
0 redo size
535 bytes sent via SQL*Net to client
373 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

on 9i --

SQL> select * from t where b = 2332213523 and g > (sysdate - 1);

no rows selected

Elapsed: 00:00:00.62

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=79)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
79)

2 1 INDEX (RANGE SCAN) OF 'I1' (NON-UNIQUE) (Cost=1 Card=1)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
2 physical reads
0 redo size
511 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed


Both were immediate first execution of a SQL after the index creation!!

Is it a SQL*Plus issue or an Oracle version issue? Hmm.. Setting "sql_trace = true" on both beds and changing the literal --

on 10g --
--------
SQL> alter session set sql_trace = true;

Session altered.

Elapsed: 00:00:00.04
SQL> select * from t where b = 2332523 and g > (sysdate - 1);

no rows selected

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=79)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=1 Card=
1 Bytes=79)

2 1 INDEX (RANGE SCAN) OF 'I1' (INDEX) (Cost=3 Card=1)




Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
535 bytes sent via SQL*Net to client
373 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

on 9i
-----

SQL> alter session set sql_trace = true;

Session altered.

Elapsed: 00:00:00.04
SQL> select * from t where b = 2332523 and g > (sysdate - 1);

no rows selected

Elapsed: 00:00:00.63

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=79)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
79)

2 1 INDEX (RANGE SCAN) OF 'I1' (NON-UNIQUE) (Cost=1 Card=1)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
2 physical reads
0 redo size
511 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

Note -- autotrace still shows 'recursive calls = 0'!

Let's look at the trace files -


for 10g --

select *
from
t where b = 2332523 and g > (sysdate - 1)


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 3 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 0 3 0 0

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 147

Rows Row Source Operation
------- ---------------------------------------------------
0 TABLE ACCESS BY INDEX ROWID T (cr=3 pr=0 pw=0 time=55 us)
0 INDEX RANGE SCAN I1 (cr=3 pr=0 pw=0 time=48 us)(object id 394056)


for 9i --

select *
from
t where b = 2332523 and g > (sysdate - 1)


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 2 3 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 2 3 0 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 24

Rows Row Source Operation
------- ---------------------------------------------------
0 TABLE ACCESS BY INDEX ROWID T
0 INDEX RANGE SCAN I1 (object id 95644)

Plus, in the 9i trace file we actually see a recursive call which is missing from the 10g file --

select o.owner#,o.name,o.namespace,o.remoteowner,o.linkname,o.subname,
o.dataobj#,o.flags
from
obj$ o where o.obj#=:1


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 0 3 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: SYS (recursive depth: 1)

Overall for 10g --

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 3 0.00 0.00 0 1 3 3
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 1 3 3

Overall for 9i --

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 3 0.00 0.00 0 0 0 0
Execute 4 0.00 0.00 0 1 5 3
Fetch 1 0.00 0.00 0 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 8 0.00 0.00 0 4 5 4


So, in spite of 9i actually doing more recursive calls than 10g -- 'autotrace' shows 9i doing less (actually none!).

I'm a little lost on why 9i autotrace shows '0 recursive calls' at this moment!!

Friday, March 18, 2005

 

BULKING 50% up!

I got some emails on why I named a particular method "Blind Tuning".

It's because the options to 'try out' were very limited. I received the query in email along with a bunch of statistics - I had no access to the customer's data or database and thus was restricted to just 'methods'. Then I tried the 'if I were the optimizer, how would I approach the problem' method, found the disparity with the explain plan I was sent - and suggested a cure.

However, I finally finished my paper and presentation. While making the last two slides on the importance of bulking up in PL/SQL, I wrote a small test case. Putting it up here. The results were obtained on a 9.2.0.5 database on Itanium Linux.

create table t (x number, y date);

Declare
begin
for I in 1 .. 100000
Loop
Insert into t values (i, sysdate);
End loop;
End;
/
-- t has 100K records

create table t1 as select * from t where 1 = 2;
-- t1 is created empty and then loaded with 10K records (load not shown).

-- X is indexed on both tables.

-- Business need: update T, setting t.y = t1.y wherever a record with t.x = t1.x exists.

-- LOOP method / Old iteration

Declare
begin
for c1_rec in (select x,y from t1)
Loop
update t set y = c1_rec.y where x = c1_rec.x;
End loop;
End;
/

1.94 secs

--using BULK COLLECT and FORALL

Declare
type i_num is table of number;
type i_date is table of date;
-- declare PL/SQL "collections"

p_num1 i_num;
p_num2 i_date;
-- initialize PL/SQL collections to hold CURSOR

begin

select x, y
BULK COLLECT into p_num1, p_num2
from t1 ;
--BULK BIND

forall indx in p_num1.FIRST .. p_num1.LAST
update t set y = p_num2(indx) where x = p_num1(indx);
-- BULK DML

End;
/

0.70 sec.

i.e., BULKING up saves about 64% of the time!

What if we try to do this non-procedurally through a single SQL ;)
Say,

update t set t.y = (select t1.y from t1
where t1.x = t.x)
where exists (select '' from t1 where t.x = t1.x);

2.83 sec

How to tune this up? ;)

Thursday, March 17, 2005

 

A presentation I'm making for "Database Side Coding Performance Issues"

Happy St. Patrick's Day people!

A bit busy to prepare a presentation on a site's "legacy performance issues" for the Managers and developers. A very first frill free draft can be read here.

This is supposed to be a one-hour, very high-level overview of the mistakes (from a performance perspective) the developers have been making time and again. After fixing hundreds of pieces of code and performance issues, I figured out that things are coming full circle: the same mistakes are being committed in the later branches that I've already gone in and fixed in the earlier ones.

Thus had a long talk with managers and figured it would make perfect sense to pick a few "critical" and "recurring" issues and update the guys who actually can take ownership of this while developing.

Will keep writing on this more. Remember, though this is a very site-specific thing, in my eight years - most of these issues (and many more) always revisit me!!

Tuesday, March 15, 2005

 

'Blind' Tuning Methods - A walkthrough

Is there any other job in the world where people just ignore you when you can really help with your knowledge and experience (the design phase), but look up to you with cute Puss-in-Boots eyes to 'save the day' when those designs don't work and an irate customer is threatening to walk out??

Well - if you're the 'performance evangelist' -- that's an integral part of the job. It's fun too. Because once you actually make the application 'Die another day' - you're a hero. People actually don't want to ask "What are you working on these days" during lunch. But they are wide-eared even if you're talking about some silly, stupid ideas. You, only you probably, secretly sigh thinking how *easy* it could have been to prevent the issue in the first place. As Einstein said, "Intelligent people solve problems, Geniuses prevent them (from happening)". But then when was someone declared a hero for doing things as expected?

However, I had a hard call today with one of the "big five's" consultant DBA frantically asking for help after getting thrashed by the customer.

The issue :

(winnowed down from several emails) -

The on-site consultant sent me the following SQL with the plan output and the execution statistics. Apparently, "they found another query that is costing too much I/O for them that they are claiming" and "forgot to mention the problem: This is another sql statement that is using a large amount of sort space and is currently executing. It is over 2 GB right now. I have included the explain and execution plans with statistics for one project id."


SELECT top_proj.project_id as project_id,
kpw.period_id as period_id,
round(sum(kpc.actual_labor_cost* ACCN_EV_ANALYSIS_UTILS.GET_TASK_RATIO(kpw.period_id,kp.project_id)),2) as actual_labor_cost
from ACCN_project_costs kpc,
ACCN_projects kp,
ACCN_projects top_proj,
ACCN_periods kpw
where top_proj.project_id = :b1
and kpc.project_id = kp.project_id
and kp.project_path_id_list like top_proj.project_path_id_list||'%'
and kp.project_type_code = 'TASK'
and kpw.period_type_id = 3
and kpw.period_id in (select period_id
from ACCN_cost_metrics_by_period where project_id = :b1)
group by kpw.period_id, top_proj.project_id

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 468 | 44460 | 5074 |
| 1 | SORT GROUP BY | | 468 | 44460 | 5074 |
| 2 | HASH JOIN | | 77743 | 7212K| 4429 |
| 3 | VIEW | index$_join$_004 | 784 | 6272 | 9 |
| 4 | HASH JOIN | | 77743 | 7212K| 4429 |
| 5 | INDEX RANGE SCAN | ACCN_PERIODS_N2 | 784 | 6272 | 3 |
| 6 | INDEX FAST FULL SCAN | ACCN_PERIODS_U1 | 784 | 6272 | 3 |
| 7 | MERGE JOIN | | 77743 | 6605K| 4418 |
| 8 | SORT JOIN | | 2961K| 141M| 3254 |
| 9 | MERGE JOIN CARTESIAN | | 2961K| 141M| 3254 |
| 10 | NESTED LOOPS | | 26 | 1040 | 4 |
| 11 | TABLE ACCESS BY INDEX ROWID| ACCN_PROJECTS | 1 | 31 | 2 |
| 12 | INDEX UNIQUE SCAN | ACCN_PROJECTS_U1 | 121K| | 1 |
| 13 | SORT UNIQUE | | | | |
| 14 | INDEX RANGE SCAN | ACCN_COST_METRICS_BY_PERIOD_U2 | 26 | 234 | 2 |
| 15 | BUFFER SORT | | 115K| 1129K| 3252 |
| 16 | TABLE ACCESS FULL | ACCN_PROJECT_COSTS | 115K| 1129K| 125 |
| 17 | FILTER | | | | |
| 18 | SORT JOIN | | | | |
| 19 | TABLE ACCESS FULL | ACCN_PROJECTS | 60725 | 2194K| 737 |
-----------------------------------------------------------------------------------------------------



Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5074 Card=468 Bytes=44460)
1 0 SORT (GROUP BY) (Cost=5074 Card=468 Bytes=44460)
2 1 HASH JOIN (Cost=4429 Card=77743 Bytes=7385585)
3 2 VIEW OF 'index$_join$_004' (Cost=9 Card=784 Bytes=6272)
4 3 HASH JOIN (Cost=4429 Card=77743 Bytes=7385585)
5 4 INDEX (RANGE SCAN) OF 'ACCN_PERIODS_N2' (NON-UNIQUE) (Cost=3 Card=784 Bytes=6272)
6 4 INDEX (FAST FULL SCAN) OF 'ACCN_PERIODS_U1' (UNIQUE) (Cost=3 Card=784 Bytes=6272)
7 2 MERGE JOIN (Cost=4418 Card=77743 Bytes=6763641)
8 7 SORT (JOIN) (Cost=3254 Card=2961366 Bytes=148068300)
9 8 MERGE JOIN (CARTESIAN) (Cost=3254 Card=2961366 Bytes=148068300)
10 9 NESTED LOOPS (Cost=4 Card=26 Bytes=1040)
11 10 TABLE ACCESS (BY INDEX ROWID) OF 'ACCN_PROJECTS' (Cost=2 Card=1 Bytes=31)
12 11 INDEX (UNIQUE SCAN) OF 'ACCN_PROJECTS_U1' (UNIQUE) (Cost=1 Card=121450)
13 10 SORT (UNIQUE)
14 13 INDEX (RANGE SCAN) OF 'ACCN_COST_METRICS_BY_PERIOD_U2' (UNIQUE) (Cost=2 Card=26 Bytes=234)
15 9 BUFFER (SORT) (Cost=3252 Card=115656 Bytes=1156560)
16 15 TABLE ACCESS (FULL) OF 'ACCN_PROJECT_COSTS' (Cost=125 Card=115656 Bytes=1156560)
17 7 FILTER
18 17 SORT (JOIN)
19 18 TABLE ACCESS (FULL) OF 'ACCN_PROJECTS' (Cost=737 Card=60725 Bytes=2246825)


Statistics
----------------------------------------------------------
2245 recursive calls
43 db block gets
12336 consistent gets
47510 physical reads
0 redo size
1886 bytes sent via SQL*Net to client
688 bytes received via SQL*Net from client
5 SQL*Net roundtrips to/from client
17 sorts (memory)
1 sorts (disk)
54 rows processed

Analysis -

Looking at the plan and statistics, the 'bad' keyphrases in this case are -

(1) 47510 physical reads (almost 4 times the 'Consistent Gets' or Logical I/O)
(2) 2 1 HASH JOIN (Cost=4429 Card=77743 Bytes=7385585)
(3) 9 8 MERGE JOIN (CARTESIAN) (Cost=3254 Card=2961366 Bytes=148068300)

One cartesian join (i.e., no join condition found between the two data sources) producing 148MB of data, and a HASH join on top of this data, will certainly use a lot of "sorting" area, as they've noted.

Solution / Logical Analysis -

Let's get our critical thinking rolling. I did not have access to this particular database. So firstly,
I tried to see if there's actually a table that's not joined with other tables - i.e., candidate for MERGE JOIN (CARTESIAN).

The query contains four tables, and if you compare the joins you'll find that the table ACCN_PERIODS is not directly joined with the other three tables -- though ACCN_PERIODS is supposed to be 'qualified' (joined well) since the IN-clause returns 'just a few rows' (got that information from the consultant).

(Probably?) They tried to print a report for some project for *each* period.

The other tables are, in fact, very well joined. In fact, ACCN_PROJECTS (TOP_PROJ) is supplied a value for its primary key - which then should join well with the other ACCN_PROJECTS (KP). KP and KPC join with PROJECT_ID which is primary key on KP and an index on KPC.

So - this is how I think through these issues - *if I were the optimizer* - I would do

(1) find row from TOP_PROJ (PRIMARY KEY Access by INDEX ROWID)

(2) join the row with KP (*probably* Nested LOOP - since the #rows from 1 is 1)

(3) assuming a few rows from KP returned, NESTED LOOP with KPC (a primary key - index join)

(4) find out the qualified rows from KPW by comparing it with the IN-list

(5) just a few rows from the datasource returned by #3 - doing a CARTESIAN JOIN (CROSS JOIN X * Y) with datasource returned by #4

Thus, a CARTESIAN join is unavoidable -- but with the 'if I were the optimizer' method we make sure the cartesian join takes place between the two smallest data sources possible, and only after all other filters and predicates have been applied to them.

What's wrong in reality -

Well, look at the plan. The optimizer somehow gets confused. Probably the IN-list subquery is 'merged' (unnested) into the main query block, so the query now has five tables to deal with. So the real plan by the optimizer goes like -

(1) TOP_PROJ access -- INDEX UNIQUE SCAN since it's a primary key

(2) ACCN_cost_metrics_by_period access - based on project_id

(3) The optimizer sees that both TOP_PROJ and ACCN_COST_METRICS_BY_PERIOD use the PROJECT_ID = :b1 predicate -- so it ends up joining those two first instead.

(So, does it do the same only during a CARTESIAN join? I believe not. Optimizer tries to put as many joins as possible all the time)

(4) Dataset returned by #3 (26 rows) is then CARTESIAN joined with KPC (115K rows)

CARD of step 8 (7) should then be 115656 * 26 = 3007056. In reality, it's 2961366.

There goes the last hope for our query!

(5) Dataset returned by #4 is then joined with KP. But it's a MERGE JOIN - since the number of records returned by #4 is too high.

..and so on..

Fix -

So if we can convince Oracle to follow our path as laid out in the solution, it would not join TOP_PROJ and ACCN_cost_metrics_by_period on PROJECT_ID (which is exactly what screws up the plan!).

So I rewrote the query (note the FROM-clause order) and added the ORDERED hint so that TOP_PROJ is joined with the other tables first and the ACCN_COST_... table is used only to qualify KPW.

The modified query looked like -

SELECT /*+ ORDERED */ top_proj.project_id as project_id,
kpw.period_id as period_id,
round(sum(kpc.actual_labor_cost* ACCN_EV_ANALYSIS_UTILS.GET_TASK_RATIO(kpw.period_id,kp.project_id)),2) as actual_labor_cost
from ACCN_projects top_proj,
ACCN_projects kp,
ACCN_project_costs kpc,
ACCN_periods kpw
-- note the ORDER
where top_proj.project_id = :b1
and kpc.project_id = kp.project_id
and kp.project_path_id_list like top_proj.project_path_id_list||'%'
and kp.project_type_code = 'TASK'
and kpw.period_type_id = 3
and kpw.period_id in (select period_id
from ACCN_cost_metrics_by_period where project_id = :b1)
group by kpw.period_id, top_proj.project_id;


The plan then looked like -

# Operation Options Object name
======== ================ ============== ==============================
0 SELECT STATEMENT
1 SORT GROUP BY
2 NESTED LOOPS
3 MERGE JOIN CARTESIAN
4 NESTED LOOPS
5 NESTED LOOPS
6 TABLE ACCESS BY INDEX ROWID KDRV_PROJECTS
7 INDEX UNIQUE SCAN KDRV_PROJECTS_U1
8 SORT UNIQUE
9 INDEX RANGE SCAN KDRV_COST_METRICS_BY_PERIOD_U2
10 TABLE ACCESS BY INDEX ROWID KNTA_PERIODS
11 INDEX UNIQUE SCAN KNTA_PERIODS_U1
12 BUFFER SORT
13 TABLE ACCESS FULL KDRV_PROJECT_COSTS
14 TABLE ACCESS BY INDEX ROWID KDRV_PROJECTS
15 INDEX UNIQUE SCAN KDRV_PROJECTS_U1

Look at the way the joins are now "as intended". The query no longer uses any MERGE or HASH joins - and it executed in well under 1 sec with less than 4000 consistent gets.

In future, I intend to discuss this optimizer "unnesting" part more with some more real life examples. Remind me :)

Monday, March 14, 2005

 

Tuning LOOPs with Dynamic SQL (using DBMS_SQL)

Dynamic SQL is a query which is not known fully when you're writing the code. In other words, it will be fully known only at runtime. In the words of Secretary of Defense Donald Rumsfeld, this could be termed "known unknowns". We use it in many cases where the full statement text simply cannot be fixed at compile time.

There are two ways Dynamic SQL can be written -

(a) Using the "DBMS_SQL" API
(b) Using the keywords "EXECUTE IMMEDIATE" (Oracle calls this 'Native Dynamic SQL')

While using DBMS_SQL package, you need to OPEN a cursor assigned for that 'dynamic' SQL, bind it if you can, execute it and close the cursor when you're done.
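For completeness, the native dynamic SQL route (b) for a single insert would look roughly like this (table t is the one created in the test case below):

begin
  execute immediate
    'insert into t (x, y, z) values (:x, :y, :z)'
    using 1, sysdate, 'a sample row';
end;
/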

In many of these cases, you'll also iterate through multiple values that are fed to the dynamic SQL. I've seen many cases where the programmer just 'integrates' all the DBMS_SQL calls within the loop. Thus, even though one single DBMS_SQL.OPEN_CURSOR call would suffice, since it's called within the LOOP a separate cursor is opened (and parsed) for each iteration of the loop.

The result?

(a) Much worse performance (demonstrated below)
(b) Your program could actually die throwing an "ORA-01000: maximum open cursors exceeded" error if the number of iterations is near the value of "open_cursors" and other sessions are active too!

A testcase -

The more generic code I see most of the time -

create table t(x number, y date, z varchar2(255));

create or replace procedure p1
as
   l_cursor int;
   l_status int;
begin
   for i in 1 .. 10000
   loop
      l_cursor := dbms_sql.open_cursor;
      dbms_sql.parse( l_cursor,
                      'insert into t (x,y,z)
                       values (:x,:y,:z)',
                      dbms_sql.native );
      dbms_sql.bind_variable( l_cursor, ':x', i );
      dbms_sql.bind_variable( l_cursor, ':y', sysdate+i );
      dbms_sql.bind_variable( l_cursor, ':z',
                              'this is row ' || i, 255 );
      l_status := dbms_sql.execute( l_cursor );
      dbms_sql.close_cursor(l_cursor);
   end loop;
end;
/


Notice that the open_cursor, parse and close_cursor calls are made *within* the loop, executing each of them 9,999 extra times for a 10,000-row insert. They could easily be placed outside the loop. The tuned code then looks like -

create or replace procedure p2
as
   l_cursor int default dbms_sql.open_cursor;
   l_status int;
begin
   dbms_sql.parse( l_cursor,
                   'insert into t (x,y,z)
                    values (:x,:y,:z)',
                   dbms_sql.native );

   for i in 1 .. 10000
   loop
      dbms_sql.bind_variable( l_cursor, ':x', i );
      dbms_sql.bind_variable( l_cursor, ':y', sysdate+i );
      dbms_sql.bind_variable( l_cursor, ':z',
                              'this is row ' || i, 255 );
      l_status := dbms_sql.execute( l_cursor );
   end loop;

   dbms_sql.close_cursor(l_cursor);
end;
/


I've used two different sets of runs: first to insert 100 rows and second to insert 10,000 rows. I've also set plsql_optimize_level to 0, so that Oracle does not try to optimize the PL/SQL code at all.
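(For reference, the session setting looks like this - plsql_optimize_level is available from 10g onwards:)

alter session set plsql_optimize_level = 0;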

Here's the result - the times are from "set timing on" in SQL*Plus.

P1 - (open-parse-close calls within the LOOP)

Run 1 - 5.88 seconds
Run 2 - 1.99 seconds

P2 - (open-parse-close calls only ONCE outside the LOOP)

Run 1 - 0.64 seconds
Run 2 - 0.20 seconds

However, for a quick but effective comparison between two pieces of database-side code, I like to use runstats.

With runstats, the difference between the two approaches is broken down across a lot more resources (notably, latching!) -
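(Assuming Tom Kyte's runstats_pkg is installed, the harness is roughly this - a sketch:)

exec runstats_pkg.rs_start
exec p1
exec runstats_pkg.rs_middle
exec p2
exec runstats_pkg.rs_stop(500)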

Difference between P1 and P2 for run2 (inserting 10,000 records) --

Run1 ran in 3775 hsecs
Run2 ran in 2544 hsecs
run 1 ran in 148.39% of the time

Name Run1 Run2 Diff
STAT...Elapsed Time 3,778 2,546 -1,232
LATCH.active checkpoint queue 5,305 611 -4,694
LATCH.redo writing 5,417 695 -4,722
STAT...undo change vector size 739,464 733,440 -6,024
LATCH.messages 10,893 1,424 -9,469
STAT...session cursor cache hi 10,032 26 -10,006
STAT...parse count (total) 10,064 52 -10,012
STAT...opened cursors cumulati 10,065 49 -10,016
STAT...consistent gets 10,345 292 -10,053
STAT...consistent gets from ca 10,345 292 -10,053
STAT...session logical reads 21,754 11,581 -10,173
LATCH.checkpoint queue latch 21,952 2,856 -19,096
STAT...redo size 2,881,572 2,861,076 -20,496
STAT...recursive calls 70,941 40,639 -30,302
LATCH.cache buffers chains 88,471 57,053 -31,418
LATCH.object queue header oper 44,356 2,234 -42,122
LATCH.session allocation 49,204 536 -48,668
LATCH.library cache lock 103,449 139 -103,310
LATCH.shared pool 144,004 40,463 -103,541
LATCH.row cache objects 129,869 752 -129,117
LATCH.library cache pin 243,979 80,460 -163,519
LATCH.library cache 317,594 80,644 -236,950

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
1,169,318 270,530 -898,788 432.23%

Look at the differences between P1 and P2 -- not only does P2 run much faster, but because P2 makes 1 parse call as opposed to the 10,000 made by P1, the difference in latch resources is ENORMOUS.

That high latching is the real cost of re-parsing.

If we trace the runs for both P1 and P2 here's what we get --

For p1 --

insert into t (x,y,z)
values (:x,:y,:z)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 10000 0.29 0.30 0 0 0 0
Execute 10000 1.00 0.90 0 10037 10503 10000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 20000 1.29 1.20 0 10037 10503 10000

* Notice the 9999 extra PARSE calls for P1


For p2 --

insert into t (x,y,z)
values (:x,:y,:z)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 10000 0.63 0.55 0 48 10520 10000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10001 0.64 0.55 0 48 10520 10000

Though I don't like putting things into a "Best Practices" document (there are always exceptions), this is probably a good candidate for one.

The statement in there would look something like -

"If you are LOOPing over a dynamic SQL and are using DBMS_SQL for that -- keep the OPEN_CURSOR, PARSE and CLOSE_CURSOR out of the LOOP."

Thursday, March 10, 2005

 

NOLOGGING and recovery issues

Got an issue yesterday, described as follows -

"A customer is recovering from a DB crash .. after restoring the data they see the following error -
java.io.IOException: ORA-01578: ORACLE data block corrupted (file # 6, block # 184017)
ORA-26040: Data block was loaded using the NOLOGGING option
ORA-01110: data file 6: '/u02/oradata/knta00/knta_data01.dbf'

Problem Analysis -

A quick "oerr ora 26040" printed -

26040, 00000, "Data block was loaded using the NOLOGGING option\n"
//* Cause: Trying to access data in block that was loaded without
//* redo generation using the NOLOGGING/UNRECOVERABLE option
//* Action: Drop the object containing the block.

As is evident, some object (a table, index or LOB) was in NOLOGGING mode and was therefore not generating redo. So after the restore, when everything else was rolled forward, Oracle had no redo with which to recover the blocks of that particular object.

We thus needed to know the following to solve it -
(a) WHICH object(s)?
(b) Is it better to drop and recreate, or is there another way to bypass the problem?

A quick bit of thinking - since Oracle points at a specific block and data file id, it's almost certainly an index or a LOB; if it were a table, Oracle would probably print the table name rather than bother identifying the affected data blocks.

Diagnosis :

SQL> select table_name, column_name, logging from user_lobs order by logging;

TABLE_NAME                  COLUMN_NAME        LOGGING
--------------------------- ------------------ -------
UPGRADE_LOGS                MESSAGE            NO
DEFAULT_HELP_CONTENTS       HELP_CONTENTS      NO
HELP_CONTENTS               HELP_CONTENTS      NO

As you can see, two important tables are involved here: DEFAULT_HELP_CONTENTS and HELP_CONTENTS. In my instance, none of these columns are set to NOLOGGING. Is it OK to turn on logging for these columns? E.g.

alter table HELP_CONTENTS modify lob (HELP_CONTENTS) (nocache logging);
alter table DEFAULT_HELP_CONTENTS modify lob (HELP_CONTENTS) (nocache logging);"

Support emailed me before they communicated the issue to the customer. Of course, updating the LOBs to LOGGING makes sense. I also suggested two ways they could resolve the issue quickly -

Solution - Method ONE - The "process" method

Query to find out the corrupt ROWIDs -

-- Identifying the LOB table

select owner, segment_name, segment_type
from dba_extents
where file_id = 6
and 184017 between block_id and block_id + blocks - 1;

-- Identify the LOB_COLUMN_NAME

select table_name, column_name
from dba_lobs
where segment_name = 'SYS_LOB029815C006$$';
-- segment_name returned from the query above

-- Identify the corrupt ROWIDs
create table corrupt_data (corrupt_rowid rowid);

declare
   error_1578 exception;
   pragma exception_init(error_1578, -1578);
   n number;
begin
   for cursor_lob in (select rowid r, &&lob_column
                      from &table_owner..&table_with_lob)
   loop
      begin
         n := dbms_lob.instr(cursor_lob.&&lob_column, hextoraw('8899'));
      exception
         when error_1578 then
            insert into corrupt_data values (cursor_lob.r);
      end;
   end loop;
end;
/

-- Finally set the particular ROWIDs to EMPTY_BLOB() (or EMPTY_CLOB() if it were a CLOB) --

update &table_owner..&table_with_lob
set &lob_column = empty_blob()
where rowid in (select corrupt_rowid from corrupt_data);

Solution - Method TWO - The "out of the box" method

A quick glance at the table names suggested they could hold static data, and querying the test database quickly confirmed it. So I told the customer he could simply drop the aforementioned tables and import them back from his test/dev instance. As it turned out, this was the less technical route, and since the tables contained 16 and 0 records respectively, it was also much faster.

P.S.
I believe the columns in question were modified to NOLOGGING on the assumption that it would speed up the install process (the client was running third-party software). But is NOLOGGING worth it? We will discuss that soon. Let me just say this for now - NOLOGGING was actually called UNRECOVERABLE until a few releases back.
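
(In the meantime, a quick way to see whether a restore would be exposed to this at all is to look for unlogged changes recorded against the datafiles - a sketch; compare the times against your last backup:)

select name, unrecoverable_change#, unrecoverable_time
from v$datafile
where unrecoverable_change# > 0;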
