Random notes on database features, code, practice, optimization and techniques.
Last year, I was diagnosing CPU spike in a certain site. Strangely enough,
analyzing the issue led to Oracle's SMON process being responsible for occupying a full CPU cycle for hours, and with no intention of slowing down or stopping.
SMON does the following work -
(1) Temporary Segment Cleanup
(2) Free Space Coalescing (only for dictionary-managed tablespaces)
(3) Instance recovery (if needed) at the instance startup
Whenever things like "it was working fine till yesterday" happen - I try to collect a list of "what all were changed since
yesterday".
It was found that the application was creating lots of temporary objects - a *lot* on session-level. The objects were created with a public synonym. But, here was the deal, when the objects were dropped -- the public synonyms were NOT dropped. Thus they remained in the system. The DBA found an exorbitant number of rows in DBA_OBJECTS (thanks to millions of un-dropped public synonyms accumulated over years!) and in the night ran a batch job to drop them. Next morning, they come in and boom! SMON is showing to be the CPU hogger with no sign to retire.
The DBA_OBJECTS is a view on OBJ$ - which resides in SYSTEM tablespace. On this particular installation SYSTEM tablespace was dictionary managed (a legacy from 8i). So suddenly the OBJ$ table is down from 4million records to just over a few thousands - and SMON is trying really hard to clean up the "holes" in dictionary-managed SYSTEM tablespace. OBJ$ table is actually part of a cluster - thus making the process even slower.
The proof of this theory were -
(a) V$SQL showing the query which directs to obj$ table being accessed an inordinate number of times
(b) The result from "select count(*) from sys.obj$" query is dropping with time.
The SMON cleanup cost 12.5% of the horsepower in that production environment. The server had 8CPUs and SMON was clogging 1 of them for too long.
We figured out that event '10052 trace name context forever'; stops SMON from coalescing free space in a dictionary-managed tablespace. We had to put that on init.ora and restart the database to make sure SMON does not go power hungry again.
On the weekend, during very low-usage - we would again restart the database and remove the parameter to unleash the evil force of SMON ;)
Lesson? Yeah. We've dropped objects from a production database and faced problems. Thus, even if not directly linked with other functionality - changes in a working environment could have painful effects. As the adage goes, if it ain't broke ....
C.R. talk -
An
awesome example of problem solving - the right way (analysis, defining, solve, test, refine..in a very out-of-the-box way) and the wrong way (trying iterations based on 'past examples' that may have worked in a different environment, painful, expensive, without any environment analysis).