Many people know that the Tkprof does not report base statistics and times for calls that have recursive calls. This can be handled by other alternative tools. There is one more unfortunate effect to watch out for when doing tuning of a DB that does fair amount of LOB operations. It seems to be more fundamental and much harder to cope with.
Someone was trying to capture performance problem for BLOB loading application. Any method they tried didn’t show good clue to the problem, and they could wonder around it for longer time unless weren’t hinted on certain LOB handling patterns that had been known for some buggy behavior.
But this was just an issue that made me post this other related observation. Here are two excerpts from Tkprof and Statspack reports from database where the case was reproduced.
Tkprof
OVERALL TOTALS FORALLNON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 7 0.02 0.02 4 86 0 0
Execute 9 0.66 1.14 82 2956 5618 8
Fetch 1 0.00 0.00 0 2 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 17 0.68 1.16 86 3044 5618 9
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message from client 100014 54.07 112.26
db file sequential read 1 0.00 0.00
SQL*Net message to client 100013 0.00 0.19
log file sync 10 0.00 0.02
latch: shared pool 11 0.02 0.02
OVERALL TOTALS FORALLRECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 4644 0.94 0.93 0 0 107 0
Execute 11059 6.06 9.85 570 45295 5516 8933
Fetch 23131 0.70 2.23 305 38484 3 28673
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 38834 7.71 13.03 875 83779 5626 37606
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 113211 0.30 324.45
control file sequential read 855 0.03 0.06
direct path write 154743 0.00 0.70
direct path read 99973 0.00 0.64
log file switch completion 111 0.97 65.21
log file switch (checkpoint incomplete) 255 0.97 119.11
latch: shared pool 4 0.00 0.00
latch: checkpoint queue latch 1 0.00 0.00
latch: library cache 1 0.00 0.00
latch: cache buffers chains 1 0.00 0.00
db file scattered read 1 0.00 0.00
Statspack
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time
----------------------------------------- ------------ ----------- ------ ------
db file sequential read 121,368 373 3 49.4
CPU time 141 18.7
log file switch (checkpoint incomplete) 263 122 463 16.1
log file switch completion 113 66 586 8.8
log file parallel write 3,837 47 12 6.3
-------------------------------------------------------------
Time Model System Stats DB/Inst: TST10203/tst10203 Snaps: 2625-2631
-> Ordered by % of DB time desc, Statistic name
Statistic Time (s) % of DB time
----------------------------------- -------------------- ------------
DB CPU 292.3 40.1
sql execute elapsed time 45.8 6.3
PL/SQL execution elapsed time 32.8 4.5
parse time elapsed 5.4 .7
hard parse elapsed time 5.3 .7
PL/SQL compilation elapsed time 0.1 .0
hard parse (sharing criteria) elaps 0.1 .0
connection management call elapsed 0.0 .0
repeated bind elapsed time 0.0 .0
hard parse (bind mismatch) elapsed 0.0 .0
sequence load elapsed time 0.0 .0
DB time 728.5
background elapsed time 130.7
background cpu time 27.5
-------------------------------------------------------------
Both parts indicate that there was serious activity, like waits on the sequential disk reading and redo log synchronization, CPU utilization, that was not reported in category “sql execute elapsed time” and “PL/SQL execution elapsed time”. There was no statement reported in Tkprof or Statspack that generated this activity, but there was a particular cursor seen all over the 10046 trace that was responsible for some of these missing components.
I kind of was lucky to see it, because the cursor has no associated calls whatsoever, only wait records. Unless there are waits associated with such cursor, there will be no records about it in the Oracle Trace. Another “funny” thing is that this call-less cursor can have recursive calls too. This cursor is created solely to write and read LOB content.
So, here are few words of caution for those who do performance troubleshooting using raw trace files, Statspack or ASH/AWR reports. Tkprof or other tools that would try to calculate CPU, Elapsed Time or Wait statistics and build hierarchy of calls of sessions containing LOB operations will likely produce incorrect results from slightly wrong to misleading. This is because calls contain major information that allows traversing call hierarchy and calculating base statistics and timing figures. The “missing” calls might cause whole call hierarchy tree to break. For the same reason Statspack reports may show resource consumption without any visible consumer.
One more thing worth mentioning. If LOB operations are wrapped into code that uses calls to communicate with the database (i.e. PL/SQL anonymous blocks or procedures), then the resources consumed by the LOB operations inside that code will be assigned to these calls.