in Oracle

SQL*Net Message Waits

Here are some extracts from an 11.2.0.4 AWR Report I ran a simulated workload on this server for about 40minutes and generated this report. I understand that some DBAs may be misinterpreting SQL*Net message time.

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
ora11204 Linux x86 64-bit 2 2 1 3.04

Snap Id Snap Time Sessions Curs/Sess
--------- ------------------- -------- ---------
Begin Snap: 158 18-Sep-16 21:42:34 36 1.0
End Snap: 159 18-Sep-16 22:23:01 33 1.0
Elapsed: 40.45 (mins)
DB Time: 42.04 (mins)

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 1.0 10.2 0.12 0.07
DB CPU(s): 0.0 0.1 0.00 0.00
Redo size (bytes): 2,523.6 24,793.8
Logical read (blocks): 41.7 409.6


Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tota Wait % DB
Event Waits Time Avg(ms) time Wait Class
------------------------------ ------------ ---- ------- ------ ----------
log file sync 1,095 2508 2291 99.5 Commit
db file sequential read 162 37.5 231 1.5 User I/O
DB CPU 19.1 .8
SQL*Net message to client 31,579 .1 0 .0 Network
Disk file operations I/O 103 0 0 .0 User I/O


Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Avg Avg
Total Wait Wait % DB Active
Wait Class Waits Time (sec) (ms) time Sessions
---------------- ---------------- ---------------- -------- ------ --------
Commit 1,095 2,509 2291 99.5 1.0
System I/O 12,899 2,210 171 87.6 0.9
User I/O 1,866 38 20 1.5 0.0
DB CPU 19 .8 0.0
Network 33,651 9 0 .4 0.0


Foreground Wait Class DB/Inst: ORCL/orcl Snaps: 158-159
-> s - second, ms - millisecond - 1000th of a second
-> ordered by wait time desc, waits desc
-> %Timeouts: value of 0 indicates value was < .5%. Value of null is truly 0
-> Captured Time accounts for 101.7% of Total DB time 2,522.36 (s)
-> Total FG Wait Time: 2,546.18 (s) DB CPU time: 19.14 (s)


Avg
%Time Total Wait wait
Wait Class Waits -outs Time (s) (ms) %DB time
-------------------- ---------------- ----- ---------------- -------- ---------
Commit 1,095 0 2,509 2291 99.5
User I/O 276 0 38 136 1.5
DB CPU 19 0.8
Network 31,579 0 0 0 0.0
Concurrency 21 0 0 0 0.0

Foreground Wait Events DB/Inst: ORCL/orcl Snaps: 158-159
-> s - second, ms - millisecond - 1000th of a second
-> Only events with Total Wait Time (s) >= .001 are shown
-> ordered by wait time desc, waits desc (idle events last)
-> %Timeouts: value of 0 indicates value was < .5%. Value of null is truly 0


Avg
%Time Total Wait wait Waits % DB
Event Waits -outs Time (s) (ms) /txn time
-------------------------- ------------ ----- ---------- ------- -------- ------
log file sync 1,095 0 2,509 2291 4.4 99.5
db file sequential read 162 0 37 231 0.7 1.5
SQL*Net message to client 31,579 0 0 0 127.9 .0
Disk file operations I/O 103 0 0 0 0.4 .0
latch: shared pool 2 0 0 4 0.0 .0
direct path sync 2 0 0 2 0.0 .0
db file scattered read 6 0 0 0 0.0 .0
jobq slave wait 5,522 100 2,770 502 22.4
SQL*Net message from clien 31,577 0 2,404 76 127.8


Wait Event Histogram DB/Inst: ORCL/orcl Snaps: 158-159
-> Units for Total Waits column: K is 1000, M is 1000000, G is 1000000000
-> % of Waits: value of .0 indicates value was <.05%; value of null is truly 0
-> % of Waits: column heading of <=1s is truly 1s is truly >=1024ms
-> Ordered by Event (idle events last)

% of Waits
-----------------------------------------------
Total
Event Waits 1s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
ARCH wait on ATTACH 37 97.3 2.7
ARCH wait on DETACH 37 100.0
Disk file operations I/O 920 99.9 .1
LGWR wait for redo copy 54 100.0
Parameter File I/O 640 100.0
SQL*Net break/reset to cli 6 100.0
SQL*Net message to client 33.6K 100.0
...
SQL*Net message from clien 34.2K 82.7 9.5 2.7 1.6 .7 2.4 .3 .0


Other Instance Activity Stats DB/Inst: ORCL/orcl Snaps: 158-159
-> Ordered by statistic name

Statistic Total per Second per Trans
-------------------------------- ------------------ -------------- -------------
SQL*Net roundtrips to/from clien 31,579 13.0 127.9

1ms>1024ms>

Should I be worried about the SQL*Net message waits (to client, from client) ? How should I interpret them ? I have more than 30K SQL*Net messages and roundtrips. Apparently, messages from client are very slow — at 76ms/message. Apparently, round-trips are very slow — at 13 round-trips per second. Here’s a hint : In this workload that I created, all the client sessions were on the database server — so you shouldn’t be expecting such large wait times !

SQL*Net message to client waits are not really measured in terms of the time it took for the message to reach the client. Oracle doesn’t know how long the transmission took. It only knows the time taken to put the message onto the TCP stack on the database server itself. Subsequent network transmission time is unknown. That is why SQL*Net message to client will always be an extremely low figure in terms of time — because it isn’t a true measure of elapsed time sending a message to a client.

SQL*Net message from client is not just the time spent on the network. It is the time between the last message to the client upto the next message from the client. Thus, it also includes client overheads (like “think time”, CPU time, network stack on the client) besides transmission over the network. In most cases, the major component of SQL*Net message from the client is client “think time” — how long the client spent before formatting and sending the next SQL call to the database server. This should also mean that it is not always true that SQL*Net message from client is an idle event.

If I have a single client that is either or some or all of :
a. Loading data in sets of rows (e.g. an ETL client)
b. Extracting data in sets of rows
c. Retrieving results and formatting the results for presentation to a user
b. Running a batch job that majorly consists of SQL calls, not PLSQL
the SQL*Net message from client is majorly the time spent by the client (— unless you really have a high latency network.). In most such cases, when tracing the individual session, this wait event is NOT an idle event. However, when reading an AWR, you cannot isolate such a session from the other sessions that are doing a mix of activity — some with real interactive end-users, some sending periodic “heart-beat” messages, some completely idle waiting for a request from a user. In this AWR report, there are a mix of clients with different “think-times”, some completely idle for 40minutes. We can’t identify them in the AWR report.

Can you use the SQL*Net roundtrips to/from client figure from the AWR ? Not if you have a mix of different clients doing different forms of activity. Idle clients will have very few roundtrips in the 40minutes while clients extracting data row-by-row (not in PLSQL, but SQL or some other client like Java) would have a high number of roundtrips. So, you can’t separate the two types of behaviour in an AWR.

If you are really concerned about identifying SQL*Net overheads and/or round-trips, you should *trace* the specific individual session of interest and extract figures from the trace file.
.
.
.