`
javayestome
  • 浏览: 1042090 次
  • 性别: Icon_minigender_2
  • 来自: 北京
文章分类
社区版块
存档分类
最新评论

Statspack之十四-"log file sync" 等待事件

阅读更多


原文出处:

http://www.eygle.com/statspack/statspack14-LogFileSync.htm

当一个用户提交(commits)或者回滚(rollback),session的redo信息需要写出到redo logfile中.
用户进程将通知LGWR执行写出操作,LGWR完成任务以后会通知用户进程.
这个等待事件就是指用户进程等待LGWR的写完成通知.

对于回滚操作,该事件记录从用户发出rollback命令到回滚完成的时间.

如果该等待过多,可能说明LGWR的写出效率低下,或者系统提交过于频繁.
针对该问题,可以关注:
log file parallel write等待事件
user commits,user rollback等统计信息可以用于观察提交或回滚次数

解决方案:
1.提高LGWR性能
尽量使用快速磁盘,不要把redo log file存放在raid 5的磁盘上
2.使用批量提交
3.适当使用NOLOGGING/UNRECOVERABLE等选项

可以通过如下公式计算平均redo写大小:

avg.redo write size = (Redo block written/redo writes)*512 bytes

如果系统产生redo很多,而每次写的较少,一般说明LGWR被过于频繁的激活了.
可能导致过多的redo相关latch的竞争,而且Oracle可能无法有效的使用piggyback的功能.

我们从一个statspack中提取一些数据来研究一下这个问题.

1.主要信息

DB Name         DB Id    Instance     Inst Num Release     OPS Host
------------ ----------- ------------ -------- ----------- --- ------------
DB           1222010599  oracle              1 8.1.7.4.5   NO  sun
                Snap Id     Snap Time      Sessions
                ------- ------------------ --------
 Begin Snap:       3473 13-Oct-04 13:43:00      540
   End Snap:       3475 13-Oct-04 14:07:28      540
    Elapsed:                  24.47 (mins)

Cache Sizes
~~~~~~~~~~~
           db_block_buffers:     102400          log_buffer:   20971520
              db_block_size:       8192    shared_pool_size:       600M

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:             28,458.11              2,852.03
                  ......
                  					   

2.等待事件

Event                               Waits   Timeouts  Time (cs)    (ms)   /txn
---------------------------- ------------ ---------- ----------- ------ ------
log file sync                      14,466          2       4,150      3    1.0
db file sequential read            17,202          0       2,869      2    1.2
latch free                         24,841     13,489       2,072      1    1.7 
direct path write                     121          0       1,455    120    0.0
db file parallel write              1,314          0       1,383     11    0.1
log file sequential read            1,540          0          63      0    0.1
....
log file switch completion              1          0           3     30    0.0
refresh controlfile command            23          0           1      0    0.0
LGWR wait for redo copy                46          0           0      0    0.0
....
log file single write                   4          0           0      0    0.0
					  

我们看到,这里log file sync和db file parallel write等待同时出现了.
显然log file sync在等待db file parallel write的完成.

这里磁盘IO肯定存在了瓶颈,实际用户的redo和数据文件同时存放在Raid的磁盘上,存在性能问题.
需要调整.

3.统计信息

Statistic                                    Total   per Second    per Trans
--------------------------------- ---------------- ------------ ------------
....
redo blocks written                         93,853         63.9          6.4
redo buffer allocation retries                   1          0.0          0.0
redo entries                               135,837         92.5          9.3
redo log space requests                          1          0.0          0.0
redo log space wait time                         3          0.0          0.0
redo ordering marks                              0          0.0          0.0
redo size                               41,776,508     28,458.1      2,852.0
redo synch time                              4,174          2.8          0.3
redo synch writes                           14,198          9.7          1.0
redo wastage                             4,769,200      3,248.8        325.6
redo write time                              3,698          2.5          0.3
redo writer latching time                        0          0.0          0.0
redo writes                                 14,572          9.9          1.0
....
sorts (disk)                                     4          0.0          0.0
sorts (memory)                             179,856        122.5         12.3
sorts (rows)                             2,750,980      1,874.0        187.8
....
transaction rollbacks                           36          0.0          0.0
transaction tables consistent rea                0          0.0          0.0
transaction tables consistent rea                0          0.0          0.0
user calls                               1,390,718        947.4         94.9
user commits                                14,136          9.6          1.0
user rollbacks                                 512          0.4          0.0
write clones created in backgroun                0          0.0          0.0
write clones created in foregroun               11          0.0          0.0
          -------------------------------------------------------------

						

avg.redo write size = (Redo block written/redo writes)*512 bytes
		    = ( 93,853 / 14,572 )*512 
		    = 3K				



这个平均过小了,说明系统的提交过于频繁.

Latch Sleep breakdown for DB: DPSHDB  Instance: dpshdb  Snaps: 3473 -3475
-> ordered by misses desc

                                Get                                  Spin &
Latch Name                    Requests         Misses      Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
row cache objects              12,257,850     113,299          64 113235/64/0/
                                                                  0/0
shared pool                     3,690,715      60,279      15,857 52484/588/65
                                                                  46/661/0
library cache                   4,912,465      29,454       8,876 23823/2682/2 
                                                                  733/216/0
cache buffers chains           10,314,526       2,856          33 2823/33/0/0/
                                                                  0
redo writing                       76,550         937           1 936/1/0/0/0
session idle bit                2,871,949         225           1 224/1/0/0/0
messages                          107,950         159           2 157/2/0/0/0
session allocation                184,386          44           6 38/6/0/0/0
checkpoint queue latch             96,583           1           1 0/1/0/0/0
          -------------------------------------------------------------				
			

由于过渡频繁的提交,LGWR过度频繁的激活,我们看到这里出现了redo writing的latch竞争.

关于redo writing竞争你可以在steve的站点找到详细的介绍:
http://www.ixora.com.au/notes/lgwr_latching.htm


转引如下:

When LGWR wakes up, it first takes the redo writing latch to update the SGA variable that shows whether it is active. This prevents other Oracle processes from posting LGWR needlessly. LGWR then takes the redo allocation latch to determine how much redo might be available to write (subject to the release of the redo copy latches). If none, it takes the redo writing latch again to record that it is no longer active, before starting another rdbms ipc message wait.
If there is redo to write, LGWR then inspects the latch recovery areas for the redo copy latches (without taking the latches) to determine whether there are any incomplete copies into the log buffer. For incomplete copies above the sync RBA, LGWR just defers the writing of that block and subsequent log buffer blocks. For incomplete copies below the sync RBA, LGWR sleeps on a LGWR wait for redo copy wait event, and is posted when the required copy latches have been released. The time taken by LGWR to take the redo writing and redo allocation latches and to wait for the redo copy latches is accumulated in the redo writer latching time statistic.

(Prior to release 8i, foreground processes held the redo copy latches more briefly because they did not retain them for the application of the change vectors. Therefore, LGWR would instead attempt to assure itself that there were no ongoing copies into the log buffer by taking all the redo copy latches.)

After each redo write has completed, LGWR takes the redo allocation latch again in order to update the SGA variable containing the base disk block for the log buffer. This effectively frees the log buffer blocks that have just been written, so that they may be reused.


本文作者:
eygle,Oracle技术关注者,来自中国最大的Oracle技术论坛itpub.
www.eygle.com是作者的个人站点.你可通过Guoqiang.Gai@gmail.com来联系作者.欢迎技术探讨交流以及链接交换.


原文出处:

http://www.eygle.com/statspack/statspack14-LogFileSync.htm


分享到:
评论

相关推荐

    oracle statspack

    例如,如果发现大量的“log file sync”等待事件,可能意味着日志缓冲区大小配置不当或日志写入频率过高,需要调整相关的参数设置。 ### 优化建议 根据Statspack报告的分析结果,可以制定相应的优化策略。这可能...

    STATSPACK report

    - **log file sync**: 27,338 以上是针对“STATSPACK report”报告的关键知识点的详细解析,涵盖了数据库的基本信息、性能数据以及性能诊断的重要指标。通过对这些数据的综合分析,可以有效定位数据库中的性能问题...

    Oracle9i的数据库优化.pdf

    在STATSPACK报告中,Top 5 Timed Events展示了耗时最多的五个事件,如control file parallel write和log file sync,这些等待事件可能是性能问题的来源。控制文件并行写入和日志文件同步是数据库事务提交的重要部分...

    oracle 常见等待事件及处理方法

    7. **log file sync**:等待日志刷新到磁盘,确保事务持久化。优化数据库配置,如使用RAC(Real Application Clusters)的并行日志写入,或调整日志写入策略来减少这种等待。 在处理这些等待事件时,可以利用`v$...

    How to Use AWR reports to Diagnose Database Performance Issues

    “Log file sync”等待事件可能表明日志写入性能低下,可能需要优化Redo Log文件的配置或硬件性能。 “Buffer busy waits”通常与数据块的锁争用相关,可能需要调整并发控制策略或优化数据访问模式。 除了单独使用...

    如何使用AWR报告来诊断数据库性能问题.docx

    9. **'Log file sync' waits** 和 **'Buffer busy waits'** - 这两种等待事件可能表明日志写入或缓冲区管理存在问题,可能需要调整参数或优化I/O子系统。 10. **ADDM报告** - 使用ADDM (Automatic Diagnostic ...

    常见问题:如何使用Oracle AWR报告来诊断数据库性能问题

    例如,'Log file sync' waits 和 'Buffer busy waits' 是两个常见的等待事件,它们通常与日志文件同步问题和缓冲区争用问题相关。 另外,Oracle提供的ADDM(Automatic Database Diagnostic Monitor)报告是另一种...

    oracle基础课程2

    例如,在一个实际案例中,上海某网站遇到性能问题,通过分析发现"Log File Sync"等待事件居高不下,这表明日志文件同步成为性能瓶颈。解决方案可能包括使用专用的RAID1磁盘存储重做日志,使用Raw Device,减小log ...

    LoadRunner对数据库的监控指标(sqlserver/oracle/db2)

    - **描述**: 统计等待事件的类型和数量,如 latch、log file sync 等。 - **重要性**: 极高 - **参考值**: 依赖于具体场景 - **建议**: 优化索引、减少锁定等待等。 2. **缓冲区命中率 (Buffer Hit Ratio)** -...

    BOSS数据库系统性能瓶颈分析和定位.pdf

    然而,在`v$session_wait`中发现大量`log file sync`等待事件,可能由以下三个原因引起: - a) 事务提交过于频繁,与2000多万用户基数相匹配。 - b) `log_buffer`设置较小,无法满足高并发需求。 - c) 重做日志...

    Oracle 数据库性能优化

    - **Log File Sync**: 这是常见的一个等待事件,表示数据库正在等待写入重做日志文件。 - **db file sequential read**: 表示顺序读取数据文件的等待。 - **db file scattered read**: 表示随机读取数据文件的等待。...

Global site tag (gtag.js) - Google Analytics