`

Append Values and how not to break the database

 
阅读更多

With the advent of the /*+ APPEND_VALUES*/ hint in 11gR2, I suspect we will see the feature misused and a bunch of questions on why it "doesn't work". The documentation states that "direct-path INSERT can be considerably faster than conventional INSERT.". What it should state is that it can also be considerably slower (plus that, if logging is not enforced at either the table or tablespace level, it could also render backups unusable).

Firstly, direct path inserts work by inserting data at the end of existing data, above the high water mark. If you have two conventional inserts throwing data at a table, they can each move the high-water mark as required. Session 1 can move it out by 10 blocks, then session 2 can move it out another 2, then session 1 moves it again. The high-water mark is metadata about the table. It will be stored in one place and there is the potential for contention by multiple sessions wanting to change it at once. For conventional inserts, as soon as the session has adjusted it, it releases its hold and other sessions can do their adjustments. It doesn't need to wait on session commits, and contention generally isn't a problem.

In direct-path inserts, the insert moves the HWM but cannot release its hold on that information. That's because the data it is writing between the old and new HWM is 'dodgy'. It isn't yet committed and it shouldn't be read into the buffer cache. [I suspect it is written to the data file as if it were committed rather than with lock flags and transaction identifiers. That way it avoids the need for a delayed block cleanout when it is subsequently read.] If another insert (or even an update or merge) on the table needs to move the HWM, it has to wait until the direct path transaction is committed or rolled back.  That could happen with conventional insert/update/merge, but will ALWAYS happen with another direct path insert.

Try this in one session
drop table TEST_IA purge;
create table TEST_IA (id number, val varchar2(4000));
insert /*+ APPEND */ into TEST_IA
select rownum rn, 'b' val from dual connect by level <= 1000;

And then this in another
insert /*+ APPEND */ into TEST_IA
select rownum rn, 'b' val from dual connect by level <= 1000;

See what I mean ?
If you have 11gR2, insert /*+ APPEND_VALUES */ into TEST_IA values (1,'b'); will have the same effect.

So the first failure pattern I predict will be multiple sessions all trying APPEND_VALUES into the same table and knocking heads together. This would most likely happen when developers have tried to parallelise operations outside the database. Knocking heads together may be the appropriate solution too.

I suspect this will be somewhat mitigated by the second failure pattern. Once you've done a direct path insert, if you try anything else on that table in the same session, you'll get a "ORA-12838: cannot read/modify an object after modifying it in parallel" error message. I'd say the error message was somewhat mis-leading, but a quick google will tell them that the solution is to do a commit after the insert. You can split coders into two groups, the first who understand the concept of a transaction, and the second who don't. I think the number of the latter are increasing. Even if it is okay to commit, you could still have log sync waits.

The final failure pattern I predict will be those who think "Ah, I can do inserts without generating log data. That should be faster.". The problem is that the metadata changes, moving the HWM, are logged and it is only the creation of content data that might be unlogged. In the following script, I compare several approaches to inserting a single record in a loop. When compared with a conventional part insert, it is apparent that a lot more redo is generated for the single row direct path insert with a small record size. When I used a larger record size (padding the value to several thousand characters) the redo size was comparable but there were still fewer redo entries in the conventional path inserts. I do concede logging is not the only performance impact and performance may still improve due to bypassing of the buffer cache, no need to locate free space in the table etc.

drop table TEST_ROW_IAV purge;
drop table TEST_ROW_IA purge;
drop table TEST_ROW_IV purge;
drop table TEST_ROW_I purge;

create table TEST_ROW_IAV (id number, val varchar2(4000));
create table TEST_ROW_IA  (id number, val varchar2(4000));
create table TEST_ROW_IV  (id number, val varchar2(4000));
create table TEST_ROW_I   (id number, val varchar2(4000));

clear screen
declare
  cursor c_1 is
    select rownum rn, 'b' val from dual connect by level <= 10000;
  --
  procedure rep_ext (p_seg in varchar2)
  is
    cursor c_e  is
      select rpad(segment_name,20) segment_name, tablespace_name,
             count(extent_id) cnt, sum(round(bytes/1024)) kb, sum(blocks) blocks
      from user_extents
      where segment_name  = p_seg
      group by segment_name, tablespace_name;
  begin
    for c_out in c_e loop
      dbms_output.put_line(to_char(c_out.segment_name)||
       ' '||c_out.tablespace_name||' '||to_char(c_out.cnt,'9990')||
       ' '||to_char(c_out.kb,'999,990.00')||' '||to_char(c_out.blocks,'999,990'));
    end loop;
  end rep_ext;
  --
  procedure rep_redo (p_text in varchar2)
  is
    cursor c_r is
      select    sum(case when name = 'redo entries' then value end) redo_entries,
                sum(case when name = 'redo size' then value end) redo_size
      from v$mystat s join v$statname n on n.statistic# = s.statistic#
      where name in ('redo entries','redo size');
  begin
    dbms_output.put(rpad(p_text,20)||' at '||to_char(sysdate,'hh24:mi:ss')||' ');
    for c_rec in c_r loop
      dbms_output.put_line('Entries:'||to_char(c_rec.redo_entries,'9,999,990')||
                           ' Size:'||to_char(c_rec.redo_size,'999,999,990'));
    end loop;
    end rep_redo;
    --
begin
    commit;
    rep_redo('Start');
    for c_rec in c_1 loop
      insert /*+ APPEND_VALUES */ into TEST_ROW_IAV values (c_rec.rn, c_rec.val);
      commit;
    end loop;
    rep_redo('After /*+ APPEND_VALUES */');
    --
    for c_rec in c_1 loop
      insert /*+ APPEND */ into TEST_ROW_IA select c_rec.rn, c_rec.val from dual;
      commit;
    end loop;
    rep_redo('After /*+ APPEND */');
    --
    for c_rec in c_1 loop
      insert into TEST_ROW_IV values (c_rec.rn, c_rec.val);
      commit;
    end loop;
    rep_redo('After insert values');
    --
    for c_rec in c_1 loop
      insert into TEST_ROW_I select c_rec.rn, c_rec.val from dual;
      commit;
    end loop;
    rep_redo('After insert select');
    --
    rep_ext('TEST_ROW_IAV');
    rep_ext('TEST_ROW_IA');
    rep_ext('TEST_ROW_IV');
    rep_ext('TEST_ROW_I');
    --
end;
/And my results:


Start                at 14:10:59 Entries:       912 Size:     125,628
After /*+ APPEND_VAL at 14:11:02 Entries:   112,547 Size:  15,995,632
After /*+ APPEND */  at 14:11:08 Entries:   224,184 Size:  31,863,240
After insert values  at 14:11:09 Entries:   234,409 Size:  36,723,128
After insert select  at 14:11:11 Entries:   244,634 Size:  41,422,384
TEST_ROW_IAV         USERS    81   81,920.00   10,240
TEST_ROW_IA          USERS    81   81,920.00   10,240
TEST_ROW_IV          USERS     3      192.00       24
TEST_ROW_I           USERS     3      192.00       24
APPEND_VALUES (and indeed APPEND) are not intended for single row inserts, but even small arrays will demonstrate similar problems. Ideally you want each insert to leave full blocks to minimize unused (and probably unusable) space. If you target your array size to a single block, you may find that in practice you get variations between 75% of a block and 1 and a bit blocks (more empty space). However if each insert creates one hundred blocks, you don't mind if the last one is a bit empty. With this in mind you probably want to think of arrays of at least thousands of rows, and maybe in the tens or hundreds of thousands of rows depending on both row size and block size

One more script, looking at how things should be done. Firstly the classical insert with append hint from a select. Secondly, with an array of a hundred thousand rows inserted with the append_values hint. Finally, the '10g' way of a direct path insert from a PL/SQL variable, using SQL types. What you see is that, for this data set, there's not much to choose between the three.

drop table TEST_FORALL_IAV purge;
drop table TEST_IA purge;
drop table TEST_TYPE_IA purge;

drop type type_test_ia;
drop type type_tab_test_ia;

create type type_test_ia is object (id number, val varchar2(4000));
/
create type type_tab_test_ia is table of type_test_ia;
/

create table TEST_FORALL_IAV (id number, val varchar2(4000));
create table TEST_IA (id number, val varchar2(4000));
create table TEST_TYPE_IA (id number, val varchar2(4000));

clear screen
declare
  cursor c_1 is
    select rownum rn, 'b' val from dual connect by level <= 100000;
  TYPE tab_1 is table of c_1%rowtype index by pls_integer;
  t_1    tab_1;
  t_tab    type_tab_test_ia;
  --
  procedure rep_ext (p_seg in varchar2)
  is
    cursor c_e  is
      select rpad(segment_name,20) segment_name, tablespace_name,
             count(extent_id) cnt, sum(round(bytes/1024)) kb, sum(blocks) blocks
      from user_extents
      where segment_name  = p_seg
      group by segment_name, tablespace_name;
  begin
    for c_out in c_e loop
      dbms_output.put_line(to_char(c_out.segment_name)||
        ' '||c_out.tablespace_name||' '||to_char(c_out.cnt,'9990')||
        ' '||to_char(c_out.kb,'999,990.00')||
        ' '||to_char(c_out.blocks,'999,990'));
    end loop;
  end rep_ext;
  --
  procedure rep_redo (p_text in varchar2)
  is
    cursor c_r is
      select sum(case when name = 'redo entries' then value end) redo_entries,
             sum(case when name = 'redo size' then value end) redo_size
      from v$mystat s join v$statname n on n.statistic# = s.statistic#
      where name in ('redo entries','redo size');
  begin
    dbms_output.put(rpad(p_text,20)||' at '||to_char(sysdate,'hh24:mi:ss')||' ');
    for c_rec in c_r loop
        dbms_output.put_line('Entries:'||to_char(c_rec.redo_entries,'9,999,990')||
          ' Size:'||to_char(c_rec.redo_size,'999,999,990'));
    end loop;
  end rep_redo;
    --
begin
  commit;
  rep_redo('Start');
  insert /*+ APPEND */ into TEST_IA
  select rownum rn, 'b' val from dual connect by level <= 100000;
  rep_redo('After Append dual');
  --
  open c_1;
  fetch c_1 bulk collect into t_1;
  close c_1;
  forall i in 1..t_1.count
      insert /*+ APPEND_VALUES */ into TEST_FORALL_IAV values t_1(i);
  rep_redo('After Append values');
  --
  select cast(collect(type_test_ia(rn,val)) as type_tab_test_ia)
  into t_tab
  from (select rownum rn, 'b' val from dual connect by level <= 100000);
  --
  insert /*+ APPEND */ into    TEST_TYPE_IA
  select * from table(t_tab);
  rep_redo('After Append type');
  --
  rep_ext ('TEST_IA');
  rep_ext ('TEST_FORALL_IAV');
  rep_ext ('TEST_TYPE_IA');
end loop;
/
And again, my results

Start                at 14:33:04 Entries:       580 Size:     112,392
After Append dual    at 14:33:04 Entries:       850 Size:     153,468
After Append values  at 14:33:04 Entries:     1,116 Size:     193,836
After Append type    at 14:33:05 Entries:     1,383 Size:     234,432
TEST_IA              USERS    17    2,048.00      256
TEST_FORALL_IAV      USERS    17    2,048.00      256
TEST_TYPE_IA         USERS    17    2,048.00      256

 

参考至:http://blog.sydoracle.com/2010/02/append-values-and-how-not-to-break.html

如有错误,欢迎指正

邮箱:czmcj@163.com

分享到:
评论

相关推荐

    append_values.py

    append_values

    SSD7 选择题。Multiple-Choice

    (c) typing a syntactically correct SQL query that uses column and table names similar to the correct column and table names in a database (d) writing an English description of the data that the user...

    Google C++ Style Guide(Google C++编程规范)高清PDF

    For example, if your header file uses the File class in ways that do not require access to the declaration of the File class, your header file can just forward declare class File; instead of having ...

    Spectral Core Full Convert Enterprise v6.11.0.1683

    It auto-adapts to your data and database engines used and selects the best possible data transfer algorithm for your conversion scenario. You have full control over the database migration and can ...

    2009 达内Unix学习笔记

    集合了 所有的 Unix命令大全 ...telnet 192.168.0.23 自己帐号 sd08077-you0 ftp工具 192.168.0.202 tools-toolss ... 各个 shell 可互相切换 ksh:$ sh:$ csh:guangzhou% bash:bash-3.00$ ... 命令和参数之间必需用空格隔...

    DevExpress VCL v2012 vol 1.6源码、例子、帮助-part2

    B218627 - TBasedxReportLink descendants - Customized values of the TimeFormat, DateFormat and PageFormat properties are reset to their default values in some cases if default formats have been ...

    asp.net mvc

    Possible values are input (while the user is typing), blur (after the user is done typing and moves to another field), and submit (when the user submits the form). Bug Fixes The following bugs have ...

    /* append*/ oracle append 知识点

    ### Oracle Append 模式详解 #### 一、概述 在Oracle数据库中,`/*append*/` 是一种特殊的提示,用于指示数据库使用特定的方式执行插入操作。这种插入方式被称为 **Append Mode** 或 **Direct Path Insert**。它能...

    squashfs1.3r3.tar.gz

    The -nopad option informs mksquashfs to not pad the filesystem to a 4K multiple. This is performed by default to enable the output filesystem file to be mounted by loopback, which requires files to be...

    LCTF软件备份VariSpec™ Liquid Crystal Tunable Filters

    Previous revisions did not range-check the palette index number, and hard crashes could be produced if out-of-range values were supplied to this routine. Previous release 1.33b Release date: ...

    BURNINTEST--硬件检测工具

    - Changed the Memory test to wait for the Video Playback test and 3D test to allow memory allocation for these tests. - Minor changes to the No operation error watchdog timer for the CD and Hard ...

    前端开源库-vue-append

    【Vue.js 开源库——Vue-append 深度解析】 Vue.js 是一款广泛使用的前端JavaScript框架,它以其轻量级、高效以及易于学习的特点深受开发者喜爱。在Vue.js的生态系统中,存在许多优秀的开源库,它们扩展了Vue的核心...

    mastering-spring-cloud2018

    will also learn how to enable discovery on the client side and register these clients in different zones. Chapter 5, Distributed Configuration with Spring Cloud Config, will describe how use ...

    appendData异步加载大数据量分片加载数据和增量渲染的解决方案

    "appendData异步加载大数据量分片加载数据和增量渲染的解决方案"是Echarts为优化性能提供的一种高效策略,尤其适用于展示百万级甚至千万级的数据集。本文将深入探讨这一高级进阶技巧。 首先,我们来理解"appendData...

    EMS.Advanced.Data.Export.VCL.v4.9.0.1.Full.Source

    Added the possibility to append date/time to the exported file name. The AppendDateTimeToFileName and AppendDateTimeToDatabaseName (TADO_QExport4Access) properties. Use the ExportedFileName and ...

    Clever Internet Suite (SRC) v9.1.0.0

    GZip compressor component - provides compressing / uncompressing feature when transmitting data over the Internet, storing data in to a file, database and many more. RSS client - is used for creating,...

    php.ini-development

    Expressions in the INI file are limited to bitwise operators and parentheses: ; | bitwise OR ; ^ bitwise XOR ; & bitwise AND ; ~ bitwise NOT ; ! boolean NOT ; Boolean flags can be turned on using ...

    squashfs2.2-r2.tar.gz

    The -no-fragments tells mksquashfs to not generate fragment blocks, and rather generate a filesystem similar to a Squashfs 1.x filesystem. It will of course still be a Squashfs 2.0 filesystem but ...

    arcgis下append面图层

    本文将详细介绍在ArcGIS中如何进行“append面图层”的操作,即如何将多个面图层合并成一个单一的图层,同时保持或更新属性表。 首先,我们可以采用ArcToolbox中的工具来实现这一目标。在ArcToolbox中,找到...

Global site tag (gtag.js) - Google Analytics