第一次意义上的sql优化,得瑟一下,虽然只是对sql的改写而已,但中间涉及到not in, or 和ort exist ,union, union all 和null值的处理。还是做下笔记。
EXISTS用于检查子查询是否至少会返回一行数据,该子查询实际上并不返回任何数据,而是返回值True或False。
NOT EXISTS 的作用与 EXISTS 正相反。如果子查询没有返回行,则满足 NOT EXISTS 中的 WHERE 子句
起因是巡检的时候发现cpu使用很高,就检查了cpu的top 10
System: xxxx Mon Apr 14 09:52:25 2014
Load averages: 2.22, 2.56, 2.75
1055 processes: 918 sleeping, 108 running, 29 zombies
Cpu states: (avg)
LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
2.22 16.1% 5.3% 77.7% 0.9% 0.0% 0.0% 0.0% 0.0%
System Page Size: 4Kbytes
Memory: 31740652K (7913112K) real, 35398200K (9552472K) virtual, 8472480K free Page# 1/82
CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
28 ? 28996 oracle 178 22 50750M 118M run 44:46 176.38 176.07 oracleods1
24 ? 22336 oracle 178 20 50640M 8496K run 277:45 171.64 171.34 oracleods1
16 ? 8200 oracle 178 20 50637M 5636K run 0:29 125.10 98.55 oracleods1
2 ? 8427 oracle 178 20 50637M 5636K run 0:12 102.05 53.84 oracleods1
4 ? 9004 oracle 178 20 50640M 8900K sleep 9886:22 49.43 49.35 oracleods1
4 ? 9385 oracle 178 20 50645M 9128K sleep 0:47 44.09 44.02 oracleods1
16 ? 8442 oracle 178 20 50637M 5632K sleep 0:08 83.33 41.95 oracleods1
1 ? 8436 oracle 178 20 50637M 5632K sleep 0:08 82.80 41.68 oracleods1
1 ? 8446 oracle 178 20 50637M 5632K sleep 0:09 85.73 40.98 oracleods1
13 ? 8407 oracle 178 20 248M 5180K run 0:12 67.42 40.01 oracleods1
26 ? 8449 oracle 178 20 50637M 5624K sleep 0:08 92.36 39.07 oracleods1
29 ? 8490 oracle 178 20 248M 5180K run 0:07 102.37 33.75 oracleods1
SQL> select sql_id, prev_sql_id, s.username, status,logon_time from v$session s , v$process p where s.paddr = p.addr and spid = 28996;
SQL_ID PREV_SQL_ID USERNAME STATUS LOGON_TIME
------------- ------------- ------------------------------ -------- ------------
gpffra78um5zh 08p8kbs837tdp xxx ACTIVE 14-APR-14
NAME VALUE_STRING
------------------------------------------------------------ --------------------
:B1
:B3 20100621
:B2 20140413
new 1: select * from table(dbms_xplan.display_cursor('gpffra78um5zh'))
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID gpffra78um5zh, child number 0
-------------------------------------
INSERT INTO STAT_ZHONGHE_TOTAL_MONTH (MONTH_ID, REGION_ID, REGION_NAME, AGENT_ID,
AGENT_NAME, AGENT_LEVEL_NAME, K_I_NUM, K_D_NUM, K_SCORE, R_I_NUM, R_D_NUM, R_SCORE,
T_I_NUM, T_D_NUM, T_SCORE, Z_I_NUM, Z_D_NUM, Z_SCORE, TOTAL_SCORE, GEN_DATE) SELECT
:B1 , B.REGION_ID, B.REGION_NAME, B.AGENT_ID, B.AGENT_NAME, B.AGENT_LEVEL_NAME, 0
K_I_NUM, 0 K_D_NUM, 0 K_SCORE, 0 R_I_NUM, 0 R_D_NUM, 0 R_SCORE, 0 T_I_NUM, 0 T_D_NUM,
0 T_SCORE, 0 Z_I_NUM, 0 Z_D_NUM, 0 Z_SCORE, 0, SYSDATE FROM DIM_AGENT_CHANNEL_TEMP B
WHERE B.TEAM_ID NOT IN (SELECT A.DEV_DEPT_ID FROM LIST_ZHONGHE_DETAIL_DAY A WHERE
A.MONTH_DAY_ID >= :B3 AND A.MONTH_DAY_ID <= :B2 AND A.DEV_DEPT_ID IS NOT NULL ) OR
B.TEAM_ID IS NULL
Plan hash value: 1761209397
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 14 (100)| |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | DIM_AGENT_CHANNEL_TEMP | 3386 | 591K| 14 (0)| 00:00:01 |
|* 3 | FILTER | | | | | |
|* 4 | TABLE ACCESS FULL| LIST_ZHONGHE_DETAIL_DAY | 1272K| 23M| 7587 (2)| 00:01:32 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("B"."TEAM_ID" IS NULL OR IS NULL))
3 - filter(:B3<=:B2)
4 - filter(("A"."DEV_DEPT_ID" IS NOT NULL AND "A"."MONTH_DAY_ID">=:B3 AND
"A"."MONTH_DAY_ID"<=:B2 AND LNNVL("A"."DEV_DEPT_ID"<>TO_NUMBER(:B1))))
Note
-----
- dynamic sampling used for this statement
36 rows selected.
SQL> select gen_date,count(gen_date) from odsapp.STAT_ZHONGHE_TOTAL_MONTH group by gen_date;
GEN_DATE COUNT(GEN_DATE)
------------------- ---------------
2014-04-14 09:04:46 615
2014-04-14 09:04:57 2604
该库的自动统计信息收集给关了。没有统计信息更新。
SQL> select last_analyzed, rownum from dba_tables where table_name in('LIST_ZHONGHE_DETAIL_DAY','DIM_AGENT_CHANNEL_TEMP');
LAST_ANALYZE ROWNUM
------------ ----------
1
2
表的数据量:
SQL> select segment_name,bytes/1024/1024/1024 from dba_segments where segment_name in('LIST_ZHONGHE_DETAIL_DAY','DIM_AGENT_CHANNEL_TEMP');
SEGMENT_NAME BYTES/1024/1024/1024
--------------------------------------------------------------------------------- --------------------
LIST_ZHONGHE_DETAIL_DAY .263671875
DIM_AGENT_CHANNEL_TEMP .000976563
exec DBMS_STATS.GATHER_TABLE_STATS(ownname=>'xxxx',tabname=>'DIM_AGENT_CHANNEL_TEMP',ESTIMATE_PERCENT=>20,degree=>2);
SQL> select table_name,last_analyzed, num_rows from dba_tables where table_name in('LIST_ZHONGHE_DETAIL_DAY','DIM_AGENT_CHANNEL_TEMP');
TABLE_NAME LAST_ANALYZE NUM_ROWS
------------------------------ ------------ ----------
DIM_AGENT_CHANNEL_TEMP 14-APR-14 3386
LIST_ZHONGHE_DETAIL_DAY 14-APR-14 1316065
因为是全表扫,检查是否有索引:
SQL> select table_name,index_name, last_analyzed, status from dba_indexes where table_name in ('LIST_ZHONGHE_DETAIL_DAY','DIM_AGENT_CHANNEL_TEMP') and owner='xxx';
no rows selected
SQL> select table_name,index_name, last_analyzed, status from dba_indexes where table_name in ('LIST_ZHONGHE_DETAIL_DAY','DIM_AGENT_CHANNEL_TEMP');
no rows selected
事实上LIST_ZHONGHE_DETAIL_DAY 表的MONTH_DAY_ID 的数值范围和sql中指定的差不多,就算有索引,应该也会走全表。
SQL>
SQL> set line 300 pages 3000 long 999999 timing on autotrace traceonly
SQL> SELECT
B.REGION_ID,
B.REGION_NAME,
B.AGENT_ID,
B.AGENT_NAME,
B.AGENT_LEVEL_NAME,
0 K_I_NUM,
0 K_D_NUM,
0 K_SCORE,
0 R_I_NUM,
0 R_D_NUM,
0 R_SCORE,
0 T_I_NUM,
0 T_D_NUM,
0 T_SCORE,
0 Z_I_NUM,
0 Z_D_NUM,
0 Z_SCORE,
0,
SYSDATE
FROM xxx.DIM_AGENT_CHANNEL_TEMP B
WHERE B.TEAM_ID NOT IN (SELECT A.DEV_DEPT_ID
FROM xxx.LIST_ZHONGHE_DETAIL_DAY A
WHERE A.MONTH_DAY_ID >= '20100621'
AND A.MONTH_DAY_ID <= '20140413'
AND A.DEV_DEPT_ID IS NOT NULL)
OR B.TEAM_ID IS NULL;
2604 rows selected.
Elapsed: 01:26:01.23
Execution Plan
----------------------------------------------------------
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 212 | 16536 | 14 (0)|
| 1 | FILTER | | | | |
| 2 | TABLE ACCESS FULL| DIM_AGENT_CHANNEL_TEMP | 3386 | 257K| 14 (0)|
| 3 | TABLE ACCESS FULL| LIST_ZHONGHE_DETAIL_DAY | 2 | 28 | 14 (0)|
-----------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
Statistics
----------------------------------------------------------
466 recursive calls
0 db block gets
92299781 consistent gets
0 physical reads
0 redo size
164109 bytes sent via SQL*Net to client
2395 bytes received via SQL*Net from client
175 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
2604 rows processed
SQL>
SELECT B.REGION_ID,
B.REGION_NAME,
B.AGENT_ID,
B.AGENT_NAME,
B.AGENT_LEVEL_NAME,
0 K_I_NUM,
0 K_D_NUM,
0 K_SCORE,
0 R_I_NUM,
0 R_D_NUM,
0 R_SCORE,
0 T_I_NUM,
0 T_D_NUM,
0 T_SCORE,
0 Z_I_NUM,
0 Z_D_NUM,
0 Z_SCORE,
0,
SYSDATE
FROM xxx.DIM_AGENT_CHANNEL_TEMP B
WHERE not exists (SELECT A.DEV_DEPT_ID
FROM xxx.LIST_ZHONGHE_DETAIL_DAY A
WHERE A.dev_dept_id =B.TEAM_ID
and A.MONTH_DAY_ID >= '20100621'
AND A.MONTH_DAY_ID <= '20140413'
AND A.DEV_DEPT_ID IS NOT NULL)
union
SELECT B.REGION_ID,
B.REGION_NAME,
B.AGENT_ID,
B.AGENT_NAME,
B.AGENT_LEVEL_NAME,
0 K_I_NUM,
0 K_D_NUM,
0 K_SCORE,
0 R_I_NUM,
0 R_D_NUM,
0 R_SCORE,
0 T_I_NUM,
0 T_D_NUM,
0 T_SCORE,
0 Z_I_NUM,
0 Z_D_NUM,
0 Z_SCORE,
0,
SYSDATE
FROM xxx.DIM_AGENT_CHANNEL_TEMP B
WHERE B.TEAM_ID IS NULL;
2127 rows selected.
Elapsed: 00:00:05.37
Execution Plan
----------------------------------------------------------
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3430 | 307K| 7636 (2)|
| 1 | SORT UNIQUE | | 3430 | 307K| 7636 (2)|
| 2 | UNION-ALL | | | | |
| 3 | HASH JOIN ANTI | | 3385 | 304K| 7620 (2)|
| 4 | TABLE ACCESS FULL| DIM_AGENT_CHANNEL_TEMP | 3386 | 257K| 14 (0)|
| 5 | TABLE ACCESS FULL| LIST_ZHONGHE_DETAIL_DAY | 1310K| 17M| 7595 (2)|
| 6 | TABLE ACCESS FULL | DIM_AGENT_CHANNEL_TEMP | 45 | 3510 | 14 (0)|
-------------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
Statistics
----------------------------------------------------------
824 recursive calls
0 db block gets
34637 consistent gets
0 physical reads
0 redo size
131900 bytes sent via SQL*Net to client
2043 bytes received via SQL*Net from client
143 SQL*Net roundtrips to/from client
34 sorts (memory)
0 sorts (disk)
2127 rows processed
改用not exists后速度快了很多,但是结果集不一样了。比以前的少了,想到用的是union,会去重。用union all试试。
单独用not exists :
SQL>
SELECT B.REGION_ID,
B.REGION_NAME,
B.AGENT_ID,
B.AGENT_NAME,
B.AGENT_LEVEL_NAME,
0 K_I_NUM,
0 K_D_NUM,
0 K_SCORE,
0 R_I_NUM,
0 R_D_NUM,
0 R_SCORE,
0 T_I_NUM,
0 T_D_NUM,
0 T_SCORE,
0 Z_I_NUM,
0 Z_D_NUM,
0 Z_SCORE,
0,
SYSDATE
FROM xxx.DIM_AGENT_CHANNEL_TEMP B
WHERE not exists (SELECT A.DEV_DEPT_ID
FROM xxx.LIST_ZHONGHE_DETAIL_DAY A
WHERE A.dev_dept_id =B.TEAM_ID
and A.MONTH_DAY_ID >= '20100621'
AND A.MONTH_DAY_ID <= '20140413'
AND A.DEV_DEPT_ID IS NOT NULL)
;
2604 rows selected.
Elapsed: 00:00:01.21
Execution Plan
----------------------------------------------------------
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3385 | 304K| 7620 (2)|
| 1 | HASH JOIN ANTI | | 3385 | 304K| 7620 (2)|
| 2 | TABLE ACCESS FULL| DIM_AGENT_CHANNEL_TEMP | 3386 | 257K| 14 (0)|
| 3 | TABLE ACCESS FULL| LIST_ZHONGHE_DETAIL_DAY | 1310K| 17M| 7595 (2)|
-----------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
34395 consistent gets
0 physical reads
0 redo size
216685 bytes sent via SQL*Net to client
2395 bytes received via SQL*Net from client
175 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2604 rows processed
把 WHERE B.TEAM_ID IS NULL;给去掉后,结果集和修改前一直了,但是为啥?
原因就是null值不能做运算。 在计算not exists的子查询不做null的判断。所以进一步修改sql,将 B.TEAM_ID IS not NULL 加到exists 后面。
这个位置的原因,理解的还不够,希望有人能看见帮忙指点一下。
SQL>
SELECT B.REGION_ID,
B.REGION_NAME,
B.AGENT_ID,
B.AGENT_NAME,
B.AGENT_LEVEL_NAME,
0 K_I_NUM,
0 K_D_NUM,
0 K_SCORE,
0 R_I_NUM,
0 R_D_NUM,
0 R_SCORE,
0 T_I_NUM,
0 T_D_NUM,
0 T_SCORE,
0 Z_I_NUM,
0 Z_D_NUM,
0 Z_SCORE,
0,
SYSDATE
FROM xxx.DIM_AGENT_CHANNEL_TEMP B
WHERE not exists (SELECT A.DEV_DEPT_ID
FROM xxx.LIST_ZHONGHE_DETAIL_DAY A
WHERE A.dev_dept_id =B.TEAM_ID
and A.MONTH_DAY_ID >= '20100621'
AND A.MONTH_DAY_ID <= '20140413'
)
and b.team_id is not null
union all
SELECT B.REGION_ID,
B.REGION_NAME,
B.AGENT_ID,
B.AGENT_NAME,
B.AGENT_LEVEL_NAME,
0 K_I_NUM,
0 K_D_NUM,
0 K_SCORE,
0 R_I_NUM,
0 R_D_NUM,
0 R_SCORE,
0 T_I_NUM,
0 T_D_NUM,
0 T_SCORE,
0 Z_I_NUM,
0 Z_D_NUM,
0 Z_SCORE,
0,
SYSDATE
FROM xxx.DIM_AGENT_CHANNEL_TEMP B
WHERE B.TEAM_ID IS NULL;
2604 rows selected.
Elapsed: 00:00:01.21
Execution Plan
----------------------------------------------------------
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3385 | 303K| 7632 (2)|
| 1 | UNION-ALL | | | | |
| 2 | HASH JOIN ANTI | | 3340 | 300K| 7618 (2)|
| 3 | TABLE ACCESS FULL| DIM_AGENT_CHANNEL_TEMP | 3341 | 254K| 14 (0)|
| 4 | TABLE ACCESS FULL| LIST_ZHONGHE_DETAIL_DAY | 1310K| 17M| 7593 (2)|
| 5 | TABLE ACCESS FULL | DIM_AGENT_CHANNEL_TEMP | 45 | 3510 | 14 (0)|
------------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
34459 consistent gets
0 physical reads
0 redo size
216685 bytes sent via SQL*Net to client
2395 bytes received via SQL*Net from client
175 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2604 rows processed
SQL>
这下结果集一样了
相关推荐
通过对比分析 `EXISTS`, `NOT EXISTS`, `IN`, `NOT IN` 以及不同类型的连接操作,我们可以发现不同的查询策略在不同的应用场景下具有不同的优劣性。选择合适的查询方式不仅能够提高查询效率,还能够节省系统资源,...
这是因为`IN`和`NOT IN`需要将主查询中的每一行都与子查询结果进行比较,而`EXISTS`和`NOT EXISTS`则只需判断子查询是否返回至少一行数据即可,无需获取所有数据行。 **性能对比:** - **IN 和 NOT IN**:需要将主...
MySQL优化之in,exists,not in,not exists的区别in与existsin查询过程结论:exists查询过程:结论:not in与not existsnot in查询过程:结论:not exists查询过程:结论: 首先我们使用两个用户表作为实例 insert ...
### MySQL多表查询和EXISTS查询性能对比 #### 测试环境 在进行MySQL多表查询与EXISTS查询的性能对比之前,首先需要了解测试环境的相关配置。虽然原文中并未给出具体的测试环境细节,但在实际操作中,这一步骤至关...
在SQL查询优化中,`IN`、`EXISTS`、`NOT IN`和`NOT EXISTS`是四个常见的比较和过滤条件,它们在不同场景下有不同的性能表现。以下是对这些操作符优化原则的详细说明: 1. **EXISTS的执行流程**: `EXISTS` 子查询...
在数据库查询语言QL中,`exists`与`in_`(通常写作`IN`)是两种常见的用于子查询的语法结构,它们各自有着独特的应用场景和性能表现。理解这两种语法的区别对于优化查询效率至关重要。 ### `exists`与`in_`的基本...
#### 二、Exists 与 In 的效率对比 **2.1 Exists 与 In 的基础概念** - `EXISTS` 主要用于检测子查询是否有结果返回,通常效率较高,因为一旦找到匹配项就会停止搜索。 - `IN` 用于测试主查询中的值是否存在于子...
在SQL Server中,`EXISTS` 和 `IN` 是两种常用的子查询操作符,用于比较一个查询结果是否与另一个查询结果相匹配。虽然它们在功能上相似,但在性能和优化方面存在一些差异,这在大数据处理时尤其显著。 1. **EXISTS...
在理解`EXISTS`的使用时,我们需要深入探讨其工作原理、优缺点以及与其他查询操作(如`IN`、`JOIN`)的对比。 一、`EXISTS`的基本用法 `EXISTS`子句通常与子查询一起使用,形式如下: ```sql SELECT column1, ...
这里,我们在子查询中明确排除了`NULL`值,确保`NOT IN`操作只对比非`NULL`的值。 这两种方法都能有效地避免`NULL`值导致的错误,并且能正确地计算出`t1`中在`t2`中没有对应记录的条数。 在进行SQL查询优化时,...
本文将深入分析几个关键的知识点,包括:如何构建测试环境、比较不同查询类型(如IN与EXISTS,NOT IN与NOT EXISTS)的性能差异、以及探讨CHARINDEX与LIKE操作符的效率对比,最后讨论UNION与OR运算符在查询中的应用...
使用`NOT EXISTS`和嵌套的`SELECT AVG()`来计算每个学生在每门课程中的平均成绩,然后对比当前学生的成绩,如果所有课程成绩都高于平均值,则返回该学生的信息。 4. 查询选修了至少两门课程的学生的学号和姓名: ...
- `EXISTS` 和 `NOT EXISTS` 子查询仅需要找到一条满足条件的记录即返回结果,而 `IN` 和 `NOT IN` 需要遍历整个子查询的结果集。 - 当子查询结果集很大时,`EXISTS` 和 `NOT EXISTS` 的效率更高。 - **示例**: ...
- 结合使用`IN`和`NOT IN`可以精确控制查询条件,如`SELECT * FROM tb_name WHERE id IN (10,12,15,16) AND NOT id IN (21,22,23)`,这将返回id在第一个列表但不在第二个列表的记录。 2. `EXISTS/NOT EXISTS`: -...
- **`NOT EXISTS`**:与`NOT IN`相比,`NOT EXISTS`在某些情况下能提供更好的性能,因为它在找到第一个不满足条件的记录后即可终止查询,避免了不必要的额外计算。 #### 使用`IN`列表与`OR`条件表达式 在某些情况...
若必须使用,考虑使用EXISTS或NOT EXISTS子句替代,如在查找子表中不存在于主表的记录时,使用EXISTS通常比NOT IN更高效。 在字段设计上,应优先考虑数字型字段,因为它们在比较时的效率更高。字符型字段的比较需要...
4. 练习使用EXISTS和NOT EXISTS,理解它们在实际问题中的应用。 5. 探索分析子查询,学习如何处理时间序列数据或其他需要排名和分组的场景。 6. 实验不同集合操作符与子查询的组合,观察它们对结果的影响。 通过...
4. **EXISTS和NOT EXISTS**:`EXISTS`检查子查询是否返回至少一行数据,`NOT EXISTS`则检查是否返回任何数据。例如,`EXISTS (select * from scott.emp where ename='WARD')`会返回真(TRUE),只要有一个名为WARD的...