`

String Aggregation Techniques

阅读更多

String Aggregation Techniques

On occasion it is necessary to aggregate data from a number of rows into a single row, giving a list of data associated with a specific value. Using the SCOTT.EMP table as an example, we might want to retrieve a list of employees for each department. Below is a list of the base data and the type of output we would like to return from an aggregate query.

Base Data:

    DEPTNO ENAME
---------- ----------
        20 SMITH
        30 ALLEN
        30 WARD
        20 JONES
        30 MARTIN
        30 BLAKE
        10 CLARK
        20 SCOTT
        10 KING
        30 TURNER
        20 ADAMS
        30 JAMES
        20 FORD
        10 MILLER

Desired Output:

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

This article is based on a thread from asktom.oracle.com and contains several methods to achieve the desired results.

LISTAGG Analystic Function in 11g Release 2

The LISTAGG analytic function was introduced in Oracle 11g Release 2, making it very easy to aggregate strings. The nice thing about this function is it also allows us to order the elements in the concatenated list.

COLUMN employees FORMAT A50

SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM   emp
GROUP BY deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

3 rows selected.

WM_CONCAT Built-in Function

If you are not running 11g Release 2, but are running a version of the database where the WM_CONCAT function is present, then it is a zero effort solution as it performs the aggregation for you.

COLUMN employees FORMAT A50

SELECT deptno, wm_concat(ename) AS employees
FROM   emp
GROUP BY deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

3 rows selected.

If is actually an example of a user defined aggregate function described below, but Oracle have done all the work for you.

Specific Function

One approach is to write a specific function to solve the problems. The get_employees function listed below returns a list of employees for the specified department.

CREATE OR REPLACE FUNCTION get_employees (p_deptno  in  emp.deptno%TYPE)
  RETURN VARCHAR2
IS
  l_text  VARCHAR2(32767) := NULL;
BEGIN
  FOR cur_rec IN (SELECT ename FROM emp WHERE deptno = p_deptno) LOOP
    l_text := l_text || ',' || cur_rec.ename;
  END LOOP;
  RETURN LTRIM(l_text, ',');
END;
/
SHOW ERRORS

The function can then be incorporated into a query as follows.

COLUMN employees FORMAT A50

SELECT deptno,
       get_employees(deptno) AS employees
FROM   emp
GROUP by deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

3 rows selected.

To reduce the number of calls to the function, and thereby improve performance, we might want to filter the rows in advance.

COLUMN employees FORMAT A50

SELECT e.deptno,
       get_employees(e.deptno) AS employees
FROM   (SELECT DISTINCT deptno
        FROM   emp) e;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
        
3 rows selected.

Generic Function using Ref Cursor

An alternative approach is to write a function to concatenate values passed using a ref cursor. This is essentially the same as the previous example, except that the cursor is passed in making it generic, as shown below.

CREATE OR REPLACE FUNCTION concatenate_list (p_cursor IN  SYS_REFCURSOR)
  RETURN  VARCHAR2
IS
  l_return  VARCHAR2(32767); 
  l_temp    VARCHAR2(32767);
BEGIN
  LOOP
    FETCH p_cursor
    INTO  l_temp;
    EXIT WHEN p_cursor%NOTFOUND;
    l_return := l_return || ',' || l_temp;
  END LOOP;
  RETURN LTRIM(l_return, ',');
END;
/
SHOW ERRORS

The CURSOR function is used to allow a query to be passed to the function as a ref cursor, as shown below.

COLUMN employees FORMAT A50

SELECT e1.deptno,
       concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM   emp e1
GROUP BY e1.deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

3 rows selected.

Once again, the total number of function calls can be reduced by filtering the distinct values, rather than calling the function for each row.

COLUMN employees FORMAT A50

SELECT deptno,
       concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM   (SELECT DISTINCT deptno
        FROM emp) e1;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

3 rows selected.

User-Defined Aggregate Function

The WM_CONCAT function described above is an example of a user-defined aggregate function that Oracle have already created for you. If for some reason you don't want to use the built in WM_CONCAT, you can create your own user-defined aggregate function as described at asktom.oracle.com.

CREATE OR REPLACE TYPE t_string_agg AS OBJECT
(
  g_string  VARCHAR2(32767),

  STATIC FUNCTION ODCIAggregateInitialize(sctx  IN OUT  t_string_agg)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateIterate(self   IN OUT  t_string_agg,
                                       value  IN      VARCHAR2 )
     RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateTerminate(self         IN   t_string_agg,
                                         returnValue  OUT  VARCHAR2,
                                         flags        IN   NUMBER)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateMerge(self  IN OUT  t_string_agg,
                                     ctx2  IN      t_string_agg)
    RETURN NUMBER
);
/
SHOW ERRORS


CREATE OR REPLACE TYPE BODY t_string_agg IS
  STATIC FUNCTION ODCIAggregateInitialize(sctx  IN OUT  t_string_agg)
    RETURN NUMBER IS
  BEGIN
    sctx := t_string_agg(NULL);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateIterate(self   IN OUT  t_string_agg,
                                       value  IN      VARCHAR2 )
    RETURN NUMBER IS
  BEGIN
    SELF.g_string := self.g_string || ',' || value;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateTerminate(self         IN   t_string_agg,
                                         returnValue  OUT  VARCHAR2,
                                         flags        IN   NUMBER)
    RETURN NUMBER IS
  BEGIN
    returnValue := RTRIM(LTRIM(SELF.g_string, ','), ',');
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateMerge(self  IN OUT  t_string_agg,
                                     ctx2  IN      t_string_agg)
    RETURN NUMBER IS
  BEGIN
    SELF.g_string := SELF.g_string || ',' || ctx2.g_string;
    RETURN ODCIConst.Success;
  END;
END;
/
SHOW ERRORS


CREATE OR REPLACE FUNCTION string_agg (p_input VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE AGGREGATE USING t_string_agg;
/
SHOW ERRORS

The aggregate function is implemented using a type and type body, and is used within a query.

COLUMN employees FORMAT A50

SELECT deptno, string_agg(ename) AS employees
FROM   emp
GROUP BY deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

3 rows selected.

ROW_NUMBER() and SYS_CONNECT_BY_PATH functions in Oracle 9i

An example on williamrobertson.net uses the ROW_NUMBER() and SYS_CONNECT_BY_PATH functions to achieve the same result without the use of PL/SQL or additional type definitions.

SELECT deptno,
       LTRIM(MAX(SYS_CONNECT_BY_PATH(ename,','))
       KEEP (DENSE_RANK LAST ORDER BY curr),',') AS employees
FROM   (SELECT deptno,
               ename,
               ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) AS curr,
               ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) -1 AS prev
        FROM   emp)
GROUP BY deptno
CONNECT BY prev = PRIOR curr AND deptno = PRIOR deptno
START WITH curr = 1;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

3 rows selected.

COLLECT function in Oracle 10g

An example on oracle-developer.net uses the COLLECT function in Oracle 10g to get the same result. This method requires a table type and a function to convert the contents of the table type to a string. I've altered his method slightly to bring it in line with this article.

CREATE OR REPLACE TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000);
/

CREATE OR REPLACE FUNCTION tab_to_string (p_varchar2_tab  IN  t_varchar2_tab,
                                          p_delimiter     IN  VARCHAR2 DEFAULT ',') RETURN VARCHAR2 IS
  l_string     VARCHAR2(32767);
BEGIN
  FOR i IN p_varchar2_tab.FIRST .. p_varchar2_tab.LAST LOOP
    IF i != p_varchar2_tab.FIRST THEN
      l_string := l_string || p_delimiter;
    END IF;
    l_string := l_string || p_varchar2_tab(i);
  END LOOP;
  RETURN l_string;
END tab_to_string;
/

The query below shows the COLLECT function in action.

COLUMN employees FORMAT A50

SELECT deptno,
       tab_to_string(CAST(COLLECT(ename) AS t_varchar2_tab)) AS employees
FROM   emp
GROUP BY deptno;
       
    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
        
3 rows selected.

For more information see:

分享到:
评论

相关推荐

    aggregation and disaggregation techniques and methodology in optimization

    在IT领域,尤其是在优化模型的研究与应用中,聚合(Aggregation)与解聚合(Disaggregation)技术及其方法论扮演着至关重要的角色。这些技术旨在处理数据与决策模型之间的复杂关系,平衡模型的细节程度与解决模型的...

    Aggregation数据集

    Aggregation Aggregation Aggregation 适合聚类使用!有多类簇

    Association, Aggregation and Composition的区别

    Association, Aggregation and Composition 的区别 在 Java 应用程序开发中,理解 UML 类图元素及其与 Java 的映射关系是非常重要的。其中,Association、Aggregation 和 Composition 是三个常见的 UML 元素,它们...

    机器学习技法课之Aggregation模型

    ### 机器学习技法课之Aggregation模型 #### 一、Aggregation模型概述 Aggregation模型是一种重要的机器学习技术,主要用于组合多个弱学习器以形成一个更强大的预测模型。这一技术的核心在于通过各种策略来整合不同...

    Map overlay and spatial aggregation in sp

    Edzer Pebesma (2018). Numerical “map overlay” ...spatial data, so that any spatial properties can be used to define an aggregation predicate, and any R function can be used as aggregation function.

    Unsupervised Part-based Weighting Aggregation

    Unsupervised Part-based Weighting Aggregation of Deep Convolutional Features for Image Retrieval.pdf

    GXT组件使用教程4——Aggregation Grid

    在本文中,我们将深入探讨GXT组件的使用,特别是关于Aggregation Grid的教程。GXT是Sencha提供的一款强大的JavaScript库,专为构建企业级Web应用程序而设计,它提供了丰富的UI组件和数据网格功能。Aggregation Grid...

    Cross-Scale Cost Aggregation Code

    《跨尺度成本聚合码》(Cross-Scale Cost Aggregation Code)是实现立体匹配算法CSCA(Cross-Scale Cost Aggregation)的一种软件实现。立体匹配是计算机视觉领域的一个重要研究方向,其目标是通过分析图像对来计算...

    tcp.tar.gz_aggregation_aggregation in ns2_ns2_ns2 aggregation_tc

    标题"tcp.tar.gz_aggregation_aggregation in ns2_ns2_ns2 aggregation_tc"暗示了我们即将探讨的是NS2中TCP协议与802.11 MAC层的聚合(aggregation)功能,以及可能涉及的TCP分片重组(fragment retransmission)...

    基于模版实现图像矩阵的aggregation

    使用matlab自己编辑的基于模版大小实现均值滤波,并且将模版均值赋予新的矩阵,生成灰度图像。

    HP Auto Port Aggregation (APA) Release note

    ### HP Auto Port Aggregation (APA) Release Note #### HP APA Overview for HP-UX 11iv1 **Auto Port Aggregation (APA):** Auto Port Aggregation is a feature designed to provide high-speed, fault-...

    EventAggregation.dll

    EventAggregation

    carrier aggregation LTE-Advanced lab

    LTE-Advanced技术中的载波聚合(Carrier Aggregation)是4G网络技术中的一项重要进步,它极大地提高了数据传输速率,改善了频谱效率,特别适合于高密度的用户部署环境。载波聚合技术的核心在于将多个载波(频段)...

    AdaBatch: Efficient Gradient Aggregation.pdf

    We study a new aggregation operator for gradients coming from a mini-batch for stochastic gradient (SG) methods that allows a significant speed-up in the case of sparse optimization problems. We ...

    mongodb aggregation运用例子

    ### 聚合管道(Aggregation Pipeline) 聚合管道是一个多层次的处理过程,它将数据文档转换为聚合结果。它由多个阶段组成,每个阶段都按照声明的顺序进行操作。聚合管道中常见的一些操作包括: - **$match**: 过滤...

    Aggregation and adhesion properties of 22 Lactobacillus strains

    "Aggregation and adhesion properties of 22 Lactobacillus strains" 这篇文章主要探讨了22株不同物种的乳杆菌的自聚性、共聚性、疏水性和粘附性等特性,这些特性是衡量乳杆菌作为潜在益生菌的重要指标。...

    IEEE Std 802.1AX-2008 Link Aggregation

    链路聚合,或称LAG(Link Aggregation Group),是一种将多个物理连接组合成一个逻辑链路的技术,使得媒体访问控制(MAC)客户端可以将链路聚合组视为单一的高带宽链接。这种技术的核心是通过N个并行的全双工点对点...

    LTE-WLAN aggregation [Industry Perspectives]

    LWA offers many additional benefts such as common control of 3GPP and Wi-Fi resources, operator controlled WLAN mobility, and fast RAN-based WLAN authentication. Importantly, Release 13 LWA can be ...

    扩散限制凝聚算法(Diffusion-limited Aggregation,DLA)代码

    扩散限制凝聚(Diffusion-Limited Aggregation,DLA)是一种模拟自然界中自组织过程的算法,常用于研究粒子聚集现象。在DLA模型中,随机行走的粒子在到达已聚集的粒子群时会粘附上去,形成复杂的分支结构。这个过程...

Global site tag (gtag.js) - Google Analytics