Some usage notes on ElasticSearch Aggs

 

The following code covers multi-level aggregations and aggregations on nested fields. Source: https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search/aggregations/bucket/NestedTests.java

 

/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
package org.elasticsearch.search.aggregations.bucket;

import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.search.aggregations.Aggregator.SubAggCollectionMode;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
import org.elasticsearch.search.aggregations.bucket.nested.Nested;
import org.elasticsearch.search.aggregations.bucket.terms.LongTerms;
import org.elasticsearch.search.aggregations.bucket.terms.StringTerms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket;
import org.elasticsearch.search.aggregations.metrics.max.Max;
import org.elasticsearch.search.aggregations.metrics.stats.Stats;
import org.elasticsearch.search.aggregations.metrics.sum.Sum;
import org.elasticsearch.test.ElasticsearchIntegrationTest;
import org.hamcrest.Matchers;
import org.junit.Test;

import java.util.ArrayList;
import java.util.List;

import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;
import static org.elasticsearch.search.aggregations.AggregationBuilders.*;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.is;
import static org.hamcrest.core.IsNull.notNullValue;

/**
 *
 */
@ElasticsearchIntegrationTest.SuiteScopeTest
public class NestedTests extends ElasticsearchIntegrationTest {

    static int numParents;
    static int[] numChildren;
    static SubAggCollectionMode aggCollectionMode;

    @Override
    public void setupSuiteScopeCluster() throws Exception {

        assertAcked(prepareCreate("idx")
                .addMapping("type", "nested", "type=nested"));

        List<IndexRequestBuilder> builders = new ArrayList<>();

        numParents = randomIntBetween(3, 10);
        numChildren = new int[numParents];
        aggCollectionMode = randomFrom(SubAggCollectionMode.values());
        logger.info("AGG COLLECTION MODE: " + aggCollectionMode);
        int totalChildren = 0;
        for (int i = 0; i < numParents; ++i) {
            if (i == numParents - 1 && totalChildren == 0) {
                // we need at least one child overall
                numChildren[i] = randomIntBetween(1, 5);
            } else {
                numChildren[i] = randomInt(5);
            }
            totalChildren += numChildren[i];
        }
        assertTrue(totalChildren > 0);

        for (int i = 0; i < numParents; i++) {
            XContentBuilder source = jsonBuilder()
                    .startObject()
                    .field("value", i + 1)
                    .startArray("nested");
            for (int j = 0; j < numChildren[i]; ++j) {
                source = source.startObject().field("value", i + 1 + j).endObject();
            }
            source = source.endArray().endObject();
            builders.add(client().prepareIndex("idx", "type", ""+i+1).setSource(source));
        }

        prepareCreate("empty_bucket_idx").addMapping("type", "value", "type=integer", "nested", "type=nested").execute().actionGet();
        for (int i = 0; i < 2; i++) {
            builders.add(client().prepareIndex("empty_bucket_idx", "type", ""+i).setSource(jsonBuilder()
                    .startObject()
                    .field("value", i*2)
                    .startArray("nested")
                    .startObject().field("value", i + 1).endObject()
                    .startObject().field("value", i + 2).endObject()
                    .startObject().field("value", i + 3).endObject()
                    .startObject().field("value", i + 4).endObject()
                    .startObject().field("value", i + 5).endObject()
                    .endArray()
                    .endObject()));
        }

        assertAcked(prepareCreate("idx_nested_nested_aggs")
                .addMapping("type", jsonBuilder().startObject().startObject("type").startObject("properties")
                        .startObject("nested1")
                            .field("type", "nested")
                            .startObject("properties")
                                .startObject("nested2")
                                    .field("type", "nested")
                                .endObject()
                            .endObject()
                        .endObject()
                        .endObject().endObject().endObject()));

        builders.add(
                client().prepareIndex("idx_nested_nested_aggs", "type", "1")
                        .setSource(jsonBuilder().startObject()
                                .startArray("nested1")
                                    .startObject()
                                    .field("a", "a")
                                        .startArray("nested2")
                                            .startObject()
                                                .field("b", 2)
                                            .endObject()
                                        .endArray()
                                    .endObject()
                                    .startObject()
                                        .field("a", "b")
                                        .startArray("nested2")
                                            .startObject()
                                                .field("b", 2)
                                            .endObject()
                                        .endArray()
                                    .endObject()
                                .endArray()
                            .endObject())
        );
        indexRandom(true, builders);
        ensureSearchable();
    }

    @Test
    public void simple() throws Exception {
        SearchResponse response = client().prepareSearch("idx")
                .addAggregation(nested("nested").path("nested")
                        .subAggregation(stats("nested_value_stats").field("nested.value")))
                .execute().actionGet();

        assertSearchResponse(response);


        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        long sum = 0;
        long count = 0;
        for (int i = 0; i < numParents; ++i) {
            for (int j = 0; j < numChildren[i]; ++j) {
                final long value = i + 1 + j;
                min = Math.min(min, value);
                max = Math.max(max, value);
                sum += value;
                ++count;
            }
        }

        Nested nested = response.getAggregations().get("nested");
        assertThat(nested, notNullValue());
        assertThat(nested.getName(), equalTo("nested"));
        assertThat(nested.getDocCount(), equalTo(count));
        assertThat(nested.getAggregations().asList().isEmpty(), is(false));

        Stats stats = nested.getAggregations().get("nested_value_stats");
        assertThat(stats, notNullValue());
        assertThat(stats.getMin(), equalTo(min));
        assertThat(stats.getMax(), equalTo(max));
        assertThat(stats.getCount(), equalTo(count));
        assertThat(stats.getSum(), equalTo((double) sum));
        assertThat(stats.getAvg(), equalTo((double) sum / count));
    }

    @Test
    public void onNonNestedField() throws Exception {
        try {
            client().prepareSearch("idx")
                    .addAggregation(nested("nested").path("value")
                            .subAggregation(stats("nested_value_stats").field("nested.value")))
                    .execute().actionGet();

            fail("expected execution to fail - an attempt to nested facet on non-nested field/path");

        } catch (ElasticsearchException ese) {
        }
    }

    @Test
    public void nestedWithSubTermsAgg() throws Exception {
        SearchResponse response = client().prepareSearch("idx")
                .addAggregation(nested("nested").path("nested")
                        .subAggregation(terms("values").field("nested.value").size(100)
                                .collectMode(aggCollectionMode)))
                .execute().actionGet();

        assertSearchResponse(response);


        long docCount = 0;
        long[] counts = new long[numParents + 6];
        for (int i = 0; i < numParents; ++i) {
            for (int j = 0; j < numChildren[i]; ++j) {
                final int value = i + 1 + j;
                ++counts[value];
                ++docCount;
            }
        }
        int uniqueValues = 0;
        for (long count : counts) {
            if (count > 0) {
                ++uniqueValues;
            }
        }

        Nested nested = response.getAggregations().get("nested");
        assertThat(nested, notNullValue());
        assertThat(nested.getName(), equalTo("nested"));
        assertThat(nested.getDocCount(), equalTo(docCount));
        assertThat(nested.getAggregations().asList().isEmpty(), is(false));

        LongTerms values = nested.getAggregations().get("values");
        assertThat(values, notNullValue());
        assertThat(values.getName(), equalTo("values"));
        assertThat(values.getBuckets(), notNullValue());
        assertThat(values.getBuckets().size(), equalTo(uniqueValues));
        for (int i = 0; i < counts.length; ++i) {
            final String key = Long.toString(i);
            if (counts[i] == 0) {
                assertNull(values.getBucketByKey(key));
            } else {
                Bucket bucket = values.getBucketByKey(key);
                assertNotNull(bucket);
                assertEquals(counts[i], bucket.getDocCount());
            }
        }
    }

    @Test
    public void nestedAsSubAggregation() throws Exception {
        SearchResponse response = client().prepareSearch("idx")
                .addAggregation(terms("top_values").field("value").size(100)
                        .collectMode(aggCollectionMode)
                        .subAggregation(nested("nested").path("nested")
                                .subAggregation(max("max_value").field("nested.value"))))
                .execute().actionGet();

        assertSearchResponse(response);


        LongTerms values = response.getAggregations().get("top_values");
        assertThat(values, notNullValue());
        assertThat(values.getName(), equalTo("top_values"));
        assertThat(values.getBuckets(), notNullValue());
        assertThat(values.getBuckets().size(), equalTo(numParents));

        for (int i = 0; i < numParents; i++) {
            String topValue = "" + (i + 1);
            assertThat(values.getBucketByKey(topValue), notNullValue());
            Nested nested = values.getBucketByKey(topValue).getAggregations().get("nested");
            assertThat(nested, notNullValue());
            Max max = nested.getAggregations().get("max_value");
            assertThat(max, notNullValue());
            assertThat(max.getValue(), equalTo(numChildren[i] == 0 ? Double.NEGATIVE_INFINITY : (double) i + numChildren[i]));
        }
    }

    @Test
    public void nestNestedAggs() throws Exception {
        SearchResponse response = client().prepareSearch("idx_nested_nested_aggs")
                .addAggregation(nested("level1").path("nested1")
                        .subAggregation(terms("a").field("nested1.a")
                                .collectMode(aggCollectionMode)
                                .subAggregation(nested("level2").path("nested1.nested2")
                                        .subAggregation(sum("sum").field("nested1.nested2.b")))))
                .get();
        assertSearchResponse(response);


        Nested level1 = response.getAggregations().get("level1");
        assertThat(level1, notNullValue());
        assertThat(level1.getName(), equalTo("level1"));
        assertThat(level1.getDocCount(), equalTo(2l));

        StringTerms a = level1.getAggregations().get("a");
        Terms.Bucket bBucket = a.getBucketByKey("a");
        assertThat(bBucket.getDocCount(), equalTo(1l));

        Nested level2 = bBucket.getAggregations().get("level2");
        assertThat(level2.getDocCount(), equalTo(1l));
        Sum sum = level2.getAggregations().get("sum");
        assertThat(sum.getValue(), equalTo(2d));

        a = level1.getAggregations().get("a");
        bBucket = a.getBucketByKey("b");
        assertThat(bBucket.getDocCount(), equalTo(1l));

        level2 = bBucket.getAggregations().get("level2");
        assertThat(level2.getDocCount(), equalTo(1l));
        sum = level2.getAggregations().get("sum");
        assertThat(sum.getValue(), equalTo(2d));
    }


    @Test
    public void emptyAggregation() throws Exception {
        SearchResponse searchResponse = client().prepareSearch("empty_bucket_idx")
                .setQuery(matchAllQuery())
                .addAggregation(histogram("histo").field("value").interval(1l).minDocCount(0)
                        .subAggregation(nested("nested").path("nested")))
                .execute().actionGet();

        assertThat(searchResponse.getHits().getTotalHits(), equalTo(2l));
        Histogram histo = searchResponse.getAggregations().get("histo");
        assertThat(histo, Matchers.notNullValue());
        Histogram.Bucket bucket = histo.getBucketByKey(1l);
        assertThat(bucket, Matchers.notNullValue());

        Nested nested = bucket.getAggregations().get("nested");
        assertThat(nested, Matchers.notNullValue());
        assertThat(nested.getName(), equalTo("nested"));
        assertThat(nested.getDocCount(), is(0l));
    }
}

 

The code above comes from that link; what follows is my own application of it.

public static Map<String, Object> GetRegionInfo(Client client,
        RequestSignal requestSignal, Set<String> set) {

    Map<String, Object> result = new HashMap<String, Object>();

    // level1: nested aggregation, scoped to the nested field "nna_regions"
    AggregationBuilder aggs1 = AggregationBuilders.nested("level1").path(
            "nna_regions");
    // level2: filter aggregation evaluated inside the nested scope
    AggregationBuilder aggs2 = AggregationBuilders.filter("level2").filter(
            ConstValue.AGGS_FILTERBUILDER);
    // level3: terms aggregation on the region name
    AggregationBuilder aggs3 = AggregationBuilders.terms("level3")
            .field("nna_regions.sca_region").size(5000);
    // level4: sum of the score inside each region bucket
    SumBuilder aggs4 = AggregationBuilders.sum("level4").field(
            "nna_regions.dna_score");

    // search type "count": skip fetching hits, return only the aggregations
    SearchResponse response = client
            .prepareSearch("flume-*-content-*")
            .setQuery(ConstValue.queryBuilder_statAction(requestSignal))
            .setSearchType("count")
            .addAggregation(
                    aggs1.subAggregation(aggs2.subAggregation(aggs3
                            .subAggregation(aggs4)))).execute().actionGet();

    // unwrap the aggregation tree in the same order it was built
    Nested level1 = response.getAggregations().get("level1");
    Filter level2 = level1.getAggregations().get("level2");

    Terms level3 = level2.getAggregations().get("level3");
    Collection<Terms.Bucket> collection = level3.getBuckets();

    for (Terms.Bucket bucket : collection) {
        String region = bucket.getKey();
        long count = bucket.getDocCount();
        double score = 0;
        if (set.contains(region)) {
            Sum sum = bucket.getAggregations().get("level4");

            if (sum == null) {
                System.out.println("null");
            } else {
                score = sum.getValue();
            }
            Map<String, Object> tmp = new HashMap<String, Object>();
            tmp.put("count", count);
            tmp.put("score", score);
            result.put(region, tmp);
        }
    }
    return result;
}

 

aggs1 is the nested aggregation: its path names the nested field (nna_regions), and the sub-aggregations chained under it are evaluated against those nested documents, so the response has to be unwrapped level by level (Nested → Filter → Terms → Sum).
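Below is a minimal sketch of the same pattern with the intermediate filter level dropped. It assumes the ES 1.x Java client API and reuses the index and field names from the method above; the aggregation names regions, by_region and score are arbitrary.

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.nested.Nested;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.metrics.sum.Sum;

public class NestedRegionSketch {

    // Sketch: nested -> terms -> sum, iterating every region bucket.
    public static void printRegionScores(Client client) {
        SearchResponse response = client.prepareSearch("flume-*-content-*")
                .setSearchType("count") // aggregations only, no hits
                .addAggregation(AggregationBuilders.nested("regions").path("nna_regions")
                        .subAggregation(AggregationBuilders.terms("by_region")
                                .field("nna_regions.sca_region").size(5000)
                                .subAggregation(AggregationBuilders.sum("score")
                                        .field("nna_regions.dna_score"))))
                .execute().actionGet();

        // unwrap in the same order the tree was built
        Nested regions = response.getAggregations().get("regions");
        Terms byRegion = regions.getAggregations().get("by_region");
        for (Terms.Bucket bucket : byRegion.getBuckets()) {
            Sum score = bucket.getAggregations().get("score");
            System.out.printf("%s -> docs=%d, score=%.2f%n",
                    bucket.getKey(), bucket.getDocCount(), score.getValue());
        }
    }
}

With a filter level in between, you would simply fetch the Filter aggregation first and read the Terms from it, exactly as GetRegionInfo does.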

The remaining code shows how to read values from a date field: a terms aggregation on inp_type with a date_histogram sub-aggregation on tfp_save_time.

private String statRequest(Client esClient) {
    // filtered query: match_all restricted to tfp_save_time in [begTime, endTime]
    FilteredQueryBuilder builder = QueryBuilders.filteredQuery(
            QueryBuilders.matchAllQuery(),
            FilterBuilders.rangeFilter("tfp_save_time").from(begTime)
                    .to(endTime).includeLower(true).includeUpper(true));

    // outer bucket: one terms bucket per inp_type value
    AggregationBuilder aggs1 = AggregationBuilders.terms("inp_type").field(
            "inp_type");

    // inner bucket: date histogram over tfp_save_time, keys formatted as text
    AggregationBuilder aggs = AggregationBuilders.dateHistogram("By_Date")
            .field("tfp_save_time").format("yyyy-MM-dd HH:mm:ss")
            .extendedBounds(begTime, endTime).interval(statType);

    SearchResponse response = esClient.prepareSearch("flume-*-content*")
            .setQuery(builder).setSearchType("count")
            .addAggregation(aggs1.subAggregation(aggs)).execute()
            .actionGet();

    Terms terms = response.getAggregations().get("inp_type");
    Collection<Terms.Bucket> inp_type = terms.getBuckets();
    Iterator<Bucket> inp_type_It = inp_type.iterator();

    StatResult statResult = new StatResult(); // accumulated result

    while (inp_type_It.hasNext()) {
        // per-type map of "formatted date -> document count"
        HashMap<String, Integer> test = new HashMap<String, Integer>();
        Bucket inpBucket = inp_type_It.next();

        DateHistogram dateHistogram = (DateHistogram) inpBucket
                .getAggregations().get("By_Date");
        Collection<DateHistogram.Bucket> by_date = (Collection<DateHistogram.Bucket>) dateHistogram
                .getBuckets();

        Iterator<DateHistogram.Bucket> by_date_It = by_date.iterator();

        while (by_date_It.hasNext()) {
            DateHistogram.Bucket bucket = by_date_It.next();

            int count = (int) bucket.getDocCount();
            String newdate = postDate(bucket.getKey());

            test.put(newdate, count);
        }
        if (!test.isEmpty()) {
            statResult.add(inpBucket.getKey(), test);
        }
    }
    return statResult.toString();
}
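StatResult and postDate are project-specific helpers that are not shown here. If a plain map of type → (date → count) is enough, the same traversal can be written with for-each loops; the sketch below assumes the same ES 1.x aggregation classes used above.

import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogram;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;

public class DateHistogramSketch {

    // Flattens a terms("inp_type") -> dateHistogram("By_Date") result into
    // a map of type -> (date key -> document count).
    public static Map<String, Map<String, Long>> toCountsByType(Terms terms) {
        Map<String, Map<String, Long>> countsByType = new HashMap<String, Map<String, Long>>();
        for (Terms.Bucket typeBucket : terms.getBuckets()) {
            DateHistogram byDate = typeBucket.getAggregations().get("By_Date");
            Map<String, Long> counts = new HashMap<String, Long>();
            for (DateHistogram.Bucket dateBucket : byDate.getBuckets()) {
                // getKey() is the bucket key as text; with format(...) set on the
                // builder it should already be the formatted date string
                counts.put(dateBucket.getKey(), dateBucket.getDocCount());
            }
            if (!counts.isEmpty()) {
                countsByType.put(typeBucket.getKey(), counts);
            }
        }
        return countsByType;
    }
}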

 

 

Reference: http://www.cnblogs.com/wmx3ng/p/4096641.html
