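Result grouping groups documents that share a common field value and returns the top documents for each group. For example, a search on Best Buy for a common term such as "DVD" might show the top 3 results for each category ("TVs & Video", "Movies", "Computers", and so on).
Now let's turn on result grouping and issue a query. As a first attempt, we group by manufacturer name (the manu_exact field).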
"groupValue":"Apache Software Foundation",
"name":"Solr, the Enterprise Search Server"}]
"groupValue":"Corsair Microsystems Inc.",
"name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail"}]
"groupValue":"A-DATA Technology Inc.",
"name":"A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM"}]
"groupValue":"Canon Inc.",
"name":"Canon PIXMA MP500 All-In-One Photo Printer"}]
"groupValue":"ASUS Computer Inc.",
"name":"ASUS Extreme N7800GTX/2DHTV (256 MB)"}]
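The response shows 6 matches for our query. For each unique value of group.field, a doclist containing the top-scoring document is returned. Each doclist also reports the total number of matches in that group as "numFound". The groups themselves are ordered by the score of the top document in each group.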
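The grouping behaviour described above can be sketched in plain Java. This is only an illustration of the semantics, not how Solr implements grouping internally; the `Hit` and `Group` classes, the vendor names, and the scores are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupingSketch {
    /** Hypothetical scored hit: grouping field value, display name, relevance score. */
    public static final class Hit {
        public final String manu;
        public final String name;
        public final double score;
        public Hit(String manu, String name, double score) {
            this.manu = manu;
            this.name = name;
            this.score = score;
        }
    }

    /** One group: its value, the total matches in the group, and its top documents. */
    public static final class Group {
        public final String groupValue;
        public final int numFound;
        public final List<Hit> docs;
        public Group(String groupValue, int numFound, List<Hit> docs) {
            this.groupValue = groupValue;
            this.numFound = numFound;
            this.docs = docs;
        }
    }

    /** Groups hits by manu, keeps the top `limit` hits per group, orders groups by best score. */
    public static List<Group> group(List<Hit> hits, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        // Bucket hits by the grouping field value.
        Map<String, List<Hit>> buckets = new LinkedHashMap<>();
        for (Hit h : hits) {
            buckets.computeIfAbsent(h.manu, k -> new ArrayList<>()).add(h);
        }
        List<Group> groups = new ArrayList<>();
        for (Map.Entry<String, List<Hit>> e : buckets.entrySet()) {
            // Within a group, the highest-scoring documents come first.
            List<Hit> sorted = new ArrayList<>(e.getValue());
            sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            int n = Math.min(limit, sorted.size());
            groups.add(new Group(e.getKey(), sorted.size(), new ArrayList<>(sorted.subList(0, n))));
        }
        // Groups are ordered by the score of their top document.
        groups.sort(Comparator.comparingDouble((Group g) -> g.docs.get(0).score).reversed());
        return groups;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("Vendor A", "doc 1", 0.4),
                new Hit("Vendor B", "doc 2", 0.9),
                new Hit("Vendor A", "doc 3", 0.7));
        for (Group g : group(hits, 1)) {
            System.out.println(g.groupValue + " numFound=" + g.numFound + " top=" + g.docs.get(0).name);
        }
    }
}
```

With a limit of 1, each group carries a single document while `numFound` still counts every match in the group, which mirrors the default group.limit of 1.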
...&q=memory&group=true&group.query=price:[0 TO 99.99]&group.query=price:[100 TO *]&group.limit=3
"price:[0 TO 99.99]":{
"name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail",
"price:[100 TO *]":{
"name":"Canon PIXMA MP500 All-In-One Photo Printer",
"name":"CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail",
"name":"ASUS Extreme N7800GTX/2DHTV (256 MB)",
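The response above shows that the query "memory" matches 5 documents in total. Of those, 1 has a price below $100 and 3 have a price of $100 or more. The group counts do not add up to 5 because one document has no price and therefore matches neither group.query.
A group command's result can also be used as the "main result" by adding the parameter group.main=true. Although this result format carries less information, it may be easier for existing Solr clients to parse.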
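Why the per-query counts can fall short of the total can be sketched as follows. The prices below are hypothetical placeholders (a `null` stands for a document without a price field), and the range checks only approximate the two group.query ranges from the request above; this is not Solr code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class GroupQuerySketch {
    /** Counts how many prices fall into each range; null means "document has no price". */
    public static Map<String, Integer> countPerQuery(List<Double> prices) {
        Map<String, Predicate<Double>> queries = new LinkedHashMap<>();
        queries.put("price:[0 TO 99.99]", p -> p != null && p >= 0 && p <= 99.99);
        queries.put("price:[100 TO *]", p -> p != null && p >= 100);

        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, Predicate<Double>> q : queries.entrySet()) {
            int n = 0;
            for (Double p : prices) {
                if (q.getValue().test(p)) {
                    n++;
                }
            }
            counts.put(q.getKey(), n);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Five hypothetical matching documents, one of them without a price.
        List<Double> prices = Arrays.asList(50.0, 150.0, 200.0, 450.0, null);
        System.out.println(countPerQuery(prices));
    }
}
```

The document without a price satisfies neither range, so the two groups together account for only 4 of the 5 matches.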
"name":"Solr, the Enterprise Search Server",
"manu":"Apache Software Foundation"},
"name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail",
"manu":"Corsair Microsystems Inc."},
"name":"A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM",
"manu":"A-DATA Technology Inc."},
"name":"Canon PIXMA MP500 All-In-One Photo Printer",
"manu":"Canon Inc."},
"name":"ASUS Extreme N7800GTX/2DHTV (256 MB)",
"manu":"ASUS Computer Inc."}]
Parameter | Value | Description |
group | true/false | If true, turns on result grouping. |
group.field | [fieldname] | Group based on the unique values of a field. The field must currently be single-valued and must be either indexed, or be another field type that has a value source and works in a function query, such as ExternalFileField. Note: for Solr 3.x versions the field must be a string-like field such as StrField or TextField, otherwise an HTTP 400 status is returned. |
group.func | [function query] | |
group.query | [query] | Return a single group of documents that also match the given query. |
rows | [number] | Used for paging: the number of groups to return. Defaults to 10. |
start | [number] | The offset into the list of groups. |
group.limit | [number] | The number of documents to return for each group. Defaults to 1. |
group.offset | [number] | The offset into the document list of each group. |
sort | [sortspec] | How to sort the groups relative to each other. For example, sort=popularity desc will cause the groups to be sorted according to the highest popularity doc in each group. Defaults to "score desc". |
group.sort | [sortspec] | How to sort documents within a single group. Defaults to the same value as the sort parameter. |
group.format | grouped/simple | If simple, the grouped documents are presented in a single flat list. The start and rows parameters then refer to numbers of documents instead of numbers of groups. |
group.main | true/false | If true, the result of the last field grouping command is used as the main result list in the response, using group.format=simple. |
group.ngroups | true/false | |
group.truncate | true/false | |
group.facet | true/false | |
group.cache.percent | [0-100] | If > 0, enables the grouping cache. Grouping is actually executed as two searches, and this option caches the second search. A value of 0 disables grouping caching. Default is 0. Tests have shown that this cache only improves search time with boolean queries, wildcard queries and fuzzy queries. For simple queries like a term query or a match-all query, this cache has a negative impact on performance. |
1. Any number of group commands (group.field, group.func, group.query) may be specified in a single request.
2. Since Solr 3.5, grouping is also supported for distributed searches. Currently group.truncate and group.func are the only parameters that are not supported in distributed search.
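As a rough illustration of the first point, the sketch below assembles a query string that carries several group commands at once. It uses only the JDK (the `Charset` overload of `URLEncoder.encode` needs Java 10 or later); the class and method names are hypothetical, and the parameter values are taken from the examples earlier in this post.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GroupRequestSketch {
    /** Builds a query string with one group.field / group.query parameter per command. */
    public static String buildQueryString(String q, List<String> groupFields,
                                          List<String> groupQueries, int groupLimit) {
        List<String> parts = new ArrayList<>();
        parts.add(param("q", q));
        parts.add(param("group", "true"));
        for (String field : groupFields) {
            parts.add(param("group.field", field));
        }
        for (String groupQuery : groupQueries) {
            parts.add(param("group.query", groupQuery));
        }
        parts.add(param("group.limit", String.valueOf(groupLimit)));
        return String.join("&", parts);
    }

    // URL-encodes the value so that characters such as ':' and '[' are safe in a URL.
    private static String param(String name, String value) {
        return name + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString(
                "memory",
                Arrays.asList("manu_exact"),
                Arrays.asList("price:[0 TO 99.99]", "price:[100 TO *]"),
                3));
    }
}
```

Each command yields its own entry in the grouped response, so a client should expect one result block per group.field or group.query it sent.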
// Example 1: group by a field and count the documents in each group (SolrJ).
// GROUP, GROUP_FIELD and GROUP_LIMIT are String constants defined elsewhere in the class.
SolrServer server = this.getSolrServer();
SolrQuery param = new SolrQuery();
param.setParam(GroupParams.GROUP, GROUP);
param.setParam(GroupParams.GROUP_FIELD, GROUP_FIELD);
param.setParam(GroupParams.GROUP_LIMIT, GROUP_LIMIT);
QueryResponse response = null;
try {
    response = server.query(param);
} catch (SolrServerException e) {
    logger.error(e.getMessage(), e);
}
// Map each group value to the number of documents found in that group.
Map<String, Integer> info = new HashMap<String, Integer>();
GroupResponse groupResponse = (response == null) ? null : response.getGroupResponse();
if (groupResponse != null) {
    List<GroupCommand> groupList = groupResponse.getValues();
    for (GroupCommand groupCommand : groupList) {
        List<Group> groups = groupCommand.getValues();
        for (Group group : groups) {
            info.put(group.getGroupValue(), (int) group.getResult().getNumFound());
        }
    }
}
// Example 2: group by query and page through the documents of each group (SolrJ).
SolrQuery solrQuery = new SolrQuery("*:*");
solrQuery.addFilterQuery("display:1");
solrQuery.addFilterQuery("activityBeginTime:[* TO NOW]");
solrQuery.addFilterQuery("activityEndTime:[NOW TO *]");
solrQuery.setParam(GroupParams.GROUP, true);
solrQuery.setParam(GroupParams.GROUP_QUERY, "id:1", "id:2");
solrQuery.setParam(GroupParams.GROUP_LIMIT, String.valueOf(pageSize));
solrQuery.setParam(GroupParams.GROUP_OFFSET, String.valueOf(pageSize * (page - 1)));
solrQuery.setParam(GroupParams.GROUP_SORT, "id desc, sort asc");
solrQuery.setRows(0);

QueryResponse qr = searchSource.query(solrQuery, SolrRequest.METHOD.POST);
GroupResponse groupResponse = qr.getGroupResponse();
List<GroupCommand> list = groupResponse.getValues();
for (GroupCommand gc : list) {
    List<Group> gs = gc.getValues();
    if (CollectionUtils.isNotEmpty(gs)) {
        for (Group g : gs) {
            SolrDocumentList sds = g.getResult();
            if (CollectionUtils.isNotEmpty(sds)) {
                for (SolrDocument doc : sds) {
                    // Read the id of each document in the group.
                    String id = doc.getFieldValue("id").toString();
                }
            }
        }
    }
}
