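Suppose the final result set is the intersection of the hit sets of two conditions (A AND B). If the set of documents matching A is very small (roughly no more than 300 documents) while the set matching B is very large, two-phase filtering (post filtering) in Solr is an excellent fit for the query.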
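The idea can be illustrated without Solr at all. The following is a minimal sketch in plain Java, not Solr code: the class name `TwoPhaseFilterSketch`, the method `postFilter`, and the sample document ids are all hypothetical. It assumes phase one has already produced the small candidate list for condition A, and it checks condition B one document at a time instead of building B's full result set and intersecting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustrative only: plain Java, not part of Solr. */
final class TwoPhaseFilterSketch {

    /**
     * Phase 1 (not shown) has produced the small candidate list for condition A.
     * Phase 2 tests condition B only for those candidates, so the large
     * result set of B is never materialized.
     */
    static List<Integer> postFilter(List<Integer> candidatesFromA, IntPredicate matchesB) {
        List<Integer> result = new ArrayList<>();
        for (int docId : candidatesFromA) {
            if (matchesB.test(docId)) {
                result.add(docId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: four documents match A; B is "doc id is even".
        List<Integer> candidatesFromA = Arrays.asList(3, 8, 42, 77);
        IntPredicate matchesB = docId -> docId % 2 == 0;
        System.out.println(postFilter(candidatesFromA, matchesB)); // prints [8, 42]
    }
}
```

The work in phase two is proportional to the size of A's result set, which is why the approach pays off only when A is small and B is expensive or large.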
First, implement org.apache.solr.search.PostFilter. The interface's Javadoc reads:
/**
 * The PostFilter interface provides a mechanism to further filter documents
 * after they have already gone through the main query and other filters.
 * This is appropriate for filters with a very high cost.
 * <p>
 * The filtering mechanism used is a {@link DelegatingCollector}
 * that allows the filter to not call the delegate for certain documents,
 * thus effectively filtering them out. This also avoids the normal
 * filter advancing mechanism which asks for the first acceptable document on
 * or after the target (which is undesirable for expensive filters).
 * This collector interface also enables better performance when an external system
 * must be consulted, since document ids may be buffered and batched into
 * a single request to the external system.
 * <p>
 * Implementations of this interface must also be a Query.
 * If an implementation can only support the collector method of
 * filtering through getFilterCollector, then ExtendedQuery.getCached()
 * should always return false, and ExtendedQuery.getCost() should
 * return no less than 100.
 */
if (q instanceof ExtendedQuery) {
    ExtendedQuery eq = (ExtendedQuery) q;
    if (!eq.getCache()) {
        if (eq.getCost() >= 100 && eq instanceof PostFilter) {
            if (postFilters == null) postFilters = new ArrayList<>(sets.length - end);
            postFilters.add(q);
        } else {
            if (notCached == null) notCached = new ArrayList<>(sets.length - end);
            notCached.add(q);
        }
        continue;
    }
}
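As the Solr code above shows, a filter query is added to the postFilters list for later use only when all of the following hold: it is an ExtendedQuery, eq.getCache() returns false, its cost is at least 100, and it implements PostFilter. A non-cached query that misses the last two conditions goes into the notCached list instead.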
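import java.io.IOException;
import java.util.Set;

import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.SortedDocValues;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.util.BytesRef;
import org.apache.solr.search.DelegatingCollector;
import org.apache.solr.search.ExtendedQueryBase;
import org.apache.solr.search.PostFilter;

public class PostFilterQuery extends ExtendedQueryBase implements PostFilter {
    private final boolean exclude;
    private final Set<String> items;
    private final String field;

    public PostFilterQuery(boolean exclude, Set<String> items, String field) {
        super();
        this.exclude = exclude;
        this.items = items;
        this.field = field;
    }

    // Identity-based equality: every instance is treated as a distinct query.
    @Override
    public int hashCode() {
        return System.identityHashCode(this);
    }

    @Override
    public boolean equals(Object obj) {
        return this == obj;
    }

    // Caching is always off, so attempts to enable it are ignored.
    @Override
    public void setCache(boolean cache) {
    }

    @Override
    public boolean getCache() {
        return false;
    }

    // Solr only treats the query as a post filter when its cost is at least 100.
    @Override
    public int getCost() {
        return Math.max(super.getCost(), 100);
    }

    @Override
    public DelegatingCollector getFilterCollector(IndexSearcher searcher) {
        return new DelegatingCollector() {
            private SortedDocValues docValue;

            @Override
            public void collect(int doc) throws IOException {
                // Random-access doc-values lookup (the getOrd(int) API of Lucene versions before 7).
                int order = this.docValue.getOrd(doc);
                if (order == -1) {
                    // The document has no value for the field: keep it only when excluding.
                    if (exclude) {
                        super.collect(doc);
                    }
                    return;
                }
                BytesRef ref = this.docValue.lookupOrd(order);
                if (items.contains(ref.utf8ToString())) {
                    // Value is in the item set: keep it only when including.
                    if (!exclude) {
                        super.collect(doc);
                    }
                } else {
                    // Value is not in the item set: keep it only when excluding.
                    if (exclude) {
                        super.collect(doc);
                    }
                }
            }

            @Override
            protected void doSetNextReader(LeafReaderContext context) throws IOException {
                super.doSetNextReader(context);
                // Load the field's doc values for the new index segment.
                this.docValue = DocValues.getSorted(context.reader(), field);
            }
        };
    }
}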
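- boolean exclude: whether to filter by exclusion (true) or by inclusion (false).
- Set<String> items: the set of values to filter on.
- String field: the document field whose value is checked against the item set.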
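The three branches in collect() reduce to a single rule: a document is kept when "its value is in the item set" differs from "exclude". The sketch below restates that rule as a standalone function so it can be checked in isolation. The class name `PostFilterDecisionSketch` and the method `keep` are hypothetical and not part of the original code; a `null` value stands in for a document that has no value for the field (ordinal -1 in the collector).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: mirrors the branching of collect() without any Solr types. */
final class PostFilterDecisionSketch {

    /**
     * @param exclude true to drop listed values, false to keep only listed values
     * @param items   the values named in the filter
     * @param value   the document's field value, or null if the document has none
     * @return true if the document should be passed on to the delegate collector
     */
    static boolean keep(boolean exclude, Set<String> items, String value) {
        boolean listed = value != null && items.contains(value);
        return listed != exclude;
    }

    public static void main(String[] args) {
        Set<String> items = new HashSet<>(Arrays.asList("a", "b"));
        System.out.println(keep(false, items, "a"));  // include mode, listed value: true
        System.out.println(keep(false, items, null)); // include mode, no value: false
        System.out.println(keep(true, items, "a"));   // exclude mode, listed value: false
        System.out.println(keep(true, items, null));  // exclude mode, no value: true
    }
}
```

Note one consequence of this rule: in exclude mode, documents that have no value for the field pass the filter.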
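import java.util.Set;

import org.apache.lucene.search.Query;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;
import org.apache.solr.search.SyntaxError;

import com.google.common.collect.Sets;       // Guava
import org.apache.commons.lang.StringUtils;  // assumed: Commons Lang 2.x; lang3 offers the same split(String, char)

public class PostFilterQParserPlugin extends QParserPlugin {

    @Override
    @SuppressWarnings("all")
    public void init(NamedList args) {
    }

    @Override
    public QParser createParser(String qstr, SolrParams localParams, SolrParams params,
            SolrQueryRequest req) {
        // Default to inclusion when "exclude" is absent; the single-argument
        // getBool returns null in that case and would fail on unboxing.
        boolean exclude = localParams.getBool("exclude", false);
        // CommonParams.FIELD is the local parameter "f".
        String field = localParams.get(CommonParams.FIELD);
        if (field == null) {
            throw new IllegalArgumentException(
                    "local param '" + CommonParams.FIELD + "' (the field to filter on) has not been defined");
        }
        // The query string is a comma-separated list of values.
        Set<String> items = Sets.newHashSet(StringUtils.split(qstr, ','));
        final PostFilterQuery q = new PostFilterQuery(exclude, items, field);
        return new QParser(qstr, localParams, params, req) {
            @Override
            public Query parse() throws SyntaxError {
                return q;
            }
        };
    }
}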
<queryParser name="postfilter" class="com.dfire.tis.solrextend.queryparse.PostFilterQParserPlugin" />
SolrQuery query = new SolrQuery();
// Main query: condition A, which matches only a small number of documents.
query.setQuery("customerregister_id:193d43b1734245f5d3bf35092dbb3a40");
// Post filter: condition B, evaluated only for the documents that matched A.
query.addFilterQuery("{!postfilter f=menu_id exclude=true}000008424a4234f0014a5746c2cd1065,000008424a4234f0014a5746c2cd1065");
SimpleQueryResult<Object> result = client.query("search4totalpay", "00000241", query, Object.class);
System.out.println("getNumberFound:" + result.getNumberFound());
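When the list of values comes from application code, the filter string can be assembled by a small helper. The sketch below is hypothetical (the class `PostFilterFqBuilder` and its method are not part of the original code). It assumes the parser is registered under the name postfilter as shown in the configuration above, and that the field name is a simple identifier that needs no quoting inside the local params. Because createParser splits the query string on commas, the helper rejects values that contain a comma.

```java
import java.util.Arrays;
import java.util.Collection;

/** Illustrative helper for building the fq string; not part of the original code. */
final class PostFilterFqBuilder {

    static String build(String field, boolean exclude, Collection<String> values) {
        for (String value : values) {
            // The parser splits on ',' so a comma inside a value would be misread.
            if (value.indexOf(',') >= 0) {
                throw new IllegalArgumentException("value must not contain a comma: " + value);
            }
        }
        return "{!postfilter f=" + field + " exclude=" + exclude + "}" + String.join(",", values);
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        String fq = build("menu_id", true, Arrays.asList("id1", "id2"));
        System.out.println(fq); // prints {!postfilter f=menu_id exclude=true}id1,id2
    }
}
```

The resulting string can be passed to query.addFilterQuery(...) in place of the hand-written literal.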
