SOLR Performance and SolrJ(1)Client Compare Java VS PHP

 

Recently I have been using both a PHP client and SolrJ (Java) to index into and search a SOLR server.
At first I used the PHP client library Solarium. The code is similar to this:
$this->clientActive = new Client(array(
    'endpoint' => array(
        'localhost' => array(
            'host' => $solrHostActive,
            'port' => $solrPortActive,
            'path' => $solrPathActive,
            'timeout' => 30,
        )
    )
));

public function addJobDocuments($jobs, $commit, $server){
    //set up features needed for this method
    $logger = $this->ioc->getService("logger");
    $solrClient = $this->getSolrServer($server);

    //get an update query instance
    $update = $solrClient->createUpdate();
    $docs = array();

    $logger->debug(var_export($jobs, true));

    foreach ($jobs as $job){
        $doc = $this->prepareDocument($job, $update);
        if($doc != null){
            $docs[] = $doc;
        }
    }

    if(!empty($docs)){
        $update->addDocuments($docs);

        if ($commit) {
            $update->addCommit();
            $logger->debug("committing during add documents.");
        } else {
            $logger->debug("NOT committing during add documents.");
        }

        return $this->ioc->retry(function () use ($solrClient, $update, $logger) {
            $result = $solrClient->update($update);
            $logger->debug("Update query executed---------");
            $logger->debug("Query status: " . $result->getStatus());
            $logger->debug("Query time: " . $result->getQueryTime());
        }, 10, 3, "SolrSearchClient.addJobDocuments");
    }
}

I found that a single PHP process indexes about 200 jobs/s, with each job about 10 KB in size. That is only one PHP process, though. On an ECS cluster I can run N containers in parallel and, in the ideal case, get about 200 * N jobs/s.
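To make the 200 * N estimate concrete, here is a minimal sketch that turns a target rate into a container count. It assumes perfectly linear scaling, which is the ideal case only; the class name, method name and target rates are hypothetical and not part of the original setup.

```java
public class ContainerEstimate {

    /** Containers needed to reach the target rate, assuming each container adds the same rate. */
    public static int containersNeeded(int targetJobsPerSec, int perContainerJobsPerSec) {
        // Ceiling division: a partial container still has to be a whole one.
        return (targetJobsPerSec + perContainerJobsPerSec - 1) / perContainerJobsPerSec;
    }

    public static void main(String[] args) {
        int perContainer = 200; // measured rate of one PHP process
        for (int target : new int[] {200, 1000, 2500}) { // hypothetical targets
            System.out.println(target + " jobs/s -> "
                    + containersNeeded(target, perContainer) + " containers");
        }
    }
}
```

In practice the real number will be higher, since the SOLR server and the network are shared by all containers.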

Then I tried Java with SolrJ. The code is similar to this:
package com.sillycat.analyzerjava;

import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class SolrMainApp {

    public static void main(String[] args) throws SolrServerException, IOException {
        System.out.println("-----------------start job-------------");

        long start = System.currentTimeMillis();
        String solrURL = "http://172.23.2.245:8983/job";

        // Note: with an unbounded LinkedBlockingQueue the pool never grows
        // beyond its 50 core threads, so the maximum of 200 is never reached.
        ExecutorService executorService =
            new ThreadPoolExecutor(50, 200, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());

        // Documents are queued and sent to SOLR by background threads.
        SolrClient solrClient = new ConcurrentUpdateSolrClient.Builder(solrURL)
            .withThreadCount(100)
            .withQueueSize(1000)
            .withExecutorService(executorService).build();

        // Index 50000 copies of the same job, differing only in id.
        for (int i = 1; i <= 50000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", i);
            doc.addField("customer_id", "1");
            doc.addField("pool_id", 9528);
            doc.addField("source_id", 1);
            doc.addField("campaign_id", 1);
            doc.addField("segment_id", 1);
            doc.addField("job_reference", "referenceId1");
            doc.addField("title", "title1");
            // The long description makes each document about 10 KB.
doc.addField("description", "COMMERCIAL ROOFING SALESMEN, ESTIMATORS & INSTALLERS <br /> <br /> <br />Tired of living on a small draw and commission when you know the business inside out? <br /> <br />Been waiting for the opportunity to run your own show? <br /> <br />Well now you can. And withthe backing of an 82-year old leader in the business. <br /> <br />Company Description: <br /> <br />Southwestern Petroleum Corporation is a Texas-based oil company founded in 1933. Our ISO 9001 certified Coating Technology Division manufactures a full line of industrial and commercial protective coatings and waterproofing systems at manufacturing facilities in the US, Canada and Belgium. We pride ourselves on our track record of helping motivated people establish successful building maintenance companies in 75 countries around the world. <br /> <br /> Total independence be your own boss sell when where and how you want <br /> Keep all the profit from the jobs you sell <br /> Our top people earn six figures consistently <br /> Uncapped, industry leading product commission rate plus high margin profit on installation <br /> Competitive, world-class, industry leading products & systems <br /> Professional factory & ongoing training <br /> Responsive, experienced sales, marketing & technical support <br /> Financially stable, 82-year old private family-owned business <br /> <br />Why Our Company Is Unique: <br /> <br /> Our company was started by sales professionals <br /> Our sales professionals earn the highest commissions in our industry <br /> Our sales program offers true independence and freedom from sales quotas, reports and collections <br /> We treat our sales professionals with respect and integrity and dont downsize their territories or reduce their commissions when they start earning too much <br /> We provide sales tools to make your sales job easier <br /> We provide software tools to cut the paperwork and get more impressive estimates/proposals out fast <br /> 
Many of our sales professionals have been with us for 20, 30, 40 years and more <br /> Our sales professionals represent the best protective coatings and waterproofing systems in the world, used by Customers like Goodyear, Kraft, Hilton, O'Reilly Auto Parts, Siemens, Honeywell, General Electric, Nestle, Intercontinental Hotels, British Petroleum, Bosch, DuPont, Toyota, Hewlett Packard, Sheraton, Bridgestone, Ingersoll-Rand, Sara Lee and thousands of smaller businesses who demand the best <br /> Our sales professionals enjoy the security and sales potential of a vast, diversified market virtually every commercial, industrial or institutional building has a need for our products hotels, office buildings, manufacturing plants, apartment buildings, government buildings, airports, universities, food stores, shopping centers, hospitals, warehouses, garages, barns, stadiums, storage facilities, distribution centers and terminals and every other type of building you can think of <br /> <br />You Will Be a Perfect Fit for Our Company If: <br /> <br /> You are experienced in sales and enjoy helping business people solve problems and save money <br /> You would like the independence of owning your own business <br /> You prefer to sell a quality product you can be proud of instead of the cheapest one available <br /> You don't like paperwork and don't really need a boss to tell you what to do <br /> You are super competitive, hate losing at anything and prefer setting your own goals instead of dealing with company quotas or call reports <br /> You are a problem solver, good at overcoming obstacles <br /> You are confident in your abilities, make friends easily and have a great sense of humor <br /> You are organized and manage time well enough to work from a home office <br /> You can be demanding at times because you insist on excellent service from the company you represent <br /> You know how important high activity levels are to sales success <br /> You don't mind working 
hard and getting your hands dirty, if it translates into income <br /> You prefer to spend your days working with prospects and Customers instead of sitting in an office <br /> You would like more control over your own future <br /> You know you are capable of earning much more if given the right training, support and freedom to do it your way <br /> <br />Qualifications: <br /> <br /> Minimum 1 year of successful business to business sales experience <br /> Background in roofing, flooring, paving, construction materials, construction trades, engineering or contracting a plus <br /> <br />If you would like to know more about taking that first step towards financial independence and a secure future, please respond with your name, city, state and email address.");


            doc.addField("url", "http://url1");
            doc.addField("company_id", 1);
            doc.addField("company", "company1");
            doc.addField("cities", "austin");
            doc.addField("cities", "dallas");
            doc.addField("cpc", 12);
            doc.addField("reg_cpc", 10);
            doc.addField("posted", "2016-06-23T22:00:00Z");
            doc.addField("created", "2016-05-23T22:00:00Z");
            doc.addField("experience", 1);
            doc.addField("salary", 1);
            doc.addField("education", 1);
            doc.addField("jobtype", 1);
            doc.addField("industry", 1);
            doc.addField("quality_score", 1.0);
            doc.addField("boost_factor", 1.0);
            doc.addField("paused", false);
            doc.addField("budget", 100);
            doc.addField("email", "cluo@jobs2careers.com");
            doc.addField("phone", "5127850000");
            doc.addField("srcseg_id", 1);
            doc.addField("srccamp_id", 1);
            doc.addField("tags", "tag1");
            doc.addField("tags", "tag2");
            doc.addField("searchtags", "searchtags1");
            doc.addField("searchtags", "searchtags2");
            doc.addField("daily_capped", false);
            doc.addField("qq_multiplier", 1.2);
            doc.addField("j2c_apply", false);
            doc.addField("reranker_info", "rerankerInfo1");
            doc.addField("major_category", "100016");
            doc.addField("major_category", "100017");
            doc.addField("minor_category", "100016");
            doc.addField("minor_category", "111017");
            doc.addField("excluded_company", false);

            solrClient.add(doc);
            if (i % 100 == 0) {
                System.out.println("process " + i + "/50000");
            }
        }

        // Commit before stopping the clock: the client sends documents
        // asynchronously, so add() returning does not mean they reached SOLR.
        solrClient.commit();
        long end = System.currentTimeMillis();

        System.out.println("total time is " + (end - start) + " ms");
        System.out.println("throughput is " + 50000L * 1000 / (end - start) + " jobs/s");

        // Release the HTTP resources and stop the pool so the JVM can exit.
        solrClient.close();
        executorService.shutdown();
    }
}

The Maven dependencies in pom.xml are:
<dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.21</version> <!-- or use LATEST -->
</dependency>
<dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-simple</artifactId>
        <version>1.7.21</version> <!-- or use LATEST -->
</dependency>
<dependency>
        <groupId>org.apache.solr</groupId>
        <artifactId>solr-solrj</artifactId>
        <version>6.4.2</version>
</dependency>

With this I get about 1000 jobs/s from a single Java process, because SolrJ sends the documents from multiple threads.
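One detail of the executor setup is worth checking when tuning the thread counts: a ThreadPoolExecutor only creates threads beyond its core size when its queue is full, and an unbounded LinkedBlockingQueue never fills up. The sketch below uses only the JDK and the same pool parameters as the code above; the class and method names are hypothetical, and the tasks simply block to stand in for slow requests.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeDemo {

    /** Submits blocking tasks and returns the largest thread count the pool reached. */
    public static int largestPoolSize(int core, int max, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the thread busy so later tasks must queue
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown(); // let all tasks finish
        pool.shutdown();
        return largest;
    }

    public static void main(String[] args) {
        // Same parameters as the SolrJ example: core 50, max 200, unbounded queue.
        System.out.println(largestPoolSize(50, 200, 1000)); // prints 50
    }
}
```

So with this configuration at most 50 threads are available to the client, even though the maximum pool size is 200 and withThreadCount asks for 100. Raising the core size or using a bounded queue would be the usual ways to change that.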

Finally, I found that the bottleneck is not in the clients. Each job is about 10 KB, so at around 2000 to 3000 jobs/s the indexing traffic already uses up all the network bandwidth on the SOLR indexer machine.
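The arithmetic behind that conclusion can be sketched as follows. This counts document payload only and assumes exactly 10 KB per job; HTTP and serialization overhead would add to it, and the actual link speed of the indexer machine is not stated here. The class and method names are hypothetical.

```java
public class BandwidthEstimate {

    /** Megabits per second needed to ship the given rate of documents (payload only). */
    public static double requiredMbps(long docsPerSecond, long docSizeBytes) {
        return docsPerSecond * docSizeBytes * 8 / 1_000_000.0;
    }

    public static void main(String[] args) {
        long docSizeBytes = 10 * 1024; // about 10 KB per job
        for (long rate : new long[] {200, 1000, 2000, 3000}) {
            System.out.printf("%d jobs/s -> %.1f Mbit/s%n", rate, requiredMbps(rate, docSizeBytes));
        }
    }
}
```

At 2000 to 3000 jobs/s the payload alone comes to roughly 164 to 246 Mbit/s, so adding more client processes cannot help once the link to the indexer is saturated.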


References:
https://github.com/mosuka/solrj-example/blob/master/src/main/java/com/github/mosuka/apache/solr/example/cmd/SearchCommand.java
https://github.com/mosuka/solrj-example/blob/master/src/main/java/com/github/mosuka/apache/solr/example/cmd/AddCommand.java

https://dzone.com/articles/solr-update-performance