论坛首页 Java企业应用论坛

HIbernate-MySql百万级数据的查找问题

浏览 9440 次
精华帖 (0) :: 良好帖 (0) :: 新手帖 (0) :: 隐藏帖 (0)
作者 正文
   发表时间:2005-07-08  
近日购得《深入浅出Hibernate》一书,在拜读曹老大等的高论后不禁手痒,做了以下这个试验:MySql百万级数据的查找。
环境:
数据库:mysql 5.0.7 (选择这个数据库是因为从这个版本开始支持视图)
JDBC驱动:mysql-connector-java-3.1.8 这个版本的驱动支持最新的JDBC特性
开发环境:eclipse-SDK-3.0.2
Sql性能测试工具:p6spy 1.1 通过委托的驱动来监视生成的查询的内容,书上说的。
Sql分析工具:sqlprofiler-0.3 把上述生成的查询内容进行分析,格式化输出,也是书上说的。
O/R框架hibernate-3.0.5 不用说了吧,本来就是想测试它的。
操作系统:Win2003 Server Enterprise Edition
JDK:Java HotSpot(TM) Client VM (build 1.4.2_04-b05, mixed mode)
主要的配置文件内容如下:
建表语句如下:
DROP TABLE IF EXISTS test.cat;
CREATE TABLE cat (
  CAT_ID char(32) character set ucs2 NOT NULL,
  NAME varchar(16) NOT NULL,
  SEX char(1) NOT NULL,
  WEIGHT double NOT NULL,
  PRIMARY KEY  (CAT_ID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
在CAT_ID上做了索引
hibernate.cfg.xml:
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-configuration PUBLIC
        "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
        "http//hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
    <session-factory>
        <!-- Database connection Settings      -->
        <property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>
            <property name="hibernate.connection.driver_class">com.p6spy.engine.spy.P6SpyDriver</property>
            <property name="hibernate.connection.url">jdbcmysql//localhost/test</property>
            <property name="hibernate.connection.username">xproject</property>
            <property name="hibernate.connection.password">xproject</property>
        <!-- Database Connection Pool Settings -->
            <property name="hibernate.dbcp.maxActive">100</property>
<property name="hibernate.dbcp.whenExhaustedAction"> 1</property>
<property name="hibernate.dbcp.maxWait">120000</property>
<property name="hibernate.dbcp.maxIdle">10</property>
<property name="hibernate.dbcp.ps.maxActive">100</property>
<property name="hibernate.dbcp.ps.whenExhaustedAction"> 1</property>
<property name="hibernate.dbcp.ps.maxWait">120000</property>
<property name="hibernate.dbcp.ps.maxIdle">10</property>
<property name="hibernate.dbcp.validationQuery">select 1 from dual</property>
<property name="hibernate.dbcp.testOnBorrow">true</property>
<property name="hibernate.dbcp.testOnReturn">false</property>
<!-- Hibernate Configure Settings      -->
     <property name="hibernate.show_sql">true</property>
     <property name="hibernate.hbm2ddl.auto">create-drop</property>
<property name="hibernate.hbm2ddl.auto">create</property>
             <property name="hibernate.hbm2ddl.auto">update</property>
             <property name="hibernate.connection.isolation">4</property>
             <property name="hibernate.jdbc.fetch_size">25</property>
             <property name="hibernate.jdbc.batch_size">50</property>
             <property name="hibernate.jdbc.batch_versioned_data">true</property>
             <property name="hibernate.jdbc.use_scrollable_resultset">true</property>
             <property name="hibernate.jdbc.use_streams_for_binary">true</property>
             <property name="hibernate.jdbc.use_get_generated_keys">true</property>
             <!--<property name="hibernate.default_schema">test</property>-->
             <property name="hibernate.xml.output_stylesheet">M/test/eclipse/eclipse/workspace/LearnHibernate/hibernate-default.xslt</property>
             <property name="hibernate.use_outer_join">false</property>
             <property name="hibernate.max_fetch_depth">1</property>
             <property name="hibernate.cglib.use_reflection_optimizer">false</property>
             <property name="hibernate.cache.use_minimal_puts">true</property>
             <property name="hibernate.cache.region_prefix">hibernate.test</property>
             <property name="hibernate.cache.use_query_cache">true</property>
             <!--<property name="hibernate.cache.provider_class">net.sf.hibernate.cache.OSCacheProvider</property>-->
        <!-- Mapping   File      Settings      -->
             <mapping resource="Cat.hbm.xml"/>
    </session-factory>
</hibernate-configuration>

spy.properties:
#################################################################
# P6Spy Options File                                            #
# See documentation for detailed instructions                   #
#################################################################

#################################################################
# MODULES                                                       #
#                                                               #
# Modules provide the P6Spy functionality.  If a module, such   #
# as module_log is commented out, that functionality will not   #
# be available.  If it is not commented out (if it is active),  #
# the functionality will be active.                             #
#                                                               #
# Values set in Modules cannot be reloaded using the            #
# reloadproperties variable.  Once they are loaded, they remain #
# in memory until the application is restarted.                 #
#                                                               #
#################################################################

module.log=com.p6spy.engine.logging.P6LogFactory
#module.outage=com.p6spy.engine.outage.P6OutageFactory

#################################################################
# REALDRIVER(s)                                                 #
#                                                               #
# In your application server configuration file you replace the #
# "real driver" name with com.p6spy.engine.P6SpyDriver. This is #
# where you put the name of your real driver P6Spy can find and #
# register your real driver to do the database work.            #
#                                                               #
# If your application uses several drivers specify them in      #
# realdriver2, realdriver3.  See the documentation for more     #
# details.                                                      #
#                                                               #
# Values set in REALDRIVER(s) cannot be reloaded using the      #
# reloadproperties variable.  Once they are loaded, they remain #
# in memory until the application is restarted.                 #
#                                                               #
#################################################################

# oracle driver
# realdriver=oracle.jdbc.driver.OracleDriver

# mysql Connector/J driver
  realdriver=com.mysql.jdbc.Driver

# informix driver
# realdriver=com.informix.jdbc.IfxDriver

# ibm db2 driver
# realdriver=COM.ibm.db2.jdbc.net.DB2Driver

# the mysql open source driver
# realdriver=org.gjt.mm.mysql.Driver

#specifies another driver to use
realdriver2=
#specifies a third driver to use
realdriver3=
…..其余部分略去,按默认的。
Cat.hbm.xml:
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping
    PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http//hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping>
    <class name="org.hibernate.examples.quickstart.Cat" table="CAT">
        <!-- A 32 hex character is our surrogate key. It's automatically
            generated by Hibernate with the UUID pattern. -->
        <id name="id" type="string" unsaved-value="null" >
            <column name="CAT_ID" sql-type="char(32)" not-null="true"/>
            <generator class="uuid.hex"/>
        </id>
        <!-- A cat has to have a name, but it shouldn' be too long. -->
        <property name="name">
            <column name="NAME" length="16" not-null="true"/>
        </property>
        <property name="sex"/>
        <property name="weight"/>
    </class>
</hibernate-mapping>

然后我先运行以下程序:
    public static void main(String[] args){
    Session session = HibernateUtil.currentSession();
    Transaction tx= session.beginTransaction();
    for(int i=0;i<1000000;i++){
        Cat princess = new Cat();
        princess.setName("Princess"+i);
        princess.setSex('F');
        princess.setWeight(7.4f+i);
        session.save(princess);
        if(i%25==0){
        session.flush();
        session.clear();
        }
    }
    tx.commit();
}
往数据库里插入100万条数据,在运行时各位请注意你的内存要够大,不然会OutOfMemory的。我只是示意一下,真正的程序不能这样写的。

然后我写一个简单的查询:
    public static void main(String[] args){
    Session session = HibernateUtil.currentSession();
    Criteria c=session.createCriteria(Cat.class);
    c.add(Expression.eq("name","Princess18367"));
    List list=c.list();
        System.out.println(list.size());
    for(int i=0;i<list.size();i++){
    Cat cat=(Cat)list.get(i);
    System.out.println(cat. getWeight ());
    }
    HibernateUtil.closeSession();
    }
发现实际执行的语句是:select this_.CAT_ID as CAT1_0_, this_.NAME as NAME0_0_, this_.sex as sex0_0_, this_.weight as weight0_0_ from CAT this_ where this_.NAME='Princess18367'
耗时大概8.9秒。查询结果:18374.4
我在MySQLQuery Browser1.1里执行上述语句,耗时也是差不多:8.7秒.这样看起来似乎jdbc的性能和本地执行的效率差不多。因为我是建了索引的,于是我写了以下语句在MySQLQuery Browser里执行:select this_.CAT_ID as CAT1_0_, this_.NAME as NAME0_0_, this_.sex as sex0_0_, this_.weight as weight0_0_ from CAT this_ where this_.NAME='Princess18367' and this_.CAT_ID ='9348b95304f473750104f473b38847c0' 耗时0.024秒.查询效率天查地远!看到这么巨大的性能差异,我当然是想提高效率啦,于是我把上面的程序改了一下:
public static void main(String[] args){
    Session session = HibernateUtil.currentSession();
    Criteria c=session.createCriteria(Cat.class);
    c.add(Expression.eq("name","Princess18367"));
    c.add(Expression.eq("id","9348b95304f473750104f473b38847c0"));
    List list=c.list();
        System.out.println(list.size());
    for(int i=0;i<list.size();i++){
    Cat cat=(Cat)list.get(i);
    System.out.println(cat. getWeight ());
    }
    HibernateUtil.closeSession();
}
发现Hibernate把语句转换成:select this_.CAT_ID as CAT1_0_, this_.NAME as NAME0_0_, this_.sex as sex0_0_, this_.weight as weight0_0_ from CAT this_ where this_.NAME='Princess18367' and this_.CAT_ID='9348b95304f473750104f473b38847c0' .跟我想象中的一样。但是执行结果却是:耗时32毫秒左右,list.size()为0?!怎么回事?奇怪了!于是我又写了以下代码:
package org.hibernate.examples.quickstart;
import java.sql.*;
public class JdbcUtil {
    public JdbcUtil() {
    }

    public static void main(String[] args) {
        JdbcUtil jdbcutil = new JdbcUtil();
        ResultSet rs = null;
        PreparedStatement ps = null;
        Statement s = null;
        ResultSet r = null;
        Connection conn = null;
        try {
            Class.forName("com.p6spy.engine.spy.P6SpyDriver");
            conn = DriverManager.getConnection(
                    "jdbcmysql//localhost/test", "xproject", "xproject");

            ps = conn.prepareStatement("select this_.CAT_ID as CAT1_0_, this_.NAME as NAME0_0_, this_.sex as sex0_0_, this_.weight as weight0_0_ from CAT this_ where this_.NAME=? and this_.CAT_ID=?");
            ps.setString(1, "Princess18367");
            ps.setString(2, "9348b95304f473750104f473b38847c0");
            rs = ps.executeQuery();
            int i = 0;
            while (rs.next()) {
                System.out.println(rs.getString("weight0_0_"));
                i++;
            }
            System.out.println(i);

            s = conn.createStatement();
            r = s.executeQuery(
                    "select this_.CAT_ID as CAT1_0_, this_.NAME as NAME0_0_, this_.sex as sex0_0_, this_.weight as weight0_0_ from CAT this_ where this_.NAME='Princess18367' and this_.CAT_ID='9348b95304f473750104f473b38847c0'");
            while (r.next()) {
                System.out.println(r.getString("weight0_0_"));
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                rs.close();
                r.close();
                ps.close();
                s.close();
                conn.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
结果是:
用PreparedStatement ,ResultSet的大小为:0
用Statement,能查出对应字段的值:18374.400390625 那这样看来似乎是Mysql的驱动有问题了。但是我也发现在Mysql的表具有比较少的记录的情况下(记录少于1000),执行PreparedStatement和Statement的结果是一样的。在折腾了一下午之后,我得出的结论是:
1.在大记录数的情况下,Mysql的PreparedStatement会出现问题。
2.在数据库设计中提倡用UUID做主键(这个地球人都知道),但是我加点补充:在Mysql下为该主键做索引可以极大的提高性能。
3.Hibernate里都是通过PreparedStatement来执行自身生成的动态语句的
   发表时间:2005-07-08  
1.换了最新的mysql-connector-java-3.1.10驱动,问题依旧。
2.换了mysql的开源数据库驱动程序(org.gjt.mm.mysql.Driver),问题依 旧。
3.本来想用DBCP连接池的,但是发现org.hibernate.connection下无关于DBCP的ConnectionProvider了。
4.打算找个oracle试试有无上述问题。
0 请登录后投票
   发表时间:2005-07-11  
等着看结果~~~~.曾经用HIBERNATE2做百万级数据处理,性能太差,后来用SPRING的JDBCTEMPLATE.
0 请登录后投票
   发表时间:2005-07-11  
简单的测试的话,楼主不如把hibernate生成的sql输出来看看,对比一下自己手写sql会怎么写

一般不要太复杂的sql,应该都是差不多的

如果表很大,数据很多,甚至字段的顺序都有可能影响性能。这就要专门的sql调优工具了。

要真要到这种级别的应用,还要靠视图啊,sp啊,虚拟表什么的来提高性能,那还是直接用jdbc算了
0 请登录后投票
   发表时间:2005-07-11  
楼上的老大:问题现在是我输出的sql语句和我想得到的语句一样,但是就是不出结果,我怀疑是mysql的驱动问题。我已经是有意识的把sql优化过的。你看我帮主键建索引和在查询的时候把有主键的字段也纳入查询语句就知道啦。p6spy(使用Criteria)打印出来的sql语句:select this_.CAT_ID as CAT1_0_, this_.NAME as NAME0_0_, this_.sex as sex0_0_, this_.weight as weight0_0_ from CAT this_ where this_.NAME='Princess72' and this_.CAT_ID='9348b95304f473750104f4737b510049' p6spy(使用Query)打印出来的sql语句:select cat0_.CAT_ID as CAT1_,   cat0_.NAME as NAME0_,   cat0_.sex as sex0_,   cat0_.weight as weight0_   from CAT cat0_ where cat0_.NAME='Princess72' and cat0_.CAT_ID='9348b95304f473750104f4737b510049'  我用jdbc直连的sql语句:select * from cat where cat.name='Princess72' and cat.cat_id='9348b95304f473750104f4737b510049'用query是可以查出结果,用Criteria查不出。用PreparedStatement查不出,用statement查的出,很奇怪啊
0 请登录后投票
   发表时间:2005-07-11  
>>用PreparedStatement查不出,用statement查的出,很奇怪啊

驱动问题,换驱动试试

mysql5.0我还没用过,你使用了视图?用视图的时候,最好限定为只读操作,这样的话,还不如使用jdbc直连,用个DAO把这部分隔离出来。我知道Oracle的视图是可以读写的,但我不知道这个特性其他数据库实现的怎么样。

不过,我认为这样的测试意义不大。

数据库的测试是最不好测试的,这样简单的测试,什么都说明不了。

要评价hibernate的性能首先要看的是它生成sql的质量!hibernate和JDBC的本质是一致的。

另外,如果真做到那么大的项目,用mysql不是好选择。数据库的安全性和灾难恢复能力是首要考虑的。
0 请登录后投票
论坛首页 Java企业应用版

跳转论坛:
Global site tag (gtag.js) - Google Analytics