http://sigizmund.com/debugging-hadoop-applications-using-your-eclipse/
Well, it can be annoying – it can be awfully annoying, in fact – to debug Hadoop applications. But sometimes you need it, because logging doesn't show anything and you've tried everything but still cannot get under Hadoop's hood. In this case, follow a few simple steps.
1. Download and unpack Hadoop to your local machine.
2. Prepare a small set of data you're planning to run the test on.
3. Check that you can actually run Hadoop locally, with something like this (don't forget to set $HADOOP_CLASSPATH first!):
bin/hadoop jar yourprogram.jar com.company.project.HadoopApp tiny.txt ./out
4. Go to Hadoop's directory and copy the file bin/hadoop to bin/hdebug.
5. Now we need to make Hadoop start in debug mode. What you should do is add one line of text to the bin/hdebug start script:
(The original post shows a screenshot of the edited script here.) Here is the line – copy it from here:
-Xdebug -Xrunjdwp:transport=dt_socket,address=8001,server=y,suspend=y
What it says, basically, is an instruction to the JVM to start in debug mode and wait for a socket connection from a remote debugger on port 8001; execution is suspended after startup until the debugger connects.
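Since the screenshot showing exactly where the line goes is not reproduced here, the sketch below is one plausible way to splice it in – an assumption on my part, as the layout of bin/hadoop differs between releases. In Hadoop 0.20-era launcher scripts, anything appended to HADOOP_OPTS ends up on the final java command line:
# In bin/hdebug, after any hadoop-env.sh sourcing, append the JDWP agent flags:
HADOOP_OPTS="$HADOOP_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,address=8001,server=y,suspend=y"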
Now go and start your grid application like you did in step 3, but this time use the bin/hdebug script we've created. If you've done everything correctly, the program should output something like this:
Listening for transport dt_socket at address: 8001
and wait for a debugger. So let's give it a debugger then! Fire up Eclipse with your project (you likely have it open already, since you're trying to debug something) and add a new Debug configuration:
(The original post shows a screenshot of the Eclipse remote debug configuration dialog here.)
After you've set everything up, click "Apply" and close the window for now – you'll probably want to set some breakpoints before starting the actual debugging. Go do that, then simply choose the debug configuration you created – and off you go! If everything worked properly, you should soon get a standard debugger window, with all the nice things Java can offer you. Hope it helps some of us in our difficult business of writing distributed, grid-enabled applications!
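One extra check that is not in the original post: before wiring up Eclipse, you can confirm the debug agent is reachable with the JDK's command-line debugger, which attaches to the same socket:
# Quick sanity check with jdb (assumes a JDK with jdb on the PATH):
jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=8001
If jdb attaches, the agent is listening and Eclipse should have no trouble connecting to the same port.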
Hive remote debugging
It turned out not to be very easy to debug the code, so here is a useful script we used to enable remote debugging in Hive. We used Eclipse remote debugging against Hadoop 0.20.1 running in standalone mode with Hive 0.5.0.
The long classpath exports below are wrapped across several lines with backslash continuations purely for readability. A better job could be done with something like a 'for' loop over the Hadoop and Hive lib directories, as sketched right after this paragraph.
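A minimal sketch of that for-loop approach (my addition, not part of the original script; it assumes $HIVE_HOME, $HIVE_LIB, $HADOOP_HOME, $HADOOP_LIB and $JAVA_HOME are set as in the full script below, and jar names will differ between releases):
# Build the classpaths from whatever jars are actually present instead of listing each one.
HIVE_CLASSPATH=$HIVE_HOME/conf
for jar in "$HIVE_LIB"/*.jar; do HIVE_CLASSPATH=$HIVE_CLASSPATH:$jar; done
HADOOP_CLASSPATH=$HADOOP_HOME/conf:$JAVA_HOME/lib/tools.jar:$HADOOP_HOME/hadoop-0.20.1-core.jar
for jar in "$HADOOP_LIB"/*.jar "$HADOOP_LIB"/jsp-2.1/*.jar; do HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$jar; done
export HIVE_CLASSPATH HADOOP_CLASSPATH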
export HADOOP_HOME=/home/hadoop/hadoop-0.20.1
export HIVE_HOME=/home/hadoop/hive-0.5.0-bin
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HIVE_LIB=$HIVE_HOME/lib
export HIVE_CLASSPATH=$HIVE_HOME/conf:$HIVE_LIB/antlr-runtime-3.0.1.jar:$HIVE_LIB/asm-3.1.jar:\
$HIVE_LIB/commons-cli-2.0-SNAPSHOT.jar:$HIVE_LIB/commons-codec-1.3.jar:\
$HIVE_LIB/commons-collections-3.2.1.jar:$HIVE_LIB/commons-lang-2.4.jar:\
$HIVE_LIB/commons-logging-1.0.4.jar:$HIVE_LIB/commons-logging-api-1.0.4.jar:\
$HIVE_LIB/datanucleus-core-1.1.2.jar:$HIVE_LIB/datanucleus-enhancer-1.1.2.jar:\
$HIVE_LIB/datanucleus-rdbms-1.1.2.jar:$HIVE_LIB/derby.jar:$HIVE_LIB/hive-anttasks-0.5.0.jar:\
$HIVE_LIB/hive-cli-0.5.0.jar:$HIVE_LIB/hive-common-0.5.0.jar:$HIVE_LIB/hive_contrib.jar:\
$HIVE_LIB/hive-exec-0.5.0.jar:$HIVE_LIB/hive-hwi-0.5.0.jar:$HIVE_LIB/hive-jdbc-0.5.0.jar:\
$HIVE_LIB/hive-metastore-0.5.0.jar:$HIVE_LIB/hive-serde-0.5.0.jar:\
$HIVE_LIB/hive-service-0.5.0.jar:$HIVE_LIB/hive-shims-0.5.0.jar:\
$HIVE_LIB/jdo2-api-2.3-SNAPSHOT.jar:$HIVE_LIB/jline-0.9.94.jar:\
$HIVE_LIB/json.jar:$HIVE_LIB/junit-3.8.1.jar:$HIVE_LIB/libfb303.jar:\
$HIVE_LIB/libthrift.jar:$HIVE_LIB/log4j-1.2.15.jar:\
$HIVE_LIB/mysql-connector-java-5.0.0-bin.jar:$HIVE_LIB/stringtemplate-3.1b1.jar:\
$HIVE_LIB/velocity-1.5.jar
export HADOOP_LIB=$HADOOP_HOME/bin/../lib
export HADOOP_CLASSPATH=$HADOOP_HOME/bin/../conf:$JAVA_HOME/lib/tools.jar:\
$HADOOP_HOME/bin/..:$HADOOP_HOME/bin/../hadoop-0.20.1-core.jar:\
$HADOOP_LIB/commons-cli-1.2.jar:$HADOOP_LIB/commons-codec-1.3.jar:$HADOOP_LIB/commons-el-1.0.jar:\
$HADOOP_LIB/commons-httpclient-3.0.1.jar:$HADOOP_LIB/commons-logging-1.0.4.jar:\
$HADOOP_LIB/commons-logging-api-1.0.4.jar:$HADOOP_LIB/commons-net-1.4.1.jar:$HADOOP_LIB/core-3.1.1.jar:\
$HADOOP_LIB/hsqldb-1.8.0.10.jar:$HADOOP_LIB/jasper-compiler-5.5.12.jar:\
$HADOOP_LIB/jasper-runtime-5.5.12.jar:$HADOOP_LIB/jets3t-0.6.1.jar:$HADOOP_LIB/jetty-6.1.14.jar:\
$HADOOP_LIB/jetty-util-6.1.14.jar:$HADOOP_LIB/junit-3.8.1.jar:$HADOOP_LIB/kfs-0.2.2.jar:\
$HADOOP_LIB/log4j-1.2.15.jar:$HADOOP_LIB/oro-2.0.8.jar:$HADOOP_LIB/servlet-api-2.5-6.1.14.jar:\
$HADOOP_LIB/slf4j-api-1.4.3.jar:$HADOOP_LIB/slf4j-log4j12-1.4.3.jar:$HADOOP_LIB/xmlenc-0.52.jar:\
$HADOOP_LIB/jsp-2.1/jsp-2.1.jar:$HADOOP_LIB/jsp-2.1/jsp-api-2.1.jar
export CLASSPATH=$HADOOP_CLASSPATH:$HIVE_CLASSPATH:$CLASSPATH
export DEBUG_INFO="-Xmx1000m -Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8001,server=y,suspend=n"
# Start HiveServer with the debug agent attached (listens for the debugger on port 8001):
$JAVA_HOME/bin/java $DEBUG_INFO -classpath $CLASSPATH -Dhadoop.log.dir=$HADOOP_HOME/bin/../logs \
-Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=$HADOOP_HOME/bin/.. \
-Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=$HADOOP_LIB/native/Linux-i386-32 \
-Dhadoop.policy.file=hadoop-policy.xml org.apache.hadoop.util.RunJar $HIVE_LIB/hive-service-0.5.0.jar \
org.apache.hadoop.hive.service.HiveServer
# Or, to debug the Hive CLI instead, launch CliDriver the same way:
$JAVA_HOME/bin/java -Xmx1000m $DEBUG_INFO -classpath $CLASSPATH -Dhadoop.log.dir=$HADOOP_HOME/bin/../logs \
-Dhadoop.log.file=hadoop.log \
-Dhadoop.home.dir=$HADOOP_HOME/bin/.. -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console \
-Djava.library.path=$HADOOP_LIB/native/Linux-i386-32 -Dhadoop.policy.file=hadoop-policy.xml \
org.apache.hadoop.util.RunJar $HIVE_LIB/hive-cli-0.5.0.jar org.apache.hadoop.hive.cli.CliDriver
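One tweak worth knowing about (my addition, not part of the original script): DEBUG_INFO above uses suspend=n, so the JVM starts immediately and you can attach whenever you like, which suits a long-running HiveServer. If you need to break on something that happens during startup or the very first query, have the agent block until the debugger attaches:
# Block the JVM at startup until Eclipse (or jdb) connects on port 8001:
export DEBUG_INFO="-Xmx1000m -Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8001,server=y,suspend=y"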