Pig: computing the maximum temperature per year in the Grunt shell

 
grunt> ls
hdfs://hadoopmaster:9000/user/hadoop/ncdc_data.txt<r 2> 71
hdfs://hadoopmaster:9000/user/hadoop/user       <dir>
grunt> cat ncdc_data.txt
1953:122:5
1954:83:5
1955:44:5
1956:33:5
1957:50:5
1958:33:5
1959:55:5
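
Each input line holds a year, a temperature and a quality code separated by colons, which is why the LOAD statement below uses PigStorage(':') and declares the three fields as ints.
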
grunt> A = LOAD 'ncdc_data.txt' USING PigStorage(':') AS(year:int,temp:int,quality:int);
2015-08-22 00:28:39,239 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> B = FILTER A BY temp !=50 AND ((chararray)quality matches '[012345]');
grunt> C = GROUP B BY year;
grunt> describe C;
C: {group: int,B: {(year: int,temp: int,quality: int)}}
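
In this describe output, each record of C pairs the group key (the year) with a bag holding every filtered tuple for that year; with the sample data above a grouped record has the shape (1953,{(1953,122,5)}), so MAX(B.temp) in the next statement simply picks the largest temperature inside each bag.
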
grunt> D = FOREACH C GENERATE group,MAX(B.temp) AS max_temp;
grunt> describe D;
D: {group: int,max_temp: int}
grunt> DUMP D;
2015-08-22 00:33:01,366 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,FILTER
2015-08-22 00:33:01,422 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
2015-08-22 00:33:01,538 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-08-22 00:33:01,552 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2015-08-22 00:33:01,659 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-08-22 00:33:01,660 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-08-22 00:33:01,708 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-08-22 00:33:01,753 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoopmaster/192.168.1.50:8032
2015-08-22 00:33:01,965 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-08-22 00:33:01,972 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-08-22 00:33:01,972 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-08-22 00:33:01,972 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2015-08-22 00:33:01,975 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2015-08-22 00:33:01,976 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2015-08-22 00:33:01,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=71
2015-08-22 00:33:01,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2015-08-22 00:33:01,985 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2015-08-22 00:33:01,985 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-08-22 00:33:01,986 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job8689104904217919048.jar
2015-08-22 00:33:05,184 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job8689104904217919048.jar created
2015-08-22 00:33:05,184 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2015-08-22 00:33:05,208 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-08-22 00:33:05,214 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-08-22 00:33:05,214 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-08-22 00:33:05,215 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-08-22 00:33:05,372 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-08-22 00:33:05,372 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-08-22 00:33:05,401 [JobControl] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoopmaster/192.168.1.50:8032
2015-08-22 00:33:05,425 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-08-22 00:33:06,020 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-08-22 00:33:06,020 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-08-22 00:33:06,055 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2015-08-22 00:33:06,153 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2015-08-22 00:33:06,167 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-08-22 00:33:06,686 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1440227566287_0001
2015-08-22 00:33:07,117 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1440227566287_0001
2015-08-22 00:33:07,179 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://hadoopmaster:8088/proxy/application_1440227566287_0001/
2015-08-22 00:33:07,180 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1440227566287_0001
2015-08-22 00:33:07,180 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B,C,D
2015-08-22 00:33:07,180 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[1,4],A[-1,-1],B[2,4],D[4,4],C[3,4] C: D[4,4],C[3,4] R: D[4,4]
2015-08-22 00:33:07,189 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-08-22 00:33:07,189 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1440227566287_0001]
2015-08-22 00:33:24,290 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2015-08-22 00:33:24,290 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1440227566287_0001]
2015-08-22 00:33:31,308 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1440227566287_0001]
2015-08-22 00:33:32,507 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-08-22 00:33:32,508 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
2.6.0   0.13.0  hadoop  2015-08-22 00:33:01     2015-08-22 00:33:32     GROUP_BY,FILTER

Success!

Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MedianMapTime   MaxReduceTime   MinReduceTime   AvgReduceTime   MedianReducetime        Alias   Feature Outputs
job_1440227566287_0001  1       1       6       6       6       6       3       3       3       3       A,B,C,D GROUP_BY,COMBINER       hdfs://hadoopmaster:9000/tmp/temp-321901511/tmp399055604,

Input(s):
Successfully read 7 records (440 bytes) from: "hdfs://hadoopmaster:9000/user/hadoop/ncdc_data.txt"

Output(s):
Successfully stored 6 records (54 bytes) in: "hdfs://hadoopmaster:9000/tmp/temp-321901511/tmp399055604"

Counters:
Total records written : 6
Total bytes written : 54
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1440227566287_0001


2015-08-22 00:33:32,541 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2015-08-22 00:33:32,543 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-08-22 00:33:32,544 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-08-22 00:33:32,557 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-08-22 00:33:32,557 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1953,122)
(1954,83)
(1955,44)
(1956,33)
(1958,33)
(1959,55)
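
Note that 1957 is absent from the result: its only record has a temperature of 50, so the FILTER condition temp != 50 drops it before grouping. A quick interactive check (the alias E is illustrative and was not part of the original session) would be:

grunt> E = FILTER A BY temp == 50;
grunt> DUMP E;
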
grunt> STORE D INTO 'max_temp' USING PigStorage(':');
2015-08-22 00:35:08,446 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-08-22 00:35:08,475 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.textoutputformat.separator is deprecated. Instead, use mapreduce.output.textoutputformat.separator
2015-08-22 00:35:08,493 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,FILTER
2015-08-22 00:35:08,494 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
2015-08-22 00:35:08,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-08-22 00:35:08,502 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2015-08-22 00:35:08,507 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-08-22 00:35:08,507 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-08-22 00:35:08,518 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-08-22 00:35:08,519 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoopmaster/192.168.1.50:8032
2015-08-22 00:35:08,523 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-08-22 00:35:08,524 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-08-22 00:35:08,524 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2015-08-22 00:35:08,524 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2015-08-22 00:35:08,527 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=71
2015-08-22 00:35:08,527 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2015-08-22 00:35:08,527 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-08-22 00:35:08,527 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job8143857507507996250.jar
2015-08-22 00:35:11,406 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job8143857507507996250.jar created
2015-08-22 00:35:11,414 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-08-22 00:35:11,415 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-08-22 00:35:11,415 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-08-22 00:35:11,415 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-08-22 00:35:11,454 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-08-22 00:35:11,457 [JobControl] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoopmaster/192.168.1.50:8032
2015-08-22 00:35:11,639 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-08-22 00:35:11,639 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2015-08-22 00:35:11,642 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2015-08-22 00:35:11,686 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2015-08-22 00:35:11,755 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1440227566287_0002
2015-08-22 00:35:11,778 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1440227566287_0002
2015-08-22 00:35:11,782 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://hadoopmaster:8088/proxy/application_1440227566287_0002/
2015-08-22 00:35:11,955 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1440227566287_0002
2015-08-22 00:35:11,955 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B,C,D
2015-08-22 00:35:11,955 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[1,4],A[-1,-1],B[2,4],D[4,4],C[3,4] C: D[4,4],C[3,4] R: D[4,4]
2015-08-22 00:35:11,963 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-08-22 00:35:11,963 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1440227566287_0002]
2015-08-22 00:35:23,989 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2015-08-22 00:35:23,990 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1440227566287_0002]
2015-08-22 00:35:29,002 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1440227566287_0002]
2015-08-22 00:35:32,106 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-08-22 00:35:32,107 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
2.6.0   0.13.0  hadoop  2015-08-22 00:35:08     2015-08-22 00:35:32     GROUP_BY,FILTER

Success!

Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MedianMapTime   MaxReduceTime   MinReduceTime   AvgReduceTime   MedianReducetime        Alias   Feature Outputs
job_1440227566287_0002  1       1       3       3       3       3       3       3       3       3       A,B,C,D GROUP_BY,COMBINER       hdfs://hadoopmaster:9000/user/hadoop/max_temp,

Input(s):
Successfully read 7 records (440 bytes) from: "hdfs://hadoopmaster:9000/user/hadoop/ncdc_data.txt"

Output(s):
Successfully stored 6 records (49 bytes) in: "hdfs://hadoopmaster:9000/user/hadoop/max_temp"

Counters:
Total records written : 6
Total bytes written : 49
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1440227566287_0002


2015-08-22 00:35:32,135 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
grunt> cat max_temp
1953:122
1954:83
1955:44
1956:33
1958:33
1959:55
grunt>
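
For reference, the same pipeline can be collected into a batch script and run outside the Grunt shell, for example with pig max_temp.pig (the file name is illustrative); a minimal sketch, assuming ncdc_data.txt sits in the user's HDFS home directory as above:

-- load the colon-delimited NCDC sample
A = LOAD 'ncdc_data.txt' USING PigStorage(':') AS (year:int, temp:int, quality:int);
-- keep only readings with a valid quality code, discarding temp 50 as in the session above
B = FILTER A BY temp != 50 AND ((chararray)quality matches '[012345]');
-- group by year and take the maximum temperature per group
C = GROUP B BY year;
D = FOREACH C GENERATE group, MAX(B.temp) AS max_temp;
-- write year:max_temp pairs to the max_temp directory in HDFS
STORE D INTO 'max_temp' USING PigStorage(':');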
