- Apache Pig 教程
- Apache Pig - 首頁
- Apache Pig 簡介
- Apache Pig - 概述
- Apache Pig - 架構
- Apache Pig 環境
- Apache Pig - 安裝
- Apache Pig - 執行
- Apache Pig - Grunt Shell
- Pig Latin
- Pig Latin - 基礎
- 載入與儲存運算子
- Apache Pig - 讀取資料
- Apache Pig - 儲存資料
- 診斷運算子
- Apache Pig - 診斷運算子
- Apache Pig - Describe 運算子
- Apache Pig - Explain 運算子
- Apache Pig - Illustrate 運算子
- Pig Latin 內建函式
- Apache Pig - Eval 函式
- 載入與儲存函式
- Apache Pig - Bag 與 Tuple 函式
- Apache Pig - 字串函式
- Apache Pig - 日期時間函式
- Apache Pig - 數學函式
- Apache Pig 有用資源
- Apache Pig - 快速指南
- Apache Pig - 有用資源
- Apache Pig - 討論
Apache Pig - 診斷運算子
load 語句會將資料簡單地載入到 Apache Pig 中指定的關聯中。要驗證Load語句的執行,必須使用診斷運算子。Pig Latin 提供四種不同型別的診斷運算子:
- Dump 運算子
- Describe 運算子
- Explain 運算子
- Illustrate 運算子
本章將討論 Pig Latin 的 Dump 運算子。
Dump 運算子
Dump 運算子用於執行 Pig Latin 語句並在螢幕上顯示結果。它通常用於除錯目的。
語法
以下是Dump運算子的語法。
grunt> Dump Relation_Name
示例
假設我們在 HDFS 中有一個檔案student_data.txt,內容如下。
001,Rajiv,Reddy,9848022337,Hyderabad 002,siddarth,Battacharya,9848022338,Kolkata 003,Rajesh,Khanna,9848022339,Delhi 004,Preethi,Agarwal,9848022330,Pune 005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar 006,Archana,Mishra,9848022335,Chennai.
我們使用 LOAD 運算子將其讀取到名為student的關聯中,如下所示。
grunt> student = LOAD 'hdfs://:9000/pig_data/student_data.txt'
USING PigStorage(',')
as ( id:int, firstname:chararray, lastname:chararray, phone:chararray,
city:chararray );
現在,讓我們使用Dump 運算子列印關聯的內容,如下所示。
grunt> Dump student
執行上述Pig Latin語句後,它將啟動一個 MapReduce 作業來從 HDFS 讀取資料。它將產生以下輸出。
2015-10-01 15:05:27,642 [main]
INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher -
100% complete
2015-10-01 15:05:27,652 [main]
INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.15.0 Hadoop 2015-10-01 15:03:11 2015-10-01 05:27 UNKNOWN
Success!
Job Stats (time in seconds):
JobId job_14459_0004
Maps 1
Reduces 0
MaxMapTime n/a
MinMapTime n/a
AvgMapTime n/a
MedianMapTime n/a
MaxReduceTime 0
MinReduceTime 0
AvgReduceTime 0
MedianReducetime 0
Alias student
Feature MAP_ONLY
Outputs hdfs://:9000/tmp/temp580182027/tmp757878456,
Input(s): Successfully read 0 records from: "hdfs://:9000/pig_data/
student_data.txt"
Output(s): Successfully stored 0 records in: "hdfs://:9000/tmp/temp580182027/
tmp757878456"
Counters: Total records written : 0 Total bytes written : 0 Spillable Memory Manager
spill count : 0Total bags proactively spilled: 0 Total records proactively spilled: 0
Job DAG: job_1443519499159_0004
2015-10-01 15:06:28,403 [main]
INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLau ncher - Success!
2015-10-01 15:06:28,441 [main] INFO org.apache.pig.data.SchemaTupleBackend -
Key [pig.schematuple] was not set... will not generate code.
2015-10-01 15:06:28,485 [main]
INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths
to process : 1
2015-10-01 15:06:28,485 [main]
INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths
to process : 1
(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune)
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai)
廣告