有時候邏輯上相同的 HiveQL,在執行時間上會有所不同。或者,有時候我們只是需要瞭解進行某一個複雜的 HiveQL 查詢,需要做幾次 MapReduce。這時候就可以使用 EXPLAIN 這個關鍵字來進行查詢。語法為 EXPLAIN <HiveQL> ;
。
hive> EXPLAIN SELECT * FROM baseball_stats.fact_player_stats LIMIT 10;
OK
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME baseball_stats fact_player_stats))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_LIMIT 10)))
STAGE DEPENDENCIES:
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-0
Fetch Operator
limit: 10
Time taken: 0.14 seconds
hive> use baseball; EXPLAIN SELECT A.PlayerID, B.teamID, B.AB, B.R, B.H, B.2B, B.3B, B.HR, B.RBI FROM Master A JOIN BATTING B ON A.playerID = B.playerID;
OK
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME Master) A) (TOK_TABREF (TOK_TABNAME BATTING) B) (= (. (TOK_TABLE_OR_COL A) playerID) (. (TOK_TABLE_OR_COL B) playerID)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL A) PlayerID)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) teamID)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) AB)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) R)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) H)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) 2B)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) 3B)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) HR)) (TOK_SELEXPR (. (TOK_TABLE_OR_COL B) RBI)))))
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-1
Map Reduce
Alias -> Map Operator Tree:
a
TableScan
alias: a
Reduce Output Operator
key expressions:
expr: playerid
type: string
sort order: +
Map-reduce partition columns:
expr: playerid
type: string
tag: 0
value expressions:
expr: playerid
type: string
b
TableScan
alias: b
Reduce Output Operator
key expressions:
expr: playerid
type: string
sort order: +
Map-reduce partition columns:
expr: playerid
type: string
tag: 1
value expressions:
expr: teamid
type: string
expr: ab
type: int
expr: r
type: int
expr: h
type: int
expr: 2b
type: int
expr: 3b
type: int
expr: hr
type: int
expr: rbi
type: int
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {VALUE._col1}
1 {VALUE._col3} {VALUE._col7} {VALUE._col8} {VALUE._col9} {VALUE._col10} {VALUE._col11} {VALUE._col12} {VALUE._col13}
handleSkewJoin: false
outputColumnNames: _col1, _col38, _col42, _col43, _col44, _col45, _col46, _col47, _col48
Select Operator
expressions:
expr: _col1
type: string
expr: _col38
type: string
expr: _col42
type: int
expr: _col43
type: int
expr: _col44
type: int
expr: _col45
type: int
expr: _col46
type: int
expr: _col47
type: int
expr: _col48
type: int
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Stage: Stage-0
Fetch Operator
limit: -1
Time taken: 2.592 seconds
hive>