2.2 Batting Table playerID Player ID code yearID Year stint player's stint (order of appearances within a season) teamID Team lgID League G Games G_batting Game as batter AB At Bats R Runs H Hits 2B Doubles 3B Triples HR Homeruns RBI Runs Batted In SB Stolen Bases CS Caught Stealing BB Base on Balls SO Strikeouts IBB Intentional walks HBP Hit by pitch SH Sacrifice hits SF Sacrifice flies GIDP Grounded into double plays G_Old Old version of games (deprecated)
根據第一行的資料,我們可以決定資料表每個欄位的資料型態
欄位 | 範例 | 資料型態 |
---|---|---|
playerID | aardsda01 | STRING |
yearID | 2004 | INT |
stint | 1 | INT |
teamID | SFN | STRING |
lgID | NL | STRING |
G | 11 | INT |
G_batting | 11 | INT |
AB | 0 | INT |
R | 0 | INT |
H | 0 | INT |
2B | 0 | INT |
3B | 0 | INT |
HR | 0 | INT |
RBI | 0 | INT |
SB | 0 | INT |
CS | 0 | INT |
BB | 0 | INT |
SO | 0 | INT |
IBB | 0 | INT |
HBP | 0 | INT |
SH | 0 | INT |
SF | 0 | INT |
GIDP | 0 | INT |
G_old | 11 | INT |
hive> create table baseball_userXX.Batting ( playerID STRING, yearID INT, stint INT, teamID STRING, lgID STRING, G INT, G_batting INT, AB INT, R INT, H INT, B2 INT, B3 INT, HR INT, RBI INT, SB INT, CS INT, BB INT, SO INT, IBB INT, HBP INT, SH INT, SF INT, GIDP INT, G_old INT ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ;
hive> LOAD DATA LOCAL INPATH "Batting.csv" OVERWRITE INTO TABLE baseball_userXX.Batting;
SHOW TABLES;
來查詢目前的資料庫有哪幾個資料表。確認是否有正確產生 Batting 資料表。
hive> USE baseball_userXX; SHOW TABLES; OK Time taken: 0.031 seconds OK batting master Time taken: 0.22 seconds
hive> SELECT * FROM baseball_userXX.batting;
SELECT <欄位> FROM <資料表A> JOIN <資料表B> ON <條件>
。
hive> use baseball_userXX; hive> SELECT A.PlayerID, B.teamID, B.AB, B.R, B.H, B.B2, B.B3, B.HR, B.RBI FROM Master A JOIN BATTING B ON A.playerID = B.playerID;
zuverge01 BAL 17 0 2 0 0 0 2 zuverge01 BAL 23 1 3 0 0 0 0 zuverge01 BAL 9 0 2 0 1 0 2 zuverge01 BAL 0 0 0 0 0 0 0 zwilldu01 CHA 87 7 16 5 0 0 5 zwilldu01 CHF 592 91 185 38 8 16 95 zwilldu01 CHF 548 65 157 32 7 13 94 zwilldu01 CHN 53 4 6 1 0 1 8 Time taken: 105.467 seconds