Tutorial - Apache Hive - Apache Software Foundation
https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-Joins
The Hive documentation lists only a few join variants.
One of them is LEFT SEMI JOIN:
"In order to check the existence of a key in another table, the user can use LEFT SEMI JOIN as illustrated by the following example."
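INSERT OVERWRITE TABLE pv_users
SELECT u.*
FROM user u LEFT SEMI JOIN page_view pv
  -- In a LEFT SEMI JOIN the right-hand table may only be referenced in the ON clause,
  -- so the date filter on pv is placed here rather than in WHERE.
  ON (pv.userid = u.id AND pv.date = '2008-03-03');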
Suppose there are two tables, A and B:
A
id name
1 abc
2 edf
B
id city
1 taipei
2 ku
1 yl
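With LEFT SEMI JOIN, a row of A is returned at most once no matter how many rows of B match it: id 1 appears once in the result even though B holds two rows for it (taipei and yl). B serves only as an existence check, so the result is effectively deduplicated.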
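To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module with tables A and B from above. SQLite has no LEFT SEMI JOIN keyword, so the sketch assumes the equivalent EXISTS form (the Hive manual describes LEFT SEMI JOIN as implementing IN/EXISTS subquery semantics). The Hive query appears only as a comment and is not executed here.

```python
import sqlite3

# In-memory database holding the two example tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, name TEXT);
    CREATE TABLE B (id INTEGER, city TEXT);
    INSERT INTO A VALUES (1, 'abc'), (2, 'edf');
    INSERT INTO B VALUES (1, 'taipei'), (2, 'ku'), (1, 'yl');
""")

# Ordinary inner join: id 1 matches two rows in B, so A's row is repeated.
inner_rows = conn.execute(
    "SELECT A.id, A.name FROM A JOIN B ON A.id = B.id ORDER BY A.id"
).fetchall()

# Semi-join semantics written with EXISTS.
# Hive form (not run here):
#   SELECT A.* FROM A LEFT SEMI JOIN B ON (A.id = B.id);
semi_rows = conn.execute(
    "SELECT A.id, A.name FROM A "
    "WHERE EXISTS (SELECT 1 FROM B WHERE B.id = A.id) "
    "ORDER BY A.id"
).fetchall()

print(inner_rows)  # [(1, 'abc'), (1, 'abc'), (2, 'edf')]
print(semi_rows)   # [(1, 'abc'), (2, 'edf')]
```

The inner join returns three rows because id 1 is matched twice, while the semi-join form returns each row of A once and exposes no columns from B.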
cf.
Hive Join (translated from the Hive wiki) - ggjucheng - 博客园 (cnblogs)
http://www.cnblogs.com/ggjucheng/archive/2013/01/15/2860723.html