[hadoop][hive] Hive SORT BY and ORDER BY
Order by
- Total ordering of the query result set
- All data is passed through a single reducer
Sort by
- Orders the data within each reducer (local ordering)
- Each reducer's output is sorted, but the combined output of several reducers is not guaranteed to be globally sorted
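- With ORDER BY, the reduce phase has to produce a total ordering (the whole data set must be sorted before any later processing can continue), so all the work is concentrated on a single reducer.
- With SORT BY, each reducer only sorts the rows it receives (a local order), so the sort is spread across several reducers and is usually more efficient. The trade-off is that the result is only ordered within each reducer's output.
Usage examples of the two clauses: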
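-- Total ordering: every row goes through a single reducer
SELECT s.ymd, s.name, s.price
FROM stocks s
ORDER BY s.ymd ASC, s.name DESC;
-- Local ordering: each reducer sorts only its own rows
SELECT s.ymd, s.name, s.price
FROM stocks s
SORT BY s.ymd ASC, s.name DESC;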
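
The difference between the two queries only shows up when more than one reducer runs. The sketch below is one way to observe it, using the same `stocks` table. The reducer count of 3 is an arbitrary value chosen for illustration, and the property name is an assumption about the Hadoop version (older releases use `mapred.reduce.tasks` instead). `DISTRIBUTE BY` is added here to control which reducer receives which rows; it is not part of the original examples.

```sql
-- Assumption: 3 reducers, picked only for illustration.
-- On older Hadoop releases the property is mapred.reduce.tasks.
SET mapreduce.job.reduces=3;

-- Rows with the same ymd are sent to the same reducer (DISTRIBUTE BY),
-- then each reducer sorts its own rows (SORT BY).
-- The output is sorted per reducer, not across the whole result set.
SELECT s.ymd, s.name, s.price
FROM stocks s
DISTRIBUTE BY s.ymd
SORT BY s.ymd ASC, s.name DESC;
```

Without `DISTRIBUTE BY`, rows for the same `ymd` may land on different reducers, so the locally sorted outputs can overlap. If a globally ordered result is required, `ORDER BY` remains the safe choice, at the cost of the single reducer.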