Description
in JD, we found that more than 90% usage of window function follows this pattern:
select (... (row_number|rank|dense_rank) () over( [partition by ...] order by ... ) as rn) where rn (==|<|<=) k and other conditions
However, existing physical plan is not optimum:
1, we should select local top-k records within each partitions, and then compute the global top-k. this can help reduce the shuffle amount;
For these three rank functions (row_number|rank|dense_rank), the rank of a key computed on partitial dataset is always <= its final rank computed on the whole dataset. so we can safely discard rows with partitial rank > k, anywhere.
2, skewed-window: some partition is skewed and take a long time to finish computation.
A real-world skewed-window case in our system is attached.
Attachments
Attachments
Issue Links
- links to