How do I find the tuple with the MAX value in Pig?
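My input file, myData.txt, looks like this:
A,20
B,10
C,40
D,5
And this is my script:
data = LOAD 'myData.txt' USING PigStorage(',') AS (key, value);
grouped = GROUP data ALL;
maxKey = FOREACH grouped GENERATE MAX(data.value);
DUMP maxKey;
This returns 40, but I want the complete key-value pair: C,40. Any ideas?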
Best answer
This works with Pig 0.10.0:
data = LOAD 'myData.txt' USING PigStorage(',') AS (key, value: long);
A = GROUP data ALL;
B = FOREACH A GENERATE MAX(data.value) AS val;
-- B holds a single row, so B.val can be used as a scalar in the filter
C = FILTER data BY value == (long)B.val;
DUMP C;
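As an alternative to the filter above, a sort-and-limit approach may also do the job. The following is a minimal sketch, not part of the original answer: the aliases `sorted` and `top1` are illustrative names, and declaring `key` as `chararray` is an assumption about the data.

```pig
-- Load the same comma-separated file, with explicit types (assumed schema)
data = LOAD 'myData.txt' USING PigStorage(',') AS (key:chararray, value:long);

-- Sort so the largest value comes first
sorted = ORDER data BY value DESC;

-- Keep only the first row of the sorted relation
top1 = LIMIT sorted 1;

DUMP top1;
```

For the sample data this should print (C,40). One difference worth noting: if several rows share the maximum value, LIMIT keeps just one of them, whereas the FILTER version keeps all of them.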
This question about finding the max of a tuple in Hadoop Pig comes from Stack Overflow: https://stackoverflow.com/questions/14055956/