
hadoop - Explain "There can be many keys (and their associated values) in each partition, but the records for any given key are all in a single partition"

coder 2024-01-08

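"There can be many keys (and their associated values) in each partition, but the records for any given key are all in a single partition." This is a line from a well-known Hadoop textbook. I don't fully understand the second part, "but the records for any given key are all in a single partition." Does it mean that all records for a single key must be in a single partition, or something else?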

Best answer

but the records for any given key are all in a single partition

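Yes. If you have a key, that key and all of its associated values must be on a single partition; they are never split across partitions. The values for one key can sometimes be quite large, so this is effectively a limit on their size: together they must be small enough to fit in a single partition.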
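The rule above can be illustrated with a minimal sketch of hash partitioning, written in plain Python rather than against the Hadoop API. The names `partition_for` and `partition_records`, the sample records, and the choice of three partitions are all hypothetical and exist only for this illustration. The idea is that the partition index is computed from the key alone, so every record with the same key necessarily lands in the same partition, while one partition may still hold many different keys.

```python
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable string hash.

    Hypothetical helper: a deterministic hash is used instead of Python's
    built-in hash(), which is salted per process for strings.
    """
    h = 0
    for ch in key:
        # Keep the running hash non-negative and bounded to 31 bits.
        h = (31 * h + ord(ch)) & 0x7FFFFFFF
    return h % num_partitions


def partition_records(records, num_partitions):
    """Group (key, value) records by the partition chosen for their key."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)


# Sample data (made up for this sketch): several records share a key.
records = [
    ("apple", 1),
    ("banana", 2),
    ("apple", 3),
    ("cherry", 4),
    ("banana", 5),
    ("apple", 6),
]

if __name__ == "__main__":
    for pid, recs in sorted(partition_records(records, 3).items()):
        print(pid, recs)
```

Because the partition depends only on the key and the number of partitions, no key can ever appear in two partitions, which is exactly what the quoted sentence states.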

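Note that there may be other constraints on keys and values, depending on what you use for backend storage. For example, a single key-value pair may need to fit in a node's memory.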

A similar question on Stack Overflow: https://stackoverflow.com/questions/22038034/
