c++ - C++ 中的 Fast(est) 可变堆实现

coder 2024-02-05 原文

我目前正在寻找满足我要求的 C++ 中最快的数据结构:

我从需要插入的几百万个条目开始。

在每次迭代中，我想查看最大元素并更新
大约 10 个其他元素。我什至可以只使用减少的键，但我更喜欢更新(增加和减少功能)。

我不需要删除/插入(除了最初的)或其他任何东西。我认为堆将是更好的选择。在查看 STL 后，我发现大多数数据结构不支持更新(这是关键部分)。解决方案是删除并重新插入似乎很慢的元素(我的程序的瓶颈)。然后我查看了 boost 提供的堆，发现 pairing_heap 给了我最好的结果。然而，所有堆仍然比 MultiMap 上的删除/插入过程慢。有没有人有建议，我可以尝试哪些其他方法/实现？

非常感谢。

再次为完整起见，这是我目前的时间安排:

MultiMap STL(删除/插入):~70 秒

斐波那契 boost :~110 秒

D-Ary 堆 boost ~(最佳选择:D=150):~120 秒

配对堆 boost :~90 秒

倾斜堆 boost :105 秒

编辑我的帖子以澄清一些事情:

我的条目是 double ( double 是关键，我仍然附加了一些任意值)这就是为什么我认为散列不是一个好主意。

我说的是不正确的 PriorityQueue。相反，第一个实现使用了 MultiMap，其中值将被删除然后重新插入(使用新值)。我更新了我的帖子。对困惑感到抱歉。

我不明白 std::make_heap 如何解决这个问题。

为了更新元素，我有一个单独的查找表，我在其中维护元素的句柄。有了它，我可以直接更新元素而无需搜索它。

最佳答案

减少一般性

我尝试了这个，因为它很有趣，我想我自己可以使用这样的结构。击败专业编写的标准库通常非常困难，除非您可以做出缩小的假设，从而比他们的解决方案的普遍性(以及可能的稳健性)限制更多。

例如，很难击败malloc和 free如果您的目标只是编写一个通用且全面的内存分配器，如 malloc .但是，如果您对特定用例做出很多假设，则很容易编写一个分配器，仅针对特定用例就可以胜过这些假设。

这就是为什么，如果您对数据结构有这样的问题，我建议尽可能多地提供有关您的特定用例的特殊细节的信息(您需要从结构中获取的信息，例如迭代、排序、搜索、从中间删除，插入到前面，键的范围，类型等)。您想要做的一些测试代码(甚至伪代码)是有帮助的。我们想要尽可能多的缩小假设。

高动态内容的低于标准搜索算法

在您的情况下，我们对如此大的堆类型结构进行了特殊的动态使用。我最初来自老派游戏背景，在那些非常动态的情况下，通常低于标准的搜索算法实际上比具有卓越算法复杂性的搜索算法更有效，因为搜索优势通常伴随着更昂贵的价格标签构建/更新过程(较慢的插入/删除)。

例如，如果您有大量 Sprite 在 2D 游戏中的每一帧周围移动，那么平均而言，用于最近邻搜索的粗略编写的、算法上较差的固定网格 boost 器在实践中通常比算法上高级的、专业的——书面四叉树，因为不断移动事物和重新平衡树的成本可能会增加开销，超过一切的优越 boost 和理论对数复杂性。当所有 Sprite 聚集在一个区域时，固定网格会出现病态情况，但这种情况很少发生。

所以我采取了这种策略。我在做一些假设，其中最大的假设是您的 key 在一个范围内合理分布。另一个是你可以粗略地估计容器应该处理的最大元素数(虽然你可以超过或低于这个数字，但最好有一些粗略的知识和预期)。而且我没有费心提供迭代器来遍历容器(如果您愿意，可以这样做，但是键不会像 std::multimap 那样完美排序，它们会被“有点”排序)，但它确实提供除了使用最小键弹出元素之外，还从中间移除。我的解决方案中存在一个病理案例，例如 std::multimap 中不存在的案例，如果您有大量元素的键值大致相同(例如:0.000001、0.00000012、0.000000011 等)对于所有百万个元素，它会降级为对所有元素的线性搜索，并且性能比 multimap 差得多。 .

但我得到了一个比 std::multmap 快约 8 倍的解决方案如果我的假设适合您的用例。

注意:它是草率的代码，并且使用了许多快速而肮脏的分析器辅助的微优化编写，甚至提供了一个池分配器并在位和字节级别使用对齐假设来操纵事物(使用最大对齐假设“足够便携”) .它也不关心异常安全之类的事情。但是，用于 C++ 对象应该是安全的。

作为测试用例，我创建了一百万个随 secret 钥并开始弹出最小 key 、更改它们并重新插入它们。我对多重映射和我的结构都这样做了，以比较性能。

平衡分布堆/优先队列(有点)

#include <iostream>
#include <cassert>
#include <utility>
#include <stdexcept>
#include <algorithm>
#include <cmath>
#include <ctime>
#include <map>
#include <vector>
#include <malloc.h>

// Max Alignment
#if defined(_MSC_VER)
    #define MAX_ALIGN __declspec(align(16))
#else
    #define MAX_ALIGN __attribute__((aligned(16)))
#endif

using namespace std;

static void* max_malloc(size_t amount)
{
    #ifdef _MSC_VER
        return _aligned_malloc(amount, 16);
    #else
        void* mem = 0;
        posix_memalign(&mem, 16, amount);
        return mem;
    #endif
}

static void max_free(void* mem)
{
    #ifdef _MSC_VER
        return _aligned_free(mem);
    #else
        free(mem);
    #endif
}

// Balanced priority queue for very quick insertions and 
// removals when the keys are balanced across a distributed range.
template <class Key, class Value, class KeyToIndex>
class BalancedQueue
{
public:
    enum {zone_len = 256};

    /// Creates a queue with 'n' buckets.
    explicit BalancedQueue(int n): 
        num_nodes(0), num_buckets(n+1), min_bucket(n+1), buckets(static_cast<Bucket*>(max_malloc((n+1) * sizeof(Bucket)))), free_nodes(0), pools(0)
    {
        const int num_zones = num_buckets / zone_len + 1;
        zone_counts = new int[num_zones];
        for (int j=0; j < num_zones; ++j)
            zone_counts[j] = 0;

        for (int j=0; j < num_buckets; ++j)
        {
            buckets[j].num = 0;
            buckets[j].head = 0;
        }
    }

    /// Destroys the queue.
    ~BalancedQueue()
    {
        clear();
        max_free(buckets);
        while (pools)
        {
            Pool* to_free = pools;
            pools = pools->next;
            max_free(to_free);
        }
        delete[] zone_counts;
    }

    /// Makes the queue empty.
    void clear()
    {
        const int num_zones = num_buckets / zone_len + 1;
        for (int j=0; j < num_zones; ++j)
            zone_counts[j] = 0;
        for (int j=0; j < num_buckets; ++j)
        {
            while (buckets[j].head)
            {
                Node* to_free = buckets[j].head;
                buckets[j].head = buckets[j].head->next;
                node_free(to_free);
            }
            buckets[j].num = 0;
        }
        num_nodes = 0;
        min_bucket = num_buckets+1;
    }

    /// Pushes an element to the queue.
    void push(const Key& key, const Value& value)
    {
        const int index = KeyToIndex()(key);
        assert(index >= 0 && index < num_buckets && "Key is out of range!");

        Node* new_node = node_alloc();
        new (&new_node->key) Key(key);
        new (&new_node->value) Value(value);
        new_node->next = buckets[index].head;
        buckets[index].head = new_node;
        assert(new_node->key == key && new_node->value == value);
        ++num_nodes;
        ++buckets[index].num;
        ++zone_counts[index/zone_len];
        min_bucket = std::min(min_bucket, index);
    }

    /// @return size() == 0.
    bool empty() const
    {
        return num_nodes == 0;
    }

    /// @return The number of elements in the queue.
    int size() const
    {
        return num_nodes;
    }

    /// Pops the element with the minimum key from the queue.
    std::pair<Key, Value> pop()
    {
        assert(!empty() && "Queue is empty!");
        for (int j=min_bucket; j < num_buckets; ++j)
        {
            if (buckets[j].head)
            {
                Node* node = buckets[j].head;
                Node* prev_node = node;
                Node* min_node = node;
                Node* prev_min_node = 0;
                const Key* min_key = &min_node->key;
                const Value* min_val = &min_node->value;
                for (node = node->next; node; prev_node = node, node = node->next)
                {
                    if (node->key < *min_key)
                    {
                        prev_min_node = prev_node;
                        min_node = node;
                        min_key = &min_node->key;
                        min_val = &min_node->value;
                    }
                }
                std::pair<Key, Value> kv(*min_key, *min_val);
                if (min_node == buckets[j].head)
                    buckets[j].head = buckets[j].head->next;
                else
                {
                    assert(prev_min_node);
                    prev_min_node->next = min_node->next;
                }
                removed_node(j);
                node_free(min_node);
                return kv;
            }
        }
        throw std::runtime_error("Trying to pop from an empty queue.");
    }

    /// Erases an element from the middle of the queue.
    /// @return True if the element was found and removed.
    bool erase(const Key& key, const Value& value)
    {
        assert(!empty() && "Queue is empty!");
        const int index = KeyToIndex()(key);
        if (buckets[index].head)
        {
            Node* node = buckets[index].head;
            if (node_key(node) == key && node_val(node) == value)
            {
                buckets[index].head = buckets[index].head->next;
                removed_node(index);
                node_free(node);
                return true;
            }

            Node* prev_node = node;
            for (node = node->next; node; prev_node = node, node = node->next)
            {
                if (node_key(node) == key && node_val(node) == value)
                {
                    prev_node->next = node->next;
                    removed_node(index);
                    node_free(node);
                    return true;
                }
            }
        }
        return false;
    }

private:
    // Didn't bother to make it copyable -- left as an exercise.
    BalancedQueue(const BalancedQueue&);
    BalancedQueue& operator=(const BalancedQueue&);

    struct Node
    {
        Key key;
        Value value;
        Node* next;
    };
    struct Bucket
    {
        int num;
        Node* head;
    };
    struct Pool
    {
        Pool* next;
        MAX_ALIGN char buf[1];
    };
    Node* node_alloc()
    {
        if (free_nodes)
        {
            Node* node = free_nodes;
            free_nodes = free_nodes->next;
            return node;
        }

        const int pool_size = std::max(4096, static_cast<int>(sizeof(Node)));
        Pool* new_pool = static_cast<Pool*>(max_malloc(sizeof(Pool) + pool_size - 1));
        new_pool->next = pools;
        pools = new_pool;

        // Push the new pool's nodes to the free stack.
        for (int j=0; j < pool_size; j += sizeof(Node))
        {
            Node* node = reinterpret_cast<Node*>(new_pool->buf + j);
            node->next = free_nodes;
            free_nodes = node;
        }
        return node_alloc();
    }
    void node_free(Node* node)
    {
        // Destroy the key and value and push the node back to the free stack.
        node->key.~Key();
        node->value.~Value();
        node->next = free_nodes;
        free_nodes = node;
    }
    void removed_node(int bucket_index)
    {
        --num_nodes;
        --zone_counts[bucket_index/zone_len];
        if (--buckets[bucket_index].num == 0 && bucket_index == min_bucket)
        {
            // If the bucket became empty, search for next occupied minimum zone.
            const int num_zones = num_buckets / zone_len + 1;
            for (int j=bucket_index/zone_len; j < num_zones; ++j)
            {
                if (zone_counts[j] > 0)
                {
                    for (min_bucket=j*zone_len; min_bucket < num_buckets && buckets[min_bucket].num == 0; ++min_bucket) {}
                    assert(min_bucket/zone_len == j);
                    return;
                }
            }
            min_bucket = num_buckets+1;
            assert(empty());
        }
    }
    int* zone_counts;
    int num_nodes;
    int num_buckets;
    int min_bucket;
    Bucket* buckets;
    Node* free_nodes;
    Pool* pools;
};

/// Test Parameters
enum {num_keys = 1000000};
enum {buckets = 100000};

static double sys_time()
{
    return static_cast<double>(clock()) / CLOCKS_PER_SEC;
}

struct KeyToIndex
{
    int operator()(double val) const
    {
        return static_cast<int>(val * buckets);
    }
};

int main()
{
    vector<double> keys(num_keys);
    for (int j=0; j < num_keys; ++j)
        keys[j] = static_cast<double>(rand()) / RAND_MAX;

    for (int k=0; k < 5; ++k)
    {
        // Multimap
        {
            const double start_time = sys_time();
            multimap<double, int> q;
            for (int j=0; j < num_keys; ++j)
                q.insert(make_pair(keys[j], j));

            // Pop each key, modify it, and reinsert.
            for (int j=0; j < num_keys; ++j)
            {
                pair<double, int> top = *q.begin();
                q.erase(q.begin());
                top.first = static_cast<double>(rand()) / RAND_MAX;
                q.insert(top);
            }
            cout << (sys_time() - start_time) << " secs for multimap" << endl;
        }

        // Balanced Queue
        {
            const double start_time = sys_time();
            BalancedQueue<double, int, KeyToIndex> q(buckets);
            for (int j=0; j < num_keys; ++j)
                q.push(keys[j], j);

            // Pop each key, modify it, and reinsert.
            for (int j=0; j < num_keys; ++j)
            {
                pair<double, int> top = q.pop();
                top.first = static_cast<double>(rand()) / RAND_MAX;
                q.push(top.first, top.second);
            }
            cout << (sys_time() - start_time) << " secs for BalancedQueue" << endl;
        }
        cout << endl;
    }
}

我机器上的结果:

3.023 secs for multimap
0.34 secs for BalancedQueue

2.807 secs for multimap
0.351 secs for BalancedQueue

2.771 secs for multimap
0.337 secs for BalancedQueue

2.752 secs for multimap
0.338 secs for BalancedQueue

2.742 secs for multimap
0.334 secs for BalancedQueue

关于c++ - C++ 中的 Fast(est) 可变堆实现，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/29931592/

amp 可变 node buckets num c++performance boost stl heap

有关c++ - C++ 中的 Fast(est) 可变堆实现的更多相关文章

ruby - 如何从 ruby 中的字符串运行任意对象方法？ - 2
总的来说，我对ruby还比较陌生，我正在为我正在创建的对象编写一些rspec测试用例。许多测试用例都非常基础，我只是想确保正确填充和返回值。我想知道是否有办法使用循环结构来执行此操作。不必为我要测试的每个方法都设置一个assertEquals。例如:describeitem,"TestingtheItem"doit"willhaveanullvaluetostart"doitem=Item.new#HereIcoulddotheitem.name.shouldbe_nil#thenIcoulddoitem.category.shouldbe_nilendend但我想要一些方法来使用
ruby - 其他文件中的 Rake 任务 - 2
我试图在一个项目中使用rake，如果我把所有东西都放到Rakefile中，它会很大并且很难读取/找到东西，所以我试着将每个命名空间放在lib/rake中它自己的文件中，我添加了这个到我的rake文件的顶部:Dir['#{File.dirname(__FILE__)}/lib/rake/*.rake'].map{|f|requiref}它加载文件没问题，但没有任务。我现在只有一个.rake文件作为测试，名为“servers.rake”，它看起来像这样:namespace:serverdotask:testdoputs"test"endend所以当我运行rakeserver:testid时
ruby-on-rails - Ruby net/ldap 模块中的内存泄漏 - 2
作为我的Rails应用程序的一部分，我编写了一个小导入程序，它从我们的LDAP系统中吸取数据并将其塞入一个用户表中。不幸的是，与LDAP相关的代码在遍历我们的32K用户时泄漏了大量内存，我一直无法弄清楚如何解决这个问题。这个问题似乎在某种程度上与LDAP库有关，因为当我删除对LDAP内容的调用时，内存使用情况会很好地稳定下来。此外，不断增加的对象是Net::BER::BerIdentifiedString和Net::BER::BerIdentifiedArray，它们都是LDAP库的一部分。当我运行导入时，内存使用量最终达到超过1GB的峰值。如果问题存在，我需要找到一些方法来更正我的代
ruby-on-rails - Rails 3 中的多个路由文件 - 2
Rails2.3可以选择随时使用RouteSet#add_configuration_file添加更多路由。是否可以在Rails3项目中做同样的事情？最佳答案在config/application.rb中:config.paths.config.routes在Rails3.2(也可能是Rails3.1)中，使用:config.paths["config/routes"] 关于ruby-on-rails-Rails3中的多个路由文件，我们在StackOverflow上找到一个类似的问题
ruby-on-rails - Rails - 一个 View 中的多个模型 - 2
我需要从一个View访问多个模型。以前，我的links_controller仅用于提供以不同方式排序的链接资源。现在我想包括一个部分(我假设)显示按分数排序的顶级用户(@users=User.all.sort_by(&:score))我知道我可以将此代码插入每个链接操作并从View访问它，但这似乎不是“ruby方式”，我将需要在不久的将来访问更多模型。这可能会变得很脏，是否有针对这种情况的任何技术？注意事项:我认为我的应用程序正朝着单一格式和动态页面内容的方向发展，本质上是一个典型的网络应用程序。我知道before_filter但考虑到我希望应用程序进入的方向，这似乎很麻烦。最终从任何
ruby-on-rails - Rails 3.2.1 中 ActionMailer 中的未定义方法 'default_content_type=' - 2
我在我的项目中添加了一个系统来重置用户密码并通过电子邮件将密码发送给他，以防他忘记密码。昨天它运行良好(当我实现它时)。当我今天尝试启动服务器时，出现以下错误。=>BootingWEBrick=>Rails3.2.1applicationstartingindevelopmentonhttp://0.0.0.0:3000=>Callwith-dtodetach=>Ctrl-CtoshutdownserverExiting/Users/vinayshenoy/.rvm/gems/ruby-1.9.3-p0/gems/actionmailer-3.2.1/lib/action_mailer
ruby-on-rails - 如何优雅地重启 thin + nginx？ - 2
我的瘦服务器配置了nginx，我的ROR应用程序正在它们上运行。在我发布代码更新时运行thinrestart会给我的应用程序带来一些停机时间。我试图弄清楚如何优雅地重启正在运行的Thin实例，但找不到好的解决方案。有没有人能做到这一点？最佳答案 #Restartjustthethinserverdescribedbythatconfigsudothin-C/etc/thin/mysite.ymlrestartNginx将继续运行并代理请求。如果您将Nginx设置为使用多个上游服务器，例如server{listen80;server
ruby-on-rails - Rails 应用程序中的 Rails : How are you using application_controller. rb 是新手吗？ - 2
刚入门rails，开始慢慢理解。有人可以解释或给我一些关于在application_controller中编码的好处或时间和原因的想法吗？有哪些用例。您如何为Rails应用程序使用应用程序Controller？我不想在那里放太多代码，因为据我了解，每个请求都会调用此Controller。这是真的？最佳答案 ApplicationController实际上是您应用程序中的每个其他Controller都将从中继承的类(尽管这不是强制性的)。我同意不要用太多代码弄乱它并保持干净整洁的态度，尽管在某些情况下ApplicationContr
ruby-on-rails - form_for 中不在模型中的自定义字段 - 2
我想向我的Controller传递一个参数，它是一个简单的复选框，但我不知道如何在模型的form_for中引入它，这是我的观点:{:id=>'go_finance'}do|f|%>Transferirde:para:Entrada:"input",:placeholder=>"Quantofoiganho?"%>Saída:"output",:placeholder=>"Quantofoigasto?"%>Nota:我想做一个额外的复选框，但我该怎么做，模型中没有一个对象，而是一个要检查的对象，以便在Controller中创建一个ifelse，如果没有检查，请帮助我，非常感谢,谢谢
ruby - rspec 需要 .rspec 文件中的 spec_helper - 2
我注意到像bundler这样的项目在每个specfile中执行requirespec_helper我还注意到rspec使用选项--require，它允许您在引导rspec时要求一个文件。您还可以将其添加到.rspec文件中，因此只要您运行不带参数的rspec就会添加它。使用上述方法有什么缺点可以解释为什么像bundler这样的项目选择在每个规范文件中都需要spec_helper吗？最佳答案我不在Bundler上工作，所以我不能直接谈论他们的做法。并非所有项目都checkin.rspec文件。原因是这个文件，通常按照当前的惯例，只

c++ - C++ 中的 Fast(est) 可变堆实现

有关c++ - C++ 中的 Fast(est) 可变堆实现的更多相关文章

随机推荐