c++ - Hand-coded quicksort is slower on smaller integers

coder 2024-02-03 (original post)

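While comparing a quicksort implementation against my compiler's std::sort and against a mergesort implementation, I noticed an odd pattern on large data sets: when operating on 64-bit integers, quicksort is consistently faster than mergesort; on smaller integer sizes, however, quicksort gets slower and mergesort gets faster.

Here is the test code: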

#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
#include <utility>
#include <random>
#include <chrono>
#include <limits>
#include <functional>
#include <string>

#include <cstdint>


template <typename Iterator>
void insertion_sort(Iterator first, Iterator last)
{
    using namespace std;

    Iterator head = first;
    Iterator new_position;

    while(head != last)
    {
        new_position = head;
        while(new_position != first && *new_position < *prev(new_position))
        {
            swap(*new_position, *prev(new_position));
            --new_position;
        }
        ++head;
    }
}

template <typename Iterator>
void recursive_mergesort_impl(Iterator first, Iterator last, std::vector<typename Iterator::value_type>& temp)
{
    if(last - first > 32)
    {
        auto middle = first + (last-first)/2;
        recursive_mergesort_impl(first, middle, temp);
        recursive_mergesort_impl(middle, last, temp);
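        // merge_move is not defined in this listing; std::merge over move
        // iterators is used here as a stand-in with the same intent.
        auto last_merged = std::merge(std::make_move_iterator(first), std::make_move_iterator(middle),
                                      std::make_move_iterator(middle), std::make_move_iterator(last),
                                      temp.begin());
        // Move the merged run back into the original range.
        std::move(temp.begin(), last_merged, first);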
    }
    else
    {
        insertion_sort(first, last);
    }
}

template <typename Iterator>
void recursive_mergesort(Iterator first, Iterator last)
{
    std::vector<typename Iterator::value_type> temp(last-first);
    recursive_mergesort_impl(first, last, temp);
}

// Pick a median-of-three pivot and move it to the back of the range
template <typename Iterator>
void quicksort_pivot_back(Iterator first, Iterator last)
{
    using namespace std;

    auto middle = first + (last-first)/2;
    auto last_elem = prev(last);
    Iterator pivot;

    if(*first < *middle)
    {
        if(*middle < *last_elem)
            pivot = middle;
        else if(*first < *last_elem)
            pivot = last_elem;
        else
            pivot = first;
    }
    else if(*first < *last_elem)
        pivot = first;
    else if(*middle < *last_elem)
        pivot = last_elem;
    else
        pivot = middle;

    swap(*last_elem, *pivot);
}

template <typename Iterator, typename Function>
std::pair<Iterator, Iterator> quicksort_partition(Iterator first, Iterator last, Function pivot_select)
{
    using namespace std;

    pivot_select(first, last);

    auto pivot = prev(last);
    auto bottom = first;
    auto top = pivot;

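    // Two-way partition: every element not less than the pivot, including
    // duplicates of the pivot value, ends up on the top side.
    while(bottom != top)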
    {
        if(*bottom < *pivot) ++bottom;
        else swap(*bottom, *--top);
    }

    swap(*pivot, *top++);

    return make_pair(bottom, top);
}

template <typename Iterator>
void quicksort_loop(Iterator first, Iterator last)
{
    using namespace std;

    while(last - first > 32)
    {
        auto bounds = quicksort_partition(first, last, quicksort_pivot_back<Iterator>);

        quicksort_loop(bounds.second, last);
        last = bounds.first;
    }
}


template <typename Iterator>
void quicksort(Iterator first, Iterator last)
{
    quicksort_loop(first, last);
    insertion_sort(first, last);
}

template <typename IntType = uint64_t, typename Duration = std::chrono::microseconds, typename Timer = std::chrono::high_resolution_clock, typename Function, typename Generator>
void run_trial(Function sort_func, Generator gen, std::string name, std::size_t trial_size, std::size_t trial_count)
{
    using namespace std;
    using namespace chrono;

    vector<IntType> data(trial_size);

    Duration elapsed(0);

    cout << "Sorting with " << name << endl;

    for(unsigned int i = 0; i < trial_count; ++i)
    {
        generate(data.begin(), data.end(), gen);

        auto start = Timer::now();
        sort_func(data.begin(), data.end());
        auto finish = Timer::now();

        elapsed += duration_cast<Duration>(finish-start);
    }

    cout << "Done. Average elapsed time: " << elapsed.count() / trial_count << endl;
    cout << "Is correct: " << is_sorted(data.begin(), data.end()) << endl << endl;
}

int main()
{
    using namespace std;
    using namespace chrono;

    using int_type = uint64_t;
    const size_t trial_size = 12800000;
    const int trial_count = 15;

    vector<int_type> data(trial_size);
    uniform_int_distribution<int_type> distr;
    mt19937_64 rnd;

    run_trial<int_type>(recursive_mergesort<vector<int_type>::iterator>, bind(distr, rnd), "recursive mergesort", trial_size, trial_count);
    run_trial<int_type>(quicksort<vector<int_type>::iterator>, bind(distr, rnd), "quicksort", trial_size, trial_count);
    run_trial<int_type>(sort<vector<int_type>::iterator>, bind(distr, rnd), "std::sort", trial_size, trial_count);
}

Here are the timings for 15 trials of 12,800,000 elements each (average per trial, in microseconds):

uint64_t:

Sorting with recursive mergesort
Done. Average elapsed time: 1725431
Is correct: 1

Sorting with quicksort
Done. Average elapsed time: 1238070
Is correct: 1

Sorting with std::sort
Done. Average elapsed time: 1131464
Is correct: 1

uint16_t:

Sorting with recursive mergesort
Done. Average elapsed time: 1186467
Is correct: 1

Sorting with quicksort
Done. Average elapsed time: 2368535
Is correct: 1

Sorting with std::sort
Done. Average elapsed time: 888517
Is correct: 1

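I have a feeling this has to do with unaligned memory access, but that still leaves me wondering why the other algorithms speed up while quicksort slows down.

Best answer

With uint16_t, you are going to get a lot of duplicates in an array this large: in expectation, each value from 0 through 65535 occurs about 195 times (12,800,000 / 65,536 ≈ 195.3). Without a three-way ("fat") partition, or at least returning the middle of the run of duplicated pivot values in the subarray being processed, this causes quicksort to go quadratic. (Try performing a naive quicksort with pencil and paper on an array containing only zeros to see the effect.)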
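To make the three-way partition idea concrete, here is a minimal sketch. The names partition3 and quicksort3 are hypothetical and do not appear in the question's code; the sketch uses a plain middle-element pivot and omits the insertion-sort cutoff, so it illustrates the technique rather than serving as a tuned replacement for the code above.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>

// Hypothetical helper (not from the question): Dutch-national-flag partition.
// Rearranges [first, last) into  < pivot | == pivot | > pivot  and returns
// the bounds of the middle block of elements equal to the pivot.
template <typename Iterator>
std::pair<Iterator, Iterator> partition3(Iterator first, Iterator last)
{
    assert(first != last);  // precondition: non-empty range

    // Copy the pivot value so that swaps cannot change it mid-partition.
    auto pivot = *(first + (last - first) / 2);

    Iterator lt = first;  // [first, lt) holds elements <  pivot
    Iterator it = first;  // [lt, it)    holds elements == pivot
    Iterator gt = last;   // [gt, last)  holds elements >  pivot

    while (it != gt)
    {
        if (*it < pivot)      std::iter_swap(lt++, it++);
        else if (pivot < *it) std::iter_swap(it, --gt);
        else                  ++it;
    }
    return {lt, gt};
}

// Hypothetical driver: elements equal to the pivot are never revisited,
// so long runs of duplicates shrink the remaining work instead of growing it.
template <typename Iterator>
void quicksort3(Iterator first, Iterator last)
{
    while (last - first > 1)
    {
        auto bounds = partition3(first, last);

        // Recurse into the smaller side and loop on the larger one
        // to keep the recursion depth logarithmic.
        if (bounds.first - first < last - bounds.second)
        {
            quicksort3(first, bounds.first);
            first = bounds.second;
        }
        else
        {
            quicksort3(bounds.second, last);
            last = bounds.first;
        }
    }
}
```

Calling quicksort3(data.begin(), data.end()) on a vector containing only zeros finishes after a single partitioning pass, because the whole range lands in the equal block. The two-way partition in the question instead peels off one element per pass on that input.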

Source: a similar question on Stack Overflow about "c++ - Hand-coded quicksort is slower on smaller integers": https://stackoverflow.com/questions/23501734/
