c++ - 分支预测 : Writing Code to Understand it; Getting Weird Results

coder 2023-06-01 原文

我试图通过测量运行带有可预测分支的循环与带有随机分支的循环的时间来更好地理解分支预测。

所以我编写了一个程序，它采用以不同顺序排列的 0 和 1 的大数组(即全 0、重复 0-1、全 rand)，并根据当前索引是 0 还是 1 遍历数组分支，做浪费时间的工作。

我预计难以猜测的数组会花费更长的时间来运行，因为分支预测器会更频繁地猜错，并且无论数量多少，两组数组上运行之间的时间增量都将保持不变浪费时间的工作。

但是，随着浪费时间的工作量增加，阵列之间的运行时间差异也会增加很多。

(X 轴是浪费时间的工作量，Y 轴是运行时间)

有人理解这种行为吗？您可以在以下代码中看到我正在运行的代码:

#include <stdlib.h>
#include <time.h>
#include <chrono>
#include <stdio.h>
#include <iostream>
#include <vector>
using namespace std;
static const int s_iArrayLen = 999999;
static const int s_iMaxPipelineLen = 60;
static const int s_iNumTrials = 10;

int doWorkAndReturnMicrosecondsElapsed(int* vals, int pipelineLen){
        int* zeroNums = new int[pipelineLen];
        int* oneNums = new int[pipelineLen];
        for(int i = 0; i < pipelineLen; ++i)
                zeroNums[i] = oneNums[i] = 0;

        chrono::time_point<chrono::system_clock> start, end;
        start = chrono::system_clock::now();
        for(int i = 0; i < s_iArrayLen; ++i){
                if(vals[i] == 0){
                        for(int i = 0; i < pipelineLen; ++i)
                                ++zeroNums[i];
                }
                else{
                        for(int i = 0; i < pipelineLen; ++i)
                                ++oneNums[i];
                }
        }
        end = chrono::system_clock::now();
        int elapsedMicroseconds = (int)chrono::duration_cast<chrono::microseconds>(end-start).count();

        //This should never fire, it just exists to guarantee the compiler doesn't compile out our zeroNums/oneNums
        for(int i = 0; i < pipelineLen - 1; ++i)
                if(zeroNums[i] != zeroNums[i+1] || oneNums[i] != oneNums[i+1])
                        return -1;
        delete[] zeroNums;
        delete[] oneNums;
        return elapsedMicroseconds;
}

struct TestMethod{
        string name;
        void (*func)(int, int&);
        int* results;

        TestMethod(string _name, void (*_func)(int, int&)) { name = _name; func = _func; results = new int[s_iMaxPipelineLen]; }
};

int main(){
        srand( (unsigned int)time(nullptr) );

        vector<TestMethod> testMethods;
        testMethods.push_back(TestMethod("all-zero", [](int index, int& out) { out = 0; } ));
        testMethods.push_back(TestMethod("repeat-0-1", [](int index, int& out) { out = index % 2; } ));
        testMethods.push_back(TestMethod("repeat-0-0-0-1", [](int index, int& out) { out = (index % 4 == 0) ? 0 : 1; } ));
        testMethods.push_back(TestMethod("rand", [](int index, int& out) { out = rand() % 2; } ));

        int* vals = new int[s_iArrayLen];

        for(int currentPipelineLen = 0; currentPipelineLen < s_iMaxPipelineLen; ++currentPipelineLen){
                for(int currentMethod = 0; currentMethod < (int)testMethods.size(); ++currentMethod){
                        int resultsSum = 0;
                        for(int trialNum = 0; trialNum < s_iNumTrials; ++trialNum){
                                //Generate a new array...
                                for(int i = 0; i < s_iArrayLen; ++i)  
                                        testMethods[currentMethod].func(i, vals[i]);

                                //And record how long it takes
                                resultsSum += doWorkAndReturnMicrosecondsElapsed(vals, currentPipelineLen);
                        }

                        testMethods[currentMethod].results[currentPipelineLen] = (resultsSum / s_iNumTrials);
                }
        }

        cout << "\t";
        for(int i = 0; i < s_iMaxPipelineLen; ++i){
                cout << i << "\t";
        }
        cout << "\n";
        for (int i = 0; i < (int)testMethods.size(); ++i){
                cout << testMethods[i].name.c_str() << "\t";
                for(int j = 0; j < s_iMaxPipelineLen; ++j){
                        cout << testMethods[i].results[j] << "\t";
                }
                cout << "\n";
        }
        int end;
        cin >> end;
        delete[] vals;
}

粘贴箱链接:http://pastebin.com/F0JAu3uw

最佳答案

我认为您可能测量的是缓存/内存性能，而不是分支预测。您的内部“工作”循环正在访问越来越多的内存。这可以解释线性增长、周期性行为等。

我可能是错的，因为我没有尝试复制你的结果，但如果我是你，我会在计时其他事情之前考虑内存访问。也许将一个 volatile 变量与另一个变量相加，而不是在数组中工作。

另请注意，根据 CPU 的不同，分支预测可能比仅记录上次执行分支的时间要智能得多 - 例如，重复模式并不像随机数据那么糟糕。

好的，我在茶歇时进行了一个快速而肮脏的测试，它试图反射(reflect)您自己的测试方法，但不会破坏缓存，如下所示:

这是你所期望的吗？

如果我以后有空的话，我还想尝试其他的东西，因为我还没有真正了解编译器在做什么......

编辑:

而且，这是我的最终测试 - 我在汇编程序中重新编码以删除循环分支，确保每个路径中的指令数量准确，等等。

我还添加了一个 5 位重复模式的额外案例。似乎很难扰乱我老化的 Xeon 上的分支预测器。

关于c++ - 分支预测 : Writing Code to Understand it; Getting Weird Results，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/14151733/

Understand amp int lt testMethods c++branch-prediction

有关c++ - 分支预测 : Writing Code to Understand it; Getting Weird Results的更多相关文章

ruby-on-rails - 如何优雅地重启 thin + nginx？ - 2
我的瘦服务器配置了nginx，我的ROR应用程序正在它们上运行。在我发布代码更新时运行thinrestart会给我的应用程序带来一些停机时间。我试图弄清楚如何优雅地重启正在运行的Thin实例，但找不到好的解决方案。有没有人能做到这一点？最佳答案 #Restartjustthethinserverdescribedbythatconfigsudothin-C/etc/thin/mysite.ymlrestartNginx将继续运行并代理请求。如果您将Nginx设置为使用多个上游服务器，例如server{listen80;server
ruby - 使用 `+=` 和 `send` 方法 - 2
如何将send与+=一起使用？a=20;a.send"+=",10undefinedmethod`+='for20:Fixnuma=20;a+=10=>30 最佳答案恐怕你不能。+=不是方法，而是语法糖。参见http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_expressions.html它说Incommonwithmanyotherlanguages,Rubyhasasyntacticshortcut:a=a+2maybewrittenasa+=2.你能做的最好的事情是:
ruby - 如何计算 Liquid 中的变量 +1 - 2
我对如何计算通过{%assignvar=0%}赋值的变量加一完全感到困惑。这应该是最简单的任务。到目前为止，这是我尝试过的:{%assignamount=0%}{%forvariantinproduct.variants%}{%assignamount=amount+1%}{%endfor%}Amount:{{amount}}结果总是0。也许我忽略了一些明显的东西。也许有更好的方法。我想要存档的只是获取运行的迭代次数。最佳答案因为{{incrementamount}}将输出您的变量值并且不会影响{%assign%}定义的变量，我
arrays - Ruby 数组 += vs 推送 - 2
我有一个数组数组，想将元素附加到子数组。+=做我想做的，但我想了解为什么push不做。我期望的行为(并与+=一起工作):b=Array.new(3,[])b[0]+=["apple"]b[1]+=["orange"]b[2]+=["frog"]b=>[["苹果"],["橙子"],["Frog"]]通过推送，我将推送的元素附加到每个子数组(为什么？):a=Array.new(3,[])a[0].push("apple")a[1].push("orange")a[2].push("frog")a=>[[“苹果”、“橙子”、“Frog”]、[“苹果”、“橙子”、“Frog”]、[“苹果”、“
+= 的 Ruby 方法 - 2
有没有办法让Ruby能够做这样的事情？classPlane@moved=0@x=0defx+=(v)#thisiserror@x+=v@moved+=1enddefto_s"moved#{@moved}times,currentxis#{@x}"endendplane=Plane.newplane.x+=5plane.x+=10putsplane.to_s#moved2times,currentxis15 最佳答案您不能在Ruby中覆盖复合赋值运算符。任务在内部处理。您应该覆盖+，而不是+=。plane.a+=b与plane.a=
ruby - Sinatra + Heroku + Datamapper 使用 dm-sqlite-adapter 部署问题 - 2
出于某种原因，heroku尝试要求dm-sqlite-adapter，即使它应该在这里使用Postgres。请注意，这发生在我打开任何URL时-而不是在gitpush本身期间。我构建了一个默认的Facebook应用程序。gem文件:source:gemcuttergem"foreman"gem"sinatra"gem"mogli"gem"json"gem"httparty"gem"thin"gem"data_mapper"gem"heroku"group:productiondogem"pg"gem"dm-postgres-adapter"endgroup:development,:t
ruby - Ruby 中字符串运算符 + 和 << 的区别 - 2
我是Ruby和这个网站的新手。下面两个函数是不同的，一个在函数外修改变量，一个不修改。defm1(x)x我想确保我理解正确-当调用m1时，对str的引用被复制并传递给将其视为x的函数。运算符当调用m2时，对str的引用被复制并传递给将其视为x的函数。运算符+创建一个新字符串，赋值x=x+"4"只是将x重定向到新字符串，而原始str变量保持不变。对吧？谢谢最佳答案 String#+::str+other_str→new_strConcatenation—ReturnsanewStringcontainingother_strconc
ruby - 如何让 GitHub 页面使用 master 分支？ - 2
我有一个使用Jekyll托管在GitHub上的静态网站。问题是，我真的不需要master分支，因为存储库唯一包含的是网站。这样我就必须gitcheckoutgh-pages，然后gitmergemaster，然后gitpushorigingh-pages。有什么简单的方法可以摆脱gh-pages分支并直接从master推送？最佳答案 Theproblemis,Idon'treallyneedthemasterbranch,astheonlythingtherepositorycontainsisthewebsite.Isthere
ruby - rails 3.2.2(或 3.2.1)+ Postgresql 9.1.3 + Ubuntu 11.10 连接错误 - 2
我正在使用PostgreSQL9.1.3(x86_64-pc-linux-gnu上的PostgreSQL9.1.3，由gcc-4.6.real(Ubuntu/Linaro4.6.1-9ubuntu3)4.6.1，64位编译)和在ubuntu11.10上运行3.2.2或3.2.1。现在，我可以使用以下命令连接PostgreSQLsupostgres输入密码我可以看到postgres=#我将以下详细信息放在我的config/database.yml中并执行“railsdb”，它工作正常。开发:adapter:postgresqlencoding:utf8reconnect:falsedat
ruby - 在 Ruby + Chef 中检查现有目录失败 - 2
这是我在ChefRecipe中的一blockRuby:#ifdatadirdoesn'texist,moveoverthedefaultoneif!File.exist?("/vol/postgres/data")execute"mv/var/lib/postgresql/9.1/main/vol/postgres/data"end结果是:Executingmv/var/lib/postgresql/9.1/main/vol/postgres/datamv:inter-devicemovefailed:`/var/lib/postgresql/9.1/main'to`/vol/post

c++ - 分支预测 : Writing Code to Understand it; Getting Weird Results

有关c++ - 分支预测 : Writing Code to Understand it; Getting Weird Results的更多相关文章

随机推荐