c++ - std::vector 差异

coder 2023-06-02 原文

如何确定 2 个 vector 的差异是什么？

我有 vector<int> v1和 vector<int> v2 ;

我正在寻找的是 vector<int> vDifferences仅包含仅在 v1 中的元素或 v2 .

有标准的方法吗？

最佳答案

这是完整且正确的答案。在可以使用 set_symmetric_difference 算法之前，源范围必须排序:

  using namespace std; // For brevity, don't do this in your own code...

  vector<int> v1;
  vector<int> v2;

  // ... Populate v1 and v2

  // For the set_symmetric_difference algorithm to work, 
  // the source ranges must be ordered!    
  vector<int> sortedV1(v1);
  vector<int> sortedV2(v2);

  sort(sortedV1.begin(),sortedV1.end());
  sort(sortedV2.begin(),sortedV2.end());

  // Now that we have sorted ranges (i.e., containers), find the differences    
  vector<int> vDifferences;

  set_symmetric_difference(
    sortedV1.begin(),
    sortedV1.end(),
    sortedV2.begin(),
    sortedV2.end(),
    back_inserter(vDifferences));

  // ... do something with the differences

应该注意，排序是一项昂贵的操作(即 O(n log n) for common STL implementations )。特别是对于其中一个或两个容器非常大(即数百万个整数或更多)的情况，基于算法复杂性，使用哈希表的不同算法可能更可取。这是该算法的高级描述:

Load each container into a hash table.

If the two containers differ in size, the hash table corresponding to the smaller one will be used for traversal in Step 3. Otherwise, the first of the two hash tables will be used.

Traverse the hash table chosen in Step 2, checking to see if each item is present in both hash tables. If it is, remove it from both of them. The reason that the smaller hash table is preferred for traversal is because hash table lookups are on the average O(1) regardless of container size. Therefore, the time to traverse is a linear function of n (i.e., O(n)), where n is the size of the hash table being traversed.

Take the union of the remaining items in the hash tables and store the result in a difference container.

C++11 通过标准化 unordered_multiset 容器为我们提供了这种解决方案的一些功能。我还使用了 auto 关键字的新用法进行显式初始化，以使以下基于哈希表的解决方案更加简洁:

using namespace std; // For brevity, don't do this in your own code...

// The remove_common_items function template removes some and / or all of the
// items that appear in both of the multisets that are passed to it. It uses the
// items in the first multiset as the criteria for the multi-presence test.
template <typename tVal>
void remove_common_items(unordered_multiset<tVal> &ms1, 
                         unordered_multiset<tVal> &ms2)
{
  // Go through the first hash table
  for (auto cims1=ms1.cbegin();cims1!=ms1.cend();)
  {
    // Find the current item in the second hash table
    auto cims2=ms2.find(*cims1);

    // Is it present?
    if (cims2!=ms2.end())
    {
      // If so, remove it from both hash tables
      cims1=ms1.erase(cims1);
      ms2.erase(cims2);
    }
    else // If not
      ++cims1; // Move on to the next item
  }
}

int main()
{
  vector<int> v1;
  vector<int> v2;

  // ... Populate v1 and v2

  // Create two hash tables that contain the values
  // from their respective initial containers    
  unordered_multiset<int> ms1(v1.begin(),v1.end());
  unordered_multiset<int> ms2(v2.begin(),v2.end());

  // Remove common items from both containers based on the smallest
  if (v1.size()<=v2.size)
    remove_common_items(ms1,ms2);
  else
    remove_common_items(ms2,ms1);

  // Create a vector of the union of the remaining items
  vector<int> vDifferences(ms1.begin(),ms1.end());

  vDifferences.insert(vDifferences.end(),ms2.begin(),ms2.end());

  // ... do something with the differences
}

为了确定哪种解决方案更适合特定情况，分析这两种算法将是明智之举。尽管基于哈希表的解决方案在 O(n) 中，但它需要更多代码，并且每个找到的重复项(即哈希表删除)都需要做更多的工作。它还(可悲地)使用自定义差分函数而不是标准 STL 算法。

应该注意的是，两种解决方案都以与元素在原始容器中出现的顺序很可能完全不同的顺序呈现差异。通过使用哈希表解决方案的变体可以解决此问题。以下是高级描述(仅在第 4 步中与前面的解决方案不同):

Load each container into a hash table.

If the two containers differ in size, the smaller hash table will be used for traversal in Step 3. Otherwise, the first of the two will be used.

Traverse the hash table chosen in Step 2, checking to see if each item is present in both hash tables. If it is, remove it from both of them.

To form the difference container, traverse the original containers in order (i.e., the first container before the second). Look up each item from each container in its respective hash table. If it is found, the item is to be added to the difference container and removed from its hash table. Items not present in the respective hash tables will be skipped. Thus, only the items that are present in the hash tables will wind up in the difference container and their order of appearance will remain the same as it was in the original containers, because those containers dictate the order of the final traversal.

为了保持原始顺序，第 4 步变得比之前的解决方案更昂贵，尤其是在移除的商品数量较多的情况下。这是因为:

将通过在各自哈希表中的存在性测试对所有项目进行第二次测试，以确定是否有资格出现在差异容器中。
当差异容器形成时，哈希表将删除其其余项，作为第 1 项的差异测试的一部分。

关于c++ - std::vector 差异，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/7771796/

有关c++ - std::vector 差异的更多相关文章

ruby - 将差异补丁应用于字符串/文件 - 2
对于具有离线功能的智能手机应用程序，我正在为Xml文件创建单向文本同步。我希望我的服务器将增量/差异(例如GNU差异补丁)发送到目标设备。这是计划:Time=0Server:hasversion_1ofXmlfile(~800kiB)Client:hasversion_1ofXmlfile(~800kiB)Time=1Server:hasversion_1andversion_2ofXmlfile(each~800kiB)computesdeltaoftheseversions(=patch)(~10kiB)sendspatchtoClient(~10kiBtransferred)Cl
ruby-on-rails - 如何优雅地重启 thin + nginx？ - 2
我的瘦服务器配置了nginx，我的ROR应用程序正在它们上运行。在我发布代码更新时运行thinrestart会给我的应用程序带来一些停机时间。我试图弄清楚如何优雅地重启正在运行的Thin实例，但找不到好的解决方案。有没有人能做到这一点？最佳答案 #Restartjustthethinserverdescribedbythatconfigsudothin-C/etc/thin/mysite.ymlrestartNginx将继续运行并代理请求。如果您将Nginx设置为使用多个上游服务器，例如server{listen80;server
ruby - 使用 `+=` 和 `send` 方法 - 2
如何将send与+=一起使用？a=20;a.send"+=",10undefinedmethod`+='for20:Fixnuma=20;a+=10=>30 最佳答案恐怕你不能。+=不是方法，而是语法糖。参见http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_expressions.html它说Incommonwithmanyotherlanguages,Rubyhasasyntacticshortcut:a=a+2maybewrittenasa+=2.你能做的最好的事情是:
ruby - 如何计算 Liquid 中的变量 +1 - 2
我对如何计算通过{%assignvar=0%}赋值的变量加一完全感到困惑。这应该是最简单的任务。到目前为止，这是我尝试过的:{%assignamount=0%}{%forvariantinproduct.variants%}{%assignamount=amount+1%}{%endfor%}Amount:{{amount}}结果总是0。也许我忽略了一些明显的东西。也许有更好的方法。我想要存档的只是获取运行的迭代次数。最佳答案因为{{incrementamount}}将输出您的变量值并且不会影响{%assign%}定义的变量，我
arrays - Ruby 数组 += vs 推送 - 2
我有一个数组数组，想将元素附加到子数组。+=做我想做的，但我想了解为什么push不做。我期望的行为(并与+=一起工作):b=Array.new(3,[])b[0]+=["apple"]b[1]+=["orange"]b[2]+=["frog"]b=>[["苹果"],["橙子"],["Frog"]]通过推送，我将推送的元素附加到每个子数组(为什么？):a=Array.new(3,[])a[0].push("apple")a[1].push("orange")a[2].push("frog")a=>[[“苹果”、“橙子”、“Frog”]、[“苹果”、“橙子”、“Frog”]、[“苹果”、“
+= 的 Ruby 方法 - 2
有没有办法让Ruby能够做这样的事情？classPlane@moved=0@x=0defx+=(v)#thisiserror@x+=v@moved+=1enddefto_s"moved#{@moved}times,currentxis#{@x}"endendplane=Plane.newplane.x+=5plane.x+=10putsplane.to_s#moved2times,currentxis15 最佳答案您不能在Ruby中覆盖复合赋值运算符。任务在内部处理。您应该覆盖+，而不是+=。plane.a+=b与plane.a=
ruby - Sinatra + Heroku + Datamapper 使用 dm-sqlite-adapter 部署问题 - 2
出于某种原因，heroku尝试要求dm-sqlite-adapter，即使它应该在这里使用Postgres。请注意，这发生在我打开任何URL时-而不是在gitpush本身期间。我构建了一个默认的Facebook应用程序。gem文件:source:gemcuttergem"foreman"gem"sinatra"gem"mogli"gem"json"gem"httparty"gem"thin"gem"data_mapper"gem"heroku"group:productiondogem"pg"gem"dm-postgres-adapter"endgroup:development,:t
ruby - :variable and @variable 之间的差异 - 2
作为RubyonRails新手，我明白“@”和“:”引用有不同的含义。我看到了thispost在SO中，其中描述了一些差异。@表示实例变量(例如@my_selection):表示别名(例如:my_selection)我遇到了一个情况，我有一个标准的MVC页面，类似于我的网络应用程序中的所有其他表单/页面。html.erb片段route.rb片段resources:my_selections当我尝试访问此页面时，出现此错误:NoMethodErrorinselections#createShowingC:/somedir/myapp/app/views/my_selections/ind
ruby - Ruby 中字符串运算符 + 和 << 的区别 - 2
我是Ruby和这个网站的新手。下面两个函数是不同的，一个在函数外修改变量，一个不修改。defm1(x)x我想确保我理解正确-当调用m1时，对str的引用被复制并传递给将其视为x的函数。运算符当调用m2时，对str的引用被复制并传递给将其视为x的函数。运算符+创建一个新字符串，赋值x=x+"4"只是将x重定向到新字符串，而原始str变量保持不变。对吧？谢谢最佳答案 String#+::str+other_str→new_strConcatenation—ReturnsanewStringcontainingother_strconc
ruby - rails 3.2.2(或 3.2.1)+ Postgresql 9.1.3 + Ubuntu 11.10 连接错误 - 2
我正在使用PostgreSQL9.1.3(x86_64-pc-linux-gnu上的PostgreSQL9.1.3，由gcc-4.6.real(Ubuntu/Linaro4.6.1-9ubuntu3)4.6.1，64位编译)和在ubuntu11.10上运行3.2.2或3.2.1。现在，我可以使用以下命令连接PostgreSQLsupostgres输入密码我可以看到postgres=#我将以下详细信息放在我的config/database.yml中并执行“railsdb”，它工作正常。开发:adapter:postgresqlencoding:utf8reconnect:falsedat

c++ - std::vector 差异

有关c++ - std::vector 差异的更多相关文章

随机推荐