MongoDB aggregation query equivalent to a PostgreSQL query

coder 2023-11-03 Original post

This question has two parts. The collection structure is:

_id: MongoID,
agent_id: string,
result: string,
created_on: ISODate,
...other fields...

Part 1:
Desired output: one row per combination of agent_id and result, with a count for each combination. Shown here as tuples, followed by the equivalent PostgreSQL SQL:

( "1234", "Success", 4 ),
( "1234", "Failure", 4 ),
( "4567", "Success", 3 ),
( "7896", "Failure", 2 ),
.....

SELECT agent_id, result, count(*)
FROM table
WHERE created_on >= now()::date
GROUP BY agent_id, result;

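I came up with the mongo query below... I think I have a conceptual or syntax error. The docs say to use $match early in the pipeline, but although $match limits the query when I run it on its own, as soon as I add $group I get a lot of results. I also can't figure out how to group by more than one field. How can I edit the query below to get results like the SQL query above?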

db.collection.aggregate(
  { $match : 
    { created_on: 
        { $gte: new Date('08-13-2012') //some arbitrary date
    }
  }, $group:
    { _id:"$agent_id" }, 
   $project:
  {_id:0, agent_id:1, result:1}
})

Part 2) The first result set is adequate, but not optimal. With PostgreSQL, I can get a result set like this:

( "1234", { "Success", "Failure" }, { 4, 3 } ),
( "4567", { "Success", "Failure" }, { 3, 0 } ),
( "7896", { "Success", "Failure" }, { 0, 2 } )

I can do this in PostgreSQL using the array data type and a set_to_array function (a custom function). The Pg-specific SQL is:

SELECT agent_id, set_to_array(result), set_to_array( count(*) )
FROM table
WHERE created_on >= now()::date
GROUP BY agent_id, result;

I believe the equivalent data structure in mongodb looks like this:

[
   { "1234", [ { "success": 4 }, { "failure": 4 } ] },
   { "4567", [ { "success": 3 }, { "failure": 0 } ] },
   { "7896", [ { "success": 0 }, { "failure": 0 } ] }
]

Is it possible to achieve these condensed results with the mongodb aggregation framework?

Best answer

Here you go:

I created some test data:

db.test.insert({agent_id:"1234", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1234", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1234", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1234", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1234", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1234", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1234", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Success", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Failure", created_on:new Date()});
db.test.insert({agent_id:"1324", result:"Failure", created_on:new Date()});

db.test.aggregate(
  {
    $match:{ /* filter out the things you want to aggregate */
      created_on:{$gte:new Date(1000000)}
    }
  }, 
  {
    $group: {
      _id: { /* the things you want to group on go in the _id */
        agent_id:"$agent_id", 
        result:"$result"
      }, 
      count:{$sum:1} /* simple count */
    }
  }, 
  {
    $project: { /* take the id out into the separate fields for your tuple. */
      _id:0, 
      agent_id:"$_id.agent_id", 
      result:"$_id.result", 
      count:"$count"
    }
  });

This gives:

{
"result" : [
    {
        "count" : 7,
        "agent_id" : "1324",
        "result" : "Failure"
    },
    {
        "count" : 4,
        "agent_id" : "1324",
        "result" : "Success"
    },
    {
        "count" : 4,
        "agent_id" : "1234",
        "result" : "Success"
    },
    {
        "count" : 3,
        "agent_id" : "1234",
        "result" : "Failure"
    }
],
"ok" : 1
}
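
To see what the compound _id in the $group stage does, here is a minimal plain-JavaScript sketch that runs without MongoDB. It is only a conceptual stand-in: the helper names makeDocs and groupCount and the in-memory docs array are hypothetical, not part of the MongoDB API, and the $match date filter is left out for brevity.

```javascript
// Hypothetical helper: build n identical documents (created_on omitted for brevity).
const makeDocs = (agent_id, result, n) =>
  Array.from({ length: n }, () => ({ agent_id, result }));

// In-memory stand-in for the test collection, mirroring the inserts above.
const docs = [
  ...makeDocs("1234", "Failure", 3),
  ...makeDocs("1234", "Success", 4),
  ...makeDocs("1324", "Success", 4),
  ...makeDocs("1324", "Failure", 7),
];

// Conceptual equivalent of:
//   $group: { _id: { agent_id: "$agent_id", result: "$result" }, count: { $sum: 1 } }
// followed by the $project that flattens _id into top-level fields.
function groupCount(documents, keys) {
  const groups = new Map();
  for (const doc of documents) {
    // The "compound _id": one value per grouping field.
    const id = {};
    for (const key of keys) id[key] = doc[key];
    const mapKey = JSON.stringify(id);
    if (!groups.has(mapKey)) groups.set(mapKey, { ...id, count: 0 });
    groups.get(mapKey).count += 1; // same idea as { $sum: 1 }
  }
  return [...groups.values()];
}

const tuples = groupCount(docs, ["agent_id", "result"]);
console.log(tuples);
```

Each distinct (agent_id, result) pair becomes one group, which is why putting both fields inside _id gives the same result as GROUP BY agent_id, result in SQL.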

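Adding part 2: it is very similar to part 1, but the counting is a bit more complicated. Basically, you only count a document when it matches the value you want to count: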

db.test.aggregate(
  {
    $match: { 
      created_on: {$gte:new Date(1000000)}
    }
  }, 
  {
    $group: {
      _id: { 
        agent_id:"$agent_id"
      }, 
      failure: {
        $sum:{
          $cond:[
            {$eq:["$result","Failure"]}, 
            1, 
            0
          ]
        }
      }, 
      success: {
        $sum: { 
          $cond:[
            {$eq:["$result","Success"]}, 
            1, 
            0
          ]
        }
      } 
    } 
  }, 
  {
    $project: {
      _id: 0, 
      agent_id: "$_id.agent_id", 
      failure: "$failure", 
      success: "$success"
    }
  });

This gives:

{
"result" : [
    {
        "failure" : 7,
        "success" : 4,
        "agent_id" : "1324"
    },
    {
        "failure" : 3,
        "success" : 4,
        "agent_id" : "1234"
    }
],
"ok" : 1
}
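
The $cond expressions above add 1 or 0 per document depending on its result value. Here is a rough plain-JavaScript sketch of that idea, again without MongoDB. The names buildDocs, pivotCounts, sampleDocs and nested are hypothetical, and the final reshaping step is just one possible way to get closer to the nested structure asked for in the question.

```javascript
// Hypothetical helper: build n identical documents (created_on omitted for brevity).
const buildDocs = (agent_id, result, n) =>
  Array.from({ length: n }, () => ({ agent_id, result }));

// In-memory stand-in for the test collection used in the answer.
const sampleDocs = [
  ...buildDocs("1234", "Failure", 3),
  ...buildDocs("1234", "Success", 4),
  ...buildDocs("1324", "Success", 4),
  ...buildDocs("1324", "Failure", 7),
];

// Conceptual equivalent of grouping on agent_id with one conditional sum per outcome:
//   success: { $sum: { $cond: [ { $eq: ["$result", "Success"] }, 1, 0 ] } }
function pivotCounts(documents, outcomes) {
  const byAgent = new Map();
  for (const doc of documents) {
    if (!byAgent.has(doc.agent_id)) {
      const row = { agent_id: doc.agent_id };
      for (const outcome of outcomes) row[outcome.toLowerCase()] = 0;
      byAgent.set(doc.agent_id, row);
    }
    const row = byAgent.get(doc.agent_id);
    for (const outcome of outcomes) {
      // Add 1 when the document matches this outcome, otherwise 0.
      row[outcome.toLowerCase()] += doc.result === outcome ? 1 : 0;
    }
  }
  return [...byAgent.values()];
}

const pivoted = pivotCounts(sampleDocs, ["Success", "Failure"]);

// Optional client-side reshaping into a nested structure like the one in the question.
const nested = pivoted.map(({ agent_id, ...counts }) => ({
  agent_id,
  results: Object.entries(counts).map(([name, count]) => ({ [name]: count })),
}));

console.log(pivoted);
console.log(JSON.stringify(nested));
```

Because every agent starts with a 0 for each outcome, an agent with no failures would still get a failure count of 0, which matches the zero entries in the structure the question asks for.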

For the original discussion of this MongoDB aggregation query equivalent to a PostgreSQL query, see the question on Stack Overflow: https://stackoverflow.com/questions/12700836/
