performance - Why do nested MaybeT layers cause exponential allocation?

coder 2023-06-04 Original post

I have a program.

import Control.Monad
import Control.Monad.Identity
import Control.Monad.Trans.Maybe

import System.Environment

tryR :: Monad m => ([a] -> MaybeT m [a]) -> ([a] -> m [a])
tryR f x = do
  m <- runMaybeT (f x)
  case m of
    Just t -> return t
    Nothing -> return x

check :: MonadPlus m => Int -> m Int
check x = if x `mod` 2 == 0 then return (x `div` 2) else mzero

foo :: MonadPlus m => [Int] -> m [Int]
foo [] = return []
foo (x:xs) = liftM2 (:) (check x) (tryR foo xs)


runFoo :: [Int] -> [Int]
runFoo x = runIdentity $ tryR foo x

main :: IO ()
main = do
  [n_str] <- getArgs
  let n = read n_str :: Int
  print $ runFoo [2,4..n]

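The main interesting thing about this program is that it can have many nested layers of MaybeT. Here that serves absolutely no purpose, but it did in the original program where I ran into this problem.

Care to guess the time complexity of this program?

Okay, you cheated by reading the title of this question. Yes, it is exponential: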

[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 50
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
./ExpAlloc 50  8.10s user 0.06s system 99% cpu 8.169 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 52
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]
./ExpAlloc 52  16.10s user 0.12s system 99% cpu 16.227 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 54
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27]
./ExpAlloc 54  32.32s user 0.23s system 99% cpu 32.561 total

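Some further inspection shows that the cause is that it allocates an exponential amount of memory, which naturally takes an exponential amount of time: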

[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 40 +RTS -s
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
     939,634,520 bytes allocated in the heap
       5,382,816 bytes copied during GC
          75,808 bytes maximum residency (2 sample(s))
          66,592 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1796 colls,     0 par    0.008s   0.009s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.243s  (  0.246s elapsed)
  GC      time    0.008s  (  0.009s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.252s  (  0.256s elapsed)

  %GC     time       3.2%  (3.6% elapsed)

  Alloc rate    3,869,930,149 bytes per MUT second

  Productivity  96.8% of total user, 95.3% of total elapsed

./ExpAlloc 40 +RTS -s  0.25s user 0.00s system 98% cpu 0.260 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 42 +RTS -s
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
   1,879,159,424 bytes allocated in the heap
      10,767,048 bytes copied during GC
          95,504 bytes maximum residency (3 sample(s))
          71,152 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3593 colls,     0 par    0.016s   0.018s     0.0000s    0.0000s
  Gen  1         3 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.493s  (  0.498s elapsed)
  GC      time    0.016s  (  0.018s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.510s  (  0.517s elapsed)

  %GC     time       3.1%  (3.5% elapsed)

  Alloc rate    3,810,430,292 bytes per MUT second

  Productivity  96.8% of total user, 95.7% of total elapsed

./ExpAlloc 42 +RTS -s  0.51s user 0.01s system 99% cpu 0.521 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 44 +RTS -s
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22]
   3,758,208,408 bytes allocated in the heap
      21,499,312 bytes copied during GC
         102,056 bytes maximum residency (5 sample(s))
          73,784 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      7186 colls,     0 par    0.032s   0.037s     0.0000s    0.0009s
  Gen  1         5 colls,     0 par    0.000s   0.001s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.979s  (  0.987s elapsed)
  GC      time    0.033s  (  0.038s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.013s  (  1.024s elapsed)

  %GC     time       3.2%  (3.7% elapsed)

  Alloc rate    3,840,757,815 bytes per MUT second

  Productivity  96.7% of total user, 95.6% of total elapsed

./ExpAlloc 44 +RTS -s  1.01s user 0.01s system 99% cpu 1.029 total

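I cannot for the life of me figure out why it does this. I would appreciate any light people could shed on the situation.

Best answer

The transformers package (version 0.5.4.0 at the time of writing) implements the MonadTrans instance for MaybeT using liftM: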

lift :: Monad m => m a -> MaybeT m a
lift = MaybeT . liftM Just

where liftM is a combinator defined as

liftM :: Monad m => (a -> b) -> m a -> m b
liftM f m = m >>= \a -> return (f a)

Furthermore, return for MaybeT is defined as

return :: Monad m => a -> MaybeT m a
return = lift . return

Unfolding the definitions:

return :: Monad m => a -> MaybeT m a
return a = MaybeT (return a >>= \a -> return (Just a))

where the two inner returns are instantiated at type m.

One call to return @(MaybeT m) makes two calls to return @m, hence the exponential behavior you observe for towers of MaybeT.
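
The doubling can be written as a recurrence. Here T(k) is notation introduced for illustration (it does not appear in the answer): the number of base-monad return calls triggered by a single return at a tower of k MaybeT layers, assuming nothing is shared or optimized away.

\[
T(0) = 1, \qquad T(k+1) = 2\,T(k) \quad\Longrightarrow\quad T(k) = 2^{k}
\]

In the question's program each additional list element adds one MaybeT layer, which is consistent with the running time roughly doubling from 8.10s to 16.10s to 32.32s as n goes from 50 to 52 to 54.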

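This can be fixed by using fmap instead of liftM, but that is backwards-incompatible for versions where Functor is not a superclass of Monad.

Edit: other transformers do not have this problem because their return is not defined using lift, which suggests a better fix: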

return = MaybeT . return . Just
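
The difference between the two definitions of return can be checked with a small counting experiment. This is only a sketch under stated assumptions: Counter, SlowT, FastT, calls, slowCalls and fastCalls are hypothetical names introduced here, and SlowT and FastT are local stand-ins for MaybeT (so the example needs only base), not the transformers code itself. Counter is an instrumented monad whose pure increments a counter, which deliberately breaks the monad laws.

```haskell
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
import Control.Monad (ap, liftM)

-- Instrumented base monad: a state monad over an Int whose `pure` bumps the
-- counter. It deliberately breaks the monad laws and exists only to count.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

instance Functor Counter where
  fmap f (Counter g) = Counter $ \n -> case g n of (a, n') -> (f a, n')

instance Applicative Counter where
  pure a = Counter $ \n -> let n' = n + 1 in n' `seq` (a, n')
  (<*>) = ap

instance Monad Counter where
  Counter g >>= k = Counter $ \n -> case g n of (a, n') -> runCounter (k a) n'

-- Number of base-level `pure`/`return` calls made by a computation.
calls :: Counter a -> Int
calls c = snd (runCounter c 0)

-- Stand-in for MaybeT with the lift-based `return` quoted in the answer.
newtype SlowT m a = SlowT { runSlowT :: m (Maybe a) }

instance Functor m => Functor (SlowT m) where
  fmap f = SlowT . fmap (fmap f) . runSlowT

instance Monad m => Applicative (SlowT m) where
  pure = SlowT . liftM Just . pure   -- lift . return: two calls one level down
  (<*>) = ap

instance Monad m => Monad (SlowT m) where
  SlowT x >>= k = SlowT $ x >>= maybe (return Nothing) (runSlowT . k)

-- Stand-in for MaybeT with the direct `return` proposed as the fix.
newtype FastT m a = FastT { runFastT :: m (Maybe a) }

instance Functor m => Functor (FastT m) where
  fmap f = FastT . fmap (fmap f) . runFastT

instance Monad m => Applicative (FastT m) where
  pure = FastT . pure . Just         -- one call one level down
  (<*>) = ap

instance Monad m => Monad (FastT m) where
  FastT x >>= k = FastT $ x >>= maybe (return Nothing) (runFastT . k)

-- Count base-level calls for one `return ()` under k layers of SlowT.
-- Polymorphic recursion: each step wraps the monad in one more layer.
slowCalls :: forall m. Monad m => (forall a. m a -> Int) -> Int -> Int
slowCalls run 0 = run (return () :: m ())
slowCalls run k = slowCalls run' (k - 1)
  where
    run' :: forall a. SlowT m a -> Int
    run' = run . runSlowT

-- Same measurement for k layers of FastT.
fastCalls :: forall m. Monad m => (forall a. m a -> Int) -> Int -> Int
fastCalls run 0 = run (return () :: m ())
fastCalls run k = fastCalls run' (k - 1)
  where
    run' :: forall a. FastT m a -> Int
    run' = run . runFastT

main :: IO ()
main = mapM_ report [0 .. 10]
  where
    report k = putStrLn $
      show k ++ " layers: lift-based " ++ show (slowCalls calls k)
             ++ ", direct " ++ show (fastCalls calls k)
```

With these stand-ins the lift-based count doubles with every layer while the direct count stays at 1, which mirrors the argument above. Counting calls is a proxy only; it says nothing about what GHC's optimizer does with the real MaybeT.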

Here is a simpler test case:

{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
import Control.Monad.Trans.Maybe
import System.Environment

f :: forall m proxy. Monad m => proxy m -> Int -> ()
f _ 0 = (return () :: m ()) `seq` ()
f _ n = f (undefined :: proxy (MaybeT m)) (n - 1)

main = do
  n : _ <- getArgs
  f (undefined :: proxy []) (read n) `seq` return ()

Output

> for i in {18..21} ; time ./b $i
./b $i  0.35s user 0.04s system 99% cpu 0.390 total
./b $i  0.71s user 0.07s system 99% cpu 0.782 total
./b $i  1.38s user 0.18s system 99% cpu 1.565 total
./b $i  2.82s user 0.32s system 100% cpu 3.139 total

Regarding "performance - Why do nested MaybeT layers cause exponential allocation?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43149878/
