语境
我有一个应用程序,该应用程序从一个表中选择一个加权随机条目,对于这些条目,前缀总和(权重)是至关重要的部分。简化的表定义如下所示:
CREATE TABLE entries (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
weight DECIMAL(9, 3),
fenwick DECIMAL(9, 3)
) ENGINE=MEMORY;
`fenwick`将值存储在`weights`的Fenwick树表示形式内。@r和0之间生成一个随机数SUM(weight),并找到范围包括@r的条目,如下所示:MEMORY引擎和二进制搜索,应该可以让我在O(lg^2(n))时间中找到合适的条目,而不是朴素的查询中的O(n) time:SELECT a.id-1 FROM (SELECT *, (@x:=@x+weight) AS counter FROM entries
CROSS JOIN (SELECT @x:=0) a
HAVING counter>@r LIMIT 1) a;
WHERE子句中存在变量时,MySQL会线性地线性运行表。这是查询:SELECT
SUM(1) INTO @garbage
FROM entries
CROSS JOIN (
SELECT @sum:=0,
@n:=@entryid
) a
WHERE id=@n AND @n>0 AND (@n:=@n-(@n&(-@n))) AND (@sum:=@sum+entries.fenwick);
/*SELECT @sum*/
@entryid是我们正在计算其前缀总和的条目的ID。我确实创建了一个行之有效的查询(连同返回整数最左位的lft函数):SET @n:=lft(@entryid);
SET @sum:=0;
SELECT
SUM(1) INTO @garbage
FROM entries
WHERE id=@n
AND @n<=@entryid
AND (@n:=@n+lft(@entryid^@n))
AND (@sum:=@sum+entries.fenwick);
/*SELECT @sum*/
EXPLAIN查询也是如此:+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | entries | ALL | NULL | NULL | NULL | NULL | 752544 | Using where |
+------+-------------+---------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)
SHOW INDEXES FROM entries;
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| entries | 0 | PRIMARY | 1 | id | NULL | 752544 | NULL | NULL | | HASH | | |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
WHERE子句中的变量,以便优化程序可以在查询中使用。但是,我想不出没有id=@n的查询方法。我已经考虑过将要汇总的条目的键值放入表中并使用联接,但是我相信我会得到不良影响:要么表过多,要么通过根据@entryid进行评估来进行线性搜索。最佳答案
免责声明
芬威克树对我来说是新的,我只是在找到这篇文章的时候才发现它们的。
这里给出的结果是基于我的理解和一些研究,
但是我绝不是芬威克树专家,我可能错过了一些事情。
引用资料
fenwick树如何工作的解释
https://stackoverflow.com/a/15444954/1157540转载自
https://cs.stackexchange.com/a/10541/38148
https://cs.stackexchange.com/a/42816/38148
芬威克树的用法
https://en.wikipedia.org/wiki/Fenwick_tree
https://en.wikipedia.org/wiki/Prefix_sum
问题1,找到给定条目的权重
给出下表
CREATE TABLE `entries` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`weight` decimal(9,3) DEFAULT NULL,
`fenwick` decimal(9,3) NOT NULL DEFAULT '0.000',
PRIMARY KEY (`id`)
) ENGINE=INNODB;
@entryid的重量?WHERE ID = value)。ORDER BY)或范围(WHERE (value1 < ID) AND (ID < value2)))的查询都会遗漏要点,并且不会按预期顺序遍历树。SET @entryid := 60;
mysql> SELECT (@entryid & 0x0080) as b8,
-> (@entryid & 0x0040) as b7,
-> (@entryid & 0x0020) as b6,
-> (@entryid & 0x0010) as b5,
-> (@entryid & 0x0008) as b4,
-> (@entryid & 0x0004) as b3,
-> (@entryid & 0x0002) as b2,
-> (@entryid & 0x0001) as b1;
+------+------+------+------+------+------+------+------+
| b8 | b7 | b6 | b5 | b4 | b3 | b2 | b1 |
+------+------+------+------+------+------+------+------+
| 0 | 0 | 32 | 16 | 8 | 4 | 0 | 0 |
+------+------+------+------+------+------+------+------+
1 row in set (0.00 sec)
32 + 16 + 8 + 4 = 60
32 + 16 + 8 + 4 = 60
32 + 16 + 8 = 56
32 + 16 = 48
32
60转换为(32, 48, 56, 60)仅需要对ID值本身进行位数学运算:无需访问表或数据库,并且可以在发出查询的客户端中进行此计算。mysql> select sum(fenwick) from entries where id in (32, 48, 56, 60);
+--------------+
| sum(fenwick) |
+--------------+
| 32.434 |
+--------------+
1 row in set (0.00 sec)
mysql> select sum(weight) from entries where id <= @entryid;
+-------------+
| sum(weight) |
+-------------+
| 32.434 |
+-------------+
1 row in set (0.00 sec)
mysql> explain select sum(fenwick) from entries where id in (32, 48, 56, 60);
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | entries | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 4 | 100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
explain format=json select sum(fenwick) from entries where id in (32, 48, 56, 60);
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "5.61"
},
"table": {
"table_name": "entries",
"access_type": "range",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"rows_examined_per_scan": 4,
"rows_produced_per_join": 4,
"filtered": "100.00",
"cost_info": {
"read_cost": "4.81",
"eval_cost": "0.80",
"prefix_cost": "5.61",
"data_read_per_join": "64"
},
"used_columns": [
"id",
"fenwick"
],
"attached_condition": "(`test`.`entries`.`id` in (32,48,56,60))"
}
}
mysql> explain select sum(weight) from entries where id <= @entryid;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | entries | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 60 | 100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
explain format=json select sum(weight) from entries where id <= @entryid;
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "25.07"
},
"table": {
"table_name": "entries",
"access_type": "range",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"rows_examined_per_scan": 60,
"rows_produced_per_join": 60,
"filtered": "100.00",
"cost_info": {
"read_cost": "13.07",
"eval_cost": "12.00",
"prefix_cost": "25.07",
"data_read_per_join": "960"
},
"used_columns": [
"id",
"weight"
],
"attached_condition": "(`test`.`entries`.`id` <= (@`entryid`))"
}
}
SET @search_weight := 35.123;
SET @found_id := 0;
SET @max_id := (select id from entries order by id desc limit 1);
mysql> set @search_id := @found_id + 128;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+-----+---------+----------------+---------+
| id | fenwick | @search_weight | action |
+-----+---------+----------------+---------+
| 128 | 66.540 | 35.123 | discard |
+-----+---------+----------------+---------+
mysql> set @search_id := @found_id + 64;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+----+---------+----------------+--------+
| id | fenwick | @search_weight | action |
+----+---------+----------------+--------+
| 64 | 33.950 | 35.123 | keep |
+----+---------+----------------+--------+
set @found_id := @search_id, @search_weight := @search_weight - 33.950;
mysql> set @search_id := @found_id + 32;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+----+---------+----------------+---------+
| id | fenwick | @search_weight | action |
+----+---------+----------------+---------+
| 96 | 16.260 | 1.173 | discard |
+----+---------+----------------+---------+
mysql> set @search_id := @found_id + 16;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+----+---------+----------------+---------+
| id | fenwick | @search_weight | action |
+----+---------+----------------+---------+
| 80 | 7.394 | 1.173 | discard |
+----+---------+----------------+---------+
mysql> set @search_id := @found_id + 8;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+----+---------+----------------+---------+
| id | fenwick | @search_weight | action |
+----+---------+----------------+---------+
| 72 | 3.995 | 1.173 | discard |
+----+---------+----------------+---------+
mysql> set @search_id := @found_id + 4;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+----+---------+----------------+---------+
| id | fenwick | @search_weight | action |
+----+---------+----------------+---------+
| 68 | 1.915 | 1.173 | discard |
+----+---------+----------------+---------+
mysql> set @search_id := @found_id + 2;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+----+---------+----------------+--------+
| id | fenwick | @search_weight | action |
+----+---------+----------------+--------+
| 66 | 1.146 | 1.173 | keep |
+----+---------+----------------+--------+
set @found_id := @search_id, @search_weight := @search_weight - 1.146;
mysql> set @search_id := @found_id + 1;
mysql> select id, fenwick, @search_weight,
-> if (fenwick <= @search_weight, "keep", "discard") as action
-> from entries where id = @search_id;
+----+---------+----------------+--------+
| id | fenwick | @search_weight | action |
+----+---------+----------------+--------+
| 67 | 0.010 | 0.027 | keep |
+----+---------+----------------+--------+
set @found_id := @search_id, @search_weight := @search_weight - 0.010;
mysql> select @found_id, @search_weight;
+-----------+----------------+
| @found_id | @search_weight |
+-----------+----------------+
| 67 | 0.017 |
+-----------+----------------+
mysql> select sum(weight) from entries where id <= 67;
+-------------+
| sum(weight) |
+-------------+
| 35.106 |
+-------------+
mysql> select sum(weight) from entries where id <= 68;
+-------------+
| sum(weight) |
+-------------+
| 35.865 |
+-------------+
35.106 (fenwick[67]) <= 35.123 (search) <= 35.865 (fenwick[68])
SELECT fenwick from entries where id = ?;
INSERT INTO `entries` VALUES (1,0.480,0.480),(2,0.542,1.022),(3,0.269,0.269),(4,0.721,2.012),(5,0.798,0.798),(6,0.825,1.623),(7,0.731,0.731),(8,0.181,4.547),(9,0.711,0.711),(10,0.013,0.724),(11,0.930,0.930),(12,0.613,2.267),(13,0.276,0.276),(14,0.539,0.815),(15,0.867,0.867),(16,0.718,9.214),(17,0.991,0.991),(18,0.801,1.792),(19,0.033,0.033),(20,0.759,2.584),(21,0.698,0.698),(22,0.212,0.910),(23,0.965,0.965),(24,0.189,4.648),(25,0.049,0.049),(26,0.678,0.727),(27,0.245,0.245),(28,0.190,1.162),(29,0.214,0.214),(30,0.502,0.716),(31,0.868,0.868),(32,0.834,17.442),(33,0.566,0.566),(34,0.327,0.893),(35,0.939,0.939),(36,0.713,2.545),(37,0.747,0.747),(38,0.595,1.342),(39,0.733,0.733),(40,0.884,5.504),(41,0.218,0.218),(42,0.437,0.655),(43,0.532,0.532),(44,0.350,1.537),(45,0.154,0.154),(46,0.721,0.875),(47,0.140,0.140),(48,0.538,8.594),(49,0.271,0.271),(50,0.739,1.010),(51,0.884,0.884),(52,0.203,2.097),(53,0.361,0.361),(54,0.197,0.558),(55,0.903,0.903),(56,0.923,4.481),(57,0.906,0.906),(58,0.761,1.667),(59,0.089,0.089),(60,0.161,1.917),(61,0.537,0.537),(62,0.201,0.738),(63,0.397,0.397),(64,0.381,33.950),(65,0.715,0.715),(66,0.431,1.146),(67,0.010,0.010),(68,0.759,1.915),(69,0.763,0.763),(70,0.537,1.300),(71,0.399,0.399),(72,0.381,3.995),(73,0.709,0.709),(74,0.401,1.110),(75,0.880,0.880),(76,0.198,2.188),(77,0.348,0.348),(78,0.148,0.496),(79,0.693,0.693),(80,0.022,7.394),(81,0.031,0.031),(82,0.089,0.120),(83,0.353,0.353),(84,0.498,0.971),(85,0.428,0.428),(86,0.650,1.078),(87,0.963,0.963),(88,0.866,3.878),(89,0.442,0.442),(90,0.610,1.052),(91,0.725,0.725),(92,0.797,2.574),(93,0.808,0.808),(94,0.648,1.456),(95,0.817,0.817),(96,0.141,16.260),(97,0.256,0.256),(98,0.855,1.111),(99,0.508,0.508),(100,0.976,2.595),(101,0.353,0.353),(102,0.840,1.193),(103,0.139,0.139),(104,0.178,4.105),(105,0.469,0.469),(106,0.814,1.283),(107,0.664,0.664),(108,0.876,2.823),(109,0.390,0.390),(110,0.323,0.713),(111,0.442,0.442),(112,0.241,8.324),(113,0.881,0.881),(114,0.681,1.562),(115,0.760,0.760),(116,0.760,3.082),(117,0.518,0.518),(118,0.313,0.831),(119,0.008,0.008),(120,0.103,4.024),(121,0.488,0.488),(122,0.135,0.623),(123,0.207,0.207),(124,0.633,1.463),(125,0.542,0.542),(126,0.812,1.354),(127,0.433,0.433),(128,0.732,66.540),(129,0.358,0.358),(130,0.594,0.952),(131,0.897,0.897),(132,0.701,2.550),(133,0.815,0.815),(134,0.973,1.788),(135,0.419,0.419),(136,0.175,4.932),(137,0.620,0.620),(138,0.573,1.193),(139,0.004,0.004),(140,0.304,1.501),(141,0.508,0.508),(142,0.629,1.137),(143,0.618,0.618),(144,0.206,8.394),(145,0.175,0.175),(146,0.255,0.430),(147,0.750,0.750),(148,0.987,2.167),(149,0.683,0.683),(150,0.453,1.136),(151,0.219,0.219),(152,0.734,4.256),(153,0.016,0.016),(154,0.874,0.891),(155,0.325,0.325),(156,0.002,1.217);
关于MySQL:强制查询在WHERE子句中使用带有局部变量的索引,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/29475470/
我正在学习如何使用Nokogiri,根据这段代码我遇到了一些问题:require'rubygems'require'mechanize'post_agent=WWW::Mechanize.newpost_page=post_agent.get('http://www.vbulletin.org/forum/showthread.php?t=230708')puts"\nabsolutepathwithtbodygivesnil"putspost_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div
我有一个Ruby程序,它使用rubyzip压缩XML文件的目录树。gem。我的问题是文件开始变得很重,我想提高压缩级别,因为压缩时间不是问题。我在rubyzipdocumentation中找不到一种为创建的ZIP文件指定压缩级别的方法。有人知道如何更改此设置吗?是否有另一个允许指定压缩级别的Ruby库? 最佳答案 这是我通过查看rubyzip内部创建的代码。level=Zlib::BEST_COMPRESSIONZip::ZipOutputStream.open(zip_file)do|zip|Dir.glob("**/*")d
类classAprivatedeffooputs:fooendpublicdefbarputs:barendprivatedefzimputs:zimendprotecteddefdibputs:dibendendA的实例a=A.new测试a.foorescueputs:faila.barrescueputs:faila.zimrescueputs:faila.dibrescueputs:faila.gazrescueputs:fail测试输出failbarfailfailfail.发送测试[:foo,:bar,:zim,:dib,:gaz].each{|m|a.send(m)resc
很好奇,就使用rubyonrails自动化单元测试而言,你们正在做什么?您是否创建了一个脚本来在cron中运行rake作业并将结果邮寄给您?git中的预提交Hook?只是手动调用?我完全理解测试,但想知道在错误发生之前捕获错误的最佳实践是什么。让我们理所当然地认为测试本身是完美无缺的,并且可以正常工作。下一步是什么以确保他们在正确的时间将可能有害的结果传达给您? 最佳答案 不确定您到底想听什么,但是有几个级别的自动代码库控制:在处理某项功能时,您可以使用类似autotest的内容获得关于哪些有效,哪些无效的即时反馈。要确保您的提
假设我做了一个模块如下:m=Module.newdoclassCendend三个问题:除了对m的引用之外,还有什么方法可以访问C和m中的其他内容?我可以在创建匿名模块后为其命名吗(就像我输入“module...”一样)?如何在使用完匿名模块后将其删除,使其定义的常量不再存在? 最佳答案 三个答案:是的,使用ObjectSpace.此代码使c引用你的类(class)C不引用m:c=nilObjectSpace.each_object{|obj|c=objif(Class===objandobj.name=~/::C$/)}当然这取决于
我正在尝试使用ruby和Savon来使用网络服务。测试服务为http://www.webservicex.net/WS/WSDetails.aspx?WSID=9&CATID=2require'rubygems'require'savon'client=Savon::Client.new"http://www.webservicex.net/stockquote.asmx?WSDL"client.get_quotedo|soap|soap.body={:symbol=>"AAPL"}end返回SOAP异常。检查soap信封,在我看来soap请求没有正确的命名空间。任何人都可以建议我
关闭。这个问题是opinion-based.它目前不接受答案。想要改进这个问题?更新问题,以便editingthispost可以用事实和引用来回答它.关闭4年前。Improvethisquestion我想在固定时间创建一系列低音和高音调的哔哔声。例如:在150毫秒时发出高音调的蜂鸣声在151毫秒时发出低音调的蜂鸣声200毫秒时发出低音调的蜂鸣声250毫秒的高音调蜂鸣声有没有办法在Ruby或Python中做到这一点?我真的不在乎输出编码是什么(.wav、.mp3、.ogg等等),但我确实想创建一个输出文件。
我在我的项目目录中完成了compasscreate.和compassinitrails。几个问题:我已将我的.sass文件放在public/stylesheets中。这是放置它们的正确位置吗?当我运行compasswatch时,它不会自动编译这些.sass文件。我必须手动指定文件:compasswatchpublic/stylesheets/myfile.sass等。如何让它自动运行?文件ie.css、print.css和screen.css已放在stylesheets/compiled。如何在编译后不让它们重新出现的情况下删除它们?我自己编译的.sass文件编译成compiled/t
我正在用Ruby编写一个简单的程序来检查域列表是否被占用。基本上它循环遍历列表,并使用以下函数进行检查。require'rubygems'require'whois'defcheck_domain(domain)c=Whois::Client.newc.query("google.com").available?end程序不断出错(即使我在google.com中进行硬编码),并打印以下消息。鉴于该程序非常简单,我已经没有什么想法了-有什么建议吗?/Library/Ruby/Gems/1.8/gems/whois-2.0.2/lib/whois/server/adapters/base.
我想将html转换为纯文本。不过,我不想只删除标签,我想智能地保留尽可能多的格式。为插入换行符标签,检测段落并格式化它们等。输入非常简单,通常是格式良好的html(不是整个文档,只是一堆内容,通常没有anchor或图像)。我可以将几个正则表达式放在一起,让我达到80%,但我认为可能有一些现有的解决方案更智能。 最佳答案 首先,不要尝试为此使用正则表达式。很有可能你会想出一个脆弱/脆弱的解决方案,它会随着HTML的变化而崩溃,或者很难管理和维护。您可以使用Nokogiri快速解析HTML并提取文本:require'nokogiri'h