草庐IT

php - MySQL Match ...反对查询非常慢

coder 2023-10-03 原文

我在一个运行在 Propel ORM 上的网站上工作,我有这个查询:

if(isset($args["QueryText"]) && $args["QueryText"] != "") {
  $query = $query
    ->withColumn("(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE))", "RequestRelevance")
    ->condition('cond1', "(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE)) > 0.2")
    ->condition('cond2', 'Request.Id = ?', $args["QueryText"])
    ->where(array('cond1', 'cond2'), 'or')
    ->orderBy("RequestRelevance", Criteria::DESC);
}

在 SQL 中转换为以下内容:

SELECT DISTINCT 
(MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
    MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) 
  AS RequestRelevance, requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID)
WHERE ((MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
  MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) > 0.2 OR requests.requestID = '46104') 
ORDER BY requests.created ASC,RequestRelevance DESC

使用 Propel 加载网站需要 20 秒,运行 SQL 查询需要 7.020 秒。

我尝试了以下方法:

SELECT DISTINCT 
  requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID) 
WHERE (requests.subject LIKE '%46104%' OR requests.detail LIKE '%46104%' OR  Response.response LIKE '%46104%' OR requests.requestID = '46104') 
ORDER BY requests.created

执行需要 3.308 秒。从中删除 OR Response.response LIKE '%46104%' 可将时间缩短至 0.140 秒。 responses 表包含 288317 行,responses.responses 列是一个带有 FULLTEXT 索引的 TEXT() 列。

减少此搜索的执行时间的最佳方法是什么?我试过使用这个 https://dba.stackexchange.com/questions/15214/why-is-like-more-than-4x-faster-than-match-against-on-a-fulltext-index-in-mysq但是当我执行时回答:

SELECT responseID FROM
(
 SELECT * FROM responses
 WHERE requestID = 15000
 AND responseID != 84056
) A
WHERE MATCH(response) AGAINST('twisted');

我收到这个错误:

错误代码:1191。找不到与列列表匹配的 FULLTEXT 索引

帮助将不胜感激!

编辑 1:

试过@Richard EB 的查询:

ALTER TABLE responses ADD FULLTEXT(response)
288317 row(s) affected Records: 288317  Duplicates: 0  Warnings: 0  78.967 sec

但是:

SELECT responseID FROM (     SELECT * FROM responses     WHERE requestID = 15000     AND responseID != 84056 ) A WHERE MATCH(response) AGAINST('twisted') LIMIT 0, 2000 
Error Code: 1191. Can't find FULLTEXT index matching the column list    0.000 sec

删除 DISTINCT 将执行时间减少到 0.952 秒,但它不会检索我需要的结果。

编辑 2:

执行此查询:

SELECT DISTINCT 
responses.requestID AS "Id" FROM responses WHERE MATCH(response) AGAINST('twisted')

需要 0.062 秒。

执行这个:

SELECT DISTINCT responses.requestID
  FROM responses 
WHERE (
  MATCH (Responses.response) AGAINST ('twisted' IN BOOLEAN MODE)
) 
ORDER BY responses.requestID ASC

只需要 0.046 秒。但是,从请求中选择并加入响应会减慢查询速度。我不确定这是否意味着必须完全重写整个查询以从响应中选择并改为加入请求?

编辑 3:

这是我在 RequestsResponses 表上的索引:

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    responses, 0, PRIMARY, 1, responseID, A, 288317, , , , BTREE, , 
    responses, 1, requestID, 1, requestID, A, 48052, , , YES, BTREE, , 
    responses, 1, response, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_2, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_3, 1, response, , 1, , , YES, FULLTEXT, , 

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    requests, 0, PRIMARY, 1, requestID, A, 46205, , , , BTREE, , 
    requests, 1, supportstatusID, 1, supportstatusID, A, 14, , , YES, BTREE, , 
    requests, 1, internaluserID, 1, internaluserID, A, 344, , , YES, BTREE, , 
    requests, 1, customergroupID, 1, customergroupID, A, 198, , , , BTREE, , 
    requests, 1, userID, 1, userID, A, 1848, , , YES, BTREE, , 
    requests, 1, siteID, 1, siteID, A, 381, , , YES, BTREE, , 
    requests, 1, request, 1, subject, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 2, detail, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 3, ponumber, , 1, , , YES, FULLTEXT, , 

最佳答案

LIKE 搜索将遍历每条记录并执行非精确字符串比较,这就是它如此缓慢的原因。

您粘贴的 mysql 错误表明 MATCH 子句中引用的列没有全文索引(或者有,但与 MATCH 子句中引用的不一样)。

假设您正在使用 MyISAM 或拥有 MySQL 5.6 和 InnoDB,请尝试:

ALTER TABLE responses ADD FULLTEXT(response);

编辑:回答(在评论中): "如果 MATCH 语句中提到了两列,那么这两列应该有一个全文索引,而不是每一列都有两个全文索引"

关于php - MySQL Match ...反对查询非常慢,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/33264062/

有关php - MySQL Match ...反对查询非常慢的更多相关文章

  1. ruby - ECONNRESET (Whois::ConnectionError) - 尝试在 Ruby 中查询 Whois 时出错 - 2

    我正在用Ruby编写一个简单的程序来检查域列表是否被占用。基本上它循环遍历列表,并使用以下函数进行检查。require'rubygems'require'whois'defcheck_domain(domain)c=Whois::Client.newc.query("google.com").available?end程序不断出错(即使我在google.com中进行硬编码),并打印以下消息。鉴于该程序非常简单,我已经没有什么想法了-有什么建议吗?/Library/Ruby/Gems/1.8/gems/whois-2.0.2/lib/whois/server/adapters/base.

  2. ruby-on-rails - 在 Rails 和 ActiveRecord 中查询时忽略某些字段 - 2

    我知道我可以指定某些字段来使用pluck查询数据库。ids=Item.where('due_at但是我想知道,是否有一种方法可以指定我想避免从数据库查询的某些字段。某种反拔?posts=Post.where(published:true).do_not_lookup(:enormous_field) 最佳答案 Model#attribute_names应该返回列/属性数组。您可以排除其中一些并传递给pluck或select方法。像这样:posts=Post.where(published:true).select(Post.attr

  3. sql - 查询忽略时间戳日期的时间范围 - 2

    我正在尝试查询我的Rails数据库(Postgres)中的购买表,我想查询时间范围。例如,我想知道在所有日期的下午2点到3点之间进行了多少次购买。此表中有一个created_at列,但我不知道如何在不搜索特定日期的情况下完成此操作。我试过:Purchases.where("created_atBETWEEN?and?",Time.now-1.hour,Time.now)但这最终只会搜索今天与那些时间的日期。 最佳答案 您需要使用PostgreSQL'sdate_part/extractfunction从created_at中提取小时

  4. ruby-on-rails - 使用 HTTParty 的非常基本的 Rails 4.1 API 调用 - 2

    Rails相对较新。我正在尝试调用一个API,它应该向我返回一个唯一的URL。我的应用程序中捆绑了HTTParty。我已经创建了一个UniqueNumberController,并且我已经阅读了几个HTTParty指南,直到我想要什么,但也许我只是有点迷路,真的不知道该怎么做。基本上,我需要做的就是调用API,获取它返回的URL,然后将该URL插入到用户的数据库中。谁能给我指出正确的方向或与我分享一些代码? 最佳答案 假设API为JSON格式并返回如下数据:{"url":"http://example.com/unique-url"

  5. ruby-on-rails - solr 清理查询 - 2

    我在Rails上使用带有ruby​​的solr。一切正常,我只需要知道是否有任何现有代码来清理用户输入,比如以?开头的查询。或* 最佳答案 我不知道执行此操作的任何代码,但理论上可以通过查看parsingcodeinLucene来完成并搜索thrownewParseException(只有16个匹配!)。在实践中,我认为您最好只捕获代码中的任何solr异常并显示“无效查询”消息或类似信息。编辑:这里有几个“sanitizer”:http://pivotallabs.com/users/zach/blog/articles/937-s

  6. ruby-on-rails - Rails 3 在一个查询中包含多个表 - 2

    我正在为锦标赛开发一个Rails应用程序。我在这个查询中使用了三个模型:classPlayertruehas_and_belongs_to_many:tournamentsclassTournament:destroyclassPlayerMatch"Player",:foreign_key=>"player_one"belongs_to:player_two,:class_name=>"Player",:foreign_key=>"player_two"在tournaments_controller的显示操作中,我调用以下查询:Tournament.where(:id=>params

  7. ruby-on-rails - Sunspot:如何对具有不同值的多个字段进行全文查询? - 2

    我想用sunspot重现以下原始solr查询q=exact_term_text:fooORterm_textv:foo*ORalternate_text:bar*但我无法通过标准的太阳黑子界面理解这是否可能以及如何实现,因为看起来:fulltext方法似乎不接受多个文本/搜索字段参数我不知道将什么参数作为第一个参数传递给fulltext,就好像我通过了"foo"或"bar"结果不匹配如果我传递一个空参数,我得到一个q=*:*范围过滤器(例如with(:term).starting_with('foo*')(顾名思义)作为过滤器查询应用,因此不参与评分。似乎可以手动编写字符串(或者可能使

  8. ruby - 如何在 Ruby 中生成一个非常大的随机整数? - 2

    我想在ruby​​中生成一个64位整数。我知道在Java中你有很多渴望,但我不确定你会如何在Ruby中做到这一点。另外,64位数字中有多少个字符?这是我正在谈论的示例......123456789999。@num=Random.rand(9000)+Random.rand(9000)+Random.rand(9000)但我认为这是非常低效的,必须有一种更简单、更简洁的方法来做到这一点。谢谢! 最佳答案 rand可以将范围作为参数:pa=rand(2**32..2**64-1)#=>11093913376345012184putsa.

  9. ruby-on-rails - 在不重新查询数据库的情况下重新排序 Rails 中的事件记录? - 2

    例如,假设我有一个名为Products的模型,并且在ProductsController中,我有以下代码用于product_listView以显示已排序的产品。@products=Product.order(params[:order_by])让我们想象一下,在product_listView中,用户可以使用下拉菜单按价格、评级、重量等进行排序。数据库中的产品不会经常更改。我很难理解的是,每次用户选择新的order_by过滤器时,rails是否必须查询,或者rails是否能够以某种方式缓存事件记录以在服务器端重新排序?有没有一种方法可以编写它,以便在用户排序时rails不会重新查询结果

  10. ruby-on-rails - 带句点(或句号)的 Rails 查询字符串。 - 2

    我目前正在尝试了解RoR。我将两个字符串传递到我的Controller中。一个是随机的十六进制字符串,另一个是电子邮件。该项目用于对数据库进行简单的电子邮件验证。我遇到的问题是当我输入如下内容来测试我的页面时:http://signup.testsite.local/confirm/da2fdbb49cf32c6848b0aba0f80fb78c/bob.villa@gmailcom我在:email的参数散列中得到的全部是'bob'。我在gmail和com之间留下了.,因为那样会导致匹配根本不起作用。我的路由匹配如下:match"confirm/:code/:email"=>"conf

随机推荐