Understanding minimum_should_match in Elasticsearch

梦想东东 2024-05-10

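Based on Elasticsearch 7.6.1 and Kibana 7.6.1.

This article explains the parameter through worked examples. The queries in section 3 (Queries) are the core of it and are worth reading carefully.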

1. Create the index

PUT goods
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      }
    }
  }
}

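Notes:

To improve search quality, the two analyzers provided by the IK analysis plugin, ik_max_word and ik_smart, are usually used together.
Index with ik_max_word, which produces as many tokens as possible, and search with ik_smart, which splits the query text into fewer, coarser tokens, so that matches stay as precise as possible.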

2. Bulk-import data with _bulk

POST goods/_bulk
{"index":{"_id":1}}
{"title":"法国原瓶进口红酒"}
{"index":{"_id":2}}
{"title":"圣罗兰山茶色口红"}
{"index":{"_id":3}}
{"title":"康师傅红烧牛肉味方便面"}
{"index":{"_id":4}}
{"title":"康师傅芥末青柠味方便面"}
{"index":{"_id":5}}
{"title":"康师傅香辣牛肉味方便面"}
{"index":{"_id":6}}
{"title":"康师傅极致酷爽红茶"}
{"index":{"_id":7}}
{"title":"新西兰进口牛奶"}
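
The request body above is newline-delimited JSON: one action line followed by one document line per record, and the body has to end with a newline. As a minimal sketch, the same body could be generated from a plain list of titles in Python. The helper name build_bulk_body and the sequential _id numbering are assumptions made for illustration, not part of the original article.

```python
import json

# The seven sample titles from the article.
TITLES = [
    "法国原瓶进口红酒",
    "圣罗兰山茶色口红",
    "康师傅红烧牛肉味方便面",
    "康师傅芥末青柠味方便面",
    "康师傅香辣牛肉味方便面",
    "康师傅极致酷爽红茶",
    "新西兰进口牛奶",
]


def build_bulk_body(titles, start_id=1):
    """Build an NDJSON _bulk body (hypothetical helper, for illustration)."""
    lines = []
    for offset, title in enumerate(titles):
        # Action line: index the next document under a sequential _id.
        lines.append(json.dumps({"index": {"_id": start_id + offset}}))
        # Source line: keep the Chinese text readable instead of \u escapes.
        lines.append(json.dumps({"title": title}, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(TITLES)
print(bulk_body, end="")
```

The resulting string can be pasted below POST goods/_bulk in the Kibana console, or sent by an HTTP client with the Content-Type: application/x-ndjson header.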

3. Queries

# match query
# Recall on "进口" OR "红酒" (OR is the default operator).
GET goods/_search
{
  "query": {
    "match": {
      "title": "进口红酒"
    }
  }
}

GET goods/_search
{
  "query": {
    "match": {
      "title": "口红"
    }
  }
}

# Recall on "康师傅" OR "红烧".
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧"
      }
    }
  }
}

# Recall on "康师傅" OR "红烧", but at least two terms must match.
# With only two terms in the query, both are therefore required.
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧",
        "operator": "or",
        "minimum_should_match": 2
      }
    }
  }
}

# Recall on "康师傅" AND "红烧".
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧",
        "operator": "and"
      }
    }
  }
}

# Recall on "康师傅" OR "红烧" OR "方便面", but at least two terms must match.
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧方便面",
        "operator": "or",
        "minimum_should_match": 2
      }
    }
  }
}
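
The effect of "at least two of three terms" can be approximated offline. The sketch below is not how Elasticsearch evaluates the query: it assumes the query text is split into the three terms listed above, and it replaces real token matching with a plain substring test on the title. The function name match_at_least is hypothetical.

```python
# Sample documents from section 2, keyed by _id.
TITLES = {
    1: "法国原瓶进口红酒",
    2: "圣罗兰山茶色口红",
    3: "康师傅红烧牛肉味方便面",
    4: "康师傅芥末青柠味方便面",
    5: "康师傅香辣牛肉味方便面",
    6: "康师傅极致酷爽红茶",
    7: "新西兰进口牛奶",
}


def match_at_least(titles, terms, minimum):
    """Return the ids whose title contains at least `minimum` of the terms.

    Assumption: substring containment stands in for analyzer-based matching.
    """
    hits = []
    for doc_id, title in titles.items():
        matched = sum(1 for term in terms if term in title)
        if matched >= minimum:
            hits.append(doc_id)
    return hits


terms = ["康师傅", "红烧", "方便面"]
hits = match_at_least(TITLES, terms, 2)
print(hits)  # [3, 4, 5]
```

Under this approximation, document 6 contains only "康师傅" and is dropped, while documents 4 and 5 qualify through "康师傅" plus "方便面". Results from a real cluster depend on how the IK analyzers actually tokenize the titles and the query.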

# 3 * 75% = 2.25, which rounds down to 2.
# Recall on "康师傅" OR "红烧" OR "方便面", but at least two terms must match.
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧方便面",
        "operator": "or",
        "minimum_should_match": "75%"
      }
    }
  }
}

# Equivalent bool query
GET goods/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title": "康师傅" } },
        { "term": { "title": "红烧" } },
        { "term": { "title": "方便面" } }
      ],
      "minimum_should_match": 2
    }
  }
}
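
The bool rewrite follows a fixed pattern: one term clause per query token, plus the resolved clause count. A small Python sketch of that pattern is shown below, assuming the token list has already been obtained (for example from an _analyze call as in section 4). The helper build_should_query is a hypothetical name, not an Elasticsearch client API.

```python
import json


def build_should_query(field, terms, minimum):
    """Build a bool/should request body with one term clause per token.

    Hypothetical helper: `terms` is assumed to be the analyzed query tokens.
    """
    should = [{"term": {field: term}} for term in terms]
    return {
        "query": {
            "bool": {
                "should": should,
                "minimum_should_match": minimum,
            }
        }
    }


query = build_should_query("title", ["康师傅", "红烧", "方便面"], 2)
print(json.dumps(query, ensure_ascii=False, indent=2))
```

The printed JSON can be used as the body of GET goods/_search.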

# 3 * 60% = 1.8, which rounds down to 1.
# Recall on "康师傅" OR "红烧" OR "方便面", but at least one term must match.
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧方便面",
        "operator": "or",
        "minimum_should_match": "60%"
      }
    }
  }
}

# Equivalent bool query
GET goods/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title": "康师傅" } },
        { "term": { "title": "红烧" } },
        { "term": { "title": "方便面" } }
      ],
      "minimum_should_match": 1
    }
  }
}

# "3<80%": with 3 or fewer clauses, every should clause must match; with more than 3, 80% of the should clauses must match.
# Here there are 5 clauses: 5 * 80% = 4 exactly, so rounding down leaves 4.
# Recall on "康师傅" OR "红烧" OR "牛肉" OR "味" OR "方便面", but at least 4 terms must match.
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧牛肉味方便面",
        "operator": "or",
        "minimum_should_match": "3<80%"
      }
    }
  }
}

# Equivalent bool query
GET goods/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title": "康师傅" } },
        { "term": { "title": "红烧" } },
        { "term": { "title": "牛肉" } },
        { "term": { "title": "味" } },
        { "term": { "title": "方便面" } }
      ],
      "minimum_should_match": 4
    }
  }
}

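# "3<80%": with 3 or fewer clauses, every should clause must match; with more than 3, 80% of the should clauses must match.
# Here there are exactly 3 clauses, so the percentage is ignored and all 3 are required.
# Recall on "康师傅" OR "红烧" OR "方便面", but all 3 terms must match.
GET goods/_search
{
  "query": {
    "match": {
      "title": {
        "query": "康师傅红烧方便面",
        "operator": "or",
        "minimum_should_match": "3<80%"
      }
    }
  }
}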

# Equivalent bool query
GET goods/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title": "康师傅" } },
        { "term": { "title": "红烧" } },
        { "term": { "title": "方便面" } }
      ],
      "minimum_should_match": 3
    }
  }
}

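Notes:

"minimum_should_match": "3<80%" means: when the total number of should clauses is 3 or fewer, all of them must match; when it is greater than 3, at least 80% of them must match, with the resulting count rounded down.
"minimum_should_match": "60%" means: at least 60% of the should clauses must match, with the resulting count rounded down. For example, with 7 should clauses, 7 * 0.6 = 4.2, which rounds down to 4, so at least 4 should clauses must match.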

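The two rules above can be written as a small calculator. This is a sketch that covers only the three forms used in this article (a positive integer, a positive percentage, and a single "N<P%" condition). It does not reproduce Elasticsearch's own implementation and ignores the other forms the parameter accepts. The function name resolve_minimum_should_match is an assumption for illustration.

```python
def resolve_minimum_should_match(spec, clause_count):
    """Return how many of `clause_count` should clauses must match.

    Simplified sketch: handles 2, "75%" and "3<80%" style values only.
    """
    text = str(spec).strip()
    if "<" in text:
        # Conditional form "N<SPEC": at or below N clauses, all are required;
        # above N, the specification after "<" applies.
        threshold_text, inner_spec = text.split("<", 1)
        if clause_count <= int(threshold_text):
            return clause_count
        return resolve_minimum_should_match(inner_spec, clause_count)
    if text.endswith("%"):
        # Percentage form: integer arithmetic gives the rounded-down count.
        percent = int(text[:-1])
        return clause_count * percent // 100
    # Plain integer form: a fixed number of clauses.
    return int(text)


# The cases worked through in section 3 and in the notes above.
print(resolve_minimum_should_match("75%", 3))    # 2
print(resolve_minimum_should_match("60%", 3))    # 1
print(resolve_minimum_should_match("3<80%", 5))  # 4
print(resolve_minimum_should_match("3<80%", 3))  # 3
print(resolve_minimum_should_match("60%", 7))    # 4
```

Integer floor division is used instead of floating-point multiplication so that exact cases such as 5 * 80% cannot be thrown off by rounding error.
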
4. Appendix: comparing the tokenization of ik_max_word and ik_smart

GET _analyze
{
  "analyzer": "ik_smart",
  "text": "进口红酒"
}

GET _analyze
{
  "analyzer": "ik_smart",
  "text": "康师傅红烧"
}

GET _analyze
{
  "analyzer": "ik_smart",
  "text": "康师傅红烧方便面"
}

GET _analyze
{
  "analyzer": "ik_smart",
  "text": "康师傅红烧牛肉味方便面"
}

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "新西兰进口牛奶"
}

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "康师傅红烧牛肉味方便面"
}

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "康师傅香辣牛肉味方便面"
}
