ES: Basic Usage of Elasticsearch, the DSL (Part 2)

2020-10-15

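This article records how to use some of Elasticsearch's DSL statements.

The previous article covered document operations and request-body queries (basic queries and result filtering). This article continues with the remaining syntax.

Walkthrough

1. Request-body queries (advanced queries)

1.1 Boolean combination (bool)
A bool query combines other queries through must, must_not, and should clauses.

Request

GET /shopping/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "title": "小米"
        }}
      ],
      "must_not": [
        {"match": {
          "title": "电视"
        }}
      ],
      "should": [
        {"match": {
          "title": "手机"
        }}
      ]
    }
  }
}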

Response

{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "title" : "小米手机",
          "images" : "http://www.youngdonkey.cn/xm.jpg",
          "price" : 3999.0
        }
      }
    ]
  }
}
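To make the roles of the three clauses concrete, here is a minimal Python sketch of the selection logic. It is a simplification and not how Elasticsearch is implemented: a clause "matches" when the keyword is a substring of the title, and the score is just a count of matching must and should clauses, whereas real Elasticsearch analyzes the text and computes relevance scores. The `DOCS` dictionary and `bool_search` function are hypothetical names used only for illustration.

```python
# Toy documents mirroring three titles used in this article.
DOCS = {
    "1": "小米手机",
    "2": "华为手机",
    "3": "小米电视",
}


def bool_search(docs, must=(), must_not=(), should=()):
    """Return (doc_id, score) pairs, highest score first.

    Assumption: substring matching and clause counting stand in for
    Elasticsearch's text analysis and relevance scoring.
    """
    results = []
    for doc_id, title in docs.items():
        # Every must clause has to match.
        if not all(word in title for word in must):
            continue
        # No must_not clause may match.
        if any(word in title for word in must_not):
            continue
        # should clauses are optional here; each match raises the score.
        score = len(must) + sum(1 for word in should if word in title)
        results.append((doc_id, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


print(bool_search(DOCS, must=["小米"], must_not=["电视"], should=["手机"]))
# [('1', 2)]
```

Only document 1 survives: document 2 fails the must clause and document 3 is removed by must_not, which is consistent with the single hit in the response above.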

1.2 Range query (range)
A range query finds numbers or dates that fall within a specified interval. It accepts the following operators:

gt: greater than

gte: greater than or equal to

lt: less than

lte: less than or equal to

Request

GET /shopping/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 2500,
        "lte": 4000
      }
    }
  }
}

Response

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "title" : "小米手机",
          "images" : "http://www.youngdonkey.cn/xm.jpg",
          "price" : 3999.0
        }
      }
    ]
  }
}
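The four operators map directly onto ordinary comparisons. The sketch below illustrates that mapping in Python; `RANGE_OPS`, `in_range`, and the small `prices` table are hypothetical names for illustration, with prices taken from three of the documents in this article.

```python
import operator

# Each range operator and the comparison it stands for.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """Return True when value satisfies every operator given in bounds."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())


# Prices of documents 1, 2, and 3 as they appear in this article.
prices = {"1": 3999.0, "2": 4999.0, "3": 5999.0}
hits = [
    doc_id
    for doc_id, price in prices.items()
    if in_range(price, {"gte": 2500, "lte": 4000})
]
print(hits)
# ['1']
```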

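1.3 Fuzzy query (fuzzy)
Returns documents that contain terms similar to the search term.

Edit distance is the number of single-character changes needed to turn one term into another. These changes can include:

Changing a character (box -> fox)

Removing a character (black -> lack)

Inserting a character (sic -> sick)

Transposing two adjacent characters (act -> cat)

To find similar terms, the fuzzy query creates the set of all possible variations, or expansions, of the search term within the specified edit distance. The query then returns exact matches for each expansion.

Use fuzziness to change the edit distance. The default value AUTO is generally used; it derives the allowed edit distance from the length of the term:

0-2 characters: must match exactly

3-5 characters: one edit allowed

More than 5 characters: two edits allowed

Add data

POST /shopping/product/4
{
  "title": "apple手机",
  "image": "apple.jpg",
  "price": 5999.00
}
POST /shopping/product/5
{
  "title": "apple",
  "image": "apple.jpg",
  "price": 6999.00
}

Request

GET /shopping/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "ccple"
      }
    }
  }
}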

Response

{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

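The query above returns nothing because ccple differs from apple by two characters, which is more than the single edit that AUTO allows for a five-character term. If only one character is wrong:

Request

GET /shopping/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "cpple"
      }
    }
  }
}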

Response

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.55451775,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "4",
        "_score" : 0.55451775,
        "_source" : {
          "title" : "apple手机",
          "image" : "apple.jpg",
          "price" : 5999.0
        }
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "5",
        "_score" : 0.23014566,
        "_source" : {
          "title" : "apple",
          "image" : "apple.jpg",
          "price" : 6999.0
        }
      }
    ]
  }
}
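The two results can be checked by hand with a small edit-distance calculation. The sketch below is an illustration only, not Elasticsearch's implementation (which is built on Lucene automata). It computes the optimal string alignment distance, where each change, removal, insertion, or adjacent transposition costs one edit, and applies the AUTO thresholds listed earlier. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Optimal string alignment distance between a and b.

    Substitutions, deletions, insertions, and transpositions of two
    adjacent characters each count as one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution (or match)
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                # transposition of two adjacent characters
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


def auto_fuzziness(term):
    """Allowed edits under AUTO, following the length rules listed above."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(query_term, indexed_term):
    """True when indexed_term is within the AUTO edit distance of query_term."""
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)


print(edit_distance("ccple", "apple"), fuzzy_match("ccple", "apple"))  # 2 False
print(edit_distance("cpple", "apple"), fuzzy_match("cpple", "apple"))  # 1 True
```

The comparison is made against the term apple, on the assumption that the analyzer splits the title "apple手机" so that apple is indexed as its own term; that would explain why both documents 4 and 5 match cpple.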

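2 Request-body queries (sorting)

2.1 Sorting by a single field
sort lets us order results by different fields, with order specifying the direction: desc for descending, asc for ascending.

Request

GET /shopping/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}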

Response

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "title" : "apple手机",
          "images" : "http://www.youngdonkey.cn/apple.jpg",
          "price" : 5999.0
        },
        "sort" : [
          5999.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "title" : "小米电视",
          "images" : "http://www.youngdonkey.cn/xmds.jpg",
          "price" : 5999.0
        },
        "sort" : [
          5999.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "5",
        "_score" : null,
        "_source" : {
          "title" : "apple",
          "images" : "http://www.youngdonkey.cn/apple.jpg",
          "price" : 4999.0
        },
        "sort" : [
          4999.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "title" : "华为手机",
          "images" : "http://www.youngdonkey.cn/hw.jpg",
          "price" : 4999.0
        },
        "sort" : [
          4999.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "title" : "小米手机",
          "images" : "http://www.youngdonkey.cn/xm.jpg",
          "price" : 3999.0
        },
        "sort" : [
          3999.0
        ]
      }
    ]
  }
}

2.2 Sorting by multiple fields
Suppose we want to use price and _score (the relevance score) together, with the matches sorted first by price and then by relevance score:

Request

GET /shopping/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    },
    {
      "_score": {
        "order": "desc"
      }
    }
  ]
}

Response

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "title" : "apple",
          "image" : "apple.jpg",
          "price" : 6999.0
        },
        "sort" : [
          6999.0,
          1.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "title" : "apple手机",
          "image" : "apple.jpg",
          "price" : 5999.0
        },
        "sort" : [
          5999.0,
          1.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "title" : "小米电视",
          "images" : "http://www.youngdonkey.cn/xmds.jpg",
          "price" : 5999.0
        },
        "sort" : [
          5999.0,
          1.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "title" : "华为手机",
          "images" : "http://www.youngdonkey.cn/hw.jpg",
          "price" : 4999.0
        },
        "sort" : [
          4999.0,
          1.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "title" : "小米手机",
          "images" : "http://www.youngdonkey.cn/xm.jpg",
          "price" : 3999.0
        },
        "sort" : [
          3999.0,
          1.0
        ]
      }
    ]
  }
}
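Multi-field sorting means the first key decides the order and each later key only breaks ties left by the keys before it. A minimal Python sketch of that idea follows; `multi_sort` and the `hits` list are hypothetical, with prices and scores copied from the response above.

```python
def multi_sort(hits, sort_spec):
    """Order hits by several (field, order) pairs, most significant first."""
    ordered = list(hits)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last gives the combined ordering.
    for field, order in reversed(sort_spec):
        ordered.sort(key=lambda hit: hit[field], reverse=(order == "desc"))
    return ordered


hits = [
    {"_id": "1", "price": 3999.0, "_score": 1.0},
    {"_id": "2", "price": 4999.0, "_score": 1.0},
    {"_id": "3", "price": 5999.0, "_score": 1.0},
    {"_id": "4", "price": 5999.0, "_score": 1.0},
    {"_id": "5", "price": 6999.0, "_score": 1.0},
]

ordered = multi_sort(hits, [("price", "desc"), ("_score", "desc")])
print([hit["price"] for hit in ordered])
# [6999.0, 5999.0, 5999.0, 4999.0, 3999.0]
```

Documents 3 and 4 tie on both keys. This sketch keeps them in input order, while the response above lists 4 before 3, so the relative order of complete ties should not be relied on unless a further tiebreaker field is added to sort.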

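3 Request-body queries (highlighting)

When searching by keyword, the keyword is shown in a different color within the returned content; this is called highlighting.

Highlight query request

Elasticsearch can wrap the keyword portions of the matched content in tags and styles (highlighting). Add a highlight property alongside the match query:

pre_tags: the opening tag

post_tags: the closing tag

fields: the fields to highlight

title: declares that the title field should be highlighted; field-specific settings can follow, or the object can be left empty

Request

GET /shopping/_search
{
  "query": {
    "match": {
      "title": "华为"
    }
  },
  "highlight": {
    "pre_tags": "<font color='red'>",
    "post_tags": "</font>",
    "fields": {
      "title": {}
    }
  }
}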

Response

{
  "took" : 29,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "2",
        "_score" : 0.6931472,
        "_source" : {
          "title" : "华为手机",
          "images" : "http://www.youngdonkey.cn/hw.jpg",
          "price" : 4999.0
        },
        "highlight" : {
          "title" : [
            "<font color='red'>华为</font>手机"
          ]
        }
      }
    ]
  }
}
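Conceptually, highlighting wraps each matched keyword in the configured opening and closing tags. The Python sketch below reproduces only that wrapping step with plain string replacement; the `highlight` function is hypothetical, and the real highlighter works on analyzed terms and offsets instead of raw substrings.

```python
def highlight(text, keyword, pre_tag, post_tag):
    """Wrap every occurrence of keyword in text with pre_tag and post_tag."""
    return text.replace(keyword, f"{pre_tag}{keyword}{post_tag}")


print(highlight("华为手机", "华为", "<font color='red'>", "</font>"))
# <font color='red'>华为</font>手机
```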

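4 Request-body queries (pagination)

from: the starting index of the current page, which counts from 0 by default; from = (pageNum - 1) * size

size: the number of hits shown per page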
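The formula for from can be expressed as a small helper. This is a sketch with hypothetical function names; `paginate` slices an in-memory list to imitate what from and size select on the server.

```python
def page_to_from(page_num, size):
    """Convert a 1-based page number into the 0-based from offset."""
    if page_num < 1 or size < 1:
        raise ValueError("page_num and size must both be at least 1")
    return (page_num - 1) * size


def paginate(hits, page_num, size):
    """Return the slice of hits that from/size would select."""
    start = page_to_from(page_num, size)
    return hits[start:start + size]


print(page_to_from(1, 2))  # 0
print(page_to_from(3, 2))  # 4
print(paginate([6999.0, 5999.0, 5999.0, 4999.0, 3999.0], 2, 2))  # [5999.0, 4999.0]
```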

Request

GET /shopping/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    },
    {
      "_score": {
        "order": "desc"
      }
    }
  ],
  "from": 0,
  "size": 2
}

Response

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "title" : "apple",
          "image" : "apple.jpg",
          "price" : 6999.0
        },
        "sort" : [
          6999.0,
          1.0
        ]
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "title" : "apple手机",
          "image" : "apple.jpg",
          "price" : 5999.0
        },
        "sort" : [
          5999.0,
          1.0
        ]
      }
    ]
  }
}

5 Aggregations (aggs)

Similar to GROUP BY in MySQL.

Request

GET /shopping/_search
{
  "aggs": {
    "abc": {
      "terms": {
        "field": "price"
      }
    }
  }
}

Response

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "title" : "apple",
          "image" : "apple.jpg",
          "price" : 6999.0
        }
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "title" : "华为手机",
          "images" : "http://www.youngdonkey.cn/hw.jpg",
          "price" : 4999.0
        }
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "title" : "apple手机",
          "image" : "apple.jpg",
          "price" : 5999.0
        }
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "title" : "小米手机",
          "images" : "http://www.youngdonkey.cn/xm.jpg",
          "price" : 3999.0
        }
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "title" : "小米电视",
          "images" : "http://www.youngdonkey.cn/xmds.jpg",
          "price" : 5999.0
        }
      }
    ]
  },
  "aggregations" : {
    "abc" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 5999.0,
          "doc_count" : 2
        },
        {
          "key" : 3999.0,
          "doc_count" : 1
        },
        {
          "key" : 4999.0,
          "doc_count" : 1
        },
        {
          "key" : 6999.0,
          "doc_count" : 1
        }
      ]
    }
  }
}

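Request explained

GET /shopping/_search
{
  "aggs": { # keyword that starts the aggregations section
    "abc": { # name under which the result is returned
      "terms": { # terms aggregation: one bucket per distinct value of the field
        "field": "price" # field is fixed syntax: the grouping field (here, price)
      }
    }
  }
}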
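The GROUP BY analogy can be reproduced in a few lines of Python. This sketch groups the five prices from the response above into buckets with a count each. The ordering by count (descending) and then by key (ascending) is chosen to mirror the bucket order in that response and should be read as an assumption about tie-breaking, not as documented behavior.

```python
from collections import Counter

# Prices of the five documents, in the order they appear in the hits above.
prices = [6999.0, 4999.0, 5999.0, 3999.0, 5999.0]

counts = Counter(prices)
buckets = [
    {"key": key, "doc_count": count}
    # Largest buckets first; ties ordered by key to mirror the response.
    for key, count in sorted(counts.items(), key=lambda item: (-item[1], item[0]))
]
print(buckets)
```

The SQL counterpart would be roughly SELECT price, COUNT(*) FROM shopping GROUP BY price.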

6 Filtering before querying
Documents that meet the condition are kept; those that do not are filtered out.

Request

GET /shopping/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "title": "apple"
        }
      }
    }
  }
}

Response

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "5",
        "_score" : 0.0,
        "_source" : {
          "title" : "apple",
          "image" : "apple.jpg",
          "price" : 6999.0
        }
      },
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "4",
        "_score" : 0.0,
        "_source" : {
          "title" : "apple手机",
          "image" : "apple.jpg",
          "price" : 5999.0
        }
      }
    ]
  }
}

Request

GET /shopping/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "title": "apple"
        }
      },
      "must": [ # after filtering, the price must be 5999
        {
          "term": {
            "price": "5999"
          }
        }
      ]
    }
  }
}

Query result

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "shopping",
        "_type" : "product",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "title" : "apple手机",
          "image" : "apple.jpg",
          "price" : 5999.0
        }
      }
    ]
  }
}

Data type supplement (the special data type nested)

What is it?

nested: a specialised version of the object datatype that allows arrays of objects to be indexed and queried independently of each other.

Add data

PUT my_index/_doc/1
{
  "group": "fans",
  "user": [
    {
      "first": "John",
      "last": "Smith"
    },
    {
      "first": "Alice",
      "last": "White"
    }
  ]
}

Result

#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
  "_index" : "my_index",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Run the query

GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "user.first": "Alice"
        }},
        {"match": {
          "user.last": "Smith"
        }}
      ]
    }
  }
}

Response

{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "group" : "fans",
          "user" : [
            {
              "first" : "John",
              "last" : "Smith"
            },
            {
              "first" : "Alice",
              "last" : "White"
            }
          ]
        }
      }
    ]
  }
}

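Here is the problem: I was looking for a user named Alice Smith, and no such user exists in Elasticsearch, yet the document was returned.

Why does this happen?

When my_index was created, user defaulted to the object data type, so the array of objects was flattened into multi-value fields: {user.first: ["John", "Alice"]} and {user.last: ["Smith", "White"]}. The pairing between each first and last value is lost, so Alice and Smith each match.

From a business point of view, the query should return no data,

because the actual users are User1 {John, Smith} and User2 {Alice, White}.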
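The flattening described above can be simulated in Python to show exactly where the pairing is lost. This is an illustrative sketch with hypothetical function names, using exact string comparison in place of Elasticsearch's analyzed matching. `matches_flattened` imitates the default object behavior, while `matches_per_object` checks each user object on its own, which is the behavior the nested type is meant to provide.

```python
DOC = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}


def flatten(doc, array_field):
    """Collect each sub-field's values across all objects, losing the pairing."""
    flat = {}
    for obj in doc[array_field]:
        for key, value in obj.items():
            flat.setdefault(f"{array_field}.{key}", []).append(value)
    return flat


def matches_flattened(doc, array_field, wanted):
    """Each condition is checked against the pooled values of its field."""
    flat = flatten(doc, array_field)
    return all(
        value in flat.get(f"{array_field}.{key}", [])
        for key, value in wanted.items()
    )


def matches_per_object(doc, array_field, wanted):
    """All conditions must hold within one and the same object."""
    return any(
        all(obj.get(key) == value for key, value in wanted.items())
        for obj in doc[array_field]
    )


print(flatten(DOC, "user"))
# {'user.first': ['John', 'Alice'], 'user.last': ['Smith', 'White']}
print(matches_flattened(DOC, "user", {"first": "Alice", "last": "Smith"}))   # True
print(matches_per_object(DOC, "user", {"first": "Alice", "last": "Smith"}))  # False
```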

Delete this index: DELETE my_index

Then create it again with an explicit nested mapping

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "user": {
          "type": "nested"
        }
      }
    }
  }
}

Response

#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
#! Deprecation: [types removal] The parameter include_type_name should be explicitly specified in create index requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', and requests are expected to omit the type name in mapping definitions.
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "my_index"
}

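Re-add the same document (the PUT my_index/_doc/1 request from earlier), then run the previous query again to check whether Alice Smith is found

Request

GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "user.first": "Alice"
        }},
        {"match": {
          "user.last": "Smith"
        }}
      ]
    }
  }
}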

Response

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

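Conclusion: the second run returns no data. With the nested type, the properties of each object inside the nested array are treated as one unit, so each object in the array can be indexed and searched independently.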
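Once a field is mapped as nested, conditions that should hold within a single object are normally expressed with the nested query, which takes a path and an inner query. The article does not show this request, so the Python sketch below is an addition: it builds such a request body as a dictionary and prints it as JSON. The helper name `nested_bool_query` is hypothetical; the body it prints could be sent with GET my_index/_search.

```python
import json


def nested_bool_query(path, conditions):
    """Build a request body that matches all conditions inside one nested object."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        "must": [
                            {"match": {f"{path}.{field}": value}}
                            for field, value in conditions.items()
                        ]
                    }
                },
            }
        }
    }


body = nested_bool_query("user", {"first": "Alice", "last": "White"})
print(json.dumps(body, indent=2))
```

With the document from this article, a body built for Alice and White would be expected to match, while one built for Alice and Smith would not, since no single user object holds both values.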

Using nested together with the group field

Request

GET my_index/_search
{
  "query": {
    "match": {
      "group": "fans"
    }
  },
  "aggs": {
    "fan": {
      "nested": {
        "path": "user"
      }
    }
  }
}

Response

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "group" : "fans",
          "user" : [
            {
              "first" : "John",
              "last" : "Smith"
            },
            {
              "first" : "Alice",
              "last" : "White"
            }
          ]
        }
      }
    ]
  },
  "aggregations" : {
    "fan" : {
      "doc_count" : 2
    }
  }
}

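Summary of nested: it supports both queries and aggregations; when querying, nested treats the query conditions as a whole, applied to one object at a time.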
