Elasticsearch: The Basic Search API

admin • 2021-12-24 21:18 • Cloud Computing

Search API

URI Search: queries data by passing query parameters in the URL.
Request Body Search: uses the JSON-based query syntax that ES provides, the Query Domain Specific Language (DSL).

Basic API:

/_search: searches all indices in the cluster
/index1/_search: searches only index1
/index1,index2/_search: searches index1 and index2
/index*/_search: searches all indices whose names start with index
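
The four path forms above follow one pattern, which can be sketched in a few lines of Python. The helper name search_path is made up for this illustration; it only builds the path string and does not contact a cluster.

```python
def search_path(*indices: str) -> str:
    """Return the _search path for zero or more index names or patterns."""
    if not indices:
        # No index given: the search covers every index in the cluster.
        return "/_search"
    # Several names (or wildcard patterns) are joined with commas.
    return "/" + ",".join(indices) + "/_search"


print(search_path())
print(search_path("index1"))
print(search_path("index1", "index2"))
print(search_path("index*"))
```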

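URI Search

q: the query string, written in "query string syntax" as field:value pairs.
df: the default field; when it is not specified, all fields are searched.
sort: the sort order; from and size are used for pagination.
profile: shows how the query was actually executed, returning a detailed breakdown of the execution.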

For example: curl -XGET 'localhost:9200/index/_search?q=user:程大帅2' searches the index named index for documents whose user field is 程大帅2, using a GET request.

{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":2,"relation":"eq"},"max_score":0.45315093,"hits":[{"_index":"index","_type":"_doc","_id":"KnXR0n0BGhZc0_6ZWpOi","_score":0.45315093,"_source":{
  "user":"程大帅2",
  "comment":"ES真好玩2"
}
},{"_index":"index","_type":"_doc","_id":"K3XS0n0BGhZc0_6ZeZMe","_score":0.45315093,"_source":{
  "user":"程大帅2",
  "comment":"ES真好玩2"
}
}]}}
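
When such URLs are assembled in code, the parameter values need URL-encoding. Below is a minimal sketch using only the Python standard library; the function name uri_search_url and the default host are assumptions for illustration, and the function only builds the URL string without sending it.

```python
from urllib.parse import urlencode


def uri_search_url(index, q, df=None, sort=None, from_=None, size=None,
                   host="http://localhost:9200"):
    """Build a URI Search URL from the q, df, sort, from and size parameters."""
    params = {"q": q}
    optional = {"df": df, "sort": sort, "from": from_, "size": size}
    # Leave out every parameter the caller did not set.
    params.update({k: v for k, v in optional.items() if v is not None})
    # urlencode percent-encodes ":" and non-ASCII characters.
    return f"{host}/{index}/_search?{urlencode(params)}"


print(uri_search_url("index", "user:程大帅2"))
print(uri_search_url("index", "es", df="comment", from_=0, size=2))
```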

profile

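Suppose we want to find the documents in index whose user field contains 程. The response then carries a profile section with the details of how the query was executed.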

get index/_search?q=程&df=user
{
  "profile":"true"
}

{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.14426158,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.14426158,
        "_source" : {
          "user" : "程大帅",
          "comment" : "ES真好玩"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "KnXR0n0BGhZc0_6ZWpOi",
        "_score" : 0.12874341,
        "_source" : {
          "user" : "程大帅2",
          "comment" : "ES真好玩2"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "K3XS0n0BGhZc0_6ZeZMe",
        "_score" : 0.12874341,
        "_source" : {
          "user" : "程大帅2",
          "comment" : "ES真好玩2"
        }
      }
    ]
  },
  "profile" : {
    "shards" : [
      {
        "id" : "[rOvJL9KfRHyGdHGN6kgoJA][index][0]",
        "searches" : [
          {
            "query" : [
              {
                "type" : "TermQuery",
                "description" : "user:程",
                "time_in_nanos" : 537900,
                "breakdown" : {
                  "set_min_competitive_score_count" : 0,
                  "match_count" : 0,
                  "shallow_advance_count" : 0,
                  "set_min_competitive_score" : 0,
                  "next_doc" : 2100,
                  "match" : 0,
                  "next_doc_count" : 3,
                  "score_count" : 3,
                  "compute_max_score_count" : 0,
                  "compute_max_score" : 0,
                  "advance" : 3900,
                  "advance_count" : 2,
                  "score" : 5600,
                  "build_scorer_count" : 4,
                  "create_weight" : 37700,
                  "shallow_advance" : 0,
                  "create_weight_count" : 1,
                  "build_scorer" : 488600
                }
              }
            ],
            "rewrite_time" : 4200,
            "collector" : [
              {
                "name" : "SimpleTopScoreDocCollector",
                "reason" : "search_top_hits",
                "time_in_nanos" : 20000
              }
            ]
          }
        ],
        "aggregations" : [ ],
        "fetch" : {
          "type" : "fetch",
          "description" : "",
          "time_in_nanos" : 1179300,
          "breakdown" : {
            "load_stored_fields" : 53900,
            "load_stored_fields_count" : 3,
            "next_reader" : 14700,
            "next_reader_count" : 2
          },
          "debug" : {
            "stored_fields" : [
              "_id",
              "_routing",
              "_source"
            ]
          },
          "children" : [
            {
              "type" : "FetchSourcePhase",
              "description" : "",
              "time_in_nanos" : 8300,
              "breakdown" : {
                "process_count" : 3,
                "process" : 4400,
                "next_reader" : 3900,
                "next_reader_count" : 2
              },
              "debug" : {
                "fast_path" : 3
              }
            }
          ]
        }
      }
    ]
  }
}
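
A profile section this long is easier to read once it is reduced to the slowest steps. The sketch below is one possible way to do that in Python; the function name slowest_profile_steps is made up here, and the sample dictionary is a trimmed copy of the response above, keeping only the fields the function reads.

```python
def slowest_profile_steps(response: dict) -> list:
    """Return (time_in_nanos, shard id, step name) tuples, slowest first."""
    steps = []
    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for query in search.get("query", []):
                steps.append((query["time_in_nanos"], shard["id"], query["type"]))
            for collector in search.get("collector", []):
                steps.append((collector["time_in_nanos"], shard["id"], collector["name"]))
        fetch = shard.get("fetch")
        if fetch:
            steps.append((fetch["time_in_nanos"], shard["id"], fetch["type"]))
    return sorted(steps, reverse=True)


# Trimmed copy of the profile output shown above.
sample = {
    "profile": {
        "shards": [{
            "id": "[rOvJL9KfRHyGdHGN6kgoJA][index][0]",
            "searches": [{
                "query": [{"type": "TermQuery", "time_in_nanos": 537900}],
                "collector": [{"name": "SimpleTopScoreDocCollector",
                               "time_in_nanos": 20000}],
            }],
            "fetch": {"type": "fetch", "time_in_nanos": 1179300},
        }]
    }
}

for nanos, shard_id, name in slowest_profile_steps(sample):
    print(f"{nanos:>10} ns  {name}  {shard_id}")
```

In this sample the fetch phase takes the most time, followed by the TermQuery and then the collector.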

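Request Body Search

1. Search - match_all

We can also put the query in the request body, using either a GET or a POST request; the body holds the JSON-format DSL that ES provides. Request Body Search is the recommended approach.
For example: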

post index/_search
{
  "query":{
    "match_all": {}
  },
  "from":0,
  "size":2,
  "sort":[{"user":"desc"}]
}
-----------------------------------
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "user" : "程大帅2",
          "comment" : "ES真好玩2"
        },
        "sort" : [
          "程大帅2"
        ]
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "user" : "程大帅",
          "comment" : "ES真好玩"
        },
        "sort" : [
          "程大帅"
        ]
      }
    ]
  }
}

from and size are used for pagination: start at offset from and return size documents.
sort is a list, so several fields can be sorted in desc/asc order.
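
Applications usually think in page numbers rather than offsets. The following sketch shows one way to turn a 1-based page number into from and size; the helper name page_body and its defaults are assumptions for illustration, mirroring the request above.

```python
import json


def page_body(page: int, page_size: int,
              sort_field: str = "user", order: str = "desc") -> dict:
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {
        "query": {"match_all": {}},
        # Page 1 starts at offset 0, page 2 at offset page_size, and so on.
        "from": (page - 1) * page_size,
        "size": page_size,
        "sort": [{sort_field: order}],
    }


print(json.dumps(page_body(1, 2)))
print(json.dumps(page_body(3, 2)))
```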

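_source: restricts the returned data to the specified fields and supports wildcards. For example, to fetch only the fields whose names start with user, add "_source":["user*"] to the request body.
Example: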

post index/_search
{
  "query":{
    "match_all": {}
  },
  "from":0,
  "size":2,
  "_source":["user"]
}
-----------------------------------
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "user" : "程大帅"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "user" : "程大帅"
        }
      }
    ]
  }
}
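
To get a feel for what a wildcard pattern in _source selects, the effect can be imitated locally. This is only an illustration: ES does the filtering on the server, and Python's fnmatch is used here as a stand-in for its pattern matching. The field user_id is hypothetical and is not part of the example index.

```python
from fnmatch import fnmatchcase


def filter_source(doc: dict, patterns: list) -> dict:
    """Keep only the top-level fields whose name matches one of the patterns."""
    return {
        name: value
        for name, value in doc.items()
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    }


# user_id is a made-up field, added to show the effect of the wildcard.
doc = {"user": "程大帅", "user_id": 1, "comment": "ES真好玩"}

print(filter_source(doc, ["user"]))
print(filter_source(doc, ["user*"]))
```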

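2. Script fields - script_fields

ES provides script fields: a Painless script processes existing fields and the result is returned as a new field.

For example, to prepend my name is : to the user field and return the result as a new field myname, we can write: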

post index/_search
{
  "script_fields":{
    "myname":{
      "script":{
        "lang":"painless",
        "source":"'my name is :'+doc['user'].value"
      }
    }
  }
}
----------------------------------------------
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "myname" : [
            "my name is :程大帅"
          ]
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "fields" : {
          "myname" : [
            "my name is :程大帅"
          ]
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "fields" : {
          "myname" : [
            "my name is :程大帅2"
          ]
        }
      }
    ]
  }
}
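
If several script fields are needed, the nested JSON can be generated rather than typed by hand. A minimal sketch, assuming a helper named script_field_body (not part of any ES client); it only builds the request body shown above.

```python
import json


def script_field_body(new_field: str, source: str) -> dict:
    """Build a request body that returns one Painless script field."""
    return {
        "script_fields": {
            new_field: {
                "script": {"lang": "painless", "source": source}
            }
        }
    }


body = script_field_body("myname", "'my name is :' + doc['user'].value")
print(json.dumps(body, indent=2))
```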

3. Query expression - match

A match query analyzes the query text into terms before searching. For example, to find documents whose comment field contains i or es:

get index/_doc/_search
{
  "query":{
    "match":{
      "comment":"i es"
    }
  }
}
-------------------------------
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 1.5169399,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.5169399,
        "_source" : {
          "user" : "程大帅气",
          "comment" : "i like study"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.31479347,
        "_source" : {
          "user" : "程大帅气",
          "comment" : "es is funny"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.28161854,
        "_source" : {
          "user" : "程大帅",
          "comment" : "ES真好玩"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.28161854,
        "_source" : {
          "user" : "程大帅",
          "comment" : "ES真好玩"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.25476927,
        "_source" : {
          "user" : "程大帅2",
          "comment" : "ES真好玩2"
        }
      }
    ]
  }
}

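The DSL above returns every document that contains i or es, so by default the terms are combined with OR. What if we want only the documents that contain all of the terms?
We can add the property "operator":"AND" to the query. For example, to find documents that contain both is and es: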

get index/_doc/_search
{
  "query":{
    "match":{
      "comment":{
        "query":"is es",
        "operator":"AND"
      }
    }
  }
}
-----------------------------
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.8317333,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.8317333,
        "_source" : {
          "user" : "程大帅气",
          "comment" : "es is funny"
        }
      }
    ]
  }
}
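
The difference between the two operators can be imitated locally with a deliberately simplified model. The sketch below splits text on whitespace and lowercases it, which is an assumption made for illustration and not how an ES analyzer works in general; it also ignores scoring. The three comments are the English ones from the example index.

```python
def match_query(docs: dict, query: str, operator: str = "OR") -> list:
    """Return the ids of documents matching the query terms under OR or AND."""
    terms = query.lower().split()
    # OR: any term may match. AND: every term must match.
    combine = all if operator.upper() == "AND" else any
    hits = []
    for doc_id, text in docs.items():
        tokens = set(text.lower().split())
        if combine(term in tokens for term in terms):
            hits.append(doc_id)
    return hits


comments = {
    "5": "es is funny",
    "6": "i like study",
    "7": "i don't like study",
}

print(match_query(comments, "i es"))            # ['5', '6', '7']
print(match_query(comments, "is es", "AND"))    # ['5']
```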

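4. Phrase search - match_phrase

A phrase search, as the name suggests, requires the document to contain the phrase being searched for. For example, to find documents that contain the phrase i like, use a match_phrase query: the terms in query must appear in the same order to match.
Example: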

get index/_doc/_search
{
  "query":{
    "match_phrase": {
      "comment":"i like"
    }
  }
}
---------------------------
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 2.2602494,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 2.2602494,
        "_source" : {
          "user" : "程大帅气",
          "comment" : "i like study"
        }
      }
    ]
  }
}

We can also add the property "slop":n to the query, which allows up to n other words to appear between the query terms.
Example:

get index/_doc/_search
{
  "query":{
    "match_phrase": {
      "comment":{
        "query":"i like",
        "slop":1
      }
    }
  }
}
--------------------------------
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.2602494,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 2.2602494,
        "_source" : {
          "user" : "程大帅气",
          "comment" : "i like study"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.3024688,
        "_source" : {
          "user" : "程大帅气",
          "comment" : "i don't like study"
        }
      }
    ]
  }
}

As we can see, the search returns not only the document containing the phrase i like but also the one containing i don't like.
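
The effect of slop can be imitated with a small local model. This is a simplified sketch: it tokenizes on whitespace and only counts extra words between terms that appear in order. The real implementation measures slop as position moves and can also match reordered terms at a higher cost, which this sketch does not attempt.

```python
def phrase_matches(text: str, phrase: str, slop: int = 0) -> bool:
    """True if the phrase terms occur in order with at most `slop` extra words."""
    tokens = text.lower().split()
    terms = phrase.lower().split()
    if not terms:
        return False

    def search(term_idx: int, prev_pos: int, first_pos: int) -> bool:
        if term_idx == len(terms):
            # Words inside the matched span that are not phrase terms.
            extra_words = (prev_pos - first_pos) - (len(terms) - 1)
            return extra_words <= slop
        for pos in range(prev_pos + 1, len(tokens)):
            if tokens[pos] == terms[term_idx]:
                start = pos if term_idx == 0 else first_pos
                if search(term_idx + 1, pos, start):
                    return True
        return False

    return search(0, -1, -1)


print(phrase_matches("i like study", "i like"))                 # True
print(phrase_matches("i don't like study", "i like"))           # False
print(phrase_matches("i don't like study", "i like", slop=1))   # True
```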

5. Query String

A query_string query lets us search one or more fields more flexibly.

For example, to find documents whose comment field contains both i and like, we can write:

get index/_doc/_search
{
  "query":{
    "query_string": {
      "default_field": "comment",
      "query": "i AND like"
    }
  }
}

For example, to search the user and comment fields for documents that contain both 程大帅 and i, or that contain es, we can write:

get index/_doc/_search
{
  "query":{
    "query_string": {
      "fields": ["comment","user"],
      "query": "(程大帅 AND i) OR es"
    }
  }
}
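
Both request bodies above share one shape, which can be captured in a small builder. A sketch under stated assumptions: the helper name query_string_body is made up, and preferring fields over default_field when both are passed is a choice of this sketch, not a statement about how ES resolves them.

```python
import json


def query_string_body(query: str, fields=None, default_field=None) -> dict:
    """Build a query_string request body for one default field or several fields."""
    inner = {"query": query}
    if fields:
        # Choice of this sketch: an explicit field list wins over default_field.
        inner["fields"] = list(fields)
    elif default_field:
        inner["default_field"] = default_field
    return {"query": {"query_string": inner}}


print(json.dumps(query_string_body("i AND like", default_field="comment")))
print(json.dumps(query_string_body("(程大帅 AND i) OR es",
                                   fields=["comment", "user"]),
                 ensure_ascii=False))
```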

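6. Simple Query String

Some characteristics:

It is similar to Query String, but Simple Query String ignores invalid syntax and supports only a subset of the query syntax.
It does not support the AND, OR and NOT connectives; they are treated as plain search text.
The default relationship between terms is OR; this can be changed with default_operator.
Use + in place of AND, | in place of OR, and - in place of NOT.

For example, to find documents whose comment field contains both i and like, we can write: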


GET index/_doc/_search
{
  "query":{
    "simple_query_string":{
      "query":"i +like",
      "fields":["comment"]
    }
  }
}
------ or specify default_operator ------
GET index/_doc/_search
{
  "query":{
    "simple_query_string":{
      "query":"i like",
      "fields":["comment"],
      "default_operator":"AND"
    }
  }
}
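
The two variants above can be generated from a list of terms. The sketch below is illustrative only; the helper names all_terms_query and simple_query_string_body are made up, and nothing is sent to a cluster.

```python
import json


def all_terms_query(terms: list) -> str:
    """Join terms with the + operator, as in the "i +like" example."""
    return " +".join(terms)


def simple_query_string_body(query: str, fields: list,
                             default_operator: str = None) -> dict:
    """Build a simple_query_string request body."""
    inner = {"query": query, "fields": list(fields)}
    if default_operator:
        inner["default_operator"] = default_operator
    return {"query": {"simple_query_string": inner}}


# Variant 1: the AND is spelled out with + inside the query text.
print(json.dumps(simple_query_string_body(all_terms_query(["i", "like"]),
                                          ["comment"])))
# Variant 2: plain terms, with AND as the default operator.
print(json.dumps(simple_query_string_body("i like", ["comment"], "AND")))
```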

Response

After we submit a query, what data does ES send back in the response?

get index/_search
{
    "query": {
      "match_all":{
      }
    }
}

Response:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "KnXR0n0BGhZc0_6ZWpOi",
        "_score" : 1.0,
        "_source" : {
          "user" : "程大帅2",
          "comment" : "ES真好玩2"
        }
      },
      {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "K3XS0n0BGhZc0_6ZeZMe",
        "_score" : 1.0,
        "_source" : {
          "user" : "程大帅2",
          "comment" : "ES真好玩2"
        }
      }
    ]
  }
}

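ES returns quite a lot of data; the key parts are:

took: the time this request took, in milliseconds
_shards: the number of shards this query covered, with the counts of successes and failures
total: the total number of documents that match the query
hits: the result set returned by this request
- _index: the index name
- _id: the id under which the document is stored in ES
- _score: the relevance score
- _source: the original document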
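
In client code these fields are usually read straight from the parsed JSON. A minimal sketch follows, assuming the response format shown above, where hits.total is an object with value and relation; the function name summarize is made up, and the sample is a trimmed copy of the response.

```python
def summarize(response: dict) -> dict:
    """Pick the fields discussed above out of a search response."""
    shards = response["_shards"]
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "failed_shards": shards["failed"],
        "total": hits["total"]["value"],
        # One (id, score, source) tuple per returned document.
        "docs": [(h["_id"], h["_score"], h["_source"]) for h in hits["hits"]],
    }


# Trimmed copy of the response shown above.
sample_response = {
    "took": 0,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 3, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_index": "index", "_id": "KnXR0n0BGhZc0_6ZWpOi", "_score": 1.0,
             "_source": {"user": "程大帅2", "comment": "ES真好玩2"}},
        ],
    },
}

summary = summarize(sample_response)
print(summary["took_ms"], summary["total"], summary["docs"][0][0])
```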
