Learning Elasticsearch

Posted on 2021-03-18 · Updated on 2025-01-16 · Category: Learning

1. Download Elasticsearch and related tools

Download Elasticsearch: https://www.elastic.co/cn/downloads/elasticsearch

Download Kibana: https://www.elastic.co/cn/downloads/kibana

2. Install and run

On Windows

Unzip the Elasticsearch archive and run elasticsearch.bat in the bin directory.

Open 127.0.0.1:9200 in a browser; the node answers with information like this:

{
  "name" : "DESKTOP-RAJN3CL",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "PCtXmfWsT52OBT9rR2kONQ",
  "version" : {
    "number" : "7.6.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
    "build_date" : "2020-03-26T06:34:37.794943Z",
    "build_snapshot" : false,
    "lucene_version" : "8.4.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
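
The same check can be scripted instead of done in a browser. The following is a minimal sketch that uses only the JDK's built-in HTTP client (Java 11 or later); the class name EsPing and the helper rootRequest are made up for this illustration, and the host and port are the defaults assumed above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class EsPing {

    // Builds a GET request for the node's root endpoint, e.g. http://127.0.0.1:9200/
    static HttpRequest rootRequest(String host, int port) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://" + host + ":" + port + "/"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        try {
            HttpResponse<String> response = client.send(
                    rootRequest("127.0.0.1", 9200),
                    HttpResponse.BodyHandlers.ofString());
            // On a running node the body is the JSON document shown above
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically means that no node is listening on that address
            System.out.println("Elasticsearch is not reachable: " + e.getMessage());
        }
    }
}
```

If the node is up, the printed body should contain the version block and the tagline.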

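Installing a web UI for Elasticsearch

Download elasticsearch-head: https://github.com/mobz/elasticsearch-head

It is a front-end package, so Node.js is required: https://nodejs.org/en/

After installing Node.js, install cnpm with the Taobao mirror as its registry:

npm install -g cnpm --registry=https://registry.npm.taobao.org

Install the dependencies and start the UI from the elasticsearch-head directory:

cnpm install # install the dependencies

npm run start # start the UI

Open http://localhost:9100/ to see the elasticsearch-head page.

By default Elasticsearch rejects cross-origin requests, so elasticsearch-head cannot connect to it yet. To allow them, add the following to elasticsearch.yml in the config folder of the Elasticsearch directory:

http.cors.enabled: true
http.cors.allow-origin: "*"

Restart Elasticsearch and reload the page; elasticsearch-head can now connect to the node.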

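3. Install Kibana

On Windows

Unzip the Kibana archive and run kibana.bat in the bin directory.

Open 127.0.0.1:5601 in a browser to reach the Kibana UI.

Chinese localization

Edit config/kibana.yml in the Kibana directory:

# Find this commented-out line
# i18n.locale: "en"

# Change it to
i18n.locale: "zh-CN"

# Restart Kibana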

4. Elasticsearch overview

Index
Field types (mapping)
Document

Elasticsearch is document-oriented, and everything in it is JSON.

Elasticsearch compared with a traditional relational database

MySQL	Elasticsearch
Database	Index (indices)
Table	Type (deprecated in 7.x, removed in 8.0)
Row	Document
Column	Field

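5. Using the IK analyzer with Elasticsearch

Download: https://github.com/medcl/elasticsearch-analysis-ik/releases

Unzip the download into the plugins directory of Elasticsearch, in a folder named ik.

Restart Elasticsearch; the console log shows that the ik plugin was loaded.

You can also list the installed plugins by running the following in a cmd console from the Elasticsearch bin directory:

elasticsearch-plugin list

# Output
future versions of Elasticsearch will require Java 11; your Java version from [D:\Program Files\Java\jdk1.8.0_202\jre] does not meet this requirement  # JDK version warning, safe to ignore
ik

Testing ik in Kibana

# Run the following in Dev Tools. ik_smart produces the coarsest-grained split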
GET _analyze
{
  "analyzer":"ik_smart",
  "text":"中华人民共和国"
}

# Response
{
  "tokens" : [
    {
      "token" : "中华人民共和国",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 0
    }
  ]
}

# ik_max_word produces the finest-grained split, exhausting every combination found in the dictionary
GET _analyze
{
  "analyzer":"ik_max_word",
  "text":"中华人民共和国"
}

# Response
{
  "tokens" : [
    {
      "token" : "中华人民共和国",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "中华人民",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "中华",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "华人",
      "start_offset" : 1,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "人民共和国",
      "start_offset" : 2,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "人民",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "共和国",
      "start_offset" : 4,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 6
    },
    {
      "token" : "共和",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 7
    },
    {
      "token" : "国",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "CN_CHAR",
      "position" : 8
    }
  ]
}
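
To get a feel for why ik_max_word returns so many overlapping tokens, here is a toy sketch that emits every dictionary word found at every offset, longest first. It is not the IK implementation; the class ToyMaxWord, the method maxWord and the mini dictionary are all made up for this illustration, and the dictionary simply lists the words seen in the response above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ToyMaxWord {

    // Emits every dictionary word contained in the text, ordered by start offset
    // and, for the same offset, longest match first.
    static List<String> maxWord(String text, Set<String> dictionary) {
        List<String> tokens = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int end = text.length(); end > start; end--) {
                String candidate = text.substring(start, end);
                if (dictionary.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical mini dictionary (assumption): the words from the response above
        Set<String> dictionary = Set.of(
                "中华人民共和国", "中华人民", "中华", "华人",
                "人民共和国", "人民", "共和国", "共和", "国");
        // Prints the nine tokens in the same order as the ik_max_word response
        System.out.println(maxWord("中华人民共和国", dictionary));
    }
}
```

A coarse-grained mode such as ik_smart would instead keep only the single longest word, which is why it returns one token for the same text.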

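If a term you need as a single word gets split apart, add it to the analyzer's dictionary.

Adding your own dictionary to the IK analyzer

Put the words in a .dic file in the plugin's config directory and reference that file from the ext_dict entry of IKAnalyzer.cfg.xml.

Restart Elasticsearch.

After that, the custom term is normally kept as one token.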

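REST style overview

Method	URL	Description
PUT	localhost:9200/index_name/type_name/doc_id	Create a document with the given id
POST	localhost:9200/index_name/type_name	Create a document with a generated id
POST	localhost:9200/index_name/type_name/doc_id/_update	Update a document
DELETE	localhost:9200/index_name/type_name/doc_id	Delete a document
GET	localhost:9200/index_name/type_name/doc_id	Get a document by id
POST	localhost:9200/index_name/type_name/_search	Search the documents

Basic tests

Create an index

PUT /index_name/type_name/doc_id
{
	request body
}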

Basic data types in Elasticsearch
- String types
  
  text, keyword
- Numeric types
  
  long, integer, short, byte, double, float, half_float, scaled_float
- Date type
  
  date
- Boolean type
  
  boolean
- Binary type
  
  binary
- and more

Specifying field types

# Run the following
PUT /test2
{
  "mappings": {
    "properties": {
      "name":{
        "type": "text"
      },
      "age":{
        "type": "long"
      },
      "birthday":{
        "type": "date"
      }
    }
  }
}

# Response
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test2"
}

Retrieving the index definition

GET index_name

# Request
GET test2

# Response
{
  "test2" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "long"
        },
        "birthday" : {
          "type" : "date"
        },
        "name" : {
          "type" : "text"
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1594005434518",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "PRZ3KvJ-TrGCqSLWe5dMhw",
        "version" : {
          "created" : "7060299"
        },
        "provided_name" : "test2"
      }
    }
  }
}

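Checking the status of Elasticsearch

GET _cat/health

GET _cat/indices?v

Updating a document

You can simply PUT the document again, but the request must include every value from the previous PUT, because the new body replaces the whole document.

Alternatively, update only the given fields with POST /index_name/type_name/doc_id/_update:

POST /index_name/type_name/doc_id/_update
{
  "doc": {
    request body
  }
}

Deleting an index

DELETE index_name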

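Basic document operations

Simple operations

Add: PUT

PUT /index_name/type_name/doc_id
{
 request body
}

Update

PUT

PUT /index_name/type_name/doc_id
{
 request body  # every field must be sent again; fields left out are dropped from the document
}

POST

POST /index_name/type_name/doc_id/_update
{
 "doc":{
 	request body
 }
}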
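
The difference between the two update styles is easy to picture with plain maps: PUT stores the new body as the whole document, while _update with "doc" merges the given fields into the stored one. The sketch below only models that behaviour in memory; the class ReplaceVsUpdate and its methods are made up for this illustration, and the merge is shallow, whereas Elasticsearch also merges nested objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReplaceVsUpdate {

    // PUT semantics: the request body becomes the whole document
    static Map<String, Object> put(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // _update semantics with "doc": the partial body is merged into the stored document
    static Map<String, Object> update(Map<String, Object> stored, Map<String, Object> partial) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(partial);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "葛大");
        stored.put("age", 25);

        Map<String, Object> change = Map.of("age", 26);

        System.out.println(put(stored, change));    // only age is left, name is gone
        System.out.println(update(stored, change)); // name is kept, age is changed
    }
}
```

This is why a PUT that sends only the changed field silently loses the others.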

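Query: GET

Get a document

GET /index_name/type_name/doc_id

Search within an index

GET /index_name/type_name/_search?q=field:value # the value may be just one term from the field's content

Delete: DELETE

DELETE /index_name/type_name/doc_id

Complex searches

# Full-text match on one field; a single match query cannot combine several conditions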
GET /jiang/user/_search
{
  "query": {
    "match": {
      "name": "二"
    }
  }
}

# Choose which fields are returned; here the hits only show the "name" and "age" fields
GET /jiang/user/_search
{
  "query": {
    "match": {
      "desc": "大"
    }
  },
  "_source": ["name","age"]
}

# Sort the results by a field, here by age
GET /jiang/user/_search
{
  "query": {
    "match_all": {
      
    }
  },
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

# Paginated query
# from: offset of the first hit to return
# size: number of hits to return
GET /jiang/user/_search
{
  "query": {
    "match_all": {
      
    }
  },
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ],
  "from": 0,
  "size": 1
}
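
User interfaces usually think in page numbers, while the search body wants an offset. A small sketch of the conversion, assuming 1-based page numbers; the class Paging and its two helpers are made up for this illustration.

```java
public class Paging {

    // Converts a 1-based page number into the zero-based "from" offset
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    // Renders the paging part of a search body as JSON
    static String pagingJson(int page, int size) {
        return String.format("{\"from\": %d, \"size\": %d}", fromOffset(page, size), size);
    }

    public static void main(String[] args) {
        System.out.println(pagingJson(1, 1));  // {"from": 0, "size": 1}
        System.out.println(pagingJson(3, 10)); // {"from": 20, "size": 10}
    }
}
```

Keep in mind that from + size may not exceed index.max_result_window, which is 10000 by default, so this style of paging only suits the first pages of a result.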

# Combine several conditions with bool
# must (like AND in MySQL)
# should (like OR in MySQL)
# must_not (the condition must not match)
GET /jiang/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "葛大"
          }
        },
        {
          "match": {
            "age": 25
          }
        }
      ]
    }
  }
}

Filtering with filter: gte means greater than or equal (the e stands for equal), and lte likewise means less than or equal.

GET /jiang/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "葛大"
          }
        }
      ],
      "filter": {
        "range":{
          "age":{
            "gte": 1
            , "lte": 20
          }
        }
      }
    }
  }
}

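Matching several terms

# Separate the terms with spaces; a document that matches any one of them is returned, and the score shows how well each hit matches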
GET /jiang/user/_search
{
  "query": {
    "match": {
      "tags": "男 勤"
    }
  }
}

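Exact queries

A term query looks up the given term directly in the inverted index, so it matches exactly.

About analysis:

term looks up the exact value, without running it through an analyzer
match runs the query text through the analyzer first and then searches for the resulting terms

Two string types: text and keyword

text is analyzed: the value is split into terms when it is indexed
keyword is not analyzed: the whole value is indexed as a single term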
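
The difference between term and match is easier to see on a toy inverted index. The sketch below is a simplified model, not how Elasticsearch is implemented: the class TermVsMatch is made up for this illustration, and its analyzer only lowercases and splits on whitespace.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TermVsMatch {

    // Inverted index: term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Toy analyzer (assumption): lowercases the text and splits it on whitespace
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String part : text.toLowerCase().trim().split("\\s+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }

    // Indexes a text field: the value is analyzed and every term is stored
    void index(int docId, String text) {
        for (String t : analyze(text)) {
            postings.computeIfAbsent(t, key -> new TreeSet<>()).add(docId);
        }
    }

    // term query: the input is looked up as-is, with no analysis
    Set<Integer> term(String value) {
        return postings.getOrDefault(value, Set.of());
    }

    // match query: the input is analyzed first; a document containing any term is a hit
    Set<Integer> match(String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String t : analyze(query)) {
            hits.addAll(term(t));
        }
        return hits;
    }

    public static void main(String[] args) {
        TermVsMatch idx = new TermVsMatch();
        idx.index(1, "Quick Brown Fox");
        idx.index(2, "Lazy Dog");

        System.out.println(idx.term("Quick"));      // [] because the stored term is "quick"
        System.out.println(idx.term("quick"));      // [1]
        System.out.println(idx.match("Quick dog")); // [1, 2]
    }
}
```

The first lookup is the classic surprise: a term query against an analyzed text field misses when the input is not already in the form the analyzer produced.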

Exact query on several values

# Add test data
PUT testdb/_doc/3
{
  "t1":"222",
  "t2":"2020-07-07"
}

PUT testdb/_doc/4
{
  "t1":"333",
  "t2":"2020-07-08"
}

# Exact query that matches either value
GET testdb/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "t1":  "222"
          }
        },
        {
          "term": {
            "t1": "333"
          }
        }
      ]
    }
  }
}

Highlighting

Matched terms in the results are wrapped in HTML tags.

GET jiang/user/_search
{
  "query": {
    "match": {
      "name": "葛大"
    }
  },
  "highlight": {
    "fields": {
      "name":{
        
      }
    }
  }
}

# Custom highlight tags
GET jiang/user/_search
{
  "query": {
    "match": {
      "name": "葛大"
    }
  },
  "highlight": {
    "pre_tags": "<p class='key' style='color:red;'>", 
    "post_tags": "</p>", 
    "fields": {
      "name":{
        
      }
    }
  }
}
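
What the highlighter hands back is just the field value with the matched terms wrapped in the chosen tags. A toy sketch of that effect on a single string follows; the class ToyHighlight and its method are made up for this illustration, and the real highlighter works on analyzed terms rather than on raw substring replacement.

```java
public class ToyHighlight {

    // Wraps every occurrence of the matched term in the given tags
    static String highlight(String fieldValue, String term, String preTag, String postTag) {
        return fieldValue.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        // <em> and </em> are the tags Elasticsearch uses when none are configured
        System.out.println(highlight("葛大的大", "大", "<em>", "</em>"));
        // Custom tags, as in the second query above
        System.out.println(highlight("葛大的大", "大",
                "<p class='key' style='color:red;'>", "</p>"));
    }
}
```

Because the tags end up in the page as markup, values shown this way should be treated as HTML on the front end.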

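Matching
Conditional matching
Exact matching
Range matching
Filtering returned fields
Multi-condition queries
Highlighting

Integrating with Spring Boot

Create the project and add the dependencies

<!-- Check whether the Elasticsearch version that Spring Boot pulls in by default matches your environment; if it does not, set the matching version in properties -->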

<properties>
    <!-- Set this to the version of your Elasticsearch installation -->
    <elasticsearch.version>7.6.2</elasticsearch.version>
</properties>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

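Configuring the Elasticsearch connection

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchClientConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient(){
        // Connect to the local Elasticsearch node over HTTP
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("localhost", 9200, "http")));
        return client;
    }
}

Connection tests

The tests below use a restHighLevelClient field, presumably injected with @Autowired into a Spring Boot test class.

Create an index

// Create an index
@Test
void testCreatedIndex() throws IOException {
    // 1. Build the create-index request
    CreateIndexRequest request = new CreateIndexRequest("jiang_index");
    // 2. Execute the request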
    CreateIndexResponse createIndexResponse =
        restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
    System.out.println(createIndexResponse);
}

Get an index

// Check whether the index exists
@Test
void testExistIndex() throws IOException {
    GetIndexRequest request = new GetIndexRequest("jiang_index");
    boolean exists = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(exists);
}

Delete an index

// Delete the index
@Test
void testDeleteIndex() throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest("jiang_index");
    AcknowledgedResponse delete = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println(delete.isAcknowledged());
}

Create a document

// Add a document
@Test
void testAddDocument() throws IOException {
    // Create the object to store
    User user = new User("葛大", 10);
    // Create the request
    IndexRequest request = new IndexRequest("jiang_index");

    // Equivalent to: PUT /jiang_index/_doc/1
    request.id("1");
    // Same as request.timeout(TimeValue.timeValueSeconds(1))
    request.timeout("1s");

    // Put the data into the request as JSON
    request.source(JSON.toJSONString(user), XContentType.JSON);

    IndexResponse index = restHighLevelClient.index(request, RequestOptions.DEFAULT);

    System.out.println(index.toString());
    System.out.println(index.status());
}

Document CRUD

// Check whether a document exists: GET /jiang_index/_doc/1
@Test
void testIsExists() throws IOException {
    GetRequest request = new GetRequest("jiang_index", "1");

    // Do not fetch _source, which makes the check cheaper
    request.fetchSourceContext(new FetchSourceContext(false));
    request.storedFields("_none_");

    boolean exists = restHighLevelClient.exists(request, RequestOptions.DEFAULT);
    System.out.println(exists);
}

// Get a document
@Test
void testGetDocument() throws IOException {
    GetRequest request = new GetRequest("jiang_index", "1");

    GetResponse response = restHighLevelClient.get(request, RequestOptions.DEFAULT);
    // Print the document source
    System.out.println(response.getSourceAsString());

    // The full response is the same as what the console command returns
    System.out.println(response);
}


// Update a document
@Test
void testUpdateDocument() throws IOException {
    UpdateRequest request = new UpdateRequest("jiang_index", "1");
    request.timeout("1s");
    User user = new User("葛大的大", 24);

    request.doc(JSON.toJSONString(user), XContentType.JSON);
    UpdateResponse update = restHighLevelClient.update(request, RequestOptions.DEFAULT);

    System.out.println(update.status());
    System.out.println(update);
}

// Delete a document
@Test
void testDeleteDocument() throws IOException {
    DeleteRequest request = new DeleteRequest("jiang_index", "1");
    // Set the request timeout
    request.timeout("1s");

    DeleteResponse delete = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
    System.out.println(delete.status());
    System.out.println(delete);
}

// Bulk insert
@Test
void testBulkRequest() throws IOException {
    BulkRequest request = new BulkRequest();
    request.timeout("10s");
    ArrayList<User> list = new ArrayList<>();
    list.add(new User("葛大", 29));
    list.add(new User("赵二", 12));
    list.add(new User("张三", 31));
    list.add(new User("李四", 33));
    list.add(new User("王五", 23));
    list.add(new User("陈六", 6));
    list.add(new User("潘七", 85));

    for (int i = 0; i < list.size(); i++) {
        request.add(new IndexRequest("jiang_index")
                    .source(JSON.toJSONString(list.get(i)), XContentType.JSON));
    }

    BulkResponse bulkResponse = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);

    // hasFailures() returns false when every operation succeeded
    System.out.println(bulkResponse.hasFailures());
    System.out.println(bulkResponse);
}

// Search

/**
 * SearchRequest: the search request
 * HighlightBuilder: builds highlighting
 * TermQueryBuilder: exact query
 * MatchQueryBuilder: full-text match
 * QueryBuilders.xxx: one factory method per query command
 * @throws IOException if the request fails
 */
@Test
void testSearch() throws IOException {
    SearchRequest request = new SearchRequest("jiang_index");

    // Build the search conditions
    SearchSourceBuilder builder = new SearchSourceBuilder();
    // Query conditions are created with the QueryBuilders helper
    // QueryBuilders.termQuery: exact match
    // QueryBuilders.matchAllQuery(): match everything
    MatchQueryBuilder queryBuilder = QueryBuilders.matchQuery("name", "张三");

    builder.query(queryBuilder);
    builder.timeout(new TimeValue(60, TimeUnit.SECONDS));

    request.source(builder);

    SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);

    System.out.println(JSON.toJSONString(response.getHits()));
    System.out.println("================================");

    for (SearchHit fields : response.getHits().getHits()) {
        System.out.println(fields.getSourceAsMap());
    }

}
