Building a full-text search service with Elasticsearch

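Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine with a RESTful web interface. Elasticsearch is written in Java and released as open source under the Apache License, and it is currently a popular enterprise search engine. Designed for use in cloud computing, it offers real-time search and is stable, reliable, fast, and easy to install and use. (Baidu Baike)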

Preparation

Run Elasticsearch in a Docker container and install the elasticsearch-head management tool. Be sure to map Elasticsearch's ports 9200 and 9300 to the host, and to mount the config and plugins directories from the host.

docker-compose.yml configuration:

version: '2'  
services:

  elasticsearch:
    image: elasticsearch:5.2.0
    container_name: med-news-es-dev
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - $PWD/elasticsearch/config:/usr/share/elasticsearch/config
      - $PWD/elasticsearch/plugins:/usr/share/elasticsearch/plugins
    restart: always

  elasticsearch-head:
    image: mobz/elasticsearch-head:5
    container_name: med-news-es-head-dev
    ports:
      - "9100:9100"
    restart: always
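
Once the containers are up, it can help to confirm that the node answers on the mapped port before going further. The following is a minimal sketch, assuming the node is reachable at http://localhost:9200 as in the compose file above; the helper names `parse_root_info` and `fetch_root_info` are hypothetical and not part of any library.

```python
import json
import urllib.request

# Assumption: port 9200 is mapped to the host, as in the compose file.
ES_URL = "http://localhost:9200"


def parse_root_info(raw: bytes) -> dict:
    """Pick the fields worth checking out of the `GET /` response body."""
    info = json.loads(raw.decode("utf-8"))
    return {
        "node": info.get("name"),
        "cluster": info.get("cluster_name"),
        "version": info.get("version", {}).get("number"),
    }


def fetch_root_info(base_url: str = ES_URL, timeout: float = 5.0) -> dict:
    """Call `GET /` on a running node and summarise the answer."""
    with urllib.request.urlopen(base_url + "/", timeout=timeout) as resp:
        return parse_root_info(resp.read())


# Run this once the containers are up:
# print(fetch_root_info())
```

If the call succeeds, the reported node and cluster names should match whatever is set in config/elasticsearch.yml.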

Configuration and plugins

Edit config/elasticsearch.yml:

http.host: 0.0.0.0  
http.cors.enabled: true  
http.cors.allow-origin: "*"

cluster.name: "cluster_name"  
node.name: "node_name"  

IK analysis plugin: download the plugin, unzip it, and place it in the plugins directory. See the project homepage for details.
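
To check that the IK plugin was picked up after a restart, one option is to send a sample string to the `_analyze` endpoint and look at the tokens that come back. This is a rough sketch, assuming the node listens on http://localhost:9200; the helper names and the sample text are illustrative only.

```python
import json
import urllib.request

# Assumption: same host and port as in the compose file.
ES_URL = "http://localhost:9200"


def build_analyze_request(text: str, analyzer: str = "ik_max_word",
                          base_url: str = ES_URL) -> urllib.request.Request:
    """Prepare a POST /_analyze request for the given analyzer and text."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_from_response(raw: bytes) -> list:
    """Extract the token strings from an _analyze response body."""
    return [t["token"] for t in json.loads(raw.decode("utf-8"))["tokens"]]


# Run this against a live node with the plugin installed:
# req = build_analyze_request("全文搜索服务")
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(tokens_from_response(resp.read()))
```

If the plugin is missing, the node should reject the request because it does not know the analyzer name. Comparing the output for `ik_max_word` and `ik_smart` shows how differently the two analyzers split the same text.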

Creating the index

curl -XPUT localhost:9200/news -d '
{
  "mappings": {
    "news": {
      "properties": {
        "title": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword"
            },
            "ik": {
              "type": "text",
              "analyzer": "ik_max_word"
            }
          }
        },
        "content": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "keywords": {
          "type": "keyword"
        },
        "publishedAt": {
          "type": "date"
        }
      }
    }
  }
}
'

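  • In Elasticsearch 5.0, string was split into two types, text and keyword. A text field is analyzed: the whole string is broken into individual terms according to the analyzer's rules. A keyword field is not analyzed and is indexed as a single term.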
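
The difference between the two types is easier to see on a toy inverted index. The sketch below is not how Elasticsearch or IK work internally; it assumes a trivial analyzer (lowercase, then split on whitespace) purely to illustrate "analyzed into terms" versus "stored as one term".

```python
from collections import defaultdict


def analyze_text(value: str) -> list:
    """Toy analyzer for a `text` field: lowercase and split on whitespace."""
    return value.lower().split()


def analyze_keyword(value: str) -> list:
    """A `keyword` field is indexed as one single, unmodified term."""
    return [value]


def build_index(docs: dict, analyzer) -> dict:
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for term in analyzer(value):
            index[term].add(doc_id)
    return dict(index)


docs = {1: "Full Text Search", 2: "Search Service"}
text_index = build_index(docs, analyze_text)
keyword_index = build_index(docs, analyze_keyword)

print(sorted(text_index["search"]))             # [1, 2]: both contain the term
print(keyword_index.get("search"))              # None: no value equals "search"
print(sorted(keyword_index["Search Service"]))  # [2]: exact match only
```

This is why the mapping above uses text with an IK analyzer for searchable fields, and keyword for values that should only match exactly.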

Query

curl -XGET localhost:9200/news/news/_search -d '
{
    "query":{
      "bool":{
        "should":[
          {
            "multi_match": {
              "query":"关键词1",
              "type":"most_fields",
              "fields":["title.ik^10","keywords^10","content"],
              "boost":5
            }
          },
          {
            "multi_match": {
              "query":"关键词2",
              "type":"most_fields",
              "fields":["title.ik^10","keywords^10","content"],
              "boost":10
            }
          }
        ],
        "filter":{
          "bool":{
            "must":[
              {
                "term":{
                  "status":2
                }
              }
            ]
          }
        }
      }
    },
    "sort":[
      {
        "_score":{
          "order":"desc"
        }
      },
      {
        "publishedAt":{
          "order":"desc"
        }
      }
    ],
    "from":0,
    "size":10
}
'

With the Elasticsearch query DSL you can build query conditions for searching, filtering, sorting, and pagination.
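
In an application, the request body is usually assembled from user input instead of being written by hand. Here is a minimal sketch of how that might look, assuming the same fields, boosts, and status filter as the query above. The function name `build_search_body` and its parameters are hypothetical, and the sort is written as a list so that the order of the sort keys is explicit.

```python
import json

# Field weights taken from the query above.
FIELDS = ["title.ik^10", "keywords^10", "content"]


def build_search_body(keywords: dict, status: int = 2,
                      page: int = 1, size: int = 10) -> dict:
    """Build a _search body; `keywords` maps each phrase to its boost."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # One multi_match clause per phrase (search).
    should = [
        {
            "multi_match": {
                "query": phrase,
                "type": "most_fields",
                "fields": FIELDS,
                "boost": boost,
            }
        }
        for phrase, boost in keywords.items()
    ]
    return {
        "query": {
            "bool": {
                "should": should,
                # Filtering: restricts hits without affecting the score.
                "filter": {"bool": {"must": [{"term": {"status": status}}]}},
            }
        },
        # Sorting: by relevance first, then newest first.
        "sort": [
            {"_score": {"order": "desc"}},
            {"publishedAt": {"order": "desc"}},
        ],
        # Pagination: page 1 starts at offset 0.
        "from": (page - 1) * size,
        "size": size,
    }


body = build_search_body({"关键词1": 5, "关键词2": 10}, page=2)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The resulting JSON can be sent as the body of the `_search` request shown above; page 2 with the default size yields `"from": 10`.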
