Elasticsearch Term Exact Query & Bool Combination Practic...

TL;DR

Scenario: The business needs both exact matching (price, ID, time) and fault-tolerant matching (prefixes, fuzzy matching, typos).
Conclusion: Use term-level queries for structured, exact conditions, then combine them with bool clauses (must / filter / should / must_not).
Output: End-to-end DSL examples covering index creation, data writing, and the term, terms, range, exists, prefix, regexp, fuzzy, ids, and bool queries.

Version Matrix

Item	Description
Elasticsearch 7.x	The DSL in this article was verified in a 7.x environment
Elasticsearch 8.x	Query DSL syntax is compatible
IK Analyzer Plugin	Examples depend on the ik_max_word analyzer
Dev Tools / Kibana Console	All examples were run in the Dev Tools console

Initial Index

Create a new book index:

PUT /book
{
  "settings": {},
  "mappings" : {
    "properties" : {
      "description" : {
        "type" : "text",
        "analyzer" : "ik_max_word"
      },
      "name" : {
        "type" : "text",
        "analyzer" : "ik_max_word"
      },
      "price" : {
        "type" : "float"
      },
      "timestamp" : {
        "type" : "date",
        "format" : "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}

Write Data

PUT /book/_doc/1
{
  "name": "lucene",
  "description": "Lucene Core is a Java library providing powerful indexing and search features...",
  "price":100.45,
  "timestamp":"2020-08-21 19:11:35"
}

PUT /book/_doc/2
{
  "name": "solr",
  "description": "Solr is highly scalable, providing fully fault tolerant distributed indexing...",
  "price":320.45,
  "timestamp":"2020-07-21 17:11:35"
}

PUT /book/_doc/3
{
  "name": "Hadoop",
  "description": "The Apache Hadoop software library is a framework...",
  "price":620.45,
  "timestamp":"2020-08-22 19:18:35"
}

PUT /book/_doc/4
{
  "name": "ElasticSearch",
  "description": "Elasticsearch是一个基于Lucene的搜索服务器...",
  "price":999.99,
  "timestamp":"2020-08-15 10:11:35"
}

Term Query

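The term query returns documents whose specified field contains exactly the given term. The search value is not analyzed, so it must equal an indexed token character for character; a value that is longer, shorter, or differently cased than the token will not match.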

POST /book/_search
{
  "query": {
    "term" : {
      "name" : "solr"
    }
  }
}
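
To see why a term value matches or misses, it can help to inspect the tokens that were actually indexed. The sketch below first runs the _analyze API against the name field, then issues a term query that keeps the original capitalization. It assumes that ik_max_word lowercases Latin text (its usual default); under that assumption the second request would return no hits, even though document 3 was written with the name Hadoop.

```
# Show the tokens the name field's analyzer produces for this text
POST /book/_analyze
{
  "field": "name",
  "text": "Hadoop"
}

# term does not analyze its input, so "Hadoop" is compared as-is
POST /book/_search
{
  "query": {
    "term" : {
      "name" : "Hadoop"
    }
  }
}
```

If the first response lists the token hadoop, changing the term value to hadoop should bring document 3 back.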

Terms Query

The terms query returns documents whose specified field contains at least one of the given terms.

POST /book/_search
{
  "query": {
    "terms" : {
      "name" : ["solr", "elasticsearch"]
    }
  }
}

Range Query

gte: greater than or equal
gt: greater than
lte: less than or equal
lt: less than
boost: query weight (default 1.0)

POST /book/_search
{
  "query": {
    "range" : {
      "price" : {
        "gte" : 10,
        "lte" : 200,
        "boost" : 2.0
      }
    }
  }
}

Date range query:

POST /book/_search
{
  "query": {
    "range" : {
      "timestamp" : {
        "gte": "18/08/2020",
        "lte": "2021",
        "format": "dd/MM/yyyy||yyyy"
      }
    }
  }
}
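
The format parameter in the query above only describes how the query's own date strings are written. When it is omitted, the bounds are parsed with the formats from the field mapping. A minimal sketch under that assumption, using the yyyy-MM-dd HH:mm:ss format declared for timestamp and a half-open interval for August 2020:

```
# No "format" here: the bounds must be parseable by the formats
# declared in the timestamp mapping
POST /book/_search
{
  "query": {
    "range" : {
      "timestamp" : {
        "gte": "2020-08-01 00:00:00",
        "lt": "2020-09-01 00:00:00"
      }
    }
  }
}
```

With the sample data written earlier, this should select documents 1, 3, and 4, whose timestamps fall in August 2020.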

Exists Query

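The exists query returns documents that have an indexed value for the specified field, similar to column IS NOT NULL in SQL. A field whose value is null or an empty array ([]) is treated as missing.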

POST /book/_search
{
  "query": {
    "exists" : { "field" : "price" }
  }
}
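
The opposite check, the counterpart of column IS NULL, has no dedicated query. A common way to express it is to wrap exists in the must_not clause of a bool query, as in this sketch:

```
# Documents that have no indexed value for price
POST /book/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": { "field": "price" }
      }
    }
  }
}
```

All four sample documents carry a price, so against the data above this request should return no hits.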

Prefix Query

POST /book/_search
{
  "query": {
    "prefix" : {
      "name" : "so"
    }
  }
}

Regexp Query

The regexp query matches terms against a regular expression. Note: careless patterns can cause serious performance problems. For example, a pattern that starts with a wildcard such as .* has to be checked against every term in the inverted index, much like a full table scan.

POST /book/_search
{
  "query": {
    "regexp":{
      "name": "s.*"
    }
  }
}

With boost value:

POST /book/_search
{
  "query": {
    "regexp":{
      "name":{
        "value":"s.*",
        "boost":1.2
      }
    }
  }
}
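
Regarding the performance note at the start of this section, one way to keep a regexp cheap is to start the pattern with a literal prefix so that fewer terms have to be examined. The pattern below is only an illustration: it fixes the prefix so and then requires one or more lowercase letters. Regexp patterns are matched against the whole term, so no anchors are needed.

```
# Fixed literal prefix "so", followed by one or more lowercase letters
POST /book/_search
{
  "query": {
    "regexp": {
      "name": {
        "value": "so[a-z]+"
      }
    }
  }
}
```

Against the sample data this should match the token solr, just like s.*, while giving the query a narrower set of candidate terms.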

Fuzzy Query

POST /book/_search
{
  "query": {
    "fuzzy" : {
      "name" : "sol"
    }
  }
}

POST /book/_search
{
  "query": {
    "fuzzy" : {
      "name" : "so"
    }
  }
}

POST /book/_search
{
  "query": {
    "fuzzy" : {
      "name" : {
        "value": "so",
        "fuzziness": 2
      }
    }
  }
}

Matching typos (by default, swapping two adjacent characters counts as a single edit):

POST /book/_search
{
  "query": {
    "fuzzy" : {
      "name" : {
        "value": "sorl"
      }
    }
  }
}

POST /book/_search
{
  "query": {
    "fuzzy" : {
      "name" : {
        "value": "osrl",
        "fuzziness":2
      }
    }
  }
}
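
The examples above behave differently because of the default fuzziness, AUTO, which scales the allowed edit distance with the length of the search value: no edits for 1 to 2 characters, one edit for 3 to 5, and two edits for longer values. That is why so only matches once fuzziness is raised to 2 explicitly. The sketch below spells out AUTO and adds two parameters that limit the work a fuzzy query does; the values 1 and 20 are illustrative assumptions, not tuned recommendations.

```
POST /book/_search
{
  "query": {
    "fuzzy" : {
      "name" : {
        "value": "sorl",
        "fuzziness": "AUTO",
        "prefix_length": 1,
        "max_expansions": 20
      }
    }
  }
}
```

Here prefix_length keeps the first character fixed, and max_expansions caps the number of candidate terms the query expands to. The request should still match solr, since sorl differs from it by one transposition.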

IDs Query

POST /book/_search
{
  "query": {
    "ids" : {
      "values" : ["1", "3"]
    }
  }
}

Compound Query - Bool Query

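The bool query combines several query clauses into one query using the following keywords:

must: The clause must match, and it contributes to the score
filter: The clause must match, but it runs in filter context: a yes/no check that is fast, can be cached, and does not affect scoring
should: The clause is optional (OR semantics); matching clauses raise the score, and if the bool query has no must or filter clause, at least one should clause must match
must_not: The clause must not match; it runs in filter context and does not affect scoring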
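
One consequence of these rules is easy to miss: as soon as a bool query contains a must or filter clause, its should clauses become purely optional. The sketch below shows how minimum_should_match can make at least one of them mandatory again; the field values are taken from the sample data.

```
POST /book/_search
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "price": { "gte": 100, "lte": 1000 } } }
      ],
      "should": [
        { "term": { "name": "lucene" } },
        { "term": { "name": "solr" } }
      ],
      "minimum_should_match": 1
    }
  }
}
```

Without minimum_should_match, all four sample documents would pass the price filter and be returned. With it, only documents 1 and 2 should remain.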

Example business requirements:

description must contain Java (placed in filter, so it does not affect scoring)
price must be between 100 and 1000, inclusive
name can be either lucene or solr
timestamp must not fall inside the given date range

POST /book/_search
{
  "query": {
    "bool": {
      "filter": {
        "match": {
          "description": "java"
        }
      },
      "must": [
        {
          "range": {
            "price": {
              "gte": 100,
              "lte": 1000
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "name": "lucene"
                }
              },
              {
                "term": {
                  "name": "solr"
                }
              }
            ]
          }
        }
      ],
      "must_not": [
        {
          "range": {
            "timestamp": {
              "gte": "18/08/2020",
              "lte": "2021",
              "format": "dd/MM/yyyy||yyyy"
            }
          }
        }
      ]
    }
  }
}
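
As a variation, if the relevance of description should drive the ranking, the clauses can be arranged differently. The sketch below is an alternative arrangement, not the query above: it moves the match into must so that it is scored, moves the yes/no conditions into filter, and collapses the two term clauses into a single terms query. It should select the same documents, but the scores will differ.

```
POST /book/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "description": "java" } }
      ],
      "filter": [
        { "range": { "price": { "gte": 100, "lte": 1000 } } },
        { "terms": { "name": ["lucene", "solr"] } }
      ],
      "must_not": [
        {
          "range": {
            "timestamp": {
              "gte": "18/08/2020",
              "lte": "2021",
              "format": "dd/MM/yyyy||yyyy"
            }
          }
        }
      ]
    }
  }
}
```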

Error Quick Reference

Symptom	Root Cause	Fix
`term` query on a text field (for example, Chinese content) has a very low hit rate	The text field is tokenized at index time, and `term` compares the unanalyzed search value against individual tokens	For exact matches, query a `.keyword` subfield or map the field as `keyword`
`range` query on a date returns no results or reports a date parsing error	The date string in `gte`/`lte` matches neither the query's `format` nor the `format` in the field mapping	Align the date strings with the `format` in the mapping, or set a matching `format` in the query
`exists` query has an unusually low hit count	Field never actually written, field name misspelled, or field dynamically mapped as an object/nested structure	Inspect `_source` to see the original document structure and confirm the field path and name
`prefix` / `regexp` query causes high CPU and slow responses	Prefix/regexp scan over a high-cardinality field, or a pattern that starts with a wildcard such as `.*`	Give the pattern a fixed prefix where possible; avoid patterns like `.*xxx`
`fuzzy` query returns unstable results or is significantly slower	`fuzziness` set too high, so too many terms fall within the allowed edit distance	Keep `fuzziness` at 1 or 2, or use `AUTO`
`bool` query returns more or fewer results than expected	Semantics of `must`, `should`, `filter`, `must_not` confused	Check each clause against the meaning of its keyword: required vs. optional, scoring vs. filter context
Some IDs are not hit by an `ids` query	IDs written as a mix of strings and numbers, or index name inconsistent	Keep the ID type consistent; confirm the index name is correct
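
For the first row of the table, the book mapping used in this article defines name as text only, so there is no name.keyword subfield to query. A minimal sketch of a mapping that adds one is shown below. The index name book_v2 is hypothetical, and the example assumes the documents are indexed again into the new index.

```
# Hypothetical index; "keyword" is a multi-field that stores the
# original value unanalyzed, alongside the analyzed text
PUT /book_v2
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
          "keyword": { "type": "keyword" }
        }
      }
    }
  }
}

# After indexing documents into book_v2, match the whole original value
POST /book_v2/_search
{
  "query": {
    "term": {
      "name.keyword": "ElasticSearch"
    }
  }
}
```

Because the keyword subfield is not analyzed, the term value has to reproduce the stored name exactly, including its capitalization.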