Summary
This article explains core Query DSL usage in Elasticsearch 7.3, focusing on how match, match_phrase, query_string, multi_match, and other full-text queries differ and where they cause trouble in real business scenarios. Using a complete index mapping, sample data, and Kibana Dev Tools requests, it moves from match_all, to OR/AND control in match, to ordered matching and slop tolerance in match_phrase, and finally to logical expressions, multi-field search, and fuzzy matching with query_string.
1. Overview
Elasticsearch provides a complete JSON-based query DSL (Domain Specific Language) for defining queries. Think of the query DSL as an AST (Abstract Syntax Tree) of queries, composed of two types of clauses:
- Leaf query clauses: Look for specific values in specific fields, like match, term, and range queries
- Compound query clauses: Wrap other leaf or compound queries, and are used to combine multiple queries logically (like bool or dis_max queries) or to change their behavior (like constant_score queries)
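The two clause types can be sketched as plain Python dictionaries that serialize to the JSON body Elasticsearch expects. This is only an illustration: the helper names are hypothetical, and the title and price fields are borrowed from the index created later in this article.

```python
import json

def match_clause(field, text):
    """Leaf clause: look for analyzed text in one field."""
    return {"match": {field: text}}

def range_clause(field, gte=None, lte=None):
    """Leaf clause: restrict one field to a range of values."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    return {"range": {field: bounds}}

def bool_clause(must=(), filters=()):
    """Compound clause: wrap other clauses and combine them logically."""
    return {"bool": {"must": list(must), "filter": list(filters)}}

# A compound bool query wrapping two leaf queries.
query = {
    "query": bool_clause(
        must=[match_clause("title", "小米")],
        filters=[range_clause("price", lte=3000)],
    )
}
print(json.dumps(query, ensure_ascii=False, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request in Kibana Dev Tools.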
2. Query All (match_all)
Example
POST /wzkicu-index/_search
{
"query":{
"match_all": {}
}
}
Return Result Analysis
After execution, the response contains the following fields:
- took: Query time in milliseconds
- timed_out: Whether the query timed out
- _shards: Shard information
- hits: Search result overview object
- total: Total number of matching documents
- max_score: Highest score among all result documents
- _index: Index the document belongs to
- _type: Document type
- _id: Document id
- _score: Document score
- _source: Document source data
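As a rough illustration of how these fields nest, the sketch below reads them from a response that has already been parsed into a Python dictionary. The sample values are invented for illustration and the `summarize` helper is hypothetical. In 7.x, `hits.total` is an object with `value` and `relation`, while older versions return a plain integer, so the helper accepts both.

```python
# Hypothetical response, shaped like a 7.x _search result.
sample_response = {
    "took": 3,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {
                "_index": "wzk-property",
                "_type": "_doc",
                "_id": "1",
                "_score": 1.0,
                "_source": {"title": "小米手机", "price": 2699},
            }
        ],
    },
}

def summarize(response):
    """Pull the commonly inspected fields out of a search response."""
    hits = response["hits"]
    total = hits["total"]
    # 7.x returns {"value": ..., "relation": ...}; older versions an integer.
    count = total["value"] if isinstance(total, dict) else total
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "total": count,
        "max_score": hits["max_score"],
        "documents": [(h["_id"], h["_score"], h["_source"]) for h in hits["hits"]],
    }

summary = summarize(sample_response)
print(summary)
```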
3. Full-text Query
Full-text queries search analyzed text fields such as an email body or a product description. The query string is processed with the same analyzer that was applied to the field at index time.
Full-text queries include match, match_phrase, query_string, multi_match, and others.
3.1 Match Query
The standard query for full-text search, with relatively loose matching conditions:
- A field name must be specified
- The input text is tokenized, e.g., “hello world” is split into hello and world before matching
- A document matches if the field content contains hello or world
- match is a loose, partial match on analyzed tokens, not an exact match on the whole string
A match query accepts text, numbers, or dates, analyzes the input, and builds a boolean query from the resulting tokens. The operator parameter (or, and; default is or) controls how the tokens are combined.
Create Index
PUT /wzk-property
{
"settings": {},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "ik_max_word"
},
"images": {
"type": "keyword"
},
"price": {
"type": "float"
}
}
}
}
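Every query in this article depends on how the title analyzer splits text, so it can help to inspect the tokens before searching. The sketch below builds a request for the `_analyze` API using only the Python standard library. It assumes the IK plugin is installed and that the cluster is reachable at a base URL you supply (for example `http://localhost:9200`, which is an assumption, not something this article configures). The function names are hypothetical.

```python
import json
import urllib.request

def build_analyze_request(index, field, text):
    """Return (path, body) for _analyze, using the analyzer mapped to the field."""
    return f"/{index}/_analyze", {"field": field, "text": text}

def analyze(base_url, index, field, text):
    """Send the request and return the token strings (needs a live cluster)."""
    path, body = build_analyze_request(index, field, text)
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [token["token"] for token in payload["tokens"]]

path, body = build_analyze_request("wzk-property", "title", "小米电视4A")
print(path, json.dumps(body, ensure_ascii=False))
```

Calling `analyze("http://localhost:9200", "wzk-property", "title", "小米电视4A")` against a running cluster would return the tokens that match and match_phrase actually compare.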
Add Data
POST /wzk-property/_doc/
{
"title": "小米电视4A",
"images": "https://profile-avatar.csdnimg.cn/xxx.jpg",
"price": 4288
}
POST /wzk-property/_doc/
{
"title": "小米手机",
"images": "https://profile-avatar.csdnimg.cn/xxx.jpg",
"price": 2699
}
POST /wzk-property/_doc/
{
"title": "华为手机",
"images": "https://profile-avatar.csdnimg.cn/xxx.jpg",
"price": 5699
}
OR Match (Default)
POST /wzk-property/_search
{
"query":{
"match":{
"title":"小米电视4A"
}
}
}
Result: The query returns not only the Xiaomi TV (小米电视4A) but also the Xiaomi phone (小米手机). match defaults to an OR relationship, so a document matches if any token from the query matches.
AND Match
To require that every token matches, set operator to and:
POST /wzk-property/_search
{
"query": {
"match": {
"title": {
"query": "小米电视4A",
"operator": "and"
}
}
}
}
Result: Only the Xiaomi TV 4A (小米电视4A) matches.
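The OR and AND results above can be reproduced with a small pure-Python model of how match combines tokens. This is a simplified sketch, not Elasticsearch's implementation. The token lists are an assumption about what ik_max_word produces for these titles, so check the real output with `_analyze`.

```python
def match(doc_tokens, query_tokens, operator="or"):
    """Model of match: test each query token against the document's tokens."""
    doc_terms = set(doc_tokens)
    found = [token in doc_terms for token in query_tokens]
    if operator == "or":
        return any(found)   # any single token is enough
    if operator == "and":
        return all(found)   # every token must be present
    raise ValueError("operator must be 'or' or 'and'")

# Assumed tokenization of the three sample titles.
docs = {
    "小米电视4A": ["小米", "电视", "4a"],
    "小米手机": ["小米", "手机"],
    "华为手机": ["华为", "手机"],
}
query_tokens = ["小米", "电视", "4a"]  # assumed tokens of the query "小米电视4A"

or_hits = [title for title, tokens in docs.items() if match(tokens, query_tokens, "or")]
and_hits = [title for title, tokens in docs.items() if match(tokens, query_tokens, "and")]
print(or_hits)   # ['小米电视4A', '小米手机']
print(and_hits)  # ['小米电视4A']
```

Under these assumed tokens, 小米手机 matches in OR mode only because it shares the single token 小米 with the query.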
3.2 Match Phrase Query
A match query tokenizes the query text, and the text field was tokenized at index time. match_phrase goes further: every token from the query must appear in the text field, in the same order, and at consecutive positions.
Basic Usage
POST /wzk-property/_search
{
"query": {
"match_phrase": {
"title": "小米电视"
}
}
}
Order Requirement
POST /wzk-property/_search
{
"query": {
"match_phrase": {
"title": "电视小米"
}
}
}
Because the token order of “电视小米” differs from the “小米电视” order stored in the document, nothing matches.
slop Parameter (Word Skip Tolerance)
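The slop parameter sets how far tokens may be moved for the phrase to still match. With slop 1, match_phrase can skip one token, so “小米4A” can match a title in which one other token sits between 小米 and 4A: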
POST /wzk-property/_search
{
"query": {
"match_phrase": {
"title": {
"query": "小米4A",
"slop": 1
}
}
}
}
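The order and slop behaviour can be approximated with a short position-based model: each query token's document position is shifted back by its offset in the phrase, and the phrase matches when the spread of the shifted positions is at most slop. This is a simplified sketch that ignores repeated terms and scoring. The token list is an assumed ik_max_word result for “小米电视4A”.

```python
from itertools import product

def phrase_matches(doc_tokens, query_tokens, slop=0):
    """Simplified match_phrase model based on token positions."""
    candidates = []
    for term in query_tokens:
        positions = [i for i, token in enumerate(doc_tokens) if token == term]
        if not positions:
            return False  # every query token must occur in the field
        candidates.append(positions)
    for combo in product(*candidates):
        # Align each position to the start of the phrase.
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False

title = ["小米", "电视", "4a"]  # assumed tokens of "小米电视4A"

in_order = phrase_matches(title, ["小米", "电视"])            # True
reversed_phrase = phrase_matches(title, ["电视", "小米"])     # False
skip_strict = phrase_matches(title, ["小米", "4a"])           # False
skip_slop1 = phrase_matches(title, ["小米", "4a"], slop=1)    # True
reversed_slop2 = phrase_matches(title, ["电视", "小米"], slop=2)  # True
```

In this model, swapping two adjacent tokens costs 2, which is why a reversed phrase needs a slop of at least 2 rather than 1.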
3.3 Query String Query
This query is similar to match, but where match requires a field name, query_string searches all fields by default, so its scope is wider.
The query_string query is an advanced query that can match documents without naming a specific field, and it can also be restricted to specific fields.
Broad Query
POST /wzk-property/_search
{
"query": {
"query_string": {
"query": "2699"
}
}
}
Specify Field Query
POST /wzk-property/_search
{
"query": {
"query_string": {
"query": "2699",
"default_field": "title"
}
}
}
Logical Query (OR/AND)
POST /wzk-property/_search
{
"query": {
"query_string": {
"query": "手机 OR 小米",
"default_field": "title"
}
}
}
POST /wzk-property/_search
{
"query": {
"query_string": {
"query": "手机 AND 小米",
"default_field": "title"
}
}
}
Fuzzy Query
Append ~ to a term for fuzzy matching. ~1 allows an edit distance of 1, that is, one character inserted, deleted, substituted, or transposed:
POST /wzk-property/_search
{
"query": {
"query_string": {
"query": "小米~1",
"default_field": "title"
}
}
}
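To make "edit distance" concrete, here is a minimal sketch of the distance that fuzzy matching is based on, counting insertions, deletions, substitutions, and swaps of adjacent characters. It is an illustration only. Elasticsearch compares the query term against indexed terms, so the analyzer's output decides which terms are candidates at all.

```python
def edit_distance(a, b):
    """Edit distance with adjacent transpositions (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

print(edit_distance("小米", "大米"))  # 1, within reach of 小米~1
print(edit_distance("小米", "米小"))  # 1, a single transposition
print(edit_distance("小米", "华为"))  # 2, out of reach of 小米~1
```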
Multi-field Support
POST /wzk-property/_search
{
"query": {
"query_string" : {
"query":"2699",
"fields": ["title","price"]
}
}
}
3.4 Multi-match Query
To search text across multiple fields, use multi_match. It builds on match and runs the same text query against several fields.
Basic Usage
POST /wzk-property/_search
{
"query": {
"multi_match" : {
"query":"小米4A",
"fields": ["title","images"]
}
}
}
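When one field should count for more than another, multi_match accepts a per-field boost written as `field^weight`. The sketch below builds such a body from a dictionary of weights. The helper name and the weight of 3 are illustrative assumptions, not values tuned for this data.

```python
import json

def multi_match_body(text, field_weights):
    """Build a multi_match request body, adding ^weight to boosted fields."""
    fields = [
        name if weight == 1 else f"{name}^{weight}"
        for name, weight in field_weights.items()
    ]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}

# Hypothetical weighting: a hit in title counts three times as much as in images.
body = multi_match_body("小米4A", {"title": 3, "images": 1})
print(json.dumps(body, ensure_ascii=False, indent=2))
```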
4. Error Quick Reference
| Symptom | Root Cause | Location | Fix |
|---|---|---|---|
| A match query for “小米电视4A” also returns “小米手机” | match defaults to operator=OR, so after Chinese tokenization a document matches on any single token | Run _analyze in Kibana on the title text to inspect the tokens and confirm tokenization granularity | Explicitly set "operator": "and" in match, or use a keyword (exact match) field for product names |
| match_phrase can’t find expected docs (e.g., “电视小米” and “小米4A” both return nothing) | match_phrase requires tokens in the same order at consecutive positions, and by default allows no skipped tokens | Use _analyze to see the token order of the phrase and compare it with the token order of the actual _source.title | Set "slop": N to relax the position constraint, or rewrite the phrase to match the document |
| query_string reports a parse error or finds no data | query_string syntax is complex: logical operators or special characters are not escaped, or default_field doesn’t contain the target content | Read the Kibana error message, then simplify the query string step by step down to a single word to verify | Avoid passing user input directly; escape reserved characters such as + - && |
| query_string fuzzy query “小米~1” returns too many or too few results | Fuzzy matching is based on edit distance and is affected by the analyzer; it does not mean “looks similar, so it hits” | Test the same word with _analyze and with query_string, and observe which terms actually hit | Decide how much fuzziness the business can accept and set 1 or 2 accordingly; where necessary switch to a more explicit scheme such as prefix/wildcard queries or a pinyin index |
| multi_match cross-field query hits are not as expected | The fields differ in type and analyzer, and one field dominates the score, skewing ranking or hits | Check each field’s type and analyzer in the mapping and compare with a single-field match | Explicitly set field weights in multi_match (e.g., "title^3"), unify the analyzers of text fields, and avoid mixing keyword and text fields in a confusing way |
| match_all finds documents, but no full-text query does | The field is mapped as keyword or is not indexed, or the full-text query targets the wrong field | Use the mapping API to confirm the field type and index attribute, and check the field name spelling in the query JSON | Change the field to text or define a suitable multi-field (text + keyword), correct the field name in the DSL, then rebuild the index and verify again |
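The query_string row above recommends escaping reserved characters before passing user input. A minimal sketch of such a helper follows. The function name is hypothetical. It backslash-escapes the reserved characters of the query_string syntax and removes `<` and `>`, which cannot be escaped and have to be dropped.

```python
import re

_UNESCAPABLE = re.compile(r"[<>]")
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')

def escape_query_string(user_input):
    """Neutralize query_string syntax characters in free-form user input."""
    cleaned = _UNESCAPABLE.sub("", user_input)   # < and > cannot be escaped
    return _RESERVED.sub(r"\\\1", cleaned)       # prefix the rest with a backslash

print(escape_query_string("小米+4A"))      # 小米\+4A
print(escape_query_string("a && b"))       # a \&\& b
print(escape_query_string("price:<3000"))  # price\:3000
```

Uppercase AND, OR, and NOT are still interpreted as operators, so this sketch does not neutralize them. When the escaped string is placed in a request body, serialize it with a JSON library so the backslashes are encoded correctly.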
5. Summary
Understanding the matching boundaries and tokenization semantics of match, match_phrase, query_string, and multi_match is the key to whether a Chinese search “can find” what it should.
- match: Tokenizes, then matches any token (OR) or all tokens (AND)
- match_phrase: Matches tokens in order at consecutive positions; slop allows skipped tokens
- query_string: Supports logical operators, multi-field search, and fuzzy queries
- multi_match: Full-text search across multiple fields, with field weights