Elasticsearch Mapping & Document CRUD Practice (Based on ...

Mapping Operations

After creating an index, define its field constraints, known as the field mapping (mapping). A mapping specifies each field's data type, whether the field is stored separately, whether it is indexed, and which analyzer it uses.

Create Mapping Field

Syntax:

PUT /index_name/_mapping
{
  "properties": {
    "field_name": {
      "type": "data_type",
      "index": true,
      "store": false,
      "analyzer": "analyzer"
    }
  }
}

Example:

PUT /wzkicu-index

PUT /wzkicu-index/_mapping
{
  "properties": {
    "name": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "job": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "logo": {
      "type": "keyword",
      "index": false
    },
    "payment": {
      "type": "float"
    }
  }
}

Mapping Properties Detail

type:
- String types: text is tokenized (analyzed) for full-text search and cannot be used in aggregations by default. keyword is not tokenized; the value is matched as a whole and can be used in aggregations.
- Numeric types: two categories, the basic types (such as long, integer, short, byte, double, float, half_float) and the high-precision floating-point type scaled_float.
- Date: the date type. ES can store formatted date strings, but storing the millisecond timestamp as a long is recommended because it saves space.
- Array type: when matching, the document matches if any element of the array satisfies the condition.
- Object type: for example {name: "jack", age: 21, girl: {name: "Rose", age: 21}}. When indexed as an object type, girl is flattened into the fields girl.name and girl.age.
index: Whether the field is indexed. With true (the default) the field is indexed and can be searched; with false it is not indexed and cannot be searched.
store: Whether to store the field value separately from _source. The original document is kept in _source, and by default (false) individual fields are not stored separately but are extracted from _source.
analyzer: The analyzer to use. For Chinese text, the IK analyzer (ik_max_word or ik_smart) is commonly used.
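
The object-type flattening described above can be illustrated with a small sketch. This is not how ES is implemented internally; it only mimics the resulting dotted field names. The helper name flatten is hypothetical.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted field paths (illustration only)."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Inner objects contribute their own fields under the parent path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {"name": "jack", "age": 21, "girl": {"name": "Rose", "age": 21}}
flat_doc = flatten(doc)
print(flat_doc)
```

The inner object does not survive as a single field: only girl.name and girl.age appear next to the top-level name and age.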

View Mapping

GET /index_name/_mapping
GET _mapping
GET _all/_mapping

Modify Mapping

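Important note: an existing mapping can only be extended with new fields. Changing an existing field, for example its type or analyzer, generally requires either deleting and recreating the index, or creating a new index with the corrected mapping and reindexing the data into it.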
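
Before submitting a mapping change, it can help to check on the client side whether the change only adds fields. The sketch below is a rough approximation under the assumption that field definitions are compared literally; ES itself remains the authority and may accept or reject changes that this check does not model. The function name non_additive_changes is hypothetical.

```python
def non_additive_changes(current, proposed):
    """Return fields in `proposed` that already exist in `current`
    with a different definition (i.e. not a pure addition)."""
    conflicts = []
    for field, definition in proposed.items():
        if field in current and current[field] != definition:
            conflicts.append(field)
    return conflicts


# "properties" section as it might be returned by GET /index_name/_mapping
current = {
    "name": {"type": "text", "analyzer": "ik_max_word"},
    "payment": {"type": "float"},
}
proposed = {
    "payment": {"type": "long"},   # changes an existing field
    "city": {"type": "keyword"},   # adds a new field
}

conflicts = non_additive_changes(current, proposed)
print(conflicts)
```

A non-empty result suggests that the change belongs in a new index rather than in a PUT to _mapping.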

Create Index and Mapping at Once

PUT /index_name
{
  "settings": {
    "index_property_name": "index_property_value"
  },
  "mappings": {
    "properties": {
      "field_name": {
        "mapping_property_name": "mapping_property_value"
      }
    }
  }
}
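
As a concrete instance of the template above, the following sketch assembles a request body for the wzkicu-index example. The settings number_of_shards and number_of_replicas are real index settings, but the values 1 and 0 are assumptions chosen for a single-node test setup, and build_create_index_body is a hypothetical helper. The script only prints the request; it does not contact a cluster.

```python
import json


def build_create_index_body(properties, shards=1, replicas=0):
    """Combine index settings and field mappings into one request body."""
    return {
        "settings": {
            "number_of_shards": shards,      # assumed value
            "number_of_replicas": replicas,  # assumed value
        },
        "mappings": {"properties": properties},
    }


body = build_create_index_body({
    "name": {"type": "text", "analyzer": "ik_max_word"},
    "job": {"type": "text", "analyzer": "ik_max_word"},
    "logo": {"type": "keyword", "index": False},
    "payment": {"type": "float"},
})

print("PUT /wzkicu-index")
print(json.dumps(body, indent=2))
```

The printed output can be pasted into the Kibana console as a single request.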

Document CRUD and Partial Update

A document is a piece of data in an index. It is indexed according to the mapping rules so that it can be searched, and is comparable to a row in a database table.

Add Document

Add with a specified ID:

POST /index_name/_doc/{id}
{
  "name" : "百度",
  "job" : "小度用户运营经理",
  "payment" : 30000,
  "logo" : "https://profile-avatar.csdnimg.cn/xxx.jpg"
}

Add with an auto-generated ID:

POST /index_name/_doc
{
  "field": "value"
}

Query Document

Single document:

GET /index_name/_doc/{id}

All documents:

POST /index_name/_search
{
  "query": {
    "match_all": {}
  }
}

Custom return fields:

GET /index_name/_doc/1?_source=name,job
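
When such request paths are built in application code, the field list is usually assembled from a variable. A minimal sketch using only the Python standard library follows; get_doc_path is a hypothetical helper, and only the path is constructed, no request is sent.

```python
from urllib.parse import quote, urlencode


def get_doc_path(index, doc_id, fields=None):
    """Build the GET path for one document, optionally limiting _source fields."""
    path = f"/{quote(index, safe='')}/_doc/{quote(str(doc_id), safe='')}"
    if fields:
        # Keep commas unescaped so the result matches the console form.
        path += "?" + urlencode({"_source": ",".join(fields)}, safe=",")
    return path


path = get_doc_path("wzkicu-index", 1, ["name", "job"])
print(path)
```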

Update Document

Full update:

PUT /index_name/_doc/{id}
{
  "name" : "百度",
  "job" : "百度测试",
  "payment" : 20000,
  "logo" : "https://xxx.jpg"
}

Partial update:

POST /index_name/_update/{id}
{
  "doc": {
    "field": "value"
  }
}

Note: ES does not modify a document in place. When it executes an update, it first marks the old document as deleted and then indexes a new document.
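
The behavior in this note can be pictured with a toy in-memory model. This is a deliberately simplified sketch, not the ES implementation: the class ToyIndex and the function merge_partial are hypothetical, and details such as no-op detection and segment merging are left out.

```python
import copy


def merge_partial(source, partial):
    """Merge a partial document into a copy of the existing source."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


class ToyIndex:
    def __init__(self):
        self.live = {}      # id -> (version, source)
        self.deleted = []   # old copies marked as deleted

    def index(self, doc_id, source):
        """Full write: the previous copy, if any, is marked as deleted."""
        version = 1
        if doc_id in self.live:
            old_version, old_source = self.live[doc_id]
            self.deleted.append((doc_id, old_version, old_source))
            version = old_version + 1
        self.live[doc_id] = (version, copy.deepcopy(source))
        return version

    def update(self, doc_id, partial):
        """Partial update: merge, then write a whole new document."""
        _, source = self.live[doc_id]
        return self.index(doc_id, merge_partial(source, partial))


idx = ToyIndex()
idx.index("1", {"name": "百度", "job": "小度用户运营经理", "payment": 30000})
new_version = idx.update("1", {"job": "百度测试"})
print(new_version, idx.live["1"][1], len(idx.deleted))
```

In this model a partial update is just a merge followed by a full write, which is why the old copy ends up in the deleted list and the version increases.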

Delete Document

Delete by ID:

DELETE /index_name/_doc/{id}

Delete by condition:

POST /index_name/_delete_by_query
{
  "query": {
    "match": {
      "field_name": "search_keyword"
    }
  }
}

Delete all:

POST /index_name/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

Error Quick Reference

Symptom	Root Cause	Fix
PUT index/_mapping fails with mapper_parsing_exception	The submitted type or analyzer is incompatible with the existing field definition	Check the index's current _mapping and compare the field types with the new configuration
Queries behave unexpectedly after "modifying" an existing field's type	The mapping was not actually changed (old segments still use the old mapping)	Design the mapping before creating the index; when it must change, create a new index and reindex
Query returns 0 hits although the data is visible in _source	The field has index: false, or is mapped as keyword while the query expects tokenized matching	Use text with an analyzer for full-text search; use keyword for exact matching
Fuzzy search on a keyword field, or Chinese search, gives very poor results	keyword is not tokenized and stores the whole value; Chinese text needs the text type with an analyzer	Common pattern: a text field with a keyword multi-field
POST index/_update/id reports a version conflict	Partial updates mixed with full overwrites, or concurrent writes to the same document	Use _update with doc for partial updates; under high concurrency, use optimistic locking or a retry strategy
Disk usage barely changes shortly after _delete_by_query	ES first marks documents as deleted; space is reclaimed only when segments are merged	Normal behavior; wait for background merges, or consider a force merge when disk pressure is high
Responses carry a large _source, causing noticeable latency and network overhead	The full _source with all fields is returned by default	Use ?_source=field1,field2 to limit the returned fields
A full-update PUT unexpectedly created a new document	The id did not exist, so ES treated the request as a document creation	Confirm the business semantics; for "update only if it exists", GET first or use a conditional scripted update
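
For the version conflict row, a retry strategy might look like the sketch below. It assumes the client surfaces a conflict as an exception; VersionConflict and update_with_retry are hypothetical names, and the flaky function only simulates a server that rejects the first two attempts. The _update API also accepts a retry_on_conflict query parameter, which moves the retry to the server side.

```python
class VersionConflict(Exception):
    """Hypothetical stand-in for an HTTP 409 version conflict response."""


def update_with_retry(do_update, max_attempts=3):
    """Call do_update until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return do_update()
        except VersionConflict:
            if attempt == max_attempts:
                raise  # give up and let the caller handle the conflict


calls = {"count": 0}


def flaky_update():
    # Simulated update: conflicts twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise VersionConflict("document was modified concurrently")
    return "updated"


result = update_with_retry(flaky_update)
print(result, calls["count"])
```

In real code, each attempt should re-read the document or rely on a partial update, so that the retry does not overwrite the concurrent change.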