Overview

This article introduces Elasticsearch's aggregation features and walks through the usage of Metrics Aggregations and Bucket Aggregations with examples.

Aggregation Syntax

"aggregations" : {
  "<aggregation_name>" : {
    "<aggregation_type>" : {
      <aggregation_body>
    }
    [,"aggregations" : { [<sub_aggregation>]+ } ]?
  }
}

Note: the "aggregations" key can be abbreviated as "aggs".
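The nesting in the syntax template can be easier to follow when the request body is assembled programmatically. The following Python sketch builds such a body as a plain dictionary. The helper name agg is hypothetical and not part of any Elasticsearch client; the field price and the aggregation names are borrowed from the examples later in this article.

```python
import json

def agg(name, agg_type, body, sub_aggs=None):
    """Build one named aggregation following the syntax template (hypothetical helper)."""
    node = {agg_type: body}
    if sub_aggs:
        # Sub-aggregations sit next to the aggregation type, under "aggs".
        node["aggs"] = sub_aggs
    return {name: node}

# A range bucket aggregation with a nested avg metric.
request_body = {
    "size": 0,  # return only aggregation results, no hits
    "aggs": agg(
        "group_by_price",
        "range",
        {"field": "price", "ranges": [{"from": 0, "to": 200}]},
        sub_aggs=agg("average_price", "avg", {"field": "price"}),
    ),
}

print(json.dumps(request_body, indent=2))
```

The printed JSON can be pasted as the body of a POST /book/_search request.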


1. Metrics Aggregations

1. Basic Statistics Types

| Type | Description |
| --- | --- |
| avg | Calculate average |
| sum | Calculate sum |
| min | Minimum |
| max | Maximum |
| value_count | Count |
| cardinality | Distinct count |
| stats | Return min/max/avg/count/sum |
| extended_stats | Extended statistics (including standard deviation, etc.) |

2. Example: Query Max Value

POST /book/_search
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}

3. Example: Conditional Query Count

POST /book/_count
{
  "query": {
    "range": {
      "price" : {
        "gt": 100
      }
    }
  }
}

4. Example: Count Documents with Values

POST /book/_search
{
  "size": 0,
  "aggs": {
    "book_nums": {
      "value_count": {
        "field": "price"
      }
    }
  }
}

5. Example: Distinct Count

POST /book/_search?size=0
{
  "aggs": {
    "price_count": {
      "cardinality": {
        "field": "price"
      }
    }
  }
}
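The cardinality aggregation returns an approximate count, and its optional precision_threshold parameter trades memory for accuracy. As a minimal sketch, the Python function below builds the request body with that parameter set. The function name cardinality_request is hypothetical, and the cap of 40000 reflects the documented maximum, above which larger values have no additional effect.

```python
MAX_PRECISION_THRESHOLD = 40000  # values above this behave like 40000

def cardinality_request(field, precision_threshold=None):
    """Build a cardinality aggregation body (hypothetical helper)."""
    body = {"field": field}
    if precision_threshold is not None:
        # Clamp rather than fail, since higher values bring no extra accuracy.
        body["precision_threshold"] = min(precision_threshold, MAX_PRECISION_THRESHOLD)
    return {"aggs": {"price_count": {"cardinality": body}}}

request_body = cardinality_request("price", precision_threshold=10000)
print(request_body)
```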

6. Example: Stats

POST /book/_search?size=0
{
  "aggs": {
    "price_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}
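To make the five values returned by stats concrete, here is a small Python sketch that computes the same figures locally over a list of prices. The sample data is hypothetical, and the function only mirrors the meaning of the fields; it is not how Elasticsearch computes them across shards.

```python
def stats(values):
    """Compute count, min, max, avg, and sum for a list of numbers."""
    count = len(values)
    if count == 0:
        # No values: avoid dividing by zero when computing the average.
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    total = float(sum(values))
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": total / count,
        "sum": total,
    }

prices = [50, 150, 250, 450]  # hypothetical sample data
price_stats = stats(prices)
print(price_stats)
```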

7. Example: Extended Stats

POST /book/_search?size=0
{
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}

In addition to the stats fields, the response includes the sum of squares, the variance, the standard deviation, and the bounds at the average ± 2 standard deviations (std_deviation_bounds).
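The extra extended_stats fields follow from simple formulas. The Python sketch below computes them locally, assuming the population variance (dividing by the number of values) and the default bound width of 2 standard deviations. The sample prices are hypothetical and chosen so that the results are round numbers.

```python
import math

def extended_stats(values, sigma=2):
    """Compute the extra extended_stats fields for a non-empty list of numbers."""
    count = len(values)
    avg = sum(values) / count
    sum_of_squares = float(sum(v * v for v in values))
    # Population variance: mean squared deviation from the average.
    variance = sum((v - avg) ** 2 for v in values) / count
    std_deviation = math.sqrt(variance)
    return {
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": std_deviation,
        "std_deviation_bounds": {
            "upper": avg + sigma * std_deviation,
            "lower": avg - sigma * std_deviation,
        },
    }

prices = [20, 40, 40, 40, 50, 50, 70, 90]  # hypothetical sample data
result = extended_stats(prices)
print(result)
```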

8. Example: Percentiles

POST /book/_search?size=0
{
  "aggs": {
    "price_percents": {
      "percentiles": {
        "field": "price",
        "percents" : [75, 99, 99.9]
      }
    }
  }
}

9. Example: Percentile Ranks

POST /book/_search?size=0
{
  "aggs": {
    "price_perc_rank": {
      "percentile_ranks": {
        "field": "price",
        "values": [100, 200]
      }
    }
  }
}
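A percentile rank is the inverse of a percentile: given a value, it reports what percentage of observed values fall at or below it. The Python sketch below computes this exactly over a small, hypothetical list of prices. Elasticsearch computes percentile ranks approximately, so its results on real data may differ slightly from an exact calculation like this one; the "at or below" convention is also a choice made for this sketch.

```python
def percentile_rank(values, threshold):
    """Exact percentage of values <= threshold (convention chosen for this sketch)."""
    if not values:
        return None
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)

prices = [50, 150, 250, 450]  # hypothetical sample data
ranks = {t: percentile_rank(prices, t) for t in [100, 200]}
print(ranks)
```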

2. Bucket Aggregations

1. Common Bucket Aggregation Types

| Type | Description |
| --- | --- |
| terms | Group by field value (similar to GROUP BY) |
| range | Group by numeric range |
| date_histogram | Group by time interval |
| histogram | Group by numeric interval |
| filter | Group by condition |
| filters | Multi-condition grouping |
| geo_distance | Group by geographic location distance |

2. Example: Range Group + Nested Aggregation

POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {"from": 0, "to": 200},
          {"from": 200, "to": 400},
          {"from": 400, "to": 1000}
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
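The request above first assigns each document to a range bucket and then runs the avg metric inside every bucket. The Python sketch below imitates that behavior locally over hypothetical prices. It follows the range aggregation's rule that from is inclusive and to is exclusive; the bucket key format is simplified for illustration and is not the exact key Elasticsearch returns.

```python
def range_buckets(values, ranges):
    """Group values into [from, to) ranges and average each group."""
    buckets = []
    for r in ranges:
        # "from" is inclusive, "to" is exclusive.
        members = [v for v in values if r["from"] <= v < r["to"]]
        buckets.append({
            "key": f'{r["from"]}-{r["to"]}',  # simplified key format
            "doc_count": len(members),
            "average_price": sum(members) / len(members) if members else None,
        })
    return buckets

prices = [50, 150, 200, 350, 400, 900]  # hypothetical sample data
ranges = [
    {"from": 0, "to": 200},
    {"from": 200, "to": 400},
    {"from": 400, "to": 1000},
]
buckets = range_buckets(prices, ranges)
for bucket in buckets:
    print(bucket)
```

Note that the price 200 lands in the second bucket and 400 in the third, because the upper bound of each range is excluded.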

3. Example: Implement HAVING Effect (bucket_selector)

POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {"from": 0, "to": 200},
          {"from": 200, "to": 400},
          {"from": 400, "to": 1000}
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        },
        "having": {
          "bucket_selector": {
            "buckets_path": {
              "avg_price": "average_price"
            },
            "script": {
              "source": "params.avg_price >= 200"
            }
          }
        }
      }
    }
  }
}
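The bucket_selector step works on the buckets after their metrics have been computed: buckets_path maps a script variable to a metric, and the script decides whether each bucket is kept. The Python sketch below imitates that filtering on a hard-coded, hypothetical list of buckets, with a lambda standing in for the script. It is an illustration of the HAVING-like effect only, not of how Elasticsearch evaluates pipeline aggregations.

```python
# Hypothetical buckets, as they might look after the avg sub-aggregation ran.
buckets = [
    {"key": "0-200", "doc_count": 2, "average_price": 100.0},
    {"key": "200-400", "doc_count": 2, "average_price": 275.0},
    {"key": "400-1000", "doc_count": 2, "average_price": 650.0},
]

def bucket_selector(buckets, buckets_path, predicate):
    """Keep only the buckets for which the predicate returns True."""
    kept = []
    for bucket in buckets:
        # Resolve each script variable to the metric it points at.
        params = {var: bucket[path] for var, path in buckets_path.items()}
        if predicate(params):
            kept.append(bucket)
    return kept

kept = bucket_selector(
    buckets,
    {"avg_price": "average_price"},
    lambda params: params["avg_price"] >= 200,  # stands in for the script
)
print([b["key"] for b in kept])
```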

3. Common Error Quick Reference

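| Symptom | Root Cause | Solution |
| --- | --- | --- |
| Fielddata is disabled on text fields | Aggregating on a text field | Aggregate on the .keyword sub-field instead |
| Aggregation result is empty | Field doesn't exist or is null | Confirm with an exists query, then backfill the missing data |
| cardinality result deviates significantly | precision_threshold is too low | Increase precision_threshold (maximum 40000) |
| too_many_buckets_exception | The request produces too many buckets | Reduce the bucket count (smaller size, coarser intervals) or page through buckets with a composite aggregation |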
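As an illustration of the first fix in the table, the Python sketch below builds a terms aggregation that targets the .keyword sub-field when the field is a text field. It assumes the index uses the default dynamic mapping, which adds a keyword sub-field to string fields. The function name terms_request and the field name name are hypothetical and do not come from the book index used above.

```python
def terms_request(field, is_text=False, size=10):
    """Build a terms aggregation body (hypothetical helper)."""
    # Text fields cannot be aggregated directly; use the keyword sub-field.
    target = f"{field}.keyword" if is_text else field
    return {
        "size": 0,
        "aggs": {
            f"group_by_{field}": {
                "terms": {"field": target, "size": size}
            }
        },
    }

request_body = terms_request("name", is_text=True)
print(request_body)
```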

Summary

Combining Metrics Aggregations and Bucket Aggregations lets a single query handle statistics, grouping, filtering, and HAVING-like logic, which suits scenarios such as log analysis, report statistics, and operations dashboards.