Overview

This article introduces Elasticsearch's aggregation features, covering the usage of **metrics aggregations** and **bucket aggregations**.

Aggregation syntax

"aggregations" : {
  "<aggregation_name>" : {
    "<aggregation_type>" : {
      <aggregation_body>
    }
    [,"aggregations" : { [<sub_aggregation>]+ } ]?
  }
}

Note: the "aggregations" key can be abbreviated as "aggs".
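The nesting rule in the template above can be sketched in Python by building the request body as a plain dictionary. The helper name `agg` and the aggregation names are hypothetical, chosen only for illustration; the resulting body would be sent to `_search` with whatever client is in use.

```python
import json

def agg(agg_type, body, sub_aggs=None):
    """Build one named-aggregation entry: {<type>: <body>, "aggs": {...}}."""
    entry = {agg_type: body}
    if sub_aggs:
        # Sub-aggregations sit next to the aggregation type, not inside its body.
        entry["aggs"] = sub_aggs
    return entry

request_body = {
    "size": 0,  # return aggregation results only, no hits
    "aggs": {
        "group_by_price": agg(
            "range",
            {"field": "price", "ranges": [{"from": 0, "to": 200}]},
            sub_aggs={"average_price": agg("avg", {"field": "price"})},
        )
    },
}

print(json.dumps(request_body, indent=2))
```

The point of the helper is only to make the sibling relationship visible: the sub-aggregation block is a sibling of the aggregation type key, both under the aggregation name.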


I. Metrics aggregations

1. Basic statistical types

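| Type | Description |
| --- | --- |
| avg | Computes the average |
| sum | Computes the sum |
| min | Minimum value |
| max | Maximum value |
| value_count | Counts the values of a field |
| cardinality | Approximate count of distinct values |
| stats | Returns min, max, avg, count, and sum in one response |
| extended_stats | Extended statistics (adds standard deviation and related measures) |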

2. Example: finding the maximum value

POST /book/_search
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}

3. Example: counting documents that match a condition (_count API)

POST /book/_count
{
  "query": {
    "range": {
      "price" : {
        "gt": 100
      }
    }
  }
}

4. Example: counting the values of a field (value_count)

POST /book/_search
{
  "size": 0,
  "aggs": {
    "book_nums": {
      "value_count": {
        "field": "price"
      }
    }
  }
}
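As a rough illustration of what value_count measures, the following pure-Python sketch counts field values over a handful of in-memory documents. The sample documents and the helper `field_values` are hypothetical. It shows that the number of values and the number of documents holding a value can differ when a field is multi-valued or missing.

```python
docs = [
    {"title": "A", "price": 120},
    {"title": "B", "price": [80, 95]},   # multi-valued field: two values
    {"title": "C"},                      # no price: contributes nothing
    {"title": "D", "price": None},       # explicit null: also skipped
]

def field_values(doc, field):
    """Return the list of non-null values a document holds for a field."""
    value = doc.get(field)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

# What value_count reports: every extracted value is counted.
value_count = sum(len(field_values(d, "price")) for d in docs)
# Number of documents that hold at least one value.
docs_with_value = sum(1 for d in docs if field_values(d, "price"))

print(value_count)      # 3
print(docs_with_value)  # 2
```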

5. Example: distinct count

POST /book/_search?size=0
{
  "aggs": {
    "price_count": {
      "cardinality": {
        "field": "price"
      }
    }
  }
}
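The cardinality aggregation returns an approximate distinct count, and its precision_threshold option trades memory for accuracy. The sketch below, with made-up sample prices, computes the exact figure that cardinality approximates and builds a request body that sets precision_threshold; the value 1000 is only illustrative.

```python
prices = [120, 80, 95, 120, 80, 300]

# Exact distinct count; Elasticsearch's cardinality is an approximation of this.
exact_cardinality = len(set(prices))

# Request body with an explicit precision_threshold (1000 is an illustrative value).
request_body = {
    "size": 0,
    "aggs": {
        "price_count": {
            "cardinality": {"field": "price", "precision_threshold": 1000}
        }
    },
}

print(exact_cardinality)  # 4
```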

6. Example: stats

POST /book/_search?size=0
{
  "aggs": {
    "price_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}

7. Example: extended statistics

POST /book/_search?size=0
{
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}

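Returns, in addition to the stats fields: sum of squares, variance, standard deviation, and the interval of the mean ± 2 standard deviations.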
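To make these fields concrete, here is a minimal pure-Python sketch that computes them for a small list. It assumes population variance and a default of 2 standard deviations for the bounds; the function name and sample values are hypothetical, and this is not Elasticsearch's implementation.

```python
import math

def extended_stats(values, sigma=2.0):
    """Compute stats-like fields plus the extended ones for a non-empty list."""
    count = len(values)
    total = sum(values)
    avg = total / count
    sum_of_squares = sum(v * v for v in values)
    # Population variance: mean of the squares minus the square of the mean.
    variance = sum_of_squares / count - avg * avg
    std_deviation = math.sqrt(max(variance, 0.0))
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": std_deviation,
        "std_deviation_bounds": {
            "upper": avg + sigma * std_deviation,
            "lower": avg - sigma * std_deviation,
        },
    }

result = extended_stats([100, 200, 300])
print(result["avg"], result["sum_of_squares"])  # 200.0 140000
```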

8. Example: percentiles

POST /book/_search?size=0
{
  "aggs": {
    "price_percents": {
      "percentiles": {
        "field": "price",
        "percents" : [75, 99, 99.9]
      }
    }
  }
}
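A percentile answers "below which value do N percent of the observations fall". The sketch below computes exact percentiles with linear interpolation over a made-up price list. Elasticsearch uses an approximate algorithm for this aggregation, so its results on real data can differ slightly from this exact calculation.

```python
def percentile(values, percent):
    """Exact percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Fractional position of the requested percentile in the sorted list.
    position = (len(ordered) - 1) * percent / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

prices = [50, 80, 120, 150, 300]
p50 = percentile(prices, 50)
p75 = percentile(prices, 75)
print(p50, p75)  # 120.0 150.0
```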

9. Example: percentile ranks

POST /book/_search?size=0
{
  "aggs": {
    "gge_perc_rank": {
      "percentile_ranks": {
        "field": "price",
        "values": [100, 200]
      }
    }
  }
}
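A percentile rank is the inverse question: what percentage of observations fall at or below a given value. The following sketch computes it exactly for a made-up price list; Elasticsearch approximates this, so values near bucket edges may not match exactly.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are less than or equal to threshold."""
    if not values:
        raise ValueError("values must not be empty")
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)

prices = [50, 80, 120, 150, 300]
rank_100 = percentile_rank(prices, 100)
rank_200 = percentile_rank(prices, 200)
print(rank_100, rank_200)  # 40.0 80.0
```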

II. Bucket aggregations

1. Common bucket aggregation types

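| Type | Description |
| --- | --- |
| terms | Groups by field value (similar to GROUP BY) |
| range | Groups by numeric range |
| date_histogram | Groups by time interval |
| histogram | Groups by numeric interval |
| filter | Narrows to the documents matching one condition |
| filters | Groups by multiple conditions, one bucket per condition |
| geo_distance | Groups by distance from a geographic point |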

2. Example: range grouping with a nested aggregation

POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {"from": 0, "to": 200},
          {"from": 200, "to": 400},
          {"from": 400, "to": 1000}
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
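The request above can be mimicked in pure Python to show how values land in buckets. In the range aggregation, "from" is inclusive and "to" is exclusive, so a price of exactly 200 falls in the second bucket and a price of 1000 falls in none. The sample prices and the simplified bucket keys are assumptions for illustration.

```python
def range_buckets(values, ranges):
    """Group values into [from, to) ranges and average each bucket."""
    buckets = []
    for r in ranges:
        # "from" is inclusive and "to" is exclusive, as in the range aggregation.
        members = [v for v in values if r["from"] <= v < r["to"]]
        buckets.append({
            "key": f'{r["from"]}-{r["to"]}',
            "doc_count": len(members),
            "average_price": sum(members) / len(members) if members else None,
        })
    return buckets

prices = [50, 150, 200, 350, 400, 1000]
ranges = [
    {"from": 0, "to": 200},
    {"from": 200, "to": 400},
    {"from": 400, "to": 1000},
]
buckets = range_buckets(prices, ranges)
for b in buckets:
    print(b)
```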

3. Example: achieving a HAVING effect (bucket_selector)

POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {"from": 0, "to": 200},
          {"from": 200, "to": 400},
          {"from": 400, "to": 1000}
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        },
        "having": {
          "bucket_selector": {
            "buckets_path": {
              "avg_price": "average_price"
            },
            "script": {
              "source": "params.avg_price >= 200"
            }
          }
        }
      }
    }
  }
}
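Conceptually, bucket_selector runs after the buckets and their metrics are computed and drops the buckets whose script returns false. The sketch below imitates that step on a hand-written bucket list; the bucket values and the Python predicate standing in for the script are assumptions for illustration.

```python
buckets = [
    {"key": "0-200", "doc_count": 2, "average_price": 100.0},
    {"key": "200-400", "doc_count": 2, "average_price": 275.0},
    {"key": "400-1000", "doc_count": 1, "average_price": 400.0},
]

def bucket_selector(buckets, buckets_path, predicate):
    """Keep only buckets for which predicate(params) is true."""
    kept = []
    for bucket in buckets:
        # Resolve each script variable to the metric value it points at.
        params = {var: bucket[path] for var, path in buckets_path.items()}
        if predicate(params):
            kept.append(bucket)
    return kept

selected = bucket_selector(
    buckets,
    buckets_path={"avg_price": "average_price"},
    predicate=lambda params: params["avg_price"] >= 200,
)
print([b["key"] for b in selected])  # ['200-400', '400-1000']
```

As in the request, buckets_path maps a script variable name (avg_price) to a sibling metric (average_price), and the condition only sees the mapped variables.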

III. Quick reference for common errors

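| Symptom | Root cause | Solution |
| --- | --- | --- |
| Fielddata is disabled on text fields | Aggregating on a text field | Aggregate on the .keyword sub-field instead |
| Aggregation result is empty | The field does not exist or is null in the matched documents | Confirm with an exists query, then backfill the data |
| Large error in cardinality results | precision_threshold is too low | Raise precision_threshold, at the cost of more memory |
| too_many_buckets_exception | The request produces more buckets than the search.max_buckets limit allows | Reduce the bucket count (smaller size, coarser intervals, narrower query) or page through buckets with a composite aggregation |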
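For the first row, a terms aggregation pointed at the keyword sub-field might be built as follows. The field name `category` and the helper `terms_agg_body` are hypothetical, and the sketch assumes the index mapping actually defines a `keyword` sub-field for that text field.

```python
def terms_agg_body(text_field, size=10):
    """Build a terms aggregation on the keyword sub-field of a text field."""
    return {
        "size": 0,  # aggregation results only, no hits
        "aggs": {
            f"group_by_{text_field}": {
                # Assumes the mapping defines a `keyword` sub-field for this field.
                "terms": {"field": f"{text_field}.keyword", "size": size}
            }
        },
    }

body = terms_agg_body("category", size=5)
print(body)
```

Keeping the terms `size` small, as here, also limits how many buckets the request produces.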

Summary

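Combining metrics aggregations and bucket aggregations sensibly lets a single query handle statistics, grouping, filtering, and HAVING-like logic, which suits scenarios such as log analysis, report statistics, and operations dashboards.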