Overview
This article introduces Elasticsearch aggregation analysis, covering the usage of **metrics aggregations and bucket aggregations**.
Aggregation syntax
"aggregations" : {
  "<aggregation_name>" : {
    "<aggregation_type>" : {
      <aggregation_body>
    }
    [,"aggregations" : { [<sub_aggregation>]+ } ]?
  }
}
Note: aggregations can be abbreviated as aggs.
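The template above nests naturally when a request body is assembled in code. The following is a minimal Python sketch, assuming a hypothetical helper named `agg` (it is not part of any Elasticsearch client), that builds a body using the `aggs` shorthand with one sub-aggregation:

```python
import json


def agg(agg_type, body, sub_aggs=None):
    """Build the value of one named aggregation.

    Hypothetical helper for illustration only, not a client API.
    """
    node = {agg_type: body}
    if sub_aggs:
        # Sub-aggregations sit next to the aggregation type, not inside its body.
        node["aggs"] = sub_aggs
    return node


request_body = {
    "size": 0,  # return aggregation results only, no hits
    "aggs": {
        "group_by_price": agg(
            "range",
            {"field": "price", "ranges": [{"from": 0, "to": 200}]},
            sub_aggs={"average_price": agg("avg", {"field": "price"})},
        )
    },
}

print(json.dumps(request_body, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request.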
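I. Metrics aggregations
1. Basic metric types
| Type | Description |
|---|---|
| avg | Average of the field values |
| sum | Sum of the field values |
| min | Minimum value |
| max | Maximum value |
| value_count | Number of values extracted from the field |
| cardinality | Approximate count of distinct values |
| stats | Returns min, max, avg, count, and sum in one response |
| extended_stats | Extended statistics (adds sum of squares, variance, standard deviation, and more) |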
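To make the table concrete, here is a small local Python sketch that computes the same quantities over a made-up list of prices. It only models what each metric means; it does not call Elasticsearch, and the function name `basic_metrics` and the sample data are assumptions for illustration.

```python
def basic_metrics(values):
    """Compute local equivalents of the basic metric aggregations."""
    return {
        "avg": sum(values) / len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "value_count": len(values),
        # Exact here; the cardinality aggregation returns an approximation.
        "cardinality": len(set(values)),
    }


prices = [59.0, 120.0, 120.0, 349.5]  # made-up sample values
metrics = basic_metrics(prices)
for name, value in metrics.items():
    print(name, value)
```

Note that `value_count` is 4 while `cardinality` is 3, because the value 120.0 appears twice.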
2. Example: maximum value
POST /book/_search
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}
3. Example: count documents matching a condition
POST /book/_count
{
  "query": {
    "range": {
      "price": {
        "gt": 100
      }
    }
  }
}
4. Example: count the values of a field (documents without the field are skipped)
POST /book/_search
{
  "size": 0,
  "aggs": {
    "book_nums": {
      "value_count": {
        "field": "price"
      }
    }
  }
}
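The difference between counting documents and counting values is easy to miss. The sketch below is a simplified local model in Python, with made-up documents and a hypothetical `value_count` function, of how the `value_count` aggregation in the example above treats missing, null, and multi-valued fields:

```python
def value_count(documents, field):
    """Count the values present for `field` across all documents."""
    count = 0
    for doc in documents:
        value = doc.get(field)
        if value is None:
            continue  # missing or null: contributes nothing
        if isinstance(value, list):
            # Multi-valued field: every non-null value is counted.
            count += sum(1 for v in value if v is not None)
        else:
            count += 1
    return count


docs = [
    {"title": "A", "price": 59.0},
    {"title": "B"},                          # no price field
    {"title": "C", "price": None},           # explicit null
    {"title": "D", "price": [80.0, 95.0]},   # two values in one document
]

print(len(docs), value_count(docs, "price"))
```

Four documents yield three values here, so the result should not be read as a document count when fields can be absent or multi-valued.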
5. Example: distinct count
POST /book/_search?size=0
{
  "aggs": {
    "price_count": {
      "cardinality": {
        "field": "price"
      }
    }
  }
}
6. Example: stats
POST /book/_search?size=0
{
  "aggs": {
    "price_stats": {
      "stats": {
        "field": "price"
      }
    }
  }
}
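7. Example: extended stats
POST /book/_search?size=0
{
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}
In addition to the stats fields, the response includes the sum of squares, the variance, the standard deviation, and the bounds of the mean ± 2 standard deviations.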
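The extra fields can be reproduced by hand, which helps when checking a response. The following Python sketch is a local model under two assumptions: the variance is the population variance, and the bounds use a multiplier of 2 (exposed here as a `sigma` argument). The function name and sample data are made up for illustration.

```python
import math


def extended_stats(values, sigma=2.0):
    """Compute local equivalents of the extended_stats extra fields."""
    count = len(values)
    avg = sum(values) / count
    # Population variance: mean of squared deviations from the average.
    variance = sum((v - avg) ** 2 for v in values) / count
    std_deviation = math.sqrt(variance)
    return {
        "avg": avg,
        "sum_of_squares": sum(v * v for v in values),
        "variance": variance,
        "std_deviation": std_deviation,
        "std_deviation_bounds": {
            "upper": avg + sigma * std_deviation,
            "lower": avg - sigma * std_deviation,
        },
    }


prices = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample values
result = extended_stats(prices)
print(result)
```

For this sample the average is 5, the variance is 4, and the standard deviation is 2, so the bounds are 1 and 9.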
8. Example: percentiles
POST /book/_search?size=0
{
  "aggs": {
    "price_percents": {
      "percentiles": {
        "field": "price",
        "percents": [75, 99, 99.9]
      }
    }
  }
}
9. Example: percentile ranks
POST /book/_search?size=0
{
  "aggs": {
    "price_perc_rank": {
      "percentile_ranks": {
        "field": "price",
        "values": [100, 200]
      }
    }
  }
}
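Percentiles and percentile ranks answer inverse questions: the first maps a percentage to a value, the second maps a value to a percentage. The Python sketch below illustrates this with an exact nearest-rank calculation on made-up data. It is only a rough model: Elasticsearch computes both aggregations with an approximation algorithm and interpolates, so real responses can differ, and the "at or below" rule used for ranks is an assumption of this sketch.

```python
import math


def percentile_nearest_rank(values, percent):
    """Return the smallest value with at least `percent`% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def percentile_rank(values, threshold):
    """Return the percentage of values at or below `threshold`."""
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)


prices = [50, 100, 150, 200, 250, 300, 350, 400, 450, 500]  # made-up sample values

for p in (75, 99, 99.9):
    print("percentile", p, percentile_nearest_rank(prices, p))
for v in (100, 200):
    print("rank of", v, percentile_rank(prices, v))
```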
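II. Bucket aggregations
1. Common bucket aggregation types
| Type | Description |
|---|---|
| terms | Groups by field value (similar to GROUP BY) |
| range | Groups by numeric range |
| date_histogram | Groups by time interval |
| histogram | Groups by fixed numeric interval |
| filter | Single bucket of documents matching a filter |
| filters | One bucket per named filter |
| geo_distance | Groups by distance from a geographic origin point |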
2. Example: range buckets with a nested aggregation
POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {"from": 0, "to": 200},
          {"from": 200, "to": 400},
          {"from": 400, "to": 1000}
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
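The boundary behavior of the ranges above matters for values such as exactly 200 or 1000: in a range aggregation each bucket includes its `from` value and excludes its `to` value. The Python sketch below is a local model of the request on made-up prices; the function name and the key format are assumptions for illustration.

```python
def range_buckets_with_avg(values, ranges):
    """Group values into [from, to) buckets and average each bucket."""
    buckets = []
    for r in ranges:
        # `from` is inclusive, `to` is exclusive.
        members = [v for v in values if r["from"] <= v < r["to"]]
        buckets.append({
            "key": f"{float(r['from'])}-{float(r['to'])}",
            "doc_count": len(members),
            # An empty bucket has no average.
            "average_price": sum(members) / len(members) if members else None,
        })
    return buckets


prices = [59.0, 120.0, 200.0, 349.5, 420.0, 1000.0]  # made-up sample values
ranges = [
    {"from": 0, "to": 200},
    {"from": 200, "to": 400},
    {"from": 400, "to": 1000},
]

buckets = range_buckets_with_avg(prices, ranges)
for bucket in buckets:
    print(bucket)
```

The price 200.0 lands in the second bucket, and 1000.0 falls outside every range, so it is not counted anywhere.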
3. Example: HAVING-like filtering (bucket_selector)
POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {"from": 0, "to": 200},
          {"from": 200, "to": 400},
          {"from": 400, "to": 1000}
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        },
        "having": {
          "bucket_selector": {
            "buckets_path": {
              "avg_price": "average_price"
            },
            "script": {
              "source": "params.avg_price >= 200"
            }
          }
        }
      }
    }
  }
}
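bucket_selector is a pipeline aggregation: it runs on buckets that have already been computed and drops those for which the script returns false. The Python sketch below models that step locally. The bucket contents are made up, the function name is hypothetical, and a Python lambda stands in for the script.

```python
def bucket_selector(buckets, buckets_path, predicate):
    """Keep only the buckets for which `predicate(params)` is true."""
    kept = []
    for bucket in buckets:
        # Map each script variable name to the metric it points at.
        params = {name: bucket[path] for name, path in buckets_path.items()}
        if predicate(params):
            kept.append(bucket)
    return kept


# Made-up buckets, shaped like the output of the range + avg example.
buckets = [
    {"key": "0.0-200.0", "doc_count": 2, "average_price": 89.5},
    {"key": "200.0-400.0", "doc_count": 2, "average_price": 274.75},
    {"key": "400.0-1000.0", "doc_count": 1, "average_price": 420.0},
]

selected = bucket_selector(
    buckets,
    buckets_path={"avg_price": "average_price"},
    predicate=lambda params: params["avg_price"] >= 200,
)
print([b["key"] for b in selected])
```

Because the filter is applied after aggregation, it plays the role of HAVING, whereas a `query` clause plays the role of WHERE.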
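III. Common errors at a glance
| Symptom | Root cause | Fix |
|---|---|---|
| Fielddata is disabled on text fields | Aggregating on a text field | Aggregate on the .keyword sub-field instead |
| Empty aggregation result | The field does not exist or is null | Confirm with an exists query, then backfill the data |
| Large error in cardinality results | precision_threshold is too low | Raise precision_threshold (maximum 40000), at the cost of more memory |
| too_many_buckets_exception | Too many buckets | Lower size or narrow the query; page through large result sets with a composite aggregation |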
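Several of these fixes are request parameters. The Python sketch below builds a request body that aggregates on a keyword sub-field, caps the number of returned buckets, and raises the cardinality precision. The field `name.keyword` assumes a hypothetical mapping in which `name` is a text field with a keyword sub-field, and the numeric values are illustrative only.

```python
import json

# Hypothetical mapping: "name" is a text field with a "name.keyword" sub-field.
request_body = {
    "size": 0,
    "aggs": {
        "books_by_name": {
            "terms": {
                "field": "name.keyword",  # keyword sub-field, not the text field
                "size": 10,               # cap the number of returned buckets
            }
        },
        "distinct_prices": {
            "cardinality": {
                "field": "price",
                "precision_threshold": 10000,  # illustrative value
            }
        },
    },
}

print(json.dumps(request_body, indent=2))
```

A higher `precision_threshold` trades memory for accuracy, so it is worth raising only as far as the expected number of distinct values requires.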
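Summary
Combining metrics aggregations and bucket aggregations lets a single query compute statistics, group, filter, and apply HAVING-like logic, which suits log analysis, reporting, and operational dashboards.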