
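Aggregation queries

  1. Concepts

    Aggregations (aggs) are the second major kind of search request covered so far, alongside ordinary queries ("query"), so the top-level key of the request body changes from "query" to "aggs". A field used for aggregation must hold exact values (for example a keyword, numeric, or date field); analyzed text fields cannot be aggregated by default. To aggregate on a text field you have to enable fielddata, which is generally discouraged: instead of the on-disk doc_values structure, the aggregation then relies on a fielddata structure built in heap memory, and aggregating over a large data set can easily cause an out-of-memory error (OOM). The underlying mechanics are covered in the advanced part of this series.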
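
To make the exact-value requirement concrete, here is a minimal plain-Python sketch (not Elasticsearch code) of what bucketing sees on a keyword field versus an analyzed text field. The `analyze` function is a hypothetical whitespace tokenizer standing in for a real analyzer such as ik_max_word, and the product names are illustrative.

```python
from collections import Counter

# Illustrative product names (assumption: English stand-ins for the sample data).
names = ["Xiaomi phone", "Xiaomi NFC phone", "NFC phone"]


def analyze(text):
    """Hypothetical stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


# keyword field: each document contributes its whole value as a single term.
keyword_buckets = Counter(names)

# text field with fielddata enabled: each document contributes its analyzed tokens,
# so the buckets are tokens rather than product names.
text_buckets = Counter(token for name in names for token in set(analyze(name)))
```

On the keyword side every name is its own bucket, while the text side produces buckets such as "phone" that no longer correspond to a product name, which is rarely what a GROUP BY style report wants.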

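  2. Types of aggregation

    1. Bucket aggregations: comparable to GROUP BY in SQL; mainly used to count the documents in each category.
    2. Metrics aggregations: compute metrics such as the maximum, minimum, average, or sum of a field.
    3. Pipeline aggregations: aggregate the output of other aggregations. For example, to find the tag bucket that holds the most documents, first bucket by tag, then take the maximum over those buckets.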
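  3. Syntax

    GET product/_search
    {
      "aggs": {
        "<aggs_name>": {
          "<agg_type>": {
            "field": "<field_name>"
          }
        }
      }
    }
    

    aggs_name: a name you choose for the aggregation; it labels the result in the response.

    agg_type: the aggregation type, e.g. a bucket aggregation (terms) or a metric aggregation (avg, sum, min, max, and so on).

    field_name: the name of the field to aggregate on.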

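  4. Bucket aggregations

    Use case: count the documents in each category; bucket aggregations can be nested.

    Function: terms

    Note: the field must hold exact values, e.g. a keyword field.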
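
As a rough illustration of what a terms aggregation computes, the following plain-Python sketch counts documents per distinct tag. It is not Elasticsearch code: `terms_buckets` is a hypothetical helper, the tag values are English stand-ins for the sample data, and the ordering (document count descending, then key ascending) is an assumption about the default tie-break.

```python
from collections import Counter

# Illustrative documents with a multi-valued exact-value field "tags".
docs = [
    {"tags": ["value", "enthusiast", "smooth"]},
    {"tags": ["value", "enthusiast", "transit-card"]},
    {"tags": ["value", "fast-charge", "door-card"]},
]


def terms_buckets(documents, field, size=10):
    """Count documents per distinct field value and keep the top `size` buckets."""
    counts = Counter()
    for doc in documents:
        # A document is counted once per distinct value it holds.
        for value in set(doc.get(field, [])):
            counts[value] += 1
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered[:size]]


buckets = terms_buckets(docs, "tags", size=2)
```

The `size` argument plays the same role as "size" inside terms: it limits how many buckets come back, not how many documents are counted.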

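  5. Metric aggregations

    Use case: compute a metric such as the maximum, minimum, or average. Metric aggregations can be combined with bucket aggregations, e.g. bucket by product type and compute the average price of each bucket.

    Functions: avg (average), max (maximum), min (minimum), sum (sum), stats (count, min, max, avg, and sum in one response), value_count (number of values)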
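
The metrics above are ordinary descriptive statistics. The sketch below, in plain Python rather than Elasticsearch, shows what a stats-style result contains for a list of prices taken from the sample data. `stats` is a hypothetical helper, and the shape returned for an empty input is an assumption.

```python
def stats(values):
    """Return count, min, max, avg, and sum over the values that are present."""
    present = [value for value in values if value is not None]
    if not present:
        # Assumed shape for "no values"; check the real response before relying on it.
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    total = sum(present)
    return {
        "count": len(present),
        "min": min(present),
        "max": max(present),
        "avg": total / len(present),
        "sum": total,
    }


# None models a document that has no price; it is ignored by every metric.
prices = [3999, 4999, 2999, 999, 399, None]
result = stats(prices)
```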

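  6. Pipeline aggregations

    Use case: run a second aggregation over the results of another aggregation. For example, to find the product category with the lowest average price, first bucket by category and compute each bucket's average price, then take the minimum of those averages.

    Functions: min_bucket (bucket with the minimum value), max_bucket (bucket with the maximum value), avg_bucket (average across buckets), sum_bucket (sum across buckets), stats_bucket (statistics across buckets)

    Note: buckets_path is the key parameter of a pipeline aggregation. The path is resolved relative to the level at which the pipeline aggregation itself is declared, and ">" steps into a sub-aggregation. In the example below, my_aggs and my_min_bucket are siblings, so the path starts at my_aggs and continues to its sub-aggregation my_price_bucket.

    GET product/_search
    {
      "size": 0,
      "aggs": {
        "my_aggs": {
          "terms": {...},
          "aggs": {
            "my_price_bucket": {...}
          }
        },
        "my_min_bucket": {
          "min_bucket": {
            "buckets_path": "my_aggs>my_price_bucket"
          }
        }
      }
    }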
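
The three steps behind a min_bucket pipeline can be sketched in plain Python as follows. This is an emulation for illustration only, not Elasticsearch code; the type labels are English stand-ins for the sample data, and the comments map each step to the aggregation names used in the note above.

```python
from collections import defaultdict

docs = [
    {"type": "phone", "price": 3999},
    {"type": "phone", "price": 4999},
    {"type": "phone", "price": 2999},
    {"type": "headphones", "price": 999},
    {"type": "headphones", "price": 399},
    {"type": "tv", "price": 2999},
    {"type": "tv", "price": 2998},
]

# Step 1 (my_aggs): bucket the documents by type.
prices_by_type = defaultdict(list)
for doc in docs:
    prices_by_type[doc["type"]].append(doc["price"])

# Step 2 (my_price_bucket): compute the average price inside each bucket.
avg_by_type = {key: sum(prices) / len(prices) for key, prices in prices_by_type.items()}

# Step 3 (my_min_bucket): a sibling step that reads the per-bucket averages
# and reports the lowest one together with the bucket key(s) that produced it.
lowest = min(avg_by_type.values())
min_bucket = {
    "value": lowest,
    "keys": [key for key, value in avg_by_type.items() if value == lowest],
}
```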
    
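  7. Nested aggregations

    Syntax:

    GET product/_search
    {
      "size": 0,
      "aggs": {
        "<agg_name>": {
          "<agg_type>": {
            "field": "<field_name>"
          },
          "aggs": {
            "<agg_name_child>": {
              "<agg_type>": {
                "field": "<field_name>"
              }
            }
          }
        }
      }
    }
    

    Use case: aggregate again inside the results of another aggregation. For example, to get the average price per product type, bucket by product type and then compute the average price inside each bucket.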
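
Nesting simply means that the child aggregation runs once per parent bucket. A minimal plain-Python sketch of a two-level variant (bucket by type, then by level inside each type) might look like this; the labels are English stand-ins for the sample data and the result layout is a simplification of the real response.

```python
from collections import Counter, defaultdict

# (type, level) pairs for a handful of illustrative products.
docs = [
    ("phone", "flagship"),
    ("phone", "flagship"),
    ("phone", "high-end"),
    ("headphones", "budget"),
    ("headphones", "budget"),
    ("tv", "high-end"),
]

# Outer level: one bucket per type. Inner level: one counter per outer bucket.
nested = defaultdict(Counter)
for product_type, level in docs:
    nested[product_type][level] += 1

result = {
    product_type: {"doc_count": sum(levels.values()), "lv": dict(levels)}
    for product_type, levels in nested.items()
}
```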

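  8. How aggregations and queries interact

    1. Aggregations scoped by a query or filter

      Syntax:

      GET product/_search
      {
        "query": {...},
        "aggs": {...}
      }
      

      Note: the query runs first and the aggregations are computed over the documents it matches, regardless of which key appears first in the request body. The query can be a scoring query, a filter, or a bool query.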

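    2. Filtering the hits after aggregation (post_filter)

      GET product/_search
      {
        "aggs": {...},
        "post_filter": {...}
      }
      

      Note: the aggregations are computed first and post_filter is applied afterwards, again regardless of key order. post_filter therefore narrows only the returned hits, not the aggregation results.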

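    3. Scope of the query conditions

      GET product/_search
      {
        "size": 10,
        "query": {...},
        "aggs": {
          "avg_price": {...},
          "all_avg_price": {
            "global": {},
            "aggs": {...}
          }
        }
      }
      

      In this example avg_price is computed over the documents matched by the query, whereas all_avg_price sits inside a global aggregation and is computed over all documents in the index.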
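
The three scopes discussed in this section (query, global, post_filter) can be compared side by side in a small plain-Python sketch. It is an emulation under simplified assumptions: the `id` field, the thresholds, and the `avg_price` helper are illustrative and not part of the original examples.

```python
docs = [{"id": i, "price": p} for i, p in enumerate([3999, 4999, 2999, 999, 399], start=1)]


def avg_price(items):
    """Average price of a list of documents."""
    return sum(doc["price"] for doc in items) / len(items)


# "query": only documents with price >= 3000 are in scope.
matched = [doc for doc in docs if doc["price"] >= 3000]

aggs = {
    "avg_price": avg_price(matched),   # scoped by the query
    "all_avg_price": avg_price(docs),  # "global": ignores the query
}

# "post_filter": applied after the aggregations, so it changes only the hits.
hits = [doc for doc in matched if doc["price"] < 4500]
```

Note that `aggs` is computed before `hits` is narrowed, which is why a post_filter never changes aggregation results.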

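  9. Sorting buckets

    1. Sort keys:

      order_type: _count (number of documents in the bucket) or _key (the bucket's key). _term is a deprecated alias of _key; use _key instead.

      GET product/_search
      {
        "aggs": {
          "type_agg": {
            "terms": {
              "field": "tags.keyword",
              "order": {
                "<order_type>": "desc"
              },
              "size": 10
            }
          }
        }
      }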
      
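    2. Multi-level sorting: each level of a nested bucket aggregation has its own order, and the outer level takes priority.

      GET product/_search?size=0
      {
        "aggs": {
          "first_sort": {
            ...
            "aggs": {
              "second_sort": {...}
            }
          }
        }
      }
      

      Here the outer buckets are ordered by first_sort, and the buckets inside each of them are then ordered by second_sort.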

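    3. Sorting by a nested aggregation: order the buckets by the result of an aggregation nested several levels below them.

      GET product/_search
      {
        "size": 0,
        "aggs": {
          "tag_avg_price": {
            "terms": {
              "field": "type.keyword",
              "order": {
                "agg_stats>my_stats.sum": "desc"
              }
            },
            "aggs": {
              "agg_stats": {
                ...
                "aggs": {
                  "my_stats": {
                    "extended_stats": {...}
                  }
                }
              }
            }
          }
        }
      }
      

      Here the tag_avg_price buckets are ordered by the sum reported by the inner aggregation my_stats. In the order path, ">" steps into a sub-aggregation and "." selects one metric of a multi-value aggregation.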
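
The ordering options in this section boil down to choosing a sort key for the buckets. The plain-Python sketch below builds one bucket per type and sorts it three ways; it is an illustration only, the labels are English stand-ins for the sample data, and the key-ascending tie-break for equal counts is an assumption.

```python
from collections import defaultdict

docs = [
    ("phone", 3999), ("phone", 4999), ("phone", 2999),
    ("headphones", 999), ("headphones", 399),
    ("tv", 2999), ("tv", 2998),
]

# Build one bucket per type with a document count and a price sum.
counts = defaultdict(int)
sums = defaultdict(int)
for product_type, price in docs:
    counts[product_type] += 1
    sums[product_type] += price
buckets = [{"key": key, "doc_count": counts[key], "sum": sums[key]} for key in counts]

# {"_count": "desc"}: most documents first (ties broken by key, as an assumption).
by_count_desc = sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))

# {"_key": "asc"}: alphabetical by bucket key.
by_key_asc = sorted(buckets, key=lambda b: b["key"])

# Ordering by a sub-aggregation metric, as in "agg_stats>my_stats.sum": "desc".
by_sum_desc = sorted(buckets, key=lambda b: b["sum"], reverse=True)
```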

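  10. Commonly used aggregation functions

    1. histogram: histogram (bar chart) statistics

      Use case: statistics over fixed-width intervals, e.g. sales per price range.

      Syntax:

      GET product/_search?size=0
      {
        "aggs": {
          "<histogram_name>": {
            "histogram": {
              "field": "price",
              "interval": 1000,
              "keyed": true,
              "min_doc_count": <num>,
              "missing": 1999
            }
          }
        }
      }

      field: the field to bucket on.
      interval: the width of each bucket.
      keyed: when true, the buckets are returned as an object keyed by bucket key instead of an array.
      min_doc_count: the minimum document count for a bucket to be returned; buckets with fewer than <num> documents are omitted.
      missing: the value substituted for documents that have no value in the field; here such documents are counted as if their price were 1999.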
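
The bucketing rule behind histogram is to round each value down to a multiple of the interval. The plain-Python sketch below emulates that rule together with missing and min_doc_count. It assumes integer values and an integer interval, ignores offsets and keyed output, and `histogram` is a hypothetical helper rather than a client API.

```python
from collections import Counter


def histogram(values, interval, min_doc_count=0, missing=None):
    """Count values per fixed-width bucket; bucket key = floor(value / interval) * interval."""
    counts = Counter()
    for value in values:
        if value is None:
            if missing is None:
                continue  # documents without a value are skipped
            value = missing  # otherwise they are counted under the substitute value
        counts[(value // interval) * interval] += 1
    if not counts:
        return {}
    if min_doc_count == 0:
        # Fill the gaps between the lowest and highest key with empty buckets.
        for key in range(min(counts), max(counts) + interval, interval):
            counts.setdefault(key, 0)
    return {key: count for key, count in sorted(counts.items()) if count >= min_doc_count}


prices = [3999, 4999, 2999, 999, 399, None]
buckets = histogram(prices, 1000, missing=1999)
```

With `missing=1999` the document without a price lands in the 1000 bucket; without it, that document would not be counted at all.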
      
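    2. date_histogram: a histogram over dates, e.g. sales per month over a year

      Syntax:

      GET product/_search?size=0
      {
        "aggs": {
          "my_date_histogram": {
            "date_histogram": {
              "field": "createtime",
              "<interval_type>": "month",
              "format": "yyyy-MM",
              "extended_bounds": {
                "min": "2020-01",
                "max": "2020-12"
              }
            }
          }
        }
      }
      

      field: must be a date field.
      format: the output format of the bucket keys.
      extended_bounds: extends the bucket range to min..max so that empty buckets are returned as well (meaningful only when min_doc_count is 0).

      interval_type: one of the following interval parameters.

      fixed_interval: a fixed-length interval in ms (milliseconds), s (seconds), m (minutes), h (hours), or d (days). The unit must be preceded by a number, e.g. 2d for two days. Be careful with very small intervals: they can produce so many buckets that the service is brought down.

      calendar_interval: a calendar-aware interval such as month or year.

      interval: deprecated (still accepted in 7.x); use one of the two parameters above.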
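
A calendar-month date_histogram with extended_bounds can be approximated in plain Python as below. This is a sketch under simplifying assumptions: time zones are ignored, bounds are given as (year, month) tuples, `monthly_histogram` is a hypothetical helper, and the dates are a subset of the sample data.

```python
from collections import Counter
from datetime import date


def monthly_histogram(dates, bounds_min, bounds_max):
    """Count dates per calendar month, padding with empty months up to the bounds."""
    counts = Counter((d.year, d.month) for d in dates)
    # The bounds only widen the range; months that contain data are never dropped.
    start = min([bounds_min, *counts])
    end = max([bounds_max, *counts])
    result = {}
    year, month = start
    while (year, month) <= end:
        result[f"{year:04d}-{month:02d}"] = counts.get((year, month), 0)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return result


dates = [
    date(2020, 10, 1),
    date(2020, 5, 21),
    date(2020, 6, 20),
    date(2020, 6, 23),
    date(2020, 7, 20),
]
buckets = monthly_histogram(dates, (2020, 1), (2020, 12))
```

The keys use the same "yyyy-MM" shape as the format parameter in the syntax above, and months without documents appear with a count of 0.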

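    3. percentiles: percentile statistics (often shown as a pie chart)

      The results are approximate: by default percentiles are estimated with the TDigest algorithm rather than computed exactly.

      1. percentiles: describes how the values are distributed. For example, a 99th percentile of 1000 means that 99% of the values are at or below 1000. A common use is checking an SLA such as "99% of requests complete within 100 ms": query the 99th percentile of the latency, and if it is within 100 ms the SLA is met.

        Syntax:

        GET product/_search?size=0
        {
          "aggs": {
            "<percentiles_name>": {
              "percentiles": {
                "field": "price",
                "percents": [
                  percent1,
                  percent2,
                  ...
                ]
              }
            }
          }
        }

        percents: the percentiles to compute, e.g. 5, 10, 30, 50, 99 for the 5%, 10%, 30%, 50%, and 99% points of the distribution.
        
      2. percentile_ranks: the inverse of percentiles. For example, to see where the values 1000 and 3000 fall in the distribution, query their ranks; if the result is 95 and 99, then 95% of the values are at or below 1000 and 99% are at or below 3000.

        GET product/_search?size=0
        {
          "aggs": {
            "<percentile_ranks_name>": {
              "percentile_ranks": {
                "field": "<field_name>",
                "values": [
                  value1,
                  value2,
                  ...
                ]
              }
            }
          }
        }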
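
To see how percentiles and percentile_ranks mirror each other, here is an exact, plain-Python version of both using the nearest-rank definition. This is a swapped-in technique for illustration: Elasticsearch estimates these values (TDigest by default) and interpolates, so its numbers can differ from the ones this sketch produces. The prices are the ten non-empty prices in the sample data.

```python
import math


def percentile(values, percent):
    """Nearest-rank percentile: smallest value with at least percent% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def percentile_rank(values, threshold):
    """Percentage of values that are at or below the threshold."""
    return 100 * sum(1 for value in values if value <= threshold) / len(values)


prices = [3999, 4999, 2999, 999, 399, 3299, 4399, 2998, 2999, 2998]

p50 = percentile(prices, 50)                # value at the 50% point
p99 = percentile(prices, 99)                # value at the 99% point
rank_3000 = percentile_rank(prices, 3000)   # share of prices at or below 3000
```

Reading the two together: `percentile` answers "which price sits at this percentage", while `percentile_rank` answers "which percentage sits at this price".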
        

Examples

# Aggregation queries
DELETE product
## Sample data
PUT product
{"mappings" : {"properties" : {"createtime" : {"type" : "date"},"date" : {"type" : "date"},"desc" : {"type" : "text","fields" : {"keyword" : {"type" : "keyword","ignore_above" : 256}},"analyzer":"ik_max_word"},"lv" : {"type" : "text","fields" : {"keyword" : {"type" : "keyword","ignore_above" : 256}}},"name" : {"type" : "text","analyzer":"ik_max_word","fields" : {"keyword" : {"type" : "keyword","ignore_above" : 256}}},"price" : {"type" : "long"},"tags" : {"type" : "text","fields" : {"keyword" : {"type" : "keyword","ignore_above" : 256}}},"type" : {"type" : "text","fields" : {"keyword" : {"type" : "keyword","ignore_above" : 256}}}}}
}
PUT /product/_doc/1
{"name" : "小米手机","desc" :  "手机中的战斗机","price" :  3999,"lv":"旗舰机","type":"手机","createtime":"2020-10-01T08:00:00Z","tags": [ "性价比", "发烧", "不卡顿" ]
}
PUT /product/_doc/2
{"name" : "小米NFC手机","desc" :  "支持全功能NFC,手机中的滑翔机","price" :  4999,"lv":"旗舰机","type":"手机","createtime":"2020-05-21T08:00:00Z","tags": [ "性价比", "发烧", "公交卡" ]
}
PUT /product/_doc/3
{"name" : "NFC手机","desc" :  "手机中的轰炸机","price" :  2999,"lv":"高端机","type":"手机","createtime":"2020-06-20","tags": [ "性价比", "快充", "门禁卡" ]
}
PUT /product/_doc/4
{"name" : "小米耳机","desc" :  "耳机中的黄焖鸡","price" :  999,"lv":"百元机","type":"耳机","createtime":"2020-06-23","tags": [ "降噪", "防水", "蓝牙" ]
}
PUT /product/_doc/5
{"name" : "红米耳机","desc" :  "耳机中的肯德基","price" :  399,"type":"耳机","lv":"百元机","createtime":"2020-07-20","tags": [ "防火", "低音炮", "听声辨位" ]
}
PUT /product/_doc/6
{"name" : "小米手机10","desc" :  "充电贼快掉电更快,超级无敌望远镜,高刷电竞屏","price" :  "","lv":"旗舰机","type":"手机","createtime":"2020-07-27","tags": [ "120HZ刷新率", "120W快充", "120倍变焦" ]
}
PUT /product/_doc/7
{"name" : "挨炮 SE2","desc" :  "除了CPU,一无是处","price" :  "3299","lv":"旗舰机","type":"手机","createtime":"2020-07-21","tags": [ "割韭菜", "割韭菜", "割新韭菜" ]
}
PUT /product/_doc/8
{"name" : "XS Max","desc" :  "听说要出新款12手机了,终于可以换掉手中的4S了","price" :  4399,"lv":"旗舰机","type":"手机","createtime":"2020-08-19","tags": [ "5V1A", "4G全网通", "大" ]
}
PUT /product/_doc/9
{"name" : "小米电视","desc" :  "70寸性价比只选,不要一万八,要不要八千八,只要两千九百九十八","price" :  2998,"lv":"高端机","type":"耳机","createtime":"2020-08-16","tags": [ "巨馍", "家庭影院", "游戏" ]
}
PUT /product/_doc/10
{"name" : "红米电视","desc" :  "我比上边那个更划算,我也2998,我也70寸,但是我更好看","price" :  2999,"type":"电视","lv":"高端机","createtime":"2020-08-28","tags": [ "大片", "蓝光8K", "超薄" ]
}
PUT /product/_doc/11
{"name": "红米电视","desc": "我比上边那个更划算,我也2998,我也70寸,但是我更好看","price": 2998,"type": "电视","lv": "高端机","createtime": "2020-08-28","tags": ["大片","蓝光8K","超薄"]
}
## Syntax
GET product/_search
{"aggs": {"<aggs_name>": {"<agg_type>": {"field": "<field_name>"}}}
}
## Bucket aggregation. Example: count the products per tag
GET product/_search
{"aggs": {"tag_bucket": {"terms": {"field": "tags.keyword"}}}
}
## Omit the hits from the response: "size": 0
GET product/_search
{"size": 0, "aggs": {"tag_bucket": {"terms": {"field": "tags.keyword"}}}
}
## Sorting
GET product/_search
{"size": 0, "aggs": {"tag_bucket": {"terms": {"field": "tags.keyword","size": 3,"order": {"_count": "desc"}}}}
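}
## doc_values and fielddata
## The four requests below: terms on the text field name (rejected while fielddata is disabled), terms on name.keyword, enabling fielddata on name, then terms on name again (the buckets are analyzed tokens)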
GET product/_search
{"size": 0, "aggs": {"tag_bucket": {"terms": {"field": "name"}}}
}
GET product/_search
{"size": 0, "aggs": {"tag_bucket": {"terms": {"field": "name.keyword"}}}
}
POST product/_mapping
{"properties": {"name": {"type": "text","analyzer": "ik_max_word","fielddata": true}}
}
GET product/_search
{"size": 0,"aggs": {"tag_bucket": {"terms": {"size": 20,"field": "name"}}}
}
#*****************************************
## Metric aggregations
## Example: the highest, lowest, and average price
GET product/_search
{"size": 0, "aggs": {"max_price": {"max": {"field": "price"}},"min_price": {"min": {"field": "price"}},"avg_price": {"avg": {"field": "price"}}}
}
## All basic metrics in a single aggregation
GET product/_search
{"size": 0, "aggs": {"price_stats": {"stats": {"field": "price"}}}
}
## Number of distinct values of name (the text field counts distinct tokens, name.keyword counts distinct names)
GET product/_search
{"size": 0, "aggs": {"type_count": {"cardinality": {"field": "name"}}}
}
GET product/_search
{"size": 0, "aggs": {"type_count": {"cardinality": {"field": "name.keyword"}}}
}
## Number of distinct values of lv
GET product/_search
{"size": 0, "aggs": {"type_count": {"cardinality": {"field": "lv.keyword"}}}
}
##*********************************************
## Pipeline aggregation (aggregating over aggregation results)
## Example: find the product type with the lowest average price
GET product/_search
{"size": 0, "aggs": {"type_bucket": {"terms": {"field": "type.keyword"},"aggs": {"price_bucket": {"avg": {"field": "price"}}}},"min_bucket":{"min_bucket": {"buckets_path": "type_bucket>price_bucket"}}}
}
##=============================================
## Nested aggregations
## Syntax
GET product/_search
{"size": 0,"aggs": {"<agg_name>": {"<agg_type>": {"field": "<field_name>"},"aggs": {"<agg_name_child>": {"<agg_type>": {"field": "<field_name>"}}}}}
}
# Example: count the products per type, broken down by level (lv)
GET product/_search
{"size": 0, "aggs": {"type_lv": {"terms": {"field": "type.keyword"},"aggs": {"lv": {"terms": {"field": "lv.keyword"}}}}}
}
# Bucket by lv and output the price statistics of each bucket
GET product/_search
{"size": 0, "aggs": {"lv_price": {"terms": {"field": "lv.keyword"},"aggs": {"price": {"stats": {"field": "price"}}}}}
}
## Combining the two examples above:
## price statistics and tag counts per product type and level
GET product/_search
{"size": 0, "aggs": {"type_agg": {"terms": {"field": "type.keyword"},"aggs": {"lv_agg": {"terms": {"field": "lv.keyword"},"aggs": {"price_stats": {"stats": {"field": "price"}},"tags_buckets": {"terms": {"field": "tags.keyword"}}}}}}}
}
## For each product type, find the level with the lowest average price
GET product/_search
{"size": 0,"aggs": {"type_bucket": {"terms": {"field": "type.keyword"},"aggs": {"lv_bucket": {"terms": {"field": "lv.keyword"},"aggs": {"price_avg": {"avg": {"field": "price"}}}},"min_bucket": {"min_bucket": {"buckets_path": "lv_bucket>price_avg"}}}}}
}
#======================================================
# Aggregation over query results
GET product/_search
{"size": 0, "query": {"range": {"price": {"gte": 5000}}}, "aggs": {"tags_bucket": {"terms": {"field": "tags.keyword"}}}
}
# Aggregation over filter results
GET product/_search
{"query": {"constant_score": {"filter": {"range": {"price": {"gte": 5000}}}}},"aggs": {"tags_bucket": {"terms": {"field": "tags.keyword"}}} 
}
GET product/_search
{"query": {"bool": {"filter": {"range": {"price": {"gte": 5000}}}}}, "aggs": {"tags_bucket": {"terms": {"field": "tags.keyword"}}}
}
# post_filter: filter the hits after the aggregations have been computed
GET product/_search
{"aggs": {"tags_bucket": {"terms": {"field": "tags.keyword"}}},"post_filter": {"term": {"tags.keyword": "性价比"}}
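}
# Ignoring the query (global) and adding conditions inside aggs (filter)
## Example: the highest, lowest, and average price, plus the average over all documents and over a filtered subset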
GET product/_search
{"size": 10,"query": {"range": {"price": {"gte": 4000}}},"aggs": {"max_price": {"max": {"field": "price"}},"min_price": {"min": {"field": "price"}},"avg_price": {"avg": {"field": "price"}},"all_avg_price": {"global": {},"aggs": {"avg_price": {"avg": {"field": "price"}}}},"muti_avg_price": {"filter": {"range": {"price": {"lte": 4500}}}, "aggs": {"avg_price": {"avg": {"field": "price"}}}}}
}
#===============================================
# Sorting buckets: _count, _key, _term
GET product/_search
{"size": 0,"aggs": {"type_agg": {"terms": {"field": "tags.keyword","order": {"_count": "desc"},"size": 10}}}
}
# Multi-level sorting
GET product/_search?size=0
{"aggs": {"first_sort": {"terms": {"field": "type.keyword","order": {"_count": "desc"}},"aggs": {"second_sort": {"terms": {"field": "lv.keyword","order": {"_count": "asc"}}}}}}
}
# Sorting by a nested aggregation
GET product/_search
{"size": 0,"aggs": {"tag_avg_price": {"terms": {"field": "type.keyword","order": {"agg_stats>stats.sum": "desc"}},"aggs": {"agg_stats": {"filter": {"terms": {"type.keyword": ["耳机","手机","电视"]}},"aggs": {"stats": {"extended_stats": {"field": "price"}}}}}}}
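}
#===========================================================
# Commonly used aggregation functions
## histogram (bar chart); the first two requests use range with hand-written buckets for comparison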
GET product/_search
{"aggs": {"price_range": {"range": {"field": "price","ranges": [{"from": 0,"to": 1000},{"from": 1000,"to": 2000},{"from": 3000,"to": 4000},{"from": 4000,"to": 5000}]}}}
}
GET product/_search?size=0
{"aggs": {"price_range": {"range": {"field": "createtime","ranges": [{"from": "2020-05-01", "to": "2020-05-31"},{"from": "2020-06-01","to": "2020-06-30"},{"from": "2020-07-01","to": "2020-07-31"},{"from": "2020-08-01"}]}}}
}
# Handling missing values: assign a default value to documents that have no value in the field
GET product/_search?size=0
{"aggs": {"price_histogram": {"histogram": {"field": "price","interval": 1000,"keyed": true,"min_doc_count": 0,"missing": 1999}}}
}
# date_histogram
# fixed_interval units: ms s m h d
GET product/_search?size=0
{"aggs": {"my_date_histogram": {"date_histogram": {"field": "createtime","calendar_interval": "month","min_doc_count": 0,"format": "yyyy-MM", "extended_bounds": {"min": "2020-01","max": "2020-12"},"order": {"_count": "desc"}}}}
}
GET product/_search?size=0
{"aggs": {"my_auto_histogram": {"auto_date_histogram": {"field": "createtime","format": "yyyy-MM-dd","buckets": 180}}}
}
#cumulative_sum
GET product/_search?size=0
{"aggs": {"my_date_histogram": {"date_histogram": {"field": "createtime","calendar_interval": "month","min_doc_count": 0,"format": "yyyy-MM", "extended_bounds": {"min": "2020-01","max": "2020-12"}},"aggs": {"sum_agg": {"sum": {"field": "price"}},"my_cumulative_sum":{"cumulative_sum": {"buckets_path": "sum_agg"}}}}}
}
## percentiles: percentile statistics (often shown as a pie chart)
## https://www.elastic.co/guide/en/elasticsearch/reference/7.10/search-aggregations-metrics-percentile-aggregation.html
GET product/_search?size=0
{"aggs": {"price_percentiles": {"percentiles": {"field": "price","percents": [1,5,25,50,75,95,99]}}}
}
# percentile_ranks
# (approximated with the TDigest algorithm by default)
GET product/_search?size=0
{"aggs": {"price_percentiles": {"percentile_ranks": {"field": "price","values": [1000,2000,3000,4000,5000,6000]}}}
}