Elasticsearch's aggregations module is a convenient way to get multi-bucket data.
It can dynamically count how many documents carry each distinct value of a field.
Example:
{ "aggs" : { "genders" : { "terms" : { "field" : "gender" } } } }
Response:
{ ... "aggregations" : { "genders" : { "buckets" : [ { "key" : "male", "doc_count" : 10 }, { "key" : "female", "doc_count" : 10 } ] } } }
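As a rough illustration of reading such a response on the client side, the Python sketch below flattens the buckets into a dict of value to count. The helper name bucket_counts and the hard-coded sample body are assumptions made for this example; in practice the JSON would come back from the search request.

```python
import json

# Sample response body based on the example above (other top-level keys omitted).
SAMPLE_RESPONSE = """
{
  "aggregations": {
    "genders": {
      "buckets": [
        {"key": "male", "doc_count": 10},
        {"key": "female", "doc_count": 10}
      ]
    }
  }
}
"""


def bucket_counts(response, agg_name):
    """Map each bucket key of a terms aggregation to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


counts = bucket_counts(json.loads(SAMPLE_RESPONSE), "genders")
print(counts)
```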
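In particular, the terms aggregation accepts a size parameter that asks for the counts of only the top N values. (The size parameter shows up in many places in Elasticsearch.)
What if you want all the distinct values of the field?
Set size to 0, which is treated as Integer.MAX_VALUE. This applies to the Elasticsearch 1.x releases; size 0 was later deprecated and is no longer accepted from 5.0 on.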
{ "aggs" : { "products" : { "terms" : { "field" : "product", "size" : 0 } } } }
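If the request body is assembled in code rather than written by hand, a small builder keeps the size handling in one place. This is only a sketch: terms_agg_body is a hypothetical helper name, and it merely produces the JSON shown above without sending anything to a cluster.

```python
import json


def terms_agg_body(agg_name, field, size=None):
    """Build a terms aggregation request body; size is left out when None."""
    terms = {"field": field}
    if size is not None:
        terms["size"] = size
    return {"aggs": {agg_name: {"terms": terms}}}


# Top 5 products, then the size 0 form used in the post.
print(json.dumps(terms_agg_body("products", "product", size=5)))
print(json.dumps(terms_agg_body("products", "product", size=0)))
```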
Another option is the Cardinality Aggregation,
which returns the (approximate) number of distinct values in the field you specify.
{ "aggs" : { "author_count" : { "cardinality" : { "field" : "author" } } } }
Cardinality Aggregation
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
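Unlike the terms aggregation, the cardinality aggregation returns a single number instead of buckets. The sketch below assumes the response carries that number under a "value" key, as in the reference documentation linked above; distinct_count is a hypothetical helper and the number 12 is invented sample data.

```python
import json

# Hypothetical response body; the number 12 is invented sample data.
SAMPLE_RESPONSE = '{"aggregations": {"author_count": {"value": 12}}}'


def distinct_count(response, agg_name):
    """Return the distinct-value count reported by a cardinality aggregation."""
    return response["aggregations"][agg_name]["value"]


n = distinct_count(json.loads(SAMPLE_RESPONSE), "author_count")
print(n)
```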
See also:
peicheng-note: Elasticsearch-related articles
http://peichengnote.blogspot.tw/search/label/elasticsearch
peicheng note: [elasticsearch] revisiting the _all field
http://peichengnote.blogspot.tw/2014/06/elasticsearch-all-field.html
peicheng note: [elasticsearch] range query depends on the field type
http://peichengnote.blogspot.tw/2014/06/elasticsearchrange-query-depends-on.html
peicheng-note: [elasticsearch] document id _id field uuid
http://peichengnote.blogspot.tw/2014/05/elasticsearch-document-id-id-field-uuid.html
peicheng-note: [elasticsearch/logstash] logstash id: automatic generation of the document id "_id"
http://peichengnote.blogspot.tw/2014/04/elasticsearchlogstash-logstash-id.html
peicheng note: [elasticsearch] index size, shard size, heap size design
http://peichengnote.blogspot.tw/2014/07/elasticsearch-index-size-shard-size.html
peicheng note: [elasticsearch] safely reload configuration from elasticsearch.yml
peicheng note: [elasticsearch] url query_string length limit
peicheng note: [elasticsearch] about brain split / a cluster splitting into two clusters
peicheng note: [Elasticsearch] NumberFormatException / Invalid shift value in prefixCoded bytes