Friday, July 1, 2016

[elasticsearch] Rollover API and Shrink API: managing time-based event data more easily, Elastic Stack Release - 5.0.0-alpha4


Elastic Stack Release - 5.0.0-alpha4 delivers two exciting APIs that are much friendlier to time-based indices.

New Rollover API and Shrink API make managing indices for time-based event data much easier



Rollover Index


Rollover Index | Elasticsearch Reference [master] | Elastic
https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-rollover-index.html

The rollover index API rolls an alias over to a new index when the existing index is considered to be too large or too old.

With the rollover API you can define an age condition (max_age) or a document-count condition (max_docs); when either condition is met, a new index is created automatically following the naming rule.



PUT /logs-0001 
{
  "aliases": {
    "logs_write": {}
  }
}

POST logs_write/_rollover 
{
  "conditions": {
    "max_age":   "7d",
    "max_docs":  1000
  }
}

 1.  Create a new index and assign it the alias logs_write.
 2.  Configure rollover so that once the index is seven days old, or holds more than 1,000 documents, a new index named logs-0002 is created.
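
In everyday use, applications write through the alias rather than the concrete index name, so a rollover stays transparent to the writers. A minimal sketch of such a write (the type name log and the document fields are assumptions for illustration):

POST logs_write/log
{
  "timestamp": "2016-07-01T12:00:00Z",
  "message": "an example event"
}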

Here is the response to the _rollover request:
{
  "old_index": "logs-0001",
  "new_index": "logs-0002",
  "rolled_over": true, 
  "dry_run": false, 
  "conditions": { 
    "[max_age: 7d]": false,
    "[max_docs: 1000]": true
  }
}
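
The dry_run field in the response points to a handy variant: per the same reference page, adding ?dry_run to the request checks the conditions and reports what would happen without actually creating a new index:

POST logs_write/_rollover?dry_run
{
  "conditions": {
    "max_age":   "7d",
    "max_docs":  1000
  }
}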


The naming rule: when the index name ends with a - followed by a number, rollover increments that number automatically to name the new index.
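
If the automatic numbering does not fit your scheme, the reference page also lets you name the new index explicitly in the request path; a sketch (the target name logs-app-v2 is only an example):

POST logs_write/_rollover/logs-app-v2
{
  "conditions": {
    "max_age": "7d"
  }
}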


Shrink Index API

Shrink Index | Elasticsearch Reference [master] | Elastic
https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-shrink-index.html
The other new API is the shrink API:
The shrink index API allows you to shrink an existing index into a new index with fewer primary shards.

If you want to reduce the number of primary shards of an existing index, this API can automatically create the new, smaller index for you.


Shrinking works as follows:
  • First, it creates a new target index with the same definition as the source index, but with a smaller number of primary shards.
  • Then it hard-links segments from the source index into the target index. (If the file system doesn’t support hard-linking, then all segments are copied into the new index, which is a much more time consuming process.)
  • Finally, it recovers the target index as though it were a closed index which had just been re-opened.

One thing worth noting here: no single shard of the shrunken index can end up holding more than 2,147,483,519 documents.


  • The index must not contain more than 2,147,483,519 documents in total across all shards that will be shrunk into a single shard on the target index as this is the maximum number of docs that can fit into a single shard.
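
To make that limit concrete: if, say, a six-shard index is shrunk down to two shards, every three source shards are merged into one target shard, so each group of three must together stay under the 2,147,483,519-document ceiling; shrinking all the way to a single shard means the entire index has to fit under it.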

First, make sure the cluster health is green. The index then has to be made read-only, and a copy of every one of its shards has to be relocated onto a single node before the shrink can run.
These two requirements can be met with the request below.
Setting index.blocks.write to true blocks write operations but still allows metadata changes such as deleting the index.

PUT /my_source_index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "shrink_node_name", 
    "index.blocks.write": true 
  }
}
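
Before moving on, you can confirm that a copy of every shard has actually landed on the chosen node; a quick check with the _cat shards API (the node column shows where each shard copy lives):

curl -XGET 'localhost:9200/_cat/shards/my_source_index?v'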


Once the shard relocation has finished, you can call the shrink API:

POST my_source_index/_shrink/my_target_index
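
The reference page also shows that the shrink request can carry settings and aliases for the target index. A minimal sketch, assuming the body options documented there (the values are illustrative, and index.number_of_shards on the target must divide the source's shard count evenly):

POST my_source_index/_shrink/my_target_index
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1,
    "index.codec": "best_compression"
  },
  "aliases": {
    "my_search_indices": {}
  }
}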


To monitor the progress of these operations, you can use the _cat recovery API:

curl -XGET 'localhost:9200/_cat/recovery?v'

With these two APIs added, Elasticsearch has become noticeably more convenient for anyone working with time-based index data.

