peicheng note

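Thursday, December 25, 2014

[elasticsearch] Starting from the Elasticsearch index and basic concepts

Starting from the Elasticsearch index and basic concepts

The word "index" is misused all the time in discussions of Elasticsearch. Starting from the core concepts makes much of the observed behavior self-explanatory.

index

An "index" in Elasticsearch is roughly analogous to a database in a relational database system: it is the unit in which data is stored and indexed. Under the hood, an index is a logical namespace that points to one or more shards.

"To index" also means feeding your data into Elasticsearch: the data is indexed and stored so that it can be searched.

inverted index

The inverted index is the data structure Lucene uses underneath to make searches fast.

While processing the data, it extracts the unique terms (tokens) and records, for each term, which documents contain it.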
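The idea of an inverted index can be sketched in a few lines of Python. This is only a toy under simple assumptions (a whitespace tokenizer, lowercase terms, made-up sample documents) and not how Lucene actually stores its postings:

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Toy analyzer (an assumption): lowercase, then split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "quick brown fox",
    2: "quick red fox",
    3: "lazy brown dog",
}
index = build_inverted_index(docs)
print(index["quick"])
print(index["brown"])
```

A search for a term then becomes a dictionary lookup instead of a scan over every document.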

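shard

A shard is a Lucene instance; in other words, the shard is what Lucene itself treats as an "index". In Elasticsearch, one index is made up of one or more shards.

A "primary shard" holds the main copy of a portion of the data; a "replica shard" is a copy of a primary shard.

Replica shards mainly provide failover when a primary shard fails, and they also increase read throughput.

segment

Each shard usually contains several segments, and each segment is an inverted index.

When documents are indexed, Elasticsearch first collects them in memory (with a transaction log to make sure no data is lost). By default, once per second Elasticsearch writes a new segment and refreshes it so that users can search the data.

This is why Elasticsearch is described as a near-real-time search engine.

Although the new data is already searchable, it has not yet been fsync'ed to disk. At intervals Elasticsearch flushes the data: it fsyncs the segments (at that point the documents are committed) and clears the transaction log, because the new data has been written to disk.

The more segments an index contains, the longer a search takes, so Elasticsearch merges segments of similar size into larger ones in the background. (This process is called a merge; Elasticsearch uses the tiered merge policy by default.) Once the new segment has been produced, the old segments are discarded. During a merge, if there are many segments of similar size, it starts with the larger ones.

Segments are immutable. When a document in an index is updated, the old document is only marked as deleted and the new document is indexed. When a merge runs, the documents marked as deleted are physically removed, because they are not copied into the new segment.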
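The update-then-merge behavior described above can be illustrated with a small toy model. Everything here is hypothetical (the `ToyShard` class, one document per segment, in-memory tuples); it is a sketch of the idea and not Elasticsearch's or Lucene's implementation:

```python
class ToyShard:
    """Toy model: immutable segments plus a set of delete markers."""

    def __init__(self):
        self.segments = []    # each segment is an immutable tuple of docs
        self.deleted = set()  # (doc_id, version) pairs marked as deleted
        self.latest = {}      # doc_id -> newest version number

    def index(self, doc_id, body):
        old = self.latest.get(doc_id)
        if old is not None:
            # An update never rewrites a segment; it only marks the old copy.
            self.deleted.add((doc_id, old))
        version = (old or 0) + 1
        self.latest[doc_id] = version
        # Simplification: every indexed document lands in its own new segment.
        self.segments.append(((doc_id, version, body),))

    def live_docs(self):
        return [d for seg in self.segments for d in seg
                if (d[0], d[1]) not in self.deleted]

    def merge(self):
        # Only live documents are copied into the merged segment.
        self.segments = [tuple(self.live_docs())]
        self.deleted.clear()


shard = ToyShard()
shard.index("a", "first draft")
shard.index("b", "other doc")
shard.index("a", "second draft")
segments_before = len(shard.segments)
shard.merge()
print(segments_before, len(shard.segments), shard.segments[0])
```

The old copy of document "a" stays on disk (here: in the list) until the merge drops it, which is why deletes and updates do not free space immediately.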

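2015 Christmas Eve

咚一百, Merry Christmas. The Christmas midnight service, on this day every year.

Every year on this day, everyone from Tunghai (東海) has a memory tucked away in the heart;
I still remember the Tunghai wind of those days as if it were yesterday.

2015 Christmas Eve (平安夜)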

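Tuesday, December 23, 2014

[life] Life needs a change: a finance book every month

Life needs some changes. At the library this weekend I thought: why not set a plan and read a finance book every month.

This month's reading list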

30歲之後，你想要多有錢?
博客來-30歲之後，你想要多有錢？：就算完全不懂理財，這樣開始做股票、基金、買房，你一輩子有錢
http://www.books.com.tw/products/0010567339

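Sunday, December 21, 2014

[martial arts] 141220 (Sat) 141221 (Sun) Wu-style Tai Chi

On Saturday I arrived a little late.
I saw a senior sister asking the teacher about the details of one half of the form,
and after a while we went through the form once more.

When we finished, the teacher told me to ask a senior brother about push hands.

Hard power (剛勁)

After pushing hands, the senior brother told me that I need to train the supporting hard power in my arms.
The structure when issuing force feels as if the back of the spine (the mingmen point, 命門) is being propped up. To add forward force, keep the front foot empty and move forward, but push forward while keeping the body rigid and its angles unchanged.

Next is silk-reeling power (纏絲勁): it is not a force in one direction but a twisting of the whole body.

Sunday

After the form, the teacher said that the force comes out as soon as you relax, and that many movements in the form can only be applied once you are proficient.

While I practiced push hands with the senior brother, the teacher explained that for now we practice four-corners push hands (四隅推手); later there are many other kinds of push hands that let you sense the other person's force better.

After push hands I told the senior brother that my shoulder seemed unable to open, as if it were pinned down.
He said to bring the shoulder forward; it needs mobility before it can rotate.

He then mentioned a few ideas about how the form can train the tendons and bones.
Without balancing first, lift one leg straight up; in that unbalanced state, where the muscles cannot do the work, use the strength of the tendons to lift.
He showed me a book of Chen-style Tai Chi postures: many earlier masters lifted the foot while off balance, and measured against the vertical line the lifted foot was quite far from the supporting foot.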

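Friday, December 19, 2014

[think] On the purpose of a journey

"The sights themselves are never the purpose of a journey; the purpose is to find a new way of looking at things along the way."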

– Henry Miller

Thursday, December 18, 2014

[linux][redis]MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled.

MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.

Tuesday, December 16, 2014

[linux][redis](error) ERR max number of clients reached

127.0.0.1:6379> llen logimport
(error) ERR max number of clients reached

Maximum number of clients

In Redis 2.4 there was a hard-coded limit on the maximum number of clients that could be handled simultaneously.

In Redis 2.6 this limit is dynamic: by default it is set to 10000 clients, unless otherwise stated by the maxclients directive in redis.conf.

However, Redis checks with the kernel what the maximum number of file descriptors it is able to open is (the soft limit is checked). If that limit is smaller than the maximum number of clients we want to handle plus 32 (the number of file descriptors Redis reserves for internal use), then Redis lowers the maximum number of clients to match the number of clients it can really handle under the current operating system limit.

When the configured maximum number of clients cannot be honored, the condition is logged at startup, as in the following example:

$ ./redis-server --maxclients 100000
[41422] 23 Jan 11:28:33.179 # Unable to set the max number of files limit to 100032 (Invalid argument), setting the max clients configuration to 10112.

When Redis is configured in order to handle a specific number of clients it is a good idea to make sure that the operating system limit to the maximum number of file descriptors per process is also set accordingly.

Under Linux these limits can be set both in the current session and as a system-wide setting with the following commands:

ulimit -Sn 100000 # This will only work if hard limit is big enough.
sysctl -w fs.file-max=100000
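The arithmetic behind the log line above can be sketched as follows. This is a simplified model of the rule quoted from the Redis documentation, not Redis's actual code; the function name is made up, and the soft limit of 10144 is an assumption inferred from the log (10112 + 32), not a value stated in the text:

```python
RESERVED_FDS = 32  # file descriptors Redis keeps for internal use (per the text)


def effective_maxclients(requested, fd_soft_limit, reserved=RESERVED_FDS):
    """Return the client limit that fits under the file descriptor soft limit."""
    if fd_soft_limit >= requested + reserved:
        return requested
    # Not enough descriptors: keep the reserve and give the rest to clients.
    return max(fd_soft_limit - reserved, 0)


def current_soft_limit():
    """Read this process's soft limit for open files (Unix only)."""
    import resource
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


# Assumed soft limit of 10144, chosen to reproduce the log example.
print(effective_maxclients(100000, 10144))
```

Calling `effective_maxclients(10000, current_soft_limit())` on the Redis host gives a quick estimate of whether the default of 10000 clients fits, as long as the soft limit is a finite number.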

redis报-ERR max number of clients reached错误 – 酷喃｜coolnull｜

http://coolnull.com/2842.html

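Friday, December 12, 2014

[logstash] check field exists: testing whether an event contains a field

In a filter you often need to confirm that a field exists before going on.
A logstash conditional does this.

if ![someotherfield] {
  # do something that does not involve [someotherfield]
} else {
  # do something that involves [someotherfield]
}

This tests whether the event contains the field, so that the operations on it run only when the field exists.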

Thursday, December 11, 2014

[linux]using cron date variables

using cron date variables

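A common scenario: a job run from crontab needs to archive its results into a directory or file named after the current day, and for that you use a date command format.

Inside a crontab, however, a date format containing % breaks the command.

An example:

# For details see man 4 crontabs
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed

In a per-user crontab (edited with crontab -e) the user-name field is left out and the command follows the five time fields directly.

Used as written, the following line fails, because cron turns an unescaped % in the command into a newline and passes everything after the first % to the command as standard input.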

* * * * * /bin/bash /root/pc_back/logstash/script/stat_count.sh >> "/root/pc_back/logstash/script/stat_count.2stat."`date +%Y%m%d`

Each % must be escaped with a backslash ("\%") for date to run correctly:

* * * * * /bin/bash /root/pc_back/logstash/script/stat_count.sh >> "/root/pc_back/logstash/script/stat_count.stat."`date +\%Y\%m\%d`
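When crontab lines are generated by a script, the escaping can be applied mechanically. A minimal sketch, assuming the input command contains no already-escaped percent signs (the helper name is made up for illustration):

```python
def escape_for_crontab(command):
    """Backslash-escape every % so cron does not treat it as a newline."""
    return command.replace("%", "\\%")


# The date call from the example above, before and after escaping.
print(escape_for_crontab("date +%Y%m%d"))
```

Another way to avoid the problem entirely is to move the date formatting into the script that cron runs, so that the crontab line itself contains no % at all.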

Monday, December 8, 2014

[movie] Chef (五星主廚快餐車)

Chef (五星主廚快餐車) - Wikipedia, the free encyclopedia

http://zh.wikipedia.org/wiki/%E4%BA%94%E6%98%9F%E4%B8%BB%E5%BB%9A%E5%BF%AB%E9%A4%90%E8%BB%8A

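Sunday, December 7, 2014

[life] New life, new movement: PATHFINDER

There has been far too much nonsense going on lately,
but it has at least made some things, and some truths, clear to me.

There is no need to stop for anyone else;
instead, hold more firmly to your own ideas and act faster.

This is the start of a new-life movement.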

PATHFINDER ~

Friday, December 5, 2014

[python]remove null byte "TypeError: must be string without null bytes, not str"

problem

Traceback (most recent call last):
  File "change_wallpaper.py", line 39, in <module>
    os.popen(cmd_first+' ; '+cmd)
TypeError: must be string without null bytes, not str

solution

Use rstrip to remove the trailing NUL bytes from the command string before passing it to os.popen:

cmd = cmd.rstrip('\x00')
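A short sketch of the cleanup, assuming the NUL bytes come from a fixed-size buffer or a file read and are not meaningful data. The helper names and the sample string are hypothetical:

```python
def strip_trailing_nuls(text):
    """Remove NUL bytes only from the end of the string."""
    return text.rstrip("\x00")


def remove_all_nuls(text):
    """Remove every NUL byte, wherever it appears."""
    return text.replace("\x00", "")


raw = "echo hello\x00\x00"  # hypothetical command with trailing padding
print(repr(strip_trailing_nuls(raw)))
```

`rstrip` is enough when the padding is only at the end; `replace` is the blunter option when NUL bytes may also appear in the middle of the string.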

Thursday, December 4, 2014

[linux][ubuntu]W: GPG error

W: A error occurred during the signature verification.
The repository is not updated and the previous index files will be used. GPG error: http://extras.ubuntu.com quantal Release:
The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 16126D3A3E5C1192

W: GPG error: http://archive.canonical.com quantal Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32

If apt-get update reports a missing GPG key or a GPG key error,

first find the key ID named in the error message (here 3B4FE6ACC0B21F32), then fetch it and add it to apt:

# gpg --keyserver pgpkeys.mit.edu --recv-key 3B4FE6ACC0B21F32
# gpg -a --export 3B4FE6ACC0B21F32 | apt-key add -

Alternatively, let apt-key fetch the key directly:

sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 40976EAF437D05B5

note.

Be sure to replace the key ID in these commands with the one from your own error message.
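Picking the key IDs out of a long apt-get update log by eye is error-prone. A small sketch that extracts them with a regular expression, assuming the IDs are 16 hexadecimal digits as in the messages above (the function name is made up):

```python
import re


def missing_pubkeys(apt_output):
    """Return the distinct NO_PUBKEY ids in the order they first appear."""
    keys = []
    for key in re.findall(r"NO_PUBKEY\s+([0-9A-Fa-f]{16})", apt_output):
        if key not in keys:
            keys.append(key)
    return keys


# Sample text taken from the error messages quoted in this post.
sample = (
    "W: GPG error: http://extras.ubuntu.com quantal Release: "
    "NO_PUBKEY 16126D3A3E5C1192\n"
    "W: GPG error: http://archive.canonical.com quantal Release: "
    "NO_PUBKEY 40976EAF437D05B5 NO_PUBKEY 3B4FE6ACC0B21F32\n"
)
for key in missing_pubkeys(sample):
    print(key)
```

Each printed ID can then be passed to the apt-key command shown above.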

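Wednesday, December 3, 2014

[tech_note] Taipei's new mayor, Ko P, writes his policy white paper with GitBook

Taipei's new mayor writes a GitBook
https://www.gitbook.com/blog/authors/taipei-mayor-writes-gitbook

Taipei's new mayor, Ko Wen-je (柯文哲), used GitBook to write the official version of his policy white paper.

Dr. Ko Wen-je and his team won the election on Saturday (141129).

Earlier that week they decided to use GitBook to publish and share the official version of their policy white paper.

The white paper was a broad success: in the days before the election it drew 250,000 views in under five days and 36,000 likes on Facebook, and it was read by millions of Taipei residents.

It is interesting to see a politician embrace the digital era and use an e-book to share his plans and ideas openly with citizens. This transparency lets citizens feed their ideas back into the politician's campaign.

We see GitBook as a platform that lets anyone, regardless of social status, age, or experience, write and share ideas in the open. We hope Dr. Ko's example will encourage more people to join.

Finally, we congratulate Dr. Ko and his campaign team on their victory and wish them good luck. We hope that in the coming years they can help Taipei grow into a more international city.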

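Tuesday, November 25, 2014

[philosophy] 141125 Those who see first, those who never see, those who see later

I was in tears after reading the letter. Most people in this world never become aware at all; if some at least come to understand later, that is already a comfort...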

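Monday, November 24, 2014

[martial arts][Wu-style Tai Chi] 141122 (Sat) 141123 (Sun) Parting the Wild Horse's Mane and notes on push hands

Following Thursday's demonstration of the opening form, today I asked the teacher about Parting the Wild Horse's Mane (野馬分鬃). The teacher keeps stressing "strike at every opening, seize at every closing" (逢開必打，逢合必拿), and Parting the Wild Horse's Mane is a very advanced application of that.

I had originally asked how separating hands (分手) and Parting the Wild Horse's Mane differ in application, because my understanding was that the latter uses a shoulder strike.
When I clamped a hand on the back of the teacher's neck, the usual response would be an elbow strike. With Parting the Wild Horse's Mane he used the shoulder from the start to lock the joint, sank and relaxed, and I was under control; then he followed up with the other hand.
From last time I already knew that using all my strength only lets the teacher detect it sooner.

The teacher said, "Study attentively and carefully. It may take a few months, for some people several years, and once you are proficient you can use it.
Parting the Wild Horse's Mane and Fair Lady Works the Shuttles (玉女穿梭) are more advanced than obvious techniques such as Deflect, Parry, and Punch (搬攔捶) or Brush Knee and Twist Step (摟膝拗步).
Lead the opponent into emptiness, join, and issue: if he comes from the northeast, I go along with it and issue from the north;
if he comes from the southwest, I go along with it and issue toward the south."

Turn through an angle, follow the force back, and issue along the perpendicular diagonal.

I also observed the teacher's relaxed shoulders and sunken elbows, and saw how his arm and the force coil like a rope.
The teacher's favorite demonstration: one sink-and-relax, and the force goes out through the relaxed shoulder.

Every time after the form, the teacher tells me to ask the senior brother to teach me push hands. Lately I have been lucky enough to push hands with him.
The teacher keeps stressing that I should first follow the senior brother's force: yield exactly as much as he pushes, and slowly feel where the force is coming from.

On Sunday,
the teacher said, "Right now you only know it; being able to use it comes later."
Through constant practice, sometimes you simply work it out yourself.
He also said, "If you don't ask, I don't know where to start explaining."
Indeed, whenever I ask about a movement the teacher explains it with great excitement; without a starting point he really would not know where to begin.

Moreover, even for the same movement, the teacher explains it differently depending on who he is demonstrating with, so each explanation may differ a little.

In Sunday's push hands the senior brother said I need to engage the tendon at the elbow so that it can bear the load.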

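Thursday, November 20, 2014

[git]error: The following untracked working tree files would be overwritten by checkout:

This happens during git checkout:

$ git checkout 8c
error: The following untracked working tree files would be overwritten by checkout:

Those untracked files are usually not ones you want to keep,
so you can use

git clean -dfx

to remove untracked files from the working tree. Note that -x also deletes ignored files; running git clean -n -dfx first shows what would be removed without deleting anything.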

git-clean(1)
https://www.kernel.org/pub/software/scm/git/docs/git-clean.html

git merge - Git error: The following untracked working tree files would be overwritten by checkout - Stack Overflow
http://stackoverflow.com/questions/4858047/git-error-the-following-untracked-working-tree-files-would-be-overwritten-by-ch

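[martial arts] 141120 (Thu) The mystery of the Wu-style Tai Chi opening form

After several busy days in a row,
I went to bed a little earlier last night but still got up rather late.

When I got to practice, the first round of the form had already started.

During the second round the teacher talked about the opening form, walked over to me, and asked me to try it.

I pushed and lifted upward with my left and right hands while the teacher's hands pressed down. I could not feel any force in the teacher's hands at all; it just felt like I was lifting an object.

The first time, I lifted hard and found my center of gravity had shifted: my right hand went up and my left hand went down and off to the side.

The second time, I kept pushing hard and found that I had lost my footing entirely. I could not stand steadily and was carried to the left, hopping as I went.

Afterwards I told the teacher that this was truly mysterious. He said this is the most subtle part, but that with diligent practice I can develop it too.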

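Tuesday, November 18, 2014

[martial arts] Wu-style Tai Chi 141115 (Sat) 141116 (Sun): goals, learning push hands, and frame (間架)

On Saturday the teacher said:

Whatever you learn, learn it until it is refined and subtle; you cannot be muddle-headed about it. Only then do you have a goal.

At every practice the teacher reminds us to ask whatever questions we have. Any application or fighting method can be discussed, because in Wu family Tai Chi every movement has a use: strike at every opening, seize at every closing (逢開必打，逢合必拿). It is very fine and subtle.

Today the teacher specifically talked about four-square hands (四正手) and four-corner hands (四隅手).

Four-corner pushes toward the diagonals; four-square pushes toward the center, which is more demanding and harder to practice.

The teacher also walked through Nine Palace step (九宮步) push hands, saying that this kind of training makes you agile when facing an opponent, and that the Nine Palace step can be practiced in pairs or solo.

Watching this, I felt how rich the system is: from the form, to push hands, to application, and on to saber, spear, sword, and staff, and internal training methods.

Later the teacher told me to learn more from the senior brother. I gained a great deal this week. The senior brother is normally gentle and refined, but once the topic turns to the form he cannot stop talking. : )

While pushing hands with me, he made several points.

One is the directionality of how force changes in push hands: so far my power is only straight force, with no lateral force (橫力) or pressing yet.

You have to find balance within imbalance in order to use the strength of the tendons; only when the tendons are trained can you issue power (勁).

For example, when lifting the left foot in the opening form, you find balance within imbalance, and once you feel you are no longer using muscle, you mobilize the tendons of the whole body.

I discussed what I had observed with the teacher a little, but he said, "The nature of force is the same. Where do you feel it is different?" That sent me into fresh thought.

On Sunday, since new students had come, after one round of the form I continued learning from the senior brother and also chatted with him about teacher Tu Xingjian (涂行健).

Pushing hands again, the senior brother said my arms had too little strength. The main thing is that the power must not break; it has to be sustained, otherwise a skilled opponent will easily get in. This is peng power (掤勁): without it you cannot hold up.

He showed me a few methods, such as looking down, rounding the back, holding that posture, and then raising the arms. When he pushed me, my arms really did just hold their position without exerting force, and the force was sent down to the soles of the feet.

So he stressed to me again the training of the body's six joints, plus rotational force.

He also said my strength is decent; once I place myself correctly while moving, I will be able to resist, and then issuing force will be powerful.

Pushing hands with the senior brother made me think intuitively: is this not the frame (間架) that Yiquan (意拳) talks about? And the common principle he described, is that not what Yiquan calls the core of the internal martial arts?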

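Sunday, November 9, 2014

[think] I write, therefore I am

Ideas well up and spill into words.
I once thought that unless I could change, I would not write again.

A busy, mediocre life cut me off from my inner longing and from my own sincere thinking: a pure, though not necessarily flawless, "beginner's mind" (初心), and the path of seeking truth through thought.

I think that if I stop thinking, stop ruminating, and stop writing things down again and again, I will sink into a mire of stagnation where I am no longer my own master.

I write, therefore I am

14/11/9 (Sun), just after Lidong (立冬, the start of winter)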

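Thursday, November 6, 2014

[martial arts] Wu-style Tai Chi notes: practice more, ask more, explore more

In the elevator today the teacher said:

Practice more and ask more. Raise any question you have, not only about how to practice; it is even more important to ask when you do not understand an application.

In Wu family Tai Chi, the bow stance and sitting stance (弓坐腿) are maintained throughout the form, which places certain demands on body shape.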

Wednesday, November 5, 2014

[note]coarse-grained / fine-grained

In Computer Science (as you're looking for),

coarse-grained means 'monolithic'
fine-grained means 'modularized' or 'divided into smaller pieces'

For example, there are many kinds of architecture for web services:

monolithic architecture is coarse-grained architecture.
microservice architecture is fine-grained architecture.

These meanings are illustrated in this article.

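Thursday, October 23, 2014

[git] List the files changed by a commit / show commit changed files

Sometimes you want to see which files a commit changed without showing the commit's content.
Plain git show prints the content of the commit, so add --name-only:

git show [commit_id] --name-only

Example:
git show 8637187 --name-only

141023 Wu-style Tai Chi: ways of issuing force

After practice today I mentioned to the teacher that the structure and method of issuing force that senior brother Wen (溫) shared on Sunday seemed somewhat different from the teacher's.
The teacher smiled, thought for a moment, and said, "This is sinking and relaxing (鬆沉)," and then took up a sink-and-relax posture.
(like Grasp the Sparrow's Tail, 攬雀尾)

Chuckling, he said there is no absolute right or wrong in this; practice more, feel it attentively, and see what your own experience tells you.

A conversation like this is rich in philosophy and lets me go one layer deeper.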

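Friday, October 17, 2014

[book] The Tipping Point (引爆趨勢): the three rules of the tipping point

The three rules of the tipping point

The Law of the Few (少數原則)

Within a given process or system, a few people are what matter.

The Stickiness Factor (定著因素)

It means the message has an effect: it stays in your mind and will not go away.

The Power of Context (環境力量)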

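Friday, October 3, 2014

140928 (Sun) A first look at Wu-style Tai Chi push hands

140928 (Sun) A first look at Wu-style Tai Chi push hands

I went to practice today and first worked on the form for a while.
Then the teacher pushed hands with a senior brother, and after a while asked me to come and try.

For several months I have watched the senior brother and the teacher push hands, and during that time I have also compared it against Grandmaster Wu Jianquan's (吳鑑泉) form and push hands.

What the teacher taught today was basic push hands in four movements: starting with the right foot forward, attack with the right hand, then attack with the left hand; when the partner attacks, turn the waist to the left and sit back on the leg; when he attacks again, sit back and turn the waist.

This drill can be practiced in pairs and also solo. You can first train for correct posture without changing while defending; otherwise, when defending you can use rollback (捋).

- Thoughts on spear-style push hands (槍型推手)
This way of pushing hands is like spear-style push hands: first get your own posture right so that you can do the bow stance and the sitting stance.
It also helps you appreciate the subtleties of the form.

- Can also be practiced solo

The teacher also asked whether there are videos of Wu Dakui (吳大揆) online.
I found the two clips most commonly seen now: one of push hands on a lawn, and one of push hands and Tai Chi sword at home.

The teacher also said that the Wu family's wrestling is so formidable because Quanyou (全佑) himself was a wrestling champion, which is how he got into the palace as an imperial guard.
(On this point, it was more likely that Yang Luchan (楊露禪) taught in the prince's mansion, where there were other Manchu guards.)

As soon as he saw the video the teacher said, "Yes, yes, that's Dakui."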

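Saturday, September 27, 2014

[martial arts] 140927 (Sat) Wu-style Tai Chi: stages of practice and the process of learning

By the time I arrived today they were almost at the end of the form.
The teacher told me to move through it on my own; I did the form twice and asked him for corrections.

Phew. I went to bed late last night and was not very alert this morning, and I actually forgot the sequence of movements in the first section. A little sad.

The teacher added that learning the form is not finished once the movements have been explained to you once or recorded once.
Rather, you hear it again and again and it becomes more familiar; you practice until the movements are correct and fluent, and then gradually grasp the applications and intent. Each time you think and gain something different, so it is a process of continuous improvement.

He cited Wu Dakui (吳大揆), his senior in the lineage, who said: "Each generation of ours is weaker than the last. Jianquan was not the equal of Quanyou, my father was not the equal of Jianquan, and I am far behind my father. Even so, I am still unbeaten. If I meet someone better, I will train harder and reach higher."

What the teacher said about practice and pursuit set me thinking; it happened to echo the state I have been in recently.

"Just practice honestly" may be the lesson of this stage.

Today the teacher went over the application and intent of Embrace Tiger, Return to Mountain (抱虎歸山) and corrected it step by step.
Afterwards he also said I could come earlier next time.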

[mac]wxpython install

brew install wxmac

ImportError: No module named html2

Tuesday, September 16, 2014

[elasticsearch] Automatically insert a timestamp field into documents

The requirement: how can the indexing time be added to each record as it is indexed?

Logstash adds an @timestamp field to every event it indexes, but that field comes from Logstash, not from Elasticsearch.

An Elasticsearch mapping has a _timestamp field. Once it is enabled with a put mapping request, the indexing time is recorded automatically for every document.

Usage:

{
    "logs" : {
        "_timestamp" : { "enabled" : true }
    }
}

By default the _timestamp field has "store" set to false and "index" set to not_analyzed.

create index
PUT http://localhost:9200/indextest

put mapping

PUT http://localhost:9200/indextest/logs/_mapping
{
    "logs" : {
        "_timestamp" : { "enabled" : true }
    }
}

post data to index

POST http://localhost:9200/indextest/logs
{
    "name" : "elasticsearch"
}

search docs

GET http://localhost:9200/indextest/logs/_search?q=*&fields=_timestamp,_source

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
            {
                "_index": "indextest",
                "_type": "logs",
                "_id": "0AXft-9gTK-DUdfEvUZFiQ",
                "_score": 1,
                "_source": {
                    "name": "elasticsearch"
                },
                "fields": {
                    "_timestamp": 1410870337057
                }
            }
        ]
    }
}
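The _timestamp value in the response is a number of milliseconds since the Unix epoch. A minimal sketch of turning it into a readable UTC time (the helper name is made up; the value is the one from the response above):

```python
from datetime import datetime, timedelta, timezone


def timestamp_to_utc(millis):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    seconds, ms = divmod(millis, 1000)
    # Integer seconds plus a millisecond delta avoids float rounding.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=ms)


indexed_at = timestamp_to_utc(1410870337057)
print(indexed_at.isoformat())
```

The result falls on September 16, 2014 (UTC), which matches the date of this post.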

cf.
peicheng-note: Elasticsearch-related articles
http://peichengnote.blogspot.tw/search/label/elasticsearch

_timestamp
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html

peicheng note: [elasticsearch]get all facet term in elasticsearch / get the list of all aggregation terms
http://peichengnote.blogspot.tw/2014/09/elasticsearchget-all-facet-term-in.html

peicheng note: [elasticsearch]safely reload configuration from elasticsearch.yml
http://peichengnote.blogspot.tw/2014/08/elasticsearchsafely-reload.html

peicheng note: [elasticsearch] url query_string length limit
http://peichengnote.blogspot.tw/2014/08/elasticsearch-url-querystring-length.html

peicheng note: [elasticsearch] about brain split / a cluster splitting into two clusters
http://peichengnote.blogspot.tw/2014/07/elasticsearch-brain-split-cluster-split.html

peicheng note: [elasticsearch] more on the _all field
http://peichengnote.blogspot.tw/2014/06/elasticsearch-all-field.html

peicheng note: [elasticsearch]range query depends on the field type
http://peichengnote.blogspot.tw/2014/06/elasticsearchrange-query-depends-on.html

peicheng-note: [elasticsearch] document id _id field uuid
http://peichengnote.blogspot.tw/2014/05/elasticsearch-document-id-id-field-uuid.html

peicheng-note: [elasticsearch/logstash] logstash id: automatically generated document id "_id" / automatic id generation
http://peichengnote.blogspot.tw/2014/04/elasticsearchlogstash-logstash-id.html

peicheng note: [elasticsearch] index size , shard size , heap size design
http://peichengnote.blogspot.tw/2014/07/elasticsearch-index-size-shard-size.html

peicheng note: [Elasticsearch] NumberFormatException / Invalid shift value in prefixCoded bytes
http://peichengnote.blogspot.tw/2014/08/elasticsearch-numberformatexception.html

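Sunday, September 14, 2014

[diary] 2014.09.13 (Sat) Documentary screening | Yang Mu: Towards the Completion of a Poem

Event information

====
9/13 (Sat) 19:00~21:00  Documentary screening | Yang Mu (楊牧): Towards the Completion of a Poem
Held at the International Conference Hall, 3F, National Central Library (No. 20, Zhongshan S. Rd., Taipei)

[Event notes]
1. Registered attendees may take any seat from 18:30 on 9/13 (Sat); the screening starts promptly at 19:00.

We look forward to seeing you.

Trend Education Foundation (趨勢教育基金會)

====

It turns out Yang Mu's influence is rooted not only in my own heart: so many contemporary poets appeared to speak about him, about how he supported younger writers and opened up new ground for Chinese-language poetry. One line in the film put it well: the calm that Yang Mu's life and family maintain gives him the inner stillness to create wildly. The whole documentary hit me hard...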

Thursday, September 11, 2014

[ruby][WARNING] MultiJson is using the default adapter (ok_json).We recommend loading a different JSON library to improve performance.

[WARNING] MultiJson is using the default adapter (ok_json).We recommend loading a different JSON library to improve performance.

[linux]monitor your process network / iftop / NetHogs: monitoring per-process traffic

To see network traffic on Linux you can read the interface counters:

cat /sys/class/net/eth0/statistics/rx_bytes
or
cat /sys/class/net/eth0/statistics/tx_bytes
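The counters above are cumulative byte totals, so a rate has to be computed from two readings. A minimal sketch, assuming a Linux host and an interface named eth0 as in the commands above; the function names and the one-second interval are arbitrary choices:

```python
import time
from pathlib import Path


def read_counter(interface, counter):
    """Read one cumulative counter, e.g. rx_bytes, from sysfs."""
    path = Path("/sys/class/net") / interface / "statistics" / counter
    return int(path.read_text())


def bytes_per_second(before, after, seconds):
    """Turn two cumulative readings into an average rate."""
    return (after - before) / seconds


def sample_rate(interface="eth0", counter="rx_bytes", interval=1.0):
    """Sample the counter twice and return bytes per second."""
    before = read_counter(interface, counter)
    time.sleep(interval)
    after = read_counter(interface, counter)
    return bytes_per_second(before, after, interval)
```

Calling `sample_rate("eth0", "tx_bytes")` would give the outbound rate over one second; nothing is read from sysfs until one of the sampling functions is called.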

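For a more intuitive view of network traffic, use

iftop - display bandwidth usage on an interface by host
It lists the inbound and outbound traffic for each host.

To see traffic in more detail, per process, use nethogs
nethogs - Net top tool grouping bandwidth per process

nethogs lists the bandwidth consumed by each process.

On Red Hat family systems, installing them only takes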

yum install -y iftop
yum install -y nethogs

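Tuesday, September 9, 2014

[martial arts][Wu-style Tai Chi] 140904 (Thu) 140906 (Sat) Tai Chi closing form and bow stance

Every movement has a use

Talking about applications, the teacher said that what Wu Dakui (吳大揆) stressed most at the time was that even the closing form has a use.

I asked the teacher about the closing form. He told me to clamp tightly onto his neck, and to my surprise I was lifted straight up. It was remarkable.

"Sinking and relaxing further down may be something you have not tried before, and some people may even be afraid of it, so sink and relax a little at a time, improving each time."

This week I got to see the senior brother and the teacher do advancing-step push hands (進步推手).

The teacher mentioned Raise Hands and Step Up (提手上勢) and the force of the lean (靠) that comes from turning the hips. It is all very fine-grained.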

Wednesday, September 3, 2014

[elasticsearch]get all facet term in elasticsearch / get the list of all aggregation terms

Elasticsearch's aggregations module is a convenient way to obtain multi-bucket data.
It can dynamically count the distinct values of a field.

Example:

{
    "aggs" : {
        "genders" : {
            "terms" : { "field" : "gender" }
        }
    }
}

Response:

{
    ...

    "aggregations" : {
        "genders" : {
            "buckets" : [
                {
                    "key" : "male",
                    "doc_count" : 10
                },
                {
                    "key" : "female",
                    "doc_count" : 10
                }
            ]
        }
    }
}

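In particular, the terms aggregation accepts a size parameter that asks for statistics on the top N values. (size is a parameter used in many places in Elasticsearch.)

And what if you want every distinct value of the field?

Set size to 0, which is treated as Integer.MAX_VALUE.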

{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "product",
                "size" : 0
            }
        }
    }
}

Another option is the Cardinality Aggregation,
which returns an approximate count of the distinct values in the field you specify.

{
    "aggs" : {
        "author_count" : {
            "cardinality" : {
                "field" : "author"
            }
        }
    }
}
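The request bodies above are plain JSON, so they are easy to build and read back from a script. A small sketch using only dictionaries; the helper names are made up, and the sample response is a trimmed copy of the one shown earlier in this post:

```python
def terms_agg(name, field, size=None):
    """Build the request body for a terms aggregation."""
    terms = {"field": field}
    if size is not None:
        terms["size"] = size
    return {"aggs": {name: {"terms": terms}}}


def bucket_counts(response, name):
    """Flatten the buckets of one aggregation into {key: doc_count}."""
    buckets = response["aggregations"][name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Trimmed copy of the example response from this post.
sample_response = {
    "aggregations": {
        "genders": {
            "buckets": [
                {"key": "male", "doc_count": 10},
                {"key": "female", "doc_count": 10},
            ]
        }
    }
}
print(terms_agg("products", "product", size=0))
print(bucket_counts(sample_response, "genders"))
```

The dictionary returned by `terms_agg` can be serialized with `json.dumps` and sent as the body of a search request by whatever HTTP client is in use.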

Cardinality Aggregation
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html


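Tuesday, September 2, 2014

[hadoop][best practices] how to choose the appropriate linux file system for HDFS

The Hadoop Distributed File System (HDFS) is platform independent and can run on top of any underlying file system and operating system. Linux offers a variety of file system choices, and each has a different effect on HDFS performance.

As a general best practice, mount the disks that hold Hadoop data with the "noatime" option, so that access times are not updated on every read. This speeds up file reads.

Three file systems are popular choices:

Ext3
Ext4
XFS

Yahoo uses ext3 as the default file system for its Hadoop deployments, and ext3 is also the default file system of many operating systems. HDFS on ext3 has therefore been tested extensively by Yahoo, which probably makes it the safer choice.

ext4 is the successor to ext3. It performs better with large files. ext4 also has delayed allocation of data, which adds a little risk in the event of an unplanned service outage but reduces fragmentation and improves performance.

XFS offers better disk space utilization than ext3 and formats much faster. This means you can put a datanode on XFS into service sooner.

Disk I/O is a major factor in Hadoop performance. ext3 has been widely used with Hadoop, while ext4 and XFS can be used to get better performance.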

Best Practices: Linux File Systems for HDFS - Hortonworks http://zh.hortonworks.com/kb/linux-file-systems-for-hdfs/

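Monday, September 1, 2014

[martial arts] 140830 (Sat) 140831 (Sun) Wu-style Tai Chi: White Crane Spreads Its Wings and Grasp the Sparrow's Tail

In the spirit of "the palest ink beats the best memory," I am writing down what I thought, learned, and heard.

I arrived a bit late this Saturday; one round of the form was already done. I heard the teacher was examining the posture White Crane Spreads Its Wings (白鶴亮翅).

White Crane Spreads Its Wings seems to have two parts. Coming from Raise Hands and Step Up (提手上勢): open and inhale, the left hand presses down, the right hand presses forward, bend at the hips and turn the waist, the left hand lifts to the rear (turning the hand slightly, with the planes of the two hands at a little more than ninety degrees), rise with the back leaning (靠), turn to face front, the left hand goes down, and the right hand rolls back (捋).

At first I watched the teacher and the senior brothers discuss the sink-and-relax application of left hand down, right hand rolling back. Most demonstrations of Wu family Tai Chi that I have seen use throws, and I had assumed that the teacher, given his age, would not find such demonstrations convenient. When the teacher told us to move on our own, I asked him why the right hand presses forward and the left hand presses straight down.

He immediately had me grab his arms from behind with both hands. The first time, the teacher sank and relaxed and shook me off. Perhaps because it was a first demonstration, he noticed that I had not gripped hard, so he had me grab him again. Once I had gripped tightly and made sure I was standing firmly, the teacher said "sink and relax" and I was uprooted and thrown out to the front right, entirely without sensing anything, as though my whole body were being carried on the teacher's and then flung away. This White Crane Spreads Its Wings felt just like the time in Beijing when I met teacher Liang Zhongcheng (梁中成; he studied under Grandmaster Zhan Shengtang (戰升堂), who was the only student of Li Wenjie (李文杰), the last disciple of Wang Maozhai (王茂齋)) and was thrown by a change of force while pushing hands in Yannan Garden (燕南園) at Peking University: a complete sensation of weightlessness.

Later the teacher and a senior brother demonstrated advancing-step push hands. The teacher said that moving-step push hands alone comes in several kinds: starting from following step (順步), where you can move and step freely, then twist-step (拗步) push hands, turning push hands, and at the highest level, striding-step (跨步) push hands.

Here I observed something that resolved a long-standing question. In Grandmaster Wu Jianquan's (吳鑑泉) form, and even in photos of many late-Qing masters, the upper body is almost always like / and the lower body somewhat like \, forming a shape like the character ㄑ. I saw that the teacher's body shape is the same.

The teacher said that Wu family Tai Chi is fine-grained, and he hopes everyone can absorb as much of it as possible.

Method, talent, companions, place (法才侶地)

Have a method to train with

Have ability

Have companions, whether a bit better than you, a bit worse, or about your level: when his first hand comes and changes, you can answer with a second hand, a third hand, and so on

Have a place to practice where you can fall and tumble

Wu-style Tai Chi is very fine-grained

Only after it becomes refined does it slowly become fine

Some things you may not understand after practicing or hearing them once, and the movements are still not up to standard. Do them slowly three or four times until the movements are standard, then absorb the applications and intent.

On Sunday the teacher told some stories from the Cultural Revolution period in China, along with his view on how disbanding the army led to the army being taken over. He also talked about the mass public trials and mentioned the story of Jiang Qing (江青) saying he had killed 60 million people.

Sink and relax, relax and be agile (鬆沈鬆靈)

Learning never ends; keep seeking higher masters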

Sunday, August 31, 2014

[tech]"parse.com" api how to do "like" query or "contains" query / whitespace

You can get what you need with a regex of the form "^\QJ\E". The whole query will look like the following:

curl -H "X-Parse-Application-Id: $app_id" \
     -H "X-Parse-REST-API-Key:  $api_key" \
     -G --data-urlencode 'where={"first_name" : { "$regex" : "^\QJ\E" } }' \
     https://api.parse.com/1/users/
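The where clause in the curl call above can also be built programmatically. A minimal sketch that only constructs the JSON and the URL-encoded query string and sends nothing; the helper name is made up, and the field name comes from the example above:

```python
import json
from urllib.parse import urlencode


def prefix_where(field, prefix):
    """Build a where clause matching values that start with `prefix`."""
    if "\\E" in prefix:
        # \E would end the \Q...\E literal quoting early.
        raise ValueError("prefix must not contain \\E")
    return json.dumps({field: {"$regex": "^\\Q" + prefix + "\\E"}})


where = prefix_where("first_name", "J")
query_string = urlencode({"where": where})
print(where)
print(query_string)
```

`json.dumps` doubles the backslashes so that the clause stays valid JSON; after decoding, the pattern is the same `^\QJ\E` as in the text.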

Are "like" or regex queries possible via the REST API? | Parse
https://www.parse.com/questions/are-like-or-regex-queries-possible-via-the-rest-api

Tuesday, August 26, 2014

[linux]shell script add zero padding

How do you pad a number with leading zeros in a shell script?

for i in $(seq -f "%05g" 10 15)
do
  echo $i
done

will produce the following output:

00010
00011
00012
00013
00014
00015

More generally, bash has printf as a built-in so you can pad output with zeroes as follows:

$ i=99
$ printf "%05d\n" $i
00099

You can use the -v flag to store the output in another variable:

$ i=99
$ printf -v j "%05d" $i
$ echo $j
00099
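For comparison, the same padding in Python, in case the numbers are produced by a script and not by the shell. A minimal sketch; the function name and the default width of 5 are arbitrary choices mirroring the examples above:

```python
def zero_pad(number, width=5):
    """Left-pad a non-negative integer with zeros to the given width."""
    return f"{number:0{width}d}"


padded = [zero_pad(i) for i in range(10, 16)]
print(padded)
print(zero_pad(99))
print(str(99).zfill(5))  # equivalent for non-negative integers
```

The format specification `05d` plays the same role as the `%05d` given to printf.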

numbers - Zero Padding In Bash - Stack Overflow
http://stackoverflow.com/questions/8789729/zero-padding-in-bash

[linux] ubuntu ssh slow appear

ubuntu ssh slow appear

On Ubuntu, the ssh command often pauses for a while before the password prompt appears.
Running ssh with the -vvv option prints the debug messages.

debug2: bits set: 528/1024
debug1: ssh_rsa_verify: signature correct
debug2: kex_derive_keys
debug2: set_newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug2: set_newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: Roaming not allowed by server
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug2: key: /home/peicheng/.ssh/id_rsa (0x7f81c3da6ef0)
debug2: key: /home/peicheng/.ssh/id_dsa ((nil))
debug2: key: /home/peicheng/.ssh/id_ecdsa ((nil))
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug3: start over, passed a different list publickey,gssapi-keyex,gssapi-with-mic,password
debug3: preferred publickey,keyboard-interactive,password
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /home/peicheng/.ssh/id_rsa
debug3: send_pubkey_test
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Trying private key: /home/peicheng/.ssh/id_dsa
debug3: no such identity: /home/peicheng/.ssh/id_dsa
debug1: Trying private key: /home/peicheng/.ssh/id_ecdsa
debug3: no such identity: /home/peicheng/.ssh/id_ecdsa
debug2: we did not send a packet, disable method
debug3: authmethod_lookup password
debug3: remaining preferred: ,password
debug3: authmethod_is_enabled password
debug1: Next authentication method: password

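One likely cause is GSSAPIAuthentication: the client tries GSSAPI authentication first and waits for it to fail.
You can turn it off in /etc/ssh/ssh_config (edit it with sudo vim) or in ~/.ssh/config
by adding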
GSSAPIAuthentication no
GSSAPIDelegateCredentials no

Why ssh “password” prompt takes too long to appear? - Ask Ubuntu
http://askubuntu.com/questions/246323/why-ssh-password-prompt-takes-too-long-to-appear

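Monday, August 18, 2014

[elasticsearch] The difference between the transport and http modules

The Elasticsearch Java client talks to the cluster over the TCP/IP transport.
If writing a Java client is not an option, use an HTTP client and communicate over HTTP on port 9200.

[elasticsearch]safely reload configuration from elasticsearch.yml

After editing elasticsearch.yml, how do you safely reload the new configuration?

At present the only way to reload the configuration is to restart the service.
So, to keep the cluster service free of downtime, before each restart use the API to check the status of every node's service, confirm that every shard already has a copy on another node, and then restart the nodes one at a time.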

http://esnode:9200/_cluster/health
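The check before each restart can be scripted. A minimal sketch, assuming the node answers on the URL shown above and that a green status with no relocating, initializing, or unassigned shards is an acceptable signal to proceed; the function names and that policy are assumptions, not an official procedure:

```python
import json
from urllib.request import urlopen


def fetch_health(base_url):
    """GET /_cluster/health and return the parsed JSON document."""
    with urlopen(base_url.rstrip("/") + "/_cluster/health") as response:
        return json.load(response)


def safe_to_restart_node(health):
    """True when every shard copy is allocated and nothing is moving."""
    return (
        health.get("status") == "green"
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


# Hypothetical responses, for illustration only.
print(safe_to_restart_node({"status": "green", "unassigned_shards": 0}))
print(safe_to_restart_node({"status": "yellow", "unassigned_shards": 3}))
```

In a rolling restart, a script would call `safe_to_restart_node(fetch_health("http://esnode:9200"))` and wait until it returns True before restarting the next node.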

cat health
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/cat-health.html

