Introduction and benchmarking of MeCab.jl #JapanR
![Page 1: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/1.jpg)
I tried building MeCab.jl
Michiaki Ariga @chezou
Japan.R 2014 @Freakout
![Page 2: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/2.jpg)
One important thing
![Page 3: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/3.jpg)
I won't be talking about [image on slide]
![Page 4: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/4.jpg)
I came from Tokyo
![Page 5: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/5.jpg)
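About me
Michiaki Ariga (有賀康顕) / @chezou
Software engineer @Cookpad
Service development for the core Cookpad product
Lately working on recommendations and similar things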
JuliaTokyo / MLCT / kawasaki.rb
![Page 6: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/6.jpg)
[image on slide] and me
![Page 7: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/7.jpg)
![Page 8: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/8.jpg)
I got talked into it 😇
![Page 9: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/9.jpg)
The Julia Advent Calendar is looking for participants!
http://qiita.com/advent-calendar/2014/julialang
![Page 10: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/10.jpg)
![Page 11: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/11.jpg)
A sticker: only about 10 of them exist in Japan
![Page 13: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/13.jpg)
It has the most stars!!!
![Page 14: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/14.jpg)
Things I have made
Julia 100 Knocks (Julia100本ノック)
ConfidenceWeighted.jl
MeCab.jl
![Page 15: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/15.jpg)
MeCab.jl
A Julia wrapper for MeCab, the morphological analyzer written by Kudo-san
With this, you can do natural language processing in Julia!!1
![Page 16: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/16.jpg)
DEMO
![Page 17: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/17.jpg)
So, is it fast?
![Page 18: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/18.jpg)
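Benchmark: average processing time [sec]
[Bar chart: y-axis from 0 to 0.8 sec; bars for Ruby(node), Julia(node), Ruby, Julia, R]
• Target: blog data (734 kB)
• Task: word frequency counting
• Average of 10 runs
• Two patterns: Node and surface form
• R uses RMeCabFreq()
https://gist.github.com/chezou/1f947423c6655c266e0a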
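The measurement procedure listed above (count word frequencies, average over 10 runs) can be sketched roughly as follows. This is not the code from the gist: `tokenize`, `count_words`, and `mean_seconds` are hypothetical names, the tokenizer is a whitespace split standing in for MeCab, and the sample text is made up.

```julia
# Stand-in tokenizer (assumption): splits on whitespace. The real benchmark
# tokenized Japanese text with MeCab; this only keeps the sketch runnable.
tokenize(text::AbstractString) = split(text)

# Count how often each token occurs.
function count_words(text::AbstractString)
    counts = Dict{String,Int}()
    for token in tokenize(text)
        word = String(token)
        counts[word] = get(counts, word, 0) + 1
    end
    return counts
end

# Run `f` `runs` times and return the mean wall-clock time in seconds.
function mean_seconds(f, runs::Integer)
    total = 0.0
    for _ in 1:runs
        total += @elapsed f()
    end
    return total / runs
end

# Made-up sample input; the talk used 734 kB of blog data.
text = repeat("the quick brown fox jumps over the lazy dog ", 1000)

count_words(text)  # warm-up call so JIT compilation is not timed
avg = mean_seconds(() -> count_words(text), 10)
println("mean time over 10 runs: ", avg, " sec")
```

The warm-up call matters in Julia because the first call to a function includes compilation time, which would otherwise inflate the average.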
![Page 19: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/19.jpg)
Whaa!?
![Page 20: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/20.jpg)
What is this...!?
![Page 21: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/21.jpg)
I... I'll tell you exactly what just happened!
![Page 22: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/22.jpg)
"I thought I was fighting R, but before I knew it I was fighting C."
![Page 23: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/23.jpg)
RMeCab turned out to be C
![Page 24: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/24.jpg)
By the way...
![Page 25: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/25.jpg)
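Benchmark: average processing time [sec]
[Bar chart: y-axis from 0 to 0.8 sec; bars for Ruby(node), Julia(node), Ruby, Julia, R, Julia(w/o gc)]
• Target: blog data (734 kB)
• Task: word frequency counting
• Average of 10 runs
• Two patterns: Node and surface form
• R uses RMeCabFreq()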
![Page 26: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/26.jpg)
I just needed to suppress the GC 😇
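The slides do not show how the GC was suppressed for the "Julia(w/o gc)" bar. One way to take such a measurement, sketched with `GC.enable` from current Julia (not necessarily what the talk used), is to switch the collector off only around the timed call. `elapsed_without_gc` and `make_strings` are hypothetical names, and the workload is a made-up allocation-heavy stand-in.

```julia
# Time `f` with the garbage collector switched off, restoring it afterwards.
function elapsed_without_gc(f)
    GC.gc()                         # collect first so the run starts clean
    was_enabled = GC.enable(false)  # returns the previous GC state
    try
        return @elapsed f()
    finally
        GC.enable(was_enabled)      # restore the previous state even on error
    end
end

# Made-up allocation-heavy workload (assumption, not the talk's benchmark).
make_strings(n) = [string(i) for i in 1:n]

make_strings(10)  # warm-up call so JIT compilation is not timed
println("with gc:    ", @elapsed(make_strings(100_000)), " sec")
println("without gc: ", elapsed_without_gc(() -> make_strings(100_000)), " sec")
```

With the collector off, memory use grows for the whole duration of the call, so this only suits short, bounded workloads such as a benchmark run.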
![Page 27: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/27.jpg)
The Julia Advent Calendar is looking for participants!
http://qiita.com/advent-calendar/2014/julialang
![Page 28: Introduction and benchmarking of MeCab.jl #JapanR](https://reader033.fdocuments.net/reader033/viewer/2022060202/559c73561a28ab9d088b46ea/html5/thumbnails/28.jpg)
Julia's Pros/Cons
Pros
You can write bindings without writing any C code
Cons
C++ is (still) painful
gc!!!
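To illustrate the "Pros" point, here is a minimal sketch of calling a C function from Julia with `ccall` and no C glue code. It calls `strlen` from the C standard library rather than anything from MeCab, and it assumes a platform where that symbol is visible in the running process (typical on Linux and macOS). `c_strlen` is a hypothetical helper name.

```julia
# Call the C standard library's strlen directly; no C wrapper code is needed.
# Arguments to ccall: function symbol, return type, argument types, arguments.
c_strlen(s::String) = Int(ccall(:strlen, Csize_t, (Cstring,), s))

println(c_strlen("MeCab"))
```

`strlen` counts bytes, not characters, so multi-byte UTF-8 text such as Japanese yields a larger number than `length(s)`.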