Speeding up Zabbix, the open-source integrated monitoring software, with NoSQL!
Copyright © 2000-2014 MIRACLE LINUX CORPORATION All rights reserved
2014/1/24 MIRACLE LINUX CORPORATION
Senior Expert, Kazuhiro Yamato
@Cassandra Summit JPN 2014
Self-Introduction
Kazuhiro Yamato (大和 一洋)
A software engineer working for MIRACLE LINUX CORPORATION (http://www.miraclelinux.com/)
Interested in:
- Low-level layer technologies (kernel, libgc, glib, C++, debugging)
- Multimedia (video/audio processing) using OSS such as GStreamer
MIRACLE LINUX CORPORATION
- Asianux (based on RHEL)
- Embedded Solutions (BSP and customization service)
  * Devices with a display (digital signage)
  * Network devices (routers)
- MIRACLE ZBX (described later)
- MIRACLE System Savior(System backup software)
- Support for our products and others (such as CentOS, RHEL, MontaVista, Wind River)
ZABBIX and Our Study
~ An idea for its improvement and the implementation ~
What is ZABBIX ?
Enterprise-level monitoring system
- Developed by ZABBIX SIA (http://www.zabbix.com/)
- One of the most popular OSS monitoring solutions.
- With rich features and the graphical Web interface.
A picture of monitoring system
ZABBIX Server
MIRACLE ZBX
ZABBIX(+ fixes)
OS(LAMP)
Enterprise monitoring solution based on ZABBIX
- provided by MIRACLE LINUX CORPORATION
- OS + ZABBIX (+ fixes) + H/W + Support + Consulting
Want to monitor more devices
Recently, the number of monitored devices has been increasing due to new technologies:
IaaS (OpenStack), Hadoop, NoSQL …
Some users want to monitor more than a few thousand devices with a single monitoring system.
How many devices can be monitored ?
E.g. ZBX8000a (Xeon 3.40GHz), MIRACLE LINUX's flagship appliance => ~1000 devices
Interval: 5 min, Monitoring points: ~100/device, No log/trap monitoring
However, it is hard to answer in general. It largely depends on the situation, such as:
- Monitoring interval
- Number of monitoring points for each device
- Kind of target value (number or string)
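As a rough illustration of why the answer depends on these parameters, the sketch below estimates how many values per second the server has to store. The function name and the second scenario are made up for this example; only the first scenario's figures (~1000 devices, ~100 points per device, 5-minute interval) come from the slide.

```python
def values_per_second(devices: int, points_per_device: int, interval_sec: float) -> float:
    """Average number of monitored values arriving per second."""
    return devices * points_per_device / interval_sec

# Figures from the slide: ~1000 devices, ~100 points each, 5-minute interval.
baseline = values_per_second(1000, 100, 5 * 60)
print(f"{baseline:.0f} values/sec")  # 333 values/sec

# Hypothetical scenario: same devices, but a 1-minute interval.
faster = values_per_second(1000, 100, 60)
print(f"{faster:.0f} values/sec")  # 1667 values/sec
```

Shortening the interval or adding monitoring points raises the write rate in direct proportion, which is why a single device count is hard to quote.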
Our challenge
Increase the number of monitoring targets (or shorten the monitoring interval) with ZABBIX
Original software stack of ZABBIX
[Figure: several targets (ZABBIX agent); a server machine running the ZABBIX server, the RDBMS, and the Frontend (PHP) on Linux + Apache; the operator's PC (browser)]
The RDBMS holds almost all data:
- Configuration
- User info.
- Monitored value
- Event
- …
Where is the bottleneck?
RDBMS (MySQL, PostgreSQL, or ...)
<Observation>
As the number of target devices increases, the following also increase:
● CPU load of mysqld
● IOwait
<Source code (Architecture) of ZABBIX>
Uses many SQL statements
What kind of data uses the DB?
Example: Ratio of data size in DB
● History: monitored values
  ○ CPU load
  ○ free memory
  ○ free disk space
● Trend: a kind of compressed history data (min, avg, and max for each period)
● Others: less than 1% (Event, Item, etc.)
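The relation between History and Trend can be sketched as follows. This is an illustration only, not ZABBIX's actual code: the one-hour period, the `(clock, value)` tuple layout, and the function name `to_trends` are assumptions made for the example.

```python
from collections import defaultdict

PERIOD_SEC = 3600  # assumed aggregation period (one hour)

def to_trends(history):
    """Collapse (unix_time, value) samples into per-period (min, avg, max)."""
    buckets = defaultdict(list)
    for clock, value in history:
        # Key each sample by the start time of the period it falls into.
        buckets[clock - clock % PERIOD_SEC].append(value)
    return {
        start: (min(vals), sum(vals) / len(vals), max(vals))
        for start, vals in sorted(buckets.items())
    }

# Four CPU-load samples: three in the first hour, one in the second.
history = [(0, 0.5), (1200, 1.5), (2400, 1.0), (3600, 2.0)]
for start, (lowest, average, highest) in to_trends(history).items():
    print(start, lowest, average, highest)
```

However many samples a period contains, it shrinks to three numbers, which is why Trend data stays small compared with History.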
Characteristics of History data
● Basically append-only
● Deleted after a while
● Read when a chart is drawn
Strict consistency (and SQL features): not really needed
Our idea to increase monitoring targets
Save History data in a NoSQL DB
● Some NoSQL DBs are faster than an RDBMS
In addition:
● Easy to back up (replicate)
● Scalability through clustering, and the possibility of predicting failures by analyzing the data
Our proposed software stack
[Figure: the original stack (targets running the ZABBIX agent, ZABBIX server, RDBMS, Frontend (PHP), Linux + Apache, operator's PC with a browser) plus NoSQL nodes]
● History: stored in NoSQL (scale-out if needed)
● Everything else: stored in the RDBMS
RDBMS + NoSQL: Hybrid architecture
Connection with NoSQL
[Figure: ZABBIX (ZABBIX Server and ZABBIX Frontend) connects to HistoryGluon, which connects to NoSQL]
● ZABBIX
  ○ ZABBIX Server: Client library for C-language
  ○ ZABBIX Frontend (PHP): PHP extension, Client library for C-language
● Connection: Socket (TCP), HistoryGluon's own binary protocol
● HistoryGluon (Java)
  ○ Core Logic
  ○ Backend modules
    ■ HBaseDriver: HBase
    ■ CassandraDriver: Cassandra
    ■ RiakDriver: Riak
    ■ MemDriver: for test; not scaled-out and volatile
HistoryGluon
● Newly developed for this study
● Can switch NoSQL DBs
  ○ HBase, Cassandra, Riak
● Provides common interfaces
  ○ E.g. insert and query
● Client libraries
  ○ C/C++, PHP, Ruby
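The idea of a common interface over switchable backends can be sketched as below. HistoryGluon itself is written in Java and its real class and method names are not shown in these slides, so everything here (`HistoryDriver`, `insert`, `query`, and this in-memory driver) is a hypothetical illustration in Python, loosely modeled on the MemDriver backend.

```python
import bisect
from abc import ABC, abstractmethod

class HistoryDriver(ABC):
    """Hypothetical common interface that every backend would implement."""

    @abstractmethod
    def insert(self, item_id: int, clock: int, value: float) -> None: ...

    @abstractmethod
    def query(self, item_id: int, start: int, end: int) -> list:
        """Return (clock, value) pairs with start <= clock <= end."""

class MemDriver(HistoryDriver):
    """Volatile in-memory backend, useful only for tests."""

    def __init__(self):
        self._rows = {}  # item_id -> list of (clock, value), kept sorted

    def insert(self, item_id, clock, value):
        bisect.insort(self._rows.setdefault(item_id, []), (clock, value))

    def query(self, item_id, start, end):
        rows = self._rows.get(item_id, [])
        lo = bisect.bisect_left(rows, (start, float("-inf")))
        hi = bisect.bisect_right(rows, (end, float("inf")))
        return rows[lo:hi]

# The caller depends only on the interface, so the backend can be swapped.
driver: HistoryDriver = MemDriver()
for clock, value in [(30, 0.7), (10, 0.2), (20, 0.4)]:
    driver.insert(1, clock, value)
print(driver.query(1, 10, 20))  # [(10, 0.2), (20, 0.4)]
```

A driver for a real NoSQL DB would implement the same two methods, leaving the calling code unchanged.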
Benchmark
Performance index
● Throughput
  ○ Number of histories written or read in a certain period
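A minimal sketch of how such a throughput figure could be measured. The helper name `measure_throughput` and the dummy write operation are assumptions for illustration; they are not the benchmark tools used in this study.

```python
import time

def measure_throughput(operation, count: int) -> float:
    """Run `operation` `count` times and return operations per second."""
    started = time.perf_counter()
    for i in range(count):
        operation(i)
    elapsed = time.perf_counter() - started
    return count / elapsed

# Dummy "write": append to a list instead of writing to a real database.
store = []
rate = measure_throughput(store.append, 100_000)
print(f"{rate:.0f} histories/sec")
```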
How many nodes were used ?
● Stand-alone (Number of nodes: 1)
  ○ Monitoring equipment should be small
    ■ Not standard in terms of typical NoSQL usage
    ■ But the ideal way under the above concept
● Small Cluster
  ○ We used 3 or 4 nodes
Destination of History
● RDBMS
  ○ MySQL
  ○ PostgreSQL
● NoSQL
  ○ HBase
  ○ Cassandra
  ○ Riak
● Special, as a reference
  ○ Mem (in-memory DB in HistoryGluon)
  ○ Null (data is discarded in the ZABBIX server)
Categorization of benchmarks we did
#  Write or Read  Stand-alone  Small Cluster  Storage Device     Target NoSQL
1  Write          O            O              HDD                HBase, Cassandra, Riak
2  Write          O            -              Fusion-io ioDrive  Cassandra
3  Read           O            -              HDD                Cassandra
O: Measured, -: Not measured
HW & SW in the benchmark
● Machine
  ○ CPU: Core i5 (3570K 3.4GHz), Mem: 16GB
  ○ Disk: SATA 2TB
  ○ LAN: Gigabit Ethernet
● Software
  ○ OS: CentOS 6.3
  ○ Zabbix: 2.0.3
  ○ HBase: 0.94.1
  ○ Cassandra: 1.1.6
  ○ Riak: 1.2.1
Benchmark 1
Setup (Write: stand-alone)
Monitoring target
Registered multiple times, as if there were a lot of targets.
Monitoring interval: 5 sec.
Result (Write: with MySQL)
Saturated due to a factor other than the DB
Approx. 3.4x
Result (Write: with PostgreSQL)
Saturated
Setup (Write: small cluster)
Monitoring Target: 1
NoSQL nodes: 4
Token setup for Cassandra
To query range data fast: RandomPartitioner => ByteOrderedPartitioner
One of the typical motivations: drawing a chart of monitored values over a duration
We prepared 3 configurations of the token range:
- local: All data to node S
- external: All data to node U
- sharding: Distributed evenly (calculated in advance)
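With an order-preserving partitioner, spreading data evenly means choosing the range boundaries yourself. The sketch below shows one way such boundaries could be calculated in advance. It is not the calculation used in this study: the key layout (an 8-byte big-endian item ID), the ID range, and the function name `even_tokens` are all assumptions for illustration, and the hex encoding assumes the partitioner takes tokens as hex strings of key bytes.

```python
def even_tokens(min_id: int, max_id: int, nodes: int, width: int = 8) -> list:
    """Split [min_id, max_id] into `nodes` equal key ranges.

    Returns one hex-encoded boundary key per node (the upper end of its
    range), assuming keys are `width`-byte big-endian item IDs.
    """
    span = max_id - min_id + 1
    tokens = []
    for n in range(1, nodes + 1):
        upper = min_id + span * n // nodes - 1
        tokens.append(upper.to_bytes(width, "big").hex())
    return tokens

# Hypothetical example: item IDs 0..27999 spread over 4 nodes.
for node, token in enumerate(even_tokens(0, 27999, 4), start=1):
    print(node, token)
```

Because keys stay in byte order, a time-range read for one item touches a contiguous key range, but an uneven choice of boundaries would overload some nodes.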
Result (Write: with small cluster)
Riak (stand-alone) performance is around here
approx. 1.6x
Nearly identical to the stand-alone result
Benchmark 2
Write performance (w/ ioDrive)
● Differences from Benchmark 1
  ○ The DB was created on a Fusion-io ioDrive
    ■ Consists of flash memory
    ■ Interface: PCIe
    ■ SLC 80GB
  ○ The number of PCs used as monitoring targets: 1 => 3
    ■ Reason: a single PC lacked the capacity to generate enough monitored data
Result (Write: w/ Fusion-io ioDrive and MySQL)
approx. 4x
approx. 8x
Saturated due to a factor other than the DB
Benchmark 3
Read performance
● Read histories over a duration
  ○ Corresponds to drawing a chart
● With two parameters
  ○ Duration
    ■ 10 - 100 days (Step: 10 days)
  ○ Write load
    ■ Number of monitoring targets: corresponding to 0 - 28000 items
In an actual monitoring situation, monitoring data is always being written.
Connection modes
Conventional path
As a reference (see performance of ZABBIX Front-end)
Front-end connection mode / Direct connection mode
Setup (read performance)
The benchmark tool and the other software components run on one computer
The target is registered multiple times to write a lot of data during the read benchmark
Result (Read: Throughput vs. read duration)
Increases with duration in all conditions
Direct connection is higher
Result (Read throughput vs. write load) 1/2
MySQL/PostgreSQL only: decreases as the number of targets (write load) increases
Requested duration = 50 days
Result (Read throughput vs. write load) 2/2
Almost flat w/ MySQL + Cassandra
Saturation of write throughput
Requested duration = 50 days
Benchmark Summary and discussion
Benchmark summary and discussion
● Write throughput
  ○ MySQL + Cassandra: ~3.4 times higher than the original
  ○ With Fusion-io ioDrive: ~8 times higher than the original (HDD)
● Read throughput
  ○ w/ MySQL + Cassandra: Almost flat (stable); doesn't largely depend on write load
    ■ Higher than MySQL/PostgreSQL only where write load is large
Things we want to do in the future
● Find the reason for the saturation
● Improve performance with a cluster
  ○ Increase the number of threads that write history
● Evaluate with other NoSQL DBs
Appendix. A
Place of our R&D results
Place of our R&D results (source code)
● Hosted on GitHub under GPLv2
● HistoryGluon
  ○ https://github.com/miraclelinux/HistoryGluon
● ZABBIX (modified)
  ○ https://github.com/miraclelinux/MIRACLE-ZBX-2.0.3-NoSQL
● Benchmark tools
  ○ https://github.com/miraclelinux/zabbix-benchmark
Place of our R&D results (White paper)
● http://www.miraclelinux.com/online-service/labs
OR
● http://www.miraclelinux.com/online-service/labs/pdf/
Appendix. B
Project Hatohol ~ Another approach ~
Another approach to monitor many devices
+ NoSQL Approach
Another approach
Enhancement
Integration of independent monitoring systems
Hatohol
An Open Source Integrated Monitoring Software
● Can connect with
  ○ ZABBIX
  ○ Nagios
Hatohol Development History
● Early 2013: Started
● 2013/6: First release
● 2013/9: Second release (v0.1)
  ○ Action (execute scripts or commands on events)
● 2013/12: Third release (13.12)
  ○ User access control
  ○ Dynamic addition/deletion of servers
Hatohol: related sites
● Hosted on GitHub under GPLv2
  ○ https://github.com/project-hatohol/hatohol
  ○ Copyrighted by 'Project Hatohol' (Community)
    ■ Not owned by a specific company.
● Article (OSS NEWS)
  ○ http://www.ossnews.jp/closeup/articles/?aid=201308-00003
● Presentation at hbstudy (2013/8/30)
  ○ http://www.slideshare.net/koedoyoshida/hatohol-introduction20130830hbstudy-25744631
  ○ http://www.slideshare.net/koedoyoshida/hatohol-technicalbrief20130830hbstudy
Appendix in Appendix: Basic architecture
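MIRACLE LINUX CORPORATION [Unauthorized reproduction prohibited] This document is provided for reference only, and the information in it is subject to change without notice. MIRACLE LINUX CORPORATION makes no warranty of any kind regarding the contents of this document and accepts no liability for any damages related to its contents. Unless otherwise specified, the copyright of this material belongs to MIRACLE LINUX. Content copyrighted by MIRACLE LINUX may not be reproduced, modified, or distributed without permission from MIRACLE LINUX.
MIRACLE LINUX product names, logos, service names, and the like are trademarks or registered trademarks owned by or licensed to MIRACLE LINUX. Other companies' product names, logos, and the like appearing on this website are trademarks or registered trademarks of their respective owners.
[Contact] [email protected]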
http://www.miraclelinux.com