Advanced Sharding Techniques with Spider (MUC2010)
Advanced sharding techniques with Spider
Kentoku SHIBA (kentokushiba at gmail dot com)
How to shard a database without stopping the service
How to shard a database
What is database sharding? When the data volume or the update traffic grows, a single database server can no longer process updates effectively. A common solution is to divide the data across two or more databases. This is database sharding.
Here, I will explain how to shard data without stopping the service.
Initial Structure
There is 1 MySQL server without Spider.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for sharding)
Create tables on DB2 and DB3. Then create tables on DB1.
[Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3 and tbl_a4; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).]
On DB2 and on DB3:
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
On DB1:
Create table tbl_a3 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
Connection 'table "tbl_a", user "user", password "pass"'
partition by list(mod(col_a, 2)) (
  partition pt1 values in (0) comment 'host "DB2"',
  partition pt2 values in (1) comment 'host "DB3"'
);
Create table tbl_a4 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = VP
Comment 'cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3"';
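The routing rule in the tbl_a3 definition can be sketched in Python. This only illustrates the list(mod(col_a, 2)) mapping; the name shard_host and the dictionary are hypothetical, and the sketch covers non-negative keys only, because MySQL's MOD and Python's % treat negative operands differently.

```python
# Hypothetical illustration of "partition by list(mod(col_a, 2))".
# Each remainder maps to the remote host named in the partition comment.
PARTITION_HOSTS = {0: "DB2", 1: "DB3"}


def shard_host(col_a: int) -> str:
    """Return the host that would store a row with this col_a (col_a >= 0)."""
    if col_a < 0:
        raise ValueError("this sketch covers non-negative keys only")
    return PARTITION_HOSTS[col_a % 2]


if __name__ == "__main__":
    for key in (10, 11, 12, 13):
        print(key, "->", shard_host(key))
```

Even keys land on DB2 and odd keys on DB3, matching the col_a%2 labels in the diagram.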
Step 2
Rename tables on DB1:
rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a;
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a and tbl_a5; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).]
Step 3
Copy data from tbl_a2 to tbl_a3 on DB1:
select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3');
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a and tbl_a5; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).]
Step 4
Rename tables on DB1:
rename table tbl_a to tbl_a4, tbl_a3 to tbl_a;
[Diagram: DB1 holds tbl_a2, tbl_a, tbl_a4 and tbl_a5; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).]
Finish
Drop tables on DB1:
drop table tbl_a2, tbl_a4, tbl_a5;
[Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).]
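The table-name shuffle in Steps 2 to Finish can be traced with a small Python model. It is a sketch only: the role labels are illustrative (the role of tbl_a2 as an empty placeholder is an assumption), the data copy of Step 3 is not modelled, and rename_tables merely mimics the left-to-right order in which MySQL applies the pairs of one RENAME TABLE statement.

```python
def rename_tables(tables, pairs):
    """Apply (old, new) renames left to right, like one RENAME TABLE statement."""
    for old, new in pairs:
        if new in tables:
            raise ValueError(f"{new} already exists")
        tables[new] = tables.pop(old)
    return tables


def run_procedure():
    # Role labels are illustrative, not Spider or VP terminology.
    tables = {
        "tbl_a": "original InnoDB data",
        "tbl_a2": "placeholder table (assumed)",
        "tbl_a3": "Spider table over DB2/DB3",
        "tbl_a4": "VP table over tbl_a2 and tbl_a3",
    }
    # Step 2: the VP table takes over the name tbl_a.
    rename_tables(tables, [("tbl_a2", "tbl_a5"), ("tbl_a", "tbl_a2"), ("tbl_a4", "tbl_a")])
    after_step2 = dict(tables)
    # Step 3 (vp_copy_tables) moves data only; table names do not change.
    # Step 4: the Spider table takes over the name tbl_a.
    rename_tables(tables, [("tbl_a", "tbl_a4"), ("tbl_a3", "tbl_a")])
    # Finish: drop the leftovers.
    for name in ("tbl_a2", "tbl_a4", "tbl_a5"):
        del tables[name]
    return after_step2, tables


if __name__ == "__main__":
    step2, final = run_procedure()
    print("after step 2:", step2)
    print("final:", final)
```

At every point some table answers to the name tbl_a, which is why the application keeps working during the change.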
How to re-shard a database without stopping the service
How to re-shard a database
What is re-sharding? When the data volume or the update traffic grows even further, a database that is already sharded can again fail to keep up with updates. The solution is to increase the number of servers and redistribute the load across them. This is called re-sharding.
Here, I will explain how to re-shard without stopping the service.
Initial Structure
There is 1 MySQL server with Spider, and there are 2 remote MySQL servers without Spider.
[Diagram: DB1 holds tbl_a (Spider); DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).]
On DB1:
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
Connection 'table "tbl_a", user "user", password "pass"'
partition by list(mod(col_a, 2)) (
  partition pt1 values in (0) comment 'host "DB2"',
  partition pt2 values in (1) comment 'host "DB3"'
);
Step 1 (for re-sharding)
Create tables on DB4 and DB5. Then create tables on DB3.
[Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a, tbl_a2, tbl_a3 and tbl_a4 (col_a%2=1); DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).]
On DB3:
Create table tbl_a3 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
Connection 'table "tbl_a", user "user", password "pass"'
partition by list(mod(col_a, 4)) (
  partition pt1 values in (1) comment 'host "DB4"',
  partition pt2 values in (3) comment 'host "DB5"'
);
Create table tbl_a4 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = VP
Comment 'cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3"';
Step 2
Rename tables on DB3:
rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a;
[Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a2, tbl_a3, tbl_a and tbl_a5 (col_a%2=1); DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).]
Step 3
Copy data from tbl_a2 to tbl_a3 on DB3:
select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3');
[Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a2, tbl_a3, tbl_a and tbl_a5 (col_a%2=1); DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).]
Step 4
Rename tables on DB3. Then alter the table on DB1.
[Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a2, tbl_a, tbl_a4 and tbl_a5 (col_a%2=1); DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).]
On DB3:
Rename table tbl_a to tbl_a4, tbl_a3 to tbl_a;
On DB1:
Alter table tbl_a
partition by list(mod(col_a, 4)) (
  partition pt1 values in (0,2) comment 'host "DB2"',
  partition pt2 values in (1) comment 'host "DB4"',
  partition pt3 values in (3) comment 'host "DB5"'
);
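A quick way to see why only the rows on DB3 have to move is to compare the old mod 2 mapping with the new mod 4 mapping. The Python below is a sketch under the same assumption as before (non-negative keys); old_host, new_host and moved_keys are hypothetical names.

```python
# Old layout: list(mod(col_a, 2)); new layout: list(mod(col_a, 4)).
OLD_HOSTS = {0: "DB2", 1: "DB3"}
NEW_HOSTS = {0: "DB2", 2: "DB2", 1: "DB4", 3: "DB5"}


def old_host(col_a: int) -> str:
    return OLD_HOSTS[col_a % 2]


def new_host(col_a: int) -> str:
    return NEW_HOSTS[col_a % 4]


def moved_keys(keys):
    """Return {key: (old host, new host)} for keys whose host changes."""
    return {
        k: (old_host(k), new_host(k))
        for k in keys
        if old_host(k) != new_host(k)
    }


if __name__ == "__main__":
    for key, (src, dst) in sorted(moved_keys(range(8)).items()):
        print(f"col_a={key}: {src} -> {dst}")
```

Every key that changes host starts on DB3, and keys with an even col_a stay on DB2, so DB2 is never touched by the copy.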
Finish
Drop DB3.
[Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).]
How to add an index without stopping the service
How to add an index
If you add an index in MySQL, you cannot update your data until the process is completed. For a big table this takes a long time, and sometimes the service cannot be used during the change.
Here, I will explain how to add an index without stopping updates to your data.
Initial Structure
There is 1 MySQL server.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for adding an index)
Create tables on DB1.
[Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3 and tbl_a4.]
Create table tbl_a2 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Create table tbl_a3 (
  col_a int,
  col_b int,
  primary key(col_a),
  key idx1(col_b)
) engine = InnoDB;
Create table tbl_a4 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = VP
Comment 'cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3"';
Step 2
Rename tables on DB1:
rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a;
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a and tbl_a5.]
Step 3
Copy data from tbl_a2 to tbl_a3 on DB1:
select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3');
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a and tbl_a5.]
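Why the copy can run while updates continue can be pictured with a conceptual Python model. This is not how VP or vp_copy_tables is implemented; it assumes only that every write arriving through the parent table is applied to both child tables, and that the copy never overwrites a row the new table already has. All names are hypothetical.

```python
def copy_with_concurrent_writes(old, new, writes):
    """Copy rows from old to new while writes keep arriving.

    old, new: dicts mapping primary key -> row.
    writes:   list of (primary key, row) applied to BOTH tables,
              one write interleaved per copied row.
    """
    pending = list(writes)
    for pk in sorted(old):  # snapshot of the keys present when the copy starts
        if pending:
            w_pk, w_row = pending.pop(0)
            old[w_pk] = w_row  # assumed: the parent forwards the write
            new[w_pk] = w_row  # to both child tables
        if pk not in new:  # never overwrite a newer row
            new[pk] = old[pk]
    for w_pk, w_row in pending:  # writes that arrive after the copy
        old[w_pk] = w_row
        new[w_pk] = w_row
    return new


if __name__ == "__main__":
    old_table = {1: "a", 2: "b", 3: "c"}
    new_table = {}
    copy_with_concurrent_writes(old_table, new_table, [(2, "x"), (4, "y")])
    print(new_table == old_table)
```

Under these assumptions the two tables hold the same rows once the copy ends, so the new table can take over the name tbl_a in the next step.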
Step 4
Rename tables on DB1:
rename table tbl_a to tbl_a4, tbl_a3 to tbl_a;
[Diagram: DB1 holds tbl_a2, tbl_a, tbl_a4 and tbl_a5.]
Finish
Drop tables on DB1:
drop table tbl_a2, tbl_a4, tbl_a5;
[Diagram: DB1 holds tbl_a.]
How to change the schema without stopping the service
How to change the schema
If you change the schema in MySQL, you cannot update your data until the process is completed. For a big table this takes a long time, and sometimes the service cannot be used during the change.
Here, I will explain how to change the schema without stopping updates to your data.
Initial Structure
There is 1 MySQL server.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for adding a column)
Create tables on DB1.
[Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3 and tbl_a4.]
Create table tbl_a2 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Create table tbl_a3 (
  col_a int,
  col_b int,
  col_c int default null,
  primary key(col_a)
) engine = InnoDB;
Create table tbl_a4 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = VP
Comment 'cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3"';
Step 2
Rename tables on DB1:
rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a;
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a and tbl_a5.]
Step 3
Copy data from tbl_a2 to tbl_a3 on DB1:
select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3');
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a and tbl_a5.]
Step 4
Rename tables on DB1:
rename table tbl_a to tbl_a4, tbl_a3 to tbl_a;
[Diagram: DB1 holds tbl_a2, tbl_a, tbl_a4 and tbl_a5.]
Finish
Drop tables on DB1:
drop table tbl_a2, tbl_a4, tbl_a5;
[Diagram: DB1 holds tbl_a.]
How to set up a cluster for fault tolerance without stopping the service
How to set up a cluster for fault tolerance
Spider can set up a cluster for fault tolerance on a per-table basis.
Here, I will explain how to set up a cluster without stopping the service.
A 'monitoring node' in these slides is a node that watches each node in the cluster for failures. 'spider_copy_tables' in these slides is still in development, so please wait a while before using it.
Initial Structure
There is 1 MySQL server with Spider and 1 remote MySQL server without Spider.
[Diagram: DB1 holds tbl_a (Spider), linked to tbl_a on DB2.]
On DB1:
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
Connection 'table "tbl_a", user "user", password "pass", host "DB2"';
On DB2:
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for clustering)
Add new data nodes (DB3 and DB4) and tables.
[Diagram: DB1 holds tbl_a (Spider), linked to tbl_a on DB2; DB3 and DB4 each hold a new tbl_a.]
On DB3 and on DB4:
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 2
Add new monitoring nodes (DB5, DB6 and DB7) and tables.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3 and DB4 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
On the monitoring nodes:
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4"';
Step 3
Register the monitoring node information on the MySQL servers with Spider.
Then alter the table on DB1.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3 and DB4 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
insert into mysql.spider_link_mon_servers
  (db_name, table_name, link_id, sid, server, scheme, host, port, socket, username, password)
values
  ('db_name', 'tbl_a', 0, DB5_sid, null, 'mysql', 'DB5', 3306, null, 'user', 'pass'),
  ('db_name', 'tbl_a', 0, DB6_sid, null, 'mysql', 'DB6', 3306, null, 'user', 'pass'),
  ('db_name', 'tbl_a', 0, DB7_sid, null, 'mysql', 'DB7', 3306, null, 'user', 'pass');
Alter table tbl_a
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4", mbk "2", mkd "2", msi "DB5_sid", link_status "0 2 2"';
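The host and link_status values are space-separated lists that line up position by position. The Python sketch below pairs them to make that explicit. The function name is hypothetical, and the status meanings in the dictionary are my reading of Spider's documentation rather than something stated in these slides, so treat them as an assumption.

```python
# Assumed meanings of link_status values (not stated in the slides).
LINK_STATUS_MEANING = {
    0: "no change",
    1: "OK",
    2: "recovery",
    3: "no longer in the group",
}


def pair_link_status(hosts: str, link_status: str) -> dict:
    """Pair each host with the link_status value at the same position."""
    host_list = hosts.split()
    status_list = [int(s) for s in link_status.split()]
    if len(host_list) != len(status_list):
        raise ValueError("host and link_status must have the same length")
    return dict(zip(host_list, status_list))


if __name__ == "__main__":
    for host, status in pair_link_status("DB2 DB3 DB4", "0 2 2").items():
        print(host, status, LINK_STATUS_MEANING[status])
```

Read this way, "0 2 2" leaves DB2 as it is and marks DB3 and DB4 as still being filled, until a later alter sets them to "1".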
Step 4
Copy data from DB2 to DB3 and DB4.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3 and DB4 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
Select spider_copy_tables('tbl_a', '', '');
Finish
Alter the table on DB1.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3 and DB4 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
Alter table tbl_a
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4", mbk "2", mkd "2", msi "DB5_sid", link_status "0 1 1"';
How to add a new node after failover and preparing a new server, without stopping the service
Add a table on a new node to the clustered table
When one of the nodes that make up the cluster fails, you need to add a new node in order to maintain redundancy.
Here, I will explain how to add a table on a new node without stopping the service.
A 'monitoring node' in these slides is a node that watches each node in the cluster for failures. 'spider_copy_tables' in these slides is still in development; it will be available in a future release.
Initial Structure
There are 4 MySQL servers with Spider (including 3 monitoring nodes) and 3 MySQL servers without Spider (including 1 broken node).
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3 and DB4 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
Step 1
Add a new data node (DB8) and table.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3 and DB4 each hold tbl_a; DB8 holds a new tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
On DB8:
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 2
Alter the table on the monitoring nodes (DB5, DB6 and DB7).
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3, DB4 and DB8 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
Alter table tbl_a
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB4 DB8"';
Step 3
Alter the table on DB1.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3, DB4 and DB8 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
Alter table tbl_a
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB4 DB8", mbk "2", mkd "2", msi "DB5_sid", link_status "0 0 2"';
Step 4
Copy data from DB2 to DB8.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3, DB4 and DB8 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
Select spider_copy_tables('tbl_a', '', '');
Finish
Alter the table on DB1.
[Diagram: DB1 holds tbl_a (Spider); DB2, DB3, DB4 and DB8 each hold tbl_a; the monitoring nodes DB5, DB6 and DB7 hold tbl_a.]
Alter table tbl_a
Connection 'table "tbl_a", user "user", password "pass", host "DB2 DB4 DB8", mbk "2", mkd "2", msi "DB5_sid", link_status "0 0 1"';
How to avoid the table partitioning UNIQUE column limitation without stopping the service
How to avoid the table partitioning UNIQUE column limitation
Right now, MySQL has a restriction: when a table has a PK or UNIQUE key, you cannot partition it by other columns.
Here, I will show you how to partition a table by any column even if there is a PK or UNIQUE key.
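The restriction can be stated as a simple check: every PRIMARY or UNIQUE key of the table must contain every column used in the partitioning expression. The Python below is a sketch of that rule only; can_partition_by is a hypothetical helper, not a MySQL function.

```python
def can_partition_by(unique_keys, partition_columns):
    """True if every PRIMARY/UNIQUE key contains all partitioning columns.

    unique_keys:       iterable of column-name tuples, one per PK or UNIQUE key.
    partition_columns: columns used in the partitioning expression.
    """
    needed = set(partition_columns)
    return all(needed <= set(key) for key in unique_keys)


if __name__ == "__main__":
    # primary key(col_a), partitioned by col_b: rejected
    print(can_partition_by([("col_a",)], ["col_b"]))
    # primary key(col_a), partitioned by col_a: allowed
    print(can_partition_by([("col_a",)], ["col_a"]))
    # no PK or UNIQUE key, partitioned by col_b: allowed
    print(can_partition_by([], ["col_b"]))
```

This is why the procedure splits the table in two: one child keeps the primary key and is partitioned by col_a, while the other has only non-unique keys and can be partitioned by col_b.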
Initial Structure
There is 1 MySQL server.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for avoiding partitioning limitation)
Create tables on DB1.
[Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3, tbl_a4 and tbl_a5.]
Create table tbl_a2 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Create table tbl_a3 (
  col_a int,
  primary key(col_a)
) engine = InnoDB
partition by linear hash(col_a) partitions 4;
Create table tbl_a4 (
  col_a int,
  col_b int,
  key idx1(col_a),
  key idx2(col_b)
) engine = InnoDB
partition by list(mod(col_b, 2)) (
  partition pt1 values in (0),
  partition pt2 values in (1)
);
Create table tbl_a5 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = VP
Comment 'ctm "1", ist "1", zru "1", pcm "1"'
Connection 'tnl "tbl_a2 tbl_a3 tbl_a4"';
Step 2
Rename tables on DB1:
rename table tbl_a2 to tbl_a6, tbl_a to tbl_a2, tbl_a5 to tbl_a;
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a4, tbl_a and tbl_a6.]
Step 3
Copy data from tbl_a2 to tbl_a3 and tbl_a4:
select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3 tbl_a4');
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a4, tbl_a and tbl_a6.]
Step 4
Alter table tbl_a.
[Diagram: DB1 holds tbl_a2, tbl_a3, tbl_a4, tbl_a and tbl_a6.]
Alter table tbl_a
Comment 'ctm "1", ist "1", pcm "1"',
Connection 'tnl "tbl_a3 tbl_a4"';
Finish
Drop tables:
drop table tbl_a2, tbl_a6;
[Diagram: DB1 holds tbl_a, tbl_a3 and tbl_a4.]
Case study
About MicroAd
MicroAd is an advertising company.
The company advertises efficiently by using "behavioral targeting" technology.
http://www.microad.jp/english/ [MicroAd, Inc.]
The previous architecture
Batch processing updates the statistical rules every day (for every advertiser, every advertising medium and every user).
[Diagram: the Batch server registers new statistical rules on MasterDB; MasterDB replicates to the SlaveDBs; the AP servers and the SlaveDBs are connected through LVS.]
The problem with business expansion
Data volume and requests increased. At that time the update limit was 20 million records a day. They needed to update 100 million records a day.
They also wanted to improve the performance of the read (reference) slaves by reducing the amount of updates each slave has to apply.
They did not want to change or modify their application to support the increase.
So Spider was used.
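To put the two figures side by side, the short Python sketch below converts them to a growth factor and to average rates. The per-second numbers assume updates are spread evenly over 24 hours, which a batch workload probably is not, so they are only a rough scale; the names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

OLD_LIMIT_PER_DAY = 20_000_000   # update limit quoted for the previous architecture
TARGET_PER_DAY = 100_000_000     # required update volume


def average_rate(records_per_day: int) -> float:
    """Average records per second, assuming an even spread over the day."""
    return records_per_day / SECONDS_PER_DAY


growth = TARGET_PER_DAY / OLD_LIMIT_PER_DAY

if __name__ == "__main__":
    print("growth factor:", growth)
    print("old average rate:", round(average_rate(OLD_LIMIT_PER_DAY)), "records/s")
    print("target average rate:", round(average_rate(TARGET_PER_DAY)), "records/s")
```

The target is five times the old limit, which is the gap that sharding the update load had to close.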
The architecture with Spider
They created the shards in units of a replication set.
[Diagram: the Batch server registers new statistical rules through SpiderDB (MySQL with Spider), which shards the data across three MasterDBs ("Spider sharding"); each MasterDB replicates to its own SlaveDBs behind an LVS; the AP servers run with Spider and reach the shards through Spider sharding.]
Resolved the problem
As a result, they achieved updates of 100 million records a day and improved read performance.
They did not need to change or modify their applications much.
They plan to re-shard in the near future as the business expands.
http://wild-growth.blogspot.com/
http://spiderformysql.com
Kentoku SHIBA (kentokushiba at gmail dot com)
Any Questions?
Thank you for your time!