Copyright 2004 MySQL AB The World’s Most Popular Open Source Database
Writing Storage Engines
Brian Aker
Director of Architecture
Montreal PHP Conference
March 2005
MySQL AB
Who am I?
• Brian Aker
– Director of Architecture, MySQL AB
– Author of mod_layout, the Apache streaming services module mod_mp3, Slash (Slashdot’s CMS), and a lot of other things on Freshmeat
– http://mysql.com/
– http://krow.net/
MySQL
• 5 million installations
• 200 employees, 20+ countries
– Most North American developers live in Seattle
– We even have an office in Seattle
• No, developers never go there
MySQL Server
• High Performance RDBMS
• SQL-Based, aiming to be SQL-99 compliant
• Stable
• Scalable
– Embedded in hardware (including JMX)
– Extremely high load applications
– Master/Slave Replication
• Easy to use
• Modular – “Storage Engines”
– Many features can be disabled at runtime and/or compile time to conserve resources
Client Library Support
• libmysql C library (think OCI)
• JDBC – Type IV JDBC Driver
• ODBC
• Perl DBI (DBD::mysql)
• PHP (built in)
• ADO.NET, OLE DB, Ruby, Erlang, Eiffel, Smalltalk, etc., provided by third parties
Goals
• Overview of MySQL Architecture
• Understanding of Storage Engine Architecture
• Knowledge of required methods
• Starting points for coding
– sql/ is for the kernel
– mysys/ is the portable runtime
– mysql-test/ is for your test cases
What does it take?
• All code is written in simplified C++
• An example storage engine
• Your Ideas
Server’s Kernel
Parser
Optimizer
Storage Engine
MyISAM, InnoDB, NDB, HEAP, MERGE
What is a Storage Engine?
• “Data formats on Disk”
• Examples
– InnoDB
– MyISAM
– BDB
– Cluster
– HEAP
– CSV
– Yours!
Example Table
CREATE TABLE foo (
  a int,
  b char(4),
  c varchar(9),
  d blob)
ENGINE = MYISAM;
Rows
• Rows are made up of Fields
Field:  NULL | INT | CHAR | VAR   | BLOB
Bytes:  NULL | 4   | 4    | L + 9 | L + P
(L: length bytes; P: pointer to the blob data)
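The row layout for the example table foo can be sketched as a run of byte offsets. This is only an illustration, not MySQL's record code: ColumnSlot, layout_row(), row_length() and layout_foo() are hypothetical names, and the sizes are assumptions (one leading byte of NULL flags, a 1-byte length prefix for varchar(9), a 2-byte length plus a native pointer for the blob).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one column's footprint inside a row buffer.
struct ColumnSlot {
  std::string name;
  std::size_t offset;  // byte offset from the start of the row
  std::size_t size;    // bytes reserved for the column in the row
};

// Place the columns back to back, right after the leading NULL-flag bytes.
std::vector<ColumnSlot> layout_row(
    std::size_t null_bytes,
    const std::vector<std::pair<std::string, std::size_t> > &columns) {
  std::vector<ColumnSlot> slots;
  std::size_t offset = null_bytes;
  for (const auto &col : columns) {
    slots.push_back(ColumnSlot{col.first, offset, col.second});
    offset += col.second;
  }
  return slots;
}

// Total row length: end of the last column.
std::size_t row_length(const std::vector<ColumnSlot> &slots) {
  assert(!slots.empty());
  return slots.back().offset + slots.back().size;
}

// Layout of table foo under the ASSUMED sizes described above.
std::vector<ColumnSlot> layout_foo() {
  const std::size_t varchar_length_bytes = 1;   // assumption: "L" for varchar(9)
  const std::size_t blob_length_bytes = 2;      // assumption: "L" for blob
  const std::size_t pointer = sizeof(char *);   // assumption: "P"
  std::vector<std::pair<std::string, std::size_t> > columns;
  columns.emplace_back("a", 4);                          // int
  columns.emplace_back("b", 4);                          // char(4)
  columns.emplace_back("c", varchar_length_bytes + 9);   // varchar(9)
  columns.emplace_back("d", blob_length_bytes + pointer);  // blob
  return layout_row(1, columns);
}
```

With these assumed sizes, column a starts at offset 1, b at 5, c at 9 and d at 19; only the blob's in-row size depends on the platform's pointer width.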
Fields
• Rows are made up of fields
NULL | C | H | A | R
What do I need to do to add one?
• Subclass Field in field.h
• Implement a few methods:
– Storage: store(string), store(long long), store(double)
– Retrieve: val_real(), val_int(), val_str()
– Other: field_cast_type(), result_type(), cmp(), sort_string(), max_length()
Field Store Example
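int Field_ipaddrv4::store(const char *from, uint length, CHARSET_INFO *cs)
{
  unsigned int octet[4];
  int count;

  /* Parse into full-width integers first; %u must not write straight into
     the 4-byte field. */
  count= sscanf(from, "%u.%u.%u.%u",
                &octet[0], &octet[1], &octet[2], &octet[3]);
  if (count != 4 ||
      octet[0] > 255 || octet[1] > 255 || octet[2] > 255 || octet[3] > 255)
  {
    bzero(ptr, 4);
    return -1;
  }
  for (int i= 0; i < 4; i++)
    ptr[i]= (char) octet[i];
  return 0;
}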
Field val Example
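String *Field_ipaddrv4::val_str(String *val_buffer __attribute__((unused)),
                                String *val_ptr)
{
  int count;

  /* buffer is not declared on the slide; it is assumed here to be a char
     array of at least 16 bytes ("255.255.255.255" plus the terminating NUL). */
  count= snprintf(buffer, 16, "%u.%u.%u.%u",
                  (uchar) ptr[0], (uchar) ptr[1],
                  (uchar) ptr[2], (uchar) ptr[3]);
  val_ptr->set((const char*) buffer, count, &my_charset_latin1);

  return val_ptr;
}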
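To try the store/val round trip outside the server, here is a minimal standalone sketch. Ipv4Field is a hypothetical stand-in, not a subclass of MySQL's Field: it keeps the four bytes in a private array, takes the input length explicitly, and returns a std::string instead of a String. Rejecting octets above 255 and trailing characters is this sketch's own choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a Field subclass: four bytes of row storage.
class Ipv4Field {
 public:
  // Parse dotted-quad text; returns 0 on success, -1 on malformed input.
  int store(const char *from, std::size_t length) {
    char text[16];  // "255.255.255.255" plus NUL
    if (length >= sizeof(text)) return fail();
    std::memcpy(text, from, length);  // input is not assumed NUL-terminated
    text[length] = '\0';

    unsigned int octet[4];
    char trailing;  // only converted if junk follows the fourth octet
    int count = std::sscanf(text, "%u.%u.%u.%u%c", &octet[0], &octet[1],
                            &octet[2], &octet[3], &trailing);
    if (count != 4) return fail();
    for (unsigned int o : octet)
      if (o > 255) return fail();
    for (int i = 0; i < 4; ++i)
      bytes_[i] = static_cast<unsigned char>(octet[i]);
    return 0;
  }

  // Render the stored bytes back as dotted-quad text.
  std::string val_str() const {
    char buffer[16];
    int count = std::snprintf(buffer, sizeof(buffer), "%u.%u.%u.%u",
                              static_cast<unsigned>(bytes_[0]),
                              static_cast<unsigned>(bytes_[1]),
                              static_cast<unsigned>(bytes_[2]),
                              static_cast<unsigned>(bytes_[3]));
    assert(count > 0 && count < static_cast<int>(sizeof(buffer)));
    return std::string(buffer, static_cast<std::size_t>(count));
  }

 private:
  // Zero the field on error, mirroring the bzero() in the slide's store().
  int fail() {
    std::memset(bytes_, 0, sizeof(bytes_));
    return -1;
  }

  unsigned char bytes_[4] = {0, 0, 0, 0};
};
```

Storing "192.168.0.1" and calling val_str() returns the same text; a malformed value such as "192.168.1" returns -1 and leaves the field reading as "0.0.0.0".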
Break Down of Storage Engine Methods
• Table Control
• Optimizer
• SQL Modifiers
• SQL Reads
Table Control
• ::create()
• ::open()
• ::close()
• ::delete_table()
ha_example::create()
int ha_example::create(const char *name, TABLE *table_arg,
                       HA_CREATE_INFO *create_info)
{
  DBUG_ENTER("ha_example::create");
  /* This is not implemented but we want someone to be able to see that it
     works. */
  DBUG_RETURN(0);
}
ha_example::open()
int ha_example::open(const char *name, int mode, uint test_if_locked)
{
  DBUG_ENTER("ha_example::open");

  if (!(share = get_share(name, table)))
    DBUG_RETURN(1);
  thr_lock_data_init(&share->lock,&lock,NULL);

  DBUG_RETURN(0);
}
ha_example::close()
int ha_example::close(void)
{
  DBUG_ENTER("ha_example::close");
  DBUG_RETURN(free_share(share));
}
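open() and close() above rely on get_share() and free_share(), which the slides do not show. A minimal sketch of the idea, assuming the share is a reference-counted per-table structure looked up by name: Share and ShareRegistry are hypothetical names, and a real engine would also guard the registry with a mutex, which is left out here to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical per-table state shared by every open handler instance.
struct Share {
  std::string table_name;
  unsigned use_count = 0;
};

// Hypothetical registry of open tables (no locking in this sketch).
class ShareRegistry {
 public:
  // Return the share for `name`, creating it on the first open.
  Share *get_share(const std::string &name) {
    std::unique_ptr<Share> &slot = shares_[name];
    if (!slot) {
      slot.reset(new Share);
      slot->table_name = name;
    }
    ++slot->use_count;
    return slot.get();
  }

  // Drop one reference; destroy the share when the last handler closes.
  int free_share(Share *share) {
    assert(share != nullptr && share->use_count > 0);
    if (--share->use_count == 0) {
      const std::string name = share->table_name;  // copy before erasing
      shares_.erase(name);
    }
    return 0;
  }

  std::size_t open_tables() const { return shares_.size(); }

 private:
  std::map<std::string, std::unique_ptr<Share> > shares_;
};
```

Two opens of the same table name return the same Share pointer, and the entry disappears only after the second free_share() call.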
ha_example::delete_table()
int ha_example::delete_table(const char *name)
{
  DBUG_ENTER("ha_example::delete_table");
  /* This is not implemented but we want someone to be able to see that it
     works. */
  DBUG_RETURN(0);
}
Optimizer
• ::info()
• ::records_in_range()
info()
void ha_heap::info(uint flag)
{
  records = info.records;
  deleted = info.deleted;
  errkey = info.errkey;
  mean_rec_length=info.reclength;
  data_file_length=info.data_length;
  index_file_length=info.index_length;
  max_data_file_length= info.max_records* info.reclength;
  delete_length= info.deleted * info.reclength;
  if (flag & HA_STATUS_AUTO)
    auto_increment_value= info.auto_increment;
}
records_in_range()
ha_rows ha_example::records_in_range(uint inx, key_range *min_key,
                                     key_range *max_key)
{
  DBUG_ENTER("ha_example::records_in_range");
  DBUG_RETURN(10); // low number to force index usage
}
SQL Modifiers
• delete_row()
• write_row()
• update_row()
ha_example::delete_row()
int ha_example::delete_row(const byte * buf)
{
  DBUG_ENTER("ha_example::delete_row");
  DBUG_RETURN(HA_ERR_WRONG_COMMAND);
}
ha_archive::write_row()
int ha_archive::write_row(byte * buf)
{
  char *pos;
  z_off_t written;
  DBUG_ENTER("ha_archive::write_row");

  statistic_increment(ha_write_count,&LOCK_status);
  if (table->timestamp_default_now)
    update_timestamp(buf+table->timestamp_default_now-1);
  written= gzwrite(share->archive_write, buf, table->reclength);
  DBUG_RETURN(0);
}
ha_tina::update_row()
int ha_tina::update_row(const byte * old_data, byte * new_data)
{
  int size;
  DBUG_ENTER("ha_tina::update_row");
  size= encode_quote(new_data);
  if (chain_append())
    DBUG_RETURN(-1);
  if (my_write(share->data_file, buffer.ptr(), size, MYF(MY_WME | MY_NABP)))
    DBUG_RETURN(-1);
  DBUG_RETURN(0);
}
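The update above appends an encoded copy of the new row to the data file, but encode_quote() itself is not shown on the slide. As a rough idea of what such an encoder might do, here is a sketch that turns a row into one quoted, comma-separated line. encode_row() and its escaping rules are assumptions made for illustration, not the CSV engine's actual format.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical encoder: quote every field, backslash-escape characters that
// would break the line format, and end the row with a newline.
std::string encode_row(const std::vector<std::string> &fields) {
  std::string line;
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (i != 0) line += ',';
    line += '"';
    for (char c : fields[i]) {
      switch (c) {
        case '"':  line += "\\\""; break;
        case '\\': line += "\\\\"; break;
        case '\n': line += "\\n";  break;
        case '\r': line += "\\r";  break;
        default:   line += c;      break;
      }
    }
    line += '"';
  }
  line += '\n';
  return line;
}
```

Because newlines inside values are escaped, every encoded row occupies exactly one physical line, which keeps a sequential scan of the file simple.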
SQL Reads
• Scan Reads
– rnd_init(), rnd_next(), position(), rnd_pos()
• Index Reads
– index_read(), index_next(), index_prev(), index_first(), index_last()
ha_tina::rnd_init()
int ha_tina::rnd_init(bool scan)
{
  DBUG_ENTER("ha_tina::rnd_init");

  current_position= next_position= 0;
  records= 0;
  chain_ptr= chain;
  if (scan)
    (void)madvise(share->mapped_file,share->file_stat.st_size,MADV_SEQUENTIAL);

  DBUG_RETURN(0);
}
ha_tina::rnd_next()
int ha_tina::rnd_next(byte *buf)
{
  DBUG_ENTER("ha_tina::rnd_next");

  current_position= next_position;
  if (!share->mapped_file)
    DBUG_RETURN(HA_ERR_END_OF_FILE);
  if (HA_ERR_END_OF_FILE == find_current_row(buf) )
    DBUG_RETURN(HA_ERR_END_OF_FILE);

  records++;
  DBUG_RETURN(0);
}
ha_tina::position()
void ha_tina::position(const byte *record)
{
  DBUG_ENTER("ha_tina::position");
  ha_store_ptr(ref, ref_length, current_position);
  DBUG_VOID_RETURN;
}
ha_tina::rnd_pos()
int ha_tina::rnd_pos(byte * buf, byte *pos)
{
  DBUG_ENTER("ha_tina::rnd_pos");
  current_position= ha_get_ptr(pos,ref_length);
  DBUG_RETURN(find_current_row(buf));
}
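The four scan methods fit together as a cursor over the data file: rnd_next() walks forward, position() remembers where the current row starts, and rnd_pos() jumps back to a remembered row. The sketch below imitates that over an in-memory string of newline-terminated rows. LineScanner and kEndOfFile are hypothetical stand-ins (the real engine works on a memory-mapped file and returns HA_ERR_END_OF_FILE), and position() returns the offset directly rather than storing it in ref.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const int kEndOfFile = -1;  // stand-in for HA_ERR_END_OF_FILE

// Hypothetical scan cursor over newline-terminated rows held in memory.
class LineScanner {
 public:
  explicit LineScanner(const std::string &data) : data_(data) {}

  // Reset the cursor to the start of the data.
  int rnd_init() {
    current_ = next_ = 0;
    records_ = 0;
    return 0;
  }

  // Copy the next row into `row`, or report end of file.
  int rnd_next(std::string &row) {
    current_ = next_;
    if (find_current_row(row) == kEndOfFile) return kEndOfFile;
    ++records_;
    return 0;
  }

  // Offset at which the most recently read row starts.
  std::size_t position() const { return current_; }

  // Re-read the row that starts at a previously saved position.
  int rnd_pos(std::string &row, std::size_t pos) {
    assert(pos <= data_.size());
    current_ = pos;
    return find_current_row(row);
  }

  std::size_t records() const { return records_; }

 private:
  // Extract the row starting at current_ and remember where the next begins.
  int find_current_row(std::string &row) {
    if (current_ >= data_.size()) return kEndOfFile;
    std::size_t end = data_.find('\n', current_);
    if (end == std::string::npos) end = data_.size();
    row.assign(data_, current_, end - current_);
    next_ = (end < data_.size()) ? end + 1 : end;
    return 0;
  }

  std::string data_;
  std::size_t current_ = 0;
  std::size_t next_ = 0;
  std::size_t records_ = 0;
};
```

A caller that saves position() after reading the second row can later pass that value to rnd_pos() and get the second row again, which is the pattern the server uses when it needs to revisit rows after a scan.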
ha_example::index_read()
int ha_example::index_read(byte * buf, const byte * key,
                           uint key_len __attribute__((unused)),
                           enum ha_rkey_function find_flag
                           __attribute__((unused)))
{
  DBUG_ENTER("ha_example::index_read");
  DBUG_RETURN(HA_ERR_WRONG_COMMAND);
}
ha_example::index_next()
/* Used to read forward through the index. */
int ha_example::index_next(byte * buf)
{
  DBUG_ENTER("ha_example::index_next");
  DBUG_RETURN(HA_ERR_WRONG_COMMAND);
}
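The example engine only returns HA_ERR_WRONG_COMMAND for index reads. To show what index_read() and index_next() are meant to do, the sketch below backs them with an ordered std::map. MapIndex, the integer key, the "first key greater than or equal to" matching rule, and the two error constants are all assumptions chosen for illustration, not the handler interface itself.

```cpp
#include <map>
#include <string>

const int kKeyNotFound = 1;  // stand-in error codes for this sketch only
const int kEndOfIndex = 2;

// Hypothetical ordered index from an integer key to a row.
class MapIndex {
 public:
  void insert(int key, const std::string &row) {
    rows_[key] = row;
    cursor_ = rows_.end();  // any active read has to be restarted
  }

  // Position on the first entry whose key is >= `key` and return its row.
  int index_read(std::string &row, int key) {
    cursor_ = rows_.lower_bound(key);
    return fetch(row, kKeyNotFound);
  }

  // Step forward through the index from the current position.
  int index_next(std::string &row) {
    if (cursor_ == rows_.end()) return kEndOfIndex;
    ++cursor_;
    return fetch(row, kEndOfIndex);
  }

 private:
  // Copy out the row under the cursor, or report `miss` if there is none.
  int fetch(std::string &row, int miss) {
    if (cursor_ == rows_.end()) return miss;
    row = cursor_->second;
    return 0;
  }

  std::map<int, std::string> rows_;
  std::map<int, std::string>::iterator cursor_ = rows_.end();
};
```

With keys 10, 20 and 30 loaded, index_read() for 15 lands on the row for 20, and repeated index_next() calls return the row for 30 and then the end-of-index code.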
Table Scan
• ha_example::store_lock
• ha_example::external_lock
• ha_example::info
• ha_example::rnd_init
• ha_example::extra (cache record in HA_rrnd())
• ha_example::rnd_next
• ha_example::rnd_next
• ha_example::rnd_next
• ha_example::extra (end caching of records (def))
• ha_example::external_lock
• ha_example::extra (reset database to after open)
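The call order above can be reproduced with a small driver. In this sketch TracingHandler is a hypothetical mock that only records which entry point was called, and table_scan() plays the role of the server; the real methods take arguments that are omitted here. It assumes the three rnd_next calls in the trace correspond to two rows followed by an end-of-file result.

```cpp
#include <string>
#include <vector>

// Hypothetical handler that only records which entry points are called.
class TracingHandler {
 public:
  explicit TracingHandler(int rows) : remaining_(rows) {}

  void store_lock() { log("store_lock"); }
  void external_lock() { log("external_lock"); }
  void info() { log("info"); }
  void rnd_init() { log("rnd_init"); }
  void extra() { log("extra"); }

  // Returns true while rows remain; the final call reports end of file.
  bool rnd_next() {
    log("rnd_next");
    if (remaining_ == 0) return false;
    --remaining_;
    return true;
  }

  const std::vector<std::string> &calls() const { return calls_; }

 private:
  void log(const char *name) { calls_.push_back(name); }

  int remaining_;
  std::vector<std::string> calls_;
};

// Drive the handler in the order listed on the slide; returns rows read.
int table_scan(TracingHandler &h) {
  int rows = 0;
  h.store_lock();
  h.external_lock();  // take the table lock
  h.info();
  h.rnd_init();
  h.extra();          // start caching records
  while (h.rnd_next()) ++rows;
  h.extra();          // stop caching records
  h.external_lock();  // release the table lock
  h.extra();          // reset to the state after open
  return rows;
}
```

Running table_scan() on a handler created with two rows records eleven calls, three of them rnd_next, matching the sequence on the slide.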
That is All?
• Transaction methods
• Bulk load methods
• Defrag methods
• A lot more (read handler.h)
Autoconf
• Autoconf files in the top-level source directory
– acconfig.h
– acinclude.m4
– config.in
Additional Files
• Basic server files modified under sql/
– sql/Makefile.am
– sql/handler.h
– sql/mysql_priv.h
– sql/handler.cc
– sql/mysqld.cc
– sql/set_var.cc
Test Cases
• Test cases created under mysql-test
– mysql-test/include/have_mmap.inc
– mysql-test/t/mmap.test
– mysql-test/r/mmap.result
Other Thoughts
• What are your goals?
– Read only?
– Durable?
– Network?