PerlApp2Postgresql (2)

Transcript of PerlApp2Postgresql (2)

Page 1: PerlApp2Postgresql (2)

MySql::To::PostgreSQL

Migrating a Perl App from MySQL to PostgreSQL

Page 2: PerlApp2Postgresql (2)

Who I am

Jérôme Étévé

Full stack application developer at Broadbean

Page 3: PerlApp2Postgresql (2)

Our app

Medium size: 24 tables, 3 ‘large’ ones.

Uses DBIx::Class + some raw SQL (with DBI)

A few DB-based features

A couple of nice, accidentally convenient MySQL idiosyncrasies

Page 4: PerlApp2Postgresql (2)

Prereqs

- Run your app in stricter mode (MySQL TRADITIONAL SQL mode + DBI RaiseError)

- Application dates are not in the floating time zone

- Have a good test suite: unit, functional and regression tests

- Data integrity and consistency

- Know your app functionally

Page 5: PerlApp2Postgresql (2)

Steps

- Schema migration

- Code compatibility

- Data migration

Page 6: PerlApp2Postgresql (2)

Schema migration

Can it be automated?

Plenty of scripts out there: https://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL

DDL: Less expressive (MySQL) to more expressive (Pg)

Page 7: PerlApp2Postgresql (2)

Schema migration

Can it be automated?

Sort of

Do it by hand! (keeping the column order)

Page 8: PerlApp2Postgresql (2)

Schema migration

Work in a branch

mysqldump --no-data --compatible=postgresql mysql > schema_mysql.sql

cp schema_mysql.sql schema_pg.sql

Page 9: PerlApp2Postgresql (2)

Schema migration

1 - Wrap schema_pg.sql in a BEGIN; ROLLBACK;
2 - Adapt and reorder DDL statements
3 - Run against the PostgreSQL DB
4 - Read error messages
5 - GOTO 2
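Step 1 of the loop above can be sketched in a few lines of Perl. This is only an illustration: the helper name wrap_in_rollback and the one-table DDL are hypothetical, and in practice the input would be the contents of schema_pg.sql.

```perl
use strict;
use warnings;

# Wrap DDL text in BEGIN; ... ROLLBACK; so that nothing is committed
# while the schema is still being adapted.
sub wrap_in_rollback {
    my ($ddl) = @_;
    $ddl =~ s/\s+\z//;    # drop trailing whitespace before appending ROLLBACK
    return "BEGIN;\n$ddl\nROLLBACK;\n";
}

# Hypothetical one-table schema, used only for illustration.
my $ddl = "CREATE TABLE tag (id INTEGER PRIMARY KEY, tag_name VARCHAR(255) NOT NULL);\n";
print wrap_in_rollback($ddl);
```

Because the whole file runs inside one transaction that is rolled back, each iteration starts from an empty database.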

Page 10: PerlApp2Postgresql (2)

Schema Migration - DDL

Mysql: Character set at column level

Pg: Pick one that makes sense for your data

Page 11: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: Collation at column level

Trick to make a string case sensitive: COLLATE utf8_bin

Pg: Don’t need that, except if a collation is really needed for itself.

Page 12: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: TINYINT(1)

A sort of MySQL boolean.

Pg: Use BOOLEAN if your app really treats the column as a boolean (only 0 and 1). If other values occur (2, ...), use SMALLINT.

Page 13: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: INT(11), INT(25), INT(33), INT(7), ... (the number is only a display width)

Pg: INTEGER

Page 14: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: INT(*) UNSIGNED

Q: Semantic UNSIGNED or cargo cult UNSIGNED? Look at your app and at the data.

Pg: Semantic: add a CHECK >= 0. Cargo cult: INTEGER if the data allows, BIGINT otherwise.

Page 15: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: BIGINT(*) UNSIGNED

Semantic: CHECK >= 0

Data: Do you really need integers above 9223372036854775807 (9.2 x 10^18)?

If yes then use NUMERIC (Not sequence-able)
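The INTEGER / BIGINT / NUMERIC decision on the last two slides comes down to a range check on the largest value the column actually holds. A minimal sketch, assuming a hypothetical helper named pg_type_for_unsigned_max and assuming the maximum was obtained beforehand (for example with a SELECT MAX(...) on the MySQL side). Math::BigInt is used so that values above the signed 64-bit range still compare correctly.

```perl
use strict;
use warnings;
use Math::BigInt;

# Upper bounds of the PostgreSQL INTEGER (32-bit) and BIGINT (64-bit) types.
my $PG_INTEGER_MAX = Math::BigInt->new('2147483647');
my $PG_BIGINT_MAX  = Math::BigInt->new('9223372036854775807');

# Given the largest value stored in a MySQL UNSIGNED column,
# return the smallest PostgreSQL type that can hold it.
sub pg_type_for_unsigned_max {
    my ($max_value) = @_;
    my $max = Math::BigInt->new("$max_value");
    die "Not a non-negative integer: $max_value\n"
        if $max->is_nan() || $max->is_neg();
    return 'INTEGER' if $max->bcmp($PG_INTEGER_MAX) <= 0;
    return 'BIGINT'  if $max->bcmp($PG_BIGINT_MAX) <= 0;
    return 'NUMERIC';
}

print pg_type_for_unsigned_max('4294967295'), "\n";    # BIGINT
```

Whatever type comes out, a semantically unsigned column still needs its own CHECK >= 0 constraint.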

Page 16: PerlApp2Postgresql (2)

Number of cat hairs on Earth: 2.4 x 10^16

Source: Yahoo answer, Ask.com

Page 17: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: DATETIME

Pg: TIMESTAMP WITH TIME ZONE

DateTime->now() is in UTC

Page 18: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: JSON blobs. Often stored as TEXT

Pg: JSONB, which is indexable (or just JSON if needed)

Page 19: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: IP addresses as VARCHAR

Pg: Use INET (compatible with V4 and V6 IPs)

Page 20: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: CHI::Driver::DBI tables

    chi_..(key VARCHAR(300), value BLOB)

Pg:

    chi_..(key BYTEA NOT NULL, value BYTEA)

Page 21: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: ENUM

Pg: Define an ENUM type and use it as a column type:

    CREATE TYPE bla_type AS ENUM('v1', 'v2');
    .. COLUMN one_bla bla_type, ...

Page 22: PerlApp2Postgresql (2)

Schema migration - DDL

Mysql: Trigger bodies are in the trigger definition.

Postgres: Define a special (trigger) function, then CREATE TRIGGER. Example:

    CREATE TRIGGER trigger_watchdog_resultcount_update
      AFTER UPDATE ON watchdog_result
      FOR EACH ROW WHEN ( NEW.viewed <> OLD.viewed )
      EXECUTE PROCEDURE watchdog_subscriptions_count_maintain();

Page 23: PerlApp2Postgresql (2)

Schema - AUTO_INCREMENT

Mysqldump: Does not output AUTO_INCREMENT property.

Postgres: Implement sequences manually with a SQL script..

Page 24: PerlApp2Postgresql (2)

Schema - AUTO_INCREMENT

    DROP FUNCTION IF EXISTS install_auto_increment(varchar, varchar) CASCADE;

    CREATE FUNCTION install_auto_increment(t_name varchar, c_name varchar)
    RETURNS void AS $end_of_function$
    DECLARE
      sequence_name varchar;
      max_id BIGINT;
    BEGIN
      sequence_name := t_name || '_' || c_name || '_seq';
      -- Drop any existing default and sequence so the function can be re-run.
      EXECUTE 'ALTER TABLE ' || quote_ident(t_name)
           || ' ALTER COLUMN ' || quote_ident(c_name) || ' DROP DEFAULT';
      EXECUTE 'DROP SEQUENCE IF EXISTS ' || quote_ident(sequence_name);
      -- Start the new sequence just above the highest existing id.
      EXECUTE 'SELECT COALESCE( max( ' || quote_ident(c_name) || ' ), 0 ) + 1 FROM '
           || quote_ident(t_name) INTO max_id;
      EXECUTE 'CREATE SEQUENCE ' || quote_ident(sequence_name)
           || ' INCREMENT BY 1 START WITH ' || max_id;
      EXECUTE 'ALTER TABLE ' || quote_ident(t_name)
           || ' ALTER COLUMN ' || quote_ident(c_name)
           || ' SET DEFAULT nextval(''' || quote_ident(sequence_name) || ''')';
      RETURN;
    END;
    $end_of_function$ LANGUAGE plpgsql;

Page 25: PerlApp2Postgresql (2)

Schema - AUTO_INCREMENT

Then call

    SELECT install_auto_increment('table', 'id_column');

For each of your auto_increment tables.
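With more than a handful of tables, the calls can be generated rather than typed. A minimal sketch, assuming a hypothetical table-to-column map (the table names below are placeholders, not the real schema) and hypothetical helper names sql_literal and install_statements:

```perl
use strict;
use warnings;

# Hypothetical map of table name => auto-increment column; replace with your own.
my %auto_increment_columns = (
    account => 'id',
    tag     => 'id',
);

# Quote a value as a SQL string literal by doubling single quotes.
sub sql_literal {
    my ($value) = @_;
    ( my $escaped = $value ) =~ s/'/''/g;
    return "'$escaped'";
}

# Build one SELECT install_auto_increment(...) statement per table,
# in a stable (sorted) order.
sub install_statements {
    my (%columns) = @_;
    my @statements;
    for my $table ( sort keys %columns ) {
        push @statements,
            sprintf( 'SELECT install_auto_increment(%s, %s);',
                sql_literal($table), sql_literal( $columns{$table} ) );
    }
    return @statements;
}

print "$_\n" for install_statements(%auto_increment_columns);
```

The generated script has to be run again after the live data has been loaded, since the function reads the current maximum id to pick the sequence start value.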

Page 26: PerlApp2Postgresql (2)

Application features

Goal: Adapt your application so it works just fine on Postgresql too.

Even if we use DBIx::Class, we will need a function to make the switch.

Page 27: PerlApp2Postgresql (2)

Application features

Add that to your DBIx::Class schema:

    sub with_db {
        my ($self, $pg, $mysql) = @_;
        # Run the first code ref on PostgreSQL, the second one otherwise (MySQL).
        if ( $self->storage()->sqlt_type() eq 'PostgreSQL' ) {
            return $pg->();
        }
        return $mysql->();
    }

Page 28: PerlApp2Postgresql (2)

Application features

Make sure you use DBIx::Class the right way

Some things easy to overlook..

Page 29: PerlApp2Postgresql (2)

Application features

Some things not to be overlooked:

Formatting DateTime in DBIx::Class queries.

    $schema->storage->datetime_parser->format_datetime( $dt )   # $dt: a DateTime instance

Page 30: PerlApp2Postgresql (2)

Application features

Some things not to be overlooked:

Escaping %,_ characters in your LIKE queries

Page 31: PerlApp2Postgresql (2)

Application features

    use DBI::Const::GetInfoType;

    sub escape_for_like {
        my ($self, $string) = @_;
        $string //= '';

        # Ask the driver which character escapes LIKE wildcards.
        my $escape_char = '';
        $self->storage->dbh_do(sub {
            my ( $storage, $dbh ) = @_;
            if ( my $new_escape = $dbh->get_info( $GetInfoType{SQL_SEARCH_PATTERN_ESCAPE} ) ) {
                $escape_char = $new_escape;
            }
        });
        # Escape the wildcards, and the escape character itself.
        $string =~ s/([\Q$escape_char\E_%])/$escape_char$1/g;
        return $string;
    }
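To see what the escaping step does without a database handle, here is a standalone sketch. The function name escape_like_pattern is hypothetical, and the escape character is passed in (defaulting to a backslash, which is an assumption) instead of being read from the driver. It also escapes the escape character itself, so that a literal backslash in user input is not read as an escape.

```perl
use strict;
use warnings;

# Standalone variant of the escaping step: no DBI handle involved.
sub escape_like_pattern {
    my ( $string, $escape_char ) = @_;
    $string      //= '';
    $escape_char //= '\\';    # assumed default escape character
    # Prefix the LIKE wildcards and the escape character with the escape character.
    $string =~ s/([\Q$escape_char\E_%])/$escape_char$1/g;
    return $string;
}

my $user_input = '100%_done';
# The trailing % is added after escaping, so it stays a real wildcard.
my $pattern = escape_like_pattern($user_input) . '%';
print "$pattern\n";    # 100\%\_done%
```

The resulting pattern is then passed as a bind value to the LIKE query, never interpolated into the SQL string.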

Page 32: PerlApp2Postgresql (2)

Application features

The case of case insensitivity

MySQL's default collation makes VARCHAR matching case insensitive

-> Easy: LIKE ‘Bla%’ == LIKE ‘bla%’

-> Painful and bug-prone: VARCHAR as KEYs..

Page 33: PerlApp2Postgresql (2)

Application features

The case of case insensitivity

Postgres: Use a functional index with a good operator class.

    CREATE INDEX tag_name_ci ON tag( LOWER(tag_name) text_pattern_ops);

    SELECT * FROM tag WHERE LOWER(tag_name) LIKE LOWER('AB') || '%';

Page 34: PerlApp2Postgresql (2)

Application features

The case of case insensitivity

    EXPLAIN SELECT * FROM tag WHERE lower(tag_name) LIKE LOWER('AB') || '%';

    Index Scan using tag_name_ci on tag (cost=0.14..8.17 rows=1 width=566)
      Index Cond: ((lower((tag_name)::text) ~>=~ 'ab'::text) AND (lower((tag_name)::text) ~<~ 'ac'::text))
      Filter: (lower((tag_name)::text) ~~ 'ab%'::text)

Page 35: PerlApp2Postgresql (2)

Application features

The case of case insensitivity

In your DBIx::Class app: Use the method with_db(..) to do the right thing for PostgreSQL.
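As an illustration of that dispatch, the sketch below picks a different WHERE fragment per database. FakeStorage and FakeSchema are hypothetical stand-ins that only model sqlt_type() and with_db(), so the example runs without DBIx::Class or a database; tag_prefix_where is likewise a made-up name. In a real app the method would live in the DBIx::Class schema class.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the DBIx::Class storage: only sqlt_type() is modelled.
package FakeStorage {
    sub new       { my ( $class, $type ) = @_; return bless { type => $type }, $class }
    sub sqlt_type { my ($self) = @_; return $self->{type} }
}

# Hypothetical stand-in for the schema, with the same dispatch as with_db.
package FakeSchema {
    sub new     { my ( $class, $type ) = @_; return bless { storage => FakeStorage->new($type) }, $class }
    sub storage { my ($self) = @_; return $self->{storage} }

    sub with_db {
        my ( $self, $pg, $mysql ) = @_;
        return $pg->() if $self->storage()->sqlt_type() eq 'PostgreSQL';
        return $mysql->();
    }
}

package main;

# Return a WHERE fragment and its bind values for a case-insensitive prefix match.
# $prefix is assumed to have been escaped for LIKE already.
sub tag_prefix_where {
    my ( $schema, $prefix ) = @_;
    return $schema->with_db(
        # PostgreSQL: matches the functional index on LOWER(tag_name).
        sub { return ( q{LOWER(tag_name) LIKE LOWER(?) || '%'}, $prefix ) },
        # MySQL: relies on the case-insensitive default collation.
        sub { return ( q{tag_name LIKE CONCAT(?, '%')}, $prefix ) },
    );
}

my ( $sql, @bind ) = tag_prefix_where( FakeSchema->new('PostgreSQL'), 'AB' );
print "$sql [@bind]\n";
```

Keeping both variants side by side in one call makes the database-specific spots easy to find later.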

Page 36: PerlApp2Postgresql (2)

Application features

Database wide advisory locking.

-> Make sure you implement your locking based on ‘with_db’.
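A rough sketch of what such a switch might choose between, assuming MySQL's GET_LOCK(name, timeout) on one side and PostgreSQL's pg_try_advisory_lock(bigint) on the other. The function name lock_statement and the way the lock name is hashed into an integer key are assumptions for illustration, not the original implementation. Note that the semantics differ: GET_LOCK waits up to the timeout, while pg_try_advisory_lock returns immediately.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Return the SQL statement and bind values that take a named lock.
# $sqlt_type is what $schema->storage->sqlt_type() would return.
sub lock_statement {
    my ( $sqlt_type, $lock_name, $timeout_seconds ) = @_;
    if ( $sqlt_type eq 'PostgreSQL' ) {
        # Advisory locks are keyed by integers: derive a 32-bit key from the name.
        # Different names can collide on the same key; acceptable for a sketch.
        my $key = hex( substr( md5_hex($lock_name), 0, 8 ) );
        return ( 'SELECT pg_try_advisory_lock(?)', $key );
    }
    # MySQL: named lock with a timeout in seconds.
    return ( 'SELECT GET_LOCK(?, ?)', $lock_name, $timeout_seconds );
}

my ( $sql, @bind ) = lock_statement( 'PostgreSQL', 'nightly_reindex', 10 );
print "$sql [@bind]\n";
```

Inside a DBIx::Class app, the two branches would be the two code refs passed to with_db.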

Page 37: PerlApp2Postgresql (2)

Application features

If you use CHI::Driver::DBI:

Bug when running with DBI RaiseError option.

This fixes it: https://github.com/jonswar/perl-chi-driver-dbi/pull/5

Page 38: PerlApp2Postgresql (2)

Application features

Does your DBIx::Class application use PostgreSQL reserved words as relation names? Like ‘user’.

$schema->storage->sql_maker->quote_names(1);

Page 39: PerlApp2Postgresql (2)

So you’ve tested your application?

But the upstream schema has changed :(

    $ mysqldump --no-data --compatible=postgresql yourdb > schema_mysql.sql
    $ git diff schema_mysql.sql

Then edit schema_pg.sql...

And test again

Page 40: PerlApp2Postgresql (2)

Time for Data migration

Google will point you to plenty of tools..

The mysqldump way, a few ideas..

Page 41: PerlApp2Postgresql (2)

Time for Data migration

Skip the cruft:

--compact --no-create-info --skip-triggers

Page 42: PerlApp2Postgresql (2)

Time for Data migration

Avoid character set issues:

--default-character-set=utf8 --result-file=data.sql

Page 43: PerlApp2Postgresql (2)

Time for Data migration

Using CHI::Driver::DBI? Skip chi_* tables

--ignore-table=db.chi_bla
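The mysqldump options from the last three slides can be assembled in one place. A minimal sketch, assuming a hypothetical helper mysqldump_data_args; the database name db and the table chi_bla are the placeholders used on the slides. The block only builds and prints the argument list, it does not run mysqldump.

```perl
use strict;
use warnings;

# Build the argument list for a data-only dump.
sub mysqldump_data_args {
    my (%opt) = @_;
    my @args = (
        '--compact', '--no-create-info', '--skip-triggers',    # skip the cruft
        '--default-character-set=utf8',                        # avoid charset issues
        "--result-file=$opt{result_file}",
    );
    # One --ignore-table per table to skip, qualified with the database name.
    push @args, "--ignore-table=$opt{database}.$_" for @{ $opt{ignore_tables} || [] };
    push @args, $opt{database};
    return @args;
}

my @args = mysqldump_data_args(
    database      => 'db',
    result_file   => 'data.sql',
    ignore_tables => ['chi_bla'],
);
print join( ' ', 'mysqldump', @args ), "\n";

# To actually run it, the list form of system() avoids shell quoting:
# system( 'mysqldump', @args ) == 0 or die "mysqldump failed: $?";
```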

Page 44: PerlApp2Postgresql (2)

Time for Data migration

cat header.sql data.sql footer.sql > data_pg.sql

Header:

    SET standard_conforming_strings = 'off'; -- Compatible with mysqldump strings,
    SET backslash_quote = 'on';              -- which are escaped with \
    SET client_min_messages TO ERROR;        -- No warnings for those

    UPDATE pg_cast SET castcontext='a' WHERE casttarget = 'boolean'::regtype; -- MySQL does not have boolean.

    DO $$
    DECLARE
      tables CURSOR FOR SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;
    BEGIN
      FOR table_record IN tables LOOP
        EXECUTE 'ALTER TABLE ' || table_record.tablename || ' DISABLE TRIGGER ALL'; -- Run as superuser.
      END LOOP;
    END$$;

Page 45: PerlApp2Postgresql (2)

Time for Data migration

cat header.sql data.sql footer.sql > data_pg.sql

Footer:

    SET standard_conforming_strings = 'on';
    SET backslash_quote = 'off';

    UPDATE pg_cast SET castcontext='e' WHERE casttarget = 'boolean'::regtype;

    DO $$
    DECLARE
      tables CURSOR FOR SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;
    BEGIN
      FOR table_record IN tables LOOP
        EXECUTE 'ALTER TABLE ' || table_record.tablename || ' ENABLE TRIGGER ALL';
      END LOOP;
    END$$;

Page 46: PerlApp2Postgresql (2)

A timeline

1 - Reach prereq state
2 - Fork the code and migrate schema
3 - Test and adapt your app, write data migration scripts
4 - Rebase and goto 3
5 - Schema freeze
6 - Full data migration test
7 - Live goes to read only or down
8 - Migrate live data for good
9 - Re-install the sequences
10 - Switch the application!
11 - Fix and support

Page 47: PerlApp2Postgresql (2)

Conclusion

PostgreSQL’s richer DDL allows stronger validation and consistency.

No magic automatic migration possible

Decisions have to be taken by humans

Page 48: PerlApp2Postgresql (2)

Conclusion, continued

- A very good occasion to review and fix your schema

- A very good occasion to make your App DB agnostic

- Most of the steps have to be tailored to the specific application

- Application expertise needed

Page 49: PerlApp2Postgresql (2)

Thanks!

Questions?