B3: Putting OpenEdge Auditing to Work: Dump and Load with (Almost) No Downtime
David Eddy, Senior Solution Consultant
© 2008 Progress Software Corporation
Why?
Get access to Type-II storage areas
Reorganise database
Can't afford to have the DB off-line!
So What’s the Big Deal?
Dump and load with two small downtime slots:
• A few minutes at the front end
• As little as 5 minutes at the back end – YMMV
Old DB continues in live use while the new one is built
• No lost data
Agenda
Introduction
The Dump/Load Process
Techniques: Turning Auditing to this Purpose
Techniques: Offline Dump and Load
Other Issues
Introduction
Why is this important?
The plan
Why it works
Why it might not work
Why auditing?
Why is this Important?
Type II Storage Areas
• 15-20% performance boost
• Removal of limits
• On-line maintainability
• Future capabilities (e.g. per-area backups)
Plan Timeline
Timeline (old DB remains live throughout):
• Add storage areas for auditing data
• Enable auditing
• Add auditing policy & enable
• Copy live DB via backup/restore
• Dump
• Build new structure
• Load & index build
• Initial replication
• Second replication
• Changeover:
  – Final replication
  – Verification test (DB content comparison)
  – Signoff
  – Make new DB live
Why It Works
Bulk of DB activity is reads
• Updates are a small proportion of the total
• Batch update can be done in << real time
Multiple updates per row
• Only interested in the last
• Further decreases total work replicating changes
...for most databases
Why It Might Not Work
Update rate too high / no duplication of activity
Cost of change capture too high
No unique key
• Record replication needs a unique key
Why Auditing?
Reliable, light-weight change capture
• Implemented in the DB engine core
Easy to set up
• Can use the GUI tool for policy set-up if desired
No code impact
See the OpenEdge Core Services manual for more detail about auditing
Agenda
Introduction
The Dump/Load Process
Techniques: Turning Auditing to this Purpose
Techniques: Offline Dump and Load
Other Issues
The Dump/Load Process
Migrate to OpenEdge 10.1C in-situ
Prepare auditing policy
Initiate auditing
Offline dump/load
Repeated batch replication to "catch up"
Verification of database contents & signoff
Changeover
The Dump/Load Process: Migrate to OpenEdge 10.1C In-Situ
Auditing requires 10.1
• OpenEdge 10.1C+ ideally
Must run 10.1 in production during change capture
If this represents a migration, you must first:
• Do a trial migration in your test environment
• Test & get signoff
• Migrate clients to 10.1 if needed
Migration In-Situ (continued)
Convert database to 10.1 if not already there
• First disable JTA, Replication, AI if enabled
• Then:

proutil db -C truncate bi
$DLC10/bin/proutil db -C conv910

New DB will have 64-bit features enabled
• So we will also enable them in the migrated DB to catch problems early

proutil db -C enableseq64
proutil db -C enablelargekeys
The Dump/Load Process: Prepare Auditing Policy
Nominate tables to be replicated
• Omit any scratch/work tables
Specify a unique key for each table
• Ideally corresponding with an index
Create auditing policy
• APMT (from DB Admin Tool)
• Direct policy data creation
Automation
• Extract key field list from DB
• Load policy without using APMT
The Dump/Load Process: Initiation of Auditing
Add storage areas to contain the auditing data (audit.st):

d "AuditData",128;512 /path/to/extents
d "AuditIndex",16;64 /path/to/extents

prostrct add [or addonline] db audit.st

Enable auditing (and create the auditing tables):

proutil db -C enableauditing area AuditData indexarea AuditIndex deactivateidx

Consider enabling large file support:

proutil db -C enablelargefiles

Load and enable the policies
• APMT or bespoke tool
The Dump/Load Process: Offline Dump/Load
Take online backup of production DB
Restore in working area
Dump contents
• Don't dump scratch/work tables
Delete old DB
Build new structure for DB
• Modify schema to suit (and load into new DB)
Load dumped data
Procedural details later
The Dump/Load Process: Batch Replication
Gather committed changes
• Record timestamp of last change gathered
Eliminate duplicates
• Only interested in the last change for each row
Split list of changes into multiple lists
• Use multiple loaders to load the changes
Write a script to do all this
Batch Replication – Catching Up
Initial run will take some time
• 3 days' work might take 4 hours to apply
Next run will be faster
• 4 hours' work might take 15 minutes to apply
Once caught up, replicate periodically
• Keep replication execution time low
Replication interval is a trade-off
• Less frequent means more data reduction through duplicate elimination
• Too infrequent means replication takes too long
The Dump/Load Process: Verification of Contents
You should have a basis for certifying that the new DB is a good copy of the old
• proutil tabanalys
• Application reports, e.g. trial balance
Everyone must agree that the tests are a valid basis for certification
Tests should not take too long to run
The Dump/Load Process: Changeover
Block user access to live DB
Perform final replication
Dump users, sequence values from old DB
Verify content equality (& signoff)
Make new DB live
• Load users, sequence values into new DB
• Shut down old DB
• Change references in startup scripts, OR move or delete old DB and replace with new DB
Allow users access
Initiate backup
Agenda
Introduction
The Dump/Load Process
Techniques: Turning Auditing to this Purpose
Techniques: Offline Dump and Load
Other Issues
Techniques: Auditing for Dump/Load
Audit policy
Audit data tables
Extracting key data
Data reduction
Applying changes
Audit Policy
Defines auditing:
• What tables/fields are audited
• What data is recorded
Can be multiple policies active
• DB will merge them at runtime to generate the "effective" policy
• Report available through APMT
Contained in tables
• _aud-*-policy
Audit Policy Starting Point
You must know the name of each table you will replicate
For each replicated table, you must know the list of fields you will use to uniquely identify each row in that table
Suggest automatic generation from schema (unique indexes)

# Table keys for dump/load auditing policy
# C:\oebin\oework\tmp\test
Salesrep: SalesRep
Vacation: StartDate
Department: DeptCode
State: State
...
Audit Policy with APMT
GUI tool provided with the OpenEdge distribution
How to use:
• Define a new policy
• Add tables
• Add fields
• Add events
• Enable policy
• Export new policy to XML (for later upload to live)
Audit Policy with APMT: Policy
These values are the defaults.
Audit Policy with APMT: Tables
Note “standard” audit level
Audit Policy with APMT: Fields
List of key fields (for the Customer table in this example).
Don't forget to set the identifying sequence number for each field in the unique key.
Audit Policy with APMT: Events
You need to define audit events, which describe the events used by this policy.
Events 5100-5102 are the standard events for capturing creates, updates, and deletes in the database.
Don’t forget to do this or no changes will be captured.
Audit Policy Tables
If you're not using APMT, you need to set:
• _aud-audit-policy
• _aud-file-policy
• _aud-field-policy
• _aud-event-policy
Audit Policy Tables: _aud-audit-policy
_audit-policy-guid = base64-encode(generate-uuid)
_audit-data-security-level = 0
_audit-custom-detail-level = 0
_audit-policy-active = true

One record per policy (i.e. we add 1 record)
Audit Policy Tables: _aud-file-policy
_audit-policy-guid = _aud-audit-policy._audit-policy-guid
_file-name = "table name"
_owner = "PUB" /* usually */
_audit-create-level = 12
_audit-update-level = 12
_audit-delete-level = 12
_create-event-id = 5100
_update-event-id = 5101
_delete-event-id = 5102

One record for each table being audited
Audit Policy Tables: _aud-field-policy
_audit-policy-guid = _aud-audit-policy._audit-policy-guid
_file-name = "table name"
_owner = "PUB" /* usually */
_field-name = "field name"
_audit-identifying-field = seq_num_of_field_in_unique_key
_audit-create-level = 12
_audit-update-level = 12
_audit-delete-level = 12

One record for each key field of each table
Audit Policy Tables: _aud-event-policy
_audit-policy-guid = _aud-audit-policy._audit-policy-guid
_event-id = {5100|5101|5102}
_event-level = 1

Three records per policy (one for each event used)
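To illustrate the "direct policy data creation" route, here is a hedged Python sketch that assembles the four record sets above as plain dicts from a table-to-key-fields map. Field names and values follow the slides; actually writing the records into the _aud-*-policy tables would be done with ABL or SQL, which is not shown, and build_policy is an illustrative name.

```python
# Sketch only: build _aud-*-policy record sets as dicts (field names from
# the slides). Writing them to the database is left to ABL/SQL.
import base64
import uuid

EVENTS = (5100, 5101, 5102)  # standard create / update / delete events

def build_policy(table_keys):
    """table_keys maps table name -> ordered list of unique-key fields."""
    guid = base64.b64encode(uuid.uuid4().bytes).decode()
    policy = {"_audit-policy-guid": guid,
              "_audit-data-security-level": 0,
              "_audit-custom-detail-level": 0,
              "_audit-policy-active": True}
    files, fields = [], []
    for table, keys in table_keys.items():
        # one _aud-file-policy record per audited table
        files.append({"_audit-policy-guid": guid, "_file-name": table,
                      "_owner": "PUB",
                      "_audit-create-level": 12, "_audit-update-level": 12,
                      "_audit-delete-level": 12,
                      "_create-event-id": 5100, "_update-event-id": 5101,
                      "_delete-event-id": 5102})
        # one _aud-field-policy record per key field, numbered in key order
        for seq, key in enumerate(keys, start=1):
            fields.append({"_audit-policy-guid": guid, "_file-name": table,
                           "_owner": "PUB", "_field-name": key,
                           "_audit-identifying-field": seq,
                           "_audit-create-level": 12,
                           "_audit-update-level": 12,
                           "_audit-delete-level": 12})
    # three _aud-event-policy records, one per event used
    events = [{"_audit-policy-guid": guid, "_event-id": e, "_event-level": 1}
              for e in EVENTS]
    return policy, files, fields, events
```

Feeding this the table/key map generated from the schema gives one policy record, one file record per table, one field record per key field, and three event records, matching the record counts stated on the slides.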
Audit Data Tables
Only interested in _aud-audit-data
Key fields:
• _audit-date-time
• _event-id
• _event-context
Some indexes present even after deactivateidx:
• _Audit-data-guid
• _Audit-date-time
Audit Data Tables: _aud-audit-data
_audit-date-time is a datetime-tz field
_event-id tells you whether it was a create, update or delete
_audit-event-context contains the table name and key field values
_audit-event-data contains changed field values but can be safely ignored
_audit-event-context
_audit-event-context = table-name + chr(6) + list-of-key-field-values
list-of-key-field-values = key-value-1 [ + chr(7) + key-value-2 ] ...
Order of key field values is dictated by the identifying field sequence number
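The layout above decodes mechanically. A minimal Python sketch (the replication script itself would likely be ABL; the function name is illustrative):

```python
# Decode _audit-event-context: table name, then chr(6), then key values
# separated by chr(7).
def parse_event_context(ctx):
    table, _, keypart = ctx.partition(chr(6))
    return table, keypart.split(chr(7))
```

The returned key values are positional, in identifying-field sequence order, so they can be matched directly against the unique-key field list defined in the policy.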
Extracting key data
Pull data from _aud-audit-data in _audit-date-time order
Want only those rows which can be share-locked
• Unlockable audit data means the transaction is still in progress
• Must not replicate uncommitted data
Dump only rows with a time stamp earlier than that of the last lockable row
• Order not guaranteed within the same date/time value
Need only 3 fields:
• _event-context
• _event-id
• _audit-date-time
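The cut-off rule can be sketched as follows, assuming the lockable rows have already been extracted as (timestamp, event-id, event-context) tuples in _audit-date-time order (the tuple layout is illustrative):

```python
# Because ordering is not guaranteed within one _audit-date-time value,
# drop every row sharing the final timestamp; those rows are picked up
# by the next replication pass instead.
def trim_to_safe_boundary(rows):
    if not rows:
        return rows
    last_ts = rows[-1][0]                 # timestamp of last lockable row
    return [r for r in rows if r[0] < last_ts]
```

The trimmed-off rows are not lost: they remain in _aud-audit-data and are gathered on the next pass, once their timestamp group is complete.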
Data reduction
Need only the last change for each row mentioned
• Ignore or discard earlier changes
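A minimal sketch of this reduction, assuming changes arrive as (table, key, event-id) tuples in _audit-date-time order (tuple layout illustrative):

```python
# Keep only the last change seen for each (table, key) pair; because the
# input is in timestamp order, later changes simply overwrite earlier ones.
def reduce_changes(changes):
    last = {}
    for change in changes:
        table, key, event_id = change
        last[(table, key)] = change
    return list(last.values())
```

This is where the technique wins: a row updated a hundred times between passes costs one replicated change, not a hundred.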
Applying changes
Changes can be applied in any order
Recommend multi-threading the load
• OpenEdge performance is best with multiple threads
• Tune new DB for heavy transaction activity
Write a program to read the event data and emulate it in the new DB, fail-safe:
• Create/Update = create if row does not exist in new; copy row from live DB if it does exist in new
• Delete = do nothing if target row does not exist
• Batch transactions for performance
Run multiple threads, e.g. 4-8
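The fail-safe rules above can be sketched with the two databases modelled as dicts keyed by row key. In practice this is ABL reading the live DB and writing the new one; apply_change and the dict model are illustrative stand-ins:

```python
# Fail-safe apply: for create/update, take the row's *current* content
# from the live DB; for delete, removing an absent row is a no-op.
def apply_change(new_db, live_db, key, event_id):
    if event_id in (5100, 5101):       # create or update
        row = live_db.get(key)
        if row is None:
            return                     # row since deleted in live; skip
        new_db[key] = row              # create, or refresh existing copy
    elif event_id == 5102:             # delete
        new_db.pop(key, None)
```

Because each rule is idempotent and always copies the current live content, replaying the same reduced change list twice, or in any thread interleaving, converges on the same result, which is what makes the multi-threaded load safe.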
Agenda
Introduction
The Dump/Load Process
Techniques: Turning Auditing to this Purpose
Techniques: Offline Dump and Load
Other Issues
Techniques: Off-Line Dump and Load
Planning the new database
Adjusting the schema
Dumping
Loading
Index rebuild
Planning the New Database
Plan to use Type II storage areas everywhere
Rows per block not that critical
• Type-II areas have no max size (>1 PB @ 128)
• Use 128 or 256 for all table areas
• Use 8 or 16 for all indexes
Blocks per cluster more important
• Use 512 bpc only for objects 20 MB+
• Use 8 bpc otherwise
Distinct data & index areas
Try to equalise area sizes
• Index build is parallel across areas
Adjusting the Schema
Old .df has the old storage areas encoded
Manual editing is error prone
Recommended technique:
• Build an empty DB with both old & new areas (only 1 variable extent for each area)
• Load old schema
• Build & execute a script which tablemoves objects to the new areas
• Dump new .df
New .st and .df define the new database
Dumping
Use binary dump
• No size limit
• No data corruption risk
Self-balancing parallel technique:
• List tables in descending order of size
• Start multiple dump threads
  – Each thread takes the 1st table from the list and dumps it
Speed is nice but not critical
• Dumping a single table multithreaded might be useful for very large tables
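The self-balancing scheme above can be sketched with a shared work queue: tables are queued largest-first, and any thread that finishes a small table immediately picks up the next one while a big dump is still running. dump_table is a stand-in for invoking the actual binary dump; names are illustrative.

```python
# Self-balancing parallel dump sketch: workers pull the next (largest
# remaining) table off a shared queue until it is empty.
import queue
import threading

def parallel_dump(tables_by_size, dump_table, n_threads=4):
    q = queue.Queue()
    # Largest tables first, so the long dumps start immediately.
    for name, size in sorted(tables_by_size.items(),
                             key=lambda t: t[1], reverse=True):
        q.put(name)

    def worker():
        while True:
            try:
                name = q.get_nowait()
            except queue.Empty:
                return                 # no tables left; thread exits
            dump_table(name)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

No static partitioning of tables across threads is needed; the queue keeps every thread busy until the work runs out, which is why the technique is self-balancing.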
Loading
Use binary load
Difficult to go faster than an offline load, as follows:

proutil db -C truncate bi -bi 512 -biblocksize 8
for f in *.bd* ; do
    proutil db -C load $f -r -B 500
done

Don't bother with "build indexes"
• Multithreaded idxbuild is faster
Index Rebuild
Set up sort file dbname.srt
Invoke index build in multi-threaded mode:

proutil db -C idxbuild all -TB 31 -TM 32 -SS db.srt -SG 64 -B 500 -thread 1 -threadnum cpus*1.5

IDXBUILD is multi-threaded by default
• No need to specify -thread, -threadnum if happy with 1 thread per CPU
Agenda
Introduction
The Dump/Load Process
Techniques: Turning Auditing to this Purpose
Techniques: Offline Dump and Load
Other Issues
Other Issues
Do I need to use auditing?
Do I need to use 10.1C?
What if I am already using auditing?
Do I Need to Use Auditing?
No – the key factor is record-based replication
• Trigger-based replication works too
Auditing is easy to implement, low-impact, and fast
• A much better choice than trigger-based replication
Do I Need to Use OpenEdge 10.1C+?
No – auditing is available from 10.1A+
• 10.1C is the current version
10.1B SP03+ is recommended because of auditing bugs
• Not critical, as the audited database will be discarded
• Bugs mostly have to do with disabling auditing and removing users
If you're using trigger-based replication
• Any V8.3+ is okay
What If I’m Already Using Auditing?
Dump/load policy settings will be merged into your existing policy by the DB engine
Simply don't load the dump/load policy into the new DB
Don't enable auditing in the new DB until live changeover (or immediately before, if testing is required)
Questions?
Thank You