Young-Jin Kim, Eileen McNaughton, Micah Lee
Breaking your proprietary software habit
Best practices for data import into CiviCRM
7 deadly sins of data migration
1. Wrath - Feeling you'll get if you don't plan!
2. Gluttony - Failure to restrict import scope
3. Greed - Failure to get rid of data
4. Sloth - Failure to iterate quickly, work cleanly
5. Pride - Failure to validate the import
6. Lust - Failure to dedupe
7. Envy - Failure to leave behind old ways
Best practices for data migrations
1. Use a dedicated environment for data imports
2. Automate scripts for the full import early on! Use APIs!
3. Judiciously, with client input, limit data import scope
4. Data import is an iterative process: iterate, iterate!
5. Think about current workflow and future workflow as it
impacts data mapping into CiviCRM
6. If you can, draw up a time horizon that will demarcate
stale data from current data, e.g. 3 years in the past
7. Don't reinvent the wheel: make use of free tools, e.g.
Migrate, Civimigrate, ETL tools, Google Refine, APIs
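Practice 6 (a time horizon separating stale from current data) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `last_activity` field, the sample records, and the `split_by_horizon` helper are hypothetical and not part of CiviCRM or any tool named above.

```python
from datetime import date

def split_by_horizon(records, today, years=3, date_field="last_activity"):
    """Partition records into (current, stale) around a cutoff date."""
    try:
        cutoff = today.replace(year=today.year - years)
    except ValueError:
        # today is 29 February and the target year is not a leap year
        cutoff = today.replace(year=today.year - years, day=28)
    current, stale = [], []
    for record in records:
        # Records on or after the cutoff count as current
        (current if record[date_field] >= cutoff else stale).append(record)
    return current, stale

# Hypothetical legacy contacts; only the date field matters here
contacts = [
    {"id": 1, "last_activity": date(2012, 5, 1)},
    {"id": 2, "last_activity": date(2008, 11, 20)},
    {"id": 3, "last_activity": date(2010, 6, 30)},
]
current, stale = split_by_horizon(contacts, today=date(2013, 4, 1))
```

Keeping the cutoff as a parameter makes it easy to rerun the split when the client changes their mind about the horizon, which fits the "iterate, iterate" advice.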
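Two possible migration workflows
[Diagram, workflow 1 labels: Legacy DB, Export, Cleanse, Transform, Import, CiviCRM DB; tools shown: Google Refine, Pentaho Kettle]
[Diagram, workflow 2 labels: Legacy DB, Export, Export DB, Transform, Import, CiviCRM DB; tool shown: Civimigrate module]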
Google Refine
● Free, open source data cleaning tool written in Java, running as a local web application
● Uber-spreadsheet "on steroids" with a GUI
● Reads in many file types and data formats, and also Google Docs spreadsheets
● Many built-in data transformations for merging, clustering, matching, faceting
● Ability to extend capabilities by writing custom transforms in GREL, Python or Clojure
● Cleaning procedure can be saved as JSON and replayed easily
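To give a feel for what clustering does when deduping, here is a rough Python approximation of key-collision clustering with a "fingerprint" key, one of the clustering methods Refine offers. It is a simplified sketch, not Refine's implementation: the real method also normalizes accented characters, and the sample names are made up for illustration.

```python
import string
from collections import defaultdict

def fingerprint(value):
    """Build a key: trim, lowercase, drop punctuation, sort unique tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)

def cluster(values):
    """Group values whose fingerprints collide; keep groups of 2 or more."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]

# Hypothetical organisation names as they might appear in a legacy export
names = [
    "Electronic Frontier Foundation",
    "electronic frontier foundation.",
    "Foundation, Electronic Frontier",
    "CiviCRM LLC",
]
clusters = cluster(names)
```

Each cluster is a candidate set of duplicates that a person would still review before merging.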
Pentaho Data Integration
● Free, open source Extract-Transform-Load (ETL) tool written in Java on the Eclipse framework
● Visual programming interface (GUI) for pipelining data and inspecting data streams
● Comes with connectors to many existing data(base) formats for input and output
● Write custom JavaScript and Java steps
● Data stream is routed using transformation steps; transformations can be chained in a job
● Transformations and jobs are stored as XML
● Replay the XML files from the command line
What is Civimigrate?
It's a bandaid between the Migrate module and the CiviCRM API. More technically, it exposes the API as a Migrate destination.
● Maps source data (CSV, Oracle, XML, MySQL, JSON ....) to Migrate destinations
● Supplies a framework to do trial imports, rollbacks and updates, via Drush or the GUI
● Map tables maintain relationships between source data and the resulting CiviCRM entities
● Allows you to use hooks to manipulate data during the migration (prepareRow + callbacks, e.g. to sanitize data)
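The ideas in this list (a map table, a row-preparation hook, trial imports with rollback and update) can be illustrated with a toy Python model. This is a conceptual sketch only: Migrate and Civimigrate are Drupal/PHP code, and the `TrialImport` class, its methods, and the `sanitize` hook below are hypothetical names invented for illustration, not their API.

```python
class TrialImport:
    """Toy importer that keeps a source-id -> destination-id map table."""

    def __init__(self, prepare_row=None):
        # Hook applied to every row before import (hypothetical stand-in
        # for a prepareRow-style callback)
        self.prepare_row = prepare_row or (lambda row: row)
        self.destination = {}   # stands in for the resulting entities
        self.map_table = {}     # source id -> destination id
        self._next_id = 1

    def import_rows(self, rows):
        for row in rows:
            row = self.prepare_row(dict(row))
            if row is None:     # the hook may decide to skip a row
                continue
            source_id = row.pop("source_id")
            dest_id = self.map_table.get(source_id)
            if dest_id is None:  # first time we see this source record
                dest_id = self._next_id
                self._next_id += 1
                self.map_table[source_id] = dest_id
            self.destination[dest_id] = row  # create or update

    def rollback(self):
        # Remove only what this import created, using the map table
        for dest_id in self.map_table.values():
            self.destination.pop(dest_id, None)
        self.map_table.clear()

def sanitize(row):
    """Example hook: normalize the email, skip rows without one."""
    row["email"] = row["email"].strip().lower()
    return row if row["email"] else None

importer = TrialImport(prepare_row=sanitize)
importer.import_rows([
    {"source_id": "A1", "email": " Ada@Example.org "},
    {"source_id": "B2", "email": "   "},
])
imported = dict(importer.destination)
importer.import_rows([{"source_id": "A1", "email": "ada@example.net"}])
updated = dict(importer.destination)
importer.rollback()
```

Because the map table remembers which destination record came from which source record, a second run updates in place instead of creating duplicates, and a rollback touches nothing else.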
What does Migrate do?
You've migrated your data, but what about your donors?
EFF had ~1,000 recurring donors in Convio, bringing in ~$20,000 per month. We spent a long, long time saving them, but in the end succeeded. Probably worth it.
Ways to save your recurring donors:
● Call them on the phone, ask them to re-donate (recommended)
● Get credit card numbers, carefully babysit a Selenium script
● Keep the old payment processor around until all cards expire, write CiviCRM integration code