Migrating from RDBMS to MongoDB

Post on 16-Apr-2017


RDBMS to MongoDB Migration Best Practices

Mrinal Sarkar, Solutions Architect

mrinal.sarkar@mongodb.com

What We’ll Cover

• Relational Challenges
• Migration Roadmap
• Schema Design
• Application Integration
• Data Migration
• Operational Considerations
• Resources to Get Started

Relational Challenges

Relational

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Relational Database Challenges

Data Types
• Unstructured data
• Semi-structured data
• Polymorphic data

Agile Development
• Iterative
• Short development cycles
• New workloads

Volume of Data
• Terabytes to petabytes of data
• Billions of records
• Thousands of queries per second

New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing

The World Has Changed

Data, Risk, Time, Cost

NoSQL

Scalability & Performance

Always On, Global Deployments

Flexibility

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Nexus Architecture

Scalability & Performance

Always On, Global Deployments

Flexibility

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Migration Steps

Migration Roadmap

• Backed by Free, Online MongoDB Training
• Paid Consulting, Services and Support available

Schema Design

Definitions

RDBMS       MongoDB
Database    Database
Table       Collection
Row         Document
Index       Index
JOIN        Embedded document, document references, or $lookup to combine data from different collections

SQL to Aggregation Mapping

Mapping Chart: http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/

Mapping MongoDB Query Language to SQL

Mapping Chart: http://docs.mongodb.org/manual/reference/sql-comparison/
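As a rough illustration of what the mapping chart covers, the sketch below pairs one SQL statement with an equivalent MongoDB `find` command document, written as a plain Python dict. The `people` table/collection and its `status` and `age` fields are hypothetical and do not come from the slides.

```python
# Hypothetical SQL statement (table and column names are illustrative only).
sql = (
    "SELECT * FROM people "
    "WHERE status = 'A' AND age > 25 "
    "ORDER BY age DESC LIMIT 10"
)

# The same request expressed as a MongoDB `find` command document.
find_command = {
    "find": "people",                                # collection ~ table
    "filter": {"status": "A", "age": {"$gt": 25}},   # WHERE clause
    "sort": {"age": -1},                             # ORDER BY age DESC
    "limit": 10,                                     # LIMIT 10
}

print(find_command["filter"])
```

With a driver such as PyMongo, a document like this could be sent with `db.command(find_command)`, or the same query written as `db.people.find({...}).sort("age", -1).limit(10)`.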

Modeling Relationships: Embedding and Referencing

• Embedding
– For 1:1 or 1:Many (where “many” viewed with the parent)
– Ownership and containment
– Document limit of 16MB, consider document growth
– Atomicity of updates

• Referencing
– _id field is referenced in the related document
– Application runs 2nd query to retrieve the data
– Data duplication vs performance gain
– Object referenced by many different sources
– Models complex Many : Many & hierarchical structures
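The two modeling options can be sketched in plain Python. This is a minimal illustration under assumptions, not a migration tool: the row layouts, the `person_id` / `car_id` columns, and the `owner_id` reference field are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical relational rows: one person and that person's cars (1:many).
person_rows = [
    {"person_id": 1, "first_name": "Paul", "surname": "Miller", "city": "London"},
]
car_rows = [
    {"car_id": 10, "person_id": 1, "model": "Bentley", "year": 1973},
    {"car_id": 11, "person_id": 1, "model": "Rolls Royce", "year": 1965},
]


def embed(people, cars):
    """Embedding: fold each person's cars into the person document."""
    cars_by_owner = defaultdict(list)
    for car in cars:
        cars_by_owner[car["person_id"]].append(
            {"model": car["model"], "year": car["year"]}
        )
    return [
        {
            "_id": p["person_id"],
            "first_name": p["first_name"],
            "surname": p["surname"],
            "city": p["city"],
            "cars": cars_by_owner[p["person_id"]],
        }
        for p in people
    ]


def reference(people, cars):
    """Referencing: two collections; each car stores its owner's _id."""
    people_docs = [
        {"_id": p["person_id"], "first_name": p["first_name"],
         "surname": p["surname"], "city": p["city"]}
        for p in people
    ]
    car_docs = [
        {"_id": c["car_id"], "owner_id": c["person_id"],
         "model": c["model"], "year": c["year"]}
        for c in cars
    ]
    return people_docs, car_docs


embedded_docs = embed(person_rows, car_rows)
people_docs, car_docs = reference(person_rows, car_rows)
```

With embedding, one read returns the person together with the cars; with referencing, the application (or `$lookup`) needs a second step to fetch the cars by `owner_id`.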

Data Models: Relational to Document

Relational vs. MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Referencing Documents

Document Model Benefits

RDBMS vs. MongoDB

{
  _id: ObjectId("4c4ba5e5e8aabf3"),
  employee_name: "Dunham, Justin",
  department: "Marketing",
  title: "Product Manager, Web",
  report_up: "Neray, Graham",
  pay_band: "C",
  benefits: [
    { type: "Health", plan: "PPO Plus" },
    { type: "Dental", plan: "Standard" }
  ]
}

Anatomy of a BSON Document

{
  first_name: 'Paul',
  surname: 'Miller',
  cell: '+447557505611',
  city: 'London',
  location: [45.123, 47.232],
  Profession: ['banking', 'finance', 'trader'],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Callouts:
• Fields
• Typed field values: String, Number, Geo-Coordinates
• Fields can contain arrays
• Fields can contain an array of sub-documents

Document Model Benefits

Agility and flexibility
• Data model supports business change
• Rapidly iterate to meet new requirements

Intuitive, natural data representation
• Eliminates ORM layer
• Developers are more productive

Reduces the need for joins, disk seeks
• Programming is simpler
• Performance delivered at scale


MongoDB is Fully Featured

Indexing in MongoDB

• MongoDB indexing will be familiar to DBAs
– B-Tree Indexes, Secondary Indexes
• Single biggest tunable performance factor
– Define indexes by identifying common queries
– Use MongoDB explain to ensure index coverage
– MongoDB profiler logs all slow queries

Index types:
• Compound
• Unique
• Array
• TTL
• Geospatial
• Hash
• Sparse
• Partial (new in version 3.2)
• Text Search
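A few of the index types listed above can be sketched as a `createIndexes` command document. The collection name, field names, and index names here are hypothetical; only the overall command shape is MongoDB's.

```python
# Hypothetical collection and fields, used purely for illustration.
create_indexes_command = {
    "createIndexes": "people",
    "indexes": [
        # Compound index: supports queries on surname, then city.
        {"key": {"surname": 1, "city": 1}, "name": "surname_city"},
        # Unique index: rejects duplicate email values.
        {"key": {"email": 1}, "name": "email_unique", "unique": True},
        # TTL index: documents become eligible for removal about one hour
        # after the date stored in `created_at`.
        {"key": {"created_at": 1}, "name": "created_at_ttl",
         "expireAfterSeconds": 3600},
        # Partial index (MongoDB 3.2+): only indexes documents that match
        # the filter expression.
        {"key": {"last_login": 1}, "name": "last_login_active",
         "partialFilterExpression": {"status": "active"}},
    ],
}

index_names = [spec["name"] for spec in create_indexes_command["indexes"]]
print(index_names)
```

With PyMongo this document could be sent via `db.command(create_indexes_command)`; running `explain` on the common queries afterwards is one way to confirm the indexes are actually used.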

Further Reading

http://docs.mongodb.org/manual/data-modeling/

Application Integration

Drivers & Ecosystem

Morphia

MEAN Stack

Python, Perl, Ruby

Support for the most popular languages and frameworks

Application Integration: MongoDB Aggregation Framework

• Ad-hoc reporting, grouping and aggregations, without the complexity of MapReduce
– Max, Min, Averages, Sum, Union, Redact, GeoNear
• Similar functionality to SQL GROUP BY
• Processes a stream of documents
• Series of operators
– Filter or transform data
– Input/output chain
• Supports single servers & shards
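To make the "stream of documents through a series of operators" idea concrete, here is a small sketch: a pipeline definition, plus a pure-Python function that mirrors what this one pipeline computes. The `orders` data and field names are hypothetical, and `run_locally` is not a general pipeline evaluator; it only imitates these three stages.

```python
from collections import defaultdict

# Roughly comparable to:
#   SELECT cust_id, SUM(amount) AS total FROM orders
#   WHERE status = 'shipped' GROUP BY cust_id ORDER BY total DESC
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$cust_id", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]


def run_locally(orders):
    """Pure-Python mirror of the pipeline above, for illustration only."""
    totals = defaultdict(int)
    for order in orders:                                 # stream of documents
        if order["status"] == "shipped":                 # $match
            totals[order["cust_id"]] += order["amount"]  # $group with $sum
    results = [{"_id": cust, "total": total} for cust, total in totals.items()]
    return sorted(results, key=lambda d: d["total"], reverse=True)  # $sort


orders = [
    {"cust_id": "A", "status": "shipped", "amount": 50},
    {"cust_id": "B", "status": "shipped", "amount": 20},
    {"cust_id": "A", "status": "pending", "amount": 99},
    {"cust_id": "A", "status": "shipped", "amount": 25},
]
result = run_locally(orders)
print(result)  # [{'_id': 'A', 'total': 75}, {'_id': 'B', 'total': 20}]
```

Against a real deployment, the same `pipeline` list would be passed to the driver, for example `db.orders.aggregate(pipeline)` in PyMongo.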

High Availability: Replica Sets

Replica set: 2 to 50 copies

Addresses availability considerations:
• High Availability
• Disaster Recovery
• Maintenance

Workload Isolation: operational & analytics


Scalability via Sharding

Multiple query optimization models

Each sharding option appropriate for different apps

Elastic and self-balancing

Shard Key Selection: http://docs.mongodb.org/manual/tutorial/choose-a-shard-key/


BI Integration

https://docs.mongodb.org/ecosystem/tools/hadoop/

MongoDB Connector for BI

Visualize and explore multi-dimensional documents using SQL-based BI tools. The connector does the following:

• Provides the BI tool with the schema of the MongoDB collection to be visualized
• Translates SQL statements issued by the BI tool into equivalent MongoDB queries that are sent to MongoDB for processing
• Converts the results into the tabular format expected by the BI tool, which can then visualize the data based on user requirements

Data Integrity

Data Governance with Document Validation

Implement data governance without sacrificing the agility that comes from a dynamic schema

• Enforce data quality across multiple teams and applications

• Use familiar MongoDB expressions to control document structure

• Validation is optional and can be as simple as a single field, all the way to every field, including existence, data types, and regular expressions


Document Validation Example

The example on the left adds a rule to the contacts collection that validates:

• The year of birth is no later than 1994

• The document contains a phone number and / or an email address

• When present, the phone number and email addresses are strings
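The slide's own example is not reproduced in this transcript. A rule along the lines of the three bullets above might look like the following sketch, written as a `collMod` command document; the field names `year_of_birth`, `phone`, and `email` are assumptions. Note that this simple form requires at least one of `phone` or `email` to be a string, but it does not by itself reject a document where the other field is present with a different type.

```python
# Reconstructed rule (an assumption, not the slide's original text).
validator = {
    "$and": [
        {"year_of_birth": {"$lte": 1994}},
        {"$or": [
            {"phone": {"$type": "string"}},
            {"email": {"$type": "string"}},
        ]},
    ]
}

# `collMod` attaches a validator to an existing collection.
coll_mod_command = {"collMod": "contacts", "validator": validator}


def passes_rule(doc):
    """Pure-Python mirror of the validator above, for illustration only."""
    year = doc.get("year_of_birth")
    born_in_time = isinstance(year, int) and year <= 1994
    has_contact = (
        isinstance(doc.get("phone"), str) or isinstance(doc.get("email"), str)
    )
    return born_in_time and has_contact


checks = [
    passes_rule({"year_of_birth": 1990, "email": "paul@example.com"}),
    passes_rule({"year_of_birth": 2000, "phone": "+447557505611"}),
    passes_rule({"year_of_birth": 1990}),
]
print(checks)  # [True, False, False]
```

The `passes_rule` helper exists only to show the rule's logic; in practice the server evaluates the validator on inserts and updates.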

Data Durability: Write Concern & Journal

• Configurable per operation
• Combination of Write Concern levels & Journaling allows multiple levels of guarantees

Write Concern describes the level of acknowledgement requested from MongoDB for write operations
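As a sketch of "configurable per operation", the insert command below carries its own write concern document. The `orders` collection and the document contents are hypothetical; the `w`, `j`, and `wtimeout` fields are the standard write concern options.

```python
# Stronger guarantee: wait for a majority of replica set members and for
# the write to reach the on-disk journal, giving up after 5 seconds.
durable_write_concern = {"w": "majority", "j": True, "wtimeout": 5000}

# Weaker, faster guarantee: acknowledgement from the primary only.
fast_write_concern = {"w": 1}

# Hypothetical insert command that requests the stronger guarantee.
insert_command = {
    "insert": "orders",
    "documents": [{"_id": 1, "item": "widget", "qty": 2}],
    "writeConcern": durable_write_concern,
}

print(insert_command["writeConcern"])
```

In PyMongo the same setting is usually expressed with `WriteConcern(w="majority", j=True, wtimeout=5000)` applied through `collection.with_options(write_concern=...)`, so different operations can trade latency for durability as needed.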

Migration and Operations


Traditional ETL

Source Database ETL

Incremental Migration, Live

Legacy Database

MongoDB Database

Operations

• Configuration, Provisioning, Monitoring and Backup
• High Availability & Disaster Recovery
• Scalability
• Hardware selection
– Commodity Servers: Prioritize RAM, Fast CPUs & SSD
• Security
– Access Control, Authentication, Encryption

Download the Whitepaper: MongoDB Operations Best Practices


Ops Manager & Cloud Manager

Single-click provisioning, scaling & upgrades, admin tasks

Monitoring, with charts, dashboards and alerts on 100+ metrics

Backup and restore, with point-in-time recovery, support for sharded clusters

The Best Way to Manage MongoDB

Up to 95% Reduction in Operational Overhead

MongoDB Compass

For fast schema discovery and visual construction of ad-hoc queries

• Visualize schema
– Frequency of fields
– Frequency of types
– Determine validator rules
• View Documents
• Graphically build queries
• Authenticated access


Getting Started

MongoDB Enablement

Consulting, training, and professional services throughout your project lifecycle

Project phases: Design & Development, Pre-Production (Test, QA, Deployment), Production, Expansion

Audiences: For Operations, For Developers, For Both

Services: Dedicated Consulting Engineer, Custom Projects, Operations Rapid Start, Production Readiness, MongoDB Private Cloud Accelerator, Health Check, Development Rapid Start, Performance Evaluation and Tuning

Training: Developer Training, Essentials Training, Administrator Training, Advanced Developer Training, Advanced Administrator Training

Migration in Action

eCommerce Application
• Migration from MS-SQL
• Project completed in 8 months vs. the 18 months originally planned
• High availability, performance and reliability at a fraction of the cost
• Lower latency
• Faster dev cycles

Content Management
• Migration from Oracle
• 80% cost reduction with commodity hardware
• 900% performance improvement
• Development cycles in weeks vs. tens of months

Customer Data Mgmt & Analytics
• Multi RDBMS Migration
• 95% faster in identifying matches
• 50% increase in paying subscribers
• 60% increase in unique web site visits

Summary

• MongoDB brings the best of both relational & NoSQL data models
• MongoDB is a full-featured database platform
• MongoDB helps you reduce your project time, cost and risks
• Migrating to MongoDB is easier than before with enterprise-level consulting, training and support

Download the Guide: https://www.mongodb.com/collateral/rdbms-mongodb-migration-guide