10 Things to Avoid in Data Model

download 10 Things to Avoid in Data Model

of 20

Transcript of 10 Things to Avoid in Data Model

  • 7/28/2019 10 Things to Avoid in Data Model

    1/20

    Ten Things to Avoid in a Data Model

    Dr. Michael BlahaModelsoft Consulting Corpwww.modelsoftcorp.com

    E-mail: [email protected]

  • 7/28/2019 10 Things to Avoid in Data Model

    2/20

    PAGE 2

    Introduction

    A modelis an abstraction of some aspect of a problem.

    A data modelis a model that describes how data is represented

    and accessed, usually for a database.

    Data modeling can be a difficult task and is often pivotal to the success or failure of

    a project.

    There are many pitfalls to data modeling as we will explain...

    Strategic pitfalls.

    Detailed pitfalls.

    We do not discuss detailed modeling constructs such as keys, datatypes, nullability, and referential integrity.

  • 7/28/2019 10 Things to Avoid in Data Model

    3/20

    PAGE 3

    Strategic Pitfalls

  • 7/28/2019 10 Things to Avoid in Data Model

    4/20

    PAGE 4

    Strategic Pitfall: Vague Purpose

    Dont build a model without understanding the business rationale.

    The purpose for a model dictates the level of detail.

    Just entities and relationships.

    Fully attributed.

    With data types and constraints. The purpose also dictates the level of polish, the degree of completeness, and

    the amount of time justified.

    Different kinds of data models.

    Detailed application model for development.

    Rough application for a purchase spec.

    Enterprise model for integration.

    This pitfall might seem obvious, but Ive seen modeling efforts with little

    business purpose and no clear definition of deliverables.

  • 7/28/2019 10 Things to Avoid in Data Model

    5/20

    PAGE 5

    Strategic Pitfall: Literal Modeling

    Your job is not to do what the customer says. Your job is to solve

    the problem that the customer is imperfectly describing.

    You must pay attention to the hidden true requirements.

    You must interpret and abstract what the customer tells you.

    You must recognize arbitrary business decisions that could easily change.

    You can raise abstraction by thinking in terms of patterns.

    The use case mentality really misses this point.

  • 7/28/2019 10 Things to Avoid in Data Model

    6/20

    PAGE 6

    Strategic Pitfall: Literal Modeling Example

    The original model is correct, but has problems. What happens if a person gets

    promoted to a supervisor and then to a manager? Are there multiple records?

    Movement of a record? Or???

    The improved model is more abstract and softcodes the management

    hierarchy.

    Originalliteralmodel

    mproved

    abstractmodel

  • 7/28/2019 10 Things to Avoid in Data Model

    7/20

    PAGE 7

    Strategic Pitfall: Large Size

    Avoid large models. Limit a model to no more than 200 tables.

    Large models involve more work.

    Is the large size really justified or can you simplify the model with

    abstraction?

    I rarely encounter a large model with a compelling justification.

    I dont see this step in software development methodologies, but

    it is certainly needed.

  • 7/28/2019 10 Things to Avoid in Data Model

    8/20

    PAGE 8

    Strategic Pitfall: Speculative Content

    Do not include content that is not needed now and might be

    helpful in the future..

    All this does is to make a model larger, increase development

    time, and raise cost.

    A model must fully address the requirements, but not greatly

    exceed them.

    A quality model should be readily extended, so there is no need to

    add content in advance of need.

    Speculative content runs counter to the philosophy of agile

    development.

  • 7/28/2019 10 Things to Avoid in Data Model

    9/20

    PAGE 9

    Strategic Pitfall: Lack of Clarity

    A relational database is declarative. Declare data in your

    models.

    A domain is the set of possible values for an attribute.

    ERwin lets you define domains and then assign them to the pertinent attributes.

    An enumeration is a domain that has a finite set of values.

    Declare enumerations in your databases.

    Dont store data structures with a binary encoding.

    Dont use cryptic names.

    Dont use anonymous fields that application code must interpret.

    Obfuscation can happen through sloppy development practices.

  • 7/28/2019 10 Things to Avoid in Data Model

    10/20

    PAGE 10

    Strategic Pitfall: Lack of Clarity Example

    Enumerationstored in

    lace

    Enumerationstoredseparately

    Car table

    1

    2

    3

    carID year color weight

    2001

    1989

    2000

    red

    red

    blue

    2000

    1500

    2500

    Car table

    1

    2

    3

    carID year colorID weight

    2001

    1989

    2000

    1

    1

    3

    2000

    1500

    2500

    Color table

    1

    2

    3

    colorID color

    red

    green

    blue

    Enumerationencoded

    Car table

    1

    2

    3

    carID year color weight

    2001

    1989

    2000

    1

    1

    3

    2000

    1500

    2500

  • 7/28/2019 10 Things to Avoid in Data Model

    11/20

    PAGE 11

    Detailed Pitfalls

  • 7/28/2019 10 Things to Avoid in Data Model

    12/20

    PAGE 12

    Detailed Pitfall: Reckless Violation of Normal Forms

    Do not accidentally violate normal forms.

    A normal form is a guideline that increases data consistency.

    As tables satisfy higher levels of normal forms, they are less likely

    to store redundant or contradictory data.

    Denormalization is only justified when there is a major

    performance bottleneck, such as for data warehouses.

    Be suspicious of large tables (30 attributes or more).

    Be suspicious of any entity type that is difficult to define.

    It is acceptable to violate normal forms deliberately, when there

    is a good reason to do so.

  • 7/28/2019 10 Things to Avoid in Data Model

    13/20

    PAGE 13

    Detailed Pitfall: Normal Forms Example

    The contact position and contact phone depend on the contactname.

    The contact name depends on customerPK.

    Violatesnormalorm

    Satisfiesnormalorm

  • 7/28/2019 10 Things to Avoid in Data Model

    14/20

    PAGE 14

    Detailed Pitfall: Needless Redundancy

    Be careful with redundancy.

    Redundancy across applications.

    Redundancy within an application.

    Normal forms are one aspect of redundancy.

    Ideally there should be a single recording of each data item. (Rarely is thiscompletely feasible.)

    Organizations are rife with applications that overlap in awkward and loosely

    controlled ways.

    This is a major justification for data warehouses.

    Dont include redundant data to compensate for a poorly conceivedapplication.

    Redundant data is acceptable if you use built-in database features to keep

    redundant data consistent (such as materialized views).

  • 7/28/2019 10 Things to Avoid in Data Model

    15/20

    PAGE 15

    Detailed Pitfall: Parallel Attributes

    Avoid parallel attributes for non-data-warehouse applications.

    Parallel attributes often codify arbitrary business decisions, reducing

    information system flexibility.

    Widespread use of parallel attributes often indicates a poor model.

    Parallel attributes Parameterized model

  • 7/28/2019 10 Things to Avoid in Data Model

    16/20

  • 7/28/2019 10 Things to Avoid in Data Model

    17/20

  • 7/28/2019 10 Things to Avoid in Data Model

    18/20

    PAGE 18

    Summary

    Data modeling is often a pivotal task in building a database

    application.

    A data model determines an applications data quality,

    extensibility, and performance and influences whether the

    application has a chance at business success.

    You can improve your data models if you pay attention to the

    pitfalls we have covered.

  • 7/28/2019 10 Things to Avoid in Data Model

    19/20

    PAGE 19

    Speaker Bio

    Since 1994 Dr. Michael Blaha has been a consultant and trainer inconceiving, architecting, modeling, designing, and tuningdatabases for dozens of organizations throughout the world.

    He has authored six U.S. patents, five widely used books, andmany papers.

    His most recent book, Patterns of Data Modeling, was publishedin June 2010.

    Blaha received his doctorate from Washington University in St.Louis and is an alumnus of GE Global Research in Schenectady, NY.

    You can contact him at [email protected] andwww.modelsoftcorp.com.

  • 7/28/2019 10 Things to Avoid in Data Model

    20/20

    PAGE 20

    Questions?