Dr. Honghui Deng

36
8.1 Dr. Honghui Deng Dr. Honghui Deng Assistant Professor Assistant Professor MIS Department MIS Department UNLV UNLV MIS 370 System Analysis Theory

description

MIS 370 System Analysis Theory. Dr. Honghui Deng. Assistant Professor MIS Department UNLV. MIS 370 System Analysis Theory. Chapter 8. DATA MODELING AND ANALYSIS. Learning Objectives. Define data modeling and explain its benefits. - PowerPoint PPT Presentation

Transcript of Dr. Honghui Deng

Page 1: Dr. Honghui Deng

8.1

Dr. Honghui DengDr. Honghui Deng

Assistant ProfessorAssistant Professor

MIS DepartmentMIS Department

UNLVUNLV

MIS 370 System Analysis Theory

Page 2: Dr. Honghui Deng

8.2

Chapter 8Chapter 8

DATA MODELING AND DATA MODELING AND ANALYSISANALYSIS

MIS 370 System Analysis Theory

Page 3: Dr. Honghui Deng

8.3

Learning ObjectivesLearning Objectives

• Define data modeling and explain its benefits.Define data modeling and explain its benefits.• Recognize and understand the basic concepts and constructs of a Recognize and understand the basic concepts and constructs of a

data model.data model.• Read and interpret an entity relationship data model.Read and interpret an entity relationship data model.• Explain when data models are constructed during a project and Explain when data models are constructed during a project and

where the models are stored.where the models are stored.• Discover entities and relationships.Discover entities and relationships.• Construct an entity-relationship context diagram.Construct an entity-relationship context diagram.• Discover or invent keys for entities and construct a key-based Discover or invent keys for entities and construct a key-based

diagram.diagram.• Construct a fully attributed entity relationship diagram and Construct a fully attributed entity relationship diagram and

describe data structures and attributes to the repository.describe data structures and attributes to the repository.• Normalize a logical data model to remove impurities that can make Normalize a logical data model to remove impurities that can make

a database unstable, inflexible, and nonscalable. a database unstable, inflexible, and nonscalable. • Describe a useful tool for mapping data requirements to business Describe a useful tool for mapping data requirements to business

operating locations.operating locations.

Page 4: Dr. Honghui Deng

8.4

Data ModelingData Modeling

• Data modelingData modeling – a technique for – a technique for organizing and documenting a system’s organizing and documenting a system’s data. Sometimes called data. Sometimes called database database modelingmodeling. .

• Entity relationship diagram (ERD)Entity relationship diagram (ERD) – a data – a data model utilizing several notations to model utilizing several notations to depict data in terms of the entities and depict data in terms of the entities and relationships described by that data.relationships described by that data.

Page 5: Dr. Honghui Deng

8.5

Sample ERDSample ERD

Page 6: Dr. Honghui Deng

8.6

EntityEntity

• Entity – a class of persons, places, objects, Entity – a class of persons, places, objects, events, or concepts about which we need to events, or concepts about which we need to capture and store data.capture and store data.– Named by a singular noun

• PersonsPersons: agency, contractor, customer, : agency, contractor, customer, department, division, employee, instructor, department, division, employee, instructor, student, supplier. student, supplier.

• PlacesPlaces: sales region, building, room, : sales region, building, room, branch office, campus. branch office, campus.

• ObjectsObjects: book, machine, part, product, raw material, software : book, machine, part, product, raw material, software license, software package, tool, vehicle model, vehicle. license, software package, tool, vehicle model, vehicle.

• EventsEvents: application, award, cancellation, class, flight, invoice, : application, award, cancellation, class, flight, invoice, order, registration, renewal, requisition, reservation, sale, trip. order, registration, renewal, requisition, reservation, sale, trip.

• ConceptsConcepts: account, block of time, bond, course, fund, : account, block of time, bond, course, fund, qualification, stock. qualification, stock.

Student

Page 7: Dr. Honghui Deng

8.7

• Entity instance – a single occurrence of an Entity instance – a single occurrence of an entity. entity.

EntityEntity

Student Student IDID

Last Last NameName

First First NameName

21442144 ArnoldArnold BettyBetty

31223122 TaylorTaylor JohnJohn

38433843 SimmonSimmonss

LisaLisa

98449844 MacyMacy BillBill

28372837 LeathLeath HeatherHeather

22932293 WrenchWrench TimTim

Entity

Instances

Page 8: Dr. Honghui Deng

8.8

AttributesAttributes

• AttributeAttribute – a descriptive – a descriptive property or characteristic property or characteristic of an entity. Synonyms of an entity. Synonyms include include elementelement, , propertyproperty, and , and fieldfield. . – Just as a physical student can

have attributes, such as hair color, height, etc., a data entity has data attributes

• Compound attributeCompound attribute – an – an attribute that consists of other attribute that consists of other attributes. Synonyms in attributes. Synonyms in different data modeling different data modeling languages are numerous: languages are numerous: concatenated attribute, concatenated attribute, composite attribute, and data composite attribute, and data structure.structure.

STUDENT

Name .Last Name .First Name .Middle InitialAddress .Street .City .State .Country .Zip CodePhone Number .Area Code .Number .Extension

Page 9: Dr. Honghui Deng

8.9

Data TypeData Type

• Data typeData type – a property of an attribute that identifies – a property of an attribute that identifies what type of data can be stored in that attribute.what type of data can be stored in that attribute.

Representative Logical Data Types for AttributesRepresentative Logical Data Types for Attributes

Logical Data TypeLogical Data Type Logical Business MeaningLogical Business Meaning

NUMBERNUMBER Any number, real or integer.Any number, real or integer.

TEXTTEXT A string of characters, inclusive of numbers. When numbers are included in A string of characters, inclusive of numbers. When numbers are included in a TEXT attribute, it means that we do not expect to perform arithmetic or a TEXT attribute, it means that we do not expect to perform arithmetic or comparisons with those numbers.comparisons with those numbers.

MEMOMEMO Same as TEXT but of an indeterminate size. Some business systems require Same as TEXT but of an indeterminate size. Some business systems require the ability to attach potentially lengthy notes to a give database record.the ability to attach potentially lengthy notes to a give database record.

DATEDATE Any date in any format.Any date in any format.

TIMETIME Any time in any format.Any time in any format.

YES/NOYES/NO An attribute that can assume only one of these two values.An attribute that can assume only one of these two values.

VALUE SETVALUE SET A finite set of values. In most cases, a coding scheme would be established A finite set of values. In most cases, a coding scheme would be established (e.g., FR=Freshman, SO=Sophomore, JR=Junior, SR=Senior).(e.g., FR=Freshman, SO=Sophomore, JR=Junior, SR=Senior).

IMAGEIMAGE Any picture or image.Any picture or image.

Page 10: Dr. Honghui Deng

8.10

DomainsDomains

• DomainDomain – a property of an attribute that defines what – a property of an attribute that defines what values an attribute can legitimately take on.values an attribute can legitimately take on.

Representative Logical Domains for Logical Data TypesRepresentative Logical Domains for Logical Data Types

Data TypeData Type DomainDomain ExamplesExamples

NUMBERNUMBERFor integers, specify the range.For integers, specify the range.

For real numbers, specify the range and precision.For real numbers, specify the range and precision.

{10-99}{10-99}

{1.000-799.999}{1.000-799.999}

TEXTTEXTMaximum size of attribute.Maximum size of attribute.

Actual values are usually infinite; however, users may Actual values are usually infinite; however, users may specify certain narrative restrictions.specify certain narrative restrictions.

Text(30)Text(30)

DATEDATE Variation on the MMDDYYYY format. Variation on the MMDDYYYY format. MMDDYYYYMMDDYYYY

MMYYYYMMYYYY

TIMETIMEFor AM/PM times: HHMMTFor AM/PM times: HHMMT

For military (24-hour times): HHMMFor military (24-hour times): HHMM

HHMMTHHMMT

HHMMHHMM

YES/NOYES/NO {YES, NO}{YES, NO} {YES, NO} {ON, OFF}{YES, NO} {ON, OFF}

VALUE SETVALUE SET{value#1, value#2,…value#n}{value#1, value#2,…value#n}

{table of codes and meanings}{table of codes and meanings}

{M=Male{M=Male

F=Female}F=Female}

Page 11: Dr. Honghui Deng

8.11

Default ValueDefault Value

• Default valueDefault value – the value that will be recorded if a – the value that will be recorded if a value is not specified by the user.value is not specified by the user.

Permissible Default Values for AttributesPermissible Default Values for Attributes

Default ValueDefault Value InterpretationInterpretation ExamplesExamples

A legal value A legal value from the from the domaindomain

For an instance of the attribute, if the user For an instance of the attribute, if the user does not specify a value, then use this value.does not specify a value, then use this value.

00

1.001.00

NONE or NULLNONE or NULL For an instance of the attribute, if the user For an instance of the attribute, if the user does not specify a value, then leave it blank.does not specify a value, then leave it blank.

NONENONE

NULLNULL

Required or Required or NOT NULLNOT NULL

For an instance of the attribute, require that For an instance of the attribute, require that the user enter a legal value from the domain. the user enter a legal value from the domain. (This is used when no value in the domain is (This is used when no value in the domain is common enough to be a default but some common enough to be a default but some value must be entered.)value must be entered.)

REQUIREDREQUIRED

NOT NULLNOT NULL

Page 12: Dr. Honghui Deng

8.12

IdentificationIdentification

• Key – an attribute, or a group of attributes, Key – an attribute, or a group of attributes, that assumes a unique value for each that assumes a unique value for each entity instance. It is sometimes called an entity instance. It is sometimes called an identifieridentifier..– Concatenated key - a group of attributes that

uniquely identifies an instance of an entity. Synonyms include composite key and compound key.

– Candidate key – one of a number of keys that may serve as the primary key of an entity. Also called a candidate identifier.

– Primary key – a candidate key that will most commonly be used to uniquely identify a single entity instance.

– Alternate key – a candidate key that is not selected to become the primary key is called an alternate key. A synonym is secondary key.

– Subsetting criteria – an attribute(s) whose finite values divide all entity instances into useful subsets. Sometimes called inversion entry.

STUDENT

Student Number (Primary Key)Social Security Number (Alternate Key)Name .Last Name .First Name .Middle InitialAddress .Street .City .State .Country .Zip CodePhone Number .Area Code .Number .ExtensionGender (Subsetting Criteria 1)Major (Subsetting Criteria 2)Grade Point Average

Page 13: Dr. Honghui Deng

8.13

RelationshipsRelationships

• Relationship – a natural business Relationship – a natural business association that exists between one or association that exists between one or more entities. more entities.

The relationship may represent an event that links the entities or merely a logical affinity that exists between the entities.

Student Curriculumis being studied by is enrolled in

Page 14: Dr. Honghui Deng

8.14

CardinalityCardinality

• CardinalityCardinality – the minimum and maximum – the minimum and maximum number of occurrences of one entity that number of occurrences of one entity that may be related to a single occurrence of the may be related to a single occurrence of the other entity.other entity.

Because all relationships are bidirectional, cardinality must be defined in both directions for every relationship.

Student Curriculumis being studied by is enrolled in

bidirectional

Page 15: Dr. Honghui Deng

8.15

Cardinality NotationsCardinality Notations

Page 16: Dr. Honghui Deng

8.16

DegreeDegree

• DegreeDegree – the number of entities – the number of entities that participate in the that participate in the relationship. relationship. – A relationship between two entities

is called a binary relationship.

– A relationship between different instances of the same entity is called a recursive relationship.

– A relationship between three entities is called a 3-ary or ternary relationship.

Page 17: Dr. Honghui Deng

8.17

DegreeDegree

• Associative entityAssociative entity – an – an entity that inherits its entity that inherits its primary key from more primary key from more than one other entity than one other entity (called parents). (called parents).

• Each part of that Each part of that concatenated key concatenated key points to one and only points to one and only one instance of each of one instance of each of the connecting entities. the connecting entities.

AssociativeEntity

Page 18: Dr. Honghui Deng

8.18

Foreign KeysForeign Keys

• Foreign keyForeign key – a primary key of an entity that is – a primary key of an entity that is used in another entity to identify instances of used in another entity to identify instances of a relationship.a relationship.– A foreign key is a primary key of one entity that is

contributed to (duplicated in) another entity to identify instances of a relationship.

– A foreign key always matches the primary key in the another entity

– A foreign key may or may not be unique (generally not)– The entity with the foreign key is called the child.– The entity with the matching primary key is called the

parent.

Page 19: Dr. Honghui Deng

8.19

Foreign KeysForeign Keys

Student IDStudent ID Last NameLast Name First NameFirst Name DormDorm

21442144 ArnoldArnold BettyBetty SmitSmithh

31223122 TaylorTaylor JohnJohn JoneJoness

38433843 SimmonsSimmons LisaLisa SmitSmithh

98449844 MacyMacy BillBill

28372837 LeathLeath HeatherHeather SmitSmithh

22932293 WrenchWrench TimTim JoneJoness

DormDorm Residence DirectorResidence Director

SmithSmith Andrea FernandezAndrea Fernandez

JonesJones Daniel AbidjanDaniel Abidjan

Primary Key

Primary Key

Foreign KeyDuplicated from primary key of Major entity(not unique)

Page 20: Dr. Honghui Deng

8.20

Nonidentifying RelationshipsNonidentifying Relationships

• Nonidentifying relationshipNonidentifying relationship – a relationship in which each – a relationship in which each participating entity has its own independent primary key participating entity has its own independent primary key – Primary key attributes are not shared.

– The entities are called strong entities

Page 21: Dr. Honghui Deng

8.21

Identifying RelationshipsIdentifying Relationships

• Identifying relationshipIdentifying relationship – a relationship in which the – a relationship in which the parent entity’ key is also part of the primary key of the parent entity’ key is also part of the primary key of the child entity. child entity. – The child entity is called a weak entity.

Page 22: Dr. Honghui Deng

8.22

Nonspecific RelationshipsNonspecific Relationships

• Nonspecific Nonspecific relationshiprelationship – a – a relationship where relationship where many instances of an many instances of an entity are associated entity are associated with many instances with many instances of another entity. of another entity. Also called Also called many-to-many-to-many relationshipmany relationship..

• Nonspecific Nonspecific relationships must be relationships must be resolved. Most resolved. Most nonspecific nonspecific relationships can be relationships can be resolved by resolved by introducing an introducing an associative entityassociative entity..

Page 23: Dr. Honghui Deng

8.23

GeneralizationGeneralization

• GeneralizationGeneralization – a concept wherein the attributes – a concept wherein the attributes that are common to several types of an entity are that are common to several types of an entity are grouped into their own entity.grouped into their own entity.

• SupertypeSupertype – an entity whose instances store – an entity whose instances store attributes that are common to one or more entity attributes that are common to one or more entity subtypes. subtypes.

• SubtypeSubtype – an entity whose instances may inherit – an entity whose instances may inherit common attributes from its entity supertypecommon attributes from its entity supertype– And then add other attributes that are unique to

the subtype.

Page 24: Dr. Honghui Deng

8.24

Generalization HierarchyGeneralization Hierarchy

Page 25: Dr. Honghui Deng

8.25

Process of Logical Data ModelingProcess of Logical Data Modeling

• Strategic Data ModelingStrategic Data Modeling– Many organizations select IS development

projects based on strategic plans.• Includes vision and architecture for information Includes vision and architecture for information

systemssystems• Identifies and prioritizes develop projectsIdentifies and prioritizes develop projects• Includes enterprise data model as starting point for Includes enterprise data model as starting point for

projectsprojects

• Data Modeling during Systems AnalysisData Modeling during Systems Analysis– Data model for a single information system is

called an application data model.– Context data model includes only entities and

relationships.

Page 26: Dr. Honghui Deng

8.26

Logical Model Development StagesLogical Model Development Stages

1.1. Context Data modelContext Data model – To establish project scope

2.2. Key-base data modelKey-base data model– Eliminate nonspecific relationships– Add associative entities– Include primary and alternate keys– Precise cardinalities

3.3. Fully attributed data modelFully attributed data model– All remaining attributes– Subsetting criteria

4.4. Normalized data modelNormalized data model

MetadataMetadata - data about data. - data about data.

Page 27: Dr. Honghui Deng

8.27

Questions for Data ModelingQuestions for Data Modeling

PurposePurpose Candidate Questions Candidate Questions ((see Table 8-4 in text for a more complete list)see Table 8-4 in text for a more complete list)

Discover system entitiesDiscover system entities What are the subjects of the business? What are the subjects of the business?

Discover entity keysDiscover entity keysWhat unique characteristic (or characteristics) What unique characteristic (or characteristics) distinguishes an instance of each subject from other distinguishes an instance of each subject from other instances of the same subject?instances of the same subject?

Discover entity subsetting criteriaDiscover entity subsetting criteria Are there any characteristics of a subject that divide all Are there any characteristics of a subject that divide all instances of the subject into useful subsets?instances of the subject into useful subsets?

Discover attributes and domainsDiscover attributes and domains What characteristics describe each subject?What characteristics describe each subject?

Discover security and control Discover security and control needsneeds

Are there any restrictions on who can see or use the Are there any restrictions on who can see or use the data?data?

Discover data timing needsDiscover data timing needs How often does the data change?How often does the data change?

Discover generalization Discover generalization hierarchieshierarchies Are all instances of each subject the same?Are all instances of each subject the same?

Discover relationships?Discover relationships? What events occur that imply associations between What events occur that imply associations between subjects?subjects?

Discover cardinalitiesDiscover cardinalities Is each business activity or event handled the same way, Is each business activity or event handled the same way, or are there special circumstances?or are there special circumstances?

Page 28: Dr. Honghui Deng

8.28

Entity DiscoveryEntity Discovery

Entity NameEntity Name Business DefinitionBusiness Definition

AgreementAgreement A contract whereby a member agrees to purchase a certain number A contract whereby a member agrees to purchase a certain number of products within a certain time. After fulfilling that agreement, the of products within a certain time. After fulfilling that agreement, the member becomes eligible for bonus credits that are redeemable for member becomes eligible for bonus credits that are redeemable for free or discounted products.free or discounted products.

MemberMember An active member of one or more clubs.An active member of one or more clubs.Note: A target system objective is to re-enroll inactive members as opposed to deleting them.

Member Member orderorder

An order generated for a member as part of a monthly promotion, or An order generated for a member as part of a monthly promotion, or an order initiated by a member.an order initiated by a member.

Note: The current system only supports orders generated from promotions; however, customer initiated orders have been given a high priority as an added option in the proposed system.

TransactionTransaction A business event to which the Member Services System must A business event to which the Member Services System must respond.respond.

ProductProduct An inventoried product available for promotion and sale to members.An inventoried product available for promotion and sale to members.Note: System improvement objectives include (1) compatibility with new bar code system being developed for the warehouse, and (2) adaptability to a rapidly changing mix of products.

PromotionPromotion A monthly or quarterly event whereby special product offerings are A monthly or quarterly event whereby special product offerings are made available to members.made available to members.

Page 29: Dr. Honghui Deng

8.29

The Context Data ModelThe Context Data Model

Page 30: Dr. Honghui Deng

8.30

The Key-based Data ModelThe Key-based Data Model

Page 31: Dr. Honghui Deng

8.31

The Key-based Data Model With GeneralizationThe Key-based Data Model With Generalization

Page 32: Dr. Honghui Deng

8.32

The Fully-Attributed Data ModelThe Fully-Attributed Data Model

Page 33: Dr. Honghui Deng

8.33

What is a Good Data Model?What is a Good Data Model?

• A good data model is simple.A good data model is simple.– Data attributes that describe any given entity should

describe only that entity.– Each attribute of an entity instance can have only one

value.

• A good data model is essentially nonredundant.A good data model is essentially nonredundant.– Each data attribute, other than foreign keys, describes at

most one entity.– Look for the same attribute recorded more than once

under different names.

• A good data model should be flexible and adaptable A good data model should be flexible and adaptable to future needs.to future needs.

Page 34: Dr. Honghui Deng

8.34

Data Analysis & NormalizationData Analysis & Normalization

• Data analysisData analysis – a technique used to improve – a technique used to improve a data model for implementation as a a data model for implementation as a database.database.– Goal is a simple, nonredundant, flexible,

and adaptable database.

• NormalizationNormalization – a data analysis technique – a data analysis technique that organizes data into groups to form that organizes data into groups to form nonredundant, stable, flexible, and adaptive nonredundant, stable, flexible, and adaptive entities.entities.

Page 35: Dr. Honghui Deng

8.35

Normalization: 1NF, 2NF, 3NFNormalization: 1NF, 2NF, 3NF

• First normal form (1NF)First normal form (1NF) – an entity whose attributes have no – an entity whose attributes have no more than one value for a single instance of that entitymore than one value for a single instance of that entity– Any attributes that can have multiple values actually describe a

separate entity, possibly an entity and relationship.

• Second normal form (2NF)Second normal form (2NF) – an entity whose nonprimary- – an entity whose nonprimary-key attributes are dependent on the full primary key.key attributes are dependent on the full primary key.– Any nonkey attributes that are dependent on only part of the

primary key should be moved to any entity where that partial key is actually the full key. This may require creating a new entity and relationship on the model.

• Third normal form (3NF)Third normal form (3NF) – an entity whose nonprimary-key – an entity whose nonprimary-key attributes are not dependent on any other non-primary key attributes are not dependent on any other non-primary key attributes. attributes. – Any nonkey attributes that are dependent on other nonkey

attributes must be moved or deleted. Again, new entities and relationships may have to be added to the data model.

Page 36: Dr. Honghui Deng

8.36

Data-to-Location-CRUD MatrixData-to-Location-CRUD Matrix