Lecture2 slides-march-29


Transcript of Lecture2 slides-march-29

Page 1: Lecture2 slides-march-29

BSYS 2060 Lecture 2 – Data Literacy & Data Modeling

Page 2: Lecture2 slides-march-29

Agenda

• Announcements
• Digital Literacy
• Data Modeling
• Relationship Types
• Example

Page 3: Lecture2 slides-march-29

Source: CC by 2.0, http://www.flickr.com/photos/bobjudge/3569973941/

Bring headphones to all labs from here on in!

Page 5: Lecture2 slides-march-29

http://www.lynda.com/

Page 6: Lecture2 slides-march-29
Page 7: Lecture2 slides-march-29

Announcements

• If you didn’t already do the “Week 0” exercise, you should do it ASAP!

• Sign up for Lynda.com ASAP
– $21.67 US for duration of semester
– Videos to watch will be assigned shortly after today’s lecture
– Bring your headphones to all future labs!

• Join the BSYS 2060 group on zenportfolios.ca if you didn’t already
– Upload an avatar if you didn’t already
– We will start using this online group more in the weeks to come, especially for the project

Page 8: Lecture2 slides-march-29

If you didn’t do this already… Go to zenportfolios.ca, log in, click on Groups, search for 2060. Then join the BSYS-2060-2012 group.

Page 9: Lecture2 slides-march-29

Digital Literacy

Page 11: Lecture2 slides-march-29

“Inasmuch as you need to know how to read English, you need to have some understanding of the code that builds the Web,” said Sarah Henry, 39, an investment manager who lives in Wayne, Pa. “It is fundamental to the way the world is organized and the way people think about things these days.”

Page 12: Lecture2 slides-march-29

We live in an information age where data is king.

Page 13: Lecture2 slides-march-29

Data is often the most valuable asset of a company, e.g. Aeroplan is worth more than Air Canada

Page 14: Lecture2 slides-march-29

We live in a data mash-up world.

Page 15: Lecture2 slides-march-29
Page 16: Lecture2 slides-march-29

WordPress, which uses an open source MySQL database back-end

Twitter API / Integration

Page 17: Lecture2 slides-march-29

Take control of your data. The best thing is to buy your own domain name and build your own site and web presence there.

http://jacobjpope.com

Page 18: Lecture2 slides-march-29

Where does data belong?

Page 19: Lecture2 slides-march-29

You control completely

You don’t control the data at all

Self-hosted website containing all your data

Convenient

Not so easy...

Trend is towards you controlling your data

Diaspora

Many “fly by night” Web 2.0 apps

The average middle-aged person has their data spread out over a thousand different database locations

Page 24: Lecture2 slides-march-29

Who owns data?

Page 26: Lecture2 slides-march-29

Building a database for an organization?

Page 27: Lecture2 slides-march-29

What is the organization’s mission?

Page 28: Lecture2 slides-march-29

What is the project’s mission?

Page 29: Lecture2 slides-march-29

What data is required to make good decisions or for other reporting needs?

Page 30: Lecture2 slides-march-29

What data is currently being collected?

Page 31: Lecture2 slides-march-29

How is the data collected being used?

Page 32: Lecture2 slides-march-29

What data should be collected?

Page 33: Lecture2 slides-march-29

What are the main data entities? These will become your tables.

Page 34: Lecture2 slides-march-29

What are the required fields and field data types?
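As a rough illustration of choosing fields and data types, the sketch below declares one table with Python's built-in sqlite3 module. The `student` table and its column names are hypothetical, chosen only for this example, and other database tools offer a different set of types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical entity "student": each field gets a name, a type, and
# (where it matters) a rule such as NOT NULL for required fields.
# SQLite has no dedicated date type, so birth_date is stored as text.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        birth_date TEXT,
        gpa        REAL
    )
""")

# Read the definition back: (field name, declared type, required?)
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(student)")
]
for column in columns:
    print(column)
```

Deciding which fields are required before any data is entered is usually easier than cleaning up missing values afterwards.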

Page 35: Lecture2 slides-march-29

What are the relationships between the tables?
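One way to picture relationships between tables is the sketch below, again using Python's sqlite3 module. All table names, field names, and values are assumptions made up for this example: one instructor teaches many courses (one-to-many), and students and courses are linked through an `enrolment` table (many-to-many).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces relationships when asked

conn.executescript("""
    CREATE TABLE instructor (
        instructor_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL
    );
    -- One-to-many: each course points at exactly one instructor.
    CREATE TABLE course (
        course_code   TEXT PRIMARY KEY,
        instructor_id INTEGER NOT NULL REFERENCES instructor(instructor_id)
    );
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- Many-to-many: one row per (student, course) pair.
    CREATE TABLE enrolment (
        student_id  INTEGER NOT NULL REFERENCES student(student_id),
        course_code TEXT    NOT NULL REFERENCES course(course_code),
        PRIMARY KEY (student_id, course_code)
    );
""")

conn.execute("INSERT INTO instructor VALUES (1, 'Lee')")
conn.execute("INSERT INTO course VALUES ('BSYS 2060', 1)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO enrolment VALUES (?, ?)",
                 [(1, "BSYS 2060"), (2, "BSYS 2060")])

# A course that refers to a non-existent instructor is rejected.
try:
    conn.execute("INSERT INTO course VALUES ('BSYS 9999', 42)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(orphan_allowed)
```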

Page 36: Lecture2 slides-march-29

Select your tools, e.g. MS Access, MS SQL, MySQL, Oracle. What is being used now, if anything?

Page 37: Lecture2 slides-march-29

Be careful of the hammer! To a hammer, everything looks like a nail. If you only know MS Access, you may always see MS Access as the solution even when it’s not. If you only know MS Excel, it seems like the perfect choice!

Source: http://www.flickr.com/photos/fixersphotos/3199566032/

Page 38: Lecture2 slides-march-29

Normalization

A brief introduction to the first three forms...

Page 39: Lecture2 slides-march-29

“Normalization”

• In the field of relational database design, normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.

Codd, E.F. The Relational Model for Database Management: Version 2. Addison-Wesley (1990), p. 271

Page 41: Lecture2 slides-march-29

“...insertion, update, and deletion anomalies...”

Until a Course Code is assigned to this record, it cannot be inserted in the table
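The insertion anomaly can be reproduced in a few lines. This is a minimal sketch with a hypothetical `faculty_courses` table and made-up names; it assumes the Course Code is part of the primary key, which is why a row without one is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single table that mixes faculty facts with course assignments.
conn.execute("""
    CREATE TABLE faculty_courses (
        faculty_id   INTEGER NOT NULL,
        faculty_name TEXT    NOT NULL,
        course_code  TEXT    NOT NULL,
        PRIMARY KEY (faculty_id, course_code)
    )
""")
conn.execute("INSERT INTO faculty_courses VALUES (1, 'Dr. Lee', 'BSYS 2060')")

# A newly hired faculty member has no course yet, so course_code is unknown.
try:
    conn.execute("INSERT INTO faculty_courses VALUES (2, 'Dr. Kim', NULL)")
    inserted = True
except sqlite3.IntegrityError:
    inserted = False  # the key column cannot be empty, so the row is refused

row_count = conn.execute("SELECT COUNT(*) FROM faculty_courses").fetchone()[0]
print(inserted, row_count)
```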

Page 42: Lecture2 slides-march-29

“...insertion, update, and deletion anomalies...”

An edit made to one record may not be made to ALL records for the same employee
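A small sketch of the update anomaly, assuming a hypothetical `employee_skills` table in which the employee's address is repeated on every skill row. The names and addresses are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_skills (employee_id INTEGER, address TEXT, skill TEXT)"
)
conn.executemany(
    "INSERT INTO employee_skills VALUES (?, ?, ?)",
    [
        (519, "12 Elm Street", "Public Speaking"),
        (519, "12 Elm Street", "Typing"),
    ],
)

# The edit reaches only one of the two rows for employee 519.
conn.execute(
    "UPDATE employee_skills SET address = '40 Oak Avenue' "
    "WHERE employee_id = 519 AND skill = 'Typing'"
)

# The same employee now has two conflicting addresses.
addresses = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT address FROM employee_skills "
        "WHERE employee_id = 519 ORDER BY address"
    )
]
print(addresses)
```

If the address lived in one row of a separate employee table, there would be only one place to edit.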

Page 43: Lecture2 slides-march-29

“...insertion, update, and deletion anomalies...”

If the Course Code is deleted the information for the Faculty Member will be lost.
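The deletion anomaly can be sketched the same way. The `faculty_courses` table and its contents below are hypothetical; the point is only that removing the last course row also removes the only record of the faculty member.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE faculty_courses (faculty_id INTEGER, faculty_name TEXT, course_code TEXT)"
)
# Dr. Lee teaches exactly one course, so this is the only row that mentions her.
conn.execute("INSERT INTO faculty_courses VALUES (1, 'Dr. Lee', 'BSYS 2060')")

# The course is dropped from the schedule...
conn.execute("DELETE FROM faculty_courses WHERE course_code = 'BSYS 2060'")

# ...and every fact about faculty member 1 disappears with it.
remaining = conn.execute(
    "SELECT COUNT(*) FROM faculty_courses WHERE faculty_id = 1"
).fetchone()[0]
print(remaining)
```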

Page 44: Lecture2 slides-march-29

Three Normal Forms

• 1NF
– Eliminate repeating groups
– Each field holds a single value

• 2NF
– Eliminate partial dependencies
– All non-key fields depend on the whole Primary Key

• 3NF
– Eliminate dependency on non-key fields
– Non-key fields do not depend on each other

Page 45: Lecture2 slides-march-29

Un-Normalized Table (e.g. Excel)

Page 46: Lecture2 slides-march-29

First Normal Form (1NF)

0NF – “un-normalized”

1NF
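A rough sketch of the move from un-normalized data to 1NF, using plain Python lists. The student names and the second course code are made up for illustration; the repeating group here is the comma-separated list of courses packed into one cell, spreadsheet style.

```python
# Un-normalized: one cell holds a repeating group of courses.
unnormalized = [
    {"student_id": 1, "name": "Ana", "courses": "BSYS 2060, BSYS 1000"},
    {"student_id": 2, "name": "Ben", "courses": "BSYS 2060"},
]

# 1NF: one row per (student, course), so every field holds a single value.
first_nf = [
    {"student_id": row["student_id"], "name": row["name"], "course": course.strip()}
    for row in unnormalized
    for course in row["courses"].split(",")
]

for row in first_nf:
    print(row)
```

Note that the student's name is now repeated on several rows; 1NF removes the repeating group but not that redundancy.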

Page 47: Lecture2 slides-march-29

Second Normal Form (2NF)

1NF

2NF
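A minimal sketch of the step from 1NF to 2NF, assuming a hypothetical table whose key is the pair (student_id, course). The name depends on student_id alone, not on the whole key, so it moves to its own table.

```python
# 1NF table with a two-part key: (student_id, course).
first_nf = [
    {"student_id": 1, "name": "Ana", "course": "BSYS 2060"},
    {"student_id": 1, "name": "Ana", "course": "BSYS 1000"},
    {"student_id": 2, "name": "Ben", "course": "BSYS 2060"},
]

# 2NF: facts about the student are stored once, keyed by student_id...
students = {row["student_id"]: row["name"] for row in first_nf}

# ...and the enrolment table keeps only the key fields.
enrolments = [(row["student_id"], row["course"]) for row in first_nf]

print(students)
print(enrolments)
```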

Page 48: Lecture2 slides-march-29

Third Normal Form (3NF)

2NF

3NF
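A small sketch of the step from 2NF to 3NF, with invented course, instructor, and office values. Here the key is the course code, but the office depends on the instructor, which is a non-key field, so instructor details move to their own table.

```python
# 2NF table keyed by course; "office" really describes the instructor.
second_nf = [
    {"course": "BSYS 2060", "instructor": "Lee", "office": "B201"},
    {"course": "BSYS 1000", "instructor": "Lee", "office": "B201"},
    {"course": "BSYS 3000", "instructor": "Kim", "office": "C105"},
]

# 3NF: the course table keeps only fields that depend on the course itself...
courses = {row["course"]: row["instructor"] for row in second_nf}

# ...and each instructor's office is stored exactly once.
instructors = {row["instructor"]: row["office"] for row in second_nf}

print(courses)
print(instructors)
```

If an instructor changes offices, only one entry in `instructors` needs to be edited.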