Lecture1a Introduction to Databases
Uploaded by mahesh-dasari
![Page 1: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/1.jpg)
INTRODUCTION TO DATABASES
![Page 2: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/2.jpg)
What is a database (DB)?
![Page 3: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/3.jpg)
What is a database (DB)?
• A collection of data
  • Needs to be stored
  • Needs to be retrieved (possibly only in part) at some later moment
  • Might need to be updated/modified at some later moment
![Page 4: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/4.jpg)
What is a database-management system (DBMS)?
• A collection of data (i.e., a database) and
• A set of programs to access these data
  • Read, add, delete, update records or parts of records
  • Often “invisible” to end-users of a database
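The four operations listed above can be sketched with Python's built-in `sqlite3` module standing in for the DBMS. This is only an illustration: the `movie` table, its columns, and the year values are assumptions made for the example, not part of the lecture.

```python
import sqlite3

# An in-memory SQLite database; SQLite plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (title TEXT PRIMARY KEY, year INTEGER)")

# Add records (titles from the lecture, years are illustrative).
conn.executemany(
    "INSERT INTO movie (title, year) VALUES (?, ?)",
    [("Hobbit", 2012), ("Hunger Games", 2012)],
)

# Update part of a record: only the year column of one row changes.
conn.execute("UPDATE movie SET year = ? WHERE title = ?", (2013, "Hobbit"))

# Delete a record.
conn.execute("DELETE FROM movie WHERE title = ?", ("Hunger Games",))

# Read what is left.
rows = conn.execute("SELECT title, year FROM movie").fetchall()
print(rows)
```

The program never touches files or storage layout directly; it only sends statements to the DBMS, which is why the DBMS is often invisible to end-users.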
![Page 5: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/5.jpg)
Examples where databases are used…
• What are some real-life applications of databases?
![Page 6: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/6.jpg)
Examples where databases are used…
• University:
  • List of instructors, their department, classes taught, salary, address
  • List of students, major, classes taken, GPA, address
  • List of classes at the university, room, time
• Airline:
  • List and status of airplanes, pilots, support personnel
  • List and status of flights
• Hospital:
  • List of doctors, their specialization, their availability
  • List of patients and their health history
![Page 7: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/7.jpg)
We interact with a database “daily” (without even realizing it)
• Register for a class
• Search a list of movies playing in a nearby cinema
• Look up current weather conditions
• Check Facebook news feed
• Create LinkedIn profile
• Add a friend on Google+
![Page 8: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/8.jpg)
A practical example: Netflix
• An online “store” of movies and TV shows
• A person pays a monthly subscription to “rent” any number of the movies/TV shows that are available
• Netflix recommends movies/TV shows based on a person’s usage history
TASK:
• Let’s come up with a list of information that needs to be stored
• Ignore the HR department and personnel working for Netflix
![Page 9: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/9.jpg)
Netflix: info to store
• User:
  • Username and password
  • Real name (first name and last name)
  • Credit card info (or bank account info)
  • Whether the account is active
  • History of movies/TV shows watched
  • Recommendations ???
• Movie/TV show:
  • Whether it is a movie or a TV show
  • Title
  • Genre
  • Director(s)
  • Main actor(s)
  • Year produced
  • Oscar recipient/nomination
![Page 10: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/10.jpg)
Different methods to store information
• Pen and paper
• Microsoft Excel sheet
• File-processing system (supported by the operating system)
• Database
![Page 11: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/11.jpg)
Advantages of DB
• Can store a large amount of information in a very organized way.
• Data redundancy and inconsistency can be removed.
• Easy to access data.
• All data is at the “same” place.
• Integrity (i.e., consistency constraints can be imposed).
• Atomicity (i.e., multiple operations are considered as one).
• Concurrent access is not a problem.
• No security problems (i.e., different types of users can be given access to different data).
We will revisit each of these topics.
![Page 12: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/12.jpg)
DB is “a set of tables”
• User table:

| Username | Password | Real name | Credit card | Active account | History | Recommendations |
|---|---|---|---|---|---|---|
| Jsmith3 | Pass1 | John A. Smith | xxxxx | Y | NCIS, 12 Years of Slave | Law and Order |
| Mrayn | Pass2 | Mary Raynolds | xxxxx | Y | Hobbit, Hunger Games | |
| gomezs | Pass3 | Stan Gomez | xxxxx | N | | |
![Page 13: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/13.jpg)
DB is “a set of tables”
• User table:
  • Any issues/concerns/difficulties?
![Page 14: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/14.jpg)
DB is “a set of tables”
• User table:
  • Username should be unique -> not a problem, but we need to have that constraint programmed.
![Page 15: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/15.jpg)
DB is “a set of tables”
• User table:
  • Password should satisfy some predetermined conditions -> not a problem, but this needs to be programmed.
![Page 16: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/16.jpg)
DB is “a set of tables”
• User table:
  • Real name: maybe we want to split it into 2 columns: first name and last name.
![Page 17: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/17.jpg)
DB is “a set of tables”
• User table:
  • Credit card: we need lots of info (number, expiration date, security number) -> maybe we need more than one field.
![Page 18: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/18.jpg)
DB is “a set of tables”
• User table:
  • Credit card: what if we allowed either credit card info or a bank account? -> A bank account needs different fields -> new table.
![Page 19: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/19.jpg)
DB is “a set of tables”
• User table:

| Username | Password | Real name | Payment | Active account | History | Recommendations |
|---|---|---|---|---|---|---|
| Jsmith3 | Pass1 | John A. Smith | Credit card | Y | NCIS, 12 Years of Slave | Law and Order |
| Mrayn | Pass2 | Mary Raynolds | Bank account | Y | Hobbit, Hunger Games | |
| gomezs | Pass3 | Stan Gomez | Bank account | N | | |
![Page 20: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/20.jpg)
Payment tables
• Credit card:

| Credit card number | Expiration date | Security number |
|---|---|---|
| 9867432554321223 | 12/2017 | 543 |

• Bank account:

| Routing number | Account number |
|---|---|
| 1231231231 | 9765432134 |
| 4343434343 | 7543009990 |
![Page 21: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/21.jpg)
Payment tables
• Credit card and bank account tables:
  • Any issues?
![Page 22: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/22.jpg)
Payment tables
• Credit card:

| Username | Credit card number | Expiration date | Security number |
|---|---|---|---|
| Jsmith3 | 9867432554321223 | 12/2017 | 543 |

• Bank account:

| Username | Routing number | Account number |
|---|---|---|
| Mrayn | 1231231231 | 9765432134 |
| gomezs | 4343434343 | 7543009990 |
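Adding the username to each payment table is what lets a payment row be matched back to its user. A minimal sketch of that link, using Python's `sqlite3` as a stand-in DBMS; the table and column names (`user`, `credit_card`, `bank_account`, `payment`) and the helper `bank_account_for` are assumptions chosen for this example, and a real system would not store card data in plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE user (
    username TEXT PRIMARY KEY,
    payment  TEXT NOT NULL              -- 'Credit card' or 'Bank account'
);
CREATE TABLE credit_card (
    username        TEXT PRIMARY KEY REFERENCES user(username),
    card_number     TEXT NOT NULL,
    expiration_date TEXT NOT NULL,
    security_number TEXT NOT NULL
);
CREATE TABLE bank_account (
    username       TEXT PRIMARY KEY REFERENCES user(username),
    routing_number TEXT NOT NULL,
    account_number TEXT NOT NULL
);
INSERT INTO user VALUES
    ('Jsmith3', 'Credit card'), ('Mrayn', 'Bank account'), ('gomezs', 'Bank account');
INSERT INTO credit_card VALUES ('Jsmith3', '9867432554321223', '12/2017', '543');
INSERT INTO bank_account VALUES
    ('Mrayn', '1231231231', '9765432134'), ('gomezs', '4343434343', '7543009990');
""")


def bank_account_for(username):
    """Return (routing_number, account_number) for a user, or None if there is none."""
    return conn.execute(
        "SELECT b.routing_number, b.account_number "
        "FROM user AS u JOIN bank_account AS b ON b.username = u.username "
        "WHERE u.username = ?",
        (username,),
    ).fetchone()


print(bank_account_for("Mrayn"))
print(bank_account_for("Jsmith3"))  # pays by credit card, so no bank account row
```

The username works as the link because it is unique per user; a first and last name would not be, since two customers can share a name.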
![Page 23: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/23.jpg)
Payment tables
• Credit card and bank account tables:
  • Could we have used first and last name instead of username? Address?
![Page 24: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/24.jpg)
DB is “a set of tables”
• User table:
  • History: multiple titles in the same field are hard to tell apart and hard to search -> new table.
![Page 25: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/25.jpg)
History table

| Username | Show |
|---|---|
| Jsmith3 | 12 Years of Slave |
| Jsmith3 | NCIS |
| Mrayn | Hobbit |
| Mrayn | Hunger Games |
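With one row per (user, show) pair, both directions of lookup become simple queries. The sketch below uses Python's `sqlite3`; the column name `show_title` and the two helper functions are hypothetical names picked for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE history (
    username   TEXT NOT NULL,
    show_title TEXT NOT NULL,
    PRIMARY KEY (username, show_title)   -- each (user, show) pair is stored once
)
""")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [
        ("Jsmith3", "12 Years of Slave"),
        ("Jsmith3", "NCIS"),
        ("Mrayn", "Hobbit"),
        ("Mrayn", "Hunger Games"),
    ],
)


def shows_watched_by(username):
    """All shows in one user's history, in alphabetical order."""
    rows = conn.execute(
        "SELECT show_title FROM history WHERE username = ? ORDER BY show_title",
        (username,),
    )
    return [row[0] for row in rows]


def users_who_watched(show_title):
    """All users whose history contains the given show."""
    rows = conn.execute(
        "SELECT username FROM history WHERE show_title = ? ORDER BY username",
        (show_title,),
    )
    return [row[0] for row in rows]


print(shows_watched_by("Jsmith3"))
print(users_who_watched("Hobbit"))
```

Neither query has to split a comma-separated string, which is the difficulty the single History field caused.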
![Page 26: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/26.jpg)
DB is “a set of tables”
• User table:
  • Recommendations: multiple titles in the same field are hard to tell apart and hard to search -> new table.
![Page 28: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/28.jpg)
Advantages of DB
• Can store a large amount of information in a very organized way
  • There is effectively no limit on the number of tables in a DB.
  • There is effectively no limit on the number of columns and rows (actual limits depend on the DBMS and on storage).
  -> We can store a large amount of data.
  -> We can organize data any way we want.
![Page 29: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/29.jpg)
Advantages of DB
• Data redundancy and inconsistency can be removed.
  • Data redundancy = storing the same information in multiple places.
    • E.g., we have a table for movies and a table for TV shows, and for each of them we keep a separate list of customers (username, password, first name, last name, etc.) -> this is unnecessary redundancy -> bad DB design.
  • Inconsistency = pieces of data do not agree.
    • What if a person changes her name? The update has to be made in both lists of customers, but the customer might remember to do it in only one place. This creates inconsistency.
![Page 30: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/30.jpg)
Advantages of DB
• Easy to access data.
  • It is easy to search data by any value in any column in any table (if the DB is designed well).
    • E.g., if a person forgot their username and password, it is easy to search by the first name and last name columns (at the same time), and then ask for additional info (address, email…) to verify that the correct person is found, or to filter out wrong records if multiple records have the same first and last names.
  • We can search for all customers who saw a particular movie (this is another reason why we might want to have history as a separate table, with each movie seen by a person on a separate row).
  • We can search for all customers who have had a membership for more than one year. Or can we?
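The "Or can we?" question has a concrete answer: the membership search only works if the table records when each user enrolled. A sketch with Python's `sqlite3`, assuming a `date_enrolled` column; the enrollment dates and the helper names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE user (
    username      TEXT PRIMARY KEY,
    first_name    TEXT NOT NULL,
    last_name     TEXT NOT NULL,
    date_enrolled TEXT NOT NULL   -- ISO dates (YYYY-MM-DD) compare correctly as text
)
""")
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?, ?)",
    [
        ("Jsmith3", "John", "Smith", "2013-05-10"),    # dates are made up
        ("Mrayn", "Mary", "Raynolds", "2014-09-01"),
        ("gomezs", "Stan", "Gomez", "2014-11-20"),
    ],
)


def find_by_name(first_name, last_name):
    """Search on two columns at once, as in the forgotten-username case."""
    rows = conn.execute(
        "SELECT username FROM user WHERE first_name = ? AND last_name = ?",
        (first_name, last_name),
    )
    return [row[0] for row in rows]


def members_for_over_a_year(today):
    """Users who enrolled more than one year before the given ISO date."""
    rows = conn.execute(
        "SELECT username FROM user "
        "WHERE date_enrolled < date(?, '-1 year') ORDER BY username",
        (today,),
    )
    return [row[0] for row in rows]


print(find_by_name("Mary", "Raynolds"))
print(members_for_over_a_year("2015-01-01"))
```

Without the `date_enrolled` column there is nothing to compare against, so the query could not be written at all.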
![Page 31: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/31.jpg)
Advantages of DB
• All data is at the “same” place.
  • There are no different files or folders to search.
  • It appears as if everything is stored together.
![Page 32: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/32.jpg)
Advantages of DB
• Integrity (i.e., consistency constraints can be imposed).
  • E.g., username should satisfy certain constraints (at least 6 characters long, must be unique…).
  • It is easy to impose these constraints in a DB.
    • It is programmed once even though username might be used in multiple tables.
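Both username constraints mentioned above can be declared once in the table definition, after which the DBMS rejects any row that breaks them. A minimal sketch using Python's `sqlite3`; the helper `try_insert` is a hypothetical name, and other DBMSs may spell the length function differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE user (
    username TEXT PRIMARY KEY               -- PRIMARY KEY makes usernames unique
             CHECK (length(username) >= 6)  -- at least 6 characters long
)
""")


def try_insert(username):
    """Insert a username; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO user (username) VALUES (?)", (username,))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("Jsmith3"))  # accepted
print(try_insert("Jsmith3"))  # rejected: duplicate username
print(try_insert("Mrayn"))    # rejected: only 5 characters
```

Application code does not need to repeat the checks; every program that writes to the table goes through the same constraints.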
![Page 33: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/33.jpg)
Advantages of DB
• Atomicity = multiple operations are considered as one
  • When a credit card is charged, the customer must be given an “active” status. What if something happens (e.g., a system failure) between charging the credit card and setting the customer’s status to “active”?
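The usual answer to that question is a transaction: the charge and the status change are committed together or not at all. The sketch below simulates the failure with an exception, using Python's `sqlite3`; the `charge` table, the amount, and the function names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (username TEXT PRIMARY KEY, account_active TEXT NOT NULL);
CREATE TABLE charge (username TEXT NOT NULL, amount_cents INTEGER NOT NULL);
INSERT INTO user VALUES ('gomezs', 'N');
""")


def charge_and_activate(username, amount_cents, fail_midway=False):
    """Record a charge and activate the account as one atomic unit."""
    try:
        with conn:  # one transaction: commit on success, roll back on exception
            conn.execute("INSERT INTO charge VALUES (?, ?)", (username, amount_cents))
            if fail_midway:
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE user SET account_active = 'Y' WHERE username = ?",
                (username,),
            )
        return True
    except RuntimeError:
        return False


def state(username):
    """Return (number of recorded charges, active flag) for a user."""
    charges = conn.execute(
        "SELECT COUNT(*) FROM charge WHERE username = ?", (username,)
    ).fetchone()[0]
    active = conn.execute(
        "SELECT account_active FROM user WHERE username = ?", (username,)
    ).fetchone()[0]
    return charges, active


charge_and_activate("gomezs", 799, fail_midway=True)
print(state("gomezs"))  # the half-finished charge was rolled back
```

After the simulated failure the charge row is gone and the account is still inactive, so the database never shows a customer who paid but was not activated.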
![Page 34: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/34.jpg)
Advantages of DB
• Concurrent access is not a problem.
  • Multiple users may access the DB at the same time.
![Page 35: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/35.jpg)
Advantages of DB
• No security problems (i.e., different types of users can be given access to different data).
  • E.g., users are not allowed to update any record in the movies table, but they can search this table.
  • E.g., Netflix workers, on the other hand, can update the movies table.
![Page 36: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/36.jpg)
DB terminology
• DB schema = DB design
• DB instance = the collection of information stored in the DB at a given moment
• Data-definition language (DDL) = language used to specify the DB schema
• Data-manipulation language (DML) = language used to manipulate (add, delete, update) information in the DB
  • It changes the DB instance
![Page 37: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/37.jpg)
DB schema
• How many tables?
  • Each table has a name.
  • It represents an “object”.
• How many columns in each table?
  • Each column is an attribute of the object it represents.
  • Each column saves data of a particular type: char(50), bigint(30), date
• How do we differentiate each record in a table?
  • By a unique key. The DB schema defines a key for each table.
  • There might be foreign keys present in some tables.
![Page 38: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/38.jpg)
Data-definition language
• E.g.,
    CREATE TABLE user (
        username              VARCHAR(20) NOT NULL,
        first_name            VARCHAR(15) NOT NULL,
        last_name             VARCHAR(15) NOT NULL,
        account_active        CHAR(1),
        date_enrolled         DATE,
        number_movies_watched BIGINT(30),
        PRIMARY KEY (username)
    );
• SQL (Structured Query Language) serves as the DDL
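To see what the DDL statement actually produces, it can be run against SQLite from Python and the resulting schema read back. This is a sketch under the assumption that SQLite is an acceptable stand-in: it accepts these type names as written, while other DBMSs may treat `BIGINT(30)` differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE user (
    username              VARCHAR(20) NOT NULL,
    first_name            VARCHAR(15) NOT NULL,
    last_name             VARCHAR(15) NOT NULL,
    account_active        CHAR(1),
    date_enrolled         DATE,
    number_movies_watched BIGINT(30),
    PRIMARY KEY (username)
)
""")

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, primary-key position).
columns = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(user)")
]
for column in columns:
    print(column)

# The schema exists, but the instance is still empty.
row_count = conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]
print(row_count)
```

The DDL defines only the schema; no data exists until DML statements add rows.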
![Page 39: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/39.jpg)
Data-definition language
• Defines/ensures:
  • Domain constraints
    • Must exist for each attribute
    • E.g., bigint, varchar
  • Referential integrity
    • E.g., each movie that shows up in a user’s history must exist in the movies table.
  • Assertions
    • Any domain constraint is an assertion
    • Any referential integrity is an assertion
    • Other assertions exist
      • E.g., password must satisfy some criteria (at least 6 characters long, contain at least one letter and one digit…)
  • Authorization
    • Read authorization in the movies table for users
    • Update authorization in the movies table for administrators
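Referential integrity from the list above can be sketched as a foreign key from a history table to a movies table. The example uses Python's `sqlite3`, where foreign-key checking must be switched on per connection; the table names and the helper `record_watch` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE movie (title TEXT PRIMARY KEY);
CREATE TABLE history (
    username TEXT NOT NULL,
    title    TEXT NOT NULL REFERENCES movie(title),  -- must exist in movie
    PRIMARY KEY (username, title)
);
INSERT INTO movie VALUES ('NCIS'), ('Hobbit');
""")


def record_watch(username, title):
    """Add a history row; return False if the title is not in the movie table."""
    try:
        with conn:
            conn.execute("INSERT INTO history VALUES (?, ?)", (username, title))
        return True
    except sqlite3.IntegrityError:
        return False


print(record_watch("Jsmith3", "NCIS"))          # title exists
print(record_watch("Jsmith3", "Unknown Show"))  # rejected by the foreign key
```

The rule is stated once in the schema, so no application can record a watched title that the movies table does not know about.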
![Page 40: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/40.jpg)
Data-manipulation language
• SQL
• Used to
  • Retrieve data from DB
  • Insert new info in DB
  • Delete info from DB
  • Modify a piece of info in DB
• E.g.,
    SELECT user.first_name, user.last_name
    FROM user
    WHERE user.account_active = 'Y';
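The query above can be tried against a small instance to see how DML changes what it returns. A sketch with Python's `sqlite3`; the rows come from the lecture's user table, the column is assumed to be named `account_active` as in the DDL example, and the `ORDER BY` clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE user (
    username       TEXT PRIMARY KEY,
    first_name     TEXT NOT NULL,
    last_name      TEXT NOT NULL,
    account_active TEXT
)
""")
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?, ?)",
    [
        ("Jsmith3", "John", "Smith", "Y"),
        ("Mrayn", "Mary", "Raynolds", "Y"),
        ("gomezs", "Stan", "Gomez", "N"),
    ],
)

ACTIVE_USERS = (
    "SELECT user.first_name, user.last_name "
    "FROM user WHERE user.account_active = 'Y' "
    "ORDER BY user.username"
)

active_before = conn.execute(ACTIVE_USERS).fetchall()

# A DML update changes the instance; the schema stays the same.
conn.execute("UPDATE user SET account_active = 'Y' WHERE username = 'gomezs'")

active_after = conn.execute(ACTIVE_USERS).fetchall()
print(active_before)
print(active_after)
```

The same SELECT statement gives a different answer after the UPDATE, because it reads whatever instance the DB holds at that moment.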
![Page 41: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/41.jpg)
DB design
• A DB has lots of advantages, but it has to be designed well to make use of them.
• Several steps in DB design:
  • Understand requirements
    • What information will be saved?
      • Users (username, password…)
    • How will the information be used?
      • Customers: search movies by title, actor, genre,…
    • What additional “functionalities” might be required later?
      • Allow payments by bank account
  • Conceptual design
    • Describe data and relationships among pieces of information.
    • Types of data (char, int, date…)
    • Can one user have multiple movies listed in history? Should history be a table on its own?
  • Physical design
    • Translate conceptual design into DB
![Page 42: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/42.jpg)
Understanding requirements
• Lots of conversations between DB developers (i.e., tech people) and business people of the company or end-users.
• Usually several iterations happen between all parties involved to clarify details. Some of them happen after conceptual design has already started.
![Page 43: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/43.jpg)
Conceptual design
• The most important and hardest part of DB design.
• Two common methods:
  • Entity-relationship (E-R) model
  • Normalization
![Page 44: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/44.jpg)
E-R model
• Graphical representation of DB model.
  • Unified Modeling Language (UML) is most commonly used for this representation.
• Entity = an “object”.
  • Represented by a rectangle.
  • It has an entity name (on top) and a set of attributes.
• Relationship = relationship between two entities.
  • Represented by a diamond.
  • Relationship name is written inside the diamond.
  • It connects two entities.
  • It might contain cardinalities.
![Page 45: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/45.jpg)
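E-R model
[E-R diagram]
• Entity “user”: Username, Password, First name, Last name, Address, Email, Active account
• Entity “Credit card”: Number, Expiration date, Security code
• Entity “History”: Movie title, Date watched
• Relationship “Belongs to” (cardinalities 1 and N)
• Relationship “Used by” (cardinalities N and M)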
![Page 46: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/46.jpg)
Normalization
• A set of algorithms to make sure that data are represented in “normal form”.
• Normal form:
  • No redundancy
    • E.g., we could keep all TV show information in history rather than having a separate table for TV shows, but we would repeat all this info for every person who watches NCIS. What if the director changes at some moment or a new actor is added?

| username | title | actors | director | genre |
|---|---|---|---|---|
| Jsmith3 | NCIS | MH, PP, SS | Bellisario | drama |
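A normalized alternative keeps the show's details in their own table, so a change of director is a single update no matter how many people watched the show. The sketch below uses Python's `sqlite3`; the `tv_show` table, the second viewer of NCIS, and the replacement director name are hypothetical values for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tv_show (
    title    TEXT PRIMARY KEY,
    director TEXT,
    genre    TEXT
);
CREATE TABLE history (
    username TEXT NOT NULL,
    title    TEXT NOT NULL REFERENCES tv_show(title),
    PRIMARY KEY (username, title)
);
INSERT INTO tv_show VALUES ('NCIS', 'Bellisario', 'drama');
INSERT INTO history VALUES ('Jsmith3', 'NCIS'), ('Mrayn', 'NCIS');
""")

# The director is stored once, so one UPDATE is enough.
conn.execute(
    "UPDATE tv_show SET director = ? WHERE title = ?",
    ("New Director", "NCIS"),  # placeholder name
)

# Every viewer's history now reflects the change through the join.
rows = conn.execute(
    "SELECT h.username, s.director "
    "FROM history AS h JOIN tv_show AS s ON s.title = h.title "
    "ORDER BY h.username"
).fetchall()
print(rows)
```

In the single-table design the same change would have to be repeated on every history row for NCIS, and missing one row would leave the data inconsistent.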
![Page 47: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/47.jpg)
Normalization
• Normal form (cont’d):
  • No inability to represent information
    • In the previous example (if we didn’t have a separate table for TV shows), what would happen if there is a TV show that no one has watched yet (e.g., a new TV show)? Its info could not be stored anywhere.
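With a separate table for TV shows, a show can be stored before anyone has watched it. A short sketch using Python's `sqlite3`; the `tv_show` table and the title 'Brand New Show' are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tv_show (title TEXT PRIMARY KEY, genre TEXT);
CREATE TABLE history (
    username TEXT NOT NULL,
    title    TEXT NOT NULL REFERENCES tv_show(title),
    PRIMARY KEY (username, title)
);
INSERT INTO tv_show VALUES ('NCIS', 'drama'), ('Brand New Show', 'comedy');
INSERT INTO history VALUES ('Jsmith3', 'NCIS');
""")

# LEFT JOIN keeps shows that have no history rows; their viewer count is 0.
watch_counts = conn.execute(
    "SELECT s.title, COUNT(h.username) "
    "FROM tv_show AS s LEFT JOIN history AS h ON h.title = s.title "
    "GROUP BY s.title ORDER BY s.title"
).fetchall()
print(watch_counts)
```

The unwatched show still has a row of its own, which the history-only design had no place for.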
![Page 48: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/48.jpg)
Physical design
• Pretty simple for a DB expert.
• Translation from detailed human language to SQL or another DB language.
![Page 49: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/49.jpg)
Who interacts with a DB?
• DB administrators
  • They create/update/manage the DB
• Sophisticated (tech) users
  • They “analyze” data
  • They use an SQL-type language to interact directly with the DB
• Naïve (non-tech) end-users (e.g., Netflix customers)
  • They search the DB
  • They use an application programming interface to interact with the DB
    • Often via web
• Application programmers
  • They create/program the application programming interface for naïve end-users
![Page 50: Lecture1a Introduction to Databases](https://reader036.fdocuments.net/reader036/viewer/2022062503/577cce451a28ab9e788db930/html5/thumbnails/50.jpg)
Types of DBs
• Relational DB
• Object-based data models
  • Developed to better suit object-oriented programming languages.
  • Extension of the E-R model to allow for structured and collection types, encapsulation, inheritance…
• Semistructured data models
  • Allow specification of data where a data item might have a different set of attributes for different records.
  • XML language