database glossary.doc.doc.doc.doc
document.doc 8/4/2023
http://databases.about.com/library/glossary/blglossary.htm?PM=ss13_databases
<A> Database Management Systems / Software:
Microsoft Access
Definition: Microsoft Access is an entry-level database that offers a flexible environment for database
developers and users. It makes use of the familiar Microsoft Office interface and allows for
integration with larger-scale enterprise databases such as Microsoft's SQL Server and Oracle.
Cold Fusion
Definition: Cold Fusion, a product of Allaire Corporation, is a suite of development tools
designed to facilitate web integration of databases. It features the Cold Fusion Markup
Language (CFML) which expands upon the features provided by the Hypertext Markup Language
(HTML) and the Extensible Markup Language (XML). CFML allows developers to create
web-integrated databases without the complexity inherent in full-scale programming languages
such as Java and C++.
IBM DB2
Definition: DB2 is a relational database system developed by IBM Corporation, originally for
use on large mainframe computer systems. It has since been ported to a variety of platforms
including SunOS, Solaris, Linux, Windows 95/98/NT/2000 and HP-UX.
dBase
Definition: dBase is a relational database management system first marketed by the Ashton-Tate
Corporation in the early 1980s. The data formatting conventions utilized by dBase quickly
became industry standards still in use today. The dBase Corporation provides support for
legacy and future applications.
Delphi
Definition: Delphi, a product of the Borland Corporation, is a rapid application development
platform built around a visual design approach. Delphi offers
specialized features for database connectivity and application development.
FoxPro
Definition: Microsoft Visual FoxPro is a development environment catering to the needs of
database developers. Supported platforms include FoxPro, SQL Server and Oracle.
INGRES
Definition: INGRES is a relational database system produced by Computer Associates. It runs
under a wide variety of operating systems and supports the industry-standard Structured
Query Language.
MySQL
Definition: MySQL is a relational database management system that implements many
industry standards including SQL and ODBC along with C and Perl APIs. MySQL is made
available under the GNU General Public License (GPL) free of charge and under commercial
license for commercial use.
Oracle
Definition: Oracle is a powerful relational database management system that offers a large
feature set. Along with Microsoft SQL Server, Oracle is widely regarded as one of the two most
popular full-featured database systems on the market today.
Online Analytical Processing (OLAP)
Definition: Online Analytical Processing software allows for the real-time analysis of data
stored in a database. The OLAP server is normally a separate component that contains
specialized algorithms and indexing tools to efficiently process data mining tasks with minimal
impact on database performance.
Online Transaction Processing (OLTP)
Paradox
Definition: Paradox is a relational database management system produced by the Corel
Corporation.
Postgres
Definition: Postgres is an object-oriented relational database management system
(sometimes referred to as an object-relational database). It began as a research project at the
University of California, Berkeley and is available in several free and commercial versions.
Microsoft SQL Server
Definition: Microsoft SQL Server is a powerful relational database management system
catering to high-end users with advanced needs. Along with Oracle, Microsoft SQL Server is
widely regarded as one of the two main full-featured database systems on the market today.
<B> Database Terminology:
Raw Data
Definition: Data consists of a series of facts or statements that may have been collected,
stored, processed and/or manipulated but have not been organized or placed into context.
When data is organized, it becomes information. Information can be processed and used to
draw generalized conclusions or knowledge.
Example: A file listing all of the orders placed through an online service is an example of
data. If we sort the data by ZIP code and summarize the number of orders that come from
each city, we have created information. We can create knowledge by taking this information
and making statements such as "Most orders for Widget X come from the northeastern United
States."
Information
Definition: Information is the processed data ordered in a meaningful way.
Knowledge
Definition: Knowledge consists of generalized conceptual statements that have been
developed through the analysis of information.
Example: A file listing all of the orders placed through an online service is an example of
data. If we sort the data by ZIP code and summarize the number of orders that come from
each city, we have created information. We can create knowledge by taking this information
and making statements such as "Most orders for Widget X come from the northeastern United
States."
Database
Definition: A database is a collection of information organized into interrelated tables of data
and specifications of data objects.
Relation / Table
Definition: A database relation is a predefined row/column format (that defines an entity) for
storing information in a relational database. Relations are equivalent to tables.
Record / Row
Definition: In a relational database, a row consists of one set of attributes (or one tuple)
corresponding to one instance of the entity that a table schema describes.
Attribute / Field / Column
Definition: Database tables are composed of individual columns corresponding to the
attributes of the object. An attribute is a single data item related to a database object. The
database schema associates one or more attributes with each database entity.
Example: In the following database table, the attributes are <Name, ID, Extension>
Name     ID    Extension
Jim      124   7075
Valeri   128   0853
Bob      192   4214
Domain
Definition: The domain of a database attribute is the set of all allowable values that attribute
may assume.
Examples: A field for gender may have the domain {male, female, unknown} where those
three values are the only permitted entries in that column.
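One common way to enforce a domain is a CHECK constraint. The following is a minimal sketch using Python's built-in sqlite3 module and an in-memory database; the table name people and its columns are hypothetical names chosen for illustration, and other database systems may offer different mechanisms.

```python
import sqlite3

# In-memory database; "people" and its columns are illustrative names only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE people (
        name   TEXT NOT NULL,
        gender TEXT NOT NULL
               CHECK (gender IN ('male', 'female', 'unknown'))
    )
    """
)

# A value inside the domain is accepted.
conn.execute("INSERT INTO people VALUES ('Amy', 'female')")

# A value outside the domain is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO people VALUES ('Rob', 'n/a')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(rejected, row_count)
```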
Tuple
Definition: Tuple is a term from set theory which refers to a collection of one or more
attributes.
Cardinality
Definition: In set theory, cardinality refers to the number of members in the set. When
specifically applied to database theory, the cardinality of a table refers to the number of rows
(or tuples) contained in a table.
Examples: The table below has cardinality 5:
Name        Age   SSN           Phone Extension
Rob         28    123-45-6789   1242
Amy         34    987-65-4321   9281
Elizabeth   34    111-22-3333   9312
Jim         42    333-22-1111   3214
Mike        29    999-99-9999   2314
Key
Definition: A database key is an attribute utilized to sort and/or identify data in some
manner. Each table has a primary key which uniquely identifies records. Foreign keys are
utilized to cross-reference data between relational tables.
Primary Key
Definition: The primary key of a relational table uniquely identifies each record in the table. It
can either be a normal attribute that is guaranteed to be unique (such as Social Security
Number in a table with no more than one record per person) or it can be generated by the
DBMS (such as a globally unique identifier, or GUID, in Microsoft SQL Server).
Candidate Key
Definition: A candidate key is a minimal combination of attributes that uniquely identifies
a database record: no attribute can be removed from it without losing uniqueness. Each table
may have one or more candidate keys. One of these candidate keys is selected as the table
primary key.
Examples: In the sample table below, <SSN> and <Phone Extension> are candidate keys.
Larger combinations such as <Name, SSN> and <Name, Age, SSN> also identify each
record uniquely, but they are superkeys rather than candidate keys because they contain
attributes that are not needed for uniqueness. Note that <Age> is not a candidate key in this
case because Amy and Elizabeth share the same age.
Name        Age   Social Security No (SSN)   Phone Extension   Department Code
Rob         28    123-45-6789                1242              001
Amy         34    987-65-4321                9281              002
Elizabeth   34    111-22-3333                9312              002
Jim         42    333-22-1111                3214              005
Mike        29    999-99-9999                2314              004
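The uniqueness test behind the example can be sketched in a few lines of Python. The helper name identifies_uniquely is hypothetical. Note that passing this test on five sample rows only shows that an attribute set distinguishes these particular records; whether it is a key is a property of the schema, not of one snapshot of the data.

```python
# Sample rows from the table above.
COLUMNS = ("Name", "Age", "SSN", "Phone Extension", "Department Code")
ROWS = [
    ("Rob", 28, "123-45-6789", "1242", "001"),
    ("Amy", 34, "987-65-4321", "9281", "002"),
    ("Elizabeth", 34, "111-22-3333", "9312", "002"),
    ("Jim", 42, "333-22-1111", "3214", "005"),
    ("Mike", 29, "999-99-9999", "2314", "004"),
]


def identifies_uniquely(attributes):
    """Return True if no two sample rows share the same values for `attributes`."""
    indexes = [COLUMNS.index(name) for name in attributes]
    projected = [tuple(row[i] for i in indexes) for row in ROWS]
    return len(set(projected)) == len(projected)


for candidate in (["SSN"], ["Phone Extension"], ["Age"], ["Name", "Age", "SSN"]):
    print(candidate, identifies_uniquely(candidate))
```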
Foreign Key
Definition: A foreign key is a field in a relational table that matches the primary key column
(e.g. department code) of another table (Department). The foreign key can be used to cross-
reference tables.
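As a rough illustration, the sketch below declares a foreign key with Python's built-in sqlite3 module. The table and column names, and the department name 'Accounting', are assumptions made up for the example. SQLite enforces foreign keys only when the foreign_keys pragma is switched on; other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE department (department_code TEXT PRIMARY KEY, name TEXT)"
)
conn.execute(
    """
    CREATE TABLE employee (
        name            TEXT,
        department_code TEXT REFERENCES department (department_code)
    )
    """
)

conn.execute("INSERT INTO department VALUES ('001', 'Accounting')")
conn.execute("INSERT INTO employee VALUES ('Rob', '001')")  # matches a department

# A department code with no matching primary key is rejected.
try:
    conn.execute("INSERT INTO employee VALUES ('Amy', '999')")
    violation_detected = False
except sqlite3.IntegrityError:
    violation_detected = True

# The foreign key is what lets the two tables be cross-referenced.
joined = conn.execute(
    """
    SELECT employee.name, department.name
    FROM employee
    JOIN department ON employee.department_code = department.department_code
    """
).fetchall()
print(violation_detected, joined)
```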
Index
Definition: An index is a database feature used for locating data quickly within a table.
Indexes are defined by selecting a set of commonly searched attribute(s) on a table and using
the appropriate platform-specific mechanism to create an index.
Example: Personnel information may be stored in a Human Resources department's employee
table. Clerks find that they often search the table for employees by last name but get slow
query responses. Defining an index on the table consisting of the last name attribute would
speed up these queries.
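The effect of such an index can be observed with SQLite's EXPLAIN QUERY PLAN, shown here through Python's built-in sqlite3 module. This is only a sketch: the table, column and index names are hypothetical, the three rows are made-up sample data, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO employee (last_name, first_name) VALUES (?, ?)",
    [("Smith", "Ann"), ("Jones", "Bo"), ("Reynolds", "Cy")],
)

query = "SELECT * FROM employee WHERE last_name = ?"


def plan(sql, params):
    """Return the human-readable query plan; the detail text is the last column."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query, ("Jones",))  # full table scan
conn.execute("CREATE INDEX idx_employee_last_name ON employee (last_name)")
plan_after = plan(query, ("Jones",))   # search that uses the new index

print(plan_before)
print(plan_after)
```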
Data Mining
Definition: Data mining is the use of automated data analysis techniques to uncover
previously undetected relationships among data items. Data mining often involves the analysis
of data stored in a data warehouse. Three of the major data mining techniques are regression,
classification and clustering.
Data Warehouse
Definition: A data warehouse is a centralized database that captures information from various
parts of an organization's business processes. This information can later be analyzed to
determine predictive relationships through the use of data mining techniques.
Enterprise
Definition: An enterprise is an organization that utilizes computers and applications. In
general use, the term refers to businesses and organizations that operate on a large scale.
Applications that are designed for these organizations are often referred to as enterprise
applications.
Example: A multinational company that has interconnected computer users located around
the world could be considered an enterprise. The network operating system that they utilize
can be referred to as an enterprise operating system. The database that stores their global
sales information is both an enterprise application and an enterprise database.
Entity
Definition: An entity is a single object (e.g. student, course, department, project) about which
data can be stored. It is the "subject" of a table. Entities and their interrelationships are
modeled through the use of entity-relationship diagrams.
Entity-Relationship Diagram
Definition: An entity-relationship diagram is a specialized graphic that illustrates the
interrelationships (e.g. 1 to 1, 1 to N, N to N) between entities in a database.
Also Known As: ER Diagram, E-R Diagram, entity-relationship model
Flat File
Definition: Flat files are data files that contain records with no structured relationships.
Additional knowledge is required to interpret these files such as the file format properties.
Modern database management systems use a more structured approach to file management
(such as one defined by the Structured Query Language) and therefore have more complex
storage arrangements.
Example: Many database management systems offer the option to export data to a
comma-delimited file. This type of file contains no inherent information about the data and
interpretation requires additional knowledge. For this reason, this type of file can be referred
to as a flat file.
Normalization
Definition: Normalization is the process of structuring a relational database schema so that
redundancy and ambiguity are minimized. The stages of normalization are referred to as normal forms and
progress from the least restrictive (First Normal Form) through the most restrictive (Fifth
Normal Form). Generally, most database designers do not attempt to implement anything
higher than Third Normal Form or Boyce-Codd Normal Form.
Boyce-Codd Normal Form (BCNF)
Definition: A relation is in Boyce-Codd Normal Form (BCNF) if every determinant is a
candidate key.
First Normal Form (1NF)
Definition: A relation is said to be in First Normal Form (1NF) if and only if each attribute of
the relation is atomic. More simply, to be in 1NF, each column must contain only a single value
and each row must contain the same columns.
Example: The following table is NOT in First Normal Form:
Manager   Employees
Jim       Susan, Rob, Beth
Mary      Alice, John, Asim
Renee     Mike
Joe       Alan, Tim
Here is an alternative option that IS in 1NF.
Manager   Employee
Jim       Susan
Jim       Rob
Jim       Beth
Mary      Alice
Mary      John
Mary      Asim
Renee     Mike
Joe       Alan
Joe       Tim
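The conversion from the first table to the second can be sketched in Python. The function name to_first_normal_form is hypothetical, and the sketch assumes the employee list is stored as a comma-separated string, as the first table suggests.

```python
# Rows of the table that is NOT in 1NF: a manager and a comma-separated employee list.
unnormalized = [
    ("Jim", "Susan, Rob, Beth"),
    ("Mary", "Alice, John, Asim"),
    ("Renee", "Mike"),
    ("Joe", "Alan, Tim"),
]


def to_first_normal_form(rows):
    """Emit one (manager, employee) row per employee so every value is atomic."""
    return [
        (manager, employee.strip())
        for manager, employees in rows
        for employee in employees.split(",")
    ]


normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```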
Second Normal Form (2NF)
Definition: In order to be in Second Normal Form, a relation must first fulfill the requirements
to be in First Normal Form. Additionally, each nonkey attribute in the relation must be
functionally dependent upon the primary key.
Example: The following relation is in First Normal Form, but not Second Normal
Form:
Order #   Customer          Contact Person    Total
1         Acme Widgets      John Doe          $134.23
2         ABC Corporation   Fred Flintstone   $521.24
3         Acme Widgets      John Doe          $1042.42
4         Acme Widgets      John Doe          $928.53
In the table above, the order number serves as the primary key. Notice that
the customer and total amount are dependent upon the order number -- this
data is specific to each order. However, the contact person is dependent upon
the customer. An alternative way to accomplish this would be to create two
tables:
Customer          Contact Person
Acme Widgets      John Doe
ABC Corporation   Fred Flintstone

Order #   Customer          Total
1         Acme Widgets      $134.23
2         ABC Corporation   $521.24
3         Acme Widgets      $1042.42
4         Acme Widgets      $928.53
The creation of two separate tables eliminates the dependency problem experienced in the
previous case. In the first table, contact person is dependent upon the primary key -- customer
name. The second table only includes the information unique to each order. Someone
interested in the contact person for each order could obtain this information by performing a
JOIN operation.
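The JOIN mentioned above might look like the following sketch, written with Python's built-in sqlite3 module. The table and column names (customers, orders, order_no and so on) are assumptions chosen for the example, and totals are stored as plain numbers without the currency symbol.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer TEXT PRIMARY KEY, contact_person TEXT)")
conn.execute(
    """
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY,
        customer TEXT REFERENCES customers (customer),
        total    REAL
    )
    """
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Acme Widgets", "John Doe"), ("ABC Corporation", "Fred Flintstone")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "Acme Widgets", 134.23),
        (2, "ABC Corporation", 521.24),
        (3, "Acme Widgets", 1042.42),
        (4, "Acme Widgets", 928.53),
    ],
)

# Each contact person is stored once, yet can still be looked up per order.
contacts_by_order = conn.execute(
    """
    SELECT orders.order_no, customers.contact_person
    FROM orders
    JOIN customers ON orders.customer = customers.customer
    ORDER BY orders.order_no
    """
).fetchall()
print(contacts_by_order)
```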
Third Normal Form (3NF)
Definition: In order to be in Third Normal Form, a relation must first fulfill the requirements to
be in Second Normal Form. Additionally, every nonkey attribute must depend directly on the
primary key; attributes that depend on another nonkey attribute must be moved to a separate relation.
Examples: The following table is NOT in Third Normal Form:
Company           City       State   ZIP
Acme Widgets      New York   NY      10169
ABC Corporation   Miami      FL      33196
XYZ, Inc.         Columbia   MD      21046
In this example, the city and state are dependent upon the ZIP code. To place this table in
3NF, two separate tables would be created -- one containing the company name and ZIP code
and the other containing city, state, ZIP code pairings.
This may seem overly complex for daily applications and indeed it may be. Database
designers should always keep in mind the tradeoffs between higher level normal forms and the
resource issues that complexity creates.
Fourth Normal Form (4NF)
Definition: To be in Fourth Normal Form, a relation must first be in Boyce-Codd Normal Form.
Additionally, a given relation may not contain more than one multivalued attribute.
Examples: The following relation is NOT in Fourth Normal Form:
Manager   Child   Employee
Jim       Beth    Alice
Mary      Bob     Jane
Mary      NULL    Adam
Each manager can have more than one child and each manager can supervise more than one
employee. Therefore, this relation is not in Fourth Normal Form. The creation of two separate
relations for the Manager/Child and Manager/Employee relationships would put this relation in
Fourth Normal Form.
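The split into two relations can be sketched in Python, assuming the three rows of the example relation are (Jim, Beth, Alice), (Mary, Bob, Jane) and (Mary, NULL, Adam). The variable names are hypothetical, and None stands in for NULL.

```python
# Rows of the relation that is NOT in 4NF: (manager, child, employee).
rows = [
    ("Jim", "Beth", "Alice"),
    ("Mary", "Bob", "Jane"),
    ("Mary", None, "Adam"),
]

# Project onto each independent relationship, dropping the placeholder NULLs.
manager_child = sorted({(m, c) for m, c, _ in rows if c is not None})
manager_employee = sorted({(m, e) for m, _, e in rows if e is not None})

print(manager_child)
print(manager_employee)
```

After the split, adding a child or an employee for a manager touches only one relation, and the NULL placeholder is no longer needed.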
Functional Dependency
Definition: A functional dependency occurs when one attribute in a relation uniquely
determines another attribute. This can be written A -> B which would be the same as stating
"B is functionally dependent upon A."
Examples: In a table listing employee characteristics including Social Security Number (SSN)
and name, it can be said that name is functionally dependent upon SSN (or SSN -> name)
because an employee's name can be uniquely determined from their SSN. However, the
reverse statement (name -> SSN) is not true because more than one employee can have the
same name but different SSNs.
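A functional dependency can be tested against a set of rows with a short Python sketch. The function name holds and the sample employees are hypothetical. A passing check only shows that the data at hand does not contradict the dependency; it does not prove that the dependency holds in general.

```python
def holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` is not contradicted by `rows`."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault records the first value seen for this determinant value.
        if seen.setdefault(key, value) != value:
            return False  # one determinant value maps to two different values
    return True


# Hypothetical employees: two people share a name but have different SSNs.
employees = [
    {"ssn": "123-45-6789", "name": "Rob"},
    {"ssn": "987-65-4321", "name": "Amy"},
    {"ssn": "111-22-3333", "name": "Amy"},
]

ssn_determines_name = holds(employees, "ssn", "name")
name_determines_ssn = holds(employees, "name", "ssn")
print(ssn_determines_name, name_determines_ssn)
```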
Lock
Definition: Database management systems utilize locks to provide concurrency control.
Common uses of locks are to ensure that only one user can modify a record at a time and that
data cannot be read while it is being modified. Locking mechanisms can be enforced at the
row, table or page level.
Metadata
Definition: Metadata is literally "data about data." This term refers to information about data
itself -- perhaps the origin, size, formatting or other characteristics of a data item. In the
database field, metadata is essential to understanding and interpreting the contents of a data
warehouse.
Example: The eXtensible Markup Language (XML) is a metadata format used to define other
data objects.
Replication
Definition: Replication is the process of sharing information between databases (or any other
type of server) to ensure that the content is consistent between systems. Replication is
normally used to increase the number of database servers available to clients, thereby
reducing the load on each.
Form
Definition: A database form can be used to facilitate database data entry and/or retrieval
operations. A database developer/administrator usually designs a form which can then be used
by personnel without any specific database skills to perform repetitive tasks.
Examples: [Figure: an example form from a Microsoft Access database]
Report
Definition: A database report presents information retrieved from a table or query in a
preformatted, attractive manner.
Examples: [Figure: a sample Microsoft Access report]
Repository
Definition: A repository is a collection of resources that can be accessed to retrieve
information. Repositories often consist of several databases tied together by a common search
engine.
Transaction
Definition: A transaction is a group of database commands that is treated as a
single atomic event: either all of the commands take effect or none do. Distributed
transactions are coordinated using the two phase commit system.
Two Phase Commit
Definition: Two Phase Commit is the process by which a relational database ensures that
distributed transactions are performed in an orderly manner. In this system, transactions may
be terminated by either committing them or rolling them back.
Query
Definition: Queries are the primary mechanism for retrieving information from a database and
consist of questions presented to the database in a predefined format. Many database
management systems use the Structured Query Language (SQL) standard query format.
Structured Query Language (SQL)
Definition: The structured query language is an industry-standard language used for
manipulation of data in a relational database. The major SQL commands of interest to
database users are SELECT, INSERT, JOIN and UPDATE.
SELECT
Definition: The SELECT statement in SQL is the primary mechanism for retrieving information
from a relational database.
Examples: Given the following table:
Members
ID   LastName   Age
1    Smith      25
2    Jones      42
3    Reynolds   36
This SQL statement:
SELECT LastName
FROM Members
WHERE Age>30
Would produce the following results:
LastName
Jones
Reynolds
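To try the statement, the Members table can be loaded into an in-memory SQLite database using Python's built-in sqlite3 module. The column types below are assumptions, since the glossary does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the glossary only gives the column names.
conn.execute("CREATE TABLE Members (ID INTEGER PRIMARY KEY, LastName TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [(1, "Smith", 25), (2, "Jones", 42), (3, "Reynolds", 36)],
)

# The SELECT statement from the example, unchanged.
over_thirty = [
    row[0]
    for row in conn.execute("SELECT LastName FROM Members WHERE Age > 30")
]
print(over_thirty)
```

Without an ORDER BY clause, SQL does not guarantee the order in which the matching rows are returned.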
INSERT
Definition: The INSERT SQL command is used to add records to a table within a database.
Examples: Given the following simple table:
Members
ID   LastName   Age
1    Smith      25
2    Jones      42
The following SQL statement could be used to add a new record:
INSERT INTO Members (ID, LastName, Age)
VALUES (3, 'Reynolds', 36)
Which would produce the new table:
ID   LastName   Age
1    Smith      25
2    Jones      42
3    Reynolds   36
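The same insertion can be run from application code. The sketch below uses Python's built-in sqlite3 module with assumed column types, names the target columns explicitly, and passes the values as parameters rather than building the SQL text by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the glossary only gives the column names.
conn.execute("CREATE TABLE Members (ID INTEGER PRIMARY KEY, LastName TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [(1, "Smith", 25), (2, "Jones", 42)],
)

# Naming the columns keeps the statement valid if the table later gains columns.
conn.execute(
    "INSERT INTO Members (ID, LastName, Age) VALUES (?, ?, ?)",
    (3, "Reynolds", 36),
)

members = conn.execute("SELECT ID, LastName, Age FROM Members ORDER BY ID").fetchall()
print(members)
```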
JOIN
Definition: The SQL JOIN statement is used to combine the data contained in two relational
database tables based upon a common attribute.
Examples: Given the following two tables:
Customers
CustomerID   CompanyName       Phone
12           ABC Corporation   123-4567
49           XYZ, Inc.         765-4321
Orders
OrderID   CustomerID   Amount
4021      12           $842.21
8532      12           $582.20
8192      49           $12.43
The following JOIN statement could be used:
SELECT Customers.CompanyName, Orders.Amount
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
Which would display the following results:
CompanyName       Amount
ABC Corporation   $842.21
ABC Corporation   $582.20
XYZ, Inc.         $12.43
Alternatively, the JOIN can be performed implicitly by listing both tables in the FROM clause
and placing the join condition in the WHERE clause:
SELECT CompanyName, Amount
FROM Customers, Orders
WHERE Customers.CustomerID = Orders.CustomerID
The join condition is still required in this form: without it, the query returns every
combination of Customers and Orders rows (a Cartesian product). Additional conditions can be
added to the WHERE clause to further refine the results. For example, if we only wanted
results from ABC Corporation we could use the statement:
SELECT CompanyName, Amount
FROM Customers, Orders
WHERE Customers.CustomerID = Orders.CustomerID
AND CompanyName = 'ABC Corporation'
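The following sketch compares three forms of the query in SQLite, via Python's built-in sqlite3 module: an explicit JOIN, a comma-separated FROM clause with a join condition in WHERE, and the same FROM clause with no condition at all. Column types are assumptions, and Amount is kept as text with its currency symbol purely to mirror the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CompanyName TEXT, Phone TEXT)"
)
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Amount TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(12, "ABC Corporation", "123-4567"), (49, "XYZ, Inc.", "765-4321")],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(4021, 12, "$842.21"), (8532, 12, "$582.20"), (8192, 49, "$12.43")],
)

explicit = conn.execute(
    """
    SELECT Customers.CompanyName, Orders.Amount
    FROM Customers
    JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Orders.OrderID
    """
).fetchall()

implicit = conn.execute(
    """
    SELECT CompanyName, Amount
    FROM Customers, Orders
    WHERE Customers.CustomerID = Orders.CustomerID
    ORDER BY OrderID
    """
).fetchall()

# No join condition: every customer row is paired with every order row.
no_condition = conn.execute(
    "SELECT CompanyName, Amount FROM Customers, Orders"
).fetchall()

print(explicit == implicit, len(explicit), len(no_condition))
```

With two customers and three orders, the unconditioned query returns six rows, while both joined forms return the three matching rows.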
UPDATE
Definition: The UPDATE statement in SQL is used to edit values for attributes in one or more
records of a relational table.
Example: Given the following table:
Members
ID   LastName   Age
1    Smith      25
2    Jones      42
3    Reynolds   36
Assume that the member Jones recently changed her last name to McGuire. This change could
be effected using the following SQL statement:
UPDATE Members
SET LastName = 'McGuire'
WHERE ID = 2
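As a quick check of the statement above, this sketch runs the UPDATE in an in-memory SQLite database through Python's built-in sqlite3 module. Column types are assumed. The cursor's rowcount attribute reports how many rows the WHERE clause matched, which is a useful guard against updating more records than intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the glossary only gives the column names.
conn.execute("CREATE TABLE Members (ID INTEGER PRIMARY KEY, LastName TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [(1, "Smith", 25), (2, "Jones", 42), (3, "Reynolds", 36)],
)

cursor = conn.execute(
    "UPDATE Members SET LastName = ? WHERE ID = ?",
    ("McGuire", 2),
)
rows_changed = cursor.rowcount

last_names = [
    row[0] for row in conn.execute("SELECT LastName FROM Members ORDER BY ID")
]
print(rows_changed, last_names)
```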
COMMIT
Definition: The COMMIT statement in SQL marks the final step in the processing of a database
transaction. The alternative is to utilize the ROLLBACK command to cancel the proposed
database changes.
Examples: The COMMIT statement is used in the following manner:
BEGIN TRANSACTION [transaction_name]
...
SQL Statement(s)
...
COMMIT TRANSACTION [transaction_name]
ROLLBACK
Definition: The ROLLBACK statement in SQL cancels the proposed changes in a pending
database transaction. The transaction can be rolled back completely by specifying the
transaction name in the ROLLBACK statement. A partial rollback can also be accomplished by
specifying a savepoint name in lieu of the transaction name. The alternative to rolling back a
transaction is to utilize the COMMIT command to make the proposed changes part of the
relational database.
Examples: The ROLLBACK statement is used in the following manner to cancel an entire
transaction:
BEGIN TRANSACTION [transaction_name]
...
SQL Statement(s)
...
ROLLBACK TRANSACTION [transaction_name]
The ROLLBACK command can also be used to cancel part of a transaction in the following
manner:
BEGIN TRANSACTION [transaction_name]
...
SQL Statement(s)
SAVE TRANSACTION savepoint_name
SQL Statement(s)
ROLLBACK TRANSACTION savepoint_name
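The BEGIN TRANSACTION / SAVE TRANSACTION syntax shown above is the form used by Microsoft SQL Server. As a runnable illustration of the same ideas, the sketch below uses SQLite through Python's built-in sqlite3 module, where the equivalent statements are SAVEPOINT and ROLLBACK TO SAVEPOINT. The savepoint name before_jones is hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN,
# SAVEPOINT, ROLLBACK and COMMIT can be issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Members (ID INTEGER PRIMARY KEY, LastName TEXT, Age INTEGER)")

# Full rollback: nothing from the transaction survives.
conn.execute("BEGIN")
conn.execute("INSERT INTO Members VALUES (1, 'Smith', 25)")
conn.execute("ROLLBACK")
after_full_rollback = conn.execute("SELECT COUNT(*) FROM Members").fetchone()[0]

# Partial rollback: work done after the savepoint is undone, earlier work is kept.
conn.execute("BEGIN")
conn.execute("INSERT INTO Members VALUES (1, 'Smith', 25)")
conn.execute("SAVEPOINT before_jones")
conn.execute("INSERT INTO Members VALUES (2, 'Jones', 42)")
conn.execute("ROLLBACK TO SAVEPOINT before_jones")
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT LastName FROM Members ORDER BY ID")]
print(after_full_rollback, names)
```

Only the row inserted before the savepoint is committed; the row inserted after it is discarded by the partial rollback.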
NULL
Definition: The NULL SQL keyword is used to represent either a missing value or a value that
is not applicable in a relational table.