Teradata Day 1

  • 8/12/2019 Teradata Day 1

    1/134

    1 / 25 May 2009 / EDS INTERNAL

    Teradata Training

    Hema Venkatesh Ramasamy

    HP Global Soft Private Ltd.

    Teradata Product Overview

    After completing this module, you will be able to:

    Describe the purpose of the Teradata product

    Give a brief history of the product

    List major architectural features of the product

    What is Teradata?

    Teradata is a Relational Database Management System (RDBMS).

    Designed to run the world's largest commercial databases

    Preferred solution for enterprise data warehousing

    Executes on UNIX MP-RAS and Windows 2000 operating systems

    Compliant with ANSI industry standards

    Runs on a single node or on multiple nodes

    Acts as a database server to client applications throughout the enterprise

    Uses parallelism to manage terabytes of data

    Capable of supporting many concurrent users from various client platforms (over a TCP/IP or IBM channel connection)

    [Figure: Windows XP, Windows 2000, UNIX, and mainframe clients connected to the Teradata database]

    Teradata: A Brief History

    1979  Teradata Corp. founded in Los Angeles, California; development begins on a massively parallel computer

    1982  YNET technology is patented

    1984  Teradata markets the first database computer, the DBC/1012; first system purchased by Wells Fargo Bank of California; total revenue for the year: $3 million

    1987  First public offering of stock

    1989  Teradata and NCR partner on next generation of DBC

    1991  NCR Corporation is acquired by AT&T; Teradata revenues at $280 million

    1992  Teradata is merged into NCR

    1996  AT&T spins off NCR Corp. with Teradata product

    1997  Teradata database becomes industry leader in data warehousing

    2000  100+ terabyte system in production

    2002  Teradata V2R5 released 12/2002; major release including features such as PPI, roles and profiles, multi-value compression, and more

    2003  Teradata V2R5.1 released 12/2003; includes UDFs, BLOBs, CLOBs, and more

    How Large is a Trillion?

    1 Kilobyte = 10^3 = 1,000 bytes
    1 Megabyte = 10^6 = 1,000,000 bytes
    1 Gigabyte = 10^9 = 1,000,000,000 bytes
    1 Terabyte = 10^12 = 1,000,000,000,000 bytes
    1 Petabyte = 10^15 = 1,000,000,000,000,000 bytes

    1 million seconds = 11.57 days
    1 billion seconds = 31.6 years
    1 trillion seconds = 31,688 years

    1 million inches = 15.7 miles
    1 trillion inches = 15,700,000 miles (30 roundtrips to the moon)

    1 million square inches = .16 acres = .0002 square miles
    1 trillion square inches = 249 square miles (larger than Singapore)

    $1 million = < $.01 for every person in U.S.
    $1 billion = $3.64 for every person in U.S.
    $1 trillion = $3,636 for every person in U.S.

    Designed for Today's Business

    Teradata's Charter meets the business needs of today and tomorrow with:

    Relational database: standard for database design

    Enormous capacity: billions of rows, terabytes of data

    High performance parallel processing

    Single database server for multiple clients: "Single Version of the Truth"

    Network and mainframe connectivity

    Industry standard access language: Structured Query Language (SQL)

    Manageable growth via modularity

    Fault tolerance at all levels of hardware and software

    Data integrity and reliability

    Evolution of Data Processing

    Type | Example | Number of Rows Accessed | Response Time

    OLTP | Update a checking account to reflect a deposit | Small | Seconds

    DSS | How many child size blue jeans were sold across all of our Eastern stores in the month of March? | Large | Seconds or minutes

    OLCP | Instant credit: How much credit can be extended to this person? | Small to moderate; possibly across multiple databases | Minutes

    OLAP | Show the top ten selling items across all stores for 2003. | Large number of detail rows or moderate number of summary rows | Seconds or minutes

    [Slide labels beside the table: TRADITIONAL, Today]

    The need to process DSS, OLCP, and OLAP type requests across an enterprise and its data leads to the concept of a Data Warehouse.

    Data Warehouse Usage Evolution

    STAGE 1, REPORTING: WHAT happened? (Primarily batch)

    STAGE 2, ANALYZING: WHY did it happen? (Increase in ad hoc queries)

    STAGE 3, PREDICTING: WHAT will happen? (Analytical modeling grows)

    STAGE 4, OPERATIONALIZING: WHAT IS happening? (Continuous update and time-sensitive queries become important)

    STAGE 5, ACTIVE WAREHOUSING: MAKING it happen! (Event-based triggering takes hold)

    [Figure: workload mix growing across the stages: Batch, Ad Hoc, Analytics, Continuous Update / Short Queries, Event-Based Triggering]

    What is Active Data Warehousing?

    Data Warehousing is the timely, integrated, logically consistent store of detailed data available for analytic business decision making.

    Primarily batch feeds and updates

    Ad hoc queries to support strategic decisions that return in minutes and maybe hours

    Active Data Warehousing is the timely, integrated, logically consistent store of detailed data available for strategic and tactically driven business decisions.

    Timely updates: close to real time

    Short, tactical queries that return in seconds

    Event driven activity plus strategic queries

    Business requirements for an ADW (Active Data Warehouse):

    Performance: response within seconds

    Scalability: support for large data volumes, mixed workloads, and concurrent users

    Availability: 7 x 24 x 365

    Data Freshness: accurate, up-to-the-minute data

    Teradata's Competitive Advantages

    Unlimited, Proven Scalability: amount of data and number of users; allows for an enterprise-wide model of the data.

    Unlimited Parallelism: parallel access, sorts, and aggregations.

    Mature Optimizer: handles complex queries, up to 64 joins per query, ad hoc processing.

    Models the Business: 3NF, robust view processing, and star schema capabilities.

    Provides a "single version of the truth".

    Low TCO (Total Cost of Ownership): ease of setup, maintenance, and administration; no re-orgs, lowest disk to data ratio, and robust expansion utility (reconfig).

    High Availability: no single point of failure.

    Parallel Load and Unload utilities: robust, parallel, and scalable load and unload utilities such as FastLoad, MultiLoad, TPump, and FastExport.

    Teradata Manageability

    Things a Teradata DBA never has to do!

    Reorganize data or index space

    Pre-allocate table/index space, format partitions

    Pre-prepare data for loading (convert, sort, split, etc.)

    Ensure that queries run in parallel

    Unload/reload data spaces due to expansion

    Design, implement, and support partition schemes

    Write or run programs to split the input source files into partitions for loading

    A DBA knows that if the data doubles, the system can expand easily to accommodate it.

    The command and workload for creating a table that will have 100,000 rows is the same as for creating a table that will have 1,000,000,000 rows!

    Teradata Basics

    After completing this module, you will be able to:

    List and describe the major components of the Teradata architecture.

    Describe how the components interact to manage incoming and outgoing data

    List 5 types of Teradata database objects

    Teradata Storage Architecture

    Notes:

    The Parsing Engine dispatches a request to insert a row.

    The Message Passing Layer ensures that a row gets to the appropriate AMP (Access Module Processor).

    The AMP stores the row on its associated (logical) disk.

    An AMP manages a logical or virtual disk, which is mapped to multiple physical disks in a disk array.

    [Figure: the Parsing Engine(s) sit above the Message Passing Layer, which connects to AMP 1 through AMP 4; the twelve client records are spread across the disks of the four AMPs]

    Records From Client (in random sequence): 2 32 67 12 90 6 54 75 18 25 80 41
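The storage notes above can be illustrated with a small simulation. This is only a sketch: the `amp_for` and `distribute` helpers are hypothetical names, and MD5 is used purely as a stand-in hash, since the slide does not describe Teradata's actual hashing algorithm. The twelve record values are the ones shown on the slide.

```python
import hashlib

NUM_AMPS = 4  # the slide shows AMP 1 through AMP 4


def amp_for(value, num_amps=NUM_AMPS):
    """Map a record value to an AMP number (1-based) using a stand-in hash."""
    digest = hashlib.md5(str(value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_amps + 1


def distribute(records, num_amps=NUM_AMPS):
    """Route every record to exactly one AMP and return {amp_number: [values]}."""
    amps = {n: [] for n in range(1, num_amps + 1)}
    for value in records:
        amps[amp_for(value, num_amps)].append(value)
    return amps


# Records arrive from the client in random sequence, as on the slide.
records = [2, 32, 67, 12, 90, 6, 54, 75, 18, 25, 80, 41]
placement = distribute(records)
for amp, rows in placement.items():
    print(f"AMP {amp}: {rows}")
```

The exact placement depends on the stand-in hash, so it will not match the slide's picture; the point is that the sender never chooses an AMP by hand, and the same value always lands on the same AMP.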

    Teradata Retrieval Architecture

    Notes:

    The Parsing Engine dispatches a request to retrieve one or more rows.

    The Message Passing Layer ensures that the appropriate AMP(s) are activated.

    The AMP(s) locate and retrieve the desired row(s) in parallel.

    The Message Passing Layer returns the retrieved rows to the PE.

    The PE returns the row(s) to the requesting client application.

    [Figure: the Parsing Engine(s) sit above the Message Passing Layer, which connects to AMP 1 through AMP 4; the rows stored on the four AMPs are gathered and returned]

    Rows retrieved from table: 2 32 67 12 90 6 54 75 18 25 80 41

    Multiple Tables on Multiple AMPs

    Notes:

    Some rows from each table may be found on each AMP.

    Each AMP may have rows from all tables.

    Ideally, each AMP will hold roughly the same amount of data.

    [Figure: rows of the EMPLOYEE, DEPARTMENT, and JOB tables pass through the Parsing Engine and the Message Passing Layer to AMP #1 through AMP #4; each AMP holds EMPLOYEE rows, DEPARTMENT rows, and JOB rows]

    Linear Growth and Expandability

    [Figure: Parsing Engines, AMPs, and disks added side by side as the system grows]

    Notes:

    Teradata is a linearly expandable RDBMS.

    Components may be added as requirements grow.

    Linear scalability allows for increased workload without decreased throughput.

    Performance impact of adding components is shown below.

    USERS   AMPs    DATA    Performance
    Same    Same    Same    Same
    Double  Double  Same    Same
    Same    Double  Double  Same
    Same    Double  Same    Double
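The four rows of the performance table are consistent with a simple idealized model in which work grows with users times data and capacity grows with the number of AMPs. The formula below is an assumption made for illustration (it is not stated on the slide), and `relative_performance` is a hypothetical helper name.

```python
def relative_performance(users, amps, data):
    """Idealized performance relative to a baseline of 1.0.

    Assumption (not from the slide): total work is proportional to
    users * data, and capacity is proportional to the number of AMPs.
    """
    return amps / (users * data)


# Multipliers relative to the baseline: 1 = "Same", 2 = "Double".
scenarios = [
    ("Same users, same AMPs, same data", 1, 1, 1),
    ("Double users, double AMPs, same data", 2, 2, 1),
    ("Same users, double AMPs, double data", 1, 2, 2),
    ("Same users, double AMPs, same data", 1, 2, 1),
]

results = {}
for label, users, amps, data in scenarios:
    results[label] = relative_performance(users, amps, data)
    print(f"{label}: {results[label]:.1f}x")
```

Under this model the first three scenarios stay at the baseline and only the last one doubles, which is the pattern the table describes. Real systems will deviate from perfectly linear behavior.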

    Teradata Objects

    There are eight types of objects which may be found in a Teradata database/user.

    Tables: rows and columns of data

    Views: predefined subsets of existing tables

    Macros: predefined, stored SQL statements

    Triggers: SQL statements associated with a table

    Stored Procedures: programs stored within Teradata

    Join and Hash Indexes: separate index structures stored as objects within a database

    Permanent Journals: tables used to store before and/or after images for recovery

    These objects are created, maintained, and deleted using Structured Query Language (SQL).

    Object definitions are stored in the Data Dictionary / Directory (DD/D).

    [Figure: a DATABASE or USER containing TABLE 1-3, VIEW 1-3, MACRO 1-3, TRIGGER 1-3, Stored Procedure 1-3, Join/Hash Index 1-3, and a Permanent Journal, with the definitions of all database objects held in the DD/D. Figure note: "These aren't directly accessed by users."]

    The Data Dictionary / Directory (DD/D)

    The DD/D ...

    is an integrated set of system tables

    contains definitions of and information about all objects in the system

    is entirely maintained by the RDBMS

    is "data about the data", or metadata

    is distributed across all AMPs like all tables

    may be queried by administrators or support staff

    is accessed via Teradata supplied views

    Examples of DD/D views:

    DBC.Tables - information about all tables

    DBC.Users - information about all users

    DBC.AllRights - information about access rights

    DBC.AllSpace - information about space utilization

    Structured Query Language (SQL)

    SQL is a query language for Relational Database Systems.

    A fourth-generation language

    A set-oriented language

    A non-procedural language (e.g., doesn't have IF, GO TO, DO, FOR NEXT, or PERFORM statements)

    SQL consists of:

    Data Definition Language (DDL): defines database structures (tables, users, views, macros, triggers, etc.)

        CREATE   DROP   ALTER

    Data Manipulation Language (DML): manipulates rows and data values

        SELECT   INSERT   UPDATE   DELETE

    Data Control Language (DCL): grants and revokes access rights

        GRANT   REVOKE

    Teradata SQL also includes Teradata Extensions to SQL

        HELP   SHOW   EXPLAIN   CREATE MACRO

    CREATE TABLE: Example of DDL

    CREATE TABLE Employee, FALLBACK
      (employee_number     INTEGER NOT NULL
      ,manager_emp_number  INTEGER
      ,dept_number         SMALLINT
      ,job_code            INTEGER COMPRESS
      ,last_name           CHAR(20) NOT NULL
      ,first_name          VARCHAR(20)
      ,hire_date           DATE FORMAT 'YYYY-MM-DD'
      ,birth_date          DATE FORMAT 'YYYY-MM-DD'
      ,salary_amount       DECIMAL(10,2))
    UNIQUE PRIMARY INDEX (employee_number)
    ,INDEX (dept_number);

    Other DDL Examples

    CREATE INDEX (job_code) ON Employee ;

    DROP INDEX (job_code) ON Employee ;

    DROP TABLE Employee ;

    Views

    Views are pre-defined subsets of existing tables consisting of specified columns and/or rows from the table(s).

    A single table view:

    is a window into an underlying table

    allows users to read and update a subset of the underlying table

    has no data of its own

    EMPLOYEE (Table)

    EMPLOYEE  MANAGER     DEPT    JOB     LAST      FIRST    HIRE    BIRTH   SALARY
    NUMBER    EMP NUMBER  NUMBER  CODE    NAME      NAME     DATE    DATE    AMOUNT
    (PK)      (FK)        (FK)    (FK)
    1006      1019        301     312101  Stein     John     861015  631015  3945000
    1008      1019        301     312102  Kanieski  Carol    870201  680517  3925000
    1005      0801        403     431100  Ryan      Loretta  861015  650910  4120000
    1004      1003        401     412101  Johnson   Darlene  861015  560423  4630000
    1007      1005        403     432101  Villegas  Arnando  870102  470131  5970000
    1003      0801        401     411100  Trader    James    860731  570619  4785000

    Emp_403 (View)

    EMP NO  DEPT NO  LAST NAME  FIRST NAME  HIRE DATE
    1007    403      Villegas   Arnando     870102
    1005    403      Ryan       Loretta     861015
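A single-table view like Emp_403 can be tried out with a short script. This sketch uses SQLite from the Python standard library as a stand-in for Teradata, so the syntax and behavior are only approximate: SQLite views are read-only, whereas the slide says a Teradata single-table view can also be updated. The table is trimmed to the columns the view needs, and the dates are written out in full.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Employee (
           employee_number INTEGER PRIMARY KEY,
           dept_number     INTEGER,
           last_name       TEXT,
           first_name      TEXT,
           hire_date       TEXT)"""
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, ?)",
    [
        (1006, 301, "Stein", "John", "1986-10-15"),
        (1008, 301, "Kanieski", "Carol", "1987-02-01"),
        (1005, 403, "Ryan", "Loretta", "1986-10-15"),
        (1004, 401, "Johnson", "Darlene", "1986-10-15"),
        (1007, 403, "Villegas", "Arnando", "1987-01-02"),
        (1003, 401, "Trader", "James", "1986-07-31"),
    ],
)

# The view stores no rows of its own; it is a saved query over Employee.
conn.execute(
    """CREATE VIEW Emp_403 AS
       SELECT employee_number, dept_number, last_name, first_name, hire_date
       FROM Employee
       WHERE dept_number = 403"""
)

view_rows = conn.execute(
    "SELECT * FROM Emp_403 ORDER BY employee_number"
).fetchall()
for row in view_rows:
    print(row)
```

Because the view is only a window into the table, any row later added to department 403 in Employee appears in Emp_403 without the view being redefined.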

    Multi-Table Views

    A multi-table view allows users to access data from multiple tables as if it were in a single table. Multi-table views are also called join views. Join views are used for reading only, not updating.

    EMPLOYEE (Table)

    EMPLOYEE  MANAGER     DEPT    JOB     LAST      FIRST    HIRE    BIRTH   SALARY
    NUMBER    EMP NUMBER  NUMBER  CODE    NAME      NAME     DATE    DATE    AMOUNT
    (PK)      (FK)        (FK)    (FK)
    1006      1019        301     312101  Stein     John     861015  631015  3945000
    1008      1019        301     312102  Kanieski  Carol    870201  680517  3925000
    1005      0801        403     431100  Ryan      Loretta  861015  650910  4120000
    1004      1003        401     412101  Johnson   Darlene  861015  560423  4630000
    1007      1005        403     432101  Villegas  Arnando  870102  470131  5970000
    1003      0801        401     411100  Trader    James    860731  570619  4785000

    DEPARTMENT (Table)

    DEPT    DEPARTMENT                BUDGET    MANAGER
    NUMBER  NAME                      AMOUNT    EMP NUMBER
    (PK)                                        (FK)
    501     marketing sales           80050000  1017
    301     research and development  46560000  1019
    302     product planning          22600000  1016
    403     education                 93200000  1005
    402     software support          30800000  1011
    401     customer support          98230000  1003
    201     technical operations      29380000  1025

    EmpDept (View): the two tables "joined together"

    LAST      DEPARTMENT
    NAME      NAME
    Stein     research and development
    Kanieski  research and development
    Ryan      education
    Johnson   customer support
    Villegas  education
    Trader    customer support

    SELECT: Example of DML

    The SELECT statement is used to retrieve data from tables.

    Who was hired on October 15, 1986?

    EMPLOYEE (partial listing)

    EMPLOYEE  MANAGER     DEPT    JOB     LAST      FIRST    HIRE    BIRTH   SALARY
    NUMBER    EMP NUMBER  NUMBER  CODE    NAME      NAME     DATE    DATE    AMOUNT
    (PK)      (FK)        (FK)    (FK)
    1006      1019        301     312101  Stein     John     861015  631015  3945000
    1008      1019        301     312102  Kanieski  Carol    870201  680517  3925000
    1005      0801        403     431100  Ryan      Loretta  861015  650910  4120000
    1004      1003        401     412101  Johnson   Darlene  861015  560423  4630000
    1007      1005        403     432101  Villegas  Arnando  870102  470131  5970000
    1003      0801        401     411100  Trader    James    860731  570619  4785000

    SELECT Last_Name
          ,First_Name
    FROM   Employee
    WHERE  Hire_Date = '1986-10-15';

    Result

    LAST NAME  FIRST NAME
    Stein      John
    Ryan       Loretta
    Johnson    Darlene

    The JOIN Operation

    A join operation is used when the SQL query requires information from multiple tables.

    Who works in Research and Development?

    EMPLOYEE

    EMPLOYEE  MANAGER     DEPT    JOB     LAST      FIRST    HIRE    BIRTH   SALARY
    NUMBER    EMP NUMBER  NUMBER  CODE    NAME      NAME     DATE    DATE    AMOUNT
    (PK)      (FK)        (FK)    (FK)
    1006      1019        301     312101  Stein     John     861015  631015  3945000
    1008      1019        301     312102  Kanieski  Carol    870201  680517  3925000
    1005      0801        403     431100  Ryan      Loretta  861015  650910  4120000
    1004      1003        401     412101  Johnson   Darlene  861015  560423  4630000
    1007      1005        403     432101  Villegas  Arnando  870102  470131  5970000
    1003      0801        401     411100  Trader    James    860731  570619  4785000

    DEPARTMENT

    DEPT    DEPARTMENT                BUDGET    MANAGER
    NUMBER  NAME                      AMOUNT    EMP NUMBER
    (PK)                                        (FK)
    501     marketing sales           80050000  1017
    301     research and development  46560000  1019
    302     product planning          22600000  1016
    403     education                 93200000  1005
    402     software support          30800000  1011
    401     customer support          98230000  1003
    201     technical operations      29380000  1025

    Result

    LAST NAME  FIRST NAME
    Stein      John
    Kanieski   Carol
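The slide shows the two tables and the result but not the query itself. One way to write the join is sketched below, again using SQLite from the Python standard library as a stand-in for Teradata. The tables are trimmed to the columns the question needs, and the column name `department_name` is an assumption based on the DEPARTMENT NAME heading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (
        dept_number     INTEGER PRIMARY KEY,
        department_name TEXT
    );
    CREATE TABLE Employee (
        employee_number INTEGER PRIMARY KEY,
        dept_number     INTEGER REFERENCES Department (dept_number),
        last_name       TEXT,
        first_name      TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [
        (501, "marketing sales"),
        (301, "research and development"),
        (302, "product planning"),
        (403, "education"),
        (402, "software support"),
        (401, "customer support"),
        (201, "technical operations"),
    ],
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1006, 301, "Stein", "John"),
        (1008, 301, "Kanieski", "Carol"),
        (1005, 403, "Ryan", "Loretta"),
        (1004, 401, "Johnson", "Darlene"),
        (1007, 403, "Villegas", "Arnando"),
        (1003, 401, "Trader", "James"),
    ],
)

# The department name lives in Department, the people in Employee,
# so the query must match rows on the shared dept_number column.
join_sql = """
    SELECT e.last_name, e.first_name
    FROM Employee AS e
    JOIN Department AS d ON e.dept_number = d.dept_number
    WHERE d.department_name = 'research and development'
    ORDER BY e.employee_number
"""
rd_staff = conn.execute(join_sql).fetchall()
print(rd_staff)
```

The foreign key dept_number in Employee is what makes the match possible: each employee row points at exactly one department row.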

    Macros: Teradata SQL Extension

    A MACRO is a predefined set of SQL statements which is logically stored in a database.

    Macros may be created for frequently occurring queries or sets of operations.

    Macros have many features and benefits:

    Simplify end-user access

    Control which operations may be performed by users

    May accept user-provided parameter values

    Are stored on the RDBMS, thus available to all clients

    Reduce query size, thus reduce LAN/channel traffic

    Are optimized at execution time

    May contain multiple SQL statements

    To create a macro:

    CREATE MACRO Customer_List AS (SELECT customer_name FROM Customer;);

    To execute a macro:

    EXEC Customer_List;

    To replace a macro:

    REPLACE MACRO Customer_List AS

    (SELECT customer_name, customer_number FROM Customer;);

    HELP Commands: Teradata SQL Extension

    Databases and Users:

    HELP DATABASE Customer_Service ;

    HELP USER Dave_Jones ;

    Tables, Views, and Macros:

    HELP TABLE Employee ;

    HELP VIEW Emp;

    HELP MACRO Payroll_3;

    HELP COLUMN Employee.* ;

    HELP COLUMN Employee.last_name ;

    HELP COLUMN Emp.* ;

    HELP COLUMN Emp.last ;

    HELP INDEX Employee;

    HELP STATISTICS Employee;

    HELP CONSTRAINT Employee.over_21;

    EXPLAIN Facility: Teradata SQL Extension

    The EXPLAIN modifier in front of any SQL statement generates an English translation of the Parser's plan.

    The request is fully parsed and optimized, but not actually executed.

    EXPLAIN returns:

    Text showing how a statement will be processed (a plan)

    An estimate of how many rows will be involved

    A relative cost of the request (in units of time)

    This information is useful for:

    predicting row counts

    predicting performance

    testing queries before production

    analyzing various approaches to a problem

    EXPLAIN SELECT last_name, department_number FROM Employee ;

    Explanation (partial):

    3) We do an all-AMPs RETRIEVE step from CUSTOMER_SERVICE.Employee by way of an all-rows scan with no residual conditions into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 24 rows. The estimated time for this step is 0.15 seconds.

    Teradata Features Review

    Designed for decision-support and tactical queries

    Ideal for data warehouse applications

    Parallelism makes possible access to very large tables

    Performance increase is linear as components are added

    Uses standard SQL

    Runs as a database server to client applications

    Runs on multiple hardware platforms

    Open architecture: uses industry standard components

    [Figure: Windows XP, Windows 2000, UNIX, and mainframe clients connected to the Teradata database]

    Teradata RDBMS Architecture

    After completing this module, you will be able to:

    Describe the purpose of the PE and the AMP.

    Describe the overall RDBMS parallel architecture.

    Describe the relationship of the RDBMS to its client-side applications.

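    Teradata Functional Overview

    [Figure: two client paths into the Teradata RDBMS. Channel-Attached System: Client Application, CLI, TDP, then the Channel to a Parsing Engine. Network-Attached System: Client Application, CLI or ODBC, MTDP, MOSI, then the LAN to a Parsing Engine. Inside the Teradata RDBMS, the Parsing Engines connect through the Message Passing Layer to the AMPs.]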

    Channel-Attached Client Software Overview

    Client Application
    - Your own application(s)
    - Teradata utilities (BTEQ, etc.)

    CLI (Call-Level Interface) Service Routines
    - Request and Response Control
    - Parcel creation and blocking/unblocking
    - Buffer allocation and initialization

    TDP (Teradata Director Program)
    - Session balancing across multiple PEs
    - Ensures proper message routing to/from RDBMS
    - Failure notification (application failure, Teradata restart)

    [Figure: in the Channel-Attached System, Client Applications call CLI, which passes requests to the TDP; the TDP connects over the Channel (ESCON or Bus/Tag) to a Host Channel Adapter (PBSA or PBCA) in front of the Parsing Engines]

    Network-Attached Client Software Overview

    CLI (Call Level Interface)
    - Library of routines for blocking/unblocking requests and responses to/from the RDBMS

    ODBC (Open Database Connectivity) Driver
    - Uses open standards-based ODBC interface to provide client applications access to Teradata

    MTDP (Micro Teradata Director Program)
    - Library of session management routines

    MOSI (Micro Operating System Interface)
    - Library of routines providing OS independent interface

    [Figure: LAN-Attached Servers. Client Applications (e.g., FastLoad and BTEQ over CLI, SQL Assistant over ODBC) each sit on MTDP and MOSI and connect over the LAN (TCP/IP) to an Ethernet Adapter and the Gateway Software (tgtw) in front of the Parsing Engines]

    The Parsing Engine

    The Parsing Engine is responsible for:

    Managing individual sessions (up to 120)

    Parsing and optimizing your SQL requests

    Dispatching the optimized plan to the AMPs

    Input conversion (EBCDIC / ASCII), if necessary

    Sending the answer set response back to the requesting client

    [Figure: an SQL Request enters the Parsing Engine, which contains the Parser, Optimizer, and Dispatcher; the plan goes through the Message Passing Layer to the AMPs, and the Answer Set Response returns to the client]

    Message Passing Layer

    The Message Passing Layer is responsible for:

    Carrying messages between the AMPs and PEs

    Point-to-Point, Multi-Cast, and Broadcast communications

    Merging answer sets back to the PE

    Making Teradata parallelism possible

    The Message Passing Layer is a combination of:

    Parallel Database Extensions (PDE) Software

    BYNET Software

    BYNET Hardware for MPP systems

    [Figure: an SQL Request enters the Parsing Engine and the Answer Set Response leaves it; the Message Passing Layer (PDE and BYNET) sits between the Parsing Engine and the AMPs]

    The Access Module Processor (AMP)

    [Figure: an SQL Request enters the Parsing Engine and passes through the Message Passing Layer to the AMPs; the Answer Set Response returns along the same path]

    AMPs store and retrieve rows to and from disk.

    The AMPs are responsible for:

    Finding the rows requested

    Lock management

    Sorting rows

    Aggregating columns

    Join processing

    Output conversion and formatting

    Creating answer set for client

    Disk space management

    Accounting

    Special utility protocols

    Recovery processing

    Storing and Accessing Data Rows

    Creating a Primary Index

    A Primary Index is defined at table creation.

    It may consist of a single column, or a combination of columns:

    Limit of 16 columns with V2R4.1 and prior releases

    Limit of 64 columns with V2R5

    UPI

    CREATE TABLE sample_1
      (col_a INTEGER
      ,col_b INTEGER
      ,col_c INTEGER)
    UNIQUE PRIMARY INDEX (col_b);

    If the index choice of column(s) is unique, we call this a UPI (Unique Primary Index).

    A UPI choice will result in even distribution of the rows of the table across all AMPs.

    NUPI

    CREATE TABLE sample_2
      (col_x INTEGER
      ,col_y INTEGER
      ,col_z INTEGER)
    PRIMARY INDEX (col_x);

    If the index choice of column(s) isn't unique, we call this a NUPI (Non-Unique Primary Index).

    A NUPI choice will result in even distribution of the rows of the table proportional to the degree of uniqueness of the index.

    Note: Changing the choice of Primary Index requires dropping and recreating the table.

    Primary Index Values

    The value of the Primary Index for a specific row determines the AMP assignment for that row.

    This is done using a hashing algorithm.

    [Figure: the PE passes the PI value through the Hashing Algorithm, which selects one AMP for both row assignment and row access]

    Accessing the row by its Primary Index value is:

    always a one-AMP operation

    the most efficient way to access a row

    Other table access techniques:

    Secondary index access

    Full table scans
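The difference between Primary Index access and a full table scan can be sketched in a few lines. Everything here is illustrative: the `Table` class and `amp_for` function are hypothetical, MD5 stands in for the real hashing algorithm, and the rows are made-up sample data.

```python
import hashlib

NUM_AMPS = 3


def amp_for(pi_value, num_amps=NUM_AMPS):
    """Stand-in hashing algorithm: PI value -> AMP index (0-based)."""
    digest = hashlib.md5(str(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


class Table:
    def __init__(self, pi_column):
        self.pi_column = pi_column
        self.amps = [[] for _ in range(NUM_AMPS)]

    def insert(self, row):
        # Row assignment: the PI value alone decides the owning AMP.
        self.amps[amp_for(row[self.pi_column])].append(row)

    def select_by_pi(self, value):
        # Row access: rehash the value and search only that one AMP.
        target = amp_for(value)
        hits = [r for r in self.amps[target] if r[self.pi_column] == value]
        return hits, 1  # (rows found, AMPs touched)

    def full_table_scan(self, predicate):
        # Without a PI value, every AMP has to examine its rows.
        hits = [r for rows in self.amps for r in rows if predicate(r)]
        return hits, len(self.amps)


sample = Table("col_x")
for x, y, z in [(10, 30, "A"), (10, 30, "B"), (35, 40, "B"),
                (20, 50, "A"), (25, 55, "A"), (25, 60, "B"),
                (5, 70, "B"), (30, 80, "B"), (30, 80, "A")]:
    sample.insert({"col_x": x, "col_y": y, "col_z": z})

pi_hits, pi_amps = sample.select_by_pi(25)
scan_hits, scan_amps = sample.full_table_scan(lambda r: r["col_y"] == 55)
print(len(pi_hits), pi_amps, len(scan_hits), scan_amps)
```

The same hash function is used for storing and for finding a row, which is why a lookup by Primary Index value never needs to ask more than one AMP.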

    Accessing Via a Non-Unique Primary Index

    A NUPI access is a one-AMP operation which may access multiple rows.

    CREATE TABLE sample_2
      (col_x INTEGER
      ,col_y INTEGER
      ,col_z INTEGER)
    PRIMARY INDEX (col_x);

    SELECT col_x
          ,col_y
          ,col_z
    FROM   sample_2
    WHERE  col_x = 25;

    [Figure: the PE passes NUPI = 25 through the Hashing Algorithm, which selects one of three AMPs]

    Rows on the first AMP:

    col_x  col_y  col_z
    10     30     A
    10     30     B
    35     40     B

    Rows on the second AMP:

    col_x  col_y  col_z
    20     50     A
    25     55     A
    25     60     B

    Rows on the third AMP:

    col_x  col_y  col_z
    5      70     B
    30     80     B
    30     80     A

    Both UPI and NUPI accesses are one-AMP operations.

    Primary Keys and Primary Indexes

    Indexes are conceptually different from keys.

    A PK is a relational modeling convention which allows each row to be uniquely identified.

    A PI is a Teradata convention which determines how the row will be stored and accessed.

    A significant percentage of tables may use the same columns for both the PK and the PI.

    A well-designed database will use a PI that is different from the PK for some tables.

    Primary Key | Primary Index

    Logical concept of data modeling | Physical mechanism for access and storage

    Teradata doesn't need to recognize | Each table must have exactly one primary index

    No limit on number of columns | 16 column limit (V2R4.1); 64 column limit (V2R5)

    Documented in data model (optional in CREATE TABLE) | Defined in CREATE TABLE statement

    Must be unique | May be unique or non-unique

    Identifies each row | Identifies one row (UPI) or multiple rows (NUPI)

    Values should not change | Values may be changed (Delete + Insert)

    May not be NULL; requires a value | May be NULL

    Does not imply an access path | Defines most efficient access path

    Chosen for logical correctness | Chosen for physical performance

    Duplicate Rows

    A duplicate row is a row of a table whose column values are all identical to another row in the same table.

    col_a  col_b  col_c
    20     50     A
    25     50     A     <- duplicate rows
    25     50     A     <- duplicate rows

    Because a PK uniquely identifies each row, ideally a relational table should not have duplicate rows!

    The ANSI standard, however, permits duplicate rows for specialized situations, thus Teradata permits them as well.

    You may select whether your table will or will not allow them.

    CREATE SET TABLE table_A
      :
      :
    Checks for * and disallows duplicate rows. The Teradata default.

    CREATE MULTISET TABLE table_B
      :
      :
    Doesn't check for and allows duplicate rows. The ANSI default.

    * Note: If a UPI is selected on a SET table, the duplicate row check is replaced by a check for duplicate index values.
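The SET and MULTISET behaviors can be mimicked with two tiny classes. This is a conceptual sketch only: `SetTable` and `MultisetTable` are hypothetical names, the whole "table" is a Python list, and raising `ValueError` is an arbitrary choice for signalling a rejected insert.

```python
class SetTable:
    """Mimics a SET table: a row identical in every column is rejected."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        if row in self.rows:  # the duplicate row check
            raise ValueError(f"duplicate row rejected: {row}")
        self.rows.append(row)


class MultisetTable:
    """Mimics a MULTISET table: no duplicate row check is performed."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


incoming = [(20, 50, "A"), (25, 50, "A"), (25, 50, "A")]

table_a = SetTable()
table_b = MultisetTable()
rejected = []
for row in incoming:
    table_b.insert(row)
    try:
        table_a.insert(row)
    except ValueError:
        rejected.append(row)

print(len(table_a.rows), len(table_b.rows), rejected)
```

Comparing every new row against existing rows costs time on each insert, which is one reason the note about a UPI matters: checking for a duplicate index value is cheaper than checking whole rows.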

    Row Distribution Using a NUPI (Case 2)

    Notes:

    Customer_Number may be the preferred access column for the ORDER table, thus a good index candidate.
    Values for Customer_Number are somewhat non-unique.
    Choice of Customer_Number is therefore a NUPI.
    Rows with the same PI value distribute to the same AMP.
    Row distribution is less uniform, or skewed.

    Order table:

    Order Number (PK)  Customer Number (NUPI)  Order Date  Order Status
    7325               2                       4/13        O
    7324               3                       4/13        O
    7415               1                       4/13        C
    7103               1                       4/10        O
    7225               2                       4/15        C
    7384               1                       4/12        C
    7402               3                       4/16        C
    7188               1                       4/13        C
    7202               2                       4/09        C

    Distribution across the four AMPs (each group of rows below is stored on one AMP):

    o_#   c_#  o_dt  o_st
    7325  2    4/13  O
    7202  2    4/09  C
    7225  2    4/15  C

    o_#   c_#  o_dt  o_st
    7384  1    4/12  C
    7103  1    4/10  O
    7415  1    4/13  C
    7188  1    4/13  C

    o_#   c_#  o_dt  o_st
    7402  3    4/16  C
    7324  3    4/13  O


    Row Distribution Using a Highly Non-Unique Primary Index (NUPI) (Case 3)

    Order table:

    Order Number (PK)  Customer Number  Order Date  Order Status (NUPI)
    7325               2                4/13        O
    7324               3                4/13        O
    7415               1                4/13        C
    7103               1                4/10        O
    7225               2                4/15        C
    7384               1                4/12        C
    7402               3                4/16        C
    7188               1                4/13        C
    7202               2                4/09        C

    Notes:

    Values for Order_Status are highly non-unique.
    Choice of Order_Status column is a NUPI.
    Only two values exist, so only two AMPs will ever be used for this table.
    Table will not perform well in parallel operations.
    Highly non-unique columns are poor PI choices generally.
    The degree of uniqueness is critical to efficiency.

    Distribution across the four AMPs (each group of rows below is stored on one AMP):

    o_#   c_#  o_dt  o_st
    7402  3    4/16  C
    7202  2    4/09  C
    7225  2    4/15  C
    7415  1    4/13  C
    7188  1    4/13  C
    7384  1    4/12  C

    o_#   c_#  o_dt  o_st
    7103  1    4/10  O
    7324  3    4/13  O
    7325  2    4/13  O
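The skew in Case 2 and Case 3 follows directly from the rule that rows with the same PI value land on the same AMP. The sketch below counts rows per distinct PI value for the Order table. It does not model the hashing algorithm; it assumes only that rule, and the names `ORDERS`, `COLUMNS` and `rows_per_pi_value` are illustrative.

```python
from collections import Counter

# Order table from the slides: (order_number, customer_number, order_date, order_status)
ORDERS = [
    (7325, 2, "4/13", "O"),
    (7324, 3, "4/13", "O"),
    (7415, 1, "4/13", "C"),
    (7103, 1, "4/10", "O"),
    (7225, 2, "4/15", "C"),
    (7384, 1, "4/12", "C"),
    (7402, 3, "4/16", "C"),
    (7188, 1, "4/13", "C"),
    (7202, 2, "4/09", "C"),
]
COLUMNS = {"order_number": 0, "customer_number": 1, "order_date": 2, "order_status": 3}


def rows_per_pi_value(rows, pi_column):
    """Count how many rows share each value of the candidate PI column.

    Every group ends up on a single AMP, so the number of groups is an
    upper bound on the number of AMPs the table can use.
    """
    idx = COLUMNS[pi_column]
    return Counter(row[idx] for row in rows)


for candidate in ("order_number", "customer_number", "order_status"):
    groups = rows_per_pi_value(ORDERS, candidate)
    print(candidate, "groups:", len(groups), "largest group:", max(groups.values()))
```

With Order_Number as the PI there are nine groups of one row each, with Customer_Number three uneven groups, and with Order_Status only two groups, which is why at most two AMPs are ever used in Case 3.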


    Primary Index Mechanics


    After completing this module, you will be able to:

    Explain the role of the hashing algorithm and the hash map in locating a row.

    Explain the makeup of the Row ID and its role in row storage.

    Describe the sequence of events for locating a row given its PI value.


    Hashing Down to the AMPs

    Index value(s) -> hashing algorithm -> Row Hash -> DSW or Hash Bucket # -> Hash Map -> AMP #

    The hashing algorithm is designed to ensure even distribution of unique values across all AMPs.

    Different hashing algorithms are used for different international character sets.

    A Row Hash is the 32-bit result of applying a hashing algorithm to an index value.

    The DSW or Hash Bucket is represented by the high-order 16 bits of the Row Hash.

    A Hash Map is uniquely configured for each system.

    It is an array of 65,536 entries (buckets) which associates bucket numbers with specific AMPs.

    Two systems with the same number of AMPs will have the same Hash Map.

    Changing the number of AMPs in a system requires a change to the Hash Map.
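The chain from index value to AMP number can be sketched as follows. Two parts are assumptions made purely for illustration: `zlib.crc32` stands in for the hashing algorithm (Teradata's actual algorithm is not reproduced here), and the hash map assigns buckets to AMPs round-robin. Only the 32-bit row hash, the 16-bit bucket and the 65,536-entry map come from the slide.

```python
import zlib

BUCKETS = 65536  # one hash map entry per possible 16-bit bucket number


def row_hash(index_value):
    """Stand-in 32-bit hash of an index value (NOT the real Teradata algorithm)."""
    return zlib.crc32(repr(index_value).encode("utf-8")) & 0xFFFFFFFF


def build_hash_map(num_amps):
    """Hypothetical hash map: bucket number -> AMP number, assigned round-robin."""
    return [bucket % num_amps for bucket in range(BUCKETS)]


def amp_for(index_value, hash_map):
    """Index value -> row hash -> high-order 16 bits (bucket) -> AMP."""
    bucket = row_hash(index_value) >> 16
    return hash_map[bucket]


hash_map = build_hash_map(4)
print("order 7202 is owned by AMP", amp_for(7202, hash_map))
```

Because the map depends only on the number of AMPs in this sketch, two calls to `build_hash_map(4)` give identical maps, and adding an AMP means rebuilding the map, matching the last two statements on the slide.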


    A Hashing Example

    Order table:

    Order Number (PK, UPI)  Customer Number  Order Date  Order Status
    7325                    2                4/13        O
    7324                    3                4/13        O
    7415                    3                4/13        O
    7415                    1                4/13        C
    7103                    1                4/10        O
    7225                    2                4/15        C
    7384                    1                4/12        C
    7402                    3                4/12        C
    7188                    1                4/13        C
    7202                    2                4/09        C

    SELECT * FROM order
    WHERE order_number = 7202;

    7202 -> Hashing Algorithm -> 691B 14AE (32-bit Row Hash)

    Destination Selection Word (first 16 bits)  | Remaining 16 bits
    0110 1001 0001 1011                         | 0001 0100 1010 1110
    6    9    1    B                            | 1    4    A    E
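The split of the example row hash into its two halves is plain bit arithmetic. A minimal sketch, using the value 691B 14AE from the slide; the function name `split_row_hash` is illustrative.

```python
def split_row_hash(row_hash):
    """Split a 32-bit row hash into (DSW, remaining 16 bits)."""
    if not 0 <= row_hash <= 0xFFFFFFFF:
        raise ValueError("row hash must fit in 32 bits")
    dsw = row_hash >> 16          # high-order 16 bits: the hash bucket number
    remainder = row_hash & 0xFFFF  # low-order 16 bits
    return dsw, remainder


dsw, remainder = split_row_hash(0x691B14AE)
print(f"{dsw:04X} {remainder:04X}")  # the two halves in hex
print(f"{dsw:016b}")                 # the DSW bit pattern
```

The DSW, 691B in hex, is the bucket number that is looked up in the hash map to find the owning AMP.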


    Identifying Rows

    Consideration #1

    A Row Hash = 32 bits = 4.2 billion possible values.

    Because there is an infinite number of possible data values, some data values will have to share the same row hash.

    Data values input  ->  Hash Algorithm  ->  Row Hash
    1254                                       10A2 2936
    7769                                       10A2 2936    (Hash Synonyms)

    Consideration #2

    A Primary Index may be non-unique (NUPI).

    Different rows will have the same PI value and thus the same row hash.

    Data values input  ->  Hash Algorithm  ->  Row Hash
    (John) 'Smith'                             0016 5557
    (Dave) 'Smith'                             0016 5557    (NUPI Duplicates: rows have same hash)

    Conclusion

    A row hash is not adequate to uniquely identify a row.


    The Row ID

    To uniquely identify a row, we add a 32-bit uniqueness value.

    The combined row hash and uniqueness value is called a Row ID.

    Row ID = Row Hash (32 bits) + Uniqueness Id (32 bits)

    Each stored row has a Row ID as a prefix.

    Rows are logically maintained in Row ID sequence.

    Row ID                  | Row Data
    Row Hash   Unique ID    | Emp_No  Last_Name  First_Name
    3B11 5032  0000 0001    | 1018    Reynolds   Jane
    3B11 5032  0000 0002    | 1020    Davidson   Evan
    3B11 5032  0000 0003    | 1031    Green      Jason
    3B11 5033  0000 0001    | 1014    Jacobs     Paul
    3B11 5034  0000 0001    | 1012    Chevas     Jose
    3B11 5034  0000 0002    | 1021    Carnet     Jean
    :          :            | :       :          :
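One way to picture the Row ID is as a single 64-bit number with the row hash in the upper half and the uniqueness value in the lower half; sorting by that number then gives Row ID sequence. The sketch below uses that representation as an assumption (the on-disk layout is not described in these slides), with illustrative function names.

```python
def make_row_id(row_hash, uniqueness):
    """Combine a 32-bit row hash and a 32-bit uniqueness value into one integer."""
    return (row_hash << 32) | uniqueness


def format_row_id(row_id):
    """Render a row ID the way the slide does, e.g. '3B11 5032 0000 0001'."""
    row_hash, uniqueness = row_id >> 32, row_id & 0xFFFFFFFF
    return (f"{row_hash >> 16:04X} {row_hash & 0xFFFF:04X} "
            f"{uniqueness >> 16:04X} {uniqueness & 0xFFFF:04X}")


# A few rows from the slide, deliberately out of order.
rows = [
    (make_row_id(0x3B115034, 1), (1012, "Chevas", "Jose")),
    (make_row_id(0x3B115032, 2), (1020, "Davidson", "Evan")),
    (make_row_id(0x3B115033, 1), (1014, "Jacobs", "Paul")),
    (make_row_id(0x3B115032, 1), (1018, "Reynolds", "Jane")),
]

# Sorting on the combined value orders rows by hash first, then by uniqueness.
for row_id, data in sorted(rows):
    print(format_row_id(row_id), data)
```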


    Storing Rows (2 of 2)

    Add a row for 'Fred Smith' - (NUPI Duplicate)

    'Smith' -> Hash Algorithm -> 0016 5557 -> Hash Map -> AMP #3

    Row ID                  | Row Data
    Row Hash   Unique ID    | Last_Name  First_Name  Etc.
    0016 5557  0000 0001    | Smith      John
    0016 5557  0000 0002    | Smith      Fred
    1058 9829  0000 0001    | Adams      Sam

    Add a row for 'Dan Jones' - (Hash Synonym)

    'Jones' -> Hash Algorithm -> 0016 5557 -> Hash Map -> AMP #3

    Row ID                  | Row Data
    Row Hash   Unique ID    | Last_Name  First_Name  Etc.
    0016 5557  0000 0001    | Smith      John
    0016 5557  0000 0002    | Smith      Fred
    0016 5557  0000 0003    | Jones      Dan
    1058 9829  0000 0001    | Adams      Sam

    Given the row hash, what other information would be needed to find the 'Dan Jones' row? The 'Fred Smith' row?
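The two inserts on this slide can be replayed with a small model of one AMP. The sketch assumes the uniqueness value is simply the next unused number for that row hash, which is what the slide's tables show; `AmpStore`, `insert` and `find` are hypothetical names, and the row hashes are taken from the slide rather than computed.

```python
import bisect


class AmpStore:
    """Toy model of the rows held by one AMP, kept in Row ID sequence."""

    def __init__(self):
        self.rows = []  # sorted list of ((row_hash, uniqueness), data)

    def insert(self, row_hash, data):
        """Assign the next uniqueness value for this row hash and store the row."""
        used = [rid[1] for rid, _ in self.rows if rid[0] == row_hash]
        row_id = (row_hash, max(used, default=0) + 1)
        bisect.insort(self.rows, (row_id, data))
        return row_id

    def find(self, row_hash, pi_value):
        """Rows with this hash whose PI column (first field) matches pi_value."""
        return [data for (h, _), data in self.rows
                if h == row_hash and data[0] == pi_value]


amp3 = AmpStore()
amp3.insert(0x00165557, ("Smith", "John"))
amp3.insert(0x10589829, ("Adams", "Sam"))
fred_id = amp3.insert(0x00165557, ("Smith", "Fred"))  # NUPI duplicate
dan_id = amp3.insert(0x00165557, ("Jones", "Dan"))    # hash synonym

print(amp3.find(0x00165557, "Jones"))  # the PI value separates hash synonyms
print(amp3.find(0x00165557, "Smith"))  # NUPI duplicates still come back together
```

In this model the PI value is enough to single out 'Dan Jones', while the two 'Smith' rows can only be told apart by their uniqueness value or by another column.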


    Non-Unique Secondary Index (NUSI) Access

    Create NUSI:

    CREATE INDEX (Name) ON Customer;

    Access via NUSI:

    SELECT * FROM Customer
    WHERE Name = 'Adams';

    PE: NUSI Value = 'Adams' -> Hashing Algorithm -> (Table ID, Row Hash, NUSI Value) -> to MPL (Message Passing Layer) -> AMP 1, AMP 2, AMP 3, AMP 4

    Customer Table ID = 100

    Table ID  Row Hash  NUSI Value
    100       567       Adams

    Each AMP holds a NUSI subtable and its part of the base table. Each NUSI subtable row points to base table rows on the same AMP. The four AMPs are shown below, with each subtable paired with the base table rows it points to.

    NUSI Subtable                          Base Table
    RowID   Name    Base RowID             RowID   Cust  Name (NUSI)  Phone (NUPI)
    432, 3  Smith   884, 1                 471, 1  45    Adams        444-6666
    567, 2  Adams   471, 1                 555, 6  98    Brown        333-9999
                    717, 2                 717, 2  72    Adams        666-7777
    852, 1  Brown   555, 6                 884, 1  74    Smith        555-6666

    NUSI Subtable                          Base Table
    RowID   Name    Base RowID             RowID   Cust  Name (NUSI)  Phone (NUPI)
    432, 1  Smith   147, 1                 147, 1  49    Smith        111-6666
    448, 4  Black   822, 1                 147, 2  12    Young        777-4444
    567, 6  Jones   338, 1                 388, 1  27    Jones        222-8888
    770, 1  Young   147, 2                 822, 1  62    Black        444-5555

    NUSI Subtable                          Base Table
    RowID   Name    Base RowID             RowID   Cust  Name (NUSI)  Phone (NUPI)
    432, 8  Smith   640, 1                 107, 1  37    White        555-4444
    448, 1  White   107, 1                 536, 5  84    Rice         666-5555
    567, 3  Adams   638, 1                 638, 1  31    Adams        111-2222
    656, 1  Rice    536, 5                 640, 1  40    Smith        222-3333

    NUSI Subtable                          Base Table
    RowID   Name    Base RowID             RowID   Cust  Name (NUSI)  Phone (NUPI)
    155, 1  Marsh   915, 9                 639, 1  77    Jones        777-6666
    396, 1  Peters  778, 3                 778, 3  95    Peters       555-7777
    432, 5  Smith   778, 7                 778, 7  56    Smith        555-7777
    567, 1  Jones   639, 1                 915, 9  51    Marsh        888-2222
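The access path on this slide can be sketched as an all-AMP operation: every AMP looks in its own NUSI subtable and returns the base rows it finds locally. The sketch below uses two of the AMPs from the slide. The dictionary layout and the names `create_index` and `select_by_nusi` are assumptions for illustration, and subtable rows are keyed by the index value directly rather than by its hash.

```python
# Base table rows per AMP, keyed by Row ID (row_hash, uniqueness): (Cust, Name, Phone)
AMPS = [
    {"base": {(471, 1): (45, "Adams", "444-6666"),
              (555, 6): (98, "Brown", "333-9999"),
              (717, 2): (72, "Adams", "666-7777"),
              (884, 1): (74, "Smith", "555-6666")}},
    {"base": {(107, 1): (37, "White", "555-4444"),
              (536, 5): (84, "Rice", "666-5555"),
              (638, 1): (31, "Adams", "111-2222"),
              (640, 1): (40, "Smith", "222-3333")}},
]


def create_index(amps, column):
    """Build an AMP-local subtable: index value -> Row IDs on that same AMP."""
    for amp in amps:
        subtable = {}
        for row_id, row in sorted(amp["base"].items()):
            subtable.setdefault(row[column], []).append(row_id)
        amp["nusi"] = subtable


def select_by_nusi(amps, value):
    """Ask every AMP; each one resolves its own subtable entry locally."""
    found = []
    for amp in amps:
        for row_id in amp["nusi"].get(value, []):
            found.append(amp["base"][row_id])
    return found


create_index(AMPS, column=1)  # index on Name
for row in select_by_nusi(AMPS, "Adams"):
    print(row)
```

Because a subtable entry never points to another AMP, the lookup needs no row redistribution, but every AMP has to take part in it.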


    After completing this module, you will be able to:

    Explain the concept of FALLBACK tables.

    List the types and levels of locking provided by Teradata.

    Describe the Recovery, Transient and Permanent Journals and their function.

    List the utilities available for archive and recovery.


    Disk Arrays

    [Diagram labels: DAC, DAC, Host Operating System, Utilities, Applications]

    Why Disk Arrays?

    High availability through data mirroring or data parity protection.

    Better I/O performance through implementation of RAID technology at the hardware level.

    Convenience: automatic disk recovery and data reconstruction when mirroring or data parity protection is used.


    RAID 1 Summary

    Characteristics

      data is fully replicated
      striped mirroring is possible with multiple pairs of disks in a drive group
      transparent to operating system

    Advantages

      maximum data availability
      read performance gains
      no performance penalty with write operations
      fast recovery and restoration

    Disadvantages

      50% of disk space is used for mirrored data

    Summary

      RAID 1 provides high data availability and performance, but storage costs are higher.
      Striped Mirroring is NOT necessary with Teradata.


    Teradata RAID 1 and RAID 5

    RAID 1 for Teradata

      Most useful with typical Teradata data warehouses (e.g., Active Data Warehouses).

    RAID 5 for Teradata

      Most useful when creating archival data warehouses that require less expensive storage and where performance is not as important.

    Why?

    RAID 1 provides Superior Performance

      Mirroring provides the best read and write throughput.
      Maximizes the performance capabilities of controllers and disk drives.
      Best performance when a drive has failed.
      Less reconstruction impact when a drive has failed.

    RAID 1 provides Superior Availability

      Less susceptible to a double disk failure in a RAID drive group.
      Faster reconstruction of a failed drive: shorter vulnerability period during reconstruction.


    Teradata Vproc Migration

    Vproc Migration: vprocs in the failed node are started in the remaining nodes within the clique.

    [Diagram: a clique of four SMP nodes (SMP001-4, SMP001-5, SMP002-4, SMP002-5), each running AMPs and each connected to DAC-A and DAC-B. One SMP fails.]


    Fallback Clusters

    A Fallback cluster is a defined set of AMPs across which fallback is implemented.

    All Fallback rows for AMPs in a cluster must reside within the cluster.

    Loss of one AMP in the cluster permits continued table access.

    Loss of two AMPs in the cluster causes the RDBMS to halt.

    Cluster 0        AMP 1       AMP 2       AMP 3       AMP 4
    Primary rows     62  8 27    34 22 50     5 78 19    14  1 38
    Fallback rows     5 34 14    19 38  8    22 62  1    50 27 78

    Cluster 1        AMP 5       AMP 6       AMP 7       AMP 8
    Primary rows     41 66  7    58 93 20    88  2 45    17 37 72
    Fallback rows    93 72 88    45  7 17    37 58 41    20  2 66
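The one-AMP versus two-AMP rule can be checked against the Cluster 0 layout. The sketch below uses the row numbers as read from the diagram and asks whether every row still has a readable copy when some AMPs are down. It models data reachability only; the function names are illustrative, and the real system's decision to halt is not modeled.

```python
from itertools import combinations

# Cluster 0 as read from the slide: AMP -> primary and fallback row numbers
CLUSTER_0 = {
    1: {"primary": [62, 8, 27], "fallback": [5, 34, 14]},
    2: {"primary": [34, 22, 50], "fallback": [19, 38, 8]},
    3: {"primary": [5, 78, 19], "fallback": [22, 62, 1]},
    4: {"primary": [14, 1, 38], "fallback": [50, 27, 78]},
}


def all_rows(cluster):
    """Every row of the table, collected from the primary copies."""
    return {row for amp in cluster.values() for row in amp["primary"]}


def readable_rows(cluster, down_amps):
    """Rows that still have a primary or fallback copy on a running AMP."""
    rows = set()
    for amp_no, copies in cluster.items():
        if amp_no not in down_amps:
            rows.update(copies["primary"])
            rows.update(copies["fallback"])
    return rows


def table_accessible(cluster, down_amps):
    """True when no row has lost both of its copies."""
    return all_rows(cluster) <= readable_rows(cluster, set(down_amps))


print(table_accessible(CLUSTER_0, {1}))     # one AMP down
print(table_accessible(CLUSTER_0, {1, 2}))  # two AMPs down in the same cluster
```

In this layout each AMP spreads its fallback rows over all three other AMPs in the cluster, so any pair of failed AMPs leaves at least one row with no surviving copy.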


    Fallback and RAID 1 Example (cont.)

    Each AMP's Vdisk is stored on a RAID 1 mirrored pair of physical disk drives, so both drives of a pair hold the same primary and fallback rows.

    Vdisk            AMP 1       AMP 2       AMP 3       AMP 4
    Primary rows     62  8 27    34 22 50     5 78 19    14  1 38
    Fallback rows     5 34 14    19 38  8    22 62  1    50 27 78

    Assume two disk drives have failed in the same drive group. Is Fallback needed?


    Recovery Journal for Down AMPs

    Recovery Journal is:

      Automatically activated when an AMP is taken off-line.
      Maintained by other AMPs in the cluster.
      Totally transparent to users of the system.

    While AMP is off-line:

      Journal is active.
      Table updates continue as normal.
      Journal logs Row IDs of changed rows for down-AMP.

    When AMP is back on-line:

      Restores rows on recovered AMP to current status.
      Journal discarded when recovery complete.

    Vdisk            AMP 1       AMP 2       AMP 3       AMP 4
    Primary rows     62  8 27    34 22 50     5 78 19    14  1 38
    Fallback rows     5 34 14    19 38  8    22 62  1    50 27 78

    Recovery Journal entries shown: Row ID for 62, Row ID for 34, Row ID for 14
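The journal's life cycle (activate, log Row IDs, replay, discard) can be sketched with a toy cluster. This is a simplification under stated assumptions: only one AMP can be down at a time, the journal is kept in one place rather than on the other AMPs of the cluster, and row numbers stand in for Row IDs. `Cluster` and its methods are hypothetical names.

```python
class Cluster:
    """Toy fallback cluster with a recovery journal for a single down AMP."""

    def __init__(self, amps):
        self.copies = {amp: {} for amp in amps}  # amp -> {row_id: value}
        self.down = None
        self.journal = set()  # row IDs changed while the AMP was off-line

    def write(self, row_id, value, holders):
        """Update a row on the AMPs holding its primary and fallback copies."""
        for amp in holders:
            if amp == self.down:
                self.journal.add(row_id)  # remember what the down AMP missed
            else:
                self.copies[amp][row_id] = value

    def take_offline(self, amp):
        self.down = amp
        self.journal.clear()  # journal becomes active, starting empty

    def bring_online(self, amp):
        """Bring journaled rows up to date from a surviving copy, then discard."""
        for row_id in self.journal:
            for other, rows in self.copies.items():
                if other != amp and row_id in rows:
                    self.copies[amp][row_id] = rows[row_id]
                    break
        self.journal.clear()
        self.down = None


cluster = Cluster([1, 2, 3, 4])
cluster.write(62, "v1", holders=(1, 3))  # primary on AMP 1, fallback on AMP 3
cluster.take_offline(1)
cluster.write(62, "v2", holders=(1, 3))  # update continues; AMP 1 misses it
print(sorted(cluster.journal))
cluster.bring_online(1)
print(cluster.copies[1][62])
```

Logging only Row IDs keeps the journal small: the current row content is fetched from the surviving copy at recovery time.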


    Archiving and Recovering Data

    ARC

      The Archive/Restore utility (arcmain)
      Runs on IBM, UNIX, and Windows 2000 systems
      Archives and restores data from/to Teradata RDBMS
      Restores or copies data from archive media
      Permits data recovery to a specified checkpoint (using Permanent Journals)
      ARC 7.0 is required to archive/restore with Teradata V2R5

    Open Teradata Backup

      Two choices from different NCR Partners:
        NetVault - from BakBone software
        NetBackup - from VERITAS software (limited support)
      Provides Windows front end for ARC
      Easy creation of scripts for archive/recovery
      Provides job scheduling and tape management functions

    ASF2 no longer supported with Teradata V2R5


    Data Dictionary / Directory

    [Diagram: DBC at the top, with Sys_Calendar, SysAdmin, SystemFE, Crashdumps and SYSDBA beneath it]

    Data Dictionary / Directory Tables

      Object definitions
      System event logs
      System message table
      Journals and Restart control tables
      Accounting information
      Access control tables

    Views of DD/D Tables

      Administrative
      Security
      Supervisory
      End User
      Operational

    Macros

      Add calculation sequence
      Generate utilization reports
      Reset accounting values
      Authorize secured functions


    System Views

    Clarify tables

      Re-title tables and/or columns.
      Reorder and format columns.
      Compute (derive) new column data.

    Simplify operations

      Supply join operation syntax.
      Select and project relevant rows and columns.

    Limit access to data

      Exclude certain rows and/or columns from selection.
      Limit update to selected table rows and/or columns.

    Reduce maintenance

      When you add or drop columns, applications are not affected (unless a view references a dropped column).
      You can drop and recreate tables without affecting access rights granted to views.

    [Diagram: Applications, Utilities and Coordinated Products access the Dictionary TABLEs through System Views]


    ShowTblChecks View

    Provides information about check constraints at the table level and named column constraints.

    DBC.ShowTblChecks columns:

      DatabaseName  TableName  CheckName  TblCheck  CreatorName  CreateTimeStamp

    Example: Display table constraint information.

    SELECT TableName (CHAR(10)), CheckName (CHAR(10)), TblCheck
    FROM DBC.ShowTblChecks
    WHERE DatabaseName = 'TFACT';

    Example Results:

    TableName   CheckName  TblCheck
    DEPARTMENT  Dept_Chk1  CONSTRAINT "Dept_Chk1" CHECK ( "Dept_nu
    EMPLOYEE    Emp_Chk1   CONSTRAINT "Emp_Chk1" CHECK ( "Employe
    JOB         ?          CHECK ( "Job_code" >= 3000 )

    Note: The first two are named constraints and the third is an unnamed constraint. All three of these constraints were created at the table level.


    IndexConstraints View

    Provides information about partitioned primary index constraints.

    This view only displays tables with an index constraint type of "Q".

    Q indicates a table with a PPI.

    DBC.IndexConstraints columns:

      DatabaseName  TableName  IndexName  IndexNumber  ConstraintType  ConstraintText  ConstraintCollation  CollationName  CreatorName  CreateTimeStamp

    Example: List all of the partitioning expression constraints for all tables in the current database.

    SELECT TableName AS "Table Name", ConstraintText AS "Constraint Text"
    FROM DBC.IndexConstraints
    WHERE DatabaseName = DATABASE;

    Example Results:

    Table Name     Constraint Text
    Sales_History  CHECK ((RANGE_N("sales_date" BETWEEN ...
    Store_Sales    CHECK ((store_id ) BETWEEN 1 and 65535)
    Store_Item     CHECK ((((store_id - 1000)* 1000) + (item_id -
    Store_Revenue  CHECK ((CASE_N(total_revenue < 2000, ...

    AllTempTables View

    Provides information about all global temporary tables materialized in the system.

    DBC.AllTempTables[X] columns:

      HostNo  SessionNo  UserName  B_DatabaseName  B_TableName  E_TableID

    Example: Show all temporary tables materialized in the system.

    SELECT HostNo, SessionNo, UserName (CHAR(10)),
           B_DatabaseName AS "DataBase", B_TableName AS "Table Name"
    FROM DBC.AllTempTables;

    Example Results:

    HostNo  SessionNo  UserName  Database  Table Name
    01      20887      TFACT02   PD        GT_DEPTSALARY
    01      20908      TFACT01   PD        GT_DEPTSALARY


    Teradata Administrator Object Options

    Teradata Administrator can also be used to display object details.

    For example, right-click on the object (e.g., Department table) and a menu of options is displayed.

    In this example, the Indexes option was selected.
