Mastering Teradata
Tera-Tom on Teradata
Basics for V13
Understanding is the key!
First Edition
Published by
Coffing Publishing
First Edition May, 2010
Web Page: www.Tera-Tom.com and www.CoffingDW.com
E-Mail address:
Written by W. Coffing
Teradata, NCR, BYNET, V2R3, V2R4, V2R5, and V2R6 are registered trademarks of
NCR Corporation, Dayton, Ohio, U.S.A. IBM and DB2 are registered trademarks of
IBM Corporation. ANSI is a registered trademark of the American National Standards
Institute. In addition to these product names, all brands and product names in this
document are registered names or trademarks of their respective holders.
Coffing Data Warehousing shall have neither liability nor responsibility to any person or
entity with respect to any loss or damages arising from the information contained in this
book or from the use of programs or program segments that are included. The manual is
not a publication of NCR Corporation, nor was it produced in conjunction with NCR
Corporation.
Copyright May 2010 by Coffing Publishing
All rights reserved. No part of this book shall be reproduced, stored in a retrieval system,
or transmitted by any means, electronic, mechanical, photocopying, recording, or
otherwise, without written permission from the publisher. No patent liability is assumed
with respect to the use of information contained herein. Although every precaution has
been taken in the preparation of this book, the publisher and author assume no
responsibility for errors or omissions. Neither is any liability assumed for damages
resulting from the use of information contained herein.
Coffing Publishing
International Standard Book Number: ISBN 0-9704980-8-X
Printed in the United States of America
All terms mentioned in this book that are known to be trademarks or service marks have been
stated. Coffing Publishing cannot attest to the accuracy of this information. Use of a
term in this book should not be regarded as affecting the validity of any trademark or
service mark.
About Coffing Data Warehousing’s CEO Tom Coffing
Tom is President, CEO, and Founder of Coffing Data Warehousing. He is an
internationally known consultant, facilitator, speaker, trainer, and executive coach with
an extensive background in data warehousing. Tom has helped implement data
warehousing in over 40 major data warehouse accounts, spoken in over 20 countries, and
has provided consulting and Teradata training to over 20,000 individuals involved in data
warehousing globally.
Tom has co-authored over 30 books on Teradata and Data Warehousing. To name a few:
Secrets of the Best Data Warehouses in the World
Teradata SQL - Unleash the Power
Tera-Tom on Teradata Basics
Tera-Tom on Teradata E-business
Teradata SQL Quick Reference Guide - Simplicity by Design
Teradata Database Design - Giving Detailed Data Flight
Teradata Users Guide - The Ultimate Companion
Teradata Utilities - Breaking the Barriers
Mr. Coffing has also published over 30 data warehousing articles and has been a
contributing columnist to DM Review on the subject of data warehousing. He wrote a
monthly column for DM Review entitled "Teradata Territory". He is a nationally known
speaker and gives frequent seminars on Data Warehousing. He is also known as "The
Speech Doctor" because of his presentation skills and sales seminars.
Tom Coffing has taken his expert speaking and data warehouse knowledge and
revolutionized the way technical training and consultant services are delivered. He
founded CoffingDW with the same philosophy more than a decade ago. Centered around
20 Teradata Certified Masters, this dynamic and growing company teaches every Teradata
class, provides world-class Teradata consultants, offers a suite of software products to
enhance Teradata data warehouses, and has eight books published on Teradata.
Tom has a bachelor's degree in Speech Communications and over 35 years of business
and technical computer experience. Tom is considered by many to be the best technical
and business speaker in the United States. He has trained and consulted at so many
Teradata sites that students affectionately call him Tera-Tom.
Teradata Certified Master
- Teradata Certified Professional
- Teradata Certified Administrator
- Teradata Certified Developer
- Teradata Certified Designer
- Teradata Certified SQL Specialist
- Teradata Certified Implementation Specialist
Table of Contents
I Copyright OSS 2010
Chapter 1 — The Teradata Architecture ..... 2
The Parsing Engine ..... 4
The AMPs ..... 6
Born to be Parallel ..... 8
The BYNET ..... 10
A Scalable Architecture ..... 12
Logical Modeling – Primary and Foreign Keys ..... 16
Physical Modeling - The Primary Index ..... 18
Two Types of Primary Indexes (UPI or NUPI) ..... 20
Unique Primary Index (UPI) ..... 22
Non-Unique Primary Index (NUPI) ..... 24
Multi-Column Primary Indexes ..... 26
When do you define the Primary Index? ..... 28
Defining a Non-Unique Primary Index (NUPI) ..... 30
Defining a Multi-Column Primary Index ..... 32
How Teradata Distributes and Retrieves Rows ..... 34
Hashing the Primary Index Value ..... 36
The Hash Map ..... 38
An 8-AMP Hash Map Example ..... 40
Laying a Row onto the Proper AMP ..... 42
Retrieving a Row by way of the Primary Index ..... 44
Hashing Non-Unique Primary Indexes (NUPI) ..... 46
Placing Non-Unique Primary Indexes (NUPI) Rows ..... 48
Placing (NUPI) Rows Continued ..... 50
Retrieving (NUPI) Rows ..... 52
Placing Multi-Column Primary Index Rows ..... 54
Retrieving Multi-Column Primary Index Rows ..... 56
Even Distribution with an UPI ..... 58
Uneven Distribution with a NUPI ..... 60
Unacceptable Distribution with a NUPI ..... 62
Review – Parsing Engines Plan with an UPI ..... 64
Review – Parsing Engines Plan with a NUPI ..... 66
Review – Big Trouble – The Full Table Scan ..... 68
Big Trouble – A Picture of a Full Table Scan ..... 70
Test your Teradata Primary Index Knowledge ..... 72
The Row Hash ..... 76
The Uniqueness Value ..... 78
The Row ID ..... 80
Duplicates and the Uniqueness Value ..... 82
AMPs Sort Their Rows by the Row ID ..... 84
Search the Data like a Phone Book ..... 86
Why is my Phone Book 00000's and 111111's? ..... 88
Performing a Binary Search ..... 90
Opening the Phone Book to the Middle ..... 92
I can Name that Tune in 5 Notes ..... 94
A Visual for Data Layout ..... 96
Test Your Teradata Access Query Knowledge ..... 98
UPI Row-ID Test ..... 102
NUPI Row-ID Test ..... 106
Secondary Indexes ..... 112
The Base Table ..... 114
Creating a Unique Secondary Index (USI) ..... 116
The Secondary Index Subtable ..... 118
Inside the Secondary Index Subtable ..... 120
How Teradata builds the Secondary Index Subtable ..... 122
How Teradata builds the Secondary Index Subtable ..... 124
Building the Secondary Index Subtable ..... 128
USI – Always a Two-AMP Operation ..... 130
The Parsing Engines Plan with an USI Query ..... 134
Retrieving Base Rows using the USI ..... 136
Picture that USI in Action ..... 138
USI Summary ..... 140
USI Pictorial using the Hash Maps ..... 142
USI Secondary Index Quiz ..... 144
USI Secondary and Primary Index Quiz Answers ..... 146
A Full Table Scan Example ..... 148
The Base Table ..... 150
Creating a Non-Unique Secondary Index (NUSI) ..... 152
Columns inside a NUSI Secondary Index Subtable ..... 154
NUSI Subtable is AMP-Local ..... 156
A Query using the NUSI Column ..... 158
A Query using the NUSI Column ..... 160
NUSI Recap ..... 162
Secondary Index Summary ..... 164
Test Your Teradata Access Query Knowledge ..... 166
An Incredible Quiz Opportunity ..... 170
An Incredible Quiz Opportunity ..... 173
A Table used for our Partitioning Example ..... 178
Range Queries ..... 180
Why we had to perform a Full Table Scan ..... 182
A Partitioned Table ..... 184
A Partitioned Table ..... 186
One Year of Orders Partitioned ..... 188
Fundamentals of Partitioning ..... 190
Add the Partition to the Row-ID for the Row Key ..... 192
You Partition a Table when you CREATE the Table ..... 194
RANGE_N Partitioning by Week ..... 196
RANGE_N Partitioning Older and Newer Data ..... 198
Case_N Partitioning ..... 200
Multi-Level Partitioning ..... 202
Partitioning Rules ..... 204
See the data ..... 206
Test Your Teradata Access Knowledge ..... 208
The most Powerful USER ..... 214
DBC owns all the Disk Space ..... 216
DBC Example of 1000 GBs ..... 218
DBC will first CREATE a USER or a DATABASE ..... 220
Teradata is Hierarchical ..... 222
Only two Objects can Receive PERM Space ..... 224
Only difference between a User and a Database ..... 226
A Typical approach to Security ..... 228
Example of a DATABASE and USER Interchanged ..... 230
PERM and SPOOL Space ..... 232
Each AMP will have PERM and SPOOL ..... 234
A Query using both PERM and SPOOL Space ..... 236
Spool is Deleted when the Query is Done ..... 238
Getting a better understanding of Spool ..... 240
Answering the MRKT Spool Query Answer ..... 242
Spool is like a Speed Limit ..... 244
All Space is calculated on a Per AMP Basis ..... 246
Examples of Perm and Spool on a Per AMP Basis ..... 248
Quiz on Perm and Spool Space ..... 250
Answers to Quiz on Perm and Spool Space ..... 252
Collecting Statistics ..... 256
Parsing Engine uses Statistics for the Plan ..... 258
Columns and Indexes to Collect Statistics On ..... 260
Syntax to Collect Statistics ..... 262
Recollecting Statistics ..... 264
Random Sample instead of Collected Statistics ..... 266
V12 Statistics Enhancement – Stale Statistics ..... 268
Where Statistics are Stored in DBC ..... 270
A Collect Statistics Example ..... 272
What Statistics are Really Collected ..... 274
Loner Values and High Bias Intervals ..... 276
Teradata Limits ..... 278
Data Protection ..... 282
Transaction Concept ..... 284
Two Modes to Teradata ..... 286
Differences between ANSI and Teradata Mode ..... 288
ANSI Mode Commit ..... 290
Teradata Mode Commit also called BTET ..... 292
Trick to CREATE a Multi-Statement with BTEQ ..... 294
Transient Journal ..... 296
How the Transient Journal Works ..... 298
The Transient Journal after a Commit ..... 300
VProcs ..... 302
Nodes and MPP ..... 304
RAID 1 - Mirroring ..... 306
Cliques ..... 308
VProcs Migrate when a Node Fails ..... 310
Cliques – An 8-Node Example ..... 312
Cliques – An 8-Node Example with Migration ..... 314
Hot Standby Nodes ..... 316
Hot Standby Nodes in Action ..... 318
FALLBACK Protection ..... 320
How Fallback Works ..... 322
Fallback Clusters Exercise ..... 324
Fallback Clusters ..... 326
Fallback Exercises with Clusters ..... 328
Fallback Exercises with Clusters Answer ..... 330
More Fallback Exercises ..... 332
More Fallback Exercises with Answers ..... 334
Fallback – Performance Vs Protection Questions ..... 336
The Six Rules of Fallback ..... 338
Cliques and Clusters ..... 340
Cliques and Clusters Answers ..... 342
Down AMP Recovery Journal (DARJ) ..... 344
Permanent Journal ..... 346
Table create with Fallback and Permanent Journal ..... 348
Permanent Journal Rules ..... 350
Some Permanent Journal Possibilities ..... 352
Creating a Permanent Journal ..... 354
Create Table Examples with Permanent Journals ..... 356
Each Permanent Journal is made up of 3 Areas ..... 358
Permanent Journal Rules ..... 360
The Four Locks of Teradata ..... 366
Teradata has 3 levels of Locking ..... 368
Quiz – Which Level of Locking is Occurring? ..... 370
Quiz Locking Answers ..... 372
The Teradata Lock Manager ..... 374
Locking Modifiers – The Access Lock ..... 376
Locks and their compatibility ..... 378
Moving Through the Locking Queue ..... 380
Quiz – Which Locks Move Up? ..... 382
Answers to Locking Quiz ..... 384
A Single AMP Acts as the Locking Gatekeeper ..... 386
Every AMP performs Locking Gatekeeper Duties ..... 388
Answers to Which AMP is Waiting on Access ..... 390
Explains – The Pseudo Table for Locks ..... 392
The NOWAIT Locking Option ..... 394
Rules of Teradata Locking ..... 396
Explains – Pseudo Tables ..... 400
Explain – Full Table Scan ..... 402
Explain – Primary Index Reads ..... 404
Explain – Secondary Index Read ..... 406
Explain - View DDL of a Partitioned Table ..... 408
Explain – Partition Elimination ..... 410
Explain – Joins with Duplication on all AMPs ..... 412
Explain – Joins with Redistribution ..... 414
Explain – Bit Mapping with multiple NUSIs ..... 416
Fundamentals of Teradata Joins ..... 420
A Join Example ..... 422
Joins and the Primary Index ..... 424
Redistributing Rows in Spool ..... 426
Redistributing Rows of Both Tables ..... 428
Duplicating the Smaller Table ..... 430
Quiz – How Many Rows are in Spool? ..... 432
Quiz Answer – How Many Rows in Spool? ..... 434
How Duplication Appears on Every AMP ..... 436
How Many Rows in Spool with Redistribution? ..... 438
Answer to How Many Rows in Spool ..... 440
An Example of an AMP with Redistribution ..... 442
The System Calendar ..... 446
Columns in the System Calendar Views ..... 448
How to use the System Calendar with Tables ..... 450
Teradata Temporary Tables ..... 454
Derived Tables ..... 456
A Query Pictorial Example with a Derived Table ..... 458
Volatile Tables ..... 460
How to Populate a Volatile Table ..... 462
Global Temporary Tables ..... 464
A Pictorial of a Global Temporary Table ..... 466
What Happens to Global Tables after the Session ..... 468
Global Temporary Tables and Temp Space ..... 470
V13 – No Primary Index Tables ..... 474
NoPI CREATE Statement ..... 476
NoPI Row-ID Increments the Uniqueness Value ..... 478
NoPI Row-Hash Different on each AMP ..... 480
NoPI Options and Facts ..... 482
NoPI Restrictions ..... 484
Write Ahead Logging (WAL) ..... 488
AMPs have FSG Cache for the Memories ..... 490
An Example of an UPDATE Statement ..... 492
AMP Local WALs ..... 494
AMPs UPDATE Rows in FSG Cache ..... 496
Write to WAL then Write Back to Disk ..... 498
The WAL Depot ..... 500
Clearing out the WAL Depot and the WAL Log ..... 502
V13 – Teradata Virtual Storage (TVS) ..... 506
AMPs in the 1980's ..... 508
AMPs in the 1990's
Data Blocks and Cylinders make up a Disk
Cylinders are dedicated to Perm, Spool, etc.
Outside Disk Tracks are much Faster
AMPs assigned Disk Cylinders, not Entire Disks
Hot, Warm, and Cold Data
The old way Teradata had to add Disk Space
Doubling the Disk Capacity
Incremental Disk Growth Is Here
Mixed Disks and Solid State Drives
Solid State Drives are like Giant Flash Drives
Virtual Storage Metrics
The Two Modes of Virtual Storage
What is a Row Hash Lock?
Chapter 6 — Loading the Data
FastLoad
Multiload
TPump
FastExport
Mastering the Teradata Architecture
1 Copyright OSS 2010
Chapter 1 — The Teradata Architecture
“Let me once again explain the rules.
Teradata rules!”
Tera-Tom Coffing
Teradata relies on three architectural components that have set the rules for parallel
processing. They are the Parsing Engine, which is also called the PE or the Optimizer,
the Access Module Processors, which are referred to as the AMPs, and two BYNETs that
carry communication between the PEs and the AMPs.
The PE is the boss and tells the AMPs exactly what to do. The AMPs each have their
own virtual disk, which no other AMP can read, and they merely read and write to their
respective disks.
When a user logs on to Teradata, the logon is accepted or rejected by a Parsing Engine.
That Parsing Engine will take care of the user for the entire session, which really means
until the user logs off.
The Parsing Engine will accept each query from that user and come up with a plan for the
AMPs to satisfy the request. The PE's plan is passed to the AMPs via the BYNET. The
AMPs will retrieve the requested data from their virtual disks and pass it back up the
BYNET to the PE. The PE will then deliver the data to the user.
The Parsing Engine
“Fall seven times, stand up eight.”
--Japanese Proverb
The Parsing Engines are perfectly balanced, with each having the capability to handle up
to 120 sessions at a time. This could be 120 distinct users or a single user utilizing the
power of all 120 sessions for a single application. That is why there are multiple PEs in
every Teradata system. Each PE has total command over every AMP.
Divided they stand (PEs) and United are the AMPs!
Each PE will take a user's SQL and do three things:
1. The PE will check the user's SQL syntax. If there is a syntax error, the user will
receive an error. For example, if the user wanted to use the keyword SELECT and instead
wrote SLLLECCCT, the PE would reject the SQL, but be kind enough to send the user a
message to help them correct the error. That's because the PEs are stand-up guys!
2. If the SQL passes the syntax check, the PE will check the user's ACCESS RIGHTS to
ensure the user has permission to access the data in that table. If not, the user
receives the message ACCESS Denied!
3. If the user passes the security check, the Parsing Engine will come up with a PLAN
to satisfy the user's request. The fastest plan is a Single-AMP retrieve. The second fastest
plan is a Two-AMP retrieve. The next fastest plan is all AMPs reading only a portion of
the table, and the slowest plan is the full table scan, where each AMP reads every row it
holds for the table.
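The three checks above can be sketched as a small Python function. This is only a toy model and not how the Parsing Engine is actually implemented: the `KEYWORDS` set, the `ACCESS_RIGHTS` set, the user names, and the `parse_request` function are hypothetical stand-ins invented for illustration. Only the order of the checks and the ranking of the four plan types come from the text.

```python
from enum import IntEnum


class Plan(IntEnum):
    """Plan types from the text, ordered fastest (1) to slowest (4)."""
    SINGLE_AMP = 1
    TWO_AMP = 2
    ALL_AMP_PARTIAL = 3
    FULL_TABLE_SCAN = 4


# Hypothetical stand-ins for a real SQL parser and a real access-rights dictionary.
KEYWORDS = {"SELECT", "INSERT", "UPDATE", "DELETE"}
ACCESS_RIGHTS = {("tom", "Employee_Table")}


def parse_request(user, sql, table, candidate_plans):
    """Run the three PE checks in order and stop at the first failure."""
    words = sql.split()
    # Check 1: syntax (toy version: the first word must be a known keyword).
    if not words or words[0].upper() not in KEYWORDS:
        return "Syntax error"
    # Check 2: access rights on the table.
    if (user, table) not in ACCESS_RIGHTS:
        return "Access denied"
    # Check 3: choose the fastest plan that is possible for this request.
    return min(candidate_plans)
```

For example, `parse_request("tom", "SLLLECCCT * FROM Employee_Table", "Employee_Table", [Plan.FULL_TABLE_SCAN])` stops at the syntax check, while a valid request from an authorized user returns the lowest-numbered plan offered.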
The AMPs
“Not all who wander are lost.”
– J. R. R. Tolkien
The AMPs are never lost because the PE always tells them what to do. One PE to rule
them all? No! Each PE rules them all because the rows of every table are spread across
all the AMPs. The AMPs organize every table in separate blocks just like you might
organize your clothes in separate dresser drawers. Organizing their tables and the rows
they contain is an obsession with the AMPs. They make organization a hobbit!
The PE passes the PLAN to the AMPs over the BYNET. The AMPs then retrieve the
rows they own from their disks and pass them back to the PE over the BYNET.
When a table is first created, each AMP creates a table header on its disk. Even though
the table is empty, the AMPs at least know the table name, the columns in the table, and
any indexes on the table.
When the table is loaded each AMP receives rows for that table that they and only they
own. They carefully place the rows inside data blocks where they can easily be retrieved.
Now each AMP will own their own Table Header for the table and they will also own
data blocks where they place the rows for that table. Now the AMP is truly Lord of the
Disks!
Born to be Parallel
“Only he who attempts the ridiculous may
achieve the impossible.”
– Don Quixote
When Teradata was born in the late 1970's it was born to be parallel. That means that
multiple processors (AMPs) would split up the work and do it in parallel. At the time this
was considered impossible. To make it happen Teradata did something considered
ridiculous at the time. They would spread the data across all processors and let each
processor be responsible for only the data on its disk.
You will never see a Teradata table that sits on only one AMP, because the parallel
processing aspect would then be lost. Instead, every Teradata table spreads its rows
across all AMPs. Teradata was born to be parallel and the impossible was born.
The first picture on the opposite page never happens. The second picture below that is
exactly the design behind Teradata.
Teradata never lays out data like this!
Teradata lays out data like this!
The BYNET
“A Journey of a thousand miles begins with a
single step.”
-Lao Tzu
The BYNET is the communication network between AMPs and PEs. The PE comes up
with a PLAN and passes the plan to the AMPs in steps over the BYNET. A journey to a
Thousand AMPs begins with a single step. This step and all the steps of the plan travel
down the BYNET highway which guarantees delivery to each AMP.
The AMPs then retrieve the data requested by the PE and they deliver their portion of the
answer set to the PE over the BYNET.
The BYNET provides the communications between AMPs and PEs – so no matter how
large the data warehouse physically gets, the BYNET makes each AMP and PE think that
they are right next to one another. The BYNET gets its name from the Banyan tree.
The Banyan tree has the ability to continually plant new roots to grow forever.
Likewise, the BYNET scales as the Teradata system grows in size. The BYNET is
scalable.
There are always two BYNETs for redundancy and extra bandwidth. AMPs and PEs can
use both BYNETs to send and receive data simultaneously. What a network! It is like
having two phone lines to talk on. Each AMP or PE can send a message on one BYNET
and simultaneously receive a message on the other BYNET. Both BYNETs can be used
to send a message or to receive a message!
Below are the steps to completely satisfy a query:
1. The PE checks the user's SQL syntax.
2. The PE checks the user's security rights.
3. The PE comes up with a plan for the AMPs to follow.
4. The PE passes the plan along to the AMPs over the BYNET.
5. The AMPs follow the plan and retrieve the data requested.
6. The AMPs pass the data to the PE over the BYNET.
7. The PE then passes the final data to the user.
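The later steps in this list, where the plan goes out to the AMPs and the partial answers come back, can be imitated with a thread pool in Python. This is a rough sketch under stated assumptions: the `AMP_DISKS` data, the salary filter, and the function names are all hypothetical, and the thread pool merely plays the part of the BYNET. It is not a description of the real message protocol.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each AMP owns its own rows, which no other AMP reads.
AMP_DISKS = {
    1: [{"Emp_No": 1, "Salary": 50000}, {"Emp_No": 5, "Salary": 72000}],
    2: [{"Emp_No": 2, "Salary": 61000}],
    3: [{"Emp_No": 3, "Salary": 48000}, {"Emp_No": 7, "Salary": 90000}],
    4: [{"Emp_No": 4, "Salary": 55000}],
}


def amp_follow_plan(amp_no, min_salary):
    """One AMP reads only its own disk and returns its matching rows."""
    return [row for row in AMP_DISKS[amp_no] if row["Salary"] >= min_salary]


def pe_run_query(min_salary):
    """Send the same plan to every AMP in parallel, then merge the partial answers."""
    amp_numbers = sorted(AMP_DISKS)
    with ThreadPoolExecutor(max_workers=len(amp_numbers)) as bynet:
        partial_answers = bynet.map(
            lambda amp_no: amp_follow_plan(amp_no, min_salary), amp_numbers
        )
        # Combine each AMP's portion into the final answer set for the user.
        return [row for partial in partial_answers for row in partial]


result = pe_run_query(60000)
```

Each AMP does only a fraction of the work, and the PE never touches a disk itself; it only merges what the AMPs hand back.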
A Scalable Architecture
“No wonder nobody comes here – It's too
crowded”
Yogi Berra
When it comes to scalability Teradata has put together a team of PEs and AMPs that are
guaranteed to hit a home run, while their competition continues to strike out when trying
to catch them in terms of scalability.
In Teradata land it never gets too crowded because Teradata can easily scale by adding
additional AMPs and PEs. This is called Linear Scalability. It means that if you double
your AMPs you will double your speed. A 4-AMP system can double its speed by adding
4 more AMPs to become an 8-AMP system. This can theoretically go on forever.
Other vendor systems can double their size and double their speed for a while, but
eventually they max out. Teradata has many customers who start with a small system
configuration and grow each year. Some of the largest data warehouses in the world are
Teradata systems that have proven their value each year and continue to grow.
In the picture on the following page you can see we have a 4,000 AMP system. This
system is literally 1,000 times more powerful than a 4-AMP system.
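The arithmetic behind that claim is simple enough to write down. The sketch below assumes ideal linear scalability exactly as the text describes it; the function names and the 60-second example are illustrative, and real systems will not scale this perfectly.

```python
def relative_power(amps, baseline_amps):
    """Under ideal linear scalability, power grows in proportion to the AMP count."""
    return amps / baseline_amps


def elapsed_seconds(baseline_seconds, baseline_amps, amps):
    """Ideal elapsed time for the same work after growing from baseline_amps to amps."""
    return baseline_seconds * baseline_amps / amps
```

Under this model `relative_power(4000, 4)` is 1000.0, matching the 4,000-AMP example, and a query that takes 60 seconds on 4 AMPs would ideally take `elapsed_seconds(60, 4, 8)`, or 30.0 seconds, on 8 AMPs.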
Logical Modeling – Primary and Foreign Keys
“I have found the best way to give advice to
your children is to find out what they want
and then advise them to do it.”
--Harry S. Truman
Harry Truman was an American president with great logical skills. Tables are logically
created for all database systems. This is called Logical Modeling. A table that is
modeled or normalized will always have a Primary Key. The Primary Key is usually the
first column in a table, and the Primary Key column(s) will always have three
characteristics:
1. Never be Null
2. Never change
3. Never have duplicate values
If a table with a Primary Key has a relationship with another table, the two tables are
joined through a Primary Key/Foreign Key relationship. The join takes the Primary Key
of one table and matches it with a column in the other table that holds the same values.
That matching column is called a Foreign Key.
Teradata doesn't care about Primary Keys and Foreign Keys when it lays out the data.
It only cares about what is called the Primary Index. We will learn about the Primary
Index shortly.
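To make the Primary Key rules and the Primary Key/Foreign Key match concrete, here is a minimal Python sketch. The function names and the sample rows used with them are hypothetical. Note that only two of the three rules (never null, never duplicated) can be checked on a snapshot of the data; "never change" is a rule about behavior over time.

```python
def check_primary_key(rows, key_columns):
    """Return a list of problems found; an empty list means the key looks valid."""
    problems = []
    seen = set()
    for position, row in enumerate(rows):
        key = tuple(row.get(column) for column in key_columns)
        if any(value is None for value in key):
            problems.append(f"row {position}: null in primary key")
        elif key in seen:
            problems.append(f"row {position}: duplicate primary key {key}")
        seen.add(key)
    return problems


def join_on_keys(parent_rows, pk_column, child_rows, fk_column):
    """Match each child row's foreign key to the parent row with that primary key."""
    parents = {row[pk_column]: row for row in parent_rows}
    return [
        (parents[child[fk_column]], child)
        for child in child_rows
        if child[fk_column] in parents
    ]
```

A department table keyed on a department number, joined to employee rows that carry the same department number as a foreign key, is the kind of pair this models.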
Physical Modeling - The Primary Index
“Speak in a moment of anger and you'll
deliver the greatest speech you'll ever
regret.”
– Anonymous
Every table in Teradata has one and only one Primary Index.
Teradata uses the Primary Index of each table to send each row to its proper AMP. This
is why each table in Teradata is required to have a Primary Index.
The biggest key to a great Teradata Database Design begins with choosing the correct
column to be the Primary Index. The Primary Index column's value is the only thing that
will determine on which AMP a row will reside. Because this concept is extremely
important, let me state again that the Primary Index value for a row is the only thing that
will determine on which AMP a row will reside.
Many people new to Teradata assume that the most important concept concerning the
Primary Index is data distribution. INCORRECT! The Primary Index does determine
data distribution, but even more importantly, the Primary Index provides the fastest
physical path to retrieving data. The Primary Index also plays an incredibly important
role in how joins are performed. Remember these three important concepts of the
Primary Index and you are well on your way to a great Physical Database Design.
Two Types of Primary Indexes (UPI or NUPI)
“A man who chases two rabbits
Catches none.”
Roman Proverb
Every table must have at least one column as the Primary Index. The Primary Index is
defined when the table is created. There are only two types of Primary Indexes, which
are a Unique Primary Index (UPI) or a Non-Unique Primary Index (NUPI).
“A man who chases two rabbits misses both
by a HARE! A person who chases two
Primary Indexes misses both by an ERR!”
Tera-Tom Proverb
Every table must have one and only one Primary Index. Because Teradata distributes the
data based on the Primary Index column's value, it is quite obvious that you must have a
Primary Index and that there can be only one Primary Index per table.
The Primary Index is the physical mechanism used to retrieve and distribute data. The
Primary Index consists of exactly the columns named in its definition, no more and no
fewer. You can have as many as 64 columns in a multi-column Primary Index or as few
as one column as your Primary Index.
Most databases use the Primary Key as the physical mechanism. Teradata uses the
Primary Index and NOT the Primary Key. There are two reasons you might pick a
different Primary Index than your Primary Key. They are (1) performance and
(2) known access paths.
Unique Primary Index (UPI)
“Always remember that you are unique just
like everyone else.”
– Anonymous
A Unique Primary Index (UPI) is unique and can’t have any duplicates. It is as unique
as you are. Nobody is like you and you are extremely beautiful and amazing. Not one
other person in the history of mankind has ever been exactly like you. You are the
creation of your beautiful parents and must realize how important you are to the world.
A Unique Primary Index is not as amazing as you are, but it is also very special.
A Unique Primary Index means that the values for the selected column must be unique.
If you try to insert a row with a Primary Index value that is already in the table, the row
will be rejected. A UPI enforces UNIQUENESS for a column.
A Unique Primary Index will always spread the table rows evenly amongst the AMPs.
Please don't assume this is always the best thing to do. The diagram on the next page
shows a table that has a Unique Primary Index. We have selected EMP_NO to be our
Primary Index. Because we have designated EMP_NO to be a Unique Primary Index,
there can be no duplicate employee numbers in the table.
UPI access is always a one-AMP operation. You will better understand what a one-AMP
operation is by the end of this chapter.
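The uniqueness rule can be modeled in a few lines of Python. The `UpiTable` class below is a hypothetical illustration, not Teradata code: it simply refuses any row whose Primary Index value is already present, which is the behavior the text describes for a UPI.

```python
class UpiTable:
    """Toy table that enforces a Unique Primary Index on one column."""

    def __init__(self, pi_column):
        self.pi_column = pi_column
        self.rows = {}  # maps each Primary Index value to its single row

    def insert(self, row):
        """Insert the row, or reject it if its Primary Index value already exists."""
        pi_value = row[self.pi_column]
        if pi_value in self.rows:
            return False  # duplicate Primary Index value: row rejected
        self.rows[pi_value] = row
        return True


employee_table = UpiTable("EMP_NO")
first = employee_table.insert({"EMP_NO": 1, "LAST_NAME": "Smith"})
second = employee_table.insert({"EMP_NO": 1, "LAST_NAME": "Jones"})
```

Here the first insert succeeds and the second is rejected, because EMP_NO 1 is already in the table.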
Non-Unique Primary Index (NUPI)
“You miss 100 percent of the shots you
never take.”
– Wayne Gretzky
Take a shot at using a Non-Unique Primary Index in your Teradata tables. A Non-Unique
Primary Index (NUPI) means that the values for the selected column can be non-unique.
You can have many rows with the same value in the Primary Index, so don't expect any
value such as “Smith” to be a one-timer. Duplicate values can exist.
A Non-Unique Primary Index will almost never spread the table rows evenly. Please
don't assume this is always a bad thing. On the following page is a table that has a
Non-Unique Primary Index. We have selected LAST_NAME to be our Primary Index.
Because we have designated LAST_NAME to be a Non-Unique Primary Index, we are
anticipating that there will be individuals in the table with the same last name.
An All-AMP operation will take longer if the data is unevenly distributed. You might
pick a NUPI over a UPI because the NUPI column may be more effective for query
access and joins.
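A short Python sketch shows why a NUPI tends to distribute unevenly: every row with the same Primary Index value hashes to the same place. The CRC32 function here is only a stand-in for Teradata's real hash formula, and the 4-AMP system and the list of last names are made up for the example.

```python
import zlib
from collections import Counter

NUM_AMPS = 4  # hypothetical small system


def amp_for(pi_value):
    """Stand-in hash (CRC32, not Teradata's formula) mapped to an AMP number 1..NUM_AMPS."""
    return zlib.crc32(str(pi_value).encode()) % NUM_AMPS + 1


# Hypothetical LAST_NAME values; "Smith" appears four times.
last_names = ["Smith", "Smith", "Smith", "Jones", "Coffing", "Smith", "Jones"]

# Count how many rows land on each AMP.
rows_per_amp = Counter(amp_for(name) for name in last_names)

# Every "Smith" row lands on one and the same AMP.
smith_amps = {amp_for(name) for name in last_names if name == "Smith"}
```

Whichever AMP receives “Smith” holds at least four of the seven rows, so the table is skewed, yet a query on LAST_NAME = 'Smith' still needs only that one AMP.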
Multi-Column Primary Indexes
“Every sunrise is a second chance.”
– Unknown
A Primary Index can have multiple columns.
Teradata allows more than one column to be designated as the Primary Index. It is still
only one Primary Index; it is simply made up by combining multiple columns. Teradata
allows up to 64 combined columns to make up the one Primary Index required for a table.
On the following page you can see we have designated First_Name and Last_Name
combined to make up the Primary Index.
This is often done for two reasons:
(1) To get better data distribution among the AMPs
(2) Users consistently query on that same combination of columns
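One way to picture a multi-column Primary Index is that the column values are combined first and the combination is hashed as a single value. The Python sketch below assumes exactly that; the separator character, the CRC32 stand-in hash, and the 4-AMP system are illustrative choices, not Teradata internals.

```python
import zlib

NUM_AMPS = 4  # hypothetical small system


def combined_pi_value(*pi_values):
    """Join the column values with a separator so ("ab", "c") differs from ("a", "bc")."""
    return "\x1f".join(str(value) for value in pi_values)


def amp_for(*pi_values):
    """Stand-in hash: CRC32 of the combined value picks an AMP number 1..NUM_AMPS."""
    return zlib.crc32(combined_pi_value(*pi_values).encode()) % NUM_AMPS + 1
```

Because the hash is taken over the whole combination, a query has to supply every Primary Index column, for example both First_Name and Last_Name, before the owning AMP can be computed.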
When do you define the Primary Index?
“When you go into court you are putting
your fate into the hands of twelve people
who weren't smart enough to get out of jury
duty.”
- Norm Crosby
When you go to query Teradata you are putting your fate in the hands of the DBA who
created the table's Primary Index. When the table is created it is given a table name, the
columns and their data types are defined, and the Primary Index is specified.
As you can see on the following page we have created the table called Employee_Table.
It contains five columns, which are Emp_No, Dept_No, First_Name, Last_Name and
Salary. The Primary Index is Unique and on the column Emp_No. This really means
that Emp_No is the most important column in this table. If users query the table and put
Emp_No in the WHERE Clause it will always be a 1-AMP query. It is as fast as
lightning.
If no Primary Index is defined the system will define one for you. It will most likely pick
the first column and make it a Non-Unique Primary Index (NUPI). It will, however, first
check to see if you have a Primary Key defined for referential integrity purposes. If you
do, it will choose that column(s) and make it a Unique Primary Index (UPI). If you didn't
define a Primary Index or Primary Key, then the system will check to see if you defined a
Unique Secondary Index (USI) on any column, and if you have, it will make that column a
Unique Primary Index (UPI). This is no way to build a system. The Primary Index
should always be explicitly defined when the table is first created.
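The fallback order described above can be written as a small decision function. This Python sketch only restates the rule as the text gives it; the function name and argument names are hypothetical, and picking the first Unique Secondary Index when several exist is an assumption the text does not settle.

```python
def default_primary_index(columns, primary_key=None, unique_secondary_indexes=()):
    """Return (index_type, index_columns) chosen when no Primary Index is specified."""
    # A declared Primary Key becomes a Unique Primary Index.
    if primary_key:
        return ("UPI", tuple(primary_key))
    # Otherwise a Unique Secondary Index becomes a Unique Primary Index.
    # Assumption: the first USI listed is used if there is more than one.
    if unique_secondary_indexes:
        return ("UPI", tuple(unique_secondary_indexes[0]))
    # Otherwise the first column becomes a Non-Unique Primary Index.
    return ("NUPI", (columns[0],))
```

With the five Employee_Table columns and nothing else declared, the function returns `("NUPI", ("Emp_No",))`, which may or may not be the index you actually wanted.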
Defining a Non-Unique Primary Index (NUPI)
“I know that you believe that you
understand what you think I said, but I am
not sure you realize that what you heard is
not what I meant.”
-Sign on Pentagon office wall
When the table is created it is given a table name, the columns and their data types are
defined and the Primary Index is specified.
As you can see on the following page we have created the table called Employee_Table.
It contains five columns which are Emp_No, Dept_No, First_Name, Last_Name and
Salary. The Primary Index is Non-Unique and on the column Last_Name. This really
means that Last_Name is the most important column in this table. If users query the table
and put Last_Name in the WHERE Clause it will always be a 1-AMP query. There could
be duplicates, but duplicate values will be on the same AMP. I will explain this further.
Remember, if no Primary Index is defined the system will define one for you. It will
most likely pick the first column and make it a Non-Unique Primary Index (NUPI). It
will, however, first check to see if you have a Primary Key defined for referential
integrity purposes. If you do, it will choose that column(s) and make it a Unique Primary
Index (UPI). If you didn't define a Primary Index or Primary Key, then the system will
check to see if you defined a Unique Secondary Index (USI) on any column, and if you
have, it will make that column a Unique Primary Index (UPI). This is no way to build a
system. The Primary Index should always be explicitly defined when the table is first
created.
Defining a Multi-Column Primary Index
“Some birds aren't meant to be caged, their
feathers are just too bright. And when they
fly away, the part of you that knows it was a
sin to lock them up, does rejoice.”
– Shawshank Redemption
When the table is created it is given a table name, the columns and their data types are
defined and the Primary Index is specified. The example on the following page shows a
multi-column Primary Index on First_Name and Last_Name combined.
As you can also see on the following page we have created the table called
Employee_Table. It contains five columns which are Emp_No, Dept_No, First_Name,
Last_Name and Salary. The Primary Index is Non-Unique and on the columns
First_Name and Last_Name. This really means that both First_Name and Last_Name are
the most important columns in this table. If users query the table and put both the
First_Name and the Last_Name in the WHERE Clause it will always be a 1-AMP query.
There could be duplicates, but duplicate values will be on the same AMP. I will explain
this further.
Remember, if no Primary Index is defined the system will define one for you. It will
most likely pick the first column and make it a Non-Unique Primary Index (NUPI). It
will, however, first check to see if you have a Primary Key defined for referential
integrity purposes. If you do, it will choose that column(s) and make it a Unique Primary
Index (UPI). If you didn't define a Primary Index or Primary Key, then the system will
check to see if you defined a Unique Secondary Index (USI) on any column, and if you
have, it will make that column a Unique Primary Index (UPI). This is no way to build a
system. The Primary Index should always be explicitly defined when the table is first
created.
How Teradata Distributes and Retrieves Rows
“I don't know who my grandfather was. I
am more interested in who his grandson will
become.”
– Abraham Lincoln, 16th
president of the United States
Teradata freed the AMPs from doing everything together by giving each table a Primary
Index. The Primary Index column(s) determine which AMP each data row is sent to, and
the Primary Index is also the fastest way to retrieve a row from that same AMP. Follow
this part closely because this is fundamentally the most important subject you will learn
about Teradata.
Teradata takes a table and spreads the rows across the AMPs one row at a time. A
Unique Primary Index on the table will spread the data rows evenly across the AMPs.
This is pretty amazing in itself, but the more amazing part is that Teradata knows
exactly which rows went to which AMPs, so retrieval is always a 1-AMP operation when
users use the Primary Index in the WHERE Clause of their SQL. Here is how that works.
The Teradata Parsing Engine will take the Primary Index value of a row and run a math
calculation called the Hash Formula on that value. The Hash Formula doesn't change and
can be calculated on any value or data type. The result of the Hash Formula calculation
on the Primary Index value is a number ranging from one to one million.
Teradata then has a Hash Map with one million buckets. Inside the buckets are AMP
numbers. So, when the Hash Formula is calculated on the value of the column
designated as the Primary Index, and the result is for example 20, Teradata will go to
bucket 20 of the Hash Map, look inside bucket 20 and see which AMP it says should get
the row.
I will give you some visual examples in the next couple of pages to show you exactly
what I am talking about.
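Until the pictures arrive, the whole idea fits in a short Python sketch. Everything here is a simplified model built on assumptions: CRC32 stands in for Teradata's own Hash Formula, the Hash Map is assumed to repeat the AMP numbers 1 to 4 across the buckets, and the function names and the 100-row employee list are invented for the example.

```python
import zlib

NUM_BUCKETS = 1_000_000  # "one million buckets", as in the text
NUM_AMPS = 4             # hypothetical small system


def hash_formula(pi_value):
    """Stand-in hash (CRC32, not Teradata's formula): returns a bucket 1..NUM_BUCKETS."""
    return zlib.crc32(str(pi_value).encode()) % NUM_BUCKETS + 1


def hash_map(bucket):
    """Assumed layout: the AMP numbers 1, 2, 3, 4 repeat across the buckets."""
    return (bucket - 1) % NUM_AMPS + 1


def load(rows, pi_column):
    """Place each row on the AMP named by its bucket in the Hash Map."""
    amps = {amp: [] for amp in range(1, NUM_AMPS + 1)}
    for row in rows:
        amps[hash_map(hash_formula(row[pi_column]))].append(row)
    return amps


def retrieve(amps, pi_column, pi_value):
    """Re-hash the value and search only the one AMP that must hold the row."""
    target_amp = hash_map(hash_formula(pi_value))
    matches = [row for row in amps[target_amp] if row[pi_column] == pi_value]
    return target_amp, matches


employees = [{"Emp_No": number} for number in range(1, 101)]
amps = load(employees, "Emp_No")
target_amp, found = retrieve(amps, "Emp_No", 42)
```

The same calculation is used to store a row and to find it again, so `retrieve` never has to look at more than one AMP's rows.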
Hashing the Primary Index Value
“Measure a thousand times and cut once.”
-Turkish Proverb
Teradata doesn't measure a thousand times and cut once, as the Turkish proverb for rug
makers states. Teradata measures one time and then places the row on the proper AMP.
There is only one Hash Formula, and its result is always a 32-bit Row Hash that maps to
a number from one to one million. On the following page you will see that Teradata
hashed the Primary Index value of 2 and the result was 000000000101, which equates to
a 5.
What you need to understand right now is that if Teradata hashed this value again it
would absolutely come up with the same Row Hash of 000000000101, and it would
absolutely equate to a 5.
If the Hash Formula was run one thousand times against the value 2, it would return the
same 000000000101 Row Hash, and this would equate to a 5 every time.
That is how Teradata lays the data out, and it is also how it retrieves the row. When a
user writes SQL using the Primary Index in the WHERE clause, Teradata knows exactly
where to get the row.
For Example:
SELECT *
FROM Employee_Table
WHERE Emp_No = 2 ;
The Parsing Engine knows Emp_No is the Primary Index, so it hashes the value of 2 with
the Hash Formula, comes up with a 000000000101 Row Hash, which equates to a 5, and
then goes to bucket 5 in the Hash Map, which then tells the PE which AMP holds that
row. Ingenious! Stay tuned because this is about to become even clearer.
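Two small facts from this page can be checked directly in Python: a hash function returns the same result every time for the same input, and the binary string 000000000101 is the number 5. CRC32 is used below purely as a stand-in for Teradata's Hash Formula, so its output for the value 2 will not be the Row Hash shown in the book.

```python
import zlib


def row_hash(pi_value):
    """Stand-in for the Hash Formula: CRC32 always yields a 32-bit unsigned integer."""
    return zlib.crc32(str(pi_value).encode())


# Hashing the same value a thousand times gives exactly one distinct result.
distinct_results = {row_hash(2) for _ in range(1000)}

# The binary Row Hash from the text, read as an ordinary number.
bucket = int("000000000101", 2)
```

Here `distinct_results` holds a single value and `bucket` is 5, which is why the PE can recompute a row's location at query time instead of having to remember it.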
![Page 46: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/46.jpg)
![Page 47: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/47.jpg)
The Hash Map
“We're going to have the best-educated
American people in the world.”
– Dan Quayle
Every Teradata System has one Hash Map with a million buckets. Inside the buckets are
AMP numbers. The AMP numbers don’t change inside the Hash Map. They are static.
If you have a 4-AMP system then the numbers 1, 2, 3, 4 are repeated until all one million
buckets contain a 1, 2, 3 or 4. The following page shows an excellent example of a
4-AMP system. Notice how 1, 2, 3, 4 keep repeating throughout the entire Hash Map.
That is because we have a small 4-AMP system.
Do you remember on the previous page when we ran the Hash Formula on the Primary
Index value of 2? The result was a Row Hash of 000000101, which equates to a 5.
Teradata would count over 5 buckets and look inside bucket number 5. That would tell
the PE to place the row on AMP 1. I have circled bucket 5 in the Hash Map on the
following page so you can see exactly how this works.
You are soon to be one of the best-educated people in the Teradata world.
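The 4-AMP Hash Map described above can be sketched in Python as follows. This assumes the simple repeating 1, 2, 3, 4 layout and the round figure of one million buckets used in this chapter; the names `NUM_BUCKETS` and `build_hash_map` are hypothetical.

```python
NUM_BUCKETS = 1_000_000  # the chapter's round figure for the Hash Map size

def build_hash_map(num_amps: int) -> list:
    """Fill buckets 1..NUM_BUCKETS with AMP numbers 1..num_amps, repeating."""
    # Index 0 is a placeholder so that bucket numbers are 1-based, as in the text.
    return [0] + [(bucket - 1) % num_amps + 1 for bucket in range(1, NUM_BUCKETS + 1)]

hash_map = build_hash_map(4)

print(hash_map[1:9])  # [1, 2, 3, 4, 1, 2, 3, 4]
print(hash_map[5])    # 1 -> bucket 5 says the row belongs on AMP 1
```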
![Page 48: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/48.jpg)
![Page 49: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/49.jpg)
An 8-AMP Hash Map Example
“A true friend is one who walks in when the
rest of the world walks out.”
– Anonymous
The example on the following page is another Hash Map, but this one is for an 8-AMP
system. Notice how the numbers 1, 2, 3, 4, 5, 6, 7, 8 keep repeating inside all one million
buckets.
Every Teradata System has one Hash Map with a million buckets. Inside the buckets are
AMP numbers. The AMP numbers don’t change inside the Hash Map. They are static.
Do you remember a couple of pages ago when we ran the Hash Formula on the Primary
Index value of 2? The result was a Row Hash of 000000101, which equates to a 5.
Teradata would count over 5 buckets and look inside bucket number 5. That would tell
the PE to place the row on AMP 5 in this system. I have circled bucket 5 in the Hash
Map on the following page so you can see exactly how this works.
They say a dog is man’s best friend, but the best friend to Teradata is the Hash Map. It is
its guide dog.
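To see why the same bucket can name a different AMP on a bigger system, here is a small Python sketch. It assumes the repeating 1-to-N layout described in this chapter; `amp_for_bucket` is a hypothetical helper, not a Teradata function.

```python
def amp_for_bucket(bucket: int, num_amps: int) -> int:
    """AMP number stored in a 1-based bucket when AMP numbers simply repeat."""
    return (bucket - 1) % num_amps + 1

# Bucket 5 on a 4-AMP system versus an 8-AMP system.
for num_amps in (4, 8):
    print(num_amps, amp_for_bucket(5, num_amps))
# 4 1
# 8 5
```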
![Page 50: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/50.jpg)
![Page 51: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/51.jpg)
Laying a Row onto the Proper AMP
“To have everything is to possess nothing.”
--Buddha
The brilliance of how Teradata lays a data row onto the proper AMP is that by
possessing nothing it has everything. Let me explain.
On the following page you see that Teradata Hashes the Primary Index Value of 2 and
receives a result called the Row Hash, which is calculated to be 000000101, which
equates to a 5. Teradata counts over 5 buckets in the Hash Map, and bucket 5 tells
Teradata to place this row on AMP 1.
Nothing changes in the Hash Map and no record of what just happened exists, but when
Teradata needs to find this row it will run through the same Hashing Process, then look at
the Hash Map and know this row can be found on AMP 1.
The same Hash Formula always produces the same answer on a specific value, so this
consistency allows Teradata to have everything without writing anything down, keeping
pointers, or possessing anything. It just does the math again each time.
If 1,000 users all ran a query to find Employee Number 2 the math would be run 1,000
times, with each time the system telling Teradata to look only on AMP 1.
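The “possess nothing” idea can be sketched in Python: the row’s location is never recorded, only recomputed. This is a toy model under stated assumptions: `zlib.crc32` stands in for the Hash Formula, the bucket is taken as a simple remainder of the hash, and the name `amp_for` is hypothetical.

```python
import zlib

NUM_BUCKETS = 1_000_000
NUM_AMPS = 4

def amp_for(primary_index_value) -> int:
    """Hash -> bucket -> AMP, computed from scratch on every call."""
    row_hash = zlib.crc32(str(primary_index_value).encode("utf-8"))
    bucket = row_hash % NUM_BUCKETS + 1      # assumed hash-to-bucket mapping
    return (bucket - 1) % NUM_AMPS + 1       # repeating 1, 2, 3, 4 pattern

# Placing the row: the AMP number is computed, not written down anywhere.
amps = {amp: [] for amp in range(1, NUM_AMPS + 1)}
home = amp_for(2)
amps[home].append({"Emp_No": 2})

# 1,000 lookups: the same math points at the same AMP every time.
lookups = {amp_for(2) for _ in range(1000)}
print(lookups == {home})   # True
```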
![Page 52: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/52.jpg)
![Page 53: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/53.jpg)
Retrieving a Row by way of the Primary Index
“The best way to predict the future is to
create it.”
- Sophia Bedford-Pierce
The best way to predict the future is to create it, to make things happen yourself. The
power to control your own destiny is something that all of us have. Teradata does this by
using the same process to retrieve a row as it did for placing the row on the proper AMP.
On the following page you see that the user has run a query looking for all columns in the
Employee_Table where the Emp_No = 2. The Parsing Engine knows that Emp_No is the
Primary Index and comes up with a 1-AMP Plan.
Teradata Hashes the Primary Index Value of 2 and receives a result called the Row
Hash, which is calculated to be 000000101, which equates to a 5. Teradata counts over 5
buckets in the Hash Map and the system tells the BYNET to contact AMP 1.
If 1,000 users all ran a query to find Employee Number 2 the math would be run 1,000
times, with each time the system telling Teradata to look only on AMP 1. Teradata just
does the math again each time, quite quickly, and only the AMP holding the row needs to
be contacted.
![Page 54: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/54.jpg)
![Page 55: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/55.jpg)
Hashing Non-Unique Primary Indexes (NUPI)
“Be not afraid of going slowly,
be afraid of standing still.”
- Chinese Proverb
Teradata can run the same Hash Formula on Character data and Non-Unique values. On
the following page you can see that Last_Name is the Primary Index. It is a Non-Unique
Primary Index (NUPI). Teradata hashes the Last_Name value of ‘Ratel’ and once again
it comes up with a 32-bit Row Hash. The Row Hash value is 00000000000011111,
which equates to a 31. Teradata will go to the Hash Map, look inside bucket 31, and
then know which AMP to place the row on.
Here is what you need to understand about a Non-Unique Primary Index value. It can
have duplicates. If there are 5,000 people in the table with the Last_Name of ‘Smith’
then all 5,000 of these rows will go to the same AMP.
Remember, there is only one Hash Formula and only one Hash Map, so duplicate values
always land together. That is the only problem with a NUPI: it can cause uneven
distribution, and often one AMP gets many more rows than the other AMPs. This is
called a “Hot AMP” or “Data Spike”.
You don’t need to have perfect distribution in Teradata so a NUPI is very acceptable, but
sometimes the “Hot AMP” or “Data Spike” situation is just too big and this can cause
problems.
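A “Hot AMP” is easy to reproduce in a toy model. The Python sketch below makes several assumptions: `zlib.crc32` stands in for the Hash Formula, the bucket is a simple remainder of the hash, the system has 4 AMPs, and the table contents are invented for the example.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 1_000_000
NUM_AMPS = 4

def amp_for(value: str) -> int:
    """Stand-in hash -> bucket -> AMP (not Teradata's real algorithm)."""
    bucket = zlib.crc32(value.encode("utf-8")) % NUM_BUCKETS + 1
    return (bucket - 1) % NUM_AMPS + 1

# Hypothetical table: 5,000 rows named Smith plus a few other last names.
last_names = ["Smith"] * 5000 + ["Ratel", "Lacy", "Jones", "Zao", "Vey"] * 10

rows_per_amp = Counter(amp_for(name) for name in last_names)
smith_amp = amp_for("Smith")

print(rows_per_amp[smith_amp] >= 5000)   # True: one AMP holds every Smith
print(dict(rows_per_amp))                # the skew across the AMPs
```

Whichever AMP ‘Smith’ happens to hash to ends up with at least 5,000 of the 5,050 rows, which is the data spike the text warns about.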
![Page 56: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/56.jpg)
![Page 57: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/57.jpg)
Placing Non-Unique Primary Indexes (NUPI) Rows
“If the facts don’t fit the theory, change the
facts”
-Albert Einstein
Teradata can run the same Hash Formula on Character data and Non-Unique values. On
the following page you can see that Last_Name is the Primary Index. It is a Non-Unique
Primary Index (NUPI). Teradata hashes the Last_Name value of ‘Lacy’ and once again it
comes up with a 32-bit Row Hash. The Row Hash value is 00000000000001100, which
equates to a 12. Teradata will go to the Hash Map, look inside bucket 12, and then
know which AMP to place the row on.
I want you to notice that the last names in this table include three people named ‘Lacy’ and
two people named ‘Jones’. Can you predict what will happen?
Check out the next couple of slides.
![Page 58: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/58.jpg)
![Page 59: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/59.jpg)
Placing (NUPI) Rows Continued
“We must use time as a tool, not as a
crutch.”
– John F. Kennedy
I want you to notice that the last names in this table have three people named ‘Lacy’ and
two people named ‘Jones’. Each of the rows with the name ‘Lacy’ went to AMP 4 and
everyone named ‘Jones’ went to AMP 2. Duplicate values Hash the same and they point
to the same bucket in the Hash Map, so they always go to the same AMP.
If we had more people in the table named ‘Lacy’ they would all continue to go to AMP 4.
If there were 1,000,000 people named ‘Jones’ they would all end up on AMP 2.
![Page 60: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/60.jpg)
![Page 61: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/61.jpg)
Retrieving (NUPI) Rows
“Every sunrise is a second chance.”
– Unknown
Please remember that in our example on the next page Last_Name is the Primary
Index of the table. It is a Non-Unique Primary Index (NUPI). A query that uses
Last_Name in the WHERE clause will always be a 1-AMP operation as you can see from
the example on the following page. Even though there could be hundreds, thousands or
millions of Last_Name values of ‘Lacy’, they are all on the same AMP.
Teradata will often claim that NUPI values are grouped together and that is exactly the
truth. All duplicate values go to the same AMP as their duplicate counterparts.
The bottom line is that if you use an UPI or a NUPI in your WHERE clause it will always
be a 1-AMP operation.
![Page 62: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/62.jpg)
![Page 63: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/63.jpg)
Placing Multi-Column Primary Index Rows
“Life is a succession of lessons, which must be
lived to be understood”.
--Ralph Waldo Emerson
When multiple columns are combined to make up the Primary Index it is called a Multi-
Column Primary Index. Teradata places the columns together and then performs the
same process of hashing. In the picture on the next page notice that the Primary Index
consists of First_Name and Last_Name combined. Teradata will add the two names
together, turning the First_Name of ‘Rakish’ and the Last_Name of ‘Ratel’ into
‘RakishRatel’. It will then Hash ‘RakishRatel’ and get a 32-bit Row Hash answer. In the
example we see that ‘RakishRatel’ has hashed to 0000000011010, which equates to a 26.
Teradata will go to bucket 26 in the Hash Map and place the row on the AMP inside the
bucket.
![Page 64: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/64.jpg)
![Page 65: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/65.jpg)
Retrieving Multi-Column Primary Index Rows
“What lies behind us and what lies before us
are tiny matters compared to what lies
within us”
-Ralph Waldo Emerson
Remember that the Primary Index for this table example was a Multi-Column Primary
Index on both First_Name and Last_Name combined. When the user queries using both
the First_Name and the Last_Name Teradata knows this is a 1-AMP operation. Teradata
first combines the First_Name of ‘Rakish’ and the Last_Name of ‘Ratel’ and it becomes
‘RakishRatel’. This produces a Row Hash of 0000000000011010, which equates to a 26.
Teradata can go to bucket 26 in the Hash Map and knows this row is on AMP 2.
There are a couple of items I want you to think about. First and foremost, the only way
Teradata can use a Multi-Column Primary Index is if you use all columns in the
Multi-Column Index in the WHERE clause of your query. As you can see in our example we
used both the First_Name AND the Last_Name in the WHERE clause of our SQL. If the
query had used only one of the columns instead of both, then Teradata would
have had to do a Full Table Scan. Partial Indexing does not work in Teradata.
The positive news about a Multi-Column Index is that it usually spreads the rows fairly
evenly across the AMPs.
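Why a partial value is useless to the Hash Formula can be sketched in Python. This follows the chapter’s simplified picture of joining the column values before hashing; `zlib.crc32` is only a stand-in for the real Hash Formula, and `row_hash` is a hypothetical helper.

```python
import zlib

def row_hash(*primary_index_columns: str) -> int:
    """Hash all Primary Index columns together (stand-in, not Teradata's algorithm)."""
    combined = "".join(primary_index_columns)   # 'Rakish' + 'Ratel' -> 'RakishRatel'
    return zlib.crc32(combined.encode("utf-8"))

full = row_hash("Rakish", "Ratel")
partial = row_hash("Ratel")   # only one of the two index columns supplied

print(full == row_hash("Rakish", "Ratel"))  # True: same columns, same Row Hash
print(f"{full:032b}")                        # Row Hash for the full index value
print(f"{partial:032b}")                     # a different input gives an unrelated Row Hash
```

With only Last_Name there is nothing to reproduce the Row Hash of the full value from, so the system has no bucket to look in and falls back to a Full Table Scan.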
![Page 66: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/66.jpg)
![Page 67: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/67.jpg)
Even Distribution with an UPI
“Nobody forgets where they buried the
hatchet”
– Frank McKinney “Kin” Hubbard
A Unique Primary Index always lays the data out perfectly evenly. Well, I shouldn’t say
perfectly. If you have 20 AMPs and 42 rows then a couple of extra rows will go to one
or two AMPs, but overall it is considered perfectly distributed.
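The 20-AMP, 42-row arithmetic works out as follows in Python. This is an idealized split (real hashing only approximates it), and the variable names are illustrative.

```python
rows, amps = 42, 20

# Every AMP gets the base share; the remainder is spread one row at a time.
base, extra = divmod(rows, amps)
print(base, extra)   # 2 2

rows_per_amp = [base + 1 if amp < extra else base for amp in range(amps)]
print(max(rows_per_amp), min(rows_per_amp), sum(rows_per_amp))   # 3 2 42
```

So in the best case two AMPs hold 3 rows and the other eighteen hold 2, which is as even as 42 rows can be on 20 AMPs.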
Perfect distribution is nice, but it isn’t everything. If you decide that users query a
Non-Unique column like Last_Name far more often than Emp_No then you are much better off
making Last_Name your Primary Index and making it a NUPI.
Uneven distribution is not a problem unless there are huge spikes causing hot AMPs.
![Page 68: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/68.jpg)
![Page 69: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/69.jpg)
Uneven Distribution with a NUPI
“To escape criticism – do nothing, say
nothing, be nothing.”
– Elbert Hubbard
A Non-Unique Primary Index will seldom lay the data out with perfect distribution.
Perfect distribution is nice, but it isn’t everything. If you decide that users query a
Non-Unique column like Last_Name far more often than Emp_No then you are much better off
making Last_Name your Primary Index and making it a NUPI.
Uneven distribution is not a problem unless there are huge spikes causing hot AMPs.
Remember that duplicates always go to the same AMP as their duplicate counterparts.
The example on the following page demonstrates this clearly.
![Page 70: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/70.jpg)
![Page 71: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/71.jpg)
Unacceptable Distribution with a NUPI
“When I was 14 I thought my parents were
the stupidest people in the world. When I
was 21 I was amazed at how much they
learned in seven years.”
– Mark Twain
A Non-Unique Primary Index will seldom lay the data out with perfect distribution.
Perfect distribution is nice, but it isn’t everything. Then again, the example on
the next page is awful. You should never choose a column to be your Primary Index if it
has fewer unique values than the number of AMPs. You should never do what the
example on the next page shows.
Uneven distribution is not a problem unless there are huge spikes causing hot AMPs.
The example will not only cause hot AMPs and huge distribution spikes, but only two
AMPs will be used when distributing and retrieving data. Horrible!
Remember that duplicates always go to the same AMP as their duplicate counterparts.
The example on the following page demonstrates this clearly.
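The rule about unique values versus AMPs can be checked with a toy model in Python. The assumptions are the same as before (`zlib.crc32` as a stand-in hash, a remainder-based bucket, 4 AMPs), and the two-valued column `flags` is hypothetical rather than the column from the picture.

```python
import zlib

NUM_BUCKETS = 1_000_000
NUM_AMPS = 4

def amp_for(value) -> int:
    """Stand-in hash -> bucket -> AMP (not Teradata's real algorithm)."""
    bucket = zlib.crc32(str(value).encode("utf-8")) % NUM_BUCKETS + 1
    return (bucket - 1) % NUM_AMPS + 1

# Hypothetical Primary Index column with only two distinct values.
flags = ["Y", "N"] * 500

amps_used = {amp_for(flag) for flag in flags}
idle_amps = NUM_AMPS - len(amps_used)

print(len(amps_used) <= 2)   # True: two distinct values can reach at most two AMPs
print(idle_amps >= 2)        # True: at least two AMPs never receive a row
```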
![Page 72: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/72.jpg)
![Page 73: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/73.jpg)
Review – Parsing Engine’s Plan with an UPI
“When you are courting a nice girl an hour
seems like a second. When you sit on a red-
hot cinder a second seems like an hour.
That’s relativity.”
– Albert Einstein
A Unique Primary Index always lays the data out perfectly evenly. Plus, the Parsing
Engine’s plan is a 1-AMP operation that can return a maximum of one row. That is
because the value the query is seeking is UNIQUE.
On the following page you can see the Parsing Engine’s plan. This is as sweet as it gets.
![Page 74: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/74.jpg)
![Page 75: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/75.jpg)
Review – Parsing Engine’s Plan with a NUPI
“If you don't know where you're going,
Any road will take you there.”
– Lewis Carroll
A Non-Unique Primary Index doesn’t always lay the data out perfectly evenly, but it is
always a 1-AMP operation when used in the WHERE clause. There can be millions of
rows returned because the value the query is seeking is NOT UNIQUE.
On the following page you can see the Parsing Engine’s plan. This is pretty sweet as well.
![Page 76: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/76.jpg)
![Page 77: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/77.jpg)
Review – Big Trouble – The Full Table Scan
“The only true wisdom is in knowing
You know nothing.”
– Socrates
Although I will show you additional ways Teradata accesses the data, so far we have
learned about only two: a Primary Index single-AMP (1-AMP) retrieve and the
dreaded Full Table Scan. A Full Table Scan in Teradata is pretty fast because of the
Parallel Processing. Each AMP reads the rows for the table that it owns only once and
passes any rows that meet the criteria up the BYNET to the Parsing Engine.
The only thing wrong with a Full Table Scan is doing one when it isn’t necessary. If a
table has a Primary Index of Last_Name then you should attempt to use Last_Name in
the WHERE clause. If a table has a Primary Index of Employee_No then you should
attempt to use that in the WHERE clause.
Imagine that you work for HR and an employee comes in to talk. You want to know
about the employee so you run a query to access the Employee_Table. You notice that
Last_Name is the Primary Index of the table, and the good news is that you know their
last name. It is a waste of time to run the query without a WHERE clause or to put
their Employee_No in the WHERE clause instead.
Sometimes a Full Table Scan is needed and that is why Teradata’s Parallel Processing is
so great. You can ask difficult questions you might not have been able to ask before with
systems of less power, but please be aware of the power of knowing the Primary Index of
a table.
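The difference between the two access paths can be sketched with a toy 4-AMP system in Python. Everything here is illustrative: `zlib.crc32` stands in for the Hash Formula, the table contents are invented, and the two functions are hypothetical models of the plans, not Teradata code.

```python
import zlib

NUM_BUCKETS = 1_000_000
NUM_AMPS = 4

def amp_for(value) -> int:
    """Stand-in hash -> bucket -> AMP (not Teradata's real algorithm)."""
    bucket = zlib.crc32(str(value).encode("utf-8")) % NUM_BUCKETS + 1
    return (bucket - 1) % NUM_AMPS + 1

# Hypothetical Employee_Table with Last_Name as the (non-unique) Primary Index.
employees = [(emp_no, f"Name{emp_no % 50}") for emp_no in range(1, 1001)]
amps = {amp: [] for amp in range(1, NUM_AMPS + 1)}
for emp_no, last_name in employees:
    amps[amp_for(last_name)].append((emp_no, last_name))

def primary_index_retrieve(last_name):
    """WHERE Last_Name = ...: hash the value and contact only the owning AMP."""
    amp = amp_for(last_name)
    # A simple filter keeps the sketch short; a real AMP would not read every row here.
    hits = [row for row in amps[amp] if row[1] == last_name]
    return hits, 1                     # (rows found, AMPs contacted)

def full_table_scan(emp_no):
    """WHERE Emp_No = ...: not the Primary Index, so every AMP reads all its rows."""
    hits = [row for rows in amps.values() for row in rows if row[0] == emp_no]
    return hits, len(amps)             # (rows found, AMPs contacted)

print(primary_index_retrieve("Name7")[1])   # 1
print(full_table_scan(7)[1])                # 4
```

Both queries find the employee, but the first one bothers a single AMP while the second puts all four to work.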
![Page 78: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/78.jpg)
![Page 79: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/79.jpg)
Big Trouble – A Picture of a Full Table Scan
“If I have seen farther than others, it is
because I was standing on the shoulders of
giants.”
- Isaac Newton
As you can see on the next page a Full Table Scan will cause ALL-AMPs to read every
row they own. Each row is read only once and the AMP will return rows that match the
criteria. There is nothing wrong with doing a Full Table Scan query unless you don’t
have to do it.
There is nothing wrong with walking across the city if you don’t have a car and can’t
afford transportation, but if you can you might want to consider riding. Especially when
you are on the company’s payroll and time and resources are important.
![Page 80: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/80.jpg)
![Page 81: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/81.jpg)
Test your Teradata Primary Index Knowledge
“Look at life through the windshield, not the
rearview mirror.”
- Byrd Baggett
The following page allows you to test your knowledge of what you have learned so far.
This will actually be an exercise that builds continually after each chapter. Follow the
instructions on the page and make your family proud with the right answers.
![Page 82: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/82.jpg)
![Page 83: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/83.jpg)
![Page 84: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/84.jpg)
![Page 85: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/85.jpg)
The Row Hash
“Following the light of the sun, we left the
Old World.”
– Christopher Columbus
I have a feeling you are going to sail through this chapter and discover a whole new
world. You are about to learn how each AMP gets its rows organized in shipshape
order.
We already know that Teradata places rows on an AMP by Hashing the Primary Index,
which returns a Row Hash value, which is then equated to a number ranging from one to
one million, which equates to a bucket in the Hash Map, which points to an AMP number
in which the row will reside.
When the row is placed on the AMP the Row Hash that was derived by Hashing the
Primary Index will be placed at the front of the row. The AMP will sort the rows that it
owns for the table by the Row Hash.
Every Row will be kept in a perfect order on the AMP by sorting by the Row Hash.
![Page 86: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/86.jpg)
![Page 87: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/87.jpg)
The Uniqueness Value
“It’s not the size of the dog in the fight, but
the size of the fight in the dog.”
– Archie Griffin
We have learned that the Row Hash is placed at the front of every row and that the AMP
will sort their rows by the Row Hash, thus keeping things sorted in perfect order. The
AMP will also add a Uniqueness Value behind the Row Hash so it can keep track of
duplicate values.
When a row comes in with its Row Hash the AMP will check to see if it has any other
Row Hashes exactly like the one it has just received. If this Row Hash is Unique it will
put a 1 as the Uniqueness value. If it already has another Row Hash just like this one it
will put a 2 in the Uniqueness value. If this Row Hash is the third duplicate it will put a
Uniqueness value of 3, etc., etc., etc.
For example, if there are 1,000 duplicate Primary Index values such as the Last_Name of
‘Smith’, then they would each have the same Row Hash and go to the same AMP. Their
Row Hash would be the same, but their Uniqueness Value would range from 1 to 1,000.
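The numbering rule above can be sketched in a few lines of Python. The Row Hash values are illustrative integers and `assign_row_ids` is a hypothetical helper that models the rule as described here, not the AMP’s actual code.

```python
from collections import defaultdict

def assign_row_ids(row_hashes):
    """Pair each incoming Row Hash with the next Uniqueness Value for that hash."""
    seen = defaultdict(int)               # Row Hash -> rows received so far
    row_ids = []
    for row_hash in row_hashes:
        seen[row_hash] += 1               # 1 for the first row, 2 for the second, ...
        row_ids.append((row_hash, seen[row_hash]))
    return row_ids

# Three rows arrive with Row Hash 31 and one with Row Hash 12.
print(assign_row_ids([31, 12, 31, 31]))
# [(31, 1), (12, 1), (31, 2), (31, 3)]
```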
![Page 88: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/88.jpg)
![Page 89: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/89.jpg)
The Row ID
“A good plan, violently executed now, is
better than a perfect plan next week.”
- George S. Patton
The Row Hash and the Uniqueness Value make up the Row ID. Teradata rows placed on
an AMP always have the Row ID at the beginning of every row. Each AMP actually
sorts their rows by the Row ID, not just the Row Hash. This not only organizes the rows
perfectly, but is really how Teradata AMPs find their data so quickly.
Just like George S. Patton the Row ID shall RETURN…. An Answer Set!
The Row Hash is a 32-bit value and the Uniqueness Value is also a 32-bit value. This
means that 64 bits (8 bytes) are placed in front of every Teradata Row!
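To make the 8 bytes concrete, here is a Python sketch that packs the two 32-bit values side by side. The byte layout (big-endian, Row Hash first) is an assumption chosen for the illustration, and `pack_row_id` is a hypothetical helper.

```python
import struct

def pack_row_id(row_hash: int, uniqueness: int) -> bytes:
    """Pack a 32-bit Row Hash and a 32-bit Uniqueness Value into 8 bytes."""
    # ">II" means big-endian, two unsigned 32-bit integers, so that
    # comparing the bytes gives the same order as comparing the numbers.
    return struct.pack(">II", row_hash, uniqueness)

row_id = pack_row_id(5, 1)
print(len(row_id))       # 8  (bytes)
print(len(row_id) * 8)   # 64 (bits)

# Sorting packed Row IDs orders rows by Row Hash, then by Uniqueness Value.
ids = [pack_row_id(31, 2), pack_row_id(5, 1), pack_row_id(31, 1)]
print(sorted(ids) == [pack_row_id(5, 1), pack_row_id(31, 1), pack_row_id(31, 2)])  # True
```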
![Page 90: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/90.jpg)
![Page 91: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/91.jpg)
Duplicates and the Uniqueness Value
“Ambition is a dream with a V8 Engine.”
– Elvis Presley
Teradata has a lot in common with Elvis Presley because to keep the AMPs disks from
overheating they have some really cool fans!
Notice in the picture on the following page that the Primary Index is Last_Name and this
AMP happens to have three rows with the Last_Name of ‘Zao’. Notice that the Row Hash
for all three rows containing ‘Zao’ is identical, and also notice the Uniqueness Values are 1,
2, 3. The first row placed on this AMP that had the name ‘Zao’ was given a Uniqueness
Value of 1, the second ‘Zao’ a 2, and the third ‘Zao’ a 3.
Teradata takes the time to place the data in perfect order because it treats its rows like a
King! That is why Teradata disks are often referred to as the “King of Block and Roll”!
![Page 92: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/92.jpg)
![Page 93: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/93.jpg)
AMPs Sort Their Rows by the Row ID
“Don't use a big word where a diminutive
one will suffice.” - Unknown
Please just remember that each AMP sorts its rows by the Row ID. Why this matters
will become apparent within the next few pages.
![Page 94: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/94.jpg)
![Page 95: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/95.jpg)
Search the Data like a Phone Book
“I've learned that you can't have everything
and do everything at the same time.”
– Oprah Winfrey
Everyone has used a Phone Book at some time in their life. If you decide you want to
order a Pizza you know you can go to the Phone Book. How do people handle the Phone
Book? Because a Phone Book is organized alphabetically from A-Z people generally
open the Phone Book to about the middle. Then they see where they are alphabetically
and adjust the search towards the beginning or the end. It doesn’t take long to find where
you can order a pizza.
Can you imagine if every time you used the Phone Book you started on page 1 and then
turned a page at a time until you found ‘Pizza Delivery’? That wouldn’t be a Pizza
search, but a serial search! You might starve before you even found the Pizza Delivery
place of your choice.
Teradata AMPs don’t search for their data serially; they do it just like a phone book.
They go to the middle of the table and see where they are and then adjust.
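The gap between the two ways of using a phone book shows up quickly if you count page turns. The Python sketch below is a generic illustration with made-up function names; the 1,024 “pages” are just sorted integers.

```python
def serial_search(sorted_values, target):
    """Start on page 1 and turn one page at a time; return (index, probes)."""
    for probes, value in enumerate(sorted_values, start=1):
        if value == target:
            return probes - 1, probes
    return -1, len(sorted_values)

def binary_search(sorted_values, target):
    """Open to the middle, then adjust toward the front or back; return (index, probes)."""
    low, high, probes = 0, len(sorted_values) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if sorted_values[mid] == target:
            return mid, probes
        if sorted_values[mid] < target:
            low = mid + 1      # target is further back in the book
        else:
            high = mid - 1     # target is nearer the front
    return -1, probes

pages = list(range(1, 1025))                  # 1,024 sorted "pages"
print(serial_search(pages, 1000))             # (999, 1000)
print(binary_search(pages, 1000)[1] <= 11)    # True: never more than 11 probes for 1,024 pages
```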
![Page 96: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/96.jpg)
![Page 97: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/97.jpg)
Why is my Phone Book 00000’s and 111111’s?
“I was walking down the street wearing
glasses when the prescription ran out.”
- Steven Wright
The AMPs don’t like to brag, but they are so fast they often make a spectacle of
themselves!
The phone book is sorted alphabetically from A-Z, but computers read and write data in
Binary so they sort their numbers with zeros and ones (000000 to 111111).
This is why AMPs sort their rows by the Row-ID. This allows the AMP to search for a
row with a Binary Search!
This gives each AMP 20/20 vision when searching for a row! The next couple of pages
will clearly show you a Binary Search.
![Page 98: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/98.jpg)
![Page 99: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/99.jpg)
Performing a Binary Search
“Diplomacy is the art of saying “Nice
Doggie” until you can find a rock.”
– Will Rogers
The Row ID is the AMP’s best friend and guides the AMP to the proper row. Let’s just
say the Row ID is both a bloodhound and a retriever when it comes to looking for a row.
Unless Teradata is doing a Full Table Scan, it will always perform a Binary Search when
looking for a row based on the Primary Index. A Binary Search is always done on only
the Row ID when looking for a Primary Index value. That is why AMPs sort their rows
by the Row ID.
In the picture on the next page you can see that Last_Name is the Primary Index and this
AMP has been instructed to find a user named ‘Vey’. The AMP is actually instructed to
find Row Hash 000011110, and then double-check to make sure the Last_Name is
actually ‘Vey’.
Don’t let looking for a row frighten you because the disk’s “Bark is bigger than its Byte!”
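As a rough illustration of the lookup described above, the Python sketch below keeps one AMP's rows sorted by Row ID, binary-searches for a Row Hash, and then double-checks the stored Last_Name. Only the Row Hash 000011110 for 'Vey' comes from the text. The other rows, the hash values, and the function names are invented for this example and are not Teradata internals.

```python
# One AMP's rows, sorted by Row ID = (row_hash, uniqueness_value).
# Only the hash 000011110 for 'Vey' comes from the text; the rest is invented.
ROWS = [
    ((0b000000101, 1), "Adams"),
    ((0b000001100, 1), "Jones"),
    ((0b000011110, 1), "Vey"),
    ((0b000110011, 1), "Smith"),
]


def first_position(rows, row_hash):
    """Binary search: index of the first row whose Row Hash is >= row_hash."""
    lo, hi = 0, len(rows)
    while lo < hi:
        mid = (lo + hi) // 2          # open the phone book to the middle
        if rows[mid][0][0] < row_hash:
            lo = mid + 1              # the target is further forward
        else:
            hi = mid                  # the target is here or further back
    return lo


def find_row(rows, row_hash, last_name):
    """Find rows with this Row Hash, then confirm the Primary Index value."""
    i = first_position(rows, row_hash)
    found = []
    # Two different values can share a Row Hash, so verify each candidate.
    while i < len(rows) and rows[i][0][0] == row_hash:
        if rows[i][1] == last_name:
            found.append(rows[i])
        i += 1
    return found


print(find_row(ROWS, 0b000011110, "Vey"))
```

The second check matters because the search is done on the Row Hash alone. A row whose hash matches but whose Last_Name differs is skipped.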
![Page 100: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/100.jpg)
![Page 101: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/101.jpg)
Opening the Phone Book to the Middle
“Look at life through the windshield, not the
rearview mirror.”
- Byrd Baggett
I once saw a funny show where everyone was racing to get to the finish line, but the
finish line was in a city hundreds of miles away. The Italian driver ripped off the
rearview mirror, and the person sitting in the passenger seat asked why he would tear the
rearview mirror off the car! The Italian racer said, “It is Italian driving – What’s
behind you doesn’t matter!”
This is not the case when AMPs are driving. The Binary Search takes the AMP to the
middle of the rows. It will then move forward looking through the windshield or
backward using the rearview mirror to adjust its search. Because the rows are sorted by
the Row ID, and the Row ID is in Binary, the AMP can race to the row it wants to
retrieve.
In the picture on the following page the AMP has gone to the middle of the rows and
realizes it needs to go further.
![Page 102: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/102.jpg)
![Page 103: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/103.jpg)
I can Name that Tune in 5 Notes
“Where there is no patrol car, there is no
speed limit.”
- Al Capone
The following page shows how the AMP moves down through the next portion of the
rows to find the row. There is no speed limit on the AMP highway. It always moves as
fast as possible. You must remember that Teradata was built to hold Terabytes of data.
A single AMP may hold billions of rows for a single table. Each step of a binary search
cuts the remaining search range in half. The first search goes to the middle of the phone
book. Then the AMP knows if it should move forward or backward. If it moves forward it
goes halfway further and checks where it is again. It continues this brilliant binary
search until it finds the row.
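To put a number on how quickly halving shrinks the search, here is a small Python sketch that counts the probes a binary search makes. It is only an illustration: the sorted integers stand in for Row IDs, and the function names are made up. The worst case for n sorted rows is floor(log2(n)) + 1 probes, so a million rows need at most 20 probes and a billion rows need at most 30.

```python
def binary_search(sorted_ids, target):
    """Return (index, probes); index is None when target is absent."""
    lo, hi = 0, len(sorted_ids) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1                   # one look at the middle row
        if sorted_ids[mid] == target:
            return mid, probes
        if sorted_ids[mid] < target:
            lo = mid + 1              # move forward
        else:
            hi = mid - 1              # move backward
    return None, probes


def max_probes(row_count):
    """Worst-case probes for row_count sorted rows: floor(log2(n)) + 1."""
    return row_count.bit_length()


# range() supports len() and indexing, so no list of a million IDs is built.
index, probes = binary_search(range(1_000_000), 999_999)
print(index, probes, max_probes(1_000_000), max_probes(1_000_000_000))
```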
![Page 104: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/104.jpg)
![Page 105: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/105.jpg)
A Visual for Data Layout
“I saw the angel in the marble and carved
until I set him free.”
--Michelangelo
The example on the following page is a logical view of data on AMPs. Each AMP holds
a portion of every table. Each AMP keeps its tables in their own separate drawers. The
Row ID is used to sort each table on an AMP.
Each AMP holds a portion of every table.
Each AMP keeps its tables in separate drawers.
Each table is sorted by Row ID.
![Page 106: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/106.jpg)
![Page 107: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/107.jpg)
Test Your Teradata Access Query Knowledge
Fill in the chart on the next page. Good luck.
![Page 108: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/108.jpg)
![Page 109: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/109.jpg)
![Page 110: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/110.jpg)
![Page 111: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/111.jpg)
UPI Row-ID Test
“Acting is all about honesty. If you can fake
that, you’ve got it made.”
- George Burns
To store the data, the value(s) in the PI are hashed through a calculation to determine
which AMP will possess the row. The same data values always hash to the same Row Hash
and therefore are always associated with the same AMP. The PI is what makes or breaks
the system. The PI is responsible for all of the system’s data distribution.
Our quiz on the next page is designed to show only in theory how Teradata places a row
on an AMP. As a stand-in for the real Hash Formula, we are going to divide the Primary
Index value by two. The output is called the Row-Hash. We will take our Row-Hash answer
and it will point to a bucket in the Hash Map. That bucket will tell Teradata which AMP
will hold the row.
Your mission, if you decide to accept it, is to place the Row ID and the Row on the
proper AMP. I have already completed the first row for you because I am a nice guy!
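If you would like to check your quiz reasoning with a few lines of code, the Python sketch below follows the same steps: divide the Primary Index value by two, use the result to pick a bucket, and read the AMP out of that bucket. The Hash Map contents, the number of AMPs, and the sample values are all invented here, since the real ones are in the picture. The sketch also assumes no two sample values share a Row Hash, so every uniqueness value is simply 1.

```python
# Hypothetical Hash Map: position = bucket number, value = AMP number.
HASH_MAP = [1, 1, 2, 2, 3, 3, 4, 4]


def toy_row_hash(pi_value):
    """Stand-in for the real Hash Formula: divide the PI value by two."""
    return pi_value // 2


def amp_for(row_hash):
    """Look up the bucket for this Row Hash and return the AMP it names."""
    bucket = row_hash % len(HASH_MAP)   # assumption: keeps the bucket in range
    return HASH_MAP[bucket]


def place_rows(pi_values):
    """Return {amp: [(row_id, pi_value), ...]} with rows sorted by Row ID."""
    amps = {}
    for value in pi_values:
        row_hash = toy_row_hash(value)
        row_id = (row_hash, 1)          # uniqueness value fixed at 1 here
        amps.setdefault(amp_for(row_hash), []).append((row_id, value))
    for rows in amps.values():
        rows.sort()                     # each AMP keeps Row ID order
    return amps


print(place_rows([8, 2, 6, 4]))
```

Running the same values through `place_rows` again gives the same placement, which is the point of the exercise: a consistent formula produces repeatable results.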
![Page 112: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/112.jpg)
![Page 113: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/113.jpg)
![Page 114: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/114.jpg)
![Page 115: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/115.jpg)
NUPI Row-ID Test
“Warning: Keyboard Not Attached. Press
F10 to Continue.”
– Actual Computer Error Message
To store the data, the value(s) in the PI are hashed through a calculation to determine
which AMP will possess the row. The same data values always hash to the same Row Hash
and therefore are always associated with the same AMP. The PI is what makes or breaks
the system. The PI is responsible for all of the system’s data distribution.
Our quiz on the next page is designed to show only in theory how Teradata places a row
on an AMP. I have already hashed the Last_Name Primary Index (NUPI) for you.
The output is called the Row-Hash. We will take our Row-Hash answer and it will point
to a bucket in the Hash Map. That bucket will tell Teradata which AMP will hold the
row.
Your mission, if you decide to accept it, is to place the Row ID and the Row on the
proper AMP. I have already completed the first row for you because I am a nice guy!
![Page 116: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/116.jpg)
![Page 117: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/117.jpg)
![Page 118: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/118.jpg)
![Page 119: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/119.jpg)
![Page 120: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/120.jpg)
![Page 121: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/121.jpg)
Secondary Indexes
“The afternoon knows what the morning
never suspected.”
- Swedish Proverb
The secondary index knows what the full table scan never suspected. Secondary Indexes
provide an alternate path to the data. So far we have learned that every table has one and
only one Primary Index, and we have learned that the Primary Index is much faster than
the Full Table Scan. Secondary Indexes are not as fast as the Primary Index, but they can
be pretty fast, and they can be much faster than a Full Table Scan.
There can be up to 32 Secondary Indexes on a table, but there is a price to pay. Every
Secondary Index creates a Subtable on every AMP designed to point to the real Primary
Index Row-ID. I will explain in full detail. You may have wondered why I was so
persistent with the explanation of the Primary Index and the actual Row-IDs, but you will
soon see exactly why I stressed knowing that information.
There are two types of Secondary Index: Unique Secondary Indexes, which are called
USIs, and Non-Unique Secondary Indexes, which are called NUSIs.
A USI is always a Two-AMP operation, so it is almost as fast as a Primary Index. A NUSI
is an All-AMP operation, but it is not a Full Table Scan.
![Page 122: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/122.jpg)
![Page 123: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/123.jpg)
The Base Table
“It’s déjà vu all over again!”
-Yogi Berra
We have already discussed a table, but I want to emphasize that for this chapter we will
refer to the real tables as Base Tables and the Secondary Index tables as Subtables.
Secondary Index Subtables are treated by Teradata as just another table, so I guess you
could say “It’s déjà 2 all over again!”
I want you to look at the picture on the next page. Notice that the Primary Index is
Last_Name and it is a Non-Unique Primary Index (NUPI). I also want you to pay close
attention to the Row-IDs in front of each row.
Secondary Index Subtables are designed to point to the real row in the base table, and they
do so by storing the exact Row-ID of the row they are looking for in the base
table.
![Page 124: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/124.jpg)
![Page 125: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/125.jpg)
Creating a Unique Secondary Index (USI)
“Beware of the young doctor and the old
barber.”
- Benjamin Franklin
If Ben Franklin were able to read this book, I think he would be shocked at how easy it is
to create a secondary index!
The slide on the next page shows the syntax for creating a Unique Secondary Index
(USI). We have created the USI on the column Emp_No. Once the USI is created, no
duplicate Employee Number can exist in the table. If a row is added with a duplicate
value for Emp_No, then Teradata will reject the row and the user will receive an error
message.
As soon as we create the USI with our SQL Statement, a Subtable will be created on
every AMP. I will show you that in the next pages.
![Page 126: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/126.jpg)
![Page 127: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/127.jpg)
The Secondary Index Subtable
“Once the game is over, the king and the
pawn go back in the same box.”
- Italian Proverb
As soon as the USI is created with the SQL syntax, the next move comes from Teradata,
which creates a Subtable on every AMP. This is true for both the USI and the NUSI.
Let’s say for example the DBA created the maximum of 32 secondary indexes on a table.
Then there would be 32 Subtables created, each taking up PERM Space.
The entire purpose of the Secondary Index Subtable is to point back to the real row
in the base table via the Row-ID.
![Page 128: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/128.jpg)
![Page 129: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/129.jpg)
Inside the Secondary Index Subtable
“The absent are always in the wrong.”
English Proverb
There are always two columns in the Secondary Index Subtable: the value of the column
on which you created the secondary index, which in this case was Emp_No, and the real
Row-ID of the row in the Base Table.
Think of the Subtable as a confidential informant of the FBI. Without the secondary
index Subtable there would be only two ways for Teradata to find a particular row.
It would be by using the Primary Index value in the query or by doing a Full Table Scan.
The Secondary Index Subtable is really a baby table that contains the Secondary Index
column, which acts as the Primary Index of the Subtable, so Teradata can easily find
Emp_No 2 in the Subtable. It can find any Emp_No in the Subtable because Emp_No is
the Primary Index of the Subtable. So why is the Subtable like an FBI informant?
When a query is written with Emp_No in the WHERE clause, the Teradata Parsing
Engine (PE) recognizes it is a USI and looks up the Emp_No value in the Subtable with
a 1-AMP operation. It then asks, “Can you tell me the Row-ID of the row in the base
table?” Once Teradata has the Row-ID it takes the value in the Row-Hash of the Row-ID,
looks at the Hash Map, and knows exactly which AMP the Base Table row is on.
![Page 130: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/130.jpg)
![Page 131: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/131.jpg)
How Teradata builds the Secondary Index Subtable
“I don’t know who my grandfather was. I
am more interested in who his grandson will
become.”
– Abraham Lincoln, 16th president of the United States
As soon as the DBA uses the SQL to create a secondary index, Teradata immediately gets
to work. Teradata must build the secondary index Subtable before it can
become an alternate path to the data. Each AMP hashes the secondary index value for
each row it owns with the Hash Formula. The result is a 32-bit Row Hash which points
to a bucket in the Hash Map, which tells the secondary index row which AMP’s Subtable
it will be on. All UNIQUE Secondary Indexes are hashed, and the value plus the real
Row-ID of the base table row are sent to the proper AMP over the BYNET.
Pay close attention to the slide on the next page and let me walk you through it.
We created the USI on the column Emp_No. Every Emp_No value will now have to also
reside inside the Subtable. The first row’s value for Emp_No is a 2. The PE hashes the
value of 2 and the result is a 32-bit row hash. The PE then points to the bucket in the
Hash Map that corresponds to the 32-bit row hash and the Hash Map says that AMP 1 is
the destination AMP. So the Emp_No value of 2 goes to AMP 1’s Subtable. It also
brings with it the real Row-ID for its row from the Base Table, which is 1,1. Now the
first secondary row is perfectly placed.
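The placement just described can be sketched in a few lines of Python. This is a teaching toy, not how Teradata is coded. The Hash Map contents, the divide-by-two hash, and every base row except the first (Emp_No 2 with Row-ID 1,1, which lands on AMP 1) are assumptions made up for the example.

```python
# Hypothetical Hash Map: position = bucket number, value = AMP number.
HASH_MAP = [1, 1, 2, 2, 3, 3, 4, 4]

# Base rows as (base_row_id, emp_no). Only the first pair (Row-ID 1,1 with
# Emp_No 2) comes from the text; the others are invented.
BASE_ROWS = [((1, 1), 2), ((2, 1), 8), ((3, 1), 4), ((4, 1), 6)]


def toy_row_hash(value):
    """Stand-in for the real Hash Formula: divide by two."""
    return value // 2


def amp_for(row_hash):
    """Bucket lookup; the modulo is an assumption to stay inside the map."""
    return HASH_MAP[row_hash % len(HASH_MAP)]


def build_usi_subtable(base_rows):
    """Return {amp: [(emp_no, base_row_id), ...]}: one Subtable per AMP."""
    subtables = {amp: [] for amp in sorted(set(HASH_MAP))}
    for base_row_id, emp_no in base_rows:
        destination = amp_for(toy_row_hash(emp_no))
        # The index value travels together with the Row-ID of its base row.
        subtables[destination].append((emp_no, base_row_id))
    return subtables


print(build_usi_subtable(BASE_ROWS))
```

Every AMP gets a Subtable, even one that ends up holding no rows, and the Subtable holds exactly as many rows as the base table.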
![Page 132: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/132.jpg)
![Page 133: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/133.jpg)
How Teradata builds the Secondary Index Subtable
“The most exciting phrase to hear in
science, the one that heralds the most
discoveries, is not “Eureka!”, but “That’s
funny…””
Isaac Asimov
Your job is to now place the remaining three rows on the proper AMP perfectly. I want
you to use the Tera-Tom Hash Formula, which is to divide by 2. This is designed to
show you that a consistent formula will produce predictable and repeatable results.
Divide each Emp_No by 2 and that will represent the Hash Formula, with a 32-bit Row
Hash as the result. You can then point to the corresponding bucket in the Hash Map
where you will place the row on the destination AMP. You need to place the USI value
and the real Row-ID of the row with it. Good luck!
![Page 134: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/134.jpg)
![Page 135: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/135.jpg)
![Page 136: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/136.jpg)
![Page 137: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/137.jpg)
Building the Secondary Index Subtable
“I don’t skate to where the puck is; I skate
to where I want the puck to be.”
– Wayne Gretzky
If you received the answers listed on the next page you are no longer skating on thin ice.
You are performing a Power Play. A Secondary Index Subtable row won’t perform a
one-timer, but instead performs a two-timer, because the USI Value and Base Row-ID
always make a USI query a 2-AMP operation.
Now that the Secondary Index Subtable is built, the users can query the base table. If the
USI is used in the WHERE clause, the Parsing Engine knows it has an alternate route to
the data. It can use a 1-AMP operation to find the Subtable Row and then use another
1-AMP operation to find the base row.
If there are 1,000,000 rows in the base table there will be 1,000,000 rows in the Subtable.
The Base table will be much larger because it probably has many columns, but the
Subtable only has two columns.
![Page 138: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/138.jpg)
![Page 139: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/139.jpg)
USI – Always a Two-AMP Operation
“Measure a thousand times and cut once.”
-Turkish Proverb
Secondary Indexes provide an alternate path to the data, and should be used on queries
that run thousands of times. Teradata runs extremely well without Secondary Indexes, but
since secondary indexes use up space and overhead, they should only be used on
“KNOWN QUERIES” or queries that are run over and over again. Once you know the
data warehouse environment, you can create Secondary Indexes to enhance its
performance.
“Measure a thousand query times and
create a secondary index.”
-Turkish Teradata Certified Professional
Every time the Parsing Engine sees the USI column in the WHERE clause it comes up
with a plan that involves only two AMPs. Memorize this if you have to, but always
know that a USI query is a two-AMP operation. Read the next couple of pages and you
will know why.
![Page 140: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/140.jpg)
![Page 141: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/141.jpg)
![Page 142: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/142.jpg)
![Page 143: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/143.jpg)
The Parsing Engine’s Plan with a USI Query
“If you do what you've always done, you'll
get what you've always got.”
– Anonymous
The above quote is perfect for the secondary index because the Secondary Index and the
Hash Map do what they have always done, and they know they’ll get what they always
got. The Parsing Engine doesn’t always put out the entire plan and wait for the data to
return. Sometimes the Parsing Engine gives pieces of the plan and helps guide the
AMPs. Take a look at the explanation below and the picture on the next page.
The first part of the USI plan will be to find the USI value in the Secondary Index
Subtable. Let’s say for example the WHERE Clause stated:
WHERE Emp_No = 2;
The Parsing Engine knows that the Subtable’s Primary Index is Emp_No. It puts out the
first part of the plan by stating:
Hash the value of 2 and then look at the corresponding bucket in the Hash Map. Go to
the Destination AMP inside the Hash Map bucket and tell that AMP to find Emp_No 2 in
its Subtable. Then have it return the Row-ID of the base row to me (the Parsing
Engine).
Once the Parsing Engine receives the Row-ID of the base row it takes the first part of the
Row-ID (which is the row hash), looks at the corresponding bucket in the Hash Map and
now knows the AMP that holds the base row. It sends a second message to that AMP
and says “Find this Row-ID in your Employee_Table and retrieve the row”.
![Page 144: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/144.jpg)
![Page 145: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/145.jpg)
Retrieving Base Rows using the USI
“Never trust the advice of a man in
difficulties”
Aesop (620 BC – 560 BC)
Remember the previous slide? The first part of the USI plan was to find the USI value in
the Secondary Index Subtable. The PE saw:
SELECT *
FROM Employee_Table
WHERE Emp_No = 2;
The Parsing Engine knows that the Subtable’s Primary Index is Emp_No. It puts out the
first part of the plan by stating:
Hash the value of 2 and then look at the corresponding bucket in the Hash Map. Go to
the Destination AMP inside the Hash Map bucket and tell that AMP to find Emp_No 2 in
its Subtable. Then have it return the Row-ID of the base row to me (the Parsing
Engine).
Now look at the slide on the next page. Once the Parsing Engine receives the Row-ID of
the base row it takes the first part of the Row-ID (which is the row hash), looks at the
corresponding bucket in the Hash Map and now knows the AMP that holds the base row
WHERE Emp_No = 2. It sends a second message to the AMP holding the base row and
says “Find this Row-ID in your Employee_Table and retrieve the row”.
This process is always a 2-AMP operation.
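The two steps of the plan can be walked through with a small Python sketch. Everything in it is an assumption for illustration: the Hash Map, the divide-by-two hash, the table contents, and the names. Plain dictionary lookups stand in for the binary search each AMP would really do.

```python
# Hypothetical Hash Map: position = bucket number, value = AMP number.
HASH_MAP = [1, 1, 2, 2, 3, 3, 4, 4]

# Invented USI Subtables: {amp: {emp_no: base_row_id}}.
SUBTABLES = {1: {2: (1, 1)}, 2: {4: (3, 1), 6: (4, 1)}, 3: {8: (2, 1)}, 4: {}}

# Invented base table slices: {amp: {row_id: (last_name, emp_no)}}.
BASE_TABLES = {
    1: {(1, 1): ("Vey", 2)},
    2: {(2, 1): ("Jones", 8), (3, 1): ("Smith", 4)},
    3: {(4, 1): ("Adams", 6)},
    4: {},
}


def toy_row_hash(value):
    """Stand-in for the real Hash Formula: divide by two."""
    return value // 2


def amp_for(row_hash):
    """Bucket lookup; the modulo is an assumption to stay inside the map."""
    return HASH_MAP[row_hash % len(HASH_MAP)]


def usi_lookup(emp_no):
    """Return (row, amps_touched) for a query like WHERE Emp_No = emp_no."""
    # Step 1: hash the USI value to find the AMP holding the Subtable row.
    subtable_amp = amp_for(toy_row_hash(emp_no))
    base_row_id = SUBTABLES[subtable_amp].get(emp_no)
    if base_row_id is None:
        return None, [subtable_amp]
    # Step 2: no rehash needed; the Row Hash is the first part of the Row-ID.
    base_amp = amp_for(base_row_id[0])
    return BASE_TABLES[base_amp][base_row_id], [subtable_amp, base_amp]


print(usi_lookup(8))
```

In this made-up data, Emp_No 8 finds its Subtable row on one AMP and its base row on another, so the list of AMPs touched has exactly two entries.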
![Page 146: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/146.jpg)
![Page 147: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/147.jpg)
Picture that USI in Action
“You always pass failure on your way to
success”
– Mickey Rooney 1920
The following page shows a great picture of what to expect with a USI query!
![Page 148: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/148.jpg)
![Page 149: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/149.jpg)
USI Summary
“Nearly all men can stand adversity, but if
you want to test a man’s character, give him
power.”
– Abraham Lincoln
The following page provides a summary to show you the power of the USI. This will
work if you want to test a row’s Characters or Integers!
![Page 150: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/150.jpg)
![Page 151: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/151.jpg)
USI Pictorial using the Hash Maps
“Nobody forgets where they buried the
hatchet”
– Frank McKinney “Kin” Hubbard
You won’t have an Ax to grind with the next slide. This shows you exactly how Teradata
uses the Hash Map twice for a USI query. Each Hash Map lookup leads to an AMP that does
a binary search, so two binary searches chop away at finding the result set requested.
Notice that the PE hashes the USI value to find out which AMP holds the Subtable row.
Then the AMP does a binary search on its Subtable to deliver the real Row-ID of the base
table. The PE doesn’t need to Hash this. It knows the first part of the Row-ID was
created when the PE hashed the Primary Index of the table to originally place the base
row. It takes the first part of the Row-ID, which is the Row Hash, and looks at the Hash
Map’s corresponding bucket. Now it sends a message to the proper AMP to get the base
row using the Row-ID. That AMP will do a Binary Search to quickly find the row.
There are always two binary searches when a USI query is used to retrieve a unique
row: one Binary Search in the Subtable look-up and one in the Base Table look-up.
![Page 152: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/152.jpg)
![Page 153: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/153.jpg)
USI Secondary Index Quiz
“Only dead fish swim with the stream.”
– Anonymous
Answer the questions on the next page, but be careful. I have a few tricks that you could
fall for Hook, Line, and Sinker!
![Page 154: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/154.jpg)
![Page 155: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/155.jpg)
USI Secondary and Primary Index Quiz Answers
“Choice, not chance, determines destiny.”
– Anonymous
Check your answers and see if you made the right choice.
![Page 156: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/156.jpg)
![Page 157: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/157.jpg)
A Full Table Scan Example
“Those who dance are considered insane by
those who cannot hear the music.”
– George Carlin
The next page is designed to show that a Full Table Scan will be performed when a Non-
Indexed Column is used by itself in the WHERE clause. We will soon create a Non-
Unique Secondary Index on this column, but first perform the Full Table Scan.
Please make a note of it!
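For contrast with the index lookups, here is a hedged Python sketch of what a Full Table Scan amounts to: every AMP reads every one of its rows and tests the non-indexed column. The rows, the column names, and the function are invented for the illustration. Last_Name plays the Primary Index and First_Name plays the non-indexed column.

```python
# Invented data: {amp: [(last_name, first_name), ...]}.
AMPS = {
    1: [("Adams", "Mary"), ("Vey", "John")],
    2: [("Jones", "John"), ("Smith", "Ann")],
    3: [("Brown", "Mary")],
}

FIRST_NAME = 1   # position of the non-indexed column in each row


def full_table_scan(amps, column, wanted):
    """Return (matches, rows_read); every row on every AMP is examined."""
    matches = []
    rows_read = 0
    for amp, rows in amps.items():
        for row in rows:
            rows_read += 1            # no index, so nothing can be skipped
            if row[column] == wanted:
                matches.append((amp, row))
    return matches, rows_read


print(full_table_scan(AMPS, FIRST_NAME, "Mary"))
```

The number of rows read always equals the number of rows in the table, no matter how few rows qualify.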
![Page 158: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/158.jpg)
![Page 159: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/159.jpg)
The Base Table
"He who controls the past commands the
future. He who commands the future
conquers the past."
– George Orwell
The next page is designed to merely remind you that we have two types of tables. Those
are the Base Tables that hold the actual data that the users query against and the
Secondary Index Subtables designed to point to the real Row-ID of the base table.
Once again I want you to notice the Row-IDs at the front of each row in the Base Table.
Remember how those were derived? They were derived when the row was originally
loaded. The PE hashed the Primary Index Column (Last_Name) Value and that came up
with a 32-bit Row Hash. It then counted over the appropriate number of buckets in the
Hash Map that corresponded to the Row Hash, and inside the bucket was the Destination
AMP for that row. Once the row and the Row Hash went to the AMP, the AMP itself
placed a 32-bit Uniqueness Value behind the Row Hash. The Row Hash plus the
Uniqueness Value make up a row’s Row-ID.
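A minimal Python sketch of how a Row-ID comes together on the AMP follows. It assumes the Uniqueness Value simply counts up from 1 for each Row Hash on that AMP, which is an assumption of this example. The `Amp` class, the hash values, and the sample rows are likewise made up.

```python
from collections import defaultdict


class Amp:
    """One AMP's slice of a table, with rows kept in Row-ID order."""

    def __init__(self):
        self.rows = []                        # [(row_id, row), ...]
        self._last_uniq = defaultdict(int)    # row_hash -> last value used

    def store(self, row_hash, row):
        """Attach a Uniqueness Value to the Row Hash and store the row."""
        # Assumption: the counter starts at 1 and goes up per Row Hash.
        self._last_uniq[row_hash] += 1
        row_id = (row_hash, self._last_uniq[row_hash])
        self.rows.append((row_id, row))
        self.rows.sort(key=lambda entry: entry[0])   # keep Row-ID order
        return row_id


amp = Amp()
print(amp.store(0b000011110, ("Vey", "Lyndon")))
print(amp.store(0b000011110, ("Vey", "Maria")))    # same NUPI value again
print(amp.store(0b000001100, ("Jones", "Squiggy")))
```

Two rows with the same Last_Name get the same Row Hash but different Uniqueness Values, so every Row-ID on the AMP stays distinct.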
![Page 160: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/160.jpg)
![Page 161: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/161.jpg)
Creating a Non-Unique Secondary Index (NUSI)
“Life is not the candle or the wick, it's the
burning.”
– David Joseph Schwartz
Get heated up and get ready to glow because we have just put the SQL Syntax into
Teradata to CREATE a Non-Unique Secondary Index. It really isn’t apparent that this is
a Non-Unique Secondary Index, but it is. The word NON is never used in Teradata, but
the word UNIQUE is often used, so a secondary index created without the word UNIQUE
is Non-Unique.
Once the SQL to CREATE the Secondary Index is successfully completed by Teradata, a
Subtable is created on every AMP. Pay close attention to the next couple of pages
because a NUSI is handled much differently than a USI, as you will soon understand.
![Page 162: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/162.jpg)
![Page 163: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/163.jpg)
Columns inside a NUSI Secondary Index Subtable
“Darkness cannot drive out darkness; only
light can do that. Hate cannot drive out
hate; only love can do that.”
– Martin Luther King, Jr.
Inside the NUSI Subtable reside two columns. They are the column values of the NUSI
and the real Row-ID of the row in the base table. These are exactly the same two columns
that were in the USI Subtable. Remember that the entire purpose of the NUSI or the USI
Subtable is to point to the real row in the Base Table. This pointing is done by capturing
the row's Row-ID.
The big difference between the USI and the NUSI Subtable is that the USI Subtable rows
are Hashed and the NUSI subtable rows are AMP-Local. Read on my friend!
![Page 164: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/164.jpg)
![Page 165: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/165.jpg)
NUSI Subtable is AMP-Local
“We never know the worth of water
'til the well is dry.”
– English Proverb
The NUSI Subtable is always AMP-Local. What does that term mean? Let me first
again explain how USIs are Hashed and then how NUSIs are AMP-Local.
When a USI Subtable is created, each USI value for each row in the Base Table is
Hashed and sent to the AMP the Hash Formula and Hash Map dictate. Most often the
Base Row and the Subtable Row end up on different AMPs. The great news is that the
Parsing Engine's plan is always a Two-AMP operation. This can be done because a
Unique Secondary Index is UNIQUE, which obviously means there can only be one row
returned.
A NUSI is a Non-Unique Secondary Index, which just as obviously means that the value is
Non-Unique and there could be thousands, millions or even billions of duplicates. So the
Parsing Engine takes on a different strategy when building the NUSI Subtable. Each row
in the Subtable only tracks the Base rows on the same AMP. This is what is meant by
AMP-Local.
On the following page you can see that the AMP labeled "A Typical AMP" holds two
base rows of the Employee_Table. The First_Name column, which is the column we
created the NUSI on, holds two values on this AMP: 'Rakish' and 'Vu'.
So in this typical AMP's Subtable there will be two rows tracking 'Rakish' and 'Vu'. A
NUSI Subtable is always created on each AMP, but each AMP's Subtable holds only the
values found in the base rows on that AMP.
Now you know what the term AMP-Local means.
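Here is a minimal sketch of the difference between a Hashed USI Subtable and an AMP-Local NUSI Subtable. It is only an illustration: the hash is a stand-in (`zlib.crc32`), the Row-ID is simplified to an (AMP number, position) pair, and the last names and the `Employee_No` USI column are hypothetical values that do not come from the example table.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 4  # hypothetical system size

def amp_for(value: str) -> int:
    """Toy stand-in for the Hash Formula plus the Hash Map lookup."""
    return zlib.crc32(value.encode("utf-8")) % NUM_AMPS

employees = [
    {"Employee_No": "1001", "Last_Name": "Patel", "First_Name": "Rakish"},
    {"Employee_No": "1002", "Last_Name": "Nguyen", "First_Name": "Vu"},
    {"Employee_No": "1003", "Last_Name": "Smith", "First_Name": "Rakish"},
    {"Employee_No": "1004", "Last_Name": "Jones", "First_Name": "Mary"},
]

# Base table: rows are distributed by the Primary Index (Last_Name).
base = defaultdict(list)            # AMP number -> list of (row_id, row)
for row in employees:
    amp = amp_for(row["Last_Name"])
    row_id = (amp, len(base[amp]))  # simplified Row-ID for illustration
    base[amp].append((row_id, row))

# USI Subtable: each index row is hashed on the USI value, so it may
# land on a different AMP than the base row it points to.
usi_subtable = defaultdict(dict)    # AMP number -> {Employee_No: row_id}
for rows in base.values():
    for row_id, row in rows:
        usi_subtable[amp_for(row["Employee_No"])][row["Employee_No"]] = row_id

# NUSI Subtable: each AMP indexes only the base rows it already owns.
nusi_subtable = defaultdict(lambda: defaultdict(list))
for amp, rows in base.items():
    for row_id, row in rows:
        nusi_subtable[amp][row["First_Name"]].append(row_id)
```

In this model every Row-ID stored in an AMP's NUSI Subtable points back to that same AMP, while a USI Subtable row is free to sit anywhere.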
![Page 166: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/166.jpg)
![Page 167: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/167.jpg)
A Query using the NUSI Column
“Everyone is kneaded out of the same dough
but not baked in the same oven.”
– Yiddish Proverb
On the following page you will see we are running SQL and using the First_Name
column in our WHERE clause. This will usually cause the Parsing Engine to use the
NUSI Index, but not always. Sometimes the Parsing Engine will decide it is faster to do
a Full Table Scan. This is dependent on three things:
1) If the NUSI is weakly or strongly selective. An example of something that is
weakly selective might be this. Imagine you are in a large room with hundreds of
people and you ask, "How many people here usually eat dinner every evening?"
The answer would be everyone! Here is an example of a strongly selective index.
You now ask the same large room of people, "How many of you were born in
Russia, are a twin, and only speak French?" Very few hands, if any, would go up.
If the NUSI is strongly selective it will be used by the Parsing Engine.
2) If the table is small it is sometimes faster to just do a Full Table Scan.
3) If the DBA collected statistics. We will talk about Statistics in the future, but the
short answer is that the DBA will usually collect statistics on all Non-Unique
columns for a table so the Parsing Engine knows if the Index is strongly or
weakly selective.
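To put a number on "weakly" versus "strongly" selective, here is a small sketch that computes the fraction of rows matching a value. The column contents are hypothetical, and this is not the Parsing Engine's actual costing logic, which is not described here; it only shows the kind of count that collected statistics make available.

```python
from collections import Counter

def selectivity(values, target):
    """Fraction of rows whose indexed column equals target.

    A small fraction means strongly selective; a large one means weakly selective.
    """
    counts = Counter(values)
    return counts[target] / len(values)

# Hypothetical First_Name column: one 'Rakish' among 1,000 rows.
first_names = ["Rakish"] + ["Mary"] * 999

strong = selectivity(first_names, "Rakish")   # 1 row out of 1,000
weak = selectivity(first_names, "Mary")       # 999 rows out of 1,000
```

A value that matches one row in a thousand is a good candidate for the index; a value that matches nearly every row is the "who eats dinner" question, where reading the whole table costs about the same.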
![Page 168: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/168.jpg)
![Page 169: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/169.jpg)
A Query using the NUSI Column
“If all my possessions were taken from me
but one, I would choose to keep the power of
speech, for with it I could soon
regain all the rest.”
– Daniel Webster
A NUSI query is always an All-AMP operation, but not a Full Table Scan. In our query
example on the previous page we selected all columns WHERE the First_Name was
equal to 'Rakish'. There could potentially be one 'Rakish' or multiple people named
'Rakish' on every AMP, one AMP, no AMPs or some AMPs. That is why each NUSI
Subtable is AMP-Local. Now the PE can use this two-step strategy.
1) Each AMP searches its own AMP-Local NUSI Subtable and checks whether it
has one or more individuals named 'Rakish'.
2) Only the AMPs that found 'Rakish' in their Subtable will retrieve the 'Rakish'
rows from the portion of the Employee_Table Base Table that they own.
That is why a NUSI query always involves All-AMPs, but it is not a Full Table Scan.
Think about this! Imagine we had a system with 100 AMPs. Now let's say there was
only one 'Rakish' found on AMP 99. How many Binary Searches would there be in this
query? Well, in step 1 each AMP would perform a Binary Search on its AMP-Local
NUSI Subtable so this would mean 100 Binary Searches, because there are 100 AMPs.
Then only AMP 99 would find a 'Rakish' in its Subtable and so only AMP 99 would
have to perform a Binary Search in its Base Table, so the final answer is 101.
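The 101 count can be checked with a small simulation. This is a sketch under simplifying assumptions: each of the 100 AMPs holds a single hypothetical employee, the Row-ID is reduced to an (AMP number, 1) pair, and Python's `bisect` module plays the part of the Binary Search.

```python
from bisect import bisect_left

NUM_AMPS = 100
searches = 0   # running count of Binary Searches

def binary_search(sorted_keys, key):
    """Return the position of key in sorted_keys, or None; counts each call."""
    global searches
    searches += 1
    pos = bisect_left(sorted_keys, key)
    return pos if pos < len(sorted_keys) and sorted_keys[pos] == key else None

# Every AMP holds one hypothetical employee; only AMP 99 holds 'Rakish'.
amps = []
for amp_no in range(NUM_AMPS):
    name = "Rakish" if amp_no == 99 else f"Name{amp_no:03d}"
    row_id = (amp_no, 1)
    amps.append({
        "nusi_keys": [name],          # NUSI Subtable values, kept sorted
        "nusi_row_ids": [row_id],     # Row-ID stored beside each value
        "base_keys": [row_id],        # base rows, sorted by Row-ID
        "base_rows": [{"First_Name": name}],
    })

answer = []
for amp in amps:
    # Step 1: every AMP searches its own AMP-Local NUSI Subtable.
    hit = binary_search(amp["nusi_keys"], "Rakish")
    if hit is None:
        continue
    # Step 2: only an AMP with a hit searches its Base Table by Row-ID.
    pos = binary_search(amp["base_keys"], amp["nusi_row_ids"][hit])
    answer.append(amp["base_rows"][pos])
```

After the loop, `searches` is 101: one hundred Subtable searches in step 1 plus the single Base Table search on AMP 99.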
![Page 170: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/170.jpg)
![Page 171: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/171.jpg)
NUSI Recap
“He conquers who endures.”
– Persius 34 AD - 62 AD
A NUSI query is always an All-AMP operation, but not a Full Table Scan. The
following page sums all of this up.
![Page 172: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/172.jpg)
![Page 173: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/173.jpg)
Secondary Index Summary
“The mind is not a vessel to be filled but a
fire to be kindled.”
– Plutarch 40 - 120 AD
The next page sums up the USI and the NUSI secondary indexes. Remember that USI
rows are hashed and NUSI rows are AMP-Local. Also remember that a USI query is
always a two-AMP operation. A NUSI query is an All-AMP operation, but not a full
table scan.
A USI query is much faster than a NUSI query! The Parsing Engine will use a USI at a
moment's notice, but it will not always choose to use a NUSI. Sometimes it will choose
a Full Table Scan over a NUSI.
The Parsing Engine will never choose a Full Table Scan over a USI.
![Page 174: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/174.jpg)
![Page 175: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/175.jpg)
Test Your Teradata Access Query Knowledge
Fill in the chart on the next page. Good luck.
![Page 176: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/176.jpg)
![Page 177: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/177.jpg)
![Page 178: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/178.jpg)
![Page 179: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/179.jpg)
An Incredible Quiz Opportunity
Your mission if you decide to accept it is to answer the multiple choice question on the
following page. Should you be killed, captured, or get the answer wrong you will be
disavowed. Be careful here because this is trickier than it looks. Use your knowledge of
Teradata and think. The answer will really get you thinking and understanding how to
tune Teradata for your applications.
![Page 180: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/180.jpg)
![Page 181: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/181.jpg)
![Page 182: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/182.jpg)
An Incredible Quiz Opportunity
Take a look at the answer. It is D. That is because a NUPI is a one-AMP operation and
a USI is only a two-AMP operation. What a great combination.
Did you fall for the answer of C? That UPI just invites you because of your tendency to
want the data to spread completely evenly. You want good distribution, but not if you
have to use Full Table Scans or extra AMPs to satisfy your user base. As long as the
distribution is reasonable a NUPI is perfectly acceptable. In this case D is the right way
to go! Are your eyes beginning to open on how Teradata works? I am pushing you hard
in the right direction. Stick with me because we are going to the mountain top together.
![Page 183: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/183.jpg)
![Page 184: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/184.jpg)
![Page 185: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/185.jpg)
![Page 186: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/186.jpg)
![Page 187: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/187.jpg)
A Table used for our Partitioning Example
“The entire sum of existence is the magic of
being needed by just one other person.”
– Vi Putnam
The next page shows a table that will be used to show how Teradata Partitioning works.
We will take this table and show it in a Non-Partitioned fashion and then Partition the
table and show how Teradata runs certain queries faster.
All I want you to notice right now is the column Order_Date. Notice that we have dates
in both January and February.
![Page 188: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/188.jpg)
![Page 189: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/189.jpg)
Range Queries
“Cowards die many times before their
deaths; the valiant never taste
of death but once.”
– William Shakespeare
The next page shows our Order_Table spread across the AMPs. Notice that I have color
coded the January and February dates. Also notice that January and February dates are
mixed on every AMP in what is a random order. This is because the Primary Index is
Order_Number. So, the January dates are most likely on every AMP and so are the
February dates!
I also want you to take notice of the query. We are looking for all orders in January.
Remember that!
The query on the next page is called a Range Query because it uses the keyword
BETWEEN. The BETWEEN keyword in Teradata means find everything in the range
BETWEEN this date and this other date.
The BETWEEN operator is said to be inclusive. If someone said to me, "Tell me what is
BETWEEN the numbers 8 and 10," I would normally say, "The number 9". In Teradata
land I would be wrong because BETWEEN is inclusive, so it INCLUDES
the starting and ending numbers. What is BETWEEN 8 and 10? The numbers 8, 9 and
10!
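The inclusive behavior is easy to mimic. The sketch below is plain Python rather than Teradata SQL, and the order dates are hypothetical, but the comparison is the same one described above: both end points count.

```python
from datetime import date

def between(value, low, high):
    """Inclusive range test: both the starting and ending values match."""
    return low <= value <= high

# What is BETWEEN 8 and 10? The numbers 8, 9 and 10.
matches = [n for n in range(1, 13) if between(n, 8, 10)]

# The same test applied to hypothetical order dates for a January Range Query.
order_dates = [date(2010, 1, 1), date(2010, 1, 31), date(2010, 2, 1)]
january = [d for d in order_dates
           if between(d, date(2010, 1, 1), date(2010, 1, 31))]
```

Here `matches` holds 8, 9 and 10, and `january` keeps both the first and the last day of the month while dropping the February date.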
Partitioned tables work very well on Range Queries using the keyword BETWEEN.
Turn the next couple of pages and you will soon see WHY!
We will next discuss what a Partitioned Table is all about!
![Page 190: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/190.jpg)
![Page 191: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/191.jpg)
Why we had to perform a Full Table Scan
“Reality is wrong.
Dreams are for real.”
– Tupac Shakur
The next page shows our Order_Table spread across the AMPs. Notice that I have color
coded the January and February dates. Also notice that January and February dates are
mixed on every AMP in what is a random order. Because the January Data is on all
AMPs and because the January Dates are randomly mixed we have to do a Full Table Scan.
We had no indexes on Order_Date so it is obvious the PE will command the AMPs to do
a Full Table Scan, but soon we will Partition the table and prevent the Full Table Scan.
This brings me to a great point I want you to remember.
We partition tables so we won't have to do a Full Table Scan on our Range Queries!
![Page 192: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/192.jpg)
![Page 193: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/193.jpg)
A Partitioned Table
“A good head and a good heart are always
a formidable combination.”
– Nelson Mandela
Notice the example of AMPs on the top of the page. This table is not partitioned.
Now notice the example of AMPs on the bottom of the page. This table is partitioned.
This is a very important point so I want to drive it home. The only difference between a
Partitioned table and a Non-Partitioned table is how each AMP sorts its rows for a table.
We have learned that each AMP always sorts its rows by the Row-ID in order to do a
Binary Search on Primary Index queries.
Well, a Partitioned Table will have the AMPs first sort their rows by the Partition.
Notice that the rows on an AMP don't change AMPs because the table is partitioned.
Remember it is the Primary Index alone that will determine which AMP gets a row. If
the table is partitioned then the AMP will sort its rows by the partition.
What is great about this? The January rows are at the top on each AMP and the February
rows are at the bottom. We won't have to do a Full Table Scan on our Range Query
now! If we are looking for all orders in January then each AMP only has to read from
its January Partition at the top of its rows!
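The only thing that changes on an AMP is the sort order, which a short sketch can show. The rows, row hash values, and dates below are hypothetical, and the month number stands in for the partition.

```python
from datetime import date

# Rows one hypothetical AMP owns: (row_hash, Order_Number, Order_Date).
amp_rows = [
    (0x1A, 1007, date(2010, 2, 3)),
    (0x2B, 1002, date(2010, 1, 15)),
    (0x3C, 1011, date(2010, 2, 20)),
    (0x4D, 1004, date(2010, 1, 2)),
]

# Non-Partitioned: the AMP sorts by Row-ID only (just the row hash here).
non_partitioned = sorted(amp_rows, key=lambda r: r[0])

# Partitioned by month: the AMP sorts by partition first, then by Row-ID.
partitioned = sorted(amp_rows, key=lambda r: (r[2].month, r[0]))

# The same rows stay on the same AMP; only their order changes.
january = [r for r in partitioned if r[2].month == 1]
```

In `non_partitioned` the months come out mixed (February, January, February, January), while in `partitioned` both January rows sit at the top, so a January query can stop reading after the first two rows.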
![Page 194: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/194.jpg)
![Page 195: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/195.jpg)
A Partitioned Table
“A man who views the world at 50 the same
as he did at 20 has wasted 30 years of his
life.”
– Muhammad Ali
Notice the example on the next page. We are running our Range Query on our
Partitioned Table to demonstrate visually how Teradata has all AMPs participate, but
each AMP only reads from one partition. The Parsing Engine no longer has to instruct
the AMPs to do a Full Table Scan. It instructs the AMPs to each read from their January
Partition.
Remember what we said earlier? A Partitioned Table is designed to eliminate a Full
Table Scan, especially on Range Queries.
![Page 196: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/196.jpg)
![Page 197: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/197.jpg)
One Year of Orders Partitioned
“Do not remove a fly from your friend's
forehead with a hatchet.”
- Chinese Proverb
Notice the example on the next page. This is a great visual picture of exactly how a
Partitioned Table might look in a real environment. Notice that each AMP holds dates
for the entire year, but each AMP sorts the rows in Month order.
For Range Queries or even queries that only query a certain month, Teradata can use
what they call Partition Elimination and only read certain partitions to satisfy the query.
Get it in your mind that Partitioning is only about each AMP sorting their rows!
![Page 198: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/198.jpg)
![Page 199: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/199.jpg)
Fundamentals of Partitioning
“Nobody believes the official spokesman…
but everybody trusts an unidentified
source.”
- Ron Nessen
Take a look at the statements on the next page. Take your time and take them in! The
points I really want you to notice are that it is the Primary Index that determines which
AMP gets a particular row and that Partitioning doesn't affect distribution. Partitioning
only affects how each AMP sorts the rows it gets!
![Page 200: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/200.jpg)
![Page 201: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/201.jpg)
Add the Partition to the Row-ID for the Row Key
“I was a vegetarian until I started leaning
towards the sunlight.”
- Rita Rudner
It took a little while for you to digest the Row-ID. Well now you need to know that if a
table is partitioned, the partition number is placed in front of the Row-ID for each row.
This combination of the Partition number, Row-Hash, and Uniqueness value is now
called the ROW KEY. Instead of sorting by the Row-ID alone, we are merely sorting first
by the Partition Number. We are really just sorting by the Row Key!
If a table is NOT partitioned the Partition Number is merely set to ZERO!
Notice on the next page that when our Typical AMP sorts by the Row Key (in other
words, the Partition first and then the Row-ID) the January dates are at the top and the
February dates follow the January dates.
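A Row Key can be modeled as a three-part tuple, and sorting those tuples gives exactly the order described above. This is a sketch with made-up values; the `read_partition` helper is hypothetical and simply shows why sorted Row Keys let an AMP jump straight to one partition.

```python
from bisect import bisect_left

# Row Key = (partition number, row hash, uniqueness value); values are hypothetical.
# Partition 1 is January and partition 2 is February in this toy example.
row_keys = sorted([
    (2, 0x1A, 1),
    (1, 0x2B, 1),
    (2, 0x3C, 1),
    (1, 0x4D, 1),
    (1, 0x2B, 2),
])

def read_partition(keys, partition):
    """Return only the slice of sorted Row Keys that belongs to one partition."""
    start = bisect_left(keys, (partition,))
    end = bisect_left(keys, (partition + 1,))
    return keys[start:end]

january = read_partition(row_keys, 1)
february = read_partition(row_keys, 2)
```

Because the partition number leads the key, the January rows form one contiguous run at the top and the February rows follow, so neither read touches the other partition's rows.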
![Page 202: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/202.jpg)
![Page 203: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/203.jpg)
You Partition a Table when you CREATE the Table
“Whenever you are asked if you can do a
job, tell 'em “Certainly I can!” Then get
busy and find out how to do it”
- Franklin D. Roosevelt
The next page shows the syntax to CREATE a partitioned table. Please don't assume you
can only partition a table when you CREATE it. You can actually CREATE a normal
table first and later ALTER the table, but generally you partition a table when you first
create it.
I want you to notice the Primary Index statement. Our Primary Index for this example is
a NUPI on Order_No, but we are partitioning on Month of Order_Date.
![Page 204: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/204.jpg)
![Page 205: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/205.jpg)
RANGE_N Partitioning by Week
“Examine what is said, not him who
speaks.”
- Arab Proverb
In the example on the next page we are showing a partition example of RANGE_N. I
want you to notice the last line in the CREATE statement. This is where you tell
Teradata whether to partition by day, week, or month. In this example we are
partitioning by day. Each day from the starting date range to the ending date range will
![Page 206: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/206.jpg)
![Page 207: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/207.jpg)
RANGE_N Partitioning Older and Newer Data
“Only the Spoon knows what is stirring in
the pot.”
– Sicilian Proverb
What a great example on the next page to stir your imagination. This table contains older
data and more recent data. We are partitioning the older data by month and the newer
data by day. Wow! Now we're cooking!
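To picture how one table can hold monthly partitions for older data and daily partitions for newer data, here is a toy Python version of a range-to-partition-number mapping. The date ranges are hypothetical, the function is not Teradata's RANGE_N itself, and it assumes every MONTH range starts on the first day of a month.

```python
from datetime import date

def range_n(value, ranges):
    """Toy range mapping: return a 1-based partition number, or None if no range matches.

    ranges is a list of (start, end, unit) tuples with unit 'MONTH' or 'DAY'.
    Partitions are numbered consecutively across all ranges.
    """
    offset = 0
    for start, end, unit in ranges:
        if unit == "MONTH":
            size = (end.year - start.year) * 12 + (end.month - start.month) + 1
            index = (value.year - start.year) * 12 + (value.month - start.month)
        else:  # 'DAY'
            size = (end - start).days + 1
            index = (value - start).days
        if start <= value <= end:
            return offset + index + 1
        offset += size      # skip past every partition of this range
    return None

# Hypothetical layout: 2009 (older data) by month, 2010 (newer data) by day.
ORDER_RANGES = [
    (date(2009, 1, 1), date(2009, 12, 31), "MONTH"),
    (date(2010, 1, 1), date(2010, 12, 31), "DAY"),
]

older = range_n(date(2009, 12, 31), ORDER_RANGES)   # last monthly partition
newer = range_n(date(2010, 1, 1), ORDER_RANGES)     # first daily partition
```

With this layout the older year collapses into 12 partitions and the newer year spreads over 365, so `older` is 12 and `newer` is 13.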
![Page 208: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/208.jpg)
![Page 209: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/209.jpg)
Case_N Partitioning
“A man who views the world at 50 the same
as he did at 20 has wasted 30 years of his
life.”
– Muhammad Ali
We are partitioning by CASE_N in the next example. This is just like any CASE
statement in programming or SQL. In the example I want you to notice that if an
Order_Total for a row is less than $1,000.00 it will go into the first partition. If it falls
between $1,000.00 and $4,999.99 it will go into partition 2. If it is between $5,000.00 and
$9,999.99 it will fall into partition 3 and so on.
I also need you to pay close attention to the UNKNOWN partition and the NO CASE
partition. The UNKNOWN Partition is for an Order_Total with a NULL value. The NO
CASE Partition is for rows that did not meet any of the CASE criteria. For example, if an
Order_Total is greater than $20,000.00 it wouldn't fall into any of the partitions so it goes
to the NO CASE partition.
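The CASE_N logic above reads naturally as a chain of tests where the first true condition wins. The sketch below is a Python stand-in, not Teradata syntax; the first three boundaries follow the text, while the fourth (below $20,000.00) is an assumption filling in the "and so on".

```python
def case_n(order_total):
    """Toy CASE_N over Order_Total with NO CASE and UNKNOWN partitions."""
    if order_total is None:
        return "UNKNOWN"             # a NULL Order_Total
    conditions = [
        order_total < 1000,          # partition 1
        order_total < 5000,          # partition 2
        order_total < 10000,         # partition 3
        order_total < 20000,         # partition 4 (hypothetical boundary)
    ]
    for number, condition in enumerate(conditions, start=1):
        if condition:
            return number            # the first condition that is true wins
    return "NO CASE"                 # no condition matched

examples = {total: case_n(total) for total in (500.00, 4999.99, 5000.00, 25000.00, None)}
```

An order of $500.00 lands in partition 1, $4,999.99 in partition 2, $5,000.00 in partition 3, $25,000.00 in NO CASE, and a NULL total in UNKNOWN.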
Important note: It is an excellent idea to have a NO CASE and UNKNOWN partition.
More important note: You do not want to include the UNKNOWN or NO RANGE
partitions with dates in a RANGE_N partition. I will explain later in detail, but it is
because when you delete a partition from the RANGE_N partitions, its rows will go to the
NO RANGE or UNKNOWN partitions. This takes a long time and is usually not wanted
anyway.
![Page 210: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/210.jpg)
![Page 211: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/211.jpg)
Multi-Level Partitioning
“Two roads diverged in a wood and I took
the one less traveled by, and that has made
all the difference.”
– Robert Frost
Teradata introduced Multi-Level partitioning in Teradata V12. You can have up to 15
levels of partitions within partitions. I want you to remember that a Partitioned Table
merely tells each AMP how to sort their rows for the table. So think of Multi-Level
partitioning as a table with multiple sort keys. The first partition statement is how the
data is sorted first. The second partition statement is the second sort key.
Think of a simple sorting of an answer set. Let's imagine we sorted an Employee_Table
by Department_Number first. Then we sorted by Last_Name within each
Department_Number.
That is similar to what we are doing on the next page. We first partition by day. Then
within each day we are partitioning by our CASE_N statement. Each AMP will have
each day sorted first on its disk and then within each day the data will be sorted with
the lower Order_Total values first.
This is really getting down to a granular form. The entire purpose of partitioning is to
eliminate the Full Table Scan. Instead of reading all rows in a table each AMP merely
has to read one or more of its partitions.
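The "multiple sort keys" idea maps directly onto sorting by a tuple. In this sketch the rows and the `total_band` helper are hypothetical; the day is the first-level partition and the Order_Total band is the second.

```python
from datetime import date

rows = [
    {"Order_Date": date(2010, 1, 2), "Order_Total": 7500.00},
    {"Order_Date": date(2010, 1, 1), "Order_Total": 12000.00},
    {"Order_Date": date(2010, 1, 2), "Order_Total": 250.00},
    {"Order_Date": date(2010, 1, 1), "Order_Total": 999.99},
]

def total_band(total):
    """Hypothetical second-level partition: 1-based band by Order_Total."""
    for number, limit in enumerate((1000, 5000, 10000, 20000), start=1):
        if total < limit:
            return number
    return 5    # stand-in for a NO CASE partition

# Level 1 is the day; level 2 is the Order_Total band within that day.
on_disk = sorted(rows, key=lambda r: (r["Order_Date"], total_band(r["Order_Total"])))

layout = [(r["Order_Date"].day, total_band(r["Order_Total"])) for r in on_disk]
```

The resulting `layout` is day 1 band 1, day 1 band 4, day 2 band 1, day 2 band 3: each day is grouped together, and within a day the lower Order_Total bands come first.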
![Page 212: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/212.jpg)
![Page 213: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/213.jpg)
Partitioning Rules
“I find that the harder I work, the more luck
I seem to have.”
– Thomas Jefferson
Check out the fundamental rules of partitioning on the next page.
![Page 214: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/214.jpg)
![Page 215: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/215.jpg)
See the data
“He who walks in another's tracks leaves no
footprints.”
- Joan L. Brannon
Check out the fundamental rules of partitioning on the next page.
![Page 216: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/216.jpg)
![Page 217: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/217.jpg)
Test Your Teradata Access Knowledge
“The superior man is modest in his speech,
but exceeds in his actions.”
- Confucius, 551 BC -479 BC
Test your knowledge on the next page and make me proud!
![Page 218: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/218.jpg)
![Page 219: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/219.jpg)
![Page 220: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/220.jpg)
![Page 221: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/221.jpg)
![Page 222: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/222.jpg)
![Page 223: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/223.jpg)
The most Powerful USER
“It's time for the human race to enter the
Solar System.”
- Dan Quayle
Dan Quayle, Vice President during George H. W. Bush's presidency, never made it to
the top of the human race, because he was never president. Dan Quayle never made it
to the top of the Teradata hierarchy because his name wasn't DBC.
The first Teradata machine ever built was called the DBC/1012. DBC stood for
DataBase Computer and the 1012 represented ten to the 12th power, which happens to be
a Terabyte.
In honor of that first Teradata machine, every system has only one USER
when it first arrives, and that user's name is DBC.
DBC has been the most powerful USER from the beginning of Teradata time (1984).
Whoever is assigned to be the USER DBC will have all the power. DBC will create
other DATABASES and USERS and the hierarchy begins.
It is time for the human race to enter the Teradata System!
![Page 224: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/224.jpg)
![Page 225: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/225.jpg)
DBC owns all the Disk Space
“Too much of a good thing is just right.”
– Mae West
When the system arrives DBC owns all the Disk Space. Each AMP will have one virtual
disk, really four physical disks, which that AMP can read and write, but no AMP
can read from or write to another AMP's virtual disk. Add up all the AMPs' disks and
you will know how much space DBC originally owns.
This space is called PERMANENT Space, or PERM SPACE.
Think of PERM SPACE like you might think of money. If DBC has 1000 GBs and gives
another database or user 100 GBs then DBC only has 900 GBs left. Just like money, too
much of a good thing is just right. Remember, "A fool and his PERM Space are soon
parted".
![Page 226: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/226.jpg)
![Page 227: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/227.jpg)
DBC Example of 1000 GBs
“It’s kind of fun to do the impossible.”
- Walt Disney
DBC owns all the PERM Space when the system first arrives. DBC is a user who has a
logon and password. DBC will calculate how much PERM space is in the system when
each AMP reports the size of its virtual disk. DBC will begin creating another USER
or DATABASE and the Parent/Child hierarchy is started.
Remember that PERM Space is like money. You have 1000 GBs to start, but if you give
it away you lose it. DBC will never give all the space away, but it will typically give
away about 80%.
DBC will first CREATE a USER or a DATABASE
“If you shoot at mimes, should you use a
silencer?”
- Steven Wright
In our example on the next page DBC has created a USER named Mary. DBC also
created two DATABASES called Sales and MRKT. DBC originally had 1000 GBs, but
in creating Mary, Sales, and MRKT, the user DBC gave each of them 100 GBs of PERM
Space. Now DBC only has 700 GBs of PERM Space.
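To make the money analogy concrete, here is a minimal Python sketch of the example. It is only a bookkeeping model, not Teradata code. The class SpaceOwner and its method create_child are made-up names for illustration.

```python
# Hypothetical model of PERM Space bookkeeping; not Teradata code.
class SpaceOwner:
    """A USER or DATABASE that holds PERM Space, measured in GBs."""

    def __init__(self, name, perm_gb=0):
        self.name = name
        self.perm_gb = perm_gb
        self.children = []

    def create_child(self, name, perm_gb):
        """Create a child and move perm_gb from this owner to the child."""
        if perm_gb > self.perm_gb:
            raise ValueError(f"{self.name} has only {self.perm_gb} GBs to give")
        self.perm_gb -= perm_gb
        child = SpaceOwner(name, perm_gb)
        self.children.append(child)
        return child


dbc = SpaceOwner("DBC", perm_gb=1000)
for child_name in ("Mary", "Sales", "MRKT"):
    dbc.create_child(child_name, 100)

print(dbc.perm_gb)  # 700
```

Notice that the space is moved, never copied. DBC plus its three children still add up to the original 1000 GBs.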
Teradata is Hierarchical
“When they discover the center of the
universe, a lot of people will be disappointed
to discover they are not in it.”
- Bernard Bailey
In this Teradata Universe you can see that DBC is at the top of the Hierarchy. It will
always stay that way. Under DBC you should be able to see that Mary, Sales, and
MRKT were CREATED by DBC. Three USERS named Sam, Don, and Bo were added to
MRKT. Sam then went and CREATED three users named VU, Jane and Jusn.
Anyone above you in the hierarchy is a parent. For instance, the parents of VU are Sam,
MRKT, and DBC. The immediate Parent of VU is Sam.
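The parent rule can be sketched in a few lines of Python. This is just an illustration of walking up the hierarchy. The dictionary IMMEDIATE_PARENT and the function parents_of are hypothetical names, and the entries simply mirror the example hierarchy.

```python
# Hypothetical lookup table: each object maps to its immediate Parent.
IMMEDIATE_PARENT = {
    "Mary": "DBC", "Sales": "DBC", "MRKT": "DBC",
    "Sam": "MRKT", "Don": "MRKT", "Bo": "MRKT",
    "VU": "Sam", "Jane": "Sam", "Jusn": "Sam",
}


def parents_of(name):
    """Return every parent of name, nearest first, ending at DBC."""
    chain = []
    while name in IMMEDIATE_PARENT:
        name = IMMEDIATE_PARENT[name]
        chain.append(name)
    return chain


print(parents_of("VU"))  # ['Sam', 'MRKT', 'DBC']
```

The first entry in the list is the immediate Parent. DBC has no entry in the table because nothing sits above it.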
Only two Objects can Receive PERM Space
“A lot of people approach risk as if it’s the
enemy when it’s really fortune’s
accomplice.”
- Sting
Only a DATABASE or a USER can have PERM Space. When a user or database is
created it is given its PERM Space. Other objects such as tables can be created
under a database or user. If a user has 100 GBs of space then they can create tables with
data that combined take up a maximum of 100 GBs of space. Once a database or user
has used up their PERM space, they cannot add any more data to the tables they own.
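Here is a small Python sketch of that rule, assuming a user with 100 GBs of PERM. The table sizes and the function names remaining_perm_gb and can_load are invented for the illustration.

```python
def remaining_perm_gb(perm_limit_gb, table_sizes_gb):
    """PERM Space left after adding up every table the owner holds."""
    return perm_limit_gb - sum(table_sizes_gb)


def can_load(perm_limit_gb, table_sizes_gb, new_data_gb):
    """True if new_data_gb still fits inside the owner's PERM Space."""
    return new_data_gb <= remaining_perm_gb(perm_limit_gb, table_sizes_gb)


tables = [40, 35, 20]                  # hypothetical table sizes, 95 GBs in total
print(remaining_perm_gb(100, tables))  # 5
print(can_load(100, tables, 5))        # True
print(can_load(100, tables, 6))        # False
```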
Only difference between a User and a Database
“Everyone is trying to accomplish
something big, not realizing that life is made
up of little things.”
-Frank A. Clark
A USER and a DATABASE are considered the same in Teradata except a USER has a
LOGON and PASSWORD so they can actually log on to Teradata and run queries.
Other than that they are considered exactly the same. Both can be created with PERM
and SPOOL Space. Both can have objects created beneath them.
A Typical approach to Security
“Sometimes it is more important to discover
what one cannot do than what one can do.”
-- Lin Yutang
A USER doesn’t even have to know that they don’t have direct access to the tables.
The following page shows a typical approach to Teradata security. A database will often
be set up for USERs. Then another database or user will be set up for Views and Macros.
Then another USER or DATABASE will be set up to hold the actual tables.
The USER Database is given access to the VIEW and MACRO Database and the VIEW
and MACRO Database is given access to the Tables.
Example of a DATABASE and USER Interchanged
“Everyone I meet is in some way my
superior.”
-- William Shakespeare
The following page merely shows our previous typical security example, but we have
replaced the DATABASEs with USERs. It doesn’t matter!
A DATABASE and USER are the same thing except a USER has a logon and password
so they can run queries.
A DATABASE is sometimes referred to as a Passive Repository and a USER is referred
to as an Active Repository because of the action of logging on and running queries.
PERM and SPOOL Space
“Opportunity may only knock once, but
temptation leans on the doorbell.”
-- Unknown
There are two types of space in Teradata. They are called PERM Space and SPOOL
Space. Perm Space is for Permanent Tables and Spool Space is used to temporarily build
Answer Sets when users run queries.
In actuality Spool Space is unused PERM Space.
Most users don’t get their own PERM space. All users get Spool Space. Without Spool
Space the users couldn’t run queries.
Although I have listed different things associated with Perm and Spool space on the next
page I want you to simply remember that Perm is for your Tables and Data and that
Spool is used as space for Users to run queries.
Tables, Join Indexes, Permanent Journals, Hash Indexes, Stored Procedures and User
Defined Functions (UDF) require Perm Space.
Views, Macros and Triggers don’t require Perm space.
Each AMP will have PERM and SPOOL
“We can win at home. We can’t win on the
road. I just can’t figure out where else to
play.”
-- Coach Pat Williams
Each AMP will have Perm Space to hold tables and have empty space for Spool. The
following picture is designed so you can see exactly what a typical AMP will have on its
virtual disk. The AMP will go to PERM space and read or write to the tables and then
build the answer sets using the Spool Area.
A Query using both PERM and SPOOL Space
“One thing I like about stones in my path is
when I cross them they become milestones.”
-- Anonymous
The example on the following page shows a user running a query. The query is selecting
all columns and rows from the Employee_Table.
Each AMP will read the Employee_Table located in their PERM Space. They will then
begin building their portion of the Answer Set by placing these rows in the empty area of
disk called SPOOL. When each AMP is finished they will inform the Parsing Engine
they are done. Each AMP will pass their Spool Answer Set over the BYNET to the
Parsing Engine. The Parsing Engine will take the Answer Set and deliver it to the user.
Once the Answer Set is delivered to the PE and the User the Answer Set in Spool will be
deleted. Spool is only temporarily used for each query and then deleted when the query
is over!
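The flow just described can be mimicked with a toy Python simulation. Real AMPs, the BYNET and the Parsing Engine are far more involved. The class Amp, the function select_all and the sample last names are all hypothetical.

```python
# Hypothetical simulation of a full-table SELECT; not how Teradata is coded.
class Amp:
    def __init__(self, perm_rows):
        self.perm_rows = list(perm_rows)  # this AMP's rows of Employee_Table
        self.spool = []                   # empty area used for answer sets

    def build_answer_set(self):
        """Copy every qualifying row (here, all rows) from PERM into Spool."""
        self.spool = list(self.perm_rows)

    def release_spool(self):
        """Delete the Spool answer set once it has been delivered."""
        self.spool = []


def select_all(amps):
    """Play the Parsing Engine: gather each AMP's Spool, then free it."""
    for amp in amps:
        amp.build_answer_set()
    answer_set = [row for amp in amps for row in amp.spool]
    for amp in amps:
        amp.release_spool()
    return answer_set


amps = [Amp(["Allan", "Davis"]), Amp(["Jones"]), Amp(["Smith", "Wong"])]
rows = select_all(amps)
print(len(rows))  # 5
```

After the call, every AMP's spool is empty again while the rows in PERM are untouched.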
Spool is Deleted when the Query is Done
“Behold the turtle. He only makes progress
when he sticks his neck out.”
-- James Bryant Conant
The example on the following page is meant to show that when a query finishes the Spool
Answer Set is automatically deleted.
What really happens is that spool is deleted as soon as the query no longer needs that
portion of spool.
Getting a better understanding of Spool
“Genius is nothing but a great aptitude for
patience.”
-- George-Louis De Buffon
In the example on the following page you can see we have given MRKT 20 GBs of
Spool. Then we ask the question: can three users in MRKT simultaneously run a query
that is 15 GBs in size? The answer is YES!
PERM should be looked at like money. If you give money away you no longer have that
money. Spool should be looked at like a speed limit. MRKT has a speed limit of 20
GBs. No user in MRKT can run a query that uses more than 20 GBs of spool, but every
person in MRKT can run queries simultaneously. Thousands of users in MRKT could
simultaneously run queries. The total sum of this Spool Space could be enormous, but
MRKT isn’t tied to everyone only using a sum total of 20 GBs. Nobody in MRKT can
go over their speed limit of 20 GBs!
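The speed limit idea can be sketched in Python as follows. This only models the rule as stated in the text. The constant SPOOL_LIMIT_GB and the function query_allowed are made-up names.

```python
SPOOL_LIMIT_GB = 20  # MRKT's spool limit from the example


def query_allowed(spool_needed_gb, limit_gb=SPOOL_LIMIT_GB):
    """Each query is checked against the limit on its own, like a speed limit."""
    return spool_needed_gb <= limit_gb


concurrent_queries_gb = [15, 15, 15]  # three MRKT users at the same time
print(all(query_allowed(q) for q in concurrent_queries_gb))  # True
print(sum(concurrent_queries_gb))                            # 45
print(query_allowed(21))                                     # False
```

The three queries together use 45 GBs of spool, more than twice the limit, and none of them is aborted because the limit is never compared with the sum.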
Answering the MRKT Spool Query Question
“An eye for an eye only ends up making the
whole world blind.”
-- Gandhi
The following page answers the question. MRKT was given 20 GBs of Spool, and three
users in MRKT can simultaneously run a query that is 15 GBs in size, because each
query is measured against the 20 GB limit on its own.
Spool is like a Speed Limit
“The difference between genius and
stupidity is that genius has its limits.”
– Albert Einstein
Teradata definitely has its limits and these pertain to Spool space. Think of PERM Space
like money, but think of Spool Space like a speed limit. If the database MRKT is
assigned 20 GBs of Spool then that is MRKT’s speed limit. Each user can run queries
that travel up to 20 GBs. This goes for all users in MRKT.
Imagine you are on the highway and the speed limit is 60 MPH. If you were driving
beside another car also going 60 MPH and they pulled off the road you wouldn’t be able
to now go 120 MPH. The speed limit is 60 MPH and that is everyone’s limit.
The Teradata police will abort your query if at any time you go 1 byte over 20 GBs.
Here is why you should think of PERM as money. If the system starts with 1000 GBs of
Perm (which is actually equal to 1 Terabyte) then the system will always have 1000 GBs
unless an upgrade occurs and more hardware is added. So, there is a limited amount of
space that always adds up to 1000 GBs. DBC starts with the entire 1000 GBs, but if
DBC gives away 500 GBs then DBC will only own 500 GBs. It is like having 1,000
dollars in a poker game. It may be split up and won or lost among the players, but there
is always $1,000 at the table until the game is over.
Spool doesn’t equate to the 1000 GBs. The DBA could assign every user and database in
the system spool and if you added it all up it could equate to millions of GBs. This is
because we are assuming that not everyone will be logged on at the same time.
Spool is designed for two purposes.
1) Users have a limit so they can’t hog the system resources.
2) If users make a mistake and run a runaway query the system will abort it after it
reaches that user’s spool limit.
All Space is calculated on a Per AMP Basis
“My son has taken up meditation – at least
it’s better than sitting doing nothing.”
– Max Kauffmann
Teradata calculates Perm and Spool space on a per AMP basis. If the system has 10
AMPs and a user or database is assigned 20 GBs of spool then there are actually two
limitations:
1) The user cannot run a query that goes over 20 GBs.
2) The user cannot run a query that goes over 2 GBs on any single AMP (20 GBs /
10 AMPs = 2 GBs per AMP).
This design is to ensure that data is spread fairly evenly over the AMPs, which is based
solely on the Primary Index choice.
It also guards against a hot AMP. If the data is skewed badly, a single AMP will blow
its Per AMP Spool limit and the query will abort.
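Here is a minimal Python sketch of the two limits working together. It assumes the Per AMP limit is simply the total limit divided by the number of AMPs, as in the example. The function names and the sample spool figures are invented.

```python
def per_amp_limit_gb(total_limit_gb, amp_count):
    """The Per AMP limit is the total limit split evenly across the AMPs."""
    return total_limit_gb / amp_count


def spool_ok(spool_per_amp_gb, total_limit_gb):
    """spool_per_amp_gb lists how much spool one query uses on each AMP."""
    amp_limit = per_amp_limit_gb(total_limit_gb, len(spool_per_amp_gb))
    within_total = sum(spool_per_amp_gb) <= total_limit_gb
    within_each_amp = all(used <= amp_limit for used in spool_per_amp_gb)
    return within_total and within_each_amp


even = [1.5] * 10            # 15 GBs spread evenly over 10 AMPs
skewed = [6.0] + [1.0] * 9   # also 15 GBs, but one hot AMP holds 6 GBs
print(per_amp_limit_gb(20, 10))  # 2.0
print(spool_ok(even, 20))        # True
print(spool_ok(skewed, 20))      # False
```

Both queries need the same 15 GBs in total. Only the skewed one fails, because a single AMP went over its 2 GB share.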
Examples of Perm and Spool on a Per AMP Basis
“It is the mark of an educated mind to be
able to entertain a thought without
accepting it.”
– Aristotle
The following page shows an example of both Perm and Spool being calculated on a Per
AMP basis. Notice that we have 10 AMPs in our system. We have 20 GBs of Spool and
100 GBs of Perm. This means that this user or database cannot run a query that goes
over 20 GBs or one that goes over 2 GBs per AMP.
Also notice that in our 10 AMP system the user or database was assigned 100 GBs of
Perm. This means that the user or database cannot contain tables with data that exceeds
100 GBs or that goes over 10 GBs on any AMP.
Again the philosophy of this is to ensure reasonable data distribution, which is based
solely on the Primary Index choice.
In a worst case scenario you choose a column for the Primary Index that has only one
value. Let’s say for example, “State Code” and the value is ‘California’. Then all of the
data for that table would be on only 1 AMP. This could cause a premature Full Perm
Space message or an abort of a query because it exceeded its Per AMP limit.
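The worst case is easy to see in a toy Python model. Teradata uses its own hashing algorithm, so the CRC32 hash below is only a stand-in chosen because it gives the same answer on every run. The function names amp_for and rows_per_amp are hypothetical.

```python
import zlib
from collections import Counter


def amp_for(primary_index_value, amp_count):
    """Pick an AMP from a hash of the Primary Index value (CRC32 stand-in)."""
    digest = zlib.crc32(str(primary_index_value).encode("utf-8"))
    return digest % amp_count


def rows_per_amp(primary_index_values, amp_count):
    """Count how many rows land on each AMP."""
    counts = Counter(amp_for(value, amp_count) for value in primary_index_values)
    return [counts.get(amp, 0) for amp in range(amp_count)]


one_value = rows_per_amp(["California"] * 1000, 10)
many_values = rows_per_amp(range(1000), 10)
print(max(one_value))    # 1000, every row lands on the same AMP
print(many_values)       # rows spread over the AMPs
```

The same value always hashes to the same AMP, so a Primary Index with one value fills one AMP and leaves the other nine empty.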
Special Note: Sometimes when systems are upgraded to a large number of AMPs the
DBA will assign each user or database more space because they don’t want the Per AMP
limit to cause problems. If a 10 AMP system with 20 GBs of space equals a Per AMP
limit of 2 GBs per AMP, then an upgrade to a 100 AMP system would mean that the Per
AMP Limit would be 0.2 GBs. That might be considered too low to run some queries if
there is any skewing at all so the DBA will often up the spool limits for everyone.
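The upgrade arithmetic can be checked with a short Python sketch. The function limit_to_keep_per_amp is a made-up helper, and the 200 GB figure it produces is only an illustration of what keeping the old 2 GB Per AMP limit would take. The text does not say how much a DBA actually adds.

```python
def per_amp_limit_gb(total_limit_gb, amp_count):
    """The Per AMP limit is the total limit split evenly across the AMPs."""
    return total_limit_gb / amp_count


def limit_to_keep_per_amp(old_limit_gb, old_amps, new_amps):
    """Total limit that preserves the old Per AMP limit after an upgrade."""
    return old_limit_gb / old_amps * new_amps


print(per_amp_limit_gb(20, 10))            # 2.0
print(per_amp_limit_gb(20, 100))           # 0.2
print(limit_to_keep_per_amp(20, 10, 100))  # 200.0
```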
Quiz on Perm and Spool Space
“I went to a restaurant that serves
‘breakfast any time.’ So I ordered French
toast during the Renaissance.”
– Steven Wright
The following page is giving you a chance to show how smart you are. Answer the quiz
and decide how much PERM and SPOOL is in MRKT after they create the three users
Sam, Don, and Bo.
Answers to Quiz on Perm and Spool Space
“All human actions have one or more of
these seven causes: chance, nature,
compulsion, habit, reason, passion and
desire.”
– Aristotle
On the next page are the answers to the quiz.
Collecting Statistics
“If you are not true to your teeth they will be
false to you.”
– Teradata Certified Dentist
I asked my dentist, “Do I have to floss all my teeth?” He said, “No, just the ones you
want to keep.”
Whether the Parsing Engine (PE) is checking a user’s security rights or checking if
statistics were collected on a table, the PE will go to user DBC for the answers.
The PE uses statistics to help decide what plan to build so the AMPs can satisfy a user’s
query. Before the PE can come up with a plan it wants to know if a table is large,
medium, or small. It wants to know about certain columns or indexes. Does a particular
column have a lot of duplicates or nulls, or are the values unique? Is a particular index
unique or non-unique, and is the index strongly or weakly selective? These questions are
often answered by Collect Statistics.
What is Collect Statistics? When a table is created and loaded with data the DBA will
run a COLLECT STATISTICS command on certain columns and indexes of that table.
That will help the PE answer key questions that will give the PE a better understanding of
the table in general.
If more data is loaded or deleted the DBA will then Recollect Statistics to ensure that the
statistics reflect the true data inside the table.
It is not mandatory to collect statistics on a table, just as it is not mandatory that a person
brushes their teeth or cleans their clothes. If statistics are not collected on a table then the
PE will perform a Random AMP Sample and make an educated guess.
I asked my DBA, “Do I have to Collect Statistics on all the columns and indexes?” The
answer was, “No, only on the important ones, but never the entire table”. I hear that is
good advice, but I became concerned when I noticed he was missing teeth!
Parsing Engine uses Statistics for the Plan
“You cannot depend on your eyes when your
imagination is out of focus.”
– Mark Twain
The following page lists some of the key answers that Collect Statistics offers.
Columns and Indexes to Collect Statistics On
“This is a test. It is only a test. Had it been
an actual job, you would have received
raises, promotions, and other signs of
appreciation.”
– Anonymous
You certainly don’t collect statistics on every column and index in a table. These
statistics are stored inside DBC and they take up Perm Space. You only want to collect on
certain columns and indexes such as:
All Non-Unique Indexes
Columns frequently used in user queries in the WHERE Clause
All Primary Indexes of small tables
Columns used as Join Conditions
Syntax to Collect Statistics
“Ninety percent of the game is half mental.”
– Yogi Berra
The following page shows you the syntax for the Collect Statistics command. It also
provides you with some great advice.
Recollecting Statistics
“The less their ability, the more their
conceit.”
– Ahad Haam
Whenever a table changes data by 10% it is time to recollect statistics. If you have a
billion row table and have collected statistics and someone adds one row it is NOT time
to recollect statistics. If rows are deleted or added and about 10% or more of the rows
have been changed then recollect.
It is better to never collect statistics than to let them become stale. Before the PE creates
a plan it checks to see if statistics were collected. If they were NOT collected the PE will
perform a random AMP sample of the data and make an educated guess. This is not as
good as collected statistics, but it is better than statistics that lie!
Make sure once you have collected statistics on a table to recollect when the table data
changes by 10% or more.
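The 10% rule of thumb can be written as a one-line check. The Python below is only a sketch of that rule. The function name should_recollect and its parameters are hypothetical, and it uses whole-number arithmetic so the comparison is exact.

```python
def should_recollect(rows_at_last_collect, rows_changed, threshold_pct=10):
    """True when the rows added or deleted reach threshold_pct of the table."""
    return rows_changed * 100 >= threshold_pct * rows_at_last_collect


print(should_recollect(1_000_000_000, 1))            # False
print(should_recollect(1_000_000_000, 100_000_000))  # True
```

One new row in a billion row table is nowhere near the threshold. One hundred million changed rows is exactly 10%, so it is time to recollect.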
I want you to notice that when we recollect statistics on the following page we merely
write the SQL to say, “Collect Statistics on Employee_Table”. This doesn’t mean we
collect statistics on every column and index. It means collect statistics on the same
columns and indexes you have collected on in the past. In other words, refresh the
statistics you already have.
Random Sample instead of Collected Statistics
“Actions lie louder than words.”
– Carolyn Wells
What is a Random AMP Sample? The PE will poll a single AMP before running a query
and ask questions about the data. It will then multiply the statistics it finds on the random
AMP by the number of AMPs in the system. For example, if the PE estimates that the
random AMP has 1,000 rows and there are 50 AMPs in the system it will assume the
table has 50,000 rows!
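The extrapolation is simple multiplication, as this Python sketch shows. The function name estimate_table_rows is made up, and the skewed row counts in the second half are hypothetical numbers used only to show why the guess can be off.

```python
def estimate_table_rows(rows_on_sampled_amp, amp_count):
    """Extrapolate a table's size from the row count found on one AMP."""
    return rows_on_sampled_amp * amp_count


print(estimate_table_rows(1000, 50))  # 50000

# Hypothetical skewed table: 49 AMPs hold 1,000 rows, one AMP holds 21,000.
actual_rows_per_amp = [1000] * 49 + [21000]
print(sum(actual_rows_per_amp))                        # 70000
print(estimate_table_rows(actual_rows_per_amp[0], 50)) # 50000
```

When the data is spread evenly the estimate is close. When one AMP holds far more rows than the others, a sample from a single AMP can miss badly, which is why this is called an educated guess.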
Before Teradata V12 the PE would only perform a random AMP sample if no statistics
were collected on a table. In Teradata V12 and beyond Teradata always performs a
Random AMP sample.
V12 Statistics Enhancement – Stale Statistics
“Am I not destroying my enemies when I
make friends of them?”
– Abraham Lincoln
Stale statistics will now be hunted down and destroyed! When Teradata came out with
Teradata V12 they added a great enhancement. The PE will now perform a quick
Random AMP Sample on a single AMP to check if statistics are current or stale. If the
statistics are current the PE will use the statistics, but if the statistics appear to be stale the
PE will use the random AMP sample and make an educated guess.
Previous to Teradata V12 the PE would only perform a random AMP sample if no
statistics were collected. Now, it will always perform a random AMP sample and then
compare the Random AMP Sample with the real statistics to see if the real statistics are
stale and out of date.
Of course if no statistics were ever collected on a table the PE will use the Random AMP
Sample.
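The decision can be sketched in Python. The text does not say how far apart the two numbers must be before the statistics "appear to be stale", so the 10% tolerance below is purely an assumption for the illustration. The function name choose_row_estimate is also made up.

```python
def choose_row_estimate(collected_rows, sampled_rows, tolerance_pct=10):
    """Return (row estimate, source) the way the V12 rule is described.

    tolerance_pct is an ASSUMED cut-off; the real stale test is not given.
    """
    if collected_rows is None:            # statistics were never collected
        return sampled_rows, "random AMP sample"
    drift = abs(sampled_rows - collected_rows)
    if drift * 100 > tolerance_pct * collected_rows:
        return sampled_rows, "random AMP sample"   # statistics look stale
    return collected_rows, "collected statistics"  # statistics look current


print(choose_row_estimate(None, 50000))   # (50000, 'random AMP sample')
print(choose_row_estimate(50000, 52000))  # (50000, 'collected statistics')
print(choose_row_estimate(50000, 90000))  # (90000, 'random AMP sample')
```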
Where Statistics are Stored in DBC
“We see the brightness of a new page where
everything yet can happen.”
– Rainer Maria Rilke, Book of Hours
You don’t want to collect statistics on every column or index inside a table. This takes
up space, takes up resources, and just isn’t needed. There are important columns and
indexes that will really help the PE when coming up with a plan and there are other
columns or indexes that will never be considered. Only collect on the important ones.
A Collect Statistics Example
“The future belongs to those who believe in
the beauty of their dreams.”
– Eleanor Roosevelt
The following page shows you how expensive the process of collecting statistics can be.
In this example we are collecting statistics on the column Last_Name in the
Employee_Table. This requires a Full Table Scan!
The results are then sorted in alphabetical order on Last_Name from A-Z and then
chopped up or divided up into 200 intervals. There is more to it so pay attention to the
next couple of pages.
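The sort and chop step can be pictured with a short Python sketch. The text only says the sorted values are divided into 200 intervals, so cutting them into runs with the same number of rows is an assumption here. The function name split_into_intervals and the sample last names are hypothetical.

```python
def split_into_intervals(values, interval_count=200):
    """Sort the values, then cut them into at most interval_count runs."""
    ordered = sorted(values)                            # the full table scan, sorted A-Z
    size = max(1, -(-len(ordered) // interval_count))   # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


last_names = ["Davis", "Allan", "Jones", "Allan", "Smith",
              "Baker", "Wong", "Clark", "Young"]
for interval in split_into_intervals(last_names, interval_count=3):
    print(interval)
# ['Allan', 'Allan', 'Baker']
# ['Clark', 'Davis', 'Jones']
# ['Smith', 'Wong', 'Young']
```

Every value in the table has to be read and sorted before the first interval can be cut, which is why collecting statistics is expensive.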
What Statistics are Really Collected
“A behaviorist is someone who pulls habits
out of rats.”
– Anonymous
The following page shows you the questions that the PE is trying to answer in each
interval. Notice that in each interval the PE wants to know the Maximum Value, Most
Frequent Value, Most Frequent Value number of Rows, the Other Values, and the Other
Rows.
Let’s take a look at the first interval. The Most Frequent Value is Allan. That means
that in this interval the name Allan is the most popular. Then look at the Most Freq Value
Rows and notice it says 3. That means that there are 3 people with the Last_Name of
Allan.
If someone wrote the query below:
SELECT * FROM Employee_Table
WHERE Last_Name = 'Allan';
The PE would assume that there are 3 people named Allan and would come up with a
plan to get the data for the user.
I really want you to notice the last two statistics which are Other Values and Other Rows.
Other Values means the number of last names OTHER than Allan. Other Rows means the
number of rows that are NOT Allan. If the PE was given the name AFRIM it would divide
the Other Rows by the Other Values and make an estimated guess of 1.2. Here would be
the formula on this example:
(6 / 5 = 1.2). Remember there were 6 Other Rows and 5 Other Values.
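Here is a minimal Python sketch of the interval statistics and the estimate. The function names summarize_interval and estimate_rows are made up, and apart from Allan and Afrim the last names are invented so that the interval has 6 Other Rows and 5 Other Values like the example.

```python
from collections import Counter


def summarize_interval(values):
    """Compute the five statistics kept for one interval."""
    counts = Counter(values)
    most_frequent, most_frequent_rows = counts.most_common(1)[0]
    return {
        "max_value": max(values),
        "most_frequent_value": most_frequent,
        "most_frequent_rows": most_frequent_rows,
        "other_values": len(counts) - 1,
        "other_rows": len(values) - most_frequent_rows,
    }


def estimate_rows(summary, value):
    """Estimate how many rows in the interval match value."""
    if value == summary["most_frequent_value"]:
        return summary["most_frequent_rows"]
    if summary["other_values"] == 0:
        return 0
    return summary["other_rows"] / summary["other_values"]


interval = ["Adams", "Afrim", "Aiken", "Akers", "Akers", "Alden",
            "Allan", "Allan", "Allan"]
summary = summarize_interval(interval)
print(estimate_rows(summary, "Allan"))  # 3
print(estimate_rows(summary, "Afrim"))  # 1.2
```

For the Most Frequent Value the PE has an exact count. For any other name it only has an average, 6 rows divided by 5 values, which evaluates to 1.2.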
Loner Values and High Bias Intervals
“The most exciting phrase to hear in
science, the one that heralds the most
discoveries, is not ‘Eureka!’, but ‘That’s
funny...’”
– Isaac Asimov
One of the problems the PE had with Collect Statistics in the past was that certain
values occurred a huge number of times. These often spanned multiple intervals. Teradata
came up with a solution. If a value occurs that often, Teradata will make it a Loner Value
and store it in a High Bias Interval. Now Teradata will know if there are a million people
named ‘Davis’ because ‘Davis’ won’t span multiple intervals, but will instead receive its
own interval. Teradata can actually place up to two Loner Values inside one High Bias Interval.
Teradata Limits
“Asking an incumbent member of Congress
to vote for term limits is a bit like asking a
chicken to vote for Colonel Sanders.”
– Bob Inglis, 1995
The following page shows some of the limits of Teradata V12 and V13.
Data Protection
"Age does not protect you from love. But
love, to some extent protects you from age. "
-Jeanne Moreau, French Actress
As a man was driving down the interstate highway, his cell phone rang. When he
answered he heard his wife warn him urgently, "George, I just heard on the news that
there's a car going the wrong way on I-26!" George replied, "I'm on I-26 right now and
it's not just one car. It's hundreds of them!"
How do you protect your data when things go the wrong way? Murphy's Law states,
"The more mission critical a data warehouse, the more likely the system will crash at
the most critical moment of the mission." Ironically, most DBAs think Murphy was an
optimist.
A database not prepared to defend itself is like an unsigned contract. It is not worth the
paper it is written on. However, Teradata is always prepared and it will protect your
data better than a wild pit bull. As a matter of fact, the difference between Teradata and
a pit bull is that eventually the pit bull will get bored and let go.
System and user errors are inevitable in any large system. For example, an associate
may accidentally give everyone a 100% raise instead of a 10% raise. Or, what if a
million-dollar transaction fails right at the wrong time? Or an AMP or DISK goes
down? In any of these cases, Teradata will have many ways to protect your data. Some
processes for protection are automatic and some of them are optional.
The protection features we will discuss are:
Transaction Concept
Transient Journal
RAID 1 Mirroring
Cliques
Standby Nodes
Fallback
Fallback Clusters
Archive
Permanent Journaling
Transaction Concept
“The afternoon knows what the morning
never suspected.”
- Swedish Proverb
At any time something could go wrong with a transaction. An old proverb suggests,
"The afternoon knows what the morning never suspected." Likewise, the
Transient Journal knows what the transaction never suspected.
What good would it do if you could gather, store and analyze terabytes of data, but
doubted the integrity of the data? Teradata makes every effort to ensure a database
doesn't become corrupt. Fundamental to this assurance is the "Transaction Concept," which
means that an SQL statement is viewed as a transaction. Simply stated, a transaction
either works completely or fails completely.
Two Modes to Teradata
“A life filled with love may have some
thorns, but a life empty of love will have no
roses.”
- Anonymous
Teradata has two different modes in which it operates. Those modes are called
Teradata Mode and ANSI Mode. Both modes handle things a little differently. Every
Teradata system will have a default mode set by the DBA when the system first arrives.
Although there is a default mode set, users can actually choose the mode they want
for their own sessions. Depending on which mode you are using, a transaction takes on a
whole new meaning.
Differences between ANSI and Teradata Mode
“You can tell whether a man is clever by his
answers. You can tell whether a man is wise
by his questions.”
- Naguib Mahfouz
Teradata has two different modes in which it operates. Those modes are called Teradata
Mode and ANSI Mode. Both modes handle things a little differently. As you can see on
the following page there are many differences.
I want you to focus on two main areas. The first is that in Teradata mode you don't need
to use the word COMMIT, but in ANSI mode you do.
The second area of focus is how statements are rolled back. In Teradata mode, if a
statement in a transaction fails, EVERY Statement in that transaction is Rolled Back. In
ANSI mode, if a statement in a transaction fails, ONLY the FAILED Statement is Rolled Back.
ANSI Mode Commit
“You got to be careful if you don't know
where you're going, because you might not
get there.”
- Yogi Berra
With ANSI Mode you must use the words COMMIT WORK, or you can just say
COMMIT. This is mandatory any time you are changing something in the Teradata
database. That includes any time you use a CREATE statement or any INSERT,
UPDATE, or DELETE. You don't need it for queries that only SELECT.
On the following page you can see both a single statement transaction at the top of the
slide and on the bottom of the slide you can see a multi-statement transaction.
If the single statement transaction failed for any reason then Teradata would Roll Back
this UPDATE statement and ensure the database was exactly like it was before the
transaction.
If a statement in the multi-statement transaction were to fail, only the FAILED Statement
would be Rolled Back!
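You can try out this statement-level behavior without a Teradata system. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it is an analogy and not Teradata itself. The Department_Table layout and the values are assumptions made for the example. The failed statement undoes only its own work, the transaction stays open, and nothing becomes permanent until the COMMIT.

```python
import sqlite3

# isolation_level=None switches off the module's implicit transaction
# handling, so the BEGIN and COMMIT below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Department_Table (Dept_No INTEGER PRIMARY KEY, Budget INTEGER)"
)
conn.execute("INSERT INTO Department_Table VALUES (100, 100000)")

conn.execute("BEGIN")
conn.execute("UPDATE Department_Table SET Budget = 500000 WHERE Dept_No = 100")
try:
    # Duplicate primary key, so this statement fails.
    conn.execute("INSERT INTO Department_Table VALUES (100, 1)")
except sqlite3.IntegrityError:
    pass  # only the failed INSERT is undone; the UPDATE is still pending
conn.execute("COMMIT")  # the explicit COMMIT makes the UPDATE permanent

budget = conn.execute(
    "SELECT Budget FROM Department_Table WHERE Dept_No = 100"
).fetchone()[0]
print(budget)  # 500000
```

Notice that the good UPDATE survived the bad INSERT. In a real ANSI mode session it would then be up to you to decide whether to COMMIT or ROLLBACK what is left.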
Teradata Mode Commit also called BTET
“The only thing worse than being talked
about is not being talked about.”
- Oscar Wilde
With Teradata Mode you never use the words COMMIT WORK or COMMIT. The commit is
implied with each statement. I am sure you are asking, "Then how does Teradata Mode
run a Multi-Statement Transaction?"
It uses a BT or BEGIN TRANSACTION Statement, then runs the statements, and then
follows them with an ET or END TRANSACTION Statement.
On the following page you can see both a single statement transaction at the top of the
slide and on the bottom of the slide you can see a multi-statement transaction.
If the single statement transaction failed for any reason then Teradata would Roll Back
this UPDATE statement and ensure the database was exactly like it was before the
transaction.
If a statement in the multi-statement transaction were to fail, ALL Statements within the
transaction would be Rolled Back!
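Here is a small sketch of that all-or-nothing behavior. It uses Python's built-in sqlite3 module as a stand-in for Teradata, so BEGIN and COMMIT play the roles of BT and ET, and the full rollback that Teradata performs by itself is coded by hand in the except branch. The run_bt_et helper, the table layout, and the values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute(
    "CREATE TABLE Department_Table (Dept_No INTEGER PRIMARY KEY, Budget INTEGER)"
)
conn.execute("INSERT INTO Department_Table VALUES (100, 100000)")


def run_bt_et(connection, statements):
    """Run the statements as one all-or-nothing unit, like BT ... ET."""
    connection.execute("BEGIN")          # plays the role of BT
    try:
        for sql in statements:
            connection.execute(sql)
    except sqlite3.Error:
        connection.execute("ROLLBACK")   # undo EVERY statement run so far
        return False
    connection.execute("COMMIT")         # plays the role of ET
    return True


ok = run_bt_et(conn, [
    "UPDATE Department_Table SET Budget = 500000 WHERE Dept_No = 100",
    "INSERT INTO Department_Table VALUES (100, 1)",  # fails: duplicate key
])
budget = conn.execute(
    "SELECT Budget FROM Department_Table WHERE Dept_No = 100"
).fetchone()[0]
print(ok, budget)  # False 100000
```

The UPDATE itself was fine, but because the INSERT after it failed, the budget is back at 100000.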
Trick to CREATE a Multi-Statement with BTEQ
“A government that robs Peter to pay Paul
can always depend upon the support of
Paul.”
- George Bernard Shaw
The next page shows an old trick for creating a Multi-Statement request in the Teradata
utility called BTEQ (pronounced Bee Teek).
BTEQ requires a semicolon at the end of every SQL Statement. If you put that semicolon
at the front of the next line and then place another SQL Statement immediately
following it, the statements are sent together and are considered part of the same
transaction.
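To make the semicolon trick concrete, here is a toy Python function that groups statements the way the trick relies on. It assumes a simplified rule, namely that a request ends at a line whose last character is a semicolon, and it ignores quotes, comments, and BTEQ dot commands. The function name, the Salary column, and the sample statements are made up for the example. This is not BTEQ's real parser.

```python
def split_into_requests(script):
    """Group SQL statements into requests under the simplified rule.

    A request ends at a line whose last character is ';'. A ';' anywhere
    else only separates statements inside the same request.
    """
    requests, pending = [], []
    for line in script.splitlines():
        line = line.strip()
        if not line:
            continue
        pending.append(line)
        if line.endswith(";"):
            text = " ".join(pending)
            statements = [s.strip() for s in text.split(";") if s.strip()]
            requests.append(statements)
            pending = []
    return requests


script = """
UPDATE Employee_Table SET Salary = Salary * 1.10 WHERE Dept_No = 100
;UPDATE Department_Table SET Budget = 500000 WHERE Dept_No = 100;
SELECT * FROM Employee_Table;
"""
for request in split_into_requests(script):
    print(len(request), "statement(s):", request)
```

The two UPDATE statements end up in one request because the first line does not end with a semicolon. The SELECT travels alone as a second request.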
Transient Journal
“The Transient Journal knows what the
Transaction never suspected.”
Swedish Proverb after a rollback
The Transient Journal’s job is to ensure that if an INSERT, UPDATE, or DELETE fails, the
rows affected can be returned to their original state. This is called a Rollback.
In Teradata, all SQL statements are considered transactions. This applies whether you
have one statement or multiple statements executing together (as in a MACRO). If any of
the SQL statements cannot be performed successfully, the following happens:
The user receives immediate feedback in the form of a failure message;
The entire transaction is rolled back, and any changes made to the database are
reversed;
Locks are released;
Spool files are discarded.
The Transient Journal is automatic and it takes a before picture of any row about to be
changed by an UPDATE or DELETE for rollback purposes.
How the Transient Journal Works
“Beware of the young doctor and the old
barber.”
- Benjamin Franklin
Wouldn't it be great if every time you got a haircut, the barber or stylist took a picture of
your hairdo before cutting a single strand? Then, after the haircut, he or she asked if
you liked it, and if you didn't like it, you could ask to have it restored? Well, that is
what the Transient Journal does. If a row is going to change because of an UPDATE
or DELETE, it takes a BEFORE picture. If the transaction fails, then the journal
restores the row to the way it was.
The TRANSIENT JOURNAL is an automatic system function. It is not optional. The
BEFORE image is actually stored in the AMP's Transient Journal. Every AMP has a
transient journal that is maintained in DBC's PERM space. If the transaction is aborted
for any reason, the AMP restores the data to match the before-image stored in the
Transient Journal. The data will then revert to its original state. When a transaction is
successful, the PE and the AMPs shake hands on it and the Transient Journal is wiped
clean. The handshake is called the "COMMIT." After a COMMIT, all the AMPs have a
party to celebrate, and the user is invited to join in the festivities! In other words,
"Transient Journal Cleanliness is next to Godliness." If it is clean, then things went
well!
The Transient Journal does two things automatically to ensure data integrity. First,
before images are retained on each AMP as changes occur. Second, an automatic rollback
of changed rows occurs in the event of a transaction failure. Data is always returned to
its original state after a transaction failure.
In the picture on the next page you can see we are updating the budget of Dept_No 100,
which is the Sales Department, from 100000 to 500000. Before the transaction can occur
the AMP will take a snapshot of the entire row and store it in its Transient Journal. You
can see the actual SQL statement doing the UPDATE at the top of the picture. Inside the
Disk you can see the Transient Journal with the before picture of the row being updated.
You can also see the table inside the disk.
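The before-picture idea fits in a few lines of Python. This is a toy model for illustration only, not how an AMP is actually built. The Amp class, its method names, and the Dept_Name column are assumptions for the sketch; the Dept_No 100 budget change from 100000 to 500000 is the one described above.

```python
class Amp:
    """Toy model of one AMP: its table rows plus a transient journal."""

    def __init__(self, rows):
        self.rows = dict(rows)   # primary key -> row (a dict of columns)
        self.journal = {}        # primary key -> BEFORE image of the row

    def update(self, key, **changes):
        # Take the BEFORE picture of the whole row, once per transaction.
        self.journal.setdefault(key, dict(self.rows[key]))
        self.rows[key].update(changes)

    def rollback(self):
        # Put every journaled row back the way it was, then wipe the journal.
        for key, before_image in self.journal.items():
            self.rows[key] = before_image
        self.journal.clear()

    def commit(self):
        # Success: the BEFORE pictures are no longer needed.
        self.journal.clear()


amp = Amp({100: {"Dept_Name": "Sales", "Budget": 100000}})
amp.update(100, Budget=500000)
print(amp.journal)              # {100: {'Dept_Name': 'Sales', 'Budget': 100000}}
amp.rollback()
print(amp.rows[100]["Budget"])  # 100000
```

If the transaction had succeeded, calling commit() instead of rollback() would have emptied the journal and left the new budget of 500000 in place.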
The Transient Journal after a Commit
“Do you know, my son, with what little
understanding the world is ruled?”
- Pope Julius III
The Transient Journal rows are discarded once the transaction is committed. The only
reason that each AMP takes a BEFORE picture and stores it in its Transient Journal is in
case of a problem in which a ROLLBACK occurs. If there are no problems and the
transaction is committed then the BEFORE picture is discarded. If a ROLLBACK did
occur then the AMP can replace the attempted UPDATE with the BEFORE picture and
everything is back to the way it was before the UPDATE Statement.
VProcs
“The longer I live the more beautiful life
becomes.”
-Frank Lloyd Wright
Teradata utilizes Parsing Engines (PEs) and Access Module Processors (AMPs), which
it calls VProcs, short for virtual processors. Each AMP and PE lives
inside the memory of a Node. There are anywhere between 25 and 35 VProcs inside
each node.
Think of a Node as a giant Personal Computer, one that has 4 Intel Processors that work
and act as if there were 8 Intel Processors. This node also has up to 16 GBs of memory.
The VProcs get loaded inside the Node's memory, and then we connect this node via the
BYNET with all the other nodes, and now we are part of the Teradata warehouse.
Nodes and MPP
“The surprising thing about young fools is
how many survive to become old fools.”
-Doug Larson
Teradata has taken a simple PC, filled the memory with AMPs and PEs and calls it a
node. Connect multiple nodes together with the BYNET and you have a Massively
Parallel Processing or MPP system.
RAID 1 - Mirroring
“You can only perceive real beauty in a
person as they get older.”
-Anouk Aimee
RAID 1 is mirroring, and Teradata always mirrors its disks. As you can see on the
following page, every AMP is attached to four physical disks. Two hold actual data and
two are for backup. This provides excellent protection.
Each AMP is said to have four physical disks, but only one Virtual Disk. This really
means that no AMP can get into another AMP's disks. Each AMP is the only thing
allowed to read that AMP's disks. So, each AMP is said to have its own virtual disk,
which is a set of four physical disks.
The great thing about mirroring is that if we lose a disk, we already have it mirrored and
protected. The DBA just has to remove the failed disk and put in a fresh disk, and the
mirroring will immediately begin again.
Remember, the price for Mirroring is double the disk costs. For every disk
with data you have another disk mirroring and protecting that data disk.
Cliques
“Never advise anyone to go to war or to
marry.”
-Spanish Proverb
Teradata CLIQUES (pronounced "cleeks") are a method of system protection against the
failure of an entire node. Each node holds AMP VPROCs in its memory. Each AMP is
attached to one virtual disk (Vdisk), and that AMP is the only Vproc allowed access to
its Vdisk. A Clique gives each node access to the sets of disks that belong to the other
nodes in the clique. If a node fails, its AMP VPROCs can migrate to a node that has backup
access to their virtual disks. A migrating AMP can continue to read and write to its
Vdisk while its home node is down. When the home node is fixed and available again, the
VPROCs return home.
If a Teradata system uses two-node cliques, then when one node fails all of its AMP
VPROCs migrate to the other node. The system is now about 50% slower. To solve this
problem Teradata allows bigger cliques, such as eight nodes. If one node fails, its
VPROCs split up and migrate among the seven other nodes in the clique without much
performance degradation.
In the picture on the following page I want you to notice that we have two nodes. In each
Node we have two Parsing Engines (PEs) and six AMPs. Each of these nodes directly
attaches to its disk farm. Each AMP gets access to four physical disks, which are
considered one virtual disk because only this AMP can access its disks.
Watch what happens next when one of our nodes fails!
VProcs Migrate when a Node Fails
“Many receive advice, few profit by it.”
-Publilius Syrus 100 BC
When a node fails, Teradata resets and the nodes begin their startup routine, but the
VProcs of the failed node will now receive instructions to migrate to another node in
their clique. A Clique is nothing but extra cables connecting the disk farms of each node
together just in case a migration needs to take place.
As you can see on the following page, a node has failed. The VProcs in that node will
now migrate to the memory of a node in their clique. The system is degraded and not up
to maximum speed, but at least the system is up and running.
This is an example of a 2-node clique. In a 2-node clique, if one node fails then all of the
VProcs in that node must migrate to the other node.
Teradata has been smart to allow for 4-node cliques and even 8-node cliques. When one
node fails in an 8-node clique, all the VProcs in the failed node can spread out evenly
among the other 7 nodes remaining in the clique.
To accomplish this, each node is directly attached to its own disk farm, but it is also
attached to the other nodes' disk farms within the clique. Now, any AMP within the
clique could migrate to any other node in the clique if necessary.
Cliques are designed to protect against a NODE Failure.
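To see why bigger cliques hurt less, here is a rough Python sketch that deals a failed node's AMPs out to the surviving nodes. It is only a model. The round-robin rule, the node names, and the function names are assumptions for illustration; the six AMPs per node match the two-node picture described earlier.

```python
def migrate(clique, failed_node):
    """Spread the failed node's AMPs round-robin over the surviving nodes.

    `clique` maps node name -> list of AMP numbers. Returns a new mapping.
    """
    survivors = {node: list(amps) for node, amps in clique.items()
                 if node != failed_node}
    names = sorted(survivors)
    for i, amp in enumerate(clique[failed_node]):
        survivors[names[i % len(names)]].append(amp)
    return survivors


def make_clique(node_count, amps_per_node):
    """Build a clique with consecutively numbered AMPs (illustrative only)."""
    return {
        f"node{n}": list(range((n - 1) * amps_per_node, n * amps_per_node))
        for n in range(1, node_count + 1)
    }


for size in (2, 8):
    after = migrate(make_clique(size, amps_per_node=6), "node1")
    busiest = max(len(amps) for amps in after.values())
    print(f"{size}-node clique: busiest surviving node runs {busiest} AMPs instead of 6")
```

In the 2-node clique the survivor has to run 12 AMPs, double its normal load. In the 8-node clique no survivor runs more than 7.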
Cliques – An 8-Node Example
“Half the money I spend on advertising is
wasted; the trouble is I don't know which
half.”
-John Wanamaker
In the picture on the following page you see 8 nodes. When we connect each of these
nodes to every other node's disk farm, we are essentially creating a clique. Now if there
is a node failure, Teradata will reset and the AMPs and PEs in the down node will be able
to migrate to the memory of another node within the clique.
I want you to notice that Node 1 has green AMPs and all the other nodes have purple
AMPs. We are about to see what happens when Node 1 crashes. Get ready!
Cliques – An 8-Node Example with Migration
“I do not regret one professional enemy I
have made. Any actor who doesn't dare to
make an enemy should get out of the
business.”
-Bette Davis
In the picture on the following page you see 8 nodes. When we connect each of these
nodes to every other node's disk farm, we are essentially creating a clique. Now if there
is a node failure, Teradata will reset and the AMPs and PEs in the down node will be able
to migrate to the memory of another node within the clique.
Hot Standby Nodes
“Conscience is the inner voice which warns
us that someone may be looking.”
-H. L. Mencken
Teradata actually has hot standby nodes! These exist in case of a node failure. Although
the AMPs in a down node could migrate to the other nodes in the clique,
an even better way is to have a hot standby node. This is a node's worth of hardware
that runs nothing until another node goes down.
When a node goes down, Teradata will reset. When it does, the AMPs and PEs in the
down node will be instructed to migrate to the hot standby node. Now everything is up
and running perfectly. A hot standby node is equivalent to buying a second car. You
would only drive the 2nd car if your other car broke down. Yes, it is expensive, but it is
great when the first car is down.
Hot Standby Nodes in Action
“If you are all wrapped up in yourself, you
are overdressed.”
-Kate Halverson
Notice in our picture that our first node is down, but that the AMPs and PEs migrated to
our Hot Standby Node! Isn't it great when a plan comes together?
FALLBACK Protection
“United we stand divided we fall.”
-Circular letter, Boston during the American Revolution
FALLBACK is a table protection feature used in case an AMP fails. Fallback is similar
to mirroring in that a duplicate copy of a row is created and maintained on another
AMP for redundancy purposes. Essentially, anytime you define a table with Fallback
you are using twice the space. You can use FALLBACK on all tables, some tables or
no tables. You can also create a table with or without FALLBACK and then add or
drop the feature at any time.
“Divided we stand united we Fallback.”
-AMP during the computer revolution
Fallback is similar to mirroring in that it creates and maintains a duplicate copy of each
row, but it is designed in a revolutionary manner for performance purposes. With
mirroring, if one disk goes down another duplicate disk takes over. Fallback, however,
will take all the rows that one AMP is responsible for in a fallback protected table and
store copies of them on multiple AMPs. If the AMP fails, then multiple AMPs will be
responsible for delivering the failed AMP's rows.
“We have the right to bear arms.”
-2nd amendment of the constitution
Teradata believes its constitution is to protect the data and so a duplicate copy is always
maintained on another AMP.
“We have no access rights to bare amps.”
-2nd amendment of the Teradata constitution
How Fallback Works
“It's déjà vu all over again!”
-Yogi Berra
Fallback is like déjà vu all over again because when a table is fallback protected the rows
are duplicated on other AMPs. Fallback is similar to mirroring, but different. The
similarity is that both provide a duplicate copy. The difference is that Fallback places
copies of an AMP's rows on multiple AMPs, so if a failure occurs Teradata can use
parallelism to cover for the failed AMP.
On the next page are four AMPs holding a base table. For example's sake, let's assume
that the base table is the Employee Table. There are 12 employees with employee
numbers ranging from 1 to 12. The data is spread evenly in the table with each AMP
responsible for 3 employees.
The Employee Table has been created with Fallback, so each row of the base table is
duplicated on another AMP in the Fallback Table. Notice three very important features:
(1) No base table row is on the same AMP as its Fallback protected duplicate copy.
(2) Each AMP spreads its Fallback rows evenly across multiple AMPs.
(3) The perm space used for the table is doubled because of the fallback.
The system can lose any single AMP or Disk in this system. If multiple AMPs or Disks
in the picture fail, then Teradata won't be able to run queries that ask for all the
data.
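The three features can be checked with a short Python sketch of the four-AMP example. Keep in mind that the placement rules here are assumptions made for illustration. Base rows are dealt out round-robin and each AMP hands its fallback copies to the other AMPs in turn, which is not Teradata's actual algorithm.

```python
from collections import defaultdict

AMPS = 4
employees = range(1, 13)

# Assumption for the sketch: base rows are dealt out round-robin so that
# each AMP ends up responsible for 3 employees.
primary = defaultdict(list)
for emp in employees:
    primary[(emp - 1) % AMPS].append(emp)

# Each AMP hands its rows, one at a time, to the OTHER AMPs in turn.
fallback = defaultdict(list)
for amp, rows in primary.items():
    for j, emp in enumerate(rows):
        offset = 1 + j % (AMPS - 1)   # 1, 2 or 3, never 0, so never itself
        fallback[(amp + offset) % AMPS].append(emp)

for amp in range(AMPS):
    print(f"AMP {amp}: base {primary[amp]}  fallback {sorted(fallback[amp])}")
```

Every AMP ends up with 3 base rows and 3 fallback rows, none of which duplicate its own base rows, so the table takes 24 row slots to hold 12 employees.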
Fallback Clusters Exercise
“My father taught me to work; he did not
teach me to love it.”
-Abraham Lincoln
Fallback is always associated with CLUSTERS. Fallback can be specified at the
table level. Fallback is worth the price because when an AMP fails, users still have
access to the data even while the AMP is offline. Any data that changed while the AMP
was offline is automatically restored on that AMP when it comes back online.
If we can lose any one AMP/disk, what happens if we lose two? The chance of losing
two AMPs in a four-AMP system is small; however, some systems have nearly 2,000
AMPs. The chance of losing two AMPs in a 2,000-AMP system is much
greater than in a four-AMP system. That's why Teradata designed Clustering. With
Clustering, Teradata can lose one AMP/Disk per cluster. Let's look at this next
example with 8 AMPs in two clusters.
Notice that the data in the base table lays out evenly with 24 records on 8 AMPs. What
is key to notice is that the fallback copy remains within the cluster. In other words, the
base table rows in cluster one are fallback protected within cluster one. The base table
rows in cluster two are fallback protected within cluster two. We can lose one
AMP/Disk in both cluster one and cluster two and the system is fine.
Fallback cluster sizes are usually set by a Teradata representative through a Teradata
Console Utility. They can range from 2 AMPs in a cluster up to 16 AMPs in a cluster.
The most often used cluster size is 4 AMPs per cluster. Not all clusters in a system
have to be the same size, but this is usually desired.
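The one-AMP-per-cluster rule is easy to express in code. The Python sketch below is an illustration only. The data_available function and the AMP numbering are made up for the example, which uses the 8 AMPs in two clusters described above.

```python
def data_available(clusters, failed_amps):
    """True if every fallback protected row is still reachable.

    `clusters` is a list of AMP-number lists. Because fallback copies stay
    inside their cluster, the data survives as long as no single cluster
    has lost more than one AMP.
    """
    failed = set(failed_amps)
    return all(len(failed & set(cluster)) <= 1 for cluster in clusters)


two_clusters = [[0, 1, 2, 3], [4, 5, 6, 7]]   # 8 AMPs, 4 per cluster
print(data_available(two_clusters, [2]))      # True
print(data_available(two_clusters, [2, 6]))   # True: one AMP down per cluster
print(data_available(two_clusters, [2, 3]))   # False: two AMPs down in cluster one
```

With smaller clusters, two random AMP failures are less likely to land in the same cluster, which is why a 2,000-AMP system is not left as one giant cluster.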
Fallback Clusters
“Don't worry about people stealing your
ideas. If your ideas are any good, you'll
have to ram them down people's throats.”
-Howard Aiken
Fallback has been placed perfectly in the picture on the following page. Notice we
have two clusters, the top cluster and the bottom cluster. The top cluster is in purple
and the bottom cluster is in yellow.
We laid the data out and it spread evenly among both clusters. Now it is time to lay out
the fallback data. Notice that the fallback from the top cluster stays within the top
cluster. The same rule goes for the bottom cluster. The Fallback data stays within the
cluster. Now we can lose 1 AMP in every cluster and still have our data up and
running.
Teradata will not use the Fallback data unless an AMP in the cluster goes down.
Fallback Exercises with Clusters
“I have never let my schooling interfere with
my education”.
-Mark Twain
This is an outstanding exercise that is designed to teach you exactly how Fallback works
with clusters. In the example on the following page you will see a 12 AMP system that
has 12 Base Rows. In each system we have placed these rows and labeled them 1-12.
The 12 records have been spread evenly among the 12 AMPs with each AMP getting one
record. I have placed the first Fallback row on the proper AMP. The base row records
are on the top of the disk and we have placed the Fallback rows on the bottom of the disk.
Your job is to finish the exercise. No looking!
Fallback Exercises with Clusters Answer
“Grad school is the snooze button on the
clock-radio of life”.
-John Rogers, Comedian who holds a graduate degree in physics
Your answers are on the following page.
More Fallback Exercises
“Time is a great teacher, but unfortunately
it kills all its pupils”.
-Hector Louis Berlioz
I have already completed the first example, which is the 12 AMPs in one cluster.
In the second example there are two clusters of six AMPs each. In the third example there
are three clusters of four AMPs. In the final example there are four clusters of three
AMPs.
Your job is to place the Fallback rows in their proper place. Remember, because I am a
nice guy I have helped you out with the first system containing one cluster of 12 AMPs.
Now, you should attempt to place the Fallback records on the proper AMP in the proper
cluster for the remaining systems.
More Fallback Exercises with Answers
“When you are courting a nice girl an hour
seems like a second. When you sit on a red-
hot cinder a second seems like an hour.
That’s relativity”.
-Albert Einstein
Check out how the Fallback was laid out in all four systems.
Fallback – Performance Vs Protection Questions
You will be asked several questions about the slide you have just seen concerning
Fallback. By answering these questions we hope you will be able to further your
understanding of exactly how Fallback works. We want you to clearly understand the
trade-off between protection and performance and then understand why NCR Teradata
usually picks a number of AMPs in a cluster that will maximize both. Answer the
questions assuming that there are millions of rows in the table.
1. Which System (A, B, C, D) provides the best protection?
2. Which System provides the best performance should a single AMP go down?
3. How many AMPs could you lose in System A and still have Teradata be able to
satisfy a query that was a Full Table Scan?
4. How many AMPs could you potentially lose in System D and still have Teradata
satisfy a query that was a Full Table Scan?
5. How many AMPs could you lose in System D (Cluster 1) before Teradata would
not be able to satisfy a query that was a Full Table Scan?
6. If none of the systems had Fallback, how many AMPs could any system lose
before Teradata would not be able to satisfy a query that was a Full Table scan?
7. Why does Teradata usually place four AMPs in each cluster?
Fallback – Performance Vs Protection (Answers)
Here are the questions again, but with the answers. Remember, you were to answer the
questions assuming that there are millions of rows in the table.
1. Which System (A, B, C, D) provides the best protection? D (This is because we
could potentially lose one AMP in each cluster, thus allowing us to lose 4 AMPs.)
2. Which System provides the best performance should a single AMP go down? A
(This is because if a single AMP went down, then the 11 other AMPs (all in the same
cluster) would each hold an equal portion of the down AMP’s Fallback rows.
Therefore, only 1/12th of the system would be affected, with each remaining AMP
responsible for a portion of the down AMP’s work.)
3. How many AMPs could you lose in System A and still have Teradata be able to
satisfy a query that was a Full Table Scan? One (You can only lose one AMP in a
cluster. If you lose two AMPs in a cluster, the Teradata system can’t fulfill
requests to the down AMPs.)
4. How many AMPs could you potentially lose in System D and still have Teradata
satisfy a query that was a Full Table Scan? Four (You can lose one AMP in each
Cluster with Fallback, but lose two in any single cluster and the table is in trouble.)
5. How many AMPs could you lose in System D (Cluster 1) before Teradata would
not be able to satisfy a query that was a Full Table Scan? One (You can only lose
one AMP in a cluster, so if a second of the three AMPs in Cluster 1 goes down the
table can no longer be scanned.)
6. If none of the systems had Fallback, how many AMPs could any system lose
before Teradata would not be able to satisfy a query that was a Full Table Scan?
None (Since the records are not Fallback protected, there is no way to satisfy
a query wanting information from the down AMP. That means the system could
not perform a Full Table Scan or satisfy any query that involved the down AMP.)
7. Why does Teradata usually place four AMPs in each cluster? Teradata usually
places four AMPs in a cluster because of both Performance and Protection.
(The Protection with four AMPs is solid because it is not likely that two AMPs
out of four will fail at the same time. The Performance is solid because if a single
AMP goes down then the three other AMPs in the cluster will share responsibility
for the down AMP’s records. The slowdown is confined to that one cluster because
its three AMPs will do their own work plus what is needed for the failed AMP.)
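The trade-off behind these answers can be worked out with a few lines of arithmetic. The Python sketch below is only an illustration: it assumes the four 12-AMP systems described in the exercise (cluster sizes 12, 6, 4 and 3) and assumes a down AMP's fallback rows are spread evenly over its cluster mates. The function names are made up for this example.

```python
from fractions import Fraction

TOTAL_AMPS = 12
# AMPs per cluster in each system from the exercise.
SYSTEMS = {"A": 12, "B": 6, "C": 4, "D": 3}


def max_survivable_failures(total_amps, cluster_size):
    """Best case for protection: exactly one AMP down in every cluster."""
    return total_amps // cluster_size


def extra_load_per_mate(cluster_size):
    """Fraction of the down AMP's work each surviving cluster mate picks up,
    assuming its fallback rows are spread evenly across the cluster."""
    return Fraction(1, cluster_size - 1)


for name, size in SYSTEMS.items():
    print(name,
          max_survivable_failures(TOTAL_AMPS, size),
          extra_load_per_mate(size))
```

Smaller clusters tolerate more simultaneous failures (System D can lose up to four AMPs), while bigger clusters spread the extra work more thinly (in System A each mate picks up only 1/11 of the down AMP's work). Four AMPs per cluster sits between the two extremes.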
The Six Rules of Fallback
“Don’t worry about the world coming to an
end today. It’s already tomorrow in
Australia”.
-Charles Schulz
There are six rules I want you to think about with Fallback.
Rule 1: Fallback doubles the size of your table.
Rule 2: All AMPs are clustered (usually in sets of four).
Rule 3: Fallback rows always reside within the same Cluster.
Rule 4: Two AMPs in the same Cluster never reside inside the same NODE.
Rule 5: Two AMPs in the same Cluster never reside inside the same CLIQUE.
Rule 6: Fallback protects you against a Failed AMP.
Cliques and Clusters
“Time is at once the most valuable and most
perishable of all our possessions”.
-John Randolph
On the following page you can see a picture that has four Cliques. In each Clique are
four Nodes. Within each Node are 2 PEs and 4 AMPs. Normally, there would be about 4
PEs and 25 AMPs, but this picture is designed to give you knowledge of how Teradata
Clusters the AMPs inside the cliques.
This picture is to set you up. Your job is to put in a clustering scheme that follows three
rules:
1) Group your AMPs in Clusters of Four.
2) Never have two AMPs in the same Cluster be a part of the same Clique.
3) Never have two AMPs in the same Cluster be a part of the same Node.
If you do this correctly you will understand that we can lose a Node or a Clique and still
not have more than one AMP within any Cluster fail.
Cliques and Clusters Answers
“Though no one can go back and make a
brand new start, anyone can start from now
and make a brand new ending”.
-Anonymous
On the following page you can see a picture that has four Cliques. In each Clique are
four Nodes. Within each Node are 2 PEs and 4 AMPs. Normally, there would be about
4 PEs and 25 AMPs, but this picture is designed to give you knowledge of how Teradata
Clusters the AMPs inside the cliques.
Remember the three rules:
1) Group your AMPs in Clusters of Four.
2) Never have two AMPs in the same Cluster be a part of the same Clique.
3) Never have two AMPs in the same Cluster be a part of the same Node.
If you do this correctly you will understand that we can lose a Node or a Clique and still
not have more than one AMP within any Cluster fail.
The following page shows the four AMPs in Cluster 1. They are each in a different
Clique and a different Node. I have circled the four AMPs and placed the number 1
inside them to represent Cluster 1. Notice that the four AMPs in cluster 1 are very far
apart physically.
Notice the four AMPs in Cluster 2. Notice the four AMPs in Cluster 3. Notice we have
clusters from Cluster 1 to Cluster G. We don’t normally call our Clusters by their
numbers, but I want you to also notice the four AMPs in Cluster G.
If we lost every node in Clique number 1 we would have lost exactly one AMP in every
Cluster.
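One way to satisfy the three rules can be sketched in Python. This is a toy model of the picture (four Cliques, four Nodes per Clique, four AMPs per Node), not the scheme Teradata itself uses; the idea of building each Cluster from the AMP in the same node position and slot of every Clique, and all the names below, are assumptions for illustration.

```python
from itertools import product

CLIQUES = 4
NODES_PER_CLIQUE = 4
AMPS_PER_NODE = 4


def assign_clusters():
    """Map cluster number -> list of (clique, node, slot) AMP positions.

    Each cluster takes one AMP from every clique, so no two of its AMPs
    can share a clique (and therefore cannot share a node either).
    """
    clusters = {}
    for clique, node, slot in product(range(CLIQUES),
                                      range(NODES_PER_CLIQUE),
                                      range(AMPS_PER_NODE)):
        cluster_id = node * AMPS_PER_NODE + slot
        clusters.setdefault(cluster_id, []).append((clique, node, slot))
    return clusters


def amps_lost(clusters, failed_clique=None, failed_node=None):
    """Count failed AMPs per cluster. failed_node is a (clique, node) pair."""
    lost = {}
    for cluster_id, amps in clusters.items():
        lost[cluster_id] = sum(
            1 for clique, node, _ in amps
            if clique == failed_clique or (clique, node) == failed_node
        )
    return lost


clusters = assign_clusters()
print(len(clusters))                                          # number of clusters
print(set(amps_lost(clusters, failed_clique=0).values()))     # losses per cluster
print(max(amps_lost(clusters, failed_node=(2, 1)).values()))  # worst-hit cluster
```

The 64 AMPs fall into 16 clusters of four. Losing a whole Clique costs every Cluster exactly one AMP, and losing a single Node costs any Cluster at most one AMP, so Fallback can still cover the missing rows.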
Down AMP Recovery Journal (DARJ)
“Once the game is over, the king and the
pawn go back in the same box.”
- Italian Proverb
The Down AMP Recovery Journal (DARJ) is started on all AMPs in the cluster when
an AMP is down. This allows the three remaining AMPs to look after their mate. Since there are
four AMPs in most clusters and all Fallback for a particular AMP remains within the
cluster, there are three AMPs that will hold Fallback rows for a down AMP.
The Down AMP Recovery Journal (DARJ) is a special journal used only for
FALLBACK rows when an AMP is not working. Like the TRANSIENT
JOURNAL, the DARJ, also known as the RECOVERY JOURNAL, gets its space from
DBC’s PERM Space. When an AMP fails, the rest of the AMPs in its cluster
initiate a DARJ. The DARJ keeps track of any changes that would have been
written to the failed AMP. When the AMP comes back online, the DARJ will catch
the AMP up by completing missed transactions. Once everything is caught up, the
DARJ is dropped.
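The life cycle of the DARJ (started when an AMP fails, filled with the changes the AMP missed, replayed on recovery, then dropped) can be sketched as a toy Python class. Everything here is a simplification for illustration: the class and method names are hypothetical, and a single journal per down AMP stands in for the journals kept by the surviving AMPs in the cluster.

```python
class Cluster:
    """Toy cluster: each AMP is modeled as a dict of row_id -> value."""

    def __init__(self, amp_ids):
        self.amps = {amp: {} for amp in amp_ids}
        self.down = set()
        self.darj = {}  # down AMP -> list of (row_id, value) changes it missed

    def fail(self, amp):
        """An AMP goes down: its cluster starts a recovery journal for it."""
        self.down.add(amp)
        self.darj[amp] = []

    def write(self, amp, row_id, value):
        """Apply a change, or journal it if the target AMP is down."""
        if amp in self.down:
            self.darj[amp].append((row_id, value))
        else:
            self.amps[amp][row_id] = value

    def recover(self, amp):
        """Replay the missed changes in order, then drop the journal."""
        for row_id, value in self.darj[amp]:
            self.amps[amp][row_id] = value
        del self.darj[amp]
        self.down.discard(amp)


cluster = Cluster([0, 1, 2, 3])
cluster.write(2, "row-7", "old")
cluster.fail(2)
cluster.write(2, "row-7", "new")   # AMP 2 is down, so this goes to the journal
cluster.write(2, "row-9", "added")
cluster.recover(2)
print(cluster.amps[2])
print(cluster.darj)
```

After recovery AMP 2 holds both missed changes and the journal no longer exists, which mirrors the "catch up, then drop" behavior described above.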
Permanent Journal
“The absent are always in the wrong.”
English Proverb
If a system had five million rows and used FALLBACK protection, then it would have
five million FALLBACK rows. However, this would be quite costly because
FALLBACK actually stores a duplicate copy of all the rows on other AMPs within the
same cluster. FALLBACK is used either because the system is mission critical or the
system is not backed up regularly. For customers who back up data regularly, another
option for data restoration is the “Permanent Journal.” When a company is not severely
impacted by waiting a couple of hours for a restoration to complete, this is a very good
option. The Permanent Journal works in conjunction with backup procedures, plus it’s a
lot more cost effective than FALLBACK.
“The absent are always in the write.”
Permanent Journal Proverb
The Permanent Journal stores only images of rows that have been changed due to
an INSERT, UPDATE, or DELETE command. That is why when data is lost or
absent the permanent journal can write it back to the disks. The permanent journal keeps
track of all new, deleted or modified data since the last Permanent Journal backup. This
option is usually less expensive than storing the additional five million FALLBACK
rows.
Like FALLBACK, the Permanent Journal is optional. It may be used on specific tables
of your choosing or on no tables at all. It provides the flexibility to customize a Journal
to meet specific needs. The Permanent Journal must be manually purged from time to
time.
There are five image options for the Permanent Journal:
Before Journal
After Journal
Dual Before Journal
Dual After Journal
Journal
Table create with Fallback and Permanent Journal
“A real friend is one who walks in when the
rest of the world walks out.”
Walter Winchell
The example creates a table called “Employee” in the Teratom database that is
FALLBACK protected. A BEFORE Journal and a DUAL AFTER Journal are specified.
Remember that both FALLBACK and JOURNALING have defaults of “NO”, meaning
if you don’t specify this protection at either the table or database level the default is NO
FALLBACK and NO JOURNALING.
Permanent Journal Rules
“If you can find something everyone agrees
on, it’s wrong.”
-Mo Udall
The only time you can create a Permanent Journal is when you CREATE a DATABASE
or a USER or if you MODIFY a database or user. Remember, Teradata considers a
database or user basically the same thing. Both databases and users can have PERM
Space or Spool space assigned to them. So, remember that you can only create a
Permanent Journal inside a database or user.
You can only have one Permanent Journal per database or user so theoretically we could
have one Permanent Journal in the entire system or we could go to the other extreme and
have one Permanent Journal in every database and every user.
When you create a table you specify whether or not you want journaling. You can also
tell the table which journal you want as its journal. You can have many tables in a
database and each table could potentially write to a default journal in the database itself,
or you could have each table write to a default journal in another database or you could
even have some tables write to one default journal and other tables write to different
default journals. It is up to you.
Some Permanent Journal Possibilities
“Talk low, talk slow, and don’t say too much.”
-John Wayne, Advice on acting
Notice in the picture below the following scenarios. You could have a scenario where
every Database or User has their own Permanent Journal and the tables in that Database
or User always select the Permanent Journal in their database or user as their Default
Journal. That example is shown in System 1.
You might set up one Permanent Journal in a database and have every table in the system
choose that one Permanent Journal as their Default Journal. That example is shown in
System 2.
You will most likely have multiple Permanent Journals and have tables choose one of
them as their Default Journal. That example is shown in System 3.
Creating a Permanent Journal
“Only the educated are free.”
-Epictetus
The following page is an excellent example of creating a Permanent Journal. You can
only create a Permanent Journal when you use a CREATE DATABASE or CREATE
USER statement. Of course this also applies with a MODIFY DATABASE or MODIFY
USER statement.
Remember your two most basic Permanent Journal rules:
Rule 1:
You can only have one Permanent Journal per database or per user.
Rule 2:
Tables within a database can be assigned to any Permanent Journal in any DATABASE
or USER.
After this next page we will create tables inside Advertising to see examples of different
scenarios.
Create Table Examples with Permanent Journals
“I am desperately trying to figure out why
Kamikaze pilots wore helmets.”
-Dave Edison
In our previous slide we created a database called Advertising. We gave it a Journal
called Journals!
We created the journal called Journals in order to keep track of our table changes, and
it will serve as the default for any table created in Advertising. Below are examples of
three different tables being created. Watch and learn what happens.
The first example shows a table called Department_Table. This table makes no reference
to any journals, so by default it will write an After Journal to Advertising.Journals. The
table took the default AFTER Journal that was created in its database.
The next example shows a table called Employee_Table. This table overrides the
database default and explicitly demands that the table not write to a Journal.
The final example shows a table called Department_Table2. This table requests an
AFTER Journal, but demands that the table write any changes to rows to a different
Journal in another database called Sales.SJournal.
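The three cases (take the database default, explicitly opt out, or name a journal in another database) boil down to a simple decision rule. The Python sketch below only models that rule; it is not Teradata syntax, and the function name and the NO_JOURNAL marker are invented for this illustration. The journal names come from the examples above.

```python
# Marker meaning "this table explicitly asked for no journaling" (hypothetical).
NO_JOURNAL = object()


def resolve_journal(database_default, table_journal=None):
    """Return the journal a table writes to, or None if it does not journal.

    database_default: the journal defined on the table's database, if any.
    table_journal:    None (say nothing), NO_JOURNAL, or an explicit journal name.
    """
    if table_journal is NO_JOURNAL:
        return None                 # table overrides the database default
    if table_journal is not None:
        return table_journal        # table names its own journal
    return database_default         # table inherits the database default


default = "Advertising.Journals"
print(resolve_journal(default))                   # Department_Table
print(resolve_journal(default, NO_JOURNAL))       # Employee_Table
print(resolve_journal(default, "Sales.SJournal")) # Department_Table2
```

Department_Table ends up on Advertising.Journals, Employee_Table on no journal at all, and Department_Table2 on Sales.SJournal.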
Each Permanent Journal is made up of 3 Areas
“I became a policeman because I wanted to
be in a business where the customer is
always wrong.”
– Anonymous
Every Permanent Journal is comprised of three areas. They are the Active Current
Journal, the Saved Current Journal and the Restored Journal. This slide demonstrates the
purpose of the Active Current Journal and the Saved Current Journal portions of the
Permanent Journal. When a table is defined with a Permanent Journal and a change takes
place on a row the changed row is written (appended) to the Active Current Journal.
When the Database Administrator (DBA) submits a CHECKPOINT WITH SAVE
statement the Active Current Journal appends its rows to the Saved Current Journal.
Then, Teradata automatically deletes the Active Current Journal rows so a fresh start to
the Active Current Journal can take place.
When the DBA submits an ARCHIVE JOURNAL TABLE statement the Saved Current
Journal is copied to tape. This is usually done on a daily basis. The DBA must submit a
DELETE JOURNAL statement to delete the Saved Current Journal. It is never done
automatically.
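The flow between the Active Current Journal, the Saved Current Journal and tape can be pictured with a small Python model. It is a sketch only: the class and its method names are hypothetical stand-ins for the CHECKPOINT WITH SAVE, ARCHIVE JOURNAL TABLE and DELETE JOURNAL statements, a plain list stands in for tape, and the Restored Journal area is left out.

```python
class PermanentJournal:
    """Toy model of the Active and Saved areas of a Permanent Journal."""

    def __init__(self):
        self.active = []  # Active Current Journal
        self.saved = []   # Saved Current Journal
        self.tape = []    # stand-in for archived copies

    def record_change(self, image):
        """A changed row image is appended to the Active Current Journal."""
        self.active.append(image)

    def checkpoint_with_save(self):
        """Append active rows to the saved area, then empty the active area."""
        self.saved.extend(self.active)
        self.active.clear()

    def archive_journal(self):
        """Copy the saved area to tape. Nothing is deleted here."""
        self.tape.append(list(self.saved))

    def delete_journal(self):
        """The DBA must explicitly clear the saved area."""
        self.saved.clear()


journal = PermanentJournal()
journal.record_change("after image of row 1")
journal.record_change("after image of row 2")
journal.checkpoint_with_save()
journal.record_change("after image of row 3")
journal.archive_journal()
print(len(journal.active), len(journal.saved), len(journal.tape))
journal.delete_journal()
print(len(journal.active), len(journal.saved), len(journal.tape))
```

Notice that archiving leaves the saved rows in place; they only disappear after the explicit delete step, just as the text says.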
Permanent Journal Rules
“You miss 100 percent of the shots you
never take.”
– Wayne Gretzky
You miss 100 percent of the Journals you never save! The following page shows the
rules for a consistent Permanent Journal.
The Four Locks of Teradata
“Some birds aren’t meant to be caged, their
feathers are just too bright. And when they
fly away, the part of you that knows it was a
sin to lock them up, does rejoice.”
– Shawshank Redemption
You don’t lock up a bird, but you always lock a query. Teradata uses a lock manager to
automatically lock at the database, table or row hash level. Teradata will lock objects
using four types of locks:
Exclusive - Exclusive locks are placed only on a database or table when the object is
going through a structural change. An Exclusive lock restricts access to the object by any
other user. This lock can also be explicitly placed using the LOCKING modifier.
Write - A Write lock happens on an INSERT, DELETE, or UPDATE request. A Write
lock restricts access by other users. The only exception is for users reading data that are
not concerned with data consistency and override the applied lock by specifying an
Access lock. This lock can also be explicitly placed using the LOCKING modifier.
Read - This is placed in response to a SELECT request. A Read lock restricts access by
users who require Exclusive or Write locks. This lock can also be explicitly placed using
the LOCKING modifier. Read locks put the word integrity in ―data integrity‖. If you
have a multi-user environment with updates occurring and you need to keep data
consistent, you want a read lock.
Access - Placed in response to a user-defined LOCKING FOR ACCESS phrase. An
Access lock permits the user to READ an object that may already be locked for
READ or WRITE. An access lock does not restrict access by another user except when
an Exclusive lock is required. A user requesting an Access lock cannot be concerned with
data consistency.
When Teradata locks a resource for a user, the lock has no time limit: it is held
until the user releases it.
This is different from a deadlock situation. If two transactions are deadlocked, the
youngest query is always aborted.
Teradata has 3 levels of Locking
“When you go into court you are putting
your fate into the hands of twelve people
who weren’t smart enough to get out of jury
duty.”
- Norm Crosby
Teradata uses a lock manager to be judge, jury, and executioner of SQL. There are four
locks placed on objects at the database, table, or row hash level.
Quiz – Which Level of Locking is Occurring?
“In the end we’ll remember not the words of
our enemies, but the silence of our friends.”
– Martin Luther King, Jr.
Dr. Martin Luther King Jr. was a great man with a message that will live on forever. Dr.
King believed in equality. In dedication to Dr. King we have set up this quiz about lock
equality. Which lock level will Teradata use for the SQL on the following page? Will the
SQL cause Teradata to place the lock at the Database level, Table level, or Row Hash level?
Quiz Locking Answers
“If you are planning for a year, sow rice; if
you are planning for a decade, plant trees; if
you are planning for a lifetime, educate
people.”
- Chinese Proverb
On the following page we have the answers to the quiz.
The Teradata Lock Manager
“You can make more friends in two months
by becoming interested in other people, than
you will in two years by trying to get other
people interested in you.”
- Dale Carnegie
Someone on a New York Street Corner was asked, “How do you get to Carnegie Hall?”
He replied, “Practice, man, practice!” The following page is designed so you can practice
the art of understanding Teradata locks.
What I want you to know is that the only lock that users have control over is the Access
Lock. If you want to read a table, but don’t want to wait on a WRITE Lock, and don’t
care that the answer set may not be perfect, then you want an ACCESS Lock. This is also
called a “Dirty Read” or a “Read without Integrity” because the data isn’t always perfect.
Let me explain. When someone is updating a table they are given a WRITE Lock. While
they perform the UPDATE, a user who wants a READ lock on the table submits their SQL.
Teradata makes the READ lock wait until the WRITE lock has completely updated the
table. This could take a long time. An Access lock says, “I know someone is already
updating the table, but I don’t want to wait. I am merely trying to get an average of sales
for the week and I don’t have to have everything perfect.”
I also want you to notice that an Exclusive Lock, Write Lock and the Read Lock are
determined by Teradata based on the SQL that is written.
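As a rough illustration of "Teradata picks the lock from the SQL", here is a toy Python function that maps a statement to the lock type described in this chapter. It is not how the real lock manager works: it only looks at the first keyword, and treating ALTER and DROP as the "structural change" statements is an assumption made for this sketch.

```python
import re


def requested_lock(sql):
    """Return the lock type this toy model associates with a statement."""
    text = sql.strip().upper()
    if not text:
        raise ValueError("empty statement")
    # An explicit LOCKING ... FOR ACCESS modifier is the user's own choice.
    if re.match(r"LOCKING\b.*?\bFOR\s+ACCESS\b", text, re.DOTALL):
        return "ACCESS"
    verb = text.split()[0]
    if verb == "SELECT":
        return "READ"
    if verb in {"INSERT", "UPDATE", "DELETE"}:
        return "WRITE"
    if verb in {"ALTER", "DROP"}:  # assumed examples of structural changes
        return "EXCLUSIVE"
    raise ValueError(f"statement not covered by this sketch: {verb}")


print(requested_lock("SELECT * FROM Employee_Table"))
print(requested_lock("UPDATE Employee_Table SET Salary = Salary * 1.1"))
print(requested_lock("DROP TABLE Employee_Table"))
print(requested_lock("LOCKING ROW FOR ACCESS SELECT * FROM Employee_Table"))
```

The first three statements get READ, WRITE and EXCLUSIVE without the user asking for anything; only the last one, with its LOCKING modifier, gets ACCESS.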
Locking Modifiers – The Access Lock
“We’re fools whether we dance or not, so
we might as well dance.”
- Japanese Proverb
If the user doesn’t want to wait on any write locks they can use the Locking for ACCESS
modifier. They won’t have to wait on any Write Locks that would normally make them
wait in the Pseudo Table line, because Access locks are compatible with Write locks. Notice
in the picture on the next page that I have highlighted in Yellow the actual SQL Statement.
You will find that most views use the Locking for Access modifier.
Here is a trick. Merely put “Locking row for Access” in your SQL. This will try to
lock just the row for an Access Lock, but if Teradata needs to lock the entire table it will
do so.
Locks and their compatibility
“Frankly, my dear, I don’t give a damn.”
- Rhett Butler – Gone with the Wind (1939)
Not everyone is compatible and Teradata locks are no exception. Locks that are
compatible can lock the same object simultaneously. Clark Gable would have been a
great Teradata user because he always used a Rhett Lock and according to Scarlet was
almost never Write!
Locks that are compatible can share access to objects simultaneously so READ locks
are great because one or a thousand users can read the same object at the same time.
Teradata will not allow a user to change a table while others are reading it. This prevents
database corruption.
An ACCESS Lock is an excellent way to avoid waiting for a write lock currently on a
particular table. Two statements allow this:
Locking Row for Access
Locking Tablename for Access
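The compatibility rules can be written down as a small lookup table. The sketch below is a toy model in Python, not Teradata code: the four lock names come from the text, the pairings reflect the usual Teradata lock levels as commonly documented, and the function name `compatible` is made up for illustration. Check it against the compatibility slide before relying on it.

```python
# Toy model of lock compatibility (illustrative names, not a Teradata API).
# COMPATIBLE[requested] is the set of held locks the request can share with.
COMPATIBLE = {
    "ACCESS":    {"ACCESS", "READ", "WRITE"},
    "READ":      {"ACCESS", "READ"},
    "WRITE":     {"ACCESS"},
    "EXCLUSIVE": set(),
}

def compatible(requested, held):
    """Return True if a `requested` lock can coexist with a `held` lock."""
    return held in COMPATIBLE[requested]

for held in COMPATIBLE:
    sharers = [mode for mode in COMPATIBLE if compatible(mode, held)]
    print(f"{held:9} can share with: {', '.join(sharers) or 'nothing'}")
```

Reading the table row by row shows why the Access lock is the one to reach for when a Write lock is in the way: it is the only lock that a Write lock will share with.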
![Page 388: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/388.jpg)
![Page 389: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/389.jpg)
Moving Through the Locking Queue
“If you’re falling off a cliff, you may as well
try to fly.”
- Captain John Sheridan, Babylon 5
Teradata locks can fly through the Queue with amazing speed. How?
Teradata locks work a lot like going to a movie theatre. When you decide you want to see a
movie or access a table, you must first get in line. But in the Teradata Movie Line, if you
are compatible with the person directly in front of you then you can move up the line.
That doesn’t mean you can go right to the front of the line if you are compatible with the
first person in line. It means that you can move up the line one person at a time as long
as you are compatible.
The next couple of pages will discuss what locks are compatible with each other.
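The movie-line rule can be tried out with a toy model. The Python sketch below is an interpretation, not Teradata internals: it assumes that, because a request moves up one place at a time, it can run only once it is compatible with every request still ahead of it. The function name `running_now` and the sample queue are hypothetical.

```python
# Toy model of the locking queue (an interpretation of the movie-line rule).
COMPATIBLE = {
    "ACCESS":    {"ACCESS", "READ", "WRITE"},
    "READ":      {"ACCESS", "READ"},
    "WRITE":     {"ACCESS"},
    "EXCLUSIVE": set(),
}

def running_now(queue):
    """Return the positions in `queue` whose locks can be held at the same time.

    A request moves up one place at a time, so it reaches the front only if
    it is compatible with every request that is still ahead of it.
    """
    running = []
    for position, mode in enumerate(queue):
        if all(ahead in COMPATIBLE[mode] for ahead in queue[:position]):
            running.append(position)
    return running

queue = ["READ", "READ", "ACCESS", "WRITE"]  # front of the line is position 0
print(running_now(queue))  # [0, 1, 2]
```

In this queue the two Reads and the Access run together, while the Write at the back can pass the Access but is stopped by the Reads and has to wait.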
![Page 390: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/390.jpg)
![Page 391: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/391.jpg)
Quiz – Which Locks Move Up?
“Everyone is kneaded out of the same dough
but not baked in the same oven.”
- Yiddish Proverb
The quiz on the following page will give you an opportunity to understand how locks
move up and read rows of a table simultaneously. If everyone in a company only did
SELECTs, then there would only be READ locks. Read locks are compatible with each other, so
everyone could always immediately read any table they want. However, most of the time
a data warehouse environment has some users who want to read and analyze data and others
doing updates. This is where there is a danger of slowing users down because they often
have to wait to access a table.
Remember what is compatible with what, and also remember that you can only move up 1
person at a time, and that is only if you are compatible!
![Page 392: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/392.jpg)
![Page 393: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/393.jpg)
Answers to Locking Quiz
“They can conquer who believe they can.”
- Virgil 70 BC
You believed you could do it and that is why you probably aced this quiz. Great job!
Notice how I put the slide together on the next page. I am showing in the circles which
locks ran simultaneously. These locks moved up because they were
compatible with the lock directly in front of them. You can keep moving up until you run
into a lock you are not compatible with.
![Page 394: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/394.jpg)
![Page 395: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/395.jpg)
A Single AMP Acts as the Locking Gatekeeper
“First you imitate, then you innovate.”
- Miles Davis
Miles Davis was a musical genius who worked hard at his Jazz. Teradata has struck a
chord with its lock abilities too. Please make a note of it.
Teradata innovates by making sure that only 1 AMP is responsible for locking a
particular table. What is innovative about this is that each AMP is assigned certain tables
it is responsible for locking. How does Teradata accomplish this and make sure that each
AMP is responsible for an equal number of tables? It hashes the table name and then
goes to the hash map. The hash map then points to the AMP responsible.
In our example on the following page you see that we have two users named John and
Mary. Both are trying to run SQL on the Order_Table. Teradata will need to assign one
of the AMPs the responsibility of locking the table so it hashes the name Order_Table.
As you can see the Row Hash came out as 00000000001100, which is binary for 12.
Teradata goes to the 12th bucket in the hash map and the bucket says AMP 4 will be
responsible for locking the Order_Table.
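The hash-then-look-up step can be sketched in a few lines of Python. This is a toy under stated assumptions: `zlib.crc32` stands in for Teradata’s own hashing algorithm, the hash map has only 16 buckets, and the bucket layout is arranged by hand so that bucket 12 points to AMP 4 as in the example. The names `HASH_MAP` and `gatekeeper_amp` are made up.

```python
import zlib

NUM_AMPS = 4
NUM_BUCKETS = 16  # tiny hash map, for illustration only

# Hash map: bucket number -> AMP number (1 to 4), spread round-robin.
# The offset is chosen so that bucket 12 points to AMP 4, as in the example.
HASH_MAP = [(bucket - 1) % NUM_AMPS + 1 for bucket in range(NUM_BUCKETS)]

def gatekeeper_amp(table_name):
    """Return the AMP that would run the locking queue for `table_name`."""
    row_hash = zlib.crc32(table_name.encode("utf-8"))  # stand-in hash function
    bucket = row_hash % NUM_BUCKETS
    return HASH_MAP[bucket]

print(HASH_MAP[0b1100])               # bucket 12 -> AMP 4
print(gatekeeper_amp("Order_Table"))  # the same AMP for John and for Mary
```

Because the hash of a given table name never changes, every user who touches that table is sent to the same gatekeeper AMP, and different table names spread the duty across the AMPs.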
AMP 4 builds a Pseudo Table. John got there just before Mary because John
submitted his query first, so the line has been established. Mary will wait for John to
finish unless her lock is compatible with his. If that is the case then both will have access
to the Order_Table simultaneously.
AMP 4 will send a message over the BYNET to the other AMPs guiding them to lock
their tables for John first and then do the same for Mary.
This locking architecture is designed to eliminate deadlocks and ensure that the first
query submitted gets the first lock for a particular table. Teradata refers to this as “a
sequential locking resource”.
![Page 396: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/396.jpg)
![Page 397: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/397.jpg)
Every AMP performs Locking Gatekeeper Duties
“Fate chooses your relations, you choose
your friends.”
- Jacques Delille
Notice the picture on the following page. Teradata fate chooses which AMP will be
responsible for locking the table(s) your SQL Query needs. What I want you to notice is
that each AMP is responsible for locking a different table. AMP 1 is responsible for the
Order_Table, AMP 2 for the Item_Table, AMP 3 for the Sales
table and AMP 4 for the Cust_Table.
Teradata accomplishes this by hashing the table name, so each AMP in theory
should be responsible for an equal number of tables.
Notice that each AMP builds what is called a Pseudo table. This is the line
for that table’s access. The first person to submit the SQL will be at the front of the line
and the last to submit will be at the end of the line in the pseudo table. If you are
compatible with the lock in front of you then you can move up one lock at a time.
![Page 398: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/398.jpg)
![Page 399: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/399.jpg)
Answers to Which AMP is Waiting on Access
“Regret for wasted time is more wasted
time.”
- Mason Cooley
The slide on the following page answers the age-old question, “Which AMP will have a
lock that waits?” The answer, as you can see, is AMP 1, and it is because of the last Write in
AMP 1’s Pseudo table. The Write can move up to the ACCESS directly in front of it, but
as the Access also moves up to the Read Lock, the Write has to wait for the reads to
finish.
![Page 400: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/400.jpg)
![Page 401: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/401.jpg)
Explains – The Pseudo Table for Locks
“Money doesn’t bring happiness. People
with ten million dollars are no happier than
people with nine million dollars.”
- Hobart Brown
The first thing you will see in an EXPLAIN Statement, which is the PE’s plan in English,
will usually refer to locking. The first line usually says something like, “Locking a
Pseudo Table for Read”. This means that you are now in line in the Pseudo Table. The
next line states “We lock the Employee_Table for Read”, which means you have moved
to the front of the line and the lock has been placed on the table.
I like to explain this to users because the EXPLAIN can really give insight into the
Parsing Engine’s plan, but users are often confused by the term Pseudo Table. Now you
know!
![Page 402: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/402.jpg)
![Page 403: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/403.jpg)
The NOWAIT Locking Option
“Destiny is not a matter of chance, it is a
matter of choice; it is not a thing to be
waited for, it is a thing to be achieved”
- William Jennings Bryan
Sometimes your SQL is not a thing to be waited for; it is a thing to be aborted. When the
NOWAIT option is used, if a lock request cannot be granted immediately the transaction
will abort. The NOWAIT option is used when it is not desirable to have a request wait
for resources, or cause resources to be tied up while waiting. The NOWAIT option is an
excellent way to avoid waiting on conflicting locks.
Don’t make the mistake of thinking the NOWAIT option means you have a free dash to
the front of the lock line.
The NOWAIT option dictates that a transaction is to ABORT immediately if the LOCK
MANAGER cannot immediately place the lock.
Use a LOCKING modifier with the NOWAIT option when you don’t want requests
waiting in the queue. A 7423 return code informs the user that the lock could not be
placed due to an existing, conflicting lock.
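The difference between waiting and NOWAIT can be shown with a toy lock request. The Python sketch below is not a Teradata interface: the function `request_lock`, the exception class `LockNotAvailable`, and the string results are all hypothetical, and only the 7423 code is taken from the text.

```python
# Toy model of a lock request with and without NOWAIT (hypothetical names).
COMPATIBLE = {
    "ACCESS":    {"ACCESS", "READ", "WRITE"},
    "READ":      {"ACCESS", "READ"},
    "WRITE":     {"ACCESS"},
    "EXCLUSIVE": set(),
}

class LockNotAvailable(Exception):
    """Stand-in for the abort a NOWAIT request gets on a conflicting lock."""
    code = 7423

def request_lock(held_locks, mode, nowait=False):
    """Return 'granted' or 'queued'; raise LockNotAvailable if NOWAIT is blocked."""
    if all(held in COMPATIBLE[mode] for held in held_locks):
        return "granted"
    if nowait:
        raise LockNotAvailable(f"{mode} lock conflicts with {held_locks}")
    return "queued"

print(request_lock(["WRITE"], "READ"))  # queued: the request waits in line
try:
    request_lock(["WRITE"], "READ", nowait=True)
except LockNotAvailable as error:
    print(error.code)  # 7423: the request aborts instead of waiting
```

Note that NOWAIT never changes whether the lock is granted. It only changes what happens when the lock cannot be granted: abort now rather than wait in the queue.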
![Page 404: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/404.jpg)
![Page 405: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/405.jpg)
Rules of Teradata Locking
“People can have the Model T in any color
– so long as it’s black.”
- Henry Ford
The rules of locking are right there on the slide on the following page.
![Page 406: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/406.jpg)
![Page 407: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/407.jpg)
![Page 408: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/408.jpg)
![Page 409: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/409.jpg)
Explains – Pseudo Tables
“If we knew what we were doing, it wouldn’t
be called research, would it?”
- Albert Einstein
On most EXPLAINs the very first thing you will read is about locking of a Pseudo table.
This is very confusing for many users and they often feel lost immediately when reading
an EXPLAIN plan. Let me give you the real scoop.
Teradata is a parallel processing system that places a portion of the rows of a table on
every AMP. That means that each AMP has its own portion of a table. In other words,
each AMP thinks it owns that table because, to the AMP, it has a table header and
rows that follow. That is a table to an AMP.
This makes locking a little tricky because if you are trying to query the Employee_Table
and you have a 500 AMP system then you are essentially telling all 500 AMPs to lock
their Employee_Table. It is important to keep this locking process coordinated.
Teradata accomplishes this by having only 1 AMP command the locking process.
Because Teradata doesn’t want 1 AMP to be responsible for locking every table, it
spreads this duty around to every AMP. Let me explain.
When a user wants to query the Employee_Table the Parsing Engine hashes the name
Employee_Table and looks to the hash map to see which AMP will be responsible for
locking the Employee_Table. Let’s just say it is AMP 4 for example’s sake.
AMP 4 has a Pseudo Table which it dedicates to the Employee_Table. The Pseudo table
will keep track of who wants to query the Employee_Table. It is considered first come,
first served. The first person who enters a query for the Employee_Table will be first in
line in the Pseudo Table. Then AMP 4 will communicate with all the other AMPs to lock
their Employee_Table for the first user in AMP 4’s Pseudo table. This allows Teradata to
control and synchronize the locking system from a single resource.
Think of a Pseudo table as a queue. The first user in the queue gets the lock and the
others wait their turn. So, when you see “We lock a distinct Pseudo table for READ” that
means you are in the locking queue waiting your turn to query the table. When you see
the next line of the EXPLAIN say “We lock the Employee_Table for READ” that means
it is now your turn to query the table and the locking has taken place.
You now know more about EXPLAINs than 90% of the people in the world.
Congratulations friend! Thanks for reading this book!
![Page 410: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/410.jpg)
![Page 411: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/411.jpg)
Explain – Full Table Scan
“The great tragedy of science, the slaying of
a beautiful theory by an ugly fact.”
- Thomas Henry Huxley
Teradata performs Full Table Scans as fast as any other vendor. It flies through data
because of its parallel processing design. In a Full Table Scan each AMP must read all
of its rows for a particular table, so each row of the table is examined once.
You can see in the picture on the following page that we use every AMP when you see
the words “All-AMPs Retrieve”. You can see that every row is read from the words
“All-Rows Scan”. So, when you see an All-AMPs Retrieve by way of an All-Rows Scan
you will know it is merely a Full Table Scan.
There is nothing wrong with doing a Full Table Scan if necessary, but it is silly to do a
Full Table Scan if you don’t need to do it. Some users won’t know the Primary Index
and write their queries to do a Full Table Scan when they could have used the Primary
Index or a Secondary Index in the query. This wastes time and money.
If you think you are writing a query that should use the Primary Index or a Secondary
Index or only search certain Partitions and you see the Full Table Scan you should
investigate what is wrong.
Never do a Full Table Scan unless it is the only choice! For example, if you wanted to
find the Average Salary of the employees in the Employee_Table you would have to read
every row to get that answer set. Doing a Full Table Scan in that case is 100%
acceptable!
If, however, you worked in Human Resources and an employee came in for a meeting and
you wanted to look their information up in the Employee_Table, you should find out the
Primary Index of the Employee_Table. If it is the column Employee_No you should
ask the employee what their Employee_No is and then use that Employee_No in the
WHERE clause of the SQL. It will be a 1 AMP operation retrieving only 1 row. It is
very quick! Some Human Resources users will leave off the WHERE clause and do a
Full Table Scan and then scroll down through the huge report until they find the right
employee. This is a huge waste of resources in Human Resources. The irony, the irony!
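The two cases can be compared side by side in a toy model. The Python sketch below makes several assumptions that are not from the book: `zlib.crc32` stands in for Teradata’s hash, the system has 4 AMPs, and the salaries are made-up sample data. It only illustrates the difference in how much work each request causes.

```python
import zlib

NUM_AMPS = 4

def amp_for(pi_value):
    """Stand-in for hashing a Primary Index value and checking the hash map."""
    return zlib.crc32(str(pi_value).encode("utf-8")) % NUM_AMPS

# Distribute made-up (Employee_No, Salary) rows to AMPs by hashing Employee_No.
amps = [dict() for _ in range(NUM_AMPS)]
for employee_no, salary in [(1, 50000), (2, 62000), (3, 48000), (4, 75000), (5, 55000)]:
    amps[amp_for(employee_no)][employee_no] = salary

def average_salary():
    """Full Table Scan: every AMP reads every one of its rows."""
    rows_read = [salary for amp in amps for salary in amp.values()]
    return sum(rows_read) / len(rows_read), len(rows_read)

def salary_of(employee_no):
    """Primary Index read: only the one AMP that owns the hash is asked."""
    return amps[amp_for(employee_no)].get(employee_no)

print(average_salary())  # (58000.0, 5): all 5 rows had to be read
print(salary_of(3))      # 48000: one AMP, one row
```

The average genuinely needs every row, so the scan is the right plan there. The single-employee lookup needs one row, and hashing the Employee_No tells the system exactly which AMP holds it.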
![Page 412: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/412.jpg)
![Page 413: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/413.jpg)
Explain – Primary Index Reads
“Winning is a habit. Unfortunately, so is
losing.”
- Vince Lombardi
The fastest queries in Teradata use the Primary Index column in the WHERE Clause. As
you can see in the example on the following page the Primary Index is the Employee_No
and it is being used in the WHERE clause of the query.
In the EXPLAIN you will see the beautiful words, “We do a Single-AMP Retrieve by
way of a UNIQUE PRIMARY INDEX”.
This is the best way to write queries in Teradata and the fastest way for Teradata to
retrieve data!
Did you know that Teradata doesn’t even use SPOOL Space for this query? It just
retrieves the row and immediately delivers it to the user.
![Page 414: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/414.jpg)
![Page 415: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/415.jpg)
Explain – Secondary Index Read
“Sometimes a scream is better than a
thesis.”
- Ralph Waldo Emerson
The fastest queries in Teradata use the Primary Index in the WHERE clause because this
is a 1 AMP operation, but the second fastest queries use a Unique Secondary Index. This
is called a USI (Unique Secondary Index) query. A USI query only uses two AMPs.
In the picture on the following page you can see that we first show you the CREATE
UNIQUE INDEX statement. Now you know that we have created a USI.
Then in the picture we show you the query that uses the secondary index. In the
EXPLAIN you can see the statement, “We do a Two-AMP Retrieve by way of a
UNIQUE INDEX”. This means that the Parsing Engine is going to use the secondary
index and that this query will be delivered quickly.
A USI query is always a two-AMP operation.
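The two steps of a USI read can be sketched as two hash lookups. The Python below is a simplified toy: `zlib.crc32` stands in for Teradata’s hash, the SSN column and the sample rows are hypothetical, and the index row stores the Primary Index value rather than a real row ID. In this toy the two steps can land on the same AMP by coincidence.

```python
import zlib

NUM_AMPS = 4

def amp_for(value):
    """Stand-in for hashing a value and checking the hash map."""
    return zlib.crc32(str(value).encode("utf-8")) % NUM_AMPS

base = [dict() for _ in range(NUM_AMPS)]          # base rows, keyed by Employee_No
usi_subtable = [dict() for _ in range(NUM_AMPS)]  # index rows, keyed by SSN

def insert(employee_no, ssn, name):
    """Store the base row by its Primary Index and the index row by its USI value."""
    base[amp_for(employee_no)][employee_no] = (ssn, name)
    usi_subtable[amp_for(ssn)][ssn] = employee_no

def lookup_by_usi(ssn):
    """Step 1: the AMP owning the USI hash. Step 2: the AMP owning the base row."""
    index_amp = amp_for(ssn)
    employee_no = usi_subtable[index_amp].get(ssn)
    if employee_no is None:
        return None, [index_amp]
    base_amp = amp_for(employee_no)
    return base[base_amp][employee_no], [index_amp, base_amp]

insert(1, "111-11-1111", "Ann")
insert(2, "222-22-2222", "Bob")
row, amps_visited = lookup_by_usi("222-22-2222")
print(row, amps_visited)
```

However many AMPs the system has, the lookup makes at most two stops: one to read the index row and one to read the base row it points to.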
![Page 416: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/416.jpg)
![Page 417: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/417.jpg)
Explain - View DDL of a Partitioned Table
“Colleges are places where pebbles are
polished and diamonds are dimmed.”
- Robert G. Ingersoll
The picture on the following page shouldn’t be part of the EXPLAIN chapter because it
is merely showing the table definition, which is often referred to as the Data Definition
Language (DDL). I want you to see that our table called Sales_Table_PPI has been
partitioned. On the following page we will show you a query and its EXPLAIN that
doesn’t do a Full Table Scan, but one that reads only certain partitions.
![Page 418: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/418.jpg)
![Page 419: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/419.jpg)
Explain – Partition Elimination
“To teach is to learn twice.”
- Joseph Joubert
On the following page you can see that we are querying the Sales_Table_PPI table and
the EXPLAIN tells us that we are not doing a Full Table Scan, but are instead only
reading 3 Partitions.
A Full Table Scan uses All-AMPs and it reads All-Rows. A Partitioned query uses
All-AMPs, but they don’t read All-Rows. Each AMP only reads from 3 Partitions in this
example, thus speeding up the query by orders of magnitude!
The entire purpose of a Partitioned Table is to eliminate the Full Table Scan on that table
as often as possible. Each AMP sorts the rows it owns for the table into partitions and
this will often eliminate reading all rows. This is called Partition Elimination because
some partitions won’t have to be read to satisfy the query.
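Partition Elimination can be seen in a toy model where each AMP groups its rows by month. The Python below is only a sketch: the 2-AMP system, the monthly partitions, the `sale_id % NUM_AMPS` placement (a stand-in for hashing), and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict

NUM_AMPS = 2

# amps[a][month] -> list of (sale_id, amount); each AMP groups its own rows
# by partition. Placement by sale_id % NUM_AMPS is a stand-in for hashing.
amps = [defaultdict(list) for _ in range(NUM_AMPS)]

def insert(sale_id, month, amount):
    amps[sale_id % NUM_AMPS][month].append((sale_id, amount))

# 24 made-up sales rows: two per month, each worth 100.
for sale_id in range(24):
    insert(sale_id, month=sale_id % 12 + 1, amount=100)

def total_sales(months):
    """All AMPs take part, but each one reads only the requested partitions."""
    total = rows_read = 0
    for amp in amps:
        for month in months:
            for _sale_id, amount in amp.get(month, []):
                total += amount
                rows_read += 1
    return total, rows_read

print(total_sales([1, 2, 3]))       # (600, 6): 3 partitions, 6 rows read
print(total_sales(range(1, 13)))    # (2400, 24): every partition, every row
```

Both calls involve every AMP. The difference is that the first call skips nine of the twelve partitions on each AMP, so only a quarter of the rows are ever touched.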
![Page 420: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/420.jpg)
![Page 421: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/421.jpg)
Explain – Joins with Duplication on all AMPs
“I cannot teach anybody anything; I can
only make them think.”
- Socrates
Teradata joins up to 128 tables in a single query. That is amazing, but there are two
things that most people don’t know:
1) Teradata joins only two tables at a time
2) The rows being joined must reside on the same AMP
Most of the time the two rows being joined will be on different AMPs and Teradata
must move them to the same AMP for the joining process.
Teradata will do this by either redistributing one or both of the tables or by duplicating
the smaller table on all AMPs.
In the EXPLAIN on the following page the Parsing Engine has decided to duplicate the
smaller table (Order_Table) on all AMPs. Now each matching Customer_Table row will
be able to join directly to the Order_Table.
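Duplication on all AMPs can be sketched with a few made-up rows. The Python below is a toy, not what the AMPs actually execute: the 3-AMP system, the row placement, and the names `spool_on_amp` and `join_on_amp` are assumptions for illustration.

```python
NUM_AMPS = 3

# Made-up rows as they might sit on each AMP before the join.
# Customer_Table rows: (Customer_Number, Customer_Name)
customer_on_amp = [
    [(1, "Ace"), (4, "Dow")],
    [(2, "Bee")],
    [(3, "Cox")],
]
# Order_Table rows: (Order_Number, Customer_Number); the smaller table here.
order_on_amp = [
    [(100, 3)],
    [(101, 1)],
    [(102, 2)],
]

# Duplicate the smaller table: every AMP gets a full copy in spool.
order_copy = [row for amp_rows in order_on_amp for row in amp_rows]
spool_on_amp = [list(order_copy) for _ in range(NUM_AMPS)]

def join_on_amp(amp):
    """Each AMP joins its own customer rows to its local copy of the orders."""
    return [(name, order_no)
            for cust_no, name in customer_on_amp[amp]
            for order_no, order_cust in spool_on_amp[amp]
            if cust_no == order_cust]

result = sorted(pair for amp in range(NUM_AMPS) for pair in join_on_amp(amp))
print(result)  # [('Ace', 101), ('Bee', 102), ('Cox', 100)]
```

None of the matching rows started out on the same AMP, yet no customer row had to move. The cost is that every AMP holds a full copy of the smaller table, which is why this plan suits small tables.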
![Page 422: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/422.jpg)
![Page 423: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/423.jpg)
Explain – Joins with Redistribution
“A new idea is delicate. It can be killed by a
sneer or a yawn; it can be stabbed to death
by a joke or worried to death by a frown on
the right person’s brow.”
- Charles Brower
Teradata joins up to 128 tables in a single query. That is amazing, but there are two
things that most people don’t know:
1) Teradata joins only two tables at a time
2) The rows being joined must reside on the same AMP
Most of the time the two rows being joined will be on different AMPs and Teradata
must move them to the same AMP for the joining process.
Teradata will do this by either redistributing one or both of the tables or by duplicating
the smaller table on all AMPs.
In the EXPLAIN on the following page the Parsing Engine has decided to do a Full Table
Scan on the Order_Table and then Redistribute the Order_Table by the
Customer_Number. This will match up the Customer_Number on each AMP from the
Order_Table with its associated Customer_Number rows of the Customer_Table.
Whenever a join takes place Teradata will need to ensure that matching rows are on the
same AMP if they are to be joined. Watch for the words in your EXPLAIN such as
Duplicated on all AMPs or Redistributed by and you will know data is moving across the
BYNET in order to place the matching rows on the same AMP for the join process.
![Page 424: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/424.jpg)
![Page 425: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/425.jpg)
Explain – Bit Mapping with multiple NUSIs
“There is one thing stronger than all the
armies in the world; and that is an idea
whose time has come.”
- Victor Hugo
One of the most exciting (and rare) things to see in an EXPLAIN is a BMSMS (bit map set
manipulation step) statement.
This happens when the columns in the WHERE clause each have a Non-Unique
Secondary Index (NUSI). When multiple NUSI columns are ANDed together with the
AND operator the Parsing Engine may decide to build a bit-map. This can really speed up
a large query because only the secondary index Subtables are read to build the bit-map,
and then only the qualifying base rows are read. This is fast, fast, and fast!
For a BMSMS to take place the Parsing Engine wants you to Collect Statistics on any
column where there is a Non-Unique Secondary Index (NUSI). This gives the Parsing
Engine confidence to perform the bit-map process.
The bit-map process takes a little longer to set up, but once it is set up it can speed up the
querying enormously.
Notice that we are using the columns Shipdate and Partkey in our query example. Also
notice the AND in between these two columns in the query. Both of these columns have
Non-Unique Secondary Indexes (NUSIs) on them. When multiple NUSIs in a query are
separated by the AND keyword they are considered ANDed together and the bitmapping
process can take place.
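The idea behind ANDing two NUSIs with a bit-map can be shown with Python integers used as bit strings. This is a toy model of the principle only: the four sample rows, the helper names, and the use of one bit per row id are assumptions, not a description of how Teradata lays out its bit-maps.

```python
# Made-up rows: row_id -> (Shipdate, Partkey)
rows = {
    0: ("2010-05-01", 7),
    1: ("2010-05-01", 9),
    2: ("2010-05-02", 7),
    3: ("2010-05-01", 7),
}

def build_nusi(column):
    """Map each value of a column to a bit-map (an int) of the row ids holding it."""
    index = {}
    for row_id, values in rows.items():
        index[values[column]] = index.get(values[column], 0) | (1 << row_id)
    return index

shipdate_nusi = build_nusi(0)
partkey_nusi = build_nusi(1)

def rows_matching(shipdate, partkey):
    """AND the two bit-maps, then keep only the rows whose bits survive."""
    bitmap = shipdate_nusi.get(shipdate, 0) & partkey_nusi.get(partkey, 0)
    return [row_id for row_id in rows if (bitmap >> row_id) & 1]

print(rows_matching("2010-05-01", 7))  # [0, 3]
```

Each index alone qualifies three of the four rows. The AND of the two bit-maps leaves just two, and those are the only base rows that would need to be read.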
![Page 426: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/426.jpg)
![Page 427: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/427.jpg)
![Page 428: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/428.jpg)
![Page 429: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/429.jpg)
Fundamentals of Teradata Joins
“As I would not be a slave, so I would not be
a master. This expresses my idea of
democracy.”
- Abraham Lincoln
For two rows to be joined together they must be physically on the same AMP! Wow!
That is probably a surprise, but Teradata is a Massively Parallel Processing (MPP) system. This
means that each AMP has its own disk, its own memory, and its own processor. So, like
any system, the rows will be moved into the AMP’s memory and joined in memory.
Therefore two rows being joined must be physically together on the same AMP.
The picture on the following page shows you some fundamentals of this concept that are
very important for you to get inside your brain.
First of all rows reside on a particular AMP because of the Primary Index Value. It is the
Primary Index that is hashed, checked with the hash map, and distributed to the proper
AMP. Most of the time rows from two joining tables won’t match up perfectly on the
same AMPs so Teradata will redistribute or duplicate the data to make that happen. Then
the join can take place.
In the beginning this can be a tricky and confusing concept, but we are going to take our
time and get this down to a science.
![Page 430: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/430.jpg)
![Page 431: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/431.jpg)
A Join Example
“It is better to be feared than loved, if you
cannot be both.”
- Niccolo Machiavelli, The Prince
On the following page notice the arrows pointing to the two columns joined together in
the ON Clause. This is the only thing that matters to Teradata when joining tables.
Notice that we are joining ON Customer_Number = Customer_Number. If the
Customer_Number is the Primary Index of both tables then all joining rows will be on the
same AMP.
Think about it. Let’s assume Customer_Number from the Order_Table was the Primary
Index and Customer_Number 1 was originally hashed to AMP 1. If Customer_Number
was also the Primary Index of the Customer_Table then Customer_Number 1 would also
hash to AMP 1. We learned how consistent the hashing process was in the beginning of
this book. As a matter of fact, when these two tables are joined all matching rows would
be on the same AMP because both columns in the ON Clause are the respective Primary
Indexes of their tables. This is Teradata Heaven!
Most of the time this won’t be the case and the Parsing Engine will have to plan for one
of the tables or both of the tables to be either redistributed in spool or duplicated on all
AMPs. I will explain further in upcoming slides.
![Page 432: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/432.jpg)
![Page 433: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/433.jpg)
Joins and the Primary Index
“Bad officials are elected by good citizens
who do not vote.”
- George Jean Nathan
The picture on the following page is designed to reinforce the point that when both
columns in the ON Clause are the Primary Index of their respective tables then no data
needs to move because all joining rows reside on the same AMP.
Notice the On Clause and notice that we are joining on Cust.Customer_Number =
Ord.Customer_Number. Notice that Customer_Number is the Primary Index of both
tables. This ensures that all matching rows are on the same AMP.
If Customer_Number 1 of the Order_Table is hashed and goes to AMP 1 then
Customer_Number 1 of the Customer_Table will also go to AMP 1.
If Customer_Number 99 of the Order_Table is hashed and goes to AMP 4 then
Customer_Number 99 of the Customer_Table will also go to AMP 4.
This consistency makes every matching row go to the same AMP together because the
hashing formula is consistent, the hash map is consistent, and the process is flawless.
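The consistency argument is easy to check in a toy model. In the Python sketch below `zlib.crc32` stands in for Teradata’s hash and the 4-AMP system and customer numbers are made up. The point does not depend on which hash function is used, only on the same function and hash map being used for both tables.

```python
import zlib

NUM_AMPS = 4

def amp_for(pi_value):
    """Stand-in for the hash plus hash map lookup; same input, same AMP."""
    return zlib.crc32(str(pi_value).encode("utf-8")) % NUM_AMPS

customer_numbers = [1, 2, 3, 99]

# Both tables use Customer_Number as their Primary Index.
customer_amp = {c: amp_for(c) for c in customer_numbers}  # Customer_Table rows
order_amp = {c: amp_for(c) for c in customer_numbers}     # Order_Table rows

rows_to_move = [c for c in customer_numbers if customer_amp[c] != order_amp[c]]
print(rows_to_move)  # []: every matching pair already shares an AMP
```

Since no row has to move, the join can start straight away on every AMP, which is what makes this the cheapest kind of join.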
On the upcoming pages we will soon see examples that are not so flawless!
![Page 434: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/434.jpg)
![Page 435: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/435.jpg)
Redistributing Rows in Spool
“The man with the best job in the country is
the Vice President. All he has to do is get up
every morning and say, ‘How’s the
President?’”
- Will Rogers
In the example on the following page, notice the ON Clause in the SQL Join and
notice that Customer_Number is the Primary Index of the Customer_Table, but not
the Primary Index of the Order_Table. Joining rows will not be on the same AMP.
The Parsing Engine will instruct the AMPs to redistribute the Order_Table in spool,
rehashed temporarily by Customer_Number. This is literally like making
Customer_Number the Primary Index of the Order_Table for just this query.
Once the hashing is redone, the rows of both tables being joined will be on the same
AMP. We have tricked the system and now the join can take place.
When you see rows being redistributed in the EXPLAIN plan you will now know that
Teradata is temporarily changing the Primary Index of a table for just one query.
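The redistribution step can be sketched the same way. This is an illustration under stated assumptions, not Teradata's implementation: `amp_for` is a toy CRC32-based hash, `place` is a hypothetical helper, and the rows are made up. The Order_Table rows live on disk by Order_Number and are rehashed into spool by Customer_Number for this one query.

```python
import zlib

NUM_AMPS = 4  # hypothetical system size


def amp_for(value, num_amps=NUM_AMPS):
    # Toy deterministic hash; not Teradata's actual hashing formula.
    return zlib.crc32(str(value).encode("utf-8")) % num_amps


def place(rows, key):
    """Return {amp: [rows]} with each row placed by hashing key(row)."""
    amps = {amp: [] for amp in range(NUM_AMPS)}
    for row in rows:
        amps[amp_for(key(row))].append(row)
    return amps


# Customer_Table: Primary Index is Customer_Number.
customers = [1, 2, 3, 99]
# Order_Table rows as (Order_Number, Customer_Number); Primary Index is Order_Number.
orders = [(1001, 1), (1002, 99), (1003, 1), (1004, 3), (1005, 2)]

base_orders = place(orders, key=lambda r: r[0])   # where the rows live on disk
spool_orders = place(orders, key=lambda r: r[1])  # rehashed copy for this query only
customer_amps = place(customers, key=lambda c: c)

# After redistribution each spooled order shares an AMP with its customer.
colocated = all(
    cust in customer_amps[amp]
    for amp, rows in spool_orders.items()
    for _order_no, cust in rows
)
print(colocated)  # True
```

The base placement is left untouched; only the spool copy is arranged by Customer_Number, which mirrors the "for just this query" wording above.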
![Page 436: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/436.jpg)
![Page 437: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/437.jpg)
Redistributing Rows of Both Tables
“As always, victory finds a hundred fathers
but defeat is an orphan.”
- Count Galeazzo Ciano, The Ciano Diaries
In the example on the following page I want you to notice the Join example and
especially pay attention to the ON Clause. Notice that we are again joining on
Cust.Customer_Number = Ord.Customer_Number. I want you to realize in this example
that Customer_Number is NOT the Primary Index of either table.
So Teradata uses its trick. The Parsing Engine has the AMPs redistribute both tables as if
the Primary Index was Customer_Number. Once both tables are rehashed and
redistributed temporarily in spool to the AMPs the matching rows will be on the same
AMP together. This is expensive in time and resources, but is the only way Teradata can
get the matching rows to the proper AMP simultaneously.
![Page 438: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/438.jpg)
![Page 439: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/439.jpg)
Duplicating the Smaller Table
“Old soldiers never die, they just fade
away.”
- General Douglas MacArthur
In the beginning years of Teradata the Parsing Engine would always redistribute one or
both tables if necessary to bring matching rows together on the same AMP for join
purposes. This included redistributing a large table when joining to a small table.
Sometimes Teradata would redistribute a billion row table just to join it to a table with
four rows. It didn’t make sense.
Teradata changed the Parsing Engine to include a Big Table Small Table join. Instead of
redistributing the big table Teradata found it faster and more cost effective to copy the
smaller table to all AMPs. That also satisfies the requirement to have matching rows on
the same AMP. The only difference is that the smaller table is copied in its entirety to all
AMPs temporarily for just this one query.
![Page 440: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/440.jpg)
![Page 441: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/441.jpg)
Quiz – How Many Rows are in Spool?
“A man’s feet should be planted in his
country, but his eyes should survey the
world.”
- George Santayana
The picture on the following page shows an example of a big table small table join.
Notice that the Customer_Table only has 4 rows. Notice that the Order_Table has 4,000
rows. Now notice that the rows of each table are spread evenly
across all AMPs.
Here is what Teradata is about to do. The Parsing Engine will come up with a plan to
join these two tables by first commanding the AMPs to bring back any Customer_Table
rows. After this process the Parsing Engine is looking at all four rows of the
Customer_Table. The Parsing Engine then copies all 4 rows to every AMP.
The following couple of pages will demonstrate this clearly!
![Page 442: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/442.jpg)
![Page 443: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/443.jpg)
Quiz Answer – How Many Rows in Spool?
“Experience teaches only the teachable.”
- Aldous Huxley
Notice that we have 4 customers in spool on every AMP. I have placed these at the
bottom of the disks in the color yellow. Now the join is ready to take place between the
Order_Table in the color green and the Customer_Table which was duplicated in spool
on every AMP. Because we have four customers duplicated on four AMPs we have 16
rows in spool (4 rows multiplied by 4 AMPs = 16).
The question at the bottom asks, “What if we had 100 AMPs?” If that were the case we
would have 4 rows copied to 100 AMPs thus we would have 400 rows in spool!
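The arithmetic can be confirmed with a few lines of code. In this sketch a dictionary plays the role of each AMP's spool, and the function name and customer labels are made up for illustration.

```python
def rows_in_spool_after_duplication(small_table_rows, num_amps):
    """Copy the whole small table to every AMP and count the spool rows."""
    spool = {amp: list(small_table_rows) for amp in range(num_amps)}
    return sum(len(rows) for rows in spool.values())


customers = ["Cust 1", "Cust 2", "Cust 3", "Cust 4"]
print(rows_in_spool_after_duplication(customers, 4))    # 16
print(rows_in_spool_after_duplication(customers, 100))  # 400
```

Spool usage for a duplicated table grows with the number of AMPs, which is why duplication only pays off when the table being copied is small.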
![Page 444: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/444.jpg)
![Page 445: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/445.jpg)
How Duplication Appears on Every AMP
“Experience is the worst teacher; it gives
the test before presenting the lesson.”
- Vernon Law
As you can see in our example on the following page we didn’t move the Order_Table,
but duplicated the four rows of the Customer_Table. Now you can see how easy the
rows are to join. This shows only one AMP, but you can imagine the same process going
on with all AMPs simultaneously because the four rows from the Customer_Table have
been duplicated on all AMPs.
![Page 446: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/446.jpg)
![Page 447: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/447.jpg)
How Many Rows in Spool with Redistribution?
“Great Spirit, help me never to judge
another until I have walked in his
moccasins.”
- Sioux Indian Prayer
The Parsing Engine will decide whether to duplicate the smaller table or redistribute one
or both of the tables to make the matching rows appear on the same AMP. The Parsing
Engine is a cost-based optimizer and will attempt to do what is easiest, fastest, and moves
the least amount of data.
The question to be answered is, “How many rows will be in spool if the Parsing Engine
decides to redistribute the Order_Table by Customer_Number in spool?”
The answer is on the next couple of pages! Take a guess!
![Page 448: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/448.jpg)
![Page 449: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/449.jpg)
Answer to How Many Rows in Spool
“When I give a lecture, I accept that people
look at their watches, but what I do not
tolerate is when they look at it and raise it to
their ear to find out if it stopped.”
- Marcel Achard
As you can see on the following page there were 4,000 rows in the Order_Table, which we
can refer to as the Base Table. When we redistributed the 4,000 rows by hashing
the Customer_Number there were 4,000 rows in spool. When you redistribute a table the
exact same number of rows is merely rehashed. The only difference is that the rows
may move to a different AMP. The great news is that they are moved to the same AMPs
as their matching counterparts in the Customer_Table.
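A quick count shows why the answer is 4,000. This is a sketch with a toy hash and 4,000 made-up orders shared among 4 customers; none of it is Teradata's real hashing, and only the totals matter.

```python
import zlib
from collections import Counter

NUM_AMPS = 4  # hypothetical system size


def amp_for(value, num_amps=NUM_AMPS):
    # Toy deterministic hash standing in for Teradata's hashing formula.
    return zlib.crc32(str(value).encode("utf-8")) % num_amps


# 4,000 hypothetical orders shared among 4 customers: (Order_Number, Customer_Number).
orders = [(order_no, order_no % 4 + 1) for order_no in range(1, 4001)]

base_rows_per_amp = Counter(amp_for(order_no) for order_no, _ in orders)
spool_rows_per_amp = Counter(amp_for(cust) for _, cust in orders)

print(sum(base_rows_per_amp.values()))   # 4000 rows in the base table
print(sum(spool_rows_per_amp.values()))  # 4000 rows in spool
```

Redistribution changes which AMP holds a row, never how many rows there are, so the per-AMP counts may differ between the two placements while the totals match.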
![Page 450: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/450.jpg)
![Page 451: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/451.jpg)
An Example of an AMP with Redistribution
“It is important that students bring a certain
ragamuffin, barefoot, irreverence to their
studies; they are not here to worship what is
known, but to question it.”
- J. Bronowski, The Ascent of Man
The following page is an excellent example of a join and the importance of matching
rows being on the same AMP. Notice first the Customer_Table in blue. Notice that
customers 1-8 landed on this AMP. Also realize that when we rehashed
the Order_Table (in yellow) by Customer_Number, the orders for customers 1-8 landed
on this AMP also. Hashing brilliantly places like values together on the same AMP and
this allows Teradata to fly through joins.
![Page 452: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/452.jpg)
![Page 453: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/453.jpg)
![Page 454: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/454.jpg)
![Page 455: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/455.jpg)
The System Calendar
“Don’t count the days, make the days
count.”
- Muhammad Ali
Teradata has a built-in Calendar called the System Calendar. The System Calendar is a
table which has one row for each day starting from the dates of January 1, 1900 to
December 31, 2100. I guess Teradata is Y2K compliant!
The System Calendar is a fantastic tool, especially when your boss says something like,
“I want to know all orders that happened on a Friday in the first full week of the month
during the fourth quarter.” This is usually when you update your resume rather than risk
brain overload. The great news is that this is exactly the type of stuff the System
Calendar was designed to handle for you.
Notice on the following page that I have written SQL that will show you the System
Calendar for the date of June 15, 2010.
Skip a couple of pages and you will see the results of this query and much better
understand the System Calendar.
![Page 456: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/456.jpg)
![Page 457: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/457.jpg)
Columns in the System Calendar Views
“Those who dance are considered insane by
those who cannot hear the music”
- George Carlin
The following slide shows the type of information you can obtain by using the System
Calendar. Notice some of the key entries:
Day_of_Week always goes from 1-7 with 1 being a Sunday.
Day_of_Calendar is the running count of days since January 1, 1900.
Week_of_Month, Week_of_Year, and Week_of_Calendar will have a zero in them for
the first partial week. The first full week will have a 1 and so on.
Month_of_Calendar and Quarter_of_Calendar also count from the January 1, 1900 date.
The rest are fairly self-explanatory.
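To make these column definitions concrete, here is a small sketch that derives three of them for June 15, 2010. It follows the descriptions above rather than Teradata's own view definition, so two assumptions are worth flagging: January 1, 1900 is counted as day 1, and weeks are taken to start on Sunday. The function names are hypothetical, and Teradata may treat edge cases (such as a month that begins on a Sunday) differently.

```python
from datetime import date

CALENDAR_START = date(1900, 1, 1)


def day_of_week(d):
    """1 = Sunday ... 7 = Saturday, as described in the text."""
    return d.isoweekday() % 7 + 1


def day_of_calendar(d):
    """Running day count, assuming January 1, 1900 is day 1."""
    return (d - CALENDAR_START).days + 1


def week_of_month(d):
    """0 for the first partial week, 1 for the first full (Sunday-start) week."""
    first = d.replace(day=1)
    # Day of the month on which the first Sunday falls (1..7).
    first_sunday = 1 + (8 - day_of_week(first)) % 7
    if d.day < first_sunday:
        return 0
    return (d.day - first_sunday) // 7 + 1


d = date(2010, 6, 15)
print(day_of_week(d), day_of_calendar(d), week_of_month(d))  # 3 40343 2
```

June 15, 2010 was a Tuesday, hence the 3, and it fell in the second full week of June, hence the 2.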
![Page 458: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/458.jpg)
![Page 459: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/459.jpg)
How to use the System Calendar with Tables
“I cannot imagine any condition which
would cause this ship to founder. Modern
shipbuilding has gone beyond that.”
- E. J. Smith, Captain of the Titanic
The System Calendar is great for simple things like finding out whether you were born on
a Saturday night or on a Monday, but the real gold is found when you join the System
Calendar to another table in a query.
On the following page we have joined our Order_Table with the System Calendar where
the Order_Date from the Order_Table is equal to the Calendar_Date from the System
Calendar. Once the Join takes place we can use the WHERE Clause to pinpoint the exact
calendar information we are looking to query.
In our example we wanted all orders placed in January during the first full week of the
month that happened on a Friday. How about that for fancy SQL writing?
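The shape of that join can be imitated in plain Python. The sketch below is an analogy, not the SQL from the slide: a dictionary of made-up calendar rows for January 2010 stands in for the System Calendar, the orders are hypothetical, weeks are assumed to start on Sunday, and the `month_of_year` key is an assumption about the column name.

```python
from datetime import date, timedelta


def calendar_row(d):
    """Build one hypothetical calendar row for date d."""
    day_of_week = d.isoweekday() % 7 + 1               # 1 = Sunday ... 7 = Saturday
    first_dow = d.replace(day=1).isoweekday() % 7 + 1
    first_sunday = 1 + (8 - first_dow) % 7              # day of month of first Sunday
    week_of_month = 0 if d.day < first_sunday else (d.day - first_sunday) // 7 + 1
    return {"calendar_date": d, "day_of_week": day_of_week,
            "week_of_month": week_of_month, "month_of_year": d.month}


# A tiny stand-in for the System Calendar: one row per day of January 2010.
calendar = {d: calendar_row(d)
            for d in (date(2010, 1, 1) + timedelta(days=i) for i in range(31))}

# Hypothetical Order_Table rows: (Order_Number, Order_Date).
orders = [(1, date(2010, 1, 1)), (2, date(2010, 1, 8)),
          (3, date(2010, 1, 9)), (4, date(2010, 1, 15))]

FRIDAY = 6
matches = [
    order_no
    for order_no, order_date in orders
    for cal in [calendar[order_date]]   # the join: Order_Date = Calendar_Date
    if cal["month_of_year"] == 1
    and cal["week_of_month"] == 1
    and cal["day_of_week"] == FRIDAY
]
print(matches)  # [2]
```

Order 1 falls on a Friday too, but January 1, 2010 sits in the partial week 0, so only the order from January 8 survives the filter.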
![Page 460: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/460.jpg)
![Page 461: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/461.jpg)
![Page 462: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/462.jpg)
![Page 463: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/463.jpg)
Teradata Temporary Tables
“Sure I’m helping the elderly. I’m going to
be old myself some day.”
- Lillian Carter, in her 80s
Teradata has three basic types of temporary tables. They are the Derived Table, Volatile
Table, and Global Temporary Table. Each functions a little differently.
The most popular is the Derived Table. This is a temporary table that is defined
inside a user’s SQL and only exists for the life of the query. It is materialized in the user’s
spool and automatically deleted at query end.
Volatile tables can be created by any user and are materialized with an INSERT/SELECT
statement. The table is only available to the user who created the volatile table and is
available to that user until the user logs off their session. This also takes up the user’s
spool.
Global Temporary Tables are CREATED by a user and they are materialized in Temp
Space. Any user with Temp Space can create a Global Temporary Table. The table
structure will exist until the user DROPS the table. What is interesting about the Global
Temporary Table is that after it is populated the data is available until the user logs off
the session, but the table definition stays. Many users can use the table definition if they
have Temp Space and each user who performs an Insert/Select on the table will in a sense
have their own version of the table that only they can access.
![Page 464: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/464.jpg)
![Page 465: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/465.jpg)
Derived Tables
“My best friend is the one who brings out
the best in me.”
- Henry Ford
On the following page you can see an example of a Derived Table. Derived Tables are
created inside a user’s query for only the life of that query. Notice that after the FROM
Clause we have placed parentheses (colored in yellow) around the derived query. We have
also named the Derived Table TeraTom and then placed a name for the single column we
have created (called AVGSAL) inside the Derived Table TeraTom.
We can use the column AVGSAL in the SELECT list or later on in the WHERE Clause.
Derived Tables are often used with aggregates and serve as a temporary space to query
and hold data for the life of a query. In our example on the following page we were able
to compare the Average Salary of all employees to see who was making more than the
average salary.
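The slide's SQL is not reproduced in this transcript, so the following is a reconstruction of the idea rather than the original statement. It is run through Python's built-in `sqlite3` module as a stand-in for Teradata, which means the dialect is SQLite's: the column is named with `AS AVGSAL` inside the subquery. The `Last_Name` column and the three rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for Teradata here
conn.execute("CREATE TABLE Employee_Table (Last_Name TEXT, Salary REAL)")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [("Adams", 40000.0), ("Baker", 60000.0), ("Clark", 80000.0)],  # made-up rows
)

# The parenthesized SELECT after FROM is the derived table. It is named
# TeraTom and exposes one column, AVGSAL, that exists only for this query.
query = """
SELECT E.Last_Name, E.Salary, TeraTom.AVGSAL
FROM Employee_Table AS E,
     (SELECT AVG(Salary) AS AVGSAL FROM Employee_Table) AS TeraTom
WHERE E.Salary > TeraTom.AVGSAL
ORDER BY E.Last_Name
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Clark', 80000.0, 60000.0)]
conn.close()
```

AVGSAL is usable both in the SELECT list and in the WHERE Clause, and nothing named TeraTom remains once the query finishes.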
![Page 466: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/466.jpg)
![Page 467: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/467.jpg)
A Query Pictorial Example with a Derived Table
“Never go to a doctor whose office plants
have died.”
- Erma Bombeck
The example on the next page shows a pictorial of the AMPs and their disks utilizing a
Derived Table. The derived table was created and holds the Average Salary for all
employees. As you can see on the picture the Derived Table is called TeraTom and it
holds a single column called AVGSAL. The Average Salary happened to be $59,583.00.
Now I want you to notice Spool 1. This is the answer set from each AMP. This comes
from comparing the Employee_Table rows (seen at the top of each disk) and each
Employee’s Salary with the Average Salary stored in the Derived Table. Each employee
who is making more than the average salary will be placed inside Spool 1 so it can be
returned to the user.
![Page 468: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/468.jpg)
![Page 469: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/469.jpg)
Volatile Tables
“If all my possessions were taken from me,
with the exception of one, I would choose to
keep the power of Speech. For with it, I
would soon regain all the rest.”
- Daniel Webster
Notice the picture on the following page and the 3 steps in using a Volatile Table. The
first step is to CREATE the Volatile Table. This table will now be available until the end
of the session or until the user decides to DROP the Volatile Table.
The second step is to populate the Volatile Table with an INSERT/SELECT statement.
Now the fun actually starts with the third step because this table is now available for the
user to query. Only the user who created the Volatile Table has access to it. That user
can run any number of queries and joins against their Volatile Table until session
end. All Volatile Tables use the user’s spool space to populate the table.
![Page 470: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/470.jpg)
![Page 471: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/471.jpg)
How to Populate a Volatile Table
“The brain is a wonderful organ. It starts
working the moment you get up in the
morning, and does not stop until you get into
the office.”
- Robert Frost
The following page shows an excellent example of the first two steps and they are in the
CREATE statement of the Volatile Table and in the materialization of the Volatile table
with an INSERT/SELECT.
Now I want you to notice that our Volatile Table named Aggy is materialized in Spool
Space, but appears much like a real table. The rows are spread evenly across all the
AMPs and this table is ready for action. It can be queried or joined to other tables.
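The three steps can be walked through with an analogy. This sketch uses a SQLite TEMP table through Python's `sqlite3` module, because a TEMP table is also private to one connection and vanishes when that connection closes. It is not Teradata syntax, and the `Dept_No` and `Sum_Sal` columns and the sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # one connection plays the role of one session
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [(100, 40000.0), (100, 60000.0), (200, 50000.0)],  # made-up rows
)

# Step 1: create the session-scoped table (SQLite's TEMP table is the analogue).
conn.execute("CREATE TEMP TABLE Aggy (Dept_No INTEGER, Sum_Sal REAL)")

# Step 2: materialize it with an INSERT/SELECT.
conn.execute(
    "INSERT INTO Aggy SELECT Dept_No, SUM(Salary) FROM Employee_Table GROUP BY Dept_No"
)

# Step 3: query it as often as needed until the session (connection) ends.
result = conn.execute("SELECT Dept_No, Sum_Sal FROM Aggy ORDER BY Dept_No").fetchall()
print(result)  # [(100, 100000.0), (200, 50000.0)]
conn.close()   # the TEMP table and its rows disappear with the connection
```

The analogy covers the lifetime and privacy of the table only; it says nothing about spool space or how rows are spread across AMPs.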
![Page 472: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/472.jpg)
![Page 473: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/473.jpg)
Global Temporary Tables
“In case you’re worried about what’s going
to become of the younger generation, it’s
going to grow up and start worrying about
the younger generation.”
- Roger Allen
The example on the following page shows all three steps for using a Global Temporary
Table, but the actual CREATE (step 1) only needs to be performed once by the original
CREATING User. After that CREATE statement the table structure will remain in
Teradata until the user who created it actually DROPS the table.
Global Temporary Tables can be used by any user who has Temp Space. Many users can
perform step 2 simultaneously and each will have their own copy of the Global
Temporary Table and after their session has ended the data will automatically be deleted,
but the table structure will remain.
![Page 474: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/474.jpg)
![Page 475: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/475.jpg)
A Pictorial of a Global Temporary Table
“There are three great friends: an old wife,
an old dog, and ready money.”
- Benjamin Franklin
In the picture on the following page you can see we have done an INSERT/SELECT into
our Globaggy table. This table is a Global Temporary Table that was created previously.
The table is now materialized with data inside Temp space. The rows of the table are
spread evenly across the AMPs and this table can be queried or joined with other tables
until session end.
If 1,000 users did an INSERT/SELECT on Globaggy then all 1,000 users would have
their own version of this table and nobody could access another user’s copy.
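The "one definition, many private copies" behavior can be modeled in a few lines. The class below is a toy model, not how Teradata implements Temp Space. The class name, its methods, the session identifiers, and the column names are all hypothetical.

```python
class GlobalTempTable:
    """Toy model: one shared definition, private rows per session."""

    def __init__(self, name, columns):
        self.name = name
        self.columns = tuple(columns)   # the definition, kept until the table is dropped
        self._rows_by_session = {}      # session id -> that session's private rows

    def insert_select(self, session_id, rows):
        """Materialize (or extend) the calling session's own copy."""
        self._rows_by_session.setdefault(session_id, []).extend(rows)

    def select(self, session_id):
        """A session sees only the rows it inserted itself."""
        return list(self._rows_by_session.get(session_id, []))

    def logoff(self, session_id):
        """Session end deletes that session's rows; the definition stays."""
        self._rows_by_session.pop(session_id, None)


globaggy = GlobalTempTable("Globaggy", ["Dept_No", "Sum_Sal"])
globaggy.insert_select("session_a", [(100, 100000.0)])
globaggy.insert_select("session_b", [(200, 50000.0)])

print(globaggy.select("session_a"))  # [(100, 100000.0)]
globaggy.logoff("session_a")
print(globaggy.select("session_a"))  # []
print(globaggy.columns)              # ('Dept_No', 'Sum_Sal')
```

Session B's rows are untouched when session A logs off, and the column definition outlives both sessions.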
![Page 476: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/476.jpg)
![Page 477: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/477.jpg)
What Happens to Global Tables after the Session
“Forgiveness does not change the past, but
it does enlarge the future.”
- Paul Boese
The following page shows that once the user has logged off their session, the data that
used to fill the Globaggy table is automatically deleted, but the table structure
stays available on Teradata.
The data is gone, but the table structure stays. Why? Check out the next couple of pages
for the answer.
![Page 478: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/478.jpg)
![Page 479: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/479.jpg)
Global Temporary Tables and Temp Space
“You’re alive. Do something. The directive
in life, the moral imperative was so
uncomplicated. It could be expressed in
single words, not complete sentences. It
sounded like this: Look. Listen. Choose.
Act.”
- Barbara Hall, A summons to New Orleans, 2000
The Global Temporary Table structure is not deleted at session end. Only the data inside
the table is deleted at session end. Why? So that many users can materialize their own
version of the Global Table. This helps in multiple ways:
1) Users populate the table using their Temp Space and then have more Spool Space
to actually query the table.
2) Most users won’t have PERM Space so they can’t create tables or may not know
the syntax to create a table. The table structures have been created for them and
they are now ready to merely perform an INSERT/SELECT to populate their own
version of the table.
![Page 480: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/480.jpg)
![Page 481: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/481.jpg)
![Page 482: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/482.jpg)
![Page 483: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/483.jpg)
V13 – No Primary Index Tables
“No one is so generous as he who has
nothing to give.”
– French Proverb
New in Teradata V13 the DBA has the ability to CREATE tables without a Primary
Index! These tables are designed to merely spread the rows randomly and evenly. They
are called NoPI tables, which stands for No Primary Index tables. A NoPI table is
designed for ETL staging tables so data can be quickly transferred from flat files taken
from operational systems such as Oracle or DB2. This might be data that needs to be
massaged or transformed. Then once the transformation has been completed the DBA
can write an INSERT/SELECT command and quickly load the data inside the staging
table into a Teradata table that has a Primary Index.
Although you can query or JOIN a NoPI table with a traditional table containing a
Primary Index, NoPI tables are really meant to quickly import data inside Teradata
temporarily so it can be transformed inside Teradata and then loaded inside the data
warehouse tables.
![Page 484: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/484.jpg)
![Page 485: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/485.jpg)
NoPI CREATE Statement
“The Constitution only gives people the
right to pursue happiness. You have to
catch it yourself.”
– Ben Franklin
On the following page you can see the NoPI CREATE statement. The choice is made when
you create the table. This can be done with normal SQL as seen on the following page or it
can be done with the FastLoad or TPump load utility. The key words to focus on, on the
following page, are NO PRIMARY INDEX, highlighted for your convenience.
![Page 486: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/486.jpg)
![Page 487: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/487.jpg)
NoPI Row-ID Increments the Uniqueness Value
“It’s not the size of the dog in the fight, but
the size of the fight in the dog.”
– Archie Griffin
Each AMP will receive an equal number of rows in an attempt by the Parsing Engine to
spread the data evenly. Notice the picture on the following page. The Row Hash for
every row in the NoPI table is the same. Only the Uniqueness Value is incremented.
![Page 488: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/488.jpg)
![Page 489: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/489.jpg)
NoPI Row-Hash Different on each AMP
“When all you have is a hammer, you tend
to see every problem as a nail.”
- Abraham Maslow
The example on the next page allows you to realize that the Row Hash on each AMP is
different, but once the Row Hash is established on each AMP, all rows contain that exact
same Row Hash and each AMP only increments the Uniqueness Value. NoPI tables
don’t need to be sorted and that is another main advantage if you desire to CREATE a
staging table.
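The Row-ID behavior described here can be modeled as follows. This is a toy sketch: the `NoPIAmp` class, the made-up Row Hash values (1000 through 1003), and the round-robin dealing of rows are all assumptions chosen for illustration, not Teradata's actual distribution mechanism.

```python
from itertools import count

NUM_AMPS = 4  # hypothetical system size


class NoPIAmp:
    """Toy AMP holding NoPI rows: one fixed Row Hash, increasing Uniqueness Value."""

    def __init__(self, row_hash):
        self.row_hash = row_hash
        self._uniqueness = count(1)
        self.rows = []   # (row_hash, uniqueness_value, data), kept in arrival order

    def append(self, data):
        # New rows are simply appended; no sorting by Row Hash is needed.
        self.rows.append((self.row_hash, next(self._uniqueness), data))


# Each AMP gets its own made-up Row Hash value.
amps = [NoPIAmp(row_hash=1000 + i) for i in range(NUM_AMPS)]

# Deal incoming rows out evenly; round-robin is an assumption of this sketch.
for i, record in enumerate(f"row-{n}" for n in range(8)):
    amps[i % NUM_AMPS].append(record)

print(amps[0].rows)  # [(1000, 1, 'row-0'), (1000, 2, 'row-4')]
print(amps[1].rows)  # [(1001, 1, 'row-1'), (1001, 2, 'row-5')]
```

Within one AMP the Row Hash never changes and only the Uniqueness Value grows, so rows stay in arrival order without any sort.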
![Page 490: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/490.jpg)
![Page 491: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/491.jpg)
NoPI Options and Facts
“Failure accepts no alibis.
Success requires no explanation.”
– Robert Rose
The example on the next page describes the options and facts about NoPI Tables.
![Page 492: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/492.jpg)
![Page 493: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/493.jpg)
NoPI Restrictions
“He who asks a question may be a fool for
five minutes, but he who never asks a
question remains a fool forever.”
– Tom Connelly
The example on the next page shows the restrictions of NoPI Tables.
![Page 494: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/494.jpg)
![Page 495: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/495.jpg)
![Page 496: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/496.jpg)
![Page 497: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/497.jpg)
Write Ahead Logging (WAL)
“The reputation of a thousand years may be
determined by the conduct of one hour.”
– Japanese Proverb
Teradata has traditionally taken a Before Picture of any row being UPDATED or
DELETED. This was always called the Transient Journal and Data Integrity was its main
purpose. If a transaction was going to UPDATE or DELETE a row the BEFORE
PICTURE would be taken of the row and stored in a journal in case a ROLLBACK was
done or in case there was a glitch in the system.
Now this function is done by the Write Ahead Log or WAL. There are two main pieces
to WAL: the WAL Log and the WAL Depot.
The WAL Log takes a BEFORE and AFTER Picture of a row being UPDATED. Each
AMP has its own WAL Log to make sure that Teradata can Rollback a transaction or
Commit the transaction.
The WAL Depot stores blocks of UPDATED row(s) it receives from FSG Cache to
provide a backup copy of the changes, at which point the COMMIT is considered done.
Teradata can then WRITE the block of data to the actual table on disk when it deems
it a good
time to do so.
Teradata uses the WAL Log and WAL Depot for transaction integrity in order to Commit
or Rollback the data.
![Page 498: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/498.jpg)
![Page 499: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/499.jpg)
AMPs have FSG Cache for the Memories
“Never insult an alligator until after you
have crossed the river.”
– African Proverb
Memory is roughly a thousand times faster than disk, so AMPs attempt to keep hot tables
in memory for fast retrieval. The memory dedicated to each AMP is called File Segment
Cache (FSG Cache). This pool of memory is like each AMP having its own swimming pool
that only it can use. Let's imagine that 100 users have been querying the Order_Table.
Each AMP will be reading the Order_Table hundreds of times in order to provide answer
sets back to the users. All AMPs will attempt to keep their Order_Table rows inside the
FSG Cache in order to make reads and writes dramatically faster.
Remember that each AMP has its own FSG Cache memory, its own WAL Log, and its own
WAL Depot.
![Page 500: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/500.jpg)
![Page 501: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/501.jpg)
An Example of an UPDATE Statement
“Let every nation know, whether it wishes us
well or ill, that we shall pay any price, bear
any burden, meet any hardship, support any
friend, oppose any foe, in order to assure the
survival and the success of liberty”
-John F. Kennedy (Inaugural Address 1961)
I want to run you through the process of UPDATING a row. The picture on the
following page shows the UPDATE statement at the top. Notice the Department_Table
inside the AMP. We are going to UPDATE this row and change the Dept_Name from
'Human' to 'HR'.
Turn the page and see what happens next!
![Page 502: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/502.jpg)
![Page 503: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/503.jpg)
AMP Local WALs
“The believer is happy, the doubter wise”
– Greek Proverb
The example on the next page is designed to introduce the AMP's WAL Log. This WAL
Log stores a BEFORE image of the entire row being updated in order to back up the row
in case something goes wrong. With a backup copy of the row, Teradata is confident it
can Rollback the UPDATE and put things back exactly the way they were BEFORE the
transaction. The BEFORE picture inside the WAL Log is called the UNDO record because
it is designed to UNDO an attempted change to the row. This is the "Whoops, I made a
mistake" button. Once the transaction has been completed, the row in the WAL Log can
be erased.
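The UNDO record can be pictured with a small Python toy model. This is only a sketch of the idea, not how Teradata implements it: the `Amp` class, its `wal_log` list, and the method names are all hypothetical, and the row holds just the Dept_Name column from the example.

```python
# Toy model of UNDO records in an AMP's WAL Log (all names are hypothetical).
import copy


class Amp:
    def __init__(self, rows):
        self.rows = rows       # row_id -> row (a dict of column values)
        self.wal_log = []      # list of (row_id, BEFORE image of the whole row)

    def update(self, row_id, **changes):
        # Save the BEFORE image of the entire row first, then change the row.
        self.wal_log.append((row_id, copy.deepcopy(self.rows[row_id])))
        self.rows[row_id].update(changes)

    def rollback(self):
        # Apply UNDO records newest-first to put every row back as it was.
        while self.wal_log:
            row_id, before = self.wal_log.pop()
            self.rows[row_id] = before

    def commit(self):
        # Transaction complete: the UNDO records are no longer needed.
        self.wal_log.clear()


amp = Amp({1: {"Dept_Name": "Human"}})
amp.update(1, Dept_Name="HR")
amp.rollback()
print(amp.rows[1]["Dept_Name"])   # prints Human
```

Because the BEFORE image is saved before the row is touched, a rollback never has to guess what the row used to look like.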
![Page 504: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/504.jpg)
![Page 505: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/505.jpg)
AMPs UPDATE Rows in FSG Cache
“I have found the best way to give advice to
your children is to find out what they want
and then advise them to do it.”
– Harry S. Truman (1884 - 1972)
When an AMP is commanded to UPDATE a row, that AMP finds the row inside a data
block on its virtual disk and transfers the block into FSG Cache inside the node. Now
the AMP can process the rows as fast as lightning. Moving data into FSG Cache is how
an AMP can READ, UPDATE, or DELETE a row.
![Page 506: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/506.jpg)
![Page 507: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/507.jpg)
Write to WAL then Write Back to Disk
“You are educated when you have the
ability to listen to almost anything without
losing your temper or self-confidence.”
– Robert Frost
The WAL Log takes a BEFORE picture so a ROLLBACK can be performed, and it takes
an AFTER picture as a temporary backup to ensure integrity until the AMP physically
writes the data back to its virtual disk. The AMP will UPDATE the row inside its FSG
Cache, then write the AFTER image of the row to the WAL Log. Now the WAL Log
contains a BEFORE and AFTER picture of the row. The AMP can now send a message to
the PE that the row has been updated. The row really hasn't been completely updated
because it hasn't physically been written back to the AMP's disk. Only the WAL Log
rows were written to the AMP's disk. The AMP is confident that it can write the update
back to its physical disk for permanent storage when the AMP deems it most efficient to
do so. Plus, the AMP knows it has added insurance because the WAL Log has both the
BEFORE and AFTER image. Even in a disaster where the Teradata system goes down, the
AMP knows that when Teradata is rebooted it can use the WAL Log to catch up and
complete the COMMIT or ROLLBACK.
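The sequence described here (change the row in cache, log the BEFORE and AFTER images, acknowledge, write the table later) can be sketched as a Python toy model. Everything in it is an assumption made for illustration: the `ToyAmp` class, the `committed` flag on each log record, and the `crash` and `recover` methods are hypothetical and do not reflect Teradata's actual record formats or recovery code.

```python
# Toy write-ahead sketch: log records reach disk before the table block does.
class ToyAmp:
    def __init__(self, table_rows):
        self.disk = dict(table_rows)    # the table as stored on the AMP's disk
        self.cache = dict(table_rows)   # FSG Cache copy (lost if the system goes down)
        self.wal_log = []               # WAL Log records (on disk, survive a crash)

    def update(self, row_id, new_value):
        before = self.cache[row_id]
        self.cache[row_id] = new_value  # the row changes in memory only
        self.wal_log.append({"row": row_id, "before": before,
                             "after": new_value, "committed": False})

    def commit(self):
        for record in self.wal_log:
            record["committed"] = True  # the AMP can now report success to the PE

    def crash(self):
        self.cache = {}                 # memory is wiped; disk and WAL Log remain

    def recover(self):
        # Redo committed changes in order, then undo uncommitted ones newest-first.
        for record in self.wal_log:
            if record["committed"]:
                self.disk[record["row"]] = record["after"]
        for record in reversed(self.wal_log):
            if not record["committed"]:
                self.disk[record["row"]] = record["before"]
        self.wal_log.clear()
        self.cache = dict(self.disk)


amp = ToyAmp({1: "Human"})
amp.update(1, "HR")
amp.commit()
amp.crash()       # the updated block never reached the table on disk
amp.recover()
print(amp.disk[1])   # prints HR
```

If the crash had happened before `commit()`, the same `recover()` call would have restored 'Human' from the BEFORE image instead.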
![Page 508: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/508.jpg)
![Page 509: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/509.jpg)
The WAL Depot
“Even if I knew that tomorrow the world
would go to pieces, I would still plant my
apple tree.”
– Dr. Martin Luther King, Jr.
The AMP could have many updates to rows inside a block of data. Before the AMP
writes the block back to its physical disk it writes the entire block to the WAL Depot.
Now it has a backup of the entire block of data in case something goes wrong. If the
AMP writes the data from FSG Cache back to its physical disk successfully the WAL
Depot backup copy can be erased. The WAL Depot only serves as a backup copy in case
a problem should occur.
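As a rough sketch of the Depot's role as a block-level backup, the Python below copies a block to a depot before writing it to the table and erases the copy only after the write succeeds. The function names, the `fail` switch that simulates a partial write, and the dictionaries standing in for disk and depot are all hypothetical.

```python
# Toy model of the WAL Depot as a block-level backup (names are hypothetical).
def write_block(block_id, block, disk, depot, fail=False):
    depot[block_id] = list(block)     # 1. back up the entire block first
    if fail:
        disk[block_id] = block[:1]    # simulate a partial write to the table
        return False
    disk[block_id] = list(block)      # 2. write the block to the table on disk
    del depot[block_id]               # 3. success: erase the backup copy
    return True


def repair(disk, depot):
    # After a failure, any block still in the depot is rewritten from its backup.
    for block_id in list(depot):
        disk[block_id] = depot.pop(block_id)


disk, depot = {}, {}
write_block(7, ["row a", "row b"], disk, depot, fail=True)
repair(disk, depot)
print(disk[7])    # prints ['row a', 'row b']
```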
![Page 510: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/510.jpg)
![Page 511: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/511.jpg)
Clearing out the WAL Depot and the WAL Log
“You don't drown by falling into the water;
you drown by staying in the water.”
-Edwin Louis Cole
The example on the next page illustrates the erasing of the WAL Log and WAL
Depot. The changes have been made and there is no more reason to keep a backup of the
rows and the blocks that were changed. These have been written back to the AMP's
physical disk successfully. Think of the WAL Log and WAL Depot as wearing a seat belt
when you are driving in a car. You take off your seat belt when you get home and leave
the car, don't you?
![Page 512: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/512.jpg)
![Page 513: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/513.jpg)
![Page 514: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/514.jpg)
![Page 515: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/515.jpg)
V13 – Teradata Virtual Storage (TVS)
“To succeed... you need to find something to
hold on to, something to motivate you,
something to inspire you.”
-Tony Dorsett
Teradata Virtual Storage or TVS for short is one of the most exciting improvements
Teradata has made. TVS changes the way AMPs access their disks. TVS manages the
disks for each AMP. This will be explained throughout the chapter, but the following
page shows some of the topics and advantages that TVS brings to Teradata.
![Page 516: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/516.jpg)
![Page 517: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/517.jpg)
AMPs in the 1980’s
“Alone we can do so little; together we can do so much.”
– Helen Keller
In the 1980s and early 1990s an AMP connected to its physical disks with JBOD
technology. JBOD stands for Just a Bunch of Disks. It meant that four disks were
available for the AMP to store its data rows. The disks did not provide any
protection features such as RAID, so any single disk failure could be a disaster. Back in
the early days the disks weren't exactly reliable either, so Teradata tables were
almost always FALLBACK protected. FALLBACK means that an AMP stores a backup
or FALLBACK copy of its rows on other AMPs within its cluster.
![Page 518: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/518.jpg)
![Page 519: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/519.jpg)
AMPs in the 1990’s
Looking to the stars always makes me
dream, as simply as I dream over the black
dots representing towns and villages on a
map. Why, I ask myself, shouldn't the
shining dots of the sky be as accessible as
the black dots on the map of France?
-Vincent Van Gogh
In the 1990s an AMP still had one Virtual Disk and four Physical Disks, but the disks
were mirrored. The AMP would store data on one disk and then mirror that disk in case
of a failure. As you can see on the following page, each AMP had two disks for data and
two mirrored disks. FALLBACK wasn't a necessity anymore because of the disk
protections.
![Page 520: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/520.jpg)
![Page 521: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/521.jpg)
Data Blocks and Cylinders make up a Disk
“I never lost a game; time just ran out on
me.”
– Michael Jordan
Each AMP stores its data blocks inside cylinders on the disk. Each disk is made up of
thousands of cylinders, and data blocks are stored inside the cylinders.
![Page 522: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/522.jpg)
![Page 523: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/523.jpg)
Cylinders are dedicated to Perm, Spool, etc.
“One's dignity may be assaulted,
vandalized, and cruelly mocked, but it
cannot be taken away unless it is
surrendered.”
- Michael J. Fox
Different cylinders store different types of data. For example, some cylinders will be
used to hold Permanent Data, while entirely separate cylinders will be used for Spool
files. The following page shows the types of data stored in cylinders. Cylinders are
never shared: you can't use a single cylinder to store permanent data and spool data
simultaneously. Once the first row of a table is written to a cylinder as PERM
Space, that cylinder can't also be used to store Spool Files.
Cylinders are dedicated to PERM, SPOOL, TEMP, Permanent Journals, and the WAL
Logs.
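The no-sharing rule can be expressed as a tiny Python sketch. The `Cylinder` class and the type labels are hypothetical stand-ins chosen for illustration; the point is only that the first write fixes the cylinder's type and later writes of another type are refused.

```python
# Toy allocator: a cylinder is dedicated to the first data type written to it.
CYLINDER_TYPES = {"PERM", "SPOOL", "TEMP", "PERM_JOURNAL", "WAL"}


class Cylinder:
    def __init__(self):
        self.kind = None     # unassigned until the first write
        self.blocks = []

    def write(self, kind, block):
        if kind not in CYLINDER_TYPES:
            raise ValueError(f"unknown cylinder type: {kind}")
        if self.kind is None:
            self.kind = kind                 # first write dedicates the cylinder
        elif self.kind != kind:
            raise ValueError(f"cylinder is dedicated to {self.kind}, not {kind}")
        self.blocks.append(block)


cyl = Cylinder()
cyl.write("PERM", "first row of a table")
try:
    cyl.write("SPOOL", "spool file block")
except ValueError as err:
    print(err)    # prints: cylinder is dedicated to PERM, not SPOOL
```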
![Page 524: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/524.jpg)
![Page 525: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/525.jpg)
Outside Disk Tracks are much Faster
“Make sure you have finished speaking
before your audience has finished
listening.”
- Dorothy Sarnoff
The example on the next page shows cylinders that sit on top of a disk platter. Notice the
outside of the disk and see how many more cylinders there are versus the inside track of
cylinders. This makes the outside track faster because with one revolution of the
spinning of the disk the system can read so many more cylinders on the outside track.
I want you to merely realize that the outer track reads and writes are considered faster and
the inner tracks, which hold fewer cylinders, are considered the slower tracks.
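The arithmetic behind this is simple: the platter spins at one speed, so whatever passes under the head in one revolution sets the transfer rate. The short Python sketch below works it through with made-up numbers. The track sizes and the 15,000 RPM figure are illustrative assumptions only, not measurements of any Teradata disk.

```python
def track_throughput_mb_s(mb_per_track, rpm):
    # One revolution reads one full track, so throughput is
    # track size multiplied by revolutions per second.
    return mb_per_track * rpm / 60


# Assumed values: an outer track holding twice as much data as an inner track.
outer = track_throughput_mb_s(mb_per_track=1.0, rpm=15000)
inner = track_throughput_mb_s(mb_per_track=0.5, rpm=15000)
print(outer, inner)   # prints 250.0 125.0
```

With the same spin speed, a track that holds twice the data delivers twice the data per second.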
![Page 526: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/526.jpg)
![Page 527: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/527.jpg)
AMPs assigned Disk Cylinders, not Entire Disks
“There's no point in being grown up if you
can't be childish sometimes.”
- Doctor Who
Teradata TVS uses software to manage the disks for the AMPs. This software is called
VSS. The VSS software assigns cylinders to the AMPs. In the past, AMPs were
assigned entire disks and each AMP owned all the cylinders inside their disks. Now a
single disk can be allocated cylinder by cylinder to all AMPs within the Clique.
This will prove to be helpful in many ways. Keep reading!
![Page 528: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/528.jpg)
![Page 529: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/529.jpg)
Hot, Warm, and Cold Data
“It doesn't make a difference what
temperature a room is, it's always room
temperature.”
- Steven Wright
TVS will place data that is being accessed often on the outer tracks of the disks. This is
done so Teradata users can feel the need for speed. TVS will also place the data that is
not being accessed very often on the inside tracks of the disk. This is called a Multi-
Temperature data warehouse. TVS gathers metrics automatically about how often a
cylinder is accessed and moves the data blocks inside cylinders accessed the most to the
outer tracks of the disk to improve the access speeds.
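One way to picture the temperature idea is to rank cylinders by how often they are accessed and label the top and bottom of the ranking. The Python below is a sketch under stated assumptions: the function name, the 20% hot and 20% cold shares, and the access counts are all invented for illustration, and the text does not say how TVS actually draws its boundaries.

```python
def classify_cylinders(access_counts, hot_share=0.2, cold_share=0.2):
    """Label each cylinder hot, warm, or cold from its access count."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    n_hot = max(1, round(len(ranked) * hot_share))
    n_cold = max(1, round(len(ranked) * cold_share))
    labels = {}
    for position, cylinder in enumerate(ranked):
        if position < n_hot:
            labels[cylinder] = "hot"     # candidates for the outer tracks
        elif position >= len(ranked) - n_cold:
            labels[cylinder] = "cold"    # candidates for the inner tracks
        else:
            labels[cylinder] = "warm"
    return labels


# Hypothetical access counts gathered for five cylinders.
counts = {"cyl_1": 900, "cyl_2": 15, "cyl_3": 300, "cyl_4": 2, "cyl_5": 40}
labels = classify_cylinders(counts)
print(labels["cyl_1"], labels["cyl_3"], labels["cyl_4"])   # prints hot warm cold
```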
![Page 530: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/530.jpg)
![Page 531: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/531.jpg)
The old way Teradata had to add Disk Space
“They always say time changes things, but
you actually have to change them yourself.”
- Andy Warhol
In the past you needed to pretty much double your disk space if you needed more space
added to your Teradata system. Notice in the picture on the following page that the
system on the top of the picture shows each AMP connected to only two physical disks.
The upgraded system doubled the disks to four and we doubled our system size. This is
considered very expensive.
![Page 532: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/532.jpg)
![Page 533: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/533.jpg)
Doubling the Disk Capacity
“The only difference between a rut and a
grave… is in their dimensions.”
- Ellen Glasgow
Teradata also allowed for replacing smaller disks with larger disks. In the picture on the
following page you can see the top system has 146 GB disks and the upgraded picture on
the bottom of the page shows 300 GB disks. This is again pretty much doubling the
space on your system. TVS will allow for much smaller increments of space. Read on!
![Page 534: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/534.jpg)
![Page 535: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/535.jpg)
Incremental Disk Growth Is Here
“Where facts are few, experts are many.”
- Donald R. Gannon
The example on the next page shows how TVS can add additional space to a Teradata
system by adding a mixture of disk sizes and incremental disk additions. TVS can take a
single disk and assign each AMP in the clique certain cylinders. Teradata used to assign
AMPs physical disks, but TVS assigns AMPs to certain cylinders. How clever!
One concept stays the same. Each AMP has its own virtual disk that only it can access.
Instead of reading and writing to cylinders on four dedicated disks the AMPs read and
write to cylinders managed by TVS.
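The idea of handing out one disk cylinder by cylinder can be sketched in Python. The round-robin dealing, the function name, and the AMP and disk names are assumptions made for illustration; the text says only that TVS assigns each AMP in the clique certain cylinders, not how it chooses them.

```python
def assign_cylinders(amps, disk_name, cylinder_count, vdisks):
    """Deal the cylinders of one disk out to the AMPs in a clique, round-robin."""
    for cylinder in range(cylinder_count):
        amp = amps[cylinder % len(amps)]
        # Each AMP's virtual disk is simply the list of cylinders it owns.
        vdisks.setdefault(amp, []).append((disk_name, cylinder))
    return vdisks


vdisks = assign_cylinders(["AMP1", "AMP2", "AMP3", "AMP4"], "new_disk", 10, {})
print({amp: len(cylinders) for amp, cylinders in vdisks.items()})
# prints {'AMP1': 3, 'AMP2': 3, 'AMP3': 2, 'AMP4': 2}
```

Each AMP still sees a virtual disk that only it can access, but that virtual disk is now a collection of cylinders rather than a set of whole drives, which is what allows a single added disk to benefit every AMP.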
![Page 536: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/536.jpg)
![Page 537: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/537.jpg)
Mixed Disks and Solid State Drives
“Just because something doesn't do what
you planned it to do doesn't mean it's
useless.”
- Thomas Edison
This is the most exciting part about TVS. Teradata is mixing Solid State Drives with
traditional spinning disk drives. Solid State Drives (SSDs) are faster because they use
flash memory technology. Yes, we are talking about the same kind of memory found in the
flash drives you have used to copy a file from one computer to another. SSDs can be
around 100 times faster than traditional disks. This is really like having memory
speed on physical disk.
TVS will place the hot data on the Solid State Drives, the warm data on the faster 146
GB disks, and the data that isn't accessed very much, referred to as cold data, on the
larger, slower spinning disks.
This is referred to as a Multi-Temperature data warehouse.
![Page 538: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/538.jpg)
![Page 539: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/539.jpg)
Solid State Drives are like Giant Flash Drives
“It isn't the mountains ahead that wear you
out, it's the grain of sand in your shoe.”
- Robert Service
The goal for Teradata is eventually to have nothing but Solid State Drives for its storage,
but the costs are currently too high. This will happen when the costs come down.
![Page 540: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/540.jpg)
![Page 541: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/541.jpg)
Virtual Storage Metrics
“Science is facts; just as houses are made of
stones, so is science made of facts; but a pile
of stones is not a house and a collection of
facts is not necessarily science.”
- Henri Poincare
TVS gathers metrics about cylinders so it can determine the hot, warm, and cold data.
This is done in the background. TVS will actually move about 10% of the data each
week to the appropriate disk types and appropriate tracks on disks. The DBA can also
command the system to move the data inside the cylinders to their respective hot, warm,
or cold areas.
![Page 542: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/542.jpg)
![Page 543: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/543.jpg)
The Two Modes of Virtual Storage
“Good design can't fix broken business
models.”
- Jeffrey Veen
TVS has two modes: TT (Teradata Traditional) and Intelligent Placement.
Current customers upgrading from a previous Teradata version will use TT mode. New
customers will use Intelligent Placement. The specifics of both are listed on the
following page.
![Page 544: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/544.jpg)
![Page 545: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/545.jpg)
![Page 546: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/546.jpg)
![Page 547: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/547.jpg)
![Page 548: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/548.jpg)
![Page 549: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/549.jpg)
![Page 550: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/550.jpg)
![Page 551: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/551.jpg)
![Page 552: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/552.jpg)
What is a Row Hash Lock?
A Row Hash lock always involves a 1-AMP operation where the Primary Index is
utilized in the WHERE clause of the query. Instead of locking the entire table and
possibly making other users wait, Teradata will only lock the rows that have the same
Row Hash as the value in the WHERE clause.
In our example you can see that we want to SELECT * from the Employee_Table
WHERE Last = 'Jones'. Since the column Last is the Primary Index of the
Employee_Table, the Parsing Engine comes up with a plan that is a 1-AMP operation. It
knows which AMP holds all the rows where the last name is 'Jones'. Since it is a
SELECT statement, Teradata places a READ lock at the Row Hash level. As you can see
in the picture below, all Row Hash values of 0001 are locked and the query can be
satisfied without locking the entire table.
Even the other rows on AMP 55 are still accessible by other users.
SELECT * FROM Employee_Table
WHERE Last = 'Jones' ;

AMP 55 (Last is the Primary Index)
Row_ID  Last    First  Emp#  Dept  Salary
0001,1  Jones   Joe    61    10    50000.00
0001,2  Jones   Mary   65    20    64000.50
0001,3  Jones   Dave   63    30    84000.60
0001,4  Jones   Sandy  3     30    90490.90
0001,5  Jones   Sue    7     10    25000.50
0001,6  Jones   Bill   51    20    26089.40
0101,1  Bjorn   Jill   68    40    85000.40
0110,1  Patel   Ty     69    30    65876.40
0111,1  Noone   Mo     24    20    86900.40
1000,1  Gore    Jay    49    10    58000.50
1001,1  Samson  Mick   22    30    45000.40
1010,1  Ruler   May    8     20    86000.89
1011,1  Baker   Jan    11    30    65000.50
1100,1  Doron   Hanna  12    20    56450.00
1101,1  Mistel  Hans   67    30    98654.00
1110,1  Wan     Tan    23    40    87659.50

1. The PE knows that Last is the Primary Index. It hashes 'Jones' and the Row Hash is 0001. The PE now knows AMP 55 holds the row(s) in this 1-AMP Operation.
2. A Row Hash Lock is placed on all rows on AMP 55 that have a Row Hash of 0001.
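To make the locking granularity concrete, here is a small Python sketch of a lock table keyed by table name and Row Hash. It is a simplification under stated assumptions: the `LockManager` class is hypothetical, only READ and WRITE locks are modeled (Teradata has other lock levels that are left out here), and the Row Hash values are taken from the example rather than computed.

```python
# Toy lock table keyed by (table, row hash); class and method names are hypothetical.
class LockManager:
    def __init__(self):
        self.locks = {}    # (table, row_hash) -> list of (user, mode)

    def request(self, user, table, row_hash, mode):
        held = self.locks.setdefault((table, row_hash), [])
        # READ locks can share a Row Hash; a WRITE lock conflicts with any other lock.
        if any(mode == "WRITE" or held_mode == "WRITE" for _, held_mode in held):
            return False   # the requester has to wait
        held.append((user, mode))
        return True


locks = LockManager()
print(locks.request("user1", "Employee_Table", "0001", "READ"))    # True: lock granted
print(locks.request("user2", "Employee_Table", "0001", "WRITE"))   # False: 0001 is being read
print(locks.request("user3", "Employee_Table", "0101", "WRITE"))   # True: different Row Hash
```

Only requests that collide on the same Row Hash wait. Every other row on the AMP stays available.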
![Page 553: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/553.jpg)
Chapter 6 — Loading the Data
“I don't know who my grandfather was. I
am more interested in who his grandson will
become.”
– Abraham Lincoln, 16th
president of the United States
My son once told me he did not feel like studying. I said to him, “When Abraham
Lincoln was your age, he studied by candlelight.” My son retorted, “When Abraham
Lincoln was your age, he was president.”
Data within a warehouse environment is often historic in nature, so the sheer volume of
data can overwhelm many systems. But, not Teradata!
“Abraham Lincoln will go down as one of
the greatest presidents in history, but
Teradata is even better because it will not
go down when it loads history.”
– Tom Coffing, 1st president of Coffing Data Warehousing
Teradata is so advanced in the data-loading department that other database vendors can't
hold a candle to it. A Teradata data warehouse brings enormous amounts of data into the
system. This is an area that most companies overlook when purchasing a data warehouse.
Most company officials think loading of data is simply that – just loading data. Some
people actually ask, “Are data loads that critical?” Come on, ASCII stupid question and
get a stupid ANSI.
Data warehouses fail because customers cannot load the data fast enough once it reaches a
certain volume. As one Teradata developer said, “It is not the load that brings them
down, but the way they carry it.” Even an experienced body builder must use a good
technique to lift the weight over his head. While most database vendors are new to the
data warehouse game, Teradata has had 15 years of experience loading the largest data
warehouses in the world. The combination of FastLoad, MultiLoad, and TPump can load
millions, even billions of records in record time.
![Page 554: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/554.jpg)
FastLoad is designed to load flat file data from a mainframe or LAN directly into an
empty Teradata table. This is how a Teradata table is populated the first time. I have
personally seen Teradata load over one billion large rows in less than 6 hours. Plus, I
have seen Teradata load millions of rows in minutes. How is Teradata's speed and
performance accomplished? Once again it's through the power of parallel processing.
Where FastLoad is meant to populate empty tables with INSERTs, MultiLoad is meant to
process INSERTs, UPDATEs, and DELETEs on tables that have existing data.
MultiLoad is extremely fast. One major Teradata data warehouse company processes
120 million inserts, updates, and deletes nightly during its batch window.
The TPump utility is designed to allow OLTP transactions to immediately load into a
data warehouse. When I started working with Teradata, more than 10 years ago, most
companies loaded data on a monthly basis. Suddenly, companies began to load data
weekly.
Today, most companies load data nightly, and industry leaders are loading data hourly.
TPump is the beginning step of an “Active Data Warehouse (ADW)”. ADW combines
OLTP transactions with the power of a Decision Support System (DSS).
The TPump utility conceptually acts like a water faucet. TPump can be set to full throttle
to load millions of transactions during off peak hours or “turned down” to trickle small
amounts of data during the data warehouse daily rush hour. It can also be automatically
preset to load levels at certain times during the day, and can be modified at any time.
Also, TPump locks at a row level so users have access to the rest of the rows while the
table is being loaded. Another advantage of this load utility is that it allows for multiple
updates to be conducted on a table simultaneously.
When the utilities start, the Parsing Engine comes up with a plan for the AMPs. The
Parsing Engine then steps back and lets the AMPs do their work. The data is loaded in
large 64K blocks. Each AMP is given a 64K block of rows for loading. Like a line of
workers trying to pass sand bags to prevent a flood, Teradata passes these blocks from
AMP to AMP until all the data is on Teradata. Next, all AMPs take the blocks they
received and hash the Primary Index value sending the rows over the BYNET to their
destination AMP. Once this is done, each AMP sorts its data by Row ID and the table is
ready for business.
![Page 555: Mastering Teradata](https://reader036.fdocuments.net/reader036/viewer/2022062418/55403f1d4a7959bc158b49fe/html5/thumbnails/555.jpg)
FastLoad
“If you are all wrapped up in yourself, you
are overdressed”
Kate Halverson
The Teradata FastLoad utility is wrapped up in your data and even though it appears
under dressed without fancy dressings it is one of the best utilities every built. It may not
be dressed to kill, but it is designed to thrill!
FastLoad is actually designed to load flat file data from a mainframe or LAN directly
into an empty Teradata table. This is how a Teradata table is populated the first time. I
have personally seen Teradata load over one billion large rows in less than 6 hours. Plus,
I have seen Teradata load millions of rows in minutes. Teradata has the quickest time to
solution, and has the most powerful performance in the data warehousing industry.
How is Teradata’s speed and performance accomplished? It’s done through parallel
processing.
FastLoad understands one SQL command - INSERT. It inserts rows into an empty table.
The process is as follows: A flat file is prepared for loading on a mainframe or LAN.
The FastLoad utility needs three pieces of information to process: where the flat file is
located, what its file definition is, and which Teradata table the data should be loaded
into.
When the FastLoad utility starts, the Parsing Engine comes up with a plan for the AMPs.
The Parsing Engine then steps back and lets the AMPs do their work. The data is loaded
in large 64K blocks. Each AMP is given a 64K block of rows for loading. Like a line of
workers trying to pass sand bags to prevent a flood, Teradata passes these blocks from
AMP to AMP until all the data is on Teradata. Next, all AMPs take the blocks they
received, hash the rows in those blocks (in parallel) and send the rows to the proper AMP
over the BYNET. Once this is done, each AMP sorts its data by Row ID and the table is
ready for business.
FastLoad Basics:
Loads data to Teradata from a Mainframe or LAN flat file;
Only one table may be loaded at a time;
The table to be loaded must be empty;
There can be no secondary indexes, referential integrity, or triggers;
It locks at the table level.
FastLoad populates empty tables at the block level. Teradata LOADs using FastLoad.
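The FastLoad steps described above (blocks handed to AMPs, rows hashed and redistributed, each AMP sorting its rows) can be pictured with a small Python simulation. This is only a sketch under stated assumptions: the names `fastload`, `row_hash`, `NUM_AMPS` and `ROWS_PER_BLOCK` are made up for illustration, CRC32 stands in for Teradata's real hashing algorithm, the round-robin dealing of blocks is an assumption, and a handful of rows stands in for a 64K block.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 4        # assumed AMP count for the illustration
ROWS_PER_BLOCK = 3  # a few rows stand in for one 64K block


def row_hash(pi_value):
    """Toy row hash: CRC32 of the Primary Index value (NOT Teradata's algorithm)."""
    return zlib.crc32(str(pi_value).encode("utf-8"))


def fastload(rows, pi_column, num_amps=NUM_AMPS):
    """Simulate loading `rows` (a list of dicts) into an empty table."""
    # Step 1: cut the flat file into blocks and deal them out to the AMPs
    # (round-robin is an assumption made for this sketch).
    blocks = [rows[i:i + ROWS_PER_BLOCK] for i in range(0, len(rows), ROWS_PER_BLOCK)]
    received = defaultdict(list)
    for n, block in enumerate(blocks):
        received[n % num_amps].extend(block)

    # Step 2: every AMP hashes the Primary Index of the rows it received
    # and sends each row to the AMP that owns that hash value.
    owned = defaultdict(list)
    for amp in range(num_amps):
        for row in received[amp]:
            h = row_hash(row[pi_column])
            owned[h % num_amps].append((h, row))

    # Step 3: each AMP sorts its rows by row hash (the leading part of the Row ID).
    return {
        amp: [row for _, row in sorted(owned[amp], key=lambda pair: pair[0])]
        for amp in range(num_amps)
    }


if __name__ == "__main__":
    flat_file = [{"emp_no": n, "name": f"emp{n}"} for n in range(1, 11)]
    for amp, amp_rows in fastload(flat_file, "emp_no").items():
        print(amp, [r["emp_no"] for r in amp_rows])
```

Notice that the AMP a row first lands on has nothing to do with where it ends up; only the hash of the Primary Index decides the final AMP.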
FastLoad Picture
[Diagram: an input file from a mainframe or LAN is split into 64K blocks and passed, via the PE, to the AMPs in Teradata, each of which holds part of the empty table.]
FastLoad inserts into empty tables at the Block Level.
No Secondary Indexes, Referential Integrity or Triggers allowed.
Multiload
“No wonder nobody comes here – It’s too
crowded”
Yogi Berra
Tera-Tom has actually had dinner with Yogi and he was a real pleasure. As an All-
American Athlete who placed third in the NCAA’s for the University of Arizona in 1979,
Tera-Tom got to spend some time with Yogi. Yogi is a lot like Multiload. He is fast on
his feet, is extremely versatile, and he knows a little bit about clean-up. Multiload can
handle the high heat or the curve when inserting, updating or deleting data.
Where FastLoad is meant to populate empty tables with INSERTS, Multiload is meant to
process INSERTS, UPDATES, and DELETES on tables that have existing data.
Multiload is extremely fast. One major Teradata data warehouse company processes 120
million inserts, updates, and deletes during its nightly batch.
Multiload works similarly to FastLoad. Data originates as a flat file on either a mainframe
or LAN. When the Multiload utility is executed, the Parsing Engine creates a plan for the
AMPs to follow. The data is then passed to the AMPs, in parallel, in 64K blocks, and the
AMPs hash the rows to the proper AMP. Last, the INSERTS, UPDATES, and
DELETES are applied.
In the previous diagram the mainframe/LAN is talking to the Parsing Engine. The PE
passes the data across the BYNET for the AMPs to retrieve. Keep in mind, many
systems have hundreds to thousands of AMPs. The load takes place, continually, in
parallel when the 64K packets are delivered to the AMPs. Multiload has been designed
for users who have a “need for speed”. Multiload locks at the table level. Therefore,
while Multiload is running, the table is unavailable unless users utilize an Access
Lock.
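Why does an Access Lock get through while Multiload holds the table? Teradata's lock levels follow a compatibility rule, and the sketch below writes that rule out in Python. The function name `can_grant` is hypothetical, and treating the running load as a table-level WRITE lock is an assumption made for the illustration.

```python
# Which held lock levels each requested lock level can coexist with.
COMPATIBLE = {
    "ACCESS": {"ACCESS", "READ", "WRITE"},
    "READ": {"ACCESS", "READ"},
    "WRITE": {"ACCESS"},
    "EXCLUSIVE": set(),
}


def can_grant(requested, held):
    """Return True if `requested` is compatible with every lock already held."""
    return all(level in COMPATIBLE[requested] for level in held)


if __name__ == "__main__":
    load_locks = ["WRITE"]  # assumption: the load holds a table-level WRITE lock
    print("READ  :", can_grant("READ", load_locks))    # an ordinary SELECT waits
    print("ACCESS:", can_grant("ACCESS", load_locks))  # an Access Lock reads through
```

The price of the Access Lock is that the reader may see rows that are in the middle of being changed.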
Multiload Basics:
Loads data to Teradata from a Mainframe or LAN flat file;
Up to 20 INSERTS, UPDATES, or DELETES may be executed on up to 5 tables;
Receiving tables are usually populated;
There can be no Unique secondary indexes, referential integrity, or triggers;
It locks at the table level.
Multiload loads to populated tables at the block level. Teradata UPDATEs using
MULTILOAD.
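The Multiload flow described above (changes hashed to the proper AMP, then the INSERTS, UPDATES, and DELETES applied) can be sketched as a toy simulation. The names `multiload` and `owning_amp` are hypothetical, CRC32 stands in for the real hashing algorithm, and each AMP's portion of the table is modeled as a plain dictionary keyed by Primary Index value.

```python
import zlib

NUM_AMPS = 4  # assumed AMP count for the illustration


def owning_amp(pi_value, num_amps=NUM_AMPS):
    """Toy hash (CRC32, NOT Teradata's algorithm) that picks the owning AMP."""
    return zlib.crc32(str(pi_value).encode("utf-8")) % num_amps


def multiload(table, changes, num_amps=NUM_AMPS):
    """Apply (operation, pi_value, row) changes to `table`, a dict of
    AMP number -> {pi_value: row}."""
    # First, route every change to the AMP that owns the target row.
    work = {amp: [] for amp in range(num_amps)}
    for op, pi, row in changes:
        work[owning_amp(pi, num_amps)].append((op, pi, row))

    # Last, each AMP applies only the changes that were routed to it.
    for amp, ops in work.items():
        rows = table.setdefault(amp, {})
        for op, pi, row in ops:
            if op == "INSERT":
                rows[pi] = dict(row)
            elif op == "UPDATE" and pi in rows:
                rows[pi].update(row)
            elif op == "DELETE":
                rows.pop(pi, None)
    return table


if __name__ == "__main__":
    table = {}
    multiload(table, [("INSERT", k, {"qty": k}) for k in (1, 2, 3)])  # populate
    multiload(table, [("INSERT", 4, {"qty": 40}),
                      ("UPDATE", 2, {"qty": 99}),
                      ("DELETE", 3, None)])
    print(table)
```

Because every change is routed by the same hash that placed the original row, an AMP never has to look beyond its own portion of the table.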
Multiload Picture
[Diagram: an input file from a mainframe or LAN is split into 64K blocks and passed, via the PE, to the AMPs in Teradata, each of which holds part of the populated table.]
Multiload inserts, updates, upserts and deletes rows into
populated tables at the Block Level. It does not allow Triggers,
Unique Secondary Indexes (USIs) or Referential Integrity.
TPump
“You don’t drown by falling into the water;
you drown by staying in the water.”
-Edwin Louis Cole
The TPump utility is designed to allow OLTP transactions to immediately load into a
data warehouse. When I started working with Teradata, more than 10 years ago, most
companies loaded data on a monthly basis. Suddenly, companies began to load data
weekly. Today, most companies load data nightly, and industry leaders are loading data
hourly. TPump is the beginning step of an “Active Data Warehouse (ADW).” ADW
combines OLTP transactions with a Decision Support System (DSS).
If the data is not flowing, a company can drown in it! The utility is called TPump because
it acts, conceptually, like a water faucet. TPump can be set to full throttle to load millions
of transactions during off-peak hours or “turned down” to trickle small amounts of data
during the data warehouse rush hour. It can also be automatically preset to load different
levels at certain times during the day, and can be modified at any time.
Also, TPump locks at a row level so users have access to the rest of the rows while the
table is being loaded.
TPump Basics:
Loads data to Teradata from a Mainframe or LAN flat file;
Processes INSERTS, UPDATES, or DELETES;
Tables are usually populated;
It can have secondary indexes, triggers, and referential integrity;
It locks at the row level.
TPump is used for continuous updates to rows in a table. Teradata STREAMs using
TPump.
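The water faucet idea can be pictured with a small Python sketch that looks up a load rate for the current hour and then trickles rows out one batch per simulated minute. Everything here is hypothetical: `rate_for_hour`, `trickle`, and the (start hour, end hour, rows per minute) schedule format are made up for the illustration and are not TPump script syntax.

```python
from itertools import islice


def rate_for_hour(hour, schedule, default_rate):
    """Return rows per minute for `hour` from (start, end, rate) windows."""
    for start, end, rate in schedule:
        if start <= hour < end:
            return rate
    return default_rate


def trickle(rows, hour, schedule, default_rate):
    """Yield one batch of rows per simulated minute at the rate set for `hour`."""
    rate = rate_for_hour(hour, schedule, default_rate)
    pending = iter(rows)
    while True:
        batch = list(islice(pending, rate))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Assumed schedule: turn the faucet down to 100 rows per minute from
    # 08:00 to 18:00 and run at 1000 rows per minute the rest of the day.
    schedule = [(8, 18, 100)]
    rows = list(range(250))
    print([len(b) for b in trickle(rows, 9, schedule, 1000)])  # rush hour
    print([len(b) for b in trickle(rows, 2, schedule, 1000)])  # off-peak
```

The same 250 rows take three minutes during the rush hour and a single minute off-peak; only the setting of the faucet changes.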
TPump Picture
[Diagram: an input file from a mainframe or LAN is sent as packets, via the PE, to the AMPs in Teradata; each AMP holds part of the populated table and applies Row Level Locks.]
TPump inserts, updates, upserts and deletes rows into
populated tables at the Row Level. It supports Triggers,
all Secondary Indexes and Referential Integrity.
FastExport
“The most exciting phrase to hear in
science, the one that heralds the most
discoveries, is not “Eureka!”, but “That’s
funny…””
Isaac Asimov
The most exciting words when loading or unloading data are “That Fast”. Put a seat belt
on before running FastExport because this utility will blow your socks off.
FastExport is designed to export Teradata data to a flat file on a mainframe or LAN.
FastExport merely takes an SQL SELECT statement and writes the result set to a host file.
The SELECT can retrieve data from one table or from multiple tables.
Teradata LOADs using FASTLOAD
Teradata UPDATEs using MULTILOAD
Teradata STREAMs using TPump
Teradata Exports using FASTEXPORT
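As a rough picture of the export direction, the Python sketch below has each simulated AMP select its qualifying rows and writes the combined result set to a flat file. The name `fastexport`, the dictionary of AMP number to rows, and the comma-separated output format are assumptions made for the illustration; an in-memory buffer stands in for the host file.

```python
import csv
import io


def fastexport(table, predicate, columns, out):
    """Write rows matching `predicate` from every AMP to `out`; return the count.

    `table` maps AMP number -> list of row dicts.
    """
    writer = csv.writer(out)
    exported = 0
    for amp in sorted(table):
        for row in table[amp]:
            if predicate(row):  # plays the role of the WHERE clause
                writer.writerow([row[c] for c in columns])
                exported += 1
    return exported


if __name__ == "__main__":
    table = {
        0: [{"id": 1, "amt": 5}, {"id": 2, "amt": 50}],
        1: [{"id": 3, "amt": 500}],
    }
    host_file = io.StringIO()  # stands in for the flat file on the host
    count = fastexport(table, lambda r: r["amt"] >= 50, ["id", "amt"], host_file)
    print(count, "rows exported")
    print(host_file.getvalue())
```

The loop visits the AMPs one after another only because this is a toy; on a real system the AMPs retrieve their rows in parallel.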
FastExport Picture
[Diagram: the AMPs in Teradata read their parts of the populated tables and, via the PE, send the output to a host file on a mainframe or LAN.]
FastExport uses a SELECT statement to retrieve rows from
one or more tables and exports the result set to a host
file on a mainframe or LAN.