Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to...

37
Copyright © SAS Institute Inc. All rights reserved. Ask the Expert SAS SQL: An Introduction

Transcript of Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to...

Page 1: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Ask the ExpertSAS SQL: An Introduction

Page 2: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Ask the ExpertSAS SQL: An Introduction

Presenter

Cynthia JohnsonSenior Systems Engineer, SAS Customer Loyalty Team

Page 3: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

The SQL Procedure

The SQL procedure: enables the use of SQL in SAS

is part of Base SAS software

follows American National Standards Institute (ANSI) standards

includes enhancements for compatibility with SAS software

Page 4: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

The SQL Procedure

The SQL procedure is a tool for querying data

a tool for data manipulation and management

an augmentation to the DATA step.

The SQL procedure is not a DATA step replacement

a custom reporting tool.

Page 5: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Structured Query Language

Input Output

PROC

SQL

Report

PROC SQL

SAS Data Set

SAS Data View

SAS Data Set

SAS Data View

DBMS Table

DBMS Table

Page 6: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

SELECT Statement Syntax

General form of the SELECT statement with selected clauses:

SELECT column-1<, ...column-n>

FROM table-1|view-1<, ...table-n|view-n>

<WHERE expression>

<GROUP BY column-1<, …column-n>>

<HAVING expression>

<ORDER BY column-1<DESC><, …column-n>>;

Page 7: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Select Statement Syntax

To help you remember…

So

Few

Workers

Go

Home

On time;

Page 8: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

SELECT Statement Syntax

Order of execution for SQL clauses:

5 SELECT column-1<, ...column-n>

1 FROM table-1|view-1<, ...table-n|view-n>

2 <WHERE expression>

3 <GROUP BY column-1<, …column-n>>

4 <HAVING expression>

6 <ORDER BY column-1<DESC><, …column-n>>;

Page 9: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Order of Execution

• The FROM clause is processed first. This clause identifies the data sets from which SQL is reading

• The WHERE clause is a pre-processor. As with the DATA step, the WHERE clause searches the original data set(s) for values that meet the WHERE condition. These results are stored in an intermediate table.

• The GROUP BY is the next clause that processes. The GROUP BY clause creates groupings within the data.

Page 10: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Order of Execution

• The HAVING clause is processed next. This clause subsets based upon groups that were created by the GROUP BY clause. It also builds an intermediate table.

• The SELECT clause selects the columns for the results set.

• The ORDER BY clause orders (or sorts) the rows.

Page 11: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Demonstration

Let’s use SQL to query and manipulate a data set.

Page 12: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Combining Data from Multiple Tables

SQL uses joins to combine tables horizontally.

Table A Table B

Page 13: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Types of Joins

PROC SQL supports two types of joins: inner joins

outer joins

Page 14: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Inner JoinsInner joins return only matching rows

enable a maximum of 256 tables to be joined at the same time.

Page 15: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Types of Joins

Outer joins return all matching rows, plus nonmatching rows

from one or both tables can be performed on only two tables or views

at a time.

Full RightLeft

Page 16: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Cartesian Product

To understand how SQL processes a join, it is important to understand the concept of the Cartesian product.

A query that lists multiple tables in the FROM clause without a WHERE clause produces all possible combinations of rows from all tables. This result is called the Cartesian product.

select *from one, two;

Page 17: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Cartesian Product

Table OneX A

1 a

4 d

2 b

Table TwoX B

2 x

3 y

5 v

Page 18: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Cartesian Product

Table OneX A

1 a

4 d

2 b

Table TwoX B

2 x

3 y

5 v

Result SetX A X B

1 a 2 x

1 a 3 y

1 a 5 v

4 d 2 x

4 d 3 y

4 d 5 v

2 b 2 x

2 b 3 y

2 b 5 v

Page 19: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Demonstration

Write a query that performs an inner join.

Page 20: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Outer Joins

Full Join Retrieve the matching rows as well as the non-matches from the left table and

the non-matches from the right table.

Full

Page 21: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Full Join

select * from one full join two

on one.x = two.x;

Table OneX A

1 a

4 d

2 b

Table TwoX B

2 x

3 y

5 v

X A X B

1 a .

2 b 2 x

. 3 y

4 d .

. 5 v

Page 22: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

SQL Full Join

SQL joins do not automatically overlay same-named columns.

Table One

b2d4a1

AX

Table Two

v5y3x2

BX

proc sql;select one.x, a, b

from one full join twoon one.x=two.x

;quit;

X A B1 a2 b x

y4 d

v

Output

Page 23: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

The COALESCE Function

The COALESCE function returns the value of the first non-missing argument.

General form of the COALESCE function:COALESCE(argument-1,argument-2<, ...argument-n)

Page 24: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

SQL Full Join

You can use the COALESCE function to overlay columns.

Table One

b2d4a1

AX

Table Two

v5y3x2

BX

proc sql;select

coalesce(one.x,two.x)as x,a,b

from one full join twoon one.x=two.x;

quit;

Output

X A B1 a2 b x3 y4 d5 v

Page 25: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

SQL Join versus DATA Step Merge

The DATA step code producing similar results:

Table One

b2d4a1

AX

Table Two

v5y3x2

BX

proc sort data=one;

by x;

run;

data three;

merge one two;

by x;

run;

Output

Page 26: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

SQL Join versus DATA Step Merge

Key Points SQL Join DATA Step

Merge

Explicit sorting of data

before join/merge

Not required Required

Same-named columns in

join/merge expressions

Not required Required

Equality in join or merge

expressions

Not required Required

Page 27: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

PROC SQL and Macro Variables

PROC SQL creates or updates macro variables using an INTO clause.

The INTO clause has varied syntaxes, and each produces a different result.

Page 28: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

PROC SQL and Macro Variables

The following syntax places values from the first row returned by an SQL query into macro variable(s). Data from additional rows returned by the query is ignored.

General form of the SELECT statement with an INTO clause:

The value from the first column in the SELECT list is placed in the first macro variable listed in the INTO clause, and so on.

SELECT column-1<, …column-n>

INTO :macvar_1<, ... :macvar_n>

FROM table|view …

Page 29: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Demonstration

Create macro variables with PROC SQL.

Page 30: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

support.sas.com Resources

• SAS ® Base SAS Documentation

http://support.sas.com/documentation/onlinedoc/base/index.html

• Papers & SAS Notes

http://support.sas.com/resources/papers/sgf09/336-2009.pdf

http://support.sas.com/kb/20/783.html

• SAS Training

https://support.sas.com/edu/schedules.html?ctry=us&id=2579

Page 31: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

support.sas.com Resources

• Support Communities

https://communities.sas.com/

• Blogs

http://blogs.sas.com/content/

Page 32: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

sas.com

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Thank you for attending!

Page 33: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Q&APlease submit your questions in the Q&A window

Page 34: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

@SASSoftware

SASSoftware

communities.sas.com

SAS Software, SASUsersgroup

SAS, SAS Users Group

blogs.sas.com/content

Page 35: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the
Page 36: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Learn SAS with SAS Education

SAS Education will support you in continual learning to grow your career.

• SAS Training Coursessupport.sas.com/training

• Get SAS Certifiedsupport.sas.com/certify

• SAS Bookssupport.sas.com/books

Contact SAS Training Customer Service(800) 727-0025 or [email protected] your webinar host, Rochelle Whittle at (919) 531-3090

Page 37: Ask the Expert - Sas Institute...To understand how SQL processes a join, it is important to understand the concept of the Cartesian product. A query that lists multiple tables in the

Copyright © SAS Inst itute Inc. A l l r ights reserved.

Ask the ExpertFeedback

Please fill out survey as you leave WebEx today.

Thank you for joining us today!