GASUG SS 20170419 - sasCommunity
4/20/2017
SESUG 2017
Promotional Slides
• 2017 marks the 25th anniversary of the Southeast SAS User Group’s annual meeting, and we are celebrating with a special homecoming.
• Please join us for our 2017 conference, to be held on the SAS campus!
• Sections for 2017:
– Application/Macro Development
– Banking/Finance
– Building Blocks
– Coder's Corner
– Data Management/Big Data
– e-Posters
– Hands On Workshop
– Life Sciences/Health Care/Insurance
– Planning/Support/Administration
– Reporting/Visualization/JMP
– Statistics/Data Analysis
• In addition, there will be weekend workshops on November 3 and 4.
• Finally, returning in 2017, there will be post-conference SAS Certification testing on Wednesday, November 8. Spots will be limited and available at a discount.
• Be on the lookout for our special base certification registration package!
• Call for content opens in late February!
• Early registration opens in early May!
• See www.sesug.org for more details! Find us on Facebook, Twitter and LinkedIn.
Questions?
Want to Volunteer?
Want to Volunteer Someone Else?
2017 Conference Co-Chairs:
Jason Brinkley, Academic Chair
Meenal Sinha, Operations Chair
Stephanie R. Thompson – Top 5 Things to Know
1. Loves programming (FORTRAN 77 anyone?)
2. Has been programming for a long time (see #1)
3. Is a SAS Global Forum Mentor (Not sure what this is? Ask her!)
4. Presents at lots of SAS conferences and finds them invaluable (learning, networking, even some fun)
5. Worked in higher education, auto parts retail & manufacturing, commercial printing, cereal production, and nuclear waste remediation
FREQ Out – Exploring Your Data the Old School Way
The Presentation - Adapted from a Hands On Workshop
It’s OK to FREQ Out!
• Benefits
• Basics
• The fancy stuff
Data Quality & Reporting
• Know thy data!
• Good data = good information
• Junk in, junk out
• Have you seen a data train wreck?
Data Quality & Reporting
• How do you check your data?
– Your favorite PROC?
• How much is missing?
• What are the values?
• I don’t want to type a lot of code
Why PROC FREQ?
• Fast
• Easy
• Not a lot of code to type
Why PROC FREQ?
• Provides a lot of information by default
– Frequencies
– 3 different percentages
• Can give you more than you bargained for
• Useful for quick reporting
The Basics
proc freq;
run;
What does that do?
• Generates a one-way frequency table for every variable in the most recently created dataset
• You can explicitly reference the dataset
proc freq data=sashelp.cars;
run;
What is all of that?
Table of Make by Cylinders
Each cell shows, top to bottom: Frequency, Percent, Row Pct, Col Pct
Make         3       4       5       6       8      10      12   Total
Acura        0       2       0       5       0       0       0       7
          0.00    0.47    0.00    1.17    0.00    0.00    0.00    1.64
          0.00   28.57    0.00   71.43    0.00    0.00    0.00
          0.00    1.47    0.00    2.63    0.00    0.00    0.00
Audi         0       4       0      10       5       0       0      19
          0.00    0.94    0.00    2.35    1.17    0.00    0.00    4.46
          0.00   21.05    0.00   52.63   26.32    0.00    0.00
          0.00    2.94    0.00    5.26    5.75    0.00    0.00
BMW          0       0       0      16       4       0       0      20
          0.00    0.00    0.00    3.76    0.94    0.00    0.00    4.69
          0.00    0.00    0.00   80.00   20.00    0.00    0.00
          0.00    0.00    0.00    8.42    4.60    0.00    0.00
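As a worked check of the four numbers in a cell, take the Acura, 4-cylinder cell (frequency 2). The overall total of 426 and the 4-cylinder column total of 136 do not appear in the excerpt; they are inferred here from the printed percentages, so treat them as assumptions.

```latex
\begin{align*}
\text{Percent} &= \frac{2}{426} \times 100 \approx 0.47 \\
\text{Row Pct} &= \frac{2}{7} \times 100 \approx 28.57 \\
\text{Col Pct} &= \frac{2}{136} \times 100 \approx 1.47
\end{align*}
```

Percent divides by the table total, Row Pct by the row total (7 Acuras), and Col Pct by the column total.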
Notes
• Can generate a lot of tables and a lot of output pages
• The cars dataset generates almost 50 pages of output
• May not be helpful in looking at your data
A favorite option
proc freq data=sashelp.cars compress;
run;
Get What You Want
• The tables statement
• Plain or fancy
proc freq data=sashelp.cars;
   tables make;
run;
Multiple Tables & Types
proc freq data=sashelp.cars;
   tables make*cylinders;
run;

proc freq data=sashelp.cars;
   tables make make*cylinders;
run;
Multiple Tables 2 Ways - Bonus
proc freq data=sashelp.cars;
   tables make*cylinders make*type make*origin;
run;

proc freq data=sashelp.cars;
   tables make*(cylinders type origin);
run;
Changing the Look
• Use table options
proc freq data=sashelp.cars;
   tables make*cylinders / nopercent nocol;
run;
Missing Data?
proc freq data=sashelp.cars;
   tables cylinders / missing;
run;
How is this different?
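One way to see the difference is to run the table both ways. This is a sketch added for illustration, not part of the original slides: by default PROC FREQ leaves missing values out of the table and reports them in a "Frequency Missing" line beneath it, while the MISSING option treats missing as a level of its own.

```sas
/* Default: missing values of cylinders are excluded from the table
   and from the percentages; the count appears as "Frequency Missing". */
proc freq data=sashelp.cars;
   tables cylinders;
run;

/* MISSING: missing is treated as a valid level and is included
   in the frequencies and the percentages. */
proc freq data=sashelp.cars;
   tables cylinders / missing;
run;
```

Comparing the two outputs side by side shows how the percentages shift once the missing rows are counted in the denominator.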
By Value or 3-Way?
• Getting multi-level tables
• Several options
• Each has a different look
• Depends on what you need
• One option requires sorting
2 Choices - 1
/* SASHELP is normally read-only, so sort into a WORK copy */
proc sort data=sashelp.cars out=cars;
   by make;
run;
proc freq data=cars;
   by make;
   tables model*cylinders;
run;
2 Choices - 2
proc freq data=sashelp.cars;
   tables make*model*cylinders;
run;
List
• Slightly different look
proc freq data=sashelp.cars;
   tables model*cylinders / list;
run;
How to change the order?
• Procedure option
proc freq data=sashelp.cars order=freq;
   tables model / list;
run;
Stats?
• Generated by table options
proc freq data=sashelp.cars;
   tables origin*drivetrain / nocol nopercent missing chisq;
run;
Need a Dataset?
• The OUTPUT statement must be paired with a statistical option
proc freq data=sashelp.cars noprint;
   tables origin*drivetrain / nocol nopercent missing chisq;
   output out=table1 chisq;
run;
What if it’s not?
WARNING: No OUTPUT data set is produced because no statistics are requested in the corresponding TABLES statement.
--- OR ---
WARNING: No OUTPUT data set is produced because no statistics are requested.
Coding it right
proc freq data=sashelp.cars noprint;
   tables origin*drivetrain / nocol nopercent missing chisq;
   output out=table1 chisq;
run;
What is this option?
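If all you need is the counts and percentages in a dataset, the OUT= option in the TABLES statement is an alternative that does not require a statistical option. A minimal sketch, not from the original slides; the dataset name counts1 is made up for illustration.

```sas
proc freq data=sashelp.cars noprint;
   /* OUT= writes one row per origin*drivetrain combination,
      with the variables COUNT and PERCENT. */
   tables origin*drivetrain / out=counts1;
run;

/* Look at the resulting dataset. */
proc print data=counts1;
run;
```

The OUTPUT statement is still the route to take when the statistics themselves (for example the chi-square results) are what you want to keep.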
ODS Graphics
• Generate simple histograms
• No need to write graph code
• Visualize your data
How?
ods graphics on;
proc freq data=sashelp.cars;
   tables origin*drivetrain / nocol nopercent missing chisq;
run;
ods graphics off;
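Depending on the SAS release, a statistic such as CHISQ on a two-way table may not produce a plot by itself. As a hedged sketch (not from the original slides), the PLOTS= option in the TABLES statement asks for a specific plot, here a frequency plot.

```sas
ods graphics on;

proc freq data=sashelp.cars;
   /* PLOTS=FREQPLOT requests a bar chart of the cell frequencies. */
   tables origin*drivetrain / nocol nopercent plots=freqplot;
run;

ods graphics off;
```

Whether a plot appears without PLOTS= is worth checking in the documentation for your release.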
Questions?
Getting Your Data into SAS
Stephanie R. Thompson
Where is your data stored?
• Oracle tables
• SQL Server tables
• Microsoft Access
• Microsoft Excel
• Text file
• SPSS
• All over the place
What’s the solution?
• SAS/ACCESS Interface!
• What can you access?
• SAS/ACCESS validation matrix
– http://support.sas.com/matrix
Access to Relational Databases and Data Warehouse Appliances
• Aster Data nCluster
• DB2
• Greenplum
• Hadoop
• HP Neoview
• Informix
• Microsoft SQL Server
• MySQL
• Netezza
• ODBC
• OLE DB
• Oracle (including Sun Oracle Database Machine [Exadata])
• Sybase
• Sybase IQ
• Teradata
Access to nonrelational databases
• ADABAS
• CA-DATACOM/DB
• SAS/ACCESS DATA Step Interface to CA-IDMS
• IMS
• PC Files
• SYSTEM 2000
Operating System Dependent
How do I know what I have licensed?
proc setinit noalias;
run;
Ways to Access Your Data
• SAS libname engine
• PROC SQL
• ODBC
• PROC IMPORT
– Import Wizard
Libname Refresher
• Point to where your data are
• Think of as shortcuts or nicknames
• Can be 1 to 8 characters
– Start with an underscore or letter
– Cannot start w/a number or other special character
Libname General Form
LIBNAME libref <engine> 'physical path';
• Libref – you provide
• <engine> is optional
– Refers to type of data to read
• Physical path is operating system dependent
Libname Engine
• Connect to relational database tables
– Oracle, DB2, etc.
• Connect to other file types
– Excel, SPSS, etc.
• SAS treats tables like SAS datasets
• Need to define the “engine”
• Can include ID and PWD
Libname Engines
• Based on what is licensed
– Some with Base SAS
– SAS/ACCESS optional engines
• Some are only for certain platforms
• Release resource when done
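The assign, use, release cycle can be sketched with the default engine. The libref proj and the folder path below are placeholders for illustration; substitute a folder that exists on your system.

```sas
/* Hypothetical folder holding SAS datasets. */
libname proj 'C:\My Documents\My SAS Files';

/* List the datasets the libref points to. */
proc contents data=proj._all_ nods;
run;

/* Release the resource when done. */
libname proj clear;
```

The same CLEAR form applies to database librefs, where holding a connection open longer than needed can matter more.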
Excel Example
libname fac 'R:\Employee Data\Employee Database.xls' access=readonly;
data faculty_all(drop = months);
   set fac.'No Costing$'n;
   [ add’l SAS statements ];
run;
New with SAS/ACCESS 9.4
http://support.sas.com/documentation/cdl/en/acpcref/69731/HTML/default/viewer.htm#p05hxa7xihwolmn1sdjpfui9nosc.htm
Excel Example
libname fac excel 'R:\Employee Data\Employee Database.xls' access=readonly;
data faculty_all(drop = months);
   set fac.'No Costing$'n;
   [ add’l SAS statements ];
run;
The engine can be excel, pcfiles, or xlsx
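A hedged sketch of the xlsx engine, which is not shown on the slides. It assumes SAS/ACCESS Interface to PC Files is licensed and that the workbook has been saved in .xlsx format; the file name below is a hypothetical variant of the one above, since the xlsx engine does not read .xls files.

```sas
/* Hypothetical .xlsx copy of the workbook. */
libname fac xlsx 'R:\Employee Data\Employee Database.xlsx';

/* Each worksheet appears as a member of the library. */
proc contents data=fac._all_ nods;
run;

/* Release the workbook when done. */
libname fac clear;
```

Listing the members first is a quick way to confirm how the sheet names are exposed before referencing one in a SET statement.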
Accessing Databases
• Databases on
– PC
– Servers
• Use of ID and passwords
• Additional libname options
Non-SAS Oracle Tips
• Check your IP address
– Static or dynamic?
– Determine your IP address:
  Go to a DOS prompt: Start – Run – CMD
  Type ipconfig /all
  IP address is listed after “IP Address”
– Obtain a static IP address for your PC. This will be necessary if your IT department opens the firewall only to specific IP addresses.
tnsnames.ora
• General form (from Oracle website)
net_service_name=
  (DESCRIPTION=
    (ADDRESS=(protocol_address_information))
    (CONNECT_DATA=
      (SERVICE_NAME=service_name)))
• Can have multiple services in one file
More Non-SAS Oracle Tips
• Obtain any security clearances that you may need.
– Security may be by instance, schema, table, table elements
• Install the Oracle client on your PC
• Configure the Oracle client to access your data
– Edit tnsnames.ora for your database specifics
Protect your ID and Password!
One way:
libname st 'C:\My Documents\My SAS Files';
data cred;
   set st.cred;
   call symput('stid', stid);
   call symput('stpwd', stpwd);
run;
Protect your ID and Password!
A better way:
proc pwencode!
Use generated string in place of password
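A minimal sketch of generating the encoded string. The password shown is a placeholder, not a real credential.

```sas
/* 'my-password' is a placeholder; the encoded string is written to the log. */
proc pwencode in='my-password';
run;
```

The log should show a string that begins with a tag such as {SAS002}; that whole string, tag included, is what goes in place of the clear-text password. It is safer to run this step once interactively than to leave it in a saved program.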
Oracle Using Libname
libname mylib oracle user=&stid pw=&stpwd
   path=RPTS.SPECTRUMREPORTS schema=saturn;
“oracle” is the engine – how the data are stored
libname mylib oracle user=&stid
   pw="{SAS002}C35AF82C41896CB428D730DE2D954CE5192649BB50B8230D523D263F"
   path=RPTS.SPECTRUMREPORTS schema=saturn;
Accessing Oracle Tables
proc sql;
   create table registered as
      select distinct sfrstcr_pidm as pidm
      from mylib.SFRSTCR
      where [add’l SQL statements]
      order by sfrstcr_pidm;
quit;
Just Using PROC SQL
proc sql noerrorstop;
   connect to oracle (user=&stid pw=&stpwd
                      path=RPTS.SPECTRUMREPORTS);
   create table regdetail as
      select * from connection to oracle
      (select pidm, levl_desc, styp_code
       from sgbstdn a, sfbetrm
       where [add’l SQL statements] );
quit;
Why Use SQL?
• Takes advantage of SAS SQL Pass-Through Facility
• Understood by database host system
• Standardized language
• No need for “translation”
Benefits of using PROC SQL
• SQL Pass-Through Facility!
• Speed!!!!!!
Benefits of using PROC SQL (cont’d)
The Pass-Through Facility uses SAS/ACCESS to connect to a DBMS and to send statements directly to the DBMS for execution. This facility is an alternative to the SAS/ACCESS LIBNAME statement. It enables you to use the SQL syntax of your DBMS, and it supports any non-ANSI standard SQL that is supported by your DBMS.
(SAS online documentation)
Benefits of using PROC SQL (cont’d)
• SQL code can be processed on system hosting the database
– Increase speed
– Increase efficiency
– Take advantage of additional functions
Other Notes
• DATA step code is not passed to Oracle when using the LIBNAME engine (turns into a select *)
• PROC PRINT is passed to Oracle as a SELECT query
• PROC FREQ is sent to Oracle as a select * from table
Other Notes (cont’d)
• WHERE criteria will get passed to Oracle
• SAS functions can be used in PROC SQL
– (except LAG and DIF)
• More functionality in SAS 9.2 and beyond
• Ability to see what is being passed via a trace
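The trace mentioned above can be switched on with the SASTRACE= system option. A sketch that assumes the mylib Oracle libref from the earlier slides is already assigned; the PROC PRINT step is only an example of something to trace.

```sas
/* Write the SQL that SAS/ACCESS sends to the DBMS to the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Any step against the DBMS library will do; OBS= keeps the output small. */
proc print data=mylib.SFRSTCR(obs=5);
run;

/* Turn tracing off again. */
options sastrace=off;
```

Reading the generated SELECT in the log is a direct way to confirm whether a WHERE clause was passed to Oracle or applied after the rows came back.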
Using PROC SQL w/ODBC
proc sql;
   connect to odbc (dsn=mdsn uid=&usr pwd=&pwd);
   create table studs as
      select * from connection to odbc
      ( [add’l SQL statements] );
   disconnect from odbc;
quit;
“mdsn” is the name of the ODBC connection set up on my PC
Setting up an ODBC Connection
From Control Panel
Setting up an ODBC Connection (cont’d)
• Why use?
– Needed for certain types of databases (SQL Server)
– To use the libname engine instead of PROC IMPORT
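Once the data source exists, it can also be used through the libname engine. A sketch that reuses the mdsn, &usr and &pwd names from the PROC SQL example and assumes SAS/ACCESS Interface to ODBC is licensed; the libref mydb is made up for illustration.

```sas
/* Assign a libref through the ODBC data source. */
libname mydb odbc dsn=mdsn user=&usr password=&pwd;

/* List the tables visible through the connection. */
proc contents data=mydb._all_ nods;
run;

/* Release the connection when done. */
libname mydb clear;
```

With the libref in place, the tables can be read in a DATA step or a procedure like any other SAS dataset.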
Import File Types
• Base PROC IMPORT
– text files: .dat, tab delimited, comma separated, delimited
Import File Types
• SAS/ACCESS to PC Files
– JMP
– Excel
– MS Access
– dBase
– Lotus
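For the text file types handled by Base SAS, a comma-separated import might look like the sketch below. The file path and the output dataset name are hypothetical.

```sas
/* Hypothetical file path and dataset name. */
proc import datafile="H:\Stephanie\Text\Student_Data.csv"
            out=work.mydata_csv
            dbms=csv
            replace;
   getnames=yes;        /* first row holds the variable names */
   guessingrows=1000;   /* rows scanned to decide type and length */
run;
```

For a tab-delimited file, DBMS=TAB takes the place of DBMS=CSV; other delimiters use DBMS=DLM with a DELIMITER= statement.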
PROC IMPORT
• Allows you to import many types of PC files
• Add code to your program
• Use the Wizard
• Use the Wizard to create code for you
PROC IMPORT - Excel
PROC IMPORT OUT= WORK.mydata
     DATAFILE= "H:\Stephanie\Excel\Student_Data.xls"
     DBMS=EXCEL REPLACE;
   SHEET="Sheet1$";
   GETNAMES=YES;
   MIXED=NO;
   SCANTEXT=YES;
   USEDATE=YES;
   SCANTIME=YES;
RUN;
Import Wizard
Log Window
1    PROC IMPORT OUT= WORK.mydata
2         DATAFILE= "H:\Stephanie\Excel\Student_Data.xls"
3         DBMS=EXCEL REPLACE;
4       SHEET="Sheet1$";
5       GETNAMES=YES;
6       MIXED=NO;
7       SCANTEXT=YES;
8       USEDATE=YES;
9       SCANTIME=YES;
10   RUN;
NOTE: WORK.MYDATA was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           2.37 seconds
      cpu time            0.03 seconds
Flat Files
• Delimited
– Import Wizard
– Data Step
• Flat File (non-delimited)
– Data step
Know your data!
Make it Easy on Yourself
References
• Absolute
– Preceded by the @ symbol
– Indicate a specific column to begin reading in data
– Sample line:
input @1 week 2. @5 home $2.;
References (cont’d)
• Relative
– Relative to current location
– Can be positive or negative
– Sample line:
input week 2. +2 home $2.;
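The two sample lines can be tried on the same records. In this sketch the dataset names and the two data records are invented for illustration; the records are laid out with week in columns 1-2 and home in columns 5-6.

```sas
/* Absolute: @1 and @5 name the starting columns directly. */
data abs_ref;
   input @1 week 2. @5 home $2.;
   datalines;
01  GA
02  FL
;
run;

/* Relative: after WEEK is read the pointer sits at column 3,
   so +2 moves it to column 5. */
data rel_ref;
   input week 2. +2 home $2.;
   datalines;
01  GA
02  FL
;
run;
```

Both steps should read the same values from these records. The absolute form tends to be easier to check against a record layout, while the relative form keeps working if the whole group of fields shifts to a different starting column.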
Tips
• Know Thy Data!
• Understand record layout for flat files
• Libname engine vs. import
– Depends on what you need to do
– What you have licensed
Tips (cont’d)
• Database Views
– Contain pre-combined sets of information
– Can cover many different base tables
– May include pre-calculated variables
– Can be resource intensive
Tips (cont’d)
• Tables
– Learn table structures
  One record per key
  Multiple records per key
– Use validation tables
– Know keys needed to join tables
  Single?
  Many?
Tips (cont’d)
• Data step vs. PROC SQL
– Both will work
– Data step can run much more slowly
  System dependent
  Try both and see which is faster
– Data step only w/ libname engine
– No SQL pass-through w/ data step
Enterprise Guide
Getting data into EG
Questions?