The Report Procedure

Source: people.uncw.edu/blumj/stt305/ppt/The Report Procedure.pdf


The REPORT Procedure

We have seen the PRINT procedure used to display data in various forms, and the MEANS procedure for summarizing data.

The REPORT procedure can do many of these same things in a more versatile format.


The REPORT Procedure

General Syntax:

PROC REPORT <option(s)>;

BREAK location break-variable </ option(s)>;

BY <DESCENDING> variable-1 <...<DESCENDING> variable-n> <NOTSORTED>;

COLUMN column-specification(s);

DEFINE report-item / <usage> <attribute(s)> <option(s)> <justification> <COLOR=color> <'column-header-1' <...'column-header-n'>> <style>;

FREQ variable;

RBREAK location </ option(s)>;

WEIGHT variable;


The COLUMN Statement

The result of the REPORT procedure can be viewed as a table: columns represent different variables, and rows hold their values.

In general, we will use the column statement like the var statement:

column variable(s);

Variables will appear in the order listed in the column statement.
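As a minimal sketch of that parallel, the two steps below list the same variables in the same order, once with PROC PRINT and once with PROC REPORT. It assumes the mysas.projects data set used in the examples that follow; the choice of three variables is only for illustration.

```sas
/* PROC PRINT: the VAR statement picks the variables and their order */
proc print data=mysas.projects;
  var Region Pol_type JobTotal;
run;

/* PROC REPORT: the COLUMN statement plays the same role */
proc report data=mysas.projects nowd;
  column Region Pol_type JobTotal;
run;
```

Swapping the names in either statement changes the column order of the corresponding output.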


Interactive vs. Non-Interactive

In previous versions of SAS, the REPORT procedure opened an interactive window by default.

In versions prior to 9.4, you can suppress the interactive mode and direct output to the output window by using the NOWD option in the PROC REPORT line:

proc report data=data-set nowd;


Example

A simple example:

proc report data=mysas.projects nowd;

column Region Pol_type equipmnt personel JobTotal;

run;

The result is very much like that of the PRINT procedure.


The DEFINE Statement

The define statement allows attributes to be set for each variable.

Syntax:

define variable / options;

Many different options are available…


The DEFINE Statement

Some options:

'Any Text' (a quoted string): provides a column label

width=n: sets the column width

format=SAS-format: sets the format for display


Example

Modifying the previous example:

proc report data=mysas.projects nowd;

column Region Pol_type equipmnt personel JobTotal;

define region/'Regional Office';

define pol_type/'Pollutant' width=10;

define equipmnt/'Equipment Cost' format=dollar10.;

define personel/'Personnel Cost' format=dollar10.;

define jobtotal/'Total Cost' format=dollar10.;

run;


The DEFINE Statement

This output could also be generated using PROC PRINT with appropriate label and format statements.

There must be more options available in PROC REPORT…


The ORDER Option

The order option:

Orders the values of the variable in question.

Displays each value only once at the beginning of its output block.

To see this, add the order option to the options for the Region and Pol_type variables in the previous example.
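A sketch of that modification, assuming the same mysas.projects data set and labels as the previous example; only the two order keywords are new.

```sas
proc report data=mysas.projects nowd;
  column Region Pol_type equipmnt personel JobTotal;
  /* order: sort rows by this variable and print each value once per block */
  define region   / order 'Regional Office';
  define pol_type / order 'Pollutant' width=10;
  define equipmnt / 'Equipment Cost' format=dollar10.;
  define personel / 'Personnel Cost' format=dollar10.;
  define jobtotal / 'Total Cost' format=dollar10.;
run;
```

Every observation still gets its own row; only the repeated Region and Pol_type values are blanked out within each block.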


Output (2nd Page)

Can PRINT do this?


The GROUP Option

The group option:

Orders the values of the variable

Condenses multiple observations (if possible)

Replaces any quantitative variable(s) with a summary statistic (the sum by default)

To see this, include the group option for both the Region and Pol_type variables.
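A sketch of the grouped version, again assuming the mysas.projects data set. No statistic keyword is given, so each cost column should show the default (the sum) for its Region and Pol_type combination; the column labels are reworded here only to reflect that.

```sas
proc report data=mysas.projects nowd;
  column Region Pol_type equipmnt personel JobTotal;
  /* group: collapse observations that share the same Region and Pol_type */
  define region   / group 'Regional Office';
  define pol_type / group 'Pollutant' width=10;
  /* no statistic keyword, so each cost column is summed within the group */
  define equipmnt / 'Total Equipment Cost' format=dollar10.;
  define personel / 'Total Personnel Cost' format=dollar10.;
  define jobtotal / 'Total Cost' format=dollar10.;
run;
```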


The GROUP Option

Can PRINT do this?


Summary Statistics

To change the default statistic for a numeric variable, include a statistic keyword as an option in its define statement.

Some statistic keywords: N MIN MAX MEAN STD MEDIAN (similar to what is available with PROC MEANS)


Summary Statistics

Example: compute means for each cost variable:

proc report data=mysas.projects nowd;

column Region Pol_type equipmnt personel JobTotal;

define region/'Regional Office' group;

define pol_type/'Pollutant' group width=10;

define equipmnt/mean 'Average Equipment Cost' format=dollar10.;

define personel/mean 'Average Personnel Cost' format=dollar10.;

define jobtotal/mean 'Average Total Cost' format=dollar10.;

run;


Summary Statistics

Suppose we want several statistics on a single variable (jobtotal in this example). How is that accomplished?

The same variable can be used in multiple columns of a report, each with a different definition, through the use of aliases.


Aliases

Aliases are set in the column statement:

column variable=alias …

The alias must be a legal SAS name and must not be a variable name from the current data set.

Define statements then refer to the alias rather than to the variable name.

proc report data=mysas.projects nowd;

column Region Pol_type JobTotal=num JobTotal=avg JobTotal=med JobTotal=std;

define region/'Regional Office' group;

define pol_type/'Pollutant' group width=10;

define num/n 'Number of Jobs' width=8;

define avg/mean 'Mean Total Cost' format=dollar12.2;

define med/median 'Median Total Cost' format=dollar12.2;

define std/std 'Std. Deviation of Total Cost' format=dollar12.2;

run;


Using Formats with Grouping

In past work, formats have been used to define categories.

For groups, categories are determined by the format if one is specified.

Consider:


Using Formats with Grouping

proc format;

value gestation

low-<259='Premature'

259-<999='Normal'

999='Unknown'

;

run;

proc report data=mysas.birthweight nowd;

column gestation birth_wt=n birth_wt=avg birth_wt=sd;

define gestation/group 'Gestation' format=gestation.;

define n/n 'Number of Births' width=8;

define avg/mean 'Avg. Birth Weight (oz)' format=9.2;

define sd/std 'Std. Deviation' format=9.2;

run;


Result

Try without the format.
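One way to try this is to rerun the same step with the format= option dropped from the gestation definition, as sketched below. It assumes the same mysas.birthweight data set; without the format, each distinct gestation value should form its own group.

```sas
proc report data=mysas.birthweight nowd;
  column gestation birth_wt=n birth_wt=avg birth_wt=sd;
  /* no format= here, so groups are formed from the raw gestation values */
  define gestation / group 'Gestation';
  define n   / n 'Number of Births' width=8;
  define avg / mean 'Avg. Birth Weight (oz)' format=9.2;
  define sd  / std 'Std. Deviation' format=9.2;
run;
```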


BREAK and RBREAK

The rbreak command allows for whole report summaries.

Syntax:

rbreak location </ option(s)>;

Location: before or after

Options:

summarize: gives whole-report summary statistic(s)

ol or dol: overline or double overline

ul or dul: underline or double underline


BREAK and RBREAK

Example

proc report data=mysas.projects nowd;

column Region Pol_type JobTotal=num JobTotal=avg JobTotal=med JobTotal=std;

define region/'Regional Office' group;

define pol_type/'Pollutant' group width=10;

define num/n 'Number of Jobs' width=8;

define avg/mean 'Mean Total Cost' format=dollar12.2;

define med/median 'Median Total Cost' format=dollar12.2;

define std/std 'Std. Deviation of Total Cost' format=dollar12.2;

rbreak after / summarize dol;

run;


BREAK and RBREAK

Summary line added at bottom with double over-line separator.


BREAK and RBREAK

The break command allows for summaries by group.

Syntax:

break location variable </ option(s)>;

Location: before or after

Options (in addition to the previous ones):

skip: skips a line between groups

page: skips a page between groups

suppress: suppresses printing of the break variable values at the end of each group


BREAK and RBREAK

Example:

proc report data=mysas.projects nowd;

column Region Pol_type JobTotal=num JobTotal=avg JobTotal=med JobTotal=std;

define region/'Regional Office' group;

define pol_type/'Pollutant' group width=10;

define num/n 'Number of Jobs' width=8;

define avg/mean 'Mean Total Cost' format=dollar12.2;

define med/median 'Median Total Cost' format=dollar12.2;

define std/std 'Std. Deviation of Total Cost' format=dollar12.2;

break after region/summarize ol skip;

rbreak after / summarize dol;

run;


BREAK and RBREAK

Summary line added at bottom of each region with over-line separator and line break before the next group.

The suppress option removes the break variable values from the summary lines.


Other Options

In addition to the nowd option in the PROC REPORT line, you may wish to specify:

headline: underlines the set of column headings

headskip: skips a line after the headings before beginning the report

box: draws borders for rows and columns of the table
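A short sketch of two of these options on the PROC REPORT line, assuming the mysas.projects data set from the earlier examples. These options shape the traditional monospace (listing) output, so they may have no visible effect in other output destinations such as HTML.

```sas
/* headline and headskip sit alongside nowd on the PROC REPORT line */
proc report data=mysas.projects nowd headline headskip;
  column Region Pol_type JobTotal;
  define region   / group 'Regional Office';
  define pol_type / group 'Pollutant' width=10;
  define jobtotal / sum 'Total Cost' format=dollar12.;
run;
```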


Exercise 1

Using the delay data set, create:


Exercise 2

Using the birthweight data and the following:

Normal gestation is 259 days or more; less is premature (missing values are coded as 999).

Smoking status is coded as:

0, Never smoked

1, Currently smoke

2, Stopped smoking at pregnancy

3, Stopped before current pregnancy

9, Unknown

Create:


Exercise 3

Use the fish data set to create the report that follows.

The standards for mercury levels are:

less than 0.5 ppm is acceptable

more than 1.0 ppm is toxic

anything in between is considered dangerous and requires action
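As a possible starting point, these standards can be encoded as a format in the same way the gestation categories were earlier. The format name hglevel and the category labels below are hypothetical choices, not taken from the exercise.

```sas
/* hglevel is a hypothetical format name chosen for this sketch */
proc format;
  value hglevel
    low-<0.5  = 'Acceptable'   /* below 0.5 ppm */
    0.5-1.0   = 'Dangerous'    /* 0.5 to 1.0 ppm inclusive: requires action */
    1.0<-high = 'Toxic'        /* above 1.0 ppm */
  ;
run;
```

Applying this format to the mercury variable in a define statement with the group option would then form the three categories.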
