Post on 07-Oct-2020
8/14/2015
Copyright © 2015, SAS Institute Inc. All rights reserved.
VISUALIZING SPATIAL INFORMATION FROM
MULTIPLE MEASURES WITHIN A UNIT OVER TIME;
SUPPORTED BY JSL
• Tony Cooper & Sam Edgemon
• Analytical Consultants at SAS
BIO
• Tony Cooper, PhD, Principal Analytic Consultant, SAS
  • Tony has extensive experience using statistical methods to solve executive-driven business problems in a variety of industries. He has done project work and has taught the principles of work process improvement, statistical thinking and statistical methods. Tony is an excellent SAS coder, JMP user and JSL scripter. He received his doctorate from the University of Tennessee and also has a BS in chemical engineering from Rensselaer Polytechnic Institute.
• Sam Edgemon, Senior Analytical Consultant, SAS
  • A SAS and JMP user for over 20 years, Sam Edgemon has focused on consulting and corporate work using many SAS products, with project roles ranging from contributing analyst to project lead, as well as all aspects of managing technically oriented projects. He has gained experience in many areas: Government, Environmental, Biological Surveillance, Health Care, Pharmaceutical, Automotive, Financial Services, Education, Gaming, Recreation, and Agriculture. Edgemon holds a BS in mathematics and a BS in statistics from the University of Tennessee, with certificates from the University of Tennessee in Process Controls and Experimental Design and from the Massachusetts Institute of Technology in Data Mining.
A SCRIPT TO INVESTIGATE THE MULTIVARIATE DIACHRONIC NATURE OF
SPATIAL SOURCES OF VARIATION GRAPHICALLY & QUANTITATIVELY
• Understanding the causes of variation in key product or process
characteristics is an ongoing task in product and process design and
manufacturing. Once sources of variation are discovered, the initial
investigation is followed by engineering solutions to reduce or mitigate them.
The many diagnostic strategies available for initial investigation and
development of theories include various data methodologies for examining
retrospective and observational data.
BACKGROUND
• Edward Tufte classifies fundamental graphical designs into data maps, time series maps, space-time narrative designs, and relational graphics.
• Data maps, which place data on cartographic displays, have a rich history. Perhaps the most famous is Dr. John Snow’s use of data maps in investigating the cause of cholera. An equivalent of data maps for investigating spatial structure is the concentration diagram, where data (measurements, defects, etc.) are placed on a schematic of a product.
• The importance of time-based graphics in production is well understood; run charts and control charts are widely used. Investigation of diachronic structure is the basis of time series plots; “Do the sources of variation act in a consistent fashion?” and “How long does it take the process to ‘settle down’?” are typical questions.
• Despite their usefulness, space-time maps are not yet common in engineering. Typically, a large number of product dimensions and process parameters are observed simultaneously over time. Investigating these multivariate systems with space-time maps, graphically and quantitatively, is a nontrivial process.
A famous map combining spatial and time characteristics is Minard’s map describing the destruction of Napoleon’s army in the invasion of 1812.
Charles Joseph Minard, “Carte Figurative des pertes successives en hommes de l'Armée Française dans la campagne de Russie 1812-1813”; Paris 1869
SPATIAL RELATIONSHIPS: CONCENTRATION DIAGRAMS / MAPS IN JMP
Malaria Cases (World Health Organization)
JMP finds these map shapes ‘automatically’
TEMPORAL CHANGES
• The graph shows two measurements on 70 parts over time (some labeled)
• Unsupervised Causal Models:
• Special cause / Common cause
  • Championed by Shewhart and Deming
  • Typically evaluated with control charts
• Clusters
Control Chart
Cluster Analysis
SMALL 3D EXAMPLE
• What are the interesting facets of
this data?
AN ANSWER
Cluster Analysis
Multivariate Control Chart
Each step is readily available in JMP
A script can speed up this analysis
TYPICAL DATA
• Univariate
• Across dimensions over time
• Across regions (geographic) over
time
• Across part locations (quality) over
time
• Columns
• Time (or a logical order)
• Location
• Measure
• Product Examples
• Profiles within parts
• Measured on a CMM
• Impurity profiles
• Public Health
• Disease counts by country
SCRIPT
STEP 1) RUN ’WITHINPARTANALYSIS’
STEP 1A) OPEN DATA WITH LINKED (CUSTOM) MAP
Step 2) Choose # Clusters using dendrogram
Step 2a) Choose # Principal Components
Step 3) Continue (or Update)
Step 4) Add Cluster information to original table
ORIGINAL & REARRANGED DATA
Original Data: Rows = (# Parts) x (# Locations)
Name   L  Part  Value
R1_S1  1  1     125.02
R1_S2  1  1     123.1
R1_S3  1  1     122.38
R1_S4  1  1     122.81
R1_S5  1  1     126.74
R1_S6  1  1     125.05
R1_S7  1  1     125.27
R1_S8  1  1     122.08
R2_S1  1  1     127.04
R2_S2  1  1     123.58
• At least one column describes locations within a part
• One column identifies the part
• One column relates the measure at the location / part
Script splits columns
Rearranged Data: Rows = (# Parts); locations within part became columns
Part  R1_S1 1  R1_S1 2  R1_S1 3  R1_S2 1  R1_S2 2  R1_S2 3  R1_S3 1  R1_S3 2  R1_S3 3
1     125.02   128.75   126.55   123.1    126      122.5    122.38   125.4    125.68
2     125.41   127.77   126.22   125.16   124.72   123.34   125.18   119.41   125.97
3     124.67   126.15   125.98   123.41   125.07   128.28   125.66   122.97   121.73
4     120.96   123.44   125.26   124.17   122.77   130.01   123.49   123.33   124.06
5     123.93   125.47   125.16   127.48   121.23   128.23   124.44   121.6    123.71
6     125.3    123.9    126.95   123.87   125.35   124.83   124.93   125.42   122.49
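The split from the stacked table to the one-row-per-part table can be sketched outside JMP as follows. This is a minimal Python sketch under stated assumptions, not the WithinPartAnalysis script itself: the function name `split_by_part` is hypothetical, and the column layout (Name, L, Part, Value) and the sample values are taken from the tables on this slide.

```python
from collections import defaultdict

# Stacked rows: (Name, L, Part, Value); values copied from the slide's tables.
stacked = [
    ("R1_S1", 1, 1, 125.02),
    ("R1_S2", 1, 1, 123.10),
    ("R1_S3", 1, 1, 122.38),
    ("R1_S1", 1, 2, 125.41),
    ("R1_S2", 1, 2, 125.16),
    ("R1_S3", 1, 2, 125.18),
]


def split_by_part(rows):
    """Return (column_names, table) with one row per part.

    Each distinct (Name, L) pair becomes a column named "Name L".
    """
    columns = []
    wide = defaultdict(dict)
    for name, level, part, value in rows:
        col = f"{name} {level}"
        if col not in columns:
            columns.append(col)
        wide[part][col] = value
    # Missing location/part combinations are left as None.
    table = {
        part: [values.get(col) for col in columns]
        for part, values in sorted(wide.items())
    }
    return columns, table


columns, table = split_by_part(stacked)
print("Part", *columns)
for part, values in table.items():
    print(part, *values)
```

The stacked form is what the map display needs (one row per location), while the split form is what clustering and principal components need (one row per part), which is why both arrangements are kept.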
STEP 1) RUN ’WITHINPARTANALYSIS’
STEP 1A) OPEN DATA WITH LINKED (CUSTOM) MAP
HIERARCHICAL CLUSTER ANALYSIS USING THE WARD METHOD
• Starts with the maximum number of clusters (= # parts) and iteratively
combines the points and clusters of points that are closest together.
• ‘Closest’ means the shortest distance between summaries of the profiles.
• Ward’s method forms each new cluster so that the within-cluster variance
of the profiles increases as little as possible.
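The agglomeration described above can be sketched in a few lines. This is a naive illustration of Ward's criterion, assuming each part's profile is a plain list of numbers; it is not JMP's implementation. The function name `ward_clusters` and the five profiles are hypothetical. The merge cost used is the standard Ward increase in within-cluster sum of squares, n_a·n_b/(n_a+n_b) times the squared distance between the two cluster centroids.

```python
def ward_clusters(profiles, n_clusters):
    """Naive Ward agglomeration; returns a cluster label for each profile."""
    # Every part starts as its own cluster: (centroid, member indices).
    clusters = [(list(p), [i]) for i, p in enumerate(profiles)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        (ca, ma), (cb, mb) = a, b
        na, nb = len(ma), len(mb)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return na * nb / (na + nb) * dist2

    while len(clusters) > n_clusters:
        # Pick the pair whose merger increases within-cluster variation least.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        (ca, ma), (cb, mb) = clusters[i], clusters[j]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (centroid, ma + mb)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    labels = [0] * len(profiles)
    for label, (_, members) in enumerate(clusters, start=1):
        for m in members:
            labels[m] = label
    return labels


# Hypothetical profiles (three locations per part) forming two groups.
profiles = [
    [125.0, 123.1, 122.4],
    [125.4, 123.3, 122.2],
    [130.1, 128.0, 127.5],
    [129.8, 128.3, 127.9],
    [125.2, 122.9, 122.6],
]
labels = ward_clusters(profiles, n_clusters=2)
print(labels)
```

Stopping at a fixed `n_clusters` corresponds to cutting the dendrogram at a chosen height; in the script that choice is made interactively.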
Step 2) Choose # Clusters using dendrogram
PRINCIPAL COMPONENTS:
ASSESSING COVARIANCE / CORRELATION STRUCTURE
• Maximum number of Principal Components = Number of Locations within the part
• Considers linear combinations of locations (describing the profile)
• 1st component describes the strongest relationship, and so on.
• Consider the correlation between locations as an example.
     Y1      Y2      Y3      Z1      Z2      Z3      x1      x2
Y1   1.000   0.796   0.693   0.509   0.487   0.439   0.113  -0.157
Y2   0.796   1.000   0.745   0.345   0.500   0.361   0.016   0.019
Y3   0.693   0.745   1.000   0.287   0.315   0.345  -0.012  -0.161
Z1   0.509   0.345   0.287   1.000   0.904   0.734   0.157   0.011
Z2   0.487   0.500   0.315   0.904   1.000   0.809   0.109   0.037
Z3   0.439   0.361   0.345   0.734   0.809   1.000  -0.039  -0.059
x1   0.113   0.016  -0.012   0.157   0.109  -0.039   1.000   0.033
x2  -0.157   0.019  -0.161   0.011   0.037  -0.059   0.033   1.000
Example of a correlation matrix
Are the relationships always as described by this matrix? I.e., are there multiple loadings?
Step 2a) Choose # Principal Components
MORE ABOUT PRINCIPAL COMPONENTS
• Columns are locations
• The idea that the locations are distinct is preserved
• The idea that certain locations are closer than others is lost
Dimension Reduction
Scree plot of eigenvalues (λ): most of the variation is explained by 3 or 4 components
MULTIVARIATE CONTROL CHARTS
ASSESSING COVARIANCE / CORRELATION STRUCTURE OVER TIME
• Data must be in a logical (time)
order.
• Signal
• Noise
• Summarized by Hotelling $T^2$
• $T^2 = \sum_i \frac{PCA_i^2}{\lambda_i}$
• $T^2 \sim \frac{(n-1)^2}{n}\, B\!\left(\frac{p}{2}, \frac{n-p-1}{2}\right)$
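The sum of squared principal component scores divided by their eigenvalues can be computed directly for a small case. The following is a minimal sketch for two locations only, assuming the sample covariance (divisor n − 1) and all components retained; the closed-form 2x2 eigen-decomposition is used only to keep the example free of dependencies. The function name `hotelling_t2_two_locations` is hypothetical, and the six rows are the "R1_S1 1" and "R1_S2 1" columns of the rearranged data table. Control limits are not computed here.

```python
import math


def hotelling_t2_two_locations(rows):
    """T^2 for each row of two-location data, via principal component scores."""
    n = len(rows)
    mean = [sum(r[k] for r in rows) / n for k in range(2)]
    centered = [[r[0] - mean[0], r[1] - mean[1]] for r in rows]

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(d[0] * d[0] for d in centered) / (n - 1)
    b = sum(d[0] * d[1] for d in centered) / (n - 1)
    c = sum(d[1] * d[1] for d in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    eigenvalues = [half_trace + radius, half_trace - radius]

    t2 = [0.0] * n
    for lam in eigenvalues:
        # Unit eigenvector for this eigenvalue (this sketch assumes b != 0).
        vx, vy = b, lam - a
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
        for i, d in enumerate(centered):
            score = vx * d[0] + vy * d[1]  # principal component score
            t2[i] += score ** 2 / lam      # squared score scaled by eigenvalue
    return t2


# Two locations for parts 1-6, taken from the rearranged data table.
rows = [
    (125.02, 123.10),
    (125.41, 125.16),
    (124.67, 123.41),
    (120.96, 124.17),
    (123.93, 127.48),
    (125.30, 123.87),
]
t2 = hotelling_t2_two_locations(rows)
for part, value in enumerate(t2, start=1):
    print(part, round(value, 3))
```

With all components retained, these values equal the usual quadratic form (x − x̄)' S⁻¹ (x − x̄); keeping only the first few components, as chosen in Step 2a, drops the remaining terms of the sum.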
Step 3) Continue (or Update)
T2 WITH PHASES BY CLUSTER
• The T2 multivariate control chart assumes a single homogeneous baseline process (with a few
observations affected by special causes).
• The clustering suggests many ‘common’ cause states, invalidating the calculated limits.
• If the reason for the clustering is understood, phased control limits could make sense.
• Multiple machines? (fixed, systematic differences)
• Raw material lots, setup?
VIEW CONCENTRATION MAPS REPRESENTING KEY TIME PERIODS
Step 4) Add Cluster information to original table
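Step 4 amounts to a join of the per-part cluster assignment back onto the stacked table, followed by a per-cluster summary at each location that a map can display. The sketch below shows one way to do this, assuming the stacked layout (Name, L, Part, Value) used earlier; the function names and the cluster labels "A" and "B" are hypothetical, and the choice of the mean as the summary is an assumption rather than something stated on the slide.

```python
from collections import defaultdict
from statistics import mean

# Stacked rows: (Name, L, Part, Value); values from the rearranged data table.
stacked = [
    ("R1_S1", 1, 1, 125.02),
    ("R1_S2", 1, 1, 123.10),
    ("R1_S1", 1, 2, 125.41),
    ("R1_S2", 1, 2, 125.16),
    ("R1_S1", 1, 3, 124.67),
    ("R1_S2", 1, 3, 123.41),
]

# Hypothetical cluster assignment, one entry per part.
cluster_by_part = {1: "A", 2: "B", 3: "A"}


def add_cluster(rows, clusters):
    """Append each part's cluster label to its stacked rows."""
    return [row + (clusters[row[2]],) for row in rows]


def mean_by_cluster_and_location(rows):
    """Average Value for every (cluster, Name, L) combination."""
    groups = defaultdict(list)
    for name, level, part, value, cluster in rows:
        groups[(cluster, name, level)].append(value)
    return {key: mean(values) for key, values in groups.items()}


augmented = add_cluster(stacked, cluster_by_part)
summary = mean_by_cluster_and_location(augmented)
for key in sorted(summary):
    print(key, round(summary[key], 3))
```

Each (cluster, location) value in `summary` would color one map shape, giving one concentration map per cluster.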
EXAMPLE 1
• Measures at 72 Locations per part
• Three levels (L=1,2,3)
• 24 measurements per Level
• Three rings
• Eight sections per ring
• 617 parts produced over time
• 44,424 records
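The counts above follow from the layout: 3 levels x 3 rings x 8 sections = 72 locations, and 72 x 617 parts = 44,424 stacked records. A short sketch that enumerates the locations makes this explicit. The naming pattern "R<ring>_S<section>" paired with a level is an assumption inferred from the sample data table, not something stated on this slide.

```python
from itertools import product

levels = [1, 2, 3]        # L = 1, 2, 3
rings = [1, 2, 3]         # three rings per level
sections = range(1, 9)    # eight sections per ring
n_parts = 617

# One (Name, L) pair per measured location; the naming pattern is assumed.
locations = [
    (f"R{ring}_S{section}", level)
    for level, ring, section in product(levels, rings, sections)
]

per_level = len(locations) // len(levels)
n_records = len(locations) * n_parts

print(len(locations), per_level, n_records)
```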
ANALYSIS OF EXAMPLE 1:
Multivariate Summary over Time
Spatial Description of clusters 3, 6 & 8
A QUICK NOTE ABOUT CUSTOM MAP FILES: TWO DIMENSIONAL MAPS
Data Table
• Includes a column identifying the map location for the row.
• This location ID column has a Column Property > Map Role > Shape Name Use > nn-name.jmp
nn-Name.jmp
• Default file location
• Custom file location; the data table must reference the custom location
• At least two columns
• Must have a “Shape ID” column with Column Property > Map Role > Shape Name Definition
• Custom Name: this has the same values used in the map location in the data table. Examples: Parish, Loc_ID
nn-XY.jmp
• Must be in the same location as nn-name.jmp
• Column Names
• Shape ID: identifies each map shape
• Part ID: allows a single shape to have separate sections (even non-contiguous)
• X, Y: the vertices
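To make the two-file layout concrete, the sketch below generates rows for a name table and an XY table describing a part drawn as concentric rings divided into sections. It only builds in-memory rows; it does not write .jmp files or set any column properties. The geometry (unit-width rings, equal-angle sections), the function name `ring_section_map`, and the use of "Loc_ID" values of the form "R<ring>_S<section>" are all assumptions for illustration. The column names Shape ID, Part ID, X and Y are the ones listed above.

```python
import math


def ring_section_map(n_rings=3, n_sections=8, points_per_arc=5):
    """Build rows for a name table and an XY table of ring/section shapes."""
    name_rows = []  # (Shape ID, Loc_ID)
    xy_rows = []    # (Shape ID, Part ID, X, Y)
    shape_id = 0
    for ring in range(1, n_rings + 1):
        r_in, r_out = ring - 1, ring  # assumed unit-width rings
        for section in range(1, n_sections + 1):
            shape_id += 1
            name_rows.append((shape_id, f"R{ring}_S{section}"))
            start = 2 * math.pi * (section - 1) / n_sections
            end = 2 * math.pi * section / n_sections
            angles = [
                start + (end - start) * k / (points_per_arc - 1)
                for k in range(points_per_arc)
            ]
            # Trace the outer arc, then return along the inner arc.
            outer = [(r_out * math.cos(t), r_out * math.sin(t)) for t in angles]
            inner = [(r_in * math.cos(t), r_in * math.sin(t))
                     for t in reversed(angles)]
            for x, y in outer + inner:
                # Each shape is a single contiguous piece, so Part ID is 1.
                xy_rows.append((shape_id, 1, x, y))
    return name_rows, xy_rows


name_rows, xy_rows = ring_section_map()
print(len(name_rows), "shapes,", len(xy_rows), "vertices")
print(name_rows[:3])
```

The second column of `name_rows` plays the role of the custom name column: its values must match the location ID values in the data table for the shapes to be linked.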
CONCLUSION
• Semiconductor wafer manufacture and CMMs (Coordinate Measuring Machines) provide many
measurements for a single part. Another example of repeated measures within a unit is defect
counts. This data includes a within-unit location.
• Modeling using the spatial information can be difficult, but spatial information should always be used
in visualization. The potential information provided by this data includes: “Is the current unit similar
to other recent units?” and “Where are the opportunities concentrated within the unit?”
• This data can be analyzed and visualized using Principal Components Analysis (PCA), multivariate
control charting and clustering. Most importantly, the data can be visualized using custom maps
within JMP. More than one arrangement of the data is needed for the analysis to
work: the data is split for PCA and clustering, and the cluster information is then merged with the original
stacked view.
• This presentation described a methodology, streamlined with JMP scripting, to indicate similar
processing time periods on a control chart and then map the within-part information during each of
those time periods.
THANK YOU
• Tony.Cooper@SAS.com
• Sam.Edgemon@SAS.com
NEXT STEPS ADDITIONAL IDEAS
• Implement option to calculate control limits by cluster
• Consider time series in T2 - a 3rd method of looking for patterns in the profile
SUPPORT PAGES
TWO RELATED UNSUPERVISED MODELS;
‘STATES’ OF THE CAUSAL STRUCTURE
Special Cause / Common Cause
• Shewhart / Deming Model
• Common cause
  • Ubiquitous causal structure
  • Typical
  • The norm
• Special cause
  • Additional ‘assignable’ cause(s)
• Methodology
  1. Summarize profile
     1. Often based on PCA
     2. Max # components = # variables
  2. Use (robust) estimators to assess what is typical / normal / baseline
  3. Compare data to baseline to evaluate what is atypical
Clustering / Segmentation
• Parts with similar profiles are grouped. Each group represents a causal structure.
• Methodology
  • ‘Distance’ between the profiles of different parts
  • Max number of clusters = # observations
T2 Multivariate Control Chart
Cluster Analysis
The two models are not entirely compatible. Clustering suggests more than one ‘common’ causal structure.
OVERVIEW OF SCRIPT
Data
Rearrange Data
Principal Components across Locations
Multivariate Control Chart
Color Code Multivariate Control Chart by Cluster
Update Original table
Summarize Maps by Cluster
Cluster Analysis
(Custom) Map
FOUR COMPONENTS SAVED FROM EXAMPLE
T2 PRINCIPAL COMPONENTS ON COVARIANCE
CUSTOM MAPS DEFAULT FILE LOCATIONS
• JMP looks for these files in two locations. One location is shared by all users
on a machine. This location is:
• Windows: C:\Program Files\SAS\JMP\<Version Number>\Maps
• Mac: /Library/Application Support/JMP/<Version Number>/Maps
• The other location is specific for an individual user:
• On Windows: C:\Users\<user name>\AppData\Roaming\SAS\JMP\Maps
• Note: On Windows, in JMP Pro, the “JMP” folder is named “JMPPro”. In JMP Shrinkwrap, the
“JMP” folder is named “JMPSW”.
• On Mac: /Users/<user name>/Library/Application Support/JMP/Maps
EXAMPLE OF DATA TABLE
EXAMPLE OF –NAME.JMP FILE
EXAMPLE OF –XY.JMP FILE
• Column Names
• Shape ID: Identifies each map shape
• Part ID: allows a single shape to have separate sections (even non-contiguous)
• X, Y: The vertices