Data Transformations
For some data sets, it may be necessary to transform variables, e.g. to change units (lb to kg, ˚C to ˚F, etc.).
• This is simply a change of scale, and such transformations are called ‘linear’.
• Linear transformations consist of (1) multiplying all the observations by a constant, (2) adding a constant to all observations, or (3) both.
Multiplicative transformation example
– Y = weight in kg
– Y’ = weight in lb
– Y’ = 2.2Y
Additive transformation example
– Measurements of nitrate (mg/l) → Y
  • Y = 0.3, 0.35, 0.5, 0.42, 0.38, 0.56…
– Add 1 to each number → Y’
  • Y’ = 1.3, 1.35, 1.5, 1.42, 1.38, 1.56…
Additive and multiplicative example
– Body temperature measurements in ˚C (Y) were taken for 47 women; if we convert to ˚F (Y’):
  • Y’ = 1.8Y + 32
Multiplicative transformations affect S in the same way that they affect the mean:
– e.g., if Y’ = 2.2Y, then mean Y’ = 2.2 × mean Y
– and S_Y’ = 2.2 S_Y
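The effect of a linear transformation on the mean and on S can be checked numerically. The sketch below is only an illustration: the Celsius values are hypothetical (they are not the 47 measurements mentioned above), and `linear_transform` is a helper name introduced here.

```python
import math
import statistics

# Hypothetical body temperatures in degrees Celsius (illustrative values only).
celsius = [36.4, 36.7, 36.9, 37.0, 37.2, 37.5]


def linear_transform(values, multiplier=1.0, shift=0.0):
    """Apply Y' = multiplier * Y + shift to every observation."""
    return [multiplier * y + shift for y in values]


# Celsius to Fahrenheit: Y' = 1.8Y + 32
fahrenheit = linear_transform(celsius, multiplier=1.8, shift=32.0)

mean_c, sd_c = statistics.mean(celsius), statistics.stdev(celsius)
mean_f, sd_f = statistics.mean(fahrenheit), statistics.stdev(fahrenheit)

# The mean follows the whole transformation (multiply, then add) ...
print("mean follows 1.8*mean + 32:", math.isclose(mean_f, 1.8 * mean_c + 32.0))
# ... while S is affected only by the multiplier.
print("S follows 1.8*S:", math.isclose(sd_f, 1.8 * sd_c))
```

If the multiplier were negative, S would be multiplied by its absolute value, since a standard deviation cannot be negative.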
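Additive transformations, however, don’t affect S, because every deviation from the mean stays the same:

| | Original observations | Deviations | Transformed observations | Deviations |
|---|---|---|---|---|
| | 0.36 | -0.03 | 1.36 | -0.03 |
| | 0.40 | 0.01 | 1.40 | 0.01 |
| | 0.42 | 0.03 | 1.42 | 0.03 |
| | 0.38 | -0.01 | 1.38 | -0.01 |
| Mean | 0.39 | | 1.39 | |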
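As a quick check, the four observations in the table can be run through a few lines of Python. This is a minimal sketch; the helper name `deviations` is introduced here for illustration, and deviations are rounded to two decimals to hide floating-point noise.

```python
import math
import statistics

original = [0.36, 0.40, 0.42, 0.38]
shifted = [y + 1 for y in original]  # additive transformation: Y' = Y + 1


def deviations(values):
    """Deviation of each observation from the sample mean, rounded to 2 decimals."""
    mean = statistics.mean(values)
    return [round(y - mean, 2) for y in values]


dev_original = deviations(original)
dev_shifted = deviations(shifted)

print("deviations (original):   ", dev_original)
print("deviations (transformed):", dev_shifted)
print("S unchanged:", math.isclose(statistics.stdev(original),
                                   statistics.stdev(shifted)))
```

Because S is computed only from the deviations, identical deviations imply an identical S.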
Additive transformations thus effectively move probability distributions to the left or the right – but the shape of the histogram is unchanged.
Multiplicative transformations shrink or stretch the probability distribution
Nonlinear Transformations
These sorts of transformations affect data in more complex ways.
Examples:
– $Y' = \sqrt{Y}$
– $Y' = \log(Y)$
– $Y' = 1/Y$
– $Y' = Y^2$
These transformations do change the essential shape of frequency distributions.
They are thus used to try to make distributions more symmetric, i.e., they are tools to achieve normality.
Transformations to achieve normality
If the distribution is skewed to the right (the most common problem) then each of the following transformations will help produce a more symmetric distribution.
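The transformations are listed in order of how much they will pull in a right-skewed distribution:
– $\sqrt{Y}$
– $\ln(Y)$
– $\log_{10}(Y)$
– $1/\sqrt{Y}$
– $1/Y$
(ln and log10 differ only by a constant factor, so they change the shape of the distribution identically.)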
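The ordering can be illustrated by computing a skewness coefficient after each transformation. The sketch below rests on several assumptions that are not in the slides: the sample is hypothetical, skewness is measured with the moment coefficient m3 / m2^1.5, and the two reciprocal transformations are negated (-1/√Y, -1/Y) so that large observations stay large and the skewness values remain comparable.

```python
import math


def skewness(values):
    """Moment coefficient of skewness (population form): m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample of strictly positive values.
sample = [1, 2, 2, 3, 3, 4, 5, 8, 13, 30]

# Ladder of transformations, weakest to strongest.
# The reciprocals are negated so the ordering of the observations is preserved.
ladder = {
    "Y": lambda y: y,
    "sqrt(Y)": math.sqrt,
    "ln(Y)": math.log,
    "-1/sqrt(Y)": lambda y: -1 / math.sqrt(y),
    "-1/Y": lambda y: -1 / y,
}

skews = {name: skewness([f(y) for y in sample]) for name, f in ladder.items()}
for name, value in skews.items():
    print(f"{name:>11}: skewness = {value:+.2f}")
```

On this sample the skewness falls at every step, and the strongest transformation overshoots into left skew, which is why the mildest transformation that gives a roughly symmetric distribution is usually the one to choose.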
Percentage or proportion data are a special case: they often appear binomially distributed
– e.g., 0-100%, 0-1
Here the appropriate transformation is:
$Y' = \arcsin\sqrt{Y}$, with Y expressed as a proportion (0-1)
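A minimal sketch of this transformation, assuming the intended formula is the arcsine square-root, Y' = arcsin(√Y), with the result in radians. The percentages are hypothetical and `arcsine_sqrt` is a helper name introduced here.

```python
import math


def arcsine_sqrt(proportion):
    """Arcsine square-root transformation of a proportion in [0, 1] (radians)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("expected a proportion between 0 and 1")
    return math.asin(math.sqrt(proportion))


# Hypothetical percentages; convert to proportions before transforming.
percentages = [5, 25, 50, 75, 95]
transformed = [arcsine_sqrt(p / 100) for p in percentages]

for p, t in zip(percentages, transformed):
    print(f"{p:>3}% -> {t:.3f} rad")
```

Percentages must be divided by 100 first, because the square root of a value above 1 falls outside the domain of arcsin.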
Results
Tables and figures - must have a purpose
Results: Tables
When to use:
– Present numerical values
– Large amounts of information
Rules
– Numbered consecutively
– Must be able to stand alone
– Vertical arrangement
– Title goes above the table
– Definitions/‘explanations’ go below the table
“Bad Table”
Table 6. Growth rate of cell cultures and activity of ornithine decarboxylase (ODC) and succinate dehydrogenase (SDH) in Pseudomonas aeruginosa in response to various carbon sources

| Carbon source: | Glucose | Sucrose | Mannitol |
|---|---|---|---|
| Growth rate (generations/h) | 0.93 | 0.21 | 0.47 |
| Activity of ODC (mol CO2/h) | 12.6 | 6.9 | 1.5 |
| Activity of SDH (mmol fumarate/h) | 137.7 | 19.3 | 50.9 |
“Good Table”
Table 7. Growth rate of cell cultures and activity of ornithine decarboxylase (ODC) and succinate dehydrogenase (SDH) in Pseudomonas aeruginosa in response to various carbon sources

| Carbon source | Growth rate (generations/h) | Enzyme activity: ODC (mol CO2/h) | Enzyme activity: SDH (mmol fumarate/h) |
|---|---|---|---|
| Glucose | 0.93 | 12.6 | 137.7 |
| Sucrose | 0.21 | 6.9 | 19.3 |
| Mannitol | 0.47 | 1.5 | 50.9 |
Table 4. Response of male fighting fish (Betta splendens) to their image in a mirrorᵃ
ᵃ Prior to the experiment, fish had been visually isolated from one another for 2 wk. Observation period for each fish was 30 s.
Results: Figures
Use to illustrate important points and summarize your data
Number graphs consecutively
– separately from tables
Must be able to stand alone
Titles go below the figure or on a separate “Figure Legends” page
Know when to use specific types of graphs
– Bar graph vs histogram
– Scatter plot vs line graph
Bar graph (refer to page 57)
[Figure: bar graph. Y-axis: mean # of flowers/plant (0 to 25); x-axis: species (C. rap., E. ang., H. aur.)]
Problems?
Bar graph (refer to page 57)
[Figure: bar graph. Y-axis: mean # of flowers/plant (0 to 25); x-axis: species (C. rap., E. ang., H. aur.); legend: cleared quadrat, control quadrat]
[Figure: histogram. Y-axis: seed frequency (0 to 25); x-axis: distance from parent plant (cm), with bins labelled 0 to 2, 4 to 6, 8 to 10, 12 to 14, 16 to 18, 20 to 22]
Results: Graphs
Do not forget to include error bars
– Is your data significant?
– Are there differences?
Complete figure legend
[Figure: bar graph. Y-axis: mean # of flowers/plant (0 to 25); x-axis: species (C. rap., E. ang., H. aur.); legend: cleared quadrat, control quadrat]
Figure 2. Production of flowers by three species of plants in the absence of interspecific competition and under natural conditions
[Figure: bar graph. Y-axis: mean # of flowers/plant (0 to 25); x-axis: species (C. rap., E. ang., H. aur.); legend: cleared quadrat, control quadrat]
Figure 2. Production of flowers by three species of plants in the absence of interspecific competition (cleared quadrats) and under natural conditions (control quadrats). The plants were Campanula rapunculoides, Epilobium angustifolium, and Hieracium aurantiacum. Plotted are means for eight randomly chosen quadrats, each 1 m × 1 m.
Text
– Data summary
– Do not discuss or draw conclusions
Statistics
– Incorporate statistics into the verbal text
– Be careful when using the word “significant”
– Refer to appropriate tables and figures
• When do you use “Figure” and when do you use “Fig.”?
As shown in Figure 1, the shoreline of Hicks Pond was generally predominated by grasses and sedges.
Observed frequencies of turtles obtaining food differed significantly from expected frequencies (χ² = 58.19, df = 8, P < 0.001; Fig. 2).
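Before writing a statistic into the text, it is worth confirming that the reported P value matches the test statistic and degrees of freedom. The sketch below does this for the chi-square example above, using the closed-form upper-tail probability that exists when df is even; the function name `chi2_sf_even_df` is an assumption introduced here, and a statistics package would normally do this job.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)**i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)**0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(58.19, 8)
print(f"P = {p_value:.2e}")
print("consistent with P < 0.001:", p_value < 0.001)
```

The computed probability is far below 0.001, so reporting it as P < 0.001 is consistent with the stated χ² and df.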