10 Tips for Better Visualization of Scientific Data
1
10 Tips for Better Visualization of Scientific Data
Sercan Taha Ahi ([email protected])
Yamaguchi Laboratory @ Tokyo Institute of Technology
2012/7/26
2
Before starting
Each plot, each figure, and each drawing is there to communicate a scientifically interesting idea to a scientific community.
They are not intended to be laundry lists of experimental outcomes.
Please do not forget the purpose, and do not forget the audience.
3
1. Get rid of “empty” dimensions.
[Figure: a pie chart vs. a better pie chart, with slices of 45%, 36%, 9%, 5%, and 4%]
[Figure: a bar graph vs. a better bar graph]
4
2. Maximize data-ink ratio.
[Figure: a bar graph vs. a better bar graph]
An optimized data-ink ratio (1) is eco-friendly, (2) provides better visibility, even in greyscale prints, and (3) communicates ideas more efficiently.
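One way to raise the data-ink ratio in Python is sketched below. It assumes matplotlib is installed; the bar heights and the helper name `plot_lean_bars` are hypothetical and not taken from the slides. The sketch only removes ink that carries no data (grid, background shading, the box around the plot, tick marks).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical values, for illustration only.
categories = ["1", "2", "3", "4"]
values = [35, 80, 55, 20]

def plot_lean_bars(ax, labels, heights):
    """Draw a bar graph that spends little ink on non-data elements."""
    ax.bar(labels, heights, color="0.3")   # one flat grey, readable in greyscale
    ax.grid(False)                         # no grid lines
    ax.set_facecolor("white")              # no shaded background
    for side in ("top", "right"):
        ax.spines[side].set_visible(False) # drop the box around the plot
    ax.tick_params(length=0)               # remove tick marks, keep tick labels
    return ax

fig, ax = plt.subplots(figsize=(4, 3))
plot_lean_bars(ax, categories, values)
```

Which elements count as non-data ink is a judgment call; a light grid may still be worth keeping when readers need to read values off the plot.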
5
3. Show the entire scale.
Is this a significant drop?
[Figure: a line plot with the y-axis truncated to 90.5-93.5 vs. a better line plot with the y-axis running up to 100; x-axis from 100 to 500 in both]
6
3. Show the entire scale.
This one is better
A group of line plots
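The effect of a truncated axis can be reproduced with a minimal sketch, assuming matplotlib is installed. The accuracy values below are hypothetical; only the idea of fixing the y-axis to the full 0-100 range comes from the slide.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical accuracy values (%), for illustration only.
n_samples = [100, 200, 300, 400, 500]
accuracy = [93.2, 92.6, 91.9, 91.3, 90.8]

fig, (ax_zoomed, ax_full) = plt.subplots(1, 2, figsize=(8, 3))

ax_zoomed.plot(n_samples, accuracy)  # default limits hug the data, so the drop looks large
ax_full.plot(n_samples, accuracy)
ax_full.set_ylim(0, 100)             # show the entire percentage scale

# The drop in percentage points, independent of how the axis is drawn.
drop_in_points = accuracy[0] - accuracy[-1]
```

On the full scale the same drop of a few percentage points appears as a nearly flat line.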
7
4. State the axis labels, units, and title.
[Figure: a line plot without labels vs. a better line plot titled "Classification accuracy", with x-axis "Number of training samples" and y-axis "Accuracy (%)"]
For arbitrary units, use (a.u.)
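A minimal sketch of a labeled plot, assuming matplotlib is installed. The title and axis labels are the ones shown on the slide; the data points are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only.
n_samples = [100, 200, 300, 400, 500]
accuracy = [62, 74, 81, 86, 89]

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(n_samples, accuracy, marker="o")
ax.set_title("Classification accuracy")
ax.set_xlabel("Number of training samples")
ax.set_ylabel("Accuracy (%)")  # quantity plus unit; use "(a.u.)" for arbitrary units
ax.set_ylim(0, 100)
```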
8
5. Set the aspect ratio appropriately.
[Figure: a 2D plot vs. a better 2D plot, both titled "2-dim representation of the data by PCA", with axes PC#1 and PC#2 and clusters labeled A, B, and C]
Although the ranges of the x and y coordinates of the data samples are unequal, the left figure has x and y axes of equal length, which might mislead the viewer into believing that the distance between clusters A and B is equal to the distance between clusters A and C.
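The distortion described above can be avoided by giving one data unit the same length on both axes. The sketch below assumes matplotlib is installed and uses hypothetical cluster centers, not the data from the slide.

```python
import math
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical cluster centers in the (PC#1, PC#2) plane.
centers = {"A": (-3.0, 0.0), "B": (-3.0, 1.0), "C": (5.0, 0.0)}

fig, ax = plt.subplots()
for name, (x, y) in centers.items():
    ax.scatter(x, y)
    ax.annotate(name, (x, y))
ax.set_xlabel("PC#1")
ax.set_ylabel("PC#2")
ax.set_aspect("equal")  # one data unit has the same length on both axes

# True distances in data units, which the drawing should now reflect.
dist_ab = math.dist(centers["A"], centers["B"])
dist_ac = math.dist(centers["A"], centers["C"])
```

With an equal aspect ratio, A and C are drawn eight times farther apart than A and B, which matches the distances computed from the coordinates.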
9
6. Indicate and label uncertainty.
There is uncertainty in every experiment and in every measurement. Depending on the deviation in the data, the conclusions that you draw might be drastically different. Therefore, you should always show the uncertainty in the data.
[Figure: two mean plots, one of them with confidence intervals, and a curve labeled "One data sample"; x-axis "Wavelength (nm)" from 400 to 700, y-axis "Normalized absorption coefficients (a.u.)"]
(The given confidence intervals correspond to 1 standard deviation.)
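The band drawn around a mean curve can be computed as sketched below, using only the Python standard library. The measurements and the helper name `mean_with_band` are hypothetical; the choice of 1 standard deviation follows the note on the slide.

```python
from statistics import mean, stdev

# Hypothetical repeated measurements (a.u.) at each wavelength (nm).
measurements = {
    400: [0.10, 0.12, 0.14],
    500: [0.30, 0.35, 0.40],
    600: [0.50, 0.52, 0.54],
}

def mean_with_band(samples, n_std=1.0):
    """Return (mean, lower, upper), where the band is mean +/- n_std standard deviations."""
    m = mean(samples)
    s = stdev(samples)  # sample standard deviation (n - 1 in the denominator)
    return m, m - n_std * s, m + n_std * s

for wavelength, samples in sorted(measurements.items()):
    m, lower, upper = mean_with_band(samples)
    print(f"{wavelength} nm: mean {m:.2f}, 1 std band {lower:.2f} to {upper:.2f}")
```

If matplotlib is used for plotting, the lower and upper values can be passed to `Axes.fill_between`, or the deviations to `Axes.errorbar`. Whatever the band represents (1 std, standard error, a confidence level), say so in the caption.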
10
7. Do not use bitmap graphics; prefer EPS or PDF when possible.
Bitmap graphics do not scale well. When graphics do not scale well, your study looks amateurish.
Use vector graphics instead. If you have no access to proprietary tools, then please create high-resolution bitmap images, or better, consider using free programming languages such as R and Python for plots, and free graphics tools such as Inkscape and GIMP for drawings.
[Figure: a line plot vs. a better line plot titled "Classification accuracy"; x-axis "Number of training samples", y-axis "Accuracy (%)"]
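As a minimal sketch, assuming matplotlib is installed, the same figure can be saved as a vector PDF and, as a fallback, as a high-resolution bitmap. The data, file names, and the 300 dpi value are illustrative choices, not recommendations from the slides.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([100, 200, 300, 400, 500], [62, 74, 81, 86, 89])  # hypothetical data
ax.set_xlabel("Number of training samples")
ax.set_ylabel("Accuracy (%)")

out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, "accuracy.pdf")
png_path = os.path.join(out_dir, "accuracy.png")

fig.savefig(pdf_path)           # vector output; the format is inferred from the extension
fig.savefig(png_path, dpi=300)  # bitmap fallback at high resolution
```

The PDF scales to any size without pixelation, whereas the PNG is only as sharp as its pixel count allows.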
11
8. Set the precision of the real numbers appropriately.
[Figure: a line plot vs. a better line plot. Tick labels include unrounded values such as 612.272794 and 0.647047000000001; the better plot labels its y-axis 0.0, 0.2, ..., 1.8]
Ask yourself: What is the minimum precision (number of decimal places) needed to convey my idea?
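A short sketch of the idea using only the Python standard library. The raw values are y-axis tick labels that appear in the original figure; the helper name `format_ticks` and the choice of one decimal place are assumptions made for illustration.

```python
# Unrounded tick values as they appear in the original figure.
raw_ticks = [0, 0.3235235, 0.647047000000001, 0.970570500000001, 1.294094, 1.6176175]

def format_ticks(values, decimals):
    """Format each value with a fixed number of decimal places."""
    return [f"{v:.{decimals}f}" for v in values]

labels = format_ticks(raw_ticks, 1)
print(labels)  # ['0.0', '0.3', '0.6', '1.0', '1.3', '1.6']

# Alternatively, place the ticks at round positions in the first place.
round_ticks = [round(i * 0.2, 1) for i in range(10)]
print(format_ticks(round_ticks, 1))  # ['0.0', '0.2', ..., '1.8']
```

Rounding the labels alone leaves the ticks at uneven values, so choosing round tick positions is usually the cleaner option.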
12
9. Choose colors carefully.
[Figure: a bar graph vs. a better bar graph comparing "Our method" with "Their method1" through "Their method4"]
When you want to compare your method with a number of well-established approaches on a graph, pick an
easily distinguishable color for your results. Do not make the listeners or readers search for it.
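One simple way to do this is to give the highlighted series a saturated color and every other series the same neutral grey. The sketch below uses only the Python standard library; the two hex colors and the helper name `bar_colors` are arbitrary assumptions, and the method names are the legend entries from the figure.

```python
HIGHLIGHT = "#D55E00"  # a saturated vermillion; an arbitrary choice
NEUTRAL = "#999999"    # mid grey for everything else

methods = ["Our method", "Their method1", "Their method2", "Their method3", "Their method4"]

def bar_colors(labels, highlighted):
    """Return one color per label, with only the highlighted label standing out."""
    return [HIGHLIGHT if label == highlighted else NEUTRAL for label in labels]

colors = bar_colors(methods, "Our method")
```

If matplotlib is used for plotting, the resulting list can be passed as the `color` argument of `Axes.bar`.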
13
9. Choose colors carefully.
[Figure: a scatter plot vs. a better scatter plot]
Be gentle, and do not forget color-blind viewers.
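A possible starting point, sketched with the Python standard library only: pair each class with a color from a palette designed for color-blind viewers and with a distinct marker, so that classes remain distinguishable even without color. The Okabe-Ito palette used here is one such palette and is an assumption of this sketch, not something named on the slides; the helper name and class names are hypothetical.

```python
from itertools import cycle

# Okabe-Ito palette, designed to stay distinguishable under common color-vision deficiencies.
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
             "#0072B2", "#D55E00", "#CC79A7", "#000000"]
MARKERS = ["o", "s", "^", "D", "v", "x"]  # matplotlib-style marker codes

def style_for_classes(class_names):
    """Assign each class a (color, marker) pair, cycling if there are many classes."""
    styles = {}
    for name, color, marker in zip(class_names, cycle(OKABE_ITO), cycle(MARKERS)):
        styles[name] = {"color": color, "marker": marker}
    return styles

styles = style_for_classes(["class 1", "class 2", "class 3"])
```

Encoding the class twice, by color and by marker shape, also keeps the plot readable in greyscale prints.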
14
10. Put your data into a context.
A plot that depicts the total snowfall in Boston for the winter of 2010-2011
Inches?? Can you quickly imagine how high 80.1 inches is?
Source: http://www.boston.com/news/weather/graphics/2011_snowfall/
15
10. Put your data into a context.
A better plot that depicts the total snowfall in Boston for the winter of 2010-2011
Now you can, right?
Source: http://www.boston.com/news/weather/graphics/2011_snowfall/
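Putting a number into context can be as simple as converting it to a familiar unit and comparing it with familiar references. The sketch below uses only the Python standard library; the 80.1-inch figure is from the slide, while the reference heights and the helper name `describe` are hypothetical.

```python
CM_PER_INCH = 2.54  # exact by definition

snowfall_in = 80.1  # total snowfall quoted on the slide

# Hypothetical reference heights (cm), for illustration only.
references_cm = {"a 170 cm adult": 170.0, "a 200 cm doorway": 200.0}

def describe(value_in, references):
    """Express a length in cm and as a multiple of each reference height."""
    value_cm = value_in * CM_PER_INCH
    lines = [f"{value_in} in = {value_cm:.0f} cm"]
    for name, height_cm in references.items():
        lines.append(f"{value_cm / height_cm:.2f} x the height of {name}")
    return lines

for line in describe(snowfall_in, references_cm):
    print(line)
```

In a figure, the same comparison is usually made visually, for example by drawing the reference object next to the data at the same scale.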
16
THANK YOU.
I would also like to thank Dr. Mehmet Cagatay Tarhan for his valuable comments.