20111226

Visualization of high dimensional and large data set by RnavGraph and its application of suicide data in Japan
Takafumi Kubota (1), Makoto Tomita (2), Fumio Ishioka (3) and Toshiharu Fujita (1)
(1) The Institute of Statistical Mathematics, (2) Tokyo Medical and Dental University, (3) Okayama University
December 26, 2011

Transcript of 20111226

Page 1: 20111226

Visualization of high dimensional and large data set by RnavGraph and its application of suicide data in Japan

Takafumi Kubota (1), Makoto Tomita (2), Fumio Ishioka (3) and Toshiharu Fujita (1)

(1) The Institute of Statistical Mathematics
(2) Tokyo Medical and Dental University
(3) Okayama University

December 26, 2011

Page 2: 20111226

1 Introduction
  Statistics of Suicide in Japan
  Objective
2 Spatio Clustering of Suicide Data in Japan
  Statistics of Community for the Death from Suicide
  Hierarchical cluster analysis
3 Application of RnavGraph
  Install
  Application of the Suicide data
4 Summary and Future Studies

Page 3: 20111226

Statistics of Suicide in Japan

We briefly introduce statistics of suicide in Japan from three viewpoints:
When?
Where?
Who?
  Sex
  Age group

Age group is shown in red on the slide because it is the focus of this presentation.

Page 4: 20111226

Statistics of Suicide in Japan

When? (Time Series of the Number of Suicides)

White paper of suicide prevention (2011)

Page 5: 20111226

Statistics of Suicide in Japan

The number of suicides increased rapidly from 1997 to 1998
  Burst of the economic bubble (1990-1992)
  Economic recession (1993-1997)
  → Bankruptcy, corporate downsizing, unemployment, ...

In this study, we use the time period 1988-1992, which precedes the rapid increase.

For our future studies, we will use other time periods:
→ (1988-1992), 1993-1997, 1998-2002, 2003-2007, ...
  Individually (purely spatial clustering)
  Simultaneously (spatio-temporal clustering)

Page 6: 20111226

Statistics of Suicide in Japan

Where? (Hotspot and Coolspot)

The results of spatial clustering. The color legend is as follows:
Hotspot
  Most likely cluster
  Second most likely cluster
Coolspot
  Most likely cluster
  Second most likely cluster
Otherwise

Kubota, et al. (2011)

Page 7: 20111226

Statistics of Suicide in Japan

Hotspots and Coolspots for Males in 1988-1992

Page 8: 20111226

Statistics of Suicide in Japan

Who? (Number of Suicides by Sex and Age Group)

White paper of suicide prevention (2011)

Page 9: 20111226

Objective

From the bar chart, we can see that the proportions differ between age groups.
→ Our goal is to find characteristics of the age-grouped spatial data on suicide in Japan.

Page 10: 20111226

Objective

Analysis Procedure

1 Dendrogram; the results of hierarchical clustering

2 Dynamic tree cut

3 Reasoning for each cluster
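
The first two steps of this procedure can be sketched in base R. This is a minimal illustration under stated assumptions, not the authors' script: the rates are synthetic, Ward linkage on Euclidean distances is an assumed choice, and cutree() with a fixed k = 4 stands in for the dynamic tree cut used in the slides (cutreeDynamic() in the dynamicTreeCut package).

```r
# Synthetic stand-in for the real data: 40 areas x 4 age-group rates.
# All numbers are made up for illustration only.
set.seed(1)
centers <- rbind(c(10, 25, 30, 40),   # baseline pattern
                 c(20, 25, 30, 40),   # high rate in the youngest group
                 c(15, 40, 50, 70),   # high rates overall
                 c( 5, 15, 20, 25))   # low rates overall
rates <- centers[rep(1:4, each = 10), ] + matrix(rnorm(160, sd = 1), ncol = 4)
colnames(rates) <- c("age1", "age2", "age3", "age4")

# Step 1: hierarchical clustering; plot(hc) would draw the dendrogram.
# Ward linkage is an assumption, the slides do not name the linkage.
hc <- hclust(dist(rates), method = "ward.D2")

# Step 2: cut the tree. cutree() with a fixed k is a simple substitute for
# the dynamic tree cut, which derives the clusters from the tree's shape.
group <- cutree(hc, k = 4)

# Step 3 can start from a summary such as the mean rate per cluster.
cluster_means <- aggregate(as.data.frame(rates),
                           by = list(group = group), FUN = mean)
print(table(group))
print(cluster_means)
```

The per-cluster means are only a starting point for step 3; the slides do this reasoning visually with RnavGraph.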

Page 11: 20111226

Objective

How do we apply RnavGraph to the results of clustering?

By visualizing the clustering results, we look for features that areas in the same cluster have in common.

Page 12: 20111226

Statistics of Community for the Death from Suicide

Statistics of Community for the Death from Suicide (Fujita, 2009) was updated from the Ministry of Health, Labour and Welfare demographic survey of death

Population Survey Death Report of the Ministry of Health, Labour and Welfare
Time: (73-77, 78-82, 83-87,) 88-92, (93-97, 98-02, 03-07, 08-09)
Place: 354 secondary medical care zones
Sex: Male (, Female)
16 age groups → 4 age groups (weighted average)
  10-29 (10-14, 15-19, 20-24, 25-29)
  30-49 (30-34, 35-39, 40-44, 45-49)
  50-69 (50-54, 55-59, 60-64, 65-69)
  70+ (70-74, 75-79, 80-84, 85+)
(Ways, Marriage and Job)
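
The reduction from 16 to 4 age groups can be sketched as a weighted average. The slides do not say what the weights are; the example below assumes they are the populations of the five-year age groups, and all numbers are invented for illustration.

```r
# Rates for the four five-year groups that make up "10-29" in one
# hypothetical area (illustrative values, not real data).
rate <- c("10-14" = 1.0, "15-19" = 8.0, "20-24" = 16.0, "25-29" = 15.0)

# Assumed weights: population of each five-year group in that area.
pop <- c("10-14" = 5000, "15-19" = 5000, "20-24" = 4000, "25-29" = 6000)

# Weighted average: sum(rate * pop) / sum(pop)
age1 <- weighted.mean(rate, w = pop)
print(age1)
```

Repeating this for the other three blocks of five-year groups would give the four values per area (age1 to age4) that appear in the data file shown later.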

Page 13: 20111226

Hierarchical cluster analysis

Result: 1990, male

The dendrogram suggests that there are four groups.

Page 14: 20111226

Hierarchical cluster analysis

Choropleth map

4 clusters cut by dynamicTreeCut

Langfelder, et al.

Page 15: 20111226

Install

What is RnavGraph?

RnavGraph provides interactive visualization tools for exploring high dimensional space through lower dimensional trajectories, based on the concepts first presented in Hurley and Oldford (2011).

Page 16: 20111226

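Install

Environment
  Windows 7 (64 bit)
  R 2.14.0 (execute as Administrator(?))

# CRAN packages used by RnavGraph
install.packages(c("PairViz", "scagnostics",
                   "rgl", "grid", "MASS", "RGtk2", "hexbin", "vegan"),
                 dependencies = TRUE)
# Bioconductor packages. biocLite was the installer at the time of R 2.14.0;
# current Bioconductor releases use BiocManager::install() instead.
source("http://www.bioconductor.org/biocLite.R")
biocLite("graph")
biocLite("RBGL")
biocLite("RDRToolbox")
# RnavGraph itself and its example image data
install.packages("RnavGraph")
install.packages("RnavGraphImageData")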

Page 17: 20111226

Install

Hello RnavGraph World!

library(RnavGraph)
# Wrap the four iris measurements in a data object for RnavGraph
ng.iris <- ng_data(name = "iris", data = iris[,1:4],
                   shortnames = c('s.L', 's.W', 'p.L', 'p.W'),
                   group = iris$Species,
                   labels = substr(iris$Species,1,2))
# Open the interactive navGraph session
navGraph(ng.iris)

Page 18: 20111226

Application of the Suicide data

Data

suigm90_int.csv

    secid   age1   age2   age3   age4  group
1     101  11.91  28.50  32.98  50.45      1
2     102   9.70  40.21  46.79  36.73      2
3     103  18.93  27.49  34.52  49.23      1
...

Page 19: 20111226

Application of the Suicide data

Application of Suicide Data

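library(RnavGraph)
# Read the age-grouped data for males (file shown on the previous slide)
sui.m90c <- read.csv("suigm90_int.csv")
# Columns 2 to 5 hold the four age-group values (age1 to age4)
ng.suim90c <- ng_data(name = "SuicideMale90",
                      data = sui.m90c[,2:5])
# Column 6 (group) is used to group the points
ng_set(ng.suim90c, "group") <- sui.m90c[,6]
# Open the interactive navGraph session
navGraph(ng.suim90c)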

Page 20: 20111226

Application of the Suicide data

Output of navGraph

Page 21: 20111226

Application of the Suicide data

Application of Suicide Data (scagNav)

# Rebuild the data object with short variable names and the group column
ng.sui <- ng_data(name = "suicide",
                  data = sui.m90c[,2:5],
                  shortnames = c("a1","a2","a3","a4"),
                  group = sui.m90c[,6])
# Build the navigation graphs from scagnostics measures
nav.sui <- scagNav(data = ng.sui,
                   scags = c("Monotonic", "NotMonotonic", "Clumpy",
                             "NotClumpy", "Convex", "NotConvex",
                             "Stringy", "NotStringy", "Skinny",
                             "NotSkinny", "Outlying", "NotOutlying",
                             "Sparse", "NotSparse", "Striated",
                             "NotStriated", "Skewed", "NotSkewed"),
                   topFrac = 0.2, combineFn = max,
                   glyphs = shortnames(ng.sui), sep = ':')

Page 22: 20111226

Application of the Suicide data

Outputs of scagNav

Page 23: 20111226

Summary and Future Studies

Reasoning
  group 3 (purple): High rate → Large square
  group 4 (orange): Low rate → Small square
  group 2 (green): High rate of age 1 (10-29) → Long right hand
  group 1 (blue): Others

For our future studies, we will use other time periods:
→ (1988-1992), 1993-1997, 1998-2002, 2003-2007, ...
  Individually (purely spatial clustering)
  Simultaneously (spatio-temporal clustering)
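
The shapes named in the reasoning ("large square", "small square", "long right hand") appear to describe glyphs drawn from the four age-group rates. As a rough analogue only, the sketch below draws star glyphs with base R's stars() rather than RnavGraph; the cluster means are hypothetical values chosen to mimic the four patterns, and the column scaling is an assumption.

```r
# Hypothetical mean rates per cluster and age group. The values are
# invented to mimic the four patterns listed above; they are not results.
means <- rbind(group1 = c(12, 28, 33, 45),
               group2 = c(25, 28, 33, 45),
               group3 = c(20, 45, 55, 75),
               group4 = c( 6, 15, 20, 25))
colnames(means) <- c("age1", "age2", "age3", "age4")

# Scale each age group by its largest cluster mean, so 1 means
# "highest of the four clusters" on that axis.
scaled <- sweep(means, 2, apply(means, 2, max), "/")

# One star glyph per cluster: a uniformly high cluster gives a large
# four-sided shape, a uniformly low one a small shape, and a cluster
# that is high only in age1 gives one long arm.
pdf(NULL)   # null device for non-interactive runs
stars(scaled, scale = FALSE, main = "Cluster glyphs (illustrative)")
invisible(dev.off())   # omit pdf(NULL) and dev.off() in an interactive session

print(round(scaled, 2))
```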

Page 24: 20111226

REFERENCES (1)

Fujita, T. (2009). Statistics of Community for the Death from Suicide. National Institute of Mental Health, National Center of Neurology and Psychiatry, Japan.

Hurley, C. and Oldford, R.W. (2011). Graphs as navigational infrastructure for high dimensional data spaces. Computational Statistics, to appear.

Kubota, T., Tomita, M., Ishioka, F. and Fujita, T. (2011). Spatial Autocorrelation Statistics and Spatial Clustering in the Areas in Japan with Low Suicide Rates, Joint2011, pp. ???

Waddell, A. and Oldford, W. (2011). RnavGraph: an R package to visualize high dimensional data using graphs as navigational infrastructure. http://cran.r-project.org/web/packages/RnavGraph/vignettes/RnavGraph.pdf (Dec. 26, 2011)

White paper of suicide prevention (2011). Cabinet Office (in Japanese). http://www8.cao.go.jp/jisatsutaisaku/whitepaper/index-w.html (Dec. 17, 2011)

Page 25: 20111226

REFERENCES (2)

Langfelder, P., Zhang, B. and Horvath, S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut library for R. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/ (Dec. 26, 2011)

Page 26: 20111226

Q & A

Thank you very much for your kind attention.

Takafumi Kubota (The Institute of Statistical Mathematics)
