Package ‘datasets’

September 14, 2016

Version 3.3.1

Priority base

Title The R Datasets Package

Author R Core Team and contributors worldwide

Maintainer R Core Team <[email protected]>

Description Base R datasets.

License Part of R 3.3.1

R topics documented:

datasets-package
ability.cov
airmiles
AirPassengers
airquality
anscombe
attenu
attitude
austres
beavers
BJsales
BOD
cars
ChickWeight
chickwts
CO2
co2
crimtab
discoveries
DNase
esoph
euro
eurodist
EuStockMarkets
faithful
Formaldehyde
freeny
HairEyeColor
Harman23.cor
Harman74.cor
Indometh
infert
InsectSprays
iris
islands
JohnsonJohnson
LakeHuron
lh
LifeCycleSavings
Loblolly
longley
lynx
morley
mtcars
nhtemp
Nile
nottem
npk
occupationalStatus
Orange
OrchardSprays
PlantGrowth
precip
presidents
pressure
Puromycin
quakes
randu
rivers
rock
sleep
stackloss
state
sunspot.month
sunspot.year
sunspots
swiss
Theoph
Titanic
ToothGrowth
treering
trees
UCBAdmissions
UKDriverDeaths
UKgas
UKLungDeaths
USAccDeaths
USArrests
USJudgeRatings
USPersonalExpenditure
uspop
VADeaths
volcano
warpbreaks
women
WorldPhones
WWWusage
Index

datasets-package The R Datasets Package

Description

Base R datasets.

Details

This package contains a variety of datasets. For a complete list, use library(help = "datasets").
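As a complement to library(help = "datasets"), the following minimal sketch collects the dataset names programmatically. It assumes the standard return value of data(package = ), whose results matrix has an "Item" column; the variable name items is only illustrative.

```r
# Collect the names of all datasets shipped with the 'datasets' package.
# data(package = ) returns an object whose $results matrix has an "Item" column.
items <- data(package = "datasets")$results[, "Item"]
head(items)
length(items)
```

Some entries take the form "beaver1 (beavers)", which indicates that the object beaver1 is documented under the topic beavers.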

Author(s)

R Core Team and contributors worldwide

Maintainer: R Core Team <[email protected]>

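ability.cov Ability and Intelligence Tests

Description

Six tests were given to 112 individuals. The covariance matrix is given in this object.

Usage

ability.cov

Details

The tests are as follows:

general: a non-verbal measure of general intelligence using Cattell's culture-fair test.
picture: a picture-completion test.
blocks: block design.
maze: mazes.
reading: reading comprehension.
vocab: vocabulary.

Bartholomew gives both covariance and correlation matrices, but these are inconsistent. Neither is in the original paper.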
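Because only the covariance matrix is stored, a correlation matrix has to be derived from it. The sketch below does so with cov2cor() from the stats package; the name ability.cor is an illustrative choice, not part of the package.

```r
# ability.cov is a list with components cov, center and n.obs
str(ability.cov)

# Derive the correlation matrix from the stored covariance matrix
ability.cor <- cov2cor(ability.cov$cov)
round(ability.cor, 2)
```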

Source

Bartholomew, D. J. (1987) Latent Variable Analysis and Factor Analysis. Griffin.

Bartholomew, D. J. and Knott, M. (1990) Latent Variable Analysis and Factor Analysis. Second Edition, Arnold.

References

Smith, G. A. and Stanley G. (1983) Clocking g: relating intelligence and measures of timed performance. Intelligence, 7, 353–368.

Examples

require(stats)
(ability.FA <- factanal(factors = 1, covmat = ability.cov))
update(ability.FA, factors = 2)
## The signs of factors and hence the signs of correlations are
## arbitrary with promax rotation.
update(ability.FA, factors = 2, rotation = "promax")

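airmiles Passenger Miles on Commercial US Airlines

Description

The revenue passenger miles flown by commercial airlines in the United States for each year from 1937 to 1960.

Usage

airmiles

Format

A time series of 24 observations; yearly, 1937–1960.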

Source

F.A.A. Statistical Handbook of Aviation.

References

Brown, R. G. (1963) Smoothing, Forecasting and Prediction of Discrete Time Series. Prentice-Hall.

Examples

require(graphics)
plot(airmiles, main = "airmiles data",
     xlab = "Passenger-miles flown by U.S. commercial airlines", col = 4)

AirPassengers Airline Passenger Data

Description

The classic Box & Jenkins airline data. Monthly totals of international airline passengers, 1949 to 1960.

Usage

AirPassengers

Format

A monthly time series, in thousands.
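As a quick check of this format, the following sketch inspects the time-series attributes and sums the monthly counts by year. The variable name yearly is only illustrative.

```r
# Start, end and frequency of the monthly series
tsp(AirPassengers)

# Yearly totals (still in thousands of passengers)
yearly <- aggregate(AirPassengers, nfrequency = 1, FUN = sum)
yearly
```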

Source

Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1976) Time Series Analysis, Forecasting and Control. Third Edition. Holden-Day. Series G.

Examples

## Not run:
## These are quite slow and so not run by example(AirPassengers)

## The classic 'airline model', by full ML
(fit <- arima(log10(AirPassengers), c(0, 1, 1),
              seasonal = list(order = c(0, 1, 1), period = 12)))
update(fit, method = "CSS")
update(fit, x = window(log10(AirPassengers), start = 1954))
pred <- predict(fit, n.ahead = 24)
tl <- pred$pred - 1.96 * pred$se
tu <- pred$pred + 1.96 * pred$se
ts.plot(AirPassengers, 10^tl, 10^tu, log = "y", lty = c(1, 2, 2))

## The full ML fit is the same if the series is reversed; the CSS fit is not
ap0 <- rev(log10(AirPassengers))
attributes(ap0) <- attributes(AirPassengers)
arima(ap0, c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
arima(ap0, c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12),
      method = "CSS")

## Structural time series
ap <- log10(AirPassengers) - 2
(fit <- StructTS(ap, type = "BSM"))
par(mfrow = c(1, 2))
plot(cbind(ap, fitted(fit)), plot.type = "single")
plot(cbind(ap, tsSmooth(fit)), plot.type = "single")

## End(Not run)

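airquality airquality

Description

Daily air quality measurements in New York, May to September 1973.

Usage

airquality

Format

A data frame with 153 observations on 6 variables.

[,1] Ozone   numeric Ozone (ppb)
[,2] Solar.R numeric Solar radiation (lang)
[,3] Wind    numeric Wind (mph)
[,4] Temp    numeric Temperature (degrees F)
[,5] Month   numeric Month (1–12)
[,6] Day     numeric Day of month (1–31)

Details

Daily readings of the following air quality values for May 1, 1973 (a Tuesday) to September 30, 1973.

• Ozone: Mean ozone in parts per billion (ppb) from 1300 to 1500 hours at Roosevelt Island

• Solar.R: Solar radiation in Langleys in the frequency band 4000–7700 Angstroms from 0800 to 1200 hours at Central Park

• Wind: Average wind speed in miles per hour at 0700 and 1000 hours at LaGuardia Airport

• Temp: Maximum daily temperature in degrees Fahrenheit at LaGuardia Airport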
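Several of these readings are missing on some days, so it can help to inspect the dimensions and the missing-value counts before modelling. The sketch below is one way to do that; monthly is a hypothetical helper name, and the formula interface of aggregate() drops rows with missing Ozone or Temp by default.

```r
# Dimensions of the data frame and number of missing values per column
dim(airquality)
colSums(is.na(airquality))

# Monthly means of ozone and temperature over complete rows only
monthly <- aggregate(cbind(Ozone, Temp) ~ Month, data = airquality, FUN = mean)
monthly
```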

Source

The data were obtained from the New York State Department of Conservation (ozone data) and the National Weather Service (meteorological data).

References

Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983) Graphical Methods for Data Analysis. Belmont, CA: Wadsworth.

Examples

require(graphics)
pairs(airquality, panel = panel.smooth, main = "airquality data")

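anscombe Anscombe's Quartet of Regression Data

Description

Four x-y datasets which have the same traditional statistical properties (mean, variance, correlation, regression line, etc.), yet are quite different.

Usage

anscombe

Format

A data frame with 11 observations on 8 variables.

x1 == x2 == x3  the integers 4:14, specially arranged
x4              values 8 and 19
y1, y2, y3, y4  numbers in (3, 12.5) with mean 7.5 and standard deviation 2.03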
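The stated mean and standard deviation of the four y columns can be checked directly. This is a small sketch; y.cols and y.stats are illustrative names.

```r
# Means and standard deviations of y1, ..., y4
y.cols <- anscombe[, paste0("y", 1:4)]
y.stats <- rbind(mean = sapply(y.cols, mean),
                 sd   = sapply(y.cols, sd))
round(y.stats, 2)
```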

Source

Tufte, Edward R. (1989) The Visual Display of Quantitative Information, 13–14. Graphics Press.

References

Anscombe, Francis J. (1973) Graphs in statistical analysis. American Statistician, 27, 17–21.

Examples

require(stats); require(graphics)
summary(anscombe)

##-- now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
mods <- setNames(as.list(1:4), paste0("lm", 1:4))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  ## or   ff[[2]] <- as.name(paste0("y", i))
  ##      ff[[3]] <- as.name(paste0("x", i))
  mods[[i]] <- lmi <- lm(ff, data = anscombe)
  print(anova(lmi))
}

## See how close they are (numerically!)
sapply(mods, coef)
lapply(mods, function(fm) coef(summary(fm)))

## Now, do what you should have done in the first place: plot
op <- par(mfrow = c(2, 2), mar = 0.1+c(4,4,1,1), oma = c(0, 0, 2, 0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange", cex = 1.2,
       xlim = c(3, 19), ylim = c(3, 13))
  abline(mods[[i]], col = "blue")
}
mtext("Anscombe's 4 Regression data sets", outer = TRUE, cex = 1.5)
par(op)

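attenu Earthquake Attenuation Data

Description

This data gives peak accelerations measured at various observation stations for 23 earthquakes in California. The data have been used by various workers to estimate the attenuating effect of distance on ground acceleration.

Usage

attenu

Format

A data frame with 182 observations on 5 variables.

[,1] event   numeric Event number
[,2] mag     numeric Moment magnitude
[,3] station factor  Station number
[,4] dist    numeric Station-epicenter distance (km)
[,5] accel   numeric Peak acceleration (g)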

Source

Joyner, W.B., D.M. Boore and R.D. Porcella (1981). Peak horizontal acceleration and velocity from strong-motion records including records from the 1979 Imperial Valley, California earthquake. USGS Open File report 81-365. Menlo Park, Ca.

References

Boore, D. M. and Joyner, W.B. (1982) The empirical prediction of ground motion, Bull. Seism. Soc. Am., 72, S269–S268.

Bolt, B. A. and Abrahamson, N. A. (1982) New attenuation relations for peak and expected accelerations of strong ground motion, Bull. Seism. Soc. Am., 72, 2307–2321.

Bolt B. A. and Abrahamson, N. A. (1983) Reply to W. B. Joyner & D. M. Boore’s “Comments on: New attenuation relations for peak and expected accelerations of strong ground motion”, Bull. Seism. Soc. Am., 73, 1481–1483.

Brillinger, D. R. and Preisler, H. K. (1984) An exploratory analysis of the Joyner-Boore attenuation data, Bull. Seism. Soc. Am., 74, 1441–1449.

Brillinger, D. R. and Preisler, H. K. (1984) Further analysis of the Joyner-Boore attenuation data. Manuscript.

Examples

require(graphics)
## check the data class of the variables
sapply(attenu, data.class)
summary(attenu)
pairs(attenu, main = "attenu data")
coplot(accel ~ dist | as.factor(event), data = attenu, show.given = FALSE)
coplot(log(accel) ~ log(dist) | as.factor(event),
       data = attenu, panel = panel.smooth, show.given = FALSE)

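attitude Work Attitude Data

Description

From a survey of the clerical employees of a large financial organization, the data are aggregated from the questionnaires of approximately 35 employees in each of 30 (randomly selected) departments. The numbers give the percentage of favourable responses to seven questions in each department.

Usage

attitude

Format

A data frame with 30 observations on 7 variables. The first column gives the short names from the reference, the second the variable names in the data frame:

Y    rating     numeric Overall rating
X[1] complaints numeric Handling of employee complaints
X[2] privileges numeric Does not allow special privileges
X[3] learning   numeric Opportunity to learn
X[4] raises     numeric Raises based on performance
X[5] critical   numeric Too critical
X[6] advance    numeric Advancement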

Source

Chatterjee, S. and Price, B. (1977) Regression Analysis by Example. New York: Wiley. (Section 3.7, p.68ff of 2nd ed.(1991).)

Examples

require(stats); require(graphics)
pairs(attitude, main = "attitude data")
summary(attitude)
summary(fm1 <- lm(rating ~ ., data = attitude))
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0),
            mar = c(4.1, 4.1, 2.1, 1.1))
plot(fm1)
summary(fm2 <- lm(rating ~ complaints, data = attitude))
plot(fm2)
par(opar)

austres Number of Australian Residents

Description

Numbers (in thousands) of Australian residents measured quarterly from March 1971 to March 1994. The object is of class "ts".

Usage

austres

Source

P. J. Brockwell and R. A. Davis (1996) Introduction to Time Series and Forecasting. Springer

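beavers Beaver Body Temperature Data

Description

Reynolds (1994) describes a small part of a study of the long-term temperature dynamics of the beaver Castor canadensis in north-central Wisconsin. Body temperature was measured by telemetry every 10 minutes for four females, but only data from a single period of less than a day are used there.

Usage

beaver1
beaver2

Format

The beaver1 data frame has 114 rows and 4 columns of body temperature measurements at 10 minute intervals.

The beaver2 data frame has 100 rows and 4 columns of body temperature measurements at 10 minute intervals.

The variables are as follows:

day Day of observation (in days since the beginning of 1990), December 12–13 (beaver1) and November 3–4 (beaver2).

time Time of observation, in the form 0330 for 3:30am.

temp Measured body temperature in degrees Celsius.

activ Indicator of activity outside the retreat.

Note

The observation at 22:20 is missing in beaver1.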

Source

P. S. Reynolds (1994) Time-series analyses of beaver body temperatures. Chapter 11 of Lange, N., Ryan, L., Billard, L., Brillinger, D., Conquest, L. and Greenhouse, J. eds (1994) Case Studies in Biometry. New York: John Wiley and Sons.

Examples

require(graphics)
(yl <- range(beaver1$temp, beaver2$temp))

beaver.plot <- function(bdat, ...) {
  nam <- deparse(substitute(bdat))
  with(bdat, {
    # Hours since start of day:
    hours <- time %/% 100 + 24*(day - day[1]) + (time %% 100)/60
    plot (hours, temp, type = "l", ...,
          main = paste(nam, "body temperature"))
    abline(h = 37.5, col = "gray", lty = 2)
    is.act <- activ == 1
    points(hours[is.act], temp[is.act], col = 2, cex = .8)
  })
}
op <- par(mfrow = c(2, 1), mar = c(3, 3, 4, 2), mgp = 0.9 * 2:0)
beaver.plot(beaver1, ylim = yl)
beaver.plot(beaver2, ylim = yl)
par(op)

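BJsales Sales Data with Leading Indicator

Description

The sales time series BJsales and the leading indicator BJsales.lead each contain 150 observations. The objects are of class "ts".

Usage

BJsales
BJsales.lead

Source

The data are given in Box & Jenkins (1976). Obtained from the Time Series Data Library at http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/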

References

G. E. P. Box and G. M. Jenkins (1976): Time Series Analysis, Forecasting and Control, Holden-Day, San Francisco, p. 537.

P. J. Brockwell and R. A. Davis (1991): Time Series: Theory and Methods, Second edition, Springer Verlag, NY, pp. 414.

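BOD Biochemical Oxygen Demand

Description

The BOD data frame has 6 rows and 2 columns giving the biochemical oxygen demand versus time in an evaluation of water quality.

Usage

BOD

Format

This data frame contains the following columns:

Time A numeric vector giving the time of the measurement (days).

demand A numeric vector giving the biochemical oxygen demand (mg/l).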

Source

Bates, D.M. and Watts, D.G. (1988), Nonlinear Regression Analysis and Its Applications, Wiley, Appendix A1.4.

Originally from Marske (1967), Biochemical Oxygen Demand Data Interpretation Using Sum of Squares Surface M.Sc. Thesis, University of Wisconsin – Madison.

Examples

require(stats)
# simplest form of fitting a first-order model to these data
fm1 <- nls(demand ~ A*(1-exp(-exp(lrc)*Time)), data = BOD,
           start = c(A = 20, lrc = log(.35)))
coef(fm1)
fm1
# using the plinear algorithm
fm2 <- nls(demand ~ (1-exp(-exp(lrc)*Time)), data = BOD,
           start = c(lrc = log(.35)), algorithm = "plinear", trace = TRUE)
# using a self-starting model
fm3 <- nls(demand ~ SSasympOrig(Time, A, lrc), data = BOD)
summary(fm3)

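cars Speed and Stopping Distances of Cars

Description

The data give the speed of cars and the distances taken to stop. Note that the data were recorded in the 1920s.

Usage

cars

Format

A data frame with 50 observations on 2 variables.

[,1] speed numeric Speed (mph)
[,2] dist  numeric Stopping distance (ft)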

Source

Ezekiel, M. (1930) Methods of Correlation Analysis. Wiley.

References

McNeil, D. R. (1977) Interactive Data Analysis. Wiley.

Examples

require(stats); require(graphics)
plot(cars, xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     las = 1)
lines(lowess(cars$speed, cars$dist, f = 2/3, iter = 3), col = "red")
title(main = "cars data")
plot(cars, xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     las = 1, log = "xy")
title(main = "cars data (logarithmic scales)")
lines(lowess(cars$speed, cars$dist, f = 2/3, iter = 3), col = "red")
summary(fm1 <- lm(log(dist) ~ log(speed), data = cars))
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0),
            mar = c(4.1, 4.1, 2.1, 1.1))
plot(fm1)
par(opar)

## An example of polynomial regression
plot(cars, xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     las = 1, xlim = c(0, 25))
d <- seq(0, 25, length.out = 200)
for(degree in 1:4) {
  fm <- lm(dist ~ poly(speed, degree), data = cars)
  assign(paste("cars", degree, sep = "."), fm)
  lines(d, predict(fm, data.frame(speed = d)), col = degree)
}
anova(cars.1, cars.2, cars.3, cars.4)

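ChickWeight Chick Weight Data

Description

The ChickWeight data frame has 578 rows and 4 columns from an experiment on the effect of diet on early growth of chicks.

Usage

ChickWeight

Format

An object of class c("nfnGroupedData", "nfGroupedData", "groupedData", "data.frame") containing the following columns:

weight a numeric vector giving the body weight of the chick (gm).

Time a numeric vector giving the number of days since birth when the measurement was made.

Chick an ordered factor with levels 18 < ... < 48 giving a unique identifier for the chick. The ordering of the levels groups chicks on the same diet together and orders them according to their final weight (lightest to heaviest) within diet.

Diet a factor with levels 1, ..., 4 indicating which experimental diet the chick received.

Details

The body weights of the chicks were measured at birth and every second day thereafter until day 20. They were also measured on day 21. There were four groups of chicks on different protein diets.

This dataset was originally part of package nlme, which has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.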

Source

Crowder, M. and Hand, D. (1990), Analysis of Repeated Measures, Chapman and Hall (example 5.3)

Hand, D. and Crowder, M. (1996), Practical Longitudinal Data Analysis, Chapman and Hall (table A.2)

Pinheiro, J. C. and Bates, D. M. (2000) Mixed-effects Models in S and S-PLUS, Springer.

See Also

SSlogis for models fitted to this dataset.

Examples

require(graphics)
coplot(weight ~ Time | Chick, data = ChickWeight,
       type = "b", show.given = FALSE)

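chickwts Chicken Growth Rate Data

Description

An experiment was conducted to compare the effectiveness of various feed supplements on the growth rate of chickens.

Usage

chickwts

Format

A data frame with 71 observations on 2 variables.

weight a numeric variable giving the chick weight.

feed a factor giving the feed type.

Details

Newly hatched chicks were randomly allocated into six groups, and each group was given a different feed supplement. Their weights in grams after six weeks are given along with feed types.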

Source

Anonymous (1948) Biometrika, 35, 214.

References

McNeil, D. R. (1977) Interactive Data Analysis. New York: Wiley.

Examples

require(stats); require(graphics)
boxplot(weight ~ feed, data = chickwts, col = "lightgray",
        varwidth = TRUE, notch = TRUE, main = "chickwt data",
        ylab = "Weight at six weeks (gm)")
anova(fm1 <- lm(weight ~ feed, data = chickwts))
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0),
            mar = c(4.1, 4.1, 2.1, 1.1))
plot(fm1)
par(opar)


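CO2 Carbon Dioxide Uptake in Grass Plants

Description

The CO2 data frame has 84 rows and 5 columns of data from an experiment on the cold tolerance of the grass species Echinochloa crus-galli.

Usage

CO2

Format


Plant an ordered factor with levels Qn1 < Qn2 < Qn3 < ... < Mc1 giving a unique identifier for each plant.

Type a factor with levels Quebec and Mississippi giving the origin of the plant.

Treatment a factor with levels nonchilled and chilled.

conc a numeric vector of ambient carbon dioxide concentrations (mL/L).

uptake a numeric vector of carbon dioxide uptake rates (µmol/m^2 sec).

Details

The CO2 uptake of six plants from Quebec and six plants from Mississippi was measured at several levels of ambient CO2 concentration. Half the plants of each type were chilled overnight before the experiment was conducted.

This dataset was originally part of package nlme, which has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.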

Source

Potvin, C., Lechowicz, M. J. and Tardif, S. (1990) “The statistical analysis of ecophysiological response curves obtained from experiments involving repeated measures”, Ecology, 71, 1389–1400.


Examples

require(stats); require(graphics)

coplot(uptake ~ conc | Plant, data = CO2, show.given = FALSE, type = "b")
## fit the data for the first plant
fm1 <- nls(uptake ~ SSasymp(conc, Asym, lrc, c0),
           data = CO2, subset = Plant == "Qn1")
summary(fm1)
## fit each plant separately
fmlist <- list()
for (pp in levels(CO2$Plant)) {
  fmlist[[pp]] <- nls(uptake ~ SSasymp(conc, Asym, lrc, c0),
                      data = CO2, subset = Plant == pp)
}
## check the coefficients by plant
print(sapply(fmlist, coef), digits = 3)

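co2 Mauna Loa Atmospheric CO2 Concentration

Description

Atmospheric concentrations of CO2 are expressed in parts per million (ppm) and reported in the preliminary 1997 SIO manometric mole fraction scale.

Usage

co2

Format

A time series of 468 observations; monthly from 1959 to 1997.

Details

The values for February, March and April of 1964 were missing and have been obtained by interpolating linearly between the values for January and May of 1964.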

Source

Keeling, C. D. and Whorf, T. P., Scripps Institution of Oceanography (SIO), University of California, La Jolla, California USA 92093-0220.

ftp://cdiac.esd.ornl.gov/pub/maunaloa-co2/maunaloa.co2.

References

Cleveland, W. S. (1993) Visualizing Data. New Jersey: Summit Press.

Examples

require(graphics)
plot(co2, ylab = expression("Atmospheric concentration of CO"[2]),
     las = 1)
title(main = "co2 data set")

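crimtab Student's Criminals Data

Description

Data on 3000 male criminals over 20 years old undergoing their sentences in the chief prisons of England and Wales.

Usage

crimtab

Format

A table object of integer counts, of dimension 42 × 22, with a total count, sum(crimtab), of 3000.

The 42 rownames ("9.4", "9.5", ...) correspond to midpoints of intervals of finger lengths, whereas the 22 column names (colnames) ("142.24", "144.78", ...) correspond to the heights of the 3000 criminals; see also below.

Details

Student is the pseudonym of William Sealy Gosset. In his 1908 paper he wrote (on page 13), at the beginning of section VI entitled Practical Test of the forgoing Equations:

“Before I had succeeded in solving my problem analytically, I had endeavoured to do so empirically. The material used was a correlation table containing the height and left middle finger measurements of 3000 criminals, from a paper by W. R. MacDonell (Biometrika, Vol. I., p. 219). The measurements were written out on 3000 pieces of cardboard, which were then very thoroughly shuffled and drawn at random. As each card was drawn its numbers were written down in a book, which thus contains the measurements of 3000 criminals in a random order. Finally, each consecutive set of 4 was taken as a sample (750 in all) and the mean, standard deviation, and correlation of each sample determined. The difference between the mean of each sample and the mean of the population was then divided by the standard deviation of the sample, giving us the z of Section III.”

The table is in fact on page 216 and not page 219 of MacDonell (1902). In the MacDonell table, the middle finger lengths were given in mm and the heights in feet/inches intervals; both are converted into cm here. The midpoints of intervals were used, e.g., where MacDonell has 4' 7"9/16 -- 8"9/16, we have 142.24, which is 2.54*56 = 2.54*(4' 8").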
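The unit conversion in this example can be reproduced in a couple of lines. This is only an arithmetic sketch; inches and cm are illustrative variable names.

```r
# Midpoint of the interval 4' 7"9/16 -- 4' 8"9/16 is 4 ft 8 in
inches <- 4 * 12 + 8      # 56 inches
cm <- 2.54 * inches       # convert to centimetres
cm

# Compare with the first column labels of the table
head(colnames(crimtab), 2)
```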

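MacDonell credited the source of the data (page 178) as follows: The data on which the memoir is based were obtained, through the kindness of Dr Garson, from the Central Metric Office, New Scotland Yard. He pointed out on page 179 that: The forms were drawn at random from the mass on the office shelves; we are therefore dealing with a random sampling.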

Source

http://pbil.univ-lyon1.fr/R/donnees/criminals1902.txt, thanks to Jean R. Lobry and Anne-Béatrice Dufour.

References

Garson, J.G. (1900) The metric system of identification of criminals, as used in Great Britain and Ireland. The Journal of the Anthropological Institute of Great Britain and Ireland 30, 161–198.

MacDonell, W.R. (1902) On criminal anthropometry and the identification of criminals. Biometrika 1, 2, 177–227.

Student (1908) The probable error of a mean. Biometrika 6, 1–25.

Examples

require(stats)
dim(crimtab)
utils::str(crimtab)
## for nicer printing:
local({cT <- crimtab
       colnames(cT) <- substring(colnames(cT), 2, 3)
       print(cT, zero.print = " ")
})

## Repeat Student's experiment:

# 1) Reconstitute the 3000 raw heights in inches, rounding to the
#    nearest integer as in Student's paper:

(heIn <- round(as.numeric(colnames(crimtab)) / 2.54))
d.hei <- data.frame(height = rep(heIn, colSums(crimtab)))

# 2) Shuffle the data:

set.seed(1)
d.hei <- d.hei[sample(1:3000), , drop = FALSE]

# 3) Make 750 samples, each of size 4:

d.hei$sample <- as.factor(rep(1:750, each = 4))

# 4) Compute the means and standard deviations for the 750 samples:

h.mean <- with(d.hei, tapply(height, sample, FUN = mean))
h.sd <- with(d.hei, tapply(height, sample, FUN = sd)) * sqrt(3/4)

# 5) Compute the difference between the mean of each sample and the
#    mean of the population, then divide by the standard deviation
#    of the sample:

zobs <- (h.mean - mean(d.hei[,"height"]))/h.sd

# 6) Replace infinite values by +/- 6 as in Student's paper:

zobs[infZ <- is.infinite(zobs)] # 3 of them
zobs[infZ] <- 6 * sign(zobs[infZ])

# 7) Plot the distribution:

require(grDevices); require(graphics)
hist(x = zobs, probability = TRUE, xlab = "Student's z",
     col = grey(0.8), border = grey(0.5),
     main = "Distribution of Student's z score for 'crimtab' data")

discoveries Yearly Numbers of Important Discoveries

Description

The numbers of “great” inventions and scientific discoveries in each year from 1860 to 1959.

Usage

discoveries

Format

A time series of 100 values.

Source

The World Almanac and Book of Facts, 1975 Edition, pages 315–318.

References


Examples

require(graphics)
plot(discoveries, ylab = "Number of important discoveries",
     las = 1)
title(main = "discoveries data set")

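DNase DNase

Description

The DNase data frame has 176 rows and 3 columns of data obtained during development of an enzyme-linked immunosorbent assay (ELISA) for the recombinant protein DNase in rat serum.

Usage

DNase

Format


Run an ordered factor with levels 10 < ... < 3 indicating the assay run.

conc a numeric vector giving the known concentration of the protein.

density a numeric vector giving the measured optical density (dimensionless) in the assay. Duplicate optical density measurements were obtained.

Details

This dataset was originally part of package nlme, which has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.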

Source

Davidian, M. and Giltinan, D. M. (1995) Nonlinear Models for Repeated Measurement Data, Chapman & Hall (section 5.2.4, p. 134)


Examples


require(stats); require(graphics)

coplot(density ~ conc | Run, data = DNase,
       show.given = FALSE, type = "b")
coplot(density ~ log(conc) | Run, data = DNase,
       show.given = FALSE, type = "b")
## fit a representative run
fm1 <- nls(density ~ SSlogis( log(conc), Asym, xmid, scal ),
           data = DNase, subset = Run == 1)
## compare with a four-parameter logistic
fm2 <- nls(density ~ SSfpl( log(conc), A, B, xmid, scal ),
           data = DNase, subset = Run == 1)
summary(fm2)
anova(fm1, fm2)

esoph Esophageal Cancer Data

Description

Data from a case-control study of esophageal cancer in Ille-et-Vilaine, France.

Usage

esoph

Format

A data frame with records for 88 age/alcohol/tobacco combinations.

[,1] "agegp"     Age group            1 25–34 years, 2 35–44, 3 45–54, 4 55–64, 5 65–74, 6 75+
[,2] "alcgp"     Alcohol consumption  1 0–39 gm/day, 2 40–79, 3 80–119, 4 120+
[,3] "tobgp"     Tobacco consumption  1 0–9 gm/day, 2 10–19, 3 20–29, 4 30+
[,4] "ncases"    Number of cases
[,5] "ncontrols" Number of controls
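Since each row is one age/alcohol/tobacco combination, group-level counts have to be summed over the other two factors. The following sketch totals cases and controls per alcohol group; by.alc is a hypothetical name.

```r
# Total cases and controls within each alcohol consumption group
by.alc <- aggregate(cbind(ncases, ncontrols) ~ alcgp, data = esoph, FUN = sum)
by.alc
```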

Author(s)

Thomas Lumley

Source

Breslow, N. E. and Day, N. E. (1980) Statistical Methods in Cancer Research. Volume 1: The Analysis of Case-Control Studies. IARC Lyon / Oxford University Press.

Examples

require(stats)
require(graphics) # for mosaicplot
summary(esoph)
## effects of alcohol, tobacco and interaction, age-adjusted
model1 <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp * alcgp,
              data = esoph, family = binomial())
anova(model1)
## Try a linear effect of alcohol and tobacco
model2 <- glm(cbind(ncases, ncontrols) ~ agegp + unclass(tobgp)
                                         + unclass(alcgp),
              data = esoph, family = binomial())
summary(model2)
## Re-arrange data for a mosaic plot
ttt <- table(esoph$agegp, esoph$alcgp, esoph$tobgp)
o <- with(esoph, order(tobgp, alcgp, agegp))
ttt[ttt == 1] <- esoph$ncases[o]
tt1 <- table(esoph$agegp, esoph$alcgp, esoph$tobgp)
tt1[tt1 == 1] <- esoph$ncontrols[o]
tt <- array(c(ttt, tt1), c(dim(ttt),2),
            c(dimnames(ttt), list(c("Cancer", "control"))))
mosaicplot(tt, main = "esoph data set", color = TRUE)

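euro Conversion Rates of Euro Currencies

Description

Conversion rates between the various Euro currencies.

Usage

euro
euro.cross

Format

euro is a named vector of length 11; euro.cross is a matrix of size 11 by 11, with dimnames.

Details

The data set euro contains the value of 1 Euro in all currencies participating in the European monetary union (Austrian Schilling ATS, Belgian Franc BEF, German Mark DEM, Spanish Peseta ESP, Finnish Markka FIM, French Franc FRF, Irish Punt IEP, Italian Lira ITL, Luxembourg Franc LUF, Dutch Guilder NLG and Portuguese Escudo PTE). These conversion rates were fixed by the European Union on December 31, 1998. To convert old prices to Euro prices, divide by the respective rate and round to 2 digits.

The data set euro.cross contains conversion rates between the various Euro currencies, i.e., the result of outer(1 / euro, euro).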
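The conversion rule and the layout of euro.cross can be sketched as follows. The price of 100 Austrian Schillings is a made-up input, and price.ats, price.eur and ats.to.bef are illustrative names.

```r
# Convert a hypothetical price of 100 ATS to Euro: divide by the rate,
# then round to 2 digits
price.ats <- 100
price.eur <- round(price.ats / euro[["ATS"]], 2)
price.eur

# euro.cross[i, j] is the value of one unit of currency i in currency j,
# i.e. euro[j] / euro[i]
ats.to.bef <- euro[["BEF"]] / euro[["ATS"]]
all.equal(euro.cross["ATS", "BEF"], ats.to.bef)
```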

Examples

cbind(euro)

## These relations hold:
euro == signif(euro, 6) # [6 digit precision in Euro's definition]
all(euro.cross == outer(1/euro, euro))

## Convert 20 Euro to Belgian Franc
20 * euro["BEF"]
## Convert 20 Austrian Schilling to Euro
20 / euro["ATS"]
## Convert 20 Spanish Pesetas to Italian Lira
20 * euro.cross["ESP", "ITL"]

require(graphics)
dotchart(euro,
         main = "euro data: 1 Euro in currency unit")
dotchart(1/euro,
         main = "euro data: 1 currency unit in Euros")
dotchart(log(euro, 10),
         main = "euro data: log10(1 Euro in currency unit)")

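eurodist Distances Between European Cities

Description

eurodist gives the road distances (in km) between 21 cities in Europe. The data are taken from The Cambridge Encyclopaedia.

UScitiesD gives “straight line” distances between 10 cities in the US.

Usage

eurodist
UScitiesD

Format

dist objects based on 21 and 10 objects, respectively. (You must have the stats package loaded to have the methods for this kind of object available.)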
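A dist object stores only the lower triangle of the distance matrix. The sketch below reads its size attribute and expands it to a full matrix with the as.matrix method from stats; m is an illustrative name.

```r
# Number of cities behind each dist object
attr(eurodist, "Size")
attr(UScitiesD, "Size")

# Expand to a full symmetric matrix and look up one pair of cities
m <- as.matrix(eurodist)
m["Athens", "Rome"]
```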

Source

Crystal, D. Ed. (1990) The Cambridge Encyclopaedia. Cambridge: Cambridge University Press.

The US cities distances were provided by Pierre Legendre.

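EuStockMarkets European Stock Market Data, 1991–1998

Description

Contains the daily closing prices of major European stock indices: Germany DAX (Ibis), Switzerland SMI, France CAC, and UK FTSE. The data are sampled in business time, i.e., weekends and holidays are omitted.

Usage

EuStockMarkets

Format

A multivariate time series with 1860 observations on 4 variables. The object is of class "mts".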
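Because the object is a multivariate time series, functions such as diff() and log() apply column-wise. As a sketch, daily log returns can be computed like this; returns is an illustrative name.

```r
# Dimensions and index names
dim(EuStockMarkets)
colnames(EuStockMarkets)

# Daily log returns for all four indices at once
returns <- diff(log(EuStockMarkets))
summary(returns)
```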

Source

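The data were kindly provided by Erste Bank AG, Vienna, Austria.

faithful Old Faithful Geyser Data

Description

Waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA.

Usage

faithful

Format


[,1] eruptions numeric Eruption time (in mins)
[,2] waiting   numeric Waiting time to next eruption (in mins)

Details

A closer look at faithful$eruptions reveals that these are heavily rounded times originally in seconds, where multiples of 5 are more frequent than expected under non-human measurement. For a better version of the eruption times, see the example below.

There are many versions of this dataset around: Azzalini and Bowman (1990) use a more complete version.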

Source

W. Härdle.

References

Härdle, W. (1991) Smoothing Techniques with Implementation in S. New York: Springer.

Azzalini, A. and Bowman, A. W. (1990). A look at some data on the Old Faithful geyser. Applied Statistics 39, 357–365.

See Also

geyser in package MASS (https://CRAN.R-project.org/package=MASS) for the Azzalini–Bowman version.

Examples

require(stats); require(graphics)
f.tit <- "faithful data: Eruptions of Old Faithful"

ne60 <- round(e60 <- 60 * faithful$eruptions)
all.equal(e60, ne60) # relative diff. ~ 1/10000
table(zapsmall(abs(e60 - ne60))) # 0, 0.02 or 0.04
faithful$better.eruptions <- ne60 / 60
te <- table(ne60)
te[te >= 4] # (too) many multiples of 5 !
plot(names(te), te, type = "h", main = f.tit, xlab = "Eruption time (sec)")

plot(faithful[, -3], main = f.tit,
     xlab = "Eruption time (min)",
     ylab = "Waiting time to next eruption (min)")
lines(lowess(faithful$eruptions, faithful$waiting, f = 2/3, iter = 3),
      col = "red")

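Formaldehyde Determination of Formaldehyde

Description

These data are from a chemical experiment to prepare a standard curve for the determination of formaldehyde by adding chromotropic acid and concentrated sulphuric acid and reading the resulting purple color on a spectrophotometer.

Usage

Formaldehyde

Format

A data frame with 6 observations on 2 variables.

[,1] carb numeric Carbohydrate (ml)
[,2] optden numeric Optical density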

Source

Bennett, N. A. and N. L. Franklin (1954) Statistical Analysis in Chemistry and the Chemical Industry. New York: Wiley.

References


Examples

require(stats); require(graphics)
plot(optden ~ carb, data = Formaldehyde,
     xlab = "Carbohydrate (ml)", ylab = "Optical Density",
     main = "Formaldehyde data", col = 4, las = 1)
abline(fm1 <- lm(optden ~ carb, data = Formaldehyde))
summary(fm1)
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0))
plot(fm1)
par(opar)

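freeny Freeny's Revenue Data

Description

Freeny's data on quarterly revenue and explanatory variables.

Usage

freeny
freeny.x
freeny.y

Format

There are three ‘freeny’ data sets.

freeny.y is a time series with 39 observations on quarterly revenue from (1962,2Q) to (1971,4Q).

freeny.x is a matrix of explanatory variables. The columns are freeny.y lagged 1 quarter, price index, income level, and market potential.

Finally, freeny is a data frame with variables y, lag.quarterly.revenue, price.index, income.level, and market.potential obtained from the above two data objects.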

Source

A. E. Freeny (1977) A Portable Linear Regression Package with Test Programs. Bell Laboratories memorandum.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

require(stats); require(graphics)
summary(freeny)
pairs(freeny, main = "freeny data")
# gives a warning: freeny$y has class "ts"

summary(fm1 <- lm(y ~ ., data = freeny))
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0))
plot(fm1)
par(opar)


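HairEyeColor Hair and Eye Color of Statistics Students

Description

Distribution of hair and eye color in 592 statistics students.

Usage

HairEyeColor

Format

A 3-dimensional array resulting from cross-tabulating 592 observations on 3 variables. The variables and their levels are as follows:

No Name Levels
1 Hair Black, Brown, Red, Blond
2 Eye Brown, Blue, Hazel, Green
3 Sex Male, Female

Details

The Hair by Eye table comes from a survey of students at the University of Delaware reported by Snee (1974). The split by Sex was added by Friendly (1992a) for didactic purposes.

This data set is useful for illustrating various techniques for the analysis of contingency tables, such as the standard chi-squared test or, more generally, log-linear modelling, and graphical methods such as mosaic plots, sieve diagrams or association plots.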

Source

http://euclid.psych.yorku.ca/ftp/sas/vcd/catdata/haireye.sas

Snee (1974) gives the two-way table aggregated over Sex. The Sex split of the ‘Brown hair, Brown eye’ cell was changed to agree with that used by Friendly (2000).

References

Snee, R. D. (1974) Graphical display of two-way contingency tables. The American Statistician, 28, 9–12.

Friendly, M. (1992a) Graphical methods for categorical data. SAS User Group International Conference Proceedings, 17, 190–200. http://www.math.yorku.ca/SCS/sugi/sugi17-paper.html

Friendly, M. (1992b) Mosaic displays for loglinear models. Proceedings of the Statistical Graphics Section, American Statistical Association, pp. 61–68. http://www.math.yorku.ca/SCS/Papers/asa92.html

Friendly, M. (2000) Visualizing Categorical Data. SAS Institute, ISBN 1-58025-660-0.

See Also

chisq.test, loglin, mosaicplot

Examples

require(graphics)
## Full mosaic
mosaicplot(HairEyeColor)
## Aggregate over sex (as in Snee's original data)
x <- apply(HairEyeColor, c(1, 2), sum)
x
mosaicplot(x, main = "Relation between hair and eye color")
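
The Details section mentions the standard chi-squared test as one use of this table. As a hedged sketch of that use (the names x and res are illustrative), the two-way Hair by Eye table can be tested for independence with chisq.test from the stats package:

```r
library(stats)

# Collapse over Sex to obtain the two-way Hair x Eye table.
x <- apply(HairEyeColor, c(1, 2), sum)

# Pearson chi-squared test of independence between hair and eye color.
res <- chisq.test(x)
res$parameter               # degrees of freedom: (4 - 1) * (4 - 1)
res$p.value
```

A log-linear model via loglin would be the more general alternative that the text refers to.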

Harman23.cor Harman Example 2.3

Description

A correlation matrix of eight physical measurements on 305 girls between ages seven and seventeen.

Usage

Harman23.cor

Source

Harman, H. H. (1976) Modern Factor Analysis, Third Edition Revised, University of Chicago Press, Table 2.3.

Examples

require(stats)
(Harman23.FA <- factanal(factors = 1, covmat = Harman23.cor))
for(factors in 2:4) print(update(Harman23.FA, factors = factors))

Harman74.cor Harman Example 7.4

Description

A correlation matrix of psychological tests given to 145 seventh- and eighth-grade children in a Chicago suburb by Holzinger and Swineford.

Usage

Harman74.cor

Source

Harman, H. H. (1976) Modern Factor Analysis, Third Edition Revised, University of Chicago Press, Table 7.4.

Examples

require(stats)
(Harman74.FA <- factanal(factors = 1, covmat = Harman74.cor))
for(factors in 2:5) print(update(Harman74.FA, factors = factors))
Harman74.FA <- factanal(factors = 5, covmat = Harman74.cor,
                        rotation = "promax")
print(Harman74.FA$loadings, sort = TRUE)

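Indometh Pharmacokinetics of Indomethacin

Description

The Indometh data frame has 66 rows and 3 columns of data on the pharmacokinetics of indometacin (an acidic non-steroidal anti-inflammatory drug; older spelling ‘indomethacin’).

Usage

Indometh

Format


Subject an ordered factor containing the subject codes. The ordering is according to increasing maximum response.

time a numeric vector of times at which blood samples were drawn (hr).

conc a numeric vector of plasma concentrations of indometacin (mcg/ml).

Details

Each of the six subjects was given an intravenous injection of indometacin.

This dataset was originally part of package nlme, which has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.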

Source

Kwan, Breault, Umbenhauer, McMahon and Duggan (1976) Kinetics of Indomethacin absorption, elimination, and enterohepatic circulation in man. Journal of Pharmacokinetics and Biopharmaceutics 4, 255–280.

Davidian, M. and Giltinan, D. M. (1995) Nonlinear Models for Repeated Measurement Data, Chapman & Hall (section 5.2.4, p. 129)


See Also

SSbiexp for models fitted to this dataset.
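
As a hedged illustration of the SSbiexp reference above, the sketch below fits a biexponential curve to a single subject with nls. The names Indo.1 and fm are illustrative; fitting one subject at a time is a simplification chosen here, not a method prescribed by the manual.

```r
library(stats)

# Data for a single subject (Subject is an ordered factor).
Indo.1 <- Indometh[Indometh$Subject == 1, ]

# Self-starting biexponential model:
#   conc = A1 * exp(-exp(lrc1) * time) + A2 * exp(-exp(lrc2) * time)
fm <- nls(conc ~ SSbiexp(time, A1, lrc1, A2, lrc2), data = Indo.1)
coef(fm)
```

Because SSbiexp is self-starting, no start values need to be supplied.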

infert Infertility Data

Description

This is a matched case-control study dating from before the availability of conditional logistic regression.

Usage

infert

Format

1. Education: 0 = 0-5 years, 1 = 6-11 years, 2 = 12+ years
2. age: age in years of case
3. parity: count
4. number of prior induced abortions: 0 = 0, 1 = 1, 2 = 2 or more
5. case status: 1 = case, 0 = control
6. number of prior spontaneous abortions: 0 = 0, 1 = 1, 2 = 2 or more
7. matched set number: 1-83
8. stratum number: 1-63

Note

One case with two prior spontaneous abortions and two prior induced abortions is omitted.

Source

Trichopoulos et al (1976) Br. J. of Obst. and Gynaec. 83, 645–650.

Examples

require(stats)
model1 <- glm(case ~ spontaneous+induced, data = infert, family = binomial())
summary(model1)
## adjusted for other potential confounders:
summary(model2 <- glm(case ~ age+parity+education+spontaneous+induced,
                      data = infert, family = binomial()))
## Really should be analysed by conditional logistic regression,
## which is in the survival package
if(require(survival)){
  model3 <- clogit(case ~ spontaneous+induced+strata(stratum), data = infert)
  print(summary(model3))
  detach("package:survival")  # survival (conflicts)
}

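InsectSprays Effectiveness of Insect Sprays

Description

The counts of insects in agricultural experimental units treated with different insecticides.

Usage

InsectSprays

Format

A data frame with 72 observations on 2 variables.

[,1] count numeric Insect count
[,2] spray factor The type of spray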

Source

Beall, G. (1942) The Transformation of data from entomological field experiments, Biometrika, 29, 243–262.

References

McNeil, D. (1977) Interactive Data Analysis. New York: Wiley.

Examples


boxplot(count ~ spray, data = InsectSprays,
        xlab = "Type of spray", ylab = "Insect count",
        main = "InsectSprays data", varwidth = TRUE, col = "lightgray")
fm1 <- aov(count ~ spray, data = InsectSprays)
summary(fm1)
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0))
plot(fm1)
fm2 <- aov(sqrt(count) ~ spray, data = InsectSprays)
summary(fm2)
plot(fm2)
par(opar)

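iris Iris Data

Description

This famous (Fisher's or Anderson's) iris data set gives the measurements of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.

Usage

iris
iris3

Format

iris has 150 cases (rows) and 5 variables (columns) named Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species.

iris3 gives the same data arranged as a 3-dimensional array of size 50 by 4 by 3, as represented by S-PLUS. The first dimension gives the case number within the species subsample, the second the measurements with names Sepal L., Sepal W., Petal L., and Petal W., and the third the species.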

Source

Fisher, R. A. (1936) The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179–188.

The data were collected by Anderson, Edgar (1935). The irises of the Gaspe Peninsula, Bulletin of the American Iris Society, 59, 2–5.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (has iris3 as iris.)

See Also

matplot for some examples using iris.

Examples

dni3 <- dimnames(iris3)
ii <- data.frame(matrix(aperm(iris3, c(1,3,2)), ncol = 4,
                        dimnames = list(NULL, sub(" L.",".Length",
                                        sub(" W.",".Width", dni3[[2]])))),
    Species = gl(3, 50, labels = sub("S", "s", sub("V", "v", dni3[[3]]))))
all.equal(ii, iris) # TRUE
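
To make the relationship between the two layouts concrete, the following small sketch (with the illustrative names s3 and s) pulls the same measurements out of both objects. It assumes, as the Format section describes, that the third dimension of iris3 indexes species in the same order as the Species factor of iris.

```r
dim(iris3)                                   # 50 x 4 x 3

# First slice of the third dimension = first species (setosa);
# first column of the second dimension = sepal length.
s3 <- iris3[, 1, 1]
s  <- iris$Sepal.Length[iris$Species == "setosa"]

# Both extractions give the same 50 values.
all(s3 == s)
```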

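islands Areas of the World's Major Landmasses

Description

The areas in thousands of square miles of the landmasses which exceed 10,000 square miles.

Usage

islands

Format

A named vector of length 48.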

Source

The World Almanac and Book of Facts, 1975, page 406.

References


Examples

require(graphics)
dotchart(log(islands, 10),
         main = "islands data: log10(area) (log10(sq. miles))")
dotchart(log(islands[order(islands)], 10),
         main = "islands data: log10(area) (log10(sq. miles))")

JohnsonJohnson Quarterly Earnings per Johnson & Johnson Share

Description

Quarterly earnings (dollars) per Johnson & Johnson share 1960–80.

Usage

JohnsonJohnson

Format

A quarterly time series.

Source

Shumway, R. H. and Stoffer, D. S. (2000) Time Series Analysis and its Applications. Second Edition. Springer. Example 1.1.

Examples

require(stats); require(graphics)
JJ <- log10(JohnsonJohnson)
plot(JJ)
## This example gives a possible-non-convergence warning on some
## platforms, but does seem to converge on x86 Linux and Windows.
(fit <- StructTS(JJ, type = "BSM"))
tsdiag(fit)
sm <- tsSmooth(fit)
plot(cbind(JJ, sm[, 1], sm[, 3]-0.5), plot.type = "single",
     col = c("black", "green", "blue"))
abline(h = -0.5, col = "grey60")

monthplot(fit)

LakeHuron Level of Lake Huron 1875–1972

Description

Annual measurements of the level, in feet, of Lake Huron 1875–1972.

Usage

LakeHuron

Format

A time series of length 98.

Source

Brockwell, P. J. and Davis, R. A. (1991). Time Series and Forecasting Methods. Second edition. Springer, New York. Series A, page 555.

Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series and Forecasting. Springer, New York. Sections 5.1 and 7.6.

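lh Luteinizing Hormone in Blood Samples

Description

A regular time series giving the luteinizing hormone in blood samples taken at 10-minute intervals from a human female.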

Usage

lh

Source

P.J. Diggle (1990) Time Series: A Biostatistical Introduction. Oxford, table A.1, series 3

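LifeCycleSavings Life-Cycle Savings Data

Description

Data on the savings ratio 1960–1970.

Usage

LifeCycleSavings

Format

A data frame with 50 observations on 5 variables.

[,1] sr numeric aggregate personal savings
[,2] pop15 numeric % of population under 15
[,3] pop75 numeric % of population over 75
[,4] dpi numeric per-capita disposable income
[,5] ddpi numeric % growth rate of dpi

Details

Under the life-cycle savings hypothesis as developed by Franco Modigliani, the savings ratio (aggregate personal saving divided by disposable income) is explained by per-capita disposable income, the percentage rate of change in per-capita disposable income, and two demographic variables: the percentage of the population less than 15 years old and the percentage of the population over 75 years old. The data are averaged over the decade 1960–1970 to remove the business cycle or other short-term fluctuations.

Source

The data were obtained from Belsley, Kuh and Welsch (1980). They in turn obtained the data from Sterling (1977).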

References

Sterling, Arnie (1977) Unpublished BS Thesis. Massachusetts Institute of Technology.

Belsley, D. A., Kuh, E. and Welsch, R. E. (1980) Regression Diagnostics. New York: Wiley.

Examples

require(stats); require(graphics)
pairs(LifeCycleSavings, panel = panel.smooth,
      main = "LifeCycleSavings data")
fm1 <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
summary(fm1)

Loblolly Growth of Loblolly pine trees

Description

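The Loblolly data frame has 84 rows and 3 columns of records of the growth of Loblolly pine trees.

Usage

Loblolly

Format

An object of class c("nfnGroupedData", "nfGroupedData", "groupedData", "data.frame") containing the following columns:

height a numeric vector of tree heights (ft).

age a numeric vector of tree ages (yr).

Seed an ordered factor indicating the seed source for the tree. The ordering is according to increasing maximum height.

Details

This dataset was originally part of package nlme, which has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.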

Source

Kung, F. H. (1986), Fitting logistic growth curve with predetermined carrying capacity, in Proceedings of the Statistical Computing Section, American Statistical Association, 340–343.

Examples

require(stats); require(graphics)
plot(height ~ age, data = Loblolly, subset = Seed == 329,
     xlab = "Tree age (yr)", las = 1,
     ylab = "Tree height (ft)",
     main = "Loblolly data and fitted curve (Seed 329 only)")
fm1 <- nls(height ~ SSasymp(age, Asym, R0, lrc),
           data = Loblolly, subset = Seed == 329)
age <- seq(0, 30, length.out = 101)
lines(age, predict(fm1, list(age = age)))

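longley Longley's Economic Regression Data

Description

A macroeconomic data set which provides a well-known example for a highly collinear regression.

Usage

longley

Format

A data frame with 7 economic variables, observed yearly from 1947 to 1962 (n = 16).

GNP.deflator GNP implicit price deflator (1954 = 100)

GNP Gross National Product.

Unemployed number of unemployed.

Armed.Forces number of people in the armed forces.

Population ‘noninstitutionalized’ population aged 14 years and over.

Year the year (time).

Employed number of people employed.

The regression lm(Employed ~ .) is known to be highly collinear.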

Source

J. W. Longley (1967) An appraisal of least-squares programs from the point of view of the user. Journal of the American Statistical Association 62, 819–841.

References



Examples

require(stats); require(graphics)
## give the data set in the form it is used in S-PLUS:
longley.x <- data.matrix(longley[, 1:6])
longley.y <- longley[, "Employed"]
pairs(longley, main = "longley data")
summary(fm1 <- lm(Employed ~ ., data = longley))
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0))
plot(fm1)
par(opar)
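
The collinearity mentioned in the Format section can be inspected directly. The sketch below is one simple way to look at it, assuming pairwise correlations are an adequate first diagnostic; the names X and r are illustrative.

```r
library(stats)

# The six predictors used in lm(Employed ~ ., data = longley).
X <- longley[, names(longley) != "Employed"]

# Pairwise correlations among the predictors.
r <- cor(X)
round(r, 3)

# Largest off-diagonal absolute correlation.
max(abs(r[upper.tri(r)]))
```

Pairwise correlations only reveal part of the picture; a condition number such as kappa(model.matrix(fm1)) would also capture dependence involving more than two variables.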


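lynx Annual Canadian Lynx Trappings 1821–1934

Description

Annual numbers of lynx trappings for 1821–1934 in Canada. Taken from Brockwell & Davis (1991), this appears to be the series considered by Campbell & Walker (1977).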

Usage

lynx

Source

Brockwell, P. J. and Davis, R. A. (1991) Time Series and Forecasting Methods. Second edition. Springer. Series G (page 557).

References


Campbell, M. J. and A. M. Walker (1977). A Survey of statistical work on the Mackenzie River series of annual Canadian lynx trappings for the years 1821–1934 and a new analysis. Journal of the Royal Statistical Society series A, 140, 411–431.

morley Michelson Speed of Light Data

Description

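A classical data set of Michelson (but not the one with Morley) on measurements done in 1879 on the speed of light. The data consist of five experiments, each consisting of 20 consecutive ‘runs’. The response is the speed of light measurement, suitably coded (km/sec, with 299000 subtracted).

Usage

morley

Format

A data frame with 100 observations on the following 3 variables.

Expt The experiment number, from 1 to 5.

Run The run number within each experiment.

Speed Speed-of-light measurement.

Details

The data are here viewed as a randomized block experiment with ‘experiment’ and ‘run’ as the factors. ‘run’ may also be considered a quantitative variate to account for linear (or polynomial) changes in the measurement over the course of a single experiment.

Note

This is the same dataset as michelson in package MASS.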

Source

A. J. Weekes (1986) A Genstat Primer. London: Edward Arnold.

S. M. Stigler (1977) Do robust estimators work with real data? Annals of Statistics 5, 1055–1098. (See Table 6.)

A. A. Michelson (1882) Experimental determination of the velocity of light made at the United States Naval Academy, Annapolis. Astronomic Papers 1 135–8. U.S. Nautical Almanac Office. (See Table 24.)

Examples

require(stats); require(graphics)
michelson <- transform(morley,
                       Expt = factor(Expt), Run = factor(Run))
xtabs(~ Expt + Run, data = michelson)  # 5 x 20 balanced (two-way)
plot(Speed ~ Expt, data = michelson,
     main = "Speed of Light Data", xlab = "Experiment No.")
fm <- aov(Speed ~ Run + Expt, data = michelson)
summary(fm)
fm0 <- update(fm, . ~ . - Run)
anova(fm0, fm)
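
Because Speed is stored in coded form, it can help to undo the coding before reporting values. A minimal sketch, assuming the coding is exactly "km/sec minus 299000" as the Description states (speed_kms is an illustrative name):

```r
# Undo the coding: Speed is recorded as (km/sec - 299000).
speed_kms <- morley$Speed + 299000

# Mean measured speed per experiment, in km/sec.
tapply(speed_kms, morley$Expt, mean)
```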

mtcars Motor Trend Car Road Tests

Description

The data were extracted from the 1974 Motor Trend US magazine, and comprise fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

Usage

mtcars

Format

A data frame with 32 observations on 11 variables.

[, 1] mpg Miles/(US) gallon
[, 2] cyl Number of cylinders
[, 3] disp Displacement (cu.in.)
[, 4] hp Gross horsepower
[, 5] drat Rear axle ratio
[, 6] wt Weight (1000 lbs)
[, 7] qsec 1/4 mile time
[, 8] vs V/S
[, 9] am Transmission (0 = automatic, 1 = manual)
[,10] gear Number of forward gears
[,11] carb Number of carburetors

Source

Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391–411.

Examples

require(graphics)
pairs(mtcars, main = "mtcars data")
coplot(mpg ~ disp | as.factor(cyl), data = mtcars,
       panel = panel.smooth, rows = 1)

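nhtemp Average Yearly Temperatures in New Haven

Description

The mean annual temperature in degrees Fahrenheit in New Haven, Connecticut, from 1912 to 1971.

Usage

nhtemp

Format

A time series of 60 observations.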

Source

Vaux, J. E. and Brinker, N. B. (1972) Cycles, 1972, 117–121.

References



Examples

require(stats); require(graphics)
plot(nhtemp, main = "nhtemp data",
     ylab = "Mean annual temperature in New Haven, CT (deg. F)")

Nile Flow of the River Nile

Description

Measurements of the annual flow of the river Nile at Aswan (formerly Assuan), 1871–1970, in 10^8 m^3, “with apparent changepoint near 1898” (Cobb(1978), Table 1, p.249).

Usage

Nile

Format

A time series of length 100.

Source

Durbin, J. and Koopman, S. J. (2001) Time Series Analysis by State Space Methods. Oxford University Press. http://www.ssfpack.com/DKbook.html

References

Balke, N. S. (1993) Detecting level shifts in time series. Journal of Business and Economic Statistics 11, 81–92.

Cobb, G. W. (1978) The problem of the Nile: conditional solution to a change-point problem. Biometrika 65, 243–51.

Examples

require(stats); require(graphics)
par(mfrow = c(2, 2))
plot(Nile)
acf(Nile)
pacf(Nile)
ar(Nile) # selects order 2
cpgram(ar(Nile)$resid)
par(mfrow = c(1, 1))
arima(Nile, c(2, 0, 0))

## Now consider missing values, following Durbin & Koopman
NileNA <- Nile
NileNA[c(21:40, 61:80)] <- NA
arima(NileNA, c(2, 0, 0))
plot(NileNA)
pred <-
   predict(arima(window(NileNA, 1871, 1890), c(2, 0, 0)), n.ahead = 20)
lines(pred$pred, lty = 3, col = "red")
lines(pred$pred + 2*pred$se, lty = 2, col = "blue")
lines(pred$pred - 2*pred$se, lty = 2, col = "blue")
pred <-
   predict(arima(window(NileNA, 1871, 1930), c(2, 0, 0)), n.ahead = 20)
lines(pred$pred, lty = 3, col = "red")
lines(pred$pred + 2*pred$se, lty = 2, col = "blue")
lines(pred$pred - 2*pred$se, lty = 2, col = "blue")

## Structural time series models
par(mfrow = c(3, 1))
plot(Nile)
## local level model
(fit <- StructTS(Nile, type = "level"))
lines(fitted(fit), lty = 2)              # contemporaneous smoothing
lines(tsSmooth(fit), lty = 2, col = 4)   # fixed-interval smoothing
plot(residuals(fit)); abline(h = 0, lty = 3)
## local trend model
(fit2 <- StructTS(Nile, type = "trend")) ## constant trend fitted
pred <- predict(fit, n.ahead = 30)
## with 50% confidence interval
ts.plot(Nile, pred$pred,
        pred$pred + 0.67*pred$se, pred$pred -0.67*pred$se)

## Now consider missing values
plot(NileNA)
(fit3 <- StructTS(NileNA, type = "level"))
lines(fitted(fit3), lty = 2)
lines(tsSmooth(fit3), lty = 3)
plot(residuals(fit3)); abline(h = 0, lty = 3)

nottem Average Monthly Temperatures at Nottingham Castle, 1920–1939

Description

A time series object containing average air temperatures at Nottingham Castle for 20 years.

Usage

nottem

Source

Anderson, O. D. (1976) Time Series Analysis and Forecasting: The Box-Jenkins approach. Butterworths. Series R.

Examples

require(stats); require(graphics)
nott <- window(nottem, end = c(1936,12))
fit <- arima(nott, order = c(1,0,0), list(order = c(2,1,0), period = 12))
nott.fore <- predict(fit, n.ahead = 36)
ts.plot(nott, nott.fore$pred, nott.fore$pred+2*nott.fore$se,
        nott.fore$pred-2*nott.fore$se, gpars = list(col = c(1,1,4,4)))

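npk Classical N, P, K Factorial Experiment

Description

A classical N, P, K (nitrogen, phosphate, potassium) factorial experiment on the growth of peas conducted on 6 blocks. Each half of a fractional factorial design confounding the NPK interaction was used on 3 of the plots.

Usage

npk

Format

The npk data frame has 24 rows and 5 columns:

block which block (label 1 to 6).

N indicator (0/1) for the application of nitrogen.

P indicator (0/1) for the application of phosphate.

K indicator (0/1) for the application of potassium.

yield Yield of peas, in pounds/plot (the plots were (1/70) acre).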

Source

Imperial College, London, M.Sc. exercise sheet.

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

Examples

options(contrasts = c("contr.sum", "contr.poly"))
npk.aov <- aov(yield ~ block + N*P*K, npk)
npk.aov
summary(npk.aov)
coef(npk.aov)
options(contrasts = c("contr.treatment", "contr.poly"))
npk.aov1 <- aov(yield ~ block + N + K, data = npk)
summary.lm(npk.aov1)
se.contrast(npk.aov1, list(N=="0", N=="1"), data = npk)
model.tables(npk.aov1, type = "means", se = TRUE)

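occupationalStatus Occupational Status of Fathers and their Sons

Description

Cross-classification of a sample of British males according to each subject's occupational status and his father's occupational status.

Usage

occupationalStatus

Format

A table of counts, with classifying factors origin (father's occupational status; levels 1:8) and destination (son's occupational status; levels 1:8).

Source

Goodman, L. A. (1979) Simple Models for the Analysis of Association in Cross-Classifications having Ordered Categories. J. Am. Stat. Assoc., 74 (367), 537–552.

The data set has been in package gnm and was provided by the package authors.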

Examples


plot(occupationalStatus)

## Fit a uniform association model separating diagonal effects
Diag <- as.factor(diag(1:8))
Rscore <- scale(as.numeric(row(occupationalStatus)), scale = FALSE)
Cscore <- scale(as.numeric(col(occupationalStatus)), scale = FALSE)
modUnif <- glm(Freq ~ origin + destination + Diag + Rscore:Cscore,
               family = poisson, data = occupationalStatus)

summary(modUnif)
plot(modUnif) # 4 plots, with a warning about h_ii ~= 1

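Orange Growth of Orange Trees

Description

The Orange data frame has 35 rows and 3 columns of records of the growth of orange trees.

Usage

Orange

Format


Tree an ordered factor indicating the tree on which the measurement is made. The ordering is according to increasing maximum diameter.

age a numeric vector giving the age of the tree (days since 1968/12/31).

circumference a numeric vector of trunk circumferences (mm). This is probably “circumference at breast height”, a standard measurement in forestry.

Details

This dataset was originally part of package nlme, which has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.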

Source

Draper, N. R. and Smith, H. (1998), Applied Regression Analysis (3rd ed), Wiley (exercise 24.N).


Examples

require(stats); require(graphics)
coplot(circumference ~ age | Tree, data = Orange, show.given = FALSE)
fm1 <- nls(circumference ~ SSlogis(age, Asym, xmid, scal),
           data = Orange, subset = Tree == 3)
plot(circumference ~ age, data = Orange, subset = Tree == 3,
     xlab = "Tree age (days since 1968/12/31)",
     ylab = "Tree circumference (mm)", las = 1,
     main = "Orange tree data and fitted model (Tree 3 only)")
age <- seq(0, 1600, length.out = 101)
lines(age, predict(fm1, list(age = age)))

OrchardSprays Potency of Orchard Sprays

Description

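An experiment was conducted to assess the potency of various constituents of orchard sprays in repelling honeybees, using a Latin square design.

Usage

OrchardSprays

Format


[,1] rowpos numeric Row of the design
[,2] colpos numeric Column of the design
[,3] treatment factor Treatment level
[,4] decrease numeric Response

Details

Individual cells of dry comb were filled with lime sulphur emulsion in sucrose solution. Seven different concentrations of lime sulphur, ranging from 1/100 to 1/1,562,500 in successive factors of 1/5, were used, as well as a solution containing no lime sulphur.

The responses for the different solutions were obtained by releasing 100 bees into the chamber for two hours, and then measuring the decrease in volume of the solutions in the various cells.

An 8 × 8 Latin square design was used and the treatments were coded as follows:

A highest level of lime sulphur
B next highest level of lime sulphur
...
G lowest level of lime sulphur
H no lime sulphur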

Source

Finney, D. J. (1947) Probit Analysis. Cambridge.

References


Examples

require(graphics)
pairs(OrchardSprays, main = "OrchardSprays data")
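
The defining property of the Latin square layout described above can be checked from the data. The following is a small sketch of such a check (by_row and by_col are illustrative names): each treatment should appear exactly once in every row and every column of the design.

```r
# In a Latin square every treatment occurs exactly once in each row
# and exactly once in each column of the layout.
by_row <- with(OrchardSprays, table(rowpos, treatment))
by_col <- with(OrchardSprays, table(colpos, treatment))

all(by_row == 1) && all(by_col == 1)
```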

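PlantGrowth Results from an Experiment on Plant Growth

Description

Results from an experiment to compare yields (as measured by dried weight of plants) obtained under a control and two different treatment conditions.

Usage

PlantGrowth

Format

A data frame of 30 cases on 2 variables.

[, 1] weight numeric

[, 2] group factor

The levels of group are ‘ctrl’, ‘trt1’, and ‘trt2’.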

Source

Dobson, A. J. (1983) An Introduction to Statistical Modelling. London: Chapman and Hall.

Examples

## One factor ANOVA example from Dobson's book, cf. Table 7.4:
require(stats); require(graphics)
boxplot(weight ~ group, data = PlantGrowth, main = "PlantGrowth data",
        ylab = "Dried weight of plants", col = "lightgray",
        notch = TRUE, varwidth = TRUE)
anova(lm(weight ~ group, data = PlantGrowth))

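precip Annual Precipitation in US Cities

Description

The average amount of precipitation (rainfall) in inches for each of 70 United States (and Puerto Rico) cities.

Usage

precip

Format

A named vector of length 70.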

Source

Statistical Abstracts of the United States, 1975.

References


Examples

require(graphics)
dotchart(precip[order(precip)], main = "precip data")
title(sub = "Average annual precipitation (in.)")

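presidents Quarterly Approval Ratings of US Presidents

Description

The (approximately) quarterly approval rating for the President of the United States from the first quarter of 1945 to the last quarter of 1974.

Usage

presidents

Format

A time series of 120 values.

Details

The data are actually a fudged version of the approval ratings. See McNeil's book for details.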

Source

The Gallup Organisation.

References


Examples

require(stats); require(graphics)
plot(presidents, las = 1, ylab = "Approval rating (%)",
     main = "presidents data")

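pressure Vapor Pressure of Mercury as a Function of Temperature

Description

Data on the relation between temperature in degrees Celsius and vapor pressure of mercury in millimeters (of mercury).

Usage

pressure

Format

A data frame with 19 observations on 2 variables.

[, 1] temperature numeric temperature (deg C)
[, 2] pressure numeric pressure (mm)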

Source

Weast, R. C., ed. (1973) Handbook of Chemistry and Physics. CRC Press.

References


Examples

require(graphics)
plot(pressure, xlab = "Temperature (deg C)",
     ylab = "Pressure (mm of Hg)",
     main = "pressure data: Vapor Pressure of Mercury")
plot(pressure, xlab = "Temperature (deg C)", log = "y",
     ylab = "Pressure (mm of Hg)",
     main = "pressure data: Vapor Pressure of Mercury")

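Puromycin Reaction Velocity of an Enzymatic Reaction

Description

The Puromycin data frame has 23 rows and 3 columns of the reaction velocity versus substrate concentration in an enzymatic reaction involving untreated cells or cells treated with Puromycin (a protein synthesis inhibitor).

Usage

Puromycin

Format

This data frame contains the following columns:

conc a numeric vector of substrate concentrations (ppm)

rate a numeric vector of instantaneous reaction rates (counts/min/min)

state a factor with levels treated, untreated.

Details

Data on the velocity of an enzymatic reaction were obtained by Treloar (1974). The number of counts per minute of radioactive product from the reaction was measured as a function of substrate concentration in parts per million (ppm), and from these counts the initial rate (or velocity) of the reaction was calculated (counts/min/min). The experiment was conducted once with the enzyme treated with Puromycin, and once with the enzyme untreated.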

Source

Bates, D.M. and Watts, D.G. (1988), Nonlinear Regression Analysis and Its Applications, Wiley, Appendix A1.3.

Treloar, M. A. (1974), Effects of Puromycin on Galactosyltransferase in Golgi Membranes, M.Sc. Thesis, U. of Toronto.

See Also

SSmicmen for other models fitted to this dataset.

Examples


plot(rate ~ conc, data = Puromycin, las = 1,
     xlab = "Substrate concentration (ppm)",
     ylab = "Reaction velocity (counts/min/min)",
     pch = as.integer(Puromycin$state),
     col = as.integer(Puromycin$state),
     main = "Puromycin data and fitted Michaelis-Menten curves")
## simplest form of fitting the Michaelis-Menten model to these data
fm1 <- nls(rate ~ Vm * conc/(K + conc), data = Puromycin,
           subset = state == "treated",
           start = c(Vm = 200, K = 0.05))
fm2 <- nls(rate ~ Vm * conc/(K + conc), data = Puromycin,
           subset = state == "untreated",
           start = c(Vm = 160, K = 0.05))
summary(fm1)
summary(fm2)
## add fitted lines to the plot
conc <- seq(0, 1.2, length.out = 101)
lines(conc, predict(fm1, list(conc = conc)), lty = 1, col = 1)
lines(conc, predict(fm2, list(conc = conc)), lty = 2, col = 2)
legend(0.8, 120, levels(Puromycin$state),
       col = 1:2, lty = 1:2, pch = 1:2)

## using partial linearity
fm3 <- nls(rate ~ conc/(K + conc), data = Puromycin,
           subset = state == "treated", start = c(K = 0.05),
           algorithm = "plinear")

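quakes Locations of Earthquakes off Fiji

Description

The data set gives the locations of 1000 seismic events of MB (body-wave magnitude) > 4.0. The events occurred in a cube near Fiji since 1964.

Usage

quakes

Format


[,1] lat numeric Latitude of event
[,2] long numeric Longitude
[,3] depth numeric Depth (km)
[,4] mag numeric Richter magnitude
[,5] stations numeric Number of stations reporting

Details

There are two clear planes of seismic activity. One is a major plate junction; the other is the Tonga trench off New Zealand. These data constitute a subsample from a larger dataset containing 5000 observations.

Source

This is one of the Harvard PRIM-H project data sets. They in turn obtained it from Dr. John Woodhouse, Dept. of Geophysics, Harvard University.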

Examples

require(graphics)
pairs(quakes, main = "Fiji Earthquakes, N = 1000", cex.main = 1.2, pch = ".")

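randu Random Numbers from Congruential Generator RANDU

Description

400 triples of successive random numbers were taken from the VAX FORTRAN function RANDU running under VMS 1.5.

Usage

randu

Format

A data frame with 400 observations on 3 variables named x, y and z which give the first, second and third random number in the triple.

Details

In three-dimensional displays it is evident that the triples fall on 15 parallel planes in 3-space. This can be shown theoretically to be true for all triples from the RANDU generator.

These particular 400 triples start 5 apart in the sequence, that is they are ((U[5i+1], U[5i+2], U[5i+3]), i = 0, ..., 399), and they are rounded to 6 decimal places.

Under VMS versions 2.0 and higher, this problem has been fixed.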

Source

David Donoho


Examples

## The dataset can be re-generated by the following R code
seed <- as.double(1)
RANDU <- function() {
    seed <<- ((2^16 + 3) * seed) %% (2^31)
    seed/(2^31)
}
for(i in 1:400) {
    U <- c(RANDU(), RANDU(), RANDU(), RANDU(), RANDU())
    print(round(U[1:3], 6))
}
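
The parallel-planes property stated in the Details follows from the recurrence itself: since (2^16 + 3)^2 is congruent to 6 (2^16 + 3) - 9 modulo 2^31, any three successive outputs x, y, z satisfy 9x - 6y + z = integer. The sketch below checks this numerically on freshly generated values rather than on the stored dataset; the names u, v and the sample size 3000 are illustrative choices.

```r
# Regenerate RANDU output with the recurrence used in the example above.
seed <- as.double(1)
RANDU <- function() {
    seed <<- ((2^16 + 3) * seed) %% (2^31)
    seed / (2^31)
}
u <- replicate(3000, RANDU())

# Overlapping triples (x, y, z) = (u[k], u[k+1], u[k+2]).
x <- u[1:2998]; y <- u[2:2999]; z <- u[3:3000]

# 9x - 6y + z is always an integer, so every triple lies on one of a
# small number of parallel planes 9x - 6y + z = const.
v <- 9 * x - 6 * y + z
range(v - round(v))
length(unique(round(v)))    # number of distinct planes hit, at most 15
```

Applied to the stored randu data frame, the same combination should be close to an integer only up to the 6-decimal rounding of the data.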

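rivers Lengths of Major North American Rivers

Description

This data set gives the lengths (in miles) of 141 “major” rivers in North America, as compiled by the US Geological Survey.

Usage

rivers

Format

A vector containing 141 observations.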

Source

World Almanac and Book of Facts, 1975, page 406.

References


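rock Measurements on Petroleum Rock Samples

Description

Measurements on 48 rock samples from a petroleum reservoir.

Usage

rock

Format

A data frame with 48 rows and 4 numeric columns.

[,1] area area of pore space, in pixels out of 256 by 256
[,2] peri perimeter in pixels
[,3] shape perimeter/sqrt(area)
[,4] perm permeability in milli-Darcies

Details

Twelve core samples from petroleum reservoirs were sampled by 4 cross-sections. Each core sample was measured for permeability, and each cross-section has total area of pores, total perimeter of pores, and shape.

Source

Data from BP Research, image analysis by Ronit Katz, U. Oxford.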

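sleep Student's Sleep Data

Description

Data which show the effect of two soporific drugs (increase in hours of sleep compared to control) on 10 patients.

Usage

sleep

Format


[, 1] extra numeric increase in hours of sleep
[, 2] group factor drug given
[, 3] ID factor patient ID

Details

The group variable name may be misleading about the data: they represent measurements on 10 persons, not in groups.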

Source

Cushny, A. R. and Peebles, A. R. (1905) The action of optical isomers: II hyoscines. The Journal of Physiology 32, 501–510.

Student (1908) The probable error of the mean. Biometrika, 6, 20.

References

Scheffé, Henry (1959) The Analysis of Variance. New York, NY: Wiley.

Examples

require(stats)
## Student's paired t-test
with(sleep,
     t.test(extra[group == 1],
            extra[group == 2], paired = TRUE))

## The sleep *prolongations*
sleep1 <- with(sleep, extra[group == 2] - extra[group == 1])
summary(sleep1)
stripchart(sleep1, method = "stack", xlab = "hours",
           main = "Sleep prolongation (n = 10)")
boxplot(sleep1, horizontal = TRUE, add = TRUE,
        at = .6, pars = list(boxwex = 0.5, staplewex = 0.25))

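stackloss Nitric Acid Plant Data

Description

Operational data of a plant for the oxidation of ammonia to nitric acid.

Usage

stackloss

stack.x
stack.loss

Format

stackloss is a data frame with 21 observations on 4 variables.

[,1] Air Flow Flow of cooling air
[,2] Water Temp Cooling water inlet temperature
[,3] Acid Conc. Concentration of acid [per 1000, minus 500]
[,4] stack.loss Stack loss

For compatibility with S-PLUS, the data sets stack.x, a matrix with the first three (independent) variables of the data frame, and stack.loss, the numeric vector giving the fourth (dependent) variable, are provided as well.

Details

“Obtained from 21 days of operation of a plant for the oxidation of ammonia (NH3) to nitric acid (HNO3). The nitric oxides produced are absorbed in a countercurrent absorption tower”. (Brownlee, cited by Dodge, slightly reformatted by MM.)

Air Flow represents the rate of operation of the plant. Water Temp is the temperature of cooling water circulated through coils in the absorption tower. Acid Conc. is the concentration of the acid circulating, minus 50, times 10: that is, 89 corresponds to 58.9 per cent acid. stack.loss (the dependent variable) is 10 times the percentage of the ingoing ammonia to the plant that escapes from the absorption column unabsorbed; that is, an (inverse) measure of the over-all efficiency of the plant.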

Source

Brownlee, K. A. (1960, 2nd ed. 1965) Statistical Theory and Methodology in Science and Engineering. New York: Wiley. pp. 491–500.

References


Dodge, Y. (1996) The guinea pig of multiple regression. In: Robust Statistics, Data Analysis, and Computer Intensive Methods; In Honor of Peter Huber’s 60th Birthday, 1996, Lecture Notes in Statistics 109, Springer-Verlag, New York.

Examples

require(stats)
summary(lm.stack <- lm(stack.loss ~ stack.x))
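
The codings described in the Details can be inverted to recover values in per cent. A minimal sketch, assuming the data frame columns are named Acid.Conc. and stack.loss as in standard R installations (acid_pct and loss_pct are illustrative names):

```r
# Acid Conc. is stored as (percent - 50) * 10, so invert that coding:
# a stored value of 89 becomes 50 + 89/10 = 58.9 per cent.
acid_pct <- 50 + stackloss$Acid.Conc. / 10

# stack.loss is 10 times the percentage of ammonia lost.
loss_pct <- stackloss$stack.loss / 10

head(data.frame(acid_pct, loss_pct))
```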

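state US State Facts and Figures

Description

Data sets related to the 50 states of the United States of America.

Usage

state.abb
state.area
state.center
state.division
state.name
state.region
state.x77

Details

R currently contains the following “state” data sets. Note that all data are arranged according to alphabetical order of the state names.

state.abb: character vector of 2-letter abbreviations for the state names.

state.area: numeric vector of state areas (in square miles).

state.center: list with components named x and y giving the approximate geographic center of each state in negative longitude and latitude. Alaska and Hawaii are placed just off the West Coast.

state.division: factor giving state divisions (New England, Middle Atlantic, South Atlantic, East South Central, West South Central, East North Central, West North Central, Mountain, and Pacific).

state.name: character vector giving the full state names.

state.region: factor giving the region (Northeast, South, North Central, West) that each state belongs to.

state.x77: matrix with 50 rows and 8 columns giving the following statistics in the respective columns.

Population: population estimate as of July 1, 1975.
Income: per capita income (1974).
Illiteracy: illiteracy (1970, percent of population).
Life Exp: life expectancy (1969–71).
Murder: murder and non-negligent manslaughter rate per 100,000 population (1976).
HS Grad: percent high-school graduates (1970).
Frost: mean number of days with temperature below freezing (1931–1960) in capital or large city.
Area: land area in square miles.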

Source

U.S. Department of Commerce, Bureau of the Census (1977) Statistical Abstract of the United States.

U.S. Department of Commerce, Bureau of the Census (1977) County and City Data Book.
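
Because all the state objects share the same alphabetical ordering, they can be combined column-wise without any matching step. The sketch below illustrates this; the name states and the density calculation are additions for illustration, and the factor of 1000 assumes that Population is recorded in thousands, which the magnitudes suggest but this manual does not state.

```r
# Combine the matrix of statistics with the region factor; rows line up
# because every state.* object is ordered by state name.
states <- data.frame(state.x77, region = state.region)

# Persons per square mile (assumption: Population is in thousands).
states$density <- 1000 * states$Population / states$Area

# Median density by region.
aggregate(density ~ region, data = states, FUN = median)
```

Note that data.frame() turns the column names "Life Exp" and "HS Grad" into Life.Exp and HS.Grad.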

References


sunspot.month Monthly Sunspot Data

Description

Monthly numbers of sunspots, as from the World Data Center, aka SIDC. This is the version of the data that will occasionally be updated when new counts become available.

Usage

sunspot.month

Format

The univariate time series sunspot.year and sunspot.month contain 289 and 2988 observations, respectively. The objects are of class "ts".

Author(s)

R

Source

WDC-SILSO, Solar Influences Data Analysis Center (SIDC), Royal Observatory of Belgium, Av. Circulaire, 3, B-1180 BRUSSELS. Currently at http://www.sidc.be/silso/datafiles

See Also

sunspot.month is a longer version of sunspots; the latter runs until 1983 and is kept fixed (for reproducibility as an example dataset).

Examples

require(stats); require(graphics)
## Compare the monthly series
plot (sunspot.month,
      main="sunspot.month & sunspots [package'datasets']", col=2)
lines(sunspots) # -> faint differences where they overlap

## Now look at the difference:
all(tsp(sunspots)     [c(1,3)] ==
    tsp(sunspot.month)[c(1,3)]) ## Start & Periodicity are the same
n1 <- length(sunspots)
table(eq <- sunspots == sunspot.month[1:n1]) #> 132 are different!
i <- which(!eq)
rug(time(eq)[i])
s1 <- sunspots[i] ; s2 <- sunspot.month[i]
cbind(i = i, time = time(sunspots)[i], sunspots = s1, ss.month = s2,
      perc.diff = round(100*2*abs(s1-s2)/(s1+s2), 1))

## How to recreate the "old" sunspot.month (R <= 3.0.3):
.sunspot.diff <- cbind(
    i = c(1202L, 1256L, 1258L, 1301L, 1407L, 1429L, 1452L, 1455L,
          1663L, 2151L, 2329L, 2498L, 2594L, 2694L, 2819L),
    res10 = c(1L, 1L, 1L, -1L, -1L, -1L, 1L, -1L,
              1L, 1L, 1L, 1L, 1L, 20L, 1L))
ssm0 <- sunspot.month[1:2988]
with(as.data.frame(.sunspot.diff), ssm0[i] <<- ssm0[i] - res10/10)
sunspot.month.0 <- ts(ssm0, start = 1749, frequency = 12)

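sunspot.year Yearly Sunspot Data, 1700–1988

Description

Yearly numbers of sunspots from 1700 to 1988 (rounded to one decimal place).

Note that monthly numbers are available as sunspot.month, though starting slightly later.

Usage

sunspot.year

Format

The univariate time series sunspot.year contains 289 observations, and is of class "ts".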

Source

H. Tong (1996) Non-Linear Time Series. Clarendon Press, Oxford, p. 471.

See Also

For monthly sunspot numbers, see sunspot.month and sunspots.

Regularly updated yearly sunspot numbers are available from WDC-SILSO, Royal Observatory of Belgium, at http://www.sidc.be/silso/datafiles

Examples

utils::str(sm <- sunspots)# the monthly version that is kept unchanged
utils::str(sy <- sunspot.year)
## The common time interval
(t1 <- c(max(start(sm), start(sy)),     1)) # Jan 1749
(t2 <- c(min(  end(sm)[1],end(sy)[1]), 12)) # Dec 1983
s.m <- window(sm, start=t1, end=t2)
s.y <- window(sy, start=t1, end=t2[1]) # {irrelevant warning}
stopifnot(length(s.y) * 12 == length(s.m),
          ## The yearly series is close to the averages of the monthly one:
          all.equal(s.y, aggregate(s.m, FUN = mean), tol = 0.0020))
## NOTE: Strangely, weighting correctly by the number of days per month
##       (using 28.25 for February) is not closer than the simple mean:
ndays <- c(31, 28.25, rep(c(31,30, 31,30, 31), 2))
all.equal(s.y, aggregate(s.m, FUN = mean))                     # 0.0013
all.equal(s.y, aggregate(s.m, FUN = weighted.mean, w = ndays)) # 0.0017

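sunspots Monthly Sunspot Numbers, 1749–1983

Description

Monthly mean relative sunspot numbers from 1749 to 1983. Collected at Swiss Federal Observatory, Zurich until 1960, then Tokyo Astronomical Observatory.

Usage

sunspots

Format

A time series of monthly data from 1749 to 1983.

Source

Andrews, D. F. and Herzberg, A. M. (1985) Data: A Collection of Problems from Many Fields for the Student and Research Worker. New York: Springer-Verlag.

See Also

sunspot.month has a longer (and slightly different) series; sunspot.year is a much shorter one. See there for getting more current sunspot numbers.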

Examples

require(graphics)plot(sunspots, main = "sunspots data", xlab = "Year",

ylab = "Monthly sunspot numbers")

swiss 59

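swiss Swiss Fertility and Socioeconomic Indicators Data

Description

Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.

Usage

swiss

Format

A data frame with 47 observations on 6 variables, each of which is in percent, i.e., in [0, 100].

[,1] Fertility         Ig, 'common standardized fertility measure'
[,2] Agriculture       % of males involved in agriculture as occupation
[,3] Examination       % of draftees receiving the highest mark on the army examination
[,4] Education         % of draftees with education beyond primary school
[,5] Catholic          % 'catholic' (as opposed to 'protestant')
[,6] Infant.Mortality  live births who live less than 1 year

All variables but 'Fertility' give proportions of the population.

Details

(paraphrasing Mosteller and Tukey):

Switzerland, in 1888, was entering a period known as the demographic transition; i.e., its fertility was beginning to fall from the high level typical of underdeveloped countries.

The data were collected for 47 French-speaking "provinces" at about 1888.

Here, all variables are scaled to [0, 100], whereas in the original all but "Catholic" were scaled to [0, 1].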
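
As a minimal sketch (not part of the original help page), the scaling described in Details can be inspected and undone in a few lines. The object names swiss01 and cols are illustrative assumptions, and dividing by 100 is simply the inverse of the rescaling stated above.

```r
## Every variable in the shipped data lies in [0, 100]
sapply(swiss, range)

## Sketch: put all variables except Catholic back on a [0, 1] scale
## (swiss01 and cols are made-up names used only for this illustration)
swiss01 <- swiss
cols <- setdiff(names(swiss), "Catholic")
swiss01[cols] <- swiss[cols] / 100
summary(swiss01$Fertility)
```

The Catholic column is left untouched because Details singles it out as the one variable that was not on the [0, 1] scale in the original.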

Note

Files for all 182 districts in 1888 and other years are available at https://opr.princeton.edu/archive/pefp/switz.aspx.

They state that the variables Examination and Education are averages for 1887, 1888 and 1889.

Source

Project "16P5", pages 549–551 in

Mosteller, F. and Tukey, J. W. (1977) Data Analysis and Regression: A Second Course in Statistics. Addison-Wesley, Reading Mass.

indicating their source as "Data used by permission of Franice van de Walle. Office of Population Research, Princeton University, 1976. Unpublished data assembled under NICHD contract number No 1-HD-O-2077."

References


Examples

require(stats); require(graphics)
pairs(swiss, panel = panel.smooth, main = "swiss data",
      col = 3 + (swiss$Catholic > 50))
summary(lm(Fertility ~ . , data = swiss))

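Theoph Pharmacokinetics of the Anti-asthmatic Drug Theophylline

Description

Theoph is a data frame with 132 rows and 5 columns, taken from an experiment on the pharmacokinetics of theophylline.

Usage

Theoph

Format

An object of class c("nfnGroupedData", "nfGroupedData", "groupedData", "data.frame") containing the following columns:

Subject an ordered factor with levels 1, ..., 12 identifying the subject on whom the observation was made. The ordering is by increasing maximum concentration of theophylline observed.

Wt weight of the subject (kg).

Dose dose of theophylline administered orally to the subject (mg/kg).

Time time from drug administration until the sample was drawn (hr).

conc theophylline concentration in the sample (mg/L).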
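
The ordering of Subject described above can be checked directly. This is a small sketch rather than part of the original help page; the name peak is an illustrative assumption.

```r
## Number of samples recorded for each of the twelve subjects
table(Theoph$Subject)

## Peak concentration per subject, listed in factor-level order;
## by the Format description these peaks should be non-decreasing
peak <- tapply(Theoph$conc, Theoph$Subject, max)
peak
all(diff(as.numeric(peak)) >= 0)
```

Because tapply() returns its result in the order of the factor levels, a non-decreasing sequence of peaks is exactly what the stated ordering implies.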

Details

Boeckmann, Sheiner and Beal (1994) report data from a study by Dr. Robert Upton of the kinetics of the anti-asthmatic drug (bronchodilator) theophylline. Twelve subjects were given oral doses of theophylline, and serum concentrations were then measured at 11 time points over the next 25 hours.

These data are analyzed in Davidian and Giltinan (1995) and Pinheiro and Bates (2000) using a two-compartment open pharmacokinetic model, for which a self-starting model function, SSfol, is available.

This dataset was originally part of package nlme, which has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.

Source

Boeckmann, A. J., Sheiner, L. B. and Beal, S. L. (1994), NONMEM Users Guide: Part V, NONMEM Project Group, University of California, San Francisco.

Davidian, M. and Giltinan, D. M. (1995) Nonlinear Models for Repeated Measurement Data, Chapman & Hall (section 5.5, p. 145 and section 6.6, p. 176)

Pinheiro, J. C. and Bates, D. M. (2000) Mixed-effects Models in S and S-PLUS, Springer (Appendix A.29)

See Also

SSfol

Examples


require(stats); require(graphics)
coplot(conc ~ Time | Subject, data = Theoph, show.given = FALSE)
Theoph.4 <- subset(Theoph, Subject == 4)
fm1 <- nls(conc ~ SSfol(Dose, Time, lKe, lKa, lCl),
           data = Theoph.4)
summary(fm1)
plot(conc ~ Time, data = Theoph.4,
     xlab = "Time since drug administration (hr)",
     ylab = "Theophylline concentration (mg/L)",
     main = "Observed concentrations and fitted model",
     sub = "Theophylline data - Subject 4 only",
     las = 1, col = 4)
xvals <- seq(0, par("usr")[2], length.out = 55)
lines(xvals, predict(fm1, newdata = list(Time = xvals)),
      col = 4)

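Titanic Survival of Passengers on the Titanic

Description

This data set provides information on the fate of those aboard the fatal maiden voyage of the ocean liner 'Titanic', summarized according to economic status (class), sex, age and survival.

Usage

Titanic

Format

A 4-dimensional array resulting from cross-tabulating 2201 observations on 4 variables. The variables and their levels are as follows:

No  Name      Levels
1   Class     1st, 2nd, 3rd, Crew
2   Sex       Male, Female
3   Age       Child, Adult
4   Survived  No, Yes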
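
The layout of the array can be confirmed, and the survival proportions by class derived, with a short sketch such as the following. It is an illustration added here, not part of the original help page, and by_class is a hypothetical name.

```r
dim(Titanic)       # one extent per variable: Class, Sex, Age, Survived
sum(Titanic)       # total number of people tabulated

## Collapse over Sex and Age, then turn counts into row proportions
by_class <- apply(Titanic, c(1, 4), sum)
round(prop.table(by_class, margin = 1), 3)
```

Using margin = 1 makes each row (class) sum to one, so the "Yes" column reads as the survival proportion within that class.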

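Details

The sinking of the Titanic is a famous event, and new books are still being published about it. Many well-known facts, from the proportions of first-class passengers to the 'women and children first' policy, and the fact that that policy was not entirely successful in saving the women and children in the third class, are reflected in the survival rates for the various classes of passenger.

These data were originally collected by the British Board of Trade in its investigation of the sinking. Note that there is not complete agreement among primary sources as to the exact numbers on board, rescued, or lost.

Due in particular to the great success of the film 'Titanic', public interest in the Titanic has risen in recent years. Very detailed data about the passengers is now available on the Internet, for example at Encyclopedia Titanica (http://www.rmplc.co.uk/eduweb/sites/phind).

Source

Dawson, Robert J. MacG. (1995), The 'Unusual Episode' Data Revisited. Journal of Statistics Education, 3. https://www.amstat.org/publications/jse/v3n3/datasets.dawson.html

The source provides a data set recording class, sex, age, and survival status for each person on board the Titanic, and is based on data originally collected by the British Board of Trade and reprinted in:

British Board of Trade (1990), Report on the Loss of the 'Titanic' (S.S.). British Board of Trade Inquiry Report (reprint). Gloucester, UK: Allan Sutton Publishing.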

Examples

require(graphics)
mosaicplot(Titanic, main = "Survival on the Titanic")
## Higher survival rates in children?
apply(Titanic, c(3, 4), sum)
## Higher survival rates in females?
apply(Titanic, c(2, 4), sum)
## Use loglm() in package 'MASS' for further analysis

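ToothGrowth The Effect of Vitamin C on Tooth Growth

Description

The length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods (orange juice or ascorbic acid, a form of vitamin C coded as VC).

Usage

ToothGrowth

Format


[,1] len   numeric  Tooth length
[,2] supp  factor   Supplement type (VC or OJ)
[,3] dose  numeric  Dose in milligrams/day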
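
The design described above (two supplement types crossed with three doses) can be tabulated as a quick check. This sketch is an added illustration, not part of the original help page.

```r
## Number of animals in each supplement-by-dose cell
with(ToothGrowth, table(supp, dose))

## Mean tooth length per cell
aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
```

A balanced table here is what makes a two-factor comparison of supp and dose straightforward.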


Source

C. I. Bliss (1952) The Statistics of Bioassay. Academic Press.

References


Crampton, E. W. (1947) The growth of the odontoblast of the incisor teeth as a criterion of vitamin C intake of the guinea pig. The Journal of Nutrition 33(5): 491–504. http://jn.nutrition.org/content/33/5/491.full.pdf

Examples

require(graphics)
coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
       xlab = "ToothGrowth data: length vs dose, given type of supplement")

treering Tree Ring Data, -6000–1979

Description

Contains normalized tree-ring widths in dimensionless units.

Usage

treering

Format

A univariate time series with 7981 observations. The object is of class "ts".

Each tree ring corresponds to one year.

Details

The data were recorded by Donald A. Graybill, 1980, from Gt Basin Bristlecone Pine 2805M, 3726-11810 in Methuselah Walk, California.

Source

Time Series Data Library: http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/, series ‘CA535.DAT’

References

For some photos of Methuselah Walk see http://www.ltrr.arizona.edu/~hallman/sitephotos/meth.html


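trees Girth, Height and Volume of Trees

Description

This data set provides measurements of the girth, height and volume of timber in 31 felled black cherry trees. Note that girth is the diameter of the tree measured at 4 ft 6 in above the ground.

Usage

trees

Format

A data frame with 31 observations on 3 variables.

[,1] Girth   numeric  Tree diameter in inches
[,2] Height  numeric  Height in ft
[,3] Volume  numeric  Volume of timber in cubic ft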

Source

Ryan, T. A., Joiner, B. L. and Ryan, B. F. (1976) The Minitab Student Handbook. Duxbury Press.

References

Atkinson, A. C. (1985) Plots, Transformations and Regression. Oxford University Press.

Examples

require(stats); require(graphics)
pairs(trees, panel = panel.smooth, main = "trees data")
plot(Volume ~ Girth, data = trees, log = "xy")
coplot(log(Volume) ~ log(Girth) | Height, data = trees,
       panel = panel.smooth)
summary(fm1 <- lm(log(Volume) ~ log(Girth), data = trees))
summary(fm2 <- update(fm1, ~ . + log(Height), data = trees))
step(fm2)
## i.e., Volume ~= c * Height * Girth^2 seems reasonable

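UCBAdmissions Student Admissions at UC Berkeley

Description

Aggregate data on applicants to graduate school at Berkeley for the six largest departments in 1973, classified by admission and sex.

Usage

UCBAdmissions

Format

A 3-dimensional array resulting from cross-tabulating 4526 observations on 3 variables. The variables and their levels are as follows:

No  Name    Levels
1   Admit   Admitted, Rejected
2   Gender  Male, Female
3   Dept    A, B, C, D, E, F

Details

This data set is frequently used for illustrating Simpson's paradox, see Bickel et al (1975). At issue is whether the data show evidence of sex bias in admission practices. There were 2691 male applicants, of whom 1198 (44.5%) were admitted, compared with 1835 female applicants, of whom 557 (30.4%) were admitted. This gives a sample odds ratio of 1.83, indicating that males were almost twice as likely to be admitted. In fact, graphical methods (as in the example below) or log-linear modelling show that the apparent association between admission and sex stems from differences in the tendency of males and females to apply to the individual departments (females tended to apply more to departments with higher rejection rates).

This data set can also be used for illustrating methods for the graphical display of categorical data, such as the general-purpose mosaicplot or the fourfoldplot for 2-by-2-by-k tables.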
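
The aggregate figures quoted in Details can be reproduced from the array itself. The following is a sketch added for illustration (the names agg, rate, odds, odds_ratio and dept_rate are assumptions, not part of the original help page). Computed from the counts quoted above, the odds ratio evaluates to roughly 1.84, close to the rounded value given in the text.

```r
## Collapse over departments: Admit x Gender counts
agg <- apply(UCBAdmissions, c(1, 2), sum)
agg

## Overall admission rates (percent) by gender
rate <- agg["Admitted", ] / colSums(agg)
round(100 * rate, 1)

## Sample odds ratio of admission, male versus female
odds <- agg["Admitted", ] / agg["Rejected", ]
odds_ratio <- unname(odds["Male"] / odds["Female"])
odds_ratio

## Admission rates (percent) within each department, by gender
dept_rate <- UCBAdmissions["Admitted", , ] / apply(UCBAdmissions, c(2, 3), sum)
round(100 * dept_rate, 1)
```

Comparing the pooled rates with the per-department rates is the numerical counterpart of the mosaic plots in the Examples: the pooled gap need not appear within individual departments.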

References

Bickel, P. J., Hammel, E. A., and O’Connell, J. W. (1975) Sex bias in graduate admissions: Data from Berkeley. Science, 187, 398–403.

Examples

require(graphics)
## Data aggregated over departments
apply(UCBAdmissions, c(1, 2), sum)
mosaicplot(apply(UCBAdmissions, c(1, 2), sum),
           main = "Student admissions at UC Berkeley")
## Data for individual departments
opar <- par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
for(i in 1:6)
  mosaicplot(UCBAdmissions[,,i],
    xlab = "Admit", ylab = "Sex",
    main = paste("Department", LETTERS[i]))
mtext(expression(bold("Student admissions at UC Berkeley")),
      outer = TRUE, cex = 1.5)
par(opar)

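UKDriverDeaths UK Car Driver Casualty Data 1969–84

Description

UKDriverDeaths is a time series giving the monthly totals of car drivers killed or seriously injured from Jan 1969 to Dec 1984. Compulsory wearing of seat belts was introduced on 31 Jan 1983.

Seatbelts gives more information on the same problem.

Usage

UKDriverDeaths
Seatbelts

Format

Seatbelts is a multiple time series, with columns

DriversKilled car drivers killed.

drivers same as UKDriverDeaths.

front front-seat passengers killed or seriously injured.

rear rear-seat passengers killed or seriously injured.

kms distance driven.

PetrolPrice petrol price.

VanKilled number of van ('light goods vehicle') drivers.

law 0/1: was the law in effect that month?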
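
The relationship between the two objects, and the role of the law indicator, can be sketched as follows. This is an added illustration, not part of the original help page; drivers and law are hypothetical local names, and a raw comparison of means is only descriptive, not an estimate of the effect of the legislation.

```r
drivers <- Seatbelts[, "drivers"]
law     <- Seatbelts[, "law"]

## The 'drivers' column reproduces UKDriverDeaths
all(drivers == UKDriverDeaths)

## Mean monthly casualties while the law was not (0) / was (1) in effect
tapply(as.numeric(drivers), as.numeric(law), mean)
```

The Examples section fits seasonal ARIMA models, which is the more appropriate way to account for trend and seasonality before attributing any change to the law.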

Source

Harvey, A. C. (1989) Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press, pp. 519–523.

Durbin, J. and Koopman, S. J. (2001) Time Series Analysis by State Space Methods. Oxford University Press. http://www.ssfpack.com/dkbook/

References

Harvey, A. C. and Durbin, J. (1986) The effects of seat belt legislation on British road casualties: A case study in structural time series modelling. Journal of the Royal Statistical Society series B, 149, 187–227.

Examples

require(stats); require(graphics)
## Work with the pre-seatbelt period to identify a model, using logs
work <- window(log10(UKDriverDeaths), end = 1982+11/12)
par(mfrow = c(3, 1))
plot(work); acf(work); pacf(work)
par(mfrow = c(1, 1))
(fit <- arima(work, c(1, 0, 0), seasonal = list(order = c(1, 0, 0))))
z <- predict(fit, n.ahead = 24)
ts.plot(log10(UKDriverDeaths), z$pred, z$pred+2*z$se, z$pred-2*z$se,
        lty = c(1, 3, 2, 2), col = c("black", "red", "blue", "blue"))

## Now see the effect of the explanatory variables
X <- Seatbelts[, c("kms", "PetrolPrice", "law")]
X[, 1] <- log10(X[, 1]) - 4
arima(log10(Seatbelts[, "drivers"]), c(1, 0, 0),
      seasonal = list(order = c(1, 0, 0)), xreg = X)

UKgas UK Gas Consumption Data

Description

Quarterly UK gas consumption from 1960Q1 to 1986Q4, in millions of therms.

Usage

UKgas

Format

A quarterly time series of length 108.

Source


Examples

## maybe str(UKgas) ; plot(UKgas) ...

UKLungDeaths Monthly Deaths from Lung Diseases in the UK

Description

Three time series giving the monthly deaths from bronchitis, emphysema and asthma in the UK, 1974–1979: both sexes (ldeaths), males (mdeaths) and females (fdeaths).
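
Since ldeaths covers both sexes, it should equal the sum of the other two series month by month. The sketch below, added for illustration and not part of the original help page, checks this and reports the time span.

```r
## The combined series is the month-by-month sum of the two others
all(ldeaths == mdeaths + fdeaths)

## Time span and sampling frequency
start(ldeaths); end(ldeaths); frequency(ldeaths)

## Average share of female deaths over the period
round(mean(fdeaths / ldeaths), 3)
```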

Usage

ldeaths
fdeaths
mdeaths

Source

P. J. Diggle (1990) Time Series: A Biostatistical Introduction. Oxford, table A.3

Examples

require(stats); require(graphics) # for time
plot(ldeaths)
plot(mdeaths, fdeaths)
## Better labels:
yr <- floor(tt <- time(mdeaths))
plot(mdeaths, fdeaths,
     xy.labels = paste(month.abb[12*(tt - yr)], yr-1900, sep = "'"))

USAccDeaths Accidental Deaths in the US 1973–1978

Description

A time series giving the monthly totals of accidental deaths in the USA. The values for the first six months of 1979 are 7798 7406 8363 8460 9217 9316.
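
The six additional 1979 values quoted above are not part of the shipped series, but they can be appended for forecasting exercises. The following is a minimal sketch; extra and acc_ext are hypothetical names introduced only for this illustration.

```r
## The first six months of 1979, as quoted in the Description
extra <- c(7798, 7406, 8363, 8460, 9217, 9316)

## Append them, keeping the original start and monthly frequency
acc_ext <- ts(c(USAccDeaths, extra),
              start = start(USAccDeaths),
              frequency = frequency(USAccDeaths))
end(acc_ext)
window(acc_ext, start = c(1979, 1))
```

A typical use is to fit a model to USAccDeaths alone and compare its six-step-ahead forecasts with these held-out values.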

Usage

USAccDeaths

Source

P. J. Brockwell and R. A. Davis (1991) Time Series: Theory and Methods. Springer, New York.

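USArrests Violent Crime Rates by US State

Description

This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percent of the population living in urban areas.

Usage

USArrests

Format


[,1] Murder    numeric  Murder arrests (per 100,000)
[,2] Assault   numeric  Assault arrests (per 100,000)
[,3] UrbanPop  numeric  Percent urban population
[,4] Rape      numeric  Rape arrests (per 100,000)

Source

World Almanac and Book of facts 1975. (Crime rates).

Statistical Abstracts of the United States 1975. (Urban rates).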

References


See Also

The state data sets.

Examples

require(graphics)
pairs(USArrests, panel = panel.smooth, main = "USArrests data")

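USJudgeRatings Lawyers' Ratings of State Judges in the US Superior Court

Description

Lawyers' ratings of state judges in the US Superior Court.

Usage

USJudgeRatings

Format

A data frame containing 43 observations on 12 numeric variables.

[,1]  CONT  Number of contacts of lawyer with judge.
[,2]  INTG  Judicial integrity.
[,3]  DMNR  Demeanor.
[,4]  DILG  Diligence.
[,5]  CFMG  Case flow managing.
[,6]  DECI  Prompt decisions.
[,7]  PREP  Preparation for trial.
[,8]  FAMI  Familiarity with law.
[,9]  ORAL  Sound oral rulings.
[,10] WRIT  Sound written rulings.
[,11] PHYS  Physical ability.
[,12] RTEN  Worthy of retention.

Source

New Haven Register, 14 January, 1977 (from John Hartigan).

Examples

require(graphics)
pairs(USJudgeRatings, main = "USJudgeRatings data")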

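USPersonalExpenditure US Personal Expenditure Data

Description

This data set consists of United States personal expenditures (in billions of dollars) in the categories food and tobacco, household operation, medical and health, personal care, and private education, for the years 1940, 1945, 1950, 1955 and 1960.

Usage

USPersonalExpenditure

Format

A matrix with 5 rows and 5 columns.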

Source

The World Almanac and Book of Facts, 1962, page 756.

References

Tukey, J. W. (1977) Exploratory Data Analysis. Addison-Wesley.


Examples

require(stats) # for medpolish
USPersonalExpenditure
medpolish(log10(USPersonalExpenditure))

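uspop US Population Data

Description

This data set gives the population of the United States (in millions) as recorded by the decennial census for the period 1790–1970.

Usage

uspop

Format

A time series of 19 values.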
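
The 19 values correspond to one census per decade. As a short sketch (an added illustration, with growth a hypothetical name), the census years and the percentage growth between consecutive censuses can be obtained as follows.

```r
## Census years covered by the series
as.numeric(time(uspop))

## Percentage growth from each census to the next
growth <- 100 * (exp(diff(log(uspop))) - 1)
round(growth, 1)
```

Working on the log scale, as in the log = "y" plot of the Examples, turns constant percentage growth into a straight line.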


Source


Examples

require(graphics)
plot(uspop, log = "y", main = "uspop data", xlab = "Year",
     ylab = "U.S. Population (millions)")

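VADeaths Death Rates in Virginia (1940)

Description

Death rates per 1000 in Virginia in 1940.

Usage

VADeaths

Format

A matrix with 5 rows and 4 columns.

Details

The death rates are measured per 1000 population per year. They are cross-classified by age group (rows) and population group (columns). The age groups are: 50–54, 55–59, 60–64, 65–69, 70–74 and the population groups are Rural/Male, Rural/Female, Urban/Male and Urban/Female.

This provides a rather nice 3-way analysis of variance example.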
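
For modelling, the cross-classified matrix is usually easier to handle in long form. The following sketch (not part of the original help page; va_long and the column names age and group are illustrative assumptions) converts it with as.table().

```r
dim(VADeaths)
dimnames(VADeaths)

## One row per age group x population group combination
va_long <- as.data.frame(as.table(VADeaths), responseName = "Drate")
names(va_long)[1:2] <- c("age", "group")
head(va_long)
```

The Examples section builds a similar data frame by hand, splitting the population group into separate gender and site factors.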

Source

Molyneaux, L., Gilliam, S. K., and Florant, L. C. (1947) Differences in Virginia death rates by color, sex, age, and rural or urban residence. American Sociological Review, 12, 525–535.

References


Examples

require(stats); require(graphics)
n <- length(dr <- c(VADeaths))
nam <- names(VADeaths)
d.VAD <- data.frame(
  Drate = dr,
  age = rep(ordered(rownames(VADeaths)), length.out = n),
  gender = gl(2, 5, n, labels = c("M", "F")),
  site = gl(2, 10, labels = c("rural", "urban")))
coplot(Drate ~ as.numeric(age) | gender * site, data = d.VAD,
       panel = panel.smooth, xlab = "VADeaths data - Given: gender")
summary(aov.VAD <- aov(Drate ~ .^2, data = d.VAD))
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0))
plot(aov.VAD)
par(opar)

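volcano Topographic Information on Maunga Whau Volcano

Description

Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic field. This data set gives topographic information for Maunga Whau on a 10 m by 10 m grid.

Usage

volcano

Format

A matrix with 87 rows and 61 columns, rows corresponding to grid lines running east to west and columns to grid lines running south to north.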
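
Because the grid spacing is 10 m, row and column indices can be turned into distances in metres. A minimal sketch follows (an added illustration; x and y are hypothetical names).

```r
dim(volcano)

## Grid coordinates in metres, using the 10 m spacing from the Description
x <- 10 * seq_len(nrow(volcano))
y <- 10 * seq_len(ncol(volcano))
range(x); range(y)

## Lowest and highest recorded values on the grid
range(volcano)
```

These x and y vectors can be passed as the first two arguments of filled.contour() or persp() so that the axes are labelled in metres rather than in the default unit interval.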

Source

Digitized from a topographic map by Ross Ihaka. These data should not be regarded as accurate.

See Also

filled.contour for a nice plot.

Examples

require(grDevices); require(graphics)
filled.contour(volcano, color.palette = terrain.colors, asp = 1)
title(main = "volcano data: filled contour map")

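warpbreaks The Number of Breaks in Yarn during Weaving

Description

This data set gives the number of warp breaks per loom, where a loom corresponds to a fixed length of yarn.

Usage

warpbreaks

Format


[,1] breaks   numeric  The number of breaks
[,2] wool     factor   The type of wool (A or B)
[,3] tension  factor   The level of tension (L, M, H)

There are measurements on 9 looms for each of the six types of warp (AL, AM, AH, BL, BM, BH).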

Source

Tippett, L. H. C. (1950) Technological Applications of Statistics. Wiley. Page 106.

References

Tukey, J. W. (1977) Exploratory Data Analysis. Addison-Wesley.


See Also

xtabs for ways to display these data as a table.
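
As a brief sketch of the xtabs usage mentioned above (an added illustration, not part of the original help page), the design and the total number of breaks per cell can be tabulated as follows.

```r
## Number of looms in each wool-by-tension cell
xtabs(~ wool + tension, data = warpbreaks)

## Total number of breaks in each cell
xtabs(breaks ~ wool + tension, data = warpbreaks)
```

With an empty left-hand side xtabs() counts rows, while a response on the left-hand side is summed within each cell.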

Examples

require(stats); require(graphics)
summary(warpbreaks)
opar <- par(mfrow = c(1, 2), oma = c(0, 0, 1.1, 0))
plot(breaks ~ tension, data = warpbreaks, col = "lightgray",
     varwidth = TRUE, subset = wool == "A", main = "Wool A")
plot(breaks ~ tension, data = warpbreaks, col = "lightgray",
     varwidth = TRUE, subset = wool == "B", main = "Wool B")
mtext("warpbreaks data", side = 3, outer = TRUE)
par(opar)
summary(fm1 <- lm(breaks ~ wool*tension, data = warpbreaks))
anova(fm1)

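women Average Heights and Weights for American Women

Description

This data set gives the average heights and weights for American women aged 30–39.

Usage

women

Format


[,1] height  numeric  Height (in)
[,2] weight  numeric  Weight (lbs)

Details

The data set appears to have been taken from the American Society of Actuaries Build and Blood Pressure Study for some (unknown to us) earlier year.

The World Almanac notes: "The figures represent weights in ordinary indoor clothing and shoes, and heights with shoes".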

Source

The World Almanac and Book of Facts, 1975.

References


Examples

require(graphics)
plot(women, xlab = "Height (in)", ylab = "Weight (lb)",
     main = "women data: American women aged 30-39")

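WorldPhones The World's Telephones

Description

The number of telephones in various regions of the world (in thousands).

Usage

WorldPhones

Format

A matrix with 7 rows and 7 columns. The columns of the matrix give the figures for a given region, and the rows the figures for a year.

The regions are: North America, Europe, Asia, South America, Oceania, Africa, Central America.

The years are: 1951, 1956, 1957, 1958, 1959, 1960, 1961.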
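
The region and year labels are stored as the dimnames of the matrix, so the layout can be checked directly. The sketch below is an added illustration, not part of the original help page.

```r
dim(WorldPhones)
rownames(WorldPhones)   # years
colnames(WorldPhones)   # regions

## Each region's share of the yearly total, in percent
round(100 * prop.table(WorldPhones, margin = 1), 1)
```

Note that the row names are character strings; the Examples rely on matplot() converting them to numbers for the year axis.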

Source

AT&T (1961) The World’s Telephones.

References


Examples

require(graphics)
matplot(rownames(WorldPhones), WorldPhones, type = "b", log = "y",
        xlab = "Year", ylab = "Number of telephones (1000's)")
legend(1951.5, 80000, colnames(WorldPhones), col = 1:6, lty = 1:5,
       pch = rep(21, 7))
title(main = "World phones data: log scale for response")

WWWusage Internet Usage per Minute

Description

A time series of the numbers of users connected to the Internet through a server every minute.

Usage

WWWusage

Format

A time series of length 100.

Source


References

Makridakis, S., Wheelwright, S. C. and Hyndman, R. J. (1998) Forecasting: Methods and Applications. Wiley.

Examples

require(graphics)
work <- diff(WWWusage)
par(mfrow = c(2, 1)); plot(WWWusage); plot(work)
## Not run:
require(stats)
aics <- matrix(, 6, 6, dimnames = list(p = 0:5, q = 0:5))
for(q in 1:5) aics[1, 1+q] <- arima(WWWusage, c(0, 1, q),
    optim.control = list(maxit = 500))$aic
for(p in 1:5)
   for(q in 0:5) aics[1+p, 1+q] <- arima(WWWusage, c(p, 1, q),
       optim.control = list(maxit = 500))$aic
round(aics - min(aics, na.rm = TRUE), 2)

## End(Not run)


Index

∗Topic datasets
ability.cov, airmiles, AirPassengers, airquality, anscombe, attenu, attitude, austres, beavers, BJsales, BOD, cars, ChickWeight, chickwts, CO2, co2, crimtab, discoveries, DNase, esoph, euro, eurodist, EuStockMarkets, faithful, Formaldehyde, freeny, HairEyeColor, Harman23.cor, Harman74.cor, Indometh, infert, InsectSprays, iris, islands, JohnsonJohnson, LakeHuron, lh, LifeCycleSavings, Loblolly, longley, lynx, morley, mtcars, nhtemp, Nile, nottem, npk, occupationalStatus, Orange, OrchardSprays, PlantGrowth, precip, presidents, pressure, Puromycin, quakes, randu, rivers, rock, sleep, stackloss, state, sunspot.month, sunspot.year, sunspots, swiss, Theoph, Titanic, ToothGrowth, treering, trees, UCBAdmissions, UKDriverDeaths, UKgas, UKLungDeaths, USAccDeaths, USArrests, USJudgeRatings, USPersonalExpenditure, uspop, VADeaths, volcano, warpbreaks, women, WorldPhones, WWWusage

∗Topic package
datasets-package

ability.cov, airmiles, AirPassengers, airquality, anscombe, attenu, attitude, austres

beaver1 (beavers), beaver2 (beavers), beavers, BJsales, BOD

cars, ChickWeight, chickwts, chisq.test, CO2, co2, colnames, crimtab

datasets (datasets-package), datasets-package, discoveries, DNase

esoph, euro, eurodist, EuStockMarkets

faithful, fdeaths (UKLungDeaths), filled.contour, Formaldehyde, fourfoldplot, freeny

HairEyeColor, Harman23.cor, Harman74.cor

Indometh, infert, InsectSprays, integer, iris, iris3 (iris), islands

JohnsonJohnson

LakeHuron, ldeaths (UKLungDeaths), lh, LifeCycleSavings, Loblolly, loglin, longley, lynx

matplot, mdeaths (UKLungDeaths), morley, mosaicplot, mtcars

nhtemp, Nile, nottem, npk

occupationalStatus, Orange, OrchardSprays

PlantGrowth, precip, presidents, pressure, Puromycin

quakes

randu, rivers, rock, rownames

Seatbelts (UKDriverDeaths), sleep, SSbiexp, SSfol, SSlogis, SSmicmen, stack.loss (stackloss), stack.x (stackloss), stackloss, state, sunspot.month, sunspot.year, sunspots, swiss

table, Theoph, Titanic, ToothGrowth, treering, trees

UCBAdmissions, UKDriverDeaths, UKgas, UKLungDeaths, USAccDeaths, USArrests, UScitiesD (eurodist), USJudgeRatings, USPersonalExpenditure, uspop

VADeaths, volcano

warpbreaks, women, WorldPhones, WWWusage

xtabs
