Thinking about Data:
• Terms: matrix, unit of analysis, case, variable, code
Types of Data
• Microlevel: data collected on the characteristics of individual cases (people, houses, events), that is, discrete units. For example, a record for an individual carries characteristic information on sex, age, state of residence, etc.
• Aggregate: Tabular data representing counts of units falling into particular categories, e.g., populations of states. The state is the unit of analysis; the variables are the name of the state and the population of the state.
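The relationship between the two types can be sketched in a few lines of Python: aggregate data are what you get when microlevel cases are counted by category. The records and the helper name `aggregate_by` below are hypothetical, made up purely for illustration.

```python
from collections import Counter

# Hypothetical microlevel data: one record (case) per person,
# each with the same set of variables.
people = [
    {"sex": "F", "age": 34, "state": "MN"},
    {"sex": "M", "age": 51, "state": "WI"},
    {"sex": "F", "age": 27, "state": "MN"},
    {"sex": "M", "age": 62, "state": "IA"},
    {"sex": "F", "age": 45, "state": "MN"},
]

def aggregate_by(cases, variable):
    """Count cases per category of `variable`; returns one row per category."""
    counts = Counter(case[variable] for case in cases)
    return [{variable: key, "population": n} for key, n in sorted(counts.items())]

# In the aggregate table the unit of analysis is the state, not the person.
state_table = aggregate_by(people, "state")
for row in state_table:
    print(row["state"], row["population"])
```

Note that the aggregation is one-way: the state table cannot be turned back into individual records, which is why microlevel data support more kinds of analysis.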
Sources of Data
• Survey: data collected specifically for a research purpose, e.g., CPS, GSS, census.
• Administrative record: data generated by an organization's routine operations rather than for research, e.g., records of immigrant arrivals by port; tax filings; vital registration records; case files of judicial proceedings; health records.
Univariate Statistics
• Types of variables: nominal, ordinal, interval, ratio
• Measures of central tendency: mean, median, mode
• Measures of dispersion: standard deviation, n-tiles (quantiles), range, coefficient of variation
• Measures of shape: skewness, kurtosis.
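As a minimal sketch, the measures listed above can be computed with Python's standard library. The data values are hypothetical. Skewness and kurtosis here use the simple moment definitions (kurtosis reported as excess over the normal value of 3); statistical packages often apply small-sample corrections, so their figures may differ slightly.

```python
import math
import statistics

# Hypothetical interval-level data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

def central_moment(values, k):
    """k-th central moment, dividing by n (not n - 1)."""
    m = statistics.fmean(values)
    return sum((x - m) ** k for x in values) / len(values)

# Central tendency
mean = statistics.fmean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Dispersion
sd = statistics.stdev(data)             # sample standard deviation (n - 1)
value_range = max(data) - min(data)
cv = sd / mean                          # coefficient of variation
quartiles = statistics.quantiles(data, n=4)  # n-tiles with n = 4

# Shape (moment-based definitions; an assumption, see the note above)
m2 = central_moment(data, 2)
skewness = central_moment(data, 3) / m2 ** 1.5
excess_kurtosis = central_moment(data, 4) / m2 ** 2 - 3

print(mean, median, mode, value_range)
print(round(sd, 3), round(cv, 3), round(skewness, 3), round(excess_kurtosis, 3))
```

The mean and standard deviation only make sense for interval and ratio variables; for nominal variables the mode is the only meaningful measure of central tendency, and for ordinal variables the median as well.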
STATS YRBUILT CONCOST / Mean Min Max SD CV Kurtosis Median Range SEK SEM SES Skewness Sum Variance N CIM=.95

                      YRBUILT        CONCOST
N of cases               1235           1127
Minimum               888.000         20.000
Maximum               929.000       5200.000
Range                  41.000       5180.000
Sum               1116660.000     354277.000
Median                904.000        250.000
Mean                  904.178        314.354
95% CI Upper          904.724        332.888
95% CI Lower          903.633        295.820
Std. Error              0.278          9.446
Standard Dev            9.770        317.116
Variance               95.451     100562.250
C.V.                    0.011          1.009
Skewness(G1)            0.409          5.563
SE Skewness             0.070          0.073
Kurtosis(G2)           -0.528         59.859
SE Kurtosis             0.139          0.146
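Several rows of this output follow from a few of the others, which gives a useful way to read it. The sketch below recomputes them from N, Sum, and Standard Dev as reported above. Two assumptions are built in: the confidence interval uses the normal value z = 1.96 (the package appears to use a t value, so the last digits differ slightly), and the standard errors of skewness and kurtosis use the large-sample approximations sqrt(6/N) and sqrt(24/N). The function name `derived_stats` is ours, not part of the statistics package.

```python
import math

def derived_stats(n, total, sd, z=1.96):
    """Recompute mean, standard error, CV, CI and shape SEs from N, Sum, SD."""
    mean = total / n
    sem = sd / math.sqrt(n)              # standard error of the mean
    return {
        "mean": mean,
        "sem": sem,
        "cv": sd / mean,                 # coefficient of variation
        "ci_lower": mean - z * sem,      # normal approximation (assumption)
        "ci_upper": mean + z * sem,
        "ses": math.sqrt(6 / n),         # large-sample SE of skewness
        "sek": math.sqrt(24 / n),        # large-sample SE of kurtosis
    }

# Inputs copied from the output above.
yrbuilt = derived_stats(n=1235, total=1116660.0, sd=9.770)
concost = derived_stats(n=1127, total=354277.0, sd=317.116)

for name, stats in (("YRBUILT", yrbuilt), ("CONCOST", concost)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The comparison of the two variables is instructive: YRBUILT has a C.V. near 0.01 and a mean almost equal to its median, while CONCOST has a C.V. near 1, a mean well above its median, and large positive skewness, all signs of a long right tail.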
[Figures: histograms of YRBUILT (880 to 930) and of CONCOST (full range 0 to 6000, restricted to 0 to 2000, and on a log scale), each with Count and Proportion per Bar on the vertical axes; normal probability plots of CONCOST on the linear scale and on a log scale, with Expected Value for Normal Distribution on the vertical axis.]
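The log-scale plots suggest why a strongly right-skewed variable such as a cost is often examined on a log scale. As a rough sketch with hypothetical values (not the actual CONCOST data), the code below compares moment-based skewness before and after a base-10 log transform, and computes the expected normal values used for the vertical axis of a normal probability plot. The plotting position (i - 0.5)/n is one common convention and is an assumption here; packages differ.

```python
import math
from statistics import NormalDist, fmean

# Hypothetical right-skewed costs, for illustration only.
costs = [20, 50, 100, 150, 200, 250, 300, 400, 800, 5200]

def skewness(values):
    """Moment-based skewness: third central moment over variance**1.5."""
    m = fmean(values)
    m2 = sum((x - m) ** 2 for x in values) / len(values)
    m3 = sum((x - m) ** 3 for x in values) / len(values)
    return m3 / m2 ** 1.5

def normal_scores(n):
    """Expected standard-normal values for n ordered observations."""
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

log_costs = [math.log10(x) for x in costs]
skew_raw = skewness(costs)
skew_log = skewness(log_costs)

# Pair each sorted observation with its expected normal value;
# points close to a straight line indicate an approximately normal shape.
scores = normal_scores(len(costs))
plot_points = list(zip(sorted(log_costs), scores))

print(round(skew_raw, 2), round(skew_log, 2))
```

On these made-up values the skewness drops sharply after the transform, which is the pattern the log-scale histogram and probability plot are meant to reveal.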