Cent Tend SD Corr Reg

  • date post

    08-Apr-2018

  • 8/6/2019 Cent Tend SD Corr Reg

    1/69

    Measurement of central tendency

    Measurement of dispersion

    Correlation

    Regression

    Statistical methods

    Data & its types

    Definition of data: facts, figures, enumerations and other materials, past and present, serving as a basis for study and analysis.

    Data are the raw material for analysis; they provide the basis for testing hypotheses and for developing scales and tables.

    Data help researchers draw inferences on specific issues and problems.

    The quality of findings depends on the relevance, adequacy and reliability of the data.

    Types of data (not in the statistical sense)

    A. 1. Personal data (individual as a source)
          Demographic and socio-economic characteristics
          Behavioural variables: attitudes, behaviour, opinions; awareness, preferences, knowledge; practices, intentions

       2. Organisational data (organisational sources): archives, manuscript libraries, museums

       3. Territorial data: economic structure, occupation pattern

    B. I Secondary (paper method)

    Methods & Techniques of Data Collection

    I - Secondary data

    How to scrutinize

    Published & unpublished

    Methods where used

    A - Meta-analysis
    B - Historical method
    C - Content analysis
    D - Informetrics
    E - Use studies

    II - Primary data

    A - Records & relics
    B - Observation
    C - Experimentation
    D - Simulation
    E - Ask people orally
    F - Ask people in writing
    G - Panel study
    H - Projective techniques
    I - Sociometry
    J - Case study
    - Interview / depth interview / schedule
    - Mail survey / questionnaire
    - Mechanical devices

    Data Sources

    Primary Data

    Secondary Data
    1. Internet sites / webpages of different companies and organizations
    2. Central and local govt. studies and reports
    3. Rules on international trading, imports and exports, state budgets
    4. FICCI (Federation of Indian Chambers of Commerce and Industry), CII (Confederation of Indian Industry), ASSOCHAM (Associated Chambers of Commerce and Industry)
    5. Policies on foreign direct investment

    Skewness and Kurtosis: some examples

    [Figure: histogram of Educational Attainment; y-axis: Frequency; Std. Dev = 1.81]

    [Figure: histogram of Reason for Termination; y-axis: Frequency]

    Pictogram

    Annotated box plot

    Describing Data Numerically

    Central Tendency
        Arithmetic Mean
        Median
        Mode

    Variation
        Range
        Interquartile Range
        Variance
        Standard Deviation
        Coefficient of Variation

    Measures of Central Tendency

    Overview

    Central Tendency
        Mean: arithmetic average, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
        Median: midpoint of ranked values
        Mode: most frequently observed value

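    Arithmetic Mean

    The arithmetic mean (mean) is the most common measure of central tendency.

    For a population of N values:

    $$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$

    where $x_1, \ldots, x_N$ are the population values and $N$ is the population size.

    For a sample of size n:

    $$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

    where $x_1, \ldots, x_n$ are the observed values and $n$ is the sample size.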

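    Arithmetic Mean (continued)

    The most common measure of central tendency

    Mean = sum of values divided by the number of values

    Affected by extreme values (outliers)

    [Figure: values 1, 2, 3, 4, 5 on a 0-10 number line] Mean = 3

    $$\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$

    [Figure: values 1, 2, 3, 4, 10 on a 0-10 number line] Mean = 4

    $$\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$$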
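The effect of a single extreme value on the mean can be checked with a few lines of Python. This is a minimal sketch using the two five-value data sets from the slide; the variable names are illustrative only.

```python
from statistics import mean

# The two data sets shown on the slide: the second replaces 5 with the outlier 10.
values = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]

mean_plain = mean(values)          # (1 + 2 + 3 + 4 + 5) / 5
mean_outlier = mean(with_outlier)  # (1 + 2 + 3 + 4 + 10) / 5

print(mean_plain, mean_outlier)
```

Changing one value out of five moves the mean from 3 to 4, which is the sensitivity to outliers the slide describes.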

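    Median

    In an ordered list, the median is the middle number (50% above, 50% below)

    Not affected by extreme values

    For grouped data:

    $$\text{Median} = L + \frac{\frac{N}{2} - C}{f}\,h = Q_2$$

    where $L$ is the lower limit of the median class, $N$ the total frequency, $C$ the cumulative frequency before the median class, $f$ the frequency of the median class and $h$ the class width.

    Use: compare the knowledge level in two subjects for a group of students by the median

    [Figure: two dot plots on a 0-10 axis; Median = 3 in both]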

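    Quartiles, Deciles, Percentiles

    Like the median, which divides the data into two parts, quartiles divide the data into four parts, deciles into ten parts and percentiles into 100 parts.

    $$Q_j = L + \frac{\frac{jN}{4} - \mathrm{pcf}}{f}\,h, \quad j = 1, 2, 3$$

    $$D_j = L + \frac{\frac{jN}{10} - \mathrm{pcf}}{f}\,h, \quad j = 1, 2, \ldots, 9$$

    $$P_j = L + \frac{\frac{jN}{100} - \mathrm{pcf}}{f}\,h, \quad j = 1, 2, \ldots, 99$$

    Here $L$ is the lower limit of the class containing the quantile, $\mathrm{pcf}$ the cumulative frequency preceding that class, $f$ the frequency of that class and $h$ the class width.

    Empirical relation: Mode = 3 Median - 2 Mean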
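The quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), so one function can cover all three. The sketch below is an illustration, not part of the original deck: the function name `grouped_quantile` and the small frequency table are hypothetical.

```python
def grouped_quantile(boundaries, freqs, j, parts):
    """Interpolate the j-th of `parts` quantiles from a grouped frequency table.

    boundaries: ascending class limits, one more entry than freqs
    freqs:      class frequencies
    parts:      4 for quartiles, 10 for deciles, 100 for percentiles
    """
    if not 0 < j < parts:
        raise ValueError("j must satisfy 0 < j < parts")
    n = sum(freqs)
    target = j * n / parts  # the jN/parts term of the formula
    cum = 0                 # cumulative frequency before the current class (pcf)
    for i, f in enumerate(freqs):
        if f > 0 and cum + f >= target:
            lower = boundaries[i]
            width = boundaries[i + 1] - boundaries[i]
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("frequency table is empty")


# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40 (N = 20).
boundaries = [0, 10, 20, 30, 40]
freqs = [4, 6, 8, 2]

q1 = grouped_quantile(boundaries, freqs, 1, 4)     # first quartile
q2 = grouped_quantile(boundaries, freqs, 2, 4)     # second quartile (median)
d9 = grouped_quantile(boundaries, freqs, 9, 10)    # ninth decile
p50 = grouped_quantile(boundaries, freqs, 50, 100) # 50th percentile
print(q1, q2, d9, p50)
```

With `parts=4` and `j=2` the function reduces to the grouped-median formula, which is why the second quartile and the 50th percentile agree.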

    Finding the Median

    The location of the median:

    $$\text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data}$$

    If the number of values is odd, the median is the middle number

    If the number of values is even, the median is the average of the two middle numbers

    Note that $\frac{n + 1}{2}$ is not the value of the median, only the position of the median in the ranked data
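The difference between the position (n + 1)/2 and the median value itself is easy to see in code. A minimal sketch follows; the helper `median_position` and both data sets are hypothetical illustrations.

```python
from statistics import median


def median_position(n):
    """Return the 1-based position (n + 1) / 2 of the median in ranked data."""
    return (n + 1) / 2


# Hypothetical data: one list with an odd count, one with an even count.
odd = sorted([10, 1, 3, 2, 4])
even = sorted([10, 1, 3, 2, 4, 5])

pos_odd = median_position(len(odd))    # whole number: a single middle value
pos_even = median_position(len(even))  # ends in .5: average the two neighbours
med_odd = median(odd)
med_even = median(even)

print(pos_odd, med_odd)
print(pos_even, med_even)
```

For the even-sized list the position falls between the third and fourth ranked values, so the median is their average.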

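    Mode

    A measure of central tendency

    Value that occurs most often

    Not affected by extreme values

    Used for either numerical or categorical data

    There may be several modes

    For grouped data:

    $$\text{Mode} = L + \frac{f - f_{-1}}{2f - f_{-1} - f_{1}}\,h$$

    where $L$ is the lower limit of the modal class, $f$ its frequency, $f_{-1}$ the frequency before the modal class, $f_{1}$ the frequency after the modal class and $h$ the class width.

    [Figure: dot plot on a 0-14 axis; Mode = 9]

    [Figure: dot plot on a 0-6 axis; No Mode]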
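The cases "one mode", "several modes" and "no mode" can be told apart with `statistics.multimode` (Python 3.8+). The sketch below is illustrative: the helper `modes`, its convention that a data set has no mode when every distinct value occurs equally often, and the three data sets are all assumptions, not taken from the deck.

```python
from statistics import multimode


def modes(values):
    """Return every most-frequent value, or [] if all values are equally frequent.

    Treating "all values tie" as "no mode" is a convention assumed here.
    """
    found = multimode(values)
    all_tie = len(found) == len(set(values)) and len(values) > 1
    return [] if all_tie else found


# Hypothetical data sets.
single = modes([1, 3, 5, 9, 9, 9, 12])  # one value dominates
several = modes([1, 1, 2, 2, 3])        # two values tie for most frequent
none = modes([0, 1, 2, 3, 4, 5, 6])     # every value occurs exactly once

print(single, several, none)
```

Because the mode only counts occurrences, the same function works unchanged for categorical data such as strings.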

    Review Example

    Five houses on a hill by the beach

    House Prices:
    $2,000,000
    500,000
    300,000
    100,000
    100,000

    Review Example: Summary Statistics

    House Prices:
    $2,000,000
    500,000
    300,000
    100,000
    100,000
    Sum 3,000,000

    Mean: ($3,000,000/5) = $600,000

    Median: middle value of ranked data = $300,000

    Mode: most frequent value = $100,000
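The three summary statistics for the house prices can be reproduced with Python's standard library. This is a small sketch using the five prices from the slide; the variable names are illustrative.

```python
from statistics import mean, median, mode

# House prices from the review example, in dollars.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

total = sum(prices)
avg = mean(prices)    # total divided by the number of houses
mid = median(prices)  # middle value of the ranked prices
most = mode(prices)   # most frequent price

print(total, avg, mid, most)
```

The single $2,000,000 house pulls the mean to twice the median, which illustrates why the median is often preferred for skewed data such as prices.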

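    Example

    Ungrouped data (three columns of values):

    Column 1: 5, 9, 7, 9, 10, 9, 5
    Column 2: 1, 2, 3, 4, 5, 7, 7
    Column 3: 19, 20, 22, 22, 17

                       Column 1    Column 2    Column 3
    Mean               7.714286    4.142857    20
    Mode               9           7           22
    Median             9           4           20
    Sample variance    4.238095    5.47619     4.5

    Grouped data:

    Class    Frequency    C.F. (less than)    C.F. (more than)
    5-10     5            5                   49
    10-15    6            11                  44
    15-20    15           26                  38
    20-25    10           36                  23
    25-30    5            41                  13
    30-35    4            45                  8
    35-40    2            47                  4
    40-45    2            49                  2

    Median = L + [(N/2 - C)/f] h

    Mode = L + [(f - f_{-1})/(2f - f_{-1} - f_{1})] h

    Median class = class containing the (total frequency / 2)th value = 15-20, i.e. the class where the cumulative frequency reaches 26

    Modal class = class with the maximum frequency = 15-20, i.e. 15 is the maximum frequency

    Median = 15 + [((1/2) x 49 - 11)/15] x 5 = 19.5

    Mode = 15 + [(15 - 6)/(2 x 15 - 6 - 10)] x 5 ≈ 18.21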
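The grouped median and mode for this frequency table can be recomputed step by step. The sketch below uses the class limits and frequencies from the example; the variable names and the way the median and modal classes are located are illustrative choices.

```python
from itertools import accumulate

# Frequency table from the example: classes 5-10, 10-15, ..., 40-45.
lower_limits = [5, 10, 15, 20, 25, 30, 35, 40]
freqs = [5, 6, 15, 10, 5, 4, 2, 2]
h = 5  # class width

n = sum(freqs)                 # total frequency N
cum = list(accumulate(freqs))  # "less than" cumulative frequencies

# Median class: first class whose cumulative frequency reaches N/2.
m = next(i for i, c in enumerate(cum) if c >= n / 2)
c_before = cum[m - 1] if m > 0 else 0
median_value = lower_limits[m] + (n / 2 - c_before) / freqs[m] * h

# Modal class: class with the largest frequency.
k = freqs.index(max(freqs))
f_prev = freqs[k - 1] if k > 0 else 0
f_next = freqs[k + 1] if k < len(freqs) - 1 else 0
mode_value = lower_limits[k] + (freqs[k] - f_prev) / (2 * freqs[k] - f_prev - f_next) * h

print(n, lower_limits[m], median_value)
print(lower_limits[k], round(mode_value, 2))
```

Both the median class and the modal class turn out to be 15-20 here, so the two formulas share the same lower limit L = 15.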

    Which measure of location is the best?

    Mean is generally used, unless extreme values (outliers) exist

    Then median is often used, since the median is not sensitive to extreme values.

    Example: median home prices may be reported for a region, since the median is less sensitive to outliers

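    Geometric mean & Harmonic mean

    Geometric mean is the nth root of the product of n observations (e.g. average percent increase in sales or production). It is best suited to constructing index numbers.

    $$\text{G.M.} = \operatorname{antilog}\left(\frac{\sum \log X}{N}\right)$$

    Combined geometric mean of two groups:

    $$\log G = \frac{N_1 \log G_1 + N_2 \log G_2}{N_1 + N_2}$$

    Harmonic mean: restricted use, such as the average rate of increase of profits or the average price at which an article has been sold.

    $$\text{H.M.} = \frac{N}{\sum \frac{1}{X}}, \qquad \text{H.M.} = \frac{N}{\sum \frac{f}{X}} \ \text{(frequency distribution)}$$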
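The log-based geometric mean, the combined geometric mean of two groups and the harmonic mean can be cross-checked against Python's `statistics` module (`geometric_mean` needs Python 3.8+). All data values below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from math import isclose, log10
from statistics import geometric_mean, harmonic_mean

# Hypothetical yearly growth factors (1.10 means a 10% increase).
growth = [1.10, 1.20, 1.50]
gm = geometric_mean(growth)
gm_via_logs = 10 ** (sum(log10(x) for x in growth) / len(growth))  # antilog form

# Combined geometric mean of two hypothetical groups.
group_a = [2, 8]     # G1 = 4
group_b = [1, 3, 9]  # G2 = 3
g1, g2 = geometric_mean(group_a), geometric_mean(group_b)
n1, n2 = len(group_a), len(group_b)
combined = 10 ** ((n1 * log10(g1) + n2 * log10(g2)) / (n1 + n2))
pooled = geometric_mean(group_a + group_b)  # should match `combined`

# Harmonic mean of two hypothetical rates.
rates = [40, 60]
hm = harmonic_mean(rates)
hm_via_formula = len(rates) / sum(1 / x for x in rates)  # N / sum(1/X)

print(gm, combined, hm)
```

The combined value agrees with the geometric mean of the pooled data, which is what the two-group formula is meant to guarantee.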

    Measures of Variability

    Variation
        Range
        Interquartile Range
        Variance
        Standard Deviation
        Coefficient of Variation

    Measures of variation give information on the spread or variability of the data values.

    [Figure: two distributions with the same center, different variation]

    Range

    Simplest measure of variation

    Difference between the largest and the smallest observations:

    $$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

    Example:

    [Figure: dot plot with smallest value 1 and largest value 14]

    Range = 14 - 1 = 13

    Disadvantages of the Range

    Ignores the way in which data are distributed

    [Figure: two dot plots on a 7-12 axis; both have Range = 12 - 7 = 5]

    Sensitive to outliers

    1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5

    Range = 5 - 1 = 4

    1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120

    Range = 120 - 1 = 119
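The sensitivity of the range to a single outlier can be verified directly. The sketch below rebuilds the two 25-value lists from the slide; the helper name `value_range` is illustrative.

```python
def value_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)


# The two lists from the slide: eleven 1s, eight 2s, four 3s, one 4,
# and a final value of either 5 or 120.
common = [1] * 11 + [2] * 8 + [3] * 4 + [4]
typical = common + [5]
with_outlier = common + [120]

range_typical = value_range(typical)
range_outlier = value_range(with_outlier)

print(len(typical), range_typical, range_outlier)
```

Only one of the 25 values differs between the lists, yet the range jumps from 4 to 119, because the range depends on the two extreme observations alone.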

  • 8/6/2019 Cent Tend