01 - Organising and Displaying Data


    CHAPTER 1 (CORE)

    Organising and displaying data

    What is the difference between categorical and numerical data?

    What is a frequency table, how is it constructed and when is it used?

    What is the mode and how do we determine its value?

    What are bar charts, histograms, stem plots and dot plots? How are they

    constructed and when are they used?

    How do you describe the features of bar charts, histograms and stem plots when

    writing a statistical report?

    1.1 Classifying data

    Statistics is a science concerned with understanding the world through data. The first step in
    this process is to put the data into a form that makes it easier to see patterns or trends.

    Some data

    The data contained in Table 1.1 are part of a larger set of data collected from a group of
    university students.

    Table 1.1 Student data

    Height   Weight   Age       Sex   Plays sport   Pulse rate
    (cm)     (kg)     (years)                       (beats/min)
    173      57       18        M     2             86
    179      58       19        M     2             82
    167      62       18        M     1             96
    195      84       18        F     1             71
    173      64       18        M     3             90
    184      74       22        F     3             78
    175      60       19        F     3             88
    140      50       34        M     3             70

    Sex: M = male, F = female
    Plays sport: 1 = regularly, 2 = sometimes, 3 = rarely

    Source: www.statsci.org/data/oz/ms212.html. Used with permission.

    ISBN: 9781107655904 Peter Jones, Michael Evans, Kay Lipson 2012 Cambridge University Press

    Photocopying is restricted under law and this material must not be transferred to another party


    Variables

    In a data set, we call the things about which we record information variables. An important

    first step in analysing any set of data is to identify the variables involved, their units of

    measurement (where appropriate) and the values they take. In this particular data set there are

    six variables:

    height (in centimetres)
    weight (in kilograms)
    age (in years)
    sex (M = male, F = female)
    plays sport (1 = regularly, 2 = sometimes, 3 = rarely)
    pulse rate (beats/minute)

    Types of variables: categorical and numerical

    Having identified the variables we are working with, the next step is to decide the variable type.

    Some variables represent qualities or attributes. For example, 'F' in the Sex column
    indicates that the person is a female, while a '2' in the Plays sport column indicates that the
    person is someone who plays sport sometimes.

    Variables that represent qualities are called categorical variables.

    Other variables represent quantities. For example, a '179' in the Height column indicates
    that the person is 179 cm tall, while an '82' in the Pulse rate column indicates that they have a
    pulse rate of 82 beats/minute.

    Variables that represent quantities are called numerical variables.

    Numerical variables come in two types: discrete and continuous.

    Discrete numerical variables represent quantities that are counted. The number of mobile
    phones in a house is an example. Counting leads to discrete data values because it results in
    values such as 0, 1, 2, 3 etc. There can be nothing in between. As a guide, discrete numerical
    variables arise when we ask the question 'How many?'

    Continuous numerical variables represent quantities that are measured rather than counted.
    Thus, even though we might record a person's height as 179 cm, in reality that could be any
    value from 178.5 cm up to (but not including) 179.5 cm. We have just rounded off the height
    to 179 cm for convenience, or to match the accuracy of the measuring device.

    Warning!!

    It is not the variable name itself that determines whether the data are numerical or categorical; it is
    the way the data for the variable are recorded. For example:

    weight recorded in kilograms is a numerical variable

    weight recorded as 1 = underweight, 2 = normal weight, 3 = overweight is a categorical variable


    Exercise 1A

    1 What is:

    a a numerical variable? Give an example.

    b a categorical variable? Give an example.

    2 There are two types of numerical variables. Name them.

    3 Classify each of the following variables as numerical or categorical. If the variable is

    numerical, further classify the variable as discrete or continuous.

    Recording information on:

    a length of bananas (in centimetres)

    b number of cars in a supermarket car park

    c daily temperature in °C

    d eye colour (brown, blue, . . . )

    e shoe size (6, 8, 10, . . . )

    f the number of children in a family

    g city of residence (NY, London, . . . )

    h number of people who live in your city/area

    i time spent watching TV (hours)

    j the TV channel most watched by students

    k salary (high, medium, low)

    l salary (in dollars)

    m whether a person smokes (yes, no)

    n the number of cigarettes smoked per day

    4 Classify the data for each of the variables in Table 1.1 as numerical or categorical.

    1.2 Organising and displaying categorical data

    The frequency table

    With a large number of data values, it is difficult to identify any patterns or trends in the raw
    data. We first need to organise the data into a more manageable form. A statistical tool we use
    for this purpose is the frequency table.

    The frequency table

    A frequency table is a listing of the values a variable takes in a data set, along with how
    often (frequently) each value occurs.

    Frequency can be recorded as a

    count: the number of times a value occurs, or

    per cent: the percentage of times a value occurs (percentage frequency), where

    $\text{per cent} = \dfrac{\text{count}}{\text{total count}} \times 100\%$

    A listing of the values a variable takes, along with how frequently each of these values
    occurs in a data set, is called a frequency distribution.

    Example 1 Frequency table for a categorical variable

    The sex of 11 preschool children is as shown (F = female, M = male):

    F M M F F M F F F M M

    Construct a frequency table to display the data.
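The worked solution to Example 1 does not appear in this transcript. As a rough sketch only, the definition above (count, and per cent = count / total count × 100%) can be applied to the eleven values in a few lines of Python. The function name `frequency_table` and the rounding to one decimal place are assumptions made for illustration, not part of the original text.

```python
from collections import Counter

# Sex of the 11 preschool children from Example 1 (F = female, M = male).
sexes = "F M M F F M F F F M M".split()

def frequency_table(values):
    """Return {value: (count, per cent)}, per cent rounded to one decimal place."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (count, round(100 * count / total, 1))
            for value, count in sorted(counts.items())}

table = frequency_table(sexes)
for value, (count, percent) in table.items():
    print(f"{value:<6}{count:>5}{percent:>8.1f}")
print(f"{'Total':<6}{len(sexes):>5}")
```

Each row pairs a value of the variable with its count and percentage frequency, which is the layout the later examples in this chapter follow.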


    Example 2 Constructing a bar chart from a frequency table

    Construct a bar chart for this frequency table of climate types in various countries.

                      Frequency
    Climate type   Count   Per cent
    Cold               3       13.0
    Moderate          14       60.9
    Hot                6       26.1
    Total             23      100.0

    Solution

    1 Label the horizontal axis with the variable

    name, Climate type. Mark the scale off

    into three equal intervals and label them Cold, Moderate and Hot.

    2 Label the vertical axis Frequency. Scale

    allowing for the maximum frequency, 14.

    Fifteen would be appropriate. Mark the

    scale off in fives.

    3 For each interval, draw in a bar. There are

    gaps between the bars to show that the

    categories are separate. The height of the

    bar is made equal to the frequency.

    [Figure: bar chart of Frequency (vertical axis, 0 to 15, marked in fives) against Climate type (Cold, Moderate, Hot)]
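For readers without graph paper to hand, the construction in Example 2 can be approximated in plain text. The sketch below is an illustration only: it draws the bars horizontally rather than vertically, and the function name `text_bar_chart` is hypothetical. The bar length equals the count, and each category gets its own separate line, mirroring the gaps between bars.

```python
# Frequency table from Example 2 (climate type -> count).
climate_counts = {"Cold": 3, "Moderate": 14, "Hot": 6}

def text_bar_chart(counts, symbol="#"):
    """Return one line per category: label, then a bar whose length equals the count."""
    width = max(len(label) for label in counts)
    return [f"{label:<{width}} | {symbol * count} {count}"
            for label, count in counts.items()]

lines = text_bar_chart(climate_counts)
for line in lines:
    print(line)
```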

    The mode

    One of the features of a data set that is quickly revealed with a bar chart is the mode or modal
    category. This is the most frequently occurring value or category, and is given by the
    category with the tallest bar. For the bar chart above, the modal category is clearly 'Moderate'.
    That is, for the countries considered, the most frequently occurring climate type is 'Moderate'.

    However, the mode is only of interest when a single value or category in the frequency table
    occurs much more often than the others. Modes are of particular importance in popularity
    polls, for example in answering questions such as 'Which is the most frequently watched TV
    station between the hours of 6.00 and 8.00 p.m.?' or 'What are the times when a supermarket
    is in peak demand: morning, afternoon or night?'
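Reading the mode off a frequency table amounts to finding the largest count. A minimal sketch, assuming the table is held as a dictionary of counts (the name `modal_categories` is made up for this illustration); it returns a list because two categories can tie for the highest count.

```python
def modal_categories(counts):
    """Return every category whose count equals the highest count (ties are possible)."""
    highest = max(counts.values())
    return [category for category, count in counts.items() if count == highest]

# Climate data from Example 2.
climate_counts = {"Cold": 3, "Moderate": 14, "Hot": 6}
print(modal_categories(climate_counts))
```

For the climate data this picks out 'Moderate', the category with the tallest bar.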

    What to look for in a frequency distribution of a categorical variable: writing a report

    A bar chart, in combination with a frequency table, is useful for gaining an overall view of a
    frequency distribution of a categorical variable, the so-called 'big picture'.


    Describing a bar chart

    In describing a bar chart, we focus on two things:

    the presence of a dominant category (or group of categories) in the distribution. This
    is given by the mode. If there is no dominant category, then this should be stated.

    the order of occurrence of each category and its relative importance.

    In commenting on these features, it is usual to support your conclusions with percentages.

    When quoting percentages, it is also advisable to indicate at the beginning the total number of

    cases involved. Using the information in Example 2 to describe the distribution of climate

    type, you might write as follows:

    Report

    The climate types of 23 countries were classified as being 'cold', 'moderate' or 'hot'. The
    majority of the countries, 60.9%, were found to have a moderate climate. Of the remaining
    countries, 26.1% were found to have a hot climate while 13.0% were found to have a cold climate.

    Stacked or segmented bar charts

    [Figure: segmented bar chart of the climate data; vertical axis Frequency (0 to 25); legend Climate: Hot, Cold, Moderate]

    A variation on the standard bar chart is the

    segmented or stacked bar chart. In a

    segmented bar chart, the bars are stacked

    on one another to give a single bar with

    several components. The lengths of the segments are determined by the frequencies.

    When this is done, the height of the bar gives

    the total frequency. Segmented bar charts

    should only be used when there are

    a relatively small number of components; usually no more than four or five. Otherwise it

    becomes difficult to distinguish the components. The segmented bar chart above was formed

    from the climate data used in Example 2. Note that a legend has been included to identify the

    segments.

    [Figure: percentage segmented bar chart of the climate data; vertical axis Percentage (0 to 100); legend Climate: Hot, Moderate, Cold]

    In a percentage segmented bar chart, the lengths of each of the segments in the
    bar are determined by the percentages. When this is done, the height of the bar is
    100%. The percentage segmented bar chart shown was formed from the climate data
    used in Example 2.

    Percentage segmented bar charts are most useful when we come to analyse the
    relationship between two categorical variables, as we will see in Chapter 4.
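To draw a percentage segmented bar chart by hand, it helps to know where each segment starts and ends on the 0 to 100% scale. The sketch below works these boundaries out from the counts in Example 2. The stacking order (Cold at the bottom, Hot at the top) simply follows the order of the frequency table and is an assumption, as is the function name `percentage_segments`.

```python
# Climate data from Example 2, in frequency-table order.
climate_counts = {"Cold": 3, "Moderate": 14, "Hot": 6}

def percentage_segments(counts):
    """Return (category, bottom, top) triples locating each segment on a 0-100% bar."""
    total = sum(counts.values())
    segments, running = [], 0
    for category, count in counts.items():
        bottom = 100 * running / total   # cumulative percentage below this segment
        running += count
        top = 100 * running / total      # cumulative percentage including this segment
        segments.append((category, round(bottom, 1), round(top, 1)))
    return segments

segments = percentage_segments(climate_counts)
for category, bottom, top in segments:
    print(f"{category}: from {bottom}% to {top}%")
```

Working from cumulative counts, rather than adding rounded percentages, guarantees that the final segment ends at exactly 100%.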


    Exercise 1B

    1 a In a frequency table, what is the mode?

    b Identify the mode in the following data sets:

    i Grades: A A C B A B B B B D C

    ii Shoe size: 8 9 9 10 8 8 7 9 8 10 12 8 10

    2 The following data identifies the state of residence of a group of people, where

    1 = Victoria, 2 = SA and 3 = WA.

    2 1 1 1 3 1 3 1 1 3 3

    a Form a frequency table (with both counts and percentages) to show the distribution of

    state of residence for this group of people. Use the table in Example 1 as a model.

    b Construct a bar chart using Example 2 as a model.

    3 The size (S = small, M = medium, L = large) of 20 cars was recorded as follows:

    S S L M M M L S S M

    M S L S M M M S S M

    a Form a frequency table (with both counts and percentages) to show the distribution of

    size for these cars. Use the table in Example 1 as a model.

    b Construct a bar chart using Example 2 as a model.

    4 The table shows the frequency distribution of School type for a number of schools. The table
    is incomplete.

                      Frequency
    School type    Count   Per cent
    Catholic           4         20
    Government        11
    Independent        5         25
    Total                       100

    a Write down the information missing from the table.

    b How many schools are categorised as Independent?

    c How many schools are there in total?

    d What percentage of schools are categorised as Government?

    e Use the information in the frequency table to complete the following report.

    Report

    ____ schools were classified according to school type. The majority of these schools, ____%,
    were found to be ____ schools. Of the remaining schools, ____ were ____ while
    20% were ____ schools.


    5 The table shows the frequency distribution of the place of birth for 500 Australians.

    a Is Place of birth a categorical or a numerical variable?

    b Display the data in the form of a percentage

    segmented bar chart.

    c Use the information in the frequency table to

    write a brief report.

    Place of birth Per cent

    Australia 78.3

    Overseas 21.8

    Total 100.1

    6 The table records the number of new cars sold in Australia during the first quarter of one

    year, categorised by type (private vehicle or commercial vehicle).

    a Copy and complete the table giving the

    percentages correct to the nearest

    whole number.

    b Display the data in the form of a

    percentage segmented bar chart.

                         Frequency
    Type of vehicle    Count     Per cent
    Private            132 736
    Commercial          49 109
    Total

    7 The table shows the frequency distribution of eye colour of 11 preschool children.

    a Use the information in the table to construct a

    bar chart. Place the columns in order of

    decreasing frequency.

    b Use the information in the table to construct a

    percentage segmented bar chart.

    c Use the information in the table to write a brief

    report.

    Frequency

    Eye colour Count Percentage

    Brown 6 54.5

    Hazel 2 18.2

    Blue 3 27.3

    Total 11 100.0

    8 Twenty-two students were asked the question, 'How

    often do you play sport?' with the possible responses:

    'Regularly', 'Sometimes' or 'Rarely'. The

    distribution of responses is summarised in the

    frequency table.

    a Write down the information missing from the table.

    b Use the information in the frequency

    table to complete the following report.

                     Frequency
    Plays sport   Count   Per cent
    Regularly         5       22.7
    Sometimes        10
    Rarely                    31.8
    Total            22

    Report

    When students were asked the question, 'How often do you play sport?', the dominant
    response was 'Sometimes', given by ____% of the students. Of the remaining students,
    ____% of the students responded that they played sport ____ while ____% said that they
    played sport ____.

    1.3 Organising and displaying numerical data

    Frequency tables can also be used to organise numerical data. For discrete numerical data, the
    process exactly mirrors that for categorical data. For continuous data, some modifications need
    to be made because groups of data values, rather than individual values, are listed.


    Example 3 Frequency table for discrete numerical data

    The family sizes of 11 preschool children (including the child itself) are as follows:

    3 3 4 4 5 3 2 4 3 5 3

    Display the data in the form of a frequency table.

    Solution

    1 Set up a table as shown. In the data set, the
    variable family size takes the values 2, 3, 4
    and 5. List these values under Family size
    in some order, here increasing.

                     Frequency
    Family size   Count   Per cent
    2                 1        9.1
    3                 5       45.5
    4                 3       27.3
    5                 2       18.2
    Total            11      100.1

    2 Count up the number of 2s, 3s, 4s and 5s in

    the dataset. For example, there are five 3s.

    Record these values in the Count column.

    3 Add the counts to find the total count, 11. Record this value in the Count column opposite

    Total.

    4 Convert the counts into percentages. Record them in the Per cent column. For example,

    percentage of 3s = $\frac{5}{11} \times 100\% = 45.5\%$

    5 Finally, total the percentages and record.
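The five steps of Example 3 can be checked with a short script. This is a sketch rather than a prescribed method; the variable names are illustrative. It also shows why the Per cent column totals 100.1 rather than 100: each percentage is rounded to one decimal place before the total is taken.

```python
from collections import Counter

# Family sizes of the 11 preschool children from Example 3.
family_sizes = [3, 3, 4, 4, 5, 3, 2, 4, 3, 5, 3]

counts = Counter(family_sizes)          # step 2: count the 2s, 3s, 4s and 5s
total = sum(counts.values())            # step 3: total count
rows = [(size, counts[size], round(100 * counts[size] / total, 1))
        for size in sorted(counts)]     # steps 1 and 4: ordered values with percentages
percent_total = round(sum(percent for _, _, percent in rows), 1)  # step 5

for size, count, percent in rows:
    print(f"{size:<8}{count:>5}{percent:>8.1f}")
print(f"{'Total':<8}{total:>5}{percent_total:>8.1f}")
```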

    Grouping data

    Some variables can only take on a limited range of values; for example, the number of children
    in a family. Here, it makes sense to list each of these values individually when forming a
    frequency distribution.

    In other cases, the variable can take a large range of values; for example, age (0–100).
    Listing all possible ages would be tedious and would produce a large and unwieldy display. To
    solve this problem, we group the data into a small number of convenient intervals. There are
    no hard and fast rules for the number of intervals but, usually, between five and fifteen intervals
    are used. Usually, the smaller the number of data values, the smaller the number of intervals.

    Note that the intervals are defined so that it is quite clear into which interval each data value
    falls. For example, you cannot define intervals as 1–5, 5–10, 10–15, 15–20, . . . etc., as you
    would not know into which interval to put the values 5, 10, 15 etc.

    Guideline for choosing the number of intervals

    There are no hard and fast rules for the number of intervals to use but, usually, between five

    and fifteen intervals are used.


    Example 4 Grouping data

    The ages of a sample of 200 people aged from 16 to 72 years are to be recorded. Group the

    ages into six equal-sized categories that will cover all of these ages.

    Solution

    1 Write down the required number of intervals.

    Number of intervals: 6

    2 Determine interval width.

    Ages range from 16 to 72, which covers 57 years (72 − 16 + 1, counting both end values).
    Six intervals will give intervals of width 57/6 = 9.5.
    Set the interval width to 10, the nearest whole number above 9.5.

    Interval width = 57/6 = 9.5: use 10

    3 Choose a starting point that ensures that the intervals cover the full range of values.
    15 would be a suitable starting point.

    Starting point: 15

    4 Write down the intervals.

    Intervals: 15–24, 25–34, . . . , 65–74
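The arithmetic in Example 4 can be sketched as a small function. It assumes whole-number data (ages in years), rounds the width up with `math.ceil`, and takes the starting point as an input because the example chooses it by judgement. The function name `group_intervals` and the coverage check are additions for illustration.

```python
import math

def group_intervals(lowest, highest, n_intervals, start):
    """Return equal-width whole-number intervals (low, high) covering lowest..highest."""
    span = highest - lowest + 1               # values covered, counting both ends (16..72 -> 57)
    width = math.ceil(span / n_intervals)     # 57 / 6 = 9.5, rounded up to 10
    intervals = [(start + i * width, start + (i + 1) * width - 1)
                 for i in range(n_intervals)]
    if intervals[0][0] > lowest or intervals[-1][1] < highest:
        raise ValueError("intervals do not cover the full range; choose another starting point")
    return intervals

intervals = group_intervals(16, 72, 6, start=15)
print(intervals)
```

Because each interval ends one below the start of the next, every whole-number age falls into exactly one interval.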

    Once we know how to group data, we can form a frequency distribution for grouped data.

    Example 5 A grouped frequency distribution for a continuous numerical variable

    The data below give the average hours worked per week in 23 countries.

    35.0, 48.0, 45.0, 43.0, 38.2, 50.0, 39.8, 40.7, 40.0, 50.0, 35.4, 38.8,

    40.2, 45.0, 45.0, 40.0, 43.0, 48.8, 43.3, 53.1, 35.6, 44.1, 34.8

    Form a grouped frequency table with five intervals.

    Solution

    1 Set up a table as shown. For five intervals and
    data values ranging between 34.8 and 53.1,
    use the intervals: 30.0–34.9, 35.0–39.9, . . . , 50.0–54.9.

                             Frequency
    Average hours worked   Count   Per cent
    30.0–34.9                  1        4.3
    35.0–39.9                  6       26.1
    40.0–44.9                  8       34.8
    45.0–49.9                  5       21.7
    50.0–54.9                  3       13.0
    Total                     23       99.9

    2 List these intervals, in ascending order, under Average hours worked.

    3 Count the number of countries whose average working hours fall into each of
    the intervals. For example, six countries have average working hours between 35.0 and 39.9.
    Record these values in the Count column.

    4 Add the counts to find the total count, 23.
    Record this value in the Count column opposite Total.


    5 Convert the counts into percentages. Record these in the Per cent column.

    For example, for 35.0–39.9 hours,

    percentage = $\frac{6}{23} \times 100\% = 26.1\%$

    6 Finally, total the percentages and record.
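The counting in Example 5 can be reproduced as follows. This sketch treats each interval as running from its lower limit up to, but not including, the next lower limit (for example 35.0 up to 40.0), which matches the table for data recorded to one decimal place. The function name `grouped_counts` is illustrative.

```python
# Average hours worked per week in 23 countries (Example 5).
hours = [35.0, 48.0, 45.0, 43.0, 38.2, 50.0, 39.8, 40.7, 40.0, 50.0, 35.4, 38.8,
         40.2, 45.0, 45.0, 40.0, 43.0, 48.8, 43.3, 53.1, 35.6, 44.1, 34.8]

def grouped_counts(values, start, width, n_intervals):
    """Count the values falling in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_intervals
    for value in values:
        index = int((value - start) // width)
        if not 0 <= index < n_intervals:
            raise ValueError(f"{value} lies outside the intervals")
        counts[index] += 1
    return counts

counts = grouped_counts(hours, start=30.0, width=5.0, n_intervals=5)
percents = [round(100 * c / len(hours), 1) for c in counts]
print(counts)
print(percents, round(sum(percents), 1))
```

The rounded percentages sum to 99.9, in agreement with the Total row of the table.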

    There are two things to note in the frequency table in Example 5.

    1 The intervals in this example are of width five. For example, the interval 35.0–39.9 is an
    interval of width 5.0 because it contains all values from 34.9500 . . . to 39.9499 . . .

    2 The modal interval is 40.0–44.9 hours; eight (34.8%) of the countries have working hours
    that fall into this interval.

    How has forming a frequency table helped?

    The process of forming a frequency table for a numerical variable:

    orders the data

    displays the data in a compact form

    tells us something about the way the data values are distributed (the pattern of the data)

    helps us identify the mode (the most frequently occurring value or interval of values).

    The histogram

    The frequency histogram, or histogram for short, is a graphical way of presenting the
    information in a frequency table for numerical data. Later in the chapter, you will learn about
    two other graphical displays for numerical data, the stem plot and the dot plot.

    Constructing a histogram from a frequency table

    In a frequency histogram:

    frequency (count or per cent) is shown on the vertical axis

    the values of the variable being displayed are plotted on the horizontal axis

    for continuous data, each bar in a histogram corresponds to a data interval. For discrete

    data, where there are gaps between values, the intervals start and end halfway between

    values. Empty classes or missing discrete values have bars of zero height

    the height of the bar gives the frequency (usually the count, but it can equally well be the

    percentage).

    Example 6 Constructing a histogram from a frequency table: continuous numerical variable

    Construct a histogram for this frequency table.

    Average hours worked   Frequency (count)
    30.0–34.9                   1
    35.0–39.9                   6
    40.0–44.9                   8
    45.0–49.9                   5
    50.0–54.9                   3
    Total                      23


    Solution

    1 Label the horizontal axis with the variable

    name, Average hours worked. Mark in the

    scale using the beginning of each interval

    as the scale points: that is 30, 35, . . .

    2 Label the vertical axis Frequency. Scale

    allowing for the maximum frequency, 8.

    Ten would be appropriate. Mark in the scale

    in units.

    3 Finally, for each interval, 30.0–34.9,

    35.0–39.9, . . . , draw in a bar with the base

    starting at the beginning of each interval

    and finishing at the beginning of the next.

    The height of the bar is made equal to the

    frequency.

    [Figure: histogram of Frequency (vertical axis, 0 to 9) against Average hours worked (horizontal axis, 25 to 60)]

    Example 7 Constructing a histogram from a frequency table: discrete numerical variable

    Construct a histogram for this frequency table.

    Family size   Frequency (count)
    2                  1
    3                  5
    4                  3
    5                  2
    Total             11

    Solution

    1 Label the horizontal axis with the variable name,

    Family size. Mark the scale in units, so that it

    includes all possible values.

    2 Label the vertical axis Frequency. Scale to

    allow for the maximum frequency, 5. Five

    would be appropriate. Mark the scale in units.

    3 Draw in a bar for each data value. The width of

    each bar is 1, starting and ending halfway between

    data values. For example, the base of the bar

    representing a family size of 2 starts at 1.5 and

    ends at 2.5. The height of the bar is made equal to

    the frequency.

    [Figure: histogram of Frequency (vertical axis, 0 to 5) against Family size (horizontal axis, 1 to 6)]
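Step 3 of Example 7 places each bar halfway between neighbouring whole-number values. A minimal sketch of that rule, assuming whole-number data with bars of width 1 (the function name `discrete_bar_edges` is made up for this illustration):

```python
def discrete_bar_edges(values):
    """Return (value, left edge, right edge) for each whole-number value, bars of width 1."""
    return [(value, value - 0.5, value + 0.5) for value in sorted(set(values))]

# Frequency table from Example 7 (family size -> count).
family_size_counts = {2: 1, 3: 5, 4: 3, 5: 2}

edges = discrete_bar_edges(family_size_counts)
for value, left, right in edges:
    print(f"family size {value}: bar from {left} to {right}, height {family_size_counts[value]}")
```

The bar for a family size of 2 runs from 1.5 to 2.5, as in the worked solution.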

    Constructing a histogram from raw data

    It is relatively quick to construct a histogram from a preprepared frequency table. However, if

    you only have raw data (as you mostly do), it is a very slow process because you have to

    construct the frequency table first. Fortunately, a graphics calculator will do this for us.
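The same job can be sketched in software. The code below is an illustration, not a description of what either calculator does internally: it bins the 27 marks used in the calculator instructions that follow, with a column (bin) width of 4 and a starting point of 2. It assumes whole-number data, and the function name `histogram_counts` is hypothetical.

```python
# The 27 marks used in the calculator instructions.
marks = [16, 11, 4, 25, 15, 7, 14, 13, 14, 12, 15, 13, 16, 14,
         15, 12, 18, 22, 17, 18, 23, 15, 13, 17, 18, 22, 23]

def histogram_counts(values, start, width):
    """Return {(low, high): count} for equal-width bins [low, high) beginning at start."""
    if min(values) < start:
        raise ValueError("starting point must not exceed the smallest value")
    n_bins = (max(values) - start) // width + 1
    bins = {(start + i * width, start + (i + 1) * width): 0 for i in range(n_bins)}
    for value in values:
        low = start + ((value - start) // width) * width   # lower edge of this value's bin
        bins[(low, low + width)] += 1
    return bins

bins = histogram_counts(marks, start=2, width=4)
for (low, high), count in bins.items():
    print(f"{low} to <{high}: {count}")
```

The first interval begins at 2 and has a frequency of 1, which agrees with the values read from the ClassPad histogram later in this section.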


    How to construct a histogram using the TI-Nspire CAS

    Display the following set of marks in the form of a histogram.

    16 11 4 25 15 7 14 13 14 12 15 13 16 14

    15 12 18 22 17 18 23 15 13 17 18 22 23

    Steps

    1 Start a new document: Press [home] and

    select New Document (or use [ctrl] + N).

    If prompted to save an existing

    document, move cursor to No and press

    .

    2 Select Add Lists & Spreadsheet.

    Enter the data into a list named marks.

    a Move the cursor to the name space

    of column A (or any other column)

    and type in marks as the list name.

    Press .

    b Move the cursor down to row 1, type

    in the first data value and press .

    Continue until all the data has been

    entered. Press after each entry.

    3 Statistical graphing is done through the

    Data & Statistics application.

    Press [ctrl] + and select Add Data &

    Statistics (or press [home], arrow to ,

    and press ).

    Note: A random display of dots will appear; this is to indicate that data are available

    for plotting. It is not a statistical plot.

    a Press [tab] to show the list of

    variables. The variable marks is

    shown as selected. Press to

    paste the variable marks to that axis.


    b A dot plot is then displayed as the

    default plot. To change the plot to a

    histogram press

    [menu] > Plot Type > Histogram

    Note for CX only: To add colour (or change

    colour) move cursor over the plot and press

    [ctrl] + [menu] > Color > Fill Color.

    Your screen should now look like that

    shown opposite. This histogram has a

    column (or bin) width of 2 and a

    starting point of 3.

    4 Data analysis

    a Move cursor onto any column,

    will show and the column data will

    be displayed as shown opposite.

    b To view other column data values

    move the cursor to another column.

    Note: If you click on a column it will be selected.

    To deselect any previously selected columns,

    move the cursor to the open area and press .

    Hint: If you accidentally move a column or data

    point, press/+ to undo the move.


    5 Change the histogram column (bin) width to 4 and the starting point to 2.

    a Press [ctrl] + [menu] to get the contextual menu as shown (below left).

    Hint: Pressing [ctrl] + [menu] with the cursor on the histogram gives you access to a contextual menu that enables you to do things that relate only to histograms.

    b Select Bin Settings.

    c In the settings menu (below right) change the Width to 4 and the Starting Point (Alignment) to 2 as shown. Press .

    d A new histogram is displayed with a column width of 4 and a starting point of 2 but
    it no longer fits the viewing window (below left). To solve this problem press
    [ctrl] + [menu] > Zoom > Zoom-Data to obtain the histogram shown below right.

    6 To change the frequency axis to a percentage axis, press [ctrl] + [menu] > Scale > Percent and

    then press .


    How to construct a histogram using the ClassPad

    Display the following set of 27 marks in the form of a histogram.

    16 11 4 25 15 7 14 13 14 12 15 13 16 14

    15 12 18 22 17 18 23 15 13 17 18 22 23

    Steps

    1 From the application menu

    screen, locate the built-in Statistics

    application. Tap to open.

    Tapping from the icon panel

    (just below the touch screen) will

    display the application menu if it is

    not already visible.

    2 Enter the data into a list named

    marks.

    To name the list:

    a Highlight the heading of the

    first list by tapping it.

    b Press [Keyboard] on the front of

    the calculator and tap the

    tab.

    c To enter the data, type the word

    marks and press [EXE].

    d Type in each data value and press

    [EXE] or (which is found on the cursor button on the front of the

    calculator) to move down to the

    next cell.

    The screen should look like the one

    shown opposite.


    3 Set up the calculator to

    plot a statistical graph.

    a Tap from the toolbar. This

    opens the Set StatGraphs

    dialog box.

    b Complete the dialog

    box as given below.

    Draw: select On

    Type: select

    Histogram ( )

    XList: select main\

    marks ( )

    Freq: leave as 1

    c Tap Set to confirm your

    selections.

    Note: To make sure only this

    graph is drawn, select SetGraph

    from the menu bar at the top and

    confirm that there is a tick only beside

    StatGraph1 and no others.

    4 To plot the graph:

    a Tap in the toolbar.

    b Complete the Set Interval

    dialog box as follows.

    HStart: type 2 (i.e. the

    starting point of the first

    interval)

    HStep: type 4 (i.e. the

    interval width)

    Tap OK to display histogram.

    Note: The screen is split into two halves, with the graph displayed in the bottom half, as shown above.

    Tapping from the icon panel allows the graph to fill the entire screen. Tap again to return

    to half-screen size.


    5 Tapping from the toolbar

    places a marker (+) at the top of

    the first column of the histogram

    (see opposite) and tells us that

    a the first interval begins at 2 (xc = 2)

    b for this interval, the frequency is 1 (Fc = 1).

    To find the frequencies and starting points of the other intervals, use the arrow to move from interval to interval.
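The ClassPad settings used here (HStart = 2, HStep = 4) amount to sorting each mark into an interval of width 4, with the first interval starting at 2. As a rough cross-check away from the calculator, the Python sketch below counts the 27 marks in each interval. The names `interval_counts`, `H_START` and `H_STEP` are illustrative only and are not ClassPad commands, and the sketch assumes each interval includes its starting point but not its end point.

```python
from collections import Counter

# The 27 marks from the worked example.
marks = [16, 11, 4, 25, 15, 7, 14, 13, 14, 12, 15, 13, 16, 14,
         15, 12, 18, 22, 17, 18, 23, 15, 13, 17, 18, 22, 23]

H_START = 2  # starting point of the first interval (HStart)
H_STEP = 4   # interval width (HStep)


def interval_counts(values, h_start, h_step):
    """Return {interval start: frequency}, assuming intervals [start, start + h_step)."""
    counts = Counter()
    for value in values:
        index = (value - h_start) // h_step  # which interval the value falls in
        counts[h_start + index * h_step] += 1
    return dict(sorted(counts.items()))


counts = interval_counts(marks, H_START, H_STEP)
for start, frequency in counts.items():
    print(f"interval starting at {start}: frequency {frequency}")
```

Under these assumptions the first interval starts at 2 and has a frequency of 1, which agrees with the xc = 2 and Fc = 1 readout described in step 5.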

    Exercise 1C

    1 The numbers of occupants in nine cars stopped at a traffic light were:

    1 1 2 1 3 1 2 1 3

    What is the mode of this data set? What does this tell us?

    2 The number of surviving grandparents for 11 preschool children is listed below.

    0 4 4 3 2 3 4 4 4 3 3

    Form a frequency table to show the distribution of the number of surviving grandparents.

    3 a Write down the missing information in the

    frequency table.

    b How many families had only one child?

    c How many families had more than one

    child?

    d What percentage of families had no

    children?

    e What percentage of families had fewer

    than three children?

    No. of children    Frequency
    in family          Count    %
    0                  3
    1                  10       47.6
    2                  6        28.6
    3
    4                  2        9.5
    Total              21


    4 a Salaries of women teaching in a school range from $20 106 to $63 579. Group the salaries into five equal-sized categories that cover all teaching salaries.

    b The number of students in VCE Further Mathematics classes ranges from 6 to 33. Group the class sizes into six equal-sized categories that cover all Further Mathematics class sizes.

    c The amount of money carried by a sample of 23 students ranges from nothing to $8.75. Group the amount of money carried by the students into five equal-sized categories that cover all amounts of money carried by the students.

    5 The histogram opposite was formed by recording the number of words in 30 randomly selected sentences.

    [Histogram: horizontal axis: Number of words in sentence, 0 to 30 in intervals of 5; vertical axis: Frequency (%), 0 to 35]

    a What percentage of these sentences contained:

    i 5–9 words? ii 25–29 words?

    iii 10–19 words? iv fewer than 15 words?

    Give answers correct to the nearest per cent.

    b How many of these sentences contained:

    i 20–24 words? ii more than 25 words?

    c What is the mode (modal interval)?

    6 Use the information in the table opposite to help you construct a histogram to display population density. Use the histogram in Example 6 as a model. Label axes and mark in scales.

    Population density    Frequency (count)
    0–199                 11
    200–399               4
    400–599               4
    600–799               2
    800–999               1
    Total                 22

    7 Use the information in the table opposite to help you construct a histogram to display the distribution of the number of rooms in the houses of 11 preschool children. Use the histogram in Example 7 as a model. Label axes and mark in scales.

    Number of rooms    Frequency (count)
    4                  3
    5                  0
    6                  1
    7                  3
    8                  4
    Total              11

    8 The pulse rates of 23 students are given below.

    86 82 96 71 90 78 68 71 68 88 76 74

    70 78 69 77 64 80 83 78 88 70 86


    a Use a graphics calculator to construct a histogram so that the first column starts at 63 and

    the column width is two.

    b For this histogram:

    i what is the starting point of the third column?

    ii what is the count for the third column? What actual data values does this include?

    c Redraw the histogram so that the column width is five and the first column starts at 60.

    d For this histogram, what is the count in the interval 65 to


    Symmetric distributions

    If a histogram is single-peaked, does the histogram region tail off evenly on either side of the

    peak? If so, the distribution is said to be symmetric (see Histogram 1).

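    [Histogram 1: single-peaked histogram with the lower tail, peak and upper tail labelled; vertical axis: Frequency]

    [Histogram 2: double-peaked histogram with the two peaks labelled; vertical axis: Frequency]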

    A single-peaked symmetric distribution is characteristic of the data that derive from

    measuring variables such as people's heights, intelligence test scores, weights of oranges in a

    storage bin, or any other data for which the values vary evenly around some central value. The

    histogram for average hours worked (see Example 6) would be classified as approximately

    symmetric.

    The double-peaked distribution (Histogram 2) is symmetric about the dip between the two

    peaks. A histogram that has two distinct peaks indicates a bimodal (two modes) distribution.

    A bimodal distribution often indicates that the data have come from two different

    populations. For example, if we were studying the distance the discus is thrown by Olympic

    level discus throwers, we would expect a bimodal distribution if both male and female throwers

    were included in the study.

    Skewed distributions

    Sometimes a histogram tails off primarily in one direction. Such distributions are said to be skewed.

    If a histogram tails off to the right we say that it is positively skewed (Histogram 3). The

    distribution of salaries of workers in a large organisation tends to be positively skewed. Most

    workers earn a similar salary with some variation above or below this amount, but a few earn

    more and even fewer, such as the senior manager, earn even more. The distribution of house

    prices also tends to be positively skewed.

    [Histogram 3: peak with a long upper tail (+ve skew); vertical axis: Frequency]

    [Histogram 4: long lower tail with a peak (−ve skew); vertical axis: Frequency]

    If a histogram tails off to the left we say that it is negatively skewed (Histogram 4). The

    distribution of age at death tends to be negatively skewed. Most people die in old age, a few in

    middle age and even fewer in childhood.


    Outliers

    Outliers are any data values that stand out from the main body of data. These are data values that are atypically high or low. See, for example, Histogram 5, which shows an outlier. In this case it is a data value that is atypically low compared to the rest of the data values.

    [Histogram 5: an outlier separated from the main body of data; vertical axis: Frequency]

    Outliers can indicate errors made collecting or processing data; for example, a person's age recorded as 365. Alternatively, they may indicate data values that are very different from the rest of the values. For example, compared to her students' ages, a teacher's age is an outlier.

    Centre

    [Histograms 6 to 8: three histograms drawn on a common horizontal axis running from 50 to 150; vertical axis: Frequency, 0 to 8]

    Histograms 6 to 8 display the distribution

    of test scores for three different classes

    taking the same subject. They are identical

    in shape, but differ in where they are

    located along the axis. In statistical terms

    we say that the distributions are centred

    at different points along the axis.

    But what do we mean by the centre of a distribution? This is an issue we will return to in more detail later. For the present we will take centre to be the middle of the distribution.

    The middle of a symmetric distribution is reasonably easy to locate by eye. Looking at

    Histograms 6 to 8, it would be reasonable to say that the centre or middle of each distribution

    lies roughly halfway between the extremes; half the observations would lie above this point

    and half below. Thus we might estimate that Histogram 6 (yellow) is centred at about 60,

    Histogram 7 (light blue) at about 100, and Histogram 8 (dark blue) at about 140.


    For skewed distributions, it is more difficult to estimate the middle of a distribution by eye.

    The middle is not halfway between the extremes because, in a skewed distribution, the scores

    tend to bunch up at one end. However, if we

    imagine a cardboard cut-out of the histogram,

    the midpoint lies on the line that divides the

    histogram into two equal areas (Histogram 9).

    [Histogram 9: skewed histogram with a line that divides the area of the histogram in half; horizontal axis 15 to 50; vertical axis: Frequency, 0 to 5]

    Using this method, we would estimate the

    centre of the distribution to lie somewhere

    between 35 and 40, but closer to 35, so we

    might opt for 37. However, remember that

    this is only an estimate.

    Spread

    If the histogram is single peaked, is it narrow? This would indicate that most of the data values

    in the distribution are tightly clustered in a small region. Or is the peak broad? This would

    indicate that the data values are more widely spread out. Histograms 10 and 11 are both single

    peaked. Histogram 10 has a broad peak, indicating that the data values are not very tightly

    clustered about the centre of the distribution. In contrast, Histogram 11 has a narrow peak,

    indicating that the data values are tightly clustered around the centre of the distribution.

    [Histogram 10: wide central region; horizontal axis 0 to 22; vertical axis: Frequency, 0 to 10]

    [Histogram 11: narrow central region; horizontal axis 0 to 22; vertical axis: Frequency, 0 to 20]

    But what do we mean by the spread of a distribution? We will return to this in more detail later. For a histogram we will take it to be the maximum range of the distribution.

    Range

    Range = largest value − smallest value

    For example, Histogram 10 has a spread (maximum range) of 22 (22 − 0) units, which is considerably greater than the spread of Histogram 11, which has a spread of 12 (18 − 6) units.


    Example 8 Describing a histogram in terms of shape, centre and spread

    The histogram opposite shows the distribution of the

    number of phones per 1000 people in 85 countries.

    a Describe its shape and note outliers (if any).

    b Locate the centre of the distribution.

    c Estimate the spread of the distribution.

    [Histogram: horizontal axis: Number of phones (per 1000 people), 0 to 1020 in intervals of 170; vertical axis: Frequency (count), 0 to 35]

    Solution

    a Shape and outliers
    The distribution is positively skewed. There are no outliers.

    b Centre
    Count up the frequencies from either end to find the middle interval.
    The distribution is centred in the interval 170–340 phones/1000 people.

    c Spread
    Use the maximum range to estimate the spread.
    Spread = 1020 − 0 = 1020 phones/1000 people

    It should be noted that, with grouped data, it is difficult to precisely determine the location of

    the centre of a distribution from a histogram. So, when working with grouped data, it is

    acceptable to state that the centre of a distribution lies in the interval 170–340. We will learn how to solve this problem later in the chapter.

    If you were using the histogram above to describe the distribution in a form suitable for a statistical report, you might write as follows.

    Report

    For the 85 countries, the distribution of the number of phones per 1000 people is positively skewed. The centre of the distribution lies somewhere in the interval 170–340 phones/1000 people. The spread of the distribution is 1020 phones/1000 people. There are no outliers.

    Exercise 1D

    1 Label each of the following histograms as approximately symmetric, positively skewed or negatively skewed, and identify the following:

    i the mode
    ii any potential outliers
    iii the approximate location of the centre

    a [Histogram A; vertical axis: Frequency, 0 to 20]

    b [Histogram B; vertical axis: Frequency, 0 to 80]


    c [Histogram C; vertical axis: Frequency, 0 to 20]

    d [Histogram D; vertical axis: Frequency, 0 to 20]

    2 These three histograms show the marks obtained by a group of students in three subjects.

    [Histograms: Subject A, Subject B and Subject C on a common horizontal axis: Marks, 2 to 50; vertical axis: Frequency, 0 to 10]

    a Are each of the distributions

    approximately symmetric or

    skewed?

    b Are there any clear outliers?

    c Determine the interval

    containing the central mark

    for each of the three subjects.

    d In which subject was the

    spread of marks the least? Use

    the range to estimate the spread.

    e In which subject did the marks vary most? Use the range to estimate the spread.

    3 Label each of the following histograms as approximately symmetric, positively skewed or negatively skewed, and identify the following:

    i the mode(s)
    ii any potential outliers
    iii the approximate location of the centre

    a [Histogram A; vertical axis: Frequency, 0 to 20]

    b [Histogram B; vertical axis: Frequency, 0 to 80]

    c [Histogram C; vertical axis: Frequency, 0 to 20]

    d [Histogram D; vertical axis: Frequency, 0 to 20]

    e [Histogram E; vertical axis: Frequency, 0 to 20]

    f [Histogram F; vertical axis: Frequency, 0 to 80]


    4 This histogram shows the distribution of pulse rate (in beats per minute) for 28 students.

    [Histogram: horizontal axis: Pulse rate (beats per minute), 60 to 115 in intervals of 5; vertical axis: Frequency (count), 0 to 6]

    Use the histogram to complete the report

    below, describing the distribution of

    pulse rate in terms of shape, centre,

    spread and outliers (if any).

    Report

    For the students, the distribution of pulse rates is ______ with an outlier. The centre of the distribution lies in the interval ______ beats per minute and the spread of the distribution is ______ beats per minute. The outlier lies in the interval ______ beats per minute.

    1.5 Stem-and-leaf plots and dot plots

    Stem plots

    A stem-and-leaf plot, or stem plot for short, is an alternative to the histogram. It is

    particularly useful for displaying small to medium sized sets of data (up to about 50 data

    values) and has the advantage of retaining all the original data values. This makes it useful for

    further computations. A stem plot is also a very quick and easy way to order and display a set

    of data by hand. Like a histogram, the stem plot gives information about the shape, outliers,

    centre and spread of the distribution.

    One of the stem plot's advantages over a histogram in describing distributions is being able

    to see all the actual data values. This enables the centre and the range of the distribution to be

    located more precisely. It also enables the clear identification of outliers.

    Constructing a stem plot

    In a stem-and-leaf plot, each data value is separated into two parts: the leading digit(s) form

    the stem, and the trailing digit becomes the leaf. For example, in a stem-and-leaf plot, the

    data values 25 and 132 are represented as follows:

                              Stem | Leaf
    25 is represented by         2 | 5
    132 is represented by       13 | 2

    and so on.

    To construct a stem plot, enter the stems to the left of a vertical dividing line, and the leaves

    for each data point to the right. Usually we first construct an unordered stem plot by systematically plotting each data point as listed in the data set. From the unordered stem-and-leaf plot an ordered stem plot is then easily obtained. In an ordered stem plot the


    leaves increase in value as they move away from the stem. It is usually the ordered stem plot

    that we want, because an ordered stem plot makes it easy to find the key values.

    Example 9 Constructing an ordered stem plot

    University participation rates (%) in 23 countries are given below.

    26 3 12 20 36 1 25 26 13 9 26 27
    30 1 15 21 7 8 22 3 37 17 55

    Display the data in the form of an ordered stem plot.

    Solution

    1 The data set has values in the units, tens, twenties, thirties, forties and fifties. Thus, appropriate stems are 0, 1, 2, 3, 4 and 5. Write these down in ascending order, followed by a vertical line.

    0 |
    1 |
    2 |
    3 |
    4 |
    5 |

    2 Now attach the leaves.

    The first data value is 26. The stem is 2 and the leaf is 6. Opposite the 2 in the stem, write down the number 6, as shown.

    0 |
    1 |
    2 | 6
    3 |
    4 |
    5 |

    The second data value is 3 or 03. The stem is 0 and the leaf is 3. Opposite the 0 in the stem, write down the number 3, as shown.

    0 | 3
    1 |
    2 | 6
    3 |
    4 |
    5 |

    Continue systematically working through the data following the same procedure until all points have been plotted. You will then have the unordered stem plot, as shown.

    0 | 3 1 9 1 7 8 3
    1 | 2 3 5 7
    2 | 6 0 5 6 6 7 1 2
    3 | 6 0 7
    4 |
    5 | 5
    unordered stem plot

    3 Ordering the leaves in increasing value as they move away from the stem gives the ordered stem plot, as shown.

    0 | 1 1 3 3 7 8 9
    1 | 2 3 5 7
    2 | 0 1 2 5 6 6 6 7
    3 | 0 6 7
    4 |
    5 | 5
    ordered stem plot
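The hand procedure in Example 9 can also be sketched in Python. The function name `stem_plot` is hypothetical, and the data list is the 23 participation rates read as 26, 3, 12, 20 and so on, which is the reading consistent with the leaves in the example. The sketch assumes non-negative whole numbers, so that the last digit is the leaf and the remaining digit(s) form the stem.

```python
from collections import defaultdict

# University participation rates (%) for the 23 countries in Example 9.
rates = [26, 3, 12, 20, 36, 1, 25, 26, 13, 9, 26, 27,
         30, 1, 15, 21, 7, 8, 22, 3, 37, 17, 55]


def stem_plot(values, ordered=True):
    """Return the rows of a stem plot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)  # leading digit(s) and trailing digit
        leaves[stem].append(leaf)
    rows = []
    # Include every stem from the smallest to the largest, even one with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row = sorted(leaves[stem]) if ordered else leaves[stem]
        rows.append((f"{stem} | " + " ".join(str(leaf) for leaf in row)).rstrip())
    return rows


unordered_rows = stem_plot(rates, ordered=False)
ordered_rows = stem_plot(rates, ordered=True)
print("\n".join(ordered_rows))
```

Passing `ordered=False` keeps the leaves in the order the data were listed, which corresponds to the unordered stem plot built in step 2; the default sorts each row, as in step 3.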

    Using a stem plot to describe a distribution

    Stem plots are just like histograms, except that you can see all the data values. This enables more precise estimates to be made of the centre and spread.


    Methods for determining the centre, spread and outliers from a stem plot

    Centre (middle): Count up from either end of the distribution until you find the middle value; the value that has an equal number of data values either side.

    For an odd number of data values, n, the middle value is the ((n + 1)/2)th value. Thus, the median will be an actual data value.

    For an even number of data values, n, the middle value is the ((n + 1)/2)th value. Thus, the median will lie between two data values.

    Spread (range): Subtract the smallest data value from the largest data value.

    Range = largest value − smallest value

    Outliers: Data values that stand out from the main body of data are called outliers. Their values can be read directly from the stem plot.

    Example 10 Describing a stem plot in terms of shape, centre and spread

    The ordered stem plot opposite shows the

    distribution of test marks of 23 students.

    a Name its shape and note outliers (if any).

    b Locate the centre of the distribution.

    c Estimate the spread of the distribution.

    d Write down the values of any outliers.

    Test marks

    0 |
    1 | 5 9 9 9
    2 | 0 4 5 7 8 8 8
    3 | 0 3 5 5 6 8
    4 | 1 2 3 3 5
    5 |
    6 | 0

    Solution

    a Shape
    The distribution is approximately symmetric with one outlier.

    b Centre
    There are 23 data values; the middle value is the 12th value. Check by counting.
    The distribution is centred at 30 marks.

    c Spread
    Use the range to estimate the spread.
    Spread = 60 − 15 = 45 marks

    d Outlier
    Read off the value of the outlier.
    Outlier = 60 marks

    If you were using the stem plot to describe the distribution in a form suitable for a statistical

    report, you might write as follows.

    Report

    For the 23 students, the distribution of marks is approximately symmetric with an outlier.

    The centre of the distribution is at 30 marks and the distribution has a spread of 45

    marks. The outlier is a mark of 60.
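The counting in Example 10 can be checked with a short Python sketch. The function name `centre_and_spread` is illustrative only. The marks are read off the stem plot in the example, and for an even number of values the sketch takes the midpoint of the two middle values, which is a common convention that this section has not yet stated.

```python
# The 23 test marks read from the ordered stem plot in Example 10.
marks = [15, 19, 19, 19, 20, 24, 25, 27, 28, 28, 28, 30,
         33, 35, 35, 36, 38, 41, 42, 43, 43, 45, 60]


def centre_and_spread(values):
    """Return (position of the middle value, centre, range) for a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position of the middle value
    if n % 2 == 1:
        centre = ordered[int(position) - 1]  # an actual data value
    else:
        # Assumption: take the midpoint of the two values either side of the middle.
        centre = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    spread = ordered[-1] - ordered[0]  # largest value minus smallest value
    return position, centre, spread


position, centre, spread = centre_and_spread(marks)
print(f"middle position: {position}, centre: {centre}, spread: {spread}")
```

For these 23 marks the middle position is the 12th value, giving a centre of 30 marks and a spread of 45 marks, in line with the report.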

    Split stems

    In some instances, using the simple process outlined above produces a stem plot that is too

    bunched up to give us a good overall picture of the variation in the data. This is often the case


    when the data values all have the same first digit or the same one or two first digits. For

    example, a group of 17 VCE students recently sat for a statistics test marked out of 20. The

    results are as shown below.

    2 12 13 9 18 17 7 16 12 10 16 14 11 15 16 15 17

    Using the process described in Example 10 to form a stem plot, we end up with a bunched-up plot like the one below.

    0 | 2 7 9
    1 | 0 1 2 2 3 4 5 5 6 6 6 7 7 8

    When this happens, the stem plot scale can be stretched out by splitting the stems. Generally

    the stem is split into halves or fifths. For example, for the interval 10–19, the split stem system works as follows.

    Single stem    Stem split into halves    Stem split into fifths
    1 (10–19)      1 (10–14)                 1 (10–11)
                   1 (15–19)                 1 (12–13)
                                             1 (14–15)
                                             1 (16–17)
                                             1 (18–19)

    In a stem plot with a single stem, the 1 represents the interval 10–19.

    In a stem plot with its stem split into halves, the top 1 represents the interval 10–14, while the bottom 1 represents the interval 15–19.

    In a stem plot with its stem split into fifths, the top 1 represents the interval 10–11, the second 1 represents the interval 12–13, the third 1 represents the interval 14–15, the fourth 1 represents the interval 16–17, while the bottom 1 represents the interval 18–19.

    Comparison of stem plots with different split stems

    Using a split stem plot to display the test marks can show features not revealed by a standard

    plot. This can be seen in the next plot with the stem split into fifths, indicating that a mark of 2

    is an outlier.

    Single stem

    0 | 2 7 9
    1 | 0 1 2 2 3 4 5 5 6 6 6 7 7 8

    Stem split into halves

    0 | 2
    0 | 7 9
    1 | 0 1 2 2 3 4
    1 | 5 5 6 6 6 7 7 8

    Stem split into fifths

    0 |
    0 | 2
    0 |
    0 | 7
    0 | 9
    1 | 0 1
    1 | 2 2 3
    1 | 4 5 5
    1 | 6 6 6 7 7
    1 | 8
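The three versions of the plot can be generated with one small Python sketch. The function name `split_stem_plot` and its `parts` parameter are hypothetical, introduced only for illustration: `parts` is 1 for a single stem, 2 for halves and 5 for fifths. The sketch assumes non-negative whole numbers.

```python
# The 17 statistics test marks (out of 20) from the text.
test_marks = [2, 12, 13, 9, 18, 17, 7, 16, 12, 10, 16, 14, 11, 15, 16, 15, 17]


def split_stem_plot(values, parts=1):
    """Rows of an ordered stem plot with each stem split into `parts` rows (1, 2 or 5)."""
    if parts not in (1, 2, 5):
        raise ValueError("parts must be 1, 2 or 5")
    width = 10 // parts  # how many leaf digits each row covers
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows.setdefault((stem, leaf // width), []).append(leaf)
    lines = []
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for part in range(parts):
            leaves = rows.get((stem, part), [])  # empty rows are still printed
            lines.append(f"{stem} | {' '.join(map(str, leaves))}".rstrip())
    return lines


single = split_stem_plot(test_marks, parts=1)
halves = split_stem_plot(test_marks, parts=2)
fifths = split_stem_plot(test_marks, parts=5)
print("\n".join(fifths))
```

With the stem split into fifths, the mark of 2 sits on a row of its own with empty rows on either side, which is what makes it stand out as an outlier.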


    2 Plot each data value by marking in a dot above the corresponding value on the number line.

    [Dot plot: number line from 17 to 30; horizontal axis: Age (years)]

    Interpreting a dot plot

    Dot plots are interpreted in much the same way as stem plots. However, usually there is little

    we can say about the shape of the distribution from the dot plot because there are not sufficient

    data points for any pattern to be revealed.

    From the dot plot in Example 12, we see that the distribution of ages is centred at 22 years

    (the middle value) with a spread of 11 years (29 − 18 = 11).

    Which graph?

    One of the issues that you will face is choosing a suitable graph to display a distribution. The following guidelines might help you in your decision-making. They are guidelines only, because in some instances there may be more than one suitable graph.

    Type of data    Graph                  Qualifications on use
    Categorical     Bar chart
                    Segmented bar chart    Not too many categories (4 or 5 maximum)
    Numerical       Histogram              Best for medium to large data sets (n ≥ 40)
                    Stem plot              Best for small to medium sized data sets (n ≤ 50)
                    Dot plot               Suitable only for small data sets (n ≤ 20)
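These guidelines can be written as a simple decision rule. The Python sketch below is only an illustration: the function name `suggest_graphs` and its parameters are hypothetical, and it assumes the size limits in the table read as n ≥ 40 for a histogram, n ≤ 50 for a stem plot and n ≤ 20 for a dot plot. Because the limits overlap, more than one graph can be returned.

```python
def suggest_graphs(data_type, n, n_categories=None):
    """Suggest suitable graphs for a variable with n data values.

    data_type is "categorical" or "numerical"; n_categories is only
    used for categorical data.
    """
    if data_type == "categorical":
        graphs = ["bar chart"]
        # Segmented bar charts are suggested only for a few categories (5 at most).
        if n_categories is not None and n_categories <= 5:
            graphs.append("segmented bar chart")
        return graphs
    if data_type != "numerical":
        raise ValueError("data_type must be 'categorical' or 'numerical'")
    graphs = []
    if n >= 40:
        graphs.append("histogram")  # medium to large data sets
    if n <= 50:
        graphs.append("stem plot")  # small to medium sized data sets
    if n <= 20:
        graphs.append("dot plot")   # small data sets only
    return graphs


print(suggest_graphs("numerical", 45))
print(suggest_graphs("categorical", 200, n_categories=3))
```

A data set of 45 values falls inside both the histogram and stem plot limits, which reflects the point that the table gives guidelines rather than fixed rules.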

    Exercise 1E

    1 The data below give the urbanisation rates (%) in 23 countries.

    54 99 22 20 31 3 22 9 25 3 56 12

    16 9 29 6 28 100 17 9 35 27 12

    a Construct an ordered stem plot.

    b What advantage does a stem plot have over a histogram?

    2 For each of the following stem plots (A, B and C):

    a name its shape and note outliers (if any)

    b locate the centre of the distribution

    c determine the spread of the distribution

    d write down the values of outliers (if any)

    Stem plot A Stem plot B Stem plot C

    0 0 0 1 1 2 6 7 7 9 0 0 0 1 3

    1 2 2 3 5 5 5 5 6 1 1 3 6 9 1

    2 0 1 4 7 2 0 0 1 5 6 8 8 2

    3 2 2 3 2 2 2 4 5 9 9 9 3 2

    4 0 4 1 2 4 4 6 4 0 2 4

    5 2 5 2 3 5 1 1 3 5 8 8

    6 6 2 6 0 0 4 4 4 7 7 8 9


    3 The data below give the wrist circumference (in cm) of 15 men.

    16.9 17.3 19.3 18.5 18.2 18.4 19.9 16.7 17.1 17.6 17.7 16.5 17.0 17.2 17.6

    a Construct a stem plot for wrist circumference using:

    i stems 16, 17, 18, 19 ii these stems split into halves

    b Which stem plot appears to be more appropriate for the data?

    c Use the stem plot with split stems to help you complete the report below.

    Report

    For the men, the distribution of their wrist circumference is ______. The centre of the distribution is at ______ cm and it has a spread of ______ cm. There are no outliers.

    4 The data below give the weight (in kg) of 22 students.

    57 58 62 84 64 74 57 55 56 60 75

    68 59 72 110 56 69 56 50 60 75 58

    a Construct a stem plot for weight using:

    i stems 5, 6, 7, 8, 9, 10 and 11 ii these stems split into halves

    b Use the stem plot with a split stem to write a brief report on the distribution of the

    weights of the students in terms of shape (and outliers), centre and spread. Use the report

    from Question 3 as a model.

    5 The number of possessions (kicks, marks, handballs, knockouts etc.) recorded for players in a

    football game between Carlton and Essendon is shown below.

    Carlton Essendon

    10 44 32 44 19 35 11 5 24 28 21 32 21 59 21 12 19 26 23 22 29 34

    22 34 36 20 14 25 16 19 32 32 14 29 8 22 21 26 44 19 21 22

    a Display the data in the form of an ordered back-to-back stem plot.

    b Complete the following report comparing the two distributions in terms of shape (and

    outliers), centre and spread.

    Report

    The distribution of the number of possessions is ______ for both teams. The two distributions have similar centres, at ______ and ______ possessions, respectively. The spread of the distribution is less for Carlton, ______ possessions, compared to ______ possessions for Essendon.

    6 The following data give the number of children in the families of 14 VCE students:

    1 6 2 5 5 3 4 4 2 7 3 4 3 4

    a Construct a dot plot.

    b What is the mode?

    c What is:

    i the centre? ii the spread?


    7 The following data give the life expectancies in years of 13 countries:

    76 75 74 74 73 73 75 71 72 75 75 78 72

    a Construct a dot plot.

    b What is the mode?

    c What is:

    i the centre? ii the spread?

    8 Data have been collected for each of the following variables. The data are to be displayed

    graphically. In each case, decide which is the most appropriate graph. Select from bar chart,

    histogram, stem plot or dot plot. Sometimes more than one sort of graph is suitable.

    a number of passengers in a bus 1000 buses in sample

    b amount of petrol purchased (in litres) 30 petrol purchases

    c type of petrol purchased (super, unleaded, premium)

    d prices of houses sold in Melbourne over a weekend

    e the number of medals won by countries winning medals at the Olympics

    f state of residence of a sample of 200 Australians

    g number of cigarettes smoked in a day (a sample of 120 people)

    h resting pulse rates of 7 students


    Key ideas and chapter summary

    Types of data Data can be classified asnumericalorcategorical.

    Frequency table A frequency table is a listing of the values a variable takes in a data set,

    along with how often (frequently) each value occurs. Frequency can be recorded as a:

    count: the number of times a value occurs; for example, the number of females in the data set is 32

    per cent: the percentage of times a value occurs; for example, the percentage of females in the data set is 45.5%.
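The two ways of recording frequency can be sketched in a few lines of Python. This is a minimal illustration only: the function name `frequency_table` is hypothetical, and the data are the eight values in the Sex column of Table 1.1, not the larger data set the summary refers to.

```python
from collections import Counter

def frequency_table(values):
    """Return {value: (count, per cent)} for a list of categorical values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (count, round(100 * count / total, 1))
            for value, count in counts.items()}

# Sex column of Table 1.1 (eight students).
sex = ["M", "M", "M", "F", "M", "F", "F", "M"]
table = frequency_table(sex)

for value, (count, percent) in sorted(table.items()):
    print(f"{value}: count = {count}, per cent = {percent}")
```

The percentages are rounded to one decimal place, so in larger tables they may not sum to exactly 100.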

    Categorical data Categorical data arise when classifying or naming some quality or

    attribute; for example, place of birth, hair colour.

    Bar chart Bar charts are used to display the frequency distribution of categorical

    data.

    Describing distributions of categorical variables For a small number of categories, the

    distribution of a categorical variable is described in terms of the dominant category (if any),

    the order of occurrence of each category and its relative importance.

    Mode The mode is the value or group of values that occurs most often

    (frequently) in a data set. For example, for the data 2 1 1 3 3 2 5 1 6 1 1

    2 1 1, the mode is 1, because it is the data value that occurs most often.
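As a quick check of the example above, the mode can be found by counting how often each value occurs and keeping the most frequent. The helper name `modes` is hypothetical; it returns a list because a data set can have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

# Data from the example in the text.
data = [2, 1, 1, 3, 3, 2, 5, 1, 6, 1, 1, 2, 1, 1]
print(modes(data))  # [1]
```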

    Numerical data Numerical data arise from measuring or counting some quantity; for

    example, height, number of people etc. Numerical data can be discrete or continuous.

    Discrete data arise when you count. Continuous data arise when you measure.

    Histogram A histogram is used to display the frequency distribution of a numerical

    variable; suitable for medium to large sized data sets.

    Stem plot A stem plot is an alternative graphical display to the histogram; suitable

    for small to medium sized data sets.

    The advantage of the stem plot over the histogram is that it shows the

    value of each data point.
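The following sketch shows one way to build a stem plot by hand-style rules: the tens digit is the stem and the units digit is the leaf. It assumes non-negative whole-number data; the function name `stem_plot` is hypothetical, and the data are the pulse rates from Table 1.1.

```python
from collections import defaultdict

def stem_plot(values):
    """Return the rows of a stem plot (stem = tens, leaf = units)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # Include every stem between the smallest and largest, even if empty,
    # so that gaps in the data remain visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves[stem])
        rows.append(f"{stem} | {row}".rstrip())
    return rows

# Pulse rates (beats/min) from Table 1.1.
pulse = [86, 82, 96, 71, 90, 78, 88, 70]
for row in stem_plot(pulse):
    print(row)
```

Because each leaf is an actual digit of a data value, every original value can be read back from the plot, which is the advantage noted above.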

    Dot plot A dot plot consists of a number line with each data point marked by a

    dot; suitable for small sets of data only.
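A text-only dot plot can be sketched by stacking one marker per data point above each position on a number line. This is an illustration under simplifying assumptions: the helper name `dot_plot` is hypothetical, and the columns only line up for single-digit whole-number values. The data are those used in the mode example.

```python
from collections import Counter

def dot_plot(values):
    """Return the rows of a text dot plot, top row first, axis last."""
    counts = Counter(values)
    axis = list(range(min(values), max(values) + 1))
    rows = []
    # Build from the tallest stack down to the first level of dots.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("o" if counts[v] >= level else " " for v in axis)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in axis))
    return rows

data = [2, 1, 1, 3, 3, 2, 5, 1, 6, 1, 1, 2, 1, 1]
for row in dot_plot(data):
    print(row)
```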

    Describing the distribution of a numerical variable The distribution of a numerical variable

    can be described in terms of:

    shape: symmetric or skewed (positive or negative)?

    outliers: values that appear to stand out

    centre: the midpoint of the distribution (median)

    spread: one measure is the range of values covered (Range = largest value − smallest value)
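Centre and spread, as defined above, can be computed directly. The sketch below uses the pulse rates from Table 1.1 and Python's standard `statistics.median`; the variable names are illustrative only.

```python
from statistics import median

# Pulse rates (beats/min) from Table 1.1.
pulse = [86, 82, 96, 71, 90, 78, 88, 70]

centre = median(pulse)            # midpoint of the ordered values
spread = max(pulse) - min(pulse)  # range = largest value - smallest value

print(f"centre (median) = {centre}")
print(f"spread (range)  = {spread}")
```

With eight values, the median is the mean of the two middle values in the ordered list (82 and 86), giving 84; the range is 96 − 70 = 26.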


    The following information relates to Questions 4 to 6

    A number of teenagers were asked to

    nominate their favourite leisure

    activity. Their responses have been

    organised into a frequency table, as shown. Some information is missing.

                                Frequency
    Leisure activity        Count   Percentage
    Sport                      73         29.2
    Listening to music         70
    Watching TV                           19.2
    Other                      59         23.6
    Total                     250

    4 The percentage of students who said that listening to music was their favourite

    leisure activity is:

    A 17.5 B 28.0 C 29.2 D 50.0 E 70.0

    5 The number of students who said watching TV was their favourite leisure activity

    is:

    A 19 B 48 C 62 D 125 E 70.0

    6 For the students surveyed, the most popular leisure activity is:

    A sport B listening to music C watching TV

    D other E can't tell

    Questions 7 to 11 relate to the histogram shown below

    This histogram displays the test scores of a class

    of Further Mathematics students.

    [Figure: histogram; horizontal axis: Test score (6 to 28); vertical axis: Frequency (0 to 6)]

    7 The total number of students in the class is:

    A 6 B 18 C 20 D 21 E 22

    8 The number of students in the class who

    obtained a test score less than 14 is:

    A 4 B 10 C 14 D 17 E 28

    9 The histogram is best described as:

    A negatively skewed B negatively skewed with an outlier

    C approximately symmetric

    D approximately symmetric with outliers E positively skewed

    10 The centre of the distribution lies in the interval:

    A 8–10 B 10–12 C 12–14 D 14–16 E 18–20

    11 The spread of the students' marks is:

    A 8 B 10 C 12 D 20 E 22

    12 For the stem plot shown opposite, the modal interval is:

    A 20–24 B 25–29 C 20–29

    D 25 E 29

    1 | 0 2
    1 | 5 5 6 9
    2 | 3 3 4
    2 | 5 7 9 9 9
    3 | 0 1 2 4


    The following information relates to Questions 13 and 14

    This percentage segmented bar chart

    shows the distribution of hair

    colour for 200 students.

    [Figure: percentage segmented bar chart; vertical axis: Percentage (0 to 100); segments: Blonde, Brown, Black, Red, Other]

    13 The number of students with brown hair is closest to:

    A 4 B 34 C 57

    D 68 E 114

    14 For these students, the most common

    hair colour is:

    A black B blonde C brown D red E other

    15 The ages of 11 primary school children were collected. The best graph to display

    the distribution of ages of these children would be a:

    A bar chart B dot plot C histogram

    D segmented bar chart E stem plot

    Extended-response questions

    1 One hundred and twenty-one students were

    asked to identify their preferred leisure activity.

    The results of the survey are displayed in a

    bar chart.

    [Figure: bar chart; horizontal axis: Preferred leisure activity (Sport, TV, Music, Movies, Reading, Other); vertical axis: Percentage (0 to 30)]

    a What percentage of students nominated

    watching TV as their preferred leisure

    activity?

    b What percentage of students in total

    nominated either going to the movies or

    reading as their preferred leisure activity?

    c What is the most popular leisure activity for

    these students? How many students rated this

    activity as their preferred leisure activity?

    2 The number of people killed in natural and non-natural disasters in 1997 by world

    region is shown in the table below.

    a Construct a bar chart.

    b In which region was the:

    i greatest number of people killed?

    ii least number of people killed?

    Region          Number killed
    Europe                    874
    Africa                  8 327
    Asia                   10 551
    Oceania*                  457
    The Americas            1 581

    * includes Australia (41)


    e i Name the shape of the distribution displayed by the histogram.

    ii Locate the interval containing the centre of the distribution.

    iii Determine the spread of the distribution using the range.

    6 This stem plot displays the ages (in years) of a group of women.

    Note: 17 | 2 = 17.2 years

    17 | 2 3 4
    17 | 5 6 6 8 8 9 9
    18 | 0 1 3 3 3 4
    18 | 5 5 5 5 5 5 6 7 8 8 8 9
    19 | 1 2 2 3 3
    19 | 8
    20 |
    20 | 6

    a What was the age of the youngest woman?

    b In terms of age, one of the women is a

    possible outlier. What is her age?

    c How many women were aged between

    17.0 and 17.4 years, inclusive?

    d How many women were 19 years old

    or older?

    e What is the modal age category?

    f What percentage of women were younger

    than 20 years old?

    g i Name the shape of the distribution

    of ages, noting outliers.

    ii Locate the centre of the distribution.

    iii Determine the spread of the distribution.

    7 The distribution of the waiting times of 37 cars

    stopped by a traffic light is as shown in

    the histogram opposite. Use the histogram to

    write a report on the distribution of waiting

    times in terms of shape, centre, spread

    and outliers.

    [Figure: histogram; horizontal axis: Waiting time (seconds), 5 to 55; vertical axis: Frequency (0 to 10)]

    8 Use a graphics calculator to construct histograms for the following sets of data.

    a Use intervals of width 5 starting at 90.

    Monthly expenditure on entertainment (in dollars)

    110 115 105 98 118 114 125 95 114 104 97 130 122

    112 107 135 121 94 108 118 106 121 125 107 109 93

    b Use intervals of width 8 starting at 32.

    Life span (in years)

    58 65 68 74 73 73 75 71 72 61 67 66 37