Chapter 2: Data Collection (Statistics)

DATA COLLECTION AND PRESENTATION CHAPTER-II


  • Definition of Data

    Data are considered the raw material of statistics. Data are numerical measurements that are collected in a scientific, systematic way and are related to predetermined objectives. Statistical observations are called data.

  • Source of Data: Data can be obtained from three important sources:

    Primary data: questioning (personal interview, mail, telephone), observations, focused group discussion (FGD).

    Internal data.

    Secondary data.

  • Primary Data:

    Primary data are measurements observed & recorded as part of an original study. When the data required for a particular study are not already available, it may become necessary to collect original data by conducting a first-hand investigation.

  • Internal Data: Internal data refer to the measurements that are the by-product of routine business record keeping, such as accounting, finance, production, personnel, quality control, sales, research and development, etc.

  • Secondary Data: When an investigator uses data which have already been collected by others, such data are called secondary data. Secondary data can be obtained from journals, reports, government publications, etc.

  • METHOD OF OBTAINING PRIMARY DATA:

    There are three basic methods of collecting primary data:

    Questioning. Observations. Focused group discussion (FGD).

  • Questioning: Questioning, as the name suggests, is distinguished by the fact that data are collected by asking questions of people who are thought to have the desired information. Questions may be asked in person or in writing. A formal list of such questions is called a questionnaire.

    A distinction is often made between a questionnaire & a schedule. A questionnaire is a device for securing answers to questions by using a form which the respondent fills in himself. A schedule is the name usually applied to a set of questions which are asked & filled in by an interviewer in a face-to-face situation with another person.

  • Observations: When the data are collected by observation, the investigator asks no questions. Instead he observes the objects or actions in which he is interested. Sometimes individuals make the observations; on other occasions, mechanical devices observe & record the desired information.

    The observation method does not automatically produce accurate data. Physical difficulties in the observation situation or on the part of the observer may result in errors. Even more important, however, is the influence on observations of the observer's training, philosophy, opinions & expectations.

    Examples include the reading of X-ray films & ECGs, and assessing the state of repair of roads.

  • FOCUSED GROUP DISCUSSION: When using focused group discussions as a research technique, the researcher is no longer the center of activity but rather lets the informants discuss with each other while providing guidance. Tools: discussion guide, tape recorder.

  • QUESTIONNAIRE METHODS: Of the three methods named above, the questionnaire method is the most widely used for collecting business data. When the questionnaire method is used, three different techniques of communication are available: personal interview, mail, telephone.

  • 1. Personal Interview:

    Personal interviews are those in which an interviewer obtains information from respondents in face-to-face meetings.

    The information obtained by this method is likely to be more accurate because the interviewer can clear up doubts, can cross-examine the informants & thereby obtain correct information.

  • 2. Mail: In most mail surveys, questionnaires are mailed to the respondents, who fill them in & return them by mail. Sometimes mail questionnaires are placed in respondents' hands by other means, such as attaching them to consumer products, putting them in magazines or newspapers, or having field workers leave them with respondents. In each case respondents complete the questionnaires themselves & send back the completed forms by mail.

    This method has a special advantage in surveys where the field of investigation is very vast & the informants are spread over a wide geographical area.

  • 3. Telephone Interviews: These are similar to personal interviews except that communication between interviewer & respondent is by telephone instead of direct personal contact.

    However, this method has several limitations: it cannot be used to interview people who do not have a telephone, a telephone conversation cannot be very long, & replies on the phone can be erratic & unreliable.

  • DATA COLLECTION TECHNIQUES AND TOOLS

    Each data collection technique is listed with its data collection tools:

    Using available information: checklist, data compilation form.

    Observing: eyes and ears, pen and paper, watch, tape or video recorder, etc.

    Interviewing: interview schedule, checklist, questionnaire, tape recorder.

    Administering written questionnaires: questionnaire.

    Organizing focus group discussions: discussion guide, tape recorder.

  • Data: Statistical observations are called data.

    Attribute: A phenomenon that is expressed in some qualitative form is termed an attribute. Examples: occupation, religion, education, etc.

  • Variable: A variable is a quantitative characteristic of a person, object or phenomenon that can take more than one value. Examples: income, price, height, weight, family size. There are mainly four types of variables:

    Discrete variable. Continuous variable. Dependent variable. Independent variable.

  • 1. Discrete Variable: Discrete variables are those which can vary only by finite jumps & cannot manifest every conceivable fractional value. They are obtained by counting. Examples: the number of persons in a family, the number of rooms in a house, the number of employees.

  • 2. Continuous Variable:

    A continuous variable is capable of manifesting every conceivable fractional value within the range of possibilities. It is obtained by measurement. Examples: height, weight of a product, age.

  • 3. Dependent Variable: The variable that is used to describe or measure the problem under study is called the dependent variable.

    4. Independent Variable: The variables that are used to describe or measure the factors that are assumed to cause, or at least influence, the problem are called independent variables.

  • Presentation of Data: Presentation can take place mainly in two forms: statistical tables & statistical charts.

    A statistical table is the presentation of numbers in a logical arrangement, with some brief explanation to show what they are.

    A statistical chart is a pictorial device for presenting data.

  • Classification of Data

    Classification is a grouping of related facts into different classes. Facts in one class differ from those of another class with respect to some characteristic, called the basis of classification. Sorting facts on one basis of classification & then on another basis is called cross-classification. Classification of data is a function very similar to that of sorting letters in a post office (the common characteristic being the destination). This process helps in gathering important information while dropping unnecessary details, enabling statistical treatment.
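    As a rough illustration of classification and cross-classification, the sketch below counts a handful of records first on one basis and then on two. The records, place names and variable names are hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical records: (district, literacy status) for six persons.
records = [
    ("Dhaka", "literate"),
    ("Dhaka", "illiterate"),
    ("Khulna", "literate"),
    ("Dhaka", "literate"),
    ("Khulna", "illiterate"),
    ("Khulna", "literate"),
]

# Classification on one basis: count the records in each district.
by_district = Counter(district for district, _ in records)

# Cross-classification: sort on district and then on literacy status.
cross = Counter(records)

print(by_district)
print(cross[("Dhaka", "literate")])
```

    Each key of `cross` is one cell of the cross-classified table, so the same records can be read on either basis.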

  • Types of classification, by basis:

    Geographical: area-wise, e.g. cities, districts.

    Chronological: on the basis of time.

    Qualitative: according to some attribute.

    Quantitative: in terms of some measurable quantity (magnitude).

  • 1. Geographical Classification:

    The basis is geographical or locational differences, e.g. production of wheat or rice in different states.

    Usually listed in alphabetical order for easy reference.

    May also be listed by size to emphasize the important areas, as in ranking the states by population.

  • 2. Chronological Classification: Basis is period of time. Time series are usually listed in chronological order normally starting with the earliest period. When the major emphasis falls on the most recent events, a reverse time order may be used.

  • 3. QUALITATIVE CLASSIFICATION: The basis is some attribute or quality such as color of hair, literacy, religion, sex, etc.

    The point to note in this type of classification is that the attribute under study cannot be measured: one can only find out whether it is present or absent in the units of the population under study.

    Example: blindness. We may find out how many persons are blind in a given population, but it is not possible to measure the degree of blindness in each case. Thus, when one attribute is studied, two classes are formed, one possessing the attribute & the other not possessing it. This classification is known as simple classification.

  • Population: blind, non-blind. This type of classification, where only two classes are formed, is also called twofold or dichotomous classification.

    If, instead of forming only two classes, we further divide the data on the basis of some attributes so as to form several classes, the classification is known as manifold classification.

  • 4. Quantitative Classification: The basis is some measurable characteristic such as height, weight, income or sales. Workers of a factory may be classified according to wages. In this type of classification there are two elements:

    The variable: wage, in the example below.

    The frequency: the number of workers in each class.

  • CLASSIFICATION ACCORDING TO CLASS INTERVALS:

    1. Class Limit:

    The class limits are the lowest & the highest values that can be included in the class. In the class 20 - 30, 20 is the lower limit & 30 is the upper limit. The lower limit of a class is the value below which there can be no value in that class, & the upper limit of a class is the value above which no value can belong to that class.

  • 2. Class Interval:

    The span of the class, that is, the difference between the upper limit & the lower limit, is known as the class interval. For the class 20 - 30, the class interval is 30 - 20 = 10. The size of the class interval is determined by the number of classes & the total range of the data.

  • 3. Class Frequency:

    The number of observations corresponding to the particular class is known as the frequency of that class or class frequency.

  • Income          No. of employees
    800 - 900       50
    900 - 1000      100
    1000 - 1100     200
    1100 - 1200     150
    1200 - 1300     40
    1300 - 1400     10
    Total           550

  • There are 50 employees with income between 800 & 900. If we add together the frequencies of all the individual classes, we obtain the total frequency. The total frequency of the six classes is 550, which means that in all there are 550 employees whose income has been studied.
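    The class interval and total frequency for the income table can be checked with a few lines of code. This is only a sketch; the list layout and variable names are illustrative choices, while the numbers are those of the income table.

```python
# (lower limit, upper limit, number of employees) for each income class.
classes = [
    (800, 900, 50),
    (900, 1000, 100),
    (1000, 1100, 200),
    (1100, 1200, 150),
    (1200, 1300, 40),
    (1300, 1400, 10),
]

# Class interval = upper limit - lower limit.
intervals = [upper - lower for lower, upper, _ in classes]

# Total frequency = sum of the individual class frequencies.
total_frequency = sum(freq for _, _, freq in classes)

print(intervals)
print(total_frequency)
```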

  • 4. Class Midpoint:

    Midpoint of a class = (upper limit of the class + lower limit of the class) ÷ 2.
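    A minimal sketch of the midpoint calculation, applied to the income classes used in this chapter. The function name `class_midpoint` is an illustrative choice.

```python
def class_midpoint(lower, upper):
    """Return the midpoint of a class from its lower and upper limits."""
    return (lower + upper) / 2

# Class limits of the income table.
limits = [(800, 900), (900, 1000), (1000, 1100),
          (1100, 1200), (1200, 1300), (1300, 1400)]

midpoints = [class_midpoint(lower, upper) for lower, upper in limits]
print(midpoints)
```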

    There are two methods of classifying the data according to class intervals: the exclusive method & the inclusive method.

  • EXCLUSIVE METHOD: When the class intervals are so fixed that the upper limit of one class is the lower limit of the next class, it is known as the exclusive method.

    Income          No. of employees
    800 - 900       50
    900 - 1000      100
    1000 - 1100     200
    1100 - 1200     150
    1200 - 1300     40
    1300 - 1400     10
    Total           550

  • The exclusive method ensures continuity of data.

    In the above example there are 50 persons whose income is between Tk. 800 & Tk. 899.99. A person whose income is exactly Tk. 900 would be included in the 900 - 1000 class.
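    The exclusive rule amounts to a half-open test, lower limit <= value < upper limit. The sketch below assumes the six income classes of width 100 starting at 800; the function name and default parameters are illustrative.

```python
def exclusive_class(value, start=800, width=100, n_classes=6):
    """Return the (lower, upper) class holding value, where lower <= value < upper."""
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        if lower <= value < upper:
            return (lower, upper)
    return None  # value lies outside all classes

print(exclusive_class(899.99))
print(exclusive_class(900))
```

    Because every upper limit is also the next lower limit, no value between 800 and 1400 can fall into a gap between classes.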

  • INCLUSIVE METHOD: Under this method of classification, the upper limit of a class is included in that class itself.

    Income          No. of employees
    800 - 899       50
    900 - 999       100
    1000 - 1099     200
    1100 - 1199     150
    1200 - 1299     40
    1300 - 1399     10
    Total           550

  • In the class 800 - 899 we include persons whose income is between 800 & 899. If the income of a person is exactly 900, he is included in the next class.

    It should be noted that both the inclusive & the exclusive method give the same class interval. For the inclusive method, the class interval is obtained by taking the difference between two successive upper limits: class interval = 999 - 899 = 100.
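    A comparable sketch for the inclusive method, assuming whole-number incomes and classes running 800 - 899, 900 - 999 and so on. The names are illustrative.

```python
# Inclusive classes: both limits belong to the class.
inclusive_classes = [(800, 899), (900, 999), (1000, 1099),
                     (1100, 1199), (1200, 1299), (1300, 1399)]

def inclusive_class(value, classes):
    """Return the (lower, upper) class holding value, where lower <= value <= upper."""
    for lower, upper in classes:
        if lower <= value <= upper:
            return (lower, upper)
    return None  # value lies outside all classes

# Class interval = difference between successive upper limits.
upper_limits = [upper for _, upper in inclusive_classes]
inclusive_intervals = [b - a for a, b in zip(upper_limits, upper_limits[1:])]

print(inclusive_class(899, inclusive_classes))
print(inclusive_class(900, inclusive_classes))
print(inclusive_intervals)
```

    A fractional value such as 899.5 matches no class here, which is why this sketch assumes whole-number data.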

  • STATISTICAL TABLE One of the simplest & most revealing devices for summarizing data in a meaningful fashion is the statistical table. The purpose of the table is to simplify the presentation & to facilitate comparison.

    Parts of a Table:

    1. Table Number:

    The number may be given either in the center at the top, above the title, or on the left-hand side (L.H.S.) of the table, at the top or at the bottom.

  • 2. Title of the table: The title is the description of the content of the table. A complete title has to answer the questions

    What precisely are the data in the table? Where did the data occur (the precise geographical, political or physical area covered)? When did the data occur?

    The title should be clear, brief & self-explanatory.

  • 3. Caption: Caption refers to the column heading. It explains what the column represents. It may consist of one or more column headings. Under a column heading there may be subheads. The caption should be clearly defined & placed at the middle of the column. If the different columns are expressed in different units, the units should be specified along with the captions.

  • 4. Stub:

    Stubs are designation of the rows or row headings. They are placed at the extreme left.

    5. Body:

    The body of the table contains the numerical information. This is the most vital part of the table. Data presented in the body are arranged according to the descriptions & classifications of the captions & stubs.

  • 6. Head note:

    It is a brief explanatory statement applying to all or a major part of the material in the table, & is placed below the title, centered & enclosed in brackets. It is used to explain certain points relating to the whole table that have not been included in the title or in the captions or stubs. For example, the unit of measurement is frequently written as the head note, such as in thousands, in million tonnes or in crores.

  • 7. Foot note:

    Anything in a table which the reader may find difficult to understand from the title, stubs or captions should be explained in footnotes. If footnotes are needed, they are placed directly below the body of the table. Footnotes are used for four main purposes:

    To point out exceptions, so as to avoid any wrong conclusion based on them.

    To note any special circumstances affecting the data, e.g. a strike.

    To clarify anything in the table.

    To give the source in the case of secondary data.
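    To see how the parts fit together, the sketch below assembles a small text table from a number, title, head note, captions, stubs, body and footnote. The table number, title, head note and footnote are hypothetical labels invented for illustration; only the three income rows come from this chapter.

```python
# All labels below except the income rows are hypothetical.
table = {
    "number": "Table 1",
    "title": "Monthly income of employees, Factory X, 2010",
    "head_note": "(Income in taka)",
    "captions": ["Income", "No. of employees"],  # column headings
    "rows": [                                     # (stub, body value)
        ("800-900", 50),
        ("900-1000", 100),
        ("1000-1100", 200),
    ],
    "foot_note": "Source: hypothetical payroll records.",
}

def render(t):
    """Lay out the parts of a table in their usual order, top to bottom."""
    lines = [t["number"], t["title"], t["head_note"]]
    lines.append(f'{t["captions"][0]:<12}{t["captions"][1]:>18}')
    for stub, value in t["rows"]:
        lines.append(f"{stub:<12}{value:>18}")
    lines.append(t["foot_note"])
    return "\n".join(lines)

rendered = render(table)
print(rendered)
```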

  • Graphs: Broadly, the various graphs can be divided under the following two heads:

    Graphs of time series, or line graphs. Graphs of frequency distributions.

    Graphs of Frequency Distributions:

    Histograms or column diagrams. Frequency polygon. Smoothed frequency curve. Cumulative frequency curve.

  • FREQUENCY DISTRIBUTION: A frequency distribution is a statistical table which shows the set of all distinct values of the variable, arranged in order of magnitude either individually or in groups, with their corresponding frequencies side by side. A frequency distribution table is given below:


  • PROBLEM: The profits (in taka) of 30 companies for the year 2010-2011 are given below:

    20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67, 18, 16, 23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29, 58, 65.

    Classify the above data taking a suitable class interval & represent the data in a frequency distribution table.

    Highest value = 69, lowest value = 16, range = 69 - 16 = 53.

  • 53 ÷ 5 = 10.6 ≈ 11 & 53 ÷ 25 = 2.12 ≈ 2. The table below uses a class interval of 10.

    Profits (in taka)    Tally (IIII = group of five)    Frequency
    15 - 25              IIII                            5
    25 - 35              II                              2
    35 - 45              IIII II                         7
    45 - 55              IIII I                          6
    55 - 65              IIII                            5
    65 - 75              IIII                            5
    N = 30
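    The tally can be reproduced programmatically. The sketch below assumes the exclusive method with classes of width 10 starting at 15, as in the table; the variable names are illustrative.

```python
# Profits (in taka) of the 30 companies from the problem.
profits = [20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67, 18, 16,
           23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29, 58, 65]

value_range = max(profits) - min(profits)

start, width, n_classes = 15, 10, 6
frequencies = [0] * n_classes
for p in profits:
    # Exclusive method: a value equal to an upper limit goes to the next class.
    frequencies[(p - start) // width] += 1

print("Range:", value_range)
for i, f in enumerate(frequencies):
    lower = start + i * width
    print(f"{lower}-{lower + width}: {f}")
```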

  • Graphs of a frequency distribution use the following axes:

    Name of graph        X-axis               Y-axis
    Histogram            Class lower limit    Frequency
    Frequency polygon    Class mid-values     Frequency

  • Frequency Histogram: A histogram is a graphical method of presenting data in which the observations are located on the horizontal axis (usually grouped into intervals) & the frequency of those observations is depicted along the vertical axis.

    Profits (in taka)    Tally (IIII = group of five)    Frequency
    20 - 30              III                             3
    30 - 40              IIII                            5
    40 - 50              IIII I                          6
    50 - 60              IIII III                        8
    60 - 70              II                              2
    70 - 80              I                               1
    N = 25
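    As a rough stand-in for a drawn histogram, the sketch below prints one row of `#` marks per class, so the picture is turned on its side: classes run down the page and bar length shows the frequency. The function name is an illustrative choice.

```python
class_labels = ["20-30", "30-40", "40-50", "50-60", "60-70", "70-80"]
class_frequencies = [3, 5, 6, 8, 2, 1]

def text_histogram(labels, freqs):
    """Return one text bar per class; bar length equals the class frequency."""
    return [f"{label} | {'#' * f} {f}" for label, f in zip(labels, freqs)]

histogram_lines = text_histogram(class_labels, class_frequencies)
for line in histogram_lines:
    print(line)
```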

  • Figure: Histogram (X-axis: class interval, 20-30 to 70-80; Y-axis: frequency).

  • Frequency Polygon: A frequency polygon is a graphical display of a frequency table. The intervals are shown on the X-axis & the number of scores in each interval is represented by the height of a point located above the middle of the interval. The points are connected so that, together with the X-axis, they form a polygon.

    Profits (in taka)    Tally (IIII = group of five)    Frequency
    20 - 30              III                             3
    30 - 40              IIII                            5
    40 - 50              IIII I                          6
    50 - 60              IIII III                        8
    60 - 70              II                              2
    70 - 80              I                               1
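    The points of the polygon can be computed from the table as (class midpoint, frequency) pairs. Closing the figure with a zero-frequency point one class width beyond each end is a common convention and is an assumption here, not something stated in the slides.

```python
polygon_limits = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
polygon_frequencies = [3, 5, 6, 8, 2, 1]
class_width = 10

# One point per class: (midpoint of the class, class frequency).
polygon_midpoints = [(lower + upper) / 2 for lower, upper in polygon_limits]
points = list(zip(polygon_midpoints, polygon_frequencies))

# Assumed convention: anchor the polygon on the X-axis at both ends.
closed_polygon = ([(polygon_midpoints[0] - class_width, 0)]
                  + points
                  + [(polygon_midpoints[-1] + class_width, 0)])

print(closed_polygon)
```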

  • Figure: Frequency Polygon (X-axis: class mid-point; Y-axis: frequency).

  • Bar Diagram: A bar chart or bar graph is a chart with rectangular bars whose lengths are proportional to the values that they represent. The bars can be plotted vertically or horizontally. Bar charts are used for plotting discrete data, that is, data which have discrete values & are not continuous.

  • Figure: Bar Diagram (X-axis: years; Y-axis: production in tons).

    Year    Production (Ton)
    1995    50
    1996    60
    1997    71
    1998    72
    1999    74
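    A horizontal bar diagram of the year and production figures can be sketched in text. The scale of one `#` per 2 tons, rounded down, is an arbitrary illustrative choice.

```python
# Production (tons) by year, from the chart data for the bar diagram.
production = {1995: 50, 1996: 60, 1997: 71, 1998: 72, 1999: 74}

def bar_lines(data, tons_per_mark=2):
    """Return one horizontal bar per year; length is proportional to production."""
    return [f"{year} {'#' * (tons // tons_per_mark)} {tons}"
            for year, tons in data.items()]

bars = bar_lines(production)
for line in bars:
    print(line)
```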

  • Pie Chart: A pie chart is a circular chart divided into sectors. In a pie chart, the arc length of each sector (& consequently its central angle & area) is proportional to the quantity it represents. When angles are measured with 1 turn as the unit, a number of percent is identified with the same number of centiturns. Together, the sectors create a full disk. It is named for its resemblance to a pie which has been sliced. The pie chart is very widely used in the business world & the mass media. To present data such as personnel by job status in a pie chart, we have to calculate the percentage (%) of each category or the angle in degrees that it occupies. The calculation is as follows:

  • Percentage = (f ÷ N) × 100 & angle in degrees = (f ÷ N) × 360, where f = frequency of the category & N = total frequency.
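    A small worked example of the pie chart calculation. The job-status categories and counts below are hypothetical, since the original data table did not survive in this transcript.

```python
# Hypothetical personnel counts by job status (not from the slides).
personnel = {"Officers": 50, "Clerks": 100, "Workers": 250}

total = sum(personnel.values())  # N, the total frequency

# Percentage = f / N * 100 and angle = f / N * 360 for each category.
percentages = {job: f / total * 100 for job, f in personnel.items()}
angles = {job: f / total * 360 for job, f in personnel.items()}

for job in personnel:
    print(f"{job}: {percentages[job]}% -> {angles[job]} degrees")
```

    The percentages add up to 100 and the angles to 360, which is a quick check that the sectors together fill the full disk.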

  • Figure: Pie Chart.


  • THANK YOU