Introduction to research methodology according to JNTU (Hyd) MBA syllabus
RESEARCH METHODOLOGY & STATISTICAL TOOLS
MASTER OF BUSINESS ADMINISTRATION (JNTU)
A MATERIAL FOR RESEARCH METHODOLOGY AND STATISTICAL TOOLS (According to JNTU Syllabus)
Prepared by S. Venkata Siva Kumar, MBA (HR/MRKTG), MSc (Statistics).
UNIT-1 RESEARCH METHODOLOGY:
An Introduction
Meaning of Research:
Research in common parlance refers to a search for knowledge. One can also
define research as a scientific and systematic search for pertinent information on a
specific topic. In fact, research is an art of scientific investigation. The Advanced
Learner’s Dictionary of Current English lays down the meaning of research as “a
careful investigation or inquiry especially through search for new facts in any branch
of knowledge.” Redman and Mory define research as a “systematized effort to gain
new knowledge.” Some people consider research as a movement, a movement from
the known to unknown. It is actually a voyage of discovery.
Research is an academic activity and as such the term should be used in a
technical sense. According to Clifford Woody, research comprises defining and
redefining problems, formulating hypotheses or suggested solutions; collecting,
organizing and evaluating data; making deductions and reaching conclusions; and at
last carefully testing the conclusions to determine whether they fit the formulated
hypotheses. D. Slesinger and M. Stephenson, in the Encyclopaedia of Social Sciences,
define research as “the manipulation of things, concepts or symbols for the purpose
of generalizing to extend, correct or verify knowledge, whether that knowledge aids in
construction of theory or in the practice of an art.”
Objectives of Research:
The purpose of Research is to discover answers to questions through the
application of scientific procedures. The main aim of research is to find out the truth
which is hidden and which has not been discovered as yet. Though each research
study has its own specific purpose, we may think of research objectives as falling into
the following broad groupings:
1. To gain familiarity with a phenomenon or to achieve new insights into it
(studies with this object in view are termed as exploratory or formulative
research studies);
2. To portray accurately the characteristics of a particular individual, situation or
a group (studies with this object in view are known as descriptive research
studies);
3. To determine the frequency with which something occurs or with which it is
associated with something else (studies with this object in view are known as
diagnostic research studies);
4. To test a hypothesis of a causal relationship between variables (such studies
are known as hypothesis-testing research studies).
Motivation in Research:
What makes people undertake research? This is a question of fundamental
importance. The possible motives for doing research may be either one or more of the
following:
1. Desire to get a research degree along with its consequential benefits;
2. Desire to face the challenge in solving the unsolved problems, i.e., concern
over practical problems initiates research;
3. Desire to get intellectual joy of doing some creative work;
4. Desire to be of service to society;
5. Desire to get respectability.
However, this is not an exhaustive list of factors motivating people to
undertake research studies. Many more factors such as directives of
government, employment conditions, curiosity about new things, desire to
understand causal relationships, social thinking and awakening and the like
may as well motivate (or at times compel) people to perform research
operations.
Types of Research:
The basic types of research are as follows:
1. Descriptive Vs. Analytical Research : Descriptive research includes surveys
and fact-finding enquiries of different kinds. The major purpose of descriptive
research is description of the state of affairs as it exists at present. In social
science and business research we quite often use the term Ex post facto
research for descriptive research studies. The main characteristic of this
method is that the researcher has no control over the variables; he can only
report what has happened or what is happening. Most ex post facto research
projects are used for descriptive studies in which the researcher seeks to measure
such items as, for example, frequency of shopping, preferences of people, or
similar data. Ex post facto studies also include attempts by researchers to
discover causes even when they cannot control the variables. The methods of
research utilized in descriptive research are survey methods of all kinds,
including comparative and correlational methods. In analytical research, on
the other hand, the researcher has to use facts or information already available,
and analyze these to make a critical evaluation of the material.
2. Applied Vs Fundamental Research : Research can either be applied (or
action) research or fundamental (basic or pure) research. Applied research
aims at finding a solution for an immediate problem facing a society or an
industrial/business organization, whereas fundamental research is mainly
concerned with generalizations and with the formulation of a theory.
“Gathering knowledge for knowledge’s sake is termed as ‘pure’ or ‘basic’
research.” Research concerning some natural phenomenon or relating to pure
mathematics are examples of fundamental research. Similarly, research
studies, concerning human behavior carried on with a view to make
generalizations about human behavior, are also examples of fundamental
research, but research aimed at reaching certain conclusions (say, a solution) for a
concrete social or business problem is an example of applied research.
Research to identify social, economic or political trends that may affect a
particular institution or the copy research or the marketing research or
evaluation research are examples of applied research. Thus, the central aim of
applied research is to discover a solution for some pressing practical problem,
whereas basic research is directed towards finding information that has a
broad base of applications and thus, adds to the already existing organized
body of scientific knowledge.
3. Quantitative Vs Qualitative Research : Quantitative research is based on the
measurement of quantity or amount. It is applicable to phenomena that can be
expressed in terms of quantity. Qualitative research, on the other hand, is
concerned with qualitative phenomena, i.e., phenomena relating to or
involving quality or kind. For instance, when we are interested in investigating
the reasons for human behavior, we quite often talk of ‘Motivation Research’,
an important type of qualitative research. This type of research aims at
discovering the underlying motives and desires, using in-depth interviews for
the purpose. Other techniques of such research are word association tests,
sentence completion tests, story completion tests and similar other projective
techniques. Attitude or opinion research i.e., research designed to find out how
people feel or what they think about a particular subject or institution is also
qualitative research. Qualitative research is especially important in the
behavioral sciences where the aim is to discover the underlying motives of
human behavior. Through such research we can analyze the various factors
which motivate people to behave in a particular manner or which make people
like or dislike a particular thing. It may be stated that applying qualitative
research in practice is a relatively difficult job and therefore, while doing such
research, one should seek guidance from experimental psychologists.
4. Conceptual Vs Empirical Research : Conceptual research is that related to
some abstract idea(s) or theory. It is generally used by philosophers and
thinkers to develop new concepts or to reinterpret existing ones. On the other
hand, empirical research relies on experience or observation alone, often
without due regard for system and theory. It is data-based research, coming up
with conclusions which are capable of being verified by observation or
experiment. We can also call it experimental type of research. In such
research it is necessary to get at facts first hand, at their source, and actively to
go about doing certain things to stimulate the production of desired
information. In such research, the researcher must first provide himself with
a working hypothesis or guess as to the probable results. He then works to get
enough facts (data) to prove or disprove his hypothesis. He then sets up
experimental designs which he thinks will manipulate the persons or the
materials concerned so as to bring forth the desired information. Such
research is thus characterized by the experimenter’s control over the variables
under study and his deliberate manipulation of one of them to study its effects.
Empirical research is appropriate when proof is sought that certain variables
affect other variables in some way. Evidence gathered through experiments or
empirical studies is today considered to be the
most powerful support possible for a given hypothesis.
Nature and Importance of Research:
“All progress is born of inquiry. Doubt is often better than over-confidence, for it
leads to inquiry, and inquiry leads to invention” is a famous saying of Hudson Maxim, in the context
of which the significance of research can well be understood. Increased amounts of
research make progress possible. Research inculcates scientific and inductive thinking
and it promotes the development of logical habits of thinking and organization.
The role of research in several fields of applied economics, whether related to
business or to the economy as a whole, has greatly increased in modern times. The
increasingly complex nature of business and government has focused attention on the
use of research in solving operational problems. Research, as an aid to economic
policy, has gained added importance, both for government and business.
Research provides the basis for nearly all government policies in our
economic system. For instance, government budgets rest in part on an analysis of
the needs and desires of the people and on the availability of revenues to meet these
needs. The cost of needs has to be equated to probable revenues and this is a field
where research is most needed. Through research we can devise alternative policies
and can as well examine the consequences of each of these alternatives. Decision-
making may not be a part of research, but research certainly facilitates the decisions
of the policy maker. Government has also to chalk out programmes for dealing with
all facets of the country’s existence and most of these will be related directly or
indirectly to economic conditions. The plight of cultivators, the problems of big and
small business and industry, working conditions, trade union activities, the problems
of distribution, even the size and nature of defense services are matters requiring
research. Thus, research is considered necessary with regard to the allocation of
nation’s resources.
Research has its special significance in solving various operational and
planning problems of business and industry. Operations research and market research,
along with motivational research, are considered crucial and their results assist, in
more than one way, in taking business decisions. Market research is the investigation
of the structure and development of a market for the purpose of formulating efficient
policies for purchasing, production and sales. Operations research refers to the
application of mathematical, logical and analytical techniques to the solution of
business problems of cost minimization or of profit maximization or what can be
termed as optimization problems. Motivational research of determining why people
behave as they do is mainly concerned with market characteristics.
In addition to what has been stated above, the significance of research can also be
understood keeping in view the following points:
1. To those students who are to write a master’s or Ph.D. thesis, research may
mean careerism or a way to attain a high position in the social structure;
2. To professionals in research methodology, research may mean a source of
livelihood;
3. To philosophers and thinkers, research may mean the outlet for new ideas and
insights;
4. To analysts and intellectuals, research may mean the generalizations of new
theories.
Thus, research is the fountain of knowledge for the sake of knowledge and an
important source of providing guidelines for solving different business, governmental
and social problems. It is a sort of formal training which enables one to understand the
new developments in one’s field in a better way.
RESEARCH PROCESS:
The Research Process consists of a series of actions or steps necessary to
effectively carry out research and the desired sequencing of these steps. The following
order concerning various steps provides a useful procedural guideline regarding the
research process:
1. Formulating the Research problem
2. Extensive Literature survey
3. Development of working hypothesis
4. Preparing the Research design
5. Determining the Sample design
6. Collection of data
7. Execution of the project
8. Analysis of data
9. Hypothesis-testing
10. Generalizations and interpretation
11. Preparation of the report or the thesis
1) Formulating the research problem: There are two types of research problems,
viz., those which relate to states of nature and those which relate to relationships
between variables. At the very outset the researcher must single out the problem he
wants to study, i.e., he must decide the general area of interest or aspect of a subject
matter that he would like to inquire into. Initially the problem may be stated in a
broad general way and then the ambiguities, if any, relating to the problem can be
resolved. Then, the feasibility of a particular solution has to be considered before a
working formulation of the problem can be set up. The formulation of a general topic
into a specific research problem, thus, constitutes the first step in a scientific enquiry.
Essentially two steps are involved in formulating the research problem, viz.,
understanding the problem thoroughly, and rephrasing the same into meaningful
terms from an analytical point of view.
The best way of understanding the problem is to discuss it with one’s
own colleagues or with those having some expertise in the matter. In an academic
institution the researcher can seek help from a guide who is usually an
experienced person and has several research problems in mind. Often, the guide puts
forth the problem in general terms and it is up to the researcher to narrow it down and
phrase the problem in operational terms. In private business units or in governmental
organizations, the problem is usually earmarked by the administrative agencies with
which the researcher can discuss as to how the problem originally came about and
what considerations are involved in its possible solutions.
Professor W.A. Neiswanger correctly states that the statement of the
objective is of basic importance because it determines the data which are to be
collected, the characteristics of the data which are relevant, relations which are to be
explored, the choice of techniques to be used in these explorations and the form of the
final report. If there are certain pertinent terms, the same should be clearly defined
along with the task of formulating the problem. In fact, formulation of the problem
often follows a sequential pattern where a number of formulations are set up, each
formulation more specific than the preceding one, each one phrased in more analytical
terms, and each more realistic in terms of the available data and resources.
2) Extensive literature survey: Once the problem is formulated, a brief summary
of it should be written down. It is compulsory for a research worker writing a thesis
for a Ph.D. degree to write a synopsis of the topic and submit it to the necessary
Committee or the Research Board for approval. At this juncture the researcher should
undertake an extensive literature survey connected with the problem. For this purpose,
the abstracting and indexing journals and published or unpublished bibliographies are
the first place to go to. Academic journals, conference proceedings, government
reports, books etc., must be tapped depending on the nature of the problem. In this
process, it should be remembered that one source will lead to another. The earlier
studies, if any, which are similar to the study in hand, should be carefully studied. A
good library will be a great help to the researcher at this stage.
3) Development of working hypothesis: After the extensive literature survey, the
researcher should state in clear terms the working hypothesis or hypotheses. A working
hypothesis is a tentative assumption made in order to draw out and test its logical or
empirical consequences. As such the manner in which research hypotheses are
developed is particularly important since they provide the focal point for research.
They also affect the manner in which tests must be conducted in the analysis of data
and indirectly the quality of data which is required for the analysis. In most types of
research, the development of working hypothesis plays an important role. Hypothesis
should be very specific and limited to the piece of research in hand because it has to
be tested. The role of the hypothesis is to guide the researcher by delimiting the area
of research and to keep him on the right track. It sharpens his thinking and focuses
attention on the more important facets of the problem. It also indicates the type of data
required and the type of methods of data analysis to be used.
How does one go about developing working hypothesis? The answer is by
using the following approach:
a) Discussions with colleagues and experts about the problem, its origin and the
objectives in seeking a solution;
b) Examination of data and records, if available, concerning the problem for
possible trends, peculiarities and other clues;
c) Review of similar studies in the area or of the studies on similar problems; and
d) Exploratory personal investigation which involves original field interviews on
a limited scale with interested parties and individuals with a view to secure
greater insight into the practical aspects of the problem.
Thus, working hypotheses arise as a result of a priori thinking about the subject,
examination of the available data and material including related studies and the
counsel of experts and interested parties. A working hypothesis is more useful when
stated in precise and clearly defined terms. It may as well be remembered that
occasionally we may encounter a problem where we do not need a working hypothesis,
especially in the case of exploratory or formulative researches which do not aim at
testing the hypothesis. But as a general rule, specification of a working hypothesis is
another basic step of the research process in most research problems.
4) Preparing the research design: The research problem having been formulated
in clear cut terms, the researcher will be required to prepare a research design, i.e., he
will have to state the conceptual structure within which research would be conducted.
The preparation of such a design facilitates research to be as efficient as possible
yielding maximal information. In other words, the function of research design is to
provide for the collection of relevant evidence with minimal expenditure of effort,
time and money. But how all these can be achieved depends mainly on the research
purpose. Research purposes may be grouped into four categories, viz.,
a. Exploration
b. Description
c. Diagnosis
d. Experimentation
A flexible research design which provides opportunity for considering many
different aspects of a problem is considered appropriate if the purpose of the research
study is that of exploration. But when the purpose happens to be an accurate
description of a situation or of an association between variables, the suitable design
will be one that minimizes bias and maximizes the reliability of the data collected and
analyzed. There are several research designs, such as experimental and non-experimental
hypothesis testing. Experimental designs can be either informal designs
(such as before-and-after without control, after-only with control, before-and-after
with control) or formal designs (such as completely randomized design, randomized
block design, Latin square design, simple and complex factorial designs), out of which
the researcher must select one for his own project.
The preparation of the research design, appropriate for a particular research
problem, involves usually the consideration of the following:
I. The means of obtaining the information;
II. The availability and skills of the researcher and his staff (if any);
III. Explanation of the way in which selected means of obtaining information will
be organized and the reasoning leading to the selection;
IV. The time available for research; and
V. The cost factor relating to research, i.e., the finance available for the purpose.
5) Determining sample design: All the items under consideration in any field of
inquiry constitute a ‘universe’ or ‘population’. A complete enumeration of all items in
the ‘population’ is known as a census enquiry. It can be presumed that in such an
enquiry when all the items are covered no element of chance is left and highest
accuracy is obtained. But in practice this may not be true. Even the slightest element
of bias in such an enquiry will get larger and larger as the number of observations
increases. Moreover, there is no way of checking the element of bias or its extent
except through a resurvey or use of sample checks. Besides, this type of inquiry
involves a great deal of time, money and energy. Not only this, census enquiry is not
possible in practice under many circumstances. For instance, blood testing is done
only on sample basis. Hence, quite often we select only a few items from the universe
for our study purposes. The items so selected constitute what is technically called a
sample.
The researcher must decide the way of selecting a sample or what is popularly
known as the sample design. In other words a sample design is a definite plan
determined before any data are actually collected for obtaining a sample from a given
population. Thus, the plan to select 12 of a city’s 200 drugstores in a certain way
constitutes a sample design. Samples can be either probability samples or non-
probability samples. With probability samples each element has a known probability
of being included in the sample but the non-probability samples do not allow the
researcher to determine this probability. Probability samples are those based on
simple random sampling, systematic sampling, stratified sampling, cluster/area
sampling whereas non-probability samples are those based on convenience sampling,
judgment sampling and quota sampling techniques. A brief mention of the important
sample designs is as follows.
1. Deliberate sampling: Deliberate sampling is also known as purposive or non-
probability sampling. This sampling method involves purposive or deliberate
selection of particular units of the universe for constituting a sample which
represents the universe. When population elements are selected for inclusion
in the sample based on the ease of access, it can be called convenience
sampling.
2. Simple random sampling: This type of sampling is also known as chance
sampling or probability sampling where each and every item in the population
has an equal chance of inclusion in the sample and each one of the possible
samples, in case of finite universe, has the same probability of being selected.
For example, if we have to select a sample of 300 items from a universe of
15,000 items, then we can put the names or numbers of all the 15,000 items on
slips of paper and conduct a lottery.
3. Systematic sampling: In some instances the most practical way of sampling is
to select every 15th name on a list, every 10th house on one side of a street and
so on. Sampling of this type is known as systematic sampling.
4. Stratified sampling: If the population from which a sample is to be drawn
does not constitute a homogeneous group, then stratified sampling technique is
applied so as to obtain a representative sample. In this technique, the
population is stratified into a number of non-overlapping subpopulations or
strata and sample items are selected from each stratum. If the selection of items
from each stratum is based on simple random sampling the entire procedure,
first stratification and then simple random sampling, is known as stratified
random sampling.
5. Quota sampling: In stratified sampling the cost of taking random samples
from individual strata is often so high that interviewers are simply given
quotas to be filled from different strata, the actual selection of items for sample
being left to the interviewer’s judgment. This is called quota sampling.
6. Cluster sampling and Area sampling: Cluster sampling involves grouping the
population and then selecting the groups or the clusters rather than individual
elements for inclusion in the sample. Suppose some departmental store wishes
to sample its credit card holders. It has issued its cards to 15,000 customers.
The sample size is to be kept at, say, 450. For a cluster sample this list of 15,000
card holders could be formed into 100 clusters of 150 card holders each. Three
clusters might then be selected for the sample randomly.
7. Multi-stage sampling: This is a further development of the idea of cluster
sampling. This technique is meant for big enquiries extending to a considerably
large geographical area like an entire country. Under multi-stage sampling the
first stage may be to select large primary sampling units such as states, then
districts, then towns and finally certain families within towns. If the technique
of random sampling is applied at all stages, the sampling procedure is
described as multi-stage random sampling.
8. Sequential sampling: This is a somewhat complex sample design where the
ultimate size of the sample is not fixed in advance but is determined according
to mathematical decisions on the basis of information yielded as the survey
progresses. This design is usually adopted under acceptance sampling plan in
the context of statistical quality control.
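The probability designs listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed procedure: the function names, the fixed seed and the two strata "A" and "B" are assumptions made for the example, while the figures (300 items from 15,000, every 15th item, three clusters of 150 card holders) are taken from the text.

```python
import random

def simple_random_sample(population, n, rng):
    """Lottery method: n items without replacement, each with chance n/N."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Every k-th item on the list, after a random start among the first k."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_random_sample(strata, fraction, rng):
    """Simple random sample of the same fraction from each stratum."""
    sample = []
    for items in strata.values():
        size = round(len(items) * fraction)
        sample.extend(rng.sample(items, size))
    return sample

def cluster_sample(population, cluster_size, n_clusters, rng):
    """Form equal-sized clusters and keep whole clusters chosen at random."""
    clusters = [population[i:i + cluster_size]
                for i in range(0, len(population), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return [item for cluster in chosen for item in cluster]

rng = random.Random(42)            # fixed seed (assumption) for repeatability
universe = list(range(1, 15001))   # 15,000 numbered items, as in the text

srs = simple_random_sample(universe, 300, rng)
every_15th = systematic_sample(universe, 15, rng)
card_holders = cluster_sample(universe, 150, 3, rng)   # 3 of 100 clusters

# Hypothetical strata, invented only to show the mechanics
strata = {"A": universe[:10000], "B": universe[10000:]}
stratified = stratified_random_sample(strata, 0.02, rng)

print(len(srs), len(every_15th), len(card_holders), len(stratified))
# prints: 300 1000 450 300
```

Note that the cluster sample reaches the target size of 450 by taking three whole clusters, whereas the simple random and stratified samples pick individual items.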
6) Collecting the data: In dealing with any real life problem it is often found that
data at hand are inadequate, and hence, it becomes necessary to collect data that are
appropriate. There are several ways of collecting the appropriate data which differ
considerably in context of money costs, time and other resources at the disposal of the
researcher.
Primary data can be collected either through experiment or through survey. If
the researcher conducts an experiment, he observes some quantitative measurements,
or the data, with the help of which he examines the truth contained in his hypothesis.
But in the case of a survey, data can be collected by any one or more of the following
ways.
1. By observation
2. Through personal interview
3. Through telephone interviews
4. By mailing of questionnaires
5. Through schedules.
7) Execution of the project: Execution of the project is a very important step in
the research process. If the execution of the project proceeds on correct lines, the data
to be collected would be adequate and dependable. The researcher should see that the
project is executed in a systematic manner and in time. If the survey is to be
conducted by means of structured questionnaires, data can be readily machine-
processed. In such a situation, questions as well as the possible answers may be
coded. If the data are to be collected through interviewers, arrangements should be made
for proper selection and training of the interviewers. The training may be given with
the help of instruction manuals which explain clearly the job of the interviewer at
each step. Occasional field checks should be made to ensure that the interviewers are
doing their assigned job sincerely and efficiently. A careful watch should be kept for
unanticipated factors in order to keep the survey as realistic as possible. This, in
other words, means that steps should be taken to ensure that survey is under statistical
control so that the collected information is in accordance with the pre-defined
standard of accuracy. If some of the respondents do not cooperate, some suitable
methods should be designed to tackle this problem. One method of dealing with the
non-response problem is to make a list of the non-respondents and take a small sub
sample of them, and then with the help of experts vigorous efforts can be made for
securing response.
8) Analysis of data: After the data have been collected, the researcher turns to the
task of analyzing them. The analysis of data requires a number of closely related
operations such as establishment of categories, the application of these categories to
raw data through coding, tabulation and then drawing statistical inferences. The
unwieldy data should necessarily be condensed into a few manageable groups and tables
for further analysis. Thus, the researcher should classify the raw data into some purposeful
and usable categories. Coding operation is usually done at this stage through which
the categories of data are transformed into symbols that may be tabulated and counted.
Editing is the procedure that improves the quality of the data for coding. With coding
the stage is ready for tabulation. Tabulation is a part of the technical procedure
wherein the classified data are put in the form of tables. The mechanical devices can
be made use of at this juncture. A great deal of data, especially in large inquiries, is
tabulated by computers. Computers not only save time but also make it possible to
study a large number of variables affecting a problem simultaneously.
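The editing, coding and tabulation steps described above might look as follows on a computer. This is a minimal sketch: the survey question, the raw answers, the code book and the missing-value code 9 are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical raw answers to "How often do you shop here?"
raw_responses = ["Weekly", "monthly ", "weekly", "Rarely", "MONTHLY", "weekly", ""]

# Code book (assumed): each category is given a numeric symbol
CODE_BOOK = {"weekly": 1, "monthly": 2, "rarely": 3}
MISSING = 9  # assumed symbol for blank or unrecognised answers

def edit(response):
    """Editing: tidy case and stray spaces so the answer can be coded."""
    return response.strip().lower()

def code(response):
    """Coding: turn an edited answer into its symbol from the code book."""
    return CODE_BOOK.get(edit(response), MISSING)

coded = [code(r) for r in raw_responses]

# Tabulation: count how many responses fall under each symbol
table = Counter(coded)
for symbol in sorted(table):
    print(symbol, table[symbol])
```

Editing comes first here because an answer such as "monthly " with a stray space would otherwise fall into the missing category.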
9) Hypothesis-testing: After analyzing the data as stated above, the researcher is in
a position to test the hypothesis, if any, he had formulated earlier. Do the facts support
the hypothesis or do they happen to be contrary? This is the usual question which should
be answered while testing hypothesis. Various tests, such as the Chi-square test, t-test and F-test,
have been developed by statisticians for the purpose. The hypothesis may be
tested through the use of one or more of such tests, depending upon the nature and
object of research inquiry. Hypothesis-testing will result in either accepting the
hypothesis or in rejecting it. If the researcher had no hypothesis to start with,
generalizations established on the basis of data may be stated as hypothesis to be
tested by subsequent researches in times to come.
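As an illustration of one such test, the sketch below computes a Chi-square statistic by hand and compares it with the tabulated value. The brand-preference data are hypothetical; the null hypothesis assumed here is that shoppers have no preference between two brands, and 3.841 is the standard Chi-square table value for 1 degree of freedom at the 5% level of significance.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical survey: 100 shoppers state which of two brands they prefer
observed = [60, 40]
expected = [50, 50]   # frequencies expected if there is no preference

statistic = chi_square_statistic(observed, expected)

CRITICAL_5PCT_DF1 = 3.841  # table value: 1 degree of freedom, 5% level

# Reject the hypothesis when the statistic exceeds the table value
reject_null = statistic > CRITICAL_5PCT_DF1
print(statistic, reject_null)
# prints: 4.0 True
```

Here the calculated value 4.0 exceeds 3.841, so under these assumed figures the hypothesis of no preference would be rejected at the 5% level.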
10) Generalizations and interpretation: If a hypothesis is tested and upheld
several times, it may be possible for the researcher to arrive at generalization, i.e., to
build a theory. As a matter of fact, the real value of research lies in its ability to arrive
at certain generalizations. If the researcher had no hypothesis to start with, he might
seek to explain his findings on the basis of some theory. This is known as interpretation.
The process of interpretation may quite often trigger off new questions which in turn
lead to further researches.
11) Preparation of the report or the thesis: Finally, the researcher has to prepare
the report of what has been done by him. Writing of report must be done with great
care keeping in view the following:
1. The layout of report should be as follows:
(i) The preliminary pages;
(ii) The main text; and
(iii) The end matter.
In its preliminary pages the report should carry the title and date, followed by
acknowledgements and foreword. Then there should be a table of contents followed
by a list of tables and a list of graphs and charts, if any, given in the report.
The main text of the report should have the following parts:
(a) Introduction: It should contain a clear statement of the objective of the
research and explanation of the methodology adopted in accomplishing the
research. The scope of the study along with various limitations should as well
be stated in this part.
(b) Summary of findings: after introduction there would appear a statement of
findings and recommendations in non-technical language. If the findings are
extensive, they should be summarized.
(c) Main report: the main body of the report should be presented in logical
sequence and broken-down into readily identifiable sections.
(d) Conclusion: towards the end of the main text, researcher should again put
down the results of his research clearly and precisely. In fact, it is the final
summing up.
At the end of the report, appendices should be listed in respect of all
technical data. A bibliography, i.e., a list of books, journals, reports, etc.,
consulted, should also be given at the end. An index should also be given,
especially in a published research report.
2. Report should be written in a concise and objective style in simple language
avoiding vague expressions such as ‘it seems’, ‘there may be’, and the like.
3. Charts and illustrations in the main report should be used only if they present
the information more clearly and forcibly.
4. Calculated ‘confidence limits’ must be mentioned and the various constraints
experienced in conducting research operations may as well be stated.
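The hypothesis-testing step described above can be illustrated with a small chi-square test of independence. This is only a sketch: the 2 x 2 table of counts is invented for illustration, and the function name `chi_square_statistic` is hypothetical. The only outside fact used is the standard 5% critical value of chi-square for one degree of freedom, 3.841.

```python
def chi_square_statistic(observed):
    """Return the chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the row and column factors were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


# Invented data: rows = trained / untrained salesmen,
# columns = met target / did not meet target.
observed = [[30, 20],
            [20, 30]]

CRITICAL_VALUE_5_PERCENT_DF_1 = 3.841  # standard table value, 1 degree of freedom

statistic = chi_square_statistic(observed)
reject_null = statistic > CRITICAL_VALUE_5_PERCENT_DF_1

print("chi-square statistic:", round(statistic, 3))
print("reject hypothesis of independence at 5% level:", reject_null)
```

If the computed statistic exceeds the critical value, the hypothesis of independence is rejected at the 5% level; otherwise it is retained. For larger tables the critical value changes with the degrees of freedom, (rows - 1) x (columns - 1).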
COLLECTION OF DATA
Statistical investigation: An investigation (or) inquiry means a “search for
knowledge”. Statistical investigation means “search for knowledge with the help of
statistical methods”.
Stages of Investigation: A statistical investigation is a comprehensive process which passes
through the following steps:
1. Planning the inquiry
2. Collection of data
3. Editing the data
4. Presentation of data
5. Analysis of data
6. Presentation of final report
Collection of data: The first step in the conduct of a statistical investigation (or) inquiry is “collection of data”. The sources of data can be represented as follows:
Internal data: Internal data come from within the government and business organizations
which generate them, in the form of records of production, purchases, expenses etc.
External data: When data are collected from outside the organization, they are said to
come from an external source. External data can be divided into two types:
(i) Primary (ii) Secondary
(i) Primary data: It refers to the statistical material which the investigator originates
himself for the purpose of the inquiry in hand. In other words, it is data which is
collected by the investigator for the first time.
(ii) Secondary data: It refers to the statistical material which is not originated by the
investigator himself but obtained from someone else’s records. This type of data is
generally taken from newspapers, magazines, bulletins, reports etc.
Methods of collection of primary data: The following methods may be used to collect
primary data:
1. Direct personal investigation
2. Indirect personal investigation
3. Information through correspondents
4. Questionnaire method
(a) Questionnaires sent by post
(b) Questionnaires sent through investigators
(1) Direct personal investigation: According to this method, the investigator obtains
the data through personal interview or observation. He therefore contacts the source
of information directly and personally, and he will contact each and every possible
source of information.
[Chart: sources of data. Data divide into internal data and external data; external
data divide into primary data and secondary data.]
(2) Indirect personal investigation: According to this method the investigator
contacts third parties or witnesses who have access to the information directly or
indirectly and are capable of supplying the necessary information. This method is
generally adopted by government committees to get the views of people relating to the
inquiry.
(3) Information through correspondents: Under this method, the investigator does not
collect the information from the persons directly. He appoints local agents in different
parts of the area under investigation. These local agents are called “correspondents”.
The correspondents collect the information and pass it on to the investigator from
time to time.
(4) Questionnaire method: In this method, the necessary information is collected
from the respondents through a questionnaire. A questionnaire is a set of questions
relating to the inquiry. The information can be collected through questionnaires in two
ways.
(i) Questionnaires sent by post: In this case, the questionnaire is sent to a person,
who fills in the answers to the various questions asked in it.
(ii) Questionnaires sent through investigators: Under this method, investigators are
appointed who contact the persons, get replies to the questions and record them in
their own handwriting in the questionnaire form.
Sources of secondary data: Sometimes it is not possible to collect primary information
for want of resources in terms of money, time etc. In that situation secondary data are
used. This type of data is generally available in magazines, journals etc. Secondary data
can be classified into two categories:
(i) Published data
(ii) Unpublished data
Organization of data: the raw data in the form of unarranged figures are collected
through primary or secondary sources. The raw data practically gives no information
and hence there is a need for organization of data. Organization of data involves the
following three stages:
(1) Editing of data
(2) Classification of data
(3) Tabulation of data
(1) Editing of data:
Editing of data refers to detecting possible errors and irregularities committed
during the collection of data.
If the data are not edited, they may lead to wrong conclusions. Therefore
editing is essential to put the data in order.
(2) Classification of data:
The process of arranging the data in groups or classes according to their
common characteristics is technically called classification.
Classification is the grouping of related facts into classes.
Types of classification: Broadly, data can be classified on the following bases:
1. Geographical classification
2. Chronological classification
3. Conditional classification
4. Qualitative classification
5. Quantitative classification
1. Geographical classification: Here data are classified on the basis of
geographical area such as village, city, state or region.
2. Chronological classification: Here classification is done on the
basis of time, such as hourly, daily, weekly, monthly etc.
3. Conditional classification: This classification is done on the basis of
some condition such as literacy, intelligence, honesty, beauty etc.
4. Qualitative classification: Here data are classified on the basis of
some attribute (or) quality like literacy, honesty, beauty, intelligence
etc. In this case the basis of classification is either the presence or
absence of a quality.
5. Quantitative classification: When the data are classified on the basis of
characteristics which can be measured, such as age, income, marks,
height, weight or production, it is called “Quantitative classification”.
(3) Tabulation of data: After the collection and classification of data, the process of
tabulation begins. Tabulation is dependent upon classification. Tabulation is
necessary in order to make the data understandable and organized. By tabulation we
make a systematic arrangement of statistical data in rows and columns. Rows are the
horizontal arrangements of data, whereas the columns are the vertical arrangements of
data.
Tabulation tries to give the maximum information contained in the data in the
minimum possible space. It is a midway process between the collection of data and
statistical analysis.
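The link between classification and tabulation can be seen in a short sketch. The records, the class limits and the names `income_class` and `build_table` are all invented for illustration. Each record is classified geographically (by region) and quantitatively (by income class), and the resulting counts are arranged in rows and columns.

```python
from collections import Counter

# Invented raw data: (region, monthly income in rupees).
raw_data = [
    ("North", 12000), ("South", 8000), ("North", 25000), ("South", 15000),
    ("North", 9000), ("South", 31000), ("North", 14000), ("South", 7000),
]

INCOME_CLASSES = ["Below 10000", "10000-19999", "20000 and above"]


def income_class(income):
    """Quantitative classification: place an income in its class interval."""
    if income < 10000:
        return "Below 10000"
    if income < 20000:
        return "10000-19999"
    return "20000 and above"


# Classification: count the records falling in each (region, class) cell.
counts = Counter((region, income_class(income)) for region, income in raw_data)
regions = sorted({region for region, _ in raw_data})


def build_table():
    """Tabulation: one row per region, one column per income class."""
    header = ["Region"] + INCOME_CLASSES + ["Total"]
    rows = [header]
    for region in regions:
        cells = [counts[(region, cls)] for cls in INCOME_CLASSES]
        rows.append([region] + cells + [sum(cells)])
    return rows


table = build_table()
for row in table:
    print(" | ".join(f"{str(cell):<16}" for cell in row))
```

The eight raw figures say little on their own; the table shows at a glance how incomes are distributed within each region, which is the purpose of tabulation described above.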
QUESTIONNAIRE AS A TOOL OF COLLECTING DATA
This method consists in preparing a questionnaire (a list of questions relating
to the field of enquiry and providing space for the answers to be filled by the
respondents) which is mailed to the respondents with a request for quick response
within the specified time. The questionnaire is the only medium of communication
between the investigator and the respondents and as such the questionnaire should be
designed or drafted with utmost care and caution so that all the relevant and essential
information for the enquiry may be collected without any difficulty, ambiguity and
vagueness.
Drafting or Framing the Questionnaire:
Drafting of a good questionnaire is a highly specialized job and requires great
care, skill, wisdom, efficiency and experience. No hard and fast rules can be laid
down for designing or framing a questionnaire. However, in this connection, the
following general points may be borne in mind:
1. The size of the questionnaire should be as small as possible. The number of
questions should be restricted to the minimum, keeping in view the nature,
objectives and scope of the enquiry. In other words, the questionnaire should be
concise and should contain only those questions which would furnish all the
necessary information relevant for the purpose. Respondents’ time should not be
wasted by asking irrelevant and unimportant questions. A large number of
questions would involve more work for the investigator and thus result in delay
on his part in collecting and submitting the information. These may, in addition,
also unnecessarily annoy or tire the respondents. A reasonable questionnaire
should contain about 15 to 25 questions. If a still larger number of questions
is a must in any enquiry, then the questionnaire should be divided into various
sections or parts.
2. The questions should be clear, brief, unambiguous, non-offending, and
courteous in tone, corroborative in nature and to the point so that not much
scope of guessing is left on the part of the respondents.
3. The questions should be arranged in a natural logical sequence. For example, to
find if a person owns a refrigerator the logical order of questions would be: “Do
you own a refrigerator? When did you buy it? What is its make? How much did
it cost you? Is its performance satisfactory? Have you ever got it serviced?” The
logical arrangement of questions in addition to facilitating tabulation work
would leave no chance for omissions or duplication.
4. The usage of vague and ‘multiple meaning’ words should be avoided. Vague
words like good, bad, efficient, sufficient, prosperity, rarely, frequently,
reasonable, poor, and rich, etc., should not be used since these may be
interpreted differently by different persons and as such might give unreliable and
misleading information. Similarly, words with multiple meanings like
price, assets, capital, income, household, democracy, socialism, etc., should not
be used unless a clarification of these terms is given in the questionnaire.
5. Questions should be so designed that they are readily comprehensible
and easy to answer for the respondents. They should not be tedious nor should
they tax the respondents’ memory. Further, questions involving mathematical
calculations like percentages, ratios, etc., should not be asked.
6. Questions of a sensitive and personal nature should be avoided. Questions like
“How much money do you owe to private parties?” or “Do you clean your utensils
yourself?” which might hurt the sentiments, pride or prestige of an individual
should not be asked, as far as possible. It is also advisable to avoid questions on
which the respondent may be reluctant or unwilling to furnish information. For
example, the questions pertaining to income, savings, habits, addiction to social
evils, age (particularly in case of ladies), etc., should be asked very tactfully.
7. Types of Questions: Under this head, the questions in the questionnaire may be
broadly classified as follows:
a) Shut Questions: In such questions possible answers are suggested by the
framers of the questionnaire and the respondent is required to tick one of
them. Shut questions can further be sub-divided into the following forms.
(i) Simple Alternative Questions: In such questions, the
respondent has to choose between two clear cut alternatives like ‘Yes’ or
‘No’; ‘Right’ or ‘Wrong’; ‘Either’ or ‘Or’ and so on. For instance, do
you own a refrigerator? – Yes or No. Such questions are also called
dichotomous questions. This technique can be applied with elegance to
situations where two clear cut alternatives exist.
(ii) Multiple Choice Questions: Quite often, it is not possible to define a
clear cut alternative and accordingly in such a situation either the first
method (Alternative Questions) is not used or additional answers
between ‘Yes’ and ‘No’ like ‘Do not know’, ‘No opinion’, ‘Occasionally’,
‘Casually’, ‘Seldom’, etc., are added. For instance, to find whether a person smokes
or drinks, the following multiple choice answers may be used:
Do you smoke?
Yes (Regularly) [ ] No (Never) [ ]
Occasionally [ ] Seldom [ ]
Which of the following modes of cooking do you use?
Gas [ ] Coal (Coke) [ ] Wood [ ]
Power (Electricity) [ ] Stove (Kerosene) [ ]
How do you go to your place of duty?
By bus [ ] By three wheeler scooter [ ]
By your own vehicle [ ] By taxi [ ]
By your own scooter [ ] On foot [ ]
By your own car [ ] Any other [ ]
Multiple choice questions are very easy and convenient for the respondents
to answer. Such questions save time and also facilitate tabulation. This
method should be used if only a selected few alternative answers exist to a
particular question. Sometimes, a last alternative under the category
‘Others’ or ‘Any other’ may be added. However, multiple answer
questions of relatively equal importance to a given question.
b) Open Questions: Open questions are those in which no alternative
answers are suggested and the respondents are at liberty to express their
frank and independent opinions on the problem in their own words. For
instance, ‘What are the drawbacks in our examination system?’; ‘What
solution do you suggest to the housing problem in Delhi?’; ‘Which program
in the Delhi TV do you like best?’ are some of the open questions. Since the
views of the respondents in the open questions might differ widely, it is very
difficult to tabulate the diverse opinions and responses.
8) Leading questions should be avoided: For example, the question ‘Why do you use
a particular brand of blades, say, Erasmic blades?’ should preferably be framed as
two questions.
(i) Which blade do you use?
(ii) Why do you prefer it?
Gives a smooth shave [ ] Readily available in the market [ ]
Gives more shaves [ ] Any other [ ]
Price is less (cheaper) [ ]
9) Cross checks: The questionnaire should be so designed as to provide internal
checks on the accuracy of the information supplied by the respondents by including
some connected questions at least with respect to matters which are fundamental to
the enquiry. For example, in a social survey for finding the age of the mother the
question ‘What is your age?’ can be supplemented by additional questions ‘What is
your date of birth?’ or ‘What is the age of your eldest child?’ Similarly, the question
‘Age at marriage’ can be supplemented by the question ‘The age of the first child’.
10) Pre-testing the questionnaire: From a practical point of view it is desirable to try out the
questionnaire on a small scale (i.e., on a small cross-section of the population for
which the enquiry is intended) before using it for the given enquiry on a large scale.
This testing on a small scale (called pre-test) has been found to be extremely useful in
practice. The given questionnaire can be improved or modified in the light of the
drawbacks, shortcomings and problems faced by the investigator in the pre-test. Pre-
testing also helps to decide upon the effective methods of asking questions for
soliciting the requisite information.
11) A covering letter: A covering letter from the organizers of the enquiry should be
enclosed along with the questionnaire for the following purposes:
i. It should clearly explain in brief the objectives and scope of the
survey to evoke the interest of the respondents and impress upon them to
render their full co-operation by returning their schedule/questionnaire duly
filled in within the specified period.
ii. It should contain a note regarding the operational definitions to
the various terms and the concepts used in the questionnaire; units of
measurements to be used and the degree of accuracy aimed at.
iii. It should take the respondents into confidence and assure them
that the information furnished by them will be kept completely secret and
they will not be harassed in any way later.
iv. In the case of mailed questionnaire method a self-addressed
stamped envelope should be enclosed for enabling the respondents to return
the questionnaire after completing it.
v. To ensure quick and better response the respondents may be
offered awards/incentives in the form of free gifts, coupons, etc.
vi. A copy of the survey report may be promised to the interested
respondents.
12) Mode of tabulation and analysis viz., hand operated, machine tabulation or
computerization should also be kept in mind while designing the questionnaire.
13) Lastly, the questionnaire should be made attractive by proper layout and
appealing get up. We give below two specimen questionnaires for illustration.
A MODEL OF QUESTIONNAIRE IN REGARDS TO CENSUS SURVEY:
We give below the 1971 Census – Individual Slip which was used for a
general purpose survey to collect:
(i) Social and cultural data like nationality, religion, literacy, mother tongue, etc.;
(ii) Exhaustive economic data like occupation, industry, class of worker and activity,
if not working;
(iii) Demographic data like relation to the head of the house, sex, age, marital
status, birth place, births and deaths and the fertility of women, to assess in
particular the performance of the family planning programme.
1971 CENSUS – INDIVIDUAL SLIP
1. Name…………………………………………………..
2. Relationship to the head of the family………………………………………
3. Sex………………………..
4. Age…………………………………..
5. Marital status………………………..
6. For currently married women only:
a) Age at marriage……………
b) Any child born in the last one year……………..
7. Birth place:
a) Place of birth……………
b) Rural or urban…………….
c) District…………………………….
d) State/Country…………………………..
8. Last Residence:
a) Place of last residence…………………………………………
b) Rural/Urban……………………………………….
c) District………………………………….
d) State/Country………………………………………
9. Duration of present residence……………………………………..
10. Religion………………………………………….
11. Scheduled Caste/Tribe………………………………………
12. Literacy………………………………………….
13. Educational level………………………………………..
14. Mother Tongue…………………………………………..
15. Other Languages, if any……………………………………………………….
16. Main Activity:
a) Broad Category:
(i) Worker
(ii) Non – Worker
b) Place of work (Name of village/town)…………………………..
c) Name of establishment………………………
d) Name of Industry, Trade, Profession or Service…………………
e) Description of work…………………………………..
f) Class of worker………………………………..
17. Secondary work:
a) Broad Category………………………
b) Place of work…………………………….
c) Name of establishment……………………….
d) Nature of Industry, Trade, Profession or
service………………………….
e) Description of work…………………………………..
f) Class of worker……………………………………………..
SCHEDULES AS A TOOL FOR COLLECTING DATA
Before discussing this method it is desirable to make a distinction between a
questionnaire and a schedule. As already explained, a questionnaire is a list of
questions which are answered by the respondent himself in his own handwriting,
while a schedule is the device of obtaining answers to the questions in a form which is
filled by the interviewers or enumerators (the field agents who put these questions) in
a face to face situation with the respondents. The most widely used method of
collection of primary data is the ‘schedules sent through enumerators’. This is so
because this method is free from certain shortcomings inherent in the methods
discussed so far. In this method the enumerators go to the respondents personally with
the schedule (list of questions), ask them the questions therein and record their replies.
This method is generally used by big business houses, large public enterprises and
research institutions like the National Council of Applied Economic Research (NCAER),
the Federation of Indian Chambers of Commerce and Industry (FICCI) and so on, and
even by the governments – state or central – for certain projects and investigations
where a high degree of response is desired. Population census all over the world is
conducted by this technique.
Merits:
1. The enumerators can explain in detail the objectives and aims of the enquiry to
the informants and impress upon them the need and utility of furnishing the
correct information.
2. This technique is very useful in extensive enquiries and generally yields fairly
dependable and reliable results due to the fact that the information is recorded
by highly trained and educated enumerators.
3. Unlike the ‘Questionnaire method’, this technique can be used with advantage
even if the respondents are illiterate.
4. As already pointed out in the ‘direct personal investigation’, due to personal
likes and dislikes, different people react differently to different questions and
as such some people might react very sharply to certain sensitive and personal
questions. An enumerator who is present in person can handle such situations
tactfully.
Demerits:
1. It is a fairly expensive method since the team of enumerators is to be paid for
their services, and as such it can be used only by those bodies or institutions
which are financially sound.
2. It is also more time consuming as compared with the ‘Questionnaire method’.
3. The success of the method largely depends upon the efficiency and skill of the
enumerators who collect the information. The enumerators have to be trained
properly in the art of collecting correct information by their intelligence,
insight, patience and perseverance, diplomacy and courage. They should
clearly understand the aims and objectives of the enquiry and also the
implications of the various terms, definitions and concepts used in the
questionnaire.
4. Due to inherent variation in the individual personalities of the enumerators
there is bound to be variation, though not so obvious, in the information
recorded by different enumerators. An attempt should be made to minimize
this variation.
5. The success of this method also lies to a great extent on the efficiency and
wisdom with which the schedule is prepared or drafted. If the schedule is
framed haphazardly and incompetently, the enumerators will find it very
difficult to get the complete and correct desired information from the
respondents.
SAMPLE DESIGN AND SAMPLING PROCEDURES
SAMPLE DESIGN:
A sample design is a definite plan for obtaining a sample from a given
population. It refers to the technique or the procedure the researcher would adopt in
selecting items for the sample. Sample design may as well lay down the number of
items to be included in the sample i.e., the size of the sample. Sample design is
determined before data are collected. There are many sample designs from which a
researcher can choose. Some designs are relatively more precise and easier to apply
than others. Researcher must select/prepare a sample design which should be reliable
and appropriate for his research study.
STEPS IN SAMPLE DESIGN:
While developing a sample design, the researcher must pay attention to the following
points:
1. Type of universe: The first step in developing sample design is to clearly
define the set of objects, technically called the Universe, to be studied. The
universe can be finite or infinite. In a finite universe the number of items is
certain, but in case of an infinite universe the number of items is infinite i.e.,
we cannot have any idea about the total number of items. The population of a
city, the number of workers in a factory and the like are examples of finite
universes, whereas the number of stars in the sky, listeners of a specific radio
programme, throwing of a die etc., are examples of infinite universes.
2. Sampling Unit: A decision has to be taken concerning a sampling unit before
selecting sample. Sampling unit may be a geographical one such as state,
district, village, etc., or a construction unit such as house, flat, etc., or it may
be a social unit such as family, club, school, etc., or it may be an individual.
The researcher will have to decide one or more of such units that he has to
select for his study.
3. Source List: It is also known as ‘Sampling frame’ from which sample is to be
drawn. It contains the names of all items of a universe (in case of finite
universe only). If source list is not available, researcher has to prepare it. Such
a list should be comprehensive, correct, reliable and appropriate. It is
extremely important for the source list to be as representative of the population
as possible.
4. Size of sample: This refers to the number of items to be selected from the
universe to constitute a sample. This is a major problem before a researcher. The
size of sample should neither be excessively large, nor too small. It should be
optimum. An optimum sample is one which fulfills the requirements of
efficiency, representativeness, reliability and flexibility. While deciding the
size of sample, researcher must determine the desired precision as also an
acceptable confidence level for the estimate.
5. Parameters of interest: In determining the sample design, one must consider
the question of the specific population parameters which are of interest. For
instance, we may be interested in estimating the proportion of persons with
some characteristic in the population, or we may be interested in knowing
some average or the other measure concerning the population. There may also
be important sub-groups in the population about whom we would like to make
estimates. All this has a strong impact upon the sample design we would
accept.
6. Budgetary Constraint: Cost considerations, from practical point of view, have
a major impact upon decisions relating to not only the size of the sample but
also to the type of sample. This fact can even lead to the use of a non-
probability sample.
7. Sampling Procedure: Finally, the researcher must decide the type of sample
he will use i.e., he must decide about the technique to be used in selecting the
items for the sample. In fact, this technique or procedure stands for the sample
design itself. There are several sample designs out of which the researcher
must choose one for his study. Obviously, he must select that design which,
for a given sample size and for a given cost, has a small sampling error.
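The remark under ‘Size of sample’ that the researcher must fix the desired precision and an acceptable confidence level can be made concrete with the usual textbook formulas. The sketch below assumes simple random sampling from a large universe; the function names and the numbers in the example are illustrative only and are not taken from this material. For a proportion the formula is n = z²p(1 - p)/e², and for a mean it is n = (zσ/e)², where e is the desired precision (margin of error) and z is the standard normal value for the chosen confidence level.

```python
import math
from statistics import NormalDist


def z_value(confidence_level):
    """Two-sided standard normal value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


def sample_size_for_proportion(margin_of_error, confidence_level=0.95,
                               expected_proportion=0.5):
    """n = z^2 * p * (1 - p) / e^2, rounded up to a whole item."""
    z = z_value(confidence_level)
    p = expected_proportion
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


def sample_size_for_mean(std_dev, margin_of_error, confidence_level=0.95):
    """n = (z * sigma / e)^2, rounded up to a whole item."""
    z = z_value(confidence_level)
    return math.ceil((z * std_dev / margin_of_error) ** 2)


# Illustrative requirements: estimate a proportion within 5 percentage
# points, and a mean within 2 units when the standard deviation is about 15.
n_proportion = sample_size_for_proportion(margin_of_error=0.05)
n_mean = sample_size_for_mean(std_dev=15, margin_of_error=2)

print("items needed for the proportion:", n_proportion)
print("items needed for the mean:", n_mean)
```

Halving the margin of error roughly quadruples the required sample, which is why the budgetary constraint in point 6 bears so directly on the sample size. Taking p = 0.5 is the cautious choice when nothing is known about the proportion, since it gives the largest n.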
CHARACTERISTICS OF GOOD SAMPLE DESIGN:
From what has been stated above, we can list down the characteristics of a good
sample design as under:
a) Sample design must result in a truly representative sample.
b) Sample design must be such that it results in a small sampling error.
c) Sample design must be viable in the context of funds available for the research
study.
d) Sample design must be such that systematic bias can be controlled in a
better way.
e) Sample should be such that the results of the sample study can be applied, in
general, for the universe with a reasonable level of confidence.
CRITERIA OF SELECTING A SAMPLING PROCEDURE:
In this context one must remember that two costs are involved in a sampling
analysis viz., the cost of collecting the data and the cost of an incorrect inference
resulting from the data. Researcher must keep in view the two causes of incorrect
inferences viz., systematic bias and sampling error. Systematic bias results from errors
in the sampling procedures, and it cannot be reduced or eliminated by increasing the
sample size. At best the causes responsible for these errors can be detected and
corrected. Usually a systematic bias is the result of one or more of the following
factors.
1) Inappropriate frame: If the sampling frame is inappropriate i.e., a biased
representation of the universe, it will result in a systematic bias.
2) Defective measuring device: If the measuring device is constantly in error, it will result in
systematic bias. In survey work, systematic bias can result if the questionnaire or the
interviewer is biased. Similarly, if the physical measuring device is defective there will be
systematic bias in the data collected through such a measuring device.
3) Non-respondents: If we are unable to sample all the individuals initially included in the
sample, there may arise a systematic bias. The reason is that in such a situation the likelihood
of establishing contact or receiving a response from an individual is often correlated with the
measure of what is to be estimated.
4) Indeterminacy principle: Sometimes we find that individuals act differently when kept
under observation than they do when kept in non-observed situations. For instance, if
workers are aware that somebody is observing them in the course of a work study on the basis of
which the average length of time to complete a task will be determined and accordingly the
quota will be set for piece work, they generally tend to work slowly in comparison to the
speed with which they work if kept unobserved. Thus, the indeterminacy principle may also
be a cause of a systematic bias.
5) Natural bias in the reporting of data: Natural bias of respondents in the reporting of data
is often the cause of a systematic bias in many inquiries. There is usually a downward bias in
the income data collected by the government taxation department, whereas we find an
upward bias in the income data collected by some social organization. People in general
understate their incomes if asked about it for tax purposes, but they overstate the same if
asked for social status or their affluence. Generally in psychological surveys, people tend to
give what they think is the ‘correct’ answer rather than revealing their true feelings.
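The statement that systematic bias cannot be reduced by increasing the sample size, while sampling error can, may be checked with a small simulation. Everything in this sketch is an assumption made for illustration: the universe of 100,000 incomes is generated artificially, and every respondent is taken to understate income by exactly 10%, as in the taxation example above.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Artificial universe of 100,000 incomes (all figures invented).
population = [rng.gauss(50000, 10000) for _ in range(100_000)]
true_mean = statistics.mean(population)


def mean_absolute_error(sample_size, understatement, repeats=50):
    """Average distance between the survey estimate and the true mean."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)
        # Every respondent reports income reduced by the same fraction.
        reported = [income * (1 - understatement) for income in sample]
        errors.append(abs(statistics.mean(reported) - true_mean))
    return statistics.mean(errors)


results = {}
for n in (100, 10_000):
    results[n] = {
        "honest": mean_absolute_error(n, understatement=0.0),
        "understated": mean_absolute_error(n, understatement=0.10),
    }
    print(f"n = {n:>6}: error with honest reporting = {results[n]['honest']:8.1f}, "
          f"error with 10% understatement = {results[n]['understated']:8.1f}")
```

With honest reporting the error shrinks as the sample grows, which is sampling error at work. With understated incomes the error stays close to 10% of the true mean however large the sample is, so the remedy lies in finding and correcting the cause of the bias and not in enlarging the sample.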
DIFFERENT TYPES OF SAMPLE DESIGNS:
There are different types of sample designs based on two factors viz., the
representation basis and the element selection technique. On the representation basis,
the sample may be probability sampling or it may be non-probability sampling.
Probability sampling is based on the concept of random selection, whereas
non-probability sampling is ‘non-random’ sampling. On the element selection basis, the
sample may be either unrestricted or restricted. When each sample element is drawn
individually from the population at large, then the sample so drawn is known as an
‘unrestricted sample’, whereas all other forms of sampling are covered under the term
‘restricted sampling’. The following chart exhibits the sample designs as explained above.
Non-probability sampling: Non-probability sampling is that sampling procedure
which does not afford any basis for estimating the probability that each item in the
population has of being included in the sample. Non-probability sampling is also
known by different names such as deliberate sampling, purposive sampling and
judgment sampling. In this type of sampling, items for the sample are selected
deliberately by the researcher; his choice concerning the items remains supreme. In
other words, under non-probability sampling the organizers of the inquiry purposively
choose the particular units of the universe for constituting a sample on the basis that the
small mass that they so select out of a huge one will be typical or representative of the
whole. For instance, if economic conditions of people living in a state are to be
studied, a few towns and villages may be purposively selected for intensive study on
the principle that they can be representative of the entire state. Thus, the judgment of
the organizers of the study plays an important part in this sampling design.
Quota sampling: It is also an example of non-probability sampling. Under quota
sampling the interviewers are simply given quotas to be filled from the different
strata, with some restrictions on how they are to be filled. In other words, the actual
selection of the items for the sample is left to the interviewer’s discretion. This type of
sampling is very convenient and is relatively inexpensive. But the samples so selected
certainly do not possess the characteristic of random samples. Quota samples are
essentially judgment samples and inferences drawn on their basis are not amenable to
statistical treatment in a formal way.
Probability sampling: Probability sampling is also known as ‘random sampling’ or
‘chance sampling’. Under this sampling design, every item of the universe has an
equal chance of inclusion in the sample. It is, so to say, a lottery method in which
individual units are picked up from the whole group not deliberately but by some
mechanical process. Here it is blind chance alone that determines whether one item or
the other is selected. The results obtained from probability or random sampling can be
assured in terms of probability i.e., we can measure the errors of estimation or the
significance of results obtained from a random sample, and this fact brings out the
superiority of random sampling design over the deliberate sampling design. Random
sampling ensures the Law of Statistical Regularity which states that if on an average
the sample chosen is a random one, the sample will have the same composition and
characteristics as the universe. This is the reason why random sampling is considered
as the best technique of selecting a representative sample.
Random sampling from a finite population refers to that method of sample selection
which gives each possible sample combination an equal probability of being picked
up and each item in the entire population an equal chance of being included in
the sample. This applies to sampling without replacement i.e., once an item is selected
for the sample, it cannot appear in the sample again. (Sampling with replacement is used
less frequently; in that procedure the element selected for the sample is returned to the
population before the next element is selected. In such a situation the same element
could appear twice in the same sample.) In brief, the
implications of random sampling (or simple random sampling) are:
(a) It gives each element in the population an equal probability of getting into the
sample; and all choices are independent of one another.
(b) It gives each possible sample combination an equal probability of being chosen.
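The lottery method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered units, the sample size of 10 and the seed are hypothetical, and the standard library's random.sample is used because it draws without replacement.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(population, n)   # no unit can appear twice

# Hypothetical universe of 100 numbered units
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Because the draw is without replacement, no unit appears twice in the sample.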
COMPLEX RANDOM SAMPLING DESIGNS:
Probability sampling under restricted sampling techniques, as stated above,
may result in complex random sampling designs. Such designs may as well be called
‘mixed sampling designs’ for many of such designs may represent a combination of
probability and non-probability sampling procedures in selecting a sample. Some of
the popular complex random sampling designs are as follows:
(i) Systematic Sampling: In some instances, the most practical way of sampling is to
select every ith item on a list. Sampling of this type is known as systematic sampling.
An element of randomness is introduced into this kind of sampling by using random
numbers to pick up the unit with which to start. For instance, if a 4 percent sample is
desired, the first item would be selected randomly from the first twenty-five and
thereafter every 25th item would automatically be included in the sample. Thus, in
systematic sampling only the first unit is selected randomly and the remaining units of
the sample are selected at fixed intervals. Although a systematic sample is not a
random sample in the strict sense of the term, it is often considered reasonable to
treat a systematic sample as if it were a random sample.
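The 4 percent example above can be sketched as follows. The list of 1000 numbered items and the seed are hypothetical; the interval of 25 comes from the text (a 4 percent sample means one item in every 25).

```python
import random

def systematic_sample(items, interval, seed=None):
    """Pick a random start within the first interval, then every interval-th item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)    # random position among the first `interval` items
    return items[start::interval]      # remaining units follow at fixed intervals

# Hypothetical list of 1000 numbered items; interval 25 gives a 4 percent sample
items = list(range(1, 1001))
sample = systematic_sample(items, 25, seed=1)
print(len(sample), sample[:3])
```

Only the starting position is random; once it is fixed, the rest of the sample is fully determined.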
(ii) Stratified Sampling: If a population from which a sample is to be drawn does not
constitute a homogeneous group, stratified sampling technique is generally applied in
order to obtain a representative sample. Under stratified sampling the population is
divided into several sub-populations that are individually more homogeneous than the
total population (the different sub-populations are called ‘strata’) and then we select
items from each stratum to constitute a sample. Since each stratum is more
homogeneous than the total population, we are able to get more precise estimates for each
stratum and, by estimating each of the component parts more accurately, we get a
better estimate of the whole. In brief, stratified sampling results in more reliable and
detailed information.
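A minimal sketch of stratified sampling is given below. The text does not say how many items to take from each stratum, so drawing the same fraction from every stratum (proportional allocation) is an assumption made here for illustration; the strata names and sizes are also hypothetical.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        # Assumed allocation rule: same fraction per stratum, at least one unit
        k = max(1, round(len(units) * fraction))
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical strata: 600 urban and 400 rural households
strata = {
    "urban": [f"U{i}" for i in range(600)],
    "rural": [f"R{i}" for i in range(400)],
}
chosen = stratified_sample(strata, 0.05, seed=7)
print({name: len(units) for name, units in chosen.items()})
```

Each stratum is sampled separately, so every stratum is represented in the final sample.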
(iii) Cluster Sampling: If the total area of interest happens to be a big one, a
convenient way in which a sample can be taken is to divide the area into a number of
smaller non-overlapping areas and then to randomly select a number of these smaller
areas (usually called clusters), with the ultimate sample consisting of all (or samples
of) units in these small areas or clusters.
Thus in cluster sampling the total population is divided into a number of
relatively small subdivisions which are themselves clusters of still smaller units and
then some of these clusters are randomly selected for inclusion in the overall sample.
Suppose we want to estimate the proportion of machine parts in an inventory which
are defective. Also assume that there are 20000 machine parts in the inventory at a
given point of time, stored in 400 cases of 50 each. Now using a cluster sampling, we
would consider the 400 cases as clusters and randomly select ‘n’ cases and examine
all the machine parts in each randomly selected case.
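The machine-parts example can be sketched as below. The figures of 400 cases of 50 parts come from the text; the simulated inventory, its 3 percent defect rate, the choice of n = 20 cases and the seed are all assumptions made purely for illustration.

```python
import random

rng = random.Random(0)
CASES, PARTS_PER_CASE = 400, 50

# Hypothetical inventory: True marks a defective part (3% rate is an assumption)
inventory = [[rng.random() < 0.03 for _ in range(PARTS_PER_CASE)]
             for _ in range(CASES)]

def cluster_estimate(inventory, n_cases, rng):
    """Randomly select n_cases whole cases and examine every part in them."""
    chosen = rng.sample(range(len(inventory)), n_cases)
    examined = [part for case in chosen for part in inventory[case]]
    proportion_defective = sum(examined) / len(examined)
    return proportion_defective, len(examined)

estimate, examined_count = cluster_estimate(inventory, 20, rng)
print(examined_count, estimate)
```

Selecting 20 cases means 1000 parts are examined, but only 20 cases have to be opened.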
Cluster sampling, no doubt, reduces cost by concentrating surveys in selected
clusters. But certainly it is less precise than random sampling. There is also not as
much information in ‘n’ observations within a cluster as there happens to be in ‘n’
randomly drawn observations. Cluster sampling is used only because of the economic
advantage it possesses; estimates based on cluster samples are usually more reliable
per unit cost.
(iv) Area Sampling: If clusters happen to be some geographic subdivisions, in that
case cluster sampling is better known as area sampling. In other words, cluster
designs, where the primary sampling unit represents a cluster of units based on
geographic area, are distinguished as area sampling. The plus and minus points of
cluster sampling are also applicable to area sampling.
(v) Multi-stage Sampling: Multi-stage sampling is a further development of the
principle of cluster sampling. Suppose we want to investigate the working efficiency
of nationalized banks in India and we want to take a sample of a few banks for this
purpose. The first stage is to select large primary sampling units such as states in a
country. Then we may select certain districts and interview all banks in the chosen
districts. This would represent a two-stage sampling design with the ultimate
sampling units being clusters of districts.
If, instead of taking a census of all banks within the selected districts, we select
certain towns and interview all banks in the chosen towns, this would represent a
three-stage sampling design. If, instead of taking a census of all banks within the
selected towns, we randomly sample banks from each selected town, then it is a case
of using a four-stage sampling plan. If we select randomly at all stages, we will have
what is known as ‘multi-stage random sampling design’.
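The four-stage plan (states, districts, towns, banks) can be sketched with a nested sampling frame. Everything about the frame below is hypothetical: the names, the number of units at each level and the number picked at each stage are chosen only to show random selection at every stage.

```python
import random

def multistage_sample(frame, picks, seed=None):
    """frame: nested dicts ending in lists of ultimate units.
    picks: how many units to select at each stage, outermost first."""
    rng = random.Random(seed)

    def draw(node, depth):
        k = picks[depth]
        if isinstance(node, dict):
            # Select k units at this stage, then sample within each of them
            chosen = rng.sample(sorted(node), min(k, len(node)))
            return [unit for key in chosen for unit in draw(node[key], depth + 1)]
        # Last stage: sample the ultimate units themselves
        return rng.sample(node, min(k, len(node)))

    return draw(frame, 0)

# Hypothetical frame: 4 states x 3 districts x 3 towns x 5 banks
frame = {
    f"S{s}": {
        f"S{s}-D{d}": {
            f"S{s}-D{d}-T{t}": [f"S{s}-D{d}-T{t}-B{b}" for b in range(5)]
            for t in range(3)
        }
        for d in range(3)
    }
    for s in range(4)
}
banks = multistage_sample(frame, picks=[2, 2, 2, 3], seed=11)
print(len(banks))
```

With 2 states, 2 districts per state, 2 towns per district and 3 banks per town, the sample contains 24 banks.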
Ordinarily multi-stage sampling is applied in inquiries extending to a
considerably large geographical area, say, the entire country. There are two
advantages of this sampling design viz., (a) It is easier to administer than most single
stage designs mainly because of the fact that the sampling frame under multi-stage
sampling is developed in partial units. (b) A large number of units can be sampled for
a given cost under multistage because of sequential clustering, whereas this is not
possible in most of the sample designs.
(vi) Sampling with probability proportional to size: In case the cluster sampling
units do not have the same number or approximately the same number of elements, it
is considered appropriate to use a random selection process where the probability of
each cluster being included in the sample is proportional to the size of the cluster. For
this purpose, we have to list the number of the elements in each cluster irrespective of
the method of ordering the cluster. Then we must sample systematically the
appropriate number of elements from the cumulative totals.
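One common way to carry out the procedure above is to lay the clusters end to end on a cumulative scale and step through it at a fixed interval from a random start. The sketch below follows that reading; the cluster sizes, the number of clusters selected and the seed are hypothetical. A cluster larger than the interval may be picked more than once.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pps_systematic(sizes, n, seed=None):
    """Select n clusters with probability proportional to size."""
    rng = random.Random(seed)
    cumulative = list(accumulate(sizes))   # running totals of cluster sizes
    interval = cumulative[-1] / n          # sampling interval on the cumulative scale
    start = rng.random() * interval        # random start within the first interval
    chosen = []
    for k in range(n):
        point = start + k * interval
        # Index of the cluster whose cumulative range contains the point
        index = min(bisect_right(cumulative, point), len(sizes) - 1)
        chosen.append(index)
    return chosen

# Hypothetical cluster sizes (number of elements in each cluster)
sizes = [120, 300, 80, 500]
chosen = pps_systematic(sizes, 2, seed=3)
print(chosen)
```

Here the last cluster holds half of all elements, so with two selections it is always chosen, while the first three clusters compete for the other place in proportion to their sizes.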
(vii) Sequential Sampling: This is a somewhat complex sample
design. The ultimate size of the sample under this technique is not fixed in advance,
but is determined according to mathematical decision rules on the basis of
information yielded as the survey progresses. This is usually adopted in the case of an
acceptance sampling plan in the context of statistical quality control. When a particular
lot is to be accepted or rejected on the basis of a single sample, it is known as single
sampling; when the decision is to be taken on the basis of two samples, it is known as
double sampling; and in case the decision rests on the basis of more than two samples
but the number of samples is certain and decided in advance, the sampling is known as
multiple sampling. But when the number of samples is more than two and is
neither certain nor decided in advance, this type of system is often referred to as
sequential sampling.
DIAGRAMMATIC PRESENTATION OF DATA
General rules for Constructing Diagrams:
(1) Neatness: Diagrams are visual aids for presentation of statistical
data and are more appealing and fascinating to the eye and leave a lasting
impression on the mind. It is, therefore, imperative that they are made very neat,
clean and attractive by proper size and lettering; and the use of appropriate
devices like different colours, different shades (light and dark), dots, dashes,
dotted lines, broken lines, dots and dash lines, etc., for filling the in between space
of the bars, rectangles, circles, etc., and their components.
(2) Title and Footnotes: As in the case of a good statistical table, each
diagram should be given a suitable title to indicate the subject-matter and the
various facts depicted in the diagram. The title should be brief, clear and
self-explanatory. If necessary the footnotes may be given at the left hand bottom
of the diagram to explain certain points or facts, not otherwise covered in the title.
(3) Selection of Scale: One of the most important factors in the
construction of diagrams is the choice of an appropriate scale. The same set of
numerical data if plotted on different scales may give the diagrams differing
widely in size and at times might lead to wrong and misleading interpretations.
Hence, the scale should be selected with great caution.
(4) Proportion between Width and Height: A proper proportion
between the dimensions (height and width) of the diagram should be maintained,
consistent with the space available.
(5) Choice of a Diagram: A large number of diagrams are used to
present statistical data. The choice of a particular diagram to present a given set of
numerical data is not an easy one. It primarily depends on the nature of the data,
magnitude of the observations and the type of the people for whom the diagrams
are meant and requires great amount of expertise, skill, and intelligence. An
inappropriate choice of the diagram for the given set of data might give a distorted
picture of the phenomenon under study and might lead to wrong and fallacious
interpretations and conclusions.
(6) Source Note and Number: As in the case of tables, a source note,
wherever possible, should be appended at the bottom of the diagram. This is
necessary as, to the learned audience of statistics, the reliability of the information
varies from source to source. Each diagram should also be given a number for
ready reference and comparative study.
(7) Index: A brief index explaining various types of shades, colors,
lines, and designs used in the construction of the diagram should be given for
clear understanding of the diagram.
(8) Simplicity: Lastly, diagrams should be as simple as possible so
that they are easily understood even by a layman who does not have any
mathematical or statistical background. If too much information is presented in a
single complex diagram it will be difficult to grasp and might even become
confusing to the mind. Hence, it is advisable to draw several simple diagrams rather than
one or two complex diagrams.
TYPES OF DIAGRAMS:
A large variety of diagrammatic devices are used in practice to present statistical data.
However, we shall discuss here only some of the most commonly used diagrams
which may be broadly classified as follows:
(1) One-dimensional diagrams
(2) Two-dimensional diagrams
(3) Three-dimensional diagrams
(4) Pictograms
(5) Cartograms
1) One-Dimensional Diagrams: These one-dimensional diagrams are classified into
two types. They are:
a) Line Diagrams
b) Bar Diagrams
a) Line Diagram: This is the simplest of all the diagrams. It consists in drawing
vertical lines, each vertical line being equal to the frequency. The variate (x) values
are presented on a suitable scale along the X-axis and the corresponding frequencies
are presented on a suitable scale along Y-axis. Line diagrams facilitate comparisons
though they are not attractive or appealing to the eye.
b) Bar Diagram: Bar diagrams are one of the easiest and the most commonly used
devices of presenting most of the business and economic data. These are
especially satisfactory for categorical data or series. They consist of a group of
equidistant rectangles, one for each group or category of the data in which the
values or the magnitudes are represented by the length or height of the rectangles,
the width of the rectangles being arbitrary and immaterial. These diagrams are
called one-dimensional because in such diagrams only one dimension viz., height
or length of the rectangles is taken into account to present the given values. There
are various types of Bar Diagrams. They are listed as follows:
(i) Simple bar diagram
(ii) Sub-divided or component bar diagram
(iii) Percentage bar diagram
(iv) Multiple bar diagram
(v) Deviation or Bilateral bar diagram
2) Two-Dimensional Diagrams: Line or Bar diagrams discussed so far are one-
dimensional diagrams since the magnitudes of the observations are represented by
only one of the dimensions viz., height (length) of the bars while the width of the bars
is arbitrary and uniform. However, in two-dimensional diagrams, the magnitudes of
the given observations are represented by the area of the diagram. Thus, in the case of
two-dimensional bar diagrams, the length as well as width of the bars will have to be
considered. Two-dimensional diagrams are also known as “area diagrams or surface
diagrams”. Some of the commonly used two-dimensional diagrams are listed as
follows:
Rectangles
Squares
Circles
Angular or pie diagrams
3) Three-Dimensional Diagrams: Three-dimensional diagrams, also termed as
‘volume diagrams’ are those in which three dimensions, viz., length, breadth, and
height are taken into account. They are constructed so that the given magnitudes
are represented by the volumes of the corresponding diagrams. The common
forms of such diagrams are “cubes, spheres, cylinders, blocks etc”. These
diagrams are specially useful if there are very wide variations between the
smallest and the largest magnitudes to be represented. Of the various three-
dimensional diagrams, ‘cubes’ are the simplest and most commonly used devices
of diagrammatic presentation of data.
4) Pictograms: Pictograms are a technique of presenting statistical data through
appropriate pictures and are one of the very popular devices, particularly when the
statistical facts are to be presented to a layman without any mathematical
background. In this, the magnitudes of the particular phenomenon under study are
presented through appropriate pictures, the number of pictures drawn or the size of
the pictures being proportional to the values of the different magnitudes to be
presented. Pictures are more attractive and appealing to the eye and have a lasting
impression on the mind. Accordingly they are extensively used by government
and private institutions for diagrammatic presentation of the data relating to a
variety of social, business or economic phenomena primarily for display to the
general public or common masses in fairs and exhibitions.
5) Cartograms: In cartograms, statistical facts are presented through maps
accompanied by various types of diagrammatic representation. They are specially
used to depict the quantitative facts on a regional or geographical basis e.g., the
population density of different states in a country or different countries in the
world, or the distribution of the rainfall in different regions of a country can be
shown with the help of maps or cartograms. The different regions or geographical
zones are depicted on a map and the quantities or magnitudes in the regions may
be shown by dots, different shades or colors etc., or by placing bars or pictograms
in each region or by writing the magnitudes to be represented in the respective
regions. Cartograms are simple and elementary forms of visual presentation and
are easy to understand. They are generally used when the regional or geographic
comparisons are to be highlighted.
GRAPHIC REPRESENTATION OF DATA
Diagrams are primarily used for comparative studies and can’t be used to study the
relationship between the variables under study. This is done through graphs.
Diagrams furnish only approximate information and they are not of much utility to a
statistician from analysis point of view. On the other hand, graphs are more obvious,
precise and accurate than diagrams and can be effectively used for further statistical
analysis, viz., to study slopes, rates of change and for forecasting wherever possible.
Graphs are drawn on a special type of paper, known as “graph paper”.
Before discussing these graphs we shall briefly describe the technique of
constructing graphs and the general rules for drawing graphs.
TECHNIQUE OF CONSTRUCTION OF GRAPHS:
[Figure: the co-ordinate plane, with both axes marked from -5 to 5, divided into four quadrants]
QUADRANT I: X-Positive, Y-Positive (+X, +Y)
QUADRANT II: X-Negative, Y-Positive (-X, +Y)
QUADRANT III: X-Negative, Y-Negative (-X, -Y)
QUADRANT IV: X-Positive, Y-Negative (+X, -Y)
Graphs are drawn on a special type of paper known as “Graph Paper”, which
has a fine network of horizontal and vertical lines; the thick lines for each division of
a centimeter or an inch measure and thin lines for small parts of the same. In a graph
of any size, two simple lines are drawn at right angles to each other, intersecting at
point ‘O’ which is known as origin or zero of reference. The two lines are known as
co-ordinate axes. The horizontal line is called X – axis and is denoted by X’OX. The
vertical line is called the Y – axis and is usually denoted by YOY’. Thus the graph is
divided into four sections, known as four quadrants.
General Rules for Graphing: The following guidelines may be kept in mind for
drawing effective and accurate graphs.
1. Neatness
2. Title and Footnote
3. Structural Framework
4. Scale
5. False Base Line
6. Ratio or Logarithmic Scale
7. Line designs
8. Source Note and Number
9. Index
10. Simplicity
TYPES OF GRAPHS: A large number of graphs are used in practice. But they can
be broadly classified under the following two heads:
(i) Graphs of frequency distributions.
(ii) Graphs of time series.
1) Graphs of Frequency Distributions: The reasons and the guiding principles for
the graphic representation of the frequency distributions are precisely the same as
for the diagrammatic and graphic representation of other types of data. The so-
called frequency graphs are designed to reveal clearly the characteristic features
of a frequency data. Such graphs are more appealing to the eye than the tabulated
data and are readily perceptible to the mind. They facilitate comparative study of
two or more frequency distributions regarding their shape and pattern. The most
commonly used graphs for charting a frequency distribution for the general
understanding of the details of the data are:
A) Histogram
B) Frequency Polygon
C) Frequency Curve
D) “Ogive” or Cumulative Frequency Curve
The choice of a particular graph for a given frequency distribution largely depends on
the nature of the frequency distribution, viz., discrete or continuous.
A) HISTOGRAM: It is one of the most popular and commonly used devices for
charting a continuous frequency distribution. It consists in erecting a series of
adjacent vertical rectangles on the sections of the horizontal axis (X-axis), with
bases (sections) equal to the width of the corresponding class intervals and heights
so taken that the areas of the rectangles are equal to the frequencies of the
corresponding classes.
The Histogram can be constructed in two cases. They are:
Case (i): Histogram with equal classes.
Case (ii): Histogram with un-equal classes.
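For unequal classes, the requirement that the area of each rectangle equals the class frequency means the height must be the frequency divided by the class width. The sketch below works this out; the class intervals and frequencies are hypothetical.

```python
def histogram_heights(classes, frequencies):
    """Height of each rectangle so that area (height x width) equals the frequency."""
    return [f / (upper - lower) for (lower, upper), f in zip(classes, frequencies)]

# Hypothetical distribution with unequal class widths (10, 10, 20 and 40)
classes = [(0, 10), (10, 20), (20, 40), (40, 80)]
frequencies = [5, 12, 16, 8]
heights = histogram_heights(classes, frequencies)
print(heights)
```

With equal classes every width is the same, so the heights are simply proportional to the frequencies.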
B) FREQUENCY POLYGON: The frequency polygon is another device for the graphic
presentation of a frequency distribution (continuous, grouped or discrete). In the case
of a discrete frequency distribution, the frequency polygon is obtained by plotting the
frequencies on the vertical axis (Y-axis) against the corresponding values of the
variable on the horizontal axis (X-axis) and joining the points so obtained by
straight lines.
C) FREQUENCY CURVE: A frequency curve is a smooth free hand curve
drawn through the vertices of a frequency polygon. The object of smoothing of the
frequency polygon is to eliminate, as far as possible, the random or erratic
fluctuations that might be present in the data. The area enclosed by the frequency
curve is the same as that of the histogram or frequency polygon but its shape is a
smooth one, without sharp edges. The frequency curve may be regarded as the
limiting form of the frequency polygon as the number of observations (total
frequency) becomes very large and class intervals are made smaller and smaller.
Types of frequency curves:
Though different types of data may give rise to a variety of frequency curves, we
shall discuss below only some of the important curves which, in general, describe
most of the data observed in practice, viz., the data relating to natural, social,
economic and business phenomena.
i) Curves of Symmetrical Distribution
ii) Moderately Asymmetrical (skewed) frequency distribution curves
iii) Extremely asymmetrical or J – shaped curves
iv) U – curve
v) Mixed curves
D) “OGIVE” OR CUMULATIVE FREQUENCY CURVE: Ogive,
pronounced as “Ojive”, is a graphic presentation of the cumulative frequency
(C.F) distribution of a continuous variable. It consists in plotting the cumulative
frequency (along the Y – axis) against the class boundaries (along the X – axis).
Since there are two types of cumulative frequency distributions viz., “LESS
THAN C.F” and “MORE THAN C.F”, we have accordingly two types of
ogives, viz., (i) Less than ogive (ii) More than ogive.
(i) Less than Ogive: This consists in plotting the ‘less than’ cumulative
frequencies against the upper class boundaries of the respective classes. The
points so obtained are joined by a smooth free hand curve to give “Less than
Ogive”. Obviously, “less than ogive” is an increasing curve, sloping upwards from
left to right and has the shape of an elongated S.
(ii) More than Ogive: Similarly, in “more than ogive”, the “more than”
cumulative frequencies are plotted against the lower class boundaries of the
respective classes. The points so obtained are joined by a smooth ‘free hand’
curve to give “more than ogive”. “More than Ogive” is a decreasing curve and
slopes downwards from left to right and has the shape of an elongated S, upside
down.
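The points to be plotted for the two ogives can be worked out as in the sketch below. The class boundaries and frequencies are hypothetical; the code only computes the plotting points and does not draw the curves.

```python
from itertools import accumulate

def ogive_points(classes, frequencies):
    """Return the plotting points for the 'less than' and 'more than' ogives."""
    less_than = list(accumulate(frequencies))   # running total up to each class
    total = less_than[-1]
    # 'More than' C.F. for a class = total minus everything below its lower boundary
    more_than = [total - below for below in [0] + less_than[:-1]]
    less_points = [(upper, cf) for (_, upper), cf in zip(classes, less_than)]
    more_points = [(lower, cf) for (lower, _), cf in zip(classes, more_than)]
    return less_points, more_points

# Hypothetical continuous frequency distribution
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 10, 12, 6]
less_points, more_points = ogive_points(classes, frequencies)
print(less_points)   # [(10, 4), (20, 14), (30, 26), (40, 32)]
print(more_points)   # [(0, 32), (10, 28), (20, 18), (30, 6)]
```

The 'less than' values rise from left to right and the 'more than' values fall, which matches the shapes of the two ogives described above.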
2) Graphs of Time Series: The Time Series data are represented geometrically by
means of a time series graph which is also known as a “Historigram”. The various types
of Time Series graphs are:
i) Horizontal Line Graphs or Historigrams
ii) Silhouette or Net Balance Graphs
iii) Range or Variation Graphs
iv) Components or Band Graphs
TABULATION OF DATA
Meaning and Importance of Tabulation: By Tabulation we mean the systematic
presentation of the information contained in the data, in rows and columns in
accordance with some salient features or characteristics. Rows are horizontal
arrangements and columns are vertical arrangements. In the words of A.M. Tuttle:
“A Statistical table is the logical listing of related quantitative data in vertical
columns and horizontal rows of numbers with sufficient explanatory and qualifying
words, phrases and statements in the form of titles, headings and notes to make clear
the full meaning of data and their origin”.
Professor Bowley, in his manual of statistics, refers to Tabulation as “the
intermediate process between the accumulation of data in whatever form they are
obtained, and the final reasoned account of the result shown by the statistics”.
Tabulation is one of the most important and ingenious devices for presenting
the data in a condensed and readily comprehensible form and attempts to furnish the
maximum information contained in the data in the minimum possible space, without
sacrificing the quality and usefulness of the data. It is an intermediate process between
the collection of the data on one hand and statistical analysis on the other hand. In
fact, Tabulation is the final stage in collection and compilation of the data and forms
the gateway for further statistical analysis and interpretations. Tabulation makes the
data comprehensible and facilitates comparisons (by classifying data into suitable
groups), and the work of further statistical analysis, averaging, correlation, etc. It
makes the data suitable for further Diagrammatic and Graphic representation.
GENERAL RULES FOR CONSTRUCTING A TABLE
The various parts of a table vary from problem to problem depending upon the nature
of the data and the purpose of the investigation. However, the following are a must in
a good statistical table:
1. Table Number
2. Title
3. Head Notes (or) Prefatory Notes
4. Captions and Stubs
5. Body of the Table
6. Foot-Note
7. Source Note
FORMAT OF A BLANK TABLE
Table No: #          TITLE
[Head Note or Prefatory Note (if any)]
Stub Heading | Caption: Sub Heads, each with its Column Heads | Total
Stubs        | Body                                            |
Total        |                                                 |
Foot Note:
Source Note:
TYPES OF TABULATION: Tables are constructed in many ways and may be classified on the basis of:
1. Objectives and Scope of the enquiry.
(i) General Purpose or Reference Table
(ii) Special Purpose or Summary Table
2. Nature of Enquiry.
(i) Original or Primary Table
(ii) Derived or Derivative Table
3. Extent of Coverage given in the Enquiry.
(i) Simple Table
(ii) Complex Table
SPSS (STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES)
SPSS (Statistical Package for the Social Sciences) has now been in development for
more than thirty years. Originally developed as a programming language for
conducting statistical analysis, it has grown into a complex and powerful application
which now uses both a graphical and a syntactical interface and provides dozens of
functions for managing, analyzing, and presenting data. Its statistical capabilities
alone range from simple percentages to complex analyses of variance, multiple
regressions, and general linear models. You can use data ranging from simple
integers/binary variables to multiple response or logarithmic variables. SPSS also
provides extensive data management functions, along with a complex and powerful
programming language.
STATISTICS PROGRAM
SPSS (originally, Statistical Package for the Social Sciences) was released in its first
version in 1968 after being developed by Norman H. Nie and C. Hadlai Hull. Norman
Nie was then a political science postgraduate at Stanford University, and is now
Research Professor in the Department of Political Science at Stanford and Professor
Emeritus of Political Science at the University of Chicago. SPSS is among the most
widely used programs for statistical analysis in social science. It is used by market
researchers, health researchers, survey companies, government, education researchers,
marketing organizations and others. The original SPSS manual (Nie, Bent & Hull,
1970) has been described as 'Sociology's most influential book'. In addition to
statistical analysis, data management (case selection, file reshaping, creating derived
data) and data documentation (a metadata dictionary is stored in the data file) are
features of the base software.
Statistics included in the base software:
Descriptive statistics: Cross tabulation, Frequencies, Descriptive, Explore,
Descriptive Ratio Statistics
Bi-variate statistics: Means, t-test, ANOVA, Correlation (bi-variate, partial,
distances), Nonparametric tests
Prediction for numerical outcomes: Linear regression
Prediction for identifying groups: Factor analysis, cluster analysis (two-step,
K-means, hierarchical), Discriminant
The many features of SPSS are accessible via pull-down menus or can be
programmed with a proprietary 4GL command syntax language. Command syntax
programming has the benefits of reproducibility; simplifying repetitive tasks; and
handling complex data manipulations and analyses. Additionally, some complex
applications can only be programmed in syntax and are not accessible through the
menu structure. The pull-down menu interface also generates command syntax; this
can be displayed in the output, though the default settings have to be changed to make
the syntax visible to the user, or can be pasted into a syntax file using the "paste"
button present in each menu. Programs can be run interactively or unattended using
the supplied Production Job Facility. Additionally a "macro" language can be used to
write command language subroutines and a Python programmability extension can
access the information in the data dictionary and data and dynamically build
command syntax programs. The Python programmability extension, introduced in
SPSS 14, replaced the less functional SAX Basic "scripts" for most purposes,
although Sax Basic remains available. In addition, the Python extension allows SPSS
to run any of the statistics in the free software package R. From version 14 onwards
SPSS can be driven externally by a Python or a VB.NET program using supplied
"plug-ins".
SPSS places constraints on internal file structure, data types, data processing and
matching files, which together considerably simplify programming. SPSS datasets
have a 2-dimensional table structure where the rows typically represent cases (such as
individuals or households) and the columns represent measurements (such as age, sex
or household income). Only 2 data types are defined: numeric and text (or "string").
All data processing occurs sequentially case-by-case through the file. Files can be
matched one-to-one and one-to-many, but not many-to-many.
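The case-by-variable structure and the one-to-many file match described above can be sketched in Python. This is only an illustration under stated assumptions, not how SPSS is implemented: the two small files, their field names and the function match_one_to_many are hypothetical.

```python
# Hypothetical household file: one row per household (the lookup side).
households = [
    {"hh_id": 1, "income": 52000},
    {"hh_id": 2, "income": 31000},
]

# Hypothetical person file: several persons can share one household.
persons = [
    {"person_id": 10, "hh_id": 1, "age": 44},
    {"person_id": 11, "hh_id": 1, "age": 17},
    {"person_id": 12, "hh_id": 2, "age": 63},
]


def match_one_to_many(case_rows, lookup_rows, key):
    """Attach the single matching lookup row to every case."""
    lookup = {}
    for row in lookup_rows:
        if row[key] in lookup:
            # A repeated key on the lookup side would make the match
            # many-to-many, which this sketch refuses to do.
            raise ValueError(f"duplicate key {row[key]!r} in lookup file")
        lookup[row[key]] = row
    merged = []
    for case in case_rows:  # sequential, case-by-case pass through the file
        extra = lookup.get(case[key], {})
        merged.append({**extra, **case})
    return merged


merged = match_one_to_many(persons, households, "hh_id")
for row in merged:
    print(row)
```

Each person row keeps its own values and gains the income of its household; a key that occurs twice in the household file is rejected, which mirrors the restriction that many-to-many matches are not supported.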
The graphical user interface has two views which can be toggled by clicking on one
of the two tabs in the bottom left of the SPSS window. The 'Data View' shows a
spreadsheet view of the cases (rows) and variables (columns). Unlike spreadsheets,
the data cells can only contain numbers or text and formulas cannot be stored in these
cells. The 'Variable View' displays the metadata dictionary where each row represents
a variable and shows the variable name, variable label, value label(s), print width,
measurement type and a variety of other characteristics. Cells in both views can be
manually edited, defining the file structure and allowing data entry without using
command syntax. This may be sufficient for small datasets. Larger datasets such as
statistical surveys are more often created in data entry software, or entered during
computer-assisted personal interviewing, by scanning and using optical character
recognition and optical mark recognition software, or by direct capture from online
questionnaires. These datasets are then read into SPSS.
SPSS can read and write data from ASCII text files (including hierarchical files),
other statistics packages, spreadsheets and databases. SPSS can read and write to
external relational database tables via ODBC and SQL.
Statistical output is to a proprietary file format (*.spv file, supporting pivot tables) for
which, in addition to the in-package viewer, a stand-alone reader can be downloaded.
The proprietary output can be exported to text or Microsoft Word. Alternatively,
output can be captured as data (using the OMS command), as text, tab-delimited text,
PDF, XLS, HTML, XML, SPSS dataset or a variety of graphic image formats (JPEG,
PNG, BMP and EMF).
Add-on modules provide additional capabilities. The available modules are:
SPSS Programmability Extension (added in version 14). Allows Python
programming control of SPSS.
SPSS Data Validation (added in version 14). Allows programming of logical
checks and reporting of suspicious values.
SPSS Regression Models - Logistic regression, ordinal regression,
multinomial logistic regression, and mixed models.
SPSS Advanced Models - Multivariate GLM and repeated measures ANOVA
(removed from base system in version 14).
SPSS Classification Trees. Creates classification and decision trees for
identifying groups and predicting behavior.
SPSS Tables. Allows user-defined control of output for reports.
SPSS Exact Tests. Allows statistical testing on small samples.
SPSS Categories
SPSS Trends
SPSS Conjoint
SPSS Missing Value Analysis. Simple regression-based imputation.
SPSS Map
SPSS Complex Samples (added in Version 12). Adjusts for stratification and
clustering and other sample selection biases.
SPSS Server is a version of SPSS with client/server architecture. It has some features
not available in the desktop version, such as scoring functions.
REPORT PRESENTATION
Significance of Report Writing:
A research report is considered a major component of the research study, for the
research task remains incomplete till the report has been presented and/or written. As
a matter of fact, even the most brilliant hypothesis, the most well-designed and
well-conducted research study, and the most striking generalizations and findings are of
little value unless they are effectively communicated to others. The purpose of
research is not well served unless the findings are made known to others. Research
results must invariably enter the general store of knowledge. All this explains the
significance of writing the research report. Writing the report is the last step in a research
study and requires a set of skills somewhat different from those called for in respect
of the earlier stages of the research. This task should be accomplished by the
researcher with utmost care; he may seek the assistance and guidance of experts for
the purpose.
Different steps in writing a report: Research reports are the product of slow,
painstaking, accurate inductive work. The usual steps involved in writing a report are as
follows:
1) Logical analysis of the subject-matter.
2) Preparation of the final outline.
3) Preparation of the rough draft.
4) Re-writing and Polishing.
5) Preparation of the final Bibliography.
6) Writing the final draft.
LAYOUT OF THE RESEARCH REPORT:
Anybody who reads the research report must be told enough
about the study so that he can place it in its general scientific context, judge the
adequacy of its methods and thus form an opinion of how seriously the
findings are to be taken. For this purpose there is a need for a proper layout of the
report. The layout of the report means what the research report should contain. A
comprehensive layout of the research report should comprise (A) Preliminary Pages;
(B) The Main Text; (C) The End Matter. Let us deal with them separately.
(A) PRELIMINARY PAGES: In its preliminary pages the report should carry a
“title and date”, followed by acknowledgments in the form of ‘Preface’ or
‘Foreword’. Then there should be a “table of contents” followed by “list of tables”
and “illustrations” so that the decision-maker or anybody interested in reading the
report can easily locate the required information in the report.
(B) MAIN TEXT: The main text provides the complete outline of the research
report along with all details. The title of the research study is repeated at the top of the
first page of the main text, and the other details then follow on pages numbered
consecutively, beginning with the second page. Each main section of the report should
begin on a new page. The main text of the report should have the following sections:
(i) introduction; (ii) statement of findings and recommendations; (iii) the results;
(iv) the implications drawn from the results; and (v) the summary.
(i) Introduction: The purpose of introduction is to introduce
the research project to the readers. It should contain a clear statement of the
objectives of research i.e., enough background should be given to make clear to
the reader why the problem was considered worth investigating. A brief summary
of other relevant research may also be stated so that the present study can be seen
in that context. The hypothesis of study, if any, and the definitions of the major
concepts employed in the study should be explicitly stated in the introduction of
the report.
(ii) Statement of findings and recommendations: After the
introduction, the research report must contain a statement of findings and
recommendations in non-technical language so that it can be easily understood by
all concerned. If the findings happen to be extensive, at this point they should be
put in summarized form.
(iii) Results: A detailed presentation of the findings of the
study, with supporting data in the form of tables and charts together with a
validation of results, is the next step in writing the main text of the report. This
generally comprises the main body of the report, extending over several chapters.
The results section of the report should contain statistical summaries and
reductions of the data rather than the raw data. All the results should be presented
in logical sequence and split into readily identifiable sections.
(iv) Implications of the Results: Toward the end of the main
text, the researcher should again put down the results of his research clearly and
precisely. He should state the implications that flow from the results of the study,
for the general reader is interested in the implications for understanding
human behavior. Such implications may have three aspects as stated below:
a) A statement of the inferences drawn from the present study which may be
expected to apply in similar circumstances.
b) The conditions of the present study which may limit the extent of
legitimate generalizations of the inferences drawn from the study.
c) The relevant questions that still remain unanswered or new questions
raised by the study along with suggestions for the kind of research that
would provide answers for them.
(v) Summary: It has become customary to conclude the research report with a very
brief summary, restating in brief the research problem, the methodology, the major
findings and the major conclusions drawn from the research results.
(C) END MATTER: At the end of the report, appendices should be included in
respect of all technical data such as questionnaires, sample information, mathematical
derivations and the like. A bibliography of sources consulted should also be given.
An index (an alphabetical listing of names, places and topics along with the numbers of
the pages in a book or report on which they are mentioned or discussed) should
invariably be given at the end of the report. The value of the index lies in the fact that it
works as a guide to the reader for the contents of the report.
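The index described above, an alphabetical listing of terms with the page numbers on which they occur, can be sketched in Python. This is a minimal illustration under stated assumptions: the input format (a dict of page number to page text), the sample pages and the function build_index are hypothetical, and the matching is a naive case-insensitive substring search rather than a real indexing method.

```python
def build_index(pages, terms):
    """Map each term to the sorted page numbers on which it appears.

    pages: dict of page number -> page text (hypothetical input format).
    terms: the names, places or topics to be indexed.
    """
    index = {}
    for term in terms:
        hits = sorted(
            number for number, text in pages.items()
            if term.lower() in text.lower()
        )
        if hits:  # terms that never occur are left out of the index
            index[term] = hits
    # Arrange entries alphabetically, ignoring case.
    return dict(sorted(index.items(), key=lambda item: item[0].lower()))


pages = {
    1: "Sampling design and the questionnaire are described here.",
    2: "The hypothesis is tested on the sample.",
    3: "Questionnaire responses and the hypothesis are compared.",
}
index = build_index(pages, ["questionnaire", "hypothesis", "Sampling", "regression"])

for term, numbers in index.items():
    print(f"{term}, {', '.join(map(str, numbers))}")
```

In practice the list of terms is chosen by the author, and a term may need to be matched under several spellings or forms; the sketch only shows the alphabetical term-to-pages structure.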
TYPES OF REPORTS: Reports are classified into two types:
(A) Technical report
(B) Popular report
(A) Technical report: In the technical report the main emphasis is on (i) the methods
employed, (ii) assumptions made in the course of the study, (iii) the detailed
presentation of the findings including their limitations and supporting data. A general
outline of a technical report can be as follows:
1. Summary of results
2. Nature of the study
3. Methods employed
4. Data
5. Analysis of data and presentation of findings
6. Conclusions
7. Bibliography
8. Technical Appendices
9. Index
(B) Popular report: The popular report is one which gives emphasis on simplicity and
attractiveness. The simplification should be sought through clear writing,
minimization of technical, particularly mathematical, details and liberal use of charts
and diagrams. Attractive layout along with large print, many subheadings, even an
occasional cartoon now and then is another characteristic feature of the popular
report. Besides, in such a report emphasis is given on practical aspects and policy
implications. We give below a general outline of a popular report.
1. The Findings and their Implications
2. Recommendations for Action
3. Objective of the Study
4. Methods Employed
5. Results
6. Technical Appendices
MECHANICS OF WRITING A RESEARCH REPORT
There are very definite and set rules which should be followed in the actual
preparation of the research report or paper. Once the techniques are finally decided,
they should be scrupulously adhered to, and no deviation permitted. The criteria of
format should be decided as soon as the materials for the research paper have been
assembled. The following points deserve mention so far as the mechanics of writing a
report are concerned.
1. Size and Physical design: The manuscript should be written on unruled
paper 8 1/2” x 11” in size. If it is written by hand, then black or blue-black ink
should be used. A margin of at least one and one-half inches should be
allowed at the left hand and of at least half an inch at the right hand of the
paper. There should also be one-inch margins, top and bottom. The paper
should be neat and legible. If the manuscript is to be typed, then all typing
should be double-spaced on one side of the page only except for the insertion
of the long quotations.
2. Procedure: Various steps in writing the report should be strictly adhered to (all
such steps have already been explained earlier in this chapter).
Keeping in view the objective and nature of the problem, the layout of
the report should be thought of and decided and accordingly adopted (The
layout of the research report and various types of reports have been described
in this chapter earlier which should be taken as a guide for report-writing in
case of particular problem).
3. Treatment of quotations: Quotations should be placed in quotation marks
and double spaced, forming an immediate part of the text. But if a quotation is
of a considerable length (more than four or five typewritten lines) then it
should be single-spaced and indented at least half an inch to the right of the
normal text margin.
4. The footnotes: Regarding footnotes one should keep in view the following:
a) The footnotes serve two purposes viz., the identification of materials
used in quotations in the report and the notice of materials not
immediately necessary to the body of the research text but still of
supplemental value.
b) Footnotes are placed at the bottom of the page on which the reference
or quotation which they identify or supplement ends. Footnotes are
customarily separated from the textual material by a space of half an
inch and a line about one and a half inches long.
c) Footnotes should be numbered consecutively, usually beginning with 1
in each chapter separately. The number should be put slightly above
the line, say at the end of a quotation. At the foot of the page, again,
the footnote number should be indented and typed a little above the
line.
d) Footnotes are always typed in single space though they are divided
from one another by double space.
5. Documentation style: Regarding documentation, the first footnote reference
to any given work should be complete in its documentation, giving all the
essential facts about the edition used. Such documentary footnotes follow a
general sequence. The common order may be described as under:
(i) Regarding the single-volume reference:
1. Author’s name in normal order (and not beginning
with the last name as in a bibliography) followed by comma;
2. Title of work, underlined to indicate italics;
3. Place and date of publication;
4. Pagination references (The page number).
Example: John Gassner, Master of the Drama, New York: Dover
Publications, Inc., 1954, p. 315.
(ii) Regarding Multi-volume reference:
1. Author’s name in the normal order;
2. Title of work, underlined to indicate italics;
3. Place and date of publication;
4. Number of volume;
5. Pagination references (The page number).
(iii) Regarding works arranged alphabetically: For works arranged
alphabetically such as encyclopedias and dictionaries, no pagination
reference is usually needed. In such cases the order is illustrated as
under:
Example: “Salamanca,” Encyclopedia Britannica, 14th Edition.
Example: “Mary Wollstonecraft Godwin,” Dictionary of National
Biography.
But if there should be a detailed reference to a long encyclopedia article,
volume and pagination reference may be found necessary.
(iv) Regarding periodicals reference:
1. Name of the author in normal order;
2. Title of article, in quotation marks;
3. Name of periodical, underlined to indicate italics;
4. Volume number;
5. Date of issuance;
6. Pagination;
(v) Regarding anthologies and collections reference: Quotations from
anthologies and collections of literary works must be acknowledged not
only by author, but also by the name of the collector.
(vi) Regarding second-hand quotations reference: In such cases the
documentation should be handled as follows:
1. Original author and Title;
2. “Quoted or Cited in,”;
3. Second author and work;
Example: J.F. Jones, Life in Polynesia, p. 16, quoted in History of the
Pacific Ocean Area, by R.B. Abel, p. 191.
(vii) Case of Multiple authorship: If there are more than two authors or
editors, then in the documentation the name of only the first is given and
the multiple authorship is indicated by “et al.” or “and others”.
Subsequent references to the same work need not be as detailed as stated
above. If the work is cited again without any other work intervening, it
may be indicated as ibid., followed by a comma and the page number. A
single page should be referred to as p., but more than one page should be
referred to as pp. If several consecutive pages are referred to, the
practice is often to give only the first page number, for example, pp. 190ff., which
means page 190 and the following pages; for page 190 and the
following page only, ‘190f.’ is used.
6. Punctuation and Abbreviations in footnotes: The first item after the number
in the footnote is the author’s name, given in the normal signature order. This is
followed by a comma. After the comma, the title of the book is given: the article
(such as “A”, “An”, “The”, etc.) is omitted and only the first word and proper
nouns and adjectives are capitalized. The following abbreviations are commonly
used in footnotes and bibliographies:
Anon., anonymous
Ante., before
Art., article
Aug., augmented
Bk., book
Bull., bulletin
Cf., compare
Ch., chapter
Col., column
Diss., dissertation
Ed., editor, edition, edited
Ed. cit., edition cited
e.g., exempli gratia: for example
enl., enlarged
et al., and others
et seq., et sequens: and the following
ex., example
f., ff., and the following
fig(s)., figure(s)
fn., footnote
ibid., ibidem: in the same place (when two or more successive footnotes
    refer to the same work, it is not necessary to repeat the complete
    reference for the second footnote; ibid. may be used. If different
    pages are referred to, pagination must be shown)
Id., idem: the same
Ill., illus., or illust(s)., illustrated, illustration(s)
Intro., intro., introduction
l. or ll., line(s)
loc. cit., loco citato: in the place cited; used as op. cit. (when a new
    reference is made to the same pagination as cited in the previous note)
MS., MSS., manuscript or manuscripts
N.B., nota bene: note well
n.d., no date
n.p., no place
no pub., no publisher
no(s)., number(s)
o.p., out of print
op. cit., opere citato: in the work cited (if reference has been made to a
    work and a new reference is to be made, ibid. may be used; if
    intervening reference has been made to different works, op. cit. must
    be used. The name of the author must precede it)
p. or pp., page(s)
Passim: here and there
Post: after
rev., revised
tr., trans., translator, translated, translation
vid. or vide: see, refer to
viz., namely
Vol. or vol(s)., volume(s)
vs., versus: against
7. Use of Statistics, Charts and Graphs: A judicious use of statistics in research
reports is often considered a virtue, for it contributes a great deal towards the
clarification and simplification of the material and research results. One may
well remember that a good picture is often worth more than a thousand words.
Statistics are usually presented in the form of tables, charts, bars, line-graphs
and pictograms. Such presentation should be self-explanatory and complete in
itself. It should be suitable and appropriate to the problem at hand.
Finally, statistical presentation should be neat and attractive.
8. The Final Draft: Revising and re-writing the rough draft of the report should
be done with great care before writing the final draft. For the purpose, the
researcher should put to himself questions like: Are the sentences written in the
report clear? Are they grammatically correct? Do they say what is meant? Do
the various points incorporated in the report fit together logically? “Having at
least one colleague read the report just before the final revision is extremely
helpful. Sentences that seem crystal-clear to the writer may prove quite
confusing to other people; a connection that had seemed self-evident may strike
others as a non-sequitur. A friendly critic, by pointing out passages that seem
unclear or illogical, and perhaps suggesting ways of remedying the difficulties,
can be an invaluable aid in achieving the goal of adequate communication.”
9. Bibliography: Bibliography should be prepared and appended to the research
report as discussed earlier.
10. Preparation of the Index: At the end of the report, an Index should
invariably be given, the value of which lies in the fact that it acts as a good
guide to the reader. The Index may be prepared both as a “Subject Index” and as
an “Author Index”. The former gives the names of the subject-topics or concepts
along with the numbers of the pages on which they have appeared or been discussed in the
report, whereas the latter gives similar information regarding the names of
authors. The Index should always be arranged alphabetically. Some people
prefer to prepare only one index common to names of authors, subject-topics,
concepts and the like.
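The footnote conventions described under "Documentation style" above (a full first reference, ibid. for an immediate repeat, op. cit. after intervening works, and p. versus pp.) can be sketched in Python. This is an illustration under stated assumptions only: the dictionary keys, the function names page_ref and footnotes, the hyphenated page range and the second work ("A. N. Author") are hypothetical, while the Gassner entry is the example given in the text.

```python
def page_ref(pages):
    """Use 'p.' for a single page and 'pp.' for more than one."""
    if len(pages) == 1:
        return f"p. {pages[0]}"
    return f"pp. {pages[0]}-{pages[-1]}"


def footnotes(citations):
    """Build numbered footnote strings from (work, pages) pairs.

    The first reference to a work is given in full; an immediate repeat
    becomes 'Ibid.'; a later repeat, after other works have intervened,
    becomes '<author>, op. cit.'.
    """
    notes = []
    seen = set()
    previous = None
    for number, (work, pages) in enumerate(citations, start=1):
        key = (work["author"], work["title"])
        if key == previous:
            body = f"Ibid., {page_ref(pages)}."
        elif key in seen:
            body = f"{work['author']}, op. cit., {page_ref(pages)}."
        else:
            body = (f"{work['author']}, {work['title']}, {work['place']}: "
                    f"{work['publisher']}, {work['year']}, {page_ref(pages)}.")
        notes.append(f"{number}. {body}")
        seen.add(key)
        previous = key
    return notes


gassner = {"author": "John Gassner", "title": "Master of the Drama",
           "place": "New York", "publisher": "Dover Publications, Inc.",
           "year": 1954}
# Hypothetical second work, used only to show an intervening reference.
other = {"author": "A. N. Author", "title": "Sample Title",
         "place": "City", "publisher": "Sample Press", "year": 2000}

notes = footnotes([(gassner, [315]), (gassner, [320, 322]),
                   (other, [12]), (gassner, [40])])
for note in notes:
    print(note)
```

The sketch leaves out loc. cit., multi-volume works and periodicals; those follow the same pattern with additional fields.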