8/2/2019 RM_5___6_Group_no_10_
RESEARCH METHODOLOGY : MFM SEM II GROUP 10
Group 10
Roll No. Name
8 Sarvesh Desai
17 Pooja Gupta
24 Nilesh Jadhav
41 Rupesh Phalke
55 Venugopalan Swaminathan
RM Assignment: RM 5
Q1 Differentiate between the following:
1. Parameter and statistic.
2. Level of significance and level of confidence.
3. Null and alternate hypothesis.
4. Type-I and type-II error.
5. One-tailed and two-tailed test of hypothesis.
6. Testing of hypothesis and estimation.
7. Point estimate and interval estimate.
8. Parametric and non-parametric test of hypothesis.
9. Z-test and t-test of hypothesis.
10. Test of goodness of fit and test of independence, under chi-square test.
11. 1-way ANOVA and 2-way ANOVA.
12. Test of confirmation and test of comparison.
Solution:
Q1.1 Parameter vs. Statistic
1. Parameter: describes a full population.
   Statistic: describes a sample.
2. Parameter: a property of the underlying population distribution.
   Statistic: a function of a sample or its observations.
3. Parameter: the population mean is a parameter; as the sample becomes large, the sample mean approaches it.
   Statistic: the sample mean is a statistic.
Q1.2 Level of Significance vs. Level of Confidence
1. Level of significance: indicates the likelihood that the answer will fall outside the stated range.
   Level of confidence: the expected percentage of times that the actual value will fall within the stated precision limits.
2. Level of significance: a 1% significance level corresponds to a 99% confidence level.
   Level of confidence: a 95% confidence level means 95 chances in 100 that the sample represents the true condition.
Q1.3 Null Hypothesis vs. Alternate Hypothesis
1. Null hypothesis (H0): the finding occurred by chance.
   Alternate hypothesis (H1): the finding did not occur by chance.
2. Null hypothesis: assumed to be true unless we find evidence to the contrary.
   Alternate hypothesis: if the evidence is too unlikely given the null hypothesis, we assume the alternative hypothesis is more likely to be correct.
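Q1.4 Type I Error vs. Type II Error
1. Type I: rejecting a hypothesis which should have been accepted.
   Type II: accepting a hypothesis which should have been rejected.
2. Type I: its probability is denoted by alpha.
   Type II: its probability is denoted by beta.
3. Type I: can be controlled by fixing alpha at a low level.
   Type II: depends on the Type I error; for a fixed sample size, lowering alpha raises beta.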
Q1.5 One-tailed vs. Two-tailed Test of Hypothesis
1. One-tailed: the rejection area lies on one side of the sampling distribution only.
   Two-tailed: the rejection area is split across both sides of the sampling distribution.
Q1.6 Testing of Hypothesis vs. Estimation
Testing of hypothesis: carried out to test an assumed value or criterion.
Estimation: population parameters are unknown, so they have to be estimated from the sample.
Q1.7 Point Estimate vs. Interval Estimate
The estimate of a population parameter may be one single value or it could be a range.
Point estimate: as the name suggests, the estimation of the population parameter with one number.
Interval estimate: a point estimate of the parameter is not sufficient on its own. It is necessary to analyse how confident we can be about this particular estimate. One way of doing so is to define confidence intervals. If we have estimated q, we want to know whether the true parameter is close to our estimate. In other words, we want to find an interval [q_L, q_U] that satisfies the following relation:
$$P(q_L \le q \le q_U) = 1 - \alpha$$
where 1 - \alpha is the chosen level of confidence.
Q1.8 Parametric vs. Non-parametric Test of Hypothesis
1. Parametric: the observations must be independent.
   Non-parametric: the observations are independent.
2. Parametric: the observations must be drawn from normally distributed populations.
   Non-parametric: the variable under study has underlying continuity.
3. Parametric: these populations must have the same variances.
4. Parametric: the means of these normal and homoscedastic populations must be linear combinations of effects due to columns and/or rows.
Q1.9 Z-test vs. t-test
1. Z-test: a statistical hypothesis test whose test statistic follows a normal distribution.
   t-test: the test statistic follows Student's t-distribution.
2. Z-test: appropriate when handling moderate to large samples (n > 30).
   t-test: appropriate when handling small samples (n < 30).
3. Z-test: often requires certain conditions to be reliable, such as a known population standard deviation.
   t-test: more adaptable than the Z-test.
4. Z-test: less commonly used than the t-test.
   t-test: more commonly used than the Z-test.
Q1.10 Test of Goodness of Fit vs. Test of Independence (under chi-square)
1. Goodness of fit: a one-variable chi-square test.
   Independence: a two-variable chi-square test.
2. Goodness of fit: the goal is to determine whether a set of frequencies or proportions is similar to, and therefore fits with, a hypothesized set of frequencies or proportions.
   Independence: the goal is to determine whether the first variable is related to, or independent of, the second variable.
3. Goodness of fit: comparable to a one-sample t-test.
   Independence: similar to the test for an interaction effect in ANOVA.
4. Goodness of fit: determines whether a sample is similar to, and representative of, a population.
   Independence: asks whether the outcome in one variable is related to the outcome in some other variable.
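The two chi-square tests compared above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original assignment: the function names and the example counts are assumptions made up for the sketch, and it only computes the test statistic (comparison against a critical value from chi-square tables is left out).

```python
def chi_square_goodness_of_fit(observed, expected):
    """One-variable test: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(table):
    """Two-variable test on a contingency table (list of rows).

    Expected count for each cell = row total * column total / grand total.
    Returns the statistic and the degrees of freedom (r - 1) * (c - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical data: 60 die rolls tested against a fair-die expectation of 10 each.
gof_stat = chi_square_goodness_of_fit([8, 12, 9, 11, 10, 10], [10] * 6)

# Hypothetical 2 x 2 table: two attributes cross-classified.
ind_stat, ind_dof = chi_square_independence([[20, 30], [30, 20]])

print(gof_stat, ind_stat, ind_dof)
```

A statistic of zero would mean the observed counts match the expected counts exactly; larger values indicate a poorer fit or a stronger association.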
Q1.11 1-way ANOVA vs. 2-way ANOVA
1. 1-way: the purpose is to verify whether the data collected from different sources converge on a common mean.
   2-way: the purpose is to verify whether the data collected from different sources converge on a common mean based on two categories of defining characteristics.
2. 1-way: used to find out whether the groups carried out the same procedures in conducting research.
   2-way: used in the comparison of treatment means. This involves the introduction of a randomized block design. The experiment conducted in the case of two-way ANOVA normally gets split into many mini experiments. In short, two-way ANOVA is employed for a design with two or more treatment means, which can be called factorial designs.
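As a rough illustration of how one-way ANOVA compares group means, the sketch below computes the F ratio (between-group mean square divided by within-group mean square). The function name and the sample values are assumptions for illustration only; the groups need not be of equal size.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return F = MS_between / MS_within for a list of groups of numbers."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (columns)
    n = len(all_values)    # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical data for three groups.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_original = one_way_anova_f(samples)

# Adding the same constant to every value leaves F unchanged,
# because all deviations from the means stay the same.
shifted = [[x + 5 for x in g] for g in samples]
f_shifted = one_way_anova_f(shifted)

print(f_original, f_shifted)
```

An F value well above 1 suggests that the grouping factor explains more variation than random error does; an F value near or below 1 suggests it does not.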
Q1.12 Test of Confirmation vs. Test of Comparison
Q2 State whether the following statements are true or false, giving reasons:
1) Level of significance is type-I error.
2) In 1-way ANOVA, we need all samples to be of equal size.
3) Point estimate is often insufficient because it is either right or wrong.
4) In Z distribution, area contained between +/- 3 * standard deviation is equal to 100%.
5) In fixing critical value of t, we need to specify level of significance or degrees of freedom or one/two tailed.
6) All tests of hypothesis are repetitive and hence universal.
7) If the test fails to support null hypothesis, it also indicates why test fails.
8) (1 - beta error) is called power of test.
9) 1% level of significance gives greater confidence to decision maker than 5% level of significance.
10) In 1-way ANOVA, if F calculated is lesser than 1, it means the factor which differentiates columns is the strong reason explaining variation in data.
11) If all data values are increased by 5, ANOVA inference drawn earlier will change.
12) Client is supposed to give beta error to researcher in advance.
13) In chi-square test, we want to confirm whether chi-square value is zero or not.
14) Level of significance is rejection area under the sampling distribution beyond critical value of test statistic.
15) Good hypothesis can result into type-II error only.
16) Alternate hypothesis can decide whether test is one tailed or two tailed in case of large sample Z test.
17) Randomised block experimental design results into one-way ANOVA.
18) Difference between sample statistic and population parameter is always significant.
19) We use chi-square test of goodness of fit on nominal data 2-way classified.
20) Latin square experimental design will lead to 3-way ANOVA.
Solution:
Q2.1 Level of significance is type-I error.
Answer: TRUE. Reason: The level of significance is the maximum probability of rejecting the hypothesis though it is true, which is a Type I error.
Q2.2 In 1-way ANOVA, we need all samples to be of equal size.
Answer: FALSE. Reason: Not necessary. 1-way ANOVA can be carried out with unequal sample sizes as well.
Q2.3 Point estimate is often insufficient because it is either right or wrong.
Answer: TRUE. Reason: A point estimate gives one value, which can be right or wrong, whereas an interval estimate gives a range against which the answer can be checked.
Q2.4 In Z distribution, area contained between +/- 3 * standard deviation is equal to 100%.
Answer: FALSE. Reason: In the Z distribution, the area contained between +/- 3 SD is about 99.73%.
Q2.5 In fixing critical value of t, we need to specify level of significance or degrees of freedom or one/two tailed.
Answer: TRUE. Reason: To fix the critical value of t, we need to specify the LOS, the DOF, and whether the test is one- or two-tailed.
Q2.6 All tests of hypothesis are repetitive and hence universal.
Answer: TRUE. Reason: When the sample changes, we need to repeat the test of hypothesis.
Q2.7 If the test fails to support null hypothesis, it also indicates why test fails.
Answer: FALSE. Reason: No. It does not tell why the test fails.
Q2.8 (1 - beta error) is called power of test.
Answer: TRUE. Reason: Beta is the probability of a Type II error, in which a false H0 is accepted. 1 - beta is therefore the probability of correctly rejecting a false H0, which is the power of the test.
Q2.9 1% level of significance gives greater confidence to decision maker than 5% level of significance.
Answer: TRUE. Reason: A 1% LOS corresponds to a 99% confidence level, which is greater than the 95% confidence level given by a 5% LOS.
Q2.10 In 1-way ANOVA, if F calculated is lesser than 1, it means the factor which differentiates columns is the strong reason explaining variation in data.
Answer: FALSE. Reason: F is the ratio of between-column variance to within-column variance. An F below 1 means the column factor explains less variation than random error does, so it is not a strong reason for the variation in the data.
Q2.11 If all data values are increased by 5, ANOVA inference drawn earlier will change.
Answer: FALSE.
Q2.12 Client is supposed to give beta error to researcher in advance.
Answer: TRUE. Reason: The researcher should know the success rate the client expects.
Q2.13 In chi-square test, we want to confirm whether chi-square value is zero or not.
Answer: TRUE.
Q2.14 Level of significance is rejection area under the sampling distribution beyond critical value of test statistic.
Answer: TRUE. Reason: The LOS is the area of the rejection region, i.e. the probability that the test statistic falls beyond the critical value when H0 is true.
Q2.15 Good hypothesis can result into type-II error only.
Answer: TRUE. Reason: Here a false H0 is accepted, indicating that failures are accepted, hence a good hypothesis.
Q2.16 Alternate hypothesis can decide whether test is one tailed or two tailed in case of large sample Z test.
Answer: TRUE. Reason: Alternate hypothesis tells the
Q2.17 Randomised block experimental design results into one-way ANOVA.
Answer: FALSE. Reason: It is the completely randomized (CR) design that results in one-way ANOVA.
Q2.18 Difference between sample statistic and population parameter is always significant.
Answer: FALSE. Reason: Suppose the population has a seasonality factor. If the sampling is not done in the proper way, the sample statistic and the population parameter can be different.
Q2.19 We use chi-square test of goodness of fit on nominal data 2-way classified.
Answer: FALSE. Reason: The goodness-of-fit test is a one-variable chi-square test; nominal data classified two ways call for the chi-square test of independence.
Q2.20 Latin square experimental design will lead to 3-way ANOVA.
Answer: TRUE.
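Two of the numerical claims in Q2 (the area within +/- 3 standard deviations, and the comparison of the 1% and 5% levels of significance) can be checked with Python's standard library. This is a small sketch; the helper names are assumptions chosen for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def two_tailed_critical_z(level_of_significance):
    """Critical |z| when the rejection area is split equally between both tails."""
    return standard_normal.inv_cdf(1.0 - level_of_significance / 2.0)


print(round(area_within(3), 4))               # about 0.9973, i.e. not 100%
print(round(two_tailed_critical_z(0.05), 2))  # about 1.96
print(round(two_tailed_critical_z(0.01), 2))  # about 2.58
```

The 1% level needs a larger critical value than the 5% level, so the rejection area is smaller and a rejection of H0 carries more confidence.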
Q3 State whether the following statements are true or false, giving reasons:
1. Partial correlation analysis is same as multiple correlation analysis.
2. If byx = 0.8, bxy = -0.2, hence r = -0.4.
3. If byx = 0.8, bxy = 1.6, hence r = 1.13.
4. byx and bxy must be less than 1, always.
5. y = a + bx: this equation can be used to estimate value of x for a given value of y always.
6. If two regression lines are perpendicular to each other, correlation coefficient is 1.
7. If r = 0.7, amount of variation in y because of x is 70%.
8. Coefficient of determination can be negative sometimes.
9. If one variable is constant, correlation between x and y is positive perfect.
10. If coefficient of determination is less, stronger will be relationship between x and y.
11. Coefficient of indetermination and standard error of estimate are same in concepts.
12. Variance and co-variance mean the same thing.
13. If correlation coefficient between x and y is 0.90, this definitely proves that relationship is always causal.
14. If two regression lines coincide, coefficient of correlation is always +1.
15. Intersection of two regression lines is the mean of each variable.
Solution:
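Q3.1 Partial correlation analysis is same as multiple correlation analysis.
Answer: FALSE. Reason: Partial correlation measures the relationship between one independent variable and the dependent variable while the other variables are held constant, whereas multiple correlation takes into account two or more independent variables and one dependent variable together.
Q3.2 If byx = 0.8, bxy = -0.2, hence r = -0.4.
Answer: FALSE. Reason: r is the square root of (byx * bxy) and carries the common sign of the two regression coefficients. byx and bxy must therefore have the same sign, so 0.8 and -0.2 cannot occur together.
Q3.3 If byx = 0.8, bxy = 1.6, hence r = 1.13.
Answer: FALSE. Reason: The square root of (0.8 * 1.6) = 1.28 is about 1.13, but r cannot exceed 1, so these two coefficients cannot occur together.
Q3.4 byx and bxy must be less than 1, always.
Answer: FALSE. Reason: One of the coefficients may exceed 1, provided their product does not exceed 1.
Q3.5 y = a + bx: this equation can be used to estimate value of x for a given value of y always.
Answer: FALSE. Reason: y = a + bx is the regression of y on x and estimates y for a given x. To estimate x for a given y, the regression of x on y should be used.
Q3.6 If two regression lines are perpendicular to each other, correlation coefficient is 1.
Answer: FALSE. Reason: The two regression lines are perpendicular when r = 0; they coincide when r = +/- 1.
Q3.7 If r = 0.7, amount of variation in y because of x is 70%.
Answer: FALSE. Reason: The explained variation is r^2 = 0.49, i.e. 49%.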
Q3.8 Coefficient of determination can be negative sometimes.
Answer: TRUE. Reason: Negative values of R^2 may occur when fitting non-linear trends to data.
Q3.9 If one variable is constant, correlation between x and y is positive perfect.
Answer: FALSE.
Q3.10 If coefficient of determination is less, stronger will be relationship between x and y.
Answer: FALSE.
Q3.11 Coefficient of indetermination and standard error of estimate are same in concepts.
Answer: FALSE.
Q3.12 Variance and co-variance mean the same thing.
Answer: FALSE.
Q3.13 If correlation coefficient between x and y is 0.90, this definitely proves that relationship is always causal.
Answer: FALSE.
Q3.14 If two regression lines coincide, coefficient of correlation is always +1.
Answer: FALSE. Reason: When r = +/- 1, there is an exact linear relationship between X and Y and the two regression lines coincide with each other, so r may be -1 as well as +1.
Q3.15 Intersection of two regression lines is the mean of each variable.
Answer: TRUE. Reason: The two regression lines always intersect each other at the point (mean of X, mean of Y).
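The relationship between the correlation coefficient and the two regression coefficients used in Q3.2 to Q3.4 can be sketched as follows. The function name is an assumption for illustration; the rule it encodes is that r is the square root of byx * bxy, takes the common sign of the two coefficients, and cannot exceed 1 in magnitude.

```python
import math


def r_from_regression_coefficients(byx, bxy):
    """Return r = +/- sqrt(byx * bxy), or raise if the pair is impossible."""
    product = byx * bxy
    if product < 0:
        raise ValueError("byx and bxy must have the same sign")
    if product > 1:
        raise ValueError("byx * bxy cannot exceed 1, since |r| <= 1")
    sign = -1.0 if byx < 0 else 1.0
    return sign * math.sqrt(product)


# A valid pair of coefficients.
print(round(r_from_regression_coefficients(0.8, 0.2), 2))    # 0.4
print(round(r_from_regression_coefficients(-0.8, -0.2), 2))  # -0.4

# The pairs (0.8, -0.2) and (0.8, 1.6) would raise ValueError.
```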
Q4 Explain the importance of the following in statistical analysis. (Under what circumstances will
you recommend the following in analyzing data collected?)
1. Mode as measure of central tendency.
2. Coefficient of variation.
3. Interquartile range.
4. Measures of skewness and kurtosis.
5. Syx: standard error of estimate of y because of x.
6. Coefficient of determination (r^2).
7. Co-variance in bivariate analysis.
8. Interval estimate.
9. Classification, tabulation, presentation of data.
10. Frequency curve and histogram.
11. Correlation and regression analysis.
12. Yule's coefficient of association.
Solution:
1) Mode as measure of central tendency.
The mode is the most frequently occurring value in the data set. The mode in a distribution
is that item around which there is maximum concentration. In general, the mode is the size of
the item which has the maximum frequency.
For example, in the data set {1,2,3,4,4}, the mode is equal to 4. A data set can have more
than a single mode, in which case it is multimodal. In the data set {1,1,2,3,3} there are two
modes: 1 and 3.
The mode can be very useful for dealing with categorical data. For example, if a sandwich
shop sells 10 different types of sandwiches, the mode would represent the most popular
sandwich. The mode also can be used with ordinal, interval, and ratio data. However, in
interval and ratio scales, the data may be spread thinly with no data points having the same
value. In such cases, the mode may not exist or may not be very meaningful.
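The examples above can be reproduced with Python's standard library. This is a brief sketch; the sandwich names are hypothetical values added for illustration.

```python
from statistics import multimode

# multimode returns every value that shares the highest frequency.
print(multimode([1, 2, 3, 4, 4]))  # [4]
print(multimode([1, 1, 2, 3, 3]))  # [1, 3]

# Categorical data: the mode is the most popular category.
orders = ["veg", "egg", "veg", "cheese", "veg", "egg"]  # hypothetical orders
most_popular = multimode(orders)
print(most_popular)  # ['veg']

# When no value repeats, every value is returned, so the mode is not meaningful.
print(multimode([1.2, 3.4, 5.6]))
```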
2) Coefficient of variation
The coefficient of variation measures variability in relation to the mean (or average) and is
used to compare the relative dispersion in one type of data with the relative dispersion in
another type of data. The data to be compared may be in the same units, in different units,
with the same mean, or with different means.
Suppose you want to evaluate the relative dispersion of grades for two classes of students:
Class A and Class B. The coefficient of variation can be used to compare these two groups
and determine how the grade dispersion in Class A compares to the grade dispersion in
Class B. This is one example of how the coefficient of variation can be applied.
The coefficient of variation is a calculation built on other calculations, the standard
deviation and the mean, as follows:
$$CV = \frac{s}{\bar{x}} \times 100$$
This reads as "the coefficient of variation is equal to the standard deviation (s) divided by the
mean (\bar{x}), multiplied by 100 (to produce a percentage)".
The steps required for calculating the coefficient of variation are:
1. Calculate the mean for the data set.
2. Calculate the standard deviation.
3. Divide the standard deviation by the mean.
4. Multiply the result of step 3 by 100.
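The four steps above can be sketched in Python as follows. The grade lists for Class A and Class B are hypothetical, and the sketch assumes the sample standard deviation (divisor n - 1); the population form would use `pstdev` instead.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    average = mean(values)        # step 1
    spread = stdev(values)        # step 2 (sample standard deviation, an assumption)
    ratio = spread / average      # step 3
    return ratio * 100            # step 4


class_a = [70, 75, 80, 85, 90]    # hypothetical grades
class_b = [50, 65, 80, 95, 110]   # hypothetical grades, same mean, wider spread

cv_a = coefficient_of_variation(class_a)
cv_b = coefficient_of_variation(class_b)
print(round(cv_a, 2), round(cv_b, 2))
```

Both classes have the same mean here, so the larger coefficient of variation for Class B reflects only its greater relative dispersion.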
3) Interquartile range.
The interquartile range (IQR) is the distance between the 75th percentile and the 25th
percentile. The IQR is essentially the range of the middle 50% of the data. Because it uses
the middle 50%, the IQR is not affected by outliers or extreme values.
The IQR is also equal to the length of the box in a box plot.
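A short sketch of the IQR and its resistance to outliers, using the standard library. The data values are hypothetical, and the "inclusive" quartile method is an assumption; other quartile conventions give slightly different values.

```python
from statistics import quantiles


def interquartile_range(values):
    """Distance between the 75th and 25th percentiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]  # last value replaced by an extreme one

iqr_clean = interquartile_range(clean)
iqr_outlier = interquartile_range(with_outlier)

# The full range changes drastically, while the IQR stays the same.
print(max(clean) - min(clean), max(with_outlier) - min(with_outlier))
print(iqr_clean, iqr_outlier)
```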
4) Measures of skewness and kurtosis
Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution,
or data set, is symmetric if it looks the same to the left and right of the center point.
For univariate data Y1, Y2, ..., YN, the formula for skewness is:
$$\text{skewness} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^3 / N}{s^3}$$
where \bar{Y} is the mean, s is the standard deviation, and N is the number of data points. The
skewness for a normal distribution is zero, and any symmetric data should have a skewness
near zero. Negative values for the skewness indicate data that are skewed left and positive
values for the skewness indicate data that are skewed right. By skewed left, we mean that
the left tail is long relative to the right tail. Similarly, skewed right means that the right tail is
long relative to the left tail. Some measurements have a lower bound and are skewed right.
For example, in reliability studies, failure times cannot be negative.
Kurtosis is a measure of whether the data are peaked or flat relative to a normal
distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean,
decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat
top near the mean rather than a sharp peak. A uniform distribution would be the extreme
case.
For univariate data Y1, Y2, ..., YN, the formula for kurtosis is:
$$\text{kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4 / N}{s^4}$$
where \bar{Y} is the mean, s is the standard deviation, and N is the number of data points.
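A minimal sketch of both measures in Python follows. It assumes the moment-based forms (third and fourth central moments divided by powers of the standard deviation, with N as the divisor throughout); other texts use slightly different divisors. The data sets are hypothetical.

```python
from statistics import mean, pstdev


def skewness(values):
    """Third central moment divided by s^3 (s computed with divisor N)."""
    n = len(values)
    centre = mean(values)
    s = pstdev(values)
    return sum((y - centre) ** 3 for y in values) / (n * s ** 3)


def kurtosis(values):
    """Fourth central moment divided by s^4 (equals 3 for a normal distribution)."""
    n = len(values)
    centre = mean(values)
    s = pstdev(values)
    return sum((y - centre) ** 4 for y in values) / (n * s ** 4)


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]   # long right tail

print(skewness(symmetric))         # zero for symmetric data
print(skewness(right_skewed) > 0)  # positive for a long right tail
print(kurtosis(symmetric))         # flat, evenly spread data: well below 3
```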
5) Syx: standard error of estimate of y because of x.
Let us consider y_est as the estimated value of y for a given value of x. This estimated value
can be obtained from the regression curve of y on x. From this, the measure of the scatter
about the regression curve is supplied by the quantity:
$$S_{yx} = \sqrt{\frac{\sum (y - y_{est})^2}{n}}$$
where n is the number of observations.
The above equation is called the Standard Error of Estimate of y on x. It is important to note
that this Standard Error of Estimate has properties analogous to those of standard
deviation.
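The standard error of estimate can be sketched for a straight-line regression of y on x as follows. The function name and data are hypothetical, and the sketch assumes the divisor n; some texts divide by n - 2 instead.

```python
import math


def standard_error_of_estimate(xs, ys):
    """Root mean squared deviation of y from the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope and intercept of the regression of y on x.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    # Scatter of the observed y values about the estimated values y_est.
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(residual_ss / n)


perfect_fit = standard_error_of_estimate([1, 2, 3, 4], [2, 4, 6, 8])
scattered = standard_error_of_estimate([1, 2, 3], [1, 3, 2])
print(perfect_fit, round(scattered, 4))
```

A value of zero means every point lies on the regression line; larger values mean wider scatter about it.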
6) Coefficient of determination (r^2)
The coefficient of determination, r^2, is useful because it gives the proportion of
the variance (fluctuation) of one variable that is predictable from the other variable.
It is a measure that allows us to determine how certain one can be in making
predictions from a certain model/graph.
The coefficient of determination is the ratio of the explained variation to the total
variation.
The coefficient of determination is such that 0 <= r^2 <= 1, and denotes the strength
of the linear association between x and y.
The coefficient of determination represents the percent of the data that is the closest
to the line of best fit. For example, if r = 0.922, then r^2 = 0.850, which means that
85% of the total variation in y can be explained by the linear relationship between x
and y (as described by the regression equation). The other 15% of the total variation
in y remains unexplained.
The coefficient of determination is a measure of how well the regression line
represents the data. If the regression line passes exactly through every point on the
scatter plot, it would be able to explain all of the variation. The further the line is
away from the points, the less it is able to explain.
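The "explained variation over total variation" definition can be sketched directly. The function name and the small data set are assumptions for illustration; the last line simply rechecks the r = 0.922 arithmetic from the text.

```python
def coefficient_of_determination(xs, ys):
    """Explained variation divided by total variation for the line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    explained = sum((intercept + slope * x - mean_y) ** 2 for x in xs)
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total


r_squared = coefficient_of_determination([1, 2, 3], [1, 3, 2])  # hypothetical data
print(r_squared)  # 0.25, i.e. 25% of the variation in y is explained

print(round(0.922 ** 2, 3))  # 0.85
```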
7) Co-variance in bivariate analysis
8) Interval estimate.
An interval estimate is defined by two numbers, between which a population parameter is
said to lie. For example, a < \mu < b is an interval estimate of the population mean \mu. It
indicates that the population mean is greater than a but less than b.
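One common way to obtain the two numbers a and b is a confidence interval around the sample mean. The sketch below assumes a large-sample normal approximation (z-based limits); for small samples a t-based interval would normally be used instead. The sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (a, b) such that a < mu < b at the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    centre = mean(sample)
    standard_error = stdev(sample) / math.sqrt(len(sample))
    return centre - z * standard_error, centre + z * standard_error


sample = [52, 48, 50, 55, 47, 51, 49, 53, 50, 46]  # hypothetical measurements
low_95, high_95 = mean_confidence_interval(sample, 0.95)
low_99, high_99 = mean_confidence_interval(sample, 0.99)

print(round(low_95, 2), round(high_95, 2))
print(round(low_99, 2), round(high_99, 2))
```

The 99% interval is wider than the 95% interval: more confidence is bought at the cost of a less precise range.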
9) Classification, tabulation, presentation of data
Tabulation refers to the systematic arrangement of the information in rows and columns.
Rows are the horizontal arrangement. In simple words, tabulation is a layout of figures in
rectangular form with appropriate headings to explain different rows and columns. The
main purpose of the table is to simplify the presentation and to facilitate comparisons.
"A statistical table is a systematic organisation of data in columns and rows."
"Tabulation involves the orderly and systematic presentation of numerical data in a form
designed to elucidate the problem under consideration."
10) Frequency curve and histogram
A frequency curve is obtained by joining the points of a frequency polygon with a freehand
smoothed curve. Unlike the frequency polygon, where the points were joined by straight lines, we
join those points freehand in order to get a smoothed frequency curve. It
is used to remove the ruggedness of the polygon and to present it in a good form or shape. We
smoothen the angularities of the polygon only, without making any basic change in the
shape of the curve. In this case also the curve begins and ends at the base line, as in the case of
the polygon. The area under the curve must remain almost the same as in the case of the polygon.
A histogram is a way of summarising data that are measured on an interval scale (either
discrete or continuous). It is often used in exploratory data analysis to illustrate the major
features of the distribution of the data in a convenient form. It divides up the range of
possible values in a data set into classes or groups. For each group, a rectangle is
constructed with a base length equal to the range of values in that specific group, and an
area proportional to the number of observations falling into that group. This means that the
rectangles might be drawn of non-uniform height.
The histogram is only appropriate for variables whose values are numerical and measured
on an interval scale. It is generally used when dealing with large data sets (>100
observations), when stem and leaf plots become tedious to construct. A histogram can also
help detect any unusual observations or any gaps in the data set.
11) Correlation and regression analysis
Regression analysis is the mathematical process of using observations to find the line of best
fit through the data in order to make estimates and predictions about the behaviour of the
variables. This line of best fit may be linear (straight) or curvilinear, following some mathematical
formula.
Correlation analysis is the process of finding how well (or badly) the line fits the
observations, such that if all the observations lie exactly on the line of best fit, the
magnitude of the correlation is 1, or unity.
12) Yule's coefficient of association
In order to find the degree of intensity of association between two or more sets of
attributes, we should work out the coefficient of association. Professor Yule's coefficient of
association is:
QAB = {(AB)(ab) - (Ab)(aB)} / {(AB)(ab) + (Ab)(aB)}
QAB = Yule's coefficient of association between attributes A & B
(AB) = Frequency of class AB in which A & B are present
(Ab) = Frequency of class Ab in which A is present & B is absent
(aB) = Frequency of class aB in which A is absent & B is present
(ab) = Frequency of class ab in which both A & B are absent
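The formula above translates directly into a small function. The parameter names and the class frequencies in the example are hypothetical, chosen only to illustrate the calculation.

```python
def yules_q(ab_both, a_only, b_only, neither):
    """Yule's Q = {(AB)(ab) - (Ab)(aB)} / {(AB)(ab) + (Ab)(aB)}.

    ab_both -> (AB), a_only -> (Ab), b_only -> (aB), neither -> (ab).
    """
    numerator = ab_both * neither - a_only * b_only
    denominator = ab_both * neither + a_only * b_only
    if denominator == 0:
        raise ValueError("Q is undefined when both products are zero")
    return numerator / denominator


# Hypothetical frequencies: (AB) = 40, (Ab) = 10, (aB) = 20, (ab) = 30.
q = yules_q(40, 10, 20, 30)
print(round(q, 3))  # 0.714, a fairly strong positive association
```

Q ranges from -1 (perfect negative association) through 0 (independence) to +1 (perfect positive association).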
RM Assignment: RM 6
Q1 Differentiate between the following:
1. Completely randomized (CR) and randomized block (RB) experimental design.
2. Stratified sampling and cluster sampling.
3. Sampling and non-sampling errors.
4. Probability and non-probability sampling.
5. Survey and experiment.
6. Simple random sampling and systematic sampling.
7. Nominal data and ratio data.
8. Exploratory and diagnostic research.
9. Validity and reliability in attitude measurement.
10. Bias and error in research.
11. Structured and un-structured interview.
12. Latin square and factorial experimental design.
13. Principle of randomizing and principle of replication.
14. Multi-stage sampling and multi-phase sampling.
15. Informal experimental and formal experimental design.
Solution:
Q1.1 Completely Randomized (CR) vs. Randomized Block (RB) Experimental Design
1. CR: a simpler design than RB.
   RB: an improvement over CR.
2. CR: involves two principles, viz. the principle of replication and the principle of randomization.
   RB: the principle of local control can be applied along with the other two principles of experimental design.
3. CR: subjects are randomly assigned to experimental treatments.
   RB: subjects are divided into groups (blocks), such that within each group the subjects are relatively homogeneous in respect to some other variable.
4. CR: analysed by 1-way ANOVA.
   RB: analysed by 2-way ANOVA.
Q1.2 Stratified Sampling vs. Cluster Sampling
1. Stratified: used when the population from which a sample is to be drawn does not constitute a homogeneous group.
   Cluster: for bigger samples, divide the area into a number of smaller non-overlapping areas and then randomly select a number of these smaller areas (clusters).
2. Stratified: generally used to obtain a representative sample.
3. Stratified: the sampling population is divided into several sub-populations (strata) that are individually more homogeneous than the total population; items are then selected from each stratum for sampling.
   Cluster: the sampling population is divided into clusters, each of which is itself a group of units.
4. Stratified: sample size for stratum i is ni = {n x Ni x si} / {N1 x s1 + N2 x s2 + ... + Nk x sk}, where n is the total sample size, Ni the size of stratum i and si its standard deviation.
5. Stratified: high cost required.
   Cluster: low cost involved.
6. Stratified: more precise.
   Cluster: less precise.
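The stratum sample-size formula in point 4 above can be sketched as follows. The function name, stratum sizes and standard deviations are hypothetical values chosen for illustration.

```python
def stratum_sample_sizes(total_sample, stratum_sizes, stratum_sds):
    """ni = n * Ni * si / (N1*s1 + N2*s2 + ... + Nk*sk) for each stratum i."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    weight_total = sum(weights)
    return [total_sample * w / weight_total for w in weights]


# Hypothetical population of three strata.
sizes = [5000, 2000, 3000]   # N1, N2, N3
sds = [15, 18, 5]            # s1, s2, s3
allocation = stratum_sample_sizes(84, sizes, sds)
print(allocation)  # [50.0, 24.0, 10.0]
```

Larger and more variable strata receive a bigger share of the sample; in practice the results would be rounded to whole units.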
Q1.3 Probability Sampling vs. Non-probability Sampling
1. Probability: also known as random sampling or chance sampling.
   Non-probability: also known as deliberate sampling.
2. Probability: every item of the universe has an equal chance of inclusion in the sample.
   Non-probability: the organisers of the inquiry purposively choose the particular units of the universe for constituting a sample, on the basis that the small mass that they so select out of a huge one will be typical or representative of the whole.
3. Probability: each possible sample has a probability of 1/NCn of being selected.
   Non-probability: e.g. quota sampling; there is no probability basis for selection.
Q1.4 Survey vs. Experiment
1. Experiment: the process of examining the truth of a statistical hypothesis relating to some research problem is known as an experiment.
2. Experiment: of two types, absolute and comparative.
3. Survey: conducted in the case of descriptive research studies.
   Experiment: part of experimental research studies.
4. Survey: larger samples.
   Experiment: small samples.
5. Survey: normally used in the social and behavioural sciences.
   Experiment: used to measure the effects of an experiment which the researcher conducts intentionally.
6. Survey: example, field research.
   Experiment: example, laboratory research.
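Q1.5 Simple Random Sampling vs. Systematic Sampling
1. Simple random: just a random sample.
   Systematic: various systematic approaches.
2. Simple random: every entity from the universe may become part of the sample.
   Systematic: a selection logic is defined in order to have better control over the sample.
3. Simple random: low cost.
   Systematic: high cost is involved.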
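The difference between the two selection rules can be sketched in Python. The function names, the population of 100 numbered units and the seed are assumptions for illustration; the systematic rule shown is the common "random start, then every k-th unit" approach.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Every unit has the same chance; units are drawn without replacement."""
    return random.Random(seed).sample(population, n)


def systematic_sample(population, n, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    k = len(population) // n                  # sampling interval
    start = random.Random(seed).randrange(k)  # random start in [0, k)
    return [population[start + i * k] for i in range(n)]


units = list(range(1, 101))                   # hypothetical numbered population
srs = simple_random_sample(units, 10, seed=1)
sys_sample = systematic_sample(units, 10, seed=1)
print(sorted(srs))
print(sys_sample)
```

In the systematic sample only the starting point is random; the remaining units follow at a fixed interval of k.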
Q1.6 Nominal Data vs. Ratio Data
1. Nominal: simply a system of assigning number symbols to events in order to label them.
   Ratio: has an absolute or true zero of measurement.
2. Nominal: convenient for keeping track of events.
   Ratio: represents the actual amounts of variables.
3. Nominal: the mode is the only measure of central tendency.
   Ratio: geometric or harmonic means can be used as measures of central tendency.
4. Nominal: widely used in surveys.
   Ratio: used for physical measurement.
Q1.7 Exploratory Research vs. Diagnostic Research
1. Exploratory: carried out for exploring new ideas with support.
   Diagnostic: carried out for diagnosing a certain problem.
2. Exploratory: general research leading to surveys.
   Diagnostic: extensive research that involves in-depth study and statistical tools.
3. Exploratory: low to moderate cost compared to diagnostic research.
   Diagnostic: high cost compared to exploratory research.
Q1.8 Validity in attitude measurement Reliability in attitude measurement
Q1.9 Bias in Research vs. Error in Research
1. Bias: may impact the results of the research.
   Error: greatly impacts the results of the research.
2. Bias: attitude related.
   Error: system related.
Q1.10 Structured Interview vs. Un-structured Interview
1. Structured: involves a set of predetermined questions.
   Un-structured: questions are not fixed.
2. Structured: highly standardised techniques of recording.
   Un-structured: normal standards for recording.
3. Structured: rigid procedure for the interview.
   Un-structured: freedom to conduct the interview.
4. Structured: question order is fixed.
   Un-structured: the question sequence may sometimes be changed.
Q1.11 Latin Square vs. Factorial Experimental Design
1. Latin square: very frequently used in agricultural research.
   Factorial: used in experiments where the effects of varying more than one factor are to be determined.
2. Latin square: assumes that there is no interaction between the row factor and the column factor.
   Factorial: there is interaction between the row and column factors.
3. Latin square: the numbers of rows and columns are required to be equal.
   Factorial: more complex problems are examined, with multiple rows and columns.
4. Latin square: accuracy is low compared to the factorial design.
   Factorial: provides equivalent accuracy with less labour and as such is a source of economy.
Q1 .12 Principle of randomizing Principle of replication
1
2
Q1.13 Multi-stage Sampling vs. Multi-phase Sampling
1. Multi-stage: a further development of cluster sampling.
2. Multi-stage: easier to administer.
3. Multi-stage: a large number of units can be sampled for a given cost.
Q1.14 Informal Experimental Design vs. Formal Experimental Design
1. Informal: of 3 types:
   Before-and-after without control design
   After-only with control design
   Before-and-after with control design
   Formal: of 4 types:
   Completely randomized design (CR)
   Randomized block design (RB)
   Latin square design (LS)
   Factorial design
2. Informal: less sophisticated.
   Formal: offers more control.
3. Informal: based on differences of magnitude.
   Formal: uses precise statistical procedures for analysis.
Q2 Justify the following statements:
1. Quota sampling is a non-probability sampling.
2. We don't need hypothesis firmed up in diagnostic research.
3. Wording of questionnaire can cause ineffective instrument.
4. In Latin square experimental design it is assumed that factors are independent of each other.
5. Stratified sampling method assumes strata to be homogeneous within and heterogeneous between.
6. Convenience sampling is a method of probability sampling.
7. Semantic differential scale requires identifying bi-polar adjectives describing the object.
8. Likert scale is a summative model for attitude measurement.
9. Principle of replication in experimental design is aimed at increasing statistical accuracy.
10. Principle of local control in experimental design is identifying effect of known source of variation in data.
11. Non-sampling errors cannot be totally avoided in research.
12. Word association test is a projective method of data collection.
13. Defining the problem involves identifying unit of analysis and characteristic of interest, time and space references and environmental conditions.
14. Projective methods of data collection are used for inferred characteristics.
15. On ordinal data, we can do all mathematical operations.
16. Optimal sample size is based on degree of accuracy and level of confidence expected.
17. Cluster sampling needs each cluster to be homogeneous between and heterogeneous within.
18. Systematic sampling is not truly probability sampling.
19. Parameters of quality data are same whether it is primary data or secondary data.
20. We firm up hypothesis based on exploratory, descriptive and diagnostic research.
Solution:
1) Quota sampling is a non-probability sampling.
The first step in non-probability quota sampling is to divide the population into exclusive
subgroups. Then, the researcher must identify the proportions of these subgroups in the
population; this same proportion will be applied in the sampling process. Finally, the
researcher selects subjects from the various subgroups while taking into consideration the
proportions noted in the previous step. The final step ensures that the sample reflects the
population on the characteristics used to form the subgroups. It also allows the researcher
to study traits and characteristics that are noted for each subgroup. Because subjects within
each subgroup are chosen at the researcher's discretion rather than by random selection, the
probability of selecting any given unit is not known; hence quota sampling is called
non-probability sampling.
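The proportion step described above can be sketched in Python. This is only an illustration: the function name quota_allocation, the subgroup labels and the counts are assumptions made up for the example, and leftover seats are handed out by a largest-remainder rule chosen here for convenience. The sketch fixes only how many subjects each subgroup contributes; who is picked inside a subgroup is still left to the researcher, which is what makes the method non-probability.

```python
def quota_allocation(subgroup_counts, sample_size):
    """Split sample_size across subgroups in proportion to their population counts."""
    total = sum(subgroup_counts.values())
    # Whole-number share of the sample for each subgroup (rounded down).
    quotas = {g: c * sample_size // total for g, c in subgroup_counts.items()}
    # Integer remainders decide which subgroups receive the leftover seats.
    leftovers = {g: c * sample_size % total for g, c in subgroup_counts.items()}
    shortfall = sample_size - sum(quotas.values())
    for g in sorted(leftovers, key=leftovers.get, reverse=True)[:shortfall]:
        quotas[g] += 1
    return quotas


# Hypothetical population of 10,000 people split into three age groups.
population_counts = {"under_30": 3000, "30_to_50": 5000, "over_50": 2000}
quotas = quota_allocation(population_counts, 100)
print(quotas)
```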
2) We don't need hypothesis firmed up in diagnostic research.
This is because diagnostic research (DR) aims to identify the causes of a problem and its possible solutions.
3) Wording of questionnaire can cause ineffective instrument.
The wording and order of questions ensure that each respondent receives the same
stimuli; otherwise the purpose of the survey will not be served.
4) In Latin square experimental design it is assumed that factors are independent of each other.
A Latin square is used in experimental designs in which one wishes to compare
treatments and to control for two other known sources of variation. It was recognized
that within a field there would be fertility trends running both across the field and up
and down the field. So in an experiment to test, say, four different fertilizers, A, B, C and
D, the field would be divided into four horizontal strips and four vertical strips, thus
producing 16 smaller plots. A Latin square design will give a random allocation of
fertilizer type to a plot in such a way that each fertilizer type is used once in each
horizontal strip (row) and once in each vertical strip (column).
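The fertilizer layout described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names random_latin_square and is_latin_square are hypothetical, and the layout is built by shuffling the rows, columns and labels of a cyclic square, which gives a valid randomized layout but does not draw uniformly from all possible Latin squares.

```python
import random


def random_latin_square(treatments, rng=None):
    """Return an n x n layout in which each treatment appears once per row and column."""
    rng = rng or random.Random()
    n = len(treatments)
    # Start from the cyclic square: cell (r, c) holds treatment index (r + c) mod n.
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)                      # permute the rows
    cols = list(range(n))
    rng.shuffle(cols)                        # permute the columns
    square = [[row[c] for c in cols] for row in square]
    labels = list(treatments)
    rng.shuffle(labels)                      # permute which treatment gets which index
    return [[labels[i] for i in row] for row in square]


def is_latin_square(square):
    """Check that every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


# Four fertilizers on a field cut into 4 horizontal and 4 vertical strips.
layout = random_latin_square(["A", "B", "C", "D"], random.Random(0))
for row in layout:
    print(" ".join(row))
```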
5) Stratified sampling method assumes strata to be homogeneous within and heterogeneous between.
6) Convenience sampling is a method of probability sampling.
No, convenience sampling is a non-probability sampling technique where subjects are
selected because of their convenient accessibility and proximity to the researcher.
7) Semantic differential scale requires identifying bi-polar adjectives describing the object.
Yes, the semantic differential is a type of rating scale designed to measure the connotative
meaning of objects, events, and concepts.
8) Likert scale is a summative model for attitude measurement.
Likert (1932) developed the principle of measuring attitudes by asking people to respond
to a series of statements about a topic, in terms of the extent to which they agree with
them, and so tapping into the cognitive and affective components of attitudes.
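The summative idea can be sketched in Python: each statement is scored on an agreement scale and the item scores are added to give one attitude score. The function name likert_total, the 5-point scale and the reverse-scored items are assumptions made for this illustration, not part of the answer above.

```python
def likert_total(responses, reverse_items=(), points=5):
    """Sum item scores; negatively worded items are reverse-scored before summing."""
    total = 0
    for index, score in enumerate(responses):
        if not 1 <= score <= points:
            raise ValueError(f"score {score} is outside 1..{points}")
        # On a 5-point scale a reversed item maps 1->5, 2->4, 3->3, 4->2, 5->1.
        total += (points + 1 - score) if index in reverse_items else score
    return total


# Hypothetical respondent: four statements, the last two negatively worded.
score = likert_total([5, 4, 2, 1], reverse_items={2, 3})
print(score)
```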
9) Principle of replication in experimental design is aimed at increasing statistical accuracy.
Measurements are usually subject to variation and uncertainty. Measurements are
repeated and full experiments are replicated to help identify the sources of variation, to
better estimate the true effects of treatments, to further strengthen the experiment's
reliability and validity, and to add to the existing knowledge about the topic.[13]
However, certain conditions must be met before the replication of the experiment is
commenced: the original research question has been published in a peer-reviewed
journal or widely cited, the researcher is independent of the original experiment, the
researcher must first try to replicate the original findings using the original data, and the
write-up should state that the study conducted is a replication study that tried to follow
the original study as strictly as possible.
10) Principle of local control in experimental design is identifying effect of known source of variation in data.
Local control refers to grouping of the experimental units in such a way that the units
within a group (i.e., block) are more homogeneous than are units in different groups.
The experimental materials or conditions are more alike within a group. Thus, the
variation among experimental units within a group is less than the variation would have
been without grouping.
11) Non-sampling errors cannot be totally avoided in research.
Non-sampling errors are part of the total error that can arise from doing a statistical
analysis. The remainder of the total error arises from sampling error. Unlike sampling
error, increasing the sample size will not have any effect on reducing non-sampling
error. Unfortunately, it is virtually impossible to eliminate non-sampling errors entirely.
12) Word association test is a projective method of data collection.
Word Association Test: An individual is given a clue or hint and asked to respond to the
first thing that comes to mind. The association can take the shape of a picture or a word.
There can be many interpretations of the same thing. A list of words is given, and the
respondent does not know which word the researcher is most interested in.
13) Defining the problem involves identifying unit of analysis and characteristic of interest, time and space references and environmental conditions.
14) Projective methods of data collection are used for inferred characteristics.
Projective methods rest on the idea that an individual puts structure on an ambiguous
situation in a way that is consistent with their own conscious and unconscious needs.
15) On ordinal data, we can do all mathematical operations.
Ordinal data is the second level of measurement. The experimental (scientific)
method depends on physically measuring things. The concept of measurement has been
developed in conjunction with the concepts of numbers and units of measurement.
Statisticians categorize measurements according to levels. Each level corresponds to
how this measurement can be treated mathematically. Because ordinal data records only
rank order, not all mathematical operations are meaningful on it.
16) Optimal sample size is based on degree of accuracy and level of confidence expected.
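One way to see how accuracy and confidence drive sample size is the textbook formula n = (z * sigma / e)^2 for estimating a mean, where e is the acceptable margin of error and z is the normal value for the chosen confidence level. The Python sketch below assumes that formula applies (simple random sampling, known standard deviation); the function name required_sample_size and the numbers are illustrative only.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin_of_error (assumed formula)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin_of_error) ** 2)


# Hypothetical case: standard deviation 10, desired accuracy of plus or minus 2 units.
n_95 = required_sample_size(10, 2, confidence=0.95)
n_99 = required_sample_size(10, 2, confidence=0.99)
print(n_95, n_99)
```

Asking for a higher confidence level or a tighter margin of error both push the required sample size up.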
17) Cluster sampling needs each cluster to be homogeneous between and heterogeneous within.
A common motivation for cluster sampling is to reduce the average cost per interview.
Given a fixed budget, this can allow an increased sample size.
18) Systematic sampling is not truly probability sampling.
Systematic sampling is still thought of as being random, as long as the periodic interval is
determined beforehand and the starting point is random. For example, if you wanted to
select a random group of 1,000 people from a population of 50,000 using systematic
sampling, you would simply select every 50th person, since 50,000/1,000 = 50.
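The example above can be sketched in Python. The function name systematic_sample and the numbered population are assumptions for illustration; the sketch also assumes the population size is a whole multiple of the sample size, as in the 50,000 and 1,000 case.

```python
import random


def systematic_sample(population, sample_size, rng=None):
    """Pick every k-th unit after a random start, where k = N // n."""
    rng = rng or random.Random()
    interval = len(population) // sample_size   # 50,000 // 1,000 = 50
    start = rng.randrange(interval)             # random starting point in the first interval
    return population[start::interval][:sample_size]


# Hypothetical population: people numbered 0 to 49,999.
population = list(range(50_000))
sample = systematic_sample(population, 1_000, random.Random(1))
print(len(sample), sample[:3])
```

The only random element is the starting point; once it is drawn, every later selection is fixed by the interval.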
19) Parameters of quality data are same whether it is primary data or secondary data.
Data that has been collected from first-hand experience is known as primary data.
Primary data has not been published yet and is more reliable, authentic and objective.
Primary data has not been changed or altered by human beings; therefore its validity is
greater than that of secondary data. The review of literature in any research is based on
secondary data, mostly from books, journals and periodicals.
20) We firm up hypothesis based on exploratory, descriptive and diagnostic research.