# Research Methods

All of the notes below are taken from Statistics without Maths for Psychology by Dancey and Reidy. I recommend it for its hand-holding build up into the subject (as well as its humour!)

CHAPTER 1 – VARIABLES AND RESEARCH DESIGN

1.2 VARIABLES

Variables: things we can measure that can vary

– we’re interested in variables as we want to understand why they vary

Continuous: can take any value in a range (e.g. duration of violence on TV 40.2553 secs)

Discrete: only certain values in a range (e.g. no. of violent incidents on TV – can’t be half)

Difference bet. variable type & how it’s measured e.g. Anxiety (continuous) measured with score on anxiety questionnaire (discrete)

Categorical: values are categories (e.g. gender, job, whether you ate cake at 3am today)

Dichotomising: converting continuous/ discrete variables to 2 categorical e.g. tall & short people

BUT – can reduce sensitivity of statistical analysis to 67% (e.g. height 2ft & 4ft in same category ‘short’)

1.3 LEVELS/ SCALES OF MEASUREMENT

Nominal: categories that can’t be ordered (e.g. male, female, black, white)

Ordinal: categories that can be ordered but intervals aren’t equal (e.g. happiness feeling)

Interval: equal intervals with arbitrary zero (e.g. temperature – 15°C isn’t half 30°C)

Ratio: equal intervals with absolute zero (e.g. speed, time – 20kph is half 40kph)

1.4 RESEARCH DESIGNS

extraneous variables: variables that might have impact on DV but we didn’t take into account

confounding variable: specific extraneous variable related to both the IV & DV e.g. IV = sex, DV = basketball ability; but height is related to both sex & basketball ability

– we eliminate these by controlling them e.g. by matching groups

1.4.2 CORRELATIONAL DESIGNS

correlational designs: examine relationships bet. variables (e.g. smoking & cancer)

– correlation doesn’t mean causation: we just know that 2 variables co-vary

– Pearson product moment correlation coefficient or Spearman’s rho correlation coefficient

1.4.4 EXPERIMENTAL DESIGN (TRUE EXPERIMENT)

– experimenter manipulates IV to see the effect on the DV

– allows inference of causation as the experimenter manipulates the IV directly & controls other variables (e.g. by random allocation)

IV: manipulated variable – its value is independent of the other variables under investigation

– random allocation reduces probability of systematic differences bet. groups

– random allocation good to control interpersonal factors but won’t cover e.g. time of day

1.4.5 QUASI-EXPERIMENTAL DESIGNS

– involves looking for diffs in the DV between conditions of the IV

– looking at variables we can’t directly manipulate; no random allocation to groups

– no random allocation means we can’t say the pseudo IV manipulation caused diffs.

– if there’s no random allocation to conditions then it’s probably quasi-experimental design

– experimental/ quasi-experimental: t-test, Mann-Whitney U test, Wilcoxon, ANOVA

1.5.1 WITHIN-PARTICIPANTS (aka REPEATED MEASURES/ RELATED) DESIGN

– same participants in every condition of the IV

– helps control inter-individual confounding variables

– needs fewer people

BUT order effects: doing conditions in a certain order (not the IV) >> diffs in the DV (bored)

BUT demand effects: Ss doing what experimenter wants them to do (as they’ve seen both conditions)

BUT can’t be used for quasi-experimental designs (e.g. female & male IV conditions)

counterbalancing: Ss do conditions in systematically different orders; helps control order effects but doesn’t eliminate demand effects

1.5.2 BETWEEN-PARTICIPANTS (aka INDEPENDENT/ UNRELATED) DESIGN

– different participants in each condition of the IV

– eliminates order effects (bored, tired, frustrated) and demand effects (working out study)

BUT needs more people

BUT lose some control over inter-individual confounds (e.g. more shy people in group A)

CHAPTER 2 – INTRODUCTION TO SPSS

CHAPTER 3 – DESCRIPTIVE STATISTICS

3.1 SAMPLES & POPULATIONS

population: all possible people/ items that have a certain characteristic (e.g. all left-handers)

sample: a selection of people or items from a population

– there are diffs. bet. the statistical techniques describing samples & describing populations:

statistics: describe samples (e.g. calculating mean for a sample is a statistic)

parameters: describe populations (e.g. calculating mean for a population is a parameter)

– inferential statistics help us generalise from the sample to the whole population

– descriptive statistics are used to describe samples

3.2 MEASURES OF CENTRAL TENDENCY

– indication of the typical score in our sample; an estimate of mid point of score distribution

mean: sum of all scores in sample divided by number of scores in sample

median: middle score once all scores in sample have been put in rank order

– ranking position (plain position) vs ranks (i.e. two of the same score have same rank = mean of ranking positions)

– where there’s an even no. of scores we take the mean of the two middle scores

mode: the most frequently occurring score in a sample

3.2.4 WHICH MEASURE OF CENTRAL TENDENCY SHOULD YOU USE?

– mean is calculated from actual scores not ranks or freq but is sensitive to extreme scores

– median is not sensitive to extreme scores: use where mean doesn’t indicate typical score

– mode is used for categories but can be skewed if mode is just e.g. 2 scores: 1,1,9,10,11,12
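
A minimal Python sketch of the three measures, using the standard-library `statistics` module on the example scores above. The extra score of 100 is a hypothetical value added only to show how an extreme score affects the mean and the median differently.

```python
from statistics import mean, median, mode

scores = [1, 1, 9, 10, 11, 12]  # example scores from the mode note above

central = {
    "mean": mean(scores),      # sum of scores / number of scores
    "median": median(scores),  # even no. of scores: mean of the two middle ones
    "mode": mode(scores),      # most frequently occurring score
}
print(central)

# One extreme score (hypothetical) pulls the mean up far more than the median.
with_extreme = scores + [100]
print(mean(with_extreme), median(with_extreme))
```

Here the mode (1) sits at the bottom of the range, which is why it can be a poor guide to the typical score.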

3.2.5 THE POPULATION MEAN PARAMETER (MEANS OF SAMPLES)

– population mean: get the means of a no. of samples, then find the mean of those means

– there’s a tendency for the mean of sample means to closely approximate the pop mean

3.3 SAMPLING ERROR

sampling error: the difference between the sample statistic and population parameter

– small samples increase the probability that the sample mean falls well above/below the population mean

3.4 GRAPHICALLY DESCRIBING DATA

exploratory data analysis: exploring data to describe it in more detail (without inference)

3.4.1 FREQUENCY HISTOGRAM

– frequency of occurrence of a score on a variable; x=score, y=freq

– shows mode, suspicious scores, distribution; no. of scores covered by a bar is ‘bin width’

3.4.2 STEM & LEAF PLOT (TUKEY 1977)

similar to histogram but frequency as: 2 (stem=tens) & 555889 (leaf=units) i.e. 2 555889

– stem width can be broken down into 5s (blocking), where ‘2.’ is 20-24 and 2* is 25-29

3.4.3 BOX PLOT (TUKEY 1977)

– allow us to easily see extreme scores & how scores in a sample are distributed

1. find the median by ordering scores: 2,12,12,19,19,20,20,20,25
2. calculate hinges (mid 50%; cut top & bottom 25%): add 1 to median position 5 (6) & div by 2 = 3

– upper & lower hinges are 3rd score from top and bottom: 12 & 20

– ‘h-spread’ is range of scores between hinges (top & bottom of box): 20-12 = 8

– median is shown by a thick line within the box

3. extreme scores are one-and-a-half times the h-spread outside the hinges: 8 x 1.5 = 12

– so extreme scores (outliers) are outside 12-12 = 0 & 20+12 = 32 (shown as score pos X)

– these bounds are the ‘inner fences’ (not usu. shown on boxplot)

4. ‘adjacent scores’ are the scores closest to (but still inside) the inner fences 0 & 32, so 2 & 25

– ‘whiskers’ are lines from hinges to adjacent scores
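
The box plot steps above can be sketched in Python as follows. This assumes an odd number of scores and whole-number median and hinge positions, as in the worked example; other cases need averaging of neighbouring scores, which is not handled here. All variable names are illustrative.

```python
scores = sorted([2, 12, 12, 19, 19, 20, 20, 20, 25])
n = len(scores)

# Step 1: median position and value (odd n assumed).
median_pos = (n + 1) // 2
median = scores[median_pos - 1]

# Step 2: hinge position = (median position + 1) / 2, counted from each end.
hinge_pos = (median_pos + 1) // 2
lower_hinge = scores[hinge_pos - 1]
upper_hinge = scores[-hinge_pos]
h_spread = upper_hinge - lower_hinge

# Step 3: inner fences lie 1.5 x h-spread outside the hinges.
step = 1.5 * h_spread
lower_fence = lower_hinge - step
upper_fence = upper_hinge + step
outliers = [s for s in scores if s < lower_fence or s > upper_fence]

# Step 4: adjacent scores are the most extreme scores still inside the fences.
inside = [s for s in scores if lower_fence <= s <= upper_fence]
adjacent = (min(inside), max(inside))

print(median, lower_hinge, upper_hinge, h_spread)
print(lower_fence, upper_fence, adjacent, outliers)
```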

3.5 SCATTERGRAMS

scattergram: graphical representation of relationship bet. 2 variables, one on X and one on Y

– scores around an imaginary line from bottom left to top right = positive relationship

– scattergrams can appear to show relationships where there’s none, due to sampling error

3.7 THE NORMAL DISTRIBUTION

normal distribution: a)symmetrical about the mean b)tails meet x-axis at infinity c)bell shaped

– mean, median & mode are exactly the same (the peak)

– ND is a function of its mean & std deviation: the ND can be plotted by reference to its mean & std deviation

3.8 VARIATION OR SPREAD OF DISTRIBUTIONS

variance: the degree to which scores are different from one another

3.8.1 THE RANGE

range: maximum minus minimum – but doesn’t tell us much about shape of distribution

3.8.2 STANDARD DEVIATION

standard deviation: average deviation of scores from the mean

variance: average squared deviation of scores from the mean; SD is square root of variance

1. 1,4,5,6,9,11 – mean is 6
2. deviation from mean is -5,-2,-1,0,3,5 [mean of these is ‘mean deviation’ = 0]
3. square these deviations 25,4,1,0,9,25 and get the mean 64/6 = 10.67 = variance
4. get the square root of 10.67 = SD 3.27 so that it can be compared with the actual scores

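– roughly 68% (approx. 70%) of scores in a normal distribution are within 1 SD of the mean (e.g. bet. 6 +/- 3.27 i.e. from 2.73 to 9.27)

– modified SD (estimate of the population SD) divides by n-1 instead of n (e.g. 64/5 = 12.8; square root = 3.58) as sample SD underestimates pop SD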
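
The four steps above, plus the n-1 version, can be checked with a short Python sketch. The last two lines cross-check against the standard library's `pstdev` (divides by n) and `stdev` (divides by n-1).

```python
from math import sqrt
from statistics import pstdev, stdev

scores = [1, 4, 5, 6, 9, 11]
n = len(scores)

mean = sum(scores) / n                      # step 1
deviations = [x - mean for x in scores]     # step 2 (these sum to zero)
squared = [d ** 2 for d in deviations]      # step 3
variance = sum(squared) / n                 # mean squared deviation
sd = sqrt(variance)                         # step 4

# Dividing by n - 1 gives the estimate of the population SD.
est_sd = sqrt(sum(squared) / (n - 1))

print(round(variance, 2), round(sd, 2), round(est_sd, 2))
print(round(pstdev(scores), 2), round(stdev(scores), 2))
```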

3.9 OTHER CHARACTERISTICS OF DISTRIBUTIONS

kurtosis: measure of how peaked a distribution is (positive or negative in SPSS)

platykurtic: flat (negative); leptokurtic: very peaked (positive); mesokurtic: between (0)

3.10 NON-NORMAL DISTRIBUTIONS

– distribution-free or non-parametric tests are used if data isn’t normally distributed

3.10.1 SKEWED DISTRIBUTIONS

skewed distributions: peak shifted away from centre (distorts mean; could be sampling error)

positively skewed: tail pointing right (peak left); negatively skewed: tail to left (peak right)

– SPSS gives positive value for positive skew, negative for neg skew & zero for no skew

– box plot with median near hinges & no whisker indicates skew; bimodal not shown up well

3.10.2 BIMODAL DISTRIBUTIONS

– has 2 peaks (2 modes), suggesting there are 2 distinct populations underlying the data

CHAPTER 4 – PROBABILITY SAMPLING AND DISTRIBUTIONS

4.1 PROBABILITY

probability: the likelihood of a particular event of interest occurring

– divide desired outcomes by total no. of possible outcomes (even on dice=3/6, 0.5 or 50%)

4.1.1 CONDITIONAL PROBABILITIES

conditional probability: probability of event happening if other condition(s) has also happened

– e.g. being struck by lightning playing golf or getting lung cancer if you smoke

4.2 THE STANDARD NORMAL DISTRIBUTION (AKA. PROBABILITY DISTRIBUTION)

SND: a normally shaped distribution with mean of zero and SD of 1; distribution of z-scores

z-score/ standardised score: subtract mean from our score & divide by SD (so we can use SND to analyze our data).

– e.g. IQ mean = 100, SD = 15; my score 95; (95-100)/15 = -0.33 i.e. 0.33 SDs below mean

probability distribution: we know probability of randomly selecting a score/ range of scores from the distribution
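
A small Python sketch of the z-score calculation for the IQ example, followed by one use of the standard normal distribution as a probability distribution. `z_score` is an illustrative helper name; `NormalDist` comes from the standard library (Python 3.8+) and defaults to mean 0 and SD 1.

```python
from statistics import NormalDist

def z_score(score, mean, sd):
    """Standardise a score: subtract the mean, then divide by the SD."""
    return (score - mean) / sd

z = z_score(95, mean=100, sd=15)

# Proportion of the standard normal distribution lying below this z-score,
# i.e. the probability of randomly selecting a lower score.
p_below = NormalDist().cdf(z)

print(round(z, 2), round(p_below, 2))
```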

4.4 SAMPLING DISTRIBUTIONS

sampling distribution: calculating a statistic (e.g. lots of means) from infinite samples & plotting them on a histogram

– a sampling distribution of means would make a normal distribution (even if the popn distribution was flat or skewed)

– the peak of those mean distributions will indicate the popn mean (central limit theorem)
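
A rough simulation of a sampling distribution of means, as a sketch only. The exponential population, the sample size of 30 and the 2,000 samples are all arbitrary assumptions chosen for illustration (the definition above refers to infinite samples).

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so the run is repeatable

# A positively skewed (exponential) population of 100,000 scores.
population = [random.expovariate(1.0) for _ in range(100_000)]

# Draw many samples and record the mean of each one.
sample_means = [fmean(random.sample(population, 30)) for _ in range(2_000)]

# The mean of the sample means should sit close to the population mean.
print(fmean(population), fmean(sample_means))
```

Plotting `sample_means` as a histogram should give a roughly bell-shaped curve even though the population itself is skewed.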

4.5 CONFIDENCE INTERVALS AND THE STANDARD ERROR

point estimate: an estimate of an unknown number (i.e. popn parameter e.g. popn mean)

interval estimate: a range in which the unknown number will be

confidence interval: statistically determined interval estimate of a population parameter

4.5.1 STANDARD ERROR

standard error: the SD of a sampling distribution (e.g. SD of sampling distribution of mean)

– standard error is calc as SD of a sample div by square root of the sample size

– 95% confidence interval is sample mean +/- 1.96 x standard error (95% of scores in a normal distribution are within 1.96 SDs of the mean)

confidence interval: range around our sample mean within which we’re e.g. 95% confident the popn mean falls

– e.g. mean 7, SD 3.58, sample 6; std err=3.58/2.45=1.46; 95% confid int =1.96×1.46=2.86

– we’re 95% confident that popn mean falls bet. 7+/-2.86; larger sample, smaller interval
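
The standard error and 95% confidence interval example can be reproduced with this Python sketch. The function name and return layout are illustrative; the 1.96 multiplier is the one used in the notes above.

```python
from math import sqrt

def confidence_interval_95(sample_mean, sd, n):
    """Return (lower, upper, standard_error, margin) for a 95% interval."""
    standard_error = sd / sqrt(n)      # SD divided by square root of sample size
    margin = 1.96 * standard_error
    return sample_mean - margin, sample_mean + margin, standard_error, margin

lower, upper, se, margin = confidence_interval_95(7, 3.58, 6)
print(round(se, 2), round(margin, 2), round(lower, 2), round(upper, 2))

# A larger sample with the same SD gives a narrower interval.
_, _, _, margin_big = confidence_interval_95(7, 3.58, 60)
print(round(margin_big, 2))
```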

CHAPTER 5 – HYPOTHESIS TESTING AND STATISTICAL SIGNIFICANCE

5.1 ANOTHER WAY OF APPLYING PROBABILITIES TO RESEARCH: HYPOTHESIS TESTING

p-value: probability of getting our results if there was no relationship bet. variables in pop.

– i.e. probability of getting our results (whatever they were) if the null hypothesis was true

5.2 NULL HYPOTHESIS (SIGNIFICANCE TESTING – NHT/ NHST)

research/ experimental/ alternate hypothesis: prediction of how 2 variables are related

– or how groups differ, or how participants differ under different conditions

null hypothesis: no effect (relation bet. variables) in the underlying population

– or difference bet. populations, or difference in responses of pop. under diff. conditions

– rejecting null hyp means probability of getting our findings if the null hyp was true is so small that accepting the experimental hyp makes more sense

5.3 LOGIC OF NULL HYPOTHESIS TESTING

(a) formulate hypothesis

(b) measure & examine relationship bet. variables

(c) calculate probability of getting this relationship if there was no relationship in the popn.

(d) if probability small, means pattern is unlikely to be due to chance, so there’s probably relationship in popn.

– i.e. if there’s no relationship in popn. you’re unlikely to get it in your random sample

5.4 THE SIGNIFICANCE LEVEL

p-value (0 to 1): probability of getting your pattern of results if the null hypothesis was true*

alpha α: criterion for statistical significance; a p-value below this means our results are so unlikely under the null hyp that the research hyp is more plausible

– i.e. 5% (0.05/1 in 20): if we did our study 20 times, only 1 would give the results we obtained by chance, if null hyp was true

statistically significant: pattern of results so unlikely, suggesting research hyp more plausible than null

not significant: our pattern of results is highly probable if the null hypothesis were true

– BUT statistical significance (likelihood of results) doesn’t equal psychological significance (impact/effect)

– BUT the p-value isn’t the probability that the null/research hyp is true; it’s a conditional probability (of getting our results, given the null hyp is true)*

5.7 STATISTICAL TESTS  (RELATED ALSO TO 5.11)

e.g. test statistic: when we convert our data into a score from a probability distribution

5.8 TYPE I ERROR

type 1 error: the null hyp is true in popn, but our sample hit that e.g. 1 in 20 chance causing us to believe there’s an effect in popn

– i.e. we see an effect (relationship) in our study, but there is none in the population

5.9 TYPE II ERROR

type 2 error: we see no effect in our study, but there is one in the population

– probability of making type 1 error denoted as α ; probability of making type 2 denoted as β

power of a study 1-β: ability of study to reject null hypothesis when it is false

5.10 WHY SET α AT 0.05

– if we set α at 0.2 we’d be tolerating a type 1 error one case in every 5, instead of 1 in 20

– we need good balance bet. making type I and II errors: α at 0.05 gives this, in most cases

– also it’s a level traditionally used by most psychologists

5.11 ONE-TAILED (DIRECTIONAL) AND TWO TAILED (BI-DIRECTIONAL) HYPOTHESES

one-tailed: specifying the direction of the relationship bet. variables, or diff. bet. conditions

two-tailed: predicting a relationship bet. variables or diff. bet. conditions without specifying its direction

– if you get a p-value of 0.03 for a 2-tailed test, equivalent for 1-tailed is 0.015 (& vice versa)

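– this is because a one-tailed test puts the whole rejection region (e.g. 5%) in a single tail instead of splitting it (2.5% each) bet. two, so we can reject the null hyp for a greater no. of scores in that tail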
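
A sketch of the one-tailed/two-tailed relationship in Python. It assumes the test statistic is a z-score from the standard normal distribution, and that the effect lies in the predicted direction for the one-tailed case. The value 1.96 is just an illustrative input.

```python
from statistics import NormalDist

def p_values(z):
    """Return (one_tailed, two_tailed) p-values for a z test statistic."""
    one_tailed = 1 - NormalDist().cdf(abs(z))  # area in a single tail
    two_tailed = 2 * one_tailed                # same area counted in both tails
    return one_tailed, two_tailed

one_tailed, two_tailed = p_values(1.96)
print(round(one_tailed, 3), round(two_tailed, 3))
```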

5.12 ASSUMPTIONS UNDERLYING PARAMETRIC TESTS

parametric tests: based on estimation of/ assumptions about underlying popn parameters

(1) interval/ratio scale, (2) normally distributed popn, (3) variances of popns approx equal (if not, sample sizes should be equal), (4) no extreme scores

non-parametric/ distribution free tests: don’t make these assumptions/ estimations about popn parameters

CHAPTER 6 – CORRELATIONAL ANALYSIS: PEARSON’S r

6.1 BIVARIATE CORRELATIONS

– looking at the relationship bet. 2 variables; if they’re associated, they’re co-related

6.1.1 DRAWING CONCLUSIONS FROM CORRELATIONAL ANALYSES

– involves (1)scattergrams (2)Pearson’s r (3)confidence limits (4)interpretation of results

6.1.2 PURPOSE OF CORRELATIONAL ANALYSIS

– to see if there’s a relationship bet. variables that’s unlikely to be due to sampling error alone (i.e. unlikely if the null hyp is true)

– also direction (+, – or zero: no linear relationship) & strength (correlation coefficient 0 to 1)

6.1.4 PERFECT POSITIVE RELATIONSHIPS

– a straight line on a scattergram when variable x is plotted against y

6.1.5 IMPERFECT POSITIVE RELATIONSHIPS

– not a straight line but there is a trend, a pattern going bottom left to top right

6.1.8 NON-LINEAR RELATIONSHIPS

– correlational analysis tests for a linear relationship, but some relationships aren’t linear

– e.g. a curvilinear (inverted-U shaped) relationship might not be statistically significant in a linear test, but there is a relationship

6.1.9 THE STRENGTH OR MAGNITUDE OF THE RELATIONSHIP

– correlation coefficient r is -1 to +1: measures strength of linear relationship bet. 2 variables

– most popular are Pearson’s r (parametric, aka product moment correlation) and Spearman’s rho (equiv non-parametric)

6.1.10 VARIANCE EXPLANATION OF THE CORRELATION COEFFICIENT

– r² tells us how much of the variability in one variable is shared with (accounted for by) the other variable

– so if r=-0.2, -0.2 x -0.2 = 0.04 = 4%; if r=-0.4, -0.4 x -0.4 = 0.16 = 16%*

– means 4% of variability of scores for one variable are accounted for by variability in the other (shared variance); but also that 96% accounted for by other factors (unique variance)

– so r = shared variance OVER unique variance ; if shared high & unique low, r will be high

– above example* shows that correlation of r=-0.4 isn’t twice as strong as r=-0.2: it’s 4 times
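
A minimal Python sketch of Pearson’s r and the shared variance r². The `pearson_r` helper uses the standard product-moment formula, and the two short score lists are hypothetical data made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

xs = [1, 2, 3, 4, 5]   # hypothetical scores on variable x
ys = [2, 4, 5, 4, 5]   # hypothetical scores on variable y
r = pearson_r(xs, ys)
print(round(r, 2), round(r ** 2, 2))  # r, then shared variance

# Doubling r quadruples the shared variance.
for value in (-0.2, -0.4):
    print(value, round(value ** 2, 2))
```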

6.1.11 STATISTICAL SIGNIFICANCE & PSYCHOLOGICAL IMPORTANCE (see also 5.4)

– use both r and probability: statistical significance <> psychological significance

6.1.12 CONFIDENCE INTERVALS AROUND r

(a) change r to Fisher’s Zr using table; (b) 1/sq rt(n-3) x 1.96

(c) add/subtract (b) to/from Zr in (a) to get upper & lower; (d) change Zr to r using table
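
Steps (a) to (d) can be sketched in Python without a lookup table, since Fisher’s Zr is the inverse hyperbolic tangent of r (`atanh`) and `tanh` converts back. The values r = 0.5 and n = 28 are hypothetical inputs chosen for illustration.

```python
from math import atanh, sqrt, tanh

def r_confidence_interval(r, n, z_crit=1.96):
    """95% confidence limits around r via Fisher's Zr transformation."""
    zr = atanh(r)                    # (a) change r to Fisher's Zr
    margin = z_crit / sqrt(n - 3)    # (b) 1 / sq rt(n - 3) x 1.96
    lower_z, upper_z = zr - margin, zr + margin   # (c) subtract / add
    return tanh(lower_z), tanh(upper_z)           # (d) change Zr back to r

lower_r, upper_r = r_confidence_interval(0.5, 28)
print(round(lower_r, 2), round(upper_r, 2))
```

The limits are not symmetrical around r, which is the reason for working on the Zr scale.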

6.2 FIRST- AND SECOND-ORDER CORRELATIONS

first-order: 1 variable of e.g. 4 is partialled out/ held constant (e.g. age, weight, height)

second-order: 2 variables of e.g. 4 partialled out (zero-order = no variables partialled out)

– done by e.g. corr age vs height, then age vs weight then remove from height vs weight corr

– if there’s diff bet. weight vs height r when age is partialled out, means some corr due to age
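
A sketch of a first-order partial correlation in Python. The formula is the standard one for partialling a third variable out of a correlation; it is not given in the notes above, and the three zero-order correlations are hypothetical values.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out (first-order)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical zero-order correlations.
r_height_weight = 0.8
r_height_age = 0.7
r_weight_age = 0.6

r_partial = partial_r(r_height_weight, r_height_age, r_weight_age)
print(round(r_partial, 2))
```

With these inputs the height vs weight correlation drops once age is held constant, which suggests part of it was due to age.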

CHAPTER 7 – ANALYSES OF DIFFERENCES BET. 2 CONDITIONS: THE T-TEST

– a parametric test so data should be from pop. with a normal distribution

7.1.5 MEASURE OF EFFECT

– differences bet. means of 2 groups can be shown as standard deviations from each other

– convert them into z-scores: standardised scores

– this measure of effect is called d = (mean 1 – mean 2) / mean of the SDs

– e.g. groups: (A) mean 50, SD 10 (B) mean 70, SD 5; d = -20/7.5 = -2.67 i.e. 2.67 SDs from each other
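
The effect size d for the two groups above, as a short Python sketch. It follows the definition in these notes (difference between means divided by the mean of the two SDs); `effect_size_d` is an illustrative name.

```python
def effect_size_d(mean_1, sd_1, mean_2, sd_2):
    """Difference between two means expressed in SD units."""
    mean_sd = (sd_1 + sd_2) / 2
    return (mean_1 - mean_2) / mean_sd

d = effect_size_d(50, 10, 70, 5)   # group A vs group B from the example
print(round(d, 2))
```

The sign only reflects which group is subtracted from which; the size of the effect is the absolute value.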

7.1.6 SIZE OF THE EFFECT

– z-scores are standardised, so that mean is 0 & SD is 1; if diff. bet. means =0.1= tenth of SD

– d usually ranges from 0 to about 3; if d is small (0.2) the effect of the IV is small & the two groups’ score distributions overlap a lot

7.1.7 INFERENTIAL STATISTICS: THE T-TEST

– assesses if there’s a statistically significant difference bet. the means of 2 conditions

– independent t-test for independent/between participants; related/paired t for related/within

– a measure of (a)between groups variance vs (b)within group variance; if a > b then t larger
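
A sketch of an independent t-test statistic in Python, assuming the usual pooled-variance formula (which the notes above do not spell out) and two small hypothetical groups. It returns only t; looking up the p-value is left out.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """t statistic for two independent groups using a pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    # Within-group variance, pooled across both groups (n - 1 versions).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    # Between-group difference relative to within-group variability.
    return (mean(group_a) - mean(group_b)) / standard_error

group_a = [1, 2, 3, 4, 5]   # hypothetical scores, condition A
group_b = [3, 4, 5, 6, 7]   # hypothetical scores, condition B
t = independent_t(group_a, group_b)
df = len(group_a) + len(group_b) - 2
print(round(t, 2), df)
```

A bigger difference between the means, or less spread within the groups, gives a larger absolute t.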