-
Data
-
Defn.
- A series of observations, measurements, or facts that can be analysed
-
Variable
- Has a possible range of values
-
Analysis
- Gathering, modelling and transforming data with the goal of highlighting useful information, suggesting conclusions and supporting decision making.
-
Types
-
Nominal
-
Categories
- No relationships
- Least powerful
-
Ordinal
- Rank
- Has a relationship (1st, 2nd, etc.)
- Non-mathematical relationship
-
Interval
-
No real 'ZERO'
- e.g. temperature
- Has a mathematical relationship
-
Ratio
- Has a true 'ZERO'
- e.g. distance, height
- Most powerful
-
Research Methods
-
Which method?
- Depends on the question
-
Quantitative
- Experiments
- Surveys
- RCTs
-
Numbers
- Tells you what happened
-
Qualitative
- Focus groups
- Interviews
- Case studies
- Tells you why it happened
-
Validity
-
Internal
-
To do with the study design
- Is it sound?
- Are we measuring the right thing?
- e.g. using height as a measure of intelligence would be invalid
-
External
-
Can it be applied outside?
- Can the results be generalised?
-
Replicability
- Can it be done again?
-
Reliability
- If the experiment is done again, will the same results be obtained?
- Easier for lab based work
- Objective
- Unbiased
-
Variables
-
Must be operationalised
- i.e. explicitly stated so they can be measured
-
Constructs
- Defined by theoretical definitions
-
Variables
-
Quasi-independent
-
Characteristics that can't be randomly assigned
- e.g. sex, age
-
True experimental variables
- Can control these in a true experiment
-
Can be randomly assigned
- e.g. give Drug A or Drug B
-
Independent variables
-
The ones we control
- To bring about change in DV
-
Levels
-
At least TWO
- e.g. gender: male/female
- AKA 'condition'
-
Dependent variables
- The ones we measure
- The ones that depend on the IV
-
Research design
-
Experimental design
-
True experiment design
- A design where researchers can randomly assign participants to experimental conditions
- e.g. randomly assign normal participants to consume different amounts of alcohol
- Randomised
-
Quasi experimental design
- A design where the researcher can't randomly assign participants to groups
- e.g. compare heavy vs light drinkers: you are already in one group or the other
-
Randomisation
-
Reduces confounding variables
- i.e. cases where the groups to be compared differ in ways other than what the researcher has manipulated
- As these differences are distributed equally among the groups
- Prevents intentional and unintentional bias
- Ensures each participant is equally likely to be assigned to either group
- Enables use of powerful statistics (a minimal sketch of random assignment follows below)
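A minimal sketch of random assignment, assuming two hypothetical conditions (Drug A vs Drug B) and made-up participant labels; shuffling the list makes every assignment equally likely:

```python
# Random assignment to two conditions using only the standard library.
# Participant labels and group names are made up for illustration.
import random

participants = [f"P{i}" for i in range(1, 21)]  # 20 hypothetical participants
random.shuffle(participants)        # every ordering is equally likely
group_a = participants[:10]         # e.g. receives Drug A
group_b = participants[10:]         # e.g. receives Drug B
print("Drug A:", group_a)
print("Drug B:", group_b)
```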
-
Subjects design
-
Independent groups design
- Comparing BETWEEN groups
-
Potential problems
- Confounding factors
- Solutions
- Randomisation
- Matched groups
-
Matched groups
- Make sure subjects in both groups are matched as closely as possible on potential confounding factors
-
Repeated measures design
- Testing WITHIN groups
-
Advantages
- Fewer participants needed
- Each participant is their own control
- Removes some confounding factors
-
Disadvantages
- Order effects
- Can't return a participant to their original state
- Practice effect
- Better performance due to practice
- Fatigue effect
- SOLUTION
- Counterbalancing
- Randomly assigning the order of conditions across participants
- Therefore we can tell whether the order has made any difference
-
Causation
- How correct is our claim of A being the cause of B?
-
SOLUTION
-
Have a comparison group
- e.g. treatment vs placebo
-
Could do O-X-O (observe, treat, observe again)
- e.g. test, give alcohol, test again
- Therefore we know whether the alcohol was the cause
-
Forms of validity
-
Face
- Does it measure what it says it does?
-
Criterion
-
Concurrent
- Comparison of new test with established test
-
Predictive
-
Does the test have predictive value?
- e.g. does a blood pressure reading now predict a heart attack in 5 years?
- Do the measured results agree with other measures of the same thing?
-
Construct
- How well does the design tap into the underlying construct
-
Ecological
-
Does the study reflect naturally occurring behaviour?
- e.g. does a mouse in a box behave as it would in the wild?
-
Population
- Is our sample adequate for the claims we make about the population?
- What population are we interested in?
-
Sampling
- A sample is a selection or subset of individuals from the population
-
Why sample?
- Time
- Money
-
Sufficiency
- We may not need that much data if the sample already gives an accurate picture
- Access
-
How
-
Random sample
- No pattern; every member of the population is equally likely to be selected
-
Systematic
- Drawn from the population at fixed intervals
-
Stratified
- Specified groups appear in numbers proportional to their size in population
-
Opportunity/Convenience
- People who are easily available
- Leads to bias
-
Snowball
- Get current participants to recruit more for the research
-
Useful if you want to recruit a very specific population
- e.g. drug users might know other drug users
-
Descriptive stats
- To describe a distribution we need to select the appropriate measures of central tendency and spread
-
Central tendency
-
Mean
- Average
- Less useful if there is a big outlier
- Best for continuous, symmetrical data
-
Median
- Rank then find the middle value
- Best for ordinal data or interval/ratio data that is highly skewed
-
Mode
- Most common
- Misleading if its frequency is only just higher than that of other values
- Best for nominal data
-
For skewed data
- In positively skewed data, the mean is higher than the median, and the median is higher than the mode
- In negatively skewed data, the mean is lower than the median, and the median is lower than the mode
- (a small demonstration of how an outlier pulls the mean follows below)
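As a quick demonstration of the outlier point above, a minimal sketch using Python's standard-library statistics module on made-up scores; the single large value drags the mean well above the median:

```python
# How one outlier affects mean, median, and mode (made-up data).
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 30]        # 30 is a big outlier

print(statistics.mean(scores))    # 7.125 -> dragged up by the outlier
print(statistics.median(scores))  # 4.5   -> barely affected by it
print(statistics.mode(scores))    # 5     -> the most common value
```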
-
Spread of data
-
Range
- Max - Min
-
Variance/Standard deviation
- Measures of the typical deviation from the mean (variance = mean squared deviation; SD = its square root)
-
SD
- The variability across individuals expected by chance
-
Interquartile range
- 3rd quartile (75th percentile) - 1st quartile (25th percentile)
- Useful when the median is used as the measure
-
Value
- Large
- = data spread out
- Small
- = data squashed together
- More stable than the range, as extreme values aren't included (see the sketch below)
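A minimal sketch of the spread measures above, using Python's standard-library statistics module on made-up data; note how the range reacts to the extreme value while the IQR does not:

```python
# Range, sample variance/SD, and interquartile range (made-up data).
import statistics

data = [4, 7, 8, 10, 12, 13, 15, 18, 21, 40]  # 40 is an extreme value

print(max(data) - min(data))      # range: blown up by the 40
print(statistics.variance(data))  # sample variance
print(statistics.stdev(data))     # SD = square root of the variance

q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
print(q3 - q1)                    # IQR: ignores the extremes
```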
-
Cumulative frequencies
- Each score, and the number of people who attained that score or below
- e.g. if scores run 1-5, then we can say 7 people got 4 or less
-
Percentiles
- Scores are split into percentiles
- A method of expressing a person's score relative to those of others
- Therefore if your score is in the 90th percentile you have done better than 90% of people
- 50th percentile = median (see the sketch below)
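A minimal sketch of cumulative frequencies and a percentile rank, on made-up scores from 1 to 5; the percentile-rank line simply counts the share of scores below a given score:

```python
# Cumulative frequencies and a percentile rank (made-up scores, 1-5).
from collections import Counter

scores = [1, 2, 2, 3, 3, 4, 4, 4, 5, 5]
counts = Counter(scores)

running_total = 0
for value in sorted(counts):
    running_total += counts[value]
    print(f"{running_total} people scored {value} or less")

my_score = 4
rank = 100 * sum(s < my_score for s in scores) / len(scores)
print(f"A score of {my_score} sits at the {rank:.0f}th percentile")
```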
-
Shapes of distribution
-
Normal
- Ideal
- Symmetrical
- Mean, mode, and median are the same
-
Skewed
- Can be positive or negative (see the skewed-data notes above)
-
Kurtosis
- Steepness/flatness
- Steep
- Leptokurtic
- Flat
- Platykurtic
- +ve value = steep
- -ve value = flat
- 0 = middling (see the sketch below)
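A minimal sketch of measuring skewness and kurtosis with scipy (assuming scipy and numpy are available); note that scipy reports excess kurtosis, so 0 means 'middling', like a normal distribution:

```python
# Skewness and excess kurtosis of simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(size=1000)     # roughly normal, made-up data

print(stats.skew(sample))          # ~0 for symmetric data
print(stats.kurtosis(sample))      # ~0: positive = steep, negative = flat
```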
-
Z-score
- Converts a raw score into a number that shows how many standard deviations away it lies from the mean
- Allows us to see how different an individual is from the group.
-
To calculate it we need the true (population) mean + SD (a minimal worked sketch follows below)
- If these are unavailable, we estimate them from a sample and use t-tests instead
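A minimal worked sketch of the z-score, assuming a made-up scale where the population mean is 100 and the SD is 15:

```python
# Z-score: how many SDs a raw score lies from the mean (made-up values).
mean, sd = 100, 15
raw_score = 130

z = (raw_score - mean) / sd
print(z)  # 2.0 -> two standard deviations above the mean
```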
-
Hypothesis testing + significance
-
Probability
- The number of times the event of interest could happen, divided by the total number of possible events
-
If mutually exclusive
- Addition rule: P(A or B) = P(A) + P(B)
- The probabilities of all possible outcomes sum to 1
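- Worked example: for a fair six-sided die, P(1 or 2) = 1/6 + 1/6 = 1/3, and the six outcomes together sum to 6 x 1/6 = 1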
-
P values
- The probability of obtaining a result at least this extreme if the null hypothesis were true
-
P<0.05
- Significant result
- Reject null hypothesis
-
Types of error
-
Type 1
- Incorrectly rejecting null hypothesis
-
Type 2
- Incorrectly accepting the null hypothesis
-
Types of hypothesis
-
One tailed
- Difference in one direction
- e.g. eating sprouts increases your IQ
-
Critical value
- Z = 1.65
- If the significance level is 0.05
- As getting a score beyond this has a 5% chance
-
Two tailed
- Difference can be in either direction
- e.g. eating sprouts alters IQ
-
Critical value
- +/- 1.96
- If the significance level is 0.05
- As getting a score outside this range has a 5% chance (2.5% in each tail; see the sketch below)
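A minimal sketch of where the 1.65 and 1.96 cut-offs come from, assuming scipy is available; norm.ppf gives the z-value below which a given proportion of the distribution lies:

```python
# One- and two-tailed critical z-values at the 0.05 level.
from scipy import stats

alpha = 0.05
print(stats.norm.ppf(1 - alpha))      # ~1.645: one-tailed cut-off
print(stats.norm.ppf(1 - alpha / 2))  # ~1.960: two-tailed cut-off
```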
-
Inferential stats
-
Parametric tests
-
Assumptions
- 1. Data normally distributed
- 2. Variance between groups is the same
- Therefore we can only use data that conform to the above
-
T-test
- When the population SD + mean are unknown
- Take a sample of measurements and estimate the mean + variability
-
# of measurements is important
- Degrees of freedom
- = # of measurements - # of parameters estimated
- The higher the #, the more reliable the estimate
- and the closer the t-dist gets to the z-dist
- Higher df also decreases the critical value: the fewer measurements you take, the larger the t-value has to be to reach significance
-
Independent measures
- How to calculate
- t = difference of means / difference expected by chance
- = (Mean1 - Mean2) / (standard error of the difference between the means)
-
Repeated measures
- t = mean change / change expected by chance
- = mean difference / standard error of the difference
- Shows us how different the means of the TWO groups are (a scipy sketch of both t-tests follows after the standard-error notes below)
-
Standard error
- A measure of how close the sample mean is likely to be to the true mean
- Depends on the SD of the original distribution
-
# of samples (n)
- The more samples you take, the lower the standard error (SE = SD/√n)
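A minimal sketch of both t-tests with scipy; the score lists are made up for illustration:

```python
# Independent-measures and repeated-measures t-tests (made-up data).
from scipy import stats

group_a = [12, 15, 14, 10, 13, 16]        # e.g. Drug A
group_b = [9, 11, 10, 8, 12, 10]          # e.g. Drug B
print(stats.ttest_ind(group_a, group_b))  # independent groups

before = [20, 22, 19, 24, 21]
after = [23, 25, 20, 27, 24]
print(stats.ttest_rel(before, after))     # repeated measures (paired)
```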
-
ANOVA
- When more than two groups need to be compared
- Analysis of variance OF MEANS
-
A between-subjects design produces variance between groups that is partly due to individual differences, so the difference due to the factor being investigated has to be very large for the test to detect it.
- It is therefore better to use a WITHIN-subjects design, as this reduces the individual variability, so any difference due to the factor is more easily detected by the test
- This makes the test more powerful
-
F = variability due to factor / variability due to error
- = (mean square between groups) / (mean square within groups)
-
Significance of the F-value depends on two types of degrees of freedom
- 1. k - 1, where k is the # of groups
- 2. N - k, where N is the overall # of measurements (see the sketch below)
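A minimal one-way ANOVA sketch with scipy on three made-up groups; with k = 3 groups and N = 15 measurements, the degrees of freedom are 2 and 12:

```python
# One-way ANOVA comparing three groups (made-up data).
from scipy import stats

g1 = [4, 5, 6, 5, 4]
g2 = [6, 7, 8, 7, 6]
g3 = [9, 9, 10, 8, 9]

f_value, p_value = stats.f_oneway(g1, g2, g3)
print(f_value, p_value)  # df = k-1 = 2 and N-k = 12 here
```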
-
Non-parametric tests
-
Sign test
- Weak
- REPEATED MEASURES ONLY
- 2 conditions
- At least 6 pairs
- Only shows direction not size of difference
- Only for DICHOTOMOUS DATA
-
Method
- Disregard scores that stay the same
- Count how many scores go up and how many go down
- The lower count is the calculated statistic
- If this is equal to or smaller than the critical value, the result is significant (see the sketch below)
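A hand-rolled sketch of the sign test on made-up before/after scores; the p-value comes from scipy's binomial test (binomtest, scipy 1.7+), since under the null each non-tied pair is equally likely to go up or down:

```python
# Sign test: count ups and downs, ignore ties (made-up data).
from scipy import stats

before = [10, 12, 9, 14, 11, 13, 10, 12]
after = [12, 13, 9, 16, 12, 15, 9, 14]

diffs = [a - b for a, b in zip(after, before) if a != b]  # drop ties
ups = sum(d > 0 for d in diffs)
downs = sum(d < 0 for d in diffs)
print(min(ups, downs))  # the calculated statistic

# Two-tailed p-value: chance of a split at least this uneven if p = 0.5
print(stats.binomtest(min(ups, downs), n=len(diffs), p=0.5).pvalue)
```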
-
Wilcoxon test
- REPEATED MEASURES
-
Method
- For each person, take the smaller score away from the larger one
- Discard zero differences (scores that remain the same)
- Rank the differences from smallest to largest
- Add up the ranks for people who did best in condition A and condition B separately
- The smaller value is 'T', the calculated statistic
- T must be equal to or LOWER than the critical value for the conditions to be significantly different
-
Takes into account the SIZE and the DIRECTION of the difference
- Therefore gives more info than the sign test (see the sketch below)
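A minimal Wilcoxon signed-rank sketch with scipy on made-up paired scores; scipy does the differencing, zero-dropping, and ranking internally:

```python
# Wilcoxon signed-rank test on paired conditions (made-up data).
from scipy import stats

condition_a = [10, 12, 9, 14, 11, 13]
condition_b = [11, 14, 12, 18, 16, 19]

result = stats.wilcoxon(condition_a, condition_b)
print(result.statistic, result.pvalue)  # statistic = the smaller rank sum, T
```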
-
Mann-Whitney U test
- aka MAN-U
- For INDEPENDENT GROUPS
-
Method
- Rank all the data as if it were one group
- Add up the ranks for the smaller group (either group if they are the same size); this sum is R
- N1 = # of cases in the smaller group
- N2 = # of cases in the larger group
- U1 & U2 are calculated from R, N1, and N2
- Whichever is smaller is the calculated statistic
- U must be EQUAL TO OR LOWER than the critical value for significance (see the sketch below)
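A minimal Mann-Whitney U sketch with scipy on two made-up independent groups:

```python
# Mann-Whitney U test for independent groups (made-up data).
from scipy import stats

heavy = [4, 6, 5, 7, 8, 6]     # e.g. heavy drinkers' scores
light = [9, 8, 10, 11, 9]      # e.g. light drinkers' scores

result = stats.mannwhitneyu(heavy, light, alternative="two-sided")
print(result.statistic, result.pvalue)
```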
-
Kruskal-Wallis test
- For INDEPENDENT GROUPS
- Ranking test
- Ranks all the data together, across the groups
- Take the difference between the mean rank of each group and the total mean rank
- Square it & sum them up
-
And that is used to calculate the statistic
- The larger the value, the more likely it is that the conditions are significantly different
-
Can tell you that at least 2 groups are significantly different
- NOT WHICH TWO
- For that we need to plot the data out (see the sketch below)
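A minimal Kruskal-Wallis sketch with scipy on three made-up independent groups; a significant result only says that at least two groups differ, not which two:

```python
# Kruskal-Wallis test across three independent groups (made-up data).
from scipy import stats

g1 = [4, 5, 6, 5]
g2 = [7, 8, 6, 9]
g3 = [10, 9, 11, 12]

h_value, p_value = stats.kruskal(g1, g2, g3)
print(h_value, p_value)
```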
-
Friedman's test
- For REPEATED MEASURES
-
Method
- Rank each individual's scores across the conditions
- Total up the ranks for each condition
- The computer measures the dispersion of the rank sums
- i.e. looks at how different the total ranks are from each other
- The statistic (S) has to be LARGER than the critical value to be significant (see the sketch below)
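A minimal Friedman-test sketch with scipy, assuming three made-up repeated-measures conditions where each list holds the same participants' scores:

```python
# Friedman test for three repeated-measures conditions (made-up data).
from scipy import stats

cond_a = [10, 12, 9, 14, 11]
cond_b = [12, 13, 10, 16, 12]
cond_c = [14, 15, 12, 17, 15]

stat, p_value = stats.friedmanchisquare(cond_a, cond_b, cond_c)
print(stat, p_value)
```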