  1. MEASURES OF VARIABILITY
    1. Introduction
      1. Degree of spread or variation of the variable about a central value
        1. To know how widely the observations are spread on either side of the average
    2. Range
      1. Simplest method
      2. Difference b/w the value of the largest item & the smallest item
    3. Mean Deviation
      1. Avg of the absolute deviations from the arithmetic mean
    4. Standard Deviation
      1. Most common and most appropriate measure of dispersion
      2. Root-mean-square (RMS) deviation of the values from their mean
        1. the square root of the variance
      3. Interpretation of SD
        1. Large SD: Data points are far from the mean
        2. Small SD: Data points are clustered closely around the mean
      4. Uses of SD in biostatistics
        1. Summarizes the deviation of a large distribution from the mean
        2. Indicates whether the deviation of an individual from the mean is due to chance
        3. Helps in finding the standard error
        4. Helps in finding a suitable sample size for valid conclusions
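
The range, mean deviation and SD described above can be computed side by side. This is a minimal sketch, assuming the population form of the SD (dividing by n, which matches the root-mean-square definition); a sample SD would usually divide by n − 1. The function name `dispersion_summary` and the readings are hypothetical.

```python
import math

def dispersion_summary(values):
    """Return range, mean deviation, variance and SD (population form)."""
    n = len(values)
    mean = sum(values) / n
    # Range: largest item minus smallest item
    value_range = max(values) - min(values)
    # Mean deviation: average of absolute deviations from the mean
    mean_deviation = sum(abs(x - mean) for x in values) / n
    # Variance: mean of squared deviations; SD is its square root
    variance = sum((x - mean) ** 2 for x in values) / n
    sd = math.sqrt(variance)
    return {
        "range": value_range,
        "mean_deviation": mean_deviation,
        "variance": variance,
        "sd": sd,
    }

readings = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
summary = dispersion_summary(readings)
print(summary)
```

For these readings the mean is 5, so the range is 7, the mean deviation is 1.5 and the SD is 2.
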
    5. Coefficient of Variation (COV)
      1. Used to compare relative variability
      2. Unit-free measure to compare dispersion of one variable with another
      3. COV = SD/Mean × 100
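
Because COV is unit-free, it allows variables measured in different units to be compared. The sketch below applies COV = SD/Mean × 100 to two hypothetical variables; the data and the function name are assumptions for illustration, and the population SD (`statistics.pstdev`) is used.

```python
import statistics

def coefficient_of_variation(values):
    """COV = SD / Mean x 100, expressed as a percentage."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD (assumption)
    return sd / mean * 100

heights_cm = [150, 160, 170, 180, 190]  # hypothetical heights
weights_kg = [50, 60, 70, 80, 90]       # hypothetical weights

cov_height = coefficient_of_variation(heights_cm)
cov_weight = coefficient_of_variation(weights_kg)
print(f"COV height: {cov_height:.1f}%")
print(f"COV weight: {cov_weight:.1f}%")
```

Both variables have the same SD here, but weight has the larger COV because its mean is smaller, so it is relatively more variable.
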
    6. Standard Error of Mean (SE mean)
      1. Measure of variability of sample summaries; SE mean is the SD of the sample means
      2. measure of chance variation
      3. It does not mean an error
      4. SE mean = SD/√Sample size
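
The formula SE mean = SD/√Sample size can be sketched as follows. The sample values are hypothetical, and the sample SD (n − 1 denominator, `statistics.stdev`) is assumed because the SD is estimated from a sample.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """SE mean = SD / sqrt(sample size)."""
    sd = statistics.stdev(sample)  # sample SD, n - 1 denominator (assumption)
    return sd / math.sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
se_mean = standard_error_of_mean(sample)
print(f"SE of mean: {se_mean:.3f}")
```

Since the sample size is in the denominator, a larger sample gives a smaller SE for the same SD.
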
    7. Z Score (Standard Score)
      1. aka Normal deviate
      2. Difference of a value from the group mean, expressed as a multiple of the SD
        1. Z score = (Individual level − Mean)/Standard deviation
      3. Indicates how many SDs an observation is above or below the mean
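
A short sketch of the Z score formula, using hypothetical blood-pressure figures (group mean 120, SD 10) chosen only for illustration:

```python
def z_score(value, mean, sd):
    """Z score = (individual level - mean) / standard deviation."""
    return (value - mean) / sd

# Hypothetical group: mean systolic BP 120 mmHg, SD 10 mmHg
print(z_score(130, 120, 10))  # one SD above the mean
print(z_score(105, 120, 10))  # one and a half SDs below the mean
```

A positive Z score lies above the mean and a negative one below it.
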
  2. DISTRIBUTIONS
    1. Poisson Distribution
      1. Discrete probability distribution
      2. Gives the probability of a given no. of events occurring in a fixed period of time
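
The notes do not give the formula, but the standard Poisson probability is P(X = k) = λ^k e^(−λ) / k!, where λ is the average number of events per period. The sketch below assumes a hypothetical rate of 3 events per period.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean rate is lam per period."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)

rate = 3  # hypothetical average number of events per period
for k in range(6):
    print(f"P(X = {k}) = {poisson_pmf(k, rate):.4f}")
```

Being a discrete distribution, the probabilities over k = 0, 1, 2, ... add up to 1.
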
    2. Normal Distribution
      1. aka Gaussian distribution (Standard normal distribution when Mean = 0 and SD = 1)
      2. Distribution of values of a quantitative variable such that they are symmetric with respect to a middle value & then the frequencies taper off rapidly and symmetrically on both sides
      3. Bell shaped distribution
      4. Mean= Median = Mode
      5. Bilaterally symmetrical curve
      6. Standard normal distribution: Mean = 0, SD = 1 (SD = √Variance)
        1. In the standard normal distribution, SD = 1, thus Variance = 1
      7. Mean ± 1SD (μ ± 1σ) covers about 68% of values
      8. Mean ± 2SD (μ ± 2σ) covers about 95% of values
      9. Mean ± 3SD (μ ± 3σ) covers about 99.7% of values
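
The coverage figures listed above can be checked numerically. For a normal distribution, the proportion of values within k SDs of the mean equals erf(k/√2); the helper name `coverage_within` is hypothetical.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Mean ± {k}SD covers {coverage_within(k) * 100:.1f}% of values")
```

The computed proportions are roughly 68.3%, 95.4% and 99.7%, which the notes quote in rounded form.
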
    3. Skewness of Central Tendency
      1. Measure of asymmetry of a probability distribution of a random variable
      2. Measures of skewness
        1. Pearson’s mode or 1st Skewness coefficient = (Mean – Mode)/SD
        2. Pearson’s median or 2nd Skewness coefficient = 3(Mean – Median)/SD
        3. Quartile skewness = (Q3 – 2Q2 + Q1)/(Q3 – Q1)
      3. Asymmetrical distributions
        1. Right (positive) skew
          1. Right skewed curve
          2. Mean > Median > Mode
        2. Left (negative) skew
          1. Left skewed curve
          2. Mean < Median < Mode
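
Two of the skewness measures above can be sketched as follows: Pearson's median coefficient, and the quartile skewness with (Q3 – Q1) as the denominator. The data are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which is one of several quartile conventions.

```python
import statistics

def pearson_median_skewness(values):
    """Pearson's 2nd skewness coefficient = 3(Mean - Median) / SD."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.pstdev(values)  # population SD (assumption)
    return 3 * (mean - median) / sd

def quartile_skewness(values):
    """Quartile skewness = (Q3 - 2*Q2 + Q1) / (Q3 - Q1)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return (q3 - 2 * q2 + q1) / (q3 - q1)

right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 18]  # hypothetical, long right tail
symmetric = [1, 2, 3, 4, 5]                     # hypothetical, symmetric

print(pearson_median_skewness(right_skewed), quartile_skewness(right_skewed))
print(pearson_median_skewness(symmetric), quartile_skewness(symmetric))
```

Both coefficients are positive for the right-skewed data (mean pulled above the median) and zero for the symmetric data.
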
  3. SAMPLING
    1. Sample size depends upon
      1. The effect size (usually the difference between 2 groups)
      2. The population SD (for continuous data)
      3. The desired power of the experiment to detect the postulated effect
        1. Power = 1 – beta
      4. The significance level (alpha)
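
To show how these four inputs combine, the sketch below uses one common normal-approximation formula for comparing two means with equal group sizes and a two-sided test: n per group = 2·SD²·(z₁₋α/₂ + z₁₋β)² / (effect size)². The notes do not state a formula, so this choice is an assumption, and the function name and input values are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # from the significance level
    z_beta = NormalDist().inv_cdf(power)           # from power = 1 - beta
    n = 2 * (sd ** 2) * (z_alpha + z_beta) ** 2 / effect_size ** 2
    return math.ceil(n)  # round up to whole subjects

# Hypothetical inputs: detect a difference of 5 units when SD is 10
print(sample_size_per_group(effect_size=5, sd=10))
print(sample_size_per_group(effect_size=10, sd=10))
```

A smaller effect size, a larger SD, a higher power or a stricter alpha each increase the required sample size.
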
    2. Types of Sampling
      1. Random Sampling
        1. aka Probability sampling, Non-purposive sampling
      2. Non-random sampling
        1. aka Non-probability sampling, Purposive sampling
  4. Types of Random Sampling
    1. Simple Random Sampling
      1. Every unit of population has equal and known chance of being selected
      2. aka Unrestricted random sampling
      3. Applicable for small, homogenous and readily available populations
      4. Used in clinical trials
      5. Methods
        1. Lottery method
        2. Random no. tables
        3. Computer software
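
As an illustration of the computer-software method, the sketch below draws a simple random sample with Python's `random` module, giving every unit an equal chance of selection. The population of 100 numbered units and the function name are hypothetical.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw units without replacement; each unit has an equal chance."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, sample_size)

population = list(range(1, 101))  # hypothetical units numbered 1 to 100
chosen = simple_random_sample(population, 10, seed=1)
print(chosen)
```
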
    2. Systematic Random Sampling
      1. Based on sampling fraction
        1. Every Kth unit is chosen in the population list, where K is the sampling interval
      2. Sampling interval (K) = Total no. of units in population / Total no. of units in sample
      3. Applicable for large, non-homogenous populations where complete list of individuals is available
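
Systematic sampling can be sketched as below, assuming the usual practice of picking a random starting point within the first K units and then taking every Kth unit. The population list and function name are hypothetical, and K is rounded down when the division is not exact.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start, then every Kth unit from the population list."""
    k = len(population) // sample_size  # sampling interval K
    rng = random.Random(seed)
    start = rng.randrange(k)            # random start among the first K units
    return population[start::k][:sample_size]

population = list(range(1, 101))  # hypothetical list of 100 units
chosen = systematic_sample(population, 10, seed=1)
print(chosen)
```

With 100 units and a sample of 10, K is 10, so successive selected units are 10 places apart in the list.
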
    3. Stratified Random Sampling
      1. Non-homogenous population is converted to homogenous groups/classes (strata); sample is drawn from each stratum at random, in proportion to its size
      2. Applicable for large non-homogenous population
      3. Gives more representative sample than simple random sampling
      4. None of the categories is under or over-represented
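
Proportional allocation across strata can be sketched as follows. The strata names and sizes are hypothetical, and the share for each stratum is simply rounded, so the total may differ slightly from the target when the proportions are not whole numbers.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Draw a random sample from each stratum in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Share of the sample proportional to the stratum's size
        share = round(total_sample_size * len(units) / population_size)
        sample[name] = rng.sample(units, share)
    return sample

# Hypothetical population of 1000 split into three strata
strata = {
    "urban": [f"U{i}" for i in range(600)],
    "rural": [f"R{i}" for i in range(300)],
    "tribal": [f"T{i}" for i in range(100)],
}
sample = stratified_sample(strata, 50, seed=1)
print({name: len(units) for name, units in sample.items()})
```

A sample of 50 is split 30, 15 and 5, mirroring the 60%, 30% and 10% make-up of the population.
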
    4. Multistage Random Sampling
      1. Is done in successive stages; each successive sampling unit is nested in the previous sampling unit
        1. Eg, In large country surveys, states are chosen, then districts, then villages, then every 10th person in village as final sampling unit
      2. Advantage: Introduces flexibility in sampling
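
The country-survey example above can be sketched as nested random selection. The sampling frame below is entirely hypothetical (3 states, 4 districts each, 5 villages each, 50 residents each), as are the function name and the numbers chosen at each stage.

```python
import random

def multistage_sample(country, n_states, n_districts, n_villages, seed=None):
    """Select states, then districts, then villages, then every 10th person."""
    rng = random.Random(seed)
    selected = []
    for state in rng.sample(sorted(country), n_states):
        districts = country[state]
        for district in rng.sample(sorted(districts), n_districts):
            villages = districts[district]
            for village in rng.sample(sorted(villages), n_villages):
                residents = villages[village]
                selected.extend(residents[9::10])  # every 10th person
    return selected

# Hypothetical nested sampling frame: state -> district -> village -> residents
country = {
    f"state{s}": {
        f"district{s}.{d}": {
            f"village{s}.{d}.{v}": [f"person{s}.{d}.{v}.{p}" for p in range(50)]
            for v in range(5)
        }
        for d in range(4)
    }
    for s in range(3)
}

people = multistage_sample(country, n_states=2, n_districts=2, n_villages=2, seed=1)
print(len(people))
```

Each stage samples only within the units chosen at the previous stage, which is what makes the sampling units nested.
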
    5. Multiphase Random Sampling
      1. Is done in successive phases; part of information is obtained from whole sample and part from the sub-sample
      2. Eg, In a TB survey, Mantoux test done in first phase, then X-ray done in all Mantoux positives, then sputum examined in all those with positive X-ray findings
    6. Cluster Random Sampling
      1. Applicable when units of population are natural groups or clusters
      2. Use in India: Evaluation of immunization coverage
      3. WHO technique used: 30 × 7 technique (total = 210 children)
        1. 30 clusters, each containing 7 children aged 12–23 months, assessed for complete primary immunization (up to Measles vaccine)
      4. Clusters are heterogeneous within themselves but homogenous with respect to each other
      5. Sampling interval is also calculated in CRS
      6. Accuracy: Low error rate of only ± 5%
      7. Limitation: Clusters cannot be compared with each other.
  5. Types of Non-Random Sampling
    1. Convenience Sampling
      1. Patients are selected, in part or in whole, at the convenience of the researcher
      2. Limited attempt to ensure that sample is an accurate representation of population
    2. Quota Sampling
      1. Population is first segmented into mutually exclusive sub-groups (quotas), then judgment is used to select the units from each group non-randomly
      2. Type of convenience sampling
    3. Snow-ball Sampling
      1. A technique for developing a research sample where existing study subjects recruit future subjects from among their acquaintances.
        1. Thus the sample group appears to grow like a rolling snowball
      2. Is often used in hidden populations which are difficult for researchers to access
        1. E.g. drug users or commercial sex workers
    4. Clinical Trial Sampling
  6. Measures of Variability
    1. Of individual observations
      1. Range
      2. Inter-quartile range
      3. Mean deviation
      4. Standard deviation
      5. Coefficient of variation
    2. Of samples
      1. Std error of mean
      2. Std error of difference b/w 2 means
      3. Std error of proportion
      4. Std error of difference b/w 2 proportions
      5. Std error of correlation coefficient