Subject: Statistics I

Unit 4: Association Between Variables

Week 4 syllabus map

  1. L4.1: Association between two variables - Review of course
  2. L4.2: Association between two categorical variables - Introduction
  3. AQ4.2: Activity Question 2
  4. L4.3: Association between two categorical variables - Relative frequencies
  5. AQ4.3: Activity Question 3
  6. L4.4: Association between two numerical variables - Scatterplot
  7. AQ4.4: Activity Question 4
  8. L4.5: Association between two numerical variables - Describing association
  9. L4.6: Association between two numerical variables - Covariance
  10. AQ4.6: Activity Question 6
  11. L4.7: Association between two numerical variables - Correlation
  12. AQ4.7: Activity Question 7
  13. L4.8: Association between two numerical variables - Fitting a line
  14. AQ4.8: Activity Question 8
  15. L4.9: Association between categorical and numerical variables

How to use this week

  • Separate the questions into three families:
    • categorical with categorical
    • numerical with numerical
    • categorical with numerical
  • For any association question, ask:
    1. what are the two variables?
    2. what type is each variable?
    3. what display is appropriate?
    4. are we describing direction, strength, or comparison across groups?
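
The first three questions above can be sketched as a small lookup. This is only an illustrative sketch: the function name `association_family` is hypothetical, and the display suggested for the mixed case (summaries of the numerical variable within each category) is an assumption, since the notes here do not name a specific display for it.

```python
def association_family(type_a, type_b):
    """Return (family, suggested display) for two variable types.

    Each type must be the string "categorical" or "numerical".
    """
    allowed = {"categorical", "numerical"}
    if type_a not in allowed or type_b not in allowed:
        raise ValueError("each type must be 'categorical' or 'numerical'")

    kinds = {type_a, type_b}
    if kinds == {"categorical"}:
        return ("categorical with categorical",
                "two-way table with relative frequencies")
    if kinds == {"numerical"}:
        return ("numerical with numerical", "scatterplot")
    # Mixed case: the display named here is an assumption, not from the notes.
    return ("categorical with numerical",
            "summaries of the numerical variable within each category")


# Hypothetical question: does exam score (numerical) differ by section (categorical)?
family, display = association_family("categorical", "numerical")
print(family)
print(display)
```

The order of the two arguments does not matter, because the family depends only on which types are present.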

Week 4 exam traps

  • Association does not automatically imply causation.
  • Correlation measures linear association, not every possible relationship.
  • Covariance and correlation always have the same sign, but covariance depends on the units of the variables, while correlation is unit-free and lies between -1 and 1.
  • Little or no linear pattern in a scatterplot does not rule out a relationship: the points may follow a curve.
  • Relative frequencies are often more useful than raw counts when group totals differ.
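
Several of the traps above can be checked numerically. The sketch below uses made-up data throughout (the heights, weights, and group counts are hypothetical), and it assumes the sample covariance with the n - 1 divisor; a course that divides by n would get different covariance values but the same correlation.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance, using the n - 1 divisor (an assumed convention)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Trap: covariance changes with the units, correlation does not.
# Hypothetical heights (metres, then centimetres) and weights (kilograms).
heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]
heights_cm = [100 * h for h in heights_m]
weights_kg = [50.0, 58.0, 63.0, 72.0, 80.0]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)  # about 100 times cov_m
r_m = correlation(heights_m, weights_kg)
r_cm = correlation(heights_cm, weights_kg)   # same as r_m up to rounding

# Trap: an exact curved relationship (y = x^2) can have zero correlation.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
r_curved = correlation(xs, ys)

# Trap: raw counts mislead when group totals differ.
# Hypothetical two-way table of group by answer.
counts = {
    "Group A": {"yes": 40, "no": 160},
    "Group B": {"yes": 30, "no": 30},
}
yes_share = {
    group: row["yes"] / sum(row.values()) for group, row in counts.items()
}

print(f"covariance: {cov_m:.2f} (metres) vs {cov_cm:.2f} (centimetres)")
print(f"correlation: {r_m:.4f} (metres) vs {r_cm:.4f} (centimetres)")
print(f"correlation for y = x^2: {r_curved:.4f}")
print(f"share answering yes: {yes_share}")
```

In the two-way table, Group A has more "yes" answers in raw counts, yet a smaller share of Group A answers "yes", because Group A is much larger.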

Final revision checklist

  • interpret association in words without overclaiming cause
  • read two-way tables using counts and relative frequencies
  • describe scatterplots by direction, form, and strength
  • distinguish covariance from correlation
  • explain what a fitted line is trying to do
  • compare a numerical variable across categories sensibly
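
For the fitted-line item in the checklist, a small worked example may help. It assumes the fitted line is the usual least-squares line, which chooses the intercept and slope that make the sum of squared vertical distances (residuals) as small as possible; the notes here do not state the fitting criterion, so treat that as an assumption. The data and the function names `fit_line` and `sum_squared_residuals` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Total squared vertical distance between the points and a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, intercept, slope)

# Any other line, such as this hand-picked one, should do no better.
other = sum_squared_residuals(xs, ys, 2.5, 0.5)

print(f"fitted line: y = {intercept:.2f} + {slope:.2f} x")
print(f"sum of squared residuals: fitted {best:.2f}, other line {other:.2f}")
```

The slope here equals the covariance of x and y divided by the variance of x, so its sign always matches the sign of the covariance and of the correlation.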