Learning outcomes
- understand the association between one categorical and one numerical variable
- compare group centers and spreads
- choose sensible summaries for grouped comparison
- interpret differences without forcing causal claims
What does this case look like?
- One variable is categorical.
- The other variable is numerical.
- Examples (categorical variable first, numerical variable second):
  - department and marks
  - gender and height
  - hostel/day-scholar status and attendance
How to study this association
- Compare the numerical variable across categories using:
  - mean
  - median
  - spread (for example, range or standard deviation)
  - grouped plots, if provided
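The grouped comparison can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the department names, the marks, and the helper `summarize_by_group` are all hypothetical, invented only to show the idea of computing each summary separately per category.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical (department, marks) records, invented for illustration.
records = [
    ("CSE", 72), ("CSE", 85), ("CSE", 90), ("CSE", 65),
    ("ECE", 58), ("ECE", 75), ("ECE", 62), ("ECE", 95),
]

def summarize_by_group(rows):
    """Group the numerical values by category and summarize each group."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)

    summary = {}
    for category, values in groups.items():
        summary[category] = {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            # Sample standard deviation needs at least two values.
            "stdev": stdev(values) if len(values) > 1 else None,
            "range": max(values) - min(values),
        }
    return summary

summary = summarize_by_group(records)
for department, stats in summary.items():
    print(department, stats)
```

Note that only the marks are averaged. The department names are used purely to decide which group each mark belongs to.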
Questions to ask
- Which group has the larger average?
- Which group has more spread?
- Are the group differences large or small relative to the spread within each group?
- Is the comparison affected by outliers?
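As a rough sketch, the first three questions can be answered numerically for two groups. The marks below are hypothetical, and dividing the gap between means by the larger standard deviation is only an informal yardstick assumed for this illustration, not a formal statistical test.

```python
from statistics import mean, stdev

# Hypothetical marks for two groups, invented for illustration.
group_a = [70, 74, 78, 82, 86]
group_b = [50, 60, 72, 84, 94]

mean_a, mean_b = mean(group_a), mean(group_b)
sd_a, sd_b = stdev(group_a), stdev(group_b)

# Which group has the larger average? (Ties are not handled in this sketch.)
higher_average = "A" if mean_a > mean_b else "B"

# Which group has more spread?
more_spread = "A" if sd_a > sd_b else "B"

# Informal yardstick (an assumption of this sketch): express the gap
# between the two means in units of the larger standard deviation.
gap = abs(mean_a - mean_b)
gap_in_sd_units = gap / max(sd_a, sd_b)

print(f"Higher average: Group {higher_average} ({mean_a} vs {mean_b})")
print(f"More spread: Group {more_spread} ({sd_a:.2f} vs {sd_b:.2f})")
print(f"Gap between means: {gap} marks, {gap_in_sd_units:.2f} standard deviations")
```

A gap that is small compared with the spread inside the groups means the two groups overlap heavily, so the difference in averages should be described cautiously.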
Example interpretation
- “Students in Group A have a higher average score than students in Group B.”
- “The spread of marks is larger in Group B.”
Exam hints and traps
- Do not summarize the categorical variable with a mean.
- The numerical variable is what gets averaged or compared.
- A difference between groups does not by itself prove causation.
- The median may be more useful than the mean when outliers are present.
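A small sketch shows why the median can be the safer summary when outliers are present. The marks are hypothetical; the only point is to compare how far each summary moves when one extreme value is added.

```python
from statistics import mean, median

# Hypothetical marks for one group, invented for illustration.
marks = [62, 65, 68, 70, 72]

# The same group with one unusually low mark added.
marks_with_outlier = marks + [5]

mean_shift = mean(marks) - mean(marks_with_outlier)
median_shift = median(marks) - median(marks_with_outlier)

print("Without outlier:", mean(marks), median(marks))
print("With outlier:   ", mean(marks_with_outlier), median(marks_with_outlier))
print("Mean moved by", round(mean_shift, 1), "and median moved by", median_shift)
```

In this made-up data the single low mark pulls the mean down by about ten marks, while the median moves by less than two.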
Quick practice
- In “department and marks”, which variable is numerical?
- Can you compute a mean for department names?
- What summary is useful for marks across departments?
Answer key
- Marks
- No
- Mean, median, and spread by department
