Learning outcomes

  • define the basic language of statistics
  • distinguish population from sample
  • distinguish parameter from statistic
  • identify cases and variables in a dataset

What is statistics?

  • Statistics is the study of collecting, organizing, analyzing, and interpreting data.
  • It helps us make sense of information and support decisions.

Two major branches of statistics

Descriptive statistics

  • Organizes and summarizes data already collected.
  • Uses:
    • tables
    • averages
    • percentages
    • charts and graphs
Examples:
  • average marks in a class
  • bar chart of students by department
  • minimum and maximum rainfall this month
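The summaries listed above can be sketched in a few lines of Python. The marks below are made-up values for illustration only; the point is that every result describes the data already collected and nothing beyond it.

```python
from statistics import mean

# Hypothetical marks for a small class (illustrative values only).
marks = [82, 75, 91, 68, 79]

average_mark = mean(marks)   # average marks in the class
lowest_mark = min(marks)     # minimum
highest_mark = max(marks)    # maximum

# Percentage of students scoring 75 or more.
share_75_plus = 100 * sum(m >= 75 for m in marks) / len(marks)

print(average_mark, lowest_mark, highest_mark, share_75_plus)
```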

Inferential statistics

  • Uses sample data to draw conclusions about a larger population.
  • Because a sample may not match the population exactly, inference includes uncertainty.
Examples:
  • estimating average income in a city from a survey
  • checking a sample of bulbs to estimate factory defect rate
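The bulb example can be sketched as follows. This is a minimal illustration, assuming a simulated factory output of 5000 bulbs with a made-up 4% defect rate and a simple random sample of 200; the standard error shown is the usual rough formula for a sample proportion and is only one way to express the uncertainty.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 5000 bulbs, each marked True if defective.
# The 4% defect rate is an assumption made up for this illustration.
population = [random.random() < 0.04 for _ in range(5000)]

# Inspect a simple random sample of 200 bulbs instead of all 5000.
sample = random.sample(population, 200)

# Point estimate of the defect rate, computed from the sample only.
p_hat = sum(sample) / len(sample)

# Rough standard error of a sample proportion
# (ignores the finite-population correction).
std_error = math.sqrt(p_hat * (1 - p_hat) / len(sample))

print(f"estimated defect rate: {p_hat:.3f} (standard error about {std_error:.3f})")
```

The estimate will usually be close to, but not exactly equal to, the true rate in the population; that gap is the uncertainty that inference has to account for.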

What is data?

  • Data are facts, observations, or measurements collected for analysis.
  • Data can be:
    • numbers
    • categories
    • labels
    • recorded responses
Examples:
  • age
  • exam marks
  • blood group
  • gender
  • branch

Population and sample

  • Population: the complete set of all units of interest.
  • Sample: a subset selected from the population.
Example:
  • Population: all students in a college
  • Sample: 200 students surveyed from that college

Census and sample survey

  • Census: information is collected from every unit in the population.
  • Sample survey: information is collected from only a part of the population.
Why use a sample?
  • faster
  • cheaper
  • easier to manage

Parameter and statistic

  • Parameter: numerical summary of a population
  • Statistic: numerical summary of a sample
Examples:
  • population mean = parameter
  • sample mean = statistic
MCQ trap:
  • If the question mentions “all students”, “all households”, or “entire production”, think population and parameter.
  • If it mentions “surveyed 100”, “sample of 50”, or “selected units”, think sample and statistic.
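The distinction can be made concrete with a small sketch. The population here is simulated (marks of 1000 hypothetical students), so both numbers can be computed; in practice the parameter is usually unknown and only the statistic is available.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: marks of all 1000 students in a college.
population = [random.randint(40, 100) for _ in range(1000)]

# Parameter: computed from every unit in the population.
population_mean = mean(population)

# Statistic: computed from a sample of 100 students only.
sample = random.sample(population, 100)
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

Drawing a different sample would give a different sample mean, while the population mean stays fixed.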

Cases and variables

  • Case or observation: one individual unit from which data are collected.
  • Variable: a characteristic measured or recorded for each case.
Example:
  • In a student dataset:
    • case = one student
    • variables = marks, age, department, attendance

Data table structure

  • Rows represent cases.
  • Columns represent variables.
Example:
Student  Age  Marks  Department
A        18   82     CSE
B        19   75     ECE
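The same two-student table can be held in code as a list of records, one record per case and one key per variable. This is only one possible representation, chosen because it needs nothing beyond plain Python.

```python
# Each dictionary is one row (one case: a student).
# Each key is one column (one variable).
students = [
    {"Student": "A", "Age": 18, "Marks": 82, "Department": "CSE"},
    {"Student": "B", "Age": 19, "Marks": 75, "Department": "ECE"},
]

# The variables are the column names.
variables = list(students[0].keys())

# Reading across a row gives every variable for one case.
first_case = students[0]

# Reading down a column gives one variable for every case.
marks_column = [row["Marks"] for row in students]

print(variables)
print(first_case)
print(marks_column)
```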

Common mistakes to avoid

  • confusing sample with population
  • calling every numerical result a parameter
  • thinking descriptive statistics is “less important”
  • confusing case with variable

Exam hints and traps

  • Population means the whole target group, not just the available group.
  • A sample is used to learn about the population.
  • A statistic comes from a sample; a parameter belongs to the population.
  • A dataset may describe a sample even if only descriptive measures are computed.

Quick practice

  1. A company tests 80 batteries out of 5000. Identify population and sample.
  2. Is “average lifetime of the 80 batteries” a parameter or a statistic?
  3. In a class record sheet, identify one case and three variables.

Answer key

  1. Population: all 5000 batteries; sample: 80 tested batteries
  2. Statistic
  3. Case: one student; variables: roll number, marks, attendance