CBSE Class 10 Maths Notes: Statistics

📊 Introduction to Statistics

Welcome to the fascinating world of Statistics! This chapter provides the tools to analyze and interpret data, a crucial skill in the modern world. We’ll explore how to organize, summarize, and understand data through various methods.

📈 Grouped Data: Frequency Distributions and Class Intervals

This section focuses on organizing data into meaningful groups, particularly when dealing with large datasets.

Definitions:

Data: A collection of facts, such as numbers, words, measurements, observations, or descriptions.
Raw Data: Data collected in its original form.
Frequency: The number of times a particular value or observation occurs in a dataset.
Class Interval: A group or range of values into which data is organized (e.g., 0-10, 10-20).
Frequency Distribution: A table that displays the frequency of each class interval.

Core Principles:

Creating class intervals allows us to summarize large datasets.
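Class intervals should be mutually exclusive (no overlap) and exhaustive (cover all data values). With continuous classes such as 10-20 and 20-30, a value equal to a shared boundary is counted in the higher class, so 20 belongs to 20-30.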
Class width should ideally be consistent for ease of analysis (e.g., all class intervals are of width 10).

Constructing a Frequency Distribution:

Collect Data: Gather the data you want to analyze.
Determine Range: Find the difference between the highest and lowest values in your dataset.
Choose Class Width: Decide the size of your class intervals based on the range and the desired number of intervals (e.g., Width = Range / Number of Intervals, rounded up to a convenient value).
Create Class Intervals: Define the boundaries of your class intervals.
Tally Frequencies: Count how many data points fall into each class interval.
Create Frequency Table: Present the class intervals and their corresponding frequencies in a table.

Example:

Suppose the marks obtained by 18 students in a test are: 12, 18, 22, 25, 27, 28, 30, 32, 33, 35, 37, 38, 40, 42, 45, 48, 52, 55. Let’s create a frequency distribution with class intervals of width 10 (10-20, 20-30, and so on).

Frequency Table

Class Interval	Frequency
10-20	2
20-30	4
30-40	6
40-50	4
50-60	2
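
The tallying step can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `frequency_table` and its parameters are hypothetical, and the `marks` list is an illustrative set of 18 values chosen so that the counts reproduce the table above. It assumes every value is at least `start` and that the upper limit of each class is excluded.

```python
from collections import Counter

def frequency_table(values, start, width):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = Counter()
    for v in values:
        # Index of the class containing v. The upper limit is excluded,
        # so a value of 20 is counted in 20-30, not in 10-20.
        k = int((v - start) // width)
        counts[k] += 1
    n_classes = max(counts) + 1  # highest class index that received a value
    return [
        (start + k * width, start + (k + 1) * width, counts[k])
        for k in range(n_classes)
    ]

# Hypothetical marks, chosen so the counts match the table above.
marks = [12, 18, 22, 25, 27, 28, 30, 32, 33, 35, 37, 38,
         40, 42, 45, 48, 52, 55]

for lower, upper, freq in frequency_table(marks, start=10, width=10):
    print(f"{lower}-{upper}\t{freq}")
```

Each printed row is a class interval followed by its frequency, in the same layout as the table.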

🧮 Measures of Central Tendency for Grouped Data

Central tendency refers to the “center” or “typical” value of a dataset. This section covers the mean (average); the median and mode follow in the next section.

1. Mean (Arithmetic Mean)

Definition: The average of all the values in a dataset.

Methods for Calculating Mean:

Direct Method:

Formula: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$, where $\bar{x}$ is the mean, $f_i$ is the frequency of class $i$, and $x_i$ is the class mark (midpoint) of class $i$.
Steps:
1. Calculate the class mark ($x_i$) for each class interval: $x_i = \frac{\text{lower limit} + \text{upper limit}}{2}$.
2. Multiply the class mark by its corresponding frequency ($f_i x_i$).
3. Sum up all the products ($\sum f_i x_i$).
4. Sum up all the frequencies ($\sum f_i$).
5. Divide the sum of products by the sum of frequencies.

Assumed Mean Method:

Formula: $\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i}$, where $a$ is the assumed mean and $d_i = x_i - a$ is the deviation of the class mark from the assumed mean.
Steps:
1. Choose an assumed mean ($a$) from one of the class marks.
2. Calculate the deviation ($d_i$) for each class interval.
3. Multiply the frequency by the deviation ($f_i d_i$).
4. Sum up all the products ($\sum f_i d_i$).
5. Sum up all the frequencies ($\sum f_i$).
6. Divide the sum of ($f_i d_i$) by the sum of frequencies and add the assumed mean.
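
As a quick check of these steps, the working below applies the assumed mean method to the frequency table from the earlier example (frequencies 2, 4, 6, 4, 2 for the classes 10-20 to 50-60, with class marks 15, 25, 35, 45, 55). The choice $a = 25$ is only an illustrative assumption; any other class mark should give the same result.

```latex
\begin{align*}
d_i = x_i - 25 &: \quad -10,\ 0,\ 10,\ 20,\ 30 \\
f_i d_i &: \quad 2(-10),\ 4(0),\ 6(10),\ 4(20),\ 2(30) \;=\; -20,\ 0,\ 60,\ 80,\ 60 \\
\sum f_i d_i &= 180, \qquad \sum f_i = 18 \\
\bar{x} &= 25 + \frac{180}{18} = 25 + 10 = 35
\end{align*}
```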

Step-Deviation Method:

Formula: $\bar{x} = a + h \cdot \frac{\sum f_i u_i}{\sum f_i}$, where $a$ is the assumed mean, $h$ is the class width, and $u_i = \frac{x_i - a}{h}$ is the step-deviation.
Steps:
1. Choose an assumed mean ($a$).
2. Calculate the step-deviation ($u_i$) for each class interval.
3. Multiply the frequency by the step-deviation ($f_i u_i$).
4. Sum up all the products ($\sum f_i u_i$).
5. Sum up all the frequencies ($\sum f_i$).
6. Divide the sum of ($f_i u_i$) by the sum of frequencies, multiply by class width and add the assumed mean.
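
The step-deviation steps can be sketched in Python as follows. The function name `mean_step_deviation`, the `(lower, upper, frequency)` tuple layout, and the choice $a = 25$ are assumptions made for illustration; the table is the one from the earlier example, and all classes are assumed to share the same width `h`.

```python
def mean_step_deviation(classes, a, h):
    """Grouped mean via step-deviation: a + h * sum(f*u) / sum(f).

    classes: list of (lower, upper, frequency), all of equal width h.
    a: assumed mean, normally one of the class marks.
    """
    total_f = 0
    total_fu = 0.0
    for lower, upper, f in classes:
        x = (lower + upper) / 2   # class mark
        u = (x - a) / h           # step-deviation
        total_f += f
        total_fu += f * u
    return a + h * total_fu / total_f

table = [(10, 20, 2), (20, 30, 4), (30, 40, 6), (40, 50, 4), (50, 60, 2)]
print(mean_step_deviation(table, a=25, h=10))  # 35.0
```

Here the step-deviations are -1, 0, 1, 2, 3, so the sum of $f_i u_i$ is 18 and the mean is $25 + 10 \times \frac{18}{18} = 35$. Picking a different class mark for `a` changes the intermediate numbers but not the final mean.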

Example (Direct Method):

Using the frequency table from the previous example:

Class Interval	Frequency ($f_i$)	Class Mark ($x_i$)	$f_i x_i$
10-20	2	15	30
20-30	4	25	100
30-40	6	35	210
40-50	4	45	180
50-60	2	55	110
Total	18		630

$\bar{x} = \frac{630}{18} = 35$

Therefore, the mean of the data is 35.
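
The same calculation can be sketched in Python. The function name `mean_direct` and the `(lower, upper, frequency)` tuple layout are hypothetical choices for this illustration; the numbers are those in the table above.

```python
def mean_direct(classes):
    """Grouped mean by the direct method: sum(f*x) / sum(f)."""
    total_f = sum(f for _, _, f in classes)
    # (lower + upper) / 2 is the class mark of each interval.
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f

table = [(10, 20, 2), (20, 30, 4), (30, 40, 6), (40, 50, 4), (50, 60, 2)]
print(mean_direct(table))  # 35.0
```

Because every value in a class is represented by its class mark, the result is an estimate of the mean of the underlying raw data rather than its exact value.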

✍ Median and Mode for Grouped Frequency Distributions

These are also measures of central tendency, representing the “middle” value and the most frequent value, respectively.

1. Median

Definition: The middle value in a sorted dataset.

Formula: $\text{Median} = l + \frac{\frac{n}{2} - cf}{f} \times h$

$l$ = Lower limit of the median class.
$n$ = Total number of observations ($\sum f_i$).
$cf$ = Cumulative frequency of the class preceding the median class.
$f$ = Frequency of the median class.
$h$ = Class width.

Steps:

Calculate the cumulative frequencies.
Find the median class: the class in which the cumulative frequency first exceeds $\frac{n}{2}$.
Apply the formula.

Example: Using the frequency table from the earlier example, with cumulative frequencies added:

Class Interval	Frequency ($f_i$)	Cumulative Frequency (cf)
10-20	2	2
20-30	4	6
30-40	6	12
40-50	4	16
50-60	2	18

$\frac{n}{2} = \frac{18}{2} = 9$. The median class is 30-40, since its cumulative frequency (12) is the first to exceed 9.

$l=30$, $cf=6$, $f=6$, $h=10$

$\text{Median} = 30 + \frac{9 - 6}{6} \times 10 = 30 + 5 = 35$

Therefore, the median of the data is 35.
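
A minimal Python sketch of the median steps is given below. The function name `grouped_median` and the `(lower, upper, frequency)` tuple layout are assumptions for illustration; the classes must be listed in ascending order.

```python
def grouped_median(classes):
    """Median of grouped data: l + ((n/2 - cf) / f) * h.

    classes: list of (lower, upper, frequency) in ascending order.
    """
    n = sum(f for _, _, f in classes)
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if cf + f > half:  # first class whose cumulative frequency exceeds n/2
            h = upper - lower
            return lower + (half - cf) / f * h
        cf += f
    raise ValueError("no median class found; check the frequencies")

table = [(10, 20, 2), (20, 30, 4), (30, 40, 6), (40, 50, 4), (50, 60, 2)]
print(grouped_median(table))  # 35.0
```

The loop keeps a running cumulative frequency, so when the median class is reached, `cf` already holds the cumulative frequency of the preceding class, which is what the formula needs.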

2. Mode

Definition: The value that appears most frequently in a dataset.

Formula: $\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$

$l$ = Lower limit of the modal class.
$f_1$ = Frequency of the modal class.
$f_0$ = Frequency of the class preceding the modal class.
$f_2$ = Frequency of the class succeeding the modal class.
$h$ = Class width.

Steps:

Identify the modal class: the class with the highest frequency.
Apply the formula.

Example: Using the frequency table from the previous example:

The modal class is 30-40 (highest frequency of 6).

$l=30$, $f_1=6$, $f_0=4$, $f_2=4$, $h=10$

$\text{Mode} = 30 + \frac{6 - 4}{2 \times 6 - 4 - 4} \times 10 = 30 + \frac{2}{4} \times 10 = 35$

Therefore, the mode of the data is 35.
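
The mode formula can be sketched in Python in the same way. The function name `grouped_mode` is hypothetical. Two choices below are assumptions rather than part of the notes above: a missing neighbouring class (when the modal class is first or last) is treated as having frequency 0, and when two classes tie for the highest frequency the first one is used.

```python
def grouped_mode(classes):
    """Mode of grouped data: l + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                      # modal class (first one if tied)
    lower, upper, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0                # no preceding class: assume 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # no succeeding class: assume 0
    denom = 2 * f1 - f0 - f2
    if denom == 0:
        raise ValueError("formula undefined when 2*f1 - f0 - f2 is zero")
    return lower + (f1 - f0) / denom * (upper - lower)

table = [(10, 20, 2), (20, 30, 4), (30, 40, 6), (40, 50, 4), (50, 60, 2)]
print(grouped_mode(table))  # 35.0
```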

🌍 Interpretation of Mean, Median, and Mode

Understanding how to interpret these measures in real-life scenarios is key to data analysis.

Mean: Provides an average value, useful for getting an overall sense of the data. It works best when the values are clustered symmetrically, and it is sensitive to outliers.

Median: The middle value; it is less affected by extreme values (outliers) than the mean, which makes it useful for skewed data.

Mode: The most frequent value, useful for identifying the most common or “popular” observation in the dataset.

Real-life contexts:

Mean: Average salary of employees in a company; average marks in an exam.
Median: Median income of a population (less affected by high-earning individuals); house prices.
Mode: The most popular shoe size sold in a store; the most common color of cars.

Important Notes:

The choice of which measure to use depends on the data and the question you’re trying to answer.

In symmetrical distributions with a single peak, the mean, median and mode are roughly the same, as in the worked examples above, where all three equal 35.

Outliers can significantly affect the mean. The median is less susceptible to outliers.
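
The effect of an outlier can be illustrated with a short Python sketch using the standard `statistics` module. The salary figures are hypothetical values invented for this illustration, not real data.

```python
from statistics import mean, median

# Hypothetical monthly salaries (in thousands); not taken from any real dataset.
salaries = [30, 32, 35, 36, 37]
with_outlier = salaries + [400]   # add one very high earner

for label, data in [("without outlier", salaries), ("with outlier", with_outlier)]:
    print(f"{label}: mean = {mean(data):.1f}, median = {median(data):.1f}")
# without outlier: mean = 34.0, median = 35.0
# with outlier: mean = 95.0, median = 35.5
```

A single extreme value pulls the mean from 34 to 95, while the median moves only from 35 to 35.5.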
