Mean — Complete Chapter for ML & Statistics

What is Mean?

We calculate the mean (average) because it gives a single value that represents the whole dataset.

Why we need mean (benefits):

  • Easy understanding: Instead of looking at many numbers, one value summarizes everything.
  • Quick comparison: You can easily compare different groups (e.g., average salary of two companies).
  • Decision making: Helps in making decisions based on overall performance (e.g., average marks, sales).
  • Finding trends: Shows general behavior of data (high, low, normal).
  • Used in formulas: Mean is the base for many calculations like variance, standard deviation, etc.

Mean is the average of a set of numbers. You add all values together, then divide by how many values there are.

Simple idea: If 5 friends scored 60, 70, 80, 90, 100 in a test — what was the "typical" score? You find the mean.

Formula:

Mean (μ or x̄) = Sum of all values / Total count
              = (x₁ + x₂ + x₃ + ... + xₙ) / n

Example:

Values: 60, 70, 80, 90, 100
Sum = 60 + 70 + 80 + 90 + 100 = 400
Count = 5
Mean = 400 / 5 = 80

Why Do We Use Mean?

Because we need one number that represents the whole dataset.

In ML, you can't feed 10,000 raw values into every formula. You need summaries. Mean is the most fundamental summary of data.

It answers: "If everything was equal, what would each value be?"


Types of Mean (All Used in ML)


1. Arithmetic Mean

This is the standard mean everyone knows. Add everything, divide by count.


    import numpy as np

    scores = [60, 70, 80, 90, 100]
    mean = np.mean(scores)
    print(mean)  # 80.0

Used in: Loss functions, accuracy calculation, gradient descent, feature scaling.


2. Weighted Mean

Weighted mean is used when not all values are equally important.

👉 In the normal mean, every value has the same importance.
👉 In the weighted mean, some values carry more importance (weight) than others.

Some values matter MORE than others, so you assign a weight to each value.

Formula:

Weighted Mean = (w₁x₁ + w₂x₂ + ... + wₙxₙ) / (w₁ + w₂ + ... + wₙ)

Example 1: You have 3 exams. Final exam is worth more.


    import numpy as np
    scores  = [70,  80,  90]
    weights = [1,   1,   3]   # Final exam has weight 3

    weighted_mean = np.average(scores, weights=weights)
    print(weighted_mean)  # 84.0

    # Manual: (70*1 + 80*1 + 90*3) / (1+1+3) = 420/5 = 84


Example 2:

Marks:

  • Math = 90 (weight = 50%)
  • English = 80 (weight = 30%)
  • Science = 70 (weight = 20%)

Now we don’t treat all subjects equally.

Weighted Mean =

(90 × 0.5) + (80 × 0.3) + (70 × 0.2)
= 45 + 24 + 14
= 83
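The same calculation in NumPy, using the marks and weights from this example. Because these weights already sum to 1, the division step changes nothing:

```python
import numpy as np

marks   = [90, 80, 70]      # Math, English, Science
weights = [0.5, 0.3, 0.2]   # importance of each subject

wm = np.average(marks, weights=weights)
print(round(wm, 2))  # 83.0
```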

Used in: Ensemble models (XGBoost, Random Forest voting), class imbalance handling, recommendation systems.


3. Geometric Mean

🤔 First Understand the Problem — Why Does the Arithmetic Mean Fail?

Suppose you have ₹100. You invest it:

Year     Return    Your Money
─────────────────────────────
Year 1   +100%     ₹100 → ₹200
Year 2   -50%      ₹200 → ₹100

Arithmetic Mean says: (+100 + (-50)) / 2 = +25% average return per year.

Reality check: Your money went ₹100 → ₹200 → ₹100 back. Real return = 0% 😐

So arithmetic mean showed 25% profit when actually there was 0% profit. This exact problem is solved by Geometric Mean.

🧠 Core Idea — Growth That Multiplies

Whenever one value grows on top of the previous value (compounding), use Geometric Mean.

Type               Operation             Use When
───────────────────────────────────────────────────────────────────────
Arithmetic Mean    Adds numbers          Values are independent
Geometric Mean     Multiplies numbers    Each value depends on the previous one

๐Ÿ“ Formula

Two steps only:

  1. Multiply all numbers together
  2. Take the nth root (n = how many numbers you have)
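Those two steps, written directly in Python. The three multipliers here are just sample values:

```python
# Geometric mean by hand: multiply everything, then take the nth root
values = [1.10, 0.80, 1.30]
n = len(values)

product = 1.0
for v in values:
    product *= v          # Step 1: multiply all numbers together

gm = product ** (1 / n)   # Step 2: take the nth root
print(round(gm, 4))  # 1.0459
```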

💰 Example 1 — Investment Returns

You have ₹1000. Returns over 3 years:

  • Year 1: +10%
  • Year 2: -20%
  • Year 3: +30%

🔄 Step 0 — Convert % to Multiplier (Most Important Step)

Why do we convert? Because we need to multiply, not add. A multiplier tells us what to multiply the current amount by.

Year     Return    How to Convert    Multiplier
───────────────────────────────────────────────
Year 1   +10%      1.00 + 0.10       1.10
Year 2   -20%      1.00 − 0.20       0.80
Year 3   +30%      1.00 + 0.30       1.30

Rule: Always write it as 1 + (percent/100)

  +10% → 1 + (10/100) = 1 + 0.10 = 1.10
  -20% → 1 + (-20/100) = 1 − 0.20 = 0.80

📊 Step 1 — Multiply All Multipliers

Calculate left to right:

1.10 × 0.80 = 0.88
0.88 × 1.30 = 1.144

Product = 1.144

🌱 Step 2 — Take the nth Root

Here n = 3 (three years), so we take the cube root:

GM = (1.144)^(1/3)

What does the 1/3 power mean? (1.144)^(1/3) asks: "what number, multiplied by itself 3 times, gives 1.144?" The answer is ≈ 1.0459.

🎯 Step 3 — Convert Back to Percentage

Average return per year = (1.0459 − 1) × 100 ≈ 4.59%


✅ Step 4 — Verify the Answer (Proof)

Actual path of money:

₹1000 × 1.10 × 0.80 × 1.30 = ₹1144

Using GM (≈ 4.59% every year):

₹1000 × 1.0459 × 1.0459 × 1.0459 ≈ ₹1144

Both give the same final amount — so GM is correct!

What Arithmetic Mean would have given (wrong):

(+10 − 20 + 30) / 3 ≈ +6.67% per year, so ₹1000 × 1.0667 × 1.0667 × 1.0667 ≈ ₹1214. The money never actually reached that amount.

👨‍👩‍👧 Example 2 — Population Growth

City population = 10,00,000. Growth over 3 years:

  • Year 1: +5%
  • Year 2: +8%
  • Year 3: +6%

🔄 Step 0 — Convert to Multipliers

Year     Growth    Conversion    Multiplier
───────────────────────────────────────────
Year 1   +5%       1 + 0.05      1.05
Year 2   +8%       1 + 0.08      1.08
Year 3   +6%       1 + 0.06      1.06

📊 Step 1 — Multiply All Multipliers

1.05 × 1.08 = 1.134
1.134 × 1.06 = 1.20204

Product = 1.20204

🌱 Step 2 — Take the Cube Root (n = 3)

GM = (1.20204)^(1/3) ≈ 1.0633

🎯 Step 3 — Convert to Percentage

Average growth per year = (1.0633 − 1) × 100 ≈ 6.33%

✅ Step 4 — Verify

Actual population growth:

10,00,000 × 1.05 × 1.08 × 1.06 = 12,02,040

Using GM (≈ 6.33% every year):

10,00,000 × 1.0633 × 1.0633 × 1.0633 ≈ 12,02,170

Both match almost perfectly (the tiny gap is only rounding in the cube root).
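The same population check in Python with scipy's gmean, using the growth multipliers from the table above:

```python
from scipy.stats import gmean

growth = [1.05, 1.08, 1.06]               # the three yearly multipliers
gm = gmean(growth)

print(f"{(gm - 1) * 100:.2f}% per year")  # 6.33% per year
print(round(1_000_000 * gm ** 3))         # 1202040  (= 12,02,040)
```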

๐Ÿ” Revisiting the ₹100 Problem (Now With GM)

+100% and -50%:

GM correctly said 0% return. Arithmetic mean had wrongly said +25%.

๐Ÿ Python Code With Explanation


    from scipy.stats import gmean

    # Step 1: Write returns as multipliers
    returns = [1.10, 0.80, 1.30]   # +10%, -20%, +30%

    # Step 2: gmean multiplies all and takes nth root automatically
    gm = gmean(returns)

    # Step 3: Convert back to percentage
    print(f"Geometric Mean : {gm:.4f}")              # 1.0459
    print(f"Avg return/year: {(gm - 1) * 100:.2f}%") # 4.59%

📌 When to Use — Quick Reference

Situation                        Correct Mean
────────────────────────────────────────────────
Average marks, height, weight    Arithmetic Mean
Investment / stock returns       Geometric Mean
Population growth                Geometric Mean
Any % change over time           Geometric Mean
Each value builds on previous    Geometric Mean

🎯 One Line Summary

Whenever money or any quantity grows on top of the previous result (compounding), always use Geometric Mean — Arithmetic Mean will give you a wrong answer.


4. Harmonic Mean

Harmonic Mean is used when values are related to speed, rate, or “per unit” quantities.

👉 Like:

  • speed (km/h)
  • price per item
  • work per hour

What it actually means

It gives the true average when the values themselves are divisions (rates), not plain quantities that are added or multiplied.

👉 Special case: when you travel the same distance at different speeds.

Simple example idea

You go:

  • Half distance at 60 km/h
  • Half distance at 40 km/h

👉 Normal average = (60 + 40) / 2 = 50 km/h ❌ WRONG
👉 Because the time taken at each speed is different
👉 Harmonic Mean gives the correct average speed: 2 / (1/60 + 1/40) = 48 km/h

Why we need it

  • When values are rates (per unit)
  • When the denominator matters (time, distance, etc.)
  • It gives the truly accurate result in such cases


In one line:

Harmonic mean is used to find the correct average when dealing with speeds or rates (per unit values).

Reciprocal of the arithmetic mean of reciprocals. Sounds complex — but the use case makes it click.

Formula:

Harmonic Mean = n / (1/x₁ + 1/x₂ + ... + 1/xₙ)

✅ Example 1 (Average Speed with multiple values)

You travel equal distances at speeds: 30 km/h, 40 km/h, 60 km/h

Step 1: Formula
HM = 3 / (1/30 + 1/40 + 1/60)

Step 2: Solve
1/30 + 1/40 + 1/60
LCM = 120

= (4 + 3 + 2) / 120 = 9/120 = 3/40

Step 3: Final
3 ÷ (3/40) = 40 km/h

👉 Final Answer: Average speed = 40 km/h
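The same speed calculation confirmed with scipy's hmean:

```python
from scipy.stats import hmean

speeds = [30, 40, 60]        # equal distances at each speed
avg_speed = hmean(speeds)
print(round(avg_speed, 2))   # 40.0
```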

✅ Example 2 (Work Rate)

Two machines complete same work:

  • Machine A → 6 hours
  • Machine B → 12 hours

Step 1: Formula

HM = 2 / (1/6 + 1/12)

Step 2: Solve

1/6 + 1/12 = (2 + 1) / 12 = 3/12 = 1/4

Step 3: Final

2 ÷ (1/4) = 8 hours

👉 Final Answer: Average time = 8 hours

Example:


    from scipy.stats import hmean

    values = [4, 1]
    h_mean = hmean(values)
    print(h_mean)  # 1.6

The most important use in ML — F1 Score:


    # F1 Score IS the harmonic mean of precision and recall
    precision = 0.80
    recall    = 0.60

    f1 = 2 * (precision * recall) / (precision + recall)
    print(f1)  # ≈ 0.686

    # Why harmonic and not arithmetic?
    # Arithmetic mean of 0.8 and 0.6 = 0.70 (too generous)
    # Harmonic mean punishes imbalance — if either is low, F1 is low

Used in: F1 Score, averaging rates, anywhere balance between two metrics matters.


5. Moving Average (Rolling Mean)
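A rolling mean is the mean of only the last k values, recomputed as the window slides forward one step at a time. It smooths out short-term noise in time series. A minimal sketch with pandas (the price series is made-up sample data):

```python
import pandas as pd

prices = pd.Series([10, 12, 11, 13, 15, 14, 16])

# Each point becomes the mean of itself and the 2 values before it
rolling = prices.rolling(window=3).mean()
print(rolling.tolist())
# [nan, nan, 11.0, 12.0, 13.0, 14.0, 15.0]
```

The first two entries are NaN because a full window of 3 values is not yet available.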


6. Exponential Moving Average (EMA)
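An EMA also averages over a series, but it weights recent values more, with older values fading exponentially: EMA_t = α·x_t + (1 − α)·EMA_(t−1). This kind of averaging appears inside the Adam optimizer and in forecasting. A minimal sketch with pandas ewm (α = 0.5 and the data are arbitrary choices for illustration):

```python
import pandas as pd

prices = pd.Series([10.0, 12.0, 11.0, 13.0])

# alpha = weight given to the newest observation
ema = prices.ewm(alpha=0.5, adjust=False).mean()
print(ema.tolist())  # [10.0, 11.0, 11.0, 12.0]
```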


Mean in Core ML Concepts


Mean Absolute Error (MAE)
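MAE is the arithmetic mean of the absolute errors, mean(|actual − predicted|), so every mistake counts in proportion to its size. A small sketch with made-up predictions:

```python
import numpy as np

actual = np.array([100, 200, 300])
pred   = np.array([110, 190, 330])

errors = np.abs(actual - pred)   # [10, 10, 30]
mae = errors.mean()              # (10 + 10 + 30) / 3
print(round(mae, 2))  # 16.67
```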


Mean Squared Error (MSE)
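MSE squares each error before averaging, mean((actual − predicted)²), which punishes large mistakes much harder than small ones. The same made-up numbers:

```python
import numpy as np

actual = np.array([100, 200, 300])
pred   = np.array([110, 190, 330])

mse = np.mean((actual - pred) ** 2)  # (100 + 100 + 900) / 3
print(round(mse, 2))  # 366.67
```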


Root Mean Squared Error (RMSE)
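RMSE is simply the square root of MSE, which brings the error back into the original units of the target. Continuing with the same made-up numbers:

```python
import numpy as np

actual = np.array([100, 200, 300])
pred   = np.array([110, 190, 330])

mse  = np.mean((actual - pred) ** 2)
rmse = np.sqrt(mse)
print(round(rmse, 2))  # 19.15
```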


Mean in Feature Scaling — Standardization (Z-score)

Before feeding data into ML models, you scale features. Mean is the center point.

Formula:

z = (x - mean) / standard_deviation


    from sklearn.preprocessing import StandardScaler

    data = [[25], [30], [35], [40], [45]]
    scaler = StandardScaler()
    scaled = scaler.fit_transform(data)

    print(scaled)
    # After scaling: mean becomes 0, std becomes 1
    # [-1.41, -0.71, 0.0, 0.71, 1.41]

Why? Algorithms like Linear Regression, SVM, KNN, Neural Networks assume features are on similar scales. Without this, the feature with larger numbers dominates unfairly.


Mean in Gradient Descent

When you train a model, the loss function uses mean over all training examples.

Loss = (1/n) × Σ (predicted - actual)²

The gradient (direction to update weights) is also the mean of gradients across all samples. The model learns by minimizing this average error.
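To make that concrete, here is a tiny hand-rolled gradient descent on a single weight, using the mean gradient over all samples. The data, starting weight, and learning rate are made-up for illustration:

```python
import numpy as np

# Toy dataset where the true relationship is y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0      # start from a wrong weight
lr = 0.05    # learning rate

for _ in range(100):
    pred = w * x
    # Gradient of the MSE loss w.r.t. w is the MEAN over all samples
    grad = np.mean(2 * (pred - y) * x)
    w -= lr * grad

print(round(w, 4))  # 2.0, the weight the data implies
```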


Mean Imputation (Handling Missing Data)

When data has missing values, a simple strategy is to fill them with the mean of that column.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'age': [25, 30, np.nan, 40, np.nan, 35]})

    mean_age = df['age'].mean()  # 32.5
    df['age'] = df['age'].fillna(mean_age)

    print(df)
    # NaN values replaced with 32.5

When to use: Works well when data is roughly normally distributed and not too many values are missing.


Mean in Batch Normalization (Neural Networks)

Inside deep neural networks, after each layer, the activations are normalized using mean and standard deviation of the current batch. This keeps training stable and fast.

    import torch
    import torch.nn as nn

    # PyTorch example
    bn = nn.BatchNorm1d(num_features=4)
    x = torch.tensor([[1.0, 2.0, 3.0, 4.0],
                      [5.0, 6.0, 7.0, 8.0]])

    output = bn(x)
    # Internally: subtracts mean, divides by std, for each feature

Mean vs Median — When Mean Fails

Mean has one big weakness: outliers destroy it.

    import numpy as np

    salaries = [30000, 32000, 31000, 29000, 500000]  # one extreme earner in the group

    mean_salary   = np.mean(salaries)    # 124,400  ← completely misleading
    median_salary = np.median(salaries)  # 31,000   ← represents the group better

Rule of thumb:

  • Data has no extreme outliers → use Mean
  • Data has outliers or is skewed → use Median
  • Always check with a histogram or box plot before deciding

Quick Reference Summary

Type               Formula                   ML Use Case
─────────────────────────────────────────────────────────────────────
Arithmetic Mean    sum / n                   Loss functions, scaling
Weighted Mean      Σ(wᵢxᵢ) / Σwᵢ             Ensembles, class weights
Geometric Mean     (x₁×x₂×...×xₙ)^(1/n)      Growth rates, log-scale eval
Harmonic Mean      n / Σ(1/xᵢ)               F1 Score, rate averaging
Rolling Mean       Mean of last k values     Time series smoothing
EMA                Weighted recent average   Adam optimizer, forecasting
MAE                mean(|actual - pred|)     Regression evaluation
MSE                mean((actual - pred)²)    Regression loss function
RMSE               √MSE                      Regression evaluation

One-Line Memory Hook for Each

  • Arithmetic → "The everyday average"
  • Weighted → "Some things matter more"
  • Geometric → "For growth and multiplication"
  • Harmonic → "For rates and balance — F1 lives here"
  • Rolling → "Sliding window over time"
  • EMA → "Recent past matters more"
  • MAE → "Average of how wrong you were"
  • MSE → "Punish big mistakes harder"
  • RMSE → "MSE in original units"

That's the complete Mean chapter — from the basic definition all the way to how it powers neural network training, model evaluation, and data preprocessing in real ML pipelines.

