Root Mean Squared Error (RMSE)

Start With The Problem — Where MSE Falls Short

You just calculated MSE for your house price model:

MSE = 20.8

Your manager asks — "so how wrong is our model on average?"

You say — "MSE is 20.8"

They ask — "20.8 what? Lakhs? Lakhs squared? What does that mean?"

You have no good answer. Because MSE's unit is Lakhs² — completely uninterpretable in the real world.

You want MSE's superpower (punishing big errors) but in a unit that actually makes sense.

Simple fix — just take the square root of MSE. That's RMSE.


What is RMSE?

RMSE = Square Root of MSE

That's the entire definition. Nothing new to learn conceptually — it's just MSE with a square root on top to fix the unit problem.

RMSE = √MSE = √( (1/n) × Σ (Actual - Predicted)² )

The Formula

RMSE = √ [ (1/n) × Σ (Actual - Predicted)² ]

Step by step:

  1. Find error (Actual − Predicted)
  2. Square each error
  3. Take average → this is MSE
  4. Take square root of MSE → this is RMSE
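The four steps above can be sketched directly in plain Python (no libraries needed), using the same house price data from the walkthrough below:

```python
# Minimal sketch of the four RMSE steps with plain Python lists
actual    = [50, 80, 60, 90, 70]
predicted = [45, 85, 58, 95, 65]

errors         = [a - p for a, p in zip(actual, predicted)]   # step 1: errors
squared_errors = [e ** 2 for e in errors]                     # step 2: square each
mse            = sum(squared_errors) / len(squared_errors)    # step 3: average -> MSE
rmse           = mse ** 0.5                                   # step 4: square root -> RMSE

print(mse)              # 20.8
print(round(rmse, 2))   # 4.56
```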

Manual Walkthrough — Step by Step

Same house price data:

House   Actual   Predicted   Error   Error²
1       50       45           5       25
2       80       85          -5       25
3       60       58           2        4
4       90       95          -5       25
5       70       65           5       25

Step 1 — Sum of squared errors:

25 + 25 + 4 + 25 + 25 = 104

Step 2 — MSE:

MSE = 104 / 5 = 20.8

Step 3 — RMSE:

RMSE = √20.8 = 4.56

Result: RMSE = 4.56 Lakhs

Now you can tell your manager — "on average, our model is wrong by ₹4.56 Lakhs" — and they actually understand it.


MAE vs RMSE — Same Unit, Different Behavior

Both are now in Lakhs. Let's compare them on the same data:

MAE  = 4.4  Lakhs
RMSE = 4.56 Lakhs

RMSE is slightly higher. Why? Because squaring gives extra weight to the bigger errors, so RMSE comes out at or above MAE — they are equal only when every error has the same magnitude.

This relationship is always true:

RMSE >= MAE   (always, without exception)

The gap between RMSE and MAE tells you something important about your model's errors.
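You can convince yourself of the inequality with a quick sketch — on any error list you generate, RMSE never comes out below MAE:

```python
import random

# Check RMSE >= MAE on 100 random error lists (a demo, not a proof --
# the guarantee itself follows from the power-mean inequality)
random.seed(42)
for _ in range(100):
    errs = [random.uniform(-10, 10) for _ in range(20)]
    mae  = sum(abs(e) for e in errs) / len(errs)
    rmse = (sum(e ** 2 for e in errs) / len(errs)) ** 0.5
    assert rmse >= mae   # holds on every trial

print("RMSE >= MAE held in all 100 trials")
```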


The Gap Between RMSE and MAE — This is Gold

Gap                      What it means
RMSE ≈ MAE (small gap)   Errors are consistent — no big outlier mistakes
RMSE >> MAE (big gap)    Model is making some very large errors somewhere

mae  = 4.4
rmse = 4.56
gap  = rmse - mae   # small gap = consistent errors, model is stable

mae2  = 4.4
rmse2 = 18.7
gap2  = rmse2 - mae2  # huge gap = some predictions are badly wrong

In real projects, checking this gap is a quick way to detect if your model has an outlier problem.
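One way to package that check is a small helper — note that the function name and the ratio threshold below are my own choices for illustration, not a standard API:

```python
import numpy as np

def gap_report(actual, predicted, ratio_threshold=1.5):
    """Compare RMSE to MAE; a large RMSE/MAE ratio hints at outlier errors.
    The 1.5 threshold is an illustrative default, not a standard value."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    errors = actual - predicted
    mae  = float(np.mean(np.abs(errors)))
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    if rmse > ratio_threshold * mae:
        flag = "outlier errors likely"
    else:
        flag = "errors look consistent"
    return mae, rmse, flag

# Consistent model: RMSE barely above MAE
print(gap_report([50, 80, 60, 90, 70], [45, 85, 58, 95, 65]))
# Outlier model: RMSE far above MAE
print(gap_report([50, 80, 60, 90, 70], [50, 80, 60, 89, 30]))
```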


Python Program


    import numpy as np
    import pandas as pd
    from sklearn.metrics import mean_squared_error, mean_absolute_error
    import matplotlib.pyplot as plt

    # --- Data ---
    actual    = [50, 80, 60, 90, 70]
    predicted = [45, 85, 58, 95, 65]

    # --- Manual Calculation ---
    errors         = [a - p for a, p in zip(actual, predicted)]
    squared_errors = [e**2 for e in errors]
    mse_manual     = sum(squared_errors) / len(squared_errors)
    rmse_manual    = mse_manual ** 0.5   # square root

    print("=== Manual Calculation ===")
    print(f"Errors         : {errors}")
    print(f"Squared Errors : {squared_errors}")
    print(f"MSE            : {mse_manual}")
    print(f"RMSE           : {rmse_manual:.4f}")

    # --- Using NumPy ---
    rmse_numpy = np.sqrt(np.mean((np.array(actual) - np.array(predicted))**2))
    print(f"\nRMSE (numpy)   : {rmse_numpy:.4f}")

    # --- Using Scikit-learn ---
    mse     = mean_squared_error(actual, predicted)
    rmse    = np.sqrt(mse)
    mae     = mean_absolute_error(actual, predicted)

    print(f"RMSE (sklearn) : {rmse:.4f}")

    # --- The Gap Analysis ---
    print("\n=== MAE vs RMSE Gap Analysis ===")
    print(f"MAE  : {mae:.4f}")
    print(f"RMSE : {rmse:.4f}")
    print(f"Gap  : {rmse - mae:.4f}  ({'small - model is consistent' if (rmse - mae) < 2 else 'large - model has big error somewhere'})")

    # --- Outlier Comparison ---
    actual2    = [50, 80, 60, 90, 70]
    predicted2 = [50, 80, 60, 89, 30]   # last one is way off

    mae2  = mean_absolute_error(actual2, predicted2)
    rmse2 = np.sqrt(mean_squared_error(actual2, predicted2))

    print("\n=== Normal Model vs Outlier Model ===")
    print(f"Normal Model  → MAE: {mae:.2f}  | RMSE: {rmse:.2f}  | Gap: {rmse-mae:.2f}")
    print(f"Outlier Model → MAE: {mae2:.2f} | RMSE: {rmse2:.2f} | Gap: {rmse2-mae2:.2f}")
    print("Notice how RMSE explodes for outlier model but MAE stays modest")

    # --- Plot ---
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Plot 1 - Actual vs Predicted with error lines
    axes[0].plot(range(1, 6), actual,    label='Actual',    marker='o', linewidth=2)
    axes[0].plot(range(1, 6), predicted, label='Predicted', marker='s', linewidth=2, linestyle='--')
    for i in range(5):
        axes[0].vlines(i+1, min(actual[i], predicted[i]),
                            max(actual[i], predicted[i]),
                            colors='red', linewidth=2, alpha=0.6)
    axes[0].set_title(f'Actual vs Predicted\nMAE={mae:.2f} | RMSE={rmse:.2f}')
    axes[0].set_xlabel('House')
    axes[0].set_ylabel('Price (Lakhs)')
    axes[0].legend()
    axes[0].grid(True)

    # Plot 2 - MAE vs RMSE bar comparison across both models
    metrics  = ['MAE', 'RMSE']
    normal   = [mae, rmse]
    outlier  = [mae2, rmse2]
    x        = np.arange(len(metrics))
    width    = 0.35

    axes[1].bar(x - width/2, normal,  width, label='Normal Model',  color='steelblue', alpha=0.8)
    axes[1].bar(x + width/2, outlier, width, label='Outlier Model', color='tomato',    alpha=0.8)
    axes[1].set_title('MAE vs RMSE — Normal vs Outlier Model')
    axes[1].set_xticks(x)
    axes[1].set_xticklabels(metrics)
    axes[1].set_ylabel('Error Value')
    axes[1].legend()
    axes[1].grid(True, axis='y')

    plt.tight_layout()
    plt.savefig('rmse_plot.png')
    plt.show()
    print("\nPlot saved!")


Output:

=== Manual Calculation ===
Errors         : [5, -5, 2, -5, 5]
Squared Errors : [25, 25, 4, 25, 25]
MSE            : 20.8
RMSE           : 4.5607

RMSE (numpy)   : 4.5607
RMSE (sklearn) : 4.5607

=== MAE vs RMSE Gap Analysis ===
MAE  : 4.4000
RMSE : 4.5607
Gap  : 0.1607  (small - model is consistent)

=== Normal Model vs Outlier Model ===
Normal Model  → MAE: 4.40  | RMSE: 4.56  | Gap: 0.16
Outlier Model → MAE: 8.20 | RMSE: 17.89 | Gap: 9.69
Notice how RMSE explodes for outlier model but MAE stays modest

Plot saved!

How to Read RMSE in Real Projects

RMSE is in the same unit as your target. So interpretation is direct:

Target Variable        RMSE = 4.56 means
House Price (Lakhs)    Wrong by ₹4.56L on average (with big error penalty)
Temperature (°C)       Wrong by 4.56°C on average
Sales (units)          Wrong by 4.56 units on average

Quick sanity check in code:


    rmse = np.sqrt(mean_squared_error(y_test, y_pred))

    target_range = y_test.max() - y_test.min()
    rmse_pct     = (rmse / target_range) * 100

    print(f"RMSE             : {rmse:.2f}")
    print(f"Target Range     : {target_range:.2f}")
    print(f"RMSE as % range  : {rmse_pct:.1f}%")

    # Rule of thumb
    if rmse_pct < 10:
        print("Model is very good")
    elif rmse_pct < 20:
        print("Model is decent")
    else:
        print("Model needs improvement")


MAE vs MSE vs RMSE — Full Picture

                        MAE                MSE                RMSE
Formula                 avg of |errors|    avg of errors²     √MSE
Unit                    Same as target     Squared            Same as target
Big error penalty       No                 Yes, very heavy    Yes, heavy
Outlier sensitive       No — robust        Very sensitive     Sensitive
Interpretable           Best               Worst              Good
Used as loss function   Sometimes          Yes                Sometimes
Use when                Outliers exist     Training models    Evaluating models


The Golden Rule in Real Projects


    # Always report all three together
    mae  = mean_absolute_error(y_test, y_pred)
    mse  = mean_squared_error(y_test, y_pred)
    rmse = np.sqrt(mse)

    print(f"MAE  : {mae:.2f}")    # average error, simple
    print(f"MSE  : {mse:.2f}")    # for reference, used in training
    print(f"RMSE : {rmse:.2f}")   # main reporting metric

    # Then check the gap
    print(f"Gap (RMSE - MAE): {rmse - mae:.2f}")
    # Small gap = consistent model
    # Large gap = outlier errors hiding somewhere

In job interviews and real projects — RMSE is the most commonly reported regression metric. MAE is used when you need simplicity or have lots of outliers. MSE is mostly seen inside model training.


One Line Summary

RMSE is MSE with a square root — it keeps MSE's ability to punish large errors heavily, but brings the unit back to the same scale as your data, making it the most widely used and reported regression evaluation metric.

Mean Squared Error (MSE)

Start With The Problem — Where MAE Falls Short

Same house price example. Two different models:

House   Actual   Model A Predicted   Model B Predicted
1       50       49                  50
2       80       79                  80
3       60       59                  60
4       90       79                  89
5       70       69                  30


Model A errors: 1, 1, 1, 11, 1 → MAE = 3.0

Model B errors: 0, 0, 0, 1, 40 → MAE = 8.2

Ok, so MAE clearly shows Model B is worse here. But now imagine both models had the same MAE = 5.

Model A errors: 5, 5, 5, 5, 5 → MAE = 5

Model B errors: 1, 1, 1, 1, 21 → MAE = 5 ← one massive mistake!

MAE says both are equal. But clearly Model B is dangerous — it made one huge blunder. You want to catch and punish big errors harder.

That's exactly what MSE does.
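Here is that equal-MAE scenario in code — both error lists (taken from the text above) average to 5, but squaring exposes the blunder:

```python
model_a_errors = [5, 5, 5, 5, 5]     # consistent, all moderate errors
model_b_errors = [1, 1, 1, 1, 21]    # one massive mistake

def mae(errs):
    return sum(abs(e) for e in errs) / len(errs)

def mse(errs):
    return sum(e ** 2 for e in errs) / len(errs)

print(mae(model_a_errors), mae(model_b_errors))   # 5.0 5.0 -- identical
print(mse(model_a_errors), mse(model_b_errors))   # 25.0 89.0 -- MSE flags Model B
```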


What is MSE?

MSE = Average of SQUARED differences between actual and predicted values

Same 3 steps as MAE but with one change:

  1. Find the error (Actual − Predicted)
  2. Square each error (instead of absolute value)
  3. Take the average

Squaring does two things — makes everything positive AND punishes big errors much harder.


The Formula

MSE = (1/n) × Σ (Actual - Predicted)²

Same as MAE formula, just square instead of absolute value.


Manual Walkthrough — Step by Step

House   Actual   Predicted   Error   Error²
1       50       45           5       25
2       80       85          -5       25
3       60       58           2        4
4       90       95          -5       25
5       70       65           5       25

Step 1 — Sum of squared errors:

25 + 25 + 4 + 25 + 25 = 104

Step 2 — Divide by n (5 houses):

MSE = 104 / 5 = 20.8

Result: MSE = 20.8


The Squaring Effect — This is the KEY idea

See what squaring does to errors of different sizes:

Error   After Absolute (MAE)   After Squaring (MSE)
1       1                      1
2       2                      4
5       5                      25
10      10                     100
20      20                     400

A 2x bigger error gets a 4x bigger penalty in MSE.

A 10x bigger error gets a 100x bigger penalty in MSE.

This is why MSE is said to heavily penalize large errors. Small errors barely matter, big errors scream loudly.


Visual — MAE vs MSE on Same Error

Error = 1  →  MAE adds 1    |  MSE adds 1
Error = 2  →  MAE adds 2    |  MSE adds 4
Error = 5  →  MAE adds 5    |  MSE adds 25
Error = 10 →  MAE adds 10   |  MSE adds 100  ← huge jump
Error = 20 →  MAE adds 20   |  MSE adds 400  ← MSE going crazy

Model with one huge mistake will have a massive MSE even if all other predictions are perfect.


Python Program


    import numpy as np
    import pandas as pd
    from sklearn.metrics import mean_squared_error
    import matplotlib.pyplot as plt

    # --- Data ---
    actual    = [50, 80, 60, 90, 70]
    predicted = [45, 85, 58, 95, 65]

    # --- Manual Calculation ---
    errors         = [a - p for a, p in zip(actual, predicted)]
    squared_errors = [e**2 for e in errors]
    mse_manual     = sum(squared_errors) / len(squared_errors)

    print("=== Manual Calculation ===")
    print(f"Errors         : {errors}")
    print(f"Squared Errors : {squared_errors}")
    print(f"MSE (manual)   : {mse_manual}")

    # --- Using NumPy ---
    mse_numpy = np.mean((np.array(actual) - np.array(predicted))**2)
    print(f"\nMSE (numpy)    : {mse_numpy}")

    # --- Using Scikit-learn ---
    mse_sklearn = mean_squared_error(actual, predicted)
    print(f"MSE (sklearn)  : {mse_sklearn}")

    # --- DataFrame view ---
    df = pd.DataFrame({
        'Actual'         : actual,
        'Predicted'      : predicted,
        'Error'          : errors,
        'Squared Error'  : squared_errors
    })
    print(f"\n{df.to_string(index=False)}")

    # --- Comparing MAE vs MSE on a bad outlier model ---
    actual2    = [50, 80, 60, 90, 70]
    predicted2 = [50, 80, 60, 89, 30]   # last prediction is way off

    from sklearn.metrics import mean_absolute_error
    print("\n=== Outlier Effect Comparison ===")
    print(f"Normal Model  → MAE: {mean_absolute_error(actual, predicted):.2f}  | MSE: {mean_squared_error(actual, predicted):.2f}")
    print(f"Outlier Model → MAE: {mean_absolute_error(actual2, predicted2):.2f} | MSE: {mean_squared_error(actual2, predicted2):.2f}")

    # --- Plot ---
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Plot 1 - Actual vs Predicted
    axes[0].plot(range(1, 6), actual,    label='Actual',    marker='o', linewidth=2)
    axes[0].plot(range(1, 6), predicted, label='Predicted', marker='s', linewidth=2, linestyle='--')
    for i in range(5):
        axes[0].vlines(i+1, min(actual[i], predicted[i]),
                            max(actual[i], predicted[i]),
                            colors='red', linewidth=2, alpha=0.6)
    axes[0].set_title('Actual vs Predicted')
    axes[0].set_xlabel('House')
    axes[0].set_ylabel('Price (Lakhs)')
    axes[0].legend()
    axes[0].grid(True)

    # Plot 2 - Squared errors bar chart
    axes[1].bar(range(1, 6), squared_errors, color='tomato', alpha=0.8)
    axes[1].set_title(f'Squared Errors per House (MSE = {mse_sklearn})')
    axes[1].set_xlabel('House')
    axes[1].set_ylabel('Squared Error')
    axes[1].grid(True, axis='y')

    plt.tight_layout()
    plt.savefig('mse_plot.png')
    plt.show()
    print("\nPlot saved!")


Output:

=== Manual Calculation ===
Errors         : [5, -5, 2, -5, 5]
Squared Errors : [25, 25, 4, 25, 25]
MSE (manual)   : 20.8

MSE (numpy)    : 20.8
MSE (sklearn)  : 20.8

 Actual  Predicted  Error  Squared Error
     50         45      5             25
     80         85     -5             25
     60         58      2              4
     90         95     -5             25
     70         65      5             25

=== Outlier Effect Comparison ===
Normal Model  → MAE: 4.40  | MSE: 20.80
Outlier Model → MAE: 8.20 | MSE: 320.20

Plot saved!


The Big Weakness of MSE — Units Get Weird

MAE of 4.4 Lakhs → means model is wrong by ₹4.4L on average. Easy to understand.

MSE of 20.8 → 20.8 what? Lakhs²? That unit makes no real sense.

Price is in Lakhs
Error is in Lakhs
Error² is in Lakhs² ← no human thinks in Lakhs squared

This is MSE's main weakness — not directly interpretable in real world terms.

Solution → RMSE (Root Mean Squared Error) — just take square root of MSE, unit comes back to normal. That's the next concept.
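The fix really is one line — taking the square root brings the unit back to Lakhs (a minimal sketch using the MSE computed above):

```python
import math

mse  = 20.8              # in Lakhs² — hard to interpret
rmse = math.sqrt(mse)    # back to plain Lakhs

print(round(rmse, 2))    # 4.56
```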


MAE vs MSE — Clear Comparison

                     MAE                      MSE
Formula              Average of |errors|      Average of errors²
Unit                 Same as target           Squared unit
Big error penalty    Equal weight             Very heavy penalty
Outlier sensitive    No — robust              Yes — very sensitive
Use when             Outliers exist in data   Big mistakes are unacceptable
Differentiable       Not at zero              Everywhere (good for gradient descent)


Why MSE is Loved in ML Training

This is important — MSE is not just an evaluation metric. It's often used as the loss function during model training itself.


    # Linear Regression internally minimizes MSE during training
    from sklearn.linear_model import LinearRegression

    model = LinearRegression()  # this model minimizes MSE by default
    model.fit(X_train, y_train)

Why? Because MSE is smooth and differentiable everywhere — gradient descent can flow through it cleanly. MAE has a kink at 0 which causes issues in optimization.
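A tiny sketch of why that kink matters: the gradient of a squared error passes smoothly through zero and shrinks as the error shrinks, while the gradient of an absolute error jumps abruptly from -1 to +1 (the function names here are just illustrative):

```python
def grad_squared(error):
    # d/de (e²) = 2e — smooth everywhere, shrinks gently near zero
    return 2 * error

def grad_absolute(error):
    # d/de |e| = sign(e) — undefined at 0, jumps between -1 and +1
    return (error > 0) - (error < 0)

# Near zero, the squared-error gradient tapers off; the absolute-error
# gradient stays at full magnitude until it flips sign
for e in [-2, -0.1, 0.1, 2]:
    print(e, grad_squared(e), grad_absolute(e))
```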


Real ML Project Usage


    from sklearn.metrics import mean_squared_error, mean_absolute_error
    import numpy as np

    y_pred = model.predict(X_test)

    mae  = mean_absolute_error(y_test, y_pred)
    mse  = mean_squared_error(y_test, y_pred)
    rmse = np.sqrt(mse)   # fix the unit problem

    print(f"MAE  : {mae:.2f}")   # avg error, human readable
    print(f"MSE  : {mse:.2f}")   # penalizes big errors, but weird unit
    print(f"RMSE : {rmse:.2f}")  # best of both worlds

    # Always report all three together in real projects

In real projects — always calculate MAE, MSE, and RMSE together. Each tells you something slightly different about your model's behavior.


One Line Summary

MSE squares every error before averaging — making it extremely sensitive to large mistakes, which is perfect when big prediction errors are costly and unacceptable in your use case.
