The Problem with SMA First
Remember SMA — it gives equal weight to all values in the window.
For a 3-day SMA on sales:
Day 1: 200, Day 2: 450, Day 3: 180
SMA = (200 + 450 + 180) / 3 = 276.67
Here, Day 1 (old data) and Day 3 (today) are treated equally. But think about it — should 2-day-old data matter as much as today's data?
In many real cases the answer is no: the most recent data usually says more about where things are heading.
That's exactly what EMA fixes.
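As a quick check of the SMA arithmetic above, here is a minimal Python sketch that averages the last n values with equal weights. The function name sma is only illustrative.

```python
def sma(values, n):
    """Simple moving average of the last n values (every value weighted equally)."""
    window = values[-n:]
    return sum(window) / len(window)

sales = [200, 450, 180]          # Day 1, Day 2, Day 3
print(round(sma(sales, 3), 2))   # 276.67
```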
What is EMA?
EMA gives MORE weight to recent values and LESS weight to older values.
The further back a value is, the less it influences the average. Recent values dominate.
Weight Concept — Simple Visual
For a 3-period EMA, weights look like this:
| Data Point | Weight |
|---|---|
| Today (most recent) | Highest ⬆️ |
| Yesterday | Medium |
| Day before | Low |
| Even older | Very Low (almost ignored) |
Compare this to SMA where every day gets exactly equal weight.
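To put rough numbers on that table: if you unroll the EMA recursion, a value that is k periods old ends up with weight α × (1 − α)^k (ignoring the special handling of the very first value). The sketch below assumes α = 0.5, which is the smoothing factor the formula section derives for N = 3; the helper name ema_weights is hypothetical.

```python
def ema_weights(alpha, periods):
    """Weight given to the value k periods back, for k = 0 .. periods - 1."""
    return [alpha * (1 - alpha) ** k for k in range(periods)]

weights = ema_weights(0.5, 4)    # today, yesterday, day before, even older
print(weights)                   # [0.5, 0.25, 0.125, 0.0625]
```

Each step back halves the weight here, which is why values only a few days old are already close to ignored.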
The Formula
EMA today = (Today's Value × α) + (Yesterday's EMA × (1 - α))
Where α (alpha) is the smoothing factor:
α = 2 / (N + 1)
For N = 3:
α = 2 / (3 + 1) = 0.5
That means — 50% weight to today, 50% to the past EMA.
For N = 10:
α = 2 / (10 + 1) ≈ 0.18
A smaller α gives a smoother line: each new value moves the average less, so older data keeps more influence.
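A small sketch of how α shrinks as N grows, using the formula above. The function name smoothing_factor is an illustrative choice, not a library function.

```python
def smoothing_factor(n):
    """alpha = 2 / (N + 1)."""
    return 2 / (n + 1)

for n in (3, 10, 21):
    print(n, round(smoothing_factor(n), 3))
# 3 0.5
# 10 0.182
# 21 0.091
```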
Manual Walkthrough — Step by Step
Daily Sales data:
| Day | Sales |
|---|---|
| 1 | 200 |
| 2 | 450 |
| 3 | 180 |
| 4 | 500 |
| 5 | 220 |
Using N = 3, so α = 0.5
Step 1 — Day 1: No previous EMA exists, so EMA = first value itself
EMA(1) = 200
Step 2 — Day 2:
EMA(2) = (450 × 0.5) + (200 × 0.5)
= 225 + 100
= 325
Step 3 — Day 3:
EMA(3) = (180 × 0.5) + (325 × 0.5)
= 90 + 162.5
= 252.5
Step 4 — Day 4:
EMA(4) = (500 × 0.5) + (252.5 × 0.5)
= 250 + 126.25
= 376.25
Step 5 — Day 5:
EMA(5) = (220 × 0.5) + (376.25 × 0.5)
= 110 + 188.125
= 298.125
Final result:
| Day | Sales | EMA (N=3) |
|---|---|---|
| 1 | 200 | 200 |
| 2 | 450 | 325 |
| 3 | 180 | 252.5 |
| 4 | 500 | 376.25 |
| 5 | 220 | 298.125 |
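Notice that the EMA moves halfway toward each new value (α = 0.5): on Day 4 it jumps from 252.5 to 376.25 in response to the spike to 500. On this zig-zag data the 3-day SMA lands at nearly the same value (376.67), so the EMA's faster reaction is easier to see when the level shifts and stays there.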
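The manual steps can be reproduced with a short loop. This is a minimal sketch of the recursive formula, seeded with the first value as in Step 1; the function name ema_series is hypothetical.

```python
def ema_series(values, n):
    """EMA via the recursive formula, seeded with the first value."""
    alpha = 2 / (n + 1)
    result = [values[0]]                     # Day 1: EMA = first value itself
    for value in values[1:]:
        # today's value * alpha + yesterday's EMA * (1 - alpha)
        result.append(value * alpha + result[-1] * (1 - alpha))
    return result

sales = [200, 450, 180, 500, 220]
print(ema_series(sales, 3))   # [200, 325.0, 252.5, 376.25, 298.125]
```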
SMA vs EMA — Side by Side
| Feature | SMA | EMA |
|---|---|---|
| Weight to all values | Equal | More to recent |
| Reacts to sudden change | Slow | Fast |
| Smoother line | Yes | Slightly less smooth |
| NaN at start | Yes (first N-1 rows) | No |
| Best for | Long-term trend | Short-term, fast signals |
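The "reacts to sudden change" row is easiest to see on a step change. The sketch below uses made-up data where the level jumps from 100 to 200 and stays there; both helper functions are illustrative, pure-Python stand-ins rather than library calls.

```python
def sma_series(values, n):
    """Rolling mean; None until n values are available."""
    return [None if i + 1 < n else sum(values[i + 1 - n:i + 1]) / n
            for i in range(len(values))]

def ema_series(values, n):
    """Recursive EMA seeded with the first value."""
    alpha = 2 / (n + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(v * alpha + out[-1] * (1 - alpha))
    return out

step = [100, 100, 100, 200, 200, 200]   # hypothetical data: level jumps on the 4th point
sma = sma_series(step, 3)
ema = ema_series(step, 3)
for value, s, e in zip(step, sma, ema):
    print(value, s, e)
```

On the first point after the jump the EMA is at 150 while the SMA is near 133, so the EMA moves first. Note that the SMA reaches 200 exactly once the window is full of new values, while the EMA only approaches it.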
Python Program
import pandas as pd
import matplotlib.pyplot as plt

# --- Data ---
data = {
    'day': list(range(1, 16)),
    'sales': [200, 450, 180, 500, 220, 480, 210, 460, 190, 510, 230, 490, 200, 470, 215]
}
df = pd.DataFrame(data)

# --- Calculate SMA and EMA ---
df['SMA_3'] = df['sales'].rolling(window=3).mean()
df['EMA_3'] = df['sales'].ewm(span=3, adjust=False).mean()  # EMA with N=3
df['EMA_7'] = df['sales'].ewm(span=7, adjust=False).mean()  # EMA with N=7
print(df.to_string(index=False))

# --- Plot ---
plt.figure(figsize=(12, 5))
plt.plot(df['day'], df['sales'], label='Raw Sales', marker='o', linewidth=1.5, alpha=0.6)
plt.plot(df['day'], df['SMA_3'], label='SMA (3-day)', linewidth=2, linestyle='--')
plt.plot(df['day'], df['EMA_3'], label='EMA (3-day)', linewidth=2)
plt.plot(df['day'], df['EMA_7'], label='EMA (7-day)', linewidth=2)
plt.title('SMA vs EMA Comparison')
plt.xlabel('Day')
plt.ylabel('Sales')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.savefig('ema_vs_sma.png')
plt.show()
print("Plot saved!")
Output:
 day  sales      SMA_3      EMA_3      EMA_7
   1    200        NaN 200.000000 200.000000
   2    450        NaN 325.000000 262.500000
   3    180 276.666667 252.500000 241.875000
   4    500 376.666667 376.250000 306.406250
   5    220 300.000000 298.125000 284.804688
   6    480 400.000000 389.062500 333.603516
   7    210 303.333333 299.531250 302.702637
   8    460 383.333333 379.765625 342.026978
   9    190 286.666667 284.882812 304.020233
  10    510 386.666667 397.441406 355.515175
  11    230 310.000000 313.720703 324.136381
  12    490 410.000000 401.860352 365.602286
  13    200 306.666667 300.930176 324.201714
  14    470 386.666667 385.465088 360.651286
  15    215 295.000000 300.232544 324.238464
Plot saved!
Key Things to Remember
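ewm(span=3): span is the N in α = 2 / (N + 1). It plays the role that window plays in rolling, but it is not a hard cutoff, since older values keep a small, shrinking weight.
adjust=False: uses the recursive formula shown above, so results match the manual walkthrough. The pandas default, adjust=True, instead normalizes the weights over the values seen so far, which gives different numbers in the first rows.
No NaN: EMA produces a value from Day 1, unlike a rolling SMA, which returns NaN until it has N values.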
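To see what the adjust flag changes, the sketch below runs both settings on the first three sales values from the walkthrough. It assumes pandas is installed; the variable names are illustrative.

```python
import pandas as pd

sales = pd.Series([200, 450, 180])

recursive = sales.ewm(span=3, adjust=False).mean()  # recursive formula, matches the walkthrough
adjusted = sales.ewm(span=3, adjust=True).mean()    # pandas default: normalized weights

print(recursive.tolist())            # [200.0, 325.0, 252.5]
print(adjusted.round(2).tolist())    # [200.0, 366.67, 260.0]
```

The two versions differ most in the first rows and drift closer together as more data arrives, so the choice matters mainly for short series or when you need to reproduce a hand calculation.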
Where EMA is Used in ML
# Feature Engineering with EMA
df['ema_3'] = df['sales'].ewm(span=3, adjust=False).mean()    # short trend
df['ema_7'] = df['sales'].ewm(span=7, adjust=False).mean()    # medium trend
df['ema_21'] = df['sales'].ewm(span=21, adjust=False).mean()  # long trend

# EMA reacts faster, which helps when detecting sudden changes (fraud, anomaly)
df['deviation_from_ema'] = df['sales'] - df['ema_7']  # how far today is from trend
deviation_from_ema can be a useful feature: a large positive or negative value means today's value sits far from the recent trend, which may signal something unusual. Features of this kind are common in anomaly detection and fraud detection models.
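As a rough illustration of how such a deviation could be turned into a flag, here is a pure-Python sketch. The sales numbers and the threshold of 100 are made up for the example; in practice the threshold would be tuned to the data or learned by the model.

```python
def ema_series(values, n):
    """Recursive EMA seeded with the first value."""
    alpha = 2 / (n + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(v * alpha + out[-1] * (1 - alpha))
    return out

sales = [200, 205, 198, 202, 400, 201, 199]   # hypothetical data with one spike on Day 5
ema_7 = ema_series(sales, 7)

# Same definition as deviation_from_ema: today's value minus today's EMA
deviation = [s - e for s, e in zip(sales, ema_7)]

THRESHOLD = 100                               # assumed cutoff, not a recommended value
flagged_days = [day for day, d in enumerate(deviation, start=1) if abs(d) > THRESHOLD]
print(flagged_days)                           # [5]
```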
One Line Summary
EMA is a smarter Moving Average — it remembers the past but pays more attention to what just happened, making it faster to react to real changes in data.
