We’ve all seen stocks that seem to move in sync, but how do you know if that relationship is built to last? Simple correlation can be a mirage, leading you into ‘Value Traps’ and failed pairs trades. In this post, we explore Cointegration and Granger Causality, the two statistical powerhouses that help reveal which stocks are ‘leashed’ together and which one is leading the dance. Stop trading on vibes and start trading on tested relationships.
Introduction
In a past post, we explored how Prewhitening helps us find “pure” signals between stocks. But as any seasoned investor knows, seeing two stocks move together today doesn’t mean they will stay together tomorrow.
If you want to build a truly robust trading strategy, especially for Pairs Trading, you need to move past simple correlation and look at the “Big Two” of time-series relationships: Cointegration and Granger Causality.
Cointegration: The “Drunken Walker” Metaphor
Standard correlation measures whether two stocks tend to move in the same direction. Cointegration measures whether the distance between them remains stable over time.
Imagine a drunk person walking a dog on a retractable leash.
- The person wanders randomly.
- The dog wanders randomly.
- Individually, they are unpredictable.
However, because of the leash, they can never drift too far apart. If the dog runs too far ahead, the leash pulls it back. In finance, this “leash” is Cointegration. If AAPL and MSFT are cointegrated, each price can wander on its own, but the gap between them (after scaling one price by a hedge ratio) keeps returning to a stable level.
Why it matters: If two stocks are cointegrated and they suddenly drift apart, there is a statistical basis for expecting them to move back toward each other, as long as the relationship holds. This is the foundation of Mean Reversion trading.
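To make the leash concrete, here is a minimal sketch on simulated data (not real prices). The names person, dog, and stranger, the 0.9 persistence, and the seed are all assumptions chosen for illustration. It checks informally whether a gap snaps back by estimating how strongly today's gap depends on yesterday's. This is a rough diagnostic, not a formal cointegration test.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# The "drunk": a pure random walk.
person = np.cumsum(rng.normal(0.0, 1.0, n))

# The "dog": the person's path plus a mean-reverting gap (the leash).
gap = np.zeros(n)
for t in range(1, n):
    gap[t] = 0.9 * gap[t - 1] + rng.normal(0.0, 1.0)
dog = person + gap

# A stranger on an independent random walk: no leash at all.
stranger = np.cumsum(rng.normal(0.0, 1.0, n))


def ar1_coefficient(x):
    """Least-squares slope of x[t] on x[t-1] after removing the mean.

    Values well below 1 suggest the series is pulled back to its mean;
    values near 1 suggest it wanders like a random walk.
    """
    x = x - x.mean()
    return float(x[:-1] @ x[1:] / (x[:-1] @ x[:-1]))


leashed_gap = dog - person
free_gap = stranger - person

print(f"Leashed gap:  AR(1) = {ar1_coefficient(leashed_gap):.3f}, std = {leashed_gap.std():.2f}")
print(f"Unleashed gap: AR(1) = {ar1_coefficient(free_gap):.3f}, std = {free_gap.std():.2f}")
```

The leashed gap should show a coefficient clearly below 1 and a modest spread, while the unleashed gap behaves like a random walk of its own.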
Granger Causality: Who is the Real Boss?
We often talk about “Leaders” and “Followers.” Granger Causality is the statistical test that settles the argument.
It doesn’t prove that Stock A causes Stock B in a physical sense. Instead, it asks: “Does knowing the past values of Stock A help me predict the future of Stock B better than just looking at the past of Stock B alone?”
- If yes: Stock A “Granger-causes” Stock B.
- If no: The relationship might just be a coincidence or driven by a third factor.
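The question above boils down to comparing two regressions: one that predicts B from its own past, and one that also gets A's past. Here is a minimal one-lag sketch on simulated returns. The series a and b, the 0.5 lead effect, and the helper names ssr and granger_f are assumptions for illustration, and the function only returns the F-statistic, not a p-value.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical returns where a leads b by one step.
a = rng.normal(0.0, 1.0, n)
b = np.zeros(n)
b[1:] = 0.5 * a[:-1] + rng.normal(0.0, 1.0, n - 1)


def ssr(y, X):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def granger_f(target, candidate):
    """F-statistic for 'candidate Granger-causes target', using one lag."""
    y = target[1:]
    const = np.ones(len(y))
    # Restricted model: target's own past only.
    restricted = np.column_stack([const, target[:-1]])
    # Unrestricted model: target's past plus candidate's past.
    unrestricted = np.column_stack([const, target[:-1], candidate[:-1]])
    ssr_r, ssr_u = ssr(y, restricted), ssr(y, unrestricted)
    dof = len(y) - unrestricted.shape[1]
    # One restriction (the single candidate lag), so the numerator divides by 1.
    return (ssr_r - ssr_u) / (ssr_u / dof)


f_ab = granger_f(b, a)  # does a's past help predict b?
f_ba = granger_f(a, b)  # does b's past help predict a?
print(f"a -> b: F = {f_ab:.2f}")
print(f"b -> a: F = {f_ba:.2f}")
```

A large F in one direction and a small one in the other is the leader/follower pattern the test looks for. Running it both ways matters because Granger causality is directional.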
When to use it?
Cointegration
- Question: Do these two stay together over the long term?
- The Goal: Pairs Trading: Buy the laggard, sell the leader when they drift.
Granger Causality
- Question: Does one stock’s history provide a “cheat code” for the other’s future?
- The Goal: Predictive Modeling: Identifying which “Leader” to watch.
Code Example
import numpy as np
import pandas as pd
import yfinance as yf
from statsmodels.tsa.stattools import coint, grangercausalitytests
# Download Price Data
tickers = ["AAPL", "MSFT"]
# Drop any rows with missing prices so the tests below see aligned, complete data
df = yf.download(tickers, start="2018-01-01", end=None, auto_adjust=True)["Close"].dropna()
# COINTEGRATION (Use Raw Prices)
# Goal: Is there a long-term 'leash' between these two?
score, p_value_coint, _ = coint(df['AAPL'], df['MSFT'])
print(f"--- Cointegration Results ---")
print(f"P-Value: {p_value_coint:.4f}")
if p_value_coint < 0.05:
    print("Result: Stocks are COINTEGRATED (The 'Leash' is strong).")
else:
    print("Result: No Cointegration found.\n")
# GRANGER CAUSALITY (Use Log Returns)
# Goal: Does AAPL's history help predict MSFT's future?
# We MUST use log returns to ensure stationarity.
returns = np.log(df / df.shift(1)).dropna()
print(f"--- Granger Causality Results (AAPL -> MSFT) ---")
# The second column (AAPL) is tested to see if it causes the first column (MSFT)
granger_results = grangercausalitytests(returns[['MSFT', 'AAPL']], maxlag=5, verbose=False)
# Extracting p-values for the 'ssr_chi2test' at each lag
for lag, results in granger_results.items():
    p_val = results[0]['ssr_chi2test'][1]
    print(f"Lag {lag}: P-Value = {p_val:.4f}")
Results
P-Value: 0.0433
Result: Stocks are COINTEGRATED (The 'Leash' is strong).
--- Granger Causality Results (AAPL -> MSFT) ---
Lag 1: P-Value = 0.0237
Lag 2: P-Value = 0.0841
Lag 3: P-Value = 0.0006
Lag 4: P-Value = 0.0009
Lag 5: P-Value = 0.0019
Interpretation of Results
Cointegration test
Since the p-value of the cointegration test is less than 0.05, we can reject the null hypothesis of no cointegration and conclude that the two stocks are cointegrated over this sample.
- What it means: The test points to a stable long-term relationship between the two prices. Even if one stock jumps ahead or falls behind today, the “leash” tends to pull the gap back toward its usual level. At 0.0433 the p-value is only just under the threshold, so treat the evidence as modest.
- The Strategy: This is the setup Pairs Trading looks for. If the gap (spread) between them gets unusually wide, the trade is a bet that it will “mean-revert” (return to normal).
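One common way to decide when the gap is “too wide” is a z-score of the spread. The sketch below uses simulated prices, and the 60-day window, the 2.0 entry threshold, and the helper name spread_zscore are illustrative assumptions, not recommendations. The hedge ratio is also fitted on the full sample for simplicity, which a real backtest would need to avoid.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Hypothetical cointegrated pair: y tracks 1.5 * x plus a mean-reverting gap.
x = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n))
gap = np.zeros(n)
for t in range(1, n):
    gap[t] = 0.8 * gap[t - 1] + rng.normal(0.0, 1.0)
y = 5.0 + 1.5 * x + gap


def spread_zscore(y, x, window=60):
    """Return the OLS hedge ratio and a rolling z-score of the spread."""
    # Hedge ratio from a full-sample fit of y on x (look-ahead; illustration only).
    slope, intercept = np.polyfit(x, y, 1)
    spread = y - (intercept + slope * x)
    z = np.full(len(spread), np.nan)
    for t in range(window, len(spread)):
        hist = spread[t - window:t]  # trailing window, excludes day t
        z[t] = (spread[t] - hist.mean()) / hist.std()
    return slope, z


beta, z = spread_zscore(y, x)

ENTRY = 2.0  # assumed threshold
# -1: short the spread (sell y, buy x); +1: long the spread; 0: stay flat.
signal = np.where(z > ENTRY, -1, np.where(z < -ENTRY, 1, 0))

print(f"Estimated hedge ratio: {beta:.2f}")
print(f"Days with an open signal: {int(np.count_nonzero(signal))}")
```

The first 60 days have no z-score (and so no signal) because there is not yet enough history to fill the window.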
Granger Causality
- Lag 1 (P = 0.0237): Significant at the 5% level. Apple’s return yesterday helps predict Microsoft’s return today.
- Lag 2 (P = 0.0841): Not significant at the 5% level. Note that each row tests all lags up to that number jointly, so this says that Apple’s last two days of returns, taken together, do not clearly improve the forecast.
- Lags 3, 4, and 5 (P < 0.002): Highly significant. This is the most interesting part of the output. Once three or more days of Apple’s history are included, the predictive signal is strong, which is consistent with a “wave effect” in which information from Apple takes a few days to show up in Microsoft’s returns.
Bottom Line
Correlation is a great starting point, but it’s shallow.
- Prewhitening cleans your signal.
- Cointegration finds the “leash.”
- Granger Causality finds the “boss.”
By combining these three, you aren’t just looking at charts anymore. You are deconstructing the hidden machinery of the market.