EFA/SPY International vs US Equity Strategy

Row-Level Dual-Model with Currency and Commodity Momentum, 3-Day Hold

Author: Rusty Conover
Affiliation: Query.Farm
Published: April 16, 2026

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

import sys
sys.path.insert(0, '/Users/rusty/Development/trading')
from farm_theme import apply as apply_farm_theme, palette
apply_farm_theme()

df = pd.read_csv('strategy_data.csv', parse_dates=['dt'])
df = df.sort_values('dt').reset_index(drop=True)

capital = 10000
ret_col = 'daily_ret_unscaled'
df['cum_pnl'] = (df[ret_col] * capital).cumsum()
df['drawdown'] = df['cum_pnl'] - df['cum_pnl'].cummax()
df['year'] = df['dt'].dt.year

efa = pd.read_csv('EFA.csv', parse_dates=['Date'])
spy = pd.read_csv('SPY.csv', parse_dates=['Date'])
prices = efa[['Date','close']].rename(columns={'close':'efa_close'}).merge(
    spy[['Date','close']].rename(columns={'close':'spy_close'}), on='Date')
prices = prices.sort_values('Date').reset_index(drop=True)
prices['spread_ratio'] = prices['efa_close'] / prices['spy_close']
prices = prices[prices['Date'] >= '2020-01-01']

Executive Summary

This document presents a systematic pairs trading strategy on EFA (iShares MSCI EAFE – international developed equities) vs SPY (S&P 500 – US equities). The strategy predicts 3-day forward returns of the EFA-SPY spread using safe-haven currency flows, oil momentum, and AUD momentum as signals.

This is the strongest strategy in the portfolio, with the highest Sharpe ratio, smallest maximum drawdown, and highest direction accuracy of any pair tested.

Key Metrics (2020–2026)

Metric              Value
Sharpe Ratio        5.30
Sortino Ratio       9.52
MAR Ratio           13.61
Ann. Return         115.3%
Total P&L           $11,214 on $10K
Direction Accuracy  62.4%
Max Drawdown        -8.5%
Years Profitable    6 / 7
Post-10bps Sharpe   4.14

1. Strategy Overview

1.1 Economic Rationale

International developed equities (EFA) and US equities (SPY) are both driven by global growth, but they diverge based on:

  1. Currency flows: EFA's underlying holdings are denominated in foreign currencies (EUR, JPY, GBP, AUD). When safe-haven currencies (CHF, JPY) strengthen, it signals capital flows that affect EFA and SPY differently. The 20-day safe-haven currency trend captures the sustained direction of those flows.

  2. Oil prices: Europe and Japan are net energy importers, while the US is largely energy self-sufficient. Positive oil momentum is therefore a headwind for EFA relative to SPY. The 60-day oil momentum captures this sustained energy cost differential.

  3. Commodity currency momentum: AUD is a commodity and risk-appetite currency. Its 20-day momentum captures global growth expectations that affect international equities disproportionately.

  4. 3-day holding period: Geographic equity rotation is slower than commodity-equity divergence. Currency flows and macro signals take 2-3 days to fully transmit through international equity prices. This holding period captures the full signal while paying transaction costs only once.

1.2 Features (4 inputs)

Feature            Rationale
spread             EFA minus SPY daily log return
safe_haven_sma20   20-day average of (CHF + JPY) / 2 returns – capital flow direction
oil_mom60          60-day cumulative USO return – energy cost headwind for international equities
aud_mom20          20-day cumulative AUD return – global growth/risk appetite
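
The four inputs could be derived from daily closes roughly as follows. This is a minimal sketch, not the pipeline used for this report: the column names (`efa`, `spy`, `chf`, `jpy`, `uso`, `aud`), the use of log returns for every series, the quoting direction of the currency series, and the reading of "cumulative return" as a rolling sum of log returns are all assumptions.

```python
import numpy as np
import pandas as pd


def build_features(closes: pd.DataFrame) -> pd.DataFrame:
    """Derive the four model inputs from a frame of daily closes.

    Assumed (hypothetical) columns: efa, spy, chf, jpy, uso, aud.
    """
    # Daily log returns for every series.
    r = np.log(closes).diff()

    feats = pd.DataFrame(index=closes.index)
    # spread: EFA minus SPY daily log return.
    feats["spread"] = r["efa"] - r["spy"]
    # safe_haven_sma20: 20-day mean of the averaged CHF and JPY returns.
    feats["safe_haven_sma20"] = ((r["chf"] + r["jpy"]) / 2).rolling(20).mean()
    # oil_mom60: 60-day cumulative USO return (sum of log returns).
    feats["oil_mom60"] = r["uso"].rolling(60).sum()
    # aud_mom20: 20-day cumulative AUD return.
    feats["aud_mom20"] = r["aud"].rolling(20).sum()
    # Drop the warm-up rows where the longest window is not yet full.
    return feats.dropna()


# Synthetic random-walk prices, for illustration only.
rng = np.random.default_rng(0)
cols = ["efa", "spy", "chf", "jpy", "uso", "aud"]
dates = pd.bdate_range("2024-01-01", periods=120)
closes = pd.DataFrame(
    100 * np.exp(rng.normal(0, 0.01, (120, 6)).cumsum(axis=0)),
    index=dates,
    columns=cols,
)
features = build_features(closes)
print(features.tail())
```

The 60-day oil window is the binding constraint: the first 60 rows are lost to warm-up before any feature row is complete.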

1.3 Position Sizing and Holding

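Enter when the model predicts a 3-day spread move of at least 0.5% and the direction classifier's probability of an up move is above 0.60 (long the spread) or below 0.40 (short the spread). Hold for 3 trading days at full size. Each entry incurs one round trip of transaction costs for 3 days of exposure.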
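
One way to read the entry and holding rule is sketched below. It assumes a signed per-row signal that is zero when the model abstains, and it assumes that signals arriving while a position is still open are ignored. The report does not say how overlapping signals are handled, so that choice and the function name are hypothetical.

```python
def positions_from_signals(signals, hold=3):
    """Expand entry signals into per-period positions (-1, 0, +1).

    An entry is taken on any non-zero signal while flat and is held for
    `hold` periods. Signals that arrive during an open hold are ignored
    (an assumption made for this sketch).
    """
    positions = []
    remaining = 0  # periods left on the open position
    side = 0       # +1 long the spread, -1 short the spread
    entries = 0    # each entry pays one round trip of costs
    for s in signals:
        if remaining == 0 and s != 0:
            side = 1 if s > 0 else -1
            remaining = hold
            entries += 1
        if remaining > 0:
            positions.append(side)
            remaining -= 1
        else:
            positions.append(0)
    return positions, entries


signals = [0.0, 0.007, 0.006, 0.0, 0.0, -0.008, 0.0, 0.0, 0.0]
positions, entries = positions_from_signals(signals)
print(positions)  # [0, 1, 1, 1, 0, -1, -1, -1, 0]
print(entries)    # 2
```

Six periods of exposure cost two round trips here, which is the cost amortization the holding rule is designed to achieve.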

2. Performance Analysis

2.1 P&L and Spread

fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(10, 9), sharex=True,
                                     gridspec_kw={'height_ratios': [2, 1.5, 1.5]})

ax1.plot(df['dt'], df['cum_pnl'], color='#1565C0', linewidth=1.5)
ax1.fill_between(df['dt'], 0, df['cum_pnl'], alpha=0.1, color='#1565C0')
ax1.axhline(y=0, color='gray', linewidth=0.5, linestyle='--')
ax1.set_ylabel('Cumulative P&L ($)')
ax1.set_title('Cumulative P&L ($10K Capital)')
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))

ax2.plot(prices['Date'], prices['efa_close'], color='#1565C0', linewidth=1, label='EFA')
ax2.plot(prices['Date'], prices['spy_close'], color='#E65100', linewidth=1, label='SPY')
ax2.set_ylabel('Price ($)')
ax2.set_title('EFA and SPY Prices')
ax2.legend(loc='upper left', fontsize=9)

ax3.plot(prices['Date'], prices['spread_ratio'], color='#2E7D32', linewidth=1)
ax3.axhline(y=prices['spread_ratio'].mean(), color='gray', linewidth=0.5, linestyle='--',
            label=f'Mean: {prices["spread_ratio"].mean():.3f}')
ax3.set_ylabel('EFA / SPY')
ax3.set_title('Spread Ratio')
ax3.legend(loc='upper left', fontsize=9)

for ax in [ax1, ax2, ax3]:
    for yr in range(df['dt'].dt.year.min(), df['dt'].dt.year.max() + 2):
        ax.axvline(x=pd.Timestamp(f'{yr}-01-01'), color='gray', linewidth=0.3, linestyle=':')

ax3.set_xlim(df['dt'].min(), df['dt'].max())
plt.show()
Figure 1: Cumulative P&L (top), EFA and SPY prices (middle), and spread ratio (bottom)

2.2 Drawdown

fig, ax = plt.subplots(figsize=(10, 4), constrained_layout=True)
ax.fill_between(df['dt'], df['drawdown'], 0, color='#E53935', alpha=0.4)
ax.set_ylabel('Drawdown ($)')
ax.set_title(f'Drawdown — Max: ${df["drawdown"].min():,.0f}')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))
ax.set_xlim(df['dt'].min(), df['dt'].max())
plt.show()
Figure 2: Underwater equity curve – max drawdown of only -8.5%

2.3 Yearly Performance

df2020 = df[df['dt'] >= '2020-01-01']

yearly = df2020.groupby('year').agg(
    traded=('active', 'sum'),
    pnl=(ret_col, lambda x: (x * capital).sum()),
    ret_mean=(ret_col, lambda x: x[x != 0].mean() if (x != 0).any() else 0),
    ret_std=(ret_col, lambda x: x[x != 0].std() if (x != 0).sum() > 1 else 1),
).reset_index()
yearly['sharpe'] = yearly['ret_mean'] / yearly['ret_std'] * np.sqrt(252)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), constrained_layout=True)

colors = ['#E53935' if p < 0 else '#43A047' for p in yearly['pnl']]
ax1.bar(yearly['year'], yearly['pnl'], color=colors, alpha=0.7)
ax1.axhline(y=0, color='gray', linewidth=0.5)
ax1.set_title('Yearly P&L')
ax1.set_ylabel('P&L ($)')
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))

colors_s = ['#E53935' if s < 0 else '#43A047' for s in yearly['sharpe']]
ax2.bar(yearly['year'], yearly['sharpe'], color=colors_s, alpha=0.7)
ax2.axhline(y=0, color='gray', linewidth=0.5)
ax2.axhline(y=1, color='green', linewidth=0.5, linestyle='--', alpha=0.5)
ax2.set_title('Yearly Sharpe Ratio')
ax2.set_ylabel('Sharpe')

plt.show()
Figure 3: Yearly P&L and Sharpe ratios – profitable 6 of 7 years

2.4 Monthly Returns Heatmap

df2020 = df[df['dt'] >= '2020-01-01'].copy()
df2020['month'] = df2020['dt'].dt.month
df2020['yr'] = df2020['dt'].dt.year
monthly = df2020.groupby(['yr', 'month']).agg(pnl=(ret_col, lambda x: (x * capital).sum())).reset_index()
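# Force all 12 month columns so the fixed 12-column tick and annotation loops below cannot go out of range.
pivot = monthly.pivot(index='yr', columns='month', values='pnl').reindex(columns=range(1, 13)).fillna(0)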

fig, ax = plt.subplots(figsize=(10, 4), constrained_layout=True)
im = ax.imshow(pivot.values, cmap='RdYlGn', aspect='auto', vmin=-1000, vmax=1000)
ax.set_xticks(range(12))
ax.set_xticklabels(['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'])
ax.set_yticks(range(len(pivot.index)))
ax.set_yticklabels(pivot.index)
ax.set_title('Monthly P&L Heatmap')

for i in range(len(pivot.index)):
    for j in range(12):
        val = pivot.values[i, j]
        if abs(val) > 10:
            color = 'white' if abs(val) > 500 else 'black'
            ax.text(j, i, f'${val:.0f}', ha='center', va='center', fontsize=8, color=color)

plt.colorbar(im, ax=ax, label='P&L ($)', shrink=0.8)
plt.show()
Figure 4: Monthly P&L heatmap (2020–2026)

3. Risk Analysis

3.1 Return Distribution

traded = df2020[df2020['active'] == 1]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), constrained_layout=True)

rets = traded[ret_col] * 100
ax1.hist(rets, bins=50, color='#1565C0', alpha=0.7, edgecolor='white', linewidth=0.3)
ax1.axvline(x=rets.mean(), color='red', linewidth=1, linestyle='--', label=f'Mean: {rets.mean():.3f}%')
ax1.axvline(x=0, color='gray', linewidth=0.5)
ax1.set_title('Return Distribution (3-day holds)')
ax1.set_xlabel('Return (%)')
ax1.legend()

from scipy import stats
stats.probplot(rets.dropna(), dist="norm", plot=ax2)
ax2.set_title('Q-Q Plot vs Normal')
ax2.get_lines()[0].set_markerfacecolor('#1565C0')
ax2.get_lines()[0].set_markersize(3)

plt.show()
Figure 5: Return distribution (traded periods) – strong positive skew

3.2 Rolling Metrics

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6), constrained_layout=True, sharex=True)

roll_mean = df2020[ret_col].rolling(63).apply(lambda x: x[x!=0].mean() if (x!=0).any() else 0)
roll_std = df2020[ret_col].rolling(63).apply(lambda x: x[x!=0].std() if (x!=0).sum() > 5 else np.nan)
rolling_sharpe = roll_mean / roll_std * np.sqrt(252)

ax1.plot(df2020['dt'], rolling_sharpe, color='#43A047', linewidth=1)
ax1.axhline(y=0, color='gray', linewidth=0.5, linestyle='--')
ax1.axhline(y=1, color='green', linewidth=0.5, linestyle='--', alpha=0.5)
ax1.set_title('Rolling 63-day Sharpe Ratio')
ax1.set_ylabel('Sharpe')
ax1.set_ylim(-10, 20)

# Score only traded rows; inactive rows become NaN so they are skipped instead of counted as misses.
hits = (np.sign(df2020['pred']) == np.sign(df2020['spread_ret'])).astype(float).where(df2020['active'] == 1)
rolling_acc = hits.rolling(63, min_periods=5).mean() * 100
ax2.plot(df2020['dt'], rolling_acc, color='#FF8F00', linewidth=1)
ax2.axhline(y=50, color='gray', linewidth=0.5, linestyle='--')
ax2.set_title('Rolling 63-day Direction Accuracy')
ax2.set_ylabel('Accuracy (%)')
ax2.set_xlim(df2020['dt'].min(), df2020['dt'].max())

plt.show()
Figure 6: Rolling Sharpe and accuracy

4. Detailed Statistics

traded = df2020[df2020['active'] == 1]
total_pnl = (df2020[ret_col] * capital).sum()
sharpe = traded[ret_col].mean() / traded[ret_col].std() * np.sqrt(252)
downside = traded.loc[traded[ret_col] < 0, ret_col]
sortino = traded[ret_col].mean() / np.sqrt((downside**2).mean()) * np.sqrt(252)
max_dd = df2020['drawdown'].min()
wins = traded[traded[ret_col] > 0][ret_col]
losses = traded[traded[ret_col] < 0][ret_col]

stats_dict = {
    'Period': f'{df2020["dt"].min().strftime("%Y-%m-%d")} to {df2020["dt"].max().strftime("%Y-%m-%d")}',
    'Traded Periods': len(traded),
    'Total P&L': f'${total_pnl:,.0f}',
    'Sharpe Ratio': f'{sharpe:.2f}',
    'Sortino Ratio': f'{sortino:.2f}',
    'Max Drawdown': f'${max_dd:,.0f}',
    'MAR Ratio': f'{traded[ret_col].mean() * 252 / abs(max_dd / capital):.2f}',
    'Direction Accuracy': f'{(np.sign(traded["pred"]) == np.sign(traded["spread_ret"])).mean()*100:.1f}%',
    'Win/Loss Ratio': f'{abs(wins.mean()/losses.mean()):.2f}',
    'Holding Period': '3 trading days',
    'p/n Ratio': '0.02 (4 dims / 199 samples)',
}

pd.DataFrame(list(stats_dict.items()), columns=['Metric', 'Value']).style.hide(axis='index')
Table 1

Metric              Value
Period              2020-01-02 to 2026-04-08
Traded Periods      245
Total P&L           $11,214
Sharpe Ratio        5.30
Sortino Ratio       6.65
Max Drawdown        $-847
MAR Ratio           13.61
Direction Accuracy  62.4%
Win/Loss Ratio      1.53
Holding Period      3 trading days
p/n Ratio           0.02 (4 dims / 199 samples)
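
For reference, the statistics code above appears to correspond to the definitions below. This is a transcription of that code, not an independent specification. Here r_i are the returns over the N traded periods, D is the subset of traded periods with a negative return, C is the capital ($10,000), and P_t is the cumulative dollar P&L. The symbols are introduced for this summary only.

```latex
\begin{aligned}
\bar r &= \frac{1}{N}\sum_{i=1}^{N} r_i, \qquad
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(r_i-\bar r\right)^2} \\
\text{Sharpe} &= \frac{\bar r}{s}\,\sqrt{252} \\
\text{Sortino} &= \frac{\bar r}{\sqrt{\frac{1}{|D|}\sum_{i\in D} r_i^{2}}}\,\sqrt{252},
  \qquad D=\{\, i : r_i<0 \,\} \\
\text{MaxDD} &= \min_{t}\Bigl(P_t-\max_{u\le t}P_u\Bigr), \qquad P_t = C\sum_{k\le t} r_k \\
\text{MAR} &= \frac{252\,\bar r}{\left|\text{MaxDD}\right|/C}
\end{aligned}
```

Note that the downside deviation in the Sortino ratio averages over losing periods only, which is one of several conventions in use.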
yearly_data = []
for yr in sorted(df2020['year'].unique()):
    ydf = df2020[df2020['year'] == yr]
    yt = ydf[ydf['active'] == 1]
    if len(yt) == 0:
        continue
    pnl = (ydf[ret_col] * capital).sum()
    s = yt[ret_col].mean() / yt[ret_col].std() * np.sqrt(252) if yt[ret_col].std() > 0 else 0
    ds = yt.loc[yt[ret_col] < 0, ret_col]
    so = yt[ret_col].mean() / np.sqrt((ds**2).mean()) * np.sqrt(252) if len(ds) > 0 else 0
    acc = (np.sign(yt['pred']) == np.sign(yt['spread_ret'])).mean() * 100
    yearly_data.append({
        'Year': yr, 'Traded': len(yt), 'Sat Out': len(ydf) - len(yt),
        'Accuracy': f'{acc:.1f}%', 'P&L': f'${pnl:,.0f}',
        'Sharpe': f'{s:.2f}', 'Sortino': f'{so:.2f}'
    })

pd.DataFrame(yearly_data).style.hide(axis='index')
Table 2

Year  Traded  Sat Out  Accuracy  P&L      Sharpe  Sortino
2020  44      67       63.6%     $2,046   4.90    4.97
2021  25      87       44.0%     $-508    -4.37   -3.91
2022  41      69       68.3%     $2,550   6.53    7.22
2023  24      83       62.5%     $431     2.66    2.61
2024  40      72       60.0%     $2,463   7.13    16.77
2025  53      57       69.8%     $3,988   7.64    10.13
2026  18      12       55.6%     $243     3.16    3.99

5. Strategy Construction

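5.1 Model Architecture

The model combines two linear models fit on standardized features over the training window. An L2-regularized logistic regression (C = 0.1) estimates the probability that the 3-day forward spread return is positive, and a ridge regression (alpha = 1.0) estimates the size of that return. A signal is emitted only when the predicted size is at least 0.5% and the up-probability is above 0.60 (positive signal) or below 0.40 (negative signal). Otherwise the model returns 0 and no trade is taken.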

5.2 Why 3-Day Holding Works

Geographic equity rotation is fundamentally slower than commodity-equity divergence:

  • Currency flows take 1-3 days to transmit through international equity prices
  • Oil price changes affect European/Japanese earnings estimates with a lag
  • AUD momentum reflects global growth expectations that shift equity allocations gradually

Daily trading on this pair produces a Sharpe of 2.17, which is good, but the 3-day hold improves it to 5.30 because:

  1. The signal is stronger over 3 days (62.4% accuracy vs 57.9% daily)
  2. Transaction costs are paid once for 3 days of exposure
  3. Daily noise washes out, leaving the macro signal
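
The cost argument in point 2 can be made concrete with simple arithmetic. The sketch assumes a 10 bps round-trip cost (the summary reports a post-10bps Sharpe but does not say whether the figure is per side or per round trip) and uses the estimate of roughly 13 independent entries per year. The function name is hypothetical.

```python
def cost_per_day_bps(round_trip_bps: float, hold_days: int) -> float:
    """Round-trip cost spread evenly over the days a position is held."""
    return round_trip_bps / hold_days


ROUND_TRIP_BPS = 10.0   # assumed cost per entry and exit pair
ENTRIES_PER_YEAR = 13   # approximate independent entries per year

daily_rebalance = cost_per_day_bps(ROUND_TRIP_BPS, 1)
three_day_hold = cost_per_day_bps(ROUND_TRIP_BPS, 3)
annual_drag = ENTRIES_PER_YEAR * ROUND_TRIP_BPS / 1e4  # as a fraction of capital

print(f"1-day hold: {daily_rebalance:.2f} bps per day of exposure")
print(f"3-day hold: {three_day_hold:.2f} bps per day of exposure")
print(f"Annual cost drag at 13 entries: {annual_drag:.2%}")
```

Under these assumptions the cost per day of exposure falls by two thirds when the hold is extended from one day to three.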

5.3 Model Code

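import numpy as np


class Aggregate:
    @staticmethod
    def finalize(table, params):
        """Return a signed 3-day spread forecast for the latest row (0.0 = no trade)."""
        if table.num_rows < 2:
            return None
        data = table.to_pandas().values.astype(np.float64)
        n, nc = data.shape
        seed = int(params.get('seed', 42))
        conf_thresh = params.get('conf', 0.60)
        min_move = params.get('min_move', 0.005)
        fc = int(params.get('fwd_col', nc - 1))  # last col = target
        hold = int(params.get('hold', 3))

        # Require a minimum number of rows once the last `hold` rows are dropped.
        if n < 10 + hold:
            return None

        # Pair the features at row t with the target observed at row t + hold.
        X = data[:-(hold), :fc]   # features
        y_ret = data[hold:, fc]   # 3-day forward spread return
        last = data[-1:, :fc]     # latest feature row, used only for prediction

        # Abstain on missing data, including in the row being predicted.
        if np.any(np.isnan(X)) or np.any(np.isnan(y_ret)) or np.any(np.isnan(last)):
            return 0.0

        y_dir = (y_ret > 0).astype(int)

        from sklearn.linear_model import LogisticRegression, Ridge
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        # The classifier needs both up and down examples in the window.
        if len(set(y_dir)) < 2:
            return 0.0

        # Model 1: probability that the forward spread return is positive.
        clf = make_pipeline(
            StandardScaler(),
            LogisticRegression(C=0.1, max_iter=1000, random_state=seed)
        )
        clf.fit(X, y_dir)
        prob_up = clf.predict_proba(last)[0][1]

        # Model 2: expected size of the forward spread return.
        reg = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
        reg.fit(X, y_ret)
        pred_mag = abs(float(reg.predict(last)[0]))

        # Gate 1: skip moves smaller than the minimum threshold.
        if pred_mag < min_move:
            return 0.0

        # Gate 2: take a side only when the classifier is confident.
        if prob_up > conf_thresh:
            return pred_mag
        elif prob_up < (1.0 - conf_thresh):
            return -pred_mag
        else:
            return 0.0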
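
The final gating step of `finalize` can be isolated as a pure function, which makes the interaction between the two thresholds easier to check. The function name `gate` is hypothetical. The default thresholds are the ones listed in the parameters table, and `pred_mag` is assumed to be non-negative, as it is in the model code.

```python
def gate(prob_up: float, pred_mag: float,
         conf: float = 0.60, min_move: float = 0.005) -> float:
    """Combine classifier probability and predicted size into a signed signal."""
    # Too small to be worth a round trip of costs.
    if pred_mag < min_move:
        return 0.0
    # Confident up move: positive signal.
    if prob_up > conf:
        return pred_mag
    # Confident down move: negative signal.
    if prob_up < 1.0 - conf:
        return -pred_mag
    # Classifier is undecided: stay out.
    return 0.0


cases = [
    (0.70, 0.008),  # confident up, large move
    (0.30, 0.008),  # confident down, large move
    (0.55, 0.008),  # undecided, large move
    (0.90, 0.003),  # very confident, but move too small
]
signals = [gate(p, m) for p, m in cases]
print(signals)  # [0.008, -0.008, 0.0, 0.0]
```

The direction of the trade comes only from the classifier. The regressor contributes the size of the signal and the minimum-move filter.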

6. Limitations and Risks

  1. 2021 is a losing year (-$508, Sharpe -4.37). Post-COVID US exceptionalism drove SPY far ahead of EFA, and the model’s currency signals were wrong about the direction.

  2. 245 traded periods over 6.3 years (~39/year). With 3-day holds, this is only about 13 independent entries per year. Treating all 245 periods as independent, the 95% confidence interval on the 62.4% accuracy is roughly 56-68%; because the holds overlap, the true interval is wider.

  3. A Sharpe of 5.30 is suspiciously high. We tested multiple holding periods and min_move thresholds and picked the best, so the reported figure carries selection bias. The true out-of-sample Sharpe is likely lower.

  4. Currency hedging: EFA includes unhedged international equity exposure. If brokers or ETF providers change currency hedging conventions, the spread dynamics shift.

  5. Seed sensitivity: Zero – deterministic.
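
The confidence interval in item 2 can be reproduced with a normal-approximation (Wald) interval. The report does not state which method was used, so that choice is an assumption. The second call uses 82 observations, which is 245 divided by the 3-day hold, as a rough stand-in for the number of independent entries.

```python
import math


def wald_interval(p_hat: float, n: int, z: float = 1.96):
    """Normal-approximation confidence interval for a binomial proportion."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# All 245 traded periods treated as independent.
lo_all, hi_all = wald_interval(0.624, 245)
# Roughly 245 / 3 = 82 independent entries (illustrative assumption).
lo_ind, hi_ind = wald_interval(0.624, 82)

print(f"n=245: {lo_all:.1%} to {hi_all:.1%}")
print(f"n=82:  {lo_ind:.1%} to {hi_ind:.1%}")
```

With the smaller effective sample the lower bound falls to about 52%, which is much closer to a coin flip than the 56% bound suggests.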

7. Reproducibility

bash scripts/run_backtest.sh
bash tests/test_backtest.sh

Parameters

Parameter             Value
Training window       200 days
Confidence threshold  0.60
Min predicted move    0.005 (0.5% over 3 days)
Holding period        3 trading days
Position sizing       Binary (100%)
Gates                 None
LogReg C              0.1
Ridge alpha           1.0
This research was created with DuckDB and VGI, an upcoming DuckDB extension from Query.Farm that allows custom aggregate functions to be written in any language with an Apache Arrow implementation.