Methodology

A backtest by itself answers one narrow question: "how would this exact strategy have performed on this exact historical data?" It does not answer the more important question: "is this result likely to hold up going forward, or is it an artifact of how the strategy was built and tested?" Maple's methodology is built to help answer the second question.

The core validators

Maple examines a strategy's backtest results through a series of independent checks, each targeting a specific, well-known way backtests can mislead:

  • Out-of-sample testing. Checking whether a strategy's performance on data it was not tuned on looks meaningfully different from its performance on the data it was built with — a large gap is a classic overfitting signal.
  • Parameter sensitivity. Testing whether small changes to a strategy's parameters (a slightly different moving-average length, a slightly different stop level) produce wildly different results. A robust strategy tends to perform reasonably across a range of nearby settings, not just one narrow, precisely tuned combination.
  • Sample-size adequacy. Checking whether a strategy generated enough trades, across enough conditions, for its statistics to be meaningful, rather than resting on a handful of lucky (or unlucky) trades.
  • Drawdown analysis. Looking beyond headline returns to examine how deep and how long a strategy's worst losing periods were, and whether the backtest window happened to avoid the kind of stress period that would reveal real weaknesses.
  • Monte Carlo simulation. Reordering and resampling a strategy's trade sequence many times to see how much its results depend on the specific order in which trades happened to occur, rather than on the underlying rule itself.
  • Walk-forward testing. Re-testing a strategy across a rolling series of time windows, rather than one fixed historical period, to see whether its edge persists as market conditions change over time.
  • Regime analysis. Checking whether a strategy's performance is concentrated in one type of market condition (for example, a strong trend) and largely absent in others, which affects how much its historical average result should be trusted going forward.
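
As a rough illustration of two of the checks above, the following Python sketch measures an out-of-sample gap and reruns a drawdown calculation over shuffled trade orders. It is a minimal sketch under stated assumptions, not Maple's implementation: the function names (max_drawdown, shuffled_drawdowns, out_of_sample_gap), the use of additive per-trade returns, and the sample trade list are all hypothetical. It covers the reordering half of the Monte Carlo check only; resampling with replacement would be a separate variant.

```python
import random
from statistics import mean


def max_drawdown(trade_returns):
    """Largest peak-to-trough drop of the cumulative (additive) return curve."""
    equity = peak = worst = 0.0
    for r in trade_returns:
        equity += r
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def shuffled_drawdowns(trade_returns, n_runs=1000, seed=0):
    """Max drawdown for n_runs random reorderings of the same set of trades."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    trades = list(trade_returns)
    results = []
    for _ in range(n_runs):
        rng.shuffle(trades)
        results.append(max_drawdown(trades))
    return results


def out_of_sample_gap(in_sample, out_of_sample):
    """Average in-sample trade return minus average out-of-sample trade return."""
    return mean(in_sample) - mean(out_of_sample)


# Hypothetical per-trade returns, for illustration only.
trades = [0.02, -0.01, 0.03, -0.02, 0.01, -0.03, 0.04, -0.01]

historical = max_drawdown(trades)
simulated = sorted(shuffled_drawdowns(trades))
p95 = simulated[int(0.95 * len(simulated)) - 1]  # simple empirical 95th percentile

print(f"historical max drawdown: {historical:.3f}")
print(f"95th percentile across reorderings: {p95:.3f}")
print(f"in-sample vs out-of-sample gap: {out_of_sample_gap(trades[:4], trades[4:]):.4f}")
```

If the historical drawdown sits far below what most reorderings produce, the backtest may have benefited from a fortunate sequence of trades. Likewise, a large positive gap suggests the strategy did better on the data it was tuned on than on data it had not seen.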

How this becomes a confidence read

The results of these checks are combined into a single, plain-language confidence signal, shown alongside the raw performance numbers. It is not a prediction of future returns, but an assessment of how much statistical weight a given backtest result can reasonably bear. A strategy can have a strong historical return and still receive a low-confidence read if that return depends heavily on a narrow set of parameters, a small number of trades, or a specific market regime.
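
One simple way to picture how several check results could roll up into one read is a "weakest link" rule, sketched below in Python. This is an assumption made purely for illustration: the text above does not specify how Maple weighs or combines its checks, and the Level enum, the confidence_read function, and the example check names and levels are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 0
    MODERATE = 1
    HIGH = 2


def confidence_read(check_levels):
    """Weakest-link rule: the overall read is the lowest level among the checks.

    Returns the overall level and the name of the check that limited it.
    """
    if not check_levels:
        raise ValueError("at least one check result is required")
    limiting_check = min(check_levels, key=check_levels.get)
    return check_levels[limiting_check], limiting_check


# Hypothetical results for a strategy with strong returns but fragile parameters.
checks = {
    "out_of_sample": Level.HIGH,
    "parameter_sensitivity": Level.LOW,
    "sample_size": Level.MODERATE,
}

level, limiting_check = confidence_read(checks)
print(f"{level.name.lower()} confidence (limited by {limiting_check})")
```

Under this rule, a single fragile dimension is enough to pull the overall read down, which matches the idea that a strong headline return does not by itself earn a high-confidence read. A weighted or tiered scheme would be an equally plausible design.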

This is the same kind of scrutiny professional quantitative researchers apply before trusting a backtest. Maple's contribution is making that discipline available to people without a statistics background.