Paul Teetor

Many spreads show seasonality -- that is, regular patterns from year-to-year -- and can be a valuable source of profits and diversification. Seasonal spreads, however, seem to get little attention from the quantitative community. I’d like to remedy that by showing one way to identify seasonal spreads using legitimate statistical techniques. I'll start with an example of a seasonal spread: crude oil versus gasoline.

If you look closely, it appears that the spread often falls during Winter, then rises from Spring into Summer. But how can we be sure? Perhaps we are just imagining the pattern, based on a few examples. And what are the actual months when it falls and rises?

Here is how we can test for a seasonal pattern.

- Compute the daily change in the spread, s_{t}, giving us one Δs_{t} value for every day of the spread’s history.
- For each day, group the day’s change according to the month of that day.
- Compute the average change for each monthly group.
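The three steps above can be sketched in a few lines of Python. This is only an illustration, not the software used for the results in this article: the function name `monthly_mean_changes` and the tiny price series are made up, and the sketch assumes the spread is already available as a list of daily values with matching dates.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def monthly_mean_changes(dates, spread):
    """Return {month number: mean daily change} for a spread series.

    dates  -- list of datetime.date in ascending order
    spread -- list of spread values (dollars), same length as dates
    """
    groups = defaultdict(list)
    # Step 1: daily change, delta_s[t] = s[t] - s[t-1]
    for prev, curr, day in zip(spread, spread[1:], dates[1:]):
        # Step 2: file each change under the month of the day it occurred
        groups[day.month].append(curr - prev)
    # Step 3: average each monthly group
    return {month: mean(changes) for month, changes in sorted(groups.items())}

# Made-up numbers for illustration only (not the CL/RB data).
dates = [date(2020, 1, 30), date(2020, 1, 31), date(2020, 2, 3), date(2020, 2, 4)]
spread = [100.0, 90.0, 95.0, 105.0]
means = monthly_mean_changes(dates, spread)
print(means)  # {1: -10.0, 2: 7.5}
```

Note that the first day of the history has no change, so it contributes nothing to any month.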

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mean change | -31.97 | 52.15 | -16.74 | 12.98 | 94.73 | 78.16 | 33.08 | -12.39 | -18.88 | 22.77 | -121.50 | -90.37 |

(These averages are in dollars per day.)

The monthly averages suggest trades: January is a down month, on average, so sell the spread during January. Likewise, February is an up month, on average, so buy the spread during February.

Those trades would be very naïve, however. A statistician would ask an important question before risking real money: What is the probability that the true mean is actually positive or negative? The averages for April and August are pretty small (12.98 and -12.39), for example. Maybe our sample is too small, and these averages are not representative. How realistic are these numbers?

The solution is to form confidence intervals for each monthly mean. This bar chart shows the mean change taken from the table above, then superimposes the confidence intervals over each bar, giving us a composite view of the monthly averages.

Consider the bar for January. Yes, the mean change is negative, but the confidence interval crosses over into positive territory. Statistically speaking, we cannot be confident that the true mean is negative. It could be zero or even positive, so selling the spread during January could be unwise. In fact, the confidence interval includes zero for most months, and we cannot be confident there is any seasonal trend in those months, either positive or negative.
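To make the "does the interval include zero?" check concrete, here is a minimal sketch. It assumes a normal approximation for the interval (a Student-t quantile would be a little wider for small samples), and the function names and sample numbers are invented for illustration; the article does not say how its intervals were computed.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(changes, level=0.95):
    """Approximate confidence interval for the mean of one month's changes.

    Assumption: normal approximation to the sampling distribution of the mean.
    """
    n = len(changes)
    m = mean(changes)
    se = stdev(changes) / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return m - z * se, m + z * se

def classify(changes, level=0.95):
    """Label a month 'up', 'down', or 'inconclusive' from its interval."""
    lo, hi = mean_confidence_interval(changes, level)
    if lo > 0:
        return "up"            # whole interval above zero
    if hi < 0:
        return "down"          # whole interval below zero
    return "inconclusive"      # interval includes zero

# Made-up daily changes for two hypothetical months.
steady_month = [10, 12, 8, 11, 9]
noisy_month = [-30, 40, -20, 25, -5]
print(classify(steady_month))  # up
print(classify(noisy_month))   # inconclusive
```

The noisy month has a positive average (2.0), yet its interval spans zero, which is exactly the January situation described above with the sign reversed.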

The confidence interval for May, however, is completely positive. We can be confident that the average historical daily change in May is positive. In fact, May and June are clearly “up” months, and November and December are clearly “down” months. Now we have our seasonal trade for the CL/RB spread:

- Buy the spread on May 1. Close the position on Jul 1.
- Short the spread on Nov 1. Close the position on Jan 1.

ANOVA stands for analysis of variance. ANOVA compares groups of observations, such as our month-wise groups of spread changes. It tests whether one or more groups have a mean that differs significantly from the means of the other groups.

The ANOVA report includes a probability value, or simply p-value: the probability of seeing group means at least this different if all the true means were in fact identical. So a small p-value means one or more groups are probably not the same. In the CL/RB example, above, my computer reports a p-value of less than 0.0001, so if every month truly had the same average change, differences this large would arise by chance less than 0.01% of the time.
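For readers who want to see what sits behind the p-value, the following sketch computes the one-way ANOVA F statistic by hand: the ratio of the variation between group means to the variation within groups. The function name and the toy groups are illustrative assumptions. Turning F into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom, which statistical packages supply (for example `aov` in R or `scipy.stats.f_oneway` in Python), so this sketch stops at F.

```python
from statistics import mean

def anova_f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of numbers)."""
    k = len(groups)                          # number of groups, e.g. 12 months
    n = sum(len(g) for g in groups)          # total number of observations
    grand = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Three toy groups (made-up numbers); in the article each group
# would hold one calendar month's daily spread changes.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = anova_f_statistic(groups)
print(f_stat)
```

A large F means the monthly averages differ by more than the day-to-day noise within each month would explain, and that corresponds to a small p-value.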

My computer runs a weekly batch job which computes the ANOVA p-value for every spread in my database. If the p-value is 0.05 or less, I treat the month-to-month differences in the spread changes as statistically significant at the 5% level; in other words, the spread appears to exhibit a seasonal pattern. In those cases, the batch job saves the p-value in the database. Later, I run a report to select the spreads with the best (i.e., smallest) p-values. Those are my candidates for seasonal trades.
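The screening step can be pictured as a simple filter and sort. The sketch below is hypothetical: the spread names and p-values are invented, and an in-memory dictionary stands in for the database.

```python
def seasonal_candidates(p_values, threshold=0.05):
    """Return (spread, p-value) pairs at or below the threshold, smallest first."""
    kept = [(name, p) for name, p in p_values.items() if p <= threshold]
    return sorted(kept, key=lambda item: item[1])

# Invented spreads and p-values, for illustration only.
p_values = {"AAA/BBB": 0.001, "CCC/DDD": 0.37, "EEE/FFF": 0.03}
candidates = seasonal_candidates(p_values)
print(candidates)  # [('AAA/BBB', 0.001), ('EEE/FFF', 0.03)]
```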

The computer tests every combination of stocks and futures, so it occasionally reports a bizarre seasonal spread. It recently discovered that the spread between British Pound futures (BP) and Live Cattle (LC) shows seasonality at a confidence level of 96% or better. Would I trade the BP/LC spread? Of course not, because I cannot discern the economic logic of the trade.

- The analysis does not make a prediction. It only reports the past pattern. When you trade that pattern, you are assuming this year is like other years.
- The basic ANOVA analysis only reports that some months are different, not which months are different. The trader must look at the monthly pattern to choose the right time for the trade.
- The trade decision must also incorporate the current market conditions. For example, if the history recommends buying but the spread is already quite high, the trade could be unwise.
- This analysis looks at monthly patterns. The seasonal patterns at other boundaries might be more distinct and, hence, better trades.

When I say “long the CL/RB spread”, I mean buy CL and sell RB. Likewise, “short the spread” means sell CL and buy RB.

I compute the hedge ratios for my spreads using ordinary least squares, as suggested in Ernie Chan's book. The ratio for the CL/RB spread, above, was 1.1376 CL contracts for each RB contract.
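As a rough illustration of an ordinary least squares hedge ratio, the sketch below regresses one price series on another (with an intercept) and takes the slope as the ratio. Which leg is regressed on which, and whether an intercept is included, are assumptions here and not details given in this article; the names and the toy prices are also made up.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope of the least-squares line y = a + beta * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Toy prices constructed so that y = 2x + 0.1 exactly (not market data).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 4.1, 6.1, 8.1]
beta = ols_slope(x, y)                       # hedge ratio: units of x per unit of y
spread = [b - beta * a for a, b in zip(x, y)]
print(round(beta, 6))                        # 2.0
```

With real prices the fit is never exact, and the residual series `spread` is what the seasonal analysis would then examine.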

Notice that I compute the spread change, not the spread return. Quants usually study price returns, but that won’t work with spreads because the spread can be zero, giving an undefined return. The daily change follows a similar bell-shaped distribution, so it’s a reasonable object for study.
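A small made-up example shows why returns misbehave for spreads. The function names are illustrative, and the series is chosen to pass through zero and go negative.

```python
def daily_changes(spread):
    """Day-over-day differences: always defined."""
    return [curr - prev for prev, curr in zip(spread, spread[1:])]

def daily_returns(spread):
    """Simple returns; None where the previous value is zero (undefined)."""
    return [None if prev == 0 else (curr - prev) / prev
            for prev, curr in zip(spread, spread[1:])]

spread = [50.0, 0.0, -25.0, 25.0]   # made-up spread values
print(daily_changes(spread))        # [-50.0, -25.0, 50.0]
print(daily_returns(spread))        # [-1.0, None, -2.0]
```

Besides the undefined value after the zero, the last return is negative even though the spread rose by 50, because the previous value was negative. The change series has neither problem.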

I monitor about 115 stocks and futures, so I have about 6,670 spreads to be tested weekly. A typical recent run found over 270 spreads that exhibit some seasonality, or about 4% of those tested. The ANOVA batch job requires about 2-1/2 hours to complete on my computer. The software is written in a combination of Perl and R, the free statistical software system, running under Linux.
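The spread count is just the number of unordered pairs of instruments, assuming each pair is tested once. As a quick check of the arithmetic, 6,670 corresponds to exactly 116 instruments, which is consistent with "about 115"; the figure of 270 below is the lower bound quoted above.

```python
from math import comb

instruments = 116                  # assumption: the exact count behind "about 115"
pairs = comb(instruments, 2)       # unordered pairs: n * (n - 1) / 2
seasonal = 270                     # "over 270" in a typical run
share = seasonal / pairs

print(pairs)                       # 6670
print(f"{share:.1%}")              # 4.0%
```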

Most good textbooks on statistics discuss ANOVA. There is an article on Wikipedia, but it is not a tutorial.

Any decent software for statistics includes the ANOVA analysis. That includes R and Octave, which are both free.