WORKING PAPERS SERIES
WP00-01
Evaluating the Performance of Nearest
Neighbour Algorithms when Forecasting
US Industry Returns
Christian S. Pedersen and Stephen Satchell
Evaluating the Performance of Nearest Neighbour Algorithms
when Forecasting US Industry Returns¹
C. S. Pedersen²
S. E. Satchell³
Abstract
Using both industry-specific data on 55 US industry sectors and an extensive range of
macroeconomic variables, we compare the performance of nearest neighbour
algorithms, OLS, and a number of two-stage models based on these two methods when
forecasting industry returns. As industry returns are a relatively under-researched area in the
finance literature, we also give a brief review of the existing theories as partial motivation for our
specific choice of variables, which are commonly employed by asset managers in practice.
Performance is measured by the Information Coefficient (IC), which is defined as the average
correlation between the 55 forecasted returns and the realised returns across industries over
time. Because of transaction costs, investors and asset managers typically want steady
outperformance over time. Hence, the volatility of IC is taken into account through the application
of “Sharpe Ratios”. We find that two-stage procedures mixing industry-specific information with
macroeconomic indicators generally outperform both the stand-alone nearest neighbour
algorithms and time-series based OLS macroeconomic models.
Keywords: Nearest Neighbour Algorithm, US Industry Returns, Forecasting
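The performance measure described in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the text above: the correlation is taken to be the Pearson correlation across industries within each period, and the "Sharpe Ratio" of IC is taken to be the mean of the per-period correlations divided by their sample standard deviation. The function names `period_ics` and `ic_sharpe` and the simulated data are purely illustrative.

```python
import numpy as np


def period_ics(forecasts, realised):
    """Cross-industry Pearson correlation for each period.

    Both inputs have shape (T, N): T periods, N industries.
    (Assumption: Pearson rather than rank correlation.)
    """
    f = np.asarray(forecasts, dtype=float)
    r = np.asarray(realised, dtype=float)
    # Demean across industries within each period.
    f = f - f.mean(axis=1, keepdims=True)
    r = r - r.mean(axis=1, keepdims=True)
    cov = (f * r).sum(axis=1)
    norm = np.sqrt((f ** 2).sum(axis=1) * (r ** 2).sum(axis=1))
    return cov / norm


def ic_sharpe(ics):
    """Mean per-period IC divided by its sample standard deviation.

    (Assumption: this is how the volatility of IC enters the ratio.)
    """
    ics = np.asarray(ics, dtype=float)
    return ics.mean() / ics.std(ddof=1)


# Illustrative data only: 120 periods, 55 industries, weakly informative forecasts.
rng = np.random.default_rng(0)
realised = rng.normal(size=(120, 55))
forecasts = 0.1 * realised + rng.normal(size=(120, 55))

ics = period_ics(forecasts, realised)
print("IC (average over time):", ics.mean())
print("Sharpe ratio of IC:", ic_sharpe(ics))
```

Under these assumptions, a model with a slightly lower average IC can still rank higher if its per-period correlations are more stable.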
1. Introduction
The purpose of this paper is to build models of US industry returns and to
compare the forecasting properties of these models. We will use both macroeconomic
and industry-specific variables, and apply non-linear econometric techniques which
are compared with more conventional models based on OLS. In general, nearest
neighbour algorithms are an example of kernel/robust regression and are applicable when
the exact functional relationship between input and output is not known. T